October 03, 2008
Google Analytics Cookies
A general question came up the other day regarding Google Analytics and how it tracks things, so I figured it was about time I posted the info I have on the cookies GoAn sets on the user's computer, what information each of them holds and which cookie does what.
Note that this all pertains to the newer Google Analytics methods. The older version used the same cookies and saved mostly the same information, however there are some minor differences. The way to tell if your site (or someone else's for that matter) is using the new Google Analytics code or the older legacy code is to review the HTML source of a page containing GoAn code and look at the file being called. A reference to "urchin.js" means the site is using the older legacy code. A reference to "ga.js" means it's using the newer code.
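If you'd rather not eyeball the HTML source by hand, the check described above can be sketched in a few lines of Python. This is only a rough illustration: the function name detect_ga_version and its return labels are made up for this example, and it simply looks for the two file names anywhere in the page source.

```python
import re


def detect_ga_version(html: str) -> str:
    """Guess which GoAn script a page references, based only on the file name.

    Returns "legacy" for urchin.js, "new" for ga.js, "both" if both appear,
    and "none" if neither is found. The labels are illustrative only.
    """
    has_legacy = re.search(r"\burchin\.js\b", html, re.IGNORECASE) is not None
    has_new = re.search(r"\bga\.js\b", html, re.IGNORECASE) is not None
    if has_legacy and has_new:
        return "both"
    if has_legacy:
        return "legacy"
    if has_new:
        return "new"
    return "none"


if __name__ == "__main__":
    sample = '<script src="http://www.google-analytics.com/ga.js"></script>'
    print(detect_ga_version(sample))
```

A plain text search like this can be fooled by a page that merely mentions one of the file names in its text, so treat the result as a hint rather than proof.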
First the cookies. As a general rule GoAn sets four cookies on the user's machine. Their names are: __utma, __utmb, __utmc and __utmz. There's a possibility of more cookies, depending upon how the webmaster has things set up --such as the __utmv cookie-- but 99% of the time you'll see just the four main ones.
Here are the dirty details of what each does and the information it saves to the user's computer, best I can tell.
__utma - utma is the main cookie that saves all kinds of interesting information. The interesting thing about this cookie is the massive amount of really pertinent data it saves in such a small package. This info will look like a bunch of gobbledygook numbers until you understand what they mean.
The cookie contents typically look like:
XXXX.RRRR.FFFF.PPPP.CCCC.N
Where...
XXXX = A domain hash. A domain hash is simply a group of numbers that relates directly back to the domain name of a site. Think of it as a sort of numerical representation of your domain name.
RRRR = A random number the GoAn script generates to be used as a Unique ID for each visitor.
FFFF = A timestamp of the first visit/session for the user. Or in English, the time someone first hit the site. As a note, all of these timestamps are in seconds since the Unix Epoch (January 1 1970 00:00:00 GMT), the same format you'd get if you ran a php date('U'); call. They're not in the date format we humans are used to seeing, but they're just as effective.
PPPP = A timestamp of the Previous visit by the user. Or the date and time the user last visited your site.
CCCC = The current time, in the same timestamp format as the previous two.
N = The number of visitor Sessions the user has had since their first visit. This number gets incremented by 1 each time the returning visitor starts a new Session.
As you can see, there is a lot of potentially very useful information in this one cookie.
Essentially the utma cookie is what is known as a Persistent cookie. Its expiration date is set out to two years in the future on the first visit, and the expiration date is moved out to two years in the future on each subsequent visit. So if a visitor doesn't let two years pass between visits, you'd still be able to tell the first time they visited your site, the last time they visited your site, the time of their current visit and the number of times they've visited. This would be some good information to know for all kinds of reasons, especially if you're doing any sort of conversion testing or tracking visitor loyalty.
Some of the above stay pretty much constant throughout, from the first time the cookie is set to the last visit. Others get updated with each visit. Those that stay the same are the Domain Hash (XXXX), Random ID (RRRR) and Time of First Visit (FFFF). Those that get updated and changed are the Time of the Previous Visit/Session (PPPP), Current Time (CCCC) and the Number of Visits/Sessions (N).
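To make the layout concrete, here is a minimal Python sketch that splits a utma value into its six fields and turns the three timestamps into readable dates. The class and function names (Utma, parse_utma) and the sample cookie value are invented for illustration; only the field order comes from the breakdown above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Utma:
    domain_hash: str          # XXXX
    visitor_id: str           # RRRR
    first_visit: datetime     # FFFF
    previous_visit: datetime  # PPPP
    current_visit: datetime   # CCCC
    session_count: int        # N


def _from_epoch(seconds: str) -> datetime:
    """Convert a seconds-since-Unix-Epoch string to a UTC datetime."""
    return datetime.fromtimestamp(int(seconds), tz=timezone.utc)


def parse_utma(value: str) -> Utma:
    """Parse XXXX.RRRR.FFFF.PPPP.CCCC.N into a Utma record."""
    parts = value.split(".")
    if len(parts) != 6:
        raise ValueError(f"expected 6 dot-separated fields, got {len(parts)}")
    xxxx, rrrr, ffff, pppp, cccc, n = parts
    return Utma(xxxx, rrrr, _from_epoch(ffff), _from_epoch(pppp),
                _from_epoch(cccc), int(n))


if __name__ == "__main__":
    # Made-up sample value, not taken from a real site.
    visit = parse_utma("12345678.987654321.1222992000.1223078400.1223164800.3")
    print(visit.first_visit.isoformat())  # 2008-10-03T00:00:00+00:00
    print(visit.session_count)            # 3
    print((visit.current_visit - visit.first_visit).days)  # 2
```

The hash and visitor ID are kept as strings on purpose, since nothing here needs to do arithmetic on them.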
__utmb - utmb is one of two cookies that work together to record information about what happened during the current visit or session, including the ability to tell when a session ends. This is a cookie that has been changed pretty significantly between the older legacy version of GoAn and the newer version. (The old version simply saved the Domain Hash and nothing else.) The info it saves now looks like:
XXXX.P.10.C
Where...
XXXX = The Domain Hash.
P = Pages of the site viewed this session.
C = A timestamp of the Current Time.
I'm not 100% sure what the "10" in there does, but every one of the dozen or so sites I looked at that had the new GoAn installed had the number 10 in the third field. In looking at what's in the ga.js file it looks like this part is going to be utilized at some point in the future to perform automated off-site click tracking, but at this time it's not something that's reported in GoAn.
Again, we've got some useful information. Especially that part where it records the number of pages you have viewed this Session.
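A small Python sketch of pulling those pieces apart, assuming the XXXX.P.10.C layout shown above. The names Utmb and parse_utmb are hypothetical, and since the purpose of the third field is unknown it is simply kept as raw text.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class Utmb(NamedTuple):
    domain_hash: str        # XXXX
    pageviews: int          # P
    third_field: str        # the unexplained "10", kept as-is
    current_time: datetime  # C


def parse_utmb(value: str) -> Utmb:
    """Parse XXXX.P.10.C into a Utmb record."""
    parts = value.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected 4 dot-separated fields, got {len(parts)}")
    xxxx, p, third, c = parts
    return Utmb(xxxx, int(p), third,
                datetime.fromtimestamp(int(c), tz=timezone.utc))


if __name__ == "__main__":
    # Made-up sample value for illustration.
    print(parse_utmb("12345678.4.10.1222992000"))
```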
This cookie sets itself to expire in 30 minutes, but it's 30 minutes from the time you loaded the last page you viewed. Meaning if you hit a GoAn enabled site, view a page for a couple of minutes, then move to another page of the same site you're going to see two things happen to the cookie. First, the Pages This Session (P) value is going to be incremented by one (FTR it gets incremented if you reload the same page, so it technically isn't a Pages Viewed count). Second, the Created and Expires details are going to get updated to start the 30 minute clock ticking again.
Why 30 minutes? Well, some browsers (some versions of IE notably) don't correctly erase a Session cookie like they should. So the 30 minute timer works pretty well. It gives a user plenty of time to move on to a new page, but ends the session in 30 minutes if nothing happens.
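The sliding 30 minute clock is easier to see with a toy simulation. The Python sketch below is only a model of the behavior described here, not anything taken from ga.js: the function names are invented, and it ignores the utmc cookie entirely, so closing the browser does not end a session in this model.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable

SESSION_TIMEOUT = timedelta(minutes=30)


def utmb_expiry(last_page_load: datetime) -> datetime:
    """The expiry moves to 30 minutes after the most recent page load."""
    return last_page_load + SESSION_TIMEOUT


def count_sessions(page_loads: Iterable[datetime]) -> int:
    """Count how many sessions a series of page-load times would produce."""
    sessions = 0
    expires = None
    for loaded_at in sorted(page_loads):
        # A new session starts if there is no cookie yet or it has expired.
        if expires is None or loaded_at >= expires:
            sessions += 1
        # Every page load restarts the 30 minute clock.
        expires = utmb_expiry(loaded_at)
    return sessions


if __name__ == "__main__":
    start = datetime(2008, 10, 3, 12, 0, tzinfo=timezone.utc)
    loads = [
        start,
        start + timedelta(minutes=10),
        start + timedelta(minutes=35),  # within 30 minutes of the previous load
        start + timedelta(hours=2),     # cookie expired long ago: new session
    ]
    print(count_sessions(loads))  # 2
```

Note that the third page load lands 35 minutes after the first one but still counts as the same session, because the clock was restarted by the second load.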
utmb in conjunction with utmc is what GoAn uses to determine things like time on page, pages visited per session, how long a session lasts on average, etc.
__utmc - utmc is a true Session cookie, meaning it is one that expires at the end of the current session/visit. If you navigate away from the site or close your browser it should be automatically deleted, if the browser does what it's supposed to do.
As mentioned above, it's utilized with the "b" cookie, with its main function being to tell how long a session or visit lasted. It is not overwritten once a session starts, so by looking at the Created timestamp and the current time one can in theory tell how long a visitor has been on a given site.
The only info in utmc's content is the Domain Hash.
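The "Created timestamp versus current time" idea boils down to a subtraction. A tiny Python sketch, assuming you've already read the cookie's Created time out of your browser's cookie viewer (the function name session_length is made up for this example):

```python
from datetime import datetime, timedelta, timezone


def session_length(utmc_created: datetime, now: datetime) -> timedelta:
    """How long the current session has lasted, per the utmc Created time."""
    if now < utmc_created:
        raise ValueError("current time is earlier than the cookie's Created time")
    return now - utmc_created


if __name__ == "__main__":
    created = datetime(2008, 10, 3, 12, 0, tzinfo=timezone.utc)
    now = datetime(2008, 10, 3, 12, 12, tzinfo=timezone.utc)
    print(session_length(created, now))  # 0:12:00
```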
__utmz - utmz is another of those powerhouse cookies, one that saves a lot of information that can be quite useful. The treasure trove of info it saves relates to how a user arrived at a site: the channel through which they came, date/time info and even what keywords they used if they arrived via a search engine.
Its content normally looks something like:
XXXX.TTTT.V.S.utmcsr={source}|utmccn={campaign}|utmcmd={medium}|utmctr={keyword}
Where...
XXXX = The Domain Hash.
TTTT = The timestamp of when the cookie was last set.
V = How many visitor sessions there have been in total. (should be the same as the final number in the "a" cookie in theory.)
S = Via how many different sources or channels this user has arrived at the site. In other words, if a user searched at Google and found your site one time, then searched at Live or Yahoo another time and clicked through, this number should increment. In theory this number should also increment if one time a user clicked on your Organic search ranking in Google and a second time clicked on an Adwords ad you had running on Google.
utmcsr = The source of the last time the cookie was updated. So if someone searched and found your site on Google here it would say utmcsr=google
utmccn = Campaign information. This is really there for Adwords types of situations. If you tag your campaign with an identifier it'll show up here in the cookie. If it's just a normal search hit it should say utmccn=(organic)
utmcmd = Medium. But not as in Large, Medium and Small. Their Medium is really more channel information. So if a hit comes from a normal, non-paid search it'll show utmcmd=organic
utmctr = The keyword phrase someone typed into the search engine. Really useful this one, however remember it shows the last search data, not how someone originally found you.
The utmz cookie gets set with a lifetime of six months into the future, and gets its expiration date pushed out again each time the cookie is updated. This can be a bit misleading though. The important thing to remember about this particular cookie is it doesn't necessarily get updated with each and every visit. For normal visitors it pretty much does, but if you're checking your own site and its cookies you may see some slightly odd information updates.
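Here is a rough Python sketch for splitting a utmz value into its parts. It assumes each key and value in the trailing section are joined with "=" (as in the utmcsr=google example above) and that the pairs are separated by "|". The function name parse_utmz, the dictionary keys and the sample value are all made up for illustration.

```python
from datetime import datetime, timezone


def parse_utmz(value: str) -> dict:
    """Parse XXXX.TTTT.V.S.<campaign data> into a plain dictionary."""
    # Split on the first four dots only: the campaign data may itself
    # contain dots (for example a referring host name).
    parts = value.split(".", 4)
    if len(parts) != 5:
        raise ValueError("expected XXXX.TTTT.V.S.<campaign data>")
    domain_hash, tttt, v, s, campaign = parts

    fields = {}
    for pair in campaign.split("|"):
        key, sep, val = pair.partition("=")
        if sep:  # skip anything that is not a key=value pair
            fields[key] = val

    return {
        "domain_hash": domain_hash,
        "last_set": datetime.fromtimestamp(int(tttt), tz=timezone.utc),
        "session_count": int(v),
        "source_count": int(s),
        "source": fields.get("utmcsr"),
        "campaign": fields.get("utmccn"),
        "medium": fields.get("utmcmd"),
        "keyword": fields.get("utmctr"),  # None when no keyword was recorded
    }


if __name__ == "__main__":
    # Made-up sample value for illustration.
    sample = ("12345678.1222992000.3.2."
              "utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=blue widgets")
    info = parse_utmz(sample)
    print(info["source"], info["medium"], info["keyword"])
```

Using fields.get() means a cookie with no utmctr part simply comes back with keyword set to None instead of raising an error.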
So those are the four main cookies that get set by Google Analytics, what they do and what information they contain. A fifth one, and one I don't see often, is the __utmv cookie. This one is known as a Custom Segmentation cookie and is one that requires the webmaster to have configured their GoAn via __utmSetVar() to track some special data. I see it rarely. In fact I think I've seen it exactly twice.
Comments
Thank you very much, Randy. This has been most helpful!
Really nice job of breaking all of this down Randy! I got to learn some things I haven't made the time to research on my own. Thanks!
This is a very informative post. I never could find the time to analyze these myself. You've saved me hours of headache! One small point, in your description of the utmz cookie, the letter C should be an S where it says "C = Via how many different sources..."
This is a wicked post - thanks!
Have you ever looked at the cookie values when using cross domain tracking codes?
My cookie values look very different for instance utmc is set to the value 1
Good article!
You have a typo in the utmz section.
You write the cookie looks like this:
XXXX.TTTT.V.S.utmcsr{source}|utmccn{campaign}|utmcmd{medium}|utmctr{keyword}
but then use C and not S in the breakdown.
- Ophir
Good catch John and Ophir. Not sure how I managed to typo that one (probably confused myself between Sources and Channels), but I'll go fix it now.
Digit: No I've never looked at cross domain tracking cookies. I assume by the question you mean where you can track between two domains you control by tweaking the GoAn code. I don't do much of that myself.
If you happen to be referring to Third Party cookies, where you're on one site and another site is allowed to set a cookie, I never bother with that personally. Mainly because all of my computers are set up to disallow third party cookies.
Maybe I'm paranoid? ;-)
FYI, rather than defining the timestamp format as "the same format you'd give if you ran a php date('U'); call" you might well declare that the format is in seconds since the Unix Epoch (January 1 1970 00:00:00 GMT).
I'm not a php developer so it took me a few minutes to look that up...