Learn how Universal Analytics cross-domain tracking works
Universal Analytics tracking is based on cookies and on a randomly generated clientId. Yep, that's it. The first time you visit a site that uses Google Analytics, a new cookie is set in your browser with a randomly generated hash (the clientId), which the Google Analytics servers will use to identify you on subsequent visits.
Let's see how the cookie looks:
Any of those values can be modified using the JavaScript API, and they need to be set when the tracker is created. In 99.99999% of cases you won't need to deal with this, but it's always good to know that we can define our own values. Let's see how a default tag would look if we were setting them manually:
ga('create', 'UA-XXXXXXX-YYY', {
  'cookieName': '_ga',
  'cookieDomain': 'thyngster.com',
  'cookieExpires': 60 * 60 * 24 * 730 // 2 years expiration date
});
As we can see in the previous screenshot, Google Analytics does not use just the randomly generated hash to track the user (device). The hash is only a 10-digit number, so on a high-traffic site, or simply with bad luck, it could end up generating clientId collisions. To reduce the chance of having a duplicated clientId under our property, Google Analytics appends the cookie creation timestamp to the random hash. If our "_ga" cookie value is "GA1.2.2055842233.1422407281", the clientId is "2055842233.1422407281" and not just "2055842233".
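As a minimal sketch of that rule, the clientId can be read as the last two dot-separated fields of the cookie value. The helper name clientIdFromGaCookie is hypothetical (it is not part of analytics.js), and the sketch assumes the cookie always has at least four fields, as in the example above.

```javascript
// Hypothetical helper, not part of analytics.js: extracts the clientId
// from a raw "_ga" cookie value such as "GA1.2.2055842233.1422407281".
function clientIdFromGaCookie(cookieValue) {
  var parts = String(cookieValue).split('.');
  // Assumption: a well-formed value has at least four dot-separated fields.
  if (parts.length < 4) {
    return null;
  }
  // The clientId is the random number plus the creation timestamp,
  // i.e. the last two fields joined back together.
  return parts.slice(-2).join('.');
}

var clientId = clientIdFromGaCookie('GA1.2.2055842233.1422407281');
// clientId is "2055842233.1422407281"
```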
Going back to cross-domain tracking: Google Analytics uses the clientId (the cid parameter on the hits) to identify users (devices), so it makes sense that when the user goes to a different domain, the same clientId should be used on the destination domain. To have the same clientId on two different domains we need to pass the clientId value to the destination domain, so that analytics.js can use it there instead of generating a new one (remember that Google Analytics uses first-party cookies, which can't be read from another domain).
The JavaScript tracking API provides some built-in methods to grab the linker parameters, or even to "decorate" the links to the "external" domains we define.
ga(function(tracker) { var linkerParams = tracker.get('linkerParam'); });
You can see that all the info returned by the linker makes sense, but there's a new value that is not in the cookie on our origin domain. What is it? It's a value that Universal Analytics uses to ensure the integrity of the cookie data and to prevent, to some extent, our cookie info from being hijacked. Opening a URL that carries the linker parameter on a domain that uses Universal Analytics with the allowLinker option enabled forces the cookie to be overwritten with the values specified in the URL, and we don't want anyone sharing a link to our site on some social network and messing up all our data because every user coming from that link has the same clientId, do we?
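To make the pieces of the linker parameter easier to see, here is a small sketch that splits one into its parts. It assumes the value is laid out as "_ga=<version>.<checksum>.<random>.<timestamp>"; the function name parseLinkerParam is hypothetical, and the checksum in the example is a made-up placeholder rather than a real one.

```javascript
// Hypothetical helper, not part of analytics.js.
// ASSUMPTION: the linker value looks like "<version>.<checksum>.<random>.<timestamp>",
// optionally prefixed with "_ga=" as it would appear in a URL.
function parseLinkerParam(linkerParam) {
  var value = String(linkerParam).replace(/^_ga=/, '');
  var parts = value.split('.');
  if (parts.length !== 4) {
    return null;
  }
  return {
    version: parts[0],
    checksum: parts[1],                 // the extra integrity value
    clientId: parts[2] + '.' + parts[3] // same clientId as in the origin cookie
  };
}

// The checksum below is a placeholder, not a value produced by Google Analytics.
var parsed = parseLinkerParam('_ga=1.123456789.2055842233.1422407281');
```

Here parsed.clientId holds the clientId from the origin domain, while parsed.checksum is the additional value discussed above.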
In previous Google Analytics versions, the only hash that was not randomly generated was the domain hash (the first value in the __utma cookie). It was generated by a pretty simple algorithm (click here to see a PHP port for the domainhash algo).
This time Google Analytics went further and uses an algorithm that takes more variables into account, and most of them are not fixed values like the hostname was:
- The current clientId value
- The current browser user agent
- The current browser's timezone offset (the difference in minutes from UTC)
- All the descriptions from the available plugins within the current user's browser (taken from window.navigator.plugins)
- The current year, day, hour and minute
Below you can see the values used to generate the checksum for the cross-domain linking on my Firefox browser:
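// Hash function as it appears in the minified analytics.js, reformatted.
// The original reads a[y]; y presumably holds "length" in the minified source.
function La(a) {
  var b = 1, c = 0, d;
  if (a) {
    for (b = 0, d = a.length - 1; 0 <= d; d--) {
      c = a.charCodeAt(d);
      b = (b << 6 & 268435455) + c + (c << 14);
      c = b & 266338304;
      b = 0 != c ? b ^ c >> 21 : b;
    }
  }
  return b;
}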
When implementing cross-domain tracking, the hash generated on the destination domain when the tracker is created NEEDS to match the one that arrives with the clientId data in the URL. If they don't match, the cookie value won't be overridden, so cross-domain linking will fail and the cookie will be created with a new random clientId.
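The check can be sketched roughly as follows. This is an illustration under assumptions, not the analytics.js implementation: the field order, the "." separator, the use of getFullYear(), the env object and the names hashString, linkerChecksum and isLinkerValid are all choices made for this example. Only the hash itself is taken from the La() function quoted above.

```javascript
// Same string hash as the La() function quoted above, written out readably.
function hashString(str) {
  var hash = 1, chunk = 0;
  if (str) {
    hash = 0;
    for (var i = str.length - 1; i >= 0; i--) {
      chunk = str.charCodeAt(i);
      hash = (hash << 6 & 268435455) + chunk + (chunk << 14);
      chunk = hash & 266338304;
      hash = chunk !== 0 ? hash ^ chunk >> 21 : hash;
    }
  }
  return hash;
}

// ASSUMPTION: field order and the "." separator are illustrative only.
// env is a hypothetical object standing in for the browser values
// (user agent, timezone offset, plugin descriptions).
function linkerChecksum(clientId, env, now, minuteOffset) {
  var parts = [
    clientId,
    env.userAgent,
    env.timezoneOffset,
    now.getFullYear(),
    now.getDate(),
    now.getHours(),
    now.getMinutes() + minuteOffset
  ].concat(env.pluginDescriptions);
  return hashString(parts.join('.'));
}

// Accepts the checksum if it matches the one for the current minute or for
// one of the two previous minutes (the 2-minute grace period).
function isLinkerValid(checksum, clientId, env, now) {
  for (var offset = 0; offset >= -2; offset--) {
    if (checksum === linkerChecksum(clientId, env, now, offset)) {
      return true;
    }
  }
  return false;
}

// Example with made-up environment values.
var env = {
  userAgent: 'ExampleBrowser/1.0',
  timezoneOffset: -60,
  pluginDescriptions: ['Example PDF plugin']
};
var clickedAt = new Date(2015, 0, 28, 10, 15);
var checksum = linkerChecksum('2055842233.1422407281', env, clickedAt, 0);
var landedAt = new Date(2015, 0, 28, 10, 16);
var accepted = isLinkerValid(checksum, '2055842233.1422407281', env, landedAt);
```

In this sketch a visitor landing one minute after the click is accepted, because the candidate for the previous minute rebuilds exactly the same input string. A different user agent, plugin list, timezone offset or clientId produces a different input string and so, barring a hash collision, a different checksum.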
These are the points we need to keep in mind when thinking about cross-domain tracking:
- The linker value is only valid when using the same browser version with the same plugins (the same browser needs to visit both the origin and the destination domain).
- Cross-domain tracking won't work if we land on the destination domain more than 2 minutes after the linker value was generated. That is, the linker value has to be generated on the fly when the user clicks through to the destination domain; if we update the links on page load and a user clicks on one of them after the time limit, it won't work. According to Google's documentation, there's a grace period of 2 minutes for the linker value. That's why Universal Analytics provides the decorate function: it generates the linker value when the user clicks on a link or submits a form, without the need to code any JavaScript listeners to update the links in real time.
- The clientId used when generating the hash needs to match the one in the linker parameters, so even with a valid hash for the linking, changing the clientId value isn't possible.
- Users must not change the computer's configured time zone while going to the destination domain.
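With these points in mind, a minimal setup sketch using the built-in linker plugin could look like the following. The property ID and the domain destination.example are placeholders. The first line is only a stand-in for the command queue that the standard analytics.js snippet creates, included so the example is self-contained; on a real page the official snippet defines ga instead.

```javascript
// Stand-in for the command queue created by the standard analytics.js snippet.
// On a real page, use the official snippet instead of this line.
var ga = function () { (ga.q = ga.q || []).push(arguments); };

// allowLinker lets this tracker accept linker parameters found in the URL.
ga('create', 'UA-XXXXXXX-YYY', 'auto', { 'allowLinker': true });

// Load the linker plugin and let it decorate links to the listed domain
// at click time, so the linker value is always fresh.
// 'destination.example' is a placeholder domain.
ga('require', 'linker');
ga('linker:autoLink', ['destination.example']);
```

Because the decoration happens when the link is clicked, the 2-minute limit starts counting from the click and not from the page load.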
This is how cross-domain linking works in Universal Analytics, and how Universal Analytics tries to keep the cookie data intact and, to some extent, prevent cookie hijacking.