Cross-Domain Tracking on Google Analytics 4 (GA4)
Now that the Universal Analytics deprecation has been officially announced, it's time to start writing some technical posts about how things work on Google Analytics 4.
The first one is going to be about how Cross-Domain Tracking works (not only for GA4 but also for any GTAG.JS based tool).
Back in time (about three years ago) I noticed a new "_gl" parameter being attached to some links by Google Analytics. I'd say it stands for "Google Linker", judging by how it offers cross-domain tracking support not only for Google Analytics but also for all the other pixels based on the gtag containers (Adwords, Doubleclick). At that point I started to reverse engineer the code to see how it works. If you're interested in it, you can take a quick look at a draft of the notes I wrote at that time in this Gist:
To date Google has only officially published one way to set up cross-domain tracking, and this is adding our domains in the Data Stream configuration section (Admin > Data Streams > Select the stream > More Tagging Settings > Configure your domains).
We may think that some things are happening in the backend, but what is really happening when we add some configuration in our Admin section is that the currently served GTAG Container gets some extra settings.
In this case, when we add some domains to our configuration, the GA4 GTAG Container adds a click/submit listener with some conditions based on the domains we've added. If you're used to working with Google Tag Manager, think about this like adding Click Listener and Form Submit Triggers with a triggering condition based on this domains list.
Please bear in mind that this will also affect the built-in outgoing links tracking on Google Analytics 4 (it will not trigger the events if the current link domain name is within the list defined on the admin side of our Data Stream).
If you read between the lines of the previous paragraphs, you may have guessed that there must be some code in GTAG.js taking care of the cross-domain linking, and this means that we can reverse it.
How the Cross Domain Tracking works on the GA4/GTAG
The first thing is looking at what the new linker parameter looks like in GA4:
_gl=1*tp0qzs*_ga*OTYxNDI4MjA4LjE2NDg1NzM2OTM.*_ga_RNYCK86MYK*MTY0OTE3NjMxMy41LjEuMTY0OTE4MDM1OC4w
We can easily see that it can be split into different values by the "*" character:
Key | Value |
---|---|
1 | tp0qzs |
_ga | OTYxNDI4MjA4LjE2NDg1NzM2OTM |
_ga_RNYCK86MYK | MTY0OTE3NjMxMy41LjEuMTY0OTE4MDM1OC4w |
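To make the table above concrete, here is a minimal sketch that splits a `_gl` value on "*" and decodes the ID values. It assumes the values are encoded with a URL-safe base64 alphabet that uses "." as padding, which is what the sample linker above decodes to. `parseLinkerParam` and `decodeLinkerValue` are names made up for this post; they are not part of gtag.js.

```javascript
// Assumed alphabet: URL-safe base64. The "." padding character is simply skipped.
var GL_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

// Decode one linker value back into the plain cookie id string
function decodeLinkerValue(encoded) {
  var bits = 0, bitCount = 0, out = "";
  for (var i = 0; i < encoded.length; i++) {
    var index = GL_ALPHABET.indexOf(encoded.charAt(i));
    if (index === -1) continue; // padding or unknown character
    bits = (bits << 6) | index;
    bitCount += 6;
    if (bitCount >= 8) {
      bitCount -= 8;
      out += String.fromCharCode((bits >> bitCount) & 255);
      bits &= (1 << bitCount) - 1; // keep only the bits not consumed yet
    }
  }
  return out;
}

// Split "<version>*<fingerprint>*<key>*<value>*<key>*<value>..." into an object
function parseLinkerParam(gl) {
  var parts = gl.split("*");
  var result = { version: parts[0], fingerprint: parts[1], ids: {} };
  for (var i = 2; i + 1 < parts.length; i += 2) {
    result.ids[parts[i]] = decodeLinkerValue(parts[i + 1]);
  }
  return result;
}

var parsedLinker = parseLinkerParam(
  "1*tp0qzs*_ga*OTYxNDI4MjA4LjE2NDg1NzM2OTM.*_ga_RNYCK86MYK*MTY0OTE3NjMxMy41LjEuMTY0OTE4MDM1OC4w"
);
// parsedLinker.ids._ga            -> "961428208.1648573693"
// parsedLinker.ids._ga_RNYCK86MYK -> "1649176313.5.1.1649180358.0"
```

Note that this only unpacks the values. It does not check the fingerprint, so it says nothing about whether GA4 would accept the linker.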
The first value (tp0qzs in the example), if I'm not wrong, is a fingerprint based on the current user agent and browser plugins plus a time-checking hash. It is used to check that the browser receiving the linker is the same one that generated it, and that the linker was not generated long ago (it used to have a 2 minutes expiration time on the Universal Analytics Linker). This is done to prevent the cookies from being overridden by mistake because someone shares a link with a linker value on it.
Please note that we will have as many _ga_[A-Z]{6,9} keys as GA4 Streams being used on our website, and that the linker will also be holding some other cookie values like the Adwords or DoubleClick ones. This will vary depending on your current setup. For now we'll be just focusing on the Google Analytics 4 ones.
GA4 Cookies Info
If we look back into Google Analytics history, we used to have the _utma, _utmb, _utmc and _utmz cookies for Urchin (urchin.js) and the first Google Analytics version (ga.js). At that time all the client/session tracking (a, b, c cookies) and the attribution info (z cookie) were calculated client-side. Then, when Universal Analytics was released, all this logic was moved to the server-side and Google switched to using just one cookie ( _ga ) to hold the clientId ( &cid ).
Now it seems we have switched to a hybrid method, where the session calculation is done client-side again and the attribution is calculated on the server-side. That's why we have a new cookie (in addition to the _ga one) that is used to hold the current session_id ( &sid ), session_count ( &sct ) and session_time, and, based on this report, also to drive the session_start ( &_ss ) and first_visit ( &_fv ) internal events.
Let's take a look at a typical Google Analytics 4 cookie set:
_ga: GA1.1.961428208.1648573693
_ga_THYNGSTER: GS1.1.1649176313.5.1.1649181571.0
There have not been any changes in our well-known "_ga" cookie. It still holds the current hostname segments count, a random id and lastly the time when the cookie was created for the first time. Here is a table showing the values of the new _ga_ cookie (I need to double check 2 of the values to be sure what they are for, that's why I've set them as TBD):
Value | Description |
---|---|
GS1 | Fixed String: Google Stream 1 |
1 | Current hostname segments count |
1649176313 | Current Session Start Time |
5 | Sessions Count |
1 | TBD |
1649181571 | Last hit Time |
0 | TBD |
This is how GA4 is able to determine when a session starts, the session duration, and the current session count. By checking the _ga cookie it is even able to determine the first visit time.
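As an illustration of the two tables above, here is a small sketch that turns both cookie values into objects. It assumes the dot-separated layout shown in this post (GA1.x for `_ga`, GS1.x for the stream cookie); the function names and the `unknown1` / `unknown2` fields (the two TBD values) are my own labels.

```javascript
// "_ga" cookie: GA1.<host segments>.<random id>.<first seen timestamp>
function parseGaCookie(value) {
  var p = value.split(".");
  return {
    version: p[0],
    hostSegments: Number(p[1]),
    clientId: p.slice(2).join("."), // what is sent as &cid
    firstSeen: Number(p[3])         // seconds since epoch
  };
}

// "_ga_<STREAM>" cookie, following the table above
function parseSessionCookie(value) {
  var p = value.split(".");
  return {
    version: p[0],
    hostSegments: Number(p[1]),
    sessionStart: Number(p[2]),
    sessionCount: Number(p[3]),
    unknown1: p[4],                 // TBD in the table
    lastHit: Number(p[5]),
    unknown2: p[6]                  // TBD in the table
  };
}

var gaCookie = parseGaCookie("GA1.1.961428208.1648573693");
var sessionCookie = parseSessionCookie("GS1.1.1649176313.5.1.1649181571.0");

// Seconds elapsed between the session start and the last hit
var sessionSeconds = sessionCookie.lastHit - sessionCookie.sessionStart; // 5258
```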
Looking Inside the Google Analytics 4 Cross-Domain Tracking
If we take a look at the official docs, there's not much info about how to customize the cross-domain tracking, besides telling us to add our domains in the admin.
This can be kind of an issue/limitation if our setup is not just based on clicks or form submits. I can think of some examples, like wanting to do cross-domain linking to an iFrame, or a site that redirects to a destination page that is generated dynamically (for example, forms that do a validation and then redirect the user to a search listing page, without doing any form submit at all).
These situations won't be handled by GA4, and in these cases we'd want to build the linkerParam for GA4 ourselves so we can attach it wherever we want.
The good point is that, even if it's not documented, we can use some global variables to grab the data, and even make use of some available helpers for generating the linker, decorating links or anything we want :)
The first thing we'll learn about is the window.google_tag_data.gl object. This global variable holds ALL the cross-domain config and information, that is, the current data model for the Google Linker Config.
It looks like this:
{
"decorators": [
{
"domains": [
/outgoing\.com/
],
"fragment": false,
"placement": 1,
"forms": true,
"sameHost": false
}
],
"data": {
"query": {},
"fragment": {}
},
"init": true
}
Key | Description |
---|---|
decorators[:decorator] | This holds the current decorators for the cross-domain tracking. We may have more than 1 decorator, since we may have an AdWords gtag container adding one and GA4 adding another one. |
:decorator.domains | An array of regexes to be matched against the current clicked link |
:decorator.fragment | Should we attach the _gl parameter to the fragment (#) |
:decorator.forms | Is this decorator for "submit" events |
:decorator.placement | The current order preference for applying the decorator |
:decorator.sameHost | N/A |
data{} | Current Linker Info |
data.query | Cross-linking parameters read from the QueryString ( decoded ) |
data.fragment | Cross-linking parameters read from the Fragment ( decoded ) |
It's simpler than it looks at first glance. This variable allows us to know if there are any "decorators" configured (a decorator is meant to "add" the linkerParam ( _gl ) to any Link or Form matching the domains on the list).
Also, if there was a valid linker parameter on the current page load, the data object will show us which clientIds (cookies) were overridden. Nice!
Note: In the event of an invalid linker (if it was generated on a different browser or some minutes ago), the linker won't work and the data here will show empty.
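To get a feel for the model, the sketch below checks which decorators would match a given link hostname. It only relies on the fields listed in the table above and assumes that `domains` holds RegExp objects, as in the dump shown earlier. Matching against the hostname (rather than the full URL) and the helper name `findMatchingDecorators` are my own choices; the real gtag.js matching logic may differ.

```javascript
// Return the decorators whose "domains" list matches the given hostname
function findMatchingDecorators(gl, hostname) {
  if (!gl || !gl.decorators) return [];
  return gl.decorators.filter(function (decorator) {
    return (decorator.domains || []).some(function (pattern) {
      return pattern.test(hostname);
    });
  });
}

// Same shape as the window.google_tag_data.gl dump shown above
var sampleLinkerModel = {
  decorators: [
    { domains: [/outgoing\.com/], fragment: false, placement: 1, forms: true, sameHost: false }
  ],
  data: { query: {}, fragment: {} },
  init: true
};

var matchingDecorators = findMatchingDecorators(sampleLinkerModel, "shop.outgoing.com");
// matchingDecorators.length -> 1
```

On a live page you would pass `window.google_tag_data && window.google_tag_data.gl` instead of the sample object.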
Trick #1 - Adding a new domain name for the auto-decoration
There's a small trick you can use to add new domains dynamically to the GTAG Decorator. Remember we said that the current settings in the Admin were reflected in some code in our GTAG containers? After checking the current code, it turns out we can push new domains programmatically from JS, just like this:
// Only push when the linker model and at least one decorator already exist
var gl = window.google_tag_data && window.google_tag_data.gl;
if (gl && gl.decorators && gl.decorators.length > 0) {
    gl.decorators[0].domains.push(/analytics-ninja\.com/);
}
glBridge Utilities
In case we want to grab the current linker so we can add it ourselves to any link, there is also some good news: analytics.js exposes some utilities for performing this task.
The utils are available on the window.google_tag_data.glBridge object.
These are the same utilities we used to have on Universal Analytics for setting the autoLinking, the decoration of links and the linkerParam generation. We'll just focus on the generate and get ones: the first one is equivalent to getLinkerParam, and the second one allows us to "unhash" the linker values.
google_tag_data.glBridge.generate({})
This function takes an object of clientId values as an argument and returns a valid "_gl" linker value that we can attach to our links.
window.google_tag_data.glBridge.generate({
_ga: '121321321321.2315648466',
_ga_THYNGSTER: '1649176313.5.1.1649183273.0'
});
As you can see, we just need to grab our current cookie values and pass them to the function, and it will return our precious linkerParam :)
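Putting the pieces together, here is a hedged sketch of how the cookie values could be collected and the resulting linker attached to a URL by hand (for example, an iFrame src). `collectLinkerCookies` and `appendLinker` are helper names invented for this post, the outgoing.com URL is a placeholder, and the only Google function used is the `glBridge.generate` call described above.

```javascript
// Build the { _ga: ..., _ga_<STREAM>: ... } object from a document.cookie style string,
// dropping the "GA1.1." / "GS1.1." prefixes as in the example above
function collectLinkerCookies(cookieString) {
  var ids = {};
  cookieString.split(/;\s*/).forEach(function (pair) {
    var eq = pair.indexOf("=");
    if (eq === -1) return;
    var name = pair.slice(0, eq);
    var value = pair.slice(eq + 1);
    if (!/^(_ga|_ga_[A-Z0-9]+)$/.test(name)) return;
    var match = value.match(/^G[AS]1\.\d\.(.+)$/);
    if (match) ids[name] = match[1];
  });
  return ids;
}

// Add "_gl=<value>" to the query string, keeping any #fragment at the end
function appendLinker(url, linkerValue) {
  var hashIndex = url.indexOf("#");
  var base = hashIndex === -1 ? url : url.slice(0, hashIndex);
  var hash = hashIndex === -1 ? "" : url.slice(hashIndex);
  var separator = base.indexOf("?") === -1 ? "?" : "&";
  return base + separator + "_gl=" + encodeURIComponent(linkerValue) + hash;
}

// Browser-only part: runs only when the bridge is actually available
if (typeof window !== "undefined" && window.google_tag_data && window.google_tag_data.glBridge) {
  var linkerValue = window.google_tag_data.glBridge.generate(collectLinkerCookies(document.cookie));
  var decoratedUrl = appendLinker("https://outgoing.com/widget", linkerValue);
  console.log(decoratedUrl);
}
```

Since the linker expires quickly, it makes sense to generate it right before it is used rather than on page load.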
google_tag_data.glBridge.get()
This one is also pretty self-explanatory: it grabs the linker param from the current URL (if present) and returns the decoded client cookie/id values held in the linker.
Advice
Please note that analytics.js (the Universal Analytics library) is likely to be gone in about 1.5 years. I don't expect Google to remove the library itself; more likely it will just stop processing hits at some point (or maybe the library will be modified so it doesn't fire hits at all). At the moment the gtag.js container doesn't expose these bridge functions, but it may do so in the near future.
If you're looking for some guidance on how to implement this functionality without relying on that library, I'm providing some examples (reversed from the analytics.js code).
If the current linkerParam value is "invalid" (i.e. it was not generated from the same browser or it was generated long ago), glBridge.get() will just return an empty object {}
NOTE: I'm working on a library that fully replicates this analytics.js Google Linker Bridge functionality, to have a future-proof solution for when Universal Analytics is sunsetted. It will be published in the coming weeks.
If you're interested in all this, I'm publishing some proof of concept functions that you could use as a base for your coding. This code should be adapted to support the Adwords, DoubleClick, multiple GA4 cookies, Google Signals ids ( _gid cookie ) and Google Remarketing Cookies ( _gac ) before we can say that it's a good replacement. But at this point I'm offering these snippets (all of them reversed/copied from the analytics.js source code):
var decrypt_cookies_ids = function(a) {
    // Copy the own properties of b into a
    var m = function(a, b) {
        for (var c in b) b.hasOwnProperty(c) && (a[c] = b[c])
    };
    // Return the window.google_tag_data.gl model, creating an empty one if it is missing
    var H = function() {
        var a = {};
        var b = window.google_tag_data;
        window.google_tag_data = void 0 === b ? a : b;
        a = window.google_tag_data;
        b = a.gl;
        b && b.decorators || (b = {
            decorators: []
        }, a.gl = b);
        return b
    };
    var b = H();
    // gtag.js normally fills "data" on page load. The URL parser that the original
    // code called at this point is not part of this snippet, so fall back to an empty model.
    b.data || (b.data = {
        query: {},
        fragment: {}
    });
    var c = {};
    // Always merge the query values; merge the fragment ones only when "a" is truthy
    if (b = b.data) m(c, b.query), a && m(c, b.fragment);
    return c
}
var generateLinkerParam = function(a) {
// Function to properly grab ID's from Cookies
var getCookiebyName = function(name) {
var pair = document.cookie.match(new RegExp(name + '=([^;]+)'));
return !!pair ? pair[1].match(/GA1\.[0-9]\.(.+)/)[1] : undefined;
};
// These are the 3 values used by the new linker
var cookies = {
_ga: getCookiebyName("_ga"),
// Google Analytics GA ID
_gac: undefined,
// Google Remarketing
_gid: getCookiebyName("_gid")// Google ID
};
// Calculate current browser_fingerprint based on UA, time, timezone and language
//
var browser_fingerprint = (function(a, b) {
// Cache for the CRC lookup table. It must start out undefined so that the table below gets built
var F;
a = [window.navigator.userAgent, (new Date).getTimezoneOffset(), window.navigator.userLanguage || window.navigator.language, Math.floor((new Date).getTime() / 60 / 1E3) - (void 0 === b ? 0 : b), a].join("*");
if (!(b = F)) {
b = Array(256);
for (var c = 0; 256 > c; c++) {
for (var d = c, e = 0; 8 > e; e++)
d = d & 1 ? d >>> 1 ^ 3988292384 : d >>> 1;
b[c] = d
}
}
F = b;
b = 4294967295;
for (c = 0; c < a.length; c++)
b = b >>> 8 ^ F[(b ^ a.charCodeAt(c)) & 255];
return ((b ^ -1) >>> 0).toString(36);
}
)();
// Function to hash the cookie value
// The following functions takes a string and returns a hash value.
var hash_cookie_value = function(val) {
var A, C, D = function(a) {
A = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.";
C = {
"0": 52,
"1": 53,
"2": 54,
"3": 55,
"4": 56,
"5": 57,
"6": 58,
"7": 59,
"8": 60,
"9": 61,
"A": 0,
"B": 1,
"C": 2,
"D": 3,
"E": 4,
"F": 5,
"G": 6,
"H": 7,
"I": 8,
"J": 9,
"K": 10,
"L": 11,
"M": 12,
"N": 13,
"O": 14,
"P": 15,
"Q": 16,
"R": 17,
"S": 18,
"T": 19,
"U": 20,
"V": 21,
"W": 22,
"X": 23,
"Y": 24,
"Z": 25,
"a": 26,
"b": 27,
"c": 28,
"d": 29,
"e": 30,
"f": 31,
"g": 32,
"h": 33,
"i": 34,
"j": 35,
"k": 36,
"l": 37,
"m": 38,
"n": 39,
"o": 40,
"p": 41,
"q": 42,
"r": 43,
"s": 44,
"t": 45,
"u": 46,
"v": 47,
"w": 48,
"x": 49,
"y": 50,
"z": 51,
"-": 62,
"_": 63,
".": 64
};
for (var b = [], c = 0; c < a.length; c += 3) {
var d = c + 1 < a.length
, e = c + 2 < a.length
, g = a.charCodeAt(c)
, f = d ? a.charCodeAt(c + 1) : 0
, h = e ? a.charCodeAt(c + 2) : 0
, p = g >> 2;
g = (g & 3) << 4 | f >> 4;
f = (f & 15) << 2 | h >> 6;
h &= 63;
e || (h = 64,
d || (f = 64));
b.push(A[p], A[g], A[f], A[h])
}
return b.join("")
};
return D(String(val));
};
// Now we have all the data Let's build the linker String! =)
// First value is a fixed "1" value, the current GA code does the same. May change in a future
return ["1", browser_fingerprint, "_ga", hash_cookie_value(cookies._ga), "_gid", hash_cookie_value(cookies._gid)].join('*');
};
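The checksum loop inside browser_fingerprint appears to be a standard CRC-32 (the constant 3988292384 is the usual reflected polynomial 0xEDB88320), printed in base 36. As a sanity check, here is the same algorithm isolated into two small functions. `buildCrcTable`, `crc32` and `fingerprintOf` are my own names, and which exact strings GA4 feeds into the checksum is an assumption taken from the snippet above.

```javascript
// Build the 256-entry CRC-32 lookup table (same loop as in the snippet above)
function buildCrcTable() {
  var table = new Array(256);
  for (var n = 0; n < 256; n++) {
    var value = n;
    for (var bit = 0; bit < 8; bit++) {
      value = value & 1 ? (value >>> 1) ^ 3988292384 : value >>> 1;
    }
    table[n] = value;
  }
  return table;
}

var CRC_TABLE = buildCrcTable();

// CRC-32 of a string, returned as an unsigned 32-bit number
function crc32(text) {
  var crc = 4294967295;
  for (var i = 0; i < text.length; i++) {
    crc = (crc >>> 8) ^ CRC_TABLE[(crc ^ text.charCodeAt(i)) & 255];
  }
  return (crc ^ -1) >>> 0;
}

// Join the parts with "*" and print the checksum in base 36, like the linker does
function fingerprintOf(parts) {
  return crc32(parts.join("*")).toString(36);
}

// "123456789" is the usual CRC-32 test input, with 0xCBF43926 as the expected result
var crcCheck = crc32("123456789") === 0xCBF43926; // true
```

Because the input includes the current minute, two fingerprints generated a few minutes apart will differ, which would explain the short lifetime of the linker.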