In Google Analytics, URLs are case sensitive, so in our content reports /index.html is a different page from /Index.html. Query string values have the same effect: even when it’s the same page, Google Analytics records it as a new one. For example, /index.html?cache=off and /index.html?cache=on will be recorded as two different pages.
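To make the fragmentation concrete, here is a small sketch that tallies hits by their exact page string, which is roughly how a content report groups them. The sample paths and the tallyByPage helper are made up for illustration; this is not how Google Analytics is implemented internally.

```javascript
// Tally hits by exact page string (hypothetical sample data).
function tallyByPage(pages) {
  var counts = {};
  for (var i = 0; i < pages.length; i++) {
    counts[pages[i]] = (counts[pages[i]] || 0) + 1;
  }
  return counts;
}

var hits = [
  '/index.html',
  '/Index.html',
  '/index.html?cache=off',
  '/index.html?cache=on'
];

var report = tallyByPage(hits);
console.log(Object.keys(report).length); // 4 rows for what is really one page
```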
The first problem is easily fixed with a lowercase filter in the view, but the query string parameters are going to be a problem. I bet you’re saying that you can just add them to the Exclude URL Query Parameters list in your view configuration page, and yes, that’s right. But I’m pretty sure you’ll end up with marketing campaigns adding new parameters, or IT adding parameters to switch on some functionality (like enabling a caching feature or whatever).
So today we’ll use Google Tag Manager to solve the problem of content reports being fragmented by unexpected query string parameters in our pages. Let’s think about it: wouldn’t it be easier to identify the real parameters and get rid of the rest, which the page doesn’t need in order to work? It’s likely a better way to do it: we can know which parameters our site uses, but we can’t anticipate the unexpected ones.
To achieve this, we’re going to use just one variable in Google Tag Manager. Yeah, that’s it: a single Custom JavaScript variable.
We’ll just need to configure the paramsList array at the top of the code and add all the query string parameters that we want to keep. Any parameter not listed in the array will be removed from the query string value recorded by Google Analytics.
    function() {
      try {
        // Define the query string parameters we want to keep in our reports
        var paramsList = ["two", "one", "three"];

        // Cross-browser Array.prototype.indexOf polyfill
        if (!Array.prototype.indexOf) {
          Array.prototype.indexOf = function (searchElement /*, fromIndex */) {
            "use strict";
            if (this == null) {
              throw new TypeError();
            }
            var t = Object(this);
            var len = t.length >>> 0;
            if (len === 0) {
              return -1;
            }
            var n = 0;
            if (arguments.length > 0) {
              n = Number(arguments[1]);
              if (n != n) { // shortcut for verifying if it's NaN
                n = 0;
              } else if (n != 0 && n != Infinity && n != -Infinity) {
                n = (n > 0 || -1) * Math.floor(Math.abs(n));
              }
            }
            if (n >= len) {
              return -1;
            }
            var k = n >= 0 ? n : Math.max(len - Math.abs(n), 0);
            for (; k < len; k++) {
              if (k in t && t[k] === searchElement) {
                return k;
              }
            }
            return -1;
          };
        }

        // Rebuild the query string, keeping only the permitted parameters
        var qsParamsSanitizer = function(qs, permitted_parameters) {
          var pairs = qs.slice(1).split('&');
          var result = {};
          pairs.forEach(function(pair) {
            pair = pair.split('=');
            result[pair[0]] = decodeURIComponent(pair[1] || '');
          });
          var qsParamsObject = JSON.parse(JSON.stringify(result));
          for (var p in qsParamsObject) {
            if (permitted_parameters.indexOf(p) == -1) delete qsParamsObject[p];
          }
          var rw_qs = '?' + Object.keys(qsParamsObject).map(function(key) {
            return encodeURIComponent(key) + '=' + encodeURIComponent(qsParamsObject[key]);
          }).join('&');
          if (rw_qs == "?") rw_qs = "";
          return rw_qs;
        };

        return qsParamsSanitizer(document.location.search, paramsList);
      } catch (e) {
        // Let GA use the current location.href if
        // for some reason our code fails.
        return undefined;
      }
    }
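If you want to try the whitelisting idea outside Google Tag Manager, the following is a minimal sketch of the same filtering step as a plain function. It is a simplified variant, not the variable above: it keeps permitted pairs exactly as they appear instead of decoding and re-encoding them, and the function name and sample query strings are made up for illustration.

```javascript
// Simplified, standalone version of the whitelist filter (illustrative only).
function sanitizeQueryString(qs, permittedParameters) {
  var kept = [];
  var pairs = qs.replace(/^\?/, '').split('&');
  for (var i = 0; i < pairs.length; i++) {
    if (pairs[i] === '') continue; // skip empty segments such as "?" or "&&"
    var name = pairs[i].split('=')[0];
    if (permittedParameters.indexOf(name) !== -1) {
      kept.push(pairs[i]); // keep the pair exactly as it appeared
    }
  }
  return kept.length ? '?' + kept.join('&') : '';
}

var paramsList = ['two', 'one', 'three'];
console.log(sanitizeQueryString('?one=1&cache=off&debug=true', paramsList)); // "?one=1"
console.log(sanitizeQueryString('?cache=on', paramsList)); // ""
```

Running it in a browser console or in Node.js with a few of your own URLs is a quick way to check that the list of permitted parameters is complete before publishing the container.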
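Now we only need to set the “page” field of our pageview tag so that Google Analytics uses the sanitized query string instead of the one in the URL. Keep in mind that the variable returns only the query string, so the field value needs the page path in front of it.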
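As a rough sketch of the value the “page” field should end up with (the path followed by the filtered query string), the snippet below computes it for a couple of sample URLs. It assumes the URL and URLSearchParams APIs, which are available in modern browsers and Node.js but are not used by the variable above; the function name and the example.com addresses are hypothetical.

```javascript
// Sketch only: computes the path plus the whitelisted query string.
function expectedPageValue(href, permittedParameters) {
  var url = new URL(href);
  var kept = new URLSearchParams();
  url.searchParams.forEach(function (value, name) {
    if (permittedParameters.indexOf(name) !== -1) {
      kept.append(name, value);
    }
  });
  var qs = kept.toString();
  return url.pathname + (qs ? '?' + qs : '');
}

var paramsList = ['two', 'one', 'three'];
console.log(expectedPageValue('https://www.example.com/index.html?cache=off', paramsList));
// "/index.html"
console.log(expectedPageValue('https://www.example.com/index.html?cache=on&one=1', paramsList));
// "/index.html?one=1"
```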
We’re done! Let’s see how it works with a screenshot.
Now you just need to sit back and wait a few hours to start seeing clean reports with no fragmentation. Happy analyzing!