
Google Analytics 4 (GA4) Events Demystified


David Vallejo

At this point, many ( if not all ) of us have heard that Google Analytics is moving to an "events"-based tracking model with Google Analytics 4. But what does it really imply? Do we have to worry about it? To be honest, it's not a big deal from the implementation side, since we have already been using "events" all the time; we just used to call them hit types. From the reporting side it may lead to some "hard times" when trying to use the data, not because it's better or worse, just because it's different.

This post will try to explain Google Analytics 4 events from a technical perspective: how the current event model works, where the events can come from, the limitations, etc.

I'd say that one of the most important things when working with GA4 is realizing how important the data model definition we do at the start is going to be, because it will condition the future of our implementation and data.

But don't worry about this for now, we'll dig into it across the post.

How does Google Analytics 4 record the data

Google Analytics 4 works very similarly to Universal Analytics.

We'll be sending hits (network requests) to a specific endpoint ( https://endpoint.url/collect ). This shouldn't be anything new for anyone, that's how all analytics tools and pixels work. And this is the way it works for the client-side tracking (gtag.js), server-side tracking ( measurement protocol ), and the app tracking ( Firebase Analytics SDK ).

Tracking endpoints

I found there are 5 different endpoints that we could use to send the data to Google Analytics 4, these are:

  • https://www.google-analytics.com/g/collect
  • https://analytics.google.com/g/collect
  • https://custom.domain/g/collect (this will really forward the hits to the first one on this list)
  • https://app-measurement.com
  • https://www.google-analytics.com/mp/collect

Depending on where we are doing the tracking we'll be using one of them.

We could see hits flowing to 4 different endpoints for GA4 + 1 for Firebase

The first two endpoints are the ones used by the client-side tracking, but you may wonder why we sometimes see the hits going through analytics.google.com and other times through the google-analytics.com domain. The reason is that if the current GA4 property has "Enable Google signals data collection" turned on, GA4 will use the *.google.com endpoint ( so Google is able to use their cookies to identify the users, I guess ).
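As a rough illustration of that rule, the choice between the two client-side hosts can be sketched as below. The function name and the googleSignalsEnabled flag are hypothetical; the real decision is made somewhere inside the gtag.js container, and this only mirrors the behaviour I observed.

```javascript
// Hypothetical helper: mirrors the observed behaviour, not Google's actual code.
function pickCollectEndpoint(googleSignalsEnabled) {
  // With Google signals turned on, hits were seen going to analytics.google.com.
  const host = googleSignalsEnabled
    ? 'analytics.google.com'
    : 'www.google-analytics.com';
  return 'https://' + host + '/g/collect';
}

console.log(pickCollectEndpoint(true));
console.log(pickCollectEndpoint(false));
```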

JavaScript Client Library

The page tracking is done using a library provided by Google, the same way we used to have analytics.js , ga.js or urchin.js libraries in the past Google Analytics versions.

The default code snippet will look like this:

<!-- Global site tag (gtag.js) - Google Analytics -->
<script async src="https://www.googletagmanager.com/gtag/js?id=G-THYNGSTER"></script>
<script>
  window.dataLayer = window.dataLayer || [];
  function gtag(){dataLayer.push(arguments);}
  gtag('js', new Date());

  gtag('config', 'G-THYNGSTER');
</script>

As you may have noticed, the snippet loads a JavaScript file from the www.googletagmanager.com domain. This is because all gtag.js snippets are in essence a predefined Google Tag Manager template. It's not just a plain GTM container, since it does some internal stuff, but it also works based on tags, triggers, and variables.

Previous tracking libraries offered a public API to perform all the tracking at our end, i.e. the library accepted some methods/calls and converted them to hits, handled the cross-domain tracking, and allowed us to use Tasks, while at the same time doing some logic for generating the cookies and reading the browser details. And this same library was shared across all the users worldwide.

This is no longer the case: now each Data Stream / Measurement ID has its own snippet and loads a separate js file. We may look at this as a performance penalty, but it's done this way for a reason.

Each gtag.js container is now built dynamically at Google's end. It contains personalized code for the current property and also holds the settings for the current Data Stream / Measurement ID. That's why the container size is different for each container we check. Don't worry, this is normal and expected. The container size will vary depending on many things, like whether we have the Enhanced measurement features enabled or the settings we defined on the admin interface for our property.
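To make the snippet a bit more tangible: as its own definition shows, gtag() just pushes its arguments into the dataLayer array, and the container picks them up from there. The sketch below uses a local array instead of window.dataLayer so it can also run outside the browser; the event name and its parameter are made up for the example.

```javascript
// Local stand-in for window.dataLayer so the sketch also runs in Node.
const dataLayer = [];
function gtag() { dataLayer.push(arguments); }

gtag('js', new Date());
gtag('config', 'G-THYNGSTER');

// Sending a custom event: command, event name, event parameters.
// 'newsletter_signup' and 'signup_location' are illustrative names.
gtag('event', 'newsletter_signup', { signup_location: 'footer' });

// Each dataLayer entry is the raw arguments object of one gtag() call.
const last = Array.from(dataLayer[dataLayer.length - 1]);
console.log(last[0], last[1], last[2]);
```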

GA4 Containers Sizes

One thing that had been confusing me since Google Analytics 4 arrived was thinking that there were lots of things happening in the background that were hardly possible to debug, like the conversions or the created / modified events.

And well, that's not the way it works: almost any setting or feature you enable on the admin is going to be translated into code that will be executed client-side. This means that when you add a new event on the interface, some code is added to the gtag.js container that will send that event, so you "may" end up seeing "ghost" events on the browser. Don't waste your time like I did trying to work out why your implementation was firing duplicated events :). The same happens, for example, when we define a conversion event, or when we configure our internal domains or the ignored referrals.

While this approach may help some people with common tracking tasks, on the other hand it prevents some advanced implementations, because some "loved" features like the "customTasks" are now missing. I'm ok with Google trying to control how things are done, but there will always be sites that need custom/personalized implementations, and I really feel that Google should provide some public/documented API methods to easily perform some of the most common tasks, like the cross-domain tracking in Google Analytics 4.

Let's see some examples. When you "create a new event" from the Admin Interface, this event won't be created server-side; what is happening is that GA4 will add some code logic to send that hit client-side.

Google Analytics 4 events creation modal

Another example would be enabling Enhanced Measurement, which will result in some code being added to your container. Remember that we mentioned that GA4 was in essence a Google Tag Manager container? If you take a look at the current measuring categories, you'll notice how they all match the triggers available on GTM ( clicks tracking, scrolls tracking, YouTube tracking ).

Enhanced measurement

And that's not all: when we change the session duration or the engagement time, some session_timeout variables will be updated internally (engagementSeconds, sessionMinutes, sessionHours).

Session Timeout Adjust

We could keep going with examples, or build a full list, but that's likely going to get outdated sooner rather than later. The main idea you need to take from this part of the post is that GTAG is like a "predefined" GTM template and that all the tracking happens on the client's browser.

Firebase Analytics SDK

Apps are usually tracked using the Firebase Analytics SDK . A good starting point would be visiting the following Url: https://firebase.google.com/docs/analytics/get-started?platform=android&hl=en

The app hits use their own endpoint and format: they go to https://app-measurement.com and the payload is sent in binary format, which makes it really difficult to debug, even when using Charles, Fiddler, or any other MITM proxy app.

If you want to debug your Firebase implementation, I recommend you use my Android Debugger for Windows. Once you install the app, you'll be able to request a free lifetime license.

Android Debugger Splash Screen

Google Analytics 4 Measurement Protocol

Google Analytics finally offers a proper Measurement "Protocol", which at the time of writing this post is in Beta stage.

This protocol will use the https://www.google-analytics.com/mp/collect endpoint, and rather than having the developers build the request payloads using some non-intuitive keys, now it accepts a POST request with a JSON string attached to the body using application/json Content-Type:

fetch('https://www.google-analytics.com/mp/collect?measurement_id=G-THYNGSTER&api_secret=12zneF6DSDFSDFjJPgDAzzQ', {
  method: "POST",
  headers: {
     'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    "client_id": "12345678.87654321",
    "user_id": "RandomUserIdHash",
    // Up to 25 events can be sent in a single request
    "events": [{
      "name": "follow_me_at_twitter",
      "params": {
        "twitter_handle": "@thyng",
        "value": 7.77
      }
    },{
      "name": "follow_intent",
      "params": {
        "status": "success"
      }
    }]
  })
});

Key | Type | Description
client_id | str | Required.
user_id | str | Optional.
timestamp_micros | int | Optional. Hit timestamp offset. Can be up to 3 days ( 2.592e+11 microseconds ) in the past, relative to the property's defined timezone.
user_properties | {} | Optional.
non_personalized_ads | bool | Optional. ( whether these events may be used for ads personalization )
events[] | [] | Required. ( Max 25 events per request )
events[].name | str | Required.
events[].params | {} | Optional.

In any case, there are some things you need to keep in mind. You should keep your API Secret private, meaning that this endpoint should not be used client-side, because that would expose your API Secret. This endpoint is more likely to be used to track offline interactions ( like refunds ) or to track our transactions server-side.

At the time of writing this post ( Apr 2022 ), one of the biggest handicaps of this protocol is that it doesn't support any sessionId parameter, meaning that you won't be able to stitch the server-side hits to the client-side session. This should be fixed over the next months.

In the meanwhile, I've published the GA4 Payload Parameters CheatSheet, which you could use to send some server-side hits the old-school way ( like we used to do with the first Measurement Protocol for Universal Analytics ) and where you could attach the "&sid" parameter.

There are of course some other points to keep in mind, like the fact that GA4 has some reserved event and parameter names that you should not be using. We'll cover this later in the "events" section.
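As a minimal sketch of how this could look on a Node server, the helper below builds the request without sending it, and checks the two hard requirements mentioned so far ( a client_id, and no more than 25 events ). The function name, the GA4_API_SECRET environment variable, and the sample refund event parameters are assumptions made for the example, not part of any official library.

```javascript
// Server-side sketch. The API secret is read from an environment variable
// (GA4_API_SECRET is a made-up name) so it never ends up in client-side code.
const MP_ENDPOINT = 'https://www.google-analytics.com/mp/collect';
const MAX_EVENTS_PER_REQUEST = 25;

function buildMpRequest(measurementId, apiSecret, clientId, events) {
  if (!clientId) throw new Error('client_id is required');
  if (!Array.isArray(events) || events.length === 0) {
    throw new Error('at least one event is required');
  }
  if (events.length > MAX_EVENTS_PER_REQUEST) {
    throw new Error('max 25 events per request');
  }
  const query = new URLSearchParams({
    measurement_id: measurementId,
    api_secret: apiSecret,
  });
  return {
    url: MP_ENDPOINT + '?' + query.toString(),
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ client_id: clientId, events: events }),
    },
  };
}

const request = buildMpRequest(
  'G-THYNGSTER',
  process.env.GA4_API_SECRET || 'dummy-secret',
  '12345678.87654321',
  [{ name: 'refund', params: { transaction_id: 'T-1', value: 7.77 } }]
);
// On the server you would then run: fetch(request.url, request.options)
console.log(request.url);
```

Building the request separately from sending it keeps the validation easy to test, and makes it obvious that the secret only ever lives on the server.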

Events Model / Hit Types

Let's start by saying that everything on Google Analytics 4 is an "event". I'm sure that it's not the first time you hear that, and it's totally right, but at the same time, if we strictly look at Universal Analytics, we were also sending "events"; we just used to call them "hit types".

In a technical sense, nothing has changed at all. We have network requests to some endpoints. That is it! If you want to learn a bit more about how the hits are built or sent from the web tracking library, you can take a look at the GA4: Google Analytics Measurement Protocol version 2 post.

The main difference on GA4 is that Google no longer offers a fixed tracking data model besides the page_views and the e-commerce, meaning that the responsibility for building a proper data model falls on us. While working on our definition we need to keep in mind that there are some predefined/reserved event and parameter names, and that there are some limits we need to take into account ( on total events, and on name and value lengths ).

Universal Analytics Hit Types Model

If we take a closer look, since Urchin times we've been using "events" for our tracking in Google Analytics. Yep, I'm not joking, we had them, we just called them "hit types".

Just so you know, we could replicate the current Universal Analytics Data Model in Google Analytics 4 following the next table of events:

Hit Type / Event | Parameters
pageview | Location, Path, Title
event | Category, Action, Label, Value, Non Interaction
timing | Category, Variable, Label, Value
social | Network, Action, Opt. Target
exception | Description, Fatal
screenview | Screen Name
transaction ( Legacy Ecommerce ) | Id, Affiliation, Revenue, Tax, Shipping, Coupon
item ( Legacy Ecommerce ) | Id, Name, Brand, Category, Variant, Price, Quantity

Google even offers a setting that will automatically convert all your ga() calls to some predefined events on GA4. From your Data Stream configuration you can enable this feature, and all event, timing, and exception hits will be converted to GA4 events ( it adds a listener to the ga('send', 'event|exception|timing') calls to do this ).

This tool will map the data in the following way:

Event Name | Parameters
[event_name] ( this will take the current eventAction ) | eventCategory > event_category, eventAction > event, eventLabel > event_label, eventValue > value
timing_complete | timingCategory > event_category, timingLabel > event_label, timingValue > value, timingVar > name
exception | exDescription > description, exFatal > fatal

Beware: since it's converting all Event Actions into event names, depending on your current events definition on Universal Analytics you may end up hitting the unique event names limit (500).
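The mapping in the table above can be written down as a small function. This is only my reading of the table, not the code Google runs; the function name and the shape of the input object ( using the classic analytics.js field names ) are assumptions.

```javascript
// Sketch of the UA -> GA4 mapping table; not Google's actual implementation.
function mapUaHitToGa4(hit) {
  switch (hit.hitType) {
    case 'event':
      return {
        name: hit.eventAction, // the Event Action becomes the GA4 event name
        params: {
          event_category: hit.eventCategory,
          event_label: hit.eventLabel,
          value: hit.eventValue,
        },
      };
    case 'timing':
      return {
        name: 'timing_complete',
        params: {
          event_category: hit.timingCategory,
          event_label: hit.timingLabel,
          value: hit.timingValue,
          name: hit.timingVar,
        },
      };
    case 'exception':
      return {
        name: 'exception',
        params: { description: hit.exDescription, fatal: hit.exFatal },
      };
    default:
      return null; // hit types not covered by the table
  }
}

const converted = mapUaHitToGa4({
  hitType: 'event',
  eventCategory: 'Video',
  eventAction: 'play',
  eventLabel: 'intro',
  eventValue: 1,
});
console.log(converted.name, converted.params);
```

It also shows why the 500 names limit is easy to hit: every distinct eventAction value turns into a distinct GA4 event name.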

Google Analytics 4 Events

Event Sources

The events on Google Analytics 4 can come from 4 different sources. These are:

  • Public Web/App endpoint.
  • Measurement Protocol ( Server Side )
  • Internal self-generated events
  • Admin defined events

Public Web Endpoint

We've already talked about the main origin for GA4 events. These are the events generated on our site by the gtag.js container ( check the GA4 Payload Parameters CheatSheet here ).

Measurement Protocol ( Server Side )

Another source for our events is the Measurement Protocol. This works similarly to the public endpoint, but the hits are sent server-side and we'll need to use an API Secret within our requests.

Internal self-generated Events

This one can be a bit confusing, GA4 auto-generates some of the events we see in the reports. This means that we see some events in our reports that won't be seen in our browser.

This doesn't mean that they're being generated randomly or using some server-side logic. Most ( if not all ) of these events are created because a parameter was added to some event.

Our event payloads may sometimes have some extra parameters attached that will make GA4 internally spawn a separate event. As far as I've been able to identify, this is the list of the internally generated events and the parameter that triggers each of them.

Event Name | Trigger
session_start | &_ss
first_visit | &_fv
user_engagement | &seg

For example, if the current event payload contains a &_ss parameter, a session_start will be generated; if it contains a &_fv, then we should be able to see a first_visit event, and so on. This list may grow in the future ( and it may be missing some events that I've not been able to spot yet ).

If we've enabled the Enhanced Measurement, we may also see some extra events in our reports ( this time these events will be visible within the browser requests ). These are:
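When debugging, the same lookup can be done in a few lines. The sketch below takes the query string of a hit and returns the events we would expect GA4 to spawn from it, based only on the three triggers I've listed; the function name and the sample query string are invented for the example.

```javascript
// Parameter -> internally generated event, as listed in the table above.
const AUTO_EVENT_TRIGGERS = {
  _ss: 'session_start',
  _fv: 'first_visit',
  seg: 'user_engagement',
};

// Takes the query string of a hit and returns the events that GA4
// would be expected to generate internally from it.
function autoEventsFor(queryString) {
  const params = new URLSearchParams(queryString);
  return Object.keys(AUTO_EVENT_TRIGGERS)
    .filter((key) => params.has(key))
    .map((key) => AUTO_EVENT_TRIGGERS[key]);
}

console.log(autoEventsFor('en=page_view&_ss=1&_fv=1'));
// [ 'session_start', 'first_visit' ]
```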

Event Name | Parameters
click | link_id, link_classes, link_url, link_domain, outbound
file_download | link_id, link_text, link_url, file_name, file_extension
video_play, video_pause, video_seek, video_buffering, video_progress, video_complete | video_url, video_title, video_provider, video_current_time, video_duration, video_percent, visible
view_search_results | search_term
scroll | percent_scrolled
page_view | page_referrer ( URL and Title are Shared Parameters )


On the other side, when working with the Firebase Analytics SDK, this one will automatically track a lot of events, without us needing to explicitly define them.

Here is the current list of autogenerated event names by Firebase:

ad_activeview | APP
ad_click | APP
ad_exposure | APP
ad_impression | APP
ad_query | APP
adunit_exposure | APP
app_clear_data | APP
app_install | APP
app_update | APP
app_remove | APP
error | APP
first_open | APP
in_app_purchase | APP
notification_dismiss | APP
notification_foreground | APP
notification_open | APP
notification_receive | APP
os_update | APP
screen_view | APP
user_engagement | APP

Note: These events will not count towards the unique events name limit

Admin defined events

We've already talked about these ones, when we create or modify an event within the admin section, these settings will be translated to the client-side tracking.

This means the following:

  • You may see events being fired on the browser that you didn't define on Google Tag Manager or GTAG. This is normal, don't go crazy with it. If you see a duplicate event, or a new event that you don't know where it's coming from, take a look at the Data Stream Settings.
  • You may have some unexpected parameters or event names if a "modify" rule is being used.

Events Limitations

Google Analytics 4 is full of limitations in many aspects, and it's a bit difficult to understand all of them, even more so when the limits keep changing.

We have limits for event names and values length, same for the event parameters and the user properties. At the same time, we have a limit on how many parameters and properties we can append to each event. And these limits may vary between the free and 360 versions.

There are also some exporting limitations ( the free version is capped at 1M hits exported daily to BigQuery ) and the data retention settings, where the free version tops out at 14 months while 360 allows holding up to 50 months of data.

But this is not all the limits we'll have ... we will also have limits for the total conversions, audiences, insights, and funnels we can set. This is not directly related to the events, so if you're interested you can visit the official Configuration Limits Information.

Collecting and Names Limitations

We can attach up to 25 event parameters ( 100 on GA4 360 ) to each event, and we can easily identify these values in our hits: they are the ones whose key matches "^ep(|n).*". Event parameters are meant to add some metadata to our events.

ep.event_origin: gtag

Each of these parameters should have a name no longer than 40 characters and a value not bigger than 100 characters.

At the same time, we have the "user properties". We can attach up to 25 user properties to each hit; these are attributes that describe segments of our users. For example, we could think about recording the current user's newsletter sign-up status, or the total purchases made by the current user. We can identify this data in our hits because the keys will match "^up(|n).*":

up.newsletter_opt_in: yes
upn.user_total_purchases: 43

Each of these properties should have a name no longer than 24 characters and a value not bigger than 36 characters.
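If you are inspecting hits by hand, a small helper can pull the event parameters and user properties out of a payload. This is a sketch under a couple of assumptions: the function name is made up, the input is the hit's query string, and the pattern is slightly stricter than the one quoted above because it also requires the dot after the prefix.

```javascript
// Splits the parameters of a hit into event parameters (ep. / epn.) and
// user properties (up. / upn.). Numeric variants (epn. / upn.) become Numbers.
function extractCustomFields(queryString) {
  const result = { eventParams: {}, userProperties: {} };
  for (const [key, value] of new URLSearchParams(queryString)) {
    const match = /^(ep|up)(n?)\.(.+)$/.exec(key);
    if (!match) continue; // not an event parameter or user property
    const target = match[1] === 'ep' ? result.eventParams : result.userProperties;
    target[match[3]] = match[2] === 'n' ? Number(value) : value;
  }
  return result;
}

const fields = extractCustomFields(
  'en=page_view&ep.event_origin=gtag&up.newsletter_opt_in=yes&upn.user_total_purchases=43'
);
console.log(fields);
```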

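Logged item | Limit | Value ( Free / 360 )
Events | Event Name | 40 chars
Events | Event parameter Name | 40 chars
Events | Event parameter Value | 100 chars
Events | Params per event | 25 / 100
User properties | Total per Property | 25
User properties | Property Name | 24 chars
User properties | Property Value | 36 chars
User-ID | | 256 characters
Custom dimensions | Event Scope | 50 / 125
Custom dimensions | Item Scope | 10
Custom dimensions | User Scope | 25 / 100
Custom Metrics | Event Scope | 50 / 125
Events Offset | | 3 days
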
Full Limits Table
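Since these limits are easy to forget, it can help to check events before sending them. The following is a minimal sketch that uses the free-version values from the table above; the function name and the wording of the messages are my own, and it only covers the name, value, and count limits for event parameters.

```javascript
// Limits taken from the table above (free GA4 property).
const LIMITS = {
  eventNameLength: 40,
  paramNameLength: 40,
  paramValueLength: 100,
  paramsPerEvent: 25,
};

// Returns a list of human-readable problems; an empty list means the event
// fits within the limits checked here.
function checkEventLimits(name, params) {
  const problems = [];
  if (name.length > LIMITS.eventNameLength) {
    problems.push('event name longer than 40 chars');
  }
  const keys = Object.keys(params);
  if (keys.length > LIMITS.paramsPerEvent) {
    problems.push('more than 25 parameters');
  }
  for (const key of keys) {
    if (key.length > LIMITS.paramNameLength) {
      problems.push('parameter name too long: ' + key);
    }
    if (String(params[key]).length > LIMITS.paramValueLength) {
      problems.push('parameter value too long: ' + key);
    }
  }
  return problems;
}

console.log(checkEventLimits('follow_intent', { status: 'success' }));
console.log(checkEventLimits('follow_intent', { status: 'x'.repeat(101) }));
```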

Event Values Typing

You may have noticed that some of the parameters may start with up, ep, upn, or epn. This is because an event parameter/user property can be either a string or a number. The good news is that we don't need to define the type, since it's automatically set by GA4. Just take a look at the logic that is used to decide whether a parameter is a string or a number.

var value = 'something';
if(typeof(value) === "number" && !isNaN(value)){
    console.log("is a number parameter");   
}else{
    console.log("is a string parameter");
}
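
Applying that same check, the key prefix for an event parameter could be derived as below. The helper name is hypothetical; the point is that a numeric string such as '7.77' is still typed as a string, so it ends up under ep. rather than epn.

```javascript
// Builds the payload key for an event parameter:
// "epn." for real numbers, "ep." for everything else.
function eventParamKey(name, value) {
  const isNumber = typeof value === 'number' && !isNaN(value);
  return (isNumber ? 'epn.' : 'ep.') + name;
}

console.log(eventParamKey('value', 7.77));   // epn.value
console.log(eventParamKey('value', '7.77')); // ep.value
console.log(eventParamKey('value', NaN));    // ep.value
```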

SGTM - Google Analytics 4 Hits

The last thing I want to point out is that GA4 hits sent via Server-Side Google Tag Manager are capable of doing two things that we won't see on the regular hits.

The first of these is that the hits sent through the server side are able to set first-party cookies on the user's browser. This is achieved by adding a Set-Cookie header to the response.

The second one is that the response may contain a body, which is used to send some pixels back to the client side. That is, SGTM builds up a pixel request and returns it to the browser so it gets sent from there, for example when it depends on some third-party cookie value ( where sending it server-side wouldn't make any difference ).

More Questions

How can I identify a conversion?

If the current event has a &_c=1 parameter, it will be counted as a conversion.

Are there any e-commerce limits?

Yes, there are, as far as I've been able to deduce from the code.

  • A max of 200 items can be sent within a single event; any item above that will be skipped.
  • A max of 10 item-scoped parameters per item; any parameter above this limit will be removed from the item.

It takes some seconds to see my hits

Google Analytics 4 can delay the firing of the hits by up to 5 seconds. This is because it uses an internal queue in order to batch the events and save some hit requests. At this time there is no way to "force" the queue dispatch, although there are some situations where the queue is skipped and the events are sent right away. This happens, for example, the first time a visitor comes to your site ( i.e. when there's no cookie present ).

Why can't I use any of my parameters on the reports?

You can send ANY parameters along with your events, but this doesn't mean that you'll be able to use them on your exploration reports. This can be confusing because, while you'll see the parameters on the Real-Time reports, you'll need to set them up as dimensions on the admin in order to be able to use them. If you think about it, it makes sense: the real-time report is just a streaming report where no data is being parsed/processed at all, and we cannot expect GA4 to process all the data coming with the events, so it will only process the parameters that we've configured. We need to set them up in the Custom Definitions section.

I've set-up my dimensions, but they show no data

I'm not sure if this is only me, but it drove me crazy sometimes. I'd say that if you add a new event with some parameters and then go straight to adding them in the admin, they won't show up in the list, but you'll be able to type the parameter name manually. Every time I did this, I was not getting any data for that dimension. My advice is to wait some hours before creating the custom definition, and only do it once the parameter is offered for selection ( rather than manually typing it ). If you did it wrong, the only solution that worked for me was archiving the dimension and re-creating it.