What you’re doing here is calculating the engagement rate as the ratio of engaged sessions to total sessions. For those unfamiliar with SQL language:
You’re using the safe_divide
function to divide the first argument by the second argument. This function will not cause an error if you try to divide by zero; instead, it will return NULL.
The first argument is the count of distinct sessions where ‘session_engaged’ = ‘1’. The ‘case when’ statement is used to decide when a session is considered engaged. According to GA4, a session is considered engaged when it lasts at least 10 seconds or had a conversion event or 2 screen or page views.
For the ‘distinct’ condition, you’re using a combination of ‘user_pseudo_id’ and ‘ga_session_id’ to recognize distinct sessions. ‘user_pseudo_id’ is a unique identifier for a user and ‘ga_session_id’ is a unique identifier for a session. Therefore, the combination of these two keys will give you distinct sessions.
The second argument is the total count of distinct sessions. It’s calculated in the same way as above but it includes all sessions, not just engaged ones.
Then you’re aliasing the output of the division as ‘engagement_rate’. This will be the name of the column in the output of your query.
In summary, the ‘engagement rate’ is calculated as ‘the count of distinct engaged sessions divided by the count of all distinct sessions’.
It’s worth mentioning that this is likely a simplification of how engagement rate is calculated in GA4. Google may be using additional or different criteria to determine whether a session is considered engaged.
I hope this helps! Let me know if you have any other questions!