-
Aligning BigQuery Metrics with GA4 Traffic Acquisition Analytics
Hey folks, so here’s the scoop. I’m still new to this whole BigQuery thing and I’m trying to make this Traffic Acquisition GA4 report. Thing is, my results are nowhere near the GA4 view. I’ve gotta be doing something wrong, but here’s where I’m at:
-
I get it, in GA4 and BQ, the source/medium/campaign fields are based on events, not sessions. But why doesn’t every event have a source/medium/campaign under event_parameter_key? Makes sense to me that you’d have these in the ‘session_start’ event, but no dice.
-
Here’s what I’ve tried to recreate the Traffic Acquisition report:
First, I aimed to check the first medium for sessions:
with cte as ( select PARSE_DATE("%Y%m%d", event_date) AS Date, user_pseudo_id, concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')) as session_id, FIRST_VALUE((select value.string_value from unnest(event_params) where key = 'medium')) OVER (PARTITION BY concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')) ORDER BY event_timestamp) as first_medium FROM
project
) select Date, first_medium, count(distinct user_pseudo_id) as Users, count (distinct session_id) as Sessions from cte group by 1,2;Now this query gives me 44k users as ‘null’ medium and 1.8k organic users. Compare this with the 17k users with the ” medium and 8k organic users from GA4, you get the picture.
Then, I switched it up and checked the last medium:
FIRST_VALUE((select value.string_value from unnest(event_params) where key = 'medium')) OVER (PARTITION BY concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')) ORDER BY event_timestamp desc) as last_medium
I did see an increase to 9k organic users, but it’s still way off the GA4 data.
Lastly, I tried this code – https://www.ga4bigquery.com/traffic-source-dimensions-metrics-ga4/ – which uses source / medium (based on session). As you might guess, the results made no sense compared to GA4 data either.
If any wiz out there can lend a hand, I’d be super grateful!
EDIT:
Just stumbled on this gem of an article which thoroughly explains how to query traffic sources data. It’s the best one I’ve bumped into till now:Query the GA4 Session Last Non-direct Traffic Source in BigQuery
-
Log in to reply.