Learn business growth with Google Analytics 4 Forums Google Analytics 4 Discrepancy in User Counts between GA4 and BigQuery

  • Discrepancy in User Counts between GA4 and BigQuery

    Posted by Grayson on 14 August 2022 at 3:07 am

    So here’s the scoop. I used Bigquery to pull some data on active_users and total users up to 31/12/2022. This was broken down by CampaignName and Country as shown:

    select 
    count(distinct case when (select value.int_value from unnest(event_params) where key = 'engagement_time_msec') > 0  or (select value.string_value from unnest(event_params) where key = 'session_engaged') = '1' then user_pseudo_id else null end) AS active_users
    ,count(distinct user_pseudo_id) AS totalusers
    ,traffic_source.name AS CampaignName
    ,geo.country AS Country
    FROM independent-tea-354108.analytics_254831690.events_20221231
    GROUP BY 
    traffic_source.name
    ,geo.country
    

    When I singled out the ‘(organic)’ CampaignName in the results, here’s what I got: (Link to the breakout)

    Now, here’s where things get a bit wonky. When I match it up with the data from GA4, it’s like comparing apples to oranges! There’s a whopping 15000 more active_users showing in GA4 in a single day. Imagine the discrepancy for a whole month! (Here’s the GA4 data)

    And, just in case you think it’s a one off, I gave other CampaignNames a whirl and not a single one lined up. Those gaps were massive too. So, I’m stumped. Any ideas?

    Morgan replied 1 year ago 3 Members · 2 Replies
  • 2 Replies
  • Jaspreet

    Member
    2 September 2022 at 8:53 am

    Hey, no worries, we’ve all been there! You’re right, the numbers seem to be dancing on different beats in GA4 and BigQuery. But hey, let’s see if we can get them to tango together.

    One area to peek into might be your source table. If your website is rolling with a consent mode, you might find some ‘user_pseudo_id’s that have turned out to be shy and ended up blank. These elusive ones might count their dance steps in GA4, but give BigQuery the cold shoulder. Hence, the awkward difference.

    Another possibility that might throw your dance off rhythm could be that BigQuery and GA4 are swinging to different time zone tunes. BigQuery has a thing for UTC, but your GA4, well, it might have its own preferences.

    So, shine your spotlight on these possibilities and see if they transform this awkward dance-off into a graceful ballet. Cheers mate!

  • Morgan

    Member
    6 December 2022 at 7:35 am

    The discrepancy between your Bigquery data and GA4 data for the same parameters could be due to multiple reasons. It could be from how GA4 segments and counts active users, as the methodologies may be different from yours. Check how active users are defined in GA4– they might be basing it on different parameters for activity. The issue could also be from the time zones. BigQuery and GA4 might be working on different time zones, causing variations in the count for the same day. Another issue could be related to data processing–maybe GA4 processes the data in real-time and BigQuery with a delay. Lastly, ensure that all the data is correctly synced from GA4 to BigQuery. It’s possible that BigQuery is not receiving all the data from GA4.

Log in to reply.