  • Comparing Active Users in GA4 and BigQuery: How to Ensure Consistency?

    Posted by Anthony on 20 July 2022 at 1:03 am

    Hey, I’ve been counting active users in BigQuery from the GA4 export using user_pseudo_id, but the number doesn’t match the active users I see in GA4 under Explorations > Free-Form Exploration. Any idea how I can get BigQuery and GA4 to agree on active users?

    Isaiah replied 10 months, 3 weeks ago 3 Members · 2 Replies
  • Ethan

    21 July 2022 at 5:59 pm

    There could be a few reasons why you’re seeing differences in active users between BigQuery and GA4. First, make sure you’re comparing the same time periods in both. If your BigQuery query groups days in UTC (event_timestamp is stored in UTC) while the GA4 property reports in another time zone, events near midnight land on different days and the daily counts diverge.
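    To make the time zone point concrete, here is a minimal Python sketch, not a definitive implementation. The rows are made up, and PROPERTY_TZ is a hypothetical fixed offset standing in for whatever reporting time zone your GA4 property uses. It buckets the same three events by day twice, once in UTC and once in the property's time zone, and the daily user counts come out different.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Hypothetical fixed offset standing in for the GA4 property's reporting time zone.
# For a real zone with daylight saving, zoneinfo.ZoneInfo would be the usual choice.
PROPERTY_TZ = timezone(timedelta(hours=-7))


def ts(year, month, day, hour):
    """Build a UTC timestamp in microseconds, the unit event_timestamp uses."""
    dt = datetime(year, month, day, hour, tzinfo=timezone.utc)
    return int(dt.timestamp()) * 1_000_000


# Made-up rows: (user_pseudo_id, event_timestamp)
EVENTS = [
    ("user_a", ts(2022, 7, 20, 2)),   # 19 July, 19:00 at UTC-7
    ("user_b", ts(2022, 7, 20, 15)),  # 20 July, 08:00 at UTC-7
    ("user_c", ts(2022, 7, 21, 3)),   # 20 July, 20:00 at UTC-7
]


def daily_users(events, tz):
    """Count distinct user_pseudo_id per calendar day as seen in time zone tz."""
    users_by_day = defaultdict(set)
    for user_pseudo_id, event_timestamp in events:
        day = datetime.fromtimestamp(event_timestamp / 1_000_000, tz=tz).date()
        users_by_day[day.isoformat()].add(user_pseudo_id)
    return {day: len(users) for day, users in sorted(users_by_day.items())}


print(daily_users(EVENTS, timezone.utc))  # days as UTC sees them
print(daily_users(EVENTS, PROPERTY_TZ))   # days as the property sees them
```

    The total across all days is the same in both cases; only the per-day split moves, which is enough to make a single-day comparison disagree.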

    Additionally, BigQuery stores one row per event, so a single user usually appears in many rows, whereas GA4 aggregates the data before displaying it. If you count user_pseudo_id directly from the raw BigQuery dataset, you count the same users over and over. Count distinct user_pseudo_id values over a fixed time period to get unique active users.
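    As a rough illustration of the duplicate-counting problem, the Python sketch below compares a plain row count with a distinct count. The rows and ID values are invented; in BigQuery SQL the distinct version corresponds to COUNT(DISTINCT user_pseudo_id).

```python
# Made-up rows shaped like a slice of the export: one row per event.
rows = [
    {"user_pseudo_id": "111.aaa", "event_name": "session_start"},
    {"user_pseudo_id": "111.aaa", "event_name": "page_view"},
    {"user_pseudo_id": "111.aaa", "event_name": "scroll"},
    {"user_pseudo_id": "222.bbb", "event_name": "page_view"},
]


def count_rows(rows):
    """Naive count: every event row is treated as a user."""
    return len(rows)


def count_distinct_users(rows):
    """Distinct count: each user_pseudo_id is counted once; missing IDs are skipped."""
    return len({r["user_pseudo_id"] for r in rows if r["user_pseudo_id"] is not None})


print(count_rows(rows))            # 4 event rows
print(count_distinct_users(rows))  # 2 users
```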

    Lastly, note that GA4 filters out bot traffic and applies its own data thresholding for privacy reasons, while BigQuery does not, which can also cause a discrepancy.

    If the numbers still don’t match after these adjustments, you might want to reach out to the Google Analytics and BigQuery support teams for additional help.

  • Isaiah

    9 May 2023 at 2:33 pm

    Active users in BigQuery and GA4 won’t always match perfectly, for several reasons. First, Google Analytics and BigQuery measure things slightly differently. GA4 might filter certain bot traffic or adjust for known issues. Also, the BigQuery export for GA4 contains only raw, event-level data that has not been processed, whereas the GA4 interface processes and summarizes the data before showing it.

    Secondly, user_pseudo_id in GA4 isn’t always consistent for the same person. For example, when a user clears cookies or accesses your site from a different device, a new user_pseudo_id is created. BigQuery would consider each one a separate user, thus showing a higher number of active users.

    Lastly, timezone differences could be a reason. Be sure to check if GA4 and BigQuery are set to the same time zone.

    To match BigQuery and GA4 data, adjust your BigQuery logic to filter the way GA4 does, be mindful of user_pseudo_id inconsistencies across devices and cookie clears, and confirm the data is compared in the same time zone.
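    One way to read "filter the way GA4 does" is sketched below in Python. It rests on an assumption that goes beyond this thread: that GA4 treats a user as active when it sees a first_visit or first_open event, or an event with engagement_time_msec greater than zero. The flat row layout is also a simplification, since the real export nests parameters inside event_params. Treat it as a starting point to check against your own property, not as GA4's exact logic.

```python
# Assumption: these event names and the engagement_time_msec rule approximate
# GA4's notion of an active user. The flat row layout is a simplification.
FIRST_TOUCH_EVENTS = {"first_visit", "first_open"}

# Made-up rows, one per event.
rows = [
    {"user_pseudo_id": "111.aaa", "event_name": "session_start", "engagement_time_msec": None},
    {"user_pseudo_id": "111.aaa", "event_name": "user_engagement", "engagement_time_msec": 5000},
    {"user_pseudo_id": "222.bbb", "event_name": "first_visit", "engagement_time_msec": None},
    {"user_pseudo_id": "333.ccc", "event_name": "session_start", "engagement_time_msec": None},
]


def is_active_event(row):
    """True if this event alone would mark its user as active under the assumed rule."""
    engaged_ms = row.get("engagement_time_msec") or 0
    return row["event_name"] in FIRST_TOUCH_EVENTS or engaged_ms > 0


def distinct_users(rows):
    """All distinct user_pseudo_id values, with no activity filter."""
    return len({r["user_pseudo_id"] for r in rows})


def active_users(rows):
    """Distinct user_pseudo_id values with at least one qualifying event."""
    return len({r["user_pseudo_id"] for r in rows if is_active_event(r)})


print(distinct_users(rows))  # every user seen in the rows
print(active_users(rows))    # only users passing the assumed activity rule
```

    In this sample the unfiltered count is 3 and the filtered count is 2, because the third user only produced a session_start with no engagement time.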
