Learn business growth with Google Analytics 4 Forums Google Analytics 4 Replicating the GA4 User Engagement value with Bigquery export

  • Replicating the GA4 User Engagement value with Bigquery export

    Posted by Huong on 21 November 2022 at 6:51 am

    Hey there, I’ve been working with daily GA4 export data in BigQuery and trying to match the User Engagement value with what’s shown in the GA4 API.

    
    SELECT
      SUM((
        SELECT
          value.int_value/1000
        FROM
          UNNEST(event_params)
        WHERE
          KEY = 'engagement_time_msec'))
    FROM
      BI_1.analytics_331127371.events_<span class="hljs-operator">*</span>
    where _TABLE_SUFFIX between '20230301' AND '20230315'
    
    

    But here’s the issue – my numbers are coming out around 3% higher than the API. I checked out Google’s docs, and they’re saying that ‘engagement_time_msec’ is how User engagement is calculated. Have a look here: Google Analytics Help. Any ideas on where I’m going wrong?

    Alexander replied 1 year ago 3 Members · 2 Replies
  • 2 Replies
  • Addison

    Member
    4 June 2023 at 2:58 am

    It’s not uncommon to see a difference between the numbers in BigQuery (BQ) and Google Analytics 4 (GA4) API reports. Actually, a 3% difference is relatively small.

    This discrepancy can occur because GA4 reports are based on sampled and pre-aggregated data, rather than raw data. Conversely, your query in BQ is running on raw data.

    If you need to cross-verify data between GA4 and BQ, I recommend working within the Explore section of GA4 and building out dashboards there. The preset GA4 reports don’t always give a detailed view of the data compared to what you might get from BQ, since they’re more catered towards those who may not be as versed in data analysis. So, no need to worry, you’re on the right track!

  • Alexander

    Member
    4 June 2023 at 2:57 pm

    The discrepancy may be due to various factors that affect data processing between Google Analytics 4 (GA4) and BigQuery. GA4 has additional features, such as engagement_time_msec, that are based on front-end JavaScript events. These events may not be logged with as much precision due to network latencies or system performance variations.

    The API might filter or adjust data for consistency before reporting, which is not reflected in the raw data. This can also be influenced by time zones of the servers where data accumulated, and the specific instant data is exported, which could lead to minor discrepancies.

    Lastly, be aware of the data latency. BigQuery’s data freshness can vary and can take up to 24-48 hours to sync fully. So, if you’re pulling data and comparing it with what’s being shown in the GA4 API at this moment, you might see some differences.

    Given these complexities, the 3% you’re seeing is within an acceptable margin of error. To align BigQuery data more closely with GA4’s API numbers, consider cross-checking your date range, app instance ID, and user_pseudo_id. If the problem persists and the discrepancy is major, you might want to consider contacting Google support.

Log in to reply.