Reply To: Discrepancy between GA4 traffic source data and BigQuery records

  • Lucas

    Member
    29 March 2023 at 8:54 pm

    The discrepancy between your BigQuery and GA4 data most likely comes from how sessions are recorded and counted, though other factors may also contribute. Your SQL code omits events that lack a valid pseudo_id and session_id, whereas GA4 may still include those events in its counts. It is worth examining how your code handles null values.
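
    To see how much null handling alone can move a session count, here is a minimal Python sketch. The sample rows are made up, the field names pseudo_id and session_id simply mirror the ones mentioned above, and the filter (drop a row when either id is missing) is an assumption about what your query does, not a statement of how GA4 treats such events.

```python
# Hypothetical sample rows; real data would come from your BigQuery export.
events = [
    {"pseudo_id": "u1", "session_id": 1001},
    {"pseudo_id": "u1", "session_id": 1001},  # second event in the same session
    {"pseudo_id": "u1", "session_id": 1002},
    {"pseudo_id": "u2", "session_id": 2001},
    {"pseudo_id": "u3", "session_id": None},  # session id missing
    {"pseudo_id": None, "session_id": 3001},  # pseudo id missing
]


def count_sessions(rows, drop_nulls):
    """Count distinct (pseudo_id, session_id) pairs.

    When drop_nulls is True, rows with either id missing are skipped
    (assumed to resemble a WHERE ... IS NOT NULL filter).
    """
    keys = set()
    for row in rows:
        user, session = row["pseudo_id"], row["session_id"]
        if drop_nulls and (user is None or session is None):
            continue
        keys.add((user, session))
    return len(keys)


strict = count_sessions(events, drop_nulls=True)
lenient = count_sessions(events, drop_nulls=False)
print("sessions with null rows dropped:", strict)
print("sessions with null rows kept:", lenient)
```

    Running the same comparison on your real export (a count with and without the null filter) should tell you whether this accounts for a meaningful share of the gap.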

    There’s also a methodological difference in how you’re counting sessions compared to GA4. Your query produces an exact count, while GA4 estimates session counts with HyperLogLog, so the two figures will not necessarily match. To reduce this discrepancy, consider using BigQuery’s HyperLogLog functions, which are compatible with GA4’s analytics data as described in the Google Developers link you shared.
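
    To illustrate why an exact count and a HyperLogLog estimate rarely agree to the last digit, here is a rough Python sketch of a basic HyperLogLog counter. It is a textbook version for illustration only, not GA4's or BigQuery's implementation; the precision p=12, the SHA-1 hash, and the generated session keys are all assumptions. In BigQuery itself you would use the built-in HLL_COUNT functions rather than anything like this.

```python
import hashlib
import math


def hll_estimate(items, p=12):
    """Estimate the number of distinct items with a basic HyperLogLog sketch."""
    m = 1 << p                    # number of registers
    bits = 64 - p                 # hash bits left after choosing a register
    registers = [0] * m
    for item in items:
        digest = hashlib.sha1(str(item).encode()).digest()
        h = int.from_bytes(digest[:8], "big")     # 64-bit hash value
        idx = h >> bits                           # top p bits pick the register
        rest = h & ((1 << bits) - 1)              # remaining bits
        rank = bits - rest.bit_length() + 1       # position of the leftmost 1-bit
        if rank > registers[idx]:
            registers[idx] = rank
    alpha = 0.7213 / (1 + 1.079 / m)              # bias constant, valid for m >= 128
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros:
        # Small-range correction (linear counting)
        return m * math.log(m / zeros)
    return raw


# Hypothetical session keys: pseudo id and session id joined into one string.
session_keys = [f"user{i // 3}-session{i}" for i in range(50000)]

exact = len(set(session_keys))
approx = hll_estimate(session_keys)
print("exact:", exact)
print("approximate:", round(approx))
print("relative error:", abs(approx - exact) / exact)
```

    The estimate typically lands within a few percent of the exact figure, which is the kind of gap to expect when comparing an exact COUNT(DISTINCT ...) with an approximated number.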

    However, even with these corrections, attribution might not match perfectly. Other factors include differences in how source/medium is tracked and in what determines whether a user’s actions constitute a “session”. You may need to adjust how you count sessions and assign attribution, review your tracking configuration, and check the data set for inconsistencies.