Learn business growth with Google Analytics 4 Forums Google Analytics 4 Aggregating GA4 Event Data by Session ID in BigQuery

  • Aggregating GA4 Event Data by Session ID in BigQuery

    Posted by Pham on 24 January 2023 at 4:35 am

    Hey there, folks! So, here’s my dilemma: I’m building this SQL query where I need to grid all my GA4 event data based on the session ID. Super simple, right? I thought so too, until now.

    When I fire off this query, the resulting table gives me session ID as a completely separate column – not nestled within an array for event parameters as you’d expect for a certain event. But what’s really grinding my gears is the session_id column, which seems to have its own agenda with non-unique values. I mean, the session id is replicated multiple times for each row, representing a new event within that session. And what I really want to do is to seamlessly blend together those non-unique session ids, so I get all events that corresponds to a particular session_id.

    I tested this query, and it does hand me session_id on a separate column – with the session_id cloned for each event, however.

    
    SELECT
      *,
      (
        SELECT COALESCE(value.int_value, value.float_value, value.double_value)
        FROM UNNEST(event_params)
        WHERE key = 'ga_session_id'
      ) AS session_id,
        (
        SELECT COALESCE(value.string_value)
        FROM UNNEST(event_params)
        WHERE key = 'page_location'
      ) AS page_location
    FROM
      digital-marketing-xxxxxx.analytics_xxxxxxx.events_intraday*
      
    

    For example, my output looks like this, amidst several more columns:

    session_id
    event_name

    1234567
    session_start

    1234567
    click_url

    So, I’m trying to merge the session ids into a solitary cell. Tried this line:

    
    SELECT
      *,
      (
        SELECT COALESCE(value.int_value, value.float_value, value.double_value)
        FROM UNNEST(event_params)
        WHERE key = 'ga_session_id'
      ) AS session_id,
        (
        SELECT COALESCE(value.string_value)
        FROM UNNEST(event_params)
        WHERE key = 'page_location'
      ) AS page_location
    FROM
      digital-marketing-xxxxxxx.analytics_xxxxxxx.events_intraday*
    GROUP BY session_id
    
    

    But Kaboom! Error shows up, claiming I can’t aggregate certain values (like date) – which is tasked when trying to group by session id.

    Got any idea, folks, how to outsmart this? I admit I’m a noob at SQL, and I’m currently stuck in the rabbit hole with no clear guidance on how to roll with this.

    Jesse replied 11 months, 2 weeks ago 2 Members · 1 Reply
  • 1 Reply
  • Jesse

    Member
    5 June 2023 at 9:50 am

    SQL queries can seem complex, but the trick is breaking them down into smaller and more manageable parts. Here’s an approach that I’ve found works quite well for understanding sequences of events. It might take a bit more time because it’s set up to look at other things as well, but it’ll get you what you need:

    with _latest as (
      SELECT 
        --create unique id
                concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')) as unique_session_id,
        
        --create event id
            concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id'),event_name) as session_ids,
              event_name,
              event_date,
               
              TIMESTAMP_MICROS(event_timestamp) AS event_timestamp
            FROM  *******
           
            where
                -- change the date range by using static and/or dynamic dates
               _table_suffix between  '20221113' and '20221114'),
         Exit_count as (
            select *,
         row_number() over (partition by session_ids order by event_timestamp desc) as Event_order
            from _latest)
        
            select 
        Event_order,
        unique_session_id,
            event_date,
            event_name,
            
       FROM
            Exit_count
        
      group by
            Event_order,
            event_name,
            unique_session_id,
            --pagepath,
            event_date
            --Country_site
            order by
                unique_session_id, 
                Event_order 
    

    Don’t worry if you’re new to SQL. Over time you’ll find it easier to write complex queries confidently.

Log in to reply.