Learn business growth with Google Analytics 4 Forums Google Analytics 4 Utilizing GA4 Data Stored in BigQuery with PowerBI Integration

  • Utilizing GA4 Data Stored in BigQuery with PowerBI Integration

    Posted by Anthony on 9 December 2022 at 1:06 pm

    So, I’m grappling with understanding GA4 data that’s stored in BigQuery and using it via Power BI. I’m kinda new to both these things, and I find JSON format particularly tricky to figure out.

    I tried to decode it by following a tutorial to flatten the tables in BQ, which is provided in this link: https://www.ga4bigquery.com/tutorial-how-to-flatten-the-ga4-bigquery-export-schema-for-relational-databases-using-unnest/.

    But, I got stumped again when I tried to query the flattened tables in PowerBI, because they had bunch of nested values without labels.

    Here’s what my data looks right now:

    | event_name | device | geo | traffic_source |
    | – | – | – | – |
    | page_view | {“v”:{“f”:[{“v”:”mobile”},{“v”:”Huawei”},{“v”:”YAL-L21″},{“v”:”Honor 20″},{“v”:null},{“v”:”Android”},{“v”:”Android 10″},{“v”:null},{“v”:null},{“v”:”ar-iq”},{“v”:”No”},{“v”:null},{“v”:null},{“v”:null},{“v”:{“f”:[{“v”:”Android Webview”},{“v”:”110.0.5481.185″},{“v”:”www.mypage.co.uk”}]}}]} | {“v”:{“f”:[{“v”:”Europe”},{“v”:”Ireland”},{“v”:”County Dublin”},{“v”:”Dublin”},{“v”:”Northern Europe”},{“v”:”(not set)”}]}} | {“v”:{“f”:[{“v”:”(organic)”},{“v”:”organic”},{“v”:”google”}]}} |

    But, I want to shuffle it around and pivot them into separate columns that look somewhat like this:

    | event_name | geo.continent | geo.sub_continent | geo.country | etc… |
    | – | – | – | – | – |
    | page_view | Europe | Northern Europe | Ireland |

    Any ideas how I can get this sorted? Thanks in advance!

    Morgan replied 11 months, 1 week ago 3 Members · 2 Replies
  • 2 Replies
  • Xavier

    Member
    7 April 2023 at 8:20 am

    Yes, you are trying to transform your current Google Analytics 4 (GA4) data from BigQuery, currently in JSON format, into a more readable Table format for usage in Power BI. The idea here is that you need to “unnest” and “pivot” your current data into separate columns. This is unfortunately not a straightforward process if you’re new to these tools and to the JSON format, but it’s a common need in data analysis. The operation you are trying to perform is called “normalizing” the data, i.e., transforming the nested JSON-format data into flat, table-format data. This process often involves writing queries or scripts to properly extract and restructure your JSON data. You may need further tutorials or to seek advice from more experienced data analysts.

    Whether you can accomplish this task with Power BI alone or need the aid of BigQuery depends on what level of data manipulation Power BI supports. It may also be possible to solve this by using other tools or programming languages that can work with JSON and will allow you to reshape data, such as Python with pandas. If the structure of your JSON data is always the same, you should be able to automate this process once you’ve successfully done it once.

  • Morgan

    Member
    25 May 2023 at 8:19 am

    Based on the format of your current table in BigQuery and your desired table, it seems like you are trying to unnest JSON into individual columns. To achieve this, you will need to use some SQL functionality.

    Firstly, extracting JSON data into columns in BigQuery can be done using the JSON_EXTRACT_SCALAR function in SQL. The function requires two string-type parameters – the JSON string and the JSONPath query.

    For instance, you can create a view in BigQuery that extracts the parameters from device, geo, and traffic_source and represents them as separate columns. The code will look similar to this:

    SELECT
    event_name,
    JSON_EXTRACT_SCALAR(device, ‘$.v.f[0].v’) AS device_type,
    JSON_EXTRACT_SCALAR(device, ‘$.v.f[1].v’) AS device_brand,
    — Add more fields as needed with correct path to JSON fields
    JSON_EXTRACT_SCALAR(geo, ‘$.v.f[0].v’) AS geo_continent,
    JSON_EXTRACT_SCALAR(geo, ‘$.v.f[1].v’) AS geo_sub_continent,
    — Add more fields as needed
    FROM your_table;

    In the sample SQL code, event_name is a given column, the other columns are extracted from the JSON inside device and geo.

    Please note that this assumes a consistency in the order of the nested JSON data. If the order isn’t consistent, you might run into issue. Also, you need to replace ‘your_table’ with the name of your table.

    Once you have this setup, querying the data in PowerBI should be pretty straightforward. From PowerBI’s navigation menu, go to ‘Get Data’ and select BigQuery as your data source. Then, use Power BI’s interface to write the SQL query, or select the view you created earlier, and load the data.

    Finally, JSON data and its extraction can prove tricky and you might encounter some adjustments to be made according to the specific structure of your JSON data. But this should provide a decent start. If there are still issues, it might be worth investing time in understanding JSON structure more profoundly or consider reaching out for expert help.

Log in to reply.