Learn business growth with Google Analytics 4 Forums Google Analytics 4 Optimizing JSON Export with Null Values in BigQuery's Delimited Format

  • Optimizing JSON Export with Null Values in BigQuery's Delimited Format

    Posted by Rishi on 27 January 2023 at 8:05 am

    Hey friends!
    Just wondering if any of you have tripped over this issue as well. I’ve been trying to save GA4 query results in BigQuery as JSONL (delimited) but I’m hitting some snags. Specifically, columns that have NULLS are just poof – gone in the downloaded JSON. If you try to throw the lot back in as a table (upload -> JSONL) and run a query, you’ll see what I mean.

    Also weird: the order of the columns changes after import. They don’t match the original lineup pre-export. Is that just me?

    If you’ve battled this beast and won, I’d love to hear about the solution you found. Basically, how do you download and then re-upload a JSON with the same structure that it had originally?

    Btw, if you’re scratching your heads thinking “why would anyone download and re-upload a JSONL?”, it’s actually just to experiment with options that step outside the Google cloud ecosystem. And when I say NULLS are missing, I’m talking about things like ‘float_value’ or ‘double_value’ from GA4 Bigquery export(event_params), which vanish if they were all NULLs in the first place.

    Full of gratitude for any insights you can share!
    A quick recap of my trials and tribulations:
    – NULLS in columns become missing in action, which is confusing ’cause I expected them to stick around like a loyal json local.
    – After import, the column sequence has been shuffled compared to how they looked pre-export. So uncool.

    Finley replied 1 year, 1 month ago 3 Members · 2 Replies
  • 2 Replies
  • Evelyn

    Member
    20 June 2023 at 9:04 am

    The issue you’re experiencing with missing NULLs in columns in the downloaded JSON is due to how JSON handles NULL values. In JSON, if a field has a NULL value, it’s often simply not included in the output. As a result, when you download your BigQuery results as JSON, columns that only contain NULLs seem to disappear. As for the changing order of columns, JSON objects are an unordered collection of key-value pairs. Upon re-importing your JSON data, BigQuery determines the column order based on the order it encounters the keys in the JSON file. One workaround to maintain a steady column order could be to create a view in BigQuery with columns in a specified order and then upload your JSON data into that view. This problem has much to do with the way BigQuery deals with data imported from JSON files. To maintain the same structure, some users apply extensive preprocessing before uploading it to BigQuery or utilize an intermediate system that is able to better handle JSON data.

  • Finley

    Member
    23 June 2023 at 7:37 am

    It seems that you’re encountering a common issue within BigQuery and JSONL format. First, JSON by nature does not maintain the order of its keys (or in this case, columns). This is a property of the data structure itself, since JSON objects are unordered sets of key-value pairs. Hence, the issue you are experiencing with the column order being different after export and import.

    Secondly, concerning missing NULLs in your JSON file. NULLs are typically not included in JSON data due to normalization, to save on space and increase efficiency. This is actually a common practice in JSON serialization and not limited to Google BigQuery or GA4. Consequently, when you export data to JSONL and there are NULL values, those keys are usually not included in the serialized JSON object.

    To work around this, you need to address these nuances during the import/export process. For example, if you require NULLs to be present in the JSON document, you may need to perform a preprocessing step to fill NULLs with a default value before exporting. To address the issue with column order, you will have to enforce that explicitly during the import process, as the order is not preserved naturally in JSON/JSONL format.

    However, you’ll need to be aware that these adjustments could potentially alter the data structure and possibly affect your data analysis and further processing of these data.

Log in to reply.