Exporting GA4 Data to BigQuery
Google Analytics 4 now allows standard customers to export raw data to BigQuery, a feature previously exclusive to paid customers using Google Analytics 360. There are two types of raw data export options in Google Analytics 4 for BigQuery.
- Daily Export: This feature exports raw data from Google Analytics 4 into individual BigQuery tables. Each table corresponds to a specific day’s data. The complete loading of data for a single-day period (24 hours) may take up to 72 additional hours after the relevant day has concluded.
- Streaming Export: This option exports raw data continuously throughout the day. This is helpful for business use cases where immediate access to data is required.
Enabling Raw Data Export
To enable raw data export, link your analytics property to an existing BigQuery project. Assuming you have already created a BigQuery project, we will connect our analytics property’s raw data export to BigQuery.
Choose a BigQuery Project
If you are logged in with the same user who also has access to BigQuery project. You could select the project from the drop-down. If it’s not showing up, you can specify a project ID.
By going into the BigQuery console, you can copy and paste the project ID.
Configure Data stream
Once you have identified the BigQuery project and have its Project ID. Open Google Analytics and navigate to Admin > Property Settings >Product Links > BigQuery Link.
Under the project information, enter the project ID which you copied from the Google BigQuery console. Choose a default location for new dataset creation. This new dataset will be created under your project. The naming convention for this newly created data set will be analytics_PROPERTYID. Where PROPERTYID is the ID of the current property being linked.
After you have entered the BigQuery project information, the next step is to configure the stream. Clicking on the configure data streams and events button below allows you to select which streams data will be exported in the selected dataset and whether you want to exclude some events.
Limits & Exclusions
There is a 1 million per day daily limit on events export for free accounts. For GA4 – 360 account, google mentions “billion of events“. If you are using a free account and you suspect your daily volume can go beyond 1 million events. You can exclude events using the configuration screen.
There are two ways you can select the events to be excluded
- Specify event by name: This opens up a plain dialogue, where you can add the name of the event. GA4 will match the name and exclude the event from the export.
- Add button: Add button gives a list of events with their daily volume. However, in my case, it only gives Enhanced events. This might be a feature that Google would work on in the future as it would make sense to have all the events with their daily volume listed in a visual interface to select for exclusion.
Select Frequency
Once you have configured the BigQuery project and decided on which streams to export. You can now select the frequency of export. You can export data as batch exported once a day or streaming. You can also do both. I prefer to do both as in several use cases you will require real-time events.
Verifying Big Query link
We can verify the link from the Google Analytics interface. In addition, we can open the BigQuery console and confirm whether our data is flowing as configured.
Verifying from Google Analytics
Once you have linked the project, you will see the link established under Admin -> Property Settings -> Product Links -> BigQuery Links.
Verifying from BigQuery
We can verify whether our data is flowing into BigQuery dataset after we finish linking it up. When you configure it for the first time, the dataset and tables can take up to 24 – 48 hours to appear.
- DataSet: When you link BigQuery, GA4 automatically creates a dataset in the specified project. The naming convention for DataSet is analytics_PROPERTYID. In this screenshot, 3524337760 is the property id for which this DataSet is collecting data.
- events_: The events_* table stores the data coming from Google Analytics 4. Data for each day is stored in a separate table. The naming convention for each day’s table is events_yyyymmdd. In the screenshot, you see events_(14) selected. This means so far, there are 14 tables each containing 1 day worth of data. One table will be added for each day as data continues to come.
- events_intraday_: Intraday tables are Streaming-Export tables made when the streaming option is chosen in GA4. These tables contain records of the day’s session activity. However, streaming export may not include all data due to late events or failed uploads. Data is exported continuously throughout the day, and the table may include sessions that span multiple export operations. The table is deleted when the
events_YYYYMMDD
is finished.
Data Export Scheduling
Both daily export and streaming export follow a fixed schedule. Given below is a summary table explaining the difference between export schedule for both.
Feature | Daily Export Tables | Streaming-Export Tables |
---|---|---|
Update Schedule | Created after all events for the day are collected, then updated for up to 72 hours beyond the date of the table. Only occasionally data may get updated after 72 hours. | Continuously updated throughout the day (12:00:00 am – 11:59:59 pm in the property’s time zone) |
Effect of New Day | N/A | Events are written to a new intraday table when a new day starts in the property’s time zone |
Responses