One of the biggest benefits of being a Google Analytics 360 customer (specifically Universal Analytics) is the ability to leverage hit-level data in BigQuery. However, the data that is made available within BigQuery is not a one-to-one match to the data that is presented in the Google Analytics reporting interface.
For example, if you want to understand the total number of sessions in a given date range, there isn’t a single precalculated metric in BigQuery that gives you this number. Instead, you would query the export data with SUM(totals.visits). Similarly, to get the total number of users, you would query with COUNT(DISTINCT fullVisitorId).
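As a rough sketch, both metrics can be pulled in one query along these lines (the `project.dataset` table path is a placeholder for your own export dataset):

```sql
-- Sessions and users for a date range, using the standard
-- Universal Analytics BigQuery export schema.
SELECT
  SUM(totals.visits) AS sessions,
  COUNT(DISTINCT fullVisitorId) AS users
FROM `project.dataset.ga_sessions_*`  -- replace with your export table
WHERE _TABLE_SUFFIX BETWEEN '20230101' AND '20230131'
```

Note that `_TABLE_SUFFIX` filters the sharded daily `ga_sessions_` tables by date, which is how a “date range” is typically expressed against this export.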
Getting the same data is possible, but it requires extra knowledge of the BigQuery Export Schema. Let’s explore three of the key differences between the two data sets.
Key Differences Between Universal Analytics and BigQuery Data Sets
1) Data Exported to BigQuery
First and foremost, the following data is available in the Universal Analytics reporting interface but is not included in the Google Analytics data exported to BigQuery:
- Demographic and interest data (note: this is third-party data about your first-party users. Providing it at hit level would raise privacy concerns, which is why it is only available to you in aggregate via the reporting interface.)
- Custom channel groupings
- Google Marketing Platform (GMP) integration data (note: GAM/DFP and Google Ads data are available, but they’re technically not part of the GMP)
- Search Console data (note: Adswerve has created a GSC-to-BQ connector for this reason; we can join GSC and GA data on landing page and country. Contact us to learn more.)
- Data joined via Query Time Data Import (note: data joined via Processing Time will be available, but only moving forward)
2) Effort to Reproduce Reports
Second, many of the pre-canned reports in the Universal Analytics UI can be rebuilt in BigQuery. However, certain reports take much more effort to reproduce because you have to recreate the underlying data points manually:
- (From the “Audience” section): Lifetime Value, Cohort Analysis
- Any report in the Multi-Channel Funnel reports (and anything related to attribution)
3) Downstream Data Considerations
Finally, given BigQuery is a downstream data storage platform, there are two main considerations we’d like to point out when you’re designing a Universal Analytics implementation:
- If you’re editing/redefining your channel definitions, edit the Default Channel Groupings. This is the only set of channel definitions that will be available in BigQuery (and Data Studio), exported in the channelGrouping field. You can technically recreate other channel definitions using a CASE statement with multiple WHEN conditions, but why make it harder on yourself? If you’re worried about losing other channel definitions, create custom channel groupings to save those rules.
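To illustrate the difference in effort, here is a sketch of a hand-built CASE-statement grouping next to the exported field. The grouping rules and table path are illustrative, not your actual definitions:

```sql
-- Hypothetical example: recreating a custom channel grouping by hand.
-- Field names follow the standard UA BigQuery export schema; the
-- WHEN conditions below are placeholders for your own rules.
SELECT
  CASE
    WHEN trafficSource.medium = 'organic' THEN 'Organic Search'
    WHEN trafficSource.medium IN ('cpc', 'ppc') THEN 'Paid Search'
    WHEN trafficSource.medium = 'email' THEN 'Email'
    ELSE 'Other'
  END AS custom_channel,
  SUM(totals.visits) AS sessions
FROM `project.dataset.ga_sessions_*`  -- replace with your export table
GROUP BY custom_channel;

-- Versus the Default Channel Grouping, which is exported for you:
SELECT
  channelGrouping,
  SUM(totals.visits) AS sessions
FROM `project.dataset.ga_sessions_*`
GROUP BY channelGrouping;
```

The second query is all you need when your definitions live in the Default Channel Groupings, which is why we recommend editing those rather than maintaining CASE logic downstream.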
- If you’re leveraging the Data Import feature, create two imports: one Query Time and one Processing Time. We have tested this approach, and it works as long as the files are identical. The Query Time feature lets you use the newly joined data retroactively (e.g., in segments and past reporting). The Processing Time feature lets that data start populating downstream in BigQuery from the time of upload.
We hope this helps you on your BigQuery journey. Questions? Contact us! And in the meantime, happy querying!