New Open-Source Tool from Adswerve: Unnest Your BigQuery Google Analytics 360 Data


June 30, 2020

We’ve spent countless hours analyzing Google Analytics 360 (GA360) data in BigQuery (BQ) for our clients. As technologists and analysts, we’re proponents of open source — which enables and encourages collaboration and development of software — and we want to pass along our learnings to help you solve your own real-world challenges. 

One of the main GA360 benefits is its ability to export raw event-level data to BQ for unlimited analysis. However, the initial look into the BQ schema and data can be daunting. First, no pre-built reports exist; the data rows are raw event level, meaning each click, scroll and navigation is collected. The only way to extract information is by using SQL. Even SQL experts coming from relational database backgrounds like SQL Server or MySQL can struggle with the format of the nested records that BQ stores GA360 data. In fact, so many clients have asked us to unnest (flatten) their GA360 data that we’ve built several automated data pipelines over the past few years. These have helped them query the GA360 data using familiar SQL join clauses as opposed to learning the specific unnest function offered by BQ. They can help you, too. Here’s how we developed them.

Over the past few years, Adswerve has developed many custom software solutions (packages) that unnest (flatten) the GA360 data into separate tables from the original table named ga_sessions (segmented by day). Each time the project was executed, one table was created for sessions with a primary key defined as session_id and another table for hits, uniquely identified by the hit_id key, and joined to the session table using the session_id key. 

During this time period, we’ve used different techniques and technologies such as Google Apps Scripts, python, javascript and different methods for extracting the custom dimensions along with various time-based schedulers that were problematic. This experience, coupled with the consumer demands of the flattened data, has given us insight into the many different challenges, requests and initiatives posed by our clients. By the method of common denomination, we’ve created an open-source product that flattens the data into normalized tables on an event-driven trigger but still allows for flexibility based on client need.

The open-source project currently has six user stories listed in the project plan to enhance and grow the product. Having worked with a variety of e-commerce companies and websites, we understand the need for flexibility on how to store custom dimensions and at which intervals they should be updated, either manually or automatically. 

Get Our GA Flattener Tool

With the release of this open-source tool, we will be helping more companies analyze GA360 data quickly without having to become experts in specific BQ-only database structure or ramping up learning SQL functions. We’d rather have them focus on finding actionable insights with their GA360 than worrying about how to extract the information from it.  

In addition to providing a Google Cloud Platform (GCP) based product that flattens all your GA exported data into separate tables for each, we’ve included a simple installation process using Google’s deployment manager that requires only one step to install the entire solution. You can find the repository, description  and installation instructions on github.com, specifically at github.com/adswerve/google_analytics_flattener.  We encourage all to join the open-source community and follow this product.   

Please don’t hesitate to reach out with any questions.