Saving money is essential to almost everything we do. So when it comes to BigQuery — a data warehouse that is part of the Google Cloud Platform — it’s not surprising that saving money on BigQuery storage can be a game-changer.
As you may already know, BigQuery’s potential to turn big data into calculated business decisions is seemingly limitless. It allows you to load your data from various platforms into one centralized data warehouse, where you can then discover significant insights through in-depth data analysis.
While BigQuery projects let you query massive datasets in seconds and uncover user behavior patterns that are hard to notice in standard reporting, they also accumulate a lot of stored data over time. And while the cost of storage per GB stays consistent (and inexpensive), the total BigQuery storage cost can still grow. To help keep those costs down, here are seven tips for saving money on BigQuery storage.
Do some “spring cleaning” in the project and datasets to find any tables or partitions you can delete or archive. Basically, don’t be a digital hoarder. For example, delete temporary tables you created for EDA (exploratory data analysis) if you don’t need them anymore.
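To make the "spring cleaning" idea concrete, here is a minimal Python sketch that flags cleanup candidates from a list of table metadata. It assumes you have already pulled each table's name, last-modified time and size from your project. The `TableInfo` record, the name prefixes and the 180-day threshold are all hypothetical choices for illustration, not BigQuery defaults.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class TableInfo:
    """Hypothetical record for one table; not a BigQuery API type."""
    table_id: str            # e.g. "project.dataset.table"
    last_modified: datetime  # timezone-aware timestamp
    size_gb: float


def find_cleanup_candidates(tables, now, max_age_days=180,
                            temp_prefixes=("tmp_", "temp_", "eda_")):
    """Return tables that look temporary or have not changed in max_age_days.

    The prefixes and the age threshold are assumptions; adjust them to
    your own naming conventions and retention needs.
    """
    cutoff = now - timedelta(days=max_age_days)
    candidates = []
    for table in tables:
        short_name = table.table_id.split(".")[-1].lower()
        looks_temporary = short_name.startswith(temp_prefixes)
        is_stale = table.last_modified < cutoff
        if looks_temporary or is_stale:
            candidates.append(table)
    # Largest tables first, since they offer the biggest savings.
    return sorted(candidates, key=lambda t: t.size_gb, reverse=True)


if __name__ == "__main__":
    now = datetime(2024, 1, 1, tzinfo=timezone.utc)
    sample = [
        TableInfo("proj.analytics.tmp_eda_sessions",
                  datetime(2023, 12, 20, tzinfo=timezone.utc), 5.0),
        TableInfo("proj.analytics.events_2019",
                  datetime(2020, 1, 15, tzinfo=timezone.utc), 120.0),
        TableInfo("proj.analytics.events_current",
                  datetime(2023, 12, 31, tzinfo=timezone.utc), 300.0),
    ]
    for table in find_cleanup_candidates(sample, now):
        print(table.table_id, table.size_gb)
```

Treat the output as a review list rather than a delete list: a person should still confirm that each table is no longer needed before dropping or archiving it.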
We recommend that you keep your data in BigQuery and partition your tables.
A partitioned table is a special table divided into segments, called partitions, that make it easier to manage and query your data. By dividing a large table into smaller partitions, you can:
If a table or partition has not been edited for 90 consecutive days, the price of BigQuery storage for it automatically drops by 50 percent to $0.01 per GB per month under BigQuery's long-term storage pricing. In addition to savings, long-term storage ensures that:
BigQuery Storage
See BigQuery storage pricing for details.
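If you want a rough feel for what long-term pricing does to a bill, the arithmetic is easy to sketch in Python. The rates below are assumptions taken from the figures above: $0.01 per GB per month for long-term storage, which implies $0.02 for active storage given the 50 percent drop. The sketch ignores the free tier, regional price differences and any data you delete outright, so treat it as a back-of-the-envelope estimate only.

```python
# Assumed list prices in USD per GB per month; check current pricing for your region.
ACTIVE_USD_PER_GB_MONTH = 0.02
LONG_TERM_USD_PER_GB_MONTH = 0.01


def monthly_storage_cost(active_gb, long_term_gb):
    """Estimated monthly storage cost for a mix of active and long-term data."""
    return (active_gb * ACTIVE_USD_PER_GB_MONTH
            + long_term_gb * LONG_TERM_USD_PER_GB_MONTH)


def long_term_savings_fraction(active_gb, long_term_gb):
    """Savings relative to paying the active rate for every GB."""
    total_gb = active_gb + long_term_gb
    if total_gb == 0:
        return 0.0
    baseline = total_gb * ACTIVE_USD_PER_GB_MONTH
    return 1 - monthly_storage_cost(active_gb, long_term_gb) / baseline


if __name__ == "__main__":
    # Hypothetical project: 400 GB active, 600 GB untouched for 90+ days.
    cost = monthly_storage_cost(400, 600)
    savings = long_term_savings_fraction(400, 600)
    print(f"Estimated cost: ${cost:.2f}/month, savings: {savings:.0%}")
```

With these assumed rates, a hypothetical project holding 400 GB of active data and 600 GB of long-term data would pay about $14 per month instead of $20, or roughly 30 percent less.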
Savings examples
Example A: 67% of BigQuery data is in long-term storage for 43% savings:
Example B: 80% of BigQuery data is in long-term storage for 50% savings:
Taking advantage of BigQuery’s long-term storage should give you significant savings. If you want to take it further, see the following Google Cloud Storage (GCS) archiving steps.
Use the expiration settings to remove unneeded tables and partitions.
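One way to apply expiration settings is with BigQuery DDL. The Python sketch below only builds the `ALTER TABLE ... SET OPTIONS` statements as strings, so you can review them before running them in the BigQuery console or through a client library. The helper names and the table name in the usage example are hypothetical, and the identifier check is a simplifying assumption rather than BigQuery's full naming rules.

```python
import re

# Simplified check: "dataset.table" or "project.dataset.table".
_TABLE_RE = re.compile(r"^[A-Za-z0-9_-]+(\.[A-Za-z0-9_-]+){1,2}$")


def _validate(table, days):
    """Reject malformed table names and non-positive day counts."""
    if not _TABLE_RE.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    if not isinstance(days, int) or days <= 0:
        raise ValueError("days must be a positive integer")


def partition_expiration_sql(table, days):
    """DDL that makes each partition expire `days` after its partition date."""
    _validate(table, days)
    return (f"ALTER TABLE `{table}` "
            f"SET OPTIONS (partition_expiration_days = {days})")


def table_expiration_sql(table, days):
    """DDL that makes the whole table expire `days` from the time it is run."""
    _validate(table, days)
    return (f"ALTER TABLE `{table}` SET OPTIONS (expiration_timestamp = "
            f"TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL {days} DAY))")


if __name__ == "__main__":
    print(partition_expiration_sql("my_project.my_dataset.events", 90))
    print(table_expiration_sql("my_project.my_dataset.tmp_eda_sessions", 7))
```

Expired tables and partitions are deleted, so make sure anything you still need has been archived before the expiration date arrives.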
Classify your data into the following groups, based on the frequency of access:
Before your tables and partitions expire, move data in classification groups #2, #3 and #4 into Nearline, Coldline and Archive GCS buckets, respectively.
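The mapping from access frequency to storage class can be sketched as a small Python function. The day thresholds here are assumptions: they mirror the minimum storage durations of the Nearline (30 days), Coldline (90 days) and Archive (365 days) classes, and you should replace them with the boundaries of your own classification groups. The function name is hypothetical.

```python
def classify_for_archiving(days_since_last_access):
    """Return (group number, GCS storage class) for a table or partition.

    Group 1 stays in BigQuery, so its storage class is None.
    Thresholds are illustrative assumptions, not fixed rules.
    """
    if days_since_last_access < 0:
        raise ValueError("days_since_last_access must be non-negative")
    if days_since_last_access < 30:
        return 1, None          # accessed recently: keep in BigQuery
    if days_since_last_access < 90:
        return 2, "NEARLINE"
    if days_since_last_access < 365:
        return 3, "COLDLINE"
    return 4, "ARCHIVE"


if __name__ == "__main__":
    for days in (5, 45, 200, 800):
        print(days, classify_for_archiving(days))
```

Colder classes cost less to store but more to read, and they carry minimum storage durations, so it pays to be conservative when you are unsure how often a dataset is accessed.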
Google Cloud Storage (GCS) Costs
* Storage costs vary depending on your region.
Learn more about GCS:
Archival BigQuery Storage Solution
Below is a design of a solution that will archive your BigQuery data in GCS.
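One building block that any such archive needs is an export step that copies a day of data from BigQuery into a bucket. As a minimal sketch, and not necessarily how the design described here implements it, the Python function below builds a BigQuery `EXPORT DATA` statement for a single day. It assumes the table has a `DATE` column to filter on and that the target bucket was created with the storage class you want (Nearline, Coldline or Archive). The function name, bucket, prefix and column names are hypothetical.

```python
import re
from datetime import date

_TABLE_RE = re.compile(r"^[A-Za-z0-9_-]+(\.[A-Za-z0-9_-]+){1,2}$")
_COLUMN_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_PATH_RE = re.compile(r"^[A-Za-z0-9_./-]+$")


def export_day_sql(table, date_column, day, bucket, prefix):
    """Build an EXPORT DATA statement that writes one day of rows as Parquet.

    Assumes `date_column` is a DATE column; the checks below are simplified
    guards against malformed input, not BigQuery's full naming rules.
    """
    if not _TABLE_RE.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    if not _COLUMN_RE.match(date_column):
        raise ValueError(f"unexpected column name: {date_column!r}")
    if not _PATH_RE.match(bucket) or not _PATH_RE.match(prefix):
        raise ValueError("unexpected bucket or prefix")
    if not isinstance(day, date):
        raise ValueError("day must be a datetime.date")

    # EXPORT DATA expects exactly one wildcard in the destination URI.
    uri = f"gs://{bucket}/{prefix}/{day.isoformat()}/*.parquet"
    return (
        "EXPORT DATA OPTIONS (\n"
        f"  uri = '{uri}',\n"
        "  format = 'PARQUET',\n"
        "  overwrite = true\n"
        ") AS\n"
        f"SELECT * FROM `{table}`\n"
        f"WHERE {date_column} = DATE '{day.isoformat()}'"
    )


if __name__ == "__main__":
    print(export_day_sql("my_project.my_dataset.events", "event_date",
                         date(2023, 1, 15), "my-archive-bucket", "events"))
```

Verify that the exported files are complete before you let the source table or partition expire.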
We suggest using the Cloud Billing Console to help you monitor how your storage and access costs changed after archiving. This will also allow you to adjust your savings strategy as needed.
Learn more:
Check out these helpful resources for more information about how you can cut your BigQuery costs:
Interested in getting started in BigQuery? Adswerve is offering a $500/month credit to any of our existing or new Google Analytics 360 subscribers interested in leveraging the Google Cloud and Google Marketing Platforms together to meet their marketing goals. Contact us to learn more.