Whether you use Cervinodata for BigQuery or not, it is good to understand the cost structure of Google BigQuery. If you are not already a Cervinodata user, feel free to start a free 14-day trial.
Whether Google BigQuery charges for its services depends on how you use it. You have to use it quite extensively (i.e. with large volumes of data) before you exceed the free usage limits, but it is still wise to understand the cost structure. Also, check out their Cost calculator. Cervinodata has some optimizations in place to reduce the workload and costs in BigQuery, so please read the article below.
Cost structure Google BigQuery
Google BigQuery's cost structure is based on four pillars:
- Storage (Active or Long-term)
- Streaming inserts
- (On demand) Queries
- Network pricing
1. Storage: Active or Long-term
When you make modifications to a table (e.g. add a record), that table becomes "Active storage". If you do not change anything in a particular table for more than 90 days, this table automatically becomes "Long-term storage", which has a lower price.
Estimated costs Google BigQuery for storage with Cervinodata
The first 10 GB are free. Beyond 10 GB, costs are moderate: $0.02 per GB per month for active storage. Even with 1 TB of data, active storage only costs about $20 per month (as an indication, none of our very large customers are near 1 TB of data). Long-term storage is cheaper at $0.01 per GB per month.
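As a rough illustration of the storage arithmetic above, the following Python sketch applies the prices quoted in this article. The function name and the way the free 10 GB is deducted (from active storage first, then from long-term storage) are assumptions made for this example, so check the current Google BigQuery price list before relying on the numbers.

```python
# Storage prices as quoted in this article (USD per GB per month).
# These are illustrative constants, not values fetched from Google.
FREE_STORAGE_GB = 10.0
ACTIVE_PRICE_PER_GB = 0.02
LONG_TERM_PRICE_PER_GB = 0.01


def monthly_storage_cost(active_gb: float, long_term_gb: float = 0.0) -> float:
    """Estimate the monthly storage bill in USD.

    Assumption: the free 10 GB is deducted from active storage first,
    and whatever is left of it from long-term storage.
    """
    billable_active = max(active_gb - FREE_STORAGE_GB, 0.0)
    free_left = max(FREE_STORAGE_GB - active_gb, 0.0)
    billable_long_term = max(long_term_gb - free_left, 0.0)
    return (billable_active * ACTIVE_PRICE_PER_GB
            + billable_long_term * LONG_TERM_PRICE_PER_GB)


# Example: 1 TB (1,000 GB) of active storage and no long-term storage.
estimate = monthly_storage_cost(1000)
```

With the free tier deducted, 1 TB of active storage comes out slightly below the 20 dollars per month mentioned above.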
2. Streaming inserts
Adding a record to a table is called a "streaming insert". Streaming inserts cost $0.01 per 200 MB. So if you insert 1 TB (1,000,000 MB) each month, this will cost you 1,000,000 MB * ($0.01 / 200 MB) = $50.
Estimated costs Google BigQuery for streaming inserts with Cervinodata
Where possible, Cervinodata for BigQuery avoids streaming inserts and uses "Load Jobs" through the API instead (see the article about loading data into Google BigQuery). Load jobs are free of charge, so no streaming insert costs are expected from using Cervinodata.
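To make the difference between the two write paths concrete, here is a minimal sketch using the google-cloud-bigquery Python client. It is not Cervinodata's actual implementation; the helper names, the table ID and the sample row are hypothetical. In that client, load_table_from_json starts a load job, whereas insert_rows_json performs a streaming insert.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def load_rows_with_load_job(client: Any, table_id: str, rows: List[Row]) -> None:
    """Write rows through a load job.

    `client` is expected to be a google.cloud.bigquery.Client.
    """
    job = client.load_table_from_json(rows, table_id)
    job.result()  # block until the load job has finished


def stream_rows(client: Any, table_id: str, rows: List[Row]) -> None:
    """Write rows through the streaming API (billed per volume inserted)."""
    errors = client.insert_rows_json(table_id, rows)
    if errors:
        raise RuntimeError(f"Streaming insert failed: {errors}")


# Usage (requires `pip install google-cloud-bigquery` and valid credentials):
#
#   from google.cloud import bigquery
#   rows = [{"date": "2024-01-01", "clicks": 42}]  # hypothetical data
#   load_rows_with_load_job(bigquery.Client(),
#                           "my_project.my_dataset.my_table", rows)
```

Both helpers take the client as a parameter, so the same calling code can switch between the two paths without other changes.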
3. (On demand) Queries
For executing queries on tables, BigQuery charges a small fee per terabyte (TB) processed. The first TB per month is free. After that, each TB costs $5.
Things to take into account
- The calculation is based on the total amount of data processed in the columns a query reads. So a query like Select * from table will process (and cost) more than one that selects specific columns from a table.
- There is a minimum of 10 MB per completed query.
- For queries that process less than 10 MB of data, the free limit of 1 TB per month is very high (you only reach the 1 TB threshold after 100,000 queries of 10 MB per month).
- Queries on larger datasets will make you reach the threshold sooner. So optimise your queries where possible.
- There are no costs for queries served from the cache (those queries are also much faster). See more about the Google BigQuery cache here
- There is no charge if a query is not executed due to an error.
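The rules in the list above can be sketched as a small Python calculation. It uses the figures quoted in this article (10 MB minimum per query, 1 TB free per month, 5 dollars per TB) and the same decimal units (1 TB = 1,000,000 MB). The function names are made up for this example, and Google's actual billing units and prices may differ.

```python
from typing import Iterable

MIN_BILLED_MB = 10.0              # minimum billed per completed query
MB_PER_TB = 1_000_000.0           # decimal units, as used in this article
FREE_MB_PER_MONTH = MB_PER_TB     # the first TB per month is free
PRICE_PER_TB_USD = 5.0


def billed_mb(processed_mb: float, from_cache: bool = False,
              failed: bool = False) -> float:
    """Return the number of MB billed for a single query."""
    if from_cache or failed:
        # Cached results and queries that fail with an error are not charged.
        return 0.0
    return max(processed_mb, MIN_BILLED_MB)


def monthly_query_cost(processed_mb_per_query: Iterable[float]) -> float:
    """Estimate one month of on-demand query costs in USD."""
    total_mb = sum(billed_mb(mb) for mb in processed_mb_per_query)
    billable_mb = max(total_mb - FREE_MB_PER_MONTH, 0.0)
    return billable_mb / MB_PER_TB * PRICE_PER_TB_USD


# 100,000 small queries of 2 MB each are billed as 10 MB each,
# which exactly uses up the free TB.
small_queries_cost = monthly_query_cost([2.0] * 100_000)
```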
Estimated costs Google BigQuery for queries with Cervinodata
Most of the queries in Google BigQuery will not be executed by Cervinodata processes, but by the user. That makes it a bit hard to predict query costs. When you take into account the items in the previous paragraph, your query costs will probably be (very) modest. Currently, we only have one customer (a very heavy query user) that is paying a couple of dollars per month for running their own queries.
4. Network pricing
Moving your data from one region to another will not happen on a regular basis, but it is still something to consider. Moving data between distant regions in particular (e.g. from the US to Asia) can be costly. Cervinodata will not move your data across regions, but do keep in mind the costs of doing this yourself: between $0.01 and $0.12 per GB.
Estimated costs Google BigQuery for network usage with Cervinodata
Cervinodata will not move data across regions, so unless you do this yourself, there are no costs to be expected here.
Cost reducing features built in Cervinodata
When Cervinodata synchronises data to Google BigQuery a number of features help keep the costs down. Google BigQuery calls them "Custom Cost Controls".
- Partitioned data: Cervinodata comes with "Partitioned data" fields you can use in your queries. If you filter on a partitioned date field, the query will not look beyond the dates you specify. If there is a lot of history available, this reduces the amount of data processed, potentially saving you money.
- Incremental data feed: After the one-time setup sync, in which the entire history is inserted into BigQuery, Cervinodata only adds new data to the table. The data is appended, not replaced.
Cost reducing tips from Google BigQuery
There are a number of best practices to follow to keep the costs as low as possible. BigQuery offers some itself (Best Practices from Google BigQuery):
- Set a quota in the Admin Console (see explanation here)
- Use the partitioned data fields in the appropriate tables
- Do not use Select * from, but query only the particular variables you need
- Do not move data between regions
- Use the preview option to check out what's inside a table
- Do a dry run before you execute a query and/or validate the query (see documentation here).
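As an illustration of the dry-run tip, here is a minimal sketch using the google-cloud-bigquery Python client. A dry run reports how many bytes a query would process without executing it. The helper names, the table and the column names are hypothetical, and the cost conversion uses the 5 dollars per TB quoted in this article with a decimal terabyte, so treat the result as an approximation only.

```python
from typing import Any

PRICE_PER_TB_USD = 5.0    # on-demand price quoted in this article
BYTES_PER_TB = 10 ** 12   # decimal terabyte, matching the article's arithmetic


def bytes_to_usd(total_bytes: int) -> float:
    """Convert bytes processed into an on-demand cost, ignoring the free tier."""
    return total_bytes / BYTES_PER_TB * PRICE_PER_TB_USD


def dry_run_bytes(sql: str, client: Any = None) -> int:
    """Ask BigQuery how many bytes a query would process, without running it."""
    # Imported lazily so bytes_to_usd works without the library installed.
    from google.cloud import bigquery

    client = client or bigquery.Client()
    job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=job_config)
    return job.total_bytes_processed


# Usage (requires `pip install google-cloud-bigquery` and valid credentials).
# The query selects only the columns it needs and filters on a date field:
#
#   sql = """
#       SELECT date, clicks
#       FROM `my_project.my_dataset.my_table`
#       WHERE date >= '2024-01-01'
#   """
#   print(bytes_to_usd(dry_run_bytes(sql)))
```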
Feel free to contact support at any time.