Guide

Keep track of your BigQuery costs by exporting usage logs.

Turn on BigQuery audit log exports to start analysing your BigQuery usage and keep track of your costs.

Data modeling

Dan Lee on March 31, 2020

BigQuery makes detailed usage logs available through Cloud Logging, but before you can analyse them you'll want to export them to BigQuery. Once the export is turned on, it's easy to keep track of things like BigQuery costs with a simple query. Turning the export on is straightforward, but Google's documentation is vague on the details, so I thought I'd write this post to help.

Setting up the export

1. Open the Logs Viewer and make sure you've chosen the correct GCP project in the drop-down at the top left.

2. Expand the advanced filter box.

3. Add the following advanced search into the filter box:

   resource.type="bigquery_project" OR resource.type="bigquery_dataset"

   The viewer should now show only BigQuery-related log entries.

4. Click Create Sink, enter a "Sink Name", choose BigQuery as the "Sink Service", and select a BigQuery dataset to write the logs into (you can also create a new dataset through the drop-down interface).

That's it! All future logs relating to BigQuery usage in your project will be written to this dataset. Note that it's not possible to backfill historical data.
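
Once some queries have run in the project, you can check that log entries are arriving. Here's a minimal sketch, assuming you named the destination dataset bigquery_logs (the export writes to date-sharded tables whose names begin with cloudaudit_googleapis_com_data_access_):

-- Confirm the export is flowing: count today's audit log entries.
-- Assumes the destination dataset is named bigquery_logs.
select
  count(*) as log_entries
from
  `bigquery_logs.cloudaudit_googleapis_com_data_access_*`
where
  date(timestamp) = current_date()

If this returns zero, wait a few minutes: the sink only captures logs generated after it was created.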

Modelling the data

You now have a table in BigQuery with details of every activity relating to your BigQuery project. However, the table isn't particularly user friendly. Answering simple questions like "which user spent the most on BigQuery yesterday?" requires complex JSON transformations of difficult-to-remember fields:

select
  protopayload_auditlog.authenticationInfo.principalEmail,
  -- On-demand pricing: $5 per TiB of billed bytes
  sum(5.0 * cast(
    json_extract_scalar(
      protopayload_auditlog.metadataJson,
      "$.jobChange.job.jobStats.queryStats.totalBilledBytes"
    ) as int64
  ) / pow(2, 40)) as spend
from
  `bigquery_logs.cloudaudit_googleapis_com_data_access_*`
where
  resource.type = 'bigquery_project'
  and date(timestamp) = date(timestamp_sub(current_timestamp, interval 1 day))
group by 1
order by 2 desc
limit 1

To help with this, we've created a Dataform package (documentation here) that creates a clean and easy-to-use table with all the fields you'd expect. Adding a Dataform package to your project is as simple as adding a few lines of code.

Keeping track of your BigQuery costs and answering questions like "Which user spent the most yesterday?" can now be done with a much simpler query:

select
  principal_email,
  sum(cost_usd) as spend
from
  dataform.bigquery_compute_logs
where
  date(timestamp) = date(timestamp_sub(current_timestamp, interval 1 day))
group by 1
order by 2 desc
limit 1
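
The same table makes it just as easy to look at spend over time. As a quick sketch, again assuming the package's dataform.bigquery_compute_logs output table shown above, here's daily spend for the past week:

-- Daily BigQuery spend over the last 7 days,
-- using the table produced by the Dataform package.
select
  date(timestamp) as day,
  sum(cost_usd) as spend
from
  dataform.bigquery_compute_logs
where
  date(timestamp) >= date_sub(current_date, interval 7 day)
group by 1
order by 1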

If you'd like to learn more about Dataform, check out our documentation!
