
Exporting BigQuery usage logs to... BigQuery

Turn on BigQuery audit log exports to start analysing your BigQuery usage

Data modeling

Dan Lee on March 31, 2020

BigQuery makes detailed usage logs available through Cloud Logging, but before you can start analysing them you'll need to export them to BigQuery. Turning the export on is really simple, but unfortunately Google's documentation is pretty vague, so I thought I'd write this blog post to help.

Setting up the export

1. Open the Logs Viewer and make sure you've chosen the correct GCP project in the drop-down at the top left.

2. Expand the advanced filter box.

3. Add the following advanced search into the filter box:

   resource.type="bigquery_project" OR resource.type="bigquery_dataset"

   The viewer should now show only log entries relating to BigQuery activity.

4. Click Create Sink, enter a "Sink Name", choose BigQuery as the "Sink Service" and select a BigQuery dataset to write the logs into (you can also create a new dataset through the dropdown interface).

That's it! All future logs relating to BigQuery usage in your project will be written to this dataset. Note that it's not possible to backfill historical data: only logs generated after the sink is created are exported.
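Once some log entries have flowed through, you can sanity-check the export with a quick query. Here's a minimal sketch, assuming your sink writes into a dataset named bigquery_logs (swap in your own dataset name):

select
  -- the data access logs land in date-sharded tables named
  -- cloudaudit_googleapis_com_data_access_YYYYMMDD
  table_name
from
  bigquery_logs.INFORMATION_SCHEMA.TABLES
order by
  table_name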

Modelling the data

You now have a table in BigQuery with details of every single activity relating to your BigQuery project. However, the table isn't particularly user-friendly. Answering a simple question like "which user spent the most on BigQuery yesterday" requires complex JSON transformations of difficult-to-remember fields:

select
  protopayload_auditlog.authenticationInfo.principalEmail,
  -- on-demand queries are billed at $5.00 per TiB (2^40 bytes), at the time of writing
  sum(5.0 * cast(
    json_extract_scalar(
      protopayload_auditlog.metadataJson,
      "$.jobChange.job.jobStats.queryStats.totalBilledBytes"
    ) as int64
  ) / pow(2, 40)) as spend
from
  `bigquery_logs.cloudaudit_googleapis_com_data_access_*`
where
  resource.type = 'bigquery_project'
  and date(timestamp) = date(timestamp_sub(current_timestamp, interval 1 day))
group by 1
order by 2 desc
limit 1

To help with this, we've created a Dataform package (see its documentation for setup details) that creates a clean, easy-to-use table with all the fields you'd expect. Adding a Dataform package to your project is as simple as adding a few lines of code.

"How much did we spend yesterday" can then be answered with the much simpler query:

select
  principal_email,
  sum(cost_usd) as spend
from
  dataform.bigquery_compute_logs
where
  date(timestamp) = date(timestamp_sub(current_timestamp, interval 1 day))
group by 1
order by 2 desc
limit 1
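The same table makes other questions one-liners too. As a sketch, here's total spend per day over the last week, assuming the same dataform.bigquery_compute_logs table and the timestamp and cost_usd fields used above:

select
  date(timestamp) as day,
  sum(cost_usd) as spend
from
  dataform.bigquery_compute_logs
where
  timestamp > timestamp_sub(current_timestamp, interval 7 day)
group by 1
order by 1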

If you'd like to learn more about Dataform, check out our documentation and sign up for a free account.

