Case study

How Kaleva Media used Dataform to leverage the power of Snowflake and scale their processes

After initially setting up Dataform in November 2019, Kaleva now manages 243 (and growing) separate datasets: all documented, and automatically tested using assertions.

Josie Hall on May 15, 2020

The situation

Kaleva is a Finnish online and print media company. They started as a local newspaper covering northern Finland but soon wanted to expand their offering. They began buying smaller newspapers and now own multiple newspapers across northern Finland, as well as digital news sites and online marketing platforms.

The company is over 120 years old and has over 800 employees! Kaleva Media therefore has a huge amount of historical data. They work with clickstream data to measure events on their website, as well as financial data, CRM data and marketing data.

A few years ago, Kaleva made a conscious decision to become more data-driven. They hired a data team whose vision was to create a self-serve BI culture within Kaleva, and the team started designing the data architecture essentially from scratch.

The first step was to centralise all their data in a warehouse. For this, they chose Fivetran for data ingestion and Snowflake as their data warehouse. The next step was modeling and making sense of all their raw data. They tried a few products but found that either the learning curve was too steep or the tools required a lot of maintenance.

As a small team supporting a large company, they knew they wanted a tool that could amplify what the data team was able to do. They wanted a tool which:

  • Helped the team be more productive, minimising time spent on maintaining infrastructure and menial tasks.
  • Helped the team to collaborate and work in a unified way.
  • Allowed them to leverage the power of their Snowflake warehouse.

Improving productivity of the data team

“We’ve never really compromised on data quality but assertions do give us more time. They allow us to do more with the same number of people.”

Kaleva works with huge amounts of historical data, which is great for analysis but also makes setting up pipelines complex and time-consuming. The data team never wanted to compromise on data quality, which meant they would often spend hours checking whether pipelines were failing and debugging them when they were.

“If you look at Harvard Business Review, one thing they raise is about measuring the quality of your data and making it visible. That’s definitely something we can say we can do better with Dataform than without.”

With Dataform, Kaleva was able to use assertions to make sure their pipelines ran smoothly. Assertions run as part of a schedule and check that the data doesn’t contain any errors; if any are detected, alerts automatically notify the team. Once an assertion is in place, the team no longer has to keep checking on the dataset by hand. This gives them more time to spend on value-add work, like analysing data.
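
To make the idea concrete, here is a minimal sketch of an assertion as a query that selects rows breaking a rule and passes only when no rows come back. It is written in Python against an in-memory SQLite database purely for illustration; it is not Dataform's own syntax or Kaleva's code. The `subscriptions` table, the `run_assertion` helper and the printed alert are all hypothetical stand-ins.

```python
import sqlite3


def run_assertion(conn, name, failing_rows_sql):
    """Run a query that selects rule-breaking rows; pass only if it returns none."""
    failing_rows = conn.execute(failing_rows_sql).fetchall()
    return {"name": name, "passed": len(failing_rows) == 0, "failing_rows": failing_rows}


# Hypothetical in-memory table standing in for a warehouse dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE subscriptions (subscriber_id INTEGER, paper TEXT)")
conn.executemany(
    "INSERT INTO subscriptions VALUES (?, ?)",
    [(1, "Kaleva"), (2, "Kaleva"), (None, "Kaleva")],
)

# Rule: every subscription must have a subscriber_id.
result = run_assertion(
    conn,
    "subscriber_id_not_null",
    "SELECT * FROM subscriptions WHERE subscriber_id IS NULL",
)

# Stand-in for the alert a scheduler would send on failure.
if not result["passed"]:
    print(f"ALERT: {result['name']} failed on {len(result['failing_rows'])} row(s)")
```

Because the check is just a query, it can run on every scheduled pipeline run, and the failing rows themselves show where to start debugging.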

Being able to automate data quality tests has increased the output of the data team. They have since set up a process under which no dataset can be published unless it has an assertion. In just 6 months of use, Kaleva Media has created 243 datasets and 72 assertions, and the gap is closing fast.

Collaboration that scales

“It’s been quite delightful having all four of us working intensely using the same processes”

The data team at Kaleva is made up of four people with different backgrounds and expertise and one shared skill: SQL. Each member of the team had a different way of working and a slightly different development culture, which sometimes made collaboration hard. They adopted Dataform because it helped enforce DataOps best practices, which lifted the team’s productivity even further.

“Using the built in version control has made everything more open and visible. It’s getting people to work in the same way with data.”

With Dataform, Kaleva Media now has one platform where the whole team can collaborate, following a single unified process. For example, all updates to the data pipeline must include documentation, ensuring the project remains easy to understand for the entire team.

Now that they have a fixed way of working, onboarding new members of the team is far easier. The diagram linking all of the transformations (the dependency graph) lets new team members quickly get up to speed with where all the tables live and how they relate to each other.
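
As a rough illustration of what a dependency graph captures, the Python sketch below maps each dataset to the datasets it is built from and derives an order in which they could be rebuilt. The dataset names are hypothetical, and this only illustrates the concept, not how Dataform builds or renders its graph.

```python
from graphlib import TopologicalSorter

# Hypothetical datasets: each key is built from the datasets in its set.
dependencies = {
    "raw_clickstream": set(),
    "raw_crm": set(),
    "sessions": {"raw_clickstream"},
    "customers": {"raw_crm"},
    "customer_sessions": {"sessions", "customers"},
}

# static_order() yields each dataset only after everything it depends on.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)
```

Reading the mapping from a raw table forwards shows which downstream tables a change can affect; reading it backwards from a report table shows where its data comes from.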

Leveraging the power of Snowflake

Kaleva chose Snowflake as the place to centralise all their data. Snowflake is a SQL-based warehouse with an architecture designed specifically for the cloud. Compute usage is billed per second, so you only pay for what you use. Kaleva needed a data modeling solution that worked well with their Snowflake warehouse.

Dataform sits on top of Snowflake and allows Kaleva to manage all the data processes happening in their warehouse. With scheduling and built-in version control, Kaleva can keep their warehouse up to date and keep all the code that runs against Snowflake under version control. All transformations happen in the warehouse itself, so Snowflake does the heavy lifting.

Conclusion

Kaleva needed a tool which could help them become more data-driven, transitioning from a printing press culture to the digital age. Dataform has allowed Kaleva to increase the output of their data team and collaborate more effectively. After initially setting up their Snowflake warehouse in November 2019, they now manage 243 (and growing) separate datasets: all documented, and automatically tested using assertions.

With the help of the Dataform customer success team, the whole team at Kaleva Media was able to get up and running with Dataform in a matter of weeks.

“The way the Dataform team respond so quickly to feedback and feature requests means that their customers stay faithful to them.”

Now that their data models are built on solid foundations, the team can spend less time debugging failing pipelines, and more time delivering valuable insights to the business.
