For the last few years, Google has released a “State of DevOps” report. The report aims to describe how DevOps is done at the most effective engineering teams, and provides a set of benchmarks across industries and verticals. Google claims that there are four key metrics that can be used to categorize developer teams into groups, and they call the very best teams Elite Performers. The metrics provide an actionable way for DevOps teams to understand where they can improve.
Can data teams “steal with pride” and learn something about their own performance using these metrics? I think they can. Whilst they were developed with DevOps in mind, the reality is that the difference between a software engineering team and a data team is not that great (in the abstract). With a little tweaking, the same metrics can be used to rank and benchmark data teams.
These are the four metrics used to understand DevOps performance:
- Lead time: How long does it take to go from code committed to code successfully running in production?
- Deployment frequency: How often does your organization release code to end users?
- Time to restore: How long does it generally take to restore service after a defect that impacts users occurs?
- Change fail percentage: What percentage of releases to users result in degraded service and subsequently require remediation?
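All four metrics can be computed from two simple event logs: one of deployments and one of incidents. As a minimal sketch (the records and timestamps below are invented for illustration):

```python
from datetime import datetime

# Hypothetical deployment records: (commit time, deploy time, caused_incident)
deployments = [
    (datetime(2023, 1, 2, 9, 0), datetime(2023, 1, 2, 11, 0), False),
    (datetime(2023, 1, 3, 10, 0), datetime(2023, 1, 3, 10, 30), True),
    (datetime(2023, 1, 4, 14, 0), datetime(2023, 1, 4, 15, 0), False),
]

# Hypothetical incident records: (reported time, resolved time)
incidents = [
    (datetime(2023, 1, 3, 11, 0), datetime(2023, 1, 3, 13, 0)),
]

# Lead time: median hours from commit to running in production
lead_times = sorted((dep - com).total_seconds() / 3600 for com, dep, _ in deployments)
median_lead_time = lead_times[len(lead_times) // 2]

# Deployment frequency: deployments per day over the observed window
days = (deployments[-1][1] - deployments[0][1]).days or 1
deploys_per_day = len(deployments) / days

# Time to restore: median hours from incident report to resolution
restore_times = sorted((end - start).total_seconds() / 3600 for start, end in incidents)
median_restore = restore_times[len(restore_times) // 2]

# Change fail percentage: share of deployments that caused an incident
change_fail_pct = 100 * sum(1 for *_, failed in deployments if failed) / len(deployments)
```

In practice the deploy log would come from your CI/CD system and the incident log from your issue tracker; the calculation itself stays this simple.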
How can these be reworded to assess data teams’ DataOps performance?
Lead time
The key question here is: how long does it take to go from a code change being made to that change being experienced by end users?
How long does it take to go from a change to the definition of a data model, to that data model reflecting the new definition for your end users (e.g. in a BI tool)?
Deployment frequency
This one doesn’t need much rewording.
How often does your team release updates to data models to end users?
Time to restore
Defects for data teams are data quality issues, so we can be quite specific here.
How long does it generally take to fix a data quality issue once it has been reported to you?
Change fail percentage
Whilst this may be a little harder to measure, the wording doesn’t need too much of an update.
What percentage of changes to your data model result in a data quality issue, or failing pipeline, that requires remediation?
What does good look like?
For the State of DevOps report, Google carried out significant research, speaking to developer teams at many different organizations. To get a true sense of “what good looks like”, we need to do the same for data teams.
That said, across all of these metrics, the best teams will be aiming for:
- Changes to data models with a minimal lead time
- Changes deployed multiple times per day
- Issues resolved within hours, not days
- A low (sub 10%) percentage of releases leading to data quality issues
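The targets above can be turned into a rough yes/no check. A minimal sketch, assuming cut-offs of one day for lead time and restore, at least daily deploys, and sub-10% failures (the exact thresholds are illustrative, not established benchmarks):

```python
def is_elite(lead_time_hours: float, deploys_per_day: float,
             restore_hours: float, fail_pct: float) -> bool:
    """Rough check against the targets listed above (assumed cut-offs)."""
    return (
        lead_time_hours <= 24      # minimal lead time: within a day
        and deploys_per_day >= 1   # changes deployed multiple times per day
        and restore_hours <= 24    # issues resolved within hours, not days
        and fail_pct < 10          # sub-10% change fail percentage
    )

print(is_elite(2, 3, 4, 5))        # a team hitting every target
print(is_elite(72, 0.2, 48, 20))   # a team missing every target
```

Real benchmarks would come from survey data like the effort described below, not from hand-picked thresholds.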
Take the survey!
If you’d like to contribute to our efforts to create a comprehensive set of benchmarks for data teams, take our survey! We’ll take your email address, and as soon as we’ve got a statistically significant number of results we’ll send you the benchmark data, showing you how your team compares.
Achieving Elite status with Dataform
Whilst we didn’t have a State of DataOps report in mind when we started developing Dataform, the fundamental idea behind the product was always to bring DevOps best practices to the data team.
Dataform is a one-stop-shop for managing a data warehouse in an automated, dependable and continuously deployed manner. Teams using Dataform are operating at Elite status by definition, because each of these principles is built into the core of the platform.