Sportradar empowers its open source and DevOps culture with observability
200+ services have their health status covered in dashboards
99.9% availability SLO across all services to deliver against
15% increase in the frequency of releases
Whether it’s Formula 1 racing, world snooker championships, American baseball, European football, or any of dozens of other sports, the fan experience wouldn’t be the same without Sportradar.
As the global leader in sports data intelligence and technology, Sportradar monitors and delivers insights from more than 400,000 matches annually across 60 sports categories. The company is the official partner of the NBA, NHL, and NASCAR, as well as FIFA and UEFA. It is also the only provider entrusted to work with the U.S. sports leagues in an official capacity to distribute sports data (NBA and NHL) around the world for betting purposes. More than 1,000 companies in 80 countries rely on Sportradar’s sports betting, entertainment, and integrity products and services.
"Our technology is behind every major sporting federation and organization around the world," says Scott Dunbar, global head of DevOps engineering at Sportradar. "We are first and foremost a software development organization, adopting bleeding-edge technology such as artificial intelligence and computer vision to deliver the best sports technology and products possible."
From siloed expertise to collaborative, cross-functional teams
When Dunbar and Sportradar’s chief technology officer Ben Burdsall joined the company, every engineering team was working completely independently. "It was the biggest start-up I’d ever seen, with teams siloed away from each other," says Dunbar. "We had pockets of excellence, but there was very little sharing, collaboration, or alignment across teams."
Burdsall’s vision was to reorganize the engineering group along the lines of the "Spotify" scaled agile model. "We set out to create what we call the ‘Spotify’ model, where engineering capacity is organized into tribes, squads, chapters, and guilds," says Dunbar. "The ultimate objective is to enable cross-functional teams, create transparency around roles and responsibilities, and align our engineering function around a clear strategy."
Today, that vision has been achieved, with more than 500 engineers organized into 17 tribes. Each tribe is a team of engineering capacity focused on one product portfolio. DevOps specialists are injected into tribes and squads to complete the cross-functional teams. "As DevOps engineers, we focus on the software delivery lifecycle challenge of observability," says Dunbar. "We define observability as the creation of a system where we can debug unknown unknowns. Our team injects that skillset and passion for using data to understand the health of our applications within our tribes."
Autonomous teams and a single observability strategy
One of the company’s engineering principles that remains in the new organizational structure is the autonomy of its teams. "We believe that our squads and tribes should be free to choose the technology or tooling they’re comfortable with and believe is best for what they need," says Dunbar. "We’ve put a governance and alignment layer in place that enables their autonomy while ensuring consistency in exposing key metrics. Obviously, New Relic plays a very large role in helping us do this."
When Dunbar first came to Sportradar, he found that teams were using a broad array of monitoring tools and approaches to understanding system health. "There was very little consistency and sparse adoption of service level metrics, key performance indicators, and measurement of service level indicators against a service level objective," he says.
With New Relic, teams can truly become autonomous while tracking and reporting on a consistent set of metrics, providing the company with a holistic view of its complex portfolio of products and services. "New Relic's flexibility and support of open source means that our teams can continue using the open telemetry tools that they know but surface key metrics to a single pane of glass," says Dunbar. "We now have a consistent view of service health across our entire estate of more than 200 business services."
Tracking key metrics to drive SLAs
For every technical service, Sportradar expects its teams to capture availability, reliability, and performance metrics against a service level objective (SLO). "We then bubble up technical services into a business service, compare service level indicators (SLIs) against our SLOs, and then guide the business on reliable service level agreements (SLAs) for that particular business service," says Dunbar. "We’re delivering against a 99.9% availability SLO for all our business services."
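To make the SLI-versus-SLO comparison concrete, here is a minimal Python sketch. The 99.9% target comes from the article; everything else is an assumption for illustration: the service names, the minute counts, and in particular the roll-up rule (a business service is treated as only as available as its weakest technical service). The article does not say how Sportradar actually aggregates technical services.

```python
from dataclasses import dataclass
from typing import Sequence

SLO_AVAILABILITY_PCT = 99.9  # the availability SLO quoted in the article


@dataclass(frozen=True)
class TechnicalService:
    name: str             # hypothetical service name
    good_minutes: float   # minutes the service met its availability condition
    total_minutes: float  # minutes in the measurement window

    def availability_sli(self) -> float:
        """Availability SLI as a percentage of the measurement window."""
        return 100.0 * self.good_minutes / self.total_minutes


def business_service_sli(services: Sequence[TechnicalService]) -> float:
    """Assumed roll-up rule: the business service is as available as its weakest part."""
    return min(s.availability_sli() for s in services)


def meets_slo(sli_pct: float, slo_pct: float = SLO_AVAILABILITY_PCT) -> bool:
    """True when the measured SLI is at or above the objective."""
    return sli_pct >= slo_pct


window = 30 * 24 * 60  # a 30-day window, in minutes
feed = TechnicalService("ingest-feed", good_minutes=43_170, total_minutes=window)
api = TechnicalService("odds-api", good_minutes=43_100, total_minutes=window)

print(meets_slo(feed.availability_sli()))            # this service alone meets 99.9%
print(meets_slo(business_service_sli([feed, api])))  # the rolled-up business service does not
```

In this made-up example, a 30-day window leaves roughly 43 minutes of allowed downtime at 99.9%, so a single technical service that is down for 100 minutes pulls the whole business service below its objective.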
Programmability in New Relic gives Dunbar and the engineering teams the ability to customize how they track and calculate metrics. "For us, it was critical to get insight into time-based availability, where availability is determined by the duration of a specified condition in the technical service," he says. "We created a custom observability app in New Relic that pulls data from New Relic synthetics and processes it to give us a complete view of time-based availability for every system."
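One way to read "time-based availability" is as the share of a time window during which the most recent synthetic check was passing. The sketch below illustrates that idea in plain Python. It is not Sportradar's custom app and does not call any New Relic API; the `Check` tuple shape and the rule that a result holds until the next check arrives are assumptions.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Check = Tuple[datetime, bool]  # (timestamp of the check, whether it passed)


def time_based_availability(checks: List[Check], window_end: datetime) -> float:
    """Percentage of the window spent in a passing state.

    Assumption: each check result holds until the next check (or window_end).
    The window starts at the earliest check.
    """
    if not checks:
        raise ValueError("at least one check is required")
    ordered = sorted(checks, key=lambda c: c[0])
    total = window_end - ordered[0][0]
    if total <= timedelta(0):
        raise ValueError("window_end must be after the first check")

    # Pair each check with the timestamp at which its result stops applying.
    next_times = [ts for ts, _ in ordered[1:]] + [window_end]
    up = timedelta(0)
    for (start, passed), end in zip(ordered, next_times):
        if passed:
            up += end - start
    return 100.0 * (up / total)


t0 = datetime(2024, 1, 1, 12, 0)
checks = [
    (t0, True),                           # passing for the first 10 minutes
    (t0 + timedelta(minutes=10), False),  # failing for 5 minutes
    (t0 + timedelta(minutes=15), True),   # passing for the remaining 45 minutes
]
availability = time_based_availability(checks, window_end=t0 + timedelta(minutes=60))
print(f"{availability:.2f}%")
```

Counting duration rather than the number of failed checks means a long outage weighs more than a brief blip, even if both show up as a single failed result.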
Accelerating releases and understanding latency
New Relic has also helped the engineering organization measure and improve release quality while increasing the frequency of releases by 15%. "New Relic has helped us change the culture in terms of how we understand performance and test our systems," says Dunbar. "With New Relic as an integral part of the delivery pipeline for testing changes, it acts as a quality gate to help us capture issues before we ship to production."
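A quality gate of this kind usually comes down to comparing metrics from a test run against agreed limits and stopping the pipeline when any limit is exceeded. The following Python sketch shows that shape only. The metric names and threshold values are hypothetical, and in practice the inputs would come from the observability platform rather than being passed in by hand.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class GateThresholds:
    max_error_rate: float = 0.01       # hypothetical limit: at most 1% failed requests
    max_p95_latency_ms: float = 250.0  # hypothetical limit on 95th-percentile latency


def gate_violations(
    error_rate: float,
    p95_latency_ms: float,
    thresholds: GateThresholds = GateThresholds(),
) -> List[str]:
    """Return a message for each breached limit; an empty list means the gate passes."""
    violations = []
    if error_rate > thresholds.max_error_rate:
        violations.append(
            f"error rate {error_rate:.2%} exceeds {thresholds.max_error_rate:.2%}"
        )
    if p95_latency_ms > thresholds.max_p95_latency_ms:
        violations.append(
            f"p95 latency {p95_latency_ms:.0f} ms exceeds "
            f"{thresholds.max_p95_latency_ms:.0f} ms"
        )
    return violations


# A pipeline step would fail the build when the list is non-empty.
problems = gate_violations(error_rate=0.05, p95_latency_ms=300.0)
for message in problems:
    print("GATE FAILED:", message)
```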
For Sportradar, low-latency data exchange between a massive web of interconnected systems is absolutely imperative. New Relic is essential for helping the company understand and improve data latency using distributed tracing. "We collect millions of data points from literally thousands of data sources in what I call the head of the snake," says Dunbar. "The long tail of the snake is where a variety of our services such as anti-match-fixing and odds calculations consume the data."
According to Dunbar, "New Relic's support for open telemetry and feature-rich APIs are what allows us to use distributed tracing to understand the latencies and ingest time from the data sources at the head of the snake all the way down to the tail. I don't think any other tool could adapt to the sheer variety of products and services within our portfolio and still be able to surface a trace, from the root of the span down to every consumer of that data point."
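The head-to-tail latency Dunbar describes can be pictured as the time from the start of a trace's root span to the end of each span that consumes the data. The sketch below works on a simplified, hypothetical span record rather than on the OpenTelemetry SDK or New Relic's APIs, and the service names and timings are invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Span:
    span_id: str
    parent_id: Optional[str]  # None marks the root span (the "head of the snake")
    service: str              # hypothetical service name
    start_ms: int
    end_ms: int


def latency_to_consumers(spans: List[Span]) -> Dict[str, int]:
    """Milliseconds from the root span's start to the end of each leaf span.

    Leaf spans (spans with no children) stand in for consumers at the tail.
    If a service has several leaf spans, the slowest one is reported.
    """
    roots = [s for s in spans if s.parent_id is None]
    if len(roots) != 1:
        raise ValueError("expected exactly one root span")
    root = roots[0]

    parent_ids = {s.parent_id for s in spans if s.parent_id is not None}
    result: Dict[str, int] = {}
    for span in spans:
        if span.span_id not in parent_ids:  # no span names it as parent: a leaf
            latency = span.end_ms - root.start_ms
            result[span.service] = max(result.get(span.service, 0), latency)
    return result


trace = [
    Span("a", None, "data-ingest", start_ms=0, end_ms=5),
    Span("b", "a", "normalizer", start_ms=5, end_ms=12),
    Span("c", "b", "odds-calculation", start_ms=12, end_ms=40),
    Span("d", "b", "integrity-monitoring", start_ms=12, end_ms=95),
]
latencies = latency_to_consumers(trace)
print(latencies)
```

Because every consumer is measured against the same root timestamp, the numbers are directly comparable across very different downstream services.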
Sportradar successfully completed its shift from siloed teams to its Spotify model of engineering capacity, with a governance layer built in to align the teams around central technology and observability strategies.
"Our entire engineering capacity is comprised of truly cross-functional teams, with totally self-sufficient squads building great products and services without depending on people outside of that squad or tribe," says Dunbar. "The model is working well and we’ve achieved our goals. New Relic is an essential part of our success."
Going forward, Dunbar sees the organization continuing to mature everything that it’s doing, including further improving its observability. "I’m particularly interested in seeing us make even better use of New Relic for deep insights into log aggregation and more fully leverage the OpenTelemetry capabilities for distributed tracing."