We had dozens of tools, but no integrated way to view the data.

Justin Hawkins
engineering manager at Riot Games

The multiplayer online battle arena video game League of Legends, hosting nearly 8 million concurrent players around the world every day, is in a league of its own. Developed by Riot Games, League of Legends is the most-played PC game in the world and a key driver of the explosive growth of e-sports.

The annual League of Legends World Championship, featuring qualified e-sports teams from 13 international leagues, is the most widely viewed and followed e-sport tournament and is among the largest and most popular gaming and sporting events in the world.  

Riot Games was founded in 2006, and its mission has always been to develop, publish, and support the most player-focused games in the world. This singular focus on player experience is the driving force behind everything at Riot Games. It’s also why Riot Games turned to New Relic for its observability platform, which gives the company end-to-end visibility into the player experience and the many systems and microservices that drive it.

Moving away from in-house-developed tools

“Our team focuses on how we operate microservices at scale,” says Justin Hawkins, engineering manager at Riot Games. “Specifically, our job is to deliver observability to all the different DevOps teams, site reliability engineering teams, and network operating center [NOC] so they can monitor, triage, and operate their games for optimal player experience.”

In a microservices architecture like the one at Riot Games, gaining visibility into the different services involved in delivering the player experience is critical to quickly pinpointing issues that could impact that experience. “If players can't play the game, that is a big deal to us,” says Hawkins. “We need visibility into how the player experience is impacted by all the components that have to coordinate perfectly to make that happen.”

To address the need for observability, Riot Games developed its own metrics, logs, and alerting capabilities for the first game it developed. This worked well enough until new games entered the development pipeline. “As we started moving closer to the launch of new games, we began to hit the limits of scale with our in-house-built solution,” says Hawkins. “We had to make the decision whether to invest in scaling out our infrastructure or look to others to solve this for us, so we can focus on building experiences for players.”

At the same time, the company was looking to gain a single view of its telemetry data. “We had dozens of tools, but no integrated way to view the data,” says Hawkins. “There were many different ways that you could triage and investigate issues, and data was scattered all over the place. We really wanted to centralize all that data.” To resolve both the scalability and tool sprawl problems, Riot Games decided to deploy an observability platform to replace its own system and other disparate tools.

Deploying a trusted platform

The Riot Developer Experience (RDX) team, which includes Hawkins, gathered requirements from across the company, conducted an extensive evaluation of market leaders and emerging technology, and weighed the cost of further developing its in-house system. After running several proofs of concept, Riot Games chose New Relic as its standard platform for observability, with New Relic log management and New Relic metrics replacing the in-house capabilities. Some of the company’s teams were already using New Relic application performance monitoring (APM) and found it very intuitive, which made the decision to move every team to New Relic that much easier.

“While New Relic scored very high on the proof of concept, it was the familiarity with New Relic and the trust Rioters already had in the New Relic APM product that convinced us,” says Hawkins. With the decision made, his team began the rollout, starting with teams that did not yet have a solution to their observability needs (those working on new games) and then moving on to teams that had been using legacy monitoring tools.

Scaling to improve the workflow

With its in-house-developed system, Riot Games ran a single, shared deployment for the logging capability. “It was a massive platform that worked great at a particular scale,” says Hawkins. “But sometimes a team would run a query that seemed reasonable, like, ‘Show me the logs from the last 24 hours for my service and find error messages.’ If there were 18 terabytes of logging data in that day’s index, everyone else’s query would slow down. The lack of scalability was hurting everyone’s workflow.”

Now with New Relic logs, log data and New Relic compute resources are segmented by team, dramatically improving response time for all queries. “The overall experience is better, because the teams have tooling they're familiar with, but with much better responsiveness,” says Hawkins. “Because the scope of data is segmented by team and because New Relic is built for scale, we’re able to quickly search the detailed logs.” 

Riot Games has also seen significant scalability improvements for its metrics data after switching to New Relic. “We needed our metric store to be available, searchable, and usable all the time, and not have to worry about hitting scalability limits where everything is impacted,” says Hawkins.

Making alerting more meaningful

“Historically, when we establish alerts, teams have to think about which metric is important and establish a baseline for each data center,” says Hawkins. “There's a lot of manual curation to make sure your alerts are actionable and meaningful, and if you get it wrong, you either get no visibility or your team suffers alert fatigue.”

With New Relic alerts, Riot Games can use baseline-centric alerting to help teams distinguish what's healthy in their services from what's not, without curating every threshold by hand. “New Relic can help us discover what's important, based on historical performance,” says Hawkins.
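To make the idea of a baseline derived from historical performance concrete, here is a minimal Python sketch. It is not New Relic's algorithm and does not reflect how Riot Games configures its alerts. It assumes the simplest possible baseline, the mean of past samples plus or minus a few standard deviations, and the function names and sample data are hypothetical.

```python
from statistics import mean, stdev


def baseline_bounds(history, k=3.0):
    """Return (lower, upper) bounds: mean +/- k sample standard deviations.

    Assumption: a plain mean/stdev band stands in for whatever model a
    real observability platform would fit to historical data.
    """
    if len(history) < 2:
        raise ValueError("need at least two historical samples")
    center = mean(history)
    spread = stdev(history)
    return center - k * spread, center + k * spread


def is_anomalous(value, history, k=3.0):
    """Flag a new sample that falls outside the historical band."""
    lower, upper = baseline_bounds(history, k)
    return value < lower or value > upper


# Hypothetical per-minute error counts for one service in one data center.
error_counts = [12, 15, 11, 14, 13, 12, 16, 14]

print(is_anomalous(13, error_counts))  # typical value, inside the band
print(is_anomalous(40, error_counts))  # sharp spike, outside the band
```

Because the band is computed from each service's own history, the same code yields a different threshold for each data center, which is the manual curation step Hawkins describes teams doing by hand.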

Going beyond visibility

Having a single platform enables Riot Games to connect the dots on player experience across multiple microservices. “New Relic provides visibility, not just on your services, but how they interact with other services that you may not own and operate,” says Hawkins. “It's about the dependency chain and getting visibility into that to solve the right problem.”

Beyond the benefits of the New Relic technology, Hawkins sees what he considers to be a hidden value of working with New Relic. “It’s the established relationship we have and the fact that everyone supporting us at New Relic is helpful and open to listening to our feedback,” he says. “There’s a willingness and responsiveness to understand what's important to us as a company, listen to our feedback, and deliver results on it. We've already seen product changes that help improve quality of life and drive adoption.”