Engineers need to support hundreds of applications while delivering new features and functionality, but they get tied up in identifying and troubleshooting software issues. According to our customers, some of their biggest application challenges include maintaining distributed systems and dependencies, ensuring a good user experience, and moving beyond simple software performance. Here’s how and why our customers use New Relic and observability to improve application performance monitoring (APM):
Managing distributed systems, tools, and dependencies
Prior to New Relic, sports streaming service DAZN had one tool for client-side app telemetry and one for logs, in addition to CloudWatch dashboards and hundreds of Amazon Web Services (AWS) accounts. Teams could observe their services and alert when they weren’t performing, but they worked in isolation. According to Pete Tanton, principal site reliability engineer at DAZN, a tiny alert for one team could cause a massive incident for another later on. During incidents, it was a challenge to correlate data because teams had to jump between different dashboards. From an administrative perspective, the approach to managing users and passwords wasn’t working: DAZN had a highly trained SRE team handling joiners, movers, and leavers (JML) for three or four different tools. Pushing all of that into New Relic allowed DAZN to see and query all CloudWatch metrics and client-side telemetry data in one dashboard. New Relic dashboards pull data from everywhere, even third-party sources, and are easily customizable.
Bookmaker William Hill also had a number of monitoring tools, including one for infrastructure and one for APM. “It was absolute chaos,” says Stephen Wild, engineering manager for observability and automation at William Hill. The solutions couldn't cope with the amount of data coming in from the 18,000 nodes, not to mention the extra containers that went through the cloud, says Stephen. This resulted in overnight callouts for teams. Since deploying New Relic, mean time to resolve (MTTR) has improved by more than 80%. In terms of reliability, “it's 100%. There's been absolutely no downtime,” says Stephen.
When troubleshooting an issue, speed is of the essence, says Kristian Lee, head of DevOps engineering at sports technology company Sportradar. New Relic brings the relevant information together, with logging, application monitoring, and infrastructure all in a single pane. With everything in one place, teams can identify a misconfiguration, see where there’s too much load, find what's broken, and then solve any issues quickly, says Kristian.
Actionable insights on issues, before customers are impacted
Agritech innovator IGS has a tech stack based on microservices and multiple environments, with lots of different systems generating logs. IGS uses logs in context to see the root cause of problems across systems in one view. According to IGS, that capability has been a big part of its recent success. Being able to quickly identify what’s happened and where it’s happened in the code has reduced MTTR, according to Owen Adams, head of platform engineering at IGS.
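To make the idea of logs in context more concrete, here is a minimal Python sketch of the general technique: log records from several services carry a shared trace ID, so they can be grouped into one view per request. It does not use or represent the New Relic agent or API; the record fields (`service`, `trace_id`, `level`, `message`), the sample data, and the helper names are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical log records from three services. Field names and values
# are illustrative assumptions, not a New Relic data format.
LOGS = [
    {"service": "gateway", "trace_id": "t-1", "level": "INFO", "message": "request received"},
    {"service": "orders", "trace_id": "t-1", "level": "INFO", "message": "order lookup started"},
    {"service": "gateway", "trace_id": "t-2", "level": "INFO", "message": "request received"},
    {"service": "inventory", "trace_id": "t-1", "level": "ERROR", "message": "stock query timed out"},
    {"service": "orders", "trace_id": "t-2", "level": "INFO", "message": "order lookup finished"},
]


def group_by_trace(logs):
    """Group log records by trace ID, keeping their original order."""
    grouped = defaultdict(list)
    for record in logs:
        grouped[record["trace_id"]].append(record)
    return dict(grouped)


def failing_traces(grouped):
    """Map each trace ID that has an ERROR record to the services that logged it."""
    result = {}
    for trace_id, records in grouped.items():
        services = [r["service"] for r in records if r["level"] == "ERROR"]
        if services:
            result[trace_id] = services
    return result


grouped = group_by_trace(LOGS)
errors = failing_traces(grouped)

# Print every log line for each failing request, across all services.
for trace_id, services in errors.items():
    print(f"trace {trace_id} failed in: {', '.join(services)}")
    for record in grouped[trace_id]:
        print(f"  [{record['service']}] {record['level']}: {record['message']}")
```

Grouping on a shared identifier is what lets an engineer read one request's path across systems in a single view instead of searching each service's logs separately.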
As IGS brings more data into its system, it’s critical to pinpoint where an issue is when one occurs. With infrastructure monitoring and APM, IGS can visualize all infrastructure data. “Now we focus on areas which actually deliver value to our customers or to the development team as opposed to needing to keep people assigned to work on the monitoring observability platform,” says Owen.
Improving workflows and user experience
It’s common for companies to use different vendors for real user monitoring (RUM), browser monitoring, and distributed tracing, often in combination with open source components. But managing multiple tools and systems takes a mental toll on teams. Engineer hours are spent trying to gain visibility across the system, in addition to dealing with administrative costs such as adding users and managing passwords. When there’s an issue, troubleshooting includes cross-referencing different tools, data sources, and measurements.
New Relic powers much of DAZN’s alert intelligence, says Pete. Applied intelligence proactively notifies DAZN teams about potential problems by identifying anomalous behavior, correlating relevant incidents, and providing a root cause analysis. New Relic also lowers the amount of time it takes to deploy services by having logs, metrics, APM, client-side telemetry, and synthetics all in one place. Instead of dedicating people to monitoring, IGS can focus on delivering hard value for the business and its customers. Instrumentation was also easy for IGS: the team added the New Relic APM agent into a container. That covered 90% of the estate, and the entire estate was instrumented in a day and a half.
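As a rough illustration of what identifying anomalous behavior can mean in practice, the sketch below flags a metric sample that deviates sharply from a rolling baseline using a simple z-score. This is a generic textbook technique chosen for illustration, not a description of how New Relic applied intelligence works; the function name, window size, threshold, and latency values are all assumptions.

```python
from statistics import mean, stdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Return indexes of samples far from the mean of the preceding window.

    A sample is flagged when it lies more than `threshold` standard
    deviations from the mean of the previous `window` samples.
    """
    anomalies = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Perfectly flat baseline: any change at all is unusual.
            if samples[i] != mu:
                anomalies.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical response times in milliseconds, with one obvious spike.
latencies_ms = [100, 102, 98, 101, 99, 100, 103, 250, 101, 99]
spikes = find_anomalies(latencies_ms)
print(spikes)
```

A real alerting system would also need to handle seasonality, missing data, and correlation across signals, which this sketch leaves out.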
Rightsizing for cost savings
Prior to adopting New Relic, IGS was spending between £20,000 and £24,000 a month on logging and monitoring. That figure has been cut by more than half, freeing up a lot of resources. If IGS does have a problem, New Relic gives IGS the confidence to change many things at once. That allows IGS to take more risks and innovate faster with a live product, says Dave Scott, founder and CTO at IGS.
- Observe the performance of your application by installing one of our agents—it takes minutes.
- Monitor the performance and health of all your services in one place.
- Pair New Relic with OpenTelemetry or other open source tools.