Achievers hits SLA uptime of 99.95% with service level management

Company Size: 100+ software engineers

Business Challenge

Achievers is an award-winning platform for employee recognition and engagement used by some of the world’s largest and most innovative companies—including McDonald’s, Kellogg’s, Rogers, and GM.

With strong growth over the past two years, Achievers is currently serving more than 2,500 brands across more than 150 countries. But with this growth came new and increased technical demands, with the sudden need to support a global architecture while maintaining quality and reliability. Achievers needed to transition from a system of disparate Kubernetes clusters—with no communication between regions—to a single, unified system across the organization’s global footprint.

Distributed data 

Site reliability engineers (SREs) and developers at Achievers have the complicated task of running a multi-region Kubernetes service mesh. A single region already generates a significant amount of data, and the engineers needed to scale the platform globally and make sense of the data distributed across every region. It was difficult to understand which data they needed and how to use it to solve their problems.

99.95% SLA uptime
5 minutes to go into production
33 deployments per day on average

New Relic helps us be more data-driven by giving us a centralized place to see all of our data. We're able to integrate a lot of tools with New Relic, but we're also able to bring our product teams and engineers a lot closer together.

Service level management shows which services are impacted

Achievers understood the need to catch up to their growing infrastructure and gain complete visibility into the performance of their system. The company turned to New Relic for service level management, making it possible to quickly establish and configure service-level objectives (SLOs) for latency and availability. New Relic also makes it possible for teams to communicate and align on the issues they face throughout their tech stack, regardless of whether they’re deploying in a single region or in every corner of the world.
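To make the idea of latency and availability objectives concrete, here is a minimal sketch of how a service level indicator can be computed and compared against an objective. It is not New Relic's API or Achievers' implementation. The Request type, the sample data, the 300 ms threshold, and the 99% targets are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Request:
    status: int         # HTTP status code (hypothetical sample field)
    duration_ms: float  # response time in milliseconds


def availability_sli(requests: list[Request]) -> float:
    """Fraction of requests that did not fail with a 5xx server error."""
    if not requests:
        return 1.0  # assumption: no traffic counts as fully available
    good = sum(1 for r in requests if r.status < 500)
    return good / len(requests)


def latency_sli(requests: list[Request], threshold_ms: float) -> float:
    """Fraction of requests answered within the latency threshold."""
    if not requests:
        return 1.0  # assumption: no traffic counts as fully compliant
    fast = sum(1 for r in requests if r.duration_ms <= threshold_ms)
    return fast / len(requests)


def meets_slo(sli: float, target: float) -> bool:
    """An objective is met when the indicator is at or above its target."""
    return sli >= target


# Hypothetical sample of five requests
sample = [
    Request(200, 120.0),
    Request(200, 480.0),
    Request(503, 90.0),
    Request(200, 250.0),
    Request(200, 310.0),
]

availability = availability_sli(sample)             # 4 of 5 succeeded
latency = latency_sli(sample, threshold_ms=300.0)   # 3 of 5 within 300 ms

print(f"availability SLI: {availability:.2f}, meets 99% SLO: {meets_slo(availability, 0.99)}")
print(f"latency SLI: {latency:.2f}, meets 99% SLO: {meets_slo(latency, 0.99)}")
```

In this sample both indicators fall short of their targets, which is the kind of signal that tells a team where to focus.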

Operating a global infrastructure requires thoughtful prioritization of tools and services. While some features require accurate latency metrics and careful monitoring, other services can briefly fail without impacting other regions or overall customer experience. Service level management from New Relic made it possible for Achievers to quickly identify and prioritize SLOs based on their importance and requirements. This reduced complexity also makes it easier for Achievers to adjust and fine-tune its SLOs, delivering consistent improvements to overall performance.

Beyond individual performance metrics, SLOs made it possible for individuals and teams at Achievers to work together and understand common goals and challenges. Service level management from New Relic provides users with a simple dashboard to visualize indicators and understand exactly which services are experiencing errors and the resulting impact. This dashboard makes it possible for developers and product owners to align immediately on reliability trends and adjust their work accordingly. 

Service level management provides each engineer at Achievers with a simple workflow to establish baselines of performance and reliability. As a result, every engineer has consistent visibility into reliability throughout the development process. Every developer owns reliability—with the tools and data needed to identify and resolve issues proactively and independently.

Developers are at the heart of every part of our customer experience, and New Relic service level management empowers our developers to write code and proactively build reliability in a secure way.

Maintaining an SLA uptime of 99.95%

Achievers improved reliability tracking with New Relic by standardizing key service level indicators (SLIs) and SLOs for all engineering teams. Achievers engineers could now see all necessary data in a single dashboard instead of jumping between multiple tools—increasing the speed, volume, and reliability of deployments. Achievers now maintains an SLA uptime of 99.95% while entering production in as little as five minutes. Engineers can deploy to four production regions on demand, enabling the widespread service improvements required to support a global customer base.
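For a sense of what a 99.95% uptime target allows, the arithmetic can be sketched as follows. The 30-day and 365-day measurement windows are assumptions for illustration. The source does not state the period over which Achievers measures its SLA.

```python
def allowed_downtime_minutes(uptime_percent: float, period_days: float) -> float:
    """Minutes of downtime permitted by an uptime target over a period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


# Assumed measurement windows (not stated in the source)
monthly = allowed_downtime_minutes(99.95, 30)
yearly = allowed_downtime_minutes(99.95, 365)

print(f"30-day budget: {monthly:.1f} minutes")     # 21.6 minutes
print(f"365-day budget: {yearly / 60:.2f} hours")  # 4.38 hours
```

Under these assumptions, a 99.95% target leaves roughly 21.6 minutes of downtime in a 30-day month.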

Before adopting service level management from New Relic, Achievers engineers deployed just three times a day. Achievers now deploys 33 times per day on average, a remarkable increase of more than 10X. Most importantly, service level management facilitates simple collaboration and communication between engineers and product managers. Achievers can now plan feature releases and understand their impact on KPIs and the customer experience. All these performance improvements add up to a satisfied global customer base and the potential for continued expansion.
