Now GA: New Relic Outlier Detection
Automatically identify meaningful deviations and resolve hidden bottlenecks before they impact customers
Modern systems are increasingly complex. Whether you are operating hundreds of Kafka brokers, thousands of containers, or a large fleet of JVM-based services, maintaining consistent performance across environments is difficult. Aggregate dashboards often look healthy, even as a single service, host, or broker quietly degrades. By the time traditional alerts fire, customers may already be affected.
New Relic Outlier Detection is now generally available, giving SRE and DevOps teams a faster, more precise way to identify the “odd one out” and focus attention where it truly matters.
Instead of relying solely on static thresholds or historical baselines, Outlier Detection helps teams quickly spot entities that behave differently from their peers at a given moment, reducing alert noise and accelerating incident response.
The Problem: Why Averages Hide the Truth
In distributed environments, teams expect similar entities to behave similarly. Kafka brokers in the same cluster should process traffic evenly. Cloud nodes running the same workloads should consume comparable resources. JVMs supporting the same service should show similar memory profiles.
But when one entity begins to drift, traditional monitoring approaches often miss the signal.
Undetected drift of this kind, the classic “unknown unknown”, commonly results in:
- Missed early warning signs
- Alert fatigue caused by static thresholds
- Reactive incident response
- Longer Mean Time to Detection (MTTD) and Mean Time to Resolution (MTTR)
For example, one Kafka broker may slowly take on more traffic than its peers. Or a single JVM may begin leaking memory while others remain stable. Because overall averages still look normal, the issue remains hidden until performance degrades or an outage occurs.
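The Kafka case can be made concrete with a few lines of arithmetic. The sketch below uses invented throughput numbers and hypothetical broker names. It shows that when one broker absorbs traffic from its peers, the cluster-wide average can stay exactly the same while that broker runs at nearly twice the average.

```python
from statistics import mean

# Hypothetical per-broker throughput in MB/s; all values are invented for illustration.
baseline = {f"broker-{i}": 50.0 for i in range(1, 11)}

# Same total traffic, but broker-7 has taken load away from the other nine brokers.
drifted = {name: 45.0 for name in baseline}
drifted["broker-7"] = 95.0


def cluster_average(throughput):
    """Average throughput across all brokers, as an aggregate dashboard would show it."""
    return mean(throughput.values())


def busiest_vs_average(throughput):
    """Return the busiest broker and its throughput as a multiple of the cluster average."""
    avg = cluster_average(throughput)
    name = max(throughput, key=throughput.get)
    return name, throughput[name] / avg


print(cluster_average(baseline), cluster_average(drifted))
print(busiest_vs_average(drifted))
```

Both averages come out identical, so a chart of the mean shows nothing unusual. Only a per-broker comparison reveals the imbalance.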
Teams told us they needed better precision, clearer context, and an easier way to identify meaningful deviations within groups of similar entities.
Outlier Detection vs. Anomaly Detection
While they are complementary, Outlier Detection and Anomaly Detection solve different problems.
- Anomaly Detection looks at an individual signal over time and alerts when it deviates from its own historical baseline.
- Outlier Detection compares entities to their peers and alerts when one entity significantly deviates from the group at a specific moment.
This peer-based approach is especially valuable in horizontally scaled and clustered systems, where averages often mask localized failures. By focusing on relative behavior within a group, Outlier Detection surfaces issues that time-based analysis alone can miss.
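To illustrate the distinction, here is a minimal sketch of the two kinds of check. It is not New Relic's implementation. The function names, the z-score and ratio thresholds, and the sample values are all assumptions chosen for the example.

```python
from statistics import mean, median, stdev


def deviates_from_own_history(history, current, z_threshold=3.0):
    """Time-based check: compare the current value with the signal's own past values."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return current != mu
    return abs(current - mu) / sigma > z_threshold


def deviates_from_peers(peer_values, entity, ratio_threshold=1.5):
    """Peer-based check: compare one entity with the median of the other entities
    at the same moment. This simple version only flags values above the peers."""
    others = [v for name, v in peer_values.items() if name != entity]
    peer_median = median(others)
    if peer_median == 0:
        return peer_values[entity] != 0
    return peer_values[entity] / peer_median > ratio_threshold


# Case 1: a JVM that has always used about 900 MB of heap while its peers use about 400 MB.
heap_history = [890, 905, 900, 910, 895]
heap_now = {"jvm-a": 400, "jvm-b": 410, "jvm-c": 395, "jvm-d": 902}
print(deviates_from_own_history(heap_history, 902))  # normal against its own past
print(deviates_from_peers(heap_now, "jvm-d"))        # unusual against its peers

# Case 2: a traffic spike that hits every node in the group equally.
load_history = [100, 102, 98, 101, 99]
load_now = {"node-a": 200, "node-b": 198, "node-c": 203, "node-d": 201}
print(deviates_from_own_history(load_history, 200))  # unusual against its own past
print(deviates_from_peers(load_now, "node-a"))       # normal against its peers
```

In the first case only the peer comparison flags the entity. In the second case only the historical comparison does. That is why the two approaches complement each other.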
How New Relic Outlier Detection Works
New Relic Outlier Detection automatically identifies entities whose behavior deviates significantly from their peers, helping teams quickly focus on the most suspicious components.
Key capabilities include:
Alerts in context
Monitor groups of similar entities, such as Kafka brokers or cloud nodes, and trigger alerts only when one entity meaningfully deviates from the group. This reduces false positives and keeps alerts focused on actionable issues.
Flexible configuration
An intuitive UI allows teams to tune severity levels, evaluation windows, and grouping parameters without complex setup. Teams can align alerts with their operational standards and service-level objectives.
Full NRQL power
Outlier Detection leverages the same queries and dimensions teams already use. This ensures precise alignment with existing observability practices and makes it easy to extend detection to new workloads.
The result is fewer false positives and faster identification of issues that truly matter.
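As a rough mental model of how grouping, an evaluation window, and severity levels fit together, consider the sketch below. It is an assumption-laden illustration, not the product's algorithm. The `OutlierConfig` fields, the `evaluate` function, and the sample data are hypothetical. The scoring uses the modified z-score (median and median absolute deviation), a common robust statistic swapped in here purely for demonstration.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean, median


@dataclass(frozen=True)
class OutlierConfig:
    """Hypothetical tuning knobs; not actual New Relic settings."""
    warning_score: float = 3.5
    critical_score: float = 7.0


def peer_scores(values):
    """Absolute modified z-score of each entity relative to its group."""
    med = median(values.values())
    mad = median(abs(v - med) for v in values.values())
    if mad == 0:
        # More than half the group is identical: any differing entity stands out.
        return {n: (0.0 if v == med else float("inf")) for n, v in values.items()}
    return {n: 0.6745 * abs(v - med) / mad for n, v in values.items()}


def evaluate(samples, config=OutlierConfig()):
    """samples: iterable of (group, entity, value) from one evaluation window."""
    per_entity = defaultdict(list)
    for group, entity, value in samples:
        per_entity[(group, entity)].append(value)

    # Average each entity over the window, then bucket entities by group.
    per_group = defaultdict(dict)
    for (group, entity), vals in per_entity.items():
        per_group[group][entity] = mean(vals)

    findings = {}
    for group, values in per_group.items():
        for entity, score in peer_scores(values).items():
            if score >= config.critical_score:
                findings[(group, entity)] = "critical"
            elif score >= config.warning_score:
                findings[(group, entity)] = "warning"
    return findings


# Invented sample data: two groups with very different normal ranges.
samples = [
    ("cluster-a", "b1", 50), ("cluster-a", "b2", 52), ("cluster-a", "b3", 48),
    ("cluster-a", "b4", 51), ("cluster-a", "b5", 90),
    ("cluster-b", "n1", 10), ("cluster-b", "n2", 11), ("cluster-b", "n3", 9),
    ("cluster-b", "n4", 10), ("cluster-b", "n5", 16),
]
print(evaluate(samples))
```

Because each entity is scored only against its own group, a value of 16 can be flagged in one group while 52 is unremarkable in another. A single static threshold cannot express that.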
Real-World Use Cases
1. Balancing Kafka brokers
Identify brokers handling disproportionate traffic before they cause cluster-wide slowdowns. Teams can rebalance workloads proactively and avoid cascading performance issues.
2. Early detection of JVM memory issues
Spot memory pressure in a single JVM compared to its peers before an out-of-memory crash occurs, enabling faster intervention and improved service stability.
3. The “noisiest neighbor” problem
Quickly find tenants or services consuming excessive resources in shared environments, helping teams rebalance workloads and maintain fair usage.
Business Impact
By helping teams focus on the most meaningful deviations, New Relic Outlier Detection enables a more proactive incident response posture.
Organizations that incorporate automated outlier detection into their observability practices experience:
- Shorter Mean Time to Detection (MTTD)
- Shorter Mean Time to Resolution (MTTR)
- Reduced alert noise
According to the New Relic Observability Report 2024, enterprises that adopt observability best practices experience 19 percent less annual downtime compared to those that do not.
Get Started
Outlier Detection is available as part of the New Relic platform.
- Get started: Go to Alerts → Alert Conditions → New Alert Condition and select Outlier Detection
- Learn more: Explore the Outlier Detection documentation
- Pricing: You only pay for what you use
The views expressed on this blog are those of the author and do not necessarily reflect the views of New Relic. Any solutions offered by the author are environment-specific and not part of the commercial solutions or support offered by New Relic. Please join us exclusively at the Explorers Hub (discuss.newrelic.com) for questions and support related to this blog post. This blog may contain links to content on third-party sites. By providing such links, New Relic does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.