
Nerdlog Roundup: Cut MTTR and Reduce Alert Fatigue with AIOps


If you’ve worked in Ops for any period of time, you’ve seen services fail that you wish had sent you an alert when they started acting up. With New Relic’s anomaly detection, you can use the power of AIOps to warn you when a service looks like it’s in trouble.

Cut MTTR and proactively detect anomalies

Our latest Nerdlog focuses on New Relic Applied Intelligence’s Proactive Detection. This feature adds context clues to incidents by flagging patterns that look concerning, and it uses artificial intelligence (AI) and machine learning (ML) to notice when changes in performance may indicate a possible problem. Proactive Detection doesn’t replace alerting, particularly the alert thresholds you’ve configured ahead of time to generate warnings. Rather, it warns you about issues that you didn’t predict.
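To make the difference between a preconfigured threshold and anomaly detection concrete, here is a minimal Python sketch. It is not how Proactive Detection works internally (the post doesn’t describe the models involved); it simply swaps in a rolling z-score as a stand-in for “notice when performance changes.” The function names, the window size, and the sample latencies are all assumptions made up for illustration.

```python
from statistics import mean, stdev


def static_threshold_alerts(values, threshold):
    """Return indices where a value exceeds a threshold chosen ahead of time."""
    return [i for i, v in enumerate(values) if v > threshold]


def baseline_anomalies(values, window=5, z_limit=3.0):
    """Return indices where a value deviates strongly from the preceding window.

    Hypothetical stand-in for anomaly detection: a rolling z-score,
    not New Relic's actual algorithm.
    """
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            # A perfectly flat baseline: any change at all is a deviation.
            if values[i] != mu:
                flagged.append(i)
            continue
        if abs(values[i] - mu) / sigma > z_limit:
            flagged.append(i)
    return flagged


# Made-up response times in milliseconds, with one spike at index 7.
latencies_ms = [100, 102, 99, 101, 100, 103, 98, 180, 101, 100]

# A threshold set for a "real" outage (500 ms) never fires on the spike...
threshold_hits = static_threshold_alerts(latencies_ms, threshold=500)

# ...while the baseline comparison flags it as unusual for this service.
anomaly_hits = baseline_anomalies(latencies_ms)
```

In this toy data the 500 ms threshold stays silent, while the baseline check flags the jump to 180 ms. That is the kind of “issue you didn’t predict” the feature is meant to surface alongside your existing alerts.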

Senior Product Manager Devin Cheevers gives us an overview of where you can view your anomalies in New Relic Applied Intelligence. Get started and check out the Applied Intelligence anomalies feed, which is now available to all users.

Find problems fast with root cause analysis

We know that it can be hard to understand why, when, and where issues occur in your stack. Sifting through graphs to find correlations can take you away from working on things that matter to you and your business.

Now, New Relic One will analyze issues and propose a root cause automatically on an issue page in Alerts & AI. Instead of looking at a single event, you’ll see a grouping of events and even possible attributes to investigate.

Will any of these components replace human Operations Engineers? Never, but root cause analysis will give those humans clues. The “attributes to investigate” section looks at all attributes, including user-added attributes, and surfaces indicators that will make the most sense to someone familiar with your case.

Devin and Principal Product Manager Nate Heinrich discuss why these features matter and how enhanced root cause analysis can save us all time when the pager rings at 2 a.m. Check out the docs to learn more about how to get hands-on.

Increase context for issues using topology correlation

Finally, we want to provide you with the right context so that you can prioritize problems accurately and efficiently.

Topology (relationship-based) correlation uses multiple systems to discover how anomalies may be connected, and it increases both the quality of your event correlations and the speed at which they are found. Previously, our correlation engine worked off two categories:

  • Time—when two anomalies are happening very close in time
  • Context—multiple anomalies have very similar metadata (see our past episode on how you can configure the correlation engine to connect metadata)

We’re now adding a third component to correlations:

  • Topology—when two entities are part of the same service
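The three categories above can be sketched as a toy pairwise check in Python. This is an illustration only, not the correlation engine’s implementation: the `Anomaly` class, the `correlation_reasons` function, the five-minute gap, the shared-tag rule, and the entity names are all hypothetical, and topology is modeled here as a set of entity pairs you have declared as connected.

```python
from dataclasses import dataclass, field


@dataclass
class Anomaly:
    entity: str
    timestamp: float  # seconds since some reference point
    tags: dict = field(default_factory=dict)


def correlation_reasons(a, b, edges, max_gap_s=300, min_shared_tags=2):
    """List which of the three categories link anomalies a and b.

    All thresholds are made-up defaults for this sketch.
    """
    reasons = []

    # Time: the two anomalies happened very close together.
    if abs(a.timestamp - b.timestamp) <= max_gap_s:
        reasons.append("time")

    # Context: the two anomalies carry very similar metadata.
    shared = sum(1 for k, v in a.tags.items() if k in b.tags and b.tags[k] == v)
    if shared >= min_shared_tags:
        reasons.append("context")

    # Topology: the two entities have a declared connection.
    if frozenset((a.entity, b.entity)) in edges:
        reasons.append("topology")

    return reasons


# Hypothetical connection declared between two entities in the stack.
edges = {frozenset(("checkout-api", "payments-db"))}

api = Anomaly("checkout-api", 1000.0, {"env": "prod", "region": "us-east"})
db = Anomaly("payments-db", 1090.0, {"env": "prod", "region": "us-west"})
search = Anomaly("search-api", 9000.0, {"env": "staging"})

linked = correlation_reasons(api, db, edges)        # close in time and connected
unlinked = correlation_reasons(api, search, edges)  # nothing in common
```

Here the two anomalies on `checkout-api` and `payments-db` share too little metadata to match on context, but the declared connection still ties them together. That extra signal is what the topology category contributes.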

You can configure topology connections through the API using the NerdGraph API explorer, which lets you tell our correlation engine about connections that exist within your stack.

In Nate’s segment, you’ll learn how and why correlations are so critical to reducing alert noise and fatigue and see him demo how to create these connections via the API.

Get weekly updates about the latest features and releases from the people who built them. Join the Nerdlog discussion every Thursday at 12 p.m. PT on Twitch or follow along in What's New. If you're not a New Relic customer, sign up for your free account today.