Expand Your Alert Coverage With Recommended Conditions

Published Jul 13, 2021 · 3 min read

Your home almost certainly has smoke detectors to alert you in case of a fire. Your production applications should have alerts too, so you know when something goes wrong. However, unlike with smoke detectors, which come with predefined thresholds, it’s not so simple to set up accurate alert conditions in a production application. You need to figure out what to monitor and you also need to set up thresholds that are sensitive enough to catch anomalies but not so sensitive that you get distracting false alarms.

Good news: New Relic One has the data and tools to help. We have developed an Alert Condition Recommendation service that uses AI and machine learning (ML) to recommend specific metrics and signals to monitor for your specific entities. You can use the provided recommendations or modify them to fit your specific needs.

Using recommended conditions

You can easily add recommended alerts to APM entities that do not currently have alert coverage. Here’s how.

In Services - APM in the left-hand pane of New Relic Navigator, you get a high-density view of the health of your system. The traffic-light colors make it easy to see which entities are healthy, which have violations, and which don’t have any alert coverage. If an entity doesn’t have alert coverage, its hexagon is gray. Recommended conditions help you automatically add alerts to those uncovered entities.

Click an entity that isn’t covered yet (a gray hexagon). A new option will appear on the right-hand side of the screen. Click Create alert condition as shown in the image below.

Select a gray entity in Navigator and click Create alert condition.

You can choose from several recommended conditions. The recommendations depend on the quality of the tags associated with an entity. In New Relic One, each data source is an entity associated with a set of tags that provide metadata on important attributes (such as language and application name). Some tags are added automatically, and you can add custom tags yourself. The more accurate and informative your tags are, the more precise the recommendations will be. Possible recommendations include conditions based on error percentage, Apdex, and response time.

How alert recommendations work

You might be wondering how New Relic One uses machine learning to generate recommended conditions. It all depends on the tags. The machine-learning model is trained on information about the different entities created across New Relic One, along with their associated tags. The model divides the entities into clusters based on tag similarity and then finds the most common conditions defined on the data sources assigned to each cluster. Finally, the model recommends the conditions from the cluster that matches your entity.
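To make the clustering idea concrete, here is a minimal sketch in Python. It is not New Relic’s model: the Jaccard similarity measure, the greedy clustering, the `min_similarity` cutoff, and all tag and condition names are assumptions chosen for illustration only.

```python
from collections import Counter

def jaccard(a: set, b: set) -> float:
    """Tag similarity: shared tags divided by all tags across both sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_by_tags(entities, min_similarity=0.5):
    """Greedy clustering (an assumption): an entity joins the first cluster
    whose first member has sufficiently similar tags, else starts a new one."""
    clusters = []
    for entity in entities:
        for cluster in clusters:
            if jaccard(entity["tags"], cluster[0]["tags"]) >= min_similarity:
                cluster.append(entity)
                break
        else:
            clusters.append([entity])
    return clusters

def recommend_conditions(new_tags, clusters, top_n=2):
    """Return the most common conditions in the cluster closest to new_tags."""
    best = max(clusters, key=lambda c: jaccard(new_tags, c[0]["tags"]))
    counts = Counter(cond for e in best for cond in e["conditions"])
    return [cond for cond, _ in counts.most_common(top_n)]

# Made-up entities: tags plus the alert conditions already defined on them.
entities = [
    {"tags": {"language:java", "type:apm", "env:prod"},
     "conditions": ["error_percentage", "apdex"]},
    {"tags": {"language:java", "type:apm", "env:staging"},
     "conditions": ["error_percentage", "response_time"]},
    {"tags": {"language:python", "type:apm", "env:dev"},
     "conditions": ["response_time"]},
]

clusters = cluster_by_tags(entities)
top = recommend_conditions({"language:java", "type:apm"}, clusters, top_n=1)
print(top)
```

In this toy data set the two Java entities end up in one cluster, so a new Java APM entity with no alerts would be offered the condition most often defined on its neighbors.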

What if you are working with a new entity type and the model doesn’t have enough representative data to give you a recommendation? The model will then make recommendations based on the community-curated golden signals data set. This data set includes the most common conditions across all entities that belong to the same product, regardless of their specific tags.
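The fallback described above can be sketched as follows. This is an illustration under stated assumptions, not the actual service logic: the `min_entities` cutoff for "enough representative data" and the sample condition names are hypothetical.

```python
from collections import Counter

def most_common_conditions(condition_lists, top_n):
    """Count conditions across entities and return the top_n most frequent."""
    counts = Counter(c for conds in condition_lists for c in conds)
    return [c for c, _ in counts.most_common(top_n)]

def recommend_with_fallback(cluster_conditions, product_conditions,
                            min_entities=3, top_n=1):
    """Use the tag cluster when it is large enough (hypothetical cutoff);
    otherwise fall back to conditions common across the whole product."""
    if len(cluster_conditions) >= min_entities:
        return "cluster", most_common_conditions(cluster_conditions, top_n)
    return "golden_signals", most_common_conditions(product_conditions, top_n)

# A new entity type: only one similar entity exists, so the cluster is too small.
cluster_conditions = [["throughput"]]
# Conditions defined on entities of the same product, regardless of tags.
product_conditions = [
    ["error_percentage", "apdex"],
    ["error_percentage"],
    ["response_time", "error_percentage"],
]

source, conditions = recommend_with_fallback(cluster_conditions, product_conditions)
print(source, conditions)
```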

Defining a threshold

The model doesn’t just recommend a metric to monitor; it also recommends thresholds for when violations should be triggered. The condition recommendation model takes the recommended metric and applies dynamic baseline alerts, which set alert thresholds on key application performance metrics automatically using baselines modeled from historical data. Choosing the right threshold is a tradeoff: set it too loosely and you miss real problems, set it too tightly and you get noisy alerts. To ease that tradeoff, the model chooses a default threshold value based on the most common threshold used for other entities with the same metric. You can keep the default recommended condition or set custom thresholds at any time.
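As a rough illustration of these two ideas, the sketch below picks the most common threshold seen for a metric and then checks a new value against a simple historical baseline. It assumes the threshold is expressed in standard deviations and uses a plain mean as the baseline; New Relic’s actual baseline model is more sophisticated, and all numbers here are made up.

```python
from collections import Counter
from statistics import mean, stdev

def default_threshold(thresholds_by_metric, metric):
    """Most common threshold that other entities use for this metric."""
    counts = Counter(thresholds_by_metric[metric])
    return counts.most_common(1)[0][0]

def violates_baseline(history, value, num_std):
    """True if value is more than num_std standard deviations away from
    the historical mean (a simplified stand-in for a modeled baseline)."""
    return abs(value - mean(history)) > num_std * stdev(history)

# Thresholds (in standard deviations) set on other entities, per metric.
observed = {"response_time": [3.0, 2.0, 3.0, 3.0, 2.5]}
threshold = default_threshold(observed, "response_time")

# Recent response times in milliseconds for the entity being covered.
history = [100, 102, 98, 101, 99]
print(threshold, violates_baseline(history, 110, threshold))
```

Raising `num_std` makes the condition less sensitive, which is the same knob you turn when you replace the default with a custom threshold.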

Add recommended conditions today

The alert recommendation service is designed to increase alert coverage in a fast and simple way. Maintaining a high level of alert coverage is crucial for avoiding blind spots and viewing your system’s health. Just as you should have working smoke detectors throughout your home, you should create alerts for a well-monitored system. To learn more about AI, take a look at our infographic on how AI is driving digital transformation.

Next steps

If you’re not a New Relic customer yet, sign up for a forever free account. To learn more, also take a look at New Relic Infrastructure.

Aya Bellicha

Aya Bellicha is a Senior Data Scientist at New Relic’s Applied Intelligence Research team.

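The views expressed on this blog are those of the author and do not necessarily reflect the views of New Relic. Any solutions offered by the author are environment-specific and not part of the commercial solutions or support offered by New Relic. For questions and support related to this blog post, please engage only on the Explorers Hub (discuss.newrelic.com). This blog may contain links to content on third-party sites. By providing such links, New Relic does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.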
