Harnessing the power of open source Prometheus metrics in New Relic


Prometheus is a powerful, open source tool for real-time monitoring of applications, microservices, and networks. It’s especially useful for monitoring Kubernetes clusters. Originally developed by SoundCloud in 2012, it’s now part of the Cloud Native Computing Foundation.

 In this post, you’ll learn how to do the following:

  • Integrate Prometheus with New Relic One.
  • Use PromQL-style queries to query Prometheus metrics.
  • Set up alerts on Prometheus metrics.

Before we get started, let’s take a look at what Prometheus is and why it’s so useful. If you are already familiar with Prometheus, you can skip to the next section.

What is Prometheus?

Maybe you’ve heard the myth of Prometheus before. Prometheus offered fire to man and was punished by the gods as a result. SoundCloud gave this monitoring tool to the open source community, and while it may not be on the same level as fire, it’s still a hot commodity for monitoring applications. Fortunately, there is no divine punishment involved for using Prometheus.

There are many other monitoring tools out there, so what makes Prometheus stand out? There are three main advantages:

  • Dimensional data model: Prometheus is specifically designed to record time series data. This kind of data is collected over time and plotted as a series of data points. Data in Prometheus consists of streams of timestamped values. These streams are very useful for tracking real-time metrics, which is exactly what you need when you are monitoring application and network performance.
  • Powerful query language: Data is only useful if you can analyze it. Prometheus offers PromQL, a simple yet powerful query language that can help you drill down into your metrics.
  • Reliability: Servers are standalone and remain functional even when other parts of a system are down.
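To make the dimensional data model concrete, here is a minimal Python sketch of labeled, timestamped samples. It is a simplified illustration, not how Prometheus stores data internally. The `Sample` class, the `select` helper, and all values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One timestamped value in a stream identified by a metric name plus labels."""
    metric: str
    labels: tuple      # sorted (name, value) pairs, e.g. (("status", "200"),)
    timestamp: float   # seconds since the epoch
    value: float


def make_sample(metric, labels, timestamp, value):
    # Sort the label pairs so the same label set always compares equal.
    return Sample(metric, tuple(sorted(labels.items())), timestamp, value)


def select(samples, metric, **matchers):
    """Return samples of `metric` whose labels equal every given matcher."""
    wanted = set(matchers.items())
    return [s for s in samples if s.metric == metric and wanted <= set(s.labels)]


# Hypothetical samples: two streams of the same metric, split by status label.
samples = [
    make_sample("http_requests_total", {"method": "GET", "status": "200"}, 1000.0, 41.0),
    make_sample("http_requests_total", {"method": "GET", "status": "500"}, 1000.0, 3.0),
    make_sample("http_requests_total", {"method": "GET", "status": "200"}, 1015.0, 57.0),
]

ok = select(samples, "http_requests_total", status="200")
print([(s.timestamp, s.value) for s in ok])
```

Each distinct combination of metric name and labels forms its own stream, which is what lets a query slice the same metric by status, method, or any other dimension.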

Why should you integrate Prometheus with New Relic One?

Prometheus is an amazing tool for monitoring real-time metrics. However, it can be challenging to scale Prometheus and set up long-term storage and role-based access control (RBAC). Fortunately, New Relic offers long-term durable storage for thirteen months for all of your metrics as well as RBAC and the ability to create custom user groups and levels of access.

You can easily import your Prometheus metrics into New Relic, helping you reduce overhead for your teams and increase application uptime while also improving the observability of your apps and systems.

You gain other benefits, too:

  • Dashboards for visualizing Prometheus metrics, including the ability to create custom visualizations as needed.
  • PromQL-style query mode so you can use Prometheus’ query language for your imported metrics.
  • Robust alert options so you can alert on your Prometheus metrics.

How to integrate Prometheus with New Relic One

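There are two ways to integrate Prometheus with New Relic One: the remote write integration and the OpenMetrics integration.

The easier, recommended way is the remote write integration, which is covered in the Nerd Bytes video below.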

Learn how to send data from Prometheus to New Relic.

The steps are also included below and in Send Prometheus metric data to New Relic.

  1. From the Explorer screen, select Add more data in the upper right corner. You can also add more data from any New Relic screen by selecting the blue option button in the upper right corner and selecting Add more data from the dropdown.
  1. From the Add your data screen, search for Prometheus or scroll to the Open source monitoring systems and select Prometheus.
  1. Select the account you want to send metrics to and then select Continue.
  1. On the Integrate your Prometheus data screen, specify a label for your Prometheus server in the Add a name to identify your data source field. The name you provide will be an additional attribute included on all of your Prometheus metrics. After you input the label, select Generate url. Copy the provided url and bearer_token. Keep the Integrate your Prometheus data screen open. Select See your data at the bottom of the screen once you’ve updated your Prometheus configuration file.
  1. Open your prometheus.yml file and add the following as a top-level section, at the same level of indentation as global:
remote_write:
- url: https://metric-api.newrelic.com/prometheus/v1/write?prometheus_server=prometheus-data  
  bearer_token: <license_key>

Make sure to replace prometheus-data with the name of your data source.

  1. Once you’ve updated your prometheus.yml file, restart your Prometheus server. Select See your data from the Integrate your Prometheus data screen and get instant visibility into your Prometheus metrics.
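If you manage several Prometheus servers, you may prefer to generate the remote_write section rather than edit it by hand. The following Python sketch builds the same snippet shown in the steps above. The `remote_write_block` function name is hypothetical, and the sketch assumes the data source name should be URL-encoded when it contains characters such as spaces.

```python
from urllib.parse import urlencode

# Endpoint taken from the configuration snippet in the steps above.
WRITE_ENDPOINT = "https://metric-api.newrelic.com/prometheus/v1/write"


def remote_write_block(data_source_name, license_key):
    """Render a remote_write section for prometheus.yml as a string."""
    # urlencode escapes characters that are not safe in a query string.
    query = urlencode({"prometheus_server": data_source_name})
    return (
        "remote_write:\n"
        f"- url: {WRITE_ENDPOINT}?{query}\n"
        f"  bearer_token: {license_key}\n"
    )


# The placeholder key is kept as-is; substitute the bearer_token you copied.
print(remote_write_block("prometheus-data", "<license_key>"))
```

Paste the output at the top level of prometheus.yml, then restart the server as described in the final step.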

Now you’re ready to query and visualize Prometheus metrics directly in New Relic.

How to query Prometheus metrics in New Relic One with PromQL

You can query Prometheus metrics in New Relic One using either NRQL (New Relic Query Language) or PromQL-style queries.

From the Browse data screen in New Relic One, select the Query builder tab. 

Queries are NRQL by default, but you can select PromQL-style from the upper right corner to switch into PromQL-style queries.

Then you can input a query in PromQL. For instance, http_requests_total{status=~"5.."}[10m] is a PromQL query that looks at the total number of HTTP requests over a ten-minute period from all Prometheus data with an HTTP status code that starts with 5, which catches every 5xx server error status code.
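The =~ operator in a PromQL label matcher is a regular expression match that must cover the entire label value. The Python sketch below imitates that behavior with `re.fullmatch`. The `matches_label` helper and the sample status values are hypothetical, and Python's regex dialect differs from the RE2 syntax Prometheus uses, so treat it as an approximation for simple patterns only.

```python
import re


def matches_label(value, pattern):
    """Mimic a PromQL =~ matcher: the regex must match the whole label value."""
    return re.fullmatch(pattern, value) is not None


# Hypothetical status label values.
statuses = ["200", "404", "500", "503", "5000"]

# "5.." means a 5 followed by exactly two more characters.
server_errors = [s for s in statuses if matches_label(s, "5..")]
print(server_errors)  # ['500', '503']
```

Because the match is anchored at both ends, a value such as 5000 is excluded even though it starts with 5.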

You can then run the query by selecting the Run button. PromQL-style queries are translated automatically into NRQL approximations. The translation process works for 99.5% of all queries. New Relic One then creates a visualization based on the query.

To see the NRQL approximation, select NRQL in the upper right corner of the query. Note that you need to run the PromQL-style query before you can translate it. Otherwise, when you switch over to NRQL, you see a blank field. The translation process works both ways. You can translate NRQL queries into PromQL-style queries as well.

So that’s an example query that can help you determine how many HTTP requests result in 5xx server errors. But what about other metrics you can query?

New Relic automatically provides a few default attributes for both the OpenMetrics and remote write integrations, but it is much more useful to get the metric names with a query.

To learn what metrics you can query, enter the following:

FROM Metric SELECT uniques(metricName) WHERE instrumentation.provider='prometheus' AND instrumentation.name='remote-write'

The next image shows a few of the available metrics from the OpenMetrics integration. Use the previous query to check out metrics for the remote write integration.

For more information on querying Prometheus metrics in New Relic, see the next video.

Learn how to use PromQL-style queries in New Relic.

How to set up alerts on Prometheus metrics in New Relic One

You can set up alerts directly in Prometheus through Alertmanager or you can use New Relic to manage alerts on PromQL-style queries. With New Relic One, you have tremendous flexibility to define what you want to monitor, how it should be monitored, and when to generate incidents. New Relic also offers customizable notification channels so you can send notifications wherever your teams work and collaborate. Check out Introduction to Alerts to learn more about alerting in New Relic One.

The following video walks through adding an alert on a PromQL-style query. You can also read through the steps after the video and follow along to create your own alerts.

Learn how to set up an alert on a Prometheus metric in New Relic One.

You can set up an alert on any NRQL query. If you are writing PromQL-style queries in New Relic’s query builder, you’ll need to translate them into NRQL. To do so, run a PromQL-style query in the query builder and then switch over to NRQL to see a translation.

Once you have your NRQL query in hand, follow these steps to add an alert.

  1. From the main New Relic One dashboard, select Alerts & AI from the navbar.
  1. If you haven’t set up an alert before, you need to set up your first policy. You can do so by selecting Policies in the left pane. Then select + New alert policy.
  1. Select a descriptive name for the policy and then select the incident preference. For this tutorial, select By condition. The docs have more detailed information about different incident preferences. If you already have notification channels set up, you can select which channels should receive notifications on your policy, but you can also set this up later. When you are ready, select the Create alert policy button.
  1. Your policy won’t have any conditions yet, so select Create a condition.
  1. Now you can set up your rule. Select NRQL, then select Next, define thresholds.
  1. Next, enter a name for your condition and define your signal. In this case, that means entering your NRQL query. In the last section, we used http_requests_total{status=~"5.."}[10m] as an example PromQL-style query.

    Here’s the same query translated into NRQL:

    SELECT latest(http_requests_total) FROM Metric WHERE (status RLIKE '5..') SINCE 60 MINUTES AGO UNTIL NOW FACET dimensions() LIMIT 100 TIMESERIES 600000 SLIDE BY 30000

    This query has some added clauses that can’t be used in an alert, including SINCE, UNTIL, LIMIT, TIMESERIES, and SLIDE BY. Clauses like SINCE and UNTIL aren’t needed because alerting creates a continuous stream of results. TIMESERIES isn’t necessary because alerts use an aggregation window. SLIDE BY is used in conjunction with TIMESERIES, so you don’t need that, either. If you enter a query, the threshold preview will show you if the query includes any clauses that aren’t allowed in an alert. The next image shows an example.

    After removing disallowed clauses, the query looks like this:

    SELECT latest(http_requests_total) FROM Metric WHERE (status RLIKE '5..') FACET dimensions()

  1. Next, set a threshold for alerts. In the case of total HTTP requests, you might set a threshold if there are too many requests, meaning the API might be overburdened. 

    There are several types of thresholds you can establish: static, baseline, and outlier. With a static threshold, you set thresholds manually. Baseline and outlier thresholds use algorithms to dynamically determine when alerts should be sent based on deviations from the mean that you set.

    The next image shows threshold options from the New Relic One UI. With static thresholds, you can open a violation when the query returns a value (as shown in the image) or when the sum of the query results in a value. The image shows a static threshold where a notification is sent if the query returns more than one hundred 5xx status codes for at least five minutes.

  1. That’s all you need to set up a condition: a name, a NRQL query, and condition thresholds. You can then select the Save condition button to save the alert.
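If you translate many PromQL-style queries for alerting, the clause cleanup from the steps above can be sketched in Python. This is a naive, hypothetical helper, not a New Relic tool. It splits the query at a fixed list of clause keywords and drops the clauses the walkthrough removes. It assumes the keywords never appear inside string literals or attribute names.

```python
import re

# Keywords that start a clause in the translated example query.
CLAUSE_START = re.compile(
    r"\b(SELECT|FROM|WHERE|FACET|SINCE|UNTIL|LIMIT|TIMESERIES|SLIDE BY)\b",
    re.IGNORECASE,
)
# Clauses the walkthrough removes before using the query in an alert condition.
DISALLOWED = {"SINCE", "UNTIL", "LIMIT", "TIMESERIES", "SLIDE BY"}


def strip_for_alert(nrql):
    """Drop disallowed clauses from an NRQL query string (naive keyword split)."""
    starts = [m.start() for m in CLAUSE_START.finditer(nrql)]
    bounds = starts + [len(nrql)]
    kept = []
    for begin, end in zip(bounds, bounds[1:]):
        clause = nrql[begin:end].strip()
        keyword = CLAUSE_START.match(clause).group(1).upper()
        if keyword not in DISALLOWED:
            kept.append(clause)
    return " ".join(kept)


translated = (
    "SELECT latest(http_requests_total) FROM Metric WHERE (status RLIKE '5..') "
    "SINCE 60 MINUTES AGO UNTIL NOW FACET dimensions() LIMIT 100 "
    "TIMESERIES 600000 SLIDE BY 30000"
)
print(strip_for_alert(translated))
```

For the example query, the result matches the cleaned-up query shown in the steps. The threshold preview in the UI remains the authoritative check for which clauses are allowed.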

There are a number of other settings you can change on an alert, and you can read more about them in Creating NRQL alert conditions.
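To reason about a static threshold such as "more than one hundred for at least five minutes", it can help to model it in a few lines of Python. The sketch below assumes one query value per one-minute aggregation window and opens a violation once five consecutive windows exceed the threshold. The function name, window size, and sample values are all hypothetical, and New Relic's actual evaluation has more settings than this model covers.

```python
def violation_opens(values, threshold=100, duration_windows=5):
    """Return the index of the window at which a violation would open, or None."""
    streak = 0
    for i, value in enumerate(values):
        # A window at or below the threshold resets the consecutive count.
        streak = streak + 1 if value > threshold else 0
        if streak >= duration_windows:
            return i
    return None


# Hypothetical per-minute counts of 5xx responses.
per_minute = [90, 120, 130, 95, 110, 115, 140, 150, 160, 80]
print(violation_opens(per_minute))  # 8
```

In this sample, the dip to 95 in the fourth window resets the count, so the violation opens only after the run of five high windows that ends at index 8.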

Conclusion

In this post, you’ve learned how to use a remote write integration to send your Prometheus metrics to New Relic One. And just as Prometheus from the Greek myth harnessed the power of fire and shared it with others, you can use New Relic One to harness metrics from Prometheus, write PromQL-style queries, visualize your data, and send alerts to your teams.