Prometheus is an open source monitoring solution written in Go that collects metrics data and stores that data in a time series database. It was originally built by SoundCloud in 2012 and became part of the Cloud Native Computing Foundation (CNCF) in 2016. It uses PromQL, a powerful query language for querying your time series data.
Prometheus scrapes metrics data from HTTP endpoints and stores that data in a time series database that uses a multidimensional model. Read How to monitor with Prometheus to learn more about the pull model (and how it’s different from the push model), the multidimensional model, and how data is collected.
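To make the scraping step more concrete, here is a minimal Python sketch of the kind of plain-text payload a scraped endpoint returns, with each label set forming one dimension of the metric. The helper names (escape_label_value, render_sample, render_counter) and the sample values are hypothetical and exist only for illustration; in practice you would normally use an official Prometheus client library rather than format this text by hand.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, double quote, and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_sample(name: str, labels: dict, value) -> str:
    """Render one sample line: metric_name{label="value",...} value"""
    if not labels:
        return f"{name} {value}"
    pairs = ",".join(
        f'{key}="{escape_label_value(val)}"' for key, val in sorted(labels.items())
    )
    return f"{name}{{{pairs}}} {value}"


def render_counter(name: str, help_text: str, samples) -> str:
    """Render a counter with HELP and TYPE comment lines followed by its samples.

    samples is a list of (labels, value) tuples; each distinct label set
    becomes its own time series once Prometheus scrapes it.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    lines += [render_sample(name, labels, value) for labels, value in samples]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sample data, not taken from a real service.
    print(render_counter(
        "http_requests_total",
        "Total HTTP requests.",
        [({"method": "GET", "status": "200"}, 1027),
         ({"method": "POST", "status": "500"}, 3)],
    ))
```

A Prometheus server would fetch text like this on every scrape interval, typically from a /metrics path, and store each labeled line as a separate time series.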
What is PromQL?
PromQL, which is short for Prometheus Query Language, is a functional query language that allows you to select and aggregate time series data. It’s a flexible, powerful language that allows you to slice and dice your data however you need.
You can use instant vectors to query data from a single point in time and range vectors to query data over a time range. You can query a basic metric such as http_requests_total and then filter it further with label matchers, which compare key-value pairs (labels) by exact value or by regular expression.
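As a rough illustration of the selectors described above, the following Python sketch builds an instant vector selector, a range vector, and a URL for the Prometheus HTTP API instant-query endpoint (/api/v1/query). The helper names, the label names and values (job, status), and the localhost:9090 address are assumptions chosen for the example, not part of the original article.

```python
from urllib.parse import urlencode


def selector(metric, equals=None, regex=None):
    """Build an instant vector selector such as metric{job="api",status=~"5.."}."""
    matchers = [f'{key}="{val}"' for key, val in (equals or {}).items()]
    matchers += [f'{key}=~"{val}"' for key, val in (regex or {}).items()]
    if not matchers:
        return metric
    joined = ",".join(matchers)
    return f"{metric}{{{joined}}}"


def range_vector(instant_selector, window):
    """Turn an instant vector selector into a range vector, e.g. metric[5m]."""
    return f"{instant_selector}[{window}]"


def instant_query_url(base_url, query):
    """Build a URL for the instant-query endpoint of the Prometheus HTTP API."""
    return f"{base_url}/api/v1/query?{urlencode({'query': query})}"


if __name__ == "__main__":
    # Instant vector: the latest sample of every matching series.
    errors = selector("http_requests_total",
                      equals={"job": "api"}, regex={"status": "5.."})
    print(errors)
    # Range vector: all samples from the last five minutes, fed to rate().
    print(f"rate({range_vector(errors, '5m')})")
    # Assumed local server address; adjust for your environment.
    print(instant_query_url("http://localhost:9090", errors))
```

The =~ matcher applies a regular expression to the label value, so status=~"5.." selects every series whose status label is a three-character value starting with 5.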
You can also query your data directly in New Relic with PromQL-style queries.
When should you use Prometheus?
Prometheus is a highly reliable open source tool that can be used to monitor any part of your application, including microservices. Because it is vendor-neutral and has a rich open source community of developers and contributors, you can use it to monitor the frontend and backend, servers and hardware, and even infrastructure like a service mesh. Many open source tools, such as Istio (service mesh) and CoreDNS (default DNS for Kubernetes), have native Prometheus endpoints. To monitor services that don’t use HTTP endpoints, such as hardware, you can use exporters.
As an open source tool, Prometheus has other major advantages—it’s free, its code is available on GitHub, and its toolkit is readily customizable.
The Prometheus ecosystem includes Alertmanager, which groups and deduplicates alerts and then sends each group of related alerts as a single notification. That means less alert fatigue, and you won’t be flooded with constant alerts during an outage.
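The grouping and deduplication idea can be sketched in a few lines of Python. This is only an illustration of the concept under simplifying assumptions (alerts are plain label dictionaries, and two alerts are duplicates when their label sets are identical); it is not how the alerting component is actually implemented, and the function name and sample alerts are hypothetical.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Deduplicate alerts and bucket them by the values of the group_by labels.

    alerts   -- list of dicts mapping label names to values
    group_by -- label names whose values define one notification group
    Returns a dict mapping each group key to its list of unique alerts.
    """
    groups = defaultdict(dict)
    for labels in alerts:
        group_key = tuple(labels.get(name, "") for name in group_by)
        # An identical label set maps to the same fingerprint, so a repeated
        # alert overwrites itself instead of being counted twice.
        fingerprint = tuple(sorted(labels.items()))
        groups[group_key][fingerprint] = labels
    return {key: list(unique.values()) for key, unique in groups.items()}


if __name__ == "__main__":
    # Hypothetical alerts fired during an outage.
    firing = [
        {"alertname": "HighErrorRate", "cluster": "eu", "instance": "a"},
        {"alertname": "HighErrorRate", "cluster": "eu", "instance": "a"},
        {"alertname": "HighErrorRate", "cluster": "eu", "instance": "b"},
        {"alertname": "DiskFull", "cluster": "us", "instance": "c"},
    ]
    for key, members in group_alerts(firing, ["alertname", "cluster"]).items():
        print(f"1 notification for {key}: {len(members)} alert(s)")
```

In this sketch, four incoming alerts collapse into two notifications: one covering both affected instances in the eu cluster and one for the us cluster.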
Prometheus is also highly reliable because each of its servers is independent, which means that a server can continue functioning even when part of your system is down. This is especially important because you need your monitoring system to remain functional during outages.
When shouldn’t you use Prometheus?
A good engineer knows that it’s not just about using good tools—it’s also about using the right tool for the job. Prometheus is very good at what it does, but it’s not intended to be an all-in-one platform for all of your observability needs. Here are some examples where you’ll benefit from using another tool. Note that even when another tool is a better fit for a use case, you can still use Prometheus alongside it because it's often the right tool for monitoring a service.
- Long-term data storage: Prometheus isn’t intended for durable long-term storage. You can use an observability platform or another storage source for long-term storage. For instance, New Relic provides extended storage for up to 13 months for dimensional metrics.
- When you need 100% accuracy: Prometheus prioritizes reliability over accuracy. According to the CAP theorem, a distributed system can guarantee only two of three properties: consistency (accuracy), availability (reliability), and partition tolerance (data collected on separate servers). Since distributed systems always need partition tolerance, there is a tradeoff between reliability and accuracy. While the tradeoff is fairly small, when you need 100% accuracy (such as with a billing system), you’ll need to use another system.
- Automatic setup for your environment: An observability platform can automatically detect services and instrument them so you get observability in minutes. With Prometheus, you need to configure different services. That includes adding configurations to scrape specific HTTP endpoints and setting up exporters for services that don’t use HTTP endpoints. For large distributed systems, that’s a lot of work.
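To give a sense of the manual configuration mentioned in the last item, here is a small Python sketch that renders a scrape_configs fragment for a Prometheus configuration file. The job name, target addresses, and the render_scrape_config helper are hypothetical; the keys it emits (job_name, metrics_path, scrape_interval, static_configs, targets) are standard Prometheus configuration fields, but treat the output as a starting point under these assumptions rather than a complete configuration.

```python
import json


def render_scrape_config(job_name, targets, metrics_path="/metrics",
                         scrape_interval="15s"):
    """Return a scrape_configs fragment as YAML text for one scrape job."""
    # json.dumps produces double-quoted strings and a flow-style list,
    # both of which are also valid YAML.
    return "\n".join([
        "scrape_configs:",
        f"  - job_name: {json.dumps(job_name)}",
        f"    metrics_path: {json.dumps(metrics_path)}",
        f"    scrape_interval: {json.dumps(scrape_interval)}",
        "    static_configs:",
        f"      - targets: {json.dumps(list(targets))}",
    ]) + "\n"


if __name__ == "__main__":
    # Hypothetical service and hosts; every scraped service needs an entry
    # like this, which is where the manual effort comes from.
    print(render_scrape_config("checkout", ["app-1:8080", "app-2:8080"]))
```

Each additional service or exporter needs its own job entry or target list, so the amount of configuration grows with the size of the system.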
New Relic provides long-term storage and helps you automatically set up your environment, so you can combine New Relic with Prometheus to benefit from both tools.
Monitor Prometheus metrics with New Relic
You can integrate New Relic directly with Prometheus through either the Remote Write or OpenMetrics integrations. The Remote Write option fits a well-established Prometheus infrastructure and gives you quick access to your metrics, whereas the OpenMetrics integration is more useful for visibility across multiple container platform configurations. If you don’t have a New Relic account yet, sign up for the free tier to get started, and then read more about the process of sending Prometheus data to New Relic, querying your data, and setting up alerts.