Prometheus is an open source monitoring solution written in Go that collects metrics data and stores that data in a time series database. It was originally built by SoundCloud in 2012 and became part of the Cloud Native Computing Foundation (CNCF) in 2016. It uses PromQL, a powerful query language for querying your time series data.

Prometheus scrapes (pulls) metrics data from HTTP endpoints and then stores that data in a time series database that uses a multidimensional model, where each series is identified by a metric name and a set of key-value labels. Read How to monitor with Prometheus to learn more about the pull model (and how it’s different from the push model), the multidimensional model, and how data is collected.
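To make the multidimensional model concrete, here is a minimal Python sketch that splits one line of scraped output into a metric name, a set of labels, and a value. It assumes the plain-text exposition format that Prometheus endpoints commonly serve; parse_sample is a hypothetical helper written for this illustration, not part of Prometheus, and it deliberately ignores comments, timestamps, and escape sequences.

```python
import re

# One sample line looks like:
#   http_requests_total{method="get",code="200"} 1027
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Split one sample line into (metric name, labels dict, float value).

    Simplified on purpose: escape sequences inside label values are
    kept as-is and any trailing timestamp is ignored.
    """
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a sample line: {line!r}")
    name, raw_labels, raw_value = match.groups()
    labels = dict(LABEL_RE.findall(raw_labels or ""))
    return name, labels, float(raw_value)


name, labels, value = parse_sample('http_requests_total{method="get",code="200"} 1027')
print(name, labels, value)
```

Each distinct combination of metric name and label values is its own time series, which is what lets you filter and aggregate along any label later.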

A New Relic dashboard showing metrics from Prometheus

What is PromQL?

PromQL, which is short for Prometheus Query Language, is a functional query language that allows you to select and aggregate time series data. It’s a flexible, powerful language that allows you to slice and dice your data however you need.

PromQL use cases

You can use an instant vector selector to query the value of each time series at a single point in time, and a range vector selector to query values over a time range. You can query a basic metric such as http_requests_total and then filter it further with label matchers, which are key-value pairs that support both exact and regular expression matches.
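As a rough illustration of the two selector types, the Python sketch below builds request URLs for the Prometheus HTTP API, which exposes /api/v1/query for instant queries and /api/v1/query_range for range queries. The server address and the job and status labels are assumptions made for this example; your metrics may use different labels.

```python
from urllib.parse import urlencode

# Assumption: a Prometheus server running locally on its default port.
PROMETHEUS_URL = "http://localhost:9090"


def instant_query_url(expr, base=PROMETHEUS_URL):
    """URL that evaluates a PromQL expression at a single point in time."""
    return f"{base}/api/v1/query?{urlencode({'query': expr})}"


def range_query_url(expr, start, end, step, base=PROMETHEUS_URL):
    """URL that evaluates a PromQL expression at every step between start and end."""
    params = {"query": expr, "start": start, "end": end, "step": step}
    return f"{base}/api/v1/query_range?{urlencode(params)}"


# Instant vector selector: one exact matcher (=) and one regex matcher (=~).
# The "job" and "status" labels are hypothetical.
instant_expr = 'http_requests_total{job="api", status=~"5.."}'

# Range vector selector: [5m] selects the last five minutes of samples,
# and rate() turns them into a per-second rate.
rate_expr = 'rate(http_requests_total{job="api"}[5m])'

print(instant_query_url(instant_expr))
print(range_query_url(rate_expr, 1700000000, 1700003600, "60s"))
```

The same expressions can be pasted directly into the Prometheus expression browser; the URL helpers only show how they travel over HTTP.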

You can also query your data directly in New Relic with PromQL-style queries.

When to use Prometheus?

Prometheus is a highly reliable open source tool that can be used to monitor any part of your application, including microservices. Because it is vendor-neutral and has a rich open-source community of developers and contributors, you can use it to monitor almost your entire application, including the frontend and backend, servers and hardware, and even infrastructure like a service mesh.

Many open-source tools, such as Istio (service mesh) and CoreDNS (default DNS for Kubernetes) have native Prometheus endpoints. To monitor services that don’t use HTTP endpoints, such as hardware, you can use exporters.
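To show roughly what an exporter does, here is a minimal Python sketch that reads values from a source with no HTTP endpoint and republishes them at /metrics in the text format Prometheus scrapes. Everything named here is an assumption for illustration: read_sensors is a hypothetical stand-in for talking to real hardware, demo_temperature_celsius is an invented metric name, and port 9200 is arbitrary. Production exporters are normally built with an official client library rather than by hand.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def read_sensors():
    """Hypothetical stand-in for querying a device that has no HTTP endpoint."""
    return {"cpu": 48.5, "disk": 39.0}


def render_metrics(readings):
    """Render sensor readings as gauge samples in the text exposition format."""
    lines = [
        "# HELP demo_temperature_celsius Temperature reported by a sensor.",
        "# TYPE demo_temperature_celsius gauge",
    ]
    for sensor, value in sorted(readings.items()):
        lines.append(f'demo_temperature_celsius{{sensor="{sensor}"}} {value}')
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serve the current readings whenever Prometheus scrapes /metrics."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(read_sensors()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=9200):
    """Block forever, answering scrapes on the given port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() starts the endpoint; Prometheus would then be configured to scrape that host and port like any other target.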

As an open-source tool, Prometheus has other major advantages—it’s free, its code is available on GitHub, and its toolkit is readily customizable.

Prometheus includes Alertmanager, which groups and deduplicates alerts before sending out each group of related alerts as a single notification. That means less alert fatigue, and you won’t be flooded with constant alerts during an outage.
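The idea behind grouping and deduplication can be sketched in a few lines of Python. This is a toy model, not Alertmanager's implementation: group_alerts is a hypothetical function, alerts are represented as plain label dictionaries, and the group_by argument only loosely mirrors the group_by setting in an Alertmanager route.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Drop exact duplicates, then bucket alerts by the chosen labels.

    Each alert is a dict of label name -> label value. Alerts that share
    the same values for every label in group_by land in the same bucket,
    which would be delivered as one notification.
    """
    groups = defaultdict(dict)
    for alert in alerts:
        key = tuple(alert.get(label, "") for label in group_by)
        fingerprint = tuple(sorted(alert.items()))  # identical label sets collapse
        groups[key][fingerprint] = alert
    return {key: list(unique.values()) for key, unique in groups.items()}


alerts = [
    {"alertname": "HighLatency", "service": "api", "instance": "a"},
    {"alertname": "HighLatency", "service": "api", "instance": "a"},  # duplicate
    {"alertname": "HighLatency", "service": "api", "instance": "b"},
    {"alertname": "DiskFull", "service": "db", "instance": "c"},
]

for key, members in group_alerts(alerts, ["alertname", "service"]).items():
    print(key, len(members))
```

Here four incoming alerts collapse into two notifications: one for the latency problem on the api service and one for the full disk on the db service.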

It's also highly reliable because each Prometheus server is standalone and doesn't depend on the others, which means that a server can continue functioning even when part of your system is down. This is especially important because you need your monitoring system to remain functional during outages.

 

When not to use Prometheus?

A good engineer knows that it’s not just about using good tools—it’s also about using the right tool for the job. Prometheus is very good at what it does, but it’s not intended to be an all-in-one platform for all of your observability needs. 

Here are some examples where you’ll benefit from using another tool. Note that even when another tool is a better fit for a use case, you can still use Prometheus alongside it because it's often the right tool for monitoring a service.

Long-term data storage:

Prometheus isn’t intended for durable long-term storage. You can use an observability platform or another storage source for long-term storage. For instance, New Relic provides extended storage for up to 13 months for dimensional metrics.

When you need 100% accuracy:

Prometheus prioritizes reliability over accuracy. According to the CAP theorem, a distributed system can guarantee only two of three properties: consistency (accuracy), availability (reliability), and partition tolerance (continuing to operate when network failures cut servers off from each other). Since distributed systems always need partition tolerance, there is a tradeoff between reliability and accuracy. While the tradeoff is fairly small, when you need 100% accuracy (such as with a billing system), you’ll need to use another system.

Automatic setup for your environment:

An observability platform can automatically detect services and instrument them so you get observability in minutes. With Prometheus, you need to configure different services. That includes adding configurations to scrape specific HTTP endpoints and setting up exporters for services that don’t use HTTP endpoints. For large distributed systems, that’s a lot of work.
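To give a sense of the manual configuration involved, the Python sketch below renders a minimal scrape configuration from a mapping of job names to targets. The global, scrape_interval, scrape_configs, job_name, static_configs, and targets keys are standard Prometheus configuration fields; the render_scrape_config helper, the job names, and the host:port targets are assumptions invented for this example.

```python
def render_scrape_config(jobs, scrape_interval="15s"):
    """Render a minimal prometheus.yml with one static scrape job per entry.

    jobs maps a job name to the list of host:port targets to scrape.
    """
    lines = [
        "global:",
        f"  scrape_interval: {scrape_interval}",
        "scrape_configs:",
    ]
    for job_name, targets in jobs.items():
        lines.append(f"  - job_name: '{job_name}'")
        lines.append("    static_configs:")
        lines.append("      - targets:")
        for target in targets:
            lines.append(f"          - '{target}'")
    return "\n".join(lines) + "\n"


# Hypothetical services: two API instances and one exporter for a host.
print(render_scrape_config({
    "api": ["api-1:8080", "api-2:8080"],
    "node": ["host-1:9100"],
}))
```

Every new service or exporter means another entry like these, which is where the effort adds up in a large distributed system.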

New Relic provides both long-term storage and helps you automatically set up your environment, so you can combine New Relic with Prometheus to benefit from both tools.

What is Prometheus monitoring?

Prometheus monitoring is the use of Prometheus, an open-source monitoring and alerting toolkit, to monitor highly dynamic service-oriented architectures and hardware.

What can you monitor with Prometheus?

We have a whole article on monitoring with Prometheus, but if you just want a refresher on what Prometheus can monitor, the short answer is: almost everything. 

Frontend:

Use Prometheus to monitor application metrics like throughput (transactions per second) and response times. In particular, the Prometheus Blackbox Exporter enables you to perform an uptime check and monitor website status.

Backend:

Monitor databases, APIs, and HTTP request status on your endpoints using Prometheus. For instance, you can use the toolkit to track REST API metrics like request latency, log events, and error rate per API. You can also use Prometheus to monitor JVM applications via the JMX Exporter.

Servers:

Use the Prometheus Blackbox Exporter to track server KPIs like average response time. In addition, you can monitor your operating system to understand your server's CPU utilization or disk usage. You can also use the Apache exporter for Prometheus to monitor an Apache web server.

Hardware:

With the Prometheus Node Exporter, you can monitor hardware and kernel metrics on Linux and other Unix-like systems. Some of the hardware metrics you can track include CPU usage, disk utilization, network bandwidth, and memory.

Infrastructure:

Prometheus can monitor your infrastructure and applications at multiple levels, including hosts, the application itself, and any containers. For instance, you can monitor the performance of MySQL and identify any issues with the database using the Prometheus MySQL exporter. Another example is the ability to track the throughput and response times of the Kafka load generator, Cassandra client, and Kafka consumer.