Prometheus is a powerful, open-source tool for real-time monitoring of applications, microservices, and networks, including service meshes and proxies. It’s especially useful for monitoring Kubernetes clusters. Originally developed by SoundCloud in 2012, it’s now part of the Cloud Native Computing Foundation.

Prometheus has a dimensional data model, which is ideal for recording time series data—in other words, your metrics. It also includes PromQL, a powerful query language that helps you drill down into those metrics. Finally, Prometheus is extremely reliable because its servers are standalone and continue to operate even when other parts of your system are down.

How does Prometheus collect data?

A Prometheus server collects metrics data from its targets using a pull model over HTTP. But what exactly is a pull model, and how is it different from a push model?

Push versus pull model

On a basic level, it really is as simple as it sounds. In a pull model, the monitoring server initiates the collection. A Prometheus server uses jobs to scrape HTTP endpoints at a regular interval, and the metrics those endpoints return are stored in the Prometheus database.
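To make the direction of the request concrete, here's a minimal sketch in Python that uses only the standard library. It isn't how Prometheus itself is implemented. The metric name demo_requests_total, its value, and the helper names are assumptions made up for illustration. The target only answers when asked, and the scrape function plays the role of the Prometheus server.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical metric value that the target wants to expose.
REQUESTS_SERVED = 42


class MetricsHandler(BaseHTTPRequestHandler):
    """The target: it never sends data on its own, it only answers GET /metrics."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = f"demo_requests_total {REQUESTS_SERVED}\n".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def scrape(host: str, port: int, path: str = "/metrics") -> str:
    """The collector side: pull the current metric values over HTTP."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        return conn.getresponse().read().decode()
    finally:
        conn.close()


def run_demo() -> str:
    """Start a throwaway target on a free local port and scrape it once."""
    server = HTTPServer(("127.0.0.1", 0), MetricsHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return scrape("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())
```

Notice that the target holds no address for the collector. Only the collector needs to know where its targets live.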

In a push model, the monitored system initiates the transfer: it collects data and then pushes it to a database. New Relic and other observability platforms use the push model. That means using instrumentation, the process of installing agents in your systems, to collect telemetry data. The agent, which is embedded in your software, collects the data and pushes it to another server.
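For contrast, here's a rough sketch of the push direction in Python. It's a generic illustration, not the design of New Relic's agents or of any real agent. The PushAgent class, its batch size, and the metric name are all hypothetical, and the transport is a plain callable so the example runs without a network.

```python
import json
import time
from typing import Callable, List


class PushAgent:
    """Hypothetical in-process agent: buffers measurements, then pushes them."""

    def __init__(self, send: Callable[[bytes], None], batch_size: int = 3):
        self._send = send  # transport; a real agent would typically make an HTTP request here
        self._batch_size = batch_size
        self._buffer: List[dict] = []

    def record(self, name: str, value: float) -> None:
        """Store one measurement and push when the batch is full."""
        self._buffer.append({"name": name, "value": value, "timestamp": time.time()})
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Send everything buffered so far; the agent starts the transfer."""
        if not self._buffer:
            return
        payload = json.dumps(self._buffer).encode()
        self._buffer = []
        self._send(payload)


# Stand-in for the remote server: just remember what was pushed.
received: List[bytes] = []
agent = PushAgent(send=received.append, batch_size=2)
agent.record("demo_latency_seconds", 0.12)
agent.record("demo_latency_seconds", 0.34)  # the second sample fills the batch
```

Here the agent has to know where to send its data, which is the reverse of the pull model, where the server has to know where its targets are.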

Both push and pull methods work well for monitoring and observability, and OpenTelemetry (OTel), which is open-source and vendor-neutral, supports both methods.

Targets and jobs

A Prometheus server pulls its data by scraping HTTP endpoints at a configured interval. Each scrape returns the current values of the endpoint's metrics, so frequent scrapes give the Prometheus server a near real-time view. These endpoints are also known as targets or instances, and a collection of instances that have the same purpose is known as a job. A job typically groups replicas of the same process that are run for scalability or reliability.
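The relationship between jobs and instances can be sketched as a small data structure. The job names and addresses below are hypothetical, and the two helper functions are illustrative rather than part of Prometheus. What the sketch does mirror is that Prometheus attaches job and instance labels to every series it scrapes, so you can later tell where each value came from.

```python
from typing import Dict, List

# Hypothetical jobs, each listing the instances (host:port) that serve the same purpose.
JOBS: Dict[str, List[str]] = {
    "api-server": ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"],
    "node": ["10.0.0.1:9100", "10.0.0.2:9100"],
}


def expand_targets(jobs: Dict[str, List[str]], metrics_path: str = "/metrics") -> List[dict]:
    """List every endpoint a scraper would visit, one entry per instance."""
    targets = []
    for job, instances in jobs.items():
        for instance in instances:
            targets.append(
                {"job": job, "instance": instance, "url": f"http://{instance}{metrics_path}"}
            )
    return targets


def attach_target_labels(sample_labels: Dict[str, str], job: str, instance: str) -> Dict[str, str]:
    """Add the job and instance labels to the labels of a scraped sample."""
    return {**sample_labels, "job": job, "instance": instance}


if __name__ == "__main__":
    for target in expand_targets(JOBS):
        print(target["job"], target["url"])
```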

Exporters and native Prometheus endpoints

It may seem like a major limitation that Prometheus can only pull data from HTTP endpoints, because sometimes you need to monitor things that don't expose metrics over HTTP themselves, such as proxies, hardware, and databases. In those cases, you generally need to use exporters. Exporters act as an intermediary between services and your Prometheus servers, collecting data from the services and making it available to Prometheus for scraping. Because Prometheus has a robust open-source community, there are a lot of third-party exporters that help you expose data to a Prometheus server.
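At its core, an exporter translates whatever the service reports into the text format Prometheus scrapes. Here's a minimal sketch of that translation step in Python. The metric name demo_db_connections, the database names, and the helper functions are assumptions for illustration, and a real exporter would also serve this text over HTTP.

```python
from typing import Dict


def escape_label_value(value: str) -> str:
    """Escape backslashes, double quotes, and newlines inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def format_sample(name: str, labels: Dict[str, str], value: float) -> str:
    """Render one sample as a line of the Prometheus text exposition format."""
    if not labels:
        return f"{name} {value}"
    pairs = ",".join(f'{key}="{escape_label_value(val)}"' for key, val in sorted(labels.items()))
    return f"{name}{{{pairs}}} {value}"


def render_metrics(connections_by_database: Dict[str, int]) -> str:
    """Turn stats read from a (hypothetical) database into scrapeable text."""
    lines = [
        "# HELP demo_db_connections Open connections reported by the database.",
        "# TYPE demo_db_connections gauge",
    ]
    for database, count in sorted(connections_by_database.items()):
        lines.append(format_sample("demo_db_connections", {"database": database}, count))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # In a real exporter these numbers would come from querying the service itself.
    print(render_metrics({"orders": 7, "users": 3}), end="")
```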

Many open-source projects also have native Prometheus endpoints. When you use projects with native Prometheus endpoints, you don’t need an exporter. These open-source projects make it easier to integrate Prometheus without any additional work from engineers. Examples include:

  • ArgoCD, which helps manage GitOps workflows
  • CoreDNS, which is used to manage DNS for Kubernetes clusters
  • Istio, an open-source service mesh

Time series database and multidimensional model

The Prometheus server pulls data into a time series database. Each sample in a time series consists of a value and a timestamp, and each series is identified by a metric name and optional labels stored as key-value pairs. This kind of data is especially useful for tracking how measurements change over time.
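One way to picture this model is the following Python sketch. It isn't Prometheus's storage engine, and the Sample class, the helper names, and the metric names are hypothetical. It only shows that a series is identified by its metric name and labels, while timestamps and values are the points within it.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple

Labels = Tuple[Tuple[str, str], ...]


@dataclass(frozen=True)
class Sample:
    name: str         # metric name
    labels: Labels    # sorted key-value pairs
    timestamp: float  # seconds since the epoch
    value: float


def make_sample(name: str, labels: Dict[str, str], timestamp: float, value: float) -> Sample:
    """Build a sample, normalizing labels so their order never matters."""
    return Sample(name, tuple(sorted(labels.items())), timestamp, value)


def group_into_series(samples: List[Sample]) -> Dict[Tuple[str, Labels], List[Tuple[float, float]]]:
    """Group samples by (name, labels); each group is one time series."""
    series = defaultdict(list)
    for sample in sorted(samples, key=lambda s: s.timestamp):
        series[(sample.name, sample.labels)].append((sample.timestamp, sample.value))
    return dict(series)


if __name__ == "__main__":
    samples = [
        make_sample("demo_requests_total", {"method": "GET"}, 15.0, 12.0),
        make_sample("demo_requests_total", {"method": "GET"}, 0.0, 10.0),
        make_sample("demo_requests_total", {"method": "POST"}, 0.0, 2.0),
    ]
    for key, points in group_into_series(samples).items():
        print(key, points)
```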

While other database systems such as Cassandra, PostgreSQL, and MongoDB provide support for time series data, Prometheus is specifically designed for this kind of data. That makes Prometheus ideal for real-time monitoring.

Prometheus also uses a multidimensional model. That means you can query your data in many different ways, such as by job, endpoint, or time period. The ability to slice and analyze your data is very important for getting full observability into the systems you’re monitoring. For instance, narrowing a query down to a specific endpoint lets you focus on a single microservice or piece of hardware. On the other hand, querying data over a time period provides metrics for dashboards, including average response times and a high-level view of how a system is performing. Ultimately, a multidimensional model gives you tremendous flexibility in how you query and analyze your telemetry data.
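The idea of slicing by dimension can be illustrated in a few lines of Python. This is a toy model, not PromQL. The series, label values, and the select and average_over helpers are all made up for the example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical response-time series, each identified by its labels.
SERIES: List[dict] = [
    {
        "labels": {"__name__": "demo_response_seconds", "job": "api", "instance": "10.0.0.1:8080"},
        "points": [(0, 0.20), (15, 0.30), (30, 0.40)],
    },
    {
        "labels": {"__name__": "demo_response_seconds", "job": "api", "instance": "10.0.0.2:8080"},
        "points": [(0, 0.10), (15, 0.10), (30, 0.70)],
    },
    {
        "labels": {"__name__": "demo_response_seconds", "job": "checkout", "instance": "10.0.0.3:8080"},
        "points": [(0, 1.00), (15, 1.20), (30, 1.10)],
    },
]


def select(series: List[dict], **matchers: str) -> List[dict]:
    """Keep only the series whose labels equal every given matcher."""
    return [
        s for s in series
        if all(s["labels"].get(key) == value for key, value in matchers.items())
    ]


def average_over(points: List[Tuple[float, float]], start: float, end: float) -> Optional[float]:
    """Average the values whose timestamps fall within [start, end]."""
    values = [value for timestamp, value in points if start <= timestamp <= end]
    return sum(values) / len(values) if values else None


if __name__ == "__main__":
    # Slice by job, then by a single instance, then by time period.
    print(len(select(SERIES, job="api")))
    one_instance = select(SERIES, job="api", instance="10.0.0.1:8080")[0]
    print(average_over(one_instance["points"], start=0, end=30))
```

The same stored data answers all three questions. Only the label matchers and the time window change.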

Ideal use cases for Prometheus monitoring

Prometheus is particularly well-suited to metric-based monitoring. It excels in scenarios where real-time visibility into system performance is critical, making it an excellent choice for:

Container orchestration 

Prometheus is highly popular for monitoring containerized environments, such as Kubernetes and Docker, providing insights into container resource utilization, orchestration, and scaling.

Microservices architectures 

In microservices-based applications, Prometheus can track the performance of individual services and help troubleshoot issues within the distributed system.

Cloud-native applications 

Prometheus is a natural fit for cloud-native applications, enabling teams to monitor dynamic, auto-scaling workloads in cloud environments.

Infrastructure monitoring 

It's an ideal choice for monitoring server and network infrastructure, providing insights into resource consumption, network latency, and system health.

Custom application metrics 

Developers can instrument their applications with Prometheus client libraries to collect custom metrics, making it a versatile tool for tracking specific application-level KPIs.

DevOps and Site Reliability Engineering (SRE) practices 

Prometheus aligns well with the principles of DevOps and SRE, offering real-time insights to ensure service reliability and meet SLOs and SLAs.

Open-source ecosystem 

Prometheus boasts a vibrant open-source ecosystem with exporters and integrations for a wide range of technologies, making it suitable for diverse tech stacks.

Prometheus is an excellent choice when you need granular control over the metric collection and a real-time view of your system's performance. Its flexibility and scalability make it a valuable tool for organizations of all sizes, especially those with dynamic and containerized infrastructures.

Pros and cons of monitoring with Prometheus

Monitoring with Prometheus offers a range of advantages and some challenges. On the positive side, Prometheus is renowned for its robust metric-based monitoring capabilities and its ability to scale with the growing needs of your infrastructure. It provides real-time insights into system performance, enabling proactive issue resolution and efficient resource utilization. The integration of Grafana complements Prometheus, offering customizable and intuitive dashboards for visualizing data. 

However, it's important to recognize that Prometheus primarily focuses on metrics, and achieving comprehensive observability may require additional tools for logging and tracing. Moreover, implementing and maintaining Prometheus can be resource-intensive, demanding expertise and effort from your IT team. Overall, Prometheus is a powerful monitoring solution, but its effectiveness hinges on thoughtful integration and management to reap its full benefits while addressing its limitations.

Prometheus vs. Grafana monitoring: Which is best?

Comparing Prometheus and Grafana for monitoring purposes is not a matter of determining which is best, but rather understanding their complementary roles in a monitoring stack. 

As you (hopefully) have come to see, Prometheus excels in data collection, storage, and alerting based on time-series metrics. It's ideal for tracking system performance and resource utilization, offering powerful querying capabilities through PromQL. 

On the other hand, Grafana is a visualization and dashboarding tool designed to display data from various sources, including Prometheus. Grafana's strength lies in its ability to create interactive, user-friendly dashboards that allow users, including IT team leaders and CTOs, to gain actionable insights from Prometheus data. 

So, it's not a question of one being better than the other; they are typically used together. Prometheus collects the data, and Grafana helps visualize it, providing a comprehensive monitoring solution. The real value comes from integrating both to create a robust observability stack that empowers your organization to efficiently manage systems, reduce costs, enhance the customer experience, and meet SLOs and SLAs effectively.

How to visualize and query your Prometheus metrics

Prometheus metrics provide valuable insight that can help you monitor your services. Prometheus comes with a flexible query language known as PromQL (short for Prometheus Query Language) that you can use to query your metrics. But where do you query your data, and how do you visualize it? There are several different options:

  • Use the built-in expression browser. Prometheus provides an expression browser that you can use to see your data. It’s easy to use but very basic.
  • Use an analytics and visualization platform. The most popular option is Grafana, which is open-source and provides powerful dashboard and visualization features.
  • Use an observability platform. You can use an observability platform like New Relic, which provides dashboards and visualizations like Grafana. An observability platform also provides data storage and the ability to instrument services you aren’t monitoring with Prometheus, which platforms focused on just analytics and visualization don’t provide.
  • Combine multiple platforms. You can also use both kinds of platforms together to get the benefits of each.

Expression browser

Once you’ve set up at least a basic Prometheus server, you can use Prometheus’s built-in expression browser at localhost:9090/graph to view your metrics. If you haven’t set up any external targets yet, you can even have Prometheus scrape itself (generally at localhost:9090). While that’s not extremely useful in itself, it’s a good way to ensure that Prometheus is up and running.
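The expression browser isn't the only way to run a query. Prometheus also exposes an HTTP API, and an instant query goes to /api/v1/query with the expression in the query parameter. The Python sketch below builds such a request URL and parses the kind of JSON the API returns for an instant vector. The helper names and the canned response are assumptions for illustration, and the sketch doesn't contact a server.

```python
import json
from typing import Dict, List, Tuple
from urllib.parse import urlencode


def instant_query_url(base_url: str, promql: str) -> str:
    """Build the URL for an instant query against the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_vector(body: str) -> List[Tuple[Dict[str, str], float]]:
    """Extract (labels, value) pairs from an instant-vector response body."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    # Each result carries its labels and a [timestamp, "value"] pair.
    return [(item["metric"], float(item["value"][1])) for item in data["result"]]


# Canned example of a response body for the query `up` (illustrative values).
EXAMPLE_BODY = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {"__name__": "up", "job": "prometheus", "instance": "localhost:9090"},
                "value": [1700000000.0, "1"],
            }
        ],
    },
})

if __name__ == "__main__":
    print(instant_query_url("http://localhost:9090", 'up{job="prometheus"}'))
    print(parse_instant_vector(EXAMPLE_BODY))
```

If Prometheus is scraping itself at localhost:9090, the up metric for that target should be 1 while the scrape succeeds, which makes it a quick health check.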

While you can make PromQL queries directly to a Prometheus server, that’s not the most effective way to see your metrics in action. Prometheus doesn’t offer features like distributed tracing, real user monitoring, and application performance monitoring, and its dashboards aren’t very fancy. You’ll likely want to query and visualize your data somewhere else.

Grafana

As we already mentioned, Grafana is open-source analytics and visualization software that you can use to query your Prometheus data. Grafana’s only job is to query and visualize, and it does both very well. Here’s an example of a Grafana dashboard visualizing Prometheus metrics.

If you want to build dashboards with your Prometheus data in Grafana, you just need to add Prometheus as a data source and then create a Prometheus graph.

The best of both worlds: An observability platform

While Prometheus and Grafana form a potent duo for metric-based monitoring, there are compelling reasons why someone might consider integrating an observability platform (like New Relic) into their monitoring strategy. The platform provides a more comprehensive solution by offering not just metrics but also logs, traces, and application performance monitoring. 

This holistic approach enables deeper insights into system behavior and the user experience. It simplifies root cause analysis by correlating metrics, logs, and traces, making it easier to pinpoint and resolve issues swiftly. 

Additionally, New Relic's AI-driven capabilities can proactively identify anomalies and performance bottlenecks, helping IT team leaders and CTOs ensure optimal system reliability, reduce operational costs, enhance the customer experience, and meet SLOs and SLAs effectively.

By combining Prometheus and Grafana with New Relic, organizations can achieve a more robust and proactive observability stack that covers all aspects of modern application monitoring.

An observability platform like New Relic gives you additional options for scaling and storing your data. New Relic provides up to 13 months of storage for your data and is compatible with PromQL.

To store Prometheus data in New Relic, you can set up the remote write integration or use the Prometheus OpenMetrics integration for Kubernetes or Docker. The steps for setting up the remote write integration are included in “Harnessing the power of open source metrics in New Relic.” You can also read the documentation to learn more about both integration options.