APM vs. observability

Application performance monitoring (APM) and observability are two closely related approaches to understanding and improving your software performance, but they don’t mean the same thing. APM is one step in a well-rounded observability practice: it uses dashboards and alerts to catch known or expected failures. Observability is a holistic, dynamic practice that helps you understand your complex systems and discover answers to issues you haven’t even thought of. It involves collecting data across multiple layers of software architecture, analyzing it in real time, and asking questions about that data, tailored to specific signals, so you know what problems to focus on.

APM provides users with dashboards and alerts to troubleshoot an application’s performance in production. These insights are based on known, or expected, system failures—typically related to SRE golden signals—and provide engineers with alerts when pre-defined issues arise, with breadcrumbs and recommendations on how to troubleshoot. 

But what about issues that arise that weren’t predefined or expected? Today’s software environments are increasingly distributed, with software that is built, deployed, and maintained by distributed teams. This software also runs on a wide array of hosts—whether on premises or in the cloud. It’s critical for teams to conceptualize, troubleshoot, and improve these distributed systems. An observability practice gives teams the flexibility to query the “unknown unknowns” in their dynamic systems and investigate and troubleshoot anomalies as they arise. Observability platforms give teams a connected, real-time view of their operational data in one place, so they can better understand system behavior to make iterative improvements to the entire stack.

You need application performance monitoring

IT teams typically adopt APM as a best practice to understand and improve system performance. It helps them identify when an application is slow or broken and fix issues before they affect users. Through preconfigured alerts and visualizations, APM helps teams understand metrics like response time, throughput, and errors.

You can monitor the performance of everything from websites and mobile apps to servers, networks, APIs, cloud-based services, and other technologies with tools, products, and solutions including:

  • operational dashboards
  • real user monitoring
  • mobile monitoring
  • synthetic monitoring
  • serverless monitoring
  • database monitoring
  • infrastructure monitoring
  • service maps

APM provides a high-level view of how an application is performing and works well for the questions or conditions you know to ask in advance, such as:

  • “What’s my application’s throughput?”
  • “Alert me when I exceed a certain error budget.”
  • “What does compute capacity look like?” 
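Questions like these can be answered from a small set of aggregated metrics. The following is a minimal Python sketch, not any vendor's API: the `Request` record, the `summarize` function, and the 1% threshold are all hypothetical, and the "error budget" is simplified to a maximum error rate for a single time window.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """One handled request (hypothetical record shape)."""
    timestamp: float    # seconds since the start of the window
    duration_ms: float  # response time in milliseconds
    status: int         # HTTP status code


def summarize(requests, window_seconds, error_budget=0.01):
    """Compute throughput, mean response time, and error rate for one window.

    error_budget is an assumed maximum tolerated error rate (1% by default).
    """
    total = len(requests)
    if total == 0:
        return {"throughput_rpm": 0.0, "avg_response_ms": 0.0,
                "error_rate": 0.0, "alert": False}
    # Treat server-side failures (5xx) as errors.
    errors = sum(1 for r in requests if r.status >= 500)
    error_rate = errors / total
    return {
        "throughput_rpm": total * 60 / window_seconds,  # requests per minute
        "avg_response_ms": sum(r.duration_ms for r in requests) / total,
        "error_rate": error_rate,
        "alert": error_rate > error_budget,  # the predefined alert condition
    }


sample = [
    Request(0.5, 120.0, 200),
    Request(10.0, 80.0, 200),
    Request(20.0, 400.0, 500),
    Request(45.0, 100.0, 200),
]
summary = summarize(sample, window_seconds=60)
print(summary)
```

The key point is that every value here answers a question decided in advance. The sketch cannot explain why the one failing request failed, which is where the rest of this article picks up.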

But many modern application architectures are too complex to monitor and manage with just APM. You need to consider multiple data sources and various telemetry data types (not just metrics). You also need to think about logging. Each runtime is likely emitting logs in different places, and you need a way to consolidate that data and evaluate it in the context of your application. As you add more services and microservice components to your architecture, you also need to be able to trace a request across multiple services when a user accesses one of them and gets an error. Finally, you need to be able to investigate all of this data in one place, so you don’t lose context and can improve KPIs like mean time to recovery (MTTR).
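To make the idea of consolidating logs and tracing a request across services concrete, here is a minimal sketch in Python. It assumes every service attaches a shared `trace_id` to its log records. The record shape, service names, and helper functions are hypothetical illustrations, not the format of any particular platform.

```python
from collections import defaultdict

# Hypothetical log records emitted by three services in different places.
logs = [
    {"service": "checkout", "trace_id": "t-1", "ts": 3, "level": "ERROR",
     "msg": "payment failed"},
    {"service": "gateway", "trace_id": "t-1", "ts": 1, "level": "INFO",
     "msg": "POST /checkout"},
    {"service": "payments", "trace_id": "t-1", "ts": 2, "level": "ERROR",
     "msg": "card processor timeout"},
    {"service": "gateway", "trace_id": "t-2", "ts": 4, "level": "INFO",
     "msg": "GET /cart"},
]


def group_by_trace(records):
    """Consolidate records from all services, keyed by trace ID, in time order."""
    traces = defaultdict(list)
    for rec in records:
        traces[rec["trace_id"]].append(rec)
    for recs in traces.values():
        recs.sort(key=lambda r: r["ts"])
    return dict(traces)


def first_error(trace_records):
    """Return the earliest ERROR record in a time-ordered trace, or None."""
    return next((r for r in trace_records if r["level"] == "ERROR"), None)


traces = group_by_trace(logs)
path = [r["service"] for r in traces["t-1"]]
origin = first_error(traces["t-1"])
print(path, origin["service"], origin["msg"])
```

Viewed one service at a time, the user-facing error appears in `checkout`. Only with the records joined on the trace ID does it become clear that the failure started earlier in `payments`.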

To get to the root cause of an issue when you have multiple runtimes and many architecture layers, it’s necessary to take a more holistic, proactive approach. While APM provides aggregated metrics, you also need other insights to understand your dynamic stack.

You really need observability

Observability is about getting deep, technical insights into the state of your entire system, no matter how large or complex it is. It helps DevOps teams navigate the challenges of increased fragmentation in today’s distributed systems. Observability also gives you the power to understand patterns and connections in your data that you hadn’t previously considered.

Observability platforms automatically collect data from an array of sources and services in one place, help you monitor the health of your application by visualizing its performance in the context of the entire stack, and then give you the insights to take action. These insights help you understand not just that something happened, but why, and put the tools to resolve it at your fingertips.

New Relic Lookout screen capture of CPU usage, memory usage, and disk usage

New Relic One is an example of an observability platform that lets you know exactly where to focus your attention. In New Relic Lookout, the brighter the color, the more severe the change, and the bigger the size, the bigger the scale. You can dig deeper with correlations and abnormal history to see how a change impacts your whole system.

When evaluating observability platforms, look for ones that allow you to: 

  • Use open instrumentation agents to gather telemetry data from open source or vendor-specific entities that produce that data. Examples of telemetry data include metrics, events, logs, and traces (often referred to as MELT). Examples of entities include services, hosts, applications, and containers.
  • Visualize, navigate, debug, and improve your entire stack to optimize your end users' experience.
  • Analyze the enormous amounts of raw, high-cardinality telemetry data you collect for correlations and context, so humans can make sense of any patterns and anomalies that arise.
  • Take advantage of advances in artificial intelligence and machine learning, as part of an AIOps practice, so you can reduce alert noise and eliminate false alarms, correlate incidents, and automatically detect anomalies. Together, these help you find, diagnose, and resolve incidents faster. Insights into which areas need the most improvement help your teams determine where to focus developers’ efforts moving forward, to have the greatest effect on the end users of your application.
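As a rough illustration of what automatically detecting anomalies can mean, the sketch below flags data points that sit far from the mean of a metric series. It is a deliberately simple z-score check in Python, used as a stand-in and not the method of any particular AIOps product. The function name, the threshold, and the sample CPU readings are all assumptions made for this example.

```python
from statistics import mean, stdev


def find_anomalies(values, threshold=3.0):
    """Return indexes of points more than `threshold` standard deviations
    from the mean of the series (a basic z-score test)."""
    if len(values) < 2:
        return []
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # a perfectly flat series has no outliers
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical CPU usage samples (percent) with one obvious spike.
cpu_percent = [41, 43, 40, 42, 44, 41, 95, 42, 43, 40]
anomalies = find_anomalies(cpu_percent, threshold=2.0)
print(anomalies)
```

Unlike a fixed alert threshold, this check needs no advance knowledge of what a normal CPU value is. Production systems typically use more robust techniques, since a single large spike also inflates the mean and standard deviation it is measured against.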

APM is part of your observability practice

Given the application-centricity of today’s software stacks, you can’t have observability without a strong APM discipline.

You can think of it this way: Observability (a noun) describes how well you can understand your complex system. Application performance monitoring (a verb) is an action you take to improve that understanding. Observability doesn’t eliminate the need for APM. APM simply becomes one of the techniques you use to achieve observability.

To learn more about the specifics of application performance monitoring, see What is APM? To dive more deeply into observability, see What is observability?

Get started with APM and observability. Try New Relic One.

The best way to learn more about APM and observability is to get hands-on experience with an observability solution. Sign up for New Relic One. Your free account includes 100 GB/month of free data ingest, one free full-access user, and unlimited free basic users. Then explore the APM documentation and observability documentation.