About this report
Find quantifiable data points and an in-depth analysis of the state and future of observability below.
Last year, the 2021 Observability Forecast report focused on how, driven by digital transformation, full-stack observability (o11y) has become mission-critical to the success of every modern business. The report provided compelling reasons why it’s time for organizations to shift to full-stack observability so that they can plan, build, deploy, and run great software that powers optimal digital experiences for their customers, employees, partners, and suppliers.
This year, the 2022 Observability Forecast report focuses on the next chapter in the story—it looks at what’s driving observability practices today, how organizations are transforming those practices, and how observability impacts the lives of technical professionals. The report also includes a snapshot of emerging technologies that will potentially drive a further need for observability over the next three years.
Today, many organizations make do with a patchwork of tools to monitor their technology stacks, which requires extensive manual effort and yields only fragmented views of their information technology (IT) systems and overall businesses. At the same time, survey respondents longed for, and planned for, simplicity, integration, seamlessness, and more efficient ways to complete high-value projects.
Monitoring is fragmented. Most organizations do not currently monitor their full tech stacks.
Summary of observability challenges
Observability improves service-level metrics. Organizations see its value—and expect to invest more in it.
Summary of observability opportunities
Summary of how observability helps improve service-level metrics
Note that to avoid bias, we did not define observability in the survey.
Observability is the ability to measure how a system is performing and identify issues and errors based on its external outputs. These external outputs are telemetry data (metrics, events, logs, and traces). Data-driven engineering puts telemetry data to work to drive action. Observability requires instrumenting systems to secure actionable data that identifies an error and details when, why, and how an error occurs. Observability also involves collecting, analyzing, altering, and correlating that data for improved uptime and performance. Achieving observability brings a connected, real-time view of all data from different sources—ideally in one place—where teams can collaborate to troubleshoot and resolve issues faster, ensure operational efficiency, and produce high-quality software that ensures an optimal customer/user experience.
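As a rough illustration of what correlating telemetry can mean in practice, the sketch below groups records of different types by a shared identifier and orders each group by time. The `Telemetry` class, the `trace_id` field, and the sample records are hypothetical and chosen only for this example; real observability platforms use richer data models.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Telemetry:
    kind: str         # "metric", "event", "log", or "trace"
    trace_id: str     # hypothetical identifier shared by related records
    timestamp: float  # seconds since an arbitrary start
    detail: str


def correlate(records):
    """Group telemetry records by trace_id and sort each group by time."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.trace_id].append(record)
    return {
        trace_id: sorted(group, key=lambda r: r.timestamp)
        for trace_id, group in grouped.items()
    }


# Hypothetical sample data: one slow request (t1) and one healthy one (t2).
records = [
    Telemetry("log", "t1", 2.0, "payment failed: timeout"),
    Telemetry("trace", "t1", 1.0, "POST /checkout"),
    Telemetry("metric", "t2", 1.5, "latency_ms=40"),
    Telemetry("metric", "t1", 1.8, "latency_ms=5000"),
]
timeline = correlate(records)
```

Reading `timeline["t1"]` in order gives the request, the latency spike, and then the failure log, which is the kind of connected view the paragraph above describes.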
Software engineering, development, site reliability engineering, operations, and other teams use observability to understand the behavior of complex digital systems and turn data into tailored insights. Observability helps them pinpoint issues more quickly, understand root causes for faster, simpler incident response, and proactively align data with business outcomes.
Monitoring, a subset of observability, is reactive: it reveals what is wrong (an error) and when the error happened. Observability is proactive: it also determines why and how the error happened. Monitoring tools alone can lead to data silos and data sampling, while an observability platform can instrument an entire technology stack and correlate the telemetry data drawn from it in a single location for one unified, actionable view.
Many tools are purpose-built for observability and include capabilities such as:
- AIOps (artificial intelligence for IT operations)
- Application performance monitoring (APM)
- Browser monitoring
- Custom dashboards
- Database monitoring
- Distributed tracing
- Error tracking
- Infrastructure monitoring
- Kubernetes monitoring
- Log management
- Machine learning (ML) model performance monitoring (MLOps)
- Mobile monitoring
- Network performance monitoring
- Security monitoring
- Serverless monitoring
- Synthetic monitoring
Real-user monitoring (RUM) includes browser monitoring and mobile monitoring. Digital experience monitoring (DEM) includes RUM plus synthetic monitoring.
The ability to see everything in the tech stack that could affect the customer experience is called full-stack observability or end-to-end observability. It is based on a complete view of all telemetry data.
With full-stack observability, engineers and developers don’t have to sample data, compromise their visibility into the tech stack, or waste time stitching together siloed data. Instead, they can focus on the higher-priority, business-impacting, and creative coding they love.
Full-stack observability, as used in this report, is achieved by organizations that deploy specific combinations of observability capabilities, including customer experience monitoring/DEM (front-end), services monitoring, log management, and environment monitoring (back-end).
Full-stack observability combinations
- Instrumentation is automated
- Portions of incident response are automated
- Infrastructure is provisioned and orchestrated using automation tooling
- Telemetry is captured across the full stack
- Telemetry (metrics, events, logs, traces) is unified in a single pane for consumption across teams
- Users broadly have access to telemetry data and visualization
- Software deployment uses CI/CD (continuous integration and continuous delivery/deployment) practices
- Ingestion of high-cardinality data
- Ability to query data on the fly
- Developer [and engineer] time is shifted from incident response (reactive) towards higher-value work (proactive)
- Improved collaboration across teams to make decisions related to the software stack
- Observability mitigates service disruptions and business risk
- Telemetry data includes business context to quantify the business impact of events and incidents
- Observability improves revenue retention by deepening understanding of customer behaviors
- Observability creates revenue-generating use cases
For the purpose of this report, a mature observability practice employs at least these five characteristics:
- Unifies telemetry (metrics, events, logs, traces) in a single pane for consumption across teams
- Shifts developer and engineer time from incident response (reactive) towards higher-value work (proactive)
- Improves collaboration across teams to make decisions related to the software stack
- Mitigates service disruptions and business risk
- Improves revenue retention by deepening understanding of customer behaviors
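The definition above can be read as a simple subset check: a practice counts as mature when it shows all five characteristics, whether or not it also shows others. The sketch below is one way to express that rule. The short characteristic labels and the `is_mature` function are hypothetical names introduced for this example and are not taken from the survey instrument.

```python
# Hypothetical short labels for the five characteristics listed above.
MATURE_CHARACTERISTICS = frozenset({
    "unified_telemetry",
    "proactive_time_shift",
    "improved_collaboration",
    "mitigates_disruptions",
    "improves_revenue_retention",
})


def is_mature(practice):
    """Return True if the practice includes at least all five characteristics."""
    return MATURE_CHARACTERISTICS <= set(practice)


# A practice with all five plus an extra characteristic still qualifies.
all_five_plus_one = set(MATURE_CHARACTERISTICS) | {"automated_instrumentation"}

# A practice missing one of the five does not qualify.
missing_one = set(MATURE_CHARACTERISTICS) - {"improves_revenue_retention"}
```

Here `is_mature(all_five_plus_one)` returns `True` and `is_mature(missing_one)` returns `False`.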
Roles, job titles, descriptions, and common key performance indicators (KPIs) for practitioners and IT decision makers (ITDMs)
Organization size by employee count
ETR sent survey respondents a questionnaire and compensated them for completing the survey.
All data in this report are derived from the survey, which was in the field from March to April 2022.
ETR qualified survey respondents on the basis of relevant expertise. ETR used quota sampling, a non-probability sampling method, to reach target sample sizes based on respondents' country of residence and role type in their organizations (in other words, practitioners and ITDMs). Geographic representation quotas targeted 14 key countries.
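For readers unfamiliar with quota sampling, the sketch below shows the general idea: candidates are accepted in arrival order until the target count for their country and role cell is reached. It is a generic illustration, not ETR's actual procedure; the function name, the quota numbers, and the candidate records are all hypothetical.

```python
from collections import Counter


def quota_sample(candidates, quotas):
    """Accept candidates in order until each (country, role) quota is filled."""
    filled = Counter()
    accepted = []
    for person in candidates:
        cell = (person["country"], person["role"])
        # Candidates in cells without a quota are never accepted.
        if filled[cell] < quotas.get(cell, 0):
            filled[cell] += 1
            accepted.append(person)
    return accepted


# Hypothetical quotas and candidate pool, for illustration only.
quotas = {("Japan", "practitioner"): 2, ("Japan", "ITDM"): 1}
candidates = [
    {"id": 1, "country": "Japan", "role": "practitioner"},
    {"id": 2, "country": "Japan", "role": "practitioner"},
    {"id": 3, "country": "Japan", "role": "practitioner"},  # quota already full
    {"id": 4, "country": "Japan", "role": "ITDM"},
    {"id": 5, "country": "Brazil", "role": "ITDM"},          # no quota defined
]
sample = quota_sample(candidates, quotas)
accepted_ids = [p["id"] for p in sample]
```

Because acceptance depends on who arrives first rather than on random selection, results from a quota sample describe the respondents and cannot be assumed to generalize to the whole population with a known margin of error.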
All dollar amounts in this report are in USD.
In 2022, ETR polled 1,614 technology professionals—more than any other observability report and 25% more than the 1,295 we polled in 2021—in the same 14 countries across Asia Pacific, Europe, and North America. France, Germany, Ireland, and the United Kingdom represented 44% of respondents. Approximately 31% of respondents were from Canada and the United States. The remaining 25% were from the broader Asia-Pacific region including Australia, India, Indonesia, Japan, Malaysia, New Zealand, Singapore, and Thailand. Download the full report to view regional highlights.
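The headline figures in the paragraph above can be checked with a few lines of arithmetic. The respondent counts and regional shares are taken from the text; the variable names are just for this example.

```python
respondents_2022 = 1614
respondents_2021 = 1295

# Year-over-year growth in the number of respondents, as a percentage.
increase_pct = (respondents_2022 - respondents_2021) / respondents_2021 * 100

# Regional shares of respondents as stated in the text (percent).
regional_shares = {"Europe": 44, "North America": 31, "Asia Pacific": 25}
total_share = sum(regional_shares.values())

print(f"Increase: {increase_pct:.1f}%")
print(f"Regional shares sum to {total_share}%")
```

The increase works out to roughly 24.6%, which the report rounds to 25%, and the three regional shares sum to 100%.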
The survey respondent mix was about the same as last year—65% practitioners and 35% ITDMs. A running theme in the data is a split between what practitioners value and perceive and what ITDMs value and perceive regarding observability.
Respondents' ages ranged from 19 to 72, with 80% identifying as male and 20% identifying as female, which (unfortunately) reflects the gender imbalance among technology professionals today.
Respondent demographics, including sample size, regions, countries, roles, age, and gender
More than half of survey respondents (56%) worked for midsize organizations, followed by 35% for large organizations, and 8% for small organizations.
For organization annual revenue, 17% had $1 million to $9.99 million (35% of those were small organizations and 55% were midsize), 43% had $10 million to $99.99 million (76% of those were midsize organizations), and 40% had $100 million or more (63% of those were large organizations).
The respondent pool represented a wide range of industries, including IT/telecommunications (telco), financial/insurance, industrials/materials/manufacturing, retail/consumer, healthcare/pharmaceutical (pharma), energy/utilities, services/consulting, education, nonprofit/unspecified, and government. Download the full report to view industry highlights.
Respondent firmographics, including organization size, annual revenue, and industries