To capture new insights into observability, New Relic partnered with Enterprise Technology Research (ETR) to conduct a survey and analysis for this third annual Observability Forecast report.
What’s new
For this year's report, the largest and most comprehensive study of its kind in the observability industry, we increased our sample size once again. We also added another country this year (South Korea) and updated the regional distribution. We repeated about half of last year's questions to compare trends year-over-year and added many new ones to gain additional insights. And we added security monitoring to our definition of full-stack observability.
In addition, this year’s report is entirely online with updated navigation to help make it even easier to find the data you’re looking for.
Definitions
We’ve defined some common terms and concepts used throughout this report.
Observability
Note that to avoid bias, we did not define observability in the survey.
Observability enables organizations to measure how a system performs and identify issues and errors based on its external outputs. These external outputs are called telemetry data and include metrics, events, logs, and traces (MELT). Observability is the practice of instrumenting systems to surface insights for various roles so they can take immediate action that impacts customer experience and service. It also involves collecting, analyzing, alerting on, and correlating that data for improved uptime and performance.
Achieving observability brings a connected, real-time view of all data from different sources—ideally in one place—where teams can collaborate to troubleshoot and resolve issues faster, prevent issues from occurring, ensure operational efficiency, and produce high-quality software that promotes an optimal customer/user experience and competitive advantage.
Software engineering, development, site reliability engineering, operations, and other teams use observability to understand the behavior of complex digital systems and turn data into tailored insights. Observability helps them pinpoint issues more quickly, understand root causes for faster, simpler incident response, and proactively align data with business outcomes.
Monitoring, a subset of observability, identifies problems in the environment based on prior experience expressed as a set of conditions (known unknowns). Monitoring enables organizations to react to these conditions and is sufficient to solve problems when the number and complexity of possible problems are limited.
Organizations use observability to determine why something unexpected happened (in addition to the what, when, and how), particularly in complex environments where the possible scope of problems and interactions between systems and services is significant. The key difference is that observability does not rely on prior experience to define the conditions used to solve all problems (unknown unknowns). Organizations also use observability proactively to optimize and improve environments.
Monitoring tools alone can lead to data silos and data sampling. In contrast, an observability platform can instrument an entire technology stack and correlate the telemetry data drawn from it in a single location for one unified, actionable view.
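The unified-view idea described above can be illustrated with a toy sketch (not from the report, and not New Relic code): MELT signals from different sources share a trace ID, and grouping on that ID yields one correlated view per request instead of per-tool silos.

```python
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class Signal:
    kind: str      # "metric", "event", "log", or "trace" (MELT)
    trace_id: str  # shared ID that lets signals from different tools be correlated
    payload: str

def correlate(signals):
    """Group MELT signals by trace ID for one unified view per request."""
    unified = defaultdict(list)
    for s in signals:
        unified[s.trace_id].append(s)
    return dict(unified)

signals = [
    Signal("log", "req-42", "checkout failed: timeout"),
    Signal("metric", "req-42", "latency_ms=5012"),
    Signal("trace", "req-42", "span: payment-service"),
    Signal("log", "req-7", "checkout ok"),
]
view = correlate(signals)
# All three signals for req-42 now sit side by side in one place.
```

In a siloed setup, the log, metric, and trace for `req-42` would live in three separate tools; correlating on a shared identifier is what turns them into a single actionable view.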
Many tools are purpose-built for observability and include capabilities such as:
- AIOps (artificial intelligence for IT operations)
- Alerts
- Application performance monitoring (APM)
- Browser monitoring
- Dashboards
- Database monitoring
- Distributed tracing
- Error tracking
- Infrastructure monitoring
- Kubernetes monitoring
- Log management
- Machine learning (ML) model performance monitoring (MLOps)
- Mobile monitoring
- Network performance monitoring
- Security monitoring
- Serverless monitoring
- Synthetic monitoring
Real user monitoring (RUM) includes browser monitoring and mobile monitoring. Digital experience monitoring (DEM) includes RUM plus synthetic monitoring.
| Monitoring | Observability |
|---|---|
| Reactive | Proactive |
| Situational | Predictive |
| Speculative | Data-driven |
| What + when | What + when + why + how |
| Expected problems (known unknowns) | Unexpected problems (unknown unknowns) |
| Data silos | Data in one place |
| Data sampling | Instrument everything |
Key differences between monitoring and observability
Full-stack observability
The ability to see everything in the tech stack that could affect the customer experience is called full-stack observability or end-to-end observability. It’s based on a complete view of all telemetry data.
Adopting a data-driven approach for end-to-end observability helps empower engineers and developers with a complete view of all telemetry data so they don’t have to sample data, compromise their visibility into the tech stack, or waste time stitching together siloed data. Instead, they can focus on the higher-priority, business-impacting, and creative coding they love.
Full-stack observability, as used in this report, is achieved by organizations that deploy specific combinations of observability capabilities, including customer experience monitoring/DEM (frontend), services monitoring, log management, environment monitoring (backend), and security monitoring.
See how many respondents had achieved full-stack observability and the advantages of doing so.
- Services monitoring: application performance monitoring and/or serverless monitoring
- Environment monitoring: database monitoring and/or infrastructure monitoring and/or network monitoring
- Customer experience monitoring/DEM: browser monitoring and/or mobile monitoring and/or synthetic monitoring
- Log management
- Security monitoring

Observability capabilities that combine to achieve full-stack observability
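The report's definition of full-stack observability is effectively a checklist: at least one deployed capability from each of five groups. A hypothetical sketch of that logic follows (group names come from the report; the function and example data are invented for illustration and are not New Relic code):

```python
# At least one capability from each group must be deployed.
FULL_STACK_GROUPS = {
    "services monitoring": {
        "application performance monitoring", "serverless monitoring"},
    "environment monitoring": {
        "database monitoring", "infrastructure monitoring", "network monitoring"},
    "customer experience monitoring/DEM": {
        "browser monitoring", "mobile monitoring", "synthetic monitoring"},
    "log management": {"log management"},
    "security monitoring": {"security monitoring"},
}

def has_full_stack_observability(deployed):
    """True if the deployed capabilities cover every group above."""
    return all(deployed & group for group in FULL_STACK_GROUPS.values())

org = {"application performance monitoring", "infrastructure monitoring",
       "browser monitoring", "log management", "security monitoring"}
has_full_stack_observability(org)  # True: one capability per group
```

Dropping any one group (for example, log management) makes the predicate false, which mirrors how the report counts an organization as having achieved full-stack observability only when all five groups are covered.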
Mature observability practice
While full-stack observability describes how much coverage an organization has for its services, maturity describes how effectively those observability capabilities are put into practice.
What constitutes a mature observability practice is somewhat subjective. In this report, we define a mature observability practice as one that follows best practices and delivers specific outcomes.
Best practices
- Instrumentation is automated
- Portions of incident response are automated
- Infrastructure is provisioned and orchestrated using automation tooling
- Telemetry is captured across the full stack
- Telemetry (metrics, events, logs, traces) is unified in a single pane for consumption across teams
- Users broadly have access to telemetry data and visualization
- Software deployment uses CI/CD (continuous integration and continuous delivery/deployment) practices
- Ingestion of high-cardinality data
- Ability to query data on the fly
Business outcomes
- Developer [and engineer] time is shifted from incident response (reactive) towards higher-value work (proactive)
- Improved collaboration across teams to make decisions related to the software stack
- Observability mitigates service disruptions and business risk
- Telemetry data includes business context to quantify the business impact of events and incidents
- Observability improves revenue retention by deepening understanding of customer behaviors
- Observability creates revenue-generating use cases
For the purpose of this report, a mature observability practice employs at least these five characteristics:
- Unifies telemetry (metrics, events, logs, traces) in a single pane for consumption across teams
- Shifts developer and engineer time from incident response (reactive) towards higher-value work (proactive)
- Improves collaboration across teams to make decisions related to the software stack
- Mitigates service disruptions and business risk
- Improves revenue retention by deepening understanding of customer behaviors
Learn about the maturity of survey respondents’ observability practices.
| Roles | Job titles | Descriptions | Common KPIs |
|---|---|---|---|
| Developers | Application developers, software engineers, architects, and their frontline managers | Members of a technical team who design, build, and deploy code, optimizing and automating processes where possible. Enjoy taking on new coding challenges, adopting new technologies, and staying up to date on the latest and greatest tools. | Cycle time (speed of making changes); endpoint security incidents; error rates; lead time (speed from idea to deployment); mean time between incidents (MTBI); speed of software performance; uptime percentage |
| Operations professionals | IT operations engineers, network operations engineers, DevOps engineers, DevSecOps engineers, SecOps engineers, site reliability engineers (SREs), infrastructure operations engineers, cloud operations engineers, platform engineers, system administrators, architects, and their frontline managers | Members of a technical team responsible for the overall health and stability of infrastructure and applications. Detect and resolve incidents using monitoring tools, build and improve the code pipeline, and lead optimization and scaling efforts. | Availability; deploy speed and frequency; error budgets; error rates; mean time to detection (MTTD); mean time to resolution (MTTR); service level agreements (SLAs); service level indicators (SLIs); service level objectives (SLOs); uptime percentage |
| Non-executive managers | Directors, senior directors, vice presidents (VPs), and senior vice presidents (SVPs) of engineering, operations, DevOps, DevSecOps, SecOps, site reliability, and analytics | Leaders of practitioner teams that build, launch, and maintain customer-facing and internal products and platforms. Own the projects that operationalize high-level business initiatives and translate technology strategy into tactical execution. Constantly look to increase velocity and scale services. | Customer satisfaction; MTBI; MTTR; on-time project completion; software development and efficiency; speed of deployment; uptime percentage |
| Executives (C-suite) | More technically focused: chief information officers (CIOs), chief information security officers (CISOs), chief technology officers (CTOs), chief data officers (CDOs), chief analytics officers (CAOs), and chief architects. Less technically focused: chief executive officers (CEOs), chief operating officers (COOs), chief financial officers (CFOs), chief marketing officers (CMOs), chief revenue officers (CROs), and chief product officers (CPOs) | Managers of overall technology infrastructure and cost, responsible for business impact, technology strategy, organizational culture, company reputation, and cost management. Define the organization's technology vision and roadmap to deliver on business objectives. Use digital technology to improve customer experience and profitability, enhancing company reputation as a result. | Conversion rates; cost-effectiveness; customer satisfaction; return on investment (ROI); speed of deployment; speed of innovation; total cost of ownership (TCO); uptime percentage |
Roles, job titles, descriptions, and common key performance indicators (KPIs) for practitioners and ITDMs
Methodology
All data in this report are derived from a survey, which was in the field from March to April 2023.
ETR qualified survey respondents based on relevant expertise. ETR performed a non-probability sampling type called quota sampling to target sample sizes of respondents based on their country of residence and role type in their organizations (in other words, practitioners and ITDMs). Geographic representation quotas targeted 15 key countries.
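Quota sampling, as described above, draws respondents until each stratum (here, country) reaches its target count, rather than sampling the whole population at random. A hypothetical sketch of the general technique follows (the population, quotas, and function are invented for illustration; this is not ETR's actual tooling):

```python
import random

def quota_sample(population, quotas, key, seed=0):
    """Non-probability quota sampling: accept respondents until each
    stratum (given by key) reaches its quota; others are excluded."""
    rng = random.Random(seed)
    pool = list(population)
    rng.shuffle(pool)  # randomize order within the available pool
    counts = {k: 0 for k in quotas}
    sample = []
    for person in pool:
        k = key(person)
        if k in quotas and counts[k] < quotas[k]:
            sample.append(person)
            counts[k] += 1
    return sample

# Invented example: 150 candidates across three countries, quotas for two.
population = [{"country": c, "id": i}
              for i, c in enumerate(["US"] * 50 + ["Japan"] * 50 + ["India"] * 50)]
sample = quota_sample(population, {"US": 10, "Japan": 5},
                      key=lambda p: p["country"])
# len(sample) == 15; candidates outside the quotas are excluded
```

Because the quotas are fixed targets rather than probabilities, the resulting sample matches the desired geographic mix by construction, which is why this is a non-probability method.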
All dollar amounts in this report are in USD.
Note that the 2023 survey drew respondents from larger organizations (by employee count and annual revenue) than in the 2021 Observability Forecast and 2022 Observability Forecast, which could affect some year-over-year (YoY) comparisons.
Survey results
The annual Observability Forecast is the only study of its kind to open-source its raw data.
About ETR
ETR is a technology market research firm that leverages proprietary data from its targeted ITDM community to deliver actionable insights about spending intentions and industry trends. Since 2010, ETR has worked diligently at achieving one goal: eliminating the need for opinions in enterprise research, which are typically formed from incomplete, biased, and statistically insignificant data.
The ETR community of ITDMs is uniquely positioned to provide best-in-class customer/evaluator perspectives. Its proprietary data and insights from this community empower institutional investors, technology companies, and ITDMs to navigate the complex enterprise technology landscape amid an expanding marketplace.
About New Relic
As a leader in observability, New Relic empowers engineers with a data-driven approach to planning, building, deploying, and running great software. New Relic delivers the only unified data platform with all telemetry—metrics, events, logs, and traces—paired with powerful full-stack analysis tools to help engineers do their best work with data, not opinion.
Delivered through the industry’s first usage-based pricing that’s intuitive and predictable, New Relic gives engineers more value for their money by helping improve planning cycle times, change failure rates, release frequency, and MTTR. This helps the world’s leading brands and hyper-growth startups to improve uptime, reliability, and operational efficiency and deliver exceptional customer experiences that fuel innovation and growth.
Citing the report
Suggested citation for this report:
- APA Style: Basteri, A. (2023, September). 2023 Observability Forecast. New Relic, Inc. https://newrelic.com/resources/report/observability-forecast/2023
- The Chicago Manual of Style: Basteri, Alicia. September 2023. 2023 Observability Forecast. N.p.: New Relic, Inc. https://newrelic.com/resources/report/observability-forecast/2023.
Demographics
In 2023, ETR polled 1,700 technology professionals—more than any other observability report, 5% more than the 1,614 we polled in 2022, and 31% more than the 1,295 we polled in 2021—in 15 countries across Asia Pacific, Europe, and North America. France, Germany, Ireland, and the United Kingdom represented 24% of respondents. Approximately 29% of respondents were from Canada and the United States. The remaining 47% were from the broader Asia-Pacific region, including Australia, India, Indonesia, Japan, Malaysia, New Zealand, Singapore, South Korea (new this year), and Thailand. View the regional highlights.
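The year-over-year growth figures above can be sanity-checked with simple arithmetic:

```python
def pct_increase(new, old):
    """Percentage increase from old to new, rounded to the nearest whole percent."""
    return round(100 * (new - old) / old)

pct_increase(1700, 1614)  # 5  (2023 vs. 2022)
pct_increase(1700, 1295)  # 31 (2023 vs. 2021)
```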
The survey respondent mix was about the same as in 2021 and 2022—65% practitioners and 35% ITDMs. A running theme in the data is a split between what practitioners value and perceive and what ITDMs value and perceive regarding observability.
Respondents’ ages ranged from 22 to 73, with 82% identifying as male and 18% identifying as female, which reflects the gender imbalance among technology professionals today.
Respondent demographics, including sample size, regions, countries, roles, age, and gender
Firmographics
More than half of survey respondents (58%) worked for large organizations, followed by 34% for midsize organizations, and 9% for small organizations.
By annual organization revenue, 17% of respondents cited $500,000 to $9.99 million, 25% cited $10 million to $99.99 million, and 49% cited $100 million or more.
The respondent pool represented a wide range of industries, including IT/telecommunications (telco), industrials/materials/manufacturing, financial services/insurance, retail/consumer, healthcare/pharmaceutical (pharma), energy/utilities, services/consulting, education, government, nonprofit, and unspecified. Keep an eye out for industry highlights throughout the report.