The recent acceleration of digital transformation in many organizations caused by a global pandemic has increased the workload on already short-staffed technology teams. As Elyssa Christensen mentioned in Evolving your DevOps practice: A look at the 2021 State of DevOps Puppet report, "the shortage of skilled resources is also identified as a significant barrier even for the most evolved organizations. A gold mine of highly skilled resources isn’t hiding behind any door just waiting to jump out, so while developing talent, we have to look at the two areas that can mitigate this shortage: automation and autonomous self-service."

So let’s try to decrease the workload of SREs and tech teams by automating observability. 

(At this point, you are probably asking yourself, is there a relevant XKCD for that? Of course there is.)

In this blog post, we’ll explore some of the ways your teams can automate different layers of the observability stack with New Relic, including:

  • Telemetry collection
  • Visualization
  • Alerts
  • Platform governance

Using this repository, you can build the basics of your own observability processes, then use these steps to make your own custom observability solution that's relevant to your team. 

Telemetry collection

Arguably the most important step of observability is collecting the data in the first place. New Relic provides many ways to collect that data, ranging from the simplest (guided installation) to the most direct (sending data straight to our MELT APIs). Let’s cover a few ways you can automate your telemetry collection with New Relic, so you have a starting point you can customize to your team’s needs.

Guided installer

New Relic’s guided installer is an open-source tool that you run in your environment. It automatically detects telemetry-producing systems and installs the relevant New Relic agent or integration. Once installed, the agent begins sending data to New Relic.

APM and infrastructure agents

The guided installer is a good starting point for getting visibility into your tech stack, but to keep the rollout of observability agents maintainable over time, we recommend including the agents in a CI/CD pipeline or a configuration management tool like Ansible.

New Relic provides example Ansible roles for both APM and infrastructure agents. The latest agent versions are also publicly available, so they are simple to pull into a CI/CD pipeline.

Once the agents are part of your CI/CD pipeline, don’t forget to send a deployment marker every time there is a new build of your application. You can automate deployment of the infrastructure agent in a similar way: use a configuration management tool like Ansible, bake the agent into your AMI or server image, or use Docker Hub to automatically pull down the latest container image.
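As a minimal sketch of what a pipeline step for deployment markers could look like, the Python below builds the HTTP request without sending it. It assumes the REST API v2 deployments endpoint and a user API key passed in the `Api-Key` header; the function name `build_deployment_marker` and the default `user` value are hypothetical, so check the current deployment-marker documentation before relying on it.

```python
import json
import urllib.request


def build_deployment_marker(app_id: str, api_key: str, revision: str,
                            description: str = "",
                            user: str = "ci") -> urllib.request.Request:
    """Build (but do not send) a deployment-marker request for one application.

    Assumption: the REST API v2 deployments endpoint and Api-Key header.
    """
    url = f"https://api.newrelic.com/v2/applications/{app_id}/deployments.json"
    body = {
        "deployment": {
            "revision": revision,        # for example, the git commit SHA
            "description": description,
            "user": user,
        }
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Api-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


# In a pipeline step you would send it with:
#   urllib.request.urlopen(build_deployment_marker(app_id, key, commit_sha))
```

Keeping request construction separate from sending makes the step easy to unit test without network access.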

OpenTelemetry SDKs

While New Relic provides many agents and integrations for you, we also support the collection of telemetry from other sources. 

One of the current industry standards for collecting telemetry data is OpenTelemetry, an open-source project that provides instrumentation for many common languages and frameworks. In most cases, it’s as simple as dropping an OpenTelemetry SDK into your application and pointing it at New Relic. New Relic natively supports any OpenTelemetry SDK that can send data to an OTLP endpoint. Most OpenTelemetry SDKs provide some form of automatic instrumentation, and they also let you instrument your application manually. You can automate this deployment, too, by adding the SDK to your CI/CD pipeline.
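One way to "point" an SDK at New Relic from a pipeline is through the standard OpenTelemetry environment variables. The sketch below only assembles those variables; it assumes the US OTLP endpoint `https://otlp.nr-data.net:4317` and an `api-key` header carrying a license key, and the helper name `otlp_env` is hypothetical.

```python
from typing import Dict


def otlp_env(service_name: str, license_key: str,
             endpoint: str = "https://otlp.nr-data.net:4317") -> Dict[str, str]:
    """Return the environment variables an OpenTelemetry SDK reads at startup.

    Assumption: the default endpoint is the US region; other regions differ.
    """
    return {
        "OTEL_SERVICE_NAME": service_name,
        "OTEL_EXPORTER_OTLP_ENDPOINT": endpoint,
        # Header list format defined by OpenTelemetry: key=value[,key=value]
        "OTEL_EXPORTER_OTLP_HEADERS": f"api-key={license_key}",
    }


# A deploy script could merge this into the process environment, for example:
#   subprocess.run(cmd, env={**os.environ, **otlp_env("checkout", key)})
```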

MELT APIs

Using New Relic's MELT APIs is the most direct way to get telemetry data from your environment. Typically this is the last step after deploying agents: it captures the remaining custom telemetry from services that New Relic agents or OpenTelemetry SDKs may not support.

Call New Relic APIs directly from your application for a fully automated pipeline of custom data.
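To make the idea concrete, here is a hedged sketch that builds a single gauge data point for the Metric API without sending it. It assumes the US endpoint `https://metric-api.newrelic.com/metric/v1` and a license key in the `Api-Key` header; the helper name `build_gauge_request` and the example metric name are hypothetical.

```python
import json
import time
import urllib.request
from typing import Dict, Optional

# Assumption: US-region Metric API endpoint.
METRIC_API_URL = "https://metric-api.newrelic.com/metric/v1"


def build_gauge_request(license_key: str, name: str, value: float,
                        attributes: Optional[Dict[str, str]] = None,
                        timestamp: Optional[int] = None) -> urllib.request.Request:
    """Build (but do not send) a request carrying one gauge metric."""
    if timestamp is None:
        timestamp = int(time.time())  # seconds since the Unix epoch
    payload = [{
        "metrics": [{
            "name": name,
            "type": "gauge",
            "value": value,
            "timestamp": timestamp,
            "attributes": attributes or {},
        }]
    }]
    return urllib.request.Request(
        METRIC_API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Api-Key": license_key, "Content-Type": "application/json"},
        method="POST",
    )


# Example: urllib.request.urlopen(
#     build_gauge_request(key, "queue.depth", 42, {"service": "billing"}))
```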

Visualize your data

Entities are an important part of the New Relic ecosystem because they allow the entire observability experience to be automated and tied together with relationships. 

An entity is anything that:

  1. Reports data to New Relic or that contains data that you have access to.
  2. You've identified with a unique entity ID.

New Relic recognizes and displays ingested telemetry data by maintaining a set of rules that identify which type of entity your data is associated with. These rules determine the golden signals for your entities and automatically create visualizations based on that data. The same metrics also run through our anomaly detection, which helps engineers orient themselves more quickly during critical triage periods.

Implement workloads to easily separate team responsibilities

Workloads give you no-code automation: you define a set of rules that dynamically pull in the entities related to a specific service, responsible team, and more. Workloads display useful statistics like golden signals, alerts, and Service Level Indicators (SLIs) for these groups of entities. Here are some examples of workloads:

  • A serverless application that includes an API gateway, a few serverless functions, and a managed database and storage.
  • A browser application and the backend APIs that support it.
  • A collection of Java microservices and the infrastructure they run on.
  • All the work your team will complete during one sprint. 

Workloads can also be created as code through the use of the New Relic Terraform provider.
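The idea of rule-based, dynamic membership can be illustrated in a few lines of plain Python. This is only a conceptual model, not the New Relic API or the Terraform provider: the `Entity` class, the tag names, and `workload_members` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entity:
    """Hypothetical stand-in for a monitored entity and its tags."""
    name: str
    tags: Dict[str, str] = field(default_factory=dict)


def workload_members(entities: List[Entity],
                     required_tags: Dict[str, str]) -> List[Entity]:
    """Return every entity whose tags match all of the rule's key/value pairs."""
    return [
        e for e in entities
        if all(e.tags.get(key) == value for key, value in required_tags.items())
    ]


entities = [
    Entity("checkout-api", {"team": "payments", "env": "prod"}),
    Entity("checkout-db", {"team": "payments", "env": "prod"}),
    Entity("search-api", {"team": "discovery", "env": "prod"}),
]

# The rule is evaluated each time, so newly tagged entities join automatically.
payments_prod = workload_members(entities, {"team": "payments", "env": "prod"})
```

Because membership is derived from a rule rather than a fixed list, tagging a new service is enough to bring it into the group.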

Dashboards

Having a set of standard operating dashboards for every service deployed in your environment is a great organizational goal. New Relic provides observability quick start alerts and dashboards that you can modify to suit your organization's needs. 

After customizing these dashboards to fit what matters most to your teams, you can make them into templates and automate their creation using Terraform or the New Relic CLI.
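As a rough sketch of the templating step, the function below renders one dashboard definition per service as a Python dictionary that can be serialized to JSON. The JSON shape (pages, widgets, `rawConfiguration.nrqlQueries`) is an assumption modeled on exported dashboard JSON, and the function name, widget titles, and layout values are hypothetical.

```python
import json
from typing import Any, Dict


def render_dashboard(service: str, account_id: int) -> Dict[str, Any]:
    """Render a standard three-widget dashboard definition for one service.

    Note: the service name is inserted into NRQL as-is and is not escaped.
    """
    where = f"WHERE appName = '{service}'"

    def widget(title: str, nrql: str, column: int) -> Dict[str, Any]:
        return {
            "title": title,
            "layout": {"column": column, "row": 1, "width": 4, "height": 3},
            "visualization": {"id": "viz.line"},
            "rawConfiguration": {
                "nrqlQueries": [{"accountId": account_id, "query": nrql}]
            },
        }

    return {
        "name": f"{service} - standard operations",
        "permissions": "PUBLIC_READ_WRITE",
        "pages": [{
            "name": "Overview",
            "widgets": [
                widget("Latency",
                       f"SELECT average(duration) FROM Transaction {where} TIMESERIES", 1),
                widget("Traffic",
                       f"SELECT rate(count(*), 1 minute) FROM Transaction {where} TIMESERIES", 5),
                widget("Errors",
                       f"SELECT percentage(count(*), WHERE error IS true) FROM Transaction {where} TIMESERIES", 9),
            ],
        }],
    }


# One file per service, ready to hand to your automation tool of choice:
#   json.dump(render_dashboard("checkout", 1234567), open("checkout.json", "w"))
```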

Custom applications

You can also completely customize your views with New Relic applications and custom visualizations. Automate the visualizations that matter most to you by molding your data to suit your business needs. 

Instead of manually generating a report for your stakeholders every month, you could create a custom application that allows them to examine the data themselves. For example, the Pathpoint application allows you to consolidate and customize each part of your environment. 

New Relic provides several example applications you can deploy to your accounts for free. We also provide React components so you can begin building your own applications on top of the New Relic platform. 

Alerts

Once you've set up visualizations, you need to set up actionable alerts across your tech stack. Chris Trombley wrote a blog post on how to utilize the New Relic Terraform provider to apply golden signal alerts (Latency, Traffic, Errors, and Saturation) across your stack.

The New Relic UI provides a way to quickly add these recommended alerts to your entities. Whichever route you take, we recommend testing your alert conditions manually to make sure they won’t be too noisy.
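One simple way to gauge noise before enabling a condition is to replay a candidate threshold against historical samples and count how many incidents it would have opened. The sketch below uses a deliberately simplified model (evenly spaced samples, an incident opens after a fixed number of consecutive violations); it is not how New Relic evaluates alert conditions, and `count_incidents` is a hypothetical helper.

```python
from typing import Sequence


def count_incidents(samples: Sequence[float], threshold: float,
                    duration: int) -> int:
    """Count incidents a static threshold would have opened.

    An incident opens when `duration` consecutive samples exceed `threshold`;
    it stays open (and is not counted again) until a sample falls back to or
    below the threshold.
    """
    incidents = 0
    streak = 0
    for value in samples:
        if value > threshold:
            streak += 1
            if streak == duration:
                incidents += 1
        else:
            streak = 0
    return incidents


# Hypothetical per-minute error rates exported from a past time window.
error_rates = [0.2, 0.9, 0.9, 0.9, 0.1, 0.9, 0.1, 0.9, 0.9]

noisy = count_incidents(error_rates, threshold=0.5, duration=1)
calmer = count_incidents(error_rates, threshold=0.5, duration=3)
```

Comparing the counts for a few threshold and duration combinations gives a quick feel for which settings would page the team too often.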

Platform governance

Automating user provisioning is the low-hanging fruit for any enterprise application. Take advantage of automated user management by integrating New Relic with your identity provider of choice (such as Azure AD, Okta, OneLogin, or any other provider that supports SCIM). You get the benefit of managing authentication and authorization for your users and groups through one centralized tool.
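Identity providers normally generate provisioning requests for you, but it can help to see what one looks like. The sketch below builds a SCIM 2.0 core User resource as defined in RFC 7643; the function name `build_scim_user` is hypothetical, and the target endpoint and bearer token are intentionally left out because they depend on your account configuration.

```python
import json
from typing import Any, Dict


def build_scim_user(email: str, given_name: str, family_name: str,
                    active: bool = True) -> Dict[str, Any]:
    """Build a SCIM 2.0 core User resource (RFC 7643) for provisioning."""
    return {
        "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],
        "userName": email,
        "name": {"givenName": given_name, "familyName": family_name},
        "emails": [{"value": email, "primary": True}],
        "active": active,
    }


# Serialized body an identity provider would POST to a SCIM /Users endpoint.
body = json.dumps(build_scim_user("ada@example.com", "Ada", "Lovelace"))
```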

We’ve talked about several ways to automate your observability stack. This post is meant to be a starting point in your journey toward making your technology team’s work more efficient.

The most important thing about the observability community is that we are all in it together. That’s why we’ve doubled down on open-source initiatives, making our work transparent and contributing back to the community.