Correlate OpenTelemetry traces, metrics, and logs with Kubernetes performance data

If you’re a developer, your primary focus is probably on your application, not the performance of your infrastructure. Kubernetes blurs the lines between application and infrastructure, which means your cluster and its underlying components can have a major impact on the performance of your applications. 

To truly understand how your K8s cluster impacts your apps, you need one place to analyze and correlate your performance data. However, some monitoring tools isolate application performance data from infrastructure signals. New Relic correlates application-level metrics from OpenTelemetry (OTel) with Kubernetes infrastructure metrics. This makes it easier to view your telemetry data as a whole and to work across teams to reduce mean time to resolution (MTTR) for issues in your Kubernetes environment.

In this blog, you’ll learn how to set up your OTel collector to inject relevant metadata into your traces, metrics, and logs so you can view your OTel data and your Kubernetes performance data in a single UI. The next image shows an example of a dashboard in New Relic that includes both OTel and Kubernetes data.

Getting started

To get started, do the following:

Linking your OpenTelemetry and infrastructure performance data

To link your OpenTelemetry data to your infrastructure performance data, your application needs to inject infrastructure-specific metadata into your telemetry data to power the OpenTelemetry + Kubernetes experience. To do so, you need to do the following:

  • Define environment variables to send telemetry data to the collector on each application container
  • Deploy the OpenTelemetry collector as a daemonset in agent mode with the resourcedetection, resource, batch, and k8sattributes processors to inject relevant metadata about the cluster, deployment, and namespace

These steps will be covered in detail in the next sections.

Send telemetry data to the OpenTelemetry collector

To send telemetry data to the OpenTelemetry collector, you need to add a custom snippet to the env stanza of your Kubernetes YAML file. The example below shows the snippet for a sample front-end microservice with a configuration file called Frontend.yaml.

For each microservice instrumented with OpenTelemetry, add these lines to your manifest’s env stanza:

apiVersion: apps/v1
kind: Deployment
...
spec:
  containers:
  - name: yourfrontendservice
    image: yourfrontendservice-beta
    env:
    # Section 1: Ensure that telemetry data is sent to the collector
    - name: HOST_IP
      valueFrom:
        fieldRef:
          fieldPath: status.hostIP
    # This is picked up by the OpenTelemetry SDKs
    - name: OTEL_EXPORTER_OTLP_ENDPOINT
      value: "http://$(HOST_IP):55680"
    # Section 2: Attach infrastructure-specific metadata
    # Get the pod name and UID so that k8sattributes can tag resources
    - name: POD_NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
    - name: POD_UID
      valueFrom:
        fieldRef:
          fieldPath: metadata.uid
    # This is picked up by the resource detector
    - name: OTEL_RESOURCE_ATTRIBUTES
      value: "service.instance.id=$(POD_NAME),k8s.pod.uid=$(POD_UID)"

The snippet includes two sections.

  • Section one ensures that telemetry data is sent to the collector. It uses the downward API to read the host IP into HOST_IP, then sets the OTEL_EXPORTER_OTLP_ENDPOINT environment variable to that address so the OpenTelemetry SDK exports to the collector agent running on the same node.
  • Section two attaches infrastructure-specific metadata. It uses the downward API to capture metadata.name and metadata.uid and adds them to the OTEL_RESOURCE_ATTRIBUTES environment variable as service.instance.id and k8s.pod.uid. The OpenTelemetry collector’s k8sattributes processor uses k8s.pod.uid to associate telemetry with the right pod and, together with the resourcedetection processor, adds further infrastructure-specific context to telemetry data.

Configure and deploy the OpenTelemetry collector as an agent

You should deploy the collector as an agent on every node in a Kubernetes cluster. The agent can receive telemetry data as well as enrich that data with metadata. For example, the collector can add custom attributes or infrastructure information through processors as well as handle batching, retry, compression, and other more advanced features that are handled less efficiently at the client instrumentation level. 
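As a rough illustration of the agent deployment described above, here is a minimal DaemonSet sketch. It is not the manifest from the original example: the resource names, namespace, image tag, ConfigMap, and mount path are all hypothetical, and it assumes that the collector's OTLP receiver is configured to listen on port 55680 so that it matches the endpoint set in the application manifest.

```yaml
# Sketch only: names, namespace, image tag, and ConfigMap are hypothetical.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: otel-collector-agent
  namespace: default
spec:
  selector:
    matchLabels:
      app: otel-collector-agent
  template:
    metadata:
      labels:
        app: otel-collector-agent
    spec:
      # Service account that holds the RBAC permissions k8sattributes needs
      serviceAccountName: otel-collector
      containers:
      - name: otel-collector
        # The contrib distribution includes the k8sattributes and
        # resourcedetection processors
        image: otel/opentelemetry-collector-contrib:latest
        args: ["--config=/conf/collector.yaml"]
        ports:
        - containerPort: 55680   # assumed to match the OTLP receiver's port
          hostPort: 55680        # lets pods reach the agent at $(HOST_IP):55680
        volumeMounts:
        - name: collector-config
          mountPath: /conf
      volumes:
      - name: collector-config
        configMap:
          name: otel-collector-config   # hypothetical; holds collector.yaml
```

A DaemonSet schedules one collector pod per node, and the hostPort mapping is what makes the node's IP address a valid destination for the application pods on that node.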

For help configuring the collector, see the sample collector configuration file below, along with sections about setting up these options:

  • OTLP exporter
  • batch processor
  • resourcedetection processor
  • k8sattributes processor
---
receivers:
  otlp:
    protocols:
      grpc:

processors:
  batch:
  # Declared here because the metrics pipeline below references it
  cumulativetodelta:
  resource:
    attributes:
      - key: host.id
        from_attribute: host.name
        action: upsert
  resourcedetection:
    detectors: [ gke, gce ]
  k8sattributes:
    auth_type: "serviceAccount"
    passthrough: false
    filter:
      node_from_env_var: KUBE_NODE_NAME
    extract:
      metadata:
        - k8s.pod.name
        - k8s.pod.uid
        - k8s.deployment.name
        - k8s.cluster.name
        - k8s.namespace.name
        - k8s.node.name
        - k8s.pod.start_time
    pod_association:
      - from: resource_attribute
        name: k8s.pod.uid

exporters:
  otlp:
    endpoint: $OTEL_EXPORTER_OTLP_ENDPOINT
    headers:
      api-key: $NEW_RELIC_API_KEY
  logging:
    logLevel: DEBUG

service:
  pipelines:
    metrics:
      receivers: [ otlp ]
      processors: [ resourcedetection, k8sattributes, resource, cumulativetodelta, batch ]
      exporters: [ otlp ]
    traces:
      receivers: [ otlp ]
      processors: [ resourcedetection, k8sattributes, resource, batch ]
      exporters: [ otlp ]
    logs:
      receivers: [ otlp ]
      processors: [ resourcedetection, k8sattributes, resource, batch ]
      exporters: [ otlp ]

Let’s walk through each part of this configuration file.

1. Configure the OTLP exporter
First, add an OTLP exporter to your OpenTelemetry collector configuration YAML file along with your New Relic API key as a header.

---
exporters:
  otlp:
    endpoint: $OTEL_EXPORTER_OTLP_ENDPOINT
    headers:
      api-key: $NEW_RELIC_API_KEY
---
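The two variables referenced by the exporter have to be set on the collector's own container, where OTEL_EXPORTER_OTLP_ENDPOINT points at New Relic rather than at the agent. The fragment below is one possible way to supply them. The Secret name and key are hypothetical, and the endpoint shown is an assumption (New Relic's US OTLP endpoint), so confirm the correct value for your account's region.

```yaml
# Fragment of the collector container spec (sketch, not a complete manifest).
env:
- name: OTEL_EXPORTER_OTLP_ENDPOINT
  value: "otlp.nr-data.net:4317"   # assumed US endpoint; verify for your region
- name: NEW_RELIC_API_KEY
  valueFrom:
    secretKeyRef:
      name: newrelic-api-key       # hypothetical Secret name
      key: api-key                 # hypothetical key within that Secret
```

Reading the key from a Secret keeps it out of the collector configuration file and out of version control.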

2. Configure the batch processor
The batch processor accepts and batches spans, metrics, and logs. This makes data compression easier and reduces the number of requests from the collector.

---
processors: 
   batch:
---
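The empty batch entry uses the processor's default settings. If you need to tune batching, the processor accepts a few options. The values below are illustrative assumptions, not recommendations from the original example.

```yaml
processors:
  batch:
    timeout: 5s                 # flush at least this often, even if the batch is not full
    send_batch_size: 1000       # flush once this many items are queued
    send_batch_max_size: 2000   # upper bound on the size of a single batch
```

Larger batches generally mean fewer export requests at the cost of slightly higher latency before data reaches the backend.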

3. Configure the resource detection processor
The resourcedetection processor adds host-specific information to the telemetry data being processed through the collector. This example uses the Google Kubernetes Engine (GKE) and Google Compute Engine (GCE) detectors to get Google Cloud-specific metadata, including:

  • cloud.provider ("gcp")
  • cloud.platform ("gcp_compute_engine")
  • cloud.account.id 
  • cloud.region
  • cloud.availability_zone
  • host.id
  • host.image.id
  • host.type
---
processors:
  resourcedetection:
    detectors: [ gke, gce ]
---
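The detector list depends on where the cluster runs. As a sketch for a different environment, the fragment below assumes a cluster on Amazon EKS. The choice of detectors and the option values are assumptions for illustration, and detector names can vary between collector versions, so check the processor documentation for the release you deploy.

```yaml
processors:
  resourcedetection:
    # "env" reads attributes from OTEL_RESOURCE_ATTRIBUTES;
    # "eks" and "ec2" add AWS-specific cloud and host attributes
    detectors: [ env, eks, ec2 ]
    timeout: 2s       # how long to wait for each detector
    override: false   # keep attributes already present on the telemetry
```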

4. Configure the Kubernetes Attributes processor (Collector)

The k8sattributes processor identifies the pods sending telemetry data to the OTel collector agent and extracts metadata from those pods. In this configuration, pods are matched on the k8s.pod.uid resource attribute set in the application manifest. The example below is a basic collector configuration with only a processors section. To deploy the OpenTelemetry collector as a daemonset, read this comprehensive manifest example.

processors:
  k8sattributes:
    auth_type: "serviceAccount"
    passthrough: false
    filter:
      node_from_env_var: KUBE_NODE_NAME
    extract:
      metadata:
        - k8s.pod.name
        - k8s.pod.uid
        - k8s.deployment.name
        - k8s.cluster.name
        - k8s.namespace.name
        - k8s.node.name
        - k8s.pod.start_time
    pod_association:
      - from: resource_attribute
        name: k8s.pod.uid

5. Configure the Kubernetes Attributes processor (RBAC)

Next, you need to add configurations for role-based access control (RBAC). The k8sattributes processor needs get, watch, and list permissions on the pods and namespaces resources included in the configured filters. See this example of how to configure a ClusterRole that gives a ServiceAccount the necessary permissions for all pods and namespaces in the cluster.
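The permissions described above can be sketched as a ServiceAccount, a ClusterRole, and a ClusterRoleBinding. The resource names and the namespace are hypothetical. Depending on the collector version, extracting k8s.deployment.name may also require read access to replicasets in the apps API group, which this sketch does not grant.

```yaml
# Sketch only: the name "otel-collector" and the namespace are hypothetical.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: otel-collector
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector
rules:
- apiGroups: [""]                      # core API group
  resources: ["pods", "namespaces"]
  verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: otel-collector
subjects:
- kind: ServiceAccount
  name: otel-collector
  namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: otel-collector
```

The collector pod then needs serviceAccountName set to this ServiceAccount so that auth_type: "serviceAccount" in the k8sattributes processor can use its token.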

6. Configure the Kubernetes Attributes processor (discovery filter)

When running the collector as an agent, you should apply a discovery filter so that the processor only discovers pods from the host that it’s running on. If you don’t use a filter, resource usage can be unnecessarily high, especially on very large clusters. Once the filter is applied, each processor will only query the Kubernetes API for pods running on its own node.

To set the filter, use the downward API to inject the node name as an environment variable in the Pod Env section of the OpenTelemetry collector agent configuration YAML file. See this repository for an example. The environment variable will be set to the name of the node where the pod is scheduled to run.

spec:
  containers:
  - env:
    - name: KUBE_NODE_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: spec.nodeName

Then you can set the filter in k8sattributes:

k8sattributes:
  filter:
    node_from_env_var: KUBE_NODE_NAME

7. Verify that your configuration is working correctly

If you have successfully linked your OTel data with your Kubernetes data, you should see Kubernetes attributes like k8s.cluster.name and k8s.deployment.name in your spans in New Relic’s distributed tracing UI, as shown in the next image.

For each OpenTelemetry service, you should also see a Kubernetes tab, which displays a curated series of application and cluster performance metrics. Without changing screens, you can see relevant infrastructure metrics including memory, core, and CPU utilization, as well as pod status and container restarts. With this view, your OTel and Kubernetes data are automatically correlated and you don’t need to navigate between OTel services and infrastructure monitoring in New Relic.
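If the Kubernetes attributes do not show up, one way to narrow down the problem is to check what the collector emits before it reaches New Relic. The sample configuration defines a logging exporter but does not attach it to any pipeline. The sketch below, which is an assumption about how you might debug and not part of the original example, adds it to the traces pipeline temporarily.

```yaml
service:
  pipelines:
    traces:
      receivers: [ otlp ]
      processors: [ resourcedetection, k8sattributes, resource, batch ]
      # "logging" writes processed spans to the collector's own log output
      exporters: [ otlp, logging ]
```

With this in place, the collector pod's logs should include the resource attributes attached to each span, which shows whether k8s.pod.uid and the extracted Kubernetes metadata are present. Remove the extra exporter afterwards, since debug-level output can be verbose.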

To reduce your MTTR, you need to quickly identify the root cause of errors and performance issues. With New Relic, you can now correlate your OpenTelemetry traces, logs, and metrics with Kubernetes infrastructure data, surfacing your infrastructure metrics directly in New Relic’s OTel view.