Quickstart: Get Kubernetes insights and alerts instantly for New Relic with Pixie

Microservices in modern, complex cloud-native environments often use a mix of different frameworks and languages. Instrumenting each of these microservices with a separate monitoring tool can take a lot of technical work and time. To make the process easier, we built a new quickstart on New Relic Instant Observability that gives you a bird’s-eye view of golden signals such as latency, error rate, and throughput directly from your Kubernetes cluster. No agent required.

New Relic’s Kubernetes cluster explorer uses Kubernetes agent data to show the status of your cluster, from the control plane to nodes and pods. With the infrastructure agent alone, however, you don’t get many of the application-level insights that come with traditional application performance monitoring (APM) agents. New Relic’s integration with Pixie, called Auto-telemetry with Pixie, fills that gap: it uses eBPF to automatically collect metrics, events, logs, and traces for your Kubernetes clusters, applications, OS, and network layers.

Pixie gets this application-level data without instrumentation by leveraging eBPF, a lightweight, sandboxed virtual machine (VM) inside the Linux kernel that runs BPF bytecode with access to specific kernel resources. eBPF lets you run restricted code when an event is triggered. The event can be a function call in kernel space (kprobes) or in user space (uprobes). Pixie uses both uprobes and kprobes to collect observability data across services and applications.

We built this quickstart specifically for the New Relic integration with Pixie. It includes a dashboard that provides insight into latency, errors, and throughput for all services Pixie observes within the cluster, as well as a set of example alerts. This is meant to be a starting point, and we highly recommend modifying the dashboard’s NRQL queries to fit your particular use cases.

Find function call bottlenecks

Pixie tracks all HTTP calls made by your application, so the latency visualization on the dashboard can use NRQL, New Relic’s query language, to find potential bottlenecks. You can pinpoint particular function calls by looking at each HTTP request’s origin service, target service, method, and duration.
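
To make the idea concrete, here is a minimal Python sketch of the kind of grouping a latency query performs: it averages request duration per origin service, target service, and method, then lists the slowest routes first. The records and field names (origin, target, method, duration_ms) are hypothetical and are not the schema Pixie or New Relic actually uses; on the dashboard itself this work is done by NRQL.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical HTTP span records; the field names are illustrative only.
spans = [
    {"origin": "frontend", "target": "cart", "method": "GET", "duration_ms": 12.0},
    {"origin": "frontend", "target": "cart", "method": "GET", "duration_ms": 18.0},
    {"origin": "frontend", "target": "checkout", "method": "POST", "duration_ms": 240.0},
    {"origin": "checkout", "target": "payments", "method": "POST", "duration_ms": 410.0},
    {"origin": "checkout", "target": "payments", "method": "POST", "duration_ms": 390.0},
]


def slowest_routes(records, top=3):
    """Average duration per (origin, target, method), slowest first."""
    groups = defaultdict(list)
    for r in records:
        groups[(r["origin"], r["target"], r["method"])].append(r["duration_ms"])
    # Sort by average duration, largest first, and keep the top entries.
    ranked = sorted(((mean(d), key) for key, d in groups.items()), reverse=True)
    return [(key, avg) for avg, key in ranked[:top]]


for route, avg_ms in slowest_routes(spans):
    print(route, avg_ms)
```

In this made-up data the checkout-to-payments POST route surfaces first, which is the kind of signal that would tell you where to start looking.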

Locate problem services in your cluster 

The error visualizations on the dashboard sort HTTP calls by status code, showing you the errors and network failures associated with your cluster’s services and connecting those errors to the services that are causing issues. By faceting on service or pod and on the HTTP target, you can locate the exact services that are problematic.
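
As a rough illustration of what faceting does, the Python sketch below counts error responses per service, HTTP target, and status code. The sample records and field names are assumptions made up for this example, and treating any status of 400 or above as an error is a simplification rather than the dashboard's exact definition.

```python
from collections import Counter

# Hypothetical HTTP responses observed in a cluster; field names are illustrative.
responses = [
    {"service": "cart", "target": "/items", "status": 200},
    {"service": "cart", "target": "/items", "status": 500},
    {"service": "payments", "target": "/charge", "status": 503},
    {"service": "payments", "target": "/charge", "status": 503},
    {"service": "frontend", "target": "/", "status": 404},
]


def error_facets(records):
    """Count responses with status >= 400 per (service, target, status)."""
    return Counter(
        (r["service"], r["target"], r["status"])
        for r in records
        if r["status"] >= 400
    )


# most_common() lists the facets with the most errors first.
for facet, count in error_facets(responses).most_common():
    print(facet, count)
```

Grouping on all three fields at once is what lets a single noisy endpoint stand out from a service that is otherwise healthy.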

See the HTTP calls by service 

The throughput section of the dashboard visualizes the number of HTTP calls made to your application, faceted by service. A sudden drop in throughput can be a symptom of many different errors in your application.
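
One simple way to reason about a "sudden drop" is to compare the latest interval against a baseline built from the preceding ones. The Python sketch below does that with hypothetical per-minute request counts; the 50% threshold and the data are assumptions chosen for illustration, not values taken from the quickstart's example alerts.

```python
# Hypothetical requests per minute for each service, oldest first.
requests_per_minute = {
    "cart": [120, 118, 125, 40],
    "payments": [60, 62, 58, 61],
}


def throughput_drops(counts_by_service, threshold=0.5):
    """Flag services whose latest count is below threshold * average of earlier counts."""
    drops = {}
    for service, counts in counts_by_service.items():
        if len(counts) < 2:
            continue  # Not enough history to form a baseline.
        *history, latest = counts
        baseline = sum(history) / len(history)
        if baseline > 0 and latest < threshold * baseline:
            drops[service] = (baseline, latest)
    return drops


for service, (baseline, latest) in throughput_drops(requests_per_minute).items():
    print(f"{service}: baseline {baseline:.1f}/min, latest {latest}/min")
```

A relative threshold like this adapts to each service's normal traffic, whereas a fixed request count would need tuning per service.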

Filter to understand HTTP calls in your cluster

Much like logs, you can query individual HTTP calls in the HTTP spans tab on the dashboard. You can filter by attributes such as entity name and HTTP status code to get deeper insights from your Kubernetes cluster.
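
The filtering described above amounts to keeping only the spans that match every condition you set. Here is a small Python sketch of that logic over hypothetical span records; the field names (entity_name, path, status) are illustrative assumptions, not the attribute names used in the HTTP spans tab.

```python
# Hypothetical HTTP spans; field names are illustrative only.
http_spans = [
    {"entity_name": "cart", "path": "/items", "status": 200},
    {"entity_name": "cart", "path": "/items/42", "status": 404},
    {"entity_name": "payments", "path": "/charge", "status": 503},
    {"entity_name": "payments", "path": "/charge", "status": 200},
]


def filter_spans(records, entity_name=None, min_status=None):
    """Keep spans matching every filter that is set; None means 'do not filter'."""
    return [
        r
        for r in records
        if (entity_name is None or r["entity_name"] == entity_name)
        and (min_status is None or r["status"] >= min_status)
    ]


# Example: only failing calls handled by the payments entity.
for span in filter_spans(http_spans, entity_name="payments", min_status=400):
    print(span)
```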