OpenTelemetry with AWS Lambda

In this blog post, we'll explore how to use OpenTelemetry to instrument and monitor serverless applications running on AWS Lambda. We'll look at the OpenTelemetry SDK, a component that simplifies the configuration and deployment of OpenTelemetry agents for Lambda functions.

What is OpenTelemetry?

OpenTelemetry is an open-source project that provides a unified framework for collecting, processing, and exporting observability data from distributed systems. Observability data includes metrics, traces, and logs that help developers and operators understand the behavior and performance of their applications.

OpenTelemetry supports multiple languages, platforms, and vendors, and offers a vendor-neutral API and SDK for instrumenting applications. It also provides a set of exporters that allow users to send their data to various backends, such as New Relic, Jaeger, Zipkin, AWS X-Ray, and many more.

Why use OpenTelemetry for serverless?

Serverless applications are composed of ephemeral, event-driven functions that run on demand in response to triggers. This makes them scalable, cost-effective, and easy to deploy but also introduces new challenges for observability.

For example, serverless functions have short lifespans and may not have access to persistent storage or network resources. This means traditional methods of collecting and exporting observability data, such as using agents or sidecars, may not work well for serverless functions. Additionally, serverless functions may have complex invocation patterns and dependencies, which make it difficult to trace the end-to-end execution flow and identify bottlenecks or errors.

OpenTelemetry addresses these challenges by providing a lightweight and flexible solution for instrumenting serverless functions. With OpenTelemetry you can:

  • Make minimal code changes to your functions to enable automatic instrumentation of common libraries and frameworks.
  • Configure the sampling rate and span attributes to control the amount and quality of trace data collected.
  • Choose from a variety of exporters to send your data to the backend of your choice.
  • Leverage the OpenTelemetry Lambda Collector to simplify the deployment and configuration of OpenTelemetry agents for Lambda functions (not covered in this blog).
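To illustrate the sampling-rate point in the list above, here is a minimal sketch of reading a sampling ratio from the environment. The helper name parseSamplingRatio is hypothetical and not part of any OpenTelemetry package. OTEL_TRACES_SAMPLER_ARG is the variable the OpenTelemetry specification defines for a sampler argument, and using it this way is an assumption, since this post does not configure a sampler.

```javascript
// Hypothetical helper (not an OpenTelemetry API): turn a raw environment
// value into a sampling ratio between 0 and 1.
// Falls back to `fallback` (sample everything by default) when the value
// is missing, empty, or not a number.
function parseSamplingRatio(rawValue, fallback = 1) {
  if (rawValue === undefined || rawValue === null) return fallback;
  if (String(rawValue).trim() === "") return fallback;

  const ratio = Number(rawValue);
  if (!Number.isFinite(ratio)) return fallback;

  // Clamp out-of-range values into [0, 1].
  return Math.min(1, Math.max(0, ratio));
}

// Assumption: the ratio is supplied through OTEL_TRACES_SAMPLER_ARG.
const samplingRatio = parseSamplingRatio(process.env.OTEL_TRACES_SAMPLER_ARG);
console.log(`Sampling ratio: ${samplingRatio}`);

// The result could then be handed to a ratio-based sampler, for example
// TraceIdRatioBasedSampler from @opentelemetry/sdk-trace-base, through the
// `sampler` option of the tracer provider.
```

Clamping and falling back instead of throwing keeps a mistyped value from breaking the function at startup, at the cost of silently sampling everything.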

We'll focus on the automatic instrumentation of the Lambda function using the OpenTelemetry libraries and SDKs for Node.js.

Pre-requisites

What do you need to follow this guide successfully?

  • A New Relic ingest license key. (Don't have an account? Sign up for a free New Relic account.)
  • Basic knowledge of OpenTelemetry.

Libraries overview

These are the key libraries and APIs for instrumenting an AWS Lambda function on the Node.js runtime.

@opentelemetry/instrumentation-aws-lambda: This module provides automatic instrumentation for AWS Lambda handlers, and it's included in the @opentelemetry/auto-instrumentations-node bundle as a plug-in. We need it to instrument our AWS Lambda functions.

@opentelemetry/sdk-trace-node: This module provides automated instrumentation and tracing for Node.js applications. It exposes an API called NodeTracerProvider, which can be used with the registerInstrumentations API from @opentelemetry/instrumentation to load individual instrumentation plugins.

OTLPTraceExporter: This is a common API provided in the @opentelemetry/exporter-trace-otlp-http and @opentelemetry/exporter-trace-otlp-proto packages for exporting the collected telemetry to a collector or a backend of your choice. We'll use New Relic's native OpenTelemetry Protocol (OTLP) endpoint to export our telemetry data.
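As a rough sketch of what the exporter needs, the snippet below builds the options object (URL and headers) from environment values before it is handed to OTLPTraceExporter. The helper name buildOtlpTraceOptions is hypothetical. The variable names OTEL_EXPORTER_OTLP_ENDPOINT and NR_LICENSE and the api-key header match the wrapper code later in this post, and /v1/traces is the standard OTLP/HTTP path for traces. The endpoint and key in the usage lines are placeholders.

```javascript
// Hypothetical helper: build the options for an OTLP/HTTP trace exporter
// from an environment-like object, failing fast on missing values.
function buildOtlpTraceOptions(env) {
  const endpoint = env.OTEL_EXPORTER_OTLP_ENDPOINT;
  const licenseKey = env.NR_LICENSE;

  if (!endpoint) throw new Error("OTEL_EXPORTER_OTLP_ENDPOINT is not set");
  if (!licenseKey) throw new Error("NR_LICENSE is not set");

  // Drop trailing slashes so the traces path is not doubled up.
  const base = endpoint.replace(/\/+$/, "");

  return {
    url: `${base}/v1/traces`,
    headers: { "api-key": licenseKey },
  };
}

// Usage with placeholder values (not a real endpoint or key).
const exporterOptions = buildOtlpTraceOptions({
  OTEL_EXPORTER_OTLP_ENDPOINT: "https://otlp.example.com/",
  NR_LICENSE: "placeholder-license-key",
});
console.log(exporterOptions.url);
```

Failing fast here surfaces a missing key at cold start rather than as silently dropped exports later.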

Instrumenting our serverless functions with OpenTelemetry

In this example, we're using the Serverless Framework to quickly set up the Lambda function along with an API Gateway for the entry point. The Lambda function is a simple Koa REST API with a few functional endpoints.

Step 1: Install packages

First, install the packages required for the Node.js runtime on the Lambda function. Copy the command below and run it in the root of your Node app:

npm install --save \
  @opentelemetry/auto-instrumentations-node \
  @opentelemetry/sdk-trace-node \
  @opentelemetry/exporter-trace-otlp-proto

Step 2: Configure SDK

Next, create a new file named otel-wrapper.js in the root or src folder of your function in the Serverless Framework project. Then copy and paste the following code into the newly created file:

const { NodeTracerProvider } = require("@opentelemetry/sdk-trace-node");
const { registerInstrumentations } = require("@opentelemetry/instrumentation");

const {
  getNodeAutoInstrumentations,
} = require("@opentelemetry/auto-instrumentations-node");
const { BatchSpanProcessor } = require("@opentelemetry/sdk-trace-base");

const { Resource } = require("@opentelemetry/resources");
const {
  SemanticResourceAttributes,
} = require("@opentelemetry/semantic-conventions");

const {
  OTLPTraceExporter,
} = require("@opentelemetry/exporter-trace-otlp-proto");

const {
  CompositePropagator,
  W3CBaggagePropagator,
  W3CTraceContextPropagator,
} = require("@opentelemetry/core");

// For troubleshooting, set the log level to DiagLogLevel.DEBUG
const { diag, DiagConsoleLogger, DiagLogLevel } = require("@opentelemetry/api");
diag.setLogger(new DiagConsoleLogger(), DiagLogLevel.WARN);

const COLLECTOR_STRING = `${process.env.OTEL_EXPORTER_OTLP_ENDPOINT}/v1/traces`;

/**
 * The `newRelicExporter` is an instance of OTLPTraceExporter
 * configured to send traces to New Relic's OTLP-compatible backend.
 * Make sure you have added your New Relic Ingest License to NR_LICENSE env-var
 */
const newRelicExporter = new OTLPTraceExporter({
  url: COLLECTOR_STRING,
  headers: {
    "api-key": `${process.env.NR_LICENSE}`,
  },
});

const provider = new NodeTracerProvider({
  resource: new Resource({
    [SemanticResourceAttributes.SERVICE_NAME]: process.env.OTEL_SERVICE_NAME,
  }),
});
provider.addSpanProcessor(
  new BatchSpanProcessor(
    newRelicExporter,
    // Optional BatchSpanProcessor configuration
    {
      // The maximum queue size. After the size is reached spans are dropped.
      maxQueueSize: 1000,
      // The maximum batch size of every export. It must be smaller or equal to maxQueueSize.
      maxExportBatchSize: 500,
      // The interval between two consecutive exports
      scheduledDelayMillis: 500,
      // How long the export can run before it is cancelled
      exportTimeoutMillis: 30000,
    }
  )
);
provider.register({
  propagator: new CompositePropagator({
    propagators: [new W3CBaggagePropagator(), new W3CTraceContextPropagator()],
  }),
});

registerInstrumentations({
  instrumentations: [
    /**
     * AutoInstrumentations META package for Node.js
     * contains bundled libraries for most libraries and frameworks
     */
    getNodeAutoInstrumentations({
      "@opentelemetry/instrumentation-fs": {
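        // Environment variables are always strings, so compare explicitly;
        // the string "false" would otherwise be truthy.
        enabled: process.env.ENABLE_FS_INSTRUMENTATION === "true",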
        requireParentSpan: true,
      },
      // Enable instrumentation for KOA framework
      "@opentelemetry/instrumentation-koa": {
        enabled: true,
      },
      /**
       * Enable instrumentation for AWS Lambda framework
       * Use the requestHook and responseHook to
       * add additional attributes to the traces
       */
      "@opentelemetry/instrumentation-aws-lambda": {
        enabled: true,
        disableAwsContextPropagation: true,
        requestHook: (span, { event, context }) => {
          span.setAttribute("faas.name", context.functionName);

          if (event.requestContext && event.requestContext.http) {
            span.setAttribute(
              "faas.http.method",
              event.requestContext.http.method
            );
            span.setAttribute(
              "faas.http.target",
              event.requestContext.http.path
            );
          }

          if (event.queryStringParameters)
            span.setAttribute(
              "faas.http.queryParams",
              JSON.stringify(event.queryStringParameters)
            );
        },
        responseHook: (span, { err, res }) => {
          if (err instanceof Error)
            span.setAttribute("faas.error", err.message);
          if (res) {
            span.setAttribute("faas.http.status_code", res.statusCode);
          }
        },
      },
    }),
  ],
});

A few things are worth highlighting in otel-wrapper.js. In the configuration for instrumentation-aws-lambda, the property disableAwsContextPropagation is set to true. This skips extraction of the AWS X-Ray parent context, which may conflict with the OpenTelemetry traceparent headers.
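For context, the traceparent header follows the W3C Trace Context format: a version, a trace ID, a parent span ID, and flags, separated by hyphens. The sketch below parses such a header for illustration only. In the setup described here, that work is done by W3CTraceContextPropagator. The helper name parseTraceparent is hypothetical, and the sketch accepts only the four-field layout used by version 00.

```javascript
// Illustration only: the real parsing is handled by the propagator.
// Layout: version (2 hex) - trace ID (32 hex) - parent ID (16 hex) - flags (2 hex)
const TRACEPARENT_PATTERN =
  /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Hypothetical helper: returns the parsed fields, or null if the header is invalid.
function parseTraceparent(header) {
  if (typeof header !== "string") return null;

  const match = TRACEPARENT_PATTERN.exec(header.trim());
  if (!match) return null;

  const [, version, traceId, parentId, flags] = match;

  // Version "ff" and all-zero IDs are invalid per the Trace Context spec.
  if (version === "ff") return null;
  if (/^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;

  return {
    version,
    traceId,
    parentId,
    // The lowest bit of the flags byte is the "sampled" flag.
    sampled: (parseInt(flags, 16) & 0x01) === 0x01,
  };
}

// Sample header value taken from the W3C Trace Context examples.
const parsedHeader = parseTraceparent(
  "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
);
console.log(parsedHeader);
```

When the X-Ray parent is extracted as well, the Lambda span can end up attached to a different parent than the one carried in this header, which is the conflict the setting avoids.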

Many Node.js OpenTelemetry instrumentation libraries offer two hooks, requestHook and responseHook, for adding custom attributes to spans. These hooks can also be used to add attributes conditionally. Take note of the sample responseHook below, which adds an extra span attribute called faas.error containing the message of the error thrown by the code.

responseHook: (span, { err, res }) => {
  if (err instanceof Error)
    span.setAttribute("faas.error", err.message);
  if (res) {
    span.setAttribute("faas.http.status_code", res.statusCode);
  }
},
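One way to check what these hooks produce without deploying anything is to run them against a stand-in span. The sketch below copies the hook logic from otel-wrapper.js into standalone functions and records the attributes they set. The createRecordingSpan helper, the function name demo-function, and the sample event (shaped like an API Gateway HTTP API request) are hypothetical values for illustration.

```javascript
// Minimal stand-in for a span: it only records the attributes set on it.
function createRecordingSpan() {
  const attributes = {};
  return {
    attributes,
    setAttribute(key, value) {
      attributes[key] = value;
      return this;
    },
  };
}

// Same logic as the requestHook in otel-wrapper.js.
function requestHook(span, { event, context }) {
  span.setAttribute("faas.name", context.functionName);

  if (event.requestContext && event.requestContext.http) {
    span.setAttribute("faas.http.method", event.requestContext.http.method);
    span.setAttribute("faas.http.target", event.requestContext.http.path);
  }

  // Only present when the request carries query parameters.
  if (event.queryStringParameters) {
    span.setAttribute(
      "faas.http.queryParams",
      JSON.stringify(event.queryStringParameters)
    );
  }
}

// Same logic as the responseHook in otel-wrapper.js.
function responseHook(span, { err, res }) {
  if (err instanceof Error) span.setAttribute("faas.error", err.message);
  if (res) span.setAttribute("faas.http.status_code", res.statusCode);
}

// Hypothetical sample event and context for a request to /weather.
const sampleEvent = {
  requestContext: { http: { method: "GET", path: "/weather" } },
  queryStringParameters: { location: "bangalore" },
};
const sampleContext = { functionName: "demo-function" };

const recordedSpan = createRecordingSpan();
requestHook(recordedSpan, { event: sampleEvent, context: sampleContext });
responseHook(recordedSpan, { res: { statusCode: 200 } });
console.log(recordedSpan.attributes);
```

Keeping the hook bodies as plain functions like this makes the conditional attributes easy to unit test, since they only depend on the event, the context, and a setAttribute method.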

Step 3: Configure environment

The code in otel-wrapper.js refers to several environment variables, which we need to define in our function configuration. Since we're using the Serverless Framework in this example, we add them to the serverless.yml configuration file.

Most importantly, to start the function with the OpenTelemetry instrumentation, we need to launch it a little differently. Notice the NODE_OPTIONS environment variable in the list below. Its --require flag tells Node.js to preload otel-wrapper.js before the handler module is loaded, so the instrumentation is registered before our function code runs.

functions:
  api:
    handler: index.handler
    environment:
      NODE_OPTIONS: "--require ./otel-wrapper"
      OTEL_SERVICE_NAME: "Node-Lambda-Otel-SDK"
      # External API for distributed traces
      EXPRESS_OTEL_API_ENDPOINT: "http://3.230.230.121/v3/api"
      # License key stored in an SSM parameter
      NR_LICENSE: ${ssm:/nr_experiments_ingest_key}
      OTEL_EXPORTER_OTLP_ENDPOINT: https://otlp.nr-data.net:4317
      # Disable FS instrumentation to reduce noise
      ENABLE_FS_INSTRUMENTATION: "false"
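Because a missing or mistyped variable only shows up at runtime, it can help to validate the environment when the wrapper loads. The sketch below is one possible approach, not part of the original wrapper. The helper names findMissingVars and isFlagEnabled are hypothetical, and the list of required variables is taken from the names used in otel-wrapper.js.

```javascript
// Variables that otel-wrapper.js reads; names taken from the code above.
const REQUIRED_VARS = [
  "OTEL_SERVICE_NAME",
  "OTEL_EXPORTER_OTLP_ENDPOINT",
  "NR_LICENSE",
];

// Hypothetical helper: list the required variables that are unset or blank.
function findMissingVars(env, required = REQUIRED_VARS) {
  return required.filter(
    (name) => env[name] === undefined || String(env[name]).trim() === ""
  );
}

// Hypothetical helper: environment values are always strings, so a flag
// such as ENABLE_FS_INSTRUMENTATION must be compared explicitly.
// Boolean("false") is true, which is rarely what is intended.
function isFlagEnabled(value) {
  return String(value).trim().toLowerCase() === "true";
}

const missingVars = findMissingVars(process.env);
if (missingVars.length > 0) {
  console.warn(`Missing environment variables: ${missingVars.join(", ")}`);
}
console.log(
  `FS instrumentation enabled: ${isFlagEnabled(process.env.ENABLE_FS_INSTRUMENTATION)}`
);
```

Logging a warning rather than throwing keeps the function serving requests even when telemetry is misconfigured. Whether that trade-off is right depends on how critical tracing is for your service.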

Step 4: Explore data

Once your function is deployed, it's time to explore the telemetry data from the Lambda in the New Relic platform.

Lambda requests metrics from Spans

Attributes from Distributed Traces

Look at the highlighted attributes in the screenshot above: faas.http.method, faas.http.target, and faas.name. These attributes were added by the requestHook method. Not every attribute appears in this trace, because some are set conditionally, depending on the incoming request.

Our Lambda has two endpoints exposed through API Gateway, /path and /weather. Let's take a look at the traces from the second endpoint, which is /weather?location=bangalore.

The /weather endpoint in the Lambda function calls an external entity that's also instrumented with OpenTelemetry. As a result, the distributed traces capture both entities, and the New Relic platform automatically generates a service map that shows the complete journey of the request.

Lambda distributed traces and service map

In this specific trace, the additional attribute faas.http.queryParams is now visible in the list of attributes. Refer to the screenshot below.

Conditional attributes from requestHook

When exploring the spans in the New Relic UI, you can also check the attributes of the called entity from the distributed traces screen.

Attributes from the called OTEL entity

Conclusion

We've seen how OpenTelemetry can help us instrument and monitor serverless applications running on AWS Lambda. By following the steps in this blog post, you can gain insight into the performance, behavior, and dependencies of your Lambda functions. Exporting the traces to New Relic also lets you take advantage of platform features such as service map visualizations, transactions, and alerting.