Monitor Apollo Server with OpenTelemetry

Apollo Server is one of the most popular frameworks for building APIs based on GraphQL, a query language for APIs that includes a server-side runtime for executing queries. One of the benefits of GraphQL is that it makes it easier to evolve your APIs.

But that flexibility comes with a performance cost when you run complex queries against your Apollo Server. In this blog post, you’ll learn how to use OpenTelemetry and New Relic to measure the performance of both federated and non-federated Apollo setups at every phase of the query cycle.

Every GraphQL query goes through three phases:

Parse: A query is parsed into an abstract syntax tree (AST).
Validate: The AST is validated against the schema by checking the query syntax and its fields.
Execute: Starting from the root, the runtime walks through the AST, invokes appropriate resolvers of your server, collects results, and returns a JSON response.

Performing large queries (or mutations) on large GraphQL documents can degrade the performance of one or all of these phases.
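
To make the three phases concrete, here is a minimal sketch that runs and times each one by hand. The function name timeGraphQLPhases is hypothetical, and the gql argument is assumed to be the graphql npm package (or any object that exposes parse, validate, and execute with the same shapes). The OpenTelemetry instrumentation used later in this post records this kind of timing for you as spans, so treat this as an illustration rather than something you need to ship.

```javascript
// Hypothetical helper: runs parse -> validate -> execute and records
// how long each phase took, in milliseconds.
// `gql` is assumed to look like the `graphql` package:
//   parse(source) -> document
//   validate(schema, document) -> array of errors
//   execute({ schema, document }) -> result (or a promise of one)
async function timeGraphQLPhases(gql, schema, source) {
  const timings = {};

  // Time a single phase, recording the duration even if it throws.
  const timed = async (phase, fn) => {
    const start = process.hrtime.bigint();
    try {
      return await fn();
    } finally {
      timings[phase] = Number(process.hrtime.bigint() - start) / 1e6;
    }
  };

  const document = await timed("parse", () => gql.parse(source));
  const errors = await timed("validate", () => gql.validate(schema, document));
  if (errors.length > 0) {
    // An invalid query never reaches the execute phase.
    return { errors, timings };
  }
  const result = await timed("execute", () => gql.execute({ schema, document }));
  return { result, timings };
}
```

With the graphql package installed, you could call it as timeGraphQLPhases(require("graphql"), schema, "{ books { title } }"), where schema is your own executable schema and the query is only an example.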

Before getting started, let’s take a look at some of the dashboards you’ll get in New Relic after you’ve set up monitoring for your Apollo server with OpenTelemetry and New Relic.

The next image shows a curated view of your traces and spans in New Relic.

The curated view of traces in New Relic includes trace count, duration, errors, and trace groups sorted by average span duration.

You can also find performance details for a particular query, such as duration and throughput, along with attributes like the query type, name, HTTP status code and text, error message, host information, port, and URL, as well as GraphQL-specific spans.

Trace view shows spans with detailed information about each span, including its statistics and related data.

Finally, you can see exactly how queries are being resolved, so if a query runs through multiple resolvers you can easily see details on how each field is being resolved at each resolver level.

Trace view shows spans correlated with each phase of an Apollo query request cycle.

Set up OpenTelemetry with GraphQL

OpenTelemetry is a vendor-neutral standard for generating, exporting, and collecting application, infrastructure, and network telemetry data. New Relic is the third-largest contributor to OpenTelemetry’s open source projects.

To follow along with this tutorial, you’ll need a running Apollo Server. If you don’t have one, you can grab sample code from this repository.

1. Install OpenTelemetry

The Apollo Server documentation does a great job of explaining how to set up OpenTelemetry with your GraphQL server. We'll follow those instructions with slight modifications.

First, let’s install the required libraries:

npm install \
 @opentelemetry/api \
 @opentelemetry/core \
 @opentelemetry/resources \
 @opentelemetry/sdk-trace-base \
 @opentelemetry/sdk-trace-node \
 @opentelemetry/instrumentation \
 @opentelemetry/instrumentation-http \
 @opentelemetry/instrumentation-express

The next package is only required if you’re instrumenting a subgraph or a monolith. For a federated Apollo gateway, you should skip installing it.

@opentelemetry/instrumentation-graphql

This section uses a monolithic application as an example, so we’re including the package in our installation. Federated Apollo instrumentation is covered later in this blog post.

2. Set up and test your configuration

Next, you need to create a file called open-telemetry.js, which looks like this:

// open-telemetry.js
const { Resource } = require("@opentelemetry/resources");
const {
 SimpleSpanProcessor,
 ConsoleSpanExporter,
} = require("@opentelemetry/sdk-trace-base");
const { NodeTracerProvider } = require("@opentelemetry/sdk-trace-node");
const { registerInstrumentations } = require("@opentelemetry/instrumentation");
const { HttpInstrumentation } = require("@opentelemetry/instrumentation-http");
const {
 ExpressInstrumentation,
} = require("@opentelemetry/instrumentation-express");
// **DELETE IF SETTING UP A GATEWAY, KEEP OTHERWISE**
const {
 GraphQLInstrumentation,
} = require("@opentelemetry/instrumentation-graphql");
 
// Register server-related instrumentation
registerInstrumentations({
 instrumentations: [
   new HttpInstrumentation(),
   new ExpressInstrumentation(),
   // **DELETE IF SETTING UP A GATEWAY, KEEP OTHERWISE**
   new GraphQLInstrumentation(),
 ],
});
 
// Initialize provider and identify this particular service
// (in this case, we're implementing a single server/service)
const provider = new NodeTracerProvider({
 resource: Resource.default().merge(
   new Resource({
     // Replace with any string to identify this service in your system
     "service.name": "apollo-newrelic-otel",
   })
 ),
});
 
// Configure a test exporter to print all traces to the console
const consoleExporter = new ConsoleSpanExporter();
// Test out by printing data to the console (for debugging purposes)
provider.addSpanProcessor(new SimpleSpanProcessor(consoleExporter));
 
// Register the provider to begin tracing
provider.register();

The end of the file creates a ConsoleSpanExporter and a SimpleSpanProcessor, which log each finished span to the console during development so you can confirm everything works as expected.

You also need to require that file with require("./open-telemetry"); at the top of your Apollo Server configuration file as shown in the GitHub repo.

Finally, start your Apollo Server and send a query (or mutation). You should see an OpenTelemetry payload similar to this in your console:

Span {
 attributes: {
   'http.url': 'http://localhost:4000/',
   'http.host': 'localhost:4000',
   'net.host.name': 'localhost',
   'http.method': 'POST',
   'http.target': '/',
   'http.user_agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36',
   'http.request_content_length_uncompressed': 234,
   'http.flavor': '1.1',
   'net.transport': 'ip_tcp',
   'net.host.ip': '::1',
   'net.host.port': 4000,
   'net.peer.ip': '::1',
   'net.peer.port': 56810,
   'http.status_code': 200,
   'http.status_text': 'OK',
   'http.route': ''
 },
 links: [],
 events: [],
 status: { code: 1 },
 endTime: [ 1662122730, 865936797 ],
 _ended: true,
 _duration: [ 0, 3919162 ],
 name: 'HTTP POST',
 _spanContext: {
   traceId: '9766a8be2d31a3e202bfd96915c4569e',
   spanId: '8364ca9c4b2c51aa',
   traceFlags: 1,
   traceState: undefined
 },
 parentSpanId: undefined,
 kind: 1,
 startTime: [ 1662122730, 862017635 ],
 resource: Resource { attributes: [Object] },
 instrumentationLibrary: { name: '@opentelemetry/instrumentation-http', version: '0.27.0' },
 _spanLimits: {
   attributeValueLengthLimit: Infinity,
   attributeCountLimit: 128,
   linkCountLimit: 128,
   eventCountLimit: 128
 },
 _spanProcessor: MultiSpanProcessor { _spanProcessors: [Array] },
 _attributeValueLengthLimit: Infinity
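
A few fields in that console output are easier to read once decoded. The timestamps are [seconds, nanoseconds] pairs, and kind and status.code are numeric enum values. The sketch below is only an illustration: summarizeSpan is a hypothetical helper, and the two lookup arrays are written out by hand to mirror the SpanKind and SpanStatusCode enums in @opentelemetry/api, so it runs without any packages installed.

```javascript
// Numeric enum values as defined by @opentelemetry/api, copied by hand.
const SPAN_KIND = ["INTERNAL", "SERVER", "CLIENT", "PRODUCER", "CONSUMER"];
const STATUS_CODE = ["UNSET", "OK", "ERROR"];

// Difference between two [seconds, nanoseconds] pairs, in milliseconds.
function hrTimeDiffMs(start, end) {
  return (end[0] - start[0]) * 1000 + (end[1] - start[1]) / 1e6;
}

// Hypothetical helper: reduce a printed span to the fields you usually care about.
function summarizeSpan(span) {
  return {
    name: span.name,
    kind: SPAN_KIND[span.kind],
    status: STATUS_CODE[span.status.code],
    durationMs: hrTimeDiffMs(span.startTime, span.endTime),
  };
}

// Values copied from the console output above.
const sample = {
  name: "HTTP POST",
  kind: 1,
  status: { code: 1 },
  startTime: [1662122730, 862017635],
  endTime: [1662122730, 865936797],
};

console.log(summarizeSpan(sample));
```

For the sample span, this reports a SERVER span with status OK and a duration of roughly 3.9 ms, which matches the _duration value of [ 0, 3919162 ] in the output.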

You're done setting up OpenTelemetry. The next step is to integrate New Relic to get curated dashboards.

Set up New Relic

To get started, you need a free New Relic account and license key.

You also need one more Node module. You can install it using this command:

npm install @opentelemetry/exporter-trace-otlp-proto

This is an OpenTelemetry exporter that will send data to the New Relic collector.

Next, you need to modify your open-telemetry.js file by requiring modules from the installed libraries:

const { OTLPTraceExporter } = require("@opentelemetry/exporter-trace-otlp-proto");
const { BatchSpanProcessor } = require("@opentelemetry/sdk-trace-base");

Once you’ve successfully tested the OpenTelemetry payload, you can remove the ConsoleSpanExporter and SimpleSpanProcessor and replace them with the OTLPTraceExporter. The configuration should look like this:

const collectorOptions = {
 url: "https://otlp.nr-data.net:4318/v1/traces",
 headers: {
   "api-key": "{{Your New Relic License Ingest Key}}",
 },
 concurrencyLimit: 10,
};
 
// Configure an exporter that pushes all traces to a collector.
// In this case, it's New Relic's collector, set up with collectorOptions.
const collectorTraceExporter = new OTLPTraceExporter(collectorOptions);
provider.addSpanProcessor(
 new BatchSpanProcessor(collectorTraceExporter, {
   maxQueueSize: 1000,
   scheduledDelayMillis: 1000,
 })
);

The collectorOptions object contains the New Relic collector url and headers properties. You’ll need to replace {{Your New Relic License Ingest Key}} with your own New Relic license key to successfully send data to New Relic.

Depending on your data center location (EU or US) and telemetry data type, you might want to modify the url property according to the documentation on OpenTelemetry setup for services.
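
If you would rather not hardcode the key and endpoint, one option is to build collectorOptions from environment variables. This is a sketch under assumptions: the variable names NEW_RELIC_LICENSE_KEY and NEW_RELIC_REGION and the function buildCollectorOptions are made up for this example (the exporter does not read them on its own), and the two URLs are the US and EU trace endpoints that appear in this post's code samples.

```javascript
// Trace endpoints as used elsewhere in this post.
const OTLP_TRACE_ENDPOINTS = {
  US: "https://otlp.nr-data.net:4318/v1/traces",
  EU: "https://otlp.eu01.nr-data.net:4318/v1/traces",
};

// Hypothetical helper: builds the options object passed to OTLPTraceExporter.
// NEW_RELIC_LICENSE_KEY and NEW_RELIC_REGION are assumed variable names.
function buildCollectorOptions(env = process.env) {
  const licenseKey = env.NEW_RELIC_LICENSE_KEY;
  if (!licenseKey) {
    throw new Error("NEW_RELIC_LICENSE_KEY is not set");
  }

  const region = (env.NEW_RELIC_REGION || "US").toUpperCase();
  if (!Object.prototype.hasOwnProperty.call(OTLP_TRACE_ENDPOINTS, region)) {
    throw new Error(`Unknown NEW_RELIC_REGION: ${region}`);
  }

  return {
    url: OTLP_TRACE_ENDPOINTS[region],
    headers: { "api-key": licenseKey },
    concurrencyLimit: 10,
  };
}
```

You would then pass the result to the exporter with new OTLPTraceExporter(buildCollectorOptions()). Failing fast on a missing key tends to be easier to diagnose than a rejected export later on.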

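The code sample also includes other configuration options that you can adjust according to your needs: concurrencyLimit caps the number of export requests in flight, maxQueueSize caps the number of spans buffered for export, and scheduledDelayMillis sets the delay between batch exports.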
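
To get a feel for what the queue setting means in practice, here is a toy model of a batching queue. It is not the SDK's implementation, and ToyBatchQueue is a made-up class. It only mimics two behaviors: spans that arrive while the queue is full are dropped, and each scheduled export takes at most one batch off the queue. The batch size corresponds to the BatchSpanProcessor option maxExportBatchSize, which the configuration in this post leaves at its default.

```javascript
// Toy model only: mimics how a bounded, batching span queue behaves.
class ToyBatchQueue {
  constructor({ maxQueueSize, maxExportBatchSize }) {
    this.maxQueueSize = maxQueueSize;
    this.maxExportBatchSize = maxExportBatchSize;
    this.queue = [];
    this.dropped = 0;
  }

  // Called when a span ends: buffer it, or drop it if the queue is full.
  add(span) {
    if (this.queue.length >= this.maxQueueSize) {
      this.dropped += 1;
      return false;
    }
    this.queue.push(span);
    return true;
  }

  // Called once per scheduled delay: remove and return up to one batch.
  takeBatch() {
    return this.queue.splice(0, this.maxExportBatchSize);
  }
}

const queue = new ToyBatchQueue({ maxQueueSize: 3, maxExportBatchSize: 2 });
for (const name of ["a", "b", "c", "d", "e"]) {
  queue.add(name);
}

console.log(queue.dropped);      // 2 ("d" and "e" arrived while the queue was full)
console.log(queue.takeBatch());  // [ 'a', 'b' ]
console.log(queue.queue.length); // 1
```

The takeaway is that a queue that is too small for your traffic, or a delay that is too long, can lead to dropped spans, so it's worth revisiting these values if traces look incomplete.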

When you’re finished, your open-telemetry.js file should look like this:

const { Resource } = require("@opentelemetry/resources");
const { BatchSpanProcessor } = require("@opentelemetry/sdk-trace-base");
const { NodeTracerProvider } = require("@opentelemetry/sdk-trace-node");
const { registerInstrumentations } = require("@opentelemetry/instrumentation");
const { HttpInstrumentation } = require("@opentelemetry/instrumentation-http");
const {
 ExpressInstrumentation,
} = require("@opentelemetry/instrumentation-express");
// **DELETE IF SETTING UP A GATEWAY, KEEP OTHERWISE**
const {
 GraphQLInstrumentation,
} = require("@opentelemetry/instrumentation-graphql");
const {
 OTLPTraceExporter,
} = require("@opentelemetry/exporter-trace-otlp-proto");
 
// Register server-related instrumentation
registerInstrumentations({
 instrumentations: [
   new HttpInstrumentation(),
   new ExpressInstrumentation(),
   // **DELETE IF SETTING UP A GATEWAY, KEEP OTHERWISE**
   new GraphQLInstrumentation(),
 ],
});
 
// Initialize provider and identify this particular service
// (in this case, we're implementing a single server/service)
const provider = new NodeTracerProvider({
 resource: Resource.default().merge(
   new Resource({
     // Replace with any string to identify this service in your system
     "service.name": "apollo-newrelic-otel",
   })
 ),
});
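// Set collector options (for New Relic). This example uses the EU endpoint;
// US accounts use https://otlp.nr-data.net:4318/v1/traces instead.
const collectorOptions = {
 url: "https://otlp.eu01.nr-data.net:4318/v1/traces",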
 headers: {
   "api-key": "{{Your New Relic License Ingest Key}}",
 },
 concurrencyLimit: 10,
};
 
// Configure an exporter that pushes all traces to a collector.
// In this case, it's New Relic's collector, set up with collectorOptions.
const collectorTraceExporter = new OTLPTraceExporter(collectorOptions);
provider.addSpanProcessor(
 new BatchSpanProcessor(collectorTraceExporter, {
   maxQueueSize: 1000,
   scheduledDelayMillis: 1000,
 })
);
 
// Register the provider to begin tracing
provider.register();

If you aren’t using federated Apollo, your setup is complete and you can skip the next section. However, if you are using federated Apollo, you’ll need to include additional OpenTelemetry configuration files.

Set up the federated Apollo integration

Apollo Federation is an architecture for creating a supergraph that combines multiple GraphQL APIs. In this architecture, your individual GraphQL APIs are called subgraphs and you compose them into a supergraph. The only entry point to your application is a gateway, also known as a graph router, which accepts requests from clients and calls individual APIs based on a client’s query.

The previous image shows an example of federated Apollo architecture. There are two services (subgraphs) returning books data and authors data, and these services are combined into a single endpoint using the graph router. You can find a working sample in the GitHub repository.

In this case, each service, as well as the gateway, has to have its own OpenTelemetry setup. For subgraphs, you can simply copy the open-telemetry.js setup from the previous section to each of your subgraphs and just change the name of the service. You can find an example file in the repository.

For the gateway, you also need to remove new GraphQLInstrumentation() from the instrumentations section of the file, along with the matching require of @opentelemetry/instrumentation-graphql.
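
If you want to share one open-telemetry.js file between the gateway and the subgraphs instead of maintaining edited copies, you could select the instrumentations by role. The sketch below is one way to do that, not the approach used in the sample repository. The role names and both helper functions are hypothetical. The package and class names are the ones already used in this post, and they are only loaded when createInstrumentations is called.

```javascript
// Package name and exported class for each instrumentation used in this post.
const INSTRUMENTATIONS = {
  http: ["@opentelemetry/instrumentation-http", "HttpInstrumentation"],
  express: ["@opentelemetry/instrumentation-express", "ExpressInstrumentation"],
  graphql: ["@opentelemetry/instrumentation-graphql", "GraphQLInstrumentation"],
};

// Hypothetical helper: which instrumentations a given role should register.
// The gateway skips the GraphQL instrumentation; subgraphs and monoliths keep it.
function instrumentationKeysFor(role) {
  switch (role) {
    case "gateway":
      return ["http", "express"];
    case "subgraph":
    case "monolith":
      return ["http", "express", "graphql"];
    default:
      throw new Error(`Unknown role: ${role}`);
  }
}

// Hypothetical helper: load and instantiate the instrumentations for a role.
// Requires the corresponding packages to be installed.
function createInstrumentations(role) {
  return instrumentationKeysFor(role).map((key) => {
    const [packageName, exportName] = INSTRUMENTATIONS[key];
    const Instrumentation = require(packageName)[exportName];
    return new Instrumentation();
  });
}
```

In open-telemetry.js, the registration would then become registerInstrumentations({ instrumentations: createInstrumentations("gateway") }) for the gateway and createInstrumentations("subgraph") for each subgraph.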

After New Relic starts ingesting your data, you'll see a service map showing which services your query had to call to resolve a client’s request. In addition, you can find gateway-specific spans like request, execute, plan, or validate. The full list is in the Apollo documentation.

This trace view in New Relic shows the relationship between the gateway and its subgraphs, how a query is resolved, and which services it calls.

You can also view the gateway query plan and HTTP calls to the subgraphs with query resolution details as granular as individual fields.

GraphQL query view shows a detailed trace with gateway-specific spans and subgraph-specific spans.

You can find sample code for Apollo Federation instrumentation in this federated-apollo repository.

Analyze your Apollo Server data in New Relic

Restart your Apollo Server and try out a few queries. Navigate to the Traces tab in New Relic and you’ll see your GraphQL data.

The curated view of traces in New Relic includes trace count, duration, errors, and trace groups sorted by average span duration.

You can also use the New Relic query builder to query your Apollo Server telemetry data. New Relic uses New Relic Query Language (NRQL). Here’s a sample query:

SELECT * FROM Span WHERE name LIKE 'graphql%' SINCE 7 days AGO LIMIT MAX

Your results should look something like this:

Table view of all GraphQL spans sent to New Relic from Apollo server.

To see the average duration of your queries for every phase of the cycle (parse, validate, and execute), you can run this query, which uses TIMESERIES and a FACET on the span name and graphql.source:

SELECT average(duration.ms) FROM Span
WHERE name LIKE 'graphql%' TIMESERIES 5 minutes
FACET name, graphql.source LIMIT MAX

The query will return a visualization like the one in the next image.

Area chart that shows the average query duration faceted by GraphQL query name.

Debug issues with the OpenTelemetry integration

If you need to debug any issues with the OpenTelemetry instrumentation, you can use the diagnostic logger from the @opentelemetry/api npm package (already installed in step 1), as shown in this code snippet, to log whether the integration is working correctly.

const api = require("@opentelemetry/api");
api.diag.setLogger(new api.DiagConsoleLogger(), api.DiagLogLevel.DEBUG);

If everything is instrumented correctly, your debug log should look similar to this:

Instrumentation suppressed, returning Noop Span
@opentelemetry/instrumentation-http https instrumentation outgoingRequest
@opentelemetry/instrumentation-http http.ClientRequest return request
@opentelemetry/instrumentation-http outgoingRequest on response()
@opentelemetry/instrumentation-http outgoingRequest on end()
statusCode: 200
@opentelemetry/instrumentation-http outgoingRequest on request close()

Otherwise, you’ll see error messages pointing you towards what might be going wrong.

Instrumentation suppressed, returning Noop Span
@opentelemetry/instrumentation-http https instrumentation outgoingRequest
@opentelemetry/instrumentation-http http.ClientRequest return request
@opentelemetry/instrumentation-http outgoingRequest on response()
@opentelemetry/instrumentation-http outgoingRequest on end()
@opentelemetry/instrumentation-http outgoingRequest on request close()
{"stack":"OTLPExporterError: Forbidden\n    at IncomingMessage.<anonymous> ...",
"message":"Forbidden","name":"OTLPExporterError","code":"403"}
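
When scanning a long debug log, it can help to pick out the exporter errors and attach a hint to each. The following is a small sketch, not part of any OpenTelemetry package: explainExportError and the hint texts are made up for this example, and the hints are assumptions based on general HTTP semantics rather than documented New Relic behavior.

```javascript
// Assumed, non-authoritative hints keyed by HTTP status code.
const EXPORT_ERROR_HINTS = {
  "401": "Check that the api-key header is present and holds your license key.",
  "403": "Check that the api-key header holds a valid license key and that the url matches your data center (US or EU).",
};

// Hypothetical helper: returns a one-line explanation for a JSON log line
// describing an OTLPExporterError, or null for any other line.
function explainExportError(logLine) {
  let entry;
  try {
    entry = JSON.parse(logLine);
  } catch {
    return null; // Not JSON, so not an exporter error entry.
  }
  if (!entry || entry.name !== "OTLPExporterError") {
    return null;
  }
  const code = String(entry.code);
  const hint = Object.prototype.hasOwnProperty.call(EXPORT_ERROR_HINTS, code)
    ? EXPORT_ERROR_HINTS[code]
    : "No specific hint; read the message and stack for details.";
  return `${entry.name} ${code} (${entry.message}): ${hint}`;
}

// A log entry shaped like the error shown above.
const sampleLine = JSON.stringify({
  message: "Forbidden",
  name: "OTLPExporterError",
  code: "403",
});

console.log(explainExportError(sampleLine));
console.log(explainExportError("statusCode: 200")); // null
```

For the 403 in the log above, a wrong or missing license key and a mismatched regional endpoint are reasonable first things to check.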

You can find all the code samples in this blog post in the Apollo Server New Relic OpenTelemetry Integration repository. If you have feedback or issues, add them in the GitHub repository issues section.

Next steps

Ready to start monitoring Apollo Server with New Relic? Sign up for a free account if you don't have one yet. Your account includes 100 GB/month of free data ingest, one free full-access user, and unlimited free basic users.