Introducing OpenTelemetry observability for Crystal

Instrument your Crystal applications with OpenTelemetry and New Relic.

Crystal is an object-oriented programming language with a syntax that is heavily influenced by Ruby and includes a sprinkling of influence from Go and other languages. Enterprise companies are running core systems on Crystal and offering products implemented with Crystal, but Crystal hasn't received much attention from existing observability platforms. However, you need observability to help find and fix errors and performance issues in your production systems, and that includes systems using Crystal. In this piece, you’ll learn how to instrument Crystal applications with New Relic and OpenTelemetry.

Instrumenting your Crystal application with New Relic and OpenTelemetry

There are a lot of moving parts in OpenTelemetry. The basics are fairly simple, but the full spectrum of features and capabilities is deeply complex. I started working on an OpenTelemetry framework for Crystal late in September 2021. There is a lot left to do, but the project as a whole is ready for others to start using it.

So how do you implement OpenTelemetry-based observability in your Crystal software, and what capabilities does that give you?

The first approach to instrumenting your Crystal code is to leverage the OpenTelemetry API to manually add instrumentation where you need it. To use OpenTelemetry, you need a place to send your data, and this example is written to send data to New Relic. If you don’t have a New Relic account, take a minute and sign up. A free account includes 100 GB of ingest per month, one full platform user, and unlimited basic users.

Let's look at an example. Consider a basic service, built with just the Crystal standard library, that responds to HTTP GET requests by calculating the nth Fibonacci number. The code in this example isn't the most concise way to do it in Crystal, but it's structured so that it could potentially be expanded into a larger, more complex service. Here's the full code for this basic, uninstrumented application on GitHub. And here is the code for the instrumented version of the application that I'll discuss later in this post.

Here's the code in fibonacci.cr:

require "http/server"
require "big/big_int"

class Fibonacci
  VERSION = "1.0.0"
  private getter finished : Channel(Nil) = Channel(Nil).new

  def fibonacci(x)
    a, b = x > 93 ? {BigInt.new(0), BigInt.new(1)} : {0_u64, 1_u64}

    (x - 1).times do
      a, b = b, a + b
    end
    a
  end

  def run
    spawn(name: "Fibonacci Server") do
      server = HTTP::Server.new([
        HTTP::ErrorHandler.new,
        HTTP::LogHandler.new,
        HTTP::CompressHandler.new,
      ]) do |context|
        n = context.request.query_params["n"]?

        if n && n.to_i > 0
          answer = fibonacci(n.to_i)
          context.response << answer.to_s
          context.response.content_type = "text/plain"
        else
          context.response.respond_with_status(400,
            "Please provide a positive integer as the 'n' query parameter")
        end
      end

      server.bind_tcp "0.0.0.0", 5000
      server.listen
    end

    self
  end

  def wait
    finished.receive
  end
end

You can run this with a basic server file called server.cr that runs fibonacci.cr:

require "./fibonacci"
Fibonacci.new.run.wait

You can build and run server.cr with the following terminal commands from the root directory of the project:

crystal build -p -s -t --release src/server.cr
./server
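
Before adding instrumentation, it can help to confirm what the calculation itself returns. The following sketch copies the fibonacci method from fibonacci.cr into a standalone script; the top-level method and the sample inputs are for illustration only. Note that the sequence starts at zero, so n=1 yields 0. The loop also keeps b one step ahead of a, which is why inputs above 93 switch to BigInt: beyond that point, the look-ahead value would no longer fit in a UInt64.

```crystal
require "big"

# Standalone copy of Fibonacci#fibonacci from fibonacci.cr, for experimenting
# without starting the HTTP server.
def fibonacci(x)
  # UInt64 is enough up to x = 93; larger inputs start out as BigInt.
  a, b = x > 93 ? {BigInt.new(0), BigInt.new(1)} : {0_u64, 1_u64}

  (x - 1).times do
    a, b = b, a + b
  end
  a
end

puts fibonacci(1)        # => 0
puts fibonacci(10)       # => 34
puts fibonacci(37)       # => 14930352
puts fibonacci(93).class # => UInt64
puts fibonacci(94).class # => BigInt
```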

Adding OpenTelemetry to your application

OpenTelemetry requires a small amount of up-front configuration. You should provide a service_name, a service_version, and an exporter when initializing the API.

So, let's require the OpenTelemetry package in fibonacci.cr, and add its configuration to server.cr. Here's the code to add to fibonacci.cr:

require "http/server"
require "big/big_int"
# This loads all of the essential OpenTelemetry functionality.
require "opentelemetry-api"

And here's the code to add to server.cr:

require "./fibonacci"

OpenTelemetry.configure do |config|
  config.service_name = "Fibonacci Server"
  config.service_version = Fibonacci::VERSION
  config.exporter = OpenTelemetry::Exporter.new(variant: :http) do |exporter|
    exporter = exporter.as(OpenTelemetry::Exporter::Http)
    exporter.endpoint = "https://otlp.nr-data.net:4318/v1/traces"
    headers = HTTP::Headers.new
    headers["api-key"] = ENV["NEW_RELIC_LICENSE_KEY"]?.to_s
    exporter.headers = headers
  end
end

Fibonacci.new.run.wait

This code block sets the service_name and the service_version in the configuration, then defines the OpenTelemetry exporter to use. The :http exporter, which has a class name of OpenTelemetry::Exporter::Http, is used to deliver data using the OTLP/HTTP protocol.

To send your data to New Relic with this protocol, you need a license key as shown in the code. The code block also defines a set of custom HTTP headers that includes your license key and is attached to the exporter. Note that you should always store your license key in an environment variable rather than hard-coding it.
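
The configuration above falls back to an empty string when NEW_RELIC_LICENSE_KEY is not set, so a missing key would only show up later as rejected exports. One possible way to catch this at startup is sketched below. The helper name new_relic_headers is hypothetical and not part of any OpenTelemetry library; it only uses HTTP::Headers from the Crystal standard library.

```crystal
require "http/headers"

# Hypothetical helper (an assumption for illustration, not a library API):
# builds the custom exporter headers and fails fast when the key is missing.
def new_relic_headers(license_key : String?) : HTTP::Headers
  raise ArgumentError.new("NEW_RELIC_LICENSE_KEY is not set") if license_key.nil?
  raise ArgumentError.new("NEW_RELIC_LICENSE_KEY is empty") if license_key.blank?

  headers = HTTP::Headers.new
  headers["api-key"] = license_key
  headers
end

headers = new_relic_headers("example-key")
puts headers["api-key"] # => example-key
```

In server.cr, a helper like this could stand in for the three header lines inside the exporter block by passing it ENV["NEW_RELIC_LICENSE_KEY"]? and assigning the result to exporter.headers.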

Next, you need to add instrumentation. OpenTelemetry traces use a data model where a Trace is essentially a container for other data including Spans. A Span is a unit of work, with a distinct start and end time, a name, and some other metadata, as well as an optional set of attributes and events. A trace is composed of one or more spans, and spans contain detailed metadata.
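
To make the data model concrete, here is a deliberately simplified sketch of a trace that holds spans. ToyTrace and ToySpan are made-up names for illustration; they are not the classes from the Crystal OpenTelemetry libraries, and they leave out most of what a real span carries (IDs, parent links, status, and so on).

```crystal
# Illustrative only: a toy model of the trace/span relationship.
class ToySpan
  getter name : String
  getter started_at : Time
  getter finished_at : Time?
  getter attributes = {} of String => String
  getter events = [] of String

  def initialize(@name : String)
    @started_at = Time.utc
  end

  # Attributes are stored as strings here to keep the sketch small.
  def []=(key : String, value)
    @attributes[key] = value.to_s
  end

  def add_event(message : String)
    @events << message
  end

  def finish
    @finished_at = Time.utc
  end
end

class ToyTrace
  getter spans = [] of ToySpan

  # Runs the block inside a new span and records the end time afterwards,
  # even if the block raises. Returns the block's value.
  def in_span(name : String)
    span = ToySpan.new(name)
    @spans << span
    begin
      yield span
    ensure
      span.finish
    end
  end
end

trace = ToyTrace.new
answer = trace.in_span("Calculate Fibonacci #10") do |span|
  span["fibonacci.n"] = 10
  span.add_event("calculation finished")
  34 # stand-in for the real calculation
end

puts answer                                      # => 34
puts trace.spans.first.attributes["fibonacci.n"] # => 10
```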

If you want to collect data on how long it takes to calculate Fibonacci numbers in this application, as well as what numbers are being calculated, you can instrument the fibonacci method to do this:

def fibonacci(x)
  trace = OpenTelemetry.trace # Get the current trace or start a new one
  trace.in_span("Calculate Fibonacci ##{x}") do |span|
    a, b = x > 93 ? {BigInt.new(0), BigInt.new(1)} : {0_u64, 1_u64}

    (x - 1).times do
      a, b = b, a + b
    end

    # You generally won’t manually add spans like this in a production
    # environment. We’ll cover auto-instrumentation in a moment.

    span["fibonacci.n"] = x
    span["fibonacci.result"] = a

    a
  end
end

Now recompile the code and start the server with your license key:

NEW_RELIC_LICENSE_KEY=<your license key> ./server

Any time a request comes into the server that results in the calculation of a Fibonacci number, a trace is created and sent to New Relic where you can visualize it in a dashboard.

What about errors and everything else?

The previous code is a minimal example that only traces a small part of the whole process. In a production application, you'll want to instrument more than just the code that is under your control, like the method that calculates the Fibonacci number. You'll also want to instrument the rest of the HTTP request handling pipeline. If errors are logged, you'll want to capture them. If there is database activity, queries to Redis, or calls to other external APIs, you'll want all of that activity captured, too.

However, it is unreasonable to expect all developers to figure out how to capture all of that information from libraries that are not under their control. To instrument all of this, you can use prebuilt instrumentation packages from OpenTelemetry. Here is the full code for the instrumented version of this application. Now let's walk through how to add instrumentation.

First, you need to change the require block at the top of the fibonacci.cr file to:

require "http/server"
require "big/big_int"
require "opentelemetry-instrumentation"

Then you can instrument your fibonacci method more concisely by adding the following block to your server.cr file:

class Fibonacci
  trace("fibonacci") do
    OpenTelemetry.trace.in_span("fibonacci(#{x})") do |span|
      span["fibonacci.n"] = x
      result = previous_def
      span["fibonacci.result"] = result.to_s
      result # Return the computed value from the block
    end
  end
end
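
The previous_def call in the block above is a Crystal language feature: when a class is reopened and a method is redefined, previous_def invokes the earlier definition with the same arguments. The sketch below shows the mechanism on its own, without OpenTelemetry. The Greeter class is hypothetical, and the sketch makes no claim about how the trace macro is implemented internally.

```crystal
# Original definition.
class Greeter
  def greet(name)
    "Hello, #{name}"
  end
end

# Reopen the class and wrap the method. previous_def calls the definition
# above with the same arguments that were passed to this one.
class Greeter
  def greet(name)
    result = previous_def
    "[wrapped] #{result}"
  end
end

puts Greeter.new.greet("Crystal") # => [wrapped] Hello, Crystal
```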

When you run the server, OpenTelemetry will auto-instrument the HTTP::Server request/response cycle and the Log class provided by the Crystal standard library.

OpenTelemetry defines a log standard for collecting logs and reporting them to a platform like New Relic. The Crystal OpenTelemetry libraries don’t support that standard yet, so as a workaround, logs are stored as events in spans. This ensures that the log information is associated with the span where it occurred, and it provides a clear migration path for when the log standard is supported. At that time, the library will change the type of OpenTelemetry data that it generates from a span event to a log, and you won't need to change anything in your code.

If you run the service, you can send it requests using a command line tool such as curl:

curl 'http://127.0.0.1:5000/?n=37'

You can also open the URL in your web browser. If you want to generate a lot of activity, there is a load generator program included in the repository. You can build and run it with a single command:

crystal run src/load_generator.cr

After manually sending some requests to your Fibonacci server, or after letting the load generator run for a minute, you can view the generated traces in New Relic.

This screenshot shows all of the GET / traces that have been collected, with a histogram of how long the traces took, spread over a span of time. In a more sophisticated application, you might see a larger variation in the histogram like in the next image:

You can select a trace to get more specific information, as the next image shows:

This waterfall graph shows each of the spans that make up the trace. Each span encapsulates a specific part of the chain of events that went into handling the traced request. You can click each segment to see the specific data that was captured during that phase of the request’s handling.

This Fibonacci example is basic. Most services, even the most rudimentary ones, will do more than just calculate Fibonacci numbers. However, even without adding custom instrumentation, most applications will return useful, actionable tracing with only the auto-instrumentation included in this sample code.

Crystal’s OpenTelemetry support is rapidly evolving, with the goal of complete support for the OpenTelemetry specification. You can view the roadmaps for the API, SDK, and instrumentation in their respective repositories. General goals include a more complete and correct implementation of the OpenTelemetry specification, support for the Logs and Metrics signals, expanded automatic instrumentation, and better documentation and usability.

To learn more about OpenTelemetry in general, see the OpenTelemetry API repository. To learn more about the Crystal implementation of OpenTelemetry and how it is built, see the OpenTelemetry instrumentation repository.