Logs can be difficult to work with at scale, as they often contain a vast amount of unstructured data that can be challenging to parse and analyze into actionable insights. This is where log enrichment comes in. By enriching your logs with additional metadata, context information, or other data, you can troubleshoot issues faster and have deeper insights into your system.

One way to enrich your logs is to add additional metadata using the OpenTelemetry Collector. OpenTelemetry is a vendor-agnostic, open-source project that allows you to collect, process, and export telemetry data such as logs, traces, and metrics from various sources. The OpenTelemetry Collector is a data collection pipeline that provides a powerful set of processors and exporters you can use to transform and enrich your telemetry data, all before it reaches your observability backend.

In this blog post, you’ll learn how to enrich logs using the OpenTelemetry Collector. With a single YAML file, you can define an entire data pipeline by snapping together community-contributed receivers, processors, and exporters. First, you’ll learn three ways to send logs to the OpenTelemetry Collector using receivers. Then, you’ll learn five different log enrichment methods with context and metadata using various processors.

Sending logs to the OpenTelemetry Collector

There are a few ways to send logs to the OpenTelemetry Collector, each backed by a receiver in your collector configuration (see the example after this list):

  1. File based collection. You can configure your application to write logs to a file, which can then be collected by the collector. This method is useful if you have legacy applications that cannot be instrumented using an SDK or logging library. To collect logs from files, you can use the filelog receiver, which can monitor a directory for new log files and read logs from them.
  2. Log forwarder. You can use a log forwarder, such as Fluentd or FluentBit, to collect logs from your applications and forward them to the OpenTelemetry Collector. The log forwarder can also perform some pre-processing on the logs before forwarding them. You can use the fluentforward receiver, which can receive logs from Fluentd or FluentBit over the Fluentd Forward Protocol.
  3. Send data directly to the collector. You can configure your application to send logs directly to the OpenTelemetry Collector using one of the supported protocols, such as gRPC or HTTP. This requires integrating an OpenTelemetry SDK or logging library into your application code. Then you can use the OTLP receiver, which accepts logs over the OpenTelemetry Protocol (OTLP).
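
Here's a minimal sketch of how these three receivers might look in a collector configuration file; the file path, ports, and exporter name are placeholders you'd adjust for your environment:

receivers:
  # 1. Tail log files written by applications that can't be instrumented
  filelog:
    include: [/var/log/myapp/*.log]
  # 2. Accept logs forwarded by Fluentd or Fluent Bit
  fluentforward:
    endpoint: 0.0.0.0:8006
  # 3. Accept logs sent directly from instrumented applications over OTLP
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

service:
  pipelines:
    logs:
      receivers: [filelog, fluentforward, otlp]
      exporters: [your_exporter_name]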

The significance of log enrichment for modern systems

Log enrichment is a fundamental practice in modern-day observability and monitoring, offering enhanced clarity and context to raw logs. At its core, log enrichment is about adding supplementary data or metadata to logs to provide deeper insights and improved searchability. The importance of log enrichment can be broken down into four points:

  • Enriched logs facilitate quicker root cause analysis, as developers and system operators can draw direct correlations between events across various systems without having to cross-reference multiple sources manually. This saves valuable time during incident response. 
  • Enriched logs provide a broader context, making it easier to understand the entirety of an event or transaction, rather than interpreting logs in isolation. This enhanced context is particularly crucial in distributed systems, where a single transaction might span multiple services.
  • Enriched logs help in meeting compliance requirements by ensuring that logs capture all the necessary data points.
  • Enriched logs aid in optimizing resources, as they allow for more efficient querying and less time spent sifting through irrelevant or redundant data. 

Overall, log enrichment amplifies the value you can get from logs and can help you transform data into narratives that can drive your decision-making.

Let’s get into how you can enrich your logs with the OpenTelemetry Collector.

Log enrichment with context and metadata

Up next, you’ll learn five ways to add context and metadata to your logs using various processors configured in your OpenTelemetry Collector:

  • Using the k8sattributes processor to add Kubernetes metadata
  • Filtering unnecessary logs with the filter processor
  • Attaching environmental metadata to logs with the resourcedetection processor
  • Correlating application logs with the resource processor
  • Adding and modifying log attributes with the transform processor

With dozens of community-contributed processors, you can create entire data pipelines that decorate, filter, and manipulate your data to fit your use cases. 

Correlate Kubernetes metrics with OpenTelemetry metrics, logs, and traces with the k8sattributes processor

Adding Kubernetes metadata to your logs can provide valuable context information that can help link infrastructure metrics to your logs, and ultimately help you troubleshoot emergent issues faster.

To add Kubernetes metadata to your logs using the OpenTelemetry Collector, you can use the k8sattributes processor. The k8sattributes processor enables automatic discovery of Kubernetes pods, extracts metadata from them, and adds the metadata to relevant logs, metrics, and spans as resource attributes. The processor leverages the Kubernetes API to discover all pods running in a cluster and keep track of their IP addresses, pod UIDs, and other relevant metadata. 

After you add the k8sattributes processor to your pipeline, it will automatically extract Kubernetes metadata from your log messages and add it as attributes. The added attributes will include information such as the container name, pod name, and namespace associated with each log message. Check out the next code snippet for an example.

processors:
  k8sattributes:
    auth_type: serviceAccount
    extract:
      metadata:
        - k8s.pod.name
        - k8s.namespace.name
        - k8s.deployment.name
        - k8s.node.name
        - k8s.container.name

service:
  pipelines:
    logs:
      receivers: [your_receiver_name]
      processors: [k8sattributes]
      exporters: [your_exporter_name]

With the k8sattributes processor added to your pipeline, a log in New Relic will now carry Kubernetes metadata such as the pod name, namespace, and deployment, along with any labels you configure the processor to extract, all attached automatically.
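
For instance, after the processor runs, the resource attributes on a single log record might look something like this (the values shown are placeholders for a hypothetical pod):

k8s.pod.name: checkout-5f7d8c9b6-x2vqp
k8s.namespace.name: production
k8s.deployment.name: checkout
k8s.node.name: worker-node-1
k8s.container.name: checkout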

Optimize ingest by filtering unnecessary logs with the filter processor

When working with logs, it's important to focus on the logs that are relevant and ignore those that aren't. Unnecessary logs can take up valuable storage space, increase processing time, and make it harder to find important log messages. For example, in Kubernetes cluster logs, health checks aren't conducive to debugging and only occupy valuable storage space.

To filter unwanted logs using the OpenTelemetry Collector, you can use the filter processor in your pipeline configuration file. The filter processor allows you to filter log messages based on various criteria, such as the log level, message content, or specific attributes.

You can also use the filter processor to filter log messages based on other attributes or message content. For example, you can filter log messages that contain certain keywords or exclude log messages from specific sources or applications.

processors:
  filter:
    logs:
      exclude:
        match_type: regexp
        # Drop log records whose body matches a Kubernetes health check message
        bodies:
          - ".*Liveness probe failed.*"
          - ".*Readiness probe failed.*"

In this example, the filter processor specifies an exclude rule that removes log messages related to Kubernetes health checks. The rule matches regular expressions against the log body and drops any record whose body contains either "Liveness probe failed" or "Readiness probe failed".

You can customize this example based on the specific content of your Kubernetes health check logs. By using the filter processor to remove these logs, you can reduce the noise in your log data and focus on the logs that are most relevant to your system.
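
The filter processor also supports condition-based filtering with OTTL expressions. Here's a minimal sketch that drops logs below WARN severity as well as all logs from a single service; the service name "healthcheck-service" is just a hypothetical example:

processors:
  filter:
    error_mode: ignore
    logs:
      log_record:
        # Drop any log record below WARN severity
        - 'severity_number < SEVERITY_NUMBER_WARN'
        # Drop all logs emitted by a specific (hypothetical) service
        - 'resource.attributes["service.name"] == "healthcheck-service"'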

Attach environmental metadata to logs with the resourcedetection processor

When working with logs in a cloud or hybrid cloud environment, it can be challenging to identify the sources of log messages and understand the context of the logs. The resourcedetection processor in the OpenTelemetry Collector can help you address this challenge by automatically detecting and attaching environmental metadata to log messages, saving you time.

The resourcedetection processor detects and attaches metadata about the environment in which the application is running, such as the cloud provider, region, account ID, instance ID, and more. This metadata can be used to provide valuable context that can help you understand the source of log messages and troubleshoot issues more quickly.

To attach environmental metadata to logs using the OpenTelemetry Collector, you can use the resourcedetection processor in your pipeline configuration file. Here's an example of how to configure the resourcedetection processor to attach metadata about your cloud environment:

processors:
  resourcedetection:
    detectors: [gcp, ec2, azure]

In this example, the resourcedetection processor specifies which detectors should be used: Google Cloud (gcp), Amazon Elastic Compute Cloud (ec2), and Microsoft Azure (azure).

Once you've added the resourcedetection processor to your pipeline, it will automatically detect and attach environmental metadata to your log messages. The added metadata will include information such as the cloud provider, region, account ID, and instance ID associated with each log message.
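
For example, when the collector is running on an Amazon EC2 instance, the ec2 detector adds resource attributes along these lines (the values here are placeholders):

cloud.provider: aws
cloud.platform: aws_ec2
cloud.account.id: "123456789012"
cloud.region: us-east-1
cloud.availability_zone: us-east-1a
host.id: i-0abc123de456example
host.type: t3.medium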

Attaching environmental metadata to logs using the resourcedetection processor can provide valuable context information about your infrastructure. Keep in mind that the amount of metadata you attach to your logs will have an impact on your data ingest cost.

Correlate your application logs with the resource processor

When troubleshooting issues in a complex distributed system, it can be challenging to correlate log events from different applications and services. The resource processor for logs can help address this challenge by adding contextual information to logs, such as the service name, host name, and operating system, which can be used to correlate log events from different sources and save you time in the troubleshooting process.

To configure the resource processor in the OpenTelemetry Collector to add relevant attributes to your logs, add it to your pipeline config file.

processors:
  resource:
    attributes:
      - key: service.name
        value: example_service
        action: upsert
      - key: host.name
        value: example_host
        action: upsert
      - key: os.name
        value: example_os
        action: upsert

In this example, the resource processor adds the service.name, host.name, and os.name attributes to each log.

By using the resource processor to add relevant attributes to each log, you can correlate those attributes with other telemetry data produced by the application, like traces and metrics. All application metric, trace, and log data sent to New Relic with the same service.name are associated with the same entity, making it a breeze to get contextual information from a single log.

The difference between the resource processor and the resourcedetection processor

A key difference between the resource processor and the resourcedetection processor is where the attribute values come from. With the resource processor, you define the attributes and their values yourself in the configuration, and those static values are applied to all telemetry that flows through the pipeline. The resourcedetection processor, on the other hand, discovers attribute values automatically from the environment the collector is running in, such as the cloud provider, region, or host. If you need to attach custom attributes that are specific to your system, such as a team or application name, the resource processor is a good choice. If you want standard environment metadata added without maintaining it by hand, the resourcedetection processor is a better fit.

To get even more context into your logs, you can also attach a span ID and trace ID, which link logs to traces. For Java applications, you can use log4j2 to write JSON-structured logs that include a trace_id and span_id. For Python applications, you can automatically inject tracing context into log statements by setting the environment variable OTEL_PYTHON_LOG_CORRELATION to true.

Add and modify log attributes with the transform processor

The transform processor can be used to enrich log data by adding or modifying attributes based on certain conditions or rules. When investigating issues, it can be helpful to add additional context to log messages. For example, if you're troubleshooting a problem with a web application, having a user's session ID associated with the log message can make it easier to troubleshoot and save you time.

Here's an example configuration:

processors:
  transform:
    log_statements:
      - context: resource
        statements:
          - keep_keys(attributes, ["service.name", "service.namespace", "cloud.region"])
      - context: log
        statements:
          - set(severity_text, "FAIL") where body == "request failed"
          - replace_all_matches(attributes, "/user/*/list/*", "/user/{userId}/list/{listId}")
          - replace_all_patterns(attributes, "value", "/account/\\d{4}", "/account/{accountId}")

In this code snippet: 

  • The keep_keys transformation lets you select the specific key-value pairs you want to send to your backend. This is useful when you want to reduce the amount of data you're sending to your telemetry backend, or when you want to ensure that only certain data is sent for privacy or security reasons. In this example, the keep_keys transformation is applied to the resource data of the logs, keeping only specific keys (service name, namespace, and cloud region) from the resource attributes.
  • The set transformation lets you set or modify the value of a specific field or attribute of a log. For example, you can set severity_text to "FAIL" for logs whose body equals "request failed". This way you can easily sort and query logs that have a failed request.
  • The replace_all_matches and replace_all_patterns transformations are used to rewrite attribute values. In this example, replace_all_matches replaces every attribute value matching the wildcard pattern "/user/*/list/*" with "/user/{userId}/list/{listId}", while replace_all_patterns replaces every attribute value matching the regular expression "/account/\\d{4}" with "/account/{accountId}". Both of these transformations can be useful for anonymizing or standardizing the data in your logs. For example, you might use them to replace potentially sensitive information (like the user IDs, account IDs, or list IDs in the example above) with placeholder values before the data is exported.

Conclusion

When you enrich your logs with metadata and context information, you can troubleshoot issues faster and have deeper insights into your system, which all leads to a payoff in efficiency. The OpenTelemetry Collector makes log enrichment easy to implement through its highly configurable options for collecting, enriching, and exporting logs. 

Here's a quick overview of how you can use the processors outlined in this blog post:

  • Add Kubernetes metadata with the k8sattributes processor.
  • Only keep the logs that are relevant to your team's observability needs with the filter processor.
  • Extract and add environmental data as attributes with the resourcedetection processor.
  • Add resource attributes so you can correlate your logs with other application telemetry with the resource processor.
  • Add or modify attributes based on conditions with the transform processor.

Best of all, New Relic has a native OTLP endpoint, so getting started is as easy as configuring your OpenTelemetry Collector to send data to our endpoint, as in the sketch below.
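
As a rough sketch, the exporter side of that configuration might look like the following; the endpoint shown is New Relic's US OTLP endpoint and the NEW_RELIC_LICENSE_KEY environment variable is a placeholder, so check New Relic's documentation for the endpoint and credentials that apply to your account:

exporters:
  otlp:
    endpoint: https://otlp.nr-data.net:4317
    headers:
      # License key supplied via an environment variable (placeholder name)
      api-key: ${env:NEW_RELIC_LICENSE_KEY}

service:
  pipelines:
    logs:
      receivers: [your_receiver_name]
      processors: [k8sattributes, filter, resourcedetection, resource, transform]
      exporters: [otlp]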

Log enrichment FAQs

What is log enrichment, and why is it important?

Log enrichment is the process of enhancing raw log data with additional context and information to make it more meaningful and useful. It's important because enriched logs provide detailed insights into system behavior, aiding in troubleshooting, monitoring, and security analysis.

What types of information can be added during log enrichment?

During log enrichment, you can add various types of information such as user-specific details, geographical information, application version, environment, request IDs, or any other relevant contextual data that can help in troubleshooting and analysis.

Can I enrich logs based on specific conditions or attributes?

Yes, the OpenTelemetry Collector provides the flexibility to enrich logs based on specific conditions or attributes. You can configure processors to apply enrichment based on conditions like log level, specific log messages, or any other attributes associated with the log data.

Can I enrich logs from different sources with different contextual information?

Yes, you can configure the OpenTelemetry Collector to enrich logs from different sources with distinct contextual information. The collector supports flexible configurations, allowing you to define different enrichment rules for different log sources.

What are some best practices for log enrichment with OpenTelemetry?

Best practices include defining a clear enrichment strategy based on the specific requirements of your application, avoiding over-enrichment to prevent unnecessary resource usage, and thoroughly testing the enrichment configurations to ensure they provide the desired results.

Can log enrichment be retroactively applied to existing log data?

Enriching logs in real time is common, but retroactively applying enrichment to existing log data might require additional processing and careful handling to avoid data inconsistencies. It's best to plan your log enrichment strategy with both real-time and historical data in mind.

How can log enrichment aid in performance monitoring and optimization?

Enriched logs can provide insights into system performance by including data like response times, error rates, and resource usage. By analyzing this information, IT teams can identify bottlenecks, optimize system configurations, and improve overall performance.