How to Parse Multiline Log Messages With the Infrastructure Agent's Fluent Bit Plugin

Our Infrastructure agent is bundled with a Fluent Bit plugin, so you can natively forward logs with a simple YAML configuration file. Currently, the agent supports log tailing on Linux and Windows, systemd on Linux (which is really collection from journald), syslog on Linux, TCP on both Linux and Windows, Windows Event Logs, and custom Fluent Bit configs containing any of the native inputs available.

While this out-of-the-box functionality is a great way to lower the barrier to entry for many of the logs that operations teams collect, there are some limitations.

For example, in the field I commonly find that teams need to collect and parse multiline log messages and display them in a sensible way. Consolidating multiline log messages into single log entries can look challenging on the surface, but if you follow a few basic patterns, it’s definitely possible.

In this post, I’ll demonstrate how to use custom Fluent Bit configurations on Linux and Windows to support multiline log messages in New Relic Logs.

The challenge of multiline log messages

Here’s an example of a multiline message in an application log:

2020-10-28 13:48:55,584 [main] DEBUG com.newrelictest.logging.NerdFactory: Creating a Data Nerd
2020-10-28 13:48:55,751 [main] DEBUG com.newrelictest.logging.NerdFactory: {"id":1,"value":100.0}
2020-10-28 13:48:55,754 [main] DEBUG com.newrelictest.logging.NerdFactory: Creating a Data Nerd
2020-10-28 13:48:55,760 [main] ERROR com.newrelictest.logging.NerdFactory:
java.lang.NullPointerException
    at com.newrelictest.logging.NerdFactory.createDataNerd(NerdFactory.java:15)
    at com.newrelictest.logging.NerdFactoryTest.test(NerdFactoryTest.java:11)

Without direction on how Fluent Bit should treat a file like this, you’d see each individual line as a separate log entry in New Relic, which splits the stack trace away from the error that produced it and greatly reduces the value of the log data:

individual log entry example

But with some simple custom configuration in Fluent Bit, I can turn this into useful data that I can visualize and store in New Relic.

Handling multiline logs in New Relic

To handle these multiline logs in New Relic, I’m going to create a custom Fluent Bit configuration and an associated parsers file, to direct Fluent Bit to do the following:

  • Tail a specific file
  • Decorate the log with the file name under the key name filePath
  • Output the parsed log with the key name message
  • Use a Regex pattern to mark the timestamp, severity level, and message from the multiline input

Note: For Fluent Bit (and fluentd), you’ll want to test your Regex patterns using either Rubular or Fluentular.

Here's the YAML configuration file that I’ll add to /etc/newrelic-infra/logging.d. (For Windows, the path is C:\Program Files\New Relic\newrelic-infra\logging.d.)

Note that Fluent Bit configs have strict indentation requirements, so copying and pasting from this post may result in malformed syntax.

A warning for Windows users: These configs require spaces, not tabs, and file path separators need to be escaped: C:\\Like\\this.log. You'll save yourself needless troubleshooting if you keep this in mind.

# This config uses an external Config and Parser file for Fluent Bit
---
logs:
  - name: external-fluentbit-config-and-parsers-file
    fluentbit:
      config_file: /etc/newrelic-infra/logging.d/fluentbit.conf
      parsers_file: /etc/newrelic-infra/logging.d/parsers.conf
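
Tabs and unescaped backslashes are easy to miss by eye, so a quick check before restarting the agent can help. Here's a minimal Python sketch of that idea. The helper names escape_windows_path and lines_with_tabs are my own for illustration and are not part of the agent or Fluent Bit.

```python
from typing import List


def escape_windows_path(path: str) -> str:
    """Double every backslash so the path can be pasted into the config."""
    return path.replace("\\", "\\\\")


def lines_with_tabs(config_text: str) -> List[int]:
    """Return the 1-based numbers of lines that contain a tab character."""
    return [
        number
        for number, line in enumerate(config_text.splitlines(), start=1)
        if "\t" in line
    ]


# The raw string holds the path as you would type it in Explorer.
print(escape_windows_path(r"C:\Like\this.log"))  # prints C:\\Like\\this.log

# The second line of this hypothetical snippet is indented with a tab.
print(lines_with_tabs("logs:\n\t- name: example\n"))  # prints [2]
```

This only catches the two pitfalls mentioned above. It does not validate the YAML or the Fluent Bit syntax itself.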

And here is the Fluent Bit configuration file I’m using:

# This block represents an individual input type
# In this situation, we are tailing a single file with multiline log entries
# Path_Key enables decorating the log messages with the source file name
# ---- Note the value of Path_Key == the attribute name in NR1, it does not have to be 'On'
# Key enables updating from the default 'log' to the NR1-friendly 'message'
# Tag is optional and unnecessary unless you have multiple inputs defined that use different parsers. Otherwise, Parser_Firstline will direct your pipeline to your parser file

[INPUT]
    Name                tail
    Path                /tmp/test.log
    Path_Key            filePath
    Key                 message
    #Tag                 tail_test
    Multiline           On
    Parser_Firstline    MULTILINE_MATCH

# This block identifies the Parser to use with the associated Tag from the [INPUT] block(s)
#[FILTER]
#    Name             parser
#    Match            tail_test
#    Key_Name         message
#    Parser           MULTILINE_MATCH
#    Reserve_Data     On
#    Preserve_Key     On
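
To make the effect of Multiline and Parser_Firstline more concrete, here's a rough Python sketch of the grouping idea: a line that looks like the start of a record opens a new entry, and any other line is appended to the entry before it. This illustrates the concept only. It is not Fluent Bit's actual implementation, and the simplified timestamp check is my stand-in for the Parser_Firstline regex.

```python
import re

# Assumption: a record starts with a "YYYY-MM-DD HH:MM:SS,mmm " timestamp.
# This is a simplified stand-in for the Parser_Firstline pattern.
FIRST_LINE = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")


def group_multiline(lines):
    """Fold continuation lines into the record opened by the last first line."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if FIRST_LINE.match(line) or not records:
            # A new record starts here (or nothing has been collected yet).
            records.append(line)
        else:
            # Continuation line, such as a stack frame: attach it to the
            # record that is currently open.
            records[-1] += "\n" + line
    return records


SAMPLE = [
    "2020-10-28 13:48:55,584 [main] DEBUG com.newrelictest.logging.NerdFactory: Creating a Data Nerd",
    '2020-10-28 13:48:55,751 [main] DEBUG com.newrelictest.logging.NerdFactory: {"id":1,"value":100.0}',
    "2020-10-28 13:48:55,754 [main] DEBUG com.newrelictest.logging.NerdFactory: Creating a Data Nerd",
    "2020-10-28 13:48:55,760 [main] ERROR com.newrelictest.logging.NerdFactory:",
    "java.lang.NullPointerException",
    "    at com.newrelictest.logging.NerdFactory.createDataNerd(NerdFactory.java:15)",
    "    at com.newrelictest.logging.NerdFactoryTest.test(NerdFactoryTest.java:11)",
]

records = group_multiline(SAMPLE)
print(len(records))  # prints 4: three DEBUG entries plus one ERROR entry
print(records[-1])   # the ERROR line together with its stack trace
```

The seven lines of the sample log collapse into four entries, with the exception and its stack frames riding along with the ERROR line.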

Finally, here is the Fluent Bit parsers file:

# This block represents an individual parser

[PARSER]
   Name        MULTILINE_MATCH
   Format      regex
   Regex       /(?<timestamp>[\d-]+ [\d:,]+) (?:\[\w+\] )?(?<level>[A-Z]+)(?<message>[\s\S]*)/

This parser is matched, by name, in the Parser_Firstline attribute in the [INPUT] block of the above configuration file, and also in the [FILTER] block (commented out here, since there is only one input).

The regex names the timestamp, severity level, and message of the sample multiline logs provided.
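
If you'd like to sanity-check the pattern locally in addition to using Rubular or Fluentular, here's a small Python sketch. Two assumptions to flag: Python's re module spells named groups (?P<name>...) rather than the Ruby-style (?<name>...) used in the parsers file, so I've translated the pattern, and I'm applying it to an already-joined record simply to show which text lands in each named group. How Fluent Bit applies the parser internally may differ.

```python
import re

# Same pattern as in the parsers file, with the named groups rewritten
# from (?<name>...) to Python's (?P<name>...) syntax.
PATTERN = re.compile(
    r"(?P<timestamp>[\d-]+ [\d:,]+) (?:\[\w+\] )?"
    r"(?P<level>[A-Z]+)(?P<message>[\s\S]*)"
)

# The ERROR entry from the sample log, joined into a single record.
RECORD = (
    "2020-10-28 13:48:55,760 [main] ERROR com.newrelictest.logging.NerdFactory:\n"
    "java.lang.NullPointerException\n"
    "    at com.newrelictest.logging.NerdFactory.createDataNerd(NerdFactory.java:15)\n"
    "    at com.newrelictest.logging.NerdFactoryTest.test(NerdFactoryTest.java:11)"
)

match = PATTERN.match(RECORD)
fields = match.groupdict() if match else {}

print(fields.get("timestamp"))  # prints 2020-10-28 13:48:55,760
print(fields.get("level"))      # prints ERROR
print(fields.get("message"))    # the logger name plus the full stack trace
```

Because [\s\S] matches newlines as well as ordinary characters, the message group keeps the whole stack trace. Note that the captured message begins with the space that follows the severity level.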

Viewing multiline log messages in New Relic

Now that I have the configurations in place and Fluent Bit running, I can see each multiline message displayed as a single entry in New Relic Logs:

multiline message example in new relic logs

And there you have it. Valuable logs are flowing and you’re not sacrificing any of your data.