If you need to protect sensitive information and comply with data privacy regulations, it's critical to obfuscate your log data, which means obscuring personally identifiable information (PII). However, effectively concealing PII in logs can take time to implement, can consume additional computing resources, and might not work equally well with all types of logs.
Prioritizing the obfuscation of PII and sensitive data in logs is essential for maintaining data confidentiality and integrity. This practice lets you analyze system behavior and troubleshoot issues while mitigating the risks of data breaches and unauthorized access. This blog post shares tips and code samples that guide you through obfuscating logs with client-side Fluent Bit in New Relic infrastructure monitoring. Whether you're evaluating New Relic alongside other platforms or you're already using New Relic, you'll discover advanced configuration options to safeguard sensitive information. You'll find advice on log obfuscation, including:
- The importance of log obfuscation and a few popular approaches
- Using Fluent Bit in New Relic infrastructure monitoring
- How to obfuscate your logs
- How to remove sensitive data from your logs
Log obfuscation is both important and difficult
Log obfuscation is a critical security measure that helps protect computer systems against data breaches and unauthorized access to sensitive information. You might need to remove or obfuscate attributes from your logs to protect sensitive data like email addresses or credit card numbers from customer accounts so they’re not visible in the logs. Removing or obfuscating these attributes can prevent unauthorized access to sensitive data in case of log leakages or malicious users.
An added bonus of eliminating unnecessary attributes is reducing log sizes, making them easier to manage and analyze. This can boost overall system performance and reduce costs for storing and processing large log files.
Implementing obfuscation for logs is especially challenging. It’s tricky to ensure that the obfuscated logs provide enough information for effective troubleshooting and analysis, which is why you need logs in the first place. This challenge is even more significant when dealing with large, complex systems that generate huge amounts of log data.
A few log obfuscation approaches
To overcome this challenge, engineering teams use various techniques such as redaction, hashing, and encryption to obfuscate sensitive data while maintaining the overall integrity of the log files.
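As a rough illustration of how two of these techniques differ, the sketch below redacts a value and, separately, replaces it with a salted SHA-256 hash. The function names, the mask string, and the salt are hypothetical and chosen only for this example; a real pipeline would apply equivalent logic inside its log forwarder.

```python
import hashlib

MASK = "XXXXX"  # hypothetical mask string, chosen for this example


def redact(value: str) -> str:
    """Redaction: discard the original value entirely."""
    return MASK


def hash_value(value: str, salt: str) -> str:
    """Hashing: replace the value with a salted SHA-256 hex digest.

    The same input and salt always yield the same digest, so events
    that share a value can still be grouped without exposing it.
    """
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


email = "alice@example.com"  # sample value, not real data
print(redact(email))
print(hash_value(email, "demo-salt"))
```

Redaction is the simplest option but removes any ability to group events by the hidden value. Hashing keeps that ability, at the cost of some residual risk when the set of possible inputs is small enough to guess.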
One popular technique involves creating custom rules using Grok patterns or Lua scripts to mask or remove sensitive information from log data.
- The process begins by identifying the types of sensitive data that need to be obfuscated. For example, your logs might contain customers' IP addresses, credit card numbers, and usernames.
- Then you define custom rules in Grok patterns or Lua scripts to match the sensitive data patterns and specify how they should be masked or removed.
- The next step is to test the custom rules against sample log data to ensure they correctly match the intended data patterns and apply the appropriate obfuscation.
- After the rules have been tested, you can integrate them into the log processing pipeline with a log management tool like Logstash, Fluent Bit, or FluentD. You can use all of these log management tools to forward the processed logs to New Relic for further analysis.
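The "define rules, then test them against sample data" steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three regular expressions are deliberately simplistic, the mask labels are hypothetical, and the sample lines are invented. Production rules would live in your log forwarder's own rule format.

```python
import re

# Hypothetical rules: mask label -> pattern for one kind of sensitive data.
RULES = {
    "[EMAIL]": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "[CARD]": re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b"),
    "[IP]": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def obfuscate(line: str) -> str:
    """Apply every rule in turn, replacing each match with its mask."""
    for mask, pattern in RULES.items():
        line = pattern.sub(mask, line)
    return line


# Invented sample lines used to check the rules before deploying them.
SAMPLES = [
    "203.0.113.7 - alice@example.com paid with 4111 1111 1111 1111",
    "healthcheck ok",
]

for sample in SAMPLES:
    print(obfuscate(sample))
```

Including a sample line that contains nothing sensitive, as in the second entry, helps confirm that the rules don't mask more than intended.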
Finding the right balance between security and usability in your logs
To make sure the logs remain a valuable tool for monitoring system performance and troubleshooting, it’s crucial to balance the level of obfuscation with how useful they are for troubleshooting and analysis. Effective log obfuscation requires careful consideration of the trade-offs between security and usability.
Grok patterns are a helpful pattern-matching tool for parsing data from unstructured log files. Lua scripts offer more flexibility and control over the log processing pipeline, but they may require more development effort. The choice between Grok patterns and Lua scripts depends on the needs and maturity of the log processing pipeline.
How Fluent Bit works with New Relic infrastructure monitoring
Other techniques, such as Grok patterns or Lua scripts, are available to modify logs. But using Fluent Bit in New Relic infrastructure monitoring is a more straightforward approach to protect sensitive information in logs.
Fluent Bit is an open-source data collector for a unified logging layer, designed to collect, filter, and forward logs and metrics data to various destinations. Embedded as part of New Relic infrastructure monitoring, Fluent Bit is deployed as a client-side agent by default. This means you can accomplish advanced configurations to collect logs and metrics data from multiple sources and send them to New Relic for analysis. The next diagram shows how log forwarding works in New Relic:
When the New Relic infrastructure agent runs, it processes configuration files in the logging.d directory. As a result, it generates a run-time Fluent Bit configuration file with the necessary [INPUT], [FILTER], and [OUTPUT] sections. If you provided an external Fluent Bit configuration file via the fluentbit option, the agent also declares an @INCLUDE in the generated configuration file.
By default, the runtime file uses the default Fluent Bit configuration values because it doesn't define a [SERVICE] section. But you can customize the configuration settings by creating your own [SERVICE] section in an external Fluent Bit configuration file and including it with the fluentbit option. To use advanced Fluent Bit configuration, such as log obfuscation, you must create separate configuration and parsers files externally and reference them with the fluentbit, config_file, and parsers_file options in the agent.
Here’s an example of using Fluent Bit's additional configuration to forward obfuscated logs:
logs:
  - name: external-fluentbit-config-and-parsers-file
    fluentbit:
      config_file: /etc/newrelic-infra/logging.d/fluent-bit.conf
      parsers_file: /etc/newrelic-infra/logging.d/parsers.conf
The config_file option refers to the path of an existing Fluent Bit configuration file that you can use to customize your Fluent Bit configuration settings. This file includes the necessary [INPUT], [FILTER], and [OUTPUT] sections, among others.
The parsers_file option refers to the path of an existing Fluent Bit parsers file. This file defines custom parsing rules that Fluent Bit can use to extract information from incoming logs. Using parsers, you can transform raw log data into structured data, making it easier to analyze and understand.
Important: To forward logs for common use cases, the infrastructure agent provides simple log forwarding configurations in the logging.d/ directory's YAML files. These files are automatically translated into Fluent Bit configuration files with the correct format and configuration defaults.
Understanding Fluent Bit configuration files
It's important to understand the different sections in Fluent Bit configuration files. These sections, namely [SERVICE], [INPUT], [FILTER], and [OUTPUT], play a crucial role in defining how logs are collected, processed, and forwarded.
1. The [SERVICE] section in Fluent Bit configuration files is optional and mainly used for customizing runtime settings. You can create your own [SERVICE] section in an external Fluent Bit configuration file for advanced customization. Keys in this section include:
   - Daemon: Specifies whether Fluent Bit should run as a daemon.
   - Log_Level: Sets the logging verbosity level.
   - Parsers_File: Points to an external parsers file for custom log parsing rules.
2. Mandatory and automatically generated, the [INPUT] section defines how Fluent Bit acquires logs. It features keys such as:
   - Name: Specifies the input plugin.
   - Tag: Assigns a tag to the collected logs.
   - Path: Indicates the log file's source.
3. Optional yet pivotal, the [FILTER] section allows the definition of rules for log modification or filtering. Essential keys include:
   - Name: Specifies the filter plugin.
   - Match: Defines conditions for applying the filter.
   - Parameters: Configures specific filter parameters.
4. Mandatory and auto-generated, the [OUTPUT] section dictates where processed logs are sent. Key options encompass:
   - Name: Specifies the output plugin.
   - Match: Defines conditions for log transmission.
   - Host: Specifies the destination host.
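To see how these sections and their keys are laid out in a file, the following sketch reads a small configuration string and lists each section with its key-value pairs. It's a simplified reader written only for illustration, not Fluent Bit's own parser; the function name and the sample text are hypothetical.

```python
import re

SECTION_HEADER = re.compile(r"^\[(\w+)\]$")


def parse_sections(config_text: str):
    """Return a list of (section_name, [(key, value), ...]) tuples."""
    sections = []
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        header = SECTION_HEADER.match(line)
        if header:
            sections.append((header.group(1).upper(), []))
        elif sections:
            # First token is the key; the remainder of the line is the value.
            parts = line.split(None, 1)
            value = parts[1] if len(parts) > 1 else ""
            sections[-1][1].append((parts[0], value))
    return sections


sample = """
[INPUT]
    Name  tail
    Path  /var/log/nginx.log

[FILTER]
    Name   modify
    Match  *
    Set    source XXXXX

[OUTPUT]
    Name   stdout
    Match  *
"""

for name, pairs in parse_sections(sample):
    print(name, pairs)
```

Keys are kept as a list of pairs rather than a dictionary because a section can repeat a key, as the filter examples later in this post do with Record and Add.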
Obfuscate your logs using Fluent Bit in New Relic
To start using Fluent Bit in New Relic, you need two files in place: the parsers_file for parsing log lines and the config_file for modifying records. Follow these steps to apply advanced settings with Fluent Bit in New Relic.
Step 1: Disable existing New Relic logs configuration
Before using Fluent Bit, you must disable any existing New Relic logs configurations. You'll either need to remove or comment out the following configuration in your YAML file. In this example, the configuration is commented out.
# logs:
#   - name: nginx
#     file: /var/log/nginx.log
#     attributes:
#       logtype: nginx
#       environment: workshop
#       engineer: yourname
#       country: yourcountry
#     pattern: 500|404|400
Step 2: Create the fluentbit.yml file
Create a file named fluentbit.yml, or rename your existing file, and include the paths to your Fluent Bit files:
logs:
  - name: external-fluentbit-config-and-parsers-file
    fluentbit:
      config_file: /etc/newrelic-infra/logging.d/fluent-bit.conf
      parsers_file: /etc/newrelic-infra/logging.d/parsers.conf
Here’s an example of the file structure to use with Fluent Bit in New Relic infrastructure monitoring.
Step 3: Create a parsers_file
Create a file called parsers.conf, and include your Fluent Bit parser configuration. You can use the power of the parser filter plugin in Fluent Bit to parse fields in event records. You can extract specific information from your logs and transform it into structured data that's easier to analyze and understand. You can test these custom rules against sample log data to ensure they match the intended data patterns and apply the appropriate obfuscation.
Here’s the default sample parser provided by Fluent Bit:
[PARSER]
    Name    dummy_test
    Format  regex
    Regex   ^(?<INT>[^ ]+) (?<FLOAT>[^ ]+) (?<BOOL>[^ ]+) (?<STRING>.+)$
Here's an NGINX parser in a regular expression (regex) example:
[PARSER]
    Name         nginx
    Format       regex
    Regex        ^(?<source>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")
    Time_Key     time
    Time_Format  %d/%b/%Y:%H:%M:%S %z
Using a regular expression pattern, this code defines a custom parser for NGINX access logs. The regular expression captures fields such as the client's IP address, time of the request, HTTP method, URL path, status code, and user agent. The Time_Key directive specifies the time field, and Time_Format defines the time format in the log line. This configuration enables log processing tools to parse NGINX access logs and extract relevant fields for analysis and visualization. For more examples, you can explore parsers provided by Fluent Bit, New Relic, or open-source communities.
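One way to test a parser like this against sample data before deploying it is to try the same pattern in a scripting language. The sketch below is a Python translation of the NGINX regular expression, assuming single-backslash escapes and Python's (?P<name>...) syntax for named groups; the sample log line is invented for illustration. It checks the pattern only, not Fluent Bit's behavior.

```python
import re
from datetime import datetime

# Python translation of the NGINX parser regex (named groups use ?P<...>).
NGINX_PATTERN = re.compile(
    r'^(?P<source>[^ ]*) (?P<host>[^ ]*) (?P<user>[^ ]*) '
    r'\[(?P<time>[^\]]*)\] '
    r'"(?P<method>\S+)(?: +(?P<path>[^"]*?)(?: +\S*)?)?" '
    r'(?P<code>[^ ]*) (?P<size>[^ ]*)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")'
)

# Invented sample line in NGINX combined log format.
SAMPLE_LINE = (
    '203.0.113.7 - alice [10/Oct/2023:13:55:36 +0000] '
    '"GET /index.html HTTP/1.1" 200 612 "-" "curl/8.0"'
)


def parse_line(line: str):
    """Return the captured fields as a dict, or None if the line doesn't match."""
    match = NGINX_PATTERN.match(line)
    return match.groupdict() if match else None


fields = parse_line(SAMPLE_LINE)
# Same format string as the parser's Time_Format value.
timestamp = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
print(fields["source"], fields["method"], fields["path"], fields["code"])
print(timestamp.isoformat())
```

If parse_line returns None for a line you expected to match, the pattern and the log format disagree, and that's worth resolving before you rely on the captured source field for obfuscation.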
In the next example, you'll use the source attribute captured by this parser to demonstrate obfuscating and removing log attributes.
Step 4: Create a config_file
Create a file called fluent-bit.conf and include a modify configuration, which uses one of the most valuable features of Fluent Bit, the Modify Filter plugin. Use this plugin to modify records using rules and conditions. You can change the content of log records based on specific criteria or requirements. This makes it easier to analyze and troubleshoot issues in your system.
Here’s the default sample from Fluent Bit:
[INPUT]
    Name  mem
    Tag   mem.local

[OUTPUT]
    Name   stdout
    Match  *

[FILTER]
    Name    modify
    Match   *
    Add     Service1 SOMEVALUE
    Add     Service3 SOMEVALUE3
    Add     Mem.total2 TOTALMEM2
    Rename  Mem.free MEMFREE
    Rename  Mem.used MEMUSED
    Rename  Swap.total SWAPTOTAL
    Add     Mem.total TOTALMEM
Here’s an example configuration file to use to obfuscate attributes in your logs:
[INPUT]
    Name              tail
    Tag               nginx
    Path              /var/log/nginx.log
    DB                /var/log/nginx_error.db
    Path_Key          filename
    Parser            nginx
    Mem_Buf_Limit     8MB
    Skip_Long_Lines   On
    Refresh_Interval  30

[FILTER]
    Name    record_modifier
    Match   *
    Record  logtype nginx
    Record  hostname ${HOSTNAME}
    Record  service_name Coming_From_FluentBit

[FILTER]
    Name   modify
    Match  *
    Set    source XXXXX

[OUTPUT]
    Name    newrelic
    Match   *
    apiKey  INGEST KEY
After applying the configuration file to obfuscate logs through Fluent Bit, your end result will look something like this example.
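To make the effect of the two filters concrete, here's a small Python sketch that mimics them on a single record. It's a simplified approximation for illustration only, not Fluent Bit code: the helper names and the sample record are hypothetical, and the sketch assumes Set adds a key or overwrites its existing value.

```python
def apply_record_modifier(record: dict, additions: dict) -> dict:
    """Mimic `Record key value` lines: append new fields to the record."""
    updated = dict(record)
    updated.update(additions)
    return updated


def apply_modify_set(record: dict, key: str, value: str) -> dict:
    """Mimic `Set key value`: add the key, overwriting any existing value."""
    updated = dict(record)
    updated[key] = value
    return updated


# Hypothetical record, as the nginx parser might produce it.
parsed = {"source": "203.0.113.7", "method": "GET", "path": "/index.html", "code": "200"}

enriched = apply_record_modifier(
    parsed, {"logtype": "nginx", "service_name": "Coming_From_FluentBit"}
)
obfuscated = apply_modify_set(enriched, "source", "XXXXX")
print(obfuscated)
```

The attribute stays in the record, so queries and dashboards that reference source keep working, but every event now carries the same placeholder value.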
Step 5: Restart the agent
To use configurations from Fluent Bit, restart the New Relic infrastructure agent. Before restarting, ensure that both your files (fluent-bit.conf and parsers.conf) are correctly referenced in the fluentbit.yml file, as shown back in Step 2.
logs:
  - name: external-fluentbit-config-and-parsers-file
    fluentbit:
      config_file: /etc/newrelic-infra/logging.d/fluent-bit.conf
      parsers_file: /etc/newrelic-infra/logging.d/parsers.conf
Here’s the command to restart the New Relic infrastructure agent:
sudo systemctl restart newrelic-infra
Remove sensitive data from logs
In the previous example in Step 4, you learned how to obfuscate a log attribute using Set. An alternative is to remove the attribute altogether, which is useful for deleting unnecessary or sensitive data from a record. The Modify filter offers a Remove operation for this, and the Record Modifier plugin offers Remove_key, which deletes a key-value pair from a record if it exists.
Here’s an example configuration file to remove attributes in your logs:
[INPUT]
    Name              tail
    Tag               nginx
    Path              /var/log/nginx.log
    DB                /var/log/nginx_error.db
    Path_Key          filename
    Parser            nginx
    Mem_Buf_Limit     8MB
    Skip_Long_Lines   On
    Refresh_Interval  30

[FILTER]
    Name        record_modifier
    Match       *
    Record      logtype nginx
    Record      hostname ${HOSTNAME}
    Record      service_name Coming_From_FluentBit
    Remove_key  source

[OUTPUT]
    Name    newrelic
    Match   *
    apiKey  INGEST KEY
After applying the configuration file to remove log data using Fluent Bit, your final result should look something like this example:
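For comparison with obfuscation, this short Python sketch mimics what removing a key does to a record. The helper name and the sample record are hypothetical, and the code only approximates the "delete the pair if it exists" behavior described earlier.

```python
def apply_remove_key(record: dict, key: str) -> dict:
    """Mimic `Remove_key key`: drop the pair if present, otherwise do nothing."""
    updated = dict(record)
    updated.pop(key, None)
    return updated


# Hypothetical parsed record.
parsed = {"source": "203.0.113.7", "method": "GET", "code": "200"}

cleaned = apply_remove_key(parsed, "source")
print(cleaned)

# Removing a key that isn't in the record leaves it unchanged.
print(apply_remove_key(cleaned, "source") == cleaned)
```

Unlike the Set approach, the attribute disappears from the stored event entirely, which also makes each record slightly smaller.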
Important: A new configuration only takes effect after the New Relic infrastructure agent restarts and reloads its settings. Restart the agent after applying any configuration change, as shown in Step 5.
Conclusion
You've reviewed examples of straightforward ways to ensure confidentiality and integrity of log data using Fluent Bit in New Relic infrastructure monitoring.
Specifically, you've learned how to protect sensitive information in logs by obfuscating data with Fluent Bit log configurations. For example, after you identify a key such as source, you can use the Modify filter plugin to obfuscate its value or the Record Modifier filter plugin to remove the attribute.
Next steps
Ready to try out obfuscating or removing your log data in New Relic? Learn more about forwarding your logs using the infrastructure agent.
Interested in other logging tools? Check out our top open source logging tools that work together with New Relic.
If you don’t already have a New Relic account, sign up for a free account to get started.
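The views expressed on this blog are those of the author and do not necessarily reflect the official views of New Relic K.K. This blog may also contain links to external sites. New Relic provides no guarantees regarding the content of those linked sites.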