Logs are crucial to help you understand the applications and services you develop and operate, but using logs effectively means more than collecting mass log data into a database or file.

Well-structured logs can help you comprehend a complex system, discover how to improve its performance, and gain crucial insight into errors.

The log management struggle

Configuring logs for the entire stack can be overwhelming and can leave blind spots if done poorly. You must decide what to log, how much detail to include, and whether too much data will lead to high costs.

Log volume can be staggering, too. Yet there are ways to make it simpler (and a quality log monitoring tool definitely helps). 

Logging best practices

This list of logging best practices will help you gain visibility into your stack, troubleshoot errors faster, and better understand your users' experience.


Decide what to log

A log is a file that stores event data about your software applications, operating systems, and infrastructure. Logs are generated by writing text to standard output or to a file and can include timestamps, error messages, stack traces, and other data points that tell when and why things happened. For more info on trace-level logging, check out How to simplify your troubleshooting with logs in context.

To get the most from your logs, you should carefully select what you put in them. They should include all necessary metadata to help pinpoint events and root causes when investigating. At the same time, avoid putting too much in your logs: leaner logs make it easier to scale your logging effort and keep log monitoring costs down.

Information in logs should be purposeful and immediately helpful. The data should be valuable to the team, whether usage data, user events, or application errors and exceptions. The information stored in a log should also provide the necessary details to understand issues and make decisions.

Plan for common logging scenarios 

Not only can logs help you troubleshoot bugs, but they can also help you understand your stack better. Examples include performance profiling and gathering statistics. 

When configuring your logs, keep common scenarios in mind to help you decide which data points provide value. For example, detailed application logs may provide insight into performance and potential problems such as memory leaks. Unlike other data about a user's interaction, logs can give crucial insight into your customers' experience.

All of this information can be important when you make decisions and will vary based on the scenario. Here's an example video of making decisions using logs in context.

Log meaningful messages that drive decisions  

Log messages are only helpful if the information they provide is valuable and helps people make decisions. Third-party infrastructure tends to capture granular details, but for software applications, you should decide which details help diagnose why an error or event happened so you can take the necessary actions.

How to know if a log message is effective

For application errors, a log message should convey what happened and point to the line of code involved.

For example, if a transaction fails, the log message could be: 

Transaction Failed: Could not create user ${path/to/file:line-number}

This message will make it easier to discover why the transaction failed and what line of code to review. 
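As a rough illustration, here is how a message like the one above could be produced with Python's standard `logging` module. The logger name `signup` and the `create_user` function are made up for this sketch; `%(filename)s` and `%(lineno)d` are standard `LogRecord` attributes that the module fills in with the location of the logging call.

```python
import io
import logging

# Capture output in memory so the sketch is self-contained; a real
# application would log to stdout or a file instead.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter("%(levelname)s %(message)s (%(filename)s:%(lineno)d)")
)

logger = logging.getLogger("signup")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False


def create_user(name):
    # Hypothetical failure path, used only to trigger the log line.
    logger.error("Transaction Failed: Could not create user %r", name)


create_user("alice")
print(stream.getvalue(), end="")
```

Because the formatter adds the file name and line number, the message text itself can stay focused on what went wrong.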

Save your teammates' time

In addition to outputting an error code, whether text or a number, adding a short description in the log can save your teammates time when troubleshooting.
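One possible way to keep codes and descriptions together is to define them in a single place. The sketch below assumes a small `ErrorCode` enum; the codes, descriptions, and the `error_message` helper are all invented for illustration.

```python
from enum import Enum


class ErrorCode(Enum):
    # Hypothetical codes and descriptions, for illustration only.
    USER_CREATE_FAILED = ("E1001", "could not create user")
    PAYMENT_DECLINED = ("E2001", "payment provider declined the charge")

    def describe(self):
        code, description = self.value
        return f"{code}: {description}"


def error_message(error, detail):
    """Combine the code, its short description, and request-specific detail."""
    return f"{error.describe()} ({detail})"


message = error_message(ErrorCode.USER_CREATE_FAILED, "email already registered")
print(message)
```

A teammate reading this log line sees both the searchable code and a plain-language hint, without having to look the code up.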
 

Keep log messages simple and concise 

While it's essential to include enough information within a log message, don't be excessive.

Unnecessary data increases storage needs, slows log searches, and makes it tough to debug problems. Keep log messages helpful and concise, and collect only the most crucial data.

When formatting your logs, collect the information needed to debug an error without including every detail about the environment. For example, if an API fails, it may be helpful to log any error messages from the API, but consider omitting details about the application memory.

Take great care that your log messages don't contain sensitive information. Protecting your customers’ data and avoiding legal issues are two very big reasons to watch for personally identifiable information (PII) leaking into logs.
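As one possible safeguard, a filter can scrub obvious PII before a message is written. The sketch below uses Python's standard `logging.Filter` and only masks email addresses; the `RedactEmails` class, the regular expression, and the logger name are assumptions for this example, and real PII handling usually needs to cover far more than emails.

```python
import io
import logging
import re

# Simplified email pattern, good enough for a sketch but not exhaustive.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class RedactEmails(logging.Filter):
    """Replace anything that looks like an email address before output."""

    def filter(self, record):
        # Render the message first so %-style arguments are covered too.
        message = record.getMessage()
        record.msg = EMAIL_RE.sub("[REDACTED_EMAIL]", message)
        record.args = None
        return True  # keep the record, just with the scrubbed text


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.addFilter(RedactEmails())

logger = logging.getLogger("redaction_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

logger.info("Password reset requested for %s", "jane@example.com")
print(stream.getvalue(), end="")
```

Attaching the filter to the handler rather than to a single logger means every record that passes through that handler is scrubbed.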

Set intentional log rotation and retention guidelines

Implementing log rotation and retention policies is crucial in log management. Log rotation prevents disk space overload by archiving old logs and creating new ones, enhancing system efficiency and ease of analysis. Retention policies dictate how long logs are kept based on their importance, ensuring compliance and optimal use of storage. Together, these practices are essential for maintaining system performance, regulatory compliance, and effective data management.
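Here is a minimal sketch of size-based rotation using `RotatingFileHandler` from Python's standard library. The 1 KB size limit, the three backups, and the temporary directory are arbitrary values chosen so the example runs quickly; production limits would be much larger and depend on your retention policy.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler

# Write into a throwaway directory so the sketch leaves no files behind
# in your project.
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "app.log")

# Rotate once the file reaches roughly 1 KB and keep at most 3 old files.
# Anything older than app.log.3 is deleted, which acts as a simple
# retention rule.
handler = RotatingFileHandler(log_path, maxBytes=1024, backupCount=3)
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

logger = logging.getLogger("rotation_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

for i in range(200):
    logger.info("processed order %d", i)

handler.close()
log_files = sorted(os.listdir(log_dir))
print(log_files)
```

If your retention policy is expressed in days rather than bytes, the standard library's `TimedRotatingFileHandler` takes a similar `backupCount` argument and rotates on a schedule instead.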

Don't forget the timestamp

Remember to include a timestamp in all of your log messages. Choose the precision based on how often a task runs: some tasks may need to track time down to the millisecond, while others only need to track to the minute, hour, or even the day.

As a best practice, apply one timestamp standard across your entire stack or organization so you can correlate your logs with other telemetry data types like metrics or events.
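One common choice is ISO 8601 in UTC. The sketch below shows how that might look with Python's standard `logging` module; the `UTCFormatter` class and the millisecond precision are assumptions for this example, not a required convention.

```python
import io
import logging
from datetime import datetime, timezone


class UTCFormatter(logging.Formatter):
    """Render record times as ISO 8601 in UTC with millisecond precision."""

    def formatTime(self, record, datefmt=None):
        stamp = datetime.fromtimestamp(record.created, tz=timezone.utc)
        return stamp.isoformat(timespec="milliseconds")


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(UTCFormatter("%(asctime)s %(levelname)s %(message)s"))

logger = logging.getLogger("timestamp_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

logger.info("nightly export finished")
print(stream.getvalue(), end="")
```

Using UTC everywhere avoids having to reconcile time zones when you line up logs from different hosts.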

Use logging formats that are easy to understand

Your log messages should provide critical information that is easy to read and also drives actions or provides insights. As a rule of thumb, avoid ambiguous or non-descriptive messages that only limited team members understand.

Log formats should be easy to parse and keep a consistent log structure. This will ensure they’re easy to collect and aggregate. For example, tools like New Relic log management make it easy to define custom log parsing rules, but parsing rules can't work if your log data is unreadable.

Log format examples 

### Example of an NGINX access log that is difficult to parse:
127.180.71.3 - - [10/May/2022:08:05:32 +0000] "GET /downloads/product_1 HTTP/1.1" 304 0 "-" "Debian APT-HTTP/1.3 (0.8.16~exp12ubuntu10.21)"


### Example of the same log information in a parsable log format:
{
  "remote_addr":"127.180.71.3",
  "time":"1652169932",
  "method":"GET",
  "path":"/downloads/product_1",
  "version":"HTTP/1.1"
}

Both of these logs show very similar information, but the first one is difficult to read because the parts are not demarcated clearly. In the second example, you’re able to easily scan for the detail you want, without having to read the entire log. Standardizing a log format for all your applications also means that your logs will be easy to parse, collect, and leverage on other platforms.
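To make the difference concrete, here is a rough sketch of what it takes to turn the NGINX line above into structured JSON in Python. The regular expression and the `to_json` helper are assumptions written for this one sample line, and the `status` and `bytes_sent` field names are added for illustration; real access log formats vary with server configuration.

```python
import json
import re
from datetime import datetime

LINE = (
    '127.180.71.3 - - [10/May/2022:08:05:32 +0000] '
    '"GET /downloads/product_1 HTTP/1.1" 304 0 "-" '
    '"Debian APT-HTTP/1.3 (0.8.16~exp12ubuntu10.21)"'
)

# Pattern written for the sample line above, not a general-purpose parser.
PATTERN = re.compile(
    r'(?P<remote_addr>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<version>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes_sent>\d+)'
)


def to_json(line):
    match = PATTERN.match(line)
    if match is None:
        raise ValueError("line does not look like an access log entry")
    fields = match.groupdict()
    # Convert the bracketed timestamp into Unix epoch seconds.
    parsed = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    fields["time"] = str(int(parsed.timestamp()))
    return json.dumps(fields)


structured = to_json(LINE)
print(structured)
```

Every consumer of the unstructured line needs a pattern like this one, while the JSON version can be read with a single `json.loads` call.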

Use consistent log levels

Effective logging hinges on applying log levels such as DEBUG, INFO, WARNING, ERROR, and CRITICAL consistently. These levels categorize the severity of log messages, which streamlines analysis and troubleshooting. Exact definitions of each level matter less than a shared understanding of their general purpose, so that the same kind of event is logged at the same level across your stack. This consistency aids system maintenance and speeds up problem resolution.
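As a small sketch of how levels work in practice, the example below uses Python's standard `logging` module with a threshold of WARNING, so lower-severity messages are dropped. The logger name and the messages are invented for illustration.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

logger = logging.getLogger("levels_demo")  # hypothetical logger name
logger.addHandler(handler)
logger.propagate = False

# Only WARNING and above are emitted; DEBUG and INFO are filtered out.
logger.setLevel(logging.WARNING)

logger.debug("cache lookup took 3 ms")
logger.info("user signed in")
logger.warning("disk usage above 80 percent")
logger.error("could not write invoice to storage")
logger.critical("database connection pool exhausted")

emitted = stream.getvalue().splitlines()
print(emitted)
```

Raising or lowering the threshold per environment (for example, DEBUG in development and WARNING in production) only works well when every team assigns levels the same way.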

As you can see, capturing and constructing logs with the right information can save you money, help you diagnose errors more quickly, and hopefully save you time when you’re on call.