Structured logging is the process of producing, transmitting, and storing log messages in a format that's easily machine-readable, such as JSON. The main advantage here is that by ensuring logs are structured consistently, you’ll get faster and more accurate automated processing and analysis.
In contrast to unstructured logs, which are just strings of text without a defined format, structured logs are designed with machine readability in mind. Each piece of data in the log message is stored in a defined field, allowing software to extract specific pieces of information without needing to parse arbitrary text strings.
Take a look at these logs:
Unstructured logs
INFO - User John Doe with ID 123 made a purchase of $200 on a Visa card with last four digits of 4312 on 2023-05-25.
Equivalent structured logs
{
  "severity": "INFO",
  "timestamp": "2023-05-25T12:34:56Z",
  "userId": 123,
  "userName": "John Doe",
  "action": "purchase",
  "card_type": "Visa",
  "last_four_digits": "4312",
  "amount": 200
}
With unstructured logging, troubleshooting an issue related to a specific transaction can become quite a daunting task. It might require writing complex search queries or even manually scanning through thousands of lines of logs. Structured logging, on the other hand, significantly improves readability for machines, which assists in easier and more efficient analysis and querying.
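To make that contrast concrete, here's a minimal sketch (with made-up log lines) of how structured entries can be filtered programmatically, with no text parsing required:

```python
import json

# A few structured log lines, as they might appear in a log file (illustrative data).
log_lines = [
    '{"severity": "INFO", "userId": 123, "action": "purchase", "amount": 200}',
    '{"severity": "INFO", "userId": 456, "action": "login"}',
    '{"severity": "ERROR", "userId": 123, "action": "refund", "amount": 50}',
]

# Filtering becomes a dictionary lookup rather than a regex over free-form text.
events_for_user = [
    entry for entry in map(json.loads, log_lines)
    if entry.get("userId") == 123
]

print([event["action"] for event in events_for_user])  # ['purchase', 'refund']
```

The same filter against unstructured text would need a regular expression tailored to every message format that mentions a user ID.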
How to use structlog for seamless structured logging in Python
While Python’s built-in logging module is robust and flexible, using a specialized library like structlog can make structured logging more intuitive and easier to manage. The structlog library wraps the built-in logging module to transform your logs into machine-readable key-value pairs.
Configure log processors
The structlog library contains processor pipelines, which transform and route your log entries. They can add information to your logs (like timestamps), format the final log output (like converting it to JSON), filter logs, or even redirect logs to different targets.
structlog comes with multiple included processors, including:
- TimeStamper: Adds timestamps to your logs.
- JSONRenderer: Converts your log entries into JSON format.
- format_exc_info: Extracts and formats exception information from the exc_info field, if present.
- UnicodeEncoder: Encodes Unicode strings in the event dictionary to UTF-8.
Have a look at this code sample, which configures structlog to render logs as JSON:
import logging
import structlog

# Route structlog output through the standard library's logging module
logging.basicConfig(format="%(message)s", level=logging.INFO)

# Configure structlog to output structured logs in JSON format
structlog.configure(
    processors=[
        structlog.stdlib.filter_by_level,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.JSONRenderer(),
    ],
    context_class=dict,
    logger_factory=structlog.stdlib.LoggerFactory(),
)

# Get a logger
logger = structlog.get_logger()

# Now we can log structured messages!
logger.info("User logged in", user_id="1234", ip="192.0.2.0")
First, the example configures structlog to use several processors that control how messages are logged:
- The TimeStamper processor adds a timestamp to each log message.
- The JSONRenderer processor converts the log message into a JSON string.
Next, the example uses a structlog logger to record an information-level event. Notice how the log message includes key-value pairs: in this case, user_id="1234" and ip="192.0.2.0". These structured data provide context for the log message.
When you run this code, it will output the following:
{"event": "User logged in", "user_id": "1234", "ip": "192.0.2.0", "timestamp": "2023-05-25T14:22:01Z"}
Remember, the order of processors matters: each processor receives the output of the previous one, so a misordered pipeline can fail in surprising ways. For example, JSONRenderer should always be last in your processor list, as it turns the event dictionary into a string, making it unsuitable for further processing.
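Conceptually, a processor pipeline is just a chain of functions, each receiving the previous one's output. This plain-Python sketch (hypothetical function names, not structlog's actual API) shows why a string-producing renderer must come last:

```python
import json
from datetime import datetime, timezone

# Each processor takes the event dict and returns a (possibly transformed) event.
def add_timestamp(event):
    event["timestamp"] = datetime.now(timezone.utc).isoformat()
    return event

def render_json(event):
    # Turns the dict into a string; no dict-based processor can run after this.
    return json.dumps(event, sort_keys=True)

def run_pipeline(event, processors):
    for processor in processors:
        event = processor(event)
    return event

output = run_pipeline({"event": "User logged in", "user_id": "1234"},
                      [add_timestamp, render_json])
print(output)
```

Swapping the two processors would hand add_timestamp a string instead of a dict and raise a TypeError, which is exactly the ordering pitfall described here.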
How to structure logs
There are a few key steps to take to structure logs:
- Select an industry-standard format, such as JSON or XML.
- Select what to log.
- Implement a framework for structured logging.
- Set up your logging sources.
- Analyze your logs.
Select a log format that can be easily parsed
Choose a standard machine-parsable format, preferably one used widely in the industry, such as JSON (JavaScript Object Notation). JSON is widely used for data logging and is easily readable by both machines and humans. Parsing JSON converts the JSON-formatted text into native data structures – in Python, dictionaries and lists – that can be manipulated within a program. structlog can render your log entries as JSON via its JSONRenderer processor.
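For instance, Python's standard-library json module turns a JSON log line into a plain dictionary (the log line here is made up for illustration):

```python
import json

log_line = '{"severity": "INFO", "userId": 123, "action": "purchase", "amount": 200}'

# json.loads converts the JSON text into a native Python dictionary.
entry = json.loads(log_line)
print(entry["userId"], entry["action"])  # 123 purchase
```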
Ideally, you should use the same log format across your application and infrastructure. If some of your logs are in JSON and some are in XML or other formats, it’s much harder to maintain consistency across your services and components.
Select what to log
Identify the parameters you want to log in your applications and services. These parameters will vary from application to application, and will also differ among the services in your infrastructure that you want to add to your structured logging framework. Selecting parameters will likely involve collaborating with other teams who gain value from the logging data.
Implement a framework for structured logging
Use structlog to implement a logging framework. Set up log levels, output destinations, and any additional settings required, and configure the framework to create logs in your chosen format.
Consider how and when you’ll migrate any existing instrumentation to the new framework. When converting unstructured entries into structured ones, be sure to include each contextual property as its own attribute.
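As a sketch of that migration step, here's one way to convert the unstructured example line from earlier into structured attributes. The regular expression is a hypothetical one-off parser for that specific legacy format, not a general solution:

```python
import json
import re

unstructured = ("INFO - User John Doe with ID 123 made a purchase of $200 "
                "on a Visa card with last four digits of 4312 on 2023-05-25.")

# An illustrative parser for the legacy message format; each captured
# property becomes its own attribute in the structured entry.
pattern = re.compile(
    r"(?P<severity>\w+) - User (?P<userName>.+?) with ID (?P<userId>\d+) "
    r"made a (?P<action>\w+) of \$(?P<amount>\d+) on a (?P<card_type>\w+) card "
    r"with last four digits of (?P<last_four_digits>\d{4}) on (?P<date>[\d-]+)\."
)

match = pattern.match(unstructured)
structured = match.groupdict()
structured["userId"] = int(structured["userId"])
structured["amount"] = int(structured["amount"])

print(json.dumps(structured))
```

Once the one-time conversion is done, new entries should be emitted as structured key-value pairs directly, so parsers like this can be retired.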
Here is a basic example of how to format logs in Python with structured metadata in a log entry:
import structlog
import logging
logging.basicConfig(format="%(message)s", level=logging.INFO)
log = structlog.get_logger()
log.info("Application started", app="my_app", user="admin")
Getting started with structlog in Python
To get started:
Install and import structlog
- Using pip, install structlog:
  pip install structlog
- Within your Python application, import the library:
  import structlog
Set up logging sources
Once your applications are configured for structured logging, consider whether other services across your infrastructure – such as web servers, databases, and cloud services – could benefit from this format. Parsing JSON versus another format may require additional coding or extra configuration tasks in your observability platform to analyze your log data.
Look into dashboards for cloud services and other services and make sure they offer structured log output in your preferred format.
Define what to log
Decide which specific values or events should be captured in your application and service logs. These logging details may differ across components of your infrastructure, depending on their role and the data they generate. Collaborate with relevant stakeholders—such as DevOps, security, or product teams—to ensure the logs deliver meaningful insights for all users.
Analyze your logs
Proper implementation of structured logging can greatly benefit your team and organization. However, its full potential is realized when combined with a robust observability tool. Here are some of the key benefits of using structured logging alongside an observability tool.
- Add custom attributes to your logs. Custom attributes make it much easier to standardize your logs and avoid data gaps.
- Powerful search and analysis capabilities. Observability features, like the New Relic Query Language (NRQL) tool, let you fully customize your search to focus on answering the important questions that affect your business goals.
- Centralized logging. Collect and monitor all of your system data in one place with built-in tools to analyze it.
- Parsing unstructured logs. Pull attributes from unstructured log data to more effectively search and query your logs.
- See your logs in context of other platform UI experiences, like APM, infrastructure monitoring, and distributed tracing. Added context helps you learn more about what’s happening in your application without the need to manually search through log data.
- Integrations with popular logging frameworks, like Logstash, Fluent Bit, and Fluentd make it possible to use the framework of your choice to structure logs and maximize the benefits of logging.
Configure multiple loggers in Python
For most applications, you’ll want to implement a logger per module. It’s best if you capture the logger name as part of each log by using the logging library’s built-in getLogger() method to dynamically set the logger name to match the name of your module. This allows you to pinpoint the module that generated each log message:
logger = logging.getLogger(__name__)
__name__ corresponds to the fully qualified name of the Python module from which this method is called.
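For illustration, here's a small sketch with hypothetical module names; inside a real module you'd pass __name__ rather than a hard-coded string:

```python
import logging

logging.basicConfig(format="%(name)s %(levelname)s %(message)s", level=logging.INFO)

# In myapp/db.py this would be logging.getLogger(__name__).
db_logger = logging.getLogger("myapp.db")
# In myapp/api.py this would be logging.getLogger(__name__).
api_logger = logging.getLogger("myapp.api")

db_logger.info("Connection pool initialized")   # logged under myapp.db
api_logger.info("Request received")             # logged under myapp.api
```

Each log line now carries the originating module's name, so you can trace any message back to its source.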
For example, an NRQL query along the lines of SELECT count(*) FROM Log WHERE action = 'purchase' FACET userId counts the number of 'purchase' actions, broken down by individual user IDs – which is only possible if the action and userId fields are reliably structured in your logs.
Next steps
Learn more about log management with New Relic and log management best practices.
Got lots of logs? Check out our tutorial on how to optimize and manage them.
To try it for yourself, if you’re not already using New Relic, sign up to get free forever access.
The views expressed on this blog are those of the author and do not necessarily reflect the views of New Relic. Any solutions offered by the author are environment-specific and not part of the commercial solutions or support offered by New Relic. Please join us exclusively at the Explorers Hub (discuss.newrelic.com) for questions and support related to this blog post. This blog may contain links to content on third-party sites. By providing such links, New Relic does not adopt, guarantee, approve or endorse the information, views or products available on such sites.