
Security incidents and cyberattacks cost companies billions each year. These incidents can be very challenging to detect in complex distributed and microservice-based applications.

One critical component of operational security is the ability to detect cyberattacks in running applications. However, building and maintaining secure applications before attacks occur offers even more value by minimizing the available vulnerabilities attackers can exploit.

Observability plays a crucial role in bridging development and production insights. Because it shows you the state of your applications at any time, it helps you understand an application's attack surface and detect and prevent security vulnerabilities before they become an issue. For example, New Relic vulnerability management lets you visualize security vulnerabilities in your source code, including those in external dependencies.

While vulnerability management is a key part of evaluating security issues in your application, this post covers another important part of observability: how MELT (metrics, events, logs, and traces) can provide security insights into your application.

Here's how you can use MELT for security observability, along with four common security vulnerabilities that observability can help you proactively manage.

Using MELT for security observability

To understand how you can use observability to find and patch specific vulnerabilities, it helps to understand MELT and how metrics, events, logs, and traces can help you proactively address potential security issues. You can also do a deeper dive into MELT through New Relic University’s Fundamentals course.

Events

Events are the foundation for metrics, so let’s start there. Events are distinct actions that occur at a specific moment in time. They carry more detail than metrics, which makes them more data-intensive but also useful for detecting attacks. Here’s an example of an event, written in human-readable language, where a malicious user attempts to bypass authentication and log in with admin privileges using a basic SQL injection:

At 3:13am on 7/22/2021, user submitted log in request with the user name: Robert'); SELECT * FROM users WHERE username = 'admin'--' AND password = ''. Login request failed.

Actual telemetry data for an event wouldn’t be formatted as a sentence. Instead, it consists of data points such as Timestamp and EventType.
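
As a rough illustration of what that structured form could look like, here is a minimal Python sketch of the login event above. The field names (eventType, timestamp, username, succeeded), the LoginAttempt type, and the choice of UTC are assumptions made for this example, not a schema taken from any particular platform.

```python
import json
from datetime import datetime, timezone


def build_login_event(username: str, succeeded: bool, when: datetime) -> dict:
    """Build a hypothetical login event as a flat set of data points."""
    return {
        "eventType": "LoginAttempt",                 # illustrative event type
        "timestamp": int(when.timestamp() * 1000),   # epoch milliseconds
        "username": username,
        "succeeded": succeeded,
    }


# The failed login from the example sentence (time zone assumed to be UTC).
event = build_login_event(
    username="Robert'); SELECT * FROM users WHERE username = 'admin'--' AND password = ''",
    succeeded=False,
    when=datetime(2021, 7, 22, 3, 13, tzinfo=timezone.utc),
)

print(json.dumps(event, indent=2))
```

Storing the raw submitted user name as an attribute is what makes it possible to query for injection patterns later.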

Events like this give you detailed information about something specific that occurred in your application so you can respond better and keep your software secure. Because events are more data-intensive, they are most useful when data sets are small.

Metrics

Metrics are aggregated counts of different kinds of events over a specific period of time. For instance, you might want to know how many form submissions in your application included the words UPDATE, DELETE, INSERT, or TRUNCATE over the last 24 hours, since a spike in this kind of activity is a strong sign that one or more users are attempting SQL injection attacks.

While events are irregular (they may or may not be happening at any given time), metrics are always reported. You might have zero SQL injection attempts in the last 24 hours (no events), but that zero would still show up as a metric value.
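
The 24-hour count described above can be sketched in a few lines of Python. This is a minimal example under stated assumptions: the function name, the (timestamp, text) input shape, and the sample submissions are all hypothetical, and a plain keyword match like this will also flag harmless text that happens to contain those words.

```python
import re
from datetime import datetime, timedelta

# The keywords named in the text, matched as whole words in any letter case.
SQL_KEYWORDS = re.compile(r"\b(UPDATE|DELETE|INSERT|TRUNCATE)\b", re.IGNORECASE)


def count_suspicious_submissions(submissions, now, window=timedelta(hours=24)):
    """Count submissions inside the window whose text contains a SQL keyword.

    `submissions` is an iterable of (timestamp, text) pairs.
    """
    cutoff = now - window
    return sum(
        1
        for ts, text in submissions
        if cutoff <= ts <= now and SQL_KEYWORDS.search(text)
    )


now = datetime(2021, 7, 22, 12, 0)
submissions = [
    (now - timedelta(hours=1), "'; DELETE FROM users; --"),       # counted
    (now - timedelta(hours=2), "hello world"),                     # no keyword
    (now - timedelta(hours=30), "'; TRUNCATE TABLE orders; --"),   # too old
]

print(count_suspicious_submissions(submissions, now))
print(count_suspicious_submissions([], now))  # no events still yields a value: 0
```

The second call reflects the point made above: with no matching events in the window, the metric is still reported, with a value of zero.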

Metrics are very useful for large amounts of data where you know what you are looking for. If you are on the lookout for denial-of-service (DoS) attacks, tracking the number of pings your application is receiving might provide some clues. A sudden spike in pings might be an indicator that a DoS attack is happening.

Metrics are most useful for security observability when we track them over a long period. This information can then show trends and patterns that help us better pinpoint anomalies to address possible security incidents.

Logs

Logs are similar to events, but they are much more granular, recording the details of every action and activity within an application or microservice. A single event might be broken down into many lines in a log, giving you a detailed play-by-play of something that happened. With logs, you have granular information that can help you analyze what happened. Your logs might provide an indication of a brute force attack coming from a botnet with many different IPs, and later, after you’ve (hopefully) repelled the attack, you can retroactively search your logs and thoroughly block any IPs involved.
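
To make the retroactive log search concrete, here is a small Python sketch that tallies failed logins per source IP. The log line format, the function names, and the threshold of three failures are assumptions for illustration only; real logs will need their own parsing rules. The sample addresses come from IP ranges reserved for documentation.

```python
import re
from collections import Counter

# Hypothetical log format: "<timestamp> <ip> LOGIN_FAILED|LOGIN_OK user=<name>"
LOG_LINE = re.compile(
    r"^(?P<ts>\S+) (?P<ip>\d{1,3}(?:\.\d{1,3}){3}) "
    r"LOGIN_(?P<result>FAILED|OK) user=(?P<user>\S+)$"
)


def failed_logins_by_ip(lines):
    """Return a Counter mapping each source IP to its number of failed logins."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("result") == "FAILED":
            counts[match.group("ip")] += 1
    return counts


def ips_to_review(lines, min_failures=3):
    """List IPs with at least `min_failures` failed logins, sorted as strings."""
    counts = failed_logins_by_ip(lines)
    return sorted(ip for ip, n in counts.items() if n >= min_failures)


sample_log = [
    "2021-07-22T03:13:01Z 203.0.113.7 LOGIN_FAILED user=admin",
    "2021-07-22T03:13:02Z 203.0.113.7 LOGIN_FAILED user=admin",
    "2021-07-22T03:13:03Z 203.0.113.7 LOGIN_FAILED user=root",
    "2021-07-22T03:14:10Z 198.51.100.4 LOGIN_OK user=alice",
    "2021-07-22T03:15:42Z 198.51.100.9 LOGIN_FAILED user=bob",
]

print(ips_to_review(sample_log))
```

A list like this is a starting point for review rather than an automatic block list, since a single mistyped password also produces a failed login.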

Activity logs provide details when investigating a security issue or vulnerability. But using just logs to recreate security incidents or issues is challenging. Implementing an analyzer alongside a log collector allows you to filter and interpret raw logs instead of sifting through them manually. There are also observability platforms like New Relic that allow you to analyze your logs and extract important information from them so you can identify issues faster.

Traces

Traces, also known as distributed traces, connect a series of events. For example, let’s say you have an event that indicates a user logged into your system as an admin. In itself, that event isn’t anomalous or unusual, but later, you might have an event that indicates a user engaged in suspicious activity. In order to determine how these events are connected and get to the root cause of how a malicious user breached the system, you need traces.

By following different requests through your systems, you can better understand why issues and possible breaches are happening. You can use these traces to see where requests start, if they go through security checks, if they call the components you expect, and if all parts are performing as intended.
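
One way to picture the "does this request go through security checks" question is the Python sketch below, which walks the spans of a trace in start order. The Span fields, the span names (auth.verify_token, admin.delete_user), and the helper function are hypothetical and chosen only for this example; they do not correspond to any specific tracing library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """A simplified, hypothetical span record within a distributed trace."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    name: str
    start_ms: int


def passed_security_check(spans, trace_id,
                          check_name="auth.verify_token",
                          protected_name="admin.delete_user"):
    """Return True if no protected span in the trace starts before a check span."""
    in_trace = sorted(
        (s for s in spans if s.trace_id == trace_id),
        key=lambda s: s.start_ms,
    )
    check_seen = False
    for span in in_trace:
        if span.name == check_name:
            check_seen = True
        elif span.name == protected_name and not check_seen:
            return False
    return True


spans = [
    # Trace t1: the request is verified before the protected component runs.
    Span("t1", "a", None, "http.request", 0),
    Span("t1", "b", "a", "auth.verify_token", 5),
    Span("t1", "c", "a", "admin.delete_user", 12),
    # Trace t2: the protected component runs with no verification span.
    Span("t2", "d", None, "http.request", 0),
    Span("t2", "e", "d", "admin.delete_user", 7),
]

print(passed_security_check(spans, "t1"))
print(passed_security_check(spans, "t2"))
```

A check like this is only as good as the instrumentation: it assumes every security check emits its own span.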

Visualizations

How do you take huge amounts of data and bring them all together into a format that your DevOps, InfoSec, and Ops teams can understand? Observability platforms like New Relic provide dashboards and visualizations that show you when and where anomalies are occurring so you can drill down further and get to the root cause of the problem. They also aggregate security signals from various tools in the context of the entire stack, which supports a unified, comprehensive approach to vulnerability management.

Visualizations help you identify normal and abnormal patterns in a system’s behavior. Seeing information in this way is more straightforward than combing through large amounts of data. These patterns can help you quickly find security vulnerabilities, cyberattacks, and bugs within your applications and systems.

Artificial intelligence and machine learning

Your application likely has ebbs and flows. A large number of authentication attempts might be completely normal for your application in the morning but highly unusual in the middle of the night. That means you need different thresholds for different metrics at different times of the day. It’s like having a smoke alarm that’s extremely sensitive at certain times (why is there any smoke at midnight?) but a little less sensitive when you make breakfast at the same time every morning. Machine learning and artificial intelligence tools can establish dynamic thresholds by observing patterns over time. Machine learning tools allow you to respond to emerging patterns quickly and proactively deal with cybersecurity incidents before they cause bigger problems.
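
As a simplified stand-in for the idea of dynamic thresholds, the Python sketch below derives a separate threshold for each hour of the day from historical counts, using the mean plus k standard deviations. This is a basic statistical baseline and not how any particular machine learning product works; the function names, the value of k, and the sample history are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def hourly_thresholds(history, k=3.0):
    """Compute one threshold per hour of day: mean + k * standard deviation.

    `history` is an iterable of (hour_of_day, auth_attempts) observations.
    """
    by_hour = defaultdict(list)
    for hour, count in history:
        by_hour[hour].append(count)
    return {
        hour: mean(counts) + k * pstdev(counts)
        for hour, counts in by_hour.items()
    }


def is_anomalous(hour, count, thresholds):
    """Flag a count that exceeds the learned threshold for that hour."""
    return hour in thresholds and count > thresholds[hour]


# Hypothetical history: busy at 9:00, quiet at 3:00.
history = [(9, 480), (9, 500), (9, 520), (3, 4), (3, 5), (3, 6)]
thresholds = hourly_thresholds(history)

print(is_anomalous(9, 510, thresholds))  # a normal morning
print(is_anomalous(3, 60, thresholds))   # the same kind of traffic at night
print(is_anomalous(9, 60, thresholds))   # 60 attempts is unremarkable at 9:00
```

The same count of 60 attempts is flagged at 3:00 but not at 9:00, which is the smoke alarm behavior described above.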

Finding and patching security vulnerabilities with observability

Metrics, events, logs, and traces are the building blocks for observability, and at this point, it should be clear why you need each element to achieve security observability, too. Metrics give you a high-level view of patterns in your systems, allowing you to proactively drill down and address security issues. Events and logs give you the granular detail you need to understand what is happening, while traces allow you to connect the dots and understand why a breach occurred. In order to proactively monitor security in your application, you need all of these pieces working together.

Here are four common security risks and how increased observability can help you address them.

Authentication vulnerabilities

Broken authentication mechanisms are an attractive target for attackers. Distributed tracing, along with logging, can help you track down authentication issues.

Proactively, you can run through different scenarios within your application with distributed tracing enabled, testing each step in the process to ensure that authentication components work as intended. Reactively, you can watch as users navigate through your application’s features, using detailed logging to ensure this navigation does not bypass authentication components.

Access control vulnerabilities

Access control vulnerabilities are closely related to authentication vulnerabilities. Robust access control mechanisms ensure that authenticated users only have access to the components that they have permission to use.

Using distributed tracing, you can ensure your application consistently and completely applies access controls to components after authentication. Additionally, detailed logging can provide in-depth information on each part of the access control process.

Brute force attacks

Brute force attacks are all too common. They are easy to carry out, and many free brute force tools are available to attackers. Brute force attacks can lead to serious breaches as well as performance-related issues in your applications and systems.

Simple brute force attacks try different username and password combinations as fast as possible to gain access to an application using valid accounts. More sophisticated attacks spread this traffic over extended periods of time and across different source locations.

For the first attack style, performance metrics can give you key context when your systems are being attacked. You will likely see a spike in login attempts, potentially from the same IP, and if you use a tool like New Relic, you will likely have dashboards that display invalid login attempts, system load, and traffic patterns, allowing you to pinpoint and address attacks quickly.
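
For the simple attack style, the "spike in login attempts from the same IP" signal can be approximated with a sliding window counter. The Python sketch below is a minimal example; the class name, the limits, and the use of plain numeric timestamps in seconds are assumptions for illustration, not part of any monitoring product.

```python
from collections import defaultdict, deque


class FailedLoginMonitor:
    """Track failed logins per IP within a sliding time window."""

    def __init__(self, max_failures=5, window_seconds=60):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)  # ip -> timestamps of recent failures

    def record_failure(self, ip, timestamp):
        """Record a failure; return True if the IP now exceeds the limit."""
        window = self._failures[ip]
        window.append(timestamp)
        # Drop failures that have aged out of the window.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        return len(window) > self.max_failures


monitor = FailedLoginMonitor(max_failures=3, window_seconds=10)

# Four failures within four seconds from one (documentation-range) address.
results = [monitor.record_failure("203.0.113.7", t) for t in (0, 1, 2, 3)]
print(results)

# A lone failure much later does not trip the alert.
later = monitor.record_failure("203.0.113.7", 100)
print(later)
```

A per-IP window like this will miss the slower, distributed attacks described above, which is where longer-term metrics and anomaly detection come in.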

More sophisticated attempts can be harder to track. By looking at metrics and visualizations over a longer period of time, and by incorporating machine learning tools such as New Relic’s Applied Intelligence, you can see more subtle anomalies in your telemetry data.

External dependency vulnerabilities

While we can test and fix our code within our systems, sometimes external dependencies can be attack vectors for bad actors. It’s hard enough to vet your own software. When you take into account all the other libraries and dependencies your application uses, it becomes even more challenging to ensure your systems are secure.

By logging and visualizing each service and component, you can observe how requests are flowing through your application and quickly pinpoint weaknesses in the system.

Incorporate security observability into your systems

Building observability into your systems helps you understand how your application is working and how users interact with it. From a security perspective, observability gives you visibility into both proactive security actions and recommendations and reactive security investigations, so you can better understand vulnerabilities in your applications.

Observability can help you detect authentication vulnerabilities, access control vulnerabilities, brute force attacks, and external dependency vulnerabilities, to name a few. You can then patch these vulnerabilities or take other action to protect your systems before hackers find their way in.