Amazon Elastic Compute Cloud (Amazon EC2) is one of the most popular cloud computing services in use today. Amazon EC2 offers on-demand scalability in the Amazon Web Services (AWS) cloud, the option to launch as many virtual servers as you need, and ways to develop and deploy applications faster.
However, while AWS provides some monitoring capabilities for EC2 with Amazon CloudWatch, such as metrics on CPU, disk usage, and networking, that’s not enough to give you full insight into how your EC2 instances are performing. There are a lot of important metrics that CloudWatch doesn’t provide, including information on processes, configuration, server OS, and application latency and error rates. For complete observability of your AWS infrastructure, you need to supplement CloudWatch with an observability platform like New Relic.
The next image shows a New Relic dashboard displaying Amazon EC2 metrics, including instance types, instances per region, average CPU utilization, and more.
In this article, you’ll learn about:
- How EC2 works
- Monitoring EC2 with CloudWatch
- Monitoring EC2 with an observability platform
- Rightsizing your EC2 instances
- Creating an EC2 monitoring plan
- Why use New Relic to monitor EC2
- How to use New Relic to monitor EC2
How Amazon EC2 works
EC2 allows you to create flexible computing environments called compute instances. Compute instances are secure, virtual AWS-hosted servers.
You create EC2 instances by launching them from an Amazon Machine Image (AMI). An AMI is built with the operating system of your choice, and you can use a single AMI to launch multiple EC2 instances. Each instance can be configured to fit your needs, including its operating system, dependencies, applications, and security add-ons.
Once you’ve created an EC2 instance, you can monitor and scale its resource allocation through the AWS Management Console, the AWS command line interface (CLI), or a third-party interface. Complementary interfaces, like the New Relic EC2 monitoring integration, work with existing AWS frameworks to improve your monitoring experience.
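As a rough illustration of launching instances from an AMI, here is a minimal Python sketch that uses the AWS SDK for Python (boto3) and its EC2 `run_instances` call. The helper names, the placeholder AMI ID, the `t3.micro` default, and the `Name` tag value are assumptions made for this example and are not part of any New Relic or AWS recommendation.

```python
from typing import Any


def build_run_instances_request(
    ami_id: str,
    instance_type: str = "t3.micro",
    count: int = 1,
    name_tag: str = "example-instance",
) -> dict[str, Any]:
    """Return keyword arguments for boto3's EC2 run_instances call."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return {
        "ImageId": ami_id,              # the AMI the instances are launched from
        "InstanceType": instance_type,  # size and kind of instance
        "MinCount": count,
        "MaxCount": count,
        "TagSpecifications": [
            {
                "ResourceType": "instance",
                "Tags": [{"Key": "Name", "Value": name_tag}],
            }
        ],
    }


def launch_instances(request: dict[str, Any]) -> list[str]:
    """Launch the instances and return their IDs (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the request builder works without boto3

    ec2 = boto3.client("ec2")
    response = ec2.run_instances(**request)
    return [instance["InstanceId"] for instance in response["Instances"]]


if __name__ == "__main__":
    # "ami-0123456789abcdef0" is a placeholder, not a real image ID.
    print(build_run_instances_request("ami-0123456789abcdef0", count=2))
```

Keeping the request builder separate from the API call makes it easy to review what will be launched, and how it will be tagged, before anything is created in your account.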
Monitoring Amazon EC2 with CloudWatch
When you set up an EC2 instance, you get basic monitoring with Amazon CloudWatch by default. Basic monitoring reports core metrics at five-minute intervals, including:
- CPU utilization percentage
- Disk usage, including the number and total size of read and write operations
- Network, including both bytes and packets going in and out of an instance on all network interfaces
- CPU credit usage
- Status checks
- Auto Scaling group metrics, which require group metrics to be enabled
There is also an option to upgrade your CloudWatch monitoring from basic to detailed, but it comes at an additional cost. With detailed monitoring, your EC2 data is available at one-minute intervals instead of five-minute intervals. The interval is the only difference between basic and detailed monitoring: you get no additional metrics.
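To make the difference between the two intervals concrete, the following Python sketch builds a CloudWatch query for an instance's average CPU utilization and counts how many datapoints an hour of data would contain at each resolution. It assumes boto3's CloudWatch `get_metric_statistics` call. The helper names and the instance ID are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Optional

BASIC_PERIOD_SECONDS = 300    # basic monitoring: five-minute intervals
DETAILED_PERIOD_SECONDS = 60  # detailed monitoring: one-minute intervals


def build_cpu_query(
    instance_id: str,
    detailed: bool = False,
    hours: int = 1,
    end_time: Optional[datetime] = None,
) -> dict[str, Any]:
    """Return keyword arguments for CloudWatch get_metric_statistics."""
    end = end_time or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": DETAILED_PERIOD_SECONDS if detailed else BASIC_PERIOD_SECONDS,
        "Statistics": ["Average"],
    }


def expected_datapoints(query: dict[str, Any]) -> int:
    """Number of datapoints the query window holds at the query's period."""
    window = (query["EndTime"] - query["StartTime"]).total_seconds()
    return int(window // query["Period"])


def fetch_average_cpu(query: dict[str, Any]) -> list[float]:
    """Run the query and return averages in time order (needs boto3 and credentials)."""
    import boto3  # imported lazily so the helpers above work without boto3

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(**query)
    datapoints = sorted(response["Datapoints"], key=lambda p: p["Timestamp"])
    return [p["Average"] for p in datapoints]


if __name__ == "__main__":
    # "i-0123456789abcdef0" is a placeholder instance ID.
    for detailed in (False, True):
        query = build_cpu_query("i-0123456789abcdef0", detailed=detailed)
        print("detailed" if detailed else "basic", expected_datapoints(query))
```

An hour of basic monitoring yields 12 datapoints per metric, while detailed monitoring yields 60, which is the whole of what the extra cost buys you.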
Monitoring EC2 with an observability platform
If basic CloudWatch monitoring for EC2 is enabled by default, why would you want to monitor your instances with an observability platform like New Relic?
An application consists of many layers, from components that run in the browser to the application logic to the database layer—and the infrastructure, while critical to hosting and running your applications, is only a small part of your system.
CloudWatch metrics are important, but they only tell you what’s happening with your instance hardware—what’s happening on the outside. CloudWatch doesn’t provide any metrics on the applications that the instance is running—you need another tool to understand what’s happening on the inside of your EC2 instances.
For example, you might get metrics from CloudWatch that tell you disk usage or CPU for an instance is high. But you won’t know why resource usage is high unless you monitor what is running on your EC2 instances, which is where an observability platform like New Relic can provide the insight you need.
Observability platforms provide tools like application performance monitoring (APM), real user monitoring, synthetic monitoring, and more. So when you see a spike in the CPU of an instance, or in the total amount of information going through your network, your APM metrics will tell you if there’s a high error rate or increased latency in your requests. You could then drill down into your metrics with distributed tracing to pinpoint the source of the actual issue. Or you might use real user monitoring combined with APM to see that there is an unusually high number of requests. Whether that’s due to a denial of service attack, a spike in user activity, or something else, once again you’ll know exactly why your EC2 instance has a corresponding spike in resource usage.
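The reasoning above, where infrastructure metrics are correlated with application metrics to explain a CPU spike, can be sketched in a few lines of Python. This is only an illustration: the `Snapshot` type, the 90% CPU threshold, and the "twice the baseline" rule are assumptions chosen for the example, not how New Relic or CloudWatch evaluates anything.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    cpu_percent: float          # from infrastructure monitoring
    error_rate: float           # fraction of failed requests, 0.0 to 1.0 (from APM)
    latency_seconds: float      # average response time (from APM)
    requests_per_minute: float  # throughput (from APM or real user monitoring)


def _grew(current: float, baseline: float, ratio: float) -> bool:
    """True when a metric is non-zero and at least `ratio` times its baseline."""
    return current > 0 and current >= ratio * baseline


def explain_cpu_spike(
    current: Snapshot,
    baseline: Snapshot,
    cpu_threshold: float = 90.0,
    ratio: float = 2.0,
) -> list[str]:
    """Return application-level hints for a CPU spike, or [] if there is no spike."""
    if current.cpu_percent < cpu_threshold:
        return []
    hints = []
    if _grew(current.requests_per_minute, baseline.requests_per_minute, ratio):
        hints.append("traffic surge")
    if _grew(current.error_rate, baseline.error_rate, ratio):
        hints.append("elevated error rate")
    if _grew(current.latency_seconds, baseline.latency_seconds, ratio):
        hints.append("increased latency")
    # With no application-level change, the cause is likely on the host itself.
    return hints or ["no application-level change; inspect processes on the host"]


if __name__ == "__main__":
    baseline = Snapshot(cpu_percent=40, error_rate=0.01,
                        latency_seconds=0.2, requests_per_minute=1000)
    current = Snapshot(cpu_percent=95, error_rate=0.01,
                       latency_seconds=0.25, requests_per_minute=2500)
    print(explain_cpu_spike(current, baseline))
```

With CloudWatch alone, only the `cpu_percent` field would be available, so every spike would look the same regardless of its cause.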
There’s also another huge advantage of full stack observability—it gives you the visibility you need to rightsize your instances and ensure you’re using the right kind of EC2 instances for your applications.
Rightsizing your EC2 instances
One of the most important aspects of managing EC2 instances is making sure that you’re using both the right size and the right kind of instance. For example, if you’re using a compute-optimized instance and it’s using very little CPU but a lot of memory, it would make more sense to switch to a memory-optimized instance instead. While CloudWatch can help you determine how much CPU an EC2 instance is using (and how much memory, if you install the CloudWatch agent), it can’t provide a complete picture of what’s actually happening inside the instance.
Let’s return to the example of the compute-optimized instance that’s using a lot of memory. If you only have CloudWatch metrics, you might assume that the solution to the problem is to switch to a memory-optimized instance or to otherwise increase the total memory available to your EC2 instance. But what if the real problem is actually occurring in an application running on the instance? The application might be running unnecessary processes for too long, resulting in high memory usage. In that case, the real solution is to optimize the application itself and then determine infrastructure needs accordingly. Changing the size and kind of the EC2 instance wouldn’t solve the problem, and you’d likely be spending more without taking care of the underlying issue.
On the other hand, sometimes the right solution is to change the size and kind of instance you’re using. With tools like APM and real user monitoring, you might discover that your application is handling a very high number of requests from customers, meaning that you might need to scale up your resources.
When you combine CloudWatch monitoring with other observability tools such as APM, you can confidently optimize both your application and the size and kind of your EC2 instances. Rightsizing is about using just the right amount of resources. If an EC2 instance is oversized, you’re unnecessarily increasing your cloud spend. If it’s undersized, the applications running on it will perform poorly.
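A first-pass rightsizing check can be expressed as a small function over average CPU and memory utilization. The sketch below is a simplified illustration: the function name, the 20% and 80% cutoffs, and the category labels are assumptions, and the memory figure is assumed to come from an agent running on the instance.

```python
def rightsizing_hint(
    avg_cpu_percent: float,
    avg_memory_percent: float,
    low: float = 20.0,
    high: float = 80.0,
) -> str:
    """Classify an instance from its average CPU and memory utilization."""
    cpu_low, cpu_high = avg_cpu_percent < low, avg_cpu_percent > high
    mem_low, mem_high = avg_memory_percent < low, avg_memory_percent > high

    if cpu_low and mem_low:
        return "oversized"      # paying for capacity that sits idle
    if cpu_low and mem_high:
        return "memory-bound"   # a memory-optimized instance may fit better
    if cpu_high and mem_low:
        return "cpu-bound"      # a compute-optimized instance may fit better
    if cpu_high or mem_high:
        return "undersized"     # the workload is close to the instance's limits
    return "right-sized"


if __name__ == "__main__":
    # The compute-optimized instance from the example: little CPU, lots of memory.
    print(rightsizing_hint(avg_cpu_percent=10, avg_memory_percent=90))
```

A hint like "memory-bound" is a prompt to investigate, not a decision. As described above, the application metrics should confirm that the memory usage is legitimate before you change the instance type.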
For more resources on rightsizing and optimizing your cloud spend, see AWS Cloud Cost Optimization.
Creating an EC2 monitoring plan
It’s important to create a monitoring plan to track your EC2 instances. However, your monitoring plan shouldn’t just be limited to infrastructure. Your infrastructure is the foundation for your applications and services. Just as your services are dependent on infrastructure to be successful, measuring the success of your infrastructure—and your applications as a whole—is dependent on monitoring and understanding the big picture.
A well-structured monitoring plan should answer these questions:
- What are your service level objectives (SLOs)? Your EC2 infrastructure is part of the support structure for the most important parts of your business. You need to determine what constitutes success for your applications. For example, one objective might be having an average page load time of under 0.2 seconds, since that would indicate that your application is highly responsive. Once you have SLOs in place, you’ll have concrete goals for monitoring and optimizing your infrastructure.
- Who will do the monitoring? Determine whether to assign monitoring to a department, an individual, or a contractor. With an observability platform like New Relic, you can share monitoring visibility with all stakeholders, including DevOps, site reliability engineers, cloud architects, and more.
- Who should be notified of issues? If problems arise, designate which individual, department, or third party to contact for handling the issue. You’ll need to determine what constitutes an issue. Examples include high CPU utilization (such as 90%), request latency above a preset limit (such as one second), or an increased error rate (such as 5%).
- How will teams be notified? You’ll also benefit from setting up alerts based on these thresholds so your teams are notified as soon as an issue arises. That means determining baselines for your metrics as well as critical thresholds that will trigger alerts. You’ll need to consider which tools you’ll use to trigger and send alerts as well. For example, you could set alert thresholds in New Relic and then configure those alerts to be sent to the right teams on Slack and through a tool like PagerDuty.
- What monitoring tools will you use? Decide whether you need basic or detailed CloudWatch monitoring and which third-party observability tools you’ll use. In addition to monitoring tools, you may want to incorporate alerting tools like PagerDuty or ServiceNow.
- What metrics do you need to track? With observability, you really do want to capture everything. However, you don’t want to be overwhelmed or distracted by noisy signals. Your SLOs can help determine which metrics are most crucial to your business. If you’re using an observability platform like New Relic, you can create custom dashboards to see those crucial metrics in one place.
- What will you report on? If you are using service level agreements, and even if you’re not, you’ll likely want to have regular reports on the performance of your application and infrastructure. Determine the metrics that will go into these reports, who will create these reports (or who will automate them), and who will read these reports.
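The example thresholds and the example SLO from this plan can be written down as data and checked in code. The following Python sketch is only an illustration: the metric names and helper functions are hypothetical, and in practice you would configure conditions like these in your alerting tool rather than hand-roll them.

```python
from typing import Mapping, Sequence

# Example thresholds from the plan above: 90% CPU, one-second latency, 5% errors.
DEFAULT_THRESHOLDS: Mapping[str, float] = {
    "cpu_percent": 90.0,
    "latency_seconds": 1.0,
    "error_rate": 0.05,
}


def find_violations(
    metrics: Mapping[str, float],
    thresholds: Mapping[str, float] = DEFAULT_THRESHOLDS,
) -> list[str]:
    """Return the names of metrics that exceed their threshold, sorted by name."""
    return sorted(
        name
        for name, limit in thresholds.items()
        if metrics.get(name, 0.0) > limit  # missing metrics are treated as healthy
    )


def meets_page_load_slo(
    load_times_seconds: Sequence[float],
    target_seconds: float = 0.2,
) -> bool:
    """True when the average page load time is under the SLO target."""
    if not load_times_seconds:
        raise ValueError("at least one measurement is required")
    return sum(load_times_seconds) / len(load_times_seconds) < target_seconds


if __name__ == "__main__":
    sample = {"cpu_percent": 95.0, "latency_seconds": 0.4, "error_rate": 0.07}
    print(find_violations(sample))
    print(meets_page_load_slo([0.1, 0.2, 0.15]))
```

Writing thresholds down as explicit values, whatever tool ends up enforcing them, gives everyone named in the plan the same definition of an issue.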
Why use New Relic to monitor EC2
An observability platform gives you insights into what’s happening in your EC2 instances, helping you troubleshoot not just your instances but your application as a whole when issues come up. It’s also helpful in choosing the right size and kind of EC2 instance for your services. Here are just a few ways New Relic gives you full observability into every part of your application, including your infrastructure and EC2 instances.
- Automatically instrument EC2 instances when they’re launched. New Relic provides an AWS integration that automatically adds the infrastructure agent to your EC2 instances. You don’t need to worry about manually configuring each EC2 instance.
- Get recommendations for rightsizing your instances and saving on your cloud spend. New Relic includes Cloud Optimize, an open source application that analyzes your environment and provides cost and performance metrics for each service as well as recommendations for cost optimization. The next image shows a dashboard that uses Cloud Optimize.
- Monitor your AWS services alongside the rest of your infrastructure. The New Relic EC2 integration uses the ec2:Describe* policy to add data about your EC2 instances to your standard infrastructure data in New Relic. Infrastructure also imports Amazon EC2 custom tags and adds them to your data. These tags help separate and filter your EC2 data from other infrastructure data in New Relic as needed while also giving you the ability to look at all your infrastructure metrics in one place.
- Monitor your infrastructure and application data without context switching. With the Amazon CloudWatch Metric Streams integration, you can send your Amazon CloudWatch data to New Relic so you can visualize and analyze your EC2 infrastructure metrics alongside other application metrics without switching between tools.
- Use APM to monitor applications running within your EC2 instances. Application performance monitoring gives you critical insights into how your applications are performing.
- Manage service level indicators (SLIs) and service level objectives (SLOs). When you set up SLIs and SLOs in New Relic, New Relic automatically provides recommendations to establish your baselines. You can then track your goals in a dashboard like the one shown in the next image.
- Alert on critical issues and easily notify your teams. Set up alerts that include intelligent anomaly detection and easily alert your teams with integrations to popular tools like Slack, Microsoft Teams, PagerDuty, and ServiceNow.
These are just a few of the benefits that the New Relic platform provides. Depending on what your EC2 instances are running, you can also benefit from mobile monitoring, real user monitoring, Kubernetes monitoring, and more.
How to use New Relic to monitor EC2
To get full observability into how your EC2 instances are performing, you’ll need to set up two integrations. You’ll also benefit from installing the AWS EC2 quickstart, which gives you a prebuilt dashboard and an alert for when CPU exceeds 90%.
New Relic Infrastructure Amazon EC2 Integration on AWS
This integration installs the New Relic infrastructure agent, a lightweight executable file, on your EC2 instances. New EC2 instances are automatically instrumented with the agent when they’re launched. You can use the guided install of the infrastructure agent to automatically find infrastructure in your environment that should be instrumented. You can also instrument each of your EC2 instances individually in New Relic with this install option.
See the New Relic Infrastructure Amazon EC2 Integration on AWS documentation to set up this integration through AWS.
Amazon CloudWatch Metric Streams integration
The Amazon CloudWatch Metric Streams integration sends your CloudWatch metrics to New Relic via a Kinesis Data Firehose. While you can view all these metrics in AWS, it’s helpful to have all your application and infrastructure metrics in one place. This also allows you to use New Relic features such as alerts and custom dashboards for your CloudWatch metrics.
The setup steps are described in detail in the Amazon CloudWatch Metric Streams integration documentation.
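As a rough illustration of the AWS side of this setup, a metric stream that forwards EC2 metrics to a Kinesis Data Firehose delivery stream could be created with boto3's CloudWatch `put_metric_stream` call, along these lines. The helper names, the stream name, and both ARNs are placeholders, and the `opentelemetry0.7` output format is an assumption you should check against the integration documentation. The Firehose delivery stream and IAM role are assumed to already exist.

```python
from typing import Any, Sequence


def build_metric_stream_request(
    name: str,
    firehose_arn: str,
    role_arn: str,
    namespaces: Sequence[str] = ("AWS/EC2",),
    output_format: str = "opentelemetry0.7",  # assumption: confirm in the docs
) -> dict[str, Any]:
    """Return keyword arguments for CloudWatch put_metric_stream."""
    return {
        "Name": name,
        "FirehoseArn": firehose_arn,  # delivery stream that forwards the metrics
        "RoleArn": role_arn,          # role CloudWatch assumes to write to Firehose
        "OutputFormat": output_format,
        # Only stream the listed namespaces instead of every AWS metric.
        "IncludeFilters": [{"Namespace": ns} for ns in namespaces],
    }


def create_metric_stream(request: dict[str, Any]) -> str:
    """Create the stream and return its ARN (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the request builder works without boto3

    cloudwatch = boto3.client("cloudwatch")
    return cloudwatch.put_metric_stream(**request)["Arn"]


if __name__ == "__main__":
    # Both ARNs below are placeholders, not real resources.
    print(build_metric_stream_request(
        name="ec2-metrics-example",
        firehose_arn="arn:aws:firehose:us-east-1:123456789012:deliverystream/example",
        role_arn="arn:aws:iam::123456789012:role/example-metric-stream-role",
    ))
```

Limiting the stream to the namespaces you need keeps the volume of streamed metrics, and the associated AWS cost, under control.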
Polling metrics outside of Amazon CloudWatch
Some AWS services, such as AWS Billing, don’t send data through Amazon CloudWatch. These AWS metrics also aren’t accessible to the infrastructure agent because the agent can only see what's happening in your EC2 instances. If you want to send data from services like AWS Billing, you’ll need to use API polling. See the full list of integrations, including ones that require polling.
Amazon EC2 quickstart
This quickstart automatically creates a New Relic dashboard with your CloudWatch metrics. It also automatically creates an alert that triggers when CPU exceeds 90%. To install this quickstart, you need to first set up the Amazon CloudWatch Metric Streams integration.