Container technology—popularized by Docker and Kubernetes—has profoundly changed the way many development and operations teams test and deploy their applications. Containers help companies modernize by making it easier to scale and deploy applications while creating a positive impact on IT costs. However, this also introduces new challenges and more complexity as containers and container orchestration create an entirely new infrastructure ecosystem. Running containerized environments requires teams to rethink and adapt their monitoring strategies to take into account all of the new layers introduced in this ecosystem.
At the end of the day, DevOps teams are charged with delivering high-quality software that delivers a great customer experience unaffected by any changes to your platforms, tools, languages, or frameworks. Customers don’t care if you’re using traditional virtual machines, a bleeding-edge multi-cloud federated Kubernetes cluster, or a home-grown artisanal orchestration layer—they just want the applications and services you provide to be reliable, available, and fast. Tools like Docker and Kubernetes are useful only insofar as they make that possible.
That’s why New Relic is optimized for monitoring applications even in containerized environments. Clay Smith’s recent blog post on Monitoring Application Performance in Kubernetes offers an application-centric approach to APM in an containerized environment. But since you need visibility into every layer of your environment, this post extends that understanding to monitoring your complete Docker/Kubernetes infrastructure.
New Relic Infrastructure automatically collects metrics for Docker containers running on hosts that have the agent installed on them. Once you deploy the Infrastructure agent, New Relic can automatically monitor the processes running inside containers on a host. The Infrastructure agent automatically imports all the labels associated with the containers and allows you to filter and group by the metadata associated with those labels.
But that’s just the start of what you can do. New Relic Infrastructure lets you filter down to see all the processes running inside a container using the “contained” attribute. From there, you can click on “Filter Processes” and “contained” and then select “true” to see all the processes running on your containers, as shown here:
Additionally, you can see which container is running what process in order to pinpoint the CPU, memory, and I/O used by that process within the container. Since container image is a unique key to differentiate between containers. you can also group apps by container image to see all the processes running within a specific container. The annotated screenshot below is a cheat sheet for the navigation in New Relic Infrastructure:
Monitoring containers in Kubernetes
Remember that containers can move between hosts when they’re redeployed. As noted earlier, in an environment using orchestration tools like Kubernetes, your containers will often do just that. They’ll be on one host during one deploy and then they’ll move to a different set of hosts during another deploy. You need some way to keep track of where these things are actually running at any given moment, and to make sure that you’re monitoring and gathering the right sets of data from that.
In Kubernetes, you define the amount of CPU and memory the container needs to run properly. Since containers consume CPU, memory, I/O, and network resources, it’s important to track how close things like CPU usage and memory consumption come to the limits you’ve configured. New Relic’s Kubernetes integration uses this information to provide a snapshot of your containers’ resource utilization:
Monitoring system resources helps ensure that your clusters and applications remain healthy. If you don’t have enough capacity to meet the minimum-resource requirements of all your containers, you should scale up your nodes’ capacity or add more nodes to distribute the workload. Here is what to look for:
- A container approaching its memory limit: If this happens often for containers in the same deployment, it means that the limit is not set correctly or that there is a problem with the application.
- A container exceeds its memory request: That container will be among the first to be evicted if the node runs out of memory.
- A container exceeds its CPU limits: This could affect performance, since Kubernetes could limit the amount of CPU the application can access. New Relic automatically tells you when your CPU usage approaches or exceeds the limit.
It’s also important to monitor Container Restarts. If there are not enough resources available or a cluster is not set up correctly, containers could begin restarting continuously, getting stuck in what’s called a “crash loop backoff.” You can see Container Restarts in New Relic’s Kubernetes dashboard, warning you that you need to address the issue.
See inside your containers
In order to fully understand what’s going on in your environment, you need to see into all of the layers in your dynamic environment, including inside your containers. That means holistic, application-centric and infrastructure-centric monitoring.
The New Relic platform is designed to provide insight into all layers of the container stack, from applications to services, infrastructure, and customer experience. Using New Relic to take advantage of the power of container orchestration is essential for modern software companies to move faster with confidence.