Apache Kafka is a popular distributed streaming platform that thousands of companies like New Relic, Uber, and Square use to build scalable, high-throughput, and reliable real-time streaming systems. But managing such a platform is no easy feat. Amazon Managed Streaming for Apache Kafka (MSK) abstracts away the management of Kafka so you don’t have to worry about maintaining your own data streaming pipeline.
Amazon MSK exposes metrics in a Prometheus-compatible format. Because the New Relic Prometheus OpenMetrics integration collects metrics from any Prometheus-compatible endpoint, you can send MSK metrics to the New Relic One platform.
By collecting Amazon MSK metrics in New Relic One, you’ll be able to combine that data with agent-based APM and Infrastructure data; log data from your applications and hosts; and other third-party telemetry data like distributed traces to create an entity-centric system of record. You can then use this combined data to build dashboard charts and set alerts, with the aim of creating observability within your entire application stack.
In this post, we’ll explain how to collect and use Amazon MSK metrics in New Relic.
Step 1: set up an Amazon MSK cluster
To set up a new Amazon MSK cluster, follow the steps in the Amazon MSK getting started guide.
You can use the following definition file (in JSON) when setting up your cluster:
clusterinfo.json

    {
      "BrokerNodeGroupInfo": {
        "InstanceType": "kafka.m5.large",
        "ClientSubnets": ["subnet-1", "subnet-2", "subnet-3"],
        "SecurityGroups": ["sg-1"]
      },
      "EncryptionInfo": {
        "EncryptionInTransit": {
          "InCluster": false,
          "ClientBroker": "PLAINTEXT"
        }
      },
      "ClusterName": "PrometheusTest",
      "EnhancedMonitoring": "PER_TOPIC_PER_BROKER",
      "KafkaVersion": "2.2.1",
      "NumberOfBrokerNodes": 3,
      "OpenMonitoring": {
        "Prometheus": {
          "JmxExporter": { "EnabledInBroker": true },
          "NodeExporter": { "EnabledInBroker": true }
        }
      }
    }
- Make sure your security group has rules that allow access to the Prometheus metrics on ports 11001 and 11002. (For details on managing security groups, refer to the AWS documentation.)
- Next, discover the DNS name of your Kafka node:
aws kafka list-nodes --cluster-arn "arn:<cluster ARN>"
The result should be similar to:
    "NodeInfoList": [
      {
        "AddedToClusterTime": "2019-11-28T07:28:30.421Z",
        "BrokerNodeInfo": {
          "AttachedENIId": "eni-XXX",
          "BrokerId": "2",
          "ClientSubnet": "subnet-2",
          "ClientVpcIpAddress": "172.31.1.2",
          "CurrentBrokerSoftwareInfo": {
            "KafkaVersion": "2.2.1"
          },
          "Endpoints": [
            "b-2.prometheustest.XXXX.kafka.us-east-2.amazonaws.com"
          ]
        },
        "InstanceType": "m5.large",
        "NodeARN": "arn:aws:kafka:us-east-2:XXX",
        "NodeType": "BROKER"
      },
- Choose the node that is in your subnet, and note its endpoint (the broker DNS name). You’ll need it to configure the New Relic Prometheus OpenMetrics integration.
- Confirm you can access the Prometheus endpoints:
    curl <BrokerDNS>:11001/metrics
    curl <BrokerDNS>:11002/metrics
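If your cluster has several brokers, picking endpoints out of the list-nodes output by hand gets tedious. The following is a minimal Python sketch, not part of the original setup, that reads that JSON and builds the metrics URLs for ports 11001 and 11002. The function name `metrics_urls` and the sample string are hypothetical; the sample only mirrors the fields shown in the output above.

```python
import json
from typing import List, Optional

# Ports used for the Prometheus endpoints in this post.
METRICS_PORTS = (11001, 11002)

# Trimmed-down sample shaped like the `aws kafka list-nodes` output above.
SAMPLE_OUTPUT = """
{
  "NodeInfoList": [
    {
      "BrokerNodeInfo": {
        "BrokerId": "2",
        "ClientSubnet": "subnet-2",
        "Endpoints": ["b-2.prometheustest.XXXX.kafka.us-east-2.amazonaws.com"]
      },
      "NodeType": "BROKER"
    }
  ]
}
"""


def metrics_urls(list_nodes_json: str, subnet: Optional[str] = None) -> List[str]:
    """Build http://<endpoint>:<port>/metrics URLs for broker nodes.

    If `subnet` is given, only brokers whose ClientSubnet matches are kept.
    """
    nodes = json.loads(list_nodes_json).get("NodeInfoList", [])
    urls = []
    for node in nodes:
        broker = node.get("BrokerNodeInfo")
        if broker is None:
            continue  # skip entries without broker details
        if subnet is not None and broker.get("ClientSubnet") != subnet:
            continue
        for endpoint in broker.get("Endpoints", []):
            for port in METRICS_PORTS:
                urls.append(f"http://{endpoint}:{port}/metrics")
    return urls


for url in metrics_urls(SAMPLE_OUTPUT, subnet="subnet-2"):
    print(url)
```

To use it with a real cluster, save the output of the `aws kafka list-nodes` command to a file and pass the file contents to `metrics_urls` in place of the sample string.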
Step 2: set up the New Relic Prometheus OpenMetrics integration
If you haven’t already done so, create an EC2 instance in the same Amazon Virtual Private Cloud (VPC) as your MSK cluster.
Next, deploy the New Relic Prometheus OpenMetrics integration:
- Create a configuration file (config.yaml), or use our example configuration file. Important: Be sure to change the cluster_name setting.
- Configure the Amazon MSK endpoints targets in the configuration file:
    targets:
      - description: MSK
        urls: ["http://<BrokerDNS>:11001/metrics", "http://<BrokerDNS>:11002/metrics"]
- Add the http://localhost:8080/metrics endpoint to collect metrics about the integration itself.
- Start the integration with the following command:
    docker run -d --restart unless-stopped \
      --name nri-prometheus \
      -e LICENSE_KEY="YOUR_LICENSE_KEY" \
      -v "$(pwd)/config.yaml:/config.yaml" \
      newrelic/nri-prometheus:1.2
Important: Replace YOUR_LICENSE_KEY with your New Relic license key (required).
- Confirm the container is running properly:
docker ps -f "name=nri-prometheus"
For more details about configuration options and using the Prometheus OpenMetrics integration, see the New Relic documentation.
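If you monitor more than one broker, you may prefer to generate config.yaml rather than edit it by hand. Here is a minimal Python sketch that renders only the settings mentioned in this post (cluster_name, targets, description, and urls). The function name `render_config`, the second target's description text, and the broker name in the example are hypothetical; check the New Relic documentation for the full set of options.

```python
from typing import Iterable

# Ports used for the Prometheus endpoints in this post.
METRICS_PORTS = (11001, 11002)


def render_config(cluster_name: str, broker_dns_names: Iterable[str]) -> str:
    """Render a small config.yaml with one MSK target and one self-metrics target."""
    lines = [
        f'cluster_name: "{cluster_name}"',
        "targets:",
        "  - description: MSK",
        "    urls:",
    ]
    for dns_name in broker_dns_names:
        for port in METRICS_PORTS:
            lines.append(f'      - "http://{dns_name}:{port}/metrics"')
    # Endpoint that reports metrics about the integration itself.
    lines += [
        "  - description: Integration self-metrics",
        "    urls:",
        '      - "http://localhost:8080/metrics"',
    ]
    return "\n".join(lines) + "\n"


# Hypothetical broker name; replace it with your own <BrokerDNS> value.
print(render_config("PrometheusTest", ["b-2.example.internal"]))
```

Write the returned text to config.yaml in the directory where you run the docker command, since that command mounts `$(pwd)/config.yaml` into the container.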
Using your Amazon MSK metrics data in New Relic
After you get the integration running, it will immediately start sending Amazon MSK metrics to New Relic. The following examples show how to use these metrics.
Example 1: monitor the filesystem on Kafka nodes
Kafka’s offset retention policy can cause the disks on your nodes to fill up. Monitor the disk usage to ensure your nodes are healthy.
In the New Relic One chart builder, select the metric node_filesystem_avail_bytes, and in the Facet by field, select the device name to see the available file system space per device in your cluster.
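To sanity-check what the chart shows, you can read the same metric straight from a broker's Prometheus endpoint. The sketch below is a simplified Python parser for the Prometheus text format, not a full implementation, that pulls out node_filesystem_avail_bytes per device. The function name `avail_bytes_by_device`, the device names, and the mount points in the sample are made up for illustration.

```python
import re
from typing import Dict

# One sample line: metric name, optional {labels}, then the value.
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
# One label pair such as device="/dev/xvda1".
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')

# Illustrative scrape output; real device names and values will differ.
SAMPLE_METRICS = """\
# HELP node_filesystem_avail_bytes Filesystem space available to non-root users in bytes.
# TYPE node_filesystem_avail_bytes gauge
node_filesystem_avail_bytes{device="/dev/nvme0n1p1",fstype="xfs",mountpoint="/"} 5e+09
node_filesystem_avail_bytes{device="/dev/nvme1n1",fstype="xfs",mountpoint="/data"} 2.5e+11
node_filesystem_size_bytes{device="/dev/nvme1n1",fstype="xfs",mountpoint="/data"} 1e+12
"""


def avail_bytes_by_device(metrics_text: str) -> Dict[str, float]:
    """Return available bytes keyed by the `device` label."""
    result: Dict[str, float] = {}
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None or match.group("name") != "node_filesystem_avail_bytes":
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        device = labels.get("device", "unknown")
        value = float(match.group("value"))
        # A device can be mounted more than once; keep the smallest value seen.
        result[device] = min(value, result.get(device, value))
    return result


for device, avail in avail_bytes_by_device(SAMPLE_METRICS).items():
    print(f"{device}: {avail / 1e9:.1f} GB available")
```

You could feed it the text returned by the curl check from step 1 in place of the sample.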
Example 2: monitor the producer request rate
To ensure a healthy pipeline, you can track the average number of producer requests sent per second.
In chart builder, select the metric kafka_server_BrokerTopicMetrics_Count and filter (narrow to) the name TotalProduceRequestsPerSec.
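The arithmetic behind a per-second rate is worth spelling out. Assuming the `_Count` value is a cumulative counter (as the JMX Count attribute of a Kafka meter typically is), the rate between two scrapes is the change in count divided by the elapsed seconds. The Python sketch below shows that calculation; the function name `per_second_rate` and the sample numbers are hypothetical.

```python
def per_second_rate(prev_count: float, prev_ts: float,
                    curr_count: float, curr_ts: float) -> float:
    """Average events per second between two samples of a cumulative counter.

    Timestamps are in seconds. If the counter went backwards (for example,
    after a broker restart), the current count is treated as the increase
    since the reset.
    """
    elapsed = curr_ts - prev_ts
    if elapsed <= 0:
        raise ValueError("curr_ts must be later than prev_ts")
    delta = curr_count - prev_count
    if delta < 0:
        delta = curr_count  # counter reset between the two samples
    return delta / elapsed


# Two hypothetical scrapes taken 60 seconds apart.
rate = per_second_rate(prev_count=1000, prev_ts=0, curr_count=1600, curr_ts=60)
print(f"{rate:.1f} produce requests per second")
```

With these sample numbers, 600 new requests over 60 seconds works out to 10 requests per second.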
Example 3: alert on Kafka node storage
Use New Relic Alerts to create an alert condition to ensure your Amazon MSK nodes don’t violate critical storage thresholds.
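Before you create the condition, it can help to decide what "critical" means for your disks. The Python sketch below illustrates one way to turn available and total bytes into a status; it is not how New Relic Alerts evaluates conditions. The function name `storage_status` and the 80% and 90% thresholds are assumptions for the example, and it presumes you also collect a total-size metric such as the node exporter's node_filesystem_size_bytes.

```python
def storage_status(avail_bytes: float, size_bytes: float,
                   warning_pct: float = 80.0,
                   critical_pct: float = 90.0) -> str:
    """Classify a filesystem by the share of its space that is no longer available."""
    if size_bytes <= 0:
        raise ValueError("size_bytes must be positive")
    used_pct = 100.0 * (1.0 - avail_bytes / size_bytes)
    if used_pct >= critical_pct:
        return "critical"
    if used_pct >= warning_pct:
        return "warning"
    return "ok"


# Hypothetical readings for a 1 TB volume.
for avail in (5e11, 1.5e11, 5e10):
    print(f"{avail / 1e9:.0f} GB available -> {storage_status(avail, 1e12)}")
```

Once you have settled on thresholds, you can express the same comparison as warning and critical thresholds in your alert condition.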
See the Amazon MSK documentation for a full list of metrics you can collect from your cluster.