NVIDIA DCGM

Monitor and analyze your NVIDIA DCGM infrastructure with New Relic.

What's included?

dashboards
1
NVIDIA DCGM quickstart contains 1 dashboard. These interactive visualizations let you easily explore your data, understand context, and resolve problems faster.
alerts
2
NVIDIA DCGM observability quickstart contains 2 alerts. These alerts detect changes in key performance metrics. Integrate these alerts with your favorite tools (like Slack, PagerDuty, etc.) and New Relic will let you know when something needs your attention.
High GPU Temperature
This alert is triggered when the NVIDIA GPU temperature is above 90%.
XID Error
This alert is triggered when the XID error value stays higher than 3 for 5 minutes.
documentation
1
NVIDIA DCGM observability quickstart contains 1 documentation reference. This is how you'll get your data into New Relic.
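
The two alert conditions described above can be read as "value above a threshold for a sustained window". The following is a minimal Python sketch of that logic, not New Relic's implementation: the `Condition` class and `is_triggered` function are hypothetical names, the temperature alert is assumed to fire on a single sample because no duration is stated for it, and only the thresholds (90 and 3) and the 5-minute window come from the alert descriptions.

```python
from dataclasses import dataclass
from typing import List, Tuple

# One sample: (unix timestamp in seconds, metric value)
Sample = Tuple[float, float]


@dataclass
class Condition:
    """Hypothetical model of a static-threshold alert condition."""
    name: str
    threshold: float
    duration_s: float  # how long the value must stay above the threshold; 0 = one sample


def is_triggered(cond: Condition, samples: List[Sample]) -> bool:
    """Return True if every sample in the trailing window of
    cond.duration_s seconds is strictly above cond.threshold."""
    if not samples:
        return False
    ordered = sorted(samples)
    window_start = ordered[-1][0] - cond.duration_s
    # Require data that covers the whole window before firing.
    if ordered[0][0] > window_start:
        return False
    window = [value for ts, value in ordered if ts >= window_start]
    return all(value > cond.threshold for value in window)


# Thresholds taken from the alert descriptions. The temperature threshold
# is given as "90%" there; its unit is not confirmed here.
high_temp = Condition("High GPU Temperature", threshold=90, duration_s=0)
xid_error = Condition("XID Error", threshold=3, duration_s=300)

if __name__ == "__main__":
    # Made-up samples, one per minute, for illustration only.
    xid_samples = [(t, 4) for t in range(0, 301, 60)]
    print(xid_error.name, is_triggered(xid_error, xid_samples))
    print(high_temp.name, is_triggered(high_temp, [(0, 85)]))
```

Requiring the window to be fully covered by data is a design choice in this sketch: it avoids firing a "for 5 minutes" alert on the first sample after a restart.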

Why monitor NVIDIA DCGM?

Monitoring NVIDIA DCGM is essential for maintaining the health and efficiency of the GPU infrastructure in your data center. It helps with performance optimization, fault detection, resource management, energy efficiency, and overall data center health, while also aiding in troubleshooting, security, and compliance.

Comprehensive monitoring quickstart for NVIDIA DCGM

New Relic provides comprehensive monitoring of the GPU infrastructure in your data center. This setup allows you to monitor GPU performance and health while using New Relic for data visualization, alerting, and analysis.

What’s included in this quickstart?

The New Relic NVIDIA DCGM monitoring quickstart provides out-of-the-box reporting:

  • Dashboards (power usage, GPU utilisation, clocks, etc.)
  • Alerts for NVIDIA DCGM (GPU temperature, Xid error)
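
To make the dashboard metrics above more concrete, here is a small Python sketch that reads GPU metrics from Prometheus-style text, the format NVIDIA's dcgm-exporter exposes. This page does not say how the quickstart collects its data, so treat the collection path as an assumption. The `DCGM_FI_DEV_*` names are DCGM field identifiers; the sample values, the `parse_metrics` and `latest_by_gpu` helpers, and the simplified parser (it does not handle escaped characters in label values) are illustrative only.

```python
import re
from typing import Dict, List, Tuple

# (metric name, labels, value)
Row = Tuple[str, Dict[str, str], float]

LINE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)'   # metric name
    r'(?:\{(?P<labels>[^}]*)\})?'            # optional {label="value",...}
    r'\s+(?P<value>\S+)'                     # sample value
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')

# Made-up sample in Prometheus text exposition format.
SAMPLE = """\
# HELP DCGM_FI_DEV_GPU_TEMP GPU temperature (in C).
# TYPE DCGM_FI_DEV_GPU_TEMP gauge
DCGM_FI_DEV_GPU_TEMP{gpu="0"} 64
DCGM_FI_DEV_POWER_USAGE{gpu="0"} 182.5
DCGM_FI_DEV_GPU_UTIL{gpu="0"} 97
DCGM_FI_DEV_SM_CLOCK{gpu="0"} 1410
"""


def parse_metrics(text: str) -> List[Row]:
    """Parse metric lines, skipping blank lines and # comments."""
    rows: List[Row] = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser cannot read
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        rows.append((match.group("name"), labels, float(match.group("value"))))
    return rows


def latest_by_gpu(rows: List[Row], metric: str) -> Dict[str, float]:
    """Map each GPU label to the last value seen for the given metric."""
    return {labels.get("gpu", "?"): value
            for name, labels, value in rows if name == metric}


if __name__ == "__main__":
    rows = parse_metrics(SAMPLE)
    for metric in ("DCGM_FI_DEV_POWER_USAGE", "DCGM_FI_DEV_GPU_UTIL",
                   "DCGM_FI_DEV_SM_CLOCK", "DCGM_FI_DEV_GPU_TEMP"):
        print(metric, latest_by_gpu(rows, metric))
```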

How to use this quickstart

  • Sign Up for a free New Relic account or Log In to your existing account.
  • Click the install button.
  • Install the quickstart to get started or improve how you monitor your environment. Quickstarts are filled with pre-built resources like dashboards, instrumentation, and alerts.
Authors
New Relic
Ramana Reddy
Support
BUILT BY NEW RELIC
Need help? Visit our Support Center or check out our community forum, the Explorers Hub.