Managing data cardinality (the number of unique values, or unique combinations of values, within a dataset) is crucial for optimizing system performance and supporting strategic decision-making.
The Value of Adding Attributes: Answering More Questions
The more attributes you add to your metrics, the more complex and valuable questions you can answer. Every additional attribute provides a new dimension for analysis and troubleshooting.
For instance, adding an infrastructure attribute such as region can help you determine whether a performance issue is isolated to a specific geographic area or is widespread. Similarly, adding business context, like a store location attribute for an e-commerce platform, lets you see whether an issue is specific to a particular set of stores and correlate it with other business data, such as store revenue, to understand its true business impact.
This ability to drill down is where high-cardinality data shines. If you want to conduct an in-depth trend analysis of user purchasing patterns and behavior, you need granular dimensions like user ID, product ID, or search query. These dimensions can have millions of distinct values, enabling you to answer questions like "What product is a specific user purchasing?" or "What search queries is a specific user using?"
The Cost of "ALL" Attributes
If adding attributes provides such power, why not add every attribute you can think of to your metrics?
The simple reason is cost and complexity. That additional power carries a cost beyond the effort required to add the attributes to the data.
- Increased Resources and Volume: More attributes mean larger data points, which require more resources to transfer and process.
- Higher Cardinality is a Key Cost Driver: More attributes generally lead to higher cardinality, because each unique combination of attribute values is tracked as its own series, and cardinality is a key driver of cost for metrics. Metric data is aggressively aggregated to keep queries fast, but this aggregation requires significant additional resources. Whether you are self-hosting an open-source metrics solution or paying a SaaS vendor, you will incur costs for those additional resources.
As a result, high-cardinality telemetry can be significantly more resource-intensive to maintain than low-cardinality telemetry.
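To make the cost driver concrete, here is a minimal sketch of how cardinality grows as attributes are added. The attribute names (`region`, `store`, `user_id`) and the value counts are hypothetical, and the data is generated so that every combination occurs, which is the worst case. Real data is often correlated (a store usually sits in one region), so actual cardinality tends to be lower than the full product.

```python
from itertools import product


def series_cardinality(points, attributes):
    """Count the distinct combinations of the given attributes across all points."""
    return len({tuple(p.get(a) for a in attributes) for p in points})


# Hypothetical attribute values: 3 regions, 4 stores, 50 users.
regions = ["us-east", "us-west", "eu-central"]
stores = [f"store-{i}" for i in range(4)]
users = [f"user-{i}" for i in range(50)]

# Worst case: every region/store/user combination reports at least one point.
points = [
    {"region": r, "store": s, "user_id": u}
    for r, s, u in product(regions, stores, users)
]

by_region = series_cardinality(points, ["region"])
by_region_store = series_cardinality(points, ["region", "store"])
by_all = series_cardinality(points, ["region", "store", "user_id"])

print(by_region, by_region_store, by_all)  # 3 12 600
```

Adding the single `user_id` attribute multiplies the series count by 50 in this example, which is why unbounded identifiers are usually the first place to look when cardinality grows unexpectedly.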
Striking a Balance: Establishing a Cardinality Budget
You need to strike a balance between the value to the business and the cost of that additional cardinality. This means you must establish a cardinality budget and manage it across your entire IT estate.
Combining low- and high-cardinality data allows for this comprehensive and balanced approach, providing both breadth and depth in insights across various dimensions.
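As a rough illustration of what a cardinality budget check could look like, the sketch below compares per-metric cardinality against per-metric limits and an overall account limit. Everything here is an assumption made for illustration: the function name `over_budget`, the metric names, the budget numbers, and the choice to treat account cardinality as the sum of per-metric cardinality. It is not how any particular vendor computes or enforces budgets.

```python
def over_budget(cardinality, metric_budgets, default_budget, account_budget):
    """Return (metrics exceeding their own budget, whether the account total is exceeded).

    cardinality:     dict of metric name -> observed unique series count
    metric_budgets:  dict of metric name -> budget override for that metric
    default_budget:  budget applied to metrics without an override
    account_budget:  budget for the sum of all metrics (an assumed definition)
    """
    offenders = {
        name: count
        for name, count in cardinality.items()
        if count > metric_budgets.get(name, default_budget)
    }
    account_over = sum(cardinality.values()) > account_budget
    return offenders, account_over


# Hypothetical observed cardinality per metric.
cardinality = {
    "checkout.duration": 12_000,
    "search.latency": 450_000,
    "cpu.usage": 300,
}

# A high-value metric gets its own explicit budget; the rest share a default.
metric_budgets = {"checkout.duration": 50_000}

offenders, account_over = over_budget(
    cardinality,
    metric_budgets,
    default_budget=100_000,
    account_budget=1_000_000,
)
print(offenders)     # {'search.latency': 450000}
print(account_over)  # False
```

Separating per-metric limits from the account limit mirrors the idea of spending cardinality where it is most valuable while still capping the total.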
When to Use Low vs. High Cardinality Data
Data Type | Purpose (Value) | Example Attributes | E-Commerce Example |
Low Cardinality (Sufficient) | General performance monitoring, basic trend analysis, and answering baseline, high-level questions. | Product Category, Region | Seeing an overall trend: "Electronics purchases increased in North America over the last three months." |
High Cardinality (Critical) | In-depth analysis, identifying specific trends, and granular telemetry for incident response and marketing insights. | User ID, Product ID, Search Query | Getting granular analysis: "Specific user purchasing patterns useful for personalized marketing strategies." |
By understanding both general and individual customer habits, you can craft more effective and personalized marketing campaigns. This synergy enhances strategic planning by balancing broad demand characteristics with specific consumer requirements.
How New Relic Helps You Manage Your Cardinality Budget
New Relic provides various tools to help you manage your cardinality and stick to your budget.
- New Cardinality Management UI for Identifying Drivers
The Cardinality Management UI provides a comprehensive solution for anything related to metric cardinality. It helps you identify the dimensions that contribute the most to your metric budget. You can view historical trends for your account and metric cardinality for up to 13 months, enabling proactive preparation.
- Set Per-Metric and Account Budgets
New Relic offers the ability to set both per-metric and overall account cardinality budgets. This functionality enables you to increase cardinality where it's specifically needed (e.g., for a high-value metric) and avoid "cardinality explosions" that could inflate your bill.
- Pruning Rules to Shape Your Data
You can create one-click pruning rules within the UI. These rules enable you to shape your data to reduce cardinality in the pipeline, for example by removing a dimension that is too noisy or not needed until a certain event.
- Events to Metrics
This feature enables you to send high-cardinality Event data (which is excellent for use during incident response when trying to find the needle in the haystack) and then define rules that automatically aggregate those Events into lower-cardinality metric data. The resulting metric data includes a subset of attributes needed for longer-term trending and analysis.
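The general idea behind aggregating events into lower-cardinality metrics, and behind pruning a dimension, can be sketched in a few lines: keep only a chosen subset of attributes and roll up everything that shares those values. This is a generic illustration and not New Relic's implementation or rule syntax. The function name `events_to_metrics`, the attribute names, and the sample events are all hypothetical.

```python
from collections import defaultdict


def events_to_metrics(events, keep, value_key):
    """Aggregate events into a count and sum per combination of the kept attributes.

    Attributes not listed in `keep` (for example a per-user ID) are dropped,
    so the output has at most one entry per distinct combination of kept values.
    """
    buckets = defaultdict(lambda: {"count": 0, "sum": 0.0})
    for event in events:
        key = tuple(event.get(attr) for attr in keep)
        buckets[key]["count"] += 1
        buckets[key]["sum"] += event[value_key]
    return dict(buckets)


# Hypothetical high-cardinality events: one per request, tagged with a user ID.
events = [
    {"region": "us-east", "user_id": "u1", "duration_ms": 120.0},
    {"region": "us-east", "user_id": "u2", "duration_ms": 80.0},
    {"region": "eu-central", "user_id": "u3", "duration_ms": 200.0},
]

# Keep only `region` for long-term trending; `user_id` is pruned.
metrics = events_to_metrics(events, keep=["region"], value_key="duration_ms")
for key, agg in metrics.items():
    print(key, agg)
```

Three events with three distinct users collapse into two series keyed by region. The raw events remain the place to look up an individual user during an incident, while the aggregated metric carries the long-term trend.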