DBA Mary, extraordinary,
What makes your tables grow?
When queries propel, through my TSQL shell,
data ingests into neat little rows.
In this post I'm going to address a situation that happens a lot with JSON output—data that should be recorded as sequential rows under a single field, but instead ends up splitting across multiple fields.
Because it continues to provide me with simple, reproducible examples, I'm going to continue leveraging the Pi-hole Domain Name System (DNS) platform for this blog. For some context, in the previous post, all of our attention was focused on the output of a single API endpoint: ?summaryRaw. But the pihole has many other endpoints that emit data, including:
- topItems=xx: Show the xx top domains and top advertisers being requested.
- topClients=xx: Show the top sources of DNS queries within your network.
- getForwardDestinations: Show the external DNS servers where DNS queries are going once they bounce out of your network.
- getQueryTypes: Show the volume of each type of DNS query (A, AAAA, PTR, SRV, etc.).
Let's consider a New Relic Flex integration that is set up to gather some of the information I've identified above:
Note: I've purposely named the elements "bad" so you can find them, because ultimately I don't think they are valuable in the current format.
When you look at it in New Relic Query Language (NRQL), you'll see a result like this:
Data in columns when it should be in rows
The issue becomes even worse when the results are highly variable. For example, "topItems" will return the top domains and advertisers for a given period. While that MIGHT remain somewhat consistent, in larger or more dynamic networks that list can change drastically.
So with the YAML element of:
You could see your column count go up moment by moment:
What's needed is to transform the incoming data so that rather than appearing like this:
"top_ads.unity3d.com": 54,
"top_ads.display.ravm.tv": 90,
"top_ads.hbopenbid.pubmatic.com": 49,
Instead, it's re-organized into a format more like this:
Name: "top_ads.display.ravm.tv",
Count: 90,
Name: "top_ads.display.ravm.tv"
Count: 90,
Name: "top_ads.hbopenbid.pubmatic.com"
Count: 49,
The result of which looks like this when displayed in New Relic:
Phenomenal cosmic power, itty-bitty command
How is this transformation achieved? Through a remarkably simple use of the jq
utility. I mentioned jq
in a previous post, where the usage was far more complex.
As so often happens in tech, what we're asking for is a much more complex operation, and yet the structure is way easier to understand:
jq: >
.[]|.top_queries|to_entries|map({queryname:.key,query count:.value})
As with the jq wizardry in my last post, this is largely due to the genius of my colleague Haihong Ren, a principal solutions architect here at New Relic. Putting this line into the context of a complete Flex YAML file, it would look like this:
The result of which, as I showed earlier, is data that is easier to summarize, query, sort, select, and display.
Special bonus clip-and-save section
There's not really much to summarize here, except to underscore that the New Relic platform is not only flexible enough to enable you to collect just about any type of telemetry you need; but it also lets you manipulate it so you can transform data into information, which drives thoughtful action within your organization.
If you'd like to try out this entire thing for yourself, but would prefer not having to BUILD it all yourself (and in this I applaud your commitment to economy of effort), then below you will find the complete YAML file. And here is a link to a quickstart with the dashboard pictured above.
Next steps
Dealing with data that is flattened when it shouldn’t be (or unflattened when it should) is one of the most pernicious and frustrating situations developers run into. You probably have a few examples in mind already. If you’ve learned a new approach in this post, why not log in to New Relic and see if you can’t make a few long-overdue improvements?
Don't have New Relic? Become a hands-on observability practitioner by creating a free account. Your free account includes 100 GB/month of free data ingestion, one free full-platform user, and unlimited free basic users.
The views expressed on this blog are those of the author and do not necessarily reflect the views of New Relic. Any solutions offered by the author are environment-specific and not part of the commercial solutions or support offered by New Relic. Please join us exclusively at the Explorers Hub (discuss.newrelic.com) for questions and support related to this blog post. This blog may contain links to content on third-party sites. By providing such links, New Relic does not adopt, guarantee, approve or endorse the information, views or products available on such sites.