DBA Mary, extraordinary,
What makes your tables grow?
When queries propel, through my TSQL shell,
data ingests into neat little rows.
In this post I'm going to address a situation that happens a lot with JSON output—data that should be recorded as sequential rows under a single field, but instead ends up splitting across multiple fields.
Because it continues to provide me with simple, reproducible examples, I'm going to continue leveraging the Pi-hole Domain Name System (DNS) platform for this blog. For some context, in the previous post, all of our attention was focused on the output of a single API endpoint: ?summaryRaw. But the pihole has many other endpoints that emit data, including:
- topItems=xx: Show the xx top domains and top advertisers being requested.
- topClients=xx: Show the top sources of DNS queries within your network.
- getForwardDestinations: Show the external DNS servers where DNS queries are going once they bounce out of your network.
- getQueryTypes: Show the volume of each type of DNS query (A, AAAA, PTR, SRV, etc.).
Let's consider a New Relic Flex integration that is set up to gather some of the information I've identified above:
integrations:
- name: nri-flex
config:
name: badpihole
apis:
- name: badpihole_querytypes
url: http://pi.hole/admin/api.php?getQueryTypes&auth=abcdefg1234567890
headers:
accept: application/json
Note: I've purposely named the elements "bad" so you can find them, because ultimately I don't think they are valuable in the current format.
When you look at it in New Relic Query Language (NRQL), you'll see a result like this:
The issue becomes even worse when the results are highly variable. For example, "topItems" will return the top domains and advertisers for a given period. While that MIGHT remain somewhat consistent, in larger or more dynamic networks that list can change drastically.
So with the YAML element of:
- name: badpihole_topitems
url: http://pi.hole/admin/api.php?topItems=10&auth=abcdefg1234567890
headers:
accept: application/json
You could see your column count go up moment by moment:
What's needed is to transform the incoming data so that rather than appearing like this:
"top_ads.unity3d.com": 54,
"top_ads.display.ravm.tv": 90,
"top_ads.hbopenbid.pubmatic.com": 49,
Instead, it's re-organized into a format more like this:
Name: "top_ads.display.ravm.tv",
Count: 90,
Name: "top_ads.display.ravm.tv"
Count: 90,
Name: "top_ads.hbopenbid.pubmatic.com"
Count: 49,
The result of which looks like this when displayed in New Relic:
Phenomenal cosmic power, itty-bitty command
How is this transformation achieved? Through a remarkably simple use of the jq
utility. I mentioned jq
in a previous post, where the usage was far more complex.
As so often happens in tech, what we're asking for is a much more complex operation, and yet the structure is way easier to understand:
jq: >
.[]|.top_queries|to_entries|map({queryname:.key,query count:.value})
As with the jq wizardry in my last post, this is largely due to the genius of my colleague Haihong Ren, a principal solutions architect here at New Relic. Putting this line into the context of a complete Flex YAML file, it would look like this:
integrations:
- name: nri-flex
config:
name: pihole
apis:
- name: pihole_topitems
url: http://pi.hole/admin/api.php?topItems=10&auth=abcdefg1234567890
headers:
accept: application/json
jq: >
.[]|.top_queries|to_entries|map({queryname:.key,query count:.value})
The result of which, as I showed earlier, is data that is easier to summarize, query, sort, select, and display.
Special bonus clip-and-save section
There's not really much to summarize here, except to underscore that the New Relic platform is not only flexible enough to enable you to collect just about any type of telemetry you need; but it also lets you manipulate it so you can transform data into information, which drives thoughtful action within your organization.
If you'd like to try out this entire thing for yourself, but would prefer not having to BUILD it all yourself (and in this I applaud your commitment to economy of effort), then below you will find the complete YAML file. And here is a link to a quickstart with the dashboard pictured above.
integrations:
- name: nri-flex
config:
name: pihole
variable_store:
authkey: abcdefg1234567890 #your auth key goes here
# In order for this integration to work, you need to include your pihole API key.
# You can get the token by logging into your pihole and going to Settings/API/Show API token
# or by connecting directly to the pihole device and getting the WEBPASSWORD variable from
# /etc/pihole/setupVars.conf
apis:
- name: pihole_summary
url: http://pi.hole/admin/api.php?summaryRaw&auth=${var:authkey}
headers:
accept: application/json
- name: pihole_topitems
url: http://pi.hole/admin/api.php?topItems=10&auth=${var:authkey}
headers:
accept: application/json
jq: >
.[]|.top_queries|to_entries|map({queryname:.key,querycount:.value})
- name: pihole_topclients
url: http://pi.hole/admin/api.php?topClients=10&auth=${var:authkey}
headers:
accept: application/json
jq: >
.[]|.top_sources|to_entries|map({clientname:.key,clientcount:.value})
- name: pihole_toforwarddest
url: http://pi.hole/admin/api.php?getForwardDestinations&auth=${var:authkey}
headers:
accept: application/json
jq: >
.[]|.forward_destinations|to_entries|map({destinationname:.key,destinationcount:.value})
- name: pihole_querytypes
url: http://pi.hole/admin/api.php?getQueryTypes&auth=${var:authkey}
headers:
accept: application/json
jq: >
.[]|.querytypes|to_entries|map({querytype:.key,querycount:.value})
- name: pihole_recentblocked
url: http://pi.hole/admin/api.php?recentBlocked&auth=${var:authkey}
headers:
accept: application/json
Next steps
Dealing with data that is flattened when it shouldn’t be (or unflattened when it should) is one of the most pernicious and frustrating situations developers run into. You probably have a few examples in mind already. If you’ve learned a new approach in this post, why not log in to New Relic and see if you can’t make a few long-overdue improvements?
Don't have New Relic? Become a hands-on observability practitioner by creating a free account. Your free account includes 100 GB/month of free data ingestion, one free full-platform user, and unlimited free basic users.
The views expressed on this blog are those of the author and do not necessarily reflect the views of New Relic. Any solutions offered by the author are environment-specific and not part of the commercial solutions or support offered by New Relic. Please join us exclusively at the Explorers Hub (discuss.newrelic.com) for questions and support related to this blog post. This blog may contain links to content on third-party sites. By providing such links, New Relic does not adopt, guarantee, approve or endorse the information, views or products available on such sites.