In my last post on the New Relic Flex integration, I talked about times when pre-made New Relic agents and quickstarts are missing a critical metric or value. In that post, I used a simple example: take the output of a standard command (I used df for Linux systems and netstat for Windows), push it to New Relic, and display it as a chart on a dashboard. In cases like this, Flex is a handy option to close the gap and get you the data you need.
In this post, I want to take that concept a step further. There are situations where those simple scenarios aren't enough. This can be because:
- The output isn't in the proper format.
- The data needed is the result of multiple commands.
- There isn’t an existing command.
In those cases, IT practitioners often turn to scripting to reformat, combine, or create the data needed. And so the question is how to get the result of those scripts into New Relic.
The good news is this isn't significantly different from running built-in commands. This post covers a few of the nuances to help you avoid common pitfalls as you get started.
Testing your internet speed
Let’s continue working on the example from the previous post in this series and expand on it.
A common question we all have, whether in a corporate setting or at home, is "Am I getting the internet speed I'm paying for?" To answer this, install Ookla's Speedtest.net command line utility using these instructions.
After it's installed, test it out to ensure it works the way we'll need it for this tutorial. Use this code in the command line:
speedtest --accept-license -f csv
You should see output like this:
There are some problems with that output.
- First, we don't need all of that information.
- Second, the main data we want (upload/download speed) is in a raw format, which is bytes per second. To get the megabits per second (Mbps) value we expect, we need to divide it by 125,000.
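The 125,000 figure comes from simple arithmetic: one megabit is 1,000,000 bits and one byte is 8 bits, so 1 Mbps equals 125,000 bytes per second. Here is a minimal Python sketch of that conversion. The function and constant names are illustrative choices, not part of the script discussed later:

```python
BYTES_PER_SEC_PER_MBPS = 1_000_000 / 8  # 125,000 bytes per second in 1 Mbps

def bytes_per_sec_to_mbps(bytes_per_sec):
    """Convert a raw bytes-per-second reading to megabits per second."""
    return bytes_per_sec / BYTES_PER_SEC_PER_MBPS

# Example: a reading of 12,500,000 bytes per second is 100 Mbps.
print(bytes_per_sec_to_mbps(12_500_000))
```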
I’ll walk through the magic of a script I’ve written called NR_checkspeed.py later. For now, let’s use it to reduce the output to three values:
- Latency
- Download speed (in Mbps)
- Upload speed (in Mbps)
What does the YAML file look like?
If you've been following along from the first post, this file doesn't look terribly different. I want to draw your eye to a couple of specifics:
integrations:
  - name: nri-flex
    timeout: 5m
    interval: 10m
    config:
      name: linuxspeedtest
      apis:
        - name: pyspeedtest
          commands:
            - run: sudo python3 /etc/newrelic-infra/integrations.d/NR_checkspeed.py
              split: horizontal
              split_by: \s+
              set_header: [latency,download,upload]
              timeout: 300000
The first point applies only to production use. You don't want to check your internet speed every 10 minutes, so raise the interval line to: interval: 1h.
Meanwhile, if you've run speedtest from the command line, you know it's not exactly snappy. Flex expects a command to finish in under 30 seconds by default, so that's not going to work. Setting the timeout inside the commands block fixes it: timeout: 300000. That's five minutes converted to milliseconds.
- run: sudo python3 /etc/newrelic-infra/integrations.d/NR_checkspeed.py
A quick reminder that if you're running this on a Windows system, your directory would be C:\Program Files\New Relic\newrelic-infra\integrations.d.
Now we'll move on to the way the output is handled:
split: horizontal
This tells Flex to take multiple values that appear in a line and break them up into separate data points.
split_by: \s+
In addition, the delimiter Flex uses to determine where one value ends and another begins is one or more whitespace characters. The \s+ pattern is standard regular expression syntax.
set_header: [latency,download,upload]
This line sets the headers, which we'll use to set up our NRQL query in one.newrelic.com.
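To make those three settings concrete, here's a rough Python imitation of what they do to one line of script output. This is only a sketch of the idea, not Flex's actual code. The function name and sample values are made up for illustration, and the values are left as strings here:

```python
import re

def split_horizontal(line, split_by=r"\s+", headers=("latency", "download", "upload")):
    """Break one line into values on the delimiter and pair each with a header."""
    values = re.split(split_by, line.strip())
    return dict(zip(headers, values))

# Made-up sample line in the shape the script prints.
print(split_horizontal("12.5   94.2   11.3"))
```

Because the pattern matches any run of whitespace, it doesn't matter whether the script separates its values with one space or several.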
For more about these settings, along with all of the Flex configuration options, check out our on-host integrations documentation.
"Troubleshooting" and other swear words
In my previous post, I didn't dig into the process of checking if things are working, or if they aren’t, why not.
Without the following techniques, the only way you'll know if the Flex integration works is to set up a simple NRQL query and keep checking the output for error messages. But there are better options.
Run Flex manually
You can run the New Relic Flex utility in the command line and see the output immediately. Find the utility at: /var/db/newrelic-infra/newrelic-integrations/bin/nri-flex. While there are a bunch of command line options, the ones you want for troubleshooting are --verbose and --pretty.
So, presuming our sample YAML file from the earlier section is named py_ookla-speedtest.yml, our command would be:
sudo /var/db/newrelic-infra/newrelic-integrations/bin/nri-flex --verbose --pretty --config_file ./py_ookla-speedtest.yml
The output of that command would look something like this:
Logging
You can also set the logging level of the nri-flex utility itself. Edit the file /etc/newrelic-infra.yml and include these items:
log:
  file: '/var/log/newrelic-infra/newrelic-infra.log'
  level: debug
  forward: true
  stdout: false
For more information on logging options, check out the documentation on infrastructure agent options.
The necessary NRQL
If everything works, you're ready to start showing the data in New Relic. Head over to your New Relic portal, open up a NRQL window, and type the following query:
FROM pyspeedtestSample SELECT *
This will show you if you're collecting data at all, and if so, whether you’re getting an error or data. Once again, I'm going to presume you've got data coming in and you see output that looks something like this:
As you can see from the screenshot, we've got download metrics coming in. Scroll to the right and you'll also see columns for upload and latency. To chart this data, plot latency on one chart and the upload/download speeds on a separate chart. The queries would be:
FROM pyspeedtestSample SELECT average(latency) TIMESERIES
and
FROM pyspeedtestSample SELECT average(download), average(upload) TIMESERIES
After your query is working, give the query a name, select the graph type, and assign it to one of your dashboards:
The script behind the scenes
If your goal is to learn how to use the New Relic Flex integration, the script you use is important. At the same time, it’s not essential for you to use my exact version. If you want to create your own in your preferred language, that’s absolutely fine. I’m including my script here, along with comments about which parts are noteworthy, simply for reference and convenience.
import os
import subprocess
#========================================
#Define variables
speedlist = speedrun = ""
latency = download = upload = ""
batch = response = ""
#========================================
#Function Junction
def fixnum(x):
    # Strip the surrounding quotes and convert the value to a number
    x = x.replace('"', '')
    x = float(x)
    return x
#========================================
# Run the speed test and capture its CSV output
speedrun = os.popen("speedtest --accept-license -f csv").read()
speedlist = speedrun.split(",")
latency = fixnum(speedlist[3])
# Convert bytes per second to megabits per second
download = fixnum(speedlist[6]) / 125000
upload = fixnum(speedlist[7]) / 125000
print(latency, " ", download, " ", upload)
To break this down a bit:
import os
import subprocess
This includes two modules we'll need later to run external commands and get the results back into the program.
def fixnum(x):
    x = x.replace('"', '')
    x = float(x)
    return x
This function removes the quotes around each value in the CSV output and converts the value to a number.
speedrun = os.popen("speedtest --accept-license -f csv").read()
This runs the speedtest command and grabs the results.
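If you'd rather use the subprocess module that the script already imports, one possible alternative to os.popen is subprocess.run, which also lets you set a timeout. This is a sketch, not the version used in this post. The run_command helper name is hypothetical, and the demo runs a harmless Python one-liner rather than speedtest so that it works on any machine:

```python
import subprocess
import sys

def run_command(args, timeout_seconds=300):
    """Run a command, wait for it to finish, and return its standard output as text."""
    completed = subprocess.run(
        args,
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode bytes to str
        timeout=timeout_seconds,
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout

# Demo with a command that exists wherever Python does.
print(run_command([sys.executable, "-c", "print('ok')"]))

# For the speed test, the call would be:
# run_command(["speedtest", "--accept-license", "-f", "csv"])
```

Passing the command as a list avoids shell quoting issues, and check=True surfaces a failed run as an exception instead of an empty string.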
speedlist = speedrun.split(",")
This splits the results into multiple values.
Then, you'll need to take each of the component values and assign them names. Also, divide the download and upload numbers by 125,000 to convert them from bytes per second to megabits per second, as shown here:
latency = fixnum(speedlist[3])
download = fixnum(speedlist[6]) / 125000
upload = fixnum(speedlist[7]) / 125000
Finally, you'll need to echo the results so they can be picked up by Flex, as shown here:
print(latency, " ", download, " ", upload)
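One thing to watch for if you adapt this script: splitting on every comma treats a comma inside a quoted field as a separator too, which shifts the position of every value after it. Python's csv module handles quoted fields and strips the quotes for you. The sketch below uses a made-up line for illustration, not real speedtest output, so check your own output to confirm which column holds which value:

```python
import csv
import io

def parse_csv_line(line):
    """Parse one CSV line, honoring quoted fields that contain commas."""
    return next(csv.reader(io.StringIO(line)))

# Made-up sample line for illustration only; not real speedtest output.
sample = '"Example ISP - Springfield, XX","1234","12.5","0.8","N/A","11775000","1412500"'

naive = sample.split(",")        # the comma inside the first field adds an extra piece
fields = parse_csv_line(sample)  # quoted fields stay whole and lose their quotes

print(len(naive), len(fields))

# With this sample's layout, the values sit one position earlier than in the naive split.
latency = float(fields[2])
download = float(fields[5]) / 125000
upload = float(fields[6]) / 125000
print(latency, download, upload)
```

The trade-off is that the column positions differ between the two approaches, so pick one and verify the indexes against a real run before relying on the numbers.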
Next steps
Stay tuned for the next post in this multi-part blog series. Missed previous posts? Check out these two posts:
If you’re not using New Relic yet, get started for free. Your account includes 100 GB/month of free data ingest, one free full-access user, and unlimited free basic users.
The views expressed on this blog are those of the author and do not necessarily reflect the views of New Relic. Any solutions offered by the author are environment-specific and not part of the commercial solutions or support offered by New Relic. Please join us exclusively at the Explorers Hub (discuss.newrelic.com) for questions and support related to this blog post. This blog may contain links to content on third-party sites. By providing such links, New Relic does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.