Sensing Clues is a Dutch NGO that partners with organizations to introduce best-of-industry technologies that turn wild spaces into safe havens for wildlife and people. In the nonprofit space, budget is extremely important when it comes to new technologies. Time is also very important, especially for such a small volunteer-based team. To help us manage both, we needed to know more about what was going on in our tech stack: easy access to important data and the ability to configure alerts to avoid noise.
When something did go wrong in our tech stack, we used to get a call from our users. When we were starting out, that was okay because we were in a pilot phase. It was manageable. But as we got more customers and had to scale, it wasn’t an option. If our tools aren’t working, our clients in the field have real issues. To get proactive alerts about issues, and to stop issues from happening in the first place, we started to look at monitoring solutions. We wanted a software solution that could help us query, visualize, and analyze data. Before New Relic, we had no visibility into the state of our software or platform.
The Sensing Clues platform consists of several applications powering the Cluey mobile "in-field" application, backed by MarkLogic, Nginx, NextCloud, and Keycloak. The application helps users manage the status of the land, for example by recording wildlife sightings or the status of park assets. We wanted to use distributed tracing, which tracks and observes service requests as they flow through distributed systems, to see user journeys within our app. With this, we could also verify that the correct agents were running and limit the ingestion of data we didn't need. We also wanted synthetics testing, which helps proactively validate user information.
Setting up easy and powerful alerts
One of the first things we did with New Relic was set up alerts. It was a small lift that made a huge difference. We don’t use email a lot, so the Slack integration is perfect. We have two dedicated Slack channels for alerts. The first channel tells us if something is failing, so we don’t need to continually monitor what is happening. Whenever there’s a failure, an alert pops up. Our second channel is dedicated to anomalies. One example of an anomaly is if application traffic suddenly goes down or up. As the operations manager, I’m usually the one watching both channels. If it’s something urgent, we set up a call with the team. We have a WhatsApp group for any real technological emergencies—which we haven’t had to use for over a year because of the new alerting in place.
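The two-channel split described above can be pictured as a simple routing rule. The following is a minimal sketch only, not our actual configuration: the channel names, the `Alert` type, and the `route_alert` function are hypothetical, and in practice the routing is configured in the monitoring tool rather than written by hand.

```python
from dataclasses import dataclass

# Hypothetical channel names; the real channel names are not given in this post.
FAILURE_CHANNEL = "#alerts-failures"
ANOMALY_CHANNEL = "#alerts-anomalies"


@dataclass
class Alert:
    title: str
    kind: str  # assumed to be either "failure" or "anomaly"


def route_alert(alert: Alert) -> str:
    """Return the Slack channel an alert should be posted to."""
    if alert.kind == "failure":
        return FAILURE_CHANNEL
    if alert.kind == "anomaly":
        return ANOMALY_CHANNEL
    # Unknown kinds are rejected so they are not silently dropped.
    raise ValueError(f"unknown alert kind: {alert.kind!r}")
```

Keeping failures and anomalies apart means a sudden traffic spike never buries a hard failure, which is the point of having two channels.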
Dashboards show how infrastructure is performing
Dashboards give us an overview of the environments we work with, including how users are interacting with our app. We have a production environment, a test environment, and an additional environment for data science projects, all covered by dashboards. This helps us iron out code issues in the test environment so we can troubleshoot ahead of time.
With New Relic, we can see the activity and experience of our users—conservation scientists around the world using our app to monitor and log endangered animal data—to understand the pains they’re facing. This helps us start recognizing patterns and make the appropriate fixes. If usage of one user journey suddenly drops off, that can indicate that something is broken.
Dashboards tell us how the infrastructure is performing, including memory and throughput. For example, we relied on dashboards when we were migrating. They showed us how we should create a new environment. If there are any issues, we can then see the root cause. We’ve set up baselines that we expect. For example, if memory goes over what is expected, we get an alert.
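The idea behind a baseline alert can be sketched in a few lines. This is an illustration under stated assumptions, not how the alert condition is actually defined in our account: the function name `exceeds_baseline`, the 10% tolerance, and the sample values are all hypothetical.

```python
from typing import Sequence


def exceeds_baseline(
    samples: Sequence[float], baseline: float, tolerance: float = 0.10
) -> bool:
    """Return True when the average of recent samples is above the baseline
    by more than the given fractional tolerance (0.10 means 10%)."""
    if not samples:
        # No data is not treated as a breach in this sketch.
        return False
    average = sum(samples) / len(samples)
    return average > baseline * (1 + tolerance)


# Hypothetical memory usage readings, in percent.
recent_memory = [70.0, 72.0, 71.0]
if exceeds_baseline(recent_memory, baseline=60.0):
    print("memory is above the expected baseline")
```

Averaging over a window and allowing a tolerance is one way to avoid alerting on a single brief spike, which helps keep the alert channels free of noise.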
Our application is really easy to use. We’ll either go to organizations in person and train them on how to use it, which takes half a day, or they’ll do a short online training. That user-friendliness and journey is really important to us. The challenge is less about getting up to speed on the tech and more about getting organizations into the process of adding their details to the app, of changing their way of working. They also face connectivity issues, depending on where they are. They might go into the field for 10 days without an internet connection. That information, those logs, needs to be stored on the phone and uploaded once they regain a connection.
Our understanding of monitoring and of what observability can do has grown over time. We now have more warning systems and dashboards with statuses. Moving beyond being alerted only about the big issues, we now understand how the platform is being used.
With tools like New Relic, no matter how good they are, they often don’t get priority because there are always a million other things to do, especially if you’re not a monitoring specialist. It takes too long. That’s why we love partnering with New Relic employee volunteers through pro bono support. It helps us focus for a week or two on observability, and we get dedicated support from experts who help us understand what we need to do. New Relic is also really easy to use, so you don’t need to be a techie to get value from it.
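The store-now, upload-later pattern described above can be sketched as a small local queue. This is a generic illustration and not the Cluey app's implementation: the `OfflineQueue` class, the table layout, and the `upload` callback are hypothetical, and SQLite stands in for whatever on-device storage the app really uses.

```python
import json
import sqlite3
from typing import Callable


class OfflineQueue:
    """Keeps observations in a local database until they can be uploaded."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS pending ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
        )
        self.db.commit()

    def record(self, observation: dict) -> None:
        """Store an observation locally; works with or without a connection."""
        self.db.execute(
            "INSERT INTO pending (payload) VALUES (?)", (json.dumps(observation),)
        )
        self.db.commit()

    def pending_count(self) -> int:
        return self.db.execute("SELECT COUNT(*) FROM pending").fetchone()[0]

    def sync(self, upload: Callable[[dict], bool]) -> int:
        """Upload pending observations oldest first; stop at the first failure.

        Returns the number of observations uploaded and removed locally.
        """
        rows = self.db.execute(
            "SELECT id, payload FROM pending ORDER BY id"
        ).fetchall()
        sent = 0
        for row_id, payload in rows:
            if not upload(json.loads(payload)):
                break  # still offline or server error; keep the rest for later
            self.db.execute("DELETE FROM pending WHERE id = ?", (row_id,))
            self.db.commit()
            sent += 1
        return sent
```

An observation is deleted only after its upload succeeds, so a connection that drops halfway through a sync leaves the remaining records safely on the device.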
Through our Observability for Good Social Impact program, we’re committed to driving equitable access to technology. We support nonprofits with free and discounted access to our observability platform and pro bono support from employee volunteers. Start now with three full platform users and 1,000 GB of data ingest for free!
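The views expressed on this blog are those of the author and do not necessarily reflect the official views of New Relic K.K. This blog may also contain links to external sites. New Relic makes no warranties regarding the content of those linked sites.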