When patients have serious or life-threatening health issues, making decisions in partnership with their families and healthcare professionals can be overwhelming. At ACP Decisions, we empower patients to make more informed medical decisions based on their values. We do this through easy-to-understand, value-neutral, and culturally appropriate videos. Our videos are used in a variety of clinical contexts and at different clinical locations. If an ICU clinician is going to share a video about a care conversation with a patient and their family, what happens if the video doesn't play? This is a real problem, in real time, that we worry about. This is where New Relic comes in.

In anticipation of growth and the need to meet the rigors of healthcare security standards, we needed a tool to help diagnose and predict the performance and robustness of our technologies, especially so that we could scale and optimize the clinician and patient experience. New Relic is more than a telemetry platform: it covers the triad of capabilities needed to understand the state of a system, namely observability, monitoring, and telemetry.

In 2022, our monitoring was rather limited, and we were unable to identify issues proactively before they could interrupt clinical workflows. Clinicians are busy professionals, and if something doesn't work, they can't afford to deal with it in the middle of their workday.

Setting up monitoring alerts

Any interruption in service impacts the care that people are receiving, so we needed to increase uptime and reliability. We started by adding alerts to watch for service interruptions in our web app, mobile apps, and backend code infrastructure. Now we have a monitoring alert and a dedicated errors inbox that we review to identify errors and understand who may be impacted by them. This gives us a 30,000-foot view of the end-user experience. Sometimes an error we are alerted to does not affect the user experience, but it still gives us more information about what may be going on in the back end.

We're still evolving our dashboards and refining the instrumentation of our monitoring. It’s an iterative learning process: the feature set of New Relic is rich, there is a lot to monitor and learn, and we are implementing this work in phases.

Gap analysis to help prioritization

After we installed agents and further set up our telemetry, we started looking more closely at the APM and services board, which informed our gap analysis. In terms of prioritization, our ability to understand the key observability needs of our business, and what needs to be worked on, has improved considerably. If there are issues, we're going to know what they are. Having technology that meets our end users where they are, reliably and consistently, is our ultimate aim.

Since adopting New Relic’s Lambda and Docker monitoring, we have been doing log reviews using the New Relic Query Language (NRQL) and have added more synthetic monitoring.
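
As a rough illustration of what an NRQL-based log review can look like, the sketch below builds a query that counts error-level log records per service over a recent time window. The helper name is hypothetical, and the `level` and `service.name` attributes are assumptions about how logs are tagged rather than a description of our actual setup; only the query string is built here, and nothing is sent to New Relic.

```python
def build_error_log_query(service_attr: str = "service.name",
                          since_hours: int = 24) -> str:
    """Return an NRQL string that counts error-level logs per service.

    Assumptions (not taken from any real configuration):
    - logs carry a `level` attribute whose value is 'error' for errors
    - `service_attr` names the attribute that identifies the service
    """
    if since_hours <= 0:
        raise ValueError("since_hours must be positive")
    return (
        "SELECT count(*) FROM Log "
        "WHERE level = 'error' "
        f"FACET `{service_attr}` "   # backticks allow dotted attribute names
        f"SINCE {since_hours} hours ago"
    )


if __name__ == "__main__":
    # Paste the printed query into the New Relic query builder to run it.
    print(build_error_log_query())
```

A query along these lines could be saved to a dashboard so that a spike in errors for one service stands out during a review.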

We want to continue extending our monitoring by installing more agents, improving instrumentation, and continually assessing the data in various ways. We are focused on increasing observability across ACP Decisions while guiding other technical team members on using New Relic. Our roadmap includes enhancing our alerting and logging capabilities, resolving issues much faster, and continuing to improve our CI/CD pipeline. I'm sure other healthcare nonprofits are in the same boat; building up our observability maturity one step at a time is what makes all of this possible.