Vault Health reduces MTTR to under 15 minutes

Business challenge
Company Size
500+

Vault Health is a U.S. digital healthcare tech company. Known mainly for its national diagnostic service, Vault Health provides rapid testing and telemedicine to over 12 million patients. In addition, Vault Health distributes clinical trials to bring new medicine to the market faster and provides specialty care clinical services like men’s health services.

Delivering an excellent customer experience

Historically, healthcare has not been the most customer-centric experience. Against this industry backdrop, Vault Health is committed to delivering an excellent customer experience so that customers, including underserved populations that are difficult to reach, can access healthcare when they need it. At the onset of the COVID-19 pandemic, Vault Health scaled the platform to run 24/7 so that patients could order COVID-19 tests at any time and participate in telemedicine. Because of this growth, Vault Health needed to understand what was happening within all of its new systems, features, and product lines. That means going beyond logs, or even consolidated logs, to see how its applications communicate with other services. Without this visibility, Vault Health would fall short of delivering an excellent, 24/7 customer experience to patients in crisis.

Empowering engineers to innovate and build a customer-centric platform

Errors inbox allows Vault Health to detect, triage, and resolve errors from across its entire tech stack in one place. With it, Vault Health can optimize its release cycle, innovate in quality assurance, detect issues proactively, and reduce mean time to resolution (MTTR). It also gives Vault Health a better understanding of how customers experience the platform, so errors can be resolved quickly as they arise. Whether the interaction is a web transaction, an external API call, or asynchronous messaging to place an order, errors inbox gives Vault Health full visibility into how its applications are running and how to deliver a reliable, scalable customer experience continuously.

Optimizing the release cycle and innovating with QA

Vault Health uses errors inbox to streamline its software development lifecycle. First, it enables engineering to see all errors happening across the platform and throughout daily releases—all in one place. This enables them to stay integrated, in communication, and in sync, even across quality assurance (QA), customer reliability engineering (CRE), and site reliability engineering (SRE) teams.

Next, during testing, engineering uses errors inbox to look at new and existing issues. In fact, errors inbox has empowered Vault Health to drive innovation within traditional QA regression, the pass-fail process.

“After QA regression, we use errors inbox to look at how the system compares with what it looked like before. Is the error rate the same? Or is it higher? Are there novel errors that didn't exist before?” says Steve Shi, Chief Technology Officer at Vault Health. “Using this kind of system metric, QA regression is no longer just a black box exercise. It actually looks inside the box to see how the system reacted while QA was conducted as a particular kind of regression exercise.”
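The comparison Shi describes can be sketched in a few lines. This is only an illustration, not how errors inbox or Vault Health's tooling works: the function name `compare_error_snapshots`, the error class names, and the request counts are all hypothetical, and the sketch assumes each error has already been reduced to a simple fingerprint string.

```python
from collections import Counter


def compare_error_snapshots(baseline, candidate, baseline_requests, candidate_requests):
    """Compare error fingerprints seen before and after a QA regression run.

    `baseline` and `candidate` are lists of error fingerprints (hypothetical
    strings here); the request counts are the totals served in each window.
    """
    if baseline_requests <= 0 or candidate_requests <= 0:
        raise ValueError("request counts must be positive")

    base_counts = Counter(baseline)
    cand_counts = Counter(candidate)

    # Error rate = errors observed / requests served in the same window.
    base_rate = sum(base_counts.values()) / baseline_requests
    cand_rate = sum(cand_counts.values()) / candidate_requests

    # Novel errors are fingerprints present now that never appeared before.
    novel = sorted(set(cand_counts) - set(base_counts))

    return {
        "baseline_rate": base_rate,
        "candidate_rate": cand_rate,
        "rate_increased": cand_rate > base_rate,
        "novel_errors": novel,
    }


# Made-up sample data for illustration only.
baseline = ["TimeoutError", "TimeoutError", "KeyError"]
candidate = ["TimeoutError", "KeyError", "ValueError", "ValueError"]
report = compare_error_snapshots(baseline, candidate, 1000, 1000)
print(report)
```

A report like this answers the three questions in the quote directly: whether the rate held steady, whether it rose, and which errors are new.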

Proactive issue detection and reducing MTTR 

Finally, once Vault Health gets to production, the engineering team uses errors inbox to handle all escalations and ensure a great customer experience. To do this, the team took an SRE principle and applied it to CRE, a new engineering function. Instead of looking narrowly at system metrics to identify that a database is failing, they ask whether the application is failing because of a customer use case. The focus is now on using errors inbox to detect and resolve issues proactively, before a customer picks up the phone to report them. The result? An MTTR of 15 minutes or less, short enough that customers may not even notice an issue.

“With errors inbox, we expect our MTTR to be within 15 minutes,” says Tommy Harke, Vice President of Engineering and Architecture at Vault Health. “To do that, our customer reliability team uses errors inbox to understand the error, escalate to product teams if needed, fix the issue quickly, and get the solution out the door as soon as possible. One error could be very bad for the customer experience, and we want every customer to have a good experience.”
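For readers unfamiliar with the metric, MTTR is the average time between detecting an issue and resolving it. The sketch below shows the arithmetic against a 15-minute target. The incident timestamps are invented for illustration, and the function name `mean_time_to_resolution` is an assumption, not part of any New Relic API.

```python
from datetime import datetime, timedelta


def mean_time_to_resolution(incidents):
    """Average (resolved - detected) over a list of (detected, resolved) pairs."""
    if not incidents:
        raise ValueError("at least one incident is required")
    durations = [resolved - detected for detected, resolved in incidents]
    # Summing timedeltas needs an explicit timedelta() start value.
    return sum(durations, timedelta()) / len(durations)


# Invented incidents: resolved in 12, 10, and 17 minutes respectively.
incidents = [
    (datetime(2022, 1, 10, 9, 0), datetime(2022, 1, 10, 9, 12)),
    (datetime(2022, 1, 10, 14, 30), datetime(2022, 1, 10, 14, 40)),
    (datetime(2022, 1, 11, 2, 5), datetime(2022, 1, 11, 2, 22)),
]

mttr = mean_time_to_resolution(incidents)
within_target = mttr <= timedelta(minutes=15)
print(f"MTTR: {mttr}, within 15-minute target: {within_target}")
```

Note that a single slow incident (17 minutes here) can still leave the mean within target, which is why teams often track individual outliers alongside the average.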

Engineering teams are empowered with data to drive innovation

Together with the rest of the New Relic observability platform, errors inbox gives Vault Health a holistic view across its entire tech stack. Engineers at Vault Health use it as part of their daily routine. Today, they collaborate and communicate more effectively to detect issues proactively, reduce MTTR, and deliver a great customer experience.

“It’s no longer just about detecting breakage. Errors inbox allows us to understand the baseline performance of our system,” concludes Steve. “We have a visual understanding of our software performance.”