Challenge

Workiva is transforming the way people manage and report business data with a SaaS platform that seamlessly orchestrates data among diverse systems and applications for transparent, trusted reporting and compliance. Most of the company’s services and applications run in AWS, and Workiva engineers relied on a number of tools to interface with different parts of the environment. These tools included Google Stackdriver, a homegrown tool for synthetic monitoring, and a third-party tool for metrics monitoring.

The company wanted to further simplify the tool set so people could get a more complete picture across all systems. To address this need, Workiva formed a dedicated Observability Team tasked with adopting a unified observability platform and practice. The team was also looking for a way to gain more granular views of various telemetry data and monitor key performance indicators without manually building custom dashboards.

Solution

Workiva consolidated onto New Relic One as a single, unified observability platform for traces, application performance metrics, and more. The company also takes advantage of the programmability provided by New Relic to create custom apps. One of the first apps provides a primary view of all services and how they are performing against service-level objectives (SLOs). With a single click, users can then drill down to an availability check on individual services.

The company also built additional apps, including one to monitor AWS resources and another to monitor Kubernetes resources, to help internal teams monitor their services. To help deliver uptime to Workiva customers, the company also used New Relic to build a number of tests that check critical features in production environments. Going forward, Workiva plans to continue leveraging programmability and exploring additional apps, including a new testing app to strengthen its CI/CD process and New Relic One Cloud Optimize, which can track infrastructure costs associated with the company’s various systems.

Use Cases

  • Unifying observability across multiple services
  • Centrally monitoring service-level objective (SLO) thresholds
  • Performing daily service availability checks
  • Monitoring AWS and Kubernetes resources
  • Reporting uptime to Workiva customers
  • Applying application smoke tests
  • Driving down MTTR
  • Building maturity into the CI/CD process
  • Tracking and optimizing infrastructure costs