Here in the UK, Sainsbury’s is a household name. For those who don’t know, the business was established in 1869 and is one of the UK’s leading retailers across food, clothing, general merchandise, and financial services. We’re also one of the largest food retailers in the UK, with more than 600 supermarkets, 800 convenience stores, and over 250,000 online grocery orders every week. Using a site reliability engineering (SRE) approach, my team looks after the online customer journey for the groceries part of the Sainsbury’s business, from website order through to delivery at the customer’s door. My team is driven to ensure service stability for our customers by being proactive and using tools such as the New Relic platform, but also by recognizing that we may sometimes need to react to our customers' feedback.

A custom app for social signals

To allow us to identify problems impacting our customer experience faster, I built an app in New Relic, which I call Twitter Alerts. The app complements the visibility in New Relic with customer-reported information on social media. Combining New Relic alerts with social media signals lets us more quickly and effectively correlate customer feedback with systems that may be alerting.

The incentive for creating the app was to remove any latency between our contact center processing the social media signals and that information reaching my team. Whatever the feedback, we decided we wanted to track social comments automatically, with immediate alerting if an issue is reported.

Steps to instant visibility

Heading into the busy Christmas season in 2019, we looked at Twitter and found that we could stream the data, pattern match it to certain keywords, and determine whether a customer was having a less-than-optimal experience. Automating this allowed us to discover any issues quickly and resolve them.

Setting up a Twitter developer account to use the Twitter API, I created a simple Python script to stream data related to mentions that include Sainsbury’s Twitter handles. Using pattern matching, the script checks the resulting Twitter data for certain issue keywords such as “broken,” “can’t,” “down,” “problem,” and so on.

If there’s a match on one or more of these keywords, the script then pattern matches against a further set of keywords for our services, including “iOS,” “Android,” “checkout,” “delivery booking,” “payment,” and others. If the tweet matches both sets of keywords, the script fires the tweet data into New Relic as event data.
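The two-stage match described above could be sketched roughly as follows. This is not the original script: the keyword lists contain only the examples quoted in this post (the real lists are longer), and the function name `match_tweet` and the whole-word, case-insensitive matching rule are assumptions made for illustration.

```python
import re

# Only the example keywords quoted in the post; the real lists are longer.
ISSUE_KEYWORDS = ["broken", "can't", "down", "problem"]
SERVICE_KEYWORDS = ["iOS", "Android", "checkout", "delivery booking", "payment"]


def _compile(keywords):
    # Assumption: match whole words only, ignoring case.
    alternatives = "|".join(re.escape(k) for k in keywords)
    return re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)", re.IGNORECASE)


ISSUE_RE = _compile(ISSUE_KEYWORDS)
SERVICE_RE = _compile(SERVICE_KEYWORDS)


def match_tweet(text):
    """Return the services named in a tweet that also contains an issue keyword.

    Hypothetical helper: an empty list means the tweet is not flagged.
    """
    # Treat curly apostrophes like straight ones so "can’t" matches "can't".
    normalized = text.replace("\u2019", "'")
    # Stage 1: the tweet must contain at least one issue keyword.
    if not ISSUE_RE.search(normalized):
        return []
    # Stage 2: collect every service keyword the tweet mentions.
    found = {m.group(0).lower() for m in SERVICE_RE.finditer(normalized)}
    return [s for s in SERVICE_KEYWORDS if s.lower() in found]
```

Under these assumptions, a tweet that mentions a service without an issue keyword is ignored, and so is a tweet where "down" only appears inside a longer word such as "download".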

With the alerting capability of New Relic, this was easy to configure: if a Twitter handle reaches three matching tweets in an hour, we get an alert through PagerDuty. This helps our teams investigate and resolve issues quickly and minimize any impact for our customers.
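In our setup the threshold is evaluated by New Relic alerting, not by our own code. Purely to illustrate the rule of three tweets per handle within an hour, here is a minimal sketch of the same logic in Python. The class name `TweetThreshold` and the sliding-window approach are assumptions for illustration and do not describe how New Relic evaluates alert conditions.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

WINDOW = timedelta(hours=1)  # look-back period from the post
THRESHOLD = 3                # tweets per handle within the window


class TweetThreshold:
    """Hypothetical sliding-window counter, keyed by Twitter handle."""

    def __init__(self):
        self._seen = defaultdict(deque)

    def record(self, handle, when):
        """Record a flagged tweet and report whether the handle is at the threshold.

        Assumes tweets are recorded in chronological order.
        """
        times = self._seen[handle]
        times.append(when)
        # Drop timestamps that are an hour or more older than the newest one.
        while times and when - times[0] >= WINDOW:
            times.popleft()
        return len(times) >= THRESHOLD
```

Feeding it three flagged tweets for the same handle within an hour makes the third call to `record` return `True`; tweets spread over more than an hour do not.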

The entire effort only required about three days, with much of that time involving creation of the script and setting up the Twitter development account, both of which were new for me.

Customization and visualization

The Twitter Alerts app gives us maximum control over how the data is visualized, including using responsive design for mobile viewing.

The app displays statistics and tweet data—including the time it was tweeted, the service name, the handle it’s matched with, screen names, and tweet details. We can then drill down into the actual tweet text to get the full information.

Faster insight into the customer experience

The timing of this app was extremely beneficial for us heading into the holiday season, which is our busiest time of year. Using the programmability in New Relic, we were able to get something into our hands very quickly, giving us confidence that customers weren't having any issues, or that if they were, we could pick them up very quickly.

Now we can establish alert policies on issues identified by a customer through Twitter simply by creating a custom event and sending it to New Relic. Because the event is stored in New Relic's connected telemetry database, we can quickly and easily correlate it with our operational systems and, if necessary, page someone through the built-in PagerDuty integration. Once an incident is resolved, we use it as a mechanism to fine-tune our existing alert policies. Often, this lets us get out ahead of similar issues and prevents future customer disruption.
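As a rough sketch of what creating a custom event and sending it to New Relic can look like, the snippet below builds a POST request for the New Relic Event API using only the Python standard library. It is not the original script. The event type name `TwitterAlert`, the attribute names, and the helper `build_event_request` are hypothetical, and the URL shown is the US-region endpoint (EU accounts use a different host).

```python
import json
import urllib.request

# US-region Event API endpoint; the account ID is filled in per request.
EVENT_API_URL = "https://insights-collector.newrelic.com/v1/accounts/{account_id}/events"


def build_event_request(account_id, api_key, handle, service, screen_name, text, tweeted_at):
    """Build (but do not send) an Event API request for one flagged tweet."""
    event = {
        "eventType": "TwitterAlert",  # hypothetical custom event type name
        "handle": handle,             # the Sainsbury's handle that was mentioned
        "service": service,           # matched service keyword, e.g. "checkout"
        "screenName": screen_name,    # the customer's screen name
        "tweetText": text,
        "tweetedAt": tweeted_at,      # assumed to be an ISO 8601 string
    }
    # The Event API accepts a JSON array of event objects.
    body = json.dumps([event]).encode("utf-8")
    return urllib.request.Request(
        EVENT_API_URL.format(account_id=account_id),
        data=body,
        headers={"Content-Type": "application/json", "Api-Key": api_key},
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen` would send it. Once events of this type are arriving, an alert condition can be defined on them in New Relic like on any other event data.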

Our Twitter Alerts app gives our engineering community immediate visibility into what is happening, from the customer on Twitter straight through to our engineers. Happy customers who aren’t posting about issues offer really valuable insight for us as well, giving us confidence that they’re having a positive experience.

Just the beginning

We’ve also started on some other new apps that let us extend and manipulate data in the way we want to visualize it.

The ability to use New Relic to code and run apps locally and then publish them out is something that we'll continue to explore as we use the platform across our services. New Relic's programmable platform and apps give us the flexibility to consume our data in ways that enable us to go beyond dashboards to bring more value and a better experience to our customers.