Why New Relic
Performance monitoring of web browsers, servers and apps in one easy-to-use solution
- Saved 25% in compute costs during holiday business spike by dropping CPU utilization from 80% to 60%
- Reduced customer checkout time by 35%
- Moved to proactive performance management rather than learning of issues from customers
GiftCard.com has a core DevOps team in its Dallas headquarters, supported by a large remote IT staff. The company’s environment is built on a massive service-oriented architecture (SOA), with consumer and admin sites running .NET on Amazon Web Services (AWS). Web content management is in Kentico CMS. Some of the company’s smaller sites are written in PHP and serviced by REST APIs.
GiftCard.com is on a very aggressive growth path. The company experienced 100 percent year-over-year growth from 2011 to 2012 and things show no sign of slowing down. “I’m the last person to complain about our exponential growth,” says Brian Winfield, CTO and Co-Founder of GiftCard.com. “But growth obviously presents a serious challenge from an IT perspective. And when you consider that we do a large percentage of our business during the year-end holidays, we were concerned about scaling to meet anticipated demand during the final quarter of 2012.”
A few key performance issues were the source of particular concern for Winfield and his team. Customers often call 1-800-GIFT CARD® to make orders, and the speed of the admin website was having an impact on the overall customer experience. “Having slow admin systems means that customers need to stay on the phone longer,” he explains. “For years, the IT team had heard complaints about the speed of admin screens — the average load time was around eight seconds — but we couldn’t identify the source of the problem.”
Prior to 2012, Winfield and his team relied on Microsoft System Center Operations Manager (more commonly known by its former acronym, MOM) to monitor server performance and memory usage. However, the GiftCard.com IT team lacked an easy solution for monitoring system performance at the application level. As a result, Winfield found himself counting on the customer support team to alert him to application performance issues. “I appreciated the efforts of our customer support team to tell us when the site wasn’t performing well, but we often had difficulty prioritizing their alerts,” says Winfield. “Most of them don’t have technical backgrounds and most of our customers don’t have technical backgrounds, either. So the problems were often described in a way that was vague or incomplete. If we were able to understand the problem well enough to prioritize it, we relied on guesswork to identify the real source of the issue. Consequently, problems identified by the customer support team would often take weeks or months to fix, if they were fixed at all.”
"I’m using New Relic all day long. This software is uniquely helpful because it gives us stats and historical information at the individual machine level. We get performance-related data to make sure that customers are having a great experience on the eCommerce side, but we also get server data to know when a machine is running out of disk space or when the CPU spikes."
Winfield took a close look at the available options and New Relic quickly emerged as the one to beat. “New Relic was the only solution we found that monitored web browser performance, error monitoring, server monitoring and app performance with profiling in one easy-to-use service,” he says. “Nothing else even came close.”
His team installed the New Relic web agent in October 2012. The software was up and running in 10 minutes and deployment of the New Relic server monitor quickly followed. GiftCard.com now runs New Relic on all public-facing and private internal machines to monitor admin sites, Windows Communication Foundation (WCF) service-layer sites, and external APIs used by third-party developers. Reliance on MOM is almost a thing of the past: these days, Winfield and his team only use it for patch management. “I’m using New Relic all day long,” he says. “This software is uniquely helpful because it gives us stats and historical information at the individual machine level. We get performance-related data to make sure that customers are having a great experience on the eCommerce side, but we also get server data to know when a machine is running out of disk space, or when the CPU spikes. With AWS, we can add machines quickly to support our traffic, and with New Relic on each of those machines, we can ensure that we’re making the most of all available compute resources.”
Winfield and his team now rely on a number of New Relic features:
- Real User Monitoring (RUM) to show response times from the individual user’s perspective
- Transaction Traces to extend monitoring capabilities to the most granular levels of detail
- App Map to demonstrate which services would be affected in the event of application failure
- Apdex Alerts to measure users’ satisfaction with an application’s response time
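For readers unfamiliar with Apdex, the score behind those alerts follows an open standard: each request counts as satisfied if it finishes within a target time T, tolerating if it finishes within 4T, and frustrated otherwise, and the score is (satisfied + tolerating / 2) / total. The Python sketch below illustrates that formula only. The function name, the sample response times and the one-second target are hypothetical and are not GiftCard.com data or New Relic code.

```python
def apdex(response_times, threshold):
    """Return the Apdex score (0.0 to 1.0) for response times in seconds.

    threshold is the target time T: samples at or below T are "satisfied",
    samples above T but at or below 4T are "tolerating", the rest are
    "frustrated" and contribute nothing to the score.
    """
    if not response_times:
        raise ValueError("need at least one response time")
    satisfied = sum(1 for t in response_times if t <= threshold)
    tolerating = sum(1 for t in response_times if threshold < t <= 4 * threshold)
    return (satisfied + tolerating / 2) / len(response_times)


# Hypothetical sample: three fast loads, one slow load, one very slow load.
samples = [0.4, 0.8, 1.0, 2.5, 8.0]
score = apdex(samples, threshold=1.0)
print(f"Apdex: {score:.2f}")
```

An alert built on this score would fire when the value drops below a chosen floor, which turns "the site feels slow" into a number that can be tracked over time.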
With New Relic, the GiftCard.com IT team is able to drill all the way down to the most problematic screens. Then they use the New Relic API to record specific data for each web transaction, configuring the API to store the order number, customer information and account management data. With that information on each web transaction, Winfield and his team are able to track down orders at a much more detailed level than ever before. “The New Relic API shows us which ASP.NET methods are slowing down load times,” he says. “We put those orders into our dev environment, then use the profiler in Visual Studio 2012 to find the real root of the problem.”
For Winfield, the biggest surprise was the value he found in New Relic’s error monitoring capabilities. “I have solutions in place to ping our website,” he says. “But Pingdom doesn’t tell me when individual components go down. With New Relic, I can set up an alert so that if we exceed a given error threshold based on percentage of traffic, I’ll get an email or a text. In fact, before we deployed this software, I was in the process of creating a custom solution to alert us to emerging issues and New Relic provided me with an out-of-the-box alternative that I could start using right away. That saved me a huge amount of in-house development time.”
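The idea of an error threshold "based on percentage of traffic" can be illustrated with a short Python sketch. This is not New Relic's implementation and not the custom solution Winfield had started building; the function name, the request counts and the 5 percent threshold are assumptions chosen for illustration.

```python
def error_rate_exceeded(error_count, request_count, threshold_percent):
    """Return True when errors make up more than threshold_percent of requests.

    Using a percentage of traffic rather than a raw error count keeps the
    alert meaningful both on quiet days and during a holiday traffic spike.
    """
    if request_count == 0:
        return False  # no traffic in the window, so nothing to alert on
    error_percent = 100.0 * error_count / request_count
    return error_percent > threshold_percent


# Hypothetical one-minute window: 6 failed requests out of 100.
if error_rate_exceeded(error_count=6, request_count=100, threshold_percent=5.0):
    print("Error threshold exceeded: notify by email or text")
```

A ping-style check only reports whether the site answers at all, whereas a rate like this can flag a single failing component while the rest of the site stays up.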
"We dropped our CPU utilization from roughly 80 percent to around 60 percent with help from New Relic. And with fewer machines in production, we were able to do very healthy holiday business while avoiding approximately 25 percent in additional compute costs."
From 2011 to 2012, GiftCard.com quadrupled the number of machines in use during the holiday season. Yet even that increase could have been far more dramatic, because Winfield used New Relic to achieve an estimated 25 percent reduction in the number of machines necessary to support customers and admin. “We dropped our CPU utilization from roughly 80 percent to around 60 percent with help from New Relic,” he says. “And with fewer machines in production, we were able to do very healthy holiday business while avoiding approximately 25 percent in additional compute costs.”
Meanwhile, the GiftCard.com IT team managed to reduce response time in a number of key areas. For example, some customers were experiencing problems in checkout, with certain calls taking as long as four minutes. New Relic helped Winfield identify the source of the problem and get those calls down to 60 milliseconds. Even more crucially, the IT team succeeded in reducing the average response time on admin screens from eight seconds to one second. Not only did that improve the customer experience, but it also increased the number of calls the company could take while cutting per-minute phone costs. “Speedy checkout is crucial to our business,” says Winfield. “Historical data from New Relic demonstrated the source of those long checkout delays so that I could finally resolve the problem. It all came down to a network issue, so I added a cache layer on the configuration and took the variability out of the lookup. That simple fix made an immediate difference in our average checkout time. Overall, using the intelligence we gained from using New Relic, we were able to improve our checkout time by about 35%.”
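The source does not describe how the cache layer was built, and GiftCard.com’s sites run on .NET, so the Python sketch below is only a general illustration of the technique: a small time-to-live cache in front of a slow, network-bound lookup. The class name, the `fetch` callback and the 60-second lifetime are all hypothetical.

```python
import time


class CachedLookup:
    """Wrap a slow lookup function with a simple time-to-live cache."""

    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch      # the slow lookup, e.g. a call over the network
        self._ttl = ttl_seconds  # how long a cached value stays fresh
        self._clock = clock      # injectable clock, which makes testing easy
        self._entries = {}       # key -> (value, time stored)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]      # fresh hit: no network round trip
        value = self._fetch(key)  # miss or stale entry: do the slow lookup once
        self._entries[key] = (value, now)
        return value


def slow_config_fetch(key):
    """Stand-in for a configuration lookup that crosses the network."""
    time.sleep(0.05)
    return f"value-for-{key}"


config = CachedLookup(slow_config_fetch, ttl_seconds=60.0)
config.get("checkout.tax_service_url")  # slow: goes to the backing store
config.get("checkout.tax_service_url")  # fast: served from memory
```

With a cache like this, only the first request in each time window pays the network cost, which is what takes the variability out of the lookup.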
Before New Relic, web performance issues often took weeks or months to identify and resolve. Now Winfield and his team can isolate problem areas in minutes and execute fixes within 24 hours. “I don’t have to guess anymore,” he says. “I can focus on real performance issues as they emerge. Once I’ve done my troubleshooting, I can give specific instructions to the programmers, complete with detailed screenshots, substantially shortening the amount of time required to upgrade the code. The contextual details I get from New Relic are instrumental in helping me communicate more clearly and more effectively with my IT team.”
Developers are thrilled to have New Relic on hand, as are the company’s customer service reps. Even GiftCard.com’s CEO has noticed the overall performance and productivity improvement. Most importantly, customers enjoy a faster, more reliable experience, whether on the website or on the phone. “New Relic has changed the way we operate,” says Winfield. “We’re not relying on customers to report problems to us. We’re more proactive, and we have more time to be truly customer-focused. We can take more orders. We can handle those orders more efficiently. And when customers call us, whether they’re purchasing a card, asking about the status of an order, or calling us to activate a card, we’re giving them better service across the board. It’s a big win for everyone involved.”