Why New Relic
Real-time insight into an agile development environment, with graphs to help tech teams communicate more effectively with business stakeholders
- Within one hour of deploying New Relic, the OFA team achieved an 80% increase in Narwhal performance.
- Within a few days of using New Relic, the team improved response times across the board by 90%.
- New Relic enabled the team to demonstrate the value of refining existing features vs. developing new ones, saving time and money.
The OFA infrastructure ran on the Amazon Elastic Compute Cloud (EC2), scaling as necessary to push 280 terabytes of traffic and billions of requests over an 18-month period. Developers used nearly all available components of AWS, with particular reliance on SES (Simple Email Service), S3 (Simple Storage Service), SQS (Simple Queue Service), DynamoDB, and RDS (Relational Database Service) for MySQL.
OFA selected open-source solutions whenever possible, writing the majority of its code in Rails, Flask, and Kohana. The team built payment forms on Jekyll and ran A/B testing on Optimizely. In addition to the storage solutions available on AWS, OFA also deployed the Vertica analytics database, LevelDB on-disk key-value storage, and MongoDB document-based storage. The team monitored system performance using Google Analytics, Chartbeat, Graphite, statsd, and New Relic.
In a 21st-century political campaign, technology is a force multiplier. Politics will always require a great deal of old-fashioned face-to-face contact — shaking hands, knocking on doors, etc. — but tech can help increase the impact of those high-touch interactions by engaging prospective voters online, connecting volunteers through social networks, and tracking voter turnout on Election Day.
All of the developers on the OFA team had deep experience in the world of tech startups, but none were quite prepared for the unique intensity and scale of a national Presidential campaign. “Right after the President gave his convention speech, we were doing about two million dollars an hour,” says Nick Leeper, Tech Lead for Finance and Fundraising at OFA. “When you’re accustomed to hourly donations in the $10,000 to $20,000 range, that’s an incredible spike. You’re basically going from a thousand people using your app to 500,000 in a matter of minutes.”
“On a campaign like this, the traffic patterns are just crazy,” adds Scott VanDenPlas, Director of DevOps for OFA. “One send through every channel hits 60 million people — and that’s just the first and second layer of folks who hear about it.”
In such a volatile, high-volume environment, agility and split-second responsiveness are more important than ever. Even so, political campaigns have traditionally outsourced their IT operations, often resulting in a critical communications gap between campaign operations and tech teams. “These days, I don’t think you can run the technical side of a campaign through an IT vendor,” says Jason Kunesh, Director of User Experience for OFA. “It’s become such a rapidly moving, dynamic environment that you don’t have time to put together an RFP when you need something or when something goes wrong. You just need to push out one app after another, scrapping the apps that don’t work and optimizing the ones that do. Consider this: over the course of 18 months, our internal team produced more than 200 applications. There’s just no way you could do that with a third party.”
“In the old model, campaign operatives would produce a spec document and hand it over to a vendor,” adds Chris Gansen, Technical Lead for OFA’s Dashboard application. “The problem with that approach is that those operatives specialize in getting out the vote, not in building applications. They didn’t always know what they wanted — or even what was possible.”
In 2008, during Obama’s first Presidential run, the campaign relied on a number of third-party vendors, each of whom had their own specializations. Complicating matters further, many state campaign offices had their own tech systems in place, each collecting unique data sets from local voters. “Prior to 2012, the national campaign operation was dealing with many silos of data from vendors and individual states, so it was difficult to standardize that data for targeting large groups of voters across the country,” says Leeper. “In the four years since Obama’s first run, the size and influence of social networks had grown exponentially. But in order to take full advantage of all the connectivity on those networks, and in order to achieve outreach on a truly national scale, we needed to centralize our IT operations and manage all of that data ourselves. Otherwise we would never have a total picture of what was happening on the ground.”
Despite the many technological innovations introduced in 2008, the 2012 campaign essentially began from scratch. “In two years, technology becomes stale,” explains Kunesh. “In four years, it becomes obsolete. Very little from the 2008 campaign was usable in 2012, so this was really a greenfield operation.”
The team began by building a platform known internally as ‘Narwhal,’ which eventually became the services backend to 18 different applications. Narwhal was designed to integrate data across a wide range of apps, and the OFA team built it to be maximally redundant so that if any part of the system failed, essential functions would still be up and running. “We kept saying that we were building an airplane mid-flight,” jokes Ryan Kolak, Team Lead for the Narwhal Project. “While my team made Narwhal a reality, the rest of the team was building apps against APIs and platforms that didn’t exist yet. The only way to make it all work was to observe solid fundamentals, making every part of the system fault-tolerant. Without that level of discipline, we would’ve had chaos on our hands.”
One of the OFA team’s first decisions — and, as it turns out, one of its best decisions — was to run all applications on AWS. “Instead of trying to run this huge infrastructure ourselves, we were able to focus on optimizing it,” says Kolak. “Our first mega-rush of traffic happened in May, when Obama announced his support for gay marriage. Almost immediately, we were pushing four gigabytes per second, and all of our vendors’ traffic failed over to our infrastructure. That was the day when all of our pain points were exposed. That was also around the time we began using New Relic, which helped us identify those pain points much more quickly and much more proactively.”
The team was especially careful to protect the campaign’s donation function from any possible failure. “Donations were taken through one of our vendors and through the campaign itself,” explains Leeper. “We built an API to emulate the way the vendor works. So we had the vendor’s system and our own system, and then we split our own system between Amazon East and Amazon West. We therefore had three different data centers behind the payment process, and each layer was multiply redundant. We could lose an entire region — which ended up happening when Hurricane Sandy landed on the East Coast in late October — and still be totally fine.”
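The layered redundancy Leeper describes can be illustrated with a minimal Python sketch. It is not OFA's code: the backend names, the ordering, the `Backend` type, and the `BackendUnavailable` exception are all hypothetical, and the source does not say how requests were actually routed between the vendor and the two AWS regions. The sketch only shows the general idea of trying equivalent payment backends in turn until one succeeds.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


class BackendUnavailable(Exception):
    """Raised when a payment backend cannot be reached (hypothetical)."""


@dataclass
class Backend:
    name: str
    charge: Callable[[int], str]  # takes an amount in cents, returns a receipt id


def process_donation(amount_cents: int, backends: Sequence[Backend]) -> Tuple[str, str]:
    """Try each backend in order and return (backend name, receipt id).

    Because every backend exposes the same interface, losing one of them
    only means moving on to the next in the list.
    """
    errors = []
    for backend in backends:
        try:
            return backend.name, backend.charge(amount_cents)
        except BackendUnavailable as exc:
            errors.append(f"{backend.name}: {exc}")
    # Every backend failed; report all of the failures together.
    raise BackendUnavailable("; ".join(errors))


def healthy(name: str) -> Backend:
    """Demo backend that always succeeds."""
    return Backend(name, lambda cents: f"{name}-receipt-{cents}")


def down(name: str) -> Backend:
    """Demo backend that simulates a lost region."""
    def charge(cents: int) -> str:
        raise BackendUnavailable("region offline")
    return Backend(name, charge)


if __name__ == "__main__":
    backends = [down("aws-east"), healthy("aws-west"), healthy("vendor")]
    print(process_donation(2500, backends))
```

A production payment path would also need safeguards this sketch omits, such as idempotency keys so that a retried donation is never charged twice.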
OFA created flexible APIs that would take full advantage of the campaign’s huge storehouse of voter information, enabling stakeholders to manipulate a wide array of data more freely than previous campaigns ever thought possible. “For years, Democratic operatives wanted to increase visibility and alignment across campaign teams, so any change on one team would be reflected to other teams right away,” says VanDenPlas. “Everybody thought that was impossible, because no vendor had been able to do it. But with our centralized tech operation and our new APIs, we managed to make it happen with just one or two days of work.”
“When we demoed that change, people actually brought champagne to the meeting,” adds Kunesh. “They watched the changes on one screen and then on the other screen, and the entire room broke out in cheers. It was crazy.”
For all its success in solving some of the most persistent problems facing a present-day online political campaign, the OFA team insists that specific solutions are less important than the team’s overall approach. “Narwhal was built to solve very unique problems,” says Kunesh. “Trying to emulate what we did with that platform would be a mistake. Instead, I would encourage campaigns to hire tech teams with a wide range of experience. None of us are language zealots; we’re all polyglots. And we understand not just technology, but how to map that technology over to other disciplines like business or design. It goes back to rejecting the old silo strategy, opting instead for a more centralized, holistic approach. Having people with a broad perspective, sharp folks with their feet in more than one world, puts you in a much better position to build around failure scenarios and create healable applications.”
Even so, any tech team will fail to make an impact if it can’t communicate effectively with business users. And New Relic quickly became an essential tool for helping developers explain their processes and priorities to non-tech team members. “We helped our business users break some bad habits,” says Gansen. “They were accustomed to requesting new features whenever an application needed improvements. Using New Relic data, we could show them that a new feature would leave our system underserved, and we would encourage them to focus on refining what we already had.”
“Our campaign manager had two core questions for all of our projects,” adds VanDenPlas. “First, does it make our people’s lives easier? And second, is it fast? With the right monitoring tools at our disposal, it was easy for us to demonstrate how we could increase the speed of our applications. We found New Relic graphs to be a very simple, very effective way to show business stakeholders why something would or wouldn’t work.”
When the OFA team deployed New Relic, they saw huge results in minutes. “The first time we turned New Relic onto Narwhal, we realized that a 10-row table containing all of our consumer keys was causing a big slowdown in queries,” says Kolak. “Within 30 minutes, we’d made a patch and cached those keys. It was maybe two lines of code, and making that single change reduced 160 millisecond requests by 100 milliseconds. That was an 80 percent increase in Narwhal performance in our first hour of using this software.”
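The kind of fix Kolak describes, caching a tiny lookup table instead of querying it on every request, might look something like the following Python sketch. It is an illustration under stated assumptions, not the actual Narwhal patch: the `consumer_keys` table layout, the `ConsumerKeyCache` class, the five-minute refresh interval, and the use of SQLite are all hypothetical stand-ins.

```python
import sqlite3
import time


class ConsumerKeyCache:
    """Keeps a small, rarely changing table in memory (hypothetical helper)."""

    def __init__(self, conn, ttl_seconds=300.0, clock=time.monotonic):
        self._conn = conn
        self._ttl = ttl_seconds      # how long a loaded copy stays fresh
        self._clock = clock
        self._keys = {}
        self._loaded_at = None
        self.db_reads = 0            # counts round trips to the database

    def _refresh(self):
        rows = self._conn.execute(
            "SELECT consumer_key, app_name FROM consumer_keys"
        ).fetchall()
        self._keys = dict(rows)
        self._loaded_at = self._clock()
        self.db_reads += 1

    def lookup(self, consumer_key):
        """Return the app name for a key, hitting the database only when stale."""
        stale = (
            self._loaded_at is None
            or self._clock() - self._loaded_at > self._ttl
        )
        if stale:
            self._refresh()
        return self._keys.get(consumer_key)


def build_demo_db():
    """Create an in-memory database with a 10-row key table for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE consumer_keys (consumer_key TEXT PRIMARY KEY, app_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO consumer_keys VALUES (?, ?)",
        [(f"key-{i}", f"app-{i}") for i in range(10)],
    )
    return conn


if __name__ == "__main__":
    cache = ConsumerKeyCache(build_demo_db())
    for _ in range(1000):
        cache.lookup("key-3")
    print(cache.db_reads)  # database round trips made across 1,000 lookups
```

The trade-off in a cache like this is staleness: a newly added key is invisible until the next refresh, which is usually acceptable for a table that changes rarely.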
The following days brought even more dramatic improvements. “When you’re building software in an agile environment, it’s easy to overlook simple fixes during the initial build,” explains Kunesh. “New Relic pointed us to some of the blind spots that had emerged during our first few months of work. And within a few days, we’d increased our response times across the board by 90 percent.”
With tools like New Relic at their disposal, the OFA tech team could focus more energy on iteration and less on reinvention. Iterative culture is on the rise everywhere — from fly-by-night startups to large corporations — but political campaigns, with their multi-year cycles, are an especially compelling laboratory for truly agile development. “If people look back in a year or two, I really hope they’re not using the code we wrote because it’s going to stagnate,” says Kunesh. “Things change. Vendors change. Technologies change. Apps come and go. That’s just the way things are. Personally, I’ll be very happy if people look back at our work and say, ‘You put a really smart team together. You put them in the same room and you gave them really tough problems to solve. And you inspired the next campaign to tackle even bigger problems on their own terms and in their own way.’”