Why New Relic
Easy implementation, comprehensive insight, and a minimal learning curve
- Achieved an 82% reduction in production error rate over the course of six months
- Gained the necessary insights in both test and production to pursue an agile approach to development
- Improved productivity across all development teams, freeing up more time and resources to focus on core projects
Limited visibility across a wide range of interdependent systems
For years, the complexity of Move’s technology stack made performance issues difficult to diagnose, let alone resolve. “There are a lot of interdependencies among our various systems,” says Niti Sharma, senior web developer for Move. “Whenever we encountered a cascading failure, our old monitoring tools usually couldn’t give us enough granularity to see which system fell first. It could take us hours or days, sometimes as long as a week, to discover the problem in the first place — then a few more days to resolve it.”
Ineffective monitoring was a particular challenge for the company’s QA team. “Realtor.com has a lot more moving parts than your typical website,” says Dave Arguelles, director of quality assurance at Move. “There’s a lot that can go wrong, and it’s absolutely critical for the data to be completely accurate when potential homeowners are looking for a place to live.”
The Move team attempted to meet these challenges by relying on a combination of commercial monitoring systems, open source technologies, and applications developed in house. Yet no single solution could provide either the depth or breadth that the company required to achieve fast, effective remediation of performance issues in such a heterogeneous environment.
New Relic quickly becomes the companywide standard
A new approach to monitoring came to Move’s attention in 2013, when a team of developers began experimenting with New Relic APM, Server and Browser monitoring to generate impressive data-driven insights within Heroku. The results were so promising that Move immediately deployed a five-server installation on its realtor.com® website. “Right from the start, we found a customer-facing issue in production that we were able to resolve in a few minutes,” said Nick Steel, senior manager of systems engineering for Move. “Without New Relic, we would have needed several more hours to find the same problem, let alone resolve it.”
With a quick win right out of the gate, New Relic succeeded in becoming the companywide standard in no time, expanding to monitor more than 25 applications in a matter of months. Meanwhile, a growing number of employees in every part of the company began using New Relic data to facilitate smarter decision making. “It only takes us about 10 or 15 minutes to install a New Relic agent,” Steel said. “We need another 15 minutes, on average, to train a new colleague on how to use the solution for generating useful information. In other words, this is a technology with almost zero learning curve.”
Move employees are especially enthusiastic about the dashboard feature in New Relic. In fact, the company installed nearly 20 large monitors companywide to display the latest New Relic data, with each monitor customized to show the metrics that are most relevant to the teams sitting nearby. The monitors are even lit to indicate the current status of each application: green when everything is running well, red when there’s a problem. A monitor turns red when the New Relic Apdex score for any Move application falls below a set threshold.
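To make the color logic concrete, here is a minimal sketch of how an Apdex score can be computed and mapped to a monitor color. It uses the standard Apdex formula (satisfied samples, plus half of the tolerating samples, divided by the total). The function names, the sample response times, the target time `t`, and the `red_below` threshold of 0.85 are illustrative assumptions, not Move's actual configuration or a New Relic API.

```python
from typing import Iterable


def apdex(response_times: Iterable[float], t: float) -> float:
    """Standard Apdex: (satisfied + tolerating / 2) / total samples.

    satisfied:  response time <= t
    tolerating: t < response time <= 4 * t
    frustrated: everything slower (counts as zero)
    """
    times = list(response_times)
    if not times:
        raise ValueError("need at least one response-time sample")
    satisfied = sum(1 for rt in times if rt <= t)
    tolerating = sum(1 for rt in times if t < rt <= 4 * t)
    return (satisfied + tolerating / 2) / len(times)


def monitor_color(score: float, red_below: float = 0.85) -> str:
    """Hypothetical status rule: red when the score drops below the threshold."""
    return "red" if score < red_below else "green"


# Hypothetical response times in seconds, with an assumed target of 0.5 s.
samples = [0.2, 0.3, 0.4, 0.6, 1.5, 2.5]
score = apdex(samples, t=0.5)
print(f"Apdex = {score:.2f}, monitor is {monitor_color(score)}")
```

With these sample values, three requests are satisfied, two are tolerating, and one is frustrated, so the score works out to (3 + 1) / 6 and the monitor would show red under the assumed threshold.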
Other New Relic features are just as critical for keeping the Move team up to speed on the latest performance data. For example, New Relic Browser measures site responsiveness from the end user’s perspective, giving developers new insight into the effectiveness of the latest changes in production. The App Map feature offers deep insight into the interactions of all the components across the Move environment, including third-party applications that might be adversely affecting performance.
With Transaction Tracing, the Move team can follow the source of any problem down to the line of code, definitively identifying the root cause for faster resolution. In recent days, Move developers have begun experimenting with the Thread Profiling feature in New Relic, which offers impressively clear visibility into the status of the CLR process pipeline across the company’s .NET stack. “All of these features work together to give us an unprecedented level of insight into the health of our various websites,” said Daniel Woods, vice president of engineering for consumer services sites and applications for Move. “With New Relic, it’s all about isolating the problem so we can understand how to solve it without confusion or delay.”
Bridging the gap between development and operations
In the first six months of using New Relic to monitor the company’s large network of sites, Move has achieved an astonishing 82% drop in its production error rate. According to Steel, the main driver for such a dramatic change is the proactive stance enabled by New Relic technology. “We hardly ever get calls from our support centers anymore,” he said. “With this solution, we can usually stay one step ahead of emerging issues, and our average time to resolution is less than 20 minutes. Another nice benefit is that everyone’s a little calmer around the office, because we’re rarely in firefighting mode these days.”
As Move continues its aggressive transition toward an agile development environment, New Relic will play an especially valuable role by giving teams a more thorough understanding of QA cycles. “We’re constantly pushing our test function closer and closer to our development function,” said Arguelles. “Production releases happen once or twice a day, but test releases happen almost constantly. Our ultimate goal is continuous deployment on all fronts, where we simply aren’t afraid to release anything. New Relic will help us get there, because it shows us very clearly how any update in our test environment will behave when it’s actually in production.”
By gaining so much speed and flexibility in the deployment of new features, the Move team has seen a considerable increase in productivity on every development team. “We’re doing less scrambling because we’re not solving as many urgent performance problems, and that enables us to do more planning,” said Woods. “Everyone’s more efficient as a result. Best of all, New Relic frees our developers to spend far more time and energy on the core projects that add the most value to Move.”
New Relic also benefits Move by bridging the gap that typically exists between development and operations. “Unlike many monitoring tools, New Relic gives us operational data that actually makes sense to people in operations,” said Steel. “That enables us to maintain a two-way flow of information between the two groups. By creating better relationships between development and operations, we can create a better product.”