New Relic is giving us a view into our own platforms, with metrics and data about the interdependencies, how they're performing, and what the experience is like for real users across the planet that we could never access before ... But even more than this, New Relic is helping us on our DevOps and digital transformation journey.

Chris Callaghan
Head of DevOps at Royal Society of Chemistry

Keeping the world informed about the latest scientific advances depends on high-quality online publishing resources. New Relic is helping one of the world’s oldest and most respected science publishers, the Royal Society of Chemistry, modernize its software and infrastructure to champion the chemical sciences.

The Royal Society of Chemistry was founded in the United Kingdom in 1841 with a charter from Queen Victoria. Today, the society has more than 54,000 members worldwide and is one of the world’s leading non-profit publishers and knowledge businesses where significant advancements in chemistry are published and shared. 

The society has published and stored research since its foundation and has more than 1 million specialist articles, chapters, and records. This invaluable scientific resource is being constantly expanded as the society publishes 40-plus regular scientific journals in which its members share their latest research reports. In addition, there is a range of databases that provide access to high-quality chemical information for chemistry academics, professionals, and students. Ensuring access to these resources is vital for the society. Although it is a non-profit body, it competes with other institutions for end users of its services. Any problems with using the database or viewing and downloading papers must be prevented—or remedied quickly.

Moving to microservices

Responsibility for the online services’ hosting and application delivery lies with Chris Callaghan, head of DevOps. The challenge for him and his team has been managing a high volume of legacy technical debt while keeping pace with increasingly international customer demand. "Our systems have all grown organically over a really long period of time. We’re going through a process of modernization to improve how we provide access to our scientific knowledge and ensure we have the most robust and resilient architectures possible. This is leading us to consider how we can break down the monoliths and migrate to a microservices cloud-based architecture to improve the scalability and delivery performance," says Callaghan.

This led Callaghan and his colleague Graham Creek, senior site reliability engineer, to take the society on a technology journey away from its own data center and toward the Amazon Web Services (AWS) public cloud to provide the elasticity, business continuity, and resiliency the society needs. As a result, they are rebuilding and redesigning legacy applications for the public cloud.

As new applications were developed, deployed, and updated, it became clear that the team needed greater visibility into performance and any underlying issues. As Callaghan explains, "We were using tools that were OK but weren’t giving us the detailed information inside the applications that we needed. So, we were blown away when we saw how New Relic could drill down into the code and tell us so much more."

Being able to share this depth of insight also supported the society’s adoption of both SRE and DevOps approaches. "We’ve taken a deliberate approach to mine and utilize the strong knowledge that's already there and take our existing IT operations teams on a DevOps journey where we move them towards scripting and automation," explains Callaghan. "In defining the technology road map, it was also obvious that SRE was the model that was most sympathetic to what we wanted to achieve. New Relic gave us a monitoring platform that guides us on this path."

As new applications are rolled out, the ability to correlate between the application and infrastructure is important. "Before it was a complete black box. But with New Relic, we can see into the application performance, drill down into why the error rate spikes, and when application latency shoots up, see which particular node is suffering. This is so valuable when a new build has been pushed out because it’s now possible to get a powerful view of where any instability is coming from," says Callaghan.

Creek adds, "New Relic gives us a safety net so that when we’re doing the deployments, we can see immediately whether there’s any particular problem with that application. There’s a lot of complexity with some of the systems and how they interact, and the developers don’t necessarily know everything about the wider areas of our infrastructure. So New Relic enables us to see the effects of what we have done and then do rollbacks and make whatever changes we need to do."

A global view of real-user monitoring

In the day-to-day running of the services, New Relic is helping the team respond to performance issues that can naturally arise when the society’s databases are getting around 1 million hits per hour. Product, developer, and operations teams are all using New Relic dashboards extensively. 

"They really like the view of what the end-user performance is really like and are heavily using the error rate calculation and the alerting to react quickly via Slack channels to reduce downtime or slow performance," says Callaghan. "What’s so valuable here is how the visibility into application performance links back into the infrastructure and the code we’ve developed. Knitting all of those different elements into a single place is definitely an advantage for us, and we didn’t have this sort of capability to do that previously."

The insights help with both urgent operational issues and future planning for online services. As Callaghan explains, "We have a much better global view of real-user monitoring, which we never had before when we were doing synthetic-based monitoring. For example, New Relic really helped us identify key performance issues in China, where we found page load speed was really slow because of a specific plugin. This insight means we know how to adjust our road map accordingly."

The data provided by New Relic will become even more important for the society as it moves more of its modernized applications onto the public cloud, enabling the team to react to surges in demand for scientific research. "Demand for our publishing sites heavily follows the academic year," Callaghan says. "When we built and managed our own data center, we had the usual problems of capital expenditure to ready ourselves for spikes in demand. With our migration to AWS, we will use New Relic to help us to be able to flexibly scale with a clear view on the cost management of running applications in the cloud."

As the society continues its digital transformation journey, Creek sums up New Relic’s value, saying, "For me, the value of using New Relic is how it enables shared responsibility. When things go wrong, it’s not down to the developers or operations; everyone has an opportunity to see what’s going on and play a positive role in fixing any issues collaboratively because everybody has the same information."

Callaghan sums up the contribution that New Relic is making to the Royal Society of Chemistry: "New Relic is giving us a view into our own platforms, with metrics and data about the interdependencies, how they're performing, and what the experience is like for real users across the planet that we could never access before. That obviously helps us build better products and services for our global audience of scientists, researchers, and chemical science organizations. But even more than this, New Relic is helping us on our DevOps and digital transformation journey."