At plentymarkets, we have a strong customer-first strategy. E-commerce is a fast and highly competitive industry. Consumers expect to get what they want in seconds; otherwise, they’ll go elsewhere. Website performance, including availability, functionality, and page speed, is now a necessity, especially during peak seasons such as Black Friday and Christmas. To ensure that our customers experience the highest level of performance, real-time monitoring of our services is essential.

In the past, we relied on more than ten separate, self-managed monitoring solutions. Managing these tools was time-consuming, and we had hardly any time left to improve the customer experience. On top of that, only a few software engineers truly understood how to use each tool, which created a built-in bottleneck whenever an issue needed resolving.

In my position as area engineering manager for the cloud platform, I'm responsible for architectural decisions on infrastructure to improve user experience. But my real passion is optimizing working conditions, making work more efficient, and, in turn, increasing employee happiness across the organization. This is why I wanted to take our monitoring to the next level with observability.

Here are our top three takeaways since implementing New Relic at plentymarkets:

1. Quality alerts, quality sleep

We offer a wide variety of functionalities for many different use cases, which makes our tech stack relatively complex. To get visibility into our systems, we implemented several monitoring solutions over the years, and together they soon became overwhelming. On any given day, we received hundreds of updates from our tools via email or Telegram, flooding teams with data and information. Not only was it a considerable effort to keep track of the messages and prioritize them accordingly, but 98% of the notifications were false alarms.

We needed a solution that didn’t rely on a few subject-matter experts, one that multiple teams could use to identify and address issues. Cost optimization was also very important to us. We found that New Relic best met our requirements for pricing and support. New Relic has also enabled us to give more people access to the information they require, using a seamless single sign-on experience.

Next, we began setting alert standards to empower as many people as possible to make informed choices and resolve issues. We also established guidelines on decision-making. We prioritized autonomy when it came to updating our alerts. Notifications are only effective if you define the ones that matter—and different people will need different notifications. By tailoring alerts for each team member, we were able to dramatically reduce alert fatigue and give engineers the information they needed to be successful. Once we widened the circle of who had access to data, we set those people up for success.
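To make the idea of tailored notifications concrete, here is a minimal Python sketch of per-person alert routing. It is an illustration only: the `Alert`, `Subscription`, and `route` names, the severity scale, and the example engineers and services are all hypothetical, and the code does not use any New Relic API or reflect our actual configuration.

```python
from dataclasses import dataclass

# Hypothetical severity scale, ordered from least to most urgent.
SEVERITY_ORDER = {"info": 0, "warning": 1, "critical": 2}


@dataclass
class Alert:
    service: str
    severity: str
    message: str


@dataclass
class Subscription:
    engineer: str
    services: set[str]              # services this person cares about
    min_severity: str = "warning"   # ignore anything less urgent than this


def route(alert: Alert, subscriptions: list[Subscription]) -> list[str]:
    """Return the engineers who should be notified about this alert."""
    return [
        sub.engineer
        for sub in subscriptions
        if alert.service in sub.services
        and SEVERITY_ORDER[alert.severity] >= SEVERITY_ORDER[sub.min_severity]
    ]


# Hypothetical subscriptions: each person only hears about what matters to them.
subscriptions = [
    Subscription("alice", {"checkout", "payments"}, min_severity="critical"),
    Subscription("bob", {"checkout"}, min_severity="warning"),
    Subscription("carol", {"search"}),
]

recipients = route(
    Alert("checkout", "warning", "p95 latency above threshold"), subscriptions
)
print(recipients)  # ['bob']
```

In this sketch, a warning on the checkout service reaches only the one person who asked for warnings on that service. Everyone else stays undisturbed, which is the effect we were after when reducing alert fatigue.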

Lukas Wöhrl, area engineering manager for the cloud platform at plentymarkets, shares why he chose New Relic over a self-managed open source solution.

2. Application performance monitoring

We replaced our self-managed APM solution with New Relic application performance monitoring. Previously, we focused on monitoring the underlying infrastructure. With New Relic, we monitor application performance in more depth while still keeping an eye on our infrastructure. We’ve also increased the volume of data we monitor as well as the speed at which we monitor it. The performance implications were huge: expanding the amount of data we ingest and monitor means that we’re more likely to catch an anomaly before it ever reaches one of our customers.

New Relic is becoming the only monitoring solution at plentymarkets. Everyone has the same visibility and is encouraged to work together to identify and address problems.

3. Real-time troubleshooting for customer issues

Before consolidating our tools into New Relic, investigating any customer issue took multiple steps across our monitoring tools. We wanted to simplify that process and cut response times to further improve system performance. Now, we collect all the data, including transactions, system performance, and customer impact, in a custom dashboard powered by template variables, so we have immediate access when issues arise. This approach required several people to give up ownership of certain parts of the stack. But that data is no longer guarded by a few engineers, so we can improve our mean time to resolution and enhance the overall customer experience.
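For readers unfamiliar with template variables, the idea is that one dashboard query contains placeholders that are filled in per customer or per application at view time. The Python sketch below mimics that substitution to show the concept. The `render` helper, the placeholder names, the example application name, and the naive quoting are assumptions made for illustration; in practice the dashboard performs the substitution itself, and this is not our actual dashboard query.

```python
import re

# Hypothetical query template; {{app}} and {{window}} are placeholders.
QUERY_TEMPLATE = (
    "SELECT average(duration) FROM Transaction "
    "WHERE appName = {{app}} SINCE {{window}} minutes ago"
)

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def render(template: str, variables: dict[str, object]) -> str:
    """Fill {{name}} placeholders; strings are single-quoted, other values inserted as-is."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"no value supplied for template variable '{name}'")
        value = variables[name]
        if isinstance(value, str):
            # Naive escaping of single quotes; an assumption for this sketch.
            return "'" + value.replace("'", "\\'") + "'"
        return str(value)

    return PLACEHOLDER.sub(substitute, template)


# Hypothetical values, e.g. chosen from a dropdown while triaging an issue.
query = render(QUERY_TEMPLATE, {"app": "checkout-service", "window": 30})
print(query)
```

Because the query is written once and parameterized, anyone on the team can switch the dashboard to the affected application without knowing how the underlying query was built.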