For a global tech platform for luxury fashion, failure and downtime are not an option. Senior Principal Engineer Manuel Garcia at farfetch.com explains how he and his team of over 130 engineers get ready for Black Friday with observability.
FARFETCH, like most retail businesses, is already gearing up for one of the biggest shopping days of the year. For the leading global technology platform for the luxury fashion industry, customer experience needs to be seamless, and Black Friday is no exception.
On this day, millions of customers scour their favorite sites for the best deals on exclusive products. In 2020, online shopping on Black Friday surged by 22% to a record $9 billion, according to Adobe Analytics—and farfetch.com traffic tripled.
FARFETCH is a leading global technology platform for the luxury fashion industry. In 2008 it launched as an e-commerce marketplace for luxury fashion and quickly grew into a unique shopping experience housing over 1,400 brands. With 1,000 engineers divided into domains and clusters, FARFETCH also develops key technologies and services for the industry through FARFETCH Platform Solutions.
As Senior Principal Engineer, I support the engineering domain responsible for farfetch.com. I'm tasked with preparing for the sales season, which includes analyzing and planning every action needed to prevent severe failures or downtime. Here's how my team uses observability to win on Black Friday.
Start below the glass
At farfetch.com, everything our customers experience on Singles’ Day, Black Friday, and Cyber Monday—whether it’s on desktop, browser, or mobile phone—is below the glass. We look at everything through that critical interface.
During this busy time, my engineering team works 24/7 to proactively find and fix problems before customers are impacted. This means architecting our software, infrastructure, and cloud investments to scale while providing the efficiency, uptime, and performance crucial to a strong customer experience.
For farfetch.com it’s about having robust software at the core of everything we do. And it’s not without challenges. To prepare our site for triple the traffic during the holiday season, we need team collaboration and, most importantly, the insight to adapt our infrastructure to add new services and capabilities. We need to deliver business growth without compromising on user experience.
As a team, we are ambitious. We want to do better than last year; we want to improve and exceed expectations. New Relic is crucial to achieving this and ensuring that we have a successful sales season.
Plan, test, observe, repeat
Built on a microservice architecture, farfetch.com is part of a complex system. This type of architecture relies on a network of hundreds of microservices, new capabilities, and input/output work to ensure our customers get the best experience on Black Friday. But these downstream services often use different technologies, are built by different people, and behave differently under high loads.
To add to the complexity, we are managing huge volumes of data across multiple geographical locations and different data stores. This data needs to be quickly processed, understood, and acted on. Then, our platform can pivot based on traffic, customer behavior, and business needs.
To combat some of these problems, we built our platform on Microsoft Azure, which gives us the agility to scale based on Black Friday traffic. We also rotate our teams to monitor every part of the platform 24/7, distributing responsibility across different departments. This gives my engineering teams autonomy while ensuring that every aspect of our site is covered on the big day.
But our approach to Black Friday is not just reactive. A vital element of our preparation begins months before and focuses on testing, testing, and more testing.
I work across my team to review alerts and timeouts leading up to the day. We try to guarantee any necessary fallback/redundancy in our architecture as a Plan B. This contingency plan can be a real lifesaver during a peak day. We conduct fire drills on the platform, execute load tests, look for weaknesses, and create reports for the different microservices with massive help from New Relic.
We also create data in New Relic to monitor our resilience. We can tell when a service is degraded or unavailable, and when circuit breakers open or close. Even better, we can monitor the offload we create on other services and see whether we need to switch off features to preserve the best possible user experience.
We also implement code freezes to stabilize the platform as much as possible. A freeze gives my team a clear, reproducible view of how the platform behaves on high-traffic days, so we can anticipate and troubleshoot issues before they impact users.
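To make the circuit-breaker idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not FARFETCH's implementation and not a New Relic API: the `CircuitBreaker` class, its thresholds, and the in-memory `events` list (a stand-in for whatever telemetry sink records state changes) are all hypothetical.

```python
import time


class CircuitBreaker:
    """Minimal circuit breaker that records every state transition."""

    def __init__(self, name, failure_threshold=3, reset_after_s=30.0,
                 clock=time.monotonic):
        self.name = name
        self.failure_threshold = failure_threshold  # failures before opening
        self.reset_after_s = reset_after_s          # cool-down before a retry
        self.clock = clock                          # injectable for testing
        self.state = "closed"
        self.failures = 0
        self.opened_at = 0.0
        # Hypothetical stand-in for a telemetry sink: (service, new state).
        self.events = []

    def _transition(self, new_state):
        if new_state != self.state:
            self.state = new_state
            self.events.append((self.name, new_state))

    def call(self, func, fallback):
        # While open, skip the downstream call until the cool-down elapses.
        if self.state == "open":
            if self.clock() - self.opened_at < self.reset_after_s:
                return fallback()
            self._transition("half_open")  # allow a single trial call
        try:
            result = func()
        except Exception:
            self.failures += 1
            if (self.state == "half_open"
                    or self.failures >= self.failure_threshold):
                self.opened_at = self.clock()
                self._transition("open")
            return fallback()
        # Success: reset the failure count and close the breaker.
        self.failures = 0
        self._transition("closed")
        return result


def flaky_recommendations():
    # Hypothetical downstream service that always times out.
    raise TimeoutError("recommendations service timed out")


breaker = CircuitBreaker("recommendations", failure_threshold=2)
for _ in range(3):
    # The fallback "switches off" the feature by returning no items.
    items = breaker.call(flaky_recommendations, fallback=lambda: [])
print(breaker.state, breaker.events)
```

The fallback is where a feature would be degraded or switched off, and the recorded transitions are the kind of signal a dashboard could chart to show when a dependency went unhealthy.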
One clear view
Good dashboards and one clear view of our platform are invaluable tools, both for my engineering team and the different stakeholders we report back to. We did a lot of work over the last 24 months to make that possible.
We built an observability map (programmability platform) that lets us zoom in on the performance of each data center, which gives us a bird's-eye view. We also created satellite dashboards so that my team has clear, actionable insights to monitor our microservices and our users' browsers on Black Friday. Dashboards also allow our main stakeholders to see how our site dealt with periods of high traffic, high revenue, and high risk, like Black Friday.
Observability and New Relic are embedded in farfetch.com and how we manage our platform. They allow us to follow important signals, such as response time and error rate, and give all of us a bird's-eye view of what’s going on across multiple microservices and key markets.
New Relic is a tool that my engineering team uses every day, not just on Singles’ Day, Black Friday, and Cyber Monday.
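As a rough illustration of the two signals mentioned above, the sketch below computes an error rate and a 95th-percentile response time from a batch of request samples. The `Request` type, the choice to count only 5xx statuses as errors, and the nearest-rank percentile method are assumptions made for the example; they do not describe how New Relic or FARFETCH calculates these values.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    duration_ms: float  # time taken to serve the request
    status: int         # HTTP status code returned


def error_rate(requests: List[Request]) -> float:
    """Fraction of requests that ended in a server error (5xx)."""
    if not requests:
        return 0.0
    errors = sum(1 for r in requests if r.status >= 500)
    return errors / len(requests)


def percentile(values: List[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct% at or below it."""
    if not values:
        raise ValueError("percentile of an empty list is undefined")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Ten hypothetical samples: durations 10..100 ms, two of them server errors.
samples = [
    Request(duration_ms=10.0 * i, status=500 if i in (3, 7) else 200)
    for i in range(1, 11)
]
rate = error_rate(samples)
p95 = percentile([r.duration_ms for r in samples], 95)
print(f"error rate: {rate:.0%}, p95 response time: {p95} ms")
```

A percentile is used rather than an average because a handful of slow requests can hide behind a healthy mean, which matters most on a peak traffic day.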
Read how FARFETCH improved Core Web Vitals by 40% in this case study.
The views expressed on this blog are those of the author and do not necessarily reflect the views of New Relic. Any solutions offered by the author are environment-specific and not part of the commercial solutions or support offered by New Relic. Please join us exclusively at the Explorers Hub (discuss.newrelic.com) for questions and support related to this blog post. This blog may contain links to content on third-party sites. By providing such links, New Relic does not adopt, guarantee, approve or endorse the information, views or products available on such sites.