This post is the first in a two-part series on load balancing. It addresses local load balancing; part two will focus on global load balancing.

People often ask two questions about local load balancing: “Do I need to do it?” and “When do I need to do it?”

The answers: Yes, and always.

Let’s start by defining what we mean by “local” load balancing. This is load balancing within the data center: getting a request to one of the many servers running a web application, for example. Local load balancing is a basic piece of infrastructure, and there’s really no excuse for not doing it.

There are two key reasons why local load balancing is a must:

  • Reason #1: To achieve high availability that’s sustainable as you grow. You need at least two backend servers for high availability, and your load balancer will ensure that if one backend isn’t functioning, the traffic will be directed to the other backend. I’m going to focus on HTTP here, but this holds true for mail servers or anything that answers traffic coming in on TCP or that pulls items off a backend work queue.
  • Reason #2: To put a control point in front of your services. This benefit doesn’t really have anything to do with balancing or distributing the load. In fact, even if you had a service with a single backend, you’d still want a load balancer. Having a control point enables you to change backends during deploys, to add filtering rules, and to generally manage how your traffic flows. It gives you the ability to change how your service is implemented on the backend without exposing those changes to the people who consume your service on the frontend. That could be an external customer, an internal user, or even another service in the data center.
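
To make the first reason concrete, here is a minimal sketch of a balancer that rotates across backends and skips any that are marked unhealthy. The class name, the backend addresses, and the manual `mark_down` call are all assumptions for illustration; a real load balancer would drive health state from periodic checks rather than explicit calls.

```python
from itertools import count


class RoundRobinBalancer:
    """Pick the next healthy backend in rotation (illustrative only)."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)  # assumption: everything starts healthy
        self._counter = count()

    def mark_down(self, backend):
        # Stand-in for a failed health check.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        # Stand-in for a recovered health check.
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Try each backend at most once per call, skipping unhealthy ones.
        for _ in range(len(self.backends)):
            candidate = self.backends[next(self._counter) % len(self.backends)]
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends available")


# Hypothetical backend addresses.
lb = RoundRobinBalancer(["web-1:8080", "web-2:8080"])
first, second = lb.pick(), lb.pick()

# Simulate one backend failing: traffic keeps flowing to the other.
lb.mark_down("web-1:8080")
after_failure = [lb.pick() for _ in range(3)]
```

With both backends up, requests alternate between them; once `web-1:8080` is marked down, every request goes to `web-2:8080` without the caller needing to know anything changed.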

Load balancing made easy

Creating a load balancer should be quick and easy. It should be something that just happens, or at least seems that way. Give people a system where they can always get a lightweight software load balancer or virtual setup associated with a hardware load balancer as part of their application so that every application, even if it’s completely trivial, has that load balancer control point.

Every application needs a load balancer, so your system for provisioning an app should include a load balancer automatically, whether as software or a configuration within a hardware load balancer. A good example of this is the Heroku platform. If you deploy an app to Heroku, even if you’re just running a single instance, there’s always a load balancer in front of it. Many different stacks offer this kind of “automatic” approach these days.

There are plenty of technological approaches to effective, easy-to-use load balancing.

Some organizations rely on ready-to-use commercial solutions like Amazon’s Elastic Load Balancing (ELB) or F5’s Local Traffic Manager; others build it into software with HAProxy, nginx, or another HTTP proxy or web server that’s deployed as a load balancer. Again, the most important thing is that it must be easy to use and must let you put business logic or rules in place, such as “Traffic is currently going to these backends, but after this deploy it will go to these other backends.”
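
A rule like that one can be sketched as a backend pool that is replaced in a single step at deploy time. This is only an illustration of the idea, not how ELB, F5, HAProxy, or nginx implement it; `ControlPoint`, `swap_backends`, and the backend names are hypothetical.

```python
import threading


class ControlPoint:
    """Route requests to the current backend pool; the pool can be swapped at deploy time."""

    def __init__(self, backends):
        self._lock = threading.Lock()
        self._backends = self._validated(backends)
        self._next = 0

    @staticmethod
    def _validated(backends):
        pool = tuple(backends)
        if not pool:
            raise ValueError("backend pool must not be empty")
        return pool

    def swap_backends(self, new_backends):
        # Replace the whole pool under the lock so no request sees a mix of old and new.
        pool = self._validated(new_backends)
        with self._lock:
            self._backends = pool
            self._next = 0

    def route(self):
        # Simple round-robin over whichever pool is current.
        with self._lock:
            backend = self._backends[self._next % len(self._backends)]
            self._next += 1
            return backend


# Hypothetical backend names for the old and new versions of an app.
cp = ControlPoint(["app-v1-a", "app-v1-b"])
before = [cp.route() for _ in range(2)]

cp.swap_backends(["app-v2-a", "app-v2-b"])  # the "after this deploy" rule
after = [cp.route() for _ in range(2)]
```

Callers keep using `route()` throughout, which is the point of a control point: the change in backends is invisible to whoever consumes the service.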

Load balancing concerns

Adding a load balancing layer in front of your backends is not without its own concerns. The foremost is that you’re creating a single point of failure in your architecture. If you have 10 backend web servers, any single one of them can fail without service interruption. But if you have a single load balancer, its failure would knock out that whole tier. That means your load balancer has to be very resilient, even more so than any of the resources behind it. This requires multiple load balancers deployed in a high-availability configuration.
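
A rough way to see why redundancy matters is to estimate the availability of a tier that stays up as long as at least one load balancer is up. The sketch below assumes failures are independent, which is optimistic in practice (shared power, network, or configuration can take out both at once), and the 99% figure is a made-up input, not a measurement.

```python
def tier_availability(single: float, replicas: int) -> float:
    """Availability of a tier that works while at least one replica is up.

    Assumes replicas fail independently of one another.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be a probability between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # The tier is down only when every replica is down at the same time.
    return 1.0 - (1.0 - single) ** replicas


# Hypothetical per-load-balancer availability of 99%.
one_lb = tier_availability(0.99, 1)
two_lbs = tier_availability(0.99, 2)
```

Under these assumptions, going from one load balancer to two moves the tier from roughly 99% to roughly 99.99% availability.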

At New Relic, we primarily use F5’s Local Traffic Manager for our customer-facing load balancing, which has a failover mechanism to move the load balancer IP over to another piece of hardware and pick up the traffic if a problem occurs. Other key pieces of our load balancing approach include:

  • Load balancers use Border Gateway Protocol (BGP) internally. With this approach, a growing trend in modern data center networks, every load balancer publishes a route to the virtual IP used for the service, and it’s up to the routers to choose which load balancer serves the traffic. Our routers therefore balance at the IP level across our HTTP load balancers, so you’re always able to reach one of them. The HTTP load balancers then perform HTTP proxy-style balancing into our web backends. This gives us a highly available load balancer tier and, behind it, the services that we’re balancing.
  • Filtering logic and other behavioral rules in the load balancing tier. The “control point” benefit I discussed above is critical to how we do load balancing at New Relic. We bake a lot of important logic into this layer. For internal traffic, we use it for things like making sure that our tracking headers are in place and that everything looks good. For traffic coming in from the outside, we can use it to implement rate limits, to blacklist bad actors who are trying to access the system, or to route requests to new services.
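
The kind of filtering described in the second bullet can be sketched as a check that runs before a request is forwarded. This is not New Relic's implementation or F5 configuration; the `EdgeFilter` class, the sliding-window limit, the status codes chosen, and the documentation-range IP addresses are all assumptions for illustration.

```python
import time
from collections import defaultdict, deque


class EdgeFilter:
    """Reject blocked clients and enforce a per-client sliding-window rate limit."""

    def __init__(self, blocked_ips, max_requests, window_seconds, clock=time.monotonic):
        self.blocked_ips = set(blocked_ips)
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behavior can be tested without sleeping
        self._recent = defaultdict(deque)  # client IP -> timestamps of accepted requests

    def check(self, client_ip):
        """Return an HTTP-style status: 403 blocked, 429 rate limited, 200 allowed."""
        if client_ip in self.blocked_ips:
            return 403
        now = self.clock()
        timestamps = self._recent[client_ip]
        # Drop timestamps that have aged out of the window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_requests:
            return 429
        timestamps.append(now)
        return 200


# A fake clock keeps the example deterministic.
fake_now = [0.0]
edge = EdgeFilter({"203.0.113.9"}, max_requests=2, window_seconds=60,
                  clock=lambda: fake_now[0])

burst = [edge.check("198.51.100.7") for _ in range(3)]
blocked = edge.check("203.0.113.9")

fake_now[0] = 61.0  # the window has passed
after_window = edge.check("198.51.100.7")
```

Here the third request in the burst is rejected, the blocked address is always refused, and the rate-limited client is allowed again once the window has elapsed.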

Stay tuned: In part two of this series, we’ll take a close look at global load balancing, where we’re concerned not with traffic moving between multiple web servers, but rather with traffic moving between multiple data centers.

Be sure to also read Load Balancing 101: Understanding Global Load Balancing.

