Using HAProxy to Build a More Featureful Elastic Load Balancer

Though Shareaholic is hosted in the AWS cloud, we avoid depending on Amazon’s virtualized cloud services whenever possible. If we ever hit a bottleneck in AWS, I want to be able to switch providers without needing to rebuild a core piece of our architecture. I also don’t want our tech team to have to make product and infrastructure sacrifices just so that we conform to AWS standard practices.

Load balancing with HAProxy was the first example of a service that Amazon provides, that we felt was better to manage ourselves.

Here’s how we did it.

Why HAProxy?

I’ve used HAProxy nonstop since 2009, and to date it has never been the culprit of a single problem I’ve experienced. It is extremely resource efficient and it is impeccably maintained by Willy Tarreau.

It provides both HTTP and TCP load balancing, and a powerful feature set not found in Amazon’s ELB.

Monitoring

HAProxy provides a dashboard with global stats, server stats, and aggregate stats for each frontend and backend. On a single page you can see which servers are busiest, which ones are up and which ones are down. You can also view this data at a service level, so if you have a three node cluster and only two of the servers were down at any point in time, HAProxy correctly summarizes that you did not experience any downtime.

While writing this post, one of our rails boxes had degraded performance due to too many background jobs running. Using the HAProxy dashboard, I quickly spotted it. You probably can too:

Shareaholic's HAProxy Dashboard

(Mouseover the image for the spoiler).

Configurability

HAProxy is highly configurable. You have over a half dozen load balancing algorithms to choose from. You can provide relative weighting for different servers in case you have boxes of differing resources. You can specify small health check timeouts on servers that should resond quickly and longer ones for servers that are slower. You can set these preferences globally, on a per-backend basis or on a per-server basis.

Requests as Mutable Objects

We often think of a web requests as being static, but with HAProxy we can bend them to our will to do important things.

Let’s start with the prototypical load balancer example: adding an X-Forwarded-Proto header so that our app server knows it’s receiving an SSL request even though it’s coming through on port 80:

reqadd X-Forwarded-Proto:\ https

Now let’s try something we can’t do with ELB: perform HTTP Basic Authentication before a request hits our app server.

userlist admins
  user myusername insecure-password mypassword

frontend restricted_cluster
  acl auth_tintoretto http_auth(admins)
  http-request auth realm ShareaholicRestricted

Any web request being routed to restricted_cluster will need to contain valid HTTP auth credentials in order for that request to be sent to an application server. At Shareaholic we use this for our internal products and dashboards. This lets us manage authentication in a single place, and prevents us from needing to worry about it at the application level.

Frontends

HAProxy supports an infinite number of independently configurable load balancing frontends. In addition to HTTP load balancing, it can be used in TCP mode for general purpose load balancing. This means that if you have a redis database with three read slaves, you can use the same HAProxy instance that balances your web traffic to balance your redis traffic.

Backends and ACLs

Backends and ACLs enable you to host large-scale web sites and web services in ways not possible with simple request forwarding.

For example, did you know that the shareaholic.com domain is hosted partly in a PHP cluster, partly in a Rails cluster, partly in a Python cluster and partly as static content on github.com? We accomplish this with backends and ACLs.

First, we define an ACL based on the path of the URL request. An abbreviated example:

# Tech blog traffic goes to tech blog
acl techblog path_beg /tech
use_backend sharingthetech if techblog

# Other traffic goes to rails app
default_backend rails_app

Then we define backends composed of one or more servers to handle the different kinds of traffic:

backend rails_app :80
  option httpchk /haproxy_health_check
  server rails-1 10.1.1.1:8080
  server rails-2 10.1.1.2:8080
  server rails-3 10.1.1.3:8080

backend sharingthetech :80
  reqirep ^Host: Host:\ shareaholic.com
  server github-pages shareaholic.github.com:80

There are two things to note in this example:

First, our rails servers are configured to receive their health checks at /haproxy_health_check. These requests are handled by a proc in our routes.rb. We chose this rather than a static file on disk because we want the rails stack to be invoked during the health check; if apache is up but rails isn’t working, we want the health check to fail.

Second, Github pages only works for a single host (CNAME), so in our sharingthetech backend, we rewrite the Host header to ensure it is shareaholic.com, which is what’s contained in our Github Pages CNAME file.

Note: Since writing this post we have moved our tech blog from http://shareaholic.com/tech to http://tech.shareaholic.com. This example illustrates how we had hosted it at http://shareaholic.com/tech.

Shareaholic’s HAProxy Production Setup

Deployment

We run HAProxy on EC2 small boxes backed by instance storage. We do not back up these instances because Chef generates the config files on the fly. We use a custom fork of the haproxy cookbook that provides support for compiling HAProxy from source. We use the latest 1.5-dev12 build.

The shareaholic.com A record points to an Elastic IP address, which we assign to our load balancer. A second load balancer is always running; if the primary load balancer goes down, we simply reassign the Elastic IP address to our backup load balancer.

Because HAProxy consumes negligible CPU cycles and memory when not in use, we save money by avoiding a single tenant backup balancer. Instead we run our backup balancer on a utility box that is doing other work, and that utility box doubles as a hot standby that can handle traffic in a pinch.

HAProxy Configuration

Here is our production haproxy.cfg. The only changes are 1) redactions to protect authentication credentials and 2) Changes to IP addresses and paths to protect potential attack vectors:

haproxy.cfg

global
  maxconn 15000 # Max simultaneous connections from an upstream server
  spread-checks 5 # Distribute health checks with some randomness
  log 127.0.0.1 local0
  log 127.0.0.1 local1 notice
  #debug # Uncomment for verbose logging

defaults # Apply to all services
  log global
  mode http
  balance roundrobin
  option abortonclose # abort request if client closes output channel while waiting
  option httpclose # add "Connection:close" header if it is missing
  option forwardfor # insert x-forwarded-for header so that app servers can see both proxy and client IPs
  option redispatch # any server can handle any session
  timeout client 60s
  timeout connect 9s
  timeout server 30s
  timeout check 5s
  stats enable
  stats uri /stats # Real path redacted
  stats realm Haproxy\ Statistics
  stats auth username:password # Real credentials redacted
  monitor-uri /monitor # Returns 200 if we're up; real path redacted
  errorfile 503 /etc/haproxy/errors/503.http

  userlist shareaholic_admins
    user myuser insecure-password mypass # Real auth redacted

  frontend incoming
    bind *:80
    reqadd X-Forwarded-Proto:\ http

    acl sharingthetech path_beg /tech
    acl tintoretto_app hdr_dom(host) -i tintoretto

    acl php_app path_beg /about /contact /help /media
    acl php_app path /api /api/ /tools/browser /tools/browser/
    acl php_app hdr_dom(host) -i blog.shareaholic.com
    acl php_app hdr_dom(host) -i sexybookmarks.shareaholic.com
    acl php_app hdr_dom(host) -i www.sexybookmarks.shareaholic.com

    use_backend sharingthetech if sharingthetech
    use_backend tintoretto_app if tintoretto_app
    use_backend php_app if php_app

    default_backend rails_app

  frontend incoming_ssl
    bind *:8443 accept-proxy
    reqadd X-Forwarded-Proto:\ https

    acl sharingthetech path_beg /tech
    acl tintoretto_app hdr_dom(host) -i tintoretto

    acl php_app path_beg /about /contact /help /media
    acl php_app path /api /api/ /tools/browser /tools/browser/
    acl php_app hdr_dom(host) -i blog.shareaholic.com
    acl php_app hdr_dom(host) -i sexybookmarks.shareaholic.com
    acl php_app hdr_dom(host) -i www.sexybookmarks.shareaholic.com

    use_backend sharingthetech if sharingthetech
    use_backend tintoretto_app if tintoretto_app
    use_backend php_app if php_app

    default_backend rails_app


  backend sharingthetech :80
    reqirep ^Host: Host:\ shareaholic.com
    server github-pages shareaholic.github.com:80 check inter 5000 fastinter 1000 fall 1 weight 50

  backend php_app :80
    option httpchk HEAD /version.txt HTTP/1.1\r\nHOST:\ www.shareaholic.com
    server master 10.1.1.1:80 check inter 5000 fastinter 1000 fall 1 weight 1
    server slave 10.1.1.2:80 check inter 5000 fastinter 1000 fall 1 weight 1

  backend rails_app :80
    option httpchk /haproxy_health_check
    server rails-1 10.1.1.3:8080 check inter 5000 fastinter 1000 fall 1 rise 1 weight 1 maxconn 100
    server rails-2 10.1.1.4:8080 check inter 5000 fastinter 1000 fall 1 rise 1 weight 1 maxconn 100
    server rails-secondary-a-1 10.1.1.5:8080 check inter 5000 fastinter 1000 fall 1 rise 1 weight 1 maxconn 100
    server rails-secondary-c-1 10.1.1.6:8080 check inter 5000 fastinter 1000 fall 1 rise 1 weight 1 maxconn 100

  backend tintoretto_app :80
    server tintoretto-1 10.1.1.7:8080 check inter 5000 fastinter 1000 fall 1 rise 1 weight 1
    acl auth_tintoretto http_auth(shareaholic_admins)
    http-request auth realm Tintoretto if !auth_tintoretto

SSL Termination

Terminating SSL at the Load Balancer: Why?

Many web sites defer SSL termination to the app servers that handle the individual requests. Because of the way we route traffic at Shareaholic, this is not possible for us. In order to choose a backend based on the request’s path or its Host header, we need to be able to inspect the request. This is only possible once the request has been decrypted.

An additional benefit of terminating SSL at the load balancer is the operational DRY it provides. Rather than configure SSL termination for JBoss and Apache and nginx and any other app server we run, we can configure it in one place. When it’s time to update our SSL keys, we just drop them into that one place and redeploy the box.

How We Terminate SSL

As of build 1.5-dev12 (September 10, 2012), HAProxy can be configured to perform SSL termination. We however use stud for this task because HAProxy did not have this feature when we began using it. We considered the more popular stunnel but found that stud provided substantially better performance. This is very important to us because SSL termination is the most expensive part of load balancing, and we want to squeeze as many connections as possible out of our EC2 boxes.

I’ve open sourced the stud Chef cookbook that I wrote for deploying stud to our production servers. Even if you don’t use Chef, that cookbook is instructive in how to get stud up and running quickly.

Performance and Scalability

While writing this post, HAProxy reports 161 concurrent connections with a new connection rate of 67/second. It is using about 4.1% of the single EC2 compute unit allocated to our EC2 small box.

stud is consuming an additional 9.7% of the CPU. We force SSL on all pages, so that is with near-100% of traffic flowing over SSL. (I say “near” to account for initial HTTP requests being redirected to their HTTPS equivalents). Based on these numbers, a single EC2 compute unit can handle 1167 concurrent connections, with an arrival rate of 486 connections/second. (RAM is a non-factor; combined, HAProxy and stud are consuming only 73.6 MB).

We can scale vertically on AWS until we need more than 88 EC2 compute units. That means we won’t need to tackle horizontal scalability until we are handling over 100,000 concurrent connections or accepting over 40,000 new connections per second.

If you’re fortunate enough to have that problem, the simplest way to scale horizontally is with Round-Robin DNS. This enables you to point your A record to multiple load balancers straight from your DNS.

Go Give It a Try!

Among our Chef cookbooks and the configuration posted above, we’ve open sourced everything you need to build and deploy your own load balancer, complete with SSL termination and examples for path-based and domain-based routing.

If you’ve ever wanted to host your blog on your application domain, or if you’re trying to avoid yet another provider-specific dependency in your tech stack, I encourage you to give this a try and see how easy it is to do load balancing yourself.

External References

Follow @freerobby

Sharing the Tech

The Shareaholic Engineering Blog