(This is the third part of the Scaling Django to 30K requests/s series)
We use HAProxy on EC2 instances to load-balance the incoming HTTP requests across the web server boxes. Amazon provide the Elastic Load Balancer (ELB) service, which does a similar thing, so why did we run our own?
The biggest difficulty for us with ELB is that our traffic peaks very quickly when the TV show is on. For example, on Saturday, in the window from 3 minutes before Britain’s Got Talent started to 2 minutes into the show, the load on our servers tripled, and at its peak it reached 10x the baseline. ELB provides nearly infinite capacity, but takes tens of minutes to scale up – too slow for our needs.
Using our own HAProxy nodes lets us pre-scale to cope with the expected peak demand, while dynamically scaling the web layer below.
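The shape of that setup is roughly the following – a minimal HAProxy sketch, where the backend names, addresses, ports, and health-check path are all illustrative assumptions, not our production config. The server lines are what would be regenerated as web nodes are added and removed underneath the pre-scaled proxy layer:

```
# Minimal HAProxy sketch (illustrative names/addresses, not production config)
frontend http-in
    bind *:80
    default_backend django_web

backend django_web
    balance roundrobin
    # health-check path is an assumption; any cheap endpoint works
    option httpchk GET /health
    # these entries are rewritten as the web layer scales up and down
    server web1 10.0.0.11:8000 check
    server web2 10.0.0.12:8000 check
```

Because the HAProxy nodes themselves are provisioned ahead of the broadcast, only this backend list needs to change during the show, which is a much faster operation than waiting for a managed load balancer to warm up.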
One thing there doesn’t seem to be much public information on is the peak load a single EC2 node can handle. Our testing showed that a c1.medium could handle approximately 5,000 incoming connections per second. An m1.small handled somewhat fewer, but larger instance sizes didn’t provide an increase. It seems there’s some EC2 network/hypervisor/something-else limit that means > 5K/s/node isn’t achievable.