GitHub will open-source the GitHub Load Balancer (GLB), its internally developed load balancer.
GLB was originally built to accommodate GitHub’s need to serve billions of HTTP, Git, and SSH connections daily. Now the company will release components of GLB as open source and share details of its design.
“Historically one of the more complex components has been our load-balancing tier,” said Joe Williams, GitHub senior infrastructure engineer, and Theo Julienne, GitHub infrastructure engineering manager, in a co-authored bulletin. “Traditionally we scaled this vertically, running a small set of very large machines running haproxy and using a very specific hardware configuration allowing dedicated 10G link failover.”
But when the load-balancing platform hit its limit, the company set out to develop its own solution. This new platform would have to meet certain goals, including horizontal scaling, high availability, support of connection draining, and resilience to DDoS attacks. “To achieve these goals we needed to rethink the relationship between IP addresses and hosts, the constituent layers of our load balancing tier, and how connections are routed, controlled, and terminated.”
In designing its load balancer, GitHub sought to improve on the common pattern for the traffic director tier. The company settled on a variant of rendezvous hashing that supports multiple lookups. Each proxy host is stored and assigned a state, and these states handle connection draining. A fixed-size forwarding table is generated, and each row is filled with proxy servers using the ordering component of rendezvous hashing. The table and proxy states are sent to director servers and kept in sync.
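To make the table-building step concrete, here is a minimal Python sketch of how a fixed-size forwarding table could be filled using the ordering that rendezvous hashing produces. It is an illustration under stated assumptions, not GitHub's implementation: the SHA-256 scoring function, the table size of 256 rows, the two entries kept per row, and the names `score` and `build_table` are all hypothetical.

```python
import hashlib


def score(row: int, proxy: str) -> int:
    """Rendezvous-hashing weight for a (row, proxy) pair.

    SHA-256 is an assumption made for this sketch; any well-mixed hash
    would serve the same purpose.
    """
    digest = hashlib.sha256(f"{row}:{proxy}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def build_table(proxies, rows=256, depth=2):
    """Build a fixed-size forwarding table.

    Each row ranks every proxy by its score for that row and keeps the
    top `depth` entries, so a row can answer more than one lookup
    (for example, a first choice and a fallback).
    """
    table = []
    for row in range(rows):
        ranked = sorted(proxies, key=lambda p: score(row, p), reverse=True)
        table.append(ranked[:depth])
    return table


if __name__ == "__main__":
    proxies = ["proxy-a", "proxy-b", "proxy-c", "proxy-d"]
    before = build_table(proxies)
    after = build_table([p for p in proxies if p != "proxy-c"])
    # Count the rows whose first choice changed after removing proxy-c.
    moved = sum(1 for b, a in zip(before, after) if b[0] != a[0])
    print(f"{moved} of {len(before)} rows changed their first choice")
```

Because each proxy's score for a row is independent of the other proxies, removing one proxy changes only the rows in which it appeared. Every other row keeps the same entries, which is the property that makes this ordering attractive for a table shared by many directors.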
TCP packets, upon arrival at the director, have the source IP hashed to generate a consistent index into the forwarding table. The packet is encapsulated inside another IP packet destined for the internal IP of the proxy server and sent over the network. The proxy server receives the encapsulated packet, de-encapsulates it, and processes the original packet locally. Outgoing packets use Direct Server Return, so packets going to the client egress directly to the client and bypass the director tier.
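The director's per-packet work described above can be sketched in a few lines of Python. This is a simplified model, not GLB's actual data path: packets are plain dictionaries rather than real IP datagrams, the hash is SHA-256, and the names `table_index`, `encapsulate`, and `decapsulate` as well as the sample addresses are hypothetical.

```python
import hashlib
import ipaddress


def table_index(src_ip: str, table_size: int) -> int:
    """Hash the client's source IP to a stable row in the forwarding table."""
    packed = ipaddress.ip_address(src_ip).packed
    digest = hashlib.sha256(packed).digest()
    return int.from_bytes(digest[:8], "big") % table_size


def encapsulate(packet: dict, table: list) -> dict:
    """Director side: wrap the original packet in an outer packet
    addressed to the internal IP of the chosen proxy."""
    proxy_ip = table[table_index(packet["src"], len(table))]
    return {"outer_dst": proxy_ip, "inner": packet}


def decapsulate(outer: dict) -> dict:
    """Proxy side: unwrap and recover the original packet unchanged."""
    return outer["inner"]


if __name__ == "__main__":
    # Hypothetical forwarding table: one internal proxy IP per row.
    table = ["10.0.0.11", "10.0.0.12", "10.0.0.13", "10.0.0.11"]
    packet = {"src": "203.0.113.7", "dst": "192.0.2.10", "payload": b"SYN"}
    outer = encapsulate(packet, table)
    print("forwarded to", outer["outer_dst"])
    print("proxy sees", decapsulate(outer))
```

Since the index depends only on the source IP and the table, the director keeps no per-connection state, and any director holding the same table sends a given client to the same proxy. The proxy sees the original packet with the client's address intact, which is what allows it to reply to the client directly under Direct Server Return.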
“We set out to design a new director tier that was stateless and allowed both director and proxy nodes to be gracefully removed from rotation without disruption to users wherever possible,” the engineers said. “Users live in countries with less than ideal internet connectivity, and it was important to us that long running clones of reasonably sized repositories would not fail during planned maintenance within a reasonable time limit.”