Load Balancing Explained
Load balancing distributes incoming network traffic across multiple servers to ensure no single server bears too much load. It is a fundamental building block of scalable, highly available web architectures — without it, a single server failure brings down the entire application.
Why Load Balancing Matters
A single server has finite capacity. As traffic grows, you scale horizontally by adding more servers. A load balancer sits in front of these servers and distributes requests among them. This provides three benefits: scalability (add servers to handle more traffic), availability (if one server fails, traffic is routed to healthy ones), and performance (spreading requests keeps any one server from becoming a bottleneck).
Layer 4 vs Layer 7
Load balancers operate at different network layers. Layer 4 (transport) load balancers route based on IP address and TCP/UDP port; they are fast but cannot inspect HTTP content. Layer 7 (application) load balancers understand HTTP, so they can route based on URL path, headers, cookies, and request content. L7 load balancers enable features like path-based routing (send /api/* to API servers, /static/* to CDN), TLS termination, and request manipulation. Use the IP Lookup tool to check IP addresses and understand network routing.
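To make path-based routing concrete, here is a minimal sketch of the decision an L7 load balancer makes for each request. The routing table, the pool names, and the choose_pool function are hypothetical and exist only for illustration; real load balancers express the same idea in their own configuration syntax.

```python
from typing import List, Tuple

# Hypothetical routing table: (path prefix, backend pool name).
# More specific prefixes come first; "/" is the catch-all.
ROUTES: List[Tuple[str, str]] = [
    ("/api/", "api-servers"),
    ("/static/", "static-servers"),
    ("/", "web-servers"),
]


def choose_pool(path: str) -> str:
    """Return the pool of the first prefix that matches the request path."""
    for prefix, pool in ROUTES:
        if path.startswith(prefix):
            return pool
    # Defensive default for paths that do not start with "/".
    return "web-servers"
```

An L4 load balancer could not make this decision, because the URL path only becomes visible once the HTTP request has been parsed.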
Load Balancing Algorithms
Round Robin: requests go to each server in order. Simple, works well when servers are identical. Weighted Round Robin: servers with more capacity get proportionally more requests. Least Connections: sends the request to the server with the fewest active connections — good when request processing time varies. IP Hash: hashes the client IP to always route the same client to the same server — useful for sticky sessions but reduces distribution uniformity.
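The four algorithms above can be sketched in a few lines each. This is a simplified illustration, not how any particular load balancer implements them: the function names are made up, the weighted variant simply repeats each server in the rotation (production implementations usually interleave more smoothly), and SHA-256 is an arbitrary choice of stable hash.

```python
import hashlib
import itertools
from typing import Dict, Iterator, List


def round_robin(servers: List[str]) -> Iterator[str]:
    """Yield servers in order, wrapping around forever."""
    return itertools.cycle(servers)


def weighted_round_robin(weights: Dict[str, int]) -> Iterator[str]:
    """Naive weighting: each server appears `weight` times per cycle."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


def least_connections(active: Dict[str, int]) -> str:
    """Pick the server with the fewest active connections."""
    return min(active, key=lambda name: active[name])


def ip_hash(client_ip: str, servers: List[str]) -> str:
    """Map a client IP to a server using a stable hash of the address."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]
```

Note that ip_hash depends on the length of the server list, so adding or removing a server remaps most clients. That is one reason IP Hash gives weaker guarantees than cookie-based stickiness.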
Health Checks
Load balancers periodically check whether backend servers are healthy. If a server fails its health check (e.g., does not respond to an HTTP GET on /health within 5 seconds), the load balancer stops sending traffic to it. When the server recovers, traffic resumes. Health checks are critical — without them, the load balancer sends requests to dead servers, resulting in errors for users.
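A health check like the one described can be sketched with the Python standard library. The /health path and 5-second timeout come from the example above; the function names and the idea of passing the probe as a parameter are assumptions made for illustration. Real load balancers typically also require several consecutive failures or successes before changing a server's state, which this sketch omits.

```python
import http.client
import urllib.error
import urllib.request
from typing import Callable, Iterable, List


def http_probe(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if GET <base_url>/health answers with a 2xx status in time."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError, ValueError, http.client.HTTPException):
        # Timeouts, refused connections, HTTP error statuses, and malformed
        # URLs all count as an unhealthy backend.
        return False


def healthy_backends(
    backends: Iterable[str],
    probe: Callable[[str], bool] = http_probe,
) -> List[str]:
    """Keep only the backends whose probe currently succeeds."""
    return [backend for backend in backends if probe(backend)]
```

In a running load balancer, healthy_backends would be re-evaluated on a timer, and the routing algorithm would choose only from its result.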
SSL/TLS Termination
Load balancers can terminate TLS connections: the client connects to the load balancer over HTTPS, and the load balancer connects to backend servers over plain HTTP (or re-encrypts to HTTPS). This offloads the CPU-intensive TLS handshake from application servers and centralizes certificate management. Use the Hash Generator to verify certificate file checksums when managing TLS certificates.
Session Persistence (Sticky Sessions)
Some applications store session state on the server (e.g., a shopping cart in memory). Sticky sessions ensure the same user always hits the same server, usually by setting a cookie with the server ID. However, sticky sessions reduce load distribution and make scaling harder: newly added servers only pick up new users, and when a server fails, the sessions pinned to it are lost. The better solution is to externalize session state to a shared store (such as Redis or a database) so any server can handle any request.
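The contrast between the two approaches can be sketched as follows. Everything here is hypothetical: the server IDs, the lb_server cookie name, and the SharedSessionStore class, which is an in-memory stand-in for an external store and does not talk to a real Redis instance or database.

```python
from typing import Dict, List

SERVERS: List[str] = ["app-1", "app-2", "app-3"]  # hypothetical backend IDs
COOKIE_NAME = "lb_server"  # hypothetical cookie name


def route_sticky(cookies: Dict[str, str], next_server: str) -> str:
    """Reuse the server named in the cookie if it is still known;
    otherwise pin the client to next_server by setting the cookie."""
    pinned = cookies.get(COOKIE_NAME)
    if pinned in SERVERS:
        return pinned
    cookies[COOKIE_NAME] = next_server
    return next_server


class SharedSessionStore:
    """In-memory stand-in for an external session store."""

    def __init__(self) -> None:
        self._data: Dict[str, Dict[str, object]] = {}

    def load(self, session_id: str) -> Dict[str, object]:
        """Return the session dict for session_id, creating it if missing."""
        return self._data.setdefault(session_id, {})
```

With a shared store, every server calls load with the session ID from the request, so the routing layer no longer needs route_sticky at all and can use any algorithm.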
Cloud Load Balancers
AWS offers ALB (L7), NLB (L4), and CLB (classic). GCP has HTTP(S) Load Balancer (L7, global) and Network Load Balancer (L4, regional). Azure has Application Gateway (L7) and Azure Load Balancer (L4). All integrate with auto-scaling groups and health checks. For most web applications, the L7 load balancer (ALB, GCP HTTP LB, Azure App Gateway) is the right choice. Compare cloud configurations with the JSON Formatter to validate IaC templates.
CDN and Global Load Balancing
A Content Delivery Network (CDN) is a form of global load balancing — it distributes content across edge locations worldwide and routes users to the nearest one. DNS-based global load balancing routes users to the nearest data center. Together, CDN + regional load balancers + auto-scaling provide a complete scalable architecture.
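As a rough illustration of routing users to the nearest data center, the sketch below picks the closest region by great-circle distance. The region names and coordinates are invented, and using geographic distance is itself an assumption: real DNS-based global load balancing usually works from the resolver's location or measured latency rather than the client's exact position.

```python
import math
from typing import Dict, Tuple

Coordinate = Tuple[float, float]  # (latitude, longitude) in degrees

# Hypothetical data centers with approximate coordinates.
DATA_CENTERS: Dict[str, Coordinate] = {
    "us-east": (39.0, -77.5),
    "eu-central": (50.1, 8.7),
    "ap-southeast": (1.35, 103.8),
}


def haversine_km(a: Coordinate, b: Coordinate) -> float:
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_data_center(client: Coordinate) -> str:
    """Return the name of the data center closest to the client."""
    return min(DATA_CENTERS, key=lambda name: haversine_km(client, DATA_CENTERS[name]))
```

Once a region is chosen, the regional load balancer in that data center takes over and applies one of the per-request algorithms described earlier.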