Load Balancing Explained

BY TOOLS.FUN  ·  MARCH 28, 2026  ·  6 min read

Load balancing distributes incoming network traffic across multiple servers to ensure no single server bears too much load. It is a fundamental building block of scalable, highly available web architectures — without it, a single server failure brings down the entire application.

Why Load Balancing Matters

A single server has finite capacity. As traffic grows, you scale horizontally by adding more servers. A load balancer sits in front of these servers and distributes requests among them. This provides three benefits: scalability (add servers to handle more traffic), availability (if one server fails, traffic is routed to healthy ones), and performance (requests are spread out so that no single server becomes a bottleneck and, depending on the algorithm, can be steered to the least-loaded server).

Layer 4 vs Layer 7

Load balancers operate at different network layers. Layer 4 (transport) load balancers route based on IP address and TCP/UDP port — they are fast but cannot inspect HTTP content. Layer 7 (application) load balancers understand HTTP — they can route based on URL path, headers, cookies, and request content. L7 load balancers enable features like path-based routing (send /api/* to API servers, /static/* to CDN), SSL termination, and request manipulation. Use the IP Lookup tool to check IP addresses and understand network routing.
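
To make path-based routing concrete, here is a minimal sketch of the decision an L7 load balancer makes for each request. The pool names and the routing table are hypothetical, chosen only to mirror the /api/* and /static/* example above; real load balancers express this in their own configuration language.

```python
# Hypothetical routing table: the first matching path prefix wins.
ROUTES = [
    ("/api/", "api-servers"),
    ("/static/", "cdn"),
]
DEFAULT_POOL = "web-servers"  # assumed fallback pool for everything else


def choose_pool(path: str) -> str:
    """Return the name of the backend pool that should serve this path."""
    for prefix, pool in ROUTES:
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL


print(choose_pool("/api/users"))       # api-servers
print(choose_pool("/static/app.css"))  # cdn
print(choose_pool("/index.html"))      # web-servers
```

An L4 load balancer cannot make this decision because it never parses the request line; it sees only addresses and ports.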

Key point: Use L7 load balancing for web applications — the ability to route based on HTTP content, terminate TLS, and add/modify headers is almost always worth the small additional latency. Use L4 only for non-HTTP protocols or extreme performance requirements.

Load Balancing Algorithms

Round Robin: requests go to each server in turn. Simple, and works well when servers are identical and requests cost roughly the same.

Weighted Round Robin: servers with more capacity get proportionally more requests.

Least Connections: sends the request to the server with the fewest active connections. Good when request processing time varies.

IP Hash: hashes the client IP to always route the same client to the same server. Useful for sticky sessions, but distribution becomes less uniform: many clients behind one shared IP address (for example, a corporate NAT) all land on the same server.
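
The four algorithms above can each be sketched in a few lines. This is an illustrative sketch, not how any particular load balancer implements them: the function names are made up for this example, the weighted variant uses naive list expansion, and the IP hash uses simple modulo hashing.

```python
import hashlib
import itertools


def round_robin(servers):
    """Return an iterator that yields servers in a repeating fixed order."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """Return an iterator that yields servers in proportion to integer weights.

    Naive expansion: a server with weight 3 appears three times per cycle.
    """
    expanded = [s for s, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


def least_connections(active):
    """Return the server with the fewest active connections."""
    return min(active, key=active.get)


def ip_hash(client_ip, servers):
    """Map a client IP to a server; the same IP always maps to the same one."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


rr = round_robin(["a", "b", "c"])
print([next(rr) for _ in range(4)])                  # ['a', 'b', 'c', 'a']
print(least_connections({"a": 5, "b": 2, "c": 7}))   # b
```

One design consequence worth noting: with modulo hashing, adding or removing a server changes the mapping for most clients, which is why sticky behaviour built on IP Hash is fragile during scaling events.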

Health Checks

Load balancers periodically check whether backend servers are healthy. If a server fails its health check (e.g., does not respond to an HTTP GET on /health within 5 seconds), the load balancer stops sending traffic to it. When the server recovers, traffic resumes. Health checks are critical — without them, the load balancer sends requests to dead servers, resulting in errors for users.
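
A health-check loop can be sketched as follows. The HealthChecker class and http_probe function are hypothetical names for this example, and the /health path and 5-second timeout are taken from the example above. Production load balancers typically also require several consecutive failures or successes before changing a server's state, which this sketch omits.

```python
import http.client
import urllib.request


def http_probe(base_url, timeout=5.0):
    """Return True if GET <base_url>/health answers 200 within the timeout."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection errors, timeouts, HTTP error statuses, malformed replies.
        return False


class HealthChecker:
    """Tracks which backends passed their most recent health check."""

    def __init__(self, servers, probe=http_probe):
        self.probe = probe
        self.healthy = {server: True for server in servers}

    def run_once(self):
        """Probe every server once and record the result."""
        for server in self.healthy:
            self.healthy[server] = self.probe(server)

    def healthy_servers(self):
        """Return the servers that are currently eligible for traffic."""
        return [server for server, ok in self.healthy.items() if ok]
```

A load balancer would call run_once on a timer and select backends only from healthy_servers(). Passing the probe in as a parameter keeps the checker testable without a network.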

SSL/TLS Termination

Load balancers can terminate TLS connections — the client connects to the load balancer over HTTPS, and the load balancer connects to backend servers over plain HTTP (or re-encrypts to HTTPS). This offloads the CPU-intensive TLS handshake from application servers and centralises certificate management. Use the Hash Generator to verify certificate file checksums when managing TLS certificates.
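
Once TLS is terminated, the backend sees a plain HTTP request arriving from the load balancer, so it can no longer tell the original scheme or client address on its own. A common convention is for the load balancer to add X-Forwarded-Proto and X-Forwarded-For headers. The sketch below shows that step only; the function name is hypothetical, and it assumes headers are stored in a dict with canonical capitalisation.

```python
def forwarded_headers(headers, client_ip, scheme="https"):
    """Return a copy of the request headers with forwarding info added."""
    out = dict(headers)
    prior = out.get("X-Forwarded-For")
    # Append to an existing chain if an upstream proxy already set the header.
    out["X-Forwarded-For"] = f"{prior}, {client_ip}" if prior else client_ip
    out["X-Forwarded-Proto"] = scheme
    return out


print(forwarded_headers({"Host": "example.com"}, "203.0.113.7"))
```

Backends should trust these headers only when the request really came from the load balancer, since a client can send them too.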

Key point: Terminate TLS at the load balancer to simplify certificate management and reduce backend server load. If your security policy requires traffic to be encrypted on every network hop, use TLS re-encryption: the load balancer decrypts, inspects, and re-encrypts before forwarding to backends. The load balancer still sees plaintext in this mode, so it is not end-to-end encryption in the strict sense.

Session Persistence (Sticky Sessions)

Some applications store session state on the server (e.g., shopping cart in memory). Sticky sessions ensure the same user always hits the same server, usually by setting a cookie with the server ID. However, sticky sessions reduce load distribution and make scaling harder. The better solution is to externalize session state to a shared store (Redis, database) so any server can handle any request.
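
The difference is easiest to see in code. In this sketch, a plain dict stands in for the shared store (Redis or a database in a real deployment) so that the example runs on its own; the SessionStore and AppServer classes are hypothetical.

```python
class SessionStore:
    """Stand-in for a shared store such as Redis; a dict keeps this runnable."""

    def __init__(self):
        self._data = {}

    def load(self, session_id):
        """Return a copy of the session, or an empty one if none exists."""
        return dict(self._data.get(session_id, {}))

    def save(self, session_id, session):
        self._data[session_id] = dict(session)


class AppServer:
    """An application server that keeps no session state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_to_cart(self, session_id, item):
        session = self.store.load(session_id)
        session["cart"] = session.get("cart", []) + [item]
        self.store.save(session_id, session)
        return session["cart"]


store = SessionStore()
server_a = AppServer("a", store)
server_b = AppServer("b", store)
server_a.add_to_cart("session-1", "book")
print(server_b.add_to_cart("session-1", "pen"))  # ['book', 'pen']
```

Because both servers read and write the same store, the second request sees the cart even though it landed on a different server, so no stickiness is needed.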

Cloud Load Balancers

AWS offers ALB (L7), NLB (L4), and CLB (classic, previous generation). GCP has HTTP(S) Load Balancer (L7, global; since renamed Application Load Balancer) and Network Load Balancer (L4, regional). Azure has Application Gateway (L7) and Azure Load Balancer (L4). All integrate with auto-scaling groups and health checks. For most web applications, the L7 load balancer (ALB, GCP HTTP LB, Azure App Gateway) is the right choice. Use the JSON Formatter to validate JSON-based IaC templates when comparing cloud configurations.

CDN and Global Load Balancing

A Content Delivery Network (CDN) is a form of global load balancing: it distributes content across edge locations worldwide and routes users to the nearest one. DNS-based global load balancing does the same for whole data centers, answering each DNS query with the address of the nearest one. Together, a CDN, regional load balancers, and auto-scaling provide a complete scalable architecture.
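
As a rough illustration of the "nearest data center" decision, the sketch below picks the healthy region with the lowest measured latency for a client. The region names and latency figures are invented for the example, and real global load balancers may use geography, routing topology, or capacity rather than this exact rule.

```python
def pick_region(latency_ms, healthy):
    """Return the healthy region with the lowest latency for this client."""
    candidates = {r: ms for r, ms in latency_ms.items() if r in healthy}
    if not candidates:
        raise RuntimeError("no healthy region available")
    return min(candidates, key=candidates.get)


# Hypothetical latency measurements from one client, in milliseconds.
latency = {"us-east": 20, "eu-west": 95, "ap-south": 210}

print(pick_region(latency, {"us-east", "eu-west", "ap-south"}))  # us-east
print(pick_region(latency, {"eu-west", "ap-south"}))             # eu-west
```

The second call shows failover: when the nearest region is marked unhealthy, the client is sent to the next-closest one instead.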

Key point: Load balancing is not just about distributing traffic — it is about building resilient systems. Combine load balancers with auto-scaling, health checks, and redundancy to create architectures that survive individual server, rack, and even data center failures.