A load balancer is a system that distributes incoming network traffic across multiple servers to prevent any single server from becoming overwhelmed. It sits between users and application servers, routing each request to an available server based on algorithms such as round-robin, least connections, or resource utilization.

Without load balancing, all traffic hits one server. As traffic grows, that server becomes a bottleneck and eventually a single point of failure. Load balancers enable horizontal scaling: add more servers, and the load balancer automatically distributes traffic across them. They also provide fault tolerance: if a server fails, the load balancer stops sending traffic to it and routes requests elsewhere.

Application load balancers (Layer 7) operate at the HTTP level, making routing decisions based on URL, headers, and content. They enable A/B testing by routing a percentage of traffic to different versions of an application. Network load balancers (Layer 4) operate at the TCP level, handling higher throughput with lower latency for non-HTTP traffic.

Health checks are fundamental to load balancer operation: the balancer continuously probes backend servers to confirm they are alive and responding correctly, and automatically removes unhealthy servers from rotation.
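The ideas above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the two classes show round-robin selection with health-check-driven removal, and least-connections selection. All names (`RoundRobinBalancer`, `mark_unhealthy`, and the server labels) are hypothetical, chosen for this example.

```python
class RoundRobinBalancer:
    """Sketch: round-robin rotation that skips unhealthy backends.

    A separate health checker (not shown) would call mark_unhealthy /
    mark_healthy based on periodic probes of each server.
    """

    def __init__(self, servers):
        self.servers = list(servers)   # all registered backends
        self.healthy = set(servers)    # backends currently in rotation
        self._i = 0                    # position of the next candidate

    def mark_unhealthy(self, server):
        # Called when a backend fails its health check.
        self.healthy.discard(server)

    def mark_healthy(self, server):
        # Called when a backend passes its health check again.
        self.healthy.add(server)

    def next_server(self):
        # Walk the rotation, skipping backends out of service.
        if not self.healthy:
            raise RuntimeError("no healthy backends")
        for _ in range(len(self.servers)):
            server = self.servers[self._i % len(self.servers)]
            self._i += 1
            if server in self.healthy:
                return server


class LeastConnectionsBalancer:
    """Sketch: route each request to the backend with the
    fewest active connections."""

    def __init__(self, servers):
        self.active = {s: 0 for s in servers}  # open connections per backend

    def acquire(self):
        # Pick the least-loaded backend and count the new connection.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Connection closed; free up capacity on that backend.
        self.active[server] -= 1
```

Round-robin spreads requests evenly regardless of load, which works well when requests are uniform in cost; least connections adapts better when some requests (long downloads, slow queries) hold connections open much longer than others.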