Traffic tamers: unveiling the load balancing algorithm arsenal - AMITAV ROY BLOG
    In this blog post, we embark on an in-depth exploration of load balancing algorithms, uncovering the intricacies and mechanics that drive their effectiveness.
    9 April, 2024


    Imagine your website is a bustling city during rush hour. Visitors flood in, eager for your content. But suddenly, traffic grinds to a halt. Servers overload, and frustrated users see dreaded error messages. Enter the heroes of high performance: load-balancing algorithms! These clever algorithms distribute traffic across multiple servers, ensuring a smooth user experience. But with several algorithms at your disposal, which is the right champion for your website's needs?

    So, when one server can no longer handle all incoming requests, we typically have two options:

    1. Vertical scaling - Boost the power of your existing server by adding more RAM and CPU.
    2. Horizontal scaling - Expand your infrastructure by adding more servers.

    This article focuses on horizontal scaling. When you add more servers, a crucial question arises: how do you efficiently distribute incoming traffic across them? With two or more servers, you need a system (a load balancing algorithm) to intelligently route requests and ensure optimal performance.

    The different kinds of Load-balancing algorithms

    The two main categories of load-balancing algorithms differ in their approach:

    Static algorithms: These have pre-defined rules that don't change. They're simpler and faster, but less efficient when dealing with servers with varying capacities or response times. They don't adapt to the current state of your infrastructure.

    Dynamic algorithms: These adjust routing behavior based on real-time server data like workload and response times. This flexibility makes them more efficient, but they add some complexity and may be slightly slower than static algorithms.

    The "Better" Choice Depends on Your Needs

    The distinction between static and dynamic algorithms lies in their inherent nature: one adheres to fixed rules, while the other adjusts based on varying factors. Naturally, when presented with choices, we want to know which option is better. Take speed, for instance.

    The answer isn't straightforward—it hinges on circumstances. Although static algorithms typically outpace dynamic ones due to their streamlined logic and lower overhead, they falter in scenarios where server specifications or individual request response times diverge. This deficiency arises because static load-balancing algorithms disregard the current infrastructure state.

    In contrast, dynamic load balancing algorithms analyze server statistics in real-time, adapting routing accordingly. Yet, this adaptability introduces complexity, hindering their speed relative to static counterparts.

    Common algorithms for Static and Dynamic

    So, let's dive into the different kinds of algorithms that fall under these two high-level categories and understand their pros and cons.

    For static load balancing algorithms, we have some examples:

    • Round robin
    • Weighted round robin
    • IP hash

    For dynamic load balancing algorithms, we have:

    • Least connections
    • Weighted least connection
    • Least response time

    Let's look at each of them in a little more detail.

    Round robin (RR)

    This is a common algorithm that distributes traffic evenly across all servers in a sequential manner. It's like taking turns for each server to handle a request.
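    As a rough illustration, here is a minimal round robin dispatcher in Python (the server names are hypothetical; a real load balancer would of course forward network traffic rather than return a name):

```python
from itertools import cycle

class RoundRobinBalancer:
    """Hands out servers in strict rotation, one per request."""

    def __init__(self, servers):
        # cycle() repeats the server list endlessly, in order.
        self._rotation = cycle(servers)

    def next_server(self):
        # Each call simply advances to the next server in line.
        return next(self._rotation)

balancer = RoundRobinBalancer(["server-a", "server-b", "server-c"])
assignments = [balancer.next_server() for _ in range(6)]
print(assignments)
# ['server-a', 'server-b', 'server-c', 'server-a', 'server-b', 'server-c']
```

    Notice that every server gets exactly the same share of requests, regardless of how powerful it is, which is precisely the strength and the weakness discussed below.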


    Pros:

    1. Simplicity: It is very easy to understand and implement. It follows a clear, sequential approach, making it ideal for basic load distribution needs.
    2. Fairness: RR ensures a fair distribution of traffic across all available servers. Each server gets its turn to handle a request, promoting a balanced workload.
    3. Predictable: RR doesn't rely on complex calculations, so it's predictable in its behaviour. You can easily estimate how much traffic each server will receive.
    4. Low overhead: The algorithm itself is lightweight and requires minimal resources, making it suitable for environments where processing power is a concern.


    Cons:

    1. Inefficiency for Unequal Servers: RR assumes all servers have equal capacity. If some servers are more powerful, they might be underutilised while weaker servers become overloaded. This can lead to bottlenecks and slowdowns.
    2. Sticky Sessions Issue: RR doesn't consider maintaining user sessions on the same server. If a user makes multiple requests that bounce between servers due to RR, it can disrupt applications that rely on session data.
    3. Limited Adaptability: RR doesn't react to changes in server health or workload. If a server becomes overloaded or unavailable, RR will still send it traffic until the next round, potentially causing issues.


    Based on the above points, we can summarise that RR is a good choice when:

    1. we have simple deployments with identical servers
    2. environments with predictable workloads
    3. situations where minimal configuration and resource usage are priorities

    However, when we have servers with varying capacities and fluctuating traffic patterns, we can use the next algorithm which is Weighted Round Robin (WRR).

    Weighted round robin (WRR)

    Weighted Round Robin (WRR) load balancing builds upon the foundation of Round Robin (RR) but offers a little more flexibility and efficiency. It allows us to assign weights to certain servers, which means we can divert more traffic to bigger servers, giving us more customisation. Now, like any algorithm, it has its pros and cons, so let's look at those as well.
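    A minimal Python sketch of the idea, with made-up server names and weights (note that this naive expansion sends a high-weight server's requests in bursts; production balancers such as NGINX use a smoother interleaving):

```python
from itertools import cycle

def build_wrr_schedule(weights):
    """Expand server weights into a repeating dispatch schedule.

    weights: dict mapping server name -> integer weight,
    where a higher weight means a larger share of traffic.
    """
    schedule = []
    for server, weight in weights.items():
        # A server with weight N appears N times per round.
        schedule.extend([server] * weight)
    return cycle(schedule)

# Hypothetical setup: "big" should take three times the traffic of "small".
rotation = build_wrr_schedule({"big": 3, "small": 1})
assignments = [next(rotation) for _ in range(8)]
print(assignments)
# "big" receives 6 of the 8 requests, "small" receives 2.
```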


    Pros:

    1. Improved Resource Utilisation: WRR assigns weights to servers based on their processing power, memory, or other capacity metrics. This allows you to direct more traffic to powerful servers and less to weaker ones, ensuring optimal utilisation of resources.
    2. Fairness with Heterogeneity: Unlike RR, WRR addresses the issue of unequal servers. By assigning higher weights to stronger servers, it ensures a fairer distribution of workload and avoids overloading weaker machines.
    3. Predictable Performance: While more complex than RR, WRR still follows a clear logic based on weight assignments. You can predict how much traffic each server will receive based on its weight, aiding in capacity planning.
    4. Maintains Simplicity: WRR builds upon the familiar concept of round robin, making it relatively easy to understand and implement compared to more complex dynamic algorithms.


    Cons:

    1. Weight Configuration Challenge: Setting appropriate weights requires knowledge of your server capabilities and traffic patterns. Inaccurate weight assignments can lead to inefficiencies, defeating the purpose of WRR.
    2. Limited Dynamic Response: WRR doesn't dynamically adjust weights based on real-time server health. If a server becomes overloaded but maintains its weight, it might still receive traffic exceeding its capacity.
    3. Sticky Sessions Consideration: Similar to RR, WRR doesn't inherently guarantee user sessions stay on the same server. This might cause issues for applications relying on session data.


    So, in summary we can say that WRR is a little more advanced compared to RR. It is good when:

    1. Environments with servers of varying capacities
    2. Scenarios where maximising resource utilisation is important (which is always the case)
    3. Setups with predictable traffic patterns where weight configuration can be optimised

    IP Hash

    IP hash load balancing uses a client's IP address as a key. It calculates a unique hash value from the client's IP address (sometimes combined with the server's IP) and based on the hash value, the client is directed to a specific server.
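    A rough Python sketch of the mapping (it uses a stable hash from hashlib, because Python's built-in hash() is randomised per process and would break the whole point of consistency):

```python
import hashlib

def pick_server(client_ip, servers):
    """Map a client IP to a server via a stable hash.

    The same IP always lands on the same server, as long as the
    server list itself does not change.
    """
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    # Convert the hex digest to an integer and wrap it onto the server list.
    return servers[int(digest, 16) % len(servers)]

servers = ["server-a", "server-b", "server-c"]
first = pick_server("203.0.113.42", servers)
second = pick_server("203.0.113.42", servers)
print(first == second)  # True: repeat visits map to the same server
```

    The modulo step also shows the scalability con mentioned below: adding or removing a server changes len(servers), remapping most clients at once.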

    Now let’s understand the pros and cons of this algorithm.


    Pros:

    1. Session Persistence: Clients connecting repeatedly are likely directed to the same server (assuming same IP), which is useful for maintaining sessions (shopping carts, logins).
    2. Simplicity: The concept is easy to understand and implement.


    Cons:

    1. Dynamic IP Issues: If clients have dynamic IPs (common with DHCP), they might bounce between servers, undermining the session persistence benefit.
    2. Uneven Distribution: With many clients sharing similar IP ranges (e.g., corporate network), the load might not be evenly distributed across all servers.
    3. Limited Scalability: Adding or removing servers requires recalculating hash ranges for clients, potentially disrupting existing connections.

    Now, all of the above are static load balancing algorithms, where the health of each resource and the load on it are not considered. Dynamic algorithms bring those factors into the routing decision. So, let's understand the different dynamic algorithms, how they work, and their pros and cons.

    Least connections

    Least Connections directs incoming traffic to the server with the fewest active connections at that moment. It aims to distribute workload based on current server busyness. Now let’s understand the pros and cons of this algorithm.
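    A minimal sketch of the selection step in Python, assuming the load balancer keeps its own count of open connections per server (the snapshot below is hypothetical):

```python
def pick_server(active_connections):
    """Return the server currently holding the fewest active connections.

    active_connections: dict mapping server name -> open connection count,
    as tracked by the load balancer's own bookkeeping.
    """
    return min(active_connections, key=active_connections.get)

# Hypothetical snapshot of the cluster at dispatch time.
snapshot = {"server-a": 12, "server-b": 4, "server-c": 9}
target = pick_server(snapshot)
print(target)  # server-b, since 4 is the lowest count
```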


    Pros:

    1. Fair Distribution: Ensures a balanced workload across servers, especially when connections have varying durations.
    2. Adaptability: Works well with fluctuating traffic patterns.


    Cons:

    1. Ignores Server Capacity: Doesn't consider server processing power or resource limitations. A server with fewer connections might be weaker.
    2. Long Connections Impact: Servers handling long-lived connections could appear less busy even if overloaded, leading to uneven distribution.


    So, we can see that this algorithm takes load into account:

    1. It considers active connections, so busy servers don't become bottlenecks
    2. It works well with unstable, unpredictable traffic patterns
    3. It is very good when connection times are short
    4. It is good with servers of similar specs

    However, this algorithm doesn't account for the fact that some servers may have higher specs and therefore more potential. And that's where our next algorithm, Weighted Least Connections, comes in.

    Weighted least connections

    Weighted Least Connections combines the least connections approach with server weights. It sends requests to the server with the fewest active connections relative to a weight assigned based on server capacity. More weight means handling more traffic.


    Pros:

    Efficiently distributes traffic based on both workload and capacity. Adapts to varying server loads.


    Cons:

    Requires pre-configuring server weights (can be challenging). More complex than Least Connections.


    It is an improved version of the previous algorithm because it allows us to assign weights. However, it also comes with its own set of issues.

    Least response time

    Sends requests to the server with the fastest response time and the fewest active connections, considering both server health and workload. Two metrics drive the decision: active connections and average response time.

    The load balancer analyses data collected from its servers to find the best server to take the request. It continuously polls each server to monitor it and gather this information.
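    As a rough sketch, the two metrics can be combined into a single score per server. The multiplicative scoring below is just one simple formulation, and the polling snapshot is entirely hypothetical; real load balancers use their own tuned variants:

```python
def pick_server(stats):
    """Score servers by average response time and active connections.

    stats: dict mapping server -> (avg_response_ms, active_connections),
    as gathered by the balancer's continuous polling. Lower score wins;
    the +1 keeps an idle server's score from collapsing to zero.
    """
    return min(stats, key=lambda s: stats[s][0] * (stats[s][1] + 1))

# Hypothetical polling snapshot.
snapshot = {
    "server-a": (120.0, 3),   # slower but lightly loaded: 120 * 4  = 480
    "server-b": (40.0, 10),   # fast but busy:              40 * 11 = 440
    "server-c": (80.0, 8),    # middle ground:              80 * 9  = 720
}
target = pick_server(snapshot)
print(target)  # server-b wins despite having the most connections
```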


    Pros:

    1. Improved User Experience: Directs traffic to the most responsive servers, leading to faster page loads and smoother interactions for users.
    2. Adaptability: Responds effectively to changes in server performance. As server response times fluctuate, the algorithm dynamically adjusts traffic distribution.


    Cons:

    1. Monitoring Overhead: Requires constant monitoring of server response times, which can add some overhead to the system.
    2. Short-Lived Requests: Might not be ideal for scenarios involving a high volume of short-lived requests with frequent fluctuations in response times. The overhead of monitoring response times might outweigh the benefits.


    In summary:

    1. Least Response Time prioritises user experience by directing traffic to the fastest-performing servers.
    2. It's particularly beneficial for applications sensitive to response times and in environments with dynamic server workloads.


    That wraps up our in-depth exploration of various load-balancing algorithms, their advantages, drawbacks, and suitable applications. In modern cloud solutions, we often have access to pre-built options, alleviating the need to delve into their intricate workings.

    However, when faced with unexpected challenges, understanding such details empowers us to debug effectively and make informed choices regarding our application's facets.

    I'm eager to hear your insights on load balancing and these algorithms. Feel free to connect with me on Twitter @amitavroy7.

    Image courtesy Andriy Babchiy on Unsplash


    Transforming ideas into impactful solutions, one project at a time. For me, software engineering isn't just about writing code; it's about building tools that make lives better.
