Why do we need consistent hashing when round robin can distribute traffic evenly?

If a load balancer can use the round-robin algorithm to distribute incoming requests evenly across the nodes, why do we need consistent hashing to distribute the load? What are the best scenarios for using consistent hashing versus round robin?

Chokeberry answered 13/10, 2019 at 8:46 Comment(0)

From this blog,

With traditional “modulo hashing”, you simply consider the request hash as a very large number. If you take that number modulo the number of available servers, you get the index of the server to use. It’s simple, and it works well as long as the list of servers is stable. But when servers are added or removed, a problem arises: the majority of requests will hash to a different server than they did before. If you have nine servers and you add a tenth, only one-tenth of requests will (by luck) hash to the same server as they did before.
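
To make that remapping concrete, here is a minimal Python sketch of modulo hashing (the key format and server counts are illustrative, not from the blog); it counts how many keys change servers when a tenth server is added:

    import hashlib

    def server_index(key: str, num_servers: int) -> int:
        """Map a request key to a server index via hash modulo N."""
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return h % num_servers

    keys = [f"request-{i}" for i in range(10_000)]
    moved = sum(server_index(k, 9) != server_index(k, 10) for k in keys)
    print(f"{moved / len(keys):.0%} of keys moved")  # roughly 90%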

Then there’s consistent hashing. Consistent hashing uses a more elaborate scheme, where each server is assigned multiple hash values based on its name or ID, and each request is assigned to the server with the “nearest” hash value. The benefit of this added complexity is that when a server is added or removed, most requests will map to the same server that they did before. So if you have nine servers and add a tenth, about 1/10 of requests will have hashes that fall near the newly-added server’s hashes, and the other 9/10 will have the same nearest server that they did before. Much better! So consistent hashing lets us add and remove servers without completely disturbing the set of cached items that each server holds.
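
And here is a minimal sketch of a consistent-hash ring with virtual nodes, i.e. the "multiple hash values per server" the quote describes (class and helper names are illustrative): each server gets many points on the ring, and a request goes to the nearest point clockwise.

    import bisect
    import hashlib

    def _hash(value: str) -> int:
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    class ConsistentHashRing:
        def __init__(self, servers, vnodes=100):
            # Each server contributes `vnodes` points on the ring.
            self._ring = sorted(
                (_hash(f"{server}#{i}"), server)
                for server in servers
                for i in range(vnodes)
            )
            self._points = [h for h, _ in self._ring]

        def get_server(self, request_key: str) -> str:
            """Walk clockwise to the nearest server point after the key."""
            i = bisect.bisect(self._points, _hash(request_key)) % len(self._ring)
            return self._ring[i][1]

    ring9 = ConsistentHashRing([f"server-{n}" for n in range(9)])
    ring10 = ConsistentHashRing([f"server-{n}" for n in range(10)])
    keys = [f"request-{i}" for i in range(10_000)]
    moved = sum(ring9.get_server(k) != ring10.get_server(k) for k in keys)
    print(f"{moved / len(keys):.0%} of keys moved")  # roughly 10%

Running both sketches side by side shows the difference the quote describes: roughly 90% of keys move under modulo hashing versus roughly 10% under consistent hashing.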

In short, the round-robin algorithm suits the scenario where the list of servers is stable and the LB traffic is essentially random. Consistent hashing suits the scenario where backend servers need to scale out or in, because most requests will keep mapping to the same server they did before, while still achieving well-distributed uniformity.

Interior answered 14/10, 2019 at 8:50 Comment(0)

Let's say we want to maintain user sessions on servers, so we want all requests from a user to go to the same server. Round robin won't help here, as it blindly forwards requests in a circular fashion among the available servers.

To achieve a 1:1 mapping between a user and a server, we need a hash-based load balancer. Consistent hashing builds on this idea and also elegantly handles the cases where we want to add or remove servers.
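
As a rough illustration of the 1:1 idea (a minimal sketch with made-up server names; plain hash-modulo is shown here for brevity, while consistent hashing additionally keeps most mappings stable when servers change):

    import hashlib

    SERVERS = ["app-1", "app-2", "app-3"]

    def pick_server(user_id: str) -> str:
        """Every request from the same user hashes to the same server."""
        h = int(hashlib.sha256(user_id.encode()).hexdigest(), 16)
        return SERVERS[h % len(SERVERS)]

    assert pick_server("alice") == pick_server("alice")  # sticky by construction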

References: check out Gaurav Sen's videos below for further explanation. https://www.youtube.com/watch?v=K0Ta65OqQkY https://www.youtube.com/watch?v=zaRkONvyGr8

Datum answered 18/3, 2020 at 7:46 Comment(3)
So... I think consistent hashing is useful between an API server and sharded DB servers. Am I right?Alberich
@Alberich I would say that is one of the use cases where consistent hashing can be advantageous. However, its scope goes beyond that: it also covers settings where routing decisions must be made not just on server availability but also on a node's current position on the ring and its current load, so that a failed node in the cluster doesn't overwhelm the nodes around it, and likewise when a node gets re-spawned.Teodoor
If we combine round robin with a store that maps user sessions to servers, we can first check whether a mapping to a server already exists for a particular user; if not, we perform the round-robin algorithm to allocate one. This also achieves a 1:1 mapping. What are the limitations of this approach besides the storage overhead?Syrup
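
For concreteness, a minimal sketch of the lookup-then-round-robin approach described in the preceding comment; the in-memory dict is a stand-in for whatever session store you would actually use:

    from itertools import cycle

    SERVERS = ["app-1", "app-2", "app-3"]
    _rr = cycle(SERVERS)
    _session_map: dict[str, str] = {}

    def route(user_id: str) -> str:
        """Reuse an existing user->server mapping, else assign round-robin."""
        if user_id not in _session_map:
            _session_map[user_id] = next(_rr)
        return _session_map[user_id]

One non-storage cost worth noting: the mapping itself has to stay consistent across load-balancer instances and be pruned when servers leave, bookkeeping that pure hashing approaches avoid.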

For completeness, I want to point out one other important feature of Consistent Hashing that hasn't yet been mentioned: DOS mitigation.

If a load balancer is getting spammed with requests (whether from too many customers, an attack, or a haywire local service), a round-robin approach will spread the request spam evenly across all upstream services. Even spread out, this load might be too much for each service to handle. So what happens? Your load balancer, in trying to be helpful, has brought down your entire system.

If you use a modulo or consistent hashing approach, then only a small subset of services will be DOS'd by the barrage.

Being able to "limit the blast radius" in this manner is a critical feature of production systems.

Picaroon answered 18/5, 2021 at 0:43 Comment(0)

Consistent hashing fits stateful systems well (systems where the context of previous requests is required to serve the current one). In a stateful system, if the previous and current requests land on different servers, that context is lost and the system won't be able to fulfil the request. With consistent hashing, we can route all requests from a particular user to the same server. Round robin cannot achieve this; it is a good fit for stateless systems.

Merbromin answered 1/9, 2022 at 17:28 Comment(1)
