How does consistent hashing work?
Asked Answered
D

5

11

I am trying to understand how consistent hashing works. This is the article I am trying to follow, but I'm not able to follow it. To start with, my questions are:

  1. I understand that servers are mapped into ranges of hashcodes, so the data distribution is more fixed and lookup becomes easy. But how does this deal with the problem of a new node being added to the cluster?

  2. The sample Java code is not working. Any suggestions for a simple Java-based consistent hashing implementation?

Update

  1. Any alternatives to consistent hashing?
Desquamate answered 29/9, 2010 at 2:44 Comment(2)
Archive of dead link: web.archive.org/web/20100724081532/http://weblogs.java.net/blog/…Reefer
youtube.com/watch?v=zaRkONvyGr8Piled
H
5

I will answer the first part of your question. First of all, there are some errors in that code, so I would look for a better example.

Using a cache server as the example here.

When you think about consistent hashing, you should think of it as a circular ring, as the article you linked to does. When a new server is added, it will have no data on it to start with. When a client fetches data that should be on that server and does not find it, a cache miss will occur. The program should then fill in the data on the new node, so future requests will be a cache hit. And that is about it, from a caching point of view.
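
To make that flow concrete, here is a rough Java sketch of the cache-aside idea described above. It is only an illustration: the ring lookup is passed in as a function, and the per-node caches and backing store are stubbed out with plain maps.

import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Sketch of the cache-aside flow: ask the ring which node owns the key, try
// that node's cache, and on a miss fall back to the source of truth and warm
// the (possibly brand-new) node.
class CacheAsideClient {
    private final Function<String, String> serverForKey;        // consistent-hash ring lookup
    private final Map<String, Map<String, String>> nodeCaches;  // one cache per server (stub)
    private final Map<String, String> backingStore;             // source of truth (stub)

    CacheAsideClient(Function<String, String> serverForKey,
                     Map<String, Map<String, String>> nodeCaches,
                     Map<String, String> backingStore) {
        this.serverForKey = serverForKey;
        this.nodeCaches = nodeCaches;
        this.backingStore = backingStore;
    }

    String get(String key) {
        String server = serverForKey.apply(key);                // node chosen by the ring
        Map<String, String> cache =
                nodeCaches.computeIfAbsent(server, s -> new HashMap<>());
        String value = cache.get(key);
        if (value == null) {                                    // miss on a new/empty node
            value = backingStore.get(key);                      // fetch from the source of truth
            if (value != null) {
                cache.put(key, value);                          // future requests are cache hits
            }
        }
        return value;
    }
}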

Homogony answered 29/9, 2010 at 3:1 Comment(0)
D
7

For a Python implementation, refer to my GitHub repo.

Simplest Explanation

What is normal hashing?

Let's say we have to store the following key-value pairs in a distributed memory store like Redis.

[image: table of the sample ids/keys and their values]

Let's say we have a hash function f(id) that takes the above ids and creates hashes from them. Assume we have 3 servers (s1, s2 and s3).

We can take the hash modulo the number of servers, i.e. 3, to map each key to a server, and we are left with the following.

[image: table mapping each key's hash (mod 3) to one of the 3 servers]

We could retrieve the value for a key by a simple lookup using f(). Say for the key Jackson: f("Jackson") % (no. of servers) => 1211 % 3 = 2 (node-2).

This looks perfect, yeah? Close, but no cigar!

But what if a server, say node-1, goes down? Applying the same formula, i.e. f(id) % (no. of servers), for the user Jackson gives 1211 % 2 = 1, i.e. we get node-1, whereas the key was actually hashed to node-2 in the table above.

We could remap the keys here, but what if we have a billion keys? In that case we would have to remap a huge number of keys, which is tedious :(

This is a major flaw in the traditional hashing technique.
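
To see the problem in code, here is a tiny Java sketch using the made-up hash value 1211 for Jackson from the example above (not a real hash function):

public class ModHashingDemo {
    public static void main(String[] args) {
        int jacksonHash = 1211;                      // the made-up f("Jackson") from above

        int withThreeServers = jacksonHash % 3;      // 1211 % 3 = 2 -> node-2
        int withTwoServers   = jacksonHash % 2;      // 1211 % 2 = 1 -> node-1

        System.out.println("3 servers: node-" + withThreeServers);
        System.out.println("2 servers: node-" + withTwoServers);
        // The same key now maps to a different node, so after losing a server
        // almost every key has to be remapped (and its cache entry rebuilt).
    }
}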

What is Consistent Hashing?

In consistent hashing, we visualize the list of all nodes as a circular ring (basically, a sorted array).

[image: consistent hashing ring]

start func
For each node:
  Compute f(node), where f is the hash function
  Append f(node) to a sorted array
For any key:
  Compute the hash f(key)
  Find the first f(node) > f(key) (wrapping to the start of the array if there is none)
  Map the key to that node
end func

For example, if we have to hash the key smith, we compute its hash value 1123, then find the first node having a hash value > 1123, i.e. node-3 with hash value 1500.

Now, what if we lose a server, say node-2? All its keys can be mapped to the next server, node-3 :) Yeah, we only have to remap the keys of node-2.

[image: keys of node-2 remapped to node-3 on the ring]
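
Since the original question also asked for a simple Java version, here is a minimal Java sketch of the ring lookup described by the pseudocode above. A TreeMap plays the role of the sorted array; the class name and the toy hash function are just for illustration.

import java.util.Map;
import java.util.TreeMap;

// A TreeMap plays the role of the sorted array of node hashes;
// ceilingEntry() finds the first f(node) >= f(key).
class ConsistentHashRing {
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    void addNode(String node) {
        ring.put(hash(node), node);                 // append f(node) to the "sorted array"
    }

    void removeNode(String node) {
        ring.remove(hash(node));                    // only this node's keys need remapping
    }

    String nodeFor(String key) {
        if (ring.isEmpty()) throw new IllegalStateException("no nodes in the ring");
        Map.Entry<Integer, String> next = ring.ceilingEntry(hash(key));
        // Wrap around the circle if the key hashes past the last node
        return (next != null ? next : ring.firstEntry()).getValue();
    }

    private static int hash(String s) {
        return Math.abs(s.hashCode() % 10_000);     // stand-in for a real hash (MD5, Murmur, ...)
    }
}

With this structure, adding or removing a node only moves the keys that hash between that node and its neighbour on the ring, which is exactly the saving described above.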

Dia answered 23/10, 2019 at 4:30 Comment(2)
In the consistent hashing you say that when a node goes down all its keys are mapped to another node. But what about the data the keys refer to? Does this mean that every node has all the data?Leveille
Consistent hashing can be used for different purposes involving distributed key-value databases, load balancing, etc. The specific use case we are referring to here is a distributed cache database. Other nodes won't have the data; it will be a cache miss in that case. What consistent hashing solves is that, instead of remapping all the keys, we only have to remap the keys that belong to the server that went down. It doesn't implement any kind of high availability for the keys; that is provided by setting up a replication factor.Dia
M
2

Overview

I wrote a blog post about how consistent hashing works; to answer the original questions here, below is a quick summary.

Consistent hashing is most commonly used for data partitioning, and we usually see it in components like

  • Load balancer
  • API gateway
  • ...

To answer the questions, the sections below will cover

  1. How it works
  2. How to add/find/remove server nodes
  3. How to implement it
  4. Any alternatives to consistent hashing?

Let's use the load balancer as a simple example: the load balancer maps 2 origin nodes (servers behind the load balancer) and the incoming requests onto the same hash ring circle (let's say the hash ring range is [0, 255]).

Initial State

For the server nodes, we have a table:

[image: table of the 2 server nodes and their hashcodes]

Find Node

For any incoming request, we apply the same hash function. Assume we get a request with hashcode = 120; from the table, we find the next closest node in clockwise order, so node-2 is the target node in this case.

Similarly, what if we get a request with hashcode = 220? Because the hash ring is a circle, we wrap around and return the first node.

Add Node

Now let's add one more node into the cluster: node-3 (hashcode 150). Our table is then updated to:

[image: updated table with node-3 (hashcode 150) added]

Then we use the same algorithm as in the Find Node section to find the next nearest node. The request with hashcode = 120 will now be routed to node-3.

Remove Node

Removal is also straightforward: just remove the entry <node, hashcode> from the table. Let's say we remove node-1; the table is then updated to:

[image: table after node-1 is removed]

Then all the requests with:

  • Hashcode in [201, 255] and [0, 150] will be routed to node-3
  • Hashcode in [151, 200] will be routed to node-2

Implementation

Below is a simple C++ version (with virtual nodes enabled), which is quite similar to Java.

#include <functional>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

#define HASH_RING_SZ 256

struct Node {
  int id_;
  int repl_;
  int hashCode_;
  Node(int id, int replica) : id_(id), repl_(replica) {
    hashCode_ =
      hash<string>{}(to_string(id_) + "_" + to_string(repl_)) % HASH_RING_SZ;
  }
};

class ConsistentHashing {
private:
  unordered_map<int/*nodeId*/, vector<Node*>> id2node;
  map<int/*hash code*/, Node*> ring;

public:
  ConsistentHashing() {}

  // Allow dynamically assigning the number of replicas (virtual nodes) per node
  // O(Rlog(RN)) time
  void addNode(int id, int replica) {
    for (int i = 0; i < replica; i++) {
      // Each virtual node gets a distinct replica index so it hashes
      // to a different position on the ring
      Node* repl = new Node(id, i);

      id2node[id].push_back(repl);
      ring.insert({repl->hashCode_, repl});
    }
  }

  // O(Rlog(RN)) time
  void removeNode(int id) {
    auto repls = id2node[id];
    if (repls.empty()) return;

    // Erase every virtual node belonging to this physical node
    for (auto* repl : repls) {
      ring.erase(repl->hashCode_);
      delete repl;
    }

    id2node.erase(id);
  }

  // Return the nodeId that owns the key
  // O(log(RN)) time
  int findNode(const string& key) {
    int h = hash<string>{}(key) % HASH_RING_SZ;

    // First virtual node clockwise from the key's position;
    // wrap around to the beginning of the ring if necessary
    auto it = ring.lower_bound(h);
    if (it == ring.end()) it = ring.begin();

    return it->second->id_;
  }
};

Alternatives

If I understand the question correctly, it is asking about alternatives to consistent hashing for data partitioning. There are a lot, actually; depending on the actual use case we may choose different approaches, like:

  • Random
  • Round-robin
  • Weighted round-robin
  • Mod-hashing
  • Consistent-hash
  • Weighted(Dynamic) consistent-hashing
  • Range-based
  • List-based
  • Dictionary-based
  • ...

And specifically in the load balancing domain, there are also some approaches like:

  • Least Connection
  • Least Response Time
  • Least Bandwidth
  • ...

All the above approaches have their own pros and cons; there is no single best solution, and we have to make the trade-off accordingly.

Last

The above is just a quick summary of the original questions. For further reading, there are topics like:

  • Consistent-hashing unbalancing issue
  • Snow crash issue
  • Virtual node concept
  • How to tweak the replica number

I've covered them in the blog post already.

Have fun :)

Metrics answered 20/7, 2020 at 22:13 Comment(0)
I
1

I'll answer the first part of your question: how does consistent hashing actually work? As we know, in a client-server model there is a load balancer that routes requests to different servers depending on the traffic of network requests to the servers.

So the purpose of simple hashing is to assign a number to each incoming client request and take it modulo the number of servers we have. The remainder we get tells us which server the request is assigned to.

The consistent hashing strategy uses a hash function to position clients and servers on a circular path. Requests are routed in the clockwise direction: a client's request will be accepted by the server that comes first along the path.

What if one of our servers dies?

Earlier, with the simple hashing strategy, we would need to redo the calculation and route requests according to the new remainders, and we would lose most of our cache hits.

With the consistent hashing strategy, if any server dies, the client's requests simply move to the next server along the path in the same clockwise direction. That means the other servers are not affected, and cache hits and consistency are maintained.

Iotacism answered 8/12, 2021 at 6:38 Comment(0)
B
0

You say that...

I understand, servers are mapped into ranges of hashcodes and the data distribution is more fixed

... but that is not how consistent hashing works.

In fact, the opposite: consistent hashing's physical_node:virtual_node mapping is dynamically random while still being "evenly" distributed.
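
To see what that "evenly" distributed part means in practice, here is a rough, self-contained Java sketch (my own illustration, not taken from any particular implementation): each physical server is hashed onto the ring many times as virtual nodes, and we count where a batch of random keys lands.

import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.UUID;

public class VirtualNodeDemo {
    // Stand-in hash; a real implementation would use something like MD5 or Murmur
    static int hash(String s) {
        return Math.abs(s.hashCode() % 1_000_000);
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> ring = new TreeMap<>();
        int virtualNodesPerServer = 100;

        // Each physical server is placed on the ring many times (virtual nodes),
        // at positions that are effectively random
        for (String server : new String[]{"s1", "s2", "s3"}) {
            for (int i = 0; i < virtualNodesPerServer; i++) {
                ring.put(hash(server + "#" + i), server);
            }
        }

        // Route 10,000 random keys and count how many land on each server;
        // the counts should come out roughly even (more virtual nodes -> more even)
        Map<String, Integer> counts = new HashMap<>();
        for (int k = 0; k < 10_000; k++) {
            Map.Entry<Integer, String> owner =
                    ring.ceilingEntry(hash(UUID.randomUUID().toString()));
            if (owner == null) owner = ring.firstEntry();   // wrap around the circle
            counts.merge(owner.getValue(), 1, Integer::sum);
        }
        System.out.println(counts);
    }
}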

I answer in detail here how this randomness is implemented.

Give that a read, and make sure that you understand how it all fits together. Once you have the mental model, the Java blog article you linked to should conceptually make much more sense:

It would be nice if, when a cache machine was added, it took its fair share of objects from all the other cache machines. Equally, when a cache machine was removed, it would be nice if its objects were shared between the remaining machines. This is exactly what consistent hashing does

Brioche answered 24/6, 2022 at 19:23 Comment(0)
