Consistent hashing, why are Vnodes a thing?
My understanding of consistent hashing is that you take a key space, hash the key, and then mod by, say, 360, placing the values on a ring. Then you space nodes equally on that ring. You pick the node to handle a key by looking clockwise from where the hashed key landed.
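
For concreteness, a minimal Python sketch of that basic scheme as I've described it (the 360-position ring and the clockwise scan are from above; the node names and the choice of md5 are purely illustrative):

import hashlib

RING_SIZE = 360

def ring_position(value):
    # Hash the value, then mod by the ring size.
    return int(hashlib.md5(value.encode()).hexdigest(), 16) % RING_SIZE

# Three nodes spaced equally around the ring.
nodes = {0: 'node-a', 120: 'node-b', 240: 'node-c'}

def node_for_key(key):
    # Walk clockwise from the key's position to the first node we meet.
    pos = ring_position(key)
    for step in range(RING_SIZE):
        spot = (pos + step) % RING_SIZE
        if spot in nodes:
            return nodes[spot]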

Then many explanations go on to describe Vnodes. The Riak docs, which refer to the Dynamo paper, say:

The basic consistent hashing algorithm presents some challenges. First, the random position assignment of each node on the ring leads to non-uniform data and load distribution.

Then they go on to propose Vnodes as a way to ensure uniform distribution of the input key space around the ring. The gist, as I understand it, is that Vnodes divide the ring into many more ranges than you have machines. So if you have 10 machines you might have 100 Vnodes, and an individual machine's Vnodes would be scattered randomly around the ring.

Now my question is: why is this extra Vnode step required? Hash functions are supposed to provide a uniform distribution of their output, so it would seem this is unneeded. According to this answer, even the modulo of a hash function is still uniformly distributed.

Highspirited asked 4/11, 2021 at 15:16 Comment(3)
From the linked article: "Using virtual nodes has the following advantages: If a node becomes unavailable (due to failures or routine maintenance), the load handled by this node is evenly dispersed across the remaining available nodes. When a node becomes available again, or a new node is added to the system, the newly available node accepts a roughly equivalent amount of load from each of the other available nodes."Jason
(whereas without virtual nodes, a node becoming available/unavailable would disproportionately impact its two "neighbors" on the ring instead of spreading the restructuring-pain more evenly around the circle)Jason
I wanted to focus on the bit I highlighted in the question if possibleHighspirited
Imo, the missing piece of key information in most explanations of consistent hashing is the detail about "multiple hash functions."

For whatever reason, most "consistent hashing for dummies" articles gloss over the implementation detail that makes the virtual nodes work with random distribution.

Before talking about how it does work, let me clarify the above with an example of how it does not work.

How it does not work

A naive implementation of vnodes looks like this:

[image: naive vnodes (source)]

This is naive because you'll notice that, for example, the green vnode always precedes the blue vnode. This means that if the green vnode goes offline, its range is absorbed solely by the blue vnode, which defeats the entire purpose of moving from single-token nodes to distributed virtual nodes.

The article briefly mentions that, in practice, Vnodes are randomly distributed across the cluster. It then shows a separate picture indicating this, but without explaining the mechanics of how that randomness is achieved.

How it does work

Random distribution of vnodes is achieved through the use of multiple, unique hash functions. Those functions are where the randomness comes from.

This makes the implementation look something roughly like this:

A) Ring Formation

  1. You have a ring consisting of n physical nodes via physical_nodes = ['192.168.1.1', '192.168.1.2', '192.168.1.3', '192.168.1.4']; (think of this as B/R/P/G in the prior picture's left-side)
  2. You decide to distribute each physical node into k "virtual slices," i.e. a single physical node is sliced into k pieces
    1. In this example, we use k = 4, but in practice we should use k ≈ log2(num_items) to obtain reasonably balanced loads, where num_items is the total number of items (key-value pairs) stored in the entire datastore
  3. This means that num_virtual_nodes == n * k; (this corresponds to the 16 pieces in the prior picture's right-side)
  4. Assign a unique hashing algo for each of the k slices via hash_funcs = [hashlib.md5, hashlib.sha1, hashlib.sha256, hashlib.blake2b]
    1. (You can also use a single function that is recursively called k times)
  5. Divvy up the ring by the following (each digest is truncated to a common ring space so the positions are comparable):
import hashlib

physical_nodes = ['192.168.1.1', '192.168.1.2', '192.168.1.3', '192.168.1.4']
hash_funcs = [hashlib.md5, hashlib.sha1, hashlib.sha256, hashlib.blake2b]  # k = 4
RING_SPACE = 2 ** 128  # common ring space for every digest

virtual_physical_map = {}
virtual_node_ids = []
for hash_func in hash_funcs:
    for physical_node in physical_nodes:
        # Each hash function drops this physical node at a new pseudo-random spot.
        virtual_hash = int(hash_func(physical_node.encode()).hexdigest(), 16) % RING_SPACE
        virtual_node_ids.append(virtual_hash)
        virtual_physical_map[virtual_hash] = physical_node
virtual_node_ids.sort()

You now have a ring composed of n * k virtual nodes, scattered pseudo-randomly around the ring and mapped back to the n physical nodes.

B) Partition Request Flow

  1. A partition-request is made with a provided key_tuple to key off of
  2. The key_tuple is hashed to get key_hash
  3. Find the next clockwise node via virtual_node = binary_search(key_hash, virtual_node_ids)
  4. Look up the real node via physical_id = virtual_physical_map[virtual_node] (see the Python sketch after this list)
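
A rough Python sketch of steps 2 through 4, assuming the virtual_node_ids and virtual_physical_map built in part A (bisect plays the role of binary_search, md5 is just an example key hash, and the modulo wraps around past the highest virtual node):

import bisect
import hashlib

RING_SPACE = 2 ** 128  # must match the truncation used when forming the ring

def route(key_tuple, virtual_node_ids, virtual_physical_map):
    # Step 2: hash the key onto the ring.
    key_hash = int(hashlib.md5(repr(key_tuple).encode()).hexdigest(), 16) % RING_SPACE
    # Step 3: binary-search for the next clockwise virtual node, wrapping around.
    index = bisect.bisect_right(virtual_node_ids, key_hash) % len(virtual_node_ids)
    # Step 4: map the chosen virtual node back to its physical node.
    return virtual_physical_map[virtual_node_ids[index]]

For example, route(('user', 42), virtual_node_ids, virtual_physical_map) returns one of the four IP addresses from the ring-formation step.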

Page 6 of this Stanford Lecture was very helpful to me in understanding this.

The net effect is that the distribution of vnodes across the ring looks like this:

[image: distributed vnodes (source)]

Verda answered 23/6, 2022 at 23:59 Comment(2)
This is probably one of the best explanations!Lum
Thank you for elaboration, but what does "item" in num_items refer to?Leventhal
First, the random position assignment of each node on the ring leads to non-uniform data and load distribution.

Good hash functions provide uniform distribution, but the inputs also have to be sufficiently numerous for the outputs to appear spread out. The keys are numerous, but the servers aren't. So a million keys hashed and modulo'd by 360 will be evenly distributed around the ring, but if you only use, say, 3 servers S1 through S3 to hold the key-value pairs, there is no guarantee that they hash (with the same hash function used for the keys) uniformly onto the ring at positions 0, 120, and 240. S1 might hash to 10, S2 to 12, and S3 to 50. S2 would then hold far fewer KV pairs than the other two. By having virtual servers, you increase the chances of the positions being spread uniformly around the ring.
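
A quick simulation makes this concrete. The sketch below is illustrative only (md5, the large ring space, and 50 vnodes per server are arbitrary choices): it computes the fraction of the ring each server owns, first with one position per server, then with 50 virtual positions per server:

import hashlib

RING = 2 ** 32  # a large ring space, so vnode positions rarely collide

def pos(name):
    return int(hashlib.md5(name.encode()).hexdigest(), 16) % RING

def ownership(points):
    # points: (ring_position, server) pairs. A server's position owns the arc
    # from its counter-clockwise predecessor up to itself.
    spots = sorted(points)
    owned = {}
    for i, (p, server) in enumerate(spots):
        arc = (p - spots[i - 1][0]) % RING  # i - 1 wraps to the last point
        owned[server] = owned.get(server, 0.0) + arc / RING
    return owned

servers = ['S1', 'S2', 'S3']
# One position per server: ownership is typically badly skewed.
print(ownership([(pos(s), s) for s in servers]))
# 50 vnodes per server: each server ends up much closer to 1/3.
print(ownership([(pos(f'{s}/vnode{i}'), s) for s in servers for i in range(50)]))

On a typical run the single-position split is lopsided, while the vnode split comes out near one third per server.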

The other benefit is the even re-distribution of keys when a server is added or removed as mentioned in the doc.

Emarginate answered 8/11, 2021 at 15:49 Comment(2)
@LukeDeFeo Makes sense?Emarginate
+1 because you also focused on the first part which is typically neglected, that is, the number of servers is typically much less than the number of records. Servers may be assigned with a disproportionate number of records, regardless of whether data gets rebalanced or not.Dayle
