Difference between replicas and virtual nodes in consistent hashing

This is perhaps specific to an implementation that I'm looking at (node-hashring), but what is the difference between virtual nodes (vnodes) and replicas in a consistent hash ring?

The original Akamai paper does not seem to describe vnodes explicitly, and various other sources seem to use the two terms interchangeably (e.g. "virtual nodes", which are replicas of cache points in the circle, from source).

The docs for node-hashring give the example 40 hashes (vnodes) and 4 replicas per hash = 160 points per server. Despite reading the source, I can't quite figure out what these two different parameters do.

Demonism answered 16/11, 2016 at 1:13 Comment(0)

Vnodes are different from replicas. Vnodes are just extra labels (positions) given to one physical node on the consistent hash ring so that data is distributed more evenly. A replica, by contrast, is a copy of a node's data stored on adjacent servers, which comes into play when that node goes down or is removed from the ring. For example, if node1 has 40 virtual nodes, then all data whose hash values fall in the ranges of those vnodes is stored and served by node1. Separately, node1 can have 4 replicas, which means 4 adjacent servers store a copy of node1's data and serve it when node1 goes down.
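To make the distinction concrete, here is a minimal sketch in Python. It is not node-hashring's code, and the names (`Ring`, `owners`, the `"server#i"` label format, the choice of MD5) are assumptions made up for illustration. `vnodes` controls how many points each server gets on the ring; `replicas` controls how many additional distinct servers, walking clockwise, hold a copy of a key.

```python
import bisect
import hashlib


def ring_hash(label: str) -> int:
    """Map a string to a 32-bit position on the ring (MD5 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.md5(label.encode()).digest()[:4], "big")


class Ring:
    def __init__(self, servers, vnodes=40):
        # Each physical server is placed on the ring `vnodes` times under
        # different labels; this only affects how evenly keys are spread.
        self.points = sorted(
            (ring_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self.hashes = [position for position, _ in self.points]

    def owners(self, key, replicas=0):
        """Return [primary, replica1, ...]: the first 1 + replicas distinct
        servers found walking clockwise from the key's position."""
        start = bisect.bisect_right(self.hashes, ring_hash(key))
        found = []
        for step in range(len(self.points)):
            server = self.points[(start + step) % len(self.points)][1]
            if server not in found:
                found.append(server)
                if len(found) == 1 + replicas:
                    break
        return found

    def primary(self, key):
        return self.owners(key)[0]


ring = Ring(["node1", "node2", "node3", "node4", "node5"], vnodes=40)
print(ring.primary("user:42"))             # the server that serves this key
print(ring.owners("user:42", replicas=4))  # primary followed by 4 backup servers
```

Changing `vnodes` changes the number of points on the ring but never the number of copies of a key; changing `replicas` does the opposite.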

Sheehy answered 10/11, 2019 at 19:1 Comment(2)
When a node goes down, how is that handled in a typical system? Will the nodes holding the replicas continue to serve the data, or will consistent hashing kick in, assign new nodes for the data, and move the data from the replicas over to the newly assigned nodes? The former seems to be the more efficient option, but I'm not sure how consistent hashing would work in that case. Basically, does no redistribution of keys take place, since the replica nodes take care of the downed node's requests? Neelyneeoma
Generally, replicas are used for fault tolerance, as backup servers. In this scenario we can have a server chain like [R1-R2-R3], all grouped as a single node (N1). Here N1 represents the entire chain of replicas and is what gets virtually distributed on the ring. Ideally, N1 should be considered down only if all the servers in the chain fail. If R1 goes down, R2 should become the primary and handle the load, leaving N1 as [R2, R3]. The outside world doesn't care, since requests are still handled by N1; until all of the replicas are down, N1 shouldn't be considered down. Reciprocal
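A rough way to check the "former" option from the first comment is the sketch below. It assumes replicas live on the next distinct servers clockwise on the ring (as in the answer), not on a grouped chain as in the second comment; the helper names and the server names R1 to R4 are hypothetical. Under that assumption, removing a server only remaps the keys it owned, and each of those keys lands on the server that was already its first replica, so no copying is needed before requests can be served.

```python
import bisect
import hashlib


def ring_hash(label):
    """32-bit ring position for a string (MD5 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.md5(label.encode()).digest()[:4], "big")


def build_ring(servers, vnodes=40):
    """Sorted list of (position, server) pairs, `vnodes` points per server."""
    return sorted(
        (ring_hash(f"{server}#{i}"), server)
        for server in servers
        for i in range(vnodes)
    )


def owners(ring, key, count):
    """First `count` distinct servers clockwise from the key's position."""
    hashes = [position for position, _ in ring]  # rebuilt per call to keep the sketch short
    start = bisect.bisect_right(hashes, ring_hash(key))
    found = []
    for step in range(len(ring)):
        server = ring[(start + step) % len(ring)][1]
        if server not in found:
            found.append(server)
            if len(found) == count:
                break
    return found


servers = ["R1", "R2", "R3", "R4"]
before = build_ring(servers)
after = build_ring([s for s in servers if s != "R1"])  # R1 goes down

keys = [f"key-{i}" for i in range(1000)]
moved = [k for k in keys if owners(before, k, 1)[0] != owners(after, k, 1)[0]]

# Only keys whose primary was R1 change owner...
assert all(owners(before, k, 1)[0] == "R1" for k in moved)
# ...and each one is now served by what used to be its first replica.
assert all(owners(after, k, 1)[0] == owners(before, k, 2)[1] for k in moved)
print(f"{len(moved)} of {len(keys)} keys changed primary")
```

What this does not show is re-replication: after R1 is gone, each affected key has one fewer copy, and a real system would usually create a new copy on the next server along the ring in the background.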
