Consistent Hashing: what about rehashing?
As you may know, consistent hashing is a great idea when dealing with a DHT. The main idea is to not suffer too much when a new node is added or deleted.
From the original paper:
"When a machine is added to or removed from the set of caches, the expected fraction of objects that must be moved to a new cache is the minimum needed to maintain a balanced load across the caches."
The solution is great, but there is a phenomenon of bad distribution of the keys. To solve that, replicas of the original nodes are distributed randomly around the ring. That solution works quite well. Look at the chart if you want to be sure.
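To make that concrete, here is a minimal sketch of such a ring with virtual nodes (the "replicas"). It is only illustrative: the HashRing class, the use of MD5, and the node/key naming are my own assumptions, not something from the paper or any particular library.

import bisect
import hashlib

def _hash(key: str) -> int:
    # Map a string to a point on the ring (MD5 chosen only for illustration).
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes=(), replicas=20):
        # 'replicas' = number of virtual points each physical node gets.
        self.replicas = replicas
        self._points = []          # sorted hash values on the ring
        self._point_to_node = {}   # hash value -> physical node
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str):
        for i in range(self.replicas):
            h = _hash(f"{node}#{i}")
            bisect.insort(self._points, h)
            self._point_to_node[h] = node

    def remove_node(self, node: str):
        for i in range(self.replicas):
            h = _hash(f"{node}#{i}")
            self._points.remove(h)
            del self._point_to_node[h]

    def get_node(self, key: str) -> str:
        # Walk clockwise to the first virtual point at or after the key's hash.
        h = _hash(key)
        idx = bisect.bisect(self._points, h) % len(self._points)
        return self._point_to_node[self._points[idx]]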
OK, it seems to work well. But there is something I've been thinking about that nobody seems to mention.
What happens when one node is added (or removed)? Well, every key placed "before" the new node needs to be rehashed. That seems good, because those keys are not "all" the keys. But if we decide to place replicas, say 20 of them, then 20 nodes feel the pain of rehashing.
Fewer replicas mean a worse distribution, but more replicas mean more pain when rehashing is needed.
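To put a rough number on the distribution side of that tradeoff, here is a quick measurement sketch I would run (again, hypothetical node/object names and MD5 just for illustration): it counts how far the busiest node is above a perfectly even share for a few replica counts.

import bisect
import hashlib
from collections import Counter

def h(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

def max_over_ideal(num_nodes=10, replicas=1, num_keys=100_000):
    # Build the ring: each node contributes `replicas` virtual points.
    ring = sorted((h(f"node-{n}#{i}"), f"node-{n}")
                  for n in range(num_nodes) for i in range(replicas))
    points = [p for p, _ in ring]
    counts = Counter()
    for k in range(num_keys):
        idx = bisect.bisect(points, h(f"object-{k}")) % len(points)
        counts[ring[idx][1]] += 1
    # How much the busiest node exceeds a perfectly even share.
    return max(counts.values()) / (num_keys / num_nodes)

for r in (1, 5, 20, 100):
    print(f"replicas={r}: busiest node at {max_over_ideal(replicas=r):.2f}x the even share")

In my runs of sketches like this, the busiest node gets steadily closer to the even share as the replica count grows, which is exactly why the replicas are there in the first place.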
What solution do you know of that suits this situation? Am I missing something?
Yeah, I think you're looking at it the wrong way. I'll use the term "node" for a machine and "object" for a thing being cached.
On average, every node is affected by a new node being added. That's not bad; it spreads the load of rehashing across all the available nodes.
The important part is how many objects are affected by the rehashing. On average, only 1/nodes of the objects need to be reassigned; and, on average, each node only needs to cope with transferring away 1/nodes² of the objects, which cuts down on the impact of this.
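A quick sanity check of that claim (not from the original argument; node/object names and the 20-replica, 10-node setup are made up for illustration): hash a batch of keys onto a ring, add one more node, and count how many keys change owner. Roughly 1/nodes of them should move.

import bisect
import hashlib

def point(s: str) -> int:
    # Hash a string to a position on the ring (MD5 just for illustration).
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

def build(nodes, replicas=20):
    # Map every virtual point to its physical node.
    mapping = {point(f"{n}#{i}"): n for n in nodes for i in range(replicas)}
    return sorted(mapping), mapping

def owner(points, mapping, key):
    # The first virtual point clockwise from the key's position owns the key.
    idx = bisect.bisect(points, point(key)) % len(points)
    return mapping[points[idx]]

keys = [f"object-{k}" for k in range(100_000)]
nodes = [f"node-{n}" for n in range(10)]

points, mapping = build(nodes)
before = {k: owner(points, mapping, k) for k in keys}

points, mapping = build(nodes + ["node-10"])  # an 11th node joins
after = {k: owner(points, mapping, k) for k in keys}

moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved / len(keys):.1%} of keys changed owner")  # roughly 1/11

The keys that move are spread across all the existing nodes, so no single node bears the whole cost of the rehash.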