libhashring - high performance consistent hashing

In a previous post I talked about the need to shard Redis data and how I accomplished this by adding shard/hashing support to erldis, an Erlang client for Redis. This solution proved to work well, it distributed our data very well amongst many Redis servers -- but there was one problem. Performance.

In the change I made to erldis, the hash ring was stored in ETS (an erlang memory store) and anytime a key was hashed, the ring had to be retrieved from ETS. The problem with this is that Erlang copies the entire ring when it comes out of ETS. The ring we used was very large, with thousands of items. Copying this every single time became a huge performance hit.

To mitigate the performance problems, I decided to implement the hash ring in C, and write an Erlang driver to use it. That is how libhashring was born. It is very high performance and currently has bindings for Erlang and Java. It is currently deployed in our production environment and its speed is incredible. I am confident that increasing the size of the ring as we add capacity will not cause an impact to the hash ring's performance.

libhashring supports MD5 and SHA-1. MD5 seems to be about 25% faster than SHA-1, so if you want the extra performance, MD5 is probably the best bet.

Feel free to fork libhashring and make it even better, I'd be really happy to get some feedback and contributions.

P.S. - After finishing the hash ring, I came across libketama and realized it pretty much solves the same problem that I had. Too bad I didn't see it sooner!

C example

hash_ring_t *ring = hash_ring_create(8, HASH_FUNCTION_SHA1);
char *slotA = "slotA";
char *slotB = "slotB";

char *keyA = "keyA";
char *keyB = "keyBBBB";
char *keyC = "keyB_";

hash_ring_node_t *node;

assert(hash_ring_add_node(ring, (uint8_t*)slotA, strlen(slotA)) == HASH_RING_OK);
assert(hash_ring_add_node(ring, (uint8_t*)slotB, strlen(slotB)) == HASH_RING_OK);

node = hash_ring_find_node(ring, (uint8_t*)keyA, strlen(keyA));
assert(node != NULL && node->nameLen == strlen(slotA) && memcmp(node->name, slotA, strlen(slotA)) == 0);

node = hash_ring_find_node(ring, (uint8_t*)keyB, strlen(keyB));
assert(node != NULL && node->nameLen == strlen(slotA) && memcmp(node->name, slotA, strlen(slotA)) == 0);

node = hash_ring_find_node(ring, (uint8_t*)keyC, strlen(keyC));
assert(node != NULL && node->nameLen == strlen(slotB) && memcmp(node->name, slotB, strlen(slotB)) == 0);

hash_ring_free(ring);
Written on June 4, 2011