# Consistent Hashing Lookup

# Consistent Hashing Lookup

Consistent hashing maps keys to nodes in a dynamic set. Both keys and nodes are placed on a hash ring. A key is assigned to the first node encountered when moving clockwise from the key position.

You use it in distributed systems where nodes are added or removed and minimal key movement is required.

## Problem

Given a set of nodes and a key $k$, determine which node is responsible for storing or serving $k$.

## Model

Let:

* $h(x)$ map keys and nodes to a circular space $[0, M)$
* nodes are placed on the ring at positions $h(node)$

For a key $k$, compute:

$$
p = h(k)
$$

The assigned node is:

* the first node with position $\ge p$
* if none exists, wrap to the smallest node position

## Algorithm

```text id="9f3pqa"
consistent_hash_lookup(nodes, k):
    p = h(k)

    i = first node position >= p

    if i exists:
        return node[i]

    return node[0]   // wrap around
```

Nodes are typically stored in a sorted structure to support efficient lookup.

## Example

Suppose nodes are placed at:

$$
[10, 30, 70]
$$

and:

$$
h(k) = 50
$$

Then:

* 50 lies between 30 and 70
* next clockwise node is 70

So key $k$ is assigned to node 70.

If:

$$
h(k) = 80
$$

* no node $\ge 80$
* wrap to node 10

## Correctness

The ring structure partitions the hash space into intervals. Each node owns the interval from its predecessor to itself.

The lookup selects exactly the node that owns the interval containing $h(k)$.

## Complexity

| operation | time        |
| --------- | ----------- |
| lookup    | $O(\log n)$ |

Using a sorted array or balanced tree over node positions.

## Space

$$
O(n)
$$

Additional space is often used for virtual nodes to improve load balancing.

## When to Use

Consistent hashing is appropriate when:

* nodes are added or removed dynamically
* minimal redistribution of keys is required
* load balancing across nodes is needed
* distributed storage or caching is implemented

It is widely used in distributed databases, caches, content delivery systems, and sharded services.

## Implementation

```python id="8c2zvh"
import bisect

class ConsistentHash:
    def __init__(self, nodes):
        self.positions = sorted((hash(n), n) for n in nodes)
        self.keys = [p for p, _ in self.positions]

    def get(self, key):
        h = hash(key)
        i = bisect.bisect_left(self.keys, h)
        if i == len(self.keys):
            i = 0
        return self.positions[i][1]
```

```go id="k5x7fd"
import "sort"

type Node struct {
	Hash uint64
	ID   string
}

type ConsistentHash struct {
	Nodes []Node
}

func (c *ConsistentHash) Get(key string) string {
	h := hashString(key)

	i := sort.Search(len(c.Nodes), func(i int) bool {
		return c.Nodes[i].Hash >= h
	})

	if i == len(c.Nodes) {
		i = 0
	}

	return c.Nodes[i].ID
}

func hashString(s string) uint64 {
	var h uint64 = 1469598103934665603
	for i := 0; i < len(s); i++ {
		h ^= uint64(s[i])
		h *= 1099511628211
	}
	return h
}
```

