\(Hopscotch\ hashing\) is a scheme in computer programming for resolving hash collisions of values of hash functions in a table using open addressing. \(Hopscotch\ hashing\) was introduced in 2008. The name is derived from the sequence of hops that characterize the table's insertion algorithm.
The algorithm uses a single array of \(N\) elements. For each element (in the following of type \(char\)), its neighborhood ($$H$$) is a small collection of nearby consecutive elements (i.e. ones with close indexes to the original hashed element). The desired property of the neighborhood is that the cost of finding an item in the indexes of the neighborhood is close to the cost of finding it in the index itself.
In hopscotch hashing, as in cuckoo hashing, and unlike in linear probing, a given item will always be inserted-into and found-in the neighborhood of its hashed index. In other words, it will always be found either in its original hashed array entry, or in one of the next $$H-1$$ neighboring entries. The neighborhood is thus "virtual" and has a fixed size and overlaps with the next $$H-1$$ indexes.
Here is how to add item $$x$$ which was hashed to index $$i$$:
- If the entry $$i$$ is empty, add $$x$$ to $$i$$ and return.
- Starting at entry $$i$$, use a linear probe to find an empty entry at index $$j$$.
- If the empty entry's index $$j$$ is within $$H-1$$ of entry $$i$$, place $$x$$ there and return. Otherwise, entry $$j$$ is too far from $$i$$. To create an empty entry closer to $$i$$, find an item $$y$$ whose hash value lies between $$i$$ and $$j$$, but within $$H-1$$ of $$j$$. Displacing $$y$$ to $$j$$ creates a new empty slot closer to $$i$$. Repeat until the empty entry is within $$H-1$$ of entry $$i$$, place $$x$$ there and return. If no such item $$y$$ exists, insertion fails.
Given the following:typedef char element;
typedef element hash[N];
- Write a function that checks whether an element \(x\) of type \(char\) belongs to such hash table.
- Write a function that inserts an element \(x\) of type \(char\) into such hash table using the Hopscotch hashing algorithm.
Prototype: \(int\ insert(hash\ h,\ element\ e,\ int\ neighborhood)\)
Example:
Let $$N=10$$, $$H(k)=((int)k)\%N$$, and neighborhood of length 3.
Let the keys be the following, along with their hashing values:
\begin{array}{| c | c | }\hline
k & H(k) \\ \hline
a & 7 \\ \hline
b & 8 \\ \hline
l & 8 \\ \hline
d & 0 \\ \hline
e & 1 \\ \hline
k & 7 \\ \hline
\end{array}
Insertion of \(a\) of hash value = 7, add in index 7 since empty cell:
\begin{array}{|c |c|c|c|c|c|c|c|c|c|c|c|}\hline
index & 0 & 1&2&3&4&5&6&7&8&9 \\ \hline
T & & &&&&&&a&& \\ \hline
\end{array}
Insertion of \(b\) of hash value = 8, add in index 8 since empty cell:
\begin{array}{|c |c|c|c|c|c|c|c|c|c|c|c|}\hline
index & 0 & 1&2&3&4&5&6&7&8&9 \\ \hline
T & & &&&&&&a&b& \\ \hline
\end{array}
Insertion of \(l\) of hash value = 8, add in index 9 since 8 is occupied and by linear probing index 9 is empty and within the neighborhood:
\begin{array}{|c |c|c|c|c|c|c|c|c|c|c|c|}\hline
index & 0 & 1&2&3&4&5&6&7&8&9 \\ \hline
T & & &&&&&&a&b&l \\ \hline
\end{array}
Insertion of \(d\) of hash value = 0, add in index 0 since empty cell:
\begin{array}{|c |c|c|c|c|c|c|c|c|c|c|c|}\hline
index & 0 & 1&2&3&4&5&6&7&8&9 \\ \hline
T & d & &&&&&&a&b&l \\ \hline
\end{array}
Insertion of \(e\) of hash value = 1, add in index 1 since empty cell:
\begin{array}{|c |c|c|c|c|c|c|c|c|c|c|c|}\hline
index & 0 & 1&2&3&4&5&6&7&8&9 \\ \hline
T & d & e &&&&&&a&b&l \\ \hline
\end{array}
Insertion of \(k\) of hash value = 7, index 7 is occupied by \(a\). The neighborhood are cells 7, 8 and 9; they are all occupied.
Search for an empty cell by linear probing; in this example, cell number 2.
Check neighborhood 0, 1 and 2. Element at index 0 (character \(d\) can move to cell number 2 since its neighborhood are cells 0, 1 and 2), thus emptying cell number 0 which is still not in the neighborhood of \(k\).
We reiterate, since now cell 0 is empty, we check its neighborhood 8, 9 and 0. Element at index 8 (character \(b\) can move to cell number 0 (since its neighborhood are cells 8, 9 and 0), thus emptying cell number 8 which is in the neighborhood of \(k\):
\begin{array}{|c |c|c|c|c|c|c|c|c|c|c|c|}\hline
index & 0 & 1&2&3&4&5&6&7&8&9 \\ \hline
T & b & e & d &&&&&a&k&l \\ \hline
\end{array}
Difficulty level
Video recording
This exercise is mostly suitable for students
#define N 10
typedef char element;
typedef element hash[N];
int search(hash h, element e, int hop)
{
int i;
int v = hashfunction(e);
for (i = v; i < v + hop; i++)
{
if (h[i%N] == e)
return 1;
}
return 0;
}
int insert(hash h, element e, int hop)
{
int j;
int vhash, index, vhahsalready, step;
vhash = hashfunction(e);
if (h[vhash] == '0')
{
h[vhash] = e;
return 1;
}
else
{
j = vhash + 1;
// find empty place
while (j!=vhash && h[j%N] != '0')
j = (j + 1) % N;
if (j == vhash) // no empty place
return 0;
while(1)
{
//empty place at j
if (j < vhash) j = j + N;
if (abs(j - vhash) < hop) // If the empty entry's index j is within H-1 of entry i, place x there and return.
{
h[j%N] = e;
return 1;
}
// Otherwise, entry j is too far from i.
// To create an empty entry closer to i,
// find an item y whose hash value lies between i and j, but within H - 1 of j.
// Displacing y to j creates a new empty slot closer to i.
// Repeat until the empty entry is within H - 1 of entry i, place x there and return.
// If no such item y exists, or if the bucket i already contains H items, resize and rehash the table.
// find an item between vhash and j, but within H - 1 of j.
// check if element can be moved forward
step = 1;
while (step < hop)
{
index = (j - hop + step) % N;
if (index < 0) index += N;
vhahsalready = hashfunction(h[index]);
if (j > N) j = j%N;
if (abs(j - vhahsalready) < hop) // can be moved
{
h[j%N] = h[index];
h[index] = '0';
j = index;
break;
}
step++;
}
if (step >= hop)
return 0;
}
}
}
Back to the list of exercises
Looking for a more challenging exercise, try this one !!
Hashing using quadratic probing