Quadratic Probing Technique for Open Addressing

Imagine a hash table as a set of labelled boxes (or slots). When we want to store an item, a hash function tells us which box to use. But what happens if that box is already full? This situation is called a collision. Open addressing is one way to handle collisions: instead of putting the item somewhere else entirely, we look for the next available box within the table itself.

Linear probing, the simplest open addressing method, just checks the very next box, then the next, and so on. This can lead to clumps of filled boxes, called primary clustering, slowing things down. Quadratic probing is a smarter approach that tries to avoid these clumps by looking for an empty box further away with each attempt.

How Quadratic Probing Works

Quadratic probing is a collision resolution technique used in hash tables with open addressing. When a collision occurs at a specific index (calculated by the hash function), quadratic probing looks for the next available slot using a sequence that increases quadratically.

The basic idea is to jump further and further away from the original hash index with each probe attempt. The sequence of indices to check is typically calculated as follows:

  1. Initial Probe (i=0): Check the index given by the hash function: index = h(key) % table_size.
  2. Second Probe (i=1): If the initial slot is full, check (h(key) + 1*1) % table_size.
  3. Third Probe (i=2): If that slot is also full, check (h(key) + 2*2) % table_size.
  4. Fourth Probe (i=3): If still full, check (h(key) + 3*3) % table_size.
  5. And so on… The general formula for the i-th probe (starting with i=0) is: index = (h(key) + i^2) % table_size

Note

Sometimes a more general formula (h(key) + c1*i + c2*i^2) % table_size is used, where c1 and c2 are constants. However, c1=0 and c2=1 (giving h(key) + i^2) is the most common and simplest form.

Insertion:

Let’s say we have a hash table of size 10 and we want to insert the keys 23, 13, 43, and 33. Let our hash function be h(key) = key % 10.

  • Insert 23: h(23) = 23 % 10 = 3. Slot 3 is empty. Place 23 at index 3. Table: [_, _, _, 23, _, _, _, _, _, _]
  • Insert 13: h(13) = 13 % 10 = 3. Slot 3 is occupied by 23 (Collision!).
    • Probe 1 (i=1): (3 + 1^2) % 10 = (3 + 1) % 10 = 4. Slot 4 is empty. Place 13 at index 4. Table: [_, _, _, 23, 13, _, _, _, _, _]
  • Insert 43: h(43) = 43 % 10 = 3. Slot 3 is occupied (Collision!).
    • Probe 1 (i=1): (3 + 1^2) % 10 = 4. Slot 4 is occupied (Collision!).
    • Probe 2 (i=2): (3 + 2^2) % 10 = (3 + 4) % 10 = 7. Slot 7 is empty. Place 43 at index 7. Table: [_, _, _, 23, 13, _, _, 43, _, _]
  • Insert 33: h(33) = 33 % 10 = 3. Slot 3 is occupied (Collision!).
    • Probe 1 (i=1): (3 + 1^2) % 10 = 4. Slot 4 is occupied (Collision!).
    • Probe 2 (i=2): (3 + 2^2) % 10 = 7. Slot 7 is occupied (Collision!).
    • Probe 3 (i=3): (3 + 3^2) % 10 = (3 + 9) % 10 = 12 % 10 = 2. Slot 2 is empty. Place 33 at index 2. Table: [_, _, 33, 23, 13, _, _, 43, _, _]

Searching: To search for a key, we follow the same probing sequence starting from the initial hash index. If we find the key, the search is successful. If we encounter an empty slot during the probe sequence, it means the key is not in the table.

Deletion: Similar to other open addressing techniques, simply emptying a slot can break the probe sequence for later searches. To handle this, deleted slots are often marked with a special “deleted” or “dummy” value. During insertion, these slots can be treated as empty, but during searching, the probing continues past them.

Advantages of Quadratic Probing

  • Reduces Primary Clustering: Unlike linear probing where occupied slots tend to form long consecutive blocks, quadratic probing jumps further away. This helps spread out elements that initially hash to the same index, significantly reducing the primary clustering problem.
  • Relatively Simple: While more complex than linear probing, it’s generally easier to understand and implement compared to other techniques like double hashing.

Disadvantages and Considerations

  • Secondary Clustering: Quadratic probing suffers from a milder form of clustering called secondary clustering. If multiple keys hash to the same initial index, they will follow the exact same sequence of probe locations. This is better than primary clustering but still means that these keys compete for the same set of alternative slots.

  • Not Guaranteed to Find an Empty Slot: A major drawback is that quadratic probing might not explore all available slots in the table. If the table size (table_size) is not chosen carefully (e.g., not a prime number) or if the load factor (ratio of filled slots to total slots) gets too high (typically above 0.5 or 1/2), the probing sequence might start repeating indices before finding an empty slot, even if one exists! This can lead to an infinite loop or insertion failure.

    Tip

    To guarantee that quadratic probing can find an empty slot (if one exists and the load factor is less than or equal to 0.5), the table size should be a prime number p such that p ≡ 3 (mod 4). Here’s a related discussion.

  • Deletion is Complex: Like other open addressing methods, simply removing an element can break the probing chain for searching other elements. Usually, slots are marked as “deleted” instead of being truly emptied, adding complexity.

Complexity:

  • Insertion/Search (Average Case): O(1), provided the load factor is kept low (e.g., below 0.5).
  • Insertion/Search (Worst Case): O(n), where n is the number of elements in the table. This can happen with very high load factors or unlucky collision patterns.

What’s Next?

Now that you understand how quadratic probing works to handle hash collisions, you might be interested in exploring related topics:

  • Double Hashing: Discover how using a second hash function provides another effective way to find empty slots when collisions occur in open addressing.
  • Comparing Collision Resolution Techniques: Explore the pros and cons of different strategies for handling hash collisions, including separate chaining, linear probing, quadratic probing, and double hashing, to understand when to use each.
  • Open Addressing Overview: Revisit the main concept of open addressing, where all elements are stored directly within the hash table itself.
  • Hash Table Operations: Learn more about the fundamental operations like insertion, deletion, and search in hash tables.