Binary Heap Data Structure


A binary heap is a data structure used to efficiently find the maximum or minimum element in a collection of elements. If a binary heap is configured to retrieve the maximum element among all the elements in the heap, it is called a max-heap. Similarly, if it is configured to retrieve the minimum element, it is called a min-heap.

Characteristics of a Binary Heap

A binary heap is characterized by its structure and by the partially ordered nature of its parent and child nodes. Both properties are described in detail below:

Invariant #1 (Shape Property) of a Heap

The structure of a binary heap is that of a complete binary tree. A complete binary tree is a binary tree in which every layer is completely filled except possibly the last one, and the nodes in the last layer are filled from left to right. Below is a sample image (source: wiki) of a complete binary tree.

Invariant #2 (Heap Property) of a Heap

A consistent relation (greater than or equal to, or less than or equal to) is maintained between every parent node and its child nodes, and this relation determines the type of heap. If each parent node is greater than or equal to its child nodes, the resulting heap is called a max-heap. Similarly, if each parent node is less than or equal to its child nodes, the resulting heap is called a min-heap.

Below are images of min-heap and max-heap (source):

Because the relation between parent and child nodes is recursive, the structure starting from any child node in a heap can be thought of as another heap rooted at that child node. This intuition is useful when thinking about merging heaps or extracting an element from a heap.

Array Implementation

A binary heap can be implemented using a tree data structure composed of nodes with up to two children. But this implementation is not storage-efficient, and it is also difficult to efficiently track the position in the tree where the next element should be inserted. Fortunately, because a binary heap is a complete binary tree, it can be represented as an array. Below are both the tree and array representations of an example binary heap (source):

The parent-child relations between elements stored in the array can be found using the formulas below (positions are counted from 0, so the root is at position 0):

  • If the parent node is at position i, the child nodes are at positions 2*i + 1 and 2*i + 2
  • For example, the children of the node at position 3 are at positions 2*3 + 1 = 7 and 2*3 + 2 = 8
  • Similarly, for any child node at position i, the parent node can be found using the formula (i - 1) / 2, rounded down
  • Continuing the example, for the element at position 7, the parent node is at (7 - 1) / 2 = 3. When the result is not a whole number, we round it down to the nearest integer. For example, the parent of the child at position 8 is at (8 - 1) / 2 = 3.5, rounded down to 3.
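The index formulas above can be sketched in Python as follows. This is a minimal illustration that assumes the heap is stored in a plain list with the root at index 0; the helper names (parent, left_child, right_child, is_max_heap) are hypothetical and chosen only for this example.

```python
def parent(i: int) -> int:
    """Index of the parent of the node at index i (valid for i > 0)."""
    return (i - 1) // 2  # integer division rounds down


def left_child(i: int) -> int:
    """Index of the left child of the node at index i."""
    return 2 * i + 1


def right_child(i: int) -> int:
    """Index of the right child of the node at index i."""
    return 2 * i + 2


def is_max_heap(items: list) -> bool:
    """Check that every non-root element is <= its parent."""
    return all(items[parent(i)] >= items[i] for i in range(1, len(items)))
```

Because the shape property guarantees there are no gaps in the array, checking each element against its parent is enough to verify the heap property.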

Operations

For all the operations discussed here, we assume that the heap is a max-heap, i.e., the largest element is at the top of the heap.

Find the Largest/Smallest Element present in the Heap

This is the easiest of all the operations. By the definition of a max-heap, the topmost element is always greater than or equal to every other element. So we can just return the first element of the array as the largest element. (In a min-heap, the top/first element is the smallest element.)

This operation takes O(1) time.
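As a minimal sketch, assuming the max-heap is stored in a Python list with the root at index 0, the lookup is a single index access. The function name find_max is hypothetical, and raising an error on an empty heap is a design choice made for this example.

```python
def find_max(heap: list):
    """Return the largest element of a max-heap without removing it."""
    if not heap:
        raise IndexError("find_max called on an empty heap")
    return heap[0]  # the root is always the largest element
```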

Insert element into heap

This operation inserts a new element into an existing heap. The algorithm is as follows:

  • Insert the new element at the end of the heap (after the last element in the array)
  • Check if the newly inserted element is larger than its parent element
  • If the new element is not larger than its parent, the heap property already holds and we need not change anything
  • But if the new element is larger than its parent, the max-heap order (larger element should be at the top) is not satisfied. So we swap the newly added element and its parent
  • We repeat this process with the new element, moving it upwards (sifting the element up) until we reach the top of the heap or a parent that is at least as large as the newly added element.

This operation takes O(logN) time, because the new element moves up by at most the height of the tree.
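The insertion steps above can be sketched as follows. This is one possible implementation, assuming a max-heap stored in a Python list with the root at index 0; the function name heap_insert is hypothetical.

```python
def heap_insert(heap: list, value) -> None:
    """Insert value into a max-heap stored in a list, in place."""
    heap.append(value)          # place the new element at the end
    i = len(heap) - 1
    while i > 0:
        p = (i - 1) // 2        # index of the parent
        if heap[i] <= heap[p]:
            break               # heap order already holds, stop
        heap[i], heap[p] = heap[p], heap[i]  # sift the element up
        i = p


# Usage: build a heap by inserting elements one at a time.
heap = []
for v in [3, 1, 4, 1, 5, 9, 2, 6]:
    heap_insert(heap, v)
```

After these insertions the largest value, 9, is at heap[0].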

Extract Largest / Smallest element

In this operation, which is also known as the pop() operation, we need to extract the largest element (smallest in case of min-heap) from the heap structure.

In the find operation above, we observed that the top element is the maximum or minimum element, depending on whether the underlying heap is a max-heap or a min-heap. In this operation, we need to remove that top element and rearrange the remaining elements so that the heap properties are maintained.

Below is the algorithm to remove the top element from a binary heap:

  • Save the topmost element of the heap (the smallest in a min-heap or the largest in a max-heap); this is the value to return
  • Move the last element into the top position and remove it from the end of the array
  • Now check if the top element has any child elements larger than it
  • If it does, swap it with its largest child, so that the largest child moves up and the smaller element moves down
  • Repeat this process until the node reaches the last layer or is already greater than or equal to its child nodes (sift/bubble the node down the tree)

This operation takes O(logN) time, because the moved element travels down by at most the height of the tree.
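The extraction steps above can be sketched as follows. This is one possible implementation, assuming a max-heap stored in a Python list with the root at index 0; the function name heap_extract_max is hypothetical, and raising an error on an empty heap is a choice made for this example.

```python
def heap_extract_max(heap: list):
    """Remove and return the largest element of a max-heap, in place."""
    if not heap:
        raise IndexError("extract from an empty heap")
    top = heap[0]               # value to return
    last = heap.pop()           # remove the last element from the array
    if heap:
        heap[0] = last          # move it into the top position
        i, n = 0, len(heap)
        while True:
            left, right = 2 * i + 1, 2 * i + 2
            largest = i
            if left < n and heap[left] > heap[largest]:
                largest = left
            if right < n and heap[right] > heap[largest]:
                largest = right
            if largest == i:
                break           # no child is larger, heap order holds
            heap[i], heap[largest] = heap[largest], heap[i]  # sift down
            i = largest
    return top


# Usage: repeated extraction yields the elements in descending order.
heap = [9, 6, 5, 3, 1, 4, 2, 1]  # a valid max-heap
extracted = [heap_extract_max(heap) for _ in range(8)]
```

Swapping with the larger of the two children matters: swapping with the smaller one could leave the other child larger than its new parent.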

Heapify unordered elements

Heapifying is the process of converting a collection of elements into a binary heap. One easy way to heapify a list of elements is to initialize an empty heap and add the elements to it one by one. Building a heap this way takes O(NlogN) time, since each of the N insertions takes O(logN) time.

There is an optimized way to build a heap from scratch, which can be done in O(N) time: Building a Heap
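The text here does not spell out the O(N) method, so the following is only a sketch of the commonly used bottom-up construction, which may differ in detail from the linked article. It assumes a max-heap stored in a Python list with the root at index 0; the names sift_down and build_max_heap are hypothetical.

```python
def sift_down(heap: list, i: int, n: int) -> None:
    """Move heap[i] down until neither child (within the first n items) is larger."""
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        largest = i
        if left < n and heap[left] > heap[largest]:
            largest = left
        if right < n and heap[right] > heap[largest]:
            largest = right
        if largest == i:
            return
        heap[i], heap[largest] = heap[largest], heap[i]
        i = largest


def build_max_heap(items: list) -> None:
    """Rearrange items in place into a max-heap (bottom-up construction)."""
    n = len(items)
    # Leaves are already one-element heaps, so start from the last
    # node that has a child and work back towards the root.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(items, i, n)


# Usage
data = [3, 1, 4, 1, 5, 9, 2, 6]
build_max_heap(data)
```

The intuition for the linear bound is that most nodes sit near the bottom of the tree and can only sift down a few levels, so the total work sums to O(N) rather than O(NlogN).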

TODO: Add animations for operations and code
