Heap#

A Heap is a special Tree-based data structure in which the tree is a complete binary tree. Generally, Heaps can be of two types:

  1. Max-Heap: Key present at the root node must be greatest among the keys present at all of it’s children. The same property must be recursively true for all sub-trees in that Binary Tree.

  2. Min-Heap: key present at the root node must be minimum among the keys present at all of it’s children. The same property must be recursively true for all sub-trees in that Binary Tree.

flowchart TB; a((15)) b1((19)) b2((21)) c1((26)) c2((29)) c3((31)) c4((35)) d1((39)) d2((47)) d3((57)) d4((71)) a --> b1 a --> b2 b1 --> c1 b1 --> c2 b2 --> c3 b2 --> c4 c1 --> d1 c1 --> d2 c2 --> d3 c2 --> d4

Binary Heap Abstract Data Type#

The Binary Heap abstract data type is defined by the following structure and operations. A Binary Heap is structured, as described above, as an ordered collection of items where items are added to and removed from the end called the top. Binary Heaps are ordered LIFO.

Binary Heap Operations#

  • BinaryHeap() creates a new, empty, binary heap.

  • insert(k) adds a new item to the heap.

  • find_min() returns the item with the minimum key value, leaving item in the heap.

  • delete_min() returns the item with the minimum key value, removing the item from the heap.

  • is_empty() returns true if the heap is empty, false otherwise.

  • size() returns the number of items in the heap.

  • build_heap(list) builds a new heap from a list of keys.

h = BinaryHeap() creates a Binary Heap that starts out empty.

Implementing a BinaryHeap in Python#

An abstract data type (ADT) when given a physical implementation then we refer to the implementation as a data structure.

In any object-oriented programming language, the implementation of choice for an abstract data type such as a BinaryHeap is the creation of a new class. The BinaryHeap operations are implemented as methods. Further, to implement a BinaryHeap, which is a collection of elements, it makes sense to utilize the primitive collections provided.

In order to make our heap work efficiently, we will take advantage of the logarithmic nature of the binary tree to represent our heap. In order to guarantee logarithmic performance, we must keep our tree balanced. A balanced binary tree has roughly the same number of nodes in the left and right subtrees of the root. In our heap implementation we keep the tree balanced by creating a complete binary tree, which can be represent by using a single List.

Heap Order Property#

The method that we will use to store items in a heap relies on maintaining the heap order property. The heap order property is as follows:

  • In a heap, for every node x with parent p, the key in p is smaller than or equal to the key in x for Min-Heap.

  • In a heap, for every node x with parent p, the key in p is greater than or equal to the key in x for Max-Heap.

class BinaryHeap:

    def __init__(self):
        self.items = [0]
        self.num_items = 0

    def _min_child(self,index):
        if index * 2 + 1 > self.num_items:
            return index * 2
        else:
            if self.items[index*2] < self.items[index*2+1]:
                return index * 2
            else:
                return index * 2 + 1

    def _percolate_up(self, index):

        while index // 2 > 0:
            child = self.items[index]
            parent = self.items[index // 2]
            if child > parent:
                self.items[index], self.items[index // 2] = self.items[index // 2], self.items[index]
            index //= 2
    
    def _percolate_down(self, index):

        while index * 2 <= self.num_items:
            parent = self.items[index]
            min_child_index = self._min_child(index)
            child = self.items[min_child_index]
            
            if parent > child:
                self.items[index], self.items[min_child_index] = self.items[min_child_index], self.items[index]
            index *= 2
    
    def is_empty(self):
        return self.num_items == 0
    
    def size(self):
        return self.num_items
    
    def insert(self, key):
        self.num_items += 1
        
        self.items.append(k)
        self._percolate_up(self.num_items)

    def find_min(self):
        return self.items[1]

    def delete_min(self):
        self.num_items -= 1

        min_value = self.items[1]
        self.items[1] = self.items[-1]
        self.items.pop()
        self._percolate_down(1)
        return min_value
    
    def build_heap(self, items):
        self.num_items = len(items)
        self.items = [0] + items

        index = len(items) // 2
        while index > 0:
            self._percolate_down(index)
            index -= 1
bh = BinaryHeap()
bh.build_heap([9,5,6,2,3])

print(bh.delete_min())
print(bh.delete_min())
print(bh.delete_min())
print(bh.delete_min())
print(bh.delete_min())
2
3
5
6
9

Practical usage in python#

heapq module provides an implementation of the heap queue algorithm, also known as the priority queue algorithm.

Heaps are arrays for which a[k] <= a[2k+1] and a[k] <= a[2k+2] for all k, counting elements from 0. For the sake of comparison, non-existing elements are considered to be infinite. The interesting property of a heap is that a[0] is always its smallest element.

By default heapq in standard liberary implements Min-Heap only. And you can find more information about the interface and api from docs.