In my data structures class I have been learning about **Binary Heaps**. I have been coding examples of min and max Binary Heaps in both Python and C# the past few days, and have become quite fond of the simplicity and elegance of the data structure. Typically you just need the ability to push an element onto the Binary Heap and pop either the minimum or maximum element from the Binary Heap. However, if you write other methods, like `change priority`

or `remove`

if using it as a **Priority Queue**, the solutions are very elegant, because they re-use the `siftup`

and `siftdown`

methods already in place for push and pop. It's a pretty amazing data structure when you get to know its inner-workings. The fact that you can use it to sort items (**Heapsort**) makes it all that much cooler.

Before I dive into a simple implementation of a min Binary Heap in Python, I first want to mention `heapq`

, which is a collection of functions offering min Binary Heap functionality in the Python collections module.

## heapq in Python

I was first introduced to `heapq`

in Python when I was looking for alternative solutions to finding the two largest integers in a list in Python. This was an actual problem I needed to solve in my computer science class. I had solved it using a number of methods and I was looking for other solutions. This is when I found `heapq`

. It has a wonderful method, called `nlargest`

, that returns the first n largest numbers in a list.

integers = [1, 16, 3, 39, 26, 4, 8, 16] # get 2 largest values largest_integers = heapq.nlargest(2, integers) # [39, 26]

At the time I knew nothing about Binary Heaps and `heapq`

, but now that I understand how Binary Heaps work, `heapq`

is a wonderful set of functions that I can now fully appreciate.

## heapq and Min Binary Heap

What's interesting about `heapq`

is that it is not a container. You define a list that will be your heap and pass it to `heapq`

to perform heap operations. You can start with an empty list and push a bunch of integers onto the heap and then pop them off one at a time, essentially performing a Heapsort. Because `heapq`

is a *min* Binary Heap, it pops from lowest to highest.

import heapq heap = [] heapq.heappush(heap, 12) heapq.heappush(heap, 19) heapq.heappush(heap, 88) heapq.heappush(heap, 62) heapq.heappush(heap, 2) heapq.heappush(heap, 14) heapq.heappush(heap, 92) while len(heap) > 0: print(heapq.heappop(heap), end=' ') # 2 12 14 19 62 88 92

You can also start with a list of values and then perform push and pop operations. In this case, you will first need to `heapify`

the list before pushing and popping values.

import heapq heap = [12, 19, 88, 62, 2, 14, 92] heapq.heapify(heap) heapq.heappush(heap, 3) while len(heap) > 0: print(heapq.heappop(heap), end=' ') # 2 3 12 14 19 62 88 92

After writing several samples of min and max Binary Heaps, I like that `heapq`

is a collection of functions that provide Binary Heap functionality as opposed to also being a container. Allows the developer to have more control of the container at the risk of perhaps writing code that alters the container unexpectedly. There is more to `heapq`

, but this gives you a taste of the two main ways to use `heapq`

, with or without starting values, and the fundamental push and pop operations offered by a Binary Heap.

## Custom Min Binary Heap in Python

There are 2 important features of a Binary Heap. One, it is a complete Binary Tree, which means it can be stored as an Array. Two, the nodes are ordered in a particular way. For a max Binary Heap, the value of a node is at least the value of its children. For a min Binary Heap, the value of a node is no more than the value of its children.

Here is a simple, bare-bones example of a min Binary Heap in Python providing simple push and pop operations.

class Heap: def __init__(self): self.__heap = [] self.__last_index = -1 def push(self, value): self.__last_index += 1 if self.__last_index < len(self.__heap): self.__heap[self.__last_index] = value else: self.__heap.append(value) self.__siftup(self.__last_index) def pop(self): if self.__last_index == -1: raise IndexError('pop from empty heap') min_value = self.__heap[0] self.__heap[0] = self.__heap[self.__last_index] self.__last_index -= 1 self.__siftdown(0) return min_value def __siftup(self, index): while index > 0: parent_index, parent_value = self.__get_parent(index) if parent_value <= self.__heap[index]: break self.__heap[parent_index], self.__heap[index] =\ self.__heap[index], self.__heap[parent_index] index = parent_index def __siftdown(self, index): while True: index_value = self.__heap[index] left_child_index, left_child_value = self.__get_left_child(index, index_value) right_child_index, right_child_value = self.__get_right_child(index, index_value) if index_value <= left_child_value and index_value <= right_child_value: break if left_child_value < right_child_value: new_index = left_child_index else: new_index = right_child_index self.__heap[new_index], self.__heap[index] =\ self.__heap[index], self.__heap[new_index] index = new_index def __get_parent(self, index): if index == 0: return None, None parent_index = (index - 1) // 2 return parent_index, self.__heap[parent_index] def __get_left_child(self, index, default_value): left_child_index = 2 * index + 1 if left_child_index > self.__last_index: return None, default_value return left_child_index, self.__heap[left_child_index] def __get_right_child(self, index, default_value): right_child_index = 2 * index + 2 if right_child_index > self.__last_index: return None, default_value return right_child_index, self.__heap[right_child_index] def __len__(self): return self.__last_index + 1

In class I was required to write Binary Heaps using a fixed-size array with a maximum size, but in this case I let the Python List grow as big as necessary. I also don't delete cells in the list. I just use a pointer, called `last_index`

, to point to the last value in the heap.

I also re-wrote the `siftup`

and `siftdown`

methods to be iterative functions as opposed to recursive functions. I like the elegance of recursive functions and it was fine for class assignments, but most developers comment "negatively" on them.

The importat public methods are `push`

and `pop`

for adding and removing items from the heap. The important private methods are `siftup`

and `siftdown`

, which are responsible for maintaining the ordered hierarchy of the heap.

Let's execute a Heapsort on a list of integers by pushing them all onto the heap and popping them off one-by-one.

values = random.sample(range(10000), 10) print(values) h = Heap() for v in values: h.push(v) while len(h) > 0: print(h.pop(), end=' ') [3383, 1851, 4975, 435, 5008, 480, 5151, 6462, 2936, 1621] 435 480 1621 1851 2936 3383 4975 5008 5151 6462

Looks good. The original list of random integers is in no particular order. However, as we pop each value from the min Binary Heap they arrive in numerically increasing order (Heapsort)! If this were a max Binary Heap they would arrive in decreasing order.

Such an elegant data structure.

Best Wishes!

**Posted by David Hayden**