Heaps, Priority Queues, and Heapsort
I am Jyotiprakash, a deeply driven computer systems engineer, software developer, teacher, and philosopher. With a decade of professional experience, I have contributed to various cutting-edge software products in network security, mobile apps, and healthcare software at renowned companies like Oracle, Yahoo, and Epic. My academic journey has taken me to prestigious institutions such as the University of Wisconsin-Madison and BITS Pilani in India, where I consistently ranked among the top of my class.
At my core, I am a computer enthusiast with a profound interest in understanding the intricacies of computer programming. My skills are not limited to application programming in Java; I have also delved deeply into computer hardware, learning about various architectures, low-level assembly programming, Linux kernel implementation, and writing device drivers. The contributions of Linus Torvalds, Ken Thompson, and Dennis Ritchie—who revolutionized the computer industry—inspire me. I believe that real contributions to computer science are made by mastering all levels of abstraction and understanding systems inside out.
In addition to my professional pursuits, I am passionate about teaching and sharing knowledge. I have spent two years as a teaching assistant at UW Madison, where I taught complex concepts in operating systems, computer graphics, and data structures to both graduate and undergraduate students. Currently, I am an assistant professor at KIIT, Bhubaneswar, where I continue to teach computer science to undergraduate and graduate students. I am also working on writing a few free books on systems programming, as I believe in freely sharing knowledge to empower others.
Imagine a hospital ER triage system where you must constantly identify the most critical patient amidst a continuous stream of new arrivals; this scenario requires a data structure that handles priority efficiently, not just arrival order. Standard arrays fail here: keeping an array sorted makes retrieval fast $O(1)$ but insertion painfully slow $O(N)$ as you shift elements to make room, while an unsorted array makes insertion instant but forces you to scan the entire list $O(N)$ every time you need the highest priority item. The Binary Heap is the "Goldilocks" solution, striking a perfect balance between structure and speed by allowing us to peek at the top priority element instantly while keeping both insertion and deletion consistently fast $O(\log N)$.
Anatomy of a Binary Heap
A Binary Heap is defined by two strict rules. If a tree violates either of these, it is not a heap.
Rule 1: The Shape Property (Complete Binary Tree)
First, the structure must be a Complete Binary Tree. This means the tree is completely filled on all levels, except possibly the lowest level, which is filled from left to right. There are no "gaps" in the tree.
Rule 2: The Heap Property (Order)
Second, the tree must satisfy the Heap Property. This defines the relationship between a parent and its children. There are two flavors:
A. Max-Heap
In a Max-Heap, the value of every parent node is greater than or equal to the values of its children. Consequently, the largest element is always at the Root.
Note: Unlike a Binary Search Tree (BST), there is no specific relationship between the left and right children (e.g., the left child doesn't have to be smaller than the right).
(Root is Max)

          100
         /   \
       70     50
      /  \   /  \
    40   30 20   10
B. Min-Heap
In a Min-Heap, the value of every parent node is less than or equal to the values of its children. The smallest element is always at the Root.
(Root is Min)

           5
         /   \
       10     20
      /  \   /  \
    30   40 50   60
Why Does "Complete" Matter?
The requirement for a Complete Binary Tree is not just aesthetic. It is the secret sauce that allows us to flatten this tree into a standard array without wasting memory on empty slots.
Array Representation
While we draw Heaps as trees, we almost never implement them using struct Node with pointers. Instead, we use a simple, flat Array.
Because a heap is a Complete Binary Tree, we can map nodes to array indices in a specific order: strictly from top to bottom, left to right.
The Visual Mapping
Let's map a Max-Heap into an array. Notice how the array is just a "level-by-level" reading of the tree.
The Tree View:

             50 [0]
            /      \
       30 [1]      20 [2]
      /     \     /     \
  15 [3] 10 [4] 8 [5] 16 [6]
The Array View:
Index: [ 0 | 1 | 2 | 3 | 4 | 5 | 6 ]
Value: [ 50| 30| 20| 15| 10| 8 | 16]
The Navigation Math
Since we don't have pointers (like node->left), how do we move up and down the tree? We use math. If a node is at index i, its family members are located at predictable indices using 0-based indexing:
Left Child: \(2i + 1\)
Right Child: \(2i + 2\)
Parent: \(\lfloor \frac{i - 1}{2} \rfloor\)
Here is how you would define these helpers in C to keep your code clean:
// Helper functions to navigate the imaginary tree within the array
int get_parent_index(int i) {
return (i - 1) / 2;
}
int get_left_child_index(int i) {
return (2 * i) + 1;
}
int get_right_child_index(int i) {
return (2 * i) + 2;
}
Why do we do this?
Space Efficiency: We save the memory overhead of storing two pointers (Left/Right) per node.
Cache Locality: Arrays are stored in contiguous memory blocks. Traversing a heap (which often involves moving between parents and children) is much friendlier to the CPU cache than jumping around memory to chase pointers.
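As a quick sanity check on the index math, here is a small validator sketch; is_max_heap is an illustrative name introduced here, not something defined above. It visits every node that has at least one child and confirms the Max-Heap order property using the same child formulas:

```c
#include <stdbool.h>

// Illustrative helper: verify the Max-Heap order property over an
// array-backed heap of n elements using the index formulas above.
bool is_max_heap(const int arr[], int n) {
    // Visit every node that has at least a left child.
    for (int i = 0; 2 * i + 1 < n; i++) {
        if (arr[i] < arr[2 * i + 1])                  // parent vs. left child
            return false;
        if (2 * i + 2 < n && arr[i] < arr[2 * i + 2]) // parent vs. right child
            return false;
    }
    return true;
}
```

Running it on the example array [50, 30, 20, 15, 10, 8, 16] returns true; any array where a parent is smaller than one of its children fails the check.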
Bubbling Up and Down
A heap must always maintain its two properties (Shape and Order). Whenever we modify the heap, we temporarily break the Order property, and then perform a "bubbling" operation to fix it.
Insertion (The "Bubble Up")
Time Complexity: \(O(\log n)\)
When we add a new element, we must place it at the very end of the array to keep the tree "complete." However, this new element might be huge, violating the Max-Heap property.
The Algorithm:
Place the new element at the last index (bottom-right of the tree).
Compare it with its parent.
Swap if the child is greater than the parent.
Repeat up the tree until order is restored or the root is reached.
Visual Walkthrough (Max-Heap): Inserting 45
Step 1: Add 45 to end            Step 2: Swap with Parent (30)

        50                              50
       /  \                            /  \
     30    20                        45    20
    /  \                            /  \
  10   [45] <-- New               10    30

  (Violation!)                    (Valid Heap!)
C Implementation:
void swap(int *a, int *b) {
int temp = *a;
*a = *b;
*b = temp;
}
void insert(int arr[], int *n, int value) {
// 1. Insert at the end
arr[*n] = value;
int current = *n;
(*n)++; // Increase size
// 2. Bubble Up
// While current is not root AND current > parent
while (current > 0 && arr[current] > arr[(current - 1) / 2]) {
int parent = (current - 1) / 2;
swap(&arr[current], &arr[parent]);
current = parent; // Move up
}
}
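To see insert in action, the sketch below builds a heap from a stream of values. The swap and insert functions are repeated from above so the snippet compiles on its own; build_by_insertion is an illustrative helper name:

```c
static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

// insert() as defined above: append at the end, then bubble up.
void insert(int arr[], int *n, int value) {
    arr[*n] = value;
    int current = *n;
    (*n)++;
    while (current > 0 && arr[current] > arr[(current - 1) / 2]) {
        int parent = (current - 1) / 2;
        swap(&arr[current], &arr[parent]);
        current = parent;
    }
}

// Illustrative helper: build a heap by inserting one value at a time.
// Each insert is O(log n), so this approach is O(n log n) overall.
void build_by_insertion(int heap[], int *n, const int values[], int count) {
    for (int i = 0; i < count; i++)
        insert(heap, n, values[i]);
}
```

Inserting 10, 30, 50, 45 in that order leaves 50 at the root, with every parent dominating its children.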
Deletion / Extract Max (The "Bubble Down")
Time Complexity: \(O(\log n)\)
Removing the root (the maximum element) leaves a "hole" at the top. We can't just shift the whole array up (that would be $O(n)$).
The Algorithm:
Replace the root with the last element in the heap.
Decrease the heap size by 1.
Heapify (Sink Down): Compare the new root with its children.
Swap with the larger of the two children.
Repeat down the tree until the node dominates both children.
Visual Walkthrough (Max-Heap): Removing 50
Start: Max is 50             Step 1: Move last (10) to Top

        50                          10  <-- (Was last)
       /  \                        /  \
     45    20                    45    20
    /  \                        /
  10    30                    30

  (Remove 50)                 (Violation: 10 < 45!)

Step 2: Swap with larger child (45)

        45
       /  \
     10    20
    /
  30

  (Violation: 10 < 30!)

Step 3: Swap with larger child (30)

        45
       /  \
     30    20
    /
  10

  (Valid Heap!)
C Implementation:
void heapify_down(int arr[], int n, int i) {
int largest = i;
int left = 2 * i + 1;
int right = 2 * i + 2;
// Check if left child exists and is greater than root
if (left < n && arr[left] > arr[largest])
largest = left;
// Check if right child exists and is greater than current largest
if (right < n && arr[right] > arr[largest])
largest = right;
// If largest is not root
if (largest != i) {
swap(&arr[i], &arr[largest]);
// Recursively heapify the affected sub-tree
heapify_down(arr, n, largest);
}
}
int extract_max(int arr[], int *n) {
if (*n <= 0) return -1; // Error code
int max_val = arr[0];
// Move last element to root
arr[0] = arr[*n - 1];
(*n)--;
// Sink it down
heapify_down(arr, *n, 0);
return max_val;
}
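A usage sketch for the extraction side: draining the heap with repeated extract_max calls yields values in descending order. The helpers are repeated from above so the snippet compiles standalone; drain is an illustrative name:

```c
static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

// heapify_down and extract_max repeated from above for a standalone build.
void heapify_down(int arr[], int n, int i) {
    int largest = i, left = 2 * i + 1, right = 2 * i + 2;
    if (left < n && arr[left] > arr[largest]) largest = left;
    if (right < n && arr[right] > arr[largest]) largest = right;
    if (largest != i) {
        swap(&arr[i], &arr[largest]);
        heapify_down(arr, n, largest);
    }
}

int extract_max(int arr[], int *n) {
    if (*n <= 0) return -1; // Error code
    int max_val = arr[0];
    arr[0] = arr[*n - 1];
    (*n)--;
    heapify_down(arr, *n, 0);
    return max_val;
}

// Illustrative helper: empty the heap into out[], largest first.
void drain(int heap[], int *n, int out[]) {
    int k = 0;
    while (*n > 0)
        out[k++] = extract_max(heap, n);
}
```

Draining the heap from the walkthrough, [50, 45, 30, 10], emits 50, 45, 30, 10.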
Priority Queues
It is crucial to distinguish between the Priority Queue (PQ) and the Heap. They are often used interchangeably, but they are not the same thing.
Priority Queue is the Abstract Data Type (ADT). It defines what the tool does (the interface).
Heap is the Data Structure. It defines how the tool is implemented (the guts).
Cutting the Line
A standard Queue follows the FIFO (First-In, First-Out) rule. Think of a line at a grocery store; it doesn't matter who you are, if you arrived first, you get served first.
A Priority Queue breaks this rule. Each element has a "priority" associated with it. Elements with higher priority represent VIPs who skip the line.
Standard Queue (FIFO):
Input: [Task A] -> [Task B] -> [Task C]
Output: Task A comes out first.
Priority Queue (Max-Priority):
Input: [Task A (prio=1)] -> [Task B (prio=10)] -> [Task C (prio=5)]
Output: Task B comes out first (Highest Priority).
Why Heaps are the Standard Backing
You could implement a Priority Queue using a simple array or a linked list, but they are inefficient for this specific use case. The Binary Heap is the industry standard because it offers the best trade-off.
| Implementation | Insert (Enqueue) | Extract Max (Dequeue) | Peek (Get Max) |
| --- | --- | --- | --- |
| Unsorted Array | $O(1)$ (Fast) | $O(n)$ (Slow - must scan all) | $O(n)$ |
| Sorted Array | $O(n)$ (Slow - must shift) | $O(1)$ (Fast) | $O(1)$ |
| Binary Heap | $O(\log n)$ | $O(\log n)$ | $O(1)$ |
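To make the ADT-versus-implementation distinction concrete, here is a minimal Priority Queue interface sketch backed by a binary max-heap. The type and function names (PriorityQueue, pq_push, pq_pop, pq_peek) and the fixed capacity are illustrative choices, not from any standard library:

```c
#define PQ_CAPACITY 64

// The Priority Queue ADT (the interface) backed by a binary max-heap (the guts).
typedef struct {
    int data[PQ_CAPACITY];
    int size;
} PriorityQueue;

static void pq_swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

// Enqueue: append, then bubble up. O(log n).
void pq_push(PriorityQueue *pq, int priority) {
    int i = pq->size++;
    pq->data[i] = priority;
    while (i > 0 && pq->data[i] > pq->data[(i - 1) / 2]) {
        pq_swap(&pq->data[i], &pq->data[(i - 1) / 2]);
        i = (i - 1) / 2;
    }
}

// Peek: the maximum is always at the root. O(1).
int pq_peek(const PriorityQueue *pq) {
    return pq->data[0];
}

// Dequeue: move the last element to the root, then bubble down. O(log n).
int pq_pop(PriorityQueue *pq) {
    int top = pq->data[0];
    pq->data[0] = pq->data[--pq->size];
    int i = 0;
    for (;;) {
        int largest = i, l = 2 * i + 1, r = 2 * i + 2;
        if (l < pq->size && pq->data[l] > pq->data[largest]) largest = l;
        if (r < pq->size && pq->data[r] > pq->data[largest]) largest = r;
        if (largest == i) break;
        pq_swap(&pq->data[i], &pq->data[largest]);
        i = largest;
    }
    return top;
}
```

pq_pop uses an iterative bubble-down that behaves the same as the recursive heapify_down shown earlier; either style works. Pushing priorities 1, 10, 5 and popping yields 10, then 5, then 1, matching the VIP example above.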
Heapsort
Heapsort is an elegant application of the data structure we just built. It allows us to sort an array in-place (without needing a separate output array) by leveraging the Heap's ability to always know the maximum element.
The algorithm works in two distinct phases.
Phase 1: The "Build Heap" Phase
Goal: Convert a chaotic, unsorted array into a valid Max-Heap.
We could insert elements one by one into a new heap, but that takes \(O(n \log n)\). There is a smarter way. We treat the existing array as a heap and fix it from the bottom up.
The Trick: We start at the last non-leaf node (index \(\lfloor n/2 \rfloor - 1\), using 0-based indexing) and run heapify_down on every node backwards to the root. This ensures that every subtree is a valid heap before we combine them.
Complexity: Surprisingly, this phase is $O(n)$, not \(O(n \log n)\). Intuitively, most nodes sit near the bottom of the tree and can only sink a level or two, so the total work across all nodes sums to a linear bound.
Phase 2: The "Extract & Sort" Phase
Goal: Repeatedly move the largest element to the end of the array.
Once we have a Max-Heap, we know the largest number is at arr[0].
Swap arr[0] with the last element in the heap. The largest item is now in its correct sorted position at the end of the array.
Shrink the "heap size" by 1 (locking the sorted element out of the heap).
Heapify Down the new root (which is likely small) to restore the Max-Heap property.
Repeat until the heap size is 1.
Visual Walkthrough
Let's sort [4, 10, 3, 5, 1]
1. After Build Heap (Max-Heap state):
Array: [ 10, 5, 3, 4, 1 ]
Tree:
        10
       /  \
      5    3
     / \
    4   1
2. Swap Root (10) with End (1):
We move 10 to the end and "lock" it.
Swap: [ 1, 5, 3, 4 | 10 ] <-- 10 is sorted
Heap size is now 4.
3. Heapify Down the Root (1):
The 1 sinks down, 5 floats up.
Array: [ 5, 4, 3, 1 | 10 ]
Tree (Heap part only):
        5
       /  \
      4    3
     /
    1
4. Repeat:
Swap Root (5) with current end (1).
Swap: [ 1, 4, 3 | 5, 10 ] <-- 5 and 10 are sorted
Heapify Down 1...
Result: [ 4, 1, 3 | 5, 10 ]
C Implementation
This code assumes we have the swap and heapify_down functions from earlier.
// Main Heapsort Function
void heapsort(int arr[], int n) {
// Phase 1: Build Heap
// Start from last non-leaf node and go up to root
for (int i = n / 2 - 1; i >= 0; i--) {
heapify_down(arr, n, i);
}
// Phase 2: Extraction
// One by one extract an element from heap
for (int i = n - 1; i > 0; i--) {
// Move current root (max) to end
swap(&arr[0], &arr[i]);
// Call max heapify on the reduced heap
// Note: We pass 'i' as the size, effectively ignoring the sorted elements
heapify_down(arr, i, 0);
}
}
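To check the routine end to end, here is a standalone version (swap and heapify_down repeated so it compiles on its own) applied to the walkthrough array:

```c
static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

// Sink arr[i] down until it dominates both children (same as earlier).
void heapify_down(int arr[], int n, int i) {
    int largest = i, left = 2 * i + 1, right = 2 * i + 2;
    if (left < n && arr[left] > arr[largest]) largest = left;
    if (right < n && arr[right] > arr[largest]) largest = right;
    if (largest != i) {
        swap(&arr[i], &arr[largest]);
        heapify_down(arr, n, largest);
    }
}

void heapsort(int arr[], int n) {
    // Phase 1: build a Max-Heap bottom-up from the last non-leaf node.
    for (int i = n / 2 - 1; i >= 0; i--)
        heapify_down(arr, n, i);
    // Phase 2: repeatedly move the max to the end and shrink the heap.
    for (int i = n - 1; i > 0; i--) {
        swap(&arr[0], &arr[i]);
        heapify_down(arr, i, 0);
    }
}
```

Sorting the walkthrough array [4, 10, 3, 5, 1] produces [1, 3, 4, 5, 10], in place.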
Why is this cool?
Unlike Merge Sort, Heapsort requires $O(1)$ extra space (it sorts inside the original array). Unlike Quick Sort, Heapsort guarantees \(O(n \log n)\) worst-case performance—there is no "bad pivot" that can slow it down to \(O(n^2)\).