Visualized! Intro to 7 Sorting Algorithms and their complexity analysis

After we introduced Graph Algorithm: Topological Sort, let’s now take a time to learn some more general and commonly used sorting algorithms! Actually, if you take a look at Introduction to Algorithms, you will find that sorting is the first algorithm you will learn! Time complexities of them will be appended at the end.

Here is the list of sorting algorithms that will be explained:

  • Selection Sort
  • Insertion Sort
  • Bucket Sort
  • Bubble Sort
  • Mergesort
  • Quicksort
  • Heapsort

Selection Sort

It selects the smallest item in the unsorted part of the array, and swap it with the item in the position that is to be filled.

arr = [3,1,12,1,14,2,13]
for i in range(len(arr)):
	min = i
	for j in range(i+1,len(arr)):
		if arr[j] < arr[min]:
			min = j
	arr[i],arr[min] = arr[min],arr[i]

Insertion Sort

It selects one item from the unsorted part of the array at a time and inserts it to the correct place in the sorted part. In the Python code below, it uses a for loop to loop through the item that is being inserted, and uses a while loop inside of it to find the correct place for the inserting item, during which items in the sorted part will be swapped if necessary.

arr = [3,1,12,1,14,2,13]
for i in range(len(arr)):
    j = i-1
    while j > -1 and arr[j] > arr[j+1]:
        arr[j],arr[j+1] = arr[j+1],arr[j]
        j -= 1

Bucket Sort

It first creates an array bucket with its length equals to the largest value of the input array. Then it goes through the input array and for each element with the value of val, it marked as bucket[val]+=1. Finally, it loops through the bucket and prints the index for val times. Here bucket is literally a bucket to store items with corresponding values. bucket[i]>0 means there are bucket[i] items with value of i.

arr = [3,1,12,1,14,2,13]
bucket = [0] * (max(arr)+1)
for ele in arr:
    bucket[ele] += 1

ret = []
for i,val in enumerate(bucket):
    if val > 0:
        ret += [i] * val

Bubble Sort

It keeps comparing adjacent items, and swap them if necessary. To put it another way, each time it would ‘bubble’ the largest item to the top.

arr=[3,1,12,1,14,2,13]
for i in range(len(arr)-1,-1,-1):
    for j in range(0,i):
        if arr[j] > arr[j+1]:
            arr[j],arr[j+1] = arr[j+1],arr[j]

Mergesort

Mergesort is a typical showcase of divide-and-conquer strategy. It first divides the input into two parts, sorts them individually, and then merge them. To sort the subarray, it divides them again until there’s only one element in it, which could be returned directly. For the merge process, given that the two subarrays are sorted, we could simply compare the first items of each array and pick up the smallest. A more detailed diagram is shown below:

arr = [3,1,12,1,14,2,13]
def mergesort(low=0,high=len(arr)-1):
    if low == high:
        return [arr[low]]
    else:
        mid = (low+high) // 2
        left = mergesort(low,mid)
        right = mergesort(mid+1,high)
        return merge(left,right)

def merge(l1,l2):
    """Summary
    This function merges two sorted arrays
    Args:
        l1 (int): The first sorted array
        l2 (int): The second sorted array
    
    Returns:
        list: Merged array
    """
    i = j = 0
    ret = []
    while i < len(l1) and j < len(l2):
        if l1[i] < l2[j]:
            ret.append(l1[i])
            i += 1
        else:
            ret.append(l2[j])
            j += 1
    if i < len(l1):
        ret += l1[i:]
    if j < len(l2):
        ret += l2[j:]
    return ret

mergesort()

Quicksort

Quicksort

It is another way of divide-and-conquer sorting algorithms. Each time it picks an item as pivot and partitions the array so that smaller items appear to its left and all larger items are on its right by swapping. Then it continuously sorts the left and right subarrays in the same way.

arr = [3,1,12,1,14,2,13]
def partition(low,high):
    """Summary
    This function partitions array[low:high+1] so that items 
    on pivot's left will be smaller than it, items on its right 
    will be larger than it.
    Args:
        low (int): Left boundary of the array to be partitioned
        high (int): Right boundary of the array to be partitioned
    
    Returns:
        int: The index of pivot
    """
    pivot = low
    low += 1
    while low < high:
        while low <= high and arr[low]<=arr[pivot]:
            low += 1
        while low <= high and arr[high] >= arr[pivot]:
            high -= 1
        if low < high:
            arr[low],arr[high] = arr[high],arr[low]
    arr[pivot],arr[high] = arr[high],arr[pivot]
    return high

def quicksort(low=0,high=len(arr)-1):
    if low < high:
        p = partition(low,high)
        quicksort(low,p-1)
        quicksort(p+1,high)

quicksort()

Heapsort

A max heap

What is a heap? A binary heap is a complete binary tree where parent nodes are always smaller or larger than their children. The former one is known as Min Heap and the latter one is known as Max Heap. Notably, a complete binary tree is a binary tree where each level is completely filled except the last level, and all nodes in the last level are as far left as possible (Wikipedia).

Let’s take look at functions that will be implemented for heapsort.

build_heap( )
Visualization tool: https://visualgo.net/en
heapsort( )
Visualization tool: https://visualgo.net/en
  • heapify() refers to the process of converting the tree starting at node i to a heap, provided that its subtree is heapified already.
  • build_heap() is the function that builds a heap from an array. It calls heapify() from bottom to top due to heapify()‘s nature.
  • heapsort() calls build_heap() first to build a heap. To return items in the sorted order, it swap the root with the last node and pop it. Then it calls heapify(0) to make sure that, after the swapping, the remaining tree is still a heap.
# Max-heap
def heapify(i,n):
    """
    Given the children nodes are hipified, this function would
    convert a tree starting with i to a heap
    Args:
        i (int): the index of node
        n (int): the length of the heap
    """
    if i >= n:
        return  
    c1 = 2 * i + 1
    c2 = 2 * i + 2
    maxi = i
    if c1 < n and arr[c1] > arr[maxi]:
        maxi = c1
    if c2 < n and arr[c2] > arr[maxi]:
        maxi = c2
    if maxi != i:
        arr[maxi],arr[i] = arr[i],arr[maxi]
        heapify(maxi,n)


def build_heap():
    """
    Starting from the last parent node, it keeps calling
    heapify() to convert the tree into a heap
    """
    last_node = len(arr) - 1
    parent = (last_node - 1) // 2
    for i in range(parent,-1,-1):
        heapify(i,len(arr))


def heapsort():
    """
    This function first swap the first item with the last one
    and return it. Then it calls heapify() to maintain the heap
    structure.
    It keeps doing until all are sorted.
    Note that, we put i as the length of the heap so it works like
    we 'cut off' the tail - the original first element.
    """
    build_heap()
    for i in range(len(arr)-1,-1,-1):
        arr[i],arr[0] = arr[0],arr[i]
        heapify(0,i)

arr = [3,1,12,1,14,2,13]
heapsort()

Time complexity analysis

Sorting AlgorithmsBestAverageWorst
Selection Sortn^2n^2n^2
Insertion Sortnn^2n^2
Bucket Sortnn
Bubble Sortnn^2n^2
Mergesortnlognnlognnlogn
Quicksortnlognnlognn^2
Heapsortnlognnlognnlogn
Time complexities for sorting algorithms

Further notes for Heap

Moreover, apart from the introduced buid_heap(), which heapify() from the bottom to top provided that all elements are given, there is another way to build a heap where items continuously arrive.

In this case, starting with an empty heap, it appends each new item first to the end of the heap, and adjust upwards with heap_insert() along with the path from the new node to the root.

def heap_insert(num):
    global heap
    heap += [num]
    cur = len(heap) - 1
    par = (cur-1)//2
    while par >= 0 and heap[par] < heap[cur]:
        heap[cur],heap[par] = heap[par],heap[cur]
        cur = par
        par = (cur-1)//2

def build_heap():
    """
    Starting from the last parent node, it keeps calling
    heapify() to convert the tree into a heap
    """
    for e in arr:
        heap_insert(e)

Time complexities for heap:

  • heapify()logn
  • build_heap()
    with heapify() – n
    with heap_insert() – nlogn (best case O(n))
  • heapsort()O(nlogn+n)=O(nlogn)

Reference

Subscribe
Notify of
guest
0 Comments
Inline Feedbacks
View all comments