After introducing Graph Algorithm: Topological Sort, let's now take some time to learn some of the more general and commonly used sorting algorithms! In fact, if you take a look at Introduction to Algorithms, you will find that sorting is the first algorithm covered! Their time complexities are summarized at the end.
Here is the list of sorting algorithms that will be explained:
- Selection Sort
- Insertion Sort
- Bucket Sort
- Bubble Sort
- Merge Sort
- Quicksort
- Heapsort
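All of the snippets below sort the same sample array in place. As a quick sanity check that can be reused after any of them, here is a tiny helper (the name `is_sorted` is ours, not from the original post) compared against Python's built-in `sorted()`:

```python
# A small sanity check: after running any of the in-place sorts
# below on the sample array, this should return True.
def is_sorted(a):
    return all(a[k] <= a[k+1] for k in range(len(a) - 1))

arr = [3, 1, 12, 1, 14, 2, 13]
print(is_sorted(arr))           # False before sorting
print(is_sorted(sorted(arr)))   # True afterwards
```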
Selection Sort

It selects the smallest item in the unsorted part of the array and swaps it with the item in the position that is to be filled.
```python
arr = [3,1,12,1,14,2,13]
for i in range(len(arr)):
    # find the smallest item in the unsorted part arr[i:]
    min_i = i
    for j in range(i+1, len(arr)):
        if arr[j] < arr[min_i]:
            min_i = j
    arr[i], arr[min_i] = arr[min_i], arr[i]
```
Insertion Sort

It takes one item from the unsorted part of the array at a time and inserts it into the correct place in the sorted part. In the Python code below, a `for` loop iterates over the items being inserted, and a `while` loop inside it finds the correct place for each inserted item, swapping items in the sorted part along the way as necessary.
```python
arr = [3,1,12,1,14,2,13]
for i in range(len(arr)):
    # move arr[i] leftwards until it reaches its sorted position
    j = i - 1
    while j > -1 and arr[j] > arr[j+1]:
        arr[j], arr[j+1] = arr[j+1], arr[j]
        j -= 1
```
Bucket Sort

It first creates an array `bucket` whose length is one more than the largest value in the input array. Then it goes through the input array, and for each element with value `val` it increments `bucket[val]`. Finally, it loops through `bucket` and, for each index `i`, outputs `i` for `bucket[i]` times. Here `bucket` is literally a bucket that counts items with the corresponding values: `bucket[i] > 0` means there are `bucket[i]` items with value `i`.
```python
arr = [3,1,12,1,14,2,13]
bucket = [0] * (max(arr)+1)    # one counting slot per possible value
for ele in arr:
    bucket[ele] += 1
ret = []
for i, val in enumerate(bucket):
    if val > 0:
        ret += [i] * val       # emit each value as many times as it occurred
```
Bubble Sort

It keeps comparing adjacent items and swaps them if necessary. To put it another way, each pass 'bubbles' the largest remaining item to the top.
```python
arr = [3,1,12,1,14,2,13]
for i in range(len(arr)-1, -1, -1):
    # after each pass, arr[i] holds the largest item of arr[:i+1]
    for j in range(0, i):
        if arr[j] > arr[j+1]:
            arr[j], arr[j+1] = arr[j+1], arr[j]
```
Merge Sort

Mergesort is a typical showcase of the divide-and-conquer strategy. It first divides the input into two parts, sorts them individually, and then merges them. To sort a subarray, it divides it again until only one element remains, which can be returned directly. For the merge step, given that the two subarrays are sorted, we can simply keep comparing the first remaining items of each array and picking the smaller one.
```python
arr = [3,1,12,1,14,2,13]

def mergesort(low=0, high=len(arr)-1):
    if low == high:
        return [arr[low]]
    mid = (low + high) // 2
    left = mergesort(low, mid)
    right = mergesort(mid+1, high)
    return merge(left, right)

def merge(l1, l2):
    """Merge two sorted arrays.

    Args:
        l1 (list): The first sorted array
        l2 (list): The second sorted array

    Returns:
        list: Merged array
    """
    i = j = 0
    ret = []
    while i < len(l1) and j < len(l2):
        if l1[i] < l2[j]:
            ret.append(l1[i])
            i += 1
        else:
            ret.append(l2[j])
            j += 1
    if i < len(l1):
        ret += l1[i:]
    if j < len(l2):
        ret += l2[j:]
    return ret

mergesort()
```
Quicksort

It is another divide-and-conquer sorting algorithm. Each time it picks an item as the pivot and, by swapping, partitions the array so that smaller items appear on the pivot's left and larger items on its right. It then recursively sorts the left and right subarrays in the same way.
```python
arr = [3,1,12,1,14,2,13]

def partition(low, high):
    """Partition arr[low:high+1] so that items on the pivot's left
    are smaller than it and items on its right are larger.

    Args:
        low (int): Left boundary of the array to be partitioned
        high (int): Right boundary of the array to be partitioned

    Returns:
        int: The index of the pivot
    """
    pivot = low
    low += 1
    while low <= high:
        # advance past items that are already on the correct side
        while low <= high and arr[low] <= arr[pivot]:
            low += 1
        while low <= high and arr[high] >= arr[pivot]:
            high -= 1
        if low < high:
            arr[low], arr[high] = arr[high], arr[low]
    # place the pivot at its final position
    arr[pivot], arr[high] = arr[high], arr[pivot]
    return high

def quicksort(low=0, high=len(arr)-1):
    if low < high:
        p = partition(low, high)
        quicksort(low, p-1)
        quicksort(p+1, high)

quicksort()
```
Heapsort

What is a heap? A binary heap is a complete binary tree where parent nodes are always no larger than their children, or always no smaller. The former is known as a Min Heap and the latter as a Max Heap. Notably, a complete binary tree is a binary tree in which every level is completely filled except possibly the last, and all nodes in the last level are as far left as possible (Wikipedia).
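Because the tree is complete, it can be stored compactly in an array, and the parent/child relations reduce to index arithmetic. A small sketch (the helper names here are ours, not part of the heapsort code below):

```python
# In the array form of a complete binary tree:
#   the children of node i sit at 2*i + 1 and 2*i + 2
#   the parent of node i sits at (i - 1) // 2
def left(i):   return 2 * i + 1
def right(i):  return 2 * i + 2
def parent(i): return (i - 1) // 2

def is_max_heap(a):
    """Check the Max Heap property: every parent >= its children."""
    return all(a[parent(i)] >= a[i] for i in range(1, len(a)))

print(is_max_heap([14, 3, 13, 1, 1, 2, 12]))  # True
print(is_max_heap([3, 1, 12, 1, 14, 2, 13]))  # False
```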
Let's take a look at the functions that will be implemented for heapsort.
`heapify()` refers to the process of converting the tree rooted at node `i` into a heap, provided that its subtrees are heapified already.

`build_heap()` is the function that builds a heap from an array. It calls `heapify()` from bottom to top, since `heapify()` requires the subtrees to be heapified first.

`heapsort()` calls `build_heap()` first to build a heap. To return items in sorted order, it swaps the root with the last node and pops it off. Then it calls `heapify(0)` to make sure that, after the swap, the remaining tree is still a heap.
```python
# Max-heap
def heapify(i, n):
    """Given that the children's subtrees are heapified, convert
    the tree rooted at i into a heap.

    Args:
        i (int): the index of the node
        n (int): the length of the heap
    """
    if i >= n:
        return
    c1 = 2 * i + 1
    c2 = 2 * i + 2
    maxi = i
    if c1 < n and arr[c1] > arr[maxi]:
        maxi = c1
    if c2 < n and arr[c2] > arr[maxi]:
        maxi = c2
    if maxi != i:
        arr[maxi], arr[i] = arr[i], arr[maxi]
        heapify(maxi, n)

def build_heap():
    """Starting from the last parent node, keep calling heapify()
    to convert the tree into a heap.
    """
    last_node = len(arr) - 1
    parent = (last_node - 1) // 2
    for i in range(parent, -1, -1):
        heapify(i, len(arr))

def heapsort():
    """First swap the root with the last item, which moves the
    largest item to the end. Then call heapify() to maintain the
    heap structure, and repeat until all items are sorted. Note
    that we pass i as the length of the heap, so it works as if
    we 'cut off' the tail - the item just moved there.
    """
    build_heap()
    for i in range(len(arr)-1, -1, -1):
        arr[i], arr[0] = arr[0], arr[i]
        heapify(0, i)

arr = [3,1,12,1,14,2,13]
heapsort()
```
Time complexity analysis

Average-case time complexities of the algorithms above:

- Selection Sort: O(n^2)
- Insertion Sort: O(n^2)
- Bucket Sort: O(n + k), where k is the largest value in the array
- Bubble Sort: O(n^2)
- Merge Sort: O(n log n)
- Quicksort: O(n log n) (worst case O(n^2))
- Heapsort: O(n log n)
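As a rough empirical illustration of the difference between quadratic and linearithmic growth (a sketch, not a rigorous benchmark; absolute timings depend on the machine), `timeit` can compare a quadratic sort against Python's built-in O(n log n) sort:

```python
import random
import timeit

def bubble(a):
    """Bubble sort, as above, returning the list for convenience."""
    for i in range(len(a) - 1, -1, -1):
        for j in range(i):
            if a[j] > a[j+1]:
                a[j], a[j+1] = a[j+1], a[j]
    return a

for n in (200, 400, 800):
    data = [random.random() for _ in range(n)]
    t_bubble = timeit.timeit(lambda: bubble(list(data)), number=3)
    t_builtin = timeit.timeit(lambda: sorted(data), number=3)
    print(f"n={n}: bubble {t_bubble:.4f}s, built-in {t_builtin:.4f}s")
# Doubling n roughly quadruples the bubble-sort time (O(n^2)),
# while the built-in sort grows only slightly faster than linearly.
```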
Further notes for Heap
Moreover, apart from the introduced bottom-up `heapify()`, which requires all elements to be given up front, there is another way to build a heap when items arrive continuously. In this case, starting with an empty heap, it appends each new item to the end of the heap and adjusts upwards with `heap_insert()` along the path from the new node to the root.
```python
arr = [3,1,12,1,14,2,13]
heap = []

def heap_insert(num):
    """Append num to the end of the heap, then sift it upwards
    along the path to the root until the heap property holds.
    """
    global heap
    heap += [num]
    cur = len(heap) - 1
    par = (cur - 1) // 2
    while par >= 0 and heap[par] < heap[cur]:
        heap[cur], heap[par] = heap[par], heap[cur]
        cur = par
        par = (cur - 1) // 2

def build_heap():
    """Build a heap by inserting the items one at a time."""
    for e in arr:
        heap_insert(e)
```
Time complexities for building a heap:

- with heapify(): O(n)
- with heap_insert(): O(n log n) (best case O(n))
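For completeness, Python's standard library ships both strategies in `heapq` (which maintains a min-heap): `heapq.heapify()` builds a heap bottom-up in O(n), while repeated `heapq.heappush()` is the O(n log n) insertion route. A minimal sketch:

```python
import heapq

arr = [3, 1, 12, 1, 14, 2, 13]

# O(n): bottom-up build, analogous to build_heap() with heapify()
h1 = list(arr)
heapq.heapify(h1)

# O(n log n): one insertion at a time, analogous to heap_insert()
h2 = []
for x in arr:
    heapq.heappush(h2, x)

# Both satisfy the min-heap property; popping everything off the
# heap yields the items in sorted order.
print([heapq.heappop(h1) for _ in range(len(h1))])
# [1, 1, 2, 3, 12, 13, 14]
```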
- Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (2009). Introduction to algorithms. MIT press.
- CS 15-121, Introduction to Data Structures, Carnegie Mellon University