Searching#

Searching is the algorithmic process of finding a particular item in a collection of items. A search typically answers either True or False as to whether the item is present. On occasion it may be modified to return where the item is found.

Sorting#

Sorting is the process of placing elements from a collection in some kind of order.

Example

  1. A list of words could be sorted alphabetically or by length.

  2. A list of cities could be sorted by population, by area, or by zip code.

There are many, many sorting algorithms that have been developed and analyzed, sorting is an important area of study in computer science. Sorting a large number of items can take a substantial amount of computing resources. Like searching, the efficiency of a sorting algorithm is related to the number of items being processed.

For small collections, a complex sorting method may be more trouble than it is worth. The overhead may be too high. On the other hand, for larger collections, we want to take advantage of as many improvements as possible, we will discuss several sorting techniques and compare them with respect to their running time.

Bubble Sort#

The bubble sort makes multiple passes through a list. It compares adjacent items and exchanges those that are out of order. Each pass through the list places the next largest value in its proper place. In essence, each item β€œbubbles” up to the location where it belongs.

To analyze the bubble sort, we should note that regardless of how the items are arranged in the initial list, nβˆ’1 passes will be made to sort a list of size n. The total number of comparisons is the sum of the first nβˆ’1 integers. The sum of the first nβˆ’1 integers is n*(n-1)/2. This is still O(n^2) comparisons. In the best case, if the list is already ordered, no exchanges will be made. However, in the worst case, every comparison will cause an exchange. On average, we exchange half of the time.

A bubble sort is often considered the most inefficient sorting method since it must exchange items before the final location is known. These β€œwasted” exchange operations are very costly.

def bubble_sort(arr):
    n = len(arr)

    for i in range(n):
        swaps = False
        for j in range(n - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swaps = True 
        if not swaps:
            return arr
    
    return arr

arr = [5, 4, 1, 2, 3, 9, 8, 7, 6]
print(bubble_sort(arr))
[1, 2, 3, 4, 5, 6, 7, 8, 9]

Selection Sort#

The selection sort improves on the bubble sort by making only one exchange for every pass through the list. In order to do this, a selection sort looks for the smallest value as it makes a pass and, after completing the pass, places it in the proper location. As with a bubble sort, after the first pass, the smallest item is in the correct place. After the second pass, the next smallest is in place. This process continues and requires nβˆ’1 passes to sort n items, since the final item must be in place after the (nβˆ’1) st pass.

def selection_sort(arr):
    n = len(arr)

    for i in range(n - 1):
        min_index = i
        for j in range(i + 1, n):
            if arr[min_index] > arr[j]:
                min_index = j
        arr[i], arr[min_index] = arr[min_index], arr[i]
    
    return arr

arr = [5, 4, 1, 2, 3, 9, 8, 7, 6]
print(selection_sort(arr))
[1, 2, 3, 4, 5, 6, 7, 8, 9]

Insertion Sort#

The insertion sort, although still O(n^2), works in a slightly different way. It always maintains a sorted sublist in the lower positions of the list. Each new item is then β€œinserted” back into the previous sublist such that the sorted sublist is one item larger.

def insertion_sort(arr):
    n = len(arr)

    for i in range(1, n):
        for j in range(i, 0, -1):
            if arr[j] > arr[j - 1]:
                break
            if arr[j - 1] > arr[j]:
                arr[j - 1], arr[j] = arr[j], arr[j - 1]
    return arr

arr = [5, 4, 1, 2, 3, 9, 8, 7, 6]
print(insertion_sort(arr))
[1, 2, 3, 4, 5, 6, 7, 8, 9]

Merge Sort#

Merge sort is a recursive algorithm that continually splits a list in half. If the list is empty or has one item, it is sorted by definition (the base case). If the list has more than one item, we split the list and recursively invoke a merge sort on both halves. Once the two halves are sorted, the fundamental operation, called a merge, is performed. Merging is the process of taking two smaller sorted lists and combining them together into a single, sorted, new list.

def merge_sort(arr):
    n = len(arr)

    if n <= 1:
        return arr

    left_sub_arr = merge_sort(arr[:n // 2])
    right_sub_arr = merge_sort(arr[n // 2:])

    arr = []
    i, j = 0, 0
    
    while i < len(left_sub_arr) and j < len(right_sub_arr):
        if left_sub_arr[i] <= right_sub_arr[j]:
            arr.append(left_sub_arr[i])
            i += 1
        elif left_sub_arr[i] > right_sub_arr[j]:
            arr.append(right_sub_arr[j])
            j += 1
        
    return arr + left_sub_arr[i:] + right_sub_arr[j:]

arr = [5, 4, 1, 2, 3, 9, 8, 7, 6]
print(merge_sort(arr))
[1, 2, 3, 4, 5, 6, 7, 8, 9]

Quick Sort#

The quick sort uses divide and conquer to gain the same advantages as the merge sort, while not using additional storage. As a trade-off, however, it is possible that the list may not be divided in half. When this happens, we will see that performance is diminished.

A quick sort first selects a value, which is called the pivot value. Although there are many different ways to choose the pivot value, we will simply use the first item in the list. The role of the pivot value is to assist with splitting the list. The actual position where the pivot value belongs in the final sorted list, commonly called the split point, will be used to divide the list for subsequent calls to the quick sort.

from random import randrange

def quick_sort(arr):
    n = len(arr)

    if n <= 1:
        return arr

    pivot_index = randrange(n)
    pivot_value = arr[pivot_index]
    
    arr[-1], arr[pivot_index] = arr[pivot_index], arr[-1]

    pivot_index = 0

    for i in range(n - 1):

        if arr[i] <= pivot_value:
            arr[i], arr[pivot_index] = arr[pivot_index], arr[i]
            pivot_index += 1
    
    arr[-1], arr[pivot_index] = arr[pivot_index], arr[-1]

    return quick_sort(arr[:pivot_index]) + [arr[pivot_index]] + quick_sort(arr[pivot_index + 1:])

arr = [5, 4, 1, 2, 3, 9, 8, 7, 6]
print(quick_sort(arr))
[1, 2, 3, 4, 5, 6, 7, 8, 9]