Sorting Algorithms

There are many sorting algorithms, each of which has its variations. In this course, we look at three of those algorithms: insertion sort, mergesort, and quicksort. Don't forget to read about these algorithms and study the C code provided in Sedgewick's Algorithms book.

Insertion Sort

To sort a list of items, you move through the list from left to right and insert one element at a time into its proper position among the elements to its left. Working through an example is the best way to see how this algorithm works. Let's sort the following list of numbers:

5 3 8 2 9 4 1 7

Start with the second element: 3. (store it in a temporary variable.)
Compare it with the element to its left: 5.
Since 3 is less than 5, continue scanning to the left.
Since 5 is the leftmost element, stop.
Shift the 5 into the 2nd position.
Insert the 3 into the 1st position (we have it stored in a temp. var.)
So, at the end of the first stage, we have the following:

3 5 8 2 9 4 1 7

Now, start with the third element: 8.
Compare it with the element to its left: 5.
Since 8 is greater than 5, stop.
The 8 is already in place.
The list is unchanged after this second stage:

3 5 8 2 9 4 1 7

Now, start with the fourth element: 2.
Compare it with the element to its left: 8.
Since 2 is less than 8, continue scanning to the left.
Since 2 is less than 5, continue scanning to the left.
Since 2 is less than 3, continue scanning to the left.
Since 3 is the leftmost element, stop.
Shift the 3, 5, and 8 to the right (one position each.)
Insert the 2 into its new position at the start of the list.
After the third stage, the list looks like this:

2 3 5 8 9 4 1 7

Now, start with the fifth element: 9.
Compare it with the element to its left: 8.
Since 9 is greater than 8, no further scanning is necessary.
The 9 is already in its proper position.
The list remains unchanged after this stage:

2 3 5 8 9 4 1 7

Now, start with the sixth element: 4.
Compare it with the element to its left: 9.
Since 4 is less than 9, continue scanning to the left.
Since 4 is less than 8, continue scanning to the left.
Since 4 is less than 5, continue scanning to the left.
Since 4 is greater than 3, stop.
Shift the 5, 8, and 9 to the right (one position each.)
Insert the 4 into the position previously occupied by the 5: the third.
After this stage, the list looks like this:

2 3 4 5 8 9 1 7

Now, start with the seventh element: 1
This time, we'll have to scan all the way back to the beginning.
Shift all of the elements in positions 1 through 6 to the right.
Insert the 1 into the first position:

1 2 3 4 5 8 9 7

Last stage: we need to insert the 7 into its proper position.
Shift the 8 and the 9 to the right.
Insert the 7 into the correct position:

1 2 3 4 5 7 8 9

Try sorting this string of letters (alphabetically) "S O R T M E" answer

Note that the running time of this algorithm is N2. If the list to be sorted is already in sorted or almost-sorted order, the running time will be closer to N.

Mergesort

Mergesort requires extra space: to sort an array of N elements, you need a second array also of size N for use during the algorithm. The key to understanding mergesort is realizing that if you have two lists, one with size X and one with size Y, and both of those lists are each in sorted order, then you can create a new list (of size X + Y) that contains the sorted combination of the two lists with a simple merging algorithm that has running time proporational to X + Y.

To accomplish this merging, you need to traverse through both the lists simultaneously, adding elements to the new list as you go. Here's an example with explanation.

List A: 1 4 7 8 9
List B: 2 3 5 6 10

Now, we will build up the new list, the combination of A & B, by comparing pairs of elements in the lists. First, compare 1 & 2. Since 1 is less than 2, insert 1 into the new list. Now, compare the next element of list A: 4 with the current element of list B: 2. Since 2 is less than 4, insert 2 into the new list. Now, compare the current element of list A with the next element of list B. (We look at the next element of list B because that was the list from which we chose an element, during the last step.) We'll continue through the two lists, merging the elements together into one list. We'll move through each list exactly once, giving us a running time proportional to the sum of the lengths of the lists.

The mergesort algorithm is pretty simple:

  1. divide the list in half
  2. mergesort the left half
  3. mergesort the right half
  4. merge the two halves together

Notice that the algorithm is recursive - the left and right halves are sorted recursively (with the same mergesort algorithm.) Since you can divide a list in half at most log-base-2 times, there will be [log N] levels of recursion. At each level, the merging can be done in linear time, so this algorithm is N log N, no matter what order the list is in initially. The extra storage is necessary during the merging step.

Quicksort

Quicksort is also an N log N algorithm that works by recursively dividing the list to be sorted (roughly) in half at each stage. Unlike the method of mergesort, quicksort does not divide the list precisely in half. Instead, quicksort involves a partitioning algorithm. This partitioning algorithm results in one of the elements - designated as the 'pivot' element - being placed in its proper position. That is, all the elements to the left of it will be less than it, and all the elements to the right of it will be greater than it. Notice the similarity between this algorithm and the fundamental property of binary search trees. (Review this if you've forgotten!) Here's the basic algorithm:

  1. choose the pivot element - we'll use the last element in the list
  2. start scanning from the left of the list
  3. stop at the first element that's out of place (an out of place element is one that has value greater than the pivot element.)
  4. start scanning from the right end of the list (the element directly to the left of the pivot element.)
  5. stop at the first element that's out of place (an out of place element is one that has value less than the pivot element.)
  6. exchange the two elements you've stopped at.
  7. start scanning from left to right, starting at the place you stopped before, stopping when you reach an out of place element
  8. start scanning from right to left, starting at the place you stopped before, stopping when you reach an out of place element
  9. switch these two new elements (you just stopped at)
  10. repeat steps 7 - 9 until the left and right scanning meets.
  11. place the pivot element in this middle location by exchanging it with the element there (the element you exchange the pivot element with should have value greater than the pivot element.)

Notice that you do not move the pivot element until the very end of this partioning algorithm. At the end of the partioning, the pivot element is in its correct, sorted position. Every element to the left of it has lesser value. Every element to the right of it has greater value. The quicksort algorithm proceeds with the recursive sorting of both these halves.

Here's an example to illustrate the partioning algorithm:

list of numbers to sort: 45 78 23 34 11 98 36 62 50

Step 1: 50 is the pivot element. This the is the number that we will compare every other element in the array to once (notice that this means that there are N-1 comparisons, so the partitioning algorithm is linear: O(N))

Start scanning from the left: leave 45 in place, since it's less than 50. Stop at 78, since 78 is out of place.

Start scanning from the right: 62 is fine, since it's greater than 50. Stop at 36, which is less than 50.

Switch 78 and 36. The list now looks like this: 45 36 23 34 11 98 78 62 50

We resume scanning from left to right with the element 23. The elements 23, 34, and 11 are all fine, so we stop at 98. 98 is the element we start scanning at with the right to left scanning. We have now met in the middle. So, 98 is the element we'll switch with 50. The list will look like this: 45 36 23 34 11 50 78 62 98.

Notice that all of the elements to the left of 50 have lesser value, and all the elements to the right of it have greater value. The algorithm will continue recursively: quicksort the list: 45 36 23 34 11 and quicksort the list: 78 62 98.

When the input elements are randomly distributed, we expect to divide the list roughly in half each time, so there will be log N stages. The partioning algorithm takes O(N) time, so this is an N lg N algorithm, in the average-case.

If the input elements are not randomly distributed, the running time may be worse. Consider the case in which all the input elements are already sorted (or in reverse order.) Each time you partition the elements, the pivot element will not divide the elements in half. If the list isn't divided in half each time, there could potentially be N stages instead of log N stages, making the worst-case running time for this algorithm N2