We saw a nice technique for solving problems: the Greedy approach. It is time for another powerful approach to solving problems: the divide-and-conquer technique. The basic strategy is to take a large size problem and break it down into a number of smaller problems; solve them; and, then combine the individual results to get the complete solution. These algorithms are also quite useful when we can put more than one computer on the job as each of the smaller problems can be solved on individual computers and then recombined.
Let us arrange the same list of eight numbers, from the basic algorithms section, in ascending order.
52 | 12 | 17 | 49 | 73 | 24 | 92 | 62 |
Merge Sort is the most intuitive divide-and-conquer algorithm and everyone without exception comes up with it when sorting problem is posed as follows. P number of people are together asked to arrange a list of N items in order. The solution: divide the N items equally among P people; each one arranges his/her share of roughly N/P items; finally, combine the individually sorted lists.
Merge Sort takes this idea to the extreme. Divide the original list into sublists (sometimes called runs) such that each sublist contains only one item (in which case, it is already sorted!) or two items (in which case, one comparison sorts the sublist).
Merge procedure
Input: Two lists F and GNote the two highlighted lines in the algorithm for Merge Sort. These are the statements that solve the same sorting problem but on smaller lists and such statements are always found in divide-and-conquer algorithms.
Divide-and-conquer algorithms are not as straightforward to analyse as the earlier, simpler ones. We first start Merge Sort with the single input list of 8 items. As S is not equal to 1 in Step 1, the list gets split into two lists F and G, each of size 4 items. We are doing Merge Sort on F in Step 2 (highlighted). Again, for this list, S is not equal to 1 and it gets split into two lists of size 2 items each. This happens until we have 8 sublists each of size 1 (shown below).
52 (A) |
12 (B) |
17 (C) |
49 (D) |
73 (E) |
24 (F) |
92 (G) |
62 (H) |
Once these eight sublists are formed, the Merge procedure (Step 4 of Merge Sort) comes into play. It first merges the pairs of lists (A, B), (C, D), (E, F), and (G, H). The procedure Merge procedure is followed for merging the sublists. At the end of the first Merge round, we get the following four lists:
12, 52 (AB) |
17, 49 (CD) |
24, 73 (EF) |
62, 92 (GH) |
12, 17, 49, 52 (ABCD) |
24, 62, 73, 92 (EFGH) |
12 | 17 | 24 | 49 | 52 | 62 | 73 | 92 |
Notice carefully that all the comparisons are done only in the Merge procedure and nowhere else! As we split the input list into 8 sublists of one item each, there was no need for any comparisons.
Let us count the number of comparisons in Merge procedure. Each item is compared once when we merge the sublist in which it is present, with another sublist. Each item is merged into three sublists. For example, the item 17 is initially in (C). It is merged once to create (CD) and again for (ABCD) and then for the final list. Therefore, each item is compared three times or $\log_2{8}$ times. The total number of comparisons is \[3\times8=\fbox{24}\] comparisons. Make sure you understand this argument - it is critical!
It looks as if Merge Sort took more comparisons for our 8 item list. True, but see what happens as we work with larger input lists.
Merge sort takes $1000\times\log_2{1000} = 10,000$ comparisons for an input list of size 1000. Compare this against how many we needed for Selection or Bubble Sort!
What about a list of 100,000 items? If we approximate $\log_2{100,000} = 16$, then Merge Sort makes 1,600,000 comparisons for sorting the list. Bubble Sort took nearly 5 billion comparisons! So,
Merge Sort is more than 3000 times faster.
Now we come to one of the most beautiful algorithms in Computer Science and one that is considered a classic, the Quick Sort. Quick Sort is also a divide-and-conquer algorithm that splits a given list of items into smaller lists and individually sorts them. It is similar to Merge Sort in the overall strategy but differs from it in a significant way.
Quick and Merge sorts are complementary in some ways. Merge Sort spends almost no time in splitting the list into sublists: it simply divides the list into two equal halves each time. Its comparisons occur in the Merge step which is time consuming. Quick Sort spends a lot of time in splitting the list into two parts such that one part contains small items and the other, large items. Its merge step is trivial: in fact, there is no real need for a merge.
The basic idea behind Quick Sort is to define a pivot item. All those items smaller than the pivot are moved to the left and those that are larger, to the right. Once the moves are completed, the list is split into two lists at the pivot. As the two sublists are already split in such a way that one contains only small items and the other, larger, there is no need for any merge step. The algorithm is given below. Note again the divide-and-conquer nature of the algorithm in Steps 3.3 and 3.4 in the Quick Sort algorithm.
Partition procedure
Input: One list L(end of partition procedure)
So, let us see how Quick Sort works. We first write the algorithm and start with the same list of eight numbers again.
52 | 12 | 17 | 49 | 73 | 24 | 92 | 62 |
The simplest version of Quick Sort chooses the first item as the pivot. Here the pivot is 52. Initially, Quick Sort is called with left as 0 and right as 7. When the partition procedure is called, llimit is initially 0 and rlimit is 7. llimit stays at 0 because the first element is the pivot and pivot is not less than pivot! rlimit decrements to 5 because 62 and 92 are both greater than the pivot (52) while 24 is smaller. Then Step 3.4 causes 52 to swap with 24 and we get
24 | 12 | 17 | 49 | 73 | 52 | 92 | 62 |
Step 3 starts again. Now, llimit stops at 4 in Step 3.1 because 12, 17 and 49 are smaller than the pivot while 73 is not. rlimit does not move and remains at 5 because 52 is itself the pivot. Step 3.4 swaps 73 and 52. The resulting list is
24 | 12 | 17 | 49 | 52 | 73 | 92 | 62 |
Quick Sort splits the original list into two sublists F and S as shown below. You can see that the list F contains only numbers smaller than 52 while S comprises entirely of numbers larger than 52.
24 | 12 | 17 | 49 |
73 | 92 | 62 |
The procedure continues now with splitting list F with pivot as 24 and also splitting list G with pivot as 73. At the end, the entire list is sorted.
The analysis of Quick Sort is almost the same as that of Merge Sort. Note that there is no merge step and no comparisons are made except in the partition procedure. The partitioning splits the original list $\log_2{N}$ times. In each split, the items are compared once each to result in at most $\log_2{N}$ comparisons per item. Therefore the total number of comparisons is $N\times\log_2{N}$.