b. Write a short paragraph that explains exactly why this third version of bubble sort will work.
c. Does this third version of bubble sort change the worst-case analysis?
Give an analysis or justification for your answer.
d. This third version of bubble sort does change the average-case analysis.
Give a detailed explanation of what is involved in calculating this new average-case analysis.
6. Develop a formal argument that proves that the largest element must be in the correct place after the first pass of the BubbleSort loop.
7. Develop a formal argument that proves that if there are no swaps done on any pass of BubbleSort, the list must now be in the correct order.
passes = ⌊lg N⌋
while (passes ≥ 1) do
   increment = 2^passes - 1
   for start = 1 to increment do
      InsertionSort( list, N, start, increment )
   end for
   passes = passes - 1
end while
The variable increment gives the spacing between the elements of the sublist. (In Fig. 3.1, the increments used are 8, 4, 2, and 1.) In the algorithm, we start with an increment that is 1 less than the largest power of 2 that is smaller than the size of the list. So, if our list has 1000 elements, our first increment will be 511. The increment also indicates the number of sublists that we have. If our first sublist has the elements in locations 1 and 1 + increment, the last sublist has to start in location increment. The last time the while loop is executed, passes will have a value of 1, which will make increment 1 for the last InsertionSort.
(a) Pass 1   16 7 10 1 13 11 3 8 14 4 2 12 6 5 9 15
(b) Pass 2   14 4 2 1 5 11 3 8 16 7 10 12 6 13 9 15
(c) Pass 3   5 4 2 1 6 7 3 8 14 11 9 12 16 13 10 15
(d) Pass 4   2 1 3 4 5 7 6 8 9 11 10 12 14 13 16 15
■ FIGURE 3.1 The four passes of a shellsort
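The loop structure described above can be sketched in Python. This is only one possible reading of the pseudocode: the helper name insertion_sort_strided is hypothetical, the list is indexed from 0 rather than 1, and the number of passes is taken to be the floor of lg N.

```python
def insertion_sort_strided(items, start, increment):
    """Insertion-sort items[start], items[start + increment], ... in place."""
    for i in range(start + increment, len(items), increment):
        value = items[i]
        j = i - increment
        # Shift larger sublist elements up by one sublist position.
        while j >= start and items[j] > value:
            items[j + increment] = items[j]
            j -= increment
        items[j + increment] = value


def shellsort(items):
    """Sort items in place using increments of the form 2^passes - 1."""
    n = len(items)
    passes = n.bit_length() - 1 if n > 0 else 0  # floor(lg n)
    while passes >= 1:
        increment = 2 ** passes - 1
        # The text numbers sublists 1..increment; here they are 0..increment-1.
        for start in range(increment):
            insertion_sort_strided(items, start, increment)
        passes -= 1
    return items
```

Because the final pass always uses an increment of 1, it is an ordinary insertion sort over the whole list, so the result is sorted regardless of what the earlier passes did.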
The analysis of this algorithm depends on the analysis that we did of InsertionSort. Before we begin the analysis of Shellsort, recall that in Section 3.1, we saw that for a list with N elements the worst case for insertion sort was $(N^2 - N)/2$ and the average case for insertion sort was $N^2/4$.
■ 3.3.1 Algorithm Analysis
In this analysis, we will first determine the number of times we call the InsertionSort function and the number of elements in the lists for those calls. Let’s look at the specific case when the list size is 15. On the first pass, increment is 7, and so we make seven calls with lists of size 2. On the second pass, increment is 3, and so we make three calls with lists of size 5. On the third and last pass, increment is 1, and so we make one call with a list of size 15. From the formulas above, we see that for a list of size 2, InsertionSort will do one comparison in the worst case. For a list of size 5, it will do 10 comparisons in the worst case. For a list of size 15, it will do 105 comparisons in the worst case. If we add all of this together, we get a total of 142 comparisons (7 * 1 + 3 * 10 + 1 * 105). But is this a good estimate?
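The arithmetic above can be reproduced with a short Python sketch. The function names are hypothetical, and the sketch follows the text in approximating every sublist on a pass as having N // increment elements (the exact sizes can differ by one when increment does not divide N evenly).

```python
def insertion_worst_case(n):
    """Worst-case comparisons for insertion sort on n elements: (n^2 - n) / 2."""
    return (n * n - n) // 2


def shellsort_worst_estimate(n):
    """Sum the insertion sort worst cases over all calls, pass by pass."""
    total = 0
    passes = n.bit_length() - 1 if n > 0 else 0  # floor(lg n)
    while passes >= 1:
        increment = 2 ** passes - 1
        sublist_size = n // increment  # approximation used in the text
        total += increment * insertion_worst_case(sublist_size)
        passes -= 1
    return total


print(shellsort_worst_estimate(15))  # 7 * 1 + 3 * 10 + 1 * 105 = 142
```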
If you look back at the analysis of Section 3.1.1, you will see that we said the worst case for insertion sort occurs when each element to be added has to be put at the front of the list. On the last pass of our Shellsort algorithm, we know that this worst case cannot possibly occur because of the sorting that occurred in the earlier passes. Maybe a different approach will help us figure out how much work is left.
When analyzing sorting algorithms, we will sometimes consider the number of inversions in a list. An inversion is a pair of elements in the list that are out of order. For example, the list [3, 2, 4, 1] has four inversions, namely, (3, 2), (3, 1), (2, 1), and (4, 1). You should see that a list in reverse order has the largest possible number of inversions: $(N^2 - N)/2$.
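As a small illustration of this definition, the following Python sketch counts inversions by checking every pair of positions. The function name count_inversions is hypothetical, and the quadratic pair check is chosen for clarity rather than speed.

```python
def count_inversions(items):
    """Count pairs (i, j) with i < j and items[i] > items[j]."""
    count = 0
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] > items[j]:
                count += 1
    return count


print(count_inversions([3, 2, 4, 1]))            # 4
print(count_inversions(list(range(10, 0, -1))))  # (10^2 - 10) / 2 = 45
```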
One way to look at the work a sorting algorithm does is to count the number of inversions between the current permutation of the elements and a sorted list. Each swap of elements will remove one or more of these inversions. For example, when bubble sort does a comparison and finds two adjacent elements out of order, it switches them, removing just one inversion. The same is true for insertion sort because the movement of each larger element up one location in the list is the removal of one inversion between it and the element we are inserting. So, in bubble and insertion sort ($O(N^2)$ sorts) each comparison can result in the removal of exactly one inversion.
Because shellsort relies on insertion sort, it would seem that its analysis must be the same, but when you consider that shellsort looks at sublists that are interleaved with each other, one comparison can cause a swap that removes more than just one inversion. On the first pass of Fig. 3.1, we compared 16 and 14, and because they were out of order they were swapped. By moving 16 from the first location to the ninth we removed 7 inversions of 16 with the values in locations 2 through 8 of the list. The analysis of shellsort gets complicated because that same swap moved 14 from the ninth location to the first and created seven new inversions, so that comparison didn’t help at all. If you look at the swap of 7 and 4, you see the same thing. But overall, there are improvements. On the first pass, we did eight comparisons and removed 36 inversions. On the second pass, we did 16 comparisons and removed 20 inversions. On the third pass, we did 19 comparisons and removed 24 inversions. And on the last pass, we did 19 comparisons and removed the last 10 inversions. This is a total of 62 comparisons. If we just considered the average cases for the insertion sort calls that we did, we would still calculate 152 comparisons.
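The difference between an adjacent swap and a swap across a larger increment can be seen on a small list. The Python sketch below is illustrative only: the helper names and the four-element list are made up for this example and are not taken from Fig. 3.1.

```python
def count_inversions(items):
    """Count pairs (i, j) with i < j and items[i] > items[j]."""
    return sum(
        1
        for i in range(len(items))
        for j in range(i + 1, len(items))
        if items[i] > items[j]
    )


def inversions_removed_by_swap(items, i, j):
    """Return how many inversions disappear when positions i and j are swapped."""
    swapped = list(items)
    swapped[i], swapped[j] = swapped[j], swapped[i]
    return count_inversions(items) - count_inversions(swapped)


example = [4, 2, 3, 1]
print(inversions_removed_by_swap(example, 0, 1))  # adjacent swap: 1
print(inversions_removed_by_swap(example, 0, 3))  # swap three apart: 5
```

A negative return value would mean the swap created more inversions than it removed, which is the complication the paragraph above describes.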
A complete analysis of the shellsort algorithm is very complex and beyond the scope of this book. With the sequence of increment values that we chose, it has been shown that shellsort in the worst case is $O(N^{3/2})$. A detailed analysis of shellsort and the impact of the increment sequence discussed in the next section are presented in the third volume of Donald Knuth’s The Art of Computer Programming (Addison-Wesley, 1998).
■ 3.3.2 The Effect of the Increment
The choice of the increment sequence can have a major effect on the order of shellsort, and attempts at finding an optimal increment sequence have not been successful. A number of different options have been considered, and their results are presented here.
If there are just two passes, it has been shown that using an increment of about $1.72\sqrt[3]{N}$ for the first pass and 1 for the second pass produces a sort of $O(N^{5/3})$.
Another set of increments would be $h_j = (3^j - 1)/2$ for all h values less than N. These values also satisfy the relationship $h_{j+1} = 3h_j + 1$ and $h_1 = 1$, so once the largest value of h is identified, succeeding increments can be calculated by $h_j = (h_{j+1} - 1)/3$. Using this sequence of increments results in a sort of $O(N^{3/2})$.
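A minimal Python sketch of this sequence follows. The function name is hypothetical; it builds the values upward with the recurrence and then returns them in the decreasing order in which shellsort would use them.

```python
def increments_3h_plus_1(n):
    """Return all h_j = (3^j - 1) / 2 that are less than n, largest first."""
    increments = []
    h = 1
    while h < n:
        increments.append(h)
        h = 3 * h + 1  # h_{j+1} = 3 * h_j + 1
    return increments[::-1]


print(increments_3h_plus_1(1000))  # [364, 121, 40, 13, 4, 1]
```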
Another version will calculate all of the possible values of $2^i 3^j$ (for any integers i ≥ 0 and j ≥ 0) that are less than the size of the list and use those values in decreasing order. For example, if N is 40, we would have the following sequence of increments: 36 ($2^2 3^2$), 32 ($2^5 3^0$), 27 ($2^0 3^3$), 24 ($2^3 3^1$), 18 ($2^1 3^2$), 16 ($2^4 3^0$), 12 ($2^2 3^1$), 9 ($2^0 3^2$), 8 ($2^3 3^0$), 6 ($2^1 3^1$), 4 ($2^2 3^0$), 3 ($2^0 3^1$), 2 ($2^1 3^0$), and 1 ($2^0 3^0$). By using a sequence of values that follows this pattern, shellsort’s order can be reduced to $O(N (\lg N)^2)$. It should be noted that the large number of passes introduces significant overhead, so this doesn’t become a practical sequence unless the size of the list is very large.
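The sequence for N = 40 can be generated with a short Python sketch. The function name is hypothetical, and the nested loops are just one straightforward way to enumerate the products.

```python
def increments_2i3j(n):
    """Return every value 2^i * 3^j (i, j >= 0) less than n, largest first."""
    values = []
    power_of_two = 1
    while power_of_two < n:
        value = power_of_two
        # Multiply by 3 repeatedly to cover j = 0, 1, 2, ...
        while value < n:
            values.append(value)
            value *= 3
        power_of_two *= 2
    return sorted(values, reverse=True)


print(increments_2i3j(40))
# [36, 32, 27, 24, 18, 16, 12, 9, 8, 6, 4, 3, 2, 1]
```

For N = 40 this already gives 14 passes, which illustrates the overhead mentioned above.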
Shellsort is unique in that its general algorithm stays the same, but the choices of its parameters can have a dramatic effect on its order.
■ 3.3.3 EXERCISES
1. Show the results of each of the passes of Shellsort using the increments of 7, 5, 3, and 1 with the initial list of values [16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]. How many comparisons are done?
2. Show the results of each of the passes of Shellsort using the increments of 8, 4, 2, and 1 with the initial list of values [16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]. How many comparisons are done?
3. Show the results of each pass of Shellsort using increments of 5, 2, and 1 applied to the list [7, 3, 9, 4, 2, 5, 6, 1, 8]. How many comparisons are done?
4. Show the results of each pass of Shellsort using increments of 5, 2, and 1 applied to the list [3, 5, 2, 9, 8, 1, 6, 4, 7]. How many comparisons are done?
5. Write the new version of InsertionSort used in this section.
6. This section looked at sorting as the removal of inversions in a list. For a list of N elements, what is the formula for the largest number of inversions that can be removed by the exchange of two nonadjacent elements? Give an example for a list with 10 elements.