Splitting the List
There are at least two versions of the PivotList function. The first is easy to program and understand and is presented in this section. The other is more complicated to write but is faster than this version. The second version will be considered in the exercises.
The function PivotList will pick the first element of the list as its pivot element and will set the pivot point as the first location of the list. It then moves through the list, comparing this pivot element to the rest of the elements. Whenever it finds an element that is smaller than the pivot element, it increments the pivot point and then swaps that element into the new pivot point location. After some of the elements have been compared to the pivot inside the loop, the list has four parts. The first part is the pivot element in the first location. The second part, from location first + 1 through the pivot point, holds all of the elements we have looked at that are smaller than the pivot element. The third part, from the location after the pivot point through the loop index, holds all of the elements we have looked at that are larger than the pivot element. The rest of the list holds the values we have not yet examined. This is shown in Fig. 3.4.
The algorithm for PivotList is as follows:
PivotList( list, first, last )
   list    the elements to work with
   first   the index of the first element
   last    the index of the last element

   PivotValue = list[ first ]
   PivotPoint = first
   for index = first + 1 to last do
      if list[ index ] < PivotValue then
         PivotPoint = PivotPoint + 1
         Swap( list[ PivotPoint ], list[ index ] )
      end if
   end for
   // move pivot value into correct place
   Swap( list[ first ], list[ PivotPoint ] )
   return PivotPoint

■ FIGURE 3.4 Relationship between the indices and element values in PivotList [figure: the list divided into four regions, reading left to right from First to Last — the pivot element at First, the "< pivot" region up through Pivot Point, the "≥ pivot" region up through Index, and the "unknown" region through Last]
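As a concrete sketch, the pseudocode above translates directly into Python. This is an illustrative version, not the text's code: it assumes 0-based indexing, and the names `pivot_list` and `quicksort` are hypothetical.

```python
def pivot_list(lst, first, last):
    """Partition lst[first..last] around lst[first]; return the pivot's final index."""
    pivot_value = lst[first]
    pivot_point = first
    for index in range(first + 1, last + 1):
        if lst[index] < pivot_value:
            # grow the "< pivot" region and move this element into it
            pivot_point += 1
            lst[pivot_point], lst[index] = lst[index], lst[pivot_point]
    # move the pivot value into its correct place
    lst[first], lst[pivot_point] = lst[pivot_point], lst[first]
    return pivot_point


def quicksort(lst, first, last):
    """Sort lst[first..last] in place using the partition above."""
    if first < last:
        pivot = pivot_list(lst, first, last)
        quicksort(lst, first, pivot - 1)
        quicksort(lst, pivot + 1, last)
```

A call such as `quicksort(data, 0, len(data) - 1)` sorts the whole list; after one `pivot_list` call, everything left of the returned index is smaller than the pivot and everything right of it is larger.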
■ 3.7.1 Worst-Case Analysis
When PivotList is called with a list of N elements, it does N − 1 comparisons, as it compares the PivotValue with every other element in the list. Because we have already said that quicksort is a divide and conquer algorithm, you might assume that the best case occurs when PivotList creates two parts of the same size, and you would be correct. The worst case, then, occurs when the two parts are of drastically different sizes. The largest difference in size occurs when the PivotValue is smaller (or larger) than all of the other values in the list. In that case, we wind up with one part that has no elements and another that has N − 1 elements. If the same thing happens each time we apply this process, we remove only one element (the PivotValue) from the list at each recursive call. This means we do the number of comparisons given by the formula for W(N) displayed below.
What original ordering of elements would cause this behavior? Since each pass chooses the first element as the pivot, the worst case occurs when that element is always the smallest (or largest) remaining. A list that is already sorted is one arrangement that causes this worst-case behavior! In all of the other sort algorithms we have considered, the worst and average cases have been about the same, but as we are about to see, this is not true for quicksort.
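This sorted-list worst case is easy to check empirically. The sketch below (hypothetical Python, assuming 0-based indexing and the first-element pivot strategy described earlier) counts the key comparisons; on an already sorted list of N keys it always reports N(N − 1)/2.

```python
def quicksort_counting(lst, first, last):
    """Quicksort using the first-element PivotList strategy; returns the comparison count."""
    if first >= last:
        return 0
    comparisons = last - first        # PivotValue is compared with every other element
    pivot_value = lst[first]
    pivot_point = first
    for index in range(first + 1, last + 1):
        if lst[index] < pivot_value:
            pivot_point += 1
            lst[pivot_point], lst[index] = lst[index], lst[pivot_point]
    lst[first], lst[pivot_point] = lst[pivot_point], lst[first]
    comparisons += quicksort_counting(lst, first, pivot_point - 1)
    comparisons += quicksort_counting(lst, pivot_point + 1, last)
    return comparisons


# an already sorted list of N = 12 keys costs 12 * 11 / 2 = 66 comparisons
worst = quicksort_counting(list(range(12)), 0, 11)
```

Each partition of a sorted sublist peels off only the smallest element, so the counts telescope to (N − 1) + (N − 2) + … + 1.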
■ 3.7.2 Average-Case Analysis
You will recall that when we looked at shellsort, we considered the number of inversions that each comparison removed in our analysis. At that time, we pointed out that bubble sort and insertion sort didn’t do well on average because they both removed only one inversion for each comparison.
W(N) = Σ_{i=2}^{N} (i − 1) = N(N − 1) / 2
So, how does quicksort do in removing inversions? Consider a list of N elements that the PivotList algorithm is working on. Let's say that the PivotValue is greater than all of the values in the list. This means that at the end of the routine PivotPoint will be N, and so the PivotValue will be switched from the first location to the last location. It is also possible that the element in the last location is the smallest value in the list. So swapping these two values will move the largest element from the first location to the last and will move the smallest element from the last location to the first. If the largest element is first, there are N − 1 inversions of it with the rest of the elements in the list, and if the smallest element is last, there are N − 1 inversions of it with the rest of the elements in the list. This one swap can remove 2N − 2 inversions from the list. It is because of this possibility that quicksort has an average case that is significantly different from its worst case.
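The effect is easy to see with a small inversion counter (a hypothetical Python sketch, not the text's code). For N = 5 with the largest value first and the smallest last, the single first/last swap eliminates every inversion present: the N − 1 inversions of each end element, with the pair they share counted once.

```python
def inversions(lst):
    """Count pairs (i, j) with i < j and lst[i] > lst[j]."""
    n = len(lst)
    return sum(1 for i in range(n) for j in range(i + 1, n) if lst[i] > lst[j])


before = [9, 3, 5, 7, 1]   # largest value first, smallest value last (N = 5)
after = [1, 3, 5, 7, 9]    # the same list after swapping just those two elements
# the swap removes all 7 inversions at once, on the order of 2N per comparison pass
```

Contrast this with bubble sort or insertion sort, where each comparison can remove at most one inversion.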
Notice that PivotList is doing all of the work, and so we first look at this algorithm to see what it does in the average case. We first notice that it is possible for each of the N locations in the list to be the location of the PivotValue when PivotList is done. To get the average case, we have to look at what happens for each of these possibilities and average the results. When looking at the worst case, we noticed that for a list of N elements there are N − 1 comparisons done by PivotList in dividing the list. There is no work done to put the lists back together. Lastly, notice that when PivotList returns a value of P, we call Quicksort recursively with lists of P − 1 and N − P elements. Our average-case analysis needs to look at all N possible values for P. Putting this together gives the recurrence relation
A(N) = (N − 1) + (1/N) Σ_{i=1}^{N} [A(i − 1) + A(N − i)]   for N ≥ 2
A(1) = A(0) = 0

If you look closely at the summation, you will notice that the first term is used with values from 0 through N − 1, and the second term is used with values from N − 1 down to 0. This means that the summation adds up every value of A from 0 to N − 1 twice. This gives us the following simplification:

A(N) = (N − 1) + (2/N) Σ_{i=0}^{N−1} A(i)   for N ≥ 2
A(1) = A(0) = 0
This is a very complicated form of recurrence relation because it depends on not just one smaller value of A, but rather on every smaller value for A.
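Because the recurrence depends on every smaller value of A, a direct bottom-up evaluation is the easiest way to see what it produces. A minimal sketch (hypothetical Python, transcribing the summation form of the recurrence):

```python
def average_comparisons(n):
    """Evaluate A(N) = (N - 1) + (1/N) * sum_{i=1}^{N} [A(i-1) + A(N-i)] bottom-up."""
    a = [0.0, 0.0]                      # A(0) = A(1) = 0
    for m in range(2, n + 1):
        # every smaller value of A feeds into A(m)
        total = sum(a[i - 1] + a[m - i] for i in range(1, m + 1))
        a.append((m - 1) + total / m)
    return a[n]
```

For example, A(2) = 1 and A(3) = 2 + 2/3, matching what the recurrence gives by hand.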
There are two ways to go about solving this. The first is to come up with an educated guess for the answer and then prove that this answer satisfies the recurrence relation. The second is to look at the equations for both A(N) and A(N − 1); those two equations differ by only a few terms. We now compute A(N) * N and A(N − 1) * (N − 1) to get rid of the two fractions. This gives

A(N) * N = (N − 1)N + 2 Σ_{i=0}^{N−1} A(i)
A(N) * N = (N − 1)N + 2A(N − 1) + 2 Σ_{i=0}^{N−2} A(i)
A(N − 1) * (N − 1) = (N − 2)(N − 1) + 2 Σ_{i=0}^{N−2} A(i)

Now, we subtract the third equation above from the second and simplify to get

A(N) * N − A(N − 1) * (N − 1) = 2A(N − 1) + (N − 1)N − (N − 2)(N − 1)
A(N) * N − A(N − 1) * (N − 1) = 2A(N − 1) + N² − N − (N² − 3N + 2)
A(N) * N − A(N − 1) * (N − 1) = 2A(N − 1) + 2N − 2

Adding A(N − 1) * (N − 1) to both sides, we get

A(N) * N = 2A(N − 1) + A(N − 1) * (N − 1) + 2N − 2
A(N) * N = A(N − 1) * (2 + N − 1) + 2N − 2

This gives our final recurrence relation:

A(N) = [(N + 1) * A(N − 1) + 2N − 2] / N
A(1) = A(0) = 0

Solving this is not difficult but does require care because of all of the terms on the right-hand side of the equation. If you work through all of the details, you will see that the final result is A(N) ≈ 1.4 (N + 1) lg N. Quicksort is, therefore, O(N lg N) on average.
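The algebra above is easy to sanity-check numerically: the final one-term recurrence must produce the same values as the summation form it was derived from, and the values should track (N + 1) lg N. A hypothetical Python sketch:

```python
import math


def a_summation(n):
    """A(N) = (N - 1) + (2/N) * sum_{i=0}^{N-1} A(i), with A(0) = A(1) = 0."""
    a = [0.0, 0.0]
    for m in range(2, n + 1):
        a.append((m - 1) + 2.0 * sum(a[:m]) / m)
    return a


def a_closed(n):
    """A(N) = ((N + 1) * A(N - 1) + 2N - 2) / N, with A(0) = A(1) = 0."""
    a = [0.0, 0.0]
    for m in range(2, n + 1):
        a.append(((m + 1) * a[m - 1] + 2 * m - 2) / m)
    return a


vals = a_closed(1000)
ratio = vals[1000] / (1001 * math.log2(1000))
# the ratio sits near 1.1 at N = 1000 and climbs slowly toward 2 ln 2 ≈ 1.386,
# consistent with A(N) ≈ 1.4 (N + 1) lg N asymptotically
```

Both functions agree to floating-point precision, which confirms the subtraction step in the derivation did not lose or gain a term.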
■ 3.7.3 EXERCISES
1. Trace the operation of Quicksort on the list [23, 17, 21, 3, 42, 9, 13, 1, 2, 7, 35, 4]. Show the list order and the stack of (first, last, pivot) values at the start of every call. Count the number of comparisons and swaps that are done.
2. Trace the operation of Quicksort on the list [3, 9, 14, 12, 2, 17, 15, 8, 6, 18, 20, 1]. Show the list order and the stack of (first, last, pivot) values at the start of every call. Count the number of comparisons and swaps that are done.
3. We showed that the Quicksort algorithm performs poorly when the list is sorted because the pivot element is always smaller than all of the elements left in the list. Just picking a different location of the list would have the same problem because you could get "unlucky" and always pick the smallest remaining value. A better alternative would be to consider the three values list[ first ], list[ last ], and list[ (first + last) / 2 ] and pick the median, or middle, value of these three. The comparisons to pick the middle element must be included in the complexity analysis of the algorithm.
a. Do Question 1 using this alternative method for picking the pivot element.
b. Do Question 2 using this alternative method for picking the pivot element.
c. In general, how many comparisons are done in the worst case to sort a list of N keys? (Note: You are now guaranteed to not have the smallest value for the PivotValue, but the result can still be pretty bad.)
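One way to sketch the median-of-three selection (a hypothetical Python helper, not the text's code; it moves the chosen value into list[first] so the original PivotList logic can run unchanged):

```python
def median_of_three(lst, first, last):
    """Swap the median of lst[first], lst[middle], lst[last] into lst[first]."""
    middle = (first + last) // 2
    a, b, c = lst[first], lst[middle], lst[last]
    # find which of the three positions holds the middle value
    if (b <= a <= c) or (c <= a <= b):
        median = first
    elif (a <= b <= c) or (c <= b <= a):
        median = middle
    else:
        median = last
    lst[first], lst[median] = lst[median], lst[first]
```

Calling this just before the partition guarantees the pivot is never the smallest (or largest) of the three sampled values, at the cost of the extra comparisons the exercise asks you to account for.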
4. An alternative for the PivotList algorithm would be to have two indices into the list. The first moves up from the bottom, and the other moves down from the top. The main loop of the algorithm advances the lower index until a value greater than the PivotValue is found, and the upper index is moved until a value less than the PivotValue is found. Then these two are swapped. This process repeats until the two indices cross. These inner loops are very fast because the overhead of checking for the end of the list is eliminated, but the problem is that they will do an extra swap when the indices pass each other. So, the algorithm does one extra swap to correct this. The full algorithm is
PivotList( list, first, last )
   list    the elements to work with
   first   the index of the first element
   last    the index of the last element

   PivotValue = list[ first ]
   lower = first
   upper = last + 1
   do
      do upper = upper - 1 until list[ upper ] ≤ PivotValue
      do lower = lower + 1 until list[ lower ] ≥ PivotValue
      Swap( list[ upper ], list[ lower ] )
   until lower ≥ upper
   // undo the extra exchange
   Swap( list[ upper ], list[ lower ] )
   // move pivot value into correct place
   Swap( list[ first ], list[ upper ] )
   return upper
(Note: This algorithm requires one extra list location at the end to hold a special sentinel value that is larger than all of the valid key values.)
a. Do Question 1 using this alternative method for PivotList.
b. Do Question 2 using this alternative method for PivotList.
c. What operation is done significantly less frequently for this version of PivotList?
d. How many key comparisons does the new PivotList do in the worst case for a list of N elements? (Note: It is not N − 1.) How does this affect the overall worst case for quicksort?
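The two-index scheme in Question 4 can be sketched in Python as follows (hypothetical names, 0-based indexing; it assumes, per the note above, a sentinel larger than every key stored just past position last):

```python
def pivot_list_two_index(lst, first, last):
    """Two-index partition; requires lst[last + 1] to be a sentinel >= all keys."""
    pivot_value = lst[first]
    lower = first
    upper = last + 1
    while True:
        # do-until loops: move each index at least once, then keep going
        upper -= 1
        while lst[upper] > pivot_value:
            upper -= 1
        lower += 1
        while lst[lower] < pivot_value:
            lower += 1
        lst[upper], lst[lower] = lst[lower], lst[upper]
        if lower >= upper:
            break
    # undo the extra exchange made after the indices crossed
    lst[upper], lst[lower] = lst[lower], lst[upper]
    # move pivot value into its correct place
    lst[first], lst[upper] = lst[upper], lst[first]
    return upper
```

Note how the inner loops test only key values, never index bounds; the sentinel (and, in recursive calls, the previously placed pivot) is what stops the lower index from running off the end.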
5. How many comparisons will Quicksort do on a list of N elements that all have the same value?
6. What is the maximum number of times that Quicksort will move the largest or smallest value?