HEAPSORT - Analysis of Algorithms : An Active Learning Approach

a. If the key is a number in the range of 0 to 2⁶⁴, choose two options (one smaller and one larger) for the number of bits that will be used on each pass and indicate how many buckets and passes will be needed.

b. If the key is a string of 40 characters, choose two options (one smaller and one larger) for the number of bits that will be used on each pass and indicate how many buckets and passes will be needed.

c. Based on your answers to parts (a) and (b), can you give any general rec- ommendations for how to make the choice of passes and key subdivi- sions?

of the list grows. We can, however, use the space for the list itself and do this sort without extra space. We can “build” the list into a heap if we notice that in a heap each internal node has two children, except for perhaps one node toward the bottom. If we consider the following mapping, we can use the list to hold these values. For the node at location i, we will store its two children at locations 2 *i and 2 *i + 1. Notice that this process produces distinct locations for each node’s children. We know that node i is a leaf if 2 *iis greater than N, and we know that node i has only one child if 2 * i is equal to N. Figure 3.3 shows a heap and its list version.

FixHeap

When we take the largest element out of the root and move it to the list, this leaves the root vacant. We know that the larger of its two children needs to be moved up, but then that child’s node becomes vacant, and so we look at its two children, and so on. In this process, we need to maintain the heap as close to a complete tree as possible. When we ﬁx the heap, we will also pass in the right- most node from the bottom level, to be inserted back into the heap. This will remove nodes evenly from the bottom. If we don’t do this and all of the large values are on one side of the heap, the heap will be unbalanced and the algorithm’s efﬁciency will go down. This gives us the following algorithm:

FixHeap( list, root, key, bound ) list the list/heap being sorted

root the index of the root of the heap

4 3

7 2

5 9

Locations 1 2 3 4 5 6 7 8 9 10 11 12

12 10 11 5 9 8 6 1 2 7 3 4

■ FIGURE 3.3 A heap and its list implementation

key the key value that needs to be reinserted in the heap bound the upper limit (index) on the heap

vacant = root

while 2*vacant ≤ bound do largerChild = 2*vacant

// find the larger of the two children

if (largerChild < bound) and (list[largerChild+1] > list[largerChild]) then largerChild = largerChild + 1

end if

// does the key belong above this child?

if key > list[ largerChild ] then // yes, stop looping

break else

// no, move the larger child up list[ vacant ] = list[ largerChild ] vacant = largerChild

end if end while

list[ vacant ] = key

When you look at this algorithm’s parameters, you might wonder why we have chosen to pass in the root location. Because this routine is not recursive, the root of the heap should always be location 1. You will see, however, that this additional parameter will make it possible for us to use this function to construct the heap from the bottom up. We pass in the size of the heap, because as the elements get moved from the heap to the list, the heap shrinks.

Constructing the Heap

The way that we have chosen to implement the ^FixHeap function means that we can use this in the initial construction of the heap. Any two values can be treated as leaves of a vacant node. We do not need to do any work on the second half of the list because they are all leaves. We just need to construct small heaps from the leaves and then combine these until eventually all values are in the heap. This is accomplished by the following loop:

for i = N/2 down to 1 do

FixHeap( list, i, list[ i ], N ) end for

Final Algorithm

Putting these pieces together, and adding the ﬁnal details needed to move the elements from the heap to the list, gives the algorithm

for i = N/2 down to 1 do

FixHeap( list, i, list[ i ], N ) end for

for i = N down to 2 do max = list[ 1 ]

FixHeap( list, 1, list[ i ], i-1 ) list[ i ] = max

end for

■ 3.5.1 Worst-Case Analysis

We begin by analyzing ^FixHeap because the rest of the algorithm depends on it. This algorithm will do, for each level of the heap, a comparison between the two children and then the larger child and the key. This means that for a heap of depth D, there will be no more than 2D comparisons.²

In the heap construction stage, we ﬁrst call FixHeap for each node on the second level from the bottom, which means heaps of depth 1. Then it is called for each node on the third level from the bottom, which means heaps of depth 2. On the last pass at the root level, the heap will have depth lgN. The only thing we need to consider is how many nodes is ^FixHeap called for at each pass. At the root level there is one node, and it has two children at the second level. The next level down has at most four nodes, and the level after that has eight. Putting all of this together gives the following formula:

2 The depth is 1 less than the level number. A heap with four levels will have a maxi- mum depth of 3. The root is at depth 0, its children are at depth 1, their children are at depth 2, and so on.

WConstruction( )N 2(D–i)2ⁱ

i=0 D–1

∑

WConstruction( )N 2D 2ⁱ 2 i2ⁱ

i=0 D–1

∑

–

i=0 D–1

∑

Using Equations 1.17 and 1.19, we get

We now substitute D = lg N, giving

So the heap construction phase is linear with respect to the number of elements in the list.

Now we consider the main loop of the algorithm. In this loop, we remove one element from the heap and then call FixHeap. This gets repeated until there is only one element left in the heap. On each pass the number of elements decreases by 1, but how does this change the depth of the heap? We have already said that the entire heap has a depth of lg N, and so it is easy to see that if there are K nodes left in the heap, it will have depth of lgK, and the number of comparisons is twice this number. This means that the worst case for the loop is

The problem we now have is that there is no standard form for this summation.

Let’s think about how this breaks down. When k is 1, lgk will be 0. When k is 2 and 3, lgk will be 1. When k is 4 through 7, lgk will be 2. When k is 8 through 15, lg k will be 3. We notice that in each of these cases, when the result is j, there are 2^j values that give that result. This will be true for all but the last level of the heap if it is not complete. On that level, we see that there are N 2^lg^N elements.³ This means that our equation can be represented as:

3 Because there is a ﬂoor involved with the exponent of 2, 2^lg^N < N, except when N is an exact power of 2, and then these are equal.

WConstruction( )N = 2D(2^D–1)–2[(D–2)2^D+2] WConstruction( )N = 2^D+1D–2D–2^D+1D+2^D+2–4 WConstruction( )N = 2^D+2–2D–4

WConstruction( )N = 4*2^lg^N–2 lg N–4 WConstruction( )N = 4N–2 lg N–4 = O N( )

W_loop( )N 2 lgk

k=1

N–1

∑

² ^lg^k

k=1 N–1

∑

= =

W_loop( )N 2 k2^k

k=1

∑

d–1

 

 

 

d N( –2^d)

+ where d lgN

= =

Using Equation 1.19, we get

Now, substituting lgN for d, we get

Now, we have to add together the construction and loop stages to get our ﬁnal result. This gives the equation

■ 3.5.2 Average-Case Analysis

To get the average-case analysis for heapsort, we will take a different approach.

Let’s consider the best case, where the values are initially in the array in the reverse order. You should be able to see that this is automatically a correct heap.

This means that each call to FixHeap in the construction phase will do only two comparisons to show the values are properly ordered. So, because Fix- Heap is called for about one-half of the elements and each call does two comparisons, the construction stage does about N comparisons, which is the same order as in the worst case.

Notice that no matter what order the elements are in at the start, after the construction stage we always have a heap. So, in every case, the ^for loop will have to execute the same number of times as in the worst case, because to get the sorted values we have to take each element out of the heap and then ﬁx the heap. So, in the best case, heapsort does about N + N lg N comparisons. This means that the best case is O(N lg N).

For heapsort, the best and worst cases are both O(N lg N). This can only mean that the average case must also be O(N lg N).

W_loop( )N = 2[(d–2)2^d+2+d N( –2^d)] W_loop( )N = d2^d+1–2^d+2+4+2d*N–d2^d+1 W_loop( )N = 2d*N–2^d+2+4

W_loop( )N = 2 lgN N–2 ^lg^N ⁺²+4

W_loop( )N = 2 lgN N–4 lgN +4 = O N( lgN)

W N( ) = Wconstruction( )N +W_loop( )N

W N( ) = 4N–2lgN–4+2 lgN N–4 lgN +4 W N( ) ≈ 4N+2 lgN (N–3)

W N( ) = O N( lgN)

3.5.3

1. Given the list of elements, [23, 17, 21, 3, 42, 9, 13, 1, 2, 7, 35, 4], what would be their order after the loop for the heap construction phase exe- cutes?

2. Given the list of elements, [3, 9, 14, 12, 2, 17, 15, 8, 6, 18, 20, 1], what would be their order after the loop for the heap construction phase exe- cutes?

3. We could shorten the second ^for loop in our heapsort by changing the ending condition to i ≥ 3. What, if anything, would need to be added after thisfor loop to assure that the ﬁnal list is sorted? Would this change reduce the number of comparisons (give a detailed reason for your answer)?

4. Prove that a list in reverse order is a heap.

Dalam dokumen Analysis of Algorithms : An Active Learning Approach (Halaman 94-100)