PivotValue = list[ first ]
lower = first
upper = last + 1
do
   do upper = upper - 1 until list[ upper ] ≤ PivotValue
   do lower = lower + 1 until list[ lower ] ≥ PivotValue
   Swap( list[ upper ], list[ lower ] )
until lower ≥ upper
// undo the extra exchange
Swap( list[ upper ], list[ lower ] )
// move pivot point into correct place
Swap( list[ first ], list[ upper ] )
return upper
(Note: This algorithm requires one extra list location at the end to hold a special sentinel value that is larger than all of the valid key values.)
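A minimal Python sketch of this alternative PivotList follows; the function names, the quicksort driver, and the use of math.inf as the sentinel are our own illustration, not from the text.

import math

def pivot_list(lst, first, last):
    # Partition lst[first..last] around lst[first].  The array must end
    # with a sentinel larger than every valid key (math.inf here); in
    # recursive calls, the element just past last is an already-placed
    # pivot that is >= every key in the range, so the scan still stops.
    pivot_value = lst[first]
    lower = first
    upper = last + 1
    while True:
        # do upper = upper - 1 until lst[upper] <= pivot_value
        upper -= 1
        while lst[upper] > pivot_value:
            upper -= 1
        # do lower = lower + 1 until lst[lower] >= pivot_value
        lower += 1
        while lst[lower] < pivot_value:
            lower += 1
        lst[upper], lst[lower] = lst[lower], lst[upper]
        if lower >= upper:                                # until lower >= upper
            break
    lst[upper], lst[lower] = lst[lower], lst[upper]       # undo the extra exchange
    lst[first], lst[upper] = lst[upper], lst[first]       # move pivot into place
    return upper

def quicksort(lst, first, last):
    if first < last:
        p = pivot_list(lst, first, last)
        quicksort(lst, first, p - 1)
        quicksort(lst, p + 1, last)

data = [5, 2, 8, 1, 9]
data.append(math.inf)               # the one extra sentinel location
quicksort(data, 0, len(data) - 2)
data.pop()                          # discard the sentinel
print(data)                         # [1, 2, 5, 8, 9]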
a. Do Question 1 using this alternative method for PivotList.
b. Do Question 2 using this alternative method for PivotList.
c. What operation is done significantly less frequently for this version of PivotList?
d. How many key comparisons does the new PivotList do in the worst case for a list of N elements? (Note: It is not N − 1.) How does this affect the overall worst case for Quicksort?
5. How many comparisons will Quicksort do on a list of N elements that all have the same value?
6. What is the maximum number of times that Quicksort will move the largest or smallest value?
3.8 External Polyphase Merge Sort

It is also possible that we can declare an array large enough to hold all of the data, but that the logical memory needs of the program are much greater than the physical memory available in the computer. This places a reliance on the computer to effectively implement virtual memory, but even in the best of circumstances there may be a large amount of data that needs to be swapped between physical memory and a disk drive. Even with an effective sorting algorithm like Quicksort, the relationship between the bounds of a partition and the blocks of logical memory that can be and are loaded may be such that a large number of blocks have to be swapped in and out of physical memory. This problem may not be seen until a program is implemented and runs so slowly that computer-based analysis tools are needed to identify the problem.
Even in that case, the problem may not be found unless system profiling tools are available that track virtual memory use.
Our analysis looked at comparison operations to determine what was an efficient sort algorithm. But the amount of time spent moving information to and from the disk during a virtual memory block swap will be much more significant than any logical or arithmetic operation.
Because this is handled by the operating system, we have no real control over when swaps may occur.
An alternative thought might be to use a direct access file on disk and convert each array access into a seek operation that moves to the correct location of the file, followed by a read. This reduces the amount of logical memory needed and so reduces the reliance on virtual memory. It still translates into a significant amount of disk input and output, which is what is costly, whether done by the program or by the operating system.
All of this makes the sorting algorithms in the last seven sections impractical when the data set gets extremely large. We will now look at an alternative that will use four external sequential files and a merging process to accomplish a sort.
We first identify how many records can reasonably be held in memory when we account for the size of the executable code and the available memory.
We will declare an array of this size, call it S, that will be used for the two steps of our sort process. In the first step, we will read in S records and use an appropriate internal sort to put these records in order. This set of now sorted records will be written to file A. We read in a second set of S records, sort them, and write them to file B. We continue this process, alternating where we write the
sorted list between file A and file B. An algorithm to accomplish this first step would be
CreateRuns( S )
S is the size of the runs to be created

CurrentFile = A
while not at the end of the input file do
   read S records from the input file
   sort the S records
   write the records to file CurrentFile
   if CurrentFile = A then
      CurrentFile = B
   else
      CurrentFile = A
   end if
end while
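A compact Python sketch of this first step, under our own simplifying model: the input is a list of records, and the two output "files" are lists of runs held in memory rather than files on disk.

def create_runs(records, s):
    # Cut the input into runs of at most s records, sort each run
    # internally, and write the runs alternately to files A and B.
    file_a, file_b = [], []
    current, other = file_a, file_b
    for start in range(0, len(records), s):
        run = sorted(records[start:start + s])   # internal sort of one run
        current.append(run)                      # "write" run to CurrentFile
        current, other = other, current          # alternate between A and B
    return file_a, file_b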
Once we have processed the entire original file into sorted runs, we are now ready to start the second step, which is to merge these runs. If you think about the process, you will realize that both files A and B have some number of runs of S records that are in order. But, as in merge sort, we can't really say anything about the relationship between the records that are in two separate runs.
Our merging process will be similar to the MergeLists function of Section 3.6; however, in this case, instead of moving the records to a new array, we will move them to a new file. So we begin by reading in half of the first run from each of files A and B. We can only read in half of each run because we have already identified that we can only hold S records at a time, and we need records from both files A and B. We now begin to merge them into a new file C. If we run through the first half of the records from either file, we will then read in the second set of records for that run. When we have completed one of the two runs, the rest of the other run is written to the file. Once the first runs of files A and B have been merged, we then merge the second two runs, but this time the output is written to file D. This process continues to merge runs and write them alternately to files C and D. On the completion of this pass, you should see that we now have runs of 2S records in files C and D. We repeat the process again, but this time we read runs from C and D and write them to files A and B, which will then have runs of 4S records. You should see that eventually we will have merged the runs into one list that is now sorted. An algorithm to accomplish this second step would be
PolyphaseMerge( S )
S is the size of the initial runs

Size = S
Input1 = A
Input2 = B
CurrentOutput = C
while not done do
   while more runs this pass do
      Merge one run of length Size from file Input1
         with one run of length Size from file Input2
         sending output to CurrentOutput
      if (CurrentOutput = A) then
         CurrentOutput = B
      elseif (CurrentOutput = B) then
         CurrentOutput = A
      elseif (CurrentOutput = C) then
         CurrentOutput = D
      elseif (CurrentOutput = D) then
         CurrentOutput = C
      end if
   end while
   Size = Size * 2
   if (Input1 = A) then
      Input1 = C
      Input2 = D
      CurrentOutput = A
   else
      Input1 = A
      Input2 = B
      CurrentOutput = C
   end if
end while
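Continuing the same in-memory model, here is a sketch of the merge passes. It is our own illustration: heapq.merge stands in for the MergeLists-style run merge, and for simplicity it assumes a nonempty input whose run count is a power of two, so the two input files always hold equal numbers of runs. A real implementation would stream half-run blocks from disk instead.

import heapq

def polyphase_merge(file_a, file_b):
    # Each pass merges pairs of runs from the two input files, writing
    # the doubled runs alternately to two output files, until a single
    # sorted run remains.
    input1, input2 = file_a, file_b
    while len(input1) + len(input2) > 1:
        out1, out2 = [], []                      # files C and D (or A and B)
        current, other = out1, out2
        for run1, run2 in zip(input1, input2):
            current.append(list(heapq.merge(run1, run2)))  # merge one pair of runs
            current, other = other, current                # alternate output files
        input1, input2 = out1, out2              # swap input/output roles
    return input1[0] if input1 else input2[0]

records = [42, 7, 19, 3, 88, 25, 61, 14]
a, b = create_runs(records, s=2)    # R = 4 runs of S = 2 records each
print(polyphase_merge(a, b))        # [3, 7, 14, 19, 25, 42, 61, 88]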
Before we begin our analysis, we first look at what we have in terms of runs and the number of passes this translates to. If we have N records in our original file, and we can store S records at one time, this means that after CreateRuns we must have R = N / S runs split between the two files. Each of the PolyphaseMerge passes joins pairs of runs, so it must cut the number of runs in half. After one pass there will be R / 2 runs, after two passes there will be R / 4 runs, and, in general, after j passes there will be R / 2^j runs. Because we stop when we get down to one run, this will be when R / 2^D is equal to 1, which will be when D is lg R. This means we will do lg R passes of the merge process.
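To make the counts concrete, here is a small worked example; the numbers are our own illustration, not from the text:

$$N = 10{,}000,\quad S = 1{,}250 \quad\Rightarrow\quad R = \frac{10{,}000}{1{,}250} = 8 \text{ runs},\qquad D = \lg 8 = 3 \text{ merge passes.}$$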
■ 3.8.1 Number of Comparisons in Run Construction

Because the algorithm used for the run construction phase is not specified, we will assume that an O(N lg N) sort is used. Because there are S elements in each run, each one will take O(S lg S) to construct. There are R runs, giving O(R * S * lg S) = O(N lg S) comparisons in total for the construction of all of the runs. The run construction phase is O(N lg S).
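Spelled out, the substitution R = N / S behind this bound is

$$R \cdot O(S \lg S) \;=\; \frac{N}{S} \cdot O(S \lg S) \;=\; O(N \lg S).$$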
■ 3.8.2 Number of Comparisons in Run Merge
In Section 3.6.1, we saw that MergeLists does A + B − 1 comparisons in the worst case with two lists of A and B elements. In our case, we have R runs of size S on the first pass that get merged, so there are R / 2 merges, each of which will take at most 2S − 1 comparisons, or R / 2 * (2S − 1) = R*S − R / 2 comparisons. On the second pass, we have R / 2 runs of size 2S, so there are R / 4 merges, each of which will take at most 2(2S) − 1 comparisons, or R / 4 * (4S − 1) = R*S − R / 4 comparisons. On the third pass, we have R / 4 runs of size 4S, so there are R / 8 merges, each of which will take at most 2(4S) − 1 comparisons, or R / 8 * (8S − 1) = R*S − R / 8 comparisons.
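Generalizing the pattern of these three passes (our restatement of the step the text implies): on pass i there are R / 2^i merges of runs of length 2^(i−1) S, each taking at most 2^i S − 1 comparisons, so each pass costs

$$\frac{R}{2^{i}}\left(2^{i} S - 1\right) \;=\; R \cdot S - \frac{R}{2^{i}} \text{ comparisons.}$$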
If we recall that there will be lg R merge passes, the total number of comparisons in the merge phase will be

$$\sum_{i=1}^{\lg R}\left(R \cdot S - \frac{R}{2^{i}}\right) \;=\; \sum_{i=1}^{\lg R}(R \cdot S) - \sum_{i=1}^{\lg R}\frac{R}{2^{i}} \;=\; (R \cdot S) \cdot \lg R - R \cdot \sum_{i=1}^{\lg R}\left(\frac{1}{2}\right)^{i} \;\approx\; (R \cdot S) \cdot \lg R - R \;=\; N \lg R - R$$

In the second equation, you should note that if you add 1/2 + 1/4 + 1/8 + . . . , you will get a number that is less than 1, but it will get closer to 1 the more terms you have. To visualize this, imagine that you stand 1 foot away from a wall, and you repeatedly move closer to the wall by one-half your current distance from it. Because you only move one-half the distance each step, you will never reach the wall, but you will keep getting closer to it. In the same way, if you do the above addition, you are adding one-half the distance between your current total and the number 1 each time. This means that the sum will keep getting closer to 1, but never grow larger than it. This can also be shown by the application of Equation 1.18 using A = 0.5 and an adjustment, because the above summation begins at 1, where the summation in Equation 1.18 begins at 0.
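If Equation 1.18 is the usual geometric series sum (an assumption about its exact form in Chapter 1), the adjustment works out as

$$\sum_{i=1}^{\lg R}\left(\frac{1}{2}\right)^{i} \;=\; \sum_{i=0}^{\lg R}\left(\frac{1}{2}\right)^{i} - 1 \;=\; \frac{1 - (1/2)^{\lg R + 1}}{1 - 1/2} - 1 \;=\; \left(2 - \frac{1}{R}\right) - 1 \;=\; 1 - \frac{1}{R} \;<\; 1,$$

so R times the summation is R − 1, which the approximation step above treats as R.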
The run merge phase is O(N lg R). This makes the entire algorithm O(N lg S + N lg R) = O(N * (lg S + lg R)) = O(N lg (S * R)) = O(N lg N), because S * R = N.
■ 3.8.3 Number of Block Reads
Reading large blocks of data is significantly faster than reading each of the items in the block one after another. For this reason, a polyphase merge sort will be most efficient if all data is read as larger blocks. We are still, however, interested in how many blocks of data will be read in this sort.
In the run construction step, we will read one block of data for each run, resulting in R block reads. Because we can only fit S records in memory, and we need records from two runs for the merge step, we will read blocks of size S / 2 in the merge phase. Each pass of the merge step will have to read all of the data as part of some run, meaning that there will be N / (S / 2) = 2R block reads.
Because there are lg R passes, there are 2R lg R block reads in the merge step.
In the entire algorithm, there are R + 2R lg R = O(R lg R) block reads.
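With the same illustrative numbers as before (N = 10,000 and S = 1,250, so R = 8 and lg R = 3):

$$R + 2R \lg R \;=\; 8 + 2 \cdot 8 \cdot 3 \;=\; 8 + 48 \;=\; 56 \text{ block reads.}$$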
■ 3.8.4 Exercises
1. What would be involved in rewriting the external polyphase merge sort algorithm so that it only used three files instead of four? What impact would this change have on the number of comparisons and block reads? (Hint: This new version can’t alternate back and forth in the merge step.)
2. What would be involved in rewriting the external polyphase merge sort algorithm so that it used six files instead of four and the merging of three lists was done simultaneously? What impact would this change have on the number of comparisons and block reads? Would there be any additional change if we used eight files and merged four lists simultaneously?