this number is small (much less than 1), then we expect few streaks of length $k$ to occur, and the probability that one occurs is low. If $k = c \lg n$, for some positive constant $c$, we obtain

$$E[X] = \frac{n - c\lg n + 1}{2^{c\lg n}} = \frac{n - c\lg n + 1}{n^c} = \frac{1}{n^{c-1}} - \frac{(c\lg n - 1)/n}{n^{c-1}} = \Theta(1/n^{c-1}) .$$
If $c$ is large, the expected number of streaks of length $c \lg n$ is small, and we conclude that they are unlikely to occur. On the other hand, if $c = 1/2$, then we obtain $E[X] = \Theta(1/n^{1/2-1}) = \Theta(n^{1/2})$, and we expect that there are a large number of streaks of length $(1/2)\lg n$. Therefore, one streak of such a length is likely to occur. From these rough estimates alone, we can conclude that the expected length of the longest streak is $\Theta(\lg n)$.
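These rough estimates are easy to check empirically. The following short Python experiment (a sketch added for illustration, not part of the original analysis; all parameter choices are ours) measures the longest run of heads in $n$ fair coin flips and compares the average over several trials against $\lg n$:

    import math
    import random

    def longest_streak(n):
        """Return the length of the longest run of heads in n fair coin flips."""
        best = current = 0
        for _ in range(n):
            if random.random() < 0.5:   # heads
                current += 1
                best = max(best, current)
            else:                       # tails resets the current run
                current = 0
        return best

    for n in (1_000, 10_000, 100_000):
        avg = sum(longest_streak(n) for _ in range(50)) / 50
        print(f"n = {n:>7}: average longest streak {avg:5.1f}, lg n = {math.log2(n):5.1f}")

The averages track $\lg n$ closely, as the $\Theta(\lg n)$ bound predicts.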
5.4.4 The on-line hiring problem

ON-LINE-MAXIMUM(k, n)
1   bestscore = −∞
2   for i = 1 to k
3       if score(i) > bestscore
4           bestscore = score(i)
5   for i = k + 1 to n
6       if score(i) > bestscore
7           return i
8   return n
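For readers who want to experiment with the procedure, here is a direct Python transcription (an illustrative sketch; the convention of exposing scores through a callable score(i) is ours, not the text's):

    def online_maximum(k, n, score):
        """Reject the first k candidates outright, remembering the best score
        seen, then hire the first later candidate who beats that score.
        score(i) returns the score of the i-th candidate, 1-based."""
        bestscore = float("-inf")
        for i in range(1, k + 1):
            if score(i) > bestscore:
                bestscore = score(i)
        for i in range(k + 1, n + 1):
            if score(i) > bestscore:
                return i            # hire the first candidate who beats the sample
        return n                    # forced to hire the last candidate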
We wish to determine, for each possible value of $k$, the probability that we hire the most qualified applicant. We then choose the best possible $k$, and implement the strategy with that value. For the moment, assume that $k$ is fixed. Let $M(j) = \max_{1 \le i \le j} \{score(i)\}$ denote the maximum score among applicants 1 through $j$. Let $S$ be the event that we succeed in choosing the best-qualified applicant, and let $S_i$ be the event that we succeed when the best-qualified applicant is the $i$th one interviewed. Since the various $S_i$ are disjoint, we have that $\Pr\{S\} = \sum_{i=1}^{n} \Pr\{S_i\}$. Noting that we never succeed when the best-qualified applicant is one of the first $k$, we have that $\Pr\{S_i\} = 0$ for $i = 1, 2, \ldots, k$. Thus, we obtain

$$\Pr\{S\} = \sum_{i=k+1}^{n} \Pr\{S_i\} . \tag{5.12}$$
We now compute $\Pr\{S_i\}$. In order to succeed when the best-qualified applicant is the $i$th one, two things must happen. First, the best-qualified applicant must be in position $i$, an event which we denote by $B_i$. Second, the algorithm must not select any of the applicants in positions $k+1$ through $i-1$, which happens only if, for each $j$ such that $k+1 \le j \le i-1$, we find that $score(j) < bestscore$ in line 6. (Because scores are unique, we can ignore the possibility of $score(j) = bestscore$.) In other words, all of the values $score(k+1)$ through $score(i-1)$ must be less than $M(k)$; if any are greater than $M(k)$, we instead return the index of the first one that is greater. We use $O_i$ to denote the event that none of the applicants in positions $k+1$ through $i-1$ are chosen. Fortunately, the two events $B_i$ and $O_i$ are independent. The event $O_i$ depends only on the relative ordering of the values in positions 1 through $i-1$, whereas $B_i$ depends only on whether the value in position $i$ is greater than the values in all other positions. The ordering of the values in positions 1 through $i-1$ does not affect whether the value in position $i$ is greater than all of them, and the value in position $i$ does not affect the ordering of the values in positions 1 through $i-1$. Thus we can apply equation (C.15) to obtain

$$\Pr\{S_i\} = \Pr\{B_i \cap O_i\} = \Pr\{B_i\} \Pr\{O_i\} .$$
The probability $\Pr\{B_i\}$ is clearly $1/n$, since the maximum is equally likely to be in any one of the $n$ positions. For event $O_i$ to occur, the maximum value in positions 1 through $i-1$, which is equally likely to be in any of these $i-1$ positions, must be in one of the first $k$ positions. Consequently, $\Pr\{O_i\} = k/(i-1)$ and $\Pr\{S_i\} = k/(n(i-1))$. Using equation (5.12), we have

$$\Pr\{S\} = \sum_{i=k+1}^{n} \Pr\{S_i\} = \sum_{i=k+1}^{n} \frac{k}{n(i-1)} = \frac{k}{n} \sum_{i=k+1}^{n} \frac{1}{i-1} = \frac{k}{n} \sum_{i=k}^{n-1} \frac{1}{i} .$$
We approximate by integrals to bound this summation from above and below. By the inequalities (A.12), we have

$$\int_{k}^{n} \frac{1}{x}\,dx \;\le\; \sum_{i=k}^{n-1} \frac{1}{i} \;\le\; \int_{k-1}^{n-1} \frac{1}{x}\,dx .$$

Evaluating these definite integrals gives us the bounds

$$\frac{k}{n}(\ln n - \ln k) \;\le\; \Pr\{S\} \;\le\; \frac{k}{n}\bigl(\ln(n-1) - \ln(k-1)\bigr) ,$$
which provide a rather tight bound for $\Pr\{S\}$. Because we wish to maximize our probability of success, let us focus on choosing the value of $k$ that maximizes the lower bound on $\Pr\{S\}$. (Besides, the lower-bound expression is easier to maximize than the upper-bound expression.) Differentiating the expression $(k/n)(\ln n - \ln k)$ with respect to $k$, we obtain

$$\frac{1}{n}(\ln n - \ln k - 1) .$$

Setting this derivative equal to 0, we see that we maximize the lower bound on the probability when $\ln k = \ln n - 1 = \ln(n/e)$ or, equivalently, when $k = n/e$. Thus, if we implement our strategy with $k = n/e$, we succeed in hiring our best-qualified applicant with probability at least $1/e$.
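A quick simulation corroborates this bound. The sketch below (illustrative code with parameter choices of our own) shuffles $n$ distinct scores, runs the on-line strategy with $k$ near $n/e$, and estimates the success probability:

    import math
    import random

    def estimate_success(n, k, trials=100_000):
        """Estimate Pr{S}: the probability that the on-line strategy with
        sample size k hires the single best of n candidates."""
        successes = 0
        for _ in range(trials):
            scores = list(range(n))     # distinct scores; only ranks matter
            random.shuffle(scores)
            threshold = max(scores[:k]) if k > 0 else float("-inf")
            hired = n - 1               # default: forced to take the last candidate
            for i in range(k, n):
                if scores[i] > threshold:
                    hired = i
                    break
            successes += scores[hired] == n - 1
        return successes / trials

    n = 100
    k = round(n / math.e)               # k = n/e, per the analysis above
    print(f"estimated Pr{{S}} = {estimate_success(n, k):.3f}; 1/e = {1/math.e:.3f}")

For $n = 100$ the estimate comes out near 0.37, consistent with the $1/e$ lower bound.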
Exercises
5.4-1
How many people must there be in a room before the probability that someone has the same birthday as you do is at least $1/2$? How many people must there be before the probability that at least two people have a birthday on July 4 is greater than $1/2$?
5.4-2
Suppose that we toss balls into $b$ bins until some bin contains two balls. Each toss is independent, and each ball is equally likely to end up in any bin. What is the expected number of ball tosses?
5.4-3 ⋆
For the analysis of the birthday paradox, is it important that the birthdays be mutually independent, or is pairwise independence sufficient? Justify your answer.
5.4-4 ⋆
How many people should be invited to a party in order to make it likely that there are three people with the same birthday?
5.4-5 ⋆
What is the probability that a $k$-string over a set of size $n$ forms a $k$-permutation?
How does this question relate to the birthday paradox?
5.4-6 ⋆
Suppose that $n$ balls are tossed into $n$ bins, where each toss is independent and the ball is equally likely to end up in any bin. What is the expected number of empty bins? What is the expected number of bins with exactly one ball?
5.4-7 ⋆
Sharpen the lower bound on streak length by showing that in $n$ flips of a fair coin, the probability is less than $1/n$ that no streak longer than $\lg n - 2\lg\lg n$ consecutive heads occurs.
Problems
5-1 Probabilistic counting
With a $b$-bit counter, we can ordinarily only count up to $2^b - 1$. With R. Morris's probabilistic counting, we can count up to a much larger value at the expense of some loss of precision.
We let a counter value of $i$ represent a count of $n_i$ for $i = 0, 1, \ldots, 2^b - 1$, where the $n_i$ form an increasing sequence of nonnegative values. We assume that the initial value of the counter is 0, representing a count of $n_0 = 0$. The INCREMENT operation works on a counter containing the value $i$ in a probabilistic manner. If $i = 2^b - 1$, then the operation reports an overflow error. Otherwise, the INCREMENT operation increases the counter by 1 with probability $1/(n_{i+1} - n_i)$, and it leaves the counter unchanged with probability $1 - 1/(n_{i+1} - n_i)$.
If we select $n_i = i$ for all $i \ge 0$, then the counter is an ordinary one. More interesting situations arise if we select, say, $n_i = 2^{i-1}$ for $i > 0$ or $n_i = F_i$ (the $i$th Fibonacci number; see Section 3.2).
For this problem, assume that $n_{2^b - 1}$ is large enough that the probability of an overflow error is negligible.
a. Show that the expected value represented by the counter after $n$ INCREMENT operations have been performed is exactly $n$.
b. The analysis of the variance of the count represented by the counter depends on the sequence of the $n_i$. Let us consider a simple case: $n_i = 100i$ for all $i \ge 0$. Estimate the variance in the value represented by the register after $n$ INCREMENT operations have been performed.
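The INCREMENT mechanism itself is easy to prototype. The following Python sketch (an illustration under our own parameter choices, not a solution to the parts above) implements the probabilistic counter for the $n_i = 100i$ case of part (b):

    import random

    def increment(counter, n_seq, b):
        """One probabilistic INCREMENT of a b-bit counter whose value i
        represents a count of n_seq[i]; n_seq is an increasing sequence."""
        if counter == 2**b - 1:
            raise OverflowError("counter is full")
        if random.random() < 1.0 / (n_seq[counter + 1] - n_seq[counter]):
            counter += 1                # advance with probability 1/(n_{i+1} - n_i)
        return counter

    b = 8
    n_seq = [100 * i for i in range(2**b)]   # n_i = 100i, as in part (b)
    counter = 0
    for _ in range(10_000):                  # 10,000 INCREMENT operations
        counter = increment(counter, n_seq, b)
    print(f"counter = {counter}, representing a count of {n_seq[counter]}")

After 10,000 operations the represented count comes out close to 10,000, illustrating the unbiasedness claimed in part (a).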
5-2 Searching an unsorted array
This problem examines three algorithms for searching for a value $x$ in an unsorted array $A$ consisting of $n$ elements.
Consider the following randomized strategy: pick a random index $i$ into $A$. If $A[i] = x$, then we terminate; otherwise, we continue the search by picking a new random index into $A$. We continue picking random indices into $A$ until we find an index $j$ such that $A[j] = x$ or until we have checked every element of $A$. Note that we pick from the whole set of indices each time, so that we may examine a given element more than once.
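For concreteness, one possible Python rendering of this strategy follows (readers tackling part (a) may prefer to write their own version first; the helper name random_search is ours):

    import random

    def random_search(A, x):
        """Probe uniformly random indices of A until x is found or every
        index has been probed at least once; return an index of x or None."""
        n = len(A)
        seen = set()
        while len(seen) < n:
            i = random.randrange(n)     # pick from all indices every time
            seen.add(i)
            if A[i] == x:
                return i
        return None                     # every index checked; x is absent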
a. Write pseudocode for a procedure RANDOM-SEARCH to implement the strategy above. Be sure that your algorithm terminates when all indices into $A$ have been picked.
b. Suppose that there is exactly one index $i$ such that $A[i] = x$. What is the expected number of indices into $A$ that we must pick before we find $x$ and RANDOM-SEARCH terminates?
c. Generalizing your solution to part (b), suppose that there are $k \ge 1$ indices $i$ such that $A[i] = x$. What is the expected number of indices into $A$ that we must pick before we find $x$ and RANDOM-SEARCH terminates? Your answer should be a function of $n$ and $k$.
d. Suppose that there are no indices $i$ such that $A[i] = x$. What is the expected number of indices into $A$ that we must pick before we have checked all elements of $A$ and RANDOM-SEARCH terminates?
Now consider a deterministic linear search algorithm, which we refer to as DETERMINISTIC-SEARCH. Specifically, the algorithm searches $A$ for $x$ in order, considering $A[1], A[2], A[3], \ldots, A[n]$ until either it finds $A[i] = x$ or it reaches the end of the array. Assume that all possible permutations of the input array are equally likely.
e. Suppose that there is exactly one index $i$ such that $A[i] = x$. What is the average-case running time of DETERMINISTIC-SEARCH? What is the worst-case running time of DETERMINISTIC-SEARCH?
f. Generalizing your solution to part (e), suppose that there are $k \ge 1$ indices $i$ such that $A[i] = x$. What is the average-case running time of DETERMINISTIC-SEARCH? What is the worst-case running time of DETERMINISTIC-SEARCH? Your answer should be a function of $n$ and $k$.
g. Suppose that there are no indices $i$ such that $A[i] = x$. What is the average-case running time of DETERMINISTIC-SEARCH? What is the worst-case running time of DETERMINISTIC-SEARCH?
Finally, consider a randomized algorithm SCRAMBLE-SEARCH that works by first randomly permuting the input array and then running the deterministic linear search given above on the resulting permuted array.
h. Letting $k$ be the number of indices $i$ such that $A[i] = x$, give the worst-case and expected running times of SCRAMBLE-SEARCH for the cases in which $k = 0$ and $k = 1$. Generalize your solution to handle the case in which $k \ge 1$.
i. Which of the three searching algorithms would you use? Explain your answer.
Chapter notes
Bollobás [53], Hofri [174], and Spencer [321] contain a wealth of advanced probabilistic techniques. The advantages of randomized algorithms are discussed and surveyed by Karp [200] and Rabin [288]. The textbook by Motwani and Raghavan [262] gives an extensive treatment of randomized algorithms.
Several variants of the hiring problem have been widely studied. These problems are more commonly referred to as "secretary problems." An example of work in this area is the paper by Ajtai, Megiddo, and Waarts [11].
II Sorting and Order Statistics

This part presents several algorithms that solve the following sorting problem:
Input: A sequence of $n$ numbers $\langle a_1, a_2, \ldots, a_n \rangle$.

Output: A permutation (reordering) $\langle a_1', a_2', \ldots, a_n' \rangle$ of the input sequence such that $a_1' \le a_2' \le \cdots \le a_n'$.
The input sequence is usually an $n$-element array, although it may be represented in some other fashion, such as a linked list.
The structure of the data
In practice, the numbers to be sorted are rarely isolated values. Each is usually part of a collection of data called a record. Each record contains a key, which is the value to be sorted. The remainder of the record consists of satellite data, which are usually carried around with the key. In practice, when a sorting algorithm permutes the keys, it must permute the satellite data as well. If each record includes a large amount of satellite data, we often permute an array of pointers to the records rather than the records themselves in order to minimize data movement.
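In a language where container elements are references, sorting the container already realizes this pointer-permuting idea. A small Python illustration (with made-up records) follows:

    # Each record couples a key with satellite data. Sorting the list moves
    # only references to the records, not the (possibly large) satellite data.
    records = [
        {"key": 31, "satellite": "payload A"},
        {"key": 7,  "satellite": "payload B"},
        {"key": 19, "satellite": "payload C"},
    ]
    records.sort(key=lambda r: r["key"])   # sort by key; satellite data tags along
    print([r["key"] for r in records])     # prints [7, 19, 31]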
In a sense, it is these implementation details that distinguish an algorithm from a full-blown program. A sorting algorithm describes the method by which we determine the sorted order, regardless of whether we are sorting individual numbers or large records containing many bytes of satellite data. Thus, when focusing on the problem of sorting, we typically assume that the input consists only of numbers.
Translating an algorithm for sorting numbers into a program for sorting records is conceptually straightforward, although in a given engineering situation other subtleties may make the actual programming task a challenge.
Why sorting?
Many computer scientists consider sorting to be the most fundamental problem in the study of algorithms. There are several reasons:
Sometimes an application inherently needs to sort information. For example, in order to prepare customer statements, banks need to sort checks by check number.
Algorithms often use sorting as a key subroutine. For example, a program that renders graphical objects which are layered on top of each other might have to sort the objects according to an “above” relation so that it can draw these objects from bottom to top. We shall see numerous algorithms in this text that use sorting as a subroutine.
We can draw from among a wide variety of sorting algorithms, and they employ a rich set of techniques. In fact, many important techniques used throughout algorithm design appear in the body of sorting algorithms that have been developed over the years. In this way, sorting is also a problem of historical interest.
We can prove a nontrivial lower bound for sorting (as we shall do in Chapter 8).
Our best upper bounds match the lower bound asymptotically, and so we know that our sorting algorithms are asymptotically optimal. Moreover, we can use the lower bound for sorting to prove lower bounds for certain other problems.
Many engineering issues come to the fore when implementing sorting algorithms. The fastest sorting program for a particular situation may depend on many factors, such as prior knowledge about the keys and satellite data, the memory hierarchy (caches and virtual memory) of the host computer, and the software environment. Many of these issues are best dealt with at the algorithmic level, rather than by "tweaking" the code.
Sorting algorithms
We introduced two algorithms that sort $n$ real numbers in Chapter 2. Insertion sort takes $\Theta(n^2)$ time in the worst case. Because its inner loops are tight, however, it is a fast in-place sorting algorithm for small input sizes. (Recall that a sorting algorithm sorts in place if only a constant number of elements of the input array are ever stored outside the array.) Merge sort has a better asymptotic running time, $\Theta(n \lg n)$, but the MERGE procedure it uses does not operate in place.
In this part, we shall introduce two more algorithms that sort arbitrary real numbers. Heapsort, presented in Chapter 6, sorts $n$ numbers in place in $O(n \lg n)$ time. It uses an important data structure, called a heap, with which we can also implement a priority queue.
Quicksort, in Chapter 7, also sorts $n$ numbers in place, but its worst-case running time is $\Theta(n^2)$. Its expected running time is $\Theta(n \lg n)$, however, and it generally outperforms heapsort in practice. Like insertion sort, quicksort has tight code, and so the hidden constant factor in its running time is small. It is a popular algorithm for sorting large input arrays.
Insertion sort, merge sort, heapsort, and quicksort are all comparison sorts: they determine the sorted order of an input array by comparing elements. Chapter 8 begins by introducing the decision-tree model in order to study the performance limitations of comparison sorts. Using this model, we prove a lower bound of $\Omega(n \lg n)$ on the worst-case running time of any comparison sort on $n$ inputs, thus showing that heapsort and merge sort are asymptotically optimal comparison sorts.
Chapter 8 then goes on to show that we can beat this lower bound of $\Omega(n \lg n)$ if we can gather information about the sorted order of the input by means other than comparing elements. The counting sort algorithm, for example, assumes that the input numbers are in the set $\{0, 1, \ldots, k\}$. By using array indexing as a tool for determining relative order, counting sort can sort $n$ numbers in $\Theta(k + n)$ time. Thus, when $k = O(n)$, counting sort runs in time that is linear in the size of the input array. A related algorithm, radix sort, can be used to extend the range of counting sort. If there are $n$ integers to sort, each integer has $d$ digits, and each digit can take on up to $k$ possible values, then radix sort can sort the numbers in $\Theta(d(n + k))$ time. When $d$ is a constant and $k$ is $O(n)$, radix sort runs in linear time. A third algorithm, bucket sort, requires knowledge of the probabilistic distribution of numbers in the input array. It can sort $n$ real numbers uniformly distributed in the half-open interval $[0, 1)$ in average-case $O(n)$ time.
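To make the array-indexing idea concrete, here is a deliberately simplified Python sketch of counting sort (a histogram version for bare integer keys; the stable, prefix-sum formulation developed in Chapter 8 differs):

    def counting_sort(a, k):
        """Sort a list of integers drawn from {0, 1, ..., k} in Theta(k + n)
        time by tallying occurrences and replaying the tallies in order."""
        counts = [0] * (k + 1)
        for x in a:
            counts[x] += 1              # Theta(n): tally each input value
        out = []
        for value, c in enumerate(counts):
            out.extend([value] * c)     # Theta(k + n): emit values in order
        return out

    print(counting_sort([2, 5, 3, 0, 2, 3, 0, 3], k=5))   # [0, 0, 2, 2, 3, 3, 3, 5]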
The following table summarizes the running times of the sorting algorithms from Chapters 2 and 6–8. As usual, $n$ denotes the number of items to sort. For counting sort, the items to sort are integers in the set $\{0, 1, \ldots, k\}$. For radix sort, each item is a $d$-digit number, where each digit takes on $k$ possible values. For bucket sort, we assume that the keys are real numbers uniformly distributed in the half-open interval $[0, 1)$. The rightmost column gives the average-case or expected running time, indicating which it gives when it differs from the worst-case running time. We omit the average-case running time of heapsort because we do not analyze it in this book.
Algorithm        Worst-case running time    Average-case/expected running time
Insertion sort   $\Theta(n^2)$              $\Theta(n^2)$
Merge sort       $\Theta(n \lg n)$          $\Theta(n \lg n)$
Heapsort         $O(n \lg n)$               —
Quicksort        $\Theta(n^2)$              $\Theta(n \lg n)$ (expected)
Counting sort    $\Theta(k + n)$            $\Theta(k + n)$
Radix sort       $\Theta(d(n + k))$         $\Theta(d(n + k))$
Bucket sort      $\Theta(n^2)$              $\Theta(n)$ (average-case)
Order statistics
The $i$th order statistic of a set of $n$ numbers is the $i$th smallest number in the set. We can, of course, select the $i$th order statistic by sorting the input and indexing the $i$th element of the output. With no assumptions about the input distribution, this method runs in $\Omega(n \lg n)$ time, as the lower bound proved in Chapter 8 shows. In Chapter 9, we show that we can find the $i$th smallest element in $O(n)$ time, even when the elements are arbitrary real numbers. We present a randomized algorithm with tight pseudocode that runs in $\Theta(n^2)$ time in the worst case, but whose expected running time is $O(n)$. We also give a more complicated algorithm that runs in $O(n)$ worst-case time.
Background
Although most of this part does not rely on difficult mathematics, some sections do require mathematical sophistication. In particular, analyses of quicksort, bucket sort, and the order-statistic algorithm use probability, which is reviewed in Appendix C, and the material on probabilistic analysis and randomized algorithms in Chapter 5. The analysis of the worst-case linear-time algorithm for order statistics involves somewhat more sophisticated mathematics than the other worst-case analyses in this part.
6 Heapsort

In this chapter, we introduce another sorting algorithm: heapsort. Like merge sort, but unlike insertion sort, heapsort's running time is $O(n \lg n)$. Like insertion sort, but unlike merge sort, heapsort sorts in place: only a constant number of array elements are stored outside the input array at any time. Thus, heapsort combines the better attributes of the two sorting algorithms we have already discussed.

Heapsort also introduces another algorithm design technique: using a data structure, in this case one we call a "heap," to manage information. Not only is the heap data structure useful for heapsort, but it also makes an efficient priority queue. The heap data structure will reappear in algorithms in later chapters.

The term "heap" was originally coined in the context of heapsort, but it has since come to refer to "garbage-collected storage," such as the programming languages Java and Lisp provide. Our heap data structure is not garbage-collected storage, and whenever we refer to heaps in this book, we shall mean a data structure rather than an aspect of garbage collection.