94. Consider a deck consisting of seven cards, marked 1, 2, . . . , 7. Three of these cards are selected at random. Define an rv W by W = the sum of the resulting numbers, and compute the pmf of W. Then compute μ and σ². [Hint: Consider outcomes as unordered, so that (1, 3, 7) and (3, 1, 7) are not different outcomes. Then there are 35 outcomes, and they can be listed. (This type of rv actually arises in connection with a statistical procedure called Wilcoxon’s rank-sum test, in which there is an x sample and a y sample and W is the sum of the ranks of the x’s in the combined sample; see Section 15.2.)]
95. After shuffling a deck of 52 cards, a dealer deals out 5. Let X = the number of suits represented in the five-card hand.
a. Show that the pmf of X is
x      1     2     3     4
p(x)   .002  .146  .588  .264
[Hint: p(1) = 4P(all are spades), p(2) = 6P(only spades and hearts with at least one of each suit), and p(4) = 4P(2 spades ∩ one of each other suit).]
b. Compute μ, σ², and σ.
96. The negative binomial rv X was defined as the number of F’s preceding the rth S. Let Y = the number of trials necessary to obtain the rth S. In the same manner in which the pmf of X was derived, derive the pmf of Y.
97. Of all customers purchasing automatic garage-door openers, 75% purchase a chain-driven model. Let X = the number among the next 15 purchasers who select the chain-driven model.
a. What is the pmf of X?
b. Compute P(X > 10).
c. Compute P(6 ≤ X ≤ 10).
d. Compute μ and σ².
e. If the store currently has in stock 10 chain-driven models and 8 shaft-driven models, what is the probability that the requests of these 15 customers can all be met from existing stock?
98. A friend recently planned a camping trip. He had two flashlights, one that required a single 6-V battery and another that used two size-D batteries. He had previously packed two 6-V and four size-D batteries in his camper. Suppose the probability that any particular battery works is p and that batteries work or fail independently of one another. Our friend wants to take just one flashlight. For what values of p should he take the 6-V flashlight?
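The listing suggested in the hint to Exercise 94 can also be carried out by machine. The following is a minimal sketch, not a prescribed solution method; the function name `rank_sum_pmf` and its parameters are illustrative choices, and exact fractions are used so that no rounding enters the pmf.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def rank_sum_pmf(n_cards=7, n_drawn=3):
    """Exact pmf of W = sum of n_drawn cards drawn without replacement from 1..n_cards.

    Outcomes are treated as unordered, as the hint recommends, so each
    combination is equally likely.
    """
    outcomes = list(combinations(range(1, n_cards + 1), n_drawn))
    counts = Counter(sum(cards) for cards in outcomes)
    total = len(outcomes)
    return {w: Fraction(k, total) for w, k in sorted(counts.items())}


pmf = rank_sum_pmf()
mean = sum(w * p for w, p in pmf.items())                # mu
variance = sum((w - mean) ** 2 * p for w, p in pmf.items())  # sigma^2

for w, p in pmf.items():
    print(w, p)
print("mean:", mean, "variance:", variance)
```

Changing `n_cards` and `n_drawn` gives the corresponding pmf for other sample sizes, which is the setting of the rank-sum statistic mentioned in the hint.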
99. A k-out-of-n system is one that will function if and only if at least k of the n individual components in the system function. If individual components function independently of one another, each with probability .9, what is the probability that a 3-out-of-5 system functions?
100. A manufacturer of integrated circuit chips wishes to control the quality of its product by rejecting any batch in which the proportion of defective chips is too high. To this end, out of each batch (10,000 chips), 25 will be selected and tested. If at least 5 of these 25 are defective, the entire batch will be rejected.
a. What is the probability that a batch will be rejected if 5% of the chips in the batch are in fact defective?
b. Answer the question posed in (a) if the percentage of defective chips in the batch is 10%.
c. Answer the question posed in (a) if the percentage of defective chips in the batch is 20%.
d. What happens to the probabilities in (a)–(c) if the critical rejection number is increased from 5 to 6?
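Exercises 99 and 100 both reduce to an upper-tail binomial probability P(X ≥ k). The sketch below shows one way to evaluate such tails numerically. It assumes that the 25 chips sampled from a batch of 10,000 can be treated as independent trials (a binomial approximation to sampling without replacement); the helper names `binom_pmf` and `binom_upper_tail` are illustrative, not taken from the text.

```python
from math import comb


def binom_pmf(x, n, p):
    """b(x; n, p): probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Bin(n, p)."""
    return sum(binom_pmf(x, n, p) for x in range(k, n + 1))


# Exercise 99: a 3-out-of-5 system with component reliability .9.
system_reliability = binom_upper_tail(3, 5, 0.9)

# Exercise 100: batch rejected when at least `critical` of 25 sampled chips are defective.
# Binomial model assumed here (sample is small relative to the batch).
rejection = {
    critical: {p: binom_upper_tail(critical, 25, p) for p in (0.05, 0.10, 0.20)}
    for critical in (5, 6)
}

print(round(system_reliability, 5))
for critical, by_p in rejection.items():
    print(critical, {p: round(prob, 3) for p, prob in by_p.items()})
```

Raising the critical number can only shrink the rejection region, so each rejection probability for `critical = 6` is smaller than its counterpart for `critical = 5`.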
101. Of the people passing through an airport metal detector, .5% activate it; let X = the number among a randomly selected group of 500 who activate the detector.
a. What is the (approximate) pmf of X?
b. Compute P(X = 5).
c. Compute P(5 ≤ X).
102. An educational consulting firm is trying to decide whether high school students who have never before used a hand-held calculator can solve a certain type of problem more easily with a calculator that uses reverse Polish logic or one that does not use this logic. A sample of 25 students is selected and allowed to practice on both calculators. Then each student is asked to work one problem on the reverse Polish calculator and a similar problem on the other. Let p = P(S), where S indicates that a student worked the problem more quickly using reverse Polish logic than without, and let X = number of S’s.
a. If p = .5, what is P(7 ≤ X ≤ 18)?
b. If p = .8, what is P(7 ≤ X ≤ 18)?
c. If the claim that p = .5 is to be rejected when either x ≤ 7 or x ≥ 18, what is the probability of rejecting the claim when it is actually correct?
d. If the decision to reject the claim p = .5 is made as in part (c), what is the probability that the claim is not rejected when p = .6? When p = .8?
e. What decision rule would you choose for rejecting the claim p = .5 if you wanted the probability in part (c) to be at most .01?
103. Consider a disease whose presence can be identified by carrying out a blood test. Let p denote the probability that a randomly selected individual has the disease. Suppose n individuals are independently selected for testing. One way to proceed is to carry out a separate test on each of the n blood samples. A potentially more economical approach, group testing, was introduced during World War II to identify syphilitic men among army inductees. First, take a part of each blood sample, combine these specimens, and carry out a single test. If no one has the disease, the result will be negative, and only the one test is required. If at least one individual is diseased, the test on the combined sample will yield a positive result, in which case the n individual tests are then carried out. If p = .1 and n = 3, what is the expected number of tests using this procedure? What is the expected number when n = 5? [The article “Random Multiple-Access Communication and Group Testing” (IEEE Trans. on Commun., 1984: 769–774) applied these ideas to a communication system in which the dichotomy was active/idle user rather than diseased/nondiseased.]
104. Let p1 denote the probability that any particular code symbol is erroneously transmitted through a communication system. Assume that on different symbols, errors occur independently of one another. Suppose also that with probability p2 an erroneous symbol is corrected upon receipt. Let X denote the number of correct symbols in a message block consisting of n symbols (after the correction process has ended). What is the probability distribution of X?
105. The purchaser of a power-generating unit requires c consecutive successful start-ups before the unit will be accepted. Assume that the outcomes of individual start-ups are independent of one another. Let p denote the probability that any particular start-up is successful. The random variable of interest is X = the number of start-ups that must be made prior to acceptance. Give the pmf of X for the case c = 2. If p = .9, what is P(X ≤ 8)? [Hint: For x ≥ 5, express p(x) “recursively” in terms of the pmf evaluated at the smaller values x − 3, x − 4, . . . , 2.] (This problem was suggested by the article “Evaluation of a Start-Up Demonstration Test,” J. Quality Technology, 1983: 103–106.)
106. A plan for an executive travelers’ club has been developed by an airline on the premise that 10% of its current customers would qualify for membership.
a. Assuming the validity of this premise, among 25 randomly selected current customers, what is the probability that between 2 and 6 (inclusive) qualify for membership?
b. Again assuming the validity of the premise, what are the expected number of customers who qualify and the standard deviation of the number who qualify in a random sample of 100 current customers?
c. Let X denote the number in a random sample of 25 current customers who qualify for membership. Consider rejecting the company’s premise in favor of the claim that p > .10 if x ≥ 7. What is the probability that the company’s premise is rejected when it is actually valid?
d. Refer to the decision rule introduced in part (c). What is the probability that the company’s premise is not rejected even though p = .20 (i.e., 20% qualify)?
107. Forty percent of seeds from maize (modern-day corn) ears carry single spikelets, and the other 60% carry paired spikelets. A seed with single spikelets will produce an ear with single spikelets 29% of the time, whereas a seed with paired spikelets will produce an ear with single spikelets 26% of the time. Consider randomly selecting ten seeds.
a. What is the probability that exactly five of these seeds carry a single spikelet and produce an ear with a single spikelet?
b. What is the probability that exactly five of the ears produced by these seeds have single spikelets? What is the probability that at most five ears have single spikelets?
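Exercise 107 combines the multiplication rule and the law of total probability with a binomial count. The sketch below is one possible numerical check under the assumption that the ten seeds behave independently, so each count is binomial with n = 10; the constant and function names are illustrative.

```python
from math import comb

P_SINGLE_SEED = 0.40               # seed carries single spikelets
P_EAR_SINGLE_GIVEN_SINGLE = 0.29   # ear has single spikelets, given a single-spikelet seed
P_EAR_SINGLE_GIVEN_PAIRED = 0.26   # ear has single spikelets, given a paired-spikelet seed

# Multiplication rule: seed is single AND ear is single.
p_both = P_SINGLE_SEED * P_EAR_SINGLE_GIVEN_SINGLE

# Law of total probability: ear is single, whatever the seed type.
p_ear_single = p_both + (1 - P_SINGLE_SEED) * P_EAR_SINGLE_GIVEN_PAIRED


def binom_pmf(x, n, p):
    """b(x; n, p): probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


part_a = binom_pmf(5, 10, p_both)
part_b_exactly_five = binom_pmf(5, 10, p_ear_single)
part_b_at_most_five = sum(binom_pmf(x, 10, p_ear_single) for x in range(6))

print(round(part_a, 4), round(part_b_exactly_five, 4), round(part_b_at_most_five, 4))
```

The two parts differ only in the per-seed success probability: part (a) uses the joint probability, while part (b) uses the marginal probability of a single-spikelet ear.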
108. A trial has just resulted in a hung jury because eight members of the jury were in favor of a guilty verdict and the other four were for acquittal. If the jurors leave the jury room in random order and each of the first four leaving the room is accosted by a reporter in quest of an interview, what is the pmf of X = the number of jurors favoring acquittal among those interviewed? How many of those favoring acquittal do you expect to be interviewed?
109. A reservation service employs five information operators who receive requests for information independently of one another, each according to a Poisson process with rate α = 2 per minute.
a. What is the probability that during a given 1-min period, the first operator receives no requests?
b. What is the probability that during a given 1-min period, exactly four of the five operators receive no requests?
c. Write an expression for the probability that during a given 1-min period, all of the operators receive exactly the same number of requests.
110. Grasshoppers are distributed at random in a large field according to a Poisson process with parameter α = 2 per square yard. How large should the radius R of a circular sampling region be taken so that the probability of finding at least one in the region equals .99?
111. A newsstand has ordered five copies of a certain issue of a photography magazine. Let X = the number of individuals who come in to purchase this magazine. If X has a Poisson distribution with parameter μ = 4, what is the expected number of copies that are sold?
112. Individuals A and B begin to play a sequence of chess games. Let S = {A wins a game}, and suppose that outcomes of successive games are independent with P(S) = p and P(F) = 1 − p (they never draw). They will play until one of them wins ten games. Let X = the number of games played (with possible values 10, 11, . . . , 19).
a. For x = 10, 11, . . . , 19, obtain an expression for p(x) = P(X = x).
b. If a draw is possible, with p = P(S), q = P(F), 1 − p − q = P(draw), what are the possible values of X? What is P(20 ≤ X)? [Hint: P(20 ≤ X) = 1 − P(X < 20).]
113. A test for the presence of a certain disease has probability .20 of giving a false-positive reading (indicating that an individual has the disease when this is not the case) and probability .10 of giving a false-negative result. Suppose that ten individuals are tested, five of whom have the disease and five of whom do not. Let X = the number of positive readings that result.
a. Does X have a binomial distribution? Explain your reasoning.
b. What is the probability that exactly three of the ten test results are positive?
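In Exercise 113 the ten trials do not share a common success probability, which is the point of part (a). One way to handle part (b) is to write X as the sum of two independent binomial counts, one for the five diseased individuals and one for the five others, and combine their pmfs by convolution. The sketch below assumes the test results are independent across individuals; the function names are illustrative.

```python
from math import comb


def binom_pmf_list(n, p):
    """List [b(0; n, p), ..., b(n; n, p)]."""
    return [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]


def convolve(a, b):
    """pmf of the sum of two independent counts with pmfs a and b (indexed from 0)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] += a_i * b_j
    return out


# Diseased individuals test positive unless a false negative occurs (prob .10).
positives_among_diseased = binom_pmf_list(5, 1 - 0.10)
# Nondiseased individuals test positive only through a false positive (prob .20).
positives_among_healthy = binom_pmf_list(5, 0.20)

pmf_x = convolve(positives_among_diseased, positives_among_healthy)
p_exactly_three = pmf_x[3]

print(round(p_exactly_three, 4))
```

The same convolution gives the whole pmf of X, so any other probability about the number of positive readings can be read from `pmf_x`.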
114. The generalized negative binomial pmf is given by
$$nb(x; r, p) = k(r, x) \cdot p^r (1 - p)^x \qquad x = 0, 1, 2, \ldots$$
Let X, the number of plants of a certain species found in a particular region, have this distribution with p = .3 and r = 2.5. What is P(X = 4)? What is the probability that at least one plant is found?
115. There are two Certified Public Accountants in a particular office who prepare tax returns for clients. Suppose that for a particular type of complex form, the number of errors made by the first preparer has a Poisson distribution with mean value μ1, the number of errors made by the second preparer has a Poisson distribution with mean value μ2, and that each CPA prepares the same number of forms of this type. Then if a form of this type is randomly selected, the function
$$p(x; \mu_1, \mu_2) = .5\,\frac{e^{-\mu_1}\mu_1^x}{x!} + .5\,\frac{e^{-\mu_2}\mu_2^x}{x!} \qquad x = 0, 1, 2, \ldots$$
gives the pmf of X = the number of errors on the selected form.
a. Verify that p(x; μ1, μ2) is in fact a legitimate pmf (≥ 0 and sums to 1).
b. What is the expected number of errors on the selected form?
c. What is the variance of the number of errors on the selected form?
d. How does the pmf change if the first CPA prepares 60% of all such forms and the second prepares 40%?
116. The mode of a discrete random variable X with pmf p(x) is that value x* for which p(x) is largest (the most probable x value).
a. Let X ~ Bin(n, p). By considering the ratio b(x + 1; n, p)/b(x; n, p), show that b(x; n, p) increases with x as long as x < np − (1 − p). Conclude that the mode x* is the integer satisfying (n + 1)p − 1 ≤ x* ≤ (n + 1)p.
b. Show that if X has a Poisson distribution with parameter μ, the mode is the largest integer less than μ. If μ is an integer, show that both μ − 1 and μ are modes.
117. A computer disk storage device has ten concentric tracks, numbered 1, 2, . . . , 10 from outermost to innermost, and a single access arm. Let p_i = the probability that any particular request for data will take the arm to track i (i = 1, . . . , 10). Assume that the tracks accessed in successive seeks are independent. Let X = the number of tracks over which the access arm passes during two successive requests (excluding the track that the arm has just left, so possible X values are x = 0, 1, . . . , 9). Compute the pmf of X. [Hint: P(the arm is now on track i and X = j) = P(X = j | arm now on i) · p_i. After the conditional probability is written in terms of p_1, . . . , p_10, by the law of total probability, the desired probability is obtained by summing over i.]
118. If X is a hypergeometric rv, show directly from the definition that E(X) = nM/N (consider only the case n < M). [Hint: Factor nM/N out of the sum for E(X), and show that the terms inside the sum are of the form h(y; n − 1, M − 1, N − 1), where y = x − 1.]
119. Use the fact that
$$\sum_{\text{all } x} (x - \mu)^2 p(x) \ge \sum_{x:\, |x - \mu| \ge k\sigma} (x - \mu)^2 p(x)$$
to prove Chebyshev’s inequality given in Exercise 44.
120. The simple Poisson process of Section 3.6 is characterized by a constant rate α at which events occur per unit time. A generalization of this is to suppose that the probability of exactly one event occurring in the interval [t, t + Δt] is α(t) · Δt + o(Δt). It can then be shown that the number of events occurring during an interval [t1, t2] has a Poisson distribution with parameter
$$\mu = \int_{t_1}^{t_2} \alpha(t)\, dt$$
The occurrence of events over time in this situation is called a nonhomogeneous Poisson process. The article “Inference Based on Retrospective Ascertainment,” J. Amer. Stat. Assoc., 1989: 360–372, considers the intensity function
$$\alpha(t) = e^{a + bt}$$
as appropriate for events involving transmission of HIV (the AIDS virus) via blood transfusions. Suppose that a = 2 and b = .6 (close to values suggested in the paper), with time in years.
a. What is the expected number of events in the interval [0, 4]? In [2, 6]?
b. What is the probability that at most 15 events occur in the interval [0, .9907]?
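For the intensity function of Exercise 120, the Poisson parameter has a closed form, since the integral of e^{a+bt} from t1 to t2 equals (e^{a+b·t2} − e^{a+b·t1})/b. The following sketch evaluates that expression and the resulting Poisson cdf; the function names are illustrative, and the default values a = 2, b = .6 are the ones stated in the exercise.

```python
from math import exp, factorial


def expected_events(t1, t2, a=2.0, b=0.6):
    """mu = integral of alpha(t) = exp(a + b*t) over [t1, t2], in closed form."""
    return (exp(a + b * t2) - exp(a + b * t1)) / b


def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu)."""
    return sum(exp(-mu) * mu**x / factorial(x) for x in range(k + 1))


mu_0_4 = expected_events(0, 4)
mu_2_6 = expected_events(2, 6)
mu_short = expected_events(0, 0.9907)
p_at_most_15 = poisson_cdf(15, mu_short)

print(round(mu_0_4, 2), round(mu_2_6, 2))
print(round(mu_short, 3), round(p_at_most_15, 3))
```

Although [0, 4] and [2, 6] have the same length, the expected counts differ because the intensity grows with t; this is what distinguishes the nonhomogeneous process from the constant-rate process of Section 3.6.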
121. Consider a collection A1, . . . , Ak of mutually exclusive and exhaustive events, and a random variable X whose distribution depends on which of the Ai’s occurs (e.g., a commuter might select one of three possible routes from home to work, with X representing the commute time). Let E(X | Ai) denote the expected value of X given that the event Ai occurs. Then it can be shown that E(X) = Σ E(X | Ai) · P(Ai), the weighted average of the individual “conditional expectations” where the weights are the probabilities of the partitioning events.
a. The expected duration of a voice call to a particular telephone number is 3 minutes, whereas the expected duration of a data call to that same number is 1 minute. If 75% of all calls are voice calls, what is the expected duration of the next call?
b. A deli sells three different types of chocolate chip cookies. The number of chocolate chips in a type i cookie has a Poisson distribution with parameter μi = i + 1 (i = 1, 2, 3). If 20% of all customers purchasing a chocolate chip cookie select the first type, 50% choose the second type, and the remaining 30% opt for the third type, what is the expected number of chips in a cookie purchased by the next customer?
122. Consider a communication source that transmits packets containing digitized speech. After each transmission, the receiver sends a message indicating whether the transmission was successful or unsuccessful. If a transmission is unsuccessful, the packet is re-sent. Suppose a voice packet can be transmitted a maximum of 10 times. Assuming that the results of successive transmissions are independent of one another and that the probability of any particular transmission being successful is p, determine the probability mass function of the rv X = the number of times a packet is transmitted. Then obtain an expression for the expected number of times a packet is transmitted.
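The random variable in Exercise 122 behaves like a geometric count that is cut off at 10. The sketch below builds one candidate pmf numerically and compares its mean with a closed-form expression. It rests on an interpretation that should be checked against your own derivation: X = x for x < 10 means the first success occurs on transmission x, and X = 10 whenever the first nine transmissions fail, whatever happens on the tenth. The function names are illustrative.

```python
def transmission_pmf(p, max_tries=10):
    """pmf of X = number of transmissions, with success probability p per try.

    Assumed interpretation: transmission stops at the first success or after
    max_tries attempts, whichever comes first.
    """
    pmf = {x: (1 - p) ** (x - 1) * p for x in range(1, max_tries)}
    pmf[max_tries] = (1 - p) ** (max_tries - 1)  # first max_tries - 1 attempts all fail
    return pmf


def expected_transmissions(p, max_tries=10):
    """E(X) computed directly from the pmf."""
    return sum(x * px for x, px in transmission_pmf(p, max_tries).items())


def expected_transmissions_closed_form(p, max_tries=10):
    """E(X) = sum over k = 0..max_tries-1 of P(X > k) = (1 - (1 - p)**max_tries) / p."""
    return (1 - (1 - p) ** max_tries) / p


for p in (0.5, 0.9):
    print(p, expected_transmissions(p), expected_transmissions_closed_form(p))
```

Agreement between the two functions for several values of p is a quick sanity check on an algebraic expression for E(X).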
Bibliography
Johnson, Norman, Samuel Kotz, and Adrienne Kemp, Discrete Univariate Distributions, Wiley, New York, 1992. An encyclopedia of information on discrete distributions.
Olkin, Ingram, Cyrus Derman, and Leon Gleser, Probability Models and Applications (2nd ed.), Macmillan, New York, 1994. Contains an in-depth discussion of both general properties of discrete and continuous distributions and results for specific distributions.
Ross, Sheldon, Introduction to Probability Models (9th ed.), Academic Press, New York, 2007. A good source of material on the Poisson process and generalizations and a nice introduction to other topics in applied probability.
4 Continuous Random Variables and Probability Distributions
INTRODUCTION
Chapter 3 concentrated on the development of probability distributions for discrete random variables. In this chapter, we consider the second general type of random variable that arises in many applied problems. Sections 4.1 and 4.2 present the basic definitions and properties of continuous random variables and their probability distributions. In Section 4.3, we study in detail the normal random variable and distribution, unquestionably the most important and useful in probability and statistics. Sections 4.4 and 4.5 discuss some other continuous distributions that are often used in applied work. In Section 4.6, we introduce a method for assessing whether given sample data is consistent with a specified distribution.
A discrete random variable (rv) is one whose possible values either constitute a finite set or else can be listed in an infinite sequence (a list in which there is a first element, a second element, etc.). A random variable whose set of possible values is an entire interval of numbers is not discrete.
Recall from Chapter 3 that a random variable X is continuous if (1) possible values comprise either a single interval on the number line (for some A < B, any number x between A and B is a possible value) or a union of disjoint intervals, and (2) P(X = c) = 0 for any number c that is a possible value of X.
Example 4.1
If in the study of the ecology of a lake, we make depth measurements at randomly chosen locations, then X = the depth at such a location is a continuous rv. Here A is the minimum depth in the region being sampled, and B is the maximum depth. ■
Example 4.2
If a chemical compound is randomly selected and its pH X is determined, then X is a continuous rv because any pH value between 0 and 14 is possible. If more is known about the compound selected for analysis, then the set of possible values might be a subinterval of [0, 14], but X would still be continuous. ■
Example 4.3
Let X represent the amount of time a randomly selected customer spends waiting for a haircut before his/her haircut commences. Your first thought might be that X is a continuous random variable, since a measurement is required to determine its value. However, there are customers lucky enough to have no wait whatsoever before climbing into the barber’s chair. So it must be the case that P(X = 0) > 0. Conditional on no chairs being empty, though, the waiting time will be continuous since X could then assume any value between some minimum possible time A and a maximum possible time B. This random variable is neither purely discrete nor purely continuous but instead is a mixture of the two types. ■
One might argue that although in principle variables such as height, weight, and temperature are continuous, in practice the limitations of our measuring instruments restrict us to a discrete (though sometimes very finely subdivided) world. However, continuous models often approximate real-world situations very well, and continuous mathematics (the calculus) is frequently easier to work with than mathematics of discrete variables and distributions.
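The mixed random variable of Example 4.3 can be made concrete with a small simulation. Everything numerical in the sketch below is a hypothetical choice made only for illustration and does not come from the text: the probability .3 of no wait, the limits A = 2 and B = 20 minutes, and the uniform shape of the positive waiting times.

```python
import random

# Hypothetical values, chosen only to illustrate Example 4.3.
P_NO_WAIT = 0.3      # assumed probability that a chair is free on arrival
A, B = 2.0, 20.0     # assumed minimum and maximum positive waiting times (minutes)


def simulate_wait(rng):
    """One waiting time: 0 with probability P_NO_WAIT, otherwise uniform on [A, B]."""
    if rng.random() < P_NO_WAIT:
        return 0.0
    return rng.uniform(A, B)


rng = random.Random(0)  # fixed seed so the run is reproducible
waits = [simulate_wait(rng) for _ in range(10_000)]

share_zero = sum(w == 0.0 for w in waits) / len(waits)
share_exactly_ten = sum(w == 10.0 for w in waits) / len(waits)

print("share of waits equal to 0: ", share_zero)
print("share of waits equal to 10:", share_exactly_ten)
```

The value 0 carries positive probability, as a discrete value would, whereas any single value inside the interval from A to B, such as 10, is essentially never observed, in keeping with condition (2) of the definition of a continuous rv.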