Probability Distributions for Continuous Variables
4.1 Probability Density Functions
A discrete random variable (rv) is one whose possible values either constitute a finite set or else can be listed in an infinite sequence (a list in which there is a first element, a second element, etc.). A random variable whose set of possible values is an entire interval of numbers is not discrete.
Recall from Chapter 3 that a random variable X is continuous if (1) its possible values comprise either a single interval on the number line (for some $A < B$, any number x between A and B is a possible value) or a union of disjoint intervals, and (2) $P(X = c) = 0$ for any number c that is a possible value of X.
Example 4.1
If in the study of the ecology of a lake, we make depth measurements at randomly chosen locations, then the depth at such a location is a continuous rv. Here A is the minimum depth in the region being sampled, and B is the maximum depth. ■
Example 4.2
If a chemical compound is randomly selected and its pH X is determined, then X is a continuous rv because any pH value between 0 and 14 is possible. If more is known about the compound selected for analysis, then the set of possible values might be a subinterval of [0, 14], but X would still be continuous. ■
Example 4.3
Let X represent the amount of time a randomly selected customer spends waiting for a haircut before his/her haircut commences. Your first thought might be that X is a continuous random variable, since a measurement is required to determine its value. However, there are customers lucky enough to have no wait whatsoever before climbing into the barber’s chair. So it must be the case that $P(X = 0) > 0$. Conditional on no chairs being empty, though, the waiting time will be continuous since X could then assume any value between some minimum possible time A and a maximum possible time B. This random variable is neither purely discrete nor purely continuous but instead is a mixture of the two types. ■
One might argue that although in principle variables such as height, weight, and temperature are continuous, in practice the limitations of our measuring instruments restrict us to a discrete (though sometimes very finely subdivided) world. However, continuous models often approximate real-world situations very well, and continuous mathematics (the calculus) is frequently easier to work with than mathematics of discrete variables and distributions.
Suppose X is the depth of a lake at a randomly chosen point, with possible values between 0 and a maximum depth M. If depth is measured to the nearest meter, the resulting discrete distribution can be pictured by a probability histogram such as the one in Figure 4.1(a). If depth is instead measured to the nearest centimeter, the rectangles become much narrower, and a possible histogram is pictured in Figure 4.1(b); it has a much smoother appearance than the histogram in Figure 4.1(a). If we continue in this way to measure depth more and more finely, the resulting sequence of histograms approaches a smooth curve, such as is pictured in Figure 4.1(c). Because for each histogram the total area of all rectangles equals 1, the total area under the smooth curve is also 1. The probability that the depth at a randomly chosen point is between a and b is just the area under the smooth curve between a and b. It is exactly a smooth curve of the type pictured in Figure 4.1(c) that specifies a continuous probability distribution.
[Figure 4.1: three panels (a), (b), (c), each drawn over a depth axis running from 0 to M]
Figure 4.1 (a) Probability histogram of depth measured to the nearest meter; (b) probability histogram of depth measured to the nearest centimeter; (c) a limit of a sequence of discrete histograms
DEFINITION
Let X be a continuous rv. Then a probability distribution or probability density function (pdf) of X is a function f(x) such that for any two numbers a and b with $a \le b$,
$$P(a \le X \le b) = \int_a^b f(x)\,dx$$
That is, the probability that X takes on a value in the interval [a, b] is the area above this interval and under the graph of the density function, as illustrated in Figure 4.2. The graph of f(x) is often referred to as the density curve.
[Figure 4.2: density curve f(x) plotted against x, with the region between a and b shaded]
Figure 4.2 $P(a \le X \le b)$ = the area under the density curve between a and b
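As a rough numerical illustration of this definition, the sketch below approximates $P(a \le X \le b)$ by a midpoint Riemann sum. The function name `prob_between` and the example density $f(x) = 2x$ on $[0, 1]$ are assumptions made purely for illustration; neither appears in the text.

```python
from typing import Callable


def prob_between(f: Callable[[float], float], a: float, b: float, n: int = 100_000) -> float:
    """Approximate the integral of f over [a, b] with a midpoint Riemann sum."""
    if a > b:
        raise ValueError("requires a <= b")
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


def example_pdf(x: float) -> float:
    """Hypothetical density (an assumption): f(x) = 2x on [0, 1], 0 otherwise."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


# Exact value is 0.75**2 - 0.25**2 = 0.5
p = prob_between(example_pdf, 0.25, 0.75)
# Total area under the density; exactly 1 for a legitimate pdf
total = prob_between(example_pdf, 0.0, 1.0)
print(round(p, 6), round(total, 6))
```

A midpoint sum is only one of several possible quadrature rules; it is used here because it maps directly onto the picture of ever-narrower histogram rectangles.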
For f(x) to be a legitimate pdf, it must satisfy the following two conditions:
1. $f(x) \ge 0$ for all x
2. $\int_{-\infty}^{\infty} f(x)\,dx$ = area under the entire graph of f(x) = 1
Example 4.4
The direction of an imperfection with respect to a reference line on a circular object such as a tire, brake rotor, or flywheel is, in general, subject to uncertainty. Consider the reference line connecting the valve stem on a tire to the center point, and let X be the angle measured clockwise to the location of an imperfection. One possible pdf for X is
$$f(x) = \begin{cases} \dfrac{1}{360} & 0 \le x < 360 \\ 0 & \text{otherwise} \end{cases}$$
The pdf is graphed in Figure 4.3. Clearly $f(x) \ge 0$. The area under the density curve is just the area of a rectangle: (height)(base) $= \left(\frac{1}{360}\right)(360) = 1$. The probability that the angle is between 90° and 180° is
$$P(90 \le X \le 180) = \int_{90}^{180} \frac{1}{360}\,dx = \left. \frac{x}{360} \right|_{x=90}^{x=180} = \frac{1}{4} = .25$$
The probability that the angle of occurrence is within 90° of the reference line is
$$P(0 \le X \le 90) + P(270 \le X < 360) = .25 + .25 = .50$$
[Figure 4.3: two plots of f(x) against x; the first shows the constant height 1/360 over the interval from 0 to 360, the second shades the area $P(90 \le X \le 180)$]
Figure 4.3 The pdf and probability from Example 4.4
■
Because, whenever $0 \le a \le b \le 360$ in Example 4.4, $P(a \le X \le b)$ depends only on the width $b - a$ of the interval, X is said to have a uniform distribution.
DEFINITION
A continuous rv X is said to have a uniform distribution on the interval [A, B] if the pdf of X is
$$f(x; A, B) = \begin{cases} \dfrac{1}{B - A} & A \le x \le B \\ 0 & \text{otherwise} \end{cases}$$
The graph of any uniform pdf looks like the graph in Figure 4.3 except that the interval of positive density is [A, B] rather than [0, 360].
In the discrete case, a probability mass function (pmf) tells us how little “blobs” of probability mass of various magnitudes are distributed along the measurement axis. In the continuous case, probability density is “smeared” in a continuous fashion along the interval of possible values. When density is smeared uniformly over the interval, a uniform pdf, as in Figure 4.3, results.
When X is a discrete random variable, each possible value is assigned positive probability. This is not true of a continuous random variable (that is, the second
condition of the definition is satisfied) because the area under a density curve that lies above any single value is zero:
$$P(X = c) = \int_c^c f(x)\,dx = \lim_{\varepsilon \to 0} \int_{c-\varepsilon}^{c+\varepsilon} f(x)\,dx = 0$$
The fact that $P(X = c) = 0$ when X is continuous has an important practical consequence: The probability that X lies in some interval between a and b does not depend on whether the lower limit a or the upper limit b is included in the probability calculation:
$$P(a \le X \le b) = P(a < X < b) = P(a < X \le b) = P(a \le X < b) \qquad (4.1)$$
If X is discrete and both a and b are possible values (e.g., X is binomial with $n = 20$ and $a = 5$, $b = 10$), then all four of the probabilities in (4.1) are different.
The zero probability condition has a physical analog. Consider a solid circular rod with cross-sectional area $= 1\ \text{in}^2$. Place the rod alongside a measurement axis and suppose that the density of the rod at any point x is given by the value f(x) of a density function. Then if the rod is sliced at points a and b and this segment is removed, the amount of mass removed is $\int_a^b f(x)\,dx$; if the rod is sliced just at the point c, no mass is removed. Mass is assigned to interval segments of the rod but not to individual points.
Example 4.5
“Time headway” in traffic flow is the elapsed time between the time that one car finishes passing a fixed point and the instant that the next car begins to pass that point. Let X = the time headway for two randomly chosen consecutive cars on a freeway during a period of heavy flow. The following pdf of X is essentially the one suggested in “The Statistical Properties of Freeway Traffic” (Transp. Res., vol. 11: 221–228):
$$f(x) = \begin{cases} .15e^{-.15(x - .5)} & x \ge .5 \\ 0 & \text{otherwise} \end{cases}$$
The graph of f(x) is given in Figure 4.4; there is no density associated with headway times less than .5, and headway density decreases rapidly (exponentially fast) as x increases from .5. Clearly, $f(x) \ge 0$; to show that $\int_{-\infty}^{\infty} f(x)\,dx = 1$, we use the calculus result $\int_a^{\infty} e^{-kx}\,dx = (1/k)e^{-k \cdot a}$. Then
$$\int_{-\infty}^{\infty} f(x)\,dx = \int_{.5}^{\infty} .15e^{-.15(x - .5)}\,dx = .15e^{.075} \int_{.5}^{\infty} e^{-.15x}\,dx = .15e^{.075} \cdot \frac{1}{.15}e^{-(.15)(.5)} = 1$$
[Figure 4.4: density curve f(x) plotted against x from 0 to 10, starting at height .15 at x = .5, with the area $P(X \le 5)$ shaded]
Figure 4.4 The density curve for time headway in Example 4.5
The probability that headway time is at most 5 sec is
$$\begin{aligned}
P(X \le 5) &= \int_{-\infty}^{5} f(x)\,dx = \int_{.5}^{5} .15e^{-.15(x - .5)}\,dx \\
&= .15e^{.075} \int_{.5}^{5} e^{-.15x}\,dx = .15e^{.075} \cdot \left( -\frac{1}{.15}e^{-.15x} \Big|_{x=.5}^{x=5} \right) \\
&= e^{.075}\left(-e^{-.75} + e^{-.075}\right) = 1.078(-.472 + .928) = .491 \\
&= P(\text{less than 5 sec}) = P(X < 5)
\end{aligned}$$
■
Unlike discrete distributions such as the binomial, hypergeometric, and negative binomial, the distribution of any given continuous rv cannot usually be derived using simple probabilistic arguments. Instead, one must make a judicious choice of pdf based on prior knowledge and available data. Fortunately, there are some general families of pdf’s that have been found to be sensible candidates in a wide variety of experimental situations; several of these are discussed later in the chapter.
Just as in the discrete case, it is often helpful to think of the population of interest as consisting of X values rather than individuals or objects. The pdf is then a model for the distribution of values in this numerical population, and from this model various population characteristics (such as the mean) can be calculated.
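Returning to Examples 4.4 and 4.5, the probabilities computed there can be checked with a few lines of Python. This is a minimal sketch rather than a definitive implementation: the helper names `uniform_prob` and `headway_cdf` are assumptions introduced here, and `headway_cdf` uses the closed form $1 - e^{-.15(x - .5)}$ obtained by integrating the headway pdf from .5 to x.

```python
import math


def uniform_prob(a: float, b: float, lo: float = 0.0, hi: float = 360.0) -> float:
    """P(a <= X <= b) for X uniform on [lo, hi]: overlap width divided by (hi - lo)."""
    left, right = max(a, lo), min(b, hi)
    return max(right - left, 0.0) / (hi - lo)


def headway_cdf(x: float, rate: float = 0.15, start: float = 0.5) -> float:
    """P(X <= x) for the pdf rate * exp(-rate * (x - start)) on x >= start."""
    if x < start:
        return 0.0
    return 1.0 - math.exp(-rate * (x - start))


print(uniform_prob(90, 180))                         # Example 4.4: 0.25
print(uniform_prob(0, 90) + uniform_prob(270, 360))  # Example 4.4: 0.5
print(round(headway_cdf(5), 3))                      # Example 4.5: 0.491
print(headway_cdf(5) - headway_cdf(5))               # P(X = 5) is 0 for a continuous rv
```

Because a single point carries no probability, the same functions give $P(X < 5)$ and $P(X \le 5)$, in line with Expression (4.1).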
EXERCISES Section 4.1 (1–10)
1. The current in a certain circuit as measured by an ammeter is a continuous random variable X with the following density function:
$$f(x) = \begin{cases} .075x + .2 & 3 \le x \le 5 \\ 0 & \text{otherwise} \end{cases}$$
a. Graph the pdf and verify that the total area under the density curve is indeed 1.
b. Calculate $P(X \le 4)$. How does this probability compare to $P(X < 4)$?
c. Calculate $P(3.5 \le X \le 4.5)$ and also $P(4.5 < X)$.
2. Suppose the reaction temperature X (in °C) in a certain chemical process has a uniform distribution with $A = -5$ and $B = 5$.
a. Compute $P(X < 0)$.
b. Compute $P(-2.5 < X < 2.5)$.
c. Compute $P(-2 \le X \le 3)$.
d. For k satisfying $-5 < k < k + 4 < 5$, compute $P(k < X < k + 4)$.
3. The error involved in making a certain measurement is a continuous rv X with pdf
$$f(x) = \begin{cases} .09375(4 - x^2) & -2 \le x \le 2 \\ 0 & \text{otherwise} \end{cases}$$
a. Sketch the graph of f(x).
b. Compute $P(X > 0)$.
c. Compute $P(-1 < X < 1)$.
d. Compute $P(X < -.5 \text{ or } X > .5)$.
4. Let X denote the vibratory stress (psi) on a wind turbine blade at a particular wind speed in a wind tunnel. The article “Blade Fatigue Life Assessment with Application to VAWTS” (J. of Solar Energy Engr., 1982: 107–111) proposes the Rayleigh distribution, with pdf
$$f(x; \theta) = \begin{cases} \dfrac{x}{\theta^2} \cdot e^{-x^2/(2\theta^2)} & x > 0 \\ 0 & \text{otherwise} \end{cases}$$
as a model for the X distribution.
a. Verify that $f(x; \theta)$ is a legitimate pdf.
b. Suppose $\theta = 100$ (a value suggested by a graph in the article). What is the probability that X is at most 200? Less than 200? At least 200?
c. What is the probability that X is between 100 and 200 (again assuming $\theta = 100$)?
d. Give an expression for $P(X \le x)$.
5. A college professor never finishes his lecture before the end of the hour and always finishes his lectures within 2 min after the hour. Let X = the time that elapses between the end of the hour and the end of the lecture and suppose the pdf of X is
$$f(x) = \begin{cases} kx^2 & 0 \le x \le 2 \\ 0 & \text{otherwise} \end{cases}$$
a. Find the value of k and draw the corresponding density curve. [Hint: Total area under the graph of f(x) is 1.]
b. What is the probability that the lecture ends within 1 min of the end of the hour?
c. What is the probability that the lecture continues beyond the hour for between 60 and 90 sec?
d. What is the probability that the lecture continues for at least 90 sec beyond the end of the hour?
6. The actual tracking weight of a stereo cartridge that is set to track at 3 g on a particular changer can be regarded as a continuous rv X with pdf
$$f(x) = \begin{cases} k[1 - (x - 3)^2] & 2 \le x \le 4 \\ 0 & \text{otherwise} \end{cases}$$
a. Sketch the graph of f(x).
b. Find the value of k.
c. What is the probability that the actual tracking weight is greater than the prescribed weight?
d. What is the probability that the actual weight is within .25 g of the prescribed weight?
e. What is the probability that the actual weight differs from the prescribed weight by more than .5 g?
7. The time X (min) for a lab assistant to prepare the equipment for a certain experiment is believed to have a uniform distribution with $A = 25$ and $B = 35$.
a. Determine the pdf of X and sketch the corresponding density curve.
b. What is the probability that preparation time exceeds 33 min?
c. What is the probability that preparation time is within 2 min of the mean time? [Hint: Identify $\mu$ from the graph of f(x).]
d. For any a such that $25 < a < a + 2 < 35$, what is the probability that preparation time is between a and $a + 2$ min?
8. In commuting to work, a professor must first get on a bus near her house and then transfer to a second bus. If the waiting time (in minutes) at each stop has a uniform distribution with $A = 0$ and $B = 5$, then it can be shown that the total waiting time Y has the pdf
$$f(y) = \begin{cases} \dfrac{1}{25}y & 0 \le y < 5 \\ \dfrac{2}{5} - \dfrac{1}{25}y & 5 \le y \le 10 \\ 0 & y < 0 \text{ or } y > 10 \end{cases}$$
a. Sketch a graph of the pdf of Y.
b. Verify that $\int_{-\infty}^{\infty} f(y)\,dy = 1$.
c. What is the probability that total waiting time is at most 3 min?
d. What is the probability that total waiting time is at most 8 min?
e. What is the probability that total waiting time is between 3 and 8 min?
f. What is the probability that total waiting time is either less than 2 min or more than 6 min?
9. Consider again the pdf of time headway given in Example 4.5. What is the probability that time headway is
a. At most 6 sec?
b. More than 6 sec? At least 6 sec?
c. Between 5 and 6 sec?
10. A family of pdf’s that has been used to approximate the distribution of income, city population size, and size of firms is the Pareto family. The family has two parameters, k and $\theta$, both $> 0$, and the pdf is
$$f(x; k, \theta) = \begin{cases} \dfrac{k \cdot \theta^k}{x^{k+1}} & x \ge \theta \\ 0 & x < \theta \end{cases}$$
a. Sketch the graph of $f(x; k, \theta)$.
b. Verify that the total area under the graph equals 1.
c. If the rv X has pdf $f(x; k, \theta)$, for any fixed $b > \theta$, obtain an expression for $P(X \le b)$.
d. For $\theta < a < b$, obtain an expression for the probability $P(a \le X \le b)$.