To predict the properties of a population on the basis of a sample, it is necessary to know something about the population’s expected distribution around its central value. The distribution of a population can be represented by plotting the frequency of occurrence of individual values as a function of the values themselves. Such plots are called probability distributions. Unfortunately, we are rarely able to calculate the exact probability distribution for a chemical system. In fact, the probability distribution can take any shape, depending on the nature of the chemical system being investigated. Fortunately, many chemical systems display one of several common probability distributions. Two of these distributions, the binomial distribution and the normal distribution, are discussed next.
\[ P(V) = \frac{M}{N} \]
Chapter 4 Evaluating Analytical Data
Table 4.10
Results for a Second Determination of the Mass of a United States Penny in Circulation

Penny    Mass (g)
1        3.052
2        3.141
3        3.083
4        3.083
5        3.048
X̄        3.081
s        0.037
population
All members of a system.
sample
Those members of a population that we actually collect and analyze.
probability distribution
Plot showing frequency of occurrence for members of a population.
Modern Analytical Chemistry
binomial distribution
Probability distribution showing chance of obtaining one of two specific outcomes in a fixed number of trials.
*N! is read as N-factorial and is the product N × (N – 1) × (N – 2) × . . . × 1. For example, 4! is 4 × 3 × 2 × 1, or 24.
Your calculator probably has a key for calculating factorials.
Binomial Distribution The binomial distribution describes a population in which the values are the number of times a particular outcome occurs during a fixed number of trials. Mathematically, the binomial distribution is given as
\[ P(X, N) = \frac{N!}{X!(N - X)!} \times p^{X} \times (1 - p)^{N - X} \]
where P(X, N) is the probability that a given outcome will occur X times during N trials, and p is the probability that the outcome will occur in a single trial.* If you flip a coin five times, P(2, 5) gives the probability that two of the five trials will turn up “heads.”
A binomial distribution has well-defined measures of central tendency and spread. The true mean value, for example, is given as
\[ \mu = Np \]
and the true spread is given by the variance
\[ \sigma^{2} = Np(1 - p) \]
or the standard deviation
\[ \sigma = \sqrt{Np(1 - p)} \]
The binomial distribution describes a population whose members have only certain, discrete values. A good example of a population obeying the binomial distribution is the sampling of homogeneous materials. As shown in Example 4.10, the binomial distribution can be used to calculate the probability of finding a particular isotope in a molecule.
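The binomial equation and its measures of central tendency and spread are easy to evaluate numerically. The following is a minimal Python sketch, not part of the original text; the function names are illustrative, and the coin is assumed to be fair (p = 0.5), which the text does not state explicitly.

```python
from math import comb, sqrt


def binomial_probability(x, n, p):
    """P(X, N): probability that an outcome occurs exactly x times in n trials."""
    # comb(n, x) is N! / (X! (N - X)!)
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_mean(n, p):
    """True mean of the binomial distribution, mu = Np."""
    return n * p


def binomial_std(n, p):
    """True standard deviation, sigma = sqrt(Np(1 - p))."""
    return sqrt(n * p * (1 - p))


# Coin example: two heads in five flips, assuming a fair coin (p = 0.5).
p_two_heads = binomial_probability(2, 5, 0.5)
print(f"P(2, 5) = {p_two_heads}")
print(f"mean = {binomial_mean(5, 0.5)}, std = {binomial_std(5, 0.5):.3f}")
```

Summing `binomial_probability(x, 5, 0.5)` over x = 0 to 5 should return 1, which is a quick check that the function accounts for every possible outcome.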
EXAMPLE 4.10
Carbon has two common isotopes, 12C and 13C, with relative isotopic abundances of, respectively, 98.89% and 1.11%. (a) What are the mean and standard deviation for the number of 13C atoms in a molecule of cholesterol?
(b) What is the probability of finding a molecule of cholesterol (C27H44O) containing no atoms of 13C?
SOLUTION
The probability of finding an atom of 13C in cholesterol follows a binomial distribution, where X is the sought-for frequency of occurrence of 13C atoms, N is the number of C atoms in a molecule of cholesterol, and p is the probability of finding an atom of 13C.
(a) The mean number of 13C atoms in a molecule of cholesterol is
\[ \mu = Np = 27 \times 0.0111 = 0.300 \]
with a standard deviation of
\[ \sigma = \sqrt{Np(1 - p)} = \sqrt{(27)(0.0111)(1 - 0.0111)} = 0.544 \]
(b) Since the mean is less than one atom of 13C per molecule, most molecules of cholesterol will not have any 13C. To calculate the probability, we substitute appropriate values into the binomial equation
\[ P(X, N) = \frac{N!}{X!(N - X)!} \times p^{X} \times (1 - p)^{N - X} \]
which gives
\[ P(0, 27) = \frac{27!}{0!(27 - 0)!} \times (0.0111)^{0} \times (1 - 0.0111)^{27 - 0} = 0.740 \]
There is therefore a 74.0% probability that a molecule of cholesterol will not have an atom of 13C.
A portion of the binomial distribution for atoms of 13C in cholesterol is shown in Figure 4.5. Note in particular that there is little probability of finding more than two atoms of 13C in any molecule of cholesterol.
homogeneous
Uniform in composition.
Figure 4.5
Portion of the binomial distribution for the number of naturally occurring 13C atoms in a molecule of cholesterol. (Plot of probability versus number of atoms of carbon-13 in a molecule of cholesterol, from 0 to 5.)
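The values plotted in Figure 4.5 can be tabulated directly from the binomial equation. The sketch below is an illustration rather than part of the original example. It uses the 27 carbon atoms of cholesterol and the 1.11% abundance of 13C from Example 4.10, and the variable names are assumptions made for this sketch.

```python
from math import comb, sqrt

N_CARBON = 27   # carbon atoms in cholesterol, C27H44O
P_C13 = 0.0111  # relative isotopic abundance of 13C


def binomial_probability(x, n, p):
    """Probability of exactly x occurrences in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


mean = N_CARBON * P_C13
std = sqrt(N_CARBON * P_C13 * (1 - P_C13))

# Probabilities for 0 through 5 atoms of 13C, the range covered by Figure 4.5.
probabilities = [binomial_probability(x, N_CARBON, P_C13) for x in range(6)]

print(f"mean = {mean:.3f}, standard deviation = {std:.3f}")
for x, prob in enumerate(probabilities):
    print(f"P({x}, {N_CARBON}) = {prob:.4f}")

# Probability of finding more than two atoms of 13C in a molecule.
more_than_two = 1 - sum(probabilities[:3])
print(f"P(X > 2) = {more_than_two:.4f}")
```

The last line puts a number on the observation that there is little probability of finding more than two atoms of 13C in any molecule of cholesterol.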
Normal Distribution The binomial distribution describes a population whose members have only certain, discrete values. This is the case with the number of 13C atoms in a molecule, which must be an integer number no greater than the number of carbon atoms in the molecule. A molecule, for example, cannot have 2.5 atoms of 13C. Other populations are considered continuous, in that members of the population may take on any value.
The most commonly encountered continuous distribution is the Gaussian, or normal distribution, where the frequency of occurrence for a value, X, is given by
\[ f(X) = \frac{1}{\sqrt{2\pi\sigma^{2}}} \exp\left[ -\frac{(X - \mu)^{2}}{2\sigma^{2}} \right] \]
The shape of a normal distribution is determined by two parameters, the first of which is the population’s central, or true mean value, µ, given as
\[ \mu = \frac{\sum_{i=1}^{n} X_{i}}{n} \]
where n is the number of members in the population. The second parameter is the population’s variance, σ², which is calculated using the following equation*
\[ \sigma^{2} = \frac{\sum_{i=1}^{n} (X_{i} - \mu)^{2}}{n} \tag{4.8} \]
normal distribution
“Bell-shaped” probability distribution curve for measurements and results showing the effect of random error.
*Note the difference between the equation for a population’s variance, which includes the term n in the denominator, and the similar equation for the variance of a sample (the square of equation 4.3), which includes the term n – 1 in the denominator. The reason for this difference is discussed later in the chapter.
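The effect of dividing by n rather than n – 1 can be seen with a short calculation. The following Python sketch is an illustration only. It takes the five penny masses in Table 4.10 and, purely as an assumption for this sketch, also treats them as if they made up an entire population so that both forms of the variance can be compared on the same numbers. The function names are hypothetical.

```python
from math import sqrt

# Penny masses (g) from Table 4.10.
masses = [3.052, 3.141, 3.083, 3.083, 3.048]


def mean(values):
    """Arithmetic mean: the sum of the values divided by their number."""
    return sum(values) / len(values)


def population_variance(values):
    """Variance with n in the denominator (equation 4.8)."""
    mu = mean(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Variance with n - 1 in the denominator (the sample form)."""
    x_bar = mean(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)


print(f"mean = {mean(masses):.3f} g")
print(f"s (n - 1 in denominator) = {sqrt(sample_variance(masses)):.3f} g")
print(f"sigma (n in denominator) = {sqrt(population_variance(masses)):.3f} g")
```

The n – 1 form reproduces the mean and standard deviation reported in Table 4.10, and the n form always gives a somewhat smaller value for the same data.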
Figure 4.6
Normal distributions for (a) µ = 0 and σ² = 25; (b) µ = 0 and σ² = 100; and (c) µ = 0 and σ² = 400. (Plot of f(x) versus value of x, from –50 to 50.)
Examples of normal distributions with µ = 0 and σ² = 25, 100, or 400 are shown in Figure 4.6. Several features of these normal distributions deserve attention. First, note that each normal distribution contains a single maximum corresponding to µ and that the distribution is symmetrical about this value. Second, increasing the population’s variance increases the distribution’s spread while decreasing its height. Finally, because the normal distribution depends solely on µ and σ², the area, or probability of occurrence, between any two limits defined in terms of these parameters is the same for all normal distribution curves. For example, 68.26% of the members in a normally distributed population have values within the range µ ± 1σ, regardless of the actual values of µ and σ. As shown in Example 4.11, probability tables (Appendix 1A) can be used to determine the probability of occurrence between any defined limits.
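These features can be checked numerically for the three variances used in Figure 4.6. The sketch below is not from the original text. It evaluates the normal distribution at its maximum and uses the error function from Python's standard library in place of a probability table to find the area within µ ± 1σ. The function names are illustrative assumptions.

```python
from math import erf, exp, pi, sqrt


def normal_pdf(x, mu, variance):
    """Frequency of occurrence f(X) for a normal distribution."""
    return exp(-((x - mu) ** 2) / (2 * variance)) / sqrt(2 * pi * variance)


def normal_cdf(x, mu, sigma):
    """Fraction of the population with values less than x."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))


def fraction_within(k, mu, sigma):
    """Fraction of the population within mu +/- k * sigma."""
    return normal_cdf(mu + k * sigma, mu, sigma) - normal_cdf(mu - k * sigma, mu, sigma)


MU = 0.0
for variance in (25, 100, 400):
    sigma = sqrt(variance)
    height = normal_pdf(MU, MU, variance)    # height of the single maximum at mu
    within = fraction_within(1, MU, sigma)   # area within mu +/- 1 sigma
    print(f"variance = {variance:3d}: f(mu) = {height:.4f}, "
          f"within 1 sigma = {100 * within:.1f}%")
```

The height of the maximum falls as the variance increases, while the area within µ ± 1σ stays at about 68.3% for all three curves.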
EXAMPLE 4.11
The amount of aspirin in the analgesic tablets from a particular manufacturer is known to follow a normal distribution, with µ = 250 mg and σ² = 25. In a random sampling of tablets from the production line, what percentage are expected to contain between 243 and 262 mg of aspirin?
SOLUTION
The normal distribution for this example is shown in Figure 4.7, with the shaded area representing the percentage of tablets containing between 243 and 262 mg of aspirin. To determine the percentage of tablets between these limits, we first determine the percentage of tablets with less than 243 mg of aspirin, and the percentage of tablets having more than 262 mg of aspirin. This is accomplished by calculating the deviation, z, of each limit from µ, using the following equation
\[ z = \frac{X - \mu}{\sigma} \]
where X is the limit in question, and σ, the population standard deviation, is 5.
Thus, the deviation for the lower limit is
\[ z_{lower} = \frac{243 - 250}{5} = -1.4 \]
Figure 4.7
Normal distribution for population of aspirin tablets with µ = 250 mg aspirin and σ² = 25. The shaded area shows the percentage of tablets containing between 243 and 262 mg of aspirin. (Plot of f(mg aspirin) versus aspirin (mg), from 210 to 290.)
and the deviation for the upper limit is
\[ z_{upper} = \frac{262 - 250}{5} = +2.4 \]
Using the table in Appendix 1A, we find that the percentage of tablets with less than 243 mg of aspirin is 8.08%, and the percentage of tablets with more than 262 mg of aspirin is 0.82%. The percentage of tablets containing between 243 and 262 mg of aspirin is therefore
100.00% – 8.08% – 0.82% = 91.10%
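The table lookup in this example can be reproduced with a few lines of Python. This is a sketch under the assumption that the error function in the standard library is an acceptable stand-in for the probability table in Appendix 1A. The function and variable names are illustrative and do not come from the original text.

```python
from math import erf, sqrt

MU = 250.0    # population mean, mg aspirin
SIGMA = 5.0   # population standard deviation, from a variance of 25


def standard_normal_cdf(z):
    """Fraction of a normally distributed population with deviation less than z."""
    return 0.5 * (1 + erf(z / sqrt(2)))


# Deviations of the lower and upper limits from the mean.
z_lower = (243 - MU) / SIGMA
z_upper = (262 - MU) / SIGMA

below = standard_normal_cdf(z_lower)       # tablets with less than 243 mg
above = 1 - standard_normal_cdf(z_upper)   # tablets with more than 262 mg
between = 1 - below - above                # tablets between the two limits

print(f"z_lower = {z_lower:.1f}, z_upper = {z_upper:.1f}")
print(f"below 243 mg: {100 * below:.2f}%")
print(f"above 262 mg: {100 * above:.2f}%")
print(f"between 243 and 262 mg: {100 * between:.2f}%")
```

Changing the two limits or the values of `MU` and `SIGMA` gives the corresponding percentages for any other normally distributed population.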