Although the range is easy to compute and understand and the interquartile range is designed to overcome the range’s sensitivity to extreme values, neither measure uses all the available data in its computation. Thus, both measures ignore potentially valuable information in data.
Two measures of variation that incorporate all the values in a data set are the variance and the standard deviation.
These two measures are closely related. The standard deviation is the positive square root of the variance. The standard deviation is in the original units (dollars, pounds, etc.), whereas the units of measure in the variance are squared. Because dealing with original units is easier than dealing with the square of the units, we usually use the standard deviation to measure variation in a population or sample.
Outcome 3
Variance
The population variance is the average of the squared distances of the data values from the mean.
Standard Deviation
The standard deviation is the positive square root of the variance.
BUSINESS APPLICATION
Calculating the Variance and Standard Deviation
Travel-Time Recreational Vehicles (continued) Recall the Travel-Time Recreational Vehicles application, in which we compared the weekly production output for two of the company’s plants. Table 3.3 showed the data, which are considered a population for our purposes here.
Previously, we examined the variability in the output from these two plants by computing the ranges. Although those results gave us some sense of how much more variable Plant A is than Plant B, we also pointed out some of the deficiencies of the range. The variance and standard deviation offer alternatives to the range for measuring variation in data.
Equation 3.9 is the formula for the population variance. Like the population mean, the population variance and standard deviation are assigned Greek symbols.
Population Variance
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} \qquad (3.9)$$
where:
$\mu$ = Population mean
$N$ = Population size
$\sigma^2$ = Population variance (sigma squared)
Excel 2016 can be used to calculate the interquartile range using the Quartile.Inc function in an equation as follows:
=Quartile.Inc(data values,3) - Quartile.Inc(data values,1)
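For readers working outside Excel, the same interquartile range calculation can be sketched in Python. This is a minimal illustration, not part of the original text: it assumes that the `method="inclusive"` option of the standard library's `statistics.quantiles` interpolates the same way as Quartile.Inc, and it borrows the five Plant A output values from the application below as sample data.

```python
import statistics

# Illustrative data: Plant A weekly output from the application in this section.
data = [15, 25, 35, 20, 30]

# n=4 splits the data into quartiles; "inclusive" treats the data as
# containing the minimum and maximum of the population (assumed here to
# correspond to Excel's Quartile.Inc).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

iqr = q3 - q1
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
```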
M03_GROE0383_10_GE_C03.indd 121 23/08/17 5:50 PM
Chapter 3 | Describing Data Using Numerical Measures
We begin by computing the variance for the output data from Plant A. The first step in manually calculating the variance is to find the mean using Equation 3.1:
$$\mu = \frac{\sum x}{N} = \frac{15 + 25 + 35 + 20 + 30}{5} = \frac{125}{5} = 25$$
Next, we subtract the mean from each value, as shown in Table 3.4. Notice that the sum of the deviations from the mean is 0. Recall from Section 3.1 that this will be true for any set of data. The positive differences are canceled out by the negative differences. To overcome this fact when computing the variance, we square each of the differences and then sum the squared differences. These calculations are also shown in Table 3.4.
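The reason the deviations always sum to zero can be shown in one line. The short derivation below is added for illustration and uses only the definition of the population mean, $\mu = \sum x_i / N$:

$$\sum_{i=1}^{N} (x_i - \mu) = \sum_{i=1}^{N} x_i - N\mu = N\mu - N\mu = 0$$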
Population Variance Shortcut
$$\sigma^2 = \frac{\sum x^2 - \dfrac{(\sum x)^2}{N}}{N} \qquad (3.10)$$
Example 3-11 will illustrate using Equation 3.10 to find a population variance.
Because we squared the deviations to keep the positive values and negative values from canceling, the units of measure were also squared, but the term RVs squared doesn’t have a meaning. To get back to the original units of measure, we take the square root of the variance.
The result is the standard deviation. Equation 3.11 shows the formula for the population standard deviation.
TABLE 3.4 Computing the Population Variance: Squaring the Deviations

| $x_i$ | $(x_i - \mu)$ | $(x_i - \mu)^2$ |
|---|---|---|
| 15 | 15 - 25 = -10 | 100 |
| 25 | 25 - 25 = 0 | 0 |
| 35 | 35 - 25 = 10 | 100 |
| 20 | 20 - 25 = -5 | 25 |
| 30 | 30 - 25 = 5 | 25 |
| | $\sum (x_i - \mu) = 0$ | $\sum (x_i - \mu)^2 = 250$ |
Population Standard Deviation
$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}} \qquad (3.11)$$
The Excel 2016 function for the population mean is
=AVERAGE(15,25,35,20,30)
The Excel 2016 function for the population variance is
=Var.P(15,25,35,20,30)
The final step in computing the population variance is to divide the sum of the squared differences by the population size, N = 5:
$$\sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{250}{5} = 50$$
The population variance is 50 RVs squared.
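The manual steps for Plant A (find the mean, subtract it from each value, square the deviations, sum them, and divide by N) can be traced in a few lines of Python. This is a sketch for checking the arithmetic in Table 3.4; the variable names are illustrative and not from the text.

```python
# Weekly output for Plant A, treated as a population (values from Table 3.4).
plant_a = [15, 25, 35, 20, 30]

n = len(plant_a)
mean = sum(plant_a) / n                      # Equation 3.1

# Deviation of each value from the mean, then the squared deviations.
deviations = [x - mean for x in plant_a]
squared_deviations = [d ** 2 for d in deviations]

sum_sq = sum(squared_deviations)
variance = sum_sq / n                        # Equation 3.9

print(f"mean = {mean}")
print(f"sum of deviations = {sum(deviations)}")
print(f"sum of squared deviations = {sum_sq}")
print(f"population variance = {variance}")
```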
Manual calculations for the population variance may be easier if you use an alternative formula for s2 that is the algebraic equivalent. This is shown as Equation 3.10.
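To see why the two formulas are algebraically equivalent, expand the squared deviations in Equation 3.9 and substitute $\mu = \sum x / N$. The derivation below is a supplementary sketch, not reproduced from the text:

$$\sum (x - \mu)^2 = \sum x^2 - 2\mu \sum x + N\mu^2 = \sum x^2 - \frac{2(\sum x)^2}{N} + \frac{(\sum x)^2}{N} = \sum x^2 - \frac{(\sum x)^2}{N}$$

Dividing both sides by $N$ gives the shortcut form of the population variance.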
3.2 Measures of Variation | Chapter 3
Therefore, the population standard deviation of Plant A’s production output is
$$\sigma = \sqrt{50} = 7.07 \text{ mobile homes}$$
The population standard deviation is a parameter and will not change unless the population values change.
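As a cross-check on the Plant A results, Python's standard library offers population versions of both measures. The sketch below assumes that `statistics.pvariance` and `statistics.pstdev` play the same role as Excel's Var.P and Stdev.P, since each divides by N rather than N - 1.

```python
import math
import statistics

# Plant A weekly output, treated as a population.
plant_a = [15, 25, 35, 20, 30]

pop_var = statistics.pvariance(plant_a)  # divides by N
pop_sd = statistics.pstdev(plant_a)      # positive square root of pop_var

print(f"population variance = {pop_var}")
print(f"population standard deviation = {pop_sd:.2f}")
```

Note that `statistics.variance` and `statistics.stdev` (without the leading `p`) divide by N - 1 and are intended for sample data, so they would give different values here.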
We could repeat this process using the data for Plant B, which also had a mean output of 25 mobile homes. You should verify that the population variance is
$$\sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{10}{5} = 2 \text{ mobile homes squared}$$
The standard deviation is found by taking the square root of the variance:
$$\sigma = \sqrt{2} = 1.414 \text{ mobile homes}$$
Thus, Plant A has an output standard deviation that is five times larger than Plant B’s. The fact that Plant A’s range was also five times larger than the range for Plant B is merely a coincidence.
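The factor of five follows directly from the two variances computed above. The subscripts A and B are added here only to distinguish the plants:

$$\frac{\sigma_A}{\sigma_B} = \frac{\sqrt{50}}{\sqrt{2}} = \sqrt{25} = 5$$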
EXAMPLE 3-11
Computing a Population Variance and Standard Deviation
Boydson Shipping Company Boydson Shipping Company owns and operates a fleet of ships that carry commodities between the countries of the world. In the past six months, the company has had seven contracts that called for shipments between Vancouver, Canada, and London, England. For many reasons, the travel time varies between these two locations. The scheduling manager is interested in knowing the variance and standard deviation in shipping times for these seven shipments. To find these values, he can follow these steps:
Step 1 Collect the data for the population.
The shipping times are shown as follows:
x = Shipping weeks = {5, 7, 5, 9, 7, 4, 6}
Step 2 Select Equation 3.10 to find the population variance.
$$\sigma^2 = \frac{\sum x^2 - \dfrac{(\sum x)^2}{N}}{N}$$
Step 3 Add the x values and square the sum.
$$\sum x = 5 + 7 + 5 + 9 + 7 + 4 + 6 = 43$$
$$\left(\sum x\right)^2 = (43)^2 = 1{,}849$$
Step 4 Square each of the x values and sum these squares.
$$\sum x^2 = 5^2 + 7^2 + 5^2 + 9^2 + 7^2 + 4^2 + 6^2 = 281$$
Step 5 Compute the population variance.
$$\sigma^2 = \frac{\sum x^2 - \dfrac{(\sum x)^2}{N}}{N} = \frac{281 - \dfrac{1{,}849}{7}}{7} = 2.4082$$
HOW TO DO IT (Example 3-11)
Computing the Population Variance and Standard Deviation
1. Collect quantitative data for the variable of interest for the entire population.
2. Use either Equation 3.9 or Equation 3.10 to compute the variance.
3. If Equation 3.10 is used, find the sum of the x values ($\sum x$) and then square this sum, $(\sum x)^2$.
4. Square each x value and sum these squared values ($\sum x^2$).
5. Compute the variance using
$$\sigma^2 = \frac{\sum x^2 - \dfrac{(\sum x)^2}{N}}{N}$$
6. Compute the standard deviation by taking the positive square root of the variance:
$$\sigma = \sqrt{\sigma^2}$$
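The procedure above can be wrapped in a small reusable function. The following is a minimal sketch under the assumption that the data are a complete population; the function names `population_variance_shortcut` and `population_std_dev` are hypothetical and chosen only for this illustration.

```python
import math


def population_variance_shortcut(values):
    """Population variance using the shortcut form (Equation 3.10)."""
    n = len(values)
    if n == 0:
        raise ValueError("at least one data value is required")
    sum_x = sum(values)                     # sum of the x values
    sum_x_sq = sum(x * x for x in values)   # sum of the squared x values
    return (sum_x_sq - sum_x ** 2 / n) / n


def population_std_dev(values):
    """Positive square root of the population variance."""
    return math.sqrt(population_variance_shortcut(values))


# Shipping times in weeks from Example 3-11.
shipping_weeks = [5, 7, 5, 9, 7, 4, 6]
variance = population_variance_shortcut(shipping_weeks)
std_dev = population_std_dev(shipping_weeks)
print(round(variance, 4), round(std_dev, 4))
```

The shortcut form is convenient for hand calculation, but with very large data values the subtraction in the numerator can lose floating-point precision, in which case the definitional form in Equation 3.9 is the safer choice.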
The Excel 2016 function for the population variance is
=Var.P (5,7,5,9,7,4,6)
The variance is in units squared, so in this example the population variance is 2.4082 weeks squared.
Step 6 Calculate the standard deviation as the positive square root of the variance.
$$\sigma = \sqrt{\sigma^2} = \sqrt{2.4082} = 1.5518 \text{ weeks}$$
Thus, the standard deviation for the number of shipping weeks between Vancouver and London for the seven shipments is 1.5518 weeks.
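The values 2.4082 and 1.5518 are rounded. One way to see the unrounded result is to carry the shortcut formula through with exact fractions, as in the sketch below. It uses Python's standard `fractions` module, and the variable names are illustrative only.

```python
from fractions import Fraction
import math

# Shipping times in weeks for the seven shipments.
shipping_weeks = [5, 7, 5, 9, 7, 4, 6]

n = len(shipping_weeks)
sum_x = sum(shipping_weeks)                     # sum of x
sum_x_sq = sum(x * x for x in shipping_weeks)   # sum of x squared

# Shortcut formula evaluated exactly, with no intermediate rounding.
exact_variance = (Fraction(sum_x_sq) - Fraction(sum_x ** 2, n)) / n
std_dev = math.sqrt(exact_variance)

print(f"exact variance = {exact_variance} weeks squared")
print(f"rounded variance = {float(exact_variance):.4f}")
print(f"standard deviation = {std_dev:.4f} weeks")
```

Working with the exact fraction avoids any concern that rounding the variance before taking the square root changes the fourth decimal place of the standard deviation.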
TRY EXERCISE 3-27 (pg. 127)
The Excel 2016 function for the population standard deviation is
=Stdev.P (5,7,5,9,7,4,6)