
Variance and Standard Deviation

From the document Statistics for Business and Economics (pages 72-76)

Although the range and the interquartile range measure the spread of data, both measures take into account only two of the data values. We need a measure that averages the total (Σ) distance between each of the data values and the mean. Simply summing the differences between each data value and the mean does not work, because for every data set this sum equals zero: the mean is the center of the data. If a data value is less than the mean, the difference between the data value and the mean is negative (and distance is not negative). If each of these differences is squared, then each observation (both above and below the mean) contributes to the sum of the squared terms. The average of the sum of squared terms is called the variance.
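To see why the raw differences cannot be averaged directly, consider any $n$ values with mean $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ (sample notation is assumed here for illustration; the same argument holds for a population with mean $\mu$). The differences cancel exactly:

```latex
\sum_{i=1}^{n} (x_i - \bar{x})
  = \sum_{i=1}^{n} x_i - n\bar{x}
  = n\bar{x} - n\bar{x}
  = 0
```

Squaring each difference before summing removes the signs, so the terms can no longer cancel.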

Variance

With respect to variance, the population variance, $\sigma^2$, is the sum of the squared differences between each observation and the population mean, divided by the population size, $N$:

$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} \qquad (2.10)$$

The sample variance, $s^2$, is the sum of the squared differences between each observation and the sample mean, divided by the sample size, $n$, minus 1:

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \qquad (2.11)$$
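As a minimal sketch, Equations 2.10 and 2.11 can be written as two small Python functions. The function names and the three-value data set are illustrative assumptions, not part of the text.

```python
def population_variance(values):
    """Equation 2.10: sum of squared deviations about the mean, divided by N."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def sample_variance(values):
    """Equation 2.11: sum of squared deviations about the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two observations")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


# Hypothetical data, chosen only so the arithmetic is easy to follow:
# the mean is 4, and the squared deviations are 4, 0, and 4.
data = [2, 4, 6]
print(population_variance(data))  # numerator 8 divided by N = 3
print(sample_variance(data))      # same numerator divided by n - 1 = 2
```

The two functions differ only in the denominator, which is the whole distinction between the two equations.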

Notice that the distribution of sales for Location 3 is skewed left, which indicates the presence of days with sales less than most of the other days ($200 and $300) or perhaps a data-entry error. Similarly, the distribution of sales in Location 4 is skewed right, indicating the presence of sales higher than most of the other days ($2,200 and $2,000) or the possibility that sales were incorrectly recorded.

The management of Gilotti’s Pizzeria will want to know more about the variation in sales, both within a given location and among these four locations. This information will assist Gilotti’s Pizzeria in its decision-making process.

72 Chapter 2 Using Numerical Measures to Describe Data

Notice that, for sample data, variance in Equation 2.11 is found by dividing the numerator by $(n - 1)$ and not $n$. Since our goal is to find an average of squared deviations about the mean, one would expect division by $n$. So why is the denominator of sample variance given as $(n - 1)$ in Equation 2.11? If we were to take a very large number of samples, each of size $n$, from the population and compute the sample variance, as given in Equation 2.11, for each of these samples, then the average of all of these sample variances would be the population variance, $\sigma^2$. In Chapter 6 we see that this property indicates that the sample variance is an “unbiased estimator” of the population variance, $\sigma^2$. For now, we rely on mathematical statisticians who have shown that if the population variance is unknown, a sample variance is a better estimator of the population variance if the denominator in the sample variance is $(n - 1)$ rather than $n$.
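The averaging argument above can be checked on a small scale. The sketch below assumes a hypothetical four-value population and samples of size 2 drawn with replacement, and it enumerates every possible sample instead of drawing a very large number at random. None of these values come from the text.

```python
from itertools import product


def variance(values, denominator):
    """Sum of squared deviations about the mean, divided by `denominator`."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / denominator


# Hypothetical population (an assumption for illustration only).
population = [6, 8, 10, 12]
N = len(population)
population_var = variance(population, N)  # Equation 2.10

# Every possible ordered sample of size n, drawn with replacement,
# so each sample is equally likely.
n = 2
samples = list(product(population, repeat=n))

# Average the sample variances over all samples, once per denominator.
avg_with_n_minus_1 = sum(variance(s, n - 1) for s in samples) / len(samples)
avg_with_n = sum(variance(s, n) for s in samples) / len(samples)

print("population variance:      ", population_var)
print("average using n - 1:      ", avg_with_n_minus_1)
print("average using n:          ", avg_with_n)
```

Under these assumptions the average with $(n - 1)$ matches the population variance, while the average with $n$ falls below it, which is the sense in which dividing by $n$ underestimates the spread.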

Computing the variance requires squaring the distances, which changes the unit of measurement to square units. The standard deviation, which is the square root of variance, restores the data to their original measurement unit. If the original measurements were in feet, the variance would be in feet squared, but the standard deviation would be in feet. The standard deviation measures the average spread around the mean.

Standard Deviation

With respect to standard deviation, the population standard deviation, $\sigma$, is the (positive) square root of the population variance and is defined as follows:

$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}} \qquad (2.12)$$

The sample standard deviation, $s$, is as follows:

$$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} \qquad (2.13)$$

In Example 2.8 we found the range of daily sales in Location 1 to be $800, smaller than the range of the other three locations (Table 2.3). These differences in the ranges are clearly seen in the box-and-whisker plots in Figure 2.3. However, since only the maximum and minimum values are used to find the range, it is better to calculate the variance and standard deviation, as these measures take into account the difference of each daily sale from its mean.

Example 2.9 Gilotti’s Pizzeria Sales (Variance and Standard Deviation)

Calculate the standard deviation of daily sales for Gilotti’s Pizzeria, Location 1. From Table 2.3 the daily sales for Location 1 (in $100s) are:

6 8 10 12 14 9 11 7 13 11

Solution To calculate sample variance and standard deviation follow these three steps:

Step 1: Calculate the sample mean, $\bar{x}$, using Equation 2.2. It is equal to 10.1.

Step 2: Find the difference between each of the daily sales and the mean of 10.1.

Step 3: Square each difference. The result is Table 2.4.

2.2 Measures of Variability 73

Equations 2.14 and 2.15 are sometimes referred to as shortcut formulas to calculate sample variance. We include these equations for statisticians who prefer these methods of computation. The value of sample variance is the same using Equation 2.11, 2.14, or 2.15.

We illustrate this in Example 2.10.

Table 2.4 Gilotti’s Pizzeria Sales

SALES ($100s),    DEVIATION ABOUT THE    SQUARED DEVIATION ABOUT
x_i               MEAN, (x_i - x̄)       THE MEAN, (x_i - x̄)^2

 6                -4.1                   16.81
 8                -2.1                    4.41
10                -0.1                    0.01
12                 1.9                    3.61
14                 3.9                   15.21
 9                -1.1                    1.21
11                 0.9                    0.81
 7                -3.1                    9.61
13                 2.9                    8.41
11                 0.9                    0.81

$$\sum_{i=1}^{10} x_i = 101, \qquad \bar{x} = \frac{\sum x_i}{n} = 10.1$$

$$\sum_{i=1}^{10} (x_i - \bar{x}) = 0, \qquad \sum_{i=1}^{10} (x_i - \bar{x})^2 = 60.9$$

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{60.9}{9} \approx 6.77, \qquad s = \sqrt{s^2} = \sqrt{6.77} \approx 2.6$$
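The three steps of Example 2.9 can be retraced with a short Python sketch. The variable names are illustrative. The cross-check at the end uses `statistics.variance` and `statistics.stdev` from the Python standard library, which divide by $n - 1$ as in Equation 2.11.

```python
import math
import statistics

# Daily sales for Location 1 in $100s, as listed in Example 2.9.
sales = [6, 8, 10, 12, 14, 9, 11, 7, 13, 11]
n = len(sales)

# Step 1: sample mean.
mean = sum(sales) / n

# Step 2: deviation of each daily sale about the mean.
deviations = [x - mean for x in sales]

# Step 3: squared deviations.
squared = [d ** 2 for d in deviations]

# Print the rows of Table 2.4.
for x, d, sq in zip(sales, deviations, squared):
    print(f"{x:>3} {d:>6.1f} {sq:>7.2f}")

sum_squared = sum(squared)      # about 60.9
s2 = sum_squared / (n - 1)      # Equation 2.11, about 6.77
s = math.sqrt(s2)               # Equation 2.13, about 2.6

# The deviations sum to zero, up to floating-point rounding.
print(f"sum of deviations (absolute value): {abs(sum(deviations)):.1f}")
print(f"sum of squared deviations: {sum_squared:.1f}")
print(f"sample variance: {s2:.4f}")
print(f"sample standard deviation: {s:.4f}")

# Cross-check against the standard library.
print(math.isclose(s2, statistics.variance(sales)))
print(math.isclose(s, statistics.stdev(sales)))
```

Because the sales are recorded in $100s, a standard deviation of about 2.6 corresponds to about $260.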

Shortcut Formulas for Sample Variance, $s^2$

Sample variance, $s^2$, can be computed as follows:

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1} \qquad (2.14)$$

Alternatively, sample variance, $s^2$, can be computed as follows:

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n - 1} \qquad (2.15)$$
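The text does not derive the shortcut formulas, but the following sketch shows why their numerators equal the numerator of Equation 2.11. It uses only the definition $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, so that $\sum_{i=1}^{n} x_i = n\bar{x}$:

```latex
\begin{aligned}
\sum_{i=1}^{n} (x_i - \bar{x})^2
  &= \sum_{i=1}^{n} x_i^2 - 2\bar{x}\sum_{i=1}^{n} x_i + n\bar{x}^2 \\
  &= \sum_{i=1}^{n} x_i^2 - 2n\bar{x}^2 + n\bar{x}^2 \\
  &= \sum_{i=1}^{n} x_i^2 - n\bar{x}^2
   = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}
\end{aligned}
```

The second-to-last expression is the numerator of Equation 2.15 and the last is the numerator of Equation 2.14. Dividing either by $n - 1$ gives the sample variance.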

Example 2.10 Gilotti’s Pizzeria Sales (Variance by Alternative Formula)

Calculate the variance in daily sales for Gilotti’s Pizzeria, Location 1, using the alternative shortcut formulas found in Equations 2.14 and 2.15. From Table 2.3 daily sales for Location 1 are:

6 8 10 12 14 9 11 7 13 11


There are numerous applications of standard deviation in business. For example, investors may want to compare the risk of different assets. In Example 2.11 we look at two assets that have the same mean rates of return. In Example 2.12 we consider an investment in stocks with different mean closing prices over the last several months.

Solution From Table 2.4 we have the following calculations for the n = 10 daily sales:

$$\sum_{i=1}^{10} x_i = 101, \qquad \bar{x} = 10.1$$

All we need is to find the sum of the squares of each daily sale. This is found as follows:

$$\sum x_i^2 = (6)^2 + (8)^2 + (10)^2 + \cdots + (11)^2 = 1{,}081$$

Substituting into Equation 2.14, sample variance, $s^2$, is calculated as follows:

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum x_i\right)^2}{n}}{n - 1} = \frac{1{,}081 - \dfrac{(101)^2}{10}}{9} = \frac{1{,}081 - 1{,}020.1}{9} = \frac{60.9}{9} \approx 6.77$$

Using Equation 2.15, sample variance, $s^2$, is calculated as follows:

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n - 1} = \frac{1{,}081 - 10(10.1)^2}{9} = \frac{1{,}081 - 1{,}020.1}{9} = \frac{60.9}{9} \approx 6.77$$
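The agreement among Equations 2.11, 2.14, and 2.15 can also be checked in Python. This sketch uses `fractions.Fraction` from the standard library so that the three results can be compared exactly, without floating-point rounding. The variable names are illustrative.

```python
from fractions import Fraction

# Daily sales for Location 1 in $100s, as listed in Example 2.10.
sales = [6, 8, 10, 12, 14, 9, 11, 7, 13, 11]
n = len(sales)

sum_x = sum(sales)                          # 101
sum_x_squared = sum(x * x for x in sales)   # 1,081
mean = Fraction(sum_x, n)                   # exactly 101/10, that is, 10.1

# Equation 2.11: definition of sample variance.
by_definition = sum((x - mean) ** 2 for x in sales) / (n - 1)

# Equation 2.14: shortcut using the square of the sum.
by_eq_2_14 = (sum_x_squared - Fraction(sum_x ** 2, n)) / (n - 1)

# Equation 2.15: shortcut using n times the squared mean.
by_eq_2_15 = (sum_x_squared - n * mean ** 2) / (n - 1)

print(by_definition, by_eq_2_14, by_eq_2_15)   # three identical fractions
print(float(by_definition))                    # decimal value, about 6.77
```

With ordinary floating-point numbers the shortcut formulas subtract two large, nearly equal quantities, which can lose precision when the values are large relative to their spread. Exact fractions sidestep that issue here.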

Example 2.11 Comparing Risk of Two Assets with Equal Mean Rates of Return (Standard Deviation)

Wes and Jennie Moore, owners of Moore’s Foto Shop in western Pennsylvania, are considering two investment alternatives, asset A and asset B. They are not sure which of these two single assets is better, and they ask Sheila Newton, a financial planner, for some assistance.

Solution Sheila knows that the standard deviation, s, is the most common single indicator of the risk or variability of a single asset. In financial situations the fluctuation of a stock’s actual rate of return around its expected rate of return is called the risk of the stock. The standard deviation measures the variation of returns around an asset’s mean. Sheila obtains the rates of return on each asset for the last 5 years and calculates the mean and standard deviation for each asset. Her results are given in Table 2.5.

Table 2.5 Rates of Return: Asset A and Asset B

                                        ASSET A    ASSET B

Mean Rate of Return                     12.2%      12.2%
Standard Deviation in Rate of Return    0.63       3.12

Since each asset has the same average rate of return of 12.2%, Sheila compares the standard deviations and determines that asset B, with the larger standard deviation, is the riskier investment.

