
Other Measures of Location

Weighted Mean The arithmetic mean is the most frequently used measure of central location. Equations 3.1 and 3.2 are used when you have either a population or a sample. For instance, the sample mean is computed using

$$\bar{x} = \frac{\sum x}{n} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n}$$

In this case, each x value is given an equal weight in the computation of the mean. However, in some applications there is reason to weight the data values differently. In those cases, we need to compute a weighted mean.

Equations 3.4 and 3.5 are used to find the weighted mean (or weighted average) for a population and for a sample, respectively.


Weighted Mean

The mean value of data values that have been weighted according to their relative importance.

Weighted Mean for a Population

$$\mu_w = \frac{\sum w_i x_i}{\sum w_i} \qquad (3.4)$$

Weighted Mean for a Sample

$$\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i} \qquad (3.5)$$

where:

w_i = The weight of the ith data value

x_i = The ith data value

M03_GROE0383_10_GE_C03.indd 108 23/08/17 4:52 PM

3.1 Measures of Center and Location

|

Chapter 3 109

One weighted-mean example that you are probably very familiar with is your college grade point average (GPA). At most schools, A = 4 points, B = 3 points, and so forth. Each course has a certain number of credits (usually 1 to 5). The credits are the weights. Your GPA is computed by summing the product of points earned in a course and the credits for the course, and then dividing this sum by the total number of credits earned.
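The GPA calculation described above can be sketched in a few lines of Python. The transcript below is hypothetical (the grade points and credits are invented for illustration and do not come from the text); only the arithmetic follows the description.

```python
# Hypothetical transcript: (grade points earned, course credits).
# These values are made up for illustration only.
courses = [(4, 3), (3, 4), (4, 1), (2, 3)]

# Sum of (points x credits): the weighted sum, with credits as the weights.
weighted_sum = sum(points * credits for points, credits in courses)

# Sum of the weights (total credits).
total_credits = sum(credits for _, credits in courses)

gpa = weighted_sum / total_credits
print(round(gpa, 2))  # 3.09
```

The 4-credit B pulls the average down more than the 1-credit A pulls it up, which is the effect of weighting.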

Percentiles In some applications, we might wish to describe the location of the data in terms other than the center of the data. For example, prior to enrolling at your university, you took the SAT or ACT test and received a percentile score in math and verbal skills.

If you received word that your standardized exam score was at the 90th percentile, it means that you scored as high as or higher than 90% of the other students who took the exam.

The score at the 50th percentile would indicate that you were at the median, where at least 50% scored at or below and at least 50% scored at or above your score.²

Percentiles

The pth percentile in a data array is a value that divides the data set into two parts. The lower segment contains at least p% and the upper segment contains at least (100-p)% of the data. The 50th percentile is the median.

EXAMPLE 3-7

Calculating a Weighted Population Mean

Myers & Associates The law firm of Myers & Associates was involved in litigating a discrimination suit concerning ski instructors at a ski resort in Colorado. One ski instructor from Germany had sued the operator of the ski resort, claiming he had not received equitable pay compared with the other ski instructors from Norway. In preparing a defense, the Myers attorneys planned to compute the mean annual income for all seven Norwegian ski instructors at the resort. However, because these instructors worked different numbers of days during the ski season, a weighted mean needed to be computed. This was done using the following steps:

Step 1 Collect the desired data and determine the weight to be assigned to each data value.

In this case, the variable of interest was the income of the ski instructors. The population consisted of seven Norwegian instructors. The weights were the numbers of days that the instructors worked. The following data and weights were determined:

x_i = Income: $7,600 $3,900 $5,300 $4,000 $7,200 $2,300 $5,100

w_i = Days: 50 30 40 25 60 15 50

Step 2 Multiply each weight by the data value and sum these.

$$\sum w_i x_i = (50)(\$7{,}600) + (30)(\$3{,}900) + \cdots + (50)(\$5{,}100) = \$1{,}530{,}500$$

Step 3 Sum the weights for all values (the weights are the days).

$$\sum w_i = 50 + 30 + 40 + 25 + 60 + 15 + 50 = 270$$

Step 4 Compute the weighted mean.

Divide the weighted sum by the sum of the weights. Because we are working with the population, the result will be the population weighted mean.

$$\mu_w = \frac{\sum w_i x_i}{\sum w_i} = \frac{\$1{,}530{,}500}{270} = \$5{,}668.52$$

Thus, taking into account the number of days worked, the Norwegian ski instructors had a mean income of $5,668.52.
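The four steps of this example can be checked with a short Python sketch. The data are the incomes and days given in the example; the function name `weighted_mean` is an illustrative choice, not something defined in the text.

```python
# Incomes (x_i) and days worked (w_i) for the seven Norwegian instructors.
incomes = [7600, 3900, 5300, 4000, 7200, 2300, 5100]
days = [50, 30, 40, 25, 60, 15, 50]


def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i), as in Equations 3.4 and 3.5."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted_sum = sum(w * x for w, x in zip(weights, values))
    return weighted_sum / total_weight


mu_w = weighted_mean(incomes, days)
print(sum(w * x for w, x in zip(days, incomes)))  # 1530500
print(sum(days))                                  # 270
print(f"{mu_w:.2f}")                              # 5668.52
```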

TRY EXERCISE 3-8 (pg. 116)

² More rigorously, the percentile is that value (or set of values) such that at least p% of the data is as small as or smaller than that value and at least (100 - p)% of the data is at least as large as that value. For introductory courses, a convention has been adopted to average the largest and smallest values that qualify as a certain percentile.

This is why the median was defined as it was earlier for data sets that have an even number of data values.


110 Chapter 3

|

Describing Data Using Numerical Measures

To illustrate how to manually approximate a percentile value, consider a situation in which 309 customers enter a Verizon store during the course of a day. The time (rounded to the nearest minute) that each customer spends is recorded. If we wish to approximate the 10th percentile, we would begin by first sorting the data in order from low to high, then assign each data value a location index from 1 to 309, and next determine the location index that corresponds to the 10th percentile using Equation 3.6.

Percentile Location Index

$$i = \frac{p}{100}(n) \qquad (3.6)$$

where:

p = Desired percent

n = Number of values in the data set

If i is not an integer, we round up to the next higher integer. The next integer greater than i corresponds to the position of the pth percentile in the data set.

If i is an integer, the pth percentile is the average of the values in position i and position i + 1.

Thus, the index value associated with the 10th percentile is

$$i = \frac{p}{100}(n) = \frac{10}{100}(309) = 30.90$$

Because i = 30.90 is not an integer, we round to the next higher integer, which is 31.

The 10th percentile corresponds to the value in the 31st position from the low end of the sorted data.
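The location rule around Equation 3.6 can be written as a small Python helper. This is a sketch under two assumptions that are not in the text: p is a whole-number percent strictly between 0 and 100, and the hypothetical function `percentile_positions` returns the 1-based position (or pair of positions to average) rather than a data value. Integer arithmetic is used so that the "is i an integer?" test is not affected by floating-point rounding.

```python
def percentile_positions(p, n):
    """Return the 1-based sorted positions that define the pth percentile.

    Applies i = (p / 100) * n from Equation 3.6:
    - if i is not an integer, round up and return that single position;
    - if i is an integer, return positions i and i + 1 (to be averaged).
    """
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    if n < 1:
        raise ValueError("n must be at least 1")
    whole, remainder = divmod(p * n, 100)  # i = whole + remainder / 100
    if remainder:
        return (whole + 1,)       # i is not an integer: round up
    return (whole, whole + 1)     # i is an integer: average two positions


print(percentile_positions(10, 309))  # (31,)
```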

EXAMPLE 3-8

Calculating Percentiles

Henson Trucking The Henson Trucking Company is a small company in the business of moving people from one home to another within the Dallas, Texas, area. Historically, the owners have charged the customers on an hourly basis, regardless of the distance of the move within the Dallas city limits. However, they are now considering adding a surcharge for moves over a certain distance. They have decided to base this charge on the 80th percentile. They have a sample of travel-distance data for 30 moves. These data are as follows:

13.5 8.6 16.2 21.4 21.0 23.7 4.1 13.8 20.5 9.6

11.5 6.5 5.8 10.1 11.1 4.4 12.2 13.0 15.7 13.2

13.4 13.1 21.7 14.6 14.1 12.4 24.9 19.3 26.9 11.7

The 80th percentile can be computed using these steps.

Step 1 Sort the data from lowest to highest.

4.1 4.4 5.8 6.5 8.6 9.6 10.1 11.1 11.5 11.7

12.2 12.4 13.0 13.1 13.2 13.4 13.5 13.8 14.1 14.6

15.7 16.2 19.3 20.5 21.0 21.4 21.7 23.7 24.9 26.9

HOW TO DO IT (Example 3-8)

Calculating Percentiles

1. Sort the data in order from the lowest to highest value.

2. Determine the percentile location index, i, using Equation 3.6:

$$i = \frac{p}{100}(n)$$

where

p = Desired percent

n = Number of values in the data set

3. If i is not an integer, then round to the next higher integer. The pth percentile is located at the rounded index position. If i is an integer, the pth percentile is the average of the values at location index positions i and i + 1.


Step 2 Determine percentile location index, i, using Equation 3.6.

The 80th percentile location index is

$$i = \frac{p}{100}(n) = \frac{80}{100}(30) = 24$$

Step 3 Locate the appropriate percentile.

Because i = 24 is an integer value, the 80th percentile is found by averaging the values in the 24th and 25th positions. These are 20.5 and 21.0. Thus, the 80th percentile is (20.5 + 21.0)/2 = 20.75; therefore, any distance exceeding 20.75 miles will be subject to a surcharge.
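The three steps of this example can be reproduced with a short Python sketch. The data are the 30 travel distances from the example; the function name `textbook_percentile` is hypothetical, and the sketch assumes p is a whole-number percent strictly between 0 and 100.

```python
# The 30 travel distances (miles) from the example, in the order given.
distances = [
    13.5, 8.6, 16.2, 21.4, 21.0, 23.7, 4.1, 13.8, 20.5, 9.6,
    11.5, 6.5, 5.8, 10.1, 11.1, 4.4, 12.2, 13.0, 15.7, 13.2,
    13.4, 13.1, 21.7, 14.6, 14.1, 12.4, 24.9, 19.3, 26.9, 11.7,
]


def textbook_percentile(data, p):
    """Approximate the pth percentile using the Equation 3.6 rule."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(data)                            # Step 1: sort
    whole, remainder = divmod(p * len(ordered), 100)  # Step 2: i = (p/100) n
    if remainder:
        # i is not an integer: take the value at the next higher position.
        # Position whole + 1 (1-based) is index whole (0-based).
        return ordered[whole]
    # i is an integer: average the values at positions i and i + 1.
    return (ordered[whole - 1] + ordered[whole]) / 2


print(textbook_percentile(distances, 80))  # 20.75
```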

TRY EXERCISE 3-7 (pg. 116)

Quartiles Another location measure that can be used to describe data is quartiles.

The first quartile corresponds to the 25th percentile. That is, it is the value at or below which there is at least 25% (one quarter) of the data and at or above which there is at least 75% of the data. The third quartile is also the 75th percentile. It is the value at or below which there is at least 75% of the data and at or above which there is at least 25% of the data. The second quartile is the 50th percentile and is also the median.

A quartile value can be approximated manually with the same method as for percentiles, using Equation 3.6. For the 309 Verizon customer-service times mentioned earlier, the location of the first-quartile (25th percentile) index is found, after sorting the data, as

$$i = \frac{p}{100}(n) = \frac{25}{100}(309) = 77.25$$

Because 77.25 is not an integer value, we round up to 78. The first quartile is the 78th value from the low end of the sorted data.

Issues with Excel The quartile and percentile values from Excel will be slightly different from those we find manually using Equation 3.6. For example, referring to Example 3-8, when we use Excel to compute the 80th percentile for the moving distances, the value returned is 20.60 miles. This is slightly different from the 20.75 we found in Example 3-8.
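One way to see where a value such as 20.60 can come from is to interpolate linearly between neighboring sorted values instead of rounding or averaging positions. The sketch below assumes that this inclusive interpolation is what Excel's function does; the text itself does not say. It uses Python's standard-library `statistics.quantiles` with `method="inclusive"` on the Example 3-8 data.

```python
import statistics

# The 30 travel distances (miles) from Example 3-8.
distances = [
    13.5, 8.6, 16.2, 21.4, 21.0, 23.7, 4.1, 13.8, 20.5, 9.6,
    11.5, 6.5, 5.8, 10.1, 11.1, 4.4, 12.2, 13.0, 15.7, 13.2,
    13.4, 13.1, 21.7, 14.6, 14.1, 12.4, 24.9, 19.3, 26.9, 11.7,
]

# n=5 asks for the 20th, 40th, 60th, and 80th percentile cut points;
# the last one (index 3) is the 80th percentile.
cut_points = statistics.quantiles(distances, n=5, method="inclusive")
p80_inclusive = cut_points[3]

print(round(p80_inclusive, 2))  # 20.6
```

Under this method the 80th percentile sits 20% of the way from the 24th sorted value (20.5) to the 25th (21.0), giving 20.6 rather than the midpoint 20.75.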

Box and Whisker Plots

A descriptive tool that many decision makers like to use is called a box and whisker plot (or a box plot). The box and whisker plot incorporates the five-number summary (minimum, first quartile, median, third quartile, and maximum) to graphically display quantitative data. It is also used to identify outliers, which are unusually small or large data values that lie mostly by themselves.

Quartiles

Quartiles in a data array are those values that divide the data set into four equal-sized groups. The median corresponds to the second quartile.


EXAMPLE 3-9

Constructing a Box and Whisker Plot

Rental Car Company A demand analyst for a rental car company has recently performed a study at one of the company’s stores in which he determined the number of miles driven by rental car customers. He now wishes to construct a box and whisker plot as part of a presentation to describe customer driving patterns. The sorted sample data showing the miles driven are as follows. (The data are also listed in the file Rental Car Miles.)

231 236 241 242 242 243 243 243 248

248 249 250 251 251 252 252 254 255

255 256 256 257 259 260 260 260 260

262 262 264 265 265 265 266 268 268

270 276 277 277 280 286 300 324 345

The Excel 2016 function for a percentile is

=Percentile.Inc(13.5,8.6,…,11.7,.80)

Note: Excel sorts the data automatically when calculating percentiles.

The Excel 2016 function for a quartile is

=Quartile.Inc(13.5,8.6,…,11.7,1)

Note: The value for the quartile is entered as 1, 2, or 3 for first, second, or third quartile.

Box and Whisker Plots A graph that is composed of two parts: a box and the whiskers. The box has a width that ranges from the first quartile (Q1) to the third quartile (Q3). A vertical line through the box is placed at the median. Limits are located at a value that is 1.5 times the difference between Q1 and Q3 below Q1 and above Q3. The whiskers extend to the left to the lowest value within the limits and to the right to the highest value within the limits.


The box and whisker plot is constructed using the following steps:

Step 1 Sort the data values from low to high.

Step 2 Calculate the 25th percentile (Q1), the 50th percentile (median), and the 75th percentile (Q3).

The location index for Q1 is

$$i = \frac{p}{100}(n) = \frac{25}{100}(45) = 11.25$$

Thus, Q1 will be the 12th value, which is 250 miles. The median location is

$$i = \frac{p}{100}(n) = \frac{50}{100}(45) = 22.5$$

In the sorted data, the median is the 23rd value, which is 259 miles. The third-quartile location is

$$i = \frac{p}{100}(n) = \frac{75}{100}(45) = 33.75$$

Thus, Q3 is the 34th data value. This is 266 miles.

Step 3 Draw the box so the ends correspond to Q1 and Q3.

[Figure: box with ends at Q1 and Q3, drawn above an axis running from 230 to 350]

Step 4 Draw a vertical line through the box at the median.

[Figure: the same box with a vertical line at the median]

Step 5 Compute the upper and lower limits.

The lower limit is computed as Q1 - 1.5(Q3 - Q1). This is

$$\text{Lower limit} = 250 - 1.5(266 - 250) = 226$$

The upper limit is Q3 + 1.5(Q3 - Q1). This is

$$\text{Upper limit} = 266 + 1.5(266 - 250) = 290$$

Any value outside these limits is identified as an outlier.

Step 6 Draw the whiskers.

The whiskers are drawn to the smallest and largest values within the limits.

[Figure: box and whisker plot on an axis running from 230 to 350, showing Q1, the median, Q3, the lower limit (226), the upper limit (290), and three outliers marked with asterisks]

HOW TO DO IT (Example 3-9)

Constructing a Box and Whisker Plot

1. Sort the data values from low to high.

2. Use Equation 3.6 to find the 25th percentile (Q1 = first quartile), the 50th percentile (Q2 = median), and the 75th percentile (Q3 = third quartile).

3. Draw a box so that the ends of the box are at Q1 and Q3. This box will contain the middle 50% of the data values in the population or sample.

4. Draw a vertical line through the box at the median. Half the data values in the box will be on either side of the median.

5. Calculate the interquartile range (IQR = Q3 - Q1). (The interquartile range will be discussed more fully in Section 3.2.) Compute the lower limit for the box and whisker plot as Q1 - 1.5(Q3 - Q1). The upper limit is Q3 + 1.5(Q3 - Q1). Any data values outside these limits are referred to as outliers.

6. Extend dashed lines (called the whiskers) from each end of the box to the lowest and highest value within the limits.

7. Any value outside the limits (outlier) found in Step 5 is marked with an asterisk (*).


Step 7 Plot the outliers.

The outliers are plotted as values outside the limits.
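The numbers needed to draw this box and whisker plot can be computed with a short Python sketch. It uses the 45 rental-car mileages from the example and the Equation 3.6 location rule; the helper name `textbook_percentile` is hypothetical, and the sketch only computes the values (it does not draw the plot).

```python
# The 45 sorted rental-car mileages from the example.
miles = [
    231, 236, 241, 242, 242, 243, 243, 243, 248,
    248, 249, 250, 251, 251, 252, 252, 254, 255,
    255, 256, 256, 257, 259, 260, 260, 260, 260,
    262, 262, 264, 265, 265, 265, 266, 268, 268,
    270, 276, 277, 277, 280, 286, 300, 324, 345,
]


def textbook_percentile(data, p):
    """Approximate the pth percentile using the Equation 3.6 rule."""
    ordered = sorted(data)
    whole, remainder = divmod(p * len(ordered), 100)
    if remainder:
        return ordered[whole]  # round i up to the next position
    return (ordered[whole - 1] + ordered[whole]) / 2  # average i and i + 1


# Step 2: quartiles.
q1 = textbook_percentile(miles, 25)
median = textbook_percentile(miles, 50)
q3 = textbook_percentile(miles, 75)

# Step 5: limits at 1.5 times the interquartile range beyond the box.
iqr = q3 - q1
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

# Steps 6 and 7: whisker ends and outliers.
inside = [x for x in miles if lower_limit <= x <= upper_limit]
whisker_low, whisker_high = min(inside), max(inside)
outliers = [x for x in miles if x < lower_limit or x > upper_limit]

print(q1, median, q3)             # 250 259 266
print(lower_limit, upper_limit)   # 226.0 290.0
print(whisker_low, whisker_high)  # 231 286
print(outliers)                   # [300, 324, 345]
```

Note that the whiskers stop at 231 and 286, the most extreme data values inside the limits, not at the limits themselves.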

TRY EXERCISE 3-5 (pg. 116)