
Standardized Data Values

3.3 EXERCISES

M03_GROE0383_10_GE_C03.indd 135 23/08/17 4:53 PM

136 Chapter 3 | Describing Data Using Numerical Measures

a. Calculate the mean and standard deviation for this data.

b. Determine the percentage of data values that fall in each of the following intervals:

$\bar{x} \pm s$, $\bar{x} \pm 2s$, $\bar{x} \pm 3s$.

c. Compare these with the percentages that should be expected from a bell-shaped distribution. Does it seem plausible that these data came from a bell-shaped population? Explain.

3-56. Consider the following population:

71 89 65 97 46 52 99 41 62 88

73 50 91 71 52 86 92 60 70 91

73 98 56 80 70 63 55 61 40 95

a. Determine the mean and variance.

b. Determine the percentage of data values that fall in each of the following intervals:

$\bar{x} \pm 2s$, $\bar{x} \pm 3s$, $\bar{x} \pm 4s$.

c. Compare these with the percentages specified by Tchebysheff’s theorem.
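The comparison in part c can be sketched in a few lines of Python. This is an illustrative sketch rather than the book's worked solution: the helper names `share_within` and `tchebysheff_bound` are made up for this example, and the population formulas are used because the exercise describes the data as a population.

```python
from statistics import fmean, pstdev

# Population values copied from Exercise 3-56.
population = [
    71, 89, 65, 97, 46, 52, 99, 41, 62, 88,
    73, 50, 91, 71, 52, 86, 92, 60, 70, 91,
    73, 98, 56, 80, 70, 63, 55, 61, 40, 95,
]

mu = fmean(population)
sigma = pstdev(population)  # population formula: divide by N


def share_within(values, center, spread, k):
    """Fraction of values inside center +/- k * spread (hypothetical helper)."""
    low, high = center - k * spread, center + k * spread
    return sum(low <= v <= high for v in values) / len(values)


def tchebysheff_bound(k):
    """Minimum fraction within k standard deviations, for k > 1."""
    return 1 - 1 / k**2


for k in (2, 3, 4):
    observed = share_within(population, mu, sigma, k)
    print(f"k={k}: observed {observed:.1%}, theorem guarantees at least {tchebysheff_bound(k):.1%}")
```

Because Tchebysheff's theorem is a lower bound, the observed shares should never fall below the guaranteed ones; how far above they sit is what part c asks you to discuss.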

Business Applications

3-57. The owner of a clothes store wants to determine whether salespeople make more sales if they spend more time with customers. She randomly selects two salespersons from her shop and records the times (in minutes) with each customer during a period of 15 days. At the same time, she records the total sales made by each. The total sales made by the salespersons are $578 and $695, respectively. The following are the durations recorded:

Salesperson A: 16 9 18 21 6 26 30 17 15 29 45 27 34 52

Salesperson B: 10 15 4 34 45 34 18 24 43 19 21 14 9 28

a. Calculate the mean and standard deviation of the amount of time a customer is served by each salesperson. Using each mean, determine whether a salesperson makes more sales if they spend more time with a customer.

b. Using your findings from part a, determine the variation for each salesperson.

c. Explain whether the standard deviation values calculated in part a can be used to compare the dispersion between the two salespersons. Explain and show how the two can be compared.

3-58. The Jones and McFarlin Company provides call center support for a number of electronics manufacturers. Customers call in to receive assistance from a live technician for the products they have purchased from a retailer. Jones and McFarlin are experimenting with a new automated system that attempts to provide answers to customer questions. One issue of importance is the time customers have to spend on the line to get their question answered. The company has collected a sample of calls using each system and recorded the following times in seconds:

Time to Complete the Call (in seconds)

Automated Response: 131 80 140 118 79 94 103 145 113 100 122

Live Response: 170 177 150 208 151 127 147 140 109 184 119 149 129 152

a. Compute the mean and standard deviation for the time to complete calls using the automated response.

b. Compute the mean and standard deviation for the time to complete calls to live service representatives.

c. Compute the coefficient of variation for the time to complete calls to the automated system and the live representatives. Which group has the greater relative variability in the time to complete calls?

d. Construct box and whisker plots for the time required to complete the two types of calls and briefly discuss.

3-59. Lockheed Martin is a supplier for the aerospace industry. Suppose Lockheed is considering switching to Cirus Systems, Inc., a new supplier for one of the component parts it needs for an assembly. At issue is the variability of the components supplied by Cirus Systems, Inc., compared to that of the existing supplier.

The existing supplier makes the desired part with a mean diameter of 3.75 inches and a standard deviation of 0.078 inch. Unfortunately, Lockheed Martin does not have any of the exact same parts from the new supplier.

Instead, the new supplier has sent a sample of 20 parts of a different size that it claims are representative of the type of work it can do. These sample data are shown here and in the data file called Cirus.

Diameters (in inches)

18.018 17.856 18.095 17.992 18.086

17.812 17.988 17.996 18.129 18.003

18.214 18.313 17.983 18.153 17.996

17.908 17.948 18.219 18.079 17.799

Prepare a short letter to Lockheed Martin indicating which supplier you would recommend based on relative variability.
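Because the two suppliers' parts differ in size, the standard deviations are not directly comparable; the coefficient of variation puts them on a common relative scale. The following is a minimal sketch of that comparison, assuming the 20 diameters are treated as a sample. The function name `coefficient_of_variation` is introduced here only for illustration.

```python
from statistics import mean, stdev

# Sample diameters (inches) from the new supplier, copied from Exercise 3-59.
cirus = [
    18.018, 17.856, 18.095, 17.992, 18.086,
    17.812, 17.988, 17.996, 18.129, 18.003,
    18.214, 18.313, 17.983, 18.153, 17.996,
    17.908, 17.948, 18.219, 18.079, 17.799,
]


def coefficient_of_variation(std_dev, center):
    """Standard deviation as a percentage of the mean (hypothetical helper)."""
    return std_dev / center * 100


# Existing supplier: mean 3.75 inches, standard deviation 0.078 inch.
cv_existing = coefficient_of_variation(0.078, 3.75)

# New supplier: sample statistics (n - 1 in the denominator).
cv_cirus = coefficient_of_variation(stdev(cirus), mean(cirus))

print(f"Existing supplier CV: {cv_existing:.2f}%")
print(f"Cirus Systems CV:     {cv_cirus:.2f}%")
```

The supplier with the smaller coefficient of variation shows less variability relative to the size of the part, which is the basis the exercise asks the letter to use.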

3-60. Suppose mortgage interest tax deductions average $8,268 for people with incomes between $50,000 and $200,000 and $365 for those with incomes of $40,000 to $50,000. Suppose also the standard deviations of the housing benefits in these two categories were equal to $2,750 and $120, respectively.

a. Examine the two standard deviations. What do these indicate about the range of benefits enjoyed by the two groups?

b. Repeat part a using the coefficient of variation as the measure of relative variation.

3-61. Thirty people were given an employment screening test, which is supposed to produce scores that are distributed according to a bell-shaped distribution. The following data reflect the scores of those 30 people:


76 75 74 56 61 76

62 96 68 62 78 76

84 67 60 96 77 59

67 81 66 71 69 65

58 77 82 75 76 67

The employment agency has in the past issued a rejection letter with no interview to the lower 16% taking the test. They also send the upper 2.5% directly to the company without an interview. Everyone else is interviewed. Based on the data and the assumption of a bell-shaped distribution, what scores should be used for the two cutoffs?
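One way to approach the cutoffs, sketched here under the Empirical Rule: about 68% of bell-shaped data lie within one standard deviation of the mean, leaving roughly 16% below $\bar{x} - s$, and about 95% lie within two standard deviations, leaving roughly 2.5% above $\bar{x} + 2s$. The variable names below are illustrative, and the 30 scores are treated as a sample.

```python
from statistics import mean, stdev

# Scores copied from Exercise 3-61.
scores = [
    76, 75, 74, 56, 61, 76,
    62, 96, 68, 62, 78, 76,
    84, 67, 60, 96, 77, 59,
    67, 81, 66, 71, 69, 65,
    58, 77, 82, 75, 76, 67,
]

x_bar = mean(scores)
s = stdev(scores)  # sample standard deviation

# Empirical Rule: ~16% fall below x_bar - s, ~2.5% fall above x_bar + 2s.
reject_below = x_bar - s
send_directly_above = x_bar + 2 * s

print(f"mean = {x_bar:.2f}, s = {s:.2f}")
print(f"reject below about {reject_below:.1f}")
print(f"send directly above about {send_directly_above:.1f}")
```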

3-62. Fertilizer is a very important component for growing vegetables. It keeps the soil in optimum shape to feed the crops. Choosing the right type of fertilizer and adding the right amount are important steps in farming.

Kim, the owner of a vegetable farm, is introduced to an organic fertilizer that replaces the chemical fertilizer she is using now. She performs a test on the amounts (in kg) needed for both fertilizers on two equal pieces of land. She finds that the average amount of chemical fertilizer required is 23.2 kg with a standard deviation of 2.67 kg. The organic fertilizer requires an average of 21.6 kg with a standard deviation of 3.78 kg.

a. Determine which fertilizer gives Kim a greater relative variation.

b. Assume that the amounts of the two fertilizers used are nearly symmetrical. Find the largest and smallest amounts of the two fertilizers needed.

c. Suppose Kim does not want to use the minimum fertilizer for her farm. Use the findings in part b to help her make a decision.

Computer Software Exercises

3-63. A West Coast newspaper has asked 50 certified public accountant (CPA) firms to complete the same tax return for a hypothetical head of household. The CPA firms have their tax experts complete the return with the objective of determining the total federal income tax liability. The data in the file Taxes show the taxes owed as figured by each of the 50 CPA firms.

Theoretically, they should all come up with the same taxes owed.

Based on these data, write a short article for the paper that describes the results of this experiment.

Include in your article such descriptive statistics as the mean, median, and standard deviation. You might consider using percentiles, the coefficient of variation, and Tchebysheff’s theorem to help describe the data.

3-64. Nike RZN Black is one of the golf balls Nike, Inc., produces. It must meet the specifications of the United States Golf Association (USGA). The USGA mandates that the diameter of the ball shall not be less than 1.682 inches (42.67 mm). To verify that this specification is met, sample golf balls are taken from the production line and measured. These data are found in the file titled Diameter.

a. Calculate the mean and standard deviation of this sample.

b. Examine the specification for the diameter of the golf ball again. Does it seem that the data could possibly be bell-shaped? Explain.

c. Determine the proportion of diameters in the following intervals: $\bar{x} \pm 2s$, $\bar{x} \pm 3s$, $\bar{x} \pm 4s$. Compare these with the percentages specified by Tchebysheff’s theorem.

3-65. A company in Silicon Valley periodically offers its employees a full health screen in which data are collected on several characteristics including percent body fat. Data for a sample of employees are in the file Bodyfat. Use percent body fat as the variable of interest in the following calculations:

a. Calculate the mean and median for percent body fat.

b. Calculate the standard deviation for percent body fat.

c. Calculate the coefficient of variation for percent body fat.

d. If an employee has a percent body fat equal to 29, would it be appropriate to say that this employee is in the 90th percentile or above? Explain.

3-66. Airfare prices were collected for a round trip from Los Angeles (LAX) to Salt Lake City (SLC). Airfare prices were also collected for a round trip from Los Angeles (LAX) to Barcelona, Spain (BCN). Airfares were obtained for the designated and nearby airports during high travel months. The passenger was to fly coach class round-trip, staying seven days. The data are contained in a file titled Airfare.

a. Calculate the mean and standard deviation for each of the flights.

b. Calculate an appropriate measure of the relative variability of these two flights.

c. A British friend of yours is currently in Barcelona and wishes to fly to Los Angeles. If the flight fares are the same but priced in English pounds, determine his mean, standard deviation, and measure of relative dispersion for that data. (Note: $1 = 0.566 GBP.)

3-67. One factor that will be important for world trade is the growth rate of the population of the world’s countries.

The data file Country Growth contains the most recent United Nations data on the population and the growth rate for the last decade for 232 countries throughout the world as of 2015 (source: www.indexmundi.com/). Based on these data, which countries had growth rates more than 2 standard deviations higher than the mean growth rate? Which countries had growth rates more than 2 standard deviations below the mean growth rate?


3 Overview

Summary

• The three numerical measures of the center of a data set are the mean, median, and mode.

• The mean is the arithmetic average and is the most frequently used measure. The mean is sensitive to extreme values in the data.

• Use the median if the data are skewed or ordinal-level data. The median is unaffected by extremes and is the middle value in the data array.

• The mode is the value in the data that occurs most frequently; it is less often used as a measure of the center.

• When the mean, median, or mode is computed from a population, the measure is a parameter. If the measure is computed from sample data, the measure is a statistic.

• Other measures of location that are commonly used are percentiles and quartiles.

• A box and whisker plot uses a box to display the range of the middle 50% of the data. The limits of whiskers are calculated based on the numerical distance between the first and third quartiles (see Figure 3.13).

• One of the major issues that business decision makers face is the variation that exists in their operations, processes, and people. Because virtually all data exhibit variation, it is important to measure it.

• The range is the difference between the highest value and the lowest value in the data.

• The interquartile range measures the numerical distance between the third and first quartiles; this alternative to the range ignores the extremes in the data.

• The two most frequently used measures of variation are the variance and the standard deviation. The equations for these two measures differ slightly depending on whether you are working with a population or a sample. The standard deviation is a measure of the average deviation of the individual data items around the mean; it is measured in the same units as the variable of interest (see Figure 3.13).

• The real power of statistical measures of the center and variation is seen when they are used together to fully describe the data.

• A measure that is used a great deal in business, especially in financial analysis, is the coefficient of variation. For two or more data sets, the larger the coefficient of variation, the greater the relative variation of the data.

• The Empirical Rule allows decision makers to better understand the data from a bell-shaped distribution.

• Tchebysheff’s theorem helps describe data that are not bell-shaped.

• z-values for each individual data point measure the number of standard deviations a data value is from the mean (see Figure 3.13).

3.1 Measures of Center and Location (pg. 98–118)

3.2 Measures of Variation (pg. 119–129)

3.3 Using the Mean and Standard Deviation Together (pg. 130–137)

Outcome 1 Compute the mean, median, mode, and weighted mean for a set of data and use these measures to describe data.

Outcome 2 Construct a box and whisker graph and interpret it.

Outcome 3 Compute the range, interquartile range, variance, and standard deviation and use these measures to describe data.

Outcome 4 Compute a z-score and the coefficient of variation and apply them in decision-making situations.

Outcome 5 Use the Empirical Rule and Tchebysheff’s theorem.


FIGURE 3.13 Summary of Numerical Statistical Measures

[Figure: numerical measures organized by data level (nominal, ordinal, ratio/interval) and by type of measure (location, variation, descriptive analysis and comparisons). Measures shown: mode, median, mean, percentiles/quartiles, range, interquartile range, variance and standard deviation, box and whisker, coefficient of variation, standardized z-values.]

Equations

(3.1) Population Mean pg. 98

$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$$

(3.2) Sample Mean pg. 102

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

(3.3) Median Index pg. 103

$$i = \frac{1}{2}n$$

(3.4) Weighted Mean for a Population pg. 108

$$\mu_w = \frac{\sum w_i x_i}{\sum w_i}$$

(3.5) Weighted Mean for a Sample pg. 108

$$\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i}$$

(3.6) Percentile Location Index pg. 110

$$i = \frac{p}{100}(n)$$

(3.7) Range pg. 119

R = Maximum value − Minimum value

(3.8) Interquartile Range pg. 120

Interquartile range = Third quartile − First quartile

(3.9) Population Variance pg. 121

$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$

(3.10) Population Variance Shortcut pg. 122

$$\sigma^2 = \frac{\sum x^2 - \dfrac{\left(\sum x\right)^2}{N}}{N}$$

(3.11) Population Standard Deviation pg. 122

$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$

(3.12) Sample Variance pg. 124

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

(3.13) Sample Variance Shortcut pg. 124

$$s^2 = \frac{\sum x^2 - \dfrac{\left(\sum x\right)^2}{n}}{n - 1}$$

(3.14) Sample Standard Deviation pg. 124

$$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
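As a quick check that the shortcut and definitional variance formulas agree, here is a minimal Python sketch. The data values are hypothetical and chosen only for illustration; the definitional forms are taken from the standard library's `statistics.pvariance` and `statistics.variance`.

```python
from math import sqrt
from statistics import pvariance, variance

# Hypothetical data, chosen only for illustration.
data = [4, 8, 6, 5, 3, 7]

n = len(data)
total = sum(data)                     # sum of x
total_sq = sum(x * x for x in data)   # sum of x squared

# Shortcut forms: same numerator, different denominators.
pop_var_shortcut = (total_sq - total**2 / n) / n            # Equation (3.10)
sample_var_shortcut = (total_sq - total**2 / n) / (n - 1)   # Equation (3.13)

# Definitional forms (3.9) and (3.12), computed by the standard library.
pop_var = pvariance(data)
sample_var = variance(data)

print(f"population variance: {pop_var:.4f} (shortcut {pop_var_shortcut:.4f})")
print(f"sample variance:     {sample_var:.4f} (shortcut {sample_var_shortcut:.4f})")
print(f"sample std deviation: {sqrt(sample_var):.4f}")
```

Dividing by n − 1 rather than N always makes the sample variance slightly larger than the population formula applied to the same numbers.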


(3.15) Population Coefficient of Variation pg. 130

$$CV = \frac{\sigma}{\mu}(100)\%$$

(3.16) Sample Coefficient of Variation pg. 130

$$CV = \frac{s}{\bar{x}}(100)\%$$

(3.17) Standardized Population Data pg. 133

$$z = \frac{x - \mu}{\sigma}$$

(3.18) Standardized Sample Data pg. 133

$$z = \frac{x - \bar{x}}{s}$$
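The sample versions of the coefficient of variation and the standardized value can be illustrated with a short sketch. The sample below is hypothetical, and the variable names are assumptions made for this example.

```python
from statistics import mean, stdev

# Hypothetical sample, chosen only for illustration.
sample = [12, 15, 11, 18, 14]

x_bar = mean(sample)
s = stdev(sample)  # sample standard deviation (n - 1 denominator)

# Sample coefficient of variation, Equation (3.16), as a percentage.
cv = s / x_bar * 100

# Standardized sample data, Equation (3.18): one z-value per observation.
z_scores = [(x - x_bar) / s for x in sample]

print(f"mean = {x_bar}, s = {s:.3f}, CV = {cv:.1f}%")
for x, z in zip(sample, z_scores):
    print(f"x = {x:>2}  z = {z:+.2f}")
```

Standardizing shifts the data to a mean of zero and rescales it to a standard deviation of one, which is why z-values from different data sets can be compared directly.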

Key Terms

Box and whisker plot pg. 111
Coefficient of variation pg. 130
Data array pg. 103
Empirical Rule pg. 131
Interquartile range pg. 120
Left-skewed data pg. 104
Mean pg. 98
Median pg. 103
Mode pg. 105
Parameter pg. 98
Percentiles pg. 109
Population mean pg. 98
Quartiles pg. 111
Range pg. 119
Right-skewed data pg. 104
Sample mean pg. 101
Skewed data pg. 104
Standard deviation pg. 121
Standardized data values pg. 133
Statistic pg. 98
Symmetric data pg. 104
Tchebysheff’s theorem pg. 133
Variance pg. 121
Variation pg. 119
Weighted mean pg. 108

Chapter Exercises

Conceptual Questions

3-68. Consider the following questions concerning the sample variance:

a. Is it possible for a variance to be negative? Explain.

b. What is the smallest value a variance can be? Under what conditions does the variance equal this smallest value?

c. Under what conditions is the sample variance smaller than the corresponding sample standard deviation?

3-69. For a continuous variable that has a bell-shaped distribution, determine the percentiles associated with the endpoints of the intervals specified in the Empirical Rule.

3-70. Consider that the Empirical Rule stipulates that virtually all of the data values are within the interval $\mu \pm 3\sigma$. Use this stipulation to determine an approximation for the standard deviation involving the range.
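One possible line of reasoning, offered as a sketch rather than the book's worked answer: if virtually all values lie between $\mu - 3\sigma$ and $\mu + 3\sigma$, the range $R$ spans roughly six standard deviations.

$$R \approx (\mu + 3\sigma) - (\mu - 3\sigma) = 6\sigma \quad\Longrightarrow\quad \sigma \approx \frac{R}{6}$$

The approximation relies on the data being bell-shaped and on the observed extremes actually sitting near the $\pm 3\sigma$ limits, which small data sets may not satisfy.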

3-71. Cindy is the quality control specialist for a microphone manufacturing company. Every month, she needs to test 200 microphones with a sound pressure system. Each microphone should have an average sound pressure level of around 100 dB with a standard deviation of 0.025 dB. The sound pressure level follows a bell-shaped distribution, and the company rejects any microphones that are more than two standard deviations from the mean. Determine how many of the 200 microphones will be rejected on average.

3-72. Since the standard deviation of a set of data requires more effort to compute than the range does, what advantages does the standard deviation have when discussing the spread in a set of data?

3-73. The mode seems like a very simple measure of the location of a distribution. When would the mode be preferred over the median or the mean?

Business Applications

3-74. Recently, a store manager tracked the time customers spent in the store from the time they took a number until they left. A sample of 16 customers was selected and the following data (measured in minutes) were recorded:

15 14 16 14 14 14 13 8

12 9 7 17 10 15 16 16

a. Compute the mean, median, mode, range, interquartile range, and standard deviation.

b. Develop a box and whisker plot for these data.

3-75. Suppose the mean age of video game players is 28, the standard deviation is 9 years, and the distribution is bell-shaped. To assist a video game company’s marketing department in obtaining demographics to increase sales, determine the proportion of players who are

a. between 19 and 28

b. between 28 and 37

c. older than 37
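Each boundary in this exercise sits exactly one standard deviation from the mean, so the Empirical Rule gives roughly 34%, 34%, and 16%. The sketch below cross-checks those figures with Python's `statistics.NormalDist`, under the stronger assumption (not stated in the exercise) that "bell-shaped" means exactly normal.

```python
from statistics import NormalDist

# Assumption: model "bell-shaped" ages as a normal distribution.
ages = NormalDist(mu=28, sigma=9)

p_19_to_28 = ages.cdf(28) - ages.cdf(19)   # part a: mean - 1 sd up to the mean
p_28_to_37 = ages.cdf(37) - ages.cdf(28)   # part b: the mean up to mean + 1 sd
p_over_37 = 1 - ages.cdf(37)               # part c: beyond mean + 1 sd

print(f"between 19 and 28: {p_19_to_28:.1%}")
print(f"between 28 and 37: {p_28_to_37:.1%}")
print(f"older than 37:     {p_over_37:.1%}")
```

The normal-model values differ from the Empirical Rule figures only by rounding, since the rule's 68% is itself an approximation.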

3-76. Interns are complaining about either their internship stipend being very low or about the fact that they do not receive any stipend at all. To study this further, a lecturer selects a random sample of interns and his records show that the average salary of an intern is $800 per month. Assume that the monthly salary of an intern is symmetrically distributed with a standard deviation of $245.


a. What is the percentage of interns earning less than $555 per month?

b. Determine the percentage of interns that earn between $1,290 and $1,535 per month.

c. If the lecturer makes no assumption about the shape of the distribution, so that it can be any distribution, what percentage of interns earn more than $1,000?

3-77. The engineers at Jaguar Land Rover (JLR) produced a recent study on the service process for renewals. The repair procedures are under constant review; therefore, durations are subject to change. The revised durations (in hours) selected from oil seal installations are as follows:

2.0 1.9 2.2 2.1 1.8

2.3 2.4 1.7 2.0 2.3

a. What is the average revised time for an oil seal installation?

b. Calculate the variance and standard deviation for the revised durations.

3-78. An HR manager of a company finds that teenagers frequently change jobs. The dissatisfaction with their present jobs is a major factor in the decision they make.

Thus, she selects a sample of interviews of 15 teenagers from the past six months. She records the number of months the teenagers spent on their previous jobs:

12 5 1 6 20

24 16 7 11 8

23 19 25 14 4

a. Calculate the range of months that the teenagers spent on their jobs.

b. Calculate the median months that each spent at their previous job.

c. Calculate the interquartile range for the months each teenager spent at his or her previous job.

d. Construct a grouped data frequency distribution for the months the teenagers spent at their previous job.

e. Use the frequency distribution in part d to construct a histogram.

f. Develop a box and whisker plot for the data.

g. Suppose the HR manager decides to employ the teenagers who worked longer than the 90th percentile of months in her sample. Determine the minimum number of months each teenager should have worked to gain employment in this company.
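Parts b, c, and g all rest on the percentile location index $i = \frac{p}{100}(n)$. The sketch below applies it to the 15 observations, assuming the usual convention that a non-integer index is rounded up to the next position and an integer index averages that position with the next one. The function name `percentile` is hypothetical, and other software may use different interpolation rules.

```python
from math import ceil

# Months at previous job, copied from Exercise 3-78.
months = [12, 5, 1, 6, 20, 24, 16, 7, 11, 8, 23, 19, 25, 14, 4]


def percentile(values, p):
    """p-th percentile via the location index i = p * n / 100 (assumed convention)."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)
    n = len(ordered)
    i = p * n / 100
    if i != int(i):
        # Non-integer index: round up and take the value at that position.
        return ordered[ceil(i) - 1]
    # Integer index: average the values at positions i and i + 1.
    i = int(i)
    return (ordered[i - 1] + ordered[i]) / 2


median = percentile(months, 50)
iqr = percentile(months, 75) - percentile(months, 25)
p90 = percentile(months, 90)
print(f"median = {median}, IQR = {iqr}, 90th percentile = {p90}")
```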

3-79. Agri-Chemical Company has decided to implement a new incentive system for the managers of its three plants. The plan calls for a bonus to be paid next month to the manager whose plant has the greatest relative improvement over the average monthly production volume. The following data reflect the historical production volumes at the three plants:

Plant 1: $\mu = 700$, $\sigma = 200$

Plant 2: $\mu = 2{,}300$, $\sigma = 350$

Plant 3: $\mu = 1{,}200$, $\sigma = 30$

At the close of next month, the monthly output for the three plants was

Plant 1 = 810, Plant 2 = 2,600, Plant 3 = 1,320

Suppose the division manager has awarded the bonus to the manager of Plant 2 since her plant increased its production by 300 units over the mean, more than that for any of the other managers. Do you agree with the award of the bonus for this month? Explain, using the appropriate statistical measures to support your position.
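One way to judge "relative improvement" is to standardize each plant's output with $z = (x - \mu)/\sigma$, so that each increase is measured against that plant's own historical variability. The following sketch does this; the dictionary layout and key names are illustrative choices, not part of the exercise.

```python
# Historical mean, standard deviation, and next month's output per plant
# (figures copied from Exercise 3-79).
plants = {
    "Plant 1": {"mean": 700, "std_dev": 200, "output": 810},
    "Plant 2": {"mean": 2300, "std_dev": 350, "output": 2600},
    "Plant 3": {"mean": 1200, "std_dev": 30, "output": 1320},
}

# Standardized value: how many standard deviations above the mean.
z_scores = {
    name: (p["output"] - p["mean"]) / p["std_dev"]
    for name, p in plants.items()
}

for name, z in z_scores.items():
    print(f"{name}: z = {z:.2f}")

largest = max(z_scores, key=z_scores.get)
print(f"Largest relative improvement: {largest}")
```

A raw increase of 300 units looks large, but it is less than one standard deviation for Plant 2, whereas Plant 3's smaller increase is far outside its usual variation.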

3-80. Cathy wants to determine which type of slimming product gives better results in losing weight. She is considering two types of slimming products—pills and food supplements. She assigned two groups of people, having a similar body size, to use these products. After a month of treatment, she finds that those using the pills lost an average of 8 kg, with a standard deviation of 2 kg. Those who took the food supplements lost an average of 9 kg with a standard deviation of 3 kg.

a. Calculate the coefficient of variation for both types of slimming product.

b. Based on your findings in part a, determine which slimming product is more efficient in losing weight.

c. Assuming the weight lost produces a bell-shaped curve, determine the percentage of people who lost more than 11 kg for both groups.

3-81. Edmund wants to buy a secondhand PlayStation 3 (PS3) and he surveys the selling price from three different sources. He can purchase a PS3 from a friend, from a retail shop, or online. The following are the average and standard deviation values he finds through the three different sources:

Friends Retail Shops Online

Average Price ($) 65 80 75

Standard Deviation ($) 6 9 15

a. Determine what decisions Edmund can make from the average prices and the standard deviation values for his purchase.

b. If Edmund needs to make a decision based on the consistency of the selling price, which is the best source for him to use?

c. If the selling price is symmetrically distributed, determine the chances that Edmund will purchase the PS3 for not more than $71 from the three sources.
