
COMPUTING THE STANDARD DEVIATION


6. COMPUTING THE STANDARD DEVIATION


7. (Hypothetical). In a clinical trial, data collection usually starts at “baseline,” when the subjects are recruited into the trial but before they are randomized to treatment or control. Data collection continues until the end of followup. Two clinical trials on prevention of heart attacks report baseline data on weight, shown below. In one of these trials, the randomization did not work. Which one, and why?

                   Number of    Average
                   persons      weight     SD
(i)   Treatment    1,012        185 lb     25 lb
      Control        997        143 lb     26 lb
(ii)  Treatment      995        166 lb     27 lb
      Control      1,017        163 lb     25 lb

8. One investigator takes a sample of 100 men age 18–24 in a certain town. Another takes a sample of 1,000 such men.

(a) Which investigator will get a bigger average for the heights of the men in his sample? or should the averages be about the same?

(b) Which investigator will get a bigger SD for the heights of the men in his sample? or should the SDs be about the same?

(c) Which investigator is likely to get the tallest of the sample men? or are the chances about the same for both investigators?

(d) Which investigator is likely to get the shortest of the sample men? or are the chances about the same for both investigators?

9. The men in the HANES5 sample had an average height of 69 inches, and the SD was 3 inches. Tomorrow, one of these men will be chosen at random. You have to guess his height. What should you guess? You have about 1 chance in 3 to be off by more than ______. Fill in the blank. Options: 1/2 inch, 3 inches, 5 inches.

10. As in exercise 9, but tomorrow a whole series of men will be chosen at random. After each man appears, his actual height will be compared with your guess to see how far off you were. The r.m.s. size of the amounts off should be ______. Fill in the blank. (Hint: Look at the bottom of this page.)

The answers to these exercises are on pp. A49–50.

Example 2. Find the SD of the list 20, 10, 15, 15.

Solution. The first step is to find the average:

$$\text{average} = \frac{20 + 10 + 15 + 15}{4} = 15.$$

The second step is to find the deviations from the average: just subtract the average from each entry. The deviations are

5 −5 0 0

The last step is to find the r.m.s. size of the deviations:

$$\text{SD} = \sqrt{\frac{5^2 + (-5)^2 + 0^2 + 0^2}{4}} = \sqrt{\frac{25 + 25 + 0 + 0}{4}} = \sqrt{\frac{50}{4}} = \sqrt{12.5} \approx 3.5$$

This completes the calculation.
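The three steps of the solution can be sketched in Python. This is a minimal illustration rather than part of the original text; the function name `sd` is a hypothetical helper, and it divides by the number of entries, as in Example 2.

```python
import math

def sd(entries):
    """SD of a list: the r.m.s. size of the deviations from average."""
    # Step 1: find the average.
    average = sum(entries) / len(entries)
    # Step 2: find the deviations from the average.
    deviations = [x - average for x in entries]
    # Step 3: find the r.m.s. size of the deviations
    # (square them, take the mean, then take the square root).
    mean_square = sum(d ** 2 for d in deviations) / len(deviations)
    return math.sqrt(mean_square)

print(round(sd([20, 10, 15, 15]), 1))  # prints 3.5
```

Python's standard-library `statistics.pstdev` also divides by the number of entries, so it should agree with this sketch; `statistics.stdev` divides by one less than the number of entries and gives a slightly larger value.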

The SD comes out in the same units as the data. For example, suppose heights are measured in inches. The intermediate squaring step in the procedure changes the units to inches squared, but the square root returns the answer to the original units.¹¹

Do not confuse the SD of a list with its r.m.s. size. The SD is the r.m.s., not of the original numbers on the list, but of their deviations from average.
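The distinction between the r.m.s. size of a list and its SD can be checked numerically. The sketch below reuses the list from Example 2; the helper name `rms` is an assumption made for illustration.

```python
import math

def rms(numbers):
    """r.m.s. size: square the numbers, take the mean, then the square root."""
    return math.sqrt(sum(x ** 2 for x in numbers) / len(numbers))

entries = [20, 10, 15, 15]  # the list from Example 2
average = sum(entries) / len(entries)

# r.m.s. size of the original numbers on the list
rms_of_entries = rms(entries)
# SD: r.m.s. size of the deviations from average
sd_of_entries = rms([x - average for x in entries])

print(round(rms_of_entries, 1))  # prints 15.4
print(round(sd_of_entries, 1))   # prints 3.5
```

For this list the two quantities differ widely: the r.m.s. size reflects how big the entries are, while the SD reflects how far they are spread around their average.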

Exercise Set E

1. Guess which of the following two lists has the larger SD. Check your guess by computing the SD for both lists.

(i) 9, 9, 10, 10, 10, 12 (ii) 7, 8, 10, 11, 11, 13

2. Someone is telling you how to calculate the SD of the list 1, 2, 3, 4, 5:

The average is 3, so the deviations from average are

−2 −1 0 1 2

Drop the signs. The average deviation is

$$\frac{2 + 1 + 0 + 1 + 2}{5} = 1.2$$

And that’s the SD.

Is this right? Answer yes or no, and explain briefly.


3. Someone is telling you how to calculate the SD of the list 1, 2, 3, 4, 5:

The average is 3, so the deviations from average are

−2 −1 0 1 2

The 0 doesn’t count, so the r.m.s. deviation is

$$\sqrt{\frac{4 + 1 + 1 + 4}{4}} = 1.6$$

And that’s the SD.

Is this right? Answer yes or no, and explain briefly.

4. Three instructors are comparing scores on their finals; each had 99 students. In class A, one student got 1 point, another got 99 points, and the rest got 50 points. In class B, 49 students got a score of 1, one student got a score of 50, and 49 students got a score of 99. In class C, one student got a score of 1, one student got a score of 2, one student got a score of 3, and so forth, all the way through 99.

(a) Which class had the biggest average? or are they the same?

(b) Which class had the biggest SD? or are they the same?

(c) Which class had the biggest range? or are they the same?

5. (a) For each list below, work out the average, the deviations from average, and the SD.

(i) 1, 3, 4, 5, 7 (ii) 6, 8, 9, 10, 12

(b) How is list (ii) related to list (i)? How does this relationship carry over to the average? the deviations from the average? the SD?

6. Repeat exercise 5 for the following two lists:

(i) 1, 3, 4, 5, 7 (ii) 3, 9, 12, 15, 21

7. Repeat exercise 5 for the following two lists:

(i) 5, −4, 3, −1, 7 (ii) −5, 4, −3, 1, −7

8. (a) The Governor of California proposes to give all state employees a flat raise of $250 a month. What would this do to the average monthly salary of state employees? to the SD?

(b) What would a 5% increase in the salaries, across the board, do to the average monthly salary? to the SD?

9. What is the r.m.s. size of the list 17, 17, 17, 17, 17? the SD?

10. For the list 107, 98, 93, 101, 104, which is smaller—the r.m.s. size or the SD? No arithmetic is needed.

11. Can the SD ever be negative?

12. For a list of positive numbers, can the SD ever be larger than the average?

The answers to these exercises are on pp. A50–51.

Technical note. There is an alternative way to compute the SD, which is more efficient in some cases:¹²

$$\text{SD} = \sqrt{\text{average of (entries}^2) - (\text{average of entries})^2}.$$
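As a quick check, the alternative formula can be compared with the deviation-based procedure on the list from Example 2. This is a sketch under stated assumptions: the function names `sd_by_deviations` and `sd_by_shortcut` are hypothetical, and the `max(0.0, ...)` guard is an addition to protect against floating-point rounding, not part of the formula itself.

```python
import math

def sd_by_deviations(entries):
    """SD as the r.m.s. size of the deviations from average."""
    average = sum(entries) / len(entries)
    return math.sqrt(sum((x - average) ** 2 for x in entries) / len(entries))

def sd_by_shortcut(entries):
    """SD as sqrt(average of (entries^2) - (average of entries)^2)."""
    average_of_squares = sum(x ** 2 for x in entries) / len(entries)
    square_of_average = (sum(entries) / len(entries)) ** 2
    # Rounding error can push the difference slightly below zero,
    # so clamp it before taking the square root.
    return math.sqrt(max(0.0, average_of_squares - square_of_average))

entries = [20, 10, 15, 15]
print(round(sd_by_deviations(entries), 4))  # prints 3.5355
print(round(sd_by_shortcut(entries), 4))    # prints 3.5355
```

In floating-point arithmetic the shortcut may lose accuracy when the average is very large compared with the SD, because it subtracts two nearly equal numbers. In such cases the deviation-based version is likely the safer choice.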