
5.7 Goodness of Fit

[Figure 5.10: two power plots for the one-sample, one-sided t test, with μ0 = 0 and μa = 1 in both panels. Left panel: sx = 0.3912, n = 26.14, ν = 25.14; critical values 1.708 (t scale) and wc = 0.6681 (x̄ scale); noncentrality 2.556; α = 0.05, power = 0.8, β = 0.2. Right panel: sx = 0.36515, n = 30, ν = 29; critical values 1.699 and wc = 0.6204; noncentrality 2.739; α = 0.05, power = 0.8483, β = 0.1517.]

Fig. 5.10 Sample size and power figures for the one-sample, one-sided t-test. The left figure shows the sample size n = 26.14 calculated in Table 5.7. The right figure shows the starting position with n = 30 from Table 5.8. When the sample size n is larger (on the right), the df goes up, the heights of the densities (both null and alternative) go up, the densities become thinner, and the critical value in the t scale and in the x̄ scale goes down. The alternative distribution is noncentral t, has a different maximum height, and is not symmetric.
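The headline quantities in Figure 5.10 can be reproduced with power.t.test() from the stats package. This is only a sketch: the figure's annotations imply sd = 2 (since sx = sd/√n gives 0.3912 · √26.14 ≈ 2), and that value is an assumption here rather than something stated in the caption.

# Sketch: reproduce the Figure 5.10 numbers (sd = 2 is inferred from sx, an assumption)
# Left panel: solve for the sample size that gives power 0.80
power.t.test(delta = 1, sd = 2, sig.level = 0.05, power = 0.80,
             type = "one.sample", alternative = "one.sided")   # n is about 26.14
# Right panel: power attained when n = 30 is fixed
power.t.test(n = 30, delta = 1, sd = 2, sig.level = 0.05,
             type = "one.sample", alternative = "one.sided")   # power is about 0.848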

… asymptotic distribution. In Chapter 15, we will learn in our discussion of the analysis of contingency table data that the distribution of χ² = Σ(O − E)²/E is approximately chi-square provided that no cell sizes are too small. A determination of the ground rule for "too small" required tests of goodness of fit to chi-square distributions with appropriate degrees of freedom.

This class of tests assesses whether a sample may be assumed to be taken from a null hypothesized distribution.

5.7.1 Chi-Square Goodness-of-Fit Test

The chi-square distribution may be used to conduct goodness-of-fit tests, i.e., ones of the form

H0: the data are from a [specified population]

vs

H1: the data are from some other population

For certain specific populations, including normal ones, other specialized tests are more powerful.

The test begins by partitioning the population into k classes or categories. For a discrete population the categories are the possible values; for a continuous population the choice of a decomposition is rather arbitrary, and the ultimate conclusion may well depend on the selected size of k and the selected partition.

The test statistic is the same as that used for contingency tables. For each category, calculate from the probability distribution the theoretical or expected frequency E. If, over all k categories, there is a substantial discrepancy between the k observed frequencies O and the k E's, then H0 is rejected. The measure of discrepancy is the test statistic χ² = Σ(O − E)²/E. A "large" value of χ² is evidence against H0. If the total sample size, n = ΣO = ΣE, is sufficiently "large", χ² is approximately chi-square distributed and the p-value is approximately the chi-square tail probability associated with χ² with k − 1 degrees of freedom.
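As a minimal sketch of this recipe (the counts and null probabilities below are hypothetical, chosen only for illustration):

# Hypothetical observed counts O and null-hypothesis probabilities p0
O  <- c(12, 18, 25, 25)
p0 <- c(0.15, 0.25, 0.30, 0.30)
E  <- sum(O) * p0                        # expected frequencies under H0
chisq <- sum((O - E)^2 / E)              # the test statistic
pchisq(chisq, df = length(O) - 1, lower.tail = FALSE)  # approximate p-value
# chisq.test(O, p = p0) reproduces the same statistic and p-value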

For adequacy of the chi-square approximation it is suggested that all expected frequencies be at least 5. If this is not the case, the analyst may consider combining adjacent categories, after which this condition is met. Then k represents the number of categories following such combining.
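A brief, hypothetical sketch of such pooling (the counts and probabilities are made up; only the last categories have expected frequencies below 5):

O  <- c(20, 14, 8, 4, 2, 2)               # hypothetical observed counts
p0 <- c(.35, .30, .15, .10, .05, .05)     # hypothetical null probabilities
sum(O) * p0                               # expected counts: the last two are only 2.5
O2 <- c(O[1:3], sum(O[4:6]))              # pool the last three adjacent categories
p2 <- c(p0[1:3], sum(p0[4:6]))
chisq.test(O2, p = p2)                    # all expected counts are now at least 5; k = 4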

Sometimes, the statement of the null hypothesis is so vague that calculation of expected frequencies requires that some parameters be estimated from the data. In such instances, the df is further reduced by the number of such parameters estimated.

This possibility is illustrated in Example 5.7.3.

5.7.2 Example—Test of Goodness-of-Fit to a Discrete Uniform Distribution

A six-sided die (singular of the word dice) is rolled 30 times with the following outcomes: 1, 3 times; 2, 7 times; 3, 5 times; 4, 8 times; 5, 1 time; and 6, 6 times.

Test whether the die is fair.

A fair die is one that has a discrete uniform distribution on 1, 2, 3, 4, 5, 6. Each of these six possibilities has a 1/6 chance of occurring, and all six E's are 30(1/6) = 5.

Then

χ² = (3 − 5)²/5 + ⋯ + (6 − 5)²/5 = 6.8

and the p-value from the χ² distribution with 5 df is 0.236. Hence these 30 observations do not provide evidence to refute the fairness of the die. We show the calculations in Table 5.9 and the plot of the test in Figure 5.11.

Table 5.9 Test of Goodness-of-Fit to a Discrete Uniform Distribution. The test is plotted in Figure 5.11.

> dice <- sample(rep(1:6, c(3,7,5,8,1,6)))

> dice
 [1] 4 6 4 2 3 2 4 4 6 3 6 4 3 2 3 4 6 2 6 2 1 4 3 5 1 2 1 6
[29] 4 2

> table(dice)
dice
1 2 3 4 5 6
3 7 5 8 1 6

> chisq.test(table(dice))

        Chi-squared test for given probabilities

data:  table(dice)
X-squared = 6.8, df = 5, p-value = 0.2359
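The same statistic and p-value can be verified directly from the tabled counts; a short sketch:

O <- c(3, 7, 5, 8, 1, 6)                  # observed counts for faces 1 through 6
E <- rep(30 * (1/6), 6)                   # expected counts under a fair die
chisq <- sum((O - E)^2 / E)               # 6.8
pchisq(chisq, df = 5, lower.tail = FALSE) # 0.2359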


[Figure 5.11: χ² density with ν = 5; the 0.05 critical value is 11.07; the observed χ² = 6.8 corresponds to a tail probability of 0.236.]

Fig. 5.11 Plot of the hypothesis test of Table 5.9. The observed value χ² = 6.8 shows p = 0.236 and is in the middle of the do-not-reject region.

Table 5.10 Observed and expected frequencies for the goodness-of-fit example in Section 5.7.3.

    Y     O        E   (O−E)²/E
    0    13    6.221      7.388
    1    18   20.736      0.361
    2    20   27.648      2.116
    3    18   18.432      0.010
    4     6    6.144      0.003
    5     5    0.819     21.337
  Total               31.215

5.7.3 Example—Test of Goodness-of-Fit to a Binomial Distribution

In a certain community, there were 80 families containing exactly five children. It was noticed that there was an excess of boys among these. It was desired to test whether Y = "number of girls in family" is a binomial r.v. with n = 5 and p = .4.

The expected frequencies calculated from this binomial distribution are shown in Table 5.10 along with the observed frequencies and the calculated χ² statistic. Then the p-value is 8.5 × 10⁻⁶, calculated as the tail probability at 31.215 for a chi-square distribution with 5 df. We conclude that the sample data contain more dispersion than does binomial(5, .4). The excess dispersion is visible in the left panel of Figure 5.12.
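Table 5.10 can be reproduced in a few lines; a sketch (the object names are chosen here only for illustration):

O <- c(13, 18, 20, 18, 6, 5)                  # observed families with Y = 0..5 girls
E <- 80 * dbinom(0:5, size = 5, prob = 0.4)   # expected frequencies under binomial(5, .4)
cbind(Y = 0:5, O, E, contribution = (O - E)^2 / E)
chisq <- sum((O - E)^2 / E)                   # 31.215
pchisq(chisq, df = 5, lower.tail = FALSE)     # about 8.5e-06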

[Figure 5.12: side-by-side barplots of counts for Y = 0, …, 5; left panel "Observed", right panel "Binomial p=.4".]

Fig. 5.12 Plot of family size data from Table 5.11. The Observed data is more spread out than the Expected (binomial) data. The sample variance for the Observed is var(rep(0:5, times=Observed)) == 1.987 and the sample variance for the Expected is var(rep(0:5, times=Expected)) == 1.131.

In this example, the value of the binomial proportion parameter, p, was specified.

If instead it had to be estimated, the df would decrease from 5 to 4. We illustrate the calculation of both tests in R in Table 5.11.
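Table 5.11 itself is not reproduced here, but a sketch of the second test, assuming p is estimated by the observed mean number of girls divided by 5, might look like this (note the df drops to 4):

O <- c(13, 18, 20, 18, 6, 5)                  # observed families with Y = 0..5 girls
p.hat <- sum((0:5) * O) / (5 * sum(O))        # estimated proportion, about 0.4025
E.hat <- sum(O) * dbinom(0:5, size = 5, prob = p.hat)  # expected frequencies at p.hat
chisq.hat <- sum((O - E.hat)^2 / E.hat)
pchisq(chisq.hat, df = length(O) - 1 - 1, lower.tail = FALSE)  # df = 4, one parameter estimated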
