3 Designs with One Source of Variation

3.5 One-Way Analysis of Variance

[Fig. 3.2 Residuals under the full and reduced models when $H_0$ is false. Left panel: residuals $e_{it}$ under the full model, measured from the treatment means $\overline{y}_{1.}, \overline{y}_{2.}$; right panel: residuals under the reduced model, measured from the grand mean $\overline{y}_{..}$.]

$$Y_{it} = \mu + \tau + \epsilon_{it}^{0}, \qquad \epsilon_{it}^{0} \sim N(0, \sigma^2), \qquad \epsilon_{it}^{0}\text{'s are mutually independent,}$$
$$t = 1, \ldots, r_i, \quad i = 1, \ldots, v,$$

where we write $\epsilon_{it}^{0}$ for the $(it)$th error variable in the reduced model. To calculate the sum of squares for error, $ssE^0$, we need to determine the value of $\mu + \tau$ that minimizes the sum of squared errors

$$\sum_i \sum_t (y_{it} - \mu - \tau)^2.$$

Using calculus, the reader is asked to show in Exercise 7 that the unique least squares estimate of $\mu + \tau$ is the sample mean of all the observations; that is, $\hat{\mu} + \hat{\tau} = \overline{y}_{..}$. Therefore, the error sum of squares for the reduced model is

$$ssE^0 = \sum_i \sum_t (y_{it} - \overline{y}_{..})^2 = \sum_i \sum_t y_{it}^2 - n\,\overline{y}_{..}^2. \qquad (3.5.10)$$
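The two forms of $ssE^0$ in (3.5.10) can be checked numerically. The sketch below uses a small hypothetical data set ($v = 2$ treatments with $r_1 = 3$ and $r_2 = 4$; the numbers are illustrative, not from the text):

```python
# Hypothetical data: y[i] holds the observations for treatment i
y = {1: [10.0, 12.0, 11.0], 2: [15.0, 14.0, 16.0, 13.0]}

n = sum(len(obs) for obs in y.values())                    # total sample size
ybar_dd = sum(sum(obs) for obs in y.values()) / n          # grand mean  y-bar..

# Definitional form: sum of squared deviations from the grand mean
ssE0_def = sum((y_it - ybar_dd) ** 2 for obs in y.values() for y_it in obs)

# Computational shortcut: sum of squares minus n times the squared grand mean
ssE0_short = sum(y_it ** 2 for obs in y.values() for y_it in obs) - n * ybar_dd ** 2

print(ssE0_def, ssE0_short)  # the two forms agree
```

Either form gives the same $ssE^0$; the shortcut avoids computing each deviation explicitly.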

If the null hypothesis $H_0: \{\tau_1 = \tau_2 = \cdots = \tau_v\}$ is false, and the treatment effects differ, the sum of squares for error $ssE$ under the full model (3.3.1) is considerably smaller than the sum of squares for error $ssE^0$ for the reduced model. This is depicted in Fig. 3.2. On the other hand, if the null hypothesis is true, then $ssE^0$ and $ssE$ will be very similar. The analysis of variance test is based on the difference $ssE^0 - ssE$, relative to the size of $ssE$; that is, the test is based on $(ssE^0 - ssE)/ssE$. We would want to reject $H_0$ if this quantity is large.

We call $ssT = ssE^0 - ssE$ the sum of squares for treatments or the treatment sum of squares, since its value depends on the differences between the treatment effects. Using formulas (3.5.10) and (3.4.5) for $ssE^0$ and $ssE$, the treatment sum of squares is


$$ssT = ssE^0 - ssE \qquad (3.5.11)$$
$$= \left[\sum_i \sum_t y_{it}^2 - n\,\overline{y}_{..}^2\right] - \left[\sum_i \sum_t y_{it}^2 - \sum_i r_i \overline{y}_{i.}^2\right]$$
$$= \sum_i r_i \overline{y}_{i.}^2 - n\,\overline{y}_{..}^2. \qquad (3.5.12)$$

An equivalent formulation is

$$ssT = \sum_i r_i (\overline{y}_{i.} - \overline{y}_{..})^2. \qquad (3.5.13)$$

The reader is invited to multiply out the parentheses in (3.5.13) and verify that (3.5.12) is obtained.

There is a shortcut method of expanding (3.5.13) to obtain (3.5.12). First write down each term in $y$ and square it. Then associate with each squared term the signs in (3.5.13). Finally, precede each term with the summations and constant outside the parentheses in (3.5.13). This quick expansion will work for all terms like (3.5.13) in this book. Formula (3.5.13) is probably the easier form of $ssT$ to remember, while (3.5.12) is easier to manipulate for theoretical work and use for computations.
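The equivalence of (3.5.13) and (3.5.12) is easy to confirm numerically. A minimal sketch, again on a small hypothetical data set ($v = 2$, $r_1 = 3$, $r_2 = 4$; any numbers would do):

```python
# Hypothetical data: y[i] holds the observations for treatment i
y = {1: [10.0, 12.0, 11.0], 2: [15.0, 14.0, 16.0, 13.0]}

n = sum(len(obs) for obs in y.values())
ybar_dd = sum(sum(obs) for obs in y.values()) / n          # grand mean  y-bar..
ybar_i = {i: sum(obs) / len(obs) for i, obs in y.items()}  # treatment means y-bar_i.

# (3.5.13): sum over i of  r_i * (y-bar_i. - y-bar..)^2
ssT_313 = sum(len(y[i]) * (ybar_i[i] - ybar_dd) ** 2 for i in y)

# (3.5.12): sum over i of  r_i * y-bar_i.^2, minus n * y-bar..^2
ssT_312 = sum(len(y[i]) * ybar_i[i] ** 2 for i in y) - n * ybar_dd ** 2

print(ssT_313, ssT_312)  # identical, as the expansion shows
```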

Since we will reject $H_0$ if $ssT/ssE$ is large, we need to know what “large” means. This in turn means that we need to know the distribution of the corresponding random variable $SST/SSE$ when $H_0$ is true, where

$$SST = \sum_i r_i (\overline{Y}_{i.} - \overline{Y}_{..})^2 \quad\text{and}\quad SSE = \sum_i \sum_t (Y_{it} - \overline{Y}_{i.})^2. \qquad (3.5.14)$$

Now, as mentioned in Sect. 3.4.6, it can be shown that $SSE/\sigma^2$ has a chi-squared distribution with $n - v$ degrees of freedom, denoted by $\chi^2_{n-v}$. Similarly, it can be shown that when $H_0$ is true, $SST/\sigma^2$ has a $\chi^2_{v-1}$ distribution, and that $SST$ and $SSE$ are independent. The ratio of two independent chi-squared random variables, each divided by its degrees of freedom, has an $F$ distribution. Therefore, if $H_0$ is true, we have

$$\frac{SST/[\sigma^2 (v-1)]}{SSE/[\sigma^2 (n-v)]} \;\sim\; F_{v-1,\,n-v}.$$

We now know the distribution of $SST/SSE$ multiplied by the constant $(n-v)/(v-1)$, and we want to reject the null hypothesis $H_0: \{\tau_1 = \cdots = \tau_v\}$ in favor of the alternative hypothesis $H_A$: {at least two of the treatment effects differ} if this ratio is large. Thus, if we write $msT = ssT/(v-1)$ and $msE = ssE/(n-v)$, where $ssT$ and $ssE$ are the observed values of the treatment sum of squares and error sum of squares, respectively, our decision rule is to

$$\text{reject } H_0 \text{ if } \frac{msT}{msE} > F_{v-1,\,n-v,\,\alpha}, \qquad (3.5.15)$$

where $F_{v-1,n-v,\alpha}$ is the critical value from the $F$ distribution with $v-1$ and $n-v$ degrees of freedom with $\alpha$ in the right-hand tail. The probability $\alpha$ is often called the significance level of the test and is the probability of rejecting $H_0$ when in fact it is true (a Type I error). Thus, $\alpha$ should be selected to be small if it is important not to make a Type I error ($\alpha = 0.01$ and $0.001$ are typical choices); otherwise, $\alpha$ can be chosen to be a little larger ($\alpha = 0.10$ and $0.05$ are typical choices). Critical values $F_{v-1,n-v,\alpha}$ for the $F$ distribution are given in Table A.6. Due to lack of space, only a few typical values of $\alpha$ have been tabulated.
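In place of Table A.6, critical values can be obtained from software. A brief sketch, assuming the third-party SciPy library is available (the values $v = 4$, $n = 16$ are just an example):

```python
from scipy.stats import f

v, n, alpha = 4, 16, 0.01                  # example: 4 treatments, 16 observations
F_crit = f.ppf(1 - alpha, v - 1, n - v)    # F_{v-1, n-v, alpha}: upper-alpha quantile
print(round(F_crit, 2))
```

The `ppf` (percent point function) call inverts the $F$ cumulative distribution, so `f.ppf(1 - alpha, ...)` leaves probability $\alpha$ in the right-hand tail, exactly as $F_{v-1,n-v,\alpha}$ is defined.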


Table 3.4 One-way analysis of variance table

Source of variation | Degrees of freedom | Sum of squares | Mean square   | Ratio     | Expected mean square
--------------------|--------------------|----------------|---------------|-----------|---------------------
Treatments          | $v-1$              | $ssT$          | $ssT/(v-1)$   | $msT/msE$ | $\sigma^2 + Q(\tau_i)$
Error               | $n-v$              | $ssE$          | $ssE/(n-v)$   |           | $\sigma^2$
Total               | $n-1$              | $sstot$        |               |           |

Computational formulae:
$$ssT = \sum_i r_i \overline{y}_{i.}^2 - n\,\overline{y}_{..}^2, \qquad ssE = \sum_i \sum_t y_{it}^2 - \sum_i r_i \overline{y}_{i.}^2, \qquad sstot = \sum_i \sum_t y_{it}^2 - n\,\overline{y}_{..}^2,$$
$$Q(\tau_i) = \sum_i r_i \Bigl(\tau_i - \sum_h r_h \tau_h / n\Bigr)^2 \Big/ (v-1).$$

The calculations involved in the test of the hypothesis $H_0$ against $H_A$ are usually written as an analysis of variance table as shown in Table 3.4. The last line shows the total sum of squares and total degrees of freedom. The total sum of squares, $sstot$, is $(n-1)$ times the sample variance of all of the data values. Thus,

$$sstot = \sum_i \sum_t (y_{it} - \overline{y}_{..})^2 = \sum_i \sum_t y_{it}^2 - n\,\overline{y}_{..}^2. \qquad (3.5.16)$$

From (3.5.10), we see that $sstot$ happens to be equal to $ssE^0$ for the one-way analysis of variance model, and from (3.5.11) we see that

$$sstot = ssT + ssE.$$

Thus, the total sum of squares consists of a part $ssT$ that is explained by differences between the treatment effects and a part $ssE$ that is not explained by any of the parameters in the model.
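The partition $sstot = ssT + ssE$ can be verified directly. A minimal sketch on a small hypothetical data set ($v = 2$ treatments, unequal replication):

```python
# Hypothetical data: y[i] holds the observations for treatment i
y = {1: [10.0, 12.0, 11.0], 2: [15.0, 14.0, 16.0, 13.0]}

n = sum(len(obs) for obs in y.values())
ybar_dd = sum(sum(obs) for obs in y.values()) / n          # grand mean
ybar_i = {i: sum(obs) / len(obs) for i, obs in y.items()}  # treatment means

# Total SS: squared deviations of every observation from the grand mean
sstot = sum((x - ybar_dd) ** 2 for obs in y.values() for x in obs)

# Treatment SS: squared deviations of treatment means from the grand mean
ssT = sum(len(y[i]) * (ybar_i[i] - ybar_dd) ** 2 for i in y)

# Error SS: squared deviations of observations from their treatment mean
ssE = sum((x - ybar_i[i]) ** 2 for i, obs in y.items() for x in obs)

print(sstot, ssT + ssE)  # the partition holds exactly
```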

Example 3.5.1 Battery experiment, continued

Consider the battery experiment introduced in Sect. 2.5.2, p. 24. The sum of squares for error was calculated in Example 3.4.2, p. 40, to be $ssE = 28{,}412.5$. The life per unit cost responses and treatment averages are given in Table 3.3, p. 41. From these, we have $\sum_i \sum_t y_{it}^2 = 6{,}028{,}288$, $\overline{y}_{..} = 590.125$, and $r_i = 4$. Hence, the sums of squares $ssT$ (3.5.12) and $sstot$ (3.5.16) are

$$ssT = \sum_i r_i \overline{y}_{i.}^2 - n\,\overline{y}_{..}^2$$
$$= 4(570.75^2 + 860.50^2 + 433.00^2 + 496.25^2) - 16(590.125)^2$$
$$= 427{,}915.25,$$

$$sstot = ssE^0 = \sum_i \sum_t y_{it}^2 - n\,\overline{y}_{..}^2 = 6{,}028{,}288 - 16(590.125)^2 = 456{,}327.75,$$

and we can verify that $sstot = ssT + ssE$.
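These battery-experiment calculations are easy to reproduce from the treatment averages and $ssE$ quoted above:

```python
# Treatment averages y-bar_i. for the four battery types (from Table 3.3)
means = [570.75, 860.50, 433.00, 496.25]
r, v = 4, 4                   # r_i = 4 observations per type, v = 4 types
n = r * v                     # n = 16
ssE = 28_412.5                # from Example 3.4.2
sum_sq = 6_028_288.0          # sum of y_it^2

ybar_dd = sum(r * m for m in means) / n       # grand mean, 590.125
ssT = sum(r * m ** 2 for m in means) - n * ybar_dd ** 2      # (3.5.12)
sstot = sum_sq - n * ybar_dd ** 2                            # (3.5.16)

print(ssT, sstot, ssT + ssE)  # sstot equals ssT + ssE, as claimed
```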

The decision rule for testing the null hypothesis $H_0: \{\tau_1 = \tau_2 = \tau_3 = \tau_4\}$ that the four battery types have the same average life per unit cost against the alternative hypothesis that at least two of the battery types differ, at significance level $\alpha$, is

reject $H_0$ if $msT/msE = 60.24 > F_{3,12,\alpha}$.


Table 3.5 One-way analysis of variance table for the battery experiment

Source of variation | Degrees of freedom | Sum of squares | Mean square | Ratio | p-value
--------------------|--------------------|----------------|-------------|-------|--------
Type                | 3                  | 427,915.25     | 142,638.42  | 60.24 | 0.0001
Error               | 12                 | 28,412.50      | 2,367.71    |       |
Total               | 15                 | 456,327.75     |             |       |

From Table A.6, it can be seen that $60.24 > F_{3,12,\alpha}$ for any of the tabulated values of $\alpha$. For example, if $\alpha$ is chosen to be 0.01, then $F_{3,12,0.01} = 5.95$. Thus, for any tabulated choice of $\alpha$, the null hypothesis is rejected, and it is concluded that at least two of the battery types differ in mean life per unit cost. In order to investigate which particular pairs of battery types differ, we would need to calculate confidence intervals. This will be done in Chap. 4.

3.5.2 Use of p-Values

The p-value of a test is the smallest choice of $\alpha$ that would allow the null hypothesis to be rejected. For convenience, computer packages usually print the p-value as well as the ratio $msT/msE$. Having information about the p-value saves looking up $F_{v-1,n-v,\alpha}$ in Table A.6. All we need to do is to compare the p-value with our selected value of $\alpha$. Therefore, the decision rule for testing $H_0: \{\tau_1 = \cdots = \tau_v\}$ against $H_A$: {not all of the $\tau_i$'s are equal} can be written as

reject $H_0$ if $p < \alpha$.
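The p-value itself is just the upper-tail probability of the observed ratio under the $F_{v-1,n-v}$ distribution. A sketch using the battery-experiment ratio, assuming the third-party SciPy library is available:

```python
from scipy.stats import f

msT_over_msE = 60.24          # observed ratio from the battery experiment
v, n = 4, 16                  # 4 battery types, 16 observations
p = f.sf(msT_over_msE, v - 1, n - v)   # survival function = upper-tail probability
print(p < 0.01)               # compare with a chosen alpha to decide
```

Here `f.sf(x, d1, d2)` computes $P(F_{d_1,d_2} > x)$; comparing `p` with the chosen $\alpha$ implements the decision rule above.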

Example 3.5.2 Battery experiment, continued

In the battery experiment of Example 3.5.1, the null hypothesis $H_0: \{\tau_1 = \tau_2 = \tau_3 = \tau_4\}$ that the four battery types have the same average life per unit cost was tested against the alternative hypothesis that they do not. The p-value generated by SAS software for the test is shown in Table 3.5 as $p = 0.0001$. A value of 0.0001 in the SAS computer output indicates that the p-value is less than or equal to 0.0001. Smaller values are not printed explicitly. If $\alpha$ were chosen to be 0.01, then the null hypothesis would be rejected, since $p < \alpha$.