3.3 Analysis of the Fixed Effects Model
In this section, we develop the single-factor analysis of variance for the fixed effects model.
Recall that $y_{i.}$ represents the total of the observations under the $i$th treatment. Let $\bar{y}_{i.}$ represent the average of the observations under the $i$th treatment. Similarly, let $y_{..}$ represent the grand total of all the observations and $\bar{y}_{..}$ represent the grand average of all the observations. Expressed symbolically,

$$y_{i.} = \sum_{j=1}^{n} y_{ij}, \qquad \bar{y}_{i.} = y_{i.}/n, \qquad i = 1, 2, \ldots, a$$

$$y_{..} = \sum_{i=1}^{a} \sum_{j=1}^{n} y_{ij}, \qquad \bar{y}_{..} = y_{..}/N \tag{3.3}$$

where $N = an$ is the total number of observations. We see that the "dot" subscript notation implies summation over the subscript that it replaces.

We are interested in testing the equality of the $a$ treatment means; that is, $E(y_{ij}) = \mu + \tau_i = \mu_i$, $i = 1, 2, \ldots, a$. The appropriate hypotheses are

$$H_0\colon \mu_1 = \mu_2 = \cdots = \mu_a$$
$$H_1\colon \mu_i \ne \mu_j \ \text{for at least one pair } (i, j) \tag{3.4}$$

In the effects model, we break the $i$th treatment mean $\mu_i$ into two components such that $\mu_i = \mu + \tau_i$. We usually think of $\mu$ as an overall mean so that

$$\frac{\sum_{i=1}^{a} \mu_i}{a} = \mu$$

This definition implies that

$$\sum_{i=1}^{a} \tau_i = 0$$

That is, the treatment or factor effects can be thought of as deviations from the overall mean.1 Consequently, an equivalent way to write the above hypotheses is in terms of the treatment effects $\tau_i$, say

$$H_0\colon \tau_1 = \tau_2 = \cdots = \tau_a = 0$$
$$H_1\colon \tau_i \ne 0 \ \text{for at least one } i$$

Thus, we speak of testing the equality of treatment means or testing that the treatment effects (the $\tau_i$) are zero. The appropriate procedure for testing the equality of $a$ treatment means is the analysis of variance.
1For more information on this subject, refer to the supplemental text material for Chapter 3.
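As a quick numerical sketch of this notation (using NumPy and made-up numbers, neither of which come from the text), the dot-subscript quantities of Equation 3.3 can be computed directly:

```python
import numpy as np

# Hypothetical data: a = 3 treatments, n = 4 replicates each
# (illustrative numbers only, not from the text).
y = np.array([
    [9.0, 12.0, 10.0, 8.0],    # treatment 1
    [20.0, 21.0, 23.0, 17.0],  # treatment 2
    [6.0, 5.0, 8.0, 16.0],     # treatment 3
])
a, n = y.shape
N = a * n  # total number of observations, N = an

y_i_dot = y.sum(axis=1)       # treatment totals y_i.  (sum over j)
ybar_i_dot = y_i_dot / n      # treatment averages ybar_i.
y_dot_dot = y.sum()           # grand total y..  (sum over i and j)
ybar_dot_dot = y_dot_dot / N  # grand average ybar..
```

Each "dot" replaces the subscript being summed over, exactly as in Equation 3.3.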
3.3.1 Decomposition of the Total Sum of Squares
The name analysis of variance is derived from a partitioning of total variability into its component parts. The total corrected sum of squares

$$SS_T = \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2$$

is used as a measure of overall variability in the data. Intuitively, this is reasonable because if we were to divide $SS_T$ by the appropriate number of degrees of freedom (in this case, $an - 1 = N - 1$), we would have the sample variance of the $y$'s. The sample variance is, of course, a standard measure of variability.

Note that the total corrected sum of squares $SS_T$ may be written as

$$\sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2 = \sum_{i=1}^{a} \sum_{j=1}^{n} [(\bar{y}_{i.} - \bar{y}_{..}) + (y_{ij} - \bar{y}_{i.})]^2 \tag{3.5}$$

or

$$\sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2 = n \sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2 + 2 \sum_{i=1}^{a} \sum_{j=1}^{n} (\bar{y}_{i.} - \bar{y}_{..})(y_{ij} - \bar{y}_{i.})$$

However, the cross-product term in this last equation is zero, because

$$\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.}) = y_{i.} - n\bar{y}_{i.} = y_{i.} - n(y_{i.}/n) = 0$$

Therefore, we have

$$\sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2 = n \sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2 \tag{3.6}$$

Equation 3.6 is the fundamental ANOVA identity. It states that the total variability in the data, as measured by the total corrected sum of squares, can be partitioned into a sum of squares of the differences between the treatment averages and the grand average plus a sum of squares of the differences of observations within treatments from the treatment average. Now, the difference between the observed treatment averages and the grand average is a measure of the differences between treatment means, whereas the differences of observations within a treatment from the treatment average can be due to only random error. Thus, we may write Equation 3.6 symbolically as

$$SS_T = SS_{\text{Treatments}} + SS_E$$

where $SS_{\text{Treatments}}$ is called the sum of squares due to treatments (i.e., between treatments), and $SS_E$ is called the sum of squares due to error (i.e., within treatments). There are $an = N$ total observations; thus, $SS_T$ has $N - 1$ degrees of freedom. There are $a$ levels of the factor (and $a$ treatment means), so $SS_{\text{Treatments}}$ has $a - 1$ degrees of freedom. Finally, there are $n$ replicates within any treatment providing $n - 1$ degrees of freedom with which to estimate the experimental error. Because there are $a$ treatments, we have $a(n - 1) = an - a = N - a$ degrees of freedom for error.

It is instructive to examine explicitly the two terms on the right-hand side of the fundamental ANOVA identity. Consider the error sum of squares

$$SS_E = \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2 = \sum_{i=1}^{a} \left[ \sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2 \right]$$
In this form, it is easy to see that the term within square brackets, if divided by $n - 1$, is the sample variance in the $i$th treatment, or

$$S_i^2 = \frac{\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2}{n - 1}, \qquad i = 1, 2, \ldots, a$$

Now the $a$ sample variances may be combined to give a single estimate of the common population variance as follows:

$$\frac{(n-1)S_1^2 + (n-1)S_2^2 + \cdots + (n-1)S_a^2}{(n-1) + (n-1) + \cdots + (n-1)} = \frac{\sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2}{\sum_{i=1}^{a} (n-1)} = \frac{SS_E}{N - a}$$

Thus, $SS_E/(N - a)$ is a pooled estimate of the common variance within each of the $a$ treatments.

Similarly, if there were no differences between the $a$ treatment means, we could use the variation of the treatment averages from the grand average to estimate $\sigma^2$. Specifically,

$$\frac{SS_{\text{Treatments}}}{a - 1} = \frac{n \sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2}{a - 1}$$

is an estimate of $\sigma^2$ if the treatment means are equal. The reason for this may be intuitively seen as follows: The quantity $\sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2/(a - 1)$ estimates $\sigma^2/n$, the variance of the treatment averages, so $n \sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2/(a - 1)$ must estimate $\sigma^2$ if there are no differences in treatment means.

We see that the ANOVA identity (Equation 3.6) provides us with two estimates of $\sigma^2$: one based on the inherent variability within treatments and the other based on the variability between treatments. If there are no differences in the treatment means, these two estimates should be very similar, and if they are not, we suspect that the observed difference must be caused by differences in the treatment means. Although we have used an intuitive argument to develop this result, a somewhat more formal approach can be taken.

The quantities

$$MS_{\text{Treatments}} = \frac{SS_{\text{Treatments}}}{a - 1}$$

and

$$MS_E = \frac{SS_E}{N - a}$$

are called mean squares. We now examine the expected values of these mean squares.

Consider

$$E(MS_E) = E\left[\frac{SS_E}{N - a}\right] = \frac{1}{N - a} E\left[\sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2\right] = \frac{1}{N - a} E\left[\sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij}^2 - 2y_{ij}\bar{y}_{i.} + \bar{y}_{i.}^2)\right]$$

$$= \frac{1}{N - a} E\left[\sum_{i=1}^{a} \sum_{j=1}^{n} y_{ij}^2 - n\sum_{i=1}^{a} \bar{y}_{i.}^2\right] = \frac{1}{N - a} E\left[\sum_{i=1}^{a} \sum_{j=1}^{n} y_{ij}^2 - \frac{1}{n}\sum_{i=1}^{a} y_{i.}^2\right]$$
Substituting the model (Equation 3.1), $y_{ij} = \mu + \tau_i + \epsilon_{ij}$, into this equation, we obtain

$$E(MS_E) = \frac{1}{N - a} E\left[\sum_{i=1}^{a} \sum_{j=1}^{n} (\mu + \tau_i + \epsilon_{ij})^2 - \frac{1}{n}\sum_{i=1}^{a} (n\mu + n\tau_i + \epsilon_{i.})^2\right]$$

Now when squaring and taking expectation of the quantity within the brackets, we see that terms involving $\epsilon_{ij}^2$ and $\epsilon_{i.}^2$ are replaced by $\sigma^2$ and $n\sigma^2$, respectively, because $E(\epsilon_{ij}) = 0$. Furthermore, all cross products involving $\epsilon_{ij}$ have zero expectation. Therefore, after squaring and taking expectation, the last equation becomes

$$E(MS_E) = \frac{1}{N - a}\left[N\mu^2 + n\sum_{i=1}^{a}\tau_i^2 + N\sigma^2 - N\mu^2 - n\sum_{i=1}^{a}\tau_i^2 - a\sigma^2\right]$$

or

$$E(MS_E) = \sigma^2$$

By a similar approach, we may also show that2

$$E(MS_{\text{Treatments}}) = \sigma^2 + \frac{n\sum_{i=1}^{a}\tau_i^2}{a - 1}$$

Thus, as we argued heuristically, $MS_E = SS_E/(N - a)$ estimates $\sigma^2$, and, if there are no differences in treatment means (which implies that $\tau_i = 0$), $MS_{\text{Treatments}} = SS_{\text{Treatments}}/(a - 1)$ also estimates $\sigma^2$. However, note that if treatment means do differ, the expected value of the treatment mean square is greater than $\sigma^2$.

It seems clear that a test of the hypothesis of no difference in treatment means can be performed by comparing $MS_{\text{Treatments}}$ and $MS_E$. We now consider how this comparison may be made.

3.3.2 Statistical Analysis

We now investigate how a formal test of the hypothesis of no differences in treatment means ($H_0\colon \mu_1 = \mu_2 = \cdots = \mu_a$, or equivalently, $H_0\colon \tau_1 = \tau_2 = \cdots = \tau_a = 0$) can be performed. Because we have assumed that the errors $\epsilon_{ij}$ are normally and independently distributed with mean zero and variance $\sigma^2$, the observations $y_{ij}$ are normally and independently distributed with mean $\mu + \tau_i$ and variance $\sigma^2$. Thus, $SS_T$ is a sum of squares in normally distributed random variables; consequently, it can be shown that $SS_T/\sigma^2$ is distributed as chi-square with $N - 1$ degrees of freedom. Furthermore, we can show that $SS_E/\sigma^2$ is chi-square with $N - a$ degrees of freedom and that $SS_{\text{Treatments}}/\sigma^2$ is chi-square with $a - 1$ degrees of freedom if the null hypothesis $H_0\colon \tau_i = 0$ is true. However, all three sums of squares are not necessarily independent because $SS_{\text{Treatments}}$ and $SS_E$ add to $SS_T$. The following theorem, which is a special form of one attributed to William G. Cochran, is useful in establishing the independence of $SS_E$ and $SS_{\text{Treatments}}$.
2Refer to the supplemental text material for Chapter 3.
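The expected mean square results above can be spot-checked by simulation. The following sketch (NumPy; the parameter values are arbitrary choices, not from the text) draws repeated data sets from the effects model and compares the average observed mean squares with $E(MS_E) = \sigma^2$ and $E(MS_{\text{Treatments}}) = \sigma^2 + n\sum\tau_i^2/(a-1)$:

```python
import numpy as np

# Monte Carlo check of the expected mean squares (arbitrary parameters,
# chosen for illustration only).
rng = np.random.default_rng(1)
a, n = 4, 5
N = a * n
mu, sigma2 = 10.0, 2.0
tau = np.array([-1.0, 0.5, 0.5, 0.0])  # treatment effects; sum(tau) = 0

ms_e_vals, ms_tr_vals = [], []
for _ in range(20000):
    eps = rng.normal(0.0, np.sqrt(sigma2), size=(a, n))
    y = mu + tau[:, None] + eps  # effects model: y_ij = mu + tau_i + eps_ij
    ybar_i = y.mean(axis=1, keepdims=True)
    ms_e_vals.append(((y - ybar_i) ** 2).sum() / (N - a))
    ms_tr_vals.append(n * ((ybar_i - y.mean()) ** 2).sum() / (a - 1))

avg_ms_e = np.mean(ms_e_vals)
avg_ms_tr = np.mean(ms_tr_vals)

# Theory: E(MS_E) = sigma^2 = 2.0, and
# E(MS_Treatments) = sigma^2 + n * sum(tau^2) / (a - 1) = 2.0 + 5 * 1.5 / 3 = 4.5
```

With these parameters the averages should settle near 2.0 and 4.5, illustrating that the treatment mean square exceeds $\sigma^2$ whenever some $\tau_i \ne 0$.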
Because the degrees of freedom for $SS_{\text{Treatments}}$ and $SS_E$ add to $N - 1$, the total number of degrees of freedom, Cochran's theorem implies that $SS_{\text{Treatments}}/\sigma^2$ and $SS_E/\sigma^2$ are independently distributed chi-square random variables. Therefore, if the null hypothesis of no difference in treatment means is true, the ratio

$$F_0 = \frac{SS_{\text{Treatments}}/(a - 1)}{SS_E/(N - a)} = \frac{MS_{\text{Treatments}}}{MS_E} \tag{3.7}$$

is distributed as $F$ with $a - 1$ and $N - a$ degrees of freedom. Equation 3.7 is the test statistic for the hypothesis of no differences in treatment means.

From the expected mean squares we see that, in general, $MS_E$ is an unbiased estimator of $\sigma^2$. Also, under the null hypothesis, $MS_{\text{Treatments}}$ is an unbiased estimator of $\sigma^2$. However, if the null hypothesis is false, the expected value of $MS_{\text{Treatments}}$ is greater than $\sigma^2$. Therefore, under the alternative hypothesis, the expected value of the numerator of the test statistic (Equation 3.7) is greater than the expected value of the denominator, and we should reject $H_0$ on values of the test statistic that are too large. This implies an upper-tail, one-tail critical region. Therefore, we should reject $H_0$ and conclude that there are differences in the treatment means if

$$F_0 > F_{\alpha,\, a-1,\, N-a}$$

where $F_0$ is computed from Equation 3.7. Alternatively, we could use the P-value approach for decision making. The table of $F$ percentages in the Appendix (Table IV) can be used to find bounds on the P-value.

The sums of squares may be computed in several ways. One direct approach is to make use of the definition

$$y_{ij} - \bar{y}_{..} = (\bar{y}_{i.} - \bar{y}_{..}) + (y_{ij} - \bar{y}_{i.})$$

Use a spreadsheet to compute these three terms for each observation. Then, sum up the squares to obtain $SS_T$, $SS_{\text{Treatments}}$, and $SS_E$. Another approach is to rewrite and simplify the definitions of $SS_{\text{Treatments}}$ and $SS_T$ in Equation 3.6, which results in

$$SS_T = \sum_{i=1}^{a} \sum_{j=1}^{n} y_{ij}^2 - \frac{y_{..}^2}{N} \tag{3.8}$$

$$SS_{\text{Treatments}} = \frac{1}{n}\sum_{i=1}^{a} y_{i.}^2 - \frac{y_{..}^2}{N} \tag{3.9}$$

and

$$SS_E = SS_T - SS_{\text{Treatments}} \tag{3.10}$$
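Putting the pieces together, the shortcut formulas 3.8 through 3.10 and the test statistic of Equation 3.7 can be sketched as follows (NumPy/SciPy, with made-up data that are not from the text; `scipy.stats.f_oneway` serves as an independent cross-check):

```python
import numpy as np
from scipy import stats

# Hypothetical a = 3, n = 4 data (illustrative only, not from the text).
y = np.array([
    [9.0, 12.0, 10.0, 8.0],
    [20.0, 21.0, 23.0, 17.0],
    [6.0, 5.0, 8.0, 16.0],
])
a, n = y.shape
N = a * n

y_i = y.sum(axis=1)  # treatment totals y_i.
y_dd = y.sum()       # grand total y..

SS_T = (y ** 2).sum() - y_dd ** 2 / N          # Equation 3.8
SS_Tr = (y_i ** 2).sum() / n - y_dd ** 2 / N   # Equation 3.9
SS_E = SS_T - SS_Tr                            # Equation 3.10

# Shortcut SS_T agrees with the definitional form sum((y_ij - ybar..)^2).
assert np.isclose(SS_T, ((y - y.mean()) ** 2).sum())

# MS_E is the pooled within-treatment variance (equal n, so a plain mean).
assert np.isclose(SS_E / (N - a), y.var(axis=1, ddof=1).mean())

F0 = (SS_Tr / (a - 1)) / (SS_E / (N - a))      # Equation 3.7
p_value = stats.f.sf(F0, a - 1, N - a)         # upper-tail P-value

# Cross-check against SciPy's one-way ANOVA on the same three samples.
F_ref, p_ref = stats.f_oneway(y[0], y[1], y[2])
assert np.isclose(F0, F_ref) and np.isclose(p_value, p_ref)
```

An upper-tail test then rejects $H_0$ when $F_0$ exceeds the critical value $F_{\alpha,\,a-1,\,N-a}$ (available as `stats.f.ppf(1 - alpha, a - 1, N - a)`).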