
Distribution of Quadratic Forms



Estimated Variances. In Chapter 3, the variance–covariance matrices for $\hat{\beta}$, $\hat{Y}$, and $e$ were expressed in terms of the true variance $\sigma^2$. Estimates of the variance–covariance matrices are obtained by substituting $s^2 = 107.81$ for $\sigma^2$ in each Var(·) formula; $s^2(\cdot)$ is used to denote an estimated variance–covariance matrix. (Note the boldface type to distinguish the matrix of estimates from individual variances.)

Example 4.5. In the ozone example (Example 4.3),

$$
s^2(\hat{\beta}) = (X'X)^{-1}s^2 =
\begin{bmatrix}
1.0755 & -9.4340 \\
-9.4340 & 107.8167
\end{bmatrix}
(107.81) =
\begin{bmatrix}
115.94 & -1{,}017.0 \\
-1{,}017.0 & 11{,}623
\end{bmatrix}.
$$

Thus, $s^2(\hat{\beta}_0) = (1.0755)(107.81) = 115.94$, $s^2(\hat{\beta}_1) = (107.8167)(107.81) = 11{,}623$, and $\widehat{\text{Cov}}(\hat{\beta}_0, \hat{\beta}_1) = (-9.4340)(107.81) = -1{,}017.0$.

In each case, the first number in the product is the appropriate coefficient from the $(X'X)^{-1}$ matrix; the second number is $s^2$. (It is only coincidence that the lower right diagonal element of $(X'X)^{-1}$ is almost identical to $s^2$.)

The estimated variance–covariance matrices for $\hat{Y}$ and $e$ are found similarly by replacing $\sigma^2$ with $s^2$ in the corresponding variance–covariance matrices.
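The substitution is mechanical, and a short numerical sketch can make it concrete. The Python fragment below rebuilds $s^2(\hat{\beta})$, $s^2(\hat{Y})$, and $s^2(e)$ from first principles; the dose and yield vectors are restated here as an assumption of the sketch (they reproduce the $(X'X)^{-1}$ and $s^2 = 107.81$ values quoted above).

```python
import numpy as np

# Ozone example: dose x (ppm ozone) and soybean yield y.
# These data values are assumed; they reproduce (X'X)^{-1} and s^2 above.
x = np.array([0.02, 0.07, 0.11, 0.15])
y = np.array([242.0, 237.0, 231.0, 201.0])

n = len(y)
X = np.column_stack([np.ones(n), x])     # design matrix with intercept column
XtX_inv = np.linalg.inv(X.T @ X)         # (X'X)^{-1}
beta_hat = XtX_inv @ X.T @ y             # least squares estimates

e = y - X @ beta_hat                     # residual vector
p_prime = X.shape[1]                     # p' = number of parameters (2)
s2 = (e @ e) / (n - p_prime)             # s^2 = MS(Res) = 107.81

P = X @ XtX_inv @ X.T                    # projection matrix P
s2_beta = XtX_inv * s2                   # s^2(beta-hat)
s2_Yhat = P * s2                         # s^2(Y-hat)
s2_e = (np.eye(n) - P) * s2              # s^2(e)

print(np.round(s2_beta, 2))              # ~ [[115.94, -1017.0], [-1017.0, 11623.2]]
```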

4.4 Distribution of Quadratic Forms

The probability distributions of the quadratic forms provide the basis for parametric tests of significance. It is at this point (and in making confidence interval statements about the parameters) that the normality assumption on the $\epsilon_i$ comes into play. The results are summarized assuming that normality of $\epsilon$, and therefore normality of $Y$, is satisfied. When normality is not satisfied, the parametric tests of significance must be regarded as approximations.

A general result from statistical theory [see, for example, Searle (1971)] states:

If $Y$ is normally distributed with $E(Y) = \mu$ and $\text{Var}(Y) = V\sigma^2$, where $V$ is a nonsingular matrix ($\mu$ may be $X\beta$ and $V$ may be $I$), then

1. a quadratic form $Y'(A/\sigma^2)Y$ is distributed as a noncentral chi-square with

(a) degrees of freedom equal to the rank of $A$, $\text{df} = r(A)$, and

(b) noncentrality parameter $\Omega = \mu'A\mu/2\sigma^2$,

if $AV$ is idempotent (if $V = I$, the condition reduces to $A$ being idempotent);

2. quadratic forms $Y'AY$ and $Y'BY$ are independent of each other if $AVB = 0$ (if $V = I$, the condition reduces to $AB = 0$; that is, $A$ and $B$ are orthogonal to each other); and

3. a quadratic function $Y'AY$ is independent of a linear function $BY$ if $BVA = 0$. (If $V = I$, the condition reduces to $BA = 0$.)
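These conditions lend themselves to a quick simulation check. The sketch below, using a hypothetical mean vector and variance chosen only for illustration, exercises result 1 with $V = I$ and $A = J/n$: it confirms that $A$ is idempotent and that the simulated mean of $Y'(A/\sigma^2)Y$ agrees with the noncentral chi-square mean $r(A) + 2\Omega$. (One caution: most software, SciPy included, parameterizes the noncentral chi-square by $\lambda = \mu'A\mu/\sigma^2$, which is $2\Omega$ in this text's convention.)

```python
import numpy as np

rng = np.random.default_rng(0)

n = 4
A = np.full((n, n), 1.0 / n)             # A = J/n is idempotent with rank 1
assert np.allclose(A @ A, A)             # so result 1 applies with V = I

mu = np.array([1.0, 2.0, 3.0, 4.0])      # hypothetical mean vector
sigma2 = 2.0                             # hypothetical variance
df = np.linalg.matrix_rank(A)            # degrees of freedom r(A) = 1
omega = (mu @ A @ mu) / (2 * sigma2)     # Omega = mu'A mu / 2sigma^2

# Simulate Y ~ N(mu, sigma^2 I) and form the quadratic Y'AY/sigma^2.
Y = rng.multivariate_normal(mu, sigma2 * np.eye(n), size=200_000)
q = np.einsum('ij,jk,ik->i', Y, A, Y) / sigma2

# A noncentral chi-square has mean df + lambda, with lambda = 2 * Omega here.
print(q.mean(), df + 2 * omega)          # both close to 13.5
```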

Application to Regression. In the normal multiple regression model, the following hold.

1. The sums of squares for model, mean, regression, and residuals all involve defining matrices that are idempotent. Recall that $\text{SS(Model)}/\sigma^2 = Y'PY/\sigma^2$.

Since $P$ is idempotent, $\text{SS(Model)}/\sigma^2$ is distributed as a chi-square random variable with $r(P) = p'$ degrees of freedom and noncentrality parameter

$$\Omega = \beta'X'PX\beta/2\sigma^2 = \beta'X'X\beta/2\sigma^2.$$

Similarly:

(a) $\text{SS}(\mu)/\sigma^2 = Y'(J/n)Y/\sigma^2$ is distributed as a chi-square random variable with $r(J/n) = 1$ degree of freedom and noncentrality parameter

$$\Omega = \beta'X'(J/n)X\beta/2\sigma^2 = (\mathbf{1}'X\beta)^2/2n\sigma^2.$$

(b) $\text{SS(Regr)}/\sigma^2 = Y'(P - J/n)Y/\sigma^2$ is distributed as a chi-square random variable with $r(P - J/n) = p$ (see Exercise 4.15) degrees of freedom and noncentrality parameter

$$\Omega = [\beta'X'(P - J/n)X\beta]/2\sigma^2 = [\beta'X'(I - J/n)X\beta]/2\sigma^2.$$

(c) $\text{SS(Res)}/\sigma^2 = Y'(I - P)Y/\sigma^2$ is distributed as a chi-square random variable with $r(I - P) = (n - p')$ degrees of freedom and noncentrality parameter

$$\Omega = \beta'X'(I - P)X\beta/2\sigma^2 = 0.$$

That is, $\text{SS(Res)}/\sigma^2$ has a central chi-square distribution with $(n - p')$ degrees of freedom. (A central chi-square distribution has noncentrality parameter equal to zero.)

2. Since $(I - P)(P - J/n) = 0$ (see Exercise 4.15), $\text{SS(Res)} = Y'(I - P)Y$ and $\text{SS(Regr)} = Y'(P - J/n)Y$ are independent. Similarly, since $P(I - P) = 0$, $(J/n)(P - J/n) = 0$, and $(J/n)(I - P) = 0$, we have that SS(Model) and SS(Res) are independent, SS($\mu$) and SS(Regr) are independent, and SS($\mu$) and SS(Res) are independent, respectively.

3. Since $X'(I - P) = 0$, any linear function $K'\hat{\beta} = K'(X'X)^{-1}X'Y = BY$ is independent of $\text{SS(Res)} = Y'(I - P)Y$. This follows from noting that $B(I - P) = K'(X'X)^{-1}X'(I - P) = 0$. These idempotency and orthogonality conditions are verified numerically in the sketch below.
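This is a minimal check, reusing the ozone design matrix assumed in the earlier sketch.

```python
import numpy as np

x = np.array([0.02, 0.07, 0.11, 0.15])   # ozone doses (assumed, as before)
n = len(x)
X = np.column_stack([np.ones(n), x])

P = X @ np.linalg.inv(X.T @ X) @ X.T     # projection matrix
J_n = np.full((n, n), 1.0 / n)           # J/n
I = np.eye(n)

# Item 1: every defining matrix is idempotent.
for name, A in [("P", P), ("J/n", J_n), ("I-P", I - P), ("P-J/n", P - J_n)]:
    assert np.allclose(A @ A, A), name

# Item 2: the products that guarantee independence are all zero.
assert np.allclose((I - P) @ (P - J_n), 0)   # SS(Res) vs SS(Regr)
assert np.allclose(P @ (I - P), 0)           # SS(Model) vs SS(Res)
assert np.allclose(J_n @ (P - J_n), 0)       # SS(mu) vs SS(Regr)
assert np.allclose(J_n @ (I - P), 0)         # SS(mu) vs SS(Res)

# Item 3: linear functions of beta-hat are independent of SS(Res).
assert np.allclose(X.T @ (I - P), 0)
print("all conditions verified")
```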

Thus, the normality assumption on $\epsilon$ implies that the sums of squares, divided by $\sigma^2$, are chi-square random variables. The chi-square distribution and the orthogonality between the quadratic forms provide the basis for the usual tests of significance. For example, when the null hypothesis is true, the $t$-statistic is the ratio of a normal deviate to the square root of a scaled independent central chi-square random variable. The $F$-statistic is the ratio of a scaled noncentral chi-square random variable (a central chi-square random variable if the null hypothesis is true) to a scaled independent central chi-square random variable. The scaling in each case is division of the chi-square random variable by its degrees of freedom.
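The construction of the $F$-statistic can also be seen by simulation: with the slope zero, the ratio of mean squares should behave as a central $F$ with $(p, n - p')$ degrees of freedom. In the sketch below the intercept value and noise level are illustrative assumptions; the layout is the $n = 4$, $p' = 2$ ozone design.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

n, p_prime = 4, 2
p = p_prime - 1
x = np.array([0.02, 0.07, 0.11, 0.15])
X = np.column_stack([np.ones(n), x])
P = X @ np.linalg.inv(X.T @ X) @ X.T
J_n = np.full((n, n), 1.0 / n)

# Under H0 the true model is intercept-only: beta1 = 0.
Y = 100.0 + 3.0 * rng.standard_normal((100_000, n))

ss_regr = np.einsum('ij,jk,ik->i', Y, P - J_n, Y)
ss_res = np.einsum('ij,jk,ik->i', Y, np.eye(n) - P, Y)
F = (ss_regr / p) / (ss_res / (n - p_prime))

# The simulated upper 5% point should approximate F(.05; 1, 2) = 18.51.
print(np.quantile(F, 0.95), stats.f.ppf(0.95, p, n - p_prime))
```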

Noncentrality Parameter and F-Test. The noncentrality parameter $\Omega = \mu'A\mu/2\sigma^2$ is important for two reasons: the condition that makes the noncentrality parameter of the numerator of the $F$-ratio equal to zero is an explicit statement of the null hypothesis; and the power of the test to detect a false null hypothesis is determined by the magnitude of the noncentrality parameter. The noncentrality parameter of the chi-square distribution is the second term of the expectation of the quadratic form divided by 2 (see equation 4.25). $\text{SS(Res)}/\sigma^2$ is a central chi-square since the second term was zero (equation 4.29). The noncentrality parameter for $\text{SS(Regr)}/\sigma^2$ (see equation 4.28) is

$$\Omega = \frac{\beta'X'(I - J/n)X\beta}{2\sigma^2}, \qquad (4.34)$$

which is a quadratic form involving all $\beta_j$ except $\beta_0$. Thus, $\text{SS(Regr)}/\sigma^2$ is a central chi-square only if $\Omega = 0$, which requires $(I - J/n)X\beta = 0$. Since


$X$ is assumed to be of full rank, it can be shown that $\Omega = 0$ if and only if $\beta_1 = \beta_2 = \cdots = \beta_p = 0$. Therefore, the $F$-ratio using

$$F = \frac{\text{MS(Regr)}}{\text{MS(Res)}} \qquad (4.35)$$

is a test of the composite hypothesis that all $\beta_j$, except $\beta_0$, equal zero. This hypothesis is stated as

$$H_0: \beta^* = 0 \qquad H_a: \beta^* \neq 0,$$

where $\beta^*$ is the $p \times 1$ vector of regression coefficients excluding $\beta_0$. An observed $F$-ratio, equation 4.35, sufficiently greater than 1 suggests that the noncentrality parameter is not zero. The larger the noncentrality parameter for the numerator chi-square, the larger will be the $F$-ratio, on the average, and the greater will be the probability of detecting a false null hypothesis. This probability, by definition, is the power of the test. (The power of an $F$-test is also increased by increasing the degrees of freedom for each chi-square, particularly the denominator chi-square.) All of the quantities except $\beta^*$ in the noncentrality parameter are known before the experiment is run (in those cases where the $X$s are subject to the control of the researcher). Therefore, the relative powers of different experimental designs can be evaluated before the final design is adopted, as the sketch below illustrates.
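As an illustration, the power calculation takes only a few lines. The slope and error variance below are hypothetical planning values, not estimates from data, and SciPy's noncentral $F$ expects the noncentrality as $\lambda = 2\Omega$ under this text's convention.

```python
import numpy as np
from scipy import stats

x = np.array([0.02, 0.07, 0.11, 0.15])      # the design: ozone doses
n = len(x)
X = np.column_stack([np.ones(n), x])
J_n = np.full((n, n), 1.0 / n)

beta = np.array([250.0, -300.0])            # hypothetical beta0, beta1
sigma2 = 107.81                             # assumed true error variance
Xb = X @ beta

# Noncentrality parameter, equation 4.34.
omega = Xb @ (np.eye(n) - J_n) @ Xb / (2 * sigma2)

dfn, dfd = 1, n - 2
f_crit = stats.f.ppf(0.95, dfn, dfd)        # 18.51
power = stats.ncf.sf(f_crit, dfn, dfd, 2 * omega)
print(round(omega, 2), round(power, 3))     # low power: only 2 denominator df
```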

Example 4.6. In the Heagle ozone example (Example 4.2),

$$F = \frac{\text{MS(Regr)}}{\text{MS(Res)}} = \frac{799.14}{107.81} = 7.41.$$

The critical value for $\alpha = .05$ with 1 and 2 degrees of freedom is $F_{(.05; 1, 2)} = 18.51$. The conclusion is that these data do not provide sufficient evidence to reject the null hypothesis that $\beta_1$ equals zero. Even though MS(Regr) is considerably larger than MS(Res), the difference is not sufficient to be confident that it is not due to random sampling variation from the underlying chi-square distributions. The large critical value of $F$, 18.51, is a direct reflection of the very limited degrees of freedom for MS(Res) and, consequently, large sampling variation in the $F$-distribution. A later analysis that uses a more precise estimate of $\sigma^2$ (more degrees of freedom) but the same MS(Regr) shows that $\beta_1$ clearly is not zero.
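The arithmetic of this example is easily confirmed (a minimal check; the $p$-value line is additional information, about .11, not quoted in the text):

```python
from scipy import stats

F = 799.14 / 107.81                     # observed F-ratio, 7.41
f_crit = stats.f.ppf(0.95, 1, 2)        # F(.05; 1, 2) = 18.51
p_value = stats.f.sf(F, 1, 2)           # about .11; cannot reject H0
print(F, f_crit, p_value)
```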

The key points from this section are summarized as follows.

1. The expectations of the quadratic forms are model dependent. If the incorrect model has been used, the expectations are incorrect. This is particularly critical for the
