
TABLE 4.7. Partial sums of squares, the null hypothesis being tested by each, and the F-test of the null hypothesis for the oxygen uptake example.

    Partial Sum of Squares                    Null Hypothesis    F^a
    R(β1 | β0, β2, β3, β4) = 397.8664         β1 = 0             53.57
    R(β3 | β0, β1, β2, β4) =  25.0917         β3 = 0              3.38
    R(β2 | β0, β1, β3, β4) =    .0822         β2 = 0               .01
    R(β4 | β0, β1, β2, β3) =   9.5975         β4 = 0              1.29

    ^a All F-tests were computed using the residual mean square from the full model.

Each partial sum of squares provides the test of its corresponding null hypothesis and, hence, a basis for deciding whether the variable might be omitted. The null hypotheses in Table 4.7 reflect the adjustment of each partial regression coefficient for all other independent variables in the model.

The partial sum of squares for X2, R(β2 | β0, β1, β3, β4) = .0822, is much smaller than s² = 7.4276 and provides a clear indication that this variable does not make a significant contribution to a model that already contains X1, X3, and X4. The next logical step in building the model based on tests of the partial sums of squares would be to omit X2. Even though the tests for β3 and β4 are also nonsignificant, one must be cautious in omitting more than one variable at a time on the basis of the partial sums of squares. The partial sums of squares are dependent on which variables are in the model; it will almost always be the case that all partial sums of squares will change when a variable is dropped. (In this case, we know from the sequential sums of squares that all three variables can be dropped. A complete discussion on choice of variables is presented in Chapter 7.)

4.6 Univariate and Joint Confidence Regions

Confidence interval estimates of parameters convey more information to the reader than do simple point estimates. Univariate confidence intervals for several parameters, however, do not take into account correlations among the estimators of the parameters. Furthermore, the individual confidence coefficients do not reflect the overall degree of confidence in the joint statements. Joint confidence regions address these two points. Univariate confidence interval estimates are discussed briefly before proceeding to a discussion of joint confidence regions.

4.6.1 Univariate Confidence Intervals

If ε ∼ N(0, Iσ²), then β̂ and Ŷ have multivariate normal distributions (see equation 3.37). With normality, the classical (1 − α)100% confidence interval estimate of each βj is

β̂j ± t(α/2) s(β̂j),   j = 0, . . . , p,     (4.54)

where t(α/2) is the value of the Student's t-distribution, with ν degrees of freedom, that puts α/2 probability in the upper tail. [In the usual multiple regression problem, ν = (n − p′), where p′ = p + 1 is the number of estimated parameters.] The standard error of β̂j is s(β̂j) = (cjj s²)^{1/2}, where s² is estimated with ν degrees of freedom and cjj is the (j + 1)th diagonal element of (X′X)⁻¹.

Similarly, the (1−α)100% confidence interval estimate of the mean of Confidence Interval for E(Y0) Y for a particular choice of values for the independent variables, sayx0=

( 1 X01 · · · X0p), is

Y0 ± t(α/2)s(Y0), (4.55) whereY0=x0β; s(Y0) =

x0(XX)1x0s2, in general, ors(Y0) = viis2 ifx0corresponds to theith row ofX;viiis theith diagonal element inP; t(α/2) is as defined for equation 4.54.

A (1 − α)100% prediction interval of Y0 = x0′β + ε0, for a particular choice of values of the independent variables, say x0′ = ( 1  X01  · · ·  X0p ), is

Ŷ0 ± t(α/2) s(Y0 − Ŷ0),     (4.56)

where Ŷ0 = x0′β̂ and s(Y0 − Ŷ0) = {s²[1 + x0′(X′X)⁻¹x0]}^{1/2}.
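As a concrete illustration of equations 4.55 and 4.56, the following sketch computes the interval for E(Y0) and the prediction interval for Y0 at a new regressor row. It assumes numpy and scipy are available; the function name and arguments are illustrative, not taken from the text.

```python
import numpy as np
from scipy import stats


def mean_and_prediction_intervals(X, beta_hat, s2, x0, alpha=0.05):
    """Interval for E(Y0) (eq. 4.55) and prediction interval for Y0 (eq. 4.56).

    X        : n x p' design matrix (first column of ones)
    beta_hat : least squares estimates
    s2       : residual mean square with nu = n - p' degrees of freedom
    x0       : new regressor row, including the leading 1
    """
    n, p_prime = X.shape
    nu = n - p_prime
    t = stats.t.ppf(1 - alpha / 2, nu)          # t(alpha/2, nu)
    xtx_inv = np.linalg.inv(X.T @ X)
    y0_hat = x0 @ beta_hat
    v = x0 @ xtx_inv @ x0                       # x0'(X'X)^{-1} x0
    se_mean = np.sqrt(v * s2)                   # s(Y0_hat)
    se_pred = np.sqrt(s2 * (1 + v))             # s(Y0 - Y0_hat)
    return ((y0_hat - t * se_mean, y0_hat + t * se_mean),
            (y0_hat - t * se_pred, y0_hat + t * se_pred))
```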

Example 4.11. The univariate confidence intervals are illustrated with the oxygen uptake example (see Example 4.8); s² = 7.4276 was estimated with 26 degrees of freedom. The value of Student's t for α = .05 and 26 degrees of freedom is t(.025, 26) = 2.056. The point estimates of the parameters and the estimated variance–covariance matrix of β̂ were

β̂′ = ( 84.2690   −3.0698   .0080   −.1167   .0852 )

and

s²(β̂) = (X′X)⁻¹s²

       = [ 129.4119    1.185591    .053980   −.104321   −.579099
            1.185591    .175928   −.012602   −.007318    .007043
             .053980   −.012602    .005775   −.000694   −.000634
            −.104321   −.007318   −.000694    .004032   −.002646
            −.579099    .007043   −.000634   −.002646    .005616 ] .

The square root of the (j + 1)st diagonal element gives s(β̂j). If d is defined as the column vector of the s(β̂j), the univariate 95% confidence interval estimates can be computed as

CL(β̂) = [ β̂ − t(α/2)d    β̂ + t(α/2)d ]

       = [  60.880   107.658
            −3.932    −2.207
             −.148      .164
             −.247      .014
             −.069      .239 ] ,

where the two columns give the lower and upper limits, respectively, for the βj in the same order as listed in β̂.
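The limits in CL(β̂) can be reproduced directly from the reported estimates and variance–covariance matrix. The sketch below is not from the text; it simply types in the diagonal elements shown above and applies equation 4.54 with scipy's t quantile.

```python
import numpy as np
from scipy import stats

# Point estimates and diagonal of s^2(beta_hat) from Example 4.11
beta_hat = np.array([84.2690, -3.0698, 0.0080, -0.1167, 0.0852])
var_diag = np.array([129.4119, 0.175928, 0.005775, 0.004032, 0.005616])
s_beta = np.sqrt(var_diag)                      # standard errors s(beta_j)

t = stats.t.ppf(0.975, 26)                      # t(.025, 26) = 2.056
CL = np.column_stack([beta_hat - t * s_beta, beta_hat + t * s_beta])
print(np.round(CL, 3))                          # lower and upper 95% limits
```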

4.6.2 Simultaneous Confidence Statements

For the classical univariate confidence intervals, the confidence coefficient (1 − α) = .95 applies to each confidence statement. The level of confidence associated with the statement that all five intervals simultaneously contain their respective parameters is much lower. If the five intervals were statistically independent, which they are not, the overall or joint confidence coefficient would be only (1 − α)⁵ = .77.

There are two procedures that keep the joint confidence coefficient for several simultaneous statements near a prechosen level (1 − α). The oldest and simplest procedure, commonly called the Bonferroni method, constructs the individual confidence intervals as given in equations 4.54 and 4.55, but uses α* = α/k, where k is the number of simultaneous intervals or statements. That is, in equation 4.54, t(α/2) is replaced with t(α/2k, ν). This procedure ensures that the true joint confidence coefficient for the k simultaneous statements is at least (1 − α).

The Bonferroni simultaneous confidence intervals for the p′ parameters in β are given by

β̂j ± t(α/2p′) s(β̂j).     (4.57)

This method is particularly suitable for obtaining simultaneous confidence intervals for k prespecified (prior to analyzing the data) parameters or linear combinations of parameters. When k is small, generally speaking, the Bonferroni simultaneous confidence intervals are not very wide. However, if k is large, the Bonferroni intervals tend to be wide (conservative) and the simultaneous coverage may be much larger than the specified confidence level (1 − α). For example, if we are interested in obtaining simultaneous confidence intervals for all pairwise differences of p parameters (e.g., treatment means), then k is p(p − 1)/2, which is large even for moderate values of p. The Bonferroni method is not suitable for obtaining simultaneous confidence intervals for all linear combinations. In this case, k is infinity


and the Bonferroni intervals would be the entire space. For example, in a simple linear regression, if we wish to compute a confidence band on the entire regression line, then the Bonferroni simultaneous band would be the entire space.

The second procedure applies the general approach developed by Scheffé (1953). Scheffé's method provides simultaneous confidence statements for all linear combinations of a set of parameters in a d-dimensional subspace of the p′-dimensional parameter space. The Scheffé joint confidence intervals for the p′ parameters in β and the means of Y, E(Yi), are obtained from equations 4.54 and 4.55 by replacing t(α/2) with [p′F(α, p′, ν)]^{1/2}. (If only a subset of d linearly independent parameters βj is of interest, t(α/2) is replaced with [dF(α, d, ν)]^{1/2}.) That is,

β̂j ± [p′F(α, p′, ν)]^{1/2} s(β̂j)     (4.58)
Ŷ0 ± [p′F(α, p′, ν)]^{1/2} s(Ŷ0).     (4.59)

This method provides simultaneous statements for all linear combinations of the set of parameters. As with the Bonferroni intervals, the joint confidence coefficient for the Scheffé intervals is at least (1 − α). That is, the confidence coefficient of (1 − α) applies to all confidence statements on the βj, the E(Yi), plus all other linear functions of the βj of interest. Thus, equation 4.59 can be used to establish a confidence band on the entire regression surface by computing Scheffé confidence intervals for E(Y0) for all values of the independent variables in the region of interest. The confidence band for the simple linear regression case was originally developed by Working and Hotelling (1929) and frequently carries their names.
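A sketch of how such a Scheffé (Working–Hotelling) band on the regression surface could be computed over a grid of regressor rows follows; the function and argument names are illustrative, assuming numpy and scipy.

```python
import numpy as np
from scipy import stats


def scheffe_band(X, beta_hat, s2, x_grid, alpha=0.05):
    """100(1-alpha)% simultaneous band for E(Y) at every row of x_grid (eq. 4.59).

    Each row of x_grid must include the leading 1 for the intercept.
    """
    n, p_prime = X.shape
    nu = n - p_prime
    w = np.sqrt(p_prime * stats.f.ppf(1 - alpha, p_prime, nu))  # replaces t(alpha/2)
    xtx_inv = np.linalg.inv(X.T @ X)
    y_hat = x_grid @ beta_hat
    # row-wise x0'(X'X)^{-1} x0 for every grid point
    v = np.einsum("ij,jk,ik->i", x_grid, xtx_inv, x_grid)
    half_width = w * np.sqrt(v * s2)
    return y_hat - half_width, y_hat + half_width
```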

The reader is referred to Miller (1981) for more complete presentations of the Bonferroni and Scheffé methods. Since the Scheffé method provides simultaneous confidence statements on all linear functions of a set of parameters, the Scheffé intervals will tend to be longer than Bonferroni intervals, particularly when a small number of simultaneous statements is involved (Miller, 1981). One would choose the method that gave the shorter intervals for the particular application.

Example 4.12. The oxygen uptake model of Example 4.8 has p′ = 5 parameters and ν = 26 degrees of freedom for s². In order to attain an overall confidence coefficient no smaller than (1 − α) = .95 with the Bonferroni method, α* = .05/5 = .01 would be used, for which t(.01/2, 26) = 2.779. Using this value of t in equation 4.54 gives the Bonferroni simultaneous confidence intervals with an overall confidence coefficient at least as large as (1 − α) = .95:

CLB(β̂) = [  52.655   115.883
             −4.235    −1.904
              −.203      .219
              −.293      .060
              −.123      .293 ] .

The Scheffé simultaneous intervals for the p′ = 5 parameters in β are obtained by using [p′F(.05, 5, 26)]^{1/2} = [5(2.59)]^{1/2} = 3.599 in place of t(α/2) in equation 4.54. The results are

CLS(β̂) = [  43.331   125.207
             −4.579    −1.560
              −.265      .281
              −.345      .112
              −.184      .355 ] .

The Bonferroni and Scheffé simultaneous confidence intervals will always be wider than the classical univariate confidence intervals in which the confidence coefficient applies to each interval. In this example, the Scheffé intervals are wider than the Bonferroni intervals.
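The three multipliers used in Examples 4.11 and 4.12 can be verified with a few lines of code; this is a sketch assuming scipy, with the values for the oxygen uptake example (p′ = 5, ν = 26).

```python
from scipy import stats

p_prime, nu, alpha = 5, 26, 0.05

t_univ = stats.t.ppf(1 - alpha / 2, nu)                           # 2.056, eq. 4.54
t_bonf = stats.t.ppf(1 - alpha / (2 * p_prime), nu)               # 2.779, eq. 4.57
scheffe = (p_prime * stats.f.ppf(1 - alpha, p_prime, nu)) ** 0.5  # 3.599, eq. 4.58

print(round(t_univ, 3), round(t_bonf, 3), round(scheffe, 3))
```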

The 100(1 − α)% simultaneous confidence intervals for β obtained using either the Bonferroni or the Scheffé method provide confidence intervals for each individual parameter βj in such a way that the p′-dimensional region formed by the intersection of the p′ simultaneous confidence intervals gives at least a 100(1 − α)% joint confidence region for all parameters. The shape of this joint confidence region is rectangular or cubic. Scheffé also derives an ellipsoidal 100(1 − α)% joint confidence region for all parameters that is contained in the boxed region obtained by the Scheffé simultaneous confidence intervals. This distinction is illustrated after joint confidence regions are defined in the next section.

4.6.3 Joint Confidence Regions

A joint confidence region for all p′ parameters in β is obtained from the inequality

(β̂ − β)′(X′X)(β̂ − β) ≤ p′s²F(α, p′, ν),     (4.60)

where F(α, p′, ν) is the value of the F-distribution with p′ and ν degrees of freedom that leaves probability α in the upper tail; ν is the degrees of freedom associated with s². The left-hand side of this inequality is a quadratic form in β, because β̂ and X′X are known quantities computed from the data. The right-hand side is also known from the data. Solving


this quadratic form for the boundary of the inequality establishes a p′-dimensional ellipsoid, which is the 100(1 − α)% joint confidence region for all the parameters in the model. The slope of the axes and eccentricity of the ellipsoid show the direction and strength, respectively, of the correlations between the estimates of the parameters.

An ellipsoidal confidence region with more than two or three dimensions is difficult to interpret. Specific choices of β can be checked, with a computer program, to determine whether they fall inside or outside the confidence region. The multidimensional region, however, must be viewed two or at most three dimensions at a time. One approach to visualizing the joint confidence region is to evaluate the p′-dimensional joint confidence region for specific values of all but two of the parameters. Each set of specified values produces an ellipse that is a two-dimensional "slice" of the multidimensional region. To develop a picture of the entire region, two-dimensional "slices" can be plotted for several choices of values for the other parameters.

An alternative to using the p′-dimensional joint confidence region for all parameters is to construct joint confidence regions for two parameters at a time, ignoring the other (p′ − 2) parameters. The quadratic form for the joint confidence region for a subset of two parameters is obtained from that for all parameters, equation 4.60, by

1. replacing (β̂ − β) with the corresponding vectors involving only the two parameters of interest;

2. replacing (X′X) with the inverse of the 2 × 2 variance–covariance matrix for the two parameters; and

3. replacing p′s²F(α, p′, ν) with 2F(α, 2, ν). Notice that s² is not in the second quantity since it has been included in the variance–covariance matrix in step 2.

Thus, if βj and βk are the two distinct parameters of interest, the joint confidence region is given by

( β̂j − βj   β̂k − βk ) [s²(β̂jk)]⁻¹ ( β̂j − βj   β̂k − βk )′ ≤ 2F(α, 2, ν),     (4.61)

where s²(β̂jk) denotes the 2 × 2 estimated variance–covariance matrix of β̂j and β̂k. The confidence coefficient (1 − α) applies to the joint statement on the two parameters being considered at the time. This procedure takes into account the joint distribution of β̂j and β̂k but ignores the values of the other parameters. Since this bivariate joint confidence region ignores the joint distribution of β̂j and β̂k with the other (p′ − 2) parameter estimates, it suffers from the same conceptual problem as the univariate confidence intervals.
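Because the full p′-dimensional region is hard to visualize, a natural use of equations 4.60 and 4.61 is to check whether a particular candidate vector falls inside the region. The following sketch does that; the function names are illustrative, assuming numpy and scipy.

```python
import numpy as np
from scipy import stats


def in_joint_region(beta_cand, beta_hat, XtX, s2, nu, alpha=0.05):
    """True if beta_cand lies inside the joint confidence region of eq. 4.60."""
    p_prime = len(beta_hat)
    d = beta_hat - beta_cand
    q = d @ XtX @ d                              # (b_hat - b)' X'X (b_hat - b)
    return q <= p_prime * s2 * stats.f.ppf(1 - alpha, p_prime, nu)


def in_bivariate_region(pair_cand, pair_hat, cov_pair, nu, alpha=0.05):
    """True if (beta_j, beta_k) lies inside the bivariate region of eq. 4.61.

    cov_pair is the 2 x 2 block of s^2(beta_hat) for the two parameters.
    """
    d = np.asarray(pair_hat) - np.asarray(pair_cand)
    q = d @ np.linalg.inv(cov_pair) @ d
    return q <= 2 * stats.f.ppf(1 - alpha, 2, nu)
```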


Example 4.13. The oxygen uptake data, given in Example 4.8, are used to illustrate joint confidence regions, but the model is simplified to include only an intercept and two independent variables, time to run 1.5 miles (X1) and heart rate while running (X3). The estimate of β, X′X, and the variance–covariance matrix of β̂ for this reduced model are

β̂′ = ( 93.0888   −3.14019   −.073510 ),

X′X = [      31       328.17       5,259
         328.17     3531.797    55,806.29
          5,259    55,806.29      895,317 ]

and

s²(β̂) = [  68.04308   −.47166   −.37028
            −.47166     .13933   −.00591
            −.37028    −.00591    .00255  ] .

The residual mean square from this model is s² = 7.25426 with 28 degrees of freedom.

The joint confidence region for all three parameters is obtained from equation 4.60 and is a three-dimensional ellipsoid. The right-hand side of equation 4.60 is

p′s²F(α, 3, 28) = 3(7.25426)(2.95)

if α = .05. This choice of α gives a confidence coefficient of .95 that applies to the joint statement involving all three parameters. The three-dimensional ellipsoid is portrayed in Figure 4.1 with three two-dimensional "slices" (solid lines) from the ellipsoid at β0 = 76.59, 93.09, and 109.59. These choices of β0 correspond to β̂0 and β̂0 ± 2s(β̂0). The "slices" indicate that the ellipsoid is extremely thin in one plane but only slightly elliptical in the other, much like a slightly oval pancake. This reflects the high correlation between β̂0 and β̂3 of −.89 and the more moderate correlations of −.15 and −.31 between β̂0 and β̂1 and between β̂1 and β̂3, respectively.

The bivariate joint confidence region for β1 and β3, ignoring β0, obtained from equation 4.61, is shown in Figure 4.1 as the ellipse drawn with the dashed line. The variance–covariance matrix to be inverted in equation 4.61 is the lower-right 2 × 2 matrix in s²(β̂). The right-hand side of the inequality is 2F(.05, 2, 28) = 2(3.34) if α = .05. The confidence coefficient of .95 applies to the joint statement involving only β1 and β3. The negative slope of this ellipse reflects the moderate negative correlation between β̂1 and β̂3. For reference, the Bonferroni confidence intervals for β1 and β3, ignoring β0, using a joint confidence coefficient of .95 are shown by the corners of the rectangle enclosing the intersection region.
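The dashed ellipse in Figure 4.1 can be traced numerically from the 2 × 2 block of s²(β̂) given above. This is a sketch, not the authors' code; the values are typed from the example, and the boundary points could be passed to any plotting routine.

```python
import numpy as np
from scipy import stats

# Lower-right 2 x 2 block of s^2(beta_hat) for (beta_1, beta_3), reduced model
cov = np.array([[0.13933, -0.00591],
                [-0.00591, 0.00255]])
center = np.array([-3.14019, -0.073510])        # (beta_1_hat, beta_3_hat)
c = 2 * stats.f.ppf(0.95, 2, 28)                # right-hand side of eq. 4.61

# Boundary points b satisfy (b - center)' cov^{-1} (b - center) = c.
# With cov = L L' (Cholesky), b = center + sqrt(c) * L u for u on the unit circle.
L = np.linalg.cholesky(cov)
theta = np.linspace(0, 2 * np.pi, 200)
unit_circle = np.vstack([np.cos(theta), np.sin(theta)])
ellipse = center[:, None] + np.sqrt(c) * (L @ unit_circle)   # 2 x 200 boundary points
```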

FIGURE 4.1. Two-dimensional "slices" of the joint confidence region for the regression of oxygen uptake on time to run 1.5 miles (X1) and heart rate while running (X3) (solid ellipses), and the two-dimensional joint confidence region for β1 and β3 ignoring β0 (dashed ellipse). The intersection of the Bonferroni univariate confidence intervals is shown as the corners of the rectangle formed by the intersection.

The implications as to what are "acceptable" combinations of values for the parameters are very different for the two joint confidence regions. The joint confidence region for all parameters is much more restrictive than the bivariate joint confidence region or the univariate confidence intervals would indicate. Allowable combinations of β1 and β3 are very dependent on the choice of β0. Clearly, univariate confidence intervals and joint confidence regions that do not involve all parameters can be misleading.

The idea of obtaining joint confidence regions in equation 4.60 can also be extended to obtain joint prediction regions. Let X0 : k × p′ be a set of k linearly independent vectors of explanatory variables at which we wish to predict Y0. That is, we wish to simultaneously predict

Y0 = X0β + ε0,     (4.62)

where ε0 is N(0, σ²Ik) and is assumed to be independent of Y. The best linear unbiased predictor of Y0 is

Ŷ0 = X0β̂,     (4.63)

where β̂ = (X′X)⁻¹X′Y. Note that the prediction error vector

Y0 − Ŷ0 = X0(β − β̂) + ε0
        ∼ N(0, σ²[Ik + X0(X′X)⁻¹X0′]).     (4.64)
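For k simultaneous predictions, the covariance matrix in equation 4.64 (with σ² replaced by s²) is easy to form directly; the short sketch below assumes numpy, and the function name is illustrative.

```python
import numpy as np


def prediction_error_cov(X, X0, s2):
    """Estimated covariance of Y0 - Y0_hat for k new rows X0 (eq. 4.64, sigma^2 -> s2)."""
    xtx_inv = np.linalg.inv(X.T @ X)
    k = X0.shape[0]
    return s2 * (np.eye(k) + X0 @ xtx_inv @ X0.T)
```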
