

3.10 Inference for random effects

This section gives a brief discussion of Empirical Bayes (EB) inference and outlines how to carry out best linear unbiased prediction (BLUP).

3.10.1 Empirical Bayes Inference

The purpose of the random effects $b_i$ in the model is to reflect how the evolution of the $i$th subject deviates from the expected evolution $X_i\beta$. Estimation of $b_i$ is therefore helpful for the detection of outlying profiles. This strategy is, however, only meaningful under the hierarchical model interpretation. Recall that the hierarchical specification of the model is given by

\[
Y_i \mid b_i \sim N(X_i\beta + Z_i b_i, \Sigma_i), \qquad b_i \sim N(0, G).
\]

Since the $b_i$ are random, it is natural to use Bayesian methods. Under this approach, the prior distribution for $b_i$ is taken to be $N(0, G)$. The posterior density $f(b_i \mid y_i)$ is then given by

\[
\begin{aligned}
f(b_i \mid y_i) &\equiv f(b_i \mid Y_i = y_i) \\
&= \frac{f(y_i \mid b_i)\, f(b_i)}{\int f(y_i \mid b_i)\, f(b_i)\, db_i} \\
&\propto f(y_i \mid b_i)\, f(b_i) \\
&\propto \ldots \\
&\propto \exp\left\{ -\tfrac{1}{2} \bigl(b_i - G Z_i' W_i (y_i - X_i\beta)\bigr)' \Lambda_i^{-1} \bigl(b_i - G Z_i' W_i (y_i - X_i\beta)\bigr) \right\}
\end{aligned}
\]

for some positive definite matrix $\Lambda_i$, where $W_i = W_i(\alpha) = V_i^{-1}$. It follows that the posterior distribution of $b_i$ is given by

\[
b_i \mid y_i \sim N\bigl(G Z_i' W_i (y_i - X_i\beta),\, \Lambda_i\bigr).
\]

Thus a logical estimate of $b_i$ is its posterior mean,

\[
\hat{b}_i(\theta) = E[b_i \mid Y_i = y_i] = \int b_i\, f(b_i \mid y_i)\, db_i = G Z_i' W_i(\alpha)\, (y_i - X_i\beta), \qquad (3.17)
\]

which depends on the vector $\theta$ of parameters in the marginal model; hence the notation $\hat{b}_i(\theta)$. It is clear from the above that $\hat{b}_i(\theta)$ is normally distributed with covariance matrix

\[
\mathrm{Var}\bigl(\hat{b}_i(\theta)\bigr) = G Z_i' \left\{ W_i - W_i X_i \left( \sum_{i=1}^{N} X_i' W_i X_i \right)^{-1} X_i' W_i \right\} Z_i G.
\]

Note that this expression ignores the variability of $b_i$ itself, so inference about $b_i$ should account for that variability as well. For this reason, inference for $b_i$ is based on

\[
\mathrm{Var}\bigl(\hat{b}_i(\theta) - b_i\bigr) = G - \mathrm{Var}\bigl(\hat{b}_i(\theta)\bigr).
\]

Hence, just as for the fixed-effects inference discussed in Section 3.7, Wald tests can be derived to test hypotheses about $b_i$. The parameters in $\theta$ are replaced by their ML or REML estimates, obtained from fitting the marginal model, and the resulting estimate $\hat{b}_i = \hat{b}_i(\hat{\theta})$ is called the 'Empirical Bayes' estimate of $b_i$. As in the fixed-effects case, approximate $t$-tests and $F$-tests can be derived to account for the variability introduced by replacing $\theta$ by $\hat{\theta}$.
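As a numerical illustration of these formulas, the following is a minimal sketch in Python/NumPy (not part of the original SAS-based material; the dimensions, design and parameter values are purely hypothetical, and $\theta$ is treated as known) that computes the EB estimates $\hat{b}_i = G Z_i' W_i (y_i - X_i\beta)$ together with $\mathrm{Var}(\hat{b}_i)$ and $\mathrm{Var}(\hat{b}_i - b_i)$ for simulated data sharing a common design.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: N subjects, n_i = 5 measurements each,
# random intercept and slope, fixed intercept and slope.
N, n_i = 50, 5
t = np.arange(n_i, dtype=float)
X_i = np.column_stack([np.ones(n_i), t])   # fixed-effects design (intercept, time)
Z_i = X_i.copy()                           # random intercept and slope
beta = np.array([10.0, 2.0])               # assumed fixed effects
G = np.array([[4.0, 0.5], [0.5, 1.0]])     # random-effects covariance
sigma2 = 2.0                               # residual variance
Sigma_i = sigma2 * np.eye(n_i)

V_i = Z_i @ G @ Z_i.T + Sigma_i            # marginal covariance
W_i = np.linalg.inv(V_i)                   # W_i = V_i^{-1}

# Simulate data and compute b_hat_i = G Z_i' W_i (y_i - X_i beta) for every subject
b = rng.multivariate_normal(np.zeros(2), G, size=N)
Y = beta @ X_i.T + b @ Z_i.T + rng.normal(0.0, np.sqrt(sigma2), size=(N, n_i))
b_hat = (Y - beta @ X_i.T) @ (G @ Z_i.T @ W_i).T     # row i holds b_hat_i'

# Var(b_hat_i) = G Z_i' {W_i - W_i X_i (sum_i X_i' W_i X_i)^{-1} X_i' W_i} Z_i G
A = N * (X_i.T @ W_i @ X_i)                # all subjects share the same design here
M = W_i - W_i @ X_i @ np.linalg.inv(A) @ X_i.T @ W_i
var_b_hat = G @ Z_i.T @ M @ Z_i @ G
print(var_b_hat)
print(G - var_b_hat)                       # Var(b_hat_i - b_i), the basis for inference
```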

3.10.2 Best Linear Unbiased Prediction

Often the parameters of interest are linear combinations of the fixed effects in $\beta$ and the random effects in $b_i$. For example, a subject-specific slope is the sum of the average slope for subjects with the same covariate values and the subject-specific random slope for that subject. Thus, in general, suppose that

\[
u = \lambda_\beta' \beta + \lambda_b' b_i
\]

is of interest. Conditionally on $\alpha$,

\[
\hat{u} = \lambda_\beta' \hat{\beta} + \lambda_b' \hat{b}_i
\]

is the best linear unbiased predictor (BLUP) of $u$. In fact, from the theory of linear models, $\hat{u}$ is linear in the observations $Y_i$, unbiased for $u$, and has minimum variance among all linear unbiased predictors.
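For instance, in a hypothetical model with fixed intercept and slope, $\beta = (\beta_0, \beta_1)'$, and a random intercept and slope, $b_i = (b_{i0}, b_{i1})'$, the choice $\lambda_\beta = \lambda_b = (0, 1)'$ gives $u = \beta_1 + b_{i1}$, the subject-specific slope, with BLUP $\hat{u} = \hat{\beta}_1 + \hat{b}_{i1}$.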

3.10.3 Shrinkage estimators

Consider the prediction of the evolution of the $i$th subject, that is,

\[
\hat{Y}_i \equiv X_i\hat{\beta} + Z_i\hat{b}_i = X_i\hat{\beta} + Z_i G Z_i' V_i^{-1} (y_i - X_i\hat{\beta}),
\]

because

\[
\hat{b}_i = G Z_i' V_i^{-1} (y_i - X_i\hat{\beta}).
\]

Now, since

\[
V_i = Z_i G Z_i' + \Sigma_i,
\]

it follows that $V_i - \Sigma_i = Z_i G Z_i'$, so that making this substitution gives

\[
\begin{aligned}
\hat{Y}_i &= X_i\hat{\beta} + (V_i - \Sigma_i) V_i^{-1} (y_i - X_i\hat{\beta}) \\
&= X_i\hat{\beta} - (V_i - \Sigma_i) V_i^{-1} X_i\hat{\beta} + (V_i - \Sigma_i) V_i^{-1} y_i \\
&= X_i\hat{\beta} - X_i\hat{\beta} + \Sigma_i V_i^{-1} X_i\hat{\beta} + (I_{n_i} - \Sigma_i V_i^{-1})\, y_i \\
&= \Sigma_i V_i^{-1} X_i\hat{\beta} + (I_{n_i} - \Sigma_i V_i^{-1})\, y_i. \qquad (3.18)
\end{aligned}
\]

Hence, $\hat{Y}_i$ is a weighted mean of the population-averaged profile $X_i\hat{\beta}$ and the observed data $y_i$, with weights $\hat{\Sigma}_i\hat{V}_i^{-1}$ and $I_{n_i} - \hat{\Sigma}_i\hat{V}_i^{-1}$, respectively.

Note that $X_i\hat{\beta}$ gets much higher weight when the residual variability is large in comparison to the total variability contained in $V_i$. This phenomenon is called 'shrinkage': the observed data are shrunk towards the prior average profile $X_i\beta$. This is also reflected in the fact that, for any linear combination $\lambda' b_i$ of the random effects,

\[
\mathrm{Var}(\lambda'\hat{b}_i) \le \mathrm{Var}(\lambda' b_i).
\]
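The equivalence of the two expressions for $\hat{Y}_i$ is easy to check numerically. The following minimal sketch (Python/NumPy, with hypothetical dimensions and parameter values) verifies that the direct prediction $X_i\hat{\beta} + Z_i\hat{b}_i$ coincides with the weighted-mean form (3.18).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical single-subject setup: n_i = 4 measurements, random intercept and slope.
n_i = 4
t = np.arange(n_i, dtype=float)
X_i = np.column_stack([np.ones(n_i), t])
Z_i = X_i.copy()
beta_hat = np.array([10.0, 2.0])           # plug-in fixed-effects estimate
G = np.array([[4.0, 0.5], [0.5, 1.0]])
Sigma_i = 2.0 * np.eye(n_i)
V_i = Z_i @ G @ Z_i.T + Sigma_i
y_i = rng.normal(X_i @ beta_hat, 3.0)      # some observed profile

V_inv = np.linalg.inv(V_i)
b_hat_i = G @ Z_i.T @ V_inv @ (y_i - X_i @ beta_hat)

# Direct prediction versus the weighted-mean form of Eq. (3.18)
Y_hat_direct = X_i @ beta_hat + Z_i @ b_hat_i
W1 = Sigma_i @ V_inv                       # weight on the population profile X_i beta_hat
Y_hat_shrunk = W1 @ (X_i @ beta_hat) + (np.eye(n_i) - W1) @ y_i
print(np.allclose(Y_hat_direct, Y_hat_shrunk))   # True: the two forms agree
```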

3.10.4 The random-intercepts model revisited

Consider the random-intercepts model with $Z_i = 1_{n_i}$, a vector of ones, and a single variance component $\sigma_b^2$, so that $G = \sigma_b^2$. Also assume the absence of serial correlation, such that

\[
\Sigma_i = \sigma^2 I_{n_i},
\]

a diagonal $n_i \times n_i$ matrix,

so that, from Eq. (3.17), the Empirical Bayes estimate of the random intercept $b_i$ equals

\[
\begin{aligned}
\hat{b}_i &= \sigma_b^2\, 1_{n_i}' \bigl(\sigma_b^2\, 1_{n_i} 1_{n_i}' + \sigma^2 I_{n_i}\bigr)^{-1} (y_i - X_i\beta) \\
&= \frac{\sigma_b^2}{\sigma^2}\, 1_{n_i}' \left( I_{n_i} - \frac{\sigma_b^2}{\sigma^2 + n_i\sigma_b^2}\, 1_{n_i} 1_{n_i}' \right) (y_i - X_i\beta) \\
&= \frac{n_i\sigma_b^2}{\sigma^2 + n_i\sigma_b^2}\, \frac{1}{n_i} \sum_{j=1}^{n_i} \bigl(y_{ij} - X_{i[j]}\beta\bigr),
\end{aligned}
\]

where $X_{i[j]}$ denotes the $j$th row of $X_i$.

Note that $\hat{b}_i$ is a weighted average of $0$, the prior mean, and the average residual for subject $i$. The larger $n_i$ and the smaller $\sigma^2$ relative to $\sigma_b^2$, the less $\hat{b}_i$ is shrunk towards the prior mean, and vice versa.
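The closed-form expression can be checked against the matrix formula numerically. The sketch below (Python/NumPy; the values of $n_i$, $\sigma_b^2$, $\sigma^2$ and the residuals are hypothetical) compares $\sigma_b^2\, 1_{n_i}' V_i^{-1} (y_i - X_i\beta)$ with the shrunken average residual derived above.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical random-intercepts setup: n_i = 6, sigma_b^2 = 4, sigma^2 = 1.
n_i, sigma_b2, sigma2 = 6, 4.0, 1.0
ones = np.ones(n_i)
resid = rng.normal(0.0, 2.0, n_i)          # plays the role of y_i - X_i beta for one subject

V_i = sigma_b2 * np.outer(ones, ones) + sigma2 * np.eye(n_i)
b_hat_matrix = sigma_b2 * ones @ np.linalg.inv(V_i) @ resid   # G Z_i' V_i^{-1} (y_i - X_i beta)

shrink = n_i * sigma_b2 / (sigma2 + n_i * sigma_b2)           # shrinkage weight
b_hat_closed = shrink * resid.mean()                          # closed-form version
print(b_hat_matrix, b_hat_closed, np.isclose(b_hat_matrix, b_hat_closed))
```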

3.10.5 The normality assumption for Random Effects

In practice, histograms of Empirical Bayes (EB) estimates are often used to check the normality assumption for the random effects. However, since

\[
\hat{b}_i = G Z_i' W_i (y_i - X_i\beta)
\qquad \text{and} \qquad
\mathrm{Var}(\hat{b}_i) = G Z_i' \left\{ W_i - W_i X_i \left( \sum_{i=1}^{N} X_i' W_i X_i \right)^{-1} X_i' W_i \right\} Z_i G,
\]

the EB estimates do not all have the same distribution, since $\mathrm{Var}(\hat{b}_i)$ depends on the covariates $X_i$; one should therefore at least first standardize the EB estimates. Further, due to the shrinkage property, the EB estimates do not fully reflect the heterogeneity in the data. Therefore, EB estimates obtained under the normality assumption cannot be used to check that assumption. This suggests that the only possibility to check the normality assumption is to fit a more general model, with the classical linear mixed model as a special case, and to compare both models using likelihood ratio methods.

3.10.6 The heterogeneity model

One possible extension of the linear mixed model is to assume a finite mixture as the random-effects distribution, namely

\[
b_i \sim \sum_{j=1}^{g} p_j\, N(\mu_j, G), \qquad \text{with} \quad \sum_{j=1}^{g} p_j = 1 \quad \text{and} \quad \sum_{j=1}^{g} p_j \mu_j = 0.
\]

The interpretation of this assumption is as follows: the population consists of $g$ sub-populations, each containing a fraction $p_j$ of the total population, and within each sub-population a linear mixed model holds. This yields a very flexible parametric class of random-effects distributions, with the classical model corresponding to the special case $g = 1$. The model is fitted using an EM algorithm, for which a SAS macro is available, and EB estimates can also be calculated under the heterogeneity model.
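To make the mixture assumption concrete, the following small sketch (Python/NumPy; the number of components, mixing proportions and component means are purely hypothetical) draws scalar random intercepts from a two-component heterogeneity model that satisfies the constraint $\sum_j p_j \mu_j = 0$.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical two-component heterogeneity model for a scalar random intercept:
# b_i ~ p1 N(mu1, G) + p2 N(mu2, G), with p1*mu1 + p2*mu2 = 0.
p = np.array([0.3, 0.7])
mu = np.array([7.0, -3.0])                 # 0.3 * 7 + 0.7 * (-3) = 0
G = 1.0                                    # common within-component variance

N = 1000
comp = rng.choice(2, size=N, p=p)          # sub-population membership
b = rng.normal(mu[comp], np.sqrt(G))       # mixture draws of b_i

print(b.mean())                            # close to 0, as the constraint requires
```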

3.10.7 Power analyses under the linear mixed model

In any statistical test, no matter how simple or complex, the statistician is interested in the power of the test. In this section the $F$-test for fixed effects is considered. Thus, consider the general linear hypothesis

\[
H_0: L\beta = 0 \qquad \text{versus} \qquad H_A: L\beta \neq 0.
\]

Recall that the $F$-test statistic is given by

\[
F_T = \frac{\hat{\beta}' L' \left[ L \left( \sum_{i=1}^{N} X_i' V_i^{-1}(\hat{\alpha}) X_i \right)^{-1} L' \right]^{-1} L \hat{\beta}}{\mathrm{rank}(L)}.
\]

The approximate null distribution of $F_T$ is $F$ with numerator degrees of freedom equal to $\mathrm{rank}(L)$. The denominator degrees of freedom need to be estimated from the data. This can be done using three possible methods, namely the:

1. Containment method

2. Satterthwaite approximation

3. Kenward and Roger approximation

In general, that is, not necessarily under $H_0$, $F_T$ is approximately $F$-distributed with the same numbers of degrees of freedom but with non-centrality parameter

\[
\phi = \beta' L' \left[ L \left( \sum_{i=1}^{N} X_i' V_i^{-1}(\hat{\alpha}) X_i \right)^{-1} L' \right]^{-1} L \beta,
\]

which equals $0$ under $H_0$. This can be used to calculate power under a variety of models and a variety of alternative hypotheses. Note that $\phi$ is equal to $\mathrm{rank}(L) \times F_T$ with $\hat{\beta}$ replaced by $\beta$. The SAS procedure MIXED can therefore be used for the calculation of $\phi$ and the related numbers of degrees of freedom.
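As an illustration of how $\phi$ can be evaluated outside SAS, the sketch below (Python/NumPy; the design, covariance structure, parameter values and contrast matrix $L$ are all hypothetical) computes the non-centrality parameter for a test of a group-by-time interaction.

```python
import numpy as np

# Hypothetical design: N subjects on a common time grid, half in each group,
# compound-symmetric V_i (random intercept + measurement error), and
# H0: no group-by-time interaction, i.e. L beta = 0 with L selecting the third coefficient.
N, n_i = 40, 4
t = np.arange(n_i, dtype=float)
beta = np.array([10.0, 1.0, 0.5])          # assumed values under the alternative

XtVX = np.zeros((3, 3))
for i in range(N):
    group = 1.0 if i < N // 2 else 0.0
    X_i = np.column_stack([np.ones(n_i), t, group * t])
    V_i = 4.0 * np.ones((n_i, n_i)) + 1.0 * np.eye(n_i)
    XtVX += X_i.T @ np.linalg.inv(V_i) @ X_i

L = np.array([[0.0, 0.0, 1.0]])
Lb = L @ beta
phi = float(Lb @ np.linalg.inv(L @ np.linalg.inv(XtVX) @ L.T) @ Lb)
print(phi)                                  # non-centrality parameter
```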

Calculation in SAS

The following is an outline of the steps involved in the calculation of the power of the test.

1. Construct a data set of the same dimension and with the same co- variates and factor values as the design for which the power is to be calculated.

2. Use as responses $y_i$ the average values $X_i\beta$ under the alternative model.

3. The fixed-effects estimate will then be equal to

\[
\hat{\beta}(\alpha) = \left( \sum_{i=1}^{N} X_i' W_i(\alpha) X_i \right)^{-1} \sum_{i=1}^{N} X_i' W_i(\alpha)\, y_i
= \left( \sum_{i=1}^{N} X_i' W_i(\alpha) X_i \right)^{-1} \sum_{i=1}^{N} X_i' W_i(\alpha)\, X_i\beta
= \beta.
\]

4. Hence the $F$ statistic reported by SAS will be equal to $\phi/\mathrm{rank}(L)$.

5. This calculated $F$ value and the associated numbers of degrees of freedom can be saved and used afterwards for the calculation of the power.

6. Note that this requires keeping the variance components in $\alpha$ fixed, equal to the assumed population values.

7. The steps in the calculation are as follows:

• Use PROC MIXED to calculate $\phi$ and the degrees of freedom $\nu_1$ and $\nu_2$.

• Calculate the critical value $F_c$ from the central $F$ distribution, such that $P(F_{\nu_1,\nu_2,0} > F_c)$ equals the level of significance.

• Calculate the power: $\text{power} = P(F_{\nu_1,\nu_2,\phi} > F_c)$, where $\phi$ is the non-centrality parameter.

8. The SAS functions 'finv' and 'probf' are used to calculate $F_c$ and the power.
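The same critical-value and power computation that SAS performs with 'finv' and 'probf' can be sketched with the central and non-central $F$ distributions in SciPy. In the snippet below, the values of $\phi$, $\nu_1$, $\nu_2$ and the significance level are hypothetical placeholders.

```python
from scipy.stats import f, ncf

# Hypothetical inputs: phi from PROC MIXED (or a computation like the one above),
# nu1 = rank(L), nu2 from e.g. the Satterthwaite approximation, alpha = 0.05.
phi, nu1, nu2, alpha = 9.6, 1, 38, 0.05

F_c = f.ppf(1 - alpha, nu1, nu2)           # critical value: P(F_{nu1,nu2,0} > F_c) = alpha
power = ncf.sf(F_c, nu1, nu2, phi)         # power = P(F_{nu1,nu2,phi} > F_c)
print(F_c, power)
```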

Using the above procedure, it is clear that within-subject correlation will increase the power for inferences on within-subject effects but decrease the power for inferences on between-subject effects.