4.6 Generalized Estimating Equations (GEE)
4.6.1 Introduction
The key paper that introduced the Generalized Estimating Equations (GEE) is that of Liang and Zeger (1986). Thereafter, reviews have been published by Desmond (1997), Pendergast et al. (1996) and Hall (2001). Recall that the score equations for GLMs were derived in Eq (4.3) as
\[
S(\beta) = \sum_i \frac{\partial \mu_i}{\partial \beta}\, v_i^{-1}(y_i - \mu_i) = 0.
\]
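As a quick check of this equation in a familiar special case (an illustration, not part of the original derivation): for a Poisson outcome with canonical log link, $\mu_i = \exp(x_i'\beta)$, so that $\partial\mu_i/\partial\beta = \mu_i x_i$ and $v_i = \mu_i$, and the score equation reduces to
\[
S(\beta) = \sum_i x_i\,\mu_i\,\mu_i^{-1}(y_i - \mu_i) = \sum_i x_i\,(y_i - \mu_i) = 0.
\]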
In case the outcome $Y_i$ is multivariate, that is, $Y_i = (Y_{i1}, \dots, Y_{in_i})'$ with independent components $Y_{ij}$, this would become
\[
\begin{aligned}
S(\beta) &= \sum_i \sum_j \frac{\partial \mu_{ij}}{\partial \beta}\, v_{ij}^{-1}(y_{ij} - \mu_{ij}) \\
&= \sum_i \frac{\partial \mu_i'}{\partial \beta}\, V_i^{-1}(y_i - \mu_i) \\
&= \sum_i F_i'\, V_i^{-1}(y_i - \mu_i) \\
&= 0,
\end{aligned}
\]
where $F_i = \partial \mu_i/\partial \beta'$ (so that $F_i' = \partial \mu_i'/\partial \beta$), $\mu_i = E(Y_i)$ and
\[
V_i = \begin{pmatrix} v_{i1} & \dots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \dots & v_{in_i} \end{pmatrix}.
\]
In the case of the normal model with $\mu_i = X_i\beta$, this equation becomes
\[
S(\beta) = \sum_i X_i'\, V_i^{-1}(y_i - X_i\beta) = 0, \tag{4.5}
\]
since $\partial \mu_i'/\partial \beta = X_i'$. It is important to note that when fitting linear mixed models, the same score equation as Eq (4.5) had to be solved. However, $V_i$ was not diagonal but was equal to the modelled covariance matrix of $Y_i$. GEEs can be obtained by using a non-diagonal $V_i$ in the score equations for GLMs:
\[
\sum_i \frac{\partial \mu_i'}{\partial \beta}\, V_i^{-1}(y_i - \mu_i) = 0,
\]
where $V_i$ is now an $n_i \times n_i$ covariance matrix with diagonal elements given by $v_{ij}$. In practice, $V_i$ will then be of the form
\[
V_i(\beta, \alpha) = \phi\, A_i^{1/2}(\beta)\, R_i(\alpha)\, A_i^{1/2}(\beta), \tag{4.6}
\]
in which
\[
A_i^{1/2}(\beta) = \begin{pmatrix} \sqrt{v_{i1}(\mu_{i1}(\beta))} & \dots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \dots & \sqrt{v_{in_i}(\mu_{in_i}(\beta))} \end{pmatrix}.
\]
Here $R_i(\alpha)$ is the correlation matrix of $Y_i$, which depends on a vector $\alpha$ of unknown parameters. It is important to note that, unlike in the normal case, solving $S(\beta) = 0$ will not yield MLEs. The equations are, strictly speaking, not score equations since they are not first-order derivatives of some log-likelihood function for the data under some statistical model. We refer to the above approach as the standard GEE modelling approach.
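To make Eq (4.6) concrete, the following is a minimal numpy sketch of how $V_i$ could be assembled from a variance function, a working correlation matrix and $\phi$; the function name and arguments are illustrative, not taken from any particular package.

```python
import numpy as np

def working_covariance(mu_i, phi, R_i, variance_fn):
    """Assemble V_i = phi * A_i^(1/2) R_i(alpha) A_i^(1/2), as in Eq (4.6)."""
    a_half = np.diag(np.sqrt(variance_fn(mu_i)))   # A_i^(1/2)(beta)
    return phi * a_half @ R_i @ a_half

# Example: three repeated binary responses with an exchangeable working
# correlation (alpha = 0.3) and Bernoulli variance function v(mu) = mu(1 - mu).
mu_i = np.array([0.2, 0.5, 0.7])
R_i = 0.3 * np.ones((3, 3)) + 0.7 * np.eye(3)
V_i = working_covariance(mu_i, phi=1.0, R_i=R_i, variance_fn=lambda m: m * (1 - m))
```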
4.6.2 Large Sample Properties
Let $\hat\beta$ be the solution to Eq (4.5), that is, $\hat\beta$ is the solution to
\[
\sum_i \frac{\partial \mu_i'}{\partial \beta}\, V_i^{-1}(y_i - \mu_i) = 0.
\]
Then large sample properties guarantee that, conditionally upon $\alpha$, $\hat\beta$ is asymptotically ($N \to \infty$) normally distributed with mean $\beta$ and covariance matrix
\[
\operatorname{Var}(\hat\beta) = \left(\sum_i \frac{\partial \mu_i'}{\partial \beta} V_i^{-1} \frac{\partial \mu_i}{\partial \beta}\right)^{-1}
\left(\sum_i \frac{\partial \mu_i'}{\partial \beta} V_i^{-1}\, \operatorname{Var}(Y_i)\, V_i^{-1} \frac{\partial \mu_i}{\partial \beta}\right)
\left(\sum_i \frac{\partial \mu_i'}{\partial \beta} V_i^{-1} \frac{\partial \mu_i}{\partial \beta}\right)^{-1}. \tag{4.7}
\]
Notationally, we can write Eq (4.7) as
\[
\operatorname{Var}(\hat\beta) = I_0^{-1} I_1 I_0^{-1}.
\]
The above estimator of $\operatorname{Var}(\hat\beta)$, called the sandwich estimator, is also sometimes called the robust estimator. This result holds provided that the mean is correctly specified, i.e. provided that $E(Y_i) = \mu_i(\beta)$. In practice, $\alpha$ is replaced by an estimate. The robust (sandwich) estimator derived earlier for the linear model case is a special case of the above covariance matrix. In case $R_i$ is indeed the correct correlation model, the covariance matrix $\operatorname{Var}(\hat\beta)$ reduces to
\[
\operatorname{Var}(\hat\beta) = \phi \left(\sum_i \frac{\partial \mu_i'}{\partial \beta} V_i^{-1} \frac{\partial \mu_i}{\partial \beta}\right)^{-1} = I_0^{-1},
\]
provided that $\sum_i \frac{\partial \mu_i'}{\partial \beta} V_i^{-1} \frac{\partial \mu_i}{\partial \beta}$ is non-singular.
The matrix $I_0^{-1}$ is the so-called naive or model-based estimator. The usual model-based variance is thus recovered when the assumed working correlation model is actually equal to the true correlation model. The estimators $\hat\beta$ are consistent even if the working correlation matrix is misspecified.
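A minimal numpy sketch of the naive and sandwich covariance computations follows, assuming the per-subject matrices $F_i$, $V_i$ and the residuals $y_i - \mu_i(\hat\beta)$ are already available at the GEE solution; the residual outer product is the usual plug-in for $\operatorname{Var}(Y_i)$, as described in the next paragraph, and the function name is ours.

```python
import numpy as np

def gee_covariances(F_list, V_list, resid_list):
    """Return (naive, sandwich) covariance matrices I0^{-1} and I0^{-1} I1 I0^{-1}.

    F_list     : per-subject (n_i, p) matrices F_i = d mu_i / d beta'
    V_list     : per-subject (n_i, n_i) working covariance matrices V_i
    resid_list : per-subject raw residuals y_i - mu_i(beta_hat)
    """
    p = F_list[0].shape[1]
    I0 = np.zeros((p, p))
    I1 = np.zeros((p, p))
    for F_i, V_i, r_i in zip(F_list, V_list, resid_list):
        Vinv = np.linalg.inv(V_i)
        I0 += F_i.T @ Vinv @ F_i
        I1 += F_i.T @ Vinv @ np.outer(r_i, r_i) @ Vinv @ F_i
    naive = np.linalg.inv(I0)           # model-based estimator
    return naive, naive @ I1 @ naive    # robust (sandwich) estimator
```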
In practice, $\operatorname{Var}(Y_i)$ in $\operatorname{Var}(\hat\beta)$ is replaced by
\[
[y_i - \mu_i(\hat\beta)]\,[y_i - \mu_i(\hat\beta)]',
\]
which is unbiased for $\operatorname{Var}(Y_i)$ provided that the mean has been correctly specified. This means that the asymptotic theory has several implications, viz.
• Mean structure needs to be correctly specified
• Little effort needs to be spent on specifying the correlation structure because mis-specification does not affect consistency and asymptotic normality
• Considerably large samples may be required
• Efficiency can be affected, and this follows from the Cramér-Rao inequality
• Taken to the extreme, one could make the working assumption of independence between the repeated measures
• It also implies that the correlation structure should not be interpreted
• The validity of GEE is limited when there are incomplete data
4.6.3 The Working Correlation Matrix
When fitting marginal models it is possible to specify what is known as the working correlation matrix $R_i(\alpha)$ for the $n_i$ observations from subject $i$. We can therefore write
\[
V_i(\beta, \alpha) = \phi\, A_i^{1/2}(\beta)\, R_i(\alpha)\, A_i^{1/2}(\beta).
\]
Here $A_i$ is the $n_i \times n_i$ diagonal matrix with elements $v(\mu_{ij})$, the known GLM variance function. The working correlation $R_i(\alpha)$ possibly depends on a different set of parameters $\alpha$. The overdispersion parameter $\phi$ is assumed to be 1 or is estimated from the data.
The unknown quantities are expressed in terms of the Pearson residuals
\[
e_{ij} = \frac{y_{ij} - \mu_{ij}}{\sqrt{v(\mu_{ij})}}. \tag{4.8}
\]
Note that the $e_{ij}$ implicitly depend on $\beta$, which is unknown and has to be estimated.
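A short numpy sketch of Eq (4.8), assuming the fitted means and the GLM variance function are available; the names are illustrative only.

```python
import numpy as np

def pearson_residuals(y_i, mu_i, variance_fn):
    """e_ij = (y_ij - mu_ij) / sqrt(v(mu_ij)), as in Eq (4.8)."""
    return (y_i - mu_i) / np.sqrt(variance_fn(mu_i))

# Binary responses with Bernoulli variance function v(mu) = mu(1 - mu)
e_i = pearson_residuals(np.array([1.0, 0.0, 1.0]),
                        np.array([0.2, 0.5, 0.7]),
                        lambda m: m * (1 - m))
```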
4.6.4 Estimation of the Working Correlation Matrix
Liang and Zeger (1986) proposed moment-based estimates for the working correlation matrix. Some of the more popular estimation assumptions include:
Assumption      Corr$(Y_{ij}, Y_{ik})$      Estimate
Independence    $0$                          (none)
Exchangeable    $\alpha$                     $\hat\alpha = \frac{1}{N}\sum_{i=1}^{N} \frac{1}{n_i(n_i-1)} \sum_{j \neq k} e_{ij}e_{ik}$
AR(1)           $\alpha^{|j-k|}$             $\hat\alpha = \frac{1}{N}\sum_{i=1}^{N} \frac{1}{n_i-1} \sum_{j \le n_i-1} e_{ij}e_{i,j+1}$
Unstructured    $\alpha_{jk}$                $\hat\alpha_{jk} = \frac{1}{N}\sum_{i=1}^{N} e_{ij}e_{ik}$

The dispersion parameter is then estimated by
\[
\hat\phi = \frac{1}{N}\sum_{i=1}^{N}\frac{1}{n_i}\sum_{j=1}^{n_i} e_{ij}^2. \tag{4.9}
\]
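The moment estimators in the table and Eq (4.9) translate directly into code. The sketch below assumes a list of per-subject Pearson residual vectors and follows the simple averages shown above; it is illustrative only, and practical implementations often add small-sample and scale corrections.

```python
import numpy as np

def moment_estimates(residuals):
    """Exchangeable and AR(1) working correlation estimates and the
    dispersion estimate of Eq (4.9), from per-subject Pearson residuals."""
    N = len(residuals)
    alpha_exch = alpha_ar1 = phi = 0.0
    for e in residuals:
        n_i = len(e)
        # exchangeable: average e_ij * e_ik over all pairs j != k
        pair_sum = np.sum(e) ** 2 - np.sum(e ** 2)
        alpha_exch += pair_sum / (n_i * (n_i - 1))
        # AR(1): average over adjacent pairs (j, j + 1)
        alpha_ar1 += np.sum(e[:-1] * e[1:]) / (n_i - 1)
        # dispersion: mean squared Pearson residual for subject i
        phi += np.mean(e ** 2)
    return alpha_exch / N, alpha_ar1 / N, phi / N
```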
Albert and McShane (1995), Fitzmaurice (1995) and Hall and Severini (1998) all point out that accurate modelling of the correlation structure generally improves statistical inference on the means. However, the moment-based estimates of the working correlation structure have led to doubts about the efficiency of correlation modelling and the convergence of GEE solution algorithms in certain situations. Crowder (1995) indicated that the solution to the first-order GEE for $\beta$ does not exist under certain types of severe misspecification of the working correlation structure. Further work along these lines was done by Sutradhar and Das (1999) who, using unbalanced data, showed that estimates of $\beta$ obtained under a working independence assumption are sometimes more efficient than those obtained with a misspecified non-diagonal working correlation structure. Chaganty (1997), Segal, Neuhaus and James (1997) and O'Hara-Hines (1998) all raise criticisms of the parameter estimation in GEE. Chaganty (1997) proposes the quasi-least squares (QLS) method for estimating the correlation parameters; this method was further extended and investigated by Shults and Chaganty (1998). More recently, Wang and Carey (2004) proposed two ways of constructing unbiased estimating equations from general correlation models for irregularly timed repeated measures to supplement and enhance GEE. The equations are obtained by differentiation of the Cholesky decomposition of the working correlation, or as the score equations for a decoupled Gaussian pseudolikelihood, and can be solved with computational effort equivalent to that required for a first-order GEE. Wang and Carey (2004) also state that their methods are well defined for highly unbalanced and irregularly timed data sets and are applicable to working correlation patterns outside the first-order Markovian time series model. They further note that if convergence of the unbiased estimating equations is not achieved, then choosing a different correlation structure or switching to a working independence model may be a solution.
4.6.5 Fitting GEE
The standard fitting procedure for GEEs in SAS is PROC GENMOD. The steps involved in the computational procedure are as follows:
Step 1
Compute initial estimates for $\beta$ using a univariate GLM (i.e. assuming independence among the $n_i$ responses for subject $i$).
Step 2
Compute Pearson residuals $e_{ij}$ using Eq (4.8).
Step 3
Compute estimates for α.
Step 4
Compute $R_i(\alpha)$ under a given assumption of a correlation structure.
Step 5
Compute an estimate for $\phi$ using Eq (4.9).
Step 6
Compute $V_i(\beta, \alpha) = \phi\, A_i^{1/2}(\beta)\, R_i(\alpha)\, A_i^{1/2}(\beta)$.
Step 7
Update estimate for β:
\[
\beta^{(t+1)} = \beta^{(t)} + \left[\sum_{i=1}^{N} F_i'\, V_i^{-1} F_i\right]^{-1} \left[\sum_{i=1}^{N} F_i'\, V_i^{-1}(y_i - \mu_i)\right]
\]
Step 8
Iterate steps 2-7 until convergence is reached.
Estimates of precision can then be obtained from the model-based estimator $I_0^{-1}$ and the robust (sandwich) estimator $I_0^{-1} I_1 I_0^{-1}$, and the two can be compared.
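In practice one would use PROC GENMOD (or an equivalent packaged routine), but the eight steps can also be sketched directly. The following is a minimal numpy illustration for a logistic marginal model with an exchangeable working correlation; all names are ours, the crude starting value replaces the univariate GLM fit of Step 1, and no numerical safeguards are included.

```python
import numpy as np

def fit_gee_logistic(y_list, X_list, tol=1e-6, max_iter=50):
    """Illustrative GEE fit (Steps 1-8) for a logit link and exchangeable R_i."""
    p = X_list[0].shape[1]
    beta = np.zeros(p)                       # Step 1 (crude start instead of a GLM fit)
    for _ in range(max_iter):
        # Steps 2-5: Pearson residuals, alpha, and phi
        mus, resid = [], []
        for y_i, X_i in zip(y_list, X_list):
            mu_i = 1.0 / (1.0 + np.exp(-X_i @ beta))
            mus.append(mu_i)
            resid.append((y_i - mu_i) / np.sqrt(mu_i * (1 - mu_i)))    # Eq (4.8)
        alpha = np.mean([(np.sum(e) ** 2 - np.sum(e ** 2)) / (len(e) * (len(e) - 1))
                         for e in resid])
        phi = np.mean([np.mean(e ** 2) for e in resid])                # Eq (4.9)
        # Steps 6-7: working covariance V_i and Fisher-scoring update of beta
        U, H = np.zeros(p), np.zeros((p, p))
        for y_i, X_i, mu_i in zip(y_list, X_list, mus):
            n_i = len(y_i)
            a_half = np.diag(np.sqrt(mu_i * (1 - mu_i)))
            R_i = alpha * np.ones((n_i, n_i)) + (1 - alpha) * np.eye(n_i)
            V_i = phi * a_half @ R_i @ a_half                          # Eq (4.6)
            F_i = (mu_i * (1 - mu_i))[:, None] * X_i                   # d mu_i / d beta'
            Vinv = np.linalg.inv(V_i)
            U += F_i.T @ Vinv @ (y_i - mu_i)
            H += F_i.T @ Vinv @ F_i
        step = np.linalg.solve(H, U)
        beta = beta + step
        if np.max(np.abs(step)) < tol:       # Step 8: stop when the update is negligible
            break
    return beta
```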