4.6 Generalized Estimating Equations (GEE)
4.6.1 Introduction
The key paper that introduced the Generalized Estimating Equations (GEE) is that of Liang and Zeger (1986). Thereafter, reviews have been published by Desmond (1997), Pendergast et al. (1996) and Hall (2001). Recall that the score equations for GLMs were derived in Eq (4.3) as
\[
S(\beta) = \sum_i \frac{\partial \mu_i}{\partial \beta}\, v_i^{-1}(y_i - \mu_i) = 0.
\]
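As a quick check of this equation in a familiar special case (an illustration, not part of the original derivation): for a Poisson outcome with canonical log link, $\mu_i = \exp(x_i'\beta)$, so that $\partial\mu_i/\partial\beta = \mu_i x_i$ and $v_i = \mu_i$, and the score equation reduces to
\[
S(\beta) = \sum_i x_i\,\mu_i\,\mu_i^{-1}(y_i - \mu_i) = \sum_i x_i\,(y_i - \mu_i) = 0.
\]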
In case the outcome $Y_i$ is multivariate, that is, $Y_i = (Y_{i1}, \dots, Y_{in_i})'$ with independent components $Y_{ij}$, this would become
\[
\begin{aligned}
S(\beta) &= \sum_i \sum_j \frac{\partial \mu_{ij}}{\partial \beta}\, v_{ij}^{-1}(y_{ij} - \mu_{ij}) \\
&= \sum_i \frac{\partial \mu_i'}{\partial \beta}\, V_i^{-1}(y_i - \mu_i) \\
&= \sum_i F_i'\, V_i^{-1}(y_i - \mu_i) \\
&= 0,
\end{aligned}
\]
where $F_i = \partial \mu_i/\partial \beta'$ (so that $F_i' = \partial \mu_i'/\partial \beta$), $\mu_i = E(Y_i)$ and
\[
V_i = \begin{pmatrix} v_{i1} & \dots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \dots & v_{in_i} \end{pmatrix}.
\]
In the case of the normal model with $\mu_i = X_i\beta$, this equation becomes
\[
S(\beta) = \sum_i X_i'\, V_i^{-1}(y_i - X_i\beta) = 0, \tag{4.5}
\]
since $\partial \mu_i'/\partial \beta = X_i'$. It is important to note that when fitting linear mixed models, the same score equation as Eq (4.5) had to be solved. However, $V_i$ was not diagonal but was equal to the modelled covariance matrix of $Y_i$. GEEs can be obtained by using a non-diagonal $V_i$ in the score equations for GLMs:
\[
\sum_i \frac{\partial \mu_i'}{\partial \beta}\, V_i^{-1}(y_i - \mu_i) = 0,
\]
where $V_i$ is now an $n_i \times n_i$ covariance matrix with diagonal elements given by $v_{ij}$. In practice, $V_i$ will then be of the form
\[
V_i(\beta, \alpha) = \phi\, A_i^{1/2}(\beta)\, R_i(\alpha)\, A_i^{1/2}(\beta), \tag{4.6}
\]
in which
\[
A_i^{1/2}(\beta) = \begin{pmatrix} \sqrt{v_{i1}(\mu_{i1}(\beta))} & \dots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \dots & \sqrt{v_{in_i}(\mu_{in_i}(\beta))} \end{pmatrix}.
\]
Here $R_i(\alpha)$ is the correlation matrix of $Y_i$, which depends on a vector $\alpha$ of unknown parameters. It is important to note that, unlike in the normal case, solving $S(\beta) = 0$ will not yield MLEs. The equations are, strictly speaking, not score equations since they are not first-order derivatives of some log-likelihood function for the data under some statistical model. We refer to the above approach as the standard GEE modelling approach.
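To make Eq (4.6) concrete, the following is a minimal numpy sketch of how $V_i$ could be assembled from a variance function, a working correlation matrix and $\phi$; the function name and arguments are illustrative, not taken from any particular package.

```python
import numpy as np

def working_covariance(mu_i, phi, R_i, variance_fn):
    """Assemble V_i = phi * A_i^(1/2) R_i(alpha) A_i^(1/2), as in Eq (4.6)."""
    a_half = np.diag(np.sqrt(variance_fn(mu_i)))   # A_i^(1/2)(beta)
    return phi * a_half @ R_i @ a_half

# Example: three repeated binary responses with an exchangeable working
# correlation (alpha = 0.3) and Bernoulli variance function v(mu) = mu(1 - mu).
mu_i = np.array([0.2, 0.5, 0.7])
R_i = 0.3 * np.ones((3, 3)) + 0.7 * np.eye(3)
V_i = working_covariance(mu_i, phi=1.0, R_i=R_i, variance_fn=lambda m: m * (1 - m))
```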
4.6.2 Large Sample Properties
Let $\hat\beta$ be the solution to Eq (4.5), that is, $\hat\beta$ is the solution to
\[
\sum_i \frac{\partial \mu_i'}{\partial \beta}\, V_i^{-1}(y_i - \mu_i) = 0.
\]
Then large sample properties guarantee that, conditionally upon $\alpha$, $\hat\beta$ is asymptotically ($N \to \infty$) normally distributed with mean $\beta$ and covariance matrix
\[
\operatorname{Var}(\hat\beta) = \left(\sum_i \frac{\partial \mu_i'}{\partial \beta} V_i^{-1} \frac{\partial \mu_i}{\partial \beta}\right)^{-1}
\left(\sum_i \frac{\partial \mu_i'}{\partial \beta} V_i^{-1}\, \operatorname{Var}(Y_i)\, V_i^{-1} \frac{\partial \mu_i}{\partial \beta}\right)
\left(\sum_i \frac{\partial \mu_i'}{\partial \beta} V_i^{-1} \frac{\partial \mu_i}{\partial \beta}\right)^{-1}. \tag{4.7}
\]
Notationally, we can write Eq (4.7) as
\[
\operatorname{Var}(\hat\beta) = I_0^{-1} I_1 I_0^{-1}.
\]
The above estimator of $\operatorname{Var}(\hat\beta)$, called the sandwich estimator, is also sometimes called the robust estimator. This result holds provided that the mean is correctly specified, i.e. provided that $E(Y_i) = \mu_i(\beta)$. In practice, $\alpha$ is replaced by an estimate. The robust (sandwich) estimator derived earlier for the linear model case is a special case of the above covariance matrix. In case $R_i$ is indeed the correct correlation model, the covariance matrix $\operatorname{Var}(\hat\beta)$ reduces to
\[
\operatorname{Var}(\hat\beta) = \phi \left(\sum_i \frac{\partial \mu_i'}{\partial \beta} V_i^{-1} \frac{\partial \mu_i}{\partial \beta}\right)^{-1} = I_0^{-1},
\]
provided that $\sum_i \frac{\partial \mu_i'}{\partial \beta} V_i^{-1} \frac{\partial \mu_i}{\partial \beta}$ is non-singular.
The matrix $I_0^{-1}$ is the so-called naive or model-based estimator. The usual model-based variance is thus recovered when the assumed working correlation model is actually equal to the true correlation model. The estimators $\hat\beta$ are consistent even if the working correlation matrix is misspecified.
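A minimal numpy sketch of the naive and sandwich covariance computations follows, assuming the per-subject matrices $F_i$, $V_i$ and the residuals $y_i - \mu_i(\hat\beta)$ are already available at the GEE solution; the residual outer product is the usual plug-in for $\operatorname{Var}(Y_i)$, as described in the next paragraph, and the function name is ours.

```python
import numpy as np

def gee_covariances(F_list, V_list, resid_list):
    """Return (naive, sandwich) covariance matrices I0^{-1} and I0^{-1} I1 I0^{-1}.

    F_list     : per-subject (n_i, p) matrices F_i = d mu_i / d beta'
    V_list     : per-subject (n_i, n_i) working covariance matrices V_i
    resid_list : per-subject raw residuals y_i - mu_i(beta_hat)
    """
    p = F_list[0].shape[1]
    I0 = np.zeros((p, p))
    I1 = np.zeros((p, p))
    for F_i, V_i, r_i in zip(F_list, V_list, resid_list):
        Vinv = np.linalg.inv(V_i)
        I0 += F_i.T @ Vinv @ F_i
        I1 += F_i.T @ Vinv @ np.outer(r_i, r_i) @ Vinv @ F_i
    naive = np.linalg.inv(I0)           # model-based estimator
    return naive, naive @ I1 @ naive    # robust (sandwich) estimator
```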
In practice, $\operatorname{Var}(Y_i)$ in $\operatorname{Var}(\hat\beta)$ is replaced by
\[
[y_i - \mu_i(\hat\beta)]\,[y_i - \mu_i(\hat\beta)]',
\]
which is unbiased for $\operatorname{Var}(Y_i)$ provided that the mean has been correctly specified. This means that the asymptotic theory has several implications, viz.
• Mean structure needs to be correctly specified
• Little effort needs to be spent on specifying the correlation structure because mis-specification does not affect consistency and asymptotic normality
• Considerably large samples may be required
• Efficiency can be affected, and this follows from the Cramér-Rao inequality
• Taken to the extreme, one could make the working assumption of independence between the repeated measures
• It also implies that the correlation structure should not be interpreted
• The validity of GEE is limited when there are incomplete data
4.6.3 The Working Correlation Matrix
When fitting marginal models it is possible to specify what is known as the working correlation matrix $R_i(\alpha)$ for the $n_i$ observations from subject $i$. We can therefore write
\[
V_i(\beta, \alpha) = \phi\, A_i^{1/2}(\beta)\, R_i(\alpha)\, A_i^{1/2}(\beta).
\]
Here $A_i$ is the $n_i \times n_i$ diagonal matrix with elements $v(\mu_{ij})$, the known GLM variance function. The working correlation $R_i(\alpha)$ possibly depends on a different set of parameters $\alpha$. The overdispersion parameter $\phi$ is assumed to be 1 or is estimated from the data.
The unknown quantities are expressed in terms of the Pearson residuals
\[
e_{ij} = \frac{y_{ij} - \mu_{ij}}{\sqrt{v(\mu_{ij})}}. \tag{4.8}
\]
Note that the $e_{ij}$ implicitly depend on $\beta$, which is unknown and has to be estimated.
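A short numpy sketch of Eq (4.8), assuming the fitted means and the GLM variance function are available; the names are illustrative only.

```python
import numpy as np

def pearson_residuals(y_i, mu_i, variance_fn):
    """e_ij = (y_ij - mu_ij) / sqrt(v(mu_ij)), as in Eq (4.8)."""
    return (y_i - mu_i) / np.sqrt(variance_fn(mu_i))

# Binary responses with Bernoulli variance function v(mu) = mu(1 - mu)
e_i = pearson_residuals(np.array([1.0, 0.0, 1.0]),
                        np.array([0.2, 0.5, 0.7]),
                        lambda m: m * (1 - m))
```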
4.6.4 Estimation of the Working Correlation Matrix
Liang and Zeger (1986) proposed moment-based estimates for the working correlation matrix. Some of the more popular estimation assumptions include:
Assumption      Corr$(Y_{ij}, Y_{ik})$      Estimate
Independence    $0$                          (none)
Exchangeable    $\alpha$                     $\hat\alpha = \frac{1}{N}\sum_{i=1}^{N} \frac{1}{n_i(n_i-1)} \sum_{j \neq k} e_{ij}e_{ik}$
AR(1)           $\alpha^{|j-k|}$             $\hat\alpha = \frac{1}{N}\sum_{i=1}^{N} \frac{1}{n_i-1} \sum_{j \le n_i-1} e_{ij}e_{i,j+1}$
Unstructured    $\alpha_{jk}$                $\hat\alpha_{jk} = \frac{1}{N}\sum_{i=1}^{N} e_{ij}e_{ik}$

The dispersion parameter is then estimated by
\[
\hat\phi = \frac{1}{N}\sum_{i=1}^{N}\frac{1}{n_i}\sum_{j=1}^{n_i} e_{ij}^2. \tag{4.9}
\]
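The moment estimators in the table and Eq (4.9) translate directly into code. The sketch below assumes a list of per-subject Pearson residual vectors and follows the simple averages shown above; it is illustrative only, and practical implementations often add small-sample and scale corrections.

```python
import numpy as np

def moment_estimates(residuals):
    """Exchangeable and AR(1) working correlation estimates and the
    dispersion estimate of Eq (4.9), from per-subject Pearson residuals."""
    N = len(residuals)
    alpha_exch = alpha_ar1 = phi = 0.0
    for e in residuals:
        n_i = len(e)
        # exchangeable: average e_ij * e_ik over all pairs j != k
        pair_sum = np.sum(e) ** 2 - np.sum(e ** 2)
        alpha_exch += pair_sum / (n_i * (n_i - 1))
        # AR(1): average over adjacent pairs (j, j + 1)
        alpha_ar1 += np.sum(e[:-1] * e[1:]) / (n_i - 1)
        # dispersion: mean squared Pearson residual for subject i
        phi += np.mean(e ** 2)
    return alpha_exch / N, alpha_ar1 / N, phi / N
```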
Albert and McShane (1995), Fitzmaurice (1995) and Hall and Severini (1998) all point out that accurate modelling of the correlation structure generally improves statistical inference on the means. However, the moment-based estimates of the working correlation structure have led to doubts about the efficiency of correlation modelling and the convergence of GEE solution algorithms in certain situations. Crowder (1995) indicated that the solution to the first-order GEE for $\beta$ does not exist under certain types of severe misspecification of the working correlation structure. Further work along these lines was done by Sutradhar and Das (1999) who, using unbalanced data, showed that estimates of $\beta$ obtained under a working independence assumption are sometimes more efficient than those obtained with a misspecified non-diagonal working correlation structure. Chaganty (1997), Segal, Neuhaus and James (1997) and O'Hara-Hines (1998) all raise criticisms of the parameter estimation in GEE. Chaganty (1997) proposes the quasi-least squares (QLS) method for estimating the correlation parameters; this method was further extended and investigated by Shults and Chaganty (1998). More recently, Wang and Carey (2004) proposed two ways of constructing unbiased estimating equations from general correlation models for irregularly timed repeated measures to supplement and enhance GEE. The equations are obtained by differentiation of the Cholesky decomposition of the working correlation, or as the score equations for a decoupled Gaussian pseudolikelihood, and can be solved with computational effort equivalent to that required for a first-order GEE. Wang and Carey (2004) also state that their methods are well defined for highly unbalanced and irregularly timed data sets and are applicable to working correlation patterns outside the first-order Markovian time series model. They further note that if convergence of the unbiased estimating equations is not achieved, then choosing a different correlation structure or switching to a working independence model may be a solution.
4.6.5 Fitting GEE
The standard fitting procedure for GEEs in SAS is PROC GENMOD. The steps involved in the computational procedure are as follows:
Step 1
Compute initial estimates for $\beta$ using a univariate GLM (i.e. assuming independence among the $n_i$ responses for subject $i$).
Step 2
Compute Pearson residuals $e_{ij}$ using Eq (4.8).
Step 3
Compute estimates for α.
Step 4
Compute $R_i(\alpha)$ under a given assumption of a correlation structure.
Step 5
Compute an estimate for $\phi$ using Eq (4.9).
Step 6
Compute $V_i(\beta, \alpha) = \phi\, A_i^{1/2}(\beta)\, R_i(\alpha)\, A_i^{1/2}(\beta)$.
Step 7
Update estimate for β:
\[
\beta^{(t+1)} = \beta^{(t)} + \left[\sum_{i=1}^{N} F_i'\, V_i^{-1} F_i\right]^{-1} \left[\sum_{i=1}^{N} F_i'\, V_i^{-1}(y_i - \mu_i)\right]
\]
Step 8
Iterate steps 2-7 until convergence is reached.
Estimates of precision can then be obtained from the model-based estimator $I_0^{-1}$ and the robust (sandwich) estimator $I_0^{-1} I_1 I_0^{-1}$, and the two can be compared.
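In practice one would use PROC GENMOD (or an equivalent packaged routine), but the eight steps can also be sketched directly. The following is a minimal numpy illustration for a logistic marginal model with an exchangeable working correlation; all names are ours, the crude starting value replaces the univariate GLM fit of Step 1, and no numerical safeguards are included.

```python
import numpy as np

def fit_gee_logistic(y_list, X_list, tol=1e-6, max_iter=50):
    """Illustrative GEE fit (Steps 1-8) for a logit link and exchangeable R_i."""
    p = X_list[0].shape[1]
    beta = np.zeros(p)                       # Step 1 (crude start instead of a GLM fit)
    for _ in range(max_iter):
        # Steps 2-5: Pearson residuals, alpha, and phi
        mus, resid = [], []
        for y_i, X_i in zip(y_list, X_list):
            mu_i = 1.0 / (1.0 + np.exp(-X_i @ beta))
            mus.append(mu_i)
            resid.append((y_i - mu_i) / np.sqrt(mu_i * (1 - mu_i)))    # Eq (4.8)
        alpha = np.mean([(np.sum(e) ** 2 - np.sum(e ** 2)) / (len(e) * (len(e) - 1))
                         for e in resid])
        phi = np.mean([np.mean(e ** 2) for e in resid])                # Eq (4.9)
        # Steps 6-7: working covariance V_i and Fisher-scoring update of beta
        U, H = np.zeros(p), np.zeros((p, p))
        for y_i, X_i, mu_i in zip(y_list, X_list, mus):
            n_i = len(y_i)
            a_half = np.diag(np.sqrt(mu_i * (1 - mu_i)))
            R_i = alpha * np.ones((n_i, n_i)) + (1 - alpha) * np.eye(n_i)
            V_i = phi * a_half @ R_i @ a_half                          # Eq (4.6)
            F_i = (mu_i * (1 - mu_i))[:, None] * X_i                   # d mu_i / d beta'
            Vinv = np.linalg.inv(V_i)
            U += F_i.T @ Vinv @ (y_i - mu_i)
            H += F_i.T @ Vinv @ F_i
        step = np.linalg.solve(H, U)
        beta = beta + step
        if np.max(np.abs(step)) < tol:       # Step 8: stop when the update is negligible
            break
    return beta
```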