4.7 Some developmental notes on GEE over time

longitudinal data.

4. An alternative to GEE is the alternating logistic regressions (ALR) approach proposed by Carey, Zeger and Diggle (1993), but it is not of interest in the current work.

5. Le Cessie and Van Houwelingen (1994) suggested approximating the true likelihood by a pseudo-likelihood (PL) function that is easier to evaluate and to maximize. Both GEE and PL give consistent and asymptotically normal estimators, provided that an empirically corrected variance estimator, which we have called the sandwich estimator, is used. GEE is well suited only to marginal models, while PL can be used for marginal models (Geys, Molenberghs and Lipsitz, 1998) and conditional models (Geys, Molenberghs and Ryan, 1997, 1999).

6. Wang and Lin (2005) investigated the impact of misspecifying the variance function, which is known to be a function of the mean. They state that, in the GEE framework, correct specification of the variance function can improve estimation efficiency even if the correlation structure is misspecified. However, misspecification of the variance function affects the estimators for within-cluster covariates much more than those for cluster-level covariates. Moreover, if the variance function is misspecified, the correct choice of correlation structure may not necessarily improve estimation efficiency.

7. Mainstream statistical software packages such as SAS (PROC GENMOD), Stata (xtgee command) and GenStat have the GEE methodology described above built in.
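To make the distinction between the model-based (naive) and the empirically corrected (sandwich) variance in note 5 concrete, the following is a minimal sketch for the simplest possible case: an intercept-only model with identity link and an independence working correlation. The function name, the toy data and the choice of dispersion estimate are illustrative assumptions and are not taken from the RSV analysis.

```python
import math

def gee_intercept_only(clusters):
    """Intercept-only GEE, identity link, independence working correlation.

    Returns (estimate, model_based_se, empirical_se).
    Hypothetical helper for illustration only.
    """
    n_total = sum(len(c) for c in clusters)
    # The estimating equation sum_i 1'(y_i - beta) = 0 gives the overall mean.
    beta = sum(sum(c) for c in clusters) / n_total
    resid = [[y - beta for y in c] for c in clusters]

    # Model-based variance: dispersion / N, which treats all
    # observations as independent (assumed Pearson-type dispersion).
    phi = sum(r * r for c in resid for r in c) / (n_total - 1)
    model_var = phi / n_total

    # Sandwich variance: the "meat" sums squared cluster-level residual
    # totals, so within-cluster dependence is reflected in the result.
    meat = sum(sum(c) ** 2 for c in resid)
    empirical_var = meat / n_total ** 2

    return beta, math.sqrt(model_var), math.sqrt(empirical_var)

# Toy data: four clusters whose members are perfectly concordant.
clusters = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
beta, se_model, se_empirical = gee_intercept_only(clusters)
print(beta, se_model, se_empirical)
```

In this toy example the point estimate is the same whichever variance is reported, while the empirical standard error is larger than the model-based one because the responses within a cluster move together.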

4.7.1 Application of fitting GEE models to the RSV data set

A series of models was fitted using the ‘Proc Genmod’ procedure in SAS, varying the working correlation structure for the responses within an individual and then assessing the main effects. The model that was first fitted included all the main effects terms. Only those terms that were found to be significant were retained with the suitable correlation structure. The main effects terms that we consider are: age, dt, prev, actipass and timemonth. These variables were described in detail in Chapter 1. Each interaction term was assessed by adding it, one at a time, to the full main effects model and examining the p-value of its Wald test. None of the interaction terms was found to be significant, so they are not reported here. The results are summarized below.
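For readers unfamiliar with the working correlation structures being compared, here is a small sketch of the three that converged, written as plain nested lists. The function name, the cluster size of 3 and the value alpha = 0.5 are assumptions chosen for illustration, not values from the fitted models.

```python
def working_correlation(structure, size, alpha=0.0):
    """Return a size x size working correlation matrix as nested lists.

    Hypothetical helper; 'alpha' is the single correlation parameter.
    """
    if structure == "independent":
        # No correlation between repeated measurements.
        return [[1.0 if j == k else 0.0 for k in range(size)]
                for j in range(size)]
    if structure == "exchangeable":
        # The same correlation alpha between every pair of measurements.
        return [[1.0 if j == k else alpha for k in range(size)]
                for j in range(size)]
    if structure == "ar1":
        # Correlation decays as alpha ** |j - k| with the lag.
        return [[alpha ** abs(j - k) for k in range(size)]
                for j in range(size)]
    raise ValueError("unknown structure: " + structure)

for name in ("independent", "exchangeable", "ar1"):
    print(name, working_correlation(name, 3, alpha=0.5))
```

The unstructured option, by contrast, estimates a separate correlation for every pair of measurement occasions, which requires many more parameters than the single alpha used here.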

                      Exchangeable                    Independent                        AR(1)
Parameter         Est.  Std. Error    Pr>|Z|      Est.  Std. Error    Pr>|Z|      Est.  Std. Error    Pr>|Z|
Intercept      -5.0329      1.4454     0.001   -5.0363      1.4479     0.001   -5.0337       1.458     0.001
age 0          -0.9253      1.2175     0.447   -0.9197      1.2194     0.451   -0.9261      1.2343     0.453
age 1          -0.6499      1.0714     0.544    -0.647      1.0736     0.547   -0.6011      1.0824     0.579
age 2          -0.2792      1.0337     0.787    -0.276      1.0356     0.790    -0.241      1.0437     0.817
age 3          -0.0714      0.9744     0.942   -0.0689      0.9759     0.944   -0.0305      0.9835     0.975
age 4          -0.6709      0.9491     0.480    -0.669      0.9502     0.481   -0.6499      0.9579     0.498
age 5          -2.6057       1.298     0.045   -2.6025      1.2979     0.045   -2.5411      1.2919     0.049
age 6          -1.5989      1.0086     0.113    -1.596      1.0087     0.114    -1.566      1.0105     0.121
age 7          -2.2518      1.1538     0.051     -2.25      1.1538     0.051   -2.2603      1.1692     0.053
age 8               -1      0.5944     0.093   -0.9989      0.5946     0.093     -0.96      0.5969     0.108
age 9          -0.7399      0.5124     0.149   -0.7389      0.5126     0.149   -0.7361      0.5189     0.156
age 10         -0.3234      0.4528     0.475   -0.3221      0.4528     0.477   -0.2992      0.4574     0.513
age 11         -0.5684      0.4612     0.218   -0.5685      0.4612     0.218   -0.5365      0.4637     0.247
age 12           0.000       0.000         .     0.000       0.000         .     0.000       0.000         .
dt              0.0008      0.0084     0.919    0.0009      0.0084     0.919    0.0014      0.0082     0.866
prev           44.6065      8.1063    <.0001   44.5942      8.1055    <.0001   43.8948      8.1214    <.0001
timemonth      -0.0457      0.1044     0.662   -0.0454      0.1046     0.664   -0.0437      0.1053     0.678
actipass 0      2.2345      0.1768    <.0001    2.2341      0.1769    <.0001    2.2049      0.1759    <.0001
actipass 1       0.000       0.000         .     0.000       0.000         .     0.000       0.000         .

Table 4.1: GEE parameter estimates and model-based standard errors

                      Exchangeable                    Independent                        AR(1)
Parameter         Est.  Std. Error    Pr>|Z|      Est.  Std. Error    Pr>|Z|      Est.  Std. Error    Pr>|Z|
Intercept       -5.033       1.165    <.0001    -5.036       1.165    <.0001    -5.034       1.161    <.0001
age 0           -0.925       1.230     0.452    -0.920       1.229     0.454    -0.926       1.226     0.450
age 1           -0.650       0.906     0.473    -0.647       0.906     0.475    -0.601       0.902     0.505
age 2           -0.279       0.858     0.745    -0.276       0.858     0.748    -0.241       0.857     0.779
age 3           -0.071       0.801     0.929    -0.069       0.801     0.932    -0.031       0.800     0.970
age 4           -0.671       0.746     0.368    -0.669       0.746     0.370    -0.650       0.749     0.385
age 5           -2.606       1.194     0.029    -2.603       1.194     0.029    -2.541       1.172     0.030
age 6           -1.599       0.869     0.066    -1.596       0.869     0.066    -1.566       0.860     0.069
age 7           -2.252       1.040     0.030    -2.250       1.039     0.030    -2.260       1.033     0.029
age 8           -1.000       0.606     0.099    -0.999       0.607     0.100    -0.960       0.606     0.113
age 9           -0.740       0.561     0.187    -0.739       0.561     0.188    -0.736       0.554     0.184
age 10          -0.323       0.473     0.494    -0.322       0.473     0.496    -0.299       0.473     0.527
age 11          -0.568       0.443     0.199    -0.569       0.443     0.199    -0.537       0.447     0.231
age 12           0.000       0.000         .     0.000       0.000         .     0.000       0.000         .
dt               0.001       0.011     0.937     0.001       0.011     0.937     0.001       0.010     0.893
prev            44.607       6.554    <.0001    44.594       6.552    <.0001    43.895       6.527    <.0001
timemonth       -0.046       0.085     0.589    -0.045       0.085     0.592    -0.044       0.084     0.603
actipass 0       2.235       0.181    <.0001     2.234       0.181    <.0001     2.205       0.178    <.0001
actipass 1       0.000       0.000         .     0.000       0.000         .     0.000       0.000         .

Table 4.2: GEE parameter estimates and empirical standard errors

The algorithm for the unstructured correlation matrix option did not converge, so those results are omitted. The model-based estimates and standard errors differ very little across the three correlation structures. Moreover, the model-based and the empirical parameter estimates agree: the values in Tables 4.1 and 4.2 are identical up to rounding. This is a feature of GEE, because the choice between naive (model-based) and empirical only affects the estimation of the covariance matrix of the regression parameter β, not the point estimates. The estimated correlation between two repeated measurements under the exchangeable correlation matrix was −0.00035. A possible reason why the unstructured correlation matrix did not converge is that the observations cannot be aligned, that is, the observations were not equally spaced.

Tables 4.1 and 4.2 show that, at the 5% significance level, age group 5 differs significantly from the reference age group 12 under both the model-based and the empirical standard errors. Age group 7 is marginal under the model-based standard errors (p ≈ 0.05) and significant under the empirical standard errors (p ≈ 0.03) in determining whether a child is infected or not. The variables prevalence (prev) and type of sampling (actipass), that is, whether a child was actively or passively sampled (actipass 0 versus actipass 1), were both significant at the 5% level in influencing whether a child is infected or not. The full results are tabulated in Tables 4.1 and 4.2 for the two types of standard errors and the three correlation structures.

It is also worth noting that the exchangeable and independent correlation structures have empirical standard errors slightly closer to the model-based standard errors than the AR(1) correlation structure does. The estimated GEE correlation matrices are all essentially independent, so we expect to see no appreciable differences among the columns of Tables 4.1 and 4.2. It is, however, interesting that the sandwich estimator appears to be picking up dependence not captured by the working correlation matrices, given the estimated correlation parameters. We reiterate that the unstructured correlation matrix was found to be unsuitable in this setting and is dropped.
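The Pr>|Z| columns can be checked by hand: each is a two-sided normal p-value for the Wald statistic Z = estimate / standard error. The sketch below, whose function name is an illustrative assumption, recomputes the age 5 entry under the exchangeable structure from the rounded values printed in Tables 4.1 and 4.2.

```python
import math

def wald_test(estimate, std_error):
    """Return the Wald Z statistic and its two-sided normal p-value."""
    z = estimate / std_error
    # P(|N(0,1)| > |z|) expressed through the complementary error function.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# age 5, exchangeable structure, model-based standard error (Table 4.1)
z_model, p_model = wald_test(-2.6057, 1.298)
# age 5, exchangeable structure, empirical standard error (Table 4.2)
z_emp, p_emp = wald_test(-2.606, 1.194)

print(round(z_model, 2), round(p_model, 3))
print(round(z_emp, 2), round(p_emp, 3))
```

The same estimate gives a smaller p-value under the empirical standard error simply because that standard error is smaller, which is why the significance of borderline terms such as age 7 depends on which variance estimator is reported.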

Correlation Type  Source        DF    Chi-Square   Pr > Chi-Sq
Exchangeable      age           12         30.39        0.0024
                  dt             1          0.01        0.9379
                  prev           1         23.32        <.0001
                  timemonth      1           0.3        0.5860
                  actipass       1         61.86        <.0001
Independent       age           12         30.39        0.0024
                  dt             1          0.01        0.9378
                  prev           1         23.32        <.0001
                  timemonth      1          0.29        0.5882
                  actipass       1         61.81        <.0001
AR(1)             age           12         30.54        0.0023
                  dt             1          0.02        0.8974
                  prev           1         22.94        <.0001
                  timemonth      1          0.27        0.6008
                  actipass       1         62.00        <.0001

Table 4.3: Score statistics for Type III GEE

The Type III score statistics show that the age, prev and actipass variables are significant at the 5% level under all three correlation structures. The magnitudes of the score statistics differ very little across the three correlation structures.
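The p-values in Table 4.3 are upper-tail probabilities of a chi-square distribution with the listed degrees of freedom. As a rough check, the sketch below evaluates that tail probability with the standard library only; the function name is an illustrative assumption, and only df = 1 and even df are handled because those are the cases needed here. Since the tabulated chi-square values are rounded to two decimals, p-values recomputed for very small statistics (such as dt) will not match the table exactly.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable.

    Hypothetical helper covering df == 1 and even df only.
    """
    if df == 1:
        # A chi-square with 1 df is a squared standard normal.
        return math.erfc(math.sqrt(x / 2.0))
    if df % 2 == 0:
        # Closed form for even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        half = x / 2.0
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    raise NotImplementedError("odd df > 1 is not covered by this sketch")

p_age = chi2_sf(30.39, 12)      # age, exchangeable structure
p_actipass = chi2_sf(61.86, 1)  # actipass, exchangeable structure
print(round(p_age, 4), p_actipass < 0.0001)
```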

4.8 Weighted Generalized Estimating Equa-