Chapter 1 Introduction
5.2 The Generalized Linear Mixed Model
5.2.8 Inference for Generalized Linear Mixed Models
plasia. The sample comprised 490 white premenopausal women who had no breast disease. Their comparison is between a model fitted using numerical quadrature and a model fitted using GEE, rather than between PQL and MQL. They examine one representative of each approach: for SS models they look at the mixed effects logistic model, while for PA models they use the GEE approach of Liang et al. (1986). They compare the two approaches for the case of a single covariate x, pointing out that the results generalize easily to the case of several covariates.
validate them or develop corrections. Simulations carried out by Engel and Buist (1996) suggest that the procedures for confidence intervals and significance tests developed for ordinary mixed models perform well enough for practical use when applied to the adjusted dependent variate.
The GLIMMIX procedure (macro) in SAS, which can fit both MQL (Breslow and Clayton, 1993) and PQL (Schall, 1991) models using repeated calls to PROC MIXED, provides Type III Wald statistics; these enable us to test the significance of any term in the model, conditional on the remaining model terms. When the scale parameter a(φ) is known, these statistics are approximately distributed as chi-squared with rank(L) degrees of freedom, where L is the matrix used to formulate the hypothesis test. If we have overdispersion (or underdispersion) in the case of the Poisson or binomial models and a(φ) is unknown, then the Wald statistic is divided by the rank of L, and the result is approximately distributed as F with ν1 and ν2 degrees of freedom, where ν1 = rank(L) and ν2 in simple cases corresponds to the degrees of freedom required to estimate a(φ), but in more complex cases must be approximated using a Satterthwaite-type procedure. The GLMM procedure in Genstat can produce Type I Wald statistics, but it should be noted that these depend on the order in which terms are included in the model. So if we are using these statistics for model selection, we should refit the model with the terms in a number of different orderings. An alternative likelihood-based test statistic, analogous to the change in deviance in generalized linear models, will be considered later on.
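As a rough illustration of the Wald statistic described above, the following Python sketch computes W = (Lβ)'(LVL')^{-1}(Lβ) for a hypothesis Lβ = 0 and the scaled value W/rank(L). It is a minimal sketch under stated assumptions, not the SAS or Genstat implementation: the function names `solve` and `wald_statistic`, the estimates `beta` and their covariance matrix `V` are all hypothetical, L is assumed to have full row rank (so rank(L) is its number of rows), and neither ν2 nor a p-value is computed.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def wald_statistic(L, beta, V):
    """Return (W, W / rank(L), rank(L)) for H0: L beta = 0.

    Assumption: L has full row rank, so rank(L) equals its number of rows.
    """
    q, p = len(L), len(beta)
    Lb = [sum(L[i][j] * beta[j] for j in range(p)) for i in range(q)]
    LV = [[sum(L[i][k] * V[k][j] for k in range(p)) for j in range(p)]
          for i in range(q)]
    LVLt = [[sum(LV[i][k] * L[j][k] for k in range(p)) for j in range(q)]
            for i in range(q)]
    x = solve(LVLt, Lb)  # x = (L V L')^{-1} (L beta)
    W = sum(Lb[i] * x[i] for i in range(q))
    return W, W / q, q


# Hypothetical fixed-effect estimates and covariance matrix (illustrative only).
beta = [0.5, 1.2, -0.8]
V = [[0.04, 0.0, 0.0],
     [0.0, 0.09, 0.0],
     [0.0, 0.0, 0.16]]
# Test jointly that the second and third coefficients are zero.
L = [[0, 1, 0],
     [0, 0, 1]]

W, F, nu1 = wald_statistic(L, beta, V)
print(W, F, nu1)
```

With a(φ) known, W would be referred to a chi-squared distribution on ν1 degrees of freedom; with a(φ) estimated, the scaled value would be referred to an F distribution on ν1 and ν2 degrees of freedom, where ν2 has to be supplied separately.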
By way of comparison with SAS, in the GLMM implementation in Genstat the parameter estimates are calculated using either the method of Schall (PQL) or the marginal method of Breslow and Clayton (MQL). Ignoring the random effects u, this gives a linear predictor Xβ on the scale of the link function g(.). Predicted means are calculated on this transformed scale in the same way that REML calculates them, that is, by ignoring the random effects.
Consider how REML calculates the predicted means when the response is normally distributed. The predicted means are based on the estimates of the parameters in the model y = Xβ + Zu + e. If the design is balanced and orthogonal, then the tables of means produced by REML for fixed model terms are the same as the ordinary means. There is no such correspondence with unbalanced data, such as the Kilifi data. With REML the means are calculated from a linear transformation of the estimated parameter values, taking no account of the frequency counts for the different factor combinations. These predicted means will therefore correspond to averages over the factor combinations only with orthogonal data. In the other cases, tables of means can be thought of as mean effects of factor levels adjusted for the mean values of any covariates and for any lack of balance in other factors; that is, as the means we would have expected if the data had been orthogonal.
We should note that these means are not of the same type as those produced by default by the PREDICT directive in Genstat, where the marginal frequencies are used as weights for the averages over the factor combinations. Predicted means are calculated using all the parameter estimates and taking means over the model terms not present in the table. If we require the predicted means for the fixed model terms, these means need only be taken over the estimates for the fixed model terms, since means over the random terms will always be zero.
These predicted means on the transformed scale are referred to as, for example, “predicted means for age”. GLMM will also print a table headed “Back transformed means (on the original scale)”, which in the case of binary data with a logit link are simply found by applying the antilogit function
µ = g^{-1}(η) = exp(η)/{1 + exp(η)}.
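The back-transformation can be sketched in a few lines of Python. This is only an illustration of the antilogit formula above: the function name `antilogit`, the age-group labels and the predicted means on the logit scale are hypothetical and are not taken from any fitted model.

```python
import math


def antilogit(eta):
    """Inverse of the logit link, exp(eta) / (1 + exp(eta)).

    The two branches are algebraically identical; they are split only so
    that math.exp is never called with a large positive argument.
    """
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


# Hypothetical predicted means on the logit scale for three age groups.
predicted_logit = {"age group 1": -2.0, "age group 2": 0.0, "age group 3": 1.5}

# Back-transformed means on the probability scale.
back_transformed = {k: antilogit(v) for k, v in predicted_logit.items()}
for group, prob in back_transformed.items():
    print(group, round(prob, 4))
```

A predicted mean of zero on the logit scale corresponds to a probability of one half, and predicted means of equal size but opposite sign give probabilities that sum to one.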
Genstat also issues a statement to emphasize that the “means are probabilities not expected values”. Thus inference should be carried out on the transformed scale, and the back-transformed means serve only as an intuitive guide to interpreting the results. Note that in Genstat there is a choice between the Schall (PQL) method and the marginal method of Breslow and Clayton (MQL). As stated earlier, the only difference between the two methods is in the way the parameter estimates are formed.
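The distinction drawn above between tables of means that ignore the frequency counts and means that use the marginal frequencies as weights can be illustrated with a small Python sketch. Everything here is hypothetical: the two factors, the fitted values on the link scale, the cell counts and the function names are invented for illustration, and the code averages cell values directly rather than reproducing Genstat's calculations from parameter estimates.

```python
def equal_weight_means(cell_eta):
    """Mean for each level of the first factor, averaging the second factor
    with equal weights (no account taken of the cell frequencies)."""
    sums, counts = {}, {}
    for (a, _b), eta in cell_eta.items():
        sums[a] = sums.get(a, 0.0) + eta
        counts[a] = counts.get(a, 0) + 1
    return {a: sums[a] / counts[a] for a in sums}


def marginal_weight_means(cell_eta, cell_n):
    """Mean for each level of the first factor, weighting the levels of the
    second factor by their marginal frequencies."""
    total = sum(cell_n.values())
    weight = {}
    for (_a, b), n in cell_n.items():
        weight[b] = weight.get(b, 0.0) + n / total
    means = {}
    for (a, b), eta in cell_eta.items():
        means[a] = means.get(a, 0.0) + weight[b] * eta
    return means


# Hypothetical fitted values on the link scale for each age x sex cell.
cell_eta = {
    ("young", "F"): -1.0, ("young", "M"): -0.5,
    ("old", "F"): 0.2, ("old", "M"): 0.8,
}
# Hypothetical, unbalanced frequency counts for the same cells.
cell_n = {
    ("young", "F"): 30, ("young", "M"): 10,
    ("old", "F"): 5, ("old", "M"): 55,
}

print(equal_weight_means(cell_eta))
print(marginal_weight_means(cell_eta, cell_n))
```

With these unbalanced counts the two sets of means differ; if every cell had the same count, the marginal weights would be equal and the two calculations would agree.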