
In linear mixed models, the marginal distribution of $Y$ can be computed analytically: $f(Y)$ is the density of a multivariate normal distribution.

For generalized linear mixed models, however, the marginal likelihood is difficult to evaluate because it involves $N$ $q$-dimensional integrals over the random effects (Vittinghoff et al., 2011; Bolker et al., 2009). The random-effects model can be fitted by maximizing the marginal likelihood, which is obtained by integrating out the random effects.

The likelihood is given by

\[
L(\beta, G, \phi) = \prod_{i=1}^{N} f_i(Y_i \mid \beta, G, \phi)
= \prod_{i=1}^{N} \int f_i(Y_i \mid U_i, \beta, \phi)\, f(U_i \mid G)\, dU_i
\tag{4.6}
\]

where $f_i(Y_i \mid U_i, \beta, \phi) = \prod_{j=1}^{n_i} f_{ij}(Y_{ij} \mid U_i, \beta, \phi)$ (Molenberghs and Verbeke, 2006). In general, numerical approximations have to be used to evaluate the likelihood of GLMMs.
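To make the integration problem concrete, the following minimal sketch (not from the source; the function name, data, and parameter values are hypothetical) approximates one cluster's contribution $f_i(Y_i \mid \beta, G, \phi)$ for a random-intercept logistic GLMM, where the integral in (4.6) is one-dimensional, using Gauss-Hermite quadrature. The full likelihood is the product of such terms over the $N$ clusters, and the scale parameter $\phi$ plays no role in the Bernoulli case.

```python
import numpy as np

def cluster_marginal_likelihood(y, X, beta, sigma_u, n_quad=20):
    """Approximate f_i(Y_i | beta, G) for a random-intercept logistic GLMM
    by Gauss-Hermite quadrature over the scalar U_i ~ N(0, sigma_u^2)."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_quad)
    lik = 0.0
    for x_k, w_k in zip(nodes, weights):
        # Change of variables u = sqrt(2) * sigma_u * node maps the
        # N(0, sigma_u^2) integral to the Gauss-Hermite standard form.
        u = np.sqrt(2.0) * sigma_u * x_k
        eta = X @ beta + u                       # linear predictor X_ij' beta + u_i
        p = 1.0 / (1.0 + np.exp(-eta))           # inverse logit link
        cond = np.prod(p**y * (1 - p)**(1 - y))  # prod_j f_ij(y_ij | u_i, beta)
        lik += w_k * cond
    return lik / np.sqrt(np.pi)

# Hypothetical data for one cluster with n_i = 4 binary responses
rng = np.random.default_rng(0)
X_i = np.column_stack([np.ones(4), rng.normal(size=4)])
y_i = np.array([1, 0, 1, 1])
print(cluster_marginal_likelihood(y_i, X_i, beta=np.array([0.2, 0.5]), sigma_u=1.0))
```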

4.4.1 Estimation: Approximation of the Integrand

The Laplace method is one approach to approximating the integrand and is a natural alternative when the exact likelihood function is difficult to compute (Molenberghs and Verbeke, 2006). When the integrand is approximated, the objective is to obtain tractable integrals with closed-form expressions, which makes numerical maximization of the approximated likelihood feasible (Molenberghs and Verbeke, 2006). Suppose we wish to approximate an integral of the form

\[
I = \int \exp(-q(x))\, dx
\tag{4.7}
\]

where $q(\cdot)$ is a well-behaved function attaining its minimum at $x = \tilde{x}$, with $q'(\tilde{x}) = 0$ and $q''(\tilde{x}) > 0$. We can consider the Taylor expansion about $\tilde{x}$ given by

\[
q(x) \approx q(\tilde{x}) + \tfrac{1}{2} q''(\tilde{x})(x - \tilde{x})^2 + \cdots.
\tag{4.8}
\]
This gives an approximation to (4.7) as

\[
\int \exp(-q(x))\, dx \approx \sqrt{\frac{2\pi}{q''(\tilde{x})}}\, \exp(-q(\tilde{x})).
\tag{4.9}
\]
A multivariate extension of (4.9) is often useful. Let $q(\alpha)$ be a well-behaved function with its minimum at $\alpha = \tilde{\alpha}$, with $q'(\tilde{\alpha}) = 0$ and $q''(\tilde{\alpha}) > 0$, where $q'$ and $q''$ are the gradient and Hessian of $q$, respectively. We have

\[
\int \exp(-q(\alpha))\, d\alpha \approx c\, |q''(\tilde{\alpha})|^{-1/2} \exp(-q(\tilde{\alpha}))
\tag{4.10}
\]
where $c$ is a constant depending on the dimension of the integral and $|q''(\tilde{\alpha})|$ is the determinant of the matrix $q''(\tilde{\alpha})$. Here $q''(\tilde{\alpha}) > 0$ means that the matrix $q''(\tilde{\alpha})$ is positive definite.
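As a quick numerical check (an illustrative sketch, not part of the source; the choice of $q$ is hypothetical), the snippet below applies (4.9) to a simple well-behaved $q(x)$ and compares the Laplace approximation with direct numerical quadrature of (4.7).

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize_scalar

# A hypothetical well-behaved q(x): unique minimum, q''(x_tilde) > 0.
def q(x):
    return 0.5 * x**2 + 0.1 * x**4

# Find the minimiser x_tilde and approximate q''(x_tilde) by finite differences.
x_tilde = minimize_scalar(q).x
h = 1e-4
q2 = (q(x_tilde + h) - 2.0 * q(x_tilde) + q(x_tilde - h)) / h**2

# Laplace approximation (4.9) versus direct quadrature of (4.7).
laplace = np.sqrt(2.0 * np.pi / q2) * np.exp(-q(x_tilde))
exact, _ = quad(lambda x: np.exp(-q(x)), -np.inf, np.inf)
print(f"Laplace: {laplace:.6f}  quadrature: {exact:.6f}")
```

The two values agree closely, and the Laplace approximation would be exact if $q$ were quadratic.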

4.4.2 Estimation: Approximation of the Data

There is another class of estimation approaches, based on a decomposition of the data into a mean and an error term combined with a Taylor series expansion of the mean, which is a non-linear function of the predictors. The methods in this class differ in the order of the Taylor approximation. The decomposition considered is

\[
Y_{ij} = \mu_{ij} + \varepsilon_{ij} = h(X_{ij}'\beta + Z_{ij}' U_i) + \varepsilon_{ij}
\tag{4.11}
\]

where $h(\cdot)$ is the inverse link function, and the error term $\varepsilon_{ij}$ has an appropriate distribution with variance equal to $\operatorname{var}(Y_{ij} \mid U_i) = \phi V(\mu_{ij})$. Here, $V(\cdot)$ is the usual variance function in the exponential family (Molenberghs and Verbeke, 2006). Consider a binary outcome with the logit link function. One then has

\[
\mu_{ij} = h(X_{ij}'\beta + Z_{ij}' U_i) = P_{ij}
= \frac{\exp(X_{ij}'\beta + Z_{ij}' U_i)}{1 + \exp(X_{ij}'\beta + Z_{ij}' U_i)}
\tag{4.12}
\]
where $h(X_{ij}'\beta + Z_{ij}' U_i)$ is the inverse of the logit link function, namely the logistic function, and $X_{ij}$ and $Z_{ij}$ are as in the definition of the generalized linear mixed model. This is the special case of a GLMM in which the exponential family is Bernoulli and the corresponding link function is $g(\mu) = \operatorname{logit}(\mu)$.
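For instance (a hypothetical numerical illustration), the conditional mean in (4.12) is simply the logistic function applied to the linear predictor:

```python
import numpy as np

def inverse_logit(eta):
    """h(eta) = exp(eta) / (1 + exp(eta)), the inverse of the logit link."""
    return 1.0 / (1.0 + np.exp(-eta))

# Hypothetical values: X_ij' beta = 0.4 and Z_ij' U_i = -1.0
print(inverse_logit(0.4 - 1.0))  # P(Y_ij = 1 | U_i) ≈ 0.354
```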

4.4.3 Penalized Quasi-Likelihood

Penalized quasi-likelihood (PQL) is one of the methods that approximate the data by a mean plus an error term with variance equal to $\operatorname{Var}(Y_{ij} \mid U_i)$. This method uses a Taylor expansion around current estimates $\hat{\beta}$ and $\hat{U}_i$ of the fixed and random effects, respectively (Bolker et al., 2009; Moeti, 2010). One then has

\begin{align*}
Y_{ij} = \mu_{ij} + \varepsilon_{ij} &= h(X_{ij}'\beta + Z_{ij}' U_i) + \varepsilon_{ij} \\
&\approx h(X_{ij}'\hat{\beta} + Z_{ij}'\hat{U}_i)
 + h'(X_{ij}'\hat{\beta} + Z_{ij}'\hat{U}_i)\, X_{ij}'(\beta - \hat{\beta})
 + h'(X_{ij}'\hat{\beta} + Z_{ij}'\hat{U}_i)\, Z_{ij}'(U_i - \hat{U}_i) + \varepsilon_{ij} \\
&= \hat{\mu}_{ij} + V(\hat{\mu}_{ij})\, X_{ij}'(\beta - \hat{\beta})
 + V(\hat{\mu}_{ij})\, Z_{ij}'(U_i - \hat{U}_i) + \varepsilon_{ij},
\tag{4.13}
\end{align*}
and, in vector notation for cluster $i$,
\[
Y_i = \hat{\mu}_i + \hat{V}_i X_i (\beta - \hat{\beta}) + \hat{V}_i Z_i (U_i - \hat{U}_i) + \varepsilon_i
\]

where $\hat{\mu}_i$ contains the values $\hat{\mu}_{ij} = h(X_{ij}'\hat{\beta} + Z_{ij}'\hat{U}_i)$, $\hat{V}_i$ is the diagonal matrix with elements $V(\hat{\mu}_{ij}) = h'(X_{ij}'\hat{\beta} + Z_{ij}'\hat{U}_i)$, and $X_i$ and $Z_i$ contain the rows $X_{ij}'$ and $Z_{ij}'$, respectively.

Re-ordering the above expression and pre-multiplying by $\hat{V}_i^{-1}$, we obtain the pseudo data
\[
Y_i^* \equiv \hat{V}_i^{-1}(Y_i - \hat{\mu}_i) + X_i\hat{\beta} + Z_i\hat{U}_i
\approx X_i\beta + Z_i U_i + \varepsilon_i^*,
\tag{4.14}
\]
where $\varepsilon_i^* = \hat{V}_i^{-1}\varepsilon_i$ has zero mean. This can be viewed as a linear mixed model for the pseudo data $Y_i^*$ with error term $\varepsilon_i^*$, which yields the following algorithm for fitting the original generalized linear mixed model.

Algorithm

Step 1: Given starting values for the parameters $\beta$, $\phi$ and $G$, empirical Bayes estimates $\hat{U}_i$ of the random effects are calculated and the pseudo data $Y_i^*$ are computed.

Step 2: The approximate linear mixed model is fitted, which gives updated estimates for $\beta$, $\phi$ and $G$. These updated estimates are then used to update the pseudo data. The whole scheme is iterated until convergence is reached, and the resulting estimates are called penalized quasi-likelihood estimates. They are obtained by optimizing a quasi-likelihood function that involves the first- and second-order conditional moments, augmented with a penalty term on the random effects (Molenberghs and Verbeke, 2006).
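The sketch below illustrates how these two steps might look in code. It is a minimal illustration under simplifying assumptions, not a full implementation: a random-intercept logistic GLMM with scale $\phi = 1$ and a known random-effect variance $\sigma_u^2$, so the variance-component update in Step 2 is omitted and the working linear mixed model is solved via Henderson's mixed-model equations. All names and data are hypothetical.

```python
import numpy as np

def pql_logistic(y, X, Z, sigma2_u, n_iter=50, tol=1e-8):
    """Minimal PQL sketch for a logistic GLMM with known G = sigma2_u * I.
    A full implementation would also update the variance components when
    fitting the working linear mixed model."""
    n, p = X.shape
    q = Z.shape[1]
    beta, u = np.zeros(p), np.zeros(q)
    Ginv = np.eye(q) / sigma2_u
    for _ in range(n_iter):
        eta = X @ beta + Z @ u
        mu = 1.0 / (1.0 + np.exp(-eta))   # h(eta), inverse logit
        v = mu * (1.0 - mu)               # h'(eta) = V(mu) for Bernoulli/logit
        y_star = eta + (y - mu) / v       # pseudo data, equation (4.14)
        W = np.diag(v)                    # working weights, Var(eps*) = W^{-1}
        # Henderson's mixed-model equations for the working LMM
        C = np.block([[X.T @ W @ X, X.T @ W @ Z],
                      [Z.T @ W @ X, Z.T @ W @ Z + Ginv]])
        rhs = np.concatenate([X.T @ W @ y_star, Z.T @ W @ y_star])
        sol = np.linalg.solve(C, rhs)
        beta_new, u_new = sol[:p], sol[p:]
        converged = np.max(np.abs(np.concatenate([beta_new - beta,
                                                  u_new - u]))) < tol
        beta, u = beta_new, u_new
        if converged:
            break
    return beta, u

# Hypothetical data: 10 clusters of 5 binary observations, random intercepts
rng = np.random.default_rng(1)
n_clust, n_per = 10, 5
X = np.column_stack([np.ones(n_clust * n_per), rng.normal(size=n_clust * n_per)])
Z = np.kron(np.eye(n_clust), np.ones((n_per, 1)))  # cluster indicators
eta = X @ np.array([-0.5, 1.0]) + Z @ rng.normal(scale=1.0, size=n_clust)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))
beta_hat, u_hat = pql_logistic(y, X, Z, sigma2_u=1.0)
print(beta_hat)
```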

4.4.4 Marginal Quasi-Likelihood

Marginal quasi-likelihood (MQL) is an approximation method similar to PQL. However, it is based on a linear Taylor expansion of the mean around the current estimate $\hat{\beta}$ of the fixed effects and around $U_i = 0$ for the random effects (Bolker et al., 2009; Moeti, 2010). This gives the same expansion as shown for PQL, but the current predictor is now of the form $h(X_{ij}'\hat{\beta})$. The pseudo data are now of the form

\[
Y_i^* \equiv \hat{V}_i^{-1}(Y_i - \hat{\mu}_i) + X_i\hat{\beta}
\tag{4.15}
\]

and satisfy the approximate linear mixed model

\[
Y_i^* \approx X_i\beta + Z_i U_i + \varepsilon_i^*.
\tag{4.16}
\]

Model fitting again iterates between calculating the pseudo data and fitting the approximate linear mixed model to these pseudo data (Molenberghs and Verbeke, 2006). The resulting estimates are known as marginal quasi-likelihood (MQL) estimates.
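In code, the only change relative to the PQL sketch above would be the linearization point: the working response is built around $\hat{\beta}$ with $U_i = 0$. The fragment below (a hypothetical companion to the earlier sketch) shows the modified pseudo-data step.

```python
import numpy as np

def mql_pseudo_data(y, X, beta):
    """MQL working response: linearize around beta-hat and U = 0, so the
    random effects do not enter the linear predictor (cf. equation (4.15))."""
    eta = X @ beta                  # X beta only; PQL would use X beta + Z u
    mu = 1.0 / (1.0 + np.exp(-eta))
    v = mu * (1.0 - mu)
    return eta + (y - mu) / v       # V^{-1}(y - mu) + X beta

# With the hypothetical data from the PQL sketch:
# y_star = mql_pseudo_data(y, X, beta_hat)
```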

4.4.5 Discussion of MQL and PQL

The main difference between penalized quasi-likelihood (PQL) and marginal quasi-likelihood (MQL) is that MQL does not incorporate the random effects $U_i$ in the linear predictor (Bolker et al., 2009). Both methods are based on similar ideas and have broadly similar properties. However, the accuracy of both depends on the accuracy of the linear mixed model for the pseudo data $Y_i^*$. The Laplace method, PQL and MQL all perform poorly for binary outcomes when only a small number of repeated observations per subject is available (Molenberghs and Verbeke, 2006). MQL completely ignores the random-effects variability in the linearization of the mean. The Laplace method is more accurate than PQL, but it is slower and less flexible (Bolker et al., 2009). MQL remains biased as the number of measurements increases, whereas PQL becomes consistent.

4.5 Generalized Linear Mixed Models (GLMMs) in