
Chapter 7 Modeling of the joint determinants of malaria Rapid Diagnosis

7.2 Joint model formulation for multivariate GLMM

The primary objective of joint modelling is to provide a framework in which questions of scientific interest about the relationships among multiple outcomes, and between these outcomes and other factors, can be addressed. The generalized linear mixed model (GLMM) introduced previously can easily be adapted to situations where various outcomes of a different nature are observed (Molenberghs and Verbeke, 2005). Consider a conditional random effects model with bivariate responses.

Let the two outcomes be $y_{i1}$ and $y_{i2}$, stacked as $y_i = (y_{i1}^T, y_{i2}^T)^T$, where $y_{i1} = (y_{i11}, \ldots, y_{i1n_i})^T$ and $y_{i2} = (y_{i21}, \ldots, y_{i2n_i})^T$ collect the measurements on the first and second outcome. Here, $y_{i1j}$ and $y_{i2j}$, $j = 1, \ldots, n_i$, are conditionally independent given $b_{i1}$ and $b_{i2}$, with densities $f_1(\cdot)$ and $f_2(\cdot)$ in the exponential family for the first and second outcomes. Furthermore, $y_{i1}$ and $y_{i2}$ are conditionally independent given $b_i = (b_{i1}^T, b_{i2}^T)^T$, and the responses on different units $i$ are independent. Let $g_1(\cdot)$ and $g_2(\cdot)$ be appropriate link functions for $f_1$ and $f_2$, and denote the conditional means of $y_{i1j}$ and $y_{i2j}$ by $\mu_{i1j}$ and $\mu_{i2j}$, with $\mu_{i1} = (\mu_{i11}, \ldots, \mu_{i1n_i})^T$ and $\mu_{i2} = (\mu_{i21}, \ldots, \mu_{i2n_i})^T$ (Gueorguieva and Agresti, 2011). At the first stage, the mixed model specification is assumed to be

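$$\mu_{i1} = g_1^{-1}(X_{i1}\beta_1 + Z_{i1}b_{i1}) \tag{7.1}$$

$$\mu_{i2} = g_2^{-1}(X_{i2}\beta_2 + Z_{i2}b_{i2}) \tag{7.2}$$

where $\beta_1$ and $\beta_2$ are $(p_1 \times 1)$ and $(p_2 \times 1)$ dimensional unknown parameter vectors, $X_{i1}$ and $X_{i2}$ are $(n_i \times p_1)$ and $(n_i \times p_2)$ dimensional design matrices for the fixed effects, $Z_{i1}$ and $Z_{i2}$ are $(n_i \times q_1)$ and $(n_i \times q_2)$ design matrices for the random effects, and $g_1$ and $g_2$ are applied componentwise to $\mu_{i1}$ and $\mu_{i2}$ (Gueorguieva, 2001). At the second stage,

$$b_i = \begin{pmatrix} b_{i1} \\ b_{i2} \end{pmatrix} \sim \text{i.i.d. } MVN(0, \Sigma) = MVN\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{12}^T & \Sigma_{22} \end{pmatrix} \right), \tag{7.3}$$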

where $\Sigma$, $\Sigma_{11}$ and $\Sigma_{22}$ are unknown positive definite matrices. When $\Sigma_{12} = 0$, the above model is equivalent to two separate GLMMs for the two outcome variables, which corresponds to assuming complete independence between the outcomes. Advantages of the joint model include control of the type I error rate in multiple tests, possible gains in efficiency of the parameter estimates, and the ability to answer intrinsically multivariate questions (Molenberghs and Verbeke, 2005).

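The marginal means and the marginal variances of $y_{i1}$ and $y_{i2}$ for the model defined by (7.1), (7.2) and (7.3) are the same as those of the GLMM considering one variable at a time:

$$E(y_{i1}) = E[E(y_{i1} \mid b_{i1})] = E[\mu_{i1}], \qquad E(y_{i2}) = E[E(y_{i2} \mid b_{i2})] = E[\mu_{i2}]$$

and

$$\mathrm{var}(y_{i1}) = E[\phi_1 V(\mu_{i1})] + \mathrm{Var}[\mu_{i1}], \qquad \mathrm{var}(y_{i2}) = E[\phi_2 V(\mu_{i2})] + \mathrm{Var}[\mu_{i2}],$$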

where $V(\mu_{i1})$ and $V(\mu_{i2})$ denote the variance functions corresponding to the exponential family distributions of the two response variables, $\phi_1$ and $\phi_2$ are the corresponding dispersion parameters, $\mathrm{Var}[\mu_{i1}] = \mathrm{Var}[E(y_{i1} \mid b_{i1})]$ and $\mathrm{Var}[\mu_{i2}] = \mathrm{Var}[E(y_{i2} \mid b_{i2})]$. The marginal covariance between $y_{i1}$ and $y_{i2}$ equals the covariance between $\mu_{i1}$ and $\mu_{i2}$, that is, $\mathrm{Cov}(y_{i1}, y_{i2}) = \mathrm{Cov}(\mu_{i1}, \mu_{i2})$, because the conditional covariance given $b_i$ is zero. This property is a consequence of the key assumption of conditional independence between the two response variables, and it allows model fitting methods to be extended from the univariate to the multivariate GLMM.
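The covariance identity can be checked by simulation. The sketch below is only an illustration under assumed settings: a single shared random intercept, a normal first outcome with identity link, a Bernoulli second outcome with logit link, and arbitrary parameter values (the names `mu1`, `lam`, `sigma`, `mu2`, `tau` are hypothetical and not taken from the chapter's data). It compares the sample covariance of the outcomes with the sample covariance of their conditional means.

```python
import math
import random


def simulate(n=200_000, mu1=1.0, lam=1.0, sigma=1.0, mu2=-0.5, tau=1.0, seed=7):
    """Simulate a continuous and a binary outcome sharing one random intercept.

    All parameter values are illustrative assumptions, not estimates.
    Returns the outcomes (y1, y2) and their conditional means (m1, m2).
    """
    rng = random.Random(seed)
    y1, y2, m1, m2 = [], [], [], []
    for _ in range(n):
        b = rng.gauss(0.0, tau)                   # shared random effect b_i
        c1 = mu1 + lam * b                        # E(y1 | b), identity link
        c2 = 1.0 / (1.0 + math.exp(-(mu2 + b)))   # E(y2 | b), logit link
        m1.append(c1)
        m2.append(c2)
        y1.append(rng.gauss(c1, sigma))           # y1 | b is normal
        # y2 | b is Bernoulli and drawn independently of y1 given b
        y2.append(1.0 if rng.random() < c2 else 0.0)
    return y1, y2, m1, m2


def sample_cov(a, b):
    """Unbiased sample covariance of two equally long sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


y1, y2, m1, m2 = simulate()
cov_y = sample_cov(y1, y2)    # estimate of Cov(y_i1, y_i2)
cov_mu = sample_cov(m1, m2)   # estimate of Cov(mu_i1, mu_i2)
print("Cov(y1, y2)   =", round(cov_y, 4))
print("Cov(mu1, mu2) =", round(cov_mu, 4))
```

The two estimates should agree up to Monte Carlo error. Setting `tau=0.0` removes the shared random effect, and both covariances then shrink towards zero.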

Two strategies are available for jointly modelling two outcomes of mixed type. One could express the joint distribution as the product of the marginal distribution of one response variable and the conditional distribution of the other response given the first. However, this factorization gives no simple expression for the association between the two endpoints. To overcome this problem, it is important to treat the surrogate as a binary variable and to model both endpoints directly. The bivariate model for $y_{i1}$ and $y_{i2}$ can then be described by a probit-normal model, and an alternative can be formulated based on the bivariate Plackett density, which gives the Plackett-Dale model (Plackett, 1965).

To use a Probit-normal formulation, assume the following models (Molenberghs and Verbeke, 2005).


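$$y_{i1} = \mu_1 + \beta_1 X_i + \varepsilon_{i1} \tag{7.4}$$

$$y_{i2} = \mu_2 + \beta_2 X_i + \varepsilon_{i2} \tag{7.5}$$

where $\mu_1$ and $\mu_2$ are intercepts, $\beta_1$ and $\beta_2$ are fixed effects, and $\varepsilon_{i1}$ and $\varepsilon_{i2}$ are correlated errors with

$$\begin{pmatrix} \varepsilon_{i1} \\ \varepsilon_{i2} \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma^2 & \dfrac{\rho\sigma}{\sqrt{1-\rho^2}} \\ \dfrac{\rho\sigma}{\sqrt{1-\rho^2}} & \dfrac{1}{1-\rho^2} \end{pmatrix} \right).$$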

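Equations (7.4) and (7.5) define a bivariate normal density. It follows that $y_{i1}$ is univariate normal with variance $\sigma^2$, so $\mu_1$, $\beta_1$ and $\sigma^2$ can be estimated from a regression with response $y_{i1}$ and covariate $X_i$. The conditional density of $y_{i2}$ given $X_i$ and $y_{i1}$ is

$$y_{i2} \sim N\left[ \left( \mu_2 - \frac{\rho}{\sigma\sqrt{1-\rho^2}}\,\mu_1 \right) + \left( \beta_2 - \frac{\rho}{\sigma\sqrt{1-\rho^2}}\,\beta_1 \right) X_i + \frac{\rho}{\sigma\sqrt{1-\rho^2}}\, y_{i1};\ 1 \right],$$

that is, a normal density with unit variance. The corresponding probability is

$$P(y_{i2} = 1 \mid y_{i1}, X_i) = \Phi(\lambda_0 + \lambda_X X_i + \lambda_y y_{i1}), \tag{7.6}$$

where

$$\lambda_0 = \mu_2 - \frac{\rho}{\sigma\sqrt{1-\rho^2}}\,\mu_1, \tag{7.7}$$

$$\lambda_X = \beta_2 - \frac{\rho}{\sigma\sqrt{1-\rho^2}}\,\beta_1, \tag{7.8}$$

$$\lambda_y = \frac{\rho}{\sigma\sqrt{1-\rho^2}}, \tag{7.9}$$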

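and $\Phi$ is the standard normal cumulative distribution function. The $\lambda$ parameters can be estimated by fitting model (7.6) to $y_{i2}$ with covariates $X_i$ and $y_{i1}$. Given the parameters from the linear regression on $y_{i1}$ ($\mu_1$, $\beta_1$ and $\sigma^2$) and from the probit regression ($\lambda_0$, $\lambda_X$ and $\lambda_y$), the parameters for $y_{i2}$ can be obtained by inverting equations (7.7) – (7.9):

$$\mu_2 = \lambda_0 + \lambda_y \mu_1, \tag{7.10}$$

$$\beta_2 = \lambda_X + \lambda_y \beta_1, \tag{7.11}$$

$$\rho^2 = \frac{\lambda_y^2\sigma^2}{1 + \lambda_y^2\sigma^2}. \tag{7.12}$$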

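Here, the asymptotic variance of $\hat{\sigma}^2$ is $2\sigma^4/N$. Together with the asymptotic covariance of $(\hat{\lambda}_0, \hat{\lambda}_X, \hat{\lambda}_y)$, this yields the covariance matrix of the six parameters. The asymptotic covariance of $(\mu_2, \beta_2, \rho)$ can then be obtained with the delta method from the derivatives of equations (7.10) – (7.12) with respect to the six parameters $(\mu_1, \beta_1, \sigma^2, \lambda_0, \lambda_X, \lambda_y)$:

$$\frac{\partial(\mu_2, \beta_2, \rho)}{\partial(\mu_1, \beta_1, \sigma^2, \lambda_0, \lambda_X, \lambda_y)} = \begin{pmatrix} \lambda_y & 0 & 0 & 1 & 0 & \mu_1 \\ 0 & \lambda_y & 0 & 0 & 1 & \beta_1 \\ 0 & 0 & h_1 & 0 & 0 & h_2 \end{pmatrix},$$

where

$$h_1 = \frac{1}{2\rho}\,\frac{\lambda_y^2}{(1 + \lambda_y^2\sigma^2)^2}, \qquad h_2 = \frac{1}{2\rho}\,\frac{2\lambda_y\sigma^2}{(1 + \lambda_y^2\sigma^2)^2}.$$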
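The two-step route through equations (7.7) to (7.12) and the delta-method Jacobian can be sketched in a few lines. The function names below (`to_probit`, `from_probit`, `jacobian`, `delta_method`) and the numerical values are hypothetical and chosen only for illustration; the sign of $\rho$ is taken from the sign of the probit coefficient of the first outcome, which is an assumption consistent with (7.9). The analytic Jacobian is compared with a finite-difference approximation.

```python
import math


def to_probit(mu1, beta1, sigma2, mu2, beta2, rho):
    """Equations (7.7)-(7.9): probit coefficients implied by the bivariate model."""
    k = rho / (math.sqrt(sigma2) * math.sqrt(1.0 - rho ** 2))
    return mu2 - k * mu1, beta2 - k * beta1, k   # lambda_0, lambda_X, lambda_y


def from_probit(mu1, beta1, sigma2, lam0, lam_x, lam_y):
    """Equations (7.10)-(7.12): recover (mu2, beta2, rho)."""
    mu2 = lam0 + lam_y * mu1
    beta2 = lam_x + lam_y * beta1
    rho2 = lam_y ** 2 * sigma2 / (1.0 + lam_y ** 2 * sigma2)
    rho = math.copysign(math.sqrt(rho2), lam_y)  # assumed: rho has the sign of lambda_y
    return mu2, beta2, rho


def jacobian(mu1, beta1, sigma2, lam0, lam_x, lam_y):
    """Analytic 3 x 6 Jacobian of (mu2, beta2, rho); requires rho != 0."""
    rho = from_probit(mu1, beta1, sigma2, lam0, lam_x, lam_y)[2]
    denom = (1.0 + lam_y ** 2 * sigma2) ** 2
    h1 = lam_y ** 2 / (2.0 * rho * denom)
    h2 = 2.0 * lam_y * sigma2 / (2.0 * rho * denom)
    return [
        [lam_y, 0.0, 0.0, 1.0, 0.0, mu1],
        [0.0, lam_y, 0.0, 0.0, 1.0, beta1],
        [0.0, 0.0, h1, 0.0, 0.0, h2],
    ]


def numeric_jacobian(params, eps=1e-6):
    """Central finite-difference Jacobian of from_probit, used as a check."""
    cols = []
    for k in range(6):
        up, down = list(params), list(params)
        up[k] += eps
        down[k] -= eps
        f_up, f_down = from_probit(*up), from_probit(*down)
        cols.append([(u - d) / (2.0 * eps) for u, d in zip(f_up, f_down)])
    return [[cols[k][r] for k in range(6)] for r in range(3)]


def delta_method(jac, cov):
    """Delta-method covariance J C J^T for a 3 x 6 Jacobian and a 6 x 6 covariance."""
    n = len(cov)
    jc = [[sum(row[k] * cov[k][c] for k in range(n)) for c in range(n)] for row in jac]
    return [[sum(jc_row[k] * other[k] for k in range(n)) for other in jac] for jc_row in jc]


# Illustrative values only: (mu1, beta1, sigma^2, mu2, beta2, rho)
truth = (2.0, 0.5, 4.0, -0.3, 0.8, 0.6)
lam0, lam_x, lam_y = to_probit(*truth)
six = (2.0, 0.5, 4.0, lam0, lam_x, lam_y)
recovered = from_probit(*six)
J = jacobian(*six)
J_num = numeric_jacobian(six)
print("recovered (mu2, beta2, rho):", recovered)
```

In practice the 6 x 6 covariance passed to `delta_method` would be block diagonal, with one block from the linear regression and one from the probit regression.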

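Furthermore, joint estimation can be obtained by maximizing the likelihood based on (7.1) and (7.2) (Molenberghs et al., 2001). To formulate the Plackett-Dale model, assume that the cumulative distribution functions of $y_{i1}$ and $y_{i2}$ are given by $F_{y_{i1}}$ and $F_{y_{i2}}$ (Plackett, 1965). The joint distribution function is then

$$F_{y_{i1}, y_{i2}} = \begin{cases} \dfrac{1 + (F_{y_{i1}} + F_{y_{i2}})(\psi_i - 1) - C(F_{y_{i1}}, F_{y_{i2}}, \psi_i)}{2(\psi_i - 1)} & \text{if } \psi_i \neq 1, \\[2ex] F_{y_{i1}} F_{y_{i2}} & \text{if } \psi_i = 1, \end{cases}$$

where $C(F_{y_{i1}}, F_{y_{i2}}, \psi_i) = \sqrt{[1 + (\psi_i - 1)(F_{y_{i1}} + F_{y_{i2}})]^2 + 4\psi_i(1 - \psi_i)F_{y_{i1}}F_{y_{i2}}}$ and $\psi_i$ is the global odds ratio that measures the association between the two outcomes.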

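A bivariate Plackett "density" function $G_i(y_{i1}, y_{i2})$ for mixed continuous-binary outcomes can be derived from this distribution function. Let the success probability of $y_{i2}$ be denoted by $\pi_i$, and define $G_i(y_{i1}, y_{i2})$ by specifying $G_i(t, 0)$ and $G_i(t, 1)$ such that they sum to $f_{y_{i1}}(t)$. Setting

$$G_i(t, 0) = \frac{\partial F_{y_{i1}, y_{i2}}(t, 0)}{\partial t}$$

gives

$$G_i(t, 0) = \begin{cases} \dfrac{f_{y_{i1}}(t)}{2} \left( 1 - \dfrac{1 + F_{y_{i1}}(t)(\psi_i - 1) - F_{y_{i2}}(0)(\psi_i + 1)}{C(F_{y_{i1}}(t),\, 1 - \pi_i,\, \psi_i)} \right) & \text{if } \psi_i \neq 1, \\[2ex] f_{y_{i1}}(t)(1 - \pi_i) & \text{if } \psi_i = 1, \end{cases}$$

with $F_{y_{i2}}(0) = 1 - \pi_i$, and

$$G_i(t, 1) = f_{y_{i1}}(t) - G_i(t, 0).$$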
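As a numerical illustration of the Plackett construction, the sketch below evaluates the joint distribution function and the mixed "density" for a normal first outcome and a binary second outcome. It assumes the standard Plackett form, in which the distribution function subtracts a square-root term $C$ and the density for $\psi_i \neq 1$ carries a factor $1/2$; the function names and parameter values are hypothetical. It then checks that $G_i(t, 0)$ matches a finite-difference derivative of the joint distribution function.

```python
import math


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def c_term(f1, f2, psi):
    """Square-root term C(F1, F2, psi) of the Plackett distribution."""
    return math.sqrt((1.0 + (psi - 1.0) * (f1 + f2)) ** 2
                     + 4.0 * psi * (1.0 - psi) * f1 * f2)


def plackett_cdf(f1, f2, psi):
    """Joint distribution function built from two marginals and odds ratio psi."""
    if psi == 1.0:
        return f1 * f2                      # independence
    return (1.0 + (f1 + f2) * (psi - 1.0) - c_term(f1, f2, psi)) / (2.0 * (psi - 1.0))


def g0(t, mu, sigma, pi, psi):
    """G_i(t, 0): contribution of y1 = t together with y2 = 0."""
    z = (t - mu) / sigma
    f = norm_pdf(z) / sigma                 # marginal density of y1 at t
    big_f1 = norm_cdf(z)                    # F_{y1}(t)
    big_f2 = 1.0 - pi                       # F_{y2}(0) = P(y2 = 0)
    if psi == 1.0:
        return f * big_f2
    ratio = (1.0 + big_f1 * (psi - 1.0) - big_f2 * (psi + 1.0)) / c_term(big_f1, big_f2, psi)
    return 0.5 * f * (1.0 - ratio)


def g1(t, mu, sigma, pi, psi):
    """G_i(t, 1) = f_{y1}(t) - G_i(t, 0)."""
    f = norm_pdf((t - mu) / sigma) / sigma
    return f - g0(t, mu, sigma, pi, psi)


# Illustrative values only
t, mu, sigma, pi, psi = 0.3, 0.0, 1.5, 0.4, 2.5
step = 1e-5
finite_diff = (plackett_cdf(norm_cdf((t + step - mu) / sigma), 1.0 - pi, psi)
               - plackett_cdf(norm_cdf((t - step - mu) / sigma), 1.0 - pi, psi)) / (2.0 * step)
print("G(t, 0) analytic:", g0(t, mu, sigma, pi, psi))
print("G(t, 0) numeric :", finite_diff)
```

With `psi = 1.0` the two outcomes are independent and `g0` reduces to the marginal density of the continuous outcome times $1 - \pi_i$.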

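Moreover, assume $y_{i1} \sim N(\mu_i, \sigma^2)$ with $\mu_i = \mu_1 + \beta_1 X_i$ and $\text{logit}(\pi_i) = \mu_2 + \beta_2 X_i$. For

$$\theta_i = \begin{pmatrix} \mu_i \\ \sigma^2 \\ \pi_i \\ \psi \end{pmatrix} \quad \text{and} \quad \eta_i = \begin{pmatrix} \mu_i \\ \ln(\sigma^2) \\ \text{logit}(\pi_i) \\ \ln(\psi) \end{pmatrix},$$

estimates of the parameters $\nu = (\mu_1, \beta_1, \mu_2, \beta_2, \ln(\sigma^2), \ln(\psi))$ are obtained by solving the estimating equation $U(\nu) = 0$ with a Newton-Raphson iteration scheme, where $U(\nu)$ is given by

$$U(\nu) = \sum_{i=1}^{n} \left( \frac{\partial \eta_i}{\partial \nu} \right)^T \left[ \left( \frac{\partial \eta_i}{\partial \theta_i} \right)^T \right]^{-1} \frac{\partial}{\partial \theta_i} \ln G_i(y_{i1}, y_{i2}).$$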
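The Newton-Raphson step used to solve an estimating equation of the form $U(\nu) = 0$ can be illustrated on a much simpler problem than the Plackett-Dale score. The sketch below is a scalar illustration only, with a hypothetical intercept-only logistic model and made-up data; the score in the chapter is vector-valued and would need a matrix of derivatives in place of the scalar `slope`.

```python
import math


def newton_raphson(score, slope, start, tol=1e-10, max_iter=50):
    """Solve score(nu) = 0 by iterating nu <- nu - score(nu) / slope(nu)."""
    nu = start
    for _ in range(max_iter):
        step = score(nu) / slope(nu)
        nu -= step
        if abs(step) < tol:
            break
    return nu


# Made-up binary data for an intercept-only logistic model
y = [1, 0, 0, 1, 1, 0, 1, 1]


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def score(nu):
    """Score U(nu): sum of y_i minus the fitted probability."""
    return sum(yi - expit(nu) for yi in y)


def slope(nu):
    """Derivative of the score with respect to nu."""
    p = expit(nu)
    return -len(y) * p * (1.0 - p)


nu_hat = newton_raphson(score, slope, start=0.0)
print("estimated intercept:", nu_hat)
```

For this toy model the root has a closed form, the logit of the sample mean, which makes it easy to confirm that the iteration converged.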

The joint model can also be formulated within the generalized linear mixed model framework, which accommodates both random effects and serial correlation. The general formulation is

$$Y_i = \mu_i + \varepsilon_i,$$

and a specific case is the random-effects logistic regression

$$Y_i = \frac{\exp(X_i\beta + Z_i b_i)}{1 + \exp(X_i\beta + Z_i b_i)} + \varepsilon_i.$$

The bivariate response vector is $y_i = (y_{i1}^T, y_{i2}^T)^T$, where $y_{i1} = (y_{i11}, \ldots, y_{i1n_i})^T$ and $y_{i2} = (y_{i21}, \ldots, y_{i2n_i})^T$ are the measurements on the two outcomes, respectively (Goldstein, 2011).

In general,

$$\mu_i = \mu_i(\eta_i) = g^{-1}(X_i\beta + Z_i b_i). \tag{7.13}$$

Assume that the $q$-dimensional random effects satisfy $b_i \sim N(0, G)$. The components of the inverse link function $g^{-1}$ are allowed to change with the nature of the outcomes in $y_i$. $X_i$ and $Z_i$ are $(2n_i \times p)$ and $(2n_i \times q)$ dimensional matrices of covariate values, and $\beta$ is a $p$-dimensional vector of unknown fixed regression coefficients. The variance of $\varepsilon_i$ depends on the mean-variance link of the various outcomes. In addition, the variance can contain a correlation matrix $R_i(\alpha)$ and a dispersion parameter $\phi$.

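The variance-covariance matrix of $y_i$ can be obtained from a general first-order approximate expression, which is given by

$$V_i = \mathrm{Var}(y_i) \simeq \Delta_i Z_i G Z_i^T \Delta_i^T + \Sigma_i \tag{7.14}$$

with

$$\Delta_i = \left. \frac{\partial \mu_i}{\partial \eta_i} \right|_{b_i = 0}$$

and

$$\Sigma_i \simeq \Phi_i^{1/2} A_i^{1/2} R_i(\alpha) A_i^{1/2} \Phi_i^{1/2},$$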

where $A_i$ is a diagonal matrix containing the variances from the generalized linear model specification of $y_{ik}$ ($k = 1, 2$) for given random effects $b_i = 0$, $\Phi_i$ is a diagonal matrix with the overdispersion parameters along the diagonal, and $R_i(\alpha)$ is a correlation matrix. For a normally distributed outcome, the overdispersion parameter is $\sigma^2$ and the variance function is 1 (Molenberghs and Verbeke, 2005). For a binary outcome with logit link, the variance function is

$$\mu_{ij}(b_i = 0)\,[1 - \mu_{ij}(b_i = 0)].$$

Expression (7.14) can be derived from a Taylor series expansion of the mean component around $b_i = 0$. When an exponential family specification with a canonical link is used for all components, $\Delta_i = A_i$ and the resulting GLMM has the variance-covariance matrix

$$\mathrm{var}(y_i) = \Delta_i Z_i G Z_i^T \Delta_i^T + \Phi_i^{1/2} \Delta_i^{1/2} R_i(\alpha) \Delta_i^{1/2} \Phi_i^{1/2}.$$

Under conditional independence, $R_i(\alpha)$ reduces to the identity matrix and

$$\mathrm{var}(y_i) = \Delta_i Z_i G Z_i^T \Delta_i^T + \Phi_i^{1/2} \Delta_i \Phi_i^{1/2}.$$

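A joint model for a continuous and a binary outcome with a shared random intercept has the form

$$\begin{pmatrix} y_{i1} \\ y_{i2} \end{pmatrix} = \begin{pmatrix} \mu_1 + \lambda b_i + \beta_1 X_i \\[1ex] \dfrac{\exp(\mu_2 + b_i + \beta_2 X_i)}{1 + \exp(\mu_2 + b_i + \beta_2 X_i)} \end{pmatrix} + \begin{pmatrix} \varepsilon_{i1} \\ \varepsilon_{i2} \end{pmatrix}. \tag{7.15}$$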

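The scale parameter $\lambda$ is included in the continuous component of this random-intercept model because the two outcomes are measured on different scales. Therefore,

$$Z_i = \begin{pmatrix} \lambda \\ 1 \end{pmatrix}, \qquad \Delta_i = \begin{pmatrix} 1 & 0 \\ 0 & v_{i2} \end{pmatrix}, \qquad \Phi = \begin{pmatrix} \sigma^2 & 0 \\ 0 & 1 \end{pmatrix},$$

with $v_{i2} = \mu_{i2}(b_i = 0)[1 - \mu_{i2}(b_i = 0)]$.

Let $\rho$ be the correlation between $\varepsilon_{i1}$ and $\varepsilon_{i2}$, and let $\tau^2$ denote the variance of the random intercept $b_i$. Note that $Z_i$ is not a genuine design matrix, because it contains unknown parameters. The variance-covariance function (7.14) then leads to

$$V_i = \begin{pmatrix} \lambda^2 & v_{i2}\lambda \\ v_{i2}\lambda & v_{i2}^2 \end{pmatrix} \tau^2 + \begin{pmatrix} \sigma^2 & \rho\sigma\sqrt{v_{i2}} \\ \rho\sigma\sqrt{v_{i2}} & v_{i2} \end{pmatrix} = \begin{pmatrix} \lambda^2\tau^2 + \sigma^2 & v_{i2}\lambda\tau^2 + \rho\sigma\sqrt{v_{i2}} \\ v_{i2}\lambda\tau^2 + \rho\sigma\sqrt{v_{i2}} & v_{i2}^2\tau^2 + v_{i2} \end{pmatrix}. \tag{7.16}$$

Therefore, the derived approximate marginal correlation function is given by

$$\rho(\beta) = \frac{v_{i2}\lambda\tau^2 + \rho\sigma\sqrt{v_{i2}}}{\sqrt{\lambda^2\tau^2 + \sigma^2}\,\sqrt{v_{i2}^2\tau^2 + v_{i2}}}. \tag{7.17}$$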
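The step from the first-order variance expression (7.14) to the marginal correlation (7.17) can be traced numerically. The sketch below is an illustration under assumptions: it takes a scalar random intercept with variance `tau2`, a scale parameter `lam` in the continuous component, and arbitrary parameter values, builds the 2 x 2 matrix from its two components, and compares the resulting correlation with the closed form. The function names are hypothetical.

```python
import math


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def binary_variance(mu2, beta2, x):
    """v_i2: Bernoulli variance function evaluated at b_i = 0."""
    p = expit(mu2 + beta2 * x)
    return p * (1.0 - p)


def marginal_cov(lam, tau2, sigma2, rho, v2):
    """First-order V_i: random-effects part plus residual part."""
    dz = [lam, v2]                                  # Delta_i Z_i with Z_i = (lam, 1)^T
    s = math.sqrt(sigma2)
    resid = [[sigma2, rho * s * math.sqrt(v2)],     # Phi^(1/2) A^(1/2) R A^(1/2) Phi^(1/2)
             [rho * s * math.sqrt(v2), v2]]
    return [[tau2 * dz[r] * dz[c] + resid[r][c] for c in range(2)] for r in range(2)]


def marginal_corr(lam, tau2, sigma2, rho, v2):
    """Approximate marginal correlation read off from V_i."""
    v = marginal_cov(lam, tau2, sigma2, rho, v2)
    return v[0][1] / math.sqrt(v[0][0] * v[1][1])


def closed_form(lam, tau2, sigma2, rho, v2):
    """Closed-form correlation for one continuous and one binary outcome."""
    num = v2 * lam * tau2 + rho * math.sqrt(sigma2) * math.sqrt(v2)
    den = math.sqrt(lam ** 2 * tau2 + sigma2) * math.sqrt(v2 ** 2 * tau2 + v2)
    return num / den


# Illustrative values only
v2 = binary_variance(mu2=-0.4, beta2=0.7, x=1.0)
corr = marginal_corr(lam=1.5, tau2=0.8, sigma2=2.0, rho=0.3, v2=v2)
print("marginal correlation:", corr)
print("without random effect:", marginal_corr(1.5, 0.0, 2.0, 0.3, v2))
```

Because `v2` depends on the covariate value through the fitted probability, the marginal correlation changes with the fixed effects even when `rho` and `tau2` are held constant.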

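Expression (7.17) depends on the fixed effects through $v_{i2}$. A model with no random effects can be written as

$$\begin{pmatrix} y_{i1} \\ y_{i2} \end{pmatrix} = \begin{pmatrix} \mu_1 + \beta_1 X_i \\[1ex] \dfrac{\exp(\mu_2 + \beta_2 X_i)}{1 + \exp(\mu_2 + \beta_2 X_i)} \end{pmatrix} + \begin{pmatrix} \varepsilon_{i1} \\ \varepsilon_{i2} \end{pmatrix}, \tag{7.18}$$

and the marginal correlation (7.17) reduces to $\rho$.

Under conditional independence, $\rho$ in expression (7.16) satisfies $\rho \equiv 0$ and equation (7.17) reduces to

$$\rho(\beta) = \frac{v_{i2}\lambda\tau^2}{\sqrt{\lambda^2\tau^2 + \sigma^2}\,\sqrt{v_{i2}^2\tau^2 + v_{i2}}}. \tag{7.19}$$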

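Equation (7.19) is simpler than equation (7.17), but it is still a function of the fixed effects. When both endpoints are binary, the dispersion parameters equal 1 and equation (7.17) becomes

$$\rho(\beta) = \frac{v_{i1}v_{i2}\tau^2 + \rho\sqrt{v_{i1}v_{i2}}}{\sqrt{v_{i1}^2\tau^2 + v_{i1}}\,\sqrt{v_{i2}^2\tau^2 + v_{i2}}}.$$

Similarly, under conditional independence (no residual correlation, $\rho \equiv 0$), we have

$$\rho(\beta) = \frac{v_{i1}v_{i2}\tau^2}{\sqrt{v_{i1}^2\tau^2 + v_{i1}}\,\sqrt{v_{i2}^2\tau^2 + v_{i2}}}. \tag{7.20}$$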

These results can be extended to general random-effects design matrices and to more than two components.

A full joint distribution is not necessary for the general model formulation; a full joint model would require a complete bivariate specification conditional upon the random effects. Furthermore, the generalized linear mixed model formulation can be extended to hierarchical settings, including repeated measures, meta-analysis, clustered data and other correlated data.

The model $y_i = \mu_i + \varepsilon_i$ is sufficient to generate both marginal and random-effects models. When parameters are shared between models of different types, it is important to ensure that the shared parameters are meaningful. In a model with random effects, the correlation structure can be derived from

$$V_i = \mathrm{Var}(y_i) \simeq \Delta_i Z_i G Z_i^T \Delta_i^T + \Sigma_i.$$

In general, the parameters of joint models can be estimated with numerical approximation methods such as Gaussian quadrature and the Laplace approximation. Alternatively, estimation can be based on pseudo-likelihood, in which pseudo-data are created from a linearization of the mean. The pseudo-likelihood approach can be used to estimate parameters in marginal models and in random-effects models, with or without residual correlation, whereas quadrature and Laplace approximations can only be used for conditional-independence random-effects models.
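To make the quadrature idea concrete, the sketch below approximates the marginal likelihood contribution of one cluster in a random-intercept logistic model by Gauss-Hermite quadrature. It is a minimal illustration under assumptions: the model, data and parameter names are hypothetical, and only a hard-coded 3-point rule is used, which is far too coarse for real analyses where adaptive rules with more nodes are typical.

```python
import math

# 3-point Gauss-Hermite rule for integrals of exp(-x^2) * f(x)
GH_NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
GH_WEIGHTS = (math.sqrt(math.pi) / 6.0,
              2.0 * math.sqrt(math.pi) / 3.0,
              math.sqrt(math.pi) / 6.0)


def normal_expectation(f, tau):
    """Approximate E[f(b)] for b ~ N(0, tau^2) via the substitution b = sqrt(2) * tau * x."""
    total = sum(w * f(math.sqrt(2.0) * tau * x) for x, w in zip(GH_NODES, GH_WEIGHTS))
    return total / math.sqrt(math.pi)


def conditional_likelihood(y, x, beta0, beta1, b):
    """Likelihood of one cluster given its random intercept b (conditional independence)."""
    lik = 1.0
    for yi, xi in zip(y, x):
        p = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * xi + b)))
        lik *= p if yi == 1 else 1.0 - p
    return lik


def marginal_likelihood(y, x, beta0, beta1, tau):
    """Integrate the conditional likelihood over the random-intercept distribution."""
    return normal_expectation(lambda b: conditional_likelihood(y, x, beta0, beta1, b), tau)


# Made-up cluster with four binary responses and one covariate
y_cluster = [1, 0, 1, 1]
x_cluster = [0.0, 1.0, 0.5, -0.5]
value = marginal_likelihood(y_cluster, x_cluster, beta0=0.2, beta1=-0.4, tau=1.0)
print("approximate marginal likelihood:", value)
```

The product inside `conditional_likelihood` is where the conditional-independence assumption enters: given the random intercept, the responses in a cluster are treated as independent.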