Advanced topics - Textbook in Psychiatric Epidemiology

In this section we briefly review a number of more advanced topics that can be considered extensions of the standard logistic regression model. Many of these methods have been somewhat slow to move into the mainstream of psychiatric research. However, with their recent implementation in widely available statistical software, these methods are starting to be more widely applied.

2.6.1 Conditional logistic regression

The previous section showed that logistic regression can be used to perform analyses similar to those using contingency table methods but with more complex extensions and applications. This section introduces a related technique known as conditional logistic regression, which extends many of the benefits of logistic regression to studies with matched designs.

In matched study designs individuals are stratified on the basis of variables thought to be related to the outcome variable of interest. For example, age and years of education are two variables commonly used to define strata in many psychiatric studies. The conditional logistic regression model used to analyse matched data is

$$\log\left[\frac{p_{ij}}{1-p_{ij}}\right] = \alpha_{i0} + \beta_1 x_{ij1} + \beta_2 x_{ij2} + \cdots + \beta_K x_{ijK}. \qquad (2.17)$$

Note that this is similar to the standard logistic regression model, except that the probability of success and the predictors are now indexed by $i$ and $j$ instead of $i$ alone, to indicate that they apply to the $j$th individual from the $i$th stratum (e.g. matched pair). Note also that the common intercept $\beta_0$ in standard logistic regression has been replaced in Equation 2.17 by a stratum-specific intercept $\alpha_{i0}$.

Parameter estimates from conditional logistic regression can be interpreted in a similar way to parameter estimates from standard (or unconditional) logistic regression, and conditional logistic regression offers the same capabilities as standard logistic regression, with a few exceptions. First, the stratum-specific intercepts cannot be estimated (and will not be included in conditional logistic regression output). This is because the method of estimation (discussed later) effectively eliminates them so that the $\beta$'s can be estimated without bias. This is usually not a concern since, as for standard logistic regression, these intercepts are generally not of scientific interest. Second, because the model includes stratum-specific intercepts, the $\beta$'s now have stratum-specific interpretations in terms of changes in the log odds of success for within-stratum changes in the covariates. For example, $\beta_1$ is the change in the log odds of success for a one-unit change in $x_{ij1}$ within the $i$th stratum (i.e. comparing two individuals within the same stratum who happen to differ by one unit in the covariate).

Third, the associations between the outcome and any variables used for matching (or any other covariates that are constant within strata) cannot be quantified. This is because the method of estimation is based entirely on variation within strata; conditional logistic regression cannot be used to estimate the effect of a covariate that varies only between strata (and not within a stratum). Returning to the example from Section 2.4.2, consider a study of the association between birth complications and age of onset of psychosis that matches on sex. Conditional logistic regression cannot quantify the association between sex and age of onset of psychosis because, by study design, sex does not vary within each stratum. However, it is still possible to test for interactions between variables used for matching and other predictors.

Next we consider estimation of the model parameters. One approach to fitting this model would be to attempt to estimate all of the parameters, including the stratum-specific intercepts. However, for matched designs, the number of strata grows as the sample size increases, which means that the number of parameters would be large relative to the sample size no matter how big a sample was collected. For example, in a matched-pair design with $n$ pairs (i.e. two subjects in each stratum), such an analysis would require the estimation of $n+K$ parameters from a sample of only $2n$ observations.

It should not be surprising that this proliferation of stratum-specific intercepts causes problems for estimation; it also causes problems with the properties of standard maximum likelihood estimates of the model parameters. To avoid these problems, conditional logistic regression maximises a likelihood based on the conditional distribution of the responses given the number of successes in each stratum (hence the term ‘conditional’ logistic regression). This likelihood eliminates the stratum-specific intercepts and bases estimation of the associations between the predictors and outcome entirely on information from within the strata.
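To make the conditioning concrete, consider the simplest case: 1:1 matched pairs with exactly one success per pair and a single covariate. Given that one member of the pair is a success, the probability that it is the observed one is $1/(1+\exp[-\beta d_i])$, where $d_i$ is the covariate value of the success minus that of the failure, so $\alpha_{i0}$ cancels. The following is a minimal sketch under those assumptions, not a general implementation. The function name `fit_matched_pairs` and the data are hypothetical, and the fit uses a plain Newton-Raphson iteration.

```python
import math

def fit_matched_pairs(diffs, tol=1e-10, max_iter=50):
    """Conditional ML estimate of beta for 1:1 matched pairs, one covariate.

    diffs[i] is the covariate of the success minus the covariate of the
    failure in pair i (hypothetical input format). Pairs with a difference
    of zero contribute no information about beta.
    """
    def info_at(b):
        # Observed information of the conditional log-likelihood at b.
        total = 0.0
        for d in diffs:
            p = 1.0 / (1.0 + math.exp(-b * d))
            total += d * d * p * (1.0 - p)
        return total

    beta = 0.0
    for _ in range(max_iter):
        # Score (first derivative) of the conditional log-likelihood.
        score = sum(d * (1.0 - 1.0 / (1.0 + math.exp(-beta * d))) for d in diffs)
        info = info_at(beta)
        if info == 0.0:
            raise ValueError("no within-pair variation in the covariate")
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    se = 1.0 / math.sqrt(info_at(beta))
    return beta, se

# Invented data: binary exposure, 30 pairs where only the success is exposed,
# 10 pairs where only the failure is exposed, 20 concordant pairs.
diffs = [1] * 30 + [-1] * 10 + [0] * 20
beta_hat, se_hat = fit_matched_pairs(diffs)
print(math.exp(beta_hat), se_hat)
```

With a binary covariate the estimated OR reduces to the ratio of the two kinds of discordant pairs (here 30/10), and the concordant pairs drop out. This illustrates why covariates that do not vary within a stratum carry no information.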

2.6.2 Exact logistic regression

Like many methods for contingency tables, logistic regression as traditionally implemented (i.e. maximum likelihood logistic regression) relies on large sample theory for the validity of its results. Maximum likelihood logistic regression can perform poorly when the sample size is small, the probability of success is near one or zero, or there are insufficient successes or failures for certain combinations of the covariates. Error messages from statistical software, very large or small parameter estimates, or very wide confidence intervals can sometimes alert us to these problems, though logistic regression can still perform poorly due to sparse data even when the problem is not evident from the distribution of individual covariates or from an examination of the results [11].

Exact logistic regression [12] is a method for fitting logistic regression models that produces valid estimates, test statistics and confidence intervals even for small datasets or sparse data. For example, exact logistic regression was used to study psychiatric and social predictors of attempted suicides in a sample of Indian women [13]. In this study, the total number of participants was fairly large (2494), but the number of suicide attempts was relatively small (19). As a result, very small numbers of successes (suicide attempts) were observed for some predictors, and exact logistic regression was an appropriate analysis strategy.

The relationship of exact logistic regression to maximum likelihood logistic regression is similar in some ways to the relationship of Fisher’s exact test to large-sample methods for R×C contingency tables. However, whereas Fisher’s exact test conditions on the row and column totals in order to derive the distribution of the test statistic, exact logistic regression conditions on the so-called sufficient statistics for the remaining parameters in the model when estimating each parameter. The sufficient statistics for the parameters are determined by the number of successes for different values of the corresponding covariates.

Like other exact methods, exact logistic regression guarantees that tests conducted at significance level $\alpha$ have a type I error rate less than or equal to $\alpha$, and that 95% confidence intervals have at least 95% coverage, even for small sample sizes and sparse data. It is implemented in several popular statistical software packages, and parameter estimates and confidence intervals are interpreted in the same way as those from maximum likelihood logistic regression. Some disadvantages are that it may be overly conservative in settings where maximum likelihood logistic regression performs adequately, and that it can be computationally intensive, especially when quantitative covariates or a large number of categorical covariates are included in the model. In principle, exact logistic regression can be applied in settings with multiple covariates; however, greater care is required when attempting to fit complex models. When feasible, exact logistic regression is an attractive alternative to maximum likelihood logistic regression in small sample and sparse data settings.
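The conditioning idea can be illustrated for the simplest model, with an intercept and one covariate. The sufficient statistic for the intercept is the total number of successes, and the sufficient statistic for the slope is the sum of the covariate values among the successes. Under the null hypothesis of a zero slope, every assignment of the observed number of successes to individuals is equally likely. The sketch below enumerates those assignments by brute force to obtain a one-sided exact p-value. It is only an illustration under these assumptions: the function name `exact_conditional_pvalue` and the data are hypothetical, it is feasible only for very small samples, and it does not produce the estimates or confidence intervals that dedicated software provides.

```python
from itertools import combinations

def exact_conditional_pvalue(x, y):
    """One-sided exact p-value for H0: slope = 0 in a logistic model
    with an intercept and a single covariate x.

    Conditions on the total number of successes (the sufficient statistic
    for the intercept) and enumerates the null distribution of
    T = sum of x over the successes.
    """
    n = len(y)
    m = sum(y)  # number of successes, held fixed by the conditioning
    t_obs = sum(xi for xi, yi in zip(x, y) if yi == 1)
    as_extreme = 0
    total = 0
    for successes in combinations(range(n), m):
        t = sum(x[i] for i in successes)
        total += 1
        if t >= t_obs - 1e-12:  # small tolerance for non-integer covariates
            as_extreme += 1
    return as_extreme / total

# Invented data: six individuals, binary covariate, three successes.
x = [1, 1, 1, 0, 0, 0]
y = [1, 1, 1, 0, 0, 0]
print(exact_conditional_pvalue(x, y))
```

With a binary covariate this enumeration reproduces the one-sided Fisher’s exact test for the corresponding 2×2 table, which is the sense in which the two methods are related.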

2.6.3 Multinomial regression models

A major focus of this chapter has been on logistic regression modelling of a binary outcome. For some applications, however, the outcome variable of interest is categorical with more than two levels. For example, in a study of trauma in a high-risk African-American sample [14], response to trauma was categorised as currently ill (current psychiatric disorder), recovered (past history of one or more psychiatric disorders) or resilient (no history of psychiatric disorder). Predictors of response to severe trauma in this population were examined using polytomous logistic regression.

In this section we introduce multinomial models for categorical outcomes with more than two levels by first considering the case of a nominal categorical variable. Suppose the outcome for individual $i$ is categorical with $J$ levels and let $Y_i$ equal 1 with probability $p_{i1}$, equal 2 with probability $p_{i2}$, and so on. In general, $Y_i$ equals $j$ with probability $p_{ij}$, $j = 1, \ldots, J$. We can introduce some additional notation that will make the extension of logistic regression to this setting more transparent. Suppose we let $Y_{ij}$ equal 1 if $Y_i = j$, and equal 0 otherwise, for $j = 1, \ldots, J$. Then $p_{ij} = \mathrm{pr}[Y_i = j \mid x_{i1}, \ldots, x_{iK}] = \mathrm{pr}[Y_{ij} = 1 \mid x_{i1}, \ldots, x_{iK}]$. When $J > 2$, the extension of the logistic regression model known as polytomous (or multinomial) logistic regression is appropriate.

In polytomous logistic regression, we form $J-1$ non-redundant logits:

$$\log\left\{\frac{\mathrm{pr}[Y_i = j \mid x_{i1}, \ldots, x_{iK}]}{\mathrm{pr}[Y_i = J \mid x_{i1}, \ldots, x_{iK}]}\right\} = \log\left\{\frac{\mathrm{pr}[Y_{ij} = 1 \mid x_{i1}, \ldots, x_{iK}]}{\mathrm{pr}[Y_{iJ} = 1 \mid x_{i1}, \ldots, x_{iK}]}\right\} = \log\left(\frac{p_{ij}}{p_{iJ}}\right) = \beta_{j0} + \beta_{j1} x_{i1} + \cdots + \beta_{jK} x_{iK}, \qquad j = 1, \ldots, J-1,$$

where the regression parameters, $\beta_{j0}, \beta_{j1}, \ldots, \beta_{jK}$, can be different for each level $j$. In this notation, the last category $J$ is referred to as the ‘reference’ category. It can also be shown that, for $j = 1, \ldots, J-1$,

$$p_{ij} = \frac{\exp[\beta_{j0} + \beta_{j1} x_{i1} + \cdots + \beta_{jK} x_{iK}]}{1 + \sum_{l=1}^{J-1} \exp[\beta_{l0} + \beta_{l1} x_{i1} + \cdots + \beta_{lK} x_{iK}]}.$$

Note that this polytomous logistic regression model is more appropriate when the categorical variable is nominal. In other settings, the categorical outcome is ordinal. For example, in a study of predictors of remission in patients over age 60 treated for depression, the outcome was categorised as no remission, partial remission or full remission [15].
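As a concrete illustration of the polytomous model, the following sketch converts the $J-1$ linear predictors into the $J$ category probabilities. The function name `polytomous_probs` and all coefficient values are hypothetical, chosen only for illustration. The probability of the reference category is obtained as one minus the sum of the others, which equals one over the common denominator.

```python
import math

def polytomous_probs(x, betas):
    """Category probabilities for one individual under polytomous
    logistic regression with the last category as reference.

    x     : list of K covariate values
    betas : list of J-1 coefficient lists [b_j0, b_j1, ..., b_jK]
    Returns a list of J probabilities; the last entry is the reference.
    """
    # Linear predictor for each non-reference category.
    etas = [b[0] + sum(bk * xk for bk, xk in zip(b[1:], x)) for b in betas]
    denom = 1.0 + sum(math.exp(eta) for eta in etas)
    probs = [math.exp(eta) / denom for eta in etas]
    probs.append(1.0 / denom)  # reference category J
    return probs

# Hypothetical coefficients for J = 3 categories and K = 2 covariates.
betas = [[0.2, 0.5, -1.0],
         [-0.4, 0.1, 0.3]]
p = polytomous_probs([1.0, 0.5], betas)
print(p, sum(p))
```

Taking the log of the ratio of any non-reference probability to the reference probability recovers that category's linear predictor, which is the logit defined above.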

For ordinal outcomes a variety of regression models can be used, including mean score models and models of a logistic regression form. The logistic models for ordinal data include the continuation odds model, the adjacent category logit model and the cumulative logit model [16]. Here, we briefly discuss the cumulative logistic proportional odds model, one of the most widely used regression models for ordinal data.

To formulate an ordinal response model, we form logits of the cumulative probabilities. Recall that $p_{ij} = \mathrm{pr}[Y_{ij} = 1 \mid x_{i1}, \ldots, x_{iK}]$. We define the cumulative probabilities as $F_{ij} = \mathrm{pr}[Y_i \le j \mid x_{i1}, \ldots, x_{iK}] = p_{i1} + \cdots + p_{ij}$. In the previous example, $F_{i2}$ is the probability of a response of (i) no remission or (ii) partial remission. The logit of $F_{ij}$,

$$\mathrm{logit}(F_{ij}) = \log\left(\frac{F_{ij}}{1 - F_{ij}}\right) = \log\left(\frac{p_{i1} + \cdots + p_{ij}}{p_{i,j+1} + \cdots + p_{iJ}}\right),$$

is often referred to as the ‘cumulative’ logit, and these cumulative logits can be related to covariates in the following proportional odds model,

$$\mathrm{logit}(F_{ij}) = \alpha_j + \beta_1 x_{i1} + \cdots + \beta_K x_{iK}, \qquad j = 1, \ldots, J-1.$$

Note that the original multinomial probabilities can be expressed in terms of the cumulative probabilities via $p_{ij} = F_{ij} - F_{i,j-1}$. Inferences about the cumulative logits or cumulative log ORs can be made in the same way as inferences for standard logistic regression. For example, with remission category as the outcome, suppose $x_{i1}$ is an indicator for comorbid dysthymia (an important predictor in the Hybels et al. [15] study). With the cumulative probabilities defined as above, $\exp(\beta_1)$ is the OR comparing the odds of no remission versus partial or full remission for patients with and without comorbid dysthymia. A property of the proportional odds model is that $\exp(\beta_1)$ is also the OR comparing the odds of no or partial remission versus full remission for patients with and without comorbid dysthymia.
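The proportional odds property can be checked numerically. The sketch below computes the cumulative probabilities $F_{ij}$ from the model, recovers the category probabilities by differencing, and confirms that the cumulative OR for a one-unit change in a covariate is the same at every cut-point. The function names and the values of the cut-points and slope are hypothetical and are not taken from the study cited above.

```python
import math

def expit(z):
    """Inverse of the logit function."""
    return 1.0 / (1.0 + math.exp(-z))

def cumulative_probs(x, alphas, betas):
    """F_ij = pr[Y_i <= j] for j = 1, ..., J-1 under the proportional
    odds model logit(F_ij) = alpha_j + beta_1 x_i1 + ... + beta_K x_iK."""
    lin = sum(b * xk for b, xk in zip(betas, x))
    return [expit(a + lin) for a in alphas]

def category_probs(x, alphas, betas):
    """p_ij = F_ij - F_i,j-1, using F_i0 = 0 and F_iJ = 1."""
    cum = [0.0] + cumulative_probs(x, alphas, betas) + [1.0]
    return [cum[j] - cum[j - 1] for j in range(1, len(cum))]

def odds(p):
    return p / (1.0 - p)

# Hypothetical values: J = 3 ordered categories, one binary covariate.
alphas = [-1.0, 0.5]   # cut-point intercepts, increasing in j
betas = [0.7]

f0 = cumulative_probs([0.0], alphas, betas)
f1 = cumulative_probs([1.0], alphas, betas)
cumulative_ors = [odds(a) / odds(b) for a, b in zip(f1, f0)]
print(cumulative_ors, math.exp(betas[0]))
print(category_probs([1.0], alphas, betas))
```

Both cumulative ORs equal $\exp(\beta_1)$, because the covariate shifts every cumulative logit by the same amount and only the intercepts $\alpha_j$ differ across cut-points.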

2.6.4 Clustered categorical data

In the previous sections we have considered regression models for a single categorical outcome. However, multivariate categorical response data commonly arise in a number of applications in psychiatry. That is, two or more measurements of the response are often obtained in a block or cluster and the categorical responses within a cluster are expected to be positively correlated. When this occurs, the responses from any pair of members of the same cluster are expected to be more closely related than the responses from a pair belonging to different clusters. Some common examples where data arise in clusters include repeated measures or longitudinal studies and studies on families, communities or other naturally occurring groups. For example, in a study of the familial association between rheumatic fever and obsessive–compulsive spectrum disorders, each cluster consisted of first-degree relatives of either a case with rheumatic fever or a control [17]. The important aspect of all of these studies is that the categorical responses within a cluster (e.g. the presence of obsessive–compulsive spectrum disorders in first-degree relatives) cannot be regarded as independent of one another.

There may be many reasons for the correlation among cluster members. For example, when the cluster comprises all the siblings within a family, the correlation among siblings may be due to shared (or at least similar) genetic, environmental and social conditions. In a longitudinal study, where the responses within a cluster represent measurements taken at different occasions, the categorical responses are expected to be positively correlated simply because they have been obtained on the same individual (or cluster). Whatever the underlying reasons for the correlation, failure to account for it in the analysis can lead to invalid inferences. That is, the standard application of logistic regression (or any method that assumes independent observations) in this setting is no longer appropriate.

For the remainder of this section we focus on the case of clustered binary data; however, the methods we discuss apply more broadly to clustered categorical data. There are two general approaches for handling the analysis of clustered binary data. The first is to consider models for the joint distribution of the cluster of binary responses that explicitly account for the within-cluster correlation. There is an extensive statistical literature on this topic and the interested reader is referred to a review article by Pendergast et al. [18]. For the most part, these models can be computationally demanding and have only recently been implemented in commercially available statistical software.

An alternative approach is to simply ignore the correlation among members of a cluster. That is, the analysis proceeds naively as though the binary responses within a cluster can be regarded as independent observations, but later a correction is applied to ensure that valid standard errors are obtained. Note that in this ‘naive’ approach that ignores the within-cluster correlation the estimated logistic regression coefficients are valid, but their nominal standard errors are not. However, valid standard errors can be readily obtained using the well-known empirical variance estimator, first proposed by Huber [19]. Thus, the analysis proceeds in two stages. In the first stage, the correlation among binary responses within a cluster is simply ignored and standard logistic regression is used to obtain estimates of the logistic regression coefficients. In the second stage, valid standard errors for the estimated logistic regression coefficients are obtained using an alternative, but widely implemented, variance estimator that properly accounts for the correlation among the binary responses. The chief advantage of this approach is that it can be readily implemented using standard statistical software for logistic regression.
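The two stages can be sketched for the simplest possible model, a logistic regression with an intercept only, where every quantity has a closed form. Stage one estimates the intercept as if all observations were independent. Stage two replaces the nominal variance with the empirical (‘sandwich’) form, in which the score contributions are summed within each cluster before being squared. The function name `intercept_only_clustered` and the data are hypothetical, and no small-sample adjustment is applied. With covariates the same steps apply, with matrices replacing the scalars.

```python
import math

def intercept_only_clustered(clusters):
    """Two-stage analysis of clustered binary data for the model
    logit(p) = beta0. Assumes the overall proportion is strictly
    between 0 and 1.

    clusters : list of lists of 0/1 responses, one inner list per cluster
    Returns (beta0_hat, naive_se, robust_se).
    """
    n = sum(len(c) for c in clusters)
    p = sum(sum(c) for c in clusters) / n

    # Stage 1: ordinary logistic regression ignoring the clustering.
    beta0 = math.log(p / (1.0 - p))
    inv_info = 1.0 / (n * p * (1.0 - p))   # inverse information
    naive_se = math.sqrt(inv_info)

    # Stage 2: empirical variance. The score contribution of one
    # observation is (y - p); sum within clusters, then square.
    middle = sum(sum(y - p for y in c) ** 2 for c in clusters)
    robust_se = math.sqrt(inv_info * middle * inv_info)
    return beta0, naive_se, robust_se

# Invented data: four clusters of four perfectly correlated responses.
clusters = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0]]
print(intercept_only_clustered(clusters))
```

In this extreme example each cluster effectively contributes one observation rather than four, and the corrected standard error is twice the nominal one, while the point estimate is unchanged.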
