Chapter 2: Data and exploratory analysis 8
3.6 Model diagnostics
3.6. Model diagnostics
of the random components and the residuals are often used for the diagnostic pur- poses. In particular, the scatter plots are used to pinpoint outlying observations which arise from subjects that seem to evolve differently from the other subjects in the sample, but the histograms of the residuals can be used to check for the normality of the random effects and the error terms. The empirical distribution of the data (the histogram) should be bell-shaped and resemble the normal distribution. This might be difficult to see if the sample is small. In this case one might proceed by regress- ing the data against the quantiles of a normal distribution with the same mean and variance as the sample. Lack of fit to the regression line suggests a departure from normality. One can also use the probability plots, and tests using the Shapiro-Wilk test and the Kolmogorov-Smirnov test to assess the normality assumptions. Specif- ically the W-statistic (in the Shapiro-Wilk test), suggested by Shapiro & Wilk (1965) has been shown to be a good omnibus test of normality.
3.6.1 Residual diagnostics
In order to validate model assumptions and to detect outliers and potentially influ- ential data points residuals are employed. A residual is defined as the difference be- tween an observed quantity and its predicted value. In the mixed model a marginal residual is the difference between the observed data and the estimated (marginal) mean that is
rmi=yi−x0iβˆ (3.67)
and a conditional residual is the difference between the observed data and the pre- dicted value of the observation that is,
rci=yi−x0iβˆ−zi0bˆi (3.68) For a model without random effectsb, the marginal and conditional residuals coin- cide. The name conditional residual stems from the fact thatx0iβˆ+zi0bˆiis the condi- tional mean ofyi. According to Schabenberger (2005), the raw residuals,rmiandrci
are usually not well suited for these purposes. ”Even if the true model errors are un- correlated and have equal variance, the residuals will exhibit correlations and their variances will differ. The interpretation of raw residuals is further made difficult if the variances of the observations differ. A data point with a smaller raw residual may be more troublesome than a data point with a large residual, if the variance of the former observation is less”. Different types of residuals are presented in Table 3.3.
3.6. Model diagnostics
Table 3.3:Summary of available residuals
Type of Residual Marginal Conditional Raw rmi=yi−x0iβˆ rci=yi−x0iβˆ−zi0bˆi Studentized rstudentmi = √rmi
Vb(rmi) rcistudent= √rci
Vˆ[rci]
Pearson rmipearson= √rmi
Vˆ[Yi] rpearsonci = √ rci
Vˆ[Yi|ˆbi]
Scaled Cˆrm−1
3.6.2 Influence diagnostics
Influential observations refer to those observations that appear to have fairly large influence on the parameter estimates.
3.6.3 Overall influence
Typically the subscriptU denotes quantities obtained without the observations in the setU. An overall influence statistic measures the change in the objective func- tion being minimized Schabenberger (2005) . Beckman et al. (1987) refers to it as the likelihood displacement (LD). According to Schabenberger (2005), the likelihood and restricted likelihood distances (RLD) are then given by
LD(U)= 2
l( ˆψ)−l( ˆψ(U)) (3.69)
RLD(U)= 2
lR( ˆψ)−lR( ˆψ(U)) (3.70) According to Schabenberger (2005) ”the likelihood distance gives the amount by which the log-likelihood of the full data changes if one were to evaluate it at the reduced-data estimates. It is obtained by evaluating the likelihood function based on the full data set (containing all n observations) at the reduced-data estimates. The likelihood distance is a global, summary measure, expressing the joint influence of the observations in the set U on all parameters in that were subject to updating”.
3.6. Model diagnostics
3.6.4 Change in parameter estimates
Cooks distance measures the effect of deleting a given observation and was first introduced by Cook (1977). Schabenberger (2005) states that the difference between Cooks distance (D) and multivariate DFFITS (also known as MDFFITS) is that the latter uses an externalized estimate of the variance of the parameter estimates, while Cooks distance does not. For the fixed effects, the two statistics are
D(β) = ( ˆβ−βˆ(u))0vˆar( ˆβ)−1( ˆβ−βˆ(u))/rank(X) (3.71)
M DF F IT S= ( ˆβ−βˆ(u))0vˆar( ˆβ(u))−1( ˆβ−βˆ(u))/rank(X) (3.72) For both statistics, large values, according to Schabenberger (2005) ”indicate that the change in the parameter estimate is large relative to the variability of the estimate . If the covariance parameters are updated during influence analysis, similar statis- tics can be computed forθ. However, theˆ D(θ)andM DF F IT S(θ)statistics do not involve division by a matrix rank”.
3.6.5 Change in precision of estimates
According to Schabenberger (2005) ”the effect on the precision of estimates is sepa- rate from the effect on the point estimates. Data points that have a small Cooks D, for example, can still greatly affect hypothesis tests and confidence intervals, if their influence on the precision of the estimates is large”.
whereqdenotes the rank ofV ar(θ). The variance matrix that is used in the compu- tation of COVTRACE and COVRATIO for covariance parameters is obtained from the inverse Hessian matrix Schabenberger (2005).
3.6.6 Effect on fitted and predicted values
Following Schabenberger (2005) ”the MIXED procedure computes the following statis- tics to measure influence on fitted and predicted values. The PRESS residual Allen (1974) is the difference between the observed value and the predicted (marginal) mean, where the predicted value is obtained without the observations in question.
Formally,
ˆ
ei(U) =yi−x0iβˆ(U) (3.73) If you compute the influence of individual observations, the MIXED procedure re- ports these PRESS residuals. When removing sets of observations, PROC MIXED
3.6. Model diagnostics
computes the PRESS statistic. This statistic is the sum of the squared PRESS residu- als in a deletion set,
P RESS(U) =X
i∈U
ˆ
ei(U) (3.74)
The effect of observations on fitted values can be measured by the DFFITS statistic of Belsley et al. (1980). Recall that a DFFIT measures the change in predicted values due to removal of a single data point. If this change is standardized by the externally estimated standard error of the predicted value in the full data, then the DFFITS statistic is given by”
DF F IT Ss = (ˆyi−yˆi(U))/ese(ˆyi) (3.75) 3.6.7 Summary
The theory on linear mixed models and the estimation methods was discussed.
Types of covariance structures were briefly described and the best structure can be selected by choosing the model with a structure that gives the lowest Akaike Infor- mation Criteria (AIC). Furthermore different types of residuals for model diagnos- tics were briefly discussed