Testing adequacy of fit in the random right censoring framework

Testing this hypothesis is complicated by the fact that random right-censoring often occurs in the mentioned fields of study; i.e., not all lifetimes of interest are observed. The majority of the newly modified tests are based on either the characteristic function or the Laplace transform. Some of the tests developed for the Weibull distribution for use with censored data are also new in the complete sample case.

We compare the finite-sample performance of the above tests against a wide range of alternatives, both in the complete sample case and in the presence of random right-censoring. A general overview of the thesis is given in Chapter 1, together with the purpose of the study. The third article, on the impact of the assumed tail behavior of the Kaplan-Meier estimator on goodness-of-fit testing, is due for submission to Computational Statistics and Data Analysis.

The conference proceedings paper, "On an omnibus test for the parametric proportional hazards model", has been accepted for publication in the Proceedings of the 62nd Annual Conference of the South African Statistical Association. The promoters agreed on co-authorship and consented to the use of these articles as part of the final thesis.

Overview

Throughout the study, we use a single censored data set to demonstrate the use of the newly modified tests. This example is also used to demonstrate the computational assumptions required when estimating the lifetime and censoring distribution functions using the Kaplan-Meier estimator. In addition, we consider goodness-of-fit testing for survival models by developing an omnibus goodness-of-fit test for a parametric Cox proportional hazards model in the presence of random censoring.

The test can detect deviations from the assumed model when the baseline distribution or regression component of the model is incorrectly specified. In the last part of this thesis, we include some concluding remarks, together with some possibilities for future research.

Objectives

Present newly modified tests for the gamma distribution in the presence of random right-censoring and compare the performance of these tests with that of existing tests against a wide range of alternatives. Explore and evaluate some of the computational assumptions associated with the Kaplan-Meier estimator. The influence of these assumptions on the sizes as well as the empirical powers of the tests is studied in detail using Monte Carlo simulations.

Provide new bootstrap algorithms to calculate critical values when proposing and implementing goodness-of-fit tests for survival models.

Thesis outline

On the impact of the assumed tail behavior of the Kaplan-Meier estimator on goodness-of-fit tests. In the complete sample case, these empirical functions can be expressed as integrals with respect to the empirical distribution function of the lifetimes. In the mentioned fields, random right-censoring often occurs due to the nature of the study itself.

Henze and Meintanis (2002a) proposed a goodness-of-fit test based on a characterization of the exponential distribution via the characteristic function. The null distribution of each of the test statistics considered depends on the unknown censoring distribution, even in the case of a simple hypothesis (see D'Agostino & Stephens, 1986). For the sake of continuity, these tables are placed at the end of the paper.

A class of consistent tests for exponentiality based on the empirical Laplace transform. Annals of the Institute of Statistical Mathematics. Goodness-of-fit tests based on a new characterization of the exponential distribution. Communications in Statistics - Theory and Methods.

Addendum to Article 1

Epps and Pulley (1986)

Baringhaus and Henze (1991)

We investigate the finite-sample performance of the new tests using an extensive Monte Carlo study. In the presence of censoring, testing the hypothesis that the distribution of lifetimes is Weibull is complicated by the fact that an incomplete sample is observed. Note that the inclusion of the weight function, w above, is necessary to ensure that η is finite.

However, G can be estimated by its Kaplan-Meier estimator, Gn. New goodness-of-fit tests that include a tuning parameter are often accompanied by a recommended value for this parameter; this choice is generally based on the finite-sample power performance of the test. A convenient setting for deriving the asymptotic properties of these tests is the separable Hilbert space of square-integrable functions.
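For reference, the Kaplan-Meier estimator of the censoring distribution mentioned above is commonly written in product-limit form. The notation below is an assumption of this sketch and may differ from that of the original article: T_(1) < ... < T_(n) denote the ordered observed times (assumed free of ties) and δ_(j) equals 1 if the j-th ordered lifetime is uncensored and 0 otherwise.

```latex
% Standard product-limit form; the notation is assumed, not taken from the article.
1 - G_n(t) \;=\; \prod_{j:\, T_{(j)} \le t}
  \left( \frac{n-j}{n-j+1} \right)^{1-\delta_{(j)}},
\qquad t \ge 0 .
```

The corresponding estimator of the lifetime distribution is obtained by replacing the exponent 1 - δ_(j) with δ_(j).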

Some of these results may be useful in deriving the asymptotic properties of the tests proposed in this paper in future research. The null distribution of each of the test statistics considered depends on the unknown censoring distribution, even in the case of a simple hypothesis; see D'Agostino and Stephens (1986). This test compares the empirical Laplace transform of the random variables resulting from the transformation in (2) with the Laplace transform of an EV(0, 1) random variable.

Let ψn be the empirical Laplace transform of the transformed observations, obtained using the Kaplan-Meier estimate of the distribution function. For each of the tests considered above, the null hypothesis in (1) is rejected for large values of the test statistic. For a discussion of the original data set, see Kotze and Johnson (1983) and Allison et al.
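An empirical Laplace transform taken with respect to the Kaplan-Meier estimate can be sketched as a weighted sum of exponentials, with the Kaplan-Meier jump sizes as weights. The following Python sketch is illustrative only: the function names `km_jumps` and `empirical_laplace_transform` are hypothetical, ties are assumed absent, and no set-to-1 adjustment is made.

```python
import numpy as np

def km_jumps(times, events):
    """Sorted observed times and the jump sizes of the Kaplan-Meier
    estimate of the lifetime distribution (no ties assumed)."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    order = np.argsort(times, kind="stable")
    t, d = times[order], events[order]
    n = len(t)
    at_risk = n - np.arange(n)                      # n, n-1, ..., 1
    surv = np.cumprod(np.where(d == 1, (at_risk - 1) / at_risk, 1.0))
    cdf = 1.0 - surv
    jumps = np.diff(np.concatenate(([0.0], cdf)))   # mass placed at each time
    return t, jumps

def empirical_laplace_transform(s, times, events):
    """Integral of exp(-s * x) with respect to the Kaplan-Meier estimate."""
    t, jumps = km_jumps(times, events)
    s = np.atleast_1d(np.asarray(s, dtype=float))
    return (jumps[None, :] * np.exp(-s[:, None] * t[None, :])).sum(axis=1)

# Without censoring the jumps are all 1/n, so the usual empirical
# Laplace transform of a complete sample is recovered.
y = np.array([0.3, 1.2, 0.7, 2.5])
complete = empirical_laplace_transform(0.5, y, np.ones(4, dtype=int))
censored = empirical_laplace_transform(0.0, [1, 2, 3, 4], [1, 0, 1, 0])
```

Evaluating at s = 0 returns the total mass assigned by the Kaplan-Meier estimate, which is less than 1 when the largest observation is censored and no set-to-1 adjustment is applied.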

The results above indicate that the additional flexibility of the Weibull (compared to the exponential) distribution indeed ensures that the Weibull distribution is a more appropriate model than the exponential for the initial remission times considered. A number of interesting numerical phenomena are evident when the powers of the various tests are considered. It is clear that the obtained powers and thus the null distribution of the test statistic are affected by the shape of the censoring distribution.

The effect of the censoring distribution on the critical values of the tests seems not to have been investigated in the literature to date. Allison J, Santana L (2015) On a data-dependent choice of the tuning parameter appearing in certain goodness-of-fit tests.

Table 1 Density functions of the alternative distributions

Addendum to Article 2

ON THE EFFECT OF ASSUMED TAIL BEHAVIOR OF THE KAPLAN-MEIER ESTIMATOR ON GOODNESS-OF-FIT TESTING. In the complete sample case, classical goodness-of-fit tests are based on comparisons between nonparametric and parametric estimates of the distribution function (the latter obtained by estimating the parameters of the class of distributions specified under the null hypothesis). If the sample maximum, τ, is uncensored, then the final jump in the Kaplan-Meier estimate occurs at τ, regardless of the set-to-1 assumption.
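The role of the set-to-1 assumption can be illustrated with a minimal Python sketch of the Kaplan-Meier estimate of the distribution function. The function name `kaplan_meier` and the flag `set_to_one` are hypothetical names chosen for this illustration, and the sketch assumes that there are no tied observations.

```python
import numpy as np

def kaplan_meier(times, events, set_to_one=False):
    """Kaplan-Meier estimate of the distribution function.

    times      : observed times, min(lifetime, censoring time)
    events     : 1 if the lifetime was observed, 0 if it was censored
    set_to_one : if True, force the estimate to equal 1 at the sample maximum
    Returns the sorted times and the estimate evaluated at those times.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    order = np.argsort(times, kind="stable")
    t, d = times[order], events[order]
    n = len(t)
    at_risk = n - np.arange(n)                       # n, n-1, ..., 1
    factors = np.where(d == 1, (at_risk - 1) / at_risk, 1.0)
    cdf = 1.0 - np.cumprod(factors)
    if set_to_one:
        cdf[-1] = 1.0                                # only matters if the maximum is censored
    return t, cdf

# Largest observation censored: the assumption changes the final value.
_, cdf_plain = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 0])
_, cdf_forced = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 0], set_to_one=True)

# Largest observation uncensored: the estimate reaches 1 either way.
_, cdf_uncensored_max = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

To estimate the censoring distribution with the same function, the indicators can be reversed, i.e. `1 - events` is passed in place of `events`.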

As can be seen in the figure, the set-to-1 assumption makes a substantial difference in the estimated value of the distribution function for large values of t. This is to be expected since the exponential distribution is a special case of the Weibull. In the case of the exponential distribution, we rescale the data by multiplying by the parameter estimate.

The Kaplan-Meier estimate of the censoring distribution for the practical example is shown in Figure 2. Since the maximum of the sample is censored, the Kaplan-Meier estimate reaches a value of 1 at t = 269 regardless of the set-to-1 assumption. Cn∗ by sampling from the Kaplan-Meier estimate of the distribution of the censoring times (with the set-to-1 assumption either made or not made).
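The resampling step described above, drawing bootstrap censoring times from the Kaplan-Meier estimate of the censoring distribution, might be sketched as follows. This is a hedged illustration and not the algorithm of the article: the names `km_jumps` and `sample_from_km` are hypothetical, ties are assumed absent, and renormalizing the jump sizes when the estimate does not reach 1 is an assumption of this sketch.

```python
import numpy as np

def km_jumps(times, events, set_to_one=False):
    """Sorted times and Kaplan-Meier jump sizes (no ties assumed)."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    order = np.argsort(times, kind="stable")
    t, d = times[order], events[order]
    n = len(t)
    at_risk = n - np.arange(n)
    cdf = 1.0 - np.cumprod(np.where(d == 1, (at_risk - 1) / at_risk, 1.0))
    if set_to_one:
        cdf[-1] = 1.0          # leftover mass is placed on the sample maximum
    return t, np.diff(np.concatenate(([0.0], cdf)))

def sample_from_km(times, events, size, rng, set_to_one=False):
    """Draw values from the discrete distribution defined by the KM jumps."""
    t, jumps = km_jumps(times, events, set_to_one)
    total = jumps.sum()
    if total <= 0:
        raise ValueError("the Kaplan-Meier estimate assigns no mass")
    # Assumption of this sketch: renormalize if the estimate stops short of 1.
    return rng.choice(t, size=size, replace=True, p=jumps / total)

# Censoring distribution: reverse the indicators (1 = censoring time observed).
obs_times = np.array([1.0, 2.0, 3.0, 4.0])
delta = np.array([1, 0, 1, 0])
rng = np.random.default_rng(0)
c_star = sample_from_km(obs_times, 1 - delta, size=200, rng=rng)
```

In this small example the sample maximum is censored, so the estimate of the censoring distribution already reaches 1 and the set-to-1 flag has no effect, mirroring the situation described for the practical example.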

In the present example, the set-to-1 assumption with respect to the Kaplan-Meier estimate of the censoring distribution is irrelevant, as it does not affect the resulting estimated distribution function. A test statistic based on a characterization of the exponential distribution via the characteristic function is presented in [18]. In the simulations in Section 3, we use a = 0.5 and a = 1. The hypothesis of the gamma class of distributions is rejected for large values of Rn,a.

Furthermore, we will make recommendations regarding the set-to-1 assumption for the best performing test for each of the three lifetime distributions. The influence of the censoring distribution increases as the censoring proportion increases. For the exponential distribution, the nominal significance level is achieved by the majority of the tests when the censoring proportion is 10%.
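A prescribed censoring proportion in a simulation study can be obtained by tuning the parameter of the censoring distribution. The sketch below assumes standard exponential lifetimes and exponential censoring times; this choice, and the function name `simulate_censored_exponential`, are assumptions made for illustration and are not taken from the article. For a lifetime X with rate 1 and an independent censoring time C with rate μ, P(C < X) = μ / (1 + μ), so a censoring proportion p corresponds to μ = p / (1 - p).

```python
import numpy as np

def simulate_censored_exponential(n, censoring_proportion, rng):
    """Simulate randomly right-censored data with a target censoring proportion.

    Assumes Exp(1) lifetimes and exponential censoring times (illustrative choice).
    Returns the observed times and the indicators (1 = lifetime observed).
    """
    if not 0.0 < censoring_proportion < 1.0:
        raise ValueError("censoring_proportion must lie strictly between 0 and 1")
    rate_c = censoring_proportion / (1.0 - censoring_proportion)
    lifetimes = rng.exponential(scale=1.0, size=n)
    censoring = rng.exponential(scale=1.0 / rate_c, size=n)
    observed = np.minimum(lifetimes, censoring)
    delta = (lifetimes <= censoring).astype(int)
    return observed, delta

rng = np.random.default_rng(0)
observed, delta = simulate_censored_exponential(100_000, 0.10, rng)
realised_proportion = 1.0 - delta.mean()   # should be close to 0.10
```

The realised proportion varies from sample to sample, so in small samples it can differ noticeably from the target.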

The Sn,1(1) test performs well, with competitive estimated powers regardless of the censoring proportion and the set-to-1 assumptions. As before, the powers of the tests generally increase with the sample size and decrease with the censoring proportion.

Table 1: Initial remission times of leukemia patients in days.

Overview of the goals of the papers presented

In Chapter 5, a new omnibus test for the parametric proportional hazards model was developed in the presence of random censoring.

Overview of results

Another result of independent interest was presented: a test based on the Laplace transform, initially proposed for use with complete samples, was modified to test for the Weibull distribution in the presence of censoring. Based on the observed numerical powers, we recommend using the newly proposed test based on Stein's method when testing the hypothesis under consideration in the presence of censoring. In the first two articles we make some computational assumptions regarding the Kaplan-Meier estimator.

The numerical results in the paper indicate that the set-to-1 assumption required when using the Kaplan-Meier estimator of the distribution function plays an unexpectedly important role when testing goodness-of-fit hypotheses. Furthermore, it should be noted that the highest powers are often reached when the set-to-1 assumption is made for one of the two distributions being estimated, but not the other. Only goodness-of-fit testing for censored samples in the independent and identically distributed case was investigated in the first three articles.

In the conference proceedings, we proposed an omnibus goodness-of-fit test for the parametric Cox proportional hazards model in the presence of random right-censoring. This test is shown to be able to detect deviations from the hypothesized model when the baseline distribution or the regression component of the model is misspecified.

Two modified classical tests are considered, and the Monte Carlo study suggests that the Cramér-von Mises test as well as the newly proposed test are particularly powerful. We also outlined the procedure needed to use the newly modified test in the setting of independent and identically distributed random variables.

Concluding remarks and future research