CHAPTER 10

Poisson Methods for Censored Survival Data

The Kaplan–Meier method is based on relatively few assumptions; in particular, nothing is specified regarding the functional form of either the survival function or the hazard function. Censoring is assumed to be uninformative, but this is a feature of virtually all of the commonly used methods of survival analysis. Since so little structure is imposed, it is appropriate to view a Kaplan–Meier survival curve as a type of scatter plot of censored survival data. The appearance of a Kaplan–Meier curve can be used to form ideas about the nature of the underlying survival function and hazard function, in much the same way as a scatter plot is used as a visual aid in linear regression.

Despite these advantages, there are difficulties with the Kaplan–Meier approach.

Kaplan–Meier curves are not designed to “smooth” the data while accounting for random variation in the way that a linear regression line is fitted to points in a scatter plot. As a result, Kaplan–Meier survival curves can be erratic in appearance and sensitive to small changes in survival times and censoring patterns, especially when the number of deaths is small. The Kaplan–Meier survival curves for the six receptor level–stage strata shown in Figure 9.6 are relatively well-behaved, but it is easy to imagine how complicated such a graph might otherwise be.

In this chapter we describe parametric methods of survival analysis based on the Weibull, exponential, and Poisson distributions. The computations required by the exponential and Poisson models are relatively straightforward, and the results are readily interpreted. However, this convenience is gained at the expense of having to make strong assumptions about the functional form of the hazard function, a decision that needs to be justified in any application.

10.1 POISSON METHODS FOR SINGLE SAMPLE SURVIVAL DATA

In theory, a hazard function can have almost any functional form. The estimated [...] the entire life cycle and, as is well known, mortality risk is highly dependent on age.

There may be a degree of systematic or random error in Figure 8.2(c), but Statistics Canada vital statistics data are very reliable and the sample size is so large that the complicated appearance must be accepted as a realistic depiction of the underlying hazard function. In practice, most cohort studies have a relatively small sample size and a fairly short period of follow-up. This means that the period of observation will usually be too short for the hazard function to exhibit much variation over time, and the sample size will be too small for it to be possible to discern subtle changes in the hazard function, even if they should be present. As a consequence, it is usually appropriate in epidemiologic studies to model the hazard function using relatively uncomplicated functional forms. Two of the most widely used are the Weibull and exponential distributions (Kalbfleisch and Prentice, 1980; Lawless, 1982; Cox and Oakes, 1984; Lee, 1992; Collett, 1994; Klein and Moeschberger, 1997).

10.1.1 Weibull and Exponential Distributions

The Weibull distribution has the survival function S(t) = exp[−(λt)^α] and hazard function h(t) = αλ(λt)^(α−1). Here λ and α are parameters satisfying the conditions λ > 0 and α > 0. We refer to λ as the rate parameter and to α as the shape parameter.
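The two functions are linked by the general identity that the hazard is the negative derivative of the log survival function. As a short worked check, using only the symbols defined above, the stated hazard can be recovered from the stated survival function:

```latex
\begin{align*}
\log S(t) &= -(\lambda t)^{\alpha}, \\
h(t) &= -\frac{d}{dt}\log S(t)
      = \frac{d}{dt}(\lambda t)^{\alpha}
      = \alpha\lambda(\lambda t)^{\alpha-1}.
\end{align*}
```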

Figure 10.1(a) shows graphs of the hazard function for λ = 1 and α = 0.5, 1, 1.5, and 3. Setting λ = 1 reflects the choice of time units but does not influence the basic shapes of the curves. When α = 1, h(t) is constant; when α < 1, h(t) is a decreasing function of time; and when α > 1, h(t) is increasing. The corresponding survival curves are shown in Figure 10.1(b).

FIGURE 10.1(a) Weibull hazard functions for selected values of α, with λ = 1

FIGURE 10.1(b) Weibull survival functions for selected values of α, with λ = 1

The Weibull distribution is applicable to a range of situations commonly encountered in epidemiology. For example, consider a cohort of surgical patients who are being monitored after having just undergone major surgery. Suppose that for the first few days after surgery the mortality risk is high, but after that it gradually declines. In this case a Weibull distribution with α < 1 would be appropriate. As another example, consider a cohort of cancer patients who are undergoing long-term follow-up after entering remission. Suppose that for the first few years the risk of relapse is relatively low, but as time progresses more and more patients have a recurrence. In this case, a Weibull distribution with α > 1 would be a reasonable choice. The ovarian cancer cohort with high-grade disease in Figure 9.4(a) exhibits the latter type of survival experience.
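The link between the shape parameter and the direction of the hazard can be checked numerically. The following is a minimal sketch in Python, not part of the original text; the function names `weibull_survival`, `weibull_hazard`, and `hazard_trend`, as well as the time grid, are illustrative assumptions.

```python
import math

def weibull_survival(t, lam, alpha):
    """S(t) = exp[-(lam * t)**alpha], for t >= 0."""
    return math.exp(-((lam * t) ** alpha))

def weibull_hazard(t, lam, alpha):
    """h(t) = alpha * lam * (lam * t)**(alpha - 1), for t > 0."""
    return alpha * lam * (lam * t) ** (alpha - 1)

def hazard_trend(lam, alpha, times):
    """Classify the hazard over an increasing grid of positive times."""
    h = [weibull_hazard(t, lam, alpha) for t in times]
    if all(abs(x - h[0]) < 1e-12 for x in h):
        return "constant"
    if all(a > b for a, b in zip(h, h[1:])):
        return "decreasing"
    if all(a < b for a, b in zip(h, h[1:])):
        return "increasing"
    return "mixed"

# Illustrative time grid (arbitrary units); lam = 1 as in Figure 10.1.
times = [0.25, 0.5, 1.0, 1.5, 2.0]
for alpha in (0.5, 1.0, 1.5, 3.0):
    print(alpha, hazard_trend(1.0, alpha, times))
```

With λ = 1, every curve passes through S(1) = exp(−1) regardless of α, which is a convenient sanity check when reading the survival curves.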

Consider an open cohort study of r subjects and, in the notation of Section 8.1, let (t_i, δ_i) be the observation for the ith subject (i = 1, 2, . . . , r). Maximum likelihood methods can be used to estimate λ and α from these data but, except when α = 1, closed-form expressions are not available. When α = 1 the Weibull distribution simplifies to the exponential distribution, in which case S(t) = exp(−λt) and h(t) = λ. The exponential distribution rests on the assumption that the hazard function is constant over the entire period of follow-up. This assumption is evidently a very strong one and will often be unrealistic. However, when the sample size is small and the period of follow-up is relatively short, the exponential distribution provides a useful approach to analyzing censored survival data. The attraction of the exponential distribution is that the parameter λ is easily estimated, as shown below. Since the exponential hazard function has the same value at any point during follow-up, the exponential distribution is said to be “memoryless.”
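To illustrate the contrast between the two cases, the following Python sketch fits both models to censored data. It is one possible approach under stated assumptions, not the method of the text: for α = 1 it uses the standard closed-form estimate (deaths divided by total follow-up time), and for the Weibull case it substitutes λ^α = d / Σ t_i^α into the log-likelihood and solves the resulting one-dimensional score equation for α by bisection. The function names, the bracketing interval, and the data are hypothetical.

```python
import math

def weibull_loglik(lam, alpha, times, events):
    """Censored log-likelihood: deaths add log h(t); every subject adds log S(t)."""
    ll = 0.0
    for t, e in zip(times, events):
        if e:
            ll += math.log(alpha) + alpha * math.log(lam) + (alpha - 1) * math.log(t)
        ll -= (lam * t) ** alpha
    return ll

def exponential_mle(times, events):
    """Closed-form rate estimate when alpha = 1: deaths / total follow-up time."""
    return sum(events) / sum(times)

def profile_score(alpha, times, events):
    """Derivative in alpha of the log-likelihood after profiling out lam."""
    d = sum(events)
    weights = [t ** alpha for t in times]
    weighted_log = sum(w * math.log(t) for w, t in zip(weights, times)) / sum(weights)
    death_logs = sum(math.log(t) for t, e in zip(times, events) if e)
    return d / alpha + death_logs - d * weighted_log

def weibull_mle(times, events, lo=1e-3, hi=50.0, tol=1e-10):
    """Bisection on the profile score; [lo, hi] is an assumed bracket for alpha."""
    if profile_score(lo, times, events) <= 0 or profile_score(hi, times, events) >= 0:
        raise ValueError("root of the profile score is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if profile_score(mid, times, events) > 0:
            lo = mid
        else:
            hi = mid
    alpha = 0.5 * (lo + hi)
    d = sum(events)
    lam = (d / sum(t ** alpha for t in times)) ** (1.0 / alpha)
    return lam, alpha

# Hypothetical follow-up times and death indicators (1 = death, 0 = censored).
times = [2.0, 3.5, 4.0, 5.5, 6.0, 7.5, 8.0, 9.0]
events = [1, 1, 0, 1, 1, 0, 1, 0]

lam_exp = exponential_mle(times, events)
lam_hat, alpha_hat = weibull_mle(times, events)
print("exponential rate:", lam_exp)
print("Weibull rate and shape:", lam_hat, alpha_hat)
```

Bisection is used here only because the profile score is a decreasing function of α, so a bracketed root is unique; a general-purpose optimizer would serve equally well.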

Let d denote the number of deaths in the cohort. This represents a change of notation from Chapter 9 where we used the symbol a. We adopt this convention as a way of distinguishing the formulas based on the exponential and Poisson distributions from those based on the binomial approach. It follows immediately from the

Source: Biostatistical Methods in Epidemiology (pages 197-200)