www.elsevier.com/locate/econbase
When is a proportional hazards model valid for both stock and
flow sampled duration data?
Gordon C.R. Kemp*
Department of Economics, University of Essex, Wivenhoe Park, Colchester CO4 3SQ, UK
Received 25 November 1999; accepted 24 January 2000
Abstract
It is shown that if the flow-sampled total durations and stock-sampled elapsed durations from stationary renewal processes both exhibit proportional hazard rates and are continuous then they belong to the family of survival distributions with linear mean residual functions introduced by Hall and Wellner (1984) [Hall, W.J., Wellner, J.A., 1984. Mean residual life. In: Csörgő, M., Dawson, D.A., Rao, J.N.K., Saleh, A.K.M.E. (Eds.), Proceedings of the International Symposium on Statistics and Related Topics. North Holland, Amsterdam, pp. 169–184]. © 2000 Elsevier Science S.A. All rights reserved.
Keywords: Renewal processes; Proportional hazards
JEL classification: C41
1. Introduction
Econometric duration processes are sometimes modelled as possessing flow-sampled hazard rates that exhibit a proportional hazards structure. The convenience of this assumption is that Cox’s partial likelihood method can then be used to estimate the relative risk factor of the model (Cox, 1972, 1975). However, the data available to researchers is often in the form of stock-sampled data. As a quick and dirty data analysis procedure it is tempting to use Cox’s partial likelihood method on the elapsed spell durations from stock-sampled data, i.e. to treat the elapsed spell durations as if they had proportional hazards. However, as noted by various authors, including Chesher and Lancaster (1983), the properties of the estimators used will depend on the sampling scheme used.
The purpose of this note is to identify the circumstances under which both the flow-sampled durations and stock-sampled elapsed durations do exhibit proportional hazards so that such inferences will be valid. One well-known case for which this holds is when the spells are generated by a renewal process started in the infinite past which has an exponential hazard rate. However, as will be shown, this is not the only case for which both types of sampled durations exhibit proportional hazards.
*Tel.: +44-1206-872-766; fax: +44-1206-872-724. E-mail address: kempgcr@essex.ac.uk (G.C.R. Kemp).
2. Framework and main result
The framework used to derive the main result is that of renewal theory, following Lancaster (1979). Let $\{j \in J\}$ index a population of independent continuous renewal processes such that the common lifetime survivor function of process $j$ is $S_j(x)$. Standard results in renewal theory show that if the common lifetime distribution of process $j$ has finite mean then the residual mean common lifetime of process $j$ is:
$$ e_j(x) = E(X_j - x \mid X_j > x) = S_j(x)^{-1} \int_x^{\infty} S_j(u)\,du \tag{1} $$
while the backward recurrence time density of process $j$ is:
$$ f_j(x) = \begin{cases} e_j(0)^{-1} S_j(x), & \text{for } x \ge 0, \\ 0, & \text{otherwise,} \end{cases} $$
(Cox, 1962). In addition, if the common lifetimes of process $j$ are absolutely continuously distributed with density $g_j(x)$ then their hazard rate is given by $h_j(x) = g_j(x) S_j(x)^{-1}$, and the forward recurrence time hazard rate for process $j$ is given by $k_j(x) = e_j(x)^{-1}$ for $x \ge 0$ such that $e_j(x) \ge 0$ and 0 otherwise. Finally, the flow-sampled total durations are distributed according to the common lifetime distribution, while the stock-sampled elapsed durations are distributed according to the backward recurrence time distribution. The main result of the paper is then as follows:
Theorem 1. Suppose that there exists a continuous function $h_0(\cdot): \mathbb{R}_+ \to \mathbb{R}_+$ and positive constants $\{\lambda_j;\ j \in J\}$ such that $h_j(x) = \lambda_j h_0(x)$ for all $0 \le x < \infty$ and all $j \in J$, where $\mathbb{R}_+ = [0, \infty)$. Then a necessary and sufficient condition for there to exist a function $k_0(\cdot): \mathbb{R}_+ \to \mathbb{R}_+$ and positive constants
that $\lambda_j > A_0/(A_0 + 1)$ for all $j \in J$. Now take an arbitrary process $j \in J$. Then from the proportional hazards assumption on the flow-sampled total durations, i.e. $h_j(x) = \lambda_j h_0(x)$, it follows that:
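$$ S_j(x) = [S_0(x)]^{\lambda_j} = \left[\frac{B_0}{A_0 x + B_0}\right]^{\lambda_j[(1/A_0)+1]} = \left[\frac{B_j}{A_j x + B_j}\right]^{(1/A_j)+1} $$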
The family of distributions with survivor functions given by (2) was introduced by Hall and Wellner (1984) as characterizing the survival distributions with a linear mean residual lifetime given by $e_0(x) = \max\{0, (A_0 x + B_0)\}$ for all $0 \le x < \infty$. Thus it follows the flow-sampled total durations of
Take an arbitrary process $j \in J$. The proportional hazards assumption on the backward recurrence times, i.e. that there exist a function $k_0(\cdot): \mathbb{R}_+ \to \mathbb{R}_+$ and constants $\{\gamma_j;\ j \in J\}$ such that $k_j(x) = \gamma_j k_0(x)$ for all $0 \le x < \infty$, implies that the process also exhibits proportional residual mean lifetimes, i.e. $e_j(x) = e_0(x)/\gamma_j$ where $e_0(x) = S_0(x)^{-1} \int_x^{\infty} S_0(u)\,du$. The proportional hazards assumption on the common lifetime distribution, i.e. $h_j(x) = \lambda_j h_0(x)$ for all $0 \le x < \infty$, is equivalent to the condition that $S_j(x) = [S_0(x)]^{\lambda_j}$. But then the conditions of Theorem 2 of Oakes and Dasu (1990) are satisfied and hence $S_0(x)$ must take the form given in (2).
Furthermore, process $j \in J$ must also belong to the Hall-Wellner family of distributions with $A_j > -1$ and $B_j > 0$, where $(A_j, B_j) = (0, B_0/\lambda_j)$ if $A_0 = 0$ and $(A_j, B_j) = (A_0/[\lambda_j(A_0 + 1) - A_0],\ B_0 A_j/A_0)$ otherwise. Clearly, if $A_0 = 0$ then $A_0/(A_0 + 1) = 0$ and so $A_j = 0$ and $B_j > 0$ since $\lambda_j > 0$; hence if $A_0 = 0$ then the condition that $\lambda_j > A_0/(A_0 + 1)$ is automatically satisfied. Suppose instead that
21
0 in which case process $j \in J$ fails to belong to the Hall-Wellner family of distributions. $\square$
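The parameter map used in the proof can be checked numerically. The sketch below is illustrative only: equation (2) is not reproduced here, so it assumes the standard Hall-Wellner survivor form $[B/(Ax+B)]^{(1/A)+1}$ for $A \ne 0$ (zero beyond the upper bound when $A < 0$) and $\exp(-x/B)$ for $A = 0$. The function names `hw_survivor`, `process_params` and `max_gap` are hypothetical.

```python
import math


def hw_survivor(x, A, B):
    """Survivor function with mean residual life max(0, A*x + B).

    Assumed standard Hall-Wellner form (an assumption of this sketch):
    [B / (A*x + B)]**(1/A + 1) for A != 0, and exp(-x/B) for A == 0.
    """
    if A == 0:
        return math.exp(-x / B)
    denom = A * x + B
    if denom <= 0:  # beyond the upper bound -B/A, which exists when A < 0
        return 0.0
    return (B / denom) ** (1.0 / A + 1.0)


def process_params(A0, B0, lam):
    """Map baseline (A0, B0) and relative risk lam to (A_j, B_j)."""
    if A0 == 0:
        return 0.0, B0 / lam
    if lam <= A0 / (A0 + 1.0):
        raise ValueError("requires lam > A0 / (A0 + 1)")
    Aj = A0 / (lam * (A0 + 1.0) - A0)
    Bj = B0 * Aj / A0
    return Aj, Bj


def max_gap(A0, B0, lam, xs):
    """Largest absolute difference between S_j(x) and S_0(x)**lam over xs."""
    Aj, Bj = process_params(A0, B0, lam)
    return max(
        abs(hw_survivor(x, Aj, Bj) - hw_survivor(x, A0, B0) ** lam) for x in xs
    )


if __name__ == "__main__":
    grid = [0.0, 0.5, 1.0, 1.9, 5.0]
    for A0, B0, lam in [(0.5, 2.0, 1.5), (-0.5, 1.0, 2.0), (0.0, 3.0, 0.7)]:
        print(A0, B0, lam, process_params(A0, B0, lam), max_gap(A0, B0, lam, grid))
```

For each parameter triple the reported gap should be at the level of floating-point rounding, which is consistent with $S_j(x) = [S_0(x)]^{\lambda_j}$ under the assumed form. A value of `lam` at or below $A_0/(A_0+1)$ is rejected, mirroring the condition in the proof.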
flow-sampled total durations as for the stock-sampled elapsed durations. In fact, as is well-known, in this situation both the stock-sampled elapsed durations and the flow-sampled total durations have the same distribution.
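To illustrate the equality of the two distributions, the following sketch applies the renewal-theory formulas of Section 2 to an exponential lifetime, assuming that "this situation" refers to the constant hazard case $A_0 = 0$. The mean residual life is computed by Simpson's rule on a truncated range; the truncation point `upper`, the mean `MEAN` and all function names are assumptions made for the example.

```python
import math


def simpson(f, a, b, n=20000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def mean_residual_life(S, x, upper=200.0):
    """e(x) = S(x)^(-1) * integral of S(u) from x to infinity, truncated at upper."""
    return simpson(S, x, upper) / S(x)


def backward_recurrence_density(S, x, upper=200.0):
    """f(x) = S(x) / e(0) for x >= 0, and 0 otherwise."""
    if x < 0:
        return 0.0
    return S(x) / mean_residual_life(S, 0.0, upper)


MEAN = 2.0  # hypothetical mean lifetime for the example


def S_exp(x):
    """Exponential survivor function with mean MEAN."""
    return math.exp(-x / MEAN)


def g_exp(x):
    """Exponential lifetime density with mean MEAN."""
    return math.exp(-x / MEAN) / MEAN


if __name__ == "__main__":
    for x in (0.0, 1.0, 3.5):
        print(x, backward_recurrence_density(S_exp, x), g_exp(x))
```

In this case the mean residual life is constant at `MEAN`, so the backward recurrence time density coincides with the lifetime density up to numerical integration error.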
When $A_0 > 0$ then the common lifetime distributions of the processes are Type-II Pareto distributions. As is well-known, the Type-II Pareto distribution can be obtained by gamma mixing of exponential distributions (Maguire et al., 1952; Lancaster, 1979; Ouyang and Wu, 1994). In particular, the flow-sampled total durations of process $j \in J$ can be obtained by mixing exponential distributions when the rate parameter (which is the inverse of the mean) has a Gamma distribution with shape parameter $\alpha_j = (1/A_j) + 1$ and scale parameter $\beta_j = A_j/B_j$. However, (4) implies that $A_j/B_j$ is constant across $j \in J$ and thus the scale parameters would be the same for all $j \in J$. Furthermore, the rate parameter would need to be independent across spells for a given process. Finally, when $-1 < A_0 < 0$ then the flow-sampled total durations have rescaled beta distributions. The upper bound on this distribution for process $j \in J$ is given by $-B_j/A_j$ which from (4) must be constant across $j \in J$.
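The gamma-mixing representation can be verified numerically. The sketch below integrates the exponential survivor function $\exp(-\theta x)$ against a Gamma density for the rate $\theta$ with shape $(1/A) + 1$ and scale $A/B$, and compares the result with the Type-II Pareto survivor function $[(A/B)x + 1]^{-[(1/A)+1]}$. The parameter values, the truncation point `upper` and the function names are assumptions chosen for illustration.

```python
import math


def simpson(f, a, b, n=4000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def gamma_pdf(theta, shape, scale):
    """Gamma density in the shape/scale parametrization."""
    return (
        theta ** (shape - 1.0)
        * math.exp(-theta / scale)
        / (math.gamma(shape) * scale ** shape)
    )


def mixed_survivor(x, shape, scale, upper=20.0):
    """E[exp(-theta * x)] for theta ~ Gamma(shape, scale), truncated at upper."""
    return simpson(
        lambda t: math.exp(-t * x) * gamma_pdf(t, shape, scale), 0.0, upper
    )


def pareto2_survivor(x, A, B):
    """Type-II Pareto survivor function with mean residual life A*x + B, A > 0."""
    return (A * x / B + 1.0) ** (-(1.0 / A + 1.0))


if __name__ == "__main__":
    A, B = 0.5, 2.0                       # hypothetical values with A > 0
    shape, scale = 1.0 / A + 1.0, A / B   # alpha = 3, beta = 0.25
    for x in (0.0, 1.0, 5.0):
        print(x, mixed_survivor(x, shape, scale), pareto2_survivor(x, A, B))
```

The two columns should agree to within the quadrature error, since $E[e^{-\theta x}] = (1 + \beta x)^{-\alpha}$ for a Gamma-distributed rate with shape $\alpha$ and scale $\beta$.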
It is worth noting that when $A_0 \ne 0$ then $\lambda_j = \gamma_j$ if and only if $\lambda_j = 1$ so that in general the stock-sampled elapsed durations and flow-sampled total durations will have different distributions. However, if the processes satisfy the conditions of Theorem 1 then, as observed in the proof of Theorem 1, the flow-sampled total durations for process $j$ will have linear mean residual lifetimes given by $e_j(x) = (A_j x + B_j)$. This implies that the stock-sampled elapsed durations for process $j$ will have hazard rates given by $k_j(x) = e_j(x)^{-1} = (A_j x + B_j)^{-1}$ for $x \ge 0$ such that $e_j(x) \ge 0$ and $k_j(x) = 0$ otherwise. The survivor function for these stock-sampled elapsed durations is then given by:
$$ M_j(x) = \exp\left( -\int_0^x k_j(u)\,du \right) = \left[ (A_j/B_j)x + 1 \right]_+^{-(1/A_j)}, \tag{6} $$
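The closed form in (6) follows from a single integration. The worked step below is a sketch for $A_j \ne 0$ on the region where $A_j u + B_j > 0$. The primed symbols $A_j'$ and $B_j'$ are introduced here for illustration only, and the final line assumes the standard Hall-Wellner survivor form $[B/(Ax+B)]^{(1/A)+1}$ together with $A_j < 1$.

```latex
\begin{align*}
\int_0^x k_j(u)\,du
  &= \int_0^x \frac{du}{A_j u + B_j}
   = \frac{1}{A_j}\,\ln\!\left(\frac{A_j x + B_j}{B_j}\right), \\
M_j(x)
  &= \exp\!\left(-\frac{1}{A_j}\,\ln\!\left(\frac{A_j}{B_j}\,x + 1\right)\right)
   = \left[\frac{A_j}{B_j}\,x + 1\right]^{-1/A_j}, \\
M_j(x)
  &= \left[\frac{B_j'}{A_j' x + B_j'}\right]^{(1/A_j')+1},
  \qquad A_j' = \frac{A_j}{1 - A_j}, \quad B_j' = \frac{B_j}{1 - A_j}.
\end{align*}
```

The last line uses $B_j'/(A_j' x + B_j') = B_j/(A_j x + B_j)$ and $(1/A_j') + 1 = 1/A_j$. When $A_j = 0$ the same integration gives $M_j(x) = \exp(-x/B_j)$.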
which is also the survivor function of a member of the Hall-Wellner family of distributions. Unfortunately, since the factors of proportionality $\lambda_j$ and $\gamma_j$ for the flow-sampled total and stock-sampled elapsed durations differ when $A_0 \ne 0$, although we can use Cox’s partial likelihood methods for both the flow-sampled total and stock-sampled elapsed durations, neither the relative risk functions nor the baseline hazards will be the same across the two methods of sampling.
The proof of the sufficiency part of Theorem 1 also shows that $(\lambda_j - \gamma_j) = A_0(1 - \lambda_j)$ and hence that $\lambda_i/\lambda_j = \gamma_i/\gamma_j$ if and only if either $\lambda_i = \lambda_j$ or $A_0 = 0$. The first case implies that the flow-sampled total durations for processes $i$ and $j$ have exactly the same distribution. The second case implies that the flow-sampled total durations are exponentially distributed. Thus, in general, the ratio of the relative risk factors for the flow-sampled total durations will only be equal to the ratio of the relative risk factors for the stock-sampled elapsed durations when the processes have a constant baseline hazard.
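A small numerical illustration of this point: the sketch below computes the stock-sample factor $\gamma_j$ from the relation $(\lambda_j - \gamma_j) = A_0(1 - \lambda_j)$ and compares the two ratios of relative risk factors for a pair of processes. The function names and the parameter values are hypothetical.

```python
def stock_risk_factor(lam, A0):
    """gamma_j implied by (lambda_j - gamma_j) = A0 * (1 - lambda_j)."""
    return lam - A0 * (1.0 - lam)


def risk_ratios(lam_i, lam_j, A0):
    """Return (flow-sample ratio, stock-sample ratio) for processes i and j."""
    gam_i = stock_risk_factor(lam_i, A0)
    gam_j = stock_risk_factor(lam_j, A0)
    return lam_i / lam_j, gam_i / gam_j


if __name__ == "__main__":
    # Constant baseline hazard (A0 = 0): the two ratios coincide.
    print(risk_ratios(2.0, 1.5, 0.0))
    # Type-II Pareto baseline (A0 > 0): the two ratios differ.
    print(risk_ratios(2.0, 1.5, 0.5))
```

With $A_0 = 0.5$, $\lambda_i = 2$ and $\lambda_j = 1.5$ the implied stock-sample factors are $\gamma_i = 2.5$ and $\gamma_j = 1.75$, so the flow-sample ratio is $4/3$ while the stock-sample ratio is $10/7$.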
3. Conclusion
This note has characterized the cases in which both the flow-sampled total and stock-sampled elapsed durations from a family of renewal processes will have proportional hazards. The best-known case is when the flow-sampled total durations have exponential distributions so that they have constant hazard rates. However, this is not the only such case: the property holds more generally when the flow-sampled durations come from a Hall-Wellner family in which the ratio of the A and B parameters is constant across the family of processes.
Nevertheless, the constant baseline hazard case is the only case where both flow-sampled total and stock-sampled elapsed durations exhibit proportional hazards and where the ratios of the relative risk factors are the same for both types of sampling.
The note has focused on stock-sampled elapsed durations because in practice if a sample is taken at a point in time then the elapsed durations are commonly recorded. However, one might follow the individuals until the ends of their current spells and thus record the stock-sampled residual durations which would be the forward recurrence times. As is well-known, the forward and backward recurrence times from a renewal process have the same distribution so that the results reported here also apply to stock-sampled residual durations.
References
Chesher, A., Lancaster, T., 1983. The estimation of models of labour market behaviour. The Review of Economic Studies 50, 609–624.
Cox, D.R., 1962. Renewal Theory. Chapman and Hall, London.
Cox, D.R., 1972. Regression models and life tables. Journal of the Royal Statistical Society, Series B 34, 187–220.
Cox, D.R., 1975. Partial likelihood. Biometrika 62, 269–276.
Hall, W.J., Wellner, J.A., 1984. Mean residual life. In: Csörgő, M., Dawson, D.A., Rao, J.N.K., Saleh, A.K.M.E. (Eds.), Proceedings of the International Symposium on Statistics and Related Topics. North Holland, Amsterdam, pp. 169–184.
Lancaster, T., 1979. Econometric methods for the duration of unemployment. Econometrica 47, 939–956.
Maguire, B.A., Pearson, E., Wynn, H., 1952. The time intervals between industrial accidents. Biometrika 39, 168–180.
Oakes, D., Dasu, T., 1990. A note on residual life. Biometrika 77, 409–410.