5.4.4 Exponential Life-testing Procedures
Suppose that the underlying chance distribution is assumed to be an exponential with P(T > t; λ) = exp(−λt) (section 5.2.1), so that (5.4) takes the form

P(T_{n+1} ≥ t_{n+1} | •) = ∫_0^∞ exp(−λ t_{n+1}) F(dλ | •),

with • denoting data from a life-test, and

F(dλ | •) ∝ L(λ; •) F(dλ);    (5.27)

see equation (5.5). Depending on the testing protocol and censoring scheme used, the data from the life-test could arrive in one of several forms discussed in section 5.4.3. For example, under Type II censoring with n and r fixed, we would observe the r ordered failure times t_{(1)} ≤ t_{(2)} ≤ ··· ≤ t_{(r)}, and the likelihood of λ, L(λ; •), would be of the form
L(λ; •) = [n!/(n−r)!] Π_{i=1}^{r} λ e^{−λ t_{(i)}} · e^{−λ(n−r) t_{(r)}}
        = [n!/(n−r)!] λ^r e^{−λ T(n,r)},    (5.28)
where T(n,r) is the total time on test. This likelihood is a consequence of the fact that the stopping rule for the test is non-informative – an issue that can be verified using the same line of argument used for the case of Bernoulli testing with n and r specified (section 5.4.2) – and the fact that a probability model for the first r order statistics T_{(1)} ≤ T_{(2)} ≤ ··· ≤ T_{(r)} of T_1, ..., T_n has density at t_{(1)} ≤ t_{(2)} ≤ ··· ≤ t_{(r)} of the form
[n!/(n−r)!] Π_{i=1}^{r} f(t_{(i)}) [P(T > t_{(r)})]^{n−r},    (5.29)

where the T_i's are independent (given λ) and identically distributed with distribution P(T ≥ t; λ) whose density at t is f(t). The order statistics of a collection of random variables T_1, ..., T_n is an ordering of these random variables from the smallest to the largest, so that T_{(1)} ≤ T_{(2)} ≤ ··· ≤ T_{(n)}; thus T_{(1)} = min_i T_i and T_{(n)} = max_i T_i. When r = n, (5.29) simplifies to n! Π_{i=1}^{n} f(t_{(i)}), implying that the ordering of (conditionally) independent random variables renders them dependent.
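As a concrete illustration of (5.28) and the total time on test (not part of the original text; the sample values, n and λ below are hypothetical), a minimal sketch in Python:

```python
import math

def type2_total_time_on_test(failure_times, n):
    """Total time on test T(n, r) under Type II censoring: the sum of the r observed
    (ordered) failure times plus (n - r) survivors, each on test until the r-th failure."""
    r = len(failure_times)
    t_r = max(failure_times)
    return sum(failure_times) + (n - r) * t_r

def type2_log_likelihood(lam, failure_times, n):
    """Log of (5.28): log[n!/(n-r)!] + r*log(lam) - lam*T(n, r)."""
    r = len(failure_times)
    T = type2_total_time_on_test(failure_times, n)
    log_const = math.lgamma(n + 1) - math.lgamma(n - r + 1)
    return log_const + r * math.log(lam) - lam * T

# Hypothetical data: n = 10 items on test, observation stopped at the r = 4th failure.
times = [12.0, 30.5, 41.2, 55.0]
print(type2_total_time_on_test(times, n=10))    # T(10, 4)
print(type2_log_likelihood(0.01, times, n=10))  # log-likelihood at lambda = 0.01
```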
Under Type I censoring, with t specified by free will, the stopping rule is again non-informative, and the likelihood of λ for an observed k (the number of failures) and t_{(1)} ≤ t_{(2)} ≤ ··· ≤ t_{(k)} ≤ t is of the form

L(λ; •) = [n!/(n−k)!] Π_{i=1}^{k} λ e^{−λ t_{(i)}} · e^{−λ(n−k)t}
        = [n!/(n−k)!] λ^k e^{−λ T(n,t)},    (5.30)

where T(n,t) is the total time on test. The line of reasoning used to obtain (5.28) is also used here, except that now we need to consider the joint distribution of the first k order statistics out of n.
When censoring is a combination of Type I and Type II, the likelihood will depend on whether t_{(r)} ≤ t or t_{(r)} > t. In the first case it will be the product of the right-hand side of (5.28) and the probability that T_{(r)} ≤ t, which is the distribution function of the r-th order statistic. In the second case, it will be the product of the right-hand side of (5.30) and the probability that T_{(r)} > t.
Specifically, with hybrid testing, L(λ; •) will take the form

[n!/(n−r)!] λ^r e^{−λ T(n,r)} Σ_{i=r}^{n} \binom{n}{i} (1 − e^{−λt})^i e^{−λ(n−i)t}

if t_{(r)} ≤ t, and for t_{(r)} > t, with k < r failures observed by t, it will be

[n!/(n−k)!] λ^k e^{−λ T(n,t)} Σ_{i=0}^{r−1} \binom{n}{i} (1 − e^{−λt})^i e^{−λ(n−i)t}.    (5.31)
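A hedged sketch of how (5.31) might be evaluated numerically is given below; it is not part of the original text, the sample values, n, r and t are hypothetical, and the two branches correspond to the two cases just described:

```python
import math

def hybrid_likelihood(lam, ordered_times, n, r, t):
    """Evaluate the hybrid-censoring likelihood (5.31) at lam.
    ordered_times: the failure times observed before the test stopped."""
    p_fail = 1.0 - math.exp(-lam * t)          # P(an item fails by time t)
    k = len(ordered_times)                     # observed number of failures
    if k == r and ordered_times[-1] <= t:
        # Case t_(r) <= t: right-hand side of (5.28) times P(T_(r) <= t).
        T = sum(ordered_times) + (n - r) * ordered_times[-1]
        tail = sum(math.comb(n, i) * p_fail**i * math.exp(-lam * (n - i) * t)
                   for i in range(r, n + 1))
    else:
        # Case t_(r) > t (k < r failures by t): right-hand side of (5.30) times P(T_(r) > t).
        T = sum(ordered_times) + (n - k) * t
        tail = sum(math.comb(n, i) * p_fail**i * math.exp(-lam * (n - i) * t)
                   for i in range(0, r))
    const = math.factorial(n) / math.factorial(n - k)
    return const * lam**k * math.exp(-lam * T) * tail

# Hypothetical example: n = 8 items, test until the r = 5th failure or t = 50, whichever comes first.
print(hybrid_likelihood(0.02, [6.1, 14.3, 22.7], n=8, r=5, t=50.0))
```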
With retrospective data it is often the case that the censoring rule is not known, and all that we have at our disposal is n, the number of items under observation, the ordered times to failure t_{(1)} ≤ t_{(2)} ≤ ··· ≤ t_{(k)}, and the time, say t, at which observation on the failure/survival process is terminated. The stopping rule will be non-informative as long as t was chosen at free will, or it was determined by either k or by t_{(1)} ≤ t_{(2)} ≤ ··· ≤ t_{(k)}. That is, t should not depend on λ. When such is the case, the likelihood for λ is that given by (5.30), so that with a Bayesian analysis all that matters is T(n,t), the total time on test; similarly with the case of progressive censoring.
In the context of life-testing, the role of informative stopping rules becomes more transparent via the scenario of random censoring. To see how, consider the case of a single item whose lifelength T has the exponential chance distribution considered above. Suppose that the choice of testing a single item (to failure or until it gets censored) is made by free will. Now suppose that the censoring time Y is random, with P(Y > t; θ) having a density g(t; θ) at t. Observation on the item stops when the item fails or gets censored. The parameter of interest is λ, the one associated with the lifelength T; θ is a nuisance parameter. Then the likelihood for λ and θ, were the item to fail at t, will be governed by the probability λe^{−λt} · P(Y > t; θ), whereas it is governed by the probability g(t; θ) · P(T > t; λ) if the item were censored at t. The terms involving θ in the above two likelihoods define the stopping rule, which will be non-informative for λ if λ and θ are independent; it will be informative if θ = λ or if θ depends on λ. As a special case of the above, suppose that Y is also exponential with a scale parameter θ; i.e. P(Y > t; θ) = e^{−θt}. Then the said likelihoods are λe^{−(λ+θ)t} and θe^{−(λ+θ)t}, respectively; t is the total time on test.
With n items under observation, where n is chosen by free will, the likelihood will be the product of the above two likelihoods, the first for every item that fails and the second for every item that gets censored. Thus, for example, suppose that of the n items under test, k experience a failure at times t_1, t_2, ..., t_k, and the remaining n − k get censored at times t*_{k+1}, ..., t*_n. Then the likelihood of λ and θ is

λ^k θ^{n−k} exp{−(λ + θ)(Σ_{i=1}^{k} t_i + Σ_{j=k+1}^{n} t*_j)},

where Σ_{i=1}^{k} t_i + Σ_{j=k+1}^{n} t*_j is the total time on test.
If λ and θ are independent, then the stopping rule given by the censoring times is non-informative for λ, and the term θ^{n−k} exp{−θ × (total time on test)} cancels out as a constant in an application of Bayes' law. Thus the part of the likelihood that is germane to inference about λ is

λ^k exp{−λ(Σ_{i=1}^{k} t_i + Σ_{j=k+1}^{n} t*_j)}.    (5.32a)
If θ = λ, the likelihood is

λ^n exp{−2λ(Σ_{i=1}^{k} t_i + Σ_{j=k+1}^{n} t*_j)},    (5.32b)

and the stopping rule contributes to inference about λ; it is therefore informative.
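The following sketch simulates random censoring by an independent exponential censoring time and evaluates the likelihood kernels (5.32a) and (5.32b); it is an illustration only, not part of the original development, and the rates and sample size are made up:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical set-up: n items with exponential lifelengths (rate lam_true) subject to
# independent exponential censoring (rate theta_true).
n, lam_true, theta_true = 25, 0.05, 0.02
life = rng.exponential(1.0 / lam_true, size=n)
cens = rng.exponential(1.0 / theta_true, size=n)
observed = np.minimum(life, cens)            # what is actually recorded
failed = life <= cens                        # True if the observation is a failure

k = int(failed.sum())                        # number of observed failures
ttt = observed.sum()                         # total time on test (failure plus censoring times)

def log_kernel_independent(lam):
    """Log of (5.32a): the part of the likelihood germane to lam when
    lam and theta are independent (non-informative censoring)."""
    return k * np.log(lam) - lam * ttt

def log_kernel_theta_equals_lam(lam):
    """Log of (5.32b): theta = lam, so the stopping rule is informative."""
    return n * np.log(lam) - 2.0 * lam * ttt

print(k, ttt)
print(log_kernel_independent(0.05), log_kernel_theta_equals_lam(0.05))
```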
In any case, the role of the total time on test for inference about λ is central with regard to all the censoring schemes discussed by us.
Going back to where we left off with (5.27), suppose that F(dλ) is the gamma density with a scale parameter α and a shape parameter β, where α and β are elicited via the approach prescribed in section 5.2; thus

F(dλ) = [α^β / Γ(β)] λ^{β−1} e^{−αλ} dλ.    (5.33)
Then for L(λ; •) given by (5.28) – the case of Type II censoring – it is straightforward to verify that the posterior for λ is also a gamma density with a scale parameter (α + T(n,r)) and a shape parameter (β + r); i.e.

F(dλ | •) = {(α + T(n,r))^{β+r} λ^{β+r−1} e^{−(α + T(n,r))λ} / Γ(β + r)} dλ.    (5.34)

Consequently, E(λ | •) = (β + r)/(α + T(n,r)), the role of r and T(n,r) being to change the mean of λ from β/α to that given above.
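A minimal sketch of the conjugate update in (5.34), using hypothetical elicited values of α and β and a hypothetical total time on test, could look as follows:

```python
from scipy import stats

def exponential_posterior(alpha, beta, total_time_on_test, r):
    """Gamma posterior (5.34): shape beta + r, and 'scale parameter' alpha + T(n, r)
    in the book's sense, i.e. a rate; scipy's scale argument is its reciprocal."""
    shape = beta + r
    rate = alpha + total_time_on_test
    return stats.gamma(a=shape, scale=1.0 / rate)

# Hypothetical elicited prior: alpha = 100, beta = 2, so the prior mean of lambda is 0.02.
post = exponential_posterior(alpha=100.0, beta=2.0, total_time_on_test=463.7, r=4)
print(post.mean())          # E(lambda | data) = (beta + r) / (alpha + T(n, r))
print(post.interval(0.95))  # a 95% posterior credible interval for lambda
```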
Similarly, with Type I censoring, when L(λ; •) is given by (5.30), the gamma prior on λ results in a gamma posterior for λ of the same form as (5.34), but with T(n,r) replaced by T(n,t) and r by k, the observed number of failures. In general, the above result will also hold when the observed failure data is retrospective, or the testing is progressive, or the censoring is random and non-informative; all that matters is the use of the appropriate total time on test statistic. The only exception is the case of informative censoring, wherein the likelihood will entail additional terms involving λ. In the case of hybrid censoring, it can be verified (cf. Gupta and Kundu, 1998) that the likelihood (equation (5.31)) can also be written as
[n!/(n−r*)!] λ^{r*} exp(−λS),    (5.35)

where r* is the number of units that fail in (0, t*], with t* = min(t_{(r)}, t), and S, the total time on test, is

S = Σ_{i=1}^{r*} t_{(i)} + (n − r*)t*   if r* ≥ 1,
  = nt                                 if r* = 0.
This form of the likelihood parallels that of (5.28) and (5.30), and thus with hybrid testing a result of the form given by (5.34) continues to hold, with T(n,r) replaced by S and r by r*.
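For illustration, a small sketch (hypothetical data and test plan, not from the original text) that computes r*, t* and the total time on test S of (5.35):

```python
def hybrid_total_time_on_test(ordered_times, n, r, t):
    """Total time on test S of (5.35) under hybrid censoring.
    t_star = min(t_(r), t); r_star = number of failures in (0, t_star]."""
    t_star = min(ordered_times[r - 1], t) if len(ordered_times) >= r else t
    observed = [x for x in ordered_times if x <= t_star]
    r_star = len(observed)
    if r_star == 0:
        return n * t, 0
    return sum(observed) + (n - r_star) * t_star, r_star

# Hypothetical example: n = 8, stop at the 5th failure or at t = 50, whichever comes first.
S, r_star = hybrid_total_time_on_test([6.1, 14.3, 22.7, 38.4, 47.9], n=8, r=5, t=50.0)
print(S, r_star)   # here t_(5) = 47.9 < 50, so t* = 47.9 and r* = 5
```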
Once F(dλ | •) is assessed, it can be used in (5.4) and (5.7) to provide predictions of future lifetimes. Furthermore, since λ is the model failure rate of P(T > t; λ), (5.34) provides inference about λ, should this be of interest. Specifically, from (5.4) we can derive the result that

P(T_{n+1} ≥ t_{n+1} | •) = [(α + T(n,r)) / (α + T(n,r) + t_{n+1})]^{β+r},

and from (5.7) the result that

P(T_{n+1} ≥ t_{n+1}, ..., T_{n+m} ≥ t_{n+m} | •) = [(α + T(n,r)) / (α + T(n,r) + t*)]^{β+r},

where t* = Σ_{i=1}^{m} t_{n+i}. Both the predictive distributions are Pareto, the former indexed by t_{n+1}, the latter by t*. Indexing by t* suggests that in the scenario considered here, what matters is t*, the total horizon of prediction. The individual times t_{n+1}, ..., t_{n+m} do not matter; only their sum does. This feature can be seen as a manifestation of the lack of memory property of the exponential distribution and will always be true, irrespective of the prior distribution that is used.
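The Pareto form of the predictive distribution can be checked numerically, as in the sketch below (not part of the original text; the posterior parameters and the horizon are hypothetical). The Monte Carlo average of exp(−λ t_{n+1}) over posterior draws of λ should agree with the closed form:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical posterior from (5.34): 'scale' alpha + T(n, r) = 563.7 and shape beta + r = 6.
a_post, b_post = 563.7, 6.0
t_next = 75.0                        # horizon for the next item's lifetime

# Closed form: the Pareto predictive probability P(T_{n+1} >= t_next | data).
closed_form = (a_post / (a_post + t_next)) ** b_post

# Monte Carlo: average exp(-lambda * t_next) over draws of lambda from the gamma posterior.
lam_draws = rng.gamma(shape=b_post, scale=1.0 / a_post, size=200_000)
monte_carlo = np.exp(-lam_draws * t_next).mean()

print(closed_form, monte_carlo)      # the two numbers should agree closely
```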
Instead of assigning the gamma prior distribution F(dλ) on λ (equation (5.33)), one may prefer to assign a prior distribution on μ = 1/λ, the mean time to failure. A possible choice is the inverted gamma distribution with parameters α and β; here, for μ > 0,

F(dμ) = [α^β / Γ(β)] μ^{−(β+1)} exp(−α/μ) dμ.

The mean and variance of this distribution are α/(β−1) and α²/[(β−1)²(β−2)], respectively; these exist only if β > 2. With this choice for a prior, it can be seen that (5.12) will continue to hold and
that the predictive distribution of T_{n+1} is precisely the same Pareto distribution obtained when the prior on λ was a gamma distribution; i.e.

P(T_{n+1} ≥ t | •) = [(α + T(n,r)) / (α + T(n,r) + t)]^{β+r}.

This is because a gamma prior on λ induces an inverted gamma prior on μ, and coherence of the probability calculus will ensure that the predictive distribution remains the same.
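As a check on the claim that a gamma prior on λ induces an inverted gamma prior on μ (this verification is not in the original text), a one-line change of variables with μ = 1/λ, so that λ = 1/μ and |dλ/dμ| = μ^{−2}, gives

f(μ) = [α^β/Γ(β)] μ^{−(β−1)} e^{−α/μ} · μ^{−2} = [α^β/Γ(β)] μ^{−(β+1)} e^{−α/μ},

which is exactly the inverted gamma density written above.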
There are some other interesting features about the choice of the inverted gamma prior on μ. The first thing to note is that E(T_{n+1} | •), the mean of the predictive distribution, is (α + T(n,r))/(β + r − 1) – the mean of a Pareto distribution with parameters (α + T(n,r)) and (β + r). Consequently, were there to be no failure data under this set-up, both T(n,r) and r would be absent, so that E(T_{n+1}), the mean of the predictive distribution, would be α/(β−1). But α/(β−1) is the mean of the inverted gamma prior on μ. Thus here we have the curious result that in the absence of any life-testing data, the means of the prior and the predictive distributions are identical, provided that they exist; i.e. provided that β > 1.
This completes our discussion of inference and prediction using an exponential chance distribution under various informative and non-informative censoring schemes. In principle, the case of the gamma and the Weibull distributions, or for that matter any other chance distribution, will follow along similar lines, the main difficulty being the assessment of multi-dimensional parameters and the computational issues that such matters entail. Here the total time on test statistic will no longer play the central role it plays in the case of the exponential distribution, and expressions having the closed-form nature of (5.34) are hard to come by. Section 5.4.5 illustrates the nature of these difficulties via the case of the Weibull distribution. However, before doing so, I need to draw attention to the following important point about assessing chance distributions.
Estimated Reliability – An Approximation to the Predictive Distribution
I start by drawing attention to the feature that what has been discussed thus far pertains to a coherent assessment of predictive distributions, and of the parameter λ of the underlying exponential chance distribution exp(−λt). The coherent assessment of this chance distribution itself, namely the reliability function, has not been mentioned. This may seem strange because much of the non-Bayesian literature in reliability and life-testing focuses on estimating the reliability function.
Why this omission?
Our response to the above question is that in practice, what matters is assessing predictive distributions of observable quantities like T, and not the distribution of T conditioned on unobservable parameters like λ, which is what the reliability function is. However, if for some reason – for instance section 6.2.3 – an assessment of a chance distribution is still desired, then the best that one can hope to do is to replace λ by E(λ | •), the mean of its posterior distribution.
It is important to note that there is no rule of probability that justifies this step. The closest one comes to a rationale is that exp(−E(λ | •)t) is an approximation to the predictive distribution of T. Specifically, by an infinite series expansion of exp(−E(λ | •)t) and of exp(−λt), we can verify that the predictive distribution of T satisfies

∫_0^∞ exp(−λt) F(dλ | •) ≈ exp(−E(λ | •)t).

Thus our proposed estimate of the reliability function can be seen as a surrogate for the predictive distribution of T.
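A small numerical sketch (hypothetical posterior parameters, not from the original text) comparing the surrogate exp(−E(λ | •)t) with the exact Pareto predictive distribution:

```python
import numpy as np

# Hypothetical posterior (5.34): shape b_post and rate a_post, so E(lambda | data) = b_post / a_post.
a_post, b_post = 563.7, 6.0
post_mean = b_post / a_post

t = np.array([10.0, 50.0, 100.0, 250.0])

exact = (a_post / (a_post + t)) ** b_post        # exact predictive distribution of T (Pareto form)
approx = np.exp(-post_mean * t)                  # 'estimated reliability' exp(-E(lambda | data) t)

print(np.column_stack([t, exact, approx]))       # the approximation degrades as t grows
```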
5.4.5 Weibull Life-testing Procedures
Suppose that P(T > t; λ, β) = exp(−λt^β), for λ, β > 0 and t ≥ 0, and to keep matters simple we focus attention on the case of Type II censoring. Thus, given the r ordered failure times t_{(1)} ≤ t_{(2)} ≤ ··· ≤ t_{(r)}, the likelihood of λ and β is

L(λ, β; •) = [n!/(n−r)!] (λβ)^r exp{−λ(Σ_{i=1}^{r} t_{(i)}^β + (n−r)t_{(r)}^β)} Π_{i=1}^{r} t_{(i)}^{β−1},    (5.36)
which, when re-parameterized as L(σ, β; •) with σ = λ^{−1/β}, and when multiplied by a prior such as that given by (5.9), yields a posterior F(dσ, dβ | •). This posterior, when used in (5.4) and (5.7), provides predictions of future lifetimes. The computational challenge posed by such a scenario, involving the integral of a non-linear function of the parameters, is daunting. The situation is no better if other forms of censoring were to be considered, or even in the absence of any censoring. The material in Singpurwalla (1988b), which is based on a conventional numerical perspective, provides a feel for the ensuing difficulties. It is because of such obstacles that some Bayesians have abandoned the use of proper, subjectively induced priors and have appealed to computer-intensive approaches such as simulation by Markov Chain Monte Carlo (MCMC) (cf. Upadhyay, Vasishta and Smith, 2001). The practical importance of such methods mandates that some of them be overviewed, and this is done in the appendix; the remainder of this section is devoted to an application of the MCMC method to the scenario at hand. Readers interested in a better appreciation of the material that follows are urged to first consult Appendix A.
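Before turning to MCMC, the Type II likelihood (5.36) itself is straightforward to evaluate directly; the sketch below (hypothetical data and parameter values, not from the original text) computes its logarithm:

```python
import math

def weibull_type2_loglik(lam, beta, ordered_times, n):
    """Log of the Weibull Type II likelihood (5.36),
    with chance distribution P(T > t; lam, beta) = exp(-lam * t**beta)."""
    r = len(ordered_times)
    t_r = ordered_times[-1]
    log_const = math.lgamma(n + 1) - math.lgamma(n - r + 1)
    total = sum(t ** beta for t in ordered_times) + (n - r) * t_r ** beta
    return (log_const + r * (math.log(lam) + math.log(beta))
            - lam * total
            + (beta - 1) * sum(math.log(t) for t in ordered_times))

# Hypothetical data: n = 10 items, observation stopped at the r = 4th failure.
print(weibull_type2_loglik(0.001, 1.5, [12.0, 30.5, 41.2, 55.0], n=10))
```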
Inference for the Weibull via MCMC
In the interest of generality, suppose that the Weibull chance distribution considered here has three parameters, with γ as a threshold; recall (section 5.3.2) that this was the model considered by Green et al. (1994). Suppose further that in (5.36), n = r, so that there is no censoring; from an MCMC point of view this is no longer a limitation (Appendix A). Then, with the re-parameterization σ = λ^{−1/β}, the likelihood of (5.36) becomes
(β^n / σ^{nβ}) Π_{i=1}^{n} (t_i − γ)^{β−1} exp{−Σ_{i=1}^{n} ((t_i − γ)/σ)^β},    (5.37)
so that, with the joint prior F(dσ, dβ, dγ) ∝ (σβ)^{−1} dσ dβ dγ of Green et al. (1994) (also Sinha and Sloan, 1988), the posterior is proportional to

(β^{n−1} / σ^{nβ+1}) Π_{i=1}^{n} (t_i − γ)^{β−1} exp{−Σ_{i=1}^{n} ((t_i − γ)/σ)^β}.    (5.38)
We need to emphasize that the prior mentioned above is improper, and from a purely subjective Bayesian point of view, it should not be entertained. Besides convenience, its virtue is that it facilitates a straightforward application of the Gibbs sampler; my aim in introducing this development is to ensure completeness of coverage.
The full conditionals generated by the posterior of (5.38), with t = (t_1, ..., t_n), can be written as:

p(σ | β, γ, t) ∝ (1/σ^{nβ+1}) exp{−Σ_{i=1}^{n} ((t_i − γ)/σ)^β},

p(β | σ, γ, t) ∝ (β^{n−1}/σ^{nβ}) Π_{i=1}^{n} (t_i − γ)^β exp{−Σ_{i=1}^{n} ((t_i − γ)/σ)^β}, and

p(γ | σ, β, t) ∝ Π_{i=1}^{n} (t_i − γ)^{β−1} exp{−Σ_{i=1}^{n} ((t_i − γ)/σ)^β}.
Generating random samples from the first of the above three conditionals is straightforward, since under the transformation λ = σ^{−β} it is, de facto, a gamma density. Generating random samples from p(β | σ, γ, t) is via a procedure due to Gilks and Wild (1992), whereas those from p(γ | σ, β, t) are via a scheme proposed by Upadhyay, Vasishta and Smith (2001). This latter paper is noteworthy because it illustrates the above approach by considering some data on the fatigue life of bearings.
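The sketch below illustrates the flavour of such a Gibbs sampler for the posterior (5.38). It is not the Gilks–Wild or Upadhyay et al. implementation: the exact gamma draw is used for σ (via λ = σ^{−β}), but simple random-walk Metropolis steps stand in for the β and γ conditionals, and the data, starting values and the restriction 0 ≤ γ < min t_i are all assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical, made-up failure times (n = r, no censoring).
t = np.array([17.9, 28.9, 33.0, 41.5, 42.1, 45.6, 48.4, 51.8, 68.6, 98.6])
n = len(t)

def log_post(sigma, beta, gamma):
    """Log of the posterior (5.38), up to a constant; -inf outside the assumed support."""
    if sigma <= 0 or beta <= 0 or gamma < 0 or gamma >= t.min():
        return -np.inf
    z = (t - gamma) / sigma
    return ((n - 1) * np.log(beta) - (n * beta + 1) * np.log(sigma)
            + (beta - 1) * np.sum(np.log(t - gamma)) - np.sum(z ** beta))

def gibbs(n_iter=5000, sigma=50.0, beta=1.5, gamma=1.0):
    draws = np.empty((n_iter, 3))
    for m in range(n_iter):
        # 1. sigma: exact draw, since lambda = sigma**(-beta) has a gamma full conditional.
        rate = np.sum((t - gamma) ** beta)
        lam = rng.gamma(shape=n, scale=1.0 / rate)
        sigma = lam ** (-1.0 / beta)
        # 2. beta: random-walk Metropolis on log(beta)
        #    (a simple stand-in for the Gilks-Wild adaptive rejection step).
        prop = beta * np.exp(0.1 * rng.standard_normal())
        log_ratio = (log_post(sigma, prop, gamma) - log_post(sigma, beta, gamma)
                     + np.log(prop / beta))
        if np.log(rng.uniform()) < log_ratio:
            beta = prop
        # 3. gamma: random-walk Metropolis (a stand-in for the Upadhyay et al. scheme).
        prop = gamma + 0.5 * rng.standard_normal()
        if np.log(rng.uniform()) < log_post(sigma, beta, prop) - log_post(sigma, beta, gamma):
            gamma = prop
        draws[m] = sigma, beta, gamma
    return draws

draws = gibbs()
print(draws[1000:].mean(axis=0))   # rough posterior means of (sigma, beta, gamma) after burn-in
```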
To conclude, the success of the Gibbs sampling approach for analyzing failure-time data assumed to be meaningfully described by a Weibull chance distribution depends on a key feature, namely the use of a prior that facilitates a generation of random samples from the conditionals of the posterior.