DECLARATION 2- PUBLICATIONS
2.6 The role of AM processes on risk mitigation
2.6.2 Life data analysis
\[
\lambda(t) = -\frac{1}{R(t)} \cdot \frac{dR(t)}{dt} = \frac{f(t)}{R(t)} \tag{2.8}
\]
\[
\lambda(t) = \frac{f(t)}{R(t)} = \frac{f(t)}{1 - F(t)} \tag{2.9}
\]
where f(t) is the probability density function (PDF).
Also,
\[
\lambda(t) = \frac{1}{R(t)} \cdot \frac{dF(t)}{dt} \tag{2.10}
\]
Integrating equation (2.8) from 0 to t gives:
\[
\int_0^t \lambda(t)\,dt = \int_0^t \frac{dF(t)}{R(t)} \tag{2.11}
\]
Alternatively,
\[
\int_0^t \lambda(t)\,dt = -\int_0^t \frac{1}{R(t)} \cdot \frac{dR(t)}{dt}\,dt = -\int_{R(0)}^{R(t)} \frac{dR(t)}{R(t)} = -\log_e R(t)
\]
since R(0) = 1, so that
\[
R(t) = \exp\left[-\int_0^t \lambda(t)\,dt\right] \tag{2.11}
\]
In analyzing life data for electrical equipment like transformers and reactors, the most commonly used distribution functions for fitting the data are as follows [73]:
1) Normal distribution
2) Lognormal distribution
3) Extreme value distribution
4) Weibull distribution
The normal distribution may be applicable when the sample size is large, but it is only occasionally used for life data analysis because it always gives an increasing hazard rate. It is therefore suitable mainly for the life of products with wear-out types of failure [73]. The normal PDF is given by:
\[
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right], \quad -\infty < x < \infty \tag{2.12}
\]
where x is the random variable, μ is the mean, and σ is the standard deviation.
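The increasing hazard rate of the normal distribution can be checked numerically. The following is a minimal sketch using only the Python standard library; the mean life of 40 years and standard deviation of 8 years are hypothetical values chosen for illustration and do not come from any data set in this work.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal probability density f(x)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_reliability(x, mu, sigma):
    """R(x) = 1 - F(x), computed with the complementary error function."""
    z = (x - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def normal_hazard(x, mu, sigma):
    """Hazard rate as the ratio f(x) / R(x)."""
    return normal_pdf(x, mu, sigma) / normal_reliability(x, mu, sigma)

# Hypothetical parameters (assumed for illustration only).
MU, SIGMA = 40.0, 8.0
ages = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
hazards = [normal_hazard(a, MU, SIGMA) for a in ages]

for age, h in zip(ages, hazards):
    print(f"age {age:5.1f}  hazard {h:.5f}")
```

Each printed hazard value is larger than the one before it, which is the wear-out behaviour described above.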
The lognormal distribution empirically fits many types of data adequately, especially if the data span several powers of 10, as is the case for some life data involving metal fatigue and electrical insulation [73]. The lognormal PDF is expressed as follows:
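\[
f(x) = \frac{0.4343}{x\,\sigma\sqrt{2\pi}} \exp\left[-\frac{(\log_{10} x - \mu)^2}{2\sigma^2}\right], \quad x > 0 \tag{2.13}
\]
where μ is the log mean and σ is the log standard deviation.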
The Weibull distribution can be used to fit many kinds of data, as well as many types of distributions, by varying its shape parameter. It does so by adopting an expression for the hazard rate λ(t) that permits enough variability to fit a range of probable failure patterns (thereby transforming equation 2.8), as follows [74]:
\[
\lambda(t) = a\,t^{b} \tag{2.14}
\]
Substituting equation (2.14) into equation (2.11) gives:
\[
R(t) = \exp\left[-\frac{a}{b+1}\,t^{\,b+1}\right] = \exp\left[-A\,t^{\,b+1}\right] \tag{2.15}
\]
where
\[
A = \frac{a}{b+1} = \left(\frac{1}{\eta}\right)^{b+1}
\]
Thus, writing β = b + 1 and measuring time from the location parameter γ,
\[
R(t) = \exp\left[-\left(\frac{t - \gamma}{\eta}\right)^{\beta}\right] \tag{2.16}
\]
where t is the time to failure; γ is the location parameter, that is, the time at which F(t) = 0 and R(t) = 1 (also called the datum parameter or the failure-free life); η is the characteristic life or scale parameter; and β is the shape parameter.
Therefore, the substitution of equation (2.16) in (2.4) gives the following:
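\[
F(t) = 1 - \exp\left[-\left(\frac{t - \gamma}{\eta}\right)^{\beta}\right] \tag{2.17}
\]
\[
f(t) = \frac{dF(t)}{dt} = \frac{\beta}{\eta} \cdot \left(\frac{t - \gamma}{\eta}\right)^{\beta - 1} \cdot \exp\left[-\left(\frac{t - \gamma}{\eta}\right)^{\beta}\right] \tag{2.18}
\]
\[
\lambda(t) = \frac{f(t)}{R(t)} = \frac{\beta}{\eta} \cdot \left(\frac{t - \gamma}{\eta}\right)^{\beta - 1} \tag{2.19}
\]
If t − γ = η, then
\[
R(t) = \exp(-1) = \frac{1}{e} = 0.368 \tag{2.20}
\]
and hence
\[
F(t) = 1 - R(t) = 63.2\%
\]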
Thus the scale parameter η represents the time from 0 (the starting point of equipment operation) to when 63.2% of the population can be expected to have failed. This holds regardless of β, which determines the shape of the Weibull distribution curve. Figure 2-7 illustrates how the Weibull distribution can be used to model different types of distribution by changing the value of β. It shows three Weibull PDFs, for β = 1, β = 2 and β = 3.
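Figure 2-7: Fitting various distributions using Weibull shape parameter β (Weibull PDFs for β = 1, β = 2 and β = 3; horizontal axis: random variable; vertical axis: magnitude)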
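The Weibull reliability, PDF and hazard rate discussed above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the argument name `gamma` for the location parameter, and the choice η = 1 are assumptions made here, and the functions are intended for t greater than the location parameter.

```python
import math

def weibull_reliability(t, eta, beta, gamma=0.0):
    """R(t) = exp(-((t - gamma) / eta) ** beta), for t > gamma."""
    x = (t - gamma) / eta
    return math.exp(-(x ** beta))

def weibull_pdf(t, eta, beta, gamma=0.0):
    """f(t) = (beta / eta) * x**(beta - 1) * exp(-x**beta), x = (t - gamma) / eta."""
    x = (t - gamma) / eta
    return (beta / eta) * x ** (beta - 1.0) * math.exp(-(x ** beta))

def weibull_hazard(t, eta, beta, gamma=0.0):
    """Hazard rate f(t) / R(t) = (beta / eta) * x**(beta - 1)."""
    x = (t - gamma) / eta
    return (beta / eta) * x ** (beta - 1.0)

ETA = 1.0  # assumed scale parameter for the illustration
for beta in (1.0, 2.0, 3.0):
    r_at_eta = weibull_reliability(ETA, ETA, beta)
    pdf_values = [weibull_pdf(t, ETA, beta) for t in (0.5, 1.0, 1.5)]
    print(f"beta={beta}: R(eta)={r_at_eta:.3f}, "
          f"F(eta)={1.0 - r_at_eta:.3f}, pdf={pdf_values}")
```

For every shape parameter the reliability at t = η evaluates to exp(−1), so the 63.2% property of the scale parameter does not depend on β. With β = 1 the hazard rate is constant, while β = 2 and β = 3 give hazard rates that increase with time.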
Section 2.6.2.2 presents special cases of how life data analysis is applied to electrical equipment, whereas Section 2.6.2.3 discusses data analysis techniques.
2.6.2.2 Special cases applicable to electrical equipment
Electric power and aerospace engines are among products with complicated layouts. They usually have a high infant mortality rate, and the hazard rate increases more rapidly with age due to the aging process under heavy duty [75]. In recent years, some power utilities have adopted Perks’ formula to incorporate transformer life data, which gives the hazard rate h(t) as follows [16]:
\[
h(t) = \frac{A + \alpha \exp(\beta t)}{1 + \mu \exp(\beta t)} \tag{2.21}
\]
where A is a constant representing the risk of failure by random events such as lightning and collisions; α and β are constants that control the shape of the hazard increase with the passage of time; and μ is a constant that slows down the rate of hazard increase at older ages.
Makeham’s formula is another fitting model that transformer insurance companies in the USA have used for a number of years to represent the instantaneous risk function. It is a simplification of Perks’ formula, given as follows [76]:
\[
h(t) = A + \alpha \exp(\beta t) \tag{2.22}
\]
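The relationship between the two hazard models can be illustrated with a short sketch. The parameter values below are hypothetical and chosen only to make the behaviour visible; they are not fitted to any transformer population.

```python
import math

def perks_hazard(t, A, alpha, beta, mu):
    """Perks hazard: (A + alpha * exp(beta * t)) / (1 + mu * exp(beta * t))."""
    growth = math.exp(beta * t)
    return (A + alpha * growth) / (1.0 + mu * growth)

def makeham_hazard(t, A, alpha, beta):
    """Makeham hazard: A + alpha * exp(beta * t)."""
    return A + alpha * math.exp(beta * t)

# Hypothetical constants (assumed for illustration only).
A, ALPHA, BETA, MU = 0.005, 1.0e-4, 0.12, 1.0e-3

for age in (0, 20, 40, 60, 80):
    print(f"age {age:3d}  Perks {perks_hazard(age, A, ALPHA, BETA, MU):.4f}"
          f"  Makeham {makeham_hazard(age, A, ALPHA, BETA):.4f}")
```

Setting μ = 0 in the Perks function reproduces the Makeham values exactly. For μ > 0 the Perks hazard levels off towards α/μ at old ages, whereas the Makeham hazard keeps growing exponentially.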
Another fitting technique employed in life data analysis is the Iowa Survivor Curve, which utilities apply to analyze transformer reliability [15]. Figure 2-8 is an example of the Iowa Survivor Curves. This fitting technique contains 18 basic Iowa curves, each with a unique two-character name. The first character is R, L or S, indicating whether the curve has a right, left or symmetrical mode, respectively. The second character is a digit from 0 to 6 that indicates the steepness of the modal peak.
Figure 2-8: Simplified IOWA type R1-R5 Survivor curves (single vintage) [15]
The major flaw of the Iowa Survivor Curves is that the choice of the best-fitting curve is subjective. In addition, it is very difficult to identify the best-fitting curve if the life data cover too short a period in comparison to the full extent of the Iowa Survivor Curves, which results in a data misfit [72].
In general, the analysis of power system equipment faces many setbacks, including:
1) Different equipment vintages (i.e. varying in-service times).
2) Similar equipment in different zones may be subjected to different loading conditions.
3) Equipment installed in different regions or zones may be subjected to different fault levels and failure rates; hence combining data from different zones may not yield accurate statistical inferences.
The challenges listed above have resulted in a tendency to derive mean lives and to use them for the statistical analysis [77], [78]. However, the fitted results depend largely on the type of distribution used [72]. Generally, the Weibull and lognormal distributions have better statistical data fitting capabilities for electrical equipment [73], [79]. The Weibull distribution tends to be used most because it is flexible enough to represent many types of distributions. The Gamma distribution can also be used to model the random lifetimes of items (products), as it too can assume a number of shapes, but it is less flexible than the Weibull distribution because its PDF is always right modal regardless of the type of data used [80].
In general, when modelling equipment life data using a parametric family of distributions, the accuracy of the parameter estimates depends on the estimation method used [35], [81]. For example, the least squares method (LSM) is suitable for large sample sizes; maximum likelihood estimation (MLE) is the most appropriate for handling both censored and non-censored data (as well as extremely small sample sizes); and the method of moments (MOM) may be used to validate the results of the MLE. When applied to large sample sizes, the MLE simply reduces to the LSM [82].
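The two estimation routes mentioned above can be sketched for the two-parameter Weibull distribution as follows. This is a minimal illustration under stated assumptions, not the procedure used in the cited works: the LSM is taken here to mean median rank regression with Bernard's approximation, the MLE handles right-censored data by solving the profile likelihood equation with bisection, and the lifetimes are synthetic values drawn with η = 30 and β = 2.5 and censored at 35.

```python
import math
import random

def weibull_mle(times, failed, tol=1e-10):
    """MLE of (eta, beta); failed[i] is False for a right-censored unit."""
    r = sum(failed)                               # number of observed failures
    logs = [math.log(t) for t in times]
    log_max = max(logs)                           # rescaling avoids overflow in t**beta
    mean_log_fail = sum(l for l, f in zip(logs, failed) if f) / r

    def score(beta):
        # Profile likelihood equation in beta; its root is the MLE.
        w = [math.exp(beta * (l - log_max)) for l in logs]
        weighted = sum(wi * l for wi, l in zip(w, logs)) / sum(w)
        return weighted - 1.0 / beta - mean_log_fail

    lo, hi = 1e-3, 1.0
    while score(hi) < 0.0:                        # expand the bracket until the sign changes
        hi *= 2.0
    while hi - lo > tol:                          # bisection; score increases with beta
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    w_sum = sum(math.exp(beta * (l - log_max)) for l in logs)
    eta = math.exp(log_max) * (w_sum / r) ** (1.0 / beta)
    return eta, beta

def weibull_lsm(failure_times):
    """Median rank regression of ln(-ln(1 - F)) on ln(t), complete sample only."""
    ts = sorted(failure_times)
    n = len(ts)
    xs = [math.log(t) for t in ts]
    ys = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    beta = sxy / sxx                              # slope of the fitted line
    eta = math.exp(-(y_bar - beta * x_bar) / beta)  # intercept equals -beta * ln(eta)
    return eta, beta

# Synthetic data (assumed values, for illustration only).
random.seed(1)
lifetimes = [random.weibullvariate(30.0, 2.5) for _ in range(2000)]
CENSOR_AT = 35.0
observed = [min(t, CENSOR_AT) for t in lifetimes]
failed = [t <= CENSOR_AT for t in lifetimes]

eta_mle, beta_mle = weibull_mle(observed, failed)
eta_lsm, beta_lsm = weibull_lsm(lifetimes)
print(f"MLE (censored sample): eta={eta_mle:.2f}, beta={beta_mle:.2f}")
print(f"LSM (complete sample): eta={eta_lsm:.2f}, beta={beta_lsm:.2f}")
```

The MLE is applied to the censored sample because it uses the censoring times directly, whereas the regression sketch needs a complete sample. The MLE function assumes at least one failure and failure times that are not all identical.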
Statistical data analysis is not common in the power sector because data are often unavailable: record keeping is poor, and utility asset managers tend to lose track of failure records as time progresses. Similarly, for large power utility grids, it is held that statistical inference for components like transformers can be considered valid only if the number of components exceeds 200 [72]. However, this proposition can be questioned, because in a small grid, where only a few items of equipment are installed, it is still possible to obtain accurate times to failure that support accurate statistical data analysis. In addition, special techniques exist for handling extremely small data sets. For example, [77] carried out a statistical data analysis with as few as four reactors. Therefore, the credibility of statistical data analysis rests not only on the quantity of the data, but also on the method used for computing the parameter estimates.
There are three main data analysis approaches in engineering and science, namely: Classical Analysis (CA), Exploratory Data Analysis (EDA) and Bayesian Analysis (BA) [83], [84]. The sequence of steps for each of the three techniques is outlined in Table 2-3.
Table 2-3: Sequence of data analysis techniques [83]
Technique   Sequence
CA          Problem → Data → Model → Analysis → Conclusions
EDA         Problem → Data → Analysis → Model → Conclusions
BA          Problem → Data → Model → Prior distribution → Conclusions
These approaches are classified according to the way they deal with the underlying data. For the CA, the data collection is followed by the imposition of a model such as normality or linearity. The analysis, estimation, and testing that follow are focused on the parameters of that model. For the EDA, the data collection is immediately followed by analysis, with the objective of inferring which model would be appropriate. For the BA, on the other hand, the analyst attempts to incorporate scientific or engineering knowledge or expertise into the analysis by imposing a data-independent distribution on the parameters of the selected model. Thus the BA consists of formally combining the prior distribution on the parameters with the collected data to jointly make inferences and/or test assumptions about the model parameters [84].
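The combination of prior knowledge and data in the BA can be written compactly using Bayes' theorem. The symbols below are generic and introduced here for illustration only: θ denotes the parameters of the selected model, D the collected data, p(θ) the data-independent prior distribution, and p(D | θ) the likelihood of the data under the model.

```latex
\[
p(\theta \mid D)
  = \frac{p(D \mid \theta)\, p(\theta)}{\int p(D \mid \theta)\, p(\theta)\, d\theta}
  \;\propto\; p(D \mid \theta)\, p(\theta)
\]
```

Inferences about the model parameters are then drawn from the posterior distribution p(θ | D) rather than from the data alone.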
In practice, most data analysts combine all three approaches. The current research predominantly uses the CA and some aspects of the EDA. The merit of combining the EDA and the CA techniques is that the EDA provides a broad spectrum of data analysis for gaining insight into the engineering process behind the data (that is, the data structure), for obtaining a good fit and for estimating the parameters, whereas the CA approach can be used to measure the strength of associations and to identify patterns that help in drawing conclusions [83], [84].