5.2 Critical review of the RCM
5.2.1 Merits, demerits and opportunities
The RCM, initially applied in the aviation industry, establishes maintenance and refurbishment needs for complex, critical assets [26], [105] based on constant failure and repair rates [59]. Critical assets are those for which the financial and service level impacts of failure justify proactive assessment and restoration [10]. The main value of the RCM lies in answering the seven questions about the functions of an asset, its functional failures and their causes, and the remedial actions needed, in order to determine the most appropriate maintenance strategies [105]-[107]. Answering the questions requires detailed decision algorithms, which can be very tedious to formulate. For that reason, asset managers tend either to reduce the number of failure modes considered or to restrict the implementation of the RCM to critical assets. The determination of critical equipment involves a decision algorithm which can be summarized in the form shown in Figure 5-1.
The RCM has been described as a structured methodology and maintenance organization or process for establishing the most cost effective level of reliability [57], [106]-[108]. The RCM is claimed to reduce 11 kV transformer maintenance costs by 30 to 40% and routine preventive maintenance costs by up to 50% [109]. The US Navy reported the following tangible benefits of the RCM: life cycle cost reductions of 15%, representing $1.7 billion; an availability increase of 17%; and a ship lifecycle extension of 8 years [25].
Figure 5-1: Equipment criticality determination (adapted from [10])
[Flowchart; the Yes/No branch routing is not recoverable from the extracted text.]
Preparatory steps:
1. Identify all equipment and record the data: item number, description, manufacturer’s details, local agent/supplier, technical details, cost price, and manuals and drawings.
2. Establish the main operating model.
3. Determine the equipment breakdown structure.
4. Establish the function of the equipment.
Decision questions (each with Yes and No branches):
- If the function was lost, would there be a safety/environmental consequence?
- Is there redundant capacity?
- Does the equipment have a direct impact on the system output?
- If this function was lost, would the operating model be fulfilled?
- If this function was partially lost, would the operating model be satisfied?
Outcomes: Critical equipment; Non-critical; Non-critical (Duty spare).
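The decision questions of Figure 5-1 can be sketched as a small classification routine. The sketch below is illustrative only: the order of the questions and the outcome attached to each Yes/No branch are assumptions, since the figure's routing is not reproduced in the text, and the names `EquipmentAnswers` and `classify` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EquipmentAnswers:
    """Yes/No answers to the five decision questions of Figure 5-1."""
    safety_or_environmental_consequence: bool
    redundant_capacity: bool
    direct_impact_on_output: bool
    model_fulfilled_if_function_lost: bool
    model_satisfied_if_function_partially_lost: bool


def classify(a: EquipmentAnswers) -> str:
    """Return a criticality class for one item of equipment.

    ASSUMPTION: the branch order and outcomes below are one plausible
    reading of the flowchart, not a reproduction of it.
    """
    if a.safety_or_environmental_consequence:
        return "Critical equipment"
    if a.redundant_capacity:
        return "Non-critical (Duty spare)"
    if not a.direct_impact_on_output:
        return "Non-critical"
    if not a.model_fulfilled_if_function_lost:
        return "Critical equipment"
    if not a.model_satisfied_if_function_partially_lost:
        return "Critical equipment"
    return "Non-critical"


# Hypothetical item: no safety consequence, no redundancy, direct impact on
# output, and the operating model is not fulfilled if the function is lost.
example = EquipmentAnswers(False, False, True, False, False)
print(classify(example))
```

Encoding the questions as explicit boolean fields keeps each answer auditable, which matters when the classification is later used to prioritize the RCM effort.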
The major flaws of the RCM are as follows: it lacks the prioritization needed for general industrial application [25]; and it is costly to implement and requires components of Total Productive Maintenance (TPM) to sustain its full capability [110]. Furthermore, it lacks the flexibility and full merits of probabilistic models [111]. Finally, it is unable to quantify the benefits of maintenance on system reliability and costs [112], [113].
Risk Based Inspection (RBI) may be incorporated in the RCM to fortify it [10], [64], thereby helping to select appropriate condition monitoring methods [57]. However, the RBI is able neither to quantify the costs of inspection and condition monitoring nor to indicate alternative risk treatment options [57]. The RCM’s fault root cause analysis usually comprises a Failure Mode and Effect Analysis (FMEA), which is used to analyze potential failure modes and their impacts, and a Failure Mode, Effect and Criticality Analysis (FMECA), which extends the FMEA to include measures to rectify the faults [114]. The major concerns about the FMECA are its tendency to exclude cascade failures [33] and its use of simple uptime or downtime data to compute risk indexes, which can affect the validity of the results [91]. For power transformers, a risk index is given as the product of a consequence factor and a failure probability [4], [76]. Sections 5.2.2, 5.2.3 and 5.2.4 critically examine challenges pertaining to the risk characterization, data requirements and reliability modelling, respectively.
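As a minimal illustration of the transformer risk index described above (consequence factor multiplied by failure probability), the following sketch ranks a small fleet. The transformer names, consequence scores and probabilities are hypothetical values chosen for illustration; they are not taken from [4] or [76].

```python
def risk_index(consequence_factor: float, failure_probability: float) -> float:
    """Risk index = consequence factor x failure probability."""
    if consequence_factor < 0.0:
        raise ValueError("consequence factor must be non-negative")
    if not 0.0 <= failure_probability <= 1.0:
        raise ValueError("failure probability must lie in [0, 1]")
    return consequence_factor * failure_probability


# Hypothetical fleet: name -> (consequence factor, annual failure probability)
fleet = {
    "T1": (8.0, 0.05),
    "T2": (3.0, 0.20),
    "T3": (9.0, 0.01),
}

# Rank the transformers from highest to lowest risk index.
ranked = sorted(fleet, key=lambda name: risk_index(*fleet[name]), reverse=True)
print(ranked)
```

In this made-up fleet, the unit with the highest consequence factor (T3) ranks last because its failure probability is low, which shows why neither factor alone is a sufficient basis for prioritization.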
5.2.2 Risk characterization
Physical AM centers on optimization of risks, cost and reliability [1]. The risk management process consists of the following seven stages that are presented in Figure 5-2 [10]: risk contextualization, risk identification, risk exploration, risk assessment, risk treatment, risk monitoring and review, and risk reporting. Risk characterization refers to a synthesis of the seven stages that yields a conclusion about the risk, describes the nature of the inherent and residual risks, and prompts a rethink of strategy or policy when the risk profile changes over time.
The success of the risk characterization process requires a comprehensive database of faults, failures, operations and maintenance, as in a surveyed breakdown strategy, whereby a fault and damage database is combined with SCADA, ERP software and GIS [2]. The cost implications and integration challenges tend to limit the application of such strategies in the power sector. Hence, asset managers tend to rely solely on the opinions of experts in the field to determine the probability of failure or the end of life. For power transformer risk management, sourcing expert opinions involves consultations with designers and chemists, along with rigorous inspection and an extensive testing procedure [76]. This is both time consuming and very costly. The foregoing scenario highlights the need to develop models that can eliminate some of the steps in the risk characterization process in order to reduce the time and costs.
Figure 5-2 outlines the key elements of an integrated risk management process, showing the contribution of the RCM in the process. The Top-down and Bottom-up techniques shown in the figure are tools used in the development of lifecycle management and maintenance plans for the assets [10].
Figure 5-2: A systems view of the RCM and risk management process (adapted from [9])
[Flowchart; the Yes/No branch routing is not recoverable from the extracted text.]
Phase 1, Risk contextualisation: set system boundaries and objectives; and establish functional synergies.
Phase 2, Risk identification: establish risks pertaining to the context and get the right people to manage the process.
Phase 3, Risk exploration: explore cause, consequence and impact.
Phase 4 (a), First risk assessment: determine inherent risks, their likelihood and impacts.
Phase 4 (b), Second risk assessment: assess residual risk in relation to existing controls in phase 5.
Phase 5, Risk treatment: preventive and control measures plus treatment options.
Phase 6, Risk monitoring and review: determine whether the risk profile changed with time.
Phase 7, Risk reporting and communication.
Decision point: Did measures work? (Yes/No)
Contributing tools and philosophies: Top-down, Bottom-up techniques; Metrics; Probabilistic and stochastic tools; RCM, FMECA.
Key: Risk management process flow; Contributions from tools and philosophies.
5.2.3 Data requirements
This section highlights key issues pertaining to the data requirements in the risk management process. The RCM is one of the proactive equipment management practices, with probabilistic inferences, that have characterized the current risk-based techniques and AM paradigms [2], [26].
Probabilistic concepts require large amounts of data, which are usually difficult to obtain [2], [90]. The validity of the risk evaluations by line managers, who normally conduct the risk analysis [55], can be adversely affected by the unavailability of data.
Parameters for probabilistic models include mean times to failure, inspection rates and probabilities of state transitions [5], [92]. ICT models are useful for capturing these parameters, but power utilities have applied the models inconsistently and in a fragmented way, and face challenges in integrating them into the data mining process [66]. OSA-CBM and MIMOSA initiated the integration of standard ICT protocols in condition monitoring and maintenance, but most organizations have not embraced their use [25], [66]. The IEEE standardized the protocols to incorporate condition monitoring transducers [25]. It also developed a standard for the FMECA and fault root cause analysis [4].
5.2.4 Reliability modelling
This section explores ways of integrating stochastic processes in the RCM in order to simplify its process of quantifying the effects of maintenance on reliability, and to augment its probabilistic capabilities. Models that quantify the effects of maintenance on reliability in power systems are few, hence the need for research to focus on this area [5]. Many electric power utilities install RCM programs as risk prioritization tools for critical systems and equipment that have experienced problems [25]. However, the RCM approach is heuristic, and its application requires judgment and experience; it can take a long time before enough data is collected for decision making purposes [90]. It is envisaged that Markov analysis can be used cost effectively (with a few data sets) to derive the mean time to first failure (MTTFF), to model reliability and to measure the effectiveness of strategies, provided component failure and repair rates are known [27]. In modelling reliability, Markov processes treat equipment failure and repair rates as constant, which corresponds to exponentially distributed times to failure and repair [5], [92]. The treatment of failure and repair rates as constant is the major criticism of the Markov approach [81]. However, the treatment simplifies the analytical process [115], [116]. Section 5.3 presents an analytical model that exploits the opportunities, and addresses the challenges that have been outlined in Section 5.2.
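To illustrate how an MTTFF follows from constant failure and repair rates, the sketch below solves a small continuous-time Markov chain. It is a generic textbook-style example and not the model of Section 5.3: the three states (good, deteriorated, failed), the rate names `lam01`, `lam12` and `mu10`, and all numerical values are assumptions made for illustration. The failed state is treated as absorbing, and the mean times to absorption m satisfy Q_T m = -1, where Q_T is the generator matrix restricted to the non-failed states.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-15:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def mttff(q_transient):
    """Mean time to first failure from each non-failed state.

    q_transient[i][j] (i != j) is the constant rate from state i to state j;
    q_transient[i][i] is minus the total exit rate of state i, including the
    rate into the absorbing failed state.
    """
    return solve_linear(q_transient, [-1.0] * len(q_transient))


# Hypothetical rates per year (assumed values, not from the source):
lam01 = 0.2  # good -> deteriorated
lam12 = 0.5  # deteriorated -> failed
mu10 = 2.0   # maintenance: deteriorated -> good

with_maintenance = mttff([[-lam01, lam01],
                          [mu10, -(lam12 + mu10)]])
without_maintenance = mttff([[-lam01, lam01],
                             [0.0, -lam12]])

print(with_maintenance[0], without_maintenance[0])
```

With these assumed rates, the MTTFF from the good state is about 27 years with maintenance and 7 years without it, in agreement with the closed form (lam01 + lam12 + mu10) / (lam01 * lam12) for this three-state chain. Comparing the two values is one simple way to express the effect of a maintenance strategy on reliability.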