Qualitative and quantitative evaluation of human error in risk assessment
8.6 Representation of the failure model
If the results of the qualitative analysis are to be used as a starting point for quantification, they need to be represented in an appropriate form. As discussed, the form of representation can be a fault tree, as shown in Figure 8.2, or an event tree (Figure 8.3). The event tree has traditionally been used to model simple tasks at the level of individual task steps, for example in the THERP (Technique for Human Error Rate Prediction) method for human reliability assessment [4] (see Section 8.7.3). It is most appropriate for sequences of task steps where few side effects are likely to occur as a result of errors, or when the likelihood of error at each step of the sequence is independent of previous steps.
If we consider the compressor filter case study described previously, the PHEA analysis shown in Table 8.3 indicates that the following outcomes (hazards) are possible for an unplanned compressor trip:
• loss of production;
• possible knock-on effects for downstream process;
• compressor life reduced;
• filter not changed;
• oil not filtered will eventually damage compressor.
The overall fault tree model of the errors that could give rise to a compressor trip is shown in Figure 8.8. (For the purposes of illustration, technical causes of these failures are not included in the representation.)
Figure 8.8 Fault tree for compressor trip
[Figure structure: the top event, Compressor trip, is an OR combination of three intermediate events:
• Pressure surge (OR gate): 2.1 Fill valve opened too quickly (action too fast); 2.2 Vent valve opened too quickly (action too fast); 2.5 Fill valve on equalisation line left open during changeover.
• Air introduced into system (OR gate): Off-line filter not primed, which is itself an OR combination of 1. On-line filter selected instead of off-line (wrong selection) and 2. Off-line filter not primed prior to changeover (action omitted); 2.3 Not enough time allowed to purge air bubbles from the system (action too short).
• System blockage (AND gate): Blockage in line; 2.3 Flow through sight glass not verified (check not done).]
Table 8.4 Probabilities for each failure mode
Failure mode                                                                        Probability
2.1 Fill valve opened too quickly (action too fast)                                 P1
2.2 Vent valve opened too quickly (action too fast)                                 P2
2.5 Fill valve on equalisation line left open (action omitted)                      P3
1. On-line filter selected instead of off-line (right action, wrong object)         P4
2. Off-line filter not primed prior to changeover (action omitted)                  P5
2.3 Not enough time allowed to purge air bubbles from the system (action too short) P6
Blockage in line                                                                    P7
2.3 Flow through sight glass not verified (check omitted)                           P8
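Figure 8.9 Calculation of probability of compressor trip
[Figure structure: the fault tree of Figure 8.8 annotated with probabilities P1 to P8. Gate outputs: P1 + P2 + P3 (OR); P4 + P5 (OR), feeding (P4 + P5) + P6 (OR); P7 × P8 (AND). The top OR gate combines these three outputs.]
Probability of compressor trip = P1 + P2 + P3 + P4 + P5 + P6 + (P7 × P8)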
The overall probability of operator failure is obtained by multiplying probabilities at AND gates and summing them at OR gates, using the failure mode probabilities listed in Table 8.4. (Note that summing at OR gates is an approximation that holds for small probabilities.)
Therefore, the probability of a compressor trip is given by the calculation shown in Figure 8.9.
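The gate arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original analysis: the function names are hypothetical, the failure modes are assumed to be independent, and the numerical values are placeholders chosen only to show the size of the approximation error. The exact OR-gate formula, 1 minus the product of the complements, is the standard result for independent events and is included for comparison with the simple sum.

```python
from math import prod


def or_gate_approx(probs):
    """Rare-event approximation for an OR gate: simple sum of the inputs."""
    return sum(probs)


def or_gate_exact(probs):
    """Exact OR of independent events: 1 minus the product of (1 - p)."""
    return 1.0 - prod(1.0 - p for p in probs)


def and_gate(probs):
    """AND of independent events: product of the input probabilities."""
    return prod(probs)


def compressor_trip(p, or_gate):
    """Evaluate the gate structure for a dict p with keys 'P1'..'P8'."""
    pressure_surge = or_gate([p["P1"], p["P2"], p["P3"]])
    filter_not_primed = or_gate([p["P4"], p["P5"]])
    air_introduced = or_gate([filter_not_primed, p["P6"]])
    system_blockage = and_gate([p["P7"], p["P8"]])
    return or_gate([pressure_surge, air_introduced, system_blockage])


# Hypothetical values for illustration only; they are not taken from the text.
example = {f"P{i}": 0.01 for i in range(1, 9)}
approx = compressor_trip(example, or_gate_approx)
exact = compressor_trip(example, or_gate_exact)
print(f"approximate: {approx:.6f}  exact: {exact:.6f}")
```

With small input probabilities the two results stay close, and the summed value is always the larger of the two, so the approximation errs on the conservative side. As the inputs grow, the simple sum can exceed 1 and the exact form should be preferred.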
Figure 8.10 is an event tree representation of operator actions involved in an offshore emergency shutdown scenario [5]. This scenario represents a situation where a gas release could occur in a particular module outside the control room. This release will trigger an alarm in the control room, which has to be detected by the operator, who must then respond by operating the emergency shutdown (ESD). All of these actions have to be completed within 20 minutes, as otherwise a dangerously flammable gas concentration will build up with the risk of an explosion. The scenario also includes the situation where only a partial ESD occurs. This situation could be detected either by the control room operator or by his supervisor. In this case, the gas flow rate will be lower, but emergency action has to be taken within two hours in order to prevent gas building up to flammable levels. Because the ESD has not functioned correctly, the control room operator or supervisor has to identify the malfunctioning ESD valves, and then send an outside operator to the appropriate equipment room to close the valves manually.
Figure 8.10 Operator action tree for ESD failure scenario [5]
[Figure structure: the initiating event is "S1 begins ESD demand". The event headings along the top of the tree are:
• CCR operator initiates ESD within 20 minutes;
• Supervisor initiates ESD within same 20 minutes;
• Operator detects only partial ESD has occurred (within 2 hours);
• Supervisor detects only partial ESD has occurred (within 2 hours);
• CCR operator identifies correct equipment room to outside operator;
• Outside operator identifies failed activator and communicates these to CCR operator;
• CCR operator identifies manual valves and tells outside operator;
• Outside operator moves valves to correct position within the same 2 hours.
Branches are labelled S 1.1 to S 1.8 (success) and F 1.1 to F 1.8 (failure). The end states are SUCCESS and F1 to F6.]
It can be seen that the development of an appropriate representation even for a simple scenario is not trivial. This type of event tree is called an Operator Action Event Tree (OAET) because it specifically addresses the sequence of actions required by some initiating event. Each branch in the tree represents success (the upper branch, S) or failure (the lower branch, F) to achieve the required human actions described along the top of the diagram. The probability of each failure state to the far right of the diagram is calculated as the product of the error and/or success probabilities at each node along the branches that lead to that state. As described in Section 8.2, the overall probability of failure is given by summing the probabilities of all the resulting failure states. The dotted lines indicate recovery paths from earlier failures. In numerical terms, the probability of each failure state is given by the following expressions (where SP is the success probability and HEP the human error probability at each node):
F1 = [SP 1.1 + HEP 1.1 × SP 1.2] × SP 1.3 × SP 1.5 × SP 1.6 × SP 1.7 × HEP 1.8
F2 = [SP 1.1 + HEP 1.1 × SP 1.2] × SP 1.3 × SP 1.5 × SP 1.6 × HEP 1.7
F3 = [SP 1.1 + HEP 1.1 × SP 1.2] × SP 1.3 × SP 1.5 × HEP 1.6
F4 = [SP 1.1 + HEP 1.1 × SP 1.2] × SP 1.3 × HEP 1.5
F5 = [SP 1.1 + HEP 1.1 × SP 1.2] × HEP 1.3 × HEP 1.4
F6 = HEP 1.1 × HEP 1.2
These mathematical expressions are somewhat more complex for this situation than for the event tree considered earlier in Figure 8.3, since they need to take into account the fact that recovery is modelled. This means that some paths in the event tree are traversed twice, once directly and once via the alternative route when recovery occurs following a failure. Since each of these routes is essentially an ‘OR’ combination of probabilities, they have to be summed together to produce the correct value for the overall probability of that particular train of events that could lead to the failure mode (i.e. F1, F2, etc.). Total failure probability PT is given by:
PT = F1 + F2 + F3 + F4 + F5 + F6
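The expressions for F1 to F6 and PT can be evaluated mechanically once a human error probability is assigned to each node. The following Python sketch transcribes the expressions exactly as printed above. It rests on assumptions that are not in the original text: the function name is hypothetical, the success probability at each node is taken to be SP = 1 - HEP, and the HEP values are placeholders for illustration only.

```python
def oaet_failure_states(hep):
    """Return F1..F6 for a dict of HEPs keyed by node label '1.1'..'1.8'.

    Assumption: each node is a binary branch, so SP = 1 - HEP.
    """
    sp = {node: 1.0 - p for node, p in hep.items()}

    # Bracketed term shared by F1..F5: success at node 1.1, or failure at
    # node 1.1 followed by success at node 1.2 (the recovery route).
    esd = sp["1.1"] + hep["1.1"] * sp["1.2"]

    return {
        "F1": esd * sp["1.3"] * sp["1.5"] * sp["1.6"] * sp["1.7"] * hep["1.8"],
        "F2": esd * sp["1.3"] * sp["1.5"] * sp["1.6"] * hep["1.7"],
        "F3": esd * sp["1.3"] * sp["1.5"] * hep["1.6"],
        "F4": esd * sp["1.3"] * hep["1.5"],
        "F5": esd * hep["1.3"] * hep["1.4"],
        "F6": hep["1.1"] * hep["1.2"],
    }


# Hypothetical HEPs for illustration only; they are not taken from the text.
hep = {f"1.{i}": 0.01 for i in range(1, 9)}
states = oaet_failure_states(hep)
total = sum(states.values())  # PT = F1 + F2 + F3 + F4 + F5 + F6

for name, value in states.items():
    print(f"{name} = {value:.3e}")
print(f"PT = {total:.3e}")
```

Computing the shared bracketed term once keeps the code aligned with the printed expressions and makes it easy to see which failure states depend on the recovery route at nodes 1.1 and 1.2.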