
SAFETY ORGANIZATION 3

3.8 Risk Assessment Approaches

The most widely recognized element of safety organization has been the management of safety-related risks. Existing risk assessment methodologies in the aviation domain follow a typical framework: the identification of hazards, their screening according to a risk matrix, the quantification of risks, their prioritization, and finally the recommendation of risk mitigation measures or corrective solutions. A change management policy is also in place so that the effectiveness of changes and risk mitigation measures is monitored according to a formal plan (i.e., the risk monitoring stage).

SAFETY ORGANIZATION AND RISK MANAGEMENT 8 9

The ARMS methodology (EASA 2011) and the risk analysis tool (RAT) (Eurocontrol 2009) both comply with this typical framework of analysis.

However, many aviation practitioners and organizations have reported difficulties with the quantification and risk mitigation stages. The reasons include high requirements for expertise, greater demands on human and time resources, the need for access to historical data, and cost issues pertaining to the choice of risk counter-measures. In fact, a formal survey in safety-critical industries shows that risk assessment methods have been used only in design and modification projects and not during daily operation (Andersen and Mostue 2012). As a result, risk assessments may seldom be updated, and hence important changes in safety functions are not monitored on a continuous basis. To address such problems and make risk assessment a practical tool for safety practitioners, this section proposes several requirements derived from the literature review, the views of several safety practitioners, and the experience of the authors. To understand the potential problems and areas of improvement in existing risk assessment methods, a basic background of a typical risk assessment framework is provided first.

3.8.1 Systemic Risk Assessment

Risk assessment is a complex process that requires a team of experts to identify hazards, collect historical data about component failures, construct risk models, assess the influence of workplace and organizational factors, and design risk counter-measures. Because of its large demands on human resources and data, risk assessment is usually carried out only when new designs or modifications are introduced into the system. For this reason, it is usually called systemic risk assessment, to distinguish it from other types of operational risk assessment carried out daily by the practitioners themselves.

Eurocontrol has developed an integrated risk picture model (Eurocontrol 2006) that is used to estimate frequencies of several accident types (e.g. taxiway collision, midair collision, and so on). For each accident type, a separate causal model is constructed using fault trees that represent precursors to accidents and failures of barriers.

Precursors are unsafe conditions that can lead to more adverse states if the barriers cannot manage them successfully. The precursors are arranged in order of severity, from conflicts to separation losses, air proximity events, and accidents. Barriers (or safeguards) include equipment, safety nets, procedures, and processes that can be managed so as to prevent precursors from progressing in severity to accidents. The fault trees allow quantification using historical incident experience and judgmental modifications to fit specific organizations. An influence model is used to show the effects of workplace factors, environmental factors, and safety processes that contribute to accidents. The structure of the influence model is based on a system model that defines the major system elements, represents the concept of operation, and identifies interdependencies due to common resources. As a minimum requirement, the system model shows the necessary inputs and outputs for each element, the required resources, and the safety constraints.

Eurocontrol has proposed the SADT model (Marca and McGowan 1987) as a basis for describing the system; other models include STAMP (Leveson 2004, 2012) and SCOPE (McDonald et al. 2011). The output of the influence model is a set of modification factors applied to the frequencies and probabilities of the base events of the fault trees. In general, the initial system model should help analysts develop the risk influence model and tailor the risk assessment to the specific problem at hand. For quantification purposes, this approach can result in exponential growth of the tree size, because managerial influences should also be seen as common causes of individual causal factors. Fault trees are based on the Swiss cheese model (Reason 1997), where causal factors are represented as holes in a series of organizational processes or barriers. A similar diagram in Figure 3.6 shows how barrier failures create unsafe states of increasing severity, from flight conflicts to midair collisions. Failures of barriers include not only technical factors but also human failures to respond promptly and prevent the next-stage precursor.
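As a rough illustration of how modification factors and the barrier sequence of Figure 3.6 might be combined numerically, the following Python sketch scales the base failure probability of each barrier by a modification factor and propagates a precursor frequency through successive barriers. It is not the IRP implementation: the pairing of each barrier with one transition between unsafe states, the assumption that barrier failures are independent, and every numerical value are hypothetical and serve only to show the arithmetic.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Barrier:
    name: str
    base_failure_prob: float    # hypothetical base probability that the barrier fails
    modification_factor: float  # hypothetical factor supplied by an influence model

    def failure_prob(self) -> float:
        # Scale the base value and cap it at 1.0 so it remains a probability.
        return min(1.0, self.base_failure_prob * self.modification_factor)


def unsafe_state_frequencies(
    initial_frequency: float, barriers: List[Barrier], states: List[str]
) -> Dict[str, float]:
    """Propagate a precursor frequency through successive barriers.

    Assumes (for illustration only) that barrier failures are independent and
    that the failure of barrier i moves the situation from state i to state i+1.
    """
    if len(states) != len(barriers) + 1:
        raise ValueError("expected exactly one more state than barriers")
    frequencies = {states[0]: initial_frequency}
    current = initial_frequency
    for barrier, next_state in zip(barriers, states[1:]):
        current *= barrier.failure_prob()
        frequencies[next_state] = current
    return frequencies


# State and barrier names follow Figure 3.6; all numbers are made up.
states = [
    "strategic conflict", "tactical conflict", "loss of separation",
    "air proximity event", "imminent collision", "midair collision",
]
barriers = [
    Barrier("strategic de-confliction (ATFCM measures)", 0.10, 1.0),
    Barrier("tactical traffic separation (controllers)", 0.01, 1.5),
    Barrier("conflict warnings (STCA, other controllers)", 0.10, 1.0),
    Barrier("collision avoidance (TCAS)", 0.05, 2.0),
    Barrier("visual maneuvering", 0.20, 1.0),
]

for state, freq in unsafe_state_frequencies(1.0, barriers, states).items():
    print(f"{state}: {freq:.3e} per strategic conflict")
```

Because managerial influences act as common causes across barriers, a simple product of this kind is likely to understate the frequency of the more severe states; it is shown only to make the role of the modification factors concrete.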

Accidents may arise in innumerable ways; hence, it is not practical to identify every possible accident scenario. In building a causal model, it is only necessary to identify scenarios that involve combinations of barrier failures or scenarios that involve practitioners bypassing certain barriers. For convenience, conflicts due to pilot noncompliance and unauthorized penetration of controlled airspace (e.g., military flights) are referred to as unplannable conflicts, since they involve similar barriers. Tactical conflicts that could have been prevented by strategic traffic planning and synchronization are described as plannable conflicts.

An unplanned conflict can arise from traffic synchronization problems managed by air traffic flow controllers. The executive controller is responsible for tactical separation, which involves monitoring radar information, detecting conflicts, resolving conflicts by tactical planning, and liaising with the coordinating controller. As a result of tactical separation, flight crews should receive separation instructions in a timely fashion and should respond with a proper aircraft maneuver. Inadequate communications with pilots may take several forms: incorrect or late transmission of instructions, loss of communication, and inadequate pilot feedback. The presence of an unplanned conflict can be estimated using another fault tree that combines human errors (e.g., controller fails to use flight progress strips correctly), technical failures (e.g., malfunction of the medium-term conflict detection [MTCD] system), and coordination failures between sectors.

[Figure 3.6 labels. Barriers: strategic de-confliction (ATFCM measures); tactical traffic separation (controllers); conflict warnings (STCA, other controllers); collision avoidance (TCAS); visual maneuvering. Unsafe states: strategic conflict, tactical conflict, loss of separation, air proximity event, imminent collision, midair collision. Other label: change conflict geometry.]

Figure 3.6 Barrier failures create precursors in increasing severity from conflicts to accidents.

Fault tree analysis (FTA) is a conventional method of risk analysis that has been applied successfully for many years in the analysis of mechanical systems and process control systems that involve routine human operations. Fault trees consider failures of components and human actions that have distinct categories of operation (e.g., failed/operating states or wrong/correct actions). In the integrated risk picture (IRP) model, causes that cannot be split into simple operating states are represented through the influence model.
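The binary treatment of events in a fault tree can be sketched in a few lines of Python. The example below is a minimal illustration, not the IRP model: the event names echo the examples of human, technical, and coordination failures mentioned in this section, the probabilities are invented, and the basic events are assumed to be statistically independent.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class BasicEvent:
    name: str
    probability: float  # probability of the "failed" (or "wrong action") state


@dataclass
class Gate:
    name: str
    kind: str  # "AND" or "OR"
    inputs: List[Union["Gate", BasicEvent]] = field(default_factory=list)


def probability(node: Union[Gate, BasicEvent]) -> float:
    """Probability of a node, assuming all basic events are independent."""
    if isinstance(node, BasicEvent):
        return node.probability
    input_probs = [probability(child) for child in node.inputs]
    if node.kind == "AND":
        # All inputs must occur.
        result = 1.0
        for p in input_probs:
            result *= p
        return result
    if node.kind == "OR":
        # At least one input occurs: complement of "none occurs".
        none_occur = 1.0
        for p in input_probs:
            none_occur *= 1.0 - p
        return 1.0 - none_occur
    raise ValueError(f"unknown gate kind: {node.kind}")


# Hypothetical probabilities, chosen only to exercise the gates.
top = Gate(
    "unplanned conflict present",
    "OR",
    [
        BasicEvent("controller fails to use flight progress strips correctly", 0.002),
        BasicEvent("MTCD malfunction", 0.001),
        BasicEvent("coordination failure between sectors", 0.003),
    ],
)
print(f"P(top event) = {probability(top):.6f}")
```

Each basic event has exactly two states, which is why causes that cannot be reduced to such states fall outside the tree.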

However, fault trees cannot model accidents in which no component has failed: every component may meet its design specification, yet the design may not act as a barrier to certain events (Leveson 2012). Fault trees also cannot model accidents that occur due to interaction problems between components or actions that have not failed individually. Although the IRP model has considered interdependences between components in the influence model, the structured analysis and design technique (SADT) used to model interdependences has not been very effective.

In general, there is no easy way to quantify or verify the probabilities used in fault trees because human errors depend on the work context and the dynamics of the environment. For instance, the probability of failure to detect a planned conflict has been estimated at 6 × 10^-3 per flight, which is the nominal failure probability of a rule-based task. However, many scenarios with different dynamics and work contexts could lead to undetected conflicts. The authors have recorded at least five scenarios leading to failures of detection (see Table 3.2). These work scenarios indicate how work context and work practices can influence human detection.
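To give a feel for what a nominal value of this size implies, the short Python sketch below converts a per-flight failure probability into the expected number of missed detections and the chance of at least one miss over a number of flights. It assumes that flights are independent and share the same probability, which is exactly the simplification questioned above; the traffic volume and the context multiplier are hypothetical.

```python
NOMINAL_P = 6e-3  # nominal per-flight probability quoted in the text


def expected_misses(p_per_flight: float, flights: int) -> float:
    """Expected number of missed detections over a number of flights."""
    return p_per_flight * flights


def prob_at_least_one_miss(p_per_flight: float, flights: int) -> float:
    """Chance of one or more misses, assuming independent, identical flights."""
    return 1.0 - (1.0 - p_per_flight) ** flights


def adjusted_probability(nominal_p: float, context_multiplier: float) -> float:
    """Scale the nominal value by a (hypothetical) work-context multiplier."""
    return min(1.0, nominal_p * context_multiplier)


flights = 500  # hypothetical traffic volume
for label, multiplier in [("nominal", 1.0), ("hypothetical adverse context", 3.0)]:
    p = adjusted_probability(NOMINAL_P, multiplier)
    print(
        f"{label}: p = {p:.4f}, "
        f"expected misses = {expected_misses(p, flights):.2f}, "
        f"P(at least one) = {prob_at_least_one_miss(p, flights):.3f}"
    )
```

A single multiplier is itself a generic adjustment; it does not capture the distinct dynamics of the scenarios listed in Table 3.2.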

The IRP model addresses many work influences in a generic manner in order to generate “modification estimates” without any reference to the particular practices and work contexts. The IRP model considers the following generic influences on performance:


• Operating environment (i.e., traffic density, airspace design, terrain, weather, visibility conditions, and so on) and quality of equipment

• Workplace factors (i.e., man-machine interface, human reliability, job aids) and organizational factors (i.e., procedures, training, resource allocation, and teamwork)

• Safety management system (i.e., safety policy, safety communication, safety assurance, and safety promotion)

Table 3.2 Scenarios Making “Unplanned” Conflicts Difficult to Detect for Controllers

SCENARIO 1: Undetected conflicts in low traffic parts of radar screen
DESCRIPTION: Scanning strategies enable controllers to monitor traffic, especially when workload is high, and to identify conflicts at early stages so that sufficient time is allowed for conflict resolution. On some occasions, the traffic pattern is uneven, so that heavy traffic appears in certain parts of the screen while other parts display less traffic. This uneven traffic pattern may increase the chances of unrecognized conflicts in the low traffic areas when controllers have devoted their attention to the heavy traffic area.

SCENARIO 2: Undetected conflicts in parts of the screen that have been filtered out
DESCRIPTION: Controllers may choose to increase the screen scale in certain traffic areas where interesting events are presented and filter out traffic in areas where conflicts are least expected (e.g., traffic below 3,000 ft. may be filtered out when visibility is good and traffic is low). The risk is that filtered-out traffic areas may be left unattended too long, until a short-term conflict alert is displayed in the system.

SCENARIO 3: Undetected conflicts due to early transfer of aircraft
DESCRIPTION: A conflict may remain undetected when controllers agree to accept early an aircraft transferred from an adjacent sector but are late in treating the aircraft as within their area of responsibility. This can occur, for instance, when an early-transfer aircraft is left in gray color, indicating that it is not yet under their control. Leaving the change of color until later may result in late recognition that the aircraft has already been accepted from the adjacent sector.

SCENARIO 4: Converging traffic that comes into conflict during transition to another sector
DESCRIPTION: Traffic may come into conflict while aircraft are in transition from one sector to another with different characteristics. For instance, two aircraft descending at different speeds may be safely separated in one sector but may come into conflict as they cross sectors; this is more likely to occur when the next sector has higher separation minima.

SCENARIO 5: Differences in aircraft performance may result in unexpected conflicts
DESCRIPTION: Projecting aircraft trajectories into the future requires good knowledge of performance characteristics. Unknown to the controller, an aircraft may be heavily loaded and climbing at a slower than expected rate; this can result in conflict with lighter aircraft following behind the heavy aircraft in the climb.


Another challenge in risk analysis concerns the modeling of interactions between human activities. It is often the case that separation planning is seen as a sequential phase that follows the identification of traffic conflicts. In fault trees, inadequate tactical separation can be due either to undetected conflicts or to inadequate separation planning. Failures in detection and planning are combined through an OR gate to estimate the overall tactical separation failure. In operational practice, however, the two human activities proceed concurrently, since controllers may detect several conflicts with different dynamics and mentally play out a couple of candidate deconflicting strategies. This may result in a decision to intervene late, which makes an external observer believe that planning follows detection. On other occasions, however, an early intervention can be made to avoid imminent conflicts in the near future, but this early resolution of converging traffic cannot be captured by external observers.

Separation planning seems to have two cognitive elements: (1) micromanaging, where imminent conflicts are managed, and (2) anticipatory planning with a longer horizon of attention, where traffic projections are made in the long term and future points are decided for closer traffic monitoring. In micromanaging, controllers resolve an imminent conflict but also try to avoid side effects in other areas. In anticipatory planning, controllers try to stay ahead of traffic and maintain awareness of the traffic dynamics. Fault trees usually consider failures at micromanaging, such as delayed resolution or unsuccessful resolution.

In addition, failures at managing traffic are not static events, as fault trees assume. Micromanaging conflicts is a dynamic process that may start with a well-thought-out plan that goes astray because of unexpected events and surprises. For instance, separation planning may be tightly coupled, leaving little scope for crew diversions or unexpected events. A tight traffic pattern may not be recognized by crews, who may wish to change their flight from a stepwise descent to a continuous one, hence giving rise to other conflicts later on. In a similar sense, a correct resolution plan may be interrupted by other crews blocking the radio frequency for too long (e.g., taking a long time over initial contact formalities). That is, plans may start well but remain incomplete because of interruptions.

In other cases, separation planning may be recorded officially as unsuccessful yet cause no harm. For instance, a resolution plan may result in two aircraft violating the separation minima, but the conflict geometry may be such that the conflict is resolved soon after it is recorded by the system; in this case, there may be no complaints from either the crews or the safety managers. This may happen because avoiding a temporary conflict may require tremendous effort, while tolerating it may improve traffic separation in the near future.

Anticipatory planning is an important controller strategy, but it is not considered in fault trees. Inadequate anticipation may lead to a tight traffic pattern that increases the chances of conflict and leaves little scope for recovering from unsuccessful resolutions or surprises later on. Anticipation involves staying ahead of traffic (i.e., having a longer attention horizon) and creating open traffic patterns that allow crews and controllers to cope with unexpected events. Anticipatory planning is invisible to fault tree analysts because its results are seen later as undetected conflicts or unsuccessful resolutions. Many analysts consider anticipatory planning an activity for the coordinating controllers only, but recent research conducted by the authors has shown that executive controllers also actively try to stay ahead of traffic.

3.8.2 Operational Risk Management

Operational risk management (ORM) involves a fast risk analysis in which a risk index is calculated and compared to a target risk level. If the estimated risk level is higher than the target, the practitioners can propose risk mitigation measures to manage the risk. ORM can be applied by the practitioners themselves on a daily basis. Several ORM methods have been proposed in the aviation domain, including threat and error management (TEM), failure likelihood indexes, and so on. In 2009, Eurocontrol developed the risk analysis tool (RAT) as a basis for rating the severity of near misses and traffic incidents, but RAT can also be used to predict and assess risks in future scenarios. Specifically, the European Union requires all air navigation service providers (ANSPs) to use RAT to rate the severity of recorded ATM safety occurrences. The general philosophy of RAT is similar to that of the ARMS method and can be summarized as follows:

$$R = S \times C \times R$$


Where:

• R (left-hand side): Risk is the product of Severity, Controllability, and Repeatability.

• S: Severity of the hazardous event refers to task, traffic, or environmental consequences.

• C: Controllability refers to the actions of practitioners to prevent, resolve, or control and recover from the hazardous event.

• R (right-hand side): Repeatability refers to the likelihood that a hazardous event may occur in the future, which is a function of workplace conditions and organizational factors.
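The comparison of a risk index with a target level can be sketched as follows. The code is an illustration under stated assumptions rather than the RAT scoring scheme: the 1 to 5 scales, the convention that a higher controllability score means less control over the event, and the target value are all hypothetical.

```python
from dataclasses import dataclass

SCALE_MIN, SCALE_MAX = 1, 5  # hypothetical rating scale, not taken from RAT


@dataclass(frozen=True)
class RiskScore:
    severity: int         # S: consequences of the hazardous event
    controllability: int  # C: higher means LESS controllable (assumed convention)
    repeatability: int    # likelihood that the event may occur again

    def __post_init__(self) -> None:
        for name in ("severity", "controllability", "repeatability"):
            value = getattr(self, name)
            if not SCALE_MIN <= value <= SCALE_MAX:
                raise ValueError(f"{name} must be in [{SCALE_MIN}, {SCALE_MAX}]")

    def risk_index(self) -> int:
        # Risk as the product of severity, controllability, and repeatability.
        return self.severity * self.controllability * self.repeatability


def needs_mitigation(score: RiskScore, target: int) -> bool:
    """True when the estimated risk index is higher than the target level."""
    return score.risk_index() > target


occurrence = RiskScore(severity=3, controllability=2, repeatability=4)
target_level = 20  # hypothetical target risk level
print(occurrence.risk_index(), needs_mitigation(occurrence, target_level))
```

A multiplicative index of this kind is quick to compute, which suits daily use, but it compresses three distinct judgments into one number; keeping the individual scores alongside the index preserves that detail.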

Although RAT specifies a numerical calculation of risk factors, its causal model can be used as general guidance for designing a data structure that is useful in the context of several ORM methods. In combination with bow ties, RAT can be used to generate a rich data structure that is useful for producing fault trees and event trees. A bow tie is a widely used risk analysis method that identifies hazards, their causal factors, and their likely consequences. The left side of a bow tie examines possible safety barriers that could prevent the occurrence of a hazard, while its right side examines barriers that provide hazard protection and minimize its adverse consequences. An actual example of a bow-tie analysis of a low-level wind shear (LLWS) hazard is shown in Figures 3.7 and 3.8 (for a detailed description of LLWS phenomena, see Chapter 5).

The data structure can be rich enough to enable analysts to build their risk models regardless of the specifics of their preferred techniques. The argument for designing a rich data structure driven by risk models appeals to many commercial aviation sectors, as they remain free to choose the risk model they feel most comfortable with.

It may be seen that this sort of data structure (Table 3.3) provides a complete description of the risk items required in conventional risk assessment but little information about the connections or gates that create the critical paths to hazardous events. In this sense, additional processing of the data may be required in order to create a risk model such as a fault tree. Although the risk data structure does not convey all the details available in risk models, it is not specific to a risk model and has wide applicability.
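A data structure of this kind might look like the Python sketch below. The field names are assumptions chosen to mirror the bow-tie vocabulary used above, not the columns of Table 3.3, and the example entries are placeholders rather than results of an actual analysis. Note that the record stores risk items but no gates; building a fault tree would require the analyst to add that logic separately.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SafetyBarrier:
    description: str
    side: str  # "prevention" (left of the bow tie) or "protection" (right)

    def __post_init__(self) -> None:
        if self.side not in ("prevention", "protection"):
            raise ValueError("side must be 'prevention' or 'protection'")


@dataclass
class RiskRecord:
    """Model-neutral record of risk items for one hazard (hypothetical layout)."""
    hazard: str
    causal_factors: List[str] = field(default_factory=list)
    consequences: List[str] = field(default_factory=list)
    barriers: List[SafetyBarrier] = field(default_factory=list)

    def barriers_on(self, side: str) -> List[SafetyBarrier]:
        return [b for b in self.barriers if b.side == side]

    def candidate_basic_events(self) -> List[str]:
        """Items an analyst could use as basic events; gate logic is not stored."""
        barrier_failures = [
            f"failure of: {b.description}" for b in self.barriers_on("prevention")
        ]
        return self.causal_factors + barrier_failures


# Placeholder content for illustration only.
record = RiskRecord(
    hazard="low-level wind shear encounter",
    causal_factors=["placeholder causal factor A", "placeholder causal factor B"],
    consequences=["placeholder consequence"],
    barriers=[
        SafetyBarrier("placeholder preventive barrier", "prevention"),
        SafetyBarrier("placeholder protective barrier", "protection"),
    ],
)
print(record.candidate_basic_events())
```

Keeping the record free of gates is what makes it reusable across fault trees, event trees, and bow ties, at the cost of the extra processing step described above.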


From the document: Cognitive Engineering and Safety Organization in Air Traffic Management (pages 139-148)