SAFETY ORGANIZATION 3
3.6 Challenges to Safety Management
8 0 COGNITIVE ENGINEERING AND SAFETY ORGANIZATION
context (feedback control or coordination by mutual adjustment).
Two-way communication channels are used to exchange formal and informal information between practitioners or between organizational levels. External communication of the organization with regulatory institutions and public interest groups could also be considered in order to examine the organizational interface with the environment. In addition, coordination can be viewed at the management and supervisory levels required to synchronize management of change and organizational reforms.
SAFETY ORGANIZATION AND RISK MANAGEMENT 81
competitive challenges and new business objectives. A discussion of the role of organizational policy and intelligence on safety appears in Chapters 11 and 12.
Table 3.1 Challenges to Safety Management Addressed in the Book Chapters
SMS ELEMENTS | CHALLENGES | CHAPTERS
Safety Policy
Management commitment and responsibility; safety objectives and planning | Common picture of risks; the business case for safety; change initiatives; intelligence on how to change the system | Organizational models (Chapters 11, 12); New challenges in ATM (Chapter 10)
Safety accountabilities; appointment of safety practitioners; communication of safety policies | Safety role structures; direct/indirect communication channels; work procedures and safety compliance | The ATM system (Chapters 1, 2)
Emergency response plan | Change of authority in abnormal situations; competence in handling abnormal events | Performance models in abnormal situations (Chapters 4, 5, 14)
Risk Management
Hazard identification | System modeling (e.g., time lags, feedback loops, nonlinear effects) for hazard identification | System modeling (Chapters 11, 12)
System risk assessment; risk acceptance criteria | Assessment of safety barriers; work as done vs. work as planned; systemic vs. operational risk assessment; safety as part of everyday activities | Factors affecting ATM (Chapter 2); Error modeling and work practices (Chapters 6, 7)
Safety Assurance
Safety performance | Performance management and safety indicators; lagging vs. leading indicators | Organizational models (Chapters 11, 12, 13, 14)
Continuous improvement | Safety indicators drifting slowly away from safety standards; recording and prioritizing risk information |
Internal safety investigation | Risk information from the operations room; recording of active and latent failures; early assessments of incidents |
Change management | Assessing effectiveness of interventions and change | Managing workload and complexity (Chapter 9); New challenges in ATM (Chapter 10)
Safety Training and Promotion
Training and education | Training needs analysis and job competence; job design and autonomy | Refresher training (Chapter 8)
Safety communication | Hindrances in risk communication; capturing and communicating efficient work practices | Organizational models (Chapters 11, 12); work practices (Chapter 7)
The tactical management of safety refers to the safety roles of managers and employees, the assignment of key roles in the organization, the communication of safety policies and rules, and the processing of operational feedback. Although the SMS manual specifies the safety processes of tactical management, several challenges arise in the reality of aviation systems, where multiple stakeholders may be involved in the running of everyday business (e.g., ANSPs, airlines, airports, maintenance, and so on). Local optimization by individual stakeholders may create side effects for others, which could create latent conditions for system failure. One challenge in managing system safety concerns the development of a common risk picture where decision trade-offs are made by examining both internal and external threats to all stakeholders.
Chapters 1 and 2 on the ATM system go beyond the immediate needs of air traffic control and provide a succinct description of the interaction between stakeholders.
Under the rubric of safety policy also comes the management of abnormal situations and emergencies. The SMS manual specifies the need for a proactive approach to emergency management where hazards are identified in advance and risks are mitigated with technological supports or additional safety training. The conduct of regular emergency exercises is very important for maintaining organizational readiness and practitioners’ skills in managing systems under time pressure. At the organizational level, a major challenge in emergency management concerns the adaptation of authority roles from normal everyday situations. Since unexpected and abnormal situations differ from everyday situations in their event dynamics, time constraints, task allocations, and repercussions, the organizational mode of control may have to be adapted accordingly. Whereas a feed-forward mode of control and a hierarchical structure may be efficient for normal operations, the changing demands of an emergency may require a greater reliance on feedback control and a flatter structure where operational staff obtain a higher degree of autonomy (Kontogiannis 2010a,b). In this sense, emergency management goes beyond risk analysis and requires a better understanding of organizational and situational demands (Chapter 2), knowledge of taskwork and teamwork adaptations (Chapters 4 and 5), as well as a consideration of alternative operating modes (Chapter 14).
3.6.2 Risk Management
Risk management includes the identification of hazards, the calculation of their risk potential, and the design of risk mitigation measures. Effective risk management requires not only the collection of historical data on system operation but also the development of models that describe “how-the-system-works” and “how-the-operations-function” to achieve safety. In this sense, risk analysts have been using several models of system functioning (e.g., Functional Block Diagram, Structured Analysis and Design Technique) and task analysis. Recently there has been wider recognition of the need to model the nonlinear relationships between functions, their time lags, and feedback loops so that the level of risk is evaluated along different time spans.
For this reason, the systems-theoretic accident model and processes (STAMP) has been extensively used in studying the dynamics of the aviation system (see Chapters 11 and 12).
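The step of calculating a risk potential and ranking hazards for mitigation can be illustrated with a minimal sketch. It assumes a plain likelihood-times-severity index on two 1-5 scales; the scales, the band thresholds, the hazard names, and the scores are all invented for illustration and are not taken from the ICAO SMS manual. A static score of this kind does not capture time lags, feedback loops, or nonlinear effects, which is why systems models such as STAMP are needed alongside it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hazard:
    """A hazard scored on two illustrative 1-5 scales (assumed, not from ICAO)."""
    name: str
    likelihood: int  # 1 = extremely improbable ... 5 = frequent
    severity: int    # 1 = negligible ... 5 = catastrophic

    @property
    def risk_index(self) -> int:
        # Simplest possible "risk potential": likelihood times severity.
        return self.likelihood * self.severity


def classify(hazard: Hazard, tolerable_from: int = 5, intolerable_from: int = 15) -> str:
    """Map a risk index onto three bands; the thresholds are hypothetical."""
    if hazard.risk_index >= intolerable_from:
        return "intolerable"
    if hazard.risk_index >= tolerable_from:
        return "tolerable with mitigation"
    return "acceptable"


def prioritize(hazards: list[Hazard]) -> list[Hazard]:
    """Order hazards so that mitigation effort goes to the highest index first."""
    return sorted(hazards, key=lambda h: h.risk_index, reverse=True)


# Invented example hazards and scores, for illustration only.
register = [
    Hazard("frequency congestion", likelihood=4, severity=2),
    Hazard("loss of separation in peak traffic", likelihood=4, severity=4),
    Hazard("late handover between sectors", likelihood=2, severity=2),
]

for hazard in prioritize(register):
    print(f"{hazard.name}: index {hazard.risk_index}, {classify(hazard)}")
```

In practice the bands would come from the organization's own risk acceptance criteria rather than from fixed numbers like these.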
Traditionally, the aviation industry has placed a lot of importance on failure prevention by creating standards for policies, programs, and procedures. This work standardization was beneficial in reducing incidents in the past but it no longer suffices to increase safety levels. ICAO has recognized the need to take the next step to error management, where errors and failures that creep inside the system can be recovered in a timely fashion. For this reason, Chapter 6 is included on error detection and recovery, and implications are drawn for error management training.
A usual pitfall in risk management is that deviations from standard procedures are considered hazards that may lead to adverse events.
In other words, the gap between work as planned and work as done has been considered a hazard. In traditional approaches, standard procedures have been used as reference material for conducting risk analysis. However, there has been ample evidence that work practices that deviate from procedures could be a source of resilience in unforeseen circumstances. To some extent, everyday learning in operations rooms implies situations where controllers experiment with procedures to find more efficient ways of doing their jobs. Some new practices may receive wider recognition and become formal procedures themselves. In hindsight, modifications or workarounds may be seen as violations in cases where a problem is not managed properly. Risk management should rely not only on written procedures but also on actual descriptions of work practices. Therefore, a better understanding is required of the reality of work as done in the aviation industry.
This is the aim of Chapter 7, which presents factors affecting modifications of work practices, practitioner methods for optimizing performance, and organizational approaches to the communication of best practices invented by practitioners.
Risk management has taken a systems view of operations and looked into the role of workplace, technological, and organizational factors in accident causation. A second type of risk management can be specified for operational practitioners, who practice it on a daily basis to preview risks in everyday operations. Operational risk management (ORM) is a simplified version of systemic risk management that focuses on daily hazards and has been part of safety briefings and safety previews.
3.6.3 Safety Assurance
Safety assurance refers to the degree of confidence that the SMS can work in practice. It includes ongoing monitoring of safety performance and periodic evaluation of safety management practices. The ICAO SMS manual presents several sources of ongoing and periodic monitoring, such as event reporting, safety reviews and surveys, safety studies, and internal safety investigations. The challenges in safety assurance mainly have to do with performance management and include: defining lagging and leading indicators, encouraging voluntary feedback of operations, identifying early warning signals, and analyzing statistics of risk information. In accident investigation, the challenges relate to considerations of latent organizational failures that set the conditions for the recurrence of similar events. As formal investigations take a long time to get published, it is essential that organizations are able to draw preliminary conclusions from internal investigations so that organizational weaknesses are corrected and similar accidents are avoided (see Chapters 11 and 12).
Continuous reporting of safety-related events is emphasized in the ICAO SMS manual because organizations may slowly drift away from their safety standards without noticing it. To a certain extent, most organizations operate within this practical drift. Some organizations may stray from their standards and then oscillate a short distance from them. Yet there are organizations that begin to deviate very slowly, almost insidiously at first, then accelerate quickly away from the safety boundary. Monitoring operations to identify drift is part of a mature SMS. The challenge here is in capturing the right data at the right time.
ICAO makes reference to navigation aids within the practical drift that help organizations navigate the currents and obstacles.
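The difference between oscillating near a standard and drifting away from it can be made concrete with a small monitoring sketch. It assumes a single numeric safety indicator sampled at regular intervals, a known target value (`standard`), and a tolerance band; these names, the rolling-mean rule, and both data series are hypothetical and are not an ICAO method.

```python
from statistics import mean


def rolling_means(values, window):
    """Mean of each run of `window` consecutive observations."""
    return [
        mean(values[i - window + 1 : i + 1])
        for i in range(window - 1, len(values))
    ]


def detect_drift(values, standard, tolerance, window=4):
    """Return the index of the first observation at which the rolling mean
    lies more than `tolerance` away from `standard`, or None if it never does.

    Averaging over a window ignores short oscillations around the standard
    but still reacts to a sustained move away from it.
    """
    for offset, avg in enumerate(rolling_means(values, window)):
        if abs(avg - standard) > tolerance:
            return offset + window - 1
    return None


# Invented indicator values, for illustration only.
oscillating = [10, 11, 9, 10, 11, 10, 9, 11]
drifting = [10, 10, 10.5, 10.5, 11, 11, 11.5, 12, 13, 15]

print(detect_drift(oscillating, standard=10, tolerance=1.5))
print(detect_drift(drifting, standard=10, tolerance=1.5))
```

The window length and tolerance embody the trade-off noted above: a short window reacts quickly but raises false alarms on ordinary oscillation, whereas a long window may report a slow drift only after it has begun to accelerate.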
Safety assurance makes feedback on SMS performance possible but also provides valuable input to many system changes. The management of change has been an essential part of the SMS manual because ANSPs have to adapt continually to increasing system demands and competition. This adaptation requires a policy and a program of introducing changes and managing risks. The challenge for managers and designers here is to imagine how the new system of work will operate in the near future and predict likely hazards in new operations. Although some hazards may be identified and prevented prior to design, others may make their way through to future operations; their effects can be unmasked by a safety assurance program. New ATM initiatives such as SESAR and NextGen can be regarded as change management initiatives and are considered separately in Chapter 10.
From another perspective, change management may be seen as a separate safety case, which identifies change objectives (safety policy), controls risks in the design stage (risk management), assures that “residual risks” are captured (safety assurance policies), and equips practitioners with the skills needed in the new context of work (safety promotion). In other words, change management may require all four pillars of safety management.
Since new change initiatives are made in response to increasing task demands and traffic patterns, the issue of coping with workload and complexity becomes fundamental. In this respect, Chapter 9 deals with the issue of complexity at the operational level where controllers have to manage heavier traffic levels. Chapter 10 looks into the strategies that controllers use to reduce complexity or manage high complexity in situations of high traffic or new situations in SESAR and NextGen scenarios.
3.6.4 Safety Promotion
ICAO’s emphasis on safety training focuses on how to make practitioners at all organizational levels familiar with the SMS and provide training in the identification of hazards, their prioritization, and their means of mitigation; a separate issue here is safety communication that should be sensitive to safety warnings. Although safety training is a legitimate aspect of improving human performance, it appears that safety has been treated as an important but isolated aspect of total performance management (e.g., productivity, quality, and maintenance). A challenge for organizations would be to consider new approaches that make a business case for safety by integrating safety and productivity aspects of performance (McDonald et al. 2012).
Existing risk management approaches emphasize this division of mind between safety and productivity, as illustrated in fault-tree analysis. Most probably, controllers do not perceive conflict detection as a task separate from other tasks, such as putting arriving aircraft in sequence, establishing departure flows, coordinating with other sectors, and so on. The challenge for controllers is to manage these tasks efficiently and safely. Therefore, the focus should be on the cognitive functions of controllers that usually address both efficiency and safety, although with different priorities depending on the circumstances. In this respect, several methods of cognitive task analysis (CTA) are presented in Chapter 8 as a basis for controller training. CTA methods are based on models of human performance and behavioral markers that exemplify aspects of poor and excellent performance.
Safety promotion also includes the collection and dissemination of risk information across all organizational levels. Safety communication is very important for remaining alert to early warnings of danger as well as learning lessons from recorded near misses and incidents.
Experienced controllers manage to fix problems quickly, hence overcoming several systemic problems that may persist for long periods as latent failures. Fixing a problem usually is not followed by early warnings to upper management, which prevents a systemic solution.
It is important, therefore, to examine how practitioners develop their quick-fix practices on the job and how organizations can design systems to capture knowledge about work practices that affect safety (see Chapter 7).
Safety communication may encounter many filters and hindrances as it travels upward through organizational levels. In this sense, organizational communication and safety culture are important aspects of managing safety communication. Organizational communication can be considered together with other managerial processes within the framework of system theoretical models such as STAMP and the Viable System Model (see Chapters 11 and 12).
3.7 Revisiting the Safety Envelope and Applying Resilience Engineering