
Unintended software interactions


6.74 Software code might function correctly, as intended by the programmer, yet the interactions between individual components of the software can still cause failure where the designers of the system have not accounted for all the potential interactions. This is because the potential number of defects in software relates not only to the components (lines of code), but also to the number of ways in which they interact; the number of interactions increases faster than the number of components, which makes large systems with many components disproportionately harder to get right.

As the work of Bianca Schroeder and Garth A. Gibson demonstrates, the more complex the system becomes, the more likely it is that different types of failure will occur,1 and the number of ways in which that complexity can cause failure also increases.2 To put the problem into perspective, it is necessary to understand not the number of defects per device but the proportion of design decisions that contain defects, which might be termed a frequency.3 A typical design decision in software looks like this:

if some-condition-I-have-decided-when-I-designed-the-software then
    do something
otherwise
    do something else
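
To make the combinatorial point concrete, the following short sketch (written in Python purely for illustration; the conditions and their names are hypothetical and not drawn from any particular system) shows how just three such design decisions already produce eight combinations of circumstances, each of which the designer must have thought through correctly:

# Illustrative sketch only: three hypothetical design decisions, each a
# simple if/else, give 2 ** 3 = 8 combinations of conditions that the
# designer must handle correctly; each further decision doubles the total.
from itertools import product

def process(record_exists, user_is_authorised, data_is_valid):
    steps = []
    # Decision 1: does the record already exist?
    if record_exists:
        steps.append("update existing record")
    else:
        steps.append("create new record")
    # Decision 2: is the user authorised for this operation?
    if user_is_authorised:
        steps.append("apply the change")
    else:
        steps.append("refuse and log the attempt")
    # Decision 3: did the supplied data pass validation?
    if data_is_valid:
        steps.append("commit")
    else:
        steps.append("roll back")
    return ", ".join(steps)

# Enumerate all eight combinations; ten such decisions would give 1,024.
for conditions in product([True, False], repeat=3):
    print(conditions, "->", process(*conditions))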

1 Schroeder and Gibson, ‘A large-scale study of failures in high-performance computing systems’.

2 For the same discussion in 1986, see Rudolph J Peritz, ‘Computer data and reliability: a call for authentication of business records under the federal rules of evidence’ (1986) 80 Northwestern University Law Review 965, 990–9; Stephen Mason and Timothy S Reiniger, ‘“Trust” between machines? Establishing identity between humans and software code, or whether you know it is a dog, and if so, which dog?’ (2015) 21 CTLR 135–48; for a specific case study, see Sivanesan Tulasidas, Ruth Mackay, Pascal Craw, Chris Hudson, Voula Gkatzidou and Wamadeva Balachandran, ‘Process of designing robust, dependable, safe and secure software for medical devices: point of care testing device as a case study’ (2013) 6 Journal of Software Engineering and Applications 1.

3 Nobody is certain how many defects occur per lines of code or number of design decisions, but for a good discussion, see McDermid and Kelly, ‘Software in safety critical systems’.

6.75 This means, to illustrate the point with this simple example, that each design decision creates at least two choices for the software to handle, and within each ‘do something’ branch, further design choices will have to be made. In software, a very few decisions therefore rapidly create something far more complex than humans can reliably analyse while remaining confident that they have made the right decisions in even a modest fraction of the possible cases.1 Since there are typically thousands of design decisions in the software for even relatively small products, there will be hundreds of defects in the final products – Professor Pham suggests ‘that as software projects become larger, the rate of software defects increases geometrically’.2 An average defect level of one to five defects per thousand lines of code could translate into hundreds, if not thousands, of defects for devices that have several hundred thousand to a million or more lines of code.3 This is the typical size of most software that controls aircraft,4 motor vehicles and many other common systems. What affects the user is not how many defects there are, but how often the software fails, or how likely it is that the software failed on a particular occasion (a probability rather than a frequency), because one defect may cause failures frequently, while another causes failures only very seldom.
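
A rough calculation, using only the illustrative figures quoted above (the densities and program sizes below are assumptions for the purpose of the arithmetic, not measurements of any particular product), shows the scale of the problem:

# Back-of-the-envelope estimate: defect densities of 1 to 5 defects per
# 1,000 lines of code applied to programs of 100,000 to 1,000,000 lines.
low_density, high_density = 1, 5          # defects per 1,000 lines of code
for lines in (100_000, 500_000, 1_000_000):
    kloc = lines // 1_000
    print(f"{lines:>9,} lines: roughly {kloc * low_density:,} "
          f"to {kloc * high_density:,} residual defects")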

1 I owe this analysis to Professor Harold Thimbleby.

2 Exponential is a more precise term than geometric: Hoang Pham, System Software Reliability (Springer 2000) 2. The software included in motor vehicles (called ‘electronic control units’ in the trade) is increasing in number and has elaborate structures, all of which can lead to malfunctions that can cause death if the software is not properly tested: J Mössinger, ‘Software in automotive systems’ (2010) 27 IEEE Software 92; Stephen Mason, ‘Vehicle remote keyless entry systems and engine immobilisers: do not believe the insurer that they are perfect’ (2012) 28 Computer Law and Security Review 195, in which it was predicted that the number of vehicles with keyless entry systems being stolen would increase, for which see Carnegie Menon, ‘Hi-tech thieves add computers to crowbars’, The Guardian (London, 25 June 2016) 49 and ‘Is your car the most stolen model in England and Wales?’ at <www.theguardian.com/money/2016/jun/25/hi-tech-thieves-keyless-car-crime-electronic-security>.

3 William Guttman, professor of economics and technology at Carnegie Mellon University, is of the view that the figure is nearer 30 errors per 1,000 lines of code on average: Alorie Gilbert, ‘Newsmaker: Fixing the sorry state of software’, CNET News (9 October 2002) (this item no longer seems to be available online).

4 On 2 June 1994, Chinook helicopter ZD 576 crashed on the Mull of Kintyre. The RAF Board of Inquiry held the pilots to be negligent. However, some considered that the installation of a Full Authority Digital Engine Control (FADEC) system was to blame, as described in detail in RAF Justice (Computer Weekly) <http://cdn.ttgtmedia.com/rms/computerweekly/DowntimePDF/pdf/rafjust.pdf>; ‘Chinook crash: critical internal memo on software flaws’, Computer Weekly (4 June 2009) <www.computerweekly.com/news/2240089594/Chinook-crash-critical-internal-memo-on-software-flaws>; the decision of the RAF Board of Inquiry was subsequently reversed: The Mull of Kintyre Review (HC Paper 1348, 2011) <www.gov.uk/government/uploads/system/uploads/attachment_data/file/247259/1348.pdf>.

6.76 This issue is further magnified by what are called ‘legacy’ systems. For instance, the computer systems used by airlines are very complex, for a number of reasons: airlines first introduced computer systems in the 1950s, and as airlines merge with or take over other airlines, they combine or adopt the computer systems they inherit. Over time, as new functions are added, this process has created systems of great complexity. The banking sector has the same problem. Replacing such systems is not an easy decision, because it would take a considerable amount of money and time, and it is doubtful whether any single IT firm has sufficient skills and knowledge to provide all the software needed for a complete replacement.1

1 ‘All systems stop: why big firms like Delta find it so hard to eliminate glitches from their IT systems’, The Economist (London, 13 August 2016) (from the print edition) at <www.economist.com/news/business/21704842-why-big-firms-delta-find-it-so-hard-eliminate-glitches-their-it-systems-all-systems>.

6.77 One example of such a failure is the loss of the Mars Polar Lander and Deep Space 2 missions. The failure is recounted in the NASA report into the loss of the spacecraft:

7.7.2 Premature Descent Engine Shutdown

FAILURE MODE DESCRIPTION

A spurious signal, generated when the landing legs are deployed at an altitude of about 1500 meters, can cause premature descent engine shutdown when the lander is 40 meters above the surface.

The touchdown sensors characteristically generate a false momentary signal at leg deployment. This behavior was understood and the flight software was required to ignore these events; however, the requirement did not specifically describe these events, and consequently, the software designers did not properly account for them. The resulting software design recorded the spurious signals generated at leg deployment as valid touchdown events. When the sensor data were enabled at an altitude of 40 meters, the engines would immediately shut down. The lander would free fall to the surface, impacting at a velocity of 22 meters per second (50 miles per hour), and be destroyed.1

1 Report on the Loss of the Mars Polar Lander and Deep Space 2 Missions (JPL Special Review Board, 22 March 2000, JPL D-18709).
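
The failure mode described in the report can be illustrated by the following highly simplified sketch in Python; the structure, names and values are hypothetical and are not taken from the actual flight software:

# Hypothetical, highly simplified sketch of the failure mode described in
# the JPL report; it is not the actual flight software.
class DescentControl:
    def __init__(self):
        self.touchdown_indicated = False

    def read_touchdown_sensors(self, sensor_signal):
        # Defect: the momentary spurious signal generated when the legs
        # deploy (at about 1,500 metres) is latched as a real touchdown.
        if sensor_signal:
            self.touchdown_indicated = True

    def engine_command(self, altitude_m):
        # Touchdown sensing is only acted upon below 40 metres, but by then
        # the spurious signal has already been recorded, so the descent
        # engine is shut down while the lander is still 40 metres up.
        if altitude_m <= 40 and self.touchdown_indicated:
            return "shut down descent engine"
        return "continue powered descent"

lander = DescentControl()
lander.read_touchdown_sensors(sensor_signal=True)  # transient at leg deployment
print(lander.engine_command(altitude_m=40))        # engine cut off 40 m above the surface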

6.78 Professor Leveson describes this as a component interaction accident,1 where an accident arises because of the interactions between the components of a system, rather than from the failure of any individual component. This is an example of incorrect software requirements, specifically of incorrect dependencies between components: the assumptions made in one element become an important part of the context of the requirements for some other part.2 This illustrates the point that software itself is neither acceptably safe nor unacceptably unsafe; it is the operation of the software that might be called safe or unsafe.

1 Nancy G Leveson, Engineering a Safer World: Systems Thinking Applied to Safety (MIT Press 2011) 8, 49, 66–67.

2 For examples of other accidents, see Trevor Kletz, Paul Chung, Eamon Broomfield and Chaim Shen-Orr, Computer Control and Human Error (Gulf Professional Publishing 1995).

6.79 Consider a practical example. The display on a screen has a meaning, and if that meaning is not veridical, then an accident may result. Where the moon rising over the horizon causes a system to interpret it as a massive ICBM launch, semantic safety is violated: that is, the display (it might be a warning signal or something else) was not veridical. This problem has been linked to at least two occasions on which nuclear war may have been averted only by human intervention, despite computer warnings of imminent attack.1

1 I owe this suggestion to Professor Peter Bernard Ladkin. For the incident where software code made it appear that the Soviet Union had launched an assault of nuclear missiles on the United States of America, see Donald MacKenzie, Mechanizing Proof: Computing, Risk, and Trust (MIT Press 2004) 23–4 and Eric Schlosser, Command and Control (Penguin 2014) 253–4; for an incident where software code made it appear there was a missile attack by the United States of America against the Soviet Union, see Ron Rosenbaum, How the End Begins: The Road to a Nuclear World War III (Simon & Schuster 2011) 7, 225–6, 248; Pavel Aksenov, ‘Stanislav Petrov: The man who may have saved the world’, BBC News (26 September 2013).

6.80 It should be observed that the increasing use of machine-learning systems complicates this issue, because the software code is instructed to make further decisions when running, which increases the complexity. In addition, the veridicality of machine-learning systems like neural nets cannot be easily understood or verified.1

1 I owe this point to Dr Michael Ellims and Professor Martyn Thomas, CBE, FREng.
