STEP has been introduced through Software Quality Engineering's (SQE) Systematic Software Testing classes to hundreds of organizations. It's a proven methodology offering significant potential for improving software quality in most companies.
Key differences between STEP and prevalent industry practices are highlighted in Table 1-5.
First is the overall goal of the testing activity. STEP is prevention oriented, with a primary
focus on finding requirements and design defects through early development of test designs.
This results in the second major difference: when major testing activities begin (i.e., the planning and acquisition timing columns in Table 1-5). In STEP, test planning begins during software requirements definition, and testware design occurs in parallel with software design and before coding. Prevalent practice is for planning to begin in parallel with coding and for test development to be done after coding.
Table 1-5: Major Differences Between STEP and Industry Practice

                      STEP                                      Prevalent Industry Practice
Focus                 Prevention & Risk Management              Detection & Demonstration
Planning Timing       Begins During Requirements Definition     Begins After Software Design
Acquisition Timing    Begins During Requirements Definition     Begins After Software Design (or Code)
Coverage              Known (Relative to Inventories)           Largely Unknown
Visibility            Fully Documented & Evaluated              Largely Undocumented with Little or No Evaluation
Another major difference between STEP and prevalent industry practices is the creation of a group of test cases with known coverage (i.e., mapping test cases to inventories of
requirements, design, and code). Finally, using the IEEE documents provides full documentation (i.e., visibility) of testing activities.
Key Point
In STEP, test planning begins during software requirements definition and testware design occurs in parallel with software design and before coding.
STEP also requires the careful and systematic development of requirements-based and design-based coverage inventories, and it requires that the resulting test designs be calibrated to these inventories.
The result is that in STEP, the test coverage is known and measured (at least with respect to the listed inventories). Prevalent practice largely ignores the issue of coverage
measurement and often results in ad hoc or unknown coverage.
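To make the idea of calibration more concrete, the sketch below shows one simple way a test set could be checked against a coverage inventory. The requirement identifiers, test-case names, and data structures are invented for illustration; they are not part of STEP's defined artifacts.

```python
# Hypothetical sketch: calibrating a test set against a requirements inventory.
# All IDs and names below are invented for illustration.

inventory = {"REQ-001", "REQ-002", "REQ-003", "REQ-004"}  # coverage inventory

# Each test case lists the inventory items it exercises.
test_cases = {
    "TC-01": {"REQ-001", "REQ-002"},
    "TC-02": {"REQ-002"},
    "TC-03": {"REQ-004"},
}

covered = set().union(*test_cases.values()) & inventory
uncovered = inventory - covered

print(f"Coverage: {len(covered)}/{len(inventory)} "
      f"({100 * len(covered) / len(inventory):.0f}%)")
print("Uncovered items:", sorted(uncovered))  # e.g., REQ-003 has no test case
```

Even a listing this simple makes the coverage measurable and makes gaps (such as the uncovered item above) visible for evaluation.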
A final major difference lies in the visibility of the full testing process. Every activity in STEP leads to visible work products. From plans, to inventories, to test designs, to test specs, to test sets, to test reports, the process is visible and controlled. Industry practice provides much less visibility, with little or no systematic evaluation of intermediate products.
These differences are significant and not necessarily easy to put into practice. However, the benefits are equally significant and well worth the difficulty and investment.
Key Point
Calibration is the term used to describe the measurement of coverage of test cases against an inventory of requirements and design attributes.
Chapter 2: Risk Analysis
Overview
"If you do not actively attack risks, they will actively attack you."
— Tom Gilb
Key Point
A latent defect is an existing defect that has not yet caused a failure because the exact set of conditions has never been met.
A masked defect is an existing defect that hasn't yet caused a failure, because another defect has prevented that part of the code from being executed.
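As a hypothetical illustration (not drawn from the text), the fragment below contains both kinds of defect: a division that fails only for an input value that may never have occurred (latent), and a defect that cannot cause a failure because an earlier defect prevents its code from ever executing (masked).

```python
# Hypothetical illustration of latent and masked defects.

def average_order_value(total, order_count):
    # Latent defect: this division fails only when order_count is 0,
    # a condition that may never have been met in years of operation.
    return total / order_count

def apply_discount(price, code):
    # Defect 1: the comparison uses "VIP " (trailing space), so the branch
    # below is never entered for the real discount code "VIP" ...
    if code == "VIP ":
        # Defect 2 (masked): the discount is added instead of subtracted,
        # but defect 1 prevents this line from ever executing.
        return price + price * 0.10
    return price
```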
There's no way we can ever guarantee that a software system will be "perfect," because failures may come from many unexpected directions. A latent defect in a system that has run well for many years may cause the system to fail unexpectedly. Hardware may fail or defects may remain undetected for years, then suddenly become unmasked. These effects may be amplified as changes to interfaces and protocols in one part of the system begin to interfere with legacy software in another part. Multiplying numbers of users may stress the system, or changes in the business model may cause them to use it in ways that were never originally foreseen. A changing operating environment may also pose risks that can
undermine a sound software design, creating implementation and operational problems.
In his article "Chaos Into Success," Jim Johnson reported that only 26% of projects met the criteria for success - completed on time, on budget, and with all of the features and
functions originally specified. Unfortunately, the disaster stories behind these statistics are often more difficult to digest than the numbers themselves. In an article in IEEE Computer magazine, Nancy Leveson and Clark Turner reported that a computerized radiation therapy machine called Therac-25 caused six known incidents of accidental overdose between June 1985 and January 1987, which resulted in deaths and serious injuries. According to Space Events Diary, corrupted software may have been the cause of the failure of the upper stage on a Titan 4B spacecraft on April 30, 1999. The malfunction caused the upper stage of the rocket to misfire and place its payload (a communications satellite) in the wrong orbit. A review of newspapers, magazines, and Web sites will show that these are only a few of the documented incidents caused by defective software. Thousands of undocumented incidents occur every day and affect nearly every aspect of our lives.
Key Point
A GUI with 10 fields that can be entered in any order results in a set of 3,628,800 (i.e., 10!) possible orderings that could potentially be tested.
Most software testing managers and engineers realize that it's impossible to test everything in even the most trivial of systems. The features and attributes of a simple application may result in millions of permutations that could potentially be developed into test cases.
Obviously, it's not possible to create millions of test cases; and even if a large number of test cases are created, they generally still represent only a tiny fraction of the possible combinations. Even if you had created thousands of test cases, and through a concerted effort doubled that number, millions of other combinations may still exist and your "doubled"
test set would still represent only a tiny fraction of the potential combinations, as illustrated in Figure 2-1. In most cases, "what" you test in a system is much more important than "how much" you test.
Figure 2-1: Domain of All Possible Test Cases (TC) in a Software System
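As a back-of-the-envelope check of the key point above, the snippet below (illustrative only; the field counts are assumptions) counts the distinct orders in which independent GUI fields can be entered and shows how quickly that number grows.

```python
import math

# Number of distinct orders in which n independent GUI fields can be entered.
for n in range(1, 11):
    print(f"{n:2d} fields -> {math.factorial(n):>9,} possible entry orders")

# 10 fields -> 3,628,800 possible entry orders, before even considering
# the different values each field could hold.
```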
Tight time schedules and shortages of trained testers serve to exacerbate this problem even further. In many companies, the testers begin work on whatever components or parts of the system they encounter first, or perhaps they work on those parts that they're most familiar with. Unfortunately, both of these approaches typically result in the eventual delivery of a system in which some of the most critical components are untested, inadequately tested, or at the very least, tested later in the lifecycle. Even if problems are found later in the
lifecycle, there may be inadequate time to fix them, thereby adding to the risk of the software. Changing priorities, feature creep, and loss of resources can also reduce the ability of the test team to perform a reasonably comprehensive test.