apparent when customers start calling your HelpDesk with defects that eluded your testing. Customer-discovered software defects will be included in our analysis later in this chapter.
One planning key to successful test results analysis is the clear definition of success for each test case. It is common for a test case to have a number of expected results. If the actual results obtained from a test execution all match the expected results, then the test case is normally considered “attempted and successful.” If only some of the actual results obtained from a test execution match the expected results, then the test case is normally considered “attempted but unsuccessful.” Test cases that have not been executed are initially marked “unattempted.”
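A minimal sketch of this status rule, assuming each test case carries a list of expected results and, once executed, a list of actual results; the function and field names are illustrative, not taken from any tracking tool discussed later:

```python
def test_case_status(expected_results, actual_results):
    """Classify a test case using the status rule described above.

    expected_results: values defined during test planning.
    actual_results:   values observed during execution, or None if the
                      test case has not been executed yet.
    """
    if actual_results is None:
        return "unattempted"
    if actual_results == expected_results:
        return "attempted and successful"
    return "attempted but unsuccessful"


# Example: one actual value differs from its expected value.
print(test_case_status(["OK", 12.50], ["OK", 12.75]))  # attempted but unsuccessful
print(test_case_status(["OK", 12.50], None))           # unattempted
```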
The “unattempted” versus “attempted …” status of each test case is tracked by testing management because this is the most obvious testing progress indicator. Ninety percent “unattempted” test cases indicates that the testing effort has just begun. Ten percent “unattempted” test cases indicates that the testing effort may be close to finished. The number of attempted test cases over time gives the test manager an indication of how fast the testing is progressing relative to the size of the test team. If you log 15 test case attempts by your test team in the first 2 weeks of testing, this indicates an initial attempt rate of 1.5 test case attempts/day. If the test plan calls for a total of 100 test cases to be attempted, then you can calculate an initial estimate of 14 weeks for your test team to “attempt” all 100 test cases in the plan. Here are the calculations.
15 test cases attempted / 10 test work days = 1.5 test case attempts/day
100 test cases to attempt / 1.5 test case attempts/day = 67 days (14 workweeks)

Calculation 12.1 Estimating test execution schedule: First Draft
Some of the “attempts” will result in defect discoveries requiring time for correction and retesting. So the 14-week schedule really represents the expected completion of just the first round of testing execution. Depending on the number of “unsuccessful” test cases encountered during the 14-week period, a second, third, and possibly fourth round of correction and retesting may be necessary to achieve mostly “successful” results.
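Calculation 12.1 can be expressed as a small helper for projecting the first round of execution; the 5-day workweek and the rounding up to whole days and weeks are assumptions made for this sketch:

```python
import math


def estimate_first_round_schedule(attempted_so_far, work_days_so_far,
                                  total_planned, days_per_workweek=5):
    """Project a first-round test execution schedule as in Calculation 12.1."""
    attempts_per_day = attempted_so_far / work_days_so_far       # 15 / 10 = 1.5 per day
    total_days = math.ceil(total_planned / attempts_per_day)     # 100 / 1.5 -> 67 days
    workweeks = math.ceil(total_days / days_per_workweek)        # 67 / 5 -> 14 workweeks
    return attempts_per_day, total_days, workweeks


rate, days, weeks = estimate_first_round_schedule(15, 10, 100)
print(f"{rate} attempts/day, {days} days, about {weeks} workweeks for the first round")
```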
A test case may be “attempted but unsuccessful” because the actual results do not match the expected results or because the software halted with an error message before the test case was completed. The challenge to the test manager is to prioritize the unsuccessful test case results for correction. If a test case encounters an error that stops the test case before it can be completed, this is usually considered a severe defect sufficient to warrant immediate corrective action by the developers.
Once that corrective action has been taken and the test case rerun, the test case may go to completion without further showstoppers and become marked as “attempted and successful.” Alternatively, the test case may execute a few more steps and be halted by another defect.
If the application under test allows the test case to go to completion but provides actual results different from the expected results, the test manager needs to prioritize these unsuccessful test case results based on the business risk of leaving them uncorrected. For example, if a functional test case shows that a set of screen input values produces an incorrect screen output value critical to routine business, then the unsuccessful test case presents a high business risk if left uncorrected. An example of this kind of “attempted but unsuccessful” result would be an incorrect loan payment amortization schedule based on a loan principal value and annual interest rate. Testing can continue, but the application cannot be shipped or deployed to a business until the actual results match the expected results.
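As an illustration of that kind of functional check, the sketch below compares an actual amortized monthly payment against an expected value computed from the standard fixed-rate amortization formula; the principal, rate, term, tolerance, and the “actual” figure are assumptions made for this example, not figures from the text:

```python
def monthly_payment(principal, annual_rate, years):
    """Standard fixed-rate amortization formula: P * r / (1 - (1 + r)**-n)."""
    r = annual_rate / 12          # monthly interest rate
    n = years * 12                # number of monthly payments
    return principal * r / (1 - (1 + r) ** -n)


# Expected result computed during test planning; actual result is the
# (hypothetical, incorrect) value shown by the application under test.
expected = round(monthly_payment(200_000, 0.06, 30), 2)   # about 1199.10
actual = 1212.33

if abs(actual - expected) > 0.01:
    print(f"attempted but unsuccessful: expected {expected}, got {actual}")
```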
A low business risk example would be a “submit” message that appears in green in the lower right-hand corner of the screen instead of appearing in red in the upper left-hand corner of the screen. The actual execution result is different from the expected result, but the application is usable in business with the different outcome. The test manager needs to discuss this test finding with the application development manager to determine the priority of correcting the code that produces the “submit” message.
Testers tend to prioritize unsuccessful testing outcomes using the range of numbers from 1 to 4. Priority 1 is used to indicate the highest business risk. Priority 4 is used to indicate the lowest business risk. Historically, testers use the term “severity” instead of “priority” to convey the relative business risk of unsuccessful tests. Figure 12.1a demonstrates how a test case execution schedule might appear. Figure 12.1b shows the analysis of the Figure 12.1a first week’s test execution results.
Figure 12.1a A test schedule with first-week outcomes
12.3 DEFECT DISCOVERY FOCUSING ON INDIVIDUAL DEFECTS
As we saw in the previous section, there are several possible reasons why the execution of a test case can be considered unsuccessful. The remaining sections of this chapter use the term “defect” for a confirmed software error discovered by test execution and requiring correction.
At its most basic level, testing discovers defects one at a time. Once the defect has been corrected and retested, the particular area of software under test may operate defect free throughout the remaining test case execution. More frequently, the correction of one defect simply allows the test case to proceed to the next defect in the software, resulting in a number of discovery/correction cycles that are required before the test case can run to successful completion. It is also likely that multiple test cases with different testing objectives in the same area of the software will discover different sequences of defects. The implication is that a single successful test case does not guarantee defect-free code in the area of the software being tested.
The incremental discovery, correction, and retesting of software defects is the primary way that software testers help software developers implement the development requirements. The fewer the latent defects in the delivered software, the closer the software comes to fulfilling the requirements. The success of incremental defect discovery is directly related to the management process used to track defects from discovery to correction. If defects are discovered but not reported to developers, then testing provides no real value to the development effort. If the defects are discovered and reported to developers but the corrective action is not verified, then testing still provides no real value to the development effort. The success of incremental defect discovery requires defect tracking from discovery through correction to retesting and verification that correction has been achieved.

Figure 12.1b Analysis of first-week test execution outcomes
Defect tracking can be accomplished with a variety of reporting tools ranging from a simple spreadsheet to an elaborate defect management tool. Either way, the organized entry and tracking of simple information pays great dividends toward the success of defect correction efforts. Figure 12.2 demonstrates how an unsuccessful test case attempt causes one or more defect log entries that can be tracked to correction with simple metrics.
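A spreadsheet-style log of this kind can be modeled directly; the sketch below is one possible minimal layout with illustrative field names and entries (not the actual columns of Figure 12.2), together with the kind of simple status counts such a log supports:

```python
from dataclasses import dataclass


@dataclass
class DefectLogEntry:
    defect_id: str      # unique identifier, e.g. "D-001"
    test_case_id: str   # the unsuccessful test case that discovered the defect
    description: str    # what was observed versus what was expected
    severity: int       # 1 = highest business risk ... 4 = lowest business risk
    status: str         # "open", "corrected", or "verified by retest"


defect_log = [
    DefectLogEntry("D-001", "TC-017", "loan amortization schedule incorrect", 1, "open"),
    DefectLogEntry("D-002", "TC-023", "submit message wrong color/position", 4, "corrected"),
    DefectLogEntry("D-003", "TC-017", "halt on payment screen", 1, "verified by retest"),
]

# Simple tracking metrics: how many defects sit in each status.
for status in ("open", "corrected", "verified by retest"):
    count = sum(1 for d in defect_log if d.status == status)
    print(f"{status}: {count}")
```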
Because the severity code is meant to be an aid in determining which defects in the tracking log to correct next, at least three different kinds of severity codes can be found in use either singly or in combination. The first kind of severity code indicates severity relative to testing, that is, “Is this a testing showstopper?” The second kind of severity code indicates severity relative to development, that is, “Is this a development showstopper?” The third kind of severity code indicates severity relative to completing development, that is, “Is this a shipping/deployment showstopper?” The trend is toward capturing all three severity codes for each defect and using the one that makes the most sense depending on how close the development project is to completion. The closer the project comes to completion, the more important the shipping showstopper severity code becomes.
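One way to record all three severity codes on a defect and select the one that fits the current project phase; the simple near-completion flag and the rule for combining the testing and development codes are assumptions made for this sketch:

```python
from dataclasses import dataclass


@dataclass
class DefectSeverity:
    testing: int       # "Is this a testing showstopper?"              1 (yes) .. 4 (no)
    development: int   # "Is this a development showstopper?"          1 (yes) .. 4 (no)
    shipping: int      # "Is this a shipping/deployment showstopper?"  1 (yes) .. 4 (no)


def effective_severity(sev: DefectSeverity, near_completion: bool) -> int:
    """Use the severity code that makes the most sense for the project phase:
    the closer the project is to completion, the more the shipping code matters."""
    if near_completion:
        return sev.shipping
    return min(sev.testing, sev.development)  # most severe of the other two codes


sev = DefectSeverity(testing=2, development=1, shipping=3)
print(effective_severity(sev, near_completion=False))  # 1: development showstopper dominates
print(effective_severity(sev, near_completion=True))   # 3: judged by shipping impact
```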
Figure 12.2 Example defect tracking log from unsuccessful test case attempts