SIDEBAR 3.3 CAUSES OF INACCURATE ESTIMATES
Lederer and Prasad (1992) investigated the cost-estimation practices of 115 different organizations. Thirty-five percent of the managers surveyed on a five-point Likert scale indicated that their current estimates were “moderately unsatisfactory” or “very unsatisfactory.” The key causes identified by the respondents included
• frequent requests for changes by users
• overlooked tasks
• users’ lack of understanding of their own requirements
• insufficient analysis when developing an estimate
• lack of coordination of systems development, technical services, operations, data administration, and other functions during development
• lack of an adequate method or guidelines for estimating
Several aspects of the project were noted as key influences on the estimate:
• complexity of the proposed application system
• required integration with existing systems
• complexity of the programs in the system
• size of the system expressed as number of functions or programs
• capabilities of the project team members
• project team’s experience with the application
• anticipated frequency or extent of potential changes in user requirements
• project team’s experience with the programming language
• database management system
• number of project team members
• extent of programming or documentation standards
• availability of tools such as application generators
• team’s experience with the hardware
[Figure 3.12 is a graph: the horizontal axis marks phases and milestones (feasibility; plans and requirements; product design; detailed design; development and test, with milestones at concept of operations, requirements specification, product design specs, detailed design specs, and accepted software), and the vertical axis marks the relative size range, from 0.25x through x to 4x. Plotted points show size (SLOC) and cost ($) estimates from actual projects.]
FIGURE 3.12 Changes in estimation accuracy as project progresses (Boehm et al. 1995).
As the project proceeds, these initial estimates can be refined, based on more complete information about the project’s characteristics.
Figure 3.12 illustrates how uncertainty early in the project can affect the accuracy of cost and size estimates (Boehm et al. 1995).
The stars represent size estimates from actual projects, and the pluses are cost estimates. The funnel-shaped lines narrowing to the right represent Boehm’s sense of how our estimates get more accurate as we learn more about a project. Notice that when the specifics of the project are not yet known, the estimate can differ from the eventual actual cost by a factor of 4. As decisions are made about the product and the process, the factor decreases. Many experts aim for estimates that are within 10 percent of the actual value, but Boehm’s data indicate that such estimates typically occur only when the project is almost done—too late to be useful for project management.
To address the need for producing accurate estimates, software engineers have developed techniques for capturing the relationships among effort and staff characteristics, project requirements, and other factors that can affect the time, effort, and cost of developing a software system. For the rest of this chapter, we focus on effort-estimation techniques.
Expert Judgment
Many effort-estimation methods rely on expert judgment. Some are informal techniques, based on a manager’s experience with similar projects. Thus, the accuracy of the prediction is based on the competence, experience, objectivity, and perception of the estimator. In its simplest form, such an estimate makes an educated guess about the effort needed to build an entire system or its subsystems. The complete estimate can be computed from either a top-down or a bottom-up analysis of what is needed.
Analogies are often used to estimate effort. If we have already built a system much like the one proposed, then we can use the similarity as the basis for our estimates.
For example, if system A is similar to system B, then the cost to produce system A should be very much like the cost to produce B. We can extend the analogy to say that if A is about half the size or complexity of B, then A should cost about half as much as B.
The analogy process can be formalized by asking several experts to make three predictions: a pessimistic one (x), an optimistic one (z), and a most likely guess (y). Then our estimate is the mean of the beta probability distribution determined by these numbers: (x + 4y + z)/6. By using this technique, we produce an estimate that “normalizes” the individual estimates.
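To make the arithmetic concrete, here is a minimal sketch in Python; the function name and the sample predictions are illustrative, not part of the original technique.

```python
def beta_estimate(pessimistic, most_likely, optimistic):
    """Mean of the beta distribution determined by an expert's three
    predictions: pessimistic (x), most likely (y), and optimistic (z)."""
    x, y, z = pessimistic, most_likely, optimistic
    return (x + 4 * y + z) / 6

# Hypothetical three-point predictions from three experts (person-months).
experts = [(12, 9, 7), (15, 10, 8), (11, 8, 6)]

# Average the experts' beta means to "normalize" the individual estimates.
estimates = [beta_estimate(*triple) for triple in experts]
print(sum(estimates) / len(estimates))
```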
The Delphi technique makes use of expert judgment in a different way. Experts are asked to make individual predictions secretly, based on their expertise and using whatever process they choose. Then, the average estimate is calculated and presented to the group. Each expert has the opportunity to revise his or her estimate, if desired. The process is repeated until no expert wants to revise. Some users of the Delphi technique discuss the average before new estimates are made; at other times, the users allow no discussion. And in another variation, the justifications of each expert are circulated anonymously among the experts.
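The bookkeeping of a Delphi exercise fits in a few lines. In the sketch below, the revision rule (each expert moves halfway toward the group average) is purely an illustrative stand-in, since real experts revise by whatever reasoning they choose.

```python
def delphi(estimates, max_rounds=10, tolerance=0.5):
    """Run Delphi rounds until no expert revises by more than `tolerance`.

    estimates: each expert's initial, secretly made prediction.
    The halfway-to-the-average revision rule is a simulation stand-in
    for the experts' own (unmodeled) judgment.
    """
    for _ in range(max_rounds):
        average = sum(estimates) / len(estimates)
        revised = [e + (average - e) / 2 for e in estimates]
        if all(abs(r - e) <= tolerance for r, e in zip(revised, estimates)):
            break
        estimates = revised
    return sum(estimates) / len(estimates)

print(delphi([8.0, 12.0, 20.0]))  # converges near the group average
```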
Wolverton (1974) built one of the first models of software development effort. His software cost matrix captures his experience with project cost at TRW, a U.S. software development company. As shown in Table 3.6, the row name represents the type of software, and the column designates its difficulty. Difficulty depends on two factors: whether the problem is old (O) or new (N), and whether it is easy (E), moderate (M), or hard (H). The matrix elements are the cost per line of code, as calibrated from historical data at TRW. To use the matrix, you partition the proposed software system into modules. Then, you estimate the size of each module in terms of lines of code. Using the matrix, you calculate the cost per module, and then sum over all the modules. For instance, suppose you have a system with three modules: one input/output module that is old and easy, one algorithm module that is new and hard, and one data management module that is old and moderate. If the modules are likely to have 100, 200, and 100 lines of code, respectively, then the Wolverton model estimates the cost to be (100 × 17) + (200 × 35) + (100 × 31) = $11,800 (see the sketch after Table 3.6).
TABLE 3.6 Wolverton Model Cost Matrix
Difficulty
Type of software OE OM OH NE NM NH
Control 21 27 30 33 40 49
Input/output 17 24 27 28 35 43
Pre/post processor 16 23 26 28 34 42
Algorithm 15 20 22 25 30 35
Data management 24 31 35 37 46 57
Time-critical 75 75 75 75 75 75
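The matrix lookup is easy to mechanize. The sketch below encodes Table 3.6 in Python and reproduces the $11,800 example; the data layout and function name are ours, not Wolverton’s.

```python
# Wolverton cost matrix (1974 dollars per line of code), from Table 3.6.
# Difficulty codes: O/N = old/new problem; E/M/H = easy/moderate/hard.
COST_MATRIX = {
    "control":            {"OE": 21, "OM": 27, "OH": 30, "NE": 33, "NM": 40, "NH": 49},
    "input/output":       {"OE": 17, "OM": 24, "OH": 27, "NE": 28, "NM": 35, "NH": 43},
    "pre/post processor": {"OE": 16, "OM": 23, "OH": 26, "NE": 28, "NM": 34, "NH": 42},
    "algorithm":          {"OE": 15, "OM": 20, "OH": 22, "NE": 25, "NM": 30, "NH": 35},
    "data management":    {"OE": 24, "OM": 31, "OH": 35, "NE": 37, "NM": 46, "NH": 57},
    "time-critical":      {"OE": 75, "OM": 75, "OH": 75, "NE": 75, "NM": 75, "NH": 75},
}

def wolverton_cost(modules):
    """Sum cost over (type, difficulty, estimated lines of code) triples."""
    return sum(COST_MATRIX[kind][difficulty] * loc
               for kind, difficulty, loc in modules)

# The three-module example from the text.
modules = [("input/output", "OE", 100),
           ("algorithm", "NH", 200),
           ("data management", "OM", 100)]
print(wolverton_cost(modules))  # 1700 + 7000 + 3100 = 11800
```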
Since the model is based on TRW data and uses 1974 dollars, it is not applicable to today’s software development projects. But the technique is useful and can be transported easily to your own development or maintenance environment.
In general, experiential models, by relying mostly on expert judgment, are subject to all its inaccuracies. They rely on the expert’s ability to determine which projects are similar and in what ways. However, projects that appear to be very similar can in fact be quite different. For example, fast runners today can run a mile in 4 minutes. A marathon requires a runner to run 26 miles and 385 yards. If we extrapolate the 4-minute time, we might expect a runner to run a marathon in about 1 hour and 45 minutes. Yet a marathon has never been run in under 2 hours. Consequently, there must be characteristics of running a marathon that are very different from those of running a mile. Likewise, there are often characteristics of one project that make it very different from another project, but the characteristics are not always apparent.
Even when we know how one project differs from another, we do not always know how the differences affect the cost. A proportional strategy is unreliable, because project costs are not always linear: two people cannot produce code twice as fast as one. Extra time may be needed for communication and coordination, or to accommodate differences in interest, ability, and experience. Sackman, Erikson, and Grant (1968) found that the productivity ratio between best and worst programmers averaged 10 to 1, with no easily definable relationship between experience and performance. Likewise, a more recent study by Hughes (1996) found great variety in the way software is designed and developed, so a model that works in one organization may not apply to another. Hughes also noted that past experience and knowledge of available resources are major factors in determining cost.
Expert judgment suffers not only from variability and subjectivity, but also from dependence on current data. The data on which an expert judgment model is based must reflect current practices, so they must be updated often. Moreover, most expert judgment techniques are simplistic, neglecting to incorporate a large number of factors that can affect the effort needed on a project. For this reason, practitioners and researchers have turned to algorithmic methods to estimate effort.
Algorithmic Methods
Researchers have created models that express the relationship between effort and the factors that influence it. The models are usually described using equations, where effort is the dependent variable, and several factors (such as experience, size, and application type) are the independent variables. Most of these models acknowledge that project size is the most influential factor in the equation by expressing effort as

E = (a + bS^c) m(X)

where S is the estimated size of the system; a, b, and c are constants; X is a vector of cost factors, x1 through xn; and m is an adjustment multiplier based on these factors. In other words, the effort is determined mostly by the size of the proposed system, adjusted by the effects of several other project, process, product, or resource characteristics.
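As a sketch, this general form translates directly into code. The constants and multipliers below are hypothetical placeholders; each published model supplies its own calibration, and m(X) is modeled here as a product of per-factor multipliers, which is one common convention.

```python
from math import prod

def algorithmic_effort(size, a, b, c, multipliers):
    """E = (a + b * S**c) * m(X), with m(X) modeled as the product of
    per-factor effort multipliers. Size units depend on the model."""
    return (a + b * size ** c) * prod(multipliers)

# Hypothetical calibration constants and two cost factors: one that
# raises effort by 10% and one that lowers it by 5%.
print(algorithmic_effort(size=50, a=0.2, b=3.0, c=1.05, multipliers=[1.10, 0.95]))
```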
Walston and Felix (1977) developed one of the first models of this type, finding that IBM data from 60 projects yielded an equation of the form

E = 5.25 S^0.91
The projects that supplied data built systems with sizes ranging from 4000 to 467,000 lines of code, written in 28 different high-level languages on 66 computers, and representing from 12 to 11,758 person-months of effort. Size was measured as lines of code, including comments as long as they did not exceed 50 percent of the total lines in the program.
The basic equation was supplemented with a productivity index that reflected 29 factors that can affect productivity, shown in Table 3.7. Notice that the factors are tied to a very specific type of development, including two platforms: an operational computer and a development computer. The model reflects the particular development style of the IBM Federal Systems organizations that provided the data.
TABLE 3.7 Walston and Felix Model Productivity Factors

1. Customer interface complexity
2. User participation in requirements definition
3. Customer-originated program design changes
4. Customer experience with the application area
5. Overall personnel experience
6. Percentage of development programmers who participated in the design of functional specifications
7. Previous experience with the operational computer
8. Previous experience with the programming language
9. Previous experience with applications of similar size and complexity
10. Ratio of average staff size to project duration (people per month)
11. Hardware under concurrent development
12. Access to development computer open under special request
13. Access to development computer closed
14. Classified security environment for computer and at least 25% of programs and data
15. Use of structured programming
16. Use of design and code inspections
17. Use of top-down development
18. Use of a chief programmer team
19. Overall complexity of code
20. Complexity of application processing
21. Complexity of program flow
22. Overall constraints on program’s design
23. Design constraints on the program’s main storage
24. Design constraints on the program’s timing
25. Code for real-time or interactive operation or for execution under severe time constraints
26. Percentage of code for delivery
27. Code classified as nonmathematical application and input/output formatting programs
28. Number of classes of items in the database per 1000 lines of code
29. Number of pages of delivered documentation per 1000 lines of code
Each of the 29 factors was weighted by +1 if the factor increases productivity, 0 if it has no effect on productivity, and -1 if it decreases productivity. A weighted sum of the 29 factors was then used to generate an effort estimate from the basic equation.
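A brief sketch of both pieces follows, assuming (consistent with the effort ranges quoted above) that S is measured in thousands of lines of code; how the resulting index finally scales the base estimate is left abstract, since the full mapping is not reproduced in this excerpt.

```python
def walston_felix_base(kloc):
    """Base effort in person-months: E = 5.25 * S**0.91.
    Assumes S is in thousands of lines of code (KLOC)."""
    return 5.25 * kloc ** 0.91

def productivity_index(responses):
    """Sum of the 29 factor responses, each scored +1 (increases
    productivity), 0 (no effect), or -1 (decreases productivity)."""
    assert all(r in (-1, 0, 1) for r in responses)
    return sum(responses)

# Hypothetical project: 40 KLOC with a mildly favorable factor profile.
responses = [1] * 10 + [0] * 15 + [-1] * 4
print(walston_felix_base(40), productivity_index(responses))
```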
Bailey and Basili (1981) suggested a modeling technique, called a meta-model, for building an estimation equation that reflects your own organization’s characteristics. They demonstrated their technique using a database of 18 scientific projects written in Fortran at NASA’s Goddard Space Flight Center. First, they minimized the standard error estimate and produced an equation that was very accurate:

E = 5.5 + 0.73 S^1.16

Then, they adjusted this initial estimate based on the ratio of errors. If R is the ratio between the actual effort, E, and the predicted effort, E′, then the effort adjustment ERadj is defined as

ERadj = R − 1      if R ≥ 1
ERadj = 1 − 1/R    if R < 1

They then adjusted the initial effort estimate this way:

Eadj = (1 + ERadj)E      if R ≥ 1
Eadj = E/(1 + ERadj)     if R < 1

Finally, Bailey and Basili (1981) accounted for other factors that affect effort, shown in Table 3.8. For each entry in the table, the project is scored from 0 (not present) to 5 (very important), depending on the judgment of the project manager. Thus, the total score for METH can be as high as 45, for CPLX as high as 35, and for EXP as high as 25.
TABLE 3.8 Bailey–Basili Effort Modifiers

Total Methodology (METH): tree charts; top-down design; formal documentation; chief programmer teams; formal training; formal test plans; design formalisms; code reading; unit development folders

Cumulative Complexity (CPLX): customer interface complexity; application complexity; program flow complexity; internal communication complexity; database complexity; external communication complexity; customer-initiated program design changes

Cumulative Experience (EXP): programmer qualifications; programmer machine experience; programmer language experience; programmer application experience; team experience
Their model describes a procedure, based on multilinear least-square regression, for using these scores to further modify the effort estimate.
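A small sketch of the Bailey–Basili equations, assuming size in thousands of lines of code and effort in person-months; the sample error ratio is hypothetical.

```python
def bailey_basili_base(kloc):
    """Baseline Goddard equation: E = 5.5 + 0.73 * S**1.16."""
    return 5.5 + 0.73 * kloc ** 1.16

def error_ratio_adjustment(r):
    """ERadj = R - 1 if R >= 1, else 1 - 1/R, where R = actual / predicted."""
    return r - 1 if r >= 1 else 1 - 1 / r

def adjusted_effort(estimate, r):
    """Eadj = (1 + ERadj) * E if R >= 1, else E / (1 + ERadj)."""
    er_adj = error_ratio_adjustment(r)
    return (1 + er_adj) * estimate if r >= 1 else estimate / (1 + er_adj)

# Hypothetical 20-KLOC project whose past estimates ran 25% low (R = 1.25).
print(adjusted_effort(bailey_basili_base(20), r=1.25))
```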
Clearly, one of the problems with models of this type is their dependence on size as a key variable. Estimates are usually required early, well before accurate size information is available, and certainly before the system is expressed as lines of code. So the models simply translate the effort-estimation problem to a size-estimation problem. Boehm’s Constructive Cost Model (COCOMO) acknowledges this problem and incorporates three sizing techniques in the latest version, COCOMO II.

Boehm (1981) developed the original COCOMO model in the 1970s, using an extensive database of information from projects at TRW, an American company that built software for many different clients. Considering software development from both an engineering and an economics viewpoint, Boehm used size as the primary determinant of cost and then adjusted the initial estimate using over a dozen cost drivers, including attributes of the staff, the project, the product, and the development environment. In the 1990s, Boehm updated the original COCOMO model, creating COCOMO II to reflect the ways in which software development had matured.
The COCOMO II estimation process reflects three major stages of any development project. Whereas the original COCOMO model used delivered source lines of code as its key input, the new model acknowledges that lines of code are impossible to know early in the development cycle. At stage 1, projects usually build prototypes to resolve high-risk issues involving user interfaces, software and system interaction, performance, or technological maturity. Here, little is known about the likely size of the final product under consideration, so COCOMO II estimates size in what its creators call application points. As we shall see, this technique captures size in terms of high-level effort generators, such as the number of screens and reports, and the number of third-generation language components.
At stage 2, the early design stage, a decision has been made to move forward with development, but the designers must explore alternative architectures and concepts of operation. Again, there is not enough information to support fine-grained effort and duration estimation, but far more is known than at stage 1. For stage 2, COCOMO II employs function points as a size measure. Function points, a technique explored in depth in IFPUG (1994a and b), estimate the functionality captured in the requirements, so they offer a richer system description than application points.
By stage 3, the postarchitecture stage, development has begun, and far more information is known. In this stage, sizing can be done in terms of function points or lines of code, and many cost factors can be estimated with some degree of comfort.
COCOMO II also includes models of reuse, takes into account maintenance and breakage (i.e., the change in requirements over time), and more. As with the original COCOMO, the model includes cost factors to adjust the initial effort estimate. A research group at the University of Southern California is assessing and improving its accuracy.
Let us look at COCOMO II in more detail. The basic model is of the form

E = bS^c m(X)

where the initial size-based estimate, bS^c, is adjusted by the vector of cost driver information, m(X). Table 3.9 describes the cost drivers at each stage, as well as the use of other models to modify the estimate.
TABLE 3.9 Three Stages of COCOMO II

Size
  Stage 1 (Application Composition): application points
  Stage 2 (Early Design): function points (FPs) and language
  Stage 3 (Postarchitecture): FPs and language, or source lines of code (SLOC)

Reuse
  Stage 1: implicit in model
  Stage 2: equivalent SLOC as a function of other variables
  Stage 3: equivalent SLOC as a function of other variables

Requirements change
  Stage 1: implicit in model
  Stage 2: % change expressed as a cost factor
  Stage 3: % change expressed as a cost factor

Maintenance
  Stage 1: application points, annual change traffic (ACT)
  Stage 2: function of ACT, software understanding, unfamiliarity
  Stage 3: function of ACT, software understanding, unfamiliarity

Scale (c) in nominal effort equation
  Stage 1: 1.0
  Stage 2: 0.91 to 1.23, depending on precedentedness, conformity, early architecture, risk resolution, team cohesion, and SEI process maturity
  Stage 3: 0.91 to 1.23, depending on the same factors as stage 2

Product cost drivers
  Stage 1: none
  Stage 2: complexity, required reusability
  Stage 3: reliability, database size, documentation needs, required reuse, and product complexity

Platform cost drivers
  Stage 1: none
  Stage 2: platform difficulty
  Stage 3: execution time constraints, main storage constraints, and virtual machine volatility

Personnel cost drivers
  Stage 1: none
  Stage 2: personnel capability and experience
  Stage 3: analyst capability, applications experience, programmer capability, programmer experience, language and tool experience, and personnel continuity

Project cost drivers
  Stage 1: none
  Stage 2: required development schedule, development environment
  Stage 3: use of software tools, required development schedule, and multisite development
At stage 1, application points supply the size measure. This size measure is an extension of the object-point approach suggested by Kauffman and Kumar (1993) and productivity data reported by Banker, Kauffman, and Kumar (1992). To compute application points, you first count the number of screens, reports, and third-generation language components that will be involved in the application. It is assumed that these elements are defined in a standard way as part of an integrated computer-aided software engineering environment. Next, you classify each application element as simple, medium, or difficult. Table 3.10 contains guidelines for this classification.
TABLE 3.10 Application Point Complexity Levels

For screens, complexity depends on the number of views contained and the number and source of data tables:

Number of views contained | Total < 4 (<2 servers, <3 clients) | Total < 8 (2–3 servers, 3–5 clients) | Total 8+ (>3 servers, >5 clients)
<3  | Simple | Simple | Medium
3–7 | Simple | Medium | Difficult
8+  | Medium | Difficult | Difficult

For reports, complexity depends on the number of sections contained and the number and source of data tables:

Number of sections contained | Total < 4 (<2 servers, <3 clients) | Total < 8 (2–3 servers, 3–5 clients) | Total 8+ (>3 servers, >5 clients)
0 or 1 | Simple | Simple | Medium
2 or 3 | Simple | Medium | Difficult
4+     | Medium | Difficult | Difficult
TABLE 3.11 Complexity Weights for Application Points
Element Type Simple Medium Difficult
Screen 1 2 3
Report 2 5 8
3GL component — — 10
The number to be used for simple, medium, or difficult application points is a complexity weight found in Table 3.11. The weights reflect the relative effort required to implement a report or screen of that complexity level.
Then, you sum the weighted reports and screens to obtain a single application-point number. If r percent of the objects will be reused from previous projects, the number of new application points is calculated to be

New application points = (application points) × (100 − r)/100
To use this number for effort estimation, you use an adjustment factor, called a productivity rate, based on developer experience and capability, coupled with CASE maturity and capability. For example, if the developer experience and capability are rated low, and the CASE maturity and capability are rated low, then Table 3.12 tells us that the productivity factor is 7, so the number of person-months required is the number of new application points divided by 7. When the developers’ experience is low but CASE maturity is high, the productivity estimate is the mean of the two values: 16. Likewise, when a team of developers has experience levels that vary, the productivity estimate can use the mean of the experience and capability weights.
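The whole stage-1 calculation can be sketched in a few lines, using the Table 3.11 weights; the element mix, reuse percentage, and productivity rate below are hypothetical (the rate echoes the Table 3.12 value quoted above).

```python
# Complexity weights for application points (Table 3.11).
WEIGHTS = {"screen": {"simple": 1, "medium": 2, "difficult": 3},
           "report": {"simple": 2, "medium": 5, "difficult": 8},
           "3gl":    {"difficult": 10}}  # 3GL components are always weighted 10

def new_application_points(elements, reuse_percent):
    """Weighted element count, reduced by the percentage reused."""
    total = sum(WEIGHTS[kind][level] for kind, level in elements)
    return total * (100 - reuse_percent) / 100

def person_months(app_points, productivity_rate):
    """Effort = new application points / productivity rate (points per PM)."""
    return app_points / productivity_rate

# Hypothetical application: 4 screens, 2 reports, 1 3GL component, 10% reuse.
elements = ([("screen", "simple")] * 3 + [("screen", "medium")]
            + [("report", "medium"), ("report", "difficult")]
            + [("3gl", "difficult")])
nap = new_application_points(elements, reuse_percent=10)
print(person_months(nap, productivity_rate=7))  # low experience, low CASE maturity
```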
At stage 1, the cost drivers are not applied to this effort estimate. However, at stage 2, the effort estimate, based on a function-point calculation, is adjusted for degree of reuse, requirements change, and maintenance. The scale (i.e., the value for c in the effort equation) had been set to 1.0 in stage 1; for stage 2, the scale ranges from 0.91 to 1.23, depending on the project’s precedentedness, conformity, early architecture and risk resolution, team cohesion, and process maturity.