SIDEBAR 3.3 CAUSES OF INACCURATE ESTIMATES
Lederer and Prasad (1992) investigated the cost-estimation practices of 115 different organizations. Thirty-five percent of the managers surveyed on a five-point Likert scale indicated that their current estimates were “moderately unsatisfactory” or “very unsatisfactory.” The key causes identified by the respondents included
• frequent requests for changes by users
• overlooked tasks
• users’ lack of understanding of their own requirements
• insufficient analysis when developing an estimate
• lack of coordination of systems development, technical services, operations, data administration, and other functions during development
• lack of an adequate method or guidelines for estimating
Several aspects of the project were noted as key influences on the estimate:
• complexity of the proposed application system
• required integration with existing systems
• complexity of the programs in the system
• size of the system expressed as number of functions or programs
• capabilities of the project team members
• project team’s experience with the application
• anticipated frequency or extent of potential changes in user requirements
• project team’s experience with the programming language
• database management system
• number of project team members
• extent of programming or documentation standards
• availability of tools such as application generators
• team’s experience with the hardware
[Figure 3.12 is a graph: the horizontal axis marks phases and milestones (feasibility; plans and requirements; product design; detailed design; development and test, with milestones at concept of operations, requirements specification, product design specs, detailed design specs, and accepted software), and the vertical axis marks the relative size range, from 0.25x through x to 4x. Plotted points show size (SLOC) and cost ($) estimates from actual projects.]
FIGURE 3.12 Changes in estimation accuracy as project progresses (Boehm et al. 1995).
As the project proceeds, these initial estimates can be refined, based on more complete information about the project’s characteristics.
Figure 3.12 illustrates how uncertainty early in the project can affect the accuracy of cost and size estimates (Boehm et al. 1995).
The stars represent size estimates from actual projects, and the pluses are cost estimates. The funnel-shaped lines narrowing to the right represent Boehm’s sense of how our estimates get more accurate as we learn more about a project. Notice that when the specifics of the project are not yet known, the estimate can differ from the eventual actual cost by a factor of 4. As decisions are made about the product and the process, the factor decreases. Many experts aim for estimates that are within 10 percent of the actual value, but Boehm’s data indicate that such estimates typically occur only when the project is almost done—too late to be useful for project management.
To address the need for producing accurate estimates, software engineers have developed techniques for capturing the relationships among effort and staff characteristics, project requirements, and other factors that can affect the time, effort, and cost of developing a software system. For the rest of this chapter, we focus on effort-estimation techniques.
Expert Judgment
Many effort-estimation methods rely on expert judgment. Some are informal techniques, based on a manager’s experience with similar projects. Thus, the accuracy of the prediction is based on the competence, experience, objectivity, and perception of the estimator. In its simplest form, such an estimate makes an educated guess about the effort needed to build an entire system or its subsystems. The complete estimate can be computed from either a top-down or a bottom-up analysis of what is needed.
Analogies are often used to estimate effort. If we have already built a system much like the one proposed, then we can use the similarity as the basis for our estimates.
For example, if system A is similar to system B, then the cost to produce system A should be very much like the cost to produce B. We can extend the analogy to say that if A is about half the size or complexity of B, then A should cost about half as much as B.
The analogy process can be formalized by asking several experts to make three predictions: a pessimistic one (x), an optimistic one (z), and a most likely guess (y). Then our estimate is the mean of the beta probability distribution determined by these numbers: (x + 4y + z)/6. By using this technique, we produce an estimate that “normalizes” the individual estimates.
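To make the arithmetic concrete, here is a minimal sketch in Python; the function name and the sample predictions are illustrative, not part of the original technique.

```python
def beta_estimate(pessimistic, most_likely, optimistic):
    """Mean of the beta distribution determined by an expert's three
    predictions: pessimistic (x), most likely (y), and optimistic (z)."""
    x, y, z = pessimistic, most_likely, optimistic
    return (x + 4 * y + z) / 6

# Hypothetical three-point predictions from three experts (person-months).
experts = [(12, 9, 7), (15, 10, 8), (11, 8, 6)]

# Average the experts' beta means to "normalize" the individual estimates.
estimates = [beta_estimate(*triple) for triple in experts]
print(sum(estimates) / len(estimates))
```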
The Delphi technique makes use of expert judgment in a different way. Experts are asked to make individual predictions secretly, based on their expertise and using whatever process they choose. Then, the average estimate is calculated and presented to the group. Each expert has the opportunity to revise his or her estimate, if desired. The process is repeated until no expert wants to revise. Some users of the Delphi technique discuss the average before new estimates are made; at other times, the users allow no discussion. And in another variation, the justifications of each expert are circulated anonymously among the experts.
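The bookkeeping of a Delphi exercise fits in a few lines. In the sketch below, the revision rule (each expert moves halfway toward the group average) is purely an illustrative stand-in, since real experts revise by whatever reasoning they choose.

```python
def delphi(estimates, max_rounds=10, tolerance=0.5):
    """Run Delphi rounds until no expert revises by more than `tolerance`.

    estimates: each expert's initial, secretly made prediction.
    The halfway-to-the-average revision rule is a simulation stand-in
    for the experts' own (unmodeled) judgment.
    """
    for _ in range(max_rounds):
        average = sum(estimates) / len(estimates)
        revised = [e + (average - e) / 2 for e in estimates]
        if all(abs(r - e) <= tolerance for r, e in zip(revised, estimates)):
            break
        estimates = revised
    return sum(estimates) / len(estimates)

print(delphi([8.0, 12.0, 20.0]))  # converges near the group average
```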
Wolverton (1974) built one of the first models of software development effort. His software cost matrix captures his experience with project cost at TRW, a U.S. software development company. As shown in Table 3.6, the row name represents the type of software, and the column designates its difficulty. Difficulty depends on two factors: whether the problem is old (O) or new (N), and whether it is easy (E), moderate (M), or hard (H). The matrix elements are the cost per line of code, as calibrated from historical data at TRW. To use the matrix, you partition the proposed software system into modules. Then, you estimate the size of each module in terms of lines of code. Using the matrix, you calculate the cost per module, and then sum over all the modules. For instance, suppose you have a system with three modules: one input/output module that is old and easy, one algorithm module that is new and hard, and one data management module that is old and moderate. If the modules are likely to have 100, 200, and 100 lines of code, respectively, then the Wolverton model estimates the cost to be (100 × 17) + (200 × 35) + (100 × 31) = $11,800 (see the sketch after Table 3.6).
TABLE 3.6 Wolverton Model Cost Matrix
Difficulty
Type of software OE OM OH NE NM NH
Control 21 27 30 33 40 49
Input/output 17 24 27 28 35 43
Pre/post processor 16 23 26 28 34 42
Algorithm 15 20 22 25 30 35
Data management 24 31 35 37 46 57
Time-critical 75 75 75 75 75 75
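The matrix lookup is easy to mechanize. The sketch below encodes Table 3.6 in Python and reproduces the $11,800 example; the data layout and function name are ours, not Wolverton’s.

```python
# Wolverton cost matrix (1974 dollars per line of code), from Table 3.6.
# Difficulty codes: O/N = old/new problem; E/M/H = easy/moderate/hard.
COST_MATRIX = {
    "control":            {"OE": 21, "OM": 27, "OH": 30, "NE": 33, "NM": 40, "NH": 49},
    "input/output":       {"OE": 17, "OM": 24, "OH": 27, "NE": 28, "NM": 35, "NH": 43},
    "pre/post processor": {"OE": 16, "OM": 23, "OH": 26, "NE": 28, "NM": 34, "NH": 42},
    "algorithm":          {"OE": 15, "OM": 20, "OH": 22, "NE": 25, "NM": 30, "NH": 35},
    "data management":    {"OE": 24, "OM": 31, "OH": 35, "NE": 37, "NM": 46, "NH": 57},
    "time-critical":      {"OE": 75, "OM": 75, "OH": 75, "NE": 75, "NM": 75, "NH": 75},
}

def wolverton_cost(modules):
    """Sum cost over (type, difficulty, estimated lines of code) triples."""
    return sum(COST_MATRIX[kind][difficulty] * loc
               for kind, difficulty, loc in modules)

# The three-module example from the text.
modules = [("input/output", "OE", 100),
           ("algorithm", "NH", 200),
           ("data management", "OM", 100)]
print(wolverton_cost(modules))  # 1700 + 7000 + 3100 = 11800
```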
Since the model is based on TRW data and uses 1974 dollars, it is not applicable to today’s software development projects. But the technique is useful and can be transported easily to your own development or maintenance environment.
In general, experiential models, by relying mostly on expert judgment, are subject to all its inaccuracies. They rely on the expert’s ability to determine which projects are similar and in what ways. However, projects that appear to be very similar can in fact be quite different. For example, fast runners today can run a mile in 4 minutes. A marathon requires a runner to run 26 miles and 385 yards. If we extrapolate the 4-minute time, we might expect a runner to run a marathon in about 1 hour and 45 minutes. Yet a marathon has never been run in under 2 hours. Consequently, there must be characteristics of running a marathon that are very different from those of running a mile. Likewise, there are often characteristics of one project that make it very different from another project, but the characteristics are not always apparent.
Even when we know how one project differs from another, we do not always know how the differences affect the cost. A proportional strategy is unreliable, because project costs are not always linear: two people cannot produce code twice as fast as one. Extra time may be needed for communication and coordination, or to accommodate differences in interest, ability, and experience. Sackman, Erikson, and Grant (1968) found that the productivity ratio between best and worst programmers averaged 10 to 1, with no easily definable relationship between experience and performance. Likewise, a more recent study by Hughes (1996) found great variety in the way software is designed and developed, so a model that works in one organization may not apply to another. Hughes also noted that past experience and knowledge of available resources are major factors in determining cost.
Expert judgment suffers not only from variability and subjectivity, but also from dependence on current data. The data on which an expert judgment model is based must reflect current practices, so they must be updated often. Moreover, most expert judgment techniques are simplistic, neglecting to incorporate a large number of factors that can affect the effort needed on a project. For this reason, practitioners and researchers have turned to algorithmic methods to estimate effort.
Algorithmic Methods
Researchers have created models that express the relationship between effort and the factors that influence it. The models are usually described using equations, where effort is the dependent variable, and several factors (such as experience, size, and application type) are the independent variables. Most of these models acknowledge that project size is the most influential factor in the equation by expressing effort as

E = (a + bS^c) m(X)

where S is the estimated size of the system; a, b, and c are constants; X is a vector of cost factors, x1 through xn; and m is an adjustment multiplier based on these factors. In other words, the effort is determined mostly by the size of the proposed system, adjusted by the effects of several other project, process, product, or resource characteristics.
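As a sketch, this general form translates directly into code. The constants and multipliers below are hypothetical placeholders; each published model supplies its own calibration, and m(X) is modeled here as a product of per-factor multipliers, which is one common convention.

```python
from math import prod

def algorithmic_effort(size, a, b, c, multipliers):
    """E = (a + b * S**c) * m(X), with m(X) modeled as the product of
    per-factor effort multipliers. Size units depend on the model."""
    return (a + b * size ** c) * prod(multipliers)

# Hypothetical calibration constants and two cost factors: one that
# raises effort by 10% and one that lowers it by 5%.
print(algorithmic_effort(size=50, a=0.2, b=3.0, c=1.05, multipliers=[1.10, 0.95]))
```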
Walston and Felix (1977) developed one of the first models of this type, finding that IBM data from 60 projects yielded an equation of the form

E = 5.25 S^0.91
The projects that supplied data built systems with sizes ranging from 4000 to 467,000 lines of code, written in 28 different high-level languages on 66 computers, and representing from 12 to 11,758 person-months of effort. Size was measured as lines of code, including comments as long as they did not exceed 50 percent of the total lines in the program.
The basic equation was supplemented with a productivity index that reflected 29 factors that can affect productivity, shown in Table 3.7. Notice that the factors are tied to a very specific type of development, including two platforms: an operational computer and a development computer. The model reflects the particular development style of the IBM Federal Systems organizations that provided the data.
TABLE 3.7 Walston and Felix Model Productivity Factors

1. Customer interface complexity
2. User participation in requirements definition
3. Customer-originated program design changes
4. Customer experience with the application area
5. Overall personnel experience
6. Percentage of development programmers who participated in the design of functional specifications
7. Previous experience with the operational computer
8. Previous experience with the programming language
9. Previous experience with applications of similar size and complexity
10. Ratio of average staff size to project duration (people per month)
11. Hardware under concurrent development
12. Access to development computer open under special request
13. Access to development computer closed
14. Classified security environment for computer and at least 25% of programs and data
15. Use of structured programming
16. Use of design and code inspections
17. Use of top-down development
18. Use of a chief programmer team
19. Overall complexity of code
20. Complexity of application processing
21. Complexity of program flow
22. Overall constraints on program’s design
23. Design constraints on the program’s main storage
24. Design constraints on the program’s timing
25. Code for real-time or interactive operation or for execution under severe time constraints
26. Percentage of code for delivery
27. Code classified as nonmathematical application and input/output formatting programs
28. Number of classes of items in the database per 1000 lines of code
29. Number of pages of delivered documentation per 1000 lines of code
Each of the 29 factors was weighted by +1 if the factor increases productivity, 0 if it has no effect on productivity, and -1 if it decreases productivity. A weighted sum of the 29 factors was then used to generate an effort estimate from the basic equation.
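A brief sketch of both pieces follows, assuming (consistent with the effort ranges quoted above) that S is measured in thousands of lines of code; how the resulting index finally scales the base estimate is left abstract, since the full mapping is not reproduced in this excerpt.

```python
def walston_felix_base(kloc):
    """Base effort in person-months: E = 5.25 * S**0.91.
    Assumes S is in thousands of lines of code (KLOC)."""
    return 5.25 * kloc ** 0.91

def productivity_index(responses):
    """Sum of the 29 factor responses, each scored +1 (increases
    productivity), 0 (no effect), or -1 (decreases productivity)."""
    assert all(r in (-1, 0, 1) for r in responses)
    return sum(responses)

# Hypothetical project: 40 KLOC with a mildly favorable factor profile.
responses = [1] * 10 + [0] * 15 + [-1] * 4
print(walston_felix_base(40), productivity_index(responses))
```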
Bailey and Basili (1981) suggested a modeling technique, called a meta-model, for building an estimation equation that reflects your own organization’s characteristics. They demonstrated their technique using a database of 18 scientific projects written in Fortran at NASA’s Goddard Space Flight Center. First, they minimized the standard error estimate and produced an equation that was very accurate:

E = 5.5 + 0.73 S^1.16

Then, they adjusted this initial estimate based on the ratio of errors. If R is the ratio between the actual effort, E, and the predicted effort, E′, then the effort adjustment ERadj is defined as

ERadj = R − 1      if R ≥ 1
ERadj = 1 − 1/R    if R < 1

They then adjusted the initial effort estimate this way:

Eadj = (1 + ERadj)E      if R ≥ 1
Eadj = E/(1 + ERadj)     if R < 1

Finally, Bailey and Basili (1981) accounted for other factors that affect effort, shown in Table 3.8. For each entry in the table, the project is scored from 0 (not present) to 5 (very important), depending on the judgment of the project manager. Thus, the total score for METH can be as high as 45, for CPLX as high as 35, and for EXP as high as 25.
TABLE 3.8 Bailey–Basili Effort Modifiers

Total Methodology (METH): tree charts; top-down design; formal documentation; chief programmer teams; formal training; formal test plans; design formalisms; code reading; unit development folders

Cumulative Complexity (CPLX): customer interface complexity; application complexity; program flow complexity; internal communication complexity; database complexity; external communication complexity; customer-initiated program design changes

Cumulative Experience (EXP): programmer qualifications; programmer machine experience; programmer language experience; programmer application experience; team experience
Their model describes a procedure, based on multilinear least-square regression, for using these scores to further modify the effort estimate.
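A small sketch of the Bailey–Basili equations, assuming size in thousands of lines of code and effort in person-months; the sample error ratio is hypothetical.

```python
def bailey_basili_base(kloc):
    """Baseline Goddard equation: E = 5.5 + 0.73 * S**1.16."""
    return 5.5 + 0.73 * kloc ** 1.16

def error_ratio_adjustment(r):
    """ERadj = R - 1 if R >= 1, else 1 - 1/R, where R = actual / predicted."""
    return r - 1 if r >= 1 else 1 - 1 / r

def adjusted_effort(estimate, r):
    """Eadj = (1 + ERadj) * E if R >= 1, else E / (1 + ERadj)."""
    er_adj = error_ratio_adjustment(r)
    return (1 + er_adj) * estimate if r >= 1 else estimate / (1 + er_adj)

# Hypothetical 20-KLOC project whose past estimates ran 25% low (R = 1.25).
print(adjusted_effort(bailey_basili_base(20), r=1.25))
```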
Clearly, one of the problems with models of this type is their dependence on size as a key variable. Estimates are usually required early, well before accurate size information is available, and certainly before the system is expressed as lines of code. So the models simply translate the effort-estimation problem to a size-estimation problem. Boehm’s Constructive Cost Model (COCOMO) acknowledges this problem and incorporates three sizing techniques in the latest version, COCOMO II.

Boehm (1981) developed the original COCOMO model in the 1970s, using an extensive database of information from projects at TRW, an American company that built software for many different clients. Considering software development from both an engineering and an economics viewpoint, Boehm used size as the primary determinant of cost and then adjusted the initial estimate using over a dozen cost drivers, including attributes of the staff, the project, the product, and the development environment. In the 1990s, Boehm updated the original COCOMO model, creating COCOMO II to reflect the ways in which software development had matured.
The COCOMO II estimation process reflects three major stages of any development project. Whereas the original COCOMO model used delivered source lines of code as its key input, the new model acknowledges that lines of code are impossible to know early in the development cycle. At stage 1, projects usually build prototypes to resolve high-risk issues involving user interfaces, software and system interaction, performance, or technological maturity. Here, little is known about the likely size of the final product under consideration, so COCOMO II estimates size in what its creators call application points. As we shall see, this technique captures size in terms of high-level effort generators, such as the number of screens and reports, and the number of third-generation language components.
At stage 2, the early design stage, a decision has been made to move forward with development, but the designers must explore alternative architectures and concepts of operation. Again, there is not enough information to support fine-grained effort and duration estimation, but far more is known than at stage 1. For stage 2, COCOMO II employs function points as a size measure. Function points, a technique explored in depth in IFPUG (1994a and b), estimate the functionality captured in the requirements, so they offer a richer system description than application points.
By stage 3, the postarchitecture stage, development has begun, and far more information is known. In this stage, sizing can be done in terms of function points or lines of code, and many cost factors can be estimated with some degree of comfort.
COCOMO II also includes models of reuse, takes into account maintenance and breakage (i.e., the change in requirements over time), and more. As with the original COCOMO, the model includes cost factors to adjust the initial effort estimate. A research group at the University of Southern California is assessing and improving its accuracy.
Let us look at COCOMO II in more detail. The basic model is of the form

E = bS^c m(X)

where the initial size-based estimate, bS^c, is adjusted by the vector of cost driver information, m(X). Table 3.9 describes the cost drivers at each stage, as well as the use of other models to modify the estimate.
TABLE 3.9 Three Stages of COCOMO II

Size
  Stage 1 (Application Composition): application points
  Stage 2 (Early Design): function points (FPs) and language
  Stage 3 (Postarchitecture): FPs and language, or source lines of code (SLOC)

Reuse
  Stage 1: implicit in model
  Stage 2: equivalent SLOC as a function of other variables
  Stage 3: equivalent SLOC as a function of other variables

Requirements change
  Stage 1: implicit in model
  Stage 2: % change expressed as a cost factor
  Stage 3: % change expressed as a cost factor

Maintenance
  Stage 1: application points, annual change traffic (ACT)
  Stage 2: function of ACT, software understanding, unfamiliarity
  Stage 3: function of ACT, software understanding, unfamiliarity

Scale (c) in nominal effort equation
  Stage 1: 1.0
  Stage 2: 0.91 to 1.23, depending on precedentedness, conformity, early architecture, risk resolution, team cohesion, and SEI process maturity
  Stage 3: 0.91 to 1.23, depending on the same factors as stage 2

Product cost drivers
  Stage 1: none
  Stage 2: complexity, required reusability
  Stage 3: reliability, database size, documentation needs, required reuse, and product complexity

Platform cost drivers
  Stage 1: none
  Stage 2: platform difficulty
  Stage 3: execution time constraints, main storage constraints, and virtual machine volatility

Personnel cost drivers
  Stage 1: none
  Stage 2: personnel capability and experience
  Stage 3: analyst capability, applications experience, programmer capability, programmer experience, language and tool experience, and personnel continuity

Project cost drivers
  Stage 1: none
  Stage 2: required development schedule, development environment
  Stage 3: use of software tools, required development schedule, and multisite development
At stage 1, application points supply the size measure. This size measure is an extension of the object-point approach suggested by Kauffman and Kumar (1993) and productivity data reported by Banker, Kauffman, and Kumar (1992). To compute application points, you first count the number of screens, reports, and third-generation language components that will be involved in the application. It is assumed that these elements are defined in a standard way as part of an integrated computer-aided software engineering environment. Next, you classify each application element as simple, medium, or difficult. Table 3.10 contains guidelines for this classification.
TABLE 3.10 Application Point Complexity Levels

For screens, complexity depends on the number of views contained and the number and source of data tables:

Number of views contained | Total < 4 (<2 servers, <3 clients) | Total < 8 (2–3 servers, 3–5 clients) | Total 8+ (>3 servers, >5 clients)
<3  | Simple | Simple | Medium
3–7 | Simple | Medium | Difficult
8+  | Medium | Difficult | Difficult

For reports, complexity depends on the number of sections contained and the number and source of data tables:

Number of sections contained | Total < 4 (<2 servers, <3 clients) | Total < 8 (2–3 servers, 3–5 clients) | Total 8+ (>3 servers, >5 clients)
0 or 1 | Simple | Simple | Medium
2 or 3 | Simple | Medium | Difficult
4+     | Medium | Difficult | Difficult
TABLE 3.11 Complexity Weights for Application Points
Element Type Simple Medium Difficult
Screen 1 2 3
Report 2 5 8
3GL component — — 10
The number to be used for simple, medium, or difficult application points is a complexity weight found in Table 3.11. The weights reflect the relative effort required to implement a report or screen of that complexity level.
Then, you sum the weighted reports and screens to obtain a single application-point number. If r percent of the objects will be reused from previous projects, the number of new application points is calculated to be

New application points = (application points) × (100 − r)/100
To use this number for effort estimation, you use an adjustment factor, called a productivity rate, based on developer experience and capability, coupled with CASE maturity and capability. For example, if the developer experience and capability are rated low, and the CASE maturity and capability are rated low, then Table 3.12 tells us that the productivity factor is 7, so the number of person-months required is the number of new application points divided by 7. When the developers’ experience is low but CASE maturity is high, the productivity estimate is the mean of the two values: 16. Likewise, when a team of developers has experience levels that vary, the productivity estimate can use the mean of the experience and capability weights.
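The whole stage-1 calculation can be sketched in a few lines, using the Table 3.11 weights; the element mix, reuse percentage, and productivity rate below are hypothetical (the rate echoes the Table 3.12 value quoted above).

```python
# Complexity weights for application points (Table 3.11).
WEIGHTS = {"screen": {"simple": 1, "medium": 2, "difficult": 3},
           "report": {"simple": 2, "medium": 5, "difficult": 8},
           "3gl":    {"difficult": 10}}  # 3GL components are always weighted 10

def new_application_points(elements, reuse_percent):
    """Weighted element count, reduced by the percentage reused."""
    total = sum(WEIGHTS[kind][level] for kind, level in elements)
    return total * (100 - reuse_percent) / 100

def person_months(app_points, productivity_rate):
    """Effort = new application points / productivity rate (points per PM)."""
    return app_points / productivity_rate

# Hypothetical application: 4 screens, 2 reports, 1 3GL component, 10% reuse.
elements = ([("screen", "simple")] * 3 + [("screen", "medium")]
            + [("report", "medium"), ("report", "difficult")]
            + [("3gl", "difficult")])
nap = new_application_points(elements, reuse_percent=10)
print(person_months(nap, productivity_rate=7))  # low experience, low CASE maturity
```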
At stage 1, the cost drivers are not applied to this effort estimate. However, at stage 2, the effort estimate, based on a function-point calculation, is adjusted for degree of reuse, requirements change, and maintenance. The scale (i.e., the value for c in the effort equation) had been set to 1.0 in stage 1; for stage 2, the scale ranges from 0.91 to 1.23, depending on the project’s precedentedness, conformity, early architecture and risk resolution, team cohesion, and process maturity.