SOFTWARE METRICS AND SOFTWARE METROLOGY


The Advanced Readings 1 section provides an analysis of the design of the Productivity Measure in ISO 9126-4. The Advanced Readings 2 section provides an illustration of measurement design issues identified in the analysis of the attributes and base measures in ISO 9126.

This reference model includes three views of the quality of a software product at the highest level — see Figure 10.1.

Figure 10.1. Quality in the software lifecycle (ISO 9126-1). This figure was adopted from Figure 2 (p. 3) of ISO/IEC 9126-1:2001(E)

An Organizational Reference Context Model

It should be noted that ISO 9126 does not provide any reference values for any of its quality characteristics and sub-characteristics. The analysis then proceeds by using the derived measures from the data preparation phase to populate the instantiated quality model determined in step 1 (step 3), and by comparing the results of step 3 with either the set of reference values or the measures determined in step 2 (step 4), in order to make a decision based on both the given information and the relevant information that is available to the decision maker.
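The comparison step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ReferenceValue` class, the `evaluate` function, the measure names and every number are hypothetical, since ISO 9126 itself supplies no reference values and each organization would have to define its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceValue:
    """Hypothetical organizational reference value for one derived measure."""
    threshold: float
    higher_is_better: bool


def evaluate(measured, references):
    """Compare each measured value with its reference value, if one exists."""
    verdicts = {}
    for name, value in measured.items():
        ref = references.get(name)
        if ref is None:
            # No reference value was defined locally, so no decision is possible.
            verdicts[name] = "no reference value"
        elif ref.higher_is_better:
            verdicts[name] = "acceptable" if value >= ref.threshold else "not acceptable"
        else:
            verdicts[name] = "acceptable" if value <= ref.threshold else "not acceptable"
    return verdicts


# Illustrative numbers only; they come from no standard or data set.
measured = {"task completion": 0.90, "failure density": 0.20, "error rate": 0.50}
references = {
    "task completion": ReferenceValue(threshold=0.80, higher_is_better=True),
    "failure density": ReferenceValue(threshold=0.10, higher_is_better=False),
}

for name, verdict in evaluate(measured, references).items():
    print(f"{name}: {verdict}")
```

In this sketch, a measure without a locally defined reference value yields no verdict at all, which is the situation an evaluator faces when relying on ISO 9126 alone.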

While the ISO 9126 quality models are described at a very high level, the relationships across the models, the quality attributes and the sub-attributes are neither well understood nor well described. Thus, in practice, using any of the ratios textually described in ISO 9126 represents an "attempt to quantify" without a prescribed standard, an organizational reference context or an empirically verified foundation [Abran et al.]. Also, neither ISO 9126 nor ISO 14598 (nor the forthcoming ISO 25000 series) proposes specific "reference models of analysis" or an inventory of "organizational reference contexts" with reference values and decision criteria.

THE METROLOGY-RELATED PART OF ISO 9126: BASE AND DERIVED MEASURES

In this example from ISO 9126, the derived metric (with the corresponding measurement units above) is assigned the following name: “Failure density against test cases.” Overall, this is a fairly poor foundation from a measurement perspective, and both practitioners and researchers need to be very careful in implementing, using and interpreting the numbers that result from using these ISO quantitative models. Notwithstanding the above comments, these ISO documents remain at the forefront of the state of the art on software measurement, and efforts are underway to improve them.

An overview of the derived measure designs in ISO 9126 is then presented from a metrological perspective, illustrating some of their weaknesses as measurement methods.

ANALYSIS OF DERIVED MEASURES

Analysis of the Derived Measures in ISO 9126-4: Quality in Use

Each of these gaps in the design of the derived measures presents an opportunity to improve the measures in the upcoming update of ISO 9126, that is, the ISO 25000 series. This analysis provides an illustration of the improvements needed for many of the software measures proposed to the industry.

The first three basic measures above refer to terms that are commonly used (i.e. task time, number of tasks and number of user errors), but these terms leave much room for interpretation, for example as to what a task entails. This latitude in the interpretation of these basic measures provides a rather weak basis for both internal and external benchmarking. The third basic measure, the number of user errors, is defined in Annex F of ISO TR 9126-4 as a “case in which test participants did not complete the task successfully, or had to attempt parts of the task more than once.”

This definition differs significantly from that in the IEEE Standard Dictionary of Software Engineering Terminology, where the term "error" is defined as "the difference between a computed, observed, or measured value or condition and the true, specified, or theoretically correct value or condition." The fourth basic measure, referred to as the "proportional value of each missing or incorrect component" in the task's outcome, is in turn based on yet another definition: each "potential missing or incorrect component" is given a weighted value A_i based on the extent to which it reduces the value of the product to the business or user. The three proposed derived measures for the effectiveness characteristic, which are defined as a prescribed combination of the basic measures mentioned above, inherit these weaknesses.
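To see how the weighted values A_i propagate into a derived measure, consider a minimal sketch. It assumes the task effectiveness formula takes the form M1 = |1 - sum(A_i)|, which is how it is commonly stated for ISO TR 9126-4 but is not reproduced in this section; the function name and the weights below are hypothetical.

```python
def task_effectiveness(component_values):
    """Return M1 = |1 - sum(A_i)| for the missing or incorrect components.

    The formula is an assumption about ISO TR 9126-4; each A_i is the
    proportional value assigned to one missing or incorrect component.
    """
    return abs(1.0 - sum(component_values))


# Two hypothetical evaluators weight the same two defects differently.
evaluator_a = [0.10, 0.25]
evaluator_b = [0.30, 0.40]

print(round(task_effectiveness(evaluator_a), 2))
print(round(task_effectiveness(evaluator_b), 2))

# Weights that sum to more than 1: the absolute value makes the result rise again.
print(round(task_effectiveness([0.80, 0.70]), 2))
```

The same task outcome yields roughly 0.65 for one evaluator and 0.30 for the other, purely because the weights are assigned by judgment. The third call suggests a further design issue under the assumed formula: when the weights sum to more than 1, the result increases rather than decreases.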

In summary, there is no assurance that the measurement results of the derived measures are repeatable and reproducible across measurers, across groups measuring the same software, or across organizations where a task may be interpreted differently and with different levels of granularity. The next two basic measures (tasks and errors) do not refer to any international measurement standard and must be defined locally. Error rate: The definition of the calculation of this derived measure provides two distinct alternatives for the elements of this calculation.
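The consequence of allowing two alternatives can be illustrated with a short sketch. It assumes that the two alternatives are a denominator expressed as a number of tasks and a denominator expressed as task time; the function names and the observation values are hypothetical.

```python
def errors_per_task(user_errors, tasks):
    """Error rate with the number of tasks as the denominator."""
    if tasks <= 0:
        raise ValueError("tasks must be positive")
    return user_errors / tasks


def errors_per_hour(user_errors, task_time_seconds):
    """Error rate with task time (converted to hours) as the denominator."""
    if task_time_seconds <= 0:
        raise ValueError("task_time_seconds must be positive")
    return user_errors / (task_time_seconds / 3600.0)


# One hypothetical test session: 6 user errors over 4 tasks in 30 minutes.
print(errors_per_task(6, 4))        # errors per task
print(errors_per_hour(6, 1800.0))   # errors per hour
```

The same session gives 1.5 errors per task or 12 errors per hour. The two results carry different measurement units, so they cannot be compared or benchmarked against each other even though both would be reported under the same measure name.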

For example, the derived performance and task completion measures are expressed as percentages and interpreted as the performance or completion of a particular task. For task completion and error rates, true values would depend on locally defined and rigorously applied measurement procedures, but without reference to generally recognized conventional true values. No other base measure in these derived performance measures refers to a common reference scale or to a locally determined one.

THE MISSING LINKS: FROM METROLOGY TO QUANTITATIVE ANALYSIS

Overview

Analysis of the Measurement of “Maturity”

Each of the seven proposed derived measures is described individually, as illustrated in the sidebox and Table 10.2 with "Failure density against test cases" as an example. The purpose of this derived measure in ISO 9126 is to answer the question: how many failures were detected during the defined test period? Its usage method is to count the number of failures detected and the number of test cases executed.
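The calculation implied by this usage method can be sketched as a simple ratio. The sketch assumes that the derived measure is the number of detected failures divided by the number of executed test cases; the function name and the counts are hypothetical.

```python
def failure_density(detected_failures, executed_test_cases):
    """Return failures per test case (assumed form: failures / test cases)."""
    if executed_test_cases <= 0:
        raise ValueError("executed_test_cases must be positive")
    if detected_failures < 0:
        raise ValueError("detected_failures must not be negative")
    return detected_failures / executed_test_cases


# Same software, same 12 failures, but two teams count test cases differently:
# one team counts 300 coarse test cases, the other splits them into 600.
print(failure_density(12, 300))
print(failure_density(12, 600))
```

Halving the granularity of what counts as a test case halves the resulting value (0.04 versus 0.02 failures per test case) without any change in the software, which shows how sensitive the derived measure is to the local definition of its base measures.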

However, none of the embedded base measures is precisely defined in ISO 9126, including failures and test cases. Each group within each organization must construct its own set of values for analysis within a specific context. What is the specific contribution of each of the seven derived measures above to the maturity sub-characteristic?

Is there some overlap between the relationships of any of these seven derived measures and, if so, to what extent? If not all seven derived measures are required, which one(s) are most representative of the maturity sub-characteristic, and to what extent? In practice, the resulting measure, Failure density against test cases, is only a contributor, that is, an indicator (see the definition of "indicator" in the sidebox in Section 4.5.1 of Chapter 4), within that part of the hierarchy of quality-related concepts.

None of the expected links between this (weak) metrological basis for measuring the basic and derived attributes and the quantification of the quality sub-attribute (e.g. maturity) and attribute (e.g. reliability) are described in ISO 9126.

OUTSTANDING MEASUREMENT DESIGN ISSUES

Finally, the ISO 9126 standard also contains a number of qualifications of the basic measurements that require further clarification from a measurement perspective. ISO 9126-4 claims that the five derived measures of the productivity attribute (see Table 10.1) assess the resources that users consume in relation to the effectiveness achieved in a specific context of use. The time taken to complete a task is considered the most important resource to take into account when measuring the productivity characteristic of quality in use.

Of the five proposed measures of productivity in ISO 9126-4, one is a basic measure: task time. This advanced reading section is of most interest to researchers and industry professionals interested in improving measurement standards, such as those in ISO 9126 and the forthcoming ISO 25000 series, and in setting priorities in the selection of the base measures to be improved. An inventory of the basic measures in ISO 9126 has identified 80 different basic measures [Desharnais et al. 2009]. Are the attributes to be measured in ISO 9126 described with sufficient clarity by these 80 basic measures to ensure the quality of the measurement results? Improving the design of these 80 basic measures is essential for the use of the ISO models in industry.

Occurrences of base measures within derived measures of ISO 9126.
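Counting how often each base measure occurs within the derived measures is one way to set priorities for improvement. The sketch below shows the counting idea only: the three-entry mapping is a made-up illustration, not the inventory of 80 base measures, and the composition given for each derived measure is an assumption.

```python
from collections import Counter


def count_base_measure_occurrences(derived_measures):
    """Count in how many derived measures each base measure appears."""
    counts = Counter()
    for base_measures in derived_measures.values():
        # A base measure is counted once per derived measure that uses it.
        counts.update(set(base_measures))
    return counts


# Hypothetical mapping of derived measures to their base measures.
derived_measures = {
    "Failure density against test cases": ["number of failures", "number of test cases"],
    "Error rate": ["number of user errors", "number of tasks"],
    "Task completion": ["number of tasks"],
}

occurrences = count_base_measure_occurrences(derived_measures)
for base_measure, count in occurrences.most_common():
    print(f"{base_measure}: {count}")
```

A base measure that appears in many derived measures, such as the number of tasks in this toy mapping, is a natural first candidate for a better design, since any improvement to it would benefit every derived measure built on it.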

Nevertheless, over the years the industry has developed several points of consensus on measuring the functional size of software. The “second” as a unit of measurement is already well defined and is part of the set of international standards for units of measurement. A first definition of the failure attribute could be proposed, but it should be revised in the context of each attribute, quality characteristic and sub-characteristic.

In addition to the 80 different base measures and over 250 derived measures, ISO 9126 (Parts 2, 3 and 4) includes a number of qualifications which characterize some aspect of the base measures (and corresponding distinct properties). Most of the time, the qualifications in the ISO 9126 quality model are added to objectives using a phrase, not just a word.

EXERCISES

Describe and explain the link between the Analysis Models and the derived measures in the ISO 9126 models of software quality.

How many of these base measures are related to international standards of measurement?

Name a base measure specific to software in ISO 9126 that is supported by a well-documented measurement method.

What is the measurement unit of the “task efficiency” derived measure?

TERM ASSIGNMENTS

Design an analysis model using the above 4 derived measures for the “productivity” characteristic that could be used to evaluate “productivity” and take

Select one of the base measures from ISO 9126-2 and improve its design

Select one of the base measures from ISO 9126-3 and improve its design

Identify and document a process to progressively develop a much larger consensus on the proposed improved design of one of the base measures from ISO

Take any of the quality sub-characteristics in ISO 9126 and discuss the methodology you would use to describe the linkages between the sub character-

Take any of the derived measures from ISO 9126 and identify sources of uncertainty in their measurement process

Propose an ISO 9126 quality model for Web software in your organization and provide