Evidence-based Software Engineering

The way forward is to dot the i's and cross the t's to understand the processes involved in building and maintaining software systems; work is also needed to replicate some of the dots to confirm they are not dust particles, and to discover missing dots. Thanks to the maintainers of R, CRAN, and the authors of the hundreds of packages used to analyze the data.

Introduction

What has been learned?

Replication

Software markets

The primary activities of software engineering

History of software engineering research

Folklore
Research ecosystems

Overview of contents
Software developers come preloaded with overlearned behaviors, derived from the native

Why use R?

Terminology, concepts and notation
Further reading

Replication is necessary,178 i.e., the results that a researcher claims must be reproducible by other people if they are to be widely accepted.

with some of the more general aspects of how people create and use categories: "The Big Book of Concepts" by Gregory L.

Figure 1.4: Tycho Brahe’s observations of Mars and a fitted regression model. Data from Brahe 240 via Wayne Pafko

Human cognition

Introduction

Evolutionary psychology132,137 is an approach to psychology that uses insights and principles from evolutionary biology to help understand the workings of the human mind. Details of the physical implementation, i.e., brain biology,962 also have an impact on psychological performance.

Figure 2.4: Probability that rat N1 will press a lever a given number of times before pressing a second lever to obtain food, when the target count is 4, 8, 12 and 16


Modeling human cognition
Embodied cognition

One picture was of a boat, and subjects were asked a question about the front of the boat and then asked a question about the back of the boat. The response time when the question changed from the front to the back of the boat was longer than when the question changed from one about portholes to one about the rear.

Figure 2.6: Example of the evolution of the accumulation of evidence for option "A", in a diffusion model.


Perfection is not cost-effective

Motivation

Built-in behaviors

A promotion focus is sensitive to the presence of positive outcomes, and seeks to ensure hits and to insure against errors of omission. A prevention focus is sensitive to the absence of negative outcomes, and seeks to ensure correct rejections and to insure against errors of commission.


Cognitive effort
Attention

The subjects were divided into three groups; one group had to exert little effort to detect the gray areas, another acted as a control, and the third group had to exert a lot of effort to detect the gray areas. The results showed that subjects who had to expend a large perceptual-motor effort detected the gray area less often than the other two groups.

Visual processing

Subjects faced with a high investment in perceptual-motor effort reduced their total effort by investing in memory. Most of the sensory information received by the brain does not require conscious attention; it is handled unconsciously.


Reading

These backward movements, called regressions, are caused by problems with language processing (for example, incorrect syntactic analysis of a sentence) and oculomotor errors (for example, the eyes overshoot their intended target). Characteristics of the writing system affect the asymmetry of the perceptual span and its width, e.g., the span may be smaller for Hebrew than for English (Hebrew words can be written without the vowels, which requires more effort to decode and plan the next saccade).

Figure 2.13: Perceived grouping of items on a line may be by shape, color or proximity


Memory systems

Short term memory

Alternatively, a long list of small numbers (each much smaller than the length of the list) is read faster, and with fewer errors, than a long list of numbers whose magnitude is similar to the length of the list. Readers may want to gauge their STM capacity using the list of numbers in the outer margin.

Figure 2.19: Example object layout, and the corresponding ordered tree produced from the answers given by one subject


The goldsmith whom the man she liked visited made the ring that won the prize given at the fair. The ring made by the goldsmith the man she liked to visit won the prize given at the fair.

Table 2.2: Words with either one or more than one syllable (and thus varying in the length of time taken to speak).


Episodic memory
Recognition and recall

Serial order information

A study by Chekaf, Gauvrit, Guida and Mathy341 examined subjects' recall performance on a range of similar items. A study by Adelson9 examined the organization present in the order in which subjects recalled previously memorized lines of code.

Figure 2.26: Semantic memory representation of alphabetic letters (the numbers listed along the top are place markers and are not stored in subject memory)


Forgetting

Learning and experience

Figure 2.33 shows a fitted power law and a fitted exponential; the fitted power law exponent, for the first series, is -0.5. The amount of practice needed to learn each pattern present in a task (to be able to perform it) depends on the complexity of the patterns.
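As a rough illustration of what fitting a power law to practice data involves, the following sketch fits time = a * n^b by least squares on log-transformed values. The data is synthetic (generated with an exponent of -0.5, the value quoted above, and an arbitrary scale of 20), and the name fit_power_law is invented for this example; it is not the analysis code behind Figure 2.33.

```python
import math

def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**b, done as a straight line in log-log space."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((x - mean_x) ** 2 for x in lx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    b = sxy / sxx                      # slope in log-log space is the exponent
    a = math.exp(mean_y - b * mean_x)  # intercept gives the scale factor
    return a, b

# Synthetic practice data (hypothetical): trial number and time taken.
trials = list(range(1, 51))
times = [20.0 * n ** -0.5 for n in trials]

a, b = fit_power_law(trials, times)
print(f"scale a={a:.2f}, exponent b={b:.2f}")
```

Fitting in log-log space weights relative, not absolute, errors; with noisy measurements a nonlinear regression on the untransformed values can give a different exponent.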

Figure 2.31: Fraction of correct subject responses, with fitted bi-exponential model in red (blue and green lines are its two exponential components)


Belief

The additional evidence was either positive (e.g., "The other players on Sandy's team did not show an unusual increase in batting average over the past five weeks") or negative (e.g., "The games in which Sandy showed his improvement were played against the team in last place in the league"). For some presentations subjects had to respond after seeing each piece of evidence (the step-by-step procedure); in the other presentations subjects did not respond until they had seen all the pieces of evidence (the end-of-sequence procedure).

Figure 2.37: Time taken, by the same person, to implement 12 algorithms from the Communications of the ACM (each colored line), with four iterations of the implementation process


Expertise

Studies of the experience of recognized experts in many fields have shown that the time between their initiation and the performance of their best work is at least 10 years, often with many hours of deliberate practice every day of the year. They found that both psychology and medical training strongly influenced statistical and methodological reasoning, while psychology, medicine, and law influenced the ability to perform conditional reasoning; training in

Figure 2.42: Lines of code correctly recalled after a given number of 2-minute memorization sessions; actual program in upper plot, scrambled line order in lower plot.


Category knowledge

When asked to name members of a category, the attributes of the examples already named are used as cues to retrieve other members with similar attributes. Table 2.3 lists the correspondence of each of the four possible object combinations with categories A and B.

Figure 2.45: Orthogonal representation of shape, color and size stimuli. Based on Shepard


Categorization consistency

Reasoning

Wason's study was first published in 1968, and treated mathematical logic as the norm against which the performance of human reasoning should be judged. The failure of many subjects to give the expected answer (i.e., one derived using mathematical logic) surprised many researchers, and over the years a wide variety of explanations, experiments, theses, and books have attempted to explain the answers given.

Figure 2.49: A commercial event involving a buyer, seller, money and goods; as seen from the buy, sell, pay, or charge perspective

asking about what is possible or what is necessary. The hypothesis was that subjects

Deductive reasoning
Linear reasoning

Subjects were presented with two relationship statements involving three people, and a possible conclusion (e.g., "Is Mantle worse than Moskowitz?"), and were given 10 seconds to answer "yes", "no", or "don't know". Table 2.6 shows the percentage of correct answers: a higher percentage of correct answers was given when the direction was better-to-worse (case 1) than when the direction was mixed (cases 2 and 3); the consistent worse-to-better direction performed poorly (case 4).

Table 2.5: Percentage of subjects accepting that the stated conclusion could be logically deduced from the given premises

A higher percentage of correct answers was given when the premises stated an end term (better

Causal reasoning

Subjects, recruited via Mechanical Turk, were asked to infer the causal relationships that existed between three nodes, by completing 12 tests for each of 15 presented problems. Clicking on a node activated it, and clicking on the test icon caused zero or more of the other two nodes to become activated (depending on the causal relationships that existed between the nodes; a node had to be activated for its activation to propagate); subjects were told that the causal links (unknown to them) worked 80% of the time, and that 1% of the time a node activated on its own.
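The behavior of one test can be sketched as a small simulation. This is a guess at the mechanics based only on the description above, not the software used in the study: the function run_test, the chain A -> B -> C, and the decision that a spontaneously activated node does not propagate further are all assumptions made for illustration.

```python
import random

LINK_PROB = 0.8          # a causal link works 80% of the time (from the text)
SPONTANEOUS_PROB = 0.01  # a node activates on its own 1% of the time (from the text)

def run_test(nodes, links, clicked, rng):
    """Simulate one press of the test icon; return the set of active nodes.

    nodes:   all node names
    links:   dict mapping a cause to the list of nodes it can activate
    clicked: the node the subject activated before running the test
    """
    active = {clicked}
    frontier = [clicked]
    while frontier:
        cause = frontier.pop()
        for effect in links.get(cause, []):
            if effect not in active and rng.random() < LINK_PROB:
                active.add(effect)
                frontier.append(effect)
    # Simplification: a node that activates on its own does not propagate further.
    for node in sorted(set(nodes) - active):
        if rng.random() < SPONTANEOUS_PROB:
            active.add(node)
    return active

# Hypothetical problem: a causal chain A -> B -> C.
NODES = ["A", "B", "C"]
CHAIN = {"A": ["B"], "B": ["C"]}

rng = random.Random(0)
for test in range(12):  # 12 tests per problem, as in the study
    print(test + 1, sorted(run_test(NODES, CHAIN, "A", rng)))
```

With unreliable links and occasional spontaneous activation, 12 tests give subjects only a noisy view of the underlying structure, which is what makes the inference task difficult.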

Number processing

Figure 2.51 shows some possible causal relationships, e.g., for the three top left nodes, clicking the test icon when the top node was activated would result in the left/bottom node being activated (80% of the time). A second experiment included reminder information on the screen and a summary of the results of the previous test; the mean score increased to 11.1 (sd=3.5) and 12.1 (sd=2.9) when the results of the previous test were on the screen.


Numeric preferences
Symbolic distance and problem size effect

A study by Cummins421 examined the impact of number granularity on the range of values that subjects assign to different types of numerical expressions. A study by Tzelgov, Yehene, Kotler and Alon1860 examined the impact of implicit learning on the symbolic distance effect.

Figure 2.57: Min/max range of values (red/blue lines), and best value estimate (green circles), given by subjects interpreting the value likely expressed by statements containing "less than 100" and "more than 100"


Estimating event likelihood

Figure 2.62 is based on data from the first and last series of 200 dots seen by each subject. The responses of subjects in the constant blue group did not change between the first and last 200 dots (red and green lines).

High-level functionality

Personality & intelligence
Risk taking

The subjects were divided into two groups: for one group, the percentage of blue dots did not change over time, while for the other group, the percentage of blue dots was reduced after a subject had seen 200 dots. For the decreased blue group, after subjects experienced a decrease in the probability of encountering a blue dot, the color of what they perceived to be a blue dot shifted toward purple (blue and purple lines).

The term risk asymmetry refers to the fact that people have been found to be risk averse

Decision-making

In one experiment, subjects were asked to decide whether an object was a rotated version of a second object (i.e., the task in Figure 2.8), and to rate their confidence in their answer (on a scale of 0 to 6, with 6 being the most confident). When the majority chose the most extreme wrong answer (i.e., the shortest line), subjects who gave a wrong answer chose the less extreme wrong answer 20% of the time.

Figure 2.63: Fitted regression model for probability that a subject, who switched answer three times, switches their initial answer when told a given fraction of opposite responses were made by others (x-axis), broken down by confidence expressed in thei


Expected utility and Prospect theory
Overconfidence
Time discounting
Developer performance

However, studies have found1613 that while people use their memories of the duration of past events to predict the duration of future events, their memories systematically underestimate past duration. One operational characteristic of the brain that can be estimated is the number of operations that could be performed per second (a commonly used method of evaluating the performance of silicon-based processors).

Figure 2.67: Perceived present value (moving through time to the right) of two future rewards


Miscellaneous

In deriving this relationship, Fitts drew on ideas from information theory and used a simplified version of Shannon's law. Hick's law: the time required, RT, to select an item from a list of K items is RT = a + b log(K), where a and b are constants; a is smaller for humans than for pigeons.1898
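Hick's law is simple enough to express directly in code. The sketch below assumes a base-2 logarithm (the text above does not state the base; changing it only rescales b), and the constants are made-up values chosen purely to show the shape of the relationship. In practice a and b have to be fitted to measurements.

```python
import math

def hick_rt(k, a, b):
    """Hick's law: time to select one item from a list of k items."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return a + b * math.log2(k)

# Hypothetical constants, in seconds; real values must be fitted to measurements.
A, B = 0.2, 0.15
for k in (2, 4, 8, 16):
    print(k, round(hick_rt(k, A, B), 3))
```

The logarithm means that each doubling of the number of items adds the same fixed amount, b, to the selection time.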

Table 2.8: Defects detected by six testers (left two columns; some part-time and one who left the company during the study period), the percentage assigned a given status (next three columns), and percentage outcomes assigned by others


Cognitive capitalism

Introduction

Figure 3.2 shows annual expenditures from 1959 to 1998 by US corporations (plus lines) and the US federal and state governments (smooth lines). Figure 3.3 shows the growth in the number of people employed by several large software companies.

Investment decisions

Some governments have recognized the importance of national software ecosystems,1404 both in economic terms (e.g., industry investment in software systems347 that keep them competitive), and as a means of self-determination (i.e., not having important infrastructure dependent on companies based in other countries); there is no shortage of recommendations1792 for how to nurture IT-based businesses, and government-funded reviews of their national software business.1194,1600 Several emerging economies have created sizeable software industries.77 A study by Mulford and Misra1324 of 100 companies in Standard Industry Classifications (SIC) 7371 and 7372, with revenues of more than $100 million, found that total software development costs were approximately 19% of revenues (see figure 3.4); sales and marketing ranges from 22% to 40%,335 general and administrative (e.g., salaries, rent, etc.) ranges from 11% to 22%,335 with any remainder allocated to profit and associated taxes.


Discounting for time
Taking risk into account
Incremental investments and returns

The discount rate represents the risk-free element, and the closest things to a risk-free investment are bonds and government securities (information on these rates is freely available). Governments face a circular problem in how they calculate the discount rate for their investments.
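To make the role of the discount rate concrete, the following sketch applies the standard compound-discounting formula, PV = FV / (1 + r)^n, to a sequence of yearly cash flows. The formula is the textbook one and is assumed here, since it is not shown in this excerpt; the function names, the 5% rate and the cash flows are invented for illustration.

```python
def present_value(future_value, rate, years):
    """Value today of an amount received `years` from now, discounted at `rate`."""
    return future_value / (1.0 + rate) ** years

def net_present_value(rate, cash_flows):
    """Sum of discounted cash flows; cash_flows[t] occurs at the end of year t (t=0 is today)."""
    return sum(present_value(cf, rate, t) for t, cf in enumerate(cash_flows))

# Hypothetical project: spend 100 today, receive 40 at the end of each of the next three years.
flows = [-100.0, 40.0, 40.0, 40.0]
npv = net_present_value(0.05, flows)
print(f"NPV at 5%: {npv:.2f}")
```

A positive result means the discounted returns exceed the initial outlay at the chosen rate; raising the rate shrinks the value of the later cash flows and can turn the same project negative.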


Investment under uncertainty
Real options

This equation contains a drift term (given by the function a, which depends on x and time) over a time increment, dt, plus a Wiener process increment, dz (the random component; also known as a Brownian process), scaled by the function b, which also depends on x and time. This equation can be solved to find the standard deviation in the value of x, in terms of σ.

Figure 3.8: Illustration of a drift diffusion process. Green lines show possible paths, red lines show bounds of diffusion and grey line shows drift with no diffusion component
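A drift diffusion process of this kind can be simulated by taking many small time steps. The sketch below assumes the simplest case, constant drift and constant volatility, i.e., dx = alpha*dt + sigma*dz; the names alpha, sigma and simulate_path, and all parameter values, are assumptions made for illustration rather than values taken from the text. For this constant-coefficient case the value of x after time T has mean alpha*T and standard deviation sigma*sqrt(T), which the simulation can be checked against.

```python
import math
import random
import statistics

def simulate_path(alpha, sigma, total_time, steps, rng):
    """Euler-Maruyama simulation of dx = alpha*dt + sigma*dz, starting at x = 0."""
    dt = total_time / steps
    x = 0.0
    path = [x]
    for _ in range(steps):
        dz = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment: mean 0, variance dt
        x += alpha * dt + sigma * dz
        path.append(x)
    return path

# Hypothetical parameters: drift 0.5 per unit time, volatility 1.0, horizon T = 4.
rng = random.Random(1)
finals = [simulate_path(0.5, 1.0, 4.0, 50, rng)[-1] for _ in range(4000)]
print("mean of x(T):", round(statistics.mean(finals), 2))   # theory: alpha * T
print("sd of x(T):  ", round(statistics.stdev(finals), 2))  # theory: sigma * sqrt(T)
```

Each call to simulate_path produces one possible path of the kind drawn in green in Figure 3.8; setting sigma to zero leaves only the drift line.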


Capturing cognitive output

Intellectual property

Patents grant their holder exclusive use of the claimed invention in the jurisdiction that granted the patent. An analysis768 of the stock price of US ICT companies found that those with software patents had a slightly higher value than those without software patents.

Figure 3.11: Bug bounty payer (left) and payee (right) countries (total value $23,632,408)


Bumbling through life

The Open Source Initiative (OSI) is a non-profit organization that has defined what constitutes Open Source and maintains a list of licenses that meet this definition. Open source licensing ideology has been applied to other goods, in particular the hardware on which software runs.221

Figure 3.16: Survival curve of OSI licenses that have been listed on the approved license webpage, in days since 15 August 2000, with 95% confidence intervals.

ecosystems, the concept of career progression is less likely to apply. Some software

Expertise

The level of skill needed to get a job involving software development is judged relative to the skills of those who apply for the job; employers may have to settle for anyone who demonstrates the necessary basic competency. Higher education once served as a signaling system,1746 used by employers looking to recruit people early in their professional careers (i.e., high cognitive power used to be required to earn a university degree).

Group dynamics

Once a good enough level of programming knowledge is achieved, if the application domain changes more slowly than the software environment, learning more about the application domain can provide a higher ROI (for the individual) than improving programming proficiency (because acquired application knowledge/skills have a longer useful life). When a certain level of knowledge is reached, such people stop learning and focus on applying what they have learned; in work and sport, there is a difference between training for an activity and performing it.


Maximizing generated surplus
Motivating members
Social status

A study by Chandler and Kapelner320 examined the impact on subjects' performance of how meaningful they believed a task to be. Social animals pay more attention to group members with a higher social status (however that is measured).


Social learning
Group learning and forgetting

The rate of change depended on group size and relative subgroup size; see Github–economics/Centola-Becker.R. The impact of the transfer of information about new products is discussed in chapter 3.6.3.

Figure 3.20: Hours required to build a car radio after the production of a given number of radios, with break periods (shown in days above x-axis); lines are regression models fitted to each production period


Information asymmetry
Moral hazard
Group survival

Figure 3.24 shows that the interval, in months, between the announcement date and the promised product availability date has little correlation with the interval between the promised and actual delivery date. A product's reliability may only become apparent after extensive use, e.g. the number of errors experienced.

Figure 3.24: Interval between product announcement date and its promised availability date, against interval between promised date and actual date the product became available; lines are a fitted regression model of the form:


Group problem solving

When the solution includes s subproblems, the probability of success of the group (i.e. the model of the combination of individuals) is: Pg=. The lines connect the pattern of time/percentage success for answers to the same problem in each time limit group.

Figure 3.26: Percentage of individuals (x-axis) who correctly generated a solution, against mean response time, for 144 problems; colors denote time limits, and a sample of lines connecting performance pairs for the same program


Cooperative competition
Software reuse

There are specific error patterns that result from copy and paste errors.169 Creating reusable software may require more investment than is necessary to create a non-reusable version of the software. Table 3.2 shows that Linux subsystems contain a significant percentage of replicated sequences from their own source; replication between subsystems is less common (the same pattern was seen in FreeBSD 5.2.1).

Company economics

In a large organization, enterprise-level reuse can be worthwhile, but the costs and benefits can be spread among many groups who have no reason to invest their resources in the common good.599 Reasons not to reuse code include: the cost of conducting due diligence to ensure intellectual property rights are respected (clones of code appearing on Stack Overflow have been found in Android Apps53 with incompatible licenses, and in Github projects127), ego (e.g., being recognized as the author of the functionality) and hedonism (the pleasure of inventing a personal wheel creates an incentive not to use someone else's code). Developers are likely to be familiar with their own code and the code they regularly encounter.


Cost accounting
The shape of money
Valuing software

Commercial businesses are required to maintain accurate financial records, the purpose of which is to provide vital information to those with a financial interest in the business, including governments seeking to tax profits. An expense is tax deductible in the fiscal year in which it occurs, but the software does not then appear as having value in the company accounts; the value of a capitalized item depreciates over time (i.e., a percentage of its value is tax-deductible over a period of years), but it has a value in the business accounts.1001 Business accounts can be driven by a desire to project a certain image to interested outsiders (for example, that the company is worth a lot because it has valuable assets4), or to minimize tax liabilities.

Maximizing ROI

However, for accounting purposes, software can be valued in terms of the cost of its production. An organization looking to purchase a software system has the option of paying for its implementation, and the cost of creating a software system is one approach to its valuation.


Value creation
Product/service pricing
Predicting sales volume

Microsoft's developer-friendly policy kept the price of C/C++ compilers under Windows low, relative to other platforms. For example, the cost of producing an item can increase/decrease, shifting the supply curve up/down the price axis (for software, the cost of production is the cost of creating the software system); or customers may find a cheaper alternative, shifting the demand curve down the price axis.

Figure 3.38: Examples of supply (lower) and demand (upper) curves. Github–Local
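The effect of shifting a supply curve along the price axis can be shown with a deliberately simplified model. The sketch below assumes straight-line curves, demand p = d0 - d1*q and supply p = s0 + s1*q, which need not match the curves plotted in Figure 3.38; the function name and all coefficients are invented for illustration.

```python
def equilibrium(d0, d1, s0, s1):
    """Intersection of demand p = d0 - d1*q and supply p = s0 + s1*q."""
    quantity = (d0 - s0) / (d1 + s1)
    price = s0 + s1 * quantity
    return quantity, price

# Hypothetical market, then the same market after production costs rise
# (the supply curve shifts up the price axis by 15).
base = equilibrium(d0=100.0, d1=2.0, s0=10.0, s1=1.0)
costlier = equilibrium(d0=100.0, d1=2.0, s0=25.0, s1=1.0)
print("before:", base)
print("after: ", costlier)
```

In this model an upward shift of the supply curve moves the intersection to a higher price and a lower quantity; lowering d0 models the demand curve shifting down, which lowers both.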


Managing customers as investments

Figure 3.42 shows the number of transactions closed per week of the quarter, and the average agreed discount. Reasons for the significant spike in the number of deals closed at the end of the quarter include sellers gaming the system to maximize commission and customers holding out for a better deal.

Figure 3.42: Percentage of sales closed in a given week of a quarter, with average discount given


Commons-based peer-production