Is Agent-Based Modelling the Future of Prediction?
Edmund Chattoe-Brown, School of Media, Communication and Sociology, University of Leicester, [email protected]
Abstract

This article argues that Agent-Based Modelling, owing to its capabilities and methodology, has a distinctive contribution to make to delivering coherent social science prediction. The argument has four parts. The first identifies key elements of social science prediction induced from real research across disciplines, thus avoiding a straw person approach to what prediction is. The second illustrates Agent-Based Modelling using an example, showing how it provides a framework for coherent prediction analysis. As well as introducing the method to general readers, argument by example minimises generic discussion of Agent-Based Modelling and encourages prediction relevance. The third deepens the analysis by combining concepts from the model example and prediction research to examine distinctive contributions Agent-Based Modelling offers regarding two important challenges: Predictive failure and prediction assessment. The fourth presents a novel approach – predicting models using models – illustrating again how Agent-Based Modelling adds value to social science prediction.

Keywords: Agent-Based Modelling, Prediction, Research Design, Validation, Methodology.
1. Introduction

Prediction is a notoriously contentious and conceptually challenging aspect of social science. In this article, I show how viewing it through the lens of a novel research method (the computer simulation technique called Agent-Based Modelling) offers both conceptual clarification and novel research tools. To avoid handwaving, I begin by analysing real research across the social sciences to see how prediction is done in practice, generalising to identify core interdisciplinary elements for subsequent analysis. Then, to minimise generic discussion, I introduce Agent-Based Modelling through an example specifically chosen to focus on prediction. This discussion illustrates how Agent-Based Modelling offers a coherent framework for analysing predictions. The two main sections of the article show how, building on real research and the distinctive approach of Agent-Based Modelling, it can contribute to three important challenges in social science prediction: Predictive failure, assessing predictions and evaluating predictive approaches when the nature of the underlying social process is, of necessity, imperfectly known. The final section sums up the contribution of the article (and of Agent-Based Modelling).
2. What is Social Science Prediction? Induction from Real Research

Prediction in social science has a long history across many disciplines. The aim of this section is therefore to identify its common features inductively (by focusing on the arguments of real research), thus supporting the relevance of the subsequent Agent-Based Modelling discussion. This approach both avoids straw person claims about what prediction is and reduces bias towards particular approaches (though space does not allow coverage of prediction in all disciplines). The first example will be described in detail to illustrate important concepts in prediction but, again for space reasons, later examples will just be sketched to confirm existing claims or support new ones.
The first example (Burgess and Cottrell 1936) comes from sociology, a field which used to publish prediction research regularly in prestigious journals but has now stopped.1

1 As of 17.06.21, for example, the flagship American Journal of Sociology reports the following articles with prediction in the abstract using a JSTOR search: 2020s (0), 2010s (0), 2000s (3), 1990s (3), 1980s (4), 1970s (4), 1960s (8), 1950s (9), 1940s (11), 1930s (4), 1920s (2).
Burgess and Cottrell research what they call marital adjustment: how happily married people are. They hypothesise predicting this adjustment using relatively measurable partner characteristics. Immediately, two crucial aspects of prediction appear: Research design and aims. The best research design for this study has characteristics measured before marriage with adjustment measured subsequently. This avoids the possibility that rationalisation can increase apparent association. However, such longitudinal designs are more costly and suffer distinctive data problems (like sample attrition, exactly because some marriages fail). It is also important to understand why one would want to predict. One common aim is avoiding negative outcomes in society. In this case, if marrying someone with a quick temper is likely to produce unhappiness then individuals may choose not to.
We now face a general problem with older studies, which is that researchers did not seem to see research design issues as clearly as we do now. The article strongly suggests that data were collected cross-sectionally after marriage so that happiness ratings were taken at the same time as reports about whether (for example) the couple did activities outside the home together. Furthermore, some items are clearly not about characteristics (for example whether partners are tolerant or quick tempered – as measured by psychological scales) but about behaviours (sharing activities) or practices (agreeing how to handle in-laws) which might reasonably change. This makes interpreting the associations predictively problematic. It is one thing if tolerant partners make good marriages but quite another if good marriages motivate sharing activities or agreement about in-laws. (This issue about causal order is well known in statistics – see Davis 1986.) I suggest, however, that research that would now be done better is still not valueless. This style of prediction is still a coherent and useful thing to attempt (but of course that differs from actually succeeding).
This discussion leads to another crucial dimension of prediction, namely, why it might work at all. There are competing intuitions about this but they are just intuitions: the aim is to design research actually proving or disproving claims. So it is with prediction. If one accepts relatively stable psychological dispositions (which are themselves empirically supported – see, for example, Conley 1985), one can easily see how being tolerant might help marriage partners to cope generally with negative events like unemployment. Equally, however, it would be implausible to claim that no endogenous processes (like creating shared experience or mutual adaptation) will affect marital happiness. Or that there aren't phenomena (like adultery or alcoholism) that may be beyond the protective capacity of dispositions and interactions (see, for example, Previti and Amato 2004). But then a properly designed piece of research is exactly what establishes whether psychological variables can predict marital outcomes.
Another interesting aspect of analysing specific research is that early prediction did not develop independently across disciplines. My next example (Sarbin 1944) is published in a good psychology journal but the author also published in sociology journals. Sarbin's article is conceptual rather than empirical but makes an important point for my argument (while confirming the importance of research design and that avoiding negative outcomes is a recurring goal of prediction). Sarbin makes a key distinction between what he calls actuarial prediction (which is what Burgess and Cottrell do in relating variables to outcomes) and individual prediction (often involving expert assessment).2 This is clearly very important in criminology (for example) where the decision to parole someone may literally have life and death consequences. There is a tension here between general discomfort with supposing that simple models might predict at all and the possibility that (although expert judgements could be far more nuanced and individual) they might just not perform well. (Indeed, that is what Sarbin 1943 appears to show.) The other important concern that Sarbin raises is that it may simply be fallacious to apply actuarial probabilities to individuals. If people like Bob (according to the model) have a 72% chance of breaking the terms of their parole then Bob does not have a 72% chance of doing this. He either will or he will not (and that will depend on why people like Bob have a 72% chance of breaking parole, including characteristics that researchers haven't yet modelled). While it seems hard to dispute the logic of this point (you cannot predict the spin of one coin by knowing that many spins come out 50/50) the implications for prediction are less clear. Part of the problem is the absence of a stated mechanism in such accounts. Bob could have a 72% chance of breaking parole if the outcome for each prisoner resulted from an independent dice roll (but that seems implausible). On the other hand, if reoffending was perfectly predicted by some non-modelled phenomenon (like recurrent toothache) which nonetheless correlated with some model variables (like poverty or rural residence) then Bob's actual chance of breaking parole could be very different from the actuarial prediction. This is part of a wider difficulty in keeping a clear conceptual distinction between what Hendry and others call the Data Generation Process – hereafter DGP (Hendry and Richard 1983) – that is, the actual set of social processes giving rise to the data collected, and attributed theory/model accounts of these. Hendry's key point is that we cannot start from the assumption that any model is "true" because the nature of abstraction is such that this assumption cannot be correct. If researchers believe breaking parole is caused by IQ (theory) and IQ merely correlates with what actually causes it (DGP) then the theory will be weakly confirmed (but erroneously). This style of theorising also creates a problem because there needs to be a clear mechanism by which general traits can cause decisions (for example to abscond). At this stage, all I can do is draw attention to the role of mechanism in the possibility of effective prediction.

2 An interesting point arises here. The arguments I present don't depend on whether prediction models are explicit. An expert can predict well even if they cannot explain how. The same issue arises in machine learning. As long as we design predictive research rigorously, we might trust algorithms even if we cannot understand them.
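The gap between an actuarial rate and an individual chance is easy to demonstrate concretely. The following sketch (a hypothetical illustration in Python; the cohort size, the 72% rate and the toothache trait follow the invented example above) contrasts two DGPs that produce the same aggregate violation rate: one where each prisoner genuinely faces an independent 72% risk, and one where the outcome is fully determined by an unmodelled trait, so that any named individual's actual risk is either zero or one.

```python
import random

random.seed(1)
N = 100_000  # a hypothetical cohort of parolees

# DGP A: every parolee independently faces a genuine 72% risk
# (an independent dice roll per prisoner).
violates_a = [random.random() < 0.72 for _ in range(N)]

# DGP B: violation is fully determined by an unmodelled trait
# (recurrent toothache, say) present in 72% of the cohort, which the
# actuarial model only sees through correlated proxies like poverty.
toothache = [random.random() < 0.72 for _ in range(N)]
violates_b = toothache  # deterministic given the trait

# Both DGPs reproduce the same actuarial rate of roughly 0.72...
print(sum(violates_a) / N, sum(violates_b) / N)

# ...but under DGP B the risk for any named individual is 0 or 1, not 0.72.
bob = 0  # an arbitrary index standing in for Bob
print("Bob's risk under DGP B:", 1.0 if toothache[bob] else 0.0)
```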
I can develop this argument further (and cover prediction in another discipline) by considering work by Ohlin and Duncan (1949). Again there is no clear disciplinary boundary (with research labelled as criminology appearing in a core sociology journal). Ohlin and Duncan's key point is that even allowing for good research design, effective measures of predictive success are needed. Their research reiterates the observation that prediction tends to survive in fields seeking avoidance of negative social outcomes like crime. (In fact, criminology is a rare discipline where prediction research was still consistently represented until recently – see for example Brennan and Oliver 2013.)
Unlike Burgess and Cottrell, the research design of typical criminological prediction studies is fairly clear. Data is collected about incarcerated prisoners. From this, models are developed predicting who will break parole on release. If such models predicted perfectly, having fewer than a certain number of favourable factors (like good home circumstances) would result in broken parole while having more would result in adherence. Unsurprisingly, what actually arises is two overlapping peaks (Ohlin and Duncan 1949, p. 442). People with few favourable factors will probably break parole. People with many probably will not, but those in the middle could go either way. But there are obvious caveats to this approach relevant to subsequent arguments. The first is that data can only be about known parole violations (which aren't an unbiased sample of all violations). The second is that prediction effectiveness may depend on whether the parole decision is itself independent from attributes of the criminal.3 Ideally, the comparison would be for a common crime like domestic burglary and the parole mechanism would be non-selective (free prisoners after 70% of their sentence automatically). Murderers may never qualify for parole or only after much longer sentences. Parole boards may also expect much stronger evidence of rehabilitation. This implies that a comparison of outcomes for all crimes (which is what Ohlin and Duncan offer) does not constitute an independent homogenous sample.
Another well-known prediction challenge can be found in demography: forecasting future population. The reasons for doing this are both practical (how many schools might be needed?) and, again, to avoid negative outcomes (what must be done about human sustainability?) In a useful review, Booth (2006) makes a key distinction between extrapolative methods (what is predictable about future population from past population) and structural methods (any model based attempt to predict). Interestingly, her main critique of structural methods is the risk of misspecification (omitting effects that are actually causal) owing to the weak state of demographic theory. The problem with extrapolative methods is fairly obvious. Why suppose that sufficient information is contained in past aggregates to determine future aggregates? This problem has three related aspects (all of which are relevant to the possible distinctive contribution of Agent-Based Modelling). The first is our conceptual understanding that birth rate results from large numbers of individual decisions within a social and regulatory context. This being so, it is unlikely that aggregate values are directly causal such that the birth rate next year would follow from the birth rate this year. The fact that this appears to happen actually results from individual level stabilities (beliefs about suitable family sizes, for example). The second aspect is that this approach does not have access to the data underpinning predictive failure. If, for example, there was an endogenous tendency towards smaller families, this would change the trend but extrapolation would only reveal this fully after the fact. The third aspect, which lacks clear conceptualisation in existing research, is the role of policy. Society often wants to falsify predictions of negative outcomes (not releasing people who might violate parole, not allowing human population to become unsustainable). This being so, intervention occurs precisely to change the underlying process that extrapolative methods fail to access. Retrospectively, extrapolation shows that a policy worked but can neither predict its effect accurately nor make good on its ex ante prediction (absent the policy).

3 There is also a counterfactual problem for all prediction that changes the environment. If someone is never given parole they can never violate it. It is then impossible to tell whether they would have done so had the decision been otherwise.
The final example covers two different disciplines in applying economic prediction methods to epidemics (Doornik et al. 2020). Economics echoes the tension between extrapolative and model based prediction but adds to my overview in several ways. Firstly, Doornik et al. also emphasise that model based prediction suffers from imperfect theory. Secondly, epidemics reiterate the problems of extrapolation in that society wants to falsify predictions of negative social outcomes (in this case COVID deaths). Thirdly, Doornik et al. produce very short range predictions (over weeks). This draws attention to the information content of past data and the surprise potential inherent in predictions. If the number of murders in a particular town was growing 500, 1000, 1500, 2000 … each year (and suppose, implausibly, that there was no plan to intervene) then a prediction of 2500 would be sensible but also completely unimpressive given the trend. On the other hand, a prediction of 1000 would be completely unjustified in trend terms but hugely impressive (if realised) because it would imply that the predictive model captured the DGP well enough to detect and quantify a turning point before it was realised in the trend. Thus another ingredient in convincing prediction is the need to operate far enough ahead that predictions can have information content beyond merely following the trend. Finally, economics reveals a further complication: that another aim of prediction (for example predicting stock prices) is profit. But just as society wants epidemic predictions to be falsified as negative social outcomes, so predictions for profit may falsify (or self-fulfil) themselves. (You predict that stock will go up so you buy it and it goes up. Or you predict that stock will go up and others try to second guess your prediction using futures and the stock actually goes down.) In the Burgess and Cottrell case, knowing what makes marriages work may merely reduce the number of unhappy people with no other social externalities, but certain economic predictions illustrate the opposite extreme where whole markets consist of people trying to second guess each other and may therefore become fundamentally unpredictable. The takeaway message is that careful thought must be given to who is predicting for what purpose and what the social consequences of such predictions might be.
Finally, and it is interesting that this did not arise in the examples, prediction ethics must be considered. Suppose one really could show that Bob had a 78% chance of reoffending based on compelling evidence. Could society then justify denying him parole? What is an acceptably low chance of reoffending? Realistically it cannot be zero. Thus, even if it were possible to develop accurate predictions, that might not exhaust the social challenge.
To sum up then, analysing prediction research across disciplines suggests a consensus about its core components:

• It needs an appropriate research design (probably longitudinal) so that predicted outcomes clearly occur after supposed predictors.

• In designing predictive research, the possible impact of predictions needs to be considered as part of the design. If marital happiness is predicted based on attributes then, while this may change who marries whom, there is no reason to suppose that it will change the underlying mechanism supporting the prediction (grumpy people remain hard to live with). By contrast, while society wants epidemic predictions falsified, it is problematic if (by falsifying them) their models become untestable.

• A clear conceptual framework is needed to show how different prediction approaches function and, in particular, to support the difficult task of thinking clearly about temporal logic. In a rising trend, it is probably impossible for model based prediction to outperform extrapolation but the situation is reversed when (in the future) there is a turning point which the extrapolative method will miss (at least until it is too late). Some way is needed of characterising the information content of existing data (and the amount of latitude in models) so that the claim that one approach to prediction really is outperforming another can be convincingly justified. (I will return to this in section 5.)

• The same clear conceptual framework is also needed for assessing claims about mechanism. What is it that remains stable (and what changes) such that prediction is viable? This is easy to see for stable psychological dispositions (being tolerant making happy marriages) but much harder for other possibilities (like the long-term persistence of social practices or homeostasis – increased birth rate in resource limited societies simply leading to increased death rate). Furthermore, this framework needs to say something intelligible about the relationship between individual choices and aggregates so that sense can be made of social change and policy.
The generality resulting from this inductive approach to real research ensures that discussion of Agent-Based Modelling in subsequent sections can contribute to actual practice.
3. How Does Agent-Based Modelling Frame Prediction? A Worked Example

Agent-Based Modelling (Gilbert 2020) is a technique that involves representing social processes as computer programmes rather than as equations (in regression for example) or narratives (as in ethnography). It is also distinctive in attempting to represent these social processes directly rather than, for example, just solving pre-existing theoretical equations by computer (rather than with a pencil). The best way to illustrate these points clearly (particularly with reference to prediction) is to use an example. The point of the example is therefore not to be empirically accurate but to explain clearly how Agent-Based Modelling is distinctive and how it captures the key components of prediction identified in the previous section. The example chosen is the "Wolf Sheep Predation" Agent-Based Model (hereafter ABM).4 In this ABM, sheep eat grass (which is depleted but recovers depending on the sheep population) and reproduce (which puts more pressure on the grass). Wolves (which also reproduce) eat sheep and thus their population expands with larger sheep populations but contracts when these are smaller. Thus the current state of the grass and the sheep/wolf populations depends on past interplay between these species. The ABM also includes parameters which shape the overall system behaviour. These are the initial numbers of sheep and wolves, the amount of energy sheep and wolves get from eating and the chance that each species will reproduce.5 Although this ABM is simplistic and therefore subject to almost infinite criticism both conceptual and empirical, in terms of explanation it is both concise and precise. The exact state of the simulated world at time zero is known, as are all the processes and parameters for its subsequent evolution. This being so, it is possible to let it evolve till time t (considered to be the present). The ABM can then continue to evolve into the future on request but, in the meantime, prediction can be attempted (for example what the wolf population will be at time t+20) using any information and techniques desired.

4 This example is plainly not social but was chosen partly for brevity of explication and partly because there seem to be (perhaps surprisingly) no ABMs that are socially plausible, simple and yet generate long term dynamics with equivalent richness to the synchronised rise and fall of sheep, wolves and grass.

5 The ABM used here (Wilensky 1997) is part of the models library for a package called NetLogo, which can be downloaded for free (Wilensky 1999).
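To make this structure concrete, here is a minimal sketch of such a model in Python (not the NetLogo implementation; the parameters are invented and movement is simplified to full random mixing). It shows the essential ingredients: individual agents with energy budgets, probabilistic reproduction and a depletable, regrowing resource.

```python
import random

def run(steps, seed, n_sheep=100, n_wolves=50, sheep_gain=4, wolf_gain=20,
        sheep_repro=0.04, wolf_repro=0.05, size=25, regrow=30):
    """A minimal wolf-sheep-grass ABM, loosely following the NetLogo design."""
    rng = random.Random(seed)
    cell = lambda: (rng.randrange(size), rng.randrange(size))
    grass = {(x, y): 0 for x in range(size) for y in range(size)}  # 0 = grassy
    sheep = [[cell(), rng.randrange(1, 2 * sheep_gain)] for _ in range(n_sheep)]
    wolves = [[cell(), rng.randrange(1, 2 * wolf_gain)] for _ in range(n_wolves)]
    history = []
    for _ in range(steps):
        for s in list(sheep):  # sheep move, pay energy, graze, maybe reproduce
            s[0] = cell()
            s[1] -= 1
            if grass[s[0]] == 0:
                s[1] += sheep_gain
                grass[s[0]] = regrow  # this patch must now regrow
            if rng.random() < sheep_repro:
                s[1] /= 2
                sheep.append([s[0], s[1]])  # offspring takes half the energy
        sheep = [s for s in sheep if s[1] > 0]  # starvation
        for w in list(wolves):  # wolves move, hunt a co-located sheep, reproduce
            w[0] = cell()
            w[1] -= 1
            prey = next((s for s in sheep if s[0] == w[0]), None)
            if prey is not None:
                sheep.remove(prey)
                w[1] += wolf_gain
            if rng.random() < wolf_repro:
                w[1] /= 2
                wolves.append([w[0], w[1]])
        wolves = [w for w in wolves if w[1] > 0]
        for c in grass:  # depleted patches count down towards regrowth
            grass[c] = max(0, grass[c] - 1)
        history.append((len(sheep), len(wolves)))
    return history

# Identical settings, different seeds: the trajectories differ in detail
# because movement, reproduction and predation are all probabilistic.
print(run(100, seed=1)[-3:])
print(run(100, seed=2)[-3:])
```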
However, one only needs to run this ABM twice to identify the first serious challenge to successful prediction. Because the model is not deterministic (reproduction occurs probabilistically, for example) even the same starting conditions and parameters will not produce identical time series (although they do resemble each other strongly in the magnitude and duration of population changes, for example). Given this, it might occur to the reader that the same situation applies over any specific time period (rendering prediction impossible) but this is too pessimistic. Firstly, the initialisation is untypical, with sheep and wolf populations set by the user (rather than by system interactions themselves). Thus the initial system situation may be outside that ever found during its endogenous evolution. Secondly, it can be seen from the resemblance of population waves that generally, once the sheep population is rising, it will continue to do so for a while and then stabilise and fall. It is not observed that it rises for ever or that it rises for a while and then drops to zero. Thus while exact prediction may be impossible, the identification of less exact regularities may not be.
But it is important to be clear that my argument at this stage is not about whether prediction can succeed (which is an empirical matter) but merely whether it can be made objectively intelligible. Within the framework of this example, it is. It makes complete sense to say "20 periods after the present I predict that the wolf population will be 30" and for that prediction to be definitively confirmed or refuted by running the ABM. In addition, it is clear how some challenges to prediction fit naturally into this framework. For example, if I believe the wolf population on 1st January 1980 is 40 but it is actually 25 then even the correct deterministic ABM will not be able to track system evolution (so the effect of data error on prediction always matters). Further, this ABM encapsulates the assumption (relevant for subsequent discussions of intervention) that there is no structural change. Sheep and wolves are always equally fecund and grass has constant nutritional value. If, at a specific time, farmers started leaving contraceptive laced meat lying about, one could explore the ability of different prediction approaches to identify and accommodate that change.
However, before taking the argument further it is necessary to digress into Agent-Based Modelling methodology and its relation to data. The crucial element here is that, in principle, it is possible to test ABMs. In designing one, existing data is identified (for example time series of wolf and sheep populations), a decision is made on how to specify the ABM (for example, does observation show that starvation and predation are the only causes of death?) and how to calibrate it (what is the reproduction rate for sheep with more or less grass, perhaps established using literal field experiments?) Having designed an ABM that is as empirically grounded as possible, is it true that simulated outcomes reproduce real ones? (This is called validation. See, for example, Hägerstrand 1965.) I have already suggested that perfect prediction is impossible in stochastic ABMs but is it possible, for example, to predict the population range of species or the probabilities that populations will be within specified ranges? This raises an interesting issue about the different degrees of abstraction according to which data can be compared (which needs to be developed further in Agent-Based Modelling – see, for example, Bloomfield 2000).
This aspect of Agent-Based Modelling methodology takes my argument in two crucial directions. Firstly, it is clear how direct representation makes ABMs congruent with data. The ABM makes an assumption about birth rate and there is also an empirical fact of the matter about birth rate. This contrasts with theorised representations (or the technical assumptions on which different modelling approaches like System Dynamics – see, for example, Wilensky 2005 – depend) where there is no guarantee that concepts like transition probability or discount rate have real world referents. Following from this (and very relevant to policy and agency) it also makes perfect sense to say "After 15th February 1981 wolf fertility began declining to half its previous value as farmers started distributing contraceptive laced meat." (As I shall show, however, it is very important to be clear about which statements can be made ex ante and ex post. Ex ante, one can only say that wolf fertility is unlikely to rise after this distribution; claiming that fertility drops by half can only be justified ex post.) Thus endogenous system changes can also be represented directly in an ABM. (Of course, this capability is not costless. In an epidemic ABM, for example, the death toll may be reduced by 50% if the population locks down but both intervention and status quo predictions could be wrong – if the ABM is faulty – and one still has to establish how much compliance there really was for evaluation purposes.) At this stage, however, the argument remains one of potential and not practice. ABMs can represent the changes that arise from human agency and policy (unlike extrapolative prediction and, arguably, model based predictions where mechanisms are only implicit). But there is still much hard work to do before this capability translates into predictive success.
Secondly, now the temporal logic of prediction is clearer, so are claims about testing predictions using ABMs. As I shall argue subsequently, science should always worry about possibilities for cheating (either deliberately or through flawed methods) but let us suppose for the moment that researchers are totally honest and immune to self-deception. In this case, they run the ABM for enough time periods to generate data (which may be used, for example, to calibrate parameters or train a machine learning algorithm) and then make a prediction after that point. If the prediction succeeds (by whatever assessment criteria) then the approach is endorsed and it is sensible to suppose that the ABM may also predict the actual future. It is important to be clear, therefore, that while society wants real prediction (and it is the only totally cheat proof test) it does not follow that testing on known data is either pointless or specious.6

6 The awe surrounding accurate predictions may distract from the fact that testing models honestly can simply involve scientific organisation. One gives a modeller 1000 periods of a time series, asks them to predict it and simply does not provide the 200 periods they are supposed to predict until afterwards! This is the logic of prediction competitions (Erev et al. 2010).
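This honest testing protocol can be stated as code. The sketch below (hypothetical function names; it reuses the run() function from the earlier wolf sheep sketch) gives a predictor 1,000 periods of simulated history and scores it only on the 200 withheld periods, mirroring the prediction competition logic of footnote 6.

```python
def evaluate_prediction(series, predictor, holdout=200):
    """Score a predictor that only ever sees the training portion of a series."""
    train, test = series[:-holdout], series[-holdout:]
    predicted = predictor(train)  # the withheld tail is never shown
    # mean absolute error over the withheld periods
    return sum(abs(p - t) for p, t in zip(predicted, test)) / holdout

# A deliberately naive baseline: repeat the last observed value.
persistence = lambda past: [past[-1]] * 200

wolf_counts = [w for _, w in run(1200, seed=3)]  # reusing run() from above
print(evaluate_prediction(wolf_counts, persistence))
```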
Having identified key components of social science prediction and explained Agent-Based Modelling in the context of a predictive example, I am now in a position to demonstrate the contribution of Agent-Based Modelling to two important areas, namely analyses of predictive failure and prediction assessment.
4. What Can Agent-Based Modelling Contribute To Social Science Prediction? Two Examples

In this section I examine how the distinctive approach of Agent-Based Modelling can improve conceptual understanding and research practice in two key areas: Predictive failure and evaluating predictions.

4.1 The Challenge of Predictive Failure
Some possible causes of predictive failure have already been examined. Extrapolation simply does not allow new information (like underlying behavioural change) to be incorporated into the prediction until it starts affecting the aggregate being predicted. This is the classic problem of turning points, given the belief that the aggregate somehow determines itself rather than simply being a summary of an underlying social process changing endogenously. By contrast, model based prediction might work if the model could be mapped onto reality (and that means not only access to relevant data but also an effective representation of mechanism: How exactly does education level show association with birth rate? Will that association support successful causal intervention?)
But apart from the challenge of devising prediction tests (and avoiding deliberate cheating) it is also necessary to consider how different research methods may permit self-deception. This can occur when, instead of data being used as an independent test for an ABM, the model (on the presumption that it is correct) is fitted to data (has its design and parameters adjusted to maximise match). The problem with this approach is so obvious that it can only be a belief that there is no alternative which has allowed it to be disregarded. If you start from the presumption that your model is correct then you have no capacity to identify misspecification. Having created this problem, whether you can in fact fit the model merely depends on the information content of the data and the number of free model parameters. (The relationship between available data, model size and fitting versus calibration is a complicated one that there is not space to discuss fully here. See Chattoe-Brown 2021 for more analysis.) With enough free parameters, you can fit anything (while in Agent-Based Modelling neither the specification nor the calibration should be free to allow this, each being empirically grounded as far as possible). However, the apparent success in fitting models to available data is illusory because misspecification and associated incorrect parameter values aren't discovered until prediction of new data is attempted.7

7 This is another way of explaining the difference between testing and fitting. If an ABM fails you revisit your specification and calibration assumptions but that does not "exhaust" your testing data. By contrast, all you can do under fitting is more fitting until you have again "exhausted" your data and therefore have to make a leap of faith about whether your model will actually work with new data.
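The claim that enough free parameters will fit anything, while the damage only appears on new data, can be seen in a few lines. The sketch below (an illustration, not taken from the article's sources) fits a straight line and a 9th-degree polynomial to the same short noisy series; the flexible model matches the past better and predicts the future far worse.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(30.0)
series = 50 + 2 * t + rng.normal(0, 5, size=30)  # a noisy rising trend
train_t, test_t = t[:20], t[20:]
train_y, test_y = series[:20], series[20:]

for degree in (1, 9):  # a straight line versus a 9th-degree polynomial
    coeffs = np.polyfit(train_t, train_y, degree)
    in_sample = np.abs(np.polyval(coeffs, train_t) - train_y).mean()
    out_sample = np.abs(np.polyval(coeffs, test_t) - test_y).mean()
    # the flexible fit wins in-sample and loses badly out-of-sample
    print(degree, round(in_sample, 1), round(out_sample, 1))
```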
Having summarised the possibilities for predictive failure in existing approaches, I can now explain how Agent-Based Modelling provides a framework distinguishing sources of predictive failure that are avoidable (with suitable research design) from those which are unavoidable (and can thus only be properly acknowledged in interpreting prediction results).
I have already suggested that designing models directly representing social processes (and particularly causes) is one important way to avoid predictive failure (because it means that corresponding data is more accessible and there are fewer opportunities for spurious reasoning, for example that association is somehow causal). I have also suggested how fitting (rather than calibration and validation) may create problems for predictive models by obscuring misspecification and resulting faulty parameters. By contrast, an ABM tries to achieve correct specification and calibration from the outset (however badly it in fact succeeds) so possible weaknesses cannot be concealed.
Nonetheless, it is clear how additional phenomena (like data error and stochasticity) impose limits on effective prediction even if (somehow) the correct DGP were known. Such problems can, however, be explored (and perhaps even quantified) using the special capabilities of ABMs, as I shall argue in section 5. But one source of predictive failure is unavoidable and all we can do is acknowledge it clearly. In a model, up to the present, one can assess the extent to which underlying processes (for example a shift to preference for smaller families) might affect predictions. But the one thing that cannot be done logically is to anticipate future changes in that preference. If the family size preference trend has been stable or falling up to the present then (if it subsequently rises) prediction will simply fail. This issue may underpin the radical (but actually spurious) prediction critique that you never can tell. In predicting the outcome of the Oxford-Cambridge boat race, the chance that one crew bus will be beamed up by aliens is not part of the model. Prediction must always take place in a credible context of ceteris paribus, in this case that both crews arrive to race. Because a model of everything is impractical, there may always be events that not only falsify a specific prediction but actually invalidate the prediction process. (You did not get the winner wrong. There was no winner because there was no race.) But of course it is an empirical matter whether, in some circumstances, the ceteris paribus conditions do hold (generally the boat race does take place with two crews) so that prediction is legitimate and can meaningfully succeed or fail.
The most obvious manifestation of this issue in a prediction context is genuine novelty. Logically, no prediction method can quantify the possibility that, at some future point, an infallible contraceptive will be invented. But this fact should have no bearing on prediction attempts until it occurs and it does not (in fact) undermine concrete attempts to predict. One has to distinguish clearly: conjectured events should have no bearing on attempting prediction but, of course, every bearing on its success if they arise. This issue involves a confusion between ex ante and ex post claims that must be rigorously avoided.
4.2 The Challenge of Assessing Predictions

I have already suggested at various points that issues arise with assessing predictions and I draw these together here. The first is the time scale over which predictions are made. If this scale is too long, it is likely that misspecification, data error and genuine novelty will result in unavoidable predictive failure for all approaches. On the other hand, if the scale is too short, it is very hard for model based approaches to distinguish themselves convincingly from extrapolative ones. Taking the classic example of a turning point, both approaches ought to predict a rising trend for a while but the crucial difference is that extrapolative methods will keep doing so until the variable starts to level off while an effective model based prediction will, over a suitable horizon, actually predict a lower value (a surprising prediction given the trend, which thus has very high information content). Thus, it is necessary to consider whether models should have to show improved performance over simple trend prediction (since many time series have significant elements of mere trend).
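Scoring a model against a simple trend baseline is straightforward to set up. The sketch below extends the invented murders example from section 2 with a turning point; the "structural" predictions are simply stipulated to show how such a comparison would be scored, not produced by a real model.

```python
# Murders per year: the rising trend from section 2, now with a turning
# point after year 8 (all numbers invented for illustration).
series = [500, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 3600, 3000, 2400]
train, test = series[:8], series[8:]

# Extrapolative baseline: continue the average past increment.
step = (train[-1] - train[0]) / (len(train) - 1)
extrapolated = [train[-1] + step * (i + 1) for i in range(len(test))]

# A hypothetical structural model that anticipates the downturn
# (its predictions are stipulated here, not computed).
structural = [3650, 3050, 2450]

mae = lambda pred: sum(abs(p - t) for p, t in zip(pred, test)) / len(test)
print("extrapolation:", mae(extrapolated), "structural:", mae(structural))
```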
The second issue has also been mentioned but the argument will be consolidated here. There are obviously different ways of characterising data and (like significance levels) predictive performance probably cannot be absolute. A model that can predict the range of wolf and sheep populations is better than one that cannot. A model that can predict the distribution of populations across ranges (a 45% chance of being between 50 and 75) is better still.8 But it is known that even an exact model of a stochastic system will be unable to achieve perfect prediction. Progressive research therefore requires us to identify steadily more demanding predictive challenges and to evaluate as better those ABMs that meet them. (This raises another important issue. The weaknesses of extrapolative methods are universal because they do not evaluate anything underlying the aggregate. Model based methods may work well or badly depending on specific areas of application – marriage success based on psychological traits versus speculative markets – and how clear they are about mechanism claims. But the possibility remains that we may be able to show that certain approaches to prediction are not just successful in specific cases but generally, because they accurately represent the social processes implicated in predictive success or failure.)

8 It is harder to characterise so called qualitative prediction but my arguments are not intended to rule out this approach. Arguably the claim "the trend will mostly slope up", while having lower information content than "the number of murders will increase by about 400 per year", is still clearly falsifiable.
This argument brings us full circle to issues of effective research design. It is very important that popular prediction ideas do not muddle us into making incoherent claims. For example "Donald Trump will be re-elected" is a falsifiable prediction if made before the election. But "Donald Trump has a 65% chance of being re-elected" is not. (For that, he would have to be re-elected in 65 parallel universes out of 100!) In contrast, "65% of incumbents will be re-elected" is again falsifiable. And the attempt to "cheat proof" prediction reminds us that one has a 50% chance of being "right" about Donald Trump's re-election (in a two candidate race at least) by spinning a coin. So, for credibility, the claim actually needs to be (again based on model comparisons) "I can successfully call this many presidential elections". Thus, as with all other research, prediction must occur in a rigorously specified context: What is an appropriate sample size of potential predictions given the current best model? In what circumstances can the credibility of unique predictions actually be demonstrated, or must these always be instances of more general classes?
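The coin spinning baseline can be turned into a concrete sample size calculation using standard binomial arithmetic (the numbers of elections here are invented for illustration).

```python
from math import comb

def coin_tail(n, k):
    """Chance of at least k correct calls out of n by spinning a fair coin."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

# Calling 8 of 10 elections correctly happens by pure luck about 5% of
# the time; 40 of 50 is far rarer, so it is the larger sample that makes
# a claimed success rate credible.
print(coin_tail(10, 8))   # about 0.055
print(coin_tail(50, 40))  # well below 0.001
```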
These arguments also lead to the consolidation of another important issue already discussed, namely the relationship between attempted prediction and the present moment. One reason for the high status of model based prediction as a gold standard for social science is that it is robustly cheat proof (absent time travel). But this presumes that there are no other ways of cheat proofing model testing (and arguably, with fitting, there aren't). But if, as argued, the testing of ABMs has its own methodological protection from cheating (namely empirical calibration rather than fitting) then this problem may be practically less damaging (and there are other analogous solutions like prediction competitions or out-of-sample testing). Further, ensuring cheat proof prediction is not costless. If the need to prevent negative social outcomes does not allow you to test predictions ceteris paribus then the danger is that you will neither test the model without the policy nor be able to test it with. Far from being cheat proof then, unless we think carefully about research design and ex ante/ex post claims, the danger is that policy relevant predictions will be influential without any testing. (If a credible model predicts 10 million dead then it is very likely that huge efforts will be made to falsify that outcome by intervention. This being so, a much lower ex post death toll will not tell us whether, in fact, the non-intervention prediction was completely spurious.)
This argument has an important corollary. We have much data about the 1918 flu pandemic (among others). Obviously it is not the data we might collect now and it will be inaccurate. Further, we are well aware that the 1918 flu is not the same as COVID, but nonetheless two important questions arise. Firstly, when COVID was new and it was simply impossible to test for it or to get accurate data about model parameters, might it still have been better to develop models from historical data and then build on them than to guess? Secondly, once we qualify the idea that the only way to prevent cheating is to predict future events, might such historical modelling actually be quite valuable in narrowing the space of model possibilities against the day when we cannot ethically afford to test ABM predictions because the outcome may be avoidable deaths?
Finally, it is well known that, while not cheat proof, there are standard techniques for organising data to increase model credibility, like out-of-sample testing. Even with fitted models, this approach adds credibility as long as out-of-sample performance is good, but there are reasons, discussed above, for worrying that it may not be (and also reasons why an empirically grounded Agent-Based Model might do better).
In this section, I have therefore supported the earlier claim that the ABM approach can make a distinctive contribution to conceptualising and researching recognised specific problems with prediction (predictive failure and assessing predictions). In the final section I show how it can also make a novel contribution to addressing a more general problem: Developing effective prediction techniques when we do not actually know the DGP.
5. Another Distinctive Use For ABMs: The Prediction Laboratory

At this point, my argument shifts gear somewhat. The previous section dealt with the problems of making and evaluating specific predictions against data and the contributions that Agent-Based Modelling can make. But there is also a deeper challenge to which it can usefully contribute. That is rigorously analysing general claims about prediction when the actual DGP is not known. Thus although, in principle, data error can be acknowledged as a phenomenon, nothing concrete can be said about it because the whole point is that true data values cannot be known. But we can assess, in as much detail as desired, the capability of different prediction methods to perform on data generated by a known ABM. So, if one tries to fit the correct wolf sheep grass ABM to data generated by the same ABM but perturbed by a fixed amount of data error, what happens? Does the difference manifest only in static system properties (like population ranges) or also in dynamic ones (like the time scale over which sheep populations rise and fall)? In this way trustworthy insights about the relationship between the DGP and ABMs can be developed (since we can repeat such experiments over many different ABMs many times) which may therefore be applied with more confidence when the DGP is not known.
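As a first taste of this laboratory, the sketch below (again reusing run() from section 3, with arbitrary error levels) perturbs data generated by a known DGP with increasing observation error and reports what happens to one static property (the population range) and one crude dynamic property (the number of directional turns in the series).

```python
import random

obs_rng = random.Random(42)
true_series = [s for s, _ in run(500, seed=7)]  # sheep counts from a known DGP

def perturb(series, error):
    """Apply a fixed proportional observation error to every data point."""
    return [max(0.0, v + obs_rng.gauss(0, error * v)) for v in series]

for error in (0.0, 0.05, 0.2):  # arbitrary error levels
    observed = perturb(true_series, error)
    width = max(observed) - min(observed)  # a static property: population range
    turns = sum((b - a) * (c - b) < 0      # a crude dynamic property
                for a, b, c in zip(observed, observed[1:], observed[2:]))
    print(error, round(width), turns)
```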
Other applications of this approach, already raised, would be devising performance measures for extrapolative and model based methods over whole data sets (including the prediction time scale). There is no point in comparing models over time scales where none can perform well (because of things like genuine novelty) but can a time scale be established over which the ability of model based methods to find turning points can be demonstrated effectively?
We can also develop this prediction laboratory approach to show, for example, exactly how extrapolative methods err and how fitting on past data may generate poor predictive performance compared to specification and calibration. Researchers can accept these issues in principle (although with difficulty) but that is very different from actually seeing them worked out concretely. Further, as already suggested, this approach could be used to illuminate (if not yet actually forecast) the consequences of policy, genuine novelty and so on. (How would population predictions change if perfect contraception was invented in 1 year, 5 or 10? What would one see in epidemic dynamics if 60% lock down compliance could be achieved within a month?) As already shown by the example of running the same ABM twice, problems that are both conceptually and practically difficult to engage with can rapidly be brought down to earth: What does it actually mean for prediction that social processes are stochastic? (And how do researchers engage sensibly with this idea when they cannot perceive it in the unique realisations of actual data?)
Two more applications of this laboratory approach immediately suggest themselves based on previous arguments. Firstly, one could explore the extent to which characterisations of systems are invariant to other system properties (like stochasticity). If the exact evolution of wolf and sheep populations cannot be predicted, is it possible instead to robustly predict population ranges or distributions of populations? This kind of analysis can thus be conducted on ABM data and then "let loose" on real data once it is better conceptualised and we have some evidence that it might work.
Finally, and this is a very important issue, I have already hinted above that there is a general problem with ABM in operationalising certain quality control ideas from statistics. For a regression model we know what it means (and even what it may result in) if there are too many parameters relative to data (or they are the wrong ones). By contrast, while it is just as easy to see how an ABM may be mis-specified (random mixing is assumed when a disease is actually transmitted via social networks) it is much less clear how we formally establish how many social processes and parameters a given amount of data allows us. Using the laboratory approach we can devise and evaluate tests that can then be used in cases where we don't know the DGP. For example, an ABM might be considered insufficiently discerning if, within its specification and calibration uncertainty, it can reproduce both a time series and its mirror image. This is a very ad hoc suggestion (and might simply not work) but it is only by devising procedures that we can concretely attempt and analyse that we can hope to clarify our conceptual thinking to the point where we can develop effective tests.
In this section I have shown how the ABM approach can not only contribute to existing prediction challenges but also provide new tools for evaluating prediction strategies in general through the "prediction laboratory" insight based on models of models.
6. Conclusion

In this article, I have argued that ABMs (and their methodology) have a distinctive role to play in social science prediction by serving as a coherent framework changing our perspective on several crucial issues. The argument began by showing that, in practice, prediction across the social sciences shares core elements (avoidance of social ills, challenges of research design and conceptualisation, issues with the nature of models – and particularly their claims about mechanism – and so on). I then illustrated Agent-Based Modelling using a simple example and showed that this could coherently represent prediction (for example about the wolf population 20 time periods hence) in terms of these elements. The next stage of the argument was to show how various challenges of predictive failure (whether avoidable or not) and prediction assessment would be viewed differently (and perhaps ameliorated) using ABMs: for example, that an ABM both directly represents the individual processes adding up to the aggregate being predicted and that it can also explicitly represent the changes resulting from policy – for example that people stay at home and thus transmit infection less. Next, I illustrated valuable contributions that might arise from using Agent-Based Modelling as a kind of laboratory to develop concepts and tools that could be evaluated using a known DGP before being used more confidently on an unknown one. Finally, I drew on previous ideas to show how, although future prediction is totally cheat proof, other approaches (like good empirical methodology and prediction competitions) may also make cheating harder and have other advantages (like using data that is more readily available and avoiding the possibility that predictive models used first in crises actually end up untested).
References

Bloomfield, Peter (2000) Fourier Analysis of Time Series: An Introduction, second edition (Hoboken, NJ: Wiley).

Booth, Heather (2006) 'Demographic Forecasting: 1980 to 2005 in Review', International Journal of Forecasting, 22(3), 547-581.

Brennan, Tim and Oliver, William L. (2013) 'Emergence of Machine Learning Techniques in Criminology: Implications of Complexity in Our Data and in Research Question', Criminology and Public Policy, 12(3), 551-562.

Burgess, Ernest W. and Cottrell, Leonard S. Junior (1936) 'The Prediction of Adjustment in Marriage', American Sociological Review, 1(5), 737-751.

Chattoe-Brown, Edmund (2021) 'Agent Based Models', in Atkinson, Paul, Delamont, Sara, Cernat, Alexandru, Sakshaug, Joseph W. and Williams, Richard A. (eds.) Sage Research Methods Foundations (London: Sage), <https://dx.doi.org/10.4135/9781526421036836969>.

Conley, James J. (1985) 'Longitudinal Stability of Personality Traits: A Multitrait-Multimethod-Multioccasion Analysis', Journal of Personality and Social Psychology, 49(5), 1266-1282.

Davis, James A. (1986) The Logic of Causal Order, Quantitative Applications in the Social Sciences 55 (Beverly Hills, CA: Sage).

Doornik, Jurgen A., Hendry, David F. and Castle, Jennifer L. (2020) 'Statistical Short-Term Forecasting of the COVID-19 Pandemic', Journal of Clinical Immunology and Immunotherapy, 6(5), article 46.

Erev, Ido, Ert, Eyal, Roth, Alvin E., Haruvy, Ernan, Herzog, Stefan M., Hau, Robin, Hertwig, Ralph, Stewart, Terrence, West, Robert and Lebiere, Christian (2010) 'A Choice Prediction Competition: Choices from Experience and from Description', Journal of Behavioral Decision Making, 23(1), 15-47.

Gilbert, Nigel (2020) Agent-Based Models, Quantitative Applications in the Social Sciences 153, second edition (Thousand Oaks, CA: Sage).

Hägerstrand, Torsten (1965) 'A Monte Carlo Approach to Diffusion', European Journal of Sociology, 6(1), 43-67.

Hendry, David F. and Richard, Jean-Francois (1983) 'The Econometric Analysis of Economic Time Series', International Statistical Review, 51(2), 111-148.

Ohlin, Lloyd E. and Duncan, Otis Dudley (1949) 'The Efficiency of Prediction in Criminology', American Journal of Sociology, 54(5), 441-452.

Previti, Denise and Amato, Paul R. (2004) 'Is Infidelity a Cause or a Consequence of Poor Marital Quality?', Journal of Social and Personal Relationships, 21(2), 217-230.

Sarbin, Theodore R. (1943) 'A Contribution to the Study of Actuarial and Individual Methods of Prediction', American Journal of Sociology, 48(5), 593-602.

Sarbin, Theodore R. (1944) 'The Logic of Prediction in Psychology', Psychological Review, 51(4), 210-228.

Wilensky, Uri (1997) 'NetLogo Wolf Sheep Predation Model', Center for Connected Learning and Computer-Based Modeling, Northwestern University, Evanston, IL, <http://ccl.northwestern.edu/netlogo/models/WolfSheepPredation>.

Wilensky, Uri (1999) 'NetLogo', Center for Connected Learning and Computer-Based Modeling, Northwestern University, Evanston, IL, <http://ccl.northwestern.edu/netlogo/>.

Wilensky, Uri (2005) 'NetLogo Wolf Sheep Predation (System Dynamics) Model', Center for Connected Learning and Computer-Based Modeling, Northwestern University, Evanston, IL, <http://ccl.northwestern.edu/netlogo/models/WolfSheepPredation(SystemDynamics)>.