Mathematical Biosciences, Vol. 168, Issue 1 (November 2000)


Applying the saddlepoint approximation to bivariate stochastic processes

Eric Renshaw *

Department of Statistics and Modelling Science, University of Strathclyde, Livingstone Tower, 26 Richmond Street, Glasgow G1 1XH, UK

Received 16 September 1999; received in revised form 13 June 2000; accepted 3 July 2000

Abstract

The problem of moment closure is central to the study of multitype stochastic population dynamics since equations for moments up to a given order will generally involve higher-order moments. To obtain a Normal approximation, the standard approach is to replace third- and higher-order moments by zero, which may be severely restrictive on the structure of the p.d.f. The purpose of this paper is therefore to extend the univariate truncated saddlepoint procedure to multivariate scenarios. This has several key advantages: no distributional assumptions are required; it works regardless of the moment order deemed appropriate; and we obtain an algebraic form for the associated p.d.f. irrespective of whether or not we have complete knowledge of the cumulants. The latter is especially important, since no families of distributions currently exist which embrace all cumulants up to any given order. In general the algorithm converges swiftly to the required p.d.f.; analysis of a severe test case illustrates its current operational limit. © 2000 Elsevier Science Inc. All rights reserved.

Keywords: Birth–death migration; Cumulants; Moment closure; Saddlepoint approximation; Spatial processes; Tail probabilities; Truncation

* Corresponding author.
E-mail address:[email protected] (E. Renshaw).

1. Univariate introduction

Although modelling the growth and dispersal of biological populations is an extremely important and challenging problem (e.g. [8,28]), difficulties experienced in handling the associated non-linear mathematics often force analysts back into using linear approximations. Whilst these can provide useful information on the initial qualitative growth of a population, ignoring non-linear aspects may have a serious impact on longer-term behavioural prediction. The essence of this problem lies in the general intractability of the forward Kolmogorov partial differential equation (p.d.e.) for the moment generating function (m.g.f.)

$$M(\theta; t) \equiv \sum_{N=0}^{\infty} p_N(t)\, e^{\theta N} \tag{1.1}$$

for the population size probabilities $p_N(t)$ at time $t$. Some progress can be made by rewriting this equation in terms of the cumulant generating function (c.g.f.)

$$K(\theta; t) \equiv \ln[M(\theta; t)] \equiv \sum_{i=1}^{\infty} \kappa_i(t)\, \theta^i / i!, \tag{1.2}$$

since we can use this to generate equations for the cumulants $\kappa_i(t)$ $(i > 0)$. For example, motivated by a desire to model both the annual catch of an invasion of muskrats in eleven Dutch provinces between 1968 and 1991, and the rapid colonization by the Africanized honey (Killer) bees of North and South America (see [17]), Matis et al. [18] consider the power-law logistic process with population birth and death rates $\lambda_N = a_1 N - b_1 N^{s+1}$ and $\mu_N = a_2 N + b_2 N^{s+1}$, respectively, for

easily yields a set of first-order ordinary differential equations for the $\kappa_i(t)$ (replace $M(\theta; t)$ by $\exp\{K(\theta; t)\}$, expand both sides of the resulting equation in powers of $\theta$ and equate coefficients), though the differential equation for the $j$th cumulant unfortunately involves terms up to the $(j+s)$th cumulant. This clearly rules out determining exact solutions to the cumulant equations, and so Matis et al. [18] adopt a moment closure approach by solving the system of the first $j+s$ cumulant functions with $\kappa_i \equiv 0$ for all $i > j+s$.

This raises two fundamental questions. The first, considered by Matis et al. [18], assesses the error induced into the cumulants themselves by adopting this truncation procedure. The second, considered by Renshaw [29], asks: assuming that the first $j+s$ cumulants are known exactly, what error is induced into the underlying probability structure by taking all higher-order cumulants to be zero? He studies this by considering the truncated c.g.f.

$$K_n(\theta) \equiv \sum_{i=1}^{n} \kappa_i\, \theta^i / i! \tag{1.4}$$

in tandem with the associated saddlepoint approximation. Easton and Ronchetti [5] proposed this approach in the context of deriving an approximation to the c.g.f. of some statistic $V_r(X_1, \ldots, X_r)$ of i.i.d. observations $X_1, \ldots, X_r$. They show that it is especially useful in the case of small sample sizes, and although our own scenario of investigating the structure of a single realisation of a stochastic process is quite different, their success is certainly encouraging.

A superb account of the derivation of the saddlepoint approximation is provided by Daniels [2] in terms of the dominant term in the contour-integration formula for the inversion of the c.g.f.

intrinsically better than employing either the Central Limit Theorem or Edgeworth-type approximations. The key result is that for $\theta_0$ an appropriate root of

$$x = K'(\theta_0) \tag{1.5}$$

(Theorems 6.1 and 6.2 of Daniels [2] guarantee there will be only one), we have the approximation

$$f(x) \simeq [2\pi K''(\theta_0)]^{-1/2} \exp\{K(\theta_0) - \theta_0 x\}. \tag{1.6}$$

The power of this approach can be seen immediately on noting that it not only reproduces the Normal p.d.f. exactly, but also that the saddlepoint approximation for the gamma p.d.f. differs from the exact result only in that $\Gamma(\alpha)$ is replaced by Stirling's approximation in the normalising factor [2]. The truncated saddlepoint approximation, $f_n(x)$, is then obtained by substituting (1.4) into expressions (1.5) and (1.6), yielding

$$f_n(x) = [2\pi K_n''(\theta_0)]^{-1/2} \exp\{K_n(\theta_0) - \theta_0 x\}, \quad \text{where } x = K_n'(\theta_0). \tag{1.7}$$

This general form is extremely useful when we wish to examine the structure of the p.d.f. which corresponds to a given set of cumulants. For although in principle the Kolmogorov equations for the probabilities $\{p_n(t)\}$ can be solved numerically to any desired degree of accuracy, this may be computationally far too expensive. In contrast, exploitation of (1.7) is both algebraically tractable for small $n$, and numerically fast for all $n$. Moreover, being a completely general technique it does not necessitate the preselection of an assumed underlying distribution whose parameters are then fitted according to some statistical goodness-of-fit criteria. Indeed, for $n = 3$ (i.e., we incorporate the mean $\kappa_1 = \mu$, variance $\kappa_2 = \sigma^2$ and third central moment $\kappa_3$) we have

[29]. This is a completely general, and algebraically amenable, result, which provides a considerable improvement over the Normal approximation (for which $\kappa_3 \equiv 0$) since it incorporates skewness. Raising $n$ to 4 and 5 leads to mathematically tractable cubic and quartic equations, respectively, for $\theta_0$, so expressions (1.8) and (1.9) can be refined still further.
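For $n = 3$ the saddlepoint equation $K_3'(\theta_0) = x$ is a quadratic in $\theta_0$, so the third-order approximation can be coded in a few lines. The following is a minimal sketch rather than the paper's own expressions (1.8) and (1.9): the function name `third_order_saddlepoint` and the variable `disc` are illustrative, `disc` is only assumed to play the role of the quantity that must be non-negative for $\theta_0$ to be real, and the root is chosen so that $K_3''(\theta_0) > 0$.

```python
import math

def third_order_saddlepoint(k1, k2, k3, x):
    """Third-order truncated saddlepoint density at x, or None if it does not exist.

    Solves K_3'(theta) = k1 + k2*theta + k3*theta**2/2 = x and substitutes the
    root into (1.6) with K replaced by the truncated c.g.f. K_3 (assumes k3 != 0).
    """
    disc = k2 * k2 - 2.0 * k3 * (k1 - x)    # discriminant of the quadratic in theta
    if disc <= 0.0:
        # disc < 0: no real root; disc == 0: K_3'' vanishes, density undefined
        return None
    theta = (-k2 + math.sqrt(disc)) / k3    # root for which K_3''(theta) = sqrt(disc) > 0
    K = k1 * theta + k2 * theta ** 2 / 2.0 + k3 * theta ** 3 / 6.0
    K2 = k2 + k3 * theta                    # second derivative K_3''(theta)
    return math.exp(K - theta * x) / math.sqrt(2.0 * math.pi * K2)

# Poisson(10): all cumulants equal 10
for x in (4, 10, 15):
    print(x, third_order_saddlepoint(10.0, 10.0, 10.0, x))
```

With Poisson (10) cumulants the discriminant is $100 - 20(10 - x)$, so the sketch returns `None` below $x = 5$, which illustrates how a third-order truncation can lose part of the lower tail.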

Renshaw [29] illustrates the application of this approach by comparing the exact Poisson (10) probabilities with the full saddlepoint probabilities derived through (1.6), and the truncated saddlepoint probabilities derived through (1.7) with truncation points $n = 2, 3, 4$ and 6. The full saddlepoint approximation provides an accuracy to within 3% for $i > 2$ and to within 1% for $i > 9$, in total contrast to the Normal approximation (i.e., (1.7) with $n = 2$) which behaves badly in the tails. The third-order case $n = 3$ is much better where it exists; for $i \ge 12$ it not only outperforms the fourth- and sixth-order approximations, but it also marginally beats the full saddlepoint approximation itself. However, it does collapse in the lower tail of the distribution due to the necessity of having $w \ge 0$ in order for $\theta_0$ to be real. Note that this is not a serious problem, since employing the modification proposed by Wang [31], in which $\kappa_3$ and $\kappa_4$ are each scaled down

Extending the study to the logistic birth–death process reinforces the conclusion that the third-order approximation (1.8) is best both in terms of transparency and accuracy; here the third-order approximation covers the full admissible range, whilst the fourth- and sixth-order approximations do not. Care must be taken though not to dismiss the Normal approximation out of hand, since with a power-law logistic birth–death process it offers the optimal approximation in the tails, and is only marginally worse than the $n = 3$, 4 and 6 cases in the centre.

The fact that $\theta_0$ in (1.9) may be complex highlights the inherent contradiction which underlies the standard moment closure approach. For the reason behind placing all high-order cumulants equal to zero stems from the Gaussian result that $\kappa_i \equiv 0$ for $i \ge 3$. Yet when $\kappa_3 \ne 0$ the underlying distribution cannot be Gaussian, whence $\kappa_i = 0$ $(i > 3)$ is hardly an appropriate choice! For the Poisson (10) example this results in third-order approximations for $x \ge 5$ only, an anomaly which is easily resolved by allowing $\kappa_4 > 0$. For then $K_4'(\theta) - x = \kappa_1 + \kappa_2\theta + \kappa_3\theta^2/2 + \kappa_4\theta^3/6 - x = 0$ solves to give real $\theta$ for all $x \ge 0$. Table 1 shows the effect of taking various $\kappa_4$ values on either side of the true value 10 for $x = 0, 5, \ldots, 25$. Increasing $\kappa_4$ clearly 'flattens' the resulting p.d.f., though between $\kappa_4 = 5$ and 20 the central part of the distribution suffers little or no effect. Tail inflation is felt more strongly at smaller values of $x$ than at higher values.
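The effect of varying $\kappa_4$ can be explored numerically. The sketch below evaluates the truncated saddlepoint form for an arbitrary list of cumulants by solving $K_n'(\theta) = x$ with plain bisection. It is an illustration only: the names `K_trunc` and `truncated_saddlepoint` and the bracket $[-20, 20]$ are assumptions, and bisection is valid only when $K_n'$ is increasing on the bracket (true for the second- and fourth-order Poisson (10) cases shown, but not for $n = 3$).

```python
import math

def K_trunc(kappa, theta, order=0):
    """order-th derivative of K_n(theta) = sum_{i=1..n} kappa_i * theta**i / i!."""
    return sum(k * theta ** (i - order) / math.factorial(i - order)
               for i, k in enumerate(kappa, start=1) if i >= order)

def truncated_saddlepoint(kappa, x, lo=-20.0, hi=20.0):
    """Saddlepoint density (1.6) with K replaced by the truncated c.g.f. K_n.

    Assumes K_n' is increasing on [lo, hi], so that bisection finds the root.
    """
    if K_trunc(kappa, lo, 1) > x or K_trunc(kappa, hi, 1) < x:
        raise ValueError("saddlepoint is not bracketed by [lo, hi]")
    for _ in range(200):                      # bisection on K_n'(theta) = x
        mid = 0.5 * (lo + hi)
        if K_trunc(kappa, mid, 1) < x:
            lo = mid
        else:
            hi = mid
    theta = 0.5 * (lo + hi)
    exponent = K_trunc(kappa, theta) - theta * x
    return math.exp(exponent) / math.sqrt(2.0 * math.pi * K_trunc(kappa, theta, 2))

# Fourth-order truncation for Poisson(10) cumulants with several kappa_4 values
for k4 in (5.0, 10.0, 20.0):
    row = [truncated_saddlepoint([10.0, 10.0, 10.0, k4], x) for x in (0, 5, 10, 15, 20)]
    print(k4, ["%.6f" % v for v in row])
```

At $x = 10$ the saddlepoint is $\theta_0 = 0$ whatever the value of $\kappa_4$, so every fourth-order column takes the same value $1/\sqrt{20\pi}$ there.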

Note that for this specific problem, employing recent theoretical developments based on the tail probability structures of Lugannani and Rice [14] does not yield further improvement. Jensen [13] provides a full account of this approach (also Reid [22]); Daniels [3,4] gives numerical examples and reviews [4] a variety of tail-area approximations using the saddlepoint method. Wang [30] derives the cumulative distribution function of the sample mean of independent bivariate random vectors, whilst Barndorff-Nielsen and Cox [1] and McCullagh [20] provide further insight into saddlepoint approximations.

2. Multivariate extension

In terms of straight practical application, extending the truncated saddlepoint approach from one- to multi-variable scenarios simply involves moving from (1.5)–(1.7) to the $m$-variable form

$$f(x_1, \ldots, x_m) \simeq \frac{\exp\{K(\theta_1, \ldots, \theta_m) - \theta_1 x_1 - \cdots - \theta_m x_m\}}{(2\pi)^{m/2} \sqrt{|K''(\theta_1, \ldots, \theta_m)|}} \tag{2.1}$$

for

$$\partial K(\theta_1, \ldots, \theta_m)/\partial\theta_i = x_i \quad (i = 1, \ldots, m) \tag{2.2}$$

and $|K''|$ the determinant of second derivatives. Although we shall remain with two-variable processes throughout this paper for reasons of algebraic simplicity, no additional theoretical difficulties occur when $m > 2$.

Table 1
Comparison of exact Poisson probabilities with fourth-order truncation for $\kappa_4 = 5$, 10 (true), 20, 50 and 100

x    Exact      $\kappa_4=5$   $\kappa_4=10$   $\kappa_4=20$   $\kappa_4=50$   $\kappa_4=100$
0    0.000045   0.000000   0.000104   0.000299   0.000725   0.001246
5    0.037833   0.041460   0.037030   0.033814   0.031068   0.029854
10   0.125110   0.126157   0.126157   0.126157   0.126157   0.126157
15   0.034718   0.035242   0.035008   0.034594   0.033656   0.032660
20   0.001866   0.001826   0.001872   0.001963   0.002211   0.002562

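To illustrate this procedure, suppose that the variables $(X, Y)$ follow a bivariate Normal distribution with zero means, variances $\sigma_1^2$ and $\sigma_2^2$, and correlation $\rho$. Then the associated c.g.f.

$$K(\theta_1, \theta_2) = (1/2)(\theta_1^2\sigma_1^2 + 2\rho\sigma_1\sigma_2\theta_1\theta_2 + \theta_2^2\sigma_2^2)$$

which yield, for given co-ordinates $(x, y)$,

$\theta_1$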

As a second example, suppose that $X, Y$ are i.i.d. Poisson (1) variables, and let $U = X$ and $V = X + Y$. Then

this scenario the success of the saddlepoint approximation is limited only by the accuracy of Stirling's approximation.

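We shall now show how the saddlepoint approach can be applied to a simple stochastic process, as opposed to a given p.d.f. such as (2.7). Let pairs of (type 1, type 2) individuals arrive as a Poisson process at rate $\alpha$, with individuals dying independently according to two simple death processes with rates $\mu_1$ and $\mu_2$, respectively. Then on denoting $p_{ij}(t) = \Pr(\text{population is of size } (i, j) \text{ at time } t)$, the Chapman–Kolmogorov forward equations are given by

and the associated equation for the probability generating function (p.g.f.)

Eq. (2.12) solves via the auxiliary equations

the three constituent parts of (2.14), and denoting

$$\phi_1 = (\alpha/\mu_1)(1 - e^{-\mu_1 t}), \quad \phi_2 = (\alpha/\mu_2)(1 - e^{-\mu_2 t}), \quad \phi_{12} = [\alpha/(\mu_1 + \mu_2)](1 - e^{-(\mu_1 + \mu_2)t}), \tag{2.15}$$

we therefore see that $(X, Y)$ is distributed as the convolution

$$\text{Poisson}(X; \phi_1 - \phi_{12}) + \text{Poisson}(Y; \phi_2 - \phi_{12}) + \text{Poisson}(X = Y; \phi_{12}). \tag{2.16}$$

So the exact population size probabilities are given by

$$p_{ij}(t) = e^{-(\phi_1 + \phi_2 - \phi_{12})} \sum_{k=0}^{\min(i,j)} \frac{(\phi_1 - \phi_{12})^{i-k}\,(\phi_2 - \phi_{12})^{j-k}\,\phi_{12}^{k}}{(i-k)!\,(j-k)!\,k!}. \tag{2.17}$$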

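To evaluate the associated saddlepoint values we first note that the c.g.f. is given by (2.14) with $K(\theta_1, \theta_2; t) \equiv G(e^{\theta_1}, e^{\theta_2}; t)$, whence the saddlepoint equations (2.2) become

For algebraic convenience denote these equations as

$$aw + bwz = x \quad \text{and} \quad cz + bwz = y, \tag{2.19}$$

where

$$w = e^{\theta_1}, \quad z = e^{\theta_2}, \quad a = \phi_1 - \phi_{12}, \quad b = \phi_{12}, \quad c = \phi_2 - \phi_{12}. \tag{2.20}$$

Then we have

So the full saddlepoint form (2.1) is given by

$$f(x, y) = \frac{\exp\{-a(1-w) - c(1-z) - b(1-wz)\}\, w^{-x} z^{-y}}{2\pi\sqrt{xy - b^2w^2z^2}}. \tag{2.23}$$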

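$\mu_2 = 2$. Then from (2.16) we see that the process behaves as the Poisson convolution

$$\text{Poisson}(X; 4) + \text{Poisson}(Y; 1) + \text{Poisson}(X = Y; 2), \tag{2.24}$$

with the exact probabilities (2.17) taking the fairly amenable form

$$p_{ij}(\infty) = e^{-7} \sum_{k=0}^{\min(i,j)} \frac{4^{i-k}\, 2^{k}}{(i-k)!\,(j-k)!\,k!}. \tag{2.25}$$

However, the structure of even this simple multi-termed expression is certainly not transparent, and contrasts markedly with that of the single-termed saddlepoint approximation (2.23), namely

$$f(x, y) = \frac{w^{-x} z^{-y} \exp\{-7 + 4w + z + 2wz\}}{2\pi\sqrt{xy - 4w^2z^2}}, \tag{2.26}$$

which is clearly much easier to interpret.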

To assess the numerical accuracy of (2.26), at the marginal (Poisson) means $\alpha/\mu_1 = 6$ and $\alpha/\mu_2 = 3$ we have $p_{6,3}(\infty) = 0.041068$ (to 6 decimal places), so the saddlepoint value $f(6, 3) = 0.042536$ (3.57% too high) compares well. Whilst to examine a 'worst case' scenario let us take (1,1); for since Stirling's approximation to 1! is 2.32, we might expect the approximation to perform poorly. In fact, $f(1, 1) = 0.006579$ compares quite favourably with $p_{1,1}(\infty) = 0.005471$.
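The comparison at the marginal means can be checked directly. The sketch below is a hedged illustration for the convolution Poisson(4) + Poisson(1) + Poisson(2) of (2.24), assuming the c.g.f. $K = a(w - 1) + c(z - 1) + b(wz - 1)$ with $w = e^{\theta_1}$, $z = e^{\theta_2}$. The elimination through $u = wz$ is an assumed piece of algebra used here to solve (2.19), not necessarily the route taken in the paper, and the names `exact_prob` and `saddlepoint` are illustrative.

```python
import math

A, B, C = 4.0, 2.0, 1.0   # a, b, c of (2.19) for the example (2.24)

def exact_prob(i, j):
    """P(X = i, Y = j) for X = U + W, Y = V + W with independent
    U ~ Poisson(A), V ~ Poisson(C) and W ~ Poisson(B)."""
    total = sum(A ** (i - k) * C ** (j - k) * B ** k
                / (math.factorial(i - k) * math.factorial(j - k) * math.factorial(k))
                for k in range(min(i, j) + 1))
    return math.exp(-(A + B + C)) * total

def saddlepoint(x, y):
    """Saddlepoint value from (2.1) for K = A(w-1) + C(z-1) + B(wz-1), with x, y > 0."""
    # Writing u = w*z, (2.19) gives A*w = x - B*u and C*z = y - B*u, so u solves
    # B^2 u^2 - (B(x + y) + A*C) u + x*y = 0; the smaller root keeps w, z > 0.
    p = B * (x + y) + A * C
    u = (p - math.sqrt(p * p - 4.0 * B * B * x * y)) / (2.0 * B * B)
    w = (x - B * u) / A
    z = (y - B * u) / C
    K = A * (w - 1.0) + C * (z - 1.0) + B * (u - 1.0)
    det = x * y - (B * u) ** 2        # |K''|, since K_11 = x, K_22 = y, K_12 = B*u
    exponent = K - x * math.log(w) - y * math.log(z)
    return math.exp(exponent) / (2.0 * math.pi * math.sqrt(det))

print(exact_prob(6, 3), saddlepoint(6.0, 3.0))
```

At $(6, 3)$ the saddlepoint is $w = z = 1$, so the approximation reduces to $1/(2\pi\sqrt{14})$.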

3. Cumulant truncation

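$$K(\theta_1, \theta_2) = \kappa_{10}\theta_1 + \kappa_{01}\theta_2 + \kappa_{20}\theta_1^2/2 + \kappa_{11}\theta_1\theta_2 + \kappa_{02}\theta_2^2/2 + \kappa_{30}\theta_1^3/6 + \cdots$$

Eqs. (3.2) are efficiently solved by using bivariate Newton–Raphson, with the iterates

$(\theta_1^{(r+1)}, \theta_2^{(r+1)})$

Whence the corresponding saddlepoint approximation is given by

$$\tilde{f}(x, y) = \frac{\exp\{K(\theta_1, \theta_2) - \theta_1 x - \theta_2 y\}}{2\pi\sqrt{g_1 h_2 - g_2 h_1}}. \tag{3.5}$$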
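A bivariate Newton–Raphson solver of this kind might be sketched as follows. The sketch assumes that the truncated c.g.f. is $K = \sum \kappa_{ij}\theta_1^i\theta_2^j/(i!\,j!)$, that $g$ and $h$ are the residuals $\partial K/\partial\theta_1 - x$ and $\partial K/\partial\theta_2 - y$, and that $g_1, g_2, h_1, h_2$ denote their partial derivatives with respect to $\theta_1$ and $\theta_2$. The function names, the zero starting point and the convergence tolerance are illustrative choices, not taken from the paper.

```python
import math

def K_partial(kappa, t1, t2, p=0, q=0):
    """(p, q)-th partial derivative of K = sum kappa[(i, j)] t1^i t2^j / (i! j!)."""
    return sum(k * t1 ** (i - p) * t2 ** (j - q)
               / (math.factorial(i - p) * math.factorial(j - q))
               for (i, j), k in kappa.items() if i >= p and j >= q)

def solve_saddlepoint(kappa, x, y, max_iter=100, tol=1e-12):
    """Newton-Raphson for dK/dt1 = x, dK/dt2 = y, started from (0, 0)."""
    t1 = t2 = 0.0
    for _ in range(max_iter):
        g = K_partial(kappa, t1, t2, 1, 0) - x
        h = K_partial(kappa, t1, t2, 0, 1) - y
        g1 = K_partial(kappa, t1, t2, 2, 0)
        g2 = h1 = K_partial(kappa, t1, t2, 1, 1)
        h2 = K_partial(kappa, t1, t2, 0, 2)
        det = g1 * h2 - g2 * h1
        d1 = (g * h2 - h * g2) / det      # Newton step: inverse Jacobian times residual
        d2 = (h * g1 - g * h1) / det
        t1, t2 = t1 - d1, t2 - d2
        if abs(d1) + abs(d2) < tol:
            return t1, t2
    raise RuntimeError("Newton-Raphson did not converge")

def saddlepoint_density(kappa, x, y):
    """Evaluate the form (3.5) at (x, y) for the cumulants in `kappa`."""
    t1, t2 = solve_saddlepoint(kappa, x, y)
    det = (K_partial(kappa, t1, t2, 2, 0) * K_partial(kappa, t1, t2, 0, 2)
           - K_partial(kappa, t1, t2, 1, 1) ** 2)
    if det <= 0.0:
        raise ValueError("g1*h2 - g2*h1 must be positive for (3.5) to be real")
    exponent = K_partial(kappa, t1, t2) - t1 * x - t2 * y
    return math.exp(exponent) / (2.0 * math.pi * math.sqrt(det))

# Second-order (Normal) cumulants: means 10, variances 9 and 16, covariance 4
normal = {(1, 0): 10.0, (0, 1): 10.0, (2, 0): 9.0, (1, 1): 4.0, (0, 2): 16.0}
print(saddlepoint_density(normal, 12.0, 7.0))
```

With second-order cumulants only, $K$ is quadratic, Newton–Raphson converges in a single step and the result coincides with the bivariate Normal density, which gives a convenient sanity check before higher-order cumulants are added to the dictionary.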

For this saddlepoint procedure to yield full support to the underlying process we require the quadratic surface (3.2) to solve for real $(\theta_1, \theta_2)$ for all appropriate $(x, y)$-values. An extreme solution is to place $\kappa_{31} = \kappa_{22} = \kappa_{13} = 0$, for we see from (3.2) that for large $|\theta_i|$, $g \sim \kappa_{40}\theta_1^3/6$ and $h \sim \kappa_{04}\theta_2^3/6$, which independently sweep out all possible $x$ and $y$ values. So a saddlepoint solution will always exist. However, given that there is no unique way of relaxing the zero condition on the fourth-order cumulants, a key question arises as to how (say)

max

Example A:
$\kappa_{10} = 10$, $\kappa_{01} = 10$;
$\kappa_{20} = 9$, $\kappa_{11} = 4$, $\kappa_{02} = 16$;
$\kappa_{30} = 15$, $\kappa_{21} = 10$, $\kappa_{12} = 15$, $\kappa_{03} = 25$

over $0 \le x, y \le 100$. Given that saddlepoint p.d.f.s operate over $-\infty < x, y < \infty$, non-negligible probability mass may accrue outside our chosen bounded region. Thus, not only may we have to scale our saddlepoint p.d.f. to ensure that probabilities sum to one, but also the ensuing saddlepoint cumulants, $\tilde{\kappa}_{ij}$, may differ from the target cumulants $\kappa_{ij}$. The development of a universally optimal procedure for achieving this goal is currently an open problem, but the following approach works well. The five free-ranging fourth-order cumulants are used to achieve the fit, and here we select the initial values $\tilde{\kappa}_{40} = \tilde{\kappa}_{04} = 100$ (which are high enough to ensure that the resulting iterated p.d.f.s have full support over $0 \le x, y \le 100$) together with $\tilde{\kappa}_{31} = \tilde{\kappa}_{22} = \tilde{\kappa}_{13} = 0$. In general, deriving a p.d.f. to fit $n$th-order cumulants would use the $n + 2$ $(n+1)$th-order cumulants. For $\{Z\}$ a set of independent uniformly distributed pseudo-random numbers on $(-0.5, 0.5)$:

0. set initial saddlepoint cumulants to the target values, i.e., $\tilde{\kappa}_{10} = \kappa_{10}, \ldots, \tilde{\kappa}_{03} = \kappa_{03}$, and choose appropriate values for $\tilde{\kappa}_{40}, \ldots, \tilde{\kappa}_{04}$;

1. increment a randomly chosen cumulant $\tilde{\kappa}_{ij}$ by $dZ$, for appropriate $d$;

2. (a) evaluate the resulting saddlepoint probabilities $\tilde{f}(x, y)$ via the iterative procedure (3.1)–(3.5), and then (b) rescale them to form $\hat{f}(x, y)$ to enable (here) $\sum_{i,j=0}^{100} \hat{f}(x, y) = 1$;

3. evaluate the cumulants $\hat{\kappa}_{ij}$ corresponding to $\hat{f}(x, y)$;

4. determine $S = \sum_{i,j=0}^{3} (\hat{\kappa}_{ij} - \kappa_{ij})^2$, and update $\tilde{\kappa}_{ij}$ if $S$ is reduced and $g_1 h_2 - g_2 h_1 > 0$ (to ensure (3.5) is real);

5. print $\hat{f}(x, y)$ when $S$ reaches say $10^{-6}$, then stop;

6. return to 1.
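The steps above can be sketched in code as follows. This is an illustrative reading of the algorithm rather than the author's program: the names `saddlepoint`, `fitted_cumulants`, `squared_error` and `fit` are hypothetical, Newton–Raphson is started from zero at every grid point, a fixed number of trial steps replaces the stopping rule of step 5, and only cumulants up to third order are estimated from the rescaled grid probabilities.

```python
import math
import random

def K_partial(kappa, t1, t2, p=0, q=0):
    """(p, q)-th partial derivative of K = sum kappa[(i, j)] t1^i t2^j / (i! j!)."""
    return sum(k * t1 ** (i - p) * t2 ** (j - q)
               / (math.factorial(i - p) * math.factorial(j - q))
               for (i, j), k in kappa.items() if i >= p and j >= q)

def saddlepoint(kappa, x, y, max_iter=50):
    """f~(x, y) from (3.5); None if Newton-Raphson fails or g1*h2 - g2*h1 <= 0."""
    t1 = t2 = 0.0
    try:
        for _ in range(max_iter):
            g1 = K_partial(kappa, t1, t2, 2, 0)
            g2 = K_partial(kappa, t1, t2, 1, 1)
            h2 = K_partial(kappa, t1, t2, 0, 2)
            det = g1 * h2 - g2 * g2
            if det <= 0.0:
                return None
            g = K_partial(kappa, t1, t2, 1, 0) - x
            h = K_partial(kappa, t1, t2, 0, 1) - y
            if abs(g) + abs(h) < 1e-10:          # saddlepoint equations satisfied
                exponent = K_partial(kappa, t1, t2) - t1 * x - t2 * y
                return math.exp(exponent) / (2.0 * math.pi * math.sqrt(det))
            t1 -= (g * h2 - h * g2) / det
            t2 -= (h * g1 - g * g2) / det
    except OverflowError:
        pass
    return None

def fitted_cumulants(kappa, n_max):
    """Steps 2-3: evaluate f~ on 0..n_max, rescale, return cumulants up to order 3."""
    f = {}
    for x in range(n_max + 1):
        for y in range(n_max + 1):
            value = saddlepoint(kappa, x, y)
            if value is None:
                return None
            f[(x, y)] = value
    total = sum(f.values())                       # rescaling constant of step 2(b)
    m1 = sum(v * x for (x, y), v in f.items()) / total
    m2 = sum(v * y for (x, y), v in f.items()) / total
    est = {(1, 0): m1, (0, 1): m2}
    for i in range(4):
        for j in range(4 - i):
            if i + j >= 2:   # cumulants of order 2 and 3 equal the central moments
                est[(i, j)] = sum(v * (x - m1) ** i * (y - m2) ** j
                                  for (x, y), v in f.items()) / total
    return est

def squared_error(kappa, target, n_max):
    """Step 4: S = sum of squared differences between fitted and target cumulants."""
    est = fitted_cumulants(kappa, n_max)
    if est is None:
        return None
    return sum((est[key] - target[key]) ** 2 for key in target)

def fit(target, working, n_max, d, n_steps, rng):
    """Random search over the working cumulants with a fixed budget of trial steps."""
    best = squared_error(working, target, n_max)
    if best is None:
        raise ValueError("initial working cumulants give no valid saddlepoint surface")
    keys = sorted(working)
    for _ in range(n_steps):
        trial = dict(working)
        trial[rng.choice(keys)] += d * (rng.random() - 0.5)    # step 1
        s = squared_error(trial, target, n_max)                # steps 2 to 4
        if s is not None and s < best:
            working, best = trial, s
    return working, best
```

A small usage example with second-order (Normal) targets on a reduced grid would be `fit(target, dict(target), 20, 0.1, 10, random.Random(1))`, where `target` holds the means, variances and covariance; a fourth-order fit simply adds the third- and fourth-order entries to `working`.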

To enable $\tilde{\kappa}_{40}, \ldots, \tilde{\kappa}_{04}$ to 'bed-in', the initial choice of $d$ was kept relatively high at $d = 10$, but even such coarse-tuning swiftly led to $\max|\hat{\kappa}_{ij} - \kappa_{ij}| < 0.03$. Switching to a fine-tuning regime with $d = 0.1$ then quickly produced $\max|\hat{\kappa}_{ij} - \kappa_{ij}| < 0.0006$. If required, further accuracy could be achieved with micro-tuning using say $d = 0.001$; here this gave $\max|\hat{\kappa}_{ij} - \kappa_{ij}| = 0.00001$. Clearly the algorithm is both fast and extremely precise. The resulting $\tilde{\kappa}_{ij}$-values are given (to 3 decimal places, double precision values are used in the computation) by

$\tilde{\kappa}_{10} = 9.704$, $\tilde{\kappa}_{01} = 9.810$;
$\tilde{\kappa}_{20} = 8.651$, $\tilde{\kappa}_{11} = 4.640$, $\tilde{\kappa}_{02} = 16.579$;
$\tilde{\kappa}_{30} = 20.157$, $\tilde{\kappa}_{21} = 10.157$, $\tilde{\kappa}_{12} = 14.867$, $\tilde{\kappa}_{03} = 20.944$;
$\tilde{\kappa}_{40} = 86.058$ (71.715), $\tilde{\kappa}_{31} = 4.988$ (13.427), $\tilde{\kappa}_{22} = 3.542$ (11.694), $\tilde{\kappa}_{13} = 3.526$ (16.984), $\tilde{\kappa}_{04} = 118.078$ (79.273);

the associated fourth-order cumulant estimates $\hat{\kappa}_{40}, \ldots, \hat{\kappa}_{04}$ are given in brackets. Though the two means are 3.33 and 2.5 standard deviations above zero, the presence of skewness leads to substantial probability mass outside $0 \le x, y \le 100$ (in step 2a $\sum_{i,j=0}^{100} \tilde{f}(x, y) = 0.9151$), so rescaling is

$\tilde{\kappa}_{13}$. We shall see in Section 4 that under certain conditions this last observation can prove to be of crucial importance.

The corresponding rescaled Normal p.d.f. can be obtained from the same algorithm with $\tilde{\kappa}_{30}, \ldots, \tilde{\kappa}_{04}$ fixed at zero (the second-order saddlepoint approximation is guaranteed to exist). Initial coarse-tuning is unnecessary, since third- and fourth-order cumulants no longer feature, and fine-tuning swiftly produces excellent accuracy (maximum absolute error of <0.000004): subsequent micro-tuning yields an accuracy of 0.000001. Given the relatively high mean to standard deviation ratios, very little probability leakage occurs (in step 2a $\sum_{i,j=0}^{100} \tilde{f}(x, y) = 0.9946$). Nevertheless, the tiny amount of rescaling employed in step 2b still has an effect on $\hat{\kappa}_{30}, \ldots, \hat{\kappa}_{04}$, giving rise to the non-zero values (0.344, 0.467, 1.493, 5.834; −2.362, −1.513, −2.839, −10.305, −41.034). The p.d.f. $\hat{f}(x, y)$ takes the scaled bivariate Normal form (1/0.9946) N(9.9822, 9.9378; 9.0665, 4.1723, 16.6494) over $0 \le x, y \le 100$ (Fig. 1(b)), and comparison with Fig. 1(a) shows that this p.d.f. is less peaked, having a modal value of 0.0138 compared to 0.0173 under fourth-order truncation.

Fig. 1. Truncated saddlepoint p.d.f.s $\hat{f}(x, y)$ corresponding to the cumulants of Example A: (a) least squares fit to

4. A cautionary tale!

Although this highly successful example bodes well for many practical situations, caution must be exercised if the proposed cumulant structure exhibits awkward characteristics. As an extreme example, if the variances and covariance are zero then the p.d.f. must comprise a single spike of mass one at the mean, so all central moments and cumulants are zero. Thus in this case any attempt to construct a p.d.f. with non-negative third-order cumulants is doomed to failure. A less obvious situation is where one (or both) of the means lies close to an axis. For the resulting constriction on shape may mean that not all sets of $\{\kappa_{10}, \ldots, \kappa_{03}\}$ can lend themselves to an associated p.d.f. To illustrate this scenario, we shall use the Matis, Zheng and Kiffe [19] cumulants, given in column 2 of Table 2, for a model with swarming, multiple births and non-exponential birth intervals (Example B). For here the ratios $\kappa_{10}/\sqrt{\kappa_{20}} = 1.21$ and (especially) $\kappa_{01}/\sqrt{\kappa_{02}} = 0.83$ are sufficiently low to force substantial skewness into the p.d.f. Moreover, when $\kappa_{40} = \kappa_{04} = 0$, $\partial K/\partial\theta_1$ and $\partial K/\partial\theta_2$ (i.e., the left-hand sides of (3.2)) are minimised at $\theta_1 = \theta_2 = 0$, with values $\kappa_{10} = 7.64$ and $\kappa_{01} = 3.29$, respectively. So the third-order truncated saddlepoint approximation exists only for $x \ge 8$, $y \ge 4$, and we therefore have to select appropriately large initial values for $\tilde{\kappa}_{40}$ and $\tilde{\kappa}_{04}$ in order to ensure existence over all $x, y \ge 0$.

This example clearly provides a tough test for our truncation procedure, and it is by no means certain that an appropriate p.d.f. can be generated which possesses sufficiently precise first-, second- and third-order cumulants. The problem is that we implicitly take fifth- and higher-order cumulants to be zero, even though the true (but unknown) values may be extremely large. So it does not automatically follow that manipulation of the first- to fourth-order cumulants will enable this potentially huge discrepancy to be completely counterbalanced. Note that this dilemma (often conveniently ignored!) is central to all moment closure problems.

Table 2
Comparison of target $\{\kappa_{ij}\}$ and 'best fit' least-squares cumulants $\{\hat{\kappa}_{ij}\}$ (over $0 \le x \le 200$; $0 \le y \le 120$), together with the 'working' cumulants $\{\tilde{\kappa}_{ij}\}$, for Example B (cumulant structure of Matis et al. [19])

ij   $\kappa_{ij}$ (target)   $\hat{\kappa}_{ij}$ (best fit)   $\tilde{\kappa}_{ij}$ (working)
10   7.64      8.11      5.20
01   3.29      3.97      1.75
20   40.17     39.03     50.94
11   15.32     18.07     31.55
02   15.91     18.46     20.11
30   388.9     388.86    357.69
21   148.3     150.87    166.93
12   107.2     103.89    131.09
03   138.9     138.08    133.30
40             4409.40   7715.35
31             1199.63   12.96
22             665.13    −352.59
13             629.29    35.27

First, we need to acquire seed values for $\kappa_{40}, \ldots, \kappa_{04}$, and examining the change in cumulant magnitude from first- to second- and third-order suggests the initial 'guestimates' $\kappa_{40} = 8000$, $\kappa_{31} = \kappa_{04} = 3000$ and $\kappa_{22} = \kappa_{13} = 2000$. However, over the domain $0 \le x \le 200$, $0 \le y \le 120$ ($\tilde{f}(x, y)$ is negligibly small above this region), the resulting value of $\sum \tilde{f}(x, y) = 0.6$ (step 2a). Surprisingly, this probability sum is extremely sensitive to multiplying each of the fourth-order cumulants by a common factor. These two features clearly justify the use of step 2b. Whereas in Example A coarse-tuning swiftly led to $\max|\hat{\kappa}_{ij} - \kappa_{ij}| < 0.03$, here convergence virtually ceased at 8. The difference is that the proximity of the means to the axes makes the algorithm far more susceptible to fourth-order cumulants. Moreover, care needs to be taken at the outset to choose initial values of $\tilde{\kappa}_{40}$ and $\tilde{\kappa}_{04}$ that are unlikely to lead to the algorithm becoming trapped in the wrong region of parameter space; here we retained the start values 8000 and 3000, respectively, and took $\tilde{\kappa}_{31} = \tilde{\kappa}_{22} = \tilde{\kappa}_{13} = 0$. The extremely high level of cumulant sensitivity means that a change to one can seriously impact several others, so once coarse-tuning drives $\tilde{\kappa}_{40}, \ldots, \tilde{\kappa}_{04}$ towards the vicinity of 'proper' values, it becomes necessary to switch to fine-tuning. The resulting time to convergence is therefore far longer than occurs under test case A.

Table 2 shows the result of applying the algorithm over $0 \le x \le 200$; $0 \le y \le 120$. Whilst the comparison between the target cumulants $\kappa_{ij}$ (column 2) and the best least-squares fit saddlepoint cumulants $\hat{\kappa}_{ij}$ (column 3) is reasonably good, unlike Example A matching is by no means exact. The problem is that the resulting p.d.f. has a very sharp peak near to the axes, which means that higher-order cumulants now play a much more dominant role. Moreover, the least-squares surface is too flat to allow the algorithm to snake its way through local minima so it becomes trapped before full convergence can occur. Ways of refining the algorithm to circumvent this problem are currently under investigation: the initial choice of $\tilde{\kappa}_{40}, \ldots, \tilde{\kappa}_{04}$ appears to be crucial. The corresponding scaled p.d.f. $\hat{f}(x, y)$ (Fig. 2(a)) highlights the proximity of the peak to the $x$-axis, whilst the roughly planar structure of $\log_{10}\hat{f}(x, y)$ (Fig. 2(b)) shows that outside this small region which contains the main mass of probability the p.d.f. decays slightly faster than exponential.

Attempts to fit a scaled Normal p.d.f. (i.e., with $\tilde{\kappa}_{30} = \cdots = \tilde{\kappa}_{04} = 0$) over $x, y \ge 0$ failed due to the low value of $\kappa_{01}/\sqrt{\kappa_{02}} = 0.83$. For this forces a substantial part of the resulting p.d.f. to lie outwith the quadrant $x, y \ge 0$, whence too little structure remains to accommodate all five first- and second-order moments. Even allowing $\tilde{\kappa}_{30}, \ldots, \tilde{\kappa}_{04}$ to be free-ranging results in convergence difficulties, with the means becoming trapped away from their target values.

Numerical investigation shows that the general problem of applying the truncated saddlepoint approach when either $\kappa_{10}/\sqrt{\kappa_{20}}$ or $\kappa_{01}/\sqrt{\kappa_{02}}$ is small is clearly non-trivial, and alternative search strategies, probably based on more efficient hill-climbing procedures, need to be developed. Since (3.1)–(3.5) exert strong functional relations between the cumulants, employing a straight grid search for both the full and Normal cases is not straightforward, though ways of circumventing such difficulties are currently being studied. For example, in the Normal case evaluating $S$ (step 4 of the algorithm) over all $3^5 = 243$ ways that $\tilde{\kappa}_{ij}$ can be perturbed by say $-0.1, 0, 0.1$ is computationally feasible on a PC, and confirms that $S$ continues to decrease as $\tilde{\kappa}_{10}$ and $\tilde{\kappa}_{01}$ become increasingly negative. Unfortunately, as this happens the unscaled $\sum_{x,y \ge 0} \tilde{f}(x, y) \to 0$, and the resultant continual increase in the scaling factor prevents $\tilde{\kappa}_{10}$ and $\tilde{\kappa}_{01}$ from stabilising. Effective use

demand a high-speed computer, since a single cycle invokes $3^{14} = 4\,782\,969$ steps. Note that simple trial hybrid schemes, involving sequential cycling through $\tilde{\kappa}_{10}, \ldots, \tilde{\kappa}_{04}$ combined with random choice of the perturbation, e.g., $\pm 0.1$, do not seem to avoid the problem of entrapment in local least-squares minima.

5. Fourth-order truncation: a spatial example

We have seen that provided that the means are not too close to the axes (Example B), the saddlepoint approach is a powerful way of constructing p.d.f.s which possess desired cumulant structure (e.g., Example A). What we shall now show is that it also provides an excellent way of deriving (approximate) probability solutions to otherwise intractable stochastic processes.

Fig. 2. Estimated p.d.f., (a) $\hat{f}(x, y)$ and (b) $\log_{10}\hat{f}(x, y)$, for the Matis et al. [19] cumulant structure (Example B) using

One of the earliest such models to receive extensive investigation (see [27]) was the two-colony birth–death-immigration–migration process. Although the (non-spatial) birth–death-immigration process is relatively simple to solve, this two-colony version is analytically intractable to direct solution and so provides a good test-bed for the saddlepoint approach. Whilst fairly crude approximate probability expressions can be derived [24], acceptably precise closed form theoretical solutions cannot be obtained. So if the saddlepoint approach can yield a reasonably concise algebraic form for the p.d.f., we will have made a substantial advance. Individuals in colony $i = 1, 2$ are assumed to give birth at rate $\lambda_i$, die at rate $\mu_i$ and migrate to colony $j \ne i$ at rate $\nu_i$, and new

Whereas Eq. (2.12) is easily solved, Eq. (5.1) is intractable to direct solution even though it represents one of the simplest types of bivariate stochastic scenarios. This suggests that virtually all multivariate stochastic population processes of interest (see [27] for examples) will not have solutions capable of being expressed in closed form.

Numerical solutions for the population size probabilities $\{p_{ij}(t)\}$ $(i, j = 0, 1, \ldots;\ t \ge 0)$ can be derived by writing the Kolmogorov forward difference-differential equations in the discrete form (5.2) for $p_{ij}(t)$, with $t = h, 2h, \ldots$ and appropriately small $h$. However, as $i, j$ cover all non-negative integers we have to bound the solution over a finite range $i = 0, \ldots, M$; $j = 0, \ldots, N$, and $M$ and $N$ might need to be extremely large to obtain the required level of precision. So computational feasibility is by no means guaranteed. Moreover, to retain a set of 'closed' equations we need to modify the process at the boundaries, here by prohibiting birth, migration and immigration into $i = M$ or $j = N$. To obtain fast convergence to the equilibrium probabilities $\pi_{ij} \equiv p_{ij}(\infty)$ we take $h$ as large as possible subject to the probabilities remaining positive, whilst to obtain $p_{ij}(t)$ we choose $h$ to balance precision against compute time. Note that although more sophisticated numerical techniques could be employed, all we require here is a benchmark against which to test the saddlepoint approach.

mean values 25 lie four standard deviations away from the axes, and so the problems associated with Example B should not arise. Now superimpose migration at rates $\nu_1 = 0.5$ and $\nu_2 = 0.3$. Then differentiating (5.1) by taking $\partial^{r+s}/\partial\theta_1^r\,\partial\theta_2^s$ and placing $\theta_1 = \theta_2 = 0$ yields a set of first-order linear differential equations for the cumulants $\kappa_{rs}$ whose solution does not necessitate moment closure (for examples involving first- and second-order moments see [23–26]). In our case, on writing

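$$\kappa_{10} = (\alpha_2\nu_2 - \alpha_1\xi_2)/(\xi_1\xi_2 - \nu_1\nu_2)$$

expressions for $\kappa_{01}$, $\kappa_{02}$, $\kappa_{12}$ and $\kappa_{03}$ are symmetric versions of the above. Given that both colonies experience equilibrium when $\nu_1 = \nu_2 = 0$, equilibrium will also exist under migration; in general, the necessary and sufficient condition is that $AB \ge C$, where $A = \mu_1 + \nu_1 - \lambda_1$; $B = \mu_2 + \nu_2 - \lambda_2$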

Solving the probability Eq. (5.2) over the finite array with $M = N = 100$ yields the equilibrium cumulants shown in column 2 of Table 3 (Example C). The values for $\kappa_{10}, \ldots, \kappa_{02}$ agree to at least six decimal places with the exact cumulants calculated from (5.3), and so the bounded stochastic process clearly provides an excellent approximation to the true probabilities. Though the higher-order cumulants $\kappa_{30}, \ldots, \kappa_{04}$ can be derived theoretically from (5.1) (via (5.4) and its fourth-order counterpart), for our current needs we simply evaluate direct numerical values. As before, using second- and fourth-order truncation produces saddlepoint values over the full $x, y \ge 0$ range, whilst third-order truncation does not. Faced with this situation in Sections 3 and 4 we had to 'guess' likely values for the fourth-order cumulants and proceed accordingly; since here we know the exact values we can insert them directly. So in step 4 of the algorithm $S$ can now take the form $S = \sum_{i,j=0}^{4} (\hat{\kappa}_{ij} - \kappa_{ij})^2$. As the process does not possess any 'awkward' characteristics, use of the algorithm results in fast convergence to the precise target cumulants (differences of order $10^{-6}$ are obtained extremely rapidly). All the associated working cumulants $\{\tilde{\kappa}_{ij}\}$ (column 3 of Table 3) show close agreement with the target values, except for $\tilde{\kappa}_{40}$ and $\tilde{\kappa}_{04}$, which parallels what we observed in Example A. Since only minute scaling is required (from 0.999974 to 1), this discrepancy is presumably the compensating effect for implicitly taking all fifth- and higher-order cumulants to be zero. The Normal approximation (columns 4 and 5) shows similar structure, with convergence to exact target cumulants now being obtained even faster; for third- and fourth-order cumulants, only $\hat{\kappa}_{40}$ and $\hat{\kappa}_{04}$ lie away from zero.

Table 3
Comparison of target cumulants $\{\kappa_{ij}\}$ (column 2) for the two-site birth–death-migration process and 'best fit' fourth-order working cumulants $\{\tilde{\kappa}_{ij}\}$ over $0 \le x, y \le 100$ (column 3); estimated cumulants $\{\hat{\kappa}_{ij}\}$ are exact$^{a}$

To compare the associated probabilities, consider the benchmark probabilities at the lower tail point (10, 10), the mean value (22, 28) and the upper tail point (50, 50).

Whilst the second-order (i.e., Normal) form works well in the centre of the distribution, it clearly performs extremely poorly in the tails. This contrasts markedly with the vastly superior performance of the fourth-order approximation; there is still good agreement at the upper tail point (50, 50) even though this is (5.26, 3.35) standard deviations above the mean. Fig. 3 shows the exact, fourth-order and second-order probability and log-probability plots, and whilst the three p.d.f.s appear (visually) similar, switching to a log-scale highlights the difference in tail structure. Note the severe drop of the exact p.d.f. towards $p(0, 0) = 0.5910 \times 10^{-17}$, which cannot be matched by even the fourth-order approximation. However, for all practical purposes this probability is so small as to be irrelevant, and does not in any way reduce the excellence of the fourth-order truncation approach.
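A numerical benchmark of the kind used in this section can be produced by stepping the forward equations with a small time increment $h$ on a bounded array. The sketch below is a plain explicit Euler scheme under stated assumptions: the transition rates are the standard ones for a two-colony birth–death-immigration–migration process (birth $\lambda_i n_i$, death $\mu_i n_i$, immigration $\alpha_i$, migration $\nu_i n_i$), any transition that would leave the array $0, \ldots, M$ by $0, \ldots, N$ is simply blocked (the boundary rule in the paper may differ in detail), and the name `euler_step` is illustrative.

```python
def euler_step(p, lam, mu, alpha, nu, h):
    """One explicit Euler step of the forward equations on a bounded grid.

    p[i][j] is the probability of sizes (i, j); lam, mu, alpha, nu are pairs of
    birth, death, immigration and migration rates for colonies 1 and 2.
    """
    M, N = len(p) - 1, len(p[0]) - 1
    new = [row[:] for row in p]
    for i in range(M + 1):
        for j in range(N + 1):
            mass = p[i][j]
            if mass == 0.0:
                continue
            moves = [
                (i + 1, j, lam[0] * i + alpha[0]),   # birth or immigration, colony 1
                (i - 1, j, mu[0] * i),               # death, colony 1
                (i, j + 1, lam[1] * j + alpha[1]),   # birth or immigration, colony 2
                (i, j - 1, mu[1] * j),               # death, colony 2
                (i - 1, j + 1, nu[0] * i),           # migration from colony 1 to 2
                (i + 1, j - 1, nu[1] * j),           # migration from colony 2 to 1
            ]
            for a, b, rate in moves:
                # transitions that would leave the array are prohibited
                if rate > 0.0 and 0 <= a <= M and 0 <= b <= N:
                    flow = h * rate * mass
                    new[i][j] -= flow
                    new[a][b] += flow
    return new
```

Because every unit of probability removed from one cell is added to another, the total stays at one; the probabilities remain non-negative provided $h$ times the largest total exit rate on the grid does not exceed one, which mirrors the remark about taking $h$ as large as possible subject to positivity.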

6. Conclusions

Although the Normal approximation plays a central role in the applied statistician's armory, its implementation by population modellers is much more daunting, both technically and philosophically. Even for the simple linear birth–death process, for which the mean and variance are trivial to write down, the fact that at time $t$ the population size p.d.f. $\{p_i(t)\}$ comprises a curve over $i > 0$ together with a probability spike (corresponding to extinction) at $i = 0$, means that when considering the former, care must be taken to use the mean and variance conditional on non-extinction at time $t$, and not the overall mean and variance. Moreover, unless $p_1(0) = 1$, the closed form solution for $p_i(t)$ is opaque, and if we attempt to construct a simpler approximate distribution which brings in higher-order moments, how do we know we have chosen the best one? The problem for multi-variable processes (such as (2.17)) is even more severe, since there is no known standard form which will encompass even general third-order moments. In the $n$-variate negative binomial distribution [9], for example, the third-order moments are defined explicitly in terms of the first- and second-order moments.

Point        Location      Exact              Fourth-order                  Second-order

p(10, 10)    Lower tail    0.3259 × 10^-5     0.3760 × 10^-5 (+15.4%)       0.1389 × 10^-4 (+326%)

p(22, 28)    Mean value    0.4517 × 10^-2     0.4496 × 10^-2 (−0.46%)       0.4564 × 10^-2 (+1.04%)

p(50, 50)    Upper tail    0.1063 × 10^-7     0.8833 × 10^-8

Fig. 3. Comparative fits to the birth–death-immigration–migration process (Example C) with $\lambda_1 = \lambda_2 = 0.36$, $\mu_1 = \mu_2 = 1$, $\alpha_1 = \alpha_2 = 16$, $\nu_1 = 0.5$ and $\nu_2 = 0.3$: (a) exact p.d.f. $f(x,y)$ and (b) $\log_{10} f(x,y)$; (c) fourth-order p.d.f.

For the case of non-linear processes, which comprise the vast bulk of population biology, equations for say first- and second-order moments will almost always involve higher-order moments. The standard procedure of applying the Normal approximation is to replace all these higher-order moments by zero, yet this is only reasonable if the Normal form provides a perfect fit; cumulants can `explode' even under very mild perturbations from Normality. So, if we wish to construct a p.d.f. that provides a good description of even the basic univariate logistic process, then choice of an optimal set of higher-order moments is very much an open question. Although this problem of moment closure was introduced by Whittle [32], in the context of obtaining a multivariate Normal approximation to the distribution of a non-linear Markov process (see references in [11]), it unfortunately received only limited attention for some considerable time thereafter. Only recently has interest in developing p.d.fs for non-linear stochastic population processes gained strength. In addition to the models of Matis et al. referenced above, good examples are provided by Isham [10] for epidemic models, Isham [12] for host–macroparasite interaction, Grenfell et al. [6,7] for macro-parasitic infections, and Marion, Renshaw and Gibson [15,16] for modelling nematode infection in ruminants.
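As a worked illustration of why closure is needed, consider a univariate logistic process with birth rate $B(n) = a_1 n - b_1 n^2$ and death rate $D(n) = a_2 n + b_2 n^2$. This parametrisation, and the shorthand $a = a_1 - a_2$, $b = b_1 + b_2$, $c = a_1 + a_2$, $d = b_1 - b_2$, are assumptions made for the example rather than notation taken from the text. Taking expectations of the jump rates gives the first two cumulant equations:

```latex
\begin{align}
\frac{d\kappa_1}{dt} &= a\,\kappa_1 - b\,(\kappa_1^2 + \kappa_2), \\
\frac{d\kappa_2}{dt} &= c\,\kappa_1 - d\,(\kappa_1^2 + \kappa_2)
                        + 2a\,\kappa_2 - 2b\,(2\kappa_1\kappa_2 + \kappa_3).
\end{align}
```

The equation for $\kappa_1$ involves $\kappa_2$, and the equation for $\kappa_2$ involves $\kappa_3$, so the system never closes at any finite order. The Normal approximation sets $\kappa_3 = 0$ in the second equation; a higher-order closure would instead retain $\kappa_3$ (and possibly $\kappa_4$) and truncate later.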

There are several key advantages of using the saddlepoint approach over the more usual procedure of investigating families of distributions. First, there is no need to make any distributional assumptions whatsoever. Second, it works regardless of the number of `species' involved. Third, we can inject any higher-order cumulant structure into the algorithm that we deem appropriate, in contrast to the popular approach of restricting attention to the Normal approximation. Fourth, we obtain an algebraic form for the associated p.d.f. irrespective of whether we do or do not have complete knowledge of the cumulants, through (2.1) and (2.2) and (3.1)–(3.5), respectively.
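The way a truncated cumulant structure yields an algebraic p.d.f. can be sketched in a few lines of Python. The code below is not the algorithm of Sections 2 and 3; it is a minimal illustration of the textbook bivariate saddlepoint density, $f(x,y) \approx \exp\{K(\theta) - \theta_1 x - \theta_2 y\} / \{2\pi \sqrt{\det K''(\theta)}\}$ with $\partial K/\partial\theta = (x, y)$, applied to a cumulant generating function truncated to whichever cumulants are supplied. The function names, the dictionary layout `kappa[(i, j)]` and the numerical cumulant values are all hypothetical.

```python
import math
import numpy as np


def cgf_derivative(kappa, theta, d1=0, d2=0):
    """Partial derivative of K = sum kappa[i, j] t1^i t2^j / (i! j!) of order (d1, d2)."""
    t1, t2 = theta
    total = 0.0
    for (i, j), k in kappa.items():
        if i >= d1 and j >= d2:
            total += (k * t1 ** (i - d1) * t2 ** (j - d2)
                      / (math.factorial(i - d1) * math.factorial(j - d2)))
    return total


def cgf_hessian(kappa, theta):
    k11 = cgf_derivative(kappa, theta, 1, 1)
    return np.array([[cgf_derivative(kappa, theta, 2, 0), k11],
                     [k11, cgf_derivative(kappa, theta, 0, 2)]])


def saddlepoint_pdf(kappa, x, y, max_iter=50, tol=1e-12):
    """Truncated bivariate saddlepoint density at (x, y), solved by Newton iteration."""
    target = np.array([x, y], dtype=float)
    theta = np.zeros(2)
    for _ in range(max_iter):
        grad = np.array([cgf_derivative(kappa, theta, 1, 0),
                         cgf_derivative(kappa, theta, 0, 1)])
        step = np.linalg.solve(cgf_hessian(kappa, theta), grad - target)
        theta = theta - step
        if np.max(np.abs(step)) < tol:
            break
    det = np.linalg.det(cgf_hessian(kappa, theta))
    if det <= 0.0:
        # Truncated c.g.f.s need not be convex; this sketch simply gives up.
        raise ValueError("Hessian is not positive definite at the saddlepoint")
    exponent = cgf_derivative(kappa, theta) - float(theta @ target)
    return math.exp(exponent) / (2.0 * math.pi * math.sqrt(det))


# Hypothetical second-order cumulants: means, variances and covariance only.
kappa2 = {(1, 0): 22.0, (0, 1): 28.0, (2, 0): 30.0, (0, 2): 40.0, (1, 1): 10.0}
print(saddlepoint_pdf(kappa2, 25.0, 30.0))
```

With only first- and second-order cumulants the truncated form reproduces the bivariate Normal density exactly, which is why the second-order approximation is referred to as the Normal form. Adding entries such as `(3, 0)` or `(4, 0)` to the dictionary injects higher-order structure without changing any other part of the code.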

In general, systems are unlikely to have probability peaks near an axis, and Examples A (Section 3) and C (Section 5) show that in these, the vast majority of cases, the algorithm converges swiftly. Given full knowledge of the cumulant structure, or even partial knowledge with $\sum \tilde{f}(x,y) \simeq 1$ (step 2a), it can be argued that no rescaling to the p.d.f. $\hat{f}(x,y)$ (step 2b) should be performed, as this may bias values near the mode. Here, we have used rescaling in all three examples in order to demonstrate the effect of its implementation. However, when $\sum \tilde{f}(x,y)$ lies away from one then scaling (step 2b) is essential. The cumulants $\tilde{\kappa}_{40}$ and $\tilde{\kappa}_{04}$, in particular, play a key role in letting the procedure discard higher-order cumulant values. Moreover, subsequent distortion to the remaining 12 $\tilde{\kappa}_{ij}$ cumulant values may be considerable, as substantial probability mass may lie outside the target domain of interest (usually $0 \le x \le M$, $0 \le y \le N$ for appropriate $M$ and $N$). The Matis et al. cumulants (Section 4) were deliberately chosen since their small mean to standard deviation ratio made it highly likely that they would present a tough challenge to the truncated saddlepoint procedure. Although the `best least-squares fit' values $\hat{\kappa}_{ij}$ (column 3 of Table 2) compare favourably with the target cumulants $\kappa_{ij}$ (column 2), differences clearly do occur. Current lines of attack for removing these include strengthening the algorithm through the use of superior hill-climbing techniques, and determining how best to obtain the pivotal cumulant values $\tilde{\kappa}_{40}$ and $\tilde{\kappa}_{04}$. An alternative strategy would be to use Wang's [30] method for determining cumulant tail probabilities, and it would be interesting to compare this approach with that developed here.
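The interplay between the total mass $\sum \tilde{f}(x,y)$, rescaling and the resulting cumulants can be illustrated with a short sketch in the spirit of steps 2a and 2b. It assumes that the unnormalised values $\tilde{f}(x,y)$ are held in an array indexed by $x = 0, \ldots, M$ and $y = 0, \ldots, N$; the function name and the toy grid are hypothetical, and only cumulants up to second order are computed.

```python
import numpy as np


def rescale_and_cumulants(f_tilde):
    """Sum f~ over the grid, rescale to a proper p.d.f., and return low-order cumulants."""
    f_tilde = np.asarray(f_tilde, dtype=float)
    total = f_tilde.sum()                 # mass inside 0 <= x <= M, 0 <= y <= N
    f_hat = f_tilde / total               # rescaled p.d.f. summing to one
    x = np.arange(f_hat.shape[0])
    y = np.arange(f_hat.shape[1])
    px = f_hat.sum(axis=1)                # marginal over y
    py = f_hat.sum(axis=0)                # marginal over x
    k10 = float(x @ px)
    k01 = float(y @ py)
    cumulants = {
        (1, 0): k10,
        (0, 1): k01,
        (2, 0): float(((x - k10) ** 2) @ px),
        (0, 2): float(((y - k01) ** 2) @ py),
        (1, 1): float((x - k10) @ f_hat @ (y - k01)),
    }
    return total, f_hat, cumulants


# Toy 2 x 2 grid whose mass falls short of one.
total, f_hat, cumulants = rescale_and_cumulants([[0.1, 0.2], [0.3, 0.2]])
print(total, cumulants)
```

Comparing the returned cumulants with the target values gives a direct measure of the distortion caused by mass lying outside the domain: the further `total` is from one, the more the rescaled cumulants can drift from those supplied.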


References

[1] O. Barndorff-Nielsen, D.R. Cox, Edgeworth and saddle-point approximations with statistical applications, J. Royal Stat. Soc. B 41 (1979) 279.

[2] H.E. Daniels, Saddlepoint approximations in statistics, Ann. Math. Stat. 25 (1954) 631.

[3] H.E. Daniels, Saddlepoint approximations for estimating equations, Biometrika 70 (1983) 89.

[4] H.E. Daniels, Tail probability approximations, Int. Stat. Rev. 55 (1987) 37.

[5] G.S. Easton, E. Ronchetti, General saddlepoint approximations with applications to L statistics, J. Am. Stat. Assoc. 81 (1986) 420.

[6] B.T. Grenfell, K. Dietz, M.G. Roberts, Modelling the immuno-epidemiology of macroparasites in naturally-fluctuating host populations, in: B.T. Grenfell, A.P. Dobson (Eds.), Ecology of Infectious Diseases in Natural Populations, 1995, pp. 362–383.

[7] B.T. Grenfell, K. Wilson, V.S. Isham, H.E.G. Boyd, K. Dietz, Modelling patterns of parasite aggregation in natural populations: trichostrongylid nematode–ruminant interactions as a case-study, Parasitology 111 (1995) S135.

[8] R. Hengeveld, Dynamics of Biological Invasions, Chapman and Hall, London, 1984.

[9] J. Herbert, V. Isham, Stochastic host–parasite interaction models, J. Math. Biol., 1999 (to appear).

[10] V. Isham, Assessing the variability of stochastic epidemics, Math. Biosci. 107 (1991) 209.

[11] V. Isham, Stochastic models for epidemics with special reference to AIDS, Ann. Appl. Prob. 3 (1993) 1.

[12] V. Isham, Stochastic models of host–macroparasite interaction, Ann. Appl. Prob. 5 (1995) 720.

[13] J.L. Jensen, Saddlepoint Approximations, Oxford University, Oxford, 1995.

[14] R. Lugannani, S. Rice, Saddlepoint approximations for the distribution of the sum of independent random variables, Adv. Appl. Prob. 12 (1980) 475.

[15] G. Marion, E. Renshaw, G. Gibson, Stochastic effects in a model of nematode infection in ruminants, IMA J. Math. Appl. Med. Biol. 15 (1998) 97.

[16] G. Marion, E. Renshaw, G. Gibson, Stochastic modelling of environmental variation for biological populations, Theor. Pop. Biol. 57 (2000) 197.

[17] J.H. Matis, T.R. Kiffe, On approximating the moments of the equilibrium distribution of a stochastic logistic model, Biometrics 52 (1996) 980.

[18] J.H. Matis, T.R. Kiffe, P.R. Parthasarathy, Density-dependent birth–death-migration models, in: XVIIIth International Biometric Conference, vol. I5, Amsterdam, 1996, p. 79.

[19] J.H. Matis, Q. Zheng, T.R. Kiffe, Describing the spread of biological populations using stochastic compartmental models with births, Math. Biosci. 126 (1995) 215.

[20] P. McCullagh, Local sufficiency, Biometrika 71 (1984) 233.

[21] P.S. Puri, Interconnected birth and death processes, J. Appl. Prob. 5 (1968) 334.

[22] N. Reid, Saddlepoint methods and statistical inference, Stat. Sci. 3 (1988) 213.

[23] E. Renshaw, Birth, death and migration processes, Biometrika 59 (1972) 49.

[24] E. Renshaw, Interconnected population processes, J. Appl. Prob. 10 (1973) 1.

[25] E. Renshaw, The effect of migration between two developing populations, Proc. 39th Session Int. Stat. Inst. 2 (1973) 294.

[26] E. Renshaw, Stepping-stone models for population growth, J. Appl. Prob. 11 (1974) 16.

[27] E. Renshaw, A survey of stepping-stone models in population dynamics, Adv. Appl. Prob. 18 (1986) 581.

[28] E. Renshaw, Modelling Biological Populations in Space and Time, Cambridge University, Cambridge, 1991.

[29] E. Renshaw, Saddlepoint approximations for stochastic processes with truncated cumulant generating functions, IMA J. Math. Appl. Med. Biol. 15 (1998) 41.

[30] S. Wang, Saddlepoint approximations for bivariate distributions, J. Appl. Prob. 27 (1990) 586.

[31] S. Wang, General saddlepoint approximations in the bootstrap, Stat. Prob. Lett. 13 (1992) 61.