
Distributions 71

Because these are all greater than the standard value, P = 0.05, we would be justified in assuming a Poisson distribution in each case.

Despite the non-significant P-value for the 5 × 5 grid, Fig. 4.5a is not convincing: the frequencies do not really look like the theoretical values. This is mainly because n = 25 is too few sample units for a good test. There are people who are reluctant to accept the results of a statistical significance test if the data do not look convincing. It is unwise to lean too heavily on a statistical crutch. Statistical tests are most useful when they summarize something that is not difficult to swallow.

Fig. 4.5. Observed frequency distributions for the four grid types shown in Fig. 4.4, with a Poisson distribution fitted to each, with grouping criterion equal to one. (a) 5 × 5; (b) 10 × 10; (c) 20 × 20; (d) 5 × 20.

4.6 Aggregated Spatial Patterns and the Negative Binomial Distribution

We know from experience that pests are often not distributed randomly. They tend to be found in aggregated spatial patterns. There are many biological mechanisms that lead to aggregation: the existence of ‘good’ and ‘bad’ resource patches, settling of offspring close to the parent, mate finding and so on. Likewise, there are many ways in which models for such processes can be combined to obtain probability distributions to account for the effect of spatial aggregation.

What kind of frequency distribution should we expect when we sample an aggregated pattern? It has been found in practice that, where sample units are collected at random and the numbers of pests are spatially aggregated at the scale of the sample units, the variance tends to be greater than the mean. There are many theoretical distributions (in one way or another related to the Poisson distribution) which allow more variability among numbers in sample units: their variances are greater than their means. The probability distribution that is most often used in describing sampling distributions of pests is the negative binomial probability distribution. Because of its mathematical versatility, this distribution has been found to be a powerful workhorse for matching the frequencies of a wide variety of pest distributions in the field. The negative binomial distribution has two parameters, the mean µ, and a parameter k, which is generally called the exponent or clustering parameter of the distribution. The variance is as follows:

$$\sigma^2 = \mu + \frac{\mu^2}{k} \tag{4.9}$$

and the formulae for calculating the negative binomial probabilities are as follows:

$$
\begin{aligned}
p(0 \mid \mu, k) &= \left(\frac{k}{\mu + k}\right)^{k}, \\
p(x \mid \mu, k) &= p(x - 1 \mid \mu, k)\,\frac{k + x - 1}{x}\,\frac{\mu}{\mu + k}, \quad \text{for } x = 1, 2, \ldots
\end{aligned}
\tag{4.10}
$$

If k increases while µ remains constant, the variance decreases. For very large k, the negative binomial distribution is practically indistinguishable from the Poisson distribution. The effects of changing the parameters are illustrated in Fig. 4.6. When µ = 5 (Figs 4.6a and c), the distribution looks like a Poisson distribution if k = 10, but if k = 0.1, the tail of the distribution is very long: although µ is still equal to 5, the probability of getting a count equal to 20 or greater is 0.07 (Fig. 4.6c, box). These figures (Figs 4.6a and c) illustrate how decreasing k increases the variance per sample unit (Equation 4.9): for µ = 5,

k           0.1    1     10
Variance    255    30    7.5

The effect of varying µ while k remains constant is easier to understand (Figs 4.6b and d). As µ increases, the probability of getting a zero count decreases and the distribution is stretched out to the right. Again, these figures illustrate how increasing the mean also increases the variance (Equation 4.9): for k = 1,

µ           1      5     10
Variance    2      30    110
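The variance formula (Equation 4.9) and the recursive probabilities (Equation 4.10) are easy to check numerically. The following is a minimal sketch in Python using only the standard library; the function names `nb_variance` and `nb_probabilities` are illustrative and are not taken from the text.

```python
def nb_variance(mu, k):
    """Variance of the negative binomial distribution (Equation 4.9)."""
    return mu + mu ** 2 / k


def nb_probabilities(mu, k, x_max):
    """Return [p(0), p(1), ..., p(x_max)] using the recursion of Equation 4.10."""
    probs = [(k / (mu + k)) ** k]  # p(0 | mu, k)
    for x in range(1, x_max + 1):
        # p(x) = p(x - 1) * (k + x - 1) / x * mu / (mu + k)
        probs.append(probs[-1] * (k + x - 1) / x * mu / (mu + k))
    return probs


# Variances for mu = 5 and k = 0.1, 1, 10 (first table above)
for k in (0.1, 1, 10):
    print(f"k = {k}: variance = {nb_variance(5, k):.1f}")

# Probability of a count of 20 or greater when mu = 5 and k = 0.1
tail = 1 - sum(nb_probabilities(5, 0.1, 19))
print(f"P(count >= 20) = {tail:.2f}")
```

The tail probability is obtained as one minus the sum of p(0) to p(19), which avoids summing the long right-hand tail directly.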

The versatility of the negative binomial distribution allows it to model Beall’s data very well, where the Poisson distribution fails badly (Fig. 4.7). The mean and variance of the data are 4.74 and 15.00, respectively (see Chapter 2, Exhibit 2.1). The fitted values and tests (grouping so that expected class frequencies ≥ 1) are as follows:

                     µ̂       k̂       X²      df    P
Poisson              4.74    –       6220    12    0.00
Negative binomial    4.74    2.21    30.6    25    0.20

Fig. 4.6. Negative binomial distributions, showing the effect of changing one parameter while keeping the other one fixed. (a) and (b) show probabilities; (c) and (d) show corresponding cumulative probabilities. (a,c) µ = 5 and k = 0.1, 1 and 10; (b,d) k = 1 and µ = 1, 5 and 10.

The value of k̂ is an indication of aggregation at the spatial scale of the sample unit, and for µ near 4.74. If the sample unit is changed or the mean density is different, there is no guarantee that k will remain the same. In fact, there is much evidence in the literature that k changes with sample unit and mean density (see, e.g. Taylor, 1984).
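As a rough cross-check on a fitted k, Equation 4.9 can be solved for k to give a moment estimate, k = µ²/(σ² − µ). The sketch below applies this to the mean and variance quoted for Beall’s data. It is a moment estimate, not the fitting procedure that produced k̂ = 2.21, and the function name `moment_k` is illustrative.

```python
def moment_k(mean, variance):
    """Moment estimate of k, obtained by solving Equation 4.9 for k."""
    if variance <= mean:
        # No overdispersion: the negative binomial approaches the Poisson
        return float("inf")
    return mean ** 2 / (variance - mean)


# Mean and variance of Beall's data as quoted in the text
print(round(moment_k(4.74, 15.00), 2))  # 2.19
```

The moment estimate (2.19) is close to the fitted value (2.21), which is what we would expect when the negative binomial describes the data well.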

Computers can be made to generate aggregated spatial patterns in a number of ways (for details, see the appendix to this chapter). A computer-generated aggregated spatial pattern of points is shown in Fig. 4.8a. When a 36 × 36 grid is superimposed on this ‘field’ to create 1296 sample units, the aggregation among sample units can be easily seen (Fig. 4.8b). We can fit the Poisson and negative binomial distributions to these data by maximum likelihood (Fig. 4.9). We do not need a statistical test here to infer that the points are not randomly distributed (at least, not at the spatial scale of the chosen sample unit), or that the negative binomial distribution is a plausible model for the frequency distribution. Using the spatial pattern in Fig. 4.8a as a basis, we illustrate the effect on k of changing the size and shape of the sample unit.
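To make the idea concrete, here is a minimal sketch of one way to generate an aggregated pattern and count points in a superimposed grid. It is not the method described in the appendix. It assumes a simple cluster process: each point is placed around a randomly chosen ‘parent’ location with a Gaussian displacement, wrapping at the field edges. The number of points (1000), the number of parents (50), the spread (0.05) and all function names are illustrative assumptions.

```python
import random


def clustered_points(n_points=1000, n_parents=50, spread=0.05, seed=1):
    """Generate points clustered around random parents in the unit square."""
    rng = random.Random(seed)
    parents = [(rng.random(), rng.random()) for _ in range(n_parents)]
    points = []
    for _ in range(n_points):
        px, py = rng.choice(parents)
        # Gaussian displacement from the parent, wrapped to stay in the field
        x = (px + rng.gauss(0, spread)) % 1.0
        y = (py + rng.gauss(0, spread)) % 1.0
        points.append((x, y))
    return points


def grid_counts(points, nx, ny):
    """Count the points falling in each cell of an nx by ny grid."""
    counts = [0] * (nx * ny)
    for x, y in points:
        col = min(int(x * nx), nx - 1)  # guard against x rounding up to 1.0
        row = min(int(y * ny), ny - 1)
        counts[row * nx + col] += 1
    return counts


def mean_and_variance(counts):
    """Sample mean and variance (n - 1 divisor) of the counts."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return mean, variance


points = clustered_points()
for nx, ny in [(36, 36), (18, 18), (54, 6)]:
    mean, var = mean_and_variance(grid_counts(points, nx, ny))
    # Moment estimate of k from Equation 4.9; infinite if no overdispersion
    k = mean ** 2 / (var - mean) if var > mean else float("inf")
    print(f"{nx} x {ny}: mean = {mean:.2f}, variance = {var:.2f}, k = {k:.2f}")
```

With an aggregated pattern the variance among sample units exceeds the mean. Changing the grid passed to `grid_counts` shows how a moment estimate of k responds to the size and shape of the sample unit for this particular simulated pattern. The values will not match those in the text, which come from a different pattern and from maximum likelihood fits.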

Fig. 4.7. Fitting probability distributions to Beall’s data (Fig. 2.2). (a) Poisson distribution, µ̂ = 4.74; (b) negative binomial, µ̂ = 4.74, k̂ = 2.21.

Fig. 4.8. (a) A computer-generated spatial pattern. (b) A representation of the pattern in terms of a grid (36 × 36) of sample units, where each small dot represents the presence of one point in the sample unit, and each increase in size represents the presence of one more point.


Exhibit 4.2. The effect of sample unit size and shape on the negative binomial k

The spatial pattern of points displayed in Fig. 4.8a was subdivided into sample units by grids of various shapes and sizes. The negative binomial distribution was fitted to each frequency distribution in turn.

Initially, all sample units were square, with equal numbers of sample units (nx and ny) along each axis:

Grid (nx × ny)    n = nx × ny    µ̂       k̂       X²      df    P
9 × 9             81             12.4    3.09    35.1    31    0.28
18 × 18           324            3.10    2.01    17.2    10    0.07
27 × 27           729            1.37    1.74    8.15    6     0.23
36 × 36           1296           0.77    1.70    5.3     4     0.25
48 × 48           2304           0.43    1.77    3.58    3     0.31

It is always a good idea to check where a test has produced a significance probability near the critical value (we use P = 0.05 here as the critical value). Therefore, before proceeding further, we need to check on ‘18 × 18’ (P = 0.07). The frequency distribution and the fitted frequencies are shown in Fig. 4.10a. No wild departure is evident (the worst fits are for counts 5 and 8), so we can proceed to use the above results. The value of k tends to increase as the size of the sample unit increases: it is about 1.7 for the three finest grids and 3.09 for the 9 × 9 grid, which has the largest sample units. Therefore, the aggregation decreases as sample unit size increases. A plausible reason for this is that a larger sample unit contains more centres of aggregation in the spatial pattern, partially ‘averaging out’ the aggregation effect. Although there is no general guarantee that increasing the size of the sample unit will increase the value of k, it often occurs in practice.

Fig. 4.9. Fitting distributions to the data shown in Fig. 4.8b. (a) Poisson; (b) negative binomial. Fitted frequencies are shown as circles.

What happens when the shape of the sample unit is changed? We can investigate this by keeping the size (area) of the sample unit constant, but adjusting the grids to make rectangular sample units. Based on the sample unit size for ‘18 × 18’ (324 sample units in each grid), we find the following:

Grid (nx × ny)    µ̂       k̂       X²      df    P
3 × 108           3.09    4.12    12.2    9     0.20
4 × 81            3.10    4.86    5.76    8     0.67
6 × 54            3.09    3.39    10.6    10    0.39
12 × 27           3.09    2.04    10.1    11    0.52
18 × 18           3.10    2.01    17.2    10    0.07
27 × 12           3.09    2.15    13.8    11    0.24
54 × 6            3.09    3.19    15.2    9     0.09
81 × 4            3.09    4.11    7.31    10    0.70
108 × 3           3.09    7.06    9.3     8     0.32

Again, we must check where near-significance occurred. A plot of the frequencies and the fitted values for ‘54 × 6’ is shown in Fig. 4.10b. There is no obvious feature which should make us want to reject the fit, so we can continue. There appears to be a trend for k to increase as the shape of the sample unit becomes more elongated, in either direction. When the pattern was originally generated, no purposeful ‘directional’ pattern was imposed. A plausible reason for increasing k is that as the length of the sample unit grows, the sample unit itself contains parts of more and more centres of aggregation, thus attaining, in part, some of the attributes of a larger sample unit. From the previous discussion on the effect of sample unit size on k, we might therefore expect a larger k as the rectangular shape of the sample unit becomes more elongated.

Fig. 4.10. Fitting the negative binomial distribution to the pattern in Fig. 4.8a subdivided into different sized sample units. (a) 18 × 18 superimposed grid; (b) 54 × 6 superimposed grid. Fitted frequencies are shown as circles.

4.6.1 Estimating the negative binomial k from a variance–mean relationship

Exactly how the value of k changes as the sample unit size changes depends on the type and degree of aggregation in the spatial pattern. Certainly, k depends on the spatial aggregation and on the size and shape of the sample unit. Estimating aggregation, per se, is beyond the scope of this book. For further discussion on measures of aggregation see, for example, Pielou (1977).

At first sight, the results presented in Exhibit 4.2 may appear disturbing. On the one hand, the negative binomial distribution is versatile enough to fit the frequency distributions of many species of interest to pest managers. On the other hand, the parameter k can vary depending merely on the size and shape of whatever sample unit is used. Have we taken one step forward and one step backward?

The answer is that we have not, because once we have decided on a sample unit (based on concepts discussed in Chapter 2) we are generally not interested in changing its size and shape. If there are several sample units which satisfy the criteria noted in Chapter 2, we can always compare their precisions and choose the best. But once the sample unit has been agreed upon, we are then only concerned about how changes in the mean per sample unit, µ, might affect k. Here we can use one of the variance–mean relationships discussed in Chapter 3, namely

$$\text{IVM:} \quad \sigma^2 = (a + 1)\mu + (b - 1)\mu^2 \tag{4.11}$$

$$\text{TPL:} \quad \sigma^2 = a\mu^b \tag{4.12}$$

Using what is called the ‘method of moments’, we can relate either of these equations to Equation 4.9, to obtain an estimate of k:

$$\text{IVM:} \quad (a + 1)\mu + (b - 1)\mu^2 = \mu + \frac{\mu^2}{k}, \quad \text{leading to} \quad k = \frac{\mu}{a + (b - 1)\mu} \tag{4.13}$$

or

$$\text{TPL:} \quad a\mu^b = \mu + \frac{\mu^2}{k}, \quad \text{leading to} \quad k = \frac{\mu^2}{a\mu^b - \mu} \tag{4.14}$$

Once the parameters of either IVM or TPL are estimated, we can estimate k for any value of µ, the mean per sample unit, and we do not need to worry about finding models to justify it (Binns, 1986). Therefore, if we have a good estimate of IVM or of TPL, we can estimate probability of decision (OC) functions for a range of mean densities using the negative binomial distribution. We shall use Equation 4.14 frequently throughout the book.
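Equating a variance–mean relationship with the negative binomial variance, µ + µ²/k, gives k as a function of the mean: k = µ²/(aµ^b − µ) for TPL and k = µ/(a + (b − 1)µ) for IVM. The sketch below evaluates both expressions. The parameter values in the usage lines are hypothetical, chosen only so that the two relationships give the same variance at µ = 4, and the function names are illustrative.

```python
def k_from_tpl(mu, a, b):
    """k implied by TPL, sigma^2 = a * mu**b, via the method of moments."""
    variance = a * mu ** b
    if variance <= mu:
        # TPL predicts no overdispersion here: treat as the Poisson limit
        return float("inf")
    return mu ** 2 / (variance - mu)


def k_from_ivm(mu, a, b):
    """k implied by IVM, sigma^2 = (a + 1) * mu + (b - 1) * mu**2."""
    denominator = a + (b - 1) * mu
    if denominator <= 0:
        # IVM predicts no overdispersion here: treat as the Poisson limit
        return float("inf")
    return mu / denominator


# Hypothetical parameters: both give sigma^2 = 16 at mu = 4
print(round(k_from_tpl(4, a=2, b=1.5), 3))  # 1.333
print(round(k_from_ivm(4, a=1, b=1.5), 3))  # 1.333
```

Returning infinity when the predicted variance does not exceed the mean is a design choice made here: in that range the negative binomial has no finite positive k, and the Poisson distribution is the natural limiting case.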

In our examples so far, we have fitted the Poisson and negative binomial distributions to all of the data. That is, we have used the data from all the sample units superimposed on a spatial pattern. In a sense, therefore, we have been working with the ‘true’ values in the field. In practice, however, only some of the sample units are collected and analysed. Can we compare the fits that we obtain from all the data with those we might obtain when we collect only some of the sample units?