
4.3 Random Patterns and the Poisson Probability Distribution

units (Fig. 4.1b). The spatial pattern of numbers in sample units can be displayed graphically as dots whose sizes represent the numbers of individual organisms in the sample units (Fig. 4.1c). The frequency distribution of these numbers can be calculated as in Chapter 2 (Fig. 4.1d). What kind of frequency distribution should one expect when points are spread out randomly like this?

Statistical theory shows that, when events occur at random locations in a region, the number of events which can be found in any specified area within the region follows a Poisson probability distribution. This implies that we can predict the shape of the frequency distribution if we know the rate at which the events occur or, equivalently, if we know the average number of events in a unit of area.
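A random pattern like that of Fig. 4.1 can be simulated in a few lines. The sketch below uses only the standard library; the 500 points and the 20 × 20 grid (400 sample units) follow the text, while the random seed is an arbitrary choice for reproducibility:

```python
import random
from collections import Counter

# Scatter 500 points at random over a field divided into a 20 x 20 grid
# (400 sample units), as in Fig. 4.1, and count the points per sample unit.
rng = random.Random(42)  # arbitrary seed, for reproducibility
counts = Counter()
for _ in range(500):
    cell = (rng.randrange(20), rng.randrange(20))
    counts[cell] += 1

# Frequency distribution: how many of the 400 units hold 0, 1, 2, ... points
freq = Counter(counts.values())
freq[0] = 400 - len(counts)  # sample units that received no points at all
```

Tabulating `freq` against the number of points per unit gives the bar chart of Fig. 4.1d for this particular simulation run.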

This average number of events constitutes the only parameter of the Poisson distribution. It is called the mean and is denoted by the Greek symbol µ. The Poisson probabilities of x = 0, 1, 2, … events per sample unit are defined as follows:

Fig. 4.1. (a) A random spatial pattern of points. (b) The random pattern in (a) with a 20 × 20 grid superimposed. (c) A spatial pattern derived from (b) as a summary of (a). (d) Frequencies derived from (b) (bars) and fitted Poisson frequencies.

p(x | µ) = e^(−µ) µ^x / x!         (4.3)

where x! denotes the product of the first x integers:

x! = 1 × 2 × 3 × … × x

The notation p(x|µ) is a common way of writing formulae for probabilities. It should be understood as the probability of getting x individual organisms in a sample unit when the parameter is equal to µ. The vertical bar is a convenient way of separating the data (x) from the parameter (µ). Typical shapes of the Poisson distribution are shown in Fig. 4.2.

Fig. 4.2. Poisson distributions with µ = 0.5, 4 and 20.

On the basis of Equations 4.2 and 4.3, we can compare the frequency distribution obtained by our simulated random process with the expected frequencies for a Poisson distribution. The parameter µ is 1.25 because, in Fig. 4.1, 500 points were placed in a field with n = 400 sample units. With µ = 1.25, the expected frequencies can be calculated from Equation 4.3:

E(f_x) = n p(x | 1.25) = 400 e^(−1.25) 1.25^x / x!         (4.4)

A comparison between the simulated results and theory is shown in Fig. 4.1d. The process of finding the theoretical model which is closest to the data is called model fitting. Here we have fitted the Poisson probability distribution to the observed frequencies. The fitting procedure is quite simple, because we need to calculate only the mean number of individuals per sample unit. This mean is an unbiased estimator of µ. We usually denote it by the arabic equivalent of µ, namely m.

How does this help us make decisions? From the decision-making perspective, the most important help it gives is that we now know the variance, σ², of the number of individuals in a sample unit. For a Poisson distribution, the variance is equal to the mean: σ² = µ. With this knowledge, we can predict the variance of the mean for samples of any size n as in Chapter 2. Therefore, assuming that we can use the normal distribution approximation as in Chapter 2, we can go as far as calculating probability of decision or operating characteristic (OC) functions.
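The fitted frequencies of Fig. 4.1d can be reproduced directly from Equation 4.3. This is a minimal sketch using only the standard library; µ = 1.25 and n = 400 come from the text, and the range of counts shown (0 to 7) is a convenient cut-off:

```python
import math

def poisson_pmf(x, mu):
    """Equation 4.3: p(x | mu) = e^(-mu) * mu^x / x!"""
    return math.exp(-mu) * mu ** x / math.factorial(x)

# Expected frequencies (Equation 4.4) for n = 400 sample units and mu = 1.25,
# as in the Poisson fit of Fig. 4.1d.
n, mu = 400, 1.25
expected = {x: n * poisson_pmf(x, mu) for x in range(8)}
```

Because the mean m of the observed counts is an unbiased estimator of µ, fitting the Poisson distribution to data amounts to substituting m for µ in `poisson_pmf`.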

The probability distribution that we have obtained depends on the size and shape of the sample unit. As a consequence, the OC function relates to the sample unit used in the preliminary work, and cannot be used directly for a different sample unit. For instance, when counting pests on a pair of adjacent plants, rather than on one plant, the mean, µ, changes to 2µ. Likewise, the variance, σ², changes to 2µ. This new mean and variance would have to be used for calculating the OC function. Therefore, if a Poisson distribution can be assumed for the new sample unit, these few mathematical calculations are all that is required to adjust the OC function.
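The calculation can be sketched as follows. The decision rule, the action threshold c = 1.5 and the sample size n = 50 are illustrative values, not from the text; what the sketch shows is that, under the normal approximation of Chapter 2, the OC function needs only µ and σ² = µ, and that switching to a pair-of-plants sample unit simply replaces µ by 2µ:

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def oc(mu, threshold, n):
    """P(sample mean <= threshold | true mean mu) under the normal
    approximation, using Var(count) = mu for a Poisson sample unit,
    so Var(sample mean) = mu / n."""
    return normal_cdf((threshold - mu) / math.sqrt(mu / n))

# Illustrative decision rule: intervene when the sample mean exceeds c = 1.5
# per plant, based on n = 50 single-plant sample units.
oc_single = oc(1.25, 1.5, 50)
# Pair-of-plants sample unit: mean and variance both become 2 * mu,
# and the threshold on the pair scale doubles as well.
oc_pair = oc(2 * 1.25, 2 * 1.5, 50)
```

Here `oc(µ, c, n)` is the probability of deciding that intervention is not needed when the true mean is µ; evaluating it over a range of µ values traces out the whole OC curve.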

4.4 Fitting a Distribution to Data Using the Maximum Likelihood
In the previous section, we used the sample mean to estimate the parameter µ of the Poisson distribution without indicating why this was a good idea. A motivation can be given by the maximum likelihood principle. According to this principle, the overall probability (or ‘likelihood’) of getting the data which were in fact obtained is calculated, and the model parameters for which this ‘likelihood’ is maximized are defined as the maximum likelihood estimates. These estimates have many desirable properties. For example, in most situations which might be encountered by pest managers, these estimates have the highest precision. Other properties are beyond the scope of this book, and are discussed in statistical textbooks. In general, the maximum likelihood principle forms the basis for estimating parameters of probability distributions. As an aside, it is also the principle that leads to least squares estimates in linear regression.

Suppose that we have sample data X1, X2, …, Xn, and we want to fit some kind of probability distribution p(x|θ) with parameter θ. Then we should look for the parameter value that gives the highest value for the product

L(X1, X2, …, Xn | θ) = p(X1|θ) p(X2|θ) … p(Xn|θ)         (4.5)

This product, L(X1, X2, …, Xn | θ), is called the likelihood function. Its value depends on the parameter (or parameters) θ and on the sample data. The parameter θ can be the mean, the variance or some other defining characteristic. For the Poisson distribution, the likelihood is

L(X1, X2, …, Xn | µ) = (e^(−µ) µ^X1 / X1!) × (e^(−µ) µ^X2 / X2!) × … × (e^(−µ) µ^Xn / Xn!)
= e^(−nµ) µ^(X1 + X2 + … + Xn) / (X1! X2! … Xn!)         (4.6)
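The principle can be checked numerically. In the sketch below the eight counts are invented for illustration; minimizing the negative logarithm of the likelihood in Equation 4.6 over a grid of candidate µ values recovers the sample mean m:

```python
import math

def neg_log_likelihood(mu, data):
    """Minus the log of Equation 4.6:
    n*mu - (sum of the X_i) * log(mu) + sum of log(X_i!)."""
    return (len(data) * mu
            - sum(data) * math.log(mu)
            + sum(math.log(math.factorial(x)) for x in data))

# Invented counts for eight sample units (illustration only)
data = [0, 2, 1, 3, 1, 0, 2, 1]

# Grid search over candidate values of mu; the maximum likelihood
# estimate should coincide with the sample mean m.
candidates = [i / 100 for i in range(1, 500)]
best = min(candidates, key=lambda mu: neg_log_likelihood(mu, data))
```

Working with the logarithm of L turns the product of Equation 4.5 into a sum, which is numerically far better behaved; since the logarithm is increasing, maximizing log L and maximizing L give the same estimate.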
