Calculating a Probability of Decision (OC) Function

Basic Concepts of Sampling

2.11 Calculating a Probability of Decision (OC) Function

size, n, is increased. As the sample size is increased, the standard deviation of the mean, sd_m, decreases. As a result, the standardized form of cd, (cd − µ)/sd_m, increases in absolute value (unless µ = cd). This causes the spread of values in the second line of Table 2.2 to become larger, which in turn causes the spread of probabilities in the third line to shrink. The end result is that the OC function becomes steeper about cd as the sample size increases. This is illustrated in Exhibit 2.2.

The OC function for the sampling plan used to classify beetle densities with respect to cd can also be determined using simulation. To simulate the OC function, the following procedure is used:

1. A range of true means (µ’s) is specified for which OC values are to be generated.

2. For each value of µ, a random variate is generated from a normal distribution with mean µ and standard deviation σ/√n, where σ is the population standard deviation. Each of these random variates represents a sample mean, m.

3. If m ≤ cd, µ is classified as less than or equal to cd and the recommendation is to do nothing; otherwise, µ is classified as greater than cd.

4. Steps 2 and 3 are repeated several times and the proportion of times µ is classified as less than or equal to cd is determined. This is the simulated OC value for that value of µ.

5. Steps 2–4 are repeated for each µ in the range of true means.

How close the simulated OC is to the true OC depends on the variability in the sample estimate and on the number of times steps 2 and 3 are repeated in step 4. This is illustrated in Exhibit 2.2.
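The five steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the authors' program: the function name `simulate_oc`, its parameter names and the optional `seed` argument are assumptions introduced here, and only the standard library is used.

```python
import random
from math import sqrt


def simulate_oc(mus, cd, sigma, n, replicates, seed=None):
    """Simulate OC values for a list of true means (hypothetical helper).

    mus        -- true means for which OC values are wanted (step 1)
    cd         -- critical density
    sigma      -- population standard deviation
    n          -- sample size behind each simulated sample mean
    replicates -- number of simulated sample means per true mean
    """
    rng = random.Random(seed)
    sd_m = sigma / sqrt(n)                  # standard deviation of the mean
    oc = {}
    for mu in mus:                          # step 5: loop over the true means
        at_or_below = 0
        for _ in range(replicates):         # step 4: repeat steps 2 and 3
            m = rng.gauss(mu, sd_m)         # step 2: one simulated sample mean
            if m <= cd:                     # step 3: classify with respect to cd
                at_or_below += 1
        oc[mu] = at_or_below / replicates   # simulated OC value for this mu
    return oc


if __name__ == "__main__":
    # Illustrative inputs: cd = 3.5, variance 15 and n = 25 as in Exhibit 2.2.
    result = simulate_oc([2, 3, 3.5, 4, 5], cd=3.5, sigma=sqrt(15), n=25,
                         replicates=250, seed=1)
    for mu, value in result.items():
        print(mu, round(value, 2))
```

Increasing `replicates` should make the simulated values settle closer to the analytical OC, at the cost of more computation.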


Table 2.3. OC function values when an estimated mean is used to classify density with respect to a critical density, cd, equal to 3.5. A sample size of 25 is used, the sample mean is assumed to be normally distributed and sd_m = 0.774.

µ                            2     2.5   3     3.25  3.5   3.75  4     4.5   5
(cd − µ)/sd_m                1.94  1.29  0.65  0.32  0     −0.3  −0.7  −1.3  −1.9
OC = P(z ≤ (cd − µ)/sd_m)    0.97  0.9   0.74  0.63  0.5   0.37  0.23  0.1   0
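The calculation behind Table 2.3 can be sketched with Python's standard library. This is only an illustration: the function name `oc_value` is an assumption, and σ² = 15 is taken from Exhibit 2.2 because, with n = 25, it gives a standard deviation of the mean close to the quoted sd_m = 0.774. Small differences from the tabulated values may arise from rounding.

```python
from math import sqrt
from statistics import NormalDist


def oc_value(mu, cd, sigma2, n):
    """Analytical OC value: P(sample mean <= cd) when the true mean is mu."""
    sd_m = sqrt(sigma2 / n)           # standard deviation of the sample mean
    z = (cd - mu) / sd_m              # standardized form of cd
    return NormalDist().cdf(z)        # standard normal cumulative probability


if __name__ == "__main__":
    cd, sigma2, n = 3.5, 15.0, 25     # assumed inputs, see the note above
    sd_m = sqrt(sigma2 / n)
    for mu in (2, 2.5, 3, 3.25, 3.5, 3.75, 4, 4.5, 5):
        z = (cd - mu) / sd_m
        print(f"mu={mu:<5} z={z:6.2f}  OC={oc_value(mu, cd, sigma2, n):.2f}")
```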

Exhibit 2.2. Simple OC functions based on the Central Limit Theorem

For sampling plans that classify the population density on the basis of an estimate of the mean, the OC is the probability of a sample mean, normally distributed with mean µ and standard deviation σ/√n, being less than cd. This probability can be found by determining the normal cumulative probability distribution value for these parameters. Because σ/√n decreases with increasing sample size, the OC function becomes steeper as the sample size is increased. This is illustrated in Fig. 2.8, in which OC functions are shown for three sample sizes, 25, 50 and 100, when cd = 3.5 and σ² = 15.
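As a rough worked illustration of the steepening, the standard deviation of the mean and the OC value can be evaluated at a single true mean for the three sample sizes. The choice µ = 3 is an assumption made here purely for illustration; cd = 3.5 and σ² = 15 are the values quoted above, Φ denotes the standard normal cumulative distribution function, and all figures are rounded.

```latex
\begin{align*}
n &= 25:  & \sigma/\sqrt{n} &= \sqrt{15/25} \approx 0.775,  & OC(3) &= \Phi\!\left(\frac{3.5 - 3}{0.775}\right) \approx \Phi(0.65) \approx 0.74 \\
n &= 50:  & \sigma/\sqrt{n} &= \sqrt{15/50} \approx 0.548,  & OC(3) &= \Phi\!\left(\frac{3.5 - 3}{0.548}\right) \approx \Phi(0.91) \approx 0.82 \\
n &= 100: & \sigma/\sqrt{n} &= \sqrt{15/100} \approx 0.387, & OC(3) &= \Phi\!\left(\frac{3.5 - 3}{0.387}\right) \approx \Phi(1.29) \approx 0.90
\end{align*}
```

For a true mean just below cd, the probability of classifying it as below cd therefore rises towards 1 as n grows, which is what a steeper OC function means.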

OC functions can also be calculated using simulation. The normal distribution with the parameters µ and σ/√n provides a model for sample means, each calculated from a set of n sample units. Many such random numbers can be generated and the proportion of times these numbers are less than cd is the OC value for a particular µ. Simulation for a range of µ’s results in a set of proportions that constitute an OC function. When simulation is used to determine OC values, the number of times sampling is simulated for each value of µ determines how accurate (smooth) the OC function will be. Three OC functions, one calculated analytically and two using simulation, for n = 25 and σ² = 15 are shown in Fig. 2.9. Those determined by simulation used either 25 or 250 simulation replicates for each OC value. Note that when 250 simulation replicates were used, the simulated OC is close to the analytical result. The number of simulation replicates required to obtain a smooth OC function varies depending on the variability in the sample data.

Fig. 2.8. OC functions computed using a normal cumulative distribution function for three sample sizes when cd = 3.5 and σ² = 15: ____, n = 25; ----, n = 50; - - -, n = 100. (Horizontal axis: population mean, 0 to 8; vertical axis: OC, 0 to 1.0.)

The accuracy of a simulated OC function can be predicted to a certain extent. Equation 2.2 can be used to estimate the variance of any one simulated OC value, if the variance of the true value (p) is known. We shall show in Chapter 4 that a simulated OC value follows a binomial distribution with expectation p and variance p(1 − p)/sr, where sr is the number of simulation replicates. It follows from the Central Limit Theorem applied to the simulation process that the standard error of a simulated OC value is

$$\text{simulation se} = \sqrt{\frac{p(1 - p)}{sr}} \qquad (2.7)$$

and, for two-thirds of the time, the simulated OC should be within one standard error of the true OC, as noted above (Section 2.9). This is illustrated for the OC function in Fig. 2.9, which is based on 25 simulation replicates: the simulated values were mostly within one standard error of the true values (Fig. 2.10). When OC functions, estimated by simulation, are compared with each other, their accuracy should be noted before bold statements are made.
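To give a feel for the size of this standard error, Equation 2.7 can be evaluated for the two replicate numbers used in Fig. 2.9. The sketch below is illustrative only: the function name `simulation_se` is an assumption, and p = 0.5 is chosen because it is the true OC value at which the standard error is largest.

```python
from math import sqrt


def simulation_se(p, sr):
    """Standard error of a simulated OC value, following Equation 2.7.

    p  -- true OC value (a probability between 0 and 1)
    sr -- number of simulation replicates
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if sr <= 0:
        raise ValueError("sr must be positive")
    return sqrt(p * (1.0 - p) / sr)


if __name__ == "__main__":
    for sr in (25, 250):
        # p = 0.5 is an illustrative worst case, not a value from the text
        print(sr, round(simulation_se(0.5, sr), 3))
```

Because the standard error falls with the square root of sr, a tenfold increase in replicates narrows it by a factor of about 3.2.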

Fig. 2.9. OC functions computed using a normal cumulative distribution function and via simulation for n = 25, cd = 3.5 and σ² = 15: determined using the normal distribution function (___), using simulation with 25 replicates (---) and using simulation with 250 replicates (- - - -). (Horizontal axis: population mean, 0 to 8; vertical axis: OC, 0 to 1.0.)

Fig. 2.10. Predicting the accuracy of the simulated OC functions in Fig. 2.9. The OC function is computed using the normal distribution (____), with error bars indicating one standard error around it based on 25 replicates, and the simulated OC function is based on 25 replicates (- - - -).

For readers who have worked with statistical hypothesis testing and statistical significance, the OC function may not be new. An experiment could be set up to examine whether the mean of a test population or process (e.g. the survival rate of an insect under stress) is, or is not, less than a certain specified value, C. We plan to assume a normal distribution for the data, with standard deviation equal to σ, and use a 5% significance test (note that for a standardized normal distribution, the probability of getting a value less than −1.645 is equal to 0.05). What this means is that after the experiment, we intend to state that the true mean of the test population is significantly less than C if the experimental mean is less than C − 1.645σ/√n. Depending on the true mean, the experiment will have varying probabilities of finding significance. A plot of these probabilities against true mean values is called the power function of the experiment.

Making a pest management decision on the basis of a sample mean, as in Exhibit 2.2, is similar to doing such an experiment. If cd = C − 1.645σ/√n, the OC function of Exhibit 2.2 and the power function for such an experiment are the same. The plans discussed in Exhibit 2.2 can be regarded as designs for experiments to test whether the true mean of pests per sample unit is, or is not, less than C; it depends on how you view what is going on. Looked at the other way, there are many statisticians who advocate looking at the entire power function for a proposed experiment, rather than at the probabilities of accepting or rejecting one specific (often artificial) null hypothesis. Tukey (1950) drew attention to the similarity of OC functions and power functions.
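The equivalence can be written out in a few lines. This is a sketch under the assumptions stated in the text (normally distributed sample mean m with standard deviation σ/√n and a one-sided 5% test), reading the condition for equivalence as cd = C − 1.645σ/√n; Φ denotes the standard normal cumulative distribution function.

```latex
\begin{align*}
\text{power}(\mu) &= P\!\left(m < C - 1.645\,\sigma/\sqrt{n}\right)
                   = \Phi\!\left(\frac{C - 1.645\,\sigma/\sqrt{n} - \mu}{\sigma/\sqrt{n}}\right), \\
OC(\mu)           &= P\!\left(m \le cd\right)
                   = \Phi\!\left(\frac{cd - \mu}{\sigma/\sqrt{n}}\right).
\end{align*}
```

Substituting cd = C − 1.645σ/√n into the second line gives the first, so the two curves coincide for every µ. At µ = C both equal Φ(−1.645) ≈ 0.05, the significance level of the test.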

Five key ideas have been presented in this chapter:

1. Bias is an omnipresent factor when sampling for pest management. It is important to consider data collection carefully, so that bias does not unwittingly influence the sample outcome.

2. A trustworthy sample satisfies four criteria: it should be representative, reliable, relevant and practically feasible.

3. The precision of an estimated population parameter increases as the sample size increases. This is because the variance of an estimated parameter decreases with increasing sample size.

4. The distribution of sample means is described by a normal distribution function provided that a minimum sample size is used. Often, n > 25 is sufficient for the normal approximation to be acceptable.

5. OC functions for fixed samples that classify density based on an estimated mean can be calculated using the normal cumulative distribution function. OC functions for these sampling plans can also be generated by simulation.

Beall, G. (1939) Methods of estimating the population of insects in a field. Biometrika 30, 422–439.

Brenner, R.J., Focks, D.A., Arbogast, R.T., Weaver, D.K. and Shuman, D. (1998) Practical use of spatial analysis in precision targeting for integrated pest management. American Entomologist 4, 79–101.

Cressie, N.A.C. (1993) Statistics for Spatial Data. John Wiley, New York, 900 pp.

Dent, D. (ed.) (1995) Integrated Pest Management. Chapman & Hall, London, 356 pp.

Guttierez, A.P. (1995) Integrated pest management in cotton. In: Dent, D. (ed.) Integrated Pest Management. Chapman & Hall, London.

Legg, D.E., Shufran, K.A. and Yeargan, K.V. (1985) Evaluation of two sampling methods for their influence on the population statistics of alfalfa weevil (Coleoptera: Curculionidae) larva infestation in alfalfa. Journal of Economic Entomology 78, 1468–1474.

Milikowski, M. (1995) Knowledge of Numbers: a Study of the Psychological Representation of the Numbers 1–100. University of Amsterdam, The Netherlands, 135 pp.

Page, E.S. (1967) A note on generating random permutations. Applied Statistics 16, 273–274.

Schotzko, D.J. and O’Keeffe, L.E. (1989) Geostatistical distribution of the spatial distribution of Lygus hesperus (Heteroptera: Miridae) in lentils. Journal of Economic Entomology 82, 1277–1288.

Schotzko, D.J. and O’Keeffe, L.E. (1990) Effect of sample placement on the geostatistical


From the document SAMPLING AND MONITORING IN CROP PROTECTION: The Theoretical Basis for Developing Practical Decision Guides (pages 46–51)