4.10 The Beta-binomial Distribution

$$\text{mean incidence} = p, \qquad \text{variance} = \frac{p(1-p)}{R}\,\bigl[1 + \rho(R-1)\bigr] \tag{4.21}$$

where

$$p = \frac{\alpha}{\alpha+\beta}, \qquad \rho = \frac{1}{\alpha+\beta+1},$$

and where p is the overall probability of a sample unit being infected, and ρ is the intra-cluster correlation. Typical shapes of the beta-binomial distribution for p = 0.05, 0.5, 0.9 and for ρ = 0.02, 0.1, 0.33, 0.7 are shown in Figs 4.14a–d. The corresponding mathematical parameters α and β are as follows:

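| Panel | ρ | Parameter | p = 0.05 | p = 0.5 | p = 0.9 |
| --- | --- | --- | --- | --- | --- |
| (a) | 0.02 | α | 2.45 | 24.5 | 44.1 |
| | | β | 46.55 | 24.5 | 4.9 |
| (b) | 0.1 | α | 0.45 | 4.5 | 8.1 |
| | | β | 8.55 | 4.5 | 0.9 |
| (c) | 0.33 | β | 1.929 | 1.015 | 0.203 |
| (d) | 0.7 | α | 0.021 | 0.214 | 0.386 |
| | | β | 0.407 | 0.214 | 0.043 |

Fig. 4.14. Beta-binomial distributions with cluster size equal to 10. Each graph presents distributions for p = 0.05 (▫), 0.5 and 0.9. (a) ρ = 0.02; (b) ρ = 0.1; (c) ρ = 0.33; (d) ρ = 0.7.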
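The tabulated α and β values follow from inverting the two relations given with Equation 4.21, which gives α + β = 1/ρ − 1, α = p(1/ρ − 1) and β = (1 − p)(1/ρ − 1). The sketch below is only an illustration of that inversion; the function name `beta_params` is ours and does not come from the text.

```python
def beta_params(p, rho):
    """Return (alpha, beta) for incidence p and intra-cluster correlation rho.

    Inverts p = alpha / (alpha + beta) and rho = 1 / (alpha + beta + 1).
    """
    if not (0.0 < p < 1.0 and 0.0 < rho < 1.0):
        raise ValueError("p and rho must lie strictly between 0 and 1")
    total = 1.0 / rho - 1.0          # alpha + beta
    return p * total, (1.0 - p) * total


# Recompute the parameter combinations used for Fig. 4.14.
for rho in (0.02, 0.1, 0.33, 0.7):
    for p in (0.05, 0.5, 0.9):
        alpha, beta = beta_params(p, rho)
        print(f"rho={rho:<5} p={p:<5} alpha={alpha:.3f} beta={beta:.3f}")
```

Rounded to the precision of the table, the printed values should agree with the entries listed for panels (a), (b) and (d).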

Note that when ρ is small (ρ = 0.02), the distributions (Fig. 4.14a) are not very different from the corresponding binomial distributions (Fig. 4.13). When ρ is large, the distribution becomes U-shaped, even for small or large values of p (Fig. 4.14d).
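The change of shape can be checked numerically. The following is a minimal sketch that evaluates the beta-binomial probabilities for a cluster of R = 10 using only the standard library, and compares the variance of the proportion infected with Equation 4.21. The function names are illustrative and are not taken from the text.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def betabinom_pmf(k, R, p, rho):
    """Probability of k infected units in a cluster of R sample units."""
    total = 1.0 / rho - 1.0                      # alpha + beta
    alpha, beta = p * total, (1.0 - p) * total
    log_choose = lgamma(R + 1) - lgamma(k + 1) - lgamma(R - k + 1)
    return exp(log_choose + log_beta(k + alpha, R - k + beta) - log_beta(alpha, beta))


def cluster_summary(R, p, rho):
    """Return the pmf plus the mean and variance of the proportion infected."""
    pmf = [betabinom_pmf(k, R, p, rho) for k in range(R + 1)]
    mean = sum(k / R * q for k, q in enumerate(pmf))
    var = sum((k / R - mean) ** 2 * q for k, q in enumerate(pmf))
    return pmf, mean, var


R = 10
for rho in (0.02, 0.7):
    pmf, mean, var = cluster_summary(R, 0.05, rho)
    expected_var = 0.05 * 0.95 / R * (1 + rho * (R - 1))   # Equation 4.21
    print(f"rho={rho}: mean={mean:.4f} var={var:.5f} (Eq. 4.21: {expected_var:.5f})")
    print("  pmf:", " ".join(f"{q:.3f}" for q in pmf))
```

With ρ = 0.7 the probability of all ten units being infected exceeds that of nine, which is the U shape described above; with ρ = 0.02 the probabilities fall away steadily in the upper tail.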

Distributions 85

Exhibit 4.4. The description of the number of virus-infected plants per cluster in sugar beet

As part of a study on the spread of beet yellows virus in experimentally inoculated fields of sugar beet in The Netherlands, symptoms were noted on all plants in 21 rows of one of the fields. The degree of aggregation was estimated in terms of the intra-cluster correlation, ρ.

The disease was clearly aggregated (Fig. 4.15). Plants were allocated to clusters of 10 consecutive individual plants within rows (Fig. 4.16). The binomial and beta-binomial distributions were fitted to the numbers of affected plants in each cluster.
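The text does not state which fitting method was used. As a rough illustration only, p and ρ can be estimated by the method of moments, solving Equation 4.21 for ρ after replacing the variance by the observed variance of the proportions. This is an assumed alternative, not the exhibit's procedure, and the counts below are hypothetical values, not the sugar beet data.

```python
from statistics import fmean, pvariance


def moment_estimates(counts, R):
    """Estimate (p, rho) from the numbers of infected units per cluster of R."""
    props = [c / R for c in counts]
    p_hat = fmean(props)
    binomial_var = p_hat * (1.0 - p_hat) / R
    if binomial_var == 0.0:
        raise ValueError("all units healthy or all infected: rho is undefined")
    # Equation 4.21 rearranged: rho = (variance / binomial variance - 1) / (R - 1)
    rho_hat = (pvariance(props) / binomial_var - 1.0) / (R - 1)
    return p_hat, rho_hat


# Hypothetical counts of diseased plants in clusters of 10 plants.
counts = [0, 0, 0, 1, 0, 7, 9, 0, 2, 0, 0, 10]
p_hat, rho_hat = moment_estimates(counts, R=10)
print(f"p = {p_hat:.3f}, rho = {rho_hat:.3f}")
```

The estimator uses the population variance (`pvariance`), so clusters that are either entirely healthy or entirely infected give ρ = 1 exactly, and identical clusters give the lower limit −1/(R − 1).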

The binomial distribution is clearly unable to model the data, but the beta-binomial distribution provides a good fit (Fig. 4.17). The fitted parameters were as follows:

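| Distribution | p | ρ | α | β | X² | df | P |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Binomial | 0.195 | 0 | – | – | 1417 | 5 | < 0.001 |
| Beta-binomial | 0.193 | 0.291 | 0.472 | 1.968 | 9.6 | 8 | 0.29 |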
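The fitted beta-binomial values can be cross-checked against one another. The sketch below recovers p and ρ from the fitted α and β using the relations given with Equation 4.21, and recomputes the P value for X² = 9.6 on 8 degrees of freedom. The chi-square helper relies on a closed form that holds only for even degrees of freedom, so it is not a general-purpose routine and cannot be applied to the binomial fit (df = 5). Function names are illustrative.

```python
from math import exp, factorial


def incidence_and_correlation(alpha, beta):
    """Return (p, rho) from the beta-binomial parameters alpha and beta."""
    return alpha / (alpha + beta), 1.0 / (alpha + beta + 1.0)


def chi2_upper_tail_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid only for even df."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    return exp(-half) * sum(half**i / factorial(i) for i in range(df // 2))


p, rho = incidence_and_correlation(0.472, 1.968)
print(round(p, 3), round(rho, 3))                  # 0.193 0.291
print(round(chi2_upper_tail_even_df(9.6, 8), 2))   # 0.29
```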

The beta-binomial distribution fits the data with different cluster sizes also. The value of p does not change much, but the value of ρ does:

| Cluster size, R | p | ρ |
| --- | --- | --- |
| 1 × 5 (R = 5) | 0.195 | 0.363 |
| 1 × 10 (R = 10) | 0.193 | 0.291 |
| 1 × 15 (R = 15) | 0.195 | 0.270 |
| 3 × 5 (R = 15) | 0.197 | 0.294 |

The effect of cluster size on the value of ρ is investigated in more depth in the next exhibit.

Fig. 4.15. The incidence of beet yellows virus on individual sugarbeet plants in 21 rows of 240 plants.

Fig. 4.16. The numbers of diseased sugarbeet plants in clusters of 10 plants in each row of the field depicted in Fig. 4.15. Each small dot represents a cluster with one diseased plant, and each increase in size of the dot represents one or two more diseased plants in a cluster: 1, 2, 3 or 4, 5 or 6, 7 or 8, 9 or 10.

The intra-cluster correlation, ρ, is not estimated during routine pest management sampling, but is important in defining the variance of the incidence estimate (Equation 4.21). Intuitively, the average correlation among sample units in a cluster should decrease as the cluster size increases, because of distance. This attenuation of the correlation as cluster size increases is illustrated in the following exhibit.
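To see why ρ matters for the variance of the incidence estimate, the sketch below applies Equation 4.21 to a sample of several clusters. It assumes the clusters are independent of one another, so that the variance of the mean is the per-cluster variance divided by the number of clusters; that assumption, the function names and the choice of 30 clusters are ours. The factor 1 + ρ(R − 1) is what survey sampling texts often call the design effect.

```python
from math import sqrt


def design_effect(rho, R):
    """Variance inflation relative to the binomial: 1 + rho * (R - 1)."""
    return 1.0 + rho * (R - 1)


def incidence_standard_error(p, rho, R, n_clusters):
    """Standard error of mean incidence from n_clusters independent clusters."""
    cluster_var = p * (1.0 - p) / R * design_effect(rho, R)   # Equation 4.21
    return sqrt(cluster_var / n_clusters)


def effective_sample_size(rho, R, n_clusters):
    """Number of independent sample units carrying the same information."""
    return n_clusters * R / design_effect(rho, R)


# Fitted values from Exhibit 4.4 (R = 10) with a hypothetical 30 clusters.
p, rho, R, n_clusters = 0.193, 0.291, 10, 30
print(f"design effect         = {design_effect(rho, R):.3f}")
print(f"SE, clustered         = {incidence_standard_error(p, rho, R, n_clusters):.4f}")
print(f"SE, rho = 0           = {incidence_standard_error(p, 0.0, R, n_clusters):.4f}")
print(f"effective sample size = {effective_sample_size(rho, R, n_clusters):.1f}")
```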

Fig. 4.17. Fitting the data shown in Fig. 4.16 to the binomial (a) and the beta-binomial (b) distributions, using a grouping criterion equal to 1. Fitted frequencies are shown as circles.

Exhibit 4.5. The effect of cluster size and shape on the intra-cluster correlation

Under random sampling, any aggregation among sample units is immaterial, and the binomial distribution is appropriate. If for any reason sampling is not random, then the binomial distribution cannot be assumed to hold. In particular, if sample units are chosen in clusters of adjacent sample units, the beta-binomial distribution should be used. The grid of sample units shown in Fig. 4.8b was rendered into a pattern of incidence by allocating 0 to units with 0 or 1 points and 1 to units with at least two points (Fig. 4.18). The effect of cluster size and shape on the intra-class correlation, ρ, was studied for this spatial pattern of incidence. The results are shown in Table 4.2.

The estimate of incidence was always near 0.20, but the estimate of ρ depended on the size and shape of the cluster. The larger clusters (12 sample units) tended to have lower values of ρ than the smaller clusters (six sample units). This is not surprising: the aggregation in the spatial pattern is local (as we noted above when we used the same data to study the negative binomial distribution), and the correlation between sample units decreases as the distance between them increases. As clusters became more elongated, ρ decreased. This result is similar to that noted in Exhibit 4.2, and the reason for the decrease is the same: clusters with longer edges cover more sources of aggregation, and thus average them out.

Fig. 4.18. The spatial pattern derived from Fig. 4.8b. Each sample unit in Fig. 4.8b containing two or more individual points is plotted here as a point. All other sample units are blank.

Table 4.2. Estimates of incidence and intra-cluster correlation for different sizes and shapes of clusters, used to sample the computer-generated spatial pattern in Fig. 4.18. All frequency distributions fitted the beta-binomial distribution, but none fitted the binomial distribution.

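Number of sample units along each axis, with estimates and approximate standard errors:

| Units along x-axis | Units along y-axis | p | se(p) | ρ | se(ρ) |
| --- | --- | --- | --- | --- | --- |
| 4 | 1 | 0.200 | 0.014 | 0.166 | 0.032 |
| 2 | 2 | 0.201 | 0.014 | 0.217 | 0.035 |
| 1 | 4 | 0.201 | 0.013 | 0.156 | 0.032 |
| 6 | 1 | 0.201 | 0.014 | 0.098 | 0.026 |
| 3 | 2 | 0.200 | 0.015 | 0.174 | 0.031 |
| 2 | 3 | 0.200 | 0.015 | 0.165 | 0.030 |
| 1 | 6 | 0.202 | 0.014 | 0.113 | 0.026 |
| 12 | 1 | 0.201 | 0.016 | 0.103 | 0.024 |
| 6 | 2 | 0.201 | 0.017 | 0.113 | 0.026 |
| 4 | 3 | 0.199 | 0.018 | 0.149 | 0.029 |
| 3 | 4 | 0.200 | 0.018 | 0.136 | 0.028 |
| 2 | 6 | 0.200 | 0.016 | 0.107 | 0.025 |
| 1 | 12 | 0.202 | 0.015 | 0.075 | 0.021 |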
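The two trends described in Exhibit 4.5 can be read directly from Table 4.2. The short sketch below transcribes the ρ column and summarizes it by cluster size and by shape; it is a descriptive check only, and the helper names are ours.

```python
from collections import defaultdict
from statistics import fmean

# (units along x, units along y, estimated rho) transcribed from Table 4.2
TABLE_4_2 = [
    (4, 1, 0.166), (2, 2, 0.217), (1, 4, 0.156),
    (6, 1, 0.098), (3, 2, 0.174), (2, 3, 0.165), (1, 6, 0.113),
    (12, 1, 0.103), (6, 2, 0.113), (4, 3, 0.149),
    (3, 4, 0.136), (2, 6, 0.107), (1, 12, 0.075),
]


def mean_rho_by_size(rows):
    """Average rho over all shapes sharing the same number of sample units."""
    groups = defaultdict(list)
    for nx, ny, rho in rows:
        groups[nx * ny].append(rho)
    return {size: fmean(values) for size, values in sorted(groups.items())}


def most_correlated_shape(rows, size):
    """Return the (nx, ny) shape with the largest rho for a given cluster size."""
    candidates = [(rho, nx, ny) for nx, ny, rho in rows if nx * ny == size]
    rho, nx, ny = max(candidates)
    return nx, ny


for size, mean_rho in mean_rho_by_size(TABLE_4_2).items():
    shape = most_correlated_shape(TABLE_4_2, size)
    print(f"{size:>2} units: mean rho = {mean_rho:.3f}, largest rho for shape {shape}")
```

The mean ρ falls as the cluster grows from 4 to 6 to 12 units, and within each size the most compact shape has the largest ρ.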

As with the negative binomial distribution, we are left with the knowledge that the beta-binomial distribution is able to model incidence very well, but do we need to estimate ρ every time? A model relating the beta-binomial variance to the binomial variance has been proposed by Hughes et al. (1996) which can do for the beta-binomial distribution what TPL can do for the negative binomial distribution (Section 4.5.1). Hughes et al. (1996) showed that the relationship

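$$\text{beta-binomial variance} = A\,(\text{binomial variance})^{b}, \quad \text{where binomial variance} = \frac{p(1-p)}{R} \tag{4.22}$$

fitted a number of data sets, and proposed it as a general formula. We retain their original notation. On the basis of this formula and Equation 4.21, the intra-class correlation can be calculated as

$$\rho = \frac{1}{R-1}\left(\frac{\text{beta-binomial variance}}{\text{binomial variance}} - 1\right) = \frac{1}{R-1}\left(\frac{A R^{1-b}}{[p(1-p)]^{1-b}} - 1\right) \tag{4.23}$$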
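As a small illustration of how Equations 4.22 and 4.23 might be used, the sketch below computes ρ from p, R and the two parameters A and b, once through the variance ratio and once through the closed form, so that the two routes can be compared. The values of A and b are hypothetical placeholders, not estimates from Hughes et al. (1996) or from this chapter.

```python
def rho_from_variance_ratio(p, R, A, b):
    """Intra-cluster correlation via the variance ratio (first form of Eq. 4.23)."""
    if R < 2:
        raise ValueError("R must be at least 2")
    binomial_var = p * (1.0 - p) / R
    beta_binomial_var = A * binomial_var**b            # Equation 4.22
    return (beta_binomial_var / binomial_var - 1.0) / (R - 1)


def rho_closed_form(p, R, A, b):
    """Intra-cluster correlation via the closed form (second form of Eq. 4.23)."""
    if R < 2:
        raise ValueError("R must be at least 2")
    return (A * R ** (1.0 - b) / (p * (1.0 - p)) ** (1.0 - b) - 1.0) / (R - 1)


# Hypothetical parameter values chosen only to exercise the formulas.
A, b, R = 2.0, 1.1, 10
for p in (0.05, 0.2, 0.5):
    print(f"p={p}: rho = {rho_from_variance_ratio(p, R, A, b):.4f} "
          f"(closed form {rho_closed_form(p, R, A, b):.4f})")
```

With A = 1 and b = 1 the beta-binomial variance equals the binomial variance and ρ is zero. Nothing in the formula itself keeps ρ within a sensible range, so values computed for arbitrary A and b should be checked before use.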

For certain choices of parameters, the distributions in this chapter may be indistinguishable from each other.

The negative binomial distribution with mean µ approaches the Poisson distribution with mean µ as k increases. For example, compare Fig. 4.2 (diamond) with Fig. 4.6a (diamond).
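This convergence can be illustrated numerically. The sketch below assumes the usual parameterization of the negative binomial by its mean µ and the parameter k, and reports the largest difference between the two sets of probabilities for increasing k. The mean of 5 is an arbitrary illustrative value, and the function names are ours.

```python
from math import exp, lgamma, log


def poisson_pmf(x, mu):
    """Poisson probability of observing x with mean mu."""
    return exp(x * log(mu) - mu - lgamma(x + 1))


def negbin_pmf(x, mu, k):
    """Negative binomial probability of x, parameterized by mean mu and k."""
    log_p = (lgamma(k + x) - lgamma(k) - lgamma(x + 1)
             + k * log(k / (k + mu)) + x * log(mu / (k + mu)))
    return exp(log_p)


def max_abs_difference(mu, k, x_max=60):
    """Largest absolute gap between the two pmfs over x = 0, ..., x_max."""
    return max(abs(negbin_pmf(x, mu, k) - poisson_pmf(x, mu))
               for x in range(x_max + 1))


mu = 5.0   # arbitrary illustrative mean
for k in (1, 10, 100, 1000):
    print(f"k={k:>4}: largest difference = {max_abs_difference(mu, k):.5f}")
```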

When sampling to detect a rare event (binomial distribution with very small p), the total number of sample units where the rare event is observed (X_n out of a total of n sample units) is approximately Poisson with mean equal to np. This is exemplified in Fig. 4.19. When p = 0.025 and n = 40, the Poisson and binomial distributions are indistinguishable. When p is increased to 0.125 there is little difference between Poisson and binomial. When p = 0.25, there begins to be a noticeable difference, but it is still not great. The binomial distribution is more


From the document SAMPLING AND MONITORING IN CROP PROTECTION: The Theoretical Basis for Developing Practical Decision Guides (pages 95-101)