The (𝒂, 𝒃, 𝟎) Class - DISCRETE DISTRIBUTIONS


6.5 The (𝒂, 𝒃, 𝟎) Class

The following definition characterizes the members of this class of distributions.

Definition 6.4 Let $p_k$ be the pf of a discrete random variable. It is a member of the $(a, b, 0)$ class of distributions provided that there exist constants $a$ and $b$ such that

$$p_k = \left(a + \frac{b}{k}\right) p_{k-1}, \qquad k = 1, 2, 3, \ldots.$$

This recursion describes the relative size of successive probabilities in the counting distribution. The probability at zero, $p_0$, can be obtained from the recursive formula because the probabilities must sum to 1. The $(a, b, 0)$ class of distributions is a two-parameter class, the two parameters being $a$ and $b$. The following example illustrates these ideas by demonstrating that the binomial distribution is a member of the $(a, b, 0)$ class.
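The role of the normalization can be illustrated with a minimal Python sketch. It builds unnormalized weights from the recursion, starting from a weight of 1 at zero, and then divides by their sum so that the first entry is $p_0$. The function name `ab0_probabilities`, the truncation point `k_max`, and the tolerance `tol` are assumptions made for this illustration and do not come from the text; truncation is reasonable only when the tail beyond `k_max` is negligible.

```python
import math


def ab0_probabilities(a, b, k_max=200, tol=1e-12):
    """Probabilities p_0, p_1, ... implied by p_k = (a + b/k) p_{k-1}.

    The list stops early if a + b/k reaches zero (finite support) and is
    otherwise truncated at k_max, which is an approximation.
    """
    weights = [1.0]  # unnormalized weight for k = 0
    for k in range(1, k_max + 1):
        ratio = a + b / k
        if ratio < -tol:
            # A negative ratio would produce a negative "probability".
            raise ValueError(f"a + b/k is negative at k={k}")
        if ratio <= tol:
            break  # the support ends at k - 1
        weights.append(weights[-1] * ratio)
    total = sum(weights)
    return [w / total for w in weights]


# a = 0, b = 2: the resulting p_0 should be close to exp(-2).
poisson = ab0_probabilities(a=0.0, b=2.0)

# a = -q/(1-q), b = (m+1)q/(1-q) with m = 5, q = 0.3: support is 0,...,5.
m, q = 5, 0.3
binomial = ab0_probabilities(a=-q / (1 - q), b=(m + 1) * q / (1 - q))
```

In this sketch $p_0$ is never supplied directly; it falls out of the requirement that the probabilities sum to 1.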

EXAMPLE 6.2

Demonstrate that the binomial distribution is a member of the $(a, b, 0)$ class.

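The binomial distribution with parameters $m$ and $q$ has probabilities

$$p_k = \binom{m}{k} q^k (1-q)^{m-k}, \qquad k = 0, 1, \ldots, m,$$

and $p_k = 0$, otherwise. The probabilities for $k = 1, 2, \ldots, m$ can be rewritten as

$$\begin{aligned}
p_k &= \frac{m!}{(m-k)!\,k!}\, q^k (1-q)^{m-k} \\
&= \frac{m-k+1}{k}\,\frac{q}{1-q} \left\{ \frac{m!}{[m-(k-1)]!\,(k-1)!}\, q^{k-1} (1-q)^{m-(k-1)} \right\} \\
&= \frac{q}{1-q} \left( -1 + \frac{m+1}{k} \right) p_{k-1}.
\end{aligned}$$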

Hence, $p_k = (a + b/k)\,p_{k-1}$ holds for $k = 1, 2, \ldots, m$ with $a = -q/(1-q)$ and $b = (m+1)q/(1-q)$. To complete the example, we must verify that the recursion holds for $k = m+1, m+2, \ldots$. For $k = m+1$, we have

$$\left(a + \frac{b}{m+1}\right) p_m = \left(-\frac{q}{1-q} + \frac{q}{1-q}\right) p_m = 0 = p_{m+1}.$$

For $k = m+2, m+3, \ldots$ the recursion holds trivially, with both sides clearly being zero. This demonstrates that the binomial distribution is a member of the $(a, b, 0)$ class.

□

As in the above example, substituting in the probability function for the Poisson and negative binomial distributions on each side of the recursive formula in Definition 6.4, with the values of $a$ and $b$ given in Table 6.1, demonstrates that these two distributions are also members of the $(a, b, 0)$ class. In addition, Table 6.1 gives the values of $p_0$, the starting value for the recursion. The geometric distribution, the one-parameter special case ($r = 1$) of the negative binomial distribution, is also in the table.
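The substitution can also be checked numerically. The Python sketch below evaluates both sides of the recursion for the three distributions, using the values of $a$ and $b$ from Table 6.1. The Poisson and negative binomial probability functions are not written out in this section, so the usual forms are assumed here (with $p_0 = e^{-\lambda}$ and $p_0 = (1+\beta)^{-r}$, in agreement with the table). The function names and the parameter values are arbitrary choices for illustration.

```python
import math


def binomial_pf(k, m, q):
    """Binomial pf; zero outside 0,...,m."""
    if k < 0 or k > m:
        return 0.0
    return math.comb(m, k) * q**k * (1 - q) ** (m - k)


def poisson_pf(k, lam):
    """Poisson pf (standard form, assumed)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def negbin_pf(k, r, beta):
    """Negative binomial pf with p_0 = (1 + beta)^(-r) (standard form, assumed)."""
    log_p = (math.lgamma(r + k) - math.lgamma(r) - math.lgamma(k + 1)
             + k * math.log(beta / (1 + beta)) - r * math.log(1 + beta))
    return math.exp(log_p)


def max_recursion_error(pf, a, b, k_max=20):
    """Largest |p_k - (a + b/k) p_{k-1}| over k = 1,...,k_max."""
    return max(abs(pf(k) - (a + b / k) * pf(k - 1)) for k in range(1, k_max + 1))


# Illustrative parameter values (not from the text).
lam, m, q, r, beta = 1.7, 5, 0.3, 0.7, 0.3

errors = {
    "Poisson": max_recursion_error(lambda k: poisson_pf(k, lam), 0.0, lam),
    "Binomial": max_recursion_error(
        lambda k: binomial_pf(k, m, q), -q / (1 - q), (m + 1) * q / (1 - q)),
    "Negative binomial": max_recursion_error(
        lambda k: negbin_pf(k, r, beta),
        beta / (1 + beta), (r - 1) * beta / (1 + beta)),
}
```

Each entry of `errors` should be zero up to floating-point rounding. A check like this supports the algebra but does not replace it.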

It can be shown (see Panjer and Willmot [100, Chapter 6]) that these are the only possible distributions satisfying this recursive formula.

The recursive formula can be rewritten (if $p_{k-1} > 0$) as

$$k\,\frac{p_k}{p_{k-1}} = ak + b, \qquad k = 1, 2, 3, \ldots.$$

The expression on the left-hand side is a linear function in $k$. Note from Table 6.1 that the slope $a$ of the straight line is zero for the Poisson distribution, is negative for the binomial distribution, and is positive for the negative binomial distribution, including the geometric special case. This relationship suggests a graphical way of indicating which of the three distributions might be selected for fitting to data. We begin by plotting

$$k\,\frac{\hat{p}_k}{\hat{p}_{k-1}} = k\,\frac{n_k}{n_{k-1}}$$

against $k$. The observed values should form approximately a straight line if one of these models is to be selected, and the value of the slope should be an indication of which of the models should be selected. Note that this cannot be done if any of the $n_k$ are zero. Hence this procedure is less useful for a small number of observations.

Table 6.1 The members of the $(a, b, 0)$ class.

| Distribution | $a$ | $b$ | $p_0$ |
|---|---|---|---|
| Poisson | $0$ | $\lambda$ | $e^{-\lambda}$ |
| Binomial | $-\dfrac{q}{1-q}$ | $(m+1)\dfrac{q}{1-q}$ | $(1-q)^m$ |
| Negative binomial | $\dfrac{\beta}{1+\beta}$ | $(r-1)\dfrac{\beta}{1+\beta}$ | $(1+\beta)^{-r}$ |
| Geometric | $\dfrac{\beta}{1+\beta}$ | $0$ | $(1+\beta)^{-1}$ |

EXAMPLE 6.3

Consider the accident data in Table 6.2, which is taken from Thyrion [120]. For the 9,461 automobile insurance policies studied, the number of accidents under the policy is recorded in the table. Also recorded in the table is the observed value of the quantity that should be linear.

Figure 6.1 plots the value of the quantity of interest against $k$, the number of accidents. It can be seen from the graph that the quantity of interest looks approximately linear except for the point at $k = 6$. The reliability of the quantities as $k$ increases diminishes because the number of observations becomes small and the variability of the results grows, illustrating a weakness of this ad hoc procedure. Visually, all the points appear to have equal value. However, the points on the left are more reliable than the points on the right due to the larger number of observations. From the graph, it can be seen that the slope is positive and the data appear approximately linear, suggesting that the negative binomial distribution is an appropriate model. Whether or not the slope is significantly different from zero is also not easily judged from the graph. By rescaling the vertical axis of the graph, the slope can be made to look steeper and, hence, the slope could be made to appear to be significantly different from zero.
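The plotted quantity is simple to reproduce. The following Python sketch computes $k\,n_k/n_{k-1}$ from the policy counts of Table 6.2. The function name `ab0_ratios` is an illustrative choice, and returning `None` when the previous count is zero is an assumption about how one might flag a ratio that cannot be computed.

```python
# Number of policies with k = 0, 1, ..., 7 accidents (Table 6.2).
counts = [7840, 1317, 239, 42, 14, 4, 4, 1]


def ab0_ratios(counts):
    """Map k -> k * n_k / n_{k-1}, or None where n_{k-1} is zero."""
    ratios = {}
    for k in range(1, len(counts)):
        previous = counts[k - 1]
        ratios[k] = k * counts[k] / previous if previous > 0 else None
    return ratios


ratios = ab0_ratios(counts)
for k, value in ratios.items():
    print(k, counts[k], None if value is None else round(value, 2))
```

The value at $k = 6$ is based on only four policies in each of the two cells involved, which is why it stands apart from the others.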

Graphically, it is difficult to distinguish between the Poisson and the negative binomial distribution because the Poisson requires a slope of zero. However, we can say that the binomial distribution is probably not a good choice, since there is no evidence of a negative slope. In this case, it is advisable to fit both the Poisson and negative binomial distributions and conduct a more formal test to choose between them. □

Table 6.2 The accident profile from Thyrion [120].

| Number of accidents, $k$ | Number of policies, $n_k$ | $k\,n_k / n_{k-1}$ |
|---|---|---|
| 0 | 7,840 | |
| 1 | 1,317 | 0.17 |
| 2 | 239 | 0.36 |
| 3 | 42 | 0.53 |
| 4 | 14 | 1.33 |
| 5 | 4 | 1.43 |
| 6 | 4 | 6.00 |
| 7 | 1 | 1.75 |
| 8+ | 0 | |
| Total | 9,461 | |

Figure 6.1 The plot of the ratio $k\,n_k / n_{k-1}$ against $k$.

It is also possible to compare the appropriateness of the distributions by looking at the relationship of the variance to the mean. For this data set, the mean number of claims per policy is 0.2144. The variance is 0.2889. Because the variance exceeds the mean, the negative binomial should be considered as an alternative to the Poisson. Again, this is a qualitative comment because we have, at this point, no formal way of determining whether the variance is sufficiently larger than the mean to warrant use of the negative binomial.
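The mean and variance quoted above can be recovered from the counts in Table 6.2. The Python sketch below uses the divisor $n$ for the variance; this is an assumption, since the text does not say which divisor was used, but with 9,461 policies the divisor $n - 1$ gives the same value to four decimal places.

```python
# Number of policies with k = 0, 1, ..., 7 accidents (Table 6.2).
counts = [7840, 1317, 239, 42, 14, 4, 4, 1]

n = sum(counts)
mean = sum(k * n_k for k, n_k in enumerate(counts)) / n
second_moment = sum(k * k * n_k for k, n_k in enumerate(counts)) / n
variance = second_moment - mean**2  # divisor n (assumed)

print(round(mean, 4), round(variance, 4))
```

A variance-to-mean ratio above 1, as here, is the pattern expected under a negative binomial model, whereas the Poisson has variance equal to the mean.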

To do some formal analysis, Table 6.3 gives the results of maximum likelihood estimation (discussed in Chapters 11 and 12) of the parameters of the Poisson and negative binomial distributions and the negative loglikelihood in each case. In Chapter 15, formal selection methods are presented. They would indicate that the negative binomial is superior to the Poisson as a model for this data set. However, those methods also indicate that the negative binomial is not a particularly good model and, thus, some of the distributions yet to be introduced should be considered.
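As a check on the reported values, the negative loglikelihoods can be evaluated at the parameter estimates given in Table 6.3. The Python sketch below only evaluates the likelihood at those estimates; it does not carry out the maximization. The Poisson and negative binomial probability functions are assumed to take their usual forms (consistent with the $p_0$ column of Table 6.1), and the function names are illustrative.

```python
import math

# Number of policies with k = 0, 1, ..., 7 accidents (Table 6.2).
counts = {0: 7840, 1: 1317, 2: 239, 3: 42, 4: 14, 5: 4, 6: 4, 7: 1}


def poisson_log_pf(k, lam):
    """Log of the Poisson pf (standard form, assumed)."""
    return -lam + k * math.log(lam) - math.lgamma(k + 1)


def negbin_log_pf(k, r, beta):
    """Log of the negative binomial pf with p_0 = (1 + beta)^(-r) (assumed)."""
    return (math.lgamma(r + k) - math.lgamma(r) - math.lgamma(k + 1)
            + k * math.log(beta / (1 + beta)) - r * math.log(1 + beta))


def negative_loglikelihood(log_pf, counts, *params):
    """Minus the sum over k of n_k * log p_k."""
    return -sum(n_k * log_pf(k, *params) for k, n_k in counts.items())


# Parameter estimates as reported in Table 6.3.
nll_poisson = negative_loglikelihood(poisson_log_pf, counts, 0.2143537)
nll_negbin = negative_loglikelihood(negbin_log_pf, counts, 0.7015122, 0.3055594)

print(round(nll_poisson, 2), round(nll_negbin, 2))
```

The two results should agree with the −Loglikelihood column of Table 6.3 to the precision shown there. The Poisson estimate coincides with the sample mean, and the product $\hat{r}\hat{\beta}$ of the negative binomial estimates is also approximately equal to it.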

In subsequent sections, we will expand the class of the distributions beyond the three discussed in this section by constructing more general models related to the Poisson, binomial, and negative binomial distributions.

Table 6.3 Comparison between Poisson and negative binomial models.

| Distribution | Parameter estimates | −Loglikelihood |
|---|---|---|
| Poisson | $\hat{\lambda} = 0.2143537$ | 5,490.78 |
| Negative binomial | $\hat{\beta} = 0.3055594$, $\hat{r} = 0.7015122$ | 5,348.04 |

6.5.1 Exercises

6.2 For each of the data sets in Exercises 12.3 and 12.5 in Section 12.7, calculate values similar to those in Table 6.2. For each, determine the most appropriate model from the $(a, b, 0)$ class.

6.3 Use your knowledge of the permissible ranges for the parameters of the Poisson, negative binomial, and binomial to determine all possible values of $a$ and $b$ for these members of the $(a, b, 0)$ class. Because these are the only members of the class, all other pairs must not lead to a legitimate probability distribution (nonnegative values that sum to 1). Show that the pair $a = -1$ and $b = 1.5$ (which is not on the list of possible values) does not lead to a legitimate distribution.

In document: Book LOSS MODELS FROM DATA TO DECISIONS (pages 105-109)