
The (𝒂, 𝒃, 𝟎) Class

From the document Book LOSS MODELS FROM DATA TO DECISIONS (pages 105-109)

DISCRETE DISTRIBUTIONS

6.5 The (𝒂, 𝒃, 𝟎) Class

The following definition characterizes the members of this class of distributions.

Definition 6.4 Let $p_k$ be the pf of a discrete random variable. It is a member of the $(a, b, 0)$ class of distributions provided that there exist constants $a$ and $b$ such that

$$p_k = \left(a + \frac{b}{k}\right) p_{k-1}, \qquad k = 1, 2, 3, \ldots.$$

This recursion describes the relative size of successive probabilities in the counting distribution. The probability at zero, $p_0$, can be obtained from the recursive formula because the probabilities must sum to 1. The $(a, b, 0)$ class of distributions is a two-parameter class, the two parameters being $a$ and $b$. The following example illustrates these ideas by demonstrating that the binomial distribution is a member of the $(a, b, 0)$ class.
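The way the recursion and the sum-to-1 condition together determine every probability can be sketched in a few lines of Python. This is a minimal sketch, not part of the text: the function name `ab0_pmf` and the truncation point `kmax` are illustrative assumptions, and the sketch presumes that the probability mass beyond `kmax` is negligible.

```python
import math

def ab0_pmf(a, b, kmax=200):
    """Approximate p_0, ..., p_kmax for an (a, b, 0) member (hypothetical helper)."""
    # Work with unnormalized weights w_k = p_k / p_0, starting from w_0 = 1.
    weights = [1.0]
    for k in range(1, kmax + 1):
        # Recursion of Definition 6.4: p_k = (a + b/k) * p_{k-1}.
        weights.append((a + b / k) * weights[-1])
    # The probabilities must sum to 1, so p_0 = 1 / sum of the weights.
    # Assumption: the tail beyond kmax is small enough to ignore.
    total = sum(weights)
    return [w / total for w in weights]

# a = 0 and b = 2 should reproduce a Poisson distribution with mean 2.
probs = ab0_pmf(a=0.0, b=2.0)
print(probs[0], math.exp(-2.0))
```

The two printed numbers should agree closely, since with $a = 0$ and $b = \lambda$ the recursion generates Poisson probabilities and $p_0 = e^{-\lambda}$.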

EXAMPLE 6.2

Demonstrate that the binomial distribution is a member of the $(a, b, 0)$ class.

The binomial distribution with parameters $m$ and $q$ has probabilities

$$p_k = \binom{m}{k} q^k (1-q)^{m-k}, \qquad k = 0, 1, \ldots, m,$$

and $p_k = 0$, otherwise. The probabilities for $k = 1, 2, \ldots, m$ can be rewritten as

$$
\begin{aligned}
p_k &= \frac{m!}{(m-k)!\,k!}\, q^k (1-q)^{m-k} \\
    &= \frac{m-k+1}{k}\,\frac{q}{1-q} \left\{ \frac{m!}{[m-(k-1)]!\,(k-1)!}\, q^{k-1} (1-q)^{m-(k-1)} \right\} \\
    &= \frac{q}{1-q} \left( -1 + \frac{m+1}{k} \right) p_{k-1}.
\end{aligned}
$$

Hence, $p_k = (a + b/k)\,p_{k-1}$ holds for $k = 1, 2, \ldots, m$ with $a = -q/(1-q)$ and $b = (m+1)q/(1-q)$. To complete the example, we must verify that the recursion holds for $k = m+1, m+2, \ldots$. For $k = m+1$, we have

$$\left(a + \frac{b}{m+1}\right) p_m = \left(-\frac{q}{1-q} + \frac{q}{1-q}\right) p_m = 0 = p_{m+1}.$$

For $k = m+2, m+3, \ldots$ the recursion holds trivially, with both sides clearly being zero. This demonstrates that the binomial distribution is a member of the $(a, b, 0)$ class. □

As in the above example, substituting in the probability function for the Poisson and negative binomial distributions on each side of the recursive formula in Definition 6.4, with the values of $a$ and $b$ given in Table 6.1, demonstrates that these two distributions are also members of the $(a, b, 0)$ class. In addition, Table 6.1 gives the values of $p_0$, the starting value for the recursion. The geometric distribution, the one-parameter special case ($r = 1$) of the negative binomial distribution, is also in the table.

It can be shown (see Panjer and Willmot [100, Chapter 6]) that these are the only possible distributions satisfying this recursive formula.
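The membership claims can also be checked numerically. The sketch below is illustrative only: the helper names and the parameter values are assumptions chosen for the check, and the negative binomial pf is assumed to be the parameterization $p_k = \frac{\Gamma(k+r)}{\Gamma(r)\,k!} (1+\beta)^{-r} \left(\frac{\beta}{1+\beta}\right)^k$, which is consistent with the $p_0$ listed in Table 6.1 but is not written out in this section.

```python
import math

def binomial_pf(k, m, q):
    # Zero outside k = 0, ..., m, as in Example 6.2.
    return math.comb(m, k) * q**k * (1 - q)**(m - k) if 0 <= k <= m else 0.0

def poisson_pf(k, lam):
    return math.exp(-lam) * lam**k / math.factorial(k)

def negbin_pf(k, r, beta):
    # Assumed parameterization (see the lead-in); computed on the log scale.
    log_coef = math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
    return math.exp(log_coef - r * math.log1p(beta) + k * math.log(beta / (1 + beta)))

def max_recursion_gap(pf, a, b, kmax=30):
    """Largest |p_k - (a + b/k) p_{k-1}| over k = 1, ..., kmax."""
    return max(abs(pf(k) - (a + b / k) * pf(k - 1)) for k in range(1, kmax + 1))

# Arbitrary illustrative parameter values.
m, q = 10, 0.3
lam = 2.5
r, beta = 1.7, 0.8

gaps = {
    "binomial": max_recursion_gap(
        lambda k: binomial_pf(k, m, q), -q / (1 - q), (m + 1) * q / (1 - q)),
    "Poisson": max_recursion_gap(
        lambda k: poisson_pf(k, lam), 0.0, lam),
    "negative binomial": max_recursion_gap(
        lambda k: negbin_pf(k, r, beta), beta / (1 + beta), (r - 1) * beta / (1 + beta)),
}
for name, gap in gaps.items():
    print(f"{name}: largest recursion gap = {gap:.2e}")
```

Each gap should be at the level of floating-point rounding. The binomial check runs past $k = m$, so it also exercises the case $k = m + 1$ treated at the end of Example 6.2.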

Table 6.1 The members of the $(a, b, 0)$ class.

Distribution         $a$                        $b$                             $p_0$
Poisson              $0$                        $\lambda$                       $e^{-\lambda}$
Binomial             $-\frac{q}{1-q}$           $(m+1)\frac{q}{1-q}$            $(1-q)^m$
Negative binomial    $\frac{\beta}{1+\beta}$    $(r-1)\frac{\beta}{1+\beta}$    $(1+\beta)^{-r}$
Geometric            $\frac{\beta}{1+\beta}$    $0$                             $(1+\beta)^{-1}$

The recursive formula can be rewritten (if $p_{k-1} > 0$) as

$$k\,\frac{p_k}{p_{k-1}} = ak + b, \qquad k = 1, 2, 3, \ldots.$$

The expression on the left-hand side is a linear function in $k$. Note from Table 6.1 that the slope $a$ of the straight line is zero for the Poisson distribution, is negative for the binomial distribution, and is positive for the negative binomial distribution, including the geometric special case. This relationship suggests a graphical way of indicating which of the three distributions might be selected for fitting to data. We begin by plotting

$$k\,\frac{\hat{p}_k}{\hat{p}_{k-1}} = k\,\frac{n_k}{n_{k-1}}$$

against $k$. The observed values should form approximately a straight line if one of these models is to be selected, and the value of the slope should be an indication of which of the models should be selected. Note that this cannot be done if any of the $n_k$ are zero. Hence this procedure is less useful for a small number of observations.

EXAMPLE 6.3

Consider the accident data in Table 6.2, which is taken from Thyrion [120]. For the 9,461 automobile insurance policies studied, the number of accidents under the policy is recorded in the table. Also recorded in the table is the observed value of the quantity that should be linear.

Figure 6.1 plots the value of the quantity of interest against $k$, the number of accidents. It can be seen from the graph that the quantity of interest looks approximately linear except for the point at $k = 6$. The reliability of the quantities as $k$ increases diminishes because the number of observations becomes small and the variability of the results grows, illustrating a weakness of this ad hoc procedure. Visually, all the points appear to have equal value. However, the points on the left are more reliable than the points on the right due to the larger number of observations. From the graph, it can be seen that the slope is positive and the data appear approximately linear, suggesting that the negative binomial distribution is an appropriate model. Whether or not the slope is significantly different from zero is also not easily judged from the graph. By rescaling the vertical axis of the graph, the slope can be made to look steeper and, hence, the slope could be made to appear to be significantly different from zero.
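The plotted quantity can be recomputed directly from the counts in Table 6.2. The following is a small sketch; the function name `ratio_points` is illustrative, and skipping any $k$ whose preceding count is zero is an assumption about how to handle the case in which the ratio cannot be formed.

```python
# Number of policies n_0, ..., n_7 from Table 6.2.
counts = [7840, 1317, 239, 42, 14, 4, 4, 1]

def ratio_points(counts):
    """Return (k, k * n_k / n_{k-1}) for every k where the ratio is defined."""
    points = []
    for k in range(1, len(counts)):
        if counts[k - 1] == 0:
            continue  # ratio undefined when the preceding count is zero
        points.append((k, k * counts[k] / counts[k - 1]))
    return points

for k, ratio in ratio_points(counts):
    # Printing n_k alongside the ratio shows how few policies support the later points.
    print(f"k={k}  ratio={ratio:.2f}  n_k={counts[k]}")
```

The printed ratios match the last column of Table 6.2. The counts on the right make the reliability point concrete: the ratios at $k = 5$, $6$, and $7$ each rest on at most four policies.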

Graphically, it is difficult to distinguish between the Poisson and the negative binomial distribution because the Poisson requires a slope of zero. However, we can say that the binomial distribution is probably not a good choice, since there is no evidence of a negative slope. In this case, it is advisable to fit both the Poisson and negative binomial distributions and conduct a more formal test to choose between them. □

Table 6.2 The accident profile from Thyrion [120].

Number of accidents, $k$    Number of policies, $n_k$    $k\,n_k / n_{k-1}$
0                           7,840
1                           1,317                        0.17
2                           239                          0.36
3                           42                           0.53
4                           14                           1.33
5                           4                            1.43
6                           4                            6.00
7                           1                            1.75
8+                          0
Total                       9,461

Figure 6.1 The plot of the ratio $k\,n_k / n_{k-1}$ against $k$. (Horizontal axis: $k$; vertical axis: ratio.)

It is also possible to compare the appropriateness of the distributions by looking at the relationship of the variance to the mean. For this data set, the mean number of claims per policy is 0.2144. The variance is 0.2889. Because the variance exceeds the mean, the negative binomial should be considered as an alternative to the Poisson. Again, this is a qualitative comment because we have, at this point, no formal way of determining whether the variance is sufficiently larger than the mean to warrant use of the negative binomial.
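The mean and variance quoted above can be reproduced from the counts in Table 6.2. This is a minimal sketch: dividing by the number of policies $n$ rather than $n - 1$ is an assumption, and with 9,461 policies either choice gives the same value to four decimal places.

```python
# Number of policies n_0, ..., n_7 from Table 6.2.
counts = [7840, 1317, 239, 42, 14, 4, 4, 1]

n = sum(counts)  # total number of policies
mean = sum(k * c for k, c in enumerate(counts)) / n
second_moment = sum(k * k * c for k, c in enumerate(counts)) / n
variance = second_moment - mean**2  # empirical variance with divisor n (assumption)

print(f"n = {n}, mean = {mean:.4f}, variance = {variance:.4f}")
print(f"variance / mean = {variance / mean:.3f}")
```

A variance-to-mean ratio above 1 points away from the Poisson, whose variance equals its mean, and toward the negative binomial.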

To do some formal analysis, Table 6.3 gives the results of maximum likelihood estimation (discussed in Chapters 11 and 12) of the parameters of the Poisson and negative binomial distributions and the negative loglikelihood in each case. In Chapter 15, formal selection methods are presented. They would indicate that the negative binomial is superior to the Poisson as a model for this data set. However, those methods also indicate that the negative binomial is not a particularly good model and, thus, some of the distributions yet to be introduced should be considered.
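The negative loglikelihoods in Table 6.3 can be checked by plugging the reported estimates into the two probability functions. The sketch below only evaluates the loglikelihood at the estimates given in the table and does not perform the maximization. The helper names are illustrative, and the negative binomial pf is assumed to be $p_k = \frac{\Gamma(k+r)}{\Gamma(r)\,k!} (1+\beta)^{-r} \left(\frac{\beta}{1+\beta}\right)^k$, consistent with Table 6.1.

```python
import math

# Number of policies n_0, ..., n_7 from Table 6.2.
counts = [7840, 1317, 239, 42, 14, 4, 4, 1]

def poisson_log_pf(k, lam):
    return -lam + k * math.log(lam) - math.lgamma(k + 1)

def negbin_log_pf(k, r, beta):
    # Assumed parameterization (see the lead-in).
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            - r * math.log1p(beta) + k * math.log(beta / (1 + beta)))

def neg_loglik(log_pf):
    """Negative loglikelihood of the grouped counts: -sum of n_k * log p_k."""
    return -sum(c * log_pf(k) for k, c in enumerate(counts))

# Parameter estimates as reported in Table 6.3.
nll_poisson = neg_loglik(lambda k: poisson_log_pf(k, 0.2143537))
nll_negbin = neg_loglik(lambda k: negbin_log_pf(k, 0.7015122, 0.3055594))

print(f"Poisson:           {nll_poisson:.2f}")
print(f"Negative binomial: {nll_negbin:.2f}")
```

The smaller value for the negative binomial is what the formal selection methods mentioned above build on. The comparison has to account for the extra parameter, which this sketch does not attempt.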

In subsequent sections, we will expand the class of the distributions beyond the three discussed in this section by constructing more general models related to the Poisson, binomial, and negative binomial distributions.

Table 6.3 Comparison between Poisson and negative binomial models.

Distribution         Parameter estimates            −Loglikelihood
Poisson              $\hat{\lambda} = 0.2143537$    5,490.78
Negative binomial    $\hat{\beta} = 0.3055594$      5,348.04
                     $\hat{r} = 0.7015122$

6.5.1 Exercises

6.2 For each of the data sets in Exercises 12.3 and 12.5 in Section 12.7, calculate values similar to those in Table 6.2. For each, determine the most appropriate model from the $(a, b, 0)$ class.

6.3 Use your knowledge of the permissible ranges for the parameters of the Poisson, negative binomial, and binomial to determine all possible values of $a$ and $b$ for these members of the $(a, b, 0)$ class. Because these are the only members of the class, all other pairs must not lead to a legitimate probability distribution (nonnegative values that sum to 1). Show that the pair $a = -1$ and $b = 1.5$ (which is not on the list of possible values) does not lead to a legitimate distribution.
