
Source: LOSS MODELS FROM DATA TO DECISIONS (pages 116-122)

7 ADVANCED DISCRETE DISTRIBUTIONS

7.1 Compound Frequency Distributions

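The probability generating function (pgf) of the random sum $S = M_1 + M_2 + \cdots + M_N$ (where $N = 0$ implies that $S = 0$) is $P_S(z) = P_N[P_M(z)]$. Here $N$ follows the primary distribution, with pgf $P_N(z)$, and $M_1, M_2, \ldots$ are i.i.d. with the secondary distribution, whose pgf is $P_M(z)$, and are independent of $N$. This is shown as follows:

$$
\begin{aligned}
P_S(z) &= \sum_{k=0}^{\infty} \Pr(S = k)\, z^k = \sum_{k=0}^{\infty} \sum_{n=0}^{\infty} \Pr(S = k \mid N = n) \Pr(N = n)\, z^k \\
&= \sum_{n=0}^{\infty} \Pr(N = n) \sum_{k=0}^{\infty} \Pr(M_1 + \cdots + M_n = k \mid N = n)\, z^k \\
&= \sum_{n=0}^{\infty} \Pr(N = n) [P_M(z)]^n \\
&= P_N[P_M(z)].
\end{aligned}
\tag{7.1}
$$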

In insurance contexts, this distribution can arise naturally. If $N$ represents the number of accidents arising in a portfolio of risks and $\{M_k : k = 1, 2, \ldots, N\}$ represents the number of claims (injuries, number of cars, etc.) from the accidents, then $S$ represents the total number of claims from the portfolio. This kind of interpretation is not necessary to justify the use of a compound distribution. If a compound distribution fits data well, that may be enough justification itself. Also, there are other motivations for these distributions, as presented in Section 7.5.

EXAMPLE 7.1

Demonstrate that any zero-modified distribution is a compound distribution.

Consider a primary Bernoulli distribution. It has pgf $P_N(z) = 1 - q + qz$. Then consider an arbitrary secondary distribution with pgf $P_M(z)$. Then, from (7.1), we obtain

$$P_S(z) = P_N[P_M(z)] = 1 - q + q P_M(z).$$

From (6.4), this is the pgf of a ZM distribution with

$$q = \frac{1 - p_0^M}{1 - p_0}.$$

That is, the ZM distribution has assigned arbitrary probability $p_0^M$ at zero, while $p_0$ is the probability assigned at zero by the secondary distribution. □

EXAMPLE 7.2

Consider the case where both $M$ and $N$ have a Poisson distribution. Determine the pgf of this distribution.

This distribution is called the Poisson–Poisson or Neyman Type A distribution.

Let $P_N(z) = e^{\lambda_1(z-1)}$ and $P_M(z) = e^{\lambda_2(z-1)}$. Then,

$$P_S(z) = e^{\lambda_1\left[e^{\lambda_2(z-1)} - 1\right]}.$$

When $\lambda_2$ is a lot larger than $\lambda_1$, for example, $\lambda_1 = 0.1$ and $\lambda_2 = 10$, the resulting distribution will have two local modes. □
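The two-mode claim can be checked numerically. The sketch below is only an illustration, not the book's method: it conditions on $N = n$ and uses the fact that a sum of $n$ i.i.d. Poisson($\lambda_2$) counts is Poisson($n\lambda_2$). The function names and the truncation points `k_max` and `n_max` are assumptions made for this example.

```python
import math

def poisson_pmf(k, mean):
    """Poisson probability of k for the given mean (mean 0 puts all mass at 0)."""
    if mean == 0:
        return 1.0 if k == 0 else 0.0
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))

def neyman_type_a_pmf(lam1, lam2, k_max, n_max=60):
    """Pr(S = k) for k = 0..k_max, obtained by conditioning on N = n.

    Given N = n, the sum of n i.i.d. Poisson(lam2) counts is Poisson(n * lam2).
    The sum over n is truncated at n_max (an illustrative choice).
    """
    return [
        sum(poisson_pmf(n, lam1) * poisson_pmf(k, n * lam2) for n in range(n_max + 1))
        for k in range(k_max + 1)
    ]

g = neyman_type_a_pmf(0.1, 10.0, 40)

# A local mode is k = 0 (if g_0 > g_1) or any interior k that beats both neighbours.
local_modes = [0] if g[0] > g[1] else []
local_modes += [k for k in range(1, len(g) - 1) if g[k - 1] < g[k] > g[k + 1]]
print(local_modes)
```

With these parameters most of the mass sits at zero (no accidents), and the remainder clusters near $\lambda_2$, the typical count from a single accident.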

The probability of exactly $k$ claims can be written as

$$
\begin{aligned}
\Pr(S = k) &= \sum_{n=0}^{\infty} \Pr(S = k \mid N = n) \Pr(N = n) \\
&= \sum_{n=0}^{\infty} \Pr(M_1 + \cdots + M_N = k \mid N = n) \Pr(N = n) \\
&= \sum_{n=0}^{\infty} \Pr(M_1 + \cdots + M_n = k) \Pr(N = n).
\end{aligned}
\tag{7.2}
$$

Letting $g_n = \Pr(S = n)$, $p_n = \Pr(N = n)$, and $f_n = \Pr(M = n)$, this is rewritten as

$$g_k = \sum_{n=0}^{\infty} p_n f_k^{*n}, \tag{7.3}$$

where $f_k^{*n}$, $k = 0, 1, \ldots$, is the "$n$-fold convolution" of the function $f_k$, $k = 0, 1, \ldots$, that is, the probability that the sum of $n$ random variables that are i.i.d. with probability function $f_k$ will take on value $k$.
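Formula (7.3) can be evaluated directly by building the convolution powers one at a time. The following is a minimal sketch under stated assumptions: the Poisson mean `lam`, the three-point secondary distribution `f`, the truncation limits, and the function names are all hypothetical choices for illustration. As a sanity check it compares $\sum_k g_k z^k$ with $P_N[P_M(z)]$ from (7.1) at one test point.

```python
import math

def convolve(u, v, k_max):
    """Discrete convolution of two probability vectors, truncated at index k_max."""
    out = [0.0] * (k_max + 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            if i + j <= k_max:
                out[i + j] += ui * vj
    return out

def compound_pmf_by_convolution(p, f, k_max):
    """g_k = sum_n p_n f_k^{*n} as in (7.3), for k = 0..k_max.

    p: primary probabilities p_0, p_1, ... (truncated by the caller)
    f: secondary probabilities f_0, f_1, ..., f_m
    """
    g = [0.0] * (k_max + 1)
    f_conv = [1.0] + [0.0] * k_max          # 0-fold convolution: all mass at zero
    for p_n in p:
        for k in range(k_max + 1):
            g[k] += p_n * f_conv[k]
        f_conv = convolve(f_conv, f, k_max)  # next convolution power
    return g

lam = 2.0                                    # hypothetical Poisson primary mean
p = [math.exp(-lam) * lam**n / math.factorial(n) for n in range(60)]
f = [0.2, 0.5, 0.3]                          # hypothetical secondary distribution
g = compound_pmf_by_convolution(p, f, 30)

# Compare sum_k g_k z^k with P_N[P_M(z)] at a test point z.
z = 0.7
pgf_from_g = sum(g_k * z**k for k, g_k in enumerate(g))
p_m = sum(f_k * z**k for k, f_k in enumerate(f))
pgf_composed = math.exp(lam * (p_m - 1.0))
print(pgf_from_g, pgf_composed)
```

Each pass of the loop costs a full convolution, which is the work that a recursive formula for $g_k$ avoids.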

When $P_N(z)$ is chosen to be a member of the $(a, b, 0)$ class,

$$p_k = \left(a + \frac{b}{k}\right) p_{k-1}, \quad k = 1, 2, \ldots, \tag{7.4}$$

then a simple recursive formula can be used. This formula avoids the use of convolutions and thus reduces the computations considerably.
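To make (7.4) concrete, the sketch below generates probabilities from the recursion for three familiar families. The $(a, b, p_0)$ values used are the standard parameterizations of these families and are not restated in this section, so treat them as assumptions; the function name `ab0_pmf` and the numeric parameter values are hypothetical.

```python
import math

def ab0_pmf(a, b, p0, k_max):
    """Probabilities p_0..p_k_max from p_k = (a + b/k) p_{k-1}, as in (7.4)."""
    p = [p0]
    for k in range(1, k_max + 1):
        p.append((a + b / k) * p[-1])
    return p

# Poisson(lam): a = 0, b = lam, p_0 = exp(-lam)
lam = 3.0
poisson = ab0_pmf(0.0, lam, math.exp(-lam), 10)

# Negative binomial(r, beta): a = beta/(1+beta), b = (r-1)*beta/(1+beta), p_0 = (1+beta)^(-r)
r, beta = 2.5, 1.5
neg_binomial = ab0_pmf(beta / (1 + beta), (r - 1) * beta / (1 + beta), (1 + beta) ** (-r), 10)

# Binomial(m, q): a = -q/(1-q), b = (m+1)*q/(1-q), p_0 = (1-q)^m
m, q = 6, 0.3
binomial = ab0_pmf(-q / (1 - q), (m + 1) * q / (1 - q), (1 - q) ** m, m)

print(poisson[:3], neg_binomial[:3], binomial[:3])
```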

Theorem 7.1 If the primary distribution is a member of the $(a, b, 0)$ class, the recursive formula is

$$g_k = \frac{1}{1 - a f_0} \sum_{j=1}^{k} \left(a + \frac{bj}{k}\right) f_j g_{k-j}, \quad k = 1, 2, 3, \ldots. \tag{7.5}$$

Proof: From (7.4),

$$n p_n = a(n-1) p_{n-1} + (a+b) p_{n-1}.$$

Multiplying each side by $[P_M(z)]^{n-1} P_M'(z)$ and summing over $n$ yields

$$
\sum_{n=1}^{\infty} n p_n [P_M(z)]^{n-1} P_M'(z) = a \sum_{n=1}^{\infty} (n-1) p_{n-1} [P_M(z)]^{n-1} P_M'(z) + (a+b) \sum_{n=1}^{\infty} p_{n-1} [P_M(z)]^{n-1} P_M'(z).
$$

Because $P_S(z) = \sum_{n=0}^{\infty} p_n [P_M(z)]^n$, the previous equation is

$$
P_S'(z) = a \sum_{n=0}^{\infty} n p_n [P_M(z)]^{n} P_M'(z) + (a+b) \sum_{n=0}^{\infty} p_n [P_M(z)]^{n} P_M'(z).
$$

Therefore,

$$P_S'(z) = a P_S'(z) P_M(z) + (a+b) P_S(z) P_M'(z).$$

Each side can be expanded in powers of $z$. The coefficients of $z^{k-1}$ in such an expansion must be the same on both sides of the equation. Hence, for $k = 1, 2, \ldots$, we have

$$
\begin{aligned}
k g_k &= a \sum_{j=0}^{k} (k-j) f_j g_{k-j} + (a+b) \sum_{j=0}^{k} j f_j g_{k-j} \\
&= a k f_0 g_k + a \sum_{j=1}^{k} (k-j) f_j g_{k-j} + (a+b) \sum_{j=1}^{k} j f_j g_{k-j} \\
&= a k f_0 g_k + a k \sum_{j=1}^{k} f_j g_{k-j} + b \sum_{j=1}^{k} j f_j g_{k-j}.
\end{aligned}
$$

Therefore,

$$g_k = a f_0 g_k + \sum_{j=1}^{k} \left(a + \frac{bj}{k}\right) f_j g_{k-j}.$$

Rearrangement yields (7.5). □
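A direct transcription of (7.5) into code might look like the sketch below. It is an illustration under assumptions, not a reference implementation: the function name, the parameter values, and the choice of test case are hypothetical. The check uses a negative binomial primary with a Bernoulli($p$) secondary, for which composing the pgfs gives $[1 - \beta p(z-1)]^{-r}$, again a negative binomial, with $\beta$ replaced by $\beta p$. The negative binomial values of $a$ and $b$ are the standard ones and are not restated in this section.

```python
def compound_pmf_ab0(a, b, g0, f, k_max):
    """g_0..g_k_max from recursion (7.5) for an (a, b, 0) primary distribution.

    f:  secondary probabilities f_0, f_1, ... (treated as zero beyond the list)
    g0: starting value Pr(S = 0), supplied by the caller
    """
    g = [g0]
    for k in range(1, k_max + 1):
        total = 0.0
        for j in range(1, min(k, len(f) - 1) + 1):
            total += (a + b * j / k) * f[j] * g[k - j]
        g.append(total / (1.0 - a * f[0]))
    return g

# Illustrative check: negative binomial(r, beta) primary, Bernoulli(p) secondary.
r, beta, p = 2.0, 1.5, 0.4
a, b = beta / (1 + beta), (r - 1) * beta / (1 + beta)
f = [1 - p, p]
g0 = (1 + beta * p) ** (-r)          # primary pgf evaluated at f_0 = 1 - p
g = compound_pmf_ab0(a, b, g0, f, 15)

# Expected result: negative binomial(r, beta * p), built from its own (a, b, 0) recursion.
beta_t = beta * p
expected = [g0]
for k in range(1, 16):
    expected.append((beta_t / (1 + beta_t)) * (1 + (r - 1) / k) * expected[-1])

print(max(abs(x - y) for x, y in zip(g, expected)))
```

Each $g_k$ needs at most $k$ multiplications, compared with a full convolution per term when (7.3) is used directly.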

In order to use (7.5), the starting value $g_0$ is required and is given in Theorem 7.3. If the primary distribution is a member of the $(a, b, 1)$ class, the proof must be modified to reflect the fact that the recursion for the primary distribution begins at $k = 2$. The result is the following.

Theorem 7.2 If the primary distribution is a member of the $(a, b, 1)$ class, the recursive formula is

$$g_k = \frac{[p_1 - (a+b) p_0] f_k + \sum_{j=1}^{k} (a + bj/k) f_j g_{k-j}}{1 - a f_0}, \quad k = 1, 2, 3, \ldots. \tag{7.6}$$

Proof: It is similar to the proof of Theorem 7.1 and is omitted. □

EXAMPLE 7.3

Develop the recursive formula for the case where the primary distribution is Poisson.

In this case, $a = 0$ and $b = \lambda$, yielding the recursive form

$$g_k = \frac{\lambda}{k} \sum_{j=1}^{k} j f_j g_{k-j}.$$

The starting value is, from (7.1),

$$
\begin{aligned}
g_0 &= \Pr(S = 0) = P_S(0) \\
&= P_N[P_M(0)] = P_N(f_0) \\
&= e^{-\lambda(1 - f_0)}.
\end{aligned}
\tag{7.7}
$$

Distributions of this type are called compound Poisson. When the secondary distribution is specified, the compound distribution is called Poisson–X, where X is the name of the secondary distribution. □

The method used to obtain $g_0$ applies to any compound distribution.

Theorem 7.3 For any compound distribution, $g_0 = P_N(f_0)$, where $P_N(z)$ is the pgf of the primary distribution and $f_0$ is the probability that the secondary distribution takes on the value zero.

Proof: See the second line of (7.7). □
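Theorem 7.3 can also be read directly from (7.2): $S = 0$ requires every $M_j$ to be zero, so $\Pr(S = 0) = \sum_n p_n f_0^n = P_N(f_0)$. The short check below illustrates this for a Poisson primary; the values of `lam` and `f0` and the truncation at 100 terms are hypothetical choices.

```python
import math

lam, f0 = 3.0, 0.6     # hypothetical Poisson mean and secondary zero probability

# Pr(S = 0) = sum_n Pr(N = n) * f0**n, since S = 0 requires every M_j to be zero.
direct = sum(math.exp(-lam) * lam**n / math.factorial(n) * f0**n for n in range(100))

# Theorem 7.3: g_0 = P_N(f_0), with the Poisson pgf P_N(z) = exp(lam * (z - 1)).
from_pgf = math.exp(lam * (f0 - 1.0))
print(direct, from_pgf)
```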

Note that the secondary distribution is not required to be in any special form. However, to keep the number of distributions manageable, secondary distributions are selected from the $(a, b, 0)$ or the $(a, b, 1)$ class.

EXAMPLE 7.4

Calculate the probabilities for the Poisson–ETNB distribution, where $\lambda = 3$ for the Poisson distribution and the ETNB distribution has $r = -0.5$ and $\beta = 1$.

From Example 6.5, the secondary probabilities are $f_0 = 0$, $f_1 = 0.853553$, $f_2 = 0.106694$, and $f_3 = 0.026674$. From (7.7), $g_0 = \exp[-3(1 - 0)] = 0.049787$.

For the Poisson primary distribution, $a = 0$ and $b = 3$. The recursive formula in (7.5) becomes

$$g_k = \frac{\sum_{j=1}^{k} (3j/k) f_j g_{k-j}}{1 - 0(0)} = \sum_{j=1}^{k} \frac{3j}{k} f_j g_{k-j}.$$

Then,

$$
\begin{aligned}
g_1 &= \frac{3(1)}{1}(0.853553)(0.049787) = 0.127488, \\
g_2 &= \frac{3(1)}{2}(0.853553)(0.127488) + \frac{3(2)}{2}(0.106694)(0.049787) = 0.179163, \\
g_3 &= \frac{3(1)}{3}(0.853553)(0.179163) + \frac{3(2)}{3}(0.106694)(0.127488) + \frac{3(3)}{3}(0.026674)(0.049787) = 0.184114. \quad □
\end{aligned}
$$
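The arithmetic of Example 7.4 can be reproduced with a few lines of code. This is a minimal sketch that takes the secondary probabilities exactly as quoted in the example (rounded to six decimals), so the last printed digit may differ slightly from the values above.

```python
import math

lam = 3.0
f = [0.0, 0.853553, 0.106694, 0.026674]   # f_0..f_3 as quoted in the example

g = [math.exp(-lam * (1.0 - f[0]))]        # g_0 from (7.7)
for k in range(1, 4):
    # Poisson case of (7.5): a = 0, b = lam, so the denominator is 1.
    g.append(sum(lam * j / k * f[j] * g[k - j] for j in range(1, k + 1)))

for k, g_k in enumerate(g):
    print(f"g_{k} = {g_k:.6f}")
```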

EXAMPLE 7.5

Demonstrate that the Poisson–logarithmic distribution is a negative binomial distribution.

The negative binomial distribution has pgf

$$P(z) = [1 - \beta(z - 1)]^{-r}.$$

Suppose that $P_N(z)$ is Poisson($\lambda$) and $P_M(z)$ is logarithmic($\beta$). Then,

$$
\begin{aligned}
P_N[P_M(z)] &= \exp\{\lambda[P_M(z) - 1]\} \\
&= \exp\left\{\lambda\left[1 - \frac{\ln[1 - \beta(z-1)]}{\ln(1+\beta)} - 1\right]\right\} \\
&= \exp\left\{\frac{-\lambda}{\ln(1+\beta)} \ln[1 - \beta(z-1)]\right\} \\
&= [1 - \beta(z-1)]^{-\lambda/\ln(1+\beta)} \\
&= [1 - \beta(z-1)]^{-r},
\end{aligned}
$$

where $r = \lambda/\ln(1+\beta)$. This shows that the negative binomial distribution can be written as a compound Poisson distribution with a logarithmic secondary distribution.
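The identity can be confirmed numerically. The sketch below is an illustration with hypothetical values of `lam` and `beta`: it builds the logarithmic probabilities by expanding the pgf used above, $f_k = [\beta/(1+\beta)]^k / [k \ln(1+\beta)]$ for $k \ge 1$, runs the compound Poisson recursion, and compares the result with negative binomial probabilities for $r = \lambda/\ln(1+\beta)$. The negative binomial $(a, b, 0)$ parameters are the standard ones and are not restated in this section.

```python
import math

lam, beta = 2.0, 1.5                    # hypothetical parameter values
k_max = 20
r = lam / math.log(1.0 + beta)

# Logarithmic secondary: f_0 = 0 and f_k = [beta/(1+beta)]^k / (k ln(1+beta)), k >= 1.
ratio = beta / (1.0 + beta)
f = [0.0] + [ratio**k / (k * math.log(1.0 + beta)) for k in range(1, k_max + 1)]

# Compound Poisson probabilities from the recursion with a = 0, b = lam.
g = [math.exp(-lam * (1.0 - f[0]))]
for k in range(1, k_max + 1):
    g.append(sum(lam * j / k * f[j] * g[k - j] for j in range(1, k + 1)))

# Negative binomial probabilities from p_k = a (1 + (r - 1)/k) p_{k-1}, a = beta/(1+beta).
nb = [(1.0 + beta) ** (-r)]
for k in range(1, k_max + 1):
    nb.append(ratio * (1.0 + (r - 1.0) / k) * nb[-1])

max_gap = max(abs(x - y) for x, y in zip(g, nb))
print(max_gap)
```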

□

Example 7.5 shows that the Poisson–logarithmic distribution does not create a new distribution beyond the $(a, b, 0)$ and $(a, b, 1)$ classes. As a result, this combination of distributions is not useful to us. Another combination that does not create a new distribution beyond the $(a, b, 1)$ class is the compound geometric distribution, where both the primary and secondary distributions are geometric. The resulting distribution is a zero-modified geometric distribution, as shown in Exercise 7.2.

The following theorem shows that certain other combinations are also of no use in expanding the class of distributions through compounding. Suppose that $P_S(z) = P_N[P_M(z)]$ as before. Now, $P_M(z)$ can always be written as

$$P_M(z) = f_0 + (1 - f_0) P_M^T(z), \tag{7.8}$$

where $P_M^T(z)$ is the pgf of the conditional distribution over the positive range (in other words, the zero-truncated version).

Theorem 7.4 Suppose that the pgf $P_N(z; \theta)$ satisfies

$$P_N(z; \theta) = B[\theta(z - 1)]$$

for some parameter $\theta$ and some function $B(z)$ that is independent of $\theta$. That is, the parameter $\theta$ and the argument $z$ only appear in the pgf as $\theta(z - 1)$. There may be other parameters as well, and they may appear anywhere in the pgf. Then, $P_S(z) = P_N[P_M(z); \theta]$ can be rewritten as

$$P_S(z) = P_N[P_M^T(z); \theta(1 - f_0)].$$

Proof:

$$
\begin{aligned}
P_S(z) &= P_N[P_M(z); \theta] \\
&= P_N[f_0 + (1 - f_0) P_M^T(z); \theta] \\
&= B\{\theta[f_0 + (1 - f_0) P_M^T(z) - 1]\} \\
&= B\{\theta(1 - f_0)[P_M^T(z) - 1]\} \\
&= P_N[P_M^T(z); \theta(1 - f_0)]. \quad □
\end{aligned}
$$

This shows that adding, deleting, or modifying the probability at zero in the secondary distribution does not add a new distribution because it is equivalent to modifying the parameter $\theta$ of the primary distribution. Thus, for example, a Poisson primary distribution with a Poisson, zero-truncated Poisson, or zero-modified Poisson secondary distribution will still lead to a Neyman Type A (Poisson–Poisson) distribution.

EXAMPLE 7.6

Determine the probabilities for a Poisson–zero-modified ETNB distribution where the parameters are $\lambda = 7.5$, $p_0^M = 0.6$, $r = -0.5$, and $\beta = 1$.

From Example 6.5, the secondary probabilities are $f_0 = 0.6$, $f_1 = 0.341421$, $f_2 = 0.042678$, and $f_3 = 0.010670$. From (7.7), $g_0 = \exp[-7.5(1 - 0.6)] = 0.049787$.

For the Poisson primary distribution, $a = 0$ and $b = 7.5$. The recursive formula in (7.5) becomes

$$g_k = \frac{\sum_{j=1}^{k} (7.5j/k) f_j g_{k-j}}{1 - 0(0.6)} = \sum_{j=1}^{k} \frac{7.5j}{k} f_j g_{k-j}.$$

Then,

$$
\begin{aligned}
g_1 &= \frac{7.5(1)}{1}(0.341421)(0.049787) = 0.127487, \\
g_2 &= \frac{7.5(1)}{2}(0.341421)(0.127487) + \frac{7.5(2)}{2}(0.042678)(0.049787) = 0.179161, \\
g_3 &= \frac{7.5(1)}{3}(0.341421)(0.179161) + \frac{7.5(2)}{3}(0.042678)(0.127487) + \frac{7.5(3)}{3}(0.010670)(0.049787) = 0.184112.
\end{aligned}
$$

Except for slight rounding differences, these probabilities are the same as those obtained in Example 7.4. □
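The agreement between Examples 7.4 and 7.6 is what Theorem 7.4 predicts, since $7.5(1 - 0.6) = 3$. The sketch below checks this without the rounding in the quoted $f_k$ values. It builds the ETNB probabilities from the standard ETNB formulas, $p_1^T = r\beta/[(1+\beta)^{r+1} - (1+\beta)]$ with $a = \beta/(1+\beta)$ and $b = (r-1)\beta/(1+\beta)$, which are assumed here because they are not restated in this section; the function name and `k_max` are hypothetical.

```python
import math

def compound_poisson_pmf(lam, f, k_max):
    """Compound Poisson probabilities g_0..g_k_max via (7.5) with a = 0, b = lam."""
    g = [math.exp(-lam * (1.0 - f[0]))]
    for k in range(1, k_max + 1):
        g.append(sum(lam * j / k * f[j] * g[k - j] for j in range(1, min(k, len(f) - 1) + 1)))
    return g

# Zero-truncated ETNB secondary with r = -0.5, beta = 1 (standard formulas, assumed).
r, beta, k_max = -0.5, 1.0, 25
a_m, b_m = beta / (1 + beta), (r - 1) * beta / (1 + beta)
f_trunc = [0.0, r * beta / ((1 + beta) ** (r + 1) - (1 + beta))]
for k in range(2, k_max + 1):
    f_trunc.append((a_m + b_m / k) * f_trunc[-1])

# Zero-modified version: probability 0.6 at zero, the rest scaled by 1 - 0.6.
f0 = 0.6
f_mod = [f0] + [(1 - f0) * x for x in f_trunc[1:]]

g_trunc = compound_poisson_pmf(3.0, f_trunc, k_max)   # Example 7.4 parameters
g_mod = compound_poisson_pmf(7.5, f_mod, k_max)       # Example 7.6 parameters
max_gap = max(abs(x - y) for x, y in zip(g_trunc, g_mod))
print(max_gap)
```

Any remaining gap is floating-point noise, which supports reading the differences between the two examples as rounding in the tabulated secondary probabilities.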

7.1.1 Exercises

7.1 Do all the members of the $(a, b, 0)$ class satisfy the condition of Theorem 7.4? For those that do, identify the parameter (or function of its parameters) that plays the role of $\theta$ in the theorem.

7.2 Show that the following three distributions are identical: (1) geometric–geometric, (2) Bernoulli–geometric, and (3) zero-modified geometric. That is, for any one of the distributions with arbitrary parameters, show that there is a member of the other two distribution types that has the same pf or pgf.

7.3 Show that the binomial–geometric and negative binomial–geometric (with negative binomial parameter $r$ a positive integer) distributions are identical.
