Compound Frequency Distributions - ADVANCED DISCRETE DISTRIBUTIONS

7 ADVANCED DISCRETE DISTRIBUTIONS

7.1 Compound Frequency Distributions

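$S = M_1 + M_2 + \cdots + M_N$ (where $N = 0$ implies that $S = 0$) is $P_S(z) = P_N[P_M(z)]$. This is shown as follows:

$$
\begin{aligned}
P_S(z) &= \sum_{k=0}^{\infty} \Pr(S = k) z^k = \sum_{k=0}^{\infty} \sum_{n=0}^{\infty} \Pr(S = k \mid N = n) \Pr(N = n) z^k \\
&= \sum_{n=0}^{\infty} \Pr(N = n) \sum_{k=0}^{\infty} \Pr(M_1 + \cdots + M_n = k \mid N = n) z^k \\
&= \sum_{n=0}^{\infty} \Pr(N = n) [P_M(z)]^n \\
&= P_N[P_M(z)].
\end{aligned}
$$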

In insurance contexts, this distribution can arise naturally. If $N$ represents the number of accidents arising in a portfolio of risks and $\{M_k : k = 1, 2, \ldots, N\}$ represents the number of claims (injuries, number of cars, etc.) from the accidents, then $S$ represents the total number of claims from the portfolio. This kind of interpretation is not necessary to justify the use of a compound distribution. If a compound distribution fits data well, that may be enough justification itself. Also, there are other motivations for these distributions, as presented in Section 7.5.
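The accident-and-claim interpretation can be checked by simulation. The following Python sketch draws the number of accidents, then the claims per accident, and compares the empirical value of E[z^S] with P_N[P_M(z)]. The two finite distributions, the seed, and all function names are hypothetical choices made only for illustration.

```python
import random

# Hypothetical finite-support distributions, chosen only for illustration.
N_PMF = {0: 0.5, 1: 0.3, 2: 0.2}   # number of accidents N
M_PMF = {1: 0.6, 2: 0.3, 3: 0.1}   # number of claims per accident M


def pgf(pmf, z):
    """Probability generating function E[z^X] of a finite pmf."""
    return sum(p * z ** k for k, p in pmf.items())


def compound_pgf(z):
    """P_S(z) = P_N[P_M(z)], the identity derived above."""
    return pgf(N_PMF, pgf(M_PMF, z))


def simulate_total_claims(rng):
    """Draw S = M_1 + ... + M_N, with S = 0 when N = 0."""
    n = rng.choices(list(N_PMF), weights=list(N_PMF.values()))[0]
    claims = rng.choices(list(M_PMF), weights=list(M_PMF.values()), k=n)
    return sum(claims)


def empirical_pgf(z, draws=100_000, seed=12345):
    """Monte Carlo estimate of E[z^S]."""
    rng = random.Random(seed)
    return sum(z ** simulate_total_claims(rng) for _ in range(draws)) / draws


if __name__ == "__main__":
    z = 0.5
    print("simulated :", empirical_pgf(z))
    print("P_N[P_M(z)]:", compound_pgf(z))
```

With enough draws the two printed values should agree to about two decimal places; the simulation is only a sanity check, not a substitute for the derivation.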

EXAMPLE 7.1

Demonstrate that any zero-modified distribution is a compound distribution.

Consider a primary Bernoulli distribution. It has pgf $P_N(z) = 1 - q + qz$. Then consider an arbitrary secondary distribution with pgf $P_M(z)$. Then, from (7.1), we obtain

$$P_S(z) = P_N[P_M(z)] = 1 - q + q P_M(z).$$

From (6.4) this is the pgf of a ZM distribution with

$$q = \frac{1 - p_0^M}{1 - p_0}.$$

That is, the ZM distribution has assigned arbitrary probability $p_0^M$ at zero, while $p_0$ is the probability assigned at zero by the secondary distribution. □

EXAMPLE 7.2

Consider the case where both $M$ and $N$ have a Poisson distribution. Determine the pgf of this distribution.

This distribution is called the Poisson–Poisson or Neyman Type A distribution.

Let $P_N(z) = e^{\lambda_1 (z-1)}$ and $P_M(z) = e^{\lambda_2 (z-1)}$. Then,

$$P_S(z) = e^{\lambda_1 [e^{\lambda_2 (z-1)} - 1]}.$$

When $\lambda_2$ is a lot larger than $\lambda_1$, for example, $\lambda_1 = 0.1$ and $\lambda_2 = 10$, the resulting distribution will have two local modes. □
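The two-mode behavior can be inspected numerically. The Python sketch below relies on the fact that a sum of n independent Poisson(λ2) variables is Poisson(nλ2), so the Neyman Type A probabilities are a Poisson(λ1)-weighted mixture of Poisson probabilities. The function names and the truncation point nmax are assumptions made for illustration.

```python
import math


def poisson_pmf(lam, k):
    """Poisson probability; note 0.0 ** 0 == 1.0, so lam = 0 puts all mass at 0."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def neyman_type_a(lam1, lam2, kmax, nmax=30):
    """Pr(S = k) for k = 0..kmax, truncating the sum over N at nmax."""
    probs = []
    for k in range(kmax + 1):
        # Given N = n, S is Poisson with mean n * lam2.
        probs.append(sum(poisson_pmf(lam1, n) * poisson_pmf(n * lam2, k)
                         for n in range(nmax + 1)))
    return probs


def local_modes(probs):
    """Indices whose probability exceeds both neighbours (last index skipped)."""
    modes = []
    for k in range(len(probs) - 1):
        left = probs[k - 1] if k > 0 else -1.0
        if probs[k] > left and probs[k] > probs[k + 1]:
            modes.append(k)
    return modes


if __name__ == "__main__":
    probs = neyman_type_a(0.1, 10.0, 40)
    print("local modes:", local_modes(probs))
    print("Pr(S = 0)  :", probs[0])
```

For λ1 = 0.1 and λ2 = 10 the reported modes should include 0 (no accidents, the most likely outcome) and 10 (one accident with about ten claims).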

The probability of exactly $k$ claims can be written as

$$
\begin{aligned}
\Pr(S = k) &= \sum_{n=0}^{\infty} \Pr(S = k \mid N = n) \Pr(N = n) \\
&= \sum_{n=0}^{\infty} \Pr(M_1 + \cdots + M_N = k \mid N = n) \Pr(N = n) \\
&= \sum_{n=0}^{\infty} \Pr(M_1 + \cdots + M_n = k) \Pr(N = n).
\end{aligned} \tag{7.2}
$$

Letting $g_n = \Pr(S = n)$, $p_n = \Pr(N = n)$, and $f_n = \Pr(M = n)$, this is rewritten as

$$g_k = \sum_{n=0}^{\infty} p_n f_k^{*n}, \tag{7.3}$$

where $f_k^{*n}$, $k = 0, 1, \ldots$, is the “$n$-fold convolution” of the function $f_k$, $k = 0, 1, \ldots$, that is, the probability that the sum of $n$ i.i.d. random variables, each with probability function $f_k$, will take on value $k$.
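Formula (7.3) can be evaluated directly by building the n-fold convolutions one at a time. The Python sketch below does this for a truncated list of primary probabilities; the function names and the truncation of both lists are assumptions made for illustration.

```python
import math


def convolve(u, v, kmax):
    """First kmax + 1 terms of the convolution of two probability sequences."""
    out = [0.0] * (kmax + 1)
    for i, u_i in enumerate(u[: kmax + 1]):
        for j, v_j in enumerate(v[: kmax + 1 - i]):
            out[i + j] += u_i * v_j
    return out


def compound_by_convolution(p, f, kmax):
    """g_k = sum_n p_n f_k^{*n} for k = 0..kmax, using only the p_n supplied."""
    g = [0.0] * (kmax + 1)
    f_star_n = [1.0] + [0.0] * kmax        # f^{*0}: all probability at zero
    for p_n in p:
        for k in range(kmax + 1):
            g[k] += p_n * f_star_n[k]
        f_star_n = convolve(f_star_n, f, kmax)   # step from f^{*n} to f^{*(n+1)}
    return g


def poisson_probs(lam, nmax):
    """Poisson probabilities p_0..p_nmax, used here as the primary distribution."""
    return [math.exp(-lam) * lam ** n / math.factorial(n) for n in range(nmax + 1)]


if __name__ == "__main__":
    f = [0.0, 0.853553, 0.106694, 0.026674]      # secondary probabilities f_0..f_3
    print(compound_by_convolution(poisson_probs(3.0, 40), f, 3))
```

Each step costs a full convolution, so the work grows quickly with the number of terms kept.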

When $P_N(z)$ is chosen to be a member of the $(a, b, 0)$ class,

$$p_k = \left(a + \frac{b}{k}\right) p_{k-1}, \quad k = 1, 2, \ldots, \tag{7.4}$$

then a simple recursive formula can be used. This formula avoids the use of convolutions and thus reduces the computations considerably.

Theorem 7.1 If the primary distribution is a member of the $(a, b, 0)$ class, the recursive formula is

$$g_k = \frac{1}{1 - a f_0} \sum_{j=1}^{k} \left(a + \frac{bj}{k}\right) f_j g_{k-j}, \quad k = 1, 2, 3, \ldots. \tag{7.5}$$

Proof: From (7.4),

$$n p_n = a(n-1) p_{n-1} + (a+b) p_{n-1}.$$

Multiplying each side by $[P_M(z)]^{n-1} P_M'(z)$ and summing over $n$ yields

$$
\begin{aligned}
\sum_{n=1}^{\infty} n p_n [P_M(z)]^{n-1} P_M'(z) ={}& a \sum_{n=1}^{\infty} (n-1) p_{n-1} [P_M(z)]^{n-1} P_M'(z) \\
&+ (a+b) \sum_{n=1}^{\infty} p_{n-1} [P_M(z)]^{n-1} P_M'(z).
\end{aligned}
$$

Because $P_S(z) = \sum_{n=0}^{\infty} p_n [P_M(z)]^n$, the previous equation is

$$P_S'(z) = a \sum_{n=0}^{\infty} n p_n [P_M(z)]^n P_M'(z) + (a+b) \sum_{n=0}^{\infty} p_n [P_M(z)]^n P_M'(z).$$

Therefore,

$$P_S'(z) = a P_S'(z) P_M(z) + (a+b) P_S(z) P_M'(z).$$

Each side can be expanded in powers of $z$. The coefficients of $z^{k-1}$ in such an expansion must be the same on both sides of the equation. Hence, for $k = 1, 2, \ldots$, we have

$$
\begin{aligned}
k g_k &= a \sum_{j=0}^{k} (k-j) f_j g_{k-j} + (a+b) \sum_{j=0}^{k} j f_j g_{k-j} \\
&= a k f_0 g_k + a \sum_{j=1}^{k} (k-j) f_j g_{k-j} + (a+b) \sum_{j=1}^{k} j f_j g_{k-j} \\
&= a k f_0 g_k + a k \sum_{j=1}^{k} f_j g_{k-j} + b \sum_{j=1}^{k} j f_j g_{k-j}.
\end{aligned}
$$

Therefore,

$$g_k = a f_0 g_k + \sum_{j=1}^{k} \left(a + \frac{bj}{k}\right) f_j g_{k-j}.$$

Rearrangement yields (7.5). □
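Recursion (7.5) translates almost line for line into code. The Python sketch below assumes the caller supplies the starting value g_0 and a finite list of secondary probabilities; the function name and the binomial illustration are hypothetical and not taken from the text.

```python
def compound_ab0(a, b, g0, f, kmax):
    """Recursion (7.5): g_0..g_kmax for a primary in the (a, b, 0) class.

    f[j] = Pr(M = j); probabilities beyond the end of the list are treated
    as zero. g0 must be supplied by the caller.
    """
    g = [g0]
    for k in range(1, kmax + 1):
        total = sum((a + b * j / k) * f[j] * g[k - j]
                    for j in range(1, min(k, len(f) - 1) + 1))
        g.append(total / (1.0 - a * f[0]))
    return g


if __name__ == "__main__":
    # Illustration: binomial(m = 2, q = 0.4) primary, so a = -q/(1 - q) and
    # b = (m + 1) q/(1 - q), with a Bernoulli(0.5) secondary. Each of the two
    # trials then produces a claim with probability 0.2, so S is binomial(2, 0.2).
    q, m = 0.4, 2
    a, b = -q / (1 - q), (m + 1) * q / (1 - q)
    g0 = (1 + q * (0.5 - 1)) ** m          # binomial pgf evaluated at f_0 = 0.5
    print(compound_ab0(a, b, g0, [0.5, 0.5], 3))
```

Each g_k needs only the earlier values g_0, ..., g_{k-1}, so computing probabilities up to k takes on the order of k^2 multiplications and no convolutions.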

In order to use (7.5), the starting value $g_0$ is required and is given in Theorem 7.3. If the primary distribution is a member of the $(a, b, 1)$ class, the proof must be modified to reflect the fact that the recursion for the primary distribution begins at $k = 2$. The result is the following.

Theorem 7.2 If the primary distribution is a member of the $(a, b, 1)$ class, the recursive formula is

$$g_k = \frac{[p_1 - (a+b) p_0] f_k + \sum_{j=1}^{k} (a + bj/k) f_j g_{k-j}}{1 - a f_0}, \quad k = 1, 2, 3, \ldots. \tag{7.6}$$
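A minimal Python sketch of recursion (7.6) follows. It assumes the caller supplies p_0, p_1, and the starting value g_0 for the chosen primary distribution. The zero-modified Poisson illustration, its parameter values, and both function names are hypothetical.

```python
import math


def compound_ab1(a, b, p0, p1, g0, f, kmax):
    """Recursion (7.6): g_0..g_kmax for a primary in the (a, b, 1) class."""
    def f_at(j):
        return f[j] if j < len(f) else 0.0

    extra = p1 - (a + b) * p0      # vanishes when the primary is in (a, b, 0)
    g = [g0]
    for k in range(1, kmax + 1):
        total = extra * f_at(k) + sum(
            (a + b * j / k) * f_at(j) * g[k - j] for j in range(1, k + 1))
        g.append(total / (1.0 - a * f[0]))
    return g


def zm_poisson_example(lam=2.0, p0_mod=0.3, f=(0.2, 0.5, 0.3), kmax=8):
    """Zero-modified Poisson primary with a = 0 and b = lam (illustrative values)."""
    # Rescaling factor applied to the Poisson probabilities p_k, k >= 1.
    scale = (1.0 - p0_mod) / (1.0 - math.exp(-lam))
    p1 = scale * lam * math.exp(-lam)
    # Starting value: the zero-modified Poisson pgf evaluated at f_0.
    g0 = p0_mod + scale * (math.exp(lam * (f[0] - 1.0)) - math.exp(-lam))
    return compound_ab1(0.0, lam, p0_mod, p1, g0, list(f), kmax)


if __name__ == "__main__":
    print(zm_poisson_example())
```

When p_1 = (a + b) p_0 the extra term drops out and the function reproduces (7.5).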

Proof: It is similar to the proof of Theorem 7.1 and is omitted. □

EXAMPLE 7.3

Develop the recursive formula for the case where the primary distribution is Poisson.

In this case, $a = 0$ and $b = \lambda$, yielding the recursive form

$$g_k = \frac{\lambda}{k} \sum_{j=1}^{k} j f_j g_{k-j}.$$

The starting value is, from (7.1),

$$
\begin{aligned}
g_0 &= \Pr(S = 0) = P_S(0) \\
&= P_N[P_M(0)] = P_N(f_0) \\
&= e^{-\lambda(1 - f_0)}.
\end{aligned} \tag{7.7}
$$

Distributions of this type are called compound Poisson. When the secondary distribution is specified, the compound distribution is called Poisson–X, where X is the name of the secondary distribution. □

The method used to obtain $g_0$ applies to any compound distribution.

Theorem 7.3 For any compound distribution, $g_0 = P_N(f_0)$, where $P_N(z)$ is the pgf of the primary distribution and $f_0$ is the probability that the secondary distribution takes on the value zero.

Proof: See the second line of (7.7). □

Note that the secondary distribution is not required to be in any special form. However, to keep the number of distributions manageable, secondary distributions are selected from the $(a, b, 0)$ or the $(a, b, 1)$ class.

EXAMPLE 7.4

Calculate the probabilities for the Poisson–ETNB distribution, where $\lambda = 3$ for the Poisson distribution and the ETNB distribution has $r = -0.5$ and $\beta = 1$.

From Example 6.5, the secondary probabilities are $f_0 = 0$, $f_1 = 0.853553$, $f_2 = 0.106694$, and $f_3 = 0.026674$. From (7.7), $g_0 = \exp[-3(1 - 0)] = 0.049787$.

For the Poisson primary distribution, $a = 0$ and $b = 3$. The recursive formula in (7.5) becomes

$$g_k = \frac{\sum_{j=1}^{k} (3j/k) f_j g_{k-j}}{1 - 0(0)} = \sum_{j=1}^{k} \frac{3j}{k} f_j g_{k-j}.$$

Then,

$$
\begin{aligned}
g_1 &= \frac{3(1)}{1} 0.853553(0.049787) = 0.127488, \\
g_2 &= \frac{3(1)}{2} 0.853553(0.127488) + \frac{3(2)}{2} 0.106694(0.049787) = 0.179163, \\
g_3 &= \frac{3(1)}{3} 0.853553(0.179163) + \frac{3(2)}{3} 0.106694(0.127488) + \frac{3(3)}{3} 0.026674(0.049787) = 0.184114.
\end{aligned}
$$

□
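The arithmetic in Example 7.4 can be reproduced with a short Python sketch. The ETNB probabilities are generated from the standard zero-truncated negative binomial starting value and its (a, b, 1) recursion, which is an assumption here because Example 6.5 is not reproduced in this section. The function names are hypothetical.

```python
import math


def etnb_probabilities(r, beta, kmax):
    """f_0..f_kmax for the ETNB distribution (zero-truncated, so f_0 = 0)."""
    a = beta / (1.0 + beta)
    b = (r - 1.0) * beta / (1.0 + beta)
    f = [0.0, r * beta / ((1.0 + beta) ** (r + 1.0) - (1.0 + beta))]
    for k in range(2, kmax + 1):
        f.append((a + b / k) * f[-1])       # f_k = (a + b/k) f_{k-1}
    return f


def poisson_compound(lam, f, kmax):
    """Compound Poisson probabilities: (7.7) for g_0, then (7.5) with a = 0, b = lam."""
    g = [math.exp(-lam * (1.0 - f[0]))]
    for k in range(1, kmax + 1):
        g.append(lam / k * sum(j * f[j] * g[k - j] for j in range(1, k + 1)))
    return g


if __name__ == "__main__":
    f = etnb_probabilities(-0.5, 1.0, 3)
    g = poisson_compound(3.0, f, 3)
    for k, g_k in enumerate(g):
        print(f"f_{k} = {f[k]:.6f}   g_{k} = {g_k:.6f}")
```

Because the sketch carries full floating-point precision while the example rounds intermediate values, the results may differ from the printed ones in the sixth decimal place.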

EXAMPLE 7.5

Demonstrate that the Poisson–logarithmic distribution is a negative binomial distribution.

The negative binomial distribution has pgf

$$P(z) = [1 - \beta(z-1)]^{-r}.$$

Suppose that $P_N(z)$ is Poisson($\lambda$) and $P_M(z)$ is logarithmic($\beta$). Then,

$$
\begin{aligned}
P_N[P_M(z)] &= \exp\{\lambda [P_M(z) - 1]\} \\
&= \exp\left\{ \lambda \left[ 1 - \frac{\ln[1 - \beta(z-1)]}{\ln(1+\beta)} - 1 \right] \right\} \\
&= \exp\left\{ \frac{-\lambda}{\ln(1+\beta)} \ln[1 - \beta(z-1)] \right\} \\
&= [1 - \beta(z-1)]^{-\lambda/\ln(1+\beta)} \\
&= [1 - \beta(z-1)]^{-r},
\end{aligned}
$$

where $r = \lambda/\ln(1+\beta)$. This shows that the negative binomial distribution can be written as a compound Poisson distribution with a logarithmic secondary distribution.
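The identity can also be checked numerically, term by term. The Python sketch below computes Poisson–logarithmic probabilities with the compound Poisson recursion and compares them with negative binomial probabilities for r = λ/ln(1 + β). The logarithmic probability function and the negative binomial (a, b, 0) recursion are the standard ones and are assumed here; the parameter values and function names are hypothetical.

```python
import math


def logarithmic_pmf(beta, kmax):
    """f_k = [beta/(1 + beta)]^k / [k ln(1 + beta)] for k >= 1, with f_0 = 0."""
    ratio = beta / (1.0 + beta)
    return [0.0] + [ratio ** k / (k * math.log1p(beta)) for k in range(1, kmax + 1)]


def poisson_compound(lam, f, kmax):
    """Compound Poisson probabilities g_0..g_kmax via the Poisson recursion."""
    g = [math.exp(-lam * (1.0 - f[0]))]
    for k in range(1, kmax + 1):
        g.append(lam / k * sum(j * f[j] * g[k - j] for j in range(1, k + 1)))
    return g


def negative_binomial_pmf(r, beta, kmax):
    """p_k = (a + b/k) p_{k-1} with a = beta/(1 + beta) and b = (r - 1) a."""
    a = beta / (1.0 + beta)
    p = [(1.0 + beta) ** (-r)]
    for k in range(1, kmax + 1):
        p.append((a + (r - 1.0) * a / k) * p[-1])
    return p


if __name__ == "__main__":
    lam, beta, kmax = 2.0, 1.5, 20           # illustrative parameter values
    r = lam / math.log1p(beta)
    g = poisson_compound(lam, logarithmic_pmf(beta, kmax), kmax)
    p = negative_binomial_pmf(r, beta, kmax)
    print(max(abs(x - y) for x, y in zip(g, p)))
```

The printed maximum difference should be at the level of floating-point rounding error.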

□

Example 7.5 shows that the Poisson–logarithmic distribution does not create a new distribution beyond the $(a, b, 0)$ and $(a, b, 1)$ classes. As a result, this combination of distributions is not useful to us. Another combination that does not create a new distribution beyond the $(a, b, 1)$ class is the compound geometric distribution, where both the primary and secondary distributions are geometric. The resulting distribution is a zero-modified geometric distribution, as shown in Exercise 7.2. The following theorem shows that certain other combinations are also of no use in expanding the class of distributions through compounding.

Suppose that $P_S(z) = P_N[P_M(z)]$ as before. Now, $P_M(z)$ can always be written as

$$P_M(z) = f_0 + (1 - f_0) P_M^T(z), \tag{7.8}$$

where $P_M^T(z)$ is the pgf of the conditional distribution over the positive range (in other words, the zero-truncated version).

Theorem 7.4 Suppose that the pgf $P_N(z; \theta)$ satisfies

$$P_N(z; \theta) = B[\theta(z-1)]$$

for some parameter $\theta$ and some function $B(z)$ that is independent of $\theta$. That is, the parameter $\theta$ and the argument $z$ only appear in the pgf as $\theta(z-1)$. There may be other parameters as well, and they may appear anywhere in the pgf. Then, $P_S(z) = P_N[P_M(z); \theta]$ can be rewritten as

$$P_S(z) = P_N[P_M^T(z); \theta(1 - f_0)].$$

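Proof:

$$
\begin{aligned}
P_S(z) &= P_N[P_M(z); \theta] \\
&= P_N[f_0 + (1 - f_0) P_M^T(z); \theta] \\
&= B\{\theta [f_0 + (1 - f_0) P_M^T(z) - 1]\} \\
&= B\{\theta (1 - f_0) [P_M^T(z) - 1]\} \\
&= P_N[P_M^T(z); \theta(1 - f_0)].
\end{aligned}
$$

□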

This shows that adding, deleting, or modifying the probability at zero in the secondary distribution does not add a new distribution because it is equivalent to modifying the parameter $\theta$ of the primary distribution. Thus, for example, a Poisson primary distribution with a Poisson, zero-truncated Poisson, or zero-modified Poisson secondary distribution will still lead to a Neyman Type A (Poisson–Poisson) distribution.

EXAMPLE 7.6

Determine the probabilities for a Poisson–zero-modified ETNB distribution where the parameters are $\lambda = 7.5$, $p_0^M = 0.6$, $r = -0.5$, and $\beta = 1$.

From Example 6.5, the secondary probabilities are $f_0 = 0.6$, $f_1 = 0.341421$, $f_2 = 0.042678$, and $f_3 = 0.010670$. From (7.7), $g_0 = \exp[-7.5(1 - 0.6)] = 0.049787$.

For the Poisson primary distribution, $a = 0$ and $b = 7.5$. The recursive formula in (7.5) becomes

$$g_k = \frac{\sum_{j=1}^{k} (7.5j/k) f_j g_{k-j}}{1 - 0(0.6)} = \sum_{j=1}^{k} \frac{7.5j}{k} f_j g_{k-j}.$$

Then,

$$
\begin{aligned}
g_1 &= \frac{7.5(1)}{1} 0.341421(0.049787) = 0.127487, \\
g_2 &= \frac{7.5(1)}{2} 0.341421(0.127487) + \frac{7.5(2)}{2} 0.042678(0.049787) = 0.179161, \\
g_3 &= \frac{7.5(1)}{3} 0.341421(0.179161) + \frac{7.5(2)}{3} 0.042678(0.127487) + \frac{7.5(3)}{3} 0.010670(0.049787) = 0.184112.
\end{aligned}
$$

Except for slight rounding differences, these probabilities are the same as those obtained in Example 7.4. □
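The agreement between Examples 7.4 and 7.6 is an instance of Theorem 7.4 with θ = λ, since 7.5(1 − 0.6) = 3. The Python sketch below computes both sets of probabilities at full precision and reports the largest gap. The ETNB formulas are the standard ones and are assumed here; the function names are hypothetical.

```python
import math


def etnb_pmf(r, beta, kmax):
    """Zero-truncated (extended) negative binomial probabilities f_0..f_kmax."""
    a = beta / (1.0 + beta)
    b = (r - 1.0) * beta / (1.0 + beta)
    f = [0.0, r * beta / ((1.0 + beta) ** (r + 1.0) - (1.0 + beta))]
    for k in range(2, kmax + 1):
        f.append((a + b / k) * f[-1])
    return f


def zero_modify(f_truncated, p0_mod):
    """Place probability p0_mod at zero and rescale the positive part."""
    return [p0_mod] + [(1.0 - p0_mod) * f_k for f_k in f_truncated[1:]]


def poisson_compound(lam, f, kmax):
    """Compound Poisson probabilities from (7.7) and the Poisson case of (7.5)."""
    g = [math.exp(-lam * (1.0 - f[0]))]
    for k in range(1, kmax + 1):
        g.append(lam / k * sum(j * f[j] * g[k - j] for j in range(1, k + 1)))
    return g


KMAX = 10
f_t = etnb_pmf(-0.5, 1.0, KMAX)
g_truncated = poisson_compound(3.0, f_t, KMAX)                    # as in Example 7.4
g_modified = poisson_compound(7.5, zero_modify(f_t, 0.6), KMAX)   # as in Example 7.6
max_gap = max(abs(x - y) for x, y in zip(g_truncated, g_modified))

if __name__ == "__main__":
    print("largest difference:", max_gap)
```

Without intermediate rounding, the two sets of probabilities should coincide up to floating-point error, which is what the theorem predicts.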

7.1.1 Exercises

7.1 Do all the members of the $(a, b, 0)$ class satisfy the condition of Theorem 7.4? For those that do, identify the parameter (or function of its parameters) that plays the role of $\theta$ in the theorem.

7.2 Show that the following three distributions are identical: (1) geometric–geometric, (2) Bernoulli–geometric, and (3) zero-modified geometric. That is, for any one of the distributions with arbitrary parameters, show that there is a member of the other two distribution types that has the same pf or pgf.

7.3 Show that the binomial–geometric and negative binomial–geometric (with negative binomial parameter $r$ a positive integer) distributions are identical.

In document: Book LOSS MODELS FROM DATA TO DECISIONS (pages 116-122)