
CONTINUOUS MODELS

5.2 Creating New Distributions

5.2.6 Splicing

Another method for creating a new distribution is by splicing. This approach is similar to mixing in that it might be believed that two or more separate processes are responsible for generating the losses. With mixing, the various processes operate on subsets of the population. Once the subset is identified, a simple loss model suffices. For splicing, the processes differ with regard to the loss amount. That is, one model governs the behavior of losses in some interval of possible losses while other models cover the other intervals.

Definition 5.8 makes this precise.

Definition 5.8 A $k$-component spliced distribution has a density function that can be expressed as follows:

$$
f_X(x) =
\begin{cases}
a_1 f_1(x), & c_0 < x < c_1,\\
a_2 f_2(x), & c_1 < x < c_2,\\
\quad\vdots & \quad\vdots\\
a_k f_k(x), & c_{k-1} < x < c_k.
\end{cases}
$$

For $j = 1, \ldots, k$, each $a_j > 0$ and each $f_j(x)$ must be a legitimate density function with all probability on the interval $(c_{j-1}, c_j)$. Also, $a_1 + \cdots + a_k = 1$.

EXAMPLE 5.8

Demonstrate that Model 5 in Section 2.2 is a two-component spliced model.

The density function is

$$
f(x) =
\begin{cases}
0.01, & 0 \le x < 50,\\
0.02, & 50 \le x < 75,
\end{cases}
$$

and the spliced model is created by letting $f_1(x) = 0.02$, $0 \le x < 50$, which is a uniform distribution on the interval from 0 to 50, and $f_2(x) = 0.04$, $50 \le x < 75$, which is a uniform distribution on the interval from 50 to 75. The coefficients are then $a_1 = 0.5$ and $a_2 = 0.5$. □
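The following is a minimal Python sketch of Definition 5.8, checked against the numbers of Example 5.8. The function name `spliced_pdf` and its argument names are illustrative assumptions, not notation from the text, and the half-open intervals follow the example rather than the open intervals of the definition.

```python
def spliced_pdf(x, breaks, weights, densities):
    """Evaluate a k-component spliced density at x.

    breaks    -- break points c_0 < c_1 < ... < c_k
    weights   -- coefficients a_1, ..., a_k (positive, summing to 1)
    densities -- f_1, ..., f_k, each a proper density on its own interval
    """
    for j, (a, f) in enumerate(zip(weights, densities)):
        # Half-open intervals [c_{j-1}, c_j), as in Example 5.8.
        if breaks[j] <= x < breaks[j + 1]:
            return a * f(x)
    return 0.0  # no probability outside [c_0, c_k)


# Example 5.8: two uniform components with equal coefficients.
breaks = [0.0, 50.0, 75.0]
weights = [0.5, 0.5]
densities = [lambda x: 0.02, lambda x: 0.04]

left = spliced_pdf(25.0, breaks, weights, densities)   # value on [0, 50)
right = spliced_pdf(60.0, breaks, weights, densities)  # value on [50, 75)

# Each piece is constant, so total probability is height times width.
total = left * (50.0 - 0.0) + right * (75.0 - 50.0)
print(left, right, total)
```

The two heights reproduce the Model 5 density values 0.01 and 0.02, and the total probability is 1, which is what the conditions on the coefficients and component densities are meant to guarantee.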

It was not necessary to use density functions and coefficients, but this is one way to ensure that the result is a legitimate density function. When using parametric models, the motivation for splicing is that the tail behavior may be inconsistent with the behavior for small losses. For example, experience (based on knowledge beyond that available in the current, perhaps small, data set) may indicate that the tail follows the Pareto distribution, but there is a positive mode more in keeping with the lognormal or inverse Gaussian distributions. A second instance is when there is a large amount of data below some value but a limited amount of information elsewhere. We may want to use the empirical distribution (or a smoothed version of it) up to a certain point and a parametric model beyond that value. Definition 5.8 is appropriate when the break points $c_0, \ldots, c_k$ are known in advance.

Another way to construct a spliced model is to use standard distributions over the range from $c_0$ to $c_k$. Let $g_j(x)$ be the $j$th such density function, with distribution function $G_j(x)$. Then, in Definition 5.8, replace $f_j(x)$ with $g_j(x)/[G_j(c_j) - G_j(c_{j-1})]$. This formulation makes it easier to have the break points become parameters that can be estimated.
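As a rough sketch of this second construction, the Python below rescales a standard density by its probability on an interval so that the result is a proper density there. The helper names (`truncated`, `midpoint_integral`) and the exponential components with means 30 and 100 are hypothetical choices made only for illustration.

```python
import math


def truncated(g, G, lo, hi):
    """Return g restricted to (lo, hi), rescaled to integrate to 1 there."""
    mass = G(hi) - G(lo)  # probability that g places on (lo, hi)
    return lambda x: g(x) / mass if lo < x < hi else 0.0


def exp_pdf(theta):
    """Exponential density with mean theta."""
    return lambda x: math.exp(-x / theta) / theta


def exp_cdf(theta):
    """Exponential distribution function with mean theta."""
    return lambda x: 1.0 - math.exp(-x / theta)


def midpoint_integral(f, lo, hi, n=20000):
    """Midpoint-rule approximation to the integral of f over (lo, hi)."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


# Hypothetical break points and component parameters.
c0, c1, c2 = 0.0, 50.0, 200.0
f1 = truncated(exp_pdf(30.0), exp_cdf(30.0), c0, c1)
f2 = truncated(exp_pdf(100.0), exp_cdf(100.0), c1, c2)

area1 = midpoint_integral(f1, c0, c1)
area2 = midpoint_integral(f2, c1, c2)
print(area1, area2)
```

Both areas come out numerically equal to 1, so `f1` and `f2` can be used directly as the component densities in Definition 5.8. Because the break points enter only through `truncated`, they can be treated as ordinary parameters when fitting.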

Figure 5.1 The two-component spliced density. [Plot of f(x) against x for x from 0 to 250, with f(x) ranging from 0 to 0.01; the two components are labeled Exponential and Pareto.]

Neither approach to splicing ensures that the resulting density function will be continuous (i.e. the components will meet at the break points). Such a restriction could be added to the specification.
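As an illustration of what such a restriction might look like, consider a two-component splice with a single interior break point $c$ and coefficients $a_1 = v$ and $a_2 = 1 - v$ (this notation is assumed here for the sketch). Writing $f_1(c)$ and $f_2(c)$ for the limits of the component densities at $c$ from the left and from the right, both assumed positive, continuity requires

$$
v\, f_1(c) = (1 - v)\, f_2(c)
\quad\Longrightarrow\quad
v = \frac{f_2(c)}{f_1(c) + f_2(c)}.
$$

Under this restriction $v$ is no longer a free parameter; it is determined by the break point and the parameters of the two components.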

EXAMPLE 5.9

Create a two-component spliced model using an exponential distribution from 0 to $c$ and a Pareto distribution (using $\gamma$ in place of $\theta$) from $c$ to $\infty$.

The basic format is

$$
f_X(x) =
\begin{cases}
a_1 \dfrac{\theta^{-1} e^{-x/\theta}}{1 - e^{-c/\theta}}, & 0 < x < c,\\[2ex]
a_2 \dfrac{\alpha \gamma^{\alpha} (x + \gamma)^{-\alpha - 1}}{\gamma^{\alpha} (c + \gamma)^{-\alpha}}, & c < x < \infty.
\end{cases}
$$

However, we must force the density function to integrate to 1. All that is needed is to let $a_1 = v$ and $a_2 = 1 - v$. The spliced density function becomes

$$
f_X(x) =
\begin{cases}
v\, \dfrac{\theta^{-1} e^{-x/\theta}}{1 - e^{-c/\theta}}, & 0 < x < c,\\[2ex]
(1 - v)\, \dfrac{\alpha (c + \gamma)^{\alpha}}{(x + \gamma)^{\alpha + 1}}, & c < x < \infty,
\end{cases}
\qquad \theta, \alpha, \gamma, c > 0,\ 0 < v < 1.
$$

Figure 5.1 illustrates this density function using the values $c = 100$, $v = 0.6$, $\theta = 100$, $\gamma = 200$, and $\alpha = 4$. It is clear that this density is not continuous. □

5.2.7 Exercises

5.1 Let $X$ have cdf $F_X(x) = 1 - (1 + x)^{-\alpha}$, $x, \alpha > 0$. Determine the pdf and cdf of $Y = \theta X$.

5.2 (*) One hundred observed claims in 1995 were arranged as follows: 42 were between 0 and 300, 3 were between 300 and 350, 5 were between 350 and 400, 5 were between 400 and 450, 0 were between 450 and 500, 5 were between 500 and 600, and the remaining 40 were above 600. For the next three years, all claims are inflated by 10% per year. Based on the empirical distribution from 1995, determine a range for the probability that a claim exceeds 500 in 1998 (there is not enough information to determine the probability exactly).

5.3 Let $X$ have a Pareto distribution. Determine the cdf of the inverse, transformed, and inverse transformed distributions. Check Appendix A to determine if any of these distributions have special names.

5.4 Let $X$ have a loglogistic distribution. Demonstrate that the inverse distribution also has a loglogistic distribution. Therefore, there is no need to identify a separate inverse loglogistic distribution.

5.5 Let $Y$ have a lognormal distribution with parameters $\mu$ and $\sigma$. Let $Z = \theta Y$. Show that $Z$ also has a lognormal distribution and, therefore, the addition of a third parameter has not created a new distribution.

5.6 (*) Let $X$ have a Pareto distribution with parameters $\alpha$ and $\theta$. Let $Y = \ln(1 + X/\theta)$. Determine the name of the distribution of $Y$ and its parameters.

5.7 Venter [124] notes that if $X$ has a transformed gamma distribution and its scale parameter $\theta$ has an inverse transformed gamma distribution (where the parameter $\tau$ is the same in both distributions), the resulting mixture has the transformed beta distribution. Demonstrate that this is true.

5.8 (*) Let $N$ have a Poisson distribution with mean $\Lambda$. Let $\Lambda$ have a gamma distribution with mean 1 and variance 2. Determine the unconditional probability that $N = 1$.

5.9 (*) Given a value of $\Theta = \theta$, the random variable $X$ has an exponential distribution with hazard rate function $h(x) = \theta$, a constant. The random variable $\Theta$ has a uniform distribution on the interval $(1, 11)$. Determine $S_X(0.5)$ for the unconditional distribution.

5.10 (*) Let $N$ have a Poisson distribution with mean $\Lambda$. Let $\Lambda$ have a uniform distribution on the interval $(0, 5)$. Determine the unconditional probability that $N \ge 2$.

5.11 (*) A two-point mixed distribution has, with probability $p$, a binomial distribution with parameters $m = 2$ and $q = 0.5$, and with probability $1 - p$, a binomial distribution with parameters $m = 4$ and $q = 0.5$. Determine, as a function of $p$, the probability that this random variable takes on the value 2.

5.12 Determine the probability density function and the hazard rate of the frailty distribution.

5.13 Suppose that $X|\Lambda$ has a Weibull survival function $S_{X|\Lambda}(x|\lambda) = e^{-\lambda x^{\gamma}}$, $x \ge 0$, and $\Lambda$ has an exponential distribution. Demonstrate that the unconditional distribution of $X$ is loglogistic.

5.14 Consider the exponential–inverse Gaussian frailty model with

$$
a(x) = \frac{\theta}{2\sqrt{1 + \theta x}}, \qquad \theta > 0.
$$

(a) Verify that the conditional hazard rate $h_{X|\Lambda}(x|\lambda)$ of $X|\Lambda$ is indeed a valid hazard rate.

(b) Determine the conditional survival function $S_{X|\Lambda}(x|\lambda)$.

(c) If $\Lambda$ has a gamma distribution with parameters $\theta = 1$ and $\alpha$ replaced by $2\alpha$, determine the marginal or unconditional survival function of $X$.

(d) Use (c) to argue that a given frailty model may arise from more than one combination of conditional distributions of $X|\Lambda$ and frailty distributions of $\Lambda$.

5.15 Suppose that $X$ has survival function $S_X(x) = 1 - F_X(x)$, given by (5.3). Show that $S_1(x) = F_X(x)/[\mathrm{E}(\Lambda) A(x)]$ is again a survival function of the form given by (5.3), and identify the distribution of $\Lambda$ associated with $S_1(x)$.

5.16 Fix $s \ge 0$, and define an "Esscher-transformed" frailty random variable $\Lambda_s$ with probability density function (or discrete probability mass function in the discrete case) $f_{\Lambda_s}(\lambda) = e^{-s\lambda} f_{\Lambda}(\lambda)/M_{\Lambda}(-s)$, $\lambda \ge 0$.

(a) Show that $\Lambda_s$ has moment generating function

$$
M_{\Lambda_s}(z) = \mathrm{E}\left(e^{z\Lambda_s}\right) = \frac{M_{\Lambda}(z - s)}{M_{\Lambda}(-s)}.
$$

(b) Define the cumulant generating function of $\Lambda$ to be

$$
c_{\Lambda}(z) = \ln[M_{\Lambda}(z)],
$$

and use (a) to prove that

$$
c'_{\Lambda}(-s) = \mathrm{E}(\Lambda_s) \quad\text{and}\quad c''_{\Lambda}(-s) = \mathrm{Var}(\Lambda_s).
$$

(c) For the frailty model with survival function given by (5.3), prove that the associated hazard rate may be expressed as $h_X(x) = a(x)\, c'_{\Lambda}[-A(x)]$, where $c_{\Lambda}$ is defined in (b).

(d) Use (c) to show that

$$
h'_X(x) = a'(x)\, c'_{\Lambda}[-A(x)] - [a(x)]^2\, c''_{\Lambda}[-A(x)].
$$

(e) Prove using (d) that if the conditional hazard rate $h_{X|\Lambda}(x|\lambda)$ is nonincreasing in $x$, then $h_X(x)$ is also nonincreasing in $x$.

5.17 Write the density function for a two-component spliced model in which the density function is proportional to a uniform density over the interval from 0 to 1,000 and is proportional to an exponential density function from 1,000 to $\infty$. Ensure that the resulting density function is continuous.

5.18 Let $X$ have pdf $f(x) = \exp(-|x/\theta|)/(2\theta)$ for $-\infty < x < \infty$. Let $Y = e^X$. Determine the pdf and cdf of $Y$.

5.19 (*) Losses in 1993 follow the density function $f(x) = 3x^{-4}$, $x \ge 1$, where $x$ is the loss in millions of dollars. Inflation of 10% impacts all claims uniformly from 1993 to 1994. Determine the cdf of losses for 1994 and use it to determine the probability that a 1994 loss exceeds 2,200,000.

5.20 Consider the inverse Gaussian random variable $X$ with pdf (from Appendix A)

$$
f(x) = \sqrt{\frac{\theta}{2\pi x^3}}\, \exp\left[-\frac{\theta}{2x}\left(\frac{x - \mu}{\mu}\right)^2\right], \qquad x > 0,
$$

where $\theta > 0$ and $\mu > 0$ are parameters.

(a) Derive the pdf of the reciprocal inverse Gaussian random variable $1/X$.

(b) Prove that the "joint" moment generating function of $X$ and $1/X$ is given by

$$
M(z_1, z_2) = \mathrm{E}\left(e^{z_1 X + z_2 X^{-1}}\right)
= \sqrt{\frac{\theta}{\theta - 2z_2}}\, \exp\left(\frac{\theta - \sqrt{(\theta - 2\mu^2 z_1)(\theta - 2z_2)}}{\mu}\right),
$$

where $z_1 < \theta/(2\mu^2)$ and $z_2 < \theta/2$.

(c) Use (b) to show that the moment generating function of $X$ is

$$
M_X(z) = \mathrm{E}\left(e^{zX}\right) = \exp\left[\frac{\theta}{\mu}\left(1 - \sqrt{1 - \frac{2\mu^2}{\theta} z}\right)\right], \qquad z < \frac{\theta}{2\mu^2}.
$$

(d) Use (b) to show that the reciprocal inverse Gaussian random variable $1/X$ has moment generating function

$$
M_{1/X}(z) = \mathrm{E}\left(e^{zX^{-1}}\right) = \sqrt{\frac{\theta}{\theta - 2z}}\, \exp\left[\frac{\theta}{\mu}\left(1 - \sqrt{1 - \frac{2}{\theta} z}\right)\right], \qquad z < \frac{\theta}{2}.
$$

Hence prove that $1/X$ has the same distribution as $Z_1 + Z_2$, where $Z_1$ has a gamma distribution, $Z_2$ has an inverse Gaussian distribution, and $Z_1$ is independent of $Z_2$. Also, identify the gamma and inverse Gaussian parameters in this representation.

(e) Use (b) to show that

$$
Z = \frac{1}{X}\left(\frac{X - \mu}{\mu}\right)^2
$$

has a gamma distribution with parameters $\alpha = \tfrac{1}{2}$ and the usual parameter $\theta$ (in Appendix A) replaced by $2/\theta$.

(f) For the mgf of the inverse Gaussian random variable $X$ in (c), prove by induction on $k$ that, for $k = 1, 2, \ldots$, the $k$th derivative of the mgf is

$$
M_X^{(k)}(z) = M_X(z) \sum_{n=0}^{k-1} \frac{(k + n - 1)!}{(k - n - 1)!\, n!} \left(\frac{1}{2}\right)^{\frac{k + 3n}{2}} \theta^{\frac{k - n}{2}} \left(\frac{\theta}{2\mu^2} - z\right)^{-\frac{k + n}{2}}
$$

and hence that the inverse Gaussian random variable has integer moments

$$
\mathrm{E}[X^k] = \sum_{n=0}^{k-1} \frac{(k + n - 1)!}{(k - n - 1)!\, n!}\, \frac{\mu^{n + k}}{(2\theta)^n}, \qquad k = 1, 2, \ldots.
$$

(g) The modified Bessel function $K_{\lambda}(x)$ may be defined, for half-integer values of the index parameter $\lambda$, by $K_{-\lambda}(x) = K_{\lambda}(x)$, together with

$$
K_{m + \frac{1}{2}}(x) = \sqrt{\frac{\pi}{2x}}\, e^{-x} \sum_{j=0}^{m} \frac{(m + j)!}{(m - j)!\, j!} \left(\frac{1}{2x}\right)^j, \qquad m = 0, 1, \ldots.
$$

Use part (f) to prove that, for $\alpha > 0$, $\theta > 0$, and $m = 0, 1, \ldots$,

$$
\int_0^{\infty} x^{m - \frac{3}{2}}\, e^{-\alpha x - \frac{\theta}{2x}}\, dx = 2 \left(\frac{\theta}{2\alpha}\right)^{\frac{m}{2} - \frac{1}{4}} K_{m - \frac{1}{2}}\left(\sqrt{2\alpha\theta}\right).
$$

5.3 Selected Distributions and Their Relationships

From the document: Book LOSS MODELS FROM DATA TO DECISIONS (pages 86-91)