

From the document: Book LOSS MODELS FROM DATA TO DECISIONS (pages 50-58)

BASIC DISTRIBUTIONAL QUANTITIES

3.4 Tails of Distributions

The tail of a distribution (more properly, the right tail) is the portion of the distribution corresponding to large values of the random variable. Understanding large possible loss values is important because these have the greatest effect on total losses. Random variables that tend to assign higher probabilities to larger values are said to be heavier tailed. Tail weight can be a relative concept (model A has a heavier tail than model B) or an absolute concept (distributions with a certain property are classified as heavy tailed). When choosing models, tail weight can help narrow the choices or can confirm a choice for a model.

3.4.1 Classification Based on Moments

Recall that in the continuous case, the $k$th raw moment for a random variable that takes on only positive values (like most insurance payment variables) is given by $\int_0^\infty x^k f(x)\,dx$. Depending on the density function and the value of $k$, this integral may not exist (i.e. it may be infinite). One way of classifying distributions is on the basis of whether all moments exist. It is generally agreed that the existence of all positive moments indicates a (relatively) light right tail, while the existence of only positive moments up to a certain value (or existence of no positive moments at all) indicates a heavy right tail.

EXAMPLE 3.9

Demonstrate that for the gamma distribution all positive moments exist but for the Pareto distribution they do not.

For the gamma distribution, the raw moments are

$$
\begin{aligned}
\mu_k' &= \int_0^\infty x^k \, \frac{x^{\alpha-1} e^{-x/\theta}}{\Gamma(\alpha)\theta^{\alpha}} \, dx \\
&= \int_0^\infty (y\theta)^k \, \frac{(y\theta)^{\alpha-1} e^{-y}}{\Gamma(\alpha)\theta^{\alpha}} \, \theta \, dy, \quad \text{making the substitution } y = x/\theta \\
&= \frac{\theta^k}{\Gamma(\alpha)} \, \Gamma(\alpha+k) < \infty \quad \text{for all } k > 0.
\end{aligned}
$$

For the Pareto distribution, they are

$$
\begin{aligned}
\mu_k' &= \int_0^\infty x^k \, \frac{\alpha\theta^{\alpha}}{(x+\theta)^{\alpha+1}} \, dx \\
&= \int_\theta^\infty (y-\theta)^k \, \frac{\alpha\theta^{\alpha}}{y^{\alpha+1}} \, dy, \quad \text{making the substitution } y = x+\theta \\
&= \alpha\theta^{\alpha} \int_\theta^\infty \sum_{j=0}^{k} \binom{k}{j} y^{j-\alpha-1} (-\theta)^{k-j} \, dy, \quad \text{for integer values of } k.
\end{aligned}
$$

The integral exists only if all of the exponents on $y$ in the sum are less than $-1$, that is, if $j-\alpha-1 < -1$ for all $j$ or, equivalently, if $k < \alpha$. Therefore, only some moments exist. □
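
The conclusion of Example 3.9 can be checked numerically. The following is a minimal sketch, not part of the original text: the function names are hypothetical, the gamma formula is the one derived above, and the Pareto closed form $\theta^k\,\Gamma(k+1)\,\Gamma(\alpha-k)/\Gamma(\alpha)$ for $k<\alpha$ is a standard result that is assumed here rather than derived in this section.

```python
import math


def gamma_raw_moment(k, alpha, theta):
    """k-th raw moment of a gamma(alpha, theta) variable: theta^k * Gamma(alpha + k) / Gamma(alpha)."""
    # Work with log-gamma values so that large alpha + k does not overflow early.
    return theta ** k * math.exp(math.lgamma(alpha + k) - math.lgamma(alpha))


def pareto_raw_moment(k, alpha, theta):
    """k-th raw moment of a Pareto(alpha, theta) variable; infinite once k >= alpha."""
    if k >= alpha:
        return math.inf  # the defining integral diverges, as argued in Example 3.9
    # Assumed closed form for 0 <= k < alpha (not derived in this section).
    return theta ** k * math.gamma(k + 1) * math.gamma(alpha - k) / math.gamma(alpha)


if __name__ == "__main__":
    # Illustrative parameters only.
    for k in range(1, 6):
        print(k, gamma_raw_moment(k, 3.0, 10.0), pareto_raw_moment(k, 3.0, 10.0))
```

With $\alpha = 3$ the Pareto moments are finite only for $k = 1, 2$, while every gamma moment is finite.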

By this classification, the Pareto distribution is said to have a heavy tail and the gamma distribution is said to have a light tail. A look at the moment formulas in Appendix A reveals which distributions have heavy tails and which do not, as indicated by the existence of moments.

It is instructive to note that if a distribution does not have all its positive moments, then it does not have a moment generating function (i.e. if $X$ is the associated random variable, then $\mathrm{E}(e^{zX}) = \infty$ for all $z > 0$). However, the converse is not true. The lognormal distribution has no moment generating function even though all its positive moments are finite.

Further comparisons of tail behavior can be made on the basis of ratios of moments (assuming they exist). In particular, heavy-tailed behavior is typically associated with large values of quantities such as the coefficient of variation, the skewness, and the kurtosis (see Definition 3.2).

3.4.2 Comparison Based on Limiting Tail Behavior

A commonly used indication that one distribution has a heavier tail than another distribution with the same mean is that the ratio of the two survival functions should diverge to infinity (with the heavier-tailed distribution in the numerator) as the argument becomes large. The divergence implies that the numerator distribution puts significantly more probability on large values. Note that it is equivalent to examine the ratio of density functions. The limit of the ratio will be the same, as can be seen by an application of L’Hôpital’s rule:

$$
\lim_{x\to\infty} \frac{S_1(x)}{S_2(x)} = \lim_{x\to\infty} \frac{S_1'(x)}{S_2'(x)} = \lim_{x\to\infty} \frac{-f_1(x)}{-f_2(x)} = \lim_{x\to\infty} \frac{f_1(x)}{f_2(x)}.
$$

Figure 3.7 The tails of the gamma and Pareto distributions. [Plot of $f(x)$ against $x$ for $x$ from 50 to 150, with one curve each for the Pareto and gamma densities.]

EXAMPLE 3.10

Demonstrate that the Pareto distribution has a heavier tail than the gamma distribution using the limit of the ratio of their density functions.

To avoid confusion, the letters $\tau$ and $\lambda$ will be used for the parameters of the gamma distribution instead of the customary $\alpha$ and $\theta$. Then, the required limit is

$$
\begin{aligned}
\lim_{x\to\infty} \frac{f_{\text{Pareto}}(x)}{f_{\text{gamma}}(x)} &= \lim_{x\to\infty} \frac{\alpha\theta^{\alpha}(x+\theta)^{-\alpha-1}}{x^{\tau-1} e^{-x/\lambda} \lambda^{-\tau} \Gamma(\tau)^{-1}} \\
&= c \lim_{x\to\infty} \frac{e^{x/\lambda}}{(x+\theta)^{\alpha+1} x^{\tau-1}} \\
&> c \lim_{x\to\infty} \frac{e^{x/\lambda}}{(x+\theta)^{\alpha+\tau}}
\end{aligned}
$$

and, either by application of L’Hôpital’s rule or by remembering that exponentials go to infinity faster than polynomials, the limit is infinity. Figure 3.7 shows a portion of the density functions for a Pareto distribution with parameters $\alpha = 3$ and $\theta = 10$ and a gamma distribution with parameters $\alpha = \frac{1}{3}$ and $\theta = 15$. Both distributions have a mean of 5 and a variance of 75. The graph is consistent with the algebraic derivation. □
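
The divergence of the density ratio in Example 3.10 can also be observed numerically. The sketch below is an illustration under stated assumptions, not the book's own computation: the function names are hypothetical, and the default parameters are the ones quoted for Figure 3.7 (Pareto with $\alpha = 3$, $\theta = 10$; gamma with shape $\frac{1}{3}$ and scale 15, written `tau` and `lam` as in the example).

```python
import math


def pareto_pdf(x, alpha, theta):
    """Pareto density alpha * theta^alpha / (x + theta)^(alpha + 1)."""
    return alpha * theta ** alpha / (x + theta) ** (alpha + 1)


def gamma_pdf(x, tau, lam):
    """Gamma density with shape tau and scale lam, evaluated through its logarithm."""
    log_f = (tau - 1) * math.log(x) - x / lam - tau * math.log(lam) - math.lgamma(tau)
    return math.exp(log_f)


def density_ratio(x, alpha=3.0, theta=10.0, tau=1.0 / 3.0, lam=15.0):
    """Ratio f_Pareto(x) / f_gamma(x) for the Figure 3.7 parameters (defaults)."""
    return pareto_pdf(x, alpha, theta) / gamma_pdf(x, tau, lam)


if __name__ == "__main__":
    for x in (50, 100, 150, 300, 600):
        print(x, density_ratio(x))
```

For these parameters the ratio is below 1 at $x = 50$, above 1 at $x = 150$, and keeps growing, which matches the crossing of the two curves in the plotted range.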

3.4.3 Classification Based on the Hazard Rate Function

The hazard rate function also reveals information about the tail of the distribution. Distributions with decreasing hazard rate functions have heavy tails. Distributions with increasing hazard rate functions have light tails. In the ensuing discussion, we understand “decreasing” to mean “nonincreasing” and “increasing” to mean “nondecreasing.” That is, a decreasing function can be level at times. The exponential distribution, which has a constant hazard rate, is therefore said to have both a decreasing and an increasing hazard rate. For distributions with monotone hazard rates, distributions with exponential tails divide the distributions into heavy-tailed and light-tailed distributions.

Comparisons between distributions can be made on the basis of the rate of increase or decrease of the hazard rate function. For example, a distribution has a lighter tail than another if its hazard rate function is increasing at a faster rate. Often, these comparisons and classifications are of interest primarily in the right tail of the distribution, that is, for large functional values.

EXAMPLE 3.11

Compare the tails of the Pareto and gamma distributions by looking at their hazard rate functions.

The hazard rate function for the Pareto distribution is

$$
h(x) = \frac{f(x)}{S(x)} = \frac{\alpha\theta^{\alpha}(x+\theta)^{-\alpha-1}}{\theta^{\alpha}(x+\theta)^{-\alpha}} = \frac{\alpha}{x+\theta},
$$

which is decreasing. For the gamma distribution we need to be a bit more clever because there is no closed-form expression for $S(x)$. Observe that

$$
\frac{1}{h(x)} = \frac{\int_x^\infty f(t)\,dt}{f(x)} = \frac{\int_0^\infty f(x+y)\,dy}{f(x)},
$$

and so, if $f(x+y)/f(x)$ is an increasing function of $x$ for any fixed $y$, then $1/h(x)$ will be increasing in $x$ and thus the random variable will have a decreasing hazard rate. Now, for the gamma distribution,

$$
\frac{f(x+y)}{f(x)} = \frac{(x+y)^{\alpha-1} e^{-(x+y)/\theta}}{x^{\alpha-1} e^{-x/\theta}} = \left(1 + \frac{y}{x}\right)^{\alpha-1} e^{-y/\theta},
$$

which is strictly increasing in $x$ provided that $\alpha < 1$ and strictly decreasing in $x$ if $\alpha > 1$. By this measure, some gamma distributions have a heavy tail (those with $\alpha < 1$) and some (with $\alpha > 1$) have a light tail. Note that when $\alpha = 1$, we have the exponential distribution and a constant hazard rate. Also, even though $h(x)$ is complicated in the gamma case, we know what happens for large $x$. Because $f(x)$ and $S(x)$ both go to 0 as $x \to \infty$, L’Hôpital’s rule yields

$$
\begin{aligned}
\lim_{x\to\infty} h(x) &= \lim_{x\to\infty} \frac{f(x)}{S(x)} = -\lim_{x\to\infty} \frac{f'(x)}{f(x)} = -\lim_{x\to\infty} \left[\frac{d}{dx} \ln f(x)\right] \\
&= -\lim_{x\to\infty} \frac{d}{dx}\left[(\alpha-1)\ln x - \frac{x}{\theta}\right] = \lim_{x\to\infty} \left(\frac{1}{\theta} - \frac{\alpha-1}{x}\right) = \frac{1}{\theta}.
\end{aligned}
$$

That is, $h(x) \to 1/\theta$ as $x \to \infty$. □
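
The integral representation of $1/h(x)$ used in Example 3.11 lends itself to a direct numerical check. The following is a rough sketch under stated assumptions: the helper names `simpson` and `gamma_hazard` are hypothetical, the infinite range is truncated at $50\theta$ (where $e^{-y/\theta}$ is about $2\times 10^{-22}$), and a plain composite Simpson rule stands in for any more careful quadrature.

```python
import math


def simpson(func, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def gamma_hazard(x, alpha, theta):
    """Hazard rate of a gamma(alpha, theta) variable at x > 0.

    Uses 1/h(x) = integral over y >= 0 of f(x + y)/f(x) dy, where
    f(x + y)/f(x) = (1 + y/x)^(alpha - 1) * exp(-y/theta).
    """
    integrand = lambda y: (1 + y / x) ** (alpha - 1) * math.exp(-y / theta)
    # Truncate the infinite upper limit at 50 * theta (an assumption of this sketch).
    return 1.0 / simpson(integrand, 0.0, 50.0 * theta)


if __name__ == "__main__":
    theta = 2.0
    for alpha in (0.5, 1.0, 3.0):
        print(alpha, [round(gamma_hazard(x, alpha, theta), 4) for x in (1, 5, 20, 200)])
```

For $\alpha < 1$ the computed values decrease toward $1/\theta$, for $\alpha > 1$ they increase toward $1/\theta$, and for $\alpha = 1$ they stay at $1/\theta$.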

3.4.4 Classification Based on the Mean Excess Loss Function

The mean excess loss function also gives information about tail weight. If the mean excess loss function is increasing in $d$, the distribution is considered to have a heavy tail. If the mean excess loss function is decreasing in $d$, the distribution is considered to have a light tail. Comparisons between distributions can be made on the basis of whether the mean excess loss function is increasing or decreasing. In particular, a distribution with an increasing mean excess loss function has a heavier tail than a distribution with a decreasing mean excess loss function.

In fact, the mean excess loss function and the hazard rate are closely related in several ways. First, note that

$$
\frac{S(y+d)}{S(d)} = \frac{\exp\left[-\int_0^{y+d} h(x)\,dx\right]}{\exp\left[-\int_0^{d} h(x)\,dx\right]} = \exp\left[-\int_d^{y+d} h(x)\,dx\right] = \exp\left[-\int_0^{y} h(d+t)\,dt\right].
$$

Therefore, if the hazard rate is decreasing, then for fixed $y$ it follows that $\int_0^y h(d+t)\,dt$ is a decreasing function of $d$, and from the preceding, $S(y+d)/S(d)$ is an increasing function of $d$. But from (3.5) the mean excess loss function may be expressed as

$$
e(d) = \frac{\int_d^\infty S(x)\,dx}{S(d)} = \int_0^\infty \frac{S(y+d)}{S(d)}\,dy.
$$

Thus, if the hazard rate is a decreasing function, then the mean excess loss function $e(d)$ is an increasing function of $d$ because the same is true of $S(y+d)/S(d)$ for fixed $y$. Similarly, if the hazard rate is an increasing function, then the mean excess loss function is a decreasing function. It is worth noting (and is perhaps counterintuitive), however, that the converse implication is not true. Exercise 3.29 gives an example of a distribution that has a decreasing mean excess loss function, but the hazard rate is not increasing for all values. Nevertheless, the implications just described are generally consistent with the preceding discussions of heaviness of the tail.
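
As a quick check of these implications, consider the exponential distribution with mean $\theta$ (an illustrative choice, not part of the original argument). Its hazard rate is constant, $h(x) = 1/\theta$, so

$$
\frac{S(y+d)}{S(d)} = \exp\left[-\int_0^{y} \frac{1}{\theta}\,dt\right] = e^{-y/\theta}, \qquad e(d) = \int_0^\infty e^{-y/\theta}\,dy = \theta.
$$

The ratio does not depend on $d$ and the mean excess loss function is level, which agrees with a hazard rate that is both nonincreasing and nondecreasing.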

There is a second relationship between the mean excess loss function and the hazard rate. As $d \to \infty$, $S(d)$ and $\int_d^\infty S(x)\,dx$ go to zero. Thus, the limiting behavior of the mean excess loss function as $d \to \infty$ may be ascertained using L’Hôpital’s rule because formula (3.5) holds. We have

$$
\lim_{d\to\infty} e(d) = \lim_{d\to\infty} \frac{\int_d^\infty S(x)\,dx}{S(d)} = \lim_{d\to\infty} \frac{-S(d)}{-f(d)} = \lim_{d\to\infty} \frac{1}{h(d)}
$$

as long as the indicated limits exist. These limiting relationships may be useful if the form of $F(x)$ is complicated.

EXAMPLE 3.12

Examine the behavior of the mean excess loss function of the gamma distribution.

Because $e(d) = \int_d^\infty S(x)\,dx / S(d)$ and $S(x)$ is complicated, $e(d)$ is complicated. But $e(0) = \mathrm{E}(X) = \alpha\theta$, and, using Example 3.11, we have

$$
\lim_{x\to\infty} e(x) = \lim_{x\to\infty} \frac{1}{h(x)} = \frac{1}{\lim_{x\to\infty} h(x)} = \theta.
$$

Also, from Example 3.11, $h(x)$ is strictly decreasing in $x$ for $\alpha < 1$ and strictly increasing in $x$ for $\alpha > 1$, implying that $e(d)$ is strictly increasing from $e(0) = \alpha\theta$ to $e(\infty) = \theta$ for $\alpha < 1$ and strictly decreasing from $e(0) = \alpha\theta$ to $e(\infty) = \theta$ for $\alpha > 1$. For $\alpha = 1$, we have the exponential distribution for which $e(d) = \theta$. □
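
The behavior described in Example 3.12 can be illustrated numerically. This is only a sketch under stated assumptions: the names `simpson` and `gamma_mean_excess` are hypothetical, the identity $e(d) = \int_0^\infty y\,f(d+y)\,dy \big/ \int_0^\infty f(d+y)\,dy$ (the mean of $X - d$ given $X > d$) is used in place of the survival-function form, and the infinite range is truncated at $50\theta$.

```python
import math


def simpson(func, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def gamma_mean_excess(d, alpha, theta):
    """Mean excess loss e(d) of a gamma(alpha, theta) variable for d > 0.

    Numerator and denominator are both divided by f(d), which leaves the
    weight (1 + y/d)^(alpha - 1) * exp(-y/theta) and cancels the constants.
    """
    weight = lambda y: (1 + y / d) ** (alpha - 1) * math.exp(-y / theta)
    upper = 50.0 * theta  # truncation point, an assumption of this sketch
    numerator = simpson(lambda y: y * weight(y), 0.0, upper)
    denominator = simpson(weight, 0.0, upper)
    return numerator / denominator


if __name__ == "__main__":
    theta = 3.0
    for alpha in (0.5, 1.0, 2.0):
        print(alpha, [round(gamma_mean_excess(d, alpha, theta), 4) for d in (0.5, 2, 10, 100)])
```

The printed values move from near $\alpha\theta$ toward $\theta$, rising when $\alpha < 1$ and falling when $\alpha > 1$.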

3.4.5 Equilibrium Distributions and Tail Behavior

Further insight into the mean excess loss function and the heaviness of the tail may be obtained by introducing the equilibrium distribution (also called the integrated tail distribution). For positive random variables with $S(0) = 1$, it follows from Definition 3.3 and (3.5) with $d = 0$ that $\mathrm{E}(X) = \int_0^\infty S(x)\,dx$ or, equivalently, $1 = \int_0^\infty [S(x)/\mathrm{E}(X)]\,dx$, so that

$$
f_e(x) = \frac{S(x)}{\mathrm{E}(X)}, \quad x \ge 0, \tag{3.11}
$$

is a probability density function. The corresponding survival function is

$$
S_e(x) = \int_x^\infty f_e(t)\,dt = \frac{\int_x^\infty S(t)\,dt}{\mathrm{E}(X)}, \quad x \ge 0.
$$

The hazard rate corresponding to the equilibrium distribution is

$$
h_e(x) = \frac{f_e(x)}{S_e(x)} = \frac{S(x)}{\int_x^\infty S(t)\,dt} = \frac{1}{e(x)}
$$

using (3.5). Thus, the reciprocal of the mean excess function is itself a hazard rate, and this fact may be used to show that the mean excess function uniquely characterizes the original distribution. We have

$$
f_e(x) = h_e(x) S_e(x) = h_e(x)\, e^{-\int_0^x h_e(t)\,dt},
$$

or, equivalently,

$$
S(x) = \frac{e(0)}{e(x)}\, e^{-\int_0^x \left[\frac{1}{e(t)}\right] dt}
$$

using $e(0) = \mathrm{E}(X)$.
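
The inversion formula for $S(x)$ in terms of the mean excess function can be tried out numerically. The sketch below is illustrative only: `survival_from_mean_excess` is a hypothetical helper, it assumes $S(0) = 1$ and a mean excess function that is positive and smooth on $[0, x]$, and it uses a composite Simpson rule for the inner integral. The demonstration uses the constant mean excess function $e(t) = \theta$ of the exponential distribution.

```python
import math


def survival_from_mean_excess(e, x, n=2000):
    """Recover S(x) from a mean excess function e, assuming S(0) = 1.

    Implements S(x) = e(0)/e(x) * exp(-integral from 0 to x of dt / e(t)),
    with the integral approximated by Simpson's rule (n must be even).
    """
    if x == 0:
        return 1.0
    h = x / n
    total = 1.0 / e(0.0) + 1.0 / e(x)
    for i in range(1, n):
        total += (4 if i % 2 else 2) / e(i * h)
    integral = total * h / 3
    return e(0.0) / e(x) * math.exp(-integral)


if __name__ == "__main__":
    theta = 3.0  # illustrative value
    constant_e = lambda t: theta  # exponential case: e(d) = theta for all d
    for x in (0.0, 1.0, 5.0):
        print(x, survival_from_mean_excess(constant_e, x), math.exp(-x / theta))
```

The two printed columns should agree, since a constant mean excess function corresponds to the exponential survival function $e^{-x/\theta}$.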

The equilibrium distribution also provides further insight into the relationship between the hazard rate, the mean excess function, and the heaviness of the tail. Assuming that $S(0) = 1$, and thus $e(0) = \mathrm{E}(X)$, we have $\int_x^\infty S(t)\,dt = e(0) S_e(x)$, and from (3.5), $\int_x^\infty S(t)\,dt = e(x) S(x)$. Equating these two expressions results in

$$
\frac{e(x)}{e(0)} = \frac{S_e(x)}{S(x)}.
$$

If the mean excess function is increasing (which is implied if the hazard rate is decreasing), then $e(x) \ge e(0)$, which is obviously equivalent to $S_e(x) \ge S(x)$ from the preceding equality. This, in turn, implies that

$$
\int_0^\infty S_e(x)\,dx \ge \int_0^\infty S(x)\,dx.
$$

But $\mathrm{E}(X) = \int_0^\infty S(x)\,dx$ from Definition 3.3 and (3.5) if $S(0) = 1$. Also,

$$
\int_0^\infty S_e(x)\,dx = \int_0^\infty x f_e(x)\,dx,
$$

since both sides represent the mean of the equilibrium distribution. This may be evaluated using (3.9) with $u = \infty$, $k = 2$, and $F(0) = 0$ to give the equilibrium mean, that is,

$$
\int_0^\infty S_e(x)\,dx = \int_0^\infty x f_e(x)\,dx = \frac{1}{\mathrm{E}(X)} \int_0^\infty x S(x)\,dx = \frac{\mathrm{E}(X^2)}{2\mathrm{E}(X)}.
$$

The inequality may thus be expressed as

$$
\frac{\mathrm{E}(X^2)}{2\mathrm{E}(X)} \ge \mathrm{E}(X)
$$

or, using $\mathrm{Var}(X) = \mathrm{E}(X^2) - [\mathrm{E}(X)]^2$, as $\mathrm{Var}(X) \ge [\mathrm{E}(X)]^2$. That is, the squared coefficient of variation, and hence the coefficient of variation itself, is at least 1 if $e(x) \ge e(0)$.

Reversing the inequalities implies that the coefficient of variation is at most 1 if $e(x) \le e(0)$, which is in turn implied if the mean excess function is decreasing or the hazard rate is increasing. These values of the coefficient of variation are consistent with the comments made here about the heaviness of the tail.
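
The link between the equilibrium mean and the coefficient of variation can be illustrated with the gamma distribution, whose hazard rate is decreasing for $\alpha < 1$ and increasing for $\alpha > 1$ (Example 3.11). The snippet below is a small sketch with hypothetical function names; it uses the gamma moments $\mathrm{E}(X) = \alpha\theta$ and $\mathrm{E}(X^2) = \alpha(\alpha+1)\theta^2$, which follow from the raw-moment formula in Example 3.9.

```python
def gamma_moments(alpha, theta):
    """First and second raw moments of a gamma(alpha, theta) variable."""
    return alpha * theta, alpha * (alpha + 1) * theta ** 2


def equilibrium_mean(first_moment, second_moment):
    """Mean of the equilibrium distribution, E(X^2) / (2 E(X))."""
    return second_moment / (2.0 * first_moment)


def squared_cv(first_moment, second_moment):
    """Squared coefficient of variation, Var(X) / E(X)^2."""
    return (second_moment - first_moment ** 2) / first_moment ** 2


if __name__ == "__main__":
    theta = 3.0  # illustrative value
    for alpha in (0.5, 1.0, 2.0):
        m1, m2 = gamma_moments(alpha, theta)
        print(alpha, m1, equilibrium_mean(m1, m2), squared_cv(m1, m2))
```

For $\alpha < 1$ the equilibrium mean exceeds $\mathrm{E}(X)$ and the squared coefficient of variation exceeds 1; for $\alpha > 1$ both inequalities reverse, and $\alpha = 1$ gives equality.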

3.4.6 Exercises

3.25 Using the methods in this section (except for the mean excess loss function), compare the tail weight of the Weibull and inverse Weibull distributions.

3.26 Arguments as in Example 3.10 place the lognormal distribution between the gamma and Pareto distributions with regard to heaviness of the tail. To reinforce this conclusion, consider: a gamma distribution with parameters $\alpha = 0.2$, $\theta = 500$; a lognormal distribution with parameters $\mu = 3.709290$, $\sigma = 1.338566$; and a Pareto distribution with parameters $\alpha = 2.5$, $\theta = 150$. First demonstrate that all three distributions have the same mean and variance. Then, numerically demonstrate that there is a value such that the gamma pdf is smaller than the lognormal and Pareto pdfs for all arguments above that value, and that there is another value such that the lognormal pdf is smaller than the Pareto pdf for all arguments above that value.

3.27 For a Pareto distribution with $\alpha > 2$, compare $e(x)$ to $e(0)$ and also determine the coefficient of variation. Confirm that these results are consistent with the Pareto distribution being heavy tailed.

3.28 Let $Y$ be a random variable that has the equilibrium density from (3.11). That is, $f_Y(y) = f_e(y) = S_X(y)/\mathrm{E}(X)$ for some random variable $X$. Use integration by parts to show that

$$
M_Y(z) = \frac{M_X(z) - 1}{z\,\mathrm{E}(X)}
$$

whenever $M_X(z)$ exists.

3.29 You are given that the random variable $X$ has probability density function $f(x) = (1 + 2x^2)e^{-2x}$, $x \ge 0$.

(a) Determine the survival function $S(x)$.

(b) Determine the hazard rate $h(x)$.

(c) Determine the survival function $S_e(x)$ of the equilibrium distribution.

(d) Determine the mean excess function $e(x)$.

(e) Determine $\lim_{x\to\infty} h(x)$ and $\lim_{x\to\infty} e(x)$.

(f) Prove that $e(x)$ is strictly decreasing but $h(x)$ is not strictly increasing.

3.30 Assume that $X$ has probability density function $f(x)$, $x \ge 0$.

(a) Prove that

$$
S_e(x) = \frac{\int_x^\infty (y-x) f(y)\,dy}{\mathrm{E}(X)}.
$$

(b) Use (a) to show that

$$
\int_x^\infty y f(y)\,dy = x S(x) + \mathrm{E}(X) S_e(x).
$$

(c) Prove that (b) may be rewritten as

$$
S(x) = \frac{\int_x^\infty y f(y)\,dy}{x + e(x)}
$$

and that this, in turn, implies that

$$
S(x) \le \frac{\mathrm{E}(X)}{x + e(x)}.
$$

(d) Use (c) to prove that, if $e(x) \ge e(0)$, then

$$
S(x) \le \frac{\mathrm{E}(X)}{x + \mathrm{E}(X)}
$$

and thus

$$
S[k\,\mathrm{E}(X)] \le \frac{1}{k+1},
$$

which for $k = 1$ implies that the mean is at least as large as the (smallest) median.

(e) Prove that (b) may be rewritten as

$$
S_e(x) = \frac{e(x)}{x + e(x)} \cdot \frac{\int_x^\infty y f(y)\,dy}{\mathrm{E}(X)}
$$

and thus that

$$
S_e(x) \le \frac{e(x)}{x + e(x)}.
$$

3.5 Measures of Risk
