BASIC DISTRIBUTIONAL QUANTITIES

3.4 Tails of Distributions

The tail of a distribution (more properly, the right tail) is the portion of the distribution corresponding to large values of the random variable. Understanding large possible loss values is important because these have the greatest effect on total losses. Random variables that tend to assign higher probabilities to larger values are said to be heavier tailed. Tail weight can be a relative concept (model A has a heavier tail than model B) or an absolute concept (distributions with a certain property are classified as heavy tailed). When choosing models, tail weight can help narrow the choices or can confirm a choice for a model.

3.4.1 Classification Based on Moments

Recall that in the continuous case, the $k$th raw moment for a random variable that takes on only positive values (like most insurance payment variables) is given by $\int_0^\infty x^k f(x)\,dx$. Depending on the density function and the value of $k$, this integral may not exist (i.e. it may be infinite). One way of classifying distributions is on the basis of whether all moments exist. It is generally agreed that the existence of all positive moments indicates a (relatively) light right tail, while the existence of only positive moments up to a certain value (or existence of no positive moments at all) indicates a heavy right tail.

EXAMPLE 3.9

Demonstrate that for the gamma distribution all positive moments exist but for the Pareto distribution they do not.

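For the gamma distribution, the raw moments are

$$
\begin{aligned}
\mu_k' &= \int_0^\infty x^k \, \frac{x^{\alpha-1} e^{-x/\theta}}{\Gamma(\alpha)\theta^{\alpha}} \, dx \\
&= \int_0^\infty (y\theta)^k \, \frac{(y\theta)^{\alpha-1} e^{-y}}{\Gamma(\alpha)\theta^{\alpha}} \, \theta \, dy, \quad \text{making the substitution } y = x/\theta \\
&= \frac{\theta^k}{\Gamma(\alpha)} \, \Gamma(\alpha+k) < \infty \quad \text{for all } k > 0.
\end{aligned}
$$

For the Pareto distribution, they are

$$
\begin{aligned}
\mu_k' &= \int_0^\infty x^k \, \frac{\alpha\theta^{\alpha}}{(x+\theta)^{\alpha+1}} \, dx \\
&= \int_\theta^\infty (y-\theta)^k \, \frac{\alpha\theta^{\alpha}}{y^{\alpha+1}} \, dy, \quad \text{making the substitution } y = x+\theta \\
&= \alpha\theta^{\alpha} \int_\theta^\infty \sum_{j=0}^{k} \binom{k}{j} y^{j-\alpha-1} (-\theta)^{k-j} \, dy, \quad \text{for integer values of } k.
\end{aligned}
$$

The integral exists only if all of the exponents on $y$ in the sum are less than $-1$, that is, if $j-\alpha-1 < -1$ for all $j$ or, equivalently, if $k < \alpha$. Therefore, only some moments exist. □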
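The contrast in Example 3.9 can be tabulated with a short sketch. The function names are illustrative only. The Pareto closed form for integer $k < \alpha$, $\theta^k k! / [(\alpha-1)\cdots(\alpha-k)]$, is the standard moment formula and is assumed here rather than derived in the example. The parameter values are arbitrary.

```python
import math

def gamma_raw_moment(k, alpha, theta):
    """k-th raw moment of a gamma(alpha, theta): theta^k * Gamma(alpha + k) / Gamma(alpha)."""
    return theta ** k * math.gamma(alpha + k) / math.gamma(alpha)

def pareto_raw_moment(k, alpha, theta):
    """k-th raw moment of a Pareto(alpha, theta) for a positive integer k.

    Returns math.inf when k >= alpha, where the defining integral diverges.
    """
    if k >= alpha:
        return math.inf
    # Assumed standard closed form for integer k < alpha:
    # theta^k * k! / ((alpha - 1)(alpha - 2)...(alpha - k))
    denom = 1.0
    for j in range(1, k + 1):
        denom *= alpha - j
    return theta ** k * math.factorial(k) / denom

# Gamma moments stay finite for every k; Pareto moments stop existing at k = alpha.
for k in range(1, 6):
    print(k, gamma_raw_moment(k, 2.0, 10.0), pareto_raw_moment(k, 3.0, 10.0))
```

With $\alpha = 3$ the Pareto column is finite only for $k = 1$ and $k = 2$, which matches the condition $k < \alpha$.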

By this classification, the Pareto distribution is said to have a heavy tail and the gamma distribution is said to have a light tail. A look at the moment formulas in Appendix A reveals which distributions have heavy tails and which do not, as indicated by the existence of moments.

It is instructive to note that if a distribution does not have all its positive moments, then it does not have a moment generating function (i.e. if $X$ is the associated random variable, then $\mathrm{E}(e^{zX}) = \infty$ for all $z > 0$). However, the converse is not true. The lognormal distribution has no moment generating function even though all its positive moments are finite.

Further comparisons of tail behavior can be made on the basis of ratios of moments (assuming they exist). In particular, heavy-tailed behavior is typically associated with large values of quantities such as the coefficient of variation, the skewness, and the kurtosis (see Definition 3.2).

3.4.2 Comparison Based on Limiting Tail Behavior

A commonly used indication that one distribution has a heavier tail than another distribution with the same mean is that the ratio of the two survival functions should diverge to infinity (with the heavier-tailed distribution in the numerator) as the argument becomes large. The divergence implies that the numerator distribution puts significantly more probability on large values. Note that it is equivalent to examine the ratio of density functions. The limit of the ratio will be the same, as can be seen by an application of L'Hôpital's rule:

$$
\lim_{x\to\infty} \frac{S_1(x)}{S_2(x)} = \lim_{x\to\infty} \frac{S_1'(x)}{S_2'(x)} = \lim_{x\to\infty} \frac{-f_1(x)}{-f_2(x)} = \lim_{x\to\infty} \frac{f_1(x)}{f_2(x)}.
$$

Figure 3.7 The tails of the gamma and Pareto distributions. [Plot of the density f(x) for x from 50 to 150, with one curve for the Pareto and one for the gamma.]

EXAMPLE 3.10

Demonstrate that the Pareto distribution has a heavier tail than the gamma distribution using the limit of the ratio of their density functions.

To avoid confusion, the letters $\tau$ and $\lambda$ will be used for the parameters of the gamma distribution instead of the customary $\alpha$ and $\theta$. Then, the required limit is

$$
\begin{aligned}
\lim_{x\to\infty} \frac{f_{\text{Pareto}}(x)}{f_{\text{gamma}}(x)} &= \lim_{x\to\infty} \frac{\alpha\theta^{\alpha}(x+\theta)^{-\alpha-1}}{x^{\tau-1} e^{-x/\lambda} \lambda^{-\tau} \Gamma(\tau)^{-1}} \\
&= c \lim_{x\to\infty} \frac{e^{x/\lambda}}{(x+\theta)^{\alpha+1} x^{\tau-1}} \\
&> c \lim_{x\to\infty} \frac{e^{x/\lambda}}{(x+\theta)^{\alpha+\tau}}
\end{aligned}
$$

and, either by application of L'Hôpital's rule or by remembering that exponentials go to infinity faster than polynomials, the limit is infinity. Figure 3.7 shows a portion of the density functions for a Pareto distribution with parameters $\alpha = 3$ and $\theta = 10$ and a gamma distribution with parameters $\alpha = 1/3$ and $\theta = 15$. Both distributions have a mean of 5 and a variance of 75. The graph is consistent with the algebraic derivation. □
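The limit in Example 3.10 can also be observed numerically. The following is a minimal sketch using the two parameter sets quoted for Figure 3.7 (Pareto with $\alpha = 3$, $\theta = 10$; gamma with $\alpha = 1/3$, $\theta = 15$). The function names and the evaluation points are illustrative choices.

```python
import math

def pareto_pdf(x, alpha, theta):
    """Pareto density: alpha * theta^alpha / (x + theta)^(alpha + 1)."""
    return alpha * theta ** alpha / (x + theta) ** (alpha + 1)

def gamma_pdf(x, alpha, theta):
    """Gamma density: x^(alpha-1) * exp(-x/theta) / (Gamma(alpha) * theta^alpha)."""
    return x ** (alpha - 1) * math.exp(-x / theta) / (math.gamma(alpha) * theta ** alpha)

def density_ratio(x):
    """Ratio of the Pareto density to the gamma density at x (Figure 3.7 parameters)."""
    return pareto_pdf(x, 3.0, 10.0) / gamma_pdf(x, 1.0 / 3.0, 15.0)

# The ratio keeps growing as x moves into the right tail.
for x in (50, 100, 150, 300, 600):
    print(x, density_ratio(x))
```

The ratio is below 1 near $x = 50$, where the gamma density is still the larger of the two, and then grows without bound, in line with the crossing visible in the figure.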

3.4.3 Classification Based on the Hazard Rate Function

The hazard rate function also reveals information about the tail of the distribution. Distributions with decreasing hazard rate functions have heavy tails. Distributions with increasing hazard rate functions have light tails. In the ensuing discussion, we understand "decreasing" to mean "nonincreasing" and "increasing" to mean "nondecreasing." That is, a decreasing function can be level at times. The exponential distribution, which has a constant hazard rate, is therefore said to have both a decreasing and an increasing hazard rate. For distributions with monotone hazard rates, distributions with exponential tails divide the distributions into heavy-tailed and light-tailed distributions.

Comparisons between distributions can be made on the basis of the rate of increase or decrease of the hazard rate function. For example, a distribution has a lighter tail than another if its hazard rate function is increasing at a faster rate. Often, these comparisons and classifications are of interest primarily in the right tail of the distribution, that is, for large functional values.

EXAMPLE 3.11

Compare the tails of the Pareto and gamma distributions by looking at their hazard rate functions.

The hazard rate function for the Pareto distribution is

$$
h(x) = \frac{f(x)}{S(x)} = \frac{\alpha\theta^{\alpha}(x+\theta)^{-\alpha-1}}{\theta^{\alpha}(x+\theta)^{-\alpha}} = \frac{\alpha}{x+\theta},
$$

which is decreasing. For the gamma distribution we need to be a bit more clever because there is no closed-form expression for $S(x)$. Observe that

$$
\frac{1}{h(x)} = \frac{\int_x^\infty f(t)\,dt}{f(x)} = \frac{\int_0^\infty f(x+y)\,dy}{f(x)},
$$

and so, if $f(x+y)/f(x)$ is an increasing function of $x$ for any fixed $y$, then $1/h(x)$ will be increasing in $x$ and thus the random variable will have a decreasing hazard rate.

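Now, for the gamma distribution,

$$
\frac{f(x+y)}{f(x)} = \frac{(x+y)^{\alpha-1} e^{-(x+y)/\theta}}{x^{\alpha-1} e^{-x/\theta}} = \left(1 + \frac{y}{x}\right)^{\alpha-1} e^{-y/\theta},
$$

which is strictly increasing in $x$ provided that $\alpha < 1$ and strictly decreasing in $x$ if $\alpha > 1$. By this measure, some gamma distributions have a heavy tail (those with $\alpha < 1$) and some (with $\alpha > 1$) have a light tail. Note that when $\alpha = 1$, we have the exponential distribution and a constant hazard rate. Also, even though $h(x)$ is complicated in the gamma case, we know what happens for large $x$. Because $f(x)$ and $S(x)$ both go to 0 as $x \to \infty$, L'Hôpital's rule yields

$$
\begin{aligned}
\lim_{x\to\infty} h(x) &= \lim_{x\to\infty} \frac{f(x)}{S(x)} = -\lim_{x\to\infty} \frac{f'(x)}{f(x)} = -\lim_{x\to\infty} \left[\frac{d}{dx} \ln f(x)\right] \\
&= -\lim_{x\to\infty} \frac{d}{dx}\left[(\alpha-1)\ln x - \frac{x}{\theta}\right] = \lim_{x\to\infty} \left(\frac{1}{\theta} - \frac{\alpha-1}{x}\right) = \frac{1}{\theta}.
\end{aligned}
$$

That is, $h(x) \to 1/\theta$ as $x \to \infty$. □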
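The identity for $1/h(x)$ used in Example 3.11 also gives a way to evaluate the gamma hazard rate numerically without the survival function. The sketch below integrates $(1 + y/x)^{\alpha-1} e^{-y/\theta}$ over $y$ with composite Simpson's rule. The truncation point `upper * theta`, the step count, and the function name are assumptions made for illustration, not part of the example.

```python
import math

def gamma_hazard(x, alpha, theta, upper=60.0, steps=20000):
    """Approximate the gamma(alpha, theta) hazard rate h(x) for x > 0.

    Uses 1/h(x) = integral over y >= 0 of (1 + y/x)^(alpha-1) * exp(-y/theta),
    truncated at y = upper * theta and evaluated with composite Simpson's rule.
    """
    b = upper * theta
    n = steps if steps % 2 == 0 else steps + 1  # Simpson needs an even count
    step = b / n

    def g(y):
        return (1.0 + y / x) ** (alpha - 1.0) * math.exp(-y / theta)

    total = g(0.0) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(i * step)
    return 1.0 / (total * step / 3.0)

# alpha < 1: decreasing hazard; alpha > 1: increasing hazard; both approach 1/theta.
for x in (0.5, 1.0, 5.0, 50.0):
    print(x, gamma_hazard(x, 0.5, 1.0), gamma_hazard(x, 2.0, 1.0))
```

For $\alpha = 2$ and $\theta = 1$ the integral can be done by hand, giving $h(x) = x/(1+x)$, which provides a check on the numerical values.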

3.4.4 Classification Based on the Mean Excess Loss Function

The mean excess loss function also gives information about tail weight. If the mean excess loss function is increasing in $d$, the distribution is considered to have a heavy tail. If the mean excess loss function is decreasing in $d$, the distribution is considered to have a light tail. Comparisons between distributions can be made on the basis of whether the mean excess loss function is increasing or decreasing. In particular, a distribution with an increasing mean excess loss function has a heavier tail than a distribution with a decreasing mean excess loss function.

In fact, the mean excess loss function and the hazard rate are closely related in several ways. First, note that

$$
\frac{S(y+d)}{S(d)} = \frac{\exp\left[-\int_0^{y+d} h(x)\,dx\right]}{\exp\left[-\int_0^{d} h(x)\,dx\right]} = \exp\left[-\int_d^{y+d} h(x)\,dx\right] = \exp\left[-\int_0^{y} h(d+t)\,dt\right].
$$

Therefore, if the hazard rate is decreasing, then for fixed $y$ it follows that $\int_0^y h(d+t)\,dt$ is a decreasing function of $d$, and from the preceding, $S(y+d)/S(d)$ is an increasing function of $d$. But from (3.5) the mean excess loss function may be expressed as

$$
e(d) = \frac{\int_d^\infty S(x)\,dx}{S(d)} = \int_0^\infty \frac{S(y+d)}{S(d)}\,dy.
$$

Thus, if the hazard rate is a decreasing function, then the mean excess loss function $e(d)$ is an increasing function of $d$ because the same is true of $S(y+d)/S(d)$ for fixed $y$. Similarly, if the hazard rate is an increasing function, then the mean excess loss function is a decreasing function. It is worth noting (and is perhaps counterintuitive), however, that the converse implication is not true. Exercise 3.29 gives an example of a distribution that has a decreasing mean excess loss function, but the hazard rate is not increasing for all values. Nevertheless, the implications just described are generally consistent with the preceding discussions of heaviness of the tail.
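The chain of implications can be illustrated with the Pareto distribution, whose hazard rate $\alpha/(x+\theta)$ is decreasing. In the sketch below, the closed form $e(d) = (d+\theta)/(\alpha-1)$ is what the integral of $S(y+d)/S(d)$ over $y$ evaluates to for $\alpha > 1$. The function names and the parameter values are illustrative assumptions.

```python
def pareto_survival(x, alpha, theta):
    """Pareto survival function S(x) = (theta / (x + theta))^alpha."""
    return (theta / (x + theta)) ** alpha

def conditional_survival(y, d, alpha, theta):
    """S(y + d) / S(d): chance of exceeding d + y given that d is exceeded."""
    return pareto_survival(y + d, alpha, theta) / pareto_survival(d, alpha, theta)

def pareto_mean_excess(d, alpha, theta):
    """Integral of S(y + d)/S(d) over y >= 0, which is (d + theta)/(alpha - 1)."""
    if alpha <= 1:
        raise ValueError("the mean excess loss is infinite unless alpha > 1")
    return (d + theta) / (alpha - 1.0)

ALPHA, THETA, Y = 3.0, 10.0, 5.0
# Both columns increase with d, as a decreasing hazard rate implies.
for d in (0.0, 10.0, 50.0, 200.0):
    print(d, conditional_survival(Y, d, ALPHA, THETA), pareto_mean_excess(d, ALPHA, THETA))
```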

There is a second relationship between the mean excess loss function and the hazard rate. As $d \to \infty$, $S(d)$ and $\int_d^\infty S(x)\,dx$ go to zero. Thus, the limiting behavior of the mean excess loss function as $d \to \infty$ may be ascertained using L'Hôpital's rule because formula (3.5) holds. We have

$$
\lim_{d\to\infty} e(d) = \lim_{d\to\infty} \frac{\int_d^\infty S(x)\,dx}{S(d)} = \lim_{d\to\infty} \frac{-S(d)}{-f(d)} = \lim_{d\to\infty} \frac{1}{h(d)}
$$

as long as the indicated limits exist. These limiting relationships may be useful if the form of $F(x)$ is complicated.

EXAMPLE 3.12

Examine the behavior of the mean excess loss function of the gamma distribution.

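Because $e(d) = \int_d^\infty S(x)\,dx / S(d)$ and $S(x)$ is complicated, $e(d)$ is complicated. But $e(0) = \mathrm{E}(X) = \alpha\theta$, and, using Example 3.11, we have

$$
\lim_{x\to\infty} e(x) = \lim_{x\to\infty} \frac{1}{h(x)} = \frac{1}{\lim_{x\to\infty} h(x)} = \theta.
$$

Also, from Example 3.11, $h(x)$ is strictly decreasing in $x$ for $\alpha < 1$ and strictly increasing in $x$ for $\alpha > 1$, implying that $e(d)$ is strictly increasing from $e(0) = \alpha\theta$ to $e(\infty) = \theta$ for $\alpha < 1$ and strictly decreasing from $e(0) = \alpha\theta$ to $e(\infty) = \theta$ for $\alpha > 1$. For $\alpha = 1$, we have the exponential distribution for which $e(d) = \theta$. □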
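The endpoints $e(0) = \alpha\theta$ and $e(\infty) = \theta$ can be checked numerically in a case where the gamma survival function happens to be available in closed form. The sketch below assumes $\alpha = 2$, for which $S(x) = (1 + x/\theta)e^{-x/\theta}$, and integrates it with Simpson's rule over a truncated range. The helper names, the truncation width, and the step count are illustrative. The case $\alpha < 1$ would need the incomplete gamma function and is not covered here.

```python
import math

def mean_excess(survival, d, width=80.0, steps=8000):
    """Approximate e(d) = integral_d^inf S(x) dx / S(d).

    The integral is truncated to [d, d + width] and evaluated with composite
    Simpson's rule, so the tail of S must be negligible beyond d + width.
    """
    n = steps if steps % 2 == 0 else steps + 1
    step = width / n
    total = survival(d) + survival(d + width)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * survival(d + i * step)
    return (total * step / 3.0) / survival(d)

def erlang2_survival(x, theta=1.0):
    """Gamma survival function for alpha = 2: (1 + x/theta) * exp(-x/theta)."""
    return (1.0 + x / theta) * math.exp(-x / theta)

def exponential_survival(x, theta=1.0):
    """Gamma survival function for alpha = 1 (the exponential distribution)."""
    return math.exp(-x / theta)

# alpha = 2, theta = 1: e(d) falls from alpha*theta = 2 toward theta = 1.
for d in (0.0, 1.0, 10.0, 30.0):
    print(d, mean_excess(erlang2_survival, d), mean_excess(exponential_survival, d))
```

For $\alpha = 2$ and $\theta = 1$ the exact value is $e(d) = (2+d)/(1+d)$, so the printed values can be compared against it directly.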

3.4.5 Equilibrium Distributions and Tail Behavior

Further insight into the mean excess loss function and the heaviness of the tail may be obtained by introducing the equilibrium distribution (also called the integrated tail distribution). For positive random variables with $S(0) = 1$, it follows from Definition 3.3 and (3.5) with $d = 0$ that $\mathrm{E}(X) = \int_0^\infty S(x)\,dx$ or, equivalently, $1 = \int_0^\infty [S(x)/\mathrm{E}(X)]\,dx$, so that

$$
f_e(x) = \frac{S(x)}{\mathrm{E}(X)}, \quad x \ge 0, \tag{3.11}
$$

is a probability density function. The corresponding survival function is

$$
S_e(x) = \int_x^\infty f_e(t)\,dt = \frac{\int_x^\infty S(t)\,dt}{\mathrm{E}(X)}, \quad x \ge 0.
$$

The hazard rate corresponding to the equilibrium distribution is

$$
h_e(x) = \frac{f_e(x)}{S_e(x)} = \frac{S(x)}{\int_x^\infty S(t)\,dt} = \frac{1}{e(x)}
$$

using (3.5). Thus, the reciprocal of the mean excess function is itself a hazard rate, and this fact may be used to show that the mean excess function uniquely characterizes the original distribution. We have

$$
f_e(x) = h_e(x) S_e(x) = h_e(x) \exp\left[-\int_0^x h_e(t)\,dt\right],
$$

or, equivalently,

$$
S(x) = \frac{e(0)}{e(x)} \exp\left\{-\int_0^x \left[\frac{1}{e(t)}\right] dt\right\}
$$

using $e(0) = \mathrm{E}(X)$.
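The recovery of $S(x)$ from $e(x)$ can be checked numerically. The sketch below applies the formula to the Pareto mean excess function $e(t) = (t+\theta)/(\alpha-1)$ and compares the result with the Pareto survival function $[\theta/(x+\theta)]^{\alpha}$. The function names, the Simpson step count, and the parameter values are assumptions for illustration.

```python
import math

def survival_from_mean_excess(e, x, steps=2000):
    """Recover S(x) = e(0)/e(x) * exp(-integral_0^x dt / e(t)).

    The integral is evaluated with composite Simpson's rule on [0, x].
    """
    if x == 0:
        return 1.0
    n = steps if steps % 2 == 0 else steps + 1
    step = x / n
    total = 1.0 / e(0.0) + 1.0 / e(x)
    for i in range(1, n):
        total += (4 if i % 2 else 2) / e(i * step)
    integral = total * step / 3.0
    return e(0.0) / e(x) * math.exp(-integral)

ALPHA, THETA = 3.0, 10.0

def pareto_mean_excess(t):
    """Mean excess loss of a Pareto(ALPHA, THETA), valid for ALPHA > 1."""
    return (t + THETA) / (ALPHA - 1.0)

for x in (0.0, 5.0, 20.0, 100.0):
    exact = (THETA / (x + THETA)) ** ALPHA
    print(x, survival_from_mean_excess(pareto_mean_excess, x), exact)
```

A constant mean excess function $e(t) = \theta$ fed into the same routine returns $e^{-x/\theta}$, the exponential survival function.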

The equilibrium distribution also provides further insight into the relationship between the hazard rate, the mean excess function, and the heaviness of the tail. Assuming that $S(0) = 1$, and thus $e(0) = \mathrm{E}(X)$, we have $\int_x^\infty S(t)\,dt = e(0) S_e(x)$, and from (3.5), $\int_x^\infty S(t)\,dt = e(x) S(x)$. Equating these two expressions results in

$$
\frac{e(x)}{e(0)} = \frac{S_e(x)}{S(x)}.
$$

If the mean excess function is increasing (which is implied if the hazard rate is decreasing), then $e(x) \ge e(0)$, which is obviously equivalent to $S_e(x) \ge S(x)$ from the preceding equality. This, in turn, implies that

$$
\int_0^\infty S_e(x)\,dx \ge \int_0^\infty S(x)\,dx.
$$

But $\mathrm{E}(X) = \int_0^\infty S(x)\,dx$ from Definition 3.3 and (3.5) if $S(0) = 1$. Also,

$$
\int_0^\infty S_e(x)\,dx = \int_0^\infty x f_e(x)\,dx,
$$

since both sides represent the mean of the equilibrium distribution. This may be evaluated using (3.9) with $u = \infty$, $k = 2$, and $F(0) = 0$ to give the equilibrium mean, that is,

$$
\int_0^\infty S_e(x)\,dx = \int_0^\infty x f_e(x)\,dx = \frac{1}{\mathrm{E}(X)} \int_0^\infty x S(x)\,dx = \frac{\mathrm{E}(X^2)}{2\mathrm{E}(X)}.
$$

The inequality may thus be expressed as

$$
\frac{\mathrm{E}(X^2)}{2\mathrm{E}(X)} \ge \mathrm{E}(X)
$$

or, using $\mathrm{Var}(X) = \mathrm{E}(X^2) - [\mathrm{E}(X)]^2$, as $\mathrm{Var}(X) \ge [\mathrm{E}(X)]^2$. That is, the squared coefficient of variation, and hence the coefficient of variation itself, is at least 1 if $e(x) \ge e(0)$.

Reversing the inequalities implies that the coefficient of variation is at most 1 if $e(x) \le e(0)$, which is in turn implied if the mean excess function is decreasing or the hazard rate is increasing. These values of the coefficient of variation are consistent with the comments made here about the heaviness of the tail.
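These bounds can be seen in the two running examples. The sketch below assumes the standard mean and variance formulas for the gamma distribution (mean $\alpha\theta$, variance $\alpha\theta^2$) and the Pareto distribution (mean $\theta/(\alpha-1)$, variance $\alpha\theta^2/[(\alpha-1)^2(\alpha-2)]$ for $\alpha > 2$), neither of which is restated in this section. The function names are illustrative.

```python
import math

def gamma_cv(alpha):
    """Coefficient of variation of a gamma distribution: 1/sqrt(alpha), free of theta."""
    return 1.0 / math.sqrt(alpha)

def pareto_cv(alpha):
    """Coefficient of variation of a Pareto distribution, defined for alpha > 2."""
    if alpha <= 2:
        raise ValueError("the variance is infinite unless alpha > 2")
    return math.sqrt(alpha / (alpha - 2.0))

def equilibrium_mean(first_moment, second_moment):
    """Mean of the equilibrium distribution: E(X^2) / (2 E(X))."""
    return second_moment / (2.0 * first_moment)

# Gamma: CV > 1 for alpha < 1 (decreasing hazard), CV < 1 for alpha > 1.
for alpha in (0.5, 1.0, 2.0):
    print("gamma", alpha, gamma_cv(alpha))

# Pareto (decreasing hazard): CV exceeds 1 whenever it exists.
for alpha in (2.5, 3.0, 10.0):
    print("Pareto", alpha, pareto_cv(alpha))

# Pareto with alpha = 3, theta = 10: E(X) = 5 and E(X^2) = 100.
print(equilibrium_mean(5.0, 100.0))
```

In the last line the equilibrium mean is 10, which is at least $\mathrm{E}(X) = 5$, as the inequality requires for an increasing mean excess function.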

3.4.6 Exercises

3.25 Using the methods in this section (except for the mean excess loss function), compare the tail weight of the Weibull and inverse Weibull distributions.

3.26 Arguments as in Example 3.10 place the lognormal distribution between the gamma and Pareto distributions with regard to heaviness of the tail. To reinforce this conclusion, consider: a gamma distribution with parameters $\alpha = 0.2$, $\theta = 500$; a lognormal distribution with parameters $\mu = 3.709290$, $\sigma = 1.338566$; and a Pareto distribution with parameters $\alpha = 2.5$, $\theta = 150$. First demonstrate that all three distributions have the same mean and variance. Then, numerically demonstrate that there is a value such that the gamma pdf is smaller than the lognormal and Pareto pdfs for all arguments above that value, and that there is another value such that the lognormal pdf is smaller than the Pareto pdf for all arguments above that value.

3.27 For a Pareto distribution with $\alpha > 2$, compare $e(x)$ to $e(0)$ and also determine the coefficient of variation. Confirm that these results are consistent with the Pareto distribution being heavy tailed.

3.28 Let $Y$ be a random variable that has the equilibrium density from (3.11). That is, $f_Y(y) = f_e(y) = S_X(y)/\mathrm{E}(X)$ for some random variable $X$. Use integration by parts to show that

$$
M_Y(z) = \frac{M_X(z) - 1}{z\,\mathrm{E}(X)}
$$

whenever $M_X(z)$ exists.

3.29 You are given that the random variable $X$ has probability density function $f(x) = (1 + 2x^2)e^{-2x}$, $x \ge 0$.

(a) Determine the survival function $S(x)$.

(b) Determine the hazard rate $h(x)$.

(d) Determine the mean excess function $e(x)$.

(e) Determine $\lim_{x\to\infty} h(x)$ and $\lim_{x\to\infty} e(x)$.

(f) Prove that $e(x)$ is strictly decreasing but $h(x)$ is not strictly increasing.

3.30 Assume that $X$ has probability density function $f(x)$, $x \ge 0$.

(a) Prove that

$$
S_e(x) = \frac{\int_x^\infty (y - x) f(y)\,dy}{\mathrm{E}(X)}.
$$

(b) Use (a) to show that

$$
\int_x^\infty y f(y)\,dy = x S(x) + \mathrm{E}(X) S_e(x).
$$

(c) Prove that (b) may be rewritten as

$$
S(x) = \frac{\int_x^\infty y f(y)\,dy}{x + e(x)}
$$

and that this, in turn, implies that

$$
S(x) \le \frac{\mathrm{E}(X)}{x + e(x)}.
$$

(d) Use (c) to prove that, if $e(x) \ge e(0)$, then

$$
S(x) \le \frac{\mathrm{E}(X)}{x + \mathrm{E}(X)}
$$

and thus

$$
S[k\,\mathrm{E}(X)] \le \frac{1}{k + 1},
$$

which for $k = 1$ implies that the mean is at least as large as the (smallest) median.

(e) Prove that (b) may be rewritten as

$$
S_e(x) = \frac{e(x)}{x + e(x)} \cdot \frac{\int_x^\infty y f(y)\,dy}{\mathrm{E}(X)}
$$

and thus that

$$
S_e(x) \le \frac{e(x)}{x + e(x)}.
$$

3.5 Measures of Risk

In the document: Book LOSS MODELS FROM DATA TO DECISIONS (pages 50-58)