CCV.bar <- mean(CCV)
Var.CCV <- sum((CCV-CCV.bar)^2)/(N2-1)
SE <- sqrt(Var.CCV)
CI <- CCV.bar - 1.96*SE/sqrt(N2)
CI[2] <- CCV.bar + 1.96*SE/sqrt(N2)
CCV.bar
CI
For N1 = 500 and N2 = 50,000, we have a 95% confidence interval for CCV of [1.0023, 1.0247]. In this case, the estimated call price is 1.0135 with standard error 0.0057.
In using control variates, there are a number of features that should be kept in mind.
• What should constitute the appropriate control? We have seen that in simple cases, the underlying asset prices may be appropriate. In more complicated situations, we may use easily computed quantities that are highly correlated with the object of interest as control variates. For example, standard calls and puts frequently provide a convenient source of control variates for pricing exotic options, as does the underlying asset itself.

• The control variate estimator is usually unbiased by construction. Also, we can separate the estimation of the coefficients from the estimation of prices.

• The flexibility of choosing the coefficients suggests that we can sometimes make optimal use of information. In any event, we should exploit the specific features of the problem under consideration rather than apply routine methods generically.

• Because of its close relationship with linear regression, the control variate estimator is easily computed and explained (see the sketch following this list).

• We have only covered linear controls. In practice, one can consider using nonlinear control variates, for example $\bar{X}\bar{Y}/\mu_Y$. Statistical inference for nonlinear controls may be tricky, though.
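Because of the regression connection noted in the list above, the control coefficient can be estimated by a least-squares fit on a pilot sample and then applied to a fresh sample. The following R-style sketch illustrates this two-stage scheme; the objects Y (simulated discounted payoffs), X (control observations), and mu.X (the known mean of the control) are hypothetical placeholders rather than quantities defined in the text:

# Hypothetical sketch: Y = simulated discounted payoffs, X = control values,
# mu.X = known expectation of the control variate.
pilot <- 1:500                                  # pilot sample (N1 = 500, as in the text)
b.hat <- cov(Y[pilot], X[pilot])/var(X[pilot])  # least-squares estimate of the coefficient
CV <- Y[-pilot] - b.hat*(X[-pilot] - mu.X)      # adjusted observations on the fresh sample
mean(CV)                                        # control variate estimate of the price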
6.5 IMPORTANCE SAMPLING
This section introduces the importance sampling technique; the chapter then concludes with examples illustrating the different methods of variance reduction in risk management.
Suppose that we are interested in estimating
$$\theta = E[h(X)] = \int h(x) f(x)\,dx,$$
where $X = (X_1, \ldots, X_n)$ denotes an $n$-dimensional random vector having joint p.d.f. $f(x) = f(x_1, \ldots, x_n)$. Suppose that a direct simulation of the random vector $X$ is inefficient, so that computing $h(X)$ is infeasible. This inefficiency may be due to difficulties encountered in simulating $X$, or to the variance of $h(X)$ being too large, or to a combination of both. Suppose there exists another density $g(x)$ that is easy to simulate and satisfies the condition that $f(x) = 0$ whenever $g(x) = 0$. Then $\theta$ can be estimated by
$$\theta = E_f[h(X)] = \int h(x)\,\frac{f(x)}{g(x)}\,g(x)\,dx = E_g\left[\frac{h(X) f(X)}{g(X)}\right],$$
where the notation $E_g$ denotes the expectation of the random vector $X$ taken under the density $g$, i.e., $X$ has joint p.d.f. $g(x)$. It follows from this identity that $\theta$ can be estimated by generating $X$ with density $g$ and then using as the estimator the average of the values of $h(X) f(X)/g(X)$. In other words, we could construct a Monte Carlo estimator of $\theta = E(h(X))$ by first generating i.i.d. random vectors $X_i$ with p.d.f. $g$, then using the estimator
$$\hat{\theta} = \frac{1}{n} \sum_{i=1}^{n} \frac{h(X_i) f(X_i)}{g(X_i)}.$$
If a density $g(x)$ can be chosen so that the random variable $h(X) f(X)/g(X)$ has a small variance, then this approach, known as importance sampling, can result in an efficient estimator of $\theta$.
To see how it works, note that the ratio $f(X)/g(X)$ represents the likelihood ratio of obtaining $X$ under the respective densities $f$ and $g$. If $X$ is distributed according to $g$, then $f(X)$ would be small relative to $g(X)$, and therefore, when $X$ is simulated according to $g$, the likelihood ratio $f(X)/g(X)$ will usually be small in comparison to 1. On the other hand, it can be seen that
$$E_g\left[\frac{f(X)}{g(X)}\right] = \int \frac{f(x)}{g(x)}\,g(x)\,dx = \int f(x)\,dx = 1.$$
Thus, even though the likelihood ratio $f(X)/g(X)$ is usually smaller than 1, its mean is equal to 1, suggesting that it occasionally takes large values and results in a large variance.
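A small simulation makes this point concrete. Here the densities are our own choices for illustration, f = N(0,1) and g = N(2,1): under g the likelihood ratio averages to 1 but occasionally spikes.

x <- rnorm(100000, mean = 2)         # sample from g = N(2,1)
w <- dnorm(x)/dnorm(x, mean = 2)     # likelihood ratio f(X)/g(X) with f = N(0,1)
mean(w)                              # approximately 1
max(w)                               # occasionally very large, which inflates the variance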
To make the variance of $h(X) f(X)/g(X)$ small, we arrange for a density $g$ such that those values of $X$ for which $f(X)/g(X)$ is large are precisely the values for which $h(X)$ is small, thus making the ratio $h(X) f(X)/g(X)$ stay small. Since importance sampling requires $h$ to be small sometimes, it works best when estimating a small probability. Further discussion of importance sampling and likelihood methods is given in Glasserman (2003).

Example 6.10 Consider the problem of estimating $\theta = E(U^5)$, where $U$ is uniform on $(0,1)$. Suppose that we use the standard method, $\hat{\theta} = \frac{1}{n}\sum_{i=1}^n U_i^5$; then we oversample the data near the origin and undersample the data near 1. It is easy to compute that
$$\mathrm{Var}(\hat{\theta}) = \frac{1}{n}\left\{E(U_1^{10}) - (E(U_1^5))^2\right\} = \frac{1}{n}\left(\frac{1}{11} - \frac{1}{36}\right) = \frac{0.0631}{n}.$$
Now, suppose we use importance sampling, putting more weight near 1. Let $g(x) = 5x^4$ for $0 < x < 1$. Then
$$\theta = E_g\left(\frac{X^5}{5X^4}\right) = \frac{E_g(X)}{5} = \frac{1}{6}.$$
The variance of this method is
$$\mathrm{Var}(\hat{\theta}_I) = \frac{1}{25n}\left\{E_g(X^2) - (E_g X)^2\right\} = \frac{1}{25n}\left\{\int_0^1 x^2 (5x^4)\,dx - \left(\int_0^1 x (5x^4)\,dx\right)^2\right\} = \frac{1}{25n}\left(\frac{5}{7} - \frac{25}{36}\right) = \frac{0.000794}{n},$$
resulting in a variance reduction of 98.74%. □
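Example 6.10 is easy to verify by simulation. Since the c.d.f. of g(x) = 5x^4 is x^5, a draw from g is obtained by the inverse transform X = U^{1/5}; the sketch below compares the two estimators.

n <- 100000
u <- runif(n)
theta.std <- mean(u^5)               # standard estimator of E(U^5)
x <- runif(n)^(1/5)                  # inverse transform: X has density 5x^4 on (0,1)
theta.is <- mean(x^5/(5*x^4))        # importance sampling estimator, i.e., mean(x/5)
theta.std
theta.is                             # both close to 1/6, the latter with much smaller variance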
How do we choose $g$ in general? This requires the notion of the so-called tilted density. Recall that $M(t) = E(e^{tX})$ represents the moment generating function (m.g.f.) of the random variable $X$ with density $f$.

Definition 6.2 A density function
$$f_t(x) = \frac{e^{tx} f(x)}{M(t)}$$
is called a tilted density of a given $f$, $-\infty < t < \infty$.

Note that from this definition, a random variable with density $f_t$ tends to be larger than one with density $f$ when $t > 0$, and tends to be smaller when $t < 0$.
Example 6.11 Let $f$ be a Bernoulli density with parameter $p$. Then $f(x) = p^x (1-p)^{1-x}$, $x = 0, 1$. In this case, the m.g.f. is $M(t) = E(e^{tX}) = p e^t + (1 - p)$, so that
$$f_t(x) = \frac{e^{tx} f(x)}{M(t)} = \frac{(p e^t)^x (1-p)^{1-x}}{p e^t + 1 - p}.$$
Thus, the tilted density $f_t$ is a Bernoulli density with parameter $p_t = p e^t/(p e^t + 1 - p)$. □
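The tilted Bernoulli density is simple enough to code directly; a minimal R sketch (the function names are ours):

p.tilt <- function(p, t) p*exp(t)/(p*exp(t) + 1 - p)        # parameter p_t of the tilted density f_t
rbern.tilt <- function(n, p, t) rbinom(n, 1, p.tilt(p, t))  # n draws from f_t
p.tilt(0.4, log(6))                                         # = 0.8, the value used in Example 6.12 below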
In many instances, we are interested in sums of independent random variables. In these cases, the joint density $f(x)$ of $x = (x_1, \ldots, x_n)$ can be written as the product of the marginals $f_i$ of the $x_i$, so that $f(x) = f_1(x_1) \cdots f_n(x_n)$. In this situation, it is often useful to generate the $X_i$ according to their tilted densities with a common $t$.
Example 6.12 Let $X_1, \ldots, X_n$ be independent with marginal densities $f_i$. Suppose we are interested in estimating the quantity
$$\theta = P(S \ge a),$$
where $S = \sum_{i=1}^n X_i$ and $a > \sum_{i=1}^n E(X_i)$ is a given constant. We can apply tilted densities to estimate $\theta$. Let $I\{S \ge a\}$ equal 1 if $S \ge a$ and 0 otherwise. Then
$$\theta = E(I\{S \ge a\}),$$
where the expectation is taken with respect to the joint density $f$. Suppose we simulate the $X_i$ according to the tilted density functions $f_{t,i}$, where the value of $t > 0$ is to be specified. To construct the importance sampling estimator, note that $h(X) = I\{S \ge a\}$, $f(X) = \prod_i f_i(X_i)$, and $g(X) = \prod_i f_{t,i}(X_i)$. The importance sampling estimator would be
$$\hat{\theta} = I\{S \ge a\} \prod_i \frac{f_i(X_i)}{f_{t,i}(X_i)}.$$
Now $f_i(X_i)/f_{t,i}(X_i) = M_i(t)\, e^{-t X_i}$; therefore,
$$\hat{\theta} = I\{S \ge a\}\, M(t)\, e^{-tS}, \qquad \text{where } M(t) = \prod_i M_i(t).$$
Since it is assumed that $t > 0$, $S \ge a$ if and only if $e^{-tS} \le e^{-ta}$, and hence $I\{S \ge a\}\, e^{-tS} \le e^{-ta}$, so that
$$\hat{\theta} \le M(t)\, e^{-ta}.$$
We now find $t > 0$ such that the right-hand side of the above inequality is minimized. In that case, we obtain an estimator that lies between 0 and $\min_t M(t) e^{-ta}$. It can be shown that such a $t$ can be found by solving the equation
$$E_t(S) = a,$$
where $E_t$ denotes expectation under the tilted densities.
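When E_t(S) = a admits no closed-form solution, the root can be found numerically. Below is a sketch using R's uniroot, applied to the i.i.d. Bernoulli setup of the example that follows (p = 0.4, n = 20, a = 16) so the answer can be checked against the analytic solution:

p <- 0.4; n <- 20; a <- 16
Et.S <- function(t) n*p*exp(t)/(p*exp(t) + 1 - p)        # E_t(S) for i.i.d. Bernoulli(p) trials
t.star <- uniroot(function(t) Et.S(t) - a, c(0, 10))$root
exp(t.star)                                              # approximately 6, matching the example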
After solving for $t$, it can be utilized in the simulation. To be specific, suppose $X_1, \ldots, X_n$ are i.i.d. Bernoulli trials with $p = p_i = 0.4$. Let $n = 20$ and $a = 16$. Then
$$M(t) = (p e^t + 1 - p)^{20},$$
so that
$$\hat{\theta} = I\{S \ge 16\}\, (p e^t + 1 - p)^{20}\, e^{-tS}.$$
Recall from the preceding example that the tilted density $f_{t,i}$ is the p.d.f. of a Bernoulli trial with parameter $p^* = p e^t/(p e^t + 1 - p)$. It follows that
$$E_t(S) = \sum_{i=1}^{20} p^* = 20 p^* = \frac{20\, p e^t}{p e^t + 1 - p}.$$
Plugging in $n = 20$, $p = 0.4$, and $a = 16$, we have
$$\frac{20 \times 0.4 e^t}{0.4 e^t + 0.6} = 16,$$
which leads to $e^{t^*} = 6$. Therefore, we should generate Bernoulli trials with parameter $0.4 e^{t^*}/(0.4 e^{t^*} + 0.6) = 0.8$ as the $g$, and evaluate $M(t^*) = (0.4 e^{t^*} + 0.6)^{20} = 3^{20}$ and $e^{-t^* S} = (1/6)^S$. The importance sampling estimator is now
$$\hat{\theta} = I\{S \ge 16\}\, M(t^*)\, e^{-t^* S} = I\{S \ge 16\}\, 3^{20} (1/6)^S.$$
Furthermore, we know that
$$\hat{\theta} \le M(t^*)\, e^{-t^* a} = 3^{20} (1/6)^{16} = 0.001236.$$
Thus, in each iteration, the value of the importance sampling estimator lies between 0 and 0.001236.
On the other hand, we can also evaluate $\theta = P(S \ge 16)$ exactly: it equals the probability that a Binomial random variable with parameters 20 and 0.4 is at least 16, which turns out to be 0.000317. Recall that $h(X) = I\{S \ge 16\}$ is a Bernoulli trial with parameter $\theta = 0.000317$. Therefore, if we simulate directly from the $X_i$, the standard estimator $\hat{\theta}_S$ has variance
$$\mathrm{Var}(\hat{\theta}_S) = \theta(1 - \theta) = 3.169 \times 10^{-4}.$$
As $0 \le \hat{\theta} \le 0.001236$, it can be shown that
$$\mathrm{Var}(\hat{\theta}) \le (0.001236)^2/4 = 3.819 \times 10^{-7},$$
which is much smaller than the variance of the standard estimator $\hat{\theta}_S$. □
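The whole calculation can be checked by simulating the tilted estimator I{S >= 16} 3^20 (1/6)^S directly; a minimal R sketch:

N <- 100000
theta.hat <- numeric(N)
for (i in 1:N) {
  S <- sum(rbinom(20, 1, 0.8))         # 20 Bernoulli trials under the tilted density (p* = 0.8)
  theta.hat[i] <- (S >= 16) * 3^20 * (1/6)^S
}
mean(theta.hat)                         # close to the exact value 0.000317
var(theta.hat)                          # below the bound 3.819e-7, far below the standard 3.169e-4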
Another application of importance sampling is to estimate tail probabilities (recall that, as mentioned at the beginning, importance sampling works best for small probabilities). Suppose we are interested in estimating $P(X > a)$, where $X$ has p.d.f. $f$ and $a$ is a given constant. Let $I(X > a) = 1$ if $X > a$ and 0 otherwise. Then
$$P(X > a) = E_f(I(X > a)).$$
Take $g(x) = \lambda e^{-\lambda x}$, $x > 0$, an exponential density with parameter $\lambda$. Then the above derivation shows
$$P(X > a) = E_g\left[e^{\lambda X} f(X) \,\middle|\, X > a\right] e^{-\lambda a}/\lambda.$$
Using the so-called "memoryless property" of the exponential distribution, i.e., $P(X > s + t \mid X > s) = P(X > t)$, it can easily be seen that the distribution of an exponential random variable conditioned on $\{X > a\}$ is the same as the distribution of $a + X$. Therefore,
$$P(X > a) = \frac{e^{-\lambda a}}{\lambda} E_g\left[e^{\lambda(X + a)} f(X + a)\right] = \frac{1}{\lambda} E_g\left[e^{\lambda X} f(X + a)\right].$$
We can now estimate $\theta$ by generating $X_1, \ldots, X_n$ according to an exponential distribution with parameter $\lambda$ and using
$$\hat{\theta} = \frac{1}{\lambda n} \sum_{i=1}^{n} e^{\lambda X_i} f(X_i + a).$$
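In R, this tail-probability estimator takes only a few lines. The wrapper below is our own sketch, with the density f supplied as a function:

est.tail <- function(n, a, lambda, f) {
  x <- rexp(n, rate = lambda)                    # X_1, ..., X_n from the exponential density g
  mean(exp(lambda*x) * f(x + a)) / lambda        # (1/(lambda*n)) * sum of exp(lambda*X_i) f(X_i + a)
}
est.tail(100000, 3, 3, dnorm)                    # the standard normal case of Example 6.13: about 0.00135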
Example 6.13 Suppose we are interested in $\theta = P(X > a)$, where $X$ is standard normal. Then $f$ is the standard normal density. Let $g$ be an exponential density with $\lambda = a$. Then
$$P(X > a) = \frac{1}{a} E_g\left[e^{aX} f(X + a)\right].$$
We can therefore estimate $\theta$ by generating $X_1, \ldots, X_n$ from an exponential distribution with rate $a$ and then using
$$\hat{\theta} = \frac{1}{a n} \sum_{i=1}^{n} e^{a X_i} f(X_i + a)$$
to estimate $\theta$. To compute the variance of $\hat{\theta}$, we need to compute the quantities $E_g[e^{-X^2/2}]$ and $E_g[e^{-X^2}]$. These can be computed numerically and can be shown to be
$$E_g[e^{-X^2/2}] = a e^{a^2/2} \sqrt{2\pi}\,(1 - \Phi(a)), \qquad E_g[e^{-X^2}] = a e^{a^2/4} \sqrt{\pi}\,(1 - \Phi(a/\sqrt{2})).$$
For example, if $a = 3$ and $n = 1$, then $\mathrm{Var}(e^{-X^2/2}) = 0.0201$ and
$$\mathrm{Var}(\hat{\theta}) = \left(\frac{e^{-a^2/2}}{a\sqrt{2\pi}}\right)^2 \times 0.0201 \approx 4.38 \times 10^{-8}.$$
On the other hand, a standard estimator has variance $\theta(1 - \theta) = 0.00134$. □

Consider simulating a vanilla European call option price again, using the importance sampling technique. Suppose that we evaluate the value of a deep out-of-the-money ($S_0 \ll K$) European call option with a short maturity $T$. Many sample paths result in $S_T \le K$ and give zero values, so these samples are wasted. One possible way to deal with this problem is to increase the values of the $Z_i$ by sampling them from a distribution with a larger mean and a larger variance. Sample $Z_i$ from $N(m, s^2)$, so that $\sigma\sqrt{T} Z_i \sim N(\sigma\sqrt{T} m, \sigma^2 T s^2)$.
Note that $Z_i$ can be written as
$$Z_i = m + s \tilde{Z}_i, \qquad \tilde{Z}_i \sim N(0, 1).$$
The importance sampling estimator is then given by
$$C_I = e^{-rT} \frac{1}{N} \sum_{i=1}^{N} \max\{S_0 e^{(r - \sigma^2/2)T + \sigma\sqrt{T} Z_i} - K, 0\}\, R(Z_i),$$
where $R(z)$ is the likelihood ratio of the $N(0,1)$ density $f$ with respect to the $N(m, s^2)$ density $g$,
$$R(z) = \frac{f(z)}{g(z)} = s \exp\left\{-\frac{z^2}{2} + \frac{(z - m)^2}{2 s^2}\right\}.$$
Thus, $C_I$ can be expressed as
$$C_I = e^{-rT} \frac{1}{N} \sum_{i=1}^{N} \max\{S_0 e^{(r - \sigma^2/2)T + \sigma\sqrt{T}(m + s\tilde{Z}_i)} - K, 0\}\, s\, e^{\tilde{Z}_i^2/2 - (m + s\tilde{Z}_i)^2/2}.$$
Example 6.14 Let $S_0 = 100$, $K = 140$, $r = 0.05$, $\sigma = 0.3$, and $T = 1$. We simulate the value of this deep out-of-the-money European call option using the importance sampling technique and compare it with the result of the standard method.
The S-PLUS code is as follows:
######## Part (1): Standard method ########
N <- 10000
S0 <- 100
K <- 140
t <- 1
r <- 0.05
sigma <- 0.3
nu <- r - sigma^2/2
Ci <- rep(0,N)
for (i in 1:N) {
  z <- rnorm(1)
  ST <- S0*exp(nu*t+sigma*sqrt(t)*z)
  Ci[i] <- exp(-r*t)*max(ST-K,0)
}
C.bar <- mean(Ci)
SE <- sqrt(var(Ci)/(N-1))
C.bar
SE
######## Part (2): Importance sampling ########
N <- 10000
S0 <- 100
K <- 140
t <- 1
r <- 0.05
sigma <- 0.3
nu <- r - sigma^2/2
m <- 0.5
s <- 1.1
Ci <- rep(0,N)
for (i in 1:N) {
  z1 <- rnorm(1)
  z2 <- m + s*z1                        # Z ~ N(m, s^2)
  ST <- S0*exp(nu*t+sigma*sqrt(t)*z2)
  R <- s*exp(z1^2/2 - z2^2/2)           # likelihood ratio R(Z) = f(Z)/g(Z)
  Ci[i] <- exp(-r*t)*max(ST-K,0)*R
}
CI.bar <- mean(Ci)
SE <- sqrt(var(Ci)/(N-1))
CI.bar
SE