E l e c t ro n ic
J o ur n
a l o
f
P r
o b
a b i l i t y
Vol. 10 (2005), Paper no. 45, pages 1468-1495.
Journal URL
http://www.math.washington.edu/∼ejpecp/
The Steepest Descent Method for Forward-Backward SDEs 1 Jakˇsa Cvitani´c and Jianfeng Zhang
Caltech, M/C 228-77, 1200 E. California Blvd. Pasadena, CA 91125.
Ph: (626) 395-1784. E-mail: [email protected] and
Department of Mathematics, USC
3620 S Vermont Ave, KAP 108, Los Angeles, CA 90089-1113. Ph: (213) 740-9805. E-mail: [email protected].
Abstract. This paper aims to open a door to Monte-Carlo methods for numeri-cally solving Forward-Backward SDEs, without computing over all Cartesian grids as usually done in the literature. We transform the FBSDE to a control problem and propose the steepest descent method to solve the latter one. We show that the original (coupled) FBSDE can be approximated by decoupled FBSDEs, which fur-ther boils down to computing a sequence of conditional expectations. The rate of convergence is obtained, and the key to its proof is a new well-posedness result for FBSDEs. However, the approximating decoupled FBSDEs are non-Markovian. Some Markovian type of modification is needed in order to make the algorithm efficiently implementable.
Keywords: Forward-Backward SDEs, quasilinear PDEs, stochastic control, steepest decent method, Monte-Carlo method, rate of convergence.
2000 Mathematics Subject Classification. Primary: 60H35; Secondary: 60H10, 65C05, 35K55
Submitted to EJP on July 26, 2005. Final version accepted on December 1, 2005.
1
Introduction
Since the seminal work of Pardoux-Peng [19], there have been numerous publica-tions on Backward Stochastic Differential Equapublica-tions (BSDEs) and Forward-Backward SDEs (FBSDEs). We refer the readers to the book Ma-Yong [17] and the reference therein for the details on the subject. In particular, FBSDEs of the following type are studied extensively:
Xt=x+
Z t
0 b(s, Xs, Ys, Zs)ds+ Z t
0 σ(s, Xs, Ys)dWs;
Yt=g(XT) +
Z T
t f(s, Xs, Ys, Zs)ds−
Z T
t ZsdWs;
(1.1)
where W is a standard Brownian Motion, T > 0 is a deterministic terminal time,
and b, σ, f, g are deterministic functions. Here for notational simplicity we assume
all processes are 1-dimensional. It is well known that FBSDE (1.1) is related to the following parabolic PDE on [0, T]×IR (see, e.g., [13], [20], and [7])
(
ut+ 12σ2(t, x, u)uxx+b(t, x, u, σ(t, x, u)ux)ux+f(t, x, u, σ(t, x, u)ux) = 0;
u(T, x) = g(x); (1.2)
in the sense that (if a smooth solutionu exists)
Yt=u(t, Xt), Zt=ux(t, Xt)σ(t, Xt, u(t, Xt)). (1.3)
Due to its importance in applications, numerical methods for BSDEs have re-ceived strong attention in recent years. Bally [1] proposed an algorithm by using a random time discretization. Based on a new notion of L2-regularity, Zhang [21] obtained rate of convergence for deterministic time discretization and transformed the problem to computing a sequence of conditional expectations. In Markovian set-ting, significant progress has been made in computing the conditional expectations. The following methods are of particular interest: the quantization method (see, e.g., Bally-Pag`es-Printems [2]), the Malliavin calculus approach (see Bouchard-Touzi [4]), the linear regression method or the Longstaff-Schwartz algorithm (see Gobet-Lemor-Waxin [10]), and the Picard iteration approach (see Bender-Denk [3]). These methods work well in reasonably high dimensions. There are also lots of publications on nu-merical methods for non-Markovian BSDEs (see, e.g., [5], [6], [12], [15], [24]). But in general these methods do not work when the dimension is high.
numerical schemes for (1.2). Recently Delarue-Menozzi [8] proposed a probabilistic algorithm. Note that all these methods essentially need to discretize the space over regular Cartesian grids, and thus are not practical in high dimensions.
In this paper we aim to open a door to truly Monte-Carlo methods for FBSDEs, without computing over all Cartesian grids. Our main idea is to transform the FBSDE to a stochastic control problem and propose the steepest descent methodto solve the latter one. We show that the original (coupled) FBSDE can be approximated by solving a certain number ofdecoupledFBSDEs. We then discretize the approximating decoupled FBSDEs in time and thus the problem boils down to computing a sequence of conditional expectations. The rate of convergence is obtained.
We note that the idea to approximate with a corresponding stochastic control problem is somewhat similar to the approximating solvability of FBSDEs in Ma-Yong [18] and the near-optimal control in Zhou [25]. However, in those works the original problem may have no exact solution and the authors try to find a so called approximating solution. In our case the exact solution exists and we want to approx-imate it with numerically computable terms. More importantly, in those works one only cares for the existence of the approximating solutions, while here for practical reasons we need explicit construction of the approximations as well as the rate of convergence.
The key to the proof is a new well-posedness result for FBSDEs. In order to obtain the rate of convergence of our approximations, we need the well-posedness of some adjoint FBSDEs, which are linear but with random coefficients. It turns out that all the existing methods in the literature do not work in our case.
At this point we should point out that, unfortunately, our approximating decou-pled FBSDEs are non-Markovian (that is, the coefficients are random), and thus we cannot apply directly the existing methods for Markovian BSDEs. In order to make our algorithm efficiently implementable, some further modification of Markovian type is needed.
Although in the long term we aim to solve high dimensional FBSDEs, as a first attempt and for technical reasons (in order to apply Theorem 1.2 below), in this paper we assume all the processes are one dimensional. We also assume that b = 0 and f is independent of Z. That is, we will study the following FBSDE:
Xt =x+
Z t
0 σ(s, Xs, Ys)dWs;
Yt=g(XT) +
Z T
t f(s, Xs, Ys)ds−
Z T
t ZsdWs.
(1.4)
In this case, PDE (1.2) becomes
(
ut+12σ2(t, x, u)uxx+f(t, x, u) = 0;
Moreover, in order to simplify the presentation and to focus on the main idea, through-out the paper we assume
Assumption 1.1 All the coefficientsσ, f, gare bounded, smooth enough with bounded derivatives, and σ is uniformly nondegenerate.
Under Assumption 1.1, it is well known that PDE (1.5) has a unique solution u
which is bounded and smooth with bounded derivatives (see [11]), that FBSDE (1.4) has a unique solution (X, Y, Z), and that (1.3) holds true (see [13]). Unless otherwise specified, throughout the paper we use (X, Y, Z) andu to denote these solutions, and
C, c >0 to denote generic constants depending only on T, the upper bounds of the
derivatives of the coefficients, and the uniform nondegeneracy of σ. We allow C, c to vary from line to line.
Finally, we cite a well-posedness result from Zhang [23] (or [22] for a weaker result) which will play an important role in our proofs.
Theorem 1.2 Consider the following FBSDE
Xt=x+
Z t
0 b(ω, s, Xs, Ys, Zs)ds+ Z t
0 σ(ω, s, Xs, Ys)dWs;
Yt =g(ω, XT) +
Z T
t f(ω, s, Xs, Ys, Zs)ds−
Z T
t ZsdWs;
(1.6)
Assume that b, σ, f, g are uniformly Lipschitz continuous with respect to (x, y, z); that there exists a constant c >0 such that
σybz ≤ −c|by +σxbz+σyfz|; (1.7)
and that
I02 =△ Enx2+|g(ω,0)|2+ Z T
0 [|b| 2+
|σ|2+|f|2](ω, t,0,0,0)dto<∞.
Then FBSDE (1.6) has a unique solution(X, Y, Z) such that
En sup
0≤t≤T
[|Xt|2+|Yt|2] + Z T
0 |Zt| 2dto
≤CI02,
where C is a constant depending only on T, c and the Lipschitz constants of the coef-ficients.
2
The Steepest Descent Method
Let (Ω,F, P) be a complete probability space,W a standard Brownian motion,T >0 a fixed terminal time, F=△ {Ft}0≤t≤T the filtration generated by W and augmented
by the P-null sets. Let L2(F) denote square integrable F-adapted processes. From now on we always assume Assumption 1.1 is in force.
2.1
The Control Problem
In order to numerically solve (1.4), we first formulate a related stochastic control prob-lem. Given y0 ∈ IR and z0 ∈ L2(F), consider the following 2-dimensional (forward) SDE with random coefficients (z0 being considered as a coefficient):
Xt0 =x+
Z t
0 σ(s, X 0
s, Y
0
s)dWs;
Yt0 =y0− Z t
0 f(s, X 0
s, Ys0)ds+
Z t
0 z 0
sdWs;
(2.1)
and denote
V(y0, z0)
△
= 1 2E
n
|YT0−g(XT0)|2
o
. (2.2)
Our first result is
Theorem 2.1 We have
En sup
0≤t≤T
[|Xt−Xt0|2+|Yt−Yt0|2] + Z T
0 |Zt−z 0
t|
2dto
≤CV(y0, z0).
Proof. The idea is similar to the four step scheme (see [13]).
Step 1. Denote
∆Yt=△ Yt0−u(t, Xt0); ∆Zt=△ zt0−ux(t, Xt0)σ(t, Xt0, Yt0).
Recalling (1.5) we have
d(∆Yt) = zt0dWt−f(t, Xt0, Yt0)dt−ux(t, Xt0)σ(t, Xt0, Yt0)dWt
−hut(t, Xt0) + 1
2uxx(t, X 0
t)σ
2(t, X0
t, Y
0
t )
i
dt
= ∆ZtdWt−h1
2uxx(t, X 0
t)σ
2(t, X0
t, Y
0
t ) +f(t, X
0
t, Y
0
t )
i
dt
+h1
2uxx(t, X 0
t)σ
2(t, X0
t, u(t, X
0
t)) +f(t, X
0
t, u(t, X
0
t))
i
dt
where
αt =△ 1
2∆Ytuxx(t, X
0
t)[σ2(t, Xt0, Yt0)−σ2(t, Xt0, u(t, Xt0))]
+ 1
∆Yt[f(t, X
0
t, Yt0)−f(t, Xt0, u(t, Xt0))]
is bounded. Note that ∆YT =YT0−g(XT0).By standard arguments one can easily get
En sup
0≤t≤T|
∆Yt|2 + Z T
0 |∆Zt| 2dto
≤CE{|∆YT|2}=CV(y0, z0). (2.3)
Step 2. Denote ∆Xt=△ Xt−X0
t. We show that
En sup
0≤t≤T|
∆Xt|2o≤CV(y0, z0). (2.4)
In fact,
d(∆Xt) = hσ(t, Xt, u(t, Xt))−σ(t, Xt0, Yt0)
i
dWt.
Note that
u(t, Xt)−Yt0 =u(t, Xt)−u(t, Xt0)−∆Yt.
One has
d(∆Xt) = [α1t∆Xt+α2t∆Yt]dWt,
whereαitare defined in an obvious way and are uniformly bounded. Note that ∆X0 = 0. Then by standard arguments we get
En sup
0≤t≤T |
∆Xt|2o≤CEn
Z T
0 |∆Yt| 2dto
,
which, together with (2.3), implies (2.4).
Step 3. We now prove the theorem. Recall (1.3), we have
En sup
0≤t≤T|
Yt−Yt0|2+
Z T
0 |Zt−z 0
t|2dt
o
=En sup 0≤t≤T|
u(t, Xt)−u(t, Xt0)−∆Yt|2
+ Z T
0 ¯ ¯
¯ux(t, Xt)σ(t, Xt, u(t, Xt))−ux(t, X 0
t)σ(t, Xt0, u(t, Xt0))
+ux(t, Xt0)σ(t, Xt0, u(t, Xt0))−ux(t, Xt0)σ(t, Xt0, Yt0)−∆Zt¯¯ ¯ 2
dto
≤CEn sup
0≤t≤T
[|∆Xt|2+|∆Yt|2] + Z T
0 h
|∆Xt|2 +|∆Yt|2+|∆Zt|2idto
≤CV(y0, z0),
2.2
The Steepest Descent Direction
Our idea is to modify (y0, z0) along the steepest descent direction so as to decrease
V as fast as possible. First we need to find the Fr´echet derivative of V along some direction (∆y,∆z), where ∆y∈IR,∆z ∈L2(F). For δ≥0, denote
yδ0 =△ y0+δ∆y; zt0,δ
△
=zt0+δ∆zt;
and letX0,δ, Y0,δ be the solution to (2.1) corresponding to (yδ
0, z0,δ). Denote:
∇Xt0 =
Z t
0 [σ 0
x∇Xs0+σy0∇Ys0]dWs;
∇Yt0 = ∆y−
Z t
0 [f 0
x∇Xs0+fy0∇Ys0]ds+
Z t
0 ∆zsdWs;
∇V(y0, z0) = E n
[YT0−g(XT0)][∇YT0−g′(X0
T)∇XT0]
o ;
where ϕ0s
△
=ϕ(s, Xs0, Ys0) for any function ϕ. By standard arguments, one can easily
show that
lim
δ→0 1
δ[X
0,δ
t −Xt0] =∇Xt0; lim δ→0
1
δ[Y
0,δ
t −Yt0] =∇Yt0;
lim
δ→0 1
δ[V(y δ
0, z0
,δ
)−V(y0, z0)] =∇V(y0, z0);
where the two limits in the first line are in theL2(F) sense.
To investigate ∇V(y0, z0) further, we define some adjoint processes. Consider (X0, Y0) as random coefficients and let ( ¯Y0,Y˜0,Z¯0,Z˜0) be the solution to the follow-ing 2-dimensional BSDE:
¯
Yt0 = [YT0−g(XT0)]−
Z T
t [f
0
yY¯s0+σy0Z˜s0]ds−
Z T
t
¯
Zs0dWs;
˜
Yt0 =g′(XT0)[YT0 −g(XT0)] + Z T
t [f
0
xY¯
0
s +σ
0
xZ˜
0
s]ds−
Z T
t
˜
Zs0dWs.
(2.5)
We note that (2.5) depends only on (y0, z0), but not on (∆y,∆z).
Lemma 2.2 For any (∆y,∆z), we have ∇V(y0, z0) = E
n ¯
Y00∆y+ Z T
0 ¯
Zt0∆ztdto.
Proof. Note that
∇V(y0, z0) =E n
¯
YT0∇YT0 −Y˜T0∇XT0o.
Applying Ito’s formula one can easily check that
Then
∇V(y0, z0) =E n
¯
Y00∇Y00−Y˜00∇X00 + Z T
0 ¯
Zt0∆ztdto=EnY¯00∆y+ Z T
0 ¯
Zt0∆ztdto.
That proves the lemma.
Recall that our goal is to decrease V(y0, z0). Very naturally one would like to choose the following steepest descent direction:
∆y=△ −Y¯00; ∆zt=△ −Z¯t0. (2.6)
Then
∇V(y0, z0) = −E n
|Y¯00|2+ Z T
0 | ¯
Zt0|2dto, (2.7)
which depends only on (y0, z0) (not on (∆y,∆z)).
Note that if ∇V(y0, z0) = 0, then we gain nothing on decreasing V(y0, z0). For-tunately this is not the case.
Lemma 2.3 Assume (2.6). Then ∇V(y0, z0)≤ −cV(y0, z0).
Proof. Rewrite (2.5) as
¯
Yt0 = ¯Y00+ Z t
0 [f 0
yY¯s0+σy0Z˜s0]ds+
Z t
0 ¯
Zs0dWs;
˜
Yt0 =g′(XT0) ¯YT0+ Z T
t [f
0
xY¯s0+σx0Z˜s0]ds−
Z T
t
˜
Zs0dWs.
(2.8)
One may consider (2.8) as an FBSDE with solution triple ( ¯Yt,Yt,˜ Zt˜), where ¯Yt is the forward component and ( ˜Yt,Zt˜) are the backward components. Then ( ¯Y00,Z¯t0) are considered as (random) coefficients of the FBSDE. One can easily check that FBSDE (2.8) satisfies condition (1.7) (with both sides equal to 0). Applying Theorem 1.2 we get
En sup
0≤t≤T
[|Y¯t0|2+|Y˜t0|2] + Z T
0 | ˜
Zt0|dto≤CI02 =CEn|Y¯00|2+ Z T
0 | ¯
Zt0|2dto.
In particular,
V(y0, z0) = 1 2E{|Y¯
0
T|2} ≤CE
n
|Y¯00|2+ Z T
0 | ¯
Zt0|2dt
o
, (2.9)
2.3
Iterative Modifications
We now fix a desired error levelεand pick an (y0, z0). If we are extremely lucky that
V(y0, z0)≤ε2, then we may use (X0, Y0, z0) defined by (2.1) as an approximation of
(X, Y, Z). In other cases we want to modify (y0, z0). From now on we assume
V(y0, z0)> ε2; E{|YT0−g(XT0)|4} ≤K04; (2.10)
whereK0 ≥1 is a constant. We note that one can always assume the existence ofK0 by letting, for example, y0 = 0, zt0 = 0.
Proof. We proceed in four steps.
Following the proof of Lemma 2.2, we have
Step 2. First, one can easily show that
En sup
which is bounded. For any constants a, b > 0 and 0 < λ <1, applying the Young’s Inequality we have
(a+b)4 = a4+ 4(λ14a)3(λ−
Noting that the value ofλ we will choose is less than 1, we have
En|YTθ−g(XTθ)|4
Step 3. Denote
where
∆fy(θ)=△ fy(t, Xtθ, Y θ
t )−fy(t, Xt0, Yt0);
and all other terms are defined in a similar way. By standard arguments one has
En sup
0≤t≤T
[|∆ ¯Ytθ|2+|∆ ˜Ytθ|2] + Z T
0 [|∆ ¯Z
θ t|
2+
|∆ ˜Ztθ|2]dt}
≤CEn|∆YTθ|2 +|∆XTθ|2+|YT0−g(XT0)|2|∆g′(θ)|2
+ Z T
0 h
|Y¯t0|2|[∆fxθ|2+|∆fyθ|2] +|Z˜t0|2[|∆σθx|2+|∆σyθ|2]idto
≤CEn|∆YTθ|2 +|∆XTθ|2+|YT0−g(XT0)|2|∆XTθ|2
+ Z T
0 [| ¯
Yt0|2+|Z˜t0|2][|∆Xtθ|2+|∆Ytθ|2]dto
≤CE12
n sup 0≤t≤T
[|∆Xtθ|4+|∆Ytθ|4]o×
E12
n
1 +|YT0−g(XT0)|4+³ Z T
0 [| ¯
Yt0|2+|Z˜t0|2]dt´2o
≤CK02λ2[1 +K02]≤CK04λ2,
thanks to (2.17), (2.10), and (2.16). In particular,
En|∆ ¯Y0θ|2+ Z T
0 |∆ ¯Z
θ t|
2dto
≤CK04λ2. (2.19)
Step 4. Note that
¯ ¯ ¯E
n ¯
Y0θY¯00+ Z T
0 ¯
ZtθZ¯t0dto−En|Y¯00|2+ Z T
0 | ¯
Zt0|2dto¯¯ ¯
≤En|∆ ¯Y0θY¯00|+ Z T
0 |∆ ¯Z
θ tZ¯t0|dt
o
≤CEn|∆ ¯Y0θ|2 + Z T
0 |∆ ¯Z
θ t|2dt
o + 1
2E n
|Y¯00|2+ Z T
0 | ¯
Zt0|2dto
≤CK04λ2+ 1 2E
n
|Y¯00|2+ Z T
0 | ¯
Zt0|2dto.
Then, by (2.9) we have
EnY¯0θY¯00+ Z T
0 ¯
ZtθZ¯t0dto ≥ 1
2E n
|Y¯00|2+ Z T
0 | ¯
Zt0|2dto−CK04λ2
≥ cV(y0, z0)−CK04λ2.
Choose c1
△
= q c
2C for the constants c, C as above, and λ
△
= c1ε
K2
0. Then by (2.10) we
get
EnY¯0θY¯00+ Z T
0 ¯
ZtθZ¯t0dt
o
≥cV(y0, z0)−
c
2ε 2
Then (2.11) follows directly from (2.15).
Finally, plug λ into (2.18) and let θ= 1 we get (2.12) for some C0. Now we are ready to approximate FBSDE (1.4) iteratively. Set
y0
△
= 0, zt0 = 0△ , K0
△
=E14{|Y0
T −g(X
0
T)|
4
}. (2.21)
Fork = 0,1,· · ·,let (Xk, Yk,Y¯k,Y˜k,Z¯k,Z˜k) be the solution to the following FBSDE:
Xtk =x+
Z t
0 σ(s, X
k
s, Ysk)dWs;
Ytk =yk−
Z t
0 f(s, X
k s, Y
k s )ds+
Z t
0 z
k sdWs;
¯
Ytk = [YTk−g(XTk)]−
Z T
t [f k
yY¯sk+σykZ˜sk]ds−
Z T
t
¯
ZskdWs;
˜
Ytk =g′(XTk)[YTk−g(XTk)] + Z T
t [f k xY¯
k s +σ
k xZ˜
k s]ds−
Z T
t
˜
ZskdWs.
(2.22)
We note that (2.22) is decoupled, with forward components (Xk, Yk) and backward
components ( ¯Yk,Y˜k,Z¯k,Z˜k). Denote
λk =△ c1ε
K2
k
, yk+1
△
=yk−λkY¯0k, ztk+1
△
=zkt −λkZ¯tk, Kk4+1
△
=Kk4+ 2C0εKk2, (2.23)
wherec1, C0 are the constants in Lemma 2.4.
Theorem 2.5 There exist constants C1, C2 and N ≤C1ε−C2 such that
V(yN, zN)≤ε2.
Proof. Assume V(yk, zk) > ε2 for k = 0,· · ·, N −1. Note that K4
k+1 ≤ (Kk2+C0ε)2. ThenKk2+1 ≤Kk2 +C0ε, which implies that
Kk2 ≤K02+C0kε.
Thus by Lemma 2.4 we have
V(yk+1, zk+1)≤ h
1− c0ε
K02+C0kε i
V(yk, zk).
Note that log(1−x)≤ −x forx∈[0,1). Forε small enough, we get
log(V(yN, zN))≤log(V(0,0)) +
N−1 X
k=0
log³1− c0ε
K2
0 +C0kε ´
≤C−c
N−1 X
k=0 1
k+ε−1 ≤C−c
Z N−1 0
dx x+ε−1
Forc, C as above, choose N to be the smallest integer such that
N ≥1 +ε−1[eCcε−
2
c −1].
We get
log(V(yN, zN))≤C−c[C
c −
2
clog(ε)] = log(ε
2),
which obviously proves the theorem.
3
Time Discretization
We now investigate the time discretization of FBSDEs (2.22). Fix n and denote
ti =△ i
nT; ∆t
△
= T
n; i= 0,· · ·, n.
3.1
Discretization of the FSDEs
Giveny0 ∈IR and z0 ∈L2(F), denote
Xtn,00
△
=x; Ytn,0 0
△
=y0;
Xtn,0
△
=Xtin,0+σ(ti, X n,0
ti , Y n,0
ti )[Wt−Wti], t∈(ti, ti+1]; Ytn,0
△
=Ytin,0 −f(ti, X n,0
ti , Y n,0
ti )[t−ti] +
Z t
ti z
0
sdWs, t∈(ti, ti+1].
(3.1)
Note that we do not discretizez0 here. For notational simplicity, we denote
Xin,0
△
=Xtin,0; Y n,0
i
△
=Ytin,0; i= 0,· · ·, n.
Define
Vn(y0, z0)
△
= 1 2E{|Y
n,0
n −g(X n,0
n )|2}. (3.2)
First we have
Theorem 3.1 Denote
In,0 =△ En max
0≤i≤n[|Xti −X n,0
i |2+|Yti −Y n,0
i |2] +
Z T
0 |Zt−z 0
t|2dt
o
. (3.3)
Then
In,0 ≤CVn(y0, z0) +
C
We note that (see, e.g. Zhang [21]),
max 0≤i≤n−1E
n sup
ti≤t≤ti+1
[|Xt−Xti|2 +|Yt−Yti|2]o≤ C
n;
En max
0≤i≤n−1ti≤supt≤ti+1[|Xt−Xti| 2 +
|Yt−Yti|2]o≤ Clogn
n ;
En
n−1 X
i=0 Z ti+1
ti |Zt−
1 ∆tEi{
Z ti+1
ti Zsds}|
2dto
≤ Cn;
whereEi{·}=△ E{·|Fti}. Then one can easily show the following estimates:
Corollary 3.2 We have
max 0≤i≤n−1E
n sup
ti≤t≤ti+1
[|Xt−Xin,0|2+|Yt−Yin,0|2]o≤CVn(y0, z0) +
C n;
En max
0≤i≤n−1ti≤supt≤ti+1[|Xt−X
n,0
i |2+|Yt−Y n,0
i |2]
o
≤CVn(y0, z0) +
Clogn
n ;
En
n−1 X
i=0 Z ti+1
ti |
Zt− 1
∆tEi{
Z ti+1
ti
zs0ds}|2dto≤CVn(y0, z0) +
C n.
Proof of Theorem 3.1. Recall (2.1). For i= 0,· · ·, n, denote
∆Xi =△ Xti0 −Xin,0; ∆Yi
△
=Yti0−Yin,0.
Then
∆X0 = 0; ∆Y0 = 0;
∆Xi+1 = ∆Xi+ Z ti+1
ti
h
[α1i∆Xi+βi1∆Yi] + [σ(t, Xt0, Yt0)−σ(ti, Xti0, Yti0)]idWt;
∆Yi+1 = ∆Yi− Z ti+1
ti
h
[α2i∆Xi+βi2∆Yi] + [f(t, Xt0, Yt0)−f(ti, Xti0, Yti0)]idt;
whereαij, βij ∈ Fti are defined in an obvious way and are uniformly bounded. Then
E{|∆Xi+1|2}
=En|∆Xi|2+ Z ti+1
ti
h
[α1i∆Xi+βi1∆Yi] + [σ(t, Xt0, Yt0)−σ(ti, Xti0, Yti0)]i2dto
≤En|∆Xi|2+ C
n[|∆Xi|
2+
|∆Yi|2] +C
Z ti+1
ti [|X
0
t −X
0
ti|
2+
|Yt0−Yti0|2]dto;
and, similarly,
E{|∆Yi+1|2}
≤En|∆Yi|2 +C
n[|∆Xi|
2+
|∆Yi|2] +C
Z ti+1
ti [|X
0
t −X
0
ti|
2+
Denote
Ai =△ E{|∆Xi|2+|∆Yi|2}.
ThenA0 = 0, and
Ai+1 ≤[1 +
C
n]Ai+CE
nZ ti+1
ti [|X
0
t −Xti0|2+|Yt0−Yti0|2]dt
o
.
By the discrete Gronwall inequality we get
max
0≤i≤nAi ≤C n−1
X
i=0
En
Z ti+1
ti
[|Xt0−Xti0|2+|Yt0 −Yti0|2]dto
≤C
n−1 X
i=0
En
Z ti+1
ti
hZ t
ti |σ(s, X
0
s, Ys0)|2ds+|
Z t
ti f(s, X
0
s, Ys0)ds|2+
Z t
ti |z
0
s|2ds
i
dto
≤C
n−1 X
i=0
En|∆t|2+|∆t|3+ ∆t
Z ti+1
ti |z
0
t|2dt
o
≤ Cn +C
nE
nZ T
0 |z 0
t|2dt
o
. (3.5)
Next, note that
∆Xi =
i−1 X
j=0 Z tj+1
tj
h
[α1j∆Xj +βj1∆Yj] + [σ(t, Xt0, Yt0)−σ(tj, Xtj0, Ytj0)]idWt;
∆Yi =
i−1 X
j=0 Z tj+1
tj
h
[α2j∆Xj+βj2∆Yj]−[f(t, Xt0, Yt0)−f(tj, Xtj0, Ytj0)]idt.
Applying the Burkholder-Davis-Gundy Inequality and by (3.5) we get
En max
0≤i≤n[|∆Xi|
2+
|∆Yi|2]o≤ C
n +
C
nE
nZ T
0 |z 0
t|
2dto
,
which, together with Theorem 2.1, implies that
In,0 ≤CV(y0, z0) +
C
n +
C
nE
nZ T
0 |z 0
t|
2dto
.
Finally, note that
V(y0, z0)≤CVn(y0, z0) +CE n
|∆Xn|2 +|∆Yn|2o=CVn(y0, z0) +CAn.
We get
In,0 ≤CVn(y0, z0) +
C
n +
C
nE
nZ T 0 |z
0
t|2dt
o
Moreover, noting thatZt =ux(t, Xt)σ(t, Xt, Yt) is bounded, we have
3.2
Discretization of the BSDEs
Define the adjoint processes (or say, discretize BSDE (2.5)) as follows.
not discretized. Denote ∆Wi+1
where
Repeating the same arguments and by induction we get
∇Vn(y0, z0) =E
¿From now on, we choose the following “almost” steepest descent direction:
∆y=△ −Y¯0n,0;
We note that ∆z is well defined here. Then we have
Lemma 3.3 Assume (3.9). Then for n large, we have ∇Vn(y0, z0)≤ −cVn(y0, z0).
Proof. We proceed in several steps.
Step 1. We show that
By standard arguments we get
Then (3.10) follows from the Burkholder-Davis-Gundy Inequality.
Step 2. We show that
Vn(y0, z0)≤CE
Applying Theorem 1.2, we get
thanks to (3.10). Choosingn ≥2C, we get (3.11) immediately.
where we used (3.10) for the last inequality. Then,
¯
Assume n is large. By (3.11) we get
EnY¯0n,0∆y+
+√C
n 0max≤i≤nE
n
|∇Xin,0|2+|∇Yin,0|2+|Y¯in,0|2o
≤ √CnVn(y0, z0). (3.15)
Recall (3.8). Combining the above inequality with (3.13) we prove the lemma for largen.
3.3
Iterative Modifications
We now fix a desired error level ε. In light of Theorem 3.1, we set n = ε−2. So it suffices to find (y, z) such that Vn(y, z)≤ε2. As in §2.3, we assume
Vn(y0, z0)> ε2; E n
|Ynn,0−g(Xnn,0)|4o
≤K04. (3.16)
Lemma 3.4 Assume (3.16). There exist constants C0, c0, c1 >0, which are
indepen-dent of K0 and ε, such that
∆Vn(y0, z0)
△
=Vn(y1, z1)−Vn(y0, z0)≤ −
c0ε
K02
Vn(y0, z0), (3.17)
and
E{|Ynn,1−g(XTn,1)|4} ≤K14 =△ K04+ 2C0εK02, (3.18)
where, recalling (3.9) and denoting λ=△ c1ε
K2 0,
y1
△
=y0+λ∆y; z1t
△
=zt0+λ∆zt; (3.19)
and, for 0≤θ≤1,
X0n,θ =△ x; Y0n,θ =△ y0+θλ∆y;
Xin,θ+1 =△ Xin,θ+σ(ti, Xin,θ, Yin,θ)∆Wi+1;
Yin,θ+1 =△ Yin,θ−f(ti, Xin,θ, Yin,θ)∆t+ Z ti+1
ti [zt+θλ∆zt]dWt.
(3.20)
Proof. We shall follow the proof for Lemma 2.4.
Step 1. For 0≤θ ≤1, denote
¯
Ynn,θ
△
=Ynn,θ −g(Xnn,θ); Y˜nn,θ
△
=g′(Xnn,θ)[Ynn,θ−g(Xnn,θ)];
¯
Yin,θ−1 = ¯Yin,θ −fy,in,θ−1Y¯in,θ−1∆t−σy,in,θ−1
Z ti
ti−1
˜
Ztn,θdt−
Z ti
ti−1
¯
Ztn,θdWt;
˜
Yin,θ−1 = ˜Yin,θ +fx,in,θ−1Y¯in,θ−1∆t+σx,in,θ−1
Z ti
ti−1
˜
Ztn,θdt−
Z ti
ti−1
˜
and
Step 2. First, similarly to (3.10) and (3.12) one can show that
Therefore, similarly to (2.18) one can show that
En|Ynn,θ −g(Xnn,θ)|4o≤[1 +Cλ]K04. (3.25)
Step 3. Denote
∆ ¯Yin,θ = ¯△ Yin,θ −Y¯in,0; ∆ ˜Yin,θ = ˜△ Yin,θ−Y˜in,0;
and all other terms are defined in a similar way. By standard arguments one has
thanks to (3.24), (3.16), and (3.23). In particular,
Step 4. Recall (3.12). Note that
¯
Choose n large and by (3.11) we get
EnY¯0n,θ∆y+
Moreover, similarly to (3.14) and (3.15) we have
En max
Then by (3.27) and choosing n large, we get
EnY¯0n,θ∆y+
(3.16), we have
∆Vn(y0, z0)≤λ[− We now iteratively modify the approximations. Set
Fork = 0,1,· · ·, define (Xn,k, Yn,k,Y¯n,k,Y˜n,k,Z¯n,k,Z˜n,k) as follows:
wherec1, C0 are the constants in Lemma 3.4. Then following exactly the same argu-ments as in Theorem 2.5, we can prove
Theorem 3.5 Set n=ε−2. There exist constants C
1, C2 and N ≤C1ε−C2 such that
Vn(yN, zN)≤ε2.
4
Further Simplification
We now transform (3.30) into conditional expectations. First,
Thus 3.4. We have the following algorithm.
First, set
3.4, and Corollary 3.2, we have
En max
Otherwise, we proceed the loop as follows:
and
zin,k+1
△
= 1 ∆tEi
n
Yin,k+1+1∆Wi+1 o
. (4.5)
We note that in the last line of (4.4), the two terms stand for Rti+1
ti z k
tdWt and
Rti+1
ti ∆ztkdWt, respectively.
By Theorem 3.4, the above loop should stop after at most C1ε−C2 steps.
We note that in the above algorithm the only costly terms are the conditional expectations:
Ei{Y¯in,k+1}, Ei{Y˜in,k+1}, Ei{∆Wi+1Yin,k+1}, Ei{M
n,k i+1Y˜
n,k
i+1} or Ei{∆Wi+1Y˜in,k+1}. (4.6) By induction, one can easily show that
Yin,k =un,ki (X0n,k,· · ·, Xin,k),
for some deterministic function un,ki . Similar properties hold true for ( ¯Yin,k,Y˜in,k). However, they are not Markovian in the sense that one cannot writeYin,k,Y¯in,k,Y˜in,k
as functions of Xin,k only. In order to use Monte-Carlo methods to compute the
conditional expectations in (4.6) efficiently, some Markovian type modification of our algorithm is needed.
Acknowledgement. We are very grateful to the anonymous referee for his/her careful reading of the manuscript and many very helpful suggestions.
References
[1] V. Bally, Approximation scheme for solutions of BSDE, Backward stochastic differential equations (Paris, 1995–1996), 177–191, Pitman Res. Notes Math. Ser., 364, Longman, Harlow, 1997.
[2] V. Bally, G. Pag`es, and J. Printems, A quantization tree method for pricing and hedging multidimensional American options, Math. Finance, 15 (2005), no. 1, 119–168.
[3] C. Bender and R. Denk, Forward Simulation of Backward SDEs, preprint.
[4] B. Bouchard and N. Touzi, Discrete-time approximation and Monte-Carlo sim-ulation of backward stochastic differential equations, Stochastic Process. Appl., 111 (2004), no. 2, 175–206.
[6] D. Chevance, Numerical methods for backward stochastic differential equations, Numerical methods in finance, 232–244, Publ. Newton Inst., Cambridge Univ. Press, Cambridge, 1997.
[7] F. Delarue, On the existence and uniqueness of solutions to FBSDEs in a non-degenerate case, Stochastic Process. Appl., 99 (2002), no. 2, 209–286.
[8] F. Delarue and S. Menozzi, A forward-backward stochastic algorithm for quasi-linear PDEs, Ann. Appl. Probab., to appear.
[9] J. Jr. Douglas, J. Ma, and P. Protter, Numerical methods for forward-backward stochastic differential equations,Ann. Appl. Probab., 6 (1996), 940-968.
[10] E. Gobet, J. Lemor, and X. Warin, A regression-based Monte-Carlo method to solve backward stochastic differential equations, Ann. Appl. Probab., 15 (2005), no. 3, 2172–2202..
[11] O. Ladyzhenskaya, V. Solonnikov, and N. Ural’ceva,Linear and quasilinear equa-tions of parabolic type, Translaequa-tions of Mathematical Monographs, Vol. 23 Amer-ican Mathematical Society, Providence, R.I. 1967.
[12] J. Ma, P. Protter, J. San Mart´ın, and S. Torres,Numerical method for backward stochastic differential equations,Ann. Appl. Probab. 12 (2002), no. 1, 302–316. [13] J. Ma, P. Protter, and J. Yong, Solving forward-backward stochastic
differen-tial equations explicitly - a four step scheme, Probab. Theory Relat. Fields., 98 (1994), 339-359.
[14] R. Makarov,Numerical solution of quasilinear parabolic equations and backward stochastic differential equations, Russian J. Numer. Anal. Math. Modelling, 18 (2003), no. 5, 397–412.
[15] J. M´emin, S. Peng, and M. Xu, Convergence of solutions of discrete reflected BSDEs and simulations, preprint.
[16] G. Milstein and M. Tretyakov, Numerical algorithms for semilinear parabolic equations with small parameter based on approximation of stochastic equations, Math. Comp., 69 (2000), no. 229, 237–267.
[17] J. Ma and J. Yong, Forward-Backward Stochastic Differential Equations and Their Applications, Lecture Notes in Math., 1702, Springer, 1999.
[19] E. Pardoux and S. Peng S.,Adapted solutions of backward stochastic equations, System and Control Letters, 14 (1990), 55-61.
[20] E. Pardoux and S. Tang,Forward-backward stochastic differential equations and quasilinear parabolic PDEs, Probab. Theory Related Fields, 114 (1999), no. 2, 123–150.
[21] J. Zhang, A numerical scheme for BSDEs, Ann. Appl. Probab., 14 (2004), no. 1, 459–488.
[22] J. Zhang, The well-posedness of FBSDEs, Discrete and Continuous Dynamical Systems (B), to appear.
[23] J. Zhang, The well-posedness of FBSDEs (II), submitted.
[24] Y. Zhang and W. Zheng,Discretizing a backward stochastic differential equation, Int. J. Math. Math. Sci., 32 (2002), no. 2, 103–116.