Dynamic Programming and Optimal Nonanticipative Policy

INVENTORY MODELS WITH TWO CONSECUTIVE DELIVERY MODES

3.3. Dynamic Programming and Optimal Nonanticipative Policy

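In this section, we use dynamic programming to study the problem. We verify that the cost of the nonanticipative policy obtained from the solution of the dynamic programming equations equals the value function of the problem over $\langle 1, N \rangle$. First, we define the problem over $\langle n, N \rangle$. Let

$$J_n\big(x_n, s_{n-1}, i_n^1, (F, S)\big) = H_n(x_n) + \mathrm{E}\Big[\sum_{\ell=n}^{N} \big(C_\ell^f(F_\ell) + C_\ell^s(S_\ell) + H_{\ell+1}(X_{\ell+1})\big)\Big], \tag{3.14}$$

where

$$(F, S) = \big((F_n, \ldots, F_N), (S_n, \ldots, S_N)\big)$$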

is a history-dependent or nonanticipative admissible decision for the problem defined over periods $\langle n, N \rangle$. That is, given $x_n$, $s_{n-1}$, and $i_n^1$ as constants, $(F_n, S_n)$ is a vector of nonnegative constants; $(F_k, S_k)$ $(n < k < N)$ are positive real-valued functions of the history of the demand information from period $n$ to period $k$, given by $\{(I_i^1, I_i^2),\ n \le i \le k-1\}$, $k = n+1, \ldots, N-1$; and $F_N$ is a nonnegative real-valued function of the history of the demand information from period $n$ to period $N-1$, given by $\{(I_i^1, I_i^2),\ n \le i \le N-1\}$ and $I_N^1$. Here $s_{n-1}$ has the same meaning as $s_0$ and is an outstanding slow order to be delivered at the end of period $n$. Define the value function associated with the problem over periods $\langle n, N \rangle$ as follows:

$$V_n\big(x_n, s_{n-1}, i_n^1\big) = \inf_{(F,S) \in \mathcal{A}_n} \Big\{ J_n\big(x_n, s_{n-1}, i_n^1, (F, S)\big) \Big\}, \tag{3.15}$$

where $\mathcal{A}_n$ denotes the class of all history-dependent admissible decisions for the problem over $\langle n, N \rangle$. We have the following theorem on the property of the value function $V_n(x_n, s_{n-1}, i_n^1)$.

THEOREM 3.1 Assume that (3.1) and (3.3)-(3.7) hold. Then the value function $V_1(x_1, s_0, i_1^1)$ is convex and Lipschitz continuous in $x_1$ on $(-\infty, +\infty)$, and the value functions $V_n(x_n, s_{n-1}, i_n^1)$ $(2 \le n \le N)$ are convex and Lipschitz continuous in $(x_n, s_{n-1})$ on $(-\infty, +\infty) \times [0, +\infty)$.

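Proof. To prove the convexity, it suffices to show that for any $\theta \in [0,1]$, $(x_n, s_{n-1}) \in (-\infty, +\infty) \times [0, +\infty)$, and $(\tilde x_n, \tilde s_{n-1}) \in (-\infty, +\infty) \times [0, +\infty)$, we have

$$V_n\big(\theta x_n + (1-\theta)\tilde x_n,\ \theta s_{n-1} + (1-\theta)\tilde s_{n-1},\ i_n^1\big) \le \theta\, V_n\big(x_n, s_{n-1}, i_n^1\big) + (1-\theta)\, V_n\big(\tilde x_n, \tilde s_{n-1}, i_n^1\big). \tag{3.16}$$

We note that for any two admissible decisions $(F, S)$ and $(\tilde F, \tilde S)$ of the problem over $\langle n, N \rangle$, the convex combination $(\theta F + (1-\theta)\tilde F,\ \theta S + (1-\theta)\tilde S)$ is also an admissible decision. It follows from (3.4), (3.5), and (3.14) that

$$\begin{aligned}
&J_n\big(\theta x_n + (1-\theta)\tilde x_n,\ \theta s_{n-1} + (1-\theta)\tilde s_{n-1},\ i_n^1,\ (\theta F + (1-\theta)\tilde F,\ \theta S + (1-\theta)\tilde S)\big) \\
&\qquad \le \theta\, J_n\big(x_n, s_{n-1}, i_n^1, (F, S)\big) + (1-\theta)\, J_n\big(\tilde x_n, \tilde s_{n-1}, i_n^1, (\tilde F, \tilde S)\big),
\end{aligned} \tag{3.17}$$

which, in turn, implies (3.16).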

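The proof for $V_1(x_1, s_0, i_1^1)$ can be similarly established. Here we give the proof of the Lipschitz continuity for $V_n(x_n, s_{n-1}, i_n^1)$. To prove the Lipschitz continuity, it is sufficient to show that there exists a constant $L > 0$ such that for any $(x_n, s_{n-1}), (\tilde x_n, \tilde s_{n-1}) \in (-\infty, +\infty) \times [0, +\infty)$,

$$\big| J_n\big(x_n, s_{n-1}, i_n^1, (F, S)\big) - J_n\big(\tilde x_n, \tilde s_{n-1}, i_n^1, (F, S)\big) \big| \le L \cdot \big( |x_n - \tilde x_n| + |s_{n-1} - \tilde s_{n-1}| \big). \tag{3.18}$$

Note that from (3.5),

$$\begin{aligned}
\big| J_n\big(x_n, s_{n-1}, i_n^1, (F, S)\big) - J_n\big(\tilde x_n, \tilde s_{n-1}, i_n^1, (F, S)\big) \big|
&\le C_H \cdot \big( |x_n - \tilde x_n| + |s_{n-1} - \tilde s_{n-1}| \big) \\
&\quad + \sum_{i=n}^{N} C_H \cdot \big( |x_n - \tilde x_n| + |s_{n-1} - \tilde s_{n-1}| \big),
\end{aligned} \tag{3.19}$$

which implies (3.18) in view of $N < \infty$. □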

REMARK 3.1 In classical inventory models with convex cost, the value function is convex in the initial inventory level (see Bensoussan, Crouhy, and Proth [2]). Theorem 3.1 states that this classical result remains valid for the problem with dual delivery modes and forecast updates.
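As a numerical sanity check of the convexity and Lipschitz properties discussed in Theorem 3.1 and Remark 3.1, the following Python sketch evaluates a one-period value function on an integer grid. Everything in it is an assumption made for illustration and is not taken from the model above: a single fast order with linear unit cost `C_FAST`, a piecewise-linear holding and backlog cost `H`, a two-point demand distribution `DEMANDS`, and an order cap `Q`. It checks second differences on a grid, which is weaker than the theorem and not a substitute for the proof.

```python
# Toy one-period instance; all numbers below are illustrative assumptions.
C_FAST = 2.0                      # unit cost of the fast order
Q = 4                             # cap on the order size
DEMANDS = {1: 0.5, 2: 0.5}        # demand value -> probability


def H(x):
    """Convex inventory cost: holding 1 per unit, backlog 5 per unit."""
    return 1.0 * max(x, 0) + 5.0 * max(-x, 0)


def W(x):
    """H(x) plus the cheapest fast order and the expected end-of-period cost."""
    return H(x) + min(
        C_FAST * phi + sum(p * H(x + phi - d) for d, p in DEMANDS.items())
        for phi in range(Q + 1)
    )


xs = list(range(-6, 8))
values = [W(x) for x in xs]
first = [b - a for a, b in zip(values, values[1:])]    # slopes between grid points
second = [b - a for a, b in zip(first, first[1:])]     # discrete curvature

print("convex on grid:", all(d >= -1e-9 for d in second))
print("largest slope magnitude:", max(abs(d) for d in first))
```

Nonnegative second differences correspond to convexity in the initial inventory level on the grid, and the bounded slopes correspond to a Lipschitz constant for this toy instance.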

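In view of (3.14), we can write the dynamic programming equations corresponding to the problem as follows:

$$\begin{aligned}
U_\ell\big(x_\ell, s_{\ell-1}, i_\ell^1\big) = H_\ell(x_\ell) + \inf_{\phi \ge 0,\ \sigma \ge 0} \Big\{ & C_\ell^f(\phi) + C_\ell^s(\sigma) \\
& + \mathrm{E}\big[ U_{\ell+1}\big(x_\ell + \phi + s_{\ell-1} - g_\ell(i_\ell^1, I_\ell^2, v_\ell),\ \sigma,\ I_{\ell+1}^1\big) \big] \Big\}, \quad \ell = 1, \ldots, N-1,
\end{aligned} \tag{3.20}$$

$$\begin{aligned}
U_N\big(x_N, s_{N-1}, i_N^1\big) = H_N(x_N) + \inf_{\phi \ge 0,\ \sigma \ge 0} \Big\{ & C_N^f(\phi) + C_N^s(\sigma) \\
& + \mathrm{E}\big[ H_{N+1}\big(x_N + s_{N-1} + \phi - g_N(i_N^1, I_N^2, v_N)\big) \big] \Big\}.
\end{aligned} \tag{3.21}$$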

REMARK 3.2 In the dynamic programming equations (3.20)-(3.21), the inventory cost is also charged for the initial inventory level. In some inventory literature, this cost for the initial inventory level is not charged. This means that $H_\ell(x_\ell)$ and $H_N(x_N)$ would be absent from (3.20)-(3.21), respectively. But this charge is of no consequence.
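To make the backward recursion in (3.20)-(3.21) concrete, here is a minimal Python sketch on a small integer instance. It is an illustration under stated assumptions, not the method of the text: the horizon `N`, the order cap `Q`, the linear unit costs `C_FAST` and `C_SLOW`, the cost `H`, the demand function `g` (with the noise term omitted), and the independent two-point signal laws `P1` and `P2` are all hypothetical choices.

```python
from functools import lru_cache
from itertools import product

# Illustrative assumptions only; none of these numbers come from the text.
N = 3                          # number of periods
Q = 4                          # cap on each order, so the search is finite
C_FAST, C_SLOW = 2.0, 1.0      # linear unit costs of fast and slow orders
P1 = {0: 0.5, 1: 0.5}          # law of the first signal I^1 (assumed i.i.d.)
P2 = {0: 0.5, 1: 0.5}          # law of the update I^2 (assumed i.i.d.)


def H(x):
    """Convex inventory cost: holding 1 per unit, backlog 5 per unit."""
    return 1.0 * max(x, 0) + 5.0 * max(-x, 0)


def g(i1, i2):
    """Demand as a function of the two signals; the noise v is omitted."""
    return 1 + i1 + i2


@lru_cache(maxsize=None)
def U(l, x, s, i1):
    """Cost-to-go from period l with inventory x, pipeline s, and signal i1."""
    actions = product(range(Q + 1), repeat=2)   # candidate (phi, sigma) pairs
    if l == N:
        # Terminal equation (3.21): only H_{N+1} remains after the last order.
        return H(x) + min(
            C_FAST * phi + C_SLOW * sigma
            + sum(p2 * H(x + s + phi - g(i1, i2)) for i2, p2 in P2.items())
            for phi, sigma in actions
        )
    # Recursion (3.20): the slow order sigma becomes next period's pipeline.
    return H(x) + min(
        C_FAST * phi + C_SLOW * sigma
        + sum(
            p2 * p1 * U(l + 1, x + phi + s - g(i1, i2), sigma, j1)
            for (i2, p2), (j1, p1) in product(P2.items(), P1.items())
        )
        for phi, sigma in actions
    )


print(U(1, 0, 0, 0))   # start of period 1: empty stock, no pipeline, low signal
```

Memoizing on the state $(\ell, x, s, i^1)$ avoids building an explicit grid. This is workable here only because the instance is tiny and integer-valued.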

Next, we state the following theorem, which gives the relationship between the value function and the dynamic programming equation.

THEOREM 3.2 Assume that (3.1) and (3.3)-(3.7) hold. Then the value functions $V_k(x_k, s_{k-1}, i_k^1)$, $1 \le k \le N$, defined in (3.13) and (3.15), satisfy the dynamic programming equations (3.20)-(3.21).

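Proof. It follows from the definition of $V_N(x_N, s_{N-1}, i_N^1)$ that it satisfies the last equation in (3.20)-(3.21). Suppose that $V_\ell(x_\ell, s_{\ell-1}, i_\ell^1)$ $(\ell = k+1, \ldots, N)$ satisfies (3.20)-(3.21). It suffices to show that $V_k(x_k, s_{k-1}, i_k^1)$ and $V_{k+1}(x_{k+1}, s_k, i_{k+1}^1)$ also satisfy the first equation in (3.20)-(3.21). That is,

$$V_k\big(x_k, s_{k-1}, i_k^1\big) = H_k(x_k) + \inf_{\phi \ge 0,\ \sigma \ge 0} \Big\{ C_k^f(\phi) + C_k^s(\sigma) + \mathrm{E}\big[ V_{k+1}\big(X_{k+1}, \sigma, I_{k+1}^1\big) \big] \Big\}, \tag{3.22}$$

where

$$X_{k+1} = x_k + s_{k-1} + \phi - g_k(i_k^1, I_k^2, v_k).$$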

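By the definition of $V_k(x_k, s_{k-1}, i_k^1)$ and the history dependence of $(F, S)$, we have

$$\begin{aligned}
V_k\big(x_k, s_{k-1}, i_k^1\big)
&= H_k(x_k) + \inf_{(F,S) \in \mathcal{A}_k} \Big\{ \mathrm{E}\Big[ \sum_{\ell=k}^{N} \big( C_\ell^f(F_\ell) + C_\ell^s(S_\ell) + H_{\ell+1}(X_{\ell+1}) \big) \Big] \Big\} \\
&= H_k(x_k) + \inf_{(F,S) \in \mathcal{A}_k} \Big\{ C_k^f(F_k) + C_k^s(S_k) \\
&\qquad + \mathrm{E}\Big[ H_{k+1}(X_{k+1}) + \sum_{\ell=k+1}^{N} \big( C_\ell^f(F_\ell) + C_\ell^s(S_\ell) + H_{\ell+1}(X_{\ell+1}) \big) \Big] \Big\}.
\end{aligned} \tag{3.23}$$

It follows from (3.1) that

$$\begin{aligned}
&\mathrm{E}\Big[ H_{k+1}(X_{k+1}) + \sum_{\ell=k+1}^{N} \big( C_\ell^f(F_\ell) + C_\ell^s(S_\ell) + H_{\ell+1}(X_{\ell+1}) \big) \Big] \\
&\quad = \mathrm{E}\Big\{ \mathrm{E}\Big[ H_{k+1}(X_{k+1}) + \sum_{\ell=k+1}^{N} \big( C_\ell^f(F_\ell) + C_\ell^s(S_\ell) + H_{\ell+1}(X_{\ell+1}) \big) \,\Big|\, (I_k^2, I_{k+1}^1) \Big] \Big\} \\
&\quad = \mathrm{E}\Big\{ H_{k+1}(X_{k+1}) + C_{k+1}^f(F_{k+1}) + C_{k+1}^s(S_{k+1}) + H_{k+2}(X_{k+2}) \\
&\qquad\quad + \mathrm{E}\Big[ \sum_{\ell=k+2}^{N} \big( C_\ell^f(F_\ell) + C_\ell^s(S_\ell) + H_{\ell+1}(X_{\ell+1}) \big) \,\Big|\, (I_k^2, I_{k+1}^1) \Big] \Big\}.
\end{aligned}$$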

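By the induction on the index $(k+1)$,

$$\inf_{(F,S) \in \mathcal{A}_{k+1}} \Big\{ \mathrm{E}\Big[ H_{k+1}(X_{k+1}) + \sum_{\ell=k+1}^{N} \big( C_\ell^f(F_\ell) + C_\ell^s(S_\ell) + H_{\ell+1}(X_{\ell+1}) \big) \Big] \Big\} = \mathrm{E}\big[ V_{k+1}\big(X_{k+1}, S_k, I_{k+1}^1\big) \big]. \tag{3.24}$$

Therefore, (3.23) and (3.24) complete the proof. □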

REMARK 3.3 Compared with the dynamic programming equation of the rolling horizon problem studied by Sethi and Sorger [14], the dynamic programming equations (3.20)-(3.21) have one more decision variable $s_{\ell-1}$, that is, ordering from the slow source. On the other hand, compared with the dynamic programming equations of the dual-source production-inventory problem studied by Scheller-Wolf and Tayur [12], the dynamic programming equations (3.20)-(3.21) have one more state variable $i_\ell^1$, that is, the updated demand information.

Next, we discuss how an optimal solution of our periodic-review inventory model with fast and slow orders and demand-information updates could be found.

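Assumptions (3.6)-(3.7) imply that there exists an upper-bound order quantity $Q > 0$ such that

$$\begin{aligned}
&\inf_{\phi \ge 0,\ \sigma \ge 0} \Big\{ C_\ell^f(\phi) + C_\ell^s(\sigma) + \mathrm{E}\big[ V_{\ell+1}\big(x_\ell + \phi + s_{\ell-1} - g_\ell(i_\ell^1, I_\ell^2, v_\ell),\ \sigma,\ I_{\ell+1}^1\big) \big] \Big\} \\
&\quad = \inf_{0 \le \phi \le Q,\ 0 \le \sigma \le Q} \Big\{ C_\ell^f(\phi) + C_\ell^s(\sigma) + \mathrm{E}\big[ V_{\ell+1}\big(x_\ell + \phi + s_{\ell-1} - g_\ell(i_\ell^1, I_\ell^2, v_\ell),\ \sigma,\ I_{\ell+1}^1\big) \big] \Big\}
\end{aligned}$$

for $\ell = 1, \ldots, N-1$, and

$$\begin{aligned}
&\inf_{\phi \ge 0,\ \sigma \ge 0} \Big\{ C_N^f(\phi) + C_N^s(\sigma) + \mathrm{E}\big[ H_{N+1}\big(x_N + s_{N-1} + \phi - g_N(i_N^1, I_N^2, v_N)\big) \big] \Big\} \\
&\quad = \inf_{0 \le \phi \le Q,\ 0 \le \sigma \le Q} \Big\{ C_N^f(\phi) + C_N^s(\sigma) + \mathrm{E}\big[ H_{N+1}\big(x_N + s_{N-1} + \phi - g_N(i_N^1, I_N^2, v_N)\big) \big] \Big\}.
\end{aligned}$$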
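The role of the upper-bound order quantity can be illustrated with a short Python sketch: on a toy last-period problem, searching over a small capped range of fast orders returns the same minimizer and cost as searching over a much larger range. The unit cost `C_FAST`, the cost `H`, the demand law `DEMANDS`, the cap `Q = 4`, and the range of starting positions are all assumptions chosen for illustration. In this toy instance the cap is adequate only for the starting positions examined.

```python
# Illustrative assumptions only; none of these numbers come from the text.
C_FAST = 2.0                   # unit cost of the fast order
DEMANDS = {1: 0.5, 2: 0.5}     # demand value -> probability
Q = 4                          # candidate upper bound on the order size


def H(x):
    """Convex inventory cost: holding 1 per unit, backlog 5 per unit."""
    return 1.0 * max(x, 0) + 5.0 * max(-x, 0)


def best_order(y, cap):
    """Cheapest fast order in {0, ..., cap} from inventory position y."""
    def cost(phi):
        return C_FAST * phi + sum(p * H(y + phi - d) for d, p in DEMANDS.items())
    phi = min(range(cap + 1), key=cost)
    return phi, cost(phi)


for y in range(-2, 6):
    # Capped search next to a search over a far larger range of orders.
    print(y, best_order(y, Q), best_order(y, 200))
```

Restricting the search to a compact set is what makes attainment of the infimum, and hence the construction of minimizing functions, tractable.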

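By Theorem 3.9 in the appendix to this chapter and Theorem 3.1, in view of the discussion leading to (3.10), there exist Borel-measurable functions $\hat\phi_N(x_N, s_{N-1}, i_N^1)$ and $\hat\sigma_N(x_N, s_{N-1}, i_N^1)\ (= 0)$ such that

$$\begin{aligned}
&C_N^f\big(\hat\phi_N(x_N, s_{N-1}, i_N^1)\big) + C_N^s\big(\hat\sigma_N(x_N, s_{N-1}, i_N^1)\big) \\
&\qquad + \mathrm{E}\big[ H_{N+1}\big(x_N + s_{N-1} + \hat\phi_N(x_N, s_{N-1}, i_N^1) - g_N(i_N^1, I_N^2, v_N)\big) \big] \\
&\quad = \inf_{\phi \ge 0,\ \sigma \ge 0} \Big\{ C_N^f(\phi) + C_N^s(\sigma) + \mathrm{E}\big[ H_{N+1}\big(x_N + s_{N-1} + \phi - g_N(i_N^1, I_N^2, v_N)\big) \big] \Big\},
\end{aligned} \tag{3.25}$$

and there exist Borel-measurable functions

$$\big(\hat\phi_\ell(x_\ell, s_{\ell-1}, i_\ell^1),\ \hat\sigma_\ell(x_\ell, s_{\ell-1}, i_\ell^1)\big), \quad 1 \le \ell \le N-1, \tag{3.26}$$

such that

$$\begin{aligned}
&C_\ell^f\big(\hat\phi_\ell(x_\ell, s_{\ell-1}, i_\ell^1)\big) + C_\ell^s\big(\hat\sigma_\ell(x_\ell, s_{\ell-1}, i_\ell^1)\big) \\
&\qquad + \mathrm{E}\big[ V_{\ell+1}\big(x_\ell + \hat\phi_\ell(x_\ell, s_{\ell-1}, i_\ell^1) + s_{\ell-1} - g_\ell(i_\ell^1, I_\ell^2, v_\ell),\ \hat\sigma_\ell(x_\ell, s_{\ell-1}, i_\ell^1),\ I_{\ell+1}^1\big) \big] \\
&\quad = \inf_{\phi \ge 0,\ \sigma \ge 0} \Big\{ C_\ell^f(\phi) + C_\ell^s(\sigma) + \mathrm{E}\big[ V_{\ell+1}\big(x_\ell + \phi + s_{\ell-1} - g_\ell(i_\ell^1, I_\ell^2, v_\ell),\ \sigma,\ I_{\ell+1}^1\big) \big] \Big\}.
\end{aligned} \tag{3.27}$$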

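Define

$$\hat X_1 = x_1, \tag{3.28}$$

$$\hat F_1 = \hat\phi_1(x_1, s_0, i_1^1), \tag{3.29}$$

$$\hat S_1 = \hat\sigma_1(x_1, s_0, i_1^1), \tag{3.30}$$

and

$$\hat X_\ell = \hat X_{\ell-1} + \hat F_{\ell-1} + \hat S_{\ell-2} - g_{\ell-1}(I_{\ell-1}^1, I_{\ell-1}^2, v_{\ell-1}), \tag{3.31}$$

$$\hat F_\ell = \hat\phi_\ell(\hat X_\ell, \hat S_{\ell-1}, I_\ell^1), \tag{3.32}$$

$$\hat S_\ell = \hat\sigma_\ell(\hat X_\ell, \hat S_{\ell-1}, I_\ell^1), \tag{3.33}$$

for $2 \le \ell \le N-1$, where $\hat S_0 = s_0$. Finally, define

$$\hat X_N = \hat X_{N-1} + \hat F_{N-1} + \hat S_{N-2} - g_{N-1}(I_{N-1}^1, I_{N-1}^2, v_{N-1}), \tag{3.34}$$

$$\hat F_N = \hat\phi_N(\hat X_N, \hat S_{N-1}, I_N^1), \tag{3.35}$$

$$\hat S_N = 0. \tag{3.36}$$

Using the dynamic programming equations (3.20)-(3.21), we can prove the following result.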

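THEOREM 3.3 (VERIFICATION THEOREM) Assume that (3.1) and (3.3)-(3.7) hold. Then

$$\big((\hat F_1, \ldots, \hat F_N), (\hat S_1, \ldots, \hat S_N)\big)$$

given in (3.28)-(3.36) is an optimal solution to the problem. That is,

$$H_1(x_1) + \mathrm{E}\Big[ \sum_{\ell=1}^{N} \big( C_\ell^f(\hat F_\ell) + C_\ell^s(\hat S_\ell) + H_{\ell+1}(\hat X_{\ell+1}) \big) \Big] = V_1(x_1, s_0, i_1^1). \tag{3.37}$$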

REMARK 3.4 Theorems 3.2 and 3.3 establish the existence of an optimal nonanticipative policy. That is, there exists a policy in the class of all history-dependent policies whose objective function value equals the value function defined in (3.13), and there exists a nonanticipative policy defined by (3.28)-(3.36) that provides the same value for the objective function.
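The construction in (3.28)-(3.36) and the claim of the verification theorem can be checked numerically on a small instance. The Python sketch below is an illustration under stated assumptions, not the method of the text: the horizon `N`, order cap `Q`, linear costs, cost function `H`, demand function `g` (noise omitted), and independent signal laws `P1` and `P2` are hypothetical, and `order_up_to_two` is an arbitrary comparison rule introduced only for this example. The minimizing pair is found by brute force and rolled forward along every demand-information path. Its expected cost is then compared with the value function.

```python
from functools import lru_cache
from itertools import product

# Illustrative assumptions only; none of these numbers come from the text.
N, Q = 3, 3
C_FAST, C_SLOW = 2.0, 1.0
P1 = {0: 0.5, 1: 0.5}          # law of I^1 (assumed i.i.d.)
P2 = {0: 0.5, 1: 0.5}          # law of I^2 (assumed i.i.d.)


def H(x):
    return 1.0 * max(x, 0) + 5.0 * max(-x, 0)


def g(i1, i2):
    return 1 + i1 + i2


def stage_cost(l, x, s, i1, phi, sigma):
    """Cost of ordering (phi, sigma) in period l, then acting optimally."""
    total = C_FAST * phi + C_SLOW * sigma
    for i2, p2 in P2.items():
        nxt = x + phi + s - g(i1, i2)
        if l == N:
            total += p2 * H(nxt)
        else:
            total += p2 * sum(p1 * V(l + 1, nxt, sigma, j1) for j1, p1 in P1.items())
    return total


@lru_cache(maxsize=None)
def policy(l, x, s, i1):
    """Minimizing pair (phi_hat, sigma_hat), found by brute force."""
    return min(product(range(Q + 1), repeat=2),
               key=lambda a: stage_cost(l, x, s, i1, *a))


@lru_cache(maxsize=None)
def V(l, x, s, i1):
    return H(x) + stage_cost(l, x, s, i1, *policy(l, x, s, i1))


def simulate_cost(rule, x1, s0, i1):
    """Expected total cost of a feedback rule, enumerating all signal paths."""
    def go(l, x, s, i1):
        phi, sigma = rule(l, x, s, i1)
        cost = C_FAST * phi + C_SLOW * sigma
        for i2, p2 in P2.items():
            nxt = x + phi + s - g(i1, i2)          # state update as in (3.31)
            if l == N:
                cost += p2 * H(nxt)
            else:
                cost += p2 * sum(p1 * (H(nxt) + go(l + 1, nxt, sigma, j1))
                                 for j1, p1 in P1.items())
        return cost
    return H(x1) + go(1, x1, s0, i1)


def order_up_to_two(l, x, s, i1):
    """Arbitrary comparison rule: fast-order up to position 2, no slow order."""
    return (max(0, 2 - x - s), 0)


print(V(1, 0, 0, 0), simulate_cost(policy, 0, 0, 0),
      simulate_cost(order_up_to_two, 0, 0, 0))
```

The first two printed numbers should agree up to rounding, which is the analogue of (3.37) on this instance. The third should be no smaller, in line with the inequality proved for arbitrary admissible decisions.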

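Proof of Theorem 3.3. By (3.27) we know that

$$\big((\hat F_1, \ldots, \hat F_N), (\hat S_1, \ldots, \hat S_N)\big) \in \mathcal{A}_1,$$

that is, it is a history-dependent policy. Next, we show that equation (3.37) holds. It suffices to show that for any

$$\big((F_1, \ldots, F_N), (S_1, \ldots, S_N)\big) \in \mathcal{A}_1,$$

we have

$$\begin{aligned}
&H_1(x_1) + \mathrm{E}\Big[ \sum_{\ell=1}^{N} \big( C_\ell^f(F_\ell) + C_\ell^s(S_\ell) + H_{\ell+1}(X_{\ell+1}) \big) \Big] \\
&\quad \ge H_1(x_1) + \mathrm{E}\Big[ \sum_{\ell=1}^{N} \big( C_\ell^f(\hat F_\ell) + C_\ell^s(\hat S_\ell) + H_{\ell+1}(\hat X_{\ell+1}) \big) \Big],
\end{aligned} \tag{3.38}$$

where $X_\ell$ $(1 \le \ell \le N)$ is defined to be the same as $\hat X_\ell$ in (3.31)-(3.33) and (3.34)-(3.35) with the exception that $\hat F_\ell$ and $\hat S_\ell$ are replaced by $F_\ell$ and $S_\ell$, respectively. By the definition of $(\hat F_1, \hat S_1)$ and (3.27), it is possible to obtain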

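$$C_1^f(\hat F_1) + C_1^s(\hat S_1) + \mathrm{E}\big[ V_2(\hat X_2, \hat S_1, I_2^1) \big] \le C_1^f(F_1) + C_1^s(S_1) + \mathrm{E}\big[ V_2(X_2, S_1, I_2^1) \big]. \tag{3.39}$$

Furthermore, from the history-dependent property of the decisions, we know that $F_1$, $S_1$, $\hat F_1$, and $\hat S_1$ are constants and that $(F_2, S_2)$ and $(\hat F_2, \hat S_2)$ are dependent on $\{I_1^2, I_2^1\}$. Thus by (3.27),

$$\begin{aligned}
V_2(X_2, S_1, I_2^1)
&= H_2(X_2) + \inf_{\phi \ge 0,\ \sigma \ge 0} \Big\{ C_2^f(\phi) + C_2^s(\sigma) \\
&\qquad + \mathrm{E}\big[ V_3\big(X_2 + \phi + S_1 - g_2(I_2^1, I_2^2, v_2),\ \sigma,\ I_3^1\big) \,\big|\, (I_1^2, I_2^1) \big] \Big\} \\
&\le H_2(X_2) + C_2^f(F_2) + C_2^s(S_2) + \mathrm{E}\big[ V_3(X_3, S_2, I_3^1) \,\big|\, (I_1^2, I_2^1) \big],
\end{aligned} \tag{3.40}$$

and

$$V_2(\hat X_2, \hat S_1, I_2^1) = H_2(\hat X_2) + C_2^f(\hat F_2) + C_2^s(\hat S_2) + \mathrm{E}\big[ V_3(\hat X_3, \hat S_2, I_3^1) \,\big|\, (I_1^2, I_2^1) \big]. \tag{3.41}$$

Therefore, it follows from (3.41) that

$$\mathrm{E}\big[ V_2(\hat X_2, \hat S_1, I_2^1) \big] = \mathrm{E}\Big\{ H_2(\hat X_2) + C_2^f(\hat F_2) + C_2^s(\hat S_2) + \mathrm{E}\big[ V_3(\hat X_3, \hat S_2, I_3^1) \,\big|\, (I_1^2, I_2^1) \big] \Big\}, \tag{3.42}$$

and from (3.40) that

$$\mathrm{E}\big[ V_2(X_2, S_1, I_2^1) \big] \le \mathrm{E}\Big\{ H_2(X_2) + C_2^f(F_2) + C_2^s(S_2) + \mathrm{E}\big[ V_3(X_3, S_2, I_3^1) \,\big|\, (I_1^2, I_2^1) \big] \Big\}. \tag{3.43}$$

Combining (3.39) and (3.42)-(3.43) yields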

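$$\begin{aligned}
&H_1(x_1) + \mathrm{E}\Big[ \sum_{\ell=1}^{2} \big( C_\ell^f(\hat F_\ell) + C_\ell^s(\hat S_\ell) \big) + H_2(\hat X_2) \Big] + \mathrm{E}\big[ V_3(\hat X_3, \hat S_2, I_3^1) \big] \\
&\quad \le H_1(x_1) + \mathrm{E}\Big[ \sum_{\ell=1}^{2} \big( C_\ell^f(F_\ell) + C_\ell^s(S_\ell) \big) + H_2(X_2) \Big] + \mathrm{E}\big[ V_3(X_3, S_2, I_3^1) \big].
\end{aligned}$$

Repeating (3.42) and (3.43), we finally prove that (3.38) holds.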
