Multi-objective risk-averse two-stage stochastic programming problems

Çağın Ararat*†   Özlem Çavuş*†   Ali İrfan Mahmutoğulları*

November 15, 2017

* Bilkent University, Department of Industrial Engineering, Ankara, Turkey.
† Ç. Ararat and Ö. Çavuş contributed equally to this work.
Abstract
We consider a multi-objective risk-averse two-stage stochastic programming problem with a multivariate convex risk measure. We suggest a convex vector optimization formulation with set-valued constraints and propose an extended version of Benson's algorithm to solve this problem. Using Lagrangian duality, we develop scenario-wise decomposition methods to solve the two scalarization problems appearing in Benson's algorithm. Then, we propose a procedure to recover the primal solutions of these scalarization problems from the solutions of their Lagrangian dual problems. Finally, we test our algorithms on a multi-asset portfolio optimization problem under transaction costs.
Keywords and phrases: multivariate risk measure, multi-objective risk-averse two-stage stochastic programming, risk-averse scalarization problems, convex Benson algorithm, nonsmooth optimization, bundle method, scenario-wise decomposition
Mathematics Subject Classification (2010): 49M27, 90C15, 90C25, 90C29, 91B30.
1 Introduction

We consider a multi-objective risk-averse two-stage stochastic programming problem of the general form
\[
\min\ z \quad \text{w.r.t. } \mathbb{R}^J_+ \qquad \text{s.t. } z\in\mathcal{R}(Cx+Qy),\ (x, y)\in\mathcal{X},\ z\in\mathbb{R}^J.
\]
In this formulation, x is the first-stage decision variable, y is the second-stage decision variable and X is a compact finite-dimensional set defined by linear constraints. C, Q are cost parameters which are matrices of appropriate dimensions. We assume that C is deterministic and Q is random. R(·) is a multivariate convex risk measure, which is a set-valued mapping from the space of J-dimensional random vectors into the power set of R^J (see Hamel and Heyde (2010)). In other words, R(Cx+Qy) is the set of deterministic cost vectors z ∈ R^J for which Cx+Qy−z becomes acceptable in a certain sense.
The above problem is a vector optimization problem and solving it is understood as computing the upper image P of the problem defined by
\[
\mathcal{P} = \operatorname{cl}\left\{ z\in\mathbb{R}^J \mid z\in\mathcal{R}(Cx+Qy),\ (x, y)\in\mathcal{X} \right\},
\]
whose boundary is the so-called efficient frontier. Here, cl denotes the closure operator. One would be interested in finding a set Z of weakly efficient solutions (x, y, z) with z ∈ R(Cx+Qy) for some (x, y) ∈ X such that there is no z′ ∈ R(Cx′+Qy′) with (x′, y′) ∈ X and z′ < z. Here, "<" denotes the componentwise strict order in R^J. The z components of these solutions are on the efficient frontier. In addition, the set Z is supposed to construct P in the sense that
\[
\mathcal{P} = \operatorname{cl}\operatorname{co}\left( \left\{ z\in\mathbb{R}^J \mid (x, y, z)\in\mathcal{Z} \right\} + \mathbb{R}^J_+ \right),
\]
where co denotes the convex hull operator. Our aim is to compute P approximately using a finite set of
weakly efficient solutions.
Algorithms for computing upper images of vector optimization problems are extensively studied in the literature. A seminal contribution in this field is the algorithm for linear vector optimization problems by Benson (1998), which computes the set of all weakly efficient solutions of the problem and works on an outer approximation of the upper image rather than the feasible region itself. Benson's algorithm has been generalized recently in Ehrgott et al. (2011) and Löhne et al. (2014) for (ordinary) convex vector optimization problems, namely, optimization problems with a vector-valued objective function and a vector-valued constraint that are convex with respect to certain underlying cones, e.g., the positive orthants in the respective dimensions. While the algorithm in Ehrgott et al. (2011) relies on the differentiability of the involved functions, Löhne et al. (2014) makes no assumption on differentiability and obtains finer approximations of the upper image by making use of the so-called geometric dual problem.
In the literature, there is a limited number of studies on multi-objective two-stage stochastic optimization problems. Some examples of these studies are Abbas and Bellahcene (2000), Cardona et al. (2011), where the decision maker is risk-neutral, that is, one takes R(Cx+Qy) = E[Cx+Qy] + R^J_+. In principle, multi-objective risk-neutral two-stage stochastic optimization problems with linear constraints and continuous variables can be formulated as linear vector optimization problems and they can be solved using the algorithm in Benson (1998). If the number of scenarios is not too large, then the problem can be solved in reasonable computation time. Otherwise, one should look for an efficient method, generally based on scenario decompositions.
To the best of our knowledge, for the risk-averse case, there is no study on multi-objective two-stage stochastic programming problems. However, single-objective mean-risk type problems can be seen as scalarizations of two-objective stochastic programming problems (see, for instance, Ahmed (2006), Miller and Ruszczyński (2011)). On the other hand, Dentcheva and Wolfhagen (2016), Noyan et al. (2017) work on single-objective problems with multivariate stochastic ordering constraints. As pointed out in the recent survey Gutjahr and Pichler (2016), there is a need for a general methodology for the formulation and solution of multi-objective risk-averse stochastic problems.
The main contributions of the present study can be summarized as follows:
1. To the best of our knowledge, this is the first study focusing on multi-objective risk-averse two-stage stochastic programming problems in a general setting.
2. We propose a vector optimization formulation for our problem using multivariate convex risk measures. Such risk measures include, but are not limited to, multivariate coherent risk measures and multivariate utility-based risk measures.
3. To solve our problem, we suggest an extended version of the convex Benson algorithm in Löhne et al. (2014) that is developed for a convex vector optimization problem with a vector-valued constraint. Different from Löhne et al. (2014), we deal with set-valued risk constraints and dualize them using the dual representation of multivariate convex risk measures (see Hamel and Heyde (2010)) and the Lagrange duality for set-valued constraints (see Borwein (1981)).
4. The convex Benson algorithm in Löhne et al. (2014) cannot be used for some multivariate risk measures, specifically, for higher-order nonsmooth risk measures. On the other hand, our method is general and can be used for any risk measure for which subgradients can be calculated. An example of such risk measures is the higher-order mean semideviation (see Shapiro et al. (2009) and the references therein).
5. Two risk-averse two-stage stochastic scalarization problems, namely, the problem of weighted sum scalarization and the problem of scalarization by a reference variable, have to be solved during the procedure of the convex Benson algorithm. As the number of scenarios gets larger, these problems cannot be solved in reasonable computation time. Therefore, based on Lagrangian duality, we propose scenario-wise decomposable dual problems for these scalarization problems and suggest a solution procedure based on the bundle algorithm (see Lemaréchal (1978), Ruszczyński (2006) and the references therein).
6. We propose a procedure to recover the primal solutions of the scalarization problems from the solutions of their Lagrangian dual problems.
The rest of the paper is organized as follows: In Section 2, we provide some preliminary definitions and results for multivariate convex risk measures. In Section 3, we provide the problem formulation and recall the related notions of optimality. Section 4 is devoted to the convex Benson algorithm. The two scalarization problems in this algorithm are treated separately in Section 5. In particular, we propose scenario-wise decomposition algorithms and procedures to recover primal solutions. Computational results are provided in Section 6. Some proofs related to Section 5 are collected in the appendix.
2 Multivariate convex risk measures

We work on a finite probability space Ω = {1, . . . , I} with I ≥ 2. For each i ∈ Ω, let p_i > 0 be the probability of the elementary event {i} so that \(\sum_{i\in\Omega} p_i = 1\).
Let us introduce the notation for (random) vectors and matrices. Let J ≥ 1 be a given integer and
J = {1, . . . , J}. R^J_+ and R^J_{++} denote the sets of all elements of the Euclidean space R^J whose components are nonnegative and positive, respectively. For w = (w_1, . . . , w_J)^T, z = (z_1, . . . , z_J)^T ∈ R^J, their scalar product and Hadamard product are defined as
\[
w^\top z = \sum_{j\in\mathcal{J}} w_j z_j \in \mathbb{R}, \qquad w\cdot z = (w_1 z_1, \dots, w_J z_J)^\top \in \mathbb{R}^J,
\]
respectively. For a set Z ⊆ R^J, its associated indicator function (in the sense of convex analysis) is defined by
\[
I_{\mathcal{Z}}(z) = \begin{cases} 0 & \text{if } z \in \mathcal{Z}, \\ +\infty & \text{else}, \end{cases}
\]
for each z ∈ R^J. We denote by L^J the set of all J-dimensional random cost vectors u = (u^1, . . . , u^J)^T, which is clearly isomorphic to the space R^{J×I} of J×I-dimensional real matrices. We write L = L^1 for J = 1. For u ∈ L^J, we denote by u_i = (u^1_i, . . . , u^J_i)^T ∈ R^J its realization at i ∈ Ω, and define the expected value of u as
\[
\mathbb{E}[u] = \sum_{i\in\Omega} p_i u_i \in \mathbb{R}^J.
\]
Similarly, given another integer N ≥ 1, we denote by L^{J×N} the set of all J×N-dimensional random matrices Q with realizations Q_1, . . . , Q_I.
The elements of L^J will be used to denote random cost vectors; hence, lower values are preferable. To that end, we introduce L^J_+, the set of all elements in L^J whose components are nonnegative random variables. Given u, v ∈ L^J, we write u ≤ v if and only if u^j_i ≤ v^j_i for every i ∈ Ω and j ∈ J, that is, v ∈ u + L^J_+. We call a set-valued function R: L^J → 2^{R^J} a multivariate convex risk measure if it satisfies the following axioms (see Hamel and Heyde (2010)):
(A1) Monotonicity: u ≤ v implies R(u) ⊇ R(v) for every u, v ∈ L^J.
(A2) Translativity: R(u + z) = R(u) + z for every u ∈ L^J and z ∈ R^J.
(A3) Finiteness: R(u) ∉ {∅, R^J} for every u ∈ L^J.
(A4) Convexity: R(γu + (1−γ)v) ⊇ γR(u) + (1−γ)R(v) for every u, v ∈ L^J, γ ∈ (0, 1).
(A5) Closedness: The acceptance set A := {u ∈ L^J | 0 ∈ R(u)} of R is a closed set.
A multivariate convex risk measure R is called coherent if it also satisfies the following axiom:
(A6) Positive homogeneity: R(γu) = γR(u) for every u ∈ L^J and γ > 0.
Remark 2.1. It is easy to check that the values of a multivariate convex risk measure R are in the collection of all closed convex upper subsets of R^J, that is,
\[
\mathcal{G} = \left\{ E \subseteq \mathbb{R}^J \mid E = \operatorname{cl}\operatorname{co}(E + \mathbb{R}^J_+) \right\},
\]
where cl and co denote the closure and convex hull operators, respectively. In other words, for every u ∈ L^J, the set R(u) is a closed convex set with the property R(u) = R(u) + R^J_+. The collection G, when equipped with the superset relation ⊇, is a complete lattice in the sense that every nonempty subset E of G has an infimum (and also a supremum) which is uniquely given by inf E = cl co ∪_{E∈E} E as an element of G (see Example 2.13 in Hamel et al. (2016)). The complete lattice property of G makes it possible to study optimization problems with G-valued objective functions and constraints, as will also be crucial in the approach of the present paper.
A multivariate convex risk measure R can be represented in terms of vectors μ of probability measures and weight vectors w in the cone R^J_+\{0}, which is called its dual representation. To state this representation, we provide the following definitions and notation.
Let M^J_1 be the set of all J-dimensional vectors μ = (μ^1, . . . , μ^J) of probability measures on Ω, that is, for each j ∈ J, the probability measure μ^j assigns the probability μ^j_i to the elementary event {i} for i ∈ Ω. For μ ∈ M^J_1 and i ∈ Ω, we also write μ_i := (μ^1_i, . . . , μ^J_i)^T ∈ R^J. Finally, for μ ∈ M^J_1 and u ∈ L^J, we define the expectation of u under μ by
\[
\mathbb{E}^{\mu}[u] = \left( \mathbb{E}^{\mu^1}[u^1], \dots, \mathbb{E}^{\mu^J}[u^J] \right)^\top = \sum_{i\in\Omega} \mu_i \cdot u_i.
\]
A multivariate convex risk measure R has the following dual representation (see Theorem 6.1 in Hamel and Heyde (2010)): for every u ∈ L^J,
\[
\mathcal{R}(u) = \bigcap_{\mu\in\mathcal{M}^J_1,\ w\in\mathbb{R}^J_+\setminus\{0\}} \left( \mathbb{E}^{\mu}[u] + \left\{ z\in\mathbb{R}^J \mid w^\top z \ge -\beta(\mu, w) \right\} \right)
= \bigcap_{w\in\mathbb{R}^J_+\setminus\{0\}} \left\{ z\in\mathbb{R}^J \;\middle|\; w^\top z \ge \sup_{\mu\in\mathcal{M}^J_1} \left( w^\top\mathbb{E}^{\mu}[u] - \beta(\mu, w) \right) \right\},
\]
where β is the minimal penalty function of R defined by
\[
\beta(\mu, w) = \sup_{u\in\mathcal{A}} w^\top\mathbb{E}^{\mu}[u] = \sup\left\{ w^\top\mathbb{E}^{\mu}[u] \mid 0\in\mathcal{R}(u),\ u\in L^J \right\}, \tag{2.1}
\]
for each μ ∈ M^J_1, w ∈ R^J_+\{0}. Note that β(·, w) and β(μ, ·) are convex functions as they are suprema of linear functions.
The scalarization of R by a weight vector w ∈ R^J_+\{0} is defined as the function
\[
u \mapsto \varphi_w(u) := \inf_{z\in\mathcal{R}(u)} w^\top z \tag{2.2}
\]
on L^J. As an immediate consequence of the dual representation of R, we also obtain a dual representation for its scalarization:
\[
\varphi_w(u) = \sup_{\mu\in\mathcal{M}^J_1} \left( w^\top\mathbb{E}^{\mu}[u] - \beta(\mu, w) \right). \tag{2.3}
\]
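As a simple special case for illustration, in the risk-neutral setting R(u) = E[u] + R^J_+ mentioned in the introduction, (2.2) reduces to
\[
\varphi_w(u) = \inf_{z\in\mathbb{E}[u]+\mathbb{R}^J_+} w^\top z = w^\top\mathbb{E}[u], \qquad w\in\mathbb{R}^J_+\setminus\{0\},
\]
which agrees with (2.3): the supremum there is attained at μ = (p, . . . , p), where the minimal penalty function vanishes.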
Some examples of multivariate coherent and convex risk measures are the multivariate conditional value-at-risk (multivariate CVaR) and the multivariate entropic risk measure, respectively.
Example 2.2 (Multivariate CVaR). Let C ⊆ R^J be a polyhedral closed convex cone with R^J_+ ⊆ C ≠ R^J. The multivariate conditional value-at-risk is defined by
\[
\mathcal{R}(u) = \left( \mathrm{CVaR}_{\nu^1}(u^1), \dots, \mathrm{CVaR}_{\nu^J}(u^J) \right)^\top + C, \tag{2.4}
\]
where
\[
\mathrm{CVaR}_{\nu^j}(u^j) = \inf_{z^j\in\mathbb{R}} \left\{ z^j + \frac{1}{1-\nu^j}\,\mathbb{E}\!\left[ (u^j - z^j)_+ \right] \right\},
\]
for each u ∈ L^J and j ∈ J (see Definition 2.1 and Remark 2.3 in Hamel et al. (2013)). Here, ν^j ∈ (0, 1) is a risk-aversion parameter and (x)_+ := max{x, 0} for x ∈ R. The minimal penalty function of R is given by
\[
\beta(\mu, w) = \begin{cases} 0 & \text{if } w\in C^+ \text{ and } \dfrac{\mu^j_i}{p_i} \le \dfrac{1}{1-\nu^j},\ \forall i\in\Omega,\ j\in\mathcal{J}, \\ +\infty & \text{else}, \end{cases}
\]
where C^+ is the positive dual cone of C defined by
\[
C^+ = \left\{ w\in\mathbb{R}^J \mid w^\top z \ge 0,\ \forall z\in C \right\}.
\]
Note that (2.4) is the multivariate extension of the well-known conditional value-at-risk (see Rockafellar and Uryasev (2000), Rockafellar and Uryasev (2002)).
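On a finite probability space, the univariate CVaR appearing in (2.4) can be evaluated directly from the infimum above, since the function being minimized is piecewise linear in z^j with breakpoints at the realizations. The following sketch (ours, for illustration only; it is not the implementation used in Section 6) does exactly that.

```python
import numpy as np

def cvar(u, p, nu):
    """CVaR_nu of a discrete random cost u with probabilities p, via the
    Rockafellar-Uryasev formula  CVaR_nu(u) = inf_z { z + E[(u - z)_+] / (1 - nu) }.
    For a finite distribution the infimum is attained at one of the realizations,
    so scanning them is enough."""
    u = np.asarray(u, dtype=float)
    p = np.asarray(p, dtype=float)
    return min(z + p @ np.maximum(u - z, 0.0) / (1.0 - nu) for z in u)

# toy two-asset example: the componentwise CVaR vector appearing in (2.4)
p = np.array([0.3, 0.5, 0.2])                  # scenario probabilities
u1 = np.array([-0.02, 0.01, 0.05])             # random cost of asset 1
u2 = np.array([0.03, -0.01, 0.00])             # random cost of asset 2
print(cvar(u1, p, nu=0.8), cvar(u2, p, nu=0.9))
```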
Example 2.3 (Multivariate entropic risk measure). Consider the vector-valued exponential utility function U: R^J → R^J defined by
\[
U(x) = (U^1(x^1), \dots, U^J(x^J))^\top,
\]
where
\[
U^j(x^j) = \frac{1 - e^{\delta^j x^j}}{\delta^j},
\]
for each x ∈ R^J and j ∈ J. Here, δ^j > 0 is a risk-aversion parameter. Note that U^j(·) is a concave decreasing function. Let C ⊆ R^J be a polyhedral closed convex cone with R^J_+ ⊆ C ≠ R^J. The multivariate entropic risk measure R: L^J → 2^{R^J} is defined as
\[
\mathcal{R}(u) = \left\{ z\in\mathbb{R}^J \mid \mathbb{E}[U(u - z)] \in C \right\}, \tag{2.5}
\]
for each u ∈ L^J (see Section 4.1 in Ararat et al. (2017)). Since R^J_+ ⊆ C, larger values of the expected utility are preferred. Moreover, as each U^j(·) is a decreasing function, z ∈ R(u) implies z′ ∈ R(u) for every z′ ≥ z.
Finally, the minimal penalty function of R is given by (see Proposition 4.4 in Ararat et al. (2017))
\[
\beta(\mu, w) = \sum_{j\in\mathcal{J}} \frac{w^j}{\delta^j}\left( H(\mu^j\,\|\,p) - 1 + \log w^j \right) + \inf_{s\in C^+} \sum_{j\in\mathcal{J}} \frac{1}{\delta^j}\left( s^j - w^j\log s^j \right),
\]
where H(μ^j‖p) is the relative entropy of μ^j with respect to p defined by
\[
H(\mu^j\,\|\,p) = \sum_{i\in\Omega} \mu^j_i \log\!\left( \frac{\mu^j_i}{p_i} \right).
\]
Note that (2.5) is the multivariate extension of the well-known entropic risk measure (see Föllmer and Schied (2002)).
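Similarly, the acceptability condition in (2.5) is easy to check numerically once the cone C is given by finitely many generators. The following sketch (ours; the scenario data are hypothetical and the cone generators are borrowed from Example 6.3) tests whether a cost reduction z belongs to R(u) for a two-dimensional entropic risk measure.

```python
import numpy as np

def exp_utility(x, delta):
    # componentwise U^j(x^j) = (1 - exp(delta^j * x^j)) / delta^j, cf. Example 2.3
    return (1.0 - np.exp(delta * x)) / delta

def in_cone(v, generators):
    # C = cone{g1, g2} in R^2: v in C iff v = a1*g1 + a2*g2 with a1, a2 >= 0
    coeffs = np.linalg.solve(np.column_stack(generators), v)
    return bool(np.all(coeffs >= -1e-9))

p = np.array([0.25, 0.25, 0.25, 0.25])                            # scenario probabilities
u = np.array([[0.4, -0.1], [0.2, 0.3], [-0.2, 0.1], [0.0, 0.0]])  # realizations u_i
delta = np.array([0.1, 0.1])                                      # risk-aversion parameters
C = [np.array([2.0, 1.0]), np.array([1.0, 2.0])]                  # generators of C

def acceptable(z):
    # z in R(u)  <=>  E[U(u - z)] in C
    return in_cone(p @ exp_utility(u - z, delta), C)

print(acceptable(np.array([0.5, 0.5])), acceptable(np.array([-1.0, -1.0])))
```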
3 Problem formulation

We consider a multi-objective risk-averse two-stage stochastic programming problem. The decision variables and the parameters of the problem consist of deterministic and random vectors and matrices of different dimensions. To that end, let us fix some integers J, K, L, M, N ≥ 1 and deterministic parameters A ∈ R^{K×M} and b ∈ R^K. At the first stage, the decision-maker chooses a deterministic vector x ∈ R^M_+ with associated cost Cx, where C ∈ R^{J×M}. At the second stage, the decision-maker chooses a random vector y ∈ L^N with associated random cost Qy, where Q ∈ L^{J×N}.
Given feasible choices of the decision variables x ∈ R^M and y ∈ L^N, the risk associated with the second-stage cost vector Qy ∈ L^J is quantified via a multivariate convex risk measure R: L^J → 2^{R^J}. The set R(Qy) consists of the deterministic cost vectors in R^J that can make Qy acceptable in the following sense:
\[
\mathcal{R}(Qy) = \left\{ z\in\mathbb{R}^J \mid Qy - z \in \mathcal{A} \right\},
\]
where A = {u ∈ L^J | 0 ∈ R(u)} is the acceptance set of the risk measure. Hence, R(Qy) collects the deterministic cost reductions from Qy that would yield an acceptable level of risk for the resulting random cost. Together with the deterministic cost vector Cx, the overall risk associated with x and y is given by the set
\[
Cx + \mathcal{R}(Qy) = \{Cx + z \mid Qy - z \in \mathcal{A}\} = \{Cx + z \mid z\in\mathcal{R}(Qy)\} = \mathcal{R}(Cx + Qy),
\]
where the last equality holds thanks to the translativity property (A2).
Our aim is to calculate the "minimal" vectors z ∈ R(Cx+Qy) over all feasible choices of x and y. Using vector optimization, we formulate our problem as follows:
\[
\begin{aligned}
\min\ & z \quad \text{w.r.t. } \mathbb{R}^J_+ & \text{(PV)}\\
\text{s.t. } & z \in \mathcal{R}(Cx+Qy)\\
& Ax = b\\
& T_i x + W_i y_i = h_i, \quad \forall i\in\Omega\\
& z\in\mathbb{R}^J,\ x\in\mathbb{R}^M_+,\ y_i\in\mathbb{R}^N_+, \quad \forall i\in\Omega.
\end{aligned}
\]
Let
\[
\mathcal{X} := \left\{ (x, y)\in\mathbb{R}^M_+\times L^N_+ \mid Ax = b,\ T_i x + W_i y_i = h_i,\ \forall i\in\Omega \right\}.
\]
We assume that X is a compact set. Let us denote by R the image of the feasible region of (PV) under the objective function, that is,
\[
\mathcal{R} = \left\{ z\in\mathbb{R}^J \mid z\in\mathcal{R}(Cx+Qy),\ (x, y)\in\mathcal{X} \right\} = \bigcup_{(x, y)\in\mathcal{X}} \mathcal{R}(Cx+Qy).
\]
The upper image of (PV) is defined as the set
\[
\mathcal{P} = \operatorname{cl}\mathcal{R} = \operatorname{cl}\bigcup_{(x, y)\in\mathcal{X}} \mathcal{R}(Cx+Qy). \tag{3.1}
\]
In particular, we have P ∈ G, that is, P is a closed convex upper set; see Remark 2.1.
Finding the "minimal" z vectors of (PV) is understood as computing the boundary of the set P. For completeness, we recall the minimality notions for (PV).
Definition 3.1. A point (x, y, z) ∈ X × R^J is called a weak minimizer (weakly efficient solution) of (PV) if z ∈ R(Cx+Qy) and z is a weakly minimal element of R, that is, there exists no z′ ∈ R such that z ∈ z′ + R^J_{++}.
Definition 3.2. (Definition 3.2 in Löhne et al. (2014)) A set Z ⊆ X × R^J is called a weak solution of (PV) if the following conditions are satisfied:
1. Infimality: it holds cl co({z ∈ R^J | (x, y, z) ∈ Z} + R^J_+) = P,
2. Minimality: each (x, y, z) ∈ Z is a weak minimizer of (PV).
Ideally, one would be interested in computing a weak solution Z of (PV). However, except for some special cases (e.g. when the values of R and the upper image P are polyhedral sets), such Z consists of infinitely many feasible points, that is, it is impossible to recover P using only finitely many values of R. Therefore, our aim is to propose algorithms to compute P approximately through finitely many feasible points.
Definition 3.3. (Definition 3.3 in Löhne et al. (2014)) Let ε > 0. A nonempty finite set Z̄ ⊆ X × R^J is called a finite weak ε-solution of (PV) if the following conditions are satisfied:
1. ε-Infimality: it holds co({z ∈ R^J | (x, y, z) ∈ Z̄} + R^J_+) − ε1 ⊇ P,
2. Minimality: each (x, y, z) ∈ Z̄ is a weak minimizer of (PV).
As noted in Löhne et al. (2014), a finite weak ε-solution Z̄ provides an outer and an inner approximation of P in the sense that
\[
\operatorname{co}\left( \{ z\in\mathbb{R}^J \mid (x, y, z)\in\bar{\mathcal{Z}} \} + \mathbb{R}^J_+ \right) - \varepsilon\mathbf{1} \supseteq \mathcal{P} \supseteq \operatorname{co}\left( \{ z\in\mathbb{R}^J \mid (x, y, z)\in\bar{\mathcal{Z}} \} + \mathbb{R}^J_+ \right). \tag{3.2}
\]
Let us also introduce the weighted sum scalarization problem with weight vector w ∈ R^J_+\{0}:
\[
\min\ w^\top z \quad \text{s.t. } z\in\mathcal{R}(Cx+Qy),\ (x, y)\in\mathcal{X}. \tag{P1(w)}
\]
Define P1(w) as the optimal value of (P1(w)). For the remainder of this section, we provide a discussion on the existence of optimal solutions of (P1(w)) as well as the relationship between (P1(w)) and (PV).
Proposition 3.4. Let w ∈ R^J_+\{0}. Then, there exists an optimal solution (x, y, z) of (P1(w)).
Proof. Note that P1(w) = inf_{(x, y)∈X} φ_w(Cx+Qy), where φ_w(·) is the scalarization of R by w as defined in (2.2). Since φ_w(·) admits the dual representation in (2.3), it is a lower semicontinuous function on L^J. Moreover, X is a compact set by assumption. By Theorem 2.43 in Aliprantis and Border (2006), it follows that an optimal solution of (P1(w)) exists.
Remark 3.5. Note that the feasible region {(x, y, z) ∈ X × R^J | z ∈ R(Cx+Qy)} of (PV) is not compact in general due to the multivariate risk measure R, which has unbounded values. However, in Löhne et al. (2014), the feasible region of a vector optimization problem is assumed to be compact. Therefore, by assuming only X to be compact, Proposition 3.4 generalizes the analogous result in Löhne et al. (2014).
The following proposition is stated in Löhne et al. (2014) without a proof. It can be shown as a direct application of Theorem 5.28 in Jahn (2004).
Proposition 3.6. (Proposition 3.4 in Löhne et al. (2014)) Let w ∈ R^J_+\{0}. Every optimal solution (x, y, z) of (P1(w)) is a weak minimizer of (PV).
Proposition 3.6 implies that, in the weak sense, solving (PV) is understood as solving the family (P1(w))_{w∈R^J_+\{0}} of weighted sum scalarizations.
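As a toy illustration of this principle (with randomly generated data, purely for intuition), sweeping weight vectors over the unit simplex and keeping the weighted-sum minimizers collects weakly efficient points of a finite set of cost vectors:

```python
import numpy as np

rng = np.random.default_rng(1)
Z = rng.uniform(0.0, 1.0, size=(200, 2))       # candidate cost vectors in R^2

weak_minimizers = set()
for lam in np.linspace(0.0, 1.0, 101):         # w = (lam, 1 - lam) on the simplex
    w = np.array([lam, 1.0 - lam])
    weak_minimizers.add(int(np.argmin(Z @ w))) # optimal index of the weighted sum

print(sorted(weak_minimizers))                 # indices of weighted-sum minimizers
```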
4 Convex Benson algorithms for (PV)

The convex Benson algorithms have a primal and a dual variant. While the primal approximation algorithm computes a sequence of outer approximations for the upper image P in the sense of (3.2), the dual approximation algorithm works on an associated vector maximization problem, called the geometric dual problem. To explain the details of these algorithms, we should define the concept of geometric duality as well as a new scalarization problem (P2(v)), called the problem of scalarization by a reference variable v ∈ R^J.
4.1 The problem of scalarization by a reference variable

The problem (P2(v)) is required to find the minimum step-length to enter the upper image P from a point v ∈ R^J \ P along the direction 1 = (1, . . . , 1)^T ∈ R^J. It is formulated as
\[
\min\ \alpha \quad \text{s.t. } v + \alpha\mathbf{1} \in \mathcal{R}(Cx+Qy),\ (x, y)\in\mathcal{X},\ \alpha\in\mathbb{R}. \tag{P2(v)}
\]
Note that (P2(v)) is a scalar convex optimization problem with a set-valued constraint. We denote by P2(v) the optimal value of (P2(v)). We relax the set-valued constraint v + α1 ∈ R(Cx+Qy) in a Lagrangian fashion and obtain the following dual problem using the results of Section 3.2 in Borwein (1981):
\[
\underset{\gamma\in\mathbb{R}^J_+}{\text{maximize}} \quad \inf_{(x, y)\in\mathcal{X},\ \alpha\in\mathbb{R}} \left( \alpha + \inf_{z\in\mathcal{R}(Cx+Qy)-v-\alpha\mathbf{1}} \gamma^\top z \right). \tag{LD2(v)}
\]
Note that (LD2(v)) is constructed by rewriting the risk constraint of (P2(v)) as 0 ∈ R(Cx+Qy) − v − α1 and calculating the support function of the set R(Cx+Qy) − v − α1 by the dual variable γ ∈ R^J_+. The next proposition states the strong duality relationship between (P2(v)) and (LD2(v)).
Proposition 4.1. (Theorem 19 and Equation (3.23) in Borwein (1981)) Let v ∈ R^J. Then, there exist optimal solutions (x_(v), y_(v), α_(v)) of (P2(v)) and γ_(v) of (LD2(v)), and the optimal values of the two problems coincide.
Finally, we recall the relationship between (P2(v)) and (PV). The next proposition is provided without a proof since the proof in Löhne et al. (2014) can be directly applied to our case.
Proposition 4.2. (Proposition 4.5 in Löhne et al. (2014)) Let v ∈ R^J. If (x_(v), y_(v), α_(v)) is an optimal solution of (P2(v)), then (x_(v), y_(v), v + α_(v)1) is a weak minimizer of (PV).
4.2 Geometric duality

Let W be the unit simplex in R^J, that is,
\[
W = \left\{ w\in\mathbb{R}^J_+ \mid w^\top\mathbf{1} = 1 \right\}.
\]
For each j ∈ J, let e^(j) be the jth unit vector in R^J, that is, the jth entry of e^(j) is one and all other entries are zero.
The geometric dual of problem (PV) is defined as the vector maximization problem
\[
\max\ (w_1, \dots, w_{J-1}, \mathrm{P1}(w))^\top \quad \text{w.r.t. } K \quad \text{s.t. } w\in W, \tag{DV}
\]
where K is the so-called ordering cone defined as K = {λ e^(J) | λ ≥ 0}. Similar to the upper image P of (PV), we can define the lower image D of (DV) as
\[
\mathcal{D} := \left\{ (w_1, \dots, w_{J-1}, p)\in\mathbb{R}^J \mid w = (w_1, \dots, w_{J-1}, w_J)\in W,\ p \le \mathrm{P1}(w) \right\}.
\]
Remark 4.3. In analogy with Remark 2.1, the lower image D is a closed convex K-lower set, that is, cl co(D − K) = D.
Next, we state the relationship between D and the optimal solutions of (P1(w)), (P2(v)), (LD2(v)).
Proposition 4.4. (Proposition 3.5 in Löhne et al. (2014)) Let w ∈ W. If (P1(w)) has a finite optimal value P1(w), then (w_1, . . . , w_{J−1}, P1(w))^T is a boundary point of D and it is also a K-maximal element of D, that is, there is no d ∈ D such that d_J > P1(w).
Proposition 4.5. (Propositions 4.6, 4.7 in Löhne et al. (2014)) Let v ∈ R^J. If (x_(v), y_(v), α_(v)) is an optimal solution of (P2(v)) and γ_(v) is an optimal solution of (LD2(v)), then γ_(v) is a maximizer of (DV), that is, (γ^1_(v), . . . , γ^{J−1}_(v), P1(γ_(v)))^T is a K-maximal element of the lower image D. Moreover, {z ∈ R^J | γ_(v)^T z ≥ γ_(v)^T (v + α_(v)1)} is a supporting halfspace of P at the point (v + α_(v)1).
Proposition 4.6. Let w ∈ R^J_+\{0}. If (x_(w), y_(w), z_(w)) is an optimal solution of (P1(w)), then {d ∈ R^J | (z^J_(w) − z^1_(w), . . . , z^J_(w) − z^{J−1}_(w), 1)^T d ≤ z^J_(w)} is a supporting halfspace of D at the point (w_1, . . . , w_{J−1}, P1(w)).
Proof. From Proposition 4.4, d := (w_1, . . . , w_{J−1}, P1(w)) is a boundary point of D. Moreover, it follows that
\[
(z^J_{(w)} - z^1_{(w)}, \dots, z^J_{(w)} - z^{J-1}_{(w)}, 1)^\top d = -w^\top z_{(w)} + z^J_{(w)} + \mathrm{P1}(w) = z^J_{(w)}
\]
since P1(w) = w^T z_(w). Hence, the assertion of the proposition follows.
Proposition 4.7. (a) Let Z̄ be a finite weak ε-solution of (PV). Then,
\[
\mathcal{P}_{\mathrm{in}}(\bar{\mathcal{Z}}) := \operatorname{co}\left( \{ z\in\mathbb{R}^J \mid (x, y, z)\in\bar{\mathcal{Z}} \} + \mathbb{R}^J_+ \right)
\]
is an inner approximation of the upper image P, that is, P_in(Z̄) ⊆ P. Moreover,
\[
\mathcal{D}_{\mathrm{out}}(\bar{\mathcal{Z}}) = \left\{ d\in\mathbb{R}^J \mid (z^J - z^1, \dots, z^J - z^{J-1}, 1)^\top d \le z^J,\ \forall (x, y, z)\in\bar{\mathcal{Z}} \right\}
\]
is an outer approximation of the lower image D, that is, D ⊆ D_out(Z̄).
(b) Let W̄ be a finite ε-solution of (DV). Then,
\[
\mathcal{D}_{\mathrm{in}}(\bar{W}) := \operatorname{co}\left\{ (w_1, \dots, w_{J-1}, \mathrm{P1}(w))^\top \mid w\in\bar{W} \right\} - K
\]
is an inner approximation of D, that is, D_in(W̄) ⊆ D. Moreover,
\[
\mathcal{P}_{\mathrm{out}}(\bar{W}) = \left\{ z\in\mathbb{R}^J \mid w^\top z \ge \mathrm{P1}(w),\ \forall w\in\bar{W} \right\}
\]
is an outer approximation of P, that is, P ⊆ P_out(W̄).
The problems (P1(w)), (P2(v)) and the above propositions form a basis for the primal and dual convex Benson algorithms. These algorithms are explained briefly in the following sections.
4.3 Primal algorithm

The primal algorithm starts with an initial outer approximation P_0 for the upper image P. To construct P_0, for each j ∈ J, the algorithm computes the supporting halfspace of P with direction vector e^(j) by solving the weighted-sum scalarization problem (P1(e^(j))). If (x_(j), y_(j), z_(j)) is an optimal solution of (P1(e^(j))), then this halfspace supports the upper image P at the point z_(j). Then, P_0 is defined as the intersection of these J supporting halfspaces.
The algorithm iteratively obtains a sequence P_0 ⊇ P_1 ⊇ P_2 ⊇ . . . ⊇ P of finer outer approximations and, at the same time, it updates sets Z̄ and W̄ of weak minimizers and maximizers for (PV) and (DV), respectively. At iteration k, the algorithm first computes V_k, that is, the set of all vertices of P_k. For each vertex v ∈ V_k, an optimal solution (x_(v), y_(v), α_(v)) to (P2(v)) is computed. The optimal α_(v) is the minimum step-length required to find a boundary point (v + α_(v)1) of P. Since the triplet (x_(v), y_(v), v + α_(v)1) is a weak minimizer of (PV) by Proposition 4.2, it is added to the set Z̄. Then, an optimal solution γ_(v) of the dual problem (LD2(v)) is computed, which is a maximizer for (DV) (see Proposition 4.5) and is added to the set W̄. This procedure is continued until a vertex v with a step-length greater than an error parameter ε > 0 is detected. For such v, using Proposition 4.5, a supporting halfspace of P at the point (v + α_(v)1) is obtained. The outer approximation is updated as P_{k+1} by intersecting P_k with this supporting halfspace. The algorithm terminates when all the vertices are in ε-distance to the upper image P.
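The vertex enumeration used in this step can be carried out with standard computational geometry tools. The following sketch (ours, not the authors' Matlab implementation) computes the vertices of an outer approximation described by finitely many supporting halfspaces of the form {z : a^T z ≥ b}; since the outer approximation is an unbounded upper set, it is first intersected with a large box.

```python
import numpy as np
from scipy.spatial import HalfspaceIntersection

def vertices(halfspaces_ge, interior_point, box=1e3):
    """Vertices of {z : a^T z >= b for all (a, b)} intersected with z <= box."""
    J = len(interior_point)
    rows = [np.append(-np.asarray(a, float), b) for a, b in halfspaces_ge]  # -a^T z + b <= 0
    for j in range(J):                                                      # z_j - box <= 0
        e = np.zeros(J + 1)
        e[j], e[J] = 1.0, -box
        rows.append(e)
    hs = HalfspaceIntersection(np.array(rows), np.asarray(interior_point, float))
    return hs.intersections

# toy outer approximation in R^2: z1 >= 0, z2 >= 0, z1 + z2 >= 1
print(vertices([([1, 0], 0), ([0, 1], 0), ([1, 1], 1)], interior_point=[1.0, 1.0]))
```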
At the termination, the algorithm computes inner and outer approximations P_in(Z̄), P_out(W̄) for the upper image P and D_in(W̄), D_out(Z̄) for the lower image D using Proposition 4.7. Note that both P_out(W̄) and P_k are outer approximations for P. However, P_out(W̄) is a finer outer approximation than P_k. The reason is that when P_k is updated, only the vertices in more than ε-distance to P are used. On the other hand, all the vertices are considered when calculating P_out(W̄). Furthermore, the algorithm returns a finite weak ε-solution Z̄ to (PV) and a finite ε-solution W̄ to (DV) (see Theorem 4.9 in Löhne et al. (2014)).
The steps of the primal algorithm are provided as Algorithm 1.
Algorithm 1 Primal Approximation Algorithm
1: Compute an optimal solution (x_(j), y_(j), z_(j)) to (P1(e^(j))) for each j ∈ J;
2: Let P_0 = {z ∈ R^J : e^(j)T z ≥ P1(e^(j)), ∀j ∈ J};
3: k ← 0; Z̄ ← {(x_(j), y_(j), z_(j)) | j ∈ J}; W̄ ← {e^(j) | j ∈ J};
4: repeat
5:   M ← R^J;
6:   Compute the set V_k of the vertices of P_k;
7:   for each v ∈ V_k do
8:     Compute an optimal solution (x_(v), y_(v), α_(v)) of (P2(v)) and an optimal solution γ_(v) of (LD2(v));
9:     Z̄ ← Z̄ ∪ {(x_(v), y_(v), v + α_(v)1)}; W̄ ← W̄ ∪ {γ_(v)};
10:    if α_(v) > ε then
11:      M ← M ∩ {z ∈ R^J : γ_(v)^T z ≥ γ_(v)^T (v + α_(v)1)};
12:      break;
13:    end if
14:  end for
15:  if M ≠ R^J then
16:    P_{k+1} ← P_k ∩ M, k ← k + 1;
17:  end if
18: until M = R^J;
19: Compute P_in(Z̄), P_out(W̄), D_in(W̄), D_out(Z̄) as in Proposition 4.7;
20: return Z̄: a finite weak ε-solution to (PV); W̄: a finite ε-solution to (DV); P_in(Z̄), P_out(W̄), D_in(W̄), D_out(Z̄);

4.4 Dual algorithm

The steps of the dual algorithm follow in a way that is similar to the primal algorithm; however, as a major difference, at each iteration, an outer approximation for the dual image D is obtained. Moreover, the dual algorithm solves the weighted sum scalarization problem (P1(w)) at the vertices of the current outer approximation rather than the problem (P2(v)).
At the termination, the algorithm computes inner and outer approximations for the upper image P and lower image D using Proposition 4.7. Furthermore, the algorithm returns a finite weak ε-solution Z̄ to (PV) and a finite ε-solution W̄ to (DV) (see Theorem 4.14 in Löhne et al. (2014)).
The steps of the dual algorithm are provided as Algorithm 2.
Algorithm 2 Dual Approximation Algorithm
1: Compute an optimal solution (x_(η), y_(η), z_(η)) to (P1(η)) for η = (1/J, . . . , 1/J)^T;
2: Let D_0 = {d ∈ R^J | P1(η) ≥ d_J};
3: k ← 0; Z̄ ← {(x_(η), y_(η), z_(η))}; W̄ ← {η};
4: repeat
5:   M ← R^J;
6:   Compute the set V_k of vertices of D_k;
7:   for each t = (t_1, . . . , t_{J−1}, t_J)^T ∈ V_k do
8:     Let w = (t_1, . . . , t_{J−1}, 1 − Σ_{j=1}^{J−1} t_j)^T;
9:     Compute an optimal solution (x_(w), y_(w), z_(w)) to (P1(w));
10:    Z̄ ← Z̄ ∪ {(x_(w), y_(w), z_(w))};
11:    if w ∈ R^J_{++} or t_J − P1(w) ≤ ε then
12:      W̄ ← W̄ ∪ {w};
13:    end if
14:    if t_J − P1(w) > ε then
15:      M ← M ∩ {d ∈ R^J | (z^J_(w) − z^1_(w), . . . , z^J_(w) − z^{J−1}_(w), 1)^T d ≤ z^J_(w)};
16:      break;
17:    end if
18:  end for
19:  if M ≠ R^J then
20:    D_{k+1} ← D_k ∩ M, k ← k + 1;
21:  end if
22: until M = R^J;
23: Compute P_in(Z̄), P_out(W̄), D_in(W̄), D_out(Z̄) as in Proposition 4.7;
24: return Z̄: a finite weak ε-solution to (PV); W̄: a finite ε-solution to (DV); P_in(Z̄), P_out(W̄), D_in(W̄), D_out(Z̄);

5 Scenario decomposition for scalar problems
5.1 The problem of weighted sum scalarization

Let w ∈ R^J_+\{0}. The weighted sum scalarization problem (P1(w)) defined in Section 3 can be rewritten more explicitly as:
\[
\begin{aligned}
\min\ & w^\top z & \text{(P1(w))}\\
\text{s.t. } & z\in\mathcal{R}(Cx+Qy)\\
& Ax = b\\
& T_i x + W_i y_i = h_i, \quad \forall i\in\Omega\\
& z\in\mathbb{R}^J,\ x\in\mathbb{R}^M_+,\ y_i\in\mathbb{R}^N_+, \quad \forall i\in\Omega.
\end{aligned}
\]
We propose a Lagrangian dual reformulation of (P1(w)) whose objective function is scenario-wise decomposable. The details are provided in Section 5.1.1. Based on this dual reformulation, in Section 5.1.2, we propose a dual cutting-plane algorithm for (P1(w)), called the dual bundle method, which provides an optimal dual solution. As the Benson algorithms in Section 4 require an optimal primal solution in addition to an optimal dual solution, in Section 5.1.3, we show that such a primal solution can be obtained from the dual of the so-called master problem in the dual bundle method.
5.1.1 Scenario decomposition
To derive a decomposition algorithm for (P1(w)), we randomize the first-stage variable x ∈ R^M by treating its copies under the different scenarios as separate decision variables; to ensure that the resulting problem is equivalent to the previous one, we add the so-called nonanticipativity constraints
\[
p_i(x_i - \mathbb{E}[x]) = 0, \quad \forall i\in\Omega,
\]
which are equivalent to x_1 = . . . = x_I. Let us introduce
\[
\mathcal{F} := \left\{ (x, y)\in L^M\times L^N \mid (x_i, y_i)\in\mathcal{F}_i,\ \forall i\in\Omega \right\}, \tag{5.1}
\]
where, for each i ∈ Ω,
\[
\mathcal{F}_i := \left\{ (x_i, y_i)\in\mathbb{R}^M_+\times\mathbb{R}^N_+ \mid Ax_i = b,\ T_i x_i + W_i y_i = h_i \right\}.
\]
With this notation and using the nonanticipativity constraints, we may rewrite (P1(w)) as follows:
\[
\begin{aligned}
\min\ & w^\top z & \text{(P}_1'(w)\text{)}\\
\text{s.t. } & z\in\mathcal{R}(Cx+Qy)\\
& p_i(x_i - \mathbb{E}[x]) = 0, \quad \forall i\in\Omega\\
& (x, y)\in\mathcal{F},\ z\in\mathbb{R}^J.
\end{aligned}
\]
Note that the optimal value of (P′1(w)) is P1(w).
The following theorem provides a dual formulation of (P′1(w)) by relaxing the nonanticipativity constraints in a Lagrangian fashion. We call this dual formulation (D1(w)).
Theorem 5.1. It holds
\[
\mathrm{P1}(w) = \sup_{\mu\in\mathcal{M}^J_1,\ \lambda\in L^M} \left\{ \sum_{i\in\Omega} f_i(\mu_i, \lambda_i, w) - \beta(\mu, w) \;\middle|\; \mathbb{E}[\lambda] = 0 \right\}, \tag{D1(w)}
\]
where, for each i ∈ Ω, μ_i ∈ R^J_+, λ_i ∈ R^M,
\[
f_i(\mu_i, \lambda_i, w) := \inf_{(x_i, y_i)\in\mathcal{F}_i} \left( w^\top\left[ \mu_i\cdot(Cx_i + Q_i y_i) \right] + p_i\lambda_i^\top x_i \right), \tag{5.2}
\]
and β is defined by (2.1).
Proof. We may write
\[
\begin{aligned}
\mathrm{P1}(w) &= \inf_{(x, y)\in\mathcal{F},\ z\in\mathbb{R}^J} \left\{ w^\top z \mid z\in\mathcal{R}(Cx+Qy),\ p_i(x_i - \mathbb{E}[x]) = 0,\ \forall i\in\Omega \right\} & (5.3)\\
&= \inf_{(x, y)\in\mathcal{F}} \left\{ \inf_{z\in\mathcal{R}(Cx+Qy)} w^\top z \;\middle|\; p_i(x_i - \mathbb{E}[x]) = 0,\ \forall i\in\Omega \right\} & (5.4)\\
&= \inf_{(x, y)\in\mathcal{F}} \left\{ \sup_{\mu\in\mathcal{M}^J_1} \left( w^\top\mathbb{E}^{\mu}[Cx+Qy] - \beta(\mu, w) \right) \;\middle|\; p_i(x_i - \mathbb{E}[x]) = 0,\ \forall i\in\Omega \right\}, & (5.5)
\end{aligned}
\]
where the passage to the last line is by (2.3). Using the minimax theorem of Sion (1958), we may interchange the infimum and the supremum in the last line. This yields
\[
\mathrm{P1}(w) = \sup_{\mu\in\mathcal{M}^J_1} \left( F(\mu, w) - \beta(\mu, w) \right), \tag{5.6}
\]
where, for each μ ∈ M^J_1,
\[
F(\mu, w) := \inf_{(x, y)\in\mathcal{F}} \left\{ w^\top\mathbb{E}^{\mu}[Cx+Qy] \mid p_i(x_i - \mathbb{E}[x]) = 0,\ \forall i\in\Omega \right\}. \tag{5.7}
\]
To decompose the computation of F(μ, w) by scenario, we dualize the nonanticipativity constraints. The reader is referred to Section 2.4.2 of Shapiro et al. (2009) for the details on the dualization of nonanticipativity constraints. To that end, let us assign Lagrange multipliers λ̃_1, . . . , λ̃_I ∈ R^M for the nonanticipativity constraints. Note that we may consider them as the realizations of a random Lagrange multiplier λ̃ ∈ L^M. By strong duality for linear programming,
\[
F(\mu, w) = \sup_{\tilde{\lambda}\in L^M}\ \inf_{(x, y)\in\mathcal{F}} \ell(x, y, \tilde{\lambda}), \tag{5.8}
\]
where the Lagrangian ℓ is defined by
\[
\ell(x, y, \tilde{\lambda}) := w^\top\mathbb{E}^{\mu}[Cx+Qy] + \sum_{i\in\Omega} p_i\tilde{\lambda}_i^\top(x_i - \mathbb{E}[x]).
\]
The assertion of the theorem follows from (5.6) and (5.8).
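Each f_i in (5.2) is the optimal value of a single-scenario linear program, so the inner minimizations in (D1(w)) can be evaluated independently across scenarios. A minimal sketch of one such scenario subproblem (ours; the data layout and solver choice are illustrative, not the CPLEX-based implementation of Section 6):

```python
import numpy as np
from scipy.optimize import linprog

def scenario_subproblem(mu_i, lam_i, w, p_i, C, Q_i, A, b, T_i, W_i, h_i):
    """Evaluates f_i(mu_i, lam_i, w) in (5.2):
         min  w^T [mu_i . (C x + Q_i y)] + p_i * lam_i^T x
         s.t. A x = b,  T_i x + W_i y = h_i,  x >= 0, y >= 0.
    Returns the optimal value together with an optimal (x, y)."""
    M, N = C.shape[1], Q_i.shape[1]
    c = np.concatenate([(w * mu_i) @ C + p_i * lam_i,   # coefficients of x
                        (w * mu_i) @ Q_i])              # coefficients of y
    A_eq = np.block([[A, np.zeros((A.shape[0], N))],
                     [T_i, W_i]])
    b_eq = np.concatenate([b, h_i])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * (M + N))
    return res.fun, res.x[:M], res.x[M:]
```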
5.1.2 The dual bundle method
To solve (D1(w)) given in Theorem 5.1, we propose a dual bundle method which constructs affine upper approximations for f_i(·,·, w), i ∈ Ω, and −β(·, w). The upper approximations are based on the subgradients of these functions at points (μ^(ℓ), λ^(ℓ)) that are generated iteratively by solving the so-called master problem. The reader is referred to Ruszczyński (2006) for the details of the bundle method.
In the next proposition, we show how to compute the subdifferential of the function f_i(·,·, w) at a point (μ′_i, λ′_i) for i ∈ Ω.
Proposition 5.2. For i ∈ Ω, μ′_i ∈ R^J_+ and λ′_i ∈ R^M, let (x′_i, y′_i) be an optimal solution of the infimum in (5.2) evaluated at (μ′_i, λ′_i). Then, (w · (Cx′_i + Q_i y′_i), p_i x′_i) is a subgradient of f_i(·,·, w) at (μ′_i, λ′_i).
Proof. For each (x_i, y_i) ∈ F_i, the function (μ_i, λ_i) ↦ w^T[μ_i · (Cx_i + Q_i y_i)] + p_i λ_i^T x_i is affine, hence continuous. Finally, the set F_i is compact by assumption. By Theorem 2.87 in Ruszczyński (2006), the assertion of the proposition follows.
Next, we show how to compute a subgradient of the function −β(·, w) at a point μ′.
Proposition 5.3. Recall that the set A = {u ∈ L^J | 0 ∈ R(u)} is the acceptance set of R. For μ′ ∈ M^J_1, let u′ ∈ A attain the supremum in (2.1) at (μ′, w). Then, (−w · u′_i)_{i∈Ω} is a subgradient of −β(·, w) at μ′.
Proof. For each u ∈ A, the function μ ↦ −w^T E^μ[u] is also affine and continuous for all μ ∈ M^J_1. By Theorem 2.87 in Ruszczyński (2006), the assertion of the proposition follows.
Remark 5.4. For practical risk measures, such as the multivariate entropic risk measure (see Example 2.3), the function −β(·, w) is differentiable and the subdifferential is a singleton. For coherent multivariate risk measures, such as the multivariate CVaR (see Example 2.2), there exists a convex cone Q ⊆ M^J_1 such that −β(μ, w) = 0 if μ ∈ Q and −β(μ, w) = −∞ otherwise. For the multivariate CVaR with risk-aversion parameters ν^j, one can show that the cut (5.10) is always satisfied; therefore, it can be ignored.
At each iteration k of the bundle method, we solve the master problem (MP1(w)) with L = {1, . . . , k} and ̺ > 0. Here, ‖·‖ denotes the Euclidean norm of the appropriate dimension. Note that constraints (5.14) and (5.15) for μ are equivalent to having μ ∈ M^J_1, and constraint (5.13) for λ is equivalent to having E[λ] = 0. The centers μ̄^(k) ∈ M^J_1 and λ̄^(k) ∈ L^M with E[λ̄^(k)] = 0 are parameters of the problem that are initialized and updated within the bundle method. The quadratic terms in the objective function are Moreau-Yosida regularization terms and they make the overall objective function strictly convex. These regularization terms enforce an optimal solution of (MP1(w)) to be close to the centers.
Let (μ^(k+1), λ^(k+1), ϑ^(k+1), η^(k+1)) be an optimal solution for (MP1(w)). Computing the subgradients of f_i(·,·, w), i ∈ Ω, and −β(·, w) at this point as in Propositions 5.2 and 5.3 yields the new cuts that are appended to the master problem at the next iteration.
The centers are updated in the following fashion. At iteration k, one checks if the difference between the objective value of (D1(w)) evaluated at the point (μ^(k), λ^(k)) and its value at the current centers is at least a fixed fraction of the improvement predicted by the cutting-plane model; if so, a descent step is taken and the centers are moved to (μ^(k), λ^(k)); otherwise, a null step is taken and the centers are kept unchanged.
The steps of our dual bundle method are provided as Algorithm 3. By (Ruszczyński, 2006, Theorem 7.16), the bundle method generates a sequence (μ̄^(k), λ̄^(k))_{k∈N} that converges to an optimal solution of (D1(w)) as k → ∞. In practice, the stopping condition in line 22 of Algorithm 3 is typically not satisfied after finitely many iterations. Therefore, it is a general practice to stop the algorithm when
\[
\sum_{i\in\Omega} \vartheta^{(k+1)}_i + \eta^{(k+1)} - \bar{F}^{(k+1)} \le \varepsilon \tag{5.17}
\]
for some small constant ε > 0.
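For intuition on these steps, the following generic sketch (ours; it is deliberately simplified and is not the paper's Algorithm 3, omitting the problem-specific cuts and constraints) maximizes a concave function from a value/subgradient oracle using a cutting-plane model, a Moreau–Yosida regularized master problem and a descent/null-step test.

```python
import numpy as np
from scipy.optimize import minimize

def proximal_bundle(oracle, x0, rho=1.0, m=0.1, tol=1e-6, max_iter=100):
    """Maximize a concave function given an oracle returning (value, subgradient)."""
    center = np.asarray(x0, dtype=float)
    f_center, g_center = oracle(center)
    cuts = [(center.copy(), f_center, g_center)]          # affine upper approximations
    n = center.size
    for _ in range(max_iter):
        # master problem: max_{x,t}  t - rho/2 * ||x - center||^2
        #                 s.t.       t <= f(x_l) + g_l^T (x - x_l)  for every cut l
        def neg_obj(v):
            x, t = v[:n], v[n]
            return -(t - 0.5 * rho * np.sum((x - center) ** 2))
        cons = [{'type': 'ineq',
                 'fun': (lambda v, xl=xl, fl=fl, gl=gl: fl + gl @ (v[:n] - xl) - v[n])}
                for xl, fl, gl in cuts]
        res = minimize(neg_obj, np.append(center, f_center), constraints=cons, method='SLSQP')
        x_new, t_new = res.x[:n], res.x[n]
        if t_new - f_center <= tol:                        # predicted increase is negligible
            return center, f_center
        f_new, g_new = oracle(x_new)
        if f_new >= f_center + m * (t_new - f_center):     # descent step: move the center
            center, f_center = x_new, f_new
        cuts.append((x_new, f_new, g_new))                 # add the new cut in either case
    return center, f_center

# example: maximize the concave, nonsmooth function f(x) = -|x1 - 1| - (x2 - 2)^2
def oracle(x):
    return (-abs(x[0] - 1.0) - (x[1] - 2.0) ** 2,
            np.array([-np.sign(x[0] - 1.0), -2.0 * (x[1] - 2.0)]))

print(proximal_bundle(oracle, x0=[5.0, -3.0]))
```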
Remark 5.5. Note that the scenario-wise variables ϑ_i in the objective function of (MP1(w)) can be replaced with a single variable ϑ, and the multiple cuts in constraint (5.11) can be replaced with a single aggregated cut on ϑ. This way one would obtain an upper approximation for the sum Σ_{i∈Ω} f_i(·,·, w). Compared to the multiple cuts in (5.11), this provides a looser upper approximation for Σ_{i∈Ω} f_i(·,·, w). However, while one adds I = |Ω| cuts at each iteration in the multiple-cuts version, this approach adds a single cut.

Algorithm 3 A Dual Bundle Method for (P1(w)). (Line 20, optional: remove all cuts whose dual variables at the solution of the master problem are zero.)
5.1.3 Recovery of primal solution
Both the primal and the dual Benson algorithms require an optimal solution (x_(w), y_(w), z_(w)) of the problem (P′1(w)). Therefore, in Theorem 5.6, we suggest a procedure to recover an optimal primal solution from the solution of the master problem (MP1(w)).
Theorem 5.6. Let L = {1, . . . , k} be the index set at the last iteration of the dual bundle method with the approximate stopping condition (5.17) for some ε > 0. Let n + 1 be the first descent iteration after the approximate stopping condition is satisfied and let L′ = {1, . . . , n}. For (MP1(w)) with centers μ̄^(k), λ̄^(k) and index set L′, let τ = (τ^(ℓ)_i)_{i∈Ω, ℓ∈L′} be the Lagrangian dual variables assigned to the cut constraints, and let (x^(ℓ)_i, y^(ℓ)_i) be an optimal solution of the subproblem in line 6 of Algorithm 3 for each i ∈ Ω and ℓ ∈ L′. Let x_(w) = ((x_(w))_i)_{i∈Ω}, y_(w) = ((y_(w))_i)_{i∈Ω} be defined by
\[
(x_{(w)})_i := \sum_{\ell\in\mathcal{L}'} \tau^{(\ell)}_i x^{(\ell)}_i, \qquad (y_{(w)})_i := \sum_{\ell\in\mathcal{L}'} \tau^{(\ell)}_i y^{(\ell)}_i.
\]
Moreover, let z_(w) be a minimizer of the problem
\[
\inf_{z\in\mathcal{R}(Cx_{(w)}+Qy_{(w)})} w^\top z.
\]
Then, (x_(w), y_(w), z_(w)) is an approximately optimal solution of (P′1(w)); in particular, as ε → 0, it holds w^T z_(w) → P1(w).
The proof of Theorem 5.6 is given in Appendix A.
5.2 The problem of scalarization by a reference variable

Let v ∈ R^J \ P. The problem (P2(v)) defined in Section 4.1 is formulated to find the minimum step-length to enter P from v along the direction 1 ∈ R^J and it can be rewritten more explicitly as
\[
\begin{aligned}
\min\ & \alpha & \text{(P2(v))}\\
\text{s.t. } & v + \alpha\mathbf{1}\in\mathcal{R}(Cx+Qy)\\
& Ax = b\\
& T_i x + W_i y_i = h_i, \quad \forall i\in\Omega\\
& \alpha\in\mathbb{R},\ x\in\mathbb{R}^M_+,\ y_i\in\mathbb{R}^N_+, \quad \forall i\in\Omega.
\end{aligned}
\]
We propose a scenario-wise decomposition solution methodology for (P2(v)). Even though the steps we follow are similar to the ones for (P1(w)), the decomposition is more complicated because the weights are not parameters but instead they are decision variables in the dual problem of (P2(v)) (see Theorem 5.11 below). Therefore, following the same steps as in (P1(w)) results in a nonconvex optimization problem. In order to resolve this convexity issue, we propose a new formulation for (P2(v)) by introducing finite measures into the dual representation of R.
The flow of this section is as follows: in Sections 5.2.1 and 5.2.2, we propose a scenario-wise decomposition solution methodology for (P2(v)). Section 5.2.3 is devoted to the recovery of a primal solution.
5.2.1 Scenario decomposition

To derive a decomposition algorithm for (P2(v)), we randomize the first-stage variable x ∈ R^M as in (P1(w)) and add the nonanticipativity constraints
\[
p_i(x_i - \mathbb{E}[x]) = 0, \quad \forall i\in\Omega.
\]
Using the feasible region F defined by (5.1), we may rewrite (P2(v)) as follows:
\[
\begin{aligned}
\min\ & \alpha & \text{(P}_2'(v)\text{)}\\
\text{s.t. } & v + \alpha\mathbf{1}\in\mathcal{R}(Cx+Qy)\\
& p_i(x_i - \mathbb{E}[x]) = 0, \quad \forall i\in\Omega\\
& (x, y)\in\mathcal{F},\ \alpha\in\mathbb{R}.
\end{aligned}
\]
Note that the optimal value of (P′2(v)) is P2(v).
Different from the approach for (P′1(w)), in order to obtain a convex dual problem for (P′2(v)), we use finite measures m instead of probability measures μ in the dual representation of R. To that end, let M^J_f be the set of all J-dimensional vectors m = (m^1, . . . , m^J)^T of finite measures on Ω, that is, for each j ∈ J, the finite measure m^j assigns m^j_i to the elementary event {i} for i ∈ Ω. For m ∈ M^J_f and i ∈ Ω, we also write m_i := (m^1_i, . . . , m^J_i)^T ∈ R^J.
The following lemma provides the relationship between μ and m.
Lemma 5.7. For every μ ∈ M^J_1 and γ ∈ R^J_+\{0}, there exists m ∈ M^J_f such that, for every u ∈ L^J,
\[
\gamma^\top\mathbb{E}^{\mu}[u] = \sum_{i\in\Omega} m_i^\top u_i, \qquad \gamma^\top\mathbf{1} = \sum_{i\in\Omega} m_i^\top\mathbf{1}. \tag{5.18}
\]
Proof. Let μ ∈ M^J_1 and γ ∈ R^J_+\{0}. Define m ∈ M^J_f by m^j_i := γ^j μ^j_i for each i ∈ Ω and j ∈ J. Then,
\[
\sum_{i\in\Omega} m_i^\top u_i = \sum_{i\in\Omega}\sum_{j\in\mathcal{J}} \gamma^j\mu^j_i u^j_i = \sum_{j\in\mathcal{J}} \gamma^j\,\mathbb{E}^{\mu^j}[u^j] = \gamma^\top\mathbb{E}^{\mu}[u], \qquad \sum_{i\in\Omega} m_i^\top\mathbf{1} = \sum_{j\in\mathcal{J}} \gamma^j\sum_{i\in\Omega}\mu^j_i = \gamma^\top\mathbf{1},
\]
which proves (5.18).
Example 5.8. Recall Example 2.3 on the multivariate entropic risk measure, for which the function β̃(·) takes a corresponding explicit form.
Example 5.9. Recall Example 2.2 on the multivariate CVaR, for which the function β̃(·) likewise takes an explicit form.
Remark 5.10. For each u ∈ A, the map m ↦ Σ_{i∈Ω} m_i^T u_i is a linear function so that m ↦ β̃(m) is a convex function since it is the supremum of linear functions indexed by u ∈ A. Similarly, f_i(·,·,·) is not a concave function in general. However, (m_i, λ_i) ↦ f̃_i(m_i, λ_i) is the infimum of linear functions indexed by (x_i, y_i) ∈ F_i; therefore, it is a concave function.
In view of Remark 5.10, while the first reformulation of (P′2(v)) provided in Theorem 5.11 is not a convex optimization problem, the second reformulation, that is (D2(v)), is a convex optimization problem.
The proof of Theorem 5.11 uses Lemma 5.7 and the following lemma of independent interest.
Lemma 5.12. For every u ∈ L^J,
\[
\inf\{\alpha\in\mathbb{R} \mid v+\alpha\mathbf{1}\in\mathcal{R}(u)\} = \sup\left\{ \gamma^\top\!\left(\mathbb{E}^{\mu}[u]-v\right) - \beta(\mu, \gamma) \;\middle|\; \mu\in\mathcal{M}^J_1,\ \gamma^\top\mathbf{1}=1,\ \gamma\in\mathbb{R}^J_+ \right\}.
\]
Proof. Let u ∈ L^J. Note that
\[
\inf\{\alpha\in\mathbb{R} \mid v+\alpha\mathbf{1}\in\mathcal{R}(u)\} = \inf\{\alpha\in\mathbb{R} \mid 0\in\mathcal{R}(u)-v-\alpha\mathbf{1}\}
\]
is the optimal value of a single-objective optimization problem with a set-valued constraint function α ↦ H(α) = R(u) − v − α1. Using the Lagrange duality in Borwein (1981) for such problems, in particular, Theorem 19, we have
\[
\inf\{\alpha\in\mathbb{R} \mid v+\alpha\mathbf{1}\in\mathcal{R}(u)\} = \sup_{\gamma\in\mathbb{R}^J}\ \inf_{\alpha\in\mathbb{R}} \left( \alpha + \inf_{z\in\mathcal{R}(u)-v-\alpha\mathbf{1}} \gamma^\top z \right). \tag{5.21}
\]
To be able to use this result, we check the following constraint qualification: H is open at 0 ∈ R^J in the sense that for every α ∈ R with 0 ∈ H(α) and for every ε > 0, there exists an open ball V around 0 ∈ R^J such that
\[
V \subseteq \bigcup_{\tilde{\alpha}\in(\alpha-\varepsilon,\,\alpha+\varepsilon)} H(\tilde{\alpha}). \tag{5.22}
\]
To that end, let α ∈ R with 0 ∈ H(α), that is, v + α1 ∈ R(u). Let ε > 0. Since 1 is an interior point of R^J_+ and R(u) + R^J_+ = R(u) due to the monotonicity and translativity of R, it follows that v + (α+ε)1 is an interior point of R(u). On the other hand, note that
\[
\bigcup_{\tilde{\alpha}\in(\alpha-\varepsilon,\,\alpha+\varepsilon)} H(\tilde{\alpha}) = \bigcup_{\tilde{\alpha}\in(\alpha-\varepsilon,\,\alpha+\varepsilon)} \mathcal{R}(u - v - \tilde{\alpha}\mathbf{1}) = \mathcal{R}(u - v - (\alpha+\varepsilon)\mathbf{1}) = \mathcal{R}(u) - v - (\alpha+\varepsilon)\mathbf{1}
\]
thanks to the monotonicity and translativity of R. Hence, 0 ∈ R^J is an interior point of the above union. Therefore, (5.22) holds for some open ball V around 0 ∈ R^J and (5.21) follows.
Since R(u) + R^J_+ = R(u) and R(u) is a convex set as a consequence of the convexity of R, one can check that inf_{z∈R(u)} γ^T z = −∞ for every γ ∉ R^J_+. Hence, the supremum in (5.21) can be evaluated over all γ ∈ R^J_+. Finally, using (2.3), we obtain
\[
\begin{aligned}
\inf\{\alpha\in\mathbb{R} \mid v+\alpha\mathbf{1}\in\mathcal{R}(u)\}
&= \sup_{\gamma\in\mathbb{R}^J_+}\ \inf_{\alpha\in\mathbb{R}}\left( \alpha + \inf_{z\in\mathcal{R}(u)-v-\alpha\mathbf{1}} \gamma^\top z \right)\\
&= \sup_{\gamma\in\mathbb{R}^J_+}\ \inf_{\alpha\in\mathbb{R}}\left( \alpha - \gamma^\top(v+\alpha\mathbf{1}) + \sup_{\mu\in\mathcal{M}^J_1}\left( \gamma^\top\mathbb{E}^{\mu}[u] - \beta(\mu, \gamma) \right) \right)\\
&= \sup_{\gamma\in\mathbb{R}^J_+}\left[ \inf_{\alpha\in\mathbb{R}}(1-\gamma^\top\mathbf{1})\alpha + \sup_{\mu\in\mathcal{M}^J_1}\left( \gamma^\top(\mathbb{E}^{\mu}[u]-v) - \beta(\mu, \gamma) \right) \right]\\
&= \sup\left\{ \gamma^\top(\mathbb{E}^{\mu}[u]-v) - \beta(\mu, \gamma) \;\middle|\; \mu\in\mathcal{M}^J_1,\ \gamma^\top\mathbf{1}=1,\ \gamma\in\mathbb{R}^J_+ \right\},
\end{aligned}
\]
which completes the proof.
Proof of Theorem 5.11. Using Lemma 5.12, we may express P2(v) as an infimum over (x, y) ∈ F subject to the nonanticipativity constraints. Using the minimax theorem of Sion (1958), we may interchange the infimum and the supremum and obtain the first reformulation, where f_i is defined by (5.2). The second reformulation follows from the first reformulation and Lemma 5.7.
5.2.2 The dual bundle method

To solve (D2(v)) provided in Theorem 5.11, we propose a dual bundle method similar to the one in Section 5.1.2.
At each iteration k of the dual bundle method, we solve the master problem (MP2(v)), in which (g_{m_i}, g_{λ_i}) denote the subgradients used to form the cuts.
The steps of the dual bundle method are provided in Algorithm 4. Similar to (5.17), the algorithm stops in practice when the analogous approximate stopping condition (5.29) holds for some small constant ε > 0.

Algorithm 4 A Dual Bundle Method for (P2(v)). (Line 20, optional: remove all cuts whose dual variables at the solution of the master problem are zero.)

In the next proposition, we show how to compute the subdifferential ∂_{m_i, λ_i} f̃_i(m′_i, λ′_i).
Proof. The proof of this proposition is similar to the proofs of Propositions 5.2 and 5.3. Therefore, it is omitted.
5.2.3 Recovery of primal solution
The primal Benson algorithm requires an optimal solution (x_(v), y_(v), α_(v)) of the problem (P′2(v)). Therefore, in Theorem 5.14, we suggest a procedure to recover an optimal primal solution from the solution of the master problem (MP2(v)).
Theorem 5.14. Let L = {1, . . . , k} be the index set at the last iteration of the dual bundle method with the approximate stopping condition (5.29) for some ε > 0. Let n + 1 be the first descent iteration after the approximate stopping condition is satisfied and let L′ = {1, . . . , n}. For (MP2(v)) with centers m̄^(k), λ̄^(k) and index set L′, let τ = (τ^(ℓ)_i)_{i∈Ω, ℓ∈L′}, θ = (θ^(ℓ))_{ℓ∈L′}, σ ∈ R^M, ψ ∈ R, ν = (ν_i)_{i∈Ω} be the Lagrangian dual variables assigned to the constraints (5.23), (5.24), (5.25), (5.26), (5.27), respectively, with τ^(ℓ)_i ≥ 0, θ^(ℓ) ≥ 0, ν_i ∈ R^J_+ for each i ∈ Ω, ℓ ∈ L′. Let (x^(ℓ)_i, y^(ℓ)_i) be an optimal solution of the subproblem in line 6 of Algorithm 4 for each i ∈ Ω and ℓ ∈ L′. Let
\[
\tau^{(n+1)} = (\tau^{(\ell, n+1)}_i)_{i\in\Omega,\,\ell\in\mathcal{L}'}, \quad \theta^{(n+1)} = (\theta^{(\ell, n+1)})_{\ell\in\mathcal{L}'}, \quad \sigma^{(n+1)}, \quad \psi^{(n+1)}, \quad \nu^{(n+1)} = (\nu^{(n+1)}_i)_{i\in\Omega}
\]
be a dual optimal solution for (MP2(v)). Let x_(v) = ((x_(v))_i)_{i∈Ω}, y_(v) = ((y_(v))_i)_{i∈Ω} be defined by
\[
(x_{(v)})_i := \sum_{\ell\in\mathcal{L}'} \tau^{(\ell)}_i x^{(\ell)}_i, \qquad (y_{(v)})_i := \sum_{\ell\in\mathcal{L}'} \tau^{(\ell)}_i y^{(\ell)}_i.
\]
Let
\[
\alpha_{(v)} := \inf\{ \alpha\in\mathbb{R} \mid v+\alpha\mathbf{1}\in\mathcal{R}(Cx_{(v)}+Qy_{(v)}) \}.
\]
Then, (x_(v), y_(v), α_(v)) is an approximately optimal solution of (P′2(v)) in the following sense:
(a) ((x_(v))_i, (y_(v))_i) ∈ F_i for each i ∈ Ω.
(b) v + α_(v)1 ∈ R(Cx_(v) + Qy_(v)).
(c) As ε → 0, it holds (x_(v))_i − σ^(n+1) → 0 for each i ∈ Ω.
(d) As ε → 0, it holds α_(v) → P2(v).
The proof of Theorem 5.14 is given in Appendix B.
5.2.4 Recovery of a solution to (LD2(v))

In addition to a primal optimal solution (x_(v), y_(v), α_(v)), the primal Benson algorithm also requires an optimal solution γ_(v) of the dual problem (LD2(v)) (see Section 4.1). Therefore, in Theorem 5.15, we suggest a procedure to recover this solution from the solution of the master problem (MP2(v)).
Theorem 5.15. In the setting of Theorem 5.14, let
\[
\gamma_{(v)} = \sum_{i\in\Omega} m^{(n+1)}_i. \tag{5.30}
\]
Then, γ_(v) is an approximately optimal solution of (LD2(v)) in the following sense: as ε → 0, it holds
\[
\inf_{(x, y)\in\mathcal{X},\,\alpha\in\mathbb{R}} \left( \alpha + \inf_{z\in\mathcal{R}(Cx+Qy)-v-\alpha\mathbf{1}} \gamma_{(v)}^\top z \right) - \mathrm{P2}(v) \to 0. \tag{5.31}
\]
6 Computational Study
In order to test our methods, we solve a multi-objective risk-averse portfolio optimization problem under transaction costs. We consider a one-period market with J risky assets. Each asset j ∈ J = {1, . . . , J} has a random return r^j ∈ L. At the beginning of the period, it costs θ^{jk} ∈ R units of asset j for an agent to buy one unit of asset k ∈ J. At the end of the period, the random transaction cost of buying one unit of asset k is π^{jk} ∈ L units of asset j.
The risk-averse agent has a capital of c ∈ R_{++} units of asset 1 to be invested in the J assets. Let x^j ∈ R_+ denote the number of physical units of asset j purchased by the agent; hence, she spends x^j θ^{1j} units of asset 1 for this purchase. At the end of the period, the agent observes the random return of each asset as well as the random transaction costs between the assets. The value of each asset j is (1 + r^j)x^j and it is transacted to purchase the J assets with a transaction cost of π^{jk} for asset k. Let q^{jk} ∈ L_+ denote the number of physical units of asset k purchased by selling some units of asset j. Let y^k ∈ L_+ denote the total number of physical units of asset k purchased by the agent so that y^k = Σ_{j∈J} q^{jk}. The objective is to minimize the risk of the random cost vector −y ∈ L^J using a multivariate convex risk measure R. This problem can be formulated as follows:
\[
\begin{aligned}
\min\ & z \quad \text{w.r.t. } \mathbb{R}^J_+\\
\text{s.t. } & z\in\mathcal{R}(-y)\\
& \sum_{j\in\mathcal{J}} \theta^{1j} x^j = c\\
& (1 + r^j_i)x^j = \sum_{k\in\mathcal{J}} \pi^{jk}_i q^{jk}_i, \quad \forall j\in\mathcal{J},\ i\in\Omega\\
& y^j_i = \sum_{k\in\mathcal{J}} q^{kj}_i, \quad \forall j\in\mathcal{J},\ i\in\Omega\\
& z\in\mathbb{R}^J,\ x\in\mathbb{R}^J_+,\ y_i\in\mathbb{R}^J_+,\ q_i\in\mathbb{R}^{J\times J}_+, \quad \forall i\in\Omega.
\end{aligned}
\]
Note that x ∈ R^J_+ is the first-stage and y_i ∈ R^J_+, q_i ∈ R^{J×J}_+, i ∈ Ω, are the second-stage decision variables. All computational experiments are conducted on a PC with 8.00 GB of RAM and an Intel(R) Core(TM) i7-4790 CPU @ 3.60 GHz processor. We use Matlab implementations of Algorithms 3 and 4, where CVX 1.22 is used to solve master problems and CPLEX 12.6 is used to solve subproblems.
We generate two classes of instances where the number of assets J is either 2 or 3. In both cases, we assume c = 1. We set θ^{12} = 1.0815, θ^{13} = 0.9094. The return of asset 1 is uniformly distributed between −0.1 and 0.2, denoted by r^1 ∼ U[−0.1, 0.2]. Similarly, we assume r^2 ∼ U[−0.05, 0.1] and r^3 ∼ U[−0.15, 0.3]. The random transaction costs among the assets are assumed to have the following distributions: π^{12} ∼ U[1, 1.1], π^{21} ∼ U[0.9, 1], π^{13} ∼ U[0.9, 1], π^{31} ∼ U[1, 1.1], π^{23} ∼ U[0.8, 1], π^{32} ∼ U[1, 1.2], and π^{11} = π^{22} = π^{33} = 1.
First of all, in Example 6.1, we compare our dual bundle method with CVX on the problem (P1(w)) of weighted sum scalarization. Our dual bundle method takes advantage of scenario-wise decompositions while CVX solves the problem as a standard convex optimization problem without decompositions.
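A minimal sketch (ours; the variable names, the equal scenario probabilities and θ^{11} = 1 are assumptions, not specifications from the text) of sampling one such instance for J = 3 assets and I scenarios:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_instance(I):
    """Samples returns r_i^j and transaction costs pi_i^{jk} for J = 3 assets."""
    r_low = np.array([-0.10, -0.05, -0.15])
    r_high = np.array([0.20, 0.10, 0.30])
    r = rng.uniform(r_low, r_high, size=(I, 3))           # r^j ~ U[r_low_j, r_high_j]

    pi_low = np.array([[1.0, 1.0, 0.9],                   # rows j, columns k
                       [0.9, 1.0, 0.8],
                       [1.0, 1.0, 1.0]])
    pi_high = np.array([[1.0, 1.1, 1.0],
                        [1.0, 1.0, 1.0],
                        [1.1, 1.2, 1.0]])
    pi = rng.uniform(pi_low, pi_high, size=(I, 3, 3))     # pi^{jk} ~ U[., .], diagonal fixed at 1

    p = np.full(I, 1.0 / I)                               # equally likely scenarios (assumed)
    theta = np.array([1.0, 1.0815, 0.9094])               # theta^{1j}; theta^{11} = 1 assumed
    c = 1.0
    return p, r, pi, theta, c

p, r, pi, theta, c = sample_instance(I=500)
```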
Example 6.1. We compare the CPU times (in seconds) of the dual bundle method and the CVX solver on (P1(w)) instances with two- and three-dimensional multivariate entropic risk measures and different numbers of scenarios (I). In each instance, we use a fixed weight vector w.
As observed in Table 1 and Table 2, the CVX solver outperforms the dual bundle method for smaller numbers of scenarios. However, as the number of scenarios increases, the dual bundle method outperforms the CVX solver. For instance, for I = 10000 in Table 1, the CVX solver cannot solve the problem due to a memory error. The same situation is observed for I = 500 in Table 2.
I        Dual Bundle Method    CVX
1000     869.22                75.98
2500     2130.85               588.85
5000     4170.55               3091.44
10000    8452.47               **

Table 1: Computational performances of the dual bundle method and CVX for a two-dimensional multivariate entropic risk measure instance with weight vector w = (1/2, 1/2)
I       Dual Bundle Method    CVX
50      56.39                 35.22
100     252.73                98.73
250     1838.38               346.87
500     6309.39               **

Table 2: Computational performances of the dual bundle method and CVX for a three-dimensional multivariate entropic risk measure instance with weight vector w = (1/3, 1/3, 1/3)
In Examples 6.2–6.5, for each algorithm we report the number of scalarization problems solved (#opt.), the number of vertices in the final outer approximation (#vert.) and the CPU time in seconds (time).
Example 6.2. (Two-dimensional multivariate CVaR) We consider J = 2 assets under I = 500 scenarios. The parameters of the multivariate CVaR are chosen as ν^1 = 0.8, ν^2 = 0.9. We use error parameter values ε ∈ {10^{-2}, 10^{-3}, 10^{-4}}. The computational results are reported in Table 3. It can be seen that the performances of the primal and dual algorithms are close to each other.
The inner (red lines) and outer (blue lines) approximations of the upper image P and the lower image D are given in Figures 1 and 2. These figures are obtained by the primal algorithm. Since the corresponding figures for the dual algorithm are similar, they are omitted. Clearly, the algorithm provides finer approximations for the upper and lower images when ε is reduced from 10^{-3} to 10^{-4}.
                   ε        #opt.    #vert.    time
Primal Algorithm   10^-2    5        3         2675.69
                   10^-3    11       6         10513.06
                   10^-4    23       13        11391.12
Dual Algorithm     10^-2    5        4         2819.92
                   10^-3    13       8         7021.55
                   10^-4    25       15        10007.75

Table 3: Computational results for the two-dimensional multivariate CVaR
Figure 1: Inner and outer approximations obtained by the primal algorithm for ε = 10^{-3}. (a) Upper image; (b) Lower image.
Figure 2: Inner and outer approximations obtained by the primal algorithm for ε = 10^{-4}. (a) Upper image; (b) Lower image.
Example 6.3. (Two-dimensional multivariate entropic risk measure) We consider J = 2 assets under I = 500 scenarios. The parameters of the multivariate entropic risk measure are chosen as δ^1 = δ^2 = 0.1 and the cone C is generated by the vectors (2, 1) and (1, 2). We use error parameter values ε ∈ {0.1, 0.05, 0.01}. The computational results are reported in Table 4. In this example, the dual algorithm solves more optimization problems and enumerates more vertices than the primal algorithm in significantly shorter time. The inner and outer approximations of the upper image P and the lower image D obtained by the primal algorithm are given in Figures 3 and 4. Since the corresponding figures for the dual algorithm are similar, they are omitted.
                   ε       #opt.    #vert.    time
Primal Algorithm   0.1     25       13        37706.90
                   0.05    37       19        84730.81
                   0.01    83       42        144848.62
Dual Algorithm     0.1     31       17        13955.43
                   0.05    47       25        14088.08
                   0.01    85       44        17121.26

Table 4: Computational results for the two-dimensional multivariate entropic risk measure
Figure 3: Inner and outer approximations obtained by the primal algorithm for ε = 0.05. (a) Upper image; (b) Lower image.
Figure 4: Inner and outer approximations obtained by the primal algorithm for ε = 0.01. (a) Upper image; (b) Lower image.
Example 6.4. (Three-dimensional multivariate CVaR) We consider J = 3 assets under I = 250 scenarios. The parameters of the multivariate CVaR are chosen as ν^1 = 0.8, ν^2 = 0.9. We use error parameter values ε ∈ {10^{-2}, 10^{-3}, 10^{-4}}.
The computational results are reported in Table 5. For ε = 10^{-2} and ε = 10^{-3}, the primal algorithm terminates in shorter time while, for ε = 10^{-4}, the dual algorithm is faster.
The outer approximations of the upper image P and the lower image D obtained by the primal and dual algorithms are given in Figures 5–7. Note that the dots represent the vertices of some polyhedra even if they are not connected by line segments.
As observed in these figures, the primal algorithm provides a better approximation of the lower image compared to the dual algorithm. However, the approximation of the upper image provided by the dual algorithm is better than the one by the primal algorithm.
                   ε        #opt.    #vert.    time
Primal Algorithm   10^-2    21       9         16856.84
                   10^-3    82       32        67555.09
                   10^-4    468      162       319862.68
Dual Algorithm     10^-2    24       11        20303.43
                   10^-3    98       36        79397.49
                   10^-4    448      152       249081.86

Table 5: Computational results for the three-dimensional multivariate CVaR
Figure 5: Outer approximations obtained by the primal and dual algorithms for I = 250 and ε = 10^{-2} ((a) primal algorithm, (b) dual algorithm; upper and lower images).
Figure 6: Outer approximations obtained by the primal and dual algorithms for I = 250 and ε = 10^{-3} ((a) primal algorithm, (b) dual algorithm; upper and lower images).
Figure 7: Outer approximations obtained by the primal and dual algorithms for I = 250 and ε = 10^{-4} ((a) primal algorithm, (b) dual algorithm; upper and lower images).
Example 6.5. (Three-dimensional multivariate entropic risk measure) We consider J = 3 assets under I = 100 scenarios. The parameters of the multivariate entropic risk measure are chosen as δ^1 = δ^2 = δ^3 = 0.1 and the cone C is generated by the vectors (1, 2, 3), (3, 2, 1). We use error parameter values ε ∈ {0.1, 0.05, 0.01}. The computational results are reported in Table 6. We are not able to solve this problem using the primal algorithm as the dual bundle method for (P2(v)) does not converge for some vertices v. This is in line with what is reported in (Löhne et al., 2014, Example 5.4) for a four-objective problem with multivariate entropic risk measure. The results of the dual algorithm are provided in Table 6 and Figure 8.
As the multivariate entropic risk measure is defined in terms of the exponential utility function, which is strictly concave, the upper and lower images are non-polyhedral sets. For this reason, the polyhedral outer approximations of these sets have a more uniform density of vertices over their surfaces compared to the outer approximations for the multivariate CVaR.
                 ε       #opt.    #vert.    time
Dual Algorithm   0.1     196      61        48742.57
                 0.05    319      99        82237.89
                 0.01    670      211       168460.45

Table 6: Computational results for the three-dimensional multivariate entropic risk measure (dual algorithm)
Figure 8: Outer approximations obtained by the dual algorithm for I = 100 for (a) ε = 0.1, (b) ε = 0.05, (c) ε = 0.01 (upper and lower images).