
Convex Polyhedral Sets and Cones

From the document Foundations of Optimization (pages 192-197)

7 Convex Polyhedra

In this chapter, we develop the basic results of the theory of convex polyhedra. This is a large area of research that has been studied from many different points of view. Within optimization, it is very important in linear programming, especially in connection with the simplex method for solving linear programs. The choice of the topics we treat in this chapter is dictated mostly by the needs of optimization. However, we do not have space to treat the extensive body of work concerning the combinatorial theory of convex polyhedra, some of which is intimately related to the simplex method and its variants. The interested reader may consult the books [115, 50, 274] for more information on this topic and the book [5] for differential-geometric questions regarding convex polyhedra.

In this chapter, $E$ will be a finite-dimensional vector space.

Lemma 7.2. A finitely generated cone is a closed set.

Proof. Let $K$ be a finitely generated cone:
$$K = \Big\{ \sum_{j=1}^{k} t_j a_j : t_j \ge 0,\ j = 1, \dots, k \Big\}.$$
By Carathéodory's theorem (Theorem 4.21, p. 94), any point $x \in K$ can be written as
$$x = \sum_{j=1}^{p} \delta_j b_j, \qquad \delta_j \ge 0,$$
where $\{b_j\}_1^p$ is a linearly independent subset of $\{a_i\}_1^k$. It follows that $x \in \{\sum_{1}^{p} \delta_j b_j : \delta_j \ge 0\}$, a simplicial cone that is the image of the nonnegative orthant $\mathbb{R}^p_+$ under the linear map
$$T(\delta) = \sum_{j=1}^{p} \delta_j b_j.$$
Since $\{b_j\}_1^p$ is linearly independent, $T$ is a homeomorphism, and since $\mathbb{R}^p_+$ is closed, so is the simplicial cone. The cone $K$ is a union of such simplicial cones, which are finitely many in number, so it must be closed. ⊓⊔

It follows from this lemma that $\{\sum_{1}^{k} \delta_i a_i : \delta_i \ge 0\} = \operatorname{cl}\,\operatorname{cone}(a_1, \dots, a_k)$. We will denote this set by $\operatorname{cone}(a_1, \dots, a_k)$; thus
$$\operatorname{cone}(a_1, \dots, a_k) := \operatorname{cl}\,\operatorname{cone}(a_1, \dots, a_k).$$
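Lemma 7.2 also suggests a simple numerical membership test: since the cone is closed, a point belongs to $\operatorname{cone}(a_1, \dots, a_k)$ exactly when its distance to the cone is zero, and that distance can be computed by nonnegative least squares. The following is a minimal sketch, assuming NumPy and SciPy are available; the helper name `in_cone` and the tolerance are our own choices, not the book's.

```python
import numpy as np
from scipy.optimize import nnls

def in_cone(generators, x, tol=1e-9):
    """Test whether x lies in cone(a_1, ..., a_k) = {sum_j t_j a_j : t_j >= 0}.

    Because this cone is closed (Lemma 7.2), x belongs to it exactly when the
    nonnegative least-squares problem  min_{t >= 0} ||A t - x||  has residual 0,
    where the columns of A are the generators a_j.
    """
    A = np.column_stack(generators)
    t, residual = nnls(A, x)            # t >= 0 minimizing ||A t - x||
    return residual <= tol, t

# e_1 + e_2 lies in cone(e_1, e_2); -e_1 does not.
a1, a2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(in_cone([a1, a2], np.array([1.0, 1.0])))    # (True,  array([1., 1.]))
print(in_cone([a1, a2], np.array([-1.0, 0.0])))   # (False, array([0., 0.]))
```

When the residual vanishes, the returned coefficient vector is itself a witness of membership.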

Lemma 7.3. The dual of the finitely generated cone $K = \operatorname{cone}(a_1, \dots, a_k)$ is the polyhedral cone $L = \bigcap_{j=1}^{k} \{x : \langle a_j, x\rangle \le 0\}$.

Proof. Clearly, we have
$$K^* = \Big\{ x : \Big\langle \sum_{j=1}^{k} t_j a_j,\, x \Big\rangle \le 0 \ \text{ for all } t_j \ge 0 \Big\} \supseteq \bigcap_{1}^{k} \{x : \langle a_j, x\rangle \le 0\}.$$
If $x \in K^*$, choosing $t_j = 1$ and all other $t_i = 0$ implies $\langle a_j, x\rangle \le 0$, proving that $K^* \subseteq L$. ⊓⊔

7.1.1 Convex Polyhedral Cones

Theorem 7.4. Let $a_1, \dots, a_k \in E$. The finitely generated cone
$$K = \operatorname{cone}(a_1, \dots, a_k)$$
and the polyhedral cone
$$L = \{x : \langle a_j, x\rangle \le 0,\ j = 1, \dots, k\}$$
are polars of each other, that is, $K^* = L$ and $L^* = K$.

Proof. It follows from Lemma 7.3 that $K^* = L$. Since $K$ is closed by Lemma 7.2, Theorem 6.19 implies that $K = (K^*)^* = L^*$. ⊓⊔

If $E$ is endowed with a basis, the above theorem takes the following form.

Corollary 7.5. Let $A$ be an $n \times k$ matrix. Then the cones $K = \{Av : v \ge 0\}$ and $L = \{x : A^T x \le 0\}$ are polars of each other.

Proof. Let $A = [a_1, \dots, a_k]$, where the $\{a_i\}$ are the columns of $A$. Then $K = \operatorname{cone}(a_1, \dots, a_k)$ and $L = \{x : \langle a_j, x\rangle \le 0,\ j = 1, \dots, k\}$. ⊓⊔
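In this coordinate form, testing whether a point lies in the polar of a finitely generated cone reduces to checking finitely many inequalities $A^T x \le 0$. A minimal sketch of that check, assuming NumPy (the helper name is ours):

```python
import numpy as np

def in_polar(A, x, tol=1e-12):
    """x belongs to the polar of K = {A v : v >= 0} iff A^T x <= 0 (Corollary 7.5)."""
    return bool(np.all(A.T @ x <= tol))

A = np.array([[1.0, 0.0],
              [1.0, 1.0]])                   # generators a_1 = (1, 1), a_2 = (0, 1) as columns
print(in_polar(A, np.array([-1.0, 0.0])))    # True:  <a_1, x> = -1, <a_2, x> = 0
print(in_polar(A, np.array([0.0, 1.0])))     # False: <a_1, x> = <a_2, x> = 1 > 0
```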

Theorem 7.6. (Farkas's lemma, homogeneous version) Let $a_1, \dots, a_k$ and $b$ be given vectors in $E$. The following statements are equivalent:

(a) If $x \in E$ satisfies the inequalities $\langle a_i, x\rangle \le 0$, $i = 1, \dots, k$, then it also satisfies the inequality $\langle b, x\rangle \le 0$. In other words,
$$[\langle a_i, x\rangle \le 0,\ i = 1, \dots, k] \implies [\langle b, x\rangle \le 0].$$
(b) The vector $b$ is a nonnegative linear combination of $\{a_i\}_1^k$, that is, $b = \sum_{i=1}^{k} t_i a_i$ for some $t_i \ge 0$, $i = 1, \dots, k$.

Proof. This is essentially a restatement of Theorem 7.4. Define
$$K = \{x : \langle a_i, x\rangle \le 0,\ i = 1, \dots, k\}.$$
Part (a) is equivalent to the statement $b \in K^*$, whereas part (b) states that $b \in \operatorname{cone}(a_1, \dots, a_k)$. We have $K^* = \operatorname{cone}(a_1, \dots, a_k)$ by Theorem 7.4. ⊓⊔

The general, affine version of Farkas's lemma is given in Theorem 7.20 on page 185.
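Theorem 7.6 also underlies a practical dichotomy: an LP solver either produces nonnegative coefficients expressing $b$ in terms of the $a_i$, or a vector $x$ witnessing that statement (a) fails. The sketch below is one way to carry this out with SciPy's `linprog`; it is an illustration under our own naming, not a construction from the book, and the extra constraint $\langle b, x\rangle \le 1$ only keeps the certificate LP bounded.

```python
import numpy as np
from scipy.optimize import linprog

def farkas_alternative(A, b):
    """Decide which alternative of Theorem 7.6 holds for the columns a_1,...,a_k of A.

    Returns ('combination', t) with t >= 0 and A t = b when (b) holds, and
    otherwise ('certificate', x) with A^T x <= 0 and <b, x> > 0, witnessing
    that (a) fails.
    """
    n, k = A.shape
    # Alternative (b): feasibility of  A t = b,  t >= 0.
    feas = linprog(c=np.zeros(k), A_eq=A, b_eq=b, bounds=[(0, None)] * k)
    if feas.status == 0:
        return "combination", feas.x
    # Alternative (a) fails: maximize <b, x> subject to A^T x <= 0 and <b, x> <= 1.
    cert = linprog(c=-b,
                   A_ub=np.vstack([A.T, b.reshape(1, -1)]),
                   b_ub=np.concatenate([np.zeros(k), [1.0]]),
                   bounds=[(None, None)] * n)
    return "certificate", cert.x

A = np.array([[1.0, 0.0],
              [0.0, 1.0]])                            # a_1 = e_1, a_2 = e_2
print(farkas_alternative(A, np.array([2.0, 3.0])))    # ('combination', [2., 3.])
print(farkas_alternative(A, np.array([-1.0, 0.5])))   # ('certificate', x) with <b, x> > 0
```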

Corollary 7.7. Let $c_1, \dots, c_k$, $a_1, \dots, a_l$, and $b$ be given vectors in $E$. The following statements are equivalent:

(a) $[\langle c_i, x\rangle = 0,\ i = 1, \dots, k,\ \ \langle a_j, x\rangle \le 0,\ j = 1, \dots, l] \implies [\langle b, x\rangle \le 0]$.
(b) There exist $t_i \in \mathbb{R}$ $(i = 1, \dots, k)$ and $s_j \ge 0$ $(j = 1, \dots, l)$ such that

$$b = \sum_{i=1}^{k} t_i c_i + \sum_{j=1}^{l} s_j a_j.$$

Proof. The equality $\langle c_i, x\rangle = 0$ is equivalent to the two inequalities $\langle c_i, x\rangle \le 0$ and $\langle -c_i, x\rangle \le 0$. By Farkas's lemma, part (a) is equivalent to
$$b \in \operatorname{cone}(c_1, \dots, c_k, -c_1, \dots, -c_k, a_1, \dots, a_l) =: L.$$
An arbitrary element $x \in L$ can be written as $x = \sum_{i=1}^{k} (\alpha_i - \beta_i) c_i + \sum_{j=1}^{l} s_j a_j$ with $\alpha_i \ge 0$, $\beta_i \ge 0$ $(i = 1, \dots, k)$, and $s_j \ge 0$ $(j = 1, \dots, l)$. Since $t_i = \alpha_i - \beta_i$ can be any real number, we see that parts (a) and (b) are equivalent. ⊓⊔
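Computationally, the proof's trick of writing $t_i = \alpha_i - \beta_i$ corresponds to allowing free multipliers for the equality constraints. A brief sketch of checking condition (b) as an LP feasibility problem, again with SciPy's `linprog` (our formulation, not the book's):

```python
import numpy as np
from scipy.optimize import linprog

def mixed_farkas_b(C, A, b):
    """Check part (b) of Corollary 7.7:  b = C t + A s  with t free and s >= 0,

    where the columns of C are c_1,...,c_k and the columns of A are a_1,...,a_l.
    The free multipliers t_i play the role of alpha_i - beta_i in the proof above.
    """
    k, l = C.shape[1], A.shape[1]
    res = linprog(c=np.zeros(k + l),
                  A_eq=np.hstack([C, A]), b_eq=b,
                  bounds=[(None, None)] * k + [(0, None)] * l)
    return (res.status == 0), (res.x if res.status == 0 else None)
```

When this check fails, statement (a) fails as well, and a witness $x$ can be recovered exactly as in the homogeneous sketch after Theorem 7.6.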

The following important result establishes the equivalence of finitely generated and polyhedral cones.

Theorem 7.8. Every finitely generated cone $K$ is a convex polyhedral cone, and vice versa.

Proof. We first show that every finitely generated cone $K$ is a polyhedral cone. Let $K = \operatorname{cone}(a_1, \dots, a_k) \subseteq E$ be a finitely generated cone. We prove the claim that $K$ is a polyhedral cone by induction on $k$. If $k = 1$, then $K = \{ta : t \ge 0\}$. If $a = 0 \in E$, then
$$K = \{0\} = \{x : x_i = 0,\ i = 1, \dots, n\} = \{x : \langle e_i, x\rangle = 0,\ i = 1, \dots, n\},$$
where $\{e_i\}_1^n$ is a basis of $E$, and each equation $\langle e_i, x\rangle = 0$ can be written as the two inequalities $\langle e_i, x\rangle \le 0$ and $\langle -e_i, x\rangle \le 0$. This proves that $K$ is a polyhedral cone. If $0 \ne a \in E$, then $K$ is a half-line. In this case, pick a basis $\{e_i\}_1^n$ of $E$ such that $e_1 = a$ and $\{e_i\}_2^n$ is a basis of $\{a\}^{\perp}$. Then we can write $K$ in the form
$$K = \{x : \langle -e_1, x\rangle \le 0,\ \langle e_i, x\rangle = 0,\ i = 2, \dots, n\},$$
proving that $K$ is again a polyhedral cone.

Supposing that the claim is proved for $k-1$, we will prove it for $k$. Let $K = \operatorname{cone}(a, a_1, \dots, a_{k-1})$, and define $K_1 := \operatorname{cone}(a_1, \dots, a_{k-1})$. Any $x \in K$ can be written as $x = y + ta$, where $y \in K_1$ and $t \ge 0$. This means that
$$K = \{x \in E : \exists\, t \ge 0,\ x - ta \in K_1\}.$$
By the induction hypothesis, there exist $\{b_j\}_1^m$ such that
$$K_1 = \{x \in E : \langle b_j, x\rangle \le 0,\ j = 1, \dots, m\}.$$
Consequently, we have
$$K = \{x \in E : \exists\, t \ge 0,\ \langle b_j, x - ta\rangle \le 0,\ j = 1, \dots, m\}
    = \{x \in E : \exists\, t \ge 0,\ \langle b_j, x\rangle \le t\langle b_j, a\rangle,\ j = 1, \dots, m\}. \tag{7.1}$$
We will write $K$ as a polyhedral cone by "eliminating" the variable $t$ in these inequalities. Define the index sets $I_+ := \{j : \langle b_j, a\rangle > 0\}$, $I_- := \{j : \langle b_j, a\rangle < 0\}$, and $I_0 := \{j : \langle b_j, a\rangle = 0\}$. The conditions in (7.1) can then be written as $\langle b_i, x\rangle/\langle b_i, a\rangle \le t$ for $i \in I_+$, $\langle b_j, x\rangle/\langle b_j, a\rangle \ge t$ for $j \in I_-$, and $\langle b_l, x\rangle \le 0$ for $l \in I_0$. Therefore,
$$K = \Big\{ x \in E : \exists\, t \ge 0,\ \frac{\langle b_i, x\rangle}{\langle b_i, a\rangle} \le t \le \frac{\langle b_j, x\rangle}{\langle b_j, a\rangle},\ \langle b_l, x\rangle \le 0,\ i \in I_+,\ j \in I_-,\ l \in I_0 \Big\}.$$

Clearly, a variable $t \ge 0$ exists above if and only if
$$\max_{i \in I_+} \frac{\langle b_i, x\rangle}{\langle b_i, a\rangle} \le \min_{j \in I_-} \frac{\langle b_j, x\rangle}{\langle b_j, a\rangle} \qquad \text{and} \qquad \min_{j \in I_-} \frac{\langle b_j, x\rangle}{\langle b_j, a\rangle} \ge 0.$$

Since $\langle b_j, a\rangle < 0$ for $j \in I_-$, the second inequality above is equivalent to the condition that $\langle b_j, x\rangle \le 0$ for all $j \in I_-$. It follows that
$$K = \Big\{ x \in E : \langle b_l, x\rangle \le 0,\ \frac{\langle b_i, x\rangle}{\langle b_i, a\rangle} \le \frac{\langle b_j, x\rangle}{\langle b_j, a\rangle},\ l \in I_- \cup I_0,\ i \in I_+,\ j \in I_- \Big\},$$
which proves that $K$ is a polyhedral cone.

Conversely, suppose that $K$ is a polyhedral cone. It follows from Theorem 7.4 that $K^*$ is finitely generated. The above argument shows that $K^*$ is polyhedral. Theorem 7.4 again implies that $K = (K^*)^*$ is finitely generated. ⊓⊔

Remark 7.9. The method of elimination of the variable $t$ from (7.1) is called the Fourier–Motzkin elimination method. It is a powerful tool that can be used to derive most theoretical results for systems of linear equalities and inequalities, including the derivation of (various forms of) Farkas's lemma; see [180]. The elimination method can also be used to solve systems of linear equalities and inequalities numerically. However, it is a very inefficient tool in this respect, since the elimination of a single variable typically leads to the creation of many additional equations and inequalities.
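To make the remark concrete, here is a minimal sketch of one Fourier–Motzkin step for a general system $Ax \le c$, assuming NumPy; the function name and the $(A, c)$ data layout are our own. Eliminating $t$ from (7.1) is the special case in which the right-hand side is zero and the eliminated column carries the coefficients of $t$.

```python
import numpy as np

def fm_eliminate(A, c, j):
    """One Fourier-Motzkin step: eliminate variable j from the system A x <= c.

    Returns (A_new, c_new), with column j dropped, describing the projection of
    {x : A x <= c} onto the remaining variables.
    """
    pos  = [i for i in range(A.shape[0]) if A[i, j] > 0]   # upper bounds on x_j
    neg  = [i for i in range(A.shape[0]) if A[i, j] < 0]   # lower bounds on x_j
    zero = [i for i in range(A.shape[0]) if A[i, j] == 0]  # rows without x_j

    rows, rhs = [], []
    for i in zero:                       # these inequalities survive unchanged
        rows.append(np.delete(A[i], j))
        rhs.append(c[i])
    for p in pos:                        # pair every upper bound with every lower bound
        for q in neg:
            # Scale so the x_j coefficients become +1 and -1, then add; x_j cancels.
            row = A[p] / A[p, j] + A[q] / (-A[q, j])
            rows.append(np.delete(row, j))
            rhs.append(c[p] / A[p, j] + c[q] / (-A[q, j]))
    if not rows:                         # no constraints remain after elimination
        return np.zeros((0, A.shape[1] - 1)), np.zeros(0)
    return np.array(rows), np.array(rhs)
```

The nested loop over $I_+ \times I_-$ pairs is exactly the source of the inefficiency mentioned above: a single elimination can replace $|I_+| + |I_-|$ inequalities by $|I_+| \cdot |I_-|$ new ones.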

Remark 7.10. A more general version of the elimination of variables idea applies to systems of polynomial equations and inequalities, and goes by the name Tarski–Seidenberg principle; see [34]. This is an indispensable theoretical tool in real algebraic geometry. Unfortunately, the Tarski–Seidenberg principle is also a very inefficient computational tool for solving systems of polynomial equations and inequalities, for the same reasons.

Remark 7.11. Another elimination procedure is at work in multilinear algebra. Let $V$ and $W$ be vector spaces over $\mathbb{R}$, and consider bilinear maps $f(v, w)$ from $V \times W$ into an arbitrary vector space $Z$. Thus, $f$ is a map that is linear in each of the variables separately, that is, $f(\alpha_1 v_1 + \alpha_2 v_2, w) = \alpha_1 f(v_1, w) + \alpha_2 f(v_2, w)$ and $f(v, \beta_1 w_1 + \beta_2 w_2) = \beta_1 f(v, w_1) + \beta_2 f(v, w_2)$. It is well known in multilinear algebra that the condition
$$\alpha_1 f(v_1, w_1) + \alpha_2 f(v_2, w_2) + \cdots + \alpha_n f(v_n, w_n) = 0 \quad \text{for all bilinear maps } f : V \times W \to Z \tag{7.2}$$
is equivalent to the condition that
$$\alpha_1 (v_1 \otimes w_1) + \alpha_2 (v_2 \otimes w_2) + \cdots + \alpha_n (v_n \otimes w_n) = 0.$$
Consequently, the elimination of the quantifier "for all $f$" in (7.2) leads to the concept of tensor products.
