In this section and the next, we prove separation theorems involving two or more convex sets in arbitrary vector spaces over R. We first prove separation theorems using a synthetic, algebraic framework, suggested in the works [167, 168, 169, 177, 17, 18], and especially in the charming book [63], because this approach brings out the basic ideas behind the separation theorems most clearly, and gives the most general results. Moreover, this approach effectively isolates the role of topological considerations in separation theorems, and makes it possible to prove topological separation theorems with relative ease.
Readers who are not interested in infinite-dimensional vector spaces may skip this section and the next two without any loss of continuity.
We start by defining several relevant concepts.
Definition 6.24. Let E be a real vector space. A hyperplane H in E is a level set of a nontrivial linear functional ℓ : E → R, that is,
H = {x ∈ E : ℓ(x) = α} for some α ∈ R.
The hyperplane H partitions E into two half-spaces; in the definitions below, “closed” and “open” are algebraic concepts and do not refer to any topology on E.
Definition 6.25. An algebraically closed half-space in E is a set either of the form
H̄⁺(ℓ,α) := {x ∈ E : ℓ(x) ≥ α},
or of the form
H̄⁻(ℓ,α) := {x ∈ E : ℓ(x) ≤ α},
where ℓ is a nonzero linear functional on E and α ∈ R.
Similarly, an algebraically open half-space in E is a set either of the form
H⁺(ℓ,α) := {x ∈ E : ℓ(x) > α},
or of the form
H⁻(ℓ,α) := {x ∈ E : ℓ(x) < α}.
Definition 6.26. Let C and D be two nonempty sets, and H := H(ℓ,α) a hyperplane in a vector space E.
H is called a separating hyperplane for the sets C and D if C is contained in one of the algebraically closed half-spaces determined by H and D in the other, say C ⊆ H̄⁺(ℓ,α) and D ⊆ H̄⁻(ℓ,α).
H is called a strictly separating hyperplane for the sets C and D if C is contained in one of the algebraically open half-spaces determined by H and D in the other, say C ⊆ H⁺(ℓ,α) and D ⊆ H⁻(ℓ,α).
H is called a strongly separating hyperplane for the sets C and D if there exist β and γ satisfying γ < α < β, such that C ⊆ H̄⁺(ℓ,β) and D ⊆ H̄⁻(ℓ,γ).
H is called a properly separating hyperplane for the sets C and D if H separates C and D, and C and D are not both contained in the hyperplane H.
If there exists a hyperplane H separating the sets C and D in one of the senses above, we say that C and D can be separated, strictly separated, strongly separated, properly separated, respectively.
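The four notions of Definition 6.26 can be checked mechanically when C and D are finite point sets in Rⁿ. The following is a minimal sketch under that assumption; the function names `ell` and `separation_kinds` and the sample points are hypothetical and serve only as illustration.

```python
from fractions import Fraction

def ell(coeffs, x):
    """Evaluate the linear functional l(x) = sum_i coeffs[i] * x[i]."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(coeffs, x))

def separation_kinds(coeffs, alpha, C, D):
    """Senses in which H = {x : l(x) = alpha} separates the finite sets C and D."""
    c_vals = [ell(coeffs, x) for x in C]
    d_vals = [ell(coeffs, x) for x in D]
    kinds = set()
    # Try both orientations: (C above, D below) and (D above, C below).
    for upper, lower in ((c_vals, d_vals), (d_vals, c_vals)):
        lo, hi = min(upper), max(lower)
        if lo >= alpha >= hi:
            kinds.add("separates")
            # Proper separation: the two sets are not both contained in H.
            if any(v != alpha for v in c_vals + d_vals):
                kinds.add("properly")
        if lo > alpha > hi:
            kinds.add("strictly")
            # For finite sets, strict separation already gives strong
            # separation, with beta = lo and gamma = hi.
            kinds.add("strongly")
    return kinds

# The hyperplane x1 = 0 touches D at (0, 3): separation, but not strict.
touching = separation_kinds((1, 0), 0, [(1, 0), (2, 1)], [(-1, 0), (0, 3)])
# Both sets lie inside the hyperplane x1 = 0: separation, but not proper.
degenerate = separation_kinds((1, 0), 0, [(0, 1)], [(0, -1)])
```

For infinite sets the minimum and maximum must be replaced by an infimum and a supremum, and strict separation no longer implies strong separation.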
As in the finite-dimensional case, hyperplanes are proper, maximal affine subsets. However, when E is a topological vector space, it is no longer true that every hyperplane is necessarily closed.
Lemma 6.27. Let E be a real vector space. A set H ⊂ E is a hyperplane if and only if H is a proper maximal affine subset of E.
Moreover, if E is a topological vector space, then the hyperplane H(ℓ,α) is closed if and only if ℓ is a continuous linear functional.
Proof. Clearly, a hyperplane H(ℓ,α) is a proper affine subset of E. To prove the maximality of H, we may assume after a translation that α = 0, so that H is a linear subspace. If a ∈ E \ H, then ℓ(a) ≠ 0, so that if x ∈ E, we have ℓ(x) = ℓ((ℓ(x)/ℓ(a))a), that is, x − (ℓ(x)/ℓ(a))a ∈ H, proving that E = span{H, a}.
Conversely, suppose that H is a proper maximal affine subset of E. Assume without loss of generality that H is a linear subspace of E. If a ∈ E \ H, then E = span{H, a} by maximality, so that every x ∈ E has a representation x = u + ta, where u ∈ H and t ∈ R. This representation is unique, since x = u₁ + t₁a = u₂ + t₂a implies that u₂ − u₁ = (t₁ − t₂)a ∈ H ∩ span({a}) = {0}, that is, u₂ = u₁ and t₂ = t₁. Define
ℓ(x) = t, where x = u + ta, u ∈ H, t ∈ R,
which is easily shown to be a linear functional. Clearly, H = H(ℓ,0), proving that H is a hyperplane.
Now suppose that E is a topological vector space. If ℓ is continuous, it is clear that H(ℓ,α) is a (topologically) closed set. Conversely, if H := H(ℓ,α) is closed, we claim that ℓ is continuous. Pick a point x in the complement of H, which is an open set. There exists an open neighborhood N of the origin such that x + N ⊆ E \ H. We may assume that N is a symmetric neighborhood, that is, N = −N: since (t, x) ↦ tx is continuous, there exist δ > 0 and a neighborhood W of the origin such that tW ⊆ N for all |t| < δ.
The set N̄ := ∪_{|t|<δ} tW ⊆ N is clearly a symmetric neighborhood of the origin satisfying sN̄ ⊆ N̄ for |s| ≤ 1, and we replace N by N̄. If ℓ(N) is unbounded, then it is easy to see that ℓ(N) = R, so that there exists y ∈ N satisfying ℓ(y) = α − ℓ(x), which gives the contradiction x + y ∈ (x + N) ∩ H = ∅. Therefore, ℓ(N) is bounded, say |ℓ(z)| ≤ M for z ∈ N, where M > 0. The continuity of ℓ follows, because given ε > 0, we have |ℓ(z)| ≤ ε for every z ∈ (ε/M)N. □
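The algebraic part of the proof of Lemma 6.27 rests on the decomposition x = u + ta with u ∈ H = H(ℓ,0) and t = ℓ(x)/ℓ(a). A minimal sketch in R³, assuming a hypothetical functional ℓ(x) = x₁ + 2x₂ − x₃ and a hypothetical point a outside its kernel, makes the computation concrete.

```python
from fractions import Fraction

def ell(coeffs, x):
    """Evaluate the linear functional l(x) = sum_i coeffs[i] * x[i]."""
    return sum(Fraction(c) * Fraction(xi) for c, xi in zip(coeffs, x))

def decompose(coeffs, a, x):
    """Write x = u + t*a with l(u) = 0, assuming l(a) != 0."""
    la = ell(coeffs, a)
    if la == 0:
        raise ValueError("a must lie outside the kernel of l")
    t = ell(coeffs, x) / la                      # t = l(x) / l(a)
    u = tuple(Fraction(xi) - t * Fraction(ai) for xi, ai in zip(x, a))
    return u, t

coeffs = (1, 2, -1)   # hypothetical functional l(x) = x1 + 2*x2 - x3
a = (1, 0, 0)         # l(a) = 1, so a is not in H = ker l
u, t = decompose(coeffs, a, (3, 1, 4))
```

Exact rational arithmetic is used so that the check ℓ(u) = 0 is not affected by rounding.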
Note that the proof above also establishes the following result.
Corollary 6.28. Let E be a real topological vector space, and H := H(ℓ,α) a hyperplane. The linear functional ℓ is continuous if and only if one of the half-spaces H⁺, H⁻ contains a nonempty open set.
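A standard example shows that a hyperplane in a topological vector space need not be closed. The sketch below is not taken from the text: it assumes that E is the space c₀₀ of finitely supported real sequences with the sup norm, and the functional ℓ and the points x⁽ᵏ⁾ are illustrative choices.

```latex
\[
E = c_{00}, \qquad \|x\|_\infty = \max_k |x_k|, \qquad
\ell(x) = \sum_{k \ge 1} k\, x_k .
\]
The functional $\ell$ is linear and nonzero, but it is unbounded on the unit ball:
\[
\|e_k\|_\infty = 1, \qquad \ell(e_k) = k \to \infty .
\]
Its kernel $H = H(\ell, 0)$ is not closed. Indeed, the points
\[
x^{(k)} = \tfrac{1}{k}\, e_k - e_1 \qquad (k \ge 2)
\]
satisfy $\ell(x^{(k)}) = 1 - 1 = 0$, so that $x^{(k)} \in H$, while
\[
\|x^{(k)} - (-e_1)\|_\infty = \tfrac{1}{k} \to 0,
\qquad \ell(-e_1) = -1 \ne 0 .
\]
Thus $-e_1$ is a limit of points of $H$ that does not belong to $H$.
```

This is consistent with Lemma 6.27: the hyperplane fails to be closed exactly because ℓ is discontinuous.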
Some form of Zorn’s lemma is needed to prove separation theorems in general vector spaces. Recall that a partial order on a set X is a reflexive, antisymmetric, and transitive relation ⪯ on X, that is, for x, y, z ∈ X, we have
(a) x ⪯ x,
(b) x ⪯ y, y ⪯ x ⟹ x = y,
(c) x ⪯ y, y ⪯ z ⟹ x ⪯ z.
A subset Y ⊆ X is called totally ordered if any two elements x, y ∈ Y can be compared, that is, either x ⪯ y or y ⪯ x. An upper bound of a set Z ⊆ X is a point x ∈ X such that z ⪯ x for every z ∈ Z. A maximal element of a partially ordered set X is a point x ∈ X such that x ⪯ z implies that z = x.
Lemma 6.29. (Zorn’s lemma) A partially ordered set has a maximal element if every totally ordered subset of it has an upper bound.
Zorn’s lemma is a basic axiom of set theory equivalent to the axiom of choice or the well-ordering principle; see for example [125] for more details.
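The difference between a maximal element and an upper bound is easy to see on a finite example. The sketch below assumes the hypothetical poset {2, …, 10} ordered by divisibility; in a finite poset the existence of maximal elements is elementary, so this illustrates only the definitions, not the strength of Zorn’s lemma.

```python
def is_partial_order(X, leq):
    """Check reflexivity, antisymmetry, and transitivity of leq on X."""
    reflexive = all(leq(x, x) for x in X)
    antisymmetric = all(x == y for x in X for y in X
                        if leq(x, y) and leq(y, x))
    transitive = all(leq(x, z) for x in X for y in X for z in X
                     if leq(x, y) and leq(y, z))
    return reflexive and antisymmetric and transitive

def maximal_elements(X, leq):
    """Elements x of X such that leq(x, z) implies z == x."""
    return [x for x in X if all(z == x for z in X if leq(x, z))]

def divides(x, y):
    """The relation x <= y meaning 'x divides y'."""
    return y % x == 0

X = list(range(2, 11))
maximal = maximal_elements(X, divides)
```

Here there are several maximal elements and no single upper bound for all of X, while every chain such as {2, 4, 8} has an upper bound, namely its largest member.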
A pair of nonempty convex sets C and D satisfying C ∩ D = ∅ and C ∪ D = E are called complementary convex sets. The following result essentially goes back to [152] and [248].
Lemma 6.30. If A and B are two nonempty, disjoint convex sets in a vector space E, then there exist complementary convex sets C and D in E such that A ⊆ C and B ⊆ D.
Proof. Let 𝒞 be the set of pairs (C, D) of disjoint convex subsets of E such that A ⊆ C and B ⊆ D. We order 𝒞 by inclusion, that is, we declare (C, D) ⪯ (C′, D′) if C ⊆ C′ and D ⊆ D′. It is evident that ⪯ is a partial order relation on 𝒞. Moreover, if 𝒟 ⊂ 𝒞 is any totally ordered subset, then the componentwise union of the pairs in 𝒟 is a pair of disjoint convex sets that is an upper bound for 𝒟. Thus, Zorn’s lemma applies, and there exists a maximal element (C, D) ∈ 𝒞, that is, C and D are disjoint convex sets satisfying A ⊆ C and B ⊆ D, and whenever C′ and D′ are disjoint convex sets satisfying C ⊆ C′ and D ⊆ D′, then C′ = C and D′ = D.
We claim that C ∪ D = E. If this is not true, pick a point x ∈ E \ (C ∪ D).
Since (C, D) is a maximal pair, we have co({x} ∪ C) ∩ D ≠ ∅ and co({x} ∪ D) ∩ C ≠ ∅; otherwise (co({x} ∪ C), D) or (C, co({x} ∪ D)) would be a strictly larger pair in 𝒞. Let y₁ ∈ co({x} ∪ D) ∩ C and y₂ ∈ co({x} ∪ C) ∩ D; then there exist x₂ ∈ D such that y₁ ∈ (x, x₂), and x₁ ∈ C such that y₂ ∈ (x, x₁); see Figure 6.5. But the intersection point z of the line segments [x₁, y₁] ⊆ C and [x₂, y₂] ⊆ D belongs to both C and D, a contradiction. This proves the claim and the lemma. □
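The contradiction in the proof of Lemma 6.30 comes from the crossing point z of the segments [x₁, y₁] and [x₂, y₂]. A minimal planar sketch, assuming hypothetical coordinates for x, x₁, x₂ and taking y₁, y₂ to be midpoints, computes z and confirms that it lies strictly inside both segments.

```python
from fractions import Fraction as F

def point_on(p, q, t):
    """Return the point p + t*(q - p)."""
    return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))

def segment_crossing(p, q, r, s):
    """Solve p + a*(q - p) = r + b*(s - r) in the plane; None if parallel."""
    d1 = (q[0] - p[0], q[1] - p[1])
    d2 = (s[0] - r[0], s[1] - r[1])
    rhs = (r[0] - p[0], r[1] - p[1])
    det = d2[0] * d1[1] - d1[0] * d2[1]          # Cramer's rule determinant
    if det == 0:
        return None
    a = (d2[0] * rhs[1] - rhs[0] * d2[1]) / det
    b = (d1[0] * rhs[1] - d1[1] * rhs[0]) / det
    return point_on(p, q, a), a, b

# Hypothetical configuration: x outside C and D, x1 in C, x2 in D.
x, x1, x2 = (F(0), F(0)), (F(1), F(0)), (F(0), F(1))
y1 = point_on(x, x2, F(1, 2))   # y1 in (x, x2), assumed to lie in C
y2 = point_on(x, x1, F(1, 2))   # y2 in (x, x1), assumed to lie in D
z, a, b = segment_crossing(x1, y1, x2, y2)
```

Since both parameters a and b lie strictly between 0 and 1, the point z belongs to [x₁, y₁] and to [x₂, y₂], which is exactly what the convexity of C and D forbids.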
Fig. 6.5. (Figure omitted; it shows the points x, x₁, x₂, y₁, y₂, and z from the proof of Lemma 6.30.)
Lemma 6.31. Let (C, D) be complementary convex sets in a vector space E.
The set
L := ac(C) ∩ ac(D)
is either a hyperplane in E or the whole space E.
Moreover,
(a) L = E if and only if ai(C) = ai(D) = ∅, or equivalently if and only if ac(C) = ac(D) = E.
(b) If L is a hyperplane, then the sets ai(C) and ai(D) are both nonempty, and the pairs (ai(C), ai(D)) and (ac(C), ac(D)) are the algebraically open and closed half-spaces associated with L, respectively.
Proof. The set L is convex, because ac(C) and ac(D) are convex sets. The set L is not empty: let x ∈ C and y ∈ D; there exists a point w ∈ [x, y] such that [x, w) ⊆ C and [y, w) ⊆ D, implying w ∈ ac(C) ∩ ac(D), so that L ≠ ∅.
First, we claim that
ac(C) = E \ ai(D).
If x ∉ ai(D), then there exists u ∈ E such that any point v ∈ E satisfying x ∈ (u, v) has the property that (x, v] ⊆ E \ D = C; thus, x ∈ ac(C), and we have proved that ac(C) ∪ ai(D) = E. The sets ac(C) and ai(D) are disjoint: if x ∈ ac(C) ∩ ai(D), then there exists u ∈ C such that [u, x) ⊆ C. Let v be a point such that x ∈ (u, v); either v ∈ C, in which case [u, v] ⊂ C and x ∈ ai(D) ⊆ D, which is impossible, or v ∈ D and then [u, x) intersects D, since x ∈ ai(D), which is also impossible. Our claim is proved, and we have
ac(C) = E \ ai(D), and ac(D) = E \ ai(C). (6.9)
It follows immediately that L = E if and only if ai(C) = ai(D) = ∅, or equivalently if and only if ac(C) = ac(D) = E, proving (a).
It is now easy to show that L is an affine set. If x, y ∈ L and z satisfies y ∈ (x, z) and z ∉ L = ac(C) ∩ ac(D), then z ∉ ac(C), say; but then z ∈ ai(D), and since x ∈ ac(D), we must have y ∈ ai(D) by Lemma 5.5, which contradicts the assumption that y ∈ ac(C). Thus, z ∈ L, and L is an affine set.
Finally, assume that L ≠ E. By (a), at least one of ai(C), ai(D) is nonempty, say ai(C) ≠ ∅. Pick p ∈ ai(C) = E \ ac(D), so that p ∉ L; see Figure 6.6. To prove that L is a hyperplane, it suffices to show that E = aff({p} ∪ L). Pick a point r ∈ L and consider a point q = p + α(r − p) with α > 1, that is, r ∈ (p, q). We must have q ∈ ai(D), because otherwise q ∈ ac(C) and r ∈ ai(C) = E \ ac(D) by Lemma 5.5, contradicting r ∈ L. If x ∈ C \ L is an arbitrary point, then the segment [x, q] must intersect L; in fact, the point w ∈ (x, q) satisfying [x, w) ⊆ C and [q, w) ⊆ ai(D) must lie on L, because w ∈ ac(C) and w ∈ ac(ai(D)) = ac(D). Since q ∈ aff({p} ∪ L) and x lies on the line through q and w, this proves that x ∈ aff({p} ∪ L).
Similarly, if y ∈ D \ L is an arbitrary point, then y ∈ aff({p} ∪ L). Altogether, we have proved that E = aff({p} ∪ L), that is, L is a hyperplane.
It follows from (6.9) that the sets ai(C), L, ai(D) are disjoint and their union is E. This proves (b). □
Fig. 6.6. (Figure omitted; it shows the hyperplane L and the points p, q, r, w, x, and y from the proof of Lemma 6.31.)
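Case (a) of Lemma 6.31 can occur only in infinite dimensions. The following sketch of a well-known example is not taken from the text; it assumes that E is the space c₀₀ of finitely supported real sequences, and the notation λ(x) is introduced here only for illustration.

```latex
Let $E = c_{00}$ and, for $x \ne 0$, let $\lambda(x)$ denote the last nonzero
coordinate of $x$. Put
\[
C = \{x \in E : x \ne 0,\ \lambda(x) > 0\}, \qquad
D = \{0\} \cup \{x \in E : x \ne 0,\ \lambda(x) < 0\}.
\]
Both sets are convex, $C \cap D = \emptyset$, and $C \cup D = E$.

If $x \in C$ and $n$ is the index of its last nonzero coordinate, then
\[
x - t\, e_{n+1} \in D \quad \text{for every } t > 0,
\]
so $x \notin \operatorname{ai}(C)$. If $x \in D$ and $n$ exceeds every index in
the support of $x$, then
\[
x + t\, e_{n+1} \in C \quad \text{for every } t > 0,
\]
so $x \notin \operatorname{ai}(D)$. Hence
$\operatorname{ai}(C) = \operatorname{ai}(D) = \emptyset$, and (6.9) gives
\[
\operatorname{ac}(C) = \operatorname{ac}(D) = E, \qquad
L = \operatorname{ac}(C) \cap \operatorname{ac}(D) = E .
\]
```

In this example no hyperplane separates C and D, although the two sets are disjoint and convex.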
Theorem 6.32. Let C and D be nonempty convex sets in a vector space E such that ai(C) ≠ ∅. Then there exists a hyperplane H separating C and D if and only if ai(C) ∩ D = ∅, in which case ai(C) lies in one of the algebraically open half-spaces associated with H.
Proof. Suppose that the hyperplane H separates C and D, such that C ⊆ H̄⁺ and D ⊆ H̄⁻. The set C cannot lie on H, since aff(C) = E; hence there exists a point y ∈ C ∩ H⁺. We must have ai(C) ⊆ H⁺, because if there is a point x ∈ ai(C) ∩ H, then there exists a point z ∈ C such that x ∈ (y, z); since z ∈ C ∩ H⁻, this gives a contradiction. Since D ⊆ H̄⁻, it follows that ai(C) ∩ D = ∅.
Conversely, if ai(C) ∩ D = ∅, then Lemma 6.30 implies that there exist complementary convex sets (C̃, D̃) such that ai(C) ⊆ C̃ and D ⊆ D̃. We claim that ai(C) ⊆ ai(C̃), so that ai(C̃) ≠ ∅. If x ∈ ai(C), then for any y ∈ E, there exists u ∈ C such that x ∈ (u, y). Since [x, u) ⊆ ai(C) by Lemma 5.5, we may assume that u ∈ ai(C) ⊆ C̃; this proves the claim. Lemma 6.31 implies that H := ac(C̃) ∩ ac(D̃) is a hyperplane that separates C̃ and D̃, hence ai(C) and D. Supposing ai(C) ⊆ H̄⁺ and D ⊆ H̄⁻, we must have C ⊆ H̄⁺, because if x ∈ C ∩ H⁻ and y ∈ ai(C) ⊆ H̄⁺, then Lemma 5.5 implies that (y, x) contains a point z ∈ ai(C) ∩ H⁻ = ∅, a contradiction.
We have proved that H separates C and D. □
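A finite-dimensional instance shows why Theorem 6.32 uses ai(C) ∩ D = ∅ rather than C ∩ D = ∅. The sets below are illustrative choices in R², not taken from the text.

```latex
\[
C = \{x \in \mathbb{R}^2 : x_1^2 + x_2^2 \le 1\}, \qquad
D = \{(1, t) : t \in \mathbb{R}\}.
\]
Here $C \cap D = \{(1, 0)\} \ne \emptyset$, but
\[
\operatorname{ai}(C) = \{x : x_1^2 + x_2^2 < 1\}, \qquad
\operatorname{ai}(C) \cap D = \emptyset .
\]
The hyperplane $H = \{x : x_1 = 1\}$ separates $C$ and $D$:
\[
x_1 \le 1 \ \text{on } C, \qquad x_1 = 1 \ \text{on } D, \qquad
x_1 < 1 \ \text{on } \operatorname{ai}(C).
\]
```

The sets C and D cannot be strictly separated, since they share the point (1, 0).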
Theorem 6.33. (Proper separation theorem) Let C and D be nonempty convex sets in a vector space E, such that rai(C) ≠ ∅ and rai(D) ≠ ∅.
Then, there exists a hyperplane H properly separating C and D if and only if rai(C) ∩ rai(D) = ∅.
Proof. The proof is partly the same as the proof of Theorem 6.15. Define the convex set K := C − D. It follows from Lemma 5.11 that rai(K) = rai(C − D) = rai(C) − rai(D); thus, rai(C) ∩ rai(D) = ∅ and 0 ∉ rai(K) are equivalent statements. Lemma 6.14 holds in arbitrary vector spaces, as is evident from its proof. Thus, the proof of the theorem reduces to establishing the fact that the sets {0} and K are properly separable if and only if 0 ∉ rai(K).
Suppose that the origin and K are properly separated by a hyperplane H, such that 0 ∈ H̄⁻ and K ⊆ H̄⁺. We claim that 0 ∉ rai(K). If 0 ∉ H, then rai(K) ⊆ H̄⁺, so that 0 ∉ rai(K). Otherwise, 0 ∈ H and there exists a point x ∈ K \ H. If we had 0 ∈ rai(K), there would exist a point y ∈ K such that 0 ∈ (x, y), giving the contradiction y ∈ K ∩ H⁻ = ∅. This proves the claim.
To prove the converse implication, suppose that 0 ∉ rai(K), and let L := aff(K) be the affine hull of K. If 0 ∉ L, we will show that {0} and L can be properly separated; this will imply that {0} and K can be properly separated.
Consider the set of all affine subsets of E that contain L but not 0, partially ordered by set inclusion. Zorn’s lemma guarantees the existence of a maximal affine subspace H containing L but not 0. We claim that H is a hyperplane; otherwise, H′ := aff({0} ∪ H) ≠ E, and if we pick x ∈ E \ H′, then the affine set aff({x} ∪ H) strictly includes H but does not include 0, contradicting the maximality of H. This proves the claim. It is clear that H properly separates {0} and L.
If 0 ∈ L, we apply Theorem 6.32 within the vector space L to the sets {0} and K, and obtain a hyperplane P in L separating 0 and K such that rai(K) ⊆ P⁺; see Figure 6.3. We may again assume that 0 ∈ P (otherwise the translation of P so that it passes through the origin also satisfies the same separation properties). Zorn’s lemma implies that there exists a maximal linear subspace H of E extending P and satisfying P = H ∩ L. We claim that H is a hyperplane in E. Otherwise, H + L ≠ E, because L = span{P, l₀} for any l₀ ∈ L \ P, so that H + L = span{H, l₀}. Pick x ∈ E \ (H + L) and form the linear subspace H′ := span{x, H}, which strictly contains H. We have H′ ∩ L = P: any y ∈ H′ can be written as y = αx + h with α ∈ R and h ∈ H; if y ∈ L, then αx = y − h ∈ H + L, which implies that α = 0 and y = h ∈ H ∩ L = P. This proves that H′ ∩ L = P (the inclusion P ⊆ H′ ∩ L is trivial), contradicting the maximality of H.
We have proved that H is a hyperplane; clearly H properly separates {0} and K. □
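Two small planar examples, chosen here for illustration and not taken from the text, show both directions of the criterion rai(C) ∩ rai(D) = ∅ in Theorem 6.33.

```latex
First, let
\[
C_1 = [0,1] \times \{0\}, \qquad D_1 = [-1,0] \times \{0\}.
\]
Then $C_1 \cap D_1 = \{0\}$, but
$\operatorname{rai}(C_1) = (0,1) \times \{0\}$ and
$\operatorname{rai}(D_1) = (-1,0) \times \{0\}$ are disjoint.
The hyperplane $H = \{x : x_1 = 0\}$ separates the sets properly:
$x_1 \ge 0$ on $C_1$, $x_1 \le 0$ on $D_1$, and $C_1 \not\subseteq H$.
By contrast, $\{x : x_2 = 0\}$ separates $C_1$ and $D_1$ but contains both
sets, so this separation is not proper.

Second, let
\[
C_2 = [-1,1] \times \{0\}, \qquad D_2 = \{0\} \times [-1,1].
\]
Now $0 \in \operatorname{rai}(C_2) \cap \operatorname{rai}(D_2)$.
If $\ell \ge \alpha$ on $C_2$ and $\ell \le \alpha$ on $D_2$, then
$0 \in C_2 \cap D_2$ forces $\alpha = 0$; evaluating at $\pm e_1$ and
$\pm e_2$ gives $\ell(e_1) = \ell(e_2) = 0$, that is, $\ell = 0$.
Hence no hyperplane separates $C_2$ and $D_2$, properly or otherwise.
```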
Theorem 6.34. Let C be a nonempty convex set in a vector space E. If M is an affine set such that rai(C) ∩ M = ∅, then there exists a hyperplane H extending M such that rai(C) ∩ H = ∅.
Proof. The proof of the theorem is the same as the proof of Theorem 6.17 except that we replace ri(C) in that proof by rai(C) and invoke Theorem 6.33 instead of Theorem 6.15. □