Chapter 3. Qualitative Properties of the Minimum Sum-of-Squares

3.5 Stability Properties

x := (x¹, x²) is not a local solution of the MSSC problem under consideration.

The above analysis shows that the k-means algorithm is very sensitive to the choice of starting centroids. The algorithm may return a global solution, a local-nonglobal solution, or a centroid system that is not even a local solution. In other words, the quality of the obtained result depends greatly on the initial centroid system.
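The sensitivity to the starting centroids can be reproduced on a toy instance. The sketch below is only an illustration: the four data points, the two starting systems, and the helper names `kmeans` and `objective` are hypothetical and are not taken from Example 3.1. It runs the standard assign-then-average iteration and compares the objective value reached from each start.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

def objective(centroids, data):
    """Mean over the data points of the squared distance to the nearest centroid."""
    return sum(min(sq_dist(a, x) for x in centroids) for a in data) / len(data)

def kmeans(data, centroids, max_iter=100):
    """Plain k-means iteration; a centroid that attracts no point is left unchanged."""
    centroids = [tuple(x) for x in centroids]
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for a in data:
            j = min(range(len(centroids)), key=lambda q: sq_dist(a, centroids[q]))
            clusters[j].append(a)
        new = []
        for x, members in zip(centroids, clusters):
            if members:
                new.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new.append(x)
        if new == centroids:  # fixed point of the iteration reached
            break
        centroids = new
    return centroids

# Hypothetical data: the corners of a 4-by-1 rectangle, k = 2.
data = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
good = kmeans(data, [(0.0, 0.5), (4.0, 0.5)])  # start splits left / right
bad = kmeans(data, [(2.0, 0.0), (2.0, 1.0)])   # start splits bottom / top

print(objective(good, data))  # 0.25
print(objective(bad, data))   # 4.0
```

Both starting systems are fixed points of the iteration, yet the second one has an objective value sixteen times larger than the first.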

Figure 3.1: The centroids in item (e) of Example 3.1

Let us abbreviate the local solution set of (3.2) to $F_1(a)$. Note that the inclusion $F(a) \subset F_1(a)$ is valid, and it may be strict.

Definition 3.3 A family $\{I(j) \mid j \in J\}$ of pairwise distinct, nonempty subsets of $I$ is said to be a partition of $I$ if $\bigcup_{j \in J} I(j) = I$.

From now on, let $\bar a = (\bar a^1, \dots, \bar a^m) \in \mathbb{R}^{nm}$ be a fixed vector with the property that $\bar a^1, \dots, \bar a^m$ are pairwise distinct.

Theorem 3.5 (Local Lipschitz property of the optimal value function) The optimal value function $v : \mathbb{R}^{nm} \to \mathbb{R}$ is locally Lipschitz at $\bar a$, i.e., there exist $L_0 > 0$ and $\delta_0 > 0$ such that

$$|v(a) - v(a')| \le L_0 \|a - a'\|$$

for all $a$ and $a'$ satisfying $\|a - \bar a\| < \delta_0$ and $\|a' - \bar a\| < \delta_0$.

Proof. Denote by $\Omega$ the set of all the partitions of $I$. Every element $\omega$ of $\Omega$ is a family $\{I_\omega(j) \mid j \in J\}$ of pairwise distinct, nonempty subsets of $I$ with $\bigcup_{j \in J} I_\omega(j) = I$. We associate to each pair $(\omega, a)$, where $a = (a^1, \dots, a^m) \in \mathbb{R}^{nm}$ and $\omega \in \Omega$, a vector $x_\omega(a) = (x^1_\omega(a), \dots, x^k_\omega(a)) \in \mathbb{R}^{nk}$ with

$$x^j_\omega(a) = \frac{1}{|I_\omega(j)|} \sum_{i \in I_\omega(j)} a^i \tag{3.32}$$

for every $j \in J$. By Theorem 3.1, problem (3.2) has solutions and the number of the global solutions is finite, i.e., $F(\bar a)$ is nonempty and finite. Moreover, for each $\bar x = (\bar x^1, \dots, \bar x^k) \in F(\bar a)$, one can find some $\omega \in \Omega$ satisfying $\bar x^j = x^j_\omega(\bar a)$ for all $j \in J$. Let $\Omega_1 = \{\omega_1, \dots, \omega_r\}$ be the set of the elements of $\Omega$ corresponding to the global solutions. Then,

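$$f(x_{\omega_1}(\bar a), \bar a) < f(x_\omega(\bar a), \bar a) \quad (\forall \omega \in \Omega \setminus \Omega_1), \tag{3.33}$$

where

$$f(x, a) = \frac{1}{m} \sum_{i \in I} \min_{j \in J} \|a^i - x^j\|^2. \tag{3.34}$$

For each pair $(i, j) \in I \times J$, the rule $(x, a) \mapsto \|a^i - x^j\|^2$ defines a polynomial function on $\mathbb{R}^{nk} \times \mathbb{R}^{nm}$. In particular, this function is locally Lipschitz on its domain. So, by [20, Prop. 2.3.6 and 2.3.12] we can assert that the function $f(x, a)$ in (3.34) is locally Lipschitz on $\mathbb{R}^{nk} \times \mathbb{R}^{nm}$.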
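The two objects used so far, the centroid system attached to a partition in (3.32) and the objective function in (3.34), are easy to evaluate directly. The following minimal sketch assumes data points are stored as tuples and that a partition is a list of index lists with 0-based indices; the names `centroid_system` and `f` and the sample data are illustrative choices, not notation from the text.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

def centroid_system(partition, a):
    """Formula (3.32): the j-th centroid is the mean of the points a[i] with i in partition[j]."""
    n = len(a[0])
    return [tuple(sum(a[i][c] for i in block) / len(block) for c in range(n))
            for block in partition]

def f(x, a):
    """Formula (3.34): mean squared distance from each data point to its nearest centroid."""
    return sum(min(sq_dist(ai, xj) for xj in x) for ai in a) / len(a)

# Hypothetical data: m = 3 points in the plane, k = 2, partition {{0, 1}, {2}}.
a = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
omega = [[0, 1], [2]]

x = centroid_system(omega, a)
print(x)        # [(1.0, 0.0), (10.0, 0.0)]
print(f(x, a))  # equals 2/3 up to rounding
```

Composing the two functions gives the value f(x_ω(a), a) of the objective at the centroid system generated by the partition ω.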

Now, observe that for any $\omega \in \Omega$ and $j \in J$, the vector function $x^j_\omega(\cdot)$ in (3.32), which maps $\mathbb{R}^{nm}$ to $\mathbb{R}^n$, is continuously differentiable. In particular, it is locally Lipschitz on $\mathbb{R}^{nm}$.

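For every $\omega \in \Omega$, from the above observations we can deduce that the function $g_\omega(a) := f(x_\omega(a), a)$ is locally Lipschitz on $\mathbb{R}^{nm}$. Rewriting (3.33) as

$$g_{\omega_1}(\bar a) < g_\omega(\bar a) \quad (\forall \omega \in \Omega \setminus \Omega_1)$$

and using the continuity of the functions $g_\omega(\cdot)$ together with the finiteness of $\Omega$, we can find a number $\delta_0 > 0$ such that

$$g_{\omega_1}(a) < g_\omega(a) \quad (\forall \omega \in \Omega \setminus \Omega_1) \tag{3.35}$$

for all $a$ satisfying $\|a - \bar a\| < \delta_0$. Since $\bar a^1, \dots, \bar a^m$ are pairwise distinct, without loss of generality, we may assume that $a^1, \dots, a^m$ are pairwise distinct for any $a = (a^1, \dots, a^m)$ with $\|a - \bar a\| < \delta_0$.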

Now, consider a vector $a = (a^1, \dots, a^m)$ satisfying $\|a - \bar a\| < \delta_0$. By (3.35), $f(x_{\omega_1}(a), a) < f(x_\omega(a), a)$ for all $\omega \in \Omega \setminus \Omega_1$. Since $f(\cdot, a)$ is the objective function of (3.2), this implies that the set $\{x_\omega(a) \mid \omega \in \Omega \setminus \Omega_1\}$ does not contain any global solution of the problem. Thanks to Theorem 3.1, we know that the global solution set $F(a)$ of (3.2) is contained in the set

$$\{x_\omega(a) \mid \omega \in \Omega_1\}.$$

Hence,

$$F(a) \subset \{x_\omega(a) \mid \omega \in \Omega_1\} = \{x_{\omega_1}(a), \dots, x_{\omega_r}(a)\}. \tag{3.36}$$

Since $F(a) \neq \emptyset$, by (3.36) one has

$$v(a) = \min\{f(x, a) \mid x \in F(a)\} = \min\{f(x_{\omega_\ell}(a), a) \mid \ell = 1, \dots, r\}.$$

Thus, we have proved that

$$v(a) = \min\{g_{\omega_\ell}(a) \mid \ell = 1, \dots, r\} \tag{3.37}$$

for all $a$ satisfying $\|a - \bar a\| < \delta_0$. As it has been noted, the functions $g_\omega$, $\omega \in \Omega$, are locally Lipschitz on $\mathbb{R}^{nm}$. Hence, applying [20, Prop. 2.3.6 and 2.3.12] to the minimum function in (3.37), we can assert that $v$ is locally Lipschitz at $\bar a$.

The proof is complete. $\Box$
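The representation of the optimal value as a minimum of finitely many functions, one per partition, suggests a brute-force check on tiny instances. The sketch below is an illustration under stated assumptions, not part of the proof: it takes the minimum over all partitions of the index set into k nonempty blocks (which agrees with the optimal value when every global solution is generated by some partition, as the proof derives from Theorem 3.1), it measures distances between data vectors with the sum norm (the norm on the data space is not specified in this section), and the data, the seed, and the perturbation radius 0.1 are arbitrary choices.

```python
import math
import random

def sq_dist(p, q):
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

def centroid_system(partition, a):
    """Formula (3.32): block-wise means of the data points."""
    n = len(a[0])
    return [tuple(sum(a[i][c] for i in block) / len(block) for c in range(n))
            for block in partition]

def f(x, a):
    """Formula (3.34)."""
    return sum(min(sq_dist(ai, xj) for xj in x) for ai in a) / len(a)

def partitions(m, k):
    """Yield every partition of {0, ..., m-1} into exactly k nonempty blocks."""
    def rec(i, blocks):
        if i == m:
            if len(blocks) == k:
                yield [list(b) for b in blocks]
            return
        for b in blocks:            # put index i into an existing block
            b.append(i)
            yield from rec(i + 1, blocks)
            b.pop()
        if len(blocks) < k:         # or open a new block with it
            blocks.append([i])
            yield from rec(i + 1, blocks)
            blocks.pop()
    yield from rec(0, [])

def v(a, k):
    """Brute-force optimal value: minimum of g_omega(a) over all partitions omega."""
    return min(f(centroid_system(w, a), a) for w in partitions(len(a), k))

def sum_norm_diff(a, b):
    """Assumed norm on the data space: sum of Euclidean norms of the blocks of a - b."""
    return sum(math.sqrt(sq_dist(p, q)) for p, q in zip(a, b))

a_bar = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]  # hypothetical data, k = 2
rng = random.Random(0)

def perturb(a, delta):
    return [tuple(c + rng.uniform(-delta, delta) for c in p) for p in a]

# Largest observed difference quotient |v(a) - v(a')| / ||a - a'|| near a_bar.
worst = 0.0
for _ in range(200):
    a1, a2 = perturb(a_bar, 0.1), perturb(a_bar, 0.1)
    d = sum_norm_diff(a1, a2)
    if d > 0:
        worst = max(worst, abs(v(a1, 2) - v(a2, 2)) / d)

print(v(a_bar, 2))  # 0.25
print(worst)        # an empirical lower estimate of a local Lipschitz constant
```

The enumeration grows like a Stirling number of the second kind, so this check is feasible only for very small m. The sampled quotient is only a lower estimate of a valid constant and does not replace the argument in the proof.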

Theorem 3.6 (Local upper Lipschitz property of the global solution map) The global solution map $F : \mathbb{R}^{nm} \rightrightarrows \mathbb{R}^{nk}$ is locally upper Lipschitz at $\bar a$, i.e., there exist $L > 0$ and $\delta > 0$ such that

$$F(a) \subset F(\bar a) + L \|a - \bar a\| \bar B_{\mathbb{R}^{nk}} \tag{3.38}$$

for all $a$ satisfying $\|a - \bar a\| < \delta$. Here

$$\bar B_{\mathbb{R}^{nk}} := \Big\{ x = (x^1, \dots, x^k) \in \mathbb{R}^{nk} \;\Big|\; \sum_{j \in J} \|x^j\| \le 1 \Big\}$$

denotes the closed unit ball of the product space $\mathbb{R}^{nk}$, which is equipped with the sum norm $\|x\| = \sum_{j \in J} \|x^j\|$.
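Since the set of global solutions at the reference data is finite, inclusion (3.38) says that every element of F(a) lies within sum-norm distance L‖a − ā‖ of some element of F(ā). The sketch below computes that one-sided distance for two finite sets of centroid systems. The function names `sum_norm` and `excess` and the sample sets are hypothetical; in particular, the two sets are made-up inputs, not computed solution sets.

```python
import math

def sum_norm(x, y):
    """Sum norm of x - y on the product space: sum over j of the Euclidean norm of x^j - y^j."""
    return sum(math.dist(xj, yj) for xj, yj in zip(x, y))

def excess(Fa, Fbar):
    """Smallest t such that Fa is contained in Fbar + t * (closed unit ball of the sum norm)."""
    return max(min(sum_norm(x, xb) for xb in Fbar) for x in Fa)

# Hypothetical finite sets of centroid systems with k = 2 centroids in the plane.
F_bar = [[(0.0, 0.5), (4.0, 0.5)], [(4.0, 0.5), (0.0, 0.5)]]
F_a = [[(0.1, 0.5), (4.0, 0.4)]]

print(excess(F_a, F_bar))  # approximately 0.2
```

With this helper, (3.38) holds for a given a exactly when `excess(F(a), F(ā))` is at most L‖a − ā‖.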

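Proof. Let $\Omega$, $\Omega_1 = \{\omega_1, \dots, \omega_r\}$, $x_\omega(a) = (x^1_\omega(a), \dots, x^k_\omega(a)) \in \mathbb{R}^{nk}$, and $\delta_0$ be constructed as in the proof of the above theorem. For any $\omega \in \Omega$, the vector function $x_\omega(\cdot)$, which maps $\mathbb{R}^{nm}$ to $\mathbb{R}^{nk}$, is continuously differentiable. Hence, there exist $L_\omega > 0$ and $\delta_\omega > 0$ such that

$$\|x_\omega(a) - x_\omega(\tilde a)\| \le L_\omega \|a - \tilde a\| \tag{3.39}$$

for any $a$, $\tilde a$ satisfying $\|a - \bar a\| < \delta_\omega$ and $\|\tilde a - \bar a\| < \delta_\omega$. Set

$$L = \max\{L_{\omega_1}, \dots, L_{\omega_r}\} \quad \text{and} \quad \delta = \min\{\delta_0, \delta_{\omega_1}, \dots, \delta_{\omega_r}\}.$$

Then, for every $a$ satisfying $\|a - \bar a\| < \delta$, by (3.36) and (3.39) one has

$$\begin{aligned}
F(a) &\subset \{x_{\omega_1}(a), \dots, x_{\omega_r}(a)\} \\
&\subset \{x_{\omega_1}(\bar a), \dots, x_{\omega_r}(\bar a)\} + L \|a - \bar a\| \bar B_{\mathbb{R}^{nk}} \\
&= F(\bar a) + L \|a - \bar a\| \bar B_{\mathbb{R}^{nk}}.
\end{aligned}$$

Hence, inclusion (3.38) is valid for every $a$ satisfying $\|a - \bar a\| < \delta$. $\Box$

Theorem 3.7 (Aubin property of the local solution map) Let $\bar x = (\bar x^1, \dots, \bar x^k)$ be an element of $F_1(\bar a)$ satisfying condition (C1), that is, $\bar x^{j_1} \neq \bar x^{j_2}$ whenever $j_2 \neq j_1$. Then, the local solution map $F_1 : \mathbb{R}^{nm} \rightrightarrows \mathbb{R}^{nk}$ has the Aubin property at $(\bar a, \bar x)$, i.e., there exist $L_1 > 0$, $\varepsilon > 0$, and $\delta_1 > 0$ such that

$$F_1(a) \cap B(\bar x, \varepsilon) \subset F_1(\tilde a) + L_1 \|a - \tilde a\| \bar B_{\mathbb{R}^{nk}} \tag{3.40}$$

for all $a$ and $\tilde a$ satisfying $\|a - \bar a\| < \delta_1$ and $\|\tilde a - \bar a\| < \delta_1$.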

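Proof. Suppose that $\bar x = (\bar x^1, \dots, \bar x^k) \in F_1(\bar a)$ and $\bar x^{j_1} \neq \bar x^{j_2}$ for all $j_1, j_2 \in J$ with $j_2 \neq j_1$. Denote by $J_1$ the set of the indexes $j \in J$ such that $\bar x^j$ is attractive w.r.t. the data set $\{\bar a^1, \dots, \bar a^m\}$. Put $J_2 = J \setminus J_1$. For every $j \in J_1$, by Theorem 3.3 one has

$$\|\bar a^i - \bar x^j\| < \|\bar a^i - \bar x^q\| \quad (\forall i \in I(j),\ \forall q \in J \setminus \{j\}). \tag{3.41}$$

In addition, the following holds:

$$\bar x^j = \frac{1}{|I(j)|} \sum_{i \in I(j)} \bar a^i, \tag{3.42}$$

where $I(j) = \{i \in I \mid \bar a^i \in A[\bar x^j]\}$. For every $j \in J_2$, by Theorem 3.3 one has

$$\|\bar x^q - \bar a^p\| < \|\bar x^j - \bar a^p\| \quad (\forall q \in J_1,\ \forall p \in I(q)). \tag{3.43}$$

Let $\varepsilon_0 > 0$ be such that $\|\bar x^{j_1} - \bar x^{j_2}\| > \varepsilon_0$ for all $j_1, j_2 \in J$ with $j_2 \neq j_1$.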

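By (3.41) and (3.43), there exist $\delta_0 > 0$ and $\varepsilon \in \big(0, \frac{\varepsilon_0}{4}\big)$ such that

$$\|a^i - x^j\| < \|a^i - x^q\| \quad (\forall j \in J_1,\ \forall i \in I(j),\ \forall q \in J \setminus \{j\}) \tag{3.44}$$

and

$$\|x^q - a^p\| < \|x^j - a^p\| \quad (\forall j \in J_2,\ \forall q \in J_1,\ \forall p \in I(q)) \tag{3.45}$$

for all $a = (a^1, \dots, a^m) \in \mathbb{R}^{nm}$ and $x = (x^1, \dots, x^k) \in \mathbb{R}^{nk}$ with $\|a - \bar a\| < \delta_0$ and $\|x - \bar x\| < 2k\varepsilon$. As $\bar x^{j_1} \neq \bar x^{j_2}$ for all $j_1, j_2 \in J$ with $j_2 \neq j_1$, by taking a smaller $\varepsilon > 0$ (if necessary), for any $x = (x^1, \dots, x^k) \in \mathbb{R}^{nk}$ satisfying $\|x - \bar x\| < 2k\varepsilon$ we have $x^{j_1} \neq x^{j_2}$ for all $j_1, j_2 \in J$ with $j_2 \neq j_1$.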

For every $j \in J_1$ and $a = (a^1, \dots, a^m) \in \mathbb{R}^{nm}$, define

$$x^j(a) = \frac{1}{|I(j)|} \sum_{i \in I(j)} a^i. \tag{3.46}$$

Comparing (3.46) with (3.42) yields $x^j(\bar a) = \bar x^j$ for all $j \in J_1$. Then, by the continuity of the vector functions $x^j(\cdot)$, where $j \in J_1$, we may assume that $\|x^j(\tilde a) - \bar x^j\| < \varepsilon$ for all $j \in J_1$ and $\tilde a = (\tilde a^1, \dots, \tilde a^m) \in \mathbb{R}^{nm}$ satisfying $\|\tilde a - \bar a\| < \delta_0$ (one can take a smaller $\delta_0 > 0$, if necessary).

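Since the vector functions $x^j(\cdot)$, $j \in J_1$, are continuously differentiable, there exists $L_1 > 0$ such that

$$\|x^j(a) - x^j(\tilde a)\| \le \frac{1}{k} L_1 \|a - \tilde a\| \tag{3.47}$$

for any $a$, $\tilde a$ satisfying $\|a - \bar a\| < \delta_0$ and $\|\tilde a - \bar a\| < \delta_0$ (one can take a smaller $\delta_0 > 0$, if necessary). Choose $\delta_1 \in (0, \delta_0)$ so small that $\frac{2}{k} L_1 \delta_1 < \varepsilon$.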
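Because each map in (3.46) is an average of some of the data points, the constant in (3.47) can be made explicit. The short derivation below is a sketch under an assumption that is not stated in this section: the data space is taken to carry the sum norm, i.e. the norm of a is the sum of the Euclidean norms of its m blocks, in analogy with the norm used on the space of centroid systems.

```latex
% Assumption: \|a\| = \sum_{i \in I} \|a^i\| on \mathbb{R}^{nm}.
\begin{aligned}
\|x^j(a) - x^j(\tilde a)\|
  &= \Big\| \frac{1}{|I(j)|} \sum_{i \in I(j)} (a^i - \tilde a^i) \Big\| \\
  &\le \frac{1}{|I(j)|} \sum_{i \in I(j)} \|a^i - \tilde a^i\| \\
  &\le \frac{1}{|I(j)|} \sum_{i \in I} \|a^i - \tilde a^i\|
   = \frac{1}{|I(j)|} \|a - \tilde a\|
   \le \|a - \tilde a\|.
\end{aligned}
```

Under this assumption the choice of constant equal to k satisfies (3.47) for all pairs of data vectors, not only those close to the reference data. If the Euclidean norm were used on the data space instead, the Cauchy-Schwarz inequality would give the bound with factor $|I(j)|^{-1/2}$, which is again at most 1.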

With the chosen constants $L_1 > 0$, $\varepsilon > 0$, and $\delta_1 > 0$, let us show that the inclusion (3.40) is fulfilled for all $a$ and $\tilde a$ satisfying $\|a - \bar a\| < \delta_1$ and $\|\tilde a - \bar a\| < \delta_1$.

Let $a$ and $\tilde a$ be such that $\|a - \bar a\| < \delta_1$ and $\|\tilde a - \bar a\| < \delta_1$. Select an arbitrary element $x = (x^1, \dots, x^k)$ of the set $F_1(a) \cap B(\bar x, \varepsilon)$. Put $\tilde x^j = x^j(\tilde a)$ for all $j \in J_1$, where $x^j(a)$ is given by (3.46). For any $j \in J_2$, set $\tilde x^j = x^j$.

Claim 1. The vector $\tilde x = (\tilde x^1, \dots, \tilde x^k)$ belongs to $F_1(\tilde a)$.

Indeed, the inequalities $\|a - \bar a\| < \delta_1$ and $\|x - \bar x\| < \varepsilon$ imply that both properties (3.44) and (3.45) hold. From (3.44) it follows that, for every $j \in J_1$, the attraction set $A[x^j]$ is $\{a^i \mid i \in I(j)\}$. Since $I(j) \neq \emptyset$ for each $j \in J_1$ and $x \in F_1(a)$, by Theorem 3.3 we have

$$x^j = \frac{1}{|I(j)|} \sum_{i \in I(j)} a^i. \tag{3.48}$$

Comparing (3.48) with (3.46) yields $x^j = x^j(a)$ for all $j \in J_1$. By (3.45) we see that, for every $j \in J_2$, the attraction set $A[x^j]$ is empty. Moreover, one has

$$x^j \notin A[x] \quad (\forall j \in J_2), \tag{3.49}$$

where $A[x]$ is the union of the balls $\bar B(a^p, \|a^p - x^q\|)$ with $p \in I$, $q \in J$ satisfying $p \in I(q)$.

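For each $j \in J_1$, using (3.47) we have

$$\begin{aligned}
\|x^j(\tilde a) - \bar x^j\| &\le \|x^j(\tilde a) - x^j(a)\| + \|x^j(a) - \bar x^j\| \\
&\le \tfrac{1}{k} L_1 \|\tilde a - a\| + \varepsilon \\
&\le \tfrac{1}{k} L_1 \big(\|\tilde a - \bar a\| + \|\bar a - a\|\big) + \varepsilon \\
&\le \tfrac{2}{k} L_1 \delta_1 + \varepsilon < 2\varepsilon.
\end{aligned}$$

Besides, for each $j \in J_2$, we have $\|\tilde x^j - \bar x^j\| = \|x^j - \bar x^j\| < \varepsilon$. Therefore,

$$\|\tilde x - \bar x\| = \sum_{j \in J_1} \|x^j(\tilde a) - \bar x^j\| + \sum_{j \in J_2} \|x^j - \bar x^j\| < 2k\varepsilon.$$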

In combination with the inequality $\|\tilde a - \bar a\| < \delta_1$, this assures that the properties (3.44) and (3.45), where $\tilde a$ and $\tilde x$ respectively play the roles of $a$ and $x$, hold. In other words, one has

$$\|\tilde a^i - \tilde x^j\| < \|\tilde a^i - \tilde x^q\| \quad (\forall j \in J_1,\ \forall i \in I(j),\ \forall q \in J \setminus \{j\}) \tag{3.50}$$

and

$$\|\tilde x^q - \tilde a^p\| < \|\tilde x^j - \tilde a^p\| \quad (\forall j \in J_2,\ \forall q \in J_1,\ \forall p \in I(q)). \tag{3.51}$$

So, similar to the above case of $x$, for every $j \in J_1$, the attraction set $A[\tilde x^j]$ is $\{\tilde a^i \mid i \in I(j)\}$. Recall that $I(j) \neq \emptyset$ for each $j \in J_1$ and $\tilde x^j$ was given by

$$\tilde x^j = x^j(\tilde a) = \frac{1}{|I(j)|} \sum_{i \in I(j)} \tilde a^i. \tag{3.52}$$

In addition, for every $j \in J_2$, the attraction set $A[\tilde x^j]$ is empty and one has

$$\tilde x^j \notin A[\tilde x] \quad (\forall j \in J_2), \tag{3.53}$$

where $A[\tilde x]$ is the union of the balls $\bar B(\tilde a^p, \|\tilde a^p - \tilde x^q\|)$ with $p \in I$, $q \in J$ satisfying $p \in I(q)$. Besides, from (3.50) and (3.51) it follows that $|J_i(\tilde x)| = 1$ for every $i \in I$. Since $\|\tilde x - \bar x\| < 2k\varepsilon$, we have $\tilde x^{j_1} \neq \tilde x^{j_2}$ for all $j_1, j_2 \in J$ with $j_2 \neq j_1$. Due to the last two properties and (3.52), (3.53), by Theorem 3.4 we conclude that $\tilde x \in F_1(\tilde a)$.

Claim 2. One has $x \in \tilde x + L_1 \|a - \tilde a\| \bar B_{\mathbb{R}^{nk}}$.

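Indeed, since $x^j = x^j(a)$ for all $j \in J_1$, $\tilde x^j = x^j$ for any $j \in J_2$, by (3.52) and (3.47) we have

$$\begin{aligned}
\|x - \tilde x\| &= \sum_{j \in J_1} \|x^j - \tilde x^j\| + \sum_{j \in J_2} \|x^j - \tilde x^j\| \\
&= \sum_{j \in J_1} \|x^j(a) - x^j(\tilde a)\| \\
&\le k \cdot \tfrac{1}{k} L_1 \|a - \tilde a\| = L_1 \|a - \tilde a\|.
\end{aligned}$$

It follows that $x \in \tilde x + L_1 \|a - \tilde a\| \bar B_{\mathbb{R}^{nk}}$.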

Combining Claim 2 with Claim 1, we have $x \in F_1(\tilde a) + L_1 \|a - \tilde a\| \bar B_{\mathbb{R}^{nk}}$. Thus, property (3.40) is valid for all $a$ and $\tilde a$ satisfying $\|a - \bar a\| < \delta_1$ and $\|\tilde a - \bar a\| < \delta_1$. $\Box$
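The construction used in the proof (recompute the attractive centroids on the new data, keep the non-attractive ones) can be traced on a small instance. The sketch below is an illustration under several assumptions: the attraction set of a centroid is taken to be the set of data points whose nearest centroid it is, which matches how the term is used in the proof but is not defined in this section; the data space carries the sum norm; the constant is taken equal to k, which satisfies (3.47) under that norm because each map in (3.46) is an average; and all data, the names `attraction_indices` and `mean`, and the third far-away centroid are hypothetical. The code does not verify membership in the local solution set; it only builds the two centroid systems and checks the estimate of Claim 2.

```python
import math

def mean(points):
    """Componentwise mean of a nonempty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def attraction_indices(x, a, tol=1e-12):
    """For each centroid x[j], the indices i such that x[j] is a nearest centroid of a[i]."""
    idx = [[] for _ in x]
    for i, ai in enumerate(a):
        d = [math.dist(ai, xj) for xj in x]
        dmin = min(d)
        for j, dj in enumerate(d):
            if dj <= dmin + tol:
                idx[j].append(i)
    return idx

def sum_norm(u, w):
    """Sum of the Euclidean norms of the blocks of u - w."""
    return sum(math.dist(p, q) for p, q in zip(u, w))

# Reference data and a centroid system whose third component attracts nothing.
a_bar = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
x_bar = [(0.0, 0.5), (4.0, 0.5), (2.0, 10.0)]
I_bar = attraction_indices(x_bar, a_bar)  # [[0, 1], [2, 3], []]

# Two nearby data vectors (hypothetical perturbations of a_bar).
a = [(0.1, 0.0), (0.0, 1.1), (4.0, -0.1), (3.9, 1.0)]
a_tilde = [(0.0, 0.1), (-0.1, 1.0), (4.1, 0.0), (4.0, 0.9)]

# x satisfies (3.48) on the attractive indices; x_tilde is built as in the proof.
x = [mean([a[i] for i in I]) if I else xb for I, xb in zip(I_bar, x_bar)]
x_tilde = [mean([a_tilde[i] for i in I]) if I else xj for I, xj in zip(I_bar, x)]

k = len(x_bar)
lhs = sum_norm(x, x_tilde)      # ||x - x_tilde|| in the sum norm
rhs = k * sum_norm(a, a_tilde)  # L_1 ||a - a_tilde|| with the assumed L_1 = k
print(lhs <= rhs)               # True
```

In this instance the attraction index sets of `x_tilde` with respect to `a_tilde` coincide with those of `x_bar` with respect to `a_bar`, which is the situation described by (3.50) and (3.51).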

From the document DC ALGORITHMS IN NONCONVEX QUADRATIC PROGRAMMING AND APPLICATIONS IN DATA CLUSTERING (pages 76-82)