3.4 Conclusion
4.0.1 Overview of the Chapter
Goal of the Chapter. (i) Designing a $(2\sqrt{3}+\epsilon)$-factor approximation algorithm for the 2-dispersion problem in $\mathbb{R}^2$, where $\epsilon > 0$, and (ii) developing a common framework for the dispersion problem in Euclidean space, using which we improve the approximation factor to $2\sqrt{3}$ for the 2-dispersion problem in $\mathbb{R}^2$, propose an optimal algorithm for the 2-dispersion problem in $\mathbb{R}^1$, and propose a 2-factor approximation result for the 1-dispersion problem in $\mathbb{R}^2$.
Organization of the Chapter. The remainder of the chapter is organized as follows. In Section 4.1, we propose a $(2\sqrt{3}+\epsilon)$-factor approximation algorithm for the 2-dispersion problem in $\mathbb{R}^2$, where $\epsilon > 0$. In Section 4.2, we propose a common framework for the dispersion problem in Euclidean space. Using the framework, we propose a $2\sqrt{3}$-factor approximation algorithm for the 2-dispersion problem in $\mathbb{R}^2$, a polynomial-time optimal algorithm for the 2-dispersion problem on a line, and a 2-factor approximation algorithm for the 1-dispersion problem in $\mathbb{R}^2$. Finally, we conclude the chapter in Section 4.3.
4.1 $(2\sqrt{3}+\epsilon)$-Factor Approximation Algorithm
In this section, we propose a $(2\sqrt{3}+\epsilon)$-factor approximation algorithm for the 2-dispersion problem, where $\epsilon > 0$. The algorithm is based on a greedy approach, which we briefly describe as follows. Let $I = (P, k)$ be an arbitrary instance of the 2-dispersion problem, where $P = \{p_1, p_2, \ldots, p_n\}$ is a set of $n$ points in $\mathbb{R}^2$ and $k \in [3, n]$ is a positive integer.
Initially, we choose a subset $S_3 \subseteq P$ of size 3 such that $cost_2(S_3)$ is maximized. Next, we add a point $p \in P$ to $S_3$ to construct a set $S_4 = S_3 \cup \{p\}$ so that $cost_2(S_4)$ is maximized, and we continue this process until the set $S_k$ of size $k$ is constructed. The pseudo-code of the algorithm is described in Algorithm 2.
Let $OPT = \{p^*_1, p^*_2, \ldots, p^*_k\}$ be an optimal solution of the 2-dispersion problem for input $I = (P, k)$. For $p \in P$, we define a disk $D[p]$ as follows: $D[p] = \{q \in \mathbb{R}^2 \mid d(p, q) \leq \frac{cost_2(OPT)}{2\sqrt{3}+\epsilon}\}$. Accordingly, for $S \subseteq P$, we define the set of disks $D[S] = \{D[p] \mid p \in S\}$.
Algorithm 2 Euclidean Dispersion Algorithm$(P, k)$
Input: A set $P = \{p_1, p_2, \ldots, p_n\}$ of $n$ points, and a positive integer $k$ ($3 \leq k \leq n$).
Output: A subset $S_k \subseteq P$ of size $k$.
1: Compute $\{p_{i_1}, p_{i_2}, p_{i_3}\} \subseteq P$ such that $cost_2(\{p_{i_1}, p_{i_2}, p_{i_3}\})$ is maximized.
2: $S_3 \leftarrow \{p_{i_1}, p_{i_2}, p_{i_3}\}$
3: for $j = 4, 5, \ldots, k$ do
4:   Let $p \in P \setminus S_{j-1}$ be such that $cost_2(S_{j-1} \cup \{p\})$ is maximized.
5:   $S_j \leftarrow S_{j-1} \cup \{p\}$
6: end for
7: return $S_k$
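As a concrete sketch, the greedy procedure of Algorithm 2 can be written in Python. The helpers assume the definition used throughout: $cost_2(p, S)$ is the sum of the distances from $p$ to its two nearest points of $S \setminus \{p\}$, and $cost_2(S) = \min_{p \in S} cost_2(p, S)$; the function names are illustrative, not from the original.

```python
from itertools import combinations
from math import dist

def cost2(p, S):
    # Sum of distances from p to its two nearest points of S \ {p}
    # (assumed definition of the 2-dispersion cost of a point).
    nearest = sorted(dist(p, q) for q in S if q != p)
    return nearest[0] + nearest[1]

def cost2_set(S):
    # 2-dispersion cost of a set: the minimum point cost over S.
    return min(cost2(p, S) for p in S)

def euclidean_dispersion(P, k):
    # Greedy sketch of Algorithm 2.
    assert 3 <= k <= len(P)
    # Line 1: best triple by exhaustive search over all O(n^3) triples.
    S = list(max(combinations(P, 3), key=cost2_set))
    # Lines 3-6: repeatedly add the point maximizing cost2 of the new set.
    while len(S) < k:
        best = max((q for q in P if q not in S),
                   key=lambda q: cost2_set(S + [q]))
        S.append(best)
    return S
```

Line 1 of Algorithm 2 is realized by the exhaustive scan over triples and each iteration of the loop mirrors line 4; this unoptimized variant already runs in polynomial time.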
Lemma 4.1.1. For any point $p_i \in P$, $|D[p_i] \cap OPT| \leq 2$.
Figure 4.1: Points $p_a, p_b, p_c \in D[p_i]$
Proof. On the contrary, assume that there are three points $p_a, p_b, p_c \in D[p_i] \cap OPT$. Let $S = \{p_a, p_b, p_c\}$. Without loss of generality, assume that $cost_2(p_a, S) \leq cost_2(p_b, S)$ and $cost_2(p_a, S) \leq cost_2(p_c, S)$, i.e., $d(p_a, p_b) + d(p_a, p_c) \leq d(p_a, p_b) + d(p_b, p_c)$ and $d(p_a, p_b) + d(p_a, p_c) \leq d(p_a, p_c) + d(p_b, p_c)$, which leads to $d(p_a, p_b) \leq d(p_b, p_c)$ and $d(p_a, p_c) \leq d(p_b, p_c)$.
Notice that maximizing $d(p_a, p_b) + d(p_a, p_c)$ results in minimizing $d(p_b, p_c)$ (see Figure 4.1). Since both $d(p_a, p_b)$ and $d(p_a, p_c)$ are less than or equal to $d(p_b, p_c)$, by the packing argument inside a disk, $d(p_a, p_b) + d(p_a, p_c)$ is maximized when $p_a, p_b, p_c$ form an equilateral triangle on the boundary of the disk $D[p_i]$, in which case each side has length $\sqrt{3} \cdot \frac{cost_2(OPT)}{2\sqrt{3}+\epsilon}$.
Then, $cost_2(S) \leq d(p_a, p_b) + d(p_a, p_c) \leq \sqrt{3} \cdot \frac{cost_2(OPT)}{2\sqrt{3}+\epsilon} + \sqrt{3} \cdot \frac{cost_2(OPT)}{2\sqrt{3}+\epsilon} = \frac{2\sqrt{3} \cdot cost_2(OPT)}{2\sqrt{3}+\epsilon} < cost_2(OPT)$, which contradicts the optimal value $cost_2(OPT)$. Therefore, for any $p_i \in P$, $D[p_i]$ contains at most two points of $OPT$. □
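The packing bound above rests on the fact that an equilateral triangle inscribed in a disk of radius $r$ has side length $\sqrt{3}\,r$. A quick numeric sanity check of that fact, using a hypothetical unit radius:

```python
from math import cos, sin, pi, dist, sqrt, isclose

# Vertices of an equilateral triangle inscribed in a disk of radius r,
# centred at the origin (r = 1 chosen purely for the check).
r = 1.0
tri = [(r * cos(2 * pi * i / 3), r * sin(2 * pi * i / 3)) for i in range(3)]

sides = [dist(tri[0], tri[1]), dist(tri[1], tri[2]), dist(tri[0], tri[2])]
# Each side has length sqrt(3) * r, so for three points in the disk
# d(pa, pb) + d(pa, pc) is at most 2 * sqrt(3) * r.
assert all(isclose(s, sqrt(3) * r) for s in sides)
```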
Consider the set $S_i$ with $i < k$, the solution of size $i$ constructed by Algorithm 2. Let $U = S_i \cap OPT$, and let $S'_i = S_i \setminus U$ and $OPT' = OPT \setminus U$. Note that for any disk $D[p^*_\ell] \in D[OPT']$, $|D[p^*_\ell] \cap U| \leq 1$, since $p^*_\ell$ itself lies in $D[p^*_\ell] \cap OPT$ and $|D[p^*_\ell] \cap OPT| \leq 2$ by Lemma 4.1.1.
Lemma 4.1.2. For some $D[p^*_j] \in D[OPT']$, $D[p^*_j]$ contains at most one point of $S_i$, i.e., $|D[p^*_j] \cap S_i| \leq 1$.
Proof. On the contrary, assume that there does not exist any $D[p^*_j] \in D[OPT']$ such that $|D[p^*_j] \cap S_i| \leq 1$, i.e., for each $D[p^*_v] \in D[OPT']$, $|D[p^*_v] \cap S_i| > 1$. Construct a bipartite graph $H(D[OPT'] \cup S_i, E)$ as follows: (i) $D[OPT']$ and $S_i$ are the two partite vertex sets, and (ii) $(D[p^*_j], p) \in E$ if and only if $p \in S_i$ is contained in $D[p^*_j]$ (see Figure 4.2).
Figure 4.2: The bipartite graph $H(D[OPT'] \cup S_i, E)$
Claim 4.1.1. For a disk $D[p^*_t] \in D[OPT']$, if $|D[p^*_t] \cap U| = 1$, then no point in $D[p^*_t] \cap S'_i$ is contained in any disk in $D[OPT'] \setminus \{D[p^*_t]\}$.
Proof of the Claim. On the contrary, assume that a point $p \in D[p^*_t] \cap S'_i$ is contained in a disk $D[p^*_w] \in D[OPT'] \setminus \{D[p^*_t]\}$ (see Figure 4.3). Then $p \in D[p^*_t] \cap D[p^*_w]$ implies $d(p^*_t, p^*_w) \leq 2 \cdot \frac{cost_2(OPT)}{2\sqrt{3}+\epsilon}$. Since $|D[p^*_t] \cap U| = 1$, let $D[p^*_t] \cap U = \{p^*_u\}$. Now, consider the 2-dispersion cost of $p^*_t$ with respect to $OPT$: $cost_2(p^*_t, OPT) \leq d(p^*_t, p^*_u) + d(p^*_t, p^*_w) \leq \frac{cost_2(OPT)}{2\sqrt{3}+\epsilon} + 2 \cdot \frac{cost_2(OPT)}{2\sqrt{3}+\epsilon} = 3 \cdot \frac{cost_2(OPT)}{2\sqrt{3}+\epsilon} < cost_2(OPT)$, which is a contradiction to the optimality of $OPT$ (see Figure 4.3). Thus, if $|D[p^*_t] \cap U| = 1$, no point in $D[p^*_t] \cap S'_i$ is contained in any disk in $D[OPT'] \setminus \{D[p^*_t]\}$. □
Figure 4.3: The 2-dispersion cost of $p^*_t$ with respect to $OPT$.
Now, for every $D[p^*_\ell] \in D[OPT']$ that satisfies the condition of Claim 4.1.1, we repeatedly remove $D[p^*_\ell]$ from $D[OPT']$ (yielding $D[OPT'']$) and the points of $D[p^*_\ell] \cap S'_i$ from $S'_i$ (yielding $S''_i$), and finally remove $U$, to construct the bipartite graph $H' = (D[OPT''], S''_i)$. Since $|D[OPT']| + |U| = |OPT| = k$ and $|S'_i| + |U| = |S_i| < k$, we have $|D[OPT']| > |S'_i|$. During the construction of $H'$, the number of vertices removed from the partite set $S'_i$ is at least the number of vertices removed from the partite set $D[OPT']$. Therefore, $|D[OPT'']| > |S''_i|$.
Thus, the lemma follows from the fact that in $H'$ each vertex of $D[OPT'']$ has degree at least 2, while each vertex of $S''_i$ has degree at most 2 (a point $p$ can lie in at most two disks, since $p \in D[p^*]$ implies $p^* \in D[p]$ and $|D[p] \cap OPT| \leq 2$ by Lemma 4.1.1). Counting edges, $2|D[OPT'']| \leq |E(H')| \leq 2|S''_i|$, which contradicts $|D[OPT'']| > |S''_i|$. □
Theorem 4.1.3. For any $\epsilon > 0$, Algorithm 2 produces a $(2\sqrt{3}+\epsilon)$-factor approximation result in polynomial time.
Proof. Let $I = (P, k)$ be an arbitrary input instance of the 2-dispersion problem, where $P = \{p_1, p_2, \ldots, p_n\}$ is a set of $n$ points in $\mathbb{R}^2$ and $k$ is a positive integer. Let $S_k = \{p_1, p_2, \ldots, p_k\}$ be the output of Algorithm 2 for instance $I$, and let $OPT = \{p^*_1, p^*_2, \ldots, p^*_k\}$ be an optimal solution of the 2-dispersion problem for the instance $I$.
To prove the theorem, we need to show that $\frac{cost_2(OPT)}{cost_2(S_k)} \leq 2\sqrt{3}+\epsilon$. Here, we use induction to show that $cost_2(S_i) \geq \frac{cost_2(OPT)}{2\sqrt{3}+\epsilon}$ for each $i = 3, 4, \ldots, k$. Since $S_3$ is an optimum solution for 3 points (see line number 1 of Algorithm 2), and restricting $OPT$ to any three of its points can only increase each point's 2-dispersion cost, $cost_2(S_3) \geq cost_2(OPT) \geq \frac{cost_2(OPT)}{2\sqrt{3}+\epsilon}$ holds. Now, assume that the condition holds for each $i$ such that $3 \leq i < k$. We will prove that the condition, i.e., $cost_2(S_{i+1}) \geq \frac{cost_2(OPT)}{2\sqrt{3}+\epsilon}$, holds for $(i+1)$ too.
We know by Lemma 4.1.2 that there exists at least one disk $D[p^*_j] \in D[OPT']$ such that $|D[p^*_j] \cap S_i| \leq 1$. First, consider the case $|D[p^*_j] \cap S_i| = 1$; then the distance of $p^*_j$ to the second closest point in $S_i$ is greater than $\frac{cost_2(OPT)}{2\sqrt{3}+\epsilon}$ (see Figure 4.4). Therefore, we can add the point $p^*_j \in OPT$ to the set $S_i$ to construct the set $S_{i+1}$. Next, consider the case $|D[p^*_j] \cap S_i| = 0$; then the distance of the point $p^*_j \in OPT$ to any point of $S_i$ is greater than $\frac{cost_2(OPT)}{2\sqrt{3}+\epsilon}$. Note that in both cases $cost_2(p^*_j, S_{i+1}) \geq \frac{cost_2(OPT)}{2\sqrt{3}+\epsilon}$. So, by adding the point $p^*_j$ to the set $S_i$, we can construct the set $S_{i+1}$ such that $cost_2(S_{i+1}) \geq \frac{cost_2(OPT)}{2\sqrt{3}+\epsilon}$.
Now, we argue that for any arbitrary point $p \in S_{i+1}$, $cost_2(p, S_{i+1}) \geq \frac{cost_2(OPT)}{2\sqrt{3}+\epsilon}$. We consider the following two cases: Case (1) $p^*_j$ is not one of the closest points of $p$ in $S_{i+1}$, and Case (2) $p^*_j$ is one of the closest points of $p$ in $S_{i+1}$. In Case (1), $cost_2(p, S_{i+1}) \geq \frac{cost_2(OPT)}{2\sqrt{3}+\epsilon}$ by the induction hypothesis on the set $S_i$. In Case (2), suppose first that $p$ is not contained in the disk $D[p^*_j]$; then $d(p, p^*_j) \geq \frac{cost_2(OPT)}{2\sqrt{3}+\epsilon}$, which implies $cost_2(p, S_{i+1}) \geq \frac{cost_2(OPT)}{2\sqrt{3}+\epsilon}$.

Figure 4.4: $p_\ell$ lies outside the disk $D[p^*_j]$

Now, if $p$ is contained in $D[p^*_j]$, then at least one of the closest points of $p$ is not contained in $D[p^*_j]$; otherwise, it leads to a contradiction to Lemma 4.1.2. Let $q$ be one of the nearest points of $p$ that is not contained in $D[p^*_j]$ (see Figure 4.5). Since $d(p, q) \geq \frac{cost_2(OPT)}{2\sqrt{3}+\epsilon}$, therefore $cost_2(p, S_{i+1}) \geq \frac{cost_2(OPT)}{2\sqrt{3}+\epsilon}$. Therefore, by constructing the set $S_{i+1} = S_i \cup \{p^*_j\}$, we ensure that the cost of each point in $S_{i+1}$ is greater than or equal to $\frac{cost_2(OPT)}{2\sqrt{3}+\epsilon}$.
Figure 4.5: $q$ is not contained in $D[p^*_j]$
Since our algorithm chooses a point that maximizes $cost_2(S_{i+1})$ (see line number 4 of Algorithm 2), the algorithm will always choose a point in iteration $i+1$ such that $cost_2(S_{i+1}) \geq \frac{cost_2(OPT)}{2\sqrt{3}+\epsilon}$.
With the help of Lemma 4.1.1 and Lemma 4.1.2, we conclude that $cost_2(S_{i+1}) \geq \frac{cost_2(OPT)}{2\sqrt{3}+\epsilon}$, and thus the condition also holds for $(i+1)$.
Therefore, for any $\epsilon > 0$, Algorithm 2 produces a $(2\sqrt{3}+\epsilon)$-factor approximation result in polynomial time. □
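The guarantee of Theorem 4.1.3 can be sanity-checked empirically. The sketch below (illustrative names, randomly generated instances) re-implements the greedy strategy of Algorithm 2, computes $cost_2(OPT)$ by brute force on tiny inputs, and checks that the ratio never exceeds $2\sqrt{3}$, the limit of $2\sqrt{3}+\epsilon$ as $\epsilon \to 0$ (the algorithm itself does not depend on $\epsilon$).

```python
from itertools import combinations
from math import dist, sqrt
import random

def cost2(p, S):
    # Sum of distances from p to its two nearest points of S \ {p}.
    nearest = sorted(dist(p, q) for q in S if q != p)
    return nearest[0] + nearest[1]

def cost2_set(S):
    return min(cost2(p, S) for p in S)

def greedy(P, k):
    # Same greedy strategy as Algorithm 2 (illustrative re-implementation).
    S = list(max(combinations(P, 3), key=cost2_set))
    while len(S) < k:
        S.append(max((q for q in P if q not in S),
                     key=lambda q: cost2_set(S + [q])))
    return S

def brute_force_opt(P, k):
    # cost2(OPT) by exhaustive search; feasible only for tiny instances.
    return max(cost2_set(S) for S in combinations(P, k))

random.seed(7)
for _ in range(20):
    P = [(random.random(), random.random()) for _ in range(9)]
    k = random.randint(3, 6)
    ratio = brute_force_opt(P, k) / cost2_set(greedy(P, k))
    assert 1.0 - 1e-9 <= ratio <= 2 * sqrt(3) + 1e-9
```

In practice the observed ratios on such small random instances stay well below the worst-case bound.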