SUMS OF SQUARES AND SEMIDEFINITE PROGRAM RELAXATIONS FOR POLYNOMIAL OPTIMIZATION PROBLEMS WITH STRUCTURED SPARSITY

HAYATO WAKI, SUNYOUNG KIM, MASAKAZU KOJIMA, AND MASAKAZU MURAMATSU§

Abstract. Unconstrained and inequality constrained sparse polynomial optimization problems (POPs) are considered. A correlative sparsity pattern graph is defined to find a certain sparse structure in the objective and constraint polynomials of a POP. Based on this graph, sets of the supports for sums of squares (SOS) polynomials that lead to efficient SOS and semidefinite programming (SDP) relaxations are obtained. Numerical results from various test problems are included to show the improved performance of the SOS and SDP relaxations.

Key words. Polynomial optimization problem, sparsity, global optimization, Lagrangian relaxation, Lagrangian dual, sums of squares optimization, semidefinite programming relaxation

AMS subject classifications. 15A15, 15A09, 15A23

1. Introduction. Polynomial optimization problems (POPs) arise from various applications in science and engineering. Recent developments [9, 15, 18, 19, 22, 25, 27, 31] in semidefinite programming (SDP) and sums of squares (SOS) relaxations for POPs have attracted a lot of research from diverse directions. These relaxations have been extended to polynomial SDPs [11, 12, 17] and POPs over symmetric cones [20]. In particular, SDP and SOS relaxations have been popular for their theoretical convergence to the optimal value of a POP [22, 25]. From a practical point of view, improving the computational efficiency of SDP and SOS relaxations using the sparsity of polynomials in POPs has become an important issue [15, 19].

A polynomial $f$ in real variables $x_1, x_2, \ldots, x_n$ of a positive degree $d$ can have all monomials of the form $x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n}$ with nonnegative integers $\alpha_i$ $(i = 1, 2, \ldots, n)$ such that $\sum_{i=1}^{n} \alpha_i \le d$; the number of distinct monomials of this form is $\binom{n+d}{d}$. We call such a polynomial fully dense. When we examine polynomials in POPs from applications, we notice in many cases that they are sparse polynomials, having only a few of all possible monomials, as defined in [19]. The sparsity provides a computational edge if it is handled properly when deriving SDP and SOS relaxations.

More precisely, taking advantage of the sparsity of POPs is essential to obtaining an optimal value of a POP by applying SDP and SOS relaxations in practice.

For sparse POPs, generalized Lagrangian duals and their SOS relaxations were proposed in [15]. The relaxations are derived using SOS polynomials for the Lagrangian multipliers with similar sparsity to the associated constraint polynomials, and then converted into equivalent SDPs. As a result, the size of the resulting relaxations is reduced and computational efficiency is improved. This approach is shown to have an advantage in implementation over the SDP relaxation given in [22], whose size depends only on the degrees of the objective and constraint polynomials of the POP.

Department of Mathematical and Computing Sciences, Tokyo Institute of Technology, 2-12-1 Oh-Okayama, Meguro-ku, Tokyo 152-8552 Japan ([email protected])

Department of Mathematics, Ewha Women’s University, 11-1 Dahyun-dong, Sudaemoon-gu, Seoul 120-750 Korea. The research was supported by KRF 2003-041-C00038. ([email protected])

Department of Mathematical and Computing Sciences, Tokyo Institute of Technology, 2-12-1 Oh-Okayama, Meguro-ku, Tokyo 152-8552 Japan ([email protected])

§Department of Computer Science, The University of Electro-Communications, Chofugaoka, Chofu-Shi, Tokyo 182-8585 Japan ([email protected])



The aim of this paper is to propose new practical SOS and SDP relaxations for a sparse POP and show their performance on various test problems. The framework of SOS and SDP relaxations presented here is based on the one proposed in the paper [15]. The main idea here is to define the sparsity of a POP more precisely by finding a structure of the polynomials in the POP, and to obtain sparse SOS and SDP relaxations accordingly. Specifically, we introduce correlative sparsity, which is a special case of the sparsity [19] mentioned above; correlative sparsity implies sparsity, but the converse is not necessarily true. The correlative sparsity is described in terms of an $n \times n$ symmetric matrix $R$, which we call the correlative sparsity pattern matrix (csp matrix) of the POP. Each element $R_{ij}$ of the csp matrix $R$ is either $0$ or $\star$, the latter representing a nonzero value. We assign $\star$ to every diagonal element $R_{ii}$ $(i = 1, 2, \ldots, n)$, and also to each off-diagonal element $R_{ij} = R_{ji}$ $(1 \le i < j \le n)$ if and only if either (i) the variables $x_i$ and $x_j$ appear simultaneously in a term of the objective function, or (ii) they both appear in the same inequality constraint. The csp matrix $R$ constructed in this way represents the sparsity pattern of the Hessian matrix of the generalized Lagrangian function of the paper [15] (or the Hessian matrix of the objective function in unconstrained cases), except for the diagonal elements; some diagonal elements of the Hessian matrix may vanish while $R_{ii} = \star$ $(i = 1, 2, \ldots, n)$ by definition. We say that the POP is correlatively sparse if the csp matrix $R$ (or the Hessian matrix of the generalized Lagrangian function) is sparse.

From the csp matrix $R$, it is natural to induce a graph $G(N, E)$ with the node set $N = \{1, 2, \ldots, n\}$ and the edge set $E = \{\{i, j\} : R_{ij} = \star,\ i < j\}$ corresponding to the nonzero off-diagonal elements of $R$. We call $G(N, E)$ the correlative sparsity pattern graph (csp graph) of the POP. We employ some results of graph theory regarding maximal cliques of chordal graphs [1]. A key idea in this paper is to use the maximal cliques of a chordal extension of the csp graph $G(N, E)$ to construct sets of supports for a sparse SOS relaxation. This idea is motivated by the recent work [5], which proposed positive semidefinite matrix completion techniques for exploiting sparsity in primal-dual interior-point methods for SDPs.

Theoretically, the proposed sparse SOS and SDP relaxations are not guaranteed to generate lower bounds of the same quality as the dense SDP relaxation [22] for general POPs. Practical experience, however, shows that the performance gap between the two relaxations is small, as we will observe in Section 6. In particular, the definition of a structured sparsity based on the csp matrix $R$ and the csp graph $G(N, E)$ makes it possible to achieve the same quality of lower bounds for quadratic optimization problems (QOPs), where all polynomials in the objective function and constraints are quadratic. More precisely, the proposed sparse relaxation of order 1 attains lower bounds of the same quality as the dense SOS relaxation of order 1, as shown in Section 4.5.

The remainder of the paper is organized as follows. After introducing basic notation and symbols for polynomials, we define SOS polynomials in Section 2. In Section 3, we first describe the dense SOS relaxation of unconstrained POPs, and then the sparse SOS relaxation. We show how a csp matrix is defined from a given unconstrained POP, and how a sparse SOS relaxation is constructed using the maximal cliques of a chordal extension of the csp graph induced from the csp matrix. Section 4 contains the description of an SOS relaxation of an inequality constrained POP with a structured sparsity characterized by a csp matrix and a csp graph. We introduce a generalized Lagrangian dual for the inequality constrained POP and a sparse SOS relaxation. Section 5 discusses some additional techniques that enhance the practical performance of the sparse SOS relaxation, such as computing optimal solutions, handling equality constraints, and scaling. Section 6 presents numerical results on various test problems; we show that the proposed sparse SOS and SDP relaxations exhibit much better performance in practice. Finally, we give concluding remarks in Section 7.

2. Polynomials and SOS polynomials. Let $\mathbb{R}$ be the set of real numbers, and $\mathbb{Z}_+$ the set of nonnegative integers. $\mathbb{R}[x]$ is the set of real-valued multivariate polynomials in $x_i$ $(i = 1, 2, \ldots, n)$. Each polynomial $f \in \mathbb{R}[x]$ is represented as $f(x) = \sum_{\alpha \in \mathcal{F}} c(\alpha) x^{\alpha}$, where $\mathcal{F} \subset \mathbb{Z}^n_+$ is a nonempty finite subset, $c(\alpha)$ $(\alpha \in \mathcal{F})$ are real coefficients, and $x^{\alpha} = x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n}$. The support of $f$ is defined by $\mathrm{supp}(f) = \{\alpha \in \mathcal{F} : c(\alpha) \neq 0\} \subset \mathbb{Z}^n_+$, and the degree of $f \in \mathbb{R}[x]$ by $\deg(f) = \max\{\sum_{i=1}^{n} \alpha_i : \alpha \in \mathrm{supp}(f)\}$.

For every nonempty finite set $\mathcal{G} \subset \mathbb{Z}^n_+$, $\mathbb{R}[x, \mathcal{G}]$ denotes the set of polynomials in $x_i$ $(i = 1, 2, \ldots, n)$ whose support is in $\mathcal{G}$; $\mathbb{R}[x, \mathcal{G}] = \{f \in \mathbb{R}[x] : \mathrm{supp}(f) \subset \mathcal{G}\}$. We denote by $\mathbb{R}[x, \mathcal{G}]^2$ the set of sums of squares of polynomials in $\mathbb{R}[x, \mathcal{G}]$. By construction, we see that $\mathrm{supp}(g) \subset \mathcal{G} + \mathcal{G}$ if $g \in \mathbb{R}[x, \mathcal{G}]^2$, where $\mathcal{G} + \mathcal{G}$ denotes the Minkowski sum of two $\mathcal{G}$'s.

Let $\mathbb{R}^{\mathcal{G}}$ denote the $|\mathcal{G}|$-dimensional Euclidean space whose coordinates are indexed by $\alpha \in \mathcal{G}$. Each vector of $\mathbb{R}^{\mathcal{G}}$ is denoted as $w = (w_\alpha : \alpha \in \mathcal{G})$. We use the symbol $\mathcal{S}(\mathcal{G})$ for the set of $|\mathcal{G}| \times |\mathcal{G}|$ symmetric matrices with coordinates indexed by $\alpha \in \mathcal{G}$. Let $\mathcal{S}_+(\mathcal{G})$ be the set of positive semidefinite matrices in $\mathcal{S}(\mathcal{G})$:

$$\mathcal{S}_+(\mathcal{G}) = \left\{ V \in \mathcal{S}(\mathcal{G}) : w^T V w = \sum_{\alpha \in \mathcal{G}} \sum_{\beta \in \mathcal{G}} V_{\alpha\beta}\, w_\alpha w_\beta \ge 0 \ \text{ for every } w = (w_\alpha : \alpha \in \mathcal{G}) \in \mathbb{R}^{\mathcal{G}} \right\}.$$

The symbol $u(x, \mathcal{G})$ is used for the $|\mathcal{G}|$-dimensional column vector consisting of the elements $x^{\alpha}$ $(\alpha \in \mathcal{G})$. Then the set $\mathbb{R}[x, \mathcal{G}]^2$ can be rewritten as

$$\mathbb{R}[x, \mathcal{G}]^2 = \left\{ u(x, \mathcal{G})^T V u(x, \mathcal{G}) : V \in \mathcal{S}_+(\mathcal{G}) \right\}. \tag{2.1}$$

For more details, see the papers [4, 25]. Let $N = \{1, 2, \ldots, n\}$, $\emptyset \neq C \subset N$, and

$$\mathcal{A}^C_{\omega} = \left\{ \alpha \in \mathbb{Z}^n_+ : \alpha_i = 0 \text{ if } i \notin C \ \text{ and } \ \sum_{i \in C} \alpha_i \le \omega \right\}.$$

Then we observe that $\mathcal{A}^C_{\omega} + \mathcal{A}^C_{\omega} = \mathcal{A}^C_{2\omega}$ for every nonempty $C \subset N$ and $\omega \in \mathbb{Z}_+$.
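As a concrete illustration (a minimal Python sketch of our own, not part of the paper; the function name and the 0-based variable indices are our choices), $\mathcal{A}^C_\omega$ can be enumerated directly from this definition:

```python
from itertools import product

def A(C, omega, n):
    """Enumerate A^C_omega: all alpha in Z^n_+ with alpha_i = 0 for i not in C
    and sum_{i in C} alpha_i <= omega (variables indexed 0..n-1)."""
    C = sorted(C)
    result = []
    for expo in product(range(omega + 1), repeat=len(C)):
        if sum(expo) <= omega:
            alpha = [0] * n
            for i, e in zip(C, expo):
                alpha[i] = e          # only coordinates in C may be nonzero
            result.append(tuple(alpha))
    return result

# Example with n = 3, C = {0, 2}, omega = 1:
# [(0, 0, 0), (0, 0, 1), (1, 0, 0)].  One can also check the Minkowski-sum
# identity by verifying that sums of pairs from A(C, 1, 3) give exactly A(C, 2, 3).
print(A({0, 2}, 1, 3))
```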

3. SOS relaxations of unconstrained polynomial optimization problems. In this section, we consider an unconstrained POP

$$\text{minimize } f_0(x). \tag{3.1}$$

Let $\zeta = \inf\{f_0(x) : x \in \mathbb{R}^n\}$. Throughout this section, we assume that $\zeta > -\infty$. Then $\deg(f_0)$ must be an even integer, i.e., $\deg(f_0) = 2\omega_0$ for some $\omega_0 \in \mathbb{Z}_+$. By the lemma in Section 3 of [29], we also know that $\mathcal{F}_0 = \mathrm{supp}(f_0) \subset \mathrm{conv}(\mathcal{F}^e_0)$, where $\mathcal{F}^e_0 = \{\alpha \in \mathcal{F}_0 : \alpha_i \text{ is an even nonnegative integer } (i = 1, 2, \ldots, n)\}$.

3.1. An outline of sparse SOS relaxations. We first convert the POP (3.1) into an equivalent problem

$$\text{maximize } \zeta \quad \text{subject to} \quad f_0(x) - \zeta \ge 0. \tag{3.2}$$

We fix a positive integer $\omega \ge \omega_0$, and replace the constraint of the problem (3.2) by an SOS constraint to obtain

$$\text{maximize } \zeta \quad \text{subject to} \quad f_0(x) - \zeta \in \mathbb{R}[x, \mathcal{A}^N_{\omega}]^2. \tag{3.3}$$


The SOS optimization problem (3.3) serves as a relaxation of the POP (3.1). See the paper [26] and the references therein for more details of this relaxation. Using the relation (2.1), we can rewrite the SOS constraint of (3.3) as

$$f_0(x) - \zeta = u(x, \mathcal{A}^N_{\omega})^T V u(x, \mathcal{A}^N_{\omega}) \quad \text{and} \quad V \in \mathcal{S}_+(\mathcal{A}^N_{\omega}). \tag{3.4}$$

We call the parameter $\omega \in \mathbb{Z}_+$ in (3.3) the (relaxation) order. In fact, we can fix $\omega = \omega_0$ in the unconstrained case. Nevertheless, we regard $\omega$ as a parameter to be consistent with the notation of the constrained case.

We call a polynomial $f_0 \in \mathbb{R}[x, \mathcal{A}^N_{2\omega_0}]$ sparse if the number of elements in its support $\mathcal{F}_0 = \mathrm{supp}(f_0)$ is much smaller than the number of elements in $\mathcal{A}^N_{2\omega_0}$, which forms the support of fully dense polynomials in $\mathbb{R}[x, \mathcal{A}^N_{2\omega_0}]$. When the objective function $f_0$ is a sparse polynomial, the size of the SOS constraint in (3.3) can be reduced by eliminating redundant elements from $\mathcal{A}^N_{\omega}$. In fact, by applying Theorem 1 of [29], $\mathcal{A}^N_{\omega}$ in the problem (3.3) can be replaced by

$$\mathcal{G}^0_0 = \mathrm{conv}\left\{ \frac{\alpha}{2} : \alpha \in \mathcal{F}^e_0 \cup \{0\} \right\} \cap \mathbb{Z}^n_+ \subset \mathcal{A}^N_{\omega}.$$

Note that $\{0\}$ is added as the support for the real variable $\zeta$.

A method that can further reduce the size of the SOS optimization problem by eliminating redundant elements in $\mathcal{G}^0_0$ was proposed by Kojima et al. in [19]. We write the resulting SOS constraint from their method as

$$f_0(x) - \zeta \in \mathbb{R}[x, \mathcal{G}_0]^2, \tag{3.5}$$

where $\mathcal{G}_0 \subset \mathcal{G}^0_0 \subset \mathcal{A}^N_{\omega}$ denotes the set obtained by applying the method.

We now outline a new sparse relaxation. Using the structure obtained from the correlative sparsity, we generate multiple support sets $\mathcal{G}_1, \mathcal{G}_2, \ldots, \mathcal{G}_p \subset \mathbb{Z}^n_+$ such that

$$\mathcal{F}_0 \cup \{0\} \subset \bigcup_{\ell=1}^{p} (\mathcal{G}_\ell + \mathcal{G}_\ell), \tag{3.6}$$

and replace the SOS constraint (3.5) by

$$f_0(x) - \zeta \in \sum_{\ell=1}^{p} \mathbb{R}[x, \mathcal{G}_\ell]^2, \tag{3.7}$$

where $\sum_{\ell=1}^{p} \mathbb{R}[x, \mathcal{G}_\ell]^2 = \left\{ \sum_{\ell=1}^{p} h_\ell : h_\ell \in \mathbb{R}[x, \mathcal{G}_\ell]^2 \ (\ell = 1, 2, \ldots, p) \right\}$. The support of $f_0(x) - \zeta$ is $\mathcal{F}_0 \cup \{0\}$, while the support of each polynomial in $\sum_{\ell=1}^{p} \mathbb{R}[x, \mathcal{G}_\ell]^2$ is contained in $\bigcup_{\ell=1}^{p} (\mathcal{G}_\ell + \mathcal{G}_\ell)$. Hence (3.6) is necessary for the SOS constraint (3.7) to be feasible, although it is not sufficient. If the size of each $\mathcal{G}_\ell$ is much smaller than the size of $\mathcal{G}_0$, and if the number of support sets $p$ is not large, the size of the SOS constraint (3.7) is smaller than that of the SOS constraint in (3.3).

3.2. Correlative sparsity pattern matrix. The sparsity considered here is measured by the number of different kinds of cross terms in the objective polynomial $f_0$. We call this type of sparsity correlative sparsity. The correlative sparsity is represented by the $n \times n$ (symbolic, symmetric) correlative sparsity pattern matrix (abbreviated as csp matrix) $R$, whose $(i, j)$th element $R_{ij}$ is given by

$$R_{ij} = \begin{cases} \star & \text{if } i = j, \\ \star & \text{if } \alpha_i \ge 1 \text{ and } \alpha_j \ge 1 \text{ for some } \alpha \in \mathcal{F}_0 = \mathrm{supp}(f_0), \\ 0 & \text{otherwise} \end{cases}$$

$(i = 1, 2, \ldots, n,\ j = 1, 2, \ldots, n)$. Here $\star$ stands for some nonzero element. If the csp matrix $R$ of $f_0$ is sparse, then $f_0$ is sparse as defined in [19], but the converse is not true. We say that $f_0$ is correlatively sparse if the associated csp matrix is sparse. As mentioned in the Introduction, the correlative sparsity of an objective function $f_0(x)$ is equivalent to the sparsity of its Hessian matrix with some additional nonzero diagonal elements.
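To make the construction concrete, the following short Python sketch (our own illustration; it assumes a polynomial is stored as a list of exponent tuples, with 1 standing for the symbol $\star$) builds the csp matrix of an objective polynomial from its support:

```python
import numpy as np

def csp_matrix(support, n):
    """Symbolic csp matrix of a polynomial given by its support, a list of
    exponent tuples alpha in Z^n_+.  Entry 1 plays the role of the star."""
    R = np.eye(n, dtype=int)                 # every diagonal element is a star
    for alpha in support:
        vars_in_term = [i for i, a in enumerate(alpha) if a >= 1]
        for i in vars_in_term:
            for j in vars_in_term:
                R[i, j] = 1                  # x_i and x_j share a monomial
    return R

# Example: f0(x) = x1^4 + x1*x2 + x2^2*x3 (0-based exponent tuples, n = 3).
F0 = [(4, 0, 0), (1, 1, 0), (0, 2, 1)]
print(csp_matrix(F0, 3))
```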

3.3. Correlative sparsity pattern graphs. We describe a method to determine the sets of supports $\mathcal{G}_1, \mathcal{G}_2, \ldots, \mathcal{G}_p$ for the target SOS relaxation (3.7) of the unconstrained POP (3.1). The basic idea is to use the structure of the csp matrix $R$ and some results from graph theory.

Given a csp matrix $R$, the undirected graph $G(N, E)$ with $N = \{1, 2, \ldots, n\}$ and $E = \{\{i, j\} : i, j \in N,\ i < j,\ R_{ij} = \star\}$ is called the correlative sparsity pattern graph (abbreviated as csp graph). Let $C_1, C_2, \ldots, C_p \subset N$ denote the maximal cliques of the csp graph $G(N, E)$. Then choose the sets of supports $\mathcal{G}_\ell = \mathcal{A}^{C_\ell}_{\omega}$ $(\ell = 1, 2, \ldots, p)$. We can easily verify that the relation (3.6) holds. However, the method described above for choosing $\mathcal{G}_1, \mathcal{G}_2, \ldots, \mathcal{G}_p$ has a critical disadvantage, since finding all maximal cliques of a graph is a difficult problem in general; in fact, finding a single maximum clique is an NP-hard problem. To resolve this difficulty, we generate a chordal extension $G(N, E')$ of the csp graph $G(N, E)$ and use the extended csp graph $G(N, E')$ instead of $G(N, E)$. See [1, 7] for chordal graphs and finding all maximal cliques.

Consequently, we obtain a sparse SOS relaxation of the POP (3.1):

$$\text{maximize } \zeta \quad \text{subject to} \quad f_0(x) - \zeta \in \sum_{\ell=1}^{p} \mathbb{R}[x, \mathcal{A}^{C_\ell}_{\omega}]^2, \tag{3.8}$$

where $C_\ell$ $(\ell = 1, 2, \ldots, p)$ denote the maximal cliques of a chordal extension $G(N, E')$ of the csp graph $G(N, E)$.

There may be several different chordal extensions of a graph $G(N, E)$, and any of them is valid for deriving the sparse relaxation presented in this paper. The chordal extension with the least number of edges, called the minimum chordal extension, serves best for the resulting sparse relaxation. We remark that finding a chordal extension of a graph is equivalent to computing a symbolic sparse Cholesky factorization of its adjacency matrix; the resulting sparse matrix represents the chordal extension. The minimum chordal extension corresponds to the sparse Cholesky factorization with the minimum fill-in. Finding the minimum chordal extension is difficult in general, but fortunately several heuristics, such as the minimum degree ordering, are known to produce a good approximation efficiently. For more information on symbolic Cholesky factorization with the minimum degree ordering and a chordal extension, see the paper [6].
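The following Python sketch (again our own simplification; it takes the elimination ordering as an input, defaulting to the identity instead of a genuine minimum degree ordering) carries out this symbolic elimination on the csp graph, returning a chordal extension together with its maximal cliques:

```python
def chordal_extension(n, edges, order=None):
    """Symbolic elimination on the adjacency structure of a csp graph.
    Returns the filled edge set (a chordal extension) and its maximal cliques.
    A fill-reducing ordering such as minimum degree should be supplied via
    `order` in practice; the default 0..n-1 is only for illustration."""
    order = list(range(n)) if order is None else order
    pos = {v: k for k, v in enumerate(order)}
    adj = {v: set() for v in range(n)}
    for i, j in edges:
        adj[i].add(j); adj[j].add(i)
    cliques = []
    for v in order:
        higher = {u for u in adj[v] if pos[u] > pos[v]}
        cliques.append(frozenset({v} | higher))  # candidate clique at v
        for u in higher:          # fill-in: higher neighbors become a clique
            adj[u] |= higher - {u}
    filled = {(min(i, j), max(i, j)) for i in adj for j in adj[i]}
    maximal = [C for C in set(cliques) if not any(C < D for D in cliques)]
    return filled, maximal

# The arrow-shaped csp matrix discussed at the end of this section: every
# x_i interacts with x_n, and the maximal cliques are {i, n}, i < n.
n = 5
_, cliques = chordal_extension(n, [(i, n - 1) for i in range(n - 1)])
print(sorted(sorted(C) for C in cliques))   # [[0, 4], [1, 4], [2, 4], [3, 4]]
```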

It should be noted that the number of maximal cliques of $G(N, E')$ does not exceed $n$, the number of nodes of the graph $G(N, E')$, which equals the number of variables of the objective polynomial $f_0$.

Let us consider a few typical examples. Suppose that the objective polynomial $f_0 \in \mathbb{R}[x]$ of the unconstrained POP (3.1) is a separable polynomial of the form $f_0(x) = \sum_{i=1}^{n} h_i(x_i)$, where each $h_i(x_i)$ denotes a polynomial in a single variable $x_i \in \mathbb{R}$ with $\deg(h_i(x_i)) = 2\omega$. In this case, the csp matrix $R$ becomes an $n \times n$ diagonal matrix, so that $C_i = \{i\}$ $(i = 1, 2, \ldots, n)$. Hence we take $\mathcal{G}_\ell = \{\rho e_\ell : \rho = 0, 1, 2, \ldots, \omega\}$ $(\ell = 1, 2, \ldots, n)$ in the sparse SOS relaxation (3.8), where $e_\ell \in \mathbb{R}^n$ denotes the $\ell$th unit vector with 1 at the $\ell$th coordinate and 0 elsewhere. The resulting SOS optimization problem inherits the separability from the separable objective function $f_0$ and is subdivided into $n$ independent subproblems; each subproblem forms an SOS relaxation of the corresponding subproblem of the POP (3.1), minimizing $h_\ell(x_\ell)$ in a single variable. We remark here that if we directly apply the sparse SOS relaxation proposed in the paper [19], we obtain the dense relaxation of the form (3.3). Therefore, this case shows a critical difference between the sparse SOS relaxation proposed in this paper and the one given in the paper [19]. See Proposition 5.1 of [19] for more details.

Suppose that $f_0(x) = \sum_{i=1}^{n-1} \left( a_i x_i^4 + b_i x_i^2 x_{i+1} + c_i x_i x_{i+1} \right)$, where $a_i$, $b_i$ and $c_i$ are nonzero real numbers $(i = 1, 2, \ldots, n-1)$. Then the csp matrix turns out to be the $n \times n$ tridiagonal matrix, which in fact induces a chordal graph; hence there is no need to extend it. In this case, the maximal cliques of the chordal graph are $C_\ell = \{\ell, \ell+1\}$ $(\ell = 1, 2, \ldots, n-1)$.

For another example, let us consider $f_0(x) = \sum_{i=1}^{n-1} \left( a_i x_i^4 + b_i x_i^2 x_n + c_i x_i x_n \right)$, where $a_i$, $b_i$ and $c_i$ are nonzero real numbers $(i = 1, 2, \ldots, n-1)$. In this case, we have the csp matrix

$$R = \begin{pmatrix} \star & 0 & \cdots & 0 & \star \\ 0 & \star & & 0 & \star \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & \star & \star \\ \star & \star & \cdots & \star & \star \end{pmatrix},$$

which gives a chordal graph with the maximal cliques $C_\ell = \{\ell, n\}$ $(\ell = 1, 2, \ldots, n-1)$.

4. SOS relaxations of inequality constrained POPs. Let $f_k \in \mathbb{R}[x]$ $(k = 0, 1, 2, \ldots, m)$. Consider the following POP:

$$\text{minimize } f_0(x) \quad \text{subject to} \quad f_k(x) \ge 0 \ (k = 1, 2, \ldots, m). \tag{4.1}$$

Let $\zeta = \inf\{f_0(x) : f_k(x) \ge 0 \ (k = 1, 2, \ldots, m)\}$. Using the correlative sparsity of the POP (4.1), we determine a generalized Lagrangian function with the same sparsity and proper sets of supports in an SOS relaxation. A sparse SOS relaxation is derived in two steps. In the first step, we convert the POP (4.1) into an unconstrained minimization of the generalized Lagrangian function, following the paper [15]. In the second step, we apply the sparse SOS relaxation given in the previous section for unconstrained POPs to the resulting minimization problem. A key point in utilizing the correlative sparsity of the POP (4.1) is that the POP (4.1) and its generalized Lagrangian function have the same correlative sparsity.

4.1. Correlative sparsity in inequality constrained POPs. Let $F_k = \{i : \alpha_i \ge 1 \text{ for some } \alpha \in \mathrm{supp}(f_k)\}$ $(k = 1, 2, \ldots, m)$. Each $F_k$ is regarded as the index set of the variables $x_i$ appearing in the polynomial $f_k$. For example, if $n = 4$ and $f_k(x) = x_1^3 + 3 x_1 x_4 - 2 x_4^2$, then $F_k = \{1, 4\}$. Define the $n \times n$ (symbolic, symmetric) csp matrix $R$ such that

$$R_{ij} = \begin{cases} \star & \text{if } i = j, \\ \star & \text{if } \alpha_i \ge 1 \text{ and } \alpha_j \ge 1 \text{ for some } \alpha \in \mathrm{supp}(f_0), \\ \star & \text{if } i \in F_k \text{ and } j \in F_k \text{ for some } k \in \{1, 2, \ldots, m\}, \\ 0 & \text{otherwise.} \end{cases}$$

When the csp matrix $R$ is sparse, we say that the POP (4.1) is correlatively sparse.
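Reusing the csp_matrix sketch from Section 3.2, the constrained version only needs one more step, turning each index set $F_k$ into a clique of the pattern (again an illustrative helper of our own, with 0-based indices):

```python
def csp_matrix_constrained(obj_support, con_supports, n):
    """Csp matrix of the POP (4.1): the objective's pattern plus a block of
    stars on F_k x F_k for each constraint polynomial f_k."""
    R = csp_matrix(obj_support, n)           # sketch from Section 3.2
    for supp in con_supports:
        Fk = {i for alpha in supp for i, a in enumerate(alpha) if a >= 1}
        for i in Fk:
            for j in Fk:
                R[i, j] = 1
    return R

# The example from the text: n = 4, f_k(x) = x1^3 + 3*x1*x4 - 2*x4^2, so
# F_k = {1, 4} (0-based {0, 3}) and rows/columns 0 and 3 get linked.
supp_fk = [(3, 0, 0, 0), (1, 0, 0, 1), (0, 0, 0, 2)]
print(csp_matrix_constrained([], [supp_fk], 4))
```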


4.2. Generalized Lagrangian duals. The generalized Lagrangian function [15] is defined as

$$L(x, \varphi) = f_0(x) - \sum_{k=1}^{m} \varphi_k(x) f_k(x),$$

where $x \in \mathbb{R}^n$, $\varphi = (\varphi_1, \varphi_2, \ldots, \varphi_m) \in \Phi$, and

$$\Phi = \left\{ \varphi = (\varphi_1, \varphi_2, \ldots, \varphi_m) : \varphi_k \in \mathbb{R}[x, \mathcal{A}^N_{\omega}]^2 \text{ for some } \omega \in \mathbb{Z}_+ \ (k = 1, 2, \ldots, m) \right\}.$$

Then, for each fixed $\varphi \in \Phi$, the problem of minimizing $L(x, \varphi)$ over $x \in \mathbb{R}^n$ serves as a Lagrangian relaxation problem; its optimal value $L^*(\varphi) = \inf\{L(x, \varphi) : x \in \mathbb{R}^n\}$ bounds the optimal value $\zeta$ of the POP (4.1) from below.

If our aim is to preserve the correlative sparsity of the POP (4.1) in the resulting SOS relaxation, we need the Lagrangian function $L$ to inherit the correlative sparsity from the POP (4.1). Notice that $\varphi$ can be chosen for this purpose. In [15], Kim et al. proposed to choose, for each multiplier polynomial $\varphi_k$, a polynomial in the same variables $x_i$ $(i \in F_k)$ as the polynomial $f_k$; that is, $\mathrm{supp}(\varphi_k) \subset \{\alpha \in \mathbb{Z}^n_+ : \alpha_i = 0 \ (i \notin F_k)\}$. Let $\omega_k = \lceil \deg(f_k)/2 \rceil$ $(k = 0, 1, 2, \ldots, m)$ and $\omega_{\max} = \max\{\omega_k : k = 0, 1, \ldots, m\}$. For every nonnegative integer $\omega \ge \omega_{\max}$, define

$$\Phi_\omega = \left\{ \varphi = (\varphi_1, \varphi_2, \ldots, \varphi_m) : \varphi_k \in \mathbb{R}[x, \mathcal{A}^{F_k}_{\omega - \omega_k}]^2 \ (k = 1, 2, \ldots, m) \right\}.$$

Here the parameter $\omega \in \mathbb{Z}_+$ serves as the (relaxation) order of the SOS relaxation of the POP (4.1) that is derived in the next subsection. Then a generalized Lagrangian dual (with the Lagrangian multiplier $\varphi$ restricted to $\Phi_\omega$) [15] is defined as

$$\text{maximize } \zeta \quad \text{subject to} \quad L(x, \varphi) - \zeta \ge 0 \ (\forall x \in \mathbb{R}^n) \ \text{ and } \ \varphi \in \Phi_\omega. \tag{4.2}$$

Let $L^*_\omega$ denote the optimal value of this problem; $L^*_\omega = \sup\{L^*(\varphi) : \varphi \in \Phi_\omega\}$. Then $L^*_\omega \le \zeta$. If the POP (4.1) includes box inequality constraints of the form $\rho - x_i^2 \ge 0$ $(i = 1, 2, \ldots, n)$ for some $\rho > 0$, we know by Theorem 3.1 of [15] that $L^*_\omega$ converges to $\zeta$ as $\omega \to \infty$.

4.3. Sparse SOS relaxations. We show how a sparse SOS relaxation is formulated using the sets of supports constructed from the csp matrix $R$. Let $\omega \ge \omega_{\max}$ be fixed, and suppose that $\varphi \in \Phi_\omega$. Then $L(\cdot, \varphi)$ forms a polynomial in $x_i$ $(i = 1, 2, \ldots, n)$ with $\deg(L(\cdot, \varphi)) = 2\omega$. We also observe from the construction of the csp matrix $R$ and $\Phi_\omega$ that the polynomial $L(\cdot, \varphi)$ has the same csp matrix as the csp matrix $R$ constructed for the POP (4.1). As in Section 3.3, the csp matrix $R$ induces the csp graph $G(N, E)$. By construction, we know that each $F_k$ forms a clique of the csp graph $G(N, E)$. Let $C_1, C_2, \ldots, C_p$ be the maximal cliques of a chordal extension $G(N, E')$ of $G(N, E)$. Then a sparse SOS relaxation of the POP (4.1) is written as

$$\text{maximize } \zeta \quad \text{subject to} \quad L(x, \varphi) - \zeta \in \sum_{\ell=1}^{p} \mathbb{R}[x, \mathcal{A}^{C_\ell}_{\omega}]^2 \ \text{ and } \ \varphi \in \Phi_\omega. \tag{4.3}$$

Let $\zeta_\omega$ denote the optimal objective value of this SOS optimization problem. Then $\zeta_\omega \le L^*_\omega \le \zeta$ for every $\omega \ge \omega_{\max}$, but the convergence of $\zeta_\omega$ to $\zeta$ as $\omega \to \infty$ is not guaranteed in theory.


The above idea of the SOS relaxation of the constrained POP (4.1) using the generalized Lagrangian function stems from Putinar's lemma [28], and it was first used in [22]. In fact, if we replace every index subset $F_k$ of $N$ $(k = 1, 2, \ldots, m)$ by the entire index set $N$, and if we take $p = 1$ and $C_1 = N$, then the resulting SOS relaxation (4.3) of the POP (4.1) essentially coincides with the dense SOS relaxation (4.10) of Lasserre [22]; in this case, it was shown in [22] that $\zeta_\omega \to \zeta$ as $\omega \to \infty$ under moderate assumptions.

4.4. Primal approach. We have formulated a sparse SOS relaxation (4.3) of the inequality constrained POP (4.1) in the previous subsection. For numerical computation, we convert the SOS optimization problem (4.3) into an SDP, which serves as an SDP relaxation of the POP (4.1). We may regard this way of deriving an SDP relaxation from the POP (4.1) as the dual approach. We briefly describe below the so-called primal approach to the POP (4.1), whose sparsity is characterized by the csp matrix $R$ and the csp graph $G(N, E)$. We use the same symbols and notation as in Section 4.3. Let $\omega \ge \omega_{\max}$. To derive a primal SDP relaxation, we first transform the POP (4.1) into an equivalent polynomial SDP

$$\begin{array}{ll} \text{minimize} & f_0(x) \\ \text{subject to} & u(x, \mathcal{A}^{F_k}_{\omega - \omega_k})\, u(x, \mathcal{A}^{F_k}_{\omega - \omega_k})^T f_k(x) \in \mathcal{S}_+(\mathcal{A}^{F_k}_{\omega - \omega_k}) \ (k = 1, 2, \ldots, m), \\ & u(x, \mathcal{A}^{C_\ell}_{\omega})\, u(x, \mathcal{A}^{C_\ell}_{\omega})^T \in \mathcal{S}_+(\mathcal{A}^{C_\ell}_{\omega}) \ (\ell = 1, 2, \ldots, p). \end{array} \tag{4.4}$$

The matrices $u(x, \mathcal{A}^{F_k}_{\omega - \omega_k})\, u(x, \mathcal{A}^{F_k}_{\omega - \omega_k})^T$ $(k = 1, 2, \ldots, m)$ and $u(x, \mathcal{A}^{C_\ell}_{\omega})\, u(x, \mathcal{A}^{C_\ell}_{\omega})^T$ $(\ell = 1, 2, \ldots, p)$ are positive semidefinite symmetric matrices of rank one for any $x \in \mathbb{R}^n$, and each has 1 as a diagonal element. These facts ensure the equivalence between the POP (4.1) and the polynomial SDP above. Let

$$\widetilde{\mathcal{F}} = \left( \bigcup_{\ell=1}^{p} \mathcal{A}^{C_\ell}_{2\omega} \right) \setminus \{0\},$$

$$\widetilde{\mathcal{S}} = \mathcal{S}(\mathcal{A}^{F_1}_{\omega - \omega_1}) \times \cdots \times \mathcal{S}(\mathcal{A}^{F_m}_{\omega - \omega_m}) \times \mathcal{S}(\mathcal{A}^{C_1}_{\omega}) \times \cdots \times \mathcal{S}(\mathcal{A}^{C_p}_{\omega})$$

(the set of block diagonal matrices with blocks in $\mathcal{S}(\mathcal{A}^{F_k}_{\omega - \omega_k})$ $(k = 1, \ldots, m)$ and $\mathcal{S}(\mathcal{A}^{C_\ell}_{\omega})$ $(\ell = 1, \ldots, p)$ on their diagonal blocks), and

$$\widetilde{\mathcal{S}}_+ = \left\{ M \in \widetilde{\mathcal{S}} : M \text{ positive semidefinite} \right\}.$$

Then we can rewrite the polynomial SDP above as

$$\text{minimize } \sum_{\alpha \in \widetilde{\mathcal{F}}} \tilde{c}_0(\alpha) x^{\alpha} \quad \text{subject to} \quad M(0) + \sum_{\alpha \in \widetilde{\mathcal{F}}} M(\alpha) x^{\alpha} \in \widetilde{\mathcal{S}}_+$$

for some $\tilde{c}_0(\alpha) \in \mathbb{R}$ $(\alpha \in \widetilde{\mathcal{F}})$, $M(0) \in \widetilde{\mathcal{S}}$ and $M(\alpha) \in \widetilde{\mathcal{S}}$ $(\alpha \in \widetilde{\mathcal{F}})$. Now, replacing each monomial $x^{\alpha}$ by a single real variable $y_\alpha$, we obtain an SDP relaxation of (4.1):

$$\text{minimize } \sum_{\alpha \in \widetilde{\mathcal{F}}} \tilde{c}_0(\alpha) y_\alpha \quad \text{subject to} \quad M(0) + \sum_{\alpha \in \widetilde{\mathcal{F}}} M(\alpha) y_\alpha \in \widetilde{\mathcal{S}}_+. \tag{4.5}$$

We denote the optimal objective value of (4.5) by $\hat{\zeta}_\omega$.


The primal approach described in this section is based on the moment formulation proposed in [22], which is dual to the SOS relaxation of the constrained POP (4.1). More precisely, if we replace every index subset $F_k$ of $N$ $(k = 1, 2, \ldots, m)$ by the entire index set $N$, and if we take $p = 1$ and $C_1 = N$, then we obtain the SDP relaxation of the POP (4.1) which corresponds to the SDP (4.6) of Lasserre [22]. In this case, the linearization of the matrix $u(x, \mathcal{A}^N_{\omega})\, u(x, \mathcal{A}^N_{\omega})^T$ forms the moment matrix $M_N(y)$ of the SDP (4.5) of Lasserre [22].

4.5. SOS and SDP relaxations of quadratic optimization problems with order 1. Consider a quadratic optimization problem (QOP)

$$\text{minimize } x^T Q_0 x + 2 q_0^T x \quad \text{subject to} \quad x^T Q_k x + 2 q_k^T x + \gamma_k \ge 0 \ (k = 1, 2, \ldots, m). \tag{4.6}$$

Here $Q_k$ denotes an $n \times n$ symmetric matrix, $q_k \in \mathbb{R}^n$ and $\gamma_k \in \mathbb{R}$. In this case, we show that the proposed sparse SOS relaxation (4.3) of order $\omega = 1$, using any chordal extension of the csp graph $G(N, E)$, attains the same optimal value as the dense SOS relaxation [22] of order $\omega = 1$. This demonstrates an advantage of using the set of maximal cliques of a chordal extension of the csp graph $G(N, E)$ instead of the set of maximal cliques of $G(N, E)$ itself.

We formulate the dense [22] and sparse relaxations of order $\omega = 1$ using SOS polynomials from the dual side. Consider the Lagrangian dual of (4.6):

$$\text{maximize } \zeta \quad \text{subject to} \quad L(x, \varphi) - \zeta \ge 0 \ (\forall x \in \mathbb{R}^n) \ \text{ and } \ \varphi \in \mathbb{R}^m_+, \tag{4.7}$$

where $L$ denotes the Lagrangian function

$$L(x, \varphi) = x^T \left( Q_0 - \sum_{k=1}^{m} \varphi_k Q_k \right) x + 2 \left( q_0 - \sum_{k=1}^{m} \varphi_k q_k \right)^T x - \sum_{k=1}^{m} \varphi_k \gamma_k.$$

Then we replace the constraint $L(x, \varphi) - \zeta \ge 0$ $(\forall x \in \mathbb{R}^n)$ by an SOS condition $L(x, \varphi) - \zeta \in \mathbb{R}[x, \mathcal{A}^N_1]^2$ to obtain the dense relaxation [22] of order $\omega = 1$:

$$\text{maximize } \zeta \quad \text{subject to} \quad L(x, \varphi) - \zeta \in \mathbb{R}[x, \mathcal{A}^N_1]^2 \ \text{ and } \ \varphi \in \mathbb{R}^m_+. \tag{4.8}$$

Now consider the aggregated sparsity pattern matrix $\widetilde{R}$ over the coefficient matrices $Q_0, Q_1, \ldots, Q_m$ such that

$$\widetilde{R}_{ij} = \begin{cases} \star & \text{if } i = j, \\ \star & \text{if } i \neq j \text{ and } [Q_k]_{ij} \neq 0 \text{ for some } k \in \{0, 1, 2, \ldots, m\}, \\ 0 & \text{otherwise,} \end{cases}$$

which coincides with the csp matrix of the Lagrangian function $L(\cdot, \varphi)$ with $\varphi \in \mathbb{R}^m_+$. Let $G(N, E')$ be a chordal extension of the csp graph $G(N, E)$ induced from $\widetilde{R}$, and let $C_\ell$ $(\ell = 1, 2, \ldots, p)$ be the maximal cliques of $G(N, E')$. Then we can apply the sparse relaxation (3.8) to the unconstrained minimization of the Lagrangian function $L(\cdot, \varphi)$ with $\varphi \in \mathbb{R}^m_+$. Thus, replacing $\mathbb{R}[x, \mathcal{A}^N_1]^2$ in the dense SOS relaxation (4.8) by $\sum_{\ell=1}^{p} \mathbb{R}[x, \mathcal{A}^{C_\ell}_1]^2$, we obtain the sparse SOS relaxation

$$\text{maximize } \zeta \quad \text{subject to} \quad L(x, \varphi) - \zeta \in \sum_{\ell=1}^{p} \mathbb{R}[x, \mathcal{A}^{C_\ell}_1]^2 \ \text{ and } \ \varphi \in \mathbb{R}^m_+. \tag{4.9}$$


Note that $L(\cdot, \varphi)$ is a quadratic function in $x \in \mathbb{R}^n$ which results in the same csp graph $G(N, E)$ for each $\varphi \in \mathbb{R}^m_+$, and that $C_\ell$ $(\ell = 1, 2, \ldots, p)$ are the maximal cliques of a chordal extension $G(N, E')$ of the csp graph $G(N, E)$. Hence, if $\varphi$ is chosen so that the Hessian matrix $\nabla^2_{xx} L(x, \varphi)$ of $L(x, \varphi)$ is positive semidefinite, $\nabla^2_{xx} L(x, \varphi)$ can be factorized by a Cholesky factorization as $\nabla^2_{xx} L(x, \varphi) = M M^T$ for some $n \times n$ matrix $M$ with the property that, for each $j = 1, 2, \ldots, n$, the set $\{i \in N : M_{ij} \neq 0\}$ is contained in some maximal clique of $G(N, E')$. If, in addition, $L(x, \varphi) - \zeta$ is an SOS polynomial, i.e., the constraint of the dense relaxation (4.8) is satisfied, then $L(x, \varphi) - \zeta$ is represented as

$$L(x, \varphi) - \zeta = (1, x^T)\, \widetilde{M} \widetilde{M}^T \begin{pmatrix} 1 \\ x \end{pmatrix} = \alpha^2 + \sum_{\ell=1}^{n} \left( \widetilde{M}^T_{\cdot\, \ell+1} \begin{pmatrix} 1 \\ x \end{pmatrix} \right)^2$$

for some $\alpha \ge 0$ and some $(1+n) \times (1+n)$ matrix $\widetilde{M}$ of the form

$$\widetilde{M} = \begin{pmatrix} \alpha & b^T \\ 0 & M \end{pmatrix}.$$

Here $\widetilde{M}_{\cdot\, \ell+1}$ denotes the $(\ell+1)$st column of $\widetilde{M}$. It should be noted that each $\widetilde{M}^T_{\cdot\, \ell+1} \begin{pmatrix} 1 \\ x \end{pmatrix}$ is an affine function whose support, as a polynomial, is contained in $\mathcal{A}^{C}_1 = \{\alpha \in \mathbb{Z}^n_+ : \alpha_i = 0 \ (i \notin C), \ \sum_{i \in C} \alpha_i \le 1\}$ for some maximal clique $C$ of $G(N, E')$. Therefore, we have shown that the dense SOS relaxation (4.8) of order $\omega = 1$ is equivalent to the sparse SOS relaxation (4.9) of order $\omega = 1$.

5. Some technical issues.

5.1. Computing optimal solutions. Henrion and Lasserre [10] presented a linear algebra method that computes multiple optimal solutions of the POP (4.1). The moment matrix of full size induced from $u(x, \mathcal{A}^N_{\omega})\, u(x, \mathcal{A}^N_{\omega})^T$ plays an essential role in their method. In the proposed sparse relaxation, however, the moment matrix of full size is not available; instead, multiple but partial moment matrices from $u(x, \mathcal{A}^{C_\ell}_{\omega})\, u(x, \mathcal{A}^{C_\ell}_{\omega})^T$ $(\ell = 1, 2, \ldots, p)$ are generated, where the monomials in the variables $x_i$ $(i \in C_\ell)$ with degree up to $\omega$ are taken as the elements of the column vector $u(x, \mathcal{A}^{C_\ell}_{\omega})$. As mentioned in the previous sections, we further apply the method of [19] that eliminates redundant monomials from $u(x, \mathcal{A}^{C_\ell}_{\omega})$ to reduce the size of the partial moment matrices. For these reasons, it is difficult to utilize the linear algebra method in the sparse relaxation.

We present a different technique. The basic idea is to perturb the POP (4.1) so that the projection of optimal solutions of the resulting primal SDP relaxation (4.5) onto the space of the variables $x_i$ $(i = 1, 2, \ldots, n)$ consists of a unique point, which is the unique optimal solution of the perturbed POP. This technique was originally proposed in Section 6.7 of [9]. We may assume without loss of generality that the objective polynomial $f_0$ of the POP (4.1) is linear; if $f_0$ is not linear, we may replace $f_0(x)$ by a new variable $x_0$ and add the inequality constraint $f_0(x) \le x_0$.

We consider

minimizef0(x) +pTxsubject tofk(x)≥0 (k= 1,2, . . . , m).

(5.1)

Here $p \in \mathbb{R}^n$ denotes a perturbation vector. We then focus on the primal SDP relaxation of the perturbed POP (5.1), which can be described as the problem of minimizing $f_0(y_{e_1}, y_{e_2}, \ldots, y_{e_n}) + \sum_{i=1}^{n} p_i y_{e_i}$ subject to the constraint of the SDP (4.5). Define

$$\widetilde{D} = \left\{ (y_{e_1}, y_{e_2}, \ldots, y_{e_n}) \in \mathbb{R}^n : (y_\alpha : \alpha \in \widetilde{\mathcal{F}}) \text{ is a feasible solution of (4.5)} \right\}.$$

Note that $\widetilde{D}$ is a convex subset of $\mathbb{R}^n$. Then the primal SDP relaxation of the perturbed POP (5.1) is equivalent to the convex program

$$\text{minimize } f_0(x) + p^T x \quad \text{subject to} \quad x \in \widetilde{D}, \tag{5.2}$$

which may be regarded as the projection of the primal SDP relaxation of the perturbed POP (5.1) onto the space of the variables of (5.1). Now we assume a certain weak stability of the optimal solution set of the convex program (5.2): there exist $\epsilon > 0$ and $\rho > 0$ such that the optimal solution set of the convex program (5.2) is nonempty and contained in the ball $B(\rho) = \{x \in \mathbb{R}^n : \|x\| \le \rho\}$ for any perturbation with $\|p\| \le \epsilon$. In this case, we can replace the feasible region $\widetilde{D}$ of the convex program (5.2) by $\widetilde{D} \cap B(\rho)$, which is convex and bounded. Hence, the convex program (5.2) has a unique solution for almost every $p$ with $\|p\| \le \epsilon$ by Theorem 2.2.9 of [30].

Consequently, under the assumption on its optimal solution set, the convex program (5.2) has a unique optimal solution for almost every small $p$. Suppose that:

(a) $p$ is sufficiently small.

(b) The convex program (5.2) has a unique optimal solution $\hat{x}$; this means that $\hat{x} = (\hat{y}_{e_1}, \hat{y}_{e_2}, \ldots, \hat{y}_{e_n})^T$ is obtained from any optimal solution $(\hat{y}_\alpha : \alpha \in \widetilde{\mathcal{F}})$ of the primal SDP relaxation of the perturbed POP (5.1).

(c) The optimal value of the primal SDP relaxation of (5.1) coincides with the value $f_0(\hat{x}) + p^T \hat{x}$; when $f_0$ is linear, this condition always holds.

(d) $\hat{x}$ is a feasible solution of the perturbed POP (5.1).

Then $\hat{x}$ is an optimal solution of the perturbed POP (5.1), which may be regarded as an approximate optimal solution of the original POP (4.1).

5.2. Equality constraints. Consider the POP

$$\text{minimize } f_0(x) \quad \text{subject to} \quad f_k(x) \ge 0 \ (k = 1, 2, \ldots, m), \quad h_j(x) = 0 \ (j = 1, 2, \ldots, q). \tag{5.3}$$

Here $h_j \in \mathbb{R}[x]$. Replacing each $h_j(x) = 0$ by the two inequality constraints $h_j(x) \ge 0$ and $-h_j(x) \ge 0$, we reduce the POP (5.3) to the inequality constrained POP

$$\text{minimize } f_0(x) \quad \text{subject to} \quad f_k(x) \ge 0 \ (k = 1, 2, \ldots, m), \quad h_j(x) \ge 0, \ -h_j(x) \ge 0 \ (j = 1, 2, \ldots, q). \tag{5.4}$$

Let

$$\begin{array}{l} \omega_k = \lceil \deg(f_k)/2 \rceil \ (k = 0, 1, 2, \ldots, m), \quad \chi_j = \lceil \deg(h_j)/2 \rceil \ (j = 1, 2, \ldots, q), \\ \omega_{\max} = \max\{\omega_k \ (k = 0, 1, 2, \ldots, m), \ \chi_j \ (j = 1, 2, \ldots, q)\}, \\ F_k = \{i : \alpha_i \ge 1 \text{ for some } \alpha \in \mathrm{supp}(f_k)\} \ (k = 1, 2, \ldots, m), \\ H_j = \{i : \alpha_i \ge 1 \text{ for some } \alpha \in \mathrm{supp}(h_j)\} \ (j = 1, 2, \ldots, q). \end{array}$$


We construct the csp matrix $R$ and the csp graph $G(N, E)$ of the POP (5.4). Let $C_1, C_2, \ldots, C_p$ be the maximal cliques of a chordal extension of $G(N, E)$, and let $\omega \ge \omega_{\max}$. Applying the SOS relaxation in Section 4 to the POP (5.4), we have

$$\begin{array}{ll} \text{maximize} & \zeta \\ \text{subject to} & f_0(x) - \displaystyle\sum_{k=1}^{m} \varphi_k(x) f_k(x) - \sum_{j=1}^{q} \left( \psi_j^+(x) - \psi_j^-(x) \right) h_j(x) - \zeta \in \displaystyle\sum_{\ell=1}^{p} \mathbb{R}[x, \mathcal{A}^{C_\ell}_{\omega}]^2, \\ & \varphi \in \Phi_\omega, \quad \psi_j^+, \psi_j^- \in \mathbb{R}[x, \mathcal{A}^{H_j}_{\omega - \chi_j}]^2 \ (j = 1, 2, \ldots, q). \end{array}$$

Since $\mathbb{R}[x, \mathcal{A}^{H_j}_{\omega - \chi_j}]^2 - \mathbb{R}[x, \mathcal{A}^{H_j}_{\omega - \chi_j}]^2 = \mathbb{R}[x, \mathcal{A}^{H_j}_{2(\omega - \chi_j)}]$, this problem is equivalent to

$$\begin{array}{ll} \text{maximize} & \zeta \\ \text{subject to} & f_0(x) - \displaystyle\sum_{k=1}^{m} \varphi_k(x) f_k(x) - \sum_{j=1}^{q} \psi_j(x) h_j(x) - \zeta \in \displaystyle\sum_{\ell=1}^{p} \mathbb{R}[x, \mathcal{A}^{C_\ell}_{\omega}]^2, \\ & \varphi \in \Phi_\omega, \quad \psi_j \in \mathbb{R}[x, \mathcal{A}^{H_j}_{2(\omega - \chi_j)}] \ (j = 1, 2, \ldots, q). \end{array} \tag{5.5}$$

We can solve the SOS optimization problem (5.5) as an SDP with free variables.

5.3. Reducing the sizes of SOS relaxations. In [19], a method for reducing the size of an SOS relaxation by exploiting sparsity is proposed. The method consists of two phases. Suppose that, given an SOS polynomial $f$ whose support is $\mathcal{F}$, we want to represent $f$ using unknown polynomials $\phi_i \in \mathbb{R}[x, \mathcal{G}]$ $(i = 1, 2, \ldots, k)$ with some support $\mathcal{G}$ such that $f = \sum_{i=1}^{k} \phi_i^2$. In phase 1 of the method in [19], we compute $\mathcal{G}_0 = \mathrm{conv}\left\{ \frac{\alpha}{2} : \alpha \in \mathcal{F}^e \right\} \cap \mathbb{Z}^n_+$, where $\mathcal{F}^e = \{\alpha \in \mathcal{F} : \alpha_i \text{ is even } (i = 1, 2, \ldots, n)\}$. It is known from [29] that $\mathrm{supp}(\phi_i) \subset \mathcal{G}_0$ for any SOS representation of $f = \sum_{i=1}^{k} \phi_i^2$. In phase 2, we eliminate from $\mathcal{G}_0$ redundant elements that are unnecessary in any SOS representation of $f$.

In the sparse SOS relaxations (3.8) and (4.3), we can apply phase 2 of the method, with some modification, to eliminate redundant elements from $\mathcal{A}^{C_\ell}_{\omega}$ $(\ell = 1, 2, \ldots, p)$. Let $\mathcal{F}$ denote the support of a polynomial $f$ which we want to represent as

$$f = \sum_{\ell=1}^{p} \psi_\ell \quad \text{for some } \psi_\ell \in \mathbb{R}[x, \mathcal{G}_\ell]^2 \ (\ell = 1, 2, \ldots, p). \tag{5.6}$$

The polynomial $f$ can be either $f_0 - \zeta$ in the unconstrained POP (3.1), or $L(\cdot, \varphi) - \zeta$ with $\varphi \in \Phi_\omega$ in the constrained POP (4.1). In both cases, we assume that the family of supports $\mathcal{G}_\ell = \mathcal{A}^{C_\ell}_{\omega}$ $(\ell = 1, 2, \ldots, p)$ is sufficient to represent $f$ as in (5.6); hence phase 1 is not implemented. Let $\mathcal{F}^e = \{\alpha \in \mathcal{F} : \alpha_i \text{ is even } (i = 1, 2, \ldots, n)\}$. For each $\alpha \in \bigcup_{\ell=1}^{p} \mathcal{G}_\ell$, we check whether the following relations hold:

$$2\alpha \notin \mathcal{F}^e \quad \text{and} \quad 2\alpha \notin \bigcup_{\ell=1}^{p} \{\beta + \gamma : \beta \in \mathcal{G}_\ell, \ \gamma \in \mathcal{G}_\ell, \ \beta \neq \alpha\}.$$

If an $\alpha \in \mathcal{G}_\ell$ satisfies these relations, we can eliminate $\alpha$ from $\mathcal{G}_\ell$, and we continue this process until no $\alpha \in \bigcup_{\ell=1}^{p} \mathcal{G}_\ell$ satisfies the two relations. See [19] for more details.
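A direct, unoptimized Python transcription of this elimination loop might look as follows (the data layout, exponent tuples collected in sets, is our own assumption):

```python
def reduce_supports(F, supports):
    """Phase-2 elimination applied to the supports G_1, ..., G_p of (5.6):
    drop alpha from G_l whenever 2*alpha is not an even support point of f
    and cannot arise as beta + gamma within any G_m except as alpha + alpha."""
    Fe = {a for a in F if all(ai % 2 == 0 for ai in a)}
    G = [set(g) for g in supports]
    changed = True
    while changed:
        changed = False
        for Gl in G:
            for alpha in list(Gl):
                two_alpha = tuple(2 * a for a in alpha)
                if two_alpha in Fe:
                    continue
                representable = any(
                    tuple(b + c for b, c in zip(beta, gamma)) == two_alpha
                    for Gm in G for beta in Gm for gamma in Gm
                    if beta != alpha)
                if not representable:
                    Gl.discard(alpha)
                    changed = True
    return G

# Example: f(x) = x^2 with initial support {1, x} (exponents (0,) and (1,)):
# the constant monomial is eliminated, leaving the representation f = (x)^2.
print(reduce_supports({(2,)}, [{(0,), (1,)}]))   # [{(1,)}]
```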


5.4. Supports for Lagrange multiplier polynomials. In the generalized Lagrangian dual (4.2) and the sparse SOS relaxation (4.3), each multiplier polynomial $\varphi_k$ has been chosen from the SOS polynomials with the support $\mathcal{A}^{F_k}_{\omega - \omega_k}$ to inherit the correlative sparsity from the original POP (4.1). For each $k$, let $J_k = \{\ell : F_k \subset C_\ell\}$ $(k = 1, 2, \ldots, m)$. By construction, $J_k \neq \emptyset$. Now we can replace the support $\mathcal{A}^{F_k}_{\omega - \omega_k}$ of the SOS polynomials for $\varphi_k$ by a union of $\mathcal{A}^{C_\ell}_{\omega - \omega_k}$ over some $\ell \in J_k$ in the sparse SOS relaxation (4.3). This modification strengthens the SOS relaxation (4.3) without losing much of the correlative sparsity elsewhere.

5.5. Valid polynomial inequalities and their linearization. By adding appropriate valid polynomial inequalities to the constrained POP (4.1), we can strengthen its SDP relaxation (4.5). This idea has been used in many convex relaxation methods; see the paper [18] and the references therein. We consider two types of valid polynomial inequalities that occur frequently in practice. These inequalities are used for some test problems in the numerical experiments in Section 6. Suppose that (4.1) involves nonnegativity and upper bound constraints on all variables: $0 \le x_i \le \rho_i$ $(i = 1, 2, \ldots, n)$, where $\rho_i$ denotes a nonnegative number $(i = 1, 2, \ldots, n)$. In this case, $0 \le x^{\alpha} \le \rho^{\alpha}$ $(\alpha \in \widetilde{\mathcal{F}})$ form valid inequalities, where $\rho = (\rho_1, \rho_2, \ldots, \rho_n) \in \mathbb{R}^n$. Therefore we can add their linearizations $0 \le y_\alpha \le \rho^{\alpha}$ to the primal SDP relaxation (4.5). The complementarity condition $x_i x_j = 0$ is another example: if $\alpha_i \ge 1$ and $\alpha_j \ge 1$ for some $\alpha \in \mathbb{Z}^n_+$, then $x^{\alpha} = 0$ forms a valid equality; hence we can add $y_\alpha = 0$ to the primal SDP relaxation, or we can reduce its size by eliminating the variable $y_\alpha = 0$.

5.6. Scaling. Polynomials of high degree in POPs can cause numerical problems. Even when the degrees of the objective and constraint polynomials are small, the polynomial SDP (4.4) involves high degree monomials $x^{\alpha}$ as the order $\omega$ gets larger. Note that each variable $y_\alpha$ corresponds to a monomial $x^{\alpha}$. More precisely, if $x$ is a feasible solution of the POP (4.1), then $(y_\alpha = x^{\alpha} : \alpha \in \widetilde{\mathcal{F}})$ is a feasible solution of the primal SDP relaxation (4.5) with the same objective value as (4.1). Therefore, if the magnitudes of some components of a feasible (or optimal) solution $x$ of (4.1) are much larger (or smaller) than 1, the magnitudes of some components of the corresponding solution $(y_\alpha = x^{\alpha} : \alpha \in \widetilde{\mathcal{F}})$ can be huge (or tiny). This can be a source of numerical difficulties. To avoid such unbalanced magnitudes in the components of feasible (or optimal) solutions of the primal SDP relaxation (4.5), it would be ideal to scale the POP (4.1) so that the magnitudes of all nonzero components of optimal solutions of the scaled problem are near 1. In practice, such an ideal scaling is impossible.

Here we restrict our discussion to a POP of the form (4.1) with additional finite lower and upper bounds on the variables $x_i$ $(i = 1, 2, \ldots, n)$: $\eta_i \le x_i \le \rho_i$, where $\eta_i$ and $\rho_i$ denote real numbers such that $\eta_i < \rho_i$. In this case, we can apply the linear transformation $z_i = (x_i - \eta_i)/(\rho_i - \eta_i)$ to the variables $x_i$. Then we have objective and constraint polynomials $g_k \in \mathbb{R}[z]$ $(k = 0, 1, \ldots, m)$ such that

$$g_k(z_1, z_2, \ldots, z_n) = f_k\big( (\rho_1 - \eta_1) z_1 + \eta_1, \ (\rho_2 - \eta_2) z_2 + \eta_2, \ \ldots, \ (\rho_n - \eta_n) z_n + \eta_n \big).$$

We further normalize the coefficients of each $g_k \in \mathbb{R}[z]$, replacing $g_k(z)$ by $g_k(z)/\nu_k$, where $\nu_k$ denotes the maximum magnitude of the coefficients of the polynomial $g_k \in \mathbb{R}[z]$ $(k = 0, 1, 2, \ldots, m)$. Consequently, we obtain a scaled POP which is equivalent to the POP (4.1) with the additional bound constraints on the variables $x_i$ $(i = 1, 2, \ldots, n)$:

$$\text{minimize } g_0(z) \quad \text{subject to} \quad g_k(z) \ge 0 \ (k = 1, 2, \ldots, m), \quad 0 \le z_i \le 1 \ (i = 1, 2, \ldots, n). \tag{5.7}$$

We note that the scaled POP (5.7) provides the same csp matrix as the original POP (4.1). Furthermore, we can add the constraints $0 \le y_\alpha \le 1$ $(\alpha \in \widetilde{\mathcal{F}})$ to its primal SDP (4.5) to strengthen the relaxation. A similar technique can be found in [9].
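As a sketch of the whole scaling step (our own illustration; polynomials are dictionaries mapping exponent tuples to coefficients), the substitution $x_i = (\rho_i - \eta_i) z_i + \eta_i$ can be expanded term by term with the binomial theorem, followed by the division by $\nu_k$:

```python
from itertools import product
from math import comb

def scale_poly(poly, eta, rho):
    """Substitute x_i = (rho_i - eta_i) * z_i + eta_i into a polynomial
    {exponent tuple: coefficient}, then divide by the largest coefficient
    magnitude, as in Section 5.6."""
    g = {}
    for alpha, c in poly.items():
        # expand prod_i ((rho_i - eta_i) z_i + eta_i)^{alpha_i}
        for ks in product(*(range(a + 1) for a in alpha)):
            coef = c
            for i, (a, k) in enumerate(zip(alpha, ks)):
                coef *= comb(a, k) * (rho[i] - eta[i]) ** k * eta[i] ** (a - k)
            g[ks] = g.get(ks, 0.0) + coef
    nu = max((abs(v) for v in g.values()), default=1.0)
    return {a: v / nu for a, v in g.items() if v != 0}

# Example: f(x) = x^2 on -1 <= x <= 3 becomes (4z - 1)^2 / 16,
# i.e. z^2 - 0.5 z + 0.0625.
print(scale_poly({(2,): 1.0}, eta=[-1.0], rho=[3.0]))
```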

6. Numerical results. In this section, we present numerical results of the proposed sparse relaxation for unconstrained and constrained problems. The focus is on verifying the efficiency of the sparse relaxation compared with the dense relaxation in [22]. The sparse and dense relaxations were implemented in MATLAB for constructing the SDP problems, and the software package SeDuMi 1.05 was then used to solve them. All experiments were run on a 2.4 GHz AMD Opteron CPU with 8.0 GB of memory. The unconstrained problems that we dealt with are benchmark test problems from [3, 21, 24] and randomly generated test problems with artificial correlative sparsity. The constrained test problems are chosen from [8] and optimal control problems [2].

We employ the techniques described in Section 5.1 for finding an optimal solution. In particular, we use the random perturbation technique with the parameter $\epsilon = 10^{-5}$ in all the experiments presented here. After an optimal solution $\hat{y}$ of an SDP relaxation of the POP is found by SeDuMi, its linear part $\hat{x}$ is taken as a candidate for an optimal solution of the POP.

With regard to computing the accuracy of an obtained solution, we use the following for an unconstrained POP with an objective function $f_0$:

$$\epsilon_{\mathrm{obj}} = \frac{\left| \text{the optimal value of the SDP} - \left( f_0(\hat{x}) + p^T \hat{x} \right) \right|}{\max\{1, |f_0(\hat{x}) + p^T \hat{x}|\}}.$$

Here $p \in \mathbb{R}^n$ denotes a randomly generated perturbation vector such that $|p_j| < \epsilon = 10^{-5}$ $(j = 1, 2, \ldots, n)$. For an inequality and equality constrained POP of the form (5.3), we need a measure of feasibility in addition to $\epsilon_{\mathrm{obj}}$ defined above. The following feasibility measure is used:

$$\epsilon_{\mathrm{feas}} = \min\{ f_k(\hat{x}) \ (k = 1, \ldots, m), \ -|h_j(\hat{x})| \ (j = 1, \ldots, q) \}.$$
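In code, both measures are one-liners; the helper below (hypothetical, for illustration only) takes the SDP optimal value together with the objective, perturbation, and constraint values at $\hat{x}$:

```python
def accuracy_measures(sdp_val, f0_hat, p_dot_x, fk_vals, hj_vals):
    """epsilon_obj and epsilon_feas of Section 6, given the SDP optimal value
    and the values f0(x^), p^T x^, f_k(x^), h_j(x^) at the candidate solution."""
    perturbed = f0_hat + p_dot_x
    eps_obj = abs(sdp_val - perturbed) / max(1.0, abs(perturbed))
    eps_feas = min([fk for fk in fk_vals] + [-abs(hj) for hj in hj_vals])
    return eps_obj, eps_feas
```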

We use the technique given in Section 5.2 for every equality constrained problem, and the technique of Section 5.3 for reducing the size of an SOS relaxation for all test problems. In addition, we apply the techniques presented in Sections 5.4, 5.5 and 5.6 to every constrained problem from the literature [8]. Specifically, we use $\bigcup_{\ell \in J_k} \mathcal{A}^{C_\ell}_{\omega - \omega_k}$ as the supports of the multiplier polynomials $\varphi_k$, as discussed in Section 5.4.

Table 6.1 shows the notation used in the description of the numerical experiments in the following subsections. The entry cl.str indicates the structure of the maximal cliques obtained by applying the MATLAB functions 'symamd' and 'chol' to the csp matrix. For example, 4*3+5*2 means three cliques of size 4 and two cliques of size 5.

6.1. Unconstrained cases. The problems presented here are from the literature [3, 21, 24], together with randomly generated problems. Table 6.2 displays the numerical results for the following two functions.

• The chained singular function [3]:

$$f_{cs}(x) = \sum_{i \in J} \left[ (x_i + 10 x_{i+1})^2 + 5 (x_{i+2} - x_{i+3})^2 + (x_{i+1} - 2 x_{i+2})^4 + 10 (x_i - 10 x_{i+3})^4 \right],$$

where $J = \{1, 3, 5, \ldots, n-3\}$ and $n$ is a multiple of 4.


Table 6.1: Notation

n       | the number of variables of a POP
d       | the degree of a POP
sparse  | cpu time in seconds consumed by the proposed sparse relaxation
dense   | cpu time in seconds consumed by the dense relaxation [22]
cl.str  | the structure of the maximal cliques
#clique | the average number of cliques found in randomly generated problems
#solved | the number of problems solved among randomly generated problems
max.cl  | the maximum size of the maximal cliques
max     | the maximum cpu time consumed over randomly generated problems
avr     | the average cpu time consumed over randomly generated problems
min     | the minimum cpu time consumed over randomly generated problems
cpu     | cpu time in seconds
ω       | the relaxation order

• The Broyden banded function [21]:

$$f_{Bb}(x) = \sum_{i=1}^{n} \left( x_i (2 + 5 x_i^2) + 1 - \sum_{j \in J_i} (1 + x_j) x_j \right)^2,$$

where $J_i = \{ j \mid j \neq i, \ \max(1, i-5) \le j \le \min(n, i+1) \}$.
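For reference, here is a direct 0-based Python transcription of the two benchmark objectives (our own translation of the formulas above, convenient for checking candidate optima):

```python
def f_cs(x):
    """Chained singular function [3]; len(x) must be a multiple of 4."""
    n = len(x)
    J = range(0, n - 3, 2)              # 1-based J = {1, 3, 5, ..., n-3}
    return sum((x[i] + 10 * x[i + 1]) ** 2 + 5 * (x[i + 2] - x[i + 3]) ** 2
               + (x[i + 1] - 2 * x[i + 2]) ** 4 + 10 * (x[i] - 10 * x[i + 3]) ** 4
               for i in J)

def f_Bb(x):
    """Broyden banded function [21]."""
    n = len(x)
    total = 0.0
    for i in range(n):
        Ji = [j for j in range(max(0, i - 5), min(n - 1, i + 1) + 1) if j != i]
        total += (x[i] * (2 + 5 * x[i] ** 2) + 1
                  - sum((1 + x[j]) * x[j] for j in Ji)) ** 2
    return total

print(f_cs([0.0] * 16), f_Bb([0.0] * 8))   # 0.0 and 8.0 at the origin
```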

The above two problems of relatively small size could be solved by the dense relaxation, as shown in Table 6.2, and their results can be used to compare the performance of the sparse and dense relaxations. In the case of the chained singular function $f_{cs}$, its csp matrix $R$ has nonzero elements only near the diagonal, i.e., $R_{ij} = 0$ if $|j - i| > 3$. This means that $f_{cs}$ is correlatively sparse. The 'cl.str' column of Table 6.2 shows that the sparsity is detected correctly. As a result, the sparse relaxation is much more efficient than the dense relaxation. We could successfully solve the problem with 100 variables in a few seconds, while the dense relaxation could not handle the problem with 20 or 30 variables.

If we look at the results for the Broyden banded function $f_{Bb}$ in Table 6.2, we observe that there is virtually no difference in performance between the proposed sparse and dense relaxations for $n = 6$ and $n = 7$. Because the csp matrix of this function has bandwidth 7, it is fully dense when $n = 6$ and $n = 7$; the sparse relaxation is identical to the dense relaxation in these cases. As $n$ increases, however, a sparse structure such as 7*2 for $n = 8$ can be found, and the sparse relaxation takes advantage of the structured sparsity, providing an optimal solution faster than the dense relaxation.

In Table 6.3, we present the numerical results for the following functions:

• The Broyden tridiagonal function [21]:

$$f_{Bt}(x) = \left( (3 - 2 x_1) x_1 - 2 x_2 + 1 \right)^2 + \sum_{i=2}^{n-1} \left( (3 - 2 x_i) x_i - x_{i-1} - 2 x_{i+1} + 1 \right)^2 + \left( (3 - 2 x_n) x_n - x_{n-1} + 1 \right)^2.$$


Table 6.2: Numerical results for the chained singular function and the Broyden banded function

Chained singular function:
n   | cl.str | ε_obj  | sparse | dense
16  | 3*14   | 3.5e-7 | 0.6    | 3059.5
40  | 3*38   | 8.4e-7 | 1.4    | —
100 | 3*98   | 5.5e-7 | 3.8    | —
200 | 3*198  | 3.0e-7 | 8.4    | —
400 | 3*398  | 3.6e-7 | 19.3   | —

Broyden banded function:
n   | cl.str | ε_obj  | sparse | dense
6   | 6*1    | 8.0e-9 | 11.3   | 11.6
7   | 7*1    | 1.9e-8 | 69.5   | 69.5
8   | 7*2    | 2.8e-8 | 164.1  | 373.7
9   | 7*3    | 9.1e-8 | 240.3  | 1835.6
10  | 7*4    | 6.2e-8 | 348.7  | 8399.4

• The chained Wood function [3]:

$$f_{cW}(x) = 1 + \sum_{i \in J} \left[ 100 (x_{i+1} - x_i^2)^2 + (1 - x_i)^2 + 90 (x_{i+3} - x_{i+2}^2)^2 + (1 - x_{i+2})^2 + 10 (x_{i+1} + x_{i+3} - 2)^2 + 0.1 (x_{i+1} - x_{i+3})^2 \right],$$

where $J = \{1, 3, 5, \ldots, n-3\}$ and $n$ is a multiple of 4.

• The generalized Rosenbrock function [24]:

$$f_{gR}(x) = 1 + \sum_{i=2}^{n} \left[ 100 \left( x_i - x_{i-1}^2 \right)^2 + (1 - x_i)^2 \right].$$

Each of the above three functions has a band structure in its csp matrix, and therefore problems of large size can be handled efficiently. For example, the Broyden tridiagonal function $f_{Bt}$ with 1000 variables could be solved in 16 seconds with an accuracy of 2.6e-07. Note that the solutions are accurate in all tested cases.

Table 6.3: Numerical results for the Broyden tridiagonal function, the chained Wood function and the generalized Rosenbrock function

     | Broyden tridiagonal        | Chained Wood               | Generalized Rosenbrock
n    | cl.str | ε_obj  | cpu      | cl.str | ε_obj  | cpu      | cl.str | ε_obj  | cpu
600  | 3*598  | 9.1e-7 | 9.3      | 2*599  | 1.4e-5 | 0.9      | 2*599  | 3.9e-7 | 3.4
700  | 3*698  | 9.0e-7 | 10.9     | 2*699  | 1.6e-5 | 1.1      | 2*699  | 7.5e-9 | 4.0
800  | 3*798  | 2.2e-7 | 12.6     | 2*799  | 1.8e-5 | 1.3      | 2*799  | 2.1e-7 | 5.1
900  | 3*898  | 1.3e-7 | 14.4     | 2*899  | 3.4e-5 | 1.4      | 2*899  | 2.1e-7 | 5.6
1000 | 3*998  | 2.6e-7 | 16.0     | 2*999  | 3.8e-5 | 1.6      | 2*999  | 4.5e-7 | 5.9

Next, we present the numerical results for randomly generated problems. The aim of testing randomly generated problems is to observe the effects of increasing the number of variables and the degree of the polynomials, as well as the maximal size of the cliques of the csp graph of a POP. The dense relaxation could not handle the randomly generated problems of the sizes reported here, so we include only the numerical results from the sparse relaxation.

Let us describe how an unconstrained problem with artificial correlative sparsity is generated randomly. We begin by constructing a chordal graph randomly such that the size of every maximal clique is not less than 2 and not greater than max.cl. From the chordal graph, we derive the set of maximal cliques $\{C_1, \ldots, C_\ell\}$ with $2 \le |C_i| \le \text{max.cl}$ $(i = 1, \ldots, \ell)$. We let $v_{C_i}(x) = (x_k^d : k \in C_i)$, where $2d$ is the degree of the polynomial, and generate a positive definite matrix $V_i \in \mathcal{S}_{++}(C_i)$ and a vector $g_i \in [-1, 1]^{|\mathcal{A}^{C_i}_{2d-1}|}$ $(i = 1, 2, \ldots, \ell)$ randomly such that the minimum eigenvalue $\sigma$ of $V_1, \ldots, V_\ell$ satisfies

$$\sigma \ge \sum_{i=1}^{\ell} \|g_i\|_2 \sqrt{|\mathcal{A}^{C_i}_{2d-1}|}.$$

Using $V_i$ and $g_i$, we define the objective function

$$f_{\mathrm{rand}}(x) = \sum_{i=1}^{\ell} \left( v_{C_i}(x)^T V_i\, v_{C_i}(x) + g_i^T u(x, \mathcal{A}^{C_i}_{2d-1}) \right).$$

This unconstrained POP is guaranteed to have an optimal solution in the compact set $\{x = (x_1, \ldots, x_n) \in \mathbb{R}^n \mid \max_{i=1,\ldots,n} |x_i| \le 1\}$. A scaling by the maximum of the absolute values of the coefficients of $f_{\mathrm{rand}}(x)$ is used in the numerical experiments.

The numerical results are shown in Tables 6.4, 6.5 and 6.6. Table 6.4 exhibits how the sparse relaxation performs for a varying number of variables, Table 6.5 for increasing degree of the unconstrained problems, and Table 6.6 for increasing bounds on the sizes of the cliques. For each choice of $n$, $d$ and max.cl, we generated 50 problems. Each entry in the #solved column indicates the number of problems, out of 50, whose optimal solutions were obtained with $\epsilon_{\mathrm{obj}} \le 10^{-5}$. All problems tested were solved.

Table 6.4: Randomly generated polynomials with max.cl = 4 and 2d = 4

n   | #clique | max  | avr | min | #solved
20  | 14.3    | 0.9  | 0.3 | 0.2 | 50/50
40  | 30.9    | 4.1  | 1.0 | 0.4 | 50/50
60  | 47.4    | 6.9  | 2.0 | 0.9 | 50/50
80  | 64.2    | 13.0 | 3.8 | 1.4 | 50/50
100 | 80.3    | 37.9 | 8.8 | 1.9 | 50/50

Table 6.5: Randomly generated polynomials with max.cl = 4 and n = 30

2d | #clique | max   | avr  | min | #solved
4  | 22.7    | 1.1   | 0.6  | 0.3 | 50/50
6  | 22.9    | 18.9  | 5.1  | 1.4 | 50/50
8  | 22.7    | 624.2 | 74.7 | 7.9 | 50/50

Table 6.6: Randomly generated polynomials with 2d = 4 and n = 30

max.cl | #clique | max   | avr  | min | #solved
4      | 22.7    | 1.1   | 0.6  | 0.3 | 50/50
6      | 20.0    | 31.5  | 6.4  | 1.3 | 50/50
8      | 17.3    | 497.9 | 79.8 | 4.0 | 50/50

In Table 6.4, we notice that the number of cliques increases with $n$. For problems with large numbers of variables and cliques, such as $n = 100$ and #clique = 80.3, the sparse relaxation provides optimal solutions in 8.8 cpu seconds on average.


The numerical results in Table 6.5 display the performance of the sparse relaxation for problems with $n = 30$ and degrees up to 8. The maximum size of the cliques is fixed to 4. As mentioned before, the size of the SDP relaxation of a POP grows rapidly with increasing degree, even if the POP remains correlatively sparse. When $2d = 8$, the average cpu time is 74.7 seconds and the maximum is 624.2 seconds.

A larger clique size used during problem generation also increases the complexity of the problem, as shown in Table 6.6. We tested with the maximum clique size set to 4, 6, and 8, and observe that the cpu time to solve the corresponding problems grows very rapidly, e.g., 79.8 average cpu seconds and 497.9 maximum cpu seconds for max.cl = 8. From the increase in work measured by cpu time, we notice that the impact of the maximum clique size is comparable to that of the degree, and bigger than that of the number of variables.

6.2. Constrained cases. In this subsection, we deal with the following constrained POPs:

• Small-sized POPs from the literature [8].

• Optimal control problems [2].

The numerical results on POPs from [8] are presented in Table 6.7. All problems are quadratic optimization problems except 'alkyl', which involves polynomials of degree 3 in its equality constraints. We also added lower and upper bounds for the variables.

In preliminary numerical experiments on some of the test problems, severe numerical difficulties occurred in badly scaled problems or problems with complementarity conditions. We incorporated all the techniques in Sections 5.4, 5.5 and 5.6 into the dense and sparse relaxations for these problems.

In Table 6.7, $\epsilon_{\mathrm{feas}}$ denotes the feasibility of the scaled problems at the approximate optimal solutions obtained by the sparse and dense relaxations. We see that $\epsilon_{\mathrm{feas}}$ is small in most of the problems, while the feasibility $\epsilon_{\mathrm{feas}}$ of the original problems at the approximate optimal solutions becomes larger. The lower bounds obtained by the sparse relaxation are as good as the ones by the dense relaxation, except for the five problems ex5 2 2 case1, case2, case3, ex5 3 2 and ex9 1 4. In the former three cases, the dense relaxation with order $\omega = 2$ succeeds in computing accurate bounds, while the sparse relaxation attains bounds of the same quality with order $\omega = 3$.

When we compare the performance of the sparse relaxation with that of the dense relaxation on the problems in Table 6.7, we observe that the sparse relaxation is much faster than the dense relaxation on large dimensional problems. In some problems, however, the technique of Section 5.3 for reducing the sizes of relaxations worked so effectively that the difference between the dense and sparse relaxations decreases. For example, the sparse and dense relaxations of ex2 1 3 without this reduction technique took 0.9 and 464.5 seconds, respectively, while 0.2 and 2.8 seconds were consumed with the technique, as shown in Table 6.7. We will present a more detailed comparison between the dense relaxation with the technique and the dense relaxation without it in Section 6.3.
