
In principle, the lower bound (2.32) still requires an optimization over the diagonal matrix D ≥ 0. A particular choice that may be computationally feasible is D = αI, for some scalar α.

However, in this section we focus on an even simpler choice for D. Namely, as noted in [89], letting D → 0 in Corollary 2.2 we obtain

LB_{eig} = \lambda_{\min}(L^*L) \min_{a \in D^{k-1}} \|a - L^{-1}b\|^2. (2.46)

[Figure 2.10: Flop count histograms for the SD, SPHSD, GSPHSD, PLTSD, and SDsdp algorithms; m = 45, SNR = 3 dB, D = {-1/2, 1/2}^{45}. Each panel shows the empirical distribution of the flop count (horizontal axis scaled by 10^9).]

We also mention that this bound could have been obtained in a simpler fashion:

LB_{eig} = \lambda_{\min}(L^*L) \min_{a \in D^{k-1}} \|a - L^{-1}b\|^2
= \min_{a \in D^{k-1}} (a - L^{-1}b)^* \lambda_{\min}(L^*L) I (a - L^{-1}b)
\le \min_{a \in D^{k-1}} (a - L^{-1}b)^* (L^*L) (a - L^{-1}b)
= \min_{a \in D^{k-1}} \|La - b\|^2. (2.47)

Although this may raise the concern that the resulting bound is too loose, it turns out that it yields an algorithm with a smaller flop count than the standard sphere decoder. The key observation is that, with D = 0, all the computations required at any point in the tree are linear in the dimension. (The standard sphere decoder also requires a linear number of operations per point.) Since it is based on the minimum eigenvalue, we refer to this bound as the eigen bound.
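To make this concrete, the following minimal sketch (our own illustration; the dimensions, symbol set, and variable names are made up) computes the eigen bound (2.46) for a random upper-triangular L and checks by exhaustive enumeration that it indeed lower-bounds \min_{a \in D^k} \|La - b\|^2, as (2.47) asserts:

```python
# Minimal numerical check of the eigen bound (2.46)/(2.47).
# Everything below is illustrative: a small random instance, not the text's model.
import itertools
import numpy as np

rng = np.random.default_rng(0)
k = 4                                  # dimension of the subproblem (k-1 in the text)
D = np.array([-1.5, -0.5, 0.5, 1.5])   # example symbol set (4-PAM)

L = np.triu(rng.standard_normal((k, k)))     # upper-triangular L = R_{1:k-1,1:k-1}
b = rng.standard_normal(k)

lam_min = np.linalg.eigvalsh(L.T @ L).min()  # lambda_min(L^T L)
f = np.linalg.solve(L, b)                    # f = L^{-1} b

# Component-wise minimization over D (separable, hence linear in k).
nearest = D[np.abs(D[None, :] - f[:, None]).argmin(axis=1)]
lb_eig = lam_min * np.sum((nearest - f) ** 2)

# Exhaustive verification that LB_eig <= min_{a in D^k} ||La - b||^2.
best = min(np.sum((L @ np.array(a) - b) ** 2) for a in itertools.product(D, repeat=k))
assert lb_eig <= best + 1e-12
print(f"LB_eig = {lb_eig:.4f} <= min ||La - b||^2 = {best:.4f}")
```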

Clearly, since (2.46) can be regarded as a special case of (2.32), it is a lower bound on the integer least-squares problem (2.11). Note that it may appear that (2.46) is not a good bound, since \lambda_{\min}(L^*L) can become very small. However, since the minimization in (2.46) is performed over integers, the resulting bound turns out to be large enough to serve our purposes (i.e., tree pruning in sphere decoding), especially for higher symbol constellations. Furthermore, we will show that the computation required to find LB_{eig} for a node at level k of the search tree is linear in k.

The key observation that enables efficient computation of LB_{eig} in (2.46) is that the vector L^{-1}b can be propagated as the search progresses down the tree. Before proceeding any further, we simplify the notation. First, recall that

L = R_{1:k-1,1:k-1} and b = z_{1:k-1} = y_{1:k-1} - R_{1:k-1,k:m} s_{k:m}.

Let us denote F_{1:k-1,1:k-1} = R_{1:k-1,1:k-1}^{-1}, and introduce

f^{(k-1)} = L^{-1}b = F_{1:k-1,1:k-1} z_{1:k-1} = F_{1:k-1,1:k-1}(y_{1:k-1} - R_{1:k-1,k:m} s_{k:m}). (2.48)
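This definition uses the fact that, for an upper-triangular matrix, the leading principal block of the inverse equals the inverse of the leading principal block, so the F needed at every level can be read off a single precomputed R^{-1}. A quick numerical check (our own illustration):

```python
# Check: for upper-triangular R, (R^{-1})[:k,:k] == (R[:k,:k])^{-1}, which justifies
# reading F_{1:k-1,1:k-1} off a single precomputed F = R^{-1}.
import numpy as np

rng = np.random.default_rng(1)
m, k = 8, 5
R = np.triu(rng.standard_normal((m, m))) + 5.0 * np.eye(m)  # well-conditioned, triangular
F = np.linalg.inv(R)

assert np.allclose(F[:k, :k], np.linalg.inv(R[:k, :k]))
```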

We wish to find a recursion that relates the vector f^{(k-2)} to the already computed vector f^{(k-1)}:

f^{(k-1)} = F_{1:k-1,1:k-1}\left(y_{1:k-1} - R_{1:k-1,k:m}s_{k:m}\right)

= \begin{bmatrix} F_{1:k-2,1:k-2} & F_{1:k-2,k-1} \\ 0 & F_{k-1,k-1} \end{bmatrix}
  \left( \begin{bmatrix} y_{1:k-2} \\ y_{k-1} \end{bmatrix}
       - \begin{bmatrix} R_{1:k-2,k:m} \\ R_{k-1,k:m} \end{bmatrix} s_{k:m} \right)

= \begin{bmatrix} F_{1:k-2,1:k-2}y_{1:k-2} + F_{1:k-2,k-1}y_{k-1} \\ F_{k-1,k-1}y_{k-1} \end{bmatrix}
- \begin{bmatrix} F_{1:k-2,1:k-2}R_{1:k-2,k:m} + F_{1:k-2,k-1}R_{k-1,k:m} \\ F_{k-1,k-1}R_{k-1,k:m} \end{bmatrix} s_{k:m}. (2.49)

From (2.49), we see that

f^{(k-1)}_{1:k-2} = F_{1:k-2,1:k-2}y_{1:k-2} + F_{1:k-2,k-1}y_{k-1} - F_{1:k-2,1:k-2}R_{1:k-2,k:m}s_{k:m} - F_{1:k-2,k-1}R_{k-1,k:m}s_{k:m}. (2.50)

Similarly,

f^{(k-2)} = F_{1:k-2,1:k-2}\left(y_{1:k-2} - R_{1:k-2,k-1:m}s_{k-1:m}\right)

= F_{1:k-2,1:k-2}y_{1:k-2} - F_{1:k-2,1:k-2}\begin{bmatrix} R_{1:k-2,k-1} & R_{1:k-2,k:m} \end{bmatrix}\begin{bmatrix} s_{k-1} \\ s_{k:m} \end{bmatrix}

= F_{1:k-2,1:k-2}y_{1:k-2} - F_{1:k-2,1:k-2}R_{1:k-2,k:m}s_{k:m} - F_{1:k-2,1:k-2}R_{1:k-2,k-1}s_{k-1}. (2.51)

Using (2.50) and (2.51), we relate f^{(k-1)} and f^{(k-2)} as

f^{(k-2)} = f^{(k-1)}_{1:k-2} + F_{1:k-2,k-1}R_{k-1,k:m}s_{k:m} - F_{1:k-2,k-1}y_{k-1} - F_{1:k-2,1:k-2}R_{1:k-2,k-1}s_{k-1}. (2.52)

All operations in the recursion (2.52) are linear, except for the matrix-vector multiplication F_{1:k-2,1:k-2}R_{1:k-2,k-1}, which is quadratic. However, this multiplication needs to be computed only once for each level of the tree, and the resulting vector is reused in (2.52) for every point visited by the algorithm at that level. Therefore, it may be treated as part of the pre-processing, i.e., we compute it for all k before actually running Algorithm 1. Hence, updating the vector L^{-1}b in (2.46) requires a computational effort that is linear in k. Furthermore, since it is performed component-wise, the minimization in (2.46) also has complexity that is linear in k. We therefore conclude that the complexity of computing the eigen bound is linear in k. Note also that, in addition to the standard sphere decoder computations, we have to compute \lambda_k = \lambda_{\min}(R_{1:k-1,1:k-1}^* R_{1:k-1,1:k-1}), 1 \le k \le m. However, computing these \lambda_k requires an effort that is negligible relative to the overall flop count for the model parameters that we will consider.
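As a sanity check on (2.52), the following sketch (our own illustration; names and sizes are made up) performs the O(k) update of f^{(k-2)} from f^{(k-1)}, with the quadratic term precomputed, and compares the result against direct computation:

```python
# Verify the recursion (2.52) on a random instance (0-based NumPy slices:
# the text's 1:k-1 becomes :k-1, k:m becomes k-1:, etc.).
import numpy as np

rng = np.random.default_rng(2)
m, k = 8, 6                                    # we descend from level k to level k-1
R = np.triu(rng.standard_normal((m, m))) + 4.0 * np.eye(m)
F = np.linalg.inv(R)
y = rng.standard_normal(m)
s = rng.integers(-2, 3, size=m).astype(float)  # only s[k-2:] matters below

# Parent vector f^{(k-1)} = F_{1:k-1,1:k-1} (y_{1:k-1} - R_{1:k-1,k:m} s_{k:m}).
f_parent = F[:k-1, :k-1] @ (y[:k-1] - R[:k-1, k-1:] @ s[k-1:])

# Quadratic term F_{1:k-2,1:k-2} R_{1:k-2,k-1}: computed once per level, offline.
FR = F[:k-2, :k-2] @ R[:k-2, k-2]

# Recursion (2.52): everything that remains is linear in k.
f_child = (f_parent[:k-2]
           + F[:k-2, k-2] * (R[k-2, k-1:] @ s[k-1:])  # + F_{1:k-2,k-1} R_{k-1,k:m} s_{k:m}
           - F[:k-2, k-2] * y[k-2]                    # - F_{1:k-2,k-1} y_{k-1}
           - FR * s[k-2])                             # - F_{1:k-2,1:k-2} R_{1:k-2,k-1} s_{k-1}

# Direct computation of f^{(k-2)} for comparison.
f_direct = F[:k-2, :k-2] @ (y[:k-2] - R[:k-2, k-2:] @ s[k-2:])
assert np.allclose(f_child, f_direct)
```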

We state the subroutine for computing LB_{eig} below.

Subroutine for computing LB_{eig}:

Input: R, y_{1:k-1}, s_{k:m}, F = R^{-1}, \lambda_k = \lambda_{\min}(R_{1:k-1,1:k-1}^* R_{1:k-1,1:k-1}) for 1 \le k \le m, FR_{1:k-2,k-1} = F_{1:k-2,1:k-2}R_{1:k-2,k-1} for 1 \le k \le m, and f^{(k)}_{1:k-1}.

1. If k = m, f^{(k-1)} = F_{1:k-1,1:k-1}(y_{1:k-1} - R_{1:k-1,k:m}s_{k:m}); otherwise, f^{(k-1)} = f^{(k)}_{1:k-1} + F_{1:k-1,k}R_{k,k+1:m}s_{k+1:m} - F_{1:k-1,k}y_k - FR_{1:k-1,k}s_k.

2. If k > 1, LB_{eig} = \lambda_k \min_{a \in D^{k-1}} \|a - f^{(k-1)}\|^2; otherwise, LB_{eig} = 0.
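In Python, one rendering of this subroutine might look as follows (a sketch under the notation above, not the author's code; 1-based text indices are mapped to 0-based NumPy slices, and the precomputed quantities are assumed passed in as dictionaries keyed by the level k):

```python
import numpy as np

def lb_eig_subroutine(k, R, F, y, s, D, lam, FR, f_next=None):
    """Compute (LB_eig, f^{(k-1)}) at tree level k (1-based, as in the text).

    Assumed inputs: F = R^{-1};
    lam[k] = lambda_min(R_{1:k-1,1:k-1}^T R_{1:k-1,1:k-1});
    FR[k]  = F_{1:k-1,1:k-1} @ R_{1:k-1,k} (quadratic term, precomputed per level);
    f_next = f^{(k)}_{1:k-1} handed down from the level above (unused when k == m)."""
    m = R.shape[0]
    if k == m:
        # Step 1, root case: direct computation of f^{(k-1)} = L^{-1} b.
        f = F[:k-1, :k-1] @ (y[:k-1] - R[:k-1, k-1:] @ s[k-1:])
    else:
        # Step 1, interior case: O(k) update of f^{(k-1)}, cf. (2.52).
        f = (f_next[:k-1]
             + F[:k-1, k-1] * (R[k-1, k:] @ s[k:])  # + F_{1:k-1,k} R_{k,k+1:m} s_{k+1:m}
             - F[:k-1, k-1] * y[k-1]                # - F_{1:k-1,k} y_k
             - FR[k] * s[k-1])                      # - FR_{1:k-1,k} s_k
    if k > 1:
        # Step 2: separable (component-wise) minimization over D, also O(k).
        a = D[np.abs(D[None, :] - f[:, None]).argmin(axis=1)]
        return lam[k] * np.sum((a - f) ** 2), f
    return 0.0, f
```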

We refer to the modification of the sphere decoding algorithm that makes use of the lower bound LB_{eig} as the EIGSD algorithm, and study its expected computational complexity in the following subsection.

2.8.1 Eigen bound performance comparison

In this subsection we study the performance of the EIGSD algorithm.

In particular, Figure 2.11 compares the expected complexity and the total number of points in the search tree of the EIGSD algorithm to those of the standard sphere decoder. We employ both algorithms for detection in a multi-antenna communication system with 6 antennas, where the components of the transmitted symbol vectors are points in a 256-QAM constellation. Note that the signal-to-noise ratio in Figure 2.11 is defined as SNR = 10 \log_{10}(255m / (12\sigma^2)), where \sigma^2 is the variance of each component of the noise vector w. Both algorithms choose the initial search radius statistically, as in [49] (using the sequence of \epsilon values \epsilon = 0.9, \epsilon = 0.99, \epsilon = 0.999, etc.), and employ the Schnorr-Euchner search strategy, updating the radius every time the bottom of the tree is reached. As the simulation results in Figure 2.11 indicate, the EIGSD algorithm runs more than 4.5 times faster than the SD algorithm.
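Following the stated definition of the SNR, one can back out the per-component noise variance used in such a simulation; a small helper along these lines (our own, hypothetical) is:

```python
def noise_variance(snr_db: float, m: int, levels: int = 16) -> float:
    """sigma^2 implied by SNR = 10*log10((levels^2 - 1) * m / (12 * sigma^2)).
    levels = 16 gives levels^2 - 1 = 255, i.e., the 16-PAM components of 256-QAM."""
    return (levels**2 - 1) * m / (12.0 * 10.0 ** (snr_db / 10.0))

sigma2 = noise_variance(18.0, m=12)  # e.g., m = 12 real dimensions at 18 dB
```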

In Figure 2.12 the flop count histograms of the SD and EIGSD algorithms are shown. As can be seen, the EIGSD algorithm has a significantly better-shaped (shorter-tailed) flop count distribution than the SD algorithm.

[Figure 2.11: Computational complexity of the SD and EIGSD algorithms; m = 12, D = {-15/2, -13/2, ..., 13/2, 15/2}^{12}. Left panel: expected flop count (scaled by 10^6) versus SNR [dB]; right panel: number of points visited per level of the search tree.]

We would also like to point out that the EIGSD algorithm is not restricted to applications in communication systems. In Figure 2.13 we show its potential when applied to a random integer least-squares problem. In the problem simulated in Figure 2.13, H was generated as an m × m matrix with i.i.d. Gaussian entries, and the entries of x were generated uniformly from the interval [-10M/2, 10M/2]. The problem that we solved was again

\min_{s \in D^m} \|x - Hs\|^2, (2.53)

where D = {-(M-1)/2, -(M-3)/2, ..., (M-3)/2, (M-1)/2}. The initial radius d_g was chosen as

d_g = \|x - H\hat{s}\|, (2.54)

where \hat{s} is obtained by rounding the components of H^{-1}x to the closest element of D; \hat{s} generated in this way is sometimes called the Babai estimate [44].
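A compact sketch of this initialization (our own illustration; function and variable names are hypothetical, and the sampling interval for x mirrors the description above):

```python
# Sketch of the initialization (2.54): the Babai estimate and the radius d_g.
import numpy as np

def babai_radius(H, x, D):
    """Round the components of H^{-1} x to the closest element of D (the Babai
    estimate s_hat) and return d_g = ||x - H s_hat|| together with s_hat."""
    z = np.linalg.solve(H, x)  # H^{-1} x, without explicitly forming the inverse
    s_hat = D[np.abs(D[None, :] - z[:, None]).argmin(axis=1)]
    return np.linalg.norm(x - H @ s_hat), s_hat

M = 8                                  # number of allowed integer levels
D = np.arange(-(M - 1), M, 2) / 2.0    # {-(M-1)/2, ..., (M-3)/2, (M-1)/2}
rng = np.random.default_rng(3)
m = 12
H = rng.standard_normal((m, m))
x = rng.uniform(-10 * M / 2, 10 * M / 2, size=m)
d_g, s_hat = babai_radius(H, x, D)
```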

[Figure 2.12: Flop count histograms for the SD and EIGSD algorithms; m = 12, SNR = 18 dB, D = {-15/2, -13/2, ..., 13/2, 15/2}^{12}. Both panels show the empirical distribution of the flop count (horizontal axis scaled by 10^8).]

Figure 2.13 compares the expected flop count of the EIGSD and SD algorithms for several large values of M.

As can be seen, the larger the set of allowed integers, the greater the advantage of the EIGSD algorithm.