5.2 GTD Transform Coder for Optimizing Coding Gain
5.2.1 Preliminaries and Reviews
The transform coder is shown in Fig. 5.1. The signal $x$ is first multiplied by an $M \times M$ matrix $T$ so that $y = [y_1\ y_2\ \cdots\ y_M]^T = Tx$. The quantizers are scalar quantizers, and are modeled as additive noise sources so that $\hat{y}_i = y_i + q_i$. Suppose the $i$th quantizer $Q_i$ has $b_i$ bits; then the variance of the quantization error $q_i$ satisfies
$$\sigma_{q_i}^2 = c\, 2^{-2b_i} \sigma_{y_i}^2, \qquad (5.1)$$
where $\sigma_{y_i}^2$ is the variance of the signal input to the $i$th quantizer. This result generally holds under the high bit rate assumption [35, 64, 107]. The constant $c$ depends on the type of the quantizer and the statistics of $y_i$; it is assumed that all the scalar quantizers have the same $c$. The signal is reconstructed at the decoder by multiplying with $T^{-1}$.
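As a quick numerical sanity check of the $2^{-2b_i}$ behavior in (5.1), the following sketch (not from the text) applies a uniform midtread quantizer to a unit-variance Gaussian source; the quantizer range $[-4, 4]$ is an arbitrary illustrative choice. Adding two bits should shrink the error variance by roughly $2^4 = 16$:

```python
import numpy as np

def uniform_quantize(x, bits, x_max=4.0):
    """Uniform midtread quantizer with step chosen so that
    about 2**bits levels cover the interval [-x_max, x_max]."""
    step = 2 * x_max / (2 ** bits)
    return np.clip(np.round(x / step) * step, -x_max, x_max)

rng = np.random.default_rng(0)
x = rng.standard_normal(200_000)  # unit-variance Gaussian source

# Measure the quantization-error variance at two bit rates.
mse6 = np.var(x - uniform_quantize(x, 6))
mse8 = np.var(x - uniform_quantize(x, 8))

# (5.1) predicts MSE proportional to 2**(-2*b), so the ratio
# should be close to 2**4 = 16 at these (moderately high) rates.
print(mse6 / mse8)
```

The ratio is not exactly 16 because of overload error in the Gaussian tails, but it approaches $2^{2\Delta b}$ as the rate grows, consistent with the high bit rate model.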
The problem of minimizing the arithmetic mean of MSE (AM-MSE) of the reconstructed coefficients, $E[\|x - \hat{x}\|_2^2]$, under the average bit rate constraint, is solved by the KLT [107]. The KLT uses $T = U^T$, where $U$ is any $M \times M$ orthonormal matrix such that $R_x = U \Sigma U^T$, with $\Sigma$ the diagonal matrix of the eigenvalues $\{\sigma_1^2, \cdots, \sigma_M^2\}$ of $R_x$ (assumed to be in non-increasing order).
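The decorrelation property of the KLT is easy to illustrate numerically. In the sketch below, the covariance matrix $R_x$ is a randomly generated toy example (an assumption for illustration, not from the text):

```python
import numpy as np

rng = np.random.default_rng(1)

# A toy 4x4 covariance matrix R_x (symmetric positive definite).
A = rng.standard_normal((4, 4))
Rx = A @ A.T + 4 * np.eye(4)

# KLT: eigendecomposition Rx = U diag(sigma_i^2) U^T,
# with eigenvalues sorted in non-increasing order.
eigvals, U = np.linalg.eigh(Rx)
order = np.argsort(eigvals)[::-1]
eigvals, U = eigvals[order], U[:, order]

# The transform coefficients y = U^T x have covariance U^T Rx U,
# which is diagonal: the KLT decorrelates the signal.
Ry = U.T @ Rx @ U
off_diag = Ry - np.diag(np.diag(Ry))
print(np.max(np.abs(off_diag)))  # numerically zero
```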
Under the high bit rate assumption (5.1), the optimal bit allocation is given by the bit-loading formula [35, 107]
$$b_i = b + \frac{1}{2}\log_2 \frac{\sigma_i^2}{\det(R_x)^{1/M}}, \qquad (5.2)$$
where the average bit rate is constrained to be $b$ bits per data stream. Note that $\sigma_i^2$ is actually the signal variance of the transform coefficient $y_i$. With the bit allocation chosen as in (5.2), the MSE $\sigma_{q_i}^2$ due to the $i$th quantizer becomes independent of $i$ (as seen by substituting (5.2) into (5.1), with $\sigma_{y_i} = \sigma_i$). The resulting AM-MSE is
$$\mathcal{E}_{KLT} = c\, 2^{-2b} \det(R_x)^{1/M}. \qquad (5.3)$$
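The claim that (5.2) equalizes the per-quantizer MSE, and that the common value is (5.3), can be verified numerically. The toy covariance matrix and the choice $c = 1$ below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
M, b, c = 4, 6.0, 1.0               # c = 1 for illustration

A = rng.standard_normal((M, M))
Rx = A @ A.T + np.eye(M)            # toy covariance matrix

# KLT coefficient variances: eigenvalues of Rx, non-increasing.
sigma2 = np.sort(np.linalg.eigvalsh(Rx))[::-1]
detM = np.linalg.det(Rx) ** (1.0 / M)

# Bit-loading formula (5.2).
bi = b + 0.5 * np.log2(sigma2 / detM)

# Per-quantizer MSE from the noise model (5.1): every entry is the
# same and equals c * 2**(-2b) * det(Rx)**(1/M), i.e. (5.3).
mse = c * 2.0 ** (-2 * bi) * sigma2
print(mse)
```

Note also that the $b_i$ average exactly to $b$, since the $\log_2$ terms sum to $\frac{1}{2}\log_2\!\big(\prod_i \sigma_i^2 / \det(R_x)\big) = 0$.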
It was shown in [106] that under the high bit rate assumption, it is not a loss of generality to assume that the transform is orthonormal.² It should be noted that the KLT decorrelates the signal, so the components of $y$ are statistically independent (under the Gaussian assumption) [24]. This is a necessary condition for optimality (minimum MSE) under the use of scalar quantizers [35] in the high bit rate case.
Figure 5.1: Schematic of a transform coder with scalar quantizers ($x$ is transformed by $T$, each coefficient $y_i$ is quantized by $Q_i$, and the decoder applies $T^{-1}$).
The PLT, proposed in [79], is a signal-dependent non-orthonormal transform which utilizes linear prediction theory [35, 108]. It has the same decorrelation property as the KLT, and is shown to have the same MMSE performance if the so-called minimum noise structure and optimal bit allocation are used [79]. In the article of Phoong and Lin [79], the original PLT is used for the vector $x$ obtained by blocking a scalar WSS process $x(n)$. In the following review of the PLT idea, it can be seen that the PLT can actually be used for a vector process which need not be a blocked version of a scalar process. The development of [79], which was based on linear prediction theory, does not apply in this case, but some of the main conclusions continue to be true, as we shall elaborate next.
Consider the LDU decomposition [31] of the covariance matrix $R_x$ given by
$$R_x = L D L^T. \qquad (5.4)$$
²It should be noted that the KLT is optimal among memoryless transforms. If $T$ is replaced with $T(z)$, which has memory, then the lapped transform and its variations can be used to further improve the coding performance [1, 62]. In the lapped transform the optimal transform is no longer necessarily orthogonal but biorthogonal [62]. Such transforms are popular in modern practical transform coders [93].
Here $L$ is lower triangular with diagonal elements equal to unity, and $D$ is a diagonal matrix with positive diagonal elements. We can rewrite this as
$$\big(L^{-1} R_x^{1/2}\big)\big(L^{-1} R_x^{1/2}\big)^T = D.$$
That is, $L^{-1}x$ has the diagonal covariance matrix $D$. So premultiplying $x$ with $L^{-1}$ results in decorrelation. The transform coder with $T = L^{-1}$ will be referred to as the PLT here, and its implementation is shown in Fig. 5.2. The multipliers $s_{km}$ in the figure are the coefficients in the matrix $L$.
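A small sketch of this construction, with a toy covariance matrix and the unit-diagonal $LDL^T$ factors obtained from the Cholesky factor (an implementation convenience, not from the text):

```python
import numpy as np

rng = np.random.default_rng(3)
M = 4
A = rng.standard_normal((M, M))
Rx = A @ A.T + np.eye(M)            # toy covariance matrix

# LDL^T factorization of Rx via the Cholesky factor:
# Rx = C C^T with C lower triangular, so
# L = C * diag(1/diag(C)) is unit lower triangular, D = diag(diag(C)^2).
C = np.linalg.cholesky(Rx)
d = np.diag(C)
L = C / d                           # scales column j by 1/d[j]
D = np.diag(d ** 2)

# Verify Rx = L D L^T, and that T = L^{-1} decorrelates:
# cov(L^{-1} x) = L^{-1} Rx L^{-T} = D.
Linv = np.linalg.inv(L)
print(np.allclose(Rx, L @ D @ L.T))          # True
print(np.allclose(Linv @ Rx @ Linv.T, D))    # True
```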
In this implementation the quantizer noise is amplified by $T^{-1} = L$. A different implementation, called the minimum noise structure I (MINLAB(I)) [82], is shown in Fig. 5.3. At each step of the MINLAB encoder as well as the decoder, a prediction is made based on the quantized data, whereas in the structure in Fig. 5.2 the encoder makes predictions based on the original data but the decoder makes predictions with quantized data. This structure is shown to have the unity noise gain property [79]. It minimizes the AM-MSE if the bit loading for each quantizer follows the bit loading formula:
$$b_i = b + \frac{1}{2}\log_2 \frac{D_{ii}}{\det(R_x)^{1/M}}. \qquad (5.5)$$
Note that $D_{ii}$ is actually the signal variance of the input to the $i$th quantizer. By choosing the bit allocation as in (5.5), the MSEs at the outputs of the quantizers are made identical, as seen by using (5.5) in (5.1). The resulting AM-MSE will be
$$\mathcal{E}_{PLT} = c\, 2^{-2b} \det(R_x)^{1/M}, \qquad (5.6)$$
which is the same as what the KLT can achieve when the optimal bit loading is applied. The reason for the name PLT is that the multipliers $s_{km}$ are related to optimal linear predictor coefficients [79] when $x$ is the blocked version of a scalar WSS process $x(n)$. For simplicity we shall continue to use the term PLT even when this is not the case. The PLT achieves the same optimal performance as the KLT but with less computational complexity in the implementation. Other attractive features are mentioned in [79].
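The contrast between the direct structure of Fig. 5.2 and the MINLAB(I) structure of Fig. 5.3 can be sketched with the additive-noise quantizer model of (5.1). The 2-D source, the predictor coefficient $s_{21}$, and the noise level below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 100_000
sigma_q = 0.01                      # additive-noise model for each quantizer

# Correlated 2-D source x = [x1, x2] with predictor coefficient s21.
s21 = 0.9
x1 = rng.standard_normal(N)
x2 = s21 * x1 + 0.3 * rng.standard_normal(N)
q1 = sigma_q * rng.standard_normal(N)
q2 = sigma_q * rng.standard_normal(N)

# Direct PLT structure (Fig. 5.2): the encoder predicts from the
# ORIGINAL data, but the decoder can only predict from quantized
# data, so the noise is amplified by L: error = q2 + s21*q1.
y2 = x2 - s21 * x1                  # prediction error at the encoder
x1_hat = x1 + q1
x2_hat = (y2 + q2) + s21 * x1_hat
mse_direct = np.mean((x2 - x2_hat) ** 2)

# MINLAB(I) structure (Fig. 5.3): both encoder and decoder predict
# from QUANTIZED data, so the reconstruction error is exactly q2.
e2 = x2 - s21 * x1_hat              # encoder predicts from x1_hat
x2_hat_m = (e2 + q2) + s21 * x1_hat
mse_minlab = np.mean((x2 - x2_hat_m) ** 2)

print(mse_direct / sigma_q ** 2)    # about 1 + s21**2 = 1.81 (noise gain)
print(mse_minlab / sigma_q ** 2)    # about 1 (unity noise gain)
```

The direct structure's noise gain, $1 + s_{21}^2$ here, is exactly the per-coefficient contribution of $\operatorname{tr}(LL^T)/M$, while MINLAB(I) keeps it at unity.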
Figure 5.2: A direct implementation of the PLT.
Figure 5.3: The PLT implemented using the MINLAB(I) structure.