High Order FIR Filter Hardware Implementation Complexity Reduction


Item Type: Conference Paper

Authors: Degtyarev, Alexander; Saifullin, Karim R.; Bakhurin, Sergey

Citation: Degtyarev, A., Saifullin, K., & Bakhurin, S. (2022). High Order FIR Filter Hardware Implementation Complexity Reduction. 2022 24th International Conference on Digital Signal Processing and Its Applications (DSPA). https://doi.org/10.1109/dspa53304.2022.9790772

Eprint version: Post-print

DOI: 10.1109/DSPA53304.2022.9790772

Publisher: IEEE

Rights: Archived with thanks to IEEE

Download date: 2024-01-16 17:04:42

Link to Item: http://hdl.handle.net/10754/679014


High order FIR-filter hardware implementation complexity reduction

Alexander Degtyarev
Moscow Institute of Physics and Technology (MIPT)
Moscow, Russian Federation

Karim Sayfullin
King Abdullah University of Science and Technology (KAUST)
Thuwal, Makkah Province, Saudi Arabia

Sergey Bakhurin
Moscow Research Centre, Huawei Technologies Co. LTD
Moscow, Russian Federation

Abstract: In this paper, the authors propose an algorithm that reduces the hardware implementation complexity of high-order FIR-filters. The algorithm reduces the number of multipliers in an FIR-filter by expressing samples of the initial impulse response through other samples of the same impulse response. This approach allows multipliers to be replaced with bit shifters and adders, which reduces the power consumption and the required chip area of the FIR-filter. The algorithm works with arbitrary filters (Low Pass, High Pass, Band Pass and Band Stop) and supports FIR-filters with both symmetrical and asymmetrical impulse responses.

I. INTRODUCTION

High-order FIR-filters are widely used in digital signal processing and its applications. In hardware, FIR-filters have several advantages over IIR-filters: they are stable, need no feedback path, and can provide a linear phase characteristic. However, the higher the filter order, the more resources are required. An Nth-order FIR-filter requires $N + 1$ multipliers, which are the most expensive elements of an FIR-filter implementation in terms of power consumption and occupied chip area. Both parameters grow quadratically with the multiplier bit depth, whereas for adders and bit shifters the dependence on bit depth is linear.

Hence, the FIR-filter complexity reduction problem arises, which can be considered as the problem of reducing the number of multipliers. Several methods already address this problem. The simplest idea is to replace multipliers that share the same coefficient with a single multiplier. This approach works effectively for linear phase filters. For example, consider a 4th-order FIR-filter with symmetric impulse response:

$$\mathbf{h} = \begin{bmatrix} h_0 & h_1 & h_2 & h_1 & h_0 \end{bmatrix}^T. \tag{1}$$

The direct structure for case (1) can be simplified as shown in figure 1. The direct form of an FIR-filter can also be converted to polyphase form. Let $x_k$ and $h_n$ be the input signal and the impulse response of the filter respectively; then the output signal $y_k$ and the filter transfer function $H(z)$ can be written as [2]:

$$y_k = \sum_{n=0}^{N} h_n x_{k-n}, \qquad H(z) = \sum_{n=0}^{N} h_n z^{-n}. \tag{2}$$

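[Figure 1: block diagram with input $x_k$, four $z^{-1}$ delay elements, adders, three multipliers with coefficients $h_2$, $h_1$, $h_0$, and output $y_k$.]

Fig. 1. Complexity-reduced symmetrical FIR form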
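The saving from coefficient symmetry can be checked numerically. The following is a minimal Python sketch, not the authors' implementation: the function names `fir_direct` and `fir_symmetric`, the coefficient values and the test signal are assumptions made for illustration. It evaluates (2) directly, with one multiplication per tap, and then with a folded structure in the spirit of figure 1, where the two delayed samples that share a coefficient are added before a single multiplication.

```python
def fir_direct(h, x):
    """Direct-form FIR, y_k = sum_n h_n * x_{k-n}, with x_k = 0 for k < 0."""
    y = []
    for k in range(len(x)):
        acc = 0
        for n, hn in enumerate(h):
            if k - n >= 0:
                acc += hn * x[k - n]  # one multiplication per tap
        y.append(acc)
    return y


def fir_symmetric(half, x):
    """Folded FIR for an odd-length symmetric impulse response.

    `half` holds h_0 .. h_{N/2}; the full response is
    half followed by half[:-1] reversed.
    """
    order = 2 * (len(half) - 1)  # filter order N

    def sample(k):
        return x[k] if 0 <= k < len(x) else 0

    y = []
    for k in range(len(x)):
        acc = 0
        for n, hn in enumerate(half[:-1]):
            # samples sharing coefficient h_n are added first, then multiplied once
            acc += hn * (sample(k - n) + sample(k - (order - n)))
        acc += half[-1] * sample(k - order // 2)  # centre tap
        y.append(acc)
    return y


h_half = [1, 2, 3]           # hypothetical h0, h1, h2 for the shape of (1)
h_full = [1, 2, 3, 2, 1]
x = [5, -1, 4, 0, 2, 7, -3]
assert fir_symmetric(h_half, x) == fir_direct(h_full, x)
```

For the 4th-order example the folded form uses 3 multiplications per output sample instead of 5, at the price of two extra additions before the multipliers.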

An FIR filter can be implemented as an $L$-polyphase structure, which is obtained by rewriting the $z$-domain equivalent of convolution (2) as [3]:

$$\sum_{l=0}^{L-1} Y_l(z) z^{-l} = \sum_{i=0}^{L-1} X_i(z) z^{-i} \sum_{j=0}^{L-1} H_j(z) z^{-j}. \tag{3}$$

Here $Y_l(z)$, $X_i(z)$ and $H_j(z)$ are the polyphase components of the output, the input and the filter transfer function, respectively.

For example, a 2-polyphase structure [1] is shown in figure 2. This structure has a reduced-complexity form [1], shown in figure 3, where $F_s$ is the sample rate of the input $x_k$ and output $y_k$ signals. Both $H_0(z)$ and $H_1(z)$ are transfer functions of polyphase subfilters with half the number of multipliers; hence, the simplified 2-polyphase form needs 25% fewer multiplications per second than the direct 2-polyphase form.

Applying the complexity reduction to the subfilters in turn reduces the number of multiplications per second exponentially.

However, this method requires additional pre- and postprocessing adders. Moreover, different clock regions must be provided for different parts of the structure; as a result, the reduction in the number of multiplications per second comes at the cost of increased implementation complexity.

The method suggested in this paper replaces multipliers with adders and bit shifters without changing clock regions in the hardware implementation. This replacement reduces the power consumption and the occupied chip area of the implemented FIR-filters.

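[Figure 2: block diagram of the 2-polyphase form with inputs $x_{2k}$, $x_{2k+1}$, four subfilters $H_0(z)$, $H_1(z)$, $H_0(z)$, $H_1(z)$, a $z^{-1}$ delay, adders and outputs $y_{2k}$, $y_{2k+1}$, placed in the $F_s/2$ clock region.]

Fig. 2. 2-polyphase form

[Figure 3: block diagram of the reduced-complexity form with inputs $x_{2k}$, $x_{2k+1}$, three subfilters $H_0(z)$, $H_0(z) + H_1(z)$, $H_1(z)$, a $z^{-1}$ delay, adders and outputs $y_{2k}$, $y_{2k+1}$, placed in the $F_s/2$ clock region.]

Fig. 3. Reduced complexity 2-polyphase FIR-filter form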
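The step from four subfilters in figure 2 to three in figure 3 can be illustrated in software. The sketch below is an assumption-laden illustration rather than the structure of [1] itself: the helper names `convolve`, `add` and `fir_two_phase_reduced` are hypothetical, and the filter is applied to a finite block instead of a stream. It uses the identity $H_0 X_1 + H_1 X_0 = (H_0 + H_1)(X_0 + X_1) - H_0 X_0 - H_1 X_1$, so only three half-length products are computed.

```python
def convolve(a, b):
    """Polynomial product of two coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def add(a, b):
    """Element-wise sum of two lists, the shorter one zero-padded."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [ai + bi for ai, bi in zip(a, b)]


def fir_two_phase_reduced(h, x):
    """Full convolution of h and x using three half-length subfilters.

    Both h and x must have even length.
    """
    h0, h1 = h[0::2], h[1::2]                 # polyphase components of the filter
    x0, x1 = x[0::2], x[1::2]                 # polyphase components of the input
    p0 = convolve(h0, x0)                     # H0 * X0
    p1 = convolve(h1, x1)                     # H1 * X1
    ps = convolve(add(h0, h1), add(x0, x1))   # (H0 + H1) * (X0 + X1)
    y0 = add(p0, [0] + p1)                    # Y0 = H0 X0 + z^-1 H1 X1
    y1 = [s - a - b for s, a, b in zip(ps, p0, p1)]  # Y1 = H0 X1 + H1 X0
    y = [0] * (len(h) + len(x) - 1)
    y[0::2] = y0                              # even output samples
    y[1::2] = y1                              # odd output samples
    return y


h = [1, 2, 3, 4]
x = [1, -1, 2, 0, 3, 5]
assert fir_two_phase_reduced(h, x) == convolve(h, x)
```

The extra additions on the input side and the subtractions on the output side correspond to the pre- and postprocessing adders mentioned above.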

II. FIR-FILTER COMPLEXITY REDUCTION MATHEMATICAL APPROACH

A. Definitions

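Consider an Nth-order FIR-filter with initial impulse response $\mathbf{h} \in \mathbb{R}^{N+1}$. Let $M$ be the desired number of multipliers after reduction. Define $\mathbf{b} \in \mathbb{R}^{M}$ as a linearly independent set of initial coefficients, called basis coefficients, such that:

$$\mathbf{b} = \{b_i\}, \quad b_i \in \mathbf{h}, \quad i = \overline{0, M-1}, \quad M \le N + 1, \tag{4}$$

where the basis coefficients $\mathbf{b}$ are chosen from the samples of the initial impulse response $\mathbf{h}$, and $\overline{0, M-1}$ denotes the integers $0, 1, \dots, M-1$. Then the reduced-complexity coefficients $\mathbf{g} \in \mathbb{R}^{N+1}$ can be expressed through the basis coefficients and a transformation matrix $\mathbf{T} \in \mathbb{R}^{(N+1) \times M}$:

$$\mathbf{g} = \mathbf{T}\mathbf{b}, \quad \mathbf{T} = \{T_{i,j}\}, \quad i = \overline{0, N}, \quad j = \overline{0, M-1}, \tag{5}$$

$$T_{i,j} \in \{0, 2^{-p}\}, \quad p \in \overline{0, K}, \tag{6}$$

where $K$ is the highest power of 2, which determines the precision with which the coefficients $\mathbf{g}$ of the simplified structure express the initial coefficients $\mathbf{h}$. Each $h_j = b_i$ in (4) corresponds to the coefficient of a multiplier that remains in the simplified structure. Multipliers with coefficients $h_i \notin \mathbf{b}$ are eliminated.

The matrix $\mathbf{T}$ contains only the elements $0$ and $2^{-p}$, so that the coefficients $\mathbf{g}$ of the reduced-complexity FIR filter are expressed through a particular set of initial coefficients $\mathbf{b}$; this means that multipliers with coefficients $h_i \notin \mathbf{b}$ can be replaced with bit shifters and additional adders.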
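Constraint (6) and relation (5) can be expressed in a few lines of Python. This is only a sketch under stated assumptions: the helper names `is_valid_transform` and `apply_transform` are hypothetical, matrices are plain nested lists, and the numeric values are made up for illustration.

```python
from fractions import Fraction


def is_valid_transform(T, K):
    """True if every entry of T is 0 or 2**-p with p in 0..K, as required by (6)."""
    allowed = {Fraction(0)} | {Fraction(1, 2 ** p) for p in range(K + 1)}
    return all(Fraction(t) in allowed for row in T for t in row)


def apply_transform(T, b):
    """Reduced-complexity coefficients g = T b, as in (5)."""
    return [sum(t * bj for t, bj in zip(row, b)) for row in T]


T = [[1, 0], [0.5, 0], [0, 0.25]]   # hypothetical 3 x 2 transformation matrix
b = [4, 8]                          # hypothetical basis coefficients
assert is_valid_transform(T, K=2)
assert not is_valid_transform(T, K=1)   # 0.25 = 2**-2 needs K >= 2
assert apply_transform(T, b) == [4, 2, 2]
```

Only the $M$ entries of $\mathbf{b}$ need true multipliers; every nonzero entry of $\mathbf{T}$ other than 1 corresponds to a bit shift.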

B. Examples

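• Consider a 4th-order FIR-filter with symmetric impulse response (1). In this case the basis coefficients are $\mathbf{b} = \begin{bmatrix} b_0 & b_1 & b_2 \end{bmatrix}^T = \begin{bmatrix} h_0 & h_1 & h_2 \end{bmatrix}^T$. Consequently, the reduced-complexity coefficient vector can be written as:

$$\mathbf{g} = \begin{bmatrix} g_0 \\ g_1 \\ g_2 \\ g_3 \\ g_4 \end{bmatrix} = \mathbf{T}\mathbf{b} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} h_0 \\ h_1 \\ h_2 \end{bmatrix}. \tag{7}$$

The new filter structure with coefficients $\mathbf{g}$, built according to (7), is shown in figure 1. Here the matrix-vector product in (7) is realized by 3 multipliers, matching the number of columns of matrix $\mathbf{T}$.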

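• Assume a filter with initial impulse response $\mathbf{h} = \begin{bmatrix} 1 & 2 & 4 & 3 & 6 \end{bmatrix}^T$. To reduce the complexity of the structure, the multipliers with coefficients 1, 2 and 3 can be replaced with bit shifters. In this case the reduced-complexity vector $\mathbf{g}$, the basis coefficient vector $\mathbf{b}$ and the transformation matrix $\mathbf{T}$ can be written as:

$$\mathbf{g} = \begin{bmatrix} 1 \\ 2 \\ 4 \\ 3 \\ 6 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} 4 \\ 6 \end{bmatrix}, \quad \mathbf{T} = \begin{bmatrix} 0.25 & 0 \\ 0.5 & 0 \\ 1 & 0 \\ 0 & 0.5 \\ 0 & 1 \end{bmatrix}. \tag{8}$$

According to (8), the simplified structure is shown in figure 4: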

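[Figure 4: block diagram with input $x_k$, four $z^{-1}$ delay elements, bit shifters $\gg 2$, $\gg 1$, $\gg 1$, adders, two multipliers with coefficients 4 and 6, and output $y_k$.]

Fig. 4. Complexity-reduced FIR form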

Here the symbols $\gg 2$ and $\gg 1$ denote shifts by 2 bits and by 1 bit respectively. Figure 4 shows that the matrix-vector product is realized by 2 multipliers and 3 bit shifters, owing to the number of columns of matrix $\mathbf{T}$ and its special structure.

• In the examples above the initial impulse response could be expressed exactly through a specially structured transformation matrix and basis coefficients, $\mathbf{h} = \mathbf{g} = \mathbf{T}\mathbf{b}$. In general the coefficients $\mathbf{h}$ cannot be expressed exactly through powers of 2; therefore only the approximate expression $\mathbf{g} \approx \mathbf{h}$ of the initial coefficients is discussed below.
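The second example, (8), can be checked numerically. The sketch below is illustrative only. Figure 4 is not reproduced here, so the exact position of the shifters relative to the multipliers is an assumption: the shifts are applied to the delayed input samples before the two multiplications. The input is assumed to consist of integers divisible by 4, so that the right shifts lose no bits. The function names are hypothetical.

```python
def fir_direct(h, x):
    """Reference: y_k = sum_n h_n * x_{k-n}, one multiplication per tap."""
    return [sum(hn * x[k - n] for n, hn in enumerate(h) if k - n >= 0)
            for k in range(len(x))]


def fir_shift_add(x):
    """Filter h = [1, 2, 4, 3, 6] realised with multipliers 4 and 6 only."""
    def sample(k):
        return x[k] if k >= 0 else 0

    y = []
    for k in range(len(x)):
        # rows 0..2 of T in (8): 1 = 4 >> 2, 2 = 4 >> 1, 4 = 4
        branch4 = (sample(k) >> 2) + (sample(k - 1) >> 1) + sample(k - 2)
        # rows 3..4 of T in (8): 3 = 6 >> 1, 6 = 6
        branch6 = (sample(k - 3) >> 1) + sample(k - 4)
        y.append(4 * branch4 + 6 * branch6)   # the only two multiplications
    return y


x = [4, -8, 12, 0, 16, 20, -4]   # multiples of 4, so the shifts are exact
assert fir_shift_add(x) == fir_direct([1, 2, 4, 3, 6], x)
```

With arbitrary fixed-point inputs the shifts discard low-order bits, so a hardware design would have to choose the word length and shift position accordingly; that choice is outside the scope of this sketch.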

III. PROBLEM FORMULATION

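The task is to develop an algorithm that numerically searches for the basis coefficients $\mathbf{b} \in \mathbb{R}^M$, where $M$ is the desired number of multipliers set by the user, and for the transformation matrix $\mathbf{T}$, consisting only of the elements $0$ and $2^{-p}$, that generate the best approximation $\mathbf{g} = \mathbf{T}\mathbf{b}$ to the initial coefficients $\mathbf{h}$:

$$\int_{-\pi}^{\pi} |H(\omega) - G(\omega)|^2 \, d\omega \to \min_{\mathbf{T}, \mathbf{b}}, \tag{9}$$

where $H(\omega)$ and $G(\omega)$ are the frequency responses of the filter with initial coefficients $\mathbf{h}$ and of the reduced-complexity filter $\mathbf{g}$, respectively. However, due to the complexity of problem (9), a heuristic algorithm is suggested in this paper. Define the problem as a search for the most suitable pair of basis coefficient $b_j$ and power $p$ of 2 to express each initial coefficient $h_i$:

$$p, j = \arg\min_{p,\, j} \left| h_i - \frac{b_j}{2^p} \right|, \tag{10}$$

where $\{h_i\} = \mathbf{h}$, $\{b_j\} = \mathbf{b}$, $\mathbf{b} \subset \mathbf{h}$, $i \in \overline{0, N}$, $j \in \overline{0, M-1}$, $p \in \overline{0, K}$.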
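The per-coefficient search (10) is small enough to write out by exhaustive enumeration. The following Python sketch makes two assumptions: the helper names `best_pair` and `integrated_error` are hypothetical, and ties are broken towards the smaller $p$ and then the smaller $j$, which the paper does not specify. The second helper evaluates the cost (9) through Parseval's relation, $\int_{-\pi}^{\pi} |H(\omega) - G(\omega)|^2 d\omega = 2\pi \sum_n (h_n - g_n)^2$, which is a standard identity and not part of the paper's derivation.

```python
import math


def best_pair(h_i, b, K):
    """Return (p, j) minimising |h_i - b_j / 2**p| over j and p in 0..K, as in (10)."""
    candidates = ((abs(h_i - b_j / 2 ** p), p, j)
                  for j, b_j in enumerate(b)
                  for p in range(K + 1))
    _, p, j = min(candidates)   # ties: smaller p first, then smaller j
    return p, j


def integrated_error(h, g):
    """Cost (9) for real coefficients, computed in the time domain via Parseval."""
    return 2 * math.pi * sum((hn - gn) ** 2 for hn, gn in zip(h, g))


b = [4, 6]                      # basis coefficients of example (8)
assert best_pair(3, b, K=2) == (1, 1)     # 3 = 6 / 2
assert best_pair(1, b, K=2) == (2, 0)     # 1 = 4 / 4
assert best_pair(1.4, b, K=2) == (2, 1)   # closest value is 6 / 4 = 1.5
assert integrated_error([1, 2, 4, 3, 6], [1, 2, 4, 3, 6]) == 0
```

Because each coefficient is matched independently, minimising (10) for every $h_i$ with a fixed basis also minimises the sum of squared coefficient errors for that basis; the choice of the basis itself remains heuristic.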

IV. ALGORITHM DESCRIPTION

First, the basis coefficients need to be found. The basis coefficients $\mathbf{b} \in \mathbb{R}^M$ are the $M$ largest linearly independent initial coefficients. Coefficients $h_i$ and $h_j$ are called linearly independent if they satisfy:

$$|\alpha h_i + \beta h_j| \ge \Delta, \quad \forall \alpha, \beta \ne 0, \tag{11}$$

where $\Delta$ is the equivalence coefficient, a small fixed number set by the user and used for the numerical stability of the algorithm.

The search for matrix $\mathbf{T}$ on a fixed grid is based on an iterative algorithm. The inputs of the algorithm are the initial coefficients $\mathbf{h}$, the desired number of multipliers $M$, the highest power of two $K$ and the coefficient $\Delta$.

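Define auxiliary objects. Consider $\mathbf{D}_p \in \mathbb{R}^{(N+1) \times (N+1)}$, $p \in \overline{0, K}$, the difference matrix that consists of the absolute values of the differences between the elements of the initial impulse response and the initial coefficients multiplied by $2^{-p}$. Together these matrices form a 3-dimensional array:

$$\mathbf{D} = \left\{ \begin{bmatrix} D_{0,0,p} & \cdots & D_{0,N,p} \\ \vdots & \ddots & \vdots \\ D_{N,0,p} & \cdots & D_{N,N,p} \end{bmatrix} \right\}, \quad p \in \overline{0, K}, \tag{12}$$

$$D_{i,j,p} = \left| h_i - \frac{h_j}{2^p} \right|, \quad i, j \in \overline{0, N}. \tag{13}$$

The arrays (12), (13) are the key objects of the algorithm; they contain the information about coefficient similarity. Let $\mathbf{u} \in \mathbb{N}_0^{N+1}$ be the indices of the basis coefficients $\mathbf{b} \subset \mathbf{h}$. Let $\mathbf{Q} \in \mathbb{R}^{(N+1) \times M}$ be the matrix of minimal difference indices.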

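The algorithm finds the minimal elements of the 3-dimensional array $\mathbf{D} \equiv \{D_{i,j,p}\}$, $i, j \in \overline{0, N}$, $p \in \overline{0, K}$, along index $p$ and writes their indices to $\mathbf{Q}$. The minimal differences are stored in $\mathbf{D}_0 \equiv \{D_{i,j,0}\}$. Let $\mathbf{q} \in \mathbb{R}^{N+1}$ be the vector of minimal difference indices. The algorithm finds the minimal elements of $\mathbf{D}_0 \equiv \{D_{i,j,0}\}$ along index $j$ and writes their indices to $\mathbf{q}$.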
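As an illustration of (12) and (13), the difference array can be built with a nested comprehension. This is a minimal sketch: the function name `difference_array` and the nested-list layout `D[i][j][p]` are assumptions made for readability, not the authors' data structure.

```python
def difference_array(h, K):
    """D[i][j][p] = |h_i - h_j / 2**p| for i, j in 0..N and p in 0..K, as in (13)."""
    return [[[abs(hi - hj / 2 ** p) for p in range(K + 1)]
             for hj in h]
            for hi in h]


D = difference_array([1, 2, 4, 3, 6], K=2)   # coefficients of example (8)
assert D[0][2][2] == 0    # 1 = 4 / 2**2
assert D[3][4][1] == 0    # 3 = 6 / 2**1
assert D[4][0][0] == 5    # |6 - 1|
```

A zero (or a value below $\Delta$) at position $(i, j, p)$ means that $h_i$ can be obtained from $h_j$ by a shift of $p$ bits.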

Algorithm description:

Step 1. At this stage the algorithm first fills the array $\mathbf{D}$ with the differences (13) and searches for linearly dependent coefficients according to (11). If a pair $h_i$ and $h_j / 2^p$ with a difference below $\Delta$ is found, then $h_i$ is marked as a coefficient that cannot be included in the basis vector by writing $u_i = -1$. It is $h_i$ that is marked, because it can be expressed through a larger coefficient, and the basis coefficients are the largest ones. The additional conditions in line 6 of the algorithm prevent all coefficients from being marked in the case $h_i = h_j$, $i = j$, and reduce the number of operations for $p = 0$, because in this case the difference matrix is symmetric.

The next aim is to keep only the $M$ largest coefficients in $\mathbf{b}$ together with their indices $\mathbf{u}$. For this purpose the algorithm sorts $\mathbf{b}$ and $\mathbf{u}$ synchronously by the values of $\mathbf{b}$ in decreasing order.

Before the first $M$ elements of the sorted array are kept, the condition $u_{M-1} = -1$ is checked. If it holds, the number of linearly independent coefficients is less than $M$, which is an error. If there is no error, the algorithm deletes all elements except the first $M$. The vector $\mathbf{u}$ still contains the indices of $\mathbf{b}$, so these two vectors can be sorted synchronously by $\mathbf{u}$ to restore the initial order of the coefficients $\mathbf{b}$.

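Algorithm: Step 1

Data: $\mathbf{h}$, $M$, $K$, $\Delta$

Result: Obtain $\mathbf{b}$, $\mathbf{u}$

1 Initialization:
2     $\mathbf{b} = \mathbf{h}$, $N = \mathrm{size}(\mathbf{h})$, $\mathbf{u} = [0 \dots N]^T$,
3     $\mathbf{D}_p$, $p \in \overline{0, K}$, $\mathbf{q}$, $\mathbf{Q}$, $\mathbf{T}$ with zeros
4 for $i \in \overline{0, N}$, $j \in \overline{0, N}$, $p \in \overline{0, K}$ do
5     Calculate $D_{i,j,p} = \left| h_i - h_j / 2^p \right|$;
6     if $D_{i,j,p} < \Delta \wedge \big( (i \ne j \wedge p > 0) \vee (i > j \wedge p = 0) \big)$ then
7         $u_i = -1$;
8 Combine $\mathbf{b}$, $\mathbf{u}$ in structure $\{b_i, u_i\}$, $i \in \overline{0, N}$;
9 Sort $\{b_i, u_i\}$, $i \in \overline{0, N}$ by $b_i$ in decreasing order;
10 Extract $\mathbf{b}$ and $\mathbf{u}$ from $\{b_i, u_i\}$, $i \in \overline{0, N}$;
11 if $u_{M-1} = -1$ then
12     Error: choose lower number of multipliers $M$;
13     Stop algorithm;
14 delete $b_i$, $i \ge M$;
15 delete $u_i$, $i \ge M$;
16 Combine $\mathbf{b}$, $\mathbf{u}$ in structure $\{b_i, u_i\}$, $i \in \overline{0, M-1}$;
17 Sort $\{b_i, u_i\}$, $i \in \overline{0, M-1}$ by $u_i$ in increasing order;
18 Extract $\mathbf{b}$ and $\mathbf{u}$ from $\{b_i, u_i\}$, $i \in \overline{0, M-1}$;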
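Step 1 can be sketched in Python as follows. This is a hedged illustration rather than the authors' code. The function name `select_basis` is hypothetical. The listing does not state how marked coefficients are kept out of the first $M$ positions after sorting, so the sketch assumes they are ranked after all unmarked coefficients; with that assumption the test $u_{M-1} = -1$ is equivalent to "fewer than $M$ independent coefficients". Sorting is by value, as written in line 9; whether magnitude is intended for negative coefficients is not specified in the text.

```python
def select_basis(h, M, K, delta):
    """Step 1: return (b, u), the M basis coefficients and their indices in h."""
    n = len(h)
    u = list(range(n))
    for i in range(n):
        for j in range(n):
            for p in range(K + 1):
                d = abs(h[i] - h[j] / 2 ** p)              # line 5, eq. (13)
                # line 6: never compare h_i with itself, and for p = 0
                # visit each unordered pair only once
                if d < delta and ((i != j and p > 0) or (i > j and p == 0)):
                    u[i] = -1                              # line 7
    # lines 8-10: sort synchronously, largest coefficient first
    # (assumption: marked coefficients are ranked last)
    pairs = sorted(zip(h, u), key=lambda bu: (bu[1] == -1, -bu[0]))
    # lines 11-13: not enough independent coefficients
    if M > n or pairs[M - 1][1] == -1:
        raise ValueError("choose a lower number of multipliers M")
    # lines 14-18: keep the first M pairs and restore the original order
    pairs = sorted(pairs[:M], key=lambda bu: bu[1])
    b = [bi for bi, _ in pairs]
    u = [ui for _, ui in pairs]
    return b, u


assert select_basis([1, 2, 4, 3, 6], M=2, K=2, delta=1e-9) == ([4, 6], [2, 4])
```

For the coefficients of example (8) the sketch recovers the basis $\mathbf{b} = [4, 6]^T$; requesting $M = 3$ raises the error of line 12, since only two coefficients are independent up to shifts.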

As a result, the matrices $\mathbf{D}_p$ are filled, $\mathbf{b} \in \mathbb{R}^M$ includes only the basis coefficients and $\mathbf{u} \in \mathbb{N}_0^M$ contains only the indices of the largest linearly independent coefficients, that is, of the basis coefficients.

Step 2. Here the algorithm keeps only the columns of $\mathbf{D}_p$ with indices $\mathbf{u}$, because the aim is to express each initial coefficient through a basis coefficient. Then, for each array $D_{i,j,p}$ with fixed $i, j$, it finds the lowest element, stores it in $\mathbf{D}_0$ and stores its index $p$ in $\mathbf{Q}$.

As a result, the matrix $\mathbf{D}_0 \equiv \{D_{i,j,0}\}$, $i \in \overline{0, N}$, $j \in \overline{0, M-1}$, contains the minimal differences along the dimension with index $p$, whereas the matrix $\mathbf{Q}$ contains the powers $p$ of 2 that are optimal for expressing the initial coefficient $h_i$ through the basis coefficient $b_j$.

Step 3. At this stage the algorithm finds the lowest element in each array $D_{i,j,0}$ with fixed $i$ and stores its index $j$ in $\mathbf{q}$.


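Algorithm: Step 2

Result: Obtain $\mathbf{D}_0$, $\mathbf{Q}$

1 $\mathbf{D} = \mathbf{D}[:, \mathbf{u}, :]$
2 for $i \in \overline{0, N}$, $j \in \overline{0, M-1}$ do
3     Find $\min_p \{D_{i,j,p}\}$;
4     Write $\mathbf{D}_0 \equiv D_{i,j,0} = \min_p (D_{i,j,p})$;
5     Remember index $Q_{i,j} = p_{\min} \equiv \arg\min_p (D_{i,j,p})$;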
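A possible Python rendering of Step 2 is given below. It is a sketch under assumptions: the name `best_powers` is hypothetical, the array is a nested list `D[i][j][p]`, the results are returned as two new matrices instead of overwriting the $p = 0$ plane of $\mathbf{D}$, and ties are resolved towards the smallest $p$.

```python
def best_powers(D, u):
    """Step 2: keep the columns of D listed in u, then minimise over p.

    Returns (D0, Q): D0[i][m] is the smallest difference between h_i and the
    m-th basis coefficient, Q[i][m] is the power p that attains it.
    """
    D0, Q = [], []
    for row in D:
        d_row, q_row = [], []
        for j in u:                               # line 1: D = D[:, u, :]
            diffs = row[j]
            p_min = min(range(len(diffs)), key=diffs.__getitem__)  # line 5
            d_row.append(diffs[p_min])            # line 4
            q_row.append(p_min)
        D0.append(d_row)
        Q.append(q_row)
    return D0, Q


h, K = [1, 2, 4, 3, 6], 2                         # coefficients of example (8)
D = [[[abs(hi - hj / 2 ** p) for p in range(K + 1)] for hj in h] for hi in h]
D0, Q = best_powers(D, u=[2, 4])
assert Q == [[2, 2], [1, 2], [0, 1], [0, 1], [0, 0]]
assert D0 == [[0, 0.5], [0, 0.5], [0, 1], [1, 0], [2, 0]]
```

Each row of `D0` contains a zero here, which reflects the fact that every coefficient of example (8) is an exact shift of one of the two basis coefficients.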

As a result of Step 3, the index vector $\mathbf{q}$ contains, for each initial coefficient, the index of the optimal basis coefficient to express it through.

Algorithm: Step 3

Result: Obtain $\mathbf{q}$

1 for $i \in \overline{0, N}$ do
2     Find $\min_j \{D_{i,j,0}\}$;
3     Remember index $q_i = j_{\min} \equiv \arg\min_j (D_{i,j,0})$;
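Step 3 reduces to a row-wise arg min. In the sketch below the name `best_basis_index` is hypothetical and ties are resolved towards the smallest $j$, which is an assumption; the input matrix is the $\mathbf{D}_0$ obtained for example (8) with basis $[4, 6]^T$.

```python
def best_basis_index(D0):
    """Step 3: for every row i, the column j with the smallest D0[i][j]."""
    return [min(range(len(row)), key=row.__getitem__) for row in D0]


D0 = [[0, 0.5], [0, 0.5], [0, 1], [1, 0], [2, 0]]
assert best_basis_index(D0) == [0, 0, 0, 1, 1]
```

The result says that the coefficients 1, 2 and 4 are expressed through the first basis coefficient and 3 and 6 through the second one.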

Step 4. At this point $\mathbf{q}$ contains the indices of the optimal basis coefficients relative to vector $\mathbf{b}$, because in Step 3 the algorithm searched for the lowest elements of the arrays $\{D_{i,j,0}\}$, $i \in \overline{0, N}$, in the matrix $\mathbf{D}_0$ of size $(N + 1) \times M$, since $\mathbf{D}$ was compressed in Step 2. Therefore the algorithm first finds the indices of the optimal basis coefficients relative to vector $\mathbf{h}$. They are given by $u_{q_i}$, because vector $\mathbf{u}$ was sorted in increasing order in Step 1. After that it fills the transformation matrix.

Algorithm: Step 4

Result: Obtain $\mathbf{T}$

1 for $i \in \overline{0, N}$ do
2     $q_i = u_{q_i}$;
3     $T_{i,q_i} = \mathrm{sign}(h_i h_{q_i}) \dfrac{1}{2^{Q_{i,q_i}}}$;

Step 5. Finally, to obtain the impulse response of the reduced-complexity FIR-filter, the algorithm multiplies the generated transformation matrix by the basis coefficients, as in (5).
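Steps 4 and 5 can be sketched together. This is an interpretation, not the authors' code: the names `build_transform` and `reduced_coefficients` are hypothetical. The listing of Step 4 overwrites $q_i$ with $u_{q_i}$ and then uses it to index $\mathbf{T}$ and $\mathbf{Q}$, which have only $M$ columns; the sketch therefore assumes that the column index relative to $\mathbf{b}$ is kept for $\mathbf{T}$ and $\mathbf{Q}$, while $u_{q_i}$ is used only to address the basis coefficient inside $\mathbf{h}$.

```python
def build_transform(h, b, u, q, Q):
    """Step 4: fill T (N+1 rows, M columns) with one signed power of two per row."""
    T = [[0.0] * len(b) for _ in h]
    for i, h_i in enumerate(h):
        m = q[i]                    # column of the chosen basis coefficient in b
        h_q = h[u[m]]               # the same coefficient addressed inside h
        prod = h_i * h_q
        sign = (prod > 0) - (prod < 0)          # sign(h_i * h_q) in {-1, 0, 1}
        T[i][m] = sign / 2 ** Q[i][m]
    return T


def reduced_coefficients(T, b):
    """Step 5: g = T b, as in (5)."""
    return [sum(t * b_j for t, b_j in zip(row, b)) for row in T]


h = [1, 2, 4, 3, 6]                 # example (8)
b, u = [4, 6], [2, 4]               # result of Step 1
Q = [[2, 2], [1, 2], [0, 1], [0, 1], [0, 0]]    # result of Step 2
q = [0, 0, 0, 1, 1]                 # result of Step 3
T = build_transform(h, b, u, q, Q)
assert T == [[0.25, 0], [0.5, 0], [1, 0], [0, 0.5], [0, 1]]
assert reduced_coefficients(T, b) == h
```

For this input the sketch reproduces the matrix $\mathbf{T}$ of (8) and returns $\mathbf{g} = \mathbf{h}$ exactly.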

V. ALGORITHM PERFORMANCE

This section gives examples of the algorithm's operation.

• Linear phase real Low Pass Filter. Initial number of multipliers: $N = 139$; reduced number of multipliers: $M = 25$. The initial impulse response $\mathbf{h} = \{h_n\}$, $n \in \overline{0, N}$, and the difference between the initial and generated impulse responses $\mathbf{h} - \mathbf{g} = \{h_n - g_n\}$, $n \in \overline{0, N}$, are shown in figure 5:

Figure 5 shows that the impulse response error is about $10^{-3}$ times the initial filter coefficients, which results in good performance in the frequency domain. Figure 5 also shows that the error samples $\{h_n - g_n\}$ corresponding to the largest coefficients equal zero, because these samples are chosen as basis coefficients.

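[Figure 5: two plots over $n$ from 0 to 120: the initial impulse response $h_n$ (vertical axis from 0 to 0.3) and the impulse response error $h_n - g_n$ (vertical axis from $-10^{-4}$ to $10^{-4}$).]

Fig. 5. Linear phase Low Pass Filter. Initial impulse response $h_n$ and impulse response error $h_n - g_n$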

The frequency responses of the initial filter $H(\omega)$ and of the generated filter $G(\omega)$ in dB, and the spectrum of the error $E(\omega) = H(\omega) - G(\omega)$, are shown in figure 6:

[Figure 6: three plots over $\omega$ from $-\pi$ to $\pi$ with vertical axes from $-80$ to 0 dB: $|H(\omega)|^2$ (139 MULTs), $|G(\omega)|^2$ (25 MULTs) and $|E(\omega)|^2 = |H(\omega) - G(\omega)|^2$, with a 60 dB level annotated.]

Fig. 6. Linear phase Low Pass Filter. Frequency responses of initial $H(\omega)$, generated $G(\omega)$ filters and spectrum $E(\omega)$ of error

Figure 6 shows that the error is lower than -60 dB in the bandwidth.

• Linear phase real Band Pass Filter. Initial number of multipliers: $N = 139$; reduced number of multipliers: $M = 25$. The initial impulse response $\mathbf{h} = \{h_n\}$, $n \in \overline{0, N}$, and the difference between the initial and generated impulse responses $\mathbf{h} - \mathbf{g} = \{h_n - g_n\}$, $n \in \overline{0, N}$, are shown in figure 7:

[Figure 7: two plots over $n$ from 0 to 120: the initial impulse response $h_n$ (vertical axis from $-0.2$ to 0.4) and the impulse response error $h_n - g_n$ (vertical axis from $-5 \cdot 10^{-5}$ to $5 \cdot 10^{-5}$).]

Fig. 7. Linear phase Band Pass Filter. Initial impulse response $h_n$ and impulse response error $h_n - g_n$

The frequency responses of the initial filter $H(\omega)$ and of the generated filter $G(\omega)$ in dB, and the spectrum of the error $E(\omega) = H(\omega) - G(\omega)$ in dB, are shown in figure 8:

[Figure 8: three plots over $\omega$ from $-\pi$ to $\pi$ with vertical axes from $-80$ to 0 dB: $|H(\omega)|^2$ (139 MULTs), $|G(\omega)|^2$ (25 MULTs) and $|E(\omega)|^2 = |H(\omega) - G(\omega)|^2$, with a 60 dB level annotated.]

Fig. 8. Linear phase Band Pass Filter. Frequency responses of initial $H(\omega)$, generated $G(\omega)$ filters and spectrum $E(\omega)$ of error

As for the linear phase Low Pass Filter, figure 8 shows that the reduced filter provides an error of about -60 dB in the bandwidth.

• Minimum phase real Low Pass Filter. Initial number of multipliers: $N = 139$; reduced number of multipliers: $M = 25$. The initial asymmetrical impulse response $\mathbf{h} = \{h_n\}$, $n \in \overline{0, N}$, and the difference between the initial and generated impulse responses $\mathbf{h} - \mathbf{g} = \{h_n - g_n\}$, $n \in \overline{0, N}$, are shown in figure 9:

The frequency responses of the initial filter $H(\omega)$ and of the generated filter $G(\omega)$ in dB, and the spectrum of the error $E(\omega) = H(\omega) - G(\omega)$, are shown in figure 10:

Figure 10 shows that the reduced filter provides an error of about -50 dB for $M = 25$ and about -70 dB for $M = 70$ in the bandwidth.
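The error spectra discussed in this section can be reproduced for any pair of impulse responses with a direct evaluation of the frequency response. The sketch below uses only the Python standard library; the function names, the grid size and the small floor added before the logarithm are assumptions, and the coefficients in the usage lines are made up, since the filters used for the figures are not listed in the paper.

```python
import cmath
import math


def frequency_response(h, omega):
    """H(omega) = sum_n h_n * exp(-1j * omega * n)."""
    return sum(hn * cmath.exp(-1j * omega * n) for n, hn in enumerate(h))


def error_spectrum_db(h, g, num_points=512):
    """List of (omega, 10*log10 |H(omega) - G(omega)|^2) on a grid over [-pi, pi)."""
    floor = 1e-30   # avoids log10(0) where the error vanishes
    out = []
    for k in range(num_points):
        omega = -math.pi + 2 * math.pi * k / num_points
        e = frequency_response(h, omega) - frequency_response(g, omega)
        out.append((omega, 10 * math.log10(abs(e) ** 2 + floor)))
    return out


h = [0.25, 0.5, 0.25]          # hypothetical initial coefficients
g = [0.25, 0.5, 0.2501]        # hypothetical reduced-complexity coefficients
spectrum = error_spectrum_db(h, g, num_points=64)
assert all(abs(db + 80.0) < 1e-6 for _, db in spectrum)
```

In this toy case a single coefficient is off by $10^{-4}$, so the error spectrum is flat at -80 dB; for real designs the same function gives the curve plotted in the lowest panels of the frequency response figures, up to the normalisation used there.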

[Figure 9: two plots over $n$ from 0 to 120: the initial impulse response $h_n$ (vertical axis from $-0.05$ to 0.1) and the impulse response error $h_n - g_n$ (vertical axis from $-5 \cdot 10^{-4}$ to $5 \cdot 10^{-4}$).]

Fig. 9. Minimum phase Low Pass Filter. Initial impulse response $h_n$ and impulse response error $h_n - g_n$

[Figure 10: three plots over $\omega$ from $-\pi$ to $\pi$ with vertical axes from $-100$ to 0 dB: $|H(\omega)|^2$ (139 MULTs), $|G(\omega)|^2$ (25 MULTs and 70 MULTs) and $|E(\omega)|^2$ (25 MULTs and 70 MULTs).]

Fig. 10. Minimum phase Low Pass Filter. Frequency responses of initial $H(\omega)$, generated $G(\omega)$ filters and spectrum $E(\omega)$ of error

CONCLUSIONS

The given method significantly reduces the number of multipliers of a static FIR-filter at the cost of a minor error power spectral density. The most important advantage of the approach is the ability to reduce the number of multipliers while keeping the filter order the same.

REFERENCES

[1] A. Eghbali, O. Gustafsson, H. Johansson, and Per Lowenborg, “On the Complexity of Multiplierless Direct and Polyphase FIR Filter Structures,” Division of Electronics Systems, Department of Electrical Engineering, Linkoping University, SE-581 83, SWEDEN.

[2] S. K. Mitra, “Digital Signal Processing: A Computer Based Approach,” McGraw-Hill, Feb. 2006.

[3] K. K. Parhi, “VLSI Digital Signal Processing Systems: Design and Implementation,” John Wiley and Sons, 1999.

[4] J. Clerk Maxwell, A Treatise on Electricity and Magnetism, 3rd ed., vol. 2. Oxford: Clarendon, 1892, pp. 68–73.

[5] I. S. Jacobs and C. P. Bean, “Fine particles, thin films and exchange anisotropy,” in Magnetism, vol. III, G. T. Rado and H. Suhl, Eds. New York: Academic, 1963, pp. 271–350.

[6] K. Elissa, “Title of paper if known,” unpublished.

[7] R. Nicole, “Title of paper with only first word capitalized,” J. Name Stand. Abbrev., in press.

[8] Y. Yorozu, M. Hirano, K. Oka, and Y. Tagawa, “Electron spectroscopy studies on magneto-optical media and plastic substrate interface,” IEEE Transl. J. Magn. Japan, vol. 2, pp. 740–741, August 1987 [Digests 9th Annual Conf. Magnetics Japan, p. 301, 1982].

[9] M. Young, The Technical Writer’s Handbook. Mill Valley, CA: University Science, 1989.