Properties of LTI Systems - Discrete-Time Speech Signal Processing


2.8 Properties of LTI Systems

Figure 2.8 Pole-zero configuration for a causal and stable discrete-time system. [Plot omitted: z-plane with the unit circle; poles are marked ×.]

2.8.2 Magnitude-Phase Relationships

When all zeros as well as poles of a rational transfer function $H(z)$ are inside the unit circle, the transfer function is said to be minimum phase, and the corresponding impulse response is referred to as a minimum-phase sequence [7]. Such zeros and poles will sometimes be called “minimum-phase components” of the transfer function. An arbitrary causal and stable digital filter, however, will have its poles, but not necessarily its zeros, inside the unit circle. Zeros (or poles) outside the unit circle are referred to as maximum-phase components. Therefore, $H(z)$ is generally mixed-phase, consisting of a minimum-phase and maximum-phase component, i.e.,

$$H(z) = H_{\min}(z)\, H_{\max}(z)$$

where $H_{\min}(z)$ consists of all poles and zeros inside the unit circle, and $H_{\max}(z)$ consists of all zeros outside the unit circle. Although not strictly following our definitions, we also include in these terms zeros on the unit circle. The terminology “minimum-phase” and “maximum-phase” is also applied to discrete-time signals, as well as to systems and their impulse responses.
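The factorization above can be sketched numerically for an all-zero (FIR) system by sorting the roots of the numerator polynomial according to their radius. This is a minimal illustration and not a procedure from the text: the function name `split_min_max_phase` and the example zero locations are hypothetical, and zeros on the unit circle are assigned to the maximum-phase factor, following the convention stated above.

```python
import numpy as np

def split_min_max_phase(b):
    """Split FIR coefficients b (in powers of z^-1) into two factors.

    Returns (b_min, b_max, inside, outside), where b_min holds the zeros
    strictly inside the unit circle and b_max holds the remaining zeros.
    """
    zeros = np.roots(b)
    inside = zeros[np.abs(zeros) < 1.0]     # minimum-phase zeros
    outside = zeros[np.abs(zeros) >= 1.0]   # maximum-phase zeros (unit circle included)
    b_min = np.real_if_close(np.poly(inside))
    b_max = np.real_if_close(b[0] * np.poly(outside))  # carry the overall gain here
    return b_min, b_max, inside, outside

# Hypothetical example: three zeros inside the unit circle, one outside.
zeros_true = np.array([0.5,
                       0.8 * np.exp(1j * np.pi / 3),
                       0.8 * np.exp(-1j * np.pi / 3),
                       2.0])
b = np.real(np.poly(zeros_true))

b_min, b_max, inside, outside = split_min_max_phase(b)
b_rebuilt = np.convolve(b_min, b_max)       # H_min(z) * H_max(z)
print("H_min coefficients:", b_min)
print("H_max coefficients:", b_max)
```

Convolving the two coefficient vectors reproduces the original filter, which is the time-domain counterpart of multiplying $H_{\min}(z)$ by $H_{\max}(z)$.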

Alternatively, any digital filter can be represented by the cascade of a minimum-phase “reference” system $H_{\text{rmp}}(z)$ and an all-pass system $A_{\text{all}}(z)$:

$$H(z) = H_{\text{rmp}}(z)\, A_{\text{all}}(z)$$

where the all-pass system is characterized by a frequency response with unity magnitude for all $\omega$. In particular, an arbitrary rational all-pass $A_{\text{all}}(z)$ can be shown to consist of a cascade of factors of the form $\left[\frac{1 - a^{*}z}{1 - az^{-1}}\right]^{\pm 1}$ where $|a| < 1$ (Exercise 2.10). Consequently, such all-pass systems have the property that their poles and zeros occur at conjugate reciprocal locations; by this, we mean that a zero at $z = 1/a^{*}$ implies a pole at $z = a$. It follows that if $H(z)$ contains $Q$ zeros then, because $|H(z)| = |H_{\text{rmp}}(z)|$ on the unit circle, there exists a maximum of $2^{Q}$ different phase functions (excluding linear phase), and thus $2^{Q}$ different sequences, for the given magnitude function.⁸ These phase functions can be obtained by reflecting zeros about the unit circle to their conjugate reciprocal locations by multiplying $H(z)$ by $\left[\frac{1 - a^{*}z}{1 - az^{-1}}\right]^{\pm 1}$. Because the all-pass function with unity power ($+1$) has negative phase (Exercise 2.10), if we flip a zero of a minimum-phase function outside the unit circle, then the resulting phase is more negative than the original. The term minimum phase-lag would thus be more precise than the commonly used minimum phase [7].
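The counting argument can be checked numerically with a small sketch. It rests on assumptions that are not in the text: the three zero locations are hypothetical, and each reflection uses the causal factor $(z^{-1} - a^{*})$ in place of $(1 - az^{-1})$, which differs from the all-pass factor quoted above by a pure delay (a linear-phase term). Reflecting a single zero of a conjugate pair gives a complex sequence, so the enumeration allows complex coefficients.

```python
import itertools
import numpy as np

def fir_from_zeros(zeros, flips):
    """Build FIR coefficients (powers of z^-1), reflecting the zeros marked in flips.

    An unflipped zero a contributes (1 - a z^-1); a flipped zero contributes
    (z^-1 - conj(a)), whose zero lies at 1/conj(a). Both factors have the
    same magnitude on the unit circle.
    """
    h = np.array([1.0 + 0j])
    for a, flip in zip(zeros, flips):
        factor = np.array([-np.conj(a), 1.0]) if flip else np.array([1.0, -a])
        h = np.convolve(h, factor)
    return h

# Hypothetical minimum-phase zeros (Q = 3).
zeros = [0.5, 0.8 * np.exp(1j * np.pi / 3), 0.8 * np.exp(-1j * np.pi / 3)]
Q = len(zeros)
n_fft = 512

# All 2^Q reflection patterns; the first has no flips, the last flips every zero.
variants = [fir_from_zeros(zeros, flips)
            for flips in itertools.product([False, True], repeat=Q)]
magnitudes = np.array([np.abs(np.fft.fft(h, n_fft)) for h in variants])

# Unwrapped phase on 0 <= omega <= pi for the minimum- and maximum-phase cases.
half = n_fft // 2 + 1
phase_min = np.unwrap(np.angle(np.fft.fft(variants[0], n_fft)[:half]))
phase_max = np.unwrap(np.angle(np.fft.fft(variants[-1], n_fft)[:half]))

print("number of sequences:", len(variants))
print("largest magnitude mismatch:", np.max(np.abs(magnitudes - magnitudes[0])))
print("largest phase difference (max minus min):", np.max(phase_max - phase_min))
```

All $2^{Q}$ sequences share one magnitude response, and the fully reflected (maximum-phase) sequence has the larger phase lag at every frequency on the grid.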

Several properties of minimum-phase sequences have important consequences for speech modeling and processing. A minimum-phase sequence is uniquely specified by the magnitude of its Fourier transform. This result will be proven formally in Chapter 10 in the context of the complex cepstrum representation of a sequence, but can be seen intuitively from the example of a stable rational z-transform. In this case, for a given Fourier-transform magnitude, there is only one sequence with all zeros (or poles) inside the unit circle. Likewise, the Fourier transform phase $\angle H(\omega)$ of a minimum-phase sequence uniquely specifies the sequence (to within a scale factor).
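One way to see the uniqueness claim numerically is to rebuild a minimum-phase sequence from a magnitude response alone. The sketch below uses the standard cepstral folding construction on a DFT grid. It is an assumption-laden illustration rather than the derivation given in Chapter 10: the function name, the FFT length, and the test sequences are hypothetical, and the magnitude must have no zeros on the unit circle.

```python
import numpy as np

def minimum_phase_from_magnitude(mag):
    """Return the minimum-phase sequence whose DFT magnitude is mag (even length N)."""
    n = len(mag)
    cep = np.fft.ifft(np.log(mag)).real      # real cepstrum of the log magnitude
    fold = np.zeros(n)
    fold[0] = cep[0]
    fold[1:n // 2] = 2.0 * cep[1:n // 2]     # keep and double the causal part
    fold[n // 2] = cep[n // 2]
    return np.fft.ifft(np.exp(np.fft.fft(fold))).real

# Two hypothetical sequences with the same Fourier transform magnitude:
# h_min has zeros at 0.5 and 0.8*exp(+-j*pi/3); h_mixed reflects the real zero to 2.0.
pair = np.array([1.0, -0.8, 0.64])
h_min = np.convolve([1.0, -0.5], pair)
h_mixed = np.convolve([-0.5, 1.0], pair)

mag = np.abs(np.fft.fft(h_mixed, 1024))      # only the magnitude is passed on
recovered = minimum_phase_from_magnitude(mag)
print("recovered:", np.round(recovered[:4], 6))
print("h_min:    ", h_min)
```

Starting from the magnitude of the mixed-phase sequence, the construction returns the minimum-phase sequence, consistent with there being only one such sequence for a given magnitude.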

Another useful property relates to the energy concentration of a minimum-phase sequence. From Parseval’s theorem, all sequences with the same Fourier transform magnitude have the same energy, but when we flip zeros (or poles) inside and outside the unit circle to their conjugate reciprocal locations, this energy gets distributed along the time axis in different ways. It can be shown that a finite-length (all-zero) minimum-phase sequence has energy most concentrated near (and to the right of) the time origin, relative to all other finite-length causal sequences with the same Fourier transform magnitude, and thus tends to be characterized by an abrupt onset or what is sometimes referred to as a fast “attack” of the sequence.⁹ This property can be formally expressed as [7]

$$\sum_{n=0}^{m} |h_{\text{rmp}}[n]|^{2} \;\ge\; \sum_{n=0}^{m} |h[n]|^{2}, \qquad m \ge 0 \tag{2.23}$$

where $h[n]$ is a causal sequence with the Fourier transform magnitude equal to that of the reference minimum-phase sequence $h_{\text{rmp}}[n]$. As zeros are flipped outside the unit circle, the energy of the sequence is delayed in time, the maximum-phase counterpart having maximum energy delay (or phase lag) [7]. Similar energy localization properties are found with respect to poles. However, because causality strictly cannot be made to hold when a z-transform contains maximum-phase poles, it is more useful to investigate how the energy of the sequence shifts with respect to the time origin. As illustrated in Example 2.11, flipping poles from inside to outside the unit circle to their conjugate reciprocal location moves energy to the left of the time origin, transforming the fast attack of the minimum-phase sequence to a more gradual onset. We will see throughout the text that numerous speech analysis schemes result in a minimum-phase vocal tract impulse response estimate. Because the vocal tract is not necessarily minimum phase, synthesized speech may be characterized in these cases by an unnaturally abrupt vocal tract impulse response.
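The partial-energy comparison can be illustrated with a short sketch. The three sequences below are hypothetical: they share one Fourier transform magnitude and differ only in which zeros of the minimum-phase reference have been reflected outside the unit circle.

```python
import numpy as np

# Hypothetical minimum-phase reference: zeros at 0.5 and 0.8*exp(+-j*pi/3).
pair = np.array([1.0, -0.8, 0.64])
h_rmp = np.convolve([1.0, -0.5], pair)
h_mix = np.convolve([-0.5, 1.0], pair)   # real zero reflected to 2.0
h_max = h_rmp[::-1]                      # every zero reflected (maximum phase)

def partial_energy(h):
    """Running sum of |h[n]|^2 for n = 0..m, as a function of m."""
    return np.cumsum(np.abs(h) ** 2)

E_rmp = partial_energy(h_rmp)
E_mix = partial_energy(h_mix)
E_max = partial_energy(h_max)

for m in range(len(h_rmp)):
    print(f"m={m}: rmp={E_rmp[m]:.4f}  mixed={E_mix[m]:.4f}  max={E_max[m]:.4f}")
```

The total energies agree, as Parseval's theorem requires, while the running sum for the minimum-phase sequence is never smaller than that of the other two at any $m$.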

⁸ Because we assume causality and stability, the poles lie inside the unit circle. Different phase functions, for a specified magnitude, therefore are not contributed by the poles.

⁹ It is of interest to note that a sufficient but not necessary condition for a causal sequence to be minimum phase is that $|h[0]| > \sum_{n=1}^{\infty} |h[n]|$ [9].

EXAMPLE 2.11 An example comparing a mixed-phase impulse response $h[n]$, having poles inside and outside the unit circle, with its minimum-phase reference $h_{\text{rmp}}[n]$ is given in Figure 2.9. The minimum-phase sequence has pole pairs at $0.95e^{\pm j0.1}$ and $0.95e^{\pm j0.3}$. The mixed-phase sequence has pole pairs at $0.95e^{\pm j0.1}$ and $\frac{1}{0.95}e^{\pm j0.3}$. The minimum-phase sequence (a) is concentrated to the right of the origin and in this case is less “dispersed” than its non-minimum-phase counterpart (c). Panels (b) and (d) show that the frequency response magnitudes of the two sequences are identical. As we will see later in the text, there are perceptual differences in speech synthesis between the fast and gradual “attack” of the minimum-phase and mixed-phase sequences, respectively. ▲

Figure 2.9 Comparison in Example 2.11 of (a) a minimum-phase sequence $h_{\text{rmp}}[n]$ with (c) a mixed-phase sequence $h[n]$ obtained by flipping one pole pair of $h_{\text{rmp}}[n]$ outside the unit circle to its conjugate reciprocal location. Panels (b) and (d) show the frequency response magnitudes of the minimum- and mixed-phase sequences, respectively. [Plots omitted. Panels (a) and (c): amplitude versus time (ms). Panels (b) and (d): amplitude (dB) versus frequency (Hz), 0 to 5000 Hz.]
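The two sequences of Example 2.11 can be approximated numerically by sampling their frequency responses on the unit circle and inverting with an FFT, so that the stable (two-sided) inverse is obtained for the mixed-phase case. Several details here are assumptions and not part of the example: the pole angles are taken to be in radians, the FFT length is arbitrary, and the mixed-phase response is divided by $0.95^{2}$ so that the two magnitudes coincide exactly.

```python
import numpy as np

n_fft = 4096                                  # long enough that time aliasing is negligible
w = 2 * np.pi * np.arange(n_fft) / n_fft
r = 0.95

def pole_pair_response(radius, angle):
    """Frequency response of 1/((1 - p z^-1)(1 - conj(p) z^-1)), p = radius*exp(j*angle)."""
    p = radius * np.exp(1j * angle)
    zinv = np.exp(-1j * w)
    return 1.0 / ((1 - p * zinv) * (1 - np.conj(p) * zinv))

H_rmp = pole_pair_response(r, 0.1) * pole_pair_response(r, 0.3)
# Pole pair at angle 0.3 moved to radius 1/0.95; the gain r**2 equalizes the magnitudes.
H_mix = pole_pair_response(r, 0.1) * pole_pair_response(1 / r, 0.3) / r ** 2

# Inverse DFT of unit-circle samples gives the stable sequence; negative-time
# samples wrap around to the second half of the array.
h_rmp = np.fft.ifft(H_rmp).real
h_mix = np.fft.ifft(H_mix).real

def left_fraction(h):
    """Fraction of the energy located at negative time indices."""
    return np.sum(h[n_fft // 2:] ** 2) / np.sum(h ** 2)

print("energy left of origin, minimum phase:", left_fraction(h_rmp))
print("energy left of origin, mixed phase:  ", left_fraction(h_mix))
```

Under these assumptions the minimum-phase sequence has essentially no energy before the origin, whereas the mixed-phase sequence carries a substantial fraction there, which corresponds to the more gradual onset described in the example.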


2.8.3 FIR Filters

There are two classes of digital filters: finite impulse response (FIR) and infinite impulse response (IIR) filters [7],[10]. The impulse response of an FIR filter has finite duration and corresponds to having no denominator in the rational function $H(z)$, i.e., there is no feedback in the difference Equation (2.21). This results in the reduced form

$$y[n] = \sum_{r=0}^{M} \beta_{r}\, x[n-r]. \tag{2.24}$$

Implementing such a filter thus requires simply a train of delay, multiply, and add operations. By applying the unit sample input and interpreting the output as the sum of weighted delayed unit samples, we obtain the impulse response given by

$$h[n] = \begin{cases} \beta_{n}, & 0 \le n \le M \\ 0, & \text{otherwise.} \end{cases}$$
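The delay, multiply, and add structure of Equation (2.24) can be written out directly. The following is a minimal sketch with hypothetical coefficient values; the function name `fir_filter` is illustrative, and the input is assumed to be zero before $n = 0$.

```python
import numpy as np

def fir_filter(beta, x):
    """y[n] = sum_{r=0}^{M} beta[r] * x[n - r], with x[n] = 0 for n < 0."""
    M = len(beta) - 1
    y = np.zeros(len(x))
    for n in range(len(x)):
        for r in range(M + 1):
            if n - r >= 0:                    # skip samples before the start of x
                y[n] += beta[r] * x[n - r]    # delay by r, multiply, add
    return y

beta = np.array([0.5, 1.0, -0.25, 0.125])     # hypothetical coefficients, M = 3

# Unit sample input: the output is the impulse response h[n].
delta = np.zeros(8)
delta[0] = 1.0
h = fir_filter(beta, delta)
print("h[n] =", h)

# The same operation is a convolution truncated to the input length.
rng = np.random.default_rng(0)
x = rng.standard_normal(16)
y_loop = fir_filter(beta, x)
y_conv = np.convolve(beta, x)[:len(x)]
```

The unit sample response reproduces the coefficients $\beta_{n}$ for $0 \le n \le M$ and is zero afterwards, as the expression above states.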

Because $h[n]$ is bounded and confined to the interval $0 \le n \le M$, it is causal and stable. The corresponding rational transfer function in Equation (2.22) reduces to the form

$$H(z) = A z^{-r} \prod_{k=1}^{M_i} (1 - a_k z^{-1}) \prod_{k=1}^{M_o} (1 - b_k z)$$

with $M_i + M_o = M$ and with zeros inside and outside the unit circle; the ROC is the entire z-plane except at the only possible poles, $z = 0$ or $z = \infty$.

FIR filters can be designed to have perfect linear phase. For example, if we impose on the impulse response symmetry of the form $h[n] = h[M - n]$, then under the simplifying assumption that $M$ is even (Exercise 2.14), $H(\omega) = A(\omega)e^{-j\omega(M/2)}$ where $A(\omega)$ is purely real, implying that phase distortion will not occur due to filtering [10], an important property in speech processing.¹⁰
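The linear-phase property is easy to verify for a particular symmetric filter. In the sketch below the coefficient values are hypothetical; the code removes the delay term $e^{-j\omega(M/2)}$ from the frequency response and compares what remains with the cosine series implied by the symmetry $h[n] = h[M - n]$.

```python
import numpy as np

h = np.array([1.0, -2.0, 3.0, 5.0, 3.0, -2.0, 1.0])   # hypothetical, symmetric, M = 6
M = len(h) - 1
w = np.linspace(0, np.pi, 256)
n = np.arange(M + 1)

H = np.exp(-1j * np.outer(w, n)) @ h        # H(w) = sum_n h[n] exp(-j w n)
A = H * np.exp(1j * w * M / 2)              # strip the linear-phase factor

# Closed form for even M: A(w) = h[M/2] + 2 * sum_k h[M/2 - k] cos(k w).
A_closed = h[M // 2] + 2 * sum(h[M // 2 - k] * np.cos(k * w)
                               for k in range(1, M // 2 + 1))

print("largest imaginary part of A:", np.max(np.abs(A.imag)))
print("A(0) =", A.real[0], " A(pi) =", A.real[-1])
```

For this example $A(\omega)$ is real to within rounding error but changes sign between $\omega = 0$ and $\omega = \pi$, so a real $A(\omega)$ is not necessarily a positive one.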

2.8.4 IIR Filters

IIR filters include the denominator term in $H(z)$ and thus have feedback in the difference equation representation of Equation (2.21). Because symmetry is required for linear phase, most¹¹ IIR filters will not have linear phase since they are right-sided and infinite in duration.

¹⁰ This does not mean that $A(\omega)$ is positive. However, if the filter $h[n]$ has most of its spectral energy where $A(\omega) > 0$, then little speech phase distortion will occur.

¹¹ A class of linear phase IIR filters has been shown to exist [1]. The transfer function for this filter class, however, is not rational and thus does not have an associated difference equation.

Generally, IIR filters have both poles and zeros. As we noted earlier for the special case where the number of zeros is less than the number of poles, the system function $H(z)$ can be expressed in a partial fraction expansion as in Equation (2.18). Under this condition, for causal systems, the impulse response can be written in the form

$$h[n] = \sum_{k=1}^{N_i} A_k c_k^{n} u[n]$$

where $c_k$ is generally complex so that the impulse response is a sum of decaying complex exponentials. Equivalently, because $h[n]$ is real, it can be written by combining complex conjugate pairs as a set of decaying sinusoids of the form

$$h[n] = \sum_{k=1}^{N_i/2} B_k |c_k|^{n} \cos(\omega_k n + \phi_k)\, u[n]$$

where we have assumed no real poles and thus $N_i$ is even. Given a desired spectral magnitude and phase response, there exist numerous IIR filter design methods [7],[10].
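Both impulse-response forms can be compared against the recursive difference equation for a small example. The sketch below makes several assumptions: the poles and the single zero are hypothetical, the poles are distinct, and the helper names `residues` and `impulse_response` are illustrative. For a conjugate pole pair with residue $A_k$, the sinusoidal form uses $B_k = 2|A_k|$, $\omega_k = \angle c_k$, and $\phi_k = \angle A_k$.

```python
import numpy as np

# Hypothetical stable, causal IIR filter: two conjugate pole pairs, one zero.
poles = np.array([0.95 * np.exp(1j * 0.1), 0.95 * np.exp(-1j * 0.1),
                  0.90 * np.exp(1j * 0.3), 0.90 * np.exp(-1j * 0.3)])
a = np.real(np.poly(poles))        # denominator coefficients in powers of z^-1
b = np.array([1.0, 0.5])           # numerator: fewer zeros than poles

def residues(b, poles):
    """Residues A_k of B(z) / prod_k (1 - c_k z^-1), assuming distinct poles."""
    out = []
    for i, c in enumerate(poles):
        num = np.polyval(b[::-1], 1.0 / c)             # B(z) evaluated at z = c
        den = np.prod(1.0 - np.delete(poles, i) / c)   # remaining pole factors at z = c
        out.append(num / den)
    return np.array(out)

def impulse_response(b, a, n_samples):
    """Impulse response computed from the recursive difference equation."""
    h = np.zeros(n_samples)
    for n in range(n_samples):
        acc = b[n] if n < len(b) else 0.0              # feedforward part for a unit sample
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * h[n - k]                 # feedback part
        h[n] = acc / a[0]
    return h

n = np.arange(60)
A = residues(b, poles)

# Sum of decaying complex exponentials.
h_exp = np.real(sum(Ak * ck ** n for Ak, ck in zip(A, poles)))

# Sum of decaying sinusoids, one term per conjugate pair.
upper = poles.imag > 0
h_cos = sum(2 * abs(Ak) * abs(ck) ** n * np.cos(np.angle(ck) * n + np.angle(Ak))
            for Ak, ck in zip(A[upper], poles[upper]))

h_ref = impulse_response(b, a, len(n))
print("max |h_exp - h_ref|:", np.max(np.abs(h_exp - h_ref)))
print("max |h_cos - h_ref|:", np.max(np.abs(h_cos - h_ref)))
```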

In the implementation of IIR filters, there exists more flexibility than with FIR filters. A “direct-form” method is seen in the recursive difference equation itself [7],[10]. The partial fraction expansion of Equation (2.18) gives another implementation that, as we will see later in the text, is particularly useful in a parallel resonance realization of a vocal tract transfer function.

Suppose, for example, that the number of poles in $H(z)$ is even and that all poles occur in complex conjugate pairs. Then we can alter the partial fraction expansion in Equation (2.18) to take the form

$$H(z) = \sum_{k=1}^{N_i/2} \frac{A_k (1 - p_k z^{-1})}{(1 - c_k z^{-1})(1 - c_k^{*} z^{-1})} = \sum_{k=1}^{N_i/2} \frac{A_k (1 - p_k z^{-1})}{1 - u_k z^{-1} + v_k z^{-2}} \tag{2.25}$$

which represents $N_i/2$ second-order IIR filters in parallel. Other digital filter implementation structures are introduced as needed in speech analysis/synthesis schemes throughout the text.
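A parallel realization of this kind can be sketched by combining the partial-fraction terms of each conjugate pole pair into one real second-order section and summing the section outputs. The filter below is hypothetical and the helper `difference_equation` is illustrative. For a pole $c_k$ with residue $R_k$, the combined section has numerator $2\,\mathrm{Re}(R_k) - 2\,\mathrm{Re}(R_k c_k^{*})z^{-1}$ and denominator coefficients $u_k = 2\,\mathrm{Re}(c_k)$ and $v_k = |c_k|^{2}$.

```python
import numpy as np

def difference_equation(b, a, x):
    """Direct-form recursion: a[0] y[n] = sum_r b[r] x[n-r] - sum_{k>=1} a[k] y[n-k]."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = sum(b[r] * x[n - r] for r in range(len(b)) if n - r >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y[n] = acc / a[0]
    return y

# Hypothetical filter: two conjugate pole pairs, one zero, distinct poles.
poles = np.array([0.95 * np.exp(1j * 0.1), 0.95 * np.exp(-1j * 0.1),
                  0.90 * np.exp(1j * 0.3), 0.90 * np.exp(-1j * 0.3)])
a = np.real(np.poly(poles))
b = np.array([1.0, 0.5])

# One real second-order section per conjugate pair.
sections = []
for i, c in enumerate(poles):
    if c.imag <= 0:
        continue                                   # the conjugate is folded into its partner
    res = np.polyval(b[::-1], 1.0 / c) / np.prod(1.0 - np.delete(poles, i) / c)
    num = np.array([2 * res.real, -2 * (res * np.conj(c)).real])
    den = np.array([1.0, -2 * c.real, abs(c) ** 2])   # 1 - u_k z^-1 + v_k z^-2
    sections.append((num, den))

rng = np.random.default_rng(0)
x = rng.standard_normal(200)
y_direct = difference_equation(b, a, x)
y_parallel = sum(difference_equation(num, den, x) for num, den in sections)
print("sections:", len(sections))
print("max |y_parallel - y_direct|:", np.max(np.abs(y_parallel - y_direct)))
```

Summing the outputs of the second-order sections reproduces the direct-form output to within rounding error, so the two structures realize the same transfer function.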
