978-1-4799-7447-4/14/$31.00 ©2014 IEEE
Homomorphic Filtering for Extracting Javanese Gong
Wave Signals
Matias H.W. Budhiantho¹ and Gunawan Dewantoro²
¹,²Department of Electronics and Computer Engineering, Satya Wacana Christian University
Salatiga, Indonesia [email protected] [email protected]
Abstract—The gong is a unique percussion instrument of the gamelan because of the wave-like sound it produces immediately after being struck. This sound is characteristic of each gong and varies with the gong's place of origin. The phenomenon may arise from primary beats between partials and/or from mistuned overtone frequencies. However, few studies have rigorously located the beat, or the set of beat frequencies, in order to better understand the perceptible beat. This study traces the beat-related components in the quefrency domain using a set of homomorphic operations. First, an acoustic measurement was conducted, and the recorded signal was analyzed with homomorphic operations to obtain its cepstral profile. The rapidly varying frequency part was then filtered out to show the location of a set of beats. The cepstrum also shows a prominent peak at 0.01075 s, which corresponds to a harmonicity of 93.02 Hz. This agrees with the measured fundamental frequency of the Gong Kempul.
Keywords—gong kempul; homomorphic filtering; cepstrum; beat frequency
I. INTRODUCTION
The sound of the Javanese gong plays an important role in Javanese culture. In a gamelan orchestra, the gong sound instructs, marks, and ends certain parts of a gamelan composition [1]. The gong is also sounded to declare the opening and closing of important religious and secular events and various rituals. The Javanese associate the roaring, wavelike sound of the gong with Bima’s giggle, which creates a feeling of grandeur yet calm. Bima is known as a bold but honest and just hero, a great legend in the stories of Javanese shadow puppetry (wayang). The best-sounding gongs can produce as many as 12 to 13 repetition cycles of the wavelike sound. A gong that cannot produce a wavelike sound is considered to be only “howling” [2].
A temporal and spectral description of the Javanese gong kempul was given in [3]. The gong has a set of approximately harmonic partials with a fundamental frequency of about 93 Hz. The partials, both harmonic and nonharmonic, beat together to produce the roaring sound. However, the origins of the beats were not clear, since it was difficult to detect the partials at low frequency. Bhalke et al. [4] present classification and recognition of monophonic isolated musical
are processed with binaural impulse responses. The extraction of low-order from higher-order acoustic impulse responses is justified by applying the theory of the clustering of the zeros of random-coefficient polynomials about the unit circle. Fraile et al. [10] approach the problem of inverse filtering by homomorphic prediction. Although not much favoured in recent literature, this approach offers two potential advantages: it does not require previous pitch detection, and it does not rely on any assumptions about the spectral envelope of the glottal signal. Its performance is assessed and compared with that of an adaptive inverse filtering method using synthetic voices produced with a biomechanical voice production model. The results indicate that the performance of inverse filtering based on homomorphic prediction is within the range of that of adaptive inverse filtering and that it behaves better when the spectral envelope of the glottal signal does not suit an all-pole model of predefined order. Hassani et al. [11] use homomorphic filtering to produce time-domain intensity envelopes of heart sounds and separate the sounds into four overlapping parts: the first heart sound, the systolic period, the second heart sound, and the diastolic period.
This study uses homomorphic filtering to obtain the fast-varying frequency part of the gong signal. The paper is organized as follows: Section II introduces the homomorphic systems that lead to the fundamentals of cepstral analysis; Section III describes the spectral analysis of the gong signal in Matlab; Section IV presents and discusses the resulting cepstral separation. Finally, Section V concludes the paper, emphasizing the beat frequencies of the gong that give the audience its clearly perceptible beat sounds.
II. HOMOMORPHIC SYSTEMS [12]
A. Complex Cepstrum
The word “cepstrum” is formed by reversing the first four letters of the word “spectrum”, because in general we find ourselves operating on the frequency side in ways customary on the time side and vice versa. Consider a stable sequence x(t) whose Fourier transform is
$$X(j\omega) = |X(j\omega)|\, e^{j\angle X(j\omega)}. \tag{1}$$
The complex cepstrum corresponding to x(t) is defined to be the stable sequence $\hat{x}(t)$ whose Fourier transform is
$$\hat{X}(j\omega) = \log[X(j\omega)] = \log|X(j\omega)| + j\angle X(j\omega). \tag{2}$$
Therefore, the complex cepstrum of x(t) is given by the inverse Fourier transform integral
$$\hat{x}(t) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \log[X(j\omega)]\, e^{j\omega t}\, d\omega. \tag{3}$$
In contrast to the complex cepstrum, the cepstrum $c_x(t)$ of a signal (sometimes referred to as the real cepstrum) is defined as the inverse Fourier transform of the logarithm of the magnitude of the Fourier transform, i.e.,
$$c_x(t) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \log|X(j\omega)|\, e^{j\omega t}\, d\omega. \tag{4}$$
Since the cepstrum is based only on the Fourier transform magnitude, it is not invertible, i.e., x(t) cannot be recovered from $c_x(t)$. However, the cepstrum is easier to compute than the complex cepstrum.
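A discrete counterpart of the real cepstrum defined above can be sketched in a few lines. The example is illustrative only and is not the authors' implementation: it replaces the Fourier integral with a naive N-point DFT, and the helper names (`dft`, `real_cepstrum`), the `floor` guard against log(0), and the test signal (a unit impulse plus an echo of gain 0.5 delayed by 8 samples) are all assumptions made for the illustration.

```python
import cmath
import math

def dft(x, sign=-1):
    """Naive O(N^2) DFT (sign=-1) or unnormalised inverse DFT (sign=+1)."""
    n = len(x)
    w = [cmath.exp(sign * 2j * math.pi * k / n) for k in range(n)]
    return [sum(x[k] * w[(m * k) % n] for k in range(n)) for m in range(n)]

def real_cepstrum(x, floor=1e-12):
    """Inverse DFT of log|DFT(x)|; `floor` guards against log(0)."""
    n = len(x)
    log_mag = [math.log(max(abs(v), floor)) for v in dft(x)]
    return [v.real / n for v in dft(log_mag, sign=+1)]

# Hypothetical test signal: unit impulse plus an echo of gain 0.5 at 8 samples.
N, delay = 64, 8
x = [0.0] * N
x[0], x[delay] = 1.0, 0.5

c = real_cepstrum(x)
# Largest cepstral value away from the origin, searched over the first half.
peak_quefrency = max(range(1, N // 2), key=lambda k: c[k])
```

For this test signal the largest cepstral value away from the origin falls at the echo delay, which is the same mechanism by which a periodicity in a signal appears as a peak at the corresponding quefrency.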
B. Homomorphic Deconvolution
The principle of generalized superposition, as stated for a system, requires that, if H is the system transformation, then for any two inputs x1(t) and x2(t) and any scalar c,
$$H[x_1(t)\,\Delta\,x_2(t)] = H[x_1(t)]\,\diamond\,H[x_2(t)] \tag{5}$$
$$H[c \circ x_1(t)] = c \bullet H[x_1(t)] \tag{6}$$
where Δ is a rule for combining inputs with each other (e.g., addition, multiplication, convolution, etc.) and ∘ is a rule for combining inputs with scalars. Similarly, ◊ denotes a rule for combining system outputs and • is a rule for combining outputs with scalars. Systems that obey a generalized principle of superposition are referred to as homomorphic systems, since they can be represented by algebraically linear (homomorphic) mappings between input and output signal spaces. Any homomorphic system can be decomposed as a cascade of three homomorphic systems, as indicated in Fig. 1, with the first system depending only on the input operation Δ, the last system depending only on the output operation ◊, and the middle system being a linear system.
Fig. 1. Canonic representation of homomorphic systems.
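The system $D_\Delta[\cdot]$ is referred to as the characteristic system associated with the input operation Δ, and $D_\diamond^{-1}[\cdot]$ is the inverse of $D_\diamond[\cdot]$, the characteristic system associated with the output operation ◊. Such a decomposition is very useful in applying homomorphic systems to deconvolution. In particular, $D_\Delta[\cdot]$ can be chosen with convolution as the operation of superposition at the input and addition as the operation of superposition at the output. As in (2), if the complex logarithm is appropriately defined, then when $X(j\omega) = X_1(j\omega)X_2(j\omega)$,
$$\hat{X}(j\omega) = \log[X_1(j\omega)] + \log[X_2(j\omega)] = \hat{X}_1(j\omega) + \hat{X}_2(j\omega), \tag{7}$$
so that the complex cepstrum of the convolution is the sum
$$\hat{x}(t) = \hat{x}_1(t) + \hat{x}_2(t). \tag{8}$$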
The operations that define the complex cepstrum are shown in Fig. 2. The cascade of Fourier transform, complex logarithm, and inverse Fourier transform can be thought of as a representation of the characteristic system D*[.]. Each of the three basic component transformations is also homomorphic, and the corresponding input and output operations are indicated in Fig. 2(a). The Fourier transform maps convolution to multiplication; the complex logarithm converts multiplication to addition; and the inverse Fourier transform is a linear transformation in the ordinary sense. The third system in the cascade of Fig. 1 is the inverse of the characteristic system for convolution. Consequently, its input must be the complex cepstrum of its output. Then, the Fourier transform of the output is
$$Y(j\omega) = \exp[\hat{Y}(j\omega)]. \tag{9}$$
Fig. 2. Representation of (a) characteristic system for convolution and (b) its inverse
The system L in Fig. 1 is a linear system in the usual sense. If the linear system removes $\hat{x}_2(t)$ completely from the additive combination of (8), then x2(t) is removed from the convolutional combination of the input x(t) = x1(t)*x2(t). The class of linear frequency-invariant systems, known as lifters, is useful for separating the components of the complex cepstrum. The independent variable of the cepstrum, a time index, is called quefrency (“lifter” and “quefrency” are formed by interchanging letters in “filter” and “frequency”, respectively).
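As a worked illustration (a textbook-style example, not a derivation from the measurements in this paper), suppose the second component is a single echo with hypothetical gain a, where |a| < 1, and hypothetical delay T. The convolution then becomes a sum in the cepstral domain:

$$
\begin{aligned}
x(t) &= x_1(t) * x_2(t), \qquad x_2(t) = \delta(t) + a\,\delta(t-T),\\
X(j\omega) &= X_1(j\omega)\,\bigl[1 + a\,e^{-j\omega T}\bigr],\\
\hat{X}(j\omega) &= \log[X_1(j\omega)] + \log\bigl[1 + a\,e^{-j\omega T}\bigr]
 = \hat{X}_1(j\omega) + \sum_{k=1}^{\infty} \frac{(-1)^{k+1} a^{k}}{k}\, e^{-j\omega k T},\\
\hat{x}(t) &= \hat{x}_1(t) + \sum_{k=1}^{\infty} \frac{(-1)^{k+1} a^{k}}{k}\, \delta(t - kT).
\end{aligned}
$$

The echo contributes isolated terms at quefrencies T, 2T, 3T, and so on. A lifter that passes only quefrencies below T therefore retains $\hat{x}_1(t)$ and removes the echo, provided $\hat{x}_1(t)$ is concentrated at low quefrency.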
III. SPECTRAL ANALYSIS OF GONG SIGNALS
We recorded sample tones of the Gong Kempul at 48 kHz using the ARTA PC software. A controlled gong striker was used to exert a controlled impact force on the gong boss, as seen in Fig. 3. A measurement condenser microphone, powered externally by phantom power, acquired the acoustic signal in the near field from behind the boss. A sound card then interfaced and digitized this signal so that it could be processed on a computer.
Fig. 3. The Gong Kempul and its controlled striker
The response to the impulsive strike was recorded and analyzed in both the time and frequency domains, as shown in Fig. 4.
Fig. 4. Measured gong signal in the (a) time domain and (b) frequency domain
The time signal envelope in Fig. 4(a) fluctuates at approximately 3.5 Hz, in agreement with the fluctuation of the signal energy decay [3]. A properly tuned gong produces numerous partials with harmonic and inharmonic frequencies. The slowly decaying sound of the gong allows those partials to interfere, which results in various beats. There are two major kinds of beats: primary beats and mistuned octaves. Primary beats occur between closely spaced partials in the spectrum, while mistuned octaves occur between a tone and its mistuned harmonics [13]. Table 1 shows the 18 consistent partials of the gong, with 4 slightly mistuned octaves.
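TABLE I. PARTIALS AND THEIR INTERACTING BEATS
| No | Partials (Hz) | Harmonics | Primary Beats (Hz) | Mistuned octaves (Hz) |
|---|---|---|---|---|
| 1 | 93.02 | 1 | - | |
| 2 | 158.2 | | 65.18 | |
| 3 | 160.5 | | 2.3 | |
| 4 | 182.7 | | 22.1 | |
| 5 | 186 | 2 | 3.3 | 0.04 |
| 6 | 213.9 | | 27.9 | |
| 7 | 242.9 | | 29 | |
| 8 | 248.2 | | 5.3 | |
| 9 | 251.6 | | 3.3 | |
| 10 | 253.4 | | 1.8 | |
| 11 | 275.8 | | 22.4 | |
| 12 | 279.1 | 3 | 3.3 | 0.04 |
| 13 | 368.9 | | 89.8 | |
| 14 | 372.1 | 4 | 3.2 | 0.02 |
| 15 | 433.9 | | 61.8 | |
| 16 | 462.3 | | 28.4 | |
| 17 | 464.9 | 5 | 2.6 | 0.2 |
| 18 | 475.7 | | 10.8 | |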
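The two beat columns of Table I can be recomputed from the partials alone. The sketch below assumes that a primary beat is the difference between a partial and the partial immediately below it, and that a mistuned octave is the offset of a near-harmonic partial from the corresponding integer multiple of the fundamental. The variable names are illustrative, and the 0.3 Hz to 10 Hz window is the perceptibility range quoted in the text.

```python
# Partials (Hz) copied from Table I.
partials = [93.02, 158.2, 160.5, 182.7, 186.0, 213.9, 242.9, 248.2, 251.6,
            253.4, 275.8, 279.1, 368.9, 372.1, 433.9, 462.3, 464.9, 475.7]
# Partials that Table I marks as harmonics, with their harmonic numbers.
harmonic_number = {186.0: 2, 279.1: 3, 372.1: 4, 464.9: 5}

fundamental = partials[0]

# Primary beat: difference between each partial and the one just below it.
primary_beats = [round(hi - lo, 2) for lo, hi in zip(partials, partials[1:])]

# Mistuned octave: offset of a near-harmonic partial from k * fundamental.
mistuning = {f: round(abs(f - k * fundamental), 2)
             for f, k in harmonic_number.items()}

# Primary beats inside the perceptible window (0.3 Hz up to 10 Hz).
audible = [b for b in primary_beats if 0.3 <= b < 10.0]
```

Two recomputed differences (22.2 Hz and 3.4 Hz) differ from the tabulated 22.1 Hz and 3.3 Hz in the last digit, presumably because the tabulated partials are rounded. All four mistuned-octave offsets fall below the 0.3 Hz lower limit.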
A beat is perceptible as a fluctuating sound if its frequency is less than 10 Hz [14-15]. In the case of this Gong Kempul, since the sound becomes very weak after 3 s, a beat must complete at least one cycle within that time, so the perceptible beat frequency cannot be lower than about 1/3 Hz, i.e. roughly 0.3 Hz. However, we still cannot be certain where the wavelike sound of the gong originates. Hence, a cepstrum-based approach was employed to extract the beats of the gong signal, i.e. to seek components below 10 Hz in the rapidly varying part of the complex logarithm of X(jω).
IV. HOMOMORPHIC FILTERING
The signal x(t) in Fig. 4(a) was truncated to N = 32768 samples, and an N-point DFT was used to compute the cepstrum of the signal, as shown in Fig. 5.
Fig. 5. Cepstrum of the gong signal
The peak in the cepstrum at 0.01075 s indicates that the gong signal is pitched with that period or, in other words, at a frequency of 1/0.01075 ≈ 93.02 Hz, which coincides with the measured fundamental frequency shown in Table 1. The slowly varying component of the complex logarithm corresponds to the “low-time” portion of the cepstrum. Correspondingly, the more rapidly varying component of the complex logarithm corresponds to the “high-time” portion of the cepstrum. This suggests that the two convolved components of x(t) can be separated by applying linear filtering to the logarithm of the Fourier transform; equivalently, the components of the complex cepstrum can be separated by time gating, i.e., by frequency-invariant linear filtering. Fig. 6 depicts the operations involved in separating the components of a convolution by time gating the cepstrum of the gong signal, which is also known as liftering.
Fig. 6. Time domain representation of cepstrum-gating
A. Lowpass Frequency-Invariant Filtering
Fig. 7 shows the cepstrum gating required to recover the slowly varying part of the complex logarithm. The smoothed spectrum, or spectral envelope, obtained by frequency-invariant lowpass filtering with the cutoff below the cepstrum peak, is shown in Fig. 8.
Fig. 7. Low-time gating with cut-off quefrency 0.01 s (magnitude versus quefrency in s; signal cepstrum with the low-time pass filter superimposed).
Fig. 8. The magnitude of the spectrum with the corresponding smoothed spectrum superimposed (magnitude versus frequency in Hz; signal spectrum and spectrum envelope).
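The low-time gating of this subsection can be sketched on a synthetic signal. This is a minimal illustration under stated assumptions, not the processing applied to the gong recording: the signal is a hypothetical decaying response `h` plus a half-amplitude copy delayed by 16 samples, the DFT is a naive O(N^2) implementation, and the gate keeps quefrencies below a hypothetical cutoff of 12 samples together with their mirror image, so that the smoothed log spectrum stays real.

```python
import cmath
import math

def dft(x, sign=-1):
    """Naive O(N^2) DFT (sign=-1) or unnormalised inverse DFT (sign=+1)."""
    n = len(x)
    w = [cmath.exp(sign * 2j * math.pi * k / n) for k in range(n)]
    return [sum(x[k] * w[(m * k) % n] for k in range(n)) for m in range(n)]

def log_magnitude(x):
    """Log of the DFT magnitude, floored to avoid log(0)."""
    return [math.log(max(abs(v), 1e-12)) for v in dft(x)]

def real_cepstrum(x):
    """Inverse DFT of the log magnitude spectrum."""
    n = len(x)
    return [v.real / n for v in dft(log_magnitude(x), sign=+1)]

def lowtime_envelope(cep, cutoff):
    """Zero all quefrencies at or above `cutoff` (keeping the mirrored low
    quefrencies) and transform back to a smoothed log spectrum."""
    n = len(cep)
    gated = [c if (k < cutoff or k > n - cutoff) else 0.0
             for k, c in enumerate(cep)]
    return [v.real for v in dft(gated)]

# Hypothetical signal: decaying response plus a circularly delayed copy.
N, echo_delay = 128, 16
h = [0.6 ** k for k in range(N)]
x = [h[k] + 0.5 * h[(k - echo_delay) % N] for k in range(N)]

envelope = lowtime_envelope(real_cepstrum(x), cutoff=12)
# Rapidly varying part of the log spectrum that the low-time gate removed.
ripple = [a - b for a, b in zip(log_magnitude(x), envelope)]
```

With the cutoff placed below the echo quefrency, `envelope` follows the log spectrum of `h` closely, while `ripple` holds the rapidly varying part that the high-time gate of the next subsection is meant to isolate.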
B. Highpass Frequency-Invariant Filtering
To obtain the beat frequencies below 10 Hz, we need to further explore the rapidly varying part of the complex logarithm of X(jω). Care is needed in choosing the length of the high-time gate. If the segment is too long, the properties of the signal will change too much across the segment. If the segment is too short, it will not contain enough data to reveal the beat frequencies. The highpass frequency-invariant filter l(t) is depicted in Fig. 9.
[Plot: signal cepstrum, magnitude versus quefrency (s), with the peak marked at 0.01075 s, magnitude 0.05277. Block diagram: x(t) → D*[.] → x̂(t) → l(t) → D*⁻¹[.] → y(t).]
Fig. 9. Highpass system.
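The gate l(t) can be written as a rectangular window between the two quefrencies N1 and N2. The sketch below is an assumption-laden illustration: the function names, the sampling rate `fs = 1000` Hz, and the cepstrum length of 2000 samples are hypothetical values chosen for readability, not the parameters of the measurement.

```python
def highpass_gate(n_samples, fs, n1_seconds, n2_seconds):
    """Return l[n]: 1.0 for quefrencies in [N1, N2), 0.0 elsewhere."""
    start = round(n1_seconds * fs)                    # N1 in samples
    stop = min(round(n2_seconds * fs), n_samples)     # N2 in samples, clipped
    return [1.0 if start <= n < stop else 0.0 for n in range(n_samples)]

def lifter(cepstrum, gate):
    """Time-gate a cepstrum by pointwise multiplication with the gate."""
    return [c * g for c, g in zip(cepstrum, gate)]

fs = 1000  # hypothetical sampling rate in Hz
gate = highpass_gate(n_samples=2000, fs=fs, n1_seconds=0.01, n2_seconds=1.0)

# Applying the gate to a dummy all-ones cepstrum returns the gate itself.
gated = lifter([1.0] * 2000, gate)
```

For a real cepstrum computed with a DFT, the gate would normally also be applied to the mirrored quefrencies at `n_samples - n`; that detail is omitted here to keep the one-sided shape of l(t).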
Fig. 10. The result of highpass frequency-invariant filtering with N1 = 0.01 s and (a) N2 = 0.1 s, (b) N2 = 0.2 s, (c) N2 = 0.5 s, (d) N2 = 1 s (each panel: magnitude versus frequency, 0 to 10 Hz; panel (d) marks a peak at 0.5859 Hz with magnitude 2.751).
The value of N1 is the same as the cutoff quefrency of the lowpass frequency-invariant filtering, i.e. 0.01 s. The results of the cepstrum gating with various values of N2 are shown in Fig. 10. The gate in Fig. 10(a) is still too short to reveal beat-frequency information. Fig. 10(b) starts to show indications of beat frequencies, but the peaks are not high enough to discriminate. Extending the length of the cepstrum gate brings out some peaks that correspond to beat frequencies, as seen in Fig. 10(c). Finally, Fig. 10(d) has enough information to discriminate and locate the peaks comprehensively. The peaks below 10 Hz in Fig. 10(d) are at 0.5859 Hz, 1.855 Hz, 3.32 Hz, 5.273 Hz, 6.641 Hz, and 8.594 Hz. Three of these frequencies, i.e. 1.855 Hz, 3.32 Hz, and 5.273 Hz, are in good agreement with primary beats listed in Table 1 (1.8 Hz, 3.3 Hz, and 5.3 Hz). However, the other three beat frequencies found in Fig. 10(d) match neither the primary beats nor the mistuned octaves. Another possible source of these beats is therefore a further difference, i.e. a difference of differences produced by a heterodyning process. Selected differences of primary beats were sought and are listed in Table 2.
TABLE II. DIFFERENCES OF PRIMARY BEATS
| No | Differences of Primary Beats (Hz) | Remarks (all primary beats are listed in Table 1) |
|---|---|---|
| 1 | 0.6 | Difference between primary beats no. 16 and no. 7 |
| 2 | 6.6 | Difference between primary beats no. 11 and no. 7 |
| 3 | 8.5 | Difference between primary beats no. 18 and no. 3 |
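The entries of Table II can be checked directly against the primary beats of Table I. In the sketch below, the pairing of each difference with one of the three unexplained peaks of Fig. 10(d) is an assumption based on proximity, and the names `primary_beat`, `pairs`, and `residual` are illustrative.

```python
# Primary beats (Hz) keyed by partial number, copied from Table I.
primary_beat = {3: 2.3, 7: 29.0, 11: 22.4, 16: 28.4, 18: 10.8}

# Pairs named in Table II, each mapped to the nearest peak (Hz) reported
# for Fig. 10(d). The pairing by proximity is an assumption.
pairs = {(16, 7): 0.5859, (11, 7): 6.641, (18, 3): 8.594}

def difference(a, b):
    """Absolute difference between two primary beats, rounded to 0.1 Hz."""
    return round(abs(primary_beat[a] - primary_beat[b]), 1)

differences = {pair: difference(*pair) for pair in pairs}

# Offset between each difference and the peak it is paired with.
residual = {pair: round(abs(differences[pair] - peak), 3)
            for pair, peak in pairs.items()}
```

The three differences reproduce the tabulated 0.6 Hz, 6.6 Hz, and 8.5 Hz, and each lies within 0.1 Hz of its paired peak, which is about the resolution to which the primary beats are tabulated.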
V. CONCLUSION
The wave-like sound of the gong arises from interference between partials, whether harmonic or nonharmonic. Primary beats occur between closely spaced partials in the spectrum, while mistuned octaves occur between a tone and its mistuned harmonics. The rapidly varying frequency part was filtered out using cepstrum gating to show the location of a set of beats. The peaks below 10 Hz indicated by the cepstral profile are at 0.5859 Hz, 1.855 Hz, 3.32 Hz, 5.273 Hz, 6.641 Hz, and 8.594 Hz. These frequencies are in good agreement with either primary beats or differences of primary beats due to a heterodyning process. The mistuned octaves are very small (0.02 Hz to 0.2 Hz), which makes them imperceptible to the human ear. The cepstrum also shows a prominent peak at 0.01075 s, which corresponds to a harmonicity of 93.02 Hz. This matches the measured fundamental frequency of the Gong Kempul. Further research is encouraged to analyze the gong’s partials using numerical analysis that accounts for the gong's geometry and materials.
ACKNOWLEDGEMENT
This research was financially supported by Satya Wacana Christian University under Competitive Grant Scheme No. 045/Penel./Rek./5/II/2014.
REFERENCES
[1] N. Sorrel, A Guide to The Gamelan. Portland, Oregon: Amadeus Press, 1990, p. 44.
[2] K.M. Hood, Music of the Roaring Sea: The Evolution of Javanese Gamelan. Wilhelmshaven: Heinrichshofen, 1980.
[3] M.H.W. Budhiantho and G. Dewantoro, “The spectral and temporal description of Javanese Gong Kempul”, 5th International Conference on Information Technology and Electrical Engineering, Yogyakarta, Indonesia, pp. 300-304, October 2013.
[4] D.G. Bhalke, C.B.R. Rao, and D.S. Bormane, “Musical instrument classification using higher order spectra”, 2014 International Conference on Signal Processing and Integrated Networks (SPIN), Noida, pp. 40-45, February 2014.
[5] W.H. Tsai and H.P. Lin, “Background music removal based on cepstrum transformation for popular singer identification”, IEEE T. Audio Speech, vol. 19 (5), pp. 1196-1205, July 2011.
[6] S. Gaikwad, A.V. Chitre, and Y.H. Dandawate, “Classification of Indian classical instruments using spectral and principal component analysis based cepstrum features”, International Conference on Electronic Systems, Signal Processing and Computing Technologies, Nagpur, pp. 276-279, January 2014.
[7] M. Caetano and X. Rodet, “Improved estimation of the amplitude envelope of time-domain signals using true envelope cepstral smoothing”, IEEE International Conference on Acoustics, Speech, and Signal Processing, Prague, pp. 4244-4247, 2011.
[8] B. Bispo, P. Rodrigues, and D. Freitas, “Acoustic feedback cancellation based on cepstral analysis”, Signal Processing: Algorithms, Architecture, and Applications, Poland, pp. 205-209, 2013.
[9] I.J. Kelly and A.M. Boland, “Exploiting randomness in acoustic impulse responses to achieve headphone compensation through deconvolution”, J. Acoust. Soc. Am., vol. 133 (5), pp. 2778-2787, May 2013.
[10] R. Fraile, M. Kob, J.M.G. Arriola, N.S. Lechon, J.I.G. Llorente, and V.O. Ruiz, “Glottal inverse filtering of speech based on homomorphic prediction: A cepstrum-based algorithm not requiring prior detection of either pitch or glottal closure”, Comm. Com. Inf. Sc., vol. 127, pp. 238–251, 2011.
[11] K. Hassani, K. Bajelani, M. Navidbakhsh, D.J. Doyle, and F. Taherian, “Heart sound segmentation based on homomorphic filtering”, Perfusion, vol. 29(4), pp. 1-9, 2014.
[12] A.V. Oppenheim and R.W. Schafer, Discrete-Time Signal Processing. New Jersey: Prentice Hall, Inc, 1989, pp. 768-825.
[13] M.H.W. Budhiantho and G. Dewantoro, “Gong wave signals”, J. Acoust. Soc. Am., vol. 134 (4188), December 2013.
[14] D.E. Hall, Basic Acoustics, 1st ed. New York: Harper & Row Publishers, 1987.
[15] A. H. Benade, Fundamentals of Musical Acoustics, 2nd ed., Dover