4.3 Feature Extraction
4.3.2 Curvelet Transform Based Feature
The discrete curvelet transform is performed on the spectrogram representation of an ECG beat. This results in curvelet coefficients in three layers, namely coarse, detail, and fine. Since a level-three curvelet decomposition is chosen to obtain better identification accuracy, the detail layer is decomposed into thirty-two angular divisions, of which sixteen are symmetric to the remaining sixteen. Figures 4.4 (a), (b) and (c) show the curvelet coefficients in the coarse layer, one of the detail layers and the fine layer, respectively, for the ECG beat shown in Fig. 4.3 (a).
If all the curvelet coefficients of the entire time-frequency plane were used as a feature, the resulting feature vector would have a very large dimension. Thus, for the task of human identification, the curvelet coefficients are instead exploited to form two effective feature vectors: one based on the cross-correlation between two successive columns, and one based on the mean of the elements of each column, in the curvelet domain.
1. Feature based on Cross-Correlation in Curvelet Domain
It would be of interest to develop a feature vector that provides characteristics distinguishable from person to person by considering the similarity information in the curvelet domain. However, extracting such a feature from all the curvelet coefficients is not a trivial task. In signal processing, cross-correlation is one of the most popular ways of measuring the similarity between two sequences as a function of the time lag. The main idea is to utilize the cross-correlation operation to extract features from the human ECG in the curvelet domain.
The precise variations of the ECG that remain uncaptured in the time domain are expected to be better reflected in the curvelet domain sequences represented by the coefficients of the coarse, detail and fine layers shown in Fig. 4.4.
If Y[k] and Z[k] are two M-point curvelet sequences corresponding to two columns of the coarse, detail or fine layer, their cross-correlation is defined as,
It is expected that, in order to reflect the similarity of two sequences, it is sufficient to consider the zero-lag value of the cross-correlation. Ideally, if both columns are taken from the region of the ECG beat corresponding to the P wave, QRS complex or T wave, i.e., a region with high energy, the two sequences would be similar and the zero-lag cross-correlation value obtained in the curvelet domain would be appreciably higher in magnitude. On the other hand, if two adjacent columns are taken from a neighborhood other than the P wave, QRS complex and T wave, i.e., a region with low energy, the zero-lag curvelet cross-correlation value would be small in magnitude. Thus, the value of the zero-lag cross-correlation between two columns in the curvelet domain varies in proportion to the degree of energy variation between these two columns. Hence, the zero-lag values of the cross-correlations computed from the curvelet coefficient sequences of each pair of adjacent columns of the coarse, detail and fine layers are proposed to form the feature vector.

Fig. 4.3: (a)(b)(c) ECG beats for three different persons (voltage in mV versus time), (d)(e)(f) Corresponding spectrograms of (a), (b), (c) (frequency in Hz versus time)

In Figs. 4.5 (a)-(d), the proposed feature vector consisting of the zero-lag values of the cross-correlations obtained from successive pairs of columns of the curvelet coefficients of the coarse, detail and fine layers is shown for four different persons. It is seen from Fig. 4.5 that the proposed features of the four persons show differences in pattern that would be less distinguishable in terms of the curvelet coefficients alone.
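As an illustration of the feature described above, the following Python sketch computes the zero-lag cross-correlation for every pair of successive columns of one layer. It assumes the unnormalized convention r[l] = sum over k of Y[k] Z[k+l], under which the zero-lag value reduces to the inner product of the two columns; the normalization used in this work may differ. The function name `zero_lag_xcorr_features` and the toy array are hypothetical, and real-valued coefficients are assumed.

```python
import numpy as np

def zero_lag_xcorr_features(layer):
    """Zero-lag cross-correlation of each pair of successive columns.

    layer: 2-D array of real-valued coefficients, shape (M, n_cols).
    Returns a 1-D array of length n_cols - 1.
    """
    layer = np.asarray(layer, dtype=float)
    # At lag 0 the unnormalized cross-correlation is an inner product,
    # so multiply column j with column j + 1 and sum over the rows.
    return np.sum(layer[:, :-1] * layer[:, 1:], axis=0)

# Toy 3 x 3 "layer" (hypothetical values, not real curvelet coefficients)
demo = np.array([[1.0, 2.0, 0.0],
                 [0.0, 1.0, 1.0],
                 [2.0, 0.0, 3.0]])
features = zero_lag_xcorr_features(demo)
print(features)
```

Applying the same function to the coarse, detail and fine layers and concatenating the three outputs would give one feature vector per beat under these assumptions.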
Fig. 4.4: (a) Coarse Layer (b) Detail Layer (c) Fine Layer
From Fig. 4.5, it is seen that if all the zero-lag values of cross-correlation in the curvelet domain are considered, the dimension of the feature set remains quite large. Feature reduction is therefore required, retaining the information that separates one person from another while maintaining compactness for an individual. For this purpose, we propose two reduction schemes: A) reduction based on dominant energy bands and B) reduction employing PCA, described as follows:
A. Reduction of Cross-Correlation Feature Based on Dominant Energy Bands
Analyzing the cross-correlation feature, it is found that there exists redundant information that is not useful for human identification. From Fig. 4.5, it is visible that the zero-lag values of cross-correlation corresponding to the fine layer curvelet coefficients have no definite pattern; they resemble noise, representing the high frequency components of the ECG signal. Therefore, the features corresponding to the fine layer curvelet coefficients are disregarded.

Fig. 4.5: Feature vector consisting of zero-lag values of the cross-correlations obtained from the successive two columns of the curvelet coefficients for (a) Person 1 (b) Person 2 (c) Person 3 (d) Person 4

The zero-lag values of cross-correlation corresponding only to the coarse layer curvelet coefficients are shown in Fig. 4.6 (a)-(d) for four different persons. It is clearly seen from this figure that not all the zero-lag values of cross-correlation in the coarse layer are significant, as most of the amplitudes are very small. Our motivation is to extract the significant zero-lag values from the coarse layer to produce an effective feature set. The marked region in the figure shows a dominant energy lobe or band, which has a definite shape for each person. To capture its shape, the two minima at both ends of the band are determined. Finally, the zero-lag values of cross-correlation that lie within the dominant energy band in the coarse layer are used to construct the reduced feature set. Similar plots of the zero-lag values of cross-correlation corresponding to the detail layer curvelet coefficients are shown in Fig. 4.7 (a)-(d) for four persons. It is seen that in the detail layer, two dominant energy bands having definite shapes for each person exist. The zero-lag values of cross-correlation that reside within these dominant energy bands of the detail layer are captured using the approach mentioned above.
By concatenating three dominant energy bands, one from the coarse layer and two from the detail layer, the reduced feature set is formed.
Fig. 4.6: Zero-lag values of cross-correlation corresponding to coarse layer curvelet coefficients for (a) Person 1 (b) Person 2 (c) Person 3 (d) Person 4

Fig. 4.7: Zero-lag values of cross-correlation corresponding to detail layer curvelet coefficients for (a) Person 1 (b) Person 2 (c) Person 3 (d) Person 4
The reduced feature set thus obtained is illustrated for four persons in Figs. 4.8 (a)-(d). Note that in Fig. 4.5 through Fig. 4.8, forty beats are considered for each person. The length of the reduced feature set is much smaller than that of the feature set containing all the zero-lag values of cross-correlation corresponding to the coarse, detail and fine layer curvelet coefficients, as depicted in Fig. 4.5. It is evident from Fig. 4.8 that the derived feature set has a distinct pattern for each person, which is especially reflected by the dominant energy bands of the feature.
Fig. 4.8: Reduced feature set concatenating zero-lag values of cross-correlation corresponding to dominant energy bands of coarse and detail layers for (a) Person 1 (b) Person 2 (c) Person 3 (d) Person 4
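The band selection in scheme A can be sketched in Python as follows. This is a minimal illustration and not necessarily the procedure used to produce the figures: it assumes the dominant band is the lobe around the global maximum, and it locates the two bounding minima by walking outward from the peak while the values keep decreasing. The function name `dominant_band` and the sample values are hypothetical.

```python
import numpy as np

def dominant_band(values):
    """Return (start, stop) indices of the lobe around the global peak.

    The lobe is bounded by the nearest local minimum on each side of
    the peak; both bounding samples are included in the band.
    """
    v = np.asarray(values, dtype=float)
    peak = int(np.argmax(v))
    left = peak
    while left > 0 and v[left - 1] < v[left]:
        left -= 1            # keep descending towards the left minimum
    right = peak
    while right < len(v) - 1 and v[right + 1] < v[right]:
        right += 1           # keep descending towards the right minimum
    return left, right

# Hypothetical zero-lag values for the coarse layer of one beat
coarse_feature = np.array([0.02, 0.01, 0.05, 0.30, 0.45, 0.20, 0.03, 0.04, 0.01])
start, stop = dominant_band(coarse_feature)
reduced = coarse_feature[start:stop + 1]
print(start, stop, reduced)
```

On noisy data, small ripples would stop the walk early, so some smoothing before the search is likely to be needed in practice.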
B. Reduction of Cross-Correlation Feature Based on PCA
Principal component analysis (PCA) is a well-known and efficient orthogonal linear transformation. It has been shown that PCA is more advantageous than independent component analysis (ICA) and linear discriminant analysis (LDA). It reduces the dimension of the feature space and the correlation among the feature vectors by projecting the original feature space onto a smaller feature subspace through a transformation [24, 38]. PCA transforms the original p-dimensional feature vector into an L-dimensional linear subspace that is spanned by the leading eigenvectors of the covariance matrix of the feature vectors in each cluster.
PCA is theoretically the optimum transform for given data in the least-squares sense. For a data matrix $X^T$ with zero empirical mean, where each row represents a different repetition of the experiment and each column gives the result from a particular probe, the PCA transformation is given by

$$Y^T = X^T W = V \Sigma^T, \qquad (4.3)$$

where $\Sigma$ is a $u \times v$ diagonal matrix with nonnegative real numbers on the diagonal and $X = W \Sigma V^T$ is the singular value decomposition of $X$.

If $q$ beats of each person are considered and a total of $Q$ curvelet coefficients are selected, the feature space per person would have a dimension of $q \times Q$. For the proposed feature set based on the zero-lag cross-correlation corresponding to the coarse, detail and fine layer curvelet coefficients, as described through Fig. 4.5, employing PCA on the derived feature space can efficiently reduce the feature dimension without losing much information. Hence, PCA is employed to reduce the dimension of the proposed feature space. The principal components of the derived feature space are shown in Fig. 4.9 for four different persons, considering forty beats for each person. It is shown in this figure that even 16 principal components can provide a feature set that is very distinct for different persons while maintaining the desired compactness for the same person.
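The PCA reduction via the singular value decomposition can be sketched in Python as below. The sketch assumes the features are stored with one beat per row (the role of $X^T$ in Eq. (4.3)), uses synthetic random data in place of real features, and the function name `pca_reduce` is hypothetical; forty beats and sixteen components follow the numbers quoted in the text, while the feature length of 300 is an arbitrary choice.

```python
import numpy as np

def pca_reduce(features, n_components=16):
    """Project row-wise feature vectors onto the leading principal components.

    features: array of shape (n_beats, n_features), one beat per row.
    Returns (scores, components) with shapes
    (n_beats, n_components) and (n_components, n_features).
    """
    X = np.asarray(features, dtype=float)
    Xc = X - X.mean(axis=0)          # enforce zero empirical mean per feature
    # Thin SVD: Xc = U @ diag(s) @ Vt; the rows of Vt are the principal
    # directions, ordered by decreasing singular value.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    scores = Xc @ components.T       # projection onto the leading directions
    return scores, components

rng = np.random.default_rng(0)
beats = rng.normal(size=(40, 300))   # synthetic: 40 beats x 300 features
scores, components = pca_reduce(beats, n_components=16)
print(scores.shape, components.shape)
```

Here each row of `scores` plays the role of a reduced feature vector for one beat; in practice the projection learned from training beats would be reused for test beats.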
2. Feature Based on Mean of Column Elements in Curvelet Domain
The feature extraction is performed by taking the mean of the elements of each column of the curvelet coefficients in the coarse, detail and fine layers. For a coarse layer of size (G, H), where G and H denote the number of rows and columns of the coarse layer, respectively, the mean of column elements (MC) of the coarse layer can be computed by

$$h[j] = \frac{1}{G} \sum_{i=1}^{G} CC(i, j), \quad j = 1, 2, \ldots, H, \qquad (4.4)$$

where CC stands for curvelet coefficients in the coarse layer.
The feature vector based on MC elements in the coarse layer can be written as, XC = {h[1], h[2], ..., h[N]}, (4.5)
where N = H, since there is one MC element per column.
Fig. 4.9: Principal Components of the derived cross-correlation based feature space for four different persons
In a similar way, the feature vectors based on MC elements in the detail and fine layers are obtained and denoted by XD and XF, respectively. The final feature set based on MC elements in the curvelet domain is formed by concatenating the three individual feature vectors XC, XD and XF.
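The MC feature can be illustrated with a short Python sketch. It assumes each layer is available as a 2-D real-valued array (how the angular divisions of the detail layer are arranged into such an array is not specified here), and the function names `mean_of_columns` and `mc_feature` and the toy arrays are hypothetical.

```python
import numpy as np

def mean_of_columns(layer):
    """Mean of the elements of each column of one layer (one value per column)."""
    return np.asarray(layer, dtype=float).mean(axis=0)

def mc_feature(coarse, detail, fine):
    """Concatenate the MC vectors of the coarse, detail and fine layers."""
    return np.concatenate([mean_of_columns(coarse),
                           mean_of_columns(detail),
                           mean_of_columns(fine)])

# Toy layers (hypothetical values, not real curvelet coefficients)
coarse = np.array([[1.0, 2.0],
                   [3.0, 4.0]])
detail = np.array([[0.0, 2.0, 4.0],
                   [2.0, 2.0, 0.0]])
fine = np.array([[1.0], [1.0], [4.0]])
x = mc_feature(coarse, detail, fine)
print(x)
```

The length of the resulting vector equals the total number of columns over the three layers, which is why a further reduction step is needed.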
Fig. 4.10 (a)-(d) shows the feature vector consisting of mean of column (MC) elements in the curvelet domain for four different persons, which includes the coarse, detail and fine layer coefficients resulting from the curvelet transform of the pre-processed ECG signal. It is evident from this figure that, for each person, the proposed feature based on MC elements in the curvelet domain is more prominent than that obtained from the zero-lag cross-correlation in the curvelet domain. Moreover, when comparing different persons, this feature shows a greater difference in pattern, which is comparatively less distinguishable in the case of the cross-correlation based feature. However, similar to the cross-correlation based feature, it is also of very high dimension. Therefore, a reduced feature set is obtained based on dominant energy bands and on PCA, as employed before for reducing the length of the cross-correlation based feature.
Fig. 4.10: Feature based on MC elements corresponding to coarse, detail and fine layer curvelet coefficients for (a) Person 1 (b) Person 2 (c) Person 3 (d) Person 4
A. Reduction of Mean of Column Elements Feature Based on Dominant Energy Bands
Detailed analysis of the feature based on MC elements in the curvelet domain reveals that there exists highly redundant information that may not be effective in human identification. To this end, the MC elements corresponding to the fine layer are ignored, as they are found to have very low magnitude; they represent the high frequency noise still remaining in the pre-processed ECG and thus carry no information relevant to human identification. Fig. 4.11 shows the MC elements corresponding to the coarse layer coefficients for four persons. It is clearly seen that the marked region represents a band or lobe that is dominant in energy and has a definite shape for each person. The two minima at both ends of the lobe are identified to capture the shape of the dominant energy band. The MC elements that lie between these two minima are used to form the feature set.
Similar plots for MC elements corresponding to the detail layer curvelet coefficients are displayed in Fig. 4.12 for four persons. It is observable from the marked regions in this figure that there exist two bands of dominant energy, which are significant in magnitude for each person and distinguishable across different persons. Therefore, the two bands with dominant energy are captured using the approach employed in the coarse layer. By concatenating three bands, one from the coarse layer and two from the detail layer, the final reduced feature is formed, as plotted in Fig. 4.13. This figure attests that the reduced feature shows a unique signature for each person and a more distinguishable pattern across different persons compared with the reduced feature set based on the zero-lag cross-correlation in the curvelet domain presented in Fig. 4.8.
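The concatenation of one coarse-layer band and two detail-layer bands can be sketched in Python as follows. This is only an assumed realization: the lobes are found greedily, taking the highest remaining peak and walking outward to the nearest minimum on each side, and the names `lobe_bounds` and `top_bands` as well as the sample MC values are hypothetical. Adjacent lobes found this way may share their boundary sample.

```python
import numpy as np

def lobe_bounds(v, peak):
    """Indices of the nearest minimum on each side of a given peak."""
    left = peak
    while left > 0 and v[left - 1] < v[left]:
        left -= 1
    right = peak
    while right < len(v) - 1 and v[right + 1] < v[right]:
        right += 1
    return left, right

def top_bands(values, n_bands):
    """Greedy search for the n_bands lobes with the highest peaks."""
    v = np.asarray(values, dtype=float)
    remaining = v.copy()
    bands = []
    for _ in range(n_bands):
        peak = int(np.argmax(remaining))
        left, right = lobe_bounds(v, peak)
        bands.append((left, right))
        remaining[left:right + 1] = -np.inf   # exclude this lobe next time
    return sorted(bands)

# Hypothetical MC elements of the coarse and detail layers for one beat
coarse = np.array([0.02, 0.05, 0.30, 0.12, 0.03, 0.04])
detail = np.array([0.01, 0.10, 0.18, 0.09, 0.02, 0.03,
                   0.02, 0.12, 0.15, 0.05, 0.01])

coarse_bands = top_bands(coarse, 1)
detail_bands = top_bands(detail, 2)
parts = [coarse[a:b + 1] for a, b in coarse_bands]
parts += [detail[a:b + 1] for a, b in detail_bands]
reduced = np.concatenate(parts)
print(coarse_bands, detail_bands, reduced.shape)
```

Because the band widths depend on the data, the reduced vectors of different beats may differ in length and would need to be aligned or padded before classification.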
Fig. 4.11: Feature based on MC elements corresponding to coarse layer curvelet coefficients for (a) Person 1 (b) Person 2 (c) Person 3 (d) Person 4

Fig. 4.12: Feature based on MC elements corresponding to detail layer curvelet coefficients for (a) Person 1 (b) Person 2 (c) Person 3 (d) Person 4
B. Reduction of Mean of Column Elements Feature Based on PCA
For the proposed feature set based on MC elements corresponding to the coarse, detail and fine layer curvelet coefficients, as described through Fig. 4.10, employing PCA on the derived feature space can result in an efficient reduction of the feature dimension without losing significant information. Hence, employing PCA on the proposed feature space, sixteen principal components are derived to obtain a reduced feature set. Such principal components of the feature space are plotted for six persons in Fig. 4.14. This figure shows that the reduced feature set is more prominent, varies significantly from person to person, and preserves better compactness for an individual in comparison to the reduced feature set shown in Fig. 4.9, which is obtained by employing PCA on the feature space based on the zero-lag cross-correlation corresponding to the coarse, detail and fine layers of curvelet coefficients.
Fig. 4.13: Reduced feature set concatenating MC elements corresponding to dominant energy bands of coarse and detail layers for (a) Person 1 (b) Person 2 (c) Person 3 (d) Person 4