Lagged Correlogram Patterns-based Seizure Detection Algorithm using Optimized HMM Feature Fusion
Morteza Behnam
Department of Electrical Engineering, Najafabad Branch, Islamic Azad University,
Najafabad, Isfahan, Iran.
Hossein Pourghassem
Department of Electrical Engineering, Najafabad Branch, Islamic Azad University,
Najafabad, Isfahan, Iran.
Abstract: Offline epileptic seizure detection from EEG signals depends on appropriate features. In this paper, a novel feature extraction method and a feature fusion method for EEG signal classification are introduced. After preprocessing, the windowed EEG signals are decomposed into five rhythms by a Discrete Wavelet Transform (DWT) filter bank. Three correlogram spectra, one for each of three lags, are computed for these rhythms. The features extracted from these spectra are combined by a Hidden Markov Model (HMM) as a fusion method. The lag values are optimized using a hybrid approach based on a Multi-Layer Perceptron (MLP) neural network and a Genetic Algorithm (GA) that incorporates the Hill-Climbing (HC) search technique. The final feature vector is obtained with the optimal lags and the optimized HMM feature fusion. This feature extraction and fusion scheme is called the Lagged Correlogram Patterns (LCP) algorithm. In addition, the spectral entropy is estimated as a frequency model of the signal, and the maximum value of the spectrum averaged over all windows is taken as a feature. Finally, the feature vectors are classified by a Support Vector Machine (SVM) classifier with a Radial Basis Function (RBF) kernel. An average accuracy rate of 81.40% is obtained for seizure recognition.
Keywords: Correlogram; SVM; HMM; Feature Fusion; Spectral Entropy; MLP; GA.
I. INTRODUCTION
Analyzing Electroencephalogram (EEG) signals is a convenient clinical technique for epileptic seizure detection. EEG recording is a non-invasive method for observing brain activity. Signal acquisition follows the international 10-20 system [1]. The EEG is a summation of the electrical-ionic activities of nerve cells. When a seizure occurs, a particular lobe of the brain reacts abnormally and its cells behave in an unbalanced way. This event is caused by high-voltage discharges in a specific area of the brain. Offline EEG signal processing is used for seizure detection.
Offline processing is very important for subsequent interventions and for decisions on surgery, so robust, high-performance EEG signal processing is essential [1, 2].
In recent years, many attempts have been made to detect epileptic seizures. Most of these works are built on feature extraction and signal classification [2]. Some methods decompose the EEG signal into rhythms using filter banks such as wavelet transforms [3] and then detect the seizure by extracting suitable features. These features usually comprise time-domain and time-frequency attributes, for example transform-based features and spectral analysis [4, 5].
Dimensionality reduction methods such as Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and kernel-based methods [5], or optimal feature selection algorithms such as hybrid models that combine a neural network or statistical classifier with an evolutionary algorithm, can be used to achieve high-accuracy seizure detection and pattern recognition [5, 6]. Feature fusion is another way to reduce the dimensionality of the feature space [7]. Fusion techniques based on kernel matrices and tensor-based schemes are frequently used [7, 8]. As a biomedical signal, the EEG time series has a defined frequency bandwidth. The brain rhythms and all brain activities, such as the Event-Related Potential (ERP), occur within this bandwidth. Seizure activity also appears in this band, as do sleep spindles, bruxism, restless legs syndrome and all types of epilepsy [2]. Our main challenge is therefore optimized feature extraction that captures both the dynamic and the static states of EEG signals, treated as stochastic processes, for offline seizure detection.
II. EPILEPTIC SEIZURE DETECTION ALGORITHM
The proposed algorithm is shown in Fig. 1. The EEG signals are prepared, and in the preprocessing step each epoch is filtered by a band-pass FIR filter designed with the Kaiser-Bessel window method [9]. The cutoff frequencies are set to 0.5 and 35 Hz [2].
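The band-pass step can be illustrated with a windowed-sinc design. The sketch below is a minimal pure-Python illustration, not the authors' implementation: the filter length (1025 taps) and the Kaiser shape parameter (beta = 5) are assumptions, since only the 0.5 and 35 Hz cutoffs are stated, the 256 Hz sampling rate is the one reported for the dataset in Section IV, and all function names are hypothetical.

```python
import cmath
import math


def bessel_i0(x, terms=30):
    """Zeroth-order modified Bessel function of the first kind (power series)."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / 2.0) / k          # term = (x/2)^k / k!
        total += term * term
    return total


def kaiser_window(numtaps, beta):
    """Kaiser-Bessel window of length `numtaps` with shape parameter `beta`."""
    half = (numtaps - 1) / 2.0
    norm = bessel_i0(beta)
    window = []
    for n in range(numtaps):
        ratio = (n - half) / half
        window.append(bessel_i0(beta * math.sqrt(max(0.0, 1.0 - ratio * ratio))) / norm)
    return window


def ideal_lowpass(numtaps, cutoff_hz, fs):
    """Truncated impulse response of an ideal low-pass filter."""
    half = (numtaps - 1) / 2.0
    fc = cutoff_hz / fs                # cutoff in cycles per sample
    taps = []
    for n in range(numtaps):
        t = n - half
        taps.append(2.0 * fc if t == 0 else math.sin(2.0 * math.pi * fc * t) / (math.pi * t))
    return taps


def kaiser_bandpass(numtaps=1025, low_hz=0.5, high_hz=35.0, fs=256.0, beta=5.0):
    """Band-pass taps: (low-pass at high_hz minus low-pass at low_hz) times the window."""
    window = kaiser_window(numtaps, beta)   # numtaps and beta are assumed values
    upper = ideal_lowpass(numtaps, high_hz, fs)
    lower = ideal_lowpass(numtaps, low_hz, fs)
    return [(u - l) * w for u, l, w in zip(upper, lower, window)]


def gain(taps, freq_hz, fs=256.0):
    """Magnitude of the filter's frequency response at `freq_hz`."""
    omega = 2.0 * math.pi * freq_hz / fs
    return abs(sum(h * cmath.exp(-1j * omega * n) for n, h in enumerate(taps)))


taps = kaiser_bandpass()
```

Evaluating `gain(taps, 10.0)` and `gain(taps, 60.0)` gives a quick check that a frequency inside the pass band is kept and one well above 35 Hz is suppressed. A shorter filter would widen the transition around the 0.5 Hz edge, which is why a long filter is assumed here.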
Each epoch is windowed, and two feature extraction schemes are applied to each window. First, the EEG signal is decomposed into five dominant rhythms by the DWT. In the novel method called Lagged Correlogram Patterns (LCP), the correlogram of each rhythm is computed for three lag values. The three largest values in each of these three spectra are extracted. This maximum vector is fused by a Hidden Markov Model (HMM). The lag values are optimized with a hybrid model based on an MLP and a GA that incorporates Hill-Climbing search. In addition, the spectral entropy of each windowed signal is estimated, and the maximum value of the averaged spectra is taken as a feature. Finally, the signals, each described by 10 optimal features, are classified by an SVM classifier with an RBF kernel into two classes: seizure and non-seizure.
IEEE INDICON 2015 1570157887
Fig. 1. Block diagram of our proposed seizure detection algorithm.
III. FEATURE EXTRACTION
After filtering, each signal is windowed by a Hanning window. An overlap of 50% is used when forming the new epochs [9]. This overlap preserves the information at the ends of the signal sequences and helps reduce the bias of the spectral estimate. As shown in Fig. 1, two feature extraction schemes are used. One feature is based on the estimated spectral entropy. The other feature vector is produced by the optimized HMM feature fusion, which fuses the features extracted from the correlogram spectrum of the EEG rhythms; we call this the LCP algorithm.
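The windowing with 50% overlap can be sketched as follows. This is an illustrative sketch under assumptions: it uses the standard symmetric Hanning definition and the 60-sample window length stated in this section, and the function names are hypothetical.

```python
import math


def hann(length):
    """Symmetric Hanning window: w(n) = 0.5 - 0.5*cos(2*pi*n/(N-1))."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (length - 1)) for n in range(length)]


def windowed_segments(signal, length=60, overlap=0.5):
    """Split `signal` into Hanning-weighted segments with fractional overlap."""
    step = max(1, int(length * (1.0 - overlap)))   # 30 samples for 50% overlap
    window = hann(length)
    segments = []
    for start in range(0, len(signal) - length + 1, step):
        chunk = signal[start:start + length]
        segments.append([s * w for s, w in zip(chunk, window)])
    return segments


# A constant 150-sample signal yields segments starting at 0, 30, 60 and 90.
segments = windowed_segments([1.0] * 150)
```

Trailing samples that do not fill a complete window are dropped in this sketch; how the original implementation treats them is not stated.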
A. Lagged Correlogram Patterns (LCP)
1) EEG Rhythm Decomposition by DWT
Each windowed signal is assumed to be a stochastic process that is Wide Sense Stationary (W.S.S.) and Mean Ergodic (M.E.). A windowed epoch is therefore written as,

$$w(n) = 0.5 - 0.5\cos\left(\frac{2\pi n}{N-1}\right), \quad 0 \le n \le N-1 \tag{1}$$

$$S_w(n) = S(n)\,w(n) = \left[S_{w1}, S_{w2}, \ldots, S_{wN}\right]^T \tag{2}$$

where $N$ is the number of samples in each epoch and $w(n)$ is the Hanning window, with a length of 60 samples in our application [9]. The EEG signal in each windowed epoch is decomposed using the DWT as a filter bank [3]. Each DWT level has two half-band filters. After filtering and downsampling the input windowed signal, two sequences are obtained (the detail and approximation signals). The approximation sequence is decomposed further at the next level. With four levels and the Daubechies wavelet of order 4 (Daub4) as the DWT kernel function [3, 6], the windowed EEG signal is decomposed into five brain rhythms. We use the four detail coefficient sequences and the final approximation coefficients as the EEG rhythms. These rhythms are Delta [0.5, 4] Hz, Theta [4, 8] Hz, Alpha [8, 13] Hz, Beta [13, 22] Hz and Gamma [22, 30] Hz.
2) Correlogram Spectrum of the Rhythms
The rhythms are written in the following notation,

$$R_i(n) = \left[r_1, r_2, \ldots, r_L\right]^T \tag{3}$$

where $L$ is the length of the rhythm and $i = 1, 2, \ldots, 5$ indexes the five rhythms. The correlation value between two samples of one epoch expresses their similarity. Each rhythm is a random process that is W.S.S. and M.E., so $R_i$ is a random vector [10]. Each sample of $R_i$ is also a random variable (RV). The correlation coefficient is the normalized covariance between two RVs. With these definitions, the correlogram is defined by,
$$\hat{\rho}_i(\upsilon) = \frac{\sum_{k=1}^{L-\upsilon}\left(R_i(k)-\bar{R}_i\right)\left(R_i(k+\upsilon)-\bar{R}_i\right)}{\sum_{k=1}^{L}\left(R_i(k)-\bar{R}_i\right)^2} \tag{4}$$

where $\bar{R}_i$ is the mean value of rhythm vector $i$ and $\upsilon$ is the lag, that is, the spacing between the two samples of the vector used to compute the correlogram [11, 12]. The lag must satisfy $\upsilon < L$, or $\upsilon = 0, 1, \ldots, L-1$. The graph of $\hat{\rho}_i(\upsilon)$ versus $\upsilon$ is called the correlogram. It is an estimate of the autocovariance function of the sample distribution [12]. $\hat{\rho}_i$ is the correlogram sequence for rhythm $i$. Computing $\hat{\rho}_i$ for the five extracted rhythms gives five sequences of $\hat{\rho}_i$ versus $\upsilon$. Arranging these five sequences in a matrix gives the correlogram spectrum,
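$$CS\{\hat{\rho}_i(\upsilon)\} = \begin{pmatrix}\hat{\rho}_1(\upsilon)\\ \hat{\rho}_2(\upsilon)\\ \vdots\\ \hat{\rho}_5(\upsilon)\end{pmatrix}^{T} = \begin{pmatrix}\rho_{11} & \rho_{21} & \cdots & \rho_{51}\\ \rho_{12} & \rho_{22} & \cdots & \rho_{52}\\ \vdots & \vdots & \ddots & \vdots\\ \rho_{1M} & \rho_{2M} & \cdots & \rho_{5M}\end{pmatrix} \tag{5}$$

where the columns correspond to the rhythms from Delta and Theta through Gamma (rhythm $i$ in column $i$) and $M$ is the length of $\hat{\rho}_i$. This spectrum shows the autocorrelogram as a function of the lag $\upsilon$ for the five rhythms.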
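Eqs. (4) and (5) can be sketched in a few lines. This is a minimal illustration, assuming that the rows of the correlogram spectrum are indexed by lag and the columns by rhythm; the function names are hypothetical.

```python
def correlogram(rhythm, lag):
    """Eq. (4): normalized autocovariance of one rhythm at a given lag."""
    length = len(rhythm)
    if not 0 <= lag < length:
        raise ValueError("lag must satisfy 0 <= lag < len(rhythm)")
    mean = sum(rhythm) / length
    deviations = [r - mean for r in rhythm]
    denominator = sum(d * d for d in deviations)
    if denominator == 0.0:
        raise ValueError("rhythm is constant, so the correlogram is undefined")
    numerator = sum(deviations[k] * deviations[k + lag] for k in range(length - lag))
    return numerator / denominator


def correlogram_spectrum(rhythms, max_lag):
    """Eq. (5): one row per lag (assumed), one column per rhythm."""
    return [[correlogram(r, lag) for r in rhythms] for lag in range(max_lag + 1)]


ramp = [1.0, 2.0, 3.0, 4.0, 5.0]
spectrum = correlogram_spectrum([ramp, ramp[::-1]], max_lag=2)
```

For the toy ramp above, the value at lag 0 is 1 by construction, and the values shrink as the lag grows because fewer products enter the numerator.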
The matrix $CS\{\hat{\rho}_i\}$ for a prototype windowed epoch of the signal is shown in Fig. 2. The maximum values of this spectrum correspond to the most correlated samples of the rhythms. The key issue is choosing the lag $\upsilon$, which calls for an optimization algorithm. For the initial parameter setting, three lag values are assumed, so three spectra are prepared. From each correlogram spectrum, the three largest values are extracted as the maximum similarity for the epoch, giving nine values from the three spectra. This vector is named the Max Vector (MV). A fusion method based on the HMM converts this vector into another vector with more information and less redundancy.
3) HMM Feature Fusion Algorithm
By the Cauchy-Schwarz inequality, the correlation coefficients satisfy $|\rho| \le 1$, so the maximum values are also bounded by one and the MV takes values in the range [-1, 1]. In the first step of the fusion algorithm, the members of the MV are therefore normalized to the interval [0, 1]. The MV is treated as a Markov chain. To reduce the redundancy and dependency among the samples of the MV, we use the HMM for feature combination [13]. After the normalization step, three states $S = \{S_1, S_2, S_3\}$ are considered, with sub-intervals $S_1 = [0, 0.33]$, $S_2 = [0.33, 0.66]$ and $S_3 = [0.66, 1]$. The state change is computed by,
$$SC_{i,j} = \mathrm{State\ Change}_{i,j} = \frac{\sum_{i,j=1}^{L_{MV}} N(S_i, S_j)}{L_{MV}} \tag{6}$$

where $N(S_i, S_j)$ is the number of state changes from the $i$-th state to the $j$-th state in the string MV, treated as a Discrete-Time Markov Chain (DTMC) [8], and $L_{MV}$ is the length of the Markov chain (here $L_{MV} = 9$). The transition matrix is then obtained as follows,
Fig. 2. The schema of correlogram spectrum with M =128 samples for five rhythms of the windowed epoch (for i=1, 2, ... , 5).
$$TM = \begin{pmatrix} SC_{11} & SC_{12} & SC_{13}\\ SC_{21} & SC_{22} & SC_{23}\\ SC_{31} & SC_{32} & SC_{33}\end{pmatrix} \tag{7}$$
After the transition matrix TM has been computed as the fused features extracted from the correlogram spectrum, it is reshaped into the vector $H$,

$$H = \left[SC_{11}, SC_{12}, SC_{13}, SC_{21}, SC_{22}, SC_{23}, SC_{31}, SC_{32}, SC_{33}\right] \tag{8}$$

The optimized vector $H$ is used in the next procedure as a sub-feature vector for pattern recognition into the seizure and non-seizure classes.
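The fusion of Eqs. (6) to (8) can be sketched as below. The sketch rests on assumptions that are not stated in the text: the normalization from [-1, 1] to [0, 1] is taken to be the affine map (v + 1) / 2, values on a shared boundary (0.33 or 0.66) are assigned to the lower state, and the function names are hypothetical.

```python
def state_of(value):
    """Map a normalized value in [0, 1] to state index 0, 1 or 2 (S1, S2, S3)."""
    if value <= 0.33:      # boundary handling is an assumption
        return 0
    if value <= 0.66:
        return 1
    return 2


def hmm_fuse(max_vector):
    """Return the flattened 3x3 state-change matrix H of Eqs. (6)-(8)."""
    normalized = [(v + 1.0) / 2.0 for v in max_vector]   # assumed normalization
    states = [state_of(v) for v in normalized]
    chain_length = len(states)                           # L_MV, 9 in the paper
    counts = [[0] * 3 for _ in range(3)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1                  # N(S_i, S_j)
    return [counts[i][j] / chain_length for i in range(3) for j in range(3)]


# Toy Max Vector with nine entries; its state sequence is 0,0,1,1,2,2,0,1,2.
mv = [-1.0, -0.9, 0.0, 0.1, 0.9, 1.0, -0.8, 0.2, 0.95]
H = hmm_fuse(mv)
```

Because a chain of nine samples has eight transitions while the counts are divided by $L_{MV} = 9$, the entries of `H` sum to 8/9 rather than 1 under this reading of Eq. (6).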
4) LCP Parameter Optimization by MLP-GA Algorithm
To optimize the lag parameter, the best values for the LCP method are selected by a hybrid model based on a GA and an MLP neural network.
The vector $H$ can be used as a feature vector for signal classification, so an MLP neural network is designed as a classifier for the feature vectors of the observations. The number of input neurons is nine, equal to the length of the vector $H$. The number of hidden neurons is set to 18 based on our experiments. The learning algorithm is Levenberg-Marquardt, and the Back Propagation (BP) algorithm is used as the training function. In each training epoch, 70% of the dataset is used for training and 30% for testing [14]. The activation function is the sigmoid,
$$AF\{\mathrm{sgn}(x)\} = \frac{1}{1+e^{-x}} \tag{9}$$

The GA is applied as an evolutionary search method for optimization [15]. The GA provides global search with strong exploration capability; combined with the Hill-Climbing (HC) method, which provides exploitation, it is very efficient for following a multi-agent problem simultaneously. The MLP neural network classifies the EEG signals using the patterns of $H$. In each classification, the Mean Square Error (MSE) of the classifier is given by,
$$MSE(k) = E\{e_k^2(n)\}$$

$$MSE(k) = \frac{1}{PN}\sum_{i=1}^{N}\left\|\mathrm{Target}(i,k) - \mathrm{Output}(i,k)\right\|_2^2 \tag{10}$$

where $N$ is the number of observations, $P$ is the number of output neurons of the MLP and $k$ is the generation number [14]. The cost function of the GA is based on the Minimum MSE (MMSE) and is defined as follows,

$$\mathrm{Cost\ Function} = \frac{1}{MSE(k)} \tag{11}$$

With this cost function, the GA searches for the optimal parameters using the Hill-Climbing algorithm. To increase the fitness, the HC search moves towards the top of the cost function as the local search within the GA. In each iteration of the GA, the HC stops when the current point is larger than its neighbors. Three lags must be generated to produce the vector $H$.
In effect, the GA is the lag generator for the correlogram method. With three agents running in parallel, the GA searches for the optimal lags.
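The cost evaluation and the hill-climbing stop rule can be sketched as follows. This is an illustrative sketch only: the toy score function stands in for the MLP-based cost, the lag bounds 1 to 127 are an assumption, and all names are hypothetical.

```python
def mse(targets, outputs):
    """Eq. (10): squared error averaged over N observations and P outputs."""
    n_obs = len(targets)
    n_out = len(targets[0])
    total = sum(
        (t - o) ** 2
        for target_row, output_row in zip(targets, outputs)
        for t, o in zip(target_row, output_row)
    )
    return total / (n_out * n_obs)


def cost(mse_value):
    """Eq. (11): reciprocal of the MSE, so a lower error gives a higher score."""
    return 1.0 / mse_value


def hill_climb(start, score, lower=1, upper=127):
    """Step to the better integer neighbor until no neighbor scores higher."""
    current = start
    while True:
        neighbors = [x for x in (current - 1, current + 1) if lower <= x <= upper]
        best = max(neighbors, key=score)
        if score(best) <= score(current):
            return current          # current point is at least as good as its neighbors
        current = best


def toy_score(lag):
    """Hypothetical stand-in for the MLP-based cost, peaked at lag 5."""
    return 1.0 / (1.0 + (lag - 5) ** 2)


example_mse = mse([[1.0], [0.0]], [[0.5], [0.5]])
best_lag = hill_climb(20, toy_score)
```

In the scheme described above, `score` would train and evaluate the MLP on the feature vectors produced with the candidate lag and return `cost(mse(...))`.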
In the structure of the GA, 20 initial chromosomes are used for each lag, and they are replaced by the 10 best chromosomes in the next generation, so that 10 chromosomes remain after all generations. The lag value is a positive integer, whereas the chromosome strings are binary. The generated lags are converted to 8-bit arrays, so each string has 8 genes to optimize. The general GA parameters are set as follows: the crossover fraction is 50% of each string, in other words 4 of every 8 genes; the mutation probability is 1% for each gene; and the number of generations $k$ is 30. All operations in this algorithm are based on single-point crossover and inverting mutation [14-16]. By the $n^r$ law [15], the dimension of the search space is calculated by,

$$\mathrm{Energy\ Function} = L_1 L_2 L_3 = 127^3 \tag{12}$$

where the energy function is the dimension of the search space and $L_i$ is the number of permutations of the lag values. After this optimization, three optimal lags with the best MMSE are obtained.
These lags are $L_1 = 1$, $L_2 = 5$ and $L_3 = 4$. We use these optimized parameters in the final signal classification.
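The chromosome encoding and the two genetic operators can be sketched as below. The sketch follows the stated settings (8-bit strings, a crossover point after 4 genes, 1% mutation probability per gene); the function names are hypothetical, and the search-space size assumes 127 admissible values for each of the three lags.

```python
import random


def encode(lag, bits=8):
    """Convert a positive integer lag to a list of binary genes."""
    return [int(b) for b in format(lag, "0{}b".format(bits))]


def decode(genes):
    """Convert a list of binary genes back to an integer lag."""
    return int("".join(str(g) for g in genes), 2)


def crossover(parent_a, parent_b, point=4):
    """Single-point crossover: swap the tails after `point` genes."""
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


def mutate(genes, rate=0.01, rng=random):
    """Inverting mutation: flip each gene independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in genes]


child_a, child_b = crossover(encode(5), encode(240))
search_space = 127 ** 3   # assumed: 127 candidate values per lag, three lags
```

A child can decode to 0, as `child_a` does here, which is not a valid positive lag; a repair or rejection step would be needed, and how the original implementation handles this case is not stated.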
B. Spectral Entropy Estimation
To compute the Cross Power Spectral Density (CPSD), the modified periodogram with the Welch method is used [17]. This method, together with the overlap in the windowing procedure, reduces the variance of the spectrum estimate. In this method, K segments of the windowed signal with 50% overlap are considered [17]. The CPSD is computed by,
Fig. 3. Mean of K sequences in all windows of EEG signal for computing the estimation of the spectral entropy .
$$P_{Welch}(S_w(n)) = \sum_{k=0}^{N-1} e^{-jk\omega n}\, E\{S_w(n)\,S_w^H(n+k)\} \tag{13}$$

where $P_{Welch}$ is the Welch power spectral density for the windowed signal $S_w(n)$. As $N$ tends to $\infty$, the spectral estimate remains unbiased, so

$$\mathrm{Bias}\{\hat{P}_{Welch}\} = E\{\hat{P}_{Welch}\} - P_{Welch} \to 0 \tag{14}$$

To compute the spectral entropy after computing the CPSD [18], the absolute value of the entropy is obtained from the following equation,
$$SEE = -\sum_{n=1}^{N} \hat{P}_{Welch}(S_w(n)) \cdot \log\{\hat{P}_{Welch}(S_w(n))\} \tag{15}$$

where SEE is computed for all windowed sequences of each EEG signal [19]. This gives K series of SEE for each signal. Averaging these sequences gives an estimate of the spectral entropy like the one in Fig. 3. The maximum value of the graph in Fig. 3 is defined as,

$$M_{SEE} = \mathrm{Max}\left\{\frac{1}{K}\sum_{i=1}^{K} SEE(i)\right\} \tag{16}$$

where $M_{SEE}$ represents the maximum entropy and frequency distribution of the EEG signal [18, 19]. Thus, one feature is extracted for each signal: the maximum value of the spectral entropy.
IV. EXPERIMENTAL RESULTS
A. EEG Signal Dataset
In this paper, the CHB-MIT dataset collected at the Children’s Hospital Boston is used. This dataset consists of EEG signals recorded from children with intractable seizure disorder. The patients were monitored for up to several days after withdrawal of anti-seizure medication, for intervention and treatment [20].
In this research, 104 hours of EEG signals are used.
Each signal is divided into N=120 epochs of 30 seconds each, giving 12480 EEG signals with a sampling frequency of 256 Hz and 16-bit resolution. In this set, 2040 signals show symptoms of a seizure attack.
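The dataset figures can be checked with a few lines of arithmetic. The sketch assumes that each "signal" is one hour of recording, which is consistent with 120 epochs of 30 seconds; the variable names are hypothetical, and the split sizes correspond to the 70%, 20% and 10% partitions listed in Table I.

```python
hours = 104                 # recording hours used
epochs_per_hour = 120       # N = 120 epochs per one-hour signal (assumed)
epoch_seconds = 30
sampling_rate_hz = 256
seizure_epochs = 2040

total_epochs = hours * epochs_per_hour                   # number of EEG signals
samples_per_epoch = epoch_seconds * sampling_rate_hz     # samples in one epoch
seizure_fraction = seizure_epochs / total_epochs         # share of seizure epochs

# Partition sizes for the 70/20/10 split, using integer arithmetic.
training_size = total_epochs * 70 // 100
test_size = total_epochs * 20 // 100
validation_size = total_epochs * 10 // 100
```

The seizure fraction of roughly 16% shows that the two classes are unbalanced, which is worth keeping in mind when reading the accuracy figures.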
B. K-fold Cross Validation
To separate seizure from non-seizure signals in the final pattern classification, K-fold cross validation is applied [2]. The expected error of this algorithm is estimated by averaging the errors over all folds,

$$E_{K\text{-}fold} = \frac{1}{K}\sum_{i=1}^{K} E_i \tag{17}$$

where $E_i$ is the error of the $i$-th fold and $K$ is the number of folds (here K=4). All of the data take part in the training and test procedures without repetition [21].
C. Signal Classification with SVM and RBF Kernel
The extracted feature vector for each signal has 10 features: 9 fused features from the LCP method and one maximum-entropy feature. For seizure recognition, the SVM is employed as the classifier. Because the problem has two classes, the SVM is expected to be more efficient than other classifiers [16, 18]. The SVM finds an optimal hyperplane by mapping the features to a high-dimensional space and separating them with the maximum margin as the decision boundary. By using the RBF as the kernel of the SVM classifier, we attempt to handle the complexity of the feature space [19].
We use a nonlinear function to map the input vector (feature vector $F$) to the space of $\phi$. The nonlinear function $\phi$ transforms a nonlinearly separable problem into a linearly separable one. The mapping is,

$$\phi(F - m_F) = e^{-\frac{\|F - m_F\|^2}{2\sigma^2}} \tag{18}$$

where $m_F$ is the statistical mean of the feature vector $F$ and $\|\cdot\|$ is the Euclidean norm. The spread constant $\sigma^2$, the variance of the Gaussian function, is set to 0.01. With this kernel, the SVM classifies the EEG signals into two classes, seizure and non-seizure [19].
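The Gaussian mapping of Eq. (18) can be evaluated directly. The sketch below is illustrative and assumes that $m_F$ is a center vector of the same length as the feature vector; the function name is hypothetical, and the default spread is the stated value of 0.01.

```python
import math


def rbf_map(feature, center, sigma_sq=0.01):
    """Eq. (18): exp(-||F - m_F||^2 / (2 * sigma^2))."""
    squared_distance = sum((f - m) ** 2 for f, m in zip(feature, center))
    return math.exp(-squared_distance / (2.0 * sigma_sq))


at_center = rbf_map([0.1, 0.2], [0.1, 0.2])   # zero distance from the center
nearby = rbf_map([0.1], [0.0])                # squared distance equals sigma^2
```

With a spread as small as 0.01, the response falls off quickly: a squared distance equal to the spread already reduces the value to exp(-0.5), about 0.61.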
D. Final Results of Classification
We have 4 folds and therefore four classifiers, each with its own parameters for detecting the patterns. In this classification, 70% of the EEG dataset is used for training, 20% for testing and 10% for a validation check (VC). The final results of seizure detection are presented in Table I.

TABLE I. FINAL RESULTS OF EEG SIGNAL CLASSIFICATION.

| Results | Signals: Training (70%) | Signals: Test (20%) | Signals: VC (10%) | Accuracy (%): Training | Accuracy (%): Test | Accuracy (%): VC | MSE (×10⁻⁴) |
|---|---|---|---|---|---|---|---|
| Fold 1 | 8736 | 2496 | 1248 | 84.80 | 73.60 | 81.20 | 2.64 |
| Fold 2 | 8736 | 2496 | 1248 | 86.99 | 76.80 | 81.59 | 2.32 |
| Fold 3 | 8736 | 2496 | 1248 | 87.44 | 78.80 | 80.45 | 2.12 |
| Fold 4 | 8736 | 2496 | 1248 | 87.51 | 75.20 | 79.56 | 2.48 |

Final Results of Epileptic Seizure Detection

| Variance: MSE | Variance: Test Error | Variance: Training Error | Mean Accuracy (%): Test | Mean Accuracy (%): Training | Final Accuracy (%) |
|---|---|---|---|---|---|
| 4.95×10⁻¹⁰ | 4.9467 | 1.6323 | 76.100 | 86.685 | 81.40 |

Fig. 4. Comparison of seizure detection performance for the different folds.

After convergence of the learning function, the mean classification accuracy is 86.68% for training and 76.10% for testing, and the variance of the MSE is 4.95×10⁻¹⁰. To obtain a suitable detector, we select a classifier with averaged parameters, which has an accuracy rate of 81.40%. Fig. 4 compares the performance of the different classifiers; it shows the four series of results, one for each fold.
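The summary statistics can be recomputed from the per-fold values in Table I. The sketch below uses the sample variance (divisor K-1), which reproduces the reported variances. How the final accuracy was derived is not stated; the midpoint of the training and test means is shown as one assumption, and it gives about 81.39, close to the reported 81.40. The variable names are hypothetical.

```python
import statistics

# Per-fold values copied from Table I.
training_accuracy = [84.80, 86.99, 87.44, 87.51]
test_accuracy = [73.60, 76.80, 78.80, 75.20]
mse_times_1e4 = [2.64, 2.32, 2.12, 2.48]

mean_training = statistics.mean(training_accuracy)
mean_test = statistics.mean(test_accuracy)

# statistics.variance is the sample variance (divides by K - 1).
variance_training = statistics.variance(training_accuracy)
variance_test = statistics.variance(test_accuracy)
variance_mse = statistics.variance(mse_times_1e4) * 1e-8   # undo the 1e4 scaling, squared

# Assumed definition of the final accuracy: midpoint of the two means.
final_accuracy = (mean_training + mean_test) / 2.0
```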
V. CONCLUSION
In this paper, an epileptic seizure detection algorithm based on EEG signal processing was proposed. The statistical feature extraction and HMM fusion method based on the proposed LCP, together with spectral entropy estimation, were efficient for seizure detection. The GA combined with the MLP classifier for lag optimization can search the multi-dimensional space appropriately. The SVM classifier with the RBF kernel was used to separate seizure from non-seizure patterns.
REFERENCES
[1] S. Nasehi and H. Pourghassem, “Seizure detection algorithms based on analysis of EEG and ECG signals: a survey,” Neurophysiology, vol. 44, no. 2, pp. 174-186, June 2012.
[2] S. Nasehi, H. Pourghassem, “A New Feature Dimensionally Reduction Approach Based on General Tensor Discriminant Analysis in EEG Signal Classification”, International Conference on Intelligent Computation and Bio-Medical Instrumentation (ICBMI), pp.188-191, Wuhan, China, 14-17 Dec. 2011.
[3] O. Salem, A. Naseem, and A. Mehaoua, “Epileptic Seizure Detection From EEG Signal Using Discrete Wavelet Transform and Ant Colony Classifier,” IEEE International Conference on Communications (ICC), Sydney, NSW, pp. 3529-3534, June 2014.
[4] I. Kalatzis, E.P. Ventouras, C.C. Papageorgiou, A.D. Rabavilas, and D. Cavouras, “Design and implementation of an SVM-based computer classification system for discriminating depressive patients from healthy controls using the P600 component of ERP signals,” Computer Methods and Programs in Biomedicine, vol. 75, no. 1, pp. 11-22, July 2004.
[5] S. Nasehi and H. Pourghassem, “A novel effective feature selection algorithm based on S-PCA and wavelet transform features in EEG signal classification,” IEEE 3rd International Conference on Communication Software and Networks (ICCSN), Xian, China, vol. 1, pp. 114-117, 27- 29 May 2011.
[6] S. Nasehi and H. Pourghassem, “Online mental task classification based on DWT-PCA features and probabilistic neural network,” International Journal of Imaging and Robotics, vol. 7, no. 1, pp. 110-118, January 2012.
[7] S. Nasehi and H. Pourghassem, “A Novel Fast Epileptic Seizure Onset Detection Algorithm Using General Tensor Discriminant Analysis,” Journal of Clinical Neurophysiology, vol. 30, no. 4, pp. 362-370, August 2013.
[8] S. Nasehi and H. Pourghassem, “Mental Task Classification Based on HMM and BPNN,” International Conference on Communication Systems and Network Technologies (CSNT 2013), India, pp. 210-214, April 2013.
[9] M. Behnam and H. Pourghassem, “Periodogram Pattern Feature-based Seizure Detection Algorithm using Optimized Hybrid Model of MLP and Ant Colony,” 23rd Iranian Conference on Electrical Engineering (ICEE 2015), Tehran, Iran, pp. 32-37, May 2015.
[10] D.A. Engemann and A. Gramfort, “Automated model selection in covariance estimation and spatial whitening of MEG and EEG signals,” NeuroImage, vol. 108, pp. 328-342, March 2015.
[11] A. Yasuhara, “Correlation between EEG abnormalities and symptoms of autism spectrum disorder (ASD),” Brain and Development, vol. 32, no. 10, pp. 791-798, November 2010.
[12] S. Nasehi and H. Pourghassem, “Epileptic Seizure Onset Detection Algorithm Using Dynamic Cascade Feed-Forward Neural Networks,” International Conference on Intelligent Computation and Bio-Medical Instrumentation (ICBMI), Wuhan, Hubei, China, pp. 196-199, December 2011.
[13] B. Obermaier, C. Guger, C. Neuper, and G. Pfurtscheller, “Hidden Markov models for online classification of single trial EEG data,” Pattern Recognition Letters, vol. 22, no. 12, pp. 1299-1309, October 2001.
[14] S. Nasehi and H. Pourghassem, “A Novel Epileptic Seizure Detection Algorithm Based on Analysis of EEG and ECG Signals Using Probabilistic Neural Network,” Australian Journal of Basic and Applied Sciences, vol. 5, no. 12, pp. 308-315, December 2011.
[15] K. Hsu and S. Yu, “Detection of seizures in EEG using subband nonlinear parameters and genetic algorithm,” Computers in Biology and Medicine, vol. 40, no. 10, pp. 823-830, October 2010.
[16] A.R. Naghsh-Nilchi and M. Aghashahi, “Epilepsy seizure detection using eigen-system spectral estimation and Multiple Layer Perceptron neural network,” Biomedical Signal Processing and Control, vol. 5, no. 2, pp. 147-157, April 2010.
[17] S. Nasehi and H. Pourghassem, “Patient-specific epileptic seizure onset detection algorithm based on spectral features and IPSONN classifier,” International Conference on Communication Systems and Network Technologies (CSNT), pp. 186-190, April 2013.
[18] Y. Kumar, M.L. Dewal, and R.S. Anand, “Epileptic seizures detection in EEG using DWT-based ApEn and artificial neural network,” Signal, Image and Video Processing, vol. 8, no. 7, pp. 1323-1334, October 2014.
[19] N. Nicolaou and J. Georgiou, “Detection of epileptic electroencephalogram based on Permutation Entropy and Support Vector Machines,” Expert Systems with Applications, vol. 39, no. 1, pp. 202-209, January 2012.
[20] EEG Signal Database, http://www.physionet.org/pn6/chbmit.
[21] S. Nasehi and H. Pourghassem, “Real-Time Seizure Detection Based on EEG and ECG Fused Features Using Gabor Functions,” IEEE International Conference on Intelligent Computation and Bio-Medical Instrumentation (ICBMI), Wuhan, China, pp. 204-207, December 2011.