
3.2 Construction of Adaptive Subband Coding Scheme

3.2.5 Modified Index Coding Scheme for Significance Map

In wavelet thresholding based methods, the compression is done in two stages: 1) compression of the nonzero wavelet coefficients (NZWC) vector and 2) compression of the significance map. Transmission or storage of the indices of the significant coefficients, i.e., the significance map, is essential for perfect reconstruction at the decoder. The significance map indicates whether a particular wavelet coefficient in a given wavelet coefficients vector is zero or nonzero. For wavelet based image coding, many approaches have been proposed to efficiently code the significance map by exploiting the properties of the wavelet transformed image [217].

In most of the wavelet thresholding based ECG compression methods [134, 135, 137, 138, 142, 211], the significance map is a binary vector generated by scanning the wavelet coefficients in the TWC vector and storing a '1' when a nonzero (significant) coefficient is scanned and a '0' when a zero (insignificant) coefficient is scanned [134]. This is referred to as the binary significance map (BSM) [134, 135] and is compressed using the 8 bits/element table (T8) [137] or the RLE [138, 211] and Huffman coding algorithms [137, 142]. In [134, 135], the BSM is compressed efficiently using a variable-length code based on the run length encoding algorithm. The compression efficiency of a threshold based method depends on the number of bits used to code the NZWC vector and the number of bits used to code the significance map. Considerable gains in compression ratio can be achieved if the significance map is compressed efficiently. Therefore, we develop an efficient coding scheme for the significance map by exploiting the characteristics of the ECG signal and the distribution of the significant coefficients in each subband of the wavelet transform.
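As a minimal sketch (not the thesis implementation), the BSM construction described above can be written as:

```python
# Sketch: building the binary significance map (BSM) from a thresholded
# wavelet coefficient (TWC) vector, as described in the text.

def binary_significance_map(twc):
    """Return a list with 1 for each nonzero (significant) coefficient
    and 0 for each zero (insignificant) coefficient."""
    return [1 if c != 0 else 0 for c in twc]

# Hypothetical TWC vector after thresholding (values are illustrative).
twc = [-141.87, 0, 0, 83.62, 0, -14.50, 0, 13.60]
print(binary_significance_map(twc))  # -> [1, 0, 0, 1, 0, 1, 0, 1]
```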

Most of the energy in an ECG signal is concentrated in the low-frequency region. The wavelet transformation concentrates the significant part of the signal energy in the low-frequency components, with the majority of the coefficients carrying little energy. At each subband of the wavelet transform, the energy is concentrated in a small number of wavelet coefficients. Therefore, the wavelet transform of most ECG signals is sparse, yielding a large number of small wavelet coefficients and a small number of large coefficients. A few wavelet coefficients may be sufficient to represent the ECG signal, which is characterized by a cyclic occurrence of patterns (QRS complexes, P and T waves) with different frequency contents. Moreover, the significant wavelet coefficients for each signal block appear considerably close in the order sequence within a wavelet subband. Since the nonzero coefficients in the TWC vector exhibit localized patterns in each subband, the significance map is created by storing the indexes, or locations, of the nonzero wavelet coefficients. The result is referred to as the integer significance map (ISM). Applying a first order forward difference to the indexes increases the closeness of the indexes of the significant coefficients within subbands. This is described as follows. The first order forward difference of a set of positive integers I is a new set Î where Î(n) = I(n+1) − I(n); the pure forward difference has one element fewer than I, so the first index I(1) is retained as the first element of Î. The set Î then has the same length as I, and I is recovered exactly by taking the partial (cumulative) sums of Î. This is referred to as differencing further in this work. The resulting set Î can be coded much more efficiently, since differencing increases the probability of recurrence of symbols within a set or subset. Thus, the first difference of the indexes of the significant coefficients can be coded with Huffman encoding in a highly efficient way.
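The differencing step and its inverse can be sketched as follows, with the first index retained so that the transform is invertible; the function names are illustrative:

```python
# Sketch of the differencing step: with the first index retained, the
# difference set has the same length as I, and I is recovered exactly by
# cumulative (partial) sums, so no information is lost.
from itertools import accumulate

def difference(I):
    """First order forward difference with the first index retained."""
    return [I[0]] + [I[n] - I[n - 1] for n in range(1, len(I))]

def partial_sums(I_hat):
    """Inverse of difference(): cumulative sums recover the index set."""
    return list(accumulate(I_hat))

I = [1, 2, 3, 4, 5, 9, 13, 17, 23, 25]
I_hat = difference(I)            # [1, 1, 1, 1, 1, 4, 4, 4, 6, 2]
assert partial_sums(I_hat) == I  # lossless round trip
```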

The proposed modified index coding (MIC) combines two ideas, differencing and Huffman encoding, to compress the ISM. The first step of this concept is similar to the one introduced in a lossy image codec based on index coding [217]. Let us first explain index coding with an example and then focus on the proposed MIC scheme. A finite wavelet coefficient index set is given by I = {k : k ∈ Z+, k ≤ K}, where K is the length of the WC vector and k is the index of a wavelet coefficient in the WC vector. If A is a subset of I, the question is how to code the set I or the subset A efficiently. Assume W is the amplitudes of a hypothetical set of wavelet coefficients for one signal block, W = {−141.87, 47.44, −56.72, 83.62, −13.79, −1.21, 0.46, 0.59, −14.50, 1.38, 0.10, 0.65, 13.60, −8.47, 5.01, 6.59, −14.52, 5.30, 6.10, −2.33, 7.35, 1.02, −24.49, −2.33, 11.35}. With a threshold T = 10.15, the significant or nonzero wavelet coefficients are

NZWC = {−141.87, 47.44, −56.72, 83.62, −13.79, −14.50, 13.60, −14.52, −24.49, 11.35}   (3.15)

and their wavelet indexes in the set W are

I = {1, 2, 3, 4, 5, 9, 13, 17, 23, 25}.   (3.16)

In this example, the number of significant coefficients is 10. The elements of the wavelet index set I are all positive integers, and we want to code this index set I. Since the elements of I are ordered in monotonically increasing order, we can take the forward difference of adjacent elements (retaining the first index) to produce a difference set Î of the initial set I,

Î = {1, 1, 1, 1, 1, 4, 4, 4, 6, 2}.   (3.17)

The elements of set Î are referred to as skips. It is straightforward to recover set I from the difference set Î by taking its partial sums; thus I and Î contain the same information, which means there is no information loss. The difference set Î can be compressed by Huffman coding, a variable-length coding procedure which assigns codewords of variable length to the possible outcomes of Î such that highly probable outcomes are assigned shorter codewords, and vice versa. Huffman coding is based on the frequency of occurrence of the skip values in the difference set Î. In this example, the frequency histogram of the skip values of set Î is given by

h = {5, 3, 1, 1},   (3.18)

and the probability set P, which gives the probabilities of the skip values of set Î, is given by

P = {0.5, 0.3, 0.1, 0.1}.   (3.19)
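The worked example above (eqs. 3.15–3.19) can be reproduced with a short script; the variable names are illustrative:

```python
# Sketch reproducing the worked example: thresholding W, extracting the
# index set I, differencing it, and computing the skip histogram and
# skip probabilities.
from collections import Counter

W = [-141.87, 47.44, -56.72, 83.62, -13.79, -1.21, 0.46, 0.59, -14.50,
     1.38, 0.10, 0.65, 13.60, -8.47, 5.01, 6.59, -14.52, 5.30, 6.10,
     -2.33, 7.35, 1.02, -24.49, -2.33, 11.35]
T = 10.15

I = [k + 1 for k, w in enumerate(W) if abs(w) > T]        # 1-based indexes
I_hat = [I[0]] + [I[n] - I[n - 1] for n in range(1, len(I))]

counts = Counter(I_hat)                                   # skip histogram
P = {s: c / len(I_hat) for s, c in counts.items()}        # skip probabilities

print(I)      # [1, 2, 3, 4, 5, 9, 13, 17, 23, 25]
print(I_hat)  # [1, 1, 1, 1, 1, 4, 4, 4, 6, 2]
print(P)      # {1: 0.5, 4: 0.3, 6: 0.1, 2: 0.1}
```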

As a variable-length encoding scheme, Huffman encoding has proven to be one of the leading techniques in applications dealing with data compression. In fact, as is widely practised, the combination of run length encoding and Huffman encoding provides a near-optimal encoding technique that is easy to implement. In this work, we use the combination of differencing and Huffman encoding, since the wavelet index set I is short and does not contain repeated symbols such as the runs of ones and zeros found in the binary significance map; run length coding is more suitable for coding the binary significance map. Huffman coding is employed here because it minimizes the average code length required for the difference set Î, and it is performed efficiently by examining the probabilities (or counts) of the symbols in Î.
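For illustration only, a minimal Huffman code construction over the skip symbols, using just the Python standard library (an assumed implementation, not the one used in this work):

```python
# Minimal Huffman code construction over skip symbols of a difference set.
import heapq
from collections import Counter

def huffman_code(symbols):
    """Return {symbol: bitstring} built from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:                      # degenerate single-symbol case
        return {next(iter(freq)): "0"}
    # Heap entries: (frequency, tiebreak id, {symbol: code-so-far}).
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)     # two least-frequent subtrees
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (f1 + f2, count, merged))
        count += 1
    return heap[0][2]

code = huffman_code([1, 1, 1, 1, 1, 4, 4, 4, 6, 2])
# The most frequent skip (1) gets the shortest codeword.
assert len(code[1]) <= min(len(code[s]) for s in (2, 4, 6))
```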

Generally, larger wavelet coefficients are localized in wavelet subbands of smaller size, such as the approximation subband and the lower detail subbands. This property results in a larger number of ones in the difference set Î, so the probability of the symbol one is naturally high compared to that of the other symbols in the set; its count is therefore coded with a fixed number of bits. A few large values are observed in the difference set Î, caused by long skips between the localized patterns, and these may reduce the coding efficiency of the Huffman coder. Therefore, we employ a skips detector with a threshold IT, assigned either from a prior study of the distribution of localized wavelet coefficients in the WC vector or adaptively by exploiting local skip statistics of the set I. The detector replaces every skip in Î above IT with a zero, while the large skip itself is stored separately. The set with large skips removed is referred to as S. The large skips are coded using fixed-length coding; since the number of large skips is typically small, their code length is small. A codebook is generated which consists of four parts: the first part contains the code assigned to the ones count; the second part carries the code used to represent the symbols in set S; the third part carries the code used to represent the counts of those symbols; and the fourth part contains the code of the large skips. As a measure of the compression efficiency of an encoding scheme, we use the average codeword length. Let S = (S, p) be an information source and (C, f) an encoding scheme for the symbol set S = {s1, s2, s3, ..., sq}, where si, 1 ≤ i ≤ q, denotes a source symbol, pi denotes the probability of the source producing symbol si, C denotes a code for the source S, and f(si) is the codeword assigned to source symbol si. The entropy of the source S is given by

H = − Σ_{i=1}^{q} pi log2 pi   (3.20)

and the average codeword length of (C, f) is given by

Lc = Σ_{i=1}^{q} length[f(si)] p(si).   (3.21)
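The two figures of merit in eqs. (3.20) and (3.21) can be computed directly; the codewords below are one valid Huffman code for the example skip probabilities:

```python
# Sketch of eqs. (3.20) and (3.21): source entropy H and the average
# codeword length Lc of a given code f.
from math import log2

def entropy(p):
    """Entropy H = -sum(p_i * log2(p_i)) in bits/symbol."""
    return -sum(pi * log2(pi) for pi in p)

def average_codeword_length(code, p):
    """Lc = sum(length[f(s_i)] * p(s_i)); code: {symbol: bitstring}."""
    return sum(len(code[s]) * p[s] for s in code)

P = {1: 0.5, 4: 0.3, 6: 0.1, 2: 0.1}           # skip probabilities (3.19)
code = {1: "0", 4: "10", 6: "110", 2: "111"}   # one valid Huffman code
H = entropy(P.values())                         # ~1.685 bits/symbol
Lc = average_codeword_length(code, P)           # 1.7 bits/symbol
assert H <= Lc < H + 1                          # Huffman optimality bound
```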

As we will see, the proposed modified index coding produces the most efficient scheme, in the sense of the smallest average codeword length, among the encoding schemes used for the ISM. Finally, the proposed modified index coding scheme consists of the following steps:

Step 1: Obtaining the thresholded wavelet coefficients (TWC) vector of the frame or the WC vector depending on the thresholding procedure.

Step 2: Creation of integer significance map (ISM) by storing the locations of nonzero wavelet coefficients in the TWC vector.

Step 3: Applying difference encoding to the ISM, which results in the DISM set(s).

Step 4: Detecting large skips in the difference set and replacing them with the value zero, resulting in two sets: the processed DISM set and the large skips subset.

Step 5: Compression of the processed DISM using the Huffman coding algorithm and representation of the large skips using the adaptive fixed-length coding procedure.

Step 6: Finally, generation of the codebook for each significance map.
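Steps 1–5 above can be sketched end to end (with the Huffman coding of the processed set omitted); the function name and the skip threshold value are illustrative, not the thesis implementation:

```python
# End-to-end sketch of steps 1-5. Zero is a safe escape symbol here,
# since genuine skips between nonzero coefficients are always >= 1.

def mic_prepare(twc, skip_threshold):
    """Return (processed DISM, list of (position, large_skip)) for a
    thresholded wavelet coefficient (TWC) vector."""
    # Steps 1-2: integer significance map (1-based indexes of nonzeros).
    ism = [k + 1 for k, c in enumerate(twc) if c != 0]
    # Step 3: difference encoding (first index retained).
    dism = [ism[0]] + [ism[n] - ism[n - 1] for n in range(1, len(ism))]
    # Step 4: detect large skips, replace them with 0, store them aside
    # for fixed-length coding (step 5).
    processed, large = [], []
    for pos, skip in enumerate(dism):
        if skip > skip_threshold:
            processed.append(0)
            large.append((pos, skip))
        else:
            processed.append(skip)
    return processed, large

# Hypothetical TWC vector with one long gap between nonzero coefficients.
twc = [5.0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -3.2, 0, 2.1]
processed, large = mic_prepare(twc, skip_threshold=8)
print(processed)  # [1, 0, 2]
print(large)      # [(1, 12)]
```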

The compression performance of the various reported encoding schemes and the proposed MIC schemes, namely Huffman coding of the difference of the ISM (HDISM) and Huffman coding of the processed difference of the ISM (HPDISM), is tested using more than 150 significance maps. The widely used encoding schemes for the compression of the significance map are the arithmetic coding of the binary significance map (ABSM), the Huffman coding of the 8 bits/element table (T8) version of the BSM (HT8BSM) and the Huffman coding of the run length encoded version of the BSM (HRBSM). For testing purposes, the significance map resulting from thresholding the wavelet coefficients vector of the ECG signal is expressed as the binary strings and the wavelet indexes of the entire wavelet subbands. In the first experiment, we consider signal blocks of 1024 and 2048 samples and the widely used mita records 107 and 117. For different values of percent retained energy (RE), the BSM and ISM are created, and these maps are then compressed using the reported encoding schemes and the proposed encoding schemes, respectively. The compression ratio calculated for each encoding scheme is shown in Fig. 3.11. It can be observed that for both signal block lengths the compression performance of the proposed HDISM and HPDISM schemes is better than that of the other encoding schemes, and the compression ratios achieved using HPDISM are higher than those of HDISM. In the second experiment, fifty significance maps are created from fifty records using a percent RE value of 99.5% and a signal block of 1024 samples, and the significance map of each signal block is encoded separately. Note that the difference sets obtained for these significance maps may have diverse skip values and various localized patterns. The compression performance of the encoding

[Figure 3.11: four panels (a)–(d) plotting compression ratio (2–10) against retained energy (0.990–1.0) for the ABSM, HT8BSM, HRBSM, HDISM and HPDISM encoding schemes.]

Figure 3.11: Compression performance of the various encoding schemes for the significance maps generated from the processed ECG signal with (a) 1024 samples taken from the mita record 107; (b) 1024 samples taken from the mita record 117; (c) 2048 samples taken from the mita record 107; and (d) 2048 samples taken from the mita record 117.

Table 3.7: Average compression performance of different encoding schemes for 150 significance maps.

Encoding scheme   N = 1024 samples                  N = 4096 samples
for SM            ACR ± σ       Atesm ± σ (ms)      ACR ± σ       Atesm ± σ (ms)
HT8BSM            1.89 ± 0.48   158.2 ± 33.83       2.11 ± 0.52   585 ± 97.7
HRBSM             2.40 ± 0.61   57.6 ± 22.03        3.19 ± 0.85   167.2 ± 78.1
HDISM             2.94 ± 0.95   60.8 ± 23.9         3.51 ± 1.03   202.9 ± 94.3
HPDISM            3.55 ± 1.11   47.73 ± 22.7        4.15 ± 1.20   153.2 ± 83.06

schemes for the different significance maps is shown in Fig. 3.12. These experiments show that the compression performance of the proposed HPDISM encoding scheme remains better under the varying distributions of the various significance maps.

The compression ratio and encoding time (tesm) are measured for the different significance map encoding schemes, averaged across the 150 significance maps, and the results are shown in Table 3.7. The standard

[Figure 3.12: compression ratio (0–8) plotted against significance map number (1–50) for the ABSM, HT8BSM, HRBSM, HDISM and HPDISM encoding schemes.]

Figure 3.12: Compression performance of the proposed encoding schemes for varying distributions of various significance maps.

deviation of the measured values is also summarized. Table 3.7 shows that HPDISM performs better than the other schemes, with the highest compression ratio and the lowest coding time. Especially remarkable is the increase in compression performance of the proposed HPDISM compared to the widely employed HT8BSM algorithm in ECG compression methods, or to the HRBSM algorithm used in transform based methods. By employing the proposed steps, encoding of the significance map becomes significantly faster and the compression ratio is higher than that of the other coding schemes employed for the significance map.

By exploiting the local distribution of the wavelet indexes, we have shown that an intuitive way of coding the difference set symbols results in a much more efficient compression of the significance map without increasing the coding time. We thus follow the encoding procedure of the HPDISM scheme described in this section to code the significance maps. The HPDISM is referred to as the modified index coder (MIC) in this thesis. The overall compression performance with the combined quantized nonzero wavelet coefficients vector and the processed DISM of the significance map will be discussed in the next Section.

3.2.5.1 Performance of the Modified Index Coding Scheme

In this section, tests are carried out using the most widely employed mita datasets and the frequently used signal durations. We compare the performance of the proposed encoder for significance maps derived from the frames. The performance of the proposed scheme is compared with the HT8BSM [137] and the arithmetic encoding of the significance map for 5.62-second and 60-second duration signals in Figs. 3.13 and 3.14, respectively. Dataset-I is used for testing. From Fig. 3.13(a), it can be observed that the proposed scheme requires fewer bits than the HT8BSM and arithmetic encodings. The arithmetic coding scheme requires fewer bits for 60-second duration signals than for the 5.62-second signal

[Figure 3.13: panel (a) code length (0–800) and panel (b) CR (0–20) over the MIT-BIH arrhythmia records 100, 101, 102, 103, 107, 109, 111, 115, 117, 118, 119 and their average, for the Proposed, Benzid and Arithmetic coders.]

Figure 3.13: Performance of the modified index coding scheme for 5.62-sec duration samples. (a) Comparison of codeword lengths of the proposed, Benzid and Arithmetic coding. (b) Compression results of the different encoding methods applied to code the significance map.

[Figure 3.14: panel (a) code length (0–6000) and panel (b) CR (0–30) over the MIT-BIH arrhythmia records 100, 101, 102, 103, 107, 109, 111, 115, 117, 118, 119 and their average, for the Proposed, Benzid and Arithmetic coders.]

Figure 3.14: Performance of the modified index coding scheme for 60-sec duration samples. (a) Comparison of codeword lengths of the proposed, Benzid and Arithmetic coding. (b) Compression results of the different coding methods.

since the length of the sequence is large. The results show that the performance of the proposed scheme is better than that of the others even for short-duration test signal blocks. The proposed entropy coder achieves high compression efficiency because there are fewer indexes in the detail subbands and the indexes follow an incremental pattern.

To show the coding gain, the proposed entropy coder is evaluated with the dataset-II records 104, 107, 111, 112, 115, 116, 117, 118, 119, 201, 207, 208, 209, 212, 213, 214, 228, 231 and 232 with different block lengths and EPE values. The overall performance of the proposed MIC is compared with the HT8BSM encoding of the binary significance map (BSM). The effectiveness of the coder is tested for different block lengths and the experimental results are shown in Fig. 3.15. It can be observed that the code length of the proposed MIC scheme is smaller than that of the HT8BSM encoding scheme for all the tested data records.

Due to the localization of the wavelet coefficients, the indices of the nonzero wavelet coefficients in each subband form an incremental pattern. Consequently, the integer significance map is encoded efficiently with fewer bits by exploiting the redundancy among the indexes of the nonzero wavelet coefficients through difference encoding, large-skip detection and Huffman encoding. Thus, the proposed modified index coder achieves high compression efficiency. This compression methodology will be used in the automatic data rate control algorithm and the quality control algorithm in the following Section.