Lip contour extraction using RGB color space and fuzzy c-means clustering


Vahid Ezzati Chahar Ghaleh
Department of Electrical Engineering, Shahed University
Tehran, Iran
[email protected]

Alireza Behrad
Department of Electrical Engineering, Shahed University
Tehran, Iran
[email protected]

Abstract- Lip contour extraction is a very important issue in visual speech recognition (lip reading) systems. To extract the lip contour, proper segmentation is needed. Many image segmentation approaches, such as colour segmentation (histogram-based and clustering-based), have been widely used in different areas. In this paper we use the RGB colour space and fuzzy C-means clustering for lip segmentation. In contrast to previous methods, we derive a simple feature from the RGB components and use it as input to the fuzzy C-means clustering algorithm for lip region extraction. The outputs of the clustering algorithm are then fed into an active contour model to obtain the final lip region. We tested the proposed algorithm on different images, and the results showed good segmentation for different speakers under different illumination.

Keywords- RGB color space, lip contour, fuzzy C-means clustering, active contour models

I. INTRODUCTION

Continuous lip motion information from speakers is useful in visual speech recognition (lip reading), especially in noisy environments. To improve the accuracy and robustness of an automatic visual speech recognition system, good lip segmentation is needed. Many researchers have attempted to develop accurate approaches for lip segmentation. The main problem is separating skin color from lip color: because skin and lips both contain a lot of red, it is very difficult to separate them during segmentation. Different image segmentation techniques have been proposed for lip segmentation. Among these approaches, color-based image segmentation has been widely used [1], [2].

Cheng et al. [xx] proposed a histogram-based segmentation technique that performs fuzzy partitioning on a two-dimensional histogram based on the maximum fuzzy entropy principle. In [3], Liew et al. used a novel spatial fuzzy C-means (FCM) clustering algorithm for the lip segmentation problem; they included the pixel coordinate information in the feature vectors. Lewis et al. [4] presented an algorithm named Red Exclusion that can effectively separate lip color from skin color. Many researchers have also used other color spaces as feature vectors, such as the 1976 CIELAB (L, a, b) and 1976 CIELUV (L, u, v) color spaces or the HSV color space, because they are relatively insensitive to variations in illumination. For instance, Eveno et al. [5] used a new color transformation named pseudo hue, and Liew et al. [3], [6] used the CIELAB and CIELUV color spaces as feature vectors for fuzzy clustering.

II. METHOD

A. Color space

There are different kinds of color spaces, such as RGB, HSV, CIELAB and CIELUV, to name a few. Color spaces are used for different purposes such as segmentation. For segmentation, color spaces such as CIELAB, CIELUV and HSV have been widely used because, as already noted, they are relatively insensitive to variations in illumination. However, they are nonlinear transformations of RGB and their computational cost is high. Also, as reported in [5], for certain talkers the hue values of skin and lips are very close, so thresholding cannot separate them efficiently.

In most vision systems the original images are in the RGB color format. The RGB color space has three components: the R-component (red), the G-component (green) and the B-component (blue). In RGB space, skin and lip pixels have quite different components. For both, red is prevalent. Moreover, there is more green than blue in the skin color mixture, whereas for lips these two components are almost equal. The difference between red and green is greater for lips than for skin. With this knowledge, and after comparing the three components and their histograms in the lip region, we found that the G-component in the lip region is lower than the R-component and also lower than the B-component, as shown in Figure 1 and Figure 2.

Fig. 1. RGB color components: R-component, G-component and B-component.

Fig. 2. Histograms of the R, G and B components for lip and skin regions. (a) Histogram for the face region. (b) Histogram for the lip region.

We found a simple relationship between the three components of the RGB color space that separates the skin region from the lip region. When we subtract the G-component from the R-component and from the B-component and then add the two results, as in equation 1, the resulting value f distinguishes lip from skin well. In effect, this operation excludes the green component. The equation for this approach is as follows:

$$ f = (f_r - f_g) + (f_b - f_g) \tag{1} $$

where $f_r$ is the R-component, $f_g$ is the G-component and $f_b$ is the B-component of the RGB image.
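As a minimal sketch of equation 1, the feature can be computed pixel by pixel. The function names, the nested-list image layout and the two sample pixels below are illustrative assumptions, not values or code from the experiments.

```python
def lip_feature(pixel):
    """Return f = (R - G) + (B - G) for one (R, G, B) pixel, as in equation 1."""
    r, g, b = pixel
    return (r - g) + (b - g)


def feature_image(rgb_image):
    """Apply lip_feature to every pixel of an image stored as rows of (R, G, B) tuples."""
    return [[lip_feature(px) for px in row] for row in rgb_image]


# Hypothetical sample values (not measured data): a lip-like pixel with
# G close to B, and a skin-like pixel with more green than blue.
lip_pixel = (170, 80, 90)
skin_pixel = (200, 150, 120)

f_lip = lip_feature(lip_pixel)
f_skin = lip_feature(skin_pixel)
f_row = feature_image([[lip_pixel, skin_pixel]])
```

With these sample values the lip-like pixel receives a clearly larger f than the skin-like pixel, which is the property the clustering stage relies on.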

When we apply a conventional threshold to the result of equation 1, good results are not obtained. For this reason we use a clustering method to achieve better results, as explained in the next section.

B. Fuzzy C-means clustering (FCM)

Fuzzy c-means (FCM) is a clustering method in which each data point may belong to two or more clusters with different degrees of membership. This method is frequently used in pattern recognition [7], [8].

The principle of this algorithm is the minimization of the following objective function:

$$ J_m = \sum_{i=1}^{N} \sum_{j=1}^{C} u_{ij}^{m} \, \| x_i - c_j \|^2, \qquad 1 \le m < \infty \tag{2} $$

where m is any real number greater than 1, $u_{ij}$ is the degree of membership of $x_i$ in cluster j, $x_i$ is the i-th of the N data points, $c_j$ is the center of the j-th cluster, C is the number of clusters, and $\|\cdot\|$ is any norm expressing the similarity between a data point and a cluster center.

Fuzzy partitioning is carried out through an iterative optimization of the objective function defined in eq. 2, with the memberships $u_{ij}$ and the cluster centers $c_j$ updated as follows:

$$ u_{ij} = \frac{1}{\sum_{k=1}^{C} \left( \dfrac{\| x_i - c_j \|}{\| x_i - c_k \|} \right)^{\frac{2}{m-1}}} \tag{3} $$

$$ c_j = \frac{\sum_{i=1}^{N} u_{ij}^{m} \, x_i}{\sum_{i=1}^{N} u_{ij}^{m}} \tag{4} $$

This iteration will stop when:

$$ \max_{ij} \left\{ \left| u_{ij}^{(k+1)} - u_{ij}^{(k)} \right| \right\} < \varepsilon \tag{5} $$

where $\varepsilon$ is the termination criterion, with a value between 0 and 1, and k is the iteration step. This procedure converges to a local minimum of $J_m$.

The stages of this algorithm are as follows:

1. Initialize the membership matrix $U = [u_{ij}]$ to $U^{(0)}$.

2. At the k-th step, calculate the center vectors $C^{(k)} = [c_j]$ using $U^{(k)}$ as follows:

$$ c_j = \frac{\sum_{i=1}^{N} u_{ij}^{m} \, x_i}{\sum_{i=1}^{N} u_{ij}^{m}} \tag{6} $$

3. Update $U^{(k)}$ to $U^{(k+1)}$ using the following equation:

$$ u_{ij} = \frac{1}{\sum_{k=1}^{C} \left( \dfrac{\| x_i - c_j \|}{\| x_i - c_k \|} \right)^{\frac{2}{m-1}}} \tag{7} $$

4. If $\| U^{(k+1)} - U^{(k)} \| < \varepsilon$ then stop; otherwise go to step 2.
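The four stages above can be sketched for scalar data as follows. This is an illustrative sketch rather than the implementation used in the experiments: the function names, the way $U^{(0)}$ is built from two seed centers at the data extremes, the default tolerance and the sample values are all assumptions made for the example.

```python
def update_memberships(data, centers, m):
    """Membership update: u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1))."""
    exponent = 2.0 / (m - 1.0)
    u = []
    for x in data:
        dists = [abs(x - c) for c in centers]
        zeros = [j for j, d in enumerate(dists) if d == 0.0]
        if zeros:
            # The point coincides with one or more centers: share full membership among them.
            row = [1.0 / len(zeros) if j in zeros else 0.0 for j in range(len(centers))]
        else:
            row = [1.0 / sum((d_j / d_k) ** exponent for d_k in dists) for d_j in dists]
        u.append(row)
    return u


def update_centers(data, u, m):
    """Center update: c_j = sum_i u_ij^m x_i / sum_i u_ij^m (assumes no empty cluster)."""
    centers = []
    for j in range(len(u[0])):
        weights = [row[j] ** m for row in u]
        centers.append(sum(w * x for w, x in zip(weights, data)) / sum(weights))
    return centers


def fcm_1d(data, n_clusters=2, m=2.0, eps=1e-5, max_iter=100):
    """Fuzzy c-means on scalar data; returns (centers, membership matrix)."""
    lo, hi = min(data), max(data)
    # Assumption: U(0) is derived from seed centers spread evenly over the data range.
    seeds = [lo + (hi - lo) * j / (n_clusters - 1) for j in range(n_clusters)]
    u = update_memberships(data, seeds, m)                    # stage 1
    centers = seeds
    for _ in range(max_iter):
        centers = update_centers(data, u, m)                  # stage 2
        new_u = update_memberships(data, centers, m)          # stage 3
        change = max(abs(a - b)
                     for new_row, old_row in zip(new_u, u)
                     for a, b in zip(new_row, old_row))
        u = new_u
        if change < eps:                                      # stage 4
            break
    return centers, u


# Hypothetical f values: four skin-like and three lip-like samples.
f_values = [18, 20, 22, 25, 95, 100, 104]
centers, memberships = fcm_1d(f_values)
```

The stopping test uses the largest absolute change of any membership, which corresponds to the criterion of equation 5; other matrix norms could be substituted in stage 4.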

As already mentioned, data are bound to each cluster by means of a membership function, which represents the fuzzy behavior of this algorithm. To do that, we build a matrix U whose elements are numbers between 0 and 1, where $u_{ij}$ represents the degree of membership of data point i in cluster j. Other properties of the matrix U are shown below:

$$ u_{ij} \in [0, 1], \quad \forall i, j \tag{8} $$

$$ \sum_{j=1}^{C} u_{ij} = 1, \quad \forall i \tag{9} $$

$$ 0 < \sum_{i=1}^{N} u_{ij} < N, \quad \forall j \tag{10} $$

For image data clustering, the FCM algorithm attempts to assign a membership value to each pixel so as to minimize a fuzzy entropy measure. Because clustering is an unsupervised learning method, in which neither prior assumptions about the underlying feature distribution nor training are needed, it is able to handle the variety of lip and skin colors caused by different ethnicities or make-up [9].

The FCM algorithm starts with initial values for the cluster centers, which are intended to mark the mean location of each cluster. These initial values are most likely incorrect. Additionally, FCM assigns every data point a membership grade for each cluster. By iteratively updating the cluster centers and the membership grades of each data point, FCM moves the cluster centers to the correct location within the data set. This iteration is based on minimizing an objective function that represents the distance from any given data point to a cluster center, weighted by that data point's membership grade.

In this work the f values obtained from eq. 1 are used as the input data (feature vector) of the fuzzy clustering algorithm. We used C = 2 (two clusters: lip region and non-lip region) and m = 2.

To obtain good segmentation results we used an adaptive threshold which is calculated using the following equation:

$$ T = \mu_f + \sigma_f + \left| \min(C_1, C_2) \right| \tag{11} $$

where $\mu_f$ and $\sigma_f$ are the mean and standard deviation of f, respectively, and $C_1$, $C_2$ are the centers of the clusters.

Finally, pixels with a value greater than this threshold and a lip membership greater than 0.5 are labeled as lip region. Figure 3 shows the result of the above steps.
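The thresholding rule above can be illustrated with a short sketch. The function names, the use of the population standard deviation, and the sample f values, memberships and centers are assumptions made for the example; they are not results from the experiments.

```python
from statistics import mean, pstdev


def adaptive_threshold(f_values, centers):
    """T = mean(f) + std(f) + |min(C1, C2)|; population std is an assumption."""
    return mean(f_values) + pstdev(f_values) + abs(min(centers))


def label_lip(f_values, lip_membership, centers, min_membership=0.5):
    """Mark a pixel as lip when f exceeds T and its lip membership exceeds 0.5."""
    t = adaptive_threshold(f_values, centers)
    return [f > t and u > min_membership
            for f, u in zip(f_values, lip_membership)]


# Hypothetical data: 18 skin-like pixels and 2 lip-like pixels.
f_values = [5] * 18 + [100] * 2
lip_membership = [0.01] * 18 + [0.99] * 2
centers = (5.0, 100.0)

threshold = adaptive_threshold(f_values, centers)
labels = label_lip(f_values, lip_membership, centers)
```

In this example the lip pixels are a small fraction of the data, so the mean and standard deviation stay well below the lip values and only the two lip-like pixels pass both tests.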

Fig. 3. Result of the segmentation algorithm. (a) RGB image filtered with our approach, (b) lip membership after applying the FCM algorithm, (c) binary image after adaptive thresholding.

C. Lip contour extraction

After segmenting the lip area, we can extract the lip boundaries. However, the boundaries obtained in the previous step are not smooth enough, and in some input images the contour does not match the true lip contour. To improve the accuracy of the algorithm we use an active contour model to extract the lip contour correctly. For this purpose, we first extract the boundaries of the segmented lip and then use them in each frame as the initial contour for the active contour model. Active contour models, or snakes [10], are based on contour energy minimization. The contour energy is the sum of three energies, the internal, external and constraint energies, which are defined by the following equations:

$$ E_{snake} = \sum_{i=1}^{N} E_{Internal}(i) + \sum_{i=1}^{N} E_{External}(i) + \sum_{i=1}^{N} E_{constraint}(i) \tag{12} $$

$$ E_{Internal}(i) = \alpha_i E_{continuity}(i) + \beta_i E_{Curvature}(i) \tag{13} $$

$$ E_{continuity}(i) = | v_i - v_{i-1} |^2 \tag{14} $$

$$ E_{Curvature}(i) = | (v_{i+1} - v_i) - (v_i - v_{i-1}) |^2 \tag{15} $$

$$ \sum_{i=1}^{N} E_{External}(i) = - \sum_{i=1}^{N} | \nabla I(x_i, y_i) | \tag{16} $$

where $v_i$ are the points on the contour, N is the number of contour points, and $(x_i, y_i)$ are the coordinates of point $v_i$.
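To make the energy terms above concrete, the following sketch evaluates the energy of a closed contour. It is only an illustration under stated assumptions: the contour is treated as closed (indices wrap around), the weights alpha and beta are constant rather than per point, the constraint energy is taken as zero because it is not defined here, and the gradient magnitude is supplied as a precomputed nested list indexed as grad_mag[y][x].

```python
def sq_norm(v):
    """Squared Euclidean length of a 2-D vector."""
    return v[0] * v[0] + v[1] * v[1]


def continuity_energy(prev, cur):
    """|v_i - v_(i-1)|^2."""
    return sq_norm((cur[0] - prev[0], cur[1] - prev[1]))


def curvature_energy(prev, cur, nxt):
    """|(v_(i+1) - v_i) - (v_i - v_(i-1))|^2."""
    return sq_norm((nxt[0] - 2 * cur[0] + prev[0],
                    nxt[1] - 2 * cur[1] + prev[1]))


def snake_energy(points, grad_mag, alpha=1.0, beta=1.0):
    """Total energy of a closed contour; the constraint term is assumed to be zero."""
    n = len(points)
    total = 0.0
    for i in range(n):
        prev, cur, nxt = points[i - 1], points[i], points[(i + 1) % n]
        internal = (alpha * continuity_energy(prev, cur)
                    + beta * curvature_energy(prev, cur, nxt))
        x, y = cur
        external = -grad_mag[y][x]  # strong edges lower the energy
        total += internal + external
    return total


# Hypothetical 5x5 gradient-magnitude maps and a square contour of (x, y) points.
flat = [[0] * 5 for _ in range(5)]
edges = [[1] * 5 for _ in range(5)]
square = [(1, 1), (3, 1), (3, 3), (1, 3)]

energy_flat = snake_energy(square, flat)
energy_edges = snake_energy(square, edges)
```

An optimizer would move each point to the neighbouring position that lowers this energy; that search is not shown here.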

Fig. 4 shows the results before and after applying the active contour model.

Fig. 4. Lip contour before and after applying the active contour model.

As can be seen in Fig. 4, after applying the active contour model, features such as the lip width and height or the contour points can be extracted correctly.

III. IMPLEMENTATION AND EXPERIMENTAL RESULTS

We implemented the proposed algorithm in MATLAB and tested it on different videos. The implemented algorithm for lip contour extraction in videos consists of the following steps:

Step1. Get the frames from the video.

Step2. Compute the f values for each frame and use them as the feature vector for the next step.

Step3. Apply the FCM algorithm to the image obtained in the previous step.

Step4. Extract the boundaries of the segmented lip.

Step5. Apply the active contour model to obtain the lip contour.
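The text does not state how the boundaries in Step 4 are extracted. One simple possibility, shown here purely as an assumption, is to keep every lip pixel of the binary mask that has at least one 4-connected neighbour outside the lip region (or outside the image). The function name and the sample mask are hypothetical.

```python
def extract_boundary(mask):
    """Return (x, y) of foreground pixels that touch the background or the image border."""
    rows, cols = len(mask), len(mask[0])
    boundary = []
    for y in range(rows):
        for x in range(cols):
            if not mask[y][x]:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                outside = not (0 <= ny < rows and 0 <= nx < cols)
                if outside or not mask[ny][nx]:
                    boundary.append((x, y))
                    break
    return boundary


# Hypothetical 5x5 binary mask with a 3x3 lip region in the centre.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
boundary = extract_boundary(mask)
```

The pixels are returned in raster order; ordering them along the contour, as an active contour model requires for its initial contour, would be a further step.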

Figure 5 shows the results of the different stages of the proposed algorithm. The first (left) column of Fig. 5 shows the original frames. The results of lip segmentation using equation 1 and the fuzzy c-means algorithm are shown in the second column. The third column shows the result of boundary extraction from the binary image of the second column. The fourth column shows the final lip contours after applying the active contour algorithm. As the figure shows, the final contours are smooth and accurate.

Fig. 5. Lip contour extraction. First column: RGB images. Second column: segmented lip image. Third column: lip boundaries obtained from the segmented image. Fourth column: lip contours after applying the active contour model.

IV. CONCLUSION

In this work, we used the RGB color space combined with fuzzy C-means clustering (FCM) and an active contour model for lip segmentation. First we found a simple relationship between the RGB color components that separates skin color from lip color, then applied the resulting feature vector to FCM, and finally used an active contour model for precise extraction of the lip contour. We tested the algorithm on different images and the results were promising. In future work we plan to extract features from the resulting lip contour for lip reading.

REFERENCES

[1] H. D. Cheng, Y. H. Chen, and X. H. Jiang, “Thresholding using two dimensional histogram and fuzzy entropy principle,” IEEE Trans. Image Processing, vol. 9, pp. 732–735, Apr. 2000.

[2] H. D. Cheng, J. R. Chen, and J. Li, “Threshold selection based on fuzzy C-partition entropy approach,” Pattern Recogn., vol. 31, no. 7, pp. 857–870, 1998.

[3] A. W. C. Liew, S. H. Leung and W. H. Lau, “Segmentation of color lip images by spatial fuzzy clustering,” IEEE Trans on Fuzzy Systems, vol.11, pp.542-549, August 2003.

[4] T. W. Lewis and D. M. Powers, “Lip feature extraction using red exclusion,” in Proc. Selected Papers from Pan-Sydney Workshop on Visual Information Processing, 2000, pp. 61-67.

[5] N. Eveno, A. Caplier, and P. Y. Coulon, “New color transformation for lips segmentation,” in Proc. IEEE 4th Workshop on Multimedia Signal Processing, Cannes, Oct. 2001, pp. 3-8.

[6] A.W.C. Liew, S.H. Leung, W.H. Lau, “Lip contour extraction using a deformable model,” In Proc. IEEE ICIP, Vol. 2, pp. 255-258, 2000.

[7] J. C. Dunn, “A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters,” Journal of Cybernetics, vol. 3, pp. 32-57, 1973.

[8] J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms. New York: Plenum Press, 1981.

[9] S. L. Wang, W. H. Lau, S. H. Leung, and A. W. C. Liew, “Lip segmentation with the presence of beards,” in Proc. IEEE ICASSP, 2004, pp. 529-532.

[10] M. Kass, A. Witkin, and D. Terzopoulos, “Snakes: Active contour models,” International Journal of Computer Vision, vol. 1, no. 4, pp. 321-331, January 1988.