
Proposed Method


Performance of any FER system mainly depends on the availability of the most discriminative features. Apparently, informative/active regions of a face provide the most discriminative features [47, 102, 103]. In Chapter 4, we proposed an efficient face model by extracting informative regions of a face. Our proposed face model consists of 54 facial landmark points, and features are extracted only from the salient/informative regions of a face. It was shown in our earlier work [101] that this face model outperforms several existing facial models [12, 14, 33]. Hence, we employ our informative region-based face model [101] to extract features from multi-view facial images. The proposed UMvDLPP method is elaborated in the following section.
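The exact descriptors of the face model in [101] are not reproduced here, but the general idea of turning informative-region landmark points into a feature vector can be sketched as follows. The pairwise-distance feature choice and all function names below are our own illustrative assumptions, not the thesis's actual features.

```python
import numpy as np

def landmark_features(landmarks):
    """Build a geometric feature vector from facial landmark points.

    `landmarks` is an (n_points, 2) array of (x, y) coordinates, e.g. the
    54 informative-region points of the face model in [101].  As a simple
    illustrative choice (not the descriptors used in the thesis), we take
    all pairwise Euclidean distances, normalized by the largest distance.
    """
    landmarks = np.asarray(landmarks, dtype=float)
    n = landmarks.shape[0]
    # Pairwise differences and distances between every pair of landmarks.
    diff = landmarks[:, None, :] - landmarks[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Keep the upper triangle (each pair once) and scale-normalize.
    iu = np.triu_indices(n, k=1)
    feats = dist[iu]
    return feats / feats.max()

# A 54-point face model yields 54*53/2 = 1431 pairwise-distance features.
f = landmark_features(np.random.rand(54, 2))
print(f.shape)  # (1431,)
```

A scale-normalized distance vector of this kind is invariant to translation and uniform scaling of the face, which is one reason landmark-based geometric features are popular for multi-view settings.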

Uncorrelated Multi-view Locality Preserving Projection

Figure 5.2: Overall representation of the proposed UMvDLPP method for multi-view facial expression recognition. (The figure depicts view-specific projections $\mathbf{V}_1, \ldots, \mathbf{V}_v$ mapping the 1st through $v$th views onto a correlated common space, which is then transformed into the uncorrelated common discriminative space.)

The proposed UMvDLPP method minimizes the local within-class scatter (both intra-view and inter-view) of the data space. Additionally, the proposed method maximizes the intra-view and inter-view local between-class scatter. This formulation is more suitable for multi-view facial expression recognition, as it can handle the multi-modal characteristics of multi-view observations (Figure 5.1). Finally, the proposed objective function of UMvDLPP is formulated as a trace ratio minimization problem, and so it can be represented as:

$$
\{\mathbf{V}_1, \mathbf{V}_2, \ldots, \mathbf{V}_v\}_{\mathrm{opt}}
= \arg\min_{\mathbf{V}_1, \mathbf{V}_2, \ldots, \mathbf{V}_v}
\frac{\operatorname{tr}\!\left(\mathbf{V}^T \mathbf{X} \mathbf{L}_{eq} \mathbf{X}^T \mathbf{V}\right)}
     {\operatorname{tr}\!\left(\mathbf{V}^T \mathbf{X} \mathbf{B}_{eq} \mathbf{X}^T \mathbf{V}\right)}
= \arg\min_{\mathbf{V}_1, \mathbf{V}_2, \ldots, \mathbf{V}_v}
\frac{\operatorname{tr}\!\left(\mathbf{V}^T \mathbf{P}_{eq} \mathbf{V}\right)}
     {\operatorname{tr}\!\left(\mathbf{V}^T \mathbf{Q}_{eq} \mathbf{V}\right)}
\tag{5.1}
$$

where $\mathbf{P}_{eq} = \mathbf{X}\mathbf{L}_{eq}\mathbf{X}^T$ and $\mathbf{Q}_{eq} = \mathbf{X}\mathbf{B}_{eq}\mathbf{X}^T$ are $Dv \times Dv$ matrices, which are represented as:

$$
\mathbf{P}_{eq} =
\begin{bmatrix}
\mathbf{P}_{11} & \cdots & \mathbf{P}_{1v} \\
\vdots & \ddots & \vdots \\
\mathbf{P}_{v1} & \cdots & \mathbf{P}_{vv}
\end{bmatrix}
\quad \text{and} \quad
\mathbf{Q}_{eq} =
\begin{bmatrix}
\mathbf{Q}_{11} & \cdots & \mathbf{Q}_{1v} \\
\vdots & \ddots & \vdots \\
\mathbf{Q}_{v1} & \cdots & \mathbf{Q}_{vv}
\end{bmatrix}
$$

The trace of each diagonal block of $\mathbf{P}_{eq}$, i.e., $\mathbf{P}_{kk}$, $k = 1, 2, \ldots, v$, represents the sum of the distances between the samples of the same class within the $k$th view (intra-view), whereas the trace of an off-diagonal block $\mathbf{P}_{kl}$ ($k \neq l$) gives the sum of the distances between same-class samples of two different views (inter-view), i.e., the $k$th and $l$th views. Mathematically, we define the block matrix $\mathbf{P}_{kl}$ as:

$$
\mathbf{P}_{kl} =
\begin{cases}
\mathbf{X}_k \mathbf{L}_k \mathbf{X}_k^T, & \text{if } k = l \\[1ex]
\begin{bmatrix} \mathbf{X}_k & \mathbf{X}_l \end{bmatrix}
\mathbf{L}_{kl}
\begin{bmatrix} \mathbf{X}_k & \mathbf{X}_l \end{bmatrix}^T, & \text{if } k \neq l
\end{cases}
\tag{5.2}
$$

where $\mathbf{L}_k$ is the intra-view Laplacian matrix, evaluated as $\mathbf{L}_k = \mathbf{D}_k - \mathbf{A}_k$. The elements of $\mathbf{A}_k$ (the similarity matrix for the $k$th view) are obtained by applying an RBF kernel to the $i$th and $j$th samples of the original data space $\mathbf{X}_k$. Here, the $(i, j)$th element of $\mathbf{A}_k$ is defined as follows:

$$
\mathbf{A}_{k,ij} =
\begin{cases}
\exp\!\left(-\dfrac{\|x_{ki} - x_{kj}\|^2}{\sigma^2}\right), & \text{if } c_i = c_j \\[1.5ex]
0, & \text{otherwise}
\end{cases}
\tag{5.3}
$$
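The class-aware RBF similarity of Eq. (5.3) and the Laplacian $\mathbf{L}_k = \mathbf{D}_k - \mathbf{A}_k$ can be sketched in a few lines. This is a minimal illustration under our own naming; the column-wise sample layout and default $\sigma$ are assumptions.

```python
import numpy as np

def similarity_and_laplacian(X_k, labels, sigma=1.0):
    """Class-aware RBF similarity A_k of Eq. (5.3) and Laplacian L_k.

    X_k    : (D, N) data matrix of the k-th view (columns are samples).
    labels : length-N array of class labels c_i.
    A_k[i, j] = exp(-||x_ki - x_kj||^2 / sigma^2) if c_i == c_j, else 0.
    """
    X = np.asarray(X_k, dtype=float)
    labels = np.asarray(labels)
    diff = X[:, :, None] - X[:, None, :]      # (D, N, N) pairwise differences
    sq_dist = (diff ** 2).sum(axis=0)         # squared Euclidean distances
    A = np.exp(-sq_dist / sigma ** 2)
    same_class = labels[:, None] == labels[None, :]
    A[~same_class] = 0.0                      # zero out cross-class entries
    # Intra-view Laplacian: degree matrix minus similarity matrix.
    L = np.diag(A.sum(axis=1)) - A
    return A, L

labels = np.array([0, 0, 1, 1])
A, L = similarity_and_laplacian(np.random.rand(3, 4), labels)
print(A.shape)  # (4, 4)
```

By construction, each row of $\mathbf{L}_k$ sums to zero, which is the defining property of a graph Laplacian.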

However, the inter-view Laplacian matrix $\mathbf{L}_{kl}$ is evaluated by using the joint observations of the facial features of the $k$th and $l$th views.
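Given the per-view data matrices and the intra- and inter-view Laplacians, the block matrix $\mathbf{P}_{eq}$ of Eq. (5.2) can be assembled as below. The function names and the dict-based Laplacian storage are our own; all views are assumed to share the same feature dimension $D$, consistent with the $Dv \times Dv$ size stated above.

```python
import numpy as np

def P_block(k, l, X_views, L_intra, L_inter):
    """One block P_kl of Eq. (5.2): intra-view form on the diagonal,
    joint-observation form off the diagonal."""
    if k == l:
        return X_views[k] @ L_intra[k] @ X_views[k].T
    X_joint = np.hstack([X_views[k], X_views[l]])   # [X_k  X_l], shape (D, N_k+N_l)
    return X_joint @ L_inter[(k, l)] @ X_joint.T    # shape (D, D)

def assemble_P_eq(X_views, L_intra, L_inter):
    """Stack the v x v blocks into the vD x vD matrix P_eq."""
    v = len(X_views)
    return np.block([[P_block(k, l, X_views, L_intra, L_inter)
                      for l in range(v)] for k in range(v)])

# Toy example: v = 2 views, D = 5 features, N_k = 4 samples per view.
# Identity Laplacians stand in for the real L_k and L_kl here.
rng = np.random.default_rng(0)
X = [rng.standard_normal((5, 4)) for _ in range(2)]
L_intra = {k: np.eye(4) for k in range(2)}
L_inter = {(0, 1): np.eye(8), (1, 0): np.eye(8)}
P_eq = assemble_P_eq(X, L_intra, L_inter)
print(P_eq.shape)  # (10, 10)
```

$\mathbf{Q}_{eq}$ is assembled from its blocks in exactly the same way, with Eq. (5.4) supplying each $\mathbf{Q}_{kl}$.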

Similarly, the diagonal blocks of $\mathbf{Q}_{eq}$, i.e., $\mathbf{Q}_{kk}$, represent the intra-view local between-class scatter matrices, whereas the off-diagonal blocks $\mathbf{Q}_{kl}$ ($k \neq l$) represent the inter-view local between-class scatter matrices. The inter-view $\mathbf{Q}_{kl}$ can be mathematically defined as follows:

$$
\mathbf{Q}_{kl} =
\begin{cases}
\dfrac{1}{2} \displaystyle\sum_{i,j=1}^{2N} B_{ij} \left(x_{vi} - x_{vj}\right)\left(x_{vi} - x_{vj}\right)^T, & v = k \text{ and } l,\; k \neq l \\[2.5ex]
\dfrac{1}{2} \displaystyle\sum_{i,j=1}^{N} B_{ij} \left(x_{vi} - x_{vj}\right)\left(x_{vi} - x_{vj}\right)^T, & v = k \text{ or } l,\; k = l
\end{cases}
\tag{5.4}
$$

where,

$$
B_{ij} =
\begin{cases}
\dfrac{A_{ij}}{n_c}, & \text{if } c_i = c_j \\[1.5ex]
0, & \text{if } c_i \neq c_j
\end{cases}
\tag{5.5}
$$
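The weight matrix of Eq. (5.5) is straightforward to compute from a similarity matrix and the class labels. One assumption in this sketch: we read $n_c$ as the number of samples in the class of sample $i$, which the surrounding text does not spell out.

```python
import numpy as np

def weight_matrix(A, labels):
    """Weights B_ij of Eq. (5.5): A_ij / n_c for same-class pairs,
    zero otherwise.  n_c is taken to be the size of the class of
    sample i (our reading of the thesis notation)."""
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    # n_c for each sample i: how many samples share its class label.
    counts = np.array([(labels == c).sum() for c in labels], dtype=float)
    return np.where(same, A / counts[:, None], 0.0)

B = weight_matrix(np.ones((4, 4)), [0, 0, 1, 1])
print(B[0, 1], B[0, 2])  # 0.5 0.0
```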

The proposed objective function of UMvDLPP is further modified into the form of a ratio trace problem. Since a ratio trace problem can be converted into a generalized eigenvalue equation, a globally optimal solution exists [111]. The ratio trace form of UMvDLPP can be written as follows:

$$
\{\mathbf{V}_1, \mathbf{V}_2, \ldots, \mathbf{V}_v\}_{\mathrm{opt}}
= \arg\min_{\mathbf{V}_1, \mathbf{V}_2, \ldots, \mathbf{V}_v}
\operatorname{tr}\!\left[\left(\mathbf{V}^T \mathbf{Q}_{eq} \mathbf{V}\right)^{-1}
\left(\mathbf{V}^T \mathbf{P}_{eq} \mathbf{V}\right)\right]
\tag{5.6}
$$

Then, the $d$th column of $\mathbf{V}$, i.e., $\mathbf{v}_d$, can be obtained by solving the following generalized eigenvalue equation:

$$
\mathbf{P}_{eq} \mathbf{v}_d = \lambda_d \mathbf{Q}_{eq} \mathbf{v}_d
\tag{5.7}
$$

where $\lambda_d$ is the $d$th lowest eigenvalue. Finally, the reduced feature vector in the common space is obtained by using the following transformation:

$$
\mathbf{Y}_k = \mathbf{V}_k^T \mathbf{X}_k; \quad k = 1, 2, \ldots, v
\tag{5.8}
$$

Although the obtained common space is quite discriminative, the different components of the reduced feature vectors may be correlated; hence, we call this space the Correlated Common Space (CCS), denoted by $\mathbf{Y}_{ccs}$. In our proposed approach, instead of classifying directly in the CCS, we first transform the features of the correlated common space $\mathbf{Y}_{ccs}$ to the Uncorrelated Common Space (UCS) $\mathbf{Y}_{ucs}$, and then classification is performed. The UCS is obtained from the CCS with the help of the transformation matrix $\mathbf{U} = [\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_d]$ [115]. The columns of $\mathbf{U}$ (eigenvectors) are essentially the solutions of the following generalized eigenvalue equation corresponding to the first $d$ lowest eigenvalues:

$$
\mathbf{Y}_{ccs}\left(\mathbf{L}_{ccs} + \mathbf{B}_{ccs}\right)\mathbf{Y}_{ccs}^T \mathbf{u}
= \lambda_{ccs} \mathbf{Y}_{ccs} \mathbf{G}_{ccs} \mathbf{Y}_{ccs}^T \mathbf{u}
\tag{5.9}
$$

where $\mathbf{L}_{ccs}$ and $\mathbf{B}_{ccs}$ are the Laplacian and between-class transformation matrices, respectively.

These matrices are obtained from the CCS. The matrix $\mathbf{G}_{ccs} = \mathbf{I} - (1/vN)\mathbf{e}\mathbf{e}^T$, where $\mathbf{I}$ is an identity matrix and $\mathbf{e} = (1, 1, \ldots, 1)^T$. Therefore, the transformed UCS can be obtained by the linear projection:

$$
\mathbf{Y}_{ucs} = \mathbf{U}^T \mathbf{Y}_{ccs}
\tag{5.10}
$$

Figure 5.3: A part of the BU3DFE dataset with 54 landmark points, shown at pan angles ranging from -45° to 45°. Locations of these facial points show the informative regions of a multi-view facial image, as suggested in [101].
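The training-time computation of Eqs. (5.7)–(5.8) reduces to a symmetric generalized eigenproblem. A minimal sketch with SciPy follows; the small toy matrices and the remark about regularizing $\mathbf{Q}_{eq}$ are our assumptions, not steps stated in the thesis.

```python
import numpy as np
from scipy.linalg import eigh

def common_space_projection(P_eq, Q_eq, d):
    """Solve P_eq v = lambda Q_eq v (Eq. 5.7) and keep the eigenvectors
    of the d lowest eigenvalues as the columns of V.

    scipy.linalg.eigh returns generalized eigenvalues in ascending
    order, so the first d columns correspond to the d lowest ones.
    Q_eq must be symmetric positive definite; in practice a small
    ridge may need to be added (our assumption, not a thesis step).
    """
    eigvals, eigvecs = eigh(P_eq, Q_eq)
    return eigvecs[:, :d]                 # V = [v_1, ..., v_d]

# Toy example with a symmetric PSD pair (D = 6, d = 3).
rng = np.random.default_rng(1)
M = rng.standard_normal((6, 6))
P = M @ M.T                               # symmetric PSD stand-in for P_eq
Q = np.eye(6)                             # trivially SPD stand-in for Q_eq
V = common_space_projection(P, Q, d=3)
X = rng.standard_normal((6, 10))
Y_ccs = V.T @ X                           # Eq. (5.8): project onto the CCS
print(Y_ccs.shape)  # (3, 10)
```

Eq. (5.9) has the same symmetric generalized form, so the same solver yields $\mathbf{U}$, and Eq. (5.10) is then a single matrix product.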

Finally, we use a kNN classifier for classification in the uncorrelated common discriminative space. The learning of kNN is straightforward. During inference, a test sample $\mathbf{x}_{\text{test}}^k$ of the $k$th view is first projected onto the CCS by using the learned view-specific transformation matrix $\mathbf{V}_k$, followed by a projection onto the UCS by using the transformation matrix $\mathbf{U}$. This sequence of projections can be represented as:

$$
\mathbf{y}_{\text{test}}^k = \mathbf{U}^T \left(\mathbf{V}_k^T \mathbf{x}_{\text{test}}^k\right)
\tag{5.11}
$$

The class label of the test sample $\mathbf{x}_{\text{test}}^k$ is obtained on the basis of the labels of the $k$ nearest samples of $\mathbf{y}_{\text{test}}^k$ in the uncorrelated common discriminative space.
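The whole inference step, Eq. (5.11) followed by the kNN vote, can be sketched as below. The function name and the Euclidean-distance/majority-vote choices are ours; the thesis only states that a plain kNN is used.

```python
import numpy as np

def classify_test_sample(x_test, V_k, U, Y_ucs_train, train_labels, k=5):
    """Project a k-th-view test sample through V_k and then U into the
    uncorrelated common discriminative space (Eq. 5.11) and label it by
    a majority vote over its k nearest training samples there."""
    y_test = U.T @ (V_k.T @ x_test)                       # Eq. (5.11)
    # Distances from the projected test sample to every training column.
    dists = np.linalg.norm(Y_ucs_train - y_test[:, None], axis=0)
    nearest = np.argsort(dists)[:k]
    votes = train_labels[nearest]
    classes, counts = np.unique(votes, return_counts=True)
    return classes[np.argmax(counts)]

# Toy check with identity projections: the closest cluster's label wins.
V_k, U = np.eye(2), np.eye(2)
Y_train = np.array([[0.0, 0.1, 5.0, 5.1],
                    [0.0, 0.1, 5.0, 5.1]])
labels = np.array([0, 0, 1, 1])
print(classify_test_sample(np.array([0.05, 0.05]),
                           V_k, U, Y_train, labels, k=2))  # 0
```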
