
Proposed Method


Performance of any FER system mainly depends on the availability of the most discriminative features. Apparently, informative/active regions of a face provide the most discriminative features [47, 102, 103]. In Chapter 4, we proposed an efficient face model by extracting informative regions of a face. Our proposed face model consists of 54 facial landmark points, and features are extracted only from the salient/informative regions of a face. It was shown in our earlier work [101] that this face model outperforms several existing facial models [12, 14, 33]. Hence, we employ our informative region-based face model [101] to extract features from multi-view facial images. The proposed UMvDLPP method is elaborated in the following section.
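The exact descriptors of the face model in [101] are not reproduced here, but the general idea of turning informative-region landmark points into a feature vector can be sketched as follows. The pairwise-distance feature choice and all function names below are our own illustrative assumptions, not the thesis's actual features.

```python
import numpy as np

def landmark_features(landmarks):
    """Build a geometric feature vector from facial landmark points.

    `landmarks` is an (n_points, 2) array of (x, y) coordinates, e.g. the
    54 informative-region points of the face model in [101].  As a simple
    illustrative choice (not the descriptors used in the thesis), we take
    all pairwise Euclidean distances, normalized by the largest distance.
    """
    landmarks = np.asarray(landmarks, dtype=float)
    n = landmarks.shape[0]
    # Pairwise differences and distances between every pair of landmarks.
    diff = landmarks[:, None, :] - landmarks[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Keep the upper triangle (each pair once) and scale-normalize.
    iu = np.triu_indices(n, k=1)
    feats = dist[iu]
    return feats / feats.max()

# A 54-point face model yields 54*53/2 = 1431 pairwise-distance features.
f = landmark_features(np.random.rand(54, 2))
print(f.shape)  # (1431,)
```

A scale-normalized distance vector of this kind is invariant to translation and uniform scaling of the face, which is one reason landmark-based geometric features are popular for multi-view settings.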

Uncorrelated Multi-view Locality Preserving Projection

Figure 5.2: Overall representation of the proposed UMvDLPP method for multi-view facial expression recognition. (The figure depicts view-specific projections $\mathbf{V}_1, \ldots, \mathbf{V}_v$ mapping the 1st through $v$th views onto a correlated common space, which is then transformed into the uncorrelated common discriminative space.)

The proposed UMvDLPP method minimizes the local within-class scatter (both intra-view and inter-view) of the data space. Additionally, the proposed method maximizes the intra-view and inter-view local between-class scatter. This formulation is more suitable for multi-view facial expression recognition, as it can handle the multi-modal characteristics of multi-view observations (Figure 5.1). Finally, the proposed objective function of UMvDLPP is formulated as a trace ratio minimization problem, and so it can be represented as:

$$
\{\mathbf{V}_1, \mathbf{V}_2, \ldots, \mathbf{V}_v\}_{\mathrm{opt}}
= \arg\min_{\mathbf{V}_1, \mathbf{V}_2, \ldots, \mathbf{V}_v}
\frac{\operatorname{tr}\!\left(\mathbf{V}^T \mathbf{X} \mathbf{L}_{eq} \mathbf{X}^T \mathbf{V}\right)}
     {\operatorname{tr}\!\left(\mathbf{V}^T \mathbf{X} \mathbf{B}_{eq} \mathbf{X}^T \mathbf{V}\right)}
= \arg\min_{\mathbf{V}_1, \mathbf{V}_2, \ldots, \mathbf{V}_v}
\frac{\operatorname{tr}\!\left(\mathbf{V}^T \mathbf{P}_{eq} \mathbf{V}\right)}
     {\operatorname{tr}\!\left(\mathbf{V}^T \mathbf{Q}_{eq} \mathbf{V}\right)}
\tag{5.1}
$$

where $\mathbf{P}_{eq} = \mathbf{X}\mathbf{L}_{eq}\mathbf{X}^T$ and $\mathbf{Q}_{eq} = \mathbf{X}\mathbf{B}_{eq}\mathbf{X}^T$ are $Dv \times Dv$ matrices, which are represented as:

$$
\mathbf{P}_{eq} =
\begin{bmatrix}
\mathbf{P}_{11} & \cdots & \mathbf{P}_{1v} \\
\vdots & \ddots & \vdots \\
\mathbf{P}_{v1} & \cdots & \mathbf{P}_{vv}
\end{bmatrix}
\quad \text{and} \quad
\mathbf{Q}_{eq} =
\begin{bmatrix}
\mathbf{Q}_{11} & \cdots & \mathbf{Q}_{1v} \\
\vdots & \ddots & \vdots \\
\mathbf{Q}_{v1} & \cdots & \mathbf{Q}_{vv}
\end{bmatrix}
$$

The trace of each diagonal block of $\mathbf{P}_{eq}$, i.e., $\mathbf{P}_{kk}$, $k = 1, 2, \ldots, v$, represents the sum of the distances between the samples of the same class within the $k$th view (intra-view), whereas the trace of an off-diagonal block $\mathbf{P}_{kl}$ ($k \neq l$) gives the sum of the distances between same-class samples of two different views (inter-view), i.e., the $k$th and $l$th views. Mathematically, we define the block matrix $\mathbf{P}_{kl}$ as:

$$
\mathbf{P}_{kl} =
\begin{cases}
\mathbf{X}_k \mathbf{L}_k \mathbf{X}_k^T, & \text{if } k = l \\[1ex]
\begin{bmatrix} \mathbf{X}_k & \mathbf{X}_l \end{bmatrix}
\mathbf{L}_{kl}
\begin{bmatrix} \mathbf{X}_k & \mathbf{X}_l \end{bmatrix}^T, & \text{if } k \neq l
\end{cases}
\tag{5.2}
$$

where $\mathbf{L}_k$ is the intra-view Laplacian matrix, evaluated as $\mathbf{L}_k = \mathbf{D}_k - \mathbf{A}_k$. The elements of $\mathbf{A}_k$ (the similarity matrix for the $k$th view) are obtained by applying an RBF kernel to the $i$th and $j$th samples of the original data space $\mathbf{X}_k$. Here, the $(i, j)$th element of $\mathbf{A}_k$ is defined as follows:

$$
\mathbf{A}_{k,ij} =
\begin{cases}
\exp\!\left(-\dfrac{\|x_{ki} - x_{kj}\|^2}{\sigma^2}\right), & \text{if } c_i = c_j \\[1.5ex]
0, & \text{otherwise}
\end{cases}
\tag{5.3}
$$
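The class-aware RBF similarity of Eq. (5.3) and the Laplacian $\mathbf{L}_k = \mathbf{D}_k - \mathbf{A}_k$ can be sketched in a few lines. This is a minimal illustration under our own naming; the column-wise sample layout and default $\sigma$ are assumptions.

```python
import numpy as np

def similarity_and_laplacian(X_k, labels, sigma=1.0):
    """Class-aware RBF similarity A_k of Eq. (5.3) and Laplacian L_k.

    X_k    : (D, N) data matrix of the k-th view (columns are samples).
    labels : length-N array of class labels c_i.
    A_k[i, j] = exp(-||x_ki - x_kj||^2 / sigma^2) if c_i == c_j, else 0.
    """
    X = np.asarray(X_k, dtype=float)
    labels = np.asarray(labels)
    diff = X[:, :, None] - X[:, None, :]      # (D, N, N) pairwise differences
    sq_dist = (diff ** 2).sum(axis=0)         # squared Euclidean distances
    A = np.exp(-sq_dist / sigma ** 2)
    same_class = labels[:, None] == labels[None, :]
    A[~same_class] = 0.0                      # zero out cross-class entries
    # Intra-view Laplacian: degree matrix minus similarity matrix.
    L = np.diag(A.sum(axis=1)) - A
    return A, L

labels = np.array([0, 0, 1, 1])
A, L = similarity_and_laplacian(np.random.rand(3, 4), labels)
print(A.shape)  # (4, 4)
```

By construction, each row of $\mathbf{L}_k$ sums to zero, which is the defining property of a graph Laplacian.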

However, the inter-view Laplacian matrix $\mathbf{L}_{kl}$ is evaluated by using the joint observations of the facial features of the $k$th and $l$th views.
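Given the per-view data matrices and the intra- and inter-view Laplacians, the block matrix $\mathbf{P}_{eq}$ of Eq. (5.2) can be assembled as below. The function names and the dict-based Laplacian storage are our own; all views are assumed to share the same feature dimension $D$, consistent with the $Dv \times Dv$ size stated above.

```python
import numpy as np

def P_block(k, l, X_views, L_intra, L_inter):
    """One block P_kl of Eq. (5.2): intra-view form on the diagonal,
    joint-observation form off the diagonal."""
    if k == l:
        return X_views[k] @ L_intra[k] @ X_views[k].T
    X_joint = np.hstack([X_views[k], X_views[l]])   # [X_k  X_l], shape (D, N_k+N_l)
    return X_joint @ L_inter[(k, l)] @ X_joint.T    # shape (D, D)

def assemble_P_eq(X_views, L_intra, L_inter):
    """Stack the v x v blocks into the vD x vD matrix P_eq."""
    v = len(X_views)
    return np.block([[P_block(k, l, X_views, L_intra, L_inter)
                      for l in range(v)] for k in range(v)])

# Toy example: v = 2 views, D = 5 features, N_k = 4 samples per view.
# Identity Laplacians stand in for the real L_k and L_kl here.
rng = np.random.default_rng(0)
X = [rng.standard_normal((5, 4)) for _ in range(2)]
L_intra = {k: np.eye(4) for k in range(2)}
L_inter = {(0, 1): np.eye(8), (1, 0): np.eye(8)}
P_eq = assemble_P_eq(X, L_intra, L_inter)
print(P_eq.shape)  # (10, 10)
```

$\mathbf{Q}_{eq}$ is assembled from its blocks in exactly the same way, with Eq. (5.4) supplying each $\mathbf{Q}_{kl}$.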

Similarly, the diagonal blocks of $\mathbf{Q}_{eq}$, i.e., $\mathbf{Q}_{kk}$, represent the intra-view local between-class scatter matrices, whereas the off-diagonal blocks $\mathbf{Q}_{kl}$ ($k \neq l$) represent the inter-view local between-class scatter matrices. The inter-view $\mathbf{Q}_{kl}$ can be mathematically defined as follows:

$$
\mathbf{Q}_{kl} =
\begin{cases}
\dfrac{1}{2} \displaystyle\sum_{i,j=1}^{2N} B_{ij} \left(x_{vi} - x_{vj}\right)\left(x_{vi} - x_{vj}\right)^T, & v = k \text{ and } l,\; k \neq l \\[2.5ex]
\dfrac{1}{2} \displaystyle\sum_{i,j=1}^{N} B_{ij} \left(x_{vi} - x_{vj}\right)\left(x_{vi} - x_{vj}\right)^T, & v = k \text{ or } l,\; k = l
\end{cases}
\tag{5.4}
$$

where,

$$
B_{ij} =
\begin{cases}
\dfrac{A_{ij}}{n_c}, & \text{if } c_i = c_j \\[1.5ex]
0, & \text{if } c_i \neq c_j
\end{cases}
\tag{5.5}
$$
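The weight matrix of Eq. (5.5) is straightforward to compute from a similarity matrix and the class labels. One assumption in this sketch: we read $n_c$ as the number of samples in the class of sample $i$, which the surrounding text does not spell out.

```python
import numpy as np

def weight_matrix(A, labels):
    """Weights B_ij of Eq. (5.5): A_ij / n_c for same-class pairs,
    zero otherwise.  n_c is taken to be the size of the class of
    sample i (our reading of the thesis notation)."""
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    # n_c for each sample i: how many samples share its class label.
    counts = np.array([(labels == c).sum() for c in labels], dtype=float)
    return np.where(same, A / counts[:, None], 0.0)

B = weight_matrix(np.ones((4, 4)), [0, 0, 1, 1])
print(B[0, 1], B[0, 2])  # 0.5 0.0
```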

The proposed objective function of UMvDLPP is further modified into the form of a ratio trace problem. Since a ratio trace problem can be converted into a generalized eigenvalue equation, a globally optimal solution exists [111]. The ratio trace form of UMvDLPP can be written as follows:

$$
\{\mathbf{V}_1, \mathbf{V}_2, \ldots, \mathbf{V}_v\}_{\mathrm{opt}}
= \arg\min_{\mathbf{V}_1, \mathbf{V}_2, \ldots, \mathbf{V}_v}
\operatorname{tr}\!\left[\left(\mathbf{V}^T \mathbf{Q}_{eq} \mathbf{V}\right)^{-1}
\left(\mathbf{V}^T \mathbf{P}_{eq} \mathbf{V}\right)\right]
\tag{5.6}
$$

Then, the $d$th column of $\mathbf{V}$, i.e., $\mathbf{v}_d$, can be obtained by solving the following generalized eigenvalue equation:

$$
\mathbf{P}_{eq} \mathbf{v}_d = \lambda_d \mathbf{Q}_{eq} \mathbf{v}_d
\tag{5.7}
$$

where $\lambda_d$ is the $d$th lowest eigenvalue. Finally, the reduced feature vector in the common space is obtained by using the following transformation:

$$
\mathbf{Y}_k = \mathbf{V}_k^T \mathbf{X}_k; \quad k = 1, 2, \ldots, v
\tag{5.8}
$$

Although the obtained common space is quite discriminative, the different components of the reduced feature vectors may be correlated; hence, we call this space the Correlated Common Space (CCS), denoted by $\mathbf{Y}_{ccs}$. In our proposed approach, instead of classifying directly in the CCS, we first transform the features of the correlated common space $\mathbf{Y}_{ccs}$ to the Uncorrelated Common Space (UCS) $\mathbf{Y}_{ucs}$, and then classification is performed. The UCS is obtained from the CCS with the help of the transformation matrix $\mathbf{U} = [\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_d]$ [115]. The columns of $\mathbf{U}$ (eigenvectors) are essentially the solutions of the following generalized eigenvalue equation corresponding to the first $d$ lowest eigenvalues:

$$
\mathbf{Y}_{ccs}\left(\mathbf{L}_{ccs} + \mathbf{B}_{ccs}\right)\mathbf{Y}_{ccs}^T \mathbf{u}
= \lambda_{ccs} \mathbf{Y}_{ccs} \mathbf{G}_{ccs} \mathbf{Y}_{ccs}^T \mathbf{u}
\tag{5.9}
$$

where $\mathbf{L}_{ccs}$ and $\mathbf{B}_{ccs}$ are the Laplacian and between-class transformation matrices, respectively.

These matrices are obtained from the CCS. The matrix $\mathbf{G}_{ccs} = \mathbf{I} - (1/vN)\mathbf{e}\mathbf{e}^T$, where $\mathbf{I}$ is an identity matrix and $\mathbf{e} = (1, 1, \ldots, 1)^T$. Therefore, the transformed UCS can be obtained by the linear projection:

$$
\mathbf{Y}_{ucs} = \mathbf{U}^T \mathbf{Y}_{ccs}
\tag{5.10}
$$

Figure 5.3: A part of the BU3DFE dataset with 54 landmark points, shown at pan angles ranging from -45° to 45°. Locations of these facial points show the informative regions of a multi-view facial image, as suggested in [101].
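The training-time computation of Eqs. (5.7)–(5.8) reduces to a symmetric generalized eigenproblem. A minimal sketch with SciPy follows; the small toy matrices and the remark about regularizing $\mathbf{Q}_{eq}$ are our assumptions, not steps stated in the thesis.

```python
import numpy as np
from scipy.linalg import eigh

def common_space_projection(P_eq, Q_eq, d):
    """Solve P_eq v = lambda Q_eq v (Eq. 5.7) and keep the eigenvectors
    of the d lowest eigenvalues as the columns of V.

    scipy.linalg.eigh returns generalized eigenvalues in ascending
    order, so the first d columns correspond to the d lowest ones.
    Q_eq must be symmetric positive definite; in practice a small
    ridge may need to be added (our assumption, not a thesis step).
    """
    eigvals, eigvecs = eigh(P_eq, Q_eq)
    return eigvecs[:, :d]                 # V = [v_1, ..., v_d]

# Toy example with a symmetric PSD pair (D = 6, d = 3).
rng = np.random.default_rng(1)
M = rng.standard_normal((6, 6))
P = M @ M.T                               # symmetric PSD stand-in for P_eq
Q = np.eye(6)                             # trivially SPD stand-in for Q_eq
V = common_space_projection(P, Q, d=3)
X = rng.standard_normal((6, 10))
Y_ccs = V.T @ X                           # Eq. (5.8): project onto the CCS
print(Y_ccs.shape)  # (3, 10)
```

Eq. (5.9) has the same symmetric generalized form, so the same solver yields $\mathbf{U}$, and Eq. (5.10) is then a single matrix product.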

Finally, we use a kNN classifier for classification in the uncorrelated common discriminative space. The learning of kNN is straightforward. During inference, a test sample $\mathbf{x}_{\text{test}}^k$ of the $k$th view is first projected onto the CCS by using the learned view-specific transformation matrix $\mathbf{V}_k$, followed by a projection onto the UCS by using the transformation matrix $\mathbf{U}$. This sequence of projections can be represented as:

$$
\mathbf{y}_{\text{test}}^k = \mathbf{U}^T \left(\mathbf{V}_k^T \mathbf{x}_{\text{test}}^k\right)
\tag{5.11}
$$

The class label of the test sample $\mathbf{x}_{\text{test}}^k$ is obtained on the basis of the labels of the $k$ nearest samples of $\mathbf{y}_{\text{test}}^k$ in the uncorrelated common discriminative space.
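The whole inference step, Eq. (5.11) followed by the kNN vote, can be sketched as below. The function name and the Euclidean-distance/majority-vote choices are ours; the thesis only states that a plain kNN is used.

```python
import numpy as np

def classify_test_sample(x_test, V_k, U, Y_ucs_train, train_labels, k=5):
    """Project a k-th-view test sample through V_k and then U into the
    uncorrelated common discriminative space (Eq. 5.11) and label it by
    a majority vote over its k nearest training samples there."""
    y_test = U.T @ (V_k.T @ x_test)                       # Eq. (5.11)
    # Distances from the projected test sample to every training column.
    dists = np.linalg.norm(Y_ucs_train - y_test[:, None], axis=0)
    nearest = np.argsort(dists)[:k]
    votes = train_labels[nearest]
    classes, counts = np.unique(votes, return_counts=True)
    return classes[np.argmax(counts)]

# Toy check with identity projections: the closest cluster's label wins.
V_k, U = np.eye(2), np.eye(2)
Y_train = np.array([[0.0, 0.1, 5.0, 5.1],
                    [0.0, 0.1, 5.0, 5.1]])
labels = np.array([0, 0, 1, 1])
print(classify_test_sample(np.array([0.05, 0.05]),
                           V_k, U, Y_train, labels, k=2))  # 0
```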
