Methodology - Randomized Graph-based Hamming Embedding (RGHE)

3) Spectral-Transform-based Methods

3.3 Randomized Graph-based Hamming Embedding (RGHE)

3.3.1 Methodology

RGHE first constructs a set of minutia vicinities that describe the nearest-neighbor structure in Euclidean space. Each minutia vicinity is then decomposed into (4) minutiae triplets, and a set of geometrically invariant features is derived from each triplet. The invariant features are then projected onto a random subspace determined by an externally-derived

identical to the Minutia vicinity Decomposition and Two-Dimensional Random Projection, which are presented in Section 3.2. Thereafter, the randomized minutia vicinity decomposition features (RMVD) are transformed using Graph-based Hamming Embedding to preserve the neighborhood structure of RMVD. The block diagram of RGHE is presented in Fig. 3.6.

3.3.1.1 Minutia Vicinity Decomposition (MVD) and Randomizing MVD

These two processes are identical to the minutia vicinity decomposition (MVD) and two-dimensional random projection described in Section 3.2, so they are not repeated in this section.

3.3.1.2 Graph-based Hamming Embedding (GHE)

The trained RMVD Ω ∈ ℝ^(N×36) consists of a set of N minutia vicinities u ∈ ℝ³⁶ of a fingerprint image in Euclidean space. Let G = {U, W} be a weighted graph with vertex set U, where |U| = N, and weight matrix W ∈ ℝ^(N×N). Each element w_ij of W denotes the global similarity of the vertex pair (u_i, u_j), measured by w_ij = exp(−||u_i − u_j||² / σ²), where σ is the bandwidth of the heat kernel (Belkin & Niyogi, 2003). Our objective is to find a mapping function φ such that the resulting m-component features preserve the Euclidean neighborhood structure of the minutia vicinities.
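As an illustration of the weight matrix defined above, the following minimal NumPy sketch computes W from a feature matrix. The function name, the toy data, and the value of σ are assumptions made for this example and are not taken from the original work.

```python
import numpy as np

def heat_kernel_weights(U, sigma):
    """Return W with W[i, j] = exp(-||u_i - u_j||^2 / sigma^2)."""
    U = np.asarray(U, dtype=float)
    sq_norms = np.sum(U * U, axis=1)
    # ||u_i - u_j||^2 = ||u_i||^2 + ||u_j||^2 - 2 <u_i, u_j>
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (U @ U.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negative round-off
    return np.exp(-sq_dists / sigma**2)

# Toy stand-in for an RMVD matrix: N = 5 vicinities, 36 features each.
rng = np.random.default_rng(0)
omega = rng.normal(size=(5, 36))
W = heat_kernel_weights(omega, sigma=8.0)  # sigma is an illustrative choice
```

The result is a symmetric N × N matrix with ones on the diagonal, since every vicinity has zero distance to itself.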

This problem can be formulated by solving the following optimization problem (Weiss et al., 2009):

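\[
\min_{\varphi} \int W(\mathbf{u}_i, \mathbf{u}_j)\,\lVert \varphi(\mathbf{u}_i) - \varphi(\mathbf{u}_j) \rVert^2\, p(\mathbf{u}_i)\, p(\mathbf{u}_j)\, d\mathbf{u}_i\, d\mathbf{u}_j \tag{3.7}
\]

subject to

\[
\varphi(\mathbf{u}) \in \{-1, 1\}^m, \qquad \int \varphi(\mathbf{u})\, p(\mathbf{u})\, d\mathbf{u} = 0, \qquad \int \varphi(\mathbf{u})\, \varphi(\mathbf{u})^T p(\mathbf{u})\, d\mathbf{u} = I,
\]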

where p(u) is the probability distribution of u.

The second constraint, ∫φ(u)p(u)du = 0, requires each bit of the resulting binary code to take either value with probability 0.5, and the third constraint, ∫φ(u)φ(u)^T p(u)du = I, requires the bits to be uncorrelated. Although the optimization problem in eq. (3.7) is NP-hard under the first constraint, φ(u) ∈ {−1, 1}^m, analytical solutions become available by applying spectral relaxation, such as the eigenfunctions of the weighted Laplace-Beltrami operators defined on manifolds (Belkin & Niyogi, 2003).

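Specifically, let L_p be a weighted Laplacian operator that maps a function φ to ψ = L_p φ by

\[
\frac{\psi(\mathbf{u})}{\varphi(\mathbf{u})} = D(\mathbf{u})\,\varphi(\mathbf{u})\,p(\mathbf{u}) - \int_{\mathbf{s}} W(\mathbf{s}, \mathbf{u})\,\varphi(\mathbf{s})\,p(\mathbf{s})\,d\mathbf{s}, \qquad D(\mathbf{u}) = \int_{\mathbf{s}} W(\mathbf{u}, \mathbf{s})\,p(\mathbf{s})\,d\mathbf{s}.
\]

The solutions to the minimization problem in eq. (3.7) are therefore the eigenfunctions ξ that satisfy L_p ξ = βξ for a real-valued β.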
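For a finite sample, the discrete counterpart of this relaxation (following the formulation in Weiss et al., 2009) takes the eigenvectors of the graph Laplacian D − W with the smallest non-zero eigenvalues. The sketch below illustrates the idea on toy data; the function name, the toy points, and the kernel bandwidth are assumptions for illustration, and the graph is assumed to be connected so that only one eigenvalue is zero.

```python
import numpy as np

def relaxed_embedding(W, m):
    """Return the m smallest non-trivial eigenpairs of the graph Laplacian D - W."""
    W = np.asarray(W, dtype=float)
    laplacian = np.diag(W.sum(axis=1)) - W
    eigvals, eigvecs = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    # Skip the first eigenvector: it is constant and has eigenvalue 0.
    return eigvals[1:m + 1], eigvecs[:, 1:m + 1]

# Toy data: 8 points in the unit square with heat-kernel weights (bandwidth 2).
rng = np.random.default_rng(1)
points = rng.uniform(size=(8, 2))
diff = points[:, None, :] - points[None, :, :]
W = np.exp(-np.sum(diff**2, axis=2) / 2.0**2)

beta, Y = relaxed_embedding(W, m=3)
bits = np.where(Y >= 0, 1, -1)  # thresholding recovers a code in {-1, 1}
```

The columns of Y sum to zero and are mutually orthonormal, which mirrors the balance and uncorrelatedness constraints of eq. (3.7) up to scaling.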

To solve the above problem, two assumptions are made: 1) p(u) is a separable distribution; 2) each input feature is drawn from a uniform distribution. Under the first assumption, the eigenfunctions of L_p have an outer-product form. The "outer-product" eigenfunctions are simply products of eigenfunctions along the individual dimensions, and their eigenvalue is the product of the eigenvalues of these dimensions. Therefore, the first assumption implies that we may construct an eigenfunction ξ(u) of L_p as a product of 36 single-dimensional eigenfunctions, ξ(u) = ∏_{i=1}^{36} ξ(u_i), one for each feature. The second assumption allows us to select the following functions as the single-dimensional eigenfunctions of the single-dimensional Laplacian L_p for small ε, a case that is well studied in mathematics (Weiss et al., 2009):

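\[
\xi_k(x) = \sin\!\left(\frac{\pi}{2} + \frac{k\pi}{b - a}\,x\right) \tag{3.8}
\]

\[
\beta_k = 1 - e^{-\frac{\epsilon^2}{2}\left|\frac{k\pi}{b - a}\right|^2} \tag{3.9}
\]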

where x is an arbitrary single-dimensional real feature uniformly distributed over the range [a, b], and β_k is the eigenvalue corresponding to ξ_k(x), which serves as the criterion for selecting eigenfunctions for the GHE mapping. The assumption of uniformly distributed data may not hold in practice. Nevertheless, the experimental results show that eq. (3.8) works well on RMVD even though the experimental data might not be uniformly distributed.
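Eqs. (3.8) and (3.9) translate directly into code. The following is a minimal sketch; the function names and the example values of a, b, and ε are assumptions chosen for illustration.

```python
import numpy as np

def eigenfunction(x, k, a, b):
    """Eq. (3.8): xi_k(x) = sin(pi/2 + k*pi/(b - a) * x)."""
    return np.sin(np.pi / 2 + (k * np.pi / (b - a)) * np.asarray(x, dtype=float))

def eigenvalue(k, a, b, eps):
    """Eq. (3.9): beta_k = 1 - exp(-(eps^2 / 2) * |k*pi/(b - a)|^2)."""
    return 1.0 - np.exp(-(eps**2 / 2.0) * np.abs(k * np.pi / (b - a)) ** 2)

a, b, eps = -2.0, 2.0, 1.0  # illustrative range and kernel width
x = np.linspace(a, b, 9)
betas = np.array([eigenvalue(k, a, b, eps) for k in range(4)])
xi_1 = eigenfunction(x, 1, a, b)
```

For k = 0 the eigenfunction is the constant sin(π/2) = 1 and its eigenvalue is zero, so it carries no information about x; the eigenvalues then grow with k.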

From the above description, a two-step algorithm can be derived: 1) Principal Component Analysis (PCA) alignment; 2) eigenfunction selection.

Here, we take a multi-dimensional Gaussian as the distribution function p(u) defined in eq. (3.7). This choice exploits a convenient property of the Gaussian distribution: it can be made separable simply by rotating the data into alignment with the axes, which motivates the use of PCA. It is important to note that PCA in GHE serves only to align the data and not to reduce dimensionality, as it does in (Ratha et al., 2007; Ferrara et al., 2012). Therefore, the inversion issue (privacy breach) of back-projection in (Ratha et al., 2007; Ferrara et al., 2012) and the other projection-based techniques (Ratha et al., 2007) would not arise in our case.
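The alignment step can be sketched as a full-rank rotation onto the principal directions, with no components discarded. The function names and the toy data below are assumptions for illustration; the symbol R follows the projection matrix named in the text.

```python
import numpy as np

def fit_alignment(X):
    """Return a d x d rotation R whose columns are the principal directions of X."""
    X = np.asarray(X, dtype=float)
    cov = np.cov(X, rowvar=False)
    _, eigvecs = np.linalg.eigh(cov)  # orthonormal eigenvectors, ascending variance
    return eigvecs[:, ::-1]           # reorder by decreasing variance

def align(X, R):
    """Rotate the data; the dimensionality is unchanged."""
    return np.asarray(X, dtype=float) @ R

# Correlated toy data standing in for 36-dimensional RMVD features.
rng = np.random.default_rng(2)
mixing = rng.normal(size=(36, 36))
X = rng.normal(size=(200, 36)) @ mixing
R = fit_alignment(X)
Y = align(X, R)
cov_Y = np.cov(Y, rowvar=False)
```

After alignment the sample covariance of Y is diagonal, which is the sense in which the rotated features become separable under the Gaussian assumption.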

The second step is to compute m eigenfunctions using ξ_i(y) = ∏_{j=1}^{36} ξ_i(y_j) for i = 1, …, m according to eq. (3.8), where y is the 36-dimensional PCA-aligned data. This is done by evaluating k eigenvalues for each of the 36 PCA directions using eq. (3.9) and sorting the resulting 36k eigenvalues in ascending order. After discarding the eigenfunctions with zero eigenvalue, we select the m eigenfunctions with the smallest eigenvalues from those remaining to form an m-component feature vector. The same process is repeated for all N minutia vicinities of an image. Finally, an N × m real-valued feature matrix is obtained.
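The selection step can be sketched as follows. This sketch treats each selected eigenfunction as a single-dimension eigenfunction of one PCA direction, which coincides with the product form when every other direction is in its constant k = 0 mode (where eq. (3.8) evaluates to 1). That reading follows Weiss et al. (2009) and is an assumption about the intended procedure, as are the function names, the value of ε, the choice k_max = 4, and taking a and b as the per-dimension minimum and maximum of the aligned data.

```python
import numpy as np

def select_eigenfunctions(a, b, m, k_max, eps=1.0):
    """Return (dims, modes, eigenvalues) for the m smallest non-zero eigenvalues."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    modes = np.arange(1, k_max + 1)  # k = 0 (zero eigenvalue) is discarded
    freq = modes[None, :] * np.pi / (b - a)[:, None]  # shape (d, k_max)
    beta = 1.0 - np.exp(-(eps**2 / 2.0) * freq**2)    # eq. (3.9)
    order = np.argsort(beta, axis=None, kind="stable")[:m]
    dims, mode_idx = np.unravel_index(order, beta.shape)
    return dims, modes[mode_idx], beta[dims, mode_idx]

def embed(Y, a, b, dims, ks):
    """Evaluate eq. (3.8) for each selected (dimension, mode) pair; returns N x m."""
    Y = np.asarray(Y, dtype=float)
    freq = ks * np.pi / (b[dims] - a[dims])
    return np.sin(np.pi / 2 + Y[:, dims] * freq)

# Toy PCA-aligned data: 50 vicinities, 36 directions with decreasing spread.
rng = np.random.default_rng(3)
Y = rng.normal(size=(50, 36)) * np.linspace(3.0, 0.5, 36)
a, b = Y.min(axis=0), Y.max(axis=0)
dims, ks, beta = select_eigenfunctions(a, b, m=16, k_max=4)
F = embed(Y, a, b, dims, ks)  # N x m real-valued feature matrix
```

Because the eigenvalue grows with k/(b − a), directions with a wider range are selected first and may contribute several modes before a narrow direction contributes any.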

GHE is applied to every individual randomized MVD in both the enrollment and verification stages. The parameters that must be stored during the enrollment stage are the range of Y (a and b) and the projection matrix R used for data alignment.

Note that GHE derives an m-component feature vector from each 36-dimensional RMVD feature vector of a minutia vicinity. By considering all the minutia vicinities, a GHE-extracted real-valued template of size N × m can be obtained. The feature vector may have 16, 32, 64, 128 or 256 components, and the corresponding real-valued template then has the size N × 16, N × 32, N × 64, N × 128 or N × 256, respectively. Algorithm 3.1 presents the detailed flow of the proposed GHE method.

Algorithm 3.1. Graph-based Hamming Embedding (GHE)
