2.2 Silhouette image based methods
2.2.5 Moments and moment invariants
Moments are region-based descriptors in which all the pixels within a shape region are taken into account to obtain the shape representation [63, 97]. The moments extract a statistical description of the pixels in the shape region [63]. Moment functions allow the derivation of moment invariants that are robust to geometric transformations and less sensitive to shape defects [98].
Moments can be defined as the projection of a given function onto the polynomials that form a basis set [99]. The polynomials can be orthogonal or non-orthogonal; accordingly, the moments are categorised as non-orthogonal moments and orthogonal moments [99]. The simplest and most widely used non-orthogonal moments in hand posture description are the geometric moment invariants.
Consider $f(x,y)$ to represent a binary image of size $(N+1) \times (M+1)$, such that $x \in \{0, 1, \dots, N\}$ and $y \in \{0, 1, \dots, M\}$. The function $f(x,y)$ takes the value one inside the shape region and zero elsewhere. The geometric moment of order $(n+m)$ representing the image is defined as
$$G_{nm} = \sum_{x=0}^{N} \sum_{y=0}^{M} x^{n} y^{m} f(x,y), \qquad n, m = 0, 1, 2, \dots \tag{2.11}$$
Using non-linear combinations of the lower-order geometric moments, Hu [100] derived a set of moment invariants, called geometric moment invariants, that are invariant under image scaling, translation and rotation.
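As a concrete illustration, the following Python sketch (assuming NumPy; the function names geometric_moment and hu_invariants are ours, not from the cited works) computes the raw moments of (2.11) and the first three of Hu's seven invariants from normalised central moments:

\begin{verbatim}
import numpy as np

def geometric_moment(f, n, m):
    # Raw geometric moment G_nm of a binary image f, as in Eq. (2.11);
    # x indexes rows and y indexes columns.
    x = np.arange(f.shape[0])[:, None]
    y = np.arange(f.shape[1])[None, :]
    return float(np.sum((x ** n) * (y ** m) * f))

def hu_invariants(f):
    # First three of Hu's seven invariants, built from normalised
    # central moments so that they are invariant to translation,
    # scale and rotation.
    g00 = geometric_moment(f, 0, 0)
    xc = geometric_moment(f, 1, 0) / g00  # shape centroid
    yc = geometric_moment(f, 0, 1) / g00
    x = np.arange(f.shape[0])[:, None] - xc
    y = np.arange(f.shape[1])[None, :] - yc

    def eta(p, q):
        # Normalised central moment eta_pq.
        return np.sum((x ** p) * (y ** q) * f) / g00 ** (1 + (p + q) / 2)

    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    phi3 = (eta(3, 0) - 3 * eta(1, 2)) ** 2 \
         + (3 * eta(2, 1) - eta(0, 3)) ** 2
    return np.array([phi1, phi2, phi3])
\end{verbatim}

For a binary hand silhouette, the resulting feature vector can be passed directly to a classifier such as the Bayes or nearest neighbor schemes used in the works discussed below.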
Previous works on shape classification [63] have shown that the geometric moment invariants are not sufficient for describing arbitrarily distorted contour-based shapes or perspectively transformed shapes. Hence, the geometric moment invariants are used along with other geometric properties for representing hand postures.
Chalechale et al. [65] used geometric moment invariants and geometric properties (area, perimeter, major axis length, minor axis length and eccentricity) as features for representing 25 hand posture signs. The classification was based on the Bayes rule, assuming a Gaussian distribution for the extracted features. The descriptors were used to classify a database consisting of 2080 hand posture samples and achieved a classification accuracy of 98%. Similarly, in [101], the geometric moment invariants were combined with features such as the normalised length of the contour and directional gradients for representing hand postures. The classification was performed using a weighted nearest neighbor-based classification scheme. The technique was tested on a database consisting of 700 samples of 3 hand postures acquired under various lighting conditions. Out of the 700 samples, 200 were used for testing, and the technique achieved an average performance of 95%. Tofighi et al. [102] derived geometric moment invariants from the shape boundary of the hand postures as feature descriptors. Along with the geometric moments, the convex points on the hand posture were also employed to form the feature vector. The classification was performed using a minimum distance classifier. The efficiency of the geometric moments was tested on a database of 500 samples of 10 gesture classes, and the results reported an average classification rate of 90%.
The studies suggest that the geometric moment invariants are suitable for describing simple shapes but not sufficient to accurately describe a large number of shapes. The basis functions of the geometric moments are correlated, implying that these moment features are redundant [98]. Teague [103] suggested image representation through orthogonal moments that are derived from orthogonal polynomials. The Zernike moments (ZMs) and the pseudo-Zernike moments (PZMs) are among the efficient orthogonal moments used for hand posture representation.
Figure 2.2: (a) 1D Zernike radial polynomials $R_{nm}(\rho)$ (shown for $R_{00}$, $R_{11}$, $R_{20}$, $R_{22}$, $R_{31}$, $R_{33}$ and $R_{40}$) plotted against the radius $\rho$; (b) real part of the 2D complex Zernike polynomials $V_{nm}(\rho, \theta)$ for $(n,m) \in \{(6,0), (6,2), (6,4), (12,0), (12,2), (12,4)\}$.
The ZMs and the PZMs are rotation-invariant descriptors that are derived using the complex Zernike polynomials and the pseudo-Zernike polynomials, respectively, as basis functions.
The ZMs and PZMs are defined on the polar coordinates $(\rho, \theta)$, such that $0 \le \rho \le 1$ and $0 \le \theta \le 2\pi$. The complex Zernike polynomial $V_{nm}(\rho, \theta)$ of order $n \ge 0$ and repetition $m$ is defined as [99]
$$V_{nm}(\rho, \theta) = R_{nm}(\rho)\, \exp(-jm\theta) \tag{2.12}$$
For even values of $n-|m|$ and $|m| \le n$, $R_{nm}(\rho)$ is the real-valued radial polynomial given by
$$R_{nm}(\rho) = \sum_{s=0}^{(n-|m|)/2} \frac{(-1)^{s}\,(n-s)!}{s!\left(\frac{n+|m|}{2}-s\right)!\left(\frac{n-|m|}{2}-s\right)!}\, \rho^{n-2s}$$
The plots of the radial polynomials $R_{nm}(\rho)$ for different orders $n$ and repetitions $m$ are given in Figure 2.2(a), and the 2D complex Zernike polynomials $V_{nm}(\rho, \theta)$ obtained for different values of $n$ and $m$ are shown in Figure 2.2(b). From the plots, we can infer that the Zernike polynomials have wide support over the unit disc. Therefore, the Zernike moments characterise the global features of a shape.
The complex Zernike polynomials satisfy the orthogonality property,
$$\int_{0}^{2\pi}\!\!\int_{0}^{1} V_{nm}^{*}(\rho, \theta)\, V_{lk}(\rho, \theta)\, \rho\, d\rho\, d\theta = \frac{\pi}{n+1}\, \delta[n-l]\, \delta[m-k]$$
where $\delta[\cdot]$ is the Kronecker delta function. The Zernike moment $Z_{nm}$ of order $n$ and repetition $m$ is given by
$$Z_{nm} = \frac{n+1}{\pi} \int_{0}^{2\pi}\!\!\int_{0}^{1} V_{nm}^{*}(\rho, \theta)\, f(\rho, \theta)\, \rho\, d\rho\, d\theta \tag{2.13}$$
where $|m| \le n$ and $n-|m|$ is even.
The integration in (2.13) needs to be computed numerically. The magnitude $|Z_{nm}|$ is invariant to rotation and is hence used for rotation-invariant gesture representation [66, 67].
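A minimal numerical sketch of (2.13) in Python/NumPy follows; the mapping of the pixel grid onto the unit disc and the function names are our illustrative choices, not a prescribed implementation. Under the sign convention of (2.12), the conjugate basis is $V_{nm}^{*}(\rho, \theta) = R_{nm}(\rho)\exp(+jm\theta)$.

\begin{verbatim}
from math import factorial
import numpy as np

def radial_poly(n, m, rho):
    # Zernike radial polynomial R_nm(rho); assumes |m| <= n and
    # n - |m| even.
    m = abs(m)
    R = np.zeros_like(rho, dtype=float)
    for s in range((n - m) // 2 + 1):
        c = ((-1) ** s * factorial(n - s)
             / (factorial(s)
                * factorial((n + m) // 2 - s)
                * factorial((n - m) // 2 - s)))
        R += c * rho ** (n - 2 * s)
    return R

def zernike_moment(f, n, m):
    # Discrete approximation of Z_nm in (2.13) for a binary image f.
    # The pixel grid is mapped onto the unit disc and pixels outside
    # the disc are ignored; since dx dy = rho drho dtheta, summing
    # with the Cartesian pixel area approximates the polar integral.
    rows, cols = f.shape
    grid = np.mgrid[0:rows, 0:cols].astype(float)
    y = 2.0 * grid[0] / (rows - 1) - 1.0  # rows mapped to [-1, 1]
    x = 2.0 * grid[1] / (cols - 1) - 1.0  # columns mapped to [-1, 1]
    rho = np.hypot(x, y)
    theta = np.arctan2(y, x)
    inside = rho <= 1.0
    # Conjugate basis V*_nm under the convention of Eq. (2.12).
    v_conj = radial_poly(n, m, rho) * np.exp(1j * m * theta)
    dA = (2.0 / (rows - 1)) * (2.0 / (cols - 1))  # pixel area
    return (n + 1) / np.pi * np.sum(v_conj[inside] * f[inside]) * dA
\end{verbatim}

The rotation-invariant feature is then the magnitude, e.g. abs(zernike_moment(f, n, m)), computed for a selected set of orders and repetitions.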
The pseudo-Zernike basis polynomials exhibit properties similar to those of the Zernike basis polynomials, differing only in their radial polynomials. The radial basis functions of the pseudo-Zernike polynomials are real-valued and, for $|m| \le n$ with no parity constraint on $n-|m|$, are defined as
$$R_{nm}(\rho) = \sum_{s=0}^{n-|m|} \frac{(-1)^{s}\,(2n+1-s)!}{s!\,(n-|m|-s)!\,(n+|m|+1-s)!}\, \rho^{n-s}$$
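In the numerical sketch above, only the radial term changes; a hedged drop-in replacement for radial_poly (again with an illustrative name, and with the exponent $\rho^{n-s}$ as in the definition) would be:

\begin{verbatim}
from math import factorial
import numpy as np

def pseudo_zernike_radial(n, m, rho):
    # Pseudo-Zernike radial polynomial; defined for all |m| <= n,
    # with no parity constraint on n - |m|.
    m = abs(m)
    R = np.zeros_like(rho, dtype=float)
    for s in range(n - m + 1):
        c = ((-1) ** s * factorial(2 * n + 1 - s)
             / (factorial(s)
                * factorial(n - m - s)
                * factorial(n + m + 1 - s)))
        R += c * rho ** (n - s)
    return R
\end{verbatim}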
Chang et al. [66] used the ZMs and the PZMs as combined features for hand posture classification. The experiments were performed on a database consisting of 600 hand postures of 6 gesture signs collected from 10 subjects. The ZM and PZM features representing the hand postures were classified using the nearest neighbor classification technique and achieved a classification rate of 97.3%.
Gu and Su [67] have shown the ZMs to be efficient descriptors for view- and user-invariant representation of hand postures. A hierarchical classifier based on a multivariate decision tree was employed for classifying the hand posture features. The database used for the experiment consisted of 3850 samples of 11 gesture signs. The images were acquired from 5 subjects at 7 different viewing directions, with the frontal view angle varying between $-60^{\circ}$ and $60^{\circ}$. The results show that the ZMs are robust to large variations in the viewing angle and the user's hand shape.