
Automatic Batik Motifs Classification using Various Combinations of SIFT Features Moments and k-Nearest Neighbor

Iwan Setyawan¹, Ivanna K. Timotius², Marchellius Kalvin³

Department of Electronic Engineering, Satya Wacana Christian University, Salatiga 50711, Indonesia

Email: ¹iwan.setyawan@ieee.org, ²ivanna.timotius@ieee.org, ³marchellka29@gmail.com

Abstract: Batik cloth is Indonesia’s national heritage. Across the archipelago, there are numerous patterns and motifs of batik, each having its own meaning and cultural significance. In this paper, we present the results of our investigation of various combinations of SIFT feature moments used in automatic classification of batik motifs. The classification method used in this paper is the k-Nearest Neighbor. Our experiments show that the best performance of the system is obtained using feature vectors of length 7, yielding a classification accuracy rate of 31.43% for 7 classes of batik motifs, with no batik motif class having a zero classification accuracy rate. Furthermore, our experiments suggest that the feature moment that seems to be the best for the classification process is µc, while the feature moment that seems to hinder the classification process is σ²c.

Keywords: SIFT features, feature moments, pattern classification, batik motifs

I. INTRODUCTION

Batik cloth is Indonesia’s national heritage and is also part of the national identity, as recognized by UNESCO [1]. There are numerous variations of the batik motifs and patterns produced in Indonesia. These variations depend on the particular areas in which the batik is produced and also show the influence of other cultures, for example Chinese and Dutch. These motifs and patterns are created not only to be visually pleasant; they also carry philosophical meanings. Examples of these batik motifs and patterns are presented in Figure 1.

This variety of motifs and patterns can be classified into several classes. Based on the collection of the Danar Hadi Batik Museum in Solo, Central Java, Indonesia, the classes are “Parang”, “Lereng”, “Dutch”, “Chinese”, “Ceplokan”, “Semen” and “Lunglungan”. Each class has its own distinctive patterns and motifs. The “Parang” motif (Figure 1a) is dominated by slanted lines composed of ornaments resembling the character “S”. The “Lereng” motif (Figure 1b) is superficially very similar to the “Parang” motif in that it is dominated by slanted lines. However, the “Lereng” motif is missing the “Mlinjon” ornaments found in the “Parang” motif (the “Mlinjon” ornament is shown circled in Figure 1a). “Dutch” inspired batik motifs (Figure 1c) consist primarily of animal and plant patterns, or even themes based on western fairy tales such as Little Red Riding Hood or Snow White. “Chinese” inspired batik motifs (Figure 1d) feature prominent Chinese mythical animals like the dragon or the phoenix. The “Ceplokan” motif (Figure 1e) is mainly composed of geometric shapes such as squares, circles or diamonds. The “Semen” motif (Figure 1f) has a particular characteristic, namely the “Meru” decoration (the large diamond-shaped decoration shown in the image). Finally, the “Lunglungan” motif (Figure 1g) is a derivation of the “Semen” motif. The main distinction between the two motifs is that the “Meru” decoration is not always present in the “Lunglungan” motif.

Fig. 1: Examples of batik motifs and patterns: (a) “Parang”, (b) “Lereng”, (c) “Dutch”, (d) “Chinese”, (e) “Ceplokan”, (f) “Semen”, (g) “Lunglungan”

In general, the available batik motifs can be classified into two major classes, i.e., those containing geometrical motifs and those containing non-geometrical motifs [2]. Batik with geometrical motifs contains geometrical shapes that appear repeatedly. Looking at the examples provided in Figure 1, we see that the “Parang”, “Lereng” and “Ceplokan” motifs fall into this category. On the other hand, batik with non-geometrical motifs does not follow any fixed pattern rule. Again referring to Figure 1, we see that the “Dutch”, “Chinese”, “Semen” and “Lunglungan” motifs fall into this category.

The classification of batik motifs into the aforementioned classes is usually performed manually by batik experts. Some research on automatically classifying batik motifs into different classes has been reported in the literature. Most of these works utilize the Gray-Level Co-occurrence Matrix (GLCM) as feature descriptors [3][4][5]. The authors in [4] also give a performance comparison of their feature descriptor with feature descriptors constructed using Canny edge detectors and Gabor filters. The authors in [2] use color co-occurrence matrices, the difference between pixels of the scan pattern, and the color histogram to build their feature descriptors. It should be noted that since batik motif variation depends on the area in which the batik is produced, none of these works share the same batik motif classes. Furthermore, the batik classes used in these works are also slightly different from those we use in this paper.

In this paper we propose a digital image based method of automatically classifying these batik motifs. The features used for classification are various combinations of moments of the SIFT [6] features (i.e., the gradient magnitude and the orientation angle). The feature moments used in this paper are the mean, variance, standard deviation, skewness and kurtosis. Each combination forms a λ-dimensional feature vector. The classification is performed using the k-Nearest Neighbor method [7]. Our experiments show that the highest overall classification rate is achieved using 4- and 5-dimensional feature vectors. However, in these cases some motifs cannot be classified at all. Using 7- and 8-dimensional feature vectors yields a slightly lower overall classification rate, but all motifs can be classified.

The rest of the paper is organized as follows. In Section II we provide a more detailed discussion of the SIFT algorithm. In Section III we discuss the feature moment combinations used to construct the feature vectors. In Section IV we present our proposed system. In Section V we discuss the experimental setup and results. Finally, in Section VI we provide our conclusions and pointers to our future work.

II. SCALE INVARIANT FEATURE TRANSFORM

The Scale Invariant Feature Transform (SIFT) is an approach for extracting image features [6]. The features extracted using this approach are invariant to image scaling and rotation and are also partially robust against changes in illumination and camera angle (in 3-dimensional space) [6]. The extracted features are usually referred to as SIFT keys.

Due to the robustness of the SIFT keys, this approach has been widely used in various applications. One such application is the detection of particular objects within an image [6]. Another example is provided by [8], where the authors propose an approach to construct facial templates based on SIFT features extracted from multiple facial images. The authors in [9] use adapted SIFT features in a face recognition application. The SIFT feature extraction is adapted so that the scheme is much more robust against illumination changes. The authors in [10] use SIFT to extract features from video frames, which are then used to detect whether a particular video has duplicates within a certain set of videos. The ability of the SIFT approach to detect robust points in an image has also been used to combat desynchronization (geometric) attacks on digital watermarking systems. Examples of such approaches are presented in [11] and [12]. Despite the wide range of applications of SIFT features, to the best of our knowledge the use of SIFT features for automatic classification of batik motifs is novel.

The SIFT approach essentially consists of four steps, namely [6]:

1) Detection of scale-space extrema: This step searches for extrema across all scales and all spatial areas of the input image. These extrema are potential points that are invariant to scale and rotation. For efficiency, this step is implemented using a difference-of-Gaussian function [6]. The parameter k_S defines the separation between two adjacent scales, while σ controls the width of the Gaussian.

2) Localization of keypoints: This step determines the location and scale of each potential point identified in the first step. Keypoints are selected based on their stability.

3) Orientation assignment: One or more orientations are assigned to each keypoint based on local image gradient directions [6].

4) Keypoint descriptor: Local image gradients are then measured at the selected scale, within the region of each identified keypoint. These gradients are then transformed into a representation that is tolerant of local geometric distortion and illumination change.
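For reference, the difference-of-Gaussian function used in step 1 is commonly written as shown below. This restates the form given in [6], with k_S standing in for the scale-separation factor (written k in [6]), I(x,y) denoting the input image and * denoting convolution. It is a sketch under those notational assumptions and not necessarily the exact notation of this paper.

```latex
D(x,y,\sigma) = \bigl(G(x,y,k_S\sigma) - G(x,y,\sigma)\bigr) * I(x,y)
              = L(x,y,k_S\sigma) - L(x,y,\sigma),
\qquad
G(x,y,\sigma) = \frac{1}{2\pi\sigma^{2}}\, e^{-(x^{2}+y^{2})/(2\sigma^{2})}
```

Here L(x,y,σ) = G(x,y,σ) * I(x,y) is the Gaussian-smoothed image, so subtracting two adjacent scales avoids computing a separate filter response for every scale.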

In this paper, we do not use the full SIFT descriptor. Instead, we use the largest gradient magnitude, m(x,y), and its corresponding orientation, θ(x,y), of a keypoint located at the spatial position L(x,y) of the input image at a chosen scale. The gradient magnitude and orientation are computed as follows [6]. First, let d_h(x,y) and d_v(x,y) be the horizontal and vertical image gradients of L(x,y), respectively. These are calculated as follows:

$$d_h(x,y) = L(x+1,y) - L(x-1,y) \tag{3}$$

$$d_v(x,y) = L(x,y+1) - L(x,y-1) \tag{4}$$

The gradient magnitude and orientation are then given by:

$$m(x,y) = \sqrt{d_h(x,y)^2 + d_v(x,y)^2} \tag{5}$$

$$\theta(x,y) = \tan^{-1}\left(\frac{d_v(x,y)}{d_h(x,y)}\right) \tag{6}$$
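The gradient computation of (3) and (4), followed by the magnitude and orientation, can be sketched in a few lines of Python. This is only an illustration: the function name `gradient_features`, the row-major `L[y][x]` image layout and the use of `atan2` (to resolve the quadrant of the orientation) are assumptions made here, not details taken from the paper.

```python
import math

def gradient_features(L, x, y):
    """Return (m, theta) at interior pixel (x, y) of a smoothed image L.

    L is a list of rows indexed as L[y][x] (an assumed layout).
    """
    d_h = L[y][x + 1] - L[y][x - 1]   # horizontal difference, Eq. (3)
    d_v = L[y + 1][x] - L[y - 1][x]   # vertical difference, Eq. (4)
    m = math.hypot(d_h, d_v)          # gradient magnitude
    theta = math.atan2(d_v, d_h)      # orientation in radians (assumed atan2 form)
    return m, theta

# Tiny synthetic 3x3 patch: d_h = 4 - 1 = 3 and d_v = 4 - 0 = 4 at the centre.
patch = [
    [0, 0, 0],
    [1, 2, 4],
    [0, 4, 0],
]
m, theta = gradient_features(patch, 1, 1)
print(m, theta)
```

For the synthetic patch the two differences form a 3-4-5 triangle, which makes the result easy to check by hand.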

III. FEATURE MOMENTS COMBINATIONS AS FEATURE VECTORS

The gradient magnitude and orientation values obtained from an input image are then further processed to construct the feature vectors used in batik motifs classification. We calculate the moments of θ(x,y) and c(x,y) = m(x,y)·θ(x,y). We then combine these moments to form feature vectors of varying lengths.

A. Calculation of SIFT feature moments

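In this paper, we use five moments of the SIFT features: mean, variance, standard deviation, skewness and kurtosis. These moments are computed as follows. Let θ_i and c_i, i = 1, 2, ..., N, be the set of θ(x,y) and c(x,y) values obtained from an input image. The mean, variance, standard deviation, skewness and kurtosis are obtained, respectively, as:

$$\mu_X = \frac{1}{N}\sum_{i=1}^{N} X_i \tag{7}$$

$$\sigma^2_X = \frac{\sum_{i=1}^{N}(X_i - \mu_X)^2}{N-1} \tag{8}$$

$$\sigma_X = \sqrt{\sigma^2_X} \tag{9}$$

$$\alpha_{3X} = \frac{\frac{1}{N}\sum_{i=1}^{N}(X_i - \mu_X)^3}{\sigma_X^3} \tag{10}$$

$$\alpha_{4X} = \frac{\frac{1}{N}\sum_{i=1}^{N}(X_i - \mu_X)^4}{\sigma_X^4} \tag{11}$$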

In the previous equations, X can be either θ or c. For example, the variance of the gradient orientation, σ²_θ, would be computed as follows:

$$\sigma^2_\theta = \frac{\sum_{i=1}^{N}(\theta_i - \mu_\theta)^2}{N-1} \tag{12}$$

After performing these calculations, we have 10 different feature moments for any given input image. These feature moments are then combined to construct the feature vectors. We choose to use the moments of the SIFT features, rather than the SIFT features directly, because the number of SIFT features generated from the batik cloth images varies widely and it is very difficult to pick the best features to use. The feature moments, on the other hand, give us the general properties of the SIFT features extracted from each image. In other words, these moments act like a digest or a hash that can represent the properties of the batik motif.
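A minimal Python sketch of how the 10 feature moments could be computed from the per-keypoint values is given below. The variance follows (12), using N−1 in the denominator. The normalization of the skewness and kurtosis (1/N central moments divided by powers of the standard deviation) is an assumption, as are the function and key names, which are purely illustrative.

```python
import math

def moments(values):
    """Mean, variance, standard deviation, skewness and kurtosis of a sequence."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)   # as in Eq. (12)
    std = math.sqrt(var)
    # Assumed normalization: 1/N central moments over powers of std.
    skew = sum((v - mean) ** 3 for v in values) / n / std ** 3
    kurt = sum((v - mean) ** 4 for v in values) / n / std ** 4
    return {"mean": mean, "var": var, "std": std, "skew": skew, "kurt": kurt}

def feature_moments(m, theta):
    """Return the 10 moments of theta_i and c_i = m_i * theta_i."""
    c = [mi * ti for mi, ti in zip(m, theta)]
    out = {}
    for name, series in (("theta", theta), ("c", c)):
        for key, val in moments(series).items():
            out[f"{key}_{name}"] = val
    return out

# Hypothetical magnitudes and orientations for four keypoints.
example = feature_moments([1.0, 2.0, 3.0, 4.0], [0.1, 0.2, 0.3, 0.4])
print(sorted(example))
```

Whatever the number of keypoints N, the result always has 10 entries, which is what makes the moments usable as a fixed-size digest of an image.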

B. Construction of the feature vectors

The feature vectors used in this paper are constructed by combining the available feature moments into vectors of varying length. The length of the vectors varies from 1 (using a single moment) to 10 (using all available moments). For example, the vector constructed by using 4 different moments would have a length of 4. An example of such a vector is V ={µθ, µc, σc, α4θ}.

The number of vectors of a given length λ, constructed from the 10 available feature moments, can be computed as follows:

$$C^{10}_{\lambda} = \frac{10!}{\lambda!\,(10-\lambda)!} \tag{13}$$

Thus, for example, the number of feature vectors of length λ = 4 is 210.
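The enumeration of feature vectors can be illustrated with Python's standard `itertools.combinations`. The moment names in `MOMENTS` are hypothetical labels chosen for this sketch; only the counts follow from (13).

```python
from itertools import combinations
from math import comb

# Hypothetical labels for the 10 feature moments of theta and c.
MOMENTS = ["mean_theta", "mean_c", "var_theta", "var_c", "std_theta", "std_c",
           "skew_theta", "skew_c", "kurt_theta", "kurt_c"]

def moment_subsets(length):
    """All unordered selections of `length` distinct moments."""
    return list(combinations(MOMENTS, length))

# Number of feature vectors for each length lambda = 1..10.
counts = {lam: len(moment_subsets(lam)) for lam in range(1, 11)}
total = sum(counts.values())
print(counts[4], comb(10, 4), total)
```

The enumeration gives 210 vectors for λ = 4, in agreement with (13), and 1023 vectors when all lengths from 1 to 10 are counted together.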

IV. PROPOSED SYSTEM

In this section we present the details of our proposed automatic batik motifs classification system, shown in Figure 2. The proposed system consists of two main steps, namely the training and testing steps. During the training step, we construct the feature vectors based on our training images. During the testing step, we generate feature vectors based on the testing images. The training and testing vectors are then fed into a k-NN algorithm to classify the batik motifs. The k-NN algorithm is one of the most basic classification algorithms. We use it because our main interest in this paper is to investigate the suitability of SIFT feature moments for batik motifs classification. By using a basic classifier, the overall system performance reflects the suitability of these features for this particular application. Additionally, the k-NN algorithm is suitable for a multi-class classification scenario. Other, more advanced algorithms, such as Support Vector Machines (SVM), are basically 2-class classifiers [7], and thus in this particular application we would need multiple SVMs.

The steps to construct the feature vectors of length λ from an input image I(x,y) are as follows:

1) Convert I(x,y) into an 8-bit grayscale image, since the standard SIFT only supports grayscale images.

2) Compute the SIFT keys from the image (i.e., m_i and θ_i, i = 1, ..., N).

3) Calculate the feature moments (mean, variance, standard deviation, skewness and kurtosis).

Fig. 2: Proposed batik motifs classification system (inputs: training image and testing image)

Combining these moments as described in Section III-B yields C^{10}_λ feature vectors of length λ for each image. The vectors constructed from the training images are called the training vectors, while those constructed from the testing images are called the testing vectors.
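The classification step can be sketched as a plain k-NN vote over the training vectors. The paper does not state the distance metric, so the Euclidean distance used here is an assumption, and the function name `knn_classify` and the toy 2-dimensional vectors are hypothetical.

```python
import math
from collections import Counter

def knn_classify(train_vectors, train_labels, query, k=1):
    """Label `query` by majority vote among its k nearest training vectors."""
    # Euclidean distance is an assumed metric, not one stated in the paper.
    neighbours = sorted(
        (math.dist(vec, query), label)
        for vec, label in zip(train_vectors, train_labels)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]

# Toy 2-dimensional training vectors with hypothetical values.
train_vectors = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
train_labels = ["Parang", "Parang", "Semen"]

near_semen = knn_classify(train_vectors, train_labels, (4.0, 4.5))
near_parang = knn_classify(train_vectors, train_labels, (0.2, 0.1))
print(near_semen, near_parang)
```

With k = 1 the vote reduces to copying the label of the single closest training vector, so the classifier adds no tunable behaviour of its own on top of the features.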

V. EXPERIMENTAL SETUP AND RESULTS

The performance of the automatic batik motifs classification system is evaluated by performing classification on seven different batik motifs, as shown in Figure 1. In our experiments, the batik motif images are taken from the collection of the Danar Hadi Batik Museum in Solo, Central Java, Indonesia. These images are taken using a DSLR camera on “auto” settings using only the incandescent light available in the museum (no additional lighting or flashlight is used). The images are taken such that the position of the camera is perpendicular to the batik cloth. All images used for the experiments are resized to 500×331 pixels. For the training image database, we use 10 images for each batik motif class (a total of 70 images). For the testing image database we use 5 images for each batik motif class (35 images in total). For each image, the number of feature vectors constructed, n_V, is given by

$$n_V = \sum_{\lambda=1}^{10} C^{10}_{\lambda} = 1023 \tag{14}$$

TABLE I: Average classification accuracy

λ A₊ (%) A₋ (%)

The parameters used for the SIFT keys extraction are the ones recommended in [6], namely k_S = √2 and σ = 1.6. The SIFT keys used in our experiments are taken from the third octave. This is done because our experiments show that these provide sufficient differentiating power between batik motifs. For the classifier, we use the k-NN classifier with k = 1.

The performance of the proposed system is evaluated based on the average classification accuracy of the seven different batik motif classes. The results of the experiments are summarised in Table I, showing the maximum (A₊) and minimum (A₋) average classification accuracy rates for each feature vector length (λ). Note that since for λ = 10 there is only one feature vector available, we get A₊ = A₋.
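The average classification accuracy can be sketched as the mean of the per-class accuracy rates. The per-class counts below are hypothetical and chosen only to show the arithmetic: with 5 test images in each of 7 classes, 11 correct classifications out of 35 give an average of about 31.43%, which is consistent with the figure reported for the 7- and 8-dimensional vectors.

```python
def average_accuracy(true_labels, predicted_labels):
    """Mean of the per-class accuracy rates, in percent."""
    per_class = {}
    for cls in sorted(set(true_labels)):
        idx = [i for i, t in enumerate(true_labels) if t == cls]
        hits = sum(1 for i in idx if predicted_labels[i] == cls)
        per_class[cls] = 100.0 * hits / len(idx)
    return sum(per_class.values()) / len(per_class), per_class

CLASSES = ["Parang", "Lereng", "Dutch", "Chinese", "Ceplokan", "Semen", "Lunglungan"]
correct_per_class = [3, 2, 1, 1, 2, 1, 1]   # hypothetical counts, 11 of 35 in total

true_labels, predicted_labels = [], []
for j, (cls, n_ok) in enumerate(zip(CLASSES, correct_per_class)):
    wrong = CLASSES[(j + 1) % len(CLASSES)]   # any other class counts as a miss
    true_labels += [cls] * 5
    predicted_labels += [cls] * n_ok + [wrong] * (5 - n_ok)

avg, per_class = average_accuracy(true_labels, predicted_labels)
print(round(avg, 2), per_class)
```

Because every class has the same number of test images, the average of the per-class rates equals the overall fraction of correct test images, so each reported rate is a multiple of 1/35 (about 2.86%).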

Table I shows that the highest overall classification accuracy rates are achieved by feature vectors with λ = 4 and λ = 5. Specifically, the 4-dimensional vectors giving the highest accuracy rate are {µθ, µc, α3c, α4θ} and {µc, σθ, α3c, α4θ}, while for the 5-dimensional vectors the highest accuracy rate is achieved by the vectors {µθ, µc, σθ, α3θ, α3c} and {µc, σθ, α3θ, α3c, α4θ}. The table also shows that the lowest overall accuracy rate is achieved by the feature vector containing all 10 feature moments.

Although the 4- and 5-dimensional feature vectors give the highest overall accuracy rates, in both of these cases not all batik motifs can be classified. Specifically, in both cases the “Semen” motif has a zero accuracy rate. The next highest overall accuracy rate is given by the 3- and 6-dimensional feature vectors (34.29%). Again, in these cases not all batik motifs can be classified. The 3-dimensional feature vector that gives the highest accuracy rate, {µc, α3c, α4θ}, fails to classify both the “Chinese” and “Semen” motifs, while the 6-dimensional feature vector with the highest accuracy rate, {µθ, µc, σθ, α3θ, α3c, α4θ}, fails to classify the “Semen” motif.

The 7- and 8-dimensional vectors both give a lower maximum average accuracy rate of 31.43%, but in these cases all batik motifs can be classified (i.e., have a non-zero classification accuracy rate). Using a 1-dimensional vector also gives a maximum average accuracy rate of 31.43%, but in this case the “Chinese” motif cannot be correctly classified. The 7- and 8-dimensional vectors that give the highest accuracy rates are listed in Table II. Since both the 7- and 8-dimensional feature vectors give the same accuracy rate, these results suggest that using a 7-dimensional feature vector is enough for batik motifs classification using the proposed system.


TABLE II: 7- and 8-dimensional vectors with highest accuracy rates (31.43%)

λ | Vectors
7 | {µθ, µc, σ²θ, σθ, σc, α3θ, α4c}
7 | {µθ, µc, σ²θ, σθ, σc, α3c, α4c}
7 | {µθ, µc, σ²θ, σθ, σc, α4θ, α4c}
7 | {µθ, µc, σ²θ, σc, α3θ, α3c, α4c}
7 | {µθ, µc, σ²θ, σc, α3c, α4θ, α4c}
7 | {µθ, µc, σ²θ, σc, α3θ, α4θ, α4c}
7 | {µc, σ²θ, σθ, σc, α3θ, α3c, α4c}
7 | {µc, σ²θ, σθ, σc, α3θ, α4θ, α4c}
7 | {µc, σ²θ, σθ, σc, α3c, α4θ, α4c}
7 | {µc, σ²θ, σc, α3θ, α3c, α4θ, α4c}
8 | {µθ, µc, σ²θ, σθ, σc, α3θ, α3c, α4c}
8 | {µθ, µc, σ²θ, σθ, σc, α3θ, α4θ, α4c}
8 | {µθ, µc, σ²θ, σθ, σc, α3c, α4θ, α4c}
8 | {µc, σ²θ, σθ, σc, α3θ, α3c, α4θ, α4c}

TABLE III: Average accuracy rates for each batik motif class

Motif class | Average accuracy rate (%)
“Parang” | 37.88
“Lereng” | 17.18
“Dutch” | 8.55
“Chinese” | 3.94
“Ceplokan” | 16.83
“Semen” | 10.95
“Lunglungan” | 8.61

Table III shows the average accuracy rate (over all values of λ) of each batik motif class. From this table we can see that the “Parang” motif has the highest average accuracy, which suggests that this motif is the “easiest” to classify using the proposed system. The next two highest accuracy rates are achieved by the “Lereng” and “Ceplokan” motifs. All three of these motifs have strong geometric patterns. This suggests that the proposed system is most suitable for classifying batik motifs dominated by geometric patterns. Table III also shows that the most difficult batik motif to classify is the “Chinese” motif. This is possibly due to the complexity of the motif and its lack of geometric patterns.

During our experiments, we also find that some batik motifs samples cannot be correctly classified, irrespective of the value of λ. These are shown in Figure 3. On the other hand, some batik motifs samples are consistently classified correctly by the system irrespective of the value of λ. These samples are shown in Figure 4. As can be seen in this figure, the motifs that are consistently classified correctly are mostly dominated by geometric shapes. Although the samples for the “Dutch” (Figure 4a), “Lunglungan” (Figure 4b) and “Semen” (Figures 4i and 4j) do not have much geometric structure, similar motifs to these test images are also available in the training image database and this influences the performance of the system on these motifs.

Fig. 3: Batik motifs samples that are consistently misclassified (have zero classification accuracy rate): (a) “Chinese #4”, (b) “Lereng #3”, (c) “Lunglungan #4”

Further analysis is also performed to determine which feature moments are most suitable for batik motifs classification. To do so, we first look at the accuracy rates of the 1-dimensional vectors, since these show the individual contribution of each feature moment. The accuracy rates of these vectors are shown in Table IV. This table shows that σ²c and σc are the two feature moments that give the lowest accuracy rates. This suggests that the inclusion of these feature moments in feature vectors will produce low accuracy rates. On the other hand, µc is shown to give the highest accuracy rate, and correspondingly its inclusion in feature vectors will increase accuracy rates. Looking at the vectors giving the highest accuracy rates for λ = 4, 5, 7 and 8 that have been presented above, we see that µc is always present in these vectors. Meanwhile, σ²c is never present in vectors giving the highest accuracy rates. Similar observations are also made for other values of λ (with the exception of λ = 10, since there is only one vector). On the other hand, the feature moment σc is not present in feature vectors giving the highest accuracy rates for λ = 2, 3, 4, 5, 6, but is always present in such vectors for λ = 7, 8, 9. From these observations, we can conclude that the best feature moment for batik motifs classification is µc, while the worst feature moment is σ²c. However, the roles of the other feature moments (or combinations thereof) still need further investigation.
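The presence and absence of individual moments in the best-performing vectors can be checked mechanically. The sketch below transcribes the 14 vectors of Table II using hypothetical ASCII labels (`mu_t` for µθ, `var_c` for σ²c, `skew_t` for α3θ, `kurt_c` for α4c, and so on) and counts how often each moment occurs. It covers only λ = 7 and 8, since the best vectors for the other lengths are not all listed in the text.

```python
from collections import Counter

ALL_MOMENTS = "mu_t mu_c var_t var_c std_t std_c skew_t skew_c kurt_t kurt_c".split()

# Table II, transcribed with hypothetical ASCII labels (10 vectors for
# lambda = 7 followed by 4 vectors for lambda = 8).
BEST_VECTORS = [
    "mu_t mu_c var_t std_t std_c skew_t kurt_c",
    "mu_t mu_c var_t std_t std_c skew_c kurt_c",
    "mu_t mu_c var_t std_t std_c kurt_t kurt_c",
    "mu_t mu_c var_t std_c skew_t skew_c kurt_c",
    "mu_t mu_c var_t std_c skew_c kurt_t kurt_c",
    "mu_t mu_c var_t std_c skew_t kurt_t kurt_c",
    "mu_c var_t std_t std_c skew_t skew_c kurt_c",
    "mu_c var_t std_t std_c skew_t kurt_t kurt_c",
    "mu_c var_t std_t std_c skew_c kurt_t kurt_c",
    "mu_c var_t std_c skew_t skew_c kurt_t kurt_c",
    "mu_t mu_c var_t std_t std_c skew_t skew_c kurt_c",
    "mu_t mu_c var_t std_t std_c skew_t kurt_t kurt_c",
    "mu_t mu_c var_t std_t std_c skew_c kurt_t kurt_c",
    "mu_c var_t std_t std_c skew_t skew_c kurt_t kurt_c",
]

# Count in how many of the best vectors each moment appears.
counts = Counter(name for vec in BEST_VECTORS for name in vec.split())
always = sorted(n for n in ALL_MOMENTS if counts[n] == len(BEST_VECTORS))
never = sorted(n for n in ALL_MOMENTS if counts[n] == 0)
print("always present:", always)
print("never present:", never)
```

In this transcription µc and σc occur in every listed vector and σ²c in none, matching the observations above. The same count also shows σ²θ and α4c in every listed vector, which may be a useful starting point for the further investigation of the remaining moments.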

Fig. 4: Batik motifs samples that are consistently classified correctly: (a) “Dutch #5”, (b) “Lunglungan #3”, (c) “Ceplokan #4”, (d) “Ceplokan #5”, (e) “Lereng #4”, (f) “Lereng #5”, (g) “Parang #4”, (h) “Parang #5”, (i) “Semen #2”, (j) “Semen #4”

VI. CONCLUSIONS AND FUTURE WORK

However, in all of these cases some batik motifs cannot be classified at all (i.e., have a zero classification accuracy rate). The use of 7- and 8-dimensional feature vectors, on the other hand, results in no batik motif having a zero classification accuracy rate. Therefore, using a 7-dimensional feature vector is enough for batik motifs classification using the proposed system. Finally, our experiments also suggest that the feature moment that helps the classification process is µc, while the feature moment that seems to hinder the classification process is σ²c.

TABLE IV: 1-dimensional vectors accuracy

Feature moment | Accuracy (%)
µθ | 14.29
µc | 31.43
σ²θ | 17.14
σ²c | 5.71
σθ | 17.14
σc | 5.71
α3θ | 14.29
α3c | 8.57
α4θ | 14.29
α4c | 8.57

In our future work, we will continue our investigation to find the best feature descriptors (or combinations thereof) for batik motifs classification in order to increase the classification accuracy. We will also investigate other combinations of the SIFT parameters (k_S and σ). Furthermore, in this paper we use a very simple classifier (i.e., the k-NN). In the future, we will also investigate the combination of the features proposed in this paper with more advanced classifiers, such as the Support Vector Machine, to classify batik motifs.

REFERENCES

[1] UNESCO, Indonesian Batik, Inscribed in 2009 on the Representative List of the Intangible Cultural Heritage of Humanity, http://www.unesco.org/culture/ich/RL/00170.

[2] N. Suciati, W.A. Pratomo, and D. Purwitasari, Batik Motif Classification using Color-Texture-Based Feature Extraction and Backpropagation Neural Network, Proc. of IIAI 3rd Int. Conf. on Advanced Applied Informatics (IIAIAAI), Kitakyushu, pp. 517–521, 2014.

[3] K.S. Loke and M. Cheong, Efficient Textile Recognition via Decomposition of Co-occurrence Matrices, Proc. IEEE Int. Conf. on Signal and Image Processing Applications (ICSIPA), Kuala Lumpur, pp. 257–261, 2009.

[4] I. Nurhaida, R. Manurung, and A.M. Arymurthy, Performance Comparison Analysis Features Extraction Methods for Batik Recognition, Proc. Int. Conf. on Advanced Computer Science and Information Systems (ICACSIS), Depok, pp. 207–212, 2012.

[5] A.E. Minarno, Y. Munarko, A. Kurniawardhani, F. Bimantoro and N. Suciati, Texture Feature Extraction Using Co-Occurrence Matrices of Sub-Band Image For Batik Image Classification, Proc. 2nd Int. Conf. on Information and Communication Technology (ICoICT), Bandung, pp. 249–254, 2014.

[6] D.G. Lowe, Distinctive image features from scale-invariant keypoints, Int. J. Computer Vision 60:2, pp. 91–110, 2004.

[7] R.O. Duda, P.E. Hart, and D.G. Stork, Pattern Classification, John Wiley & Sons, Inc., 2nd ed., 2001.

[8] A. Rattani, D.R. Kisku, A. Lagorio and M. Tistarelli, Facial Template Synthesis based on SIFT Features, Proc. IEEE Workshop Automatic Identification Advanced Technologies, Alghero, Italy, pp. 69–73, 2007.

[9] J. Križaj, V. Štruc and N. Pavešić, Adaptation of SIFT Features for Robust Face Recognition, Proc. 7th Int. Conf. Image Analysis and Recognition, Part I, Póvoa de Varzim, Portugal, pp. 394–404, 2010.

[10] M. Douze, H. Jégou and C. Schmid, An Image-based Approach to video copy detection with spatio-temporal post-filtering, IEEE Trans. Multimedia 12:4, pp. 257–266, 2010.

[11] C.G. Thorat and B.D. Jadhav, A blind digital watermark technique for color image based on integer wavelet transform and SIFT, Proc. Int. Conference and Exhibition on Biometrics Technology, pp. 236–241, 2010.