Proceedings of the 2nd International Conference on Sustainable Technology Development
Special Pattern Development for Feature Extraction
in a Balinese Print Character Recognition System
Based on the Localized Arc Pattern Method
A.A.K. Oka Sudana (a); Ni Kadek Ayu Wirdiani (b); Gusti Agung Ayu Putri (c)
(a) Lecturer, Information Technology Study Program, Udayana University, Bali. Email: agungokas@unud.ac.id
(b) Lecturer, Informatics Study Program, STIKI Indonesia, Denpasar, Bali. Email: ayu_wirdi@yahoo.com
(c) Lecturer, Information Technology Study Program, Udayana University, Bali. Email: dongdek@yahoo.com
ABSTRACT: Character recognition is one of the most widely known forms of pattern recognition. The object of this research is a recognition system for printed Balinese characters. Balinese characters are distinctive: many have nearly the same shape, and some are differentiated by only a single line.
Feature extraction is performed with a special set of patterns derived from the Localized Arc Pattern Method. Pattern models are selected according to how frequently each model appears in a database of Balinese character images. The patterns are formed from the characteristic point in a 5 x 5 square, which produces 125 possible initial patterns that can be grouped into 103 initial pattern models. Processing time is reduced by selecting, from these 125 patterns, those that occur frequently in Aksara Bali. The selection is performed by a computer program that calculates the frequency with which each pattern appears in 600 sample binary images of Balinese characters. The model selection process yields 23 patterns.
The features of a test image are compared with the reference features. Each comparison yields a dissimilarity value. The values are sorted, and the smallest dissimilarity value is compared against a critical value to decide whether the test character is recognized. The experiment achieved a success rate of 96.4%.
Key words: pattern recognition, Balinese character recognition, Localized Arc Pattern Method, special pattern model of Aksara Bali, feature extraction.
1. INTRODUCTION
Technology in the field of informatics and computing is developing very quickly. Computer systems have been developed to perform pattern recognition, a process modeled on human ability. Pattern recognition systems are widely used today, for example in fingerprint and palm image recognition, voice recognition, and handwriting recognition. Handwriting recognition is one of the most common forms of pattern recognition. Writing has unique properties that raise interesting new problems for investigation.
Writing in each region has a variety of typefaces and its own uniqueness. The writing studied in this research is the Balinese script, known as Aksara Bali. Its characters are similar in shape to one another, and some are distinguished only by a single line stroke (Agung BW et al, 2009). Aksara Bali also has different properties from Latin, Japanese, Korean and Chinese writing. This makes Balinese writing difficult to recognize. Therefore, a system for recognizing Balinese writing is built here, which will make Balinese writing (Aksara Bali) easier for people to read. The system is expected to provide an alternative method for the computerized recognition of images of Balinese characters, and to attract the younger generation to learn Aksara Bali, which is part of Bali's cultural heritage.
Feature extraction applies a specific set of patterns based on the Localized Arc Pattern Method and adapted to Aksara Bali. The method is chosen because it captures the characteristics of Balinese writing, which is expected to give better recognition results. It has proven quite successful in signature image verification and in handwriting recognition for Latin, Japanese, Korean and Chinese. The accuracy of the method for Balinese character recognition is measured by calculating the percentage of success, the average error and the complexity of the system.
2. RESEARCH METHOD
2.1. Aksara Bali
The orthography of Aksara Bali in Latin letters follows Indonesian orthography: the spelling is as simple as possible and phonetic, that is, it matches or is close to the actual pronunciation. The letters used to write Aksara Bali in Latin form are divided into two groups, namely Aksara Suara (vowels) and Aksara Wianjana (consonants), as shown in Table 1 and Table 2.
Table 1. List of Aksara Suara (Vowels)
No.  Aksara Bali  Balinese Latin
1    [glyph]      A
2    [glyph]      Ê
3    [glyph]      I
4    [glyph]      U
5    [glyph]      E
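Table 2. List of Aksara Wianjana (Consonants)
No.  Aksara Bali  Balinese Latin
1    [glyph]      h / a
2    [glyph]      n
3    [glyph]      c
4    [glyph]      r
5    [glyph]      k
6    [glyph]      d
7    [glyph]      t
8    [glyph]      s
9    [glyph]      w
10   [glyph]      l
11   [glyph]      m
12   [glyph]      g
13   [glyph]      b
14   [glyph]      ng
15   [glyph]      p
16   [glyph]      j
17   [glyph]      y
18   [glyph]      ny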
2.2. Data
The Aksara Bali samples used to build the pattern models and to test the character recognition are images of Aksara Bali from the study of I Komang Gede Suamba Dharmayasa (Dharmayasa, 2009). The samples were obtained from scans of Balinese language textbooks, from which the characters were extracted by per-block character segmentation, and from the internet.
2.3. Steps of Character Recognition of Aksara Bali
i. Data acquisition. This is a data conversion process. A scanner converts the analog Aksara Bali data into a digital image, which is stored as raw data in bitmap file format and processed in the next step.
ii. Pre-processing. If the bitmap file produced in the data acquisition phase is not already in two colors (black and white), the image must be converted into a two-color image. Data that are not required are then eliminated, so that only valid data are processed in the next step.
iii. Feature extraction. Feature extraction applies the special pattern models for Aksara Bali based on the Localized Arc Pattern Method, as shown in Figure 2. Each Aksara Bali binary image is processed to obtain the frequency of occurrence of each pattern. For patterns that have the same model number but different serial numbers, the frequencies of occurrence are summed to obtain the frequency of occurrence of the pattern model.
iv. Enrollment. Reference Aksara Bali are registered by extracting the features of several reference characters and storing the results in a reference database file.
v. Comparison. The comparison step is the core of the whole recognition process. Here, the features of the input Aksara Bali image are compared with the reference features stored in the database. At this stage, calculations are performed on the frequencies obtained during feature extraction. From these, the dissimilarity (dissimilarity measure) between each reference and the input image is obtained. The dissimilarity values are the basis of the recognition decision. The reference database is read one record (one reference character) at a time.
vi. Reference database design. Database design is the process of establishing the reference database file used during the recognition process. This recognition system uses 6 Aksara Bali samples for each character: 3 as references and the remaining 3 as comparisons for determining the threshold value. The reference database design phase consists of two main parts, namely registration of the reference Aksara Bali and determination of the threshold values, which are stored in one record keyed by ID number. After registration, the comparison Aksara Bali are compared with the reference to determine the threshold value. The comparison of the three Aksara Bali yields one dissimilarity value each. The median of these dissimilarity values is stored in the reference database alongside the sample frequencies and is used as the critical value (Cc), which is multiplied by a constant Cd to give the threshold.
vii. Decision making. This is the final step. It gives the decision resulting from the comparison process. The dissimilarity values obtained previously are sorted. The reference identity with the smallest dissimilarity value that also meets the threshold is taken as the Aksara Bali character corresponding to the input image. If the smallest dissimilarity value is above the threshold, the input Balinese character is declared not recognized. The threshold values are obtained from the earlier tests. If d(Pj, Qi) is the dissimilarity between the reference Balinese character Pj and the tested Balinese character Qi, Ccj is the critical value obtained previously for character Pj, and Cd is a constant multiplier, then the following relationship applies:
if d(Pj, Qi) ≤ Ccj × Cd then 'RECOGNIZED'
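The threshold and decision steps above can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the paper: the dissimilarity is a plain sum of absolute differences between frequency vectors (the paper does not give its formula), each character has a single reference vector, and all names and sample vectors are hypothetical.

```python
from statistics import median

def dissimilarity(ref, test):
    # Assumed measure: sum of absolute differences between two
    # pattern-frequency vectors (the paper does not state its formula).
    return sum(abs(r - t) for r, t in zip(ref, test))

def critical_value(reference, comparison_samples):
    # Critical value Cc: median dissimilarity between the reference
    # and its comparison samples.
    return median(dissimilarity(reference, s) for s in comparison_samples)

def recognize(test, references, cd):
    # references maps a character ID to (feature vector, critical value Cc).
    # Returns the ID with the smallest dissimilarity d if d <= Cc * Cd,
    # otherwise None (not recognized).
    best_id, best_d = None, None
    for char_id, (features, _cc) in references.items():
        d = dissimilarity(features, test)
        if best_d is None or d < best_d:
            best_id, best_d = char_id, d
    if best_id is not None and best_d <= references[best_id][1] * cd:
        return best_id
    return None

# Hypothetical three-model frequency vectors for two characters.
ref_na, ref_ca = [10, 2, 0], [1, 8, 4]
cc_na = critical_value(ref_na, [[9, 2, 1], [10, 3, 0], [8, 2, 0]])
cc_ca = critical_value(ref_ca, [[1, 7, 4], [2, 8, 4], [1, 8, 2]])
references = {"na": (ref_na, cc_na), "ca": (ref_ca, cc_ca)}

print(recognize([9, 3, 0], references, cd=2.0))
print(recognize([5, 5, 5], references, cd=2.0))
```

The first call falls within the threshold of its nearest reference and returns that ID; the second is closest to a reference but exceeds Cc × Cd, so it is reported as not recognized.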
2.4. Recognition System Modeling
Figure 1. Balinese Character Recognition System Modeling. Components shown in the diagram: image input (Balinese character); character enrolment; model pattern development by the system developer; database of model patterns; reference database; comparison with all records in the reference database; search for the smallest dissimilarity value; threshold and critical value; decision making; recognition result report (output: smallest dissimilarity value and ID Aksara).
3. RESULT AND DISCUSSION
The new pattern models are formed under the constraints of the Localized Arc Pattern Method for Japanese writing and Latin signatures, with the aim of reducing the number of pattern models used and thereby shortening the processing time of the system. The main constraint is that each pattern model is localized in a small 5 x 5 square; the selection, however, is based on samples of Aksara Bali.
3.1. Pattern Model Development
The patterns are formed from the characteristic point in a 5 x 5 square, which produces 125 possible initial patterns that can be grouped into 103 initial pattern models. Processing time is reduced by selecting, from these 125 patterns (shown in Figure 2), those that occur frequently in Aksara Bali.
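Counting how often a 5 x 5 pattern occurs in a binary image can be sketched as follows. This is only an illustration under stated assumptions: the templates below are hypothetical straight-line patterns rather than the arc patterns of Figure 2, and a window is assumed to match when every black cell of the template is also black in the image; the paper does not specify its matching rule.

```python
def count_pattern(image, template):
    # Count the windows of a binary image (1 = black) in which every
    # black cell of the square template is also black in the window.
    size = len(template)
    rows, cols = len(image), len(image[0])
    count = 0
    for top in range(rows - size + 1):
        for left in range(cols - size + 1):
            if all(
                image[top + r][left + c] == 1
                for r in range(size)
                for c in range(size)
                if template[r][c] == 1
            ):
                count += 1
    return count

# Hypothetical 5 x 5 templates: a horizontal and a vertical line
# through the centre (characteristic) point.
HORIZONTAL = [[0] * 5, [0] * 5, [1] * 5, [0] * 5, [0] * 5]
VERTICAL = [[0, 0, 1, 0, 0] for _ in range(5)]

# Hypothetical 5 x 7 binary image containing one horizontal stroke.
image = [[0] * 7, [0] * 7, [1] * 7, [0] * 7, [0] * 7]

print(count_pattern(image, HORIZONTAL))  # 3
print(count_pattern(image, VERTICAL))    # 0
```

Running such a count for every pattern over all sample images gives per-pattern frequencies of the kind used for model selection.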
3.2. Pattern Model Selection
Pattern serial numbers (No.) belonging to each pattern model (pattern images not reproduced):
Model    Pattern No.
1-58     1-58 (one pattern per model, same number)
59       59, 60, 61
60       62
61       63, 64, 65
62       66
63       67
64       68, 69, 70
65       71
66       72, 73, 74
67       75
68       76
69       77, 78
70       79
71       80, 81
72       82
73       83
74       84, 85
75       86
76       87, 88
77       89
78       90, 91
79       92
80       93
81       94, 95
82       96
83-91    97-105 (one pattern per model, in order)
92       106, 107, 108
93       109
94       110, 111, 112
95       113
96       114
97       115
98       116, 117, 118
99       119
100      120
101      121, 122, 123
102      124
103      125
Figure 2. All possible patterns from the Localized Arc Pattern Method in a 5 x 5 matrix.
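Summing the frequencies of patterns that share a model number can be sketched as below. The mapping is an excerpt of the pattern-to-model assignment listed in Figure 2; the per-pattern counts and the function name are hypothetical.

```python
from collections import defaultdict

# Pattern serial number -> model number (excerpt from Figure 2).
PATTERN_TO_MODEL = {
    58: 58, 59: 59, 60: 59, 61: 59, 62: 60, 63: 61, 64: 61, 65: 61,
}

def model_frequencies(pattern_counts, pattern_to_model):
    # Sum the counts of all patterns that belong to the same model.
    totals = defaultdict(int)
    for pattern_no, count in pattern_counts.items():
        totals[pattern_to_model[pattern_no]] += count
    return dict(totals)

# Hypothetical per-pattern counts for one character image.
counts = {58: 12, 59: 3, 60: 1, 61: 2, 62: 0, 63: 4}
print(model_frequencies(counts, PATTERN_TO_MODEL))
# {58: 12, 59: 6, 60: 0, 61: 4}
```

Patterns No. 59, 60 and 61 all belong to Model 59, so their counts are added together, while Model 58 keeps the count of its single pattern.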
Table 3. Frequency of the 23 selected patterns in 600 binary images of Aksara Bali
No  Model  Freq     No  Model  Freq
1   58     58154    13  82     126
2   1      36365    14  14     68
3   63     17262    15  12     49
4   46     11319    16  86     32
5   49     8896     17  19     30
6   4      743      18  10     23
7   83     539      19  26     19
8   2      489      20  90     15
9   6      244      21  31     15
10  5      226      22  13     14
11  3      223      23  47     10
12  8      171
Table 4. The 23 selected patterns reordered by model number and renumbered, with their frequencies
No  Model  Freq     No  Model  Freq
1   1      36365    13  26     19
2   2      489      14  31     15
3   3      223      15  46     11319
4   4      743      16  47     10
5   5      226      17  49     8896
6   6      244      18  58     58154
7   8      171      19  63     17262
8   10     23       20  82     126
9   12     49       21  83     539
10  13     14       22  86     32
11  14     68       23  90     15
12  19     30
Figure 3. The 23 Selected Special Pattern Models for Balinese Characters Based on the Localized Arc Pattern Method
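The reordering and renumbering of the selected models can be reproduced with a short sketch. The (model, frequency) pairs are those listed in Table 4; the function name is hypothetical, and the renumbering rule (sort by original model number, then number from 1) is inferred from the table.

```python
# (original model number, frequency) for the 23 selected models (Table 4).
SELECTED = [
    (1, 36365), (2, 489), (3, 223), (4, 743), (5, 226), (6, 244),
    (8, 171), (10, 23), (12, 49), (13, 14), (14, 68), (19, 30),
    (26, 19), (31, 15), (46, 11319), (47, 10), (49, 8896), (58, 58154),
    (63, 17262), (82, 126), (83, 539), (86, 32), (90, 15),
]

def rename_models(selected):
    # Sort by original model number and assign new numbers 1..n.
    ordered = sorted(selected)
    return {old: new for new, (old, _freq) in enumerate(ordered, start=1)}

new_number = rename_models(SELECTED)
by_frequency = sorted(SELECTED, key=lambda pair: pair[1], reverse=True)

print(new_number[58], new_number[90])  # 18 23
print(by_frequency[0])                 # (58, 58154)
```

Original Model 58, the most frequent, becomes No. 18 in the renumbered set, and original Model 90 becomes No. 23.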
In the final implementation of these pattern models in the Balinese Print Character Recognition System, system performance is measured by two types of error, namely rejection error (false rejection) and acceptance error (false acceptance). Over the tested combinations of the threshold constant multiplier Cd (2.0, 3.0, 4.0 and 5.0) and an eigenvalue cut-off constant q of 3, the system reached its minimum percentage of error, with an average system error of 3.6% and therefore a success rate of 96.4%.
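The two error types and the reported success rate can be related with a small sketch. The trial counts are hypothetical and the function names are illustrative; the paper does not state how its average system error is aggregated, so only the final step (success rate as the complement of the average error) is taken from the reported figures.

```python
def error_rates(false_rejections, genuine_trials,
                false_acceptances, impostor_trials):
    # Return (false rejection rate, false acceptance rate) in percent.
    frr = 100.0 * false_rejections / genuine_trials
    far = 100.0 * false_acceptances / impostor_trials
    return frr, far

def success_rate(average_error_percent):
    # Success rate as the complement of the average system error.
    return 100.0 - average_error_percent

# Hypothetical counts, for illustration only.
frr, far = error_rates(false_rejections=2, genuine_trials=100,
                       false_acceptances=5, impostor_trials=100)
print(f"FRR = {frr:.1f}%, FAR = {far:.1f}%")

# The reported average error of 3.6% gives the reported success rate.
print(f"success rate = {success_rate(3.6):.1f}%")
```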
4. CONCLUSION
Based on the trials and analysis that have been carried out, the following can be concluded:
4.1. Recognition of printed Aksara Bali centres on the feature extraction process, which is performed with a special pattern based on the Localized Arc Pattern Method. Model selection is carried out by applying the patterns to Aksara Bali image databases and is based on the accumulated frequency of occurrence of each pattern model. As can be seen from the percentage of errors and the processing time, this method proved quite effective and produces better performance for Aksara Bali recognition than the Indonesian signature pattern models.
4.2. The special patterns based on the Localized Arc Pattern Method for Balinese character images are formed from the characteristic point in a 5 x 5 square, which produces 125 possible initial patterns that can be grouped into 103 initial pattern models.
4.3. Processing time is reduced by selecting from the 125 patterns. The selection is performed by a computer program that calculates the frequency with which each pattern appears in a number of binary images of Balinese characters. The sample data used to establish the pattern models are 600 Aksara Bali images taken from several books and the internet. The model selection process yields 23 patterns.
4.4. System performance is measured by two types of error, namely rejection error (false rejection) and acceptance error (false acceptance). Over the tested combinations of the threshold constant multiplier Cd (2.0, 3.0, 4.0 and 5.0) and an eigenvalue cut-off constant q of 3, the system reached its minimum percentage of error, with an average system error of 3.6% and therefore a success rate of 96.4%.
REFERENCES
Agung BW, Rudy Hermanto I Gede, Retno Novi D ang. (2009). Pengenalan Huruf Bali dengan Menggunakan metode Modified Direction Feature (MDF) dan Learning Vector Quantization (LVQ). Konferensi Nasional Sistem dan Informatika 2009. Institut Teknologi Telkom, Bandung. yudiagusta.files.wordpress.com/.../007-012-knsi09-002-pengenalan-huruf-bali-menggunakan-metode-modified-direction-feature-_mdf
Oka Sudana, AA. K. (2006). Rancang Bangun Sistem Verifikasi Tandatangan dan Pengenalan Tulisan Tangan dengan Metode Pola Busur Terlokalisasi. Proceeding of the Research and Studies III. TPSDP – DIKTI 2006.
Oka Sudana, AA. K. (2007). Implementasi Pola Model Tandatangan Jepang dan Tandatangan Indonesia untuk Verifikasi Tandatangan Latin. Jurnal Pakar, Vol 7, No 4, Yogyakarta.
Shin-ichi Kikuchi, Takehiro Furuta, Takako Akakura. (2008). Periodical Examinees Identification in e-Test Systems Using the Localized Arc Pattern Method. Distance Learning and the Internet Conference 2008. p.213-220. Waseda University, Japan.
Suamba Dharmayasa, I Komang Gede. (2009). Pengenalan Karakter Bali Cetak Menggunakan Metode Moment dan Jaringan Syaraf Tiruan Learning Vector Quantization. Teknik Elektro Udayana, Jimbaran, Bali.
Yoshimura, I., Shimizu, T. and Yoshimura, M. (1993). A Zip Code Recognition System using the Localized Arc Pattern Method. Proceedings of 2nd International Conference on Document Analysis & Recognition. IEEE Computer Society. p.183-186.
Yoshimura, M. and Yoshimura, I. (1988). Writer Identification Based on the Arc Pattern Transformation. Proceedings of 9th International Conference on Pattern Recognition, November 14-17, 1988, IEEE Computer Society, Washington, p.353-361.
Yoshimura, I. and Yoshimura, M. (1994). Arc Pattern Method for Writer Recognition as an Aid for Person Identification. Nagoya University. p.71-82.
___.___. (2010). Aksara Bali. http://wapedia.mobi/id/Aksara_Bali . Accessed on 09
Special Pattern Development for Feature Extraction
in a Balinese Print Character Recognition System
Based on the Localized Arc Pattern Method
By
A.A.K. Oka Sudana
Gusti Agung Ayu Putri
Ni Kadek Ayu Wirdiani
OVERVIEW
Writing in each region has a variety of typefaces and its own uniqueness.
Aksara Bali characters are similar in shape to one another, and some are distinguished only by a single line stroke.
The system aims to make Balinese writing easier to read.
It is expected to provide an alternative method for the computerized recognition of images of Balinese characters.
The new pattern models are formed under the constraints of the Localized Arc Pattern Method for Japanese writing and Latin signatures.
SAMPLE OF AKSARA BALI
Localized Arc Pattern (figure). Labels shown: end points A and B; arcs for l = 3, 2, 1, 0, -1, -2, -3; distance; radius OA = 2AB / l.
The 23 Selected Special Pattern Models for Balinese Characters Based on the Localized Arc Pattern Method (figure: patterns No. 1 to No. 23, renumbered as Model 1 to Model 23)
Conclusion
Recognition of printed Aksara Bali centres on the feature extraction process, which is performed with a special pattern based on the Localized Arc Pattern Method.
Model selection is carried out by applying the patterns to Aksara Bali image databases and is based on the accumulated frequency of occurrence of each pattern model.
The special patterns based on the Localized Arc Pattern Method for Balinese character images are formed from the characteristic point in a 5 x 5 square, which produces 125 possible initial patterns.
Processing time is reduced by selecting from the 125 patterns, using a computer program to calculate the frequency with which each pattern appears in a number of binary images of Balinese characters. The model selection process yields 23 patterns.
System performance is measured by two types of error, namely rejection error (false rejection) and acceptance error (false acceptance). The system reached its minimum percentage of error with an average system error of 3.6%, a success rate of 96.4%.