Table of Content

Nguyễn Gia Hào

Academic year: 2023


I certify that this report entitled “Handwriting Recognition on Library Book Label” is my own work, except as cited in the references. I also want to thank my friends, who helped me finish the project in time. This project aims to design a character recognizer for use in a smartphone application to identify misplaced books on a library shelf.

The application works as follows: The user first takes a photo of books on a library shelf. If a book is misplaced, its call number will be out of sequence with the others, and the application can alert the user to such a situation. There are two important technical problems to overcome, namely image segmentation and handwriting recognition.

Two properties are required of the handwriting recognizer for the mobile application to be usable: (1) the recognizer must be able to analyze each photo in a fraction of a second; (2) the recognizer must achieve almost perfect accuracy. The resulting system, when tested using our database of handwritten characters from library book labels, showed near-perfect accuracy.

INTRODUCTION

Project Overview

It is routine for librarians to browse the books on the library shelves to look for misplaced literature. However, the task can be assisted by a device equipped with a camera: the librarian simply takes pictures of the books, and the device then automatically identifies the misplaced books based on the call numbers in the pictures. Such a device can be realized inexpensively through a software application installed on ordinary smartphones, which are ubiquitous today.

In this project, we are interested in the third of these components, that is, the problem of offline handwritten text recognition.

Figure 1.1.1: Books on the bookshelf and call numbers written on book label. If a book is misplaced, then its call number will be out-of-sequence with the call numbers of its neighbouring books.
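
The out-of-sequence check described in the caption can be sketched as follows. This is only an illustration, not the application's actual logic: it assumes the call numbers have already been recognized and parsed into comparable keys (plain numbers here), and the function name `find_out_of_sequence` is hypothetical. A book is flagged when its two neighbours are in order with each other but the book itself does not fit between them.

```python
def find_out_of_sequence(keys):
    """Return indices of items that break the ascending order of their neighbours.

    `keys` is assumed to be a list of already-parsed, comparable call-number keys
    in shelf order (an assumption; parsing call numbers is not covered here).
    """
    flagged = []
    for i, key in enumerate(keys):
        prev_key = keys[i - 1] if i > 0 else None
        next_key = keys[i + 1] if i + 1 < len(keys) else None
        fits_prev = prev_key is None or prev_key <= key
        fits_next = next_key is None or key <= next_key
        # Only blame this item if its neighbours are consistent with each other.
        neighbours_agree = (
            prev_key is None or next_key is None or prev_key <= next_key
        )
        if neighbours_agree and not (fits_prev and fits_next):
            flagged.append(i)
    return flagged


shelf = [1, 2, 9, 3, 4]
print(find_out_of_sequence(shelf))  # [2], the book with key 9
```

This neighbour test only handles isolated misplacements; several misplaced books in a row would need a stronger criterion.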

Project Scope and Objectives

Methods/Technologies Involved

LITERATURE REVIEW

Common Methods Used in Character Recognition

  • Artificial Neural Network
  • Freeman Chain Code
  • Edit Distance

As mentioned above, Nawrin & Hassan (2012) reported that, by using Freeman Chain Code as features, almost 100% accuracy was achieved in the classification of Bangla characters. Likewise, Gaurav & Jayashree (2013) mentioned the use of Freeman Chain Code as a feature for character recognition. A Freeman Chain Code is an alternative representation (that is, a feature) of the image to be identified.

Therefore, the character of a chain code can be estimated by examining how well it matches chain codes of known characters. One possible method of comparing two chain codes is via the so-called edit distance (or Levenshtein distance), which measures the difference between two given sequences of characters. The edit distance between two strings is the minimum number of single-character insertions, deletions and substitutions needed to transform one string into the other.

The smaller the edit distance between two sequences, the more similar they are. It is suitable for this program because the string code can be treated as a sequence of characters.
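
As an illustration, the edit distance can be computed with the standard dynamic-programming recurrence. The sketch below assumes unit cost for every insertion, deletion and substitution and represents chain codes as strings of digits; the report may use different costs or a different representation.

```python
def edit_distance(a, b):
    """Levenshtein distance between sequences a and b (unit costs assumed)."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty prefix of b
        for j, item_b in enumerate(b, start=1):
            substitution_cost = 0 if item_a == item_b else 1
            current.append(
                min(
                    previous[j] + 1,                      # delete item_a
                    current[j - 1] + 1,                   # insert item_b
                    previous[j - 1] + substitution_cost,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


print(edit_distance("0066", "0766"))  # 1: one direction differs
```

Only two rows of the table are kept at a time, so memory grows with the length of the shorter dimension rather than with the product of both lengths.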

Figure 2.1.2.1: Direction representation by Freeman Chain Code

METHODOLOGY

A literature search was also conducted, examining articles and journal publications to identify possible solutions to the problem. A consistent naming format was essential in the testing phase, where a large number of images must be processed semi-automatically. After summarizing the findings from the previous phase, possible algorithms are designed to solve the problem.

These algorithms are tested against test samples so that I can examine their strengths and weaknesses and find solutions to overcome potential pitfalls. This is the most critical part of the entire process, as it affects the final results produced. Several sets of test samples will be used to ensure that the results obtained are not biased.

Testing is also used to assess whether the system achieves the goals and expected values defined in the planning phase.

ALGORITHM IMPLEMENTATION

Main Handwriting Recognizer Algorithm

If the neural network is able to identify the image with an output higher than 0.8, the character is determined and the whole process ends. If the received character is classified as 1, B, D, K or M, use chain code composition analysis and KNN to identify the characters.
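
The staged decision described above can be sketched as a small cascade. The three classifier arguments are hypothetical placeholders (any callables with the stated return values), not the report's actual functions; the 0.8 threshold and the set of characters come from the text.

```python
# Characters that, per the text, need the extra composition/KNN stage.
AMBIGUOUS = {"1", "B", "D", "K", "M"}


def recognize(image, ann, chain_matcher, composition_knn, threshold=0.8):
    """Run the recognizers in order and stop at the first confident answer.

    ann(image) -> (label, score); chain_matcher(image) -> label;
    composition_knn(image) -> label. All three are assumed interfaces.
    """
    label, score = ann(image)
    if score > threshold:
        return label  # the neural network is confident; the process ends here
    label = chain_matcher(image)  # second stage: chain code matching
    if label in AMBIGUOUS:
        label = composition_knn(image)  # third stage for the hard characters
    return label
```

For example, `recognize(img, ann, matcher, knn)` returns the network's label whenever its score exceeds 0.8 and only falls through to the slower stages otherwise.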

Figure 4.1.2: Main procedure of the program
1. Read input image (.bmp).

Detailed Description of Main Procedure

  • Image Pre-processing – Crop and Scale Image
  • Feature Extraction
  • Classification with Artificial Neural Network
  • Image Pre-processing – Filtering Freeman Chain Code Database
  • Image Pre-processing – Obtain Outline of Character
  • Feature Extraction – Freeman Chain Code
  • Classification – Edit Distance
  • Classification – Chain Code Composition Analysis
  • Overall Results Obtained

In the second identification process, a database of Freeman chain codes is used to match the chain code of the input image (more details in section 4.2.6-7). To overcome this problem, the chain code database is first filtered to remove the chain codes that are unlikely to match the chain code of the image. To do this, I compare the image of each chain code with the image of the input character.

If the image for a chain code does not match the input image, the chain code is removed from the database and not used in identification. The next step after filtering the chain code database is to transform the input image into a Freeman chain code. To overcome this, I use the outline of the handwritten character to produce the chain code.

The algorithm moves to this neighboring pixel, say x, and records the corresponding chain code for the movement. This traversal procedure is repeated until the start pixel is encountered again, at which point the entire outline has been traversed and the sequence of recorded movements is the contour's chain code. This can ensure that the chain code starts in the upper left area of each character.
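
A minimal sketch of such a traversal is given below. It rests on several assumptions that the text does not confirm: the image is a binary grid (list of lists, 1 = character pixel), directions are numbered 0 to 7 counter-clockwise starting from east, the start pixel is the first character pixel found in a top-to-bottom, left-to-right scan, and the search order follows the common inner-boundary-tracing rule for 8-connectivity. The simple stop rule (stop on returning to the start pixel) mirrors the description above and can end early on shapes that pass through the start pixel more than once.

```python
# Freeman directions as (row_step, col_step); 0 = east, counter-clockwise (assumed).
DIRECTIONS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]


def trace_chain_code(image):
    """Trace the outer boundary of the shape in a binary image as a chain code."""
    rows, cols = len(image), len(image[0])

    def is_foreground(r, c):
        return 0 <= r < rows and 0 <= c < cols and image[r][c] == 1

    # Start at the upper-left-most foreground pixel (raster-scan order).
    start = next(
        ((r, c) for r in range(rows) for c in range(cols) if image[r][c] == 1),
        None,
    )
    if start is None:
        return []

    chain = []
    current, direction = start, 7
    for _ in range(4 * rows * cols):  # safety bound against endless loops
        # Where to begin searching depends on the parity of the last move.
        first = (direction + 7) % 8 if direction % 2 == 0 else (direction + 6) % 8
        for offset in range(8):
            candidate = (first + offset) % 8
            step_r, step_c = DIRECTIONS[candidate]
            neighbour = (current[0] + step_r, current[1] + step_c)
            if is_foreground(*neighbour):
                chain.append(candidate)  # record the movement
                current, direction = neighbour, candidate
                break
        else:
            return chain  # isolated pixel: no neighbour to move to
        if current == start:
            return chain  # back at the start pixel: outline complete
    return chain


square = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
print(trace_chain_code(square))  # [6, 0, 2, 4]: down, right, up, left
```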

Such noise is immediately apparent when closely examining the chain code; it is likely to appear as an anomaly among a series of the same numbers. The process is repeated until the (n-2)-th element, where n is the total length of the chain code. Finally, the converted string code of the input character is compared to the string codes in the filtered database.
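
One simple way to remove such anomalies is sketched below. The exact rule used in the report is not given in this excerpt, so the rule here is an assumption: a code is treated as noise when both of its neighbours are equal to each other but differ from it, and it is then replaced by the neighbours' value. The function name `smooth_chain_code` is hypothetical.

```python
def smooth_chain_code(chain):
    """Replace isolated codes that sit between two identical neighbours."""
    smoothed = list(chain)  # work on a copy; the input is left untouched
    # The first and last elements have only one neighbour and are skipped.
    for i in range(1, len(smoothed) - 1):
        left, middle, right = smoothed[i - 1], smoothed[i], smoothed[i + 1]
        if left == right and middle != left:
            smoothed[i] = left  # the lone outlier takes its neighbours' value
    return smoothed


print(smooth_chain_code([0, 0, 1, 0, 0, 6, 6]))  # [0, 0, 0, 0, 0, 6, 6]
```

A genuine corner, where the codes on either side differ, is left unchanged by this rule.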

The characters that the chain code comparison failed to identify are "1", "B", "D", "K" and "M". Therefore, the differences in the composition of their chain codes can be effectively used to distinguish them from the other characters. Although the composition of the chain code effectively differentiates between the different chain codes, it does not consistently give the highest similarity score to the correct chain code.

That is, the K chain codes in the database whose compositions are closest to that of the input chain code are obtained. In such a situation, I use the KNN results together with the output values of the ANN (explained in 4.1.4) to modify the difference score from the chain code comparison.
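
The composition-based nearest-neighbour step can be sketched as follows. Several details are assumptions, since the excerpt does not define them: "composition" is taken to mean the relative frequency of each of the eight directions in a chain code, closeness is measured with Euclidean distance, and the K neighbours vote by simple majority. The function names and the database layout (a list of `(label, chain)` pairs) are hypothetical, and combining the result with the ANN outputs is not shown.

```python
import math
from collections import Counter


def composition(chain):
    """Relative frequency of each direction 0..7 in a chain code (assumed meaning)."""
    if not chain:
        return [0.0] * 8
    counts = Counter(chain)
    return [counts.get(direction, 0) / len(chain) for direction in range(8)]


def knn_by_composition(query_chain, database, k=3):
    """Label of the majority among the k database entries closest in composition."""
    query = composition(query_chain)
    ranked = sorted(
        database,
        key=lambda entry: math.dist(query, composition(entry[1])),
    )
    votes = Counter(label for label, _ in ranked[:k])
    return votes.most_common(1)[0][0]


database = [
    ("1", [6, 6, 6, 6, 2, 2, 2, 2]),
    ("1", [6, 6, 6, 7, 2, 2, 2, 3]),
    ("B", [0, 1, 2, 3, 4, 5, 6, 7]),
    ("B", [0, 0, 2, 3, 4, 4, 6, 7]),
]
print(knn_by_composition([6, 6, 6, 6, 2, 2, 2, 3], database, k=3))  # 1
```

Because the composition ignores the order of the directions, it complements the edit distance, which is order-sensitive.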

Figure 4.2.1.3: After crop and scale image

Sample Preparation

Training and Testing Engine

  • Artificial Neural Network
  • Freeman Chain Code Recognizer

Each row has a score indicative of the input's similarity to the character represented by the row. For example, the score in the first row indicates the similarity between the test sample and the character "0", the score in the second row indicates the similarity between the test sample and the character "1", and so on. During training, I noticed that when the score is 0.8 or above, the identified character is always correct.

The ideal trained network is the one that produces the most results with an output of 0.8 or above. Training and testing are therefore performed several times, and I chose the neural network whose output is closest to this ideal state. For the Freeman chain code analysis, no training is required to fit parameters; only the preprocessing needed to prepare the chain code database is required.
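
The selection among repeatedly trained networks could be sketched as below. This is one possible reading of "closest to the ideal state", not the report's stated procedure: each candidate is scored by the fraction of test samples whose highest output is at least 0.8, and the candidate with the largest fraction is kept. The names and the data layout (a dictionary from network name to a list of per-sample score vectors) are hypothetical.

```python
def confident_fraction(outputs, threshold=0.8):
    """Fraction of samples whose highest score reaches the threshold."""
    if not outputs:
        return 0.0
    confident = sum(1 for scores in outputs if max(scores) >= threshold)
    return confident / len(outputs)


def pick_network(candidates, threshold=0.8):
    """Name of the candidate with the largest fraction of confident outputs."""
    return max(
        candidates,
        key=lambda name: confident_fraction(candidates[name], threshold),
    )


runs = {
    "net_a": [[0.9, 0.1], [0.5, 0.5]],    # one of two samples is confident
    "net_b": [[0.85, 0.1], [0.2, 0.95]],  # both samples are confident
}
print(pick_network(runs))  # net_b
```

This criterion measures confidence only; checking that the confident answers are also correct would be a separate step on labelled test samples.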

Figure 4.3.1.2: Sample output of neural network (Input = Testing)

DISCUSSION

Achievement, Future Improvement and Conclusion

Hassan, 2012, ‘Optical Bangla Character Recognition Using Chain Code’, IEEE/OSA/IAPR International Conference on Computing, Electronics and.

Jayashree M. Kundargi, 2013, ‘A review of OCR feature extraction techniques for offline handwriting focused Indian scripts’, International Journal of Engineering Research and Applications (IJERA), Vol. 3, Issue 1, January–February 2013, pp. 919-926.

Figure 2.1.2.2: Freeman Chain Code travels for character ‘C’