
B..4 Tess-Two

Tesseract OCR is written in C++, so it cannot be called directly from an Android application’s Java code. Tess-Two is a free, open-source Android library project that bridges this gap by providing a Java API for accessing natively compiled Tesseract and Leptonica APIs [35].

B..5 Tess2Speech’s Tessdata folder

Data files (.traineddata) for Tesseract OCR must be placed in Tesseract’s ’tessdata’ folder for the engine to detect the trained data. The same applies to Tess2Speech.

For trained data to be used in Tess2Speech, it must be located in Tess2Speech’s ’tessdata’ folder, which is usually found in the ’Phone Storage/Android/Data/anteraaron.tess2speech/files’ directory.
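The folder convention above can be illustrated with a minimal plain-Java sketch. The class and method names (`TessdataLocator`, `trainedDataFile`, `isInstalled`) are hypothetical and are not taken from Tess2Speech. The `baseDir` argument stands in for the application's files directory described in the text; how the real application obtains that directory is not stated in the source.

```java
import java.io.File;

/** Hypothetical helper: checks whether trained data sits where Tesseract expects it. */
final class TessdataLocator {
    private TessdataLocator() {}

    /**
     * Returns the expected location of <lang>.traineddata under baseDir/tessdata.
     * baseDir is an assumption standing in for the app's files directory.
     */
    static File trainedDataFile(File baseDir, String lang) {
        return new File(new File(baseDir, "tessdata"), lang + ".traineddata");
    }

    /** True only if the trained data file exists and is a regular file. */
    static boolean isInstalled(File baseDir, String lang) {
        return trainedDataFile(baseDir, lang).isFile();
    }
}
```

A check like this could run before OCR starts, so that a missing language file produces a clear message instead of a failed recognition.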

B..6 Text-to-Speech System

Any Text-to-Speech (TTS) system installed on the phone can be used with Tess2Speech. This means that the performance of Tess2Speech’s TTS depends on the TTS engine in use on the phone. Text-to-Speech systems can be downloaded from the Google Play Store, and TTS settings can be changed in the phone’s Android settings.

B..7 Tess2Speech’s Canvas

As stated in the Background of the Study, stylus-based applications are mostly proprietary and expensive. As a solution, I developed a pseudo stylus-based application. In the canvas part of Tess2Speech, I created a Canvas object on the phone’s screen, which works similarly to MS Paint. The content of that canvas is converted to an image every time the handwriting is to be converted to text or speech. The performance of this canvas depends on the phone’s hardware (touchscreen capability).

B..8 Tess2Speech’s Image Viewer

Tess2Speech accepts only input images in PNG, JPEG, GIF, and BMP format because Android’s ImageView object accepts these same image formats. To support pinch-zoom, double-tap zoom, and pan functionality in the ImageView, I used an open-source Android View class called GestureImageView by jasonpolites [36].
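The format restriction can be sketched as a simple file-name check. This is an illustration only: the source does not say how Tess2Speech actually filters input, and the class name `SupportedImages` is hypothetical. The extension list follows the four formats named above, with "jpg" added as an assumption because it is the common short form of JPEG.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper: accepts only the image formats named in the text. */
final class SupportedImages {
    // "jpg" is an assumption; the text lists PNG, JPEG, GIF, and BMP.
    private static final Set<String> EXTENSIONS =
            new HashSet<>(Arrays.asList("png", "jpg", "jpeg", "gif", "bmp"));

    private SupportedImages() {}

    /** Returns true if the file name ends in one of the accepted extensions. */
    static boolean isSupported(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return false; // no extension at all
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return EXTENSIONS.contains(ext);
    }
}
```

Checking by extension is the simplest option; inspecting the file header would be more robust against misnamed files.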

B..9 Tess2Speech’s Camera

Almost all smartphones have a camera, and it is accessible to the applications installed on the phone. The problem is that camera resolution varies from phone to phone. Since Tess2Speech relies mostly on image quality, an image taken with a high-resolution camera will most likely yield better accuracy.

B..10 Image Cropping and Rotation

As stated in item 7 of the Scope and Limitations, the border of the paper should not be visible in the image, as it adds noise and may be interpreted as a character such as ’l’ (lowercase L) or ’1’ (one). As a solution, I gave the user the option to crop the image and select only the relevant part. I used an open-source image cropper library by ArthurHub [37] and modified it slightly to fit my preferences.

Since image de-skewing is hard to implement without large libraries, I also provided an alternative: the user can manually rotate the image to align the text correctly.
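To make the two operations concrete, the sketch below crops and rotates an image stored as a plain grid of pixel values. It does not use the ArthurHub cropper library or any Android class; `ImageOps` and its methods are hypothetical names. Rotation in 90-degree steps is an assumption, since the source does not say which angles the user can choose.

```java
/** Hypothetical helper working on a non-empty rectangular grid: pixels[row][column]. */
final class ImageOps {
    private ImageOps() {}

    /** Returns the width x height sub-image whose top-left corner is (left, top). */
    static int[][] crop(int[][] pixels, int left, int top, int width, int height) {
        int[][] out = new int[height][width];
        for (int y = 0; y < height; y++) {
            System.arraycopy(pixels[top + y], left, out[y], 0, width);
        }
        return out;
    }

    /** Rotates the image 90 degrees clockwise (an assumed rotation step). */
    static int[][] rotate90Clockwise(int[][] pixels) {
        int rows = pixels.length;
        int cols = pixels[0].length;
        int[][] out = new int[cols][rows];
        for (int y = 0; y < rows; y++) {
            for (int x = 0; x < cols; x++) {
                // The pixel at row y, column x moves to row x, column (rows - 1 - y).
                out[x][rows - 1 - y] = pixels[y][x];
            }
        }
        return out;
    }
}
```

Cropping away the paper border before recognition removes the edge pixels that could otherwise be read as ’l’ or ’1’.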

B..11 Tess2Speech’s PDF Viewer

Android supports viewing PDF files starting from API level 21. Since the minimum API level of Tess2Speech is API level 16, I used an open-source PDF viewer to support lower API levels. Android-pdfview is a library created by Joan Zapata [38] that provides a fast PDFView component for Android, with animations, gestures, and zoom. It is based on VuDroid [39], an open-source PDF decoder hosted on Google Code.

B..12 PDF to Image and Vice-Versa

As stated in B..11, the PDF viewer uses VuDroid [39]. Since VuDroid is a PDF decoder, it should be capable of converting a PDF to an image.

After studying the VuDroid source code, I found a way to convert a PDF to an image and vice versa. To make full use of the library, I will incorporate these functions in Tess2Speech.

B..13 Tess2Speech’s Ebook Viewer

Ebooks with the .epub extension are essentially a zip archive of HTML files. Since Android has no built-in capability to display ebooks, I used an open-source library called EPubLib by Paul Siegmann [40]. EPubLib is a Java library for reading, writing, and manipulating EPUB files. The problem is that the EPUB’s formatting is not preserved when using EPubLib. I created a workaround that displays the ebook with its proper formatting.
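The statement that an EPUB is a zip archive of HTML files can be checked with the standard `java.util.zip` classes alone. The sketch below does not use EPubLib and does not reproduce the formatting workaround, which the source does not describe; `EpubContents` is a hypothetical name, and the set of extensions treated as HTML is an assumption.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Hypothetical helper: treats an EPUB as an ordinary zip archive. */
final class EpubContents {
    private EpubContents() {}

    /** Lists the names of HTML/XHTML entries in the archive, in archive order. */
    static List<String> htmlEntries(InputStream epub) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(epub)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String lower = entry.getName().toLowerCase(Locale.ROOT);
                // Assumed extensions for content documents.
                boolean isHtml = lower.endsWith(".html")
                        || lower.endsWith(".xhtml")
                        || lower.endsWith(".htm");
                if (!entry.isDirectory() && isHtml) {
                    names.add(entry.getName());
                }
            }
        }
        return names;
    }
}
```

Passing a `FileInputStream` opened on an .epub file returns the content documents that a viewer would then need to render along with their stylesheets.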

B..14 Tess2Speech’s Built-in File Picker

I also incorporated a built-in file picker in Tess2Speech in case the user does not have a compatible file picker installed on their phone. The file picker I used is an open-source library by Anders Kaloer [41].

B..15 Saving Files

Converted text can be saved as a text file (.txt) or, by using VuDroid [39], as a PDF file (.pdf). An SD card is required to save Tess2Speech files. If an SD card is not present, Tess2Speech will notify the user and automatically close the application.
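The .txt case and the storage check can be sketched in plain Java as follows. This is not the Tess2Speech implementation: `TextSaver` is a hypothetical name, the UTF-8 encoding is an assumption, and a thrown `IOException` stands in for the "no SD card" prompt described above. PDF output is not shown because the source does not describe how it is produced.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;

/** Hypothetical helper: saves converted text as a .txt file. */
final class TextSaver {
    private TextSaver() {}

    /**
     * Writes text to <name>.txt inside storageDir and returns the written file.
     * Throws IOException when storageDir is missing or not writable, which
     * stands in for the missing SD card case.
     */
    static File saveAsTxt(File storageDir, String name, String text) throws IOException {
        if (!storageDir.isDirectory() || !storageDir.canWrite()) {
            throw new IOException("Storage not available: " + storageDir);
        }
        File target = new File(storageDir, name + ".txt");
        // UTF-8 is an assumed encoding; the source does not specify one.
        try (Writer out = new OutputStreamWriter(new FileOutputStream(target), "UTF-8")) {
            out.write(text);
        }
        return target;
    }
}
```

Checking the directory before writing lets the caller tell "storage missing" apart from a failure in the middle of a write.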

B..16 Settings

The Preference Screen of Tess2Speech contains the application’s settings, such as the options to turn on Image Preprocessing and Automatic Image Resizing (for faster conversion), the Tesseract language, Help, and Licenses.

B..17 Tess2Speech Trainer

Tess2Speech Trainer is a user-friendly desktop application that allows users to personalize and train Tess2Speech to recognize a new font or handwriting with a single click of a button. It produces a .traineddata file that can be used by Tess2Speech.

The Box Editor of Tess2Speech Trainer is a modified jTessBoxEditor by VietOCR [32]. I added an indicator showing which box files have already been visited, to help users see whether they missed editing some characters. I removed the jTessBoxEditor functions that are not needed and integrated the editor into Tess2Speech Trainer’s graphical user interface.

Tess2Speech Trainer is written in Java and distributed as an executable .jar file. It is also a stand-alone application, which means that no installation is required for the application itself. The user only needs an installed Java runtime to run the .jar executable inside the Tess2Speech Trainer folder. Note that the ’Tess2Speech Trainer.jar’ file cannot be moved outside its folder, or it will not run. Only creating a shortcut to it is supported.
