
International Journal of Technology Management and Information System (IJTMIS) eISSN: 2710-6268 [Vol. 4 No. 4 December 2022]

Journal website: http://myjms.mohe.gov.my/index.php/ijtmis

AN ASSISTIVE BAHASA MALAYSIA SIGN LANGUAGE TRANSLATOR USING CONVOLUTIONAL NEURAL NETWORK

Gloria Jennis Tan1*, Nur Suhada Khairu2, Tan Chi Wee3, Ung Ling Ling4, Ngo Kea Leng5 and Zeti Darleena Eri6

1 2 6 Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA Cawangan Terengganu, Kuala Terengganu, MALAYSIA

3 Faculty of Computing and Information Technology, Tunku Abdul Rahman University College, Kuala Lumpur, MALAYSIA

4 Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA Cawangan Sabah, Kota Kinabalu, MALAYSIA

5 Academy of Language Studies, Universiti Teknologi MARA Cawangan Terengganu, Kuala Terengganu, MALAYSIA

*Corresponding author: [email protected]

Article Information:

Article history:

Received date : 12 November 2022
Revised date : 25 November 2022
Accepted date : 13 December 2022
Published date : 15 December 2022

To cite this document:

Tan, G. J., Khairu, N. S., Tan, C. W., Ung, L. L., Ngo, K. L., & Eri, Z. D. (2022). An assistive Bahasa Malaysia sign language translator using Convolutional Neural Network. International Journal of Technology Management and Information System, 4(4), 39-52.

Abstract: The aim of this study is to develop a prototype Bahasa Melayu Sign Language translator that converts hand signs into written text using a deep Convolutional Neural Network (CNN). Tremendous work has been done on sign language translators, which have enabled the deaf-mute to communicate efficiently with the community. However, most of this work focuses on English textual content, which calls for an urgent need to design and develop a sign language translator for the Malay language. Deep CNN architectures were trained and tested on a public dataset to classify the alphabets and numerals of sign language. Several experiments were conducted to extract the significant features and to establish benchmarks for future work, as this area still leaves ample room for further research. The proposed sign language translator mainly aims to help and assist the deaf-mute community to communicate, while also helping both communities to learn sign language and creating awareness of sign language in Malaysia.

Keywords: Sign Language (SL), Bahasa Isyarat Malaysia (BIM), Assistive Technology, Deaf, Hard of Hearing.

1. Introduction

Communication is crucial in daily life, and communicating effectively is one of the most essential life skills (Alshamrani & Bahattab, 2015). At its most fundamental, communication is the act of conveying information between two persons through verbal, written, visual, or nonverbal means.

Successful communication is essential in developing self-personality, establishing relationships, building a career, and forming social networks (American Sign Language and British Sign Language Differences, 2021).

Sign language is a natural language, and it is the primary language used by the deaf community to communicate; it is also the language most widely used by others when communicating with deaf-mute communities. Deaf-mute communities face problems in public because most people are not familiar with sign language, which can lead to miscommunication. Miscommunication undermines effective communication and should be avoided. Without a proper solution to this issue, the deaf-mute community will not be able to communicate, eventually affecting their social life, career, and livelihood.

To overcome the communication barrier between the deaf-mute community and the hearing community, sign language translators were invented to allow the deaf-mute to communicate freely in public.

In recent years, sign language has evolved: different signs were created for different languages, building up the vocabularies (Awaludin, 2021). Various sign languages are used around the world; one of the most commonly used is American Sign Language (ASL).

According to research, only 30% of the signs in American Sign Language (ASL) and British Sign Language (BSL) are identical; in Malaysia, Malaysians use Bahasa Isyarat Malaysia (BIM, or Malaysian Sign Language) to communicate. This has encouraged researchers to design and develop sign language translators for different languages using various techniques. However, most have focused on English textual content (Jayadeep et al., 2020; Abou Haidar et al., 2019; Malo et al., 2022). Therefore, this project aims to design and develop a simple prototype that translates Malay sign language to text.

In Malaysia, Bahasa Isyarat Malaysia (BIM) is used for daily communication within the deaf-mute community. Initially, BIM was used only by the deaf community; as the number of deaf-mute community members increased, BIM was also introduced to the mute. Based on the Malaysian Federation of the Deaf, awareness of BIM is still lacking and will create a barrier between these two communities. The increasing number of people with hearing disabilities makes BIM necessary in the daily life of the deaf-mute community.

The main aim of this proposed project is to help and assist the Malaysian deaf-mute community to communicate, while also serving as a teaching aid. Further, this sign language translator is important for creating awareness of sign language among Malaysians: by developing this translator and putting it into public use in the future, the community will, without realizing it, also become aware of sign language.

2. Literature Review

A sign language translator system is a development capable of translating sign language input into text or sound. Such a translator can recognize hand gestures virtually through a camera, or by means of a device such as a glove; gestures are identified from images or video in the camera-based approach, or through gesture sensors in the device-based approach.

Gestures identified through the camera are passed to a machine learning algorithm so that each captured gesture can be categorized into the desired class. The system then interprets the result for the user by showing the translated sign language in the form of audio or text.

There are two types of sign language translators: one uses a human interpreter, while the other is developed using deep learning to recognize sign language gestures. Image processing is one technique that can be used to improve an image or extract relevant information from it.

2.1 Bahasa Melayu Sign Language

Bahasa Melayu sign language, also known as Bahasa Isyarat Malaysia (BIM), is the language put into practice by Malaysian disabled communities, especially the deaf-mute communities, and it is the main language for hearing-impaired people in Malaysia. According to Jabatan Kebajikan Masyarakat (JKM), by 2005 almost 26,294 people with hearing disabilities were already registered with JKM.

BIM was introduced in Malaysia due to the increasing number of deaf-mute community members, to ensure that they can communicate efficiently. Initially used by the deaf community, BIM relies on hand and body movements and facial expressions to communicate with others. According to Mohd Jalani et al. (2021), the alphabet of Malaysian sign language is also used to spell out words, names, or other signs.

Persekutuan Orang Pekak Malaysia (MFD) released a Bahasa Isyarat Malaysia book as a reference medium for the deaf community to learn BIM, and BIM is used as the standard sign language in Malaysia. Bahasa Melayu sign language comprises various state-based dialects, such as Penang Sign Language (PSL) and Selangor Sign Language (SSL). Although BIM was derived from ASL, the two still differ in concept, and the difference can be seen in the gestures for certain alphabets. Figure 1 shows the Bahasa Melayu sign language alphabet used by Malaysian deaf-mute communities to communicate, while Figure 2 shows the difference between BIM and ASL for the alphabet "G".

2.2 Sign Language Translator System

A sign language translator system is a technology developed to translate sign language input into normal text or sound. It can recognize hand gestures virtually through a camera, or by using a device such as a glove; recognition works on images or video for the camera-based translator, and on gesture-sensor readings for the device-based translator.

Figure 1: Bahasa Melayu Sign Language Alphabet.

(Source: MyHEALTH Kementerian Kesihatan Malaysia)

Figure 2: Difference Between BIM (left) and ASL (right) for "G" Alphabet.

(Source: Yuliia Moisieieva, 2020 & Signing Savy, 2022)

Gestures recognized through the camera are processed by the machine learning algorithm so that each captured gesture can be classified into the desired class. The system then interprets the result for the user by showing the translated sign language in the form of audio or text.


In this project, machine learning is used to translate the sign language. Each image is input into the system and undergoes image pre-processing to isolate the desired object; the collected dataset is then trained and tested using a machine learning algorithm, in this case a CNN model, which classifies the type of sign language. After image pre-processing, the image undergoes sign language detection, and the translated sign language is shown to the user. Figure 3 shows a sign language translator implemented on a device, while Figure 4 shows a sign language translator not implemented on a device.

Figure 3: Sign Language Translator Implement on Device.

(Source: Nisha Guragain, 2021)

Figure 4: Sign Language Translator not Implemented on Device.

(Source: Zhihio Zhou, 2019)


2.3 Similar Works

A sign language translator need not use only CNN; various techniques can be used to build one. The translator by P & Al (2021) used a Natural Language Processing algorithm, with a dataset of Indian Sign Language words used to map text, or text identified from voice, to signs; the results showed that the time the translator devotes to translating speech to sign language depends on the length of the phrase. Next, in Dynamic Gesture Recognition for Indian Sign Language Modelling by Singh (2021), the dataset consisted of 20 Indian Sign Language hand gestures with different backgrounds; 1,920 samples were used to train the network, and the accuracy gained was 88.24%.

The back-propagation algorithm of an MLP was used in the sign language translator by Abou Haidar et al. (2019): a glove captured the input from the user, back-propagation processed it, and the accuracy gained was 95.3861%. Next, Region of Interest (ROI) segmentation and CNN were used to build the sign language translator of Khan et al. (2019), which used the Bangla sign language dictionary as its dataset and gained an accuracy of 97.54%.

A sign language translator using CNN has also gained a high accuracy of 95% by recognizing finger spelling from gestures captured by a camera (Ojha et al., 2020). Lastly, Memon et al. (2017) built a sign language translator using K-Nearest Neighbours (KNN) with one neighbour, linear SVM, RBF SVM, and Random Forest; linear SVM gave higher accuracy than the other three. This section shows that sign language translators can be developed with various algorithms.


3. Experiment Settings and Results

The design phase of the Waterfall Model is crucial in the System Development Life Cycle (SDLC). Alshamrani et al. (2015) stated that the Waterfall model is easy to understand and involves a sequential implementation process: each stage requires the previous stage to be completed before the next phase in the chain continues. The requirements phase, the first stage in the Waterfall methodology, requires research and data gathering before development can take place. System design then determines the hardware and system requirements of the developers, as well as of the user, that contribute to the system architecture. The developer's software and hardware requirements are listed in Table 2.

Table 2: Developer's Hardware and Software Requirements

Hardware:
- Computer with input-output devices (with HD camera)
- Windows 8 & above
- Minimum 4 GB RAM
- Intel Core i7 processor
- NVIDIA GPU card with CUDA architectures 3.5, 3.7, 5.2, 6.0, 6.1, 7.0 or higher
- 1 GB of memory storage
- Mouse & keyboard

Software:
- Jupyter Notebook (platform for system development)
- Python (user interface for frontend; system development for backend)

3.1 Dataset

To evaluate and test the sign language translator, a dataset must be available. The dataset consists of the alphabets and numerals of sign language. The secondary data used in this project come from Kaggle's American Sign Language dataset (Ayush Takur, 2019). Since Bahasa Isyarat Malaysia (BIM) was adapted from American Sign Language (ASL), the ASL dataset is used in this project to train the CNN model.

The data provided from Kaggle consist of 25,291 alphabet and numeric images combined; each alphabet and numeral has approximately 700 images.
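As an illustration, the sketch below shows one plausible way to load such an alphabet-and-numeral image folder with Keras; the directory path, image size, and the 80/20 split are assumptions for illustration, not details given in the paper.

```python
# A minimal sketch of loading the ASL alphabet/numeral images with Keras.
# The folder layout (one subfolder per class), image size, and split ratio
# are assumptions; the paper does not specify them.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

IMG_SIZE = (64, 64)        # assumed input resolution
DATA_DIR = "asl_dataset/"  # hypothetical path: one subfolder per class (A-Z, 0-9)

datagen = ImageDataGenerator(rescale=1.0 / 255, validation_split=0.2)

train_gen = datagen.flow_from_directory(
    DATA_DIR, target_size=IMG_SIZE, color_mode="grayscale",
    class_mode="categorical", subset="training")

val_gen = datagen.flow_from_directory(
    DATA_DIR, target_size=IMG_SIZE, color_mode="grayscale",
    class_mode="categorical", subset="validation")
```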

3.2 Proposed System

The system architecture is a structure of the software system presented as a diagram of components, layers, and interactions. Figure 6 presents the proposed system architecture of the Bahasa Melayu Sign Language Translator using the CNN model, illustrating the system's components, layers, and interactions.


The images undergo pre-processing to remove unwanted data such as the background and other objects that might interfere with the classifier's calculation. During image segmentation, a graying technique is applied: the images are split into segments and then converted from RGB to grayscale, which lets the classifier forgo taking colour into account while analysing. Thresholding, erosion, and dilation are also executed at this stage: thresholding isolates the skin-coloured object from the background, erosion reduces noise and cleans the images, and dilation adds pixels to the image boundary, enhancing image clarity.
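The OpenCV sketch below illustrates one way such a grayscale/threshold/erosion/dilation pipeline could look; the kernel size and threshold method are assumptions for illustration rather than the paper's exact settings.

```python
# Illustrative pre-processing pipeline (grayscale, threshold, erosion, dilation).
# Kernel size and threshold choice are assumed; the paper does not state them.
import cv2
import numpy as np

def preprocess(image_bgr):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)   # RGB/BGR -> grayscale
    # Otsu's threshold to separate the hand from the background (assumed choice).
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    kernel = np.ones((3, 3), np.uint8)                   # assumed 3x3 kernel
    eroded = cv2.erode(binary, kernel, iterations=1)     # erosion reduces noise
    dilated = cv2.dilate(eroded, kernel, iterations=1)   # dilation restores boundary pixels
    return dilated
```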

The pre-processed images are fed into the deep CNN model, which consists of four layer types: the convolutional layer, the pooling layer, the non-linearity layer, and the fully connected layer.

The CNN model retrieves useful information from the obtained and augmented data to begin feature extraction: it removes duplicate information from the region of interest and collects features for the classification process. Classification follows by feeding the extracted significant features as inputs to train the CNN model, mapping the significant image features to the information provided by a sign language translator in order to produce the correct text output. The model is then evaluated.
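A minimal Keras sketch of such a four-layer-type CNN is shown below, assuming 64x64 grayscale inputs and 36 output classes (26 alphabets plus 10 numerals); the filter counts and layer depths are illustrative assumptions, not the paper's published architecture.

```python
# Illustrative CNN with the four layer types named above:
# convolution, non-linearity (ReLU), pooling, and fully connected layers.
# Filter counts and depths are assumptions; only the layer types follow the paper.
from tensorflow.keras import layers, models

NUM_CLASSES = 36  # 26 alphabets + 10 numerals

model = models.Sequential([
    layers.Input(shape=(64, 64, 1)),                    # assumed grayscale input size
    layers.Conv2D(32, (3, 3), activation="relu"),       # convolution + non-linearity
    layers.MaxPooling2D((2, 2)),                        # pooling
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),               # fully connected layer
    layers.Dense(NUM_CLASSES, activation="softmax"),    # one score per sign class
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```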

Figure 6: The Proposed System Architecture of the Bahasa Melayu Sign Language Translator


3.3 Design Phase

A simple user interface was designed to ensure that users can easily use and understand the system. The Figma user interface design tool was employed to design the interface, and the programming languages used to build it are JavaScript, HTML, and CSS. There are three user modules in this system. Figure 7 below shows the main page of the interface, which has three functional buttons that let the user navigate to real-time prediction, prediction using images, and information about the system.

Figure 7: Main Page of Prototype Interface


Figure 8 below shows the interface for real-time prediction using the camera; the user needs to place a hand inside the blue rectangle box so that the hand gesture shown inside the box can be translated into the corresponding alphabet, as sketched in the code after the figure.

Figure 8: Prediction Using Real-Time
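To make the real-time flow concrete, the sketch below captures webcam frames, crops a fixed rectangle as the region of interest, and classifies it with a trained CNN; the model path, box coordinates, input size, and label order are all assumptions for illustration.

```python
# Illustrative real-time loop: crop a fixed rectangle (the "blue box"),
# pre-process it, and classify it with the trained CNN.
# Model path, ROI coordinates, and label order are assumed for illustration.
import cv2
import numpy as np
from tensorflow.keras.models import load_model

LABELS = [str(d) for d in range(10)] + [chr(c) for c in range(ord("A"), ord("Z") + 1)]
model = load_model("bim_cnn.h5")                        # hypothetical saved model

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    x, y, w, h = 100, 100, 200, 200                     # assumed ROI position/size
    cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 0), 2)  # blue box (BGR)
    roi = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2GRAY)
    roi = cv2.resize(roi, (64, 64)).astype("float32") / 255.0
    probs = model.predict(roi.reshape(1, 64, 64, 1), verbose=0)
    cv2.putText(frame, LABELS[int(np.argmax(probs))], (x, y - 10),
                cv2.FONT_HERSHEY_SIMPLEX, 1.0, (255, 0, 0), 2)
    cv2.imshow("BIM Translator", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):               # press q to quit
        break
cap.release()
cv2.destroyAllWindows()
```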

Figure 9 below shows prediction using an imported image. The interface consists of three functional buttons: the "Muat naik gambar" button lets the user import an image from the computer's files, the "Terjemah" button translates the imported image into text form, and the "Kembali" button navigates the user back to the main page of the interface.

Figure 9: Prediction Using Image


4. Results and Discussion

In the evaluation process, the developed model is assessed using evaluation methods to find the model that works best for the system. The evaluation in this project uses a confusion matrix and precision-recall. Training was configured for 50 epochs but stopped at epoch 15. The accuracy gained for training is about 98%, while the validation accuracy gained is 94%. Figure 10 shows the table of model metrics obtained from training.

The per-epoch training and validation accuracy were as follows:

Epoch   Training accuracy   Validation accuracy
1       21%                 56%
2       67%                 78%
3       81%                 84%
4       88%                 89%
5       91%                 87%
6       93%                 92%
7       95%                 92%
8       96%                 92%
9       96%                 93%
10      97%                 94%
11      97%                 94%
12      98%                 94%
13      99%                 96%
14      99%                 93%
15      98%                 95%
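Since training was set for 50 epochs but halted at epoch 15, an early-stopping callback is the natural mechanism; the sketch below shows how that could be wired up in Keras, with the monitored metric and patience value as assumptions.

```python
# Illustrative training call: up to 50 epochs, halted early when the
# validation metric stops improving. Monitored metric and patience are assumed.
from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor="val_accuracy", patience=3,
                           restore_best_weights=True)

history = model.fit(train_gen, validation_data=val_gen,
                    epochs=50, callbacks=[early_stop])
```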

Figure 10: Model Metrics


Figure 11 below shows the graphs of model accuracy and loss during training; the training accuracy increases along with the validation accuracy, rising steadily from epoch 1 to epoch 15.

Figure 11: (a) Model Accuracy Graph, (b) Model Loss Graph



Figure 12: Precision-Recall Report

Evaluation of this project uses precision-recall and a confusion matrix to find the accuracy of the model. The evaluation uses the testing data: in this project there is a testing set of 503 images spread over 36 classes, with about 14 images per class. Figure 12 shows the precision-recall report for this project. The accuracy gained from the evaluation is 95%.
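The kind of report shown in Figure 12 can be produced with scikit-learn as sketched below; the `x_test`/`y_test` arrays are assumed to hold the 503 test images and their one-hot labels.

```python
# Illustrative evaluation: per-class precision/recall report and confusion matrix.
# x_test/y_test are assumed to hold the 503 test images and one-hot labels.
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

y_pred = np.argmax(model.predict(x_test), axis=1)   # predicted class indices
y_true = np.argmax(y_test, axis=1)                  # true class indices

print(classification_report(y_true, y_pred))        # precision, recall, F1 per class
print(confusion_matrix(y_true, y_pred))             # 36x36 confusion matrix
```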

5. Conclusion

This project manages to design and develop a simple Bahasa Melayu Sign Language Translator. It aims to break the wall between the deaf-mute community and the hearing community. The CNN algorithm helps recognize sign language gestures, classifying images of signs according to their alphabets and numerals. For future work, more inclusive testing will be conducted to assess the accuracy of the CNN model in recognizing sign language, using a more comprehensive range of real users as testers, with both static and dynamic hand gestures against different backgrounds.


References

Abou Haidar, G., Achkar, R., Salhab, D., Sayah, A., & Jobran, F. (2019, August). Sign language translator using the back propagation algorithm of an MLP. IEEE Xplore. https://doi.org/10.1109/FiCloudW.2019.00019

Alshamrani, A., Bahattab, A. A., & Fulton, I. A. (2015). A comparison between three SDLC models: Waterfall model, spiral model, and incremental/iterative model. www.IJCSI.org

American Sign Language and British Sign Language Differences. (2021, October 13). Akorbi. https://akorbi.com/american-sign-language-and-british-sign-language-how-are-they-different/

Awaludin, F. (2021, March 17). Signing the deaf and mute away from the margins. MalaysiaNow. https://www.malaysianow.com/news/2021/03/17/signing-the-deaf-and-mute-away-from-the-margins/

Jayadeep, G., Vishnupriya, N. V., Venugopal, V., Vishnu, S., & Geetha, M. (2020). Mudra: Convolutional neural network based Indian sign language translator for banks. 2020 4th International Conference on Intelligent Computing and Control Systems (ICICCS). https://doi.org/10.1109/iciccs48265.2020.9121144

Khan, S. A., Joy, A. D., Asaduzzaman, S. M., & Hossain, M. (2019, April). An efficient sign language translator device using convolutional neural network and customized ROI segmentation. IEEE Xplore. https://doi.org/10.1109/ICCET.2019.8726895

Malo, D. C., Rahman, Md. M., Mahbub, J., & Khan, M. M. (2022, January). Skin cancer detection using convolutional neural network. IEEE Xplore. https://doi.org/10.1109/CCWC54503.2022.9720751

Memon, Z. A., Ahmed, M. U., Hussain, S. T., Baig, Z. A., & Aziz, U. (2017). Real time translator for sign languages. 2017 International Conference on Frontiers of Information Technology (FIT). https://doi.org/10.1109/fit.2017.00033

Mohd Jalani, N. N., Zamzuri, Z. F., Mohd Nor, A. H., Md Badarudin, I., Jono, N. H. H., Ismail, S., Maulan, S., Md Yusuf, A. H. S., & Mahpoth, H. (2021). iMalaySign: Malaysian sign language recognition mobile application using Convolutional Neural Network (CNN). Akademi Pengajian Bahasa. https://ir.uitm.edu.my/id/eprint/46006/

Ojha, A., Pandey, A., Maurya, S., Thakur, A., & Dayananda, P. (2020). Sign language to text and speech translation in real time using convolutional neural network. https://www.semanticscholar.org/paper/Sign-Language-to-Text-and-Speech-Translation-in-Ojha-Pandey/e79dd74be9b2eead0c44f5cd367560e71fdf531f

P, E., & Al, E. (2021). Speech to sign language translator for hearing impaired. Turkish Journal of Computer and Mathematics Education (TURCOMAT), 12(10), 1913-1919. https://doi.org/10.17762/turcomat.v12i10.4679

Singh, D. K. (2021). 3D-CNN based dynamic gesture recognition for Indian sign language modeling. Procedia Computer Science, 189, 76-83. https://doi.org/10.1016/j.procs.2021.05.071
