Journal of Information Technology and Computer Science Volume 7, Number 1, April 2022, pp. 81-93

Journal Homepage: www.jitecs.ub.ac.id

The Influence of Word Vectorization for Kawi Language to Indonesian Language Neural Machine Translation

I Gede Bintang Arya Budaya*1, Made Windu Antara Kesiman2, I Made Gede Sunarya3,

1Institute of Technology and Business STIKOM Bali, Denpasar

2, 3 Ganesha University of Education Bali, Singaraja

bintang@stikom-bali.ac.id1, antara.kesiman@undiksha.ac.id2, sunarya@undiksha.ac.id3

*Corresponding Author

Received 25 February 2022; accepted 30 April 2022

Abstract. People often use machine translation to learn textual knowledge beyond their native language. Robust machine translation systems such as Google Translate already exist, but their language lists cover only high-resource languages such as English and French, and not the Kawi language, one of the local languages used in Bali's old works of literature. It is therefore necessary to study the development of machine translation from the Kawi language to a more actively used language such as Indonesian, to provide easier learning access for young learners. This research developed neural machine translation (NMT) using recurrent neural network (RNN) based models and analyzed the influence of word vectorization using Word2Vec on machine translation performance, measured by BLEU scores. The results show that word vectorization significantly increases the NMT models' performance, and Long Short-Term Memory (LSTM) with the Attention mechanism achieves the highest BLEU score, 20.86. The NMT models still cannot achieve BLEU scores on par with human experts or high-resource-language machine translation. Nevertheless, this initial study can serve as a reference for the future development of Kawi-to-Indonesian NMT.

Keywords: RNN, word embedding, word2vec, attention, parallel corpus, BLEU scores

1. Introduction

People tend to use machine translation to learn textual knowledge beyond their native language. Research [1], [2] shows that the use of machine translation can increase learning efficiency. There are already robust machine translation systems such as Google Translate, but their language lists cover only high-resource languages such as English, French, etc. The Kawi language is used in many old literary works such as ancient palm-leaf manuscripts [3], [4]. The manuscripts contain knowledge of culture, religion, and norms, and they are still studied today with the support of communities, academicians, and the governments [5]. The development of machine translation from the Kawi language to a more actively used language such as Indonesian can open easier learning access for young learners.

Research on machine translation has developed along three lines: rule-based machine translation, statistical machine translation [6], [7], and neural machine translation [8], [9]. The research of [10] compared the performance of statistical machine translation and neural machine translation, and found that neural machine translation, especially using a recurrent neural network (RNN), had better performance and accuracy. In the research of [11], RNN-based machine translation was used for low-resource languages, and the results were again better than statistical machine translation. Based on these studies, RNN-based machine translation has the best performance; since Kawi is a low-resource language, using an RNN is a suitable solution.

There are still few studies on neural machine translation from low-resource local languages, especially from the Kawi language to Indonesian. The purpose of this research is to design RNN-based machine translation models, namely Simple RNN, Bidirectional RNN, GRU, Bidirectional GRU, LSTM, Bidirectional LSTM, LSTM with Attention mechanism, and Bidirectional LSTM with Attention mechanism, using a Kawi-Indonesian parallel corpus. The research also investigates the influence of word vectorization on the models' performance, which is compared based on BLEU scores.

2. Related Works

Sanskrit, Kawi (Old Javanese), and Balinese are the legacy of the written texts and ritual practices of Hinduism in Bali, which historically have been very important in religious practice and teaching, especially the Kawi language. The early use of the Kawi language in Bali can be traced through various inscriptions found on the island. Kawi replaced the Old Balinese language, which had been used in inscriptions from 882 AD until the reign of King Anak Wungsu in 1072 AD [12]; after this change, inscriptions in Bali were customarily written in Kawi. Kusuma Dewa and Sangkuputih are guides for stakeholders in Bali written in the Kawi language with various Sanskrit characters [13]. These two guidelines clearly lay out the composition of puja (prayer) and became the most important worship guidelines among the stakeholders. These inherited guidelines are grounded in the interest of worship in Parahyangan (temples) and developed in Old Javanese and Nusantara traditions.

Neural Machine Translation (NMT) has a long history, as in the research conducted by Allen [14], [15]. However, unsatisfactory early results, partly due to hardware computing limitations, caused research and development on NMT to be neglected for several years. Since the advent of deep learning in 2010, supported by better computing devices, more and more natural language processing (NLP) tasks have improved in quality. This has led the application of deep neural networks to machine translation to receive greater attention.


Table 1. Related Works

1. RNN based machine translation and transliteration for Twitter data [10]: Creates a machine translation and transliteration model for Twitter data from Hindi to English, aimed at social security by identifying offensive content so that it can be understood by speakers of other languages; found that the LSTM model performed better than statistical machine translation.

2. Revisiting low-resource neural machine translation: A case study [11]: Measures the quality of machine translation for low-resource data with neural machine translation and statistical machine translation, and found that the neural machine translation model has better performance than statistical machine translation.

3. Deep Learning for Load Forecasting: Sequence to Sequence Recurrent Neural Networks with Attention [16]: Proposes a Sequence to Sequence Recurrent Neural Network (S2S RNN) focused on forecasting electricity load. The S2S architecture from language translation is modified for load forecasting, and a corresponding sample generation strategy is developed. The S2S model merges two RNNs, the encoder and the decoder, which allows the RNN to capture temporal dependencies in the load data; the attention mechanism lessens the burden of tying the encoder and decoder together. The study examined the attention mechanism using several RNN cells (vanilla, LSTM, and GRU) and time horizons. According to the results, S2S with Bahdanau's attention performs better than the other models.

4. Machine Translation System Using Deep Learning for English to Urdu [17]: Proposes a deep learning encoder-decoder model for translating from English to Urdu, applying the Bahdanau attention mechanism. A parallel English-Urdu corpus of 1083734 tokens has been used; of these, 542810 were tokens in English and 123636 were in Urdu. Several automatic evaluation criteria, such as BLEU, are used to assess the effectiveness of the proposed solution.

5. Word Embedding Generation for Urdu Language using Word2vec model [18]: Uses the Word2vec model to create Urdu word embeddings. The model assists in creating dense word vector representations of Urdu words that can be applied to already-learned word vectors. The outcomes demonstrated that the suggested method can be used to enhance traditional word embedding techniques.

As shown in Table 1, several studies have been carried out on the development of machine translation. Research [10] shows that the performance of LSTM-based neural machine translation is better than statistical machine translation. Research [11] uses several neural models for machine translation but does not mention the use of a word vectorization technique, whereas research [18] shows that word embedding, especially word2vec, can be an option for creating feature representations in natural language processing and can increase performance.

Research [16] is not about machine translation, but addresses the same kind of problem, namely a sequential problem, and the Bahdanau Attention mechanism implemented in its neural models did improve the prediction results. This was also the case for research [17] about machine translation. However, that research has yet to benchmark its translation results against other neural network models.

3. Research Method

This section discusses the method used to create a ground-truth dataset for translating Balinese palm-leaf manuscript transliteration texts into Indonesian and the methodology for modelling a basic NMT.

3.1. Dataset Collection and Exploration

The first process is dataset collection and exploration. Based on the survey and information collected, some experts and academicians have already published books/documents containing transliteration and translation texts from ancient palm-leaf manuscripts, especially about Parwa (a manuscript genre of heroic and epic stories [19]). However, the format of these documents is not directly compatible with the needs of training a deep learning model, so another formatting step is needed to build the dataset for NMT. Table 2 shows the list of corpus sources for this study.

Table 2. Corpus sources

No | Corpus Title | Experts/Academicians
1 | Musala Parwa | Dr. Anak Agung Gde Alit Geria, M.Si.
2 | Phalawkya Kapi Parwa | Drs. Komang Paramartha, M.S.
3 | Singhalangghyala Parwa | Anak Agung Gde Alit Geria & I Gde Agus Darma Putra

Because of this format mismatch, the dataset must contain sentences in the source language (the Balinese/Kawi language) and their corresponding target sentences (Indonesian). The standard translation model used is a sequence-to-sequence model [20], where sentences are sequences of words [21]. A sample page from the corpus Musala Parwa [22] illustrates the source format: palm-leaf manuscript images correspond to their transliteration into Latin Kawi texts and their translation from Kawi to Indonesian.

The Excel document columns consist of Id, target language, source language, corpus title, and expert name. After the input process from the source corpora, the collected dataset consists of 1086 Kawi-Indonesian sentence pairs. Table 3 shows the dataset insight after the input process, and Table 4 shows the first three rows of the final dataset.

Table 3. Dataset insight

No | Corpus Title | Dataset Sentences
1 | Musala Parwa | 465
2 | Phalawkya Kapi Parwa | 555
3 | Singhalangghyala Parwa | 66
TOTAL | | 1086

Table 4. Sample of the final dataset (first three rows)

Id | Target Language | Source Language | Corpus Title | Experts
1 | Oh Hyang Widhi semoga tidak terhalang. | Om Awighnamàstu. | Musala Parwa | Dr. Anak Agung Gde Alit Geria, M.Si.
2 | Demikian kata Bhagawàn Wèsampayana, | Samangkana pawarah Bhagawàn Wèsampayana, | Musala Parwa | Dr. Anak Agung Gde Alit Geria, M.Si.
3 | lalu bertanya lagi Mahàràja Janamejaya: | matakwan ta Mahàràja Janamejaya muwah. | Musala Parwa | Dr. Anak Agung Gde Alit Geria, M.Si.

3.2. Preprocessing Dataset

The first step is to prepare and clean the dataset, which helps reduce dataset noise, following the standard process [21], [23] adapted to this study's dataset. The tasks are as follows: convert all sentences to lower case, remove special characters, remove punctuation, and remove extra spaces. Table 5 shows samples of the preprocessed dataset.

Table 5. Preprocessed dataset

Id | Target Language | Source Language
1 | oh hyang widhi semoga tidak terhalang | om awighnamàstu
2 | demikian kata bhagawàn wèsampayana | samangkana pawarah bhagawàn wèsampayana
3 | lalu bertanya lagi mahàràja janamejaya | matakwan ta mahàràja janamejaya muwah
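To make the cleaning steps above concrete, the following is a minimal Python sketch; the function name and the regular expression that keeps accented Latin characters (the corpus uses à, è, etc.) are our own assumptions, not the authors' code:

```python
import re

def preprocess(sentence: str) -> str:
    """Lower-case a sentence, drop punctuation/special characters, collapse spaces."""
    s = sentence.lower()
    # Keep plain and accented Latin letters plus spaces; everything else becomes a space.
    s = re.sub(r"[^a-zà-ÿ\s]", " ", s)
    # Remove the extra spaces introduced above.
    return re.sub(r"\s+", " ", s).strip()

print(preprocess("Om Awighnamàstu."))  # -> "om awighnamàstu"
```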

3.3. Tokenization

The second step is feature extraction, which includes tokenization. The task is to split the sentences into lists of words and index them to build a vocabulary. The words in each sentence are then counted to find the sentence length and the overall vocabulary size. The next task is to add padding so that every sentence has the same length (word count); the padding length uses the maximum sentence length in the target language and in the source language, respectively. Table 6 shows samples of the sentence-splitting process and the length of each sentence based on word count.

Table 6. Samples of tokenized sentences

Id | Target Language | Length | Source Language | Length
1 | oh hyang widhi semoga tidak terhalang | 6 | om awighnamàstu | 2
2 | demikian kata bhagawàn wèsampayana | 4 | samangkana pawarah bhagawàn wèsampayana | 4
3 | lalu bertanya lagi mahàràja janamejaya | 5 | matakwan ta mahàràja janamejaya muwah | 5
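A minimal tokenization-and-padding sketch using the Keras preprocessing utilities follows; the variable names kawi_sentences and indonesian_sentences are assumed to hold the preprocessed sentence lists and are not from the paper:

```python
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

def tokenize(sentences):
    """Index the vocabulary and encode each sentence as a list of word ids."""
    tok = Tokenizer()
    tok.fit_on_texts(sentences)
    return tok, tok.texts_to_sequences(sentences)

src_tok, src_seqs = tokenize(kawi_sentences)        # source: Kawi
tgt_tok, tgt_seqs = tokenize(indonesian_sentences)  # target: Indonesian

# Pad every sentence to the maximum length in its language (zeros appended at the end).
src_pad = pad_sequences(src_seqs, padding="post")   # max length 15 per Table 7
tgt_pad = pad_sequences(tgt_seqs, padding="post")   # max length 21 per Table 7

vocab_src = len(src_tok.word_index) + 1  # +1 reserves index 0 for padding
vocab_tgt = len(tgt_tok.word_index) + 1
```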

The feature for the deep learning model input is the index number of each word in the vocabulary. Table 7 shows the vocabulary size for the target (Indonesian) and source (Kawi) languages. The dataset is divided into two groups, for training and testing, with a ratio of 70:30 [24].

Table 7. Dataset vocabulary

Language | Vocab_Size | Max_Length
Target Language (Indonesian) | 1577 | 21
Source Language (Kawi) | 1874 | 15
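The 70:30 split can be reproduced with scikit-learn; a short sketch, where the random_state is an arbitrary choice of ours:

```python
from sklearn.model_selection import train_test_split

# 70% of the padded sentence pairs for training, 30% for testing, as in [24].
src_train, src_test, tgt_train, tgt_test = train_test_split(
    src_pad, tgt_pad, test_size=0.3, random_state=42)
```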

3.4. Word Vectorization

Word vectorization with Word2Vec can be applied in two ways: using Continuous Bag of Words (CBOW) or using Skip-Gram. In this study, CBOW is used to generate the word embeddings [25].
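A minimal sketch of training CBOW embeddings with gensim (4.x API) is shown below; the vector size and window are illustrative assumptions, since the paper does not report its Word2Vec hyperparameters:

```python
from gensim.models import Word2Vec

# sg=0 selects CBOW; sg=1 would select Skip-Gram instead.
tokenized_kawi = [s.split() for s in kawi_sentences]
w2v = Word2Vec(tokenized_kawi, vector_size=256, window=5, min_count=1, sg=0)

vec = w2v.wv["parwa"]  # dense vector for a (hypothetical) vocabulary word
```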

3.5. Neural Machine Translation Model

The models proposed for this preliminary study use Simple RNN, LSTM, and GRU. All of the models use the encoder-decoder architecture, with bidirectional variants as additional scenarios. Lastly, the evaluation compares each model's performance based on the BLEU score.
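As an illustration of the encoder-decoder wiring shared by all the models, here is a hedged Keras sketch of the plain LSTM variant with 256 units (the embedding dimension and optimizer are our assumptions; vocab_src and vocab_tgt come from the tokenization sketch above):

```python
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense
from tensorflow.keras.models import Model

units, emb_dim = 256, 256  # 256 units as in Tables 8-9; emb_dim is assumed

# Encoder: embed the Kawi sequence, keep only the final hidden/cell states.
enc_in = Input(shape=(None,))
enc_emb = Embedding(vocab_src, emb_dim, mask_zero=True)(enc_in)
_, state_h, state_c = LSTM(units, return_state=True)(enc_emb)

# Decoder: generate Indonesian tokens conditioned on the encoder states.
dec_in = Input(shape=(None,))
dec_emb = Embedding(vocab_tgt, emb_dim, mask_zero=True)(dec_in)
dec_seq = LSTM(units, return_sequences=True)(dec_emb, initial_state=[state_h, state_c])
probs = Dense(vocab_tgt, activation="softmax")(dec_seq)

model = Model([enc_in, dec_in], probs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```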

3.5.1.1. Recurrent Neural Network (RNN)

An RNN is a neural network approach for dealing with sequential data [26]. The same task is repeated for each element of a sequence, and the output at the current state depends on the previous computations; the RNN keeps what it has computed so far in memory. The vanilla or Simple RNN is the most basic RNN model.

3.5.1.2. Long Short-Term Memory (LSTM)

LSTM is commonly used to handle sequence data such as text, and it has demonstrated strong performance in text classification tasks [27]. In an LSTM cell state, information is erased or added under the control of three gates: forget, input, and output [28].

3.5.1.3. Gated Recurrent Unit (GRU)

The Gated Recurrent Unit (GRU) is one of the RNN variants aimed at solving the vanishing gradient problem of a standard RNN. It reduces the LSTM gating to two gates: update and reset. According to the study [29], the GRU outperforms the LSTM for long text and short datasets.
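Under this framing, the model scenarios differ mainly in which recurrent cell fills the encoder and decoder slots of the sketch above; a brief illustration of the interchangeable pieces:

```python
from tensorflow.keras.layers import SimpleRNN, GRU, LSTM, Bidirectional

# Any of these can stand in for the LSTM layers in the encoder-decoder sketch;
# the bidirectional variants additionally require merging forward/backward states.
rnn_cell = SimpleRNN(256, return_sequences=True)
gru_cell = GRU(256, return_sequences=True)
bi_lstm = Bidirectional(LSTM(256, return_sequences=True))
```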

3.5.1.4. Attention Mechanism

Bahdanau [9] applied the Attention mechanism to the alignment of input word sequences, which is an obstacle in the translation of long sentences. Instead of compressing the input into a single fixed-size vector, the Attention mechanism lets the NMT model automatically search for the parts of the source sentence that are relevant when predicting each word in the target language.
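A hedged sketch of adding Bahdanau-style (additive) attention on top of the encoder-decoder sketch above, using Keras' AdditiveAttention layer; the paper does not specify its exact attention implementation, so this is one plausible realization that reuses enc_emb, dec_emb, and vocab_tgt from the earlier sketch:

```python
from tensorflow.keras.layers import LSTM, Dense, AdditiveAttention, Concatenate

# The encoder now returns the full state sequence so the decoder can attend over it.
enc_seq, state_h, state_c = LSTM(256, return_sequences=True, return_state=True)(enc_emb)
dec_seq = LSTM(256, return_sequences=True)(dec_emb, initial_state=[state_h, state_c])

# Each decoder step queries all encoder states instead of one fixed context vector.
context = AdditiveAttention()([dec_seq, enc_seq])  # [query, value]
probs = Dense(vocab_tgt, activation="softmax")(Concatenate()([dec_seq, context]))
```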

3.6. Bilingual Evaluation Understudy (BLEU) Scores

BLEU (bilingual evaluation understudy) is an algorithm for assessing the quality of machine-translated text from one natural language to another [30]. Quality is defined as the correspondence between a machine's output and that of a human. BLEU values range from 0 to 1 [30] and are frequently rescaled to a 0-100 range [31]. Machine translation models are evaluated using BLEU in studies [11], [20].
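For reference, BLEU can be computed with NLTK; a minimal sketch, where the smoothing choice and the toy sentence pair are ours (short sentences otherwise yield zero higher-order n-gram counts):

```python
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

references = [[["demikian", "kata", "bhagawàn", "wèsampayana"]]]  # reference set per sentence
hypotheses = [["demikian", "ucap", "bhagawàn", "wèsampayana"]]    # model output tokens

bleu = corpus_bleu(references, hypotheses,
                   smoothing_function=SmoothingFunction().method1)
print(round(bleu * 100, 2))  # rescaled to the 0-100 range reported in the paper
```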


4. Results and Discussion

Fig. 1 shows the trend of the training loss and validation loss for each NMT model. Based on experiments and observations during training, after the 12th epoch the training and validation losses fluctuate, with an overall upward trend in validation loss. The training loss of each model tends to keep decreasing, while the validation loss tends to increase after the 12th epoch, indicating that the NMT models still overfit.

Fig. 1. Training and validation loss graph.
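The loss curves in Fig. 1 come from standard supervised training; a hedged sketch of one way to obtain them (the teacher-forcing shift, validation split, and epoch count are our assumptions, as the paper only reports observations up to the 12th epoch):

```python
# Teacher forcing: the decoder input is the target sequence shifted right by one token.
history = model.fit(
    [src_train, tgt_train[:, :-1]], tgt_train[:, 1:],
    validation_split=0.1, batch_size=128, epochs=30)

# history.history["loss"] and history.history["val_loss"] give the two curves of
# Fig. 1; val_loss rising after ~epoch 12 while loss keeps falling indicates overfitting.
```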

There are two scenarios in the NMT model training. In the first scenario, the NMT model does not use the Word2Vec embedding layer; in the second scenario, it does. The purpose of this experimental setting is to determine the influence of word vectors at the input of the NMT. Table 8 shows the BLEU scores for the first scenario and Table 9 shows the BLEU scores for the second scenario.

Table 8. Scenario 1 BLEU scores

NMT Model | Number of Units | Batch Size | BLEU Scores
Simple RNN | 256 | 64 | 3.29
Simple RNN | 256 | 128 | 16.83
BiRNN | 256 | 64 | 1.98
BiRNN | 256 | 128 | 17.16
LSTM | 256 | 64 | 2.66
LSTM | 256 | 128 | 1.7
BiLSTM | 256 | 64 | 10.62
BiLSTM | 256 | 128 | 16.6
GRU | 256 | 64 | 3.99
GRU | 256 | 128 | 3.99
BiGRU | 256 | 64 | 0.83
BiGRU | 256 | 128 | 2.22

In the first scenario, the Simple RNN with batch size 64 obtains a BLEU score of 3.29, and with batch size 128 a score of 16.83. The Bidirectional RNN with batch size 64 obtains 1.98, and with batch size 128 an average of 17.16. The LSTM with batch size 64 obtains 2.66, and with batch size 128 only 1.7. The Bidirectional LSTM with batch size 64 obtains 10.62, and with batch size 128 it obtains 16.6.

The GRU obtains 3.99 with batch size 64 and the same 3.99 with batch size 128. The Bidirectional GRU with batch size 64 obtains 0.83, and with batch size 128 it obtains 2.22. The BLEU scores of the first scenario show that the Bidirectional RNN produces the highest value and the Bidirectional GRU the lowest. The batch size parameter affects the BLEU scores: using a batch size of 128 increases the average BLEU score significantly, except for the GRU model.

Table 9. Scenario 2 BLEU scores

NMT Model | Number of Units | Batch Size | BLEU Scores
RNN | 256 | 128 | 20.43
BiRNN | 256 | 128 | 17.6
GRU | 256 | 128 | 19.53
BiGRU | 256 | 128 | 19.98
LSTM | 256 | 128 | 19.91
BiLSTM | 256 | 128 | 20.48
LSTM + Attention | 256 | 128 | 20.86
BiLSTM + Attention | 256 | 128 | 20.41

The second scenario implements Word2Vec. Since the first scenario showed the advantage of batch size 128, the second scenario uses a batch size of 128 directly. Table 9 shows the BLEU scores for scenario 2. The first NMT model, Simple RNN, has a BLEU score of 20.43, compared to 16.83 in the first scenario, an increase of about 5 points.

The second NMT model, Bidirectional Simple RNN, has a BLEU score of 17.6, compared to 17.16 in the first scenario, an increase of about 0.4 points.

The third NMT model, GRU, has a BLEU score of 19.53, compared to only 3.99 in the first scenario, a very significant increase of around 15.54 points. The fourth NMT model, Bidirectional GRU, has a BLEU score of 19.98, compared to only 3.99 in the first scenario, a significant increase of around 15.99 points. The fifth NMT model, LSTM, has a BLEU score of 19.91, compared to only 1.7 in the first scenario, a very significant increase of around 18.21 points. The sixth NMT model, Bidirectional LSTM, has a BLEU score of 20.48, compared to 16.6 in the first scenario, an increase of around 3.88 points.

The LSTM and Bidirectional LSTM models are also implemented with the Attention mechanism. The seventh NMT model, LSTM + Attention, has a BLEU score of 20.86, and the eighth, BiLSTM + Attention, a score of 20.41. The BLEU scores show that the second scenario, with the implementation of word vectors, significantly improves the quality of the NMT.

In the second scenario, LSTM with the Attention mechanism produces the highest BLEU score and Bidirectional RNN the lowest; the gap between them is about 3.26 points.
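The mechanism behind scenario 2 is to seed the model's embedding layer with the pretrained CBOW vectors; a hedged sketch (whether the layer remains trainable is not stated in the paper, so trainable=True is our assumption):

```python
import numpy as np
from tensorflow.keras.layers import Embedding
from tensorflow.keras.initializers import Constant

# Copy each Word2Vec vector into the row matching its Keras word index.
emb_matrix = np.zeros((vocab_src, w2v.vector_size))
for word, idx in src_tok.word_index.items():
    if word in w2v.wv:
        emb_matrix[idx] = w2v.wv[word]

enc_embedding = Embedding(vocab_src, w2v.vector_size,
                          embeddings_initializer=Constant(emb_matrix),
                          mask_zero=True, trainable=True)
```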

5. Conclusion

Word vectorization is the NLP technique of converting words or phrases from a lexicon into corresponding vectors of real numbers, which are then used for word prediction and word semantics. In this research, Kawi-language word embeddings were built from the dataset using Word2Vec. The test scenario uses eight neural machine translation models, with and without Word2Vec. Based on the test results, applying word vectorization with Word2Vec as an embedding layer significantly improves the BLEU scores. Among the eight machine translation models tested, this research found that the LSTM model with the Attention mechanism has the best performance, producing the highest average BLEU score of 20.86; the Attention mechanism increases the BLEU score of the LSTM model by 0.95, from 19.91 to 20.86. However, based on the BLEU scores, the translation results cannot yet match the average translation quality of human experts or of systems such as Google Translate [31], [32]. In the future, the Kawi-Indonesian parallel corpus should be enlarged, and the influence of word vectorization should be reanalyzed, either using the Word2Vec model with different hyperparameter tuning or using other word vectorization techniques such as GloVe and fastText. The implementation of the Transformer neural model can also be considered as a benchmark for machine translation performance, especially for Kawi-to-Indonesian neural machine translation.

References

1. T. F. Kai and T. K. Hua. 2021. Enhancing English language vocabulary learning among indigenous learners through Google Translate. Journal of Education and e-Learning Research, vol. 8, no. 2, pp. 143–148, doi: 10.20448/JOURNAL.509.2021.82.143.148.
2. H. Bahri and T. S. T. Mahadi. 2016. Google Translate as a supplementary tool for learning Malay: A case study at Universiti Sains Malaysia. Advances in Language and Literary Studies, vol. 7, no. 3, pp. 161–167.
3. M. S. Zurbuchen. 2020. Introduction to Old Javanese Language and Literature: A Kawi Prose Anthology. University of Michigan Press.
4. A. A. G. A. Geria. 2020. Lontar: Tradisi Hidup dan Lestari di Bali. Media Pustakawan, vol. 17, no. 1, pp. 39–45.
5. N. S. Ardiyasa. 2021. Eksistensi Naskah Lontar Masyarakat Bali (Studi Kasus Hasil Pemetaan Penyuluh Bahasa Bali Tahun 2016-2018). Vol. 11, no. 1.
6. P. Koehn. 2005. Europarl: A parallel corpus for statistical machine translation. MT Summit, vol. 11, pp. 79–86.
7. P. Koehn, F. J. Och, and D. Marcu. 2003. Statistical phrase-based translation.
8. S. Yang, Y. Wang, and X. Chu. 2020. A survey of deep learning techniques for neural machine translation.
9. D. Bahdanau, K. H. Cho, and Y. Bengio. 2015. Neural machine translation by jointly learning to align and translate. 3rd International Conference on Learning Representations, ICLR 2015 - Conference Track Proceedings.
10. M. K. Vathsala and G. Holi. 2020. RNN based machine translation and transliteration for Twitter data. International Journal of Speech Technology, vol. 23, no. 3, pp. 499–504.
11. R. Sennrich and B. Zhang. 2019. Revisiting low-resource neural machine translation: A case study. ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, pp. 211–221, doi: 10.18653/v1/p19-1021.
12. I. M. Suweta. 2019. Bahasa dan Sastra Bali dalam Konteks Bahasa dan Sastra Jawa Kuna. Widyacarya: Jurnal Pendidikan, Agama dan Budaya, vol. 3, no. 1, pp. 1–12.
13. I. N. Warta. 2018. Peran Wasi dalam Pembinaan Umat. Widya Aksara, vol. 23, no. 2.
14. R. B. Allen. 1989. Sequence generation with connectionist state machines. IJCNN International Joint Conference on Neural Networks, p. 593, doi: 10.1109/ijcnn.1989.118376.
15. R. B. Allen. 1987. Several studies on natural language and back-propagation.
16. L. Sehovac and K. Grolinger. 2020. Deep learning for load forecasting: Sequence to sequence recurrent neural networks with attention. IEEE Access, vol. 8, pp. 36411–36426, doi: 10.1109/ACCESS.2020.2975738.
17. S. A. B. Andrabi and A. Wahid. 2022. Machine translation system using deep learning for English to Urdu. Computational Intelligence and Neuroscience, vol. 2022, doi: 10.1155/2022/7873012.
18. S. H. Kumhar, M. M. Kirmani, J. Sheetlani, and M. Hassan. 2021. Word embedding generation for Urdu language using Word2vec model. Materials Today: Proceedings, doi: 10.1016/j.matpr.2020.11.766.
19. I. W. Suardiana. 2020. Kesusastraan Bali Purwa. Vol. 18.
20. I. Sutskever, O. Vinyals, and Q. V. Le. 2014. Sequence to sequence learning with neural networks. Advances in Neural Information Processing Systems, vol. 4, pp. 3104–3112.
21. G. Tiwari, A. Sharma, A. Sahotra, and R. Kapoor. 2020. English-Hindi neural machine translation - LSTM Seq2Seq and ConvS2S. Proceedings of the 2020 IEEE International Conference on Communication and Signal Processing, ICCSP 2020, pp. 871–875, doi: 10.1109/ICCSP48568.2020.9182117.
22. A. A. G. A. Geria. 2017. Musala Parwa: Lontar, Teks Kawi Latin, dan Terjemahan Bali-Indonesia. Paramita.
23. S. Saini and V. Sahula. 2018. Neural machine translation for English to Hindi. Proceedings - 2018 4th International Conference on Information Retrieval and Knowledge Management: Diving into Data Sciences, CAMP 2018, pp. 25–30, doi: 10.1109/INFRKM.2018.8464781.
24. Q. H. Nguyen et al. 2021. Influence of data splitting on performance of machine learning models in prediction of shear strength of soil. Mathematical Problems in Engineering, vol. 2021, doi: 10.1155/2021/4832864.
25. R. Rahman. 2020. Robust and consistent estimation of word embedding for Bangla language by fine-tuning Word2Vec model. ICCIT 2020 - 23rd International Conference on Computer and Information Technology, Proceedings, pp. 19–21, doi: 10.1109/ICCIT51783.2020.9392738.
26. J. Chung, C. Gulcehre, K. Cho, and Y. Bengio. 2015. Gated feedback recurrent neural networks. Proceedings of the 32nd International Conference on Machine Learning, vol. 37, PMLR, pp. 2067–2075.
27. G. Liu and J. Guo. 2019. Bidirectional LSTM with attention mechanism and convolutional layer for text classification. Neurocomputing, vol. 337, pp. 325–338.
28. A. Sherstinsky. 2020. Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Physica D: Nonlinear Phenomena, vol. 404, p. 132306, doi: 10.1016/j.physd.2019.132306.
29. S. Yang, X. Yu, and Y. Zhou. 2020. LSTM and GRU neural network performance comparison study: Taking Yelp review dataset as an example. Proceedings - 2020 International Workshop on Electronic Communication and Artificial Intelligence, IWECAI 2020, pp. 98–101, doi: 10.1109/IWECAI50956.2020.00027.
30. K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu. 2002. BLEU: a method for automatic evaluation of machine translation. Proceedings of ACL, pp. 311–318, doi: 10.3115/1073083.1073135.
31. "Evaluating models | AutoML Translation Documentation | Google Cloud." https://cloud.google.com/translate/automl/docs/evaluate (accessed Oct. 22, 2021).
32. L. L. Tan, J. Dehdari, and J. van Genabith. 2015. An awkward disparity between BLEU/RIBES scores and human judgements in machine translation. Proceedings of the Workshop on Asian Translation (WAT 2015), Kyoto, Japan, pp. 74–81.
