Hate Speech Detection in Indonesia Twitter Comments Using Convolutional Neural Network (CNN) and FastText Word Embedding


Fadhilah Nadia Puteri^*, Yuliant Sibaroni, Fitriyani

School of Computing, Informatics, Telkom University, Bandung, Indonesia

Email: ^1,*[email protected], ^2[email protected], ^3[email protected]

Correspondence Author Email: [email protected]

Abstract−Hate speech is a recurring problem in Indonesia, including on social media platforms such as Twitter. It refers to any form of communication, whether oral, written, or symbolic, that may offend, threaten, or insult an individual or group based on attributes such as religion, race, ethnicity, sexual orientation, or other characteristics. Freedom of expression and communication on social media allows hate speech to spread quickly and widely. To counter this, a system is needed that can detect hate speech on social media. Deep learning is potentially better at recognizing and analyzing the language patterns that reflect hate speech in text. A previous study obtained an accuracy of 73.2% using the Convolutional Neural Network method. This study proposes a hate speech detection system that uses a Convolutional Neural Network model with FastText word embedding. The Convolutional Neural Network classification model with FastText word embedding performs well in detecting hate speech: with K-Fold Cross Validation and an appropriate dropout value, it achieves an accuracy of 80%. This accuracy indicates that the model can help limit the spread of hate speech on social media.

Keywords: Hate Speech; Twitter; Deep Learning; Convolutional Neural Network; FastText

1. INTRODUCTION

The rapid development of information technology has had a significant impact on various aspects of life. However, not all of its impacts benefit society. In fact, technology has made it easier to commit crime[1]. One of the most widely known examples of hate speech on Twitter is a tweet by the musician Ahmad Dhani on March 1, 2017, which indirectly referred to the then-inactive governor of DKI Jakarta, Basuki Tjahaja Purnama. The police handled this case only belatedly, after Basuki Tjahaja Purnama reported it [2]. Such violations can damage a person's reputation and integrity and can provoke hostility and confrontation along SARA lines (ethnicity, religion, race, and inter-group relations). Law enforcement relies on manual processes that take a lot of time, which motivates the development of a system that identifies hate speech more effectively and efficiently than humans. Hate speech detection has been studied extensively.

The study in [3] presents a comparison of deep learning methods. The researchers evaluated 14 shallow and deep hate speech detectors, powered by word representation methods ranging from TF-IDF and GloVe-based word embeddings to transformers, on three hate speech detection benchmarks containing different types of hate tweet classes from different data sources. Zhou et al.[4] conducted two experiments to evaluate the efficacy of ELMo, BERT, and CNN, three deep learning techniques for text classification, using raw data as input and accuracy and F1-Score to assess the outcomes. All three approaches achieved high accuracy, with CNN obtaining the best F1-Score. Indah et al.[5] used Instagram as the source of the dataset and the FastText approach as the classifier for detecting hate speech in the Indonesian language. The dataset consisted of 1,200 manually labeled comments, and the evaluation was carried out using 10-fold cross-validation. The experimental results showed that FastText is a better classifier than Random Forest Decision Tree and Logistic Regression. The highest result was achieved when FastText was combined with the bigram feature, with an F-measure of 65.7%. Tin Van Huynh et al.[6] conducted a study using three deep learning classification models, with F1-score as the evaluation metric. The three models are TextCNN, Bi-GRU-CNN, and Bi-GRU-LSTM-CNN. Bi-GRU-LSTM-CNN achieved the best performance among them, with an F1-Score of 70.576%. Ridwan et al.[7] deployed a hybrid deep learning approach to build a classification model that can handle hate speech on Twitter. This method combined Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) models, supported by the Skip-gram and Continuous Bag of Words (CBOW) word embedding methods from Word2Vec. The experiments varied the number of iterations and the embedding dimensions of the CNN-LSTM model in combination with Skip-gram and CBOW. A combination of 30 iterations with 300 Skip-gram dimensions produced the best CNN-LSTM results, with label accuracy reaching as high as 69.1% during the testing stage.

A previous study comparing different approaches to Hindi text classification found that the CNN method with FastText embedding gives the best performance[8]. CNN also excels in a comparison of various text classification methods with FastText as word embedding: for the detection of hate and offensive speech in Hindi and Marathi texts, that study explored deep learning architectures such as CNN, LSTM, and BERT variations such as multilingual BERT, IndicBERT, and monolingual RoBERTa, coupled with FastText as word embedding[9]. Motivated by these results, the authors examine the application of the Convolutional Neural Network (CNN) classification method with FastText as word embedding to detect hate speech in 13,169 Indonesian tweets from a public dataset. The dataset is already labeled positive and negative. The data is divided into training data and test data. Training data is the part of the dataset used to train the model. Test data is a separate part of the dataset that is not used during training. The experiment was conducted using k-fold cross-validation and by adjusting the dropout value, with the test data used to assess how well the model detects hate speech. Testing was done with k = 5 and dropout values of 0.1, 0.2, and 0.3. The results obtained provide conclusions and an understanding of the system created.

2. RESEARCH METHODOLOGY

2.1 Research Stages

The research covers the stages from preprocessing to modeling. The system design in this study is represented by the flowchart in Figure 1.

Figure 1. Hate Speech Detection System

Figure 1 shows the general flow of the hate speech detection system. The process starts by loading the required libraries and importing the dataset. The dataset is first processed in a preprocessing stage with five steps: case folding, cleansing, text normalization, stemming, and stopword removal. The preprocessed dataset is then split, and the word embedding stage is carried out using a pre-trained model, after which the classification algorithm is applied to the data. For convenience, a function is created that calls the algorithm to make predictions. Functions are also created to display the accuracy and the confusion matrix at the evaluation stage.

2.2 Data Collection

The data used in this study comes from a public dataset[10]. It comprises 13,169 Indonesian tweets containing hate speech and abusive comments. Each tweet is already labeled: label 1 marks positive text (tweets that are hate speech) and label 0 marks negative text (tweets that are not hate speech).

Figure 2. Visualization of Data Distribution


As shown in Figure 2, 50.1% of the tweets are positive and 49.9% are negative.

2.3 Data Preprocessing

Preprocessing is the next step after acquiring the dataset. There are five preprocessing phases in this study: case folding, cleansing, normalization, stemming, and stopword removal.

a. Case Folding is the process of converting all upper-case letters to lower case [11]. This step can also improve the model's performance in classification [12].

b. Cleansing is the process of removing punctuation marks, re-tweet symbols, URLs, usernames, and emoticons from the sentences in the dataset, using the patterns listed in Table 1. The libraries used at this stage are Pandas, NumPy, and Regular Expressions.

Table 1. Cleansing Process

| URL | Username | Re-tweet | Regular expression |
|---|---|---|---|
| “http://” or “https://” or “www\.” | @user | rt | 0-9a-zA-Z or @#&$%+-/* |

c. Normalization is the process of converting informal terms into formal terms, using the NumPy and scikit-learn libraries. This stage uses a pre-built dictionary called alay_dictionary, which contains a collection of misspelled words and slang words from previous work [10]. For example, the word “aaamiinn” becomes “amin”.

d. Stemming is the process of reducing words to their root or base form. In this study, the stemmer used is the Sastrawi library. For example, the sentence “indonesia memiliki keindahan alam” becomes “indonesia milik indah alam” after stemming.

e. Stopword Removal is the process of removing words that carry little or no meaning for the purposes of the study and therefore have little influence on classification. This stage uses a dictionary built by previous research called new_kamusalay [10]. For example, “ada huehue yang huhu” becomes “hehe huhu”.

These steps fix issues that arise when processing the data, raising the data's quality so that the model can make more precise predictions.
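The five preprocessing steps can be sketched as a small pipeline. This is a minimal illustration, not the authors' code: the two dictionaries are tiny hypothetical stand-ins for alay_dictionary and the stopword list, the regular expressions are assumptions based on Table 1, and stemming is passed in as a function (the study uses the Sastrawi stemmer, which is replaced here by an identity placeholder so the sketch runs without external libraries).

```python
import re

# Hypothetical stand-ins for the dictionaries taken from previous work [10].
SLANG_TO_FORMAL = {"aaamiinn": "amin"}
STOPWORDS = {"ada", "yang"}


def case_fold(text):
    """Step a: convert every letter to lower case."""
    return text.lower()


def cleanse(text):
    """Step b: remove URLs, usernames, re-tweet markers, digits and symbols."""
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # URLs
    text = re.sub(r"@\w+", " ", text)                   # usernames
    text = re.sub(r"\brt\b", " ", text)                 # re-tweet marker
    text = re.sub(r"[^a-z\s]", " ", text)               # digits, punctuation, emoticons
    return re.sub(r"\s+", " ", text).strip()            # collapse whitespace


def normalize(text):
    """Step c: map informal words to their formal form."""
    return " ".join(SLANG_TO_FORMAL.get(tok, tok) for tok in text.split())


def remove_stopwords(text):
    """Step e: drop words that carry little meaning for classification."""
    return " ".join(tok for tok in text.split() if tok not in STOPWORDS)


def preprocess(text, stem=lambda word: word):
    """Run the five steps in the order described above.

    `stem` is a placeholder for step d; a real stemmer function
    (word -> root word) can be supplied by the caller.
    """
    text = normalize(cleanse(case_fold(text)))
    text = " ".join(stem(tok) for tok in text.split())
    return remove_stopwords(text)


example = preprocess("RT @user Aaamiinn ada yang http://t.co/x 123!!")
```

With these toy dictionaries, only the normalized word "amin" survives from the example tweet.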

Figure 3. Word Cloud Negative Tweets and Positive Tweets

A word cloud helps identify the most frequent words in the texts: the more frequent a word, the larger and more striking it appears. As shown in Figure 3, the word "indonesia" stands out in the negative tweets and "jokowi" in the positive tweets, along with several other fairly frequent words. Negative tweets do not contain hate speech, while positive tweets do.

2.4 Convolutional Neural Network

Convolutional Neural Network (CNN) is one of the deep learning algorithms capable of effectively capturing meaningful sentence representations as used in language classification and modeling [13].

Figure 4. CNN Workflow for Text Classification

The workflow of a classification problem using a CNN is depicted in Figure 4. In this process, the words are transformed into vectors using an embedding layer and subsequently utilized as input for the CNN.

Each CNN layer in the text classification has an important role in extracting useful information from the input text. This process involves convolution filters that identify local patterns, pooling to reduce dimensions, and fully connected layers to perform further transformations. In combination, these layers help the model in understanding important relationships and features in the text. This research was built using TensorFlow and run on Google Colaboratory. The network architecture used can be seen in Figure 5.


Figure 5. CNN Architecture for Hate Speech Detection

The network has 9 layers that convert text input into predictive output, made up of the input layer and embedding layer, dropout layer 1, Conv. layer 1, pooling layer 1, Conv. layer 2, pooling layer 2, dense layer 1, dropout layer 2, dense layer 2, and output layer.

The input layer receives text input in the form of a sequence of words. The embedding layer then maps discrete words to continuous vectors using an embedding matrix of size (20000, 100). The first and second dropout layers apply a dropout rate of 0.2 to reduce overfitting. Conv. layer 1 applies convolution operations to the text input with 250 filters and a kernel size of 3, and Conv. layer 2 applies convolution with 250 filters and a kernel size of 5.

The next stage is the pooling layer, which simplifies the shape of the feature matrix without removing neurons. Pooling layer 1 applies MaxPooling1D, a type of pooling layer that reduces the output dimensions of the previous layer, using the default pool size of 2. Pooling layer 2 applies GlobalMaxPooling1D, which takes the maximum value of each feature. The dense layer then has 250 units and uses the ReLU activation function to process the output of the GlobalMaxPooling1D layer.

$$f(x)_{\mathrm{ReLU}} = \max(0, x) \tag{1}$$
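To make the layer sizes concrete, the following sketch traces how the tensor shape changes through the layers described above. It is an illustration under stated assumptions: stride 1 and no padding for the convolutions, non-overlapping pooling, a single sigmoid output unit, and a hypothetical input length of 50 tokens (the paper does not state the sequence length). Dropout layers are omitted because they do not change shapes.

```python
def conv1d_length(length, kernel_size):
    """Output length of a stride-1 convolution without padding."""
    return length - kernel_size + 1


def maxpool1d_length(length, pool_size=2):
    """Output length of non-overlapping max pooling without padding."""
    return length // pool_size


def trace_shapes(seq_len, embed_dim=100, filters=250, dense_units=250):
    """Return (layer name, output shape) pairs for one input sequence."""
    shapes = [("embedding", (seq_len, embed_dim))]
    n = conv1d_length(seq_len, kernel_size=3)   # Conv. layer 1
    shapes.append(("conv1", (n, filters)))
    n = maxpool1d_length(n, pool_size=2)        # MaxPooling1D
    shapes.append(("pool1", (n, filters)))
    n = conv1d_length(n, kernel_size=5)         # Conv. layer 2
    shapes.append(("conv2", (n, filters)))
    shapes.append(("global_max_pool", (filters,)))  # one maximum per filter
    shapes.append(("dense", (dense_units,)))
    shapes.append(("output", (1,)))             # assumed single sigmoid unit
    return shapes


# seq_len=50 is a hypothetical value chosen only for the illustration.
for name, shape in trace_shapes(seq_len=50):
    print(f"{name:16s} {shape}")
```

The global max pooling step is what makes the dense layer's input size independent of the tweet length: whatever the sequence length, it keeps one value per filter.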

The sigmoid function squashes a large input space into a small one: its input is any real number, while its output is limited to the range between zero and one[14]. It can be represented mathematically by the following equation:

$$f(x)_{\mathrm{sigm}} = \frac{1}{1 + e^{-x}} \tag{2}$$
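Equations (1) and (2) translate directly into code. The sketch below uses only the standard library; the branch on the sign of x in the sigmoid is an implementation choice to avoid floating-point overflow for large negative inputs and is not part of the original equations.

```python
import math


def relu(x):
    """Equation (1): pass positive values through, clamp negatives to zero."""
    return max(0.0, x)


def sigmoid(x):
    """Equation (2): map any real number into the range between 0 and 1."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # Algebraically identical form that avoids overflow when x is very negative.
    e = math.exp(x)
    return e / (1.0 + e)
```

For a binary task such as this one, a sigmoid output above 0.5 is commonly read as label 1 and anything else as label 0; the threshold is a convention, not something stated in the paper.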

The hyperparameters used are Dropout (the probability of dropping each node), Epoch (the number of epochs), Batch Size (the number of samples per batch), and Kernel Size (the size of the kernel matrix).

Table 2. Hyperparameter

| Dropout | Epoch | Batch size | Kernel size |
|---|---|---|---|
| 0.1, 0.2, 0.3 | 5 | 32 | 3 |

Table 2 lists the hyperparameters. The dropout values 0.1, 0.2, and 0.3 are tested as candidates for the dropout applied to the network layers; research[15] obtained the best dropout ratio between 0.1 and 0.2. The value of 5 was chosen for the epoch because research[16] obtained a superior score with 5 epochs. The batch size is set to 32 and the kernel size to 3 to improve computational efficiency. This study therefore uses these hyperparameter values in order to obtain optimal performance.
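One way to keep these settings together is a small configuration object that enumerates every training run. The sketch below is illustrative only: the class and function names are hypothetical, and it assumes one run per combination of dropout value and fold, with 5 folds as used in this study.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class RunConfig:
    """Settings for a single training run (values from Table 2)."""
    dropout: float
    fold: int
    epochs: int = 5
    batch_size: int = 32
    kernel_size: int = 3


def build_runs(dropouts=(0.1, 0.2, 0.3), k=5):
    """Enumerate one run per (dropout value, fold) combination."""
    return [RunConfig(dropout=d, fold=f)
            for d, f in product(dropouts, range(1, k + 1))]


runs = build_runs()
print(len(runs), runs[0])
```

Three dropout values and five folds give fifteen runs in total.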

2.5 FastText Word Embedding

FastText is a word embedding technique created by Facebook's AI research team as an extension of word2vec. It can learn word representations and perform text classification quickly and effectively [17], and it treats each word in the corpus as a collection of sub-words, or character n-grams. FastText uses these character n-grams as its fundamental units for representing words. For example, with boundary markers added, the word "apple" can be divided into the character 3-grams "<ap", "app", "ppl", "ple", and "le>". Each of these n-grams is assigned its own vector, and the embedding vector for "apple" is obtained by summing the vectors of all its constituent n-grams. This approach allows FastText to capture both morphological and semantic information by considering subword units, enabling it to handle out-of-vocabulary words and improve performance in various natural language processing tasks[18]. FastText provides two models for computing word representations: skip-gram and continuous bag of words (CBOW). Figure 6 illustrates how the two models differ.
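The subword idea can be sketched in a few lines. This is a simplified illustration, not the FastText implementation: it uses a single n-gram length of 3 with "<" and ">" as boundary markers, and a small hypothetical dictionary of 2-dimensional n-gram vectors. The actual library combines several n-gram lengths and stores n-gram vectors in a hashed table.

```python
def char_ngrams(word, n=3):
    """Split a word into character n-grams, with boundary markers added."""
    marked = f"<{word}>"
    return [marked[i:i + n] for i in range(len(marked) - n + 1)]


def word_vector(word, ngram_vectors, dim, n=3):
    """Sum the vectors of all known n-grams of a word.

    N-grams missing from `ngram_vectors` are skipped, so an
    out-of-vocabulary word still gets a vector from the n-grams it
    shares with known words.
    """
    total = [0.0] * dim
    for gram in char_ngrams(word, n):
        vec = ngram_vectors.get(gram)
        if vec is not None:
            total = [a + b for a, b in zip(total, vec)]
    return total


# Hypothetical toy vectors, for illustration only.
toy_vectors = {"<ap": [1.0, 0.0], "app": [0.0, 1.0], "ple": [0.5, 0.5]}
print(char_ngrams("apple"))
print(word_vector("apple", toy_vectors, dim=2))
```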

Figure 6. Workflow CBOW and SKIPGRAM


Figure 6 uses the phrase “Saya selalu berdoa sebelum keluar rumah” (“I always pray before leaving the house”), with ‘keluar’ as the target word. The skip-gram model tries to predict the target from a randomly chosen nearby word, whereas the CBOW model uses all the words around it.
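The difference shows up in how training examples are built from the phrase. The sketch below follows the description above; the window size of 2 is an assumption made for the illustration, and for skip-gram it lists every nearby word the model could draw rather than sampling one at random.

```python
def context_words(tokens, target_index, window):
    """Words within `window` positions of the target, excluding the target."""
    lo = max(0, target_index - window)
    hi = min(len(tokens), target_index + window + 1)
    return [tokens[i] for i in range(lo, hi) if i != target_index]


def cbow_example(tokens, target_index, window):
    """CBOW: one example that pairs the whole context with the target."""
    return context_words(tokens, target_index, window), tokens[target_index]


def skipgram_examples(tokens, target_index, window):
    """Skip-gram: one (nearby word, target) pair per context word."""
    target = tokens[target_index]
    return [(word, target) for word in context_words(tokens, target_index, window)]


tokens = "Saya selalu berdoa sebelum keluar rumah".split()
idx = tokens.index("keluar")
print(cbow_example(tokens, idx, window=2))
print(skipgram_examples(tokens, idx, window=2))
```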

Aydogan [19] showed that using a pre-trained model increases the speed by about 5-7%. The pre-trained model in this study uses the skip-gram model trained on 100,000 articles available on Wikipedia. The process took 887,894 seconds and yielded 211,949 words.

2.6 Cross Validation

Cross validation is a model evaluation method that can assess prediction accuracy[20]. For assessing the exploratory prediction capability of models, k-fold forward cross validation is an improvement over conventional k-fold cross validation: instead of dividing the dataset into k equal groups at random, all of the samples are first sorted according to the values of the material's properties. In k-fold CV, the dataset is divided into k folds; each fold is used exactly once as test data, while the remaining folds are used as training data[21]. The performance of the model is measured as the average over all iterations. The parameter k, the number of folds, must be set in advance. The k value determines how distinct in property values the validation set and training set will be, which has a significant impact on the outcome.
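The fold rotation can be sketched as follows. This is a minimal version of conventional k-fold splitting under simplifying assumptions: the samples are taken in their existing order without shuffling, and the function names are hypothetical. The per-fold scores passed to `mean_score` would come from training and evaluating the model on each split.

```python
def k_fold_indices(n_samples, k):
    """Return k (train_indices, test_indices) pairs.

    Each sample appears in exactly one test fold; fold sizes differ by
    at most one when n_samples is not divisible by k.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    indices = list(range(n_samples))
    folds, start = [], 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


def mean_score(scores):
    """Average the per-fold scores into one performance estimate."""
    return sum(scores) / len(scores)


folds = k_fold_indices(13169, k=5)
print([len(test) for _, test in folds])
```

With 13,169 tweets and k = 5, four folds hold 2,634 tweets and one holds 2,633.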

2.7 Performance of System

Classification results can be read easily from a confusion matrix, an evaluation tool that shows the extent to which a model or system classifies data correctly. The confusion matrix has 4 main components: True Positive (TP), False Positive (FP), True Negative (TN), and False Negative (FN). From these components, various evaluation metrics can be calculated, including accuracy, precision, recall, and F1-score. These metrics give a more detailed picture of the model's classification performance. Their equations are as follows:

a. Accuracy

Accuracy is the proportion of correctly classified data out of all data. It is calculated as in Equation 3.

$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \tag{3}$$

b. Precision

Precision is the degree of agreement between the information requested by the user and the answer given by the system, that is, the share of positive predictions that are correct. It is calculated as in Equation 4.

$$Precision = \frac{TP}{TP + FP} \tag{4}$$

c. Recall

Recall is the success rate of the system in retrieving information, that is, the share of actual positives that the system finds. It is calculated as in Equation 5.

$$Recall = \frac{TP}{TP + FN} \tag{5}$$

d. F1-score

F1-Score is a performance measure that combines precision and recall as their harmonic mean. It is calculated as in Equation 6.

$$F1\text{-}score = \frac{2TP}{2TP + FP + FN} \tag{6}$$

Figure 7 below shows TP (True Positive), TN (True Negative), FP (False Positive), and FN (False Negative).

Figure 7. Confusion Matrix


Based on the confusion matrix, it can be explained that TP is the amount of data that is correctly classified as positive by the model, FP is the amount of data that should be classified as negative but is incorrectly classified as positive by the model. TN is the amount of data correctly classified as negative by the model. FN is the amount of data that should have been classified as positive but was incorrectly classified as negative by the model.
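The four metrics follow directly from the confusion-matrix counts. The sketch below uses the standard definitions of accuracy, precision, recall, and F1-score; the counts in the example are hypothetical and are not results from this study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    A ratio with a zero denominator is reported as 0.0.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
print(metrics)
```

The F1 value computed from the counts equals the harmonic mean of precision and recall, which is a quick consistency check when reading a classification report.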

3. RESULT AND DISCUSSION

3.1 Test Result

This study used a public dataset of 13,169 tweets labeled positive and negative. The data was used to test the hate speech detection system using Convolutional Neural Network (CNN) with FastText as word embedding; evaluating the system's performance gives an assessment of how well it detects hate speech. The pre-trained model in this study uses the skip-gram model trained on 100,000 articles in Indonesian, taken from a public dataset available from Wikipedia. The pre-trained model is used in the classification model to improve its performance.

Table 3. The result of FastText Word Embedding

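| No | Word | Dim 1 | Dim 2 | Dim 3 | Dim 4 | Dim 5 | Dim 100 |
|---|---|---|---|---|---|---|---|
| 1 | Indonesia | -0.68 | -0.47 | 0.29 | -0.75 | -0.10 | 0.09 |
| 2 | Presiden | -0.84 | -0.07 | 0.70 | -0.05 | -0.68 | -0.39 |
| 3 | Jokowi | -0.53 | 0.39 | -0.40 | 0.62 | 1.02 | -0.56 |
| 4 | Islam | -0.62 | 0.14 | 1.28 | 0.44 | 0.01 | 0.26 |
| 5 | Cina | 0.42 | -0.12 | -0.60 | 0.08 | 0.06 | 0.43 |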

Table 3 shows the final result of the word embedding process, which is carried out with the prepared pre-trained model. The result is a vector representation of each word with 100 dimensions. Each dimension holds a numerical value that represents a particular feature or attribute of the word, such as the relationship of the word “Indonesia” to words that often appear around it. Overall, the vectors help the computer process, understand, and analyze the text more efficiently and effectively.

CNN classification with FastText word embedding was evaluated using K-fold cross-validation. Table 4 compares the accuracy of each fold across different dropout values; the dropout values used are 0.1, 0.2, and 0.3.

Table 4. The Results with 5-fold Against The Value of Dropout

| K-fold | Training (0.1) | Validation (0.1) | Training (0.2) | Validation (0.2) | Training (0.3) | Validation (0.3) |
|---|---|---|---|---|---|---|
| K=1 | 0.9711 | 0.9588 | 0.9719 | 0.9573 | 0.9544 | 0.9602 |
| K=2 | 0.9750 | 0.9884 | 0.9668 | 0.9877 | 0.9639 | 0.9892 |
| K=3 | 0.9759 | 0.9899 | 0.9662 | 0.9877 | 0.9628 | 0.9884 |
| K=4 | 0.9759 | 0.9892 | 0.9619 | 0.9826 | 0.9645 | 0.9863 |
| K=5 | 0.9768 | 0.9920 | 0.9710 | 0.9913 | 0.9601 | 0.9870 |
| Average | 0.97494 | 0.981575 | 0.9667 | 0.98132 | 0.96114 | 0.98222 |

Table 4 shows the results of testing the system with 5-fold cross-validation for dropout values of 0.1, 0.2, and 0.3. The best performance is obtained with a dropout value of 0.1, which reaches a training accuracy of 0.9768 and a validation accuracy of 0.9920.

Table 5. System Prediction Results

| No | Text | Actual | Predicted |
|---|---|---|---|
| 1 | menteri lingkung hidup hutan upaya perata ekonomi tora kawasan hutan hutan sosial | 0 | 0 |
| 2 | kayak monyet | 1 | 1 |
| 3 | gue ngemafia main 3 acc edan | 0 | 0 |
| 4 | sinting sih main sepak bola jual bahan kimia | 1 | 1 |
| 5 | gue pakai celana pendek dengkul bilang banci pakai dengkul bilang gaul mending pakai celana deh | 1 | 0 |

A few of the predicted outcomes are shown in Table 5. Most of the texts are predicted correctly. Label 1 denotes positive text, meaning the text falls into the hate speech category, and label 0 denotes negative text, meaning it does not contain hate speech. Text 5 was not predicted correctly. This may be due to natural variability in the data, or to uncertainty or variation that the model cannot fully explain, for example the presence of words or tokens with extreme values, such as the word ‘slang’, whose meaning does not belong to hate speech.

3.2 Analysis of Test Results

With the Convolutional Neural Network model and FastText word embedding, training and validation accuracy reached 0.9768 and 0.9920 under k-fold cross validation with a dropout value of 0.1 and 5 folds.

Table 6. Classification Report Results from 5-fold and Dropout 0.1

| Label | Precision | Recall | F1-Score |
|---|---|---|---|
| 0 | 0.82 | 0.84 | 0.83 |
| 1 | 0.77 | 0.75 | 0.76 |
| Accuracy | | | 0.8 |

Table 6 shows the classification report of the best test configuration, containing four metrics: accuracy, precision, recall, and F1-score. These metrics provide a deeper understanding of how well the model classifies each class. The model obtained average values of 80% for accuracy, precision, recall, and F1-score. The classification report gives a better understanding of how well the model classifies the data, both overall and for each class.

Figure 8. Average Scores Cross Folds

Figure 8 gives a visual overview of the average score for each category ('Loss' and 'Accuracy') across folds. It is a bar chart whose x-axis contains the labels 'Loss' and 'Accuracy' and whose y-axis shows the average score. The average 'Loss' score was 0.13 and the average 'Accuracy' score was 0.95. The difference in bar sizes makes it easy to compare the two categories.

Figure 9. Confusion Matrix

Figure 9 shows the confusion matrix from the test data with 5-fold and a dropout value of 0.1, which can be read as follows:

a. 1617 negative labels were predicted negative, and 19 negative labels were predicted positive.

b. 1264 positive labels were predicted positive, and 172 positive labels were predicted negative.


4. CONCLUSION

This study developed a hate speech detection system using Convolutional Neural Network and FastText word embedding. It concluded that the Convolutional Neural Network classification model with FastText as word embedding performs well on this task. With 5 epochs, a batch size of 32, and a kernel size of 3, the best results were obtained with 5 folds and a dropout value of 0.1, giving an accuracy of 80%. Obtaining the optimal accuracy requires selecting the right dropout value, the right number of folds, and an appropriate combination of hyperparameters. In general, the combination of CNN and FastText word embedding produces a model that can learn the important patterns related to hate speech in the text. However, this will vary depending on the specific needs and characteristics of the dataset used. The results showed that a lower dropout value and a greater number of folds gave better estimation accuracy for the Convolutional Neural Network classification model with FastText word embedding.

REFERENCES

[1] A. Sepima, G. T.P. Siregar, and S. Amry Siregar, “Penegakan Hukum Ujaran Kebencian di Republik Indonesia,” 2021.

[2] K. Antariksa, Y. W. Sigit Purnomo, and D. Ernawati, “Klasifikasi Ujaran Kebencian pada Cuitan dalam Bahasa Indonesia,” 2019.

[3] J. S. Malik, G. Pang, and A. van den Hengel, “Deep Learning for Hate Speech Detection: A Comparative Study,” Feb. 2022, [Online]. Available: http://arxiv.org/abs/2202.09517

[4] Y. Zhou, Y. Yang, H. Liu, X. Liu, and N. Savage, “Deep Learning Based Fusion Approach for Hate Speech Detection,” IEEE Access, vol. 8, pp. 128923–128929, 2020, doi: 10.1109/ACCESS.2020.3009244.

[5] N. Indah Pratiwi, I. Budi, and I. Alfina, Hate Speech Detection on Indonesian Instagram Comments using FastText Approach. IEEE, 2018.

[6] T. Van Huynh, V. D. Nguyen, K. Van Nguyen, N. L.-T. Nguyen, and A. G.-T. Nguyen, “Hate Speech Detection on Vietnamese Social Media Text using the Bi-GRU-LSTM-CNN Model,” Nov. 2019, [Online]. Available: http://arxiv.org/abs/1911.03644

[7] M. Ridwan and A. Muzakir, “Model Klasifikasi Ujaran Kebencian pada Data Twitter dengan Menggunakan CNN-LSTM HATE SPEECH CLASSIFICATION MODEL ON TWITTER DATA USING CNN-LSTM,” TEKNOMATIKA, vol. 12, no. 02, pp. 1–5, 2022.

[8] A. Velankar, H. Patil, A. Gore, S. Salunke, and R. Joshi, “Hate and Offensive Speech Detection in Hindi and Marathi,” Oct. 2021, [Online]. Available: http://arxiv.org/abs/2110.12200

[9] R. Joshi, R. Karnavat, K. Jirapure, and R. Joshi, “Evaluation of Deep Learning Models for Hostility Detection in Hindi Text,” Jan. 2021, doi: 10.1109/I2CT51068.2021.9418073.

[10] M. O. Ibrohim and I. Budi, “Multi-label Hate Speech and Abusive Language Detection in Indonesian Twitter,” 2019. [Online]. Available: https://www.komnasham.go.id/index.php/

[11] M. A. Rosid, A. S. Fitrani, I. R. I. Astutik, N. I. Mulloh, and H. A. Gozali, “Improving Text Preprocessing for Student Complaint Document Classification Using Sastrawi,” in IOP Conference Series: Materials Science and Engineering, Institute of Physics Publishing, Jul. 2020. doi: 10.1088/1757-899X/874/1/012017.

[12] N. Adani Setyadi, A. Setyadi, M. Nasrun, and C. Setianingsih, Text Analysis for Hate Speech Detection Using Backpropagation Neural Network. 2018.

[13] A. Nurdin, B. Anggo, S. Aji, A. Bustamin, and Z. Abidin, “PERBANDINGAN KINERJA WORD EMBEDDING WORD2VEC, GLOVE, DAN FASTTEXT PADA KLASIFIKASI TEKS,” Jurnal TEKNOKOMPAK, vol. 14, no. 2, p. 74, 2020.

[14] L. Alzubaidi et al., “Review of deep learning: concepts, CNN architectures, challenges, applications, future directions,” J Big Data, vol. 8, no. 1, Dec. 2021, doi: 10.1186/s40537-021-00444-8.

[15] I. Ali Kandhro, S. Zafar Jumani, F. Ali, Z. Uddin Shaikh, M. Arshad Arain, and A. Ahmed Shaikh, “Performance Analysis of Hyperparameters on a Sentiment Analysis Model,” 2020. [Online]. Available: www.etasr.com

[16] J. Elektronik Ilmu Komputer Udayana et al., “Analisis Sentimen Ulasan E-Commerce Pakaian Berdasarkan Kategori dengan Algoritma Convolutional Neural Network,” 2022.

[17] N. Nedjah, I. Santos, and L. de Macedo Mourelle, “Sentiment analysis using convolutional neural network via word embeddings,” Evol Intell, vol. 15, no. 4, pp. 2295–2319, Dec. 2022, doi: 10.1007/s12065-019-00227-4.

[18] S. Mestry, V. Bisht, H. Singh, K. Tiwari, and R. Chauhan, “Automation in Social Networking Comments With the Help of Robust fastText and CNN,” 2019. doi: 10.1109/ICIICT1.2019.8741503.

[19] M. Aydoğan and A. Karci, “Improving the accuracy using pre-trained word embeddings on deep neural networks for Turkish text classification,” Physica A: Statistical Mechanics and its Applications, vol. 541, Mar. 2020, doi: 10.1016/j.physa.2019.123288.

[20] G. Battineni, G. G. Sagaro, C. Nalini, F. Amenta, and S. K. Tayebati, “Comparative machine-learning approach: A follow-up study on type 2 diabetes predictions by cross-validation methods,” Machines, vol. 7, no. 4, 2019, doi: 10.3390/machines7040074.

[21] I. K. Nti, O. Nyarko-Boateng, and J. Aning, “Performance of Machine Learning Algorithms with Different K Values in K-fold CrossValidation,” International Journal of Information Technology and Computer Science, vol. 13, no. 6, pp. 61–71, Dec. 2021, doi: 10.5815/ijitcs.2021.06.05.