Sentiment Analysis on Twitter (X) Related to Relocating the National Capital Using the IndoBERT Method with Chi-Square Feature Selection
Dufha Arista*, Yuliant Sibaroni, Sri Suryani Prasetyowati
School of Computing, Informatics, Telkom University, Bandung, Indonesia
Email: [email protected]*, [email protected], [email protected]
Correspondence Author Email: [email protected]
Abstract−Sentiment analysis, commonly referred to as opinion mining, is a field of science used to measure the proportion of positive and negative sentiment toward a person, company, institution, product, or even an issue or topic. Many topics are discussed on social media such as Twitter (X), ranging from the economy and politics to social issues, culture, and law. One of the most discussed topics on Twitter (X) is the relocation of Indonesia's capital city to East Kalimantan Province, which has drawn various opinions from netizens. In this study, data on the relocation of the national capital was collected from the social media platform Twitter (X) over the date range January 1, 2022 to February 28, 2022. The method used in this research is IndoBERT with Chi-Square feature selection. Based on the experiments carried out, IndoBERT with Chi-Square feature selection shows good results, with an overall accuracy of 94%, precision of 85%, recall of 91%, and F1-score of 88.4% on the full dataset.
Keywords: Sentiment Analysis; The Capital of The Country; IndoBERT; Chi-Square
1. INTRODUCTION
In this modern era, social media is growing very rapidly. Social media platforms allow us to voice our ideas and share new perspectives and thoughts on various issues and topics with friends, family, our immediate environment, and even strangers around the world. They allow interaction without time and space constraints between the infrastructure and superstructure of the political system. Internet use is increasingly widespread in Indonesia, not only for personal or business-commercial purposes but also for political matters [1].
Twitter is an online social network run by the American corporation X Corp. in which registered users can post microblogs, limited to 140 characters, containing text, photos, and videos.
Users can also send direct messages (DM) to other registered users and post (tweet), like, retweet, and comment on quoted posts. Users can connect to Twitter through browsers, programmatic APIs, or mobile apps. Twitter can take advantage of this to provide a public space where people gather, hold conversations, and demand transparency, all of which can lead to positive social change [2]. One topic that has been hotly discussed on Twitter (X) since August 26, 2019 is the plan to relocate the nation's capital from Jakarta to East Kalimantan [3]. Since Jakarta is regarded as unfit to hold the title of capital, there has been talk of moving it since the colonial era. The relocation was realized during the Jokowi-Ma'ruf presidency, when the law on the National Capital was passed [3]. This caused various positive and negative reactions from the Indonesian people and became the subject of debate on social media, especially Twitter. Every reaction typed has a variety of emotions and questions hidden in it [4]. These emotional tendencies and questions can be identified by conducting sentiment analysis.
Research on sentiment analysis via Twitter has been widely conducted, and methods for sentiment analysis have developed quite rapidly over time. One example is the journal published by Hadiyan Kundrat Putra et al. [5], entitled Detection of the Use of Abusive Sentences in Indonesian Text Using the IndoBERT Method, which reports that IndoBERT can better classify abusive sentences in Indonesian text. The IndoBERT architecture was able to detect abusive sentences quite optimally, especially on the second dataset; and compared with other BERT models, IndoBERT was clearly superior in terms of F1-score. IndoBERT could classify all three class types and produce F1-score values for each class [5]. This is because IndoBERT makes use of transformers, attention mechanisms that examine the contextual connections between words or subwords in text, so the model learns the context between words in more detail, which in turn improves classification accuracy for each label. The IndoBERT architecture's F1-score is likewise affected by adding data: after adding data to the minority classes, the IndoBERT F1-score improved, with an average F1-score of 76.32% per class [5].
The quantity of features is one of the challenges in sentiment analysis: an abundance of features can hamper classification performance [7]. A feature selection procedure is therefore required. There are various options for feature selection in sentiment analysis; this study uses Chi-Square feature selection to determine how dependent a feature is on a class. According to study [8], using the Chi-Square approach helped the classification algorithm perform more accurately, going from 73.33% to 93.33%. Naïve Bayes was the algorithm employed in that study [9].
As evidenced by the numerous studies on sentiment analysis using deep learning and machine learning algorithms mentioned above, IndoBERT has the potential to yield positive outcomes, yet little research has been done on sentiment analysis with IndoBERT. Recent studies have demonstrated that models pre-trained on a large corpus can effectively use transfer learning to accomplish a range of downstream natural language processing tasks. BERT, the Bidirectional Encoder Representations from Transformers, is one pre-trained language representation model that performs better than several architectures tailored to specific tasks; its goal is to learn detailed representations of text from both left and right directions [5], [6]. By adding just one more classification layer, BERT can be readily adapted for classification tasks, removing the need to train a new model from scratch.
2. RESEARCH METHODOLOGY
2.1 System Design
Figure 1. Sentiment Classification System
Figure 1 shows the flowchart of this system. To prepare the dataset, the system first retrieves data from Twitter (X), eliminates duplicate data, and then labels the retrieved data. The labeled data undergoes preprocessing, which includes data cleansing, case folding, stopword removal, and stemming. The data then moves on to the Chi-Square feature selection phase. Next, the data is split into two sets: training data and test data. Training data is used to train the model, while test data is used to evaluate its performance. Modeling with IndoBERT is the next step. Finally, each model is assessed by summing the confusion matrix values over every iteration; the resulting matrix shows the model's total performance over the whole dataset.
2.2 Datasets
In this study, the dataset was taken from the social media platform Twitter (X) over the date range January 1, 2022 to February 28, 2022, using several keywords and hashtags such as #IbukotaNegara, #IKN, Nusantara, and the new capital city. The dataset obtained from Twitter (X) has 5,892 records, most of which are labeled positive: the positive class has 3,750 data points and the negative class has 2,142 data points.
The data crawling process was done using the Twitter API assisted by the tweepy library in the Python programming language. This crawling produces a dataset in Comma Separated Value (CSV) format containing the opinions and emotions of people on Twitter regarding the decision on the new national capital.
Table 1. Labelled Dataset

Number  Tweet                                                                 Label
1       "#AyoTolakUUIKN Proyek IKN membuka peluang besar bagi oligarki untuk berbuat curang, dengan sebutan curang. Tolak UU IKN"  Negative
2       "Kalimantan. Mata dunia tertuju padanya, bukan hanya karena kekayaan alamnya atau rencana pemindahan ibu kota negara. Namun juga merupakan bagian dari pusat konservasi hutan, paru-paru dunia. Semoga dengan kepindahan IKN, pengendalian konservasi hutan menjadi lebih baik."  Positive
3       "Pindah ibu kota hanya menambah beban negara. menambah beban rakyat yang sudah susah. mau dibilang hebat tapi nggak mikir. nggak ngaca kalau nggak punya kemampuan.😔"  Negative
4       "Setuju! Adanya ibu kota baru di IKN Nusantara dpt mengatur ulang jarak antara infrastruktur pemerintahan dengan infrastruktur lembaga negara asing yang akan berkantor di IKN Nusantara. #DukungIKNusantara https://t.co/cXUDPT4uoC"  Positive
2.3 Preprocessing
Preprocessing is the next stage of the study once all the data has been gathered. Preprocessing involves transforming the data into a format that is simpler and more efficient to handle, so that more accurate findings are achieved during the classification process. The six procedures that comprise the preprocessing step of this study are cleaning, case folding, normalization, tokenization, stopword elimination, and stemming.
Cleansing is the process of eliminating elements that do not affect the classification process. Data is typically noisy, chaotic, disorganized, and unstructured [12]. Punctuation, double spacing, emoticons, HTML and URL formatting, numbers, and symbols are among the elements that are removed.
Case Folding is the process of changing every letter to lowercase, producing data that is consistent and organized. As an illustration, the phrase "Random First Minute" becomes "random first minute".
Tokenization is the process of segmenting phrases into discrete tokens according to the needs of the system [7]. For example, the sentence "The IKN Project opens up great opportunities for oligarchs to cheat" becomes "[IKN Project], [open], [opportunity], [big], [para], [oligarchy], [for], [play], [cheat]".
Stopword Removal is a process to eliminate words that have no effect on the classification process [13]. Stopwords are words that are often used in daily speech; the tokenization process produces the input data for this procedure. Removing these words from the text improves the quality of text analysis by focusing on the more important words. In this study, researchers used the stopword corpus provided by NLTK. For example, the input "[Project IKN], [open], [opportunity], [big], [para], [oligarchy], [for], [play], [cheat]" becomes "[play], [cheat]".
Stemming is the process of converting a word into its root form. Through this procedure, the word's affixes (suffixes, prefixes, and combinations of the two) are eliminated [13]. In this study, researchers used PorterStemmer() from the NLTK module.
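The preprocessing stages described above can be sketched as a small pipeline. This is an illustrative sketch only: the stopword list and suffix-stripping stemmer below are minimal stand-ins, whereas the study itself uses the NLTK stopword corpus and PorterStemmer().

```python
import re

# Tiny illustrative stopword list (stand-in for the NLTK corpus used in the study).
STOPWORDS = {"the", "for", "to", "a", "of", "up"}

def clean(text):
    """Cleansing: strip URLs, mentions, hashtags, numbers, punctuation, extra spaces."""
    text = re.sub(r"https?://\S+|@\w+|#\w+", " ", text)
    text = re.sub(r"[^a-zA-Z\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()

def case_fold(text):
    """Case folding: lowercase everything."""
    return text.lower()

def tokenize(text):
    """Tokenization: split the sentence into discrete tokens."""
    return text.split()

def remove_stopwords(tokens):
    """Stopword removal: drop words that carry little classification signal."""
    return [t for t in tokens if t not in STOPWORDS]

def stem(tokens):
    """Naive suffix stripping as a stand-in for NLTK's PorterStemmer."""
    return [t[:-3] if t.endswith("ing") else t for t in tokens]

def preprocess(text):
    return stem(remove_stopwords(tokenize(case_fold(clean(text)))))

print(preprocess("The IKN Project opens up great opportunities for cheating! https://t.co/x"))
```

Each stage consumes the previous stage's output, matching the order shown in Figure 2.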
Figure 2. Preprocessing Stage

2.4 TF-IDF Weighting
Once the preprocessing stage is complete, the next step involves calculating the weight of the words used. In this study, TF-IDF weighting was used to extract features. TF-IDF is a method to determine the relationship of words or terms to a document; it is a word weighting technique based on the occurrence of a word and the importance of the document containing it. The weighting method assigns numerical values to the input words [8]. TF-IDF plays a role in calculating how often a word appears in the dataset [9]. The TF-IDF weighting formula for words can be seen in the following equation:
w(t, d) = tf(t, d) × idf(t) = tf(t, d) × log(N / df(t)) (1)
Description:
w(t, d) = the weight of term t in document d.
tf(t, d) = the frequency of term t in document d.
idf(t) = the inverse document frequency of term t.
df(t) = the number of documents containing term t.
N = the total number of documents.
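Equation (1) can be computed directly. The sketch below uses a tiny illustrative corpus (not the study's data) and raw term counts for tf(t, d); libraries such as scikit-learn's TfidfVectorizer apply additional smoothing, so their values differ slightly.

```python
import math
from collections import Counter

# Tiny illustrative corpus of already-preprocessed token lists.
docs = [
    ["pindah", "ibu", "kota", "negara"],
    ["ibu", "kota", "baru", "nusantara"],
    ["tolak", "pindah", "ibu", "kota"],
]

def tfidf(term, doc, docs):
    """Equation (1): w(t, d) = tf(t, d) * log(N / df(t))."""
    n = len(docs)
    tf = Counter(doc)[term]                      # tf(t, d): raw count in the document
    df = sum(1 for d in docs if term in d)       # df(t): documents containing the term
    return tf * math.log(n / df) if df else 0.0  # idf(t) = log(N / df(t))

# "ibu" appears in every document, so its idf (and therefore its weight) is zero.
print(tfidf("ibu", docs[0], docs))
print(tfidf("pindah", docs[0], docs))
```

Terms that occur in every document receive zero weight, which is exactly the "importance" behavior the prose above describes.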
2.5 Chi-Square Feature Selection
Following the stage of feature extraction, the chi-squared approach is used for feature selection. Prior to the classification phase, feature selection is the process of choosing words in order to eliminate unnecessary features.
Numerous earlier research have demonstrated the effectiveness of using Chi-Square feature selection [10].
Chi-Square is a component of the selection process that measures the degree of dependence or relationship between a term (t) and a class [15]. The term values are then sorted starting from the highest value. The Chi-Square feature selection formula is as follows [11]:
X2(t, c) = N(A×D − B×C)^2 / ((A+B) × (C+D) × (A+C) × (B+D)) (2)
Description:
X2(t, c) = the Chi-Square value of term t with respect to class c.
N = the total number of documents.
A = the number of documents in class c that contain term t.
B = the number of documents not in class c that contain term t.
C = the number of documents in class c that do not contain term t.
D = the number of documents not in class c that do not contain term t.
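Equation (2) can be sketched for a single term-class pair as follows. The counts below are illustrative, not the study's data; A, B, C, D follow the contingency-table description above, with N = A + B + C + D.

```python
def chi_square(a, b, c, d):
    """Equation (2): X^2(t, c) = N(AD - BC)^2 / ((A+B)(C+D)(A+C)(B+D))."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# A term occurring in 40 of 50 positive documents but only 10 of 50
# negative documents is strongly class-dependent and scores high:
print(chi_square(40, 10, 10, 40))

# A term distributed evenly across classes carries no signal and scores zero:
print(chi_square(25, 25, 25, 25))
```

Terms are then ranked by this score, and only the top-scoring features are kept for classification.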
Knowing a word's single Chi-Square value is essential when utilizing Chi-Square. The formula for the single Chi-Square value of each word is as follows:

X2(t) = Σ (c = 1 to k) X2(t, c) (3)
Description:
X2(t) = the single Chi-Square value of word t.
t = word.
c = class.
k = the number of classes.

2.6 Split Data
The data used in this investigation were separated into training and test sets. Eighty percent of the data is allotted to the training set, with the remaining twenty percent going toward the test set. The following table displays the outcomes of the data separation:
Table 2. Total Data Train and Data Test

Split Data      Data Train  Data Test
Total Data 4713 1179
Positive label 3007 743
Negative label 1706 436
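The 80:20 split in Table 2 can be reproduced with a simple shuffled index split. This is a sketch on index numbers only; the seed value is an arbitrary choice for reproducibility, not taken from the study.

```python
import random

def split_indices(n, train_ratio=0.8, seed=42):
    """Shuffle indices 0..n-1 and split them train_ratio : (1 - train_ratio)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)    # seeded shuffle for reproducibility
    n_train = int(n * train_ratio)      # floor(0.8 * n) training samples
    return idx[:n_train], idx[n_train:]

# 5,892 labelled tweets -> 4,713 training and 1,179 test samples, as in Table 2.
train_idx, test_idx = split_indices(5892)
print(len(train_idx), len(test_idx))  # 4713 1179
```

Each index lands in exactly one of the two sets, so no tweet is used for both training and testing.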
2.7 K-Fold Cross Validation
Cross validation is one of the methods in validating the best model. This technique will test the effectiveness of the model formed by resampling the data to divide it into 2 parts, namely training data and testing data. The training data will be used to train the model so that the model can understand the patterns in the data and to validate the model training, the testing data will be used as the test[12].
One method of cross validation that is often used is k-fold cross validation, because it generally produces an unbiased model. This is because every observation in the data has the opportunity to become training data or testing data; in other words, we have k subsets of data with which to train and evaluate the performance of the model. Initially, this method divides the data into k equal parts (folds). The value of k is left to the researcher, but it is recommended that it be neither too large nor too small. A value of k that is too large reduces bias but can make the variance large, which can lead to overfitting. A value of k that is too small produces a model similar to the usual hold-out method that only divides the data into train and test sets (which can introduce bias). Commonly used values are k = 5 or k = 10 [13].
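The fold construction described above can be sketched as follows: each observation appears in the test fold exactly once across the k iterations. This mirrors what scikit-learn's KFold does; the implementation here is a self-contained illustration.

```python
def k_fold(n, k=5):
    """Split indices 0..n-1 into k folds; yield (train, test) index lists."""
    # Distribute any remainder across the first n % k folds.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test = list(range(start, start + size))            # this fold is the test set
        train = list(range(0, start)) + list(range(start + size, n))  # the rest train
        folds.append((train, test))
        start += size
    return folds

folds = k_fold(10, k=5)
print([test for _, test in folds])  # [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]
```

Table 4 below reports the model's scores for each of the five folds produced this way.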
2.8 Fine Tuning IndoBERT
The model known as IndoBERT, first presented in reference [13], was trained using BERT [14]. Approximately 4 billion words of Indonesian text from a variety of sources, including online news, social media, Wikipedia, online articles, and video subtitles, collected in a dataset called Indo4B, were used to train the model [19].
BERT, or Bidirectional Encoder Representations from Transformers, is a transformer-based natural language processing technique that was first created by Jacob Devlin and his colleagues at Google and published in 2018. This is discussed in paper [16]. BERT handles bidirectional representation of unlabeled text in such a way that the left and right context of all layers is integrated into a single representation. Numerous problems can be solved by making minor adjustments to the existing BERT model, and BERT is excellent in terms of observability and simplicity. Across 11 natural language processing tasks, BERT achieves a GLUE score of 80.5%, MultiNLI accuracy of 86.7%, a SQuAD v1.1 F1 of 93.2, and a SQuAD v2.0 F1 of 83.1. Figure 3 shows a summary of the single sentence classification task.
Figure 3. Pre-training & Fine-Tuning Model BERT[15]
As seen in Figure 3, BERT employs two unsupervised tasks in the pre-training procedure. The first is known as Masked LM (MLM), in which the model attempts to predict a [MASK] word by using the other words in the surrounding context. To train the model, a random percentage of input tokens is replaced with [MASK], and those tokens are then predicted; in [16], 15% of WordPiece tokens were masked at random. A disadvantage of this scheme is that, because the [MASK] token does not appear during fine-tuning, it can lead to a mismatch between pre-training and fine-tuning. One way around this is to not always use [MASK] tokens in place of masked words: [MASK] tokens replace 80% of the selected tokens, random words replace 10%, and the remaining 10% are left unchanged [16].
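The 80/10/10 corruption scheme above can be sketched in a few lines. The vocabulary and token sequence below are illustrative stand-ins; real BERT pre-training applies this to WordPiece tokens over a full vocabulary.

```python
import random

# Illustrative stand-in vocabulary for the "random token" branch.
VOCAB = ["ibu", "kota", "negara", "pindah", "baru"]

def mask_tokens(tokens, rng, mask_prob=0.15):
    """Select ~15% of positions; 80% -> [MASK], 10% -> random token, 10% -> unchanged."""
    out, labels = list(tokens), [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:          # select ~15% of positions
            labels[i] = tok                   # the model must predict the original token
            r = rng.random()
            if r < 0.8:
                out[i] = "[MASK]"             # 80%: replace with [MASK]
            elif r < 0.9:
                out[i] = rng.choice(VOCAB)    # 10%: replace with a random token
            # else: 10% keep the original token unchanged
    return out, labels

rng = random.Random(0)
tokens = ["pindah", "ibu", "kota", "negara", "baru"] * 20  # 100 tokens
masked, labels = mask_tokens(tokens, rng)
print(sum(l is not None for l in labels), "tokens selected for prediction")
```

Because 10% of selected tokens are left unchanged, the model cannot rely on [MASK] alone as a signal, which reduces the pre-training/fine-tuning mismatch described above.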
The second task, Next Sentence Prediction (NSP), gives the model a pair of sentences as input and trains it to predict whether the second sentence is the next sentence in the actual document. In the training process [17], 50% of the inputs are pairs in which the second sentence follows the first in the original document, while for the remaining 50% the second sentence is chosen at random from the corpus; the random sentence is assumed to have no relation to the preceding sentence [16]. An illustration of the input representation used by the BERT model is shown in Figure 4.
Figure 4. BERT Input Representation[18]
The purpose of the research presented in paper [16] is to demonstrate the performance that can be obtained when employing the IndoBERT model on a variety of tasks. IndoBERT is the Indonesian adaptation of the BERT algorithm, which was first developed in 2018 by Google AI Language researchers and is used by Google to predict the next sentence in the search column. The IndoBERT approach yields an F1-score of 84.13 on the Sentiment Analysis task, higher than the results of applying other techniques to the same dataset, such as MBERT, MalayBERT, BiLSTM w/ fastText, Logistic Regression, and Naïve Bayes.
2.9 Evaluation
System evaluation is performed using a multiclass confusion matrix and calculating the accuracy, precision, recall, and F1-score of each class. Evaluation metrics are employed to quantify categorization outcomes and ascertain the degree of efficacy of the developed model. The performance and caliber of the constructed model are also evaluated using evaluation metrics [19].
Accuracy and the kappa statistic are two metrics that can be used to assess a classification model. Accuracy is the proportion of samples that are correctly classified with respect to all data (Han, Kamber, & Pei, 2012). It is the most frequently used metric when all classes are of similar importance, and it is suitable when correctly identifying positive and negative cases matters equally. Accuracy is defined in the equation below, where TP stands for true positive, TN for true negative, FP for false positive, and FN for false negative. The impact of time constraints during the training phase is also considered [20]. The metrics can be compared with one another:
1. Accuracy is the sum of true positive and true negative values divided by the overall data. The accuracy formula is [15]:

Accuracy = (TP + TN) / (TP + TN + FN + FP) (4)

2. Precision is the true positive value divided by all positive predictions. The precision formula is [15]:

Precision = TP / (TP + FP) (5)

3. Recall is the true positive value divided by all actual positives. The recall formula is [15]:

Recall = TP / (TP + FN) (6)

4. F1-Score summarizes model performance as the harmonic mean of precision and recall. The F1-Score formula is [15]:

F1-Score = 2 × (precision × recall) / (precision + recall) (7)
In this study, to evaluate the performance of the classification process, accuracy, precision, recall, and F1-score were calculated from the confusion matrix shown in Table 3:
Table 3. Confusion Matrix

Actual      Predicted Positive    Predicted Negative
Positive    True Positive (TP)    False Negative (FN)
Negative    False Positive (FP)   True Negative (TN)
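Equations (4) through (7) can be computed directly from the four confusion-matrix cells. The counts below are illustrative examples, not the study's actual results.

```python
def evaluate(tp, fn, fp, tn):
    """Compute Equations (4)-(7) from the confusion matrix cells of Table 3."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)          # Eq. (4)
    precision = tp / (tp + fp)                          # Eq. (5)
    recall = tp / (tp + fn)                             # Eq. (6)
    f1 = 2 * precision * recall / (precision + recall)  # Eq. (7)
    return accuracy, precision, recall, f1

# Illustrative confusion matrix: 90 TP, 10 FN, 15 FP, 85 TN.
acc, prec, rec, f1 = evaluate(tp=90, fn=10, fp=15, tn=85)
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f} f1={f1:.2f}")
```

Note that precision and recall can diverge (here FP > FN pushes precision below recall), which is why the F1-score is reported alongside accuracy.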
3. RESULTS AND DISCUSSION
3.1 National Capital Sentiment
In this research, a sentiment analysis dataset related to the relocation of the national capital was used. The data obtained from Twitter (X) amounted to 5,892 tweets, separated into train and test sets using an 80:20 ratio: 4,713 training samples and 1,179 test samples.
Figure 5. Review Category Proportions

3.2 Chi-Square Feature Selection
The process begins by applying Chi-Square feature selection to reduce dimensionality: of the initial 6,307 features, only 967 were selected. The data was then divided into a training set and a testing set using this final feature set. Chi-Square feature selection affects the employed technique as follows:
1. Dimensionality reduction: cutting the feature set from 6,307 to 967 features lessens overfitting, simplifies the model, and reduces computation.
2. Improved model performance: selecting the most informative features lets the model concentrate on the information that matters most for making decisions.
Using this reduced feature set during training and testing therefore improves model performance and allows the model to focus on the most relevant information.
3.3 IndoBERT Modelling
The performance results of the IndoBERT model were evaluated in two scenarios: with the use of Cross Validation (CV) and without it. This provides a deeper understanding of how well the model can classify the test data.
Table 4. IndoBERT Overall Performance with Cross Validation

Fold  Accuracy  Precision  Recall  F1-Score
1 85% 86% 91% 88%
2 83% 85% 90% 87%
3 82% 85% 87% 86%
4 72% 78% 91% 84%
5 82% 82% 91% 86%
Table 4 shows the results of the model evaluation using CV; these results show the performance of the model on each fold of the data division. Meanwhile, Table 5 reflects the model performance evaluation without CV, which measures the overall model performance without dividing the data into folds.
Table 5. IndoBERT Overall Performance without Cross Validation

Measures      Performance Percentage
Accuracy      94%
Precision     85%
Recall        91%
F1-Score      88%
Macro avg     94%
Weighted avg  94%
The results of the model performance evaluation are shown in Table 5. Overall, the model achieved an accuracy of 94%, which describes how well the model correctly classifies the data. A precision of 85% indicates the proportion of correct positive predictions among all positive predictions of the model. A recall of 91% indicates how well the model identifies the correct instances among all actual cases. The F1-score of 88% strikes a balance between precision and recall, illustrating the overall performance of the model. A macro average of 94% and a weighted average of 94%, which summarize performance across classes, confirm the consistency of model performance across the dataset. In general, the IndoBERT model shows good performance with a high degree of accuracy.
Figure 6. Accuracy of validation and training for the optimal fine-tuning result
The relationship between epoch accuracy and validation-epoch accuracy is depicted in the graph: epoch accuracy is the model's accuracy on the current epoch's training data, while validation-epoch accuracy is the accuracy on the validation data. It is evident from this graph that epoch accuracy rises as epochs increase, demonstrating that the more data the model is trained on, the more accurately it predicts data labels. Validation-epoch accuracy also rises with epochs, although not as quickly as epoch accuracy, indicating that additional training is necessary for the model to reach maximum accuracy on the validation data. Figure 6 shows how, with more training, the model improves its prediction accuracy for the labels; further work is still required to obtain the best accuracy on the validation data.
However, there are differences in precision and recall, which may require further adjustments to improve consistency in positive prediction and the identification of actual cases in both classes. This model evaluation provides useful insights for improving the quality of the model in classifying sentiments related to the relocation of the national capital from Twitter (X) data.
4. CONCLUSION
Based on the results of the study, the researchers have built a system capable of evaluating sentiment on Twitter (X) related to the relocation of the national capital. Overall, the IndoBERT model performed well with a high level of accuracy, both with and without Cross Validation (CV). Evaluation without CV produced a model with a high accuracy of about 94% and showed good performance consistency across the dataset. The use of the IndoBERT method with Chi-Square feature selection reduces the number of features generated. The classification model using IndoBERT with Chi-Square feature selection shows good performance with an accuracy rate of about 94%; precision, recall, and F1-score also showed good results for both positive and negative classes. In conclusion, this study shows that applying the IndoBERT method with Chi-Square feature selection to sentiment analysis related to capital city relocation on Twitter can provide satisfactory results in identifying positive and negative public sentiment regarding the plan.
Future research is expected to compare the performance of the IndoBERT method with other methods in order to ascertain the advantages and disadvantages of each strategy in the context of sentiment analysis.
REFERENCES
[1] F. F. Noorikhsan, H. Ramdhani, B. C. Sirait, and N. Khoerunisa, "Dinamika Internet, Media Sosial, dan Politik di Era Kontemporer: Tinjauan Relasi Negara-Masyarakat," Journal of Political Issues, vol. 5, no. 1, pp. 95–109, Jul 2023, doi: 10.33019/jpi.v5i1.131.
[2] Nurhayati, A. E. Putra, L. K. Wardhani, and Busman, "Chi-Square Feature Selection Effect On Naive Bayes Classifier Algorithm Performance For Sentiment Analysis Document," in 2019 7th International Conference on Cyber and IT Service Management (CITSM), IEEE, Nov 2019, pp. 1–7, doi: 10.1109/CITSM47753.2019.8965332.
[3] A. H. Dyo fatra, N. H. Hayatin, and C. S. K. Aditya, "Analisa Sentimen Tweet Berbahasa Indonesia Dengan Menggunakan Metode Lexicon Pada Topik Perpindahan Ibu Kota Indonesia," Jurnal Repositor, vol. 2, no. 7, p. 977, Jul 2020, doi: 10.22219/repositor.v2i7.937.
[4] H. Jayadianti, W. Kaswidjanti, A. T. Utomo, S. Saifullah, F. A. Dwiyanto, and R. Drezewski, "Sentiment analysis of Indonesian reviews using fine-tuning IndoBERT and R-CNN," ILKOM Jurnal Ilmiah, vol. 14, no. 3, pp. 348–354, Dec 2022, doi: 10.33096/ilkom.v14i3.1505.348-354.
[5] H. K. Putra, M. A. Bijaksana, and A. Romadhony, "Deteksi Penggunaan Kalimat Abusive Pada Teks Bahasa Indonesia Menggunakan Metode IndoBERT," Jurnal Tugas Akhir Fakultas Informatika, vol. 8, no. 2, e-Proceeding of Engineering, ISSN: 2355-9365, 2021.
[6] F. Koto, A. Rahimi, J. H. Lau, and T. Baldwin, "IndoLEM and IndoBERT: A Benchmark Dataset and Pre-trained Language Model for Indonesian NLP," in Proceedings of the 28th International Conference on Computational Linguistics, Stroudsburg, PA, USA: International Committee on Computational Linguistics, 2020, pp. 757–770, doi: 10.18653/v1/2020.coling-main.66.
[7] K. Sailunaz and R. Alhajj, "Emotion and sentiment analysis from Twitter text," J Comput Sci, vol. 36, p. 101003, Sep 2019, doi: 10.1016/j.jocs.2019.05.009.
[8] N. I. P. Munggaran and E. B. Setiawan, "DISC Personality Prediction with K-Nearest Neighbors Algorithm (KNN) Using TF-IDF and TF-Chi Square Weighting," e-Proceedings Eng., vol. 6, no. 2, pp. 9446–9457, 2019.
[9] A. Topbas, A. Jamil, A. A. Hameed, S. M. Ali, S. Bazai, and S. A. Shah, "Sentiment Analysis for COVID-19 Tweets Using Recurrent Neural Network (RNN) and Bidirectional Encoder Representations (BERT) Models," in 2021 International Conference on Computing, Electronic and Electrical Engineering (ICE Cube), IEEE, Oct 2021, pp. 1–6, doi: 10.1109/ICECube53880.2021.9628315.
[10] T. Ernayanti, M. Mustafid, A. Rusgiyono, and A. R. Hakim, "Penggunaan Seleksi Fitur Chi-Square dan Algoritma Multinomial Naïve Bayes untuk Analisis Sentimen Pelanggan Tokopedia," Jurnal Gaussian, vol. 11, no. 4, pp. 562–571, Feb 2023, doi: 10.14710/j.gauss.11.4.562-571.
[11] F. Taufiqurrahman, S. Al Faraby, and M. D. Purbolaksono, "Multi-Label Text Classification on Hadith Translation Indonesian Using Chi Square and SVM," e-Proceedings Eng., vol. 8, no. 5, pp. 10650–10659, 2021.
[12] Y. Widyaningsih, G. P. Arum, and K. Prawira, "Aplikasi K-Fold Cross Validation dalam Penentuan Model Regresi Binomial Negatif Terbaik," BAREKENG: Jurnal Ilmu Matematika dan Terapan, vol. 15, no. 2, pp. 315–322, Jun 2021, doi: 10.30598/barekengvol15iss2pp315-322.
[13] K. Sailunaz and R. Alhajj, "Emotion and sentiment analysis from Twitter text," J Comput Sci, vol. 36, p. 101003, Sep 2019, doi: 10.1016/j.jocs.2019.05.009.
[14] R. Man and K. Lin, "Sentiment Analysis Algorithm Based on BERT and Convolutional Neural Network," in Proceedings of the 2021 IEEE Asia-Pacific Conference on Image Processing, Electronics and Computers (IPEC), Guilin, China: IEEE, 2021.
[15] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding," Oct 2018, doi: 10.48550/arXiv.1810.04805.
[16] Y. F. Saifullah and A. S. Aribowo, "Comparison of Machine Learning for Sentiment Analysis in Detecting Anxiety Based on Social Media Data," Jan 2021, [Online]. Available: http://arxiv.org/abs/2101.06353.
[17] M. I. Amal, E. S. Rahmasita, E. Suryaputra, and N. A. Rakhmawati, "Analisis Klasifikasi Sentimen Terhadap Isu Kebocoran Data Kartu Identitas Ponsel di Twitter," Jurnal Teknik Informatika dan Sistem Informasi, vol. 8, no. 3, Dec 2022, doi: 10.28932/jutisi.v8i3.5483.
[18] Y.-M. Kim and T.-H. Lee, "Korean clinical entity recognition from diagnosis text using BERT," BMC Med Inform Decis Mak, vol. 20, no. S7, p. 242, Sep 2020, doi: 10.1186/s12911-020-01241-8.
[19] I. R. Hidayat and W. Maharani, "General Depression Detection Analysis Using IndoBERT Method," International Journal on Information and Communication Technology (IJoICT), vol. 8, no. 1, pp. 41–51, Aug 2022, doi: 10.21108/ijoict.v8i1.634.
[20] S. Hendrian, "Algoritma Klasifikasi Data Mining Untuk Memprediksi Siswa Dalam Memperoleh Bantuan Dana Pendidikan," Faktor Exacta, vol. 11, no. 3, Oct 2018, doi: 10.30998/faktorexacta.v11i3.2777.