Twitter in analysis of policy sentiments of the omnibus law work creative design


RESEARCH ARTICLE | MAY 09 2023

Windu Gata; Surohman Surohman; Hendri Mahmud Nawawi

AIP Conference Proceedings 2714, 020011 (2023) https://doi.org/10.1063/5.0128546


Twitter in Analysis of Policy Sentiments of the Omnibus Law Work Creative Design

Windu Gata,^a) Surohman Surohman,^b) and Hendri Mahmud Nawawi^c)

Computer Science, Nusa Mandiri University, Jakarta, Indonesia

a)Electronic mail: [email protected]

b)Electronic mail: [email protected]

c)Corresponding author: [email protected]

Abstract. Social media platforms such as Instagram, Twitter, Facebook, Path, and Line have become an inseparable part of everyday life. This makes social media a source of data that can be used to gauge public opinion almost instantly. Phenomena that become trending topics are interesting subjects for analysis; this study examines public opinion on the job creation omnibus law. Comments were collected and labeled as a dataset to assess whether sentiment toward the omnibus law is positive or negative. The data were processed with RapidMiner to compare the naïve Bayes, support vector machine (SVM), and k-nearest neighbor algorithms. This study proposes the use of sample bootstrapping, which increased both AUC and accuracy for all three algorithms. The highest accuracy was produced by the SVM model with sample bootstrapping, with an AUC of 0.948 and an accuracy of 85.88%.

INTRODUCTION

The Job Creation Bill (RUU Cipta Kerja) quickly became a topic of public conversation after it was published a few months ago. This is not surprising, because the bill was drafted using the omnibus law method, which is still unfamiliar to the Indonesian public even though it has long been known in legal science and is nothing new to legal academics. The main problem is the public's limited understanding of the omnibus law concept that the Government of the Republic of Indonesia offers through the Job Creation Bill.

Sentiment analysis, or opinion mining, is the process of automatically understanding, extracting, and processing data to obtain the sentiment information contained in opinion sentences. It is carried out to determine whether a person's opinion on a particular issue or object tends to be negative or positive. A real-world example is identifying market trends and views on an object [1].

Social media such as Instagram, Twitter, Facebook, Path, and Line is now inseparable from daily life. Many people have two to five social media accounts on their smartphones. This makes social media a source of data that can be used to gather public opinion immediately [2].

Opinion mining can be thought of as a combination of text retrieval and natural language processing. Support vector machine (SVM) is one of the text mining methods that can be used to solve opinion mining problems [3].

SVM can be a good method for text classification. It can be used to classify tweets or comments into groups, for example positive and negative opinions. Part-of-speech (POS) tagging can also be used in opinion mining to mark sentences as positive or negative from a natural language perspective [4].

Sentiment analysis of Indonesian-language news classifies the content of news articles on news websites as negative or positive, which yields new knowledge about that content. This is achieved through document classification using text mining [5].

In this study, the proposed models are Naive Bayes (NB), K-Nearest Neighbor (KNN), and Support Vector Machine (SVM), with sample bootstrapping added as a parameter. Selecting important features improves classification performance [6]. The goal of applying feature selection is to improve classification prediction performance [7].

This study proposes a method for extracting tweets with the hashtag #omnibuslaw from Twitter and classifying them using the proposed NB, KNN, and SVM models. A total of 1,133 tweets, collected from April 30, 2020 to May 16, 2020, were used to assess sentiment on the job creation omnibus law.


RELATED WORK

Support Vector Machine was used in a study on sentiment analysis of public opinion about the KPK's sting operations (Operasi Tangkap Tangan), which applied the Support Vector Machine and Naive Bayes algorithms based on Particle Swarm Optimization. The SVM model obtained a result of 80.77%, which is better than the other model proposed in that study, NB, which obtained an accuracy of 76.92% [8].

I. Rozi et al. implemented opinion mining (sentiment analysis) to extract public opinion data about universities [3]. Using the results of the POS tagging process, a rule is applied to determine whether a document is an opinion or not. The opinions are then separated into positive and negative using the Naïve Bayes algorithm, with a precision reaching 0.99.

A study on sentiment analysis of the odd-even policy on the Bekasi toll road, using the Naive Bayes algorithm with Information Gain optimization, performed text mining on comments related to posts about the effectiveness of the odd-even policy on Twitter, Instagram, YouTube, and Facebook. Forward modeling was used, and sentiment analysis was carried out on tweets, comments, and posts submitted by the public. Using the Naive Bayes model, the confusion matrix yielded an accuracy of 79.55%, a precision of 80.37%, and a sensitivity (recall) of 80.51% [9].

METHODOLOGY

Research is generally described as an active, diligent, and systematic investigative process that aims to find information on a particular topic. This research uses tweet data to perform sentiment analysis of the job creation omnibus law using the NB, KNN, and SVM algorithms.

The research method is text mining, used to classify tweets related to the job creation omnibus law.

This experimental research follows the Cross-Industry Standard Process for Data Mining (CRISP-DM), which consists of six stages, as shown in Figure 1.

FIGURE 1. Research Methodology


RESULT AND ANALYSIS

This stage selects the mining technique by determining the algorithms to be used. The tool used is RapidMiner version 9.0. Data preparation uses tweets with the hashtag #omnibuslaw collected from Twitter from April 30, 2020 to May 16, 2020. The data retrieved from Twitter consist of the columns Created-At, From-User, From-User-Id, To-User, To-User-Id, Language, Source, Text, Geo-Location-Latitude, Geo-Location-Longitude, Retweet-Count, and Id. Only the Text field is used in this study, and a new field, status, is added to serve as the class label. Model testing performs classification using the SVM, KNN, and NB algorithms.

The models were tested both without and with sample bootstrapping. The sample bootstrapping method can increase accuracy [10]. It has a sample ratio parameter, with a value between 0 and 1, that sets the fraction of the full data set to draw as the sample. With this method only part of the data is processed at a time, but the amount of available data is not reduced, because each record is returned to the data set after it is drawn (sampling with replacement) [11].
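The sampling step described above can be sketched in a few lines of Python. This is only an illustration of drawing with replacement under a sample ratio, not the RapidMiner operator used in the experiments; the function name `bootstrap_sample`, its `sample_ratio` and `seed` parameters, and the toy rows are assumptions made for this example.

```python
import random

def bootstrap_sample(rows, sample_ratio=1.0, seed=None):
    """Draw round(sample_ratio * len(rows)) rows with replacement (hypothetical helper)."""
    if not rows:
        raise ValueError("rows must not be empty")
    if not 0.0 < sample_ratio <= 1.0:
        raise ValueError("sample_ratio must be in (0, 1]")
    rng = random.Random(seed)
    n_draws = max(1, round(sample_ratio * len(rows)))
    # Every draw picks from the full data set, so a row can be selected
    # more than once and the original list is never shortened.
    return [rng.choice(rows) for _ in range(n_draws)]

# Toy labeled rows (made up for illustration, not taken from the dataset).
tweets = [("tweet a", "positive"), ("tweet b", "negative"),
          ("tweet c", "negative"), ("tweet d", "positive")]
sample = bootstrap_sample(tweets, sample_ratio=0.5, seed=42)
```

Because rows are drawn with replacement, the same tweet can occur several times in `sample`. How the resampled data are split between training and testing is not stated in the text, so the sketch leaves that choice open.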

To obtain the best accuracy, the dataset of 1,133 tweets was cleaned before modeling. One tool that provides a data cleaning process is the GATA Framework, which can be accessed at gataframework.com. In the GATA Framework, the sentiment dataset is processed using 10 preprocessing techniques: Indonesian Stop Word Removal, Indonesian Stemming, Transformation: Remove URL, Tokenization: Regexp, @Annotation Removal, Normalization: Indonesian Slang, Normalization: Indonesian Acronym, Normalization: Emoticon, Transformation: Not (Negative), and Word Count. These ten techniques can be used in whole or in part, depending on the needs.

One of the techniques in the GATA Framework is Transformation: Remove URL, which removes any link or URL contained in a tweet so that only the words of the tweet itself remain, as illustrated in Table I.

TABLE I. Cleaning technique (Remove URL)

Data before Remove URL: bilang pro-buruh. phk, ruu omnibus law hantui nasib buruh ri! https://t.co/af6rvharu7

Data after Remove URL: bilang pro-buruh. phk, ruu omnibus law hantui nasib buruh ri!
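The cleaning step in Table I, together with annotation removal and regular-expression tokenization, can be approximated as follows. This is a minimal sketch and not the GATA Framework's implementation; the regular expressions and the function names `remove_urls`, `remove_annotations`, and `tokenize` are assumptions chosen for illustration.

```python
import re

# Assumed patterns: the exact rules used by the GATA Framework are not given.
URL_PATTERN = re.compile(r"https?://\S+")
MENTION_PATTERN = re.compile(r"@\w+")
TOKEN_PATTERN = re.compile(r"[a-z0-9]+")

def remove_urls(text):
    """Delete http/https links and collapse the whitespace left behind."""
    return " ".join(URL_PATTERN.sub(" ", text).split())

def remove_annotations(text):
    """Delete @username mentions and collapse the whitespace left behind."""
    return " ".join(MENTION_PATTERN.sub(" ", text).split())

def tokenize(text):
    """Lowercase the text and keep only runs of letters and digits."""
    return TOKEN_PATTERN.findall(text.lower())

# The tweet from Table I.
before = ("bilang pro-buruh. phk, ruu omnibus law hantui nasib buruh ri! "
          "https://t.co/af6rvharu7")
after = remove_urls(before)
tokens = tokenize(after)
```

Here `after` reproduces the "after" row of Table I, and `tokens` is the list of lowercase words that a later stemming or stop word step could work on.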

RapidMiner model design with sample bootstrapping

FIGURE 2. Algorithm Model Design with Sample Bootstrapping

FIGURE 3. Apply Model and Performance

Accuracy and AUC values

Experiments were conducted on the dataset of 1,133 tweets using the Naïve Bayes, support vector machine, and k-nearest neighbor models; the results are shown in Table II.

TABLE II. Accuracy and AUC values

Models Accuracy (%) AUC

SVM 70.44 0.834

SVM + Bootstrapping 85.88 0.948

NB 72.98 0.524

NB + Bootstrapping 85.08 0.690

KNN 70.88 0.798

KNN + Bootstrapping 78.64 0.936

The sentiment analysis was conducted on a dataset of tweets about the job creation omnibus law policy with the hashtag #omnibuslaw. The results for the Naïve Bayes, K-NN, and SVM algorithms show that, with the sample bootstrapping parameter, all three models achieve a marked increase in both accuracy and AUC. The best accuracy and AUC among the three models belong to SVM with bootstrapping, with an accuracy of 85.88% and an AUC of 0.948.
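For readers who want to see how the two measures in Table II differ, the sketch below computes accuracy and AUC from scratch. It is not the RapidMiner Performance operator; the function names and the four toy examples are assumptions. AUC is computed here as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def auc(y_true, scores, positive="positive"):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))

# Toy data (made up): confidence scores for the positive class.
y_true = ["positive", "negative", "positive", "negative"]
scores = [0.9, 0.6, 0.4, 0.2]
y_pred = ["positive" if s >= 0.5 else "negative" for s in scores]

acc = accuracy(y_true, y_pred)
area = auc(y_true, scores)
```

Accuracy depends on the chosen decision threshold, while AUC depends only on how the scores rank the examples, which is one reason the two columns in Table II do not always move together.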

DEPLOYMENT

Program Design

The deployment implements a program that measures the sentiment of tweets based on the results of this research. The goal is to test whether the application's predictions match the actual labels of the data.

The design for this deployment is shown in Figure 4.

FIGURE 4. Program Deployment Structure

Interface

The input page interface is shown in Figures 5 and 6.

FIGURE 5. Dashboard Interface

Figure 5 shows the initial display after logging in to the system; the tweet data are displayed together with their classification as positive or negative sentiment. Figure 6 shows the menu for analyzing a sentence, which the system processes to determine whether its sentiment is positive or negative. The result is shown in Table III.

FIGURE 6. Analysis Interface

TABLE III. Text Analysis Result.

Original complaint sentence: HNW: Pemerintah Harus Fokus ke Penanganan Korona, Bukan Omnibus Law https://t.co/bQIoXUUpYL

Remove Annotation: hnw: pemerintah harus fokus ke penanganan korona, bukan omnibus law https://t.co/bqioxuupyl

Tokenize (Regexp): hnw pemerintah harus fokus ke penanganan korona bukan omnibus law https t co bqioxuupyl

Not (Negative): hnw pemerintah harus fokus ke penanganan korona bukan omnibuslaw https t cobqioxuupyl

Stemming: hnw perintah harus fokus ke angan korona bukan omnibus law https t co bqioxuupyl

Stopword Removal: hnw perintah fokus angan korona bukan omnibus law https t co bqioxuupyl

Word values (excerpt): perintah: negative value 0.037, positive value 0 ...

Total negative value: 0.09; total positive value: 0.003

CONCLUSION: Negative

The conclusion is drawn from the larger of the two totals: if the total positive value is greater than the total negative value, the result is positive; if the total negative value is greater, the result is negative. The example sentence above carries negative sentiment and is predicted as negative, so the prediction matches the actual label and is a true prediction.
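The decision rule just described can be sketched as below. This is an illustration only: how the deployed application derives the per-word values is not described, the function name `classify_by_totals` is hypothetical, and apart from the values for "perintah" the word values are invented for the example. The text does not say what happens when the totals are equal, so the sketch returns "Undecided" in that case as an assumption.

```python
def classify_by_totals(word_values):
    """word_values maps each word to (negative_value, positive_value)."""
    total_negative = sum(neg for neg, _ in word_values.values())
    total_positive = sum(pos for _, pos in word_values.values())
    if total_positive > total_negative:
        label = "Positive"
    elif total_negative > total_positive:
        label = "Negative"
    else:
        label = "Undecided"  # assumption: ties are not covered by the text
    return label, total_negative, total_positive

# "perintah" uses the values shown in Table III; the other entries are
# made-up placeholders, not values reported in the paper.
example = {
    "perintah": (0.037, 0.0),
    "fokus": (0.010, 0.002),
    "bukan": (0.020, 0.001),
}
label, total_negative, total_positive = classify_by_totals(example)
```

With these values the negative total exceeds the positive total, so `label` is "Negative", which mirrors the conclusion in Table III.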


CONCLUSION

Testing the three proposed models, namely Naive Bayes, K-Nearest Neighbor, and Support Vector Machine, for sentiment analysis of tweets about the job creation omnibus law with the hashtag #omnibuslaw shows that the accuracy and AUC of each algorithm are better with the sample bootstrapping parameter than without it. Naive Bayes obtains an accuracy of 85.08% with an AUC of 0.690, K-Nearest Neighbor obtains an accuracy of 78.64% with an AUC of 0.936, and Support Vector Machine (SVM) obtains the highest accuracy, 85.88%, with an AUC of 0.948. Therefore, this study proposes that the SVM algorithm with sample bootstrapping can be used as a model to analyze sentiment toward the job creation omnibus law policy.

REFERENCES

1. C. Patel, P. Budhwar, A. Witzemann, and A. Katou, “HR outsourcing: The impact on HR’s strategic role and remaining in-house HR function,” J. Bus. Res., vol. 103, no. February, pp. 397–406, 2019.

2. R. N. Chory, M. Nasrun, and C. Setianingsih, “Sentiment analysis on user satisfaction level of mobile data services using Support Vector Machine (SVM) algorithm,” Proc. - 2018 IEEE Int. Conf. Internet Things Intell. Syst. IOTAIS 2018, pp. 194–200, 2019.

3. I. Rozi, S. Pramono, and E. Dahlan, “Implementasi Opinion Mining (Analisis Sentimen) Untuk Ekstraksi Data Opini Publik Pada Perguruan Tinggi,” J. EECCIS, vol. 6, no. 1, pp. 37–43, 2012.

4. F. Ramadhanti, Y. Wibisono, and R. A. Sukamto, “Analisis Morfologi untuk Menangani Out-of-Vocabulary Words pada Part-of-Speech Tagger Bahasa Indonesia Menggunakan Hidden Markov Model,” J. Linguist. Komputasional, vol. 2, no. 1, p. 6, 2019.

5. N. K. Wardhani et al., “Sentiment analysis article news coordinator minister of maritime affairs using algorithm naive bayes and support vector machine with particle swarm optimization,” J. Theor. Appl. Inf. Technol., 2018.

6. J. Wu, J. Xin, and N. Zheng, “SVM learning from imbalanced microanuerysm candidate datasets used feature selection by gini index,” in 2015 IEEE International Conference on Information and Automation, ICIA 2015 - In conjunction with 2015 IEEE International Conference on Automation and Logistics, 2015.

7. T. Agus, S. M. Adib, and A. Karomi, “Penerapan Metode Sample Bootstrapping untuk Meningkatkan Performa k-Nearest Neighbor pada Dataset Berdimensi Tinggi,” IC-Tech, vol. XII, no. 1, April, pp. 9–14, 2017.

8. Hernawati and W. Gata, “Sentimen Analisis Operasi Tangkap Tangan KPK Menurut Masyarakat Menggunakan Algoritma Support Vector Machine, Naive Bayes Berbasis Particle Swarm Optimizition,” vol. 12, no. 3, pp. 230–243, 2019.

9. H. S. Utama, D. Rosiyadi, D. Aridarma, and B. S. Prakoso, “Sentimen Analisis Kebijakan Ganjil Genap Di Tol Bekasi Menggunakan Algoritma Naive Bayes Dengan Optimalisasi Information Gain,” J. Pilar Nusa Mandiri, vol. 15, no. 2, pp. 247–254, 2019.

10. Y. N. Dewi and F. A. Sariasih, “Metode Sample Bootstrapping Untuk Meningkatkan Performa Algoritma Naive Bayes Pada Citra Tunggal Pap Smear,” J. Tek. Inform., vol. 12, no. 1, pp. 1–10, 2019.

11. T. A. Setiawan, R. Satria, and A. Syukur, “Integrasi Metode Sample Bootstrapping dan Weighted Principal Component Analysis untuk Meningkatkan Performa k Nearest Neighbor pada Dataset Besar,” vol. 1, no. 2, pp. 76–81, 2015.