Comparison of Word2Vec with GloVe in Multi-Aspect Sentiment Analysis Classification of Nvidia RTX Products with Naïve Bayes Classifier


Wira Abner Sigalingging, Sri Suryani Prasetyowati, Yuliant Sibaroni*

School of Computing, Informatics, Telkom University, Bandung, Indonesia

Email: ¹wirasigalingging@student.telkomuniversity.ac.id, ²srisuryani@telkomuniversity.ac.id, ³,*yuliant@telkomuniversity.ac.id

Correspondence Author Email: yuliant@telkomuniversity.ac.id

Submitted 16-01-2023; Accepted 08-02-2023; Published 17-02-2023

Abstract

The increasing number of gamers has increased the demand for Graphics Processing Unit (GPU) products, one example of which is the Nvidia RTX series. Many users post their reviews on Twitter in the form of tweets, which can be analyzed to assess the quality of a product. However, most tweets discuss the product as a whole and ignore its individual aspects, making it difficult for both users and companies to pinpoint which aspects need attention. In this research, multi-aspect sentiment analysis is carried out on tweets about Nvidia RTX products. Classification uses the Naive Bayes Classifier, and feature extraction with Word2Vec is compared against feature extraction with GloVe. Performance is measured using a confusion matrix to produce values for accuracy, precision, recall, and f1-score. The highest accuracy obtained was 60.71% on the price aspect, using GloVe feature extraction and classification with Gaussian Naive Bayes.

Keywords: Naive Bayes Classifier; Word2Vec; GloVe; Confusion Matrix; Multi-Aspect Sentiment Analysis

1. INTRODUCTION

The Graphics Processing Unit (GPU) is the computer component tasked with handling graphics workloads. In general, the GPU is similar to the Central Processing Unit (CPU); however, the GPU has more cores and a specific role in assisting the CPU with graphics data, such as rendering graphics, video, and animation. As the GPU has evolved, its use has expanded and it can now handle a wide range of applications, including some that are currently under development. As a result, it is now an essential component for video editors, animators, gamers, and others in computer graphics-related jobs. Nvidia is one of the biggest vendors, producing many GPUs to match user needs. Its best-selling series at the moment is the Nvidia RTX series [1], whose sales have been boosted by Work from Home (WFH) and other remote work arrangements. Nvidia also uses social media to develop its business: it uses Twitter as a marketing medium and to interact with customers. Customers can share comments or opinions about their experiences with purchased products via tweets.

By reading these tweets, customers can identify the Nvidia GPU series that suits their needs by weighing aspects such as price, performance, and availability of the GPU. Companies can analyze tweets to gauge customer satisfaction with their products; this process is called sentiment analysis.

Much research on sentiment analysis has been carried out, especially using the Naive Bayes Classifier. For example, Bayhaqy et al. [2] in 2018 discussed how to identify tweets with positive and negative polarity. The researchers used datasets about the e-commerce platforms Tokopedia and Bukalapak and compared 3 classifiers, namely Naive Bayes, Decision Tree, and K-Nearest Neighbor (K-NN). The highest accuracy was obtained with the Decision Tree method at 80%, followed by K-NN at 78% and Naive Bayes at 77%. Sentiment analysis is also widely used to analyze products, as in the work of Haque et al. [3], which focuses on mobile products, electronics, and musical instruments available on Amazon.

Previous research on Twitter sentiment analysis by Song et al. [4] in 2017 proposed a novel Naive Bayes-based classification approach and reported high accuracy. Alves et al. [5] also studied tweets, using the 2013 FIFA Confederations Cup as a case study and comparing the Naive Bayes and Support Vector Machine (SVM) methods. In addition, Guia et al. [6] in 2019 compared various sentiment analysis methods, namely Naive Bayes, SVM, Decision Tree, and Random Forest.

Research conducted by Novendri et al. [7] in 2020 discussed sentiment analysis with 4 using a dataset in the form of YouTube comments and obtained fairly high accuracy results. Multi-aspect sentiment analysis allows users and companies to determine the category aspects that need attention from a product. As in the study conducted by Ananda et al. [8] in 2021, researchers use 5 category aspects of hotel reviews to help tourists determine which hotel is best for them.

The studies in [9], [10] compare word embeddings, including Word2Vec and GloVe, for sentiment analysis. Research conducted by Shi and Liu [11] discusses the relationship between Word2Vec and GloVe in text classification and the characteristics of each method.

In this study, classification uses the Naive Bayes Classifier because it provides high-accuracy results in sentiment analysis research [5], [6]. The main contribution of this study is a comparison of 2 feature extractions, Word2Vec and GloVe, for multi-aspect sentiment analysis, which has not been done in previous studies. The dataset is divided not only by polarity but also into 3 aspects: performance, price, and availability. In addition, several Naive Bayes Classifier variants are evaluated using a confusion matrix to see which gives the best results with each feature extraction.


2. RESEARCH METHODOLOGY

2.1 Research Stages

This section describes the system being built and its overall workflow from start to finish. The flow of the model is illustrated in Figure 1:

Figure 1. System Overview

a. Data Crawling

Data crawling is the first stage that is carried out by collecting data in the form of tweets from Twitter using the snscrape library.

b. Data Labeling

Data labeling is carried out after the data is collected. In this stage, the collected data is labeled based on polarity and the determined aspect category.

c. Preprocessing

Preprocessing is important because it affects the final result of the system. Preprocessing has several sub-stages, namely:

1. Data Cleaning

Data cleaning aims to remove characters that are not needed in the next process. Some characters removed are numbers, URLs, punctuation, and hashtags.

2. Case Folding

This step changes all capital letters in the text to lowercase.

3. Stopword removal

At this stage, words deemed irrelevant are deleted because they carry little meaning in a document. The stopword list used is the built-in list from the Natural Language Toolkit (NLTK).

4. Stemming

At this stage, each word in the document is reduced to its base form. In this study, the stemming stage used the stemmer provided by the Natural Language Toolkit (NLTK).
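The four sub-stages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's actual code: the small `DEMO_STOPWORDS` set is a hypothetical stand-in for the NLTK stopword list, punctuation is replaced with a space (the study's exact rule is not given), and stemming is passed in as an optional function so the sketch runs without NLTK installed.

```python
import re
import string

URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")
NUMBER_RE = re.compile(r"\d+")
# Assumption: every punctuation character is replaced by a space.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))

# Hypothetical stand-in for the NLTK English stopword list.
DEMO_STOPWORDS = {"a", "an", "the", "is", "of", "has", "been"}


def clean(text):
    """Data cleaning: drop URLs, hashtags, numbers, and punctuation."""
    for pattern in (URL_RE, HASHTAG_RE, NUMBER_RE):
        text = pattern.sub(" ", text)
    return " ".join(text.translate(PUNCT_TABLE).split())


def preprocess(text, stopwords=DEMO_STOPWORDS, stem=None):
    """Cleaning, case folding, stopword removal, then optional stemming."""
    tokens = clean(text).lower().split()
    tokens = [t for t in tokens if t not in stopwords]
    if stem is not None:
        tokens = [stem(t) for t in tokens]
    return tokens


print(preprocess("The RTX 4090 is FAST! #nvidiartx https://t.co/abc"))
```

With NLTK available, `nltk.stem.PorterStemmer().stem` can be passed as the `stem` argument, and the NLTK stopword list can replace `DEMO_STOPWORDS`.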

2.2 Feature Extraction

Feature extraction is a machine-learning technique that reduces data dimensions [12]. At this stage, a comparison of feature extraction is carried out using Word2Vec and using GloVe.

a. Word2Vec

Word2Vec is a neural-network-based word embedding developed by Google [5]. Word2Vec represents words as vectors and places the vectors of similar words close together. Converting words into vectors makes it easier for the machine to capture word meaning, which makes Word2Vec one of the feature extraction methods that gives high-accuracy results [13]. There are two architectures in Word2Vec: continuous bag-of-words (CBOW), which predicts the current word from the surrounding words in the sentence, and continuous skip-gram, which predicts the words before and after the current word. Skip-gram works efficiently with large word vectors and unstructured text. In research [14], the skip-gram model showed higher accuracy. The skip-gram model maximizes the following average log probability:

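$$\frac{1}{T}\sum_{t=1}^{T}\ \sum_{-c \le j \le c,\ j \ne 0} \log p(w_{t+j} \mid w_t) \qquad (1)$$

where $w_t$ is the center word, $w_{t+j}$ is a context word at offset $j$ from the center, $T$ is the number of words in the training sequence, and $c$ is the size of the training context.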
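To make the inner sum of Equation (1) concrete, the sketch below lists the (center, context) pairs that skip-gram trains on for a given context size. The function name `skipgram_pairs` is illustrative, and positions that fall outside the sentence are simply skipped, which is an assumption about boundary handling.

```python
def skipgram_pairs(tokens, c):
    """Return (center, context) pairs for every offset j with -c <= j <= c, j != 0."""
    pairs = []
    for t, center in enumerate(tokens):
        for j in range(-c, c + 1):
            # Skip the center itself and offsets outside the sentence.
            if j != 0 and 0 <= t + j < len(tokens):
                pairs.append((center, tokens[t + j]))
    return pairs


print(skipgram_pairs(["nvidia", "rtx", "gpu"], c=1))
```

Each pair corresponds to one term $\log p(w_{t+j} \mid w_t)$ in the objective.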

b. GloVe

GloVe (Global Vectors) is a word embedding method that uses a co-occurrence matrix to capture semantic relationships between words in the corpus. The matrix is built by counting how often pairs of words appear together in the corpus.

GloVe works well when trained on large amounts of data. In this research, the data used is a Twitter corpus with the hashtag #nvidiartx. The vectors GloVe produces can be used to list similar words; for example, Table 1 displays the words most similar to the word "smart":

Table 1. Words Similar to "smart"

| Rank | Word similar to "smart" |
|---|---|
| Rank-1 | sharp |
| Rank-2 | deft |
| Rank-3 | clever |
| Rank-4 | intelligent |
| Rank-5 | skilled |
| Rank-6 | once |
| Rank-7 | overly |
| Rank-8 | intellect |
| Rank-9 | cunning |
| Rank-10 | very |
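The co-occurrence matrix that GloVe starts from can be illustrated with a short counting routine. This is a simplified sketch: it uses plain symmetric counts within a fixed window, whereas GloVe implementations may weight pairs by distance, and the function name and default window size are assumptions.

```python
from collections import Counter


def cooccurrence_counts(documents, window=2):
    """Count how often each ordered word pair appears within `window` positions."""
    counts = Counter()
    for tokens in documents:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for k in range(lo, hi):
                if k != i:
                    counts[(word, tokens[k])] += 1
    return counts


docs = [["rtx", "gpu", "fast"], ["rtx", "gpu"]]
print(cooccurrence_counts(docs, window=1))
```

GloVe then fits word vectors so that their dot products approximate the logarithm of these counts.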

2.3 Naive Bayes Classifier

At this stage, classification is carried out using the Naive Bayes Classifier. Naive Bayes is often used in sentiment analysis research because the algorithm is simple and provides high-accuracy results [15]. The following is the general Naive Bayes equation:

$$c(d) = \arg\max_{c \in C} P(c) \prod_{i=1}^{n} P(f_i \mid c) \qquad (2)$$

where $d$ is the test document, $c(d)$ is the class assigned to $d$, $c$ is a class label from the set of classes $C$, $n$ is the number of attributes, $f_i$ is the $i$-th of the $n$ features, and $P(f_i \mid c)$ is the probability of feature $f_i$ given class $c$. At the experimental stage, classification is carried out with several Naive Bayes variants to compare which gives the highest accuracy.
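As a rough illustration of the decision rule in Equation (2), the sketch below trains a tiny multinomial Naive Bayes model on token counts and picks the class with the highest score. It works in log space to avoid underflow and uses Laplace smoothing with a hypothetical `alpha` parameter; it operates on raw tokens rather than on the embedding features used in this study, so it should be read as a toy example only.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels, alpha=1.0):
    """Estimate log P(c) and smoothed log P(f_i | c) from tokenized documents."""
    vocab = {w for d in docs for w in d}
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for d, y in zip(docs, labels):
        word_counts[y].update(d)
    log_prior = {c: math.log(n / len(labels)) for c, n in class_counts.items()}
    log_like = {}
    for c in class_counts:
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        log_like[c] = {w: math.log((word_counts[c][w] + alpha) / total) for w in vocab}
    return log_prior, log_like


def predict_nb(doc, log_prior, log_like):
    """Return arg max over c of log P(c) + sum_i log P(f_i | c)."""
    def score(c):
        # Words never seen in training are ignored.
        return log_prior[c] + sum(log_like[c][w] for w in doc if w in log_like[c])
    return max(log_prior, key=score)


docs = [["fast", "benchmark"], ["expensive", "price"], ["fast", "gpu"]]
labels = [1, -1, 1]
model = train_nb(docs, labels)
print(predict_nb(["fast"], *model))
```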

2.4 Evaluation

In this study, performance evaluation measures the results obtained in the classification process. Classification results are evaluated with a confusion matrix, a useful and comprehensive view of classifier performance [16]. It is displayed in tabular form, from which accuracy, precision, recall, and f1-score are calculated using the equations below:

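$$\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (3)$$

$$\text{recall} = \frac{TP}{TP + FN} \qquad (4)$$

$$\text{precision} = \frac{TP}{TP + FP} \qquad (5)$$

$$\text{f1-score} = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}} \qquad (6)$$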

Information:

TP (True Positive) = the system predicts positive and the actual class value is positive

TN (True Negative) = the system predicts negative and the actual class value is negative

FP (False Positive) = the system predicts positive while the actual class value is negative

FN (False Negative) = the system predicts negative while the actual class value is positive
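The four measures can be computed directly from the confusion matrix counts. The sketch below is a minimal helper with an illustrative name; it assumes the denominators are non-zero, and for a three-class problem such as this one the counts would be taken per class in a one-vs-rest fashion, which is an assumption rather than something the study states.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, recall, precision, and f1-score from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "recall": recall, "precision": precision, "f1": f1}


# Example counts chosen only for illustration.
print(classification_metrics(tp=3, tn=4, fp=1, fn=2))
```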

3. RESULTS AND DISCUSSION

3.1 Data Crawling

Data crawling is done using the Python programming language and the snscrape library. The data collected consists of tweets from Twitter matching the keyword "NvidiaRTX". A total of 1,406 tweets were collected for the period from 1 January 2020 to 25 October 2022. The collected data was stored in CSV format.

3.2 Data Labeling

The text in the dataset generally reflects how one perceives a particular topic [17]. The dataset is manually labeled for both sentiment and aspect category. Aspect categories are divided into 3, namely the performance, price, and availability aspects. In determining the aspect of a tweet, certain words indicate which aspect the tweet refers to. The words identifying each aspect category are shown in Table 2.

Copyright © 2023 Wira Abner Sigalingging

Table 2. Category Aspect Identification

| Category Aspect | Words |
|---|---|
| Performance | Benchmark, High end, Fast, 3dmark |
| Price | Sale, Buy |
| Availability | Coming, Sold, Release, Available |
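The indicator words in Table 2 could also be used to suggest candidate aspects before manual review. The helper below is hypothetical: the study labels tweets by hand, and this sketch only shows how the keyword lists map to aspects using simple lowercase substring matching.

```python
# Indicator words taken from Table 2, lowercased for matching.
ASPECT_KEYWORDS = {
    "performance": ["benchmark", "high end", "fast", "3dmark"],
    "price": ["sale", "buy"],
    "availability": ["coming", "sold", "release", "available"],
}


def suggest_aspects(tweet):
    """Return the sorted aspect names whose indicator words occur in the tweet."""
    text = tweet.lower()
    return sorted(
        aspect
        for aspect, keywords in ASPECT_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    )


print(suggest_aspects("New 3DMark benchmark leaked"))
```

Substring matching is deliberately loose (for example, "coming" also matches "upcoming"), so the output is a suggestion for a human labeler rather than a final label.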

Sentiments are also divided into 3 polarities just like the aspect categories, namely positive sentiment, neutral sentiment and negative sentiment. Polarity determination will be explained in Table 3.

Table 3. Identify Sentiment Polarity

| Label | Identified |
|---|---|
| 1 | Positive Sentiment |
| 0 | Neutral Sentiment |
| -1 | Negative Sentiment |

3.3 Preprocessing

Labeled data is then preprocessed before classification. The main purpose of preprocessing is to reduce noise in the data [18]. Preprocessing is carried out in several stages so that the data can be processed further at a later stage. An example of the preprocessing stages is shown in Table 4 below.

Table 4. Stages and Results of Preprocessing

| Preprocess | Sentence |
|---|---|
| Raw Data | "An alleged 3DMark Time Spy benchmark of NVIDIA's upcoming GeForce RTX 4090 graphics card has been leaked. "\n#nvidia #nvidiartx #gpu #gpus #hardware #computerhardware #gamingpc #gamingpcbuild #technology #technologynews #rumors https://t.co/JLbceCSa46 |
| Data Cleaning & Case Folding | an alleged time spy benchmark of nvidias upcoming geforce rtx graphics card has been leakednvidia nvidiartx gpu gpus hardware computerhardware gamingpc gamingpcbuild technology technologynews rumors |
| Stopword Removal | alleged time spy benchmark nvidias upcoming geforce rtx graphics card leakednvidia nvidiartx gpu gpus hardware computerhardware gamingpc gamingpcbuild technology technologynews rumors |
| Stemming | alleg time spi benchmark nvidia upcom geforc rtx graphic card leakednvidia nvidiartx gpu gpus hardware computerhardware gamingpc gamingpcbuild technology technologynews rumor |

3.4 Feature Extraction

As previously explained, 2 feature extractions are compared in this study: Word2Vec and GloVe. Word2Vec accepts a corpus as input and outputs vectors that represent the data in that corpus; the goal is to make it easier for the machine to understand the data in the corpus. Word2Vec is expected to improve performance in sentiment analysis, as discussed in the research [19], [20]. Unlike Word2Vec, GloVe works by counting how often words appear together in the corpus. The resulting GloVe vectors can be used to list similar words; for an example, see Table 1.

3.5 Test Results

In this study, multi-aspect sentiment analysis was carried out on the 1,406 collected tweets about Nvidia RTX. The data is grouped into the predetermined aspect categories, namely performance, price, and availability. Two types of feature extraction are used, namely Word2Vec and GloVe.

The aim is to compare the effect of the two feature extractions on system performance for aspect-based sentiment analysis. Two scenarios are carried out. In the first scenario, the data is not grouped by aspect category and classification uses Multinomial Naive Bayes. In the second scenario, the data is grouped by aspect category and classified using several Naive Bayes variants. The results of the first scenario are shown in Table 5.

Table 5. Result of Scenario I

| Feature Extraction | Accuracy (%) | F1-score (%) |
|---|---|---|
| Word2Vec | 50.88 | 66.41 |
| GloVe | 56.22 | 70.85 |


Table 6. Result of Scenario II (Accuracy, %)

| Naive Bayes Variant | Word2Vec: Performance | Word2Vec: Price | Word2Vec: Availability | GloVe: Performance | GloVe: Price | GloVe: Availability |
|---|---|---|---|---|---|---|
| Multinomial | 49.38 | 57.14 | 44.82 | 46.29 | 57.14 | 40.22 |
| Bernoulli | 51.85 | 42.85 | 42.52 | 38.27 | 39.28 | 31.03 |
| Gaussian | 39.50 | 21.42 | 29.88 | 15.43 | 60.71 | 13.94 |

In the first scenario, where classification is done without grouping data into aspect categories, the best result is obtained with GloVe feature extraction at an accuracy of 56.22%. This is not a significant difference from Word2Vec, which reaches an accuracy of 50.88%. The Naive Bayes variant used in the first scenario is Multinomial NB. Table 6 shows the results of the second scenario, where the data has been grouped by aspect category and the three Naive Bayes variants are compared. The highest accuracy using Word2Vec is 57.14% on the price aspect, classified with Multinomial NB, while the highest accuracy using GloVe is 60.71% on the price aspect, classified with Gaussian NB.
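The Scenario II figures can be summarized per feature extraction with a short calculation. The sketch below copies the accuracy values reported for Scenario II and computes, for each feature extraction, the unweighted mean over all nine variant and aspect combinations together with the best single cell; the unweighted mean is an assumption about how an overall comparison might be made.

```python
from statistics import mean

# Accuracy (%) per feature extraction, Naive Bayes variant, and aspect (Scenario II).
TABLE_6 = {
    "Word2Vec": {
        "Multinomial": {"Performance": 49.38, "Price": 57.14, "Availability": 44.82},
        "Bernoulli": {"Performance": 51.85, "Price": 42.85, "Availability": 42.52},
        "Gaussian": {"Performance": 39.50, "Price": 21.42, "Availability": 29.88},
    },
    "GloVe": {
        "Multinomial": {"Performance": 46.29, "Price": 57.14, "Availability": 40.22},
        "Bernoulli": {"Performance": 38.27, "Price": 39.28, "Availability": 31.03},
        "Gaussian": {"Performance": 15.43, "Price": 60.71, "Availability": 13.94},
    },
}


def summarize(table):
    """Return the mean accuracy and the best (accuracy, variant, aspect) cell."""
    summary = {}
    for extraction, variants in table.items():
        cells = [
            (accuracy, variant, aspect)
            for variant, aspects in variants.items()
            for aspect, accuracy in aspects.items()
        ]
        summary[extraction] = {
            "mean": mean(accuracy for accuracy, _, _ in cells),
            "best": max(cells),
        }
    return summary


for name, stats in summarize(TABLE_6).items():
    print(name, round(stats["mean"], 2), stats["best"])
```

Under this summary, Word2Vec has the higher mean accuracy while GloVe holds the single highest cell.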

4. CONCLUSION

In this research, two scenarios were carried out to compare the Word2Vec and GloVe feature extractions in terms of accuracy. In the first scenario, GloVe gives better accuracy, 56.22%, compared to 50.88% for Word2Vec. The second scenario gives a different picture: the average accuracy of Word2Vec is higher than that of GloVe. Although the highest accuracy, 60.71%, is obtained using GloVe on the price aspect with Gaussian NB, Word2Vec gives better accuracy overall, especially with the Bernoulli NB variant. The system can still be improved through further research using different classifiers or by enlarging the collected dataset. The dataset used in this study is unbalanced and still small, which contributed to the less-than-ideal results.

REFERENCES

[1] V. V. Sanzharov, A. I. Gorbonosov, V. A. Frolov, and A. G. Voloboy, “Examination of the Nvidia RTX,” in CEUR Workshop Proceedings, 2019, vol. 2485, pp. 7–12. doi: 10.30987/graphicon-2019-2-7-12.

[2] A. Bayhaqy, S. Sfenrianto, K. Nainggolan, and E. R. Kaburuan, “Sentiment Analysis about E-Commerce from Tweets Using Decision Tree, K-Nearest Neighbor, and Naïve Bayes.” [Online]. Available: http://dlvr.it/Qb83n8pic.twitter.com/8MucIMhUMO

[3] T. U. Haque, N. N. Saber, and F. M. Shah, “Sentiment Analysis on Large Scale Amazon Product Reviews,” 2018 IEEE International Conference on Innovative Research and Development (ICIRD), May 2018.

[4] J. Song, K. T. Kim, B. Lee, S. Kim, and H. Y. Youn, “A novel classification approach based on Naïve Bayes for Twitter sentiment analysis,” KSII Transactions on Internet and Information Systems, vol. 11, no. 6, pp. 2996–3011, 2017, doi: 10.3837/tiis.2017.06.011.

[5] A. L. F. Alves, C. de S. Baptista, A. A. Firmino, M. G. de Oliveira, and A. C. de Paiva, “A comparison of SVM versus naive-bayes techniques for sentiment analysis in tweets: A case study with the 2013 FIFA confederations cup,” in WebMedia 2014 - Proceedings of the 20th Brazilian Symposium on Multimedia and the Web, Nov. 2014, pp. 123–130. doi: 10.1145/2664551.2664561.

[6] M. Guia, R. R. Silva, and J. Bernardino, “Comparison of Naive Bayes, support vector machine, decision trees and random forest on sentiment analysis,” in IC3K 2019 - Proceedings of the 11th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, 2019, vol. 1, pp. 525–531. doi: 10.5220/0008364105250531.

[7] R. Novendri, A. S. Callista, D. N. Pratama, and C. E. Puspita, “Sentiment Analysis of YouTube Movie Trailer Comments Using Naïve Bayes,” Bulletin of Computer Science and Electrical Engineering, vol. 1, no. 1, pp. 26–32, Jun. 2020, doi: 10.25008/bcsee.v1i1.5.

[8] I. P. Ananda, M. Utama, S. Prasetyowati, and Y. Sibaroni, “Multi-Aspect Sentiment Analysis Hotel Review Using RF, SVM, and Naïve Bayes based Hybrid Classifier,” JURNAL MEDIA INFORMATIKA BUDIDARMA, vol. 5, no. 2, pp. 630–639, Apr. 2021, doi: 10.30865/MIB.V5I2.2959.

[9] O. B. Deho, W. A. Agangiba, F. L. Aryeh, and J. A. Ansah, “Sentiment Analysis with Word Embedding.”

[10] H. Li, X. Li, D. Caragea, and C. Caragea, “Comparison of Word Embeddings and Sentence Encodings as Generalized Representations for Crisis Tweet Classification Tasks.” [Online]. Available: http://aidr.qcri.org/

[11] T. Shi and Z. Liu, “Linking GloVe with word2vec,” Nov. 2014, [Online]. Available: http://arxiv.org/abs/1411.5595

[12] M. Avinash and E. Sivasankar, “A study of feature extraction techniques for sentiment analysis,” Advances in Intelligent Systems and Computing, vol. 814, pp. 475–486, 2019, doi: 10.1007/978-981-13-1501-5_41.

[13] C. A. Iglesias and A. Moreno, “Sentiment Analysis for Social Media”, Accessed: Jan. 30, 2023. [Online]. Available: www.mdpi.com/journal/applsci

[14] R. V. O. I. Sudiro, S. S. Prasetiyowati, and Y. Sibaroni, “Aspect Based Sentiment Analysis with Combination Feature Extraction LDA and Word2vec,” 2021 9th International Conference on Information and Communication Technology, ICoICT 2021, pp. 611–615, Aug. 2021, doi: 10.1109/ICOICT52021.2021.9527506.

[15] [...] NAIVE BAYES AND SIMPLE KRIGING,” JIPI (Jurnal Ilmiah Penelitian dan Pembelajaran Informatika), vol. 7, no. 4, pp. 1244–1253, Nov. 2022, doi: 10.29100/JIPI.V7I4.3264.

[16] D. Krstinić et al., “Multi-label Classifier Performance Evaluation with Confusion Matrix,” Computer Science & Information Technology (CS & IT), vol. 10, no. 8, p. 1, Jun. 2020, doi: 10.5121/CSIT.2020.100801.

[17] A. Hussain, “Socio-Affective Computing Volume 5 Series Editor”, Accessed: Jan. 30, 2023. [Online]. Available: http://www.springer.com/series/13199

[18] W. Etaiwi and G. Naymat, “The Impact of applying Different Preprocessing Steps on Review Spam Detection,” in Procedia Computer Science, 2017, vol. 113, pp. 273–279. doi: 10.1016/j.procs.2017.08.368.

[19] S. Al-Saqqa and A. Awajan, “The Use of Word2vec Model in Sentiment Analysis: A Survey,” in ACM International Conference Proceeding Series, Dec. 2019, pp. 39–43. doi: 10.1145/3388218.3388229.

[20] E. M. Alshari, A. Azman, S. Doraisamy, N. Mustapha, and M. Alkeshr, “Improvement of sentiment analysis based on clustering of Word2Vec features,” in Proceedings - International Workshop on Database and Expert Systems Applications, DEXA, Sep. 2017, vol. 2017-August, pp. 123–126. doi: 10.1109/DEXA.2017.41.