Exploring Text Analytics for Social Media Competitive Analysis:
Top Brand Internet Service Provider Companies in Indonesia
Velia Vanissa1*, Tri Widarmanti1, Herry Irawan1
1 School of Economics and Business, Telkom University, Bandung, Indonesia
*Corresponding Author: [email protected]
Accepted: 15 February 2021 | Published: 1 March 2021
__________________________________________________________________________________________
Abstract: This study aims to measure the social media competitiveness of Internet Service Providers in Indonesia. Data were collected by crawling Twitter for the keywords “indihome”, “firstmedia”, “biznet”, and “gigbyindosat” over 30 days using RStudio version 1.3.1093. The resulting data take the form of User-Generated Content, that is, user opinions, which were processed using Sentiment Analysis. We also used Social Network Analysis, carried out in Gephi, to identify the properties of each network. In addition, a WordCloud was produced for each sentiment to show what users talk about under that sentiment, revealing the strengths and weaknesses of each Internet Service Provider. The results can be used to measure the social media competitiveness of Top Brand Internet Service Providers and to support other academic research.
Keywords: competitive intelligence, competitive analysis, network properties, sentiment analysis, social network analysis, user-generated content
___________________________________________________________________________
1. Introduction
Indonesia has experienced an increase in internet users every year, and the number has grown further with remote activities during the COVID-19 pandemic. Along with the increase in internet users, there is an increase in fixed broadband users in Indonesia. According to a survey visualized by We Are Social (2020), the second most common activity on the internet is the use of social media.
The growing activity of internet users increases the amount of data or content generated by users (User-Generated Content, UGC). According to Sapountzi and Psannis (2018), the volume of UGC entering social media is so large that fast, varied, and sophisticated tools and methods are required to extract information from it. Social media is used not only by individuals; companies also use it for business purposes, such as managing their relationships with customers. It is important for a company to improve the quality of its relationships with customers, especially for companies that hold the title of Top Brand Internet Service Provider (ISP). The Top Brand Award is an award given to brands in Indonesia.
In Phase 1 2020, there were 4 Internet Service Providers with the title Top Brand, namely IndiHome, First Media, Biznet, and Indosat M2. These companies are also actively using Twitter as a place for them to interact with their users.
Companies that provide internet services need to improve the quality of their service in order to gain the trust of their customers, and measuring their competitive ability on social media is one way to do this. As Internet Service Providers that hold the Top Brand title, they need to examine their abilities and social media identity in order to understand their competitive abilities and help identify the company's social media strategy. Customer needs and perceptions of a product can be drawn from UGC data, which can be a valuable input for product improvement and Customer Relationship Management (Schmunk et al., 2013), and research interest in using social media for competitive analysis has increased (Gémar and Jiménez-Quintero, 2015). Social media has helped many brands grow their popularity across social media channels; given this, it is important for brands to investigate and analyze the behavior of their competitors on social media (Arora et al., 2020). A company can therefore measure its own competitive ability and that of its competitors on social media.
Social media competitive analysis can be done using sentiment analysis to measure customer opinions and Social Network Analysis (SNA) to see how information spreads on social media. SNA has become a powerful competitive tool for companies to observe marketing activity among customers and competitors (Bayer and Servan-Schreiber, 2011). This study also used WordCloud as a tool to see what topics were discussed by customers under each sentiment.
This study seeks to identify the strengths and weaknesses of each company. Competitive analysis on social media allows businesses to gain a business advantage by analyzing the available social media data (He et al., 2013).
2. Literature Review
Competitive Intelligence is the key to strategic management. Cobb (2003) stated that competitive intelligence can be seen as a process that can support strategic and practical decisions of a company, but it also requires a reliable, relevant, and precise information analysis system and process. Competitive Intelligence provides knowledge about competitors, marketing strategies, objectives, research activities, their strengths and weaknesses, and other information.
Social media provides a platform for consumers to express and share feelings and thoughts with many people (Afolabi et al., 2017). The presence of companies on social media gives them access to useful information that can help them deliver better service and connect with potential customers or clients. Companies can also identify markets and find competitive marketing advantages through Competitive Intelligence (CI). Social media is the result of the development of ideology and technology based on Web 2.0, in the form of internet-based applications that allow the exchange of content created by users, resulting in interactions with each other (Kaplan and Haenlein, 2010). Social media has become the largest medium of communication, involving billions of people around the world directly or indirectly (Nazir et al., 2019). This has been a big change in our lives, and social media can be used for both positive and negative purposes. The need today is to analyze the power of social media and make it a driver of business growth by analyzing user comments, in the form of user-generated content, about how users talk about a product. UGC data can be processed using text mining.
Text mining is an automation technology that can be used to recognize, extract, manage, integrate, and utilize text knowledge effectively and systematically (Ananiadou et al., 2009).
Text mining is an automatic process that can reveal new knowledge and relationships between patterns formed in unstructured textual data sources (Eman, 2015). Text mining works through text categorization, text grouping, and entity extraction. The techniques it uses include sentiment analysis, document summarization, and entity relationship models. Text analysis involves information retrieval, lexical pattern analysis, recognition, and natural language processing (Nazir et al., 2019). Sentiment analysis is a classification method that uses supervised learning to identify positive or negative words in a sentence in order to determine a person's positive or negative emotional attitude (Maks and Vossen, 2012).
Sentiment analysis was first introduced as opinion mining and subjectivity analysis, which can determine the attitudes or patterns in opinions and reviews written by humans, in this case to assess a product or service. Sentiment analysis is usually applied to all forms of textual opinion such as blogs, reviews, and microblogs, or what are currently better known as tweets, namely short messages that cannot be longer than 140 characters (Eman, 2015). One of the advantages of text mining is that it reveals hidden patterns, which can be used to understand consumer opinions about services and products. From the UGC data, the communication patterns between Twitter users can be analyzed using network analysis. Social Network Analysis is the study of human relations through graph theory (Tsvetovat and Kouznetsov, 2011). This is in accordance with the notion of social networks according to Zhan and Fang (2011): a social network is a social structure of a society with certain relationships, resembling a graph consisting of nodes and edges. Smith et al. (2012) stated that UGC in various media is associated with brands and has the potential to shape consumer brand perceptions, which is important for marketers.
3. Methodology
This study discusses the competitive analysis of social media on Twitter using Sentiment Analysis and Social Network Analysis. After data processing, the analysis proceeds in 3 stages: measuring customer opinion using sentiment analysis, extracting the topics discussed in each sentiment using WordCloud, and identifying social networks using Social Network Analysis.
Sentiment Analysis was conducted using Microsoft Excel 2017 by giving a positive, neutral, or negative label to each tweet. These tweets were previously cleaned using RStudio version 1.3.1093. Cleaning consisted of removing duplicate data, links, mentions, and hashtags. After that, four steps were carried out: Lowercase Conversion, which changes all letters to lowercase; Tokenization, which converts a sentence into words, phrases, or other important parts; Stopword Removal, which removes words that do not depend on a particular topic, such as conjunctions and prepositions; and finally Stemming, which changes words back into their base form in standard grammar. After the Sentiment Analysis, the data were separated by sentiment to make it easier to visualize the words using WordCloud. The WordCloud visualization was carried out using the wordclouds.com website. The data, separated by sentiment, were entered on this website in csv format, after which the words appear automatically. Colors, shapes, and fonts can be customized as desired. The same crawled data were then processed using Social Network Analysis. Before that, the data were cleaned by deleting all columns except the Text and ReplyToSN columns. The data were then entered into Gephi version 0.9.2 to identify the network properties.
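The cleaning steps described above can be sketched in a few lines. The study itself used RStudio; the following is only an illustrative Python sketch, and the stopword list, the suffix-stripping rule, the account handles, and the sample tweets are all assumptions made up for the example rather than the resources used in the study.

```python
import re

# Hypothetical, tiny stopword list; a real run would use a full Indonesian list.
STOPWORDS = {"yang", "dan", "di", "ke", "dari", "ini", "itu", "untuk"}
# Toy stemmer rule: strip a few common Indonesian particles/possessive suffixes.
SUFFIXES = ("nya", "lah", "kah")

def clean_tweet(text):
    """Remove links, mentions and hashtags, then convert to lowercase."""
    text = re.sub(r"https?://\S+", " ", text)  # links
    text = re.sub(r"[@#]\w+", " ", text)       # mentions and hashtags
    return text.lower()

def tokenize(text):
    """Split cleaned text into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text)

def stem(token):
    """Very naive stemming: drop one known suffix if the remainder is long enough."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(tweets):
    """Deduplicate tweets, then clean, tokenize, remove stopwords and stem."""
    seen = set()
    result = []
    for tweet in tweets:
        if tweet in seen:  # skip exact duplicates
            continue
        seen.add(tweet)
        tokens = tokenize(clean_tweet(tweet))
        result.append([stem(t) for t in tokens if t not in STOPWORDS])
    return result

# Made-up sample tweets (not data from the study).
sample = [
    "Internetnya lambat banget @indihome https://t.co/abc #gangguan",
    "Internetnya lambat banget @indihome https://t.co/abc #gangguan",
    "Promo dari @BiznetHome ini menarik",
]
processed = preprocess(sample)
print(processed)
```

A production pipeline would replace the toy stemmer with a proper Indonesian stemming library and a complete stopword list; the order of the steps is what the sketch is meant to show.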
4. Discussion and Conclusion
Sentiment analysis results show the percentage of different sentiments of each internet service provider. Figure 1 shows a comparison of user sentiment at each internet service provider.
Figure 1: Comparison of Brand Sentiment from the Internet Service Providers
In Figure 1, P is Positive, N is Negative, and T is Neutral. Based on the comparison shown in Figure 1, the ranking of the Internet Service Providers is given in Table 1 below.
Table 1: The Ranking of Internet Service Providers Based on Sentiment Analysis
Rank   Positive Sentiment   Negative Sentiment
1      Indosat M2           IndiHome
2      Biznet               Biznet
3      IndiHome             First Media
4      First Media          Indosat M2
The ranking is carried out only on positive and negative sentiments, because neutral sentiment does not provide any information other than the topics users are talking about.
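As an illustration of how such a ranking can be derived, the sketch below computes the share of each sentiment label per brand and sorts the brands by that share. It assumes the ranking is based on the percentage of tweets per sentiment; the brand names and labels are invented for the example and are not the study's data.

```python
from collections import Counter

SENTIMENTS = ("positive", "neutral", "negative")

def sentiment_shares(labels):
    """Return the fraction of tweets carrying each sentiment label."""
    counts = Counter(labels)
    total = len(labels)
    return {s: counts.get(s, 0) / total for s in SENTIMENTS}

def rank_brands(labels_by_brand, sentiment):
    """Order brands from the highest to the lowest share of one sentiment."""
    shares = {
        brand: sentiment_shares(labels)[sentiment]
        for brand, labels in labels_by_brand.items()
    }
    return sorted(shares, key=shares.get, reverse=True)

# Made-up labels for three hypothetical brands.
toy_labels = {
    "Brand A": ["positive", "negative", "negative", "neutral"],
    "Brand B": ["positive", "positive", "neutral", "negative"],
    "Brand C": ["positive", "positive", "positive", "neutral"],
}
positive_rank = rank_brands(toy_labels, "positive")
negative_rank = rank_brands(toy_labels, "negative")
print(positive_rank)
print(negative_rank)
```

Ranking by share rather than by raw count keeps brands with very different tweet volumes comparable.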
The results of the WordCloud visualization show the topics discussed by Twitter users about each Internet Service Provider based on their sentiments. Figure 2 shows a visualized WordCloud.
The WordCloud visualization is in Indonesian because the tweets obtained are in Indonesian. From Figure 2 it can be seen that IndiHome's strength is the promos it often offers, First Media's strength is its affordable prices, Biznet has good connections in certain areas, and Indosat M2's strength lies in the prize quizzes it regularly holds on social media as well as limited promos that customers have to scramble to get. The weaknesses of the four Internet Service Providers are almost entirely the same, namely connection and network problems that often cause disruption. This is reasonable considering that these four companies are in the same industry, namely internet service provision.
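A word cloud sizes each word by how often it occurs, so the topics per sentiment can also be inspected directly as word counts. The study used the wordclouds.com website; the sketch below is only a Python illustration of the underlying counting, with made-up tokens.

```python
from collections import Counter, defaultdict

def word_frequencies_by_sentiment(labelled_tokens):
    """Count word occurrences separately for each sentiment label."""
    freqs = defaultdict(Counter)
    for sentiment, tokens in labelled_tokens:
        freqs[sentiment].update(tokens)
    return freqs

# Made-up (sentiment, tokens) pairs; not data from the study.
toy_tweets = [
    ("positive", ["promo", "murah"]),
    ("positive", ["promo", "cepat"]),
    ("negative", ["gangguan", "lambat"]),
    ("negative", ["gangguan", "jaringan", "gangguan"]),
]
freqs = word_frequencies_by_sentiment(toy_tweets)
print(freqs["positive"].most_common(1))
print(freqs["negative"].most_common(1))
```

The most frequent words under positive sentiment point to perceived strengths, and those under negative sentiment to perceived weaknesses.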
In the Social Network Analysis, the value of each network property is obtained which can be seen in Table 2 below.
Table 2: Network Properties from each Internet Service Provider
Network Property         IndiHome   First Media   Biznet   Indosat M2
Size (nodes)             32,471     15,632        3,978    668
Size (edges)             39,739     17,778        3,327    626
Density                  0          0             0        0.003
Modularity               0.492      0.506         0.752    0.859
Diameter                 8          7             8        8
Average Degree           2.448      2.275         1.673    1.874
Average Path Length      2.782      2.086         3.06     3.243
Clustering Coefficient   0          0             0.007    0
Connected Component      4,069      3,408         860      64
From the results of the Social Network Analysis, each network property can be explained as follows:
1) Size
A large number of nodes indicates that the network is quite active because many users are interacting. In this study, the number of edges, which represents the spread of information, is reported along with the number of nodes, because edges are what connect the nodes in a network. Among the Top Brand Internet Service Provider social networks, IndiHome has the most nodes and edges, with 32,471 nodes and 39,739 edges, followed by First Media with 15,632 nodes and 17,778 edges, then Biznet with 3,978 nodes and 3,327 edges. Indosat M2 has the fewest, with 668 nodes and 626 edges.
2) Density
The higher the density, the better, because the relationships between the nodes are closer. In this study, Indosat M2 has the highest density, namely 0.003, while IndiHome, First Media, and Biznet have the same value, namely 0.
3) Modularity
The higher the Modularity, the more clearly the network divides into groups, and it can be said that the network is quite active. In this study, Indosat M2 had the highest Modularity value, namely 0.859, followed by Biznet with a value of 0.752, then First Media with a value of 0.506, and finally IndiHome with a value of 0.492.
4) Diameter
The diameter is the longest of the shortest paths between any two nodes in a network. In this study, the Indosat M2, Biznet, and IndiHome social networks have the same diameter, namely 8, while First Media has a diameter of 7. This means that the two most distant connected nodes in the Indosat M2, Biznet, and IndiHome networks are 8 steps apart, whereas in the First Media social network they are 7 steps apart.
5) Average Degree
The greater the Average Degree, the better, because information spreads faster and more easily. In this study, IndiHome had the highest Average Degree value, namely 2.448, followed by First Media with 2.275, then Indosat M2 with 1.874, and finally Biznet with 1.673.
6) Average Path Length
The smaller the Average Path Length, the faster information spreads on the network. In this study, First Media had the smallest Average Path Length, namely 2.086, followed by IndiHome with a value of 2.782, then Biznet with a value of 3.06, and finally Indosat M2 with the largest value, namely 3.243.
7) Clustering Coefficient
The clustering coefficient is symbolized by Ci. The larger Ci is, the denser the relationships are, which indicates a small-world phenomenon in the network. If Ci = 0, the neighbors of a node are not connected to one another, and if Ci = 1, all of them are connected. In this study, the Clustering Coefficient value in the Indosat M2, First Media, and IndiHome social networks is 0, meaning that the nodes are not interconnected. On the Biznet social network, the value is 0.007, meaning the nodes are interconnected but only very weakly.
8) Connected Component
The greater the Connected Component value, the better, meaning that the network can form many groups that are related to one another. In this study, the largest Connected Component value was in IndiHome with a value of 4,069, followed by First Media with a value of 3,408, then Biznet with a value of 860, and the smallest value was Indosat M2, namely 64.
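Several of the properties above follow from simple formulas. As a rough check, the sketch below assumes an undirected network, where average degree is 2E/N and density is 2E/(N(N-1)); under that assumption the node and edge counts in Table 2 reproduce the reported Average Degree and Density values. The study computed its values in Gephi; the helper functions and the toy reply pairs here are illustrative only.

```python
from collections import deque

def average_degree(nodes, edges):
    """Average degree of an undirected graph: 2E / N."""
    return 2 * edges / nodes

def density(nodes, edges):
    """Density of an undirected graph: 2E / (N * (N - 1))."""
    return 2 * edges / (nodes * (nodes - 1))

# Node and edge counts taken from Table 2.
TABLE_2_SIZE = {
    "IndiHome": (32471, 39739),
    "First Media": (15632, 17778),
    "Biznet": (3978, 3327),
    "Indosat M2": (668, 626),
}
for brand, (n, e) in TABLE_2_SIZE.items():
    print(brand, round(average_degree(n, e), 3), round(density(n, e), 3))

def build_graph(pairs):
    """Build an undirected adjacency map from (user, replied_to) pairs."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def bfs_distances(graph, source):
    """Shortest-path lengths from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def connected_components(graph):
    """Number of groups of nodes that are mutually reachable."""
    seen, count = set(), 0
    for node in graph:
        if node not in seen:
            seen.update(bfs_distances(graph, node))
            count += 1
    return count

def diameter(graph):
    """Longest shortest path between any two connected nodes."""
    return max(max(bfs_distances(graph, n).values()) for n in graph)

# Made-up reply pairs in the spirit of the Text/ReplyToSN data.
toy_graph = build_graph([("u1", "isp"), ("u2", "isp"), ("u3", "u2"), ("u4", "u5")])
print(connected_components(toy_graph), diameter(toy_graph))
```

The breadth-first search used here is practical only for small graphs; for networks of the size in Table 2, a dedicated tool such as Gephi is the more sensible choice.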
From the discussion above, several conclusions can be drawn. First, according to the positive sentiment, IndiHome is known for the promos it often offers, First Media is known for its affordable prices, Biznet is known for its good connections in certain areas, and Indosat M2 excels in the prize quizzes it regularly holds on social media and in limited promos that customers have to compete to get. According to the negative sentiment, the shortcomings of the four Internet Service Providers are almost entirely the same, namely connection and network problems that often cause disruption. This is reasonable considering that these four companies are in the same industry, namely internet service provision. The neutral sentiment is also almost the same for all Internet Service Providers, namely directions given by the account admin to the audience regarding the information they need.
Second, the results of the social network analysis show that each Internet Service Provider has different advantages. IndiHome excels in 4 network properties, namely Size, Diameter, Average Degree, and Connected Component; Indosat M2 excels in 2 properties, namely Density and Modularity; First Media excels in 1 property, namely Average Path Length; and Biznet excels in 1 property, namely Clustering Coefficient.
The results of this discussion can be used as evaluation material for Top Brand Internet Service Providers: each Internet Service Provider can learn from the strengths of its competitors. The same applies to shortcomings: each Internet Service Provider can correct its weaknesses based on customer perceptions on social media. In terms of the quality of their social networks, Internet Service Provider companies can also make strategic decisions to improve that quality, because the better the quality of their social networks, the better their relationship with customers.
For further research, a Social Media Competitive Analysis can be carried out using other methods or using social media other than Twitter. In addition, research can be carried out in other industries, such as clothing, food, or housing.
References
Afolabi et al. (2017). Competitive Analysis of Social Media Data in The Banking Industry. International Journal of Internet Marketing and Advertising.
Ananiadou, S., Rea, B., Okazaki, N., Procter, R., & Thomas, J. (2009). Supporting systematic reviews using text mining. Social Science Computer Review, 27(4), 509–523.
Arora, A., Srivastava, A., & Bansal, S. (2020). Business Competitive Analysis using promoted post detection on social media. Journal of Retailing and Consumer Services, 54(June 2019), 101941.
Bayer, J., & Servan-Schreiber, E. (2011). Gaining competitive advantage through the analysis of customers’ social networks. Journal of Direct, Data and Digital Marketing Practice, 13(2), 106–118.
Cobb, P. (2003). Competitive intelligence through data mining. Journal of Competitive Intelligence and Management, 1(3), 80–89.
Eman, Y. (2015). Sentiment Analysis and Text Mining for Social Media Microblogs using Open Source Tools: An Empirical Study. International Journal of Computer Applications, 112(5), 975–8887.
Gémar, G., & Jiménez-Quintero, J. A. (2015). Text mining social media for Competitive Analysis. 11(1), 84–90.
He, W., Zha, S., & Li, L. (2013). Social media Competitive Analysis and text mining: A case study in the pizza industry. International Journal of Information Management, 33(3), 464–472.
Kaplan, A. M., & Haenlein, M. (2010). Users of the world, unite! The challenges and opportunities of Social Media. Business Horizons, 53(1), 59–68.
Maks, I., & Vossen, P. (2012). A lexicon model for deep Sentiment Analysis and opinion mining applications. Decision Support Systems, 53(4), 680–688.
Nazir, M. U. et al. (2019). Social Media Competitive Analysis – A Case Study in The Pizza Industry of Pakistan.
Sapountzi, A., & Psannis, K. E. (2018). Social networking data analysis tools & challenges. Future Generation Computer Systems, 86, 893–913.
Schmunk et al. (2013). Sentiment Analysis: Extracting Decision-Relevant Knowledge from UGC. Information and Communication Technologies in Tourism 2014, 253–265.
Tsvetovat, M., & Kouznetsov, A. (2011). Social Network Analysis for Startups. Sebastopol: O'Reilly Media Inc.
We Are Social (2020). Digital Use Around The World in July 2020. [online] From: https://wearesocial.com/blog/2020/07/digital-use-around-the-world-in-july-2020
Zhan, J., & Fang, X. (2011). Social computing: the state of the art. International Journal of Social Computing and Cyber-Physical Systems, 1(1), 1.