
Journal of Information Technology and Computer Science Volume 8, Number 1, April 2023, pp. 11-20

Journal Homepage: www.jitecs.ub.ac.id

Responding to Natural Disasters on Twitter Social Media to Comparative Analysis of User Behavior and Geospatial Information Content in Indonesia and the United States

Muhammad Naufal Yaasir*1, Fatwa Ramdani2, Eko Setiawan3

1,2,3Brawijaya University, Malang, Indonesia

{[email protected]1, [email protected]2, [email protected]3}

Received 13 April 2022; accepted 09 January 2023

Abstract. Disasters are a series of events that threaten and disrupt human life and livelihoods. They are caused by natural, unnatural, and human factors, and result in casualties, environmental damage, loss, and psychological consequences. Natural disasters prompt Twitter users to upload information about the conditions of the affected areas, either as pictures or as text only from the disaster site. This study analyzes the behavior of Twitter social media users towards education levels in Indonesia and the United States. It provides an overview of the facts about the content shared on Twitter and a solution for the extraction of geospatial content on Twitter. Processing the Twitter data shows that there is no difference in the relationship between Twitter social media users and education levels in Indonesia and the United States, and that there is more geospatial information content in the United States than in Indonesia.

Keywords: Earthquake; Twitter; Social Media; Information; User Behaviour

1 Introduction

Disasters are a series of events that threaten and disrupt human life and livelihoods. They are caused by natural, unnatural, and human factors, and result in casualties, environmental damage, loss, and psychological consequences [1]. According to Indonesia Law No 24 Year 2007, “the National Disaster Management Agency (BNPB) divides disasters into three types, namely natural, unnatural, and social disasters. Natural disasters are a series of events caused by natural phenomena such as hurricanes, floods, earthquakes, tsunamis, volcanic eruptions, droughts and landslides. Unnatural disasters are a series of events caused by unnatural factors, such as failed technology, failed modernization, epidemics, and disease outbreaks. Social disaster is a series of events caused by human factors which include social conflict between community groups and terror.”

In many recent emergencies around the world, social media has been one of the most effective tools for emergency management and disaster relief [2]. Due to its ease of use and immediacy, Twitter, one of the social media platforms, is well suited to providing emergency management information, allowing users to share information quickly and to avoid missing specific topics [3].

The occurrence of natural disasters encourages Twitter social media users to update information about the conditions of the affected areas, either as pictures or as text only from the place of the disaster [4]. This information triggers many responses from other users, who form their attitudes and decide whether to help disaster victims based on it [5].

The earthquake and tsunami disaster in Palu, Central Sulawesi, Indonesia on September 28, 2018 killed more than 2000 people [6]. Many Twitter social media users posted tweets about the condition of the community in the earthquake-affected areas, prayers for the victims, and assistance such as food, drink, clothing and shelter. Tweets are short messages of up to 280 characters that can incorporate well-defined geographic information provided by GPS or manual verification [7].

Lansley and Longley (2016), in their study of the geography of Twitter topics in London [10], used data from Twitter as an alternative to public behavioral surveys, a source that can provide useful information to planners, marketers and researchers. Their classification identified 20 distinct topic groups, each with its own meaning, representing keywords from tweets, descriptions of informal activities and conversations between users. The motivation for that study was to use the classification to show how the nature of content posted on Twitter changes with location and user characteristics.

Negara, Andryani and Saksono (2016), in their study of geospatial information from Twitter, extracted and analyzed geospatial data from Twitter on an emerging public issue and developed a method for obtaining geospatial data, implemented in a software prototype. The extraction and analysis process is carried out in four stages: data retrieval (tracking), storage, analysis, and visualization. This research is exploratory and focuses on the development of technology for extracting and analyzing Twitter geospatial data.

Previous studies focus only on developed countries. In developing countries, there is still little research that analyzes the content of tweets related to geospatial information. Geospatial data on Twitter can provide spatial information, namely the location where public perception of an issue emerges on social media, which various parties can use to produce more useful information through the Twitter data analytics process [11].

The researcher proposes to analyze the behavior of Twitter social media users towards the level of education in both a developed and a developing country. The developed country is the United States and the developing country is Indonesia. Because the level of education differs between the two, the researcher wants to know how Twitter user behavior relates to it. This research provides an overview of the facts about the content shared on Twitter, and provides a solution for the extraction of geospatial content on Twitter.

Based on the problem identification above, the problem formulations in this study are as follows:

1. How frequently do users respond on Twitter social media before and after natural disasters, in relation to education levels in Indonesia and the United States?

2. What are the characteristics of the geospatial information content shared on Twitter social media at the time of the earthquake in Indonesia and the United States?

The limitations of the problems in this study are as follows:

1. For data collection, only a few hashtags and keywords that were trending in Indonesia and the United States on Twitter social media are used.

2. For the data sample, the period covered is September 23, 2018 to October 3, 2018 for natural disasters in Indonesia and July 1 - 11, 2019 for natural disasters in the United States.

The null hypothesis in this study is: there is no difference in the frequency with which users respond before and after natural disasters shared on Twitter in relation to the level of education in Indonesia and the United States. The alternative hypothesis is: there is a difference in the frequency with which users respond before and after natural disasters shared on Twitter in relation to the level of education in Indonesia and the United States.

2 Methodology

2.1 Case Studies and Research Datasets

The case studies in this research are Twitter social media users in Indonesia and the United States during a natural disaster. The case taken in Indonesia was the earthquake and tsunami disaster in Palu City, Central Sulawesi on September 28, 2018. Meanwhile, the case taken in the United States was the earthquake in California on July 4, 2019.

The dataset in this study consists of tweets posted by users. These tweets are limited to a few words or hashtags taken from search engines. The tweet data taken from Twitter social media is classified according to whether a tweet contains geospatial or non-geospatial content.

The geospatial dataset consists of tweets containing geospatial information such as the location of earthquake points, earthquake-affected areas, magnitude, maps, and geospatial-related images or videos.
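The split described above can be sketched as a simple keyword match. The paper does not state how tweets were labelled, so the term list `GEO_TERMS` and the function names below are hypothetical and serve only as an illustration.

```python
import re

# Hypothetical cue terms in English and Indonesian; not taken from the paper.
GEO_TERMS = ("magnitude", "epicenter", "map", "location",
             "lokasi", "peta", "pusat gempa")

# One case-insensitive pattern that matches any cue term as a whole word.
GEO_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(term) for term in GEO_TERMS) + r")\b",
    re.IGNORECASE,
)


def is_geospatial(text):
    """Return True if the tweet text mentions any of the assumed cue terms."""
    return GEO_PATTERN.search(text) is not None


def split_by_content(texts):
    """Split tweet texts into (geospatial, non_geospatial) lists."""
    geo, non_geo = [], []
    for text in texts:
        (geo if is_geospatial(text) else non_geo).append(text)
    return geo, non_geo
```

A keyword match cannot detect maps or geospatial images attached to a tweet, so media-based labelling would still need manual review or a separate step.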

2.2 Research Method

This research method is structured so that this research process can be carried out in a systematic and planned manner. This research method is divided into several stages as shown in Figure 1.

Fig 1 Research Methodology


2.3 Literature Review

The first stage in this research is to conduct a literature study. Literature study is carried out with various references obtained from sources such as books, scientific journals, articles and so on. The literature study focuses on the problem domain to be resolved, namely information related to the characteristics of Twitter social media users regarding natural disasters in the United States and Indonesia. In this process the researcher will get all the information related to the problem domain as a basis for carrying out the next stages in the research.

2.4 Hypothesis Analysis and Testing

Data retrieval is done using the Twitterscraper software. The tweets retrieved are tweets about natural disasters in Indonesia and the United States, based on the keywords "Gempa", "Gempa Palu", "Disaster", "Earthquake", and "California Earthquake". The collection covers 23 September 2018 to 3 October 2018 for natural disasters in Indonesia and 1 - 11 July 2019 for natural disasters in the United States.
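As a rough sketch, the keywords and date ranges above can be expanded into one collection job per keyword per day. The assignment of keywords to countries, the job fields, and the output file name pattern are assumptions made for illustration. The scraper call itself is left out because the paper does not show it.

```python
from datetime import date, timedelta

# Keywords and date ranges as listed in the text. Which keyword belongs to
# which country is an assumption; the paper does not say.
CASES = {
    "Indonesia": {
        "keywords": ["Gempa", "Gempa Palu"],
        "start": date(2018, 9, 23),
        "end": date(2018, 10, 3),
    },
    "United States": {
        "keywords": ["Disaster", "Earthquake", "California Earthquake"],
        "start": date(2019, 7, 1),
        "end": date(2019, 7, 11),
    },
}


def daily_windows(start, end):
    """Yield (begin, end_exclusive) date pairs, one per day, end inclusive."""
    day = start
    while day <= end:
        yield day, day + timedelta(days=1)
        day += timedelta(days=1)


def build_jobs(cases=CASES):
    """Return one job description per country, keyword and day."""
    jobs = []
    for country, case in cases.items():
        for keyword in case["keywords"]:
            for begin, end in daily_windows(case["start"], case["end"]):
                slug = keyword.lower().replace(" ", "_")
                jobs.append({
                    "country": country,
                    "keyword": keyword,
                    "begin": begin,
                    "end": end,
                    # Hypothetical file name pattern: one JSON file per day.
                    "output": f"{slug}_{begin.isoformat()}.json",
                })
    return jobs
```

Each window covers exactly one day, so both collection periods expand to 11 daily windows per keyword.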

Twitter data needs to be cleaned up to ensure that text mining identifies valid and representative patterns of user emotions. The following tweets will be deleted:

a) Tweets with fewer than 3 words.

b) Tweets from users who have posted the same message more than once, because such accounts may be fake.

c) Tweets from users such as television media, newspapers, and scientists.

d) Tweets that contain a link.

e) Tweets that contain non-Latin characters.
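The five rules above can be sketched as a single filter. This is an illustration under stated assumptions, not the authors' code: tweets are assumed to be dictionaries with `user` and `text` keys, the set of media, newspaper and scientist accounts (`excluded_users`) is assumed to be supplied by hand, and a character counts as non-Latin when it is alphabetic and its Unicode name does not contain "LATIN".

```python
import re
import unicodedata
from collections import Counter

URL_RE = re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE)


def has_non_latin(text):
    """True if any alphabetic character is outside the Latin script."""
    return any(
        ch.isalpha() and "LATIN" not in unicodedata.name(ch, "")
        for ch in text
    )


def clean_tweets(tweets, excluded_users=frozenset()):
    """Apply rules a) to e) to a list of {'user', 'text'} dictionaries."""
    # Rule b: find users who posted the same text more than once.
    counts = Counter((t["user"], t["text"].strip().lower()) for t in tweets)
    repeat_users = {user for (user, _), n in counts.items() if n > 1}

    kept = []
    for tweet in tweets:
        text = tweet["text"]
        if len(text.split()) < 3:              # rule a: fewer than 3 words
            continue
        if tweet["user"] in repeat_users:      # rule b: repeated messages
            continue
        if tweet["user"] in excluded_users:    # rule c: media, scientists
            continue
        if URL_RE.search(text):                # rule d: contains a link
            continue
        if has_non_latin(text):                # rule e: non-Latin characters
            continue
        kept.append(tweet)
    return kept
```

Rule b is applied per user here, so every tweet from a repeating account is dropped, which follows the wording of the rule. Dropping only the duplicate messages would be a milder alternative.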

Hypothesis testing in this study is carried out in RStudio. The result is assessed by comparing the significance level (p-value) with the Type I error rate (alpha). If the significance level is less than alpha, the null hypothesis is rejected. If the significance level is greater than alpha, the null hypothesis is not rejected.

2.5 Data Processing

The data processing in this study is implemented by collecting tweets posted in response to the earthquakes, from September 23, 2018 to October 3, 2018 for Palu City, Central Sulawesi, Indonesia, and from 1 to 11 July 2019 for California, United States of America. The model consists of three main stages: data collection, data processing, and earthquake location mapping.

At the data collection stage, all tweets containing the keywords “Earthquake Palu” and “California Earthquake” were collected from Twitter for the specified period. The collected tweets are cleaned, grouped into several categories according to keywords, and then grouped by time within each keyword category. This is used mainly to find out what Twitter users post and when they post it. Tweets containing information about the location and time of the earthquake, based on reports from residents around the earthquake scene, were also used to represent the points where the earthquake occurred.

The data collection in this study uses Python with tweepy and twitterscraper to get tweets from Twitter. Things that need to be considered in data collection are keywords, start date, end date, and output. The results taken on Twitter are in the form of a json file consisting of columns and rows. The lines in the file consist of tweets from Twitter social media users in 1 day based on the scraped keywords. The keywords are “Gempa”

At the data processing stage in this study, the collected tweets are cleaned, grouped by category, and grouped by time.

The collected tweets are cleaned by deleting words with fewer than 3 characters, words with more than 16 characters, URLs, and non-Latin characters. The data is then cleaned further by deleting tweets with fewer than 3 words, tweets from users who had posted the same message more than once, and tweets from users such as television media, newspapers, and scientists.

Next, the processed tweets are grouped by category. The frequency of Twitter social media users is then analyzed against the level of education in Indonesia and the United States. The results of this analysis answer the first problem formulation and test the hypothesis of this study.

3 Result and Discussion

Scraping Twitter social media from 23 September 2018 to 3 October 2018 for natural disasters in Indonesia yielded 16,475 tweets, and scraping from 1 to 11 July 2019 for natural disasters in the United States yielded 5,870 tweets. Tweets from the United States are shown in Table 1 and tweets from Indonesia in Table 2. At the peak of the earthquake in Indonesia on September 28, 2018, there were 4,375 tweets. At the height of the earthquake in the United States on July 6, 2019, there were 407 tweets. These counts refer to tweets that have been cleaned.

Table 1. Number of Tweets from the United States

Date            Number of Tweets (RAW)   Filtered: Texts   Filtered: Pictures   Filtered: Number of Tweets
01 July 2019    1786                     10                9                    19
02 July 2019    1692                     41                12                   53
03 July 2019    1834                     9                 16                   25
04 July 2019    17088                    2688              1231                 3919
05 July 2019    1369                     113               120                  233
06 July 2019    1752                     230               177                  407
07 July 2019    1783                     174               135                  309
08 July 2019    3631                     166               194                  360
09 July 2019    3277                     130               124                  254
10 July 2019    2047                     79                93                   172
11 July 2019    2532                     43                76                   119

Analysis of tweet data on Twitter social media in Indonesia and the United States clarifies the comparison between the two countries, although it is limited by the models, methods and data used. The discussion is based on the results of tweet analysis at each stage of the model being developed. It provides a deeper explanation of the comparison of users between Indonesia and the United States based on the phases in the model and the characteristics of each country.

Table 2. Number of Tweets from Indonesia

Date                 Number of Tweets (RAW)   Filtered: Texts   Filtered: Pictures   Filtered: Number of Tweets
24 September 2018    3188                     853               310                  4351
25 September 2018    2799                     835               255                  1090
26 September 2018    2380                     598               250                  848
27 September 2018    2501                     537               319                  856
28 September 2018    7156                     4049              326                  4375
29 September 2018    3086                     507               95                   602
30 September 2018    4594                     1334              651                  1985
01 October 2018      1098                     278               100                  378
02 October 2018      1958                     607               184                  791
03 October 2018      1075                     525               201                  726
04 October 2018      1534                     316               157                  473

3.1 Word Analysis

After the tweet data collection phase, the next phase is analyzing the words of the collected tweets. This word analysis provides an overview of Twitter social media users in Indonesia and the United States. However, the data used in this word analysis were only taken at the time of the earthquake, namely on July 6, 2019 in the United States and September 28, 2018 in Indonesia.

In Table 3, the word “earthquake” occurs 524 times in tweets from the United States and the word “gempa” occurs 4740 times in tweets from Indonesia. This shows that Twitter social media users in Indonesia used the word "gempa" in their updates more often than users in the United States used the word "earthquake".

Table 3 Tweet word analysis

Word (US)       Total (US)   Word (ID)   Total (ID)
earthquak       524          gempa       4740
california      66           jogja       649
ridgecrest      48           lombok      543
earthquakela    42           yang        420
71              39           lagi        402
peopl           29           tidur       366
night           28           kerasa      303
like            28           barusan     265
quak            25           semoga      251
time            23           allah       226
big             23           tapi        222
im              22           baru        216
safe            22           korban      215
2               21           ternyata    211
look            20           untuk       202
becaus          20           dari        196
dure            19           tadi        190
today           19           kalo        184
hit             19           juga        183
aftershock      19           masih       180
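A word count like the one in Table 3 can be sketched with the standard library. The tokenizer and the function names are assumptions. Entries such as "earthquak" and "peopl" suggest that a stemmer was applied before counting; this sketch leaves that step out, so it counts whole words.

```python
import re
from collections import Counter

URL_RE = re.compile(r"https?://\S+")
TOKEN_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Lowercase the text, drop URLs, and split it into alphanumeric tokens."""
    return TOKEN_RE.findall(URL_RE.sub(" ", text.lower()))


def top_words(texts, n=20):
    """Return the n most frequent tokens across all tweet texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts.most_common(n)
```

Calling `top_words(tweets, 20)` on the cleaned tweets of one country returns (word, count) pairs in descending order of frequency, the same shape as one half of Table 3.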

3.2 Analysis of the Frequency of Twitter Social Media Users on Education Levels in Indonesia and the United States

The histogram in Figure 2 shows that the tweet data and education levels in Indonesia and the United States are not normally distributed. The test used for this data is therefore the Spearman rank correlation test, run in RStudio.

Figure 2 Data Distribution

To calculate the correlation value, we used R. We correlated the tweet data with the education level for each country, using cor.test(x, y, method = "spearman") as implemented in R.

The general form is

    cor.test(DataTwitter, DataLevelofEducation, method = "spearman")

The input for Indonesia is as follows

    > cor.test(indo$Tweet, indo$Pendidikan, method = "spearman")

The output obtained for Indonesia is as follows

        Spearman's rank correlation rho

    data:  indo$Tweet and indo$Pendidikan
    S = 218, p-value = 0.9892
    alternative hypothesis: true rho is not equal to 0
    sample estimates:
            rho
    0.009090909

The input for the United States is as follows

    > cor.test(ustweet$Tweet, ustweet$Pendidikan, method = "spearman")

The output obtained for the United States is as follows

        Spearman's rank correlation rho

    data:  ustweet$Tweet and ustweet$Pendidikan
    S = 170, p-value = 0.5031
    alternative hypothesis: true rho is not equal to 0
    sample estimates:
          rho
    0.2272727

In this study, a correlation test was carried out in which the dependent variable was the level of education and the independent variable was the number of tweets in Indonesia and the United States. The Spearman correlation test of Twitter social media users against the level of education gave a rho value of 0.009 (p-value = 0.9892) in Indonesia and a rho value of 0.227 (p-value = 0.5031) in the United States. Because both correlations are weak and neither p-value indicates significance, we conclude that there is no relationship between education level and Twitter social media users in Indonesia and the United States.
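The reported values can be cross-checked with a short standard-library sketch. It assumes n = 11 observations per country, inferred from the 11 daily rows in Tables 1 and 2 rather than stated in the text, and uses the identity rho = 1 - 6S / (n(n^2 - 1)), where S is the sum of squared rank differences. The function names are illustrative, and `spearman_rho` is a plain re-implementation, not the R routine.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def rho_from_s(s, n):
    """Recover rho from the S statistic printed by cor.test."""
    return 1 - 6 * s / (n * (n * n - 1))


N_DAYS = 11  # assumption: one observation per day in each country
rho_indonesia = rho_from_s(218, N_DAYS)      # reported S for Indonesia
rho_united_states = rho_from_s(170, N_DAYS)  # reported S for the United States
```

With n = 11 the identity reproduces the reported rho of 0.009090909 for Indonesia and 0.2272727 for the United States, which supports the assumption that each test used the 11 daily observations.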


Figure 3 Scatterplot

Figure 3 shows the scatterplot of Twitter users in Indonesia and the United States against the level of education. The points lie far from the fitted linear line, which means that the relationship between the educational level variable and Twitter social media users is very weak.

4 Conclusion

For Twitter users in Indonesia and the United States, the frequency with which users respond before and after natural disasters shared on Twitter social media shows no difference with respect to the level of education. Twitter social media users in the United States share more geospatial information content than Twitter social media users in Indonesia.

This study uses only Twitter social media to collect data. Further research could use other social media, such as Facebook, Instagram, or Tumblr, as sources of information about natural disasters. The data collection method can also influence the results of the analysis, so other methods may be more accurate.

References

1. Siswahyudi, P. Pengembangan Kerangka Kerja Untuk Manajemen Bencana Alam Tanah Longsor Tahap Kesiapsiagaan Di Kota Batu. Published online 2019.

2. Lovari A, Bowen SA. Social Media In Disaster Communication: A Case Study Of Strategies, Barriers, And Ethical Implications. J Public Aff. 2020;20(1):1-9. doi:10.1002/pa.1967

3. Martínez-Rojas M, Pardo-Ferreira M del C, Rubio-Romero JC. Twitter As A Tool For The Management And Analysis Of Emergency Situations: A Systematic Literature Review. Int J Inf Manage. 2018;43(April):196-208. doi:10.1016/j.ijinfomgt.2018.07.008

4. Slamet C, Rahman A, Sutedi A, Darmalaksana W, Ramdhani MA, Maylawati DS. Social Media-Based Identifier for Natural Disaster. IOP Conf Ser Mater Sci Eng. 2018;288(1). doi:10.1088/1757-899X/288/1/012039

5. Huang Q, Xiao Y. Geographic Situational Awareness: Mining Tweets For Disaster Preparedness, Emergency Response, Impact, And Recovery. ISPRS Int J Geo-Information. 2015;4(3):1549-1568. doi:10.3390/ijgi4031549

6. Sassa S, Takagawa T. Liquefied Gravity Flow-Induced Tsunami: First Evidence And Comparison From The 2018 Indonesia Sulawesi Earthquake And Tsunami Disasters. Landslides. 2019;16(1):195-200. doi:10.1007/s10346-018-1114-x

7. Hernandez-Suarez A, Sanchez-Perez G, Toscano-Medina K, et al. Using Twitter Data to Monitor Natural Disaster Social Dynamics: A Recurrent Neural Network Approach with Word Embeddings and Kernel Density Estimation. Sensors. 2019;19(7):1746. doi:10.3390/s19071746

8. Cvetojevic S, Hochmair HH. Analyzing The Spread Of Tweets In Response To Paris Attacks. Comput Environ Urban Syst. 2018;71(March):14-26. doi:10.1016/j.compenvurbsys.2018.03.010

9. Gunawong P, Thongpapanl N, Ferreira CC. A Comparative Study Of Twitter Utilization In Disaster Management Between Public And Private Organizations. J Public Aff. 2019;19(4). doi:10.1002/pa.1932

10. Lansley G, Longley PA. The Geography Of Twitter Topics In London. Comput Environ Urban Syst. 2016;58:85-96. doi:10.1016/j.compenvurbsys.2016.04.002

11. Negara ES, Andryani R, Saksono PH. Analisis Data Twitter: Ekstraksi dan Analisis Data Geospasial. J INKOM. 2016;10(1):27. doi:10.14203/j.inkom.433
