ALGORITHMS
Muhammad Arfan1*, Muhammad Syahid Pebriadi 2 Politeknik Negeri Banjarmasin, Banjarmasin, Indonesia 1,2 Sistem Komputer, STMIK Handayani Makassar, Makassar, Indonesia1 E-mail address: [email protected] 1, [email protected] 2
Received: 25, November, 2022 Revised: 04, December, 2022 Accepted: 05, December, 2022
ABSTRACT
The Covid-19 pandemic has spread rapidly, causing a severe health crisis worldwide, including in Indonesia. Because Indonesia’s population is diverse, the number of cases differs between cities. A clustering of the case distribution is therefore needed to map Covid-19 cases and support optimal handling of the pandemic. In this study, DBSCAN was used to cluster the spread of Covid-19 in Palopo City.
Clustering with the DBSCAN algorithm identifies the clusters that form and the locations where Covid-19 spreads in particular areas. The clustering results show that the largest cluster has a total of 432 cases over the study period, with an average of 15 cases per day. After visualization, most Covid-19 cases were found in the central area of Palopo City, concentrated in Wara District with 250 cases.
Keywords: clustering, DBSCAN, Covid-19 spread.
1. INTRODUCTION
The Covid-19 pandemic has spread rapidly, causing a severe health crisis worldwide, including in Indonesia. Because Indonesia’s population is diverse, the number of cases differs between cities. A clustering of the case distribution is therefore needed to map Covid-19 cases and support optimal handling of the pandemic.
The spread of Covid-19 poses many dangers, so it requires strict policies and dedicated planning (Mahmoudi et al., 2020; Aman et al., 2022). This indicates the need for further measures to contain the spread of the virus. One step that can help overcome the Covid-19 outbreak is to determine accurately which areas have a large number of positive cases and which have few. Cluster analysis can inform the policies and strategies adopted in handling the outbreak.
Previous research compared the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and K-Means algorithms to find which gives a more accurate and appropriate grouping of Covid-19 pandemic cases, as an alternative for analyzing conditions in decision making (Fitri et al., 2020). That study found the DBSCAN method to be more relevant to this pandemic case.
This research analyzed the spread of Covid-19 in Palopo City, South Sulawesi Province, in 2021 using the DBSCAN algorithm. DBSCAN was used to cluster the distribution of Covid-19 cases in Palopo City in order to obtain the distribution areas of cases based on the clusters formed. Applying DBSCAN to the clustering process is expected to produce information useful for handling the outbreak.
2. THEORY
2.1. Covid-19 Spread
Covid-19 was first identified in December 2019 in Wuhan, China, and quickly spread to 24 other countries (Kucharski et al., 2020). In March 2020 the World Health Organization declared Covid-19 a pandemic. Indonesia's first positive case of Covid-19 was identified on March 2, 2020, and as of March 17, 2020, there were 134 confirmed cases spread across eight provinces, namely Bali, Banten, DKI Jakarta, West Java, Central Java, West Kalimantan, North Sulawesi and Yogyakarta (Swastika, 2020). In Indonesia, with its diverse topography and population, Covid-19 cases have occurred across its 34 provinces.
2.2. DBSCAN Algorithm
Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is a clustering method that groups points based on the density of data in an area. The DBSCAN algorithm is designed to find clusters and noise in spatial data (Ester et al., 1996). It requires two input parameters, namely the epsilon distance (Eps) and the minimum number of points (MinPts). Eps is the radius around a point within which other points count as its neighbors, while MinPts is the minimum number of points that must lie within that radius for the point to be the center (core point) of a cluster.
The set of points lying within distance Eps of a point is called its Eps-neighborhood.
The DBSCAN algorithm proceeds as follows (Han et al., 2012):
1) Choose an initial point p at random.
2) Retrieve the points in the Eps-neighborhood of p.
3) If the number of points from step 2 meets MinPts, set p as a core point (center) and form a cluster.
4) If the number of points from step 2 does not meet MinPts, mark p as a non-core point (a border point or noise) and choose the next point.
5) Repeat steps 2 to 4 until all points have been processed and no point can be added to any cluster.
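The steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the implementation used in this study. Several choices are assumptions made for the example: points are visited in index order rather than at random, a point with too few neighbors is first marked as noise and relabeled as a border point if a core point later reaches it, noise is labeled 0, and the sample points and parameter values are invented.

```python
import math

NOISE = 0          # label for noise (assumption: 0, clusters numbered from 1)
UNVISITED = None   # marker for points not yet processed


def region_query(points, idx, eps):
    """Return indices of all points within distance eps of points[idx], itself included."""
    return [j for j, q in enumerate(points) if math.dist(points[idx], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label every point with a cluster id (1, 2, ...) or NOISE."""
    labels = [UNVISITED] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE          # provisional; may become a border point later
            continue
        cluster_id += 1                # i is a core point, so a new cluster starts
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id     # noise reached from a core point becomes border
            if labels[j] is not UNVISITED:
                continue                   # already assigned to a cluster
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point, keep expanding
    return labels


# Invented sample: two dense groups of four points and one isolated point.
sample = [(0.0, 0.0), (0.0, 0.1), (0.1, 0.0), (0.1, 0.1),
          (5.0, 5.0), (5.0, 5.1), (5.1, 5.0), (5.1, 5.1),
          (9.0, 0.0)]
print(dbscan(sample, eps=0.2, min_pts=3))
```

The neighborhood search here scans all points for every query, which is simple but quadratic in the number of points; a spatial index would be the usual choice for larger data sets.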
2.3. Previous Research
The first study implemented the DBSCAN algorithm for the surveillance of dengue fever (Nandana et al., 2019). The algorithm was designed to identify clusters of different shapes and patterns. The groupings of dengue fever cases found were identified as hotspots. The sectors identified by the grouping are considered to have a low socio-economic standard of living. This study is useful for public health workers, epidemiologists and health practitioners in taking proactive action to control the spread of dengue fever outbreaks.
The second study aimed to analyze disease-prone areas in order to support policies for providing counseling to the appropriate areas (Hermanto & Sunandar, 2020). The analytical method used is Knowledge Discovery in Database (KDD). The disease data are grouped by a clustering calculation that divides the area into clusters based on the location of the disease distribution points. The result of this research is a dashboard of the analysis of disease distribution data using the DBSCAN algorithm, which includes a map of disease distribution, disease clustering results, the percentage of disease distribution per sub-district, and a comparison of minimum points and minimum epsilon. This analysis can serve as a reference for the policies to be taken in decision making.
The third study used a clustering technique to group countries with similar case patterns, so that other countries in the same group can serve as recommendations for treatment within a country (Nurhaliza & Mustakim, 2021). In that study the DBSCAN algorithm was used to obtain clustering results and clustering validity. The result of the research is the clustering validity value, from which the optimal cluster of the clustering process can be identified.
3. METHOD
This research was conducted in several stages. The stages of this research can be seen in Figure 1.
Figure 1. Stages of research
This research began with data collection, followed by processing of the records of Covid-19 spread in Palopo City in 2021. The data were then pre-processed, leaving only the address data of Covid-19 patients. The address data were converted into latitude and longitude coordinates. These steps provide the attributes used for clustering with the DBSCAN algorithm. The distance between points is calculated from the latitude and longitude attributes. Distance calculation in this study uses the Euclidean equation, with the following formula:
$$E(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \tag{1}$$
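As an illustration of Equation (1), the short Python sketch below computes the Euclidean distance between two coordinate pairs. The function name and the coordinate values are hypothetical and are not taken from the study data. Because the inputs are decimal degrees, the result is also in degrees; since one degree of latitude is roughly 111 km, a distance of 0.01 corresponds to roughly 1 km.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length coordinate tuples, as in Eq. (1)."""
    if len(x) != len(y):
        raise ValueError("points must have the same number of attributes")
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


# Hypothetical (latitude, longitude) pairs in decimal degrees, for illustration only.
a = (-3.000, 120.200)
b = (-3.003, 120.204)
d = euclidean(a, b)  # distance in degrees, not kilometres
print(round(d, 6))
```

Treating latitude and longitude as planar coordinates is an approximation that is reasonable over a single city; over larger areas a great-circle distance would be more appropriate.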
Clustering with the DBSCAN algorithm identifies the clusters formed and the locations of Covid-19 distribution in particular areas. The result is a visualization of the distribution map of Covid-19 in Palopo City.
4. RESULTS AND DISCUSSION
Based on data on the spread of Covid-19 obtained from a survey at the South Sulawesi Provincial Health Office, in July 2021 Palopo City was one of the areas with the highest number of Covid-19 cases in South Sulawesi Province. An example of Covid-19 patient data for Palopo City can be seen in Table 1 below:
Table 1. Example Data of Covid-19 in Palopo City
Date Confirmed   Sex   Age   Address                        Status
01/07/2021       M     26    Jl. Tandipau                   Recovered
02/07/2021       F     31    Perum Citra Malian Pongtiku    Recovered
03/07/2021       M     66    Jl. Mannennungeng              Died
07/07/2021       M     48    Jl. Andi Tenriadjeng           Died
08/07/2021       F     27    Jl. Andi Djemma                Recovered
Table 1 shows a sample of the data on patients confirmed with Covid-19 in Palopo City. In July 2021 there were 471 positive cases of Covid-19 in Palopo City, with 449 recovered patients and 22 deaths. Figure 2 shows the distribution of Covid-19 in Palopo City throughout July 2021.
Figure 2. Data of Covid-19 in Palopo City in July 2021
Figure 2 graphs the number of cases (vertical axis) against the date confirmed (horizontal axis) throughout July 2021 in Palopo City. The highest number of Covid-19 cases occurred on July 30, with 58 cases, while no cases were confirmed on July 4, 5 and 11. Clustering was carried out on the Covid-19 distribution data to find clusters of Covid-19 spread. Areas where the spread of Covid-19 is concentrated indicate areas that are prone to transmission. Figure 3 shows a plot of the distribution data converted into longitude and latitude coordinates.
Figure 3. Plot of Covid-19 Spread
Figure 4. Plot of Covid-19 Spread using DBSCAN
Figure 4 shows the result of clustering the Covid-19 data for Palopo City with the DBSCAN algorithm, using Eps = 0.01 and MinPts = 5. Three clusters and a noise group are formed. The first cluster contains the most cases, 432; the second cluster has 22 cases; and the third cluster has the fewest, 6 cases. Noise, marked with the number 0 in Figure 4, accounts for 11 cases. The distribution area for Covid-19 in Palopo City for each cluster can be seen in Table 2 below:
Table 2. Region of Covid-19 Spread in Palopo City
Cluster   Region          Case
1         Mungkajang      10
          Wara            250
          Wara Barat      11
          Wara Selatan    35
          Wara Timur      86
          Wara Utara      40
2         Bara            22
3         Sendana         6
0         Bara            1
          Sendana         5
          Telluwanua      1
          Wara Barat      1
          Wara Selatan    1
          Wara Timur      2
Table 2 lists the spread of Covid-19 by sub-district in Palopo City. Cluster 1 covers 6 distribution areas, with the highest number of cases in Wara District (250 cases). Clusters 2 and 3 each occur in one distribution area: cluster 2 in Bara District and cluster 3 in Sendana District. Cluster 0 contains 11 cases that are outliers, treated as noise, spread over 6 distribution areas. Figure 5 shows the cluster areas of the Covid-19 spread in Palopo City after being mapped in Google Maps.
Figure 5. Region Cluster of Covid-19 Spread in Palopo City
Figure 5 is a visualization of the Covid-19 distribution areas in Palopo City. Most Covid-19 cases are in the central area of Palopo City, concentrated in Wara District with 250 cases.
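The per-cluster totals reported for Table 2 can be cross-checked with a short Python sketch. The counts are transcribed from Table 2; the variable names are chosen for this example, and the share of cluster 1 found in Wara District is a derived figure that the study does not report directly.

```python
# Case counts per cluster and sub-district, transcribed from Table 2 (cluster 0 = noise).
table2 = {
    1: {"Mungkajang": 10, "Wara": 250, "Wara Barat": 11,
        "Wara Selatan": 35, "Wara Timur": 86, "Wara Utara": 40},
    2: {"Bara": 22},
    3: {"Sendana": 6},
    0: {"Bara": 1, "Sendana": 5, "Telluwanua": 1,
        "Wara Barat": 1, "Wara Selatan": 1, "Wara Timur": 2},
}

# Sum the sub-district counts within each cluster, then across all clusters.
cluster_totals = {c: sum(regions.values()) for c, regions in table2.items()}
grand_total = sum(cluster_totals.values())

# Fraction of cluster 1 that lies in Wara District (derived, not reported in the study).
wara_share = table2[1]["Wara"] / cluster_totals[1]

print(cluster_totals)        # {1: 432, 2: 22, 3: 6, 0: 11}
print(grand_total)           # 471
print(f"{wara_share:.1%}")   # 57.9%
```

The totals of 432, 22, 6 and 11 cases match the cluster sizes reported for Figure 4, and their sum matches the 471 positive cases recorded for July 2021.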
5. CONCLUSIONS AND SUGGESTIONS
The information obtained from the clustering process is the number of clusters and the noise, as well as the areas most affected by the spread of Covid-19. The results of clustering using DBSCAN show that the largest cluster has a total of 432 cases over the study period, with an average of 15 cases per day. After visualization, most Covid-19 cases were found in the central area of Palopo City, concentrated in Wara District with 250 cases. Future research is expected to use data covering a longer period, such as several months or years, and to add variables related to the time of each case, so that the distribution pattern of Covid-19 can be identified and better results obtained.
REFERENCES
Aman, A., Angriawan, R., Zulfikar, & Amiruddin, E. G. (2022). Android-Based Car Tire Pressure Monitoring System. Ceddi Journal of Information System and Technology (JST), 1(1).
Ester, M., Kriegel, H.-P., Sander, J., & Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining, 635–654. https://doi.org/10.1016/B978-044452701-1.00067-3
Fitri, I., Asmar, R., & Rubhasy, A. (2020). Data Cluster Mapping Of Global Covid-19 Pandemic Based On Geo-Location. Jurnal Mantik, 4(1), 511–520.
Han, J., Kamber, M., & Pei, J. (2012). Data Mining: Concepts and Techniques. In Proceedings - 2013 International Conference on Machine Intelligence Research and Advancement, ICMIRA 2013 (Third Edit). Morgan Kaufmann. https://doi.org/10.1109/ICMIRA.2013.45
Hermanto, T. I., & Sunandar, M. A. (2020). Analisis Data Sebaran Penyakit Menggunakan Algoritma Density Based Spatial Clustering of Applications with Noise. Jurnal Sains Komputer Dan Teknologi Informasi, 3(1), 104–110. https://doi.org/10.33084/jsakti.v3i1.1775
Kucharski, A. J., Russell, T. W., Diamond, C., Liu, Y., Edmunds, J., Funk, S., Eggo, R. M., Sun, F., Jit, M., Munday, J. D., Davies, N., Gimma, A., van Zandvoort, K., Gibbs, H., Hellewell, J., Jarvis, C. I., Clifford, S., Quilty, B. J., Bosse, N. I., … Flasche, S. (2020). Early dynamics of transmission and control of COVID-19: a mathematical modelling study. The Lancet Infectious Diseases, 20(5), 553–558. https://doi.org/10.1016/S1473-3099(20)30144-4
Mahmoudi, M. R., Baleanu, D., Mansor, Z., Tuan, B. A., & Pho, K. H. (2020). Fuzzy clustering method to compare the spread rate of Covid-19 in the high risks countries. Chaos, Solitons and Fractals, 140(August). https://doi.org/10.1016/j.chaos.2020.110230
Nandana, G. M., Mala, S., & Rawat, A. (2019). Hotspot detection of dengue fever outbreaks using DBSCAN Algorithm. Proceedings of the 9th International Conference On Cloud Computing, Data Science and Engineering, Confluence 2019, 158–161. https://doi.org/10.1109/CONFLUENCE.2019.8776916
Nurhaliza, N., & Mustakim. (2021). Clustering of Data Covid-19 Cases in the World Using DBSCAN Algorithms Pengelompokan Data Kasus Covid-19 di Dunia Menggunakan Algoritma. Indonesian Journal Of Informatic Research and Software Engineering, 1(1), 1–8.
Swastika, W. (2020). Studi Awal Deteksi Covid-19 Menggunakan Citra Ct Berbasis Deep Preliminary Study of Covid-19 Detection Using Ct Image Based on. Jurnal Teknologi Informasi Dan Ilmu Komputer, 7(3), 629–634. https://doi.org/10.25126/jtiik.202073399