PAPER • OPEN ACCESS

Utilization of the field of data mining in mapping the area of the Human Development Index (HDI) in Indonesia

To cite this article: Adi Rahmat et al 2021 J. Phys.: Conf. Ser. 1783 012035


Adi Rahmat1, H Hardi1, Febrizal Alfarasy Syam1, Z Zamzami1, Bayu Febriadi1, Agus Perdana Windarto2*

1Universitas Lancang Kuning, Indonesia

2STIKOM Tunas Bangsa, Pematangsiantar, Indonesia

Email: [email protected], *[email protected]

Abstract. The Human Development Index (HDI) is an indicator used by a country to measure its success in developing human quality. The purpose of this research is to map HDI across regions of Indonesia using data mining techniques. The data come from the official records of the Indonesian Central Statistics Agency (https://www.bps.go.id/) and consist of 34 records (2018-2019). The indicators used in the mapping are Life Expectancy at Birth (X1), Expected Years of Schooling (X2) and Mean Years of Schooling (X3). The data mining technique used is a clustering method, namely K-Medoids, and the analysis is carried out with RapidMiner 5.3. The number of clusters (k) is determined with the Davies-Bouldin Index (DBI), which gives the optimal value at k = 4 (DBI = 0.856). Using four mapping labels (C1 = "very high" group; C2 = "high" group; C3 = "medium" group; C4 = "low" group), the results are C1 = 5 provinces; C2 = 16 provinces; C3 = 10 provinces and C4 = 3 provinces. Based on the mapping of regions in Indonesia, Indonesia's HDI is still far behind that of other ASEAN countries. The results will be submitted to the government so that HDI becomes a priority, because it concerns the welfare and quality of the Indonesian people.

1. Introduction

The Human Development Index (HDI) is a comparative measure used to classify a country as developed, developing or lagging. HDI can also serve as an indicator of success in developing human quality. In Indonesia, HDI data are managed officially by the Indonesian Central Statistics Agency (BPS) through its website, https://www.bps.go.id/. According to the Official Statistics News report published by BPS, human development achievement is measured on three criteria: a long and healthy life, knowledge, and a decent standard of living. Using a dataset sourced from BPS, this study aims to map regional human development achievement with computational techniques [1]–[3]. Computer science [4]–[10] offers several techniques for mapping, one of which is data mining [11]. Data mining extracts information from existing data (databases, big data) to produce knowledge [12]–[16]. Its techniques cover several kinds of task, such as clustering, classification, estimation, association and forecasting [11], [17]–[19]. Within data mining, clustering is a well-known approach to mapping, with methods including improved k-means, k-medoids (PAM), fuzzy c-means, DBSCAN, CLARANS and fuzzy subtractive clustering [20]. The study in [21] states that the k-medoids method emerged to overcome the weakness of the k-means method, which is sensitive to outliers [22]. In addition, k-medoids is reported to be the best approach for grouping data [23]. The study in [24] examined the selection of study programs at XYZ University; there, k-medoids achieved its best Silhouette Coefficient value of 0.690754 with a total of three clusters and a total of 15. On this basis, k-medoids offers a way to identify areas with a low human development index so that they can receive an early response.
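The outlier sensitivity mentioned above can be illustrated with a minimal sketch. The values below are hypothetical and are not taken from the paper; the helper name `medoid` is also an illustrative choice. A k-means centre is a mean, which a single extreme value can pull far from the bulk of the data, whereas a k-medoids centre must be one of the observed points.

```python
import numpy as np

def medoid(points: np.ndarray) -> np.ndarray:
    """Return the observed point with the smallest total distance to all others."""
    # Pairwise Euclidean distances between all rows.
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=2))
    return points[dist.sum(axis=1).argmin()]

# Hypothetical one-dimensional values with a single outlier (not from the paper).
values = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])

centroid = values.mean(axis=0)   # k-means style centre, pulled towards the outlier
center_medoid = medoid(values)   # k-medoids style centre, stays within the bulk

print("mean:", centroid[0])         # 22.0
print("medoid:", center_medoid[0])  # 3.0
```

The mean lands at 22.0, far from any of the four typical values, while the medoid remains at 3.0.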

2. Methodology

This research uses secondary data sourced from the Indonesian Central Statistics Agency. The data describe human development achievement by region for the 34 provinces of Indonesia. The criteria used are Life Expectancy at Birth (X1), Expected Years of Schooling (X2) and Mean Years of Schooling (X3). Before processing with RapidMiner 5.3, the data are cleaned to maximize the cluster results.

Table 1 lists the processed data, and Figure 1 shows the steps in implementing the k-medoids method.

Table 1. Human development index data by region in Indonesia (2018-2019).

No  Province              Life expectancy at birth  Expected years of schooling  Mean years of schooling
1   Aceh                  69.76                     14.29                        9.14
2   North Sumatra         68.78                     13.15                        9.40
3   West Sumatra          69.16                     13.98                        8.84
4   Riau                  71.34                     13.13                        8.98
5   Jambi                 70.98                     12.92                        8.34
6   South Sumatra         69.53                     12.38                        8.09
7   Bengkulu              69.03                     13.59                        8.67
8   Lampung               70.35                     12.62                        7.87
9   Kep. Bangka Belitung  70.34                     11.91                        7.91
10  Riau Islands          69.72                     12.83                        9.90
11  DKI Jakarta           72.73                     12.96                        11.06
12  West Java             72.76                     12.47                        8.26
13  Central Java          74.21                     12.66                        7.44
14  DI Yogyakarta         74.87                     15.57                        9.35
15  East Java             71.08                     13.13                        7.49
16  Banten                69.74                     12.87                        8.68
17  Bali                  71.84                     13.25                        8.75
18  West Nusa Tenggara    66.08                     13.48                        7.15
19  East Nusa Tenggara    66.62                     13.13                        7.43
20  West Kalimantan       70.37                     12.57                        7.22
21  Central Kalimantan    69.67                     12.56                        8.44
22  South Kalimantan      68.36                     12.51                        8.10
23  East Kalimantan       74.09                     13.68                        9.59
24  North Kalimantan      72.52                     12.83                        8.91
25  North Sulawesi        71.42                     12.71                        9.34
26  Central Sulawesi      68.01                     13.14                        8.64
27  South Sulawesi        70.26                     13.35                        8.14
28  Southeast Sulawesi    70.85                     13.54                        8.80
29  Gorontalo             67.69                     13.05                        7.58
30  West Sulawesi         64.70                     12.61                        7.62
31  Maluku                65.71                     13.93                        9.70
32  North Maluku          67.99                     13.63                        8.86
33  West Papua            65.73                     12.63                        7.36
34  Papua                 65.51                     10.94                        6.59

Source: BPS

Figure 1 shows the research methodology. It starts with a literature study of theories relevant to the problem. The next step is to collect the research data, here the human development index data obtained from the official Indonesian government website (https://www.bps.go.id/). The data obtained from BPS are then cleaned to maximize the clustering results. The cleaned data serve as input for the k-medoids method, which produces the HDI mapping of regions in Indonesia. The results are then analysed to determine whether the goals have been achieved, conclusions are drawn from that analysis, and suggestions are given for future research.

Figure 1. Research Methodology
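The clustering step of this methodology can be sketched in Python. This is an illustration only: the paper runs k-medoids in RapidMiner 5.3, whereas the code below is a simple alternating variant (assign points to the nearest medoid, then re-pick each cluster's medoid), not the full PAM swap search and not RapidMiner's operator. The rows are a subset of Table 1; the function names, the seed and k = 2 are assumptions made for the example.

```python
import numpy as np

# Subset of Table 1: life expectancy, expected years of schooling, mean years of schooling.
provinces = ["Aceh", "DKI Jakarta", "DI Yogyakarta", "East Kalimantan",
             "West Sulawesi", "West Papua", "Papua", "Bali"]
X = np.array([
    [69.76, 14.29, 9.14],
    [72.73, 12.96, 11.06],
    [74.87, 15.57, 9.35],
    [74.09, 13.68, 9.59],
    [64.70, 12.61, 7.62],
    [65.73, 12.63, 7.36],
    [65.51, 10.94, 6.59],
    [71.84, 13.25, 8.75],
])

def pairwise_distances(X):
    """Euclidean distance between every pair of rows."""
    diffs = X[:, None, :] - X[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=2))

def k_medoids(X, k, max_iter=100, seed=0):
    """Alternating k-medoids; returns medoid row indices, labels and total cost."""
    rng = np.random.default_rng(seed)
    dist = pairwise_distances(X)
    medoids = rng.choice(len(X), size=k, replace=False)  # random initial medoids
    for _ in range(max_iter):
        labels = dist[:, medoids].argmin(axis=1)          # assign to nearest medoid
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:                         # keep old medoid if cluster is empty
                continue
            # New medoid: the member with the smallest total distance to its cluster.
            within = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[within.argmin()]
        if np.array_equal(new_medoids, medoids):          # converged
            break
        medoids = new_medoids
    labels = dist[:, medoids].argmin(axis=1)
    cost = dist[np.arange(len(X)), medoids[labels]].sum()
    return medoids, labels, cost

medoids, labels, cost = k_medoids(X, k=2)
for c, m in enumerate(medoids):
    members = [p for p, lab in zip(provinces, labels) if lab == c]
    print(f"cluster {c}: medoid = {provinces[m]}, members = {members}")
```

The sketch clusters the raw values, so life expectancy, which has the largest spread, dominates the distances. The source does not state whether the indicators were rescaled before clustering.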

3. Results and Discussion

Before the k-medoids analysis of the data in Table 1 with RapidMiner 5.3, the number of clusters (k) is determined using the Davies-Bouldin Index (DBI), which serves as the reference for the best cluster mapping. Tests are carried out for k = 2, k = 3 and k = 4. The results are shown in Figure 2.

Figure 2. DBI calculation results
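For readers unfamiliar with the Davies-Bouldin Index, the following is a minimal sketch of its common centroid-based definition: for each cluster, take the worst ratio of summed within-cluster scatter to between-centre distance, then average over clusters. RapidMiner's implementation may differ in detail, so values from this sketch are not expected to reproduce the figures reported here. The toy data and the function name are hypothetical.

```python
import numpy as np

def davies_bouldin(X, labels):
    """Centroid-based Davies-Bouldin Index; needs at least two clusters with distinct centres."""
    clusters = np.unique(labels)
    centers = np.array([X[labels == c].mean(axis=0) for c in clusters])
    # Scatter: mean distance of each cluster's points to its own centre.
    scatter = np.array([
        np.linalg.norm(X[labels == c] - centers[i], axis=1).mean()
        for i, c in enumerate(clusters)
    ])
    k = len(clusters)
    worst = np.zeros(k)
    for i in range(k):
        ratios = [
            (scatter[i] + scatter[j]) / np.linalg.norm(centers[i] - centers[j])
            for j in range(k) if j != i
        ]
        worst[i] = max(ratios)  # most similar (worst separated) neighbour
    return worst.mean()

# Hypothetical toy data: two tight pairs of points that are far apart.
X = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
tight = davies_bouldin(X, np.array([0, 0, 1, 1]))  # natural grouping
loose = davies_bouldin(X, np.array([0, 1, 0, 1]))  # poor grouping

print(tight, loose)
```

The natural grouping gives a much smaller index than the poor one, which is the sense in which a smaller DBI indicates a better clustering.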

Figure 2 shows the DBI values for k = 2, 3 and 4; a smaller DBI indicates a better clustering. Here k = 4 gives the optimal value and is used as the number of clusters for the human development index. The cluster mapping labels are C1 = "very high" group; C2 = "high" group; C3 = "medium" group; C4 = "low" group. The k-medoids model and the mapping results from RapidMiner 5.3 are shown in Figures 3 and 4.

Figure 3. The k-medoids model in RapidMiner 5.3

Figure 4. Cluster Results of the k-medoids method

Figure 4 shows that mapping the human development index with the k-medoids method yields cluster C1 (cluster_2) = 5 provinces (DKI Jakarta, West Java, Central Java, DI Yogyakarta and East Kalimantan), cluster C2 (cluster_1) = 16 provinces, cluster C3 (cluster_0) = 10 provinces and cluster C4 (cluster_3) = 3 provinces (West Sulawesi, West Papua and Papua). The final centroid values and the mapping of the provinces obtained with RapidMiner are given in Table 2 and Figure 5.

Table 2. Final centroid value

Figure 5. Cluster results by region (province)

Based on the mapping of regions in Indonesia, HDI achievement in Indonesia has increased. However, the increase has not yet brought Indonesia to the middle or upper level. Indonesia's HDI is still far behind that of Singapore, Malaysia, Brunei, Thailand and the Philippines.

The mapping results indicate that human development is not yet equitably distributed in Indonesia: it is still dominated by the island of Java, while the central region (WITA) is still far behind (West Sulawesi, West Papua and Papua). In addition, the performance test of k-medoids for the number of clusters (k = 2 has a value of 0.856) is shown in the figure below:

Figure 6. K-medoids performance test results

4. Conclusion

Based on the research conducted, the k-medoids method can be implemented to map the human development index (HDI) in Indonesia. The criteria used for the HDI mapping are Life Expectancy at Birth (X1), Expected Years of Schooling (X2) and Mean Years of Schooling (X3). The mapping results show that Indonesia is still far behind other ASEAN countries, even though some provinces fall into the "very high" and "high" clusters (C1 = 14% and C2 = 47%). Taken as a whole, only 30% of provinces experienced a good increase in HDI based on the results of the k-medoids method.
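The cluster shares quoted above follow from the cluster sizes reported in the results (5, 16, 10 and 3 of 34 provinces). A short check of the arithmetic, with variable names chosen for illustration:

```python
# Cluster sizes as reported in the results section.
counts = {"C1": 5, "C2": 16, "C3": 10, "C4": 3}
total = sum(counts.values())  # 34 provinces

# Share of provinces in each cluster, in percent.
shares = {label: 100 * n / total for label, n in counts.items()}
for label, pct in shares.items():
    print(f"{label}: {pct:.1f}%")
```

This gives roughly 14.7%, 47.1%, 29.4% and 8.8%, so the quoted 14% and 47% correspond to the C1 and C2 shares truncated to whole percentages.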

References

[1] B. Febriadi, Z. Zamzami, Y. Yunefri, and A. Wanto, “Bipolar function in backpropagation algorithm in predicting Indonesia’s coal exports by major destination countries,” IOP Conf. Ser. Mater. Sci. Eng., vol. 420, no. 012089, pp. 1–9, 2018.

[2] Sunandar, A. Buchori, and N. D. Rahmawati, “Development of media kocerin (Smart box interactive) to learning mathematics in Junior High School,” Glob. J. Pure Appl. Math., 2016.

[3] Sunandar, A. Buchori, N. D. Rahmawati, and W. Kusdaryani, “Mobilemath (mobile learning math) media design with seamless learning model on analytical geometry course,” Int. J. Appl. Eng. Res., vol. 12, no. 19, pp. 8076–8081, 2017.

[4] I. G. I. Sudipa, C. Astria, K. F. Irnanda, and A. Perdana, “Application of MCDM using PROMETHEE II Technique in the Case of Social Media Selection for Online Businesses,” 2020.

[5] H. Pratiwi et al., “Sigmoid Activation Function in Selecting the Best Model of Artificial Neural Networks,” J. Phys. Conf. Ser., vol. 1471, no. 1, 2020.

[6] W. M. Sari et al., “Improving the Quality of Management with the Concept of Decision Support Systems in Determining Factors for Choosing a Cafe based on Consumers,” J. Phys. Conf. Ser., vol. 1471, no. 1, 2020.

[7] F. Rahman, I. I. Ridho, M. Muflih, and S. Pratama, “Application of Data Mining Technique using K-Medoids in the case of Export of Crude Petroleum Materials to the Destination Country,” 2020.

[8] N. Nasution et al., “Application of ELECTRE Algorithm in Skincare Product Selection,” J. Phys. Conf. Ser., vol. 1471, no. 1, 2020.

[9] R. H. S. Siburian, R. Karolina, P. T. Nguyen, E. L. Lydia, and K. Shankar, “Leaf disease classification using advanced SVM algorithm,” Int. J. Eng. Adv. Technol., vol. 8, no. 6 Special Issue, pp. 712–718, Aug. 2019.

[10] Sunandar, N. D. Rahmawati, A. Wibisono, and A. Buchori, “Development of game education basic virtual augmented reality in geometry learning,” Test Eng. Manag., 2020.

[11] B. Supriyadi, A. P. Windarto, T. Soemartono, and Mungad, “Classification of natural disaster prone areas in Indonesia using K-means,” Int. J. Grid Distrib. Comput., vol. 11, no. 8, pp. 87–98, 2018.

[12] Sudirman, A. P. Windarto, and A. Wanto, “Data mining tools | rapidminer: K-means method on clustering of rice crops by province as efforts to stabilize food crops in Indonesia,” IOP Conf. Ser. Mater. Sci. Eng., vol. 420, p. 012089, 2018.

[13] M. Widyastuti, A. G. Fepdiani Simanjuntak, D. Hartama, A. P. Windarto, and A. Wanto, “Classification Model C.45 on Determining the Quality of Custumer Service in Bank BTN Pematangsiantar Branch,” J. Phys. Conf. Ser., vol. 1255, no. 012002, pp. 1–6, 2019.

[14] P. Alkhairi and A. P. Windarto, “Penerapan K-Means Cluster Pada Daerah Potensi Pertanian Karet Produktif di Sumatera Utara,” Semin. Nas. Teknol. Komput. Sains, pp. 762–767, 2019.

[15] D. Hartama, A. Perdana Windarto, and A. Wanto, “The Application of Data Mining in Determining Patterns of Interest of High School Graduates,” J. Phys. Conf. Ser., vol. 1339, no. 1, 2019.

[16] W. Katrina, H. J. Damanik, F. Parhusip, D. Hartama, A. P. Windarto, and A. Wanto, “C.45 Classification Rules Model for Determining Students Level of Understanding of the Subject,” J. Phys. Conf. Ser., vol. 1255, no. 012005, pp. 1–7, 2019.

[17] Budiharjo, T. Soemartono, A. P. Windarto, and T. Herawan, “Predicting School Participation in Indonesia using Back-Propagation Algorithm Model,” Int. J. Control Autom., vol. 11, no. 11, pp. 57–68, 2018.

[18] Budiharjo, T. Soemartono, A. P. Windarto, and T. Herawan, “Predicting tuition fee payment problem using backpropagation neural network model,” Int. J. Adv. Sci. Technol., vol. 120, pp. 85–96, 2018.

[19] I. S. Damanik, A. P. Windarto, A. Wanto, Poningsih, S. R. Andani, and W. Saputra, “Decision Tree Optimization in C4.5 Algorithm Using Genetic Algorithm,” J. Phys. Conf. Ser., vol. 1255, no. 012012, pp. 1–7, 2019.

[20] E. M. Rangel, W. Hendrix, A. Agrawal, W. K. Liao, and A. Choudhary, “AGORAS: A fast algorithm for estimating medoids in large datasets,” Procedia Comput. Sci., vol. 80, pp. 1159–1169, 2016.

[21] I. Kamila, U. Khairunnisa, and Mustakim, “Perbandingan Algoritma K-Means dan K-Medoids untuk Pengelompokan Data Transaksi Bongkar Muat di Provinsi Riau,” J. Ilm. Rekayasa dan Manaj. Sist. Inf., vol. 5, no. 1, pp. 119–125, 2019.

[22] D. Marlina, N. Lina, A. Fernando, and A. Ramadhan, “Implementasi Algoritma K-Medoids dan K-Means untuk Pengelompokkan Wilayah Sebaran Cacat pada Anak,” J. CoreIT J. Has. Penelit. Ilmu Komput. dan Teknol. Inf., vol. 4, no. 2, p. 64, 2018.

[23] D. Sun, H. Fei, and Q. Li, “A Bisecting K-Medoids clustering Algorithm Based on Cloud Model,” IFAC-PapersOnLine, vol. 51, no. 11, pp. 308–315, 2018.

[24] B. Wira, A. E. Budianto, and A. S. Wiguna, “Implementasi Metode K-Medoids Clustering Untuk Mengetahui Pola Pemilihan Program Studi Mahasiwa Baru Tahun 2018 Di Universitas Kanjuruhan Malang,” Rainstek, vol. 1, no. 3, pp. 54–69, 2019.
