JITE, 6 (2) January 2023 ISSN 2549-6247 (Print) ISSN 2549-6255 (Online)
JITE (Journal of Informatics and Telecommunication Engineering)
Available online http://ojs.uma.ac.id/index.php/jite DOI : 10.31289/jite.v6i2.7911
Received: 20 August 2022 Accepted: 22 November 2022 Published: 25 January 2023
Data Clustering Recommendations For Selection Student Majors To Higher Education Using The K-Means Method (Case Study of SMAN 2 Palembang)
Jemakmun1)*, R. Ahmad Dicky Syarief Purboyo2)
1,2) Fakultas Ilmu Komputer, Teknik Informatika, Universitas Bina Darma, Palembang, Indonesia
*Corresponding Email: [email protected]
Abstract
SMA Negeri 2 Palembang has two majors, science (IPA) and social studies (IPS). Students who choose an unsuitable major sometimes struggle after entering college and feel they are in the wrong field. To address this problem, the authors propose determining college majors using the k-means clustering method. In this study, students are grouped using data mining, based on the attributes of major, interests, traits, hobbies, talents, and the average grades in science and social studies subjects. The data are clustered with the K-Means method using Euclidean distance, and analyzed both by manual calculation in Microsoft Excel and with the RapidMiner tool. In the manual calculation, Cluster 1 is recommended for the Language major, Cluster 2 for Engineering, Cluster 3 for Health/Medicine, Cluster 4 for Economics, and Cluster 5 for Education. In the RapidMiner calculation, Cluster_0 is recommended for Engineering, Cluster_1 for Economics, Cluster_2 for Language, Cluster_3 for Education, and Cluster_4 for Medicine/Health.
Keywords: K-Means, RapidMiner, Department, Clustering, Recommendation.
How to Cite: Jemakmun, J., & Purboyo, R. D. (2023). Data Clustering Recommendations For Selection Student Majors To Higher Edication Using The K-Means Method (Case Study of SMAN 2 Palembang). JITE (Journal of Informatics and Telecommunication Engineering), 6(2), 367-377.
I. INTRODUCTION
Information technology is developing rapidly in the Industry 4.0 era, especially data science technology (Ismail, 2021). Data science was much needed in Industry 4.0 in 2021 and has a strong impact on the world's ecosystem. Industry 4.0 is a revolution of innovation in the field of electronics and information technology (Trisyanti & Prasetyo, 2018). The concept of Industry 4.0 is a new reality of the modern economy because innovation and technological development play an important role in every organization (Ślusarczyk, 2018). Several technologies will become the main pillars of the Industrial Revolution 4.0, namely Big Data, the Internet of Things, Cloud Computing, Additive Manufacturing, and Artificial Intelligence (Prasetyo & Trisyanti, 2018). Artificial Intelligence (AI) is the part of computer science that studies how to make machines (computers) do work as well as humans do, or even better (Dahria, 2008). Artificial Intelligence is divided into seven branches, namely Machine Learning, Natural Language Processing (NLP), Expert Systems, Vision, Speech, Planning, and Robotics (Mulianingsih et al., 2020). Machine Learning is a discipline that covers the design and development of algorithms that enable computers to develop behaviour based on empirical data (Kambey, 2020). Machine learning is divided into three categories: Supervised Learning, Unsupervised Learning, and Reinforcement Learning (Somvanshi, n.d.).
Supervised Learning uses classification, in which a fully labelled data set is used to classify data of unknown class. Unsupervised Learning is often called clustering because the data set needs no labels and the results do not assign examples to predefined classes (Thupae, 2018). Reinforcement Learning sits between Supervised and Unsupervised Learning: it works in a dynamic environment where the learner must complete a goal without being told explicitly whether the goal has been achieved (Das, 2017). Cluster analysis groups similar research objects into distinct, mutually exclusive clusters. It is useful for summarizing data by grouping objects based on the characteristics and similarities of the objects under study (Sitepu & Gultom, 2011). The K-Means algorithm is one of the most widely used clustering algorithms because of its simplicity and efficiency (Gustientiedina et al., 2019). The choice of major affects later academic activities and the selection of a field of study for students who want to continue to college (Prabowo et al., 2019). The way majors have been determined so far has many weaknesses, among them that it is based on students' wishes regardless of their academic background (Reflianto & Syamsuar, 2018). As a result, the chosen major sometimes becomes a problem for students later, for example academic grades that are not optimal, or a restricted choice of study programs in higher education because the high school major does not fit (Lase, 2019).
This study applies K-Means clustering to Class 12 students in the science (IPA) and social studies (IPS) majors at SMA N 2 Palembang, based on their interests and talents, so that the clusters can serve as recommendations for choosing a major at the next level. K-Means is a non-hierarchical clustering method that partitions data into clusters so that data with the same characteristics fall into the same cluster (Haviluddin et al., 2021). The similarity measure used is the distance between objects. Objects that are closest to each other are placed in the same cluster, since a small distance indicates that the objects have similar characteristics (Sibarani, 2019).
Based on this background, the problem addressed in this study is the clustering of student interests and talents at SMA N 2 Palembang using the K-Means Clustering method. 24 criteria are examined: interests, majors, hobbies, traits, talents, religious education, Pancasila and citizenship education, Indonesian language, mathematics, Indonesian history, English, arts and culture, physical education, sports and health, physics in science, mathematics with specialization in science, chemistry in science, biology in science, economics in science, English in science, economics in social studies, sociology in social studies, history in social studies, geography in social studies, and English in social studies. The expected outcome is a higher rate of SMA Negeri 2 Palembang students being accepted into their chosen majors at state universities.
II. RESEARCH METHOD
In this study, data are analyzed with data mining following the stages of the Knowledge Discovery in Databases (KDD) process. Data mining, often also called knowledge discovery in databases (KDD), is an activity that includes collecting and using historical data to find regularities, patterns, or relationships in large data sets (Santoso, 2007). In general, the KDD process can be explained as follows (Fayyad et al., 1996):
a. Data Selection. Data are selected from the set of operational data before information extraction in KDD begins. The selected data are used for the data mining process and are stored in a file separate from the operational database.
b. Pre-processing/Cleaning. This stage removes duplicate data, fills in missing data, and checks for inconsistent data and errors such as typographical mistakes. An enrichment process is also carried out at this stage, in which existing data are "enriched" with other relevant data or information, such as external data.
c. Transformation. This stage is a creative process and depends heavily on the type or pattern of information to be searched for in the database.
d. Data Mining. This is the process of finding patterns or information of interest in the selected data using certain techniques or methods. The choice of method or algorithm depends on the goals and the overall KDD process.
e. Interpretation/Evaluation. The patterns of information produced by the data mining process need to be presented in a form that is easily understood by the people concerned. This stage includes checking whether the patterns or information found contradict pre-existing facts or hypotheses.
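The selection, cleaning, and transformation stages above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' procedure: the field names (`name`, `hobby`, `avg_science`, `phone`), the sample records, the integer coding of hobbies, and the mean-imputation rule are all hypothetical.

```python
from statistics import mean

# Hypothetical raw questionnaire records (field names and values are assumptions).
raw = [
    {"name": "A", "hobby": "reading", "avg_science": 82.5, "phone": "x"},
    {"name": "A", "hobby": "reading", "avg_science": 82.5, "phone": "x"},  # duplicate
    {"name": "B", "hobby": "sports", "avg_science": None, "phone": "y"},   # missing grade
    {"name": "C", "hobby": "music", "avg_science": 90.0, "phone": "z"},
]

# a. Data selection: keep only the attributes used for clustering.
SELECTED = ("name", "hobby", "avg_science")
selected = [{k: r[k] for k in SELECTED} for r in raw]

# b. Cleaning: drop exact duplicates, then fill missing grades with the column mean
#    (mean imputation is an assumed rule, chosen only for illustration).
seen, cleaned = set(), []
for r in selected:
    key = tuple(r.items())
    if key not in seen:
        seen.add(key)
        cleaned.append(dict(r))
known = [r["avg_science"] for r in cleaned if r["avg_science"] is not None]
fill = mean(known)
for r in cleaned:
    if r["avg_science"] is None:
        r["avg_science"] = fill

# c. Transformation: encode the categorical hobby as an integer code.
hobby_codes = {h: i for i, h in enumerate(sorted({r["hobby"] for r in cleaned}), start=1)}
vectors = [(hobby_codes[r["hobby"]], r["avg_science"]) for r in cleaned]

print(vectors)
```

The resulting numeric vectors are the kind of input a distance-based method such as K-Means needs; how the study actually coded its questionnaire answers is not stated here.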
Figure 1 Knowledge Discovery in Database (KDD) Process
A. Application of the Method
Clustering of data for recommending higher-education majors with the K-Means method, for the case study of SMA Negeri 2 Palembang, was carried out in several stages, as shown in the image below.
Figure 2 Application of the K-Means Method
The following is an explanation of each stage of the application of the K-Means method carried out in this study (Muttaqin & Defriani, 2020).
1. Data Collection. The data for this research were obtained from the school. The data are adjusted to the specifications of the database and then stored in the database that has been created.
2. Dataset Selection. Determine the data to be processed, based on interests and talents as well as report cards for semesters 1-4. The dataset used is cumulative, so the data selected are the data of grade 12 students provided by SMA Negeri 2 Palembang.
3. Determine the Number of Clusters. Determine how many clusters are to be formed in each process. In this study, the number of clusters was limited to 5.
4. Determine the Cluster Starting Point (Centroid). The initial cluster centre point, also called the initial centroid, is determined randomly based on the number of clusters and the amount of data to be processed.
5. Calculate the Distance of Each Data Point to Each Cluster Centre (Mario et al., 2016). The distance between each data point and each cluster centre is calculated using the Euclidean distance (D) formula:
$$D(i,j) = \sqrt{\sum_{k=1}^{n} (X_{ik} - C_{jk})^2}$$
where D(i,j) = distance between data point i and the centre of cluster j, X_ik = value of attribute k for data point i, C_jk = value of attribute k for centroid j, and n = number of attributes.
6. Group the Data Based on the Nearest Cluster. For each data point, find the cluster whose centre is closest, then assign the data point to that cluster.
7. Calculate the New Cluster Centre Point (Kurnia Bakti & Indriyatno, 2017). After all data are grouped into clusters, calculate the new centre point of each cluster as the average of the attribute values of its members:
$$C_{jk} = \frac{1}{N_j} \sum_{i \in S_j} X_{ik}$$
where S_j is the set of data points assigned to cluster j and N_j is the number of its members.
8. Compare the New Cluster with the Previous Cluster. If a newly formed cluster has a different centroid from the previous one, repeat the process starting from step 5. If the new centroids are the same as the previous ones, the process stops and the final clustering result is obtained (Mustakim & Kamal, 2021).
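Steps 5 to 8 can be sketched as a short, self-contained Python program. This is an illustrative sketch rather than the authors' spreadsheet: the toy data, the fixed initial centroids (the study picks them randomly), the function names, and the rule that an empty cluster keeps its old centre are all assumptions.

```python
import math

def euclidean(x, c):
    """Step 5: Euclidean distance between data point x and centroid c."""
    return math.sqrt(sum((xk - ck) ** 2 for xk, ck in zip(x, c)))

def assign(data, centroids):
    """Step 6: index of the nearest centroid for every data point."""
    return [min(range(len(centroids)), key=lambda j: euclidean(x, centroids[j]))
            for x in data]

def update(data, labels, centroids):
    """Step 7: new centroid = attribute-wise mean of the cluster's members."""
    new = []
    for j, old in enumerate(centroids):
        members = [x for x, lab in zip(data, labels) if lab == j]
        if not members:          # assumption: an empty cluster keeps its old centre
            new.append(old)
            continue
        new.append(tuple(sum(col) / len(members) for col in zip(*members)))
    return new

def kmeans(data, centroids, max_iter=100):
    """Step 8: repeat steps 5-7 until the centroids stop changing."""
    for iteration in range(1, max_iter + 1):
        labels = assign(data, centroids)
        new_centroids = update(data, labels, centroids)
        if new_centroids == centroids:
            return labels, centroids, iteration
        centroids = new_centroids
    return labels, centroids, max_iter

# Toy data (assumed): two attributes per student, two clusters.
data = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
initial = [(1.0, 1.0), (9.0, 8.0)]   # fixed here instead of random, for repeatability
labels, centroids, iterations = kmeans(data, initial)
print(labels, centroids, iterations)
```

On this toy input the centroids settle after the second pass, because the second pass reproduces the centroids from the first. With random starting points, as in step 4, different runs can converge to different groupings.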
B. Data Collection Method
1. Primary
a. Observation. The author made direct field observations to obtain clear data for this research.
b. Interview. The author conducted interviews with SMA Negeri 2 Palembang on the research topic.
c. Questionnaire. The researchers collected data through closed questions answered by students at SMA Negeri 2 Palembang about their interests and talents, together with the average subject grades that are grouped in this study.
2. Secondary
Literature study. The author collected data and information related to the research from journals and other reading sources, which were used as reference materials or guidelines.
III. RESULT AND DISCUSSION
A. Data Representation Results
Based on the interviews, the distributed questionnaires, and the average grades for semesters 1 to 4, the authors obtained data on the students of SMA Negeri 2 Palembang. There are 320 students in total across 10 classes: 6 science classes and 4 social science classes. The student data use 24 attributes, covering Hobbies, Interests, Talents, Traits, the average grades of science subjects, and the average grades of social studies subjects. These attributes were chosen because the recommended major is determined from the grades of semesters 1 to 4 and the results of the student questionnaires. The data were then processed manually using clustering with the k-means algorithm.
B. Data Analysis Results
Manual K-Means clustering yielded new patterns, information, and knowledge from the data mining process for grouping students of SMA Negeri 2 Palembang by their interests and talents, using data from Class XII (twelve) students for semesters 1 to 4 and following the knowledge discovery in database (KDD) stages. Five clusters were obtained from all Class XII students. Cluster 2 is recommended for Engineering and has the most members, with 165 students. Cluster 4 is recommended for Economics and is the second largest, with 120 students. Cluster 3 is recommended for Health/Medicine, with 26 students. Cluster 1 is recommended for Languages and Cluster 5 for Education; these are the smallest clusters, with 3 students each.
The RapidMiner calculation used 10 iterations and 5 centroids. In its result, C1 has 166 data, C2 has 119 data, C3 has 1 data, C4 has 3 data, and C5 has 28 data. The manual and RapidMiner calculations differ in their initial centroids: in RapidMiner, 3 initial clusters were taken from Social Sciences Department student data and 2 from Science Department student data, whereas in the manual calculation 3 initial clusters were taken from Science Department students and 2 from Social Studies students. This makes it difficult to match the cluster centres of the two calculations.
C. RapidMiner Pattern Results
The cluster output pattern from RapidMiner can be viewed in its Chart display (Sari et al., 2020). The chart is a graphical display of the grouping results: the 317 student records, described by their interests and talents, are grouped into 5 clusters. The image below shows a scatter/bubble chart.
Figure 3 3D Cluster Scatter Output Pattern
Figure 3 above is a 3D scatter display of Hobbies, Interests, Traits, Talents, and Majors for each cluster. In the figure, the blue bubbles are interests, the green bubbles are majors, the orange bubbles are hobbies, the black bubbles are traits, and the purple bubbles are talents.
Figure 4 Bar Cluster Output Pattern (Horizontal)
The two charts above show that cluster_0 has the most members, 166, and is recommended for Engineering. It is followed by cluster_1 with 119 members, recommended for Economics, and cluster_4 with 28 members, recommended for Medicine/Health. cluster_3 has only 3 members and is recommended for Education. Finally, cluster_2 has the fewest members, only 1, and is recommended for the Language major.
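The cluster sizes reported for the manual and RapidMiner calculations can be lined up by recommended major to see how closely they agree. The sketch below only rearranges the counts stated in the text; the dictionary layout and variable names are illustrative assumptions.

```python
# Cluster sizes by recommended major, as reported in the text.
manual = {"Engineering": 165, "Economics": 120, "Health/Medicine": 26,
          "Language": 3, "Education": 3}
rapidminer = {"Engineering": 166, "Economics": 119, "Health/Medicine": 28,
              "Language": 1, "Education": 3}

# Both calculations should account for the same number of student records.
total_manual = sum(manual.values())
total_rapidminer = sum(rapidminer.values())

# Difference in members per major (RapidMiner minus manual).
diff = {major: rapidminer[major] - manual[major] for major in manual}

print(total_manual, total_rapidminer)
for major, d in diff.items():
    print(f"{major:16s} manual={manual[major]:3d} rapidminer={rapidminer[major]:3d} diff={d:+d}")
```

Both totals come to 317, and no major differs by more than two students between the two calculations.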
D. Discussion
In line with the data analysis described previously, the K-Means Clustering calculation gives the following distances from the 1st data point to the initial cluster centroids: 8312.5625 to the 1st centroid (C1), 8403.563 to the 2nd (C2), 662.0625 to the 3rd (C3), 80114.5625 to the 4th (C4), and 72006.1875 to the 5th (C5).
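Applying the nearest-cluster rule to these five distances amounts to taking a minimum. The snippet below is a small sketch that uses only the values quoted above; the variable names are illustrative.

```python
# Distances from the 1st data point to the five initial centroids (from the text).
distances = {"C1": 8312.5625, "C2": 8403.563, "C3": 662.0625,
             "C4": 80114.5625, "C5": 72006.1875}

# The data point joins the cluster with the smallest distance.
nearest = min(distances, key=distances.get)
print(nearest)  # C3
```

With these values, the 1st data point would be placed in C3 in the first iteration.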
The calculation then proceeds in the same way for the 2nd to the Nth data points against the initial cluster centres. Once all distances have been obtained, each data point is assigned to the cluster whose centre is closest. After the members of each cluster are known, the new cluster centres are recalculated from the current cluster membership, in the same way for all five clusters. The resulting cluster sizes are C1 with 4 members, C2 with 166 members, C3 with 24 members, C4 with 120 members, and C5 with 3 members. The final grouping from the manual calculation in Microsoft Excel is given in the table below:
Table 1 Final Results of Grouping Students Against Each Cluster
| Number | C1 | C2 | C3 | C4 | C5 |
|---|---|---|---|---|---|
| 1 | 3 | 165 | 26 | 120 | 3 |
Total Students: 317
From the table above, the clusters with the fewest members are C1 (Cluster 1) and C5 (Cluster 5), with only 3 students each, while C2 (Cluster 2) has the most members, with 165 students. C3 (Cluster 3) consists of 26 students and C4 (Cluster 4) of 120 students.
E. Application Discussion
The data mining is implemented with the RapidMiner tool. The first step in RapidMiner is to create a workspace with New Process, as shown in Figure 5 below (Sudarsono et al., 2021).
Figure 5 New Process Display
Next, the data are imported into RapidMiner using the read xlsx operator. After that, click the Import Configuration Wizard tab to load the data from Microsoft Excel in xlsx format. Importing data in RapidMiner takes 4 steps, as follows:
The First Step of the Data Import Wizard
The first step is to locate the xlsx file. At this stage, the data to be tested are stored under the file name Survey of Interests and Talents (Responses) (1).xlsx, as shown in the image below.
Figure 6 The First Step of the Data Import Wizard
Second Step Data Import Wizard
The second step displays the data selected in the previous step from Microsoft Excel. On the sheet menu, the data to be processed are selected, as shown in the image below.
Figure 7 Second Step Data Import Wizard
Third Step Data Import Wizard
In this third step, the data attributes of the previously selected file are displayed.
Figure 8 Third Step Data Import Wizard
Fourth Step Data Import Wizard
The last step is to determine the data types of the attributes. The name attribute uses the polynominal data type; Departments, Hobbies, Interests, Traits, and Talents use the integer data type; and the subject attributes use the real data type, because many students' grade values are decimal. Once all the attributes have been set, click Next and choose where the file will be saved; in this experiment, it is stored in the Testing Samples folder. The wizard is then closed with Finish.
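The attribute typing described in this step can be mirrored outside RapidMiner. The sketch below is a plain-Python illustration, not RapidMiner's API: the column names, the sample row, and the `coerce` helper are hypothetical, and Python's `str`, `int`, and `float` merely stand in for the nominal, integer, and real data types.

```python
# Hypothetical column names mapped to Python stand-ins for the three data types.
SCHEMA = {
    "name": str,           # nominal (text) attribute
    "department": int,     # coded categorical attributes
    "hobby": int,
    "interest": int,
    "trait": int,
    "talent": int,
    "mathematics": float,  # subject averages may be decimal
    "english": float,
}

def coerce(row):
    """Convert a row of raw spreadsheet strings to the types in SCHEMA."""
    return {col: cast(row[col]) for col, cast in SCHEMA.items()}

# One made-up row, as it might arrive from a spreadsheet export.
raw_row = {"name": "Student 1", "department": "1", "hobby": "3", "interest": "2",
           "trait": "4", "talent": "1", "mathematics": "84.25", "english": "90"}
typed = coerce(raw_row)
print(typed)
```

Keeping the subject averages as floating-point values preserves decimal grades such as 84.25, which an integer type would reject or truncate.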
Figure 9 Fourth Step Data Import Wizard
After the 4 steps of the Data Import Wizard, the data are processed using the K-Means Clustering segmentation operator. The following shows the data processing in RapidMiner.
Figure 10 K-means Rapidminer Clustering Modeling
In the model shown in the image above, the number of clusters is set to 5. The max runs parameter is used here to bound the number of iterations: because the Microsoft Excel calculation took 4 iterations, max runs is set to 10. For measure types, mixed measures is chosen because it is the most suitable for calculating the Euclidean distance according to the formula used in the manual calculation (Miftahuddin et al., 2020).
IV. CONCLUSION
From the results of this study, several points can be concluded and used as references. Using the K-Means Clustering method with manual calculations, the authors recommend 5 clusters for the students of SMA Negeri 2 Palembang: cluster 1 students are recommended to choose Language majors, cluster 2 students Engineering majors, cluster 3 students the health/medical field, cluster 4 students Economics majors, and cluster 5 students the Education sector. The calculation uses the Euclidean distance formula. The RapidMiner calculation uses 10 iterations and 5 centroids, with C1 having 166 data, C2 119 data, C3 1 data, C4 3 data, and C5 28 data. The manual and RapidMiner calculations differ in their initial centroids: RapidMiner initially used Social Studies Department student data for 3 clusters and Science Department student data for 2 clusters, while the manual calculation used 3 science students and 2 social studies students. This made it difficult to reconcile the cluster centres of the two calculations.
The authors suggest that further research develop this work into an application system, whether web-based, Android-based, or on another platform. The research can also be extended with other methods that produce information suited to the needs of each study, and with other algorithms for comparison.
BIBLIOGRAPHY
Dahria, M. (2008). Kecerdasan Buatan (Artificial Intelligence). Jurnal Saintikom, 5(1), 185–197.
Das, M. N. S. (2017). A survey on types of machine learning techniques in intrusion prevention systems. International Conference on Wireless Communications, Signal Processing and Networking (WiSPNET).
Gustientiedina, G., Adiya, M. H., & Desnelita, Y. (2019). Penerapan Algoritma K-Means Untuk Clustering Data Obat-Obatan. Jurnal Nasional Teknologi Dan Sistem Informasi, 5(1), 17–24. https://doi.org/10.25077/teknosi.v5i1.2019.17-24
Haviluddin, H., Patandianan, S. J., Putra, G. M., Puspitasari, N., & Pakpahan, H. S. (2021). Implementasi Metode K-Means Untuk Pengelompokkan Rekomendasi Tugas Akhir. Informatika Mulawarman : Jurnal Ilmiah Ilmu Komputer, 16(1), 13. https://doi.org/10.30872/jim.v16i1.5182
Ismail, I. (2021). Penerapan Data Science Menggunakan Artificial Neural Network (ANN) Metode Self Organizing Mapping (SOM) untuk Klasifikasi Industri. Warta Akab, 45(2), 66–70. https://doi.org/10.55075/wa.v45i2.62
Kambey, I. C. C. S. A. N. C. (2020). An analysis of the current status and future of biosecurity frameworks for the Indonesian seaweed industry. Journal of Applied Phycology, 32(6).
Kurnia Bakti, V., & Indriyatno, J. (2017). Klasterisasi Dokumen Tugas Akhir Menggunakan K-Means Clustering, Sebagai Analisa Penerapan Sistem Temu Kembali. KOPERTIP : Jurnal Ilmiah Manajemen Informatika Dan Komputer, 1(1), 31–34. https://doi.org/10.32485/kopertip.v1i1.8
Lase, D. (2019). Pendidikan Di Era Revolusi Industri 4.0. Journal Sunderman, 1(1), 28–43. https://doi.org/10.53091/jtir.v1i1.17
Mario, A., Herry, S., & Nasution, H. (2016). Pemilihan Distance Measure Pada K-Means Clustering Untuk Pengelompokkan Member Di Alvaro Fitness. Jurnal Sistem Dan Teknologi Informasi, 1(1), 1–6.
Miftahuddin, Y., Umaroh, S., & Karim, F. R. (2020). Perbandingan Metode Perhitungan Jarak Euclidean, Haversine, Dan Manhattan Dalam Penentuan Posisi Karyawan. Jurnal Tekno Insentif, 14(2), 69–77. https://doi.org/10.36787/jti.v14i2.270
Mulianingsih, F., Anwar, K., Shintasiwi, F. A., & Rahma, A. J. (2020). Artificial Intellegence dengan Pembentukan Nilai dan Karakter di Bidang Pendidikan. Ijtimaiya : Journal of Social Science Teaching, 4(2), 148–154. http://journal.stainkudus.ac.id/index.php/Ijtimaia
Mustakim, Z., & Kamal, R. (2021). K-Means Clustering for Classifying the Quality Management of Secondary Education in Indonesia. Cakrawala Pendidikan, 40(3), 725–737. https://doi.org/10.21831/cp.v40i3.40150
Muttaqin, M. R., & Defriani, M. (2020). Algoritma K-Means untuk Pengelompokan Topik Skripsi Mahasiswa. ILKOM Jurnal Ilmiah, 12(2), 121–129. https://doi.org/10.33096/ilkom.v12i2.542.121-129
Fayyad, U., Piatetsky-Shapiro, G., & Smyth, P. (1996). Knowledge Discovery and Data Mining: Towards a Unifying Framework.
Prabowo, W., Yusuf, M., & Setyowati, R. (2019). Pengambilan keputusan menentukan jurusan kuliah ditinjau dari student self efficacy dan persepsi terhadap harapan orang tua. Jurnal Psikologi Pendidikan Dan Konseling: Jurnal Kajian Psikologi Pendidikan Dan Bimbingan Konseling, 5(1), 42–48.
Prasetyo, B., & Trisyanti, U. (2018). Revolusi Industri 4.0 dan Tantangan Perubahan Sosial. In Prosiding Semateksos 3 “Strategi Pengembangan Nasional Menghadapi Revolusi Industri 4.0”, 22–27.
Reflianto, & Syamsuar. (2018). Pendidikan dan Tantangan Pembelajaran Berbasis Teknologi Informasi di Era Revolusi Industri 4.0. Jurnal Ilmiah Teknologi Pendidikan, 6(2), 1–13.
Santoso, S. (2007). Statistik Deskriptif: Konsep dan Aplikasi dengan Microsoft Excel dan SPSS. ANDI.
Sari, Y. R., Sudewa, A., Lestari, D. A., & Jaya, T. I. (2020). Penerapan Algoritma K-Means Untuk Clustering Data Kemiskinan Provinsi Banten Menggunakan Rapidminer. CESS (Journal of Computer Engineering, System and Science), 5(2), 192. https://doi.org/10.24114/cess.v5i2.18519
Sibarani, E. (2019). Pengaruh Motivasi dan Disipln Kerja Terhadap Kinerja Perawat pada Rumah Sakit Swasta Lancang Kuning Pekanbaru. Skripsi Fakultas Ilmu Sosial Ilmu Politik Jurusan Ilmu Administrasi Bisnis Universitas Riau.
Sitepu, R., & Gultom, B. (2011). Clustering Analysis for Air Pollution Level on Industrial Sector in South Sumatera. Jurnal Penelitian Sains, 14(3), 11–17.
Ślusarczyk, B. (2018). Industry 4.0 - Are We Ready? Polish Journal Of Management Studies, 17(1), 232–248. https://doi.org/10.17512/pjms.2018.17.1.19
Somvanshi, P. C. M. (n.d.). A review of machine learning techniques using decision tree and support vector machine.
Sudarsono, B. G., Leo, M. I., Santoso, A., & Hendrawan, F. (2021). Analisis Data Mining Data Netflix Menggunakan Aplikasi Rapid Miner. JBASE - Journal of Business and Audit Information Systems, 4(1), 13–21. https://doi.org/10.30813/jbase.v4i1.2729
Thupae, B. I. N. G. A. M. A.-M. R. (2018). Machine Learning Techniques for Traffic Identification and Classification in SDWSN: A Survey.
Trisyanti, U., & Prasetyo, B. (2018). Revolusi Industri dan Tantangan Revolusi Industri 4.0. Prosiding SEMATEKSOS 3 “Strategi Pembangunan Nasional Menghadapi Revolusi Industri 4.0,” 22–27. http://iptek.its.ac.id/index.php/jps/article/view/4417