Identifying Factors Affecting the Relationship between Department and Graduation Level of Informatics
Engineering Students using Apriori Algorithm:
A Case Study at Pamulang University
Thoyyibah T 1,*
* Correspondence Author: e-mail: [email protected]
1 Informatics Engineering, Computer Science Faculty, Universitas Pamulang; Jl. Raya Puspitek No. 46, Serpong, Kota Tangerang Selatan, Banten 15316, Indonesia; email: [email protected]
Submitted : 13/02/2023 Revised : 27/02/2023 Accepted : 13/03/2023 Published : 31/03/2023
Abstract
To cultivate the next generation of leaders, it is essential for teenagers to receive a high level of education. In higher education, this is typically reflected in a high GPA, which is considered a valuable achievement for students. The rate of on-time graduation can also affect campus accreditation, including for engineering programs such as informatics engineering. To improve graduation rates, data mining can be used to identify patterns and trends among graduating students.
The Apriori algorithm was used in this study to analyze high school majors, length of study, and GPA in relation to on-time graduation. The algorithm identifies one or more association rules that can be used as benchmarks for predicting graduation. Based on data from 30 students, the most effective rule links a previous school major in Social Sciences to a study period of 4 years or less with a GPA of 2.51-3.00, that is, graduating on time. The rule was found to have a support value of 16% and a confidence value of 71.4%. This indicates that the rule is a reliable predictor of on-time graduation in this sample.
Keywords: apriori algorithms, data mining, student graduation
1. Introduction
The Apriori algorithm is widely used in various fields, including education, where it has been used to determine the study program students take (Fajri, 2016). Astuti (2019) designed a data mining application that displays information on student graduation rates using the Apriori algorithm. The application mines master data and student graduation data to find information about the graduation rate, where the graduation-rate category is measured by length of study and GPA (Astuti, 2019).
Several other studies have applied the Apriori algorithm in other domains: association rules for fertilizer purchasing patterns (Amrin, 2017), association rules in library book-borrowing data (Srikanti et al., 2018), shopping cart analysis on sales transaction data (Rodiyansyah, 2015), support for higher education promotion strategies (Kusumo et al., 2019), order level determination (Sianturi, 2018), a book search recommendation system based on association rule exploration (Wandi et al., 2012), product recommendations at online stores (Alma et al., 2020), association rules in an internal medicine polyclinic at Bintan Regional General Hospital (Nola Ritha et al., 2021), patterns of tourist visits to Bali (Agung et al., 2018), purchase patterns within one transaction (Sinaga et al., 2018), glasses sales (Purnia & Warnilah, 2017), rain prediction simulations for the city of Bandung (Fauzy et al., 2016), book-borrowing patterns at university libraries (Rusdianto et al., 2020), consumer shopping patterns at the Gramedia Bintaro bookstore (Listriani et al., 2015), and fish purchasing patterns (Saefudin & Septian, 2019).
In university education, the Apriori algorithm is used to analyze student data, including high school major, length of study, GPA, and graduation status (on time or delayed). Because the data are raw and the relationships among them are unknown, data mining methods are used to identify these relationships. The choice of data mining method must be tailored to the type of data collected in order to identify relationships in student data (Grand, 2018). To address this problem, this study applies the Apriori algorithm to produce rules that study programs can use as a consideration or an alternative when making decisions. The analysis is presented under the title "Identifying Factors Affecting the Relationship between Department and Graduation Level of Informatics Engineering Students using Apriori Algorithm: A Case Study at Pamulang University".
2. Research Method
The author used several tables to analyze the data and identify rules with the Apriori algorithm. The data used in this study consist of personal information from students in the Department of Informatics Engineering at Pamulang University.
Source: Research Result (2023)
Figure 1. Action Research Method
The research method depicted in Figure 1 involves four stages: data gathering, data preprocessing, data processing with the Apriori algorithm, and rule generation. Data gathering entails collecting data from a group of students who serve as research subjects. The collected data are preprocessed and transformed into several tables. The next stage applies the Apriori algorithm to the transformed data. Finally, rules are generated from the processed data.
3. Results and Analysis
3.1. Data Gathering
The data were gathered from one class of 30 students majoring in informatics engineering. Only the GPA and high school major attributes were used; the majors are Social Sciences (IPS), Natural Sciences (IPA), and Vocational High School (SMK). For the Apriori analysis, graduation is split into two classes: a length of study of 4 years or less means graduating on time, and a length of study of more than 4 years means not graduating on time.
3.2. Data Preprocessing
To preprocess the data, the graduation records are categorized based on the length of study. Specifically, a student is considered to have graduated "on time" if they completed their studies within 4 years, and "not on time" if their studies took longer than 4 years. These two categories are combined with GPA ranges to create the categories shown in Table 1.
Table 1. Data Transformation
No  Category  Description
1   X1        Study duration of 4 years or less and GPA 3.51 – 4.00
2   X2        Study duration of 4 years or less and GPA 3.1 – 3.5
3   X3        Study duration of 4 years or less and GPA 2.51 – 3.00
4   Y1        Study duration of more than 4 years and GPA 3.51 – 4.00
5   Y2        Study duration of more than 4 years and GPA 3.1 – 3.5
6   Y3        Study duration of more than 4 years and GPA 2.51 – 3.00
Source: Research Result (2023)
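The mapping in Table 1 can be sketched as a small function. This is an illustrative sketch, not the author's implementation: the function name `categorize` is hypothetical, and because the printed GPA bands leave small gaps (for example between 3.00 and 3.1), the code assumes that each band is bounded from below only.

```python
def categorize(years_of_study: float, gpa: float) -> str:
    """Map study duration and GPA to a Table 1 category code (X1..Y3)."""
    # X = graduated within 4 years (on time), Y = took more than 4 years.
    prefix = "X" if years_of_study <= 4 else "Y"
    # GPA bands follow Table 1. Assumption: gaps between the printed bands
    # are closed by comparing against lower bounds only.
    if gpa >= 3.51:
        band = "1"
    elif gpa > 3.00:
        band = "2"
    elif gpa >= 2.51:
        band = "3"
    else:
        raise ValueError("GPA below 2.51 is not covered by Table 1")
    return prefix + band


print(categorize(3.5, 3.8))  # on time, GPA in the top band
print(categorize(5.0, 3.2))  # late, GPA in the middle band
```

A GPA below 2.51 raises an error here because Table 1 defines no category for it.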
3.3. Data Processing
Data processing applies the Apriori algorithm to the data of the 30 students. The raw data are shown in Table 2, and Table 3 shows the number of students in each category.
Table 2. Raw Data
No Major GPA Category
1 IPA 3.8 X1
2 IPA 3.0 X3
3 IPA 3.0 X3
4 IPS 2.8 X3
5 IPA 3.1 Y3
6 SMK 2.9 X2
7 IPA 3.0 Y3
8 SMK 3.2 Y3
9 SMK 2.9 Y2
10 IPS 2.9 Y3
11 IPS 3.0 Y3
12 IPA 3.2 Y3
13 SMK 3.2 Y2
14 SMK 2.8 Y2
15 SMK 3.0 Y3
16 IPA 3.7 Y3
17 SMK 3.3 Y1
18 IPA 3.1 X2
19 IPA 3.5 X2
20 IPA 3.0 X1
21 IPA 3.7 Y3
22 IPS 2.7 X1
23 IPA 3.2 X3
24 IPA 3.6 X2
25 IPS 2.6 X1
26 IPS 2.8 Y3
27 SMK 3.7 X1
28 SMK 3.0 Y3
29 IPA 2.9 Y3
30 IPS 3.1 Y2
Source: Research Result (2023)
Table 3. The Number of Students by Category
No X1 X2 X3 Y1 Y2 Y3 IPA IPS SMK
1 1 0 0 0 0 0 1 0 0
2 0 0 1 0 0 0 1 0 0
3 0 0 1 0 0 0 1 0 0
4 0 0 0 0 0 1 0 1 0
5 0 1 0 0 0 0 1 0 0
6 0 0 0 0 0 1 0 0 1
7 0 0 0 0 0 1 1 0 0
8 0 0 0 0 1 0 0 0 1
9 0 0 0 0 0 1 0 0 1
10 0 0 0 0 0 1 0 1 0
11 0 0 0 0 0 1 0 1 0
12 0 0 0 0 1 0 1 0 0
13 0 0 0 0 1 0 0 0 1
14 0 0 0 0 0 1 0 0 1
15 0 0 0 0 0 1 0 0 1
16 0 0 0 1 0 0 1 0 0
17 0 1 0 0 0 0 0 0 1
18 0 1 0 0 0 0 1 0 0
19 1 0 0 0 0 0 1 0 0
20 0 0 0 0 0 1 1 0 0
21 1 0 0 0 0 0 1 0 0
22 0 0 1 0 0 0 0 1 0
23 0 1 0 0 0 0 1 0 0
24 1 0 0 0 0 0 1 0 0
25 0 0 0 0 0 1 0 1 0
26 0 0 0 0 0 1 0 1 0
27 1 0 0 0 0 0 0 0 1
28 0 0 0 0 0 1 0 0 1
29 0 0 0 0 0 1 1 0 0
30 0 0 0 0 1 0 0 1 0
Total 5 4 3 1 4 13 14 7 9
Source: Research Result (2023)
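Table 3 is a binary (one-hot) encoding in which each student has a 1 in exactly one category column and one major column. A minimal sketch of that encoding is given below, using only the first three students of Table 3; the helper name `one_hot` is an assumption made for illustration.

```python
CATEGORIES = ["X1", "X2", "X3", "Y1", "Y2", "Y3"]
MAJORS = ["IPA", "IPS", "SMK"]
COLUMNS = CATEGORIES + MAJORS  # same column order as Table 3


def one_hot(category: str, major: str) -> list[int]:
    """Return one Table 3 row: 1 for the student's category and major, else 0."""
    return [1 if column in (category, major) else 0 for column in COLUMNS]


# First three students as they appear in Table 3.
sample = [("X1", "IPA"), ("X3", "IPA"), ("X3", "IPA")]
rows = [one_hot(category, major) for category, major in sample]

# Column totals, computed the same way as the last row of Table 3.
totals = [sum(column) for column in zip(*rows)]
print(rows)
print(totals)
```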
Table 3 gives the frequencies of the items X1, X2, X3, Y2, Y3, IPA, IPS, and SMK, so the candidate 2-item sets are {X1, IPA}, {X1, IPS}, {X1, SMK}, {X2, IPA}, {X2, IPS}, {X2, SMK}, {X3, IPA}, {X3, IPS}, {X3, SMK}, {Y2, IPA}, {Y2, IPS}, {Y2, SMK}, {Y3, IPA}, {Y3, IPS}, and {Y3, SMK}. Table 4 shows the tabulation for the item sets that contain Y3.
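The candidate 2-item sets listed above pair every retained category with every school major. As a brief sketch (variable names are illustrative), the same list can be produced with a Cartesian product:

```python
from itertools import product

# Items listed in the text; Y1 is not among them.
categories = ["X1", "X2", "X3", "Y2", "Y3"]
majors = ["IPA", "IPS", "SMK"]

# Every (category, major) combination is a candidate 2-item set.
candidate_pairs = list(product(categories, majors))

print(len(candidate_pairs))
print(candidate_pairs[:3])
```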
Table 4. Table Based on 2 Item Sets
No Y3 IPA F Y3 IPS F Y3 SMK F
1 0 1 S 0 0 S 0 0 S
2 0 1 S 0 0 S 0 0 S
3 0 1 S 0 0 S 0 0 S
4 1 0 S 1 1 P 1 0 S
5 0 1 S 0 0 S 0 0 S
6 1 0 S 1 0 S 1 1 P
7 1 1 P 1 0 S 1 0 S
8 0 0 S 0 0 S 0 1 S
9 1 0 S 1 0 S 1 1 P
10 1 0 S 1 1 P 1 0 S
11 1 0 S 1 1 P 1 0 S
12 0 1 S 0 0 S 0 0 S
13 0 0 S 0 0 S 0 1 S
14 1 0 S 1 0 S 1 1 P
15 1 0 S 1 0 S 1 1 P
16 0 1 S 0 0 S 0 0 S
17 0 0 S 0 0 S 0 1 S
18 0 1 S 0 0 S 0 0 S
19 0 1 S 0 0 S 0 0 S
20 1 1 P 1 0 S 1 0 S
21 0 1 S 0 0 S 0 0 S
22 0 0 S 0 1 S 0 0 S
23 0 1 S 0 0 S 0 0 S
24 0 1 S 0 0 S 0 0 S
25 1 0 S 1 1 P 1 0 S
26 1 0 S 1 1 P 1 0 S
27 0 0 S 0 0 S 0 1 S
28 1 0 S 1 0 S 1 1 P
29 1 1 P 1 0 S 1 0 S
30 0 0 S 0 1 S 0 0 S
Total 13 14 3 13 7 5 13 9 5
Source: Research Result (2023)
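The pair frequencies in the last row of Table 4 can be recomputed by counting how often a category and a major occur together. The sketch below transcribes the category and major of each of the 30 students from the rows of Table 3; the string layout and variable names are choices made for this illustration only.

```python
from collections import Counter

# (category, major) for students 1-30, read row by row from Table 3.
TABLE_3 = """
X1 IPA, X3 IPA, X3 IPA, Y3 IPS, X2 IPA, Y3 SMK, Y3 IPA, Y2 SMK, Y3 SMK, Y3 IPS,
Y3 IPS, Y2 IPA, Y2 SMK, Y3 SMK, Y3 SMK, Y1 IPA, X2 SMK, X2 IPA, X1 IPA, Y3 IPA,
X1 IPA, X3 IPS, X2 IPA, X1 IPA, Y3 IPS, Y3 IPS, X1 SMK, Y3 SMK, Y3 IPA, Y2 IPS
"""

# Each student is one transaction holding two items: a category and a major.
transactions = [
    tuple(item.split()) for item in TABLE_3.replace("\n", " ").split(",")
]

# Frequency of each (category, major) pair across all transactions.
pair_counts = Counter(transactions)

for major in ("IPA", "IPS", "SMK"):
    print("Y3", major, pair_counts[("Y3", major)])
```

Under this transcription the three Y3 pairs occur 3, 5, and 5 times, which matches the pair totals in Table 4.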
3.4. Rules
Each rule has the form "if x then y", where x is the antecedent and y is the consequent. Each rule therefore needs 2 items, x and y, taken from Fk. The candidate rules can be arranged as follows:
1. For (X2, SMK)
a. x = X2, y = SMK -> if X2 then SMK
b. x = SMK, y = X2 -> if SMK then X2
2. For (X2, IPA)
a. x = X2, y = IPA -> if X2 then IPA
b. x = IPA, y = X2 -> if IPA then X2
3. For (Y3, SMK)
a. x = Y3, y = SMK -> if Y3 then SMK
b. x = SMK, y = Y3 -> if SMK then Y3
4. For (X3, IPS)
a. x = X3, y = IPS -> if X3 then IPS
b. x = IPS, y = X3 -> if IPS then X3
5. For (Y3, IPA)
a. x = Y3, y = IPA -> if Y3 then IPA
b. x = IPA, y = Y3 -> if IPA then Y3
6. For (Y2, SMK)
a. x = Y2, y = SMK -> if Y2 then SMK
b. x = SMK, y = Y2 -> if SMK then Y2
7. For (Y2, IPS)
a. x = Y2, y = IPS -> if Y2 then IPS
b. x = IPS, y = Y2 -> if IPS then Y2
8. For (Y2, IPA)
a. x = Y2, y = IPA -> if Y2 then IPA
b. x = IPA, y = Y2 -> if IPA then Y2
9. For (X1, IPA)
a. x = X1, y = IPA -> if X1 then IPA
b. x = IPA, y = X1 -> if IPA then X1
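The enumeration above yields two directed rules per item pair, one for each choice of antecedent. A short sketch of this step, following the "if x then y" form stated at the start of this subsection (variable names are illustrative):

```python
# The nine item pairs enumerated above.
pairs = [
    ("X2", "SMK"), ("X2", "IPA"), ("Y3", "SMK"),
    ("X3", "IPS"), ("Y3", "IPA"), ("Y2", "SMK"),
    ("Y2", "IPS"), ("Y2", "IPA"), ("X1", "IPA"),
]

# Each pair gives two candidate rules: (antecedent, consequent) both ways.
rules = []
for first, second in pairs:
    rules.append((first, second))
    rules.append((second, first))

labels = [f"if {x} then {y}" for x, y in rules]
print(len(labels))
print(labels[:2])
```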
Table 5. Candidate Association Rules
No  If Antecedent then Consequent  Support (%)           Confidence (%)
1   if X2 then SMK                 1/30 x 100% = 3.3     1/4 x 100% = 25
2   if SMK then X2                 1/30 x 100% = 3.3     1/19 x 100% = 5.2
3   if X2 then IPA                 3/30 x 100% = 10      3/4 x 100% = 75
4   if IPA then X2                 3/30 x 100% = 10      3/14 x 100% = 21.4
5   if Y3 then SMK                 6/30 x 100% = 20      6/13 x 100% = 46
6   if SMK then Y3                 6/30 x 100% = 20      6/19 x 100% = 31.5
7   if X3 then IPS                 5/30 x 100% = 16      5/13 x 100% = 38
8   if IPS then X3                 5/30 x 100% = 16      5/7 x 100% = 71.4
9   if Y3 then IPA                 3/30 x 100% = 10      3/19 x 100% = 15.7
10  if IPA then Y3                 3/30 x 100% = 10      3/4 x 100% = 75
11  if Y2 then SMK                 1/30 x 100% = 3.3     1/4 x 100% = 25
12  if SMK then Y2                 1/30 x 100% = 3.3     1/19 x 100% = 5.2
13  if Y2 then IPS                 1/30 x 100% = 3.3     1/4 x 100% = 25
14  if IPS then Y2                 1/30 x 100% = 3.3     1/7 x 100% = 14.2
15  if Y2 then IPA                 1/30 x 100% = 3.3     1/19 x 100% = 5.2
16  if IPA then Y2                 1/30 x 100% = 3.3     1/4 x 100% = 25
17  if X1 then IPA                 4/30 x 100% = 13.3    4/5 x 100% = 80
18  if IPA then X1                 4/30 x 100% = 13.3    4/14 x 100% = 28.5
Source: Research Result (2023)
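The two columns of Table 5 follow the usual association-rule definitions: support is the number of students matching both items divided by the total number of students, and confidence is the same count divided by the number of students matching the antecedent. A minimal sketch, using rule 17 of Table 5 as the example (function names are illustrative):

```python
def support(pair_count: int, total: int) -> float:
    """Share of all transactions that contain both items, in percent."""
    return 100.0 * pair_count / total


def confidence(pair_count: int, antecedent_count: int) -> float:
    """Share of antecedent transactions that also contain the consequent, in percent."""
    return 100.0 * pair_count / antecedent_count


# Rule 17 of Table 5: 4 students match both items, 5 match the
# antecedent X1, and the sample holds 30 students.
rule_support = support(4, 30)
rule_confidence = confidence(4, 5)
print(round(rule_support, 1), rule_confidence)
```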
After the support and confidence of each candidate rule have been found, the next stage is to multiply the two values. Only candidates with a confidence of at least 70% are kept for this step. The results of these calculations are presented in Table 6.
Table 6. Accurate Graduation Prediction Rules
No  If Antecedent then Consequent  Support (%)           Confidence (%)        Support x Confidence
1   if X2 then IPA                 3/30 x 100% = 10      3/4 x 100% = 75       0.075
2   if IPS then X3                 5/30 x 100% = 16      5/7 x 100% = 71.4     0.11424
3   if IPA then Y3                 3/30 x 100% = 10      3/4 x 100% = 75       0.075
4   if X1 then IPA                 4/30 x 100% = 13.3    4/5 x 100% = 80       0.1064
Source: Research Result (2023)
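The selection step can be sketched as a filter followed by a ranking. The example below takes the support and confidence percentages as reported in Tables 5 and 6, adds two low-confidence rules from Table 5 to show the filter at work, and assumes a threshold of 70%. Rule labels are written in "if antecedent then consequent" form, and all variable names are illustrative.

```python
# (rule, support %, confidence %) as reported in Tables 5 and 6.
rules = [
    ("if X2 then SMK", 3.3, 25.0),   # dropped by the confidence filter
    ("if X2 then IPA", 10.0, 75.0),
    ("if Y3 then SMK", 20.0, 46.0),  # dropped by the confidence filter
    ("if IPS then X3", 16.0, 71.4),  # antecedent IPS: confidence is 5/7
    ("if IPA then Y3", 10.0, 75.0),
    ("if X1 then IPA", 13.3, 80.0),
]

MIN_CONFIDENCE = 70.0  # assumed threshold, in percent

# Keep only rules whose confidence reaches the threshold.
selected = [rule for rule in rules if rule[2] >= MIN_CONFIDENCE]

# Score each remaining rule by support x confidence, both as fractions.
scored = [(label, (s / 100) * (c / 100)) for label, s, c in selected]

best_label, best_score = max(scored, key=lambda item: item[1])
print(best_label, round(best_score, 5))
```

With these inputs the highest product belongs to the rule with antecedent IPS and consequent X3.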
The rule with the largest product of support and confidence is selected as the rule for predicting graduation. The prediction of on-time graduation is therefore based on the GPA and the high school major. The selected rule is: if the previous school major was Social Sciences (IPS), then the length of study is 4 years or less and the GPA is 2.51 – 3.00, so the student is declared to have graduated on time.
4. Conclusion
The study concludes that the Apriori algorithm is a useful tool for decision-making. The analysis was conducted on a sample of 30 students, and the results indicate that the choice of high school major is a significant factor affecting both the length of study and the GPA achieved. The most successful rule identified in this study, based on the product of the confidence and support values, was "If the previous school major was IPS, then the length of study was 4 years or less and the GPA was between 2.51 and 3.00, so the student graduated on time." The rule had a confidence value of 71.4% and a support value of 16%. Based on these findings, future research could consider incorporating additional attributes to draw more comprehensive conclusions.
Additionally, comparing the performance of several algorithms could help identify the most optimal approach for this type of analysis. Finally, exploring the use of other algorithms for analyzing student data could provide further insights to inform decision-making.
Acknowledgements
The author wishes to convey appreciation to the reviewers for their assistance in enhancing the manuscript.
Author Contributions
Thoyyibah T proposed the topic; conceived models and designed the experiments; conceived the algorithms; and analysed the result.
Conflicts of Interest
The author declares no conflict of interest.
References
Agung, D., Arimbawa, K., Prabawa, I. N. A., & Mertasana, P. A. (2018). Implementasi Algoritma Apriori dalam Menentukan Pola Kunjungan Wisata ke Bali.
Alma, E., Utami, E., & Wahyu Wibowo, F. (2020). Implementasi Algoritma Apriori untuk Rekomendasi Produk pada Toko Online. Citec Journal, 7(1). https://citec.amikom.ac.id/main/index.php/citec/article/download/241/172
Amrin, A. (2017). Data Mining Dengan Algoritma Apriori untuk Penentuan Aturan Asosiasi Pola Pembelian Pupuk. Paradigma, XIX(1), 74–79. https://doi.org/10.31294/p.v19i1.1836
Astuti, I. P. (2019). Algoritma Apriori untuk Menemukan Hubungan antara Jurusan Sekolah dengan Tingkat Kelulusan Mahasiswa. Jurnal Teknik Informatika, 12(1), 69–78. https://doi.org/10.15408/jti.v12i1.10525
Fajri, A. F. (2016). Implementasi Algoritma Apriori dalam Menetukan Program Studi yang Diambil Mahasiswa. Jurnal Iptek Terapan, 10(2), 81–85. https://doi.org/10.22216/jit.2016.v10i2.402
Fauzy, M., Saleh W, K. R., & Asror, I. (2016). Penerapan Metode Association Rule Menggunakan Algoritma Apriori Pada Simulasi Prediksi Hujan Wilayah Kota Bandung. Jurnal Ilmiah Teknologi Infomasi Terapan, 2(3). https://doi.org/10.33197/jitter.vol2.iss3.2016.111
Grand. (2018). Penerapan Algoritma Apriori untuk Menemukan Hubungan Data Murid dengan Nilai Sekolah. Ikraith Informatika, 2(18), 7–12. https://journals.upi-yai.ac.id/index.php/ikraith-informatika/article/view/137
Kusumo, H., Sediyono, E., & Marwata, M. (2019). Analisis Algoritma Apriori untuk Mendukung Strategi Promosi Perguruan Tinggi. Walisongo Journal of Information Technology, 1(1), 49. https://doi.org/10.21580/wjit.2019.1.1.4000
Listriani, D., Setyaningrum, A. H., & Eka M.A, F. (2015). Penerapan Metode Asosiasi Menggunakan Algoritma Apriori pada Aplikasi Pola Belanja Konsumen (Studi Kasus Toko Buku Gramedia Bintaro). International Journal of Science and Engineering Research (IJ0SER), 3(4), 2. http://journal.uinjkt.ac.id/index.php/ti/article/view/5602/3619
Nola Ritha, Suswaini, E., & Pebriadi, W. (2021). Penerapan Association Rule Menggunakan Algoritma Apriori Pada Poliklinik Penyakit Dalam (Studi Kasus: Rumah Sakit Umum Daerah Bintan). Jurnal Sains Dan Informatika, 7(2), 222–230. https://doi.org/10.34128/jsi.v7i2.329
Purnia, D. S., & Warnilah, A. I. (2017). Implementasi Data Mining pada Penjualan kacamata Dengan Menggunakan Algoritma Apriori. Indonesian Journal on Computer and Information Technology, 2(2), 31–39. https://ejournal.bsi.ac.id/ejurnal/index.php/ijcit/article/view/2776
Rodiyansyah, S. F. (2015). Algoritma Apriori untuk Analisis Keranjang Belanja pada Data Transaksi Penjualan. Infotech Journal, 1(1), 36–39. https://doi.org/10.31949/inf.v1i2.42
Rusdianto, D., Sutiyono, & Zaelan, L. (2020). Implementasi Data Mining Menggunakan Algoritma Apriori untuk Mengetahui Pola Peminjaman Buku di Perpustakaan Universitas. Jurnal Sistem Informasi, 02(02), 1–10. http://ejournal.unibba.ac.id/index.php/j-sika/article/view/376
Saefudin, & Septian. (2019). Penerapan Data Mining dengan Metode Algoritma Apriori untuk Menentukan Pola Pembelian Ikan. JSiI (Jurnal Sistem Informasi), 6(2), 36. https://doi.org/10.30656/jsii.v6i2.1587
Sianturi, F. A. (2018). Penerapan Algoritma Apriori untuk Penentuan Tingkat Pesanan. Jurnal Mantik Penusa, 2(1), 50–57. http://bowmasbow.blogspot.com/20
Sinaga, A., Nughraha, U., & Pasar, A. K. (2018). Implementasi Algoritma Apriori untuk Menentukan Pembelian Pola dalam Satu Transaksi. 7, 204–207.
Srikanti, E., Yansi, R. F., Norhavina, P., I., & Salisah, F. N. (2018). Penerapan Algoritma Apriori untuk Mencari Aturan Asosiasi pada Data Peminjaman Buku di Perpustakaan. Jurnal Ilmiah Rekayasa Dan Manajemen Sistem Informasi, 4(1), 77–80.
Wandi, N., Hendrawan, R. A., & Mukhlason, A. (2012). Pengembangan Sistem Rekomendasi Penelusuran Buku dengan Penggalian Association Rule Menggunakan Algoritma Apriori. Jurnal Teknik ITS, 1, 1–5.