A Data Mining Approach to Construct Graduates Employability Model in Malaysia
Myzatul Akmam Sapaat, Aida Mustapha, Johanna Ahmad, Khadijah Chamili, Rahamirzam Muhamad
Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, 43400 UPM Serdang, Selangor, Malaysia
{[email protected], [email protected], [email protected], [email protected], [email protected]}
ABSTRACT
This study constructs a graduates employability model using the classification task in data mining. To achieve this, we use data sourced from the Tracer Study, a web-based survey system of the Ministry of Higher Education, Malaysia (MOHE), for the year 2009. The classification experiment is performed using various Bayes algorithms to determine whether a graduate has been employed, remains unemployed, or is in an undetermined situation. The performance of the Bayes algorithms is also compared against a number of tree-based algorithms.
Information Gain is also used to rank the attributes, and the results showed that the top three attributes with a direct impact on employability are job sector, job status and reason for not working. Results showed that J48graft, a variant of the decision-tree algorithm, performed with the highest accuracy of 92.3%, as compared to 91.3% from the best-performing Bayes algorithm. This leads to the conclusion that a tree-based classifier is more suitable for the tracer data due to its information gain strategy.
KEYWORDS
Classification, Bayes Methods, Decision Tree, Employability
1 INTRODUCTION
Tracer Study is a web-based survey system developed by the Ministry of Higher Education, Malaysia (MOHE). It must be completed by all students graduating from polytechnics and public or private institutions before their convocation, for any level of degree awarded. The sole purpose of the survey is to guide future planning and to improve various aspects of the local higher education administrative system. The survey also serves as a tool to gauge the adequacy of higher education in Malaysia in supplying manpower needs across technical, managerial and social science areas. Data sourced from the Tracer Study is invaluable because it correlates graduate qualifications and skills with employment status.
Graduates' employability remains a national issue due to the increasing number of graduates produced by higher education institutions each year.
According to statistics generated from the Tracer Study, the total number of graduates produced by higher institutions in 2008 was 139,278. In 2009, the volume increased to 155,278 graduates. Of the graduates in 2009, 50% were bachelor's degree holders from public and private universities, and only 49.20% of those, or 38,191, were successfully employed within the first six months after finishing their studies. Previous research on graduate employability covers a wide range of domains, such as education, engineering, and social science. While that research is mainly based on surveys or interviews, little has been done using data mining techniques.
Bayes' theorem is among the earliest statistical methods used to identify patterns in data. But as datasets have grown in size and complexity, data mining has emerged as a technology for applying methods such as neural networks, genetic algorithms, decision trees, and support vector machines to uncover hidden patterns [1]. Today, data mining technologies deal with huge amounts of data from various sources, for example relational or transactional databases, data warehouses, images, flat files, or the World Wide Web.
Classification is the task of generalizing from observations in the training data, each of which is labeled with a specific class. The objective of this paper is to predict whether a graduate has been employed, remains unemployed, or is in an undetermined situation within the first six months after graduation. This will be achieved through a classification experiment that classifies a graduate profile as employed, unemployed or others. The main contribution of this paper is the comparison of classification accuracy between various algorithms from the two most commonly used data mining techniques in the education domain, which are the Bayes methods and decision trees.
The remainder of this paper is organized as follows. Section 2 presents the related works on graduate employability and reviews recent techniques employed in data mining.
Section 3 introduces the dataset and the experimental setting. Section 4 discusses the findings. Finally, Section 5 concludes the paper with some directions for future work.
2 RELATED WORK
A number of studies have sought to identify the factors that influence graduates' employability in Malaysia, as an initial step toward aligning higher education with industry, given the unquestionable impact each has on the other. Nonetheless, most previous work was carried out outside the data mining domain.
Moreover, data sources for previous work were collected and assembled through surveys of sample populations.
Research in [2] identifies three major requirements that employers consider when hiring employees: basic academic skills, higher-order thinking skills, and personal qualities.
The work is restricted to the education domain, specifically analyzing the effectiveness of a subject, English for Occupational Purposes (EOP), in enhancing employability skills. Similar to [2], the work in [3] proposes restructuring the curriculum and methods of instruction to prepare future graduates for forthcoming challenges, based on the model of the T-shaped professional and the newly developed field of Service Science, Management and Engineering (SSME).

International Journal on New Computer Architectures and Their Applications (IJNCAA) 1(4): 1086-1098, The Society of Digital Information and Wireless Communications, 2011 (ISSN: 2220-9085)
More recently, [4] proposes a new Malaysian Engineering Employability Skills Framework (MEES), constructed based on requirements from accrediting and professional bodies and on existing research findings in employability skills, as a guideline for training packages and qualifications in Malaysia. Nonetheless, and not surprisingly, graduates' employability has rarely been studied within the scope of data mining, mainly due to the limited authentic data sources available.
Employability issues have also been examined in other countries. Research by The Higher Education Academy with the Council for Industry and Higher Education (CIHE) in the United Kingdom concluded that there are six competencies that employers look for in individuals who can transform organizations and add value in their careers [5]: cognitive skills or brainpower, generic competencies, personal capabilities, technical ability, business or organization awareness, and practical elements. Furthermore, employability covers a set of achievements comprising skills, understandings and personal attributes that make graduates more likely to gain employment and be successful in their chosen occupations, which benefits the graduates, the community and the economy.
However, data mining techniques have indeed been employed in the education domain, for instance in the prediction and classification of student academic performance using Artificial Neural Networks [6, 7] and a combination of clustering and decision tree classification techniques [6]. Experiments in [8] classify students to predict their final grade using six common classifiers (quadratic Bayesian classifier, 1-nearest neighbour (1-NN), k-nearest neighbour (k-NN), Parzen window, multilayer perceptron (MLP), and decision tree).
With regard to student performance, [9] discovers individual student characteristics that are associated with success, according to grade point average (GPA), using the Microsoft Decision Trees (MDT) classification technique. [10] has shown several applications of data mining in educational institutions that extract useful information from huge data sets.
Data mining through analytical tools allows users to view and use current information for decision-making processes, such as organization of the syllabus, predicting student registration in an educational program, predicting student performance, detecting cheating in online examinations, and identifying abnormal or erroneous values.
Among the related work, we found the work in [11] to be the most closely related to this research; it mines historical data of students' academic results using different classifiers (Bayes, trees, functions) to rank the factors that influence student academic performance.
3 MATERIALS AND METHODS

The main objective of this paper is to classify a graduate profile as employed, unemployed or undetermined using data sourced from the Tracer Study database for the year 2009. The dataset consists of 12,830 instances and 20 attributes related to graduate profiles from 19 public universities and 138 private universities. Table 1 shows the complete attributes for the Tracer Study dataset.
To construct the classifiers, we use the Waikato Environment for Knowledge Analysis (WEKA), an open-source data mining tool [12] developed at the University of Waikato, New Zealand. It provides various learning algorithms that can easily be applied to a dataset. WEKA only accepts datasets in Attribute-Relation File Format (ARFF). Therefore, once data preparation was done, we transformed the dataset into an ARFF file with the .arff extension.
Table 1. Attributes from the Tracer Study dataset after the pre-processing is performed.

1. sex: {male, female}. Gender of the graduate.
2. age: {20-25, 25-30, 30-40, 40-50, >50}. Age of the graduate.
3. univ: {public_univ, private_univ}. University/institution of current qualification.
4. level: {certificate, diploma, advanced_diploma, first_degree, postGraduate_diploma, masters_thesis, masters_courseWork&Thesis, masters_courseWork, phd_thesis, phd_courseWork&Thesis, professional}. Level of study for current qualification.
5. field: {technical, ict, education, science, art&soc_science}. Field of study for current qualification.
6. cgpa: {failed, 2.00-2.49, 2.50-2.99, 3.00-3.66, 3.67-4.00, 4.01-6.17}. CGPA for current qualification.
7. emp_status: {employed, unemployed, others}. Current employment status.
8-17. general_IT skills, Malay_lang, English_lang, gen_knowledge, interpersonal_comm, cc_thinking, analytical, prob_solving, positive_value, teamwork: {satisfied, extremely_satisfied, average, strongly_not_satisfied, not_satisfied, not_applicable}. Level of IT skills, Malay and English language proficiency, general knowledge, interpersonal communication, creative and critical thinking, analytical skills, problem solving, inculcation of positive values, and teamwork acquired from the programme of study.
18. job_status: {permanent, contract, temp, self_employed, family_business}. Job status of employed graduates.
19. job_sector: {local_private_company, multinational_company, own_company, government, NGO, GLC, statutory_body, others}. Job sector of employed graduates.
20. reason_not_working: {job_hunting, waiting_for_posting, further_study, participating_skills_program, waiting_posting_of_study, unsuitable_job, resting, others, family_responsibilities, medical_issues, not_interested_to_work, not_going_to_work, lack_of_confidence, chambering}. Reason for not working for unemployed graduates.
3.1 Data Pre-processing

The raw data retrieved from the Tracer Study database required pre-processing to prepare the dataset for the classification task. First, cleaning activities involved eliminating records with missing values in critical attributes, identifying outliers, correcting inconsistent data, and removing duplicate data. From the total of 89,290 instances in the raw data, the data cleaning process yielded 12,830 instances ready to be mined. For missing values (i.e., in the age attribute), we replaced them with the mean value of the attribute.
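As a sketch, mean imputation for a numeric attribute such as age can be done as follows (the values and variable names are illustrative, not taken from the Tracer Study):

```python
# Replace missing (None) ages with the mean of the observed ages.
# The sample values below are made up for illustration.
ages = [23, 24, None, 30, None, 27]
observed = [a for a in ages if a is not None]
mean_age = sum(observed) / len(observed)  # 26.0 for this sample
imputed = [a if a is not None else round(mean_age) for a in ages]
```

The imputed values are then discretized into the age ranges of Table 1 along with the observed ones.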
Second, data discretization was required because most attributes in the Tracer Study are continuous. In this case, we discretized the values into intervals so as to turn the dataset into categorical (nominal) attributes, as follows:

- cgpa, previously a continuous number, is transformed into a grade range
- sex, previously coded as 1 and 2, is transformed into nominal values
- age, previously a continuous number, is transformed into an age range
- field of study, previously numerical codes 1-4, is transformed into nominal values
- skill information (i.e., language proficiency, general knowledge, interpersonal communication, etc.), previously numerical codes 1-9, is transformed into nominal values
- employment status, previously numerical codes 1-3, is transformed into nominal values
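For instance, the CGPA discretization of Table 1 can be sketched as a mapping from a continuous value onto a grade range (the boundary handling at the interval edges is our assumption, not stated in the paper):

```python
def discretize_cgpa(cgpa):
    """Map a continuous CGPA onto the nominal ranges listed in Table 1."""
    if cgpa < 2.00:
        return "failed"
    if cgpa < 2.50:
        return "2.00-2.49"
    if cgpa < 3.00:
        return "2.50-2.99"
    if cgpa <= 3.66:
        return "3.00-3.66"
    if cgpa <= 4.00:
        return "3.67-4.00"
    return "4.01-6.17"  # Table 1 allows CGPA values above 4.00
```

The other attributes follow the same pattern, each with its own code-to-label mapping.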
3.2 Classification Task
The classification task at hand is to predict the employment status (employed, unemployed, others) for graduate profiles in the Tracer Study.
The task is performed in two stages: training and testing. Once the classifier is constructed, the testing dataset is used to estimate its predictive accuracy.
There are four testing options in WEKA: using the training set, a supplied test set, cross-validation, and percentage split. If we use the training set as the test option, the test data is sourced from the same training data, which yields an unreliable, optimistic estimate of the true error rate. A supplied test set permits us to use test data prepared separately from the training data. Cross-validation is suitable for limited datasets, where the number of folds can be determined by the user; 10-fold cross-validation is widely used to get the best estimate of error, as has been shown by extensive tests on numerous datasets with different learning techniques [13].
Given the size of the dataset, and to avoid overfitting, we employed the hold-out validation method with a 70-30 percentage split, whereby 70% of the 12,830 instances are used for training while the remaining instances are used for testing.
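The 70-30 hold-out split can be sketched as follows (this is our illustration, not WEKA's implementation; the seed is arbitrary, and exact partition sizes depend on the tool's rounding):

```python
import random

def holdout_split(instances, train_pct=70, seed=1):
    """Shuffle a dataset and split it into train/test partitions
    by percentage (70-30 here)."""
    rng = random.Random(seed)
    data = list(instances)
    rng.shuffle(data)
    cut = len(data) * train_pct // 100   # integer cut point
    return data[:cut], data[cut:]

train, test = holdout_split(range(12830))
```

With 12,830 instances this yields 8,981 training instances, with the remainder held out for testing.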
Various algorithms from both the Bayes and decision tree families are used to predict the employment status and compare accuracy.
Information Gain. Information Gain is an attribute selection measure used in ID3. If node N represents the tuples of partition D, the attribute with the highest information gain is chosen as the splitting attribute for node N. This minimizes the number of tests needed to classify a given tuple and tends to yield a simple tree.
The expected information needed to classify a tuple in D is given by

Info(D) = - Σ_{i=1}^{m} p_i log2(p_i)

where m is the number of classes and p_i is the probability that an arbitrary tuple in D belongs to class C_i.
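A direct reading of this formula in code (the class labels stand for emp_status values; this is our illustration, not the paper's implementation):

```python
import math
from collections import Counter

def info(labels):
    """Info(D) = -sum_i p_i * log2(p_i): the expected information
    (entropy) needed to classify a tuple drawn from D."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())
```

For example, a perfectly balanced two-class partition has Info(D) = 1 bit, while a pure partition has Info(D) = 0.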
Bayes Methods. In Bayes methods, the classification task consists of predicting a class variable given a set of attribute variables. It is a type of statistical inference in which the prior distribution is estimated from the data before any new data are observed; hence every parameter is assigned a prior probability distribution [14]. A Bayesian classifier learns from the samples over both the class and attribute variables.
The naïve Bayesian classifier works as follows: Let D be a training set of tuples and their associated class labels.
As usual, each tuple is represented by an n-dimensional attribute vector, X = (x1, x2, …, xn), depicting n measurements made on the tuple from n attributes, respectively, A1, A2, … , An.
Suppose that there are m classes, C1, C2, …, Cm. Given a tuple, X, the classifier will predict that X belongs to the class having the highest posterior probability, conditioned on X. That is, the naïve Bayesian classifier predicts that tuple X belongs to the class Ci if and only if
P(Ci|X) > P(Cj|X) for 1 ≤ j ≤ m, j ≠ i.

Thus, we maximize P(Ci|X). The class Ci for which P(Ci|X) is maximized is called the maximum a posteriori hypothesis.
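A minimal sketch of this decision rule over nominal attributes (with add-one smoothing, which the description above does not mention but which avoids zero probabilities; names and data are ours):

```python
from collections import Counter, defaultdict

class NaiveBayes:
    """Tiny naive Bayes classifier for nominal attributes (illustrative)."""

    def fit(self, rows, labels):
        self.n = len(labels)
        self.class_counts = Counter(labels)        # class counts -> priors P(Ci)
        self.value_counts = defaultdict(Counter)   # (class, attr idx) -> value counts
        for row, y in zip(rows, labels):
            for k, v in enumerate(row):
                self.value_counts[(y, k)][v] += 1
        return self

    def predict(self, row):
        # Choose Ci maximizing P(Ci) * prod_k P(x_k | Ci), i.e. the
        # maximum a posteriori class, with Laplace add-one smoothing.
        best_class, best_p = None, -1.0
        for y, cy in self.class_counts.items():
            p = cy / self.n
            for k, v in enumerate(row):
                seen = self.value_counts[(y, k)]
                p *= (seen[v] + 1) / (cy + len(seen) + 1)
            if p > best_p:
                best_class, best_p = y, p
        return best_class
```

Fitting on a handful of toy (univ, field) rows and predicting recovers the majority pattern for each profile.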
Under the Bayes method in WEKA, we performed the experiment with eight algorithms: Averaged One-Dependence Estimators (AODE), AODEsr, WAODE, Bayes Network, HNB, Naïve Bayes, Naïve Bayes Simple and Naïve Bayes Updateable.
AODE, HNB and Naïve Bayes were also used in [11]; the remaining algorithms were chosen to further compare the results of the Bayes experiment on the same dataset.
AODE averages over all of a small space of alternative naive-Bayes-like models that have weaker, and hence less detrimental, independence assumptions than naive Bayes. The resulting algorithm is computationally efficient while delivering highly accurate classification on many learning tasks. AODEsr and WAODE are extended from AODE.
AODEsr complements AODE with Subsumption Resolution, which detects specializations between two attribute values at classification time and deletes the generalized attribute value.
Meanwhile, WAODE constructs the Weightily Averaged One-Dependence Estimators model by assigning a weight to each one-dependence estimator. Bayes Network learns Bayesian networks using various search algorithms and quality measures. HNB constructs a Hidden Naive Bayes classification model with high classification accuracy and AUC. In Naive Bayes, numeric estimator precision values are chosen based on analysis of the training data.
The Naïve Bayes Updateable classifier uses a default precision of 0.1 for numeric attributes when buildClassifier is called with zero training instances.
Naive Bayes Simple models numeric attributes with a normal distribution.
Tree Methods. Tree-based methods classify instances by sorting the instances down the tree from the root to some leaf node, which provides the classification of a particular instance.
Each node in the tree specifies a test of some attribute of the instance, and each branch descending from that node corresponds to one of the possible values of this attribute [15]. Figure 1 shows the model produced by decision trees, represented in the form of a tree structure.
Under the tree method in WEKA, we performed the classification experiment with nine algorithms: ID3, J48, REPTree, J48graft, Random Tree, Decision Stump, LADTree, Random Forest and SimpleCART. J48 and REPTree were also used in [11], but we did not manage to use NBTree and BFTree because the experiment worked on a large dataset, which was incompatible with the memory allocation in WEKA. The FT, User Classifier and LMT algorithms had the same problem as NBTree and BFTree. In addition, we employed ID3, J48graft, Random Tree, Decision Stump, LADTree, Random Forest and SimpleCART to experiment with alternative decision-tree algorithms.
Figure 1. In a tree structure, each node denotes a test on an attribute value, each branch represents an outcome of the test, and tree leaves represent classes or class distributions. A leaf node indicates the class of the examples. The instances are classified by sorting them down the tree from the root node to some leaf node.
ID3 is a class for constructing an unpruned decision tree based on the ID3 algorithm, which only deals with nominal attributes. J48 is a class for generating a pruned or unpruned C4.5 decision tree, while J48graft generates a grafted (pruned or unpruned) C4.5 decision tree. REPTree is a fast decision tree learner which builds a decision/regression tree using information gain/variance and prunes it using reduced-error pruning (with backfitting).
Decision Stump is usually used in conjunction with a boosting algorithm. A multi-class alternating decision tree is generated by LADTree using the LogitBoost strategy. Random Forest constructs a forest of random trees, whereas Random Tree constructs a tree that considers K randomly chosen attributes at each node, without pruning.
SimpleCART implements minimal cost-complexity pruning.
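As an illustration of the simplest of these learners, a decision stump over nominal attributes just picks the single attribute whose per-value majority class misclassifies the fewest training rows (our sketch under that assumption, not WEKA's implementation):

```python
from collections import Counter, defaultdict

def fit_stump(rows, labels):
    """Return (attribute index, value -> class rule) for the best
    single-attribute split on nominal data."""
    best = None  # (errors, attribute index, rule)
    for k in range(len(rows[0])):
        votes = defaultdict(Counter)
        for row, y in zip(rows, labels):
            votes[row[k]][y] += 1            # class votes per attribute value
        rule = {v: c.most_common(1)[0][0] for v, c in votes.items()}
        errors = sum(rule[row[k]] != y for row, y in zip(rows, labels))
        if best is None or errors < best[0]:
            best = (errors, k, rule)
    return best[1], best[2]
```

On data where one attribute mirrors the class, the stump selects exactly that attribute; boosting then combines many such weak rules.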
4 RESULTS AND DISCUSSION

We segregated the experimental results into three parts. The first is the result of ranking the attributes in the Tracer Study dataset using Information Gain. The second and third parts present the predictive accuracy results of various algorithms from the Bayes method and decision tree families, respectively.
4.1 Information Gain
In this study, we employed Information Gain to rank the attributes in determining the target values as well as to reduce the size of the prediction. Decision tree algorithms adopt a mutual-information criterion to choose the particular attribute to branch on that gains the most information. This is inherently a simple preference bias that explicitly searches for a simple hypothesis.
Ranking attributes also increases the speed and accuracy of prediction. Based on attribute selection using Information Gain, the job sector attribute was found to be the most important factor in discriminating the graduate profiles when predicting the graduate's employment status. This is shown in Figure 2.
Figure 2. Job sector is ranked the highest by attribute selection based on Information Gain. This is largely because the attribute has a small set of values, so one instance is more easily distinguishable from the remaining instances.
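The ranking itself can be sketched by computing Gain(A) = Info(D) - Info_A(D) for each attribute and sorting (toy data; the attribute values here are ours, not from the Tracer Study):

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Info(D) = -sum_i p_i * log2(p_i)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(values, labels):
    """Gain(A) = Info(D) - sum_j |Dj|/|D| * Info(Dj), where the Dj
    are the partitions of D induced by the values of attribute A."""
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[v].append(y)
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# An attribute that mirrors the class gets the full gain;
# an uninformative one gets zero.
labels = ["employed", "employed", "unemployed", "unemployed"]
gains = {
    "job_sector": info_gain(["govt", "govt", "none", "none"], labels),
    "sex":        info_gain(["m", "f", "m", "f"], labels),
}
ranked = sorted(gains, key=gains.get, reverse=True)
```

Sorting all 20 attributes by this score produces the ranking shown in Figure 2.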
4.2 Bayes Methods
Table 2 shows the classification accuracies for various algorithms under the Bayes method. In addition, the table provides comparative results for the kappa statistic, mean absolute error, root mean squared error, relative absolute error, and root relative squared error over the total of 3,840 testing instances.
The Weightily Averaged One-Dependence Estimators (WAODE) algorithm achieved the highest accuracy compared to the other algorithms. Instead of treating each tree-augmented naive Bayes equally, [16] extended AODE by assigning a different weight to each tree-augmented naive Bayes, reflecting the fact that the attributes do not all play the same role in classification.
Table 2. Classification accuracy using various algorithms under the Bayes method in WEKA.

Algorithm | Accuracy (%) | Error Rate (%) | Kappa Statistic | Mean Absolute Error | Root Mean Squared Error | Relative Absolute Error (%) | Root Relative Squared Error (%)
WAODE | 91.3 | 8.7 | 0.834 | 0.073 | 0.203 | 20.8 | 48.4
AODE | 91.1 | 8.9 | 0.827 | 0.069 | 0.208 | 19.5 | 49.6
Naïve Bayes | 90.9 | 9.1 | 0.825 | 0.072 | 0.214 | 20.5 | 51.3
Naïve Bayes Simple | 90.9 | 9.1 | 0.825 | 0.072 | 0.214 | 20.5 | 51.3
BayesNet | 90.9 | 9.1 | 0.824 | 0.072 | 0.215 | 20.5 | 51.4
AODEsr | 90.9 | 9.1 | 0.824 | 0.071 | 0.210 | 20.1 | 50.2
Naïve Bayes Updateable | 90.9 | 9.1 | 0.825 | 0.072 | 0.214 | 20.5 | 51.3
HNB | 90.3 | 9.7 | 0.816 | 0.091 | 0.214 | 25.7 | 51.1
4.3 Tree Methods
Table 3 shows the classification accuracies for various algorithms under the tree method. In addition, the table provides comparative results for the kappa statistic, mean absolute error, root mean squared error, relative absolute error, and root relative squared error over the total of 3,840 testing instances.
Table 3. Classification accuracy using various algorithms under the Tree method in WEKA.

Algorithm | Accuracy (%) | Error Rate (%) | Kappa Statistic | Mean Absolute Error | Root Mean Squared Error | Relative Absolute Error (%) | Root Relative Squared Error (%)
J48graft | 92.3 | 7.7 | 0.849 | 0.078 | 0.204 | 22.1 | 48.7
J48 | 92.2 | 7.8 | 0.848 | 0.078 | 0.204 | 22.2 | 48.8
SimpleCART | 92.0 | 8.0 | 0.844 | 0.079 | 0.199 | 22.3 | 47.5
Random Forest | 91.4 | 8.6 | 0.832 | 0.083 | 0.205 | 23.4 | 49.1
LADTree | 91.3 | 8.7 | 0.830 | 0.077 | 0.197 | 22.0 | 47.0
REPTree | 91.0 | 9.0 | 0.825 | 0.080 | 0.213 | 22.8 | 50.9
Decision Stump | 91.0 | 9.0 | 0.821 | 0.108 | 0.232 | 30.6 | 55.3
Random Tree | 88.9 | 11.1 | 0.787 | 0.081 | 0.269 | 23.0 | 64.4
ID3 | 86.7 | 13.3 | 0.795 | 0.072 | 0.268 | 21.1 | 65.2
The J48graft algorithm achieved the highest accuracy compared to the other algorithms. J48graft generates a grafted C4.5 decision tree, whether pruned or unpruned. Grafting is an inductive process that adds nodes to an inferred decision tree. Unlike pruning, which uses only information available as the tree grows, grafting uses non-local information to provide better predictive accuracy. Figure 3 shows the difference in tree structure between a J48 tree and the grafted J48 tree.
Figure 3. The top figure is the tree structure for J48 and the bottom figure is the tree structure for grafted J48. Grafting adds nodes to the decision tree to increase predictive accuracy: in the grafted J48, new branches are added in place of a single leaf or within leaves.
Comparing the performance of the Bayes and tree-based methods, the J48graft algorithm achieved the highest accuracy of 92.3% on the Tracer Study dataset. The second highest accuracy also came from the tree method, namely the J48 algorithm with 92.2%. The Bayes method only reaches third place, with the WAODE algorithm at a prediction accuracy of 91.3%.
Nonetheless, we found the two classification approaches to be complementary: the Bayes methods provide a better view of the associations or dependencies among the attributes, while the results of the tree method are easier to interpret.
Figure 4 shows the root mean squared error values resulting from the classification experiment. This knowledge could be used to gain insights into the employment trends of graduates from local higher institutions.
[Figure 4 compares five Bayes vs. tree-based algorithm pairs, among them AODE vs. J48graft, Naïve Bayes Simple vs. REPTree, BayesNet vs. Random Tree, and HNB vs. ID3.]
Figure 4. A radial display of the root mean squared error across all algorithms under both the Bayes and tree-based methods. The smaller the root mean squared error, the better the forecast. Based on this figure, three out of five tree-based algorithms indicate a better forecast than the corresponding algorithms under the Bayes methods.
5 CONCLUSIONS
As the education sector blooms every year, graduates face stiff competition to ensure their employability in the industry. The sole purpose of the Tracer Study system is to aid higher educational institutions in preparing their graduates with sufficient skills to enter the job market. This paper focused on identifying attributes that influence graduates' employability based on actual data from the graduates themselves, six months after graduation. Nonetheless, assembling the dataset was difficult because only 90% of the attributes made their way into the classification task; due to confidentiality and sensitivity issues, the data owner did not permit use of the remaining 10% of the attributes.
This paper attempts to predict whether a graduate has been employed, remains unemployed, or is in an undetermined situation within the first six months after graduation. The prediction has been performed through a series of classification experiments using various algorithms under the Bayes and decision tree methods to classify a graduate profile as employed, unemployed or others. Results showed that J48graft, a variant of the decision-tree algorithm, yielded the highest accuracy of 92.3%, as compared to 91.3% from the best-performing Bayes algorithm.
As for future work, we hope to expand the dataset from the Tracer Study with more attributes and to annotate the attributes with information such as the correlation between the current and previous employers.
We are also looking at integrating datasets from different sources, for instance graduate profiles from the alumni organizations of the respective educational institutions. Having this, we next plan to introduce clustering as part of pre-processing, to cluster the attributes before attribute ranking is performed. Finally, other data mining techniques such as anomaly detection or classification-based association may be applied in order to gain more knowledge on graduates' employability in Malaysia.
Acknowledgments. Special thanks to Prof. Dr. Md Yusof Abu Bakar and Puan Salwati Badaroddin from Ministry of Higher Education Malaysia (MOHE) for their help with data gathering as well as expert opinion.
6 REFERENCES

1. Han, J., Kamber, M.: Data Mining: Concepts and Techniques. Morgan Kaufmann (2006)
2. Shafie, L.A., Nayan, S.: Employability Awareness among Malaysian Undergraduates. International Journal of Business and Management, 5(8):119--123 (2010)
3. Mukhtar, M., Yahya, Y., Abdullah, S., Hamdan, A.R., Jailani, N., Abdullah, Z.: Employability and Service Science: Facing the Challenges via Curriculum Design and Restructuring. In: International Conference on Electrical Engineering and Informatics, pp. 357--361 (2009)
4. Zaharim, A., Omar, M.Z., Yusoff, Y.M., Muhamad, N., Mohamed, A., Mustapha, R.: Practical Framework of Employability Skills for Engineering Graduate in Malaysia. In: IEEE EDUCON Education Engineering 2010: The Future of Global Learning Engineering Education, pp. 921--927 (2010)
5. Rees, C., Forbes, P., Kubler, B.: Student Employability Profiles: A Guide for Higher Education Practitioners (2006)
6. Wook, M., Yahaya, Y.H., Wahab, N., Isa, M.R.M.: Predicting NDUM Student's Academic Performance using Data Mining Techniques. In: Second International Conference on Computer and Electrical Engineering, pp. 357--361 (2009)
7. Ogor, E.N.: Student Academic Performance Monitoring and Evaluation Using Data Mining Techniques. In: Fourth Congress of Electronics, Robotics and Automotive Mechanics, pp. 354--359 (2007)
8. Minaei-Bidgoli, B., Kashy, D.A., Kortemeyer, G., Punch, W.F.: Predicting Student Performance: An Application of Data Mining Methods with an Educational Web-based System. In: 33rd Frontiers in Education Conference, pp. 13--18 (2003)
9. Guruler, H., Istanbullu, A., Karahasan, M.: A New Student Performance Analysing System using Knowledge Discovery in Higher Educational Databases. Computers & Education, 55(1):247--254 (2010)
10. Kumar, V., Chadha, A.: An Empirical Study of the Applications of Data Mining Techniques in Higher Education. International Journal of Advanced Computer Science and Applications, 2(3):80--84 (2011)
11. Affendey, L.S., Paris, I.H.M., Mustapha, N., Sulaiman, M.N., Muda, Z.: Ranking of Influencing Factors in Predicting Student Academic Performance. Information Technology Journal, 9(4):832--837 (2010)
12. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The WEKA Data Mining Software: An Update. SIGKDD Explorations, 11(1) (2009)
13. Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann (2005)
14. Jaynes, E.T.: Probability Theory: The Logic of Science. Cambridge University Press (2003)
15. Mitchell, T.: Machine Learning. McGraw Hill, New York (1997)
16. Jiang, L., Zhang, H.: Weightily Averaged One-Dependence Estimators. In: Proceedings of the 9th Biennial Pacific Rim International Conference on Artificial Intelligence (PRICAI 2006), pp. 970--974 (2006)