The remainder of the thesis is organized as follows:
Chapter 2 presents an organized review of the literature, covering computational terms in computer science, previous research studies of data mining tasks and major applications, and data mining classification techniques. The chapter also covers the fundamental theories related to prediction models, previous research studies in the field of educational data mining, and processing techniques for data analytics, and it identifies the successful prediction models and classification tools. The most important techniques and applications of educational data mining in employability are discussed, along with the literature on the neuro-fuzzy approach to classification and its applications in EDM.
Chapter 3 introduces and illustrates classification, including decision tree classifiers, attribute selection measures and tree pruning. It also discusses Bayesian classification methods and several rule-based techniques, as well as the common techniques for assessing accuracy. The final part of Chapter 3 presents advanced classification techniques, such as Bayesian belief networks, classification by backpropagation, support vector machines and the neuro-fuzzy classification method ANFIS, which is the main algorithm in this thesis and the one used to build the prediction model.
Chapter 4 presents the process of applying our methodology and data analysis to alumni, their labels (employed, unemployed, and undetermined), and the factor that most affects their employment rates, together with the associations generated from the prediction model.
Chapter 5 presents the findings of the data analysis and discusses the results. Obstacles to data-driven research on the employment market and the most influential factors are highlighted.
Chapter 6 presents the discussion of the final results produced by the experimental study performed in Chapter 5. The most important attributes affecting the prediction of graduate employability are shown. The skills required of IT graduates that affect their employability are discussed, which may prompt educational institutes to modify their curricula and teaching strategies.
Chapter 7 presents the final conclusions drawn from the study outcomes. It demonstrates the limitations of implementing the ANFIS algorithm on an educational dataset that contains more than 20 attributes, and proposes future work to address the problems that appeared during the experimental study. The chapter also presents a set of recommendations for Jordan’s Ministry of Education to discern the direction of future research.
Chapter 2. LITERATURE REVIEW
Employability has recently gained considerable attention from higher education institutions. Employability data enables these institutions to better plan their educational strategies, enhance the curriculum, and improve students’ performance. Employability can be studied by analysing the data extracted from educational resources. There are now many sources of educational data, such as learning management systems (LMS), online learning systems, admission systems, and social media, which provide a vast amount of data. Handling and analysing this volume of data cannot be carried out without sophisticated tools, and data mining is the most important technology for this purpose: it uncovers interesting and useful information that helps in making future strategic plans. Data mining has been used in many fields such as banking, stock exchange, marketing management, retail sales, health, and recently education.
Additionally, as the literature shows, several research studies in different countries have investigated issues related to employability. Most of these studies have been applied to countries with high unemployment rates, such as Vietnam (Tran, 2015). The Arab world suffers from some of the highest unemployment rates in the world, especially in non-oil countries (Henderson, 2013). Therefore, I decided to choose a country from the Middle East, study its Computer Science (CS) market, and select a data sample of graduates in CS specializations from this country.
On the other hand, employers are often in search of skills that go beyond credentials, knowledge, and experience. An individual’s education and experience may make him/her suitable to be hired, but to be effective in the majority of tasks, he/she will need certain skills that are likely to develop over time. Some skills will be specific to the job, but most will be so-called ‘soft skills’ that can be needed in any job or field of employment. These soft skills are ‘employability skills’: they are what makes anyone employable (Lin, Korsakul and Korsakul, 2012). Furthermore, higher education has a basic role in strengthening any nation’s economy, as it is an industry branch in itself and it keeps the rest of the industry going by supplying trained labourers (Laine, Leino and Pulkkinen, 2015). In the past, the basic issues for educational institutions were the reduction in students’ success rates (Araque, Roldán and Salguero, 2009), the reduction in students’ retention (Pegrum, Bartle and Longnecker, 2015), the growth in students transferring to other institutions, and a lack of guidance for students in subject choice. Since education has become more employment-focused, the outcomes of a university’s graduates have become an essential aspect of developing its reputation (Pool, Qualter and Sewell, 2014a). Educational institutions produce and gather vast amounts of data (Sedkaoui and Khelfaoui, 2019). This data may concern students’ academic progression, students’ personal profiles, their communication abilities, their web log interests and also graduates’ profiles (Bharambe et al., 2017). Predicting students’ employability may help identify the students who are at a higher risk of unemployment, so that management can intervene in time by introducing vital steps to train these students and develop their performance skills (Mishra, Kumar and Gupta, 2017).
Computer scientists have tried to identify relations between academic achievement, socioeconomic surroundings, job skills of the graduates and employability, though mostly by implementing statistical methods (Abas and Imam, 2016). Those statistical methods are directly related to the field of computer science and data analytics (Cao, 2017).
Educational Data Mining (EDM) is the process of applying Data Mining (DM) techniques to educational data (Wu et al., 2014). EDM has emerged in recent years as an important research area for researchers all over the world (García et al., 2011). Due to the importance of data mining in the field of education, the International Educational Data Mining Society was founded in July 2011. This society provides much useful research related to EDM, as well as helpful datasets.
This chapter surveys the essential background needed to understand DM and EDM. First, the definitions of data mining as part of Knowledge Discovery in Databases (KDD) are discussed, and data mining tasks and major applications are presented. Educational data mining (EDM) is then illustrated as a main part of this thesis, with a review of the most important research studies that have included the concept of EDM. Next, the most important techniques and applications of educational data mining in employability are investigated. Finally, an important part of the literature, the background of the neuro-fuzzy approach and related work, is reviewed.