International Journal of Education and Pedagogy (IJEAP) eISSN: 2682-8464 | Vol. 4 No. 3 [September 2022]
Journal website: http://myjms.mohe.gov.my/index.php/ijeap
DEEP LEARNING TECHNIQUE FOR DETECTING ONLINE LEARNING BEHAVIOR
Mohd Faiz Hilmi1*, Nurazree Mahmud2, Siti Haslina Md Harizan3 and Yanti Mustapha4
1,3 School of Distance Education, Universiti Sains Malaysia, Penang, MALAYSIA
2 Faculty of Business and Management, Universiti Teknologi MARA, Melaka, MALAYSIA
4 Department of Finance and Banking, Universiti Teknologi MARA, Kedah, MALAYSIA
*Corresponding author: [email protected]
Article Information:
Article history:
Received date : 1 July 2022 Revised date : 2 August 2022 Accepted date : 25 August 2022 Published date : 6 September 2022
To cite this document:
Hilmi, M. F., Mahmud, N., Md Harizan, S. H., & Mustapha, Y. (2022). DEEP LEARNING TECHNIQUE FOR DETECTING ONLINE LEARNING BEHAVIOR. International Journal of Education and Pedagogy, 4(3), 12-19.
Abstract: The analysis of students’ learning behavior has been a major focus of online learning analytics. About five thousand students use the learning management system to access learning materials such as recorded lectures, notes, quizzes and discussion forums. Each academic session, these activities generate over 9.5 million rows in the log file. With artificial intelligence techniques, it is now possible to classify students’ learning behavior. Specifically, the availability of big data (9.5 million rows of log data) means that a deep learning algorithm (an artificial intelligence technique) can be used to classify students’ learning behavior. The objectives of this research are: 1) to develop clusters of students’ learning behavior based on the learning management system's event logs; 2) to develop an unsupervised learning model of students’ behavioral patterns/clusters by applying a deep learning technique on a GPU-based computer; and 3) to evaluate and compare the proposed framework and the Graphics Processing Unit solution in terms of performance and clustering accuracy.
The methodology proposed for this research consists of 7 steps (download dataset, data pre-processing, derivation of new attributes, feature engineering, execution of deep learning algorithms to discover students’ behavioral patterns/clusters, algorithm fine tuning and reporting). This study will contribute to the identification of attributes, the derivation of new attributes and the creation of a new algorithm capable of identifying clusters/patterns of students’ learning behavior, allowing e-learning administrators and university stakeholders to validate the efficiency and effectiveness of e-learning implementation. Furthermore, this research is in line with Malaysia’s IR 4.0 aspiration and commitment towards achieving the United Nations’ Sustainable Development Goals (SDGs). Goal 4 of the SDGs is to ensure inclusive and equitable education and promote lifelong learning opportunities for all.
Keywords: Deep learning, Learning behavior, E-learning, Malaysia, Moodle.
1. Introduction
The analysis of students’ learning behavior has been a major focus of online learning analytics. Separation in time and place during the online learning process reduces the ability of lecturers to observe their students' learning behaviors and provide tailored support. Furthermore, online learning involves high levels of autonomy, for which self-regulated learning skills are necessary. It is important for any institution offering online learning to understand its students’ online learning behavior in order to provide support that will enhance student learning processes. By understanding students’ online learning behavior, the institution will be able to adjust the setup and settings of its learning management system to maximize students’ learning.
The School of Distance Education, Universiti Sains Malaysia, has supported the lifelong learning concept by providing opportunities for Malaysians to obtain tertiary education since 1971. The School of Distance Education has fully embraced blended learning in its undergraduate degree programs. About five thousand students use the learning management system to access learning materials such as recorded lectures, notes, quizzes and discussion forums. Each academic session, these activities generate over 9.5 million rows in the log file.
Teaching and learning are two important criteria of successful education. Therefore, the evaluation of e-learning effectiveness is becoming more important. As such, this study is intended to better understand students’ learning behavior based on students’ online activities as recorded in learning management system (LMS) log data. This paper describes a study for analyzing log data recorded in an online learning environment utilizing artificial intelligence techniques.
Artificial intelligence is capable of analyzing and recognizing patterns in huge datasets. By applying a deep learning algorithm, this study will be able to classify students’ learning behavior based on more than nine million rows of log data. This study answers the call to utilize artificial intelligence to increase efficiency and productivity.
The objective of this study is to conduct a critical review of the learning activities across four different academic programs offered by the School of Distance Education. Based on the event logs gathered from the learning management system (Moodle) used at the School of Distance Education, Universiti Sains Malaysia, this study attempts:
1. To develop clusters of students’ learning behavior based on the learning management system's event logs.
2. To develop an unsupervised learning model of students’ behavioral patterns/clusters by applying a deep learning technique on a GPU-based computer.
3. To evaluate and compare the proposed framework and the Graphics Processing Unit solution in terms of performance and clustering accuracy.
Specifically, this study addresses the following research questions:
1. How can students’ learning behavior be extracted and visualized from activity logs recorded by Moodle?
This research question is aligned with the first research objective. Answering it will enable us to understand students' learning behavior. Subsequently, similar learning behaviors will be clustered together. The resulting clusters will serve as the basis for understanding the characteristics/features of online learning behavior.
2. How do online learners' behavioral interactions with the Moodle LMS differ between programs?
This research question is aligned with the second research objective. School of Distance Education students are enrolled in four main programs (Science, Management, Social Science, Arts). Currently, a one-size-fits-all design of the learning management system is used for all four programs. Understanding the learning behavior unique to each program will enable educators to tailor the design of the learning management system to the behavior of students, thus enhancing the learning experience.
3. How does online learners' behavior differ before and after a class?
This research question is also aligned with the second research objective. Answering it will enable us to understand students' online learning behavior, specifically before and after a live class session. Findings will enable educators to revise online learning activities accordingly. For example, if students access the learning management system only after the live class session, materials/activities should be made ready after the live class session.
4. How can Moodle’s action log of students’ online activity offer meaningful insight into students’ course performance?
This research question is aligned with the third research objective. Various algorithms are available, and this study will evaluate and compare them using accepted performance and accuracy metrics.
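For the third research question, one simple way to compare activity before and after a live class is to count log events in a fixed window on either side of the session. The sketch below is only an illustration: the function name `count_around_session`, the (user id, UNIX timestamp) event layout and the 24-hour window are assumptions, not part of the study's workflow.

```python
from collections import Counter

def count_around_session(events, session_start, session_end, window_hours=24):
    """Count events per user in a window before and after a live session.

    events: iterable of (user_id, unix_timestamp) pairs (hypothetical layout).
    Returns two Counters: events before the session and events after it.
    """
    window = window_hours * 3600
    before, after = Counter(), Counter()
    for user_id, ts in events:
        if session_start - window <= ts < session_start:
            before[user_id] += 1
        elif session_end < ts <= session_end + window:
            after[user_id] += 1
    return before, after

# Toy data: the session runs from t=100000 to t=107200 (two hours).
events = [(1, 99000), (1, 108000), (1, 110000), (2, 50000), (2, 107500)]
before, after = count_around_session(events, 100000, 107200)
```

Events that fall inside the session itself are deliberately left out of both counts, so the comparison covers only activity outside the live class.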
2. Literature Review
The analysis of students’ learning behavior has been a major focus of online learning analytics. Analyzing specific patterns of interaction in LMS is a topic of great interest for the educational data mining (EDM) and learning analytics (LA) research communities. Separation in time and place during the online learning process reduces the ability of lecturers to observe their students' learning behaviors and provide tailored support. Furthermore, online learning involves high levels of autonomy, for which self-regulated learning skills are necessary (Vanslambrouck et al., 2019). Previous studies have focused primarily on frequency analysis without addressing the temporal aspects of students’ learning behavior (Juhaňák, Zounek, & Rohlíková, 2019).
E-learning refers to the use of electronic devices for learning, including the delivery of content via electronic media such as the Internet. With the continuous growth of e-learning as a learning medium, the evaluation of e-learning effectiveness is becoming more important. Measuring this effectiveness is difficult, however, because of the wide range of components of e-learning (Octaviani, Othman, Yusof, & Pranolo, 2015).
Learning analytics have been undertaken by utilizing log data. Log data can relate positively to student effort, performance and outcomes (Yin & Uosaki, 2017). Many universities have huge databases that require proper mining to generate patterns and knowledge (A., S., S.K., & S., 2019). Large amounts of log data on student activity are available and accumulate in Learning Management Systems (LMS) (Na & Tasir, 2018). For example, the School of Distance Education, Universiti Sains Malaysia (SDE@USM) has been using an LMS to deliver courses to all its students, and every academic year the SDE@USM Moodle server captures more than nine million rows of log data on both students’ and lecturers’ activities.
In the learning field, each student has a learning style that affects how he or she acquires, processes, understands and perceives information. Determining the learning style of students enhances the performance of the learning process. A critical success factor of e-learning is student engagement (Octaviani et al., 2015). Identifying learning style and learning behavior helps in the development of learning management systems. Machine learning (ML), a potential approach, has emerged in a variety of application areas, including LMS (Acheson & Ning, 2018; Syed & Nair, 2018).
This study proposes a machine learning approach to evaluate the existence of e-learning. In the learning field, the main focus is on the learning style and learning behavior of the learners. This study describes a proposal for mining action log data recorded in an online learning environment to understand students’ level of activity at any stage of course progression. The discovered clusters will represent the starting point for assessing the usage behavior of the Moodle LMS.
3. Method
To achieve the objectives, this proposal adopts the CRISP-DM model (CRoss Industry Standard Process for Data Mining) (Piatetsky-Shapiro, 1999), which consists of six phases:
Phase 1. Learning understanding
This initial phase focuses on understanding the objectives and requirements and converting this knowledge into a problem definition and a preliminary plan designed to achieve the objectives.
Specifically, this phase concentrates on understanding the learning objectives and requirements from a pedagogical perspective, and then converting this knowledge into a web mining problem definition, and a preliminary plan designed to achieve the objectives.
Phase 2. Data Understanding
This phase starts with an initial data collection and proceeds with data familiarization, the identification of data quality problems, the discovery of patterns in the data and the detection of interesting subsets that suggest hypotheses about hidden information. Specifically, this phase starts with collecting Moodle usage data and proceeds with activities to identify data quality problems, to discover first insights into the data, or to detect interesting subsets from which to form hypotheses about hidden information.
2.1 Download Dataset
The dataset consists of the learning management system log files for 2018, which are available for download from the School of Distance Education's Moodle server. The main file is the event log file, which consists of more than 7 million rows (specifically 7,359,872 rows). This file logs every activity that took place for every student, for example logged in, clicked, downloaded and viewed.
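Because each row of the event log records one action by one user, per-student activity counts are a natural first summary of such a file. The following is a minimal sketch, assuming the log has been exported to CSV with the hypothetical column names `userid` and `action`; the columns of the actual Moodle export may differ.

```python
import csv
import io
from collections import Counter, defaultdict

def action_counts(csv_file):
    """Return {userid: Counter({action: n})} from an event-log CSV stream."""
    counts = defaultdict(Counter)
    # Rows are read one at a time, so the whole file never sits in memory.
    for row in csv.DictReader(csv_file):
        counts[row["userid"]][row["action"]] += 1
    return counts

# Tiny in-memory stand-in for the real log file (made-up rows).
sample = io.StringIO(
    "userid,action\n"
    "s001,loggedin\n"
    "s001,viewed\n"
    "s001,viewed\n"
    "s002,downloaded\n"
)
counts = action_counts(sample)
```

Reading the file as a stream is a deliberate choice here, since a log with several million rows may not fit comfortably in memory on every machine.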
Table 1: Summary of the Dataset
Item        Total
Event log   7,359,872 rows
Students    5,232 students
Courses     322 courses
Programs    4 programs
Majors      13 majors
Based on the data summarized above, this is, to the authors’ knowledge, the first study exploring such a large online learning dataset. The authors are not aware of any published research utilizing such a comprehensive and large dataset. To date, only the School of Distance Education fully utilizes online learning for all its students/courses.
Phase 3. Data Preparation
This phase covers all the activities involved in converting the initial raw data to the final data set. The data will be cleaned and transformed into an appropriate format for analysis.
3.1 Data Pre-Processing
An example of a task in this phase is converting dates and times. The Moodle database uses UNIX timestamps (the number of seconds that have passed since midnight on 1 January 1970, UTC). All timestamps will be converted to DD-MM-YYYY format (or any other acceptable format) for further analysis.
3.2 Derivation of New Attributes
For example, a new variable "Day" will be derived from the date/timestamp.
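The two steps above (timestamp conversion and derivation of "Day") can be sketched with the Python standard library. This is an illustration under stated assumptions, not the study's KNIME workflow: the function names `to_date` and `to_day` are hypothetical, and the default time zone is UTC, matching the definition of the UNIX timestamp.

```python
from datetime import datetime, timedelta, timezone

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def to_date(ts, tz=timezone.utc):
    """Format a UNIX timestamp (seconds since 1 January 1970, UTC) as DD-MM-YYYY."""
    return datetime.fromtimestamp(ts, tz=tz).strftime("%d-%m-%Y")

def to_day(ts, tz=timezone.utc):
    """Derive the weekday name (the new "Day" attribute) from a UNIX timestamp."""
    return DAY_NAMES[datetime.fromtimestamp(ts, tz=tz).weekday()]

print(to_date(1514764800), to_day(1514764800))  # 01-01-2018 Monday

# Assumption: a fixed UTC+8 offset, used here only to show that the chosen
# time zone can change the derived day for events close to midnight.
utc_plus_8 = timezone(timedelta(hours=8))
print(to_day(1514764799), to_day(1514764799, tz=utc_plus_8))  # Sunday Monday
```

Whether days should be derived in UTC or in local time is a design decision for the analysis; the source text specifies only that the stored timestamps are in UTC.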
Phase 4. Modeling
Mining algorithms will be used to build and execute the model that discovers and summarizes the knowledge of interest for the teacher or developer.
4.1 Execute deep learning algorithms to discover students’ behavioral patterns/clusters.
This step will produce a new algorithm capable of identifying clusters/patterns of students' learning behavior.
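The deep learning models for this step are to be built in KNIME and are not reproduced here. As a simpler stand-in that only illustrates what grouping students by behavior means, the sketch below runs plain k-means (not a deep learning method) on hypothetical per-student feature vectors, such as counts of views and downloads. The function, the data and the choice of starting centroids are all illustrative assumptions.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: returns (labels, centroids) for a list of numeric tuples.

    centroids: initial centroid positions, one per cluster (chosen by the caller).
    """
    centroids = [tuple(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids

# Hypothetical (views, downloads) counts for six students.
features = [(2, 1), (3, 0), (1, 2), (40, 12), (38, 15), (45, 10)]
labels, centroids = kmeans(features, centroids=[features[0], features[3]])
```

On this toy data the first three students fall into one cluster and the last three into the other, which corresponds to a low-activity and a high-activity group.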
4.2 Algorithm Fine Tuning
Phase 5. Evaluation
The results or model obtained will be interpreted for understanding students’ behavior.
Phase 6. Deployment
Depending on the requirements, this optional phase can be as simple as generating a report or as complex as implementing a repeatable data mining process.
6.1 Reporting
This study utilized KNIME Analytics Platform (Berthold et al., 2008) to build and train the deep learning architectures (Melcher & Silipo, 2020).
4. Findings
This study analyzes the log file of SDE@USM Moodle activities as the proxy of learning behavior.
The initial phase of exploring the data involved the workflow shown in Fig 1. Orange nodes indicate data setup, yellow nodes indicate data cleaning and transformation, and blue nodes perform data visualization. Three tables extracted from the Moodle database were imported into KNIME Analytics Platform. The tables are log, course and role. These three tables were combined to form a single table, which was then subjected to various cleaning and transformation steps.
Figure 1: KNIME Workflow
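The combination of the log, course and role tables can be pictured as a pair of lookups from each log row into the other two tables. The sketch below is an illustrative assumption rather than the KNIME workflow itself: the field names (`userid`, `courseid`, `action`, `course`, `role`) and the sample rows are hypothetical, and roles are simplified to one per user.

```python
def combine(log_rows, courses, roles):
    """Join log rows with course names and user roles into flat records.

    log_rows: list of dicts with 'userid', 'courseid', 'action'.
    courses:  dict mapping courseid -> course name.
    roles:    dict mapping userid -> role name.
    Rows whose course or user is unknown are skipped (one possible cleaning rule).
    """
    combined = []
    for row in log_rows:
        course = courses.get(row["courseid"])
        role = roles.get(row["userid"])
        if course is None or role is None:
            continue
        combined.append({**row, "course": course, "role": role})
    return combined

log_rows = [
    {"userid": 1, "courseid": 10, "action": "viewed"},
    {"userid": 2, "courseid": 10, "action": "loggedin"},
    {"userid": 3, "courseid": 99, "action": "viewed"},  # unknown course
]
courses = {10: "Course A"}
roles = {1: "student", 2: "lecturer", 3: "student"}

table = combine(log_rows, courses, roles)
# Keep only student activity for the behavior analysis.
student_rows = [r for r in table if r["role"] == "student"]
```

Filtering on the role column after the join is one way to separate students' activities from lecturers' activities, since the log captures both.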
This study will continue the analysis by building and training deep neural network architectures to identify clusters/patterns of students' learning behavior. The clustering analysis will use a Keras feedforward neural network (Melcher & Silipo, 2020).
5. Discussion and Conclusion
This study is aligned with Malaysia’s aspiration to fulfil the United Nations’ Sustainable Development Goals (SDGs). Goal 4 of the SDGs is to ensure inclusive and equitable education and promote lifelong learning opportunities for all. Goal 4 focuses on access to all types of education, early or later, for all. Furthermore, Goal 4 focuses on equitable access to education, literacy, numeracy and multiple skills, with proper financial and material assistance, to further contribute to the nation’s development and society.
This study is also aligned with the Malaysian National Policy on Industry 4.0. The Fourth Industrial Revolution (IR 4.0) infuses higher value-added processes through the application of advanced digitization, advanced manufacturing technologies and efficient resource utilization to enhance efficiency, reduce dependency on human labor and ultimately drive competitiveness going forward. IR 4.0 will necessitate profound changes in major aspects of education: content, delivery/pedagogy, and structure/management of education.
The major key enablers of IR 4.0 are artificial intelligence and big data analytics. Big data techniques are being applied in various industries to improve customer experience and product quality, realize energy efficiency and increase productivity. It is now possible to collect masses of data from several different sources to direct decision making.
This study begins with analyzing big data (9.5 million rows of students’ online activity logs) using a deep learning neural network technique. Findings from this study will allow e-learning administrators and university stakeholders to validate the efficiency and effectiveness of e-learning implementation.
6. Acknowledgement
The authors would like to thank the Ministry of Higher Education Malaysia for the Fundamental Research Grant Scheme, Project Code: FRGS/1/2020/SSI0/USM/02/6.
References
A., V., S., R., S.K., V., & S., G. (2019). Mining CMS Log Data for Students’ Feedback Analysis. In Y. XS., S. S., D. N., & J. A. (Eds.), Third International Congress on Information and Communication Technology. Advances in Intelligent Systems and Computing (Vol. 797). Singapore: Springer.
Acheson, L. L., & Ning, X. (2018). Enhance E-learning through data mining for personalized intervention. Paper presented at the 10th International Conference on Computer Supported Education (CSEDU 2018).
Berthold, M. R., Cebron, N., Dill, F., Gabriel, T. R., Kötter, T., Meinl, T., . . . Wiswedel, B. (2008). KNIME: The Konstanz information miner. Paper presented at the Studies in Classification, Data Analysis, and Knowledge Organization.
Juhaňák, L., Zounek, J., & Rohlíková, L. (2019). Using process mining to analyze students’ quiz-taking behavior patterns in a learning management system. Computers in Human Behavior, 92, 496-506. doi:10.1016/j.chb.2017.12.015
Melcher, K., & Silipo, R. (2020). Codeless Deep Learning with KNIME: Build, train, and deploy various deep neural network architectures using KNIME Analytics Platform. Birmingham: Packt Publishing Ltd.
Na, K. S., & Tasir, Z. (2018). Identifying at-risk students in online learning by analysing learning behaviour: A systematic review. Paper presented at the 2017 IEEE Conference on Big Data and Analytics (ICBDA 2017).
Octaviani, D., Othman, M. S., Yusof, N., & Pranolo, A. (2015). Applied Clustering Analysis for Grouping Behaviour Of E-learning Usage Based on Meaningful Learning Characteristics. Jurnal Teknologi, 76(1), 125-138. doi:10.11113/jt.v76.3904
Piatetsky-Shapiro, G. (1999). CRISP-DM: a proposed global standard for data mining. The On-Line Executive Journal for Data-Intensive Decision Support, 3(15), 199.
Syed, T. A., & Nair, S. S. K. (2018). Personalized recommendation system for advanced learning management systems. Paper presented at the ACM International Conference Proceeding Series.
Vanslambrouck, S., Zhu, C., Pynoo, B., Thomas, V., Lombaerts, K., & Tondeur, J. (2019). An in-depth analysis of adult students in blended environments: Do they regulate their learning in an ‘old school’ way? Computers and Education, 128, 75-87. doi:10.1016/j.compedu.2018.09.008
Yin, C., & Uosaki, N. (2017). Building a Group Formation System by Using Educational Log Data. Paper presented at the IEEE 17th International Conference on Advanced Learning Technologies (ICALT 2017).