
This Report Presented in Partial Fulfillment of the Requirements for the Degree of Bachelor of Science in Computer Science and Engineering


Academic year: 2023


Touhid Bhuiyan, Professor and Head, Department of CSE, for his kind help in completing our project and also to other faculty members and the staff of the CSE Department of Daffodil International University. We would like to thank all our coursemates at Daffodil International University who participated in this discussion while completing the coursework. The most effective techniques for predicting dropout among Bangladeshi students were found after training and testing the dataset with a number of well-known algorithms including SVM, Logistic Regression, Random Forest, Decision Tree, etc.

We have made a considerable effort to conduct a study on the student dropout rate during the COVID-19 period, considering the scenario and the effects of the pandemic. The study focused mainly on university students in Bangladesh, and the aim was to determine the percentage of students who dropped out of school due to financial problems or other unspecified problems. We polled more than 500 undergraduate Bangladeshi students when the country was on lockdown.

Five algorithms (Bernoulli Naive Bayes, Decision Tree Regression, Random Forest Regression, Logistic Regression, and Linear Regression) were trained and tested on the dataset and shown to be the best approaches for predicting dropout among Bangladeshi students. The situational factors considered included students' belief that the lockdown was jeopardizing their family's financial stability, along with academic performance, COVID-19 symptoms, and health issues.
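Bernoulli Naive Bayes is one of the models named above. The following is a minimal from-scratch sketch of how such a classifier works on binary features, not the authors' implementation. The feature names, the toy data, and the helper names `fit_bernoulli_nb` and `predict_bernoulli_nb` are hypothetical and chosen only for illustration.

```python
import math

def fit_bernoulli_nb(X, y, alpha=1.0):
    """Estimate class priors and per-feature Bernoulli parameters.

    X is a list of rows of 0/1 features, y a list of 0/1 labels
    (1 = dropped out). alpha is the Laplace smoothing constant.
    """
    model = {}
    n_features = len(X[0])
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        log_prior = math.log(len(rows) / len(X))
        # Smoothed probability that feature j equals 1 within class c.
        probs = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
        model[c] = (log_prior, probs)
    return model

def predict_bernoulli_nb(model, x):
    """Return the class with the highest log posterior for one row."""
    best_class, best_score = None, -math.inf
    for c, (log_prior, probs) in model.items():
        score = log_prior
        for value, p in zip(x, probs):
            score += math.log(p if value == 1 else 1.0 - p)
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Hypothetical toy data: [income_affected, poor_internet, health_issue]
X_train = [
    [1, 1, 0], [1, 1, 1], [1, 0, 1], [0, 1, 1],
    [0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 0, 0],
]
y_train = [1, 1, 1, 1, 0, 0, 0, 0]

model = fit_bernoulli_nb(X_train, y_train)
prediction_at_risk = predict_bernoulli_nb(model, [1, 1, 1])
prediction_stable = predict_bernoulli_nb(model, [0, 0, 0])
```

Laplace smoothing keeps every estimated probability strictly between 0 and 1, so the logarithms are always defined even when a feature never occurs in one class.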

Motivation

By identifying the issues students are not comfortable with and addressing them appropriately, we will be able to understand the dropout rate of students during COVID-19 and its main causes. If we ever face a new epidemic like COVID-19 and need online classes again, our teachers will be able to use this approach to overcome the difficulties and protect students from dropping out by organizing classes properly. The question addressed most often is whether students will remain active during the last week, or whether a given week is the last one in which they will be active.

Early identification of students at risk of dropping out reduces the problem and makes it possible to identify the necessary conditions. The size of the data set is a frequent limitation in many research projects in educational technology, learning analytics, and educational data mining.

Apart from using logs, it is difficult to collect enough educational data to properly meet the needs of machine learning. Here, the question addresses how the task is accomplished and provides a quick overview of the system.

Expected Outcomes

Topics discussed: 1.1 Introduction, 1.2 Motivation, 1.3 Rationale for the Study, 1.4 Research Questions, 1.5 Expected Outcome, and 1.6 Layout of the Report. The main objective of the study is covered in the Rationale for the Study section. The effects of the authors' choice to delimit this area of study become apparent later, in Chapter 2.

The following topics are covered: Background/Terminology, Related Works, Comparative Analysis and Summary, Scope of the Problem, and Challenges. The chapter closes with a discussion of the limitations of our study, which can serve as a springboard for future research by others. The final part covers 6.1 Summary of the Study, 6.2 Conclusions, and 6.3 Implications for Further Study.

Dropout can be devastating and can even result in loss of life due to anxiety, depression, and similar conditions. Of the more than 35 publications we examined, we analyzed the 15 most relevant to our study in order to produce a better result that would directly benefit affected students and their families.

Related Works

In addition to damaging students' futures, this disaster also causes financial loss and emotional pain for the families of the affected students. This crisis, known as dropout, stems from a lack of proper recognition and concern in the education sector. To recognize and diagnose it at an early stage, different algorithms are applied.

Because this condition is only discovered from time to time, there is a chance of error during a normal investigation process, which can be tragic for the family of the affected student. Therefore, we are working within these limitations to solve problems that will improve the accuracy of our system. Poor internet facilities also disrupt students' studies, and this connectivity problem is one of the reasons for dropping out.

Using several algorithms, including logistic regression, decision tree, random forest, naive Bayes, and Support Vector Machine, Janka Kabathova et al. Marcell Nagy and Roland Molontay present a research paper on dropout prediction based on secondary school performance. They applied machine learning models such as Decision Tree, Boosted Trees, and deep learning to the data sets.

In this work, they used an artificial neural network consisting of an input layer, a hidden layer, and an output layer. For this model, they collected real data from 2100 students and created 70 features in the dataset. [6] Mia Hossain and Labib et al. worked with seven algorithms, and their main goal was to find out whether students benefit from online classes.
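The layered structure described above can be sketched as a single forward pass. This is a minimal illustration in plain Python, assuming sigmoid activations and one output unit; the source does not state the activation function, the layer sizes, or the weights, so every numeric value and the helper names `dense` and `forward` are hypothetical.

```python
import math

def sigmoid(z):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases):
    """Apply one fully connected layer followed by a sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(features, hidden_w, hidden_b, output_w, output_b):
    """Input layer -> hidden layer -> single output unit."""
    hidden = dense(features, hidden_w, hidden_b)
    return dense(hidden, output_w, output_b)[0]

# Hypothetical weights: 3 input features, 2 hidden units, 1 output unit.
hidden_w = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
hidden_b = [0.0, -0.1]
output_w = [[1.2, -0.7]]
output_b = [0.05]

dropout_probability = forward(
    [1.0, 0.0, 1.0], hidden_w, hidden_b, output_w, output_b
)
```

The output can be read as an estimated dropout probability for one student. Training, which would adjust the weights from data, is left out of this sketch.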

In our proposed work, we propose a deep learning method to identify and predict student dropouts more effectively. We propose a deep convolutional neural network (CNN) for fully automated dropout detection from raw data, which can address the major causes and problems involved in achieving our goal. Data scientists can still engage with each other on a variety of topics using the Kaggle platform.

Scope of the Problem

Challenges

To achieve our goal, we used various machine learning algorithms, such as Random Forest Regression, Bernoulli Naive Bayes, Decision Tree Regression, Logistic Regression, and XGBoost Regression. A well-configured computer with a powerful GPU is used to run the machine learning algorithms and models smoothly. Figure 3.3.1 visualizes the distribution of the data set, where the data is divided into two categories.
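For context on a two-category data set, the caption of Figure 3.2.1 in this report gives the split as 71.5% of students who did not drop out and 28.5% who did. With an imbalance like this, a useful reference point is the accuracy of always predicting the majority class. The sketch below computes that baseline; the function name `majority_baseline_accuracy` is a hypothetical helper, and the comparison itself is an illustration rather than part of the reported method.

```python
def majority_baseline_accuracy(class_shares):
    """Accuracy obtained by always predicting the most frequent class."""
    total = sum(class_shares.values())
    return max(class_shares.values()) / total

# Class shares (in percent) as reported in the caption of Figure 3.2.1.
shares = {"no_dropout": 71.5, "dropout": 28.5}
baseline = majority_baseline_accuracy(shares)
```

Any model evaluated on this data would need to exceed this baseline of 71.5% for its accuracy to be informative.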

Building machine learning models and running applications requires sufficient random access memory (RAM) and a fast central processing unit (CPU). As Figure 4.2.3 shows, random forest regression performs similarly to decision tree regression, with 98.8% accuracy. Figure 4.2.5 shows that XGBoost regression gives 75% accuracy with 93% result of this model.
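The report quotes accuracy figures for regression models but does not say how continuous predictions were turned into class labels. One common approach, shown here purely as an assumption, is to threshold the regressor's output at 0.5 and compare the resulting labels with the true ones. The data and the helper name `regression_accuracy` are hypothetical.

```python
def regression_accuracy(y_true, y_scores, threshold=0.5):
    """Threshold continuous predictions into 0/1 labels and score them."""
    if len(y_true) != len(y_scores):
        raise ValueError("y_true and y_scores must have the same length")
    y_pred = [1 if score >= threshold else 0 for score in y_scores]
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Hypothetical labels (1 = dropped out) and regressor outputs.
y_true = [0, 0, 1, 1, 0, 1, 0, 0]
y_scores = [0.1, 0.4, 0.8, 0.6, 0.7, 0.9, 0.2, 0.0]
accuracy = regression_accuracy(y_true, y_scores)
```

The choice of threshold changes the resulting accuracy, so it should be stated alongside any accuracy figure reported for a regression model.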

Table 4.2.3 shows the True Positive, True Negative, False Positive, and False Negative counts of the Decision Tree model used to make predictions. Table 4.2.6 shows the True Positive, True Negative, False Positive, and False Negative counts of the XGBoost regression used to make predictions. Other machine learning algorithms, such as KNN and neural networks, can be used to increase the accuracy rate.
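The four counts named above can be computed directly from true and predicted labels. The following sketch uses made-up labels, not the report's data, and the helper names `confusion_counts` and `summary_metrics` are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP, and FN for a binary problem."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["TP" if truth == positive else "FP"] += 1
        else:
            counts["FN" if truth == positive else "TN"] += 1
    return counts

def summary_metrics(counts):
    """Derive accuracy, precision, and recall from the four counts."""
    tp, tn, fp, fn = (counts[k] for k in ("TP", "TN", "FP", "FN"))
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Hypothetical labels: 1 = dropped out, 0 = did not drop out.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)
metrics = summary_metrics(counts)
```

Recall is often the more relevant figure for dropout prediction, since a false negative corresponds to an at-risk student who is not flagged.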

Various machine learning algorithms are used, such as Bernoulli Naive Bayes, Decision Tree Regression, Random Forest Regression, Logistic Regression, and XGBoost Regression. Fortunately, we were able to capture this phenomenon in the form of data using machine learning.

"Prediction of student dropout in university classes using two-layer ensemble machine learning approach: a novel stacked generalization", vol. 3, 2022.
[3] Fisnik Dalipi, Ali Shariq Imran, Zenun Kastrati, "MOOC Dropout Prediction Using Machine Learning Techniques: Review and Research Challenges", DOI:10.1109/EDUCON.
[7] Neema Mduma, Khamisi Kalegele, Dina Machuve, "A Survey of Machine Learning Approaches and Techniques for Student Dropout Prediction", Data Science Journal, 17 April 2019.
[8] Ali Shariq Imran, "MOOC Dropout Prediction Using Machine Learning Techniques: Review and Research Challenges", IEEE Global Engineering Education Conference (EDUCON
Evaluation of Prediction Algorithms in the Student Dropout Problem.
[11] Meseret Yihun Amare, Stanislava Simonova, "Global challenges of student's dropout: A Prediction model development using machine learning algorithms on higher education datasets", p.
[12] João Gabriel Corrêa Krüger, Alceu Britto, Jean Paul Barddal, "An Explainable Machine Learning Approach for Student Dropout Prediction", pp.
[14] Hee Sun Park, Seong Joon Yoo, "Early failure prediction in university application with machine learning", vol.

Figure 3.2.1 shows that 71.5% of students did not drop out and 28.5% of students dropped out during the COVID-19 pandemic

Figures

Figure 3.4.2 – Required Python Libraries
Figure 4.2.1 shows the actual and predicted data based on income. Bernoulli Naive Bayes achieved 98.8% accuracy on the test data
Figure 4.2.2 - Graph for Decision Tree Regression
