
This Report Presented in Partial Fulfillment of the Requirements for the Degree of Bachelor of Science in Computer Science and


Academic year: 2023


This project, submitted by Rony Howlader to the Department of Computer Science and Engineering, Daffodil International University, has been accepted as satisfactory in partial fulfillment of the requirements for the degree of B.Sc. in Computer Science and Engineering. Department of Computer Science and Engineering, Faculty of Natural Sciences and Information Technology, Daffodil International University. We hereby declare that this project has been carried out by us under the supervision of Moushumi Zaman Bonny, Assistant Professor, Department of Computer Science and Engineering.

We are grateful and extend our sincere thanks to Moushumi Zaman Bonny, Assistant Professor, Department of Computer Science and Engineering, Daffodil International University, Dhaka. We also thank Sadekur Rahman and Touhid Bhuiyan, Head of the Department of CSE, for their kind help in completing our project, as well as the other faculty members and staff of the CSE Department of Daffodil International University. We would like to thank all our coursemates at Daffodil International University who took part in discussions while completing the coursework.

  • Introduction
  • Motivation
  • Rationale of the Study
  • Research Questions
  • Expected Outcome
  • Report Layout

If COVID-19 can be predicted and treated at home through an online application, a nation will benefit in three ways. As a result, a model that can predict COVID-19 is proposed and trained using a relevant database. An online interface has been created through which patients can provide data and predict COVID-19 at home.

COVID-19, caused by the SARS-CoV-2 virus, is one of the biggest problems facing the world. To relieve the heavy pressure created by the number of infected people, predicting COVID-19 from home became the objective of this study. While this research is being conducted, a team can issue warnings about COVID-19 to people and regions.

  • Introduction
  • Related Works
  • Comparative Analysis
  • Scope of the Problem
  • Challenges

SEIR modeling has been used to predict the daily occurrence of COVID-19 in China and abroad. The RT-PCR test results for COVID-19 were considered together with the data of the included patients. The peaks and magnitude of the COVID-19 pandemic were accurately predicted using the SEIR and AI model.

In this study, a machine learning model has been developed through which a potentially infected individual can learn how likely he or she is to be infected with COVID-19 and how severe the condition is. Due to the limitations and unavailability of healthcare in Bangladesh, 79% of patients affected by COVID-19 are unable to get tested. The dataset contained many factors, and not all of them were required to predict COVID-19.

  • Introduction
  • Research Subject
  • Machine Learning Techniques
    • Supervised Learning
  • Classification Techniques
    • Learning
    • Classification
  • Algorithmic Details
    • Decision Tree
    • Support Vector Machine
    • Logistic Regression
    • Linear Discriminant Analysis
    • eXtreme Gradient Boosting (XGBoost)
    • Random Forest
    • Gaussian Naïve Bayes
    • AdaBoost
    • Stochastic Gradient Descent
    • Linear SVC
    • Perceptron
  • Proposed System
    • Data Collection
    • Dataset
    • Data Pre-processing
    • Data Normalization
    • Data Splitting
    • Algorithm Implementation
    • Model Analysis
    • Extract Appropriate Algorithm
    • Creating Model for Web Interface
    • Building a Web Interface
    • Execute Model
    • Input Values
    • Predictive Result
  • System Architecture
    • User Segment
    • Web Insider
    • Machine Learning Model

Twelve widely used supervised machine learning algorithms are applied in this research project. They generate predictions based on the probability that a given input will fall into each of the classes. For probabilistic classifiers such as Gaussian Naïve Bayes, this probability is calculated using Bayes' theorem, which gives the probability of each output class given the observed input.
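The use of Bayes' theorem to obtain a probability for each class can be illustrated with a minimal Gaussian Naïve Bayes sketch in plain Python. This is not the thesis's implementation: the two features (body temperature and oxygen saturation), the toy values, and the function names `fit_gaussian_nb` and `posterior` are assumptions made only for illustration.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_gaussian_nb(rows, labels):
    """Estimate a prior and per-feature mean/variance for each class."""
    model = {}
    for cls in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == cls]
        stats = []
        for column in zip(*members):
            mean = sum(column) / len(column)
            # A small constant keeps the variance strictly positive.
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[cls] = (len(members) / len(rows), stats)
    return model

def posterior(model, row):
    """Bayes' theorem: P(class | row) is proportional to P(class) * P(row | class)."""
    joint = {}
    for cls, (prior, stats) in model.items():
        likelihood = 1.0
        for x, (mean, var) in zip(row, stats):
            likelihood *= gaussian_pdf(x, mean, var)
        joint[cls] = prior * likelihood
    total = sum(joint.values())
    return {cls: p / total for cls, p in joint.items()}

# Hypothetical toy data: [body temperature in Celsius, oxygen saturation in %]
rows = [[36.6, 98.0], [36.8, 97.0], [37.0, 99.0],
        [38.5, 92.0], [39.0, 90.0], [38.7, 93.0]]
labels = ["negative", "negative", "negative",
          "positive", "positive", "positive"]

model = fit_gaussian_nb(rows, labels)
probs = posterior(model, [38.6, 92.5])
print(probs)
```

The predicted class is the one with the largest posterior probability; the "naïve" part is the assumption that features are independent within each class, which is why the per-feature likelihoods are simply multiplied.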

Hundreds to thousands of trees are often used, depending on the scale and nature of the training set. The is used in equation (ix) to change the weight of the dataset samples, resulting in the creation of a new dataset. Then all data were combined into one CSV file for analysis and interpretation.

The number of rows remained suitable for applying different machine learning algorithms after all the data had been combined into a single comma-separated values (CSV) file. A model can be developed using the training half of the dataset, and the testing half can then be used to assess how accurately it predicts. Naïve Bayes, Support Vector Machines, Logistic Regression, and Discriminant Analysis are among the twelve methods.
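The split into a training part and a testing part can be sketched as follows. The 50/50 ratio, the fixed seed, and the helper name `train_test_split` are assumptions for illustration; the summary only speaks of a training "half" and does not state the exact proportion used.

```python
import random

def train_test_split(rows, labels, test_fraction=0.5, seed=42):
    """Shuffle row indices reproducibly and cut them into train and test parts."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(rows) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_train = [rows[i] for i in train_idx]
    X_test = [rows[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return X_train, X_test, y_train, y_test

# Hypothetical data: ten one-feature rows with a label derived from the feature.
split_rows = [[i] for i in range(10)]
split_labels = [r[0] % 2 for r in split_rows]

X_train, X_test, y_train, y_test = train_test_split(split_rows, split_labels)
print(len(X_train), len(X_test))
```

Shuffling before cutting matters when the CSV file is ordered (for example, by collection date or by class), because an unshuffled cut could put one class mostly in the training part and another mostly in the testing part.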

The results were tabulated after evaluating the confusion matrix, accuracy score, Jaccard score, cross-validated score, AUC score, mean absolute error, and mean squared error for all the algorithms. The optimal method was found by evaluating and analyzing the essential information in these tables. First of all, an appropriate algorithm must be chosen to make efficient use of the dataset.
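As a rough illustration of how such a comparison table can be built, the sketch below computes accuracy, misclassification rate, mean absolute error, and mean squared error for two sets of predictions and picks the better one. The integer label encoding, the names `model_a` and `model_b`, their predictions, and the choice to rank by accuracy alone are all assumptions; the thesis weighs several scores together.

```python
def evaluate(y_true, y_pred):
    """Return basic error metrics for integer-encoded class labels."""
    n = len(y_true)
    errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    accuracy = sum(1 for e in errors if e == 0) / n
    return {
        "accuracy": accuracy,
        "misclassification": 1.0 - accuracy,
        "mae": sum(errors) / n,
        "mse": sum(e * e for e in errors) / n,
    }

# Hypothetical ground truth and predictions (0 = negative, 1 = positive).
y_true = [0, 1, 1, 0, 1, 0, 1, 1]
predictions = {
    "model_a": [0, 1, 1, 0, 1, 0, 1, 0],
    "model_b": [0, 1, 0, 0, 1, 1, 1, 0],
}

# One row of metrics per algorithm, like a results table.
table = {name: evaluate(y_true, y_pred) for name, y_pred in predictions.items()}

# Assumption: rank by accuracy only.
best = max(table, key=lambda name: table[name]["accuracy"])
print(best, table[best])
```

For binary 0/1 labels the mean absolute error and mean squared error both equal the misclassification rate; they only diverge when there are more than two integer-encoded classes, as in a three-class target.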

This is a simpler method of developing a web interface because Django does most of the work for you. A basic system architecture that is a broader representation of the proposed system is shown in Figure 3.2. By analyzing and monitoring all the essential findings from the tables, the optimum method was discovered.

Figure 3.1: Proposed Method to Predict COVID-19
  • Introduction
  • Experimental Results
    • Data Acquisition
    • Data Utilization
    • Feature Importance
  • Result & Discussion
    • Confusion Matrix
    • Classification Report
  • Result Analysis
    • Accuracy
    • Jaccard Score
    • Cross Validated Score
    • Standard Deviation
    • Misclassification & Error
  • Web Implementation
    • Web Interface
    • Web Output Analysis

The target has three classes, each encoded as a numerical value: COVID-19 negative and, if the result is positive, normal or emergency, depending on the patient's condition. As a result, if the diagnostic categories of the samples are unknown, missing values will appear in the data, requiring the use of an appropriate imputation approach. The confusion matrix has been used to present the results and evaluate the effectiveness of the machine learning algorithms.

It is necessary to develop a confusion matrix to validate the results of the implementation phase. In XGBoost, Random Forest, and Decision Tree, 46% of the values were True Positive (TP). Random Forest, Decision Tree, and XGBoost also yielded few False Positive (FP) values in this part of the study, only 1%.
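Percentages such as these are the cells of a confusion matrix expressed as a share of all test samples. The sketch below shows that calculation for a binary case; the labels and predictions are made-up values, and `confusion_shares` is a hypothetical helper, not a function from the thesis.

```python
def confusion_shares(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN and return each as a percentage of all samples."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for t, p in zip(y_true, y_pred):
        if p == positive:
            counts["TP" if t == positive else "FP"] += 1
        else:
            counts["FN" if t == positive else "TN"] += 1
    n = len(y_true)
    return {cell: 100.0 * count / n for cell, count in counts.items()}

# Hypothetical test labels and predictions (1 = COVID-19 positive).
cm_true = [1, 1, 1, 0, 0, 0, 0, 1, 0, 1]
cm_pred = [1, 1, 0, 0, 0, 1, 0, 1, 0, 1]

shares = confusion_shares(cm_true, cm_pred)
print(shares)
```

The four shares always sum to 100%, so a high TP share combined with a low FP share indicates that few negative patients are being wrongly flagged as positive.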

In machine learning, a classification report is a summary of metrics used to analyze the performance of the system. The eXtreme Gradient Boosting method has proven to be one of the most powerful and scalable implementations of gradient boosting machines, capable of pushing the limits of processing power for boosted tree algorithms. The Jaccard score is a metric used to determine the similarity and diversity of sample sets.

The Jaccard coefficient, which is defined as the size of the intersection divided by the size of the union, can be used to quantitatively compare the similarity of two finite sets of samples. The accuracy plot and percentage of each method used for prediction in this model are shown in Equation (xiv), Figure 4.2 and Table 4.8. A plot of the cross-validated results and the percentage of each method used for estimation in this model are shown in Figure 4.3 and Table 4.8.
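The definition of the Jaccard coefficient given above can be written out as follows. The symbols $A$ (the set of samples truly in a class) and $B$ (the set of samples predicted to be in that class) are notation chosen here for illustration and are not taken from the thesis's own equations.

```latex
J(A, B) = \frac{|A \cap B|}{|A \cup B|}, \qquad 0 \le J(A, B) \le 1

% For a single class in a classification task, with TP, FP and FN
% counted for that class, the same quantity becomes:
J = \frac{TP}{TP + FP + FN}
```

As a worked example, with 4 true positives, 1 false positive, and 1 false negative, the score is $4 / (4 + 1 + 1) \approx 0.67$. True negatives do not enter the formula, so the score is not inflated by a large number of correctly rejected samples.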

Misclassification, mean absolute error, and mean squared error for the algorithms are shown in Table 4.10 and Figure 4.5. In Figure 4.6, all data were entered randomly and the desired result was obtained using a model previously trained to predict COVID-19. In addition, in Figures 4.7 and 4.8, all data were randomly entered from the test cases, and the required result was obtained using the previously trained model.

Table 4.1: COVID-19 Result frequency of the patients
  • Introduction
  • Impact on Society
  • Ethical Aspects
  • Sustainability

When the database has millions of records, the model will be stronger, and it is hoped that it will be able to predict accurately over time. In the future, it will be unnecessary to visit a diagnostic hospital if Artificial Intelligence and the Internet of Things can be connected to a database. People will be able to predict COVID-19 at home before consulting a doctor about their condition.

If a portable machine that can test blood can be developed, people will be able to test for COVID-19 more accurately at home. Because any variant of the coronavirus is dangerous and this deadly virus is constantly changing its genetic sequence, this research has long-term potential. Variants carrying more than 50 mutations, more than 30 of them in the spike protein, can cause vaccines to fail and can spread quickly and fatally.

Because of this, using a website to monitor one's COVID-19 condition is a very effective approach. The site mainly uses machine learning techniques to predict the coronavirus disease and the patient's condition. However, deep learning, AI, and the Internet of Things offer many possibilities for providing more accurate results for this project in the future.

As a result, many projects can be done in the future with the help of this project if it is sustainable. Moreover, in the future, various diseases and organ failure can be predicted using this machine learning research and the web method.

  • Introduction
  • Future Scope of this Study
  • Recommendations
  • Conclusion

Most of the symptoms of the coronavirus are similar to those of the common cold, flu, and common allergies. In the dataset, 61.04% of infected patients are normal patients and 38.96% are emergency patients. If appropriate measures are taken, such as diagnosing COVID-19 in the early stages of infection, following appropriate guidelines, and administering the necessary treatments in time, the risk of an infected individual becoming critically ill is reduced.

If COVID-19 is detected, diagnosed and treated at an early stage, there is a high probability that it will be controlled properly and efficiently, the patient will probably recover in a short time and the risk of death will be reduced. To determine if they have COVID-19, people must find a hospital where the COVID-19 test kit is available, wait a long time, and be tested. It is quite difficult to provide enough kits to test for COVID-19 for such a large population in the current situation.

Most suspected patients are unable to diagnose COVID-19 in time and take immediate measures due to an inadequate diagnostic system. The authors have created an interface for this project that allows users to get their COVID-19 report by simply filling out a form with the required information. Twelve machine learning algorithms are trained on the dataset and evaluated based on their accuracy, Jaccard score, Cross Validated score and AUC score to determine the best model for the dataset.

XGBoost performs best among the classifiers compared. Once the website for this research is established, it will be available to all hospitals providing treatment for COVID-19. Reports on COVID-19 will be accessible online and available 24 hours a day, seven days a week, so people won't have to leave their homes to get them.


Figures

Figure 3.2: System Architecture
Figure 3.3: System Architecture of Web Interface
