Intensive Care Unit Patient Monitoring of Pediatric and Congenital Heart Disease Using Data Mining

This thesis entitled "Intensive Care Unit Patient Monitoring of Pediatric and Congenital Heart Disease Using Data Mining", submitted by the group mentioned below, has been accepted as satisfactory in partial fulfillment of the requirements for the degree of B.Sc. This is to confirm that the work presented in this thesis, entitled "Intensive Care Unit Patient Monitoring of Pediatric and Congenital Heart Disease Using Data Mining", is the result of the study and research carried out by the following students under the supervision of Dr. Hasan Sarwar, Professor, Department of Computer Science and Engineering (CSE), United International University (UIU), Dhaka, Bangladesh.

We are grateful to Almighty Allah for his blessings for the successful completion of our thesis. We thank our supervisor, Dr. Hasan Sarwar, Professor, Department of Computer Science and Engineering (CSE), United International University (UIU), Dhaka, Bangladesh, for his constant supervision, guidance, encouragement and motivation. We are especially grateful to the Department of Computer Science and Engineering (CSE), Military Institute of Science and Technology (MIST) for providing their full support during the dissertation work.

The healthcare industry collects enormous amounts of heart disease data that are unfortunately not being "mined" to uncover hidden information for effective decision making. The reduction of blood and oxygen supply to the heart leads to heart disease. However, there is a lack of effective analysis tools to discover hidden relationships and trends in data. The aim of this research work is to predict the presence of heart disease more accurately using a reduced number of attributes.

LIST OF SYMBOLS

INTRODUCTION

Background
Objective
Outline of Work

The basic function of data mining is to apply various methods and algorithms to discover and extract patterns from stored data. Over the past two decades, data mining and knowledge discovery applications have gained significant attention due to their importance in decision-making and have become an essential component in various organizations. The field of data mining has progressed and positioned itself in new areas of human life through integrations and advancements in the fields of statistics, databases, machine learning, pattern recognition and healthcare.

Medical data mining in healthcare is considered an important but complex task that needs to be performed accurately and efficiently. Data mining in healthcare attempts to solve real health problems in the diagnosis and treatment of diseases. The purpose of this research paper is to analyze several data mining techniques proposed in recent years for the diagnosis of heart disease.

Many researchers have used data mining techniques in the diagnosis of diseases such as tuberculosis, diabetes, cancer and heart disease. The techniques applied to heart disease diagnosis include KNN, Neural Networks, Bayesian classification, Clustering-based classification, Decision Tree, Genetic Algorithm, Naive Bayes and WAC, which show different levels of accuracy. Naive Bayes is one of the successful data mining techniques used in the diagnosis of heart disease patients. Data mining tools have been widely used in clinical applications to diagnose diseases more effectively.
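To make the Naive Bayes idea concrete, the following is a minimal sketch of a categorical Naive Bayes classifier with Laplace smoothing. It is an illustration only and not the implementation used in this thesis; the function names, the two attributes (chest pain type, exercise-induced angina) and the six toy records are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(attribute_index, class_label)][value] -> count
    value_counts = defaultdict(Counter)
    # distinct values seen per attribute, needed for Laplace smoothing
    distinct = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            distinct[i].add(value)
    return class_counts, value_counts, distinct

def predict_naive_bayes(model, row):
    """Return the class with the highest log-posterior for one record."""
    class_counts, value_counts, distinct = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for i, value in enumerate(row):
            # Laplace-smoothed conditional probability P(value | label)
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(distinct[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy records: (chest pain type, exercise-induced angina)
rows = [("typical", "yes"), ("typical", "yes"), ("atypical", "no"),
        ("none", "no"), ("typical", "no"), ("none", "yes")]
labels = ["disease", "disease", "healthy", "healthy", "disease", "healthy"]

model = train_naive_bayes(rows, labels)
prediction_a = predict_naive_bayes(model, ("typical", "yes"))
prediction_b = predict_naive_bayes(model, ("none", "no"))
```

The classifier treats the attributes as conditionally independent given the class, which is the simplifying assumption that gives Naive Bayes its name.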

Various data mining techniques such as decision trees, artificial neural networks, Bayesian networks, kernel density, support vector machines and the trunk algorithm have been actively used in clinical support systems for the diagnosis of heart diseases. Although there have been promising results in the application of data mining techniques in the diagnosis and treatment of heart disease, the research done on finding treatment options for patients, and especially heart patients, is relatively limited. Researchers have suggested that applying data mining techniques to propose appropriate treatment options would not only improve patient care but also reduce investigation time and errors and improve physician performance.

Much research has been done on the application of various data mining techniques in the diagnosis of heart disease to find the most accurate technique, but no research has been done on data mining techniques that can increase the reliability and accuracy of finding an effective treatment for patients with heart disease. We use the data mining process to provide output, and we use RapidMiner for data mining.

RELATED WORK

Heart disease in both men and women occurred only in the presence of exercise-induced angina, and factors such as chest pain were asymptomatic. A method was introduced that uses the Statistical Analysis System (SAS) base software 9.1.3 for diagnosing heart disease. A method has also been proposed to select the critical features and aid the diagnosis of five major heart diseases, namely hypertension, coronary artery disease, rheumatic valvular disease, chronic pulmonary disease and congenital heart disease.

They used heart disease data with 352 cases, recording 40 diagnostic features for each case. From these 352 cases, 24 critical diagnostic features were identified, and their corresponding diagnosis weights for supporting or denying the diagnosis of each heart disease were determined. The classification task of HNFB-1 has been evaluated with different benchmark databases, such as heart disease datasets.

In the feature selection (FS) subroutine, the dimensions of the heart disease and hepatitis disease datasets were reduced to 9 from 13 and 19, respectively, by C4. In the second stage, both datasets were normalized to the interval [0, 1] and weighted via weighted fuzzy pre-processing. The system obtained classification accuracies of 92.59% and 81.82% for the heart disease and hepatitis disease datasets, respectively, using a 50-50% training-test split.
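The normalization and splitting steps described above can be sketched as follows. This is a minimal illustration assuming plain min-max scaling and a shuffled half-and-half split; the weighted fuzzy pre-processing and the feature selection of the cited work are not reproduced here, and the function names and sample values are hypothetical.

```python
import random

def min_max_normalize(rows):
    """Scale every column of a numeric table to the interval [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    normalized = []
    for row in rows:
        normalized.append([
            # a constant column has no range, so map it to 0.0
            (value - low) / (high - low) if high > low else 0.0
            for value, low, high in zip(row, lows, highs)
        ])
    return normalized

def split_half(rows, seed=0):
    """Shuffle a copy of the rows and split it 50-50 into train and test."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    middle = len(shuffled) // 2
    return shuffled[:middle], shuffled[middle:]

# Hypothetical records: (age, resting blood pressure)
data = [[29, 120], [54, 140], [77, 200], [40, 160]]
normalized = min_max_normalize(data)
train, test = split_half(normalized)
```

Fixing the seed keeps the split reproducible, which matters when accuracies from different runs are compared.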

They used the heart disease and hepatitis datasets from the UCI machine learning database as medical datasets. The proposed method achieved accuracy values of 84.24% and 86.8% for the Pima Indians diabetes dataset and Cleveland heart disease dataset, respectively. Anooj [9] presented a weighted fuzzy rule-based Clinical Decision Support System (CDSS) for computer-aided diagnosis of heart diseases.

Here, data preprocessing was applied to the heart disease dataset to remove the noisy information and find missing values. An example of an application in the medical domain was a heart disease detection system based on computer-aided diagnosis methods, where the data was obtained from other sources and evaluated by computer-based applications.

METHODOLOGY

Data Mining
RapidMiner

RapidMiner provides learning schemes, models and algorithms from WEKA and R scripts that can be used via extensions. RapidMiner uses a modular concept for this, where each step of an analysis (for example a preprocessing step or a learning procedure) is represented by an operator in the analysis process. This creates a data flow through the entire analysis process, as shown by the example in Figure 3.1.
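The modular operator concept can be illustrated outside RapidMiner with a small sketch in which each operator is a function that takes a data table and returns a new one, and a process is an ordered list of such operators. This is an analogy only and does not use any RapidMiner API; all names below are hypothetical.

```python
def drop_missing(rows):
    """Preprocessing operator: discard rows that contain a missing value."""
    return [row for row in rows if None not in row]

def scale_by(factor):
    """Build an operator that multiplies every value by a fixed factor."""
    def operator(rows):
        return [[value * factor for value in row] for row in rows]
    return operator

def run_process(operators, data):
    """Pass the data through each operator in order, like a data flow."""
    for operator in operators:
        data = operator(data)
    return data

# Hypothetical table with one incomplete record
table = [[1, 2], [None, 3], [4, 5]]
result = run_process([drop_missing, scale_by(2)], table)
```

Because every operator has the same shape (table in, table out), operators can be reordered or exchanged without touching the rest of the process.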

Alongside data tables and models, there are numerous application-specific objects that can flow through the process. In text analysis, entire documents are passed; time series can be guided through special transformation operators; and pre-processing models (such as a normalization) are simply transferred to a storage operator so that the same transformation can be reproduced later on other data. One subprocess is responsible for producing a model from the respective training data, while a second subprocess is given this model and any other generated results in order to apply it to the test data and measure the quality of the model in each case. An application can be seen in Figure 3.2. In this way, a process that has already been created can be quickly reused for a similar problem, a model that has been generated once can be loaded and applied, or the obtained analysis results can simply be inspected to find the method that promised the most success.
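The division into a training subprocess and a testing subprocess can be sketched as a k-fold validation loop. The sketch below assumes k-fold splitting and uses a trivial majority-class learner as a stand-in for a real model; neither choice is taken from the thesis, and all function names and data are hypothetical.

```python
from collections import Counter

def train_majority(train_rows, train_labels):
    """Training subprocess: the 'model' is just the most common label."""
    return Counter(train_labels).most_common(1)[0][0]

def apply_model(model, test_rows):
    """Testing subprocess, part 1: predict a label for every test row."""
    return [model for _ in test_rows]

def cross_validate(rows, labels, k):
    """Run k train/test rounds and return the accuracy of each round.

    Assumes k is not larger than the number of rows.
    """
    accuracies = []
    for fold in range(k):
        # every k-th record, starting at `fold`, is held out for testing
        test_index = set(range(fold, len(rows), k))
        train_rows = [r for i, r in enumerate(rows) if i not in test_index]
        train_labels = [l for i, l in enumerate(labels) if i not in test_index]
        test_rows = [rows[i] for i in sorted(test_index)]
        test_labels = [labels[i] for i in sorted(test_index)]

        model = train_majority(train_rows, train_labels)
        predicted = apply_model(model, test_rows)
        # testing subprocess, part 2: measure the quality of the model
        correct = sum(p == t for p, t in zip(predicted, test_labels))
        accuracies.append(correct / len(test_labels))
    return accuracies

rows = [[i] for i in range(6)]
labels = ["a", "a", "a", "a", "a", "b"]
fold_accuracies = cross_validate(rows, labels, k=3)
```

Replacing train_majority and apply_model with a real learner leaves the validation loop unchanged, which mirrors how the two subprocesses are kept separate in the process design.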

The results can be dragged and dropped into processes, where they are reloaded and provided to the process by special operators. In addition to the local repositories, which are stored in the computer's file system, RapidAnalytics instances can also be used as a repository. Because the RapidAnalytics server has extensive user rights management, processes and results can be shared or access can be restricted for individuals or groups of individuals.

This means that the analysis is carried out completely in the background and the user can learn about the analysis process via a status display. The user can continue to work at the same time in the foreground without their computer being slowed down by CPU and memory intensive calculations. All calculations are now done on the background server, which is probably much more efficient, as can be seen in Figure 3.4.

It also means that the hardware resources can be used more efficiently, since only a powerful server used jointly by all analysts is needed to perform memory-intensive calculations. All these extensions take advantage of and complement the extensive possibilities offered by RapidMiner: They not only add operators and new data objects, but also provide new views that can be freely integrated into the user interface, or even complete perspectives in which they can bundle their views like the R extension in Figure 3.5.

Figure 3.1: A simple process with examples of operators for loading, preprocessing and model production

IMPLEMENTATION

Step of Implementation
Input Data Format
Data Fitting with RapidMiner

Now we need to create a system that is intelligent enough to learn from previous results and predict the output value. For simplicity, we collected our input data in an Excel sheet. The repository is where the process and data are stored. We then use this imported data as needed.

When we need to measure the performance of a model, we use the performance operator.
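As an illustration of what such a performance measurement typically computes for a two-class problem, the sketch below derives accuracy, precision and recall from actual and predicted labels. It is not the RapidMiner operator itself; the function name and the sample labels are hypothetical.

```python
def performance(actual, predicted, positive):
    """Compute accuracy, precision and recall for one positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # guard against division by zero when nothing is predicted positive
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # guard against division by zero when no positive case exists
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Hypothetical labels: "yes" means heart disease is present
actual = ["yes", "yes", "no", "no", "yes"]
predicted = ["yes", "no", "no", "yes", "yes"]
scores = performance(actual, predicted, positive="yes")
```

In a medical setting recall is often watched most closely, since a missed positive case (a false negative) is usually more costly than a false alarm.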

Table 4.1: Input Parameter

Sl    Parameter    Value

RESULT AND DISCUSSION

Decision Tree

CONCLUSION