
Disease Prediction


Performance Evaluation of ADTree, Functional Tree and LMT

Indian Journal of Public Health Research & Development, April-June 2018, Vol.9, No. 2

Mining algorithm with Nearest Neighbor Classifier, Genetic Algorithm for selection of an Optimal Reduced Set of Attributes, Naive Bayes with Decision Tree classifier, and the C4.5 and C5.0 decision tree algorithms with rule reduction are compared and investigated in [6-8]. Cluster-based Association Rule Mining, Sequential Minimal Optimization (SMO), Logistic Function and Multilayer Perceptron, and the K-Nearest Neighbor algorithm are examined and compared in [9-12]. Review reports on heart disease prediction by several researchers are summarized in [13-16]. The potential, benefits and usage of data mining (DM) in healthcare to predict diseases are explained in [17-18]. Adaptive Neuro-Fuzzy Inference with a Hybrid Learning algorithm, ANN with Multilayer Perceptron using the Back Propagation algorithm, and Classification and Regression Tree (CART) algorithms are explored and elaborated in [17-21], where their classification accuracy is compared with that of other classifiers in the existing literature.

A combination of Naive Bayes (NB), ANN and Decision Tree, a Cascaded Neural Network classifier with the Support Vector Machine (SVM) algorithm, and a combination of CART, ID3 and Decision Tree classifiers are investigated and compared in [22-25]. Equal Frequency Discretization Gain Ratio Decision Tree, Decision Tree classifier with Bagging algorithm, and a combination of ANN, SVM and K-Means Clustering are examined and compared in [26-27]. A combination of Decision Table, NB and the J48 algorithm is suggested in [28]. Decision Tree with K-Means, and NB and Weighted Associative Classifier with the Apriori algorithm, are examined and compared in [29]. SVM and NB classifiers are separately assessed and compared in [30]. NB combined with Jelinek-Mercer smoothing is explained in [31]. Random Forest and J48 classifiers are separately evaluated and compared in [32]. LMT and FT classifiers are separately examined (with all attributes) and compared in [33]. Memory Based classifiers (with all attributes) are separately analyzed and compared in [34]. ZeroR, RIDOR and PART classifiers (with all attributes) are separately tested and compared in [35]. Functional Tree and Random Forest classifiers (with all attributes) are separately examined and compared in [36]. This work investigates and compares the performance of the ADTree, FT and LMT classifiers with CFSAE for predicting heart disease.

MATERIAL AND METHOD

This research work uses the Heart Disease dataset [37], which has a total of 270 instances with 13 medical attributes. The dataset contains instances of 150 patients without heart disease and 120 patients with heart disease. The class value "1" indicates a healthy patient and the class value "2" indicates a patient affected by heart disease. The attributes are: age, sex, chest, trestbps, chol, fasting blood sugar, restecg, thalach, exang, oldpeak, slope, ca, and thal.

ADTree Classifier: An alternating decision tree (ADTree) generalizes decision trees and has links to boosting. An ADTree consists of an alternation of decision nodes, which specify a predicate condition, and prediction nodes, which contain a single number. An ADTree classifies an instance by following all paths for which all decision nodes are true and summing the prediction nodes that are traversed.
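The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration only: the tree shape, the thresholds and the prediction values are hypothetical and were not learned from the dataset, and the convention that a positive sum maps to class "2" (heart disease) is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Instance = Dict[str, float]


@dataclass
class PredictionNode:
    value: float  # the single number held by a prediction node
    splitters: List["DecisionNode"] = field(default_factory=list)


@dataclass
class DecisionNode:
    condition: Callable[[Instance], bool]  # predicate tested on the instance
    if_true: PredictionNode
    if_false: PredictionNode


def adtree_score(node: PredictionNode, x: Instance) -> float:
    """Sum the prediction values on every path consistent with x."""
    total = node.value
    for decision in node.splitters:
        branch = decision.if_true if decision.condition(x) else decision.if_false
        total += adtree_score(branch, x)
    return total


def adtree_classify(root: PredictionNode, x: Instance) -> int:
    # Assumed convention: positive score -> class 2, otherwise class 1.
    return 2 if adtree_score(root, x) > 0 else 1


# Hypothetical tree using attribute names from the dataset (values are made up).
root = PredictionNode(-0.25, [
    DecisionNode(
        lambda x: x["ca"] >= 1,
        PredictionNode(0.5, [
            DecisionNode(lambda x: x["oldpeak"] > 1.5,
                         PredictionNode(0.25), PredictionNode(-0.25)),
        ]),
        PredictionNode(-0.5),
    ),
    DecisionNode(lambda x: x["thalach"] < 150,
                 PredictionNode(0.25), PredictionNode(-0.25)),
])

patient_a = {"ca": 2, "oldpeak": 2.0, "thalach": 140}
patient_b = {"ca": 0, "oldpeak": 0.0, "thalach": 170}
print(adtree_score(root, patient_a), adtree_classify(root, patient_a))
print(adtree_score(root, patient_b), adtree_classify(root, patient_b))
```

Unlike an ordinary decision tree, several decision nodes can hang from the same prediction node, so one instance may follow more than one path at once; the recursion above adds the contributions of all of them.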

Functional Tree Classifier: The FT classifier constructs "Functional Trees", which have logistic regression functions at the inner nodes and leaves. The FT algorithm can handle numeric and nominal attributes, binary and multi-class variables, and missing values.

LMT Classifier: LMT (Logistic Model Tree) combines Logistic Regression with the Decision Tree algorithm and builds a tree for binary and multi-class variables. LMT produces a single tree as its outcome, containing binary splits on numeric attributes.

Performance Measures Used: Several measures were used to assess the performance of the above classifiers with the CFS evaluator.

Classification Accuracy: Classification accuracy is calculated as the number of correctly classified instances divided by the total number of instances, multiplied by 100.

Mean Absolute Error (MAE): MAE is defined as the average of the absolute differences between the predicted and actual values over all instances. Lower values indicate better performance.

Root Mean Square Error (RMSE): RMSE measures the differences between the values actually observed and the values predicted by the model. It is the square root of the mean squared error.
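The three measures can be written out directly. The sketch below is illustrative: the function names are chosen here, and the short `predicted`/`actual` lists are made-up values, assuming the errors are taken between predicted scores and 0/1 class indicators. Only the accuracy example uses figures from this paper (230 of 270 training instances classified correctly by ADTree).

```python
import math
from typing import Sequence


def classification_accuracy(n_correct: int, n_total: int) -> float:
    """Correctly classified instances / total instances, times 100."""
    return 100.0 * n_correct / n_total


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average absolute difference between predicted and actual values."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def root_mean_squared_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Square root of the mean squared difference."""
    mse = sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    return math.sqrt(mse)


# Accuracy for 230 correct out of 270 instances.
accuracy = classification_accuracy(230, 270)

# Hypothetical predicted scores against 0/1 class indicators.
predicted = [0.5, 1.0, 0.0, 0.5]
actual = [1.0, 1.0, 0.0, 0.0]
mae = mean_absolute_error(predicted, actual)
rmse = root_mean_squared_error(predicted, actual)

print(round(accuracy, 4), mae, round(rmse, 4))
```

Because squaring weights large errors more heavily, RMSE is never smaller than MAE on the same predictions.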

FINDINGS

The performance of the ADTree, FT and LMT classifiers with CFSAE is investigated separately for heart disease prediction. The performance is verified using the training set as well as several cross-validation methods and different percentage splits (PS). The results are obtained using the attributes selected by CFSAE, namely chest, exercise-induced angina (exang), restecg, maximum heart rate achieved (thalach), number of major vessels (ca), oldpeak and thal.
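As a rough illustration of how these test modes divide the 270 instances, the sketch below computes the train/test sizes for a percentage split and the test-fold sizes for k-fold cross-validation. The function names are chosen here, and rounding the training size half up is an assumption; it is adopted because it reproduces the testing counts reported in the tables (135, 92, 67 and 54 instances for the 50%, 66%, 75% and 80% splits).

```python
from typing import List, Tuple


def percentage_split_sizes(n_instances: int, train_percent: int) -> Tuple[int, int]:
    """Return (train_size, test_size); train size rounded half up (assumption)."""
    n_train = (n_instances * train_percent + 50) // 100
    return n_train, n_instances - n_train


def kfold_test_sizes(n_instances: int, k: int) -> List[int]:
    """Sizes of the k test folds; the first folds absorb any remainder."""
    base, extra = divmod(n_instances, k)
    return [base + 1 if i < extra else base for i in range(k)]


for pct in (50, 66, 75, 80):
    print(pct, percentage_split_sizes(270, pct))

for k in (2, 5, 10, 15, 20, 50):
    sizes = kfold_test_sizes(270, k)
    print(k, min(sizes), max(sizes), sum(sizes))
```

In cross-validation every instance is tested exactly once across the folds, which is why the tables list 270 testing instances for each k-fold row.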

ADTree Classifier: The evaluation summary of the ADTree classifier with CFSAE for the entire training set, the cross-validation methods and the percentage splits is shown in Table 1. The ADTree classifier gives 85.19% accuracy on the training data set. Averaged over the cross-validation and percentage-split test modes, it gives about 79.21% accuracy, which is slightly lower than the 79.94% obtained using the ADTree classifier alone, without CFSAE.

Table 1: ADTree Classifier with CFSAE Overall Evaluation Summary

| Test Mode | No. of Instances (Testing) | Correctly Classified Instances | Accuracy (%) | Kappa | MAE | RMSE | Model Building Time (s) |
|---|---|---|---|---|---|---|---|
| Training Set | 270 | 230 | 85.1852 | 0.698 | 0.2674 | 0.3258 | 0.19 |
| 2 Fold CV | 270 | 206 | 76.2963 | 0.5232 | 0.3014 | 0.4013 | 0.09 |
| 5 Fold CV | 270 | 209 | 77.4074 | 0.5421 | 0.2959 | 0.3807 | 0.05 |
| 10 Fold CV | 270 | 213 | 78.8889 | 0.5721 | 0.2947 | 0.3846 | 0.04 |
| 15 Fold CV | 270 | 214 | 79.2593 | 0.5786 | 0.3012 | 0.3857 | 0.08 |
| 20 Fold CV | 270 | 213 | 78.8889 | 0.5714 | 0.3 | 0.3875 | 0.04 |
| 50 Fold CV | 270 | 212 | 78.5185 | 0.5635 | 0.3087 | 0.3929 | 0.06 |
| 50% PS | 135 | 111 | 82.2222 | 0.6433 | 0.299 | 0.3752 | 0.04 |
| 66% PS | 92 | 74 | 80.4348 | 0.6125 | 0.2853 | 0.3628 | 0.04 |
| 75% PS | 67 | 54 | 80.597 | 0.612 | 0.3004 | 0.3786 | 0.05 |
| 80% PS | 54 | 43 | 79.6296 | 0.5948 | 0.3178 | 0.3851 | 0.05 |

FT Classifier: The evaluation summary of the FT classifier with CFSAE for the entire training set, the cross-validation methods and the percentage splits is shown in Table 2. The FT classifier gives 87.78% accuracy on the training data set. Averaged over the cross-validation and percentage-split test modes, it gives about 80.82% accuracy, which is the same as the 80.82% obtained using the FT classifier alone, without CFSAE.

Table 2: FT Classifier with CFSAE Overall Evaluation Summary

| Test Mode | No. of Instances (Testing) | Correctly Classified Instances | Accuracy (%) | Kappa | MAE | RMSE | Model Building Time (s) |
|---|---|---|---|---|---|---|---|
| Training Set | 270 | 237 | 87.7778 | 0.7523 | 0.1869 | 0.3054 | 0.31 |
| 2 Fold CV | 270 | 221 | 81.8519 | 0.6285 | 0.2187 | 0.3906 | 0.17 |
| 5 Fold CV | 270 | 226 | 83.7037 | 0.6672 | 0.1979 | 0.3751 | 0.22 |
| 10 Fold CV | 270 | 224 | 82.963 | 0.6527 | 0.2113 | 0.3829 | 0.16 |
| 15 Fold CV | 270 | 220 | 81.4815 | 0.6244 | 0.2129 | 0.3743 | 0.16 |
| 20 Fold CV | 270 | 218 | 80.7407 | 0.6093 | 0.2099 | 0.4013 | 0.19 |
| 50 Fold CV | 270 | 217 | 80.3704 | 0.6008 | 0.2172 | 0.4031 | 0.2 |
| 50% PS | 135 | 109 | 80.7407 | 0.6149 | 0.2358 | 0.3994 | 0.2 |
| 66% PS | 92 | 70 | 76.087 | 0.5238 | 0.2431 | 0.4513 | 0.23 |
| 75% PS | 67 | 54 | 80.597 | 0.611 | 0.2041 | 0.37 | 0.17 |
| 80% PS | 54 | 43 | 79.6296 | 0.5948 | 0.2462 | 0.4035 | 0.19 |


LMT Classifier: The evaluation summary of the LMT classifier with CFSAE for the entire training set, the cross-validation methods and the percentage splits is shown in Table 3. The LMT classifier gives 85.56% accuracy on the training data set. Averaged over the cross-validation and percentage-split test modes, it gives about 84.73% accuracy, which is 0.74 percentage points higher than the 83.99% obtained using the LMT classifier alone, without CFSAE.

Table 3: LMT Classifier with CFSAE Overall Evaluation Summary

| Test Mode | No. of Instances (Testing) | Correctly Classified Instances | Accuracy (%) | Kappa | MAE | RMSE | Model Building Time (s) |
|---|---|---|---|---|---|---|---|
| Training Set | 270 | 231 | 85.5556 | 0.7058 | 0.2258 | 0.3339 | 1.64 |
| 2 Fold CV | 270 | 227 | 84.0741 | 0.6756 | 0.2328 | 0.3541 | 0.69 |
| 5 Fold CV | 270 | 229 | 84.8148 | 0.6902 | 0.2317 | 0.3503 | 0.49 |
| 10 Fold CV | 270 | 226 | 83.7037 | 0.6683 | 0.2377 | 0.3566 | 1.29 |
| 15 Fold CV | 270 | 227 | 84.0741 | 0.6762 | 0.2383 | 0.3551 | 0.46 |
| 20 Fold CV | 270 | 225 | 83.3333 | 0.6611 | 0.2391 | 0.358 | 0.51 |
| 50 Fold CV | 270 | 227 | 84.0741 | 0.6762 | 0.2384 | 0.3553 | 0.43 |
| 50% PS | 135 | 111 | 82.2222 | 0.6443 | 0.2531 | 0.3588 | 0.46 |
| 66% PS | 92 | 79 | 85.8696 | 0.7182 | 0.2245 | 0.334 | 0.42 |
| 75% PS | 67 | 59 | 88.0597 | 0.7611 | 0.2264 | 0.3293 | 0.61 |
| 80% PS | 54 | 47 | 87.037 | 0.7407 | 0.2317 | 0.3345 | 0.45 |

Comparison of Tree Based Classifiers with and without CFS

Fig 1: Comparison of ADTree, FT and LMT Classifiers with and without CFSAE

The performance of the different classifiers with and without CFSAE is compared in Fig 1 on the basis of accuracy. The complete evaluation is based on the classification accuracy, MAE and RMSE values obtained on the training set and with the cross-validation techniques. Overall, the LMT classifier outperforms the other classifiers, followed by the FT classifier and then the ADTree classifier; the performance of LMT in particular improves with CFSAE.
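The averages and the ranking quoted in this section can be reproduced from the accuracy columns of Tables 1 to 3. The sketch below averages the ten cross-validation and percentage-split rows for each classifier (the training-set row is left out, which is the reading that matches the reported averages) and compares them with the without-CFSAE figures quoted in the text. Variable names are chosen here for illustration.

```python
from statistics import mean

# Accuracy (%) for the 2/5/10/15/20/50-fold CV and 50/66/75/80% split rows.
held_out_accuracy = {
    "ADTree": [76.2963, 77.4074, 78.8889, 79.2593, 78.8889,
               78.5185, 82.2222, 80.4348, 80.597, 79.6296],
    "FT": [81.8519, 83.7037, 82.963, 81.4815, 80.7407,
           80.3704, 80.7407, 76.087, 80.597, 79.6296],
    "LMT": [84.0741, 84.8148, 83.7037, 84.0741, 83.3333,
            84.0741, 82.2222, 85.8696, 88.0597, 87.037],
}

# Average accuracy without CFSAE, as quoted in the text.
without_cfsae = {"ADTree": 79.94, "FT": 80.82, "LMT": 83.99}

with_cfsae = {name: round(mean(values), 2)
              for name, values in held_out_accuracy.items()}
change = {name: round(with_cfsae[name] - without_cfsae[name], 2)
          for name in with_cfsae}
ranking = sorted(with_cfsae, key=with_cfsae.get, reverse=True)

print(with_cfsae)
print(change)
print(ranking)
```

On these figures only LMT gains from attribute selection; FT is unchanged and ADTree loses a little, while the ordering LMT, FT, ADTree holds either way.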

CONCLUSION

This research work investigated the efficiency of the ADTree, FT and LMT classifiers with and without CFSAE for heart disease prediction. The experiments were carried out using an open source machine learning tool, and the classifiers were evaluated on several performance measures. The LMT classifier outperforms the other classifiers, followed by the FT classifier and then the ADTree classifier; the performance of LMT in particular improves with CFSAE.

Ethical Clearance: Data were taken from a publicly available source and cited.

Source of Funding: Self

Conflict of Interest: Nil

REFERENCES

1. Catherine Kreatsoulas and Sonia S Anand. 2010. The impact of social determinants on cardiovascular disease. Can J Cardiol, Vol 26 Suppl C August/September 2010, 8C-13C.

2. Ashish Kumar Sen, Shamsher Bahadur Patel, and D. P. Shukla. 2013. Data Mining Technique for Prediction of Coronary Heart Disease Using Neuro-Fuzzy Integrated Approach Two Level. Int. J. Engg. and Comp. Science, 2(9), 2663-2671.

3. V.Manikantan and S.Latha. 2013. Predicting the Analysis of Heart Disease Symptoms Using Medicinal Data Mining Methods. Int. J. on Adv. Computer Theory and Engg., 2(2), 5-10.

4. M.Akhil Jabbar et al. 2013. Classification of Heart Disease using Artificial Neural Network and Feature Subset Selection. Global J. Computer Sci. and Tech. Neural & Artificial Intelligence, 13(3).

5. G.Karthiga, et al. 2014. Heart Disease Analysis System Using Data Mining Techniques. Int. J. Innovative Research in Sci., Engg. and Tech., 3(SI3), 3101-3105.

6. S.Sandhiya, et al. 2013. Novel Approach for Heart Disease verdict Using Data Mining Technique. Int. J. Modern Engg. Research, pp. 10-14.

7. Shruti Ratnakar, K. Rajeswari, and Rose Jacob. 2013. Prediction of Heart Disease Using Genetic Algorithm For Selection of Optimal Reduced Set of Attributes. Int. J. Adv. Computational Engg and Networking, 1(2), 51-55.

8. Mohammad Taha Khan, Shamimul Qamar and Laurent F.Massin. 2012. A Prototype of Cancer/Heart Disease Prediction Model Using Data Mining. Int. J. Applied Engg. Research, 7(11).

9. M.Akhil Jabbar, Priti Chandra, and B.L.Deekshatulu. 2012. Heart Disease Prediction System using Associative Classification and Genetic Algorithm. Int. Conf. on Emerging Trends in Electrical, Electronics and Comn. Technologies.

10. S.Vijayarani and S.Sudha. 2012. A Study of Heart Disease Prediction in Data Mining. Int. J. Computer Sci. and Information Tech. & Security, 2(5), 1041-1045.

11. S.Vijayarani and S.Sudha. 2013. Comparative Analysis of Classification Function Techniques for Heart Disease Prediction. Int. J. Innovative Research in Computer and Comn. Engg, 1(3), 735-741.

12. Mai Shouman, Tim Turner, and Rob Stocker. 2012. Applying k-Nearest Neighbour in Diagnosing Heart Disease Patients. Int. J. Information and Education Tech., 2(3), 220-223.

13. MA. Jabbar, Priti Chandra, and B.L.Deekshatulu. 2011. Cluster Based Association Rule Mining for Heart Attack Prediction. J. Theoretical and Applied Information Tech., 32(2), 196-201.

14. K.Srinivas, G.RaghavendraRao and A.Govardhan. 2011. Survey on Prediction of Heart Morbidity Using Data Mining Techniques. Int. J. Data Mining & Knowledge Mgt. Process, 1(3), 14-34.

15. S.J.Gnanasoundhari, G.Visalatchi, and M.Balamurugan. 2014. A Survey on Heart Disease Prediction System Using Data Mining Techniques. Int. J. Computer Sci. and Mobile Applications, 2(2), 72-77.

16. Hariganesh,S and Gajenthiran,M. 2014. A Survey: Data Mining Approaches for Prediction of Heart Disease. Int. J. Engg. Sci. Invention, 3(4), 44-46.

17. Boris Milovic and Milan Milovic. 2012. Prediction and Decision Making in Health Care using Data Mining. Int. J. Public Health Science, 1(2), 69-78.

18. Negar Ziasabounchi and Iman Askerzade. 2014. ANFIS Based Classification Model for Heart Disease Prediction. Int. J. Electrical & Computer Sciences, 14(2), 7-12.

19. Chaitrali S.Dangare and Sulabha S.Apte. 2012. A Data Mining Approach for Prediction of Heart Disease using Neural Networks. Int. J. Computer Engg. and Tech., 3(3), 30-40.

20. Mohammad Subhi Al-batah. 2014. Testing the Probability of Heart Disease Using Classification and Regression Tree Model. Annual Research & Review in Biology, 4(11), 1713-1725.

21. Manjusha B.Wadhonkar, P.A.Tijare and S.N.Sawalkar. 2013. Classification of Heart Disease Dataset using Multilayer Feed forward back propagation Algorithm. Int. J. Application or Innovation in Engg. & Mgt., 2(4), 213-220.

22. R.Chitra and V.Seenivasagam. 2013. Heart Disease Prediction System Using Supervised Learning Classifier. Bonfring Int. J. Software Engg. and Soft Computing, 3(1), 1-7.

23. Vikas Chaurasia, et al. 2013. Early Prediction of Heart Diseases Using Data Mining Techniques. Carib.j.SciTech, 1, 208-217.

24. Mai Shouman, Tim Turner, and Rob Stocker. 2011. Using Decision Tree for Diagnosing Heart Disease Patients. Proc. of the 9-th Australasian Data Mining Conference (AusDM'11), Ballarat, Australia, 23-29.

25. K.Thenmozhi, P.Deepika, and M.Meiyappasamy. 2014. Different Data Mining Techniques Involved in Heart Disease Prediction: A Survey. Int. J. Scientific Research, 3(9), 67-68.

26. Aditya Methaila et al. 2014. Early Heart Disease Prediction using Data Mining Techniques. Proc of CCSEIT, DMDB, ICBB, MoWiN, AIAP - 2014, 53-59.

27. Aqueel Ahmed and Shaikh Abdul Hannan. 2012. Data Mining Techniques to Find Out Heart Diseases: An Overview. Int. J. Innovative Tech. and Exploring Engg., 1(4), 18-23.

28. Hari Ganesh.S and Gajenthiran.M. Comparative study of Data Mining Approaches for prediction Heart Diseases. Int. org. Scientific Research IOSR Journal of Engg., 4(7), 36-39.

29. Aswathy Wilson et al. 2014. Data Mining Techniques For Heart Disease Prediction. Int. J. Advances in Computer Sci. and Tech., 3(2), 113-116.

30. G.Parthiban and S.K.Srivatsa. 2012. Applying Machine Learning Methods in Diagnosing Heart Disease for Diabetic Patients. Int. J. Applied Information Systems, 3(7), 25-30.

31. Rupali R.Patil. 2014. Heart Disease Prediction System using Naive Bayes and Jelinek-mercer smoothing. Int. J. Adv. Research in Computer and Comn. Engg., 3(5).

32. Lakshmi Devasena, C. 2016. Proficiency Comparison of Random Forest and J48 Classifiers for Heart Disease Prediction. Int. J. Computing Academic Research, 5(1), 46-55.

33. Lakshmi Devasena, C. 2015. Comparative Analysis of LMT and FT Classifiers for Smart Heart Disease Prophecy. Int. J. Core Engg. & Mgt., Special issue, 368-380.

34. Lakshmi Devasena, C. 2015. Comparative Analysis of Memory Based Classifiers for Intelligent Heart Disease Prediction. Int. J. Applied Engg. Research, 10(81), 109-113.

35. Lakshmi Devasena, C. 2014. Proficiency Comparison of ZeroR, RIDOR and PART Classifiers for Intelligent Heart Disease Prediction. Int. J. Advances in Computer Science and Technology, 3(SI-11), 12-18.

36. Lakshmi Devasena, C. 2015. Comparative Analysis of Functional Tree Classifier and Random Forest Classifier for Smart Heart Disease Prediction. Proc. of 4th Int. Conf. on Communications, Signal Processing Computing and Information Technologies, 6-10.

37. UCI Machine Learning Data Repository – http://archive.ics.uci.edu/ml/datasets.
