Cite as: AIP Conference Proceedings 2168, 020049 (2019); https://doi.org/10.1063/1.5132476. Published Online: 04 November 2019. © 2019 Author(s).


Knee Osteoarthritis Classification Using Support Vector Machine AdaBoost and Decision Tree AdaBoost

Z. Rustam,1, a) J. Pandelaki2 and D. A. Kusuma1

1Department of Mathematics, Faculty of Mathematics and Natural Sciences (FMIPA), Universitas Indonesia, Depok 16424, Indonesia

2Department of Radiology, Universitas Indonesia, Jakarta 10430, Indonesia

a)Corresponding author: [email protected]

Abstract. Osteoarthritis is a chronic joint disease of cartilage that often occurs in elderly people. One of the joints that can be affected by this disease is the knee. Older people often underestimate pain around their joints or do not realize that they have knee osteoarthritis, so the disease becomes more chronic. According to several studies, preventive measures from an early stage are crucial for managing the disease. One such measure is to detect the current stage of the disease, so that the knee osteoarthritis patient can receive the right treatment. In this work, knee osteoarthritis is detected by classifying the stage of knee osteoarthritis patients using SVMAdaBoost and DTAdaBoost. The objective of this research is to compare SVMAdaBoost and DTAdaBoost based on the classification accuracy of both methods. The results show that the classification accuracy of SVMAdaBoost is 85.714 % and that of DTAdaBoost is 75 %.

Keywords: Classification accuracy, DTAdaBoost, osteoarthritis, SVMAdaBoost

INTRODUCTION

Osteoarthritis is the most common chronic disease and the biggest cause of joint disability in elderly people [1].

Osteoarthritis, or degenerative knee joint disease, is defined as a knee abnormality caused by gradual degradation of cartilage, which causes osteophytes and cysts to form on the edge of the bone [2]. The disease is caused by being overweight, getting older, a genetic defect in joint cartilage, joint injury, joints that are not properly formed, and stress on the joints from certain jobs and sports [3]. Osteoarthritis can affect all parts of the body that have joints, such as the fingers, hips and knees; the most common form is knee osteoarthritis. Older people often underestimate pain around their joints or do not realize that they have knee osteoarthritis, and the disease then becomes more chronic. Therefore, early detection is the foundation for controlling knee osteoarthritis, so that patients receive the right treatment and their quality of life improves in the future [4, 5]. One method that can be used for early detection is machine learning. The support vector machine (SVM) is one of the most popular machine learning methods and has been used in many studies for various purposes, for example intrusion detection systems [6], policyholder satisfaction [7], insolvency prediction in insurance companies [8], and brain cancer classification [9].

In this research we used knee osteoarthritis data that had been examined by MRI with a T2Map sequence [10]. We use the knee osteoarthritis data for classification with machine learning, which differs from a previous study [11] that used the data for prediction with machine learning. The machine learning methods used in that study were artificial neural networks (ANN), support vector machines (SVM), random forests, and naive Bayes. In this study SVM is also used, but combined with adaptive boosting (SVMAdaBoost), together with a decision tree (DT) combined with adaptive boosting (DTAdaBoost). SVMAdaBoost and DTAdaBoost are compared based on the classification accuracy of both methods.

METHODS

Here we used the SVMAdaBoost and DTAdaBoost methods to classify knee osteoarthritis. In both methods, AdaBoost is combined with the SVM and DT methods: AdaBoost maintains a weight distribution over the training samples of the base classifier, and the base classifiers in this research are SVM and DT.

AdaBoost

AdaBoost (Adaptive Boosting) is one of the ensemble methods, proposed in 1995 by Schapire and Singer [12]. It is applied to construct a set of base classifiers from training data, given a dataset $Q = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$, $x_i \in \mathbb{R}^n$, $y_i \in \{-1, +1\}$, by machine learning, and to combine them so that the classification results improve [12]. The combined AdaBoost hypothesis can therefore be written as follows:

$$H(x) = \sum_{t=1}^{T} \alpha_t h_t(x) \qquad (1)$$

This equation means that AdaBoost generates the hypotheses $h_t(x) \in \{-1, +1\}$ and determines the weight $\alpha_t$ of each hypothesis. $H$ denotes the combined hypothesis and $t = 1, 2, \ldots, T$ indexes the training cycles of the base classifier. In the first step [13], AdaBoost initializes the weight vector on the training data with the equation:

$$w_i^{(1)} = \frac{1}{N}, \quad i = 1, 2, \ldots, N \qquad (2)$$

with the training error at cycle $t$:

$$\epsilon_t = \sum_{i=1}^{N} w_i^{(t)}\, I\!\left(y_i \neq h_t(x_i)\right) \qquad (3)$$

In the next step, the method determines the weight of the hypothesis $h_t$ following the function:

$$\alpha_t = \frac{1}{2}\log\!\left(\frac{1-\epsilon_t}{\epsilon_t}\right) \qquad (4)$$

and updates the weight vector:

$$w_i^{(t+1)} = \frac{w_i^{(t)}}{R_t} \times \begin{cases} e^{-\alpha_t}, & y_i = h_t(x_i) \\ e^{\alpha_t}, & y_i \neq h_t(x_i) \end{cases} \qquad (5)$$

where $R_t$ is a normalization constant.
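To make the update rules in Eqs. (2)-(5) concrete, the following is a minimal sketch (not taken from the paper) of a single boosting round in Python/NumPy, assuming a generic base learner whose ±1 predictions are already available; the toy labels and predictions are illustrative only.

```python
import numpy as np

def boosting_round(y_true, y_pred, w):
    """One AdaBoost round implementing Eqs. (3)-(5).

    y_true, y_pred : arrays of labels in {-1, +1}
    w              : current sample weights w^(t), summing to 1
    Returns (alpha_t, w^(t+1)), or (None, w) if the base classifier
    is no better than chance (epsilon_t > 0.5).
    """
    miss = y_true != y_pred                                # indicator I(y_i != h_t(x_i))
    eps = np.sum(w[miss])                                  # Eq. (3): weighted training error
    if eps > 0.5:
        return None, w
    alpha = 0.5 * np.log((1.0 - eps) / max(eps, 1e-12))    # Eq. (4)
    w_new = w * np.exp(np.where(miss, alpha, -alpha))      # Eq. (5), before normalization
    w_new /= w_new.sum()                                   # dividing by R_t keeps sum(w) = 1
    return alpha, w_new

# Initialization, Eq. (2): uniform weights over N training samples
N = 8
w = np.full(N, 1.0 / N)
y_true = np.array([1, 1, -1, -1, 1, -1, 1, -1])
y_pred = np.array([1, -1, -1, -1, 1, -1, 1, 1])            # hypothetical base-learner output
alpha, w = boosting_round(y_true, y_pred, w)
```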

SVMAdaBoost

SVM is one of the machine learning algorithms used for regression and classification problems. The main purpose of SVM is to find a hyperplane function that can separate a dataset into two classes (or more). In this case we separate the data into two classes, positive (+1) and negative (-1). Based on Vapnik's theory [12], these classes can be completely separated by the hyperplane function, which is defined as follows:

$$w^T x + b = 0 \qquad (6)$$

Given a dataset $Q = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$, $x_i \in \mathbb{R}^n$, $y_i \in \{-1, +1\}$, we must solve the problem below:

$$\min_{w, b} \ \frac{1}{2}\|w\|^2 \qquad (7)$$

$$\text{s.t.} \quad y_i\left(w^T x_i + b\right) \geq 1, \quad i = 1, 2, \ldots, N \qquad (8)$$

where $w$ is the vector perpendicular to the hyperplane, $y_i$ is the class label, $x_i$ is the input (personal) information for $i = 1, 2, \ldots, N$, and $b$ is the bias of the optimal hyperplane.

There are cases where different classes are mixed in the data; this is called a misclassification error. In this case, slack variables $\xi_i$ and a parameter $C$ are added so that the problem becomes:

$$\min_{w, b} \ \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{N}\xi_i \qquad (9)$$

$$\text{s.t.} \quad y_i\left(w^T x_i + b\right) \geq 1 - \xi_i, \quad \xi_i \geq 0, \quad i = 1, 2, \ldots, N \qquad (10)$$

with $C > 0$.

Then, we define a proper kernel function as the transformation of nonlinear data from the input space to the feature space, so that the problem becomes linearly separable. The kernel function is defined as follows:

$$K(x, x') = \varphi(x)^T \varphi(x') \qquad (11)$$

with several kernel functions [14] (a short numerical sketch of these kernels follows the list):

• Linear kernel: $K(x, x') = x^T x'$

• Polynomial kernel: $K(x, x') = \left(x^T x' + 1\right)^d$

• Radial basis function (RBF) kernel: $K(x, x') = \exp\!\left(-\gamma\|x - x'\|^2\right)$, $\gamma > 0$
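As a brief illustration only (not part of the paper), the three kernels listed above can be evaluated for a pair of input vectors as follows; the polynomial degree d and the RBF parameter gamma are free choices here, not values fixed by the authors.

```python
import numpy as np

def linear_kernel(x, z):
    return x @ z                                   # K(x, x') = x^T x'

def polynomial_kernel(x, z, d=2):
    return (x @ z + 1.0) ** d                      # K(x, x') = (x^T x' + 1)^d

def rbf_kernel(x, z, gamma=0.5):
    return np.exp(-gamma * np.sum((x - z) ** 2))   # K(x, x') = exp(-gamma ||x - x'||^2)

x = np.array([1.0, 2.0])
z = np.array([0.5, -1.0])
print(linear_kernel(x, z), polynomial_kernel(x, z), rbf_kernel(x, z))
```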

As previously explained, SVMAdaBoost is AdaBoost combined with SVM, meaning that the base classifier of AdaBoost is SVM. AdaBoost maintains a weight distribution over the training samples of the SVM. The output of this method is the class given by the combined hypothesis $H(x)$ with better accuracy. The SVMAdaBoost algorithm is shown in Fig. 1, based on Ref. [15].

DTAdaBoost

A decision tree (DT) is a machine learning algorithm that is easy to understand and to implement for classifying datasets. The main purpose of DT is to build a classification model that predicts the value of a target attribute based on the input attributes [16]. The classification model predicts target values from the input attributes by forming a tree-like structure.

Input: dataset $Q = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$, $x_i \in \mathbb{R}^n$, $y_i \in \{-1, +1\}$, the SVM algorithm, and the number of cycles $T$
Initialize: $w_i^{(1)} = 1/N$, $i = 1, 2, \ldots, N$
Do for $t = 1, 2, 3, \ldots, T$:
1. Train an SVM with respect to $w^{(t)}$ and obtain the hypothesis $h_t$
2. Calculate the training error of $h_t$: $\epsilon_t = \sum_i w_i^{(t)} I\!\left(y_i \neq h_t(x_i)\right)$
3. If $\epsilon_t > 0.5$, then stop
4. Determine the weight of hypothesis $h_t$: $\alpha_t = \frac{1}{2}\log\!\left(\frac{1-\epsilon_t}{\epsilon_t}\right)$
5. Update the weights of the training samples: $w_i^{(t+1)} = \frac{w_i^{(t)}}{R_t} \times \begin{cases} e^{-\alpha_t}, & y_i = h_t(x_i) \\ e^{\alpha_t}, & y_i \neq h_t(x_i) \end{cases}$
Output the final hypothesis: $H(x) = \mathrm{sign}\!\left(\sum_t \alpha_t h_t(x)\right)$

FIGURE 1. SVMAdaBoost algorithm.
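The paper does not include code, so the following is only a minimal sketch of one possible reading of the Fig. 1 procedure, assuming scikit-learn's SVC (RBF kernel) as the weighted base learner; the feature matrix X and label vector y (with labels in {-1, +1}) stand in for the knee osteoarthritis data, which is not reproduced here.

```python
import numpy as np
from sklearn.svm import SVC

def svm_adaboost_fit(X, y, T=10, C=1.0, gamma="scale"):
    """AdaBoost with an RBF-kernel SVC as the base classifier (cf. Fig. 1)."""
    N = len(y)
    w = np.full(N, 1.0 / N)                                 # Eq. (2): uniform initial weights
    learners, alphas = [], []
    for t in range(T):
        clf = SVC(kernel="rbf", C=C, gamma=gamma)
        clf.fit(X, y, sample_weight=w)                      # step 1: train SVM w.r.t. w^(t)
        miss = clf.predict(X) != y
        eps = np.sum(w[miss])                               # step 2: weighted training error
        if eps > 0.5:                                       # step 3: stop if worse than chance
            break
        alpha = 0.5 * np.log((1 - eps) / max(eps, 1e-12))   # step 4
        w = w * np.exp(np.where(miss, alpha, -alpha))       # step 5, then normalize by R_t
        w /= w.sum()
        learners.append(clf)
        alphas.append(alpha)
    return learners, alphas

def svm_adaboost_predict(X, learners, alphas):
    """Final hypothesis H(x) = sign(sum_t alpha_t h_t(x))."""
    votes = sum(a * clf.predict(X) for a, clf in zip(alphas, learners))
    return np.sign(votes)
```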


The classification tree consists of a series of branches and nodes. These nodes include the root node, parent nodes, child nodes, and terminal nodes. Each node in the classification tree is connected by a branch and has at most two child nodes. The root node forms parent nodes, parent nodes form child nodes and terminal nodes, and a terminal node is a child node that cannot form any further nodes. The child nodes formed by splitting a node are evaluated using the Gini value. Given a dataset $Q = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$, $x_i \in \mathbb{R}^n$, $y_i \in Y$, where $y$ is an attribute that contains $k$ different classes and $x$ is a feature that will be split into two child nodes, the Gini value of node $m$ is stated as follows:

$$G_m = 1 - \sum_{k} p_{mk}^2 \qquad (12)$$

where $p_{mk}$ is the proportion of objects of class $k$ in node $m$. The Gini values obtained for the child nodes of a split on feature $x$ are then weighted and summed using the Gini index formula:

$$G(m, x) = \sum_{m} \frac{n_m}{n} G_m \qquad (13)$$

with $n_m$ the number of objects in child node $m$ of the split on feature $x$ and $n$ the total number of objects over all nodes of the split on feature $x$.
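As an illustration of Eqs. (12) and (13) only (not taken from the paper), the Gini value of a node and the weighted Gini index of a candidate split can be computed as follows; the example labels are hypothetical.

```python
import numpy as np

def gini_node(labels):
    """Eq. (12): G_m = 1 - sum_k p_mk^2 for the labels falling in node m."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def gini_split(child_label_sets):
    """Eq. (13): child-node Gini values weighted by n_m / n and summed."""
    n = sum(len(c) for c in child_label_sets)
    return sum(len(c) / n * gini_node(c) for c in child_label_sets)

# Hypothetical split of 10 samples into two child nodes
left = np.array([+1, +1, +1, -1])
right = np.array([-1, -1, -1, -1, +1, +1])
print(gini_node(np.concatenate([left, right])), gini_split([left, right]))
```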

Like SVMAdaBoost, DTAdaBoost is AdaBoost combined with a decision tree (DT); the base classifier of this method is DT. DTAdaBoost maintains a set of weights over the training set. The weight of each training sample is changed according to the classification produced by the classifier, and the reweighted training set is then used to build the next classifier. The DTAdaBoost algorithm is shown in Fig. 2, based on Ref. [17].

RESULTS AND DISCUSSION

The dataset that we used is the knee osteoarthritis data that had been examined by MRI with a T2Map sequence [10]. In these experiments, the RBF and polynomial kernels are used for the SVM method, so classification with the SVMAdaBoost RBF kernel is compared with the SVMAdaBoost polynomial kernel. The percentage of training data ranges from 10 % to 90 %.

Based on the results in Table 1, the highest accuracy of classification with the SVMAdaBoost RBF kernel is 85.714 % and with the SVMAdaBoost polynomial kernel is 50 %; these highest values are obtained at 80 % and 70 % training data, respectively. Because SVMAdaBoost with the RBF kernel gives higher accuracy than with the polynomial kernel, we then used the SVMAdaBoost RBF kernel and compared it with DTAdaBoost. Thus, the knee osteoarthritis data was classified using SVMAdaBoost and DTAdaBoost with training data percentages from 10 % to 90 %.

Returning to the research objective, the aim of this research is to compare SVMAdaBoost and DTAdaBoost based on the classification accuracy of both methods, reported in Table 2. From Table 2 we can see that the highest accuracy of SVMAdaBoost is 85.714 % at 80 % training data, and the highest accuracy of DTAdaBoost is 75 % at 90 % training data. The highest accuracy of both methods, 85.714 %, is therefore given by SVMAdaBoost.
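The 10 %-90 % protocol behind Tables 1 and 2 could be reproduced along the lines of the sketch below; this is only an assumed setup, since the paper does not state how the splits were drawn. Here clf stands for any scikit-learn-style classifier (for example the DTAdaBoost estimator sketched after Fig. 2), and X, y stand in for the non-public T2Map features and stage labels.

```python
import numpy as np
from sklearn.base import clone
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def accuracy_by_training_fraction(clf, X, y, fractions=np.arange(0.1, 1.0, 0.1)):
    """Fit a fresh copy of clf on each training fraction and report test accuracy."""
    results = {}
    for frac in fractions:
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, train_size=float(frac), random_state=0, stratify=y
        )
        model = clone(clf).fit(X_tr, y_tr)
        results[int(round(frac * 100))] = accuracy_score(y_te, model.predict(X_te))
    return results
```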

Input: dataset $Q = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$, $x_i \in \mathbb{R}^n$, $y_i \in \{-1, +1\}$, the DT algorithm, and the number of cycles $T$
Initialize: $w_i^{(1)} = 1/N$, $i = 1, 2, \ldots, N$
Do for $t = 1, 2, 3, \ldots, T$:
1. Train a decision tree with respect to $w^{(t)}$ and obtain the hypothesis $h_t$
2. Calculate the training error of $h_t$: $\epsilon_t = \sum_i w_i^{(t)} I\!\left(y_i \neq h_t(x_i)\right)$
3. If $\epsilon_t > 0.5$, then stop
4. Determine the weight of hypothesis $h_t$: $\alpha_t = \frac{1}{2}\log\!\left(\frac{1-\epsilon_t}{\epsilon_t}\right)$
5. Update the weights of the training samples: $w_i^{(t+1)} = \frac{w_i^{(t)}}{R_t} \times \begin{cases} e^{-\alpha_t}, & y_i = h_t(x_i) \\ e^{\alpha_t}, & y_i \neq h_t(x_i) \end{cases}$
Output the final hypothesis: $H(x) = \mathrm{sign}\!\left(\sum_t \alpha_t h_t(x)\right)$

FIGURE 2. DTAdaBoost algorithm.
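Since Fig. 2 differs from Fig. 1 only in the base learner, one hedged way to realize DTAdaBoost is scikit-learn's AdaBoostClassifier with a shallow DecisionTreeClassifier. This is a sketch under the assumption of scikit-learn ≥ 1.2 (where the base learner is passed as `estimator`) and uses synthetic data in place of the knee osteoarthritis features; it is not the authors' exact implementation.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the knee osteoarthritis features (the real MRI data is not public here)
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.8, random_state=0)

dt_adaboost = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # decision-tree base classifier
    n_estimators=50,
    random_state=0,
)
dt_adaboost.fit(X_train, y_train)
print("DTAdaBoost accuracy:", accuracy_score(y_test, dt_adaboost.predict(X_test)))
```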


TABLE 1. Accuracy of SVMAdaBoost.

Training Data (%)    RBF Kernel Accuracy (%)    Polynomial Kernel Accuracy (%)
10                   53.333                     40
20                   55.556                     33.333
30                   58.333                     33.333
40                   65                         25
50                   70.588                     23.529
60                   71.429                     35.714
70                   80                         50
80                   85.714                     28.571
90                   75                         25

TABLE 2. Accuracy of SVMAdaBoost and DTAdaBoost.

Training Data (%)    SVMAdaBoost Accuracy (%)    DTAdaBoost Accuracy (%)
10                   53.333                      43.333
20                   55.556                      44.444
30                   58.333                      45.833
40                   65                          50
50                   70.588                      52.941
60                   71.249                      57.143
70                   80                          70
80                   85.714                      71.429
90                   75                          75

CONCLUSION

AdaBoost is one of the ensemble methods that can improve the classification results of a base classifier. The base classifiers we used are SVM and DT: SVMAdaBoost is AdaBoost combined with SVM as the base classifier and, likewise, DTAdaBoost is AdaBoost combined with a decision tree (DT). Based on the accuracies of SVMAdaBoost and DTAdaBoost in Table 2, the classification accuracy of SVMAdaBoost is 85.714 %, which is higher than that of DTAdaBoost. Therefore, the support vector machine AdaBoost (SVMAdaBoost) method performs better than the decision tree AdaBoost (DTAdaBoost) method on this dataset.

ACKNOWLEDGMENTS

This research was supported by the Ministry of Research, Technology and Higher Education of the Republic of Indonesia (KEMENRISTEKDIKTI) under the PDUPT 2018 research grant scheme, ID number 389/UN2.R3.1/HKP05.00/2018.

REFERENCES

1. C. G. Helmick et al., Arthritis Rheum. 58, 15-25 (2008).

2. Gale Encyclopedia of Medicine, Osteoarthritis (2008), available at http://medical-dictionary.thefreedictionary.com/osteoarthritis

3. National Institute of Arthritis and Musculoskeletal and Skin Diseases, What Is Osteoarthritis (2018), available at https://www.niams.nih.gov/health-topics/osteoarthritis#tab-causes

4. A. Anandacoomarasamy et al., Ann. Rheum. Dis. 71, 26-32 (2012).

5. A. S. Gersing et al., Osteoarthritis and Cartilage 24, 1126-34 (2016).

6. Z. Rustam and D. Zahras, J. Phys. Conf. Ser. 1028, 012227 (2018).

7. Z. Rustam and N. P. A. Ariantari, J. Phys. Conf. Ser. 1028, 012005 (2018).

8. Z. Rustam and F. Yaurita, J. Phys. Conf. Ser. 1028, 012118 (2018).

9. V. Panca and Z. Rustam, AIP Conf. Proc. 1862, 030133 (2017).

10. C. Pamela, Ph.D. thesis, Universitas Indonesia, Depok, 2017.

11. Y. Du, J. Shan and M. Zhang, IEEE 1, 671-677 (2017).

12. R. E. Schapire and Y. Singer, Machine Learning 37, 297-336 (1999).

13. X. Wu et al., Knowl. Inf. Syst. 14, 1-37 (2008).

14. D. A. Sachindra, K. Ahmed, Md. Mamunur Rashid, S. Shahid, and B. J. C. Perera, Atmospheric Research 212, 240-58.

15. X. Li, L. Wang and E. Sung, Eng. Appl. Artif. Intell. 21, 785-95 (2008).

16. M. Taamneh, J. Safety 66, 121-129 (2018).

17. H. Drucker and C. Cortes, Adv. Neural Inf. Process. Syst. 8, 479-85 (1995).
