Lung Cancer Classification using Support Vector Machine and Hybrid Particle Swarm Optimization-

Genetic Algorithm

Faisa Maulidina Department of Mathematics

University of Indonesia Depok, Indonesia [email protected]

Zuherman Rustam Department of Mathematics

University of Indonesia Depok, Indonesia

[email protected]

Jacub Pandelaki Department of Radiology Cipto Mangunkusumo Hospital

Jakarta, Indonesia [email protected]

Abstract—Cancer is an uncontrolled growth of abnormal cells in the body. It affects different parts of the body, and the type associated with the lungs is known as lung cancer. Some of the factors increasing a person's risk of the disease include smoking, a family history of lung cancer, radiation exposure, and HIV infection. Although this disease has been diagnosed in many ways, there are still some errors in the process. Therefore, this study proposed the classification of lung cancer using a machine learning method to avoid these errors.

This involved using the CT scan dataset obtained from Cipto Mangunkusumo Hospital, Jakarta, Indonesia, and the application of the Particle Swarm Optimization-Genetic Algorithm-Support Vector Machine (PSO-GA-SVM) method of classification. The Particle Swarm Optimization-Genetic Algorithm (PSO-GA) method was used to optimize the parameters of the Support Vector Machine. Moreover, the accuracy, precision, recall, and f1-score of the method were measured to evaluate its performance and later compared with those of the SVM without parameter optimization. The results showed that the classification using PSO-GA-SVM had better performance than the Support Vector Machine without parameter optimization, as indicated by its accuracy, precision, recall, and f1-score values of 97.69%, 98.46%, 98.82%, and 97.66% respectively.

Keywords—optimization, classification, particle swarm optimization, genetic algorithm, support vector machine

I. INTRODUCTION

Cancer is the uncontrolled growth of abnormal cells in the body. It is known as lung cancer when the organ affected is the lungs, and this type has been reported to be the leading cause of cancer death worldwide [1]. According to the World Health Organization (WHO), it is the most common type of cancer, with 2.09 million cases in 2018, and the most common cause of cancer death, with a mortality rate of 1.76 million people.

These figures show there is a need to prevent and control this disease through appropriate diagnosis from doctors.

Several methods have been used for diagnosing this disease, but some errors are observed in the process. Therefore, a machine learning method was proposed for classifying lung cancer to assist doctors in achieving more precise diagnoses. This study used the Support Vector Machine for the classification, with its parameters optimized by combining the Particle Swarm Optimization and Genetic Algorithm methods before application.

The Particle Swarm Optimization-Genetic Algorithm-Support Vector Machine (PSO-GA-SVM) method was used to classify the lung cancer CT scan dataset obtained from the Department of Radiology, Cipto Mangunkusumo Hospital, Jakarta, Indonesia. This dataset consists of 252 observations divided into two classes: 127 lung cancer patients and 125 normal patients.

Support Vector Machine (SVM) is one of the supervised learning methods which can be used for classification and regression [2]. It maps the input vector to a higher dimensional space where the maximum hyperplane separator is formed to maximize the distance between the two parallel hyperplanes.

Genetic Algorithm (GA) is a technique based on biological evolutionary processes which is normally used to solve complex optimization problems [3]. Its main components are selection, crossover, and mutation [3]. Meanwhile, Particle Swarm Optimization (PSO) is a meta-heuristic algorithm that is generally used in discrete, continuous, and combinatorial optimization problems [4]. In the context of the PSO, a single solution is called a particle while a set of all solutions is called a swarm. The main idea is that each particle only knows its current velocity, the best position of the particle (pbest), and the globally best position in the swarm (gbest) [4].

PSO and GA are both population-based optimization algorithms with their own advantages and disadvantages. PSO has memory, which allows all the particles to retain the knowledge of their optimal solutions, but it suffers from premature convergence due to its lack of diversity. Meanwhile, GA is better at avoiding local optima, but the information in individuals that are not selected is lost. Furthermore, the convergence speed of GA is quite slow because it requires evolutionary operators such as selection, crossover, and mutation to generate a solution, whereas the advantage of PSO is its fast convergence. Therefore, it is possible to obtain a better algorithm by combining the advantages and overcoming the disadvantages of both methods.

In previous studies, lung cancer data has been classified using several machine learning methods [5, 6, 7]. Moreover, SVM has been used to classify data for different diseases [8, 9, 10, 11, 12], while PSO and GA have been applied for optimization and feature selection for some diseases [3, 13, 14]. Those previous studies form the basis of the proposed method, which uses SVM with PSO-GA optimization to classify a lung cancer dataset. The novelty of this research lies in using the roulette wheel method for selection in the genetic algorithm and a uniform crossover for the crossover operation. In addition, the dataset used is new and has never been used before. The results of the PSO-GA-SVM method are compared with SVM without optimization based on the accuracy, precision, recall, and f1-score.

2021 International Conference on Decision Aid Sciences and Application (DASA) | 978-1-6654-1634-4/21/$31.00 ©2021 IEEE | DOI: 10.1109/DASA53625.2021.9682259


II. MATERIAL AND METHODS

A. Dataset

The lung cancer dataset was obtained from the Department of Radiology, Cipto Mangunkusumo Hospital, Jakarta, Indonesia. It consists of 252 observations divided into two classes: 127 lung cancer patients, labeled as class 1, and 125 normal patients, labeled as class 0. The dataset contains 8 features, as indicated by the first five patient samples in Table I.

TABLE I. THE FEATURES IN THE LUNG CANCER DATASET

Area   | Min  | Max  | Average | SD    | Sum     | Length | Diagnosis
106.53 | -926 | -773 | -863    | 29.5  | -156324 | -36.6  | 0
76.07  | -924 | -755 | -850    | 29.02 | -121557 | 31.01  | 0
109.76 | -42  | 88   | 28.45   | 24.86 | 5576    | 37.17  | 1
135.93 | -917 | -683 | -875    | 41.11 | -284527 | 41.34  | 0
97.55  | -12  | 65   | 32.59   | 14.05 | 7562    | 35.01  | 1

B. Particle Swarm Optimization (PSO)

Particle Swarm Optimization (PSO) is designed based on the behavior of a flock of birds. It is a population-based search algorithm whose population is initialized randomly as particles. Moreover, each particle is assumed to have two characteristics, position and velocity, and always provides updated information on its best position to the other particles.

Each particle has a fitness function designed according to its problem and it is remembered as a personal best (pbest) when it moves to a new position in the search space. It also provides information to others and remembers its global best position (gbest). Subsequently, each particle evaluates its direction and acceleration according to pbest and gbest to move in the optimum direction and search for the optimum solution.

The particle 𝑖’s position and velocity in a certain spatial dimension can be written using Equations (1) and (2).

x_i = {x_{i1}, x_{i2}, …, x_{iD}}  (1)

v_i = {v_{i1}, v_{i2}, …, v_{iD}}  (2)

Meanwhile, the formulas to calculate the change in velocity and position of particle i can be written as follows [15]:

v_{id}^{t+1} = ω·v_{id}^{t} + c_1·r_{1i}·(p_{id} − x_{id}^{t}) + c_2·r_{2i}·(p_{gd} − x_{id}^{t})  (3)

x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1},  d = 1, 2, …, D  (4)

where t denotes the iteration of the process, d denotes the dimension in the search space, ω is the inertia weight, c_1 and c_2 are the cognitive and social ability parameters, and r_{1i} and r_{2i} represent random numbers uniformly distributed in the interval [0, 1]. Moreover, p_{id} and p_{gd} represent the elements of pbest and gbest in the d-th dimension.
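As a concrete illustration, Equations (3) and (4) amount to a few lines of NumPy. The array names, swarm size, and coefficient values below are illustrative choices, not values taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

n_particles, dim = 20, 4          # swarm size and search-space dimension (illustrative)
omega, c1, c2 = 0.7, 2.0, 2.0     # inertia weight, cognitive and social coefficients

x = rng.uniform(-1, 1, (n_particles, dim))   # positions x_i
v = np.zeros((n_particles, dim))             # velocities v_i
pbest = x.copy()                             # each particle's best-known position
gbest = x[0].copy()                          # swarm's best-known position

# One iteration of Equations (3) and (4):
r1 = rng.uniform(0, 1, (n_particles, dim))
r2 = rng.uniform(0, 1, (n_particles, dim))
v = omega * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
x = x + v
```

In a full run, pbest and gbest would be refreshed after each position update before the next iteration.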

C. Genetic Algorithm (GA)

Genetic Algorithm is a search algorithm based on the mechanism of natural selection and genetics known as the evolutionary process [16]. Its main components include selection, crossover, and mutation [3]. This study used roulette wheel selection, a method that selects individual pairs for the crossover with probability proportional to their fitness values. The probability for an individual to be selected as a parent for crossover is shown in the following formula:

P(i) = f(i) / Σ_j f(j)  (5)

where f(i) is the fitness value of individual i, and P(i) is the probability that individual i will be selected.
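A minimal sketch of this selection rule in Python; the fitness values are made up, and the note about error-based fitness is an implementation consideration rather than something the paper specifies:

```python
import numpy as np

rng = np.random.default_rng(1)

def roulette_select(fitness, rng):
    """Pick one index with probability P(i) = f(i) / sum_j f(j), as in Equation (5).

    Assumes non-negative fitness values. For an error-based fitness such as
    RMSE (where lower is better), one would typically transform the values
    (e.g. use 1/f) before applying this rule, so better individuals get
    higher selection probability.
    """
    p = np.asarray(fitness, dtype=float)
    p = p / p.sum()                      # normalize into selection probabilities
    return rng.choice(len(p), p=p)

fitness = [4.0, 1.0, 3.0, 2.0]           # illustrative values
# Select a pair of parents for the crossover step:
parents = [roulette_select(fitness, rng) for _ in range(2)]
```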

The fitness value used was the root mean squared error (RMSE) which was evaluated using the SVM classifier, and the formula was obtained as follows:

f(i) = RMSE = √( Σ_{j=1}^{n} (y_{actual,j} − y_{predicted,j})² / n )  (6)

The crossover has a parameter known as the crossover percentage (cp) in the interval (0, 1), and the number of offspring produced is obtained by multiplying cp by the number of chromosomes. Moreover, the chromosomes of the offspring mutate after the completion of the crossover process. Each offspring chromosome in the population is assigned a value randomly generated in the interval (0, 1), called the chromosomal probability, and the chromosomes that mutate are those whose probability is lower than the mutation rate (mu).
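The uniform crossover and per-gene mutation described above can be sketched as follows; the gene values, bounds, and helper names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

def uniform_crossover(parent1, parent2, rng):
    """Uniform crossover: each gene is taken from either parent with equal probability."""
    mask = rng.uniform(0, 1, size=parent1.shape) < 0.5
    return np.where(mask, parent1, parent2)

def mutate(child, mu, low, high, rng):
    """Re-draw each gene whose random draw falls below the mutation rate mu."""
    mask = rng.uniform(0, 1, size=child.shape) < mu
    child = child.copy()
    child[mask] = rng.uniform(low, high, size=mask.sum())
    return child

# Illustrative parent chromosomes and bounds:
p1 = np.array([1.0, 2.0, 3.0, 4.0])
p2 = np.array([5.0, 6.0, 7.0, 8.0])
child = mutate(uniform_crossover(p1, p2, rng), mu=0.1, low=0.0, high=10.0, rng=rng)
```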

D. Hybrid Particle Swarm Optimization-Genetic Algorithm (PSO-GA)

Particle Swarm Optimization (PSO) and Genetic Algorithm (GA) were combined in this study to optimize the SVM’s parameters. The flowchart of the optimization process is presented in the following figure [17].

[Figure: flowchart — Start → Population Initialization → Calculating the fitness value → Updating the pbest and gbest positions of the particles → Updating the velocity and position → GA process (Selection → Crossover → Mutation) → Satisfy the terminal condition? If No, repeat the loop; if Yes, Optimal results → End]

Fig. 1. Flowchart of the PSO-GA optimization
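The loop in Fig. 1 can be condensed into a short sketch. This is a minimal interpretation of the hybrid, assuming the GA step breeds children from the swarm via roulette selection on inverse fitness (since the RMSE fitness is minimized), uniform crossover, and mutation, replacing the worst particles; the defaults mirror the paper's first parameter set, but the replacement strategy is an assumption:

```python
import numpy as np

def pso_ga(fitness, low, high, n_pop=20, dim=4, omega=1.0, c1=2.0, c2=2.0,
           cp=0.6, mu=0.02, max_iter=30, seed=0):
    """Minimal sketch of the hybrid loop in Fig. 1, minimizing `fitness`."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(low, high, (n_pop, dim))
    v = np.zeros((n_pop, dim))
    pbest = x.copy()
    pbest_f = np.array([fitness(p) for p in x])
    gbest = pbest[pbest_f.argmin()].copy()

    for _ in range(max_iter):
        # --- PSO step: update velocities and positions (Equations (3)-(4)) ---
        r1, r2 = rng.uniform(0, 1, (2, n_pop, dim))
        v = omega * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, low, high)
        f = np.array([fitness(p) for p in x])

        # --- GA step: roulette selection (on 1/f), uniform crossover, mutation ---
        probs = 1.0 / (f + 1e-12)
        probs /= probs.sum()
        for _ in range(int(cp * n_pop)):
            i, j = rng.choice(n_pop, size=2, p=probs)
            child = np.where(rng.uniform(0, 1, dim) < 0.5, x[i], x[j])
            m = rng.uniform(0, 1, dim) < mu
            child[m] = rng.uniform(low, high, m.sum())
            cf = fitness(child)
            worst = f.argmax()
            if cf < f[worst]:             # child replaces the worst particle
                x[worst], f[worst] = child, cf

        # --- bookkeeping: update pbest and gbest ---
        better = f < pbest_f
        pbest[better] = x[better]
        pbest_f[better] = f[better]
        gbest = pbest[pbest_f.argmin()].copy()

    return gbest, pbest_f.min()

# Smoke test on a sphere function instead of an SVM fitness:
best, best_f = pso_ga(lambda p: float(np.sum(p ** 2)), low=-5.0, high=5.0)
```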

E. Support Vector Machine (SVM)

The main concept of SVM is to determine an optimal hyperplane that is perpendicular to a pattern by maximizing the margin or distance between the hyperplane and its closest pattern. It is also important to note that the pattern observed to be the closest to the optimal hyperplane is known as the support vector [18]. The process of determining an optimum hyperplane using the SVM is presented in Figure 2 [19].

Fig. 2. Illustration of SVM in Finding the Optimum Hyperplane

Consider a dataset with inputs x_i ∈ R^d and target outputs y_i ∈ {−1, +1} for i = 1, 2, ..., n, where n is the number of data points. It is then possible to separate the classes −1 and +1 using a hyperplane presented in the following equation:

w·x + b = 0  (7)

The hyperplane formed can separate the data into two classes with positive or negative values, such that those in the positive class are labeled as y_i = +1 for i = 1, 2, ..., N, and this means x_i can be defined as follows:

w·x_i + b ≥ 1  (8)

Then, when the x_i data belongs to the negative class, where y_i = −1 for i = 1, 2, ..., N, it can be defined as:

w·x_i + b ≤ −1  (9)

Therefore, each x_i data point with label y_i ∈ {−1, +1} for i = 1, 2, ..., N satisfies:

y_i(w·x_i + b) ≥ 1  (10)

SVM uses Quadratic Programming as the basic formulation in Equation (11) as follows:

min (1/2)‖w‖²  (11)

with constraint

y_i(w·x_i + b) ≥ 1,  i = 1, 2, ..., N  (12)

Therefore, it is possible to solve this constrained problem using the Lagrange multiplier method defined in the following Equation (13):

L(w, b, α) = (1/2)‖w‖² − Σ_{i=1}^{N} α_i [y_i(w·x_i + b) − 1]  (13)

This was used to obtain the w and b values presented in Equations (14) and (15) respectively, which were later substituted into the function f(x) in Equation (16).

w = Σ_{i=1}^{N} α_i y_i x_i  (14)

b = (1/N_s) Σ_{s∈S} ( y_s − Σ_{j∈S} α_j y_j x_j·x_s )  (15)

where S is the set of support vectors and N_s is its size.

f(x) = sign(w·x + b)  (16)

F. Performance Measurement

The performance of PSO-GA-SVM in classifying the lung cancer dataset was measured using the confusion matrix presented in Table II, whose TP, FN, FP, and TN counts were used to estimate the accuracy, precision, recall, and f1-score values for evaluating the classification model.

TABLE II. CONFUSION MATRIX

                              Predicted Class
Actual Class  | Lung Cancer          | Normal
Lung Cancer   | TP (True Positive)   | FN (False Negative)
Normal        | FP (False Positive)  | TN (True Negative)

The accuracy, precision, recall, and f1-score formulas are also presented in the following Table III.

TABLE III. THE FORMULAS OF EACH PERFORMANCE MEASUREMENT

Performance Measurement (%) | Formula
Accuracy                    | (TP + TN) / (TP + TN + FP + FN) × 100%
Precision                   | TP / (TP + FP) × 100%
Recall                      | TP / (TP + FN) × 100%
F1-Score                    | (2 × Precision × Recall) / (Precision + Recall)
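The four formulas in Table III follow directly from the confusion-matrix counts; the counts in the example below are made up for illustration, not taken from the paper:

```python
def classification_metrics(tp, fn, fp, tn):
    """Accuracy, precision, recall, and F1-score from Table III, as percentages."""
    accuracy = (tp + tn) / (tp + tn + fp + fn) * 100
    precision = tp / (tp + fp) * 100
    recall = tp / (tp + fn) * 100
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Illustrative counts for a 252-observation split (not the paper's results):
acc, prec, rec, f1 = classification_metrics(tp=120, fn=7, fp=5, tn=120)
```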

III. RESULTS AND DISCUSSION

The PSO parameters tuned in the PSO-GA optimization were n_pop, ω, c_1, and c_2, while those for the GA were mu and cp. Moreover, the SVM parameters optimized were the kernel, C, γ, and d, and the quality of each candidate was measured using the fitness value.

The SVM parameters were then used to classify the lung cancer data in two experiments based on different parameter values for the PSO-GA optimization: n_pop = 20, c_1 = c_2 = 2, ω = 1, cp = 0.6, mu = 0.02, and n_pop = 40, c_1 = c_2 = 2, ω = 0.33, cp = 0.9, mu = 0.1, with reference to previous studies [20, 21, 22]. The second parameter set was selected to compare the working of the method under different PSO and GA parameters. Moreover, 4 dimensions were selected to produce 4 outputs, with the upper and lower bounds adjusted to each output dimension: C = [0.01, 35000], γ = [0.0001, 32], d = [2, 5], kernel = [1, 3]. These upper and lower limits were selected in line with previous studies [3, 23, 24, 25].

Lastly, the maximum number of iterations (t) was set to 30 to ensure the solution value converges.
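One way to connect a 4-dimensional particle to a concrete SVM configuration is to clip each coordinate to the bounds above and round the discrete ones; the index-to-kernel mapping and the rounding rule below are assumptions, as the paper does not state how continuous positions are discretized:

```python
import numpy as np

# Lower and upper bounds for the four output dimensions, as given in the text:
LOW = np.array([0.01, 0.0001, 2.0, 1.0])     # C, gamma, d, kernel index
HIGH = np.array([35000.0, 32.0, 5.0, 3.0])

KERNELS = {1: "linear", 2: "rbf", 3: "poly"}  # index-to-kernel mapping (assumed)

def decode_particle(particle):
    """Clip a 4-dimensional particle to its bounds and map it to SVM parameters.

    Rounding d and the kernel index to integers is an implementation
    assumption; only the bounds themselves come from the paper.
    """
    p = np.clip(particle, LOW, HIGH)
    return {
        "C": float(p[0]),
        "gamma": float(p[1]),
        "degree": int(round(p[2])),
        "kernel": KERNELS[int(round(p[3]))],
    }

params = decode_particle(np.array([27.59, 0.5, 2.2, 3.4]))
```

The resulting dictionary could be passed directly as keyword arguments to an SVM implementation when evaluating the fitness of a particle.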


The two experiments were repeated 5 times to compare the results for different parameters in each repetition and the results obtained for 𝐶, 𝛾, d, and kernel in one of the experiments are presented in the following Table IV. The optimized parameters were later used in the PSO-GA-SVM method.

TABLE IV. 1ST EXPERIMENT: OPTIMIZED PARAMETERS C, γ, d, AND KERNEL WITH INITIALIZATION n_pop = 20, c_1 = c_2 = 2, ω = 1, cp = 0.6, mu = 0.02, OUTPUT DIMENSION = 4, AND MAXIMUM ITERATION = 30

Training Data | C                  | γ | d | Kernel
50%           | 27.592995045610763 | - | 2 | Polynomial
60%           | 13.276671270350292 | - | 5 | Polynomial
70%           | 3.141719257174098  | - | 4 | Polynomial
80%           | 2.973433242969763  | - | 2 | Polynomial
90%           | 0.2701977417290249 | - | 4 | Polynomial

A. Support Vector Machine Classification Results with PSO-GA Parameter Optimization

PSO-GA was used to optimize the parameters of the SVM, which were later applied to classify the lung cancer data. This was followed by determining the averages of the accuracy, precision, recall, and f1-score values obtained from the five experiments, as indicated in Tables V and VI.

TABLE V. THE AVERAGE RESULTS OF ACCURACY, PRECISION, RECALL, AND F1-SCORE USING OPTIMIZED PARAMETERS C, γ, d, AND KERNEL (n_pop = 20, c_1 = c_2 = 2, ω = 1, cp = 0.6, mu = 0.02)

Training Data | Accuracy | Precision | Recall | F1-Score
50%           | 96.66%   | 96.52%    | 96.83% | 96.63%
60%           | 97.02%   | 95.71%    | 98.82% | 97.17%
70%           | 95.25%   | 93.59%    | 97.36% | 95.40%
80%           | 94.50%   | 94.77%    | 94.69% | 94.68%
90%           | 96.15%   | 95.60%    | 96.92% | 96.23%

These results showed that the best average accuracy, recall, and f1-score were obtained with 60% training data, with values of 97.02%, 98.82%, and 97.17% respectively. Meanwhile, the best value for precision, 96.52%, was obtained using 50% of the training data.

TABLE VI. THE AVERAGE RESULTS OF ACCURACY, PRECISION, RECALL, AND F1-SCORE USING OPTIMIZED PARAMETERS C, γ, d, AND KERNEL (n_pop = 40, c_1 = c_2 = 2, ω = 0.33, cp = 0.9, mu = 0.1)

Training Data | Accuracy | Precision | Recall | F1-Score
50%           | 92.37%   | 95.79%    | 89.21% | 91.39%
60%           | 94.64%   | 94.43%    | 95.28% | 94.70%
70%           | 94.20%   | 93.70%    | 95.25% | 94.34%
80%           | 94.89%   | 95.01%    | 95.38% | 95.06%
90%           | 97.69%   | 98.46%    | 96.92% | 97.66%

The results in Table VI showed that the best average values for accuracy, precision, recall, and f1-score were 97.69%, 98.46%, 96.92%, and 97.66% respectively, obtained using 90% of the training data.

B. Support Vector Machine Classification Results without PSO-GA Parameter Optimization

The classification in this section was conducted using the polynomial SVM, because the authors had already tried other kernels and the polynomial kernel produced the best results. The default parameter values of C = 1 and d = 3 were used. The average performance over the five experiments was recorded based on accuracy, precision, recall, and f1-score, as indicated in Table VII.
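A baseline of this kind can be reproduced with scikit-learn; the synthetic data below merely stands in for the hospital dataset, which is not public:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Synthetic stand-in for the 252-observation, 8-feature hospital dataset:
rng = np.random.default_rng(0)
X = rng.normal(size=(252, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # artificial binary labels

X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.8, random_state=0)

# Polynomial-kernel SVM with the default parameters used in the paper:
clf = SVC(kernel="poly", C=1.0, degree=3).fit(X_train, y_train)
y_pred = clf.predict(X_test)

scores = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}
```

Varying `train_size` from 0.5 to 0.9 would reproduce the experimental grid reported in Table VII.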

TABLE VII. THE AVERAGE RESULTS OF ACCURACY, PRECISION, RECALL, AND F1-SCORE WITH SVM POLYNOMIAL KERNEL (C = 1 AND d = 3)

Training Data | Accuracy | Precision | Recall | F1-Score
50%           | 90.33%   | 89.79%    | 90.68% | 90.34%
60%           | 90.18%   | 89.52%    | 90.39% | 90.20%
70%           | 90.42%   | 89.97%    | 90.52% | 90.43%
80%           | 90.90%   | 90.90%    | 90.90% | 90.90%
90%           | 89.50%   | 88.48%    | 90.33% | 89.60%

The results in the table showed that the best average results for accuracy, precision, recall, and f1-score were 90.90% each, obtained with the use of 80% of the training data.

C. Comparison of the SVM Results with and without PSO-GA Parameter Optimization

The accuracy, precision, recall, and f1-score of PSO-GA-SVM and SVM were compared, and the results are presented in Table VIII.

TABLE VIII. COMPARISON BETWEEN PSO-GA-SVM (n_pop = 40, c_1 = c_2 = 2, ω = 0.33, cp = 0.9, mu = 0.1) AND SVM (C = 1 AND d = 3) BASED ON THE ACCURACY, PRECISION, RECALL, AND F1-SCORE

Method     | Accuracy | Precision | Recall | F1-Score
PSO-GA-SVM | 97.69%   | 98.46%    | 98.82% | 97.66%
SVM        | 90.90%   | 90.90%    | 90.90% | 90.90%

It was discovered that the best average results for accuracy, precision, and f1-score were 97.69%, 98.46%, and 97.66% respectively, obtained using the second PSO-GA-SVM parameter set (n_pop = 40, c_1 = c_2 = 2, ω = 0.33, cp = 0.9, mu = 0.1) and 90% of the training data. Meanwhile, the best average recall was 98.82%, obtained with the first PSO-GA-SVM parameter set (n_pop = 20, c_1 = c_2 = 2, ω = 1, cp = 0.6, mu = 0.02) and 60% of the training data. Moreover, the best results for accuracy, precision, recall, and f1-score when using SVM without optimization were 90.90% each, recorded with the use of 80% of the training data. This means the PSO-GA-SVM had better performance than SVM without parameter optimization. In other words, the PSO-GA-SVM method was able to improve the accuracy, precision, recall, and f1-score on the lung cancer data.

IV. CONCLUSION

This study classified lung cancer data using PSO-GA-SVM, and its performance was compared with the SVM method without optimization based on the accuracy, precision, recall, and f1-score. When using the PSO-GA-SVM method, if accuracy, precision, or f1-score is the priority, training with 90% of the data is preferred. Meanwhile, if only the recall value matters, using 60% of the data is sufficient.

Besides that, the SVM method without parameter optimization had lower accuracy, precision, recall, and f1-score values compared to PSO-GA-SVM, even with its best configuration using 80% of the training data.

This means the PSO-GA-SVM performs better than the ordinary SVM, and that parameter optimization with this metaheuristic algorithm can be used to obtain the right parameters needed to increase performance. When using PSO-GA for SVM parameter optimization, it is recommended to choose the second PSO-GA parameter set, n_pop = 40, c_1 = c_2 = 2, ω = 0.33, cp = 0.9, mu = 0.1, because the majority of the higher performance values were obtained with these parameters.

For further research, the same classification method can be applied to similar datasets with other metaheuristic methods for parameter optimization, such as Ant Colony Optimization (ACO) and Cross-Entropy (CE). It is also recommended to use other classification methods, such as Logistic Regression and Random Forest. Moreover, the same classification and parameter-optimization method can be applied to other datasets to verify that the SVM method with PSO-GA parameter optimization produces better performance compared to SVM without optimization.

ACKNOWLEDGMENT

This research was supported financially by Universitas Indonesia through the FMIPA HIBAH 2021 research grant scheme.

REFERENCES

[1] H. Lemjabbar-Alaoui, O. U. Hassan, Y. W. Yang, and P. Buchanan,

“Lung cancer: Biology and treatment options,” Biochim. Biophys.

Acta, vol. 1856(2), pp. 189–210, 2015.

[2] V. N. Vapnik, The Nature of Statistical Learning Theory. New York:

Springer, 1995.

[3] K. Bhattacharjee and M. Pant, “Hybrid particle swarm optimization- genetic algorithm trained multi-layer perceptron for classification of human glioma from molecular brain neoplasia data,” Cogn. Syst.

Res., vol. 58, pp. 173-194, 2019.

[4] F. E. F. Junior and G. G. Yen, “Particle swarm optimization of deep neural networks architectures for image classification,” Swarm Evol.

Comput., vol. 49, pp. 62-74, 2019.

[5] Z. Rustam, S. Hartini, R. Y. Pratama, R. E. Yunus, and R. Hidayat,

“Analysis of architecture combining Convolutional Neural Network

(CNN) and kernel K-means clustering for lung cancer diagnosis,” Int.

J. Adv. Sci. Eng. Inf. Technol., vol. 10(3), pp. 1200-1206, 2020.

[6] Z. Rustam and S. A. A. Kharis, “Comparison of Support Vector Machine Recursive Feature Elimination and Kernel Function as feature selection using Support Vector Machine for lung cancer classification,” J. Phys. Conf. Ser., vol. 1442(1), pp. 012027, 2020.

[7] P. Nanglia, S. Kumar, A. N. Mahajan, P. Singh, D. J. Rathee, “A hybrid algorithm for lung cancer classification using SVM and neural networks,” ICT Express, 2020.

[8] T. V. Rampisela and Z. Rustam, “Classification of Schizophrenia Data Using Support Vector Machine (SVM),” J. Phys. Conf. Ser., vol.

1108(1), pp. 012044, 2018.

[9] T. Nadira and Z. Rustam, “Classification of cancer data using support vector machines with features selection method based on global artificial bee colony,” AIP Conf. Proc., October 2018 [AIP Publishing LLC, vol. 2023, no. 1, p. 02020].

[10] C. Aroef, R. Yuda, Z. Rustam, and J. Pandelaki, “Multinomial Logistic Regression and Support Vector Machine for Osteoarthritis Classification,” J. Phys. Conf. Ser., vol. 1417(1), pp. 012012, 2019.

[11] Arfiani, Z. Rustam, J. Pandelaki, and A. Siahaan, “Kernel Spherical K- Means and Support Vector Machine for Acute Sinusitis Classification,” IOP Conf. Ser.: Mater. Sci. Eng., vol. 546(5), pp.

052011, 2019.

[12] C. Aroef, Y. Rivan, and Z. Rustam, “Comparing random forest and support vector machines for breast cancer classification,” Telkomnika, vol. 18(2), pp. 815-821, 2020.

[13] Z. Rustam, I. Primasari, and D. Widya, “Classification of cancer data based on support vectors machines with feature selection using genetic algorithm and laplacian score,” AIP Conf. Proc., October 2018 [AIP Publishing LLC, vol. 2023, no. 1, p. 020234].

[14] Z. Rustam, D. A. Utami, J. Pandelaki, and R. E. Yunus, “Analyzing cerebral infarction using support vector machine with artificial bee colony and particle swarm optimization feature selection,” J. Phys.

Conf. Ser., vol.1490(1), pp. 012031, 2020.

[15] Y. Ding, W. Zhang, L. Yu, and K. Lu, “The accuracy and efficiency of GA and PSO optimization schemes on estimating reaction kinetic parameters of biomass pyrolysis,” Energy, vol. 176, pp. 582–588, 2019.

[16] J. H. Holland, Adaptation in Natural and Artificial Systems. Michigan:

University of Michigan Press, 1975.

[17] Y. Liu et al., “Optimization of five-parameter BRDF model based on hybrid GA-PSO algorithm,” Optik, vol. 219, pp. 164978, 2020.

[18] K. S. Durgesh and B. Lekha, “Data classification using support vector machine,” J. Theor. Appl. Inf. Technol., vol. 12(1), pp. 1-7, 2010.

[19] E. García-Gonzalo, Z. Fernández-Muñiz, P. J. García Nieto, A.

Bernardo Sánchez, and M. Menéndez Fernández, “Hard-rock stability analysis for span design in entry-type excavations with learning classifiers,” Mater., vol. 9(7), pp. 531, 2016.

[20] E. Alba, J. Garcia-Nieto, L. Jourdan, and E. G. Talbi, “Gene selection in cancer classification using PSO/SVM and GA/SVM hybrid algorithms,” 2007 IEEE Congr. Evol. Comput., pp. 284-290, 2007.

[21] Z. Tao, L. Huiling, W. Wenwen, and Y. Xia, “GA-SVM based feature selection and parameter optimization in hospitalization expense modeling,” Appl. Soft Comput., vol. 75, pp. 323-332, 2019.

[22] R. K. Yadav, “PSO-GA based hybrid with Adam Optimization for ANN training with application in Medical Diagnosis,” Cogn. Syst.

Res., vol. 64, pp. 191-199, 2020.

[23] S. Ali, and K. Smith-Miles, “On optimal degree selection for polynomial kernel with support vector machines: Theoretical and empirical investigations,” Int. J. of Knowl.-Based Intell. Eng.

Syst., vol. 11(1), pp. 1-18, 2007.

[24] S. W. Lin, K. C. Ying, S. C. Chen, and Z. J. Lee, “Particle swarm optimization for parameter determination and feature selection of support vector machines,” Expert Syst. Appl., vol. 35(4), pp. 1817- 1824, 2008.

[25] I. Syarif, A. Prugel-Bennett, and G. Wills, “SVM parameter optimization using grid search and genetic algorithm to improve classification performance,” Telkomnika, vol. 14(4), pp. 1502, 2016.
