Comparative Analysis of the Clustering Process Using K-Means Clustering and K-Nearest Neighbor on Diabetes Mellitus


ABSTRACT

Classification is one of several roles of data mining. Many algorithms can be used for classification to turn input into the desired output, so the performance of each algorithm has to be considered. The purpose of this study is to analyze and compare the performance of K-Nearest Neighbor and K-Means Clustering in terms of accuracy and running time. The data set comes from the UCI Machine Learning Repository, namely the PIMA Indians Diabetes Dataset. The accuracy comparison shows that K-Means Clustering performs better on this data set, with an accuracy of 67.143%, compared with 64.286% for K-Nearest Neighbor. In contrast, testing with K-Nearest Neighbor is considerably faster than with K-Means Clustering: the testing time is 0.2492 seconds for K-Nearest Neighbor and 12.1285 seconds for K-Means Clustering.

Keywords: Classification, Dataset, K-Means Clustering, K-Nearest Neighbor, running time, Accuracy.


COMPARATIVE ANALYSIS OF THE CLUSTERING PROCESS USING K-MEANS CLUSTERING AND K-NEAREST NEIGHBOR FOR DIABETES MELLITUS

ABSTRACT

Classification is one of several roles of data mining. For classification there are many algorithms that can process input into the desired output, so the performance of each algorithm must be taken into account. The purpose of this study was to analyze and compare the performance of K-Nearest Neighbor and K-Means Clustering from the standpoint of accuracy and running time. The data set used in the research came from the UCI Machine Learning Repository, namely the PIMA Indians Diabetes Dataset. The comparative analysis of accuracy shows that the K-Means Clustering algorithm, with an accuracy of 67.143%, is more accurate on this data set than the K-Nearest Neighbor algorithm, with an accuracy of 64.286%. The testing time of the K-Nearest Neighbor algorithm, however, is considerably shorter than that of K-Means Clustering: 0.2492 seconds for K-Nearest Neighbor versus 12.1285 seconds for K-Means Clustering.

Keywords: Classification, Dataset, K-Means Clustering, K-Nearest Neighbor, running time, accuracy.
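
The kind of comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several points are assumptions made only for illustration: the data are synthetic two-class points rather than the PIMA Indians Diabetes Dataset, the 70/30 split and k = 3 are arbitrary choices, K-Means accuracy is obtained by giving each cluster the majority label of its training members, and the K-Means timing includes fitting. All function names (`knn_predict`, `kmeans_fit`, `label_clusters`, `run_comparison`) are hypothetical.

```python
import math
import random
import time
from collections import Counter


def nearest_centroid(centroids, x):
    """Index of the centroid closest to x (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(centroids[i], x))


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training points nearest to x."""
    neighbors = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]


def kmeans_fit(X, k=2, iters=20, seed=0):
    """Plain Lloyd iterations for a fixed number of rounds; returns centroids."""
    centroids = random.Random(seed).sample(X, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in X:
            groups[nearest_centroid(centroids, x)].append(x)
        # Recompute each centroid as the mean of its group; keep the old
        # centroid if a group ends up empty.
        centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def label_clusters(centroids, X, y):
    """Assumed mapping: each cluster takes the majority label of its members."""
    votes = [Counter() for _ in centroids]
    for x, label in zip(X, y):
        votes[nearest_centroid(centroids, x)][label] += 1
    return [v.most_common(1)[0][0] if v else None for v in votes]


def make_toy_data(n_per_class=100, seed=0):
    """Synthetic stand-in data: two Gaussian blobs, NOT the PIMA data set."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((0, 0.0), (1, 5.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
            y.append(label)
    order = list(range(len(X)))
    rng.shuffle(order)
    return [X[i] for i in order], [y[i] for i in order]


def run_comparison(seed=0):
    """Return accuracy and elapsed seconds for both methods on the toy data."""
    X, y = make_toy_data(seed=seed)
    split = int(0.7 * len(X))  # assumed 70/30 train/test split
    train_X, train_y, test_X, test_y = X[:split], y[:split], X[split:], y[split:]

    def score(preds):
        return sum(p == t for p, t in zip(preds, test_y)) / len(test_y)

    start = time.perf_counter()
    knn_preds = [knn_predict(train_X, train_y, x) for x in test_X]
    knn_time = time.perf_counter() - start

    start = time.perf_counter()
    centroids = kmeans_fit(train_X, seed=seed)
    names = label_clusters(centroids, train_X, train_y)
    km_preds = [names[nearest_centroid(centroids, x)] for x in test_X]
    km_time = time.perf_counter() - start

    return {
        "knn": {"accuracy": score(knn_preds), "seconds": knn_time},
        "kmeans": {"accuracy": score(km_preds), "seconds": km_time},
    }


if __name__ == "__main__":
    for name, stats in run_comparison().items():
        print(f"{name}: accuracy={stats['accuracy']:.3f}, time={stats['seconds']:.4f}s")
```

Numbers produced by this sketch say nothing about the reported results. Both the accuracy and the timing depend on the data, on what is included in the timed region, and on how clusters are mapped to class labels.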