
Dimensionality Reduction by Feature Selection Using Multi-Layer Ranking with the Linear SVM Weight Technique Combined with ReliefF

Dimensionality Reduction using Multi-layers for Feature Selection Combined Linear SVM Weight and ReliefF Technique

Wipawan Buathong 1

Abstract

This research presents a dimensionality reduction method based on multi-layer feature selection, combining the Linear SVM Weight technique with ReliefF, for classifying data with a Support Vector Machine classifier. The method was tested on the SRBCT and usps datasets. The results show that the proposed reduction method is more efficient than single-layer feature selection using either Linear SVM Weight or ReliefF alone: on the SRBCT dataset, multi-layer feature selection achieved 100 percent classification accuracy while reducing the data from 2,308 attributes to only 8, and on the usps dataset it achieved 95.76 percent accuracy at 55 of the original 256 attributes.

Abstract

This research investigated how high-dimensional data could be downsized using a multi-layer feature selection method. A combination of Linear SVM Weight and ReliefF was the proposed technique for data classification with a classifier, namely Support Vector Machines (SVMs). Two datasets, SRBCT and usps, were used for the experiment. We discovered that the proposed technique was more efficient than using either Linear SVM Weight or ReliefF

1 Lecturer, Information Technology Program, Faculty of Science and Technology, Phuket Rajabhat University

Volume 11, Number 2 (July – December 2015)

alone for dimensionality reduction. The dimensional data could be downsized from 2,308 to 8 attributes, with an accuracy rate of 100 percent, on SRBCT. The experimental result for SRBCT was also consistent with that for usps, in which the data could be downsized from 256 to 55 attributes with an accuracy of 95.76 percent.

Keywords: Dimensionality Reduction, Multi-Layer Feature Selection, ReliefF, Linear SVM Weight

Introduction

ž{®µ¤·˜·…°Š…o°¤¼¨š¸É¤µ„Á„·œÅžÁž}œž{®µš¸É¡Å—oĜ„µ¦šÎµÁ®¤º°Š…o°¤¼¨Äœ

ž{‹‹»´œ„µ¦¨—¤·˜·…o°¤¼¨‹´—Áž}œ„¦³ªœ„µ¦®œ¹ÉŠÄœ…´Êœ˜°œ„µ¦Á˜¦¸¥¤…o°¤¼¨Tan P. N., Steinbach, M., and Vipin, K., 2005) œ´Éœ‡º°„µ¦šÎµÄ®o…o°¤¼¨˜´ÊŠ˜oœ¤¸…œµ—¨—¨ŠÃ—¥

­¼Á­¸¥¨´„¬–³­Îµ‡´…°Š…o°¤¼¨œo°¥š¸É­»—¨³­¼Á­¸¥‡ªµ¤™¼„˜o°Š…°ŠŸ¨¨´¡›rœo°¥

š¸É­»—ÁœºÉ°Š‹µ„…o°¤¼¨Â˜n¨³˜´ª‹³¤¸‡ªµ¤­Îµ‡´˜n°„µ¦‹´—„¨»n¤…o°¤¼¨Å¤nÁšnµ„´œ—oª¥

Áš‡œ·‡„µ¦Á¨º°„…o°¤¼¨š¸É—¸‹³šÎµÄ®o­µ¤µ¦™Á¨º°„…o°¤¼¨š¸É¤¸‡ªµ¤­Îµ‡´Â¨³­µ¤µ¦™Äo Áž}œ˜´ªÂšœ…°Š…o°¤¼¨­nªœÄ®nŗoŽ¹ÉŠÄœ‡ªµ¤Áž}œ‹¦·Šž{®µ…°Š¤·˜·…o°¤¼¨ ¤´„¡Å—o Á­¤°Äœ…o°¤¼¨…œµ—Ä®n‹¹Š‹ÎµÁž}œ˜o°Š¤¸„µ¦¨—…œµ—¤·˜·…o°¤¼¨¨Š„µ¦¨—¤·˜·…o°¤¼¨¤´„

„¦³šÎµÄœ­°ŠÂœªšµŠ‡º°1¨—¨´„¬–³š¸É°›·µ¥…o°¤¼¨Ä®oÁ®¨º°ÁŒ¡µ³¨´„¬–³®¨´„š¸É

­Îµ‡´attributes of features‹³°¥¼nĜ„¨»n¤š¸ÉÁ¦¸¥„ªnµ„µ¦‡´—Á¨º°„¨´„¬–³…o°¤¼¨

feature selection¨³2¨—‹Îµœªœ¦µ¥„µ¦…o°¤¼¨Ä®oÁ®¨º°ÁŒ¡µ³¦µ¥„µ¦…o°¤¼¨š¸É

­µ¤µ¦™Áž}œ˜´ªÂšœ…°Š…o°¤¼¨„¨»n¤Ä®nŗofeature extractionŽ¹ÉŠÁš‡œ·‡š¸Éœ·¥¤Äo¨—

‹Îµœªœ…o°¤¼¨ž¦³Á£šœ¸Ê‡º°„µ¦­»n¤sampling„µ¦‹´—„¨»n¤clustering) 0RKG$Iizi 0RKG6KXNUDQ2PDU=DNDULD1RRUKDQL]D:DKLG$KPDGDQG0XMDKLG$KPDG

=DLGL

ž{‹‹»´œ¤¸œ´„ª·‹´¥‹Îµœªœ®œ¹ÉŠÅ—ošÎµ„µ¦«¹„¬µª·‹´¥Á„¸É¥ª„µ¦¨—¤·˜·…o°¤¼¨—oª¥

ª·›¸„µ¦‡´—Á¨º°„‡»–¨´„¬–³—´Šœ¸Ê£´š¦µª»•·Â¨³‡–³ £´š¦µª»•·Â­Š«·¦·, «‹¸¤µ‹–ª·Á¸¥¦

(3)

¨³¡¥»Š¤¸­´‹, Á¤¬µ¥œ-¤·™»œµ¥œ, 2553) ŗošÎµ„µ¦Áž¦¸¥Áš¸¥„µ¦¨—¤·˜·…o°¤¼¨Ã—¥Äo ª·›¸„µ¦‡´—Á¨º°„‡»–¨´„¬–³Â&RUUHODWLRQ %DVHG )HDWXUH 6HOHFWLRQ &%)6 Gain Ratio ¨³Information Gain ¨oªœÎµŸ¨¨´¡›rš¸ÉŗoŞÁž}œ°·œ¡»˜…°Š˜´ª‹ÎµÂœ„

ž¦³Á£š…o°¤¼¨ÂŽ´¡¡°¦r˜Áª„Á˜°¦r¤¸œSupport Vector Machine: SVM¡ªnµ ª·›¸„µ¦ÂGain Ratio ¨³Information Gain Ä®ož¦³­·š›·£µ¡—¸„ªnµÂ&%)6 ÁnœÁ—¸¥ª„´Šµœª·‹´¥…°ŠªŠ„˜Â¨³‡–³ ªŠ„˜«¦¸°»Å¦, ¡¥»Š¤¸­´‹Â¨³¼µ˜·®§Å¥³

«´„—·Í, ¡§¬£µ‡¤ š¸É«¹„¬µª·›¸„µ¦‹ÎµÂœ„®¤ª—®¤¼nÁ°„­µ¦Ã—¥Äo‹Îµ¨°Š

®´ª…o°¦nª¤„´ª·›¸„µ¦‡´—Á¨º°„‡»–¨´„¬–³š¸É­Îµ‡´­°Šª·›¸‡º°Gain Ratio ¨³Chi Squared Á…oµ¤µÄoĜ„µ¦‡´—Á¨º°„‡»–¨´„¬–³š¸ÉÁ®¤µ³­¤Á¡ºÉ°œÎµ¤µ‹ÎµÂœ„®¤ª—®¤¼n Á°„­µ¦—oª¥˜´ª‹ÎµÂœ„ž¦³Á£š…o°¤¼¨Âª·›¸„µ¦…°ŠÁ¥rŽ´¡¡°¦r˜Áª„Á˜°¦r¤¸œ

¨³„µ¦˜´—­·œÄ‹Ã—¥ÄoŸœŸ´Š˜oœÅ¤oŸ¨„µ¦ª·‹´¥Â­—ŠÄ®oÁ®Èœªnµ„µ¦‡´—Á¨º°„

‡»–¨´„¬–³—oª¥ª·›¸Á„œInformation Gain¨³Äo˜´ª‹ÎµÂœ„ž¦³Á£š…o°¤¼¨ÂŽ´¡¡°¦r˜

Áª„Á˜°¦r¤¸œÄ®ož¦³­·š›·£µ¡—¸š¸É­»— $EGROKRVVHLQ6DUUDI]DGHK+DELEROODK$JK Atabay, Mir Mosen Pedram, Jamshid Shanbehzadeh, 2012) čoª·›¸„µ¦‡´—Á¨º°„

‡»–¨´„¬–³Â5HOLHI) Ĝ„µ¦¨—¤·˜·…°ŠContent-Based Image ¡ªnµ­µ¤µ¦™

‡oœ®µ…o°¤¼¨Å—o¦ª—Á¦ÈªÂ¨³™¼„˜o°Š¤µ„…¹ÊœÂ¨³Šµœª·‹´¥…°Š;LQ -LQ 5RQJ\DQ /L Xian Shen, Rongfang Bie, 2007) š¸ÉŗošÎµ„µ¦«¹„¬µWeb Mining ץčo˜´ª‹ÎµÂœ„

ž¦³Á£š…o°¤¼¨ÂHiddenNaïve Bayes ¨³Äoª·›¸„µ¦‡´—Á¨º°„‡»–¨´„¬–³Â

5HOLHI) ¡ªnµ‡nµ‡ªµ¤™¼„˜o°ŠÄœ„µ¦‹ÎµÂœ„ž¦³Á£š…o°¤¼¨­¼Š…¹Êœ

However, the dimensionality reduction methods used in much of this research rest on selecting the important features only once, in a single selection pass with a single technique; in other words, only one criterion is used to judge which dimensions to keep, so dimensions that are important may fail to be selected.

The objective of this research is to compare dimensionality reduction based on selecting important features with ReliefF, with Linear SVM Weight, and with Linear SVM Weight combined with ReliefF, using the Support Vector Machine classification technique, in order to solve the problem of important dimensions being cut away when feature selection uses only a single layer of consideration, and to examine which kinds of data the technique suits. This work therefore proposes to add layers of consideration to the selection of dimensions by blending several reduction techniques, making the selection of dimensions more fine-grained.

Related Work and Theory

Support Vector Machine

A support vector machine (Hsu, C.-W., Chang, C.-C., & Lin, C.-J., 2010) is a classifier that finds the widest possible margin to solve two-class classification problems, or equivalently finds a decision boundary that splits the data into two parts: it uses a straight-line equation to separate the groups of data and finds the best result by learning from the statistics of the data. It works by finding the maximum margin of the decision hyperplane separating the groups of training data, using a function that maps the data from the input space to a feature space, as shown in Figure 1, and building a similarity measure called a kernel function on that feature space. Its objective is to minimize the prediction error while also maximizing the separating margin, which differs from conventional techniques such as artificial neural networks (ANN) that aim only to minimize prediction error; this makes it well suited to data with very many dimensions.

When two groups of data cannot be separated by a linear support vector machine equation, because the data may cluster in different regions, a tool is needed to rearrange the data in what is called a higher-dimension space. Separating the groups of data with such a multi-dimensional hyperplane, by means of a kernel function, gives better performance than the usual approach.
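As a small illustration of the kernel idea just described — a linear boundary failing where an RBF kernel succeeds — here is a sketch assuming scikit-learn (the paper itself used Orange Canvas); the toy two-ring dataset is an illustrative assumption:

```python
# Two concentric rings: not linearly separable in the input space,
# but separable after the RBF kernel's implicit mapping.
from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear = SVC(kernel="linear").fit(X, y)  # straight-line boundary only
rbf = SVC(kernel="rbf").fit(X, y)        # maps to a higher-dimension space

print("linear accuracy:", linear.score(X, y))
print("rbf accuracy:", rbf.score(X, y))
```

The RBF model separates the rings almost perfectly, while the linear one cannot do better than roughly guessing.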


The support vector machine is therefore a popular technique for building machine learning models for classification, owing to its high classification accuracy, as shown in many studies (Huang, C.-L., Chen, M.-C., & Wang, C.-J., 2007; Xuehua, L., & Lan, S., 2008; Sweilam, N.H., Tharwat, A.A., & Abdel Moniem, N.K., 2010; Chen, H.-L., Yang, B., Liu, J., & Liu, D.-Y., 2011).

Figure 1: Separating groups of data with a linear SVM equation

Linear SVM Weight

Linear SVM Weight is a feature selection method that can be applied well with any classifier (Guyon, I., Weston, J., Barnhill, S., & Vapnik, V., 2002). Its algorithm is as follows:

Figure 2: The Linear SVM Weight algorithm

Algorithm 1 Feature Ranking Based on Linear SVM Weights

Input: Training sets (Xi, Yi), i = 1, …, m.

Output: Sorted feature ranking list.

1. Use grid search to find the best parameter C.

2. Train an L2-loss linear SVM model using the best C.

3. Sort the features according to the absolute values of the weights in the model.


From Figure 2, the algorithm can be described as follows: the input is the training data, and the output is a ranked list of features. The steps are: (1) use a grid search to obtain the best value of the parameter C; (2) train a model with the L2-loss linear function on the data, using the best C from step 1; and (3) rank the features by the absolute values of the resulting weights (Yin-Wen Chang & Chih-Jen Lin, 2008).
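The three steps above can be sketched with scikit-learn; `LinearSVC` (whose default squared-hinge loss is the L2 loss), the grid of C values, and the digits data are illustrative assumptions, not the authors' exact setup:

```python
# Feature ranking based on linear SVM weights (Algorithm 1 sketch).
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC

X, y = load_digits(return_X_y=True)

# Step 1: grid search for the best parameter C.
grid = GridSearchCV(LinearSVC(loss="squared_hinge", max_iter=20000),
                    {"C": [0.01, 0.1, 1, 10]}, cv=3)
grid.fit(X, y)

# Step 2: the refit model is an L2-loss linear SVM trained with the best C.
best = grid.best_estimator_

# Step 3: rank features by the absolute values of their weights
# (summed over the one-vs-rest class weight vectors).
importance = np.abs(best.coef_).sum(axis=0)
ranking = np.argsort(importance)[::-1]
print("top 8 features:", ranking[:8])
```

The ranked list can then be cut at any desired number of attributes, which is how the reduced datasets in the experiments are obtained.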

ReliefF

ReliefF was developed from Relief, whose weaknesses are that it can operate only on data with exactly two classes and only on attributes that are either discrete (nominal) or numeric. ReliefF, in contrast, is a feature selection method that can operate on data with more than two classes, works with all attribute types, and is robust to incorrect and incomplete data (A. Sarrafzadeh, H.A. Atabay, M.M. Pedram, and J. Shanbehzadeh, 2012). The ReliefF algorithm, devised by Kononenko, Igor and Edvard Simec, computes feature weights by examining the instances nearest to a randomly selected instance; it borrows the K-NN technique, finding the k nearest neighbors within the same class first. The algorithm can be shown as follows.


Figure 3: The ReliefF algorithm

Algorithm ReliefF

Input: for each training instance a vector of attribute values and the class value

Output: the vector W of estimations of the qualities of attributes

1. set all weights W[A] := 0.0;
2. for i := 1 to m do begin
3.   randomly select an instance R_i;
4.   find k nearest hits H_j;
5.   for each class C ≠ class(R_i) do
6.     from class C find k nearest misses M_j(C);
7.   for A := 1 to a do
8.     W[A] := W[A] - sum_{j=1..k} diff(A, R_i, H_j) / (m·k)
                 + sum_{C ≠ class(R_i)} [ P(C) / (1 - P(class(R_i)))
                   · sum_{j=1..k} diff(A, R_i, M_j(C)) ] / (m·k);
9. end;
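A minimal numpy sketch of the pseudocode above, restricted to numeric attributes with the usual range-normalized diff(); the parameter values and the toy data at the end are illustrative assumptions:

```python
import numpy as np

def relieff(X, y, m=100, k=5, rng=np.random.default_rng(0)):
    n, a = X.shape
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0                      # avoid division by zero in diff()
    classes, counts = np.unique(y, return_counts=True)
    prior = dict(zip(classes, counts / n))     # P(C)
    W = np.zeros(a)
    for _ in range(m):
        i = rng.integers(n)
        R = X[i]
        d = np.abs(X - R).sum(axis=1)          # Manhattan distance to R
        d[i] = np.inf                          # exclude R itself
        # k nearest hits: same class as R
        same = np.where(y == y[i])[0]
        hits = same[np.argsort(d[same])[:k]]
        W -= (np.abs(X[hits] - R) / span).sum(axis=0) / (m * k)
        # k nearest misses from every other class, weighted by class prior
        for c in classes:
            if c == y[i]:
                continue
            other = np.where(y == c)[0]
            miss = other[np.argsort(d[other])[:k]]
            w_c = prior[c] / (1 - prior[y[i]])
            W += w_c * (np.abs(X[miss] - R) / span).sum(axis=0) / (m * k)
    return W

# Toy check: feature 0 carries the class signal, feature 1 is pure noise.
noise = np.random.default_rng(1)
y = np.repeat([0, 1], 100)
X = np.column_stack([y + 0.1 * noise.standard_normal(200),
                     noise.standard_normal(200)])
W = relieff(X, y)
print(W)  # W[0] should clearly exceed W[1]
```

Informative attributes receive large positive weights (hits agree on them, misses disagree), while noise attributes stay near zero, so sorting W descending yields the ReliefF feature ranking.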


Research Methodology

The study was carried out in three steps, each of which is detailed below.

Figure 4: Workflow of the study

„µ¦‡´—Á¨º°„»—…o°¤¼¨

This study used two datasets from the repository at http://orange.biolab.si/datasets.php: SRBCT, with 4 classes, and usps, with 10 classes. Both chosen datasets have no missing values and have more than 100 attributes, making them suitable for dimensionality reduction, as shown in Table 1.

[Figure 4 pipeline: Dataset (SRBCT, usps) → Dimensionality Reduction (ReliefF / Linear SVM Weight / Linear SVM Weight + ReliefF) → Classifier (SVM) → Evaluation]

Table 1: Datasets used in the experiment

Dataset | Number of attributes | Number of classes

SRBCT (lymph gland cancer) | 2,308 | 4

usps (handwritten digits) | 256 | 10

Tools and techniques used for dimensionality reduction

The Orange Canvas software, version 2.72, was used to reduce the dimensionality of the data with the important-feature selection methods ReliefF, Linear SVM Weight, and Linear SVM Weight combined with ReliefF.

The classification technique was an SVM (Support Vector Machine) with an RBF kernel function; the dimensionally reduced data were used to build the machine learning model.

­Îµ®¦´Áš‡œ·‡Linear SVM Weight ¦nª¤„´5HOLHI) Ĝ„µ¦¨—¤·˜·¨Îµ—´´Êœ

¦„‹³ÄoÁš‡œ·‡„µ¦‡´—Á¨º°„‡»–¨´„¬–³ÂLinear SVM Weight „n°œÄœ¨Îµ—´´Êœ

š¸É­°Š‹³ÄoÁš‡œ·‡Â5HOLHI) ª´—¨³ž¦³Á¤·œŸ¨

Measurement and evaluation covered comparing the performance of the models under each dimensionality reduction method. This study used 10-fold cross-validation to test the performance of the machine learning models: the dataset is split into k equal parts, k−1 of which are used to build the machine learning model while the remaining part is used to test its accuracy; this is repeated until every part has been used to test the model, and the accuracy (or error) values of the rounds are combined and averaged to reflect the learning performance. Model performance was evaluated by measuring accuracy, precision (Precision), recall (Recall), and


the overall performance (F-measure); the precision values obtained in each round are then averaged to give the mean accuracy of the model (Accuracy), as in equations (1) to (4), respectively.

Precision = (Precision(TP) + Precision(TN)) / N          (1)

Recall = (Recall(TP) + Recall(TN)) / N                   (2)

F-measure = N × (Recall × Precision) / (Recall + Precision)   (3)

Accuracy = (TP + TN) / All Data                          (4)

where N is the number of classes, TP is the number of True Positives, and TN is the number of True Negatives.
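The evaluation protocol can be sketched as follows, assuming scikit-learn in place of Orange Canvas and the digits data as a stand-in; note that the conventional F-measure uses a factor of 2, which coincides with equation (3) when N = 2:

```python
# 10-fold cross-validation of an RBF-kernel SVM, then equations (1)-(4)
# computed from the pooled out-of-fold predictions.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import cross_val_predict
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

# Each of the 10 folds serves once as the test set; the other 9 train the model.
y_pred = cross_val_predict(SVC(kernel="rbf"), X, y, cv=10)

N = len(np.unique(y))                                    # number of classes
precision = precision_score(y, y_pred, average="macro")  # eq. (1), macro-averaged
recall = recall_score(y, y_pred, average="macro")        # eq. (2), macro-averaged
f_measure = 2 * recall * precision / (recall + precision)  # eq. (3), factor 2
accuracy = accuracy_score(y, y_pred)                     # eq. (4): (TP+TN)/all data

print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} F={f_measure:.4f}")
```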

Results

еœª·‹´¥Œ´œ¸ÊŸ¼oª·‹´¥‹³œÎµÁ­œ°Ÿ¨¨´¡›r‹µ„„µ¦¨—¤·˜·…o°¤¼¨Â¨oªÁ®¨º°‹Îµœªœ

°šš¦··ª˜rÁ¡¸¥Š60 °šš¦··ª˜rÁšnµœ´ÊœÃ—¥°oµŠ°·Š‹µ„еœª·‹´¥®¨µ¥ÇеœªnµnªŠ

…°Š„µ¦¨—¤·˜·š¸É¤¸ž¦³­·š›·£µ¡š¸É­»—‹³°¥¼nš¸É20 - 60 °šš¦··ª˜r ‹µ„„µ¦š—­°„´»—

…o°¤¼¨SRBCT ¡ªnµª·›¸„µ¦‡´—Á¨º°„‡»–¨´„¬–³Â®¨µ¥¨Îµ—´´Êœ‡º°Áš‡œ·‡Â

Linear SVM Weight ¦nª¤„´Relief Ä®o‡nµ‡ªµ¤™¼„˜o°ŠÄœ„µ¦‹ÎµÂœ„…o°¤¼¨­¼Š­»— 100 Áž°¦rÁŽÈœ˜rš¸É8 °šš¦··ª˜rĜ…–³š¸É„µ¦‡´—Á¨º°„‡»–¨´„¬–³Â´ÊœÁ—¸¥ªš¸ÉčoÁš‡œ·‡

Linear SVM Weight ‹³Ä®o‡nµ‡ªµ¤™¼„˜o°Š­¼Š­»—100 Áž°¦rÁŽÈœ˜rš¸É11 °šš¦··ª˜r ¨³„µ¦‡´—Á¨º°„‡»–¨´„¬–³Â´ÊœÁ—¸¥ªš¸ÉčoÁš‡œ·‡Relief ‹³Ä®o‡nµ‡ªµ¤™¼„˜o°Š

­¼Š­»—š¸É98.89 Áž°¦rÁŽÈœ˜r¨³š¸É31 °šš¦··ª˜r­Îµ®¦´»—…o°¤¼¨usps ª·›¸„µ¦‡´—Á¨º°„

®¨µ¥¨Îµ—´´ÊœÄ®o‡nµ‡ªµ¤™¼„˜o°ŠÄœ„µ¦‹ÎµÂœ„…o°¤¼¨Å—o—¸„ªnµ„µ¦Äoª·›¸„µ¦

‡´—Á¨º°„¨Îµ—´´ÊœÁ—¸¥ª‡º°Ä®o‡nµ‡ªµ¤™¼„˜o°ŠÁšnµ„´95.76% š¸É55 ¤·˜·…o°¤¼¨‹µ„

š´ÊŠ®¤—256 ¤·˜·…o°¤¼¨ ­µ¤µ¦™Â­—ŠÅ—o—´Š¦¼žš¸É5 ¨³6 ˜µ¤¨Îµ—´


Figure 5: Feature selection on the SRBCT dataset

Figure 6: Feature selection on the usps dataset
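The overall experiment — rank attributes in layers, keep a small subset, and score an RBF-kernel SVM by 10-fold cross-validation — might be sketched like this; for brevity an ANOVA F-score filter stands in for ReliefF in the second layer, and the data and layer sizes are illustrative assumptions:

```python
# Minimal sketch of multi-layer feature selection: layer 1 keeps the
# features with the largest linear SVM weights, layer 2 re-ranks the
# survivors with a second criterion (an F-score filter standing in for ReliefF).
import numpy as np
from sklearn.datasets import load_digits
from sklearn.feature_selection import f_classif
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC, LinearSVC

X, y = load_digits(return_X_y=True)

# Layer 1: rank by absolute linear SVM weights, keep the top 40.
w = np.abs(LinearSVC(C=1.0, max_iter=20000).fit(X, y).coef_).sum(axis=0)
layer1 = np.argsort(w)[::-1][:40]

# Layer 2: re-rank the survivors with a second criterion, keep the top 20.
f_scores, _ = f_classif(X[:, layer1], y)
f_scores = np.nan_to_num(f_scores)        # guard against constant features
layer2 = layer1[np.argsort(f_scores)[::-1][:20]]

# Evaluate an RBF-kernel SVM on the reduced data with 10-fold CV.
acc = cross_val_score(SVC(kernel="rbf"), X[:, layer2], y, cv=10).mean()
print(f"accuracy with {len(layer2)} of {X.shape[1]} attributes: {acc:.4f}")
```

Sweeping the size of the final subset and plotting accuracy against the number of attributes would reproduce the shape of Figures 5 and 6.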

Conclusion

From the results above for dimensionality reduction on datasets with many classes, it can be concluded that multi-layer feature selection using the Linear SVM Weight technique combined with ReliefF classifies data more accurately than single-layer selection using either Linear SVM Weight or ReliefF alone. This multi-layer feature selection also works well with large datasets, since it does not require much computation time, and it can further be applied to medical data, geo-informatics data, or other work with very large volumes of data.

References

Pattarawut Saengsiri, Sageemas Na Wichian, & Phayung Meesad (2010). Classification of leukemia types using ranking methods combined with the support vector machine technique. KKU Research Journal, 10–17.

Wongkot Sriurai, Phayung Meesad, & Choochart Haruechaiyasak. Topic-model-based feature preparation for document categorization. In Proceedings of the 5th National Conference on Computing and Information Technology (pp. –151). Bangkok.

Sarrafzadeh, A., Atabay, H.A., Pedram, M.M., & Shanbehzadeh, J. (2012). ReliefF-based feature selection in content-based image retrieval. In Proceedings of the International MultiConference of Engineers and Computer Scientists (pp. –19). Hong Kong.

Chen, H.-L., Yang, B., Liu, J., & Liu, D.-Y. (2011). A support vector machine classifier with rough set-based feature selection for breast cancer diagnosis. Expert Systems with Applications, 9014–9022.

Guyon, I., Weston, J., Barnhill, S., & Vapnik, V. (2002). Gene selection for cancer classification using support vector machines. Machine Learning, 389–422.

Hsu, C.-W., Chang, C.-C., & Lin, C.-J. (2010). A practical guide to support vector classification. Taiwan: Department of Computer Science and Information Engineering, National Taiwan University.

Huang, C.-L., Chen, M.-C., & Wang, C.-J. (2007). Credit scoring with a data mining approach based on support vector machines. Expert Systems with Applications, 847–856.

Kononenko, I., & Simec, E. Induction of decision trees using RELIEFF. Mathematical and Statistical Methods in Artificial Intelligence, 363–375.

Mohd Afizi Mohd Shukran, Omar Zakaria, Noorhaniza Wahid, & Ahmad Mujahid Ahmad Zaidi. A classification method for data mining using SVM-weight and Euclidean distance. Australian Journal of Basic and Applied Sciences, 2053–2059.

Sweilam, N.H., Tharwat, A.A., & Abdel Moniem, N.K. (2010). Support vector machine for diagnosis cancer disease: A comparative study. Egyptian Informatics Journal, 81–92.

Tan, P.N., Steinbach, M., & Vipin, K. (2005). Introduction to Data Mining. United States of America: Addison Wesley.

Xin Jin, Rongyan Li, Xian Shen, & Rongfang Bie (2007). Automatic web pages categorization with ReliefF and Hidden Naïve Bayes. ACM (Association for Computing Machinery), 617–621.

Xuehua, L., & Lan, S. (2008). Fuzzy theory based support vector machine classifier. Fuzzy Systems and Knowledge Discovery.

Yin-Wen Chang, & Chih-Jen Lin (2008). Feature ranking using linear SVM. Workshop and Conference Proceedings (pp. –64).
