Nông Thị Hoa et al. Tạp chí KHOA HỌC & CÔNG NGHỆ 106(06): 49-53
USING THE FUZZY ART ARTIFICIAL NEURAL NETWORK FOR DATA CLUSTERING
Nông Thị Hoa¹*, Hoàng Trọng Vĩnh²
¹ College of Information Technology and Communication, Thai Nguyen University
² FPT Software company
ABSTRACT
A fuzzy neural network is an artificial neural network that combines fuzzy concepts and fuzzy inference rules with the architecture and learning of neural networks. Data clustering is an important tool in data mining and in discovering knowledge in large volumes of data. Fuzzy ART (Fuzzy Adaptive Resonance Theory) is a fuzzy neural network that solves the data clustering problem better than traditional clustering methods. In this study, we analyze the advantages of Fuzzy ART and give guidance on choosing the parameters of the Fuzzy ART model so that clustering reaches the highest accuracy on a given dataset. Experiments are carried out on 5 benchmark datasets from the UCI database to demonstrate the effectiveness of Fuzzy ART. The experimental results show that Fuzzy ART clusters with high accuracy.
Keywords: Fuzzy ART, ART, Fuzzy Neural Network, Fuzzy Set, Clustering
INTRODUCTION
Data clustering is an important tool in data mining and in discovering knowledge in large volumes of data. Clustering also summarizes a large amount of data into a small number of groups, which makes it useful for understanding large datasets. Several traditional clustering methods have been proposed, such as K-means [2], hierarchical clustering [3], and the SOM model [6], but their computational complexity is rather high. Fuzzy ART [1] is a fuzzy neural network with the following advantages: it learns training data until a given condition is satisfied, it can create new categories without destroying existing ones, and its parameters are easy to choose. As a result, Fuzzy ART clusters data with high accuracy and significantly reduces computational complexity.
THE DATA CLUSTERING PROBLEM
Problem statement
Let D be a dataset. Each data item I in D is represented by a vector with M components, and each component of I lies in the interval [0, 1]. Then:
$I = (I_1, \ldots, I_M)$
Tel 01238 492 484
The dataset D contains p groups. Each group has a weight vector W with M components. The weight vector of group k is written as:
$W_k = (W_{k1}, \ldots, W_{kM})$
Each data item I belongs to exactly one group h. Requirement: use the similarity between each data item I and the weight vectors W of the groups to assign I to its group h.
Some traditional solution methods
Teuvo Kohonen [6] proposed a model of a self-organizing process, known as the SOM model. SOM is an artificial neural network that uses unsupervised learning to build a lower-dimensional representation of the input data space. MacQueen [2] proposed the K-means algorithm, which partitions a dataset into a fixed number of clusters by minimizing a squared-error function. The weights of the clusters are then updated with the mean of the samples in each cluster. Johnson [3] proposed a hierarchical clustering algorithm based on repeatedly merging the two closest clusters. However, these methods have rather high computational complexity because the weights of every cluster are recomputed each time a training item is considered.
FUZZY ART
The ART network model
ART neural networks were developed by Grossberg [4][5] to address the stability-plasticity dilemma. An ART network is an incremental learning algorithm, so it can adapt to each new data item. At any given time, an ART network does not allow stored patterns to change unless the input pattern matches a stored pattern within a given tolerance. In other words, an ART network is both plastic and stable. A new category can form when the environment does not match any stored pattern, but the environment cannot change stored patterns unless their similarity reaches a preset level. The general structure of an ART network is shown in Figure 1.
Figure 1: Simple model of an ART network
A typical ART network has two layers: an input layer (F1) and an output layer (F2). The input layer contains N nodes, where N is the number of input patterns. The number of nodes in the output layer is dynamic. Each node of the output layer has a corresponding prototype vector. The dynamics of the network are controlled by two subsystems: an attentional subsystem and an orienting subsystem. The attentional subsystem proposes a winning neuron (or category), and the orienting subsystem decides whether that category accepts the input. The network is in a resonance state when the orienting subsystem accepts a winning category, that is, when the winning prototype vector matches the current input pattern closely enough.
The Fuzzy ART algorithm
This algorithm is presented concisely by Carpenter et al. in [1]. The dynamics of the Fuzzy ART model are determined by three parameters:
• the choice parameter $\alpha > 0$;
• the learning rate parameter $\beta \in [0, 1]$;
• the vigilance parameter $\rho \in [0, 1]$.
The algorithm proceeds as follows:
Step 1: Initialize the weight vectors.
Each category j corresponds to a vector $W_j = (W_{j1}, \ldots, W_{jM})$ of adaptive weights, or long-term memory traces. The number of potential categories N (j = 1, ..., N) is arbitrary. Initially
$W_{j1} = \ldots = W_{jM} = 1$ (1)
and each category is said to be uncommitted. After a category is selected for coding, it becomes committed. As shown below, each long-term memory trace $W_{ji}$ is monotonically non-increasing over time and therefore converges to a limit.
Step 2: Select a winning category.
For each input I and category j, the choice function $T_j$ is defined by
$T_j(I) = \frac{|I \wedge W_j|}{\alpha + |W_j|}$ (2)
where the fuzzy AND operator $\wedge$ is defined by
$(x \wedge y)_i = \min(x_i, y_i)$ (3)
and the norm $|\cdot|$ is defined by
$|x| = \sum_{i=1}^{M} |x_i|$ (4)
To simplify notation, $T_j(I)$ in Equation 2 is usually written $T_j$ when the input I is fixed. The chosen category is indexed by J, where
$T_J = \max\{T_j : j = 1, \ldots, N\}$ (5)
If more than one $T_j$ is maximal, the category j with the smallest index is chosen. In particular, nodes become committed in the order j = 1, 2, 3, ....
Step 3: Test whether the network is in resonance or reset.
Resonance occurs if the match function of the chosen category meets the vigilance criterion:
$\frac{|I \wedge W_J|}{|I|} \geq \rho$ (6)
Learning then takes place.
Reset occurs if
$\frac{|I \wedge W_J|}{|I|} < \rho$ (7)
In that case, the value of the choice function $T_J$ is set to -1 for the duration of the input presentation, which prevents the same category from being selected again during the search. A new index J is chosen by Equation 5. The search continues until the chosen J satisfies Equation 6. If no existing category satisfies the condition in Equation 6, a new category J is created and $W_J^{new} = I$.
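Steps 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `fuzzy_and`, `norm`, `choice`, `match` and `select_category` are hypothetical, and every input is assumed to be a non-zero vector with components in [0, 1].

```python
from typing import List, Optional, Sequence

Vector = Sequence[float]


def fuzzy_and(x: Vector, y: Vector) -> List[float]:
    """Component-wise minimum, as in Equation 3."""
    return [min(a, b) for a, b in zip(x, y)]


def norm(x: Vector) -> float:
    """Sum of absolute components, as in Equation 4."""
    return sum(abs(a) for a in x)


def choice(i: Vector, w: Vector, alpha: float) -> float:
    """Choice function T_j of Equation 2."""
    return norm(fuzzy_and(i, w)) / (alpha + norm(w))


def match(i: Vector, w: Vector) -> float:
    """Ratio compared with rho in Equations 6 and 7 (assumes |I| > 0)."""
    return norm(fuzzy_and(i, w)) / norm(i)


def select_category(i: Vector, weights: Sequence[Vector],
                    alpha: float, rho: float) -> Optional[int]:
    """Return the index J that resonates with i, or None if all are reset."""
    scores = [choice(i, w, alpha) for w in weights]
    for _ in range(len(weights)):
        # max() returns the first maximal index, so ties go to the smallest j.
        j = max(range(len(scores)), key=lambda k: scores[k])
        if match(i, weights[j]) >= rho:
            return j          # resonance (Equation 6)
        scores[j] = -1.0      # reset (Equation 7): exclude j from the search
    return None               # caller would create a new category with W = I


# Hypothetical example: one committed and one uncommitted category.
categories = [[0.3, 0.3], [1.0, 1.0]]
winner = select_category([0.5, 0.5], categories, alpha=0.1, rho=0.8)
```

In this example the first category has the larger choice value but fails the vigilance test, so the search resets it and moves on to the second one.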
Step 4: Learn the training data.
The weight vector $W_J$ is updated according to
$W_J^{new} = \beta (I \wedge W_J^{old}) + (1 - \beta) W_J^{old}$ (8)
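The update in Equation 8 is a convex combination of the old weights and their fuzzy AND with the input. The short sketch below (the function name `update_weights` and the sample values are assumptions made for illustration) shows why each weight can only stay the same or decrease, as noted in Step 1.

```python
from typing import List, Sequence


def update_weights(i: Sequence[float], w_old: Sequence[float],
                   beta: float) -> List[float]:
    """Equation 8: W_new = beta * (I AND W_old) + (1 - beta) * W_old."""
    return [beta * min(a, b) + (1.0 - beta) * b for a, b in zip(i, w_old)]


w = [1.0, 1.0]        # an uncommitted category, initialized as in Equation 1
item = [0.2, 0.6]     # hypothetical input

slow = update_weights(item, w, beta=0.1)   # small step toward I AND W
fast = update_weights(item, w, beta=1.0)   # moves to I AND W in one step
```

Because min(a, b) is never larger than b, every component of the new vector is at most the old one, whatever value of β in [0, 1] is used.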
Clustering data with Fuzzy ART:
Select a subset D' of arbitrary samples from D and use D' to train Fuzzy ART. Then use the remaining data to test the clustering ability of Fuzzy ART. To make learning more effective, several different subsets can be selected for training, with the remaining data used each time to test the clustering ability.
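The train-then-test procedure above can be sketched as follows. This is only an illustration under several assumptions: the class `FuzzyART` and the helper `split` are hypothetical names, inputs are non-zero vectors with non-negative components, and, since the text does not say how test items are assigned, the sketch assumes a test item goes to the category that wins the search without any weight update. The default parameter values are the ones reported in the experiments.

```python
import random
from typing import List, Optional, Sequence, Tuple


def overlap(i: Sequence[float], w: Sequence[float]) -> float:
    """|I AND W|: sum of component-wise minima (Equations 3 and 4)."""
    return sum(min(a, b) for a, b in zip(i, w))


class FuzzyART:
    """Minimal Fuzzy ART sketch; categories are created on demand."""

    def __init__(self, alpha: float = 0.5, beta: float = 0.1,
                 rho: float = 0.4) -> None:
        self.alpha, self.beta, self.rho = alpha, beta, rho
        self.weights: List[List[float]] = []

    def _search(self, i: Sequence[float]) -> Optional[int]:
        # Visit categories by decreasing choice value (Equation 2). sorted()
        # is stable, so ties go to the smallest index. Moving on to the next
        # candidate after a failed vigilance test plays the role of the reset.
        order = sorted(
            range(len(self.weights)),
            key=lambda j: -overlap(i, self.weights[j])
            / (self.alpha + sum(self.weights[j])),
        )
        for j in order:
            if overlap(i, self.weights[j]) / sum(i) >= self.rho:  # Equation 6
                return j
        return None

    def train(self, i: Sequence[float]) -> int:
        """Learn one item and return the index of its category."""
        j = self._search(i)
        if j is None:
            self.weights.append(list(i))        # new category, W_new = I
            return len(self.weights) - 1
        w = self.weights[j]
        self.weights[j] = [self.beta * min(a, b) + (1.0 - self.beta) * b
                           for a, b in zip(i, w)]  # Equation 8
        return j

    def predict(self, i: Sequence[float]) -> Optional[int]:
        """Assign a test item without learning (assumed test behaviour)."""
        return self._search(i)


def split(data: Sequence, n_train: int, seed: int = 0) -> Tuple[list, list]:
    """Pick n_train random items as D'; the rest is the test set."""
    picked = set(random.Random(seed).sample(range(len(data)), n_train))
    train = [x for k, x in enumerate(data) if k in picked]
    test = [x for k, x in enumerate(data) if k not in picked]
    return train, test


# Hypothetical toy data with two well-separated groups.
data = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2],
        [0.2, 0.8], [0.85, 0.15], [0.15, 0.85]]
train_set, test_set = split(data, n_train=4)
net = FuzzyART(rho=0.6)
for item in train_set:
    net.train(item)
assignments = [net.predict(item) for item in test_set]
```

Repeating the loop with different seeds in `split` corresponds to training on several different subsets D'.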
ANALYSIS OF THE ADVANTAGES AND GUIDANCE ON CHOOSING PARAMETERS FOR FUZZY ART
Advantages of Fuzzy ART for the data clustering problem
Applying the Fuzzy ART model to data clustering makes full use of its advantages. First, Fuzzy ART learns a training item into a category only when the similarity between the item and that category reaches a given threshold. This condition can be tuned so that the quality of each category is high. Second, Fuzzy ART creates a new category when the similarity between the training item and every existing category falls below the threshold. Forming a new category in this way reduces the overlap between categories. Third, the parameters of Fuzzy ART are easy to choose: β and ρ take values in [0, 1], α only has to be positive, and suitable values can be selected by following the guidance in the next section.
From the analysis above, Fuzzy ART handles the clustering problem well for two reasons: it is designed for clustering data, and its parameters are easy to choose so that its clustering ability is as high as possible. Moreover, Fuzzy ART updates the weights of only the one chosen category, and the new weights of that category do not depend on the samples previously assigned to it, so the computational complexity of Fuzzy ART is considerably lower than that of the traditional clustering methods. In other words, applying Fuzzy ART to data clustering is effective in terms of both quality and computation time.
Guidance on choosing values for the parameters of Fuzzy ART
Choosing parameter values so that Fuzzy ART clusters as well as possible is fairly simple. According to Equation (2), the larger α is, the lower the chance that a training item is selected into a category, and vice versa. Therefore, depending on whether the goal is coarse clustering or high accuracy, α can be chosen small or large.
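As a small worked example of Equation (2), with values chosen only for illustration, take a committed category whose weight vector equals the input, $W_j = I = (0.3, 0.3)$, so that $|I \wedge W_j| = |W_j| = 0.6$. The choice value then shrinks as α grows:

```latex
T_j = \frac{|I \wedge W_j|}{\alpha + |W_j|} = \frac{0.6}{\alpha + 0.6}, \qquad
\alpha = 0.01:\ T_j \approx 0.984, \quad
\alpha = 0.5:\ T_j \approx 0.545, \quad
\alpha = 5:\ T_j \approx 0.107 .
```

Since α appears only in the denominator, the same trend holds for any fixed input and weight vector.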
The parameter β is the learning rate of the model. In other words, β expresses how strongly a training item influences the weights of the categories. According to Equation (8), the larger β is, the greater the influence of the training item, and vice versa. Therefore, depending on the nature of the data in the sample set, β can be chosen large if the samples are clean and small if the sample set contains outliers.
According to Equation (6), the larger ρ is, the higher the similarity required between a training item and a category. Therefore, ρ can be chosen to suit the nature of the dataset to be clustered. In other words, if the data is scattered and contains many outliers, ρ should be small, and vice versa.
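A short worked example of the vigilance test, again with values chosen only for illustration: for an input $I = (0.5, 0.5)$ and a chosen category with $W_J = (0.3, 0.3)$, the same match ratio leads to resonance under a low ρ and to reset under a high ρ.

```latex
\frac{|I \wedge W_J|}{|I|} = \frac{0.3 + 0.3}{0.5 + 0.5} = 0.6
\quad\Rightarrow\quad
\begin{cases}
\rho = 0.4: & 0.6 \geq \rho \ \text{(resonance, Equation 6)} \\
\rho = 0.8: & 0.6 < \rho \ \text{(reset, Equation 7)}
\end{cases}
```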
EXPERIMENTAL RESULTS
We chose 5 benchmark datasets from the UCI¹ and Shape² databases: Iris, Wine, Jain, Flame, and R15. These datasets differ in the number of attributes, the number of clusters, the number of training samples, and the distribution of samples across clusters. Table 1 shows this information for the selected datasets.
Table 1: Characteristics of the datasets
Dataset    Number of attributes    Number of samples
Iris       4                       150
Glass      9                       214
Wine       13                      178
Jain       2                       373
R15        2                       600
The data in each dataset is normalized to the range [0, 1]. We determined the parameter values that give the best clustering results: α = 0.5, β = 0.1 and ρ = 0.4. For each dataset, we ran sub-experiments with an increasing number of samples.
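The text does not state which normalization was used. One common choice, assumed here purely for illustration, is min-max scaling applied to each attribute separately; the function name `min_max_normalize` is hypothetical.

```python
from typing import List, Sequence


def min_max_normalize(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scale every attribute (column) to [0, 1] using its own min and max."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        # A constant column has no spread; map it to 0.0 (an arbitrary choice).
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Hypothetical raw samples with two attributes on different scales.
raw = [[1.0, 10.0], [3.0, 20.0], [2.0, 30.0]]
scaled = min_max_normalize(raw)
```

After this step every component lies in [0, 1], which is what the problem statement requires of each data item I.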
The percentage of correctly clustered samples is reported in a separate table for each dataset.
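How a sample is counted as correctly clustered is not spelled out in the text. A common convention, assumed here only as an illustration, is to label each cluster with the majority true group among its members and count the samples that agree with that label. The function name `percent_correct` is hypothetical.

```python
from collections import Counter, defaultdict
from typing import Hashable, Sequence


def percent_correct(cluster_ids: Sequence[Hashable],
                    true_labels: Sequence[Hashable]) -> float:
    """Percentage of samples whose true group is the majority of their cluster."""
    by_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        by_cluster[cluster][label] += 1
    # For each cluster, only the members of its most frequent group count.
    correct = sum(counts.most_common(1)[0][1] for counts in by_cluster.values())
    return 100.0 * correct / len(true_labels)


# Hypothetical assignments for six samples from two true groups.
score = percent_correct([0, 0, 0, 1, 1, 1], ["a", "a", "b", "b", "b", "b"])
```

Under this convention the score tends to rise as the number of clusters grows, so it is best read together with the number of categories that were created.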
Test with the Iris dataset
The samples are evenly distributed across the three groups, with 50 samples in each. Table 2 shows the experimental results for the Iris dataset. Fuzzy ART clusters between 93.3% and 100% of the data correctly, which shows that it is highly effective on Iris.
Table 2: Percentage of correctly clustered samples in the Iris dataset
Number of samples    Result (%)
30                   100
60                   98.3
90                   93.3
120                  95
150                  96
Test with the Flame dataset
The two groups contain 87 and 153 samples. Table 3 shows the experimental results for the Flame dataset. Fuzzy ART clusters between 84.6% and 100% of the data correctly, which shows that it is fairly effective on Flame.
Table 3: Percentage of correctly clustered samples in the Flame dataset
Number of samples    50     ...    200    240
Result (%)           100    98.0   98.7   95
¹ Data available at http://archive.ics.uci.edu/ml/datasets
² Data available at http://cs.joensuu.fi/sipu/datasets/
Test with the R15 dataset
The samples are evenly distributed across the 15 groups, with 40 samples in each. Table 4 shows the experimental results for the R15 dataset. Fuzzy ART clusters between 95.3% and 97.3% of the data correctly, which shows that it is highly effective on R15.
Table 4: Percentage of correctly clustered samples in the R15 dataset
Number of samples    Result (%)
100                  96
200                  95.5
300                  95.3
400                  96
500                  96.8
600                  97.3
Test with the Wine dataset
The three groups contain 59, 71, and 48 samples respectively. Table 5 shows the experimental results for the Wine dataset. Fuzzy ART clusters between 76.7% and 100% of the data correctly, which shows that it is fairly effective on Wine.
Table 5: Percentage of correctly clustered samples in the Wine dataset
Number of samples    Result (%)
30                   100
60                   98.3
90                   83.3
120                  76.7
150                  77.3
178                  77.5
Test with the Jain dataset
The two groups contain 276 and 97 samples. Table 6 shows the experimental results for the Jain dataset. Fuzzy ART clusters between 94.6% and 99.6% of the data correctly, which shows that it is highly effective on Jain.
Table 6: Percentage of correctly clustered samples in the Jain dataset
Number of samples    Result (%)
100                  99
200                  99.5
300                  96.3
373                  94.6
The choice of the three parameters α, β, ρ and the correct-clustering results of the 5 experiments above show that Fuzzy ART solves the data clustering problem effectively, with more than 95% of samples clustered correctly in most of the sub-tests.
CONCLUSION AND FUTURE WORK
We have successfully applied the Fuzzy ART model to data clustering, choosing the model parameters so as to obtain the best clustering results. The experimental results also show that the accuracy of the groups produced by Fuzzy ART is high in most cases. The experiments so far give good results, but to make the clustering of Fuzzy ART more accurate, additional algorithms need to be designed to find suitable parameter values for each sample dataset.
REFERENCES
1. G. Carpenter, S. Grossberg, and D. B. Rosen (1991), "Fuzzy ART: Fast Stable Learning and Categorization of Analog Patterns by an Adaptive Resonance System," Neural Networks, vol. 4, pp. 759-771.
2. J. B. MacQueen (1967), "Some methods for classification and analysis of multivariate observations," Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability, no. 1, pp. 281-297.
3. S. C. Johnson (1967), "Hierarchical Clustering Schemes," Psychometrika, vol. 32, issue 3, pp. 241-254.
4. S. Grossberg (1976), "Adaptive pattern classification and universal recoding, II: Feedback, expectation, olfaction and illusions," Biological Cybernetics, 23, 187-212.
5. S. Grossberg (1980), "How does a brain build a cognitive code?" Studies of mind and brain: Neural principles of learning, perception, development, cognition, and motor control (Chap. 1), Boston, MA: Reidel Press.
6. T. Kohonen (1982), "Self-Organizing Formation of Topologically Correct Feature Maps," Biological Cybernetics, Springer-Verlag, vol. 69, pp. 59-69.
SUMMARY
USING FUZZY ART NEURAL NETWORK FOR CLUSTERING DATA
Nong Thi Hoa¹*, Hoang Trong Vinh²
¹ College of Information Technology & Communication - TNU
² FPT Software company
A fuzzy neural network is an artificial neural network that combines fuzzy concepts and fuzzy inference rules with the structure and learning ability of a neural network. Clustering is an important tool in data mining and knowledge discovery. Fuzzy ART (Fuzzy Adaptive Resonance Theory) is a fuzzy neural network that solves the clustering problem effectively. Fuzzy ART clusters better than traditional methods thanks to the following three advantages: learning data until a given condition is satisfied, creating a new category without affecting existing categories, and easy selection of its parameters. In this paper, we apply Fuzzy ART to cluster 5 benchmark datasets. After showing the experimental results, we present guidance on choosing suitable parameter values so that the clustering ability of Fuzzy ART is the highest. We then analyze the advantages of Fuzzy ART when it is applied to clustering data. The experimental results also show that Fuzzy ART clusters effectively.
Keywords: Fuzzy ART, ART, Fuzzy Neural Network, Fuzzy Set, Clustering
Received: 15/5/2013; Reviewed: 20/5/2013; Accepted for publication: 26/7/2013