1
'R÷UXVDO2OPD\DQ*|PPH7HNQLNOHUL$OWÕQGD*HQ'L]LOHULQLQ(YULPVHO
øOLúNLOHUL
Evolutionary Relationships Between Gene Sequences via Nonlinear
Embedding
7XQFD'R÷DQ
1%LOJH.DUDoDOÕ
21.
%L\RWHNQRORMLYH%L\RPKHQGLVOLN
%|OP
,
ø]PLU<NVHN7HNQRORML(QVWLWV
tuncadogan@iyte.edu.tr
2.
(OHNWULNYH(OHNWURQLN0KHQGLVOL÷L%|OP
ø]PLU<NVHN7HNQRORML(QVWLWV
bilgekaracali@iyte.edu.tr
g]HWoH
%XoDOÕúPDGDoRNOXGL]LKL]DODPDVÕQÕWDNLEHQoÕNDUÕPVDQPÕú JHQ GL]L LNLOLOHUL DUDVÕQGDNL HYULPVHO X]DNOÕNODUD uygulanan GR÷UXVDOROPD\DQJ|PPHWHNQL÷LQLQKDWDDQDOL]LVXQXOPXúWXU %X DPDoOD VHQWHWLN JHQ GL]LOHUL ROXúWXUXOPXú YH o D\UÕ HYULPVHO \RO ER\XQFD KHU GL]L LNLOLVL DUDVÕQGDNL JHUoHN HYULPVHO X]DNOÕN ND\ÕW DOWÕQD DOÕQDUDNWDQ UDVWODQWÕVDO yer GH÷LúWLUPH PXWDV\RQXQD WDEL WXWXOPXúODUGÕU 'úN ER\XWOX ELU YHNW|U X]D\ÕQD \DSÕODQ GR÷UXVDO ROPD\DQ J|PPH LúOHPL |QFHVL YH VRQUDVÕ HOGH EXOXQDQ oÕNDUÕPVDQPÕú HYULPVHO X]DNOÕNODULOHJHUoHNHYULPVHOX]DNOÕNODUDUDVÕQGDNLIDUNOÕOÕNODU NDUúÕODúWÕUÕOPÕúWÕU 6RQXoODU GR÷UXVDO ROPD\DQ J|PPH LúOHPLQLQ oÕNDUÕPVDQDQ HYULPVHO X]DNOÕNODUGD \HU DODQ KDWD SD\ÕQÕ GúUPH NRQXVXQGDNL EDúDUÕVÕQÕ J|VWHUPLúWLU 6RQXo olarak, HYULPVHO X]DNOÕNODU ]HULQH X\JXODQDQ GR÷UXVDO ROPD\DQ J|PPH LúOHPLQLQ JHQ GL]LOHUL DUDVÕQGDNL HYULPVHO LOLúNLOHUL RUWD\D oÕNDUPDN DGÕQD JYHQLOLU oÕNDUÕPODUGD EXOXQDELOHFH÷LEHOLUWLOPLúWLU
Abstract
We present an error analysis on the application of non-linear embedding on pairwise evolutionary distances inferred over a collection of genetic sequences following multiple sequence alignment. To this end, we have generated gene sequences evolved by random substitutions along three different evolutionary pathways with known evolutionary distances between every sequence pair. We have compared the discrepancy between the inferred evolutionary distances to the true distances before and after non-linear embedding into a low dimensional vector space. The results indicate that non-linear embedding achieves significant reduction in error in the estimated evolutionary distances. Consequently, non-linear embedding of evolutionary distances can provide more reliable inferences on the evolutionary relationships between genetic sequences.
1.
*LULú
%L\RORMLN RUJDQL]PDODUÕQ JHQ YH SURWHLQ GL]LOHUL DUDVÕQGDNL HYULPVHO LOLúNLOHULQ NHúfi, sistemlerini meydana getiren
PROHNOHU YH IL]\RORMLN PHNDQL]PDODUÕ RUWD\D oÕNDUPDN DoÕVÕQGDQ|QHPWDúÕU. %XoDOÕúPDODUHOGHNLGL]LOHULQDUDVÕQGDNL benzerlikleri dikkate alarak RUWDN DWDODUÕQ RUWD\D oÕNDUÕOPDVÕ
YH WP EX GL]LOHULQ DWDODU-WRUXQODU úHNOLnde birbirleri ile
LOLúNLOHQGLULOGLNOHUL VR\ D÷DoODUÕQÕQ ROXúWXUXOPDVÕ\OD JHUoHNOHúWLULOLU%LUGL÷HUGH\LúOHILORJHQHWLN\|QWHPOHUEXJHQ YHSURWHLQGL]LOHULDUDVÕQGDNLHQPXKWHPHOKL]DODPD\ÕGLNNDWH DODUDN PROHNOHU \HU GH÷LúLPOHULQL VDSWDPD\Õ YH EX úHNilde
GL]LOHULQ RUWDN JHoPLúlerLQL RUWD\D oÕNDUPD\Õ KHGHIOHU Bu
\|QWHPOHU GL]L X]DNOÕNODUÕQÕQ KHVDSODQÕS EX X]DNOÕNODUÕ HYULPVHO X]DNOÕNODUOD LOLúNLOHQGLUHQ matematiksel bir modelin
X\JXODQPDVÕYHVRQXoWDHYULPVHOX]DNOÕNODUÕQoÕNDUVDQPDVÕQD GD\DQÕU
PrDWLNWH JHQ YH SURWHLQ GL]LOHULQLQ oRN X]XQ ROPDVÕ YH
|]HOOLNOH JHoPLú ]DPDQD DLW GL]LOHULQ oR÷XQOXNOD HOGH EXOXQPDPDVÕVLVWHPLQNDUPDúÕNOÕ÷ÕQÕDUWWÕUÕUYHNDYUDQPDVÕQÕ JoOHúWLULU%XQHGHQGHQGROD\ÕEXWLSDUDúWÕUPDODULVWDWLVWLNVHO DQODPGD JHUoHNOHúWLULOLU %X X\JXODPDODU JHUHN GR÷UXOXN JHUHNVH KÕ] JLEL |]HOOLNOHUL GROD\ÕVÕ LOH LúOHPVHO RUWDPGD \UWOUKonvansiyonel DQODPGDNXOODQÕODQEX\|QWHPOHUGHQ
ED]ÕODUÕ -XNHV-Cantor mRGHOL >@ .LPXUD¶QÕQ iki parametreli modeli [2] ve genel zaman tersinir model¶GLU [3]. Bu klasik
\|QWHPOHU ELOLQPH\HQ HYULPVHO X]DNOÕNODU KDNNÕQGD ED]Õ
tahminlerde bulunsa da, bu tahminler bazen VRQXoODUÕQ
JYHQHELOLUOL÷LQLGúUHQKDWDODUD tabi olabilmektedir.
%X WLS oDOÕúPDODUGD VD÷OÕNOÕ VRQXoODU HOGH HGLOPHVLQLQ ve
EX VRQXoODUÕQ NDYUDQDELOLU J|UVHO LIDGHOHUOH DoÕNODQPDVÕQÕQ NDUúÕVÕQGDNL HQ E\N HQJHOOHUGHQ ELULVL de oRN ER\XWOXOXN
VRUXQXGXU $QDOL]H JLUGL RODUDN YHULOHQ EX GL]LOHULQ ]HULQGH NRGODQPÕú KHU ELU NDUDNWHU DQDOL]H NDWÕOPÕú ELU \HQL ER\XWX
ifade eder ve bu girdiler kimi zaman onbinlerce karakter
X]XQOX÷XQGD ROGX÷XQGD JHQ YH SURWHLQ GL]LOHULQGH ROGX÷X
gibi) boyut sorunu oDOÕúPDGDQ VD÷OÕNOÕ VRQXoODU DOPD\Õ KD\OL
JoOHúWLUHELOLU. %X VRUXQX DúPDN LoLQ GR÷UXVDO YH GR÷UXVDO
ROPD\DQ J|PPH WHNQLNOHUL NXOODQÕOPDNWDGÕU. Bu teknikler
2
PDNXO ER\XWODUGD LIDGH HWPHN LoLQ X\JXODQPÕúWÕU .ODVLN oRN ER\XWOX |OoHNOendirme [4@ YH HVDV ELOHúHQ Dnalizi [5] bu
WHNQLNOHULQ LON YH HQ VÕN NXOODQÕODQ |UQHNOHULGLU Analiz edilecek veULOHUELUG]OHP]HULQGHGD÷ÕOÕPJ|VWHUGL÷L]DPDQ
EDúDUÕOÕ VRQXoODU UHWPHVLQH UD÷PHQ EX \|QWHPOHULQ H÷UL \]H\OHUGH EDúDUÕVÕ] ROGX÷X OLWHUDWUGH EHOLUWLOPLúWLU >6@ (÷UL
\]H\OHUGHNL GD÷ÕOÕPODUÕ \DNDOD\DELOPHN LoLQ, bu klasik
\|QWHPOHULQX\DUODQPDVÕ ile ELUoRNGR÷UXVDOROPD\DQJ|PPH
WHNQL÷L JHOLúWLULOPLú YH IDUNOÕ DODQODUGD X\JXODQPÕúWÕU Bu
\|QWHPOHUGHQED]ÕODUÕGR÷UXVDOROPD\DQKaritalama [7], yerel
GR÷UXVDO J|PPH >8@ YH VWRNDVWLN \DNÕQOÕN J|PPHsi [9]
\|QWHPOHULGLU
'R÷UXVDO ROPD\DQ J|PPH \|QWHPOHULnin protein ve gen
GL]LOHUL DUDVÕQGDNL EHQ]HUOLNOHUL RUWD\D oÕNDUPDN DGÕQD oRNOX
dizi hL]DODPDVÕQÕ WDNLEHQ GL]L YHULOHULQGHQ oÕNDUÕPVDQDQ evULPVHO X]DNOÕNODUD X\JXODQPDVÕ GDKD |QFH oDOÕúÕOPÕúWÕU
gUQH÷LQ ELU oDOÕúPDGD VWRNDVWLN \DNÕQOÕN J|PPHsi WHNQL÷L kulODQÕODUDNELUELULLOH\DNÕQLOLúNLOHULoLQGHRODQED]ÕSURWHLQ dizileri DQODPOÕNPHOHULoLQGHJUXSODQPD\DoDOÕúÕOPÕúWÕU>0]. FDUNOÕ DQDOL] SDUDPHWUHOHULQLQ GHQHQLS EX SDUDPHWUHOHULQ
RSWLPL]H HGLOPH\H oDOÕúÕOGÕ÷Õ DUDúWÕUPDGD ED]Õ SURWHLQ
ailelerinin fonksiyonel oODUDN ELUELULQH \DNÕQ \HOHULQLQ evrimsel anlamda uzak SURWHLQOHUGHQ D\UÕOÕS iki boyutlu uzayda gruplaQPDVÕ EDúDUÕOPÕú EX úHNLOGH GR÷UXVDO ROPD\DQ
J|PPHWHNQLNOHULQLQSURWHLQYHJHQGL]LOHULQLNPHOH\HELOPH
becerisi ortaya koyulPXúWXU>0].
ø]RPHWULN|]HOOLNKaritalama (ISOMAP)GR÷UXVDOROPD\DQ
J|PPH \|QWHPOHULQGHQ ELULGLU %X \|QWHP NODVLN |OoHNOHQGLUPH LoLQH J|POPú RODUDN, NRPúXOXN GL\DJUDPÕ
]HULQGHNL MHRGH]LN X]DNOÕNODUÕ GLNNDWH DOÕU[6@ %X |]HOOLN
\|QWHPH H÷UL \]H\OHU ]HULQH GD÷ÕOPÕú YHULOHUL ELOH EDúDUÕOÕ úHNLOGH \DNDODPD EHFHULVLQL ND]DQGÕUÕU $QDOL]H JLUGL RODUDN YHULOHQ oÕNDUÕPVDQPÕú HYULPVHO X]DNOÕNODU NRPúX \DNÕQ QRNWDODUDUDVÕQGDNLMHRGH]LNX]DNOÕNODUÕKHVDSODPDNLoLQiyi bir
\DNODúÕP RODUDN \HU DOÕU >6@ 8]DN QRNWDODU LoLQVH EX NRPúX noktalar DUDVÕQGDQ ELUoRN NoN DWODPD VW VWH HNOHQHUHN
MHRGH]LNX]DNOÕNODUKHVDSODQDELOLU
%XELOGLULGHVHQWHWLNJHQGL]LOHULQLQ,620$3DOJRULWPDVÕ NXOODQÕODUDN X\JXODQPÕú RODQ GR÷UXVDO ROPD\DQ J|PPH VRQUDVÕ HOGH HGLOHQ HYULPVHO X]DNOÕNODUÕQÕQ KDWD DQDOL]L VXQXOPXúWXU %X UDSRUXQ VÕUDGDNL NÕVPÕQGD IDUNOÕ HYULPVHO \ROODU L]OH\HQ VHQWHWLN GL]LOHULQ UHWLOPHVL YH KDWD DQDOL]LQLQ JHUoHNOHúWLULOPHVL LoLQ NXOODQÕODQ SDUDPHWUH DoÕNODQPÕú WDNLS HGHQE|OPGHLVHVRQXoODURUWD\DNRQXOPXúYHEXVRQXoODUÕQ \RUXPODQPDVÕJHUoHNOHúWLULOPLúWLU
2.
<|QWHP
dDOÕúPDGDLON|QFHLúOHPVHORUWDPGDKD]ÕUODQPÕúRODQVHQWHWLN JHQGL]LOHULQLQDUDODUÕQGDNLHYULPVHOX]DNOÕNOar, konvansiyonel Jukes-Cantor mRGHOL >@ NXOODQÕODUDN oÕNDUÕPVDQPÕúWÕU 'DKD sonra bu YHULOHU ]Hrinde ,620$3 \|QWHPL uygulanarak,
GR÷UXVDO ROPD\DQ J|PPH VRQUDVÕ HYULPVHO X]DNOÕN GH÷HUOHUL HOGH HGLOPLúWLU. Bu iki grup HYULPVHO X]DNOÕN GH÷Hri, sentetik
GL]LOHUDUDVÕQGDNLJHUoHNHYULPVHOPHVDIHOHUOHNDUúÕODúWÕUÕODUDN KDWD DQDOL]L RUWD\D NR\XOPXúWXU +DWD PLNWDUODUÕ JUDILNOHU ]HULQGH NDUúÕODúWÕUÕODUDN ,620$3 \|QWHPL LOH \DSÕODQ GR÷UXVDO ROPD\DQ J|PPH LúOHPLQLQ oÕNDUÕPVDQPÕú HYULPVHO X]DNOÕNODU LoLQGH \HU DODQ KDWD PLNWDUODUÕQÕ DQODPOÕ RODUDN GúUPH\L EDúDUÕS EDúDUDPDGÕ÷Õ GL÷HU ELU GH\LúOH KDWD GúUPe kapasitesi LQFHOHQPLúWLU Sentetik gen dizilerinin
ROXúWXUXOPDVÕ GDKLO WP LúOHPOHU YH DQDOL]OHU 0$7/$% \D]ÕOÕPÕNXOODQÕODUDN\DSÕOPÕúWÕU
2.1. 6HQWHWLNJHQGL]LOHULQLQROXúWXUXOPDVÕ
6HQWHWLN JHQ GL]LOHULQLQ KD]ÕUODQPDVÕQGD o D\UÕ HYULPVHO \RO L]OHQPLú DQDlizler de EX o \RO LoLQ D\UÕ D\UÕ
JHUoHNOHúWLULOPLúWLU (YULPVHO \ROODU RULMLQDO VHQWHWLN JHQ GL]LVLQLQED]ÕE|OJHOHULQGHUDVWODQWÕVDOED]DO\HUGH÷LúWLUPHOHU \HUOHúWLULOPHVL LOH \HQL GL]LOHU ROXúWXUXOPDVÕQD GD\DQÕU Bu
RSHUDV\RQLúOHPVHORUWDPGDVLPXOHHGLOPLúWLU
+HU ELU GL]L NHQGLVLQGHQ ELU |QFH JHOHQ GL]LQLQ |QFHNL QHVOLQ HQ \DNÕQ DNUDEDVÕGÕU %X úHNLOGH QHVLOOHU LOHUOHGLNoH ROXúDQ\HQLGL]LOHURULMLQDO(ilk) GL]LGHQJLWWLNoHIDUNOÕODúÕUODU Her bir evrimsel yolda 501¶HU (orijinal diziyle beraber) dizi bulunur. +HU ELU GL]L QNOHRWLGGHQ ROXúXU $QDOL]GH
NXOODQÕODQ \HU GH÷LúWLUPH KÕ]Õ ¶GLU 'L÷HU ELU GH\LúOH YDU RODQELUGL]LGHQ\HQLELUGL]LROXúWXUXOXUNHQGL]Llerde yer alan
YHKHUELULQLELUQNOHRWLGLQNDSODGÕ÷ÕE|OJHGHQKHUELUL
ioLQeski dizideki QNOHRWLGLQD\QHQ\HQLGL]L\H\HUOHúWLULOPH ihtimali UDVWJHOH RODUDN IDUNOÕ QNOHRWLGGHQ ELULQLQ
\HUOHúWLULOPH LKWLPDOL LVH ¶GLU Evrimsel yol 1 (EY1), Evrimsel yol 2 (EY2) ve Evrimsel yol 3¶Q (EY3) olXúWXUXOPDVÕQGDL]OHQHQ\ROlar ùHNLO¶GHYHULOPLúWLU
HDWD GH÷HUOHUL (< (< YH (< LoLQ D\UÕ D\UÕ HOGH
HGLOPLúWLU %XQXQOD EHUDEHU KDWD DQDOL]L KHU ELU HYULPVHO \RO LoLQ GL]L LoLQGH \HU DODQ WP GL]L LNLOLOHUL LoLQ WRSOX úHNLOGH GH÷LO DUDODUÕQGD EHOLUOL QHVLO VD\ÕODUÕ EXOXnan diziler
LoLQD\UÕD\UÕRUWD\DNRQPXúWXU(10, 20, 50, 100 ve 200 nesil
IDUNOÕOÕNODU LoLQ %XQXQ QHGHQL DUDGDNL QHVLO VD\ÕVÕQÕQ GH÷LúLPL\OH EHUDEHU KDWD oranÕQÕQ GD úLGGHWOL úHNLOGH
GH÷LúPHVidir.
2.2. *HQ GL]LOHUL DUDVÕQGDNL oÕNDUÕPVDQPÕú evrimsel
X]DNOÕNODUÕQKHVDSODQPDVÕ
6HQWHWLN JHQ GL]LOHUL DUDVÕQGD \HU DODQ HYULPVHO X]DNOÕNODUÕQ oÕNDUÕPVDQPDVÕQGD NXOODQÕODQ NRQYDQVL\RQHO -XNHV-Cantor
PRGHOL VWRNDVWLN ELU \DNODúÕPGÕU %X YH EHQ]HUL
konvansiyonel modeller, LNLGL]LDUDVÕQGDNLGL]L uzaklÕ÷ÕQÕED]Õ parametreler ve basit bir formulasyon \DUGÕPÕile yine bu iki
GL]L DUDVÕQGDNL HYULPVHO X]DNOÕ÷D oHYLULUler. Jukes-Cantor
PRGHOLQNOHRWLG\HUGH÷LúWLUPHKÕ]ÕQÕoHúLWQNOHRWLGLQWP IDUNOÕ LNLOL NRPELQDV\RQODUÕ LoLQ HúLW NDEXO HGHU %XQXQ \DQÕQGDQNOHRWLGIUHNDQVODUÕYHGL]L]HULQGHNLE|OJHOHUDUDVÕ \HU GH÷LúWLUPH KÕ]ODUÕ GD HúLW NDEXO HGLOPLúWLU -XNHV-Cantor
PRGHOLQLQIRPXODV\RQXQXPDUDOÕGHQNOHPGHYHULOPLúWLU X]DNOÕ÷ÕQÕ WHPVLO HGHU dRNOX GL]L KL]DODPDVÕ VRQXFX RUWD\D oÕNDQGL]LOHU]HULQGHKHUELUE|OJHLoLQD\UÕD\UÕ\DSÕODQLNLOL NDUúÕODúWÕUPDODU VRQUDVÕQGD LNLOLdizi X]DNOÕNODUÕ RUWD\D oÕNDU
%X KHVDSODPDGD KHUKDQJL ELU E|OJHGH LNL GL]LGH GH D\QÕ QNOHRWLG EXOXQX\RUVD ELU HúOHúPH IDUNOÕ QNOHRWLGOHU YDUVD HúOHúPHPH ND\ÕW HGLOLU øúOHP VRQXQGD WRSODP HúOHúPHPH VD\ÕVÕ LNL GL]LGH GH ERúOXN ROPD\DQ WRSODP E|OJH VD\ÕVÕQD RUDQODQÕU'HQNOHPLQGHULYDV\RQXUHIHUDQVE|OPQGHYHULOHQ
3 2.3. *HQ GL]LOHUL DUDVÕQGDNL oÕNDUÕPVDQPÕú evrimsel
X]DNOÕNODUÕQ,620$3\|QWHPLLOHLúOHQPHVL
,620$3 \|QWHPL NHQGLVLQH JLUGL RODUDN YHULOHQ ELU X]DNOÕN PDWULVLQLDOÕSPDWULVWHDUDODUÕQGDNLPHVDIHOHUYHULOHQQRNWDODUÕ oRNER\XWOXYHNW|UX]D\ÕQGDVDELWQRNWDODUDDWDU%X úHNLOGH QPHULNX]DNOÕNPDWULVOHULQLGHJHRPHWULNRODUDNWHPVLOHWPLú
olur. <|QWHPGHQ DOÕQDQ oÕNWÕGD YHULQLQ J|POHFH÷L YHNW|U
X]D\Õ ER\XW VD\ÕVÕ LVWH÷H J|UH VHoLOLU *HQHOGH LúOHP ELUoRN IDUNOÕ ER\XW VD\ÕVÕ LoLQ DUGÕ VÕUD WHNUDUODQÕU YH LúOHP VRQXQGD DUDúWÕUPDFÕQÕQ EX ER\XW VD\ÕODUÕQGDQ ELULQL VHoLS GR÷UXVDO ROPD\DQJ|PPHQLQEXER\XWWDNLELUYHNW|UX]D\ÕQD\DSÕOPDVÕ VD÷ODQÕU %R\XW VD\ÕVÕQÕQ VD÷OÕNOÕ RODUDN VHoLOHELOPHVL LoLQ ER\XWVD\ÕVÕQDNDUúÕOÕNDUWÕNYDU\DQV GH÷HUOHULQLQ oL]LOGL÷L ELU
grafik verilir DUWÕN YDU\DQV JUDIL÷L. Bu grafikte, artan boyut
VD\ÕVÕ LOH EHUDEHU GúHQ DUWÕN YDU\DQV GH÷HULQLQ LQFHOHQPHVL YDU\DQVWDNLGúúQVRQDHULSPLQLPXPD\DNODúWÕ÷ÕHQGúN ER\XWVD\ÕVÕQÕQVHoLPLQLVD÷ODU0PNQROGX÷XNDGDUGúN ER\XWWDNL LIDGHQLQ VHoLOPHVLQLQ QHGHQL YHUL VHWLQLQ LoLQGHNL \DSÕ\ÕER]PDGDQ, VLVWHPLPDNVLPXPGHUHFHGHEDVLWOHúWLUPHN
KDWWDLPNDQYDUVDJ|UVHOJHRPHWULNELUWHPVLOLVD÷OD\DELOHFHN RODQYH\DER\XWWDoDOÕúPDNWDGÕU
%LU|QFHNLDGÕPGD-XNHV-&DQWRUPRGHOLLOHoÕNDUÕPVDQPÕú olan HYULPVHO X]DNOÕNODU ,620$3 \|QWHPLQH JLUGL RODUDN
YHULOHUHNGR÷UXVDOROPD\DQJ|PPHWHNQL÷LLOHoRNER\XWOXELU YHNW|U X]D\ÕQGD LIDGH HGLOPHOHUL VD÷ODQÕU ,620$3 \|QWHPL KHUJHQGL]LVLQLGL]LOHUDUDVÕLNLOLHYULPVHOX]DNOÕNODUÕGLNNDWH DODUDN oRN ER\XWOX X]D\GD \HU DODQ NRRUGLQDWODUÕ EHOLUOL ELU
nokta ile ifade eder. HHU ELU JHQ GL]LVLQH WHNDEO HGHQbu
QRNWDODU DUDVÕQGDNL PHWULN PHVDIHOHU, \|QWHPLQ VRQXo RODUDN
YHUGL÷LHYULPVHOX]DNOÕNODURODUDNND\GHdilir.
ùHNLO1: EYYH¶QROXúWXUXOPDVÕQGDL]OHQHQ\ROODU
3.
6RQXoODU
YHWDUWÕúPD
+DWD GúUPH NDSDVLWHVLQLQ RUWD\D NRQPDVÕ LoLQ NXOODQÕODQ SDUDPHWUH RUWDODPD PXWODN KDWD GH÷HUOHULGLU %X SDUDPHWUH
bir \|QWHPGHQ oÕNWÕ RODUDN DOÕQDQ bir dizi ikiliVL DUDVÕQGDNL
oÕNDUÕPVDQPÕú HYULPVHO X]DNOÕN GH÷HUL LOH EX LNL GL]L DUDVÕQGDNLJHUoHNHYULPVHOX]DNOÕ÷ÕQELUELUOHULQGHQoÕNDUÕOPDVÕ
LOH RUWD\D oÕNDQ IDUNÕQ PXWODN GH÷HU LoLQH DOÕQPDVÕ D\QÕ LúOHPLQ WP GL]L LNLOLOHUL LoLQ WHNUDUODQPDVÕ YH VRQXQGD HOGH
edilHQ WP GH÷HUlerLQ RUWDODPDVÕQÕQ DOÕQPDVÕ\OD HOGH HGLOLU
6RQXoWDELULNRQYDQVL\RQHO\|QWHPHYHGL÷HULGH,620$3¶H DLWROPDN]HUHLNLKDWDGH÷HULROXúXU
,620$3 LúOHPL VRQXQGD GR÷UXVDO ROPD\DQ J|PPH LúOHPLQLQ \DSÕODFD÷Õ YHNW|U X]D\ÕQÕQ ER\XW VD\ÕVÕQÕ VHoPek
LoLQ DUWÕN YDU\DQV JUDILNOHUL ùHNLO LQFHOHQPLúWLU $UWÕN YDU\DQVWDNL GúúQ VRQD HUGL÷L ER\XWODU RODQ (< YH (< LoLQER\XWOX(<LoLQVHER\XWOXYHNW|UX]D\ODUVHoLOPLúWLU
6RQXoODUÕQ LQFHOHQPHVLQLQ YH ELUELUOHUL LOH NDUúÕODúWÕUÕOPDVÕQÕQ J|UVHO DQODPGD YH NROD\ DQODúÕODELOLU ROPDVÕ LoLQ JUDILN J|VWHULPOHU KD]ÕUODQPÕú YH EX E|OPQ GHYDPÕQGDVXQXOPXúWXU%XNÕVÕPGDYHULOHQKHUELUúHNLOD\UÕ ELU HYULPVHO \RO (< (< YH (< LoLQ \DSÕOPÕú DQDOL]L WHPVLOHGHU+HUúHNLOGH[-ekseninde nesil uzDNOÕNODUÕ
YH \ HNVHQLQGH LVH LOJLOL QHVLO X]DNOÕNODUÕQD LOLúNLQ RUWD\D oÕNDQ KDWD GH÷HUL NRQYDQVL\RQHO -XNHV-Cantor
PRGHOLYH,620$3\|QWHPLLoLQLNLD\UÕEDUYHULOPLúWLU
4
ùHNLl 3, 4 ve 5¶GH J|UOG÷ ]HUH WP DQDOL]H WDEL
WXWXOPXú HYULPVHO \ROODU LoLQ QHVLO IDUND NDGDU ± 50
DUDVÕNRQYDQVL\RQHOPRGHOYH,620$3oÕNWÕODUÕELUELUOHULQH oRN \DNÕQ KDWD GH÷HUOHUL RUWD\D NR\PXúWXU 6RQXo RODUDN ,620$3 \|QWHPL 50 nesile kadar olan mesafelerde,
NRQYDQVL\RQHO \|QWHPLQ oÕNDUÕPVDGÕ÷Õ HYULPVHO X]DNOÕNODUGD \HUDODQKDWD\Õbelirgin RODUDN GúUPHPLúWLU %XQXQ QHGHQL
LVH EX DQDOL] úDUWODUÕQGD QHVLOH NDGDU RODQ PHVDIHOHUGH NRQYDQVL\RQHO PRGHOLQ ]DWHQ ROGXNoD GúN YH NDEXO
edileELOLU|OoGHKDWDlar YHUPLúROPDVÕGÕU
<LQH D\QÕ úHNLOOHUGH J|UOPHNWHGLU NL YH QHVLO PHVDIHOHUGH NRQYDQVL\RQHO PRGHOLQ UHWWL÷L KDWD ELU DQGD \NVHOPHNWHGLU %XQXQ \DQÕQGD ,620$3 VRQXQGD oÕNDQ GH÷HUOHU LoHULVLQGH NDODQ KDWD oRN GúN PHVDIHOHUGH J|]OHQHQ KDWD LOH \DNÕQ GH÷HUOHUH VDKLSWLU %X da, ISOMAP
LoLQ100 ve 200 nesil VÕUDVÕyla 0,65 ve 0,75 GH÷HUOHULQGHdizi
X]DNOÕNODUÕQD YH YH GH÷HUOHULQGH oÕNDUÕPVDQPÕú HYULPVHO X]DNOÕNODUD NDUúÕOÕN JHOLU PHVDIHOHUGH RUWD\D oÕNDQ oRNEHOLUJLQELUKDWDLQGLUJHPHNDSDVLWHVLQLLúDUHWHWPHNWHGLU
,620$3 \|QWHPLQLQ \NVHN QHVLO PHVDIHOHULQGHNL KDWDODUÕ LQGLUJHPHGH NL EX EDúDUÕVÕ DOJRULWPDQÕQ ELUELULQGHQ oRN X]DNWD \HU DODQ LNL QRNWD DUDVÕQGDNL PHWULN PHVDIH\L
hesaplarken, girdi olarak kendisine verileQ GH÷HUL GLNNDWH
DOPDN \HULQH ELUELULQH oRN \DNÕQ PHVDIHGHNL QRNWDODU DUDVÕQGD NoN DWODPDODU \DSDUDN EX LNL X]DN QRNWDQÕQ ELULQGHQ GL÷HULQH XODúPDVÕQD YH ROXúDQ EX PHVDIH\L GLNNDWH DOPDVÕQDGD\DQÕU
ùHNLO3: (< LoLQ NRQYDQVL\RQHO PRGHO -XNHV-Cantor) ile
,620$3DUDVÕQGDRUWDODPDPXWODNKDWDNDUúÕODúWÕUPDVÕ
ùHNLO4: (< LoLQ NRQYDQVL\RQHO PRGHO -XNHV-Cantor) ile
,620$3DUDVÕQGDRUWDODPDPXWODNKDWDNDUúÕODúWÕUPDVÕ
ùHNLO5: (< LoLQ NRQYDQVL\RQHO PRGHO -XNHV-Cantor) ile
,620$3DUDVÕQGDRUWDODPDPXWODNKDWDNDUúÕODúWÕUPDVÕ
%LU EDúND GH\LúOH NÕVD PHVDIHOHU DUDVÕQGDNL X]DNOÕ÷Õ KHVDSODUNHQRUWD\DoÕNDQKDWD GH÷HUL NoN X]XQ PHVDIHOHUL KHVDSODUNHQRUWD\DoÕNDQKDWDGH÷HULE\NWU,620$3X]XQ PHVDIHOHUDUDVÕQGDNLX]DNOÕNODUÕKHVDSODUNHQNÕVDYHJYHQLOLU
mesafeleri birbirinH HNOH\HUHN ELU X]DNOÕN EHOLUOHGL÷L LoLQ
KDWD\ÕNoNGH÷HUOHUGHWXWPD\ÕEDúDUPDNWDGÕU.
(OGH HGLOHQ EX VRQXoODU GR÷UXVDO ROPD\DQ J|PPH \|QWHPL ,620$3¶LQ JHQ YH SURWHLQ GL]ilerinin evrimsel
VÕQÕIODPDVÕ konusunda, evrimsel mesafelerin tahminindeki
KDWDODUÕbelirgin úHNLOGHGúUPHNDSDVLWHVLQHVDKLSROGX÷XQX
RUWD\D NR\PXúWXUAOJRULWPDQÕQ YH SDUDPHWUHOHULQen uygun
úHNLOGH VHoLPL ile \|QWHPLQ GL]LOHU DUDVÕ HYULPVHO LOLúNLOHUL
RUWD\D oÕNDUPD NRQXVXQGD YHULPOL RODUDN NXOODQÕOma potansiyeline sDKLSROGX÷XQXGúQOPHNWHGLU
4.
.D\QDNoD
[1] T. H. Jukes, C. R. Cantor, ³(YROXWLRQ RI SURWHLQ
PROHFXOHV´ Pp. 21-123 in H. N. Munro, ed. Mammalian protein metabolism. Academic Press, New York, 1969. [2] M. $ .LPXUD ³6imple method for estimating
evolutionary rate of base substitution through comparative VWXGLHVRIQXFOHRWLGHVHTXHQFHV´Journal of Molecular Evolution 16:111-120, 1980.
[3] 6 7DYDUp "Some Probabilistic and Statistical Problems in the Analysis of DNA Sequences". American Mathematical Society: Lectures on Mathematics in the Life Sciences 17: 57±86.
[4] I. Borg, P. Groenen, ³Modern Multidimensional Scaling: theory and applications´, Springer New York, 2005. [5] . 3HDUVRQ ³On Lines and Planes of Closest Fit to
Systems of Points in Space´. Philosophical Magazine 2
(6): 559±572, 1901.
[6] J. B. Tenenbaum, V. de Silva, J. C. Langford, ³A global geometric framework for nonlinear dimensionality reduction´. Science, 290, 2319±2323, 2000.
[7] J. W. Sammon, ³A nonlinear mapping for data structure analysis´. IEEE Trans. Computers 1969, 18, 401-409. [8] S. Roweis, L. Saul, ³Nonlinear dimensionality reduction
by LLE´. Science, 290, 2323-2326, 2000.
[9] D. K. Agrafiotis, H. Xu, ³A self-organizing principle for learning nonlinear manifolds´. Proc. Natl. Acad. Sci. U.S.A., 99, 15869- 15872, 2002.
[10]M. A. Farnum, H. Xu, D. K. Agrafiotis, ³Exploring the nonlinear geometry of protein homology´. Protein Science, 12:1604±1612, 2003.