Phylogeny of Frogs of the Physalaemus pustulosus Species Group, With an Examination of Data Incongruence
DAVID C. CANNATELLA,1,2 DAVID M. HILLIS,1 PAUL T. CHIPPINDALE,1,3 LEE WEIGT,4 A. STANLEY RAND,4,5 AND MICHAEL J. RYAN1,4
1Department of Zoology, University of Texas, Austin, Texas 78712, USA; E-mail: (D.M.H.) hillis@bull.zo.utexas.edu, (M.J.R.) mryan@mail.utexas.edu
2Texas Memorial Museum, University of Texas, Austin, Texas 78705, USA; E-mail: catfish@mail.utexas.edu
Abstract. Characters derived from advertisement calls, morphology, allozymes, and the sequences of the small subunit of the mitochondrial ribosomal gene (12S) and the cytochrome oxidase I (COI) mitochondrial gene were used to estimate the phylogeny of frogs of the Physalaemus pustulosus group (Leptodactylidae). The combinability of these data partitions was assessed in several ways: measures of phylogenetic signal, character support for trees, congruence of tree topologies, compatibility of data partitions with suboptimal trees, and homogeneity of data partitions. Combined parsimony analysis of all data equally weighted yielded the same tree as the 12S partition analyzed under parsimony and maximum likelihood. The COI, allozyme, and morphology partitions were generally congruent and compatible with the tree derived from combined data. The call data were significantly different from all other partitions, whether considered in terms of tree topology alone, partition homogeneity, or compatibility of data with trees derived from other partitions. The lack of effect of the call data on the topology of the combined tree is probably due to the small number of call characters. The general incongruence of the call data with other data partitions is consistent with the idea that the advertisement calls of this group of frogs are under strong sexual selection.
[Advertisement calls; behavior; combined-data analysis; data partitions; frogs; Leptodactylidae; Physalaemus; sensory exploitation hypothesis.]
Whether or not to combine data sets has been discussed widely in the recent literature (Bull et al., 1993; Eernisse and Kluge, 1993; Chippindale and Wiens, 1994; de Queiroz et al., 1996). Less discussed is the identification and localization of incongruence among data partitions (but see Huelsenbeck and Bull, 1996; Poe, 1996; Mason-Gamer and Kellogg, 1996; Lutzoni, 1997). It has been argued that if different data partitions are no more different than expected by sampling error, then the data can be combined into a single analysis (Bull et al.,
3Present address: Department of Biology, University of Texas, Arlington, Texas 76019, USA; E-mail: paulc@albert.uta.edu
4Smithsonian Tropical Research Institute, Unit 0948, APO AA 34002; Present address (L.W.): Field Museum of Natural History, Roosevelt Road at Lake Shore Dr., Chicago, Illinois 60605, USA; E-mail: weigt@fmppr.fmnh.org
5E-mail: rand@gamboa.si.edu
1993). Although there are many reasons to favor a combined analysis (Eernisse and Kluge, 1993; Chippindale and Wiens, 1994), it can be enlightening to examine incongruence among data partitions.
Behavioral data are receiving increasing attention in phylogenetic analysis (de Queiroz and Wimberger, 1993; Foster et al., 1996; Gittleman et al., 1996; Irwin, 1996; Kennedy et al., 1996; Wimberger and de Queiroz, 1996). In this article we use a diverse, original data set from advertisement calls, morphology, allozymes, and the 12S and cytochrome oxidase I (COI) mitochondrial genes to estimate the phylogeny of frogs of the Physalaemus pustulosus group (Cannatella and Duellman, 1984). This clade has served as a model for examining aspects of behavioral evolution such as sexual selection and signal-receiver evolution (Ryan and Rand, 1993, 1995; Ryan, 1996).
Additionally, we assess incongruence among data partitions with several methods, and discuss the phylogenetic utility of the advertisement calls of these frogs.
MATERIALS AND METHODS
Specimens were collected in the field, tissues extracted, and the voucher specimens preserved or prepared as skeletons (Appendix 1). Specimens are deposited at the United States National Museum and the Texas Memorial Museum, University of Texas. Some skeletal material was borrowed from the American Museum of Natural History; University of Kansas Museum of Natural History; the Museum of Comparative Zoology, Harvard University; and the Louisiana State University Museum of Natural Science.
Taxon Sampling
The species sampled are listed in Appendix 1. All known valid species in the ingroup were sampled; we treated a population of P. petersi that may be referable to the nominal taxon P. freibergi (Cannatella and Duellman, 1984) as a distinct taxon. Monophyly of the ingroup is supported by four synapomorphies (Cannatella and Duellman, 1984). Outgroup taxa were Physalaemus ephippifer, Physalaemus sp. A, and Physalaemus enesefae. These species were chosen because our preliminary survey of morphology and calls among 75% of the species suggested that they are the most similar to the pustulosus group in external morphology, osteology, and the general characteristics of the call. A more comprehensive phylogenetic analysis of relationships in the genus is in progress.
Data Partitions
The following character sets were designated as data partitions: morphological characters (n = 12; MORPHOLOGY), advertisement calls (n = 12; CALLS), allozyme electromorphs (n = 27; ALLOZYMES), DNA sequence of the cytochrome oxidase I gene (n = 543; COI), and DNA sequence of the small subunit of the mitochondrial ribosomal gene (n = 1214; 12S). The combined data set was designated as COMBINED (n = 1808).
Morphological characters (Appendix 2) were taken from dissections of whole specimens and alizarin-and-alcian-stained skeletons (Dingerkus and Uhler, 1977). Although sample sizes of skeletons for most species were two or three, a survey of >30 skeletons of Physalaemus pustulosus (Cannatella and Duellman, 1984) indicated no intraspecific polymorphism in the characters examined, and none was noted in the present study.
Advertisement calls were recorded in the field onto metal tape with either a Sony TCD5M, Marantz PMD 420, or Sony Professional Walkman using a ME-80 Sennheiser microphone with a K3-U power module and wind screen. Temperatures at the calling sites of each frog were recorded and usually were 25 ± 2°C. Such a small temperature differential has no substantial influence on call variation.
The advertisement calls of the Physalaemus pustulosus species group (except species C) and the three outgroup species are all similar in that they are rather long frequency sweeps. We refer to these calls as whines, which describes the sound to the human observer. Some species may add to their call a suffix, which is described as a chuck. Túngara, the common name for P. pustulosus, is an onomatopoeia for the whine followed by two chucks. Because the whine is the component required for species recognition (Ryan, 1985; Rand et al., 1992; Ryan and Rand, 1995), it is the only call component considered. The whines differ in their spectral properties (the onset, offset, and dominant frequency) as well as in the duration and shape of the frequency sweep. All of the whines have upper harmonics, but in P. pustulosus these harmonics have no influence on the calls' attractiveness to females (Rand et al., 1992; Wilczynski et al., 1995). These harmonics are not considered here; all
TABLE 1. Allozyme loci examined, and buffer systems and tissues used. E.C. number = Enzyme Commission number from International Union of Biochemistry (1984). Buffer systems follow Murphy et al. (1996): 1 = Tris-citrate II, pH 8.0; 2 = Tris-citrate-EDTA, pH 7.0; 3 = Tris-borate-EDTA II, pH 8.6; 4 = Tris-citrate/borate, gel pH 8.7.

Locus                                             Abbreviation  E.C. number  Buffer system
Aconitase hydratase-1                             Aco-1         4.1.1.3      1 + NADP
Adenylate kinase                                  Ak            2.7.4.3      1
Aspartate aminotransferase (mitochondrial form)   Aat-M         2.6.1.1      3
Aspartate aminotransferase (supernatant form)     Aat-S         2.6.1.1      1, 3
Creatine kinase                                   Ck            2.7.3.2      1
Cytosol aminopeptidase                            Cap           3.4.11.1     1
Esterase D                                        Est-D         3.1.1.-
Fructose-biphosphatase                            Fbp           3.1.3.11     1 + NADP
Glucose-6-phosphate dehydrogenase                 G6pdh         1.1.1.49     4 + NADP
Glucose-6-phosphate isomerase                     Gpi           5.3.1.9      4
Glutamate dehydrogenase                           Gtdh          1.4.1.4      1
Glutathione reductase                             Gr            1.6.4.2      1
Glycerol-3-phosphate dehydrogenase                G3pdh         1.1.1.8      2
Isocitrate dehydrogenase-1                        Idh-1         1.1.1.42     2
Isocitrate dehydrogenase-2                        Idh-2         1.1.1.42     2
Lactate dehydrogenase-A                           Ldh-A         1.1.1.27     2
Lactate dehydrogenase-B                           Ldh-B         1.1.1.27     2
Malate dehydrogenase-1                            Mdh-1         1.1.1.37     1
Malate dehydrogenase-2                            Mdh-2         1.1.1.37     1
Malate dehydrogenase-1 (NADP+)                    Mdhp-1        1.1.1.40     2 + NADP
Malate dehydrogenase-2 (NADP+)                    Mdhp-2        1.1.1.40     2 + NADP
Peptidase A (glycyl-L-leucine)                    Pep-A         3.4.-.-      1
Phosphoglucomutase                                Pgm           5.4.2.2      2 + NAD
Phosphogluconate dehydrogenase                    Pgdh          1.1.1.44     1 + NADP
Superoxide dismutase (supernatant form)           Sod-S         1.15.1.1     2
Triose-phosphate isomerase                        Tpi           5.3.1.1      2
values refer to the fundamental frequency.
Spectral properties of calls, except for dominant frequency, were analyzed on a Uniscan sonograph. Temporal properties were analyzed on a DATA 6000 digital waveform analyzer. Calls were digitized at a rate of 20 kHz; therefore the Nyquist frequency is 10 kHz, substantially above the highest frequencies in any of the calls analyzed. The dominant frequency of the call also was analyzed on the DATA 6000 by taking a fast Fourier transform of the entire call. The following call variables were quantified: duration (TLDUR, msec), frequency at onset of call (INHZ, Hz), maximum frequency (MXHZ, Hz), time to the maximum frequency (TMMX, msec), time to mid-frequency (TMHFHZ, msec), frequency at offset of call (FNHZ, Hz), dominant frequency (DOMHZ, Hz), duration of amplitude-modulated component (AMDUR, msec), rise time (RSTM, msec), time to mid-rise (TMHFRS, msec), fall time (FLTM, msec), and time to mid-fall (TMHFFL, msec).
Calls and tissues for DNA and allozyme analysis are from the same individuals, except for Physalaemus pustulosus, in which they are from different individuals in the same population. The COI and 12S sequence data for P. pustulosus were obtained from different individuals, but these came from the same population. Each species is represented by one population; intraspecific variation was not assessed. Although there are significant differences in call parameters within a species (e.g., Ryan and Wilczynski, 1988, 1991), from studies of Physalaemus pustulosus we know that intraspecific variation is far less than variation among the species (Ryan et al., 1996).
Liver, heart, and thigh muscle were dissected from 10 individuals from each population in the field and immediately frozen in liquid nitrogen until transportation to the University of Texas, Austin, at which time they were maintained in an ultracold freezer at less than −70°C.
Methods for allozyme electrophoresis followed the horizontal starch gel protocols described by Murphy et al. (1996). Gels were made from 12% starch (Starch Art lot W561-2). Table 1 shows the enzyme loci scored and buffer system used to score each locus. Appendix 1 lists the localities of the specimens examined.
Methods for DNA isolation, amplification, cloning, and sequencing followed Hillis et al. (1996); protocol numbers in the following description refer to that paper. Whole genomic DNA was isolated using protocol 1.
Data partition 12S consisted of the complete mitochondrial 12S rRNA gene, complete valine-tRNA gene, and the adjacent approximately 200 bp of the 16S rRNA gene. These were amplified by the polymerase chain reaction (see Palumbi, 1996) using primers 12Sh and 16Sh (Table 2). The amplified product was cloned using TA cloning (protocol 18, part B).
Plasmid DNA was isolated according to protocol 14, and sequenced (protocols 21, 22, and 25) using the primers shown in Table 2. The 12S sequences were aligned using MALIGN (Wheeler and Gladstein, 1992).
The same extracted DNA samples were used to sequence the cytochrome oxidase I gene. DNA from the following species was amplified using the polymerase chain reaction with COIf and COIa primers (Palumbi, 1996): P. ephippifer, P. freibergi, P. sp. B, P. sp. A, and P. pustulosus. The remaining species were amplified with COIf and COIa2 (designed for these species): P. coloradorum, P. enesefae, P. petersi, P. pustulatus, P. sp. C. The region of analysis included sites 55–597.
After amplification, the product was separated and excised from an agarose
TABLE 2. Primers used to sequence 12S rRNA, valine-tRNA, and 16S rRNA genes (upper part of table) and COI gene (lower part). The 12S primer locations refer to the positions in the P. pustulosus sequence. The designations pp6–pp9 are internal primers for COI.

12S primer name  Primer sequence                            Position
12Sa             5'-AAACTGGGATTAGATACCCCACTAT-3'            413–437
12Sar            5'-ATAGTGGGGTATCTAATCCCAGTTT-3'            437–413
12Sb             5'-GAGGGTGACGGGCGGTGTGT-3'                 835–816
12Sc             5'-AAGGCGGATTTAGCAGTAAA-3'                 754–773
12Sd             5'-TCGTGCCAGCCRCCGCGGT-3'                  230–248
12Se             5'-GGGAAGAAATGGGCTACATTTTCT-3'             689–712
12Sh             5'-AAAGGTTTGGTCCTAGCCTT-3'                 1–20
12Sk             5'-GGGAACTACGAGCAAAGCTT-3'                 475–494
12Sl             5'-GGACAGGCTCCTCTAGGTGG-3'                 545–526
16Sh             5'-GCTAGACCATKATGCAAAAGGTA-3'              1202–1180
M13rev           5'-CAGGAAACAGCTATGAC-3'                    vector
T7 promoter      5'-AATACGACTCACTATAG-3'                    vector

COI primer name  Primer sequence                            Position
COIf             5'-CCTGCAGGAGGAGGAGAYCC-3'                 1–20
COIa             5'-AGTATAAGCGTCTGGGTAGTC-3'                660–681
COIa2            5'-CCTGCYARYCCTARRAARTGTTGAGG-3'           616–641
pp6              5'-TCTGCAACAATAATYATYGCAATTCCAAC-3'        256–284
pp7              5'-GTTGGAATTGCRATRATTATTGTTGCAGA-3'        284–256
pp8              5'-TCTCTAGAYATTGTATTACATGA-3'              421–443
pp9              5'-TCATGTAATACAATRTCTAGAGA-3'              443–421
gel and resuspended for a second round of PCR amplification. The product was purified via Geneclean III (BIO 101, La Jolla, California). Cycle sequencing was done with the ABI Prism mix sequencing kit. Sequences were run on an ABI 377 automated DNA sequencer (Applied Biosystems, Perkin-Elmer, Foster City, California) using the manufacturer's recommended protocols. Sequences were read, verified, and aligned with the ABI software package SeqEd.
GenBank accession numbers are AF058957-66. The NEXUS file (Maddison et al., 1997) is available at http://www.utexas.edu/depts/systbiol.
Phylogenetic Analysis
Coding of the call variables followed a procedure inspired by Maddison and Slatkin (1990). The minimum and maximum values of a variable (data pooled over all species) were scaled to 0 and 25, respectively (Table 3). The species mean was then scaled monotonically to the nearest integer. Each character was downweighted to unity and analyzed as ordered. In this way the relative distance between each pair of values was maintained, and calculation of homoplasy indices was possible.
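The scaling step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually run for this study: the function names are hypothetical, the extremes are taken over the species means supplied, halves are rounded up, and integer states 0 to 25 are mapped to the letters a to z.

```python
import math
import string


def code_call_variable(species_means, n_states=26):
    """Scale species means linearly onto 0..n_states-1 and round to integers.

    Assumption: the minimum and maximum are taken over the means supplied,
    and halves are rounded up; the text does not specify either detail.
    """
    lo = min(species_means.values())
    hi = max(species_means.values())
    if hi == lo:
        # An invariant variable carries no information; code every taxon as 0.
        return {taxon: 0 for taxon in species_means}
    top = n_states - 1
    coded = {}
    for taxon, mean in species_means.items():
        scaled = (mean - lo) / (hi - lo) * top
        coded[taxon] = int(math.floor(scaled + 0.5))  # round half up
    return coded


def state_letter(state):
    """Map an integer state 0..25 to a letter a..z."""
    return string.ascii_lowercase[state]


# Hypothetical means for three taxa, purely for illustration.
example = code_call_variable({"taxon1": 100.0, "taxon2": 300.0, "taxon3": 600.0})
```

Treating the resulting integers as an ordered character means that a change between states i and j costs |i − j| steps before downweighting, which is how the relative distances between species means are retained.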
Phylogenetic analyses were done using PAUP 3.1.1 (Swofford, 1993) and PAUP* test versions 4.0.0d26–4.0.0d28 (provided by David Swofford). The allozymic data were coded using step matrices so that a fixed change at a locus was weighted as one step in the parsimony analysis, and any intermediate combination of alleles was counted as a half-step. Thus, a change from a fixed to a polymorphic condition or vice versa (e.g., aa to ab, or ab to bb) was counted as a half step, whereas a fixed or mutually exclusive difference (e.g., aa to bb, or ab to cd) was coded as a full step. Parsimony analyses of the DNA data included (1) all character transformations weighted equally, with gaps treated as a fifth character; (2) all character transformations weighted equally, but gaps treated as missing data; and (3) a weighted parsimony analysis in which transversions were given weights of two and five times relative to transitions. These values were based on the substitution matrix estimated by averaging across all most parsimonious reconstructions of characters on an initial unweighted tree using MacClade (Maddison and Maddison, 1992).
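The allozyme step-matrix rule described above can be illustrated with a short Python sketch. The function names are hypothetical, and one case is an assumption on our part: any pair of allelic arrays that shares at least one allele but is not identical (for example ab versus bc) is given a half step, which the text implies but does not state explicitly.

```python
def allozyme_step_cost(state_a, state_b):
    """Cost of a change between two allelic arrays at one locus.

    Each state is a string of allele labels, e.g. "aa", "ab", "cd".
    """
    alleles_a = set(state_a)
    alleles_b = set(state_b)
    if alleles_a == alleles_b:
        return 0.0  # no change
    if alleles_a.isdisjoint(alleles_b):
        return 1.0  # fixed or mutually exclusive difference
    return 0.5      # at least one shared allele: intermediate combination


def step_matrix(states):
    """Square matrix of pairwise costs, in the order the states are given."""
    return [[allozyme_step_cost(a, b) for b in states] for a in states]
```

For the states aa, ab, and bb, `step_matrix(["aa", "ab", "bb"])` gives half steps between adjacent conditions and a full step between the two fixed conditions.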
Maximum-likelihood analyses included (1) a one-parameter analysis (all classes of substitutions equally likely), assuming equal base frequencies; (2) a one-parameter analysis, using empirical (observed) base frequencies; (3) a two-parameter analysis (allowing different rates of transitions and transversions), with equal base frequencies; and (4) a two-parameter analysis, with empirically determined base frequencies.
Data were weighted as follows: 12S, COI, MORPHOLOGY, and monomorphic loci from ALLOZYMES were weighted 1,000, polymorphic loci from ALLOZYMES were weighted 500, and CALLS were scaled with a base weight of 1,000. In this way the total variation in each character was equally weighted. Each data partition was analyzed separately, and the data were pooled for a combined analysis.
Nonparametric bootstrap analyses were conducted with 5000 iterations.
Decay values (Bremer support, branch support) were calculated using the Hypercard utility Autodecay 2.9.5 (Eriksson, 1996; http://www.botan.su.se/Systematik/Folk/Torsten.html); 10 random-addition sequences were used to determine the decay value for each node of each tree. The resulting trees are depicted with the outgroup arbitrarily shown as monophyletic. Bootstrap/decay values for the branch connecting the ingroup and outgroup were arbitrarily placed at the base of the ingroup.
Because no data on calls were available for Physalaemus sp. C, the results of the COMBINED analysis were used to constrain that species to be the sister species of Physalaemus sp. B for comparisons of tree topologies.
TABLE 3. Summary of call statistics for the Physalaemus pustulosus group and close relatives. The variables are total duration of call (TLDUR, msec), frequency at onset of call (INHZ, Hz), maximum frequency (MXHZ, Hz), time to the maximum frequency (TMMX, msec), time to mid-frequency (TMHFHZ, msec), frequency at offset of call (FNHZ, Hz), dominant frequency (DOMHZ, Hz), duration of amplitude-modulated component (AMDUR, msec), rise time (RSTM, msec), time to mid-rise (TMHFRS, msec), fall time (FLTM, msec), and time to mid-fall (TMHFFL, msec). The mean is given with the range below. The letter in parentheses following the mean is the character-state code; see Materials and Methods. The variables are discussed in more detail in Cocroft and Ryan (1995).

Species      n   TLDUR       INHZ        MXHZ        TMMX        TMHFHZ      FNHZ        DOMHZ       AMDUR       RSTM        TMHFRS      FLTM        TMHFFL
sp. A        10  339(j)      812(d)      876(a)      65(k)       160(j)      460(j)      983(v)      0(?)        94.6(h)     56.6(h)     251.6(n)    71.1(h)
                 234–447     800–840     800–920     43–100      150–187     400–520     767–1362    0–0         64–116      37–83       174–339     31–116
ephippifer   1   266(g)      900(h)      944(e)      62(j)       140(i)      576(q)      944(t)      0(?)        83.5(g)     39.4(e)     177.4(i)    60.6(g)
                 238–308     840–1000    840–1040    43–81       112–162     520–600     845–1025    0–0         48–97       4–56        129–236     27–107
enesefae     10  747(z)      944(j)      976(g)      162(z)      386(z)      692(z)      962(u)      384.8(z)    301.5(z)    166.6(z)    445.7(z)    203.7(z)
                 631–903     880–1040    920–1040    81–287      300–456     640–720     844–1384    238–760     230–407     125–242     372–546     99–338
pustulosus   10  370(k)      884(h)      884(a)      0(a)        124(h)      484(k)      712(j)      22.3(b)     24.0(a)     7.9(a)      342.8(s)    175.0(v)
                 252–496     840–960     840–960     0–0         87–175      440–600     605–883     11.9–32.7   9.3–60.6    2.2–17.7    236–450     106–252
petersi      10  246(f)      1220(x)     1220(w)     0(a)        28(b)       384(d)      628(g)      12.0(a)     13.7(a)     11.5(a)     230.3(m)    47.1(e)
                 206–350     1040–1400   1040–1400   0–0         18–50       320–480     596–693     8.4–22.3    6.0–18.8    9.6–14.2    194–331     8.2–161
freibergi    10  104(a)      1253(z)     1253(z)     0(a)        12(a)       330(a)      482(a)      15.0(a)     19.3(a)     17.6(b)     30.0(a)     12.6(a)
                 48.2–140.8  1000–1424   1000–1424   0–0         6–25        272–368     361–585     4.5–23      14.2–29.9   13.3–24.1   34.1–124.7  1.2–37.9
coloradorum  9   221(e)      1031(o)     1071(m)     25(d)       83(e)       556(p)      1007(w)     47.0(d)     53.4(d)     23.3(c)     161.7(h)    47.0(e)
                 152–358     960–1080    1000–1160   0–62        50–100      480–640     889–1133    33–72       25–72       12–52       64–285      8–128
pustulatus   10  206(d)      964(k)      964(f)      0(a)        88(f)       676(x)      1062(z)     94.3(g)     99.5(h)     95.0(n)     104.3(e)    52.9(f)
                 186–230     880–1080    880–1080    0–0         56–118      640–800     820–1254    73.1–107.6  90.3–109.4  77.7–108.2  76.9–129.0  19.5–120
sp. B        10  395(l)      740(a)      888(a)      112(r)      115(g)      444(h)      894(r)      92.1(f)     105.1(h)    69.4(j)     293.7(p)    93.3(k)
                 322–608     680–880     840–960     43–150      81–162      400–480     854–981     65–149      50–154      25–120      238–444     15–169
Assessments of Combinability
There are several issues related to the concept of combinability: (1) phylogenetic signal or data structure; (2) strength of support for a resulting tree topology; (3) congruence of trees from different data partitions; (4) homogeneity of data partitions; (5) compatibility of a data partition with a suboptimal tree; and (6) strength of support (assuming 5 is true) of a data partition for a suboptimal tree.
Phylogenetic signal. If a data set has no structure that is significantly different from random, then little confidence can be placed in the resulting estimates of tree topology. However, lack of discernible structure may be an artifact of small numbers of characters. We assessed data structure using the PTP test (Faith, 1991) as implemented in PAUP* using 5000 random matrices.
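The randomization behind a permutation-tail probability can be sketched as follows. This is a hedged illustration, not PAUP*'s implementation: the function names are hypothetical, the tree lengths are assumed to come from a separate parsimony search that is not shown, and the tail probability is computed as we understand the PTP statistic to be defined (the fraction of all matrices, observed plus permuted, whose shortest tree is no longer than the observed one).

```python
import random


def permute_matrix(matrix, rng):
    """Return a copy of a taxa-by-characters matrix in which the states of
    each character (column) are shuffled independently among taxa (rows)."""
    n_taxa = len(matrix)
    n_chars = len(matrix[0])
    permuted = [list(row) for row in matrix]
    for j in range(n_chars):
        column = [matrix[i][j] for i in range(n_taxa)]
        rng.shuffle(column)
        for i in range(n_taxa):
            permuted[i][j] = column[i]
    return permuted


def permutation_tail_probability(observed_length, permuted_lengths):
    """Fraction of matrices (observed plus permuted) whose shortest-tree
    length is less than or equal to the observed length."""
    as_short = 1 + sum(1 for length in permuted_lengths if length <= observed_length)
    return as_short / (len(permuted_lengths) + 1)
```

Shuffling within columns keeps the number of taxa showing each state of every character, so only the covariation among characters is destroyed.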
Strength of support for a tree topology. Confidence in trees was quantified for branches using character resampling (nonparametric bootstrap; Hillis and Bull, 1993) and Bremer support (decay index) values, and for the entire tree using the "total support" test and the constrained-tree T-PTP. Clades with >70% bootstrap values are considered strongly supported.
The "total support" test described by Källersjö et al. (1992) and recommended by Bremer (1994) consists of computing total support (the sum of all Bremer support values, also called decay indices) for the observed data and comparing this to a distribution of total support values from randomly permuted matrices. One hundred matrices were produced using MacClade 3.05, and decay indices for each matrix were calculated using Autodecay 2.9.5 (Eriksson, 1996); 10 random-addition heuristic searches were used for each decay value.
The constrained-tree T-PTP test is an extension of Faith's monophyly test (see also Faith and Cranston, 1991) in which an entire tree, rather than a single node, is used as a constraint. It is implemented as the TPTP test in PAUP*, but an entire tree is defined as a constraint rather than just one node (see Swofford et al., 1996, for a criticism of T-PTP tests). The length difference between the observed shortest tree and the shortest tree that is incongruent in any part of the tree is used as the test statistic and compared to a null distribution of length differences generated from permuted data. This test amounts to a test of the monophyly of the node with the weakest decay index.
Rejection of the null hypothesis is interpreted as significant support for a specified topology, as opposed to general cladistic structure in the case of the PTP test. The null distribution is essentially one of decay indices based on permuted data. Generally, 1,000 randomized matrices were used to generate the null distribution. If the permutation-tail probability was 0.05 or less, the test was rerun with 5000 matrices to increase resolution in the tail of the distribution. The constrained-tree test differs in details of execution from the "all-groups" test proposed by Faith and Ballard (1994), although the purpose (assessing overall support of a data set for a tree) is similar.
Congruence of trees. A third issue is the congruence of trees resulting from data partitions. We assessed tree congruence by strict consensus trees (Swofford, 1991) and tree similarity by the symmetric-difference distance, or partition metric (Robinson and Foulds, 1981), which is defined as the number of subclades that appear on either of the two trees, but not both. This metric quantifies differences in tree topology ("taxonomic congruence") irrespective of the character support. Penny and Hendy (1985) discussed several attractive features of this metric, which can be used with unrooted or rooted and binary or nonbinary trees. Values range from 0 to 2n − 6 where n is the number of terminals (Steel and Penny, 1993). It should be noted that a terminal with differing position on two otherwise similar trees may yield a large value, in the way that a strict consensus tree would appear largely unresolved under similar conditions. The probability that two given trees are drawn at random from all possible trees was determined using Table 3 in Hendy et al. (1984); thus, rejection of the null hypothesis indicates that two labeled topologies are more similar than one would expect by chance.
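As an illustration of the partition metric, the following Python sketch counts the clades found on one of two rooted trees but not the other. The nested-tuple tree representation and the function names are assumptions made for the example. Because it works on rooted clades, its maximum for binary trees is 2n − 4; the 2n − 6 bound quoted above applies to unrooted trees, where bipartitions would be compared instead.

```python
def clades(tree):
    """Collect the nontrivial clades of a rooted tree given as nested tuples.

    Terminals are strings. Single terminals and the root clade (all
    terminals) are excluded because every tree on the same taxa shares them.
    """
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(walk(child) for child in node))
        found.add(members)
        return members

    root = walk(tree)
    found.discard(root)
    return found


def symmetric_difference_distance(tree1, tree2):
    """Number of clades present on exactly one of the two trees."""
    return len(clades(tree1) ^ clades(tree2))


# Two hypothetical five-taxon trees that differ in the placement of B and C.
t1 = ((("A", "B"), "C"), ("D", "E"))
t2 = ((("A", "C"), "B"), ("D", "E"))
distance = symmetric_difference_distance(t1, t2)
```

Here the clades (A, B) and (A, C) are each unique to one tree, while (A, B, C) and (D, E) are shared, so the two trees are separated by a distance of 2.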
Homogeneity of partitions. Bull et al. (1993) argued that one should be cautious in combining data partitions that are significantly heterogeneous. We do not argue for or against combining heterogeneous partitions; rather, we simply wish to determine heterogeneity before further analysis. We assessed partition homogeneity using PAUP*. The partition-homogeneity test generally assumes that if different data partitions are homogeneous, then randomly allocating characters among those partitions should yield trees that are not significantly different. As proposed by Farris et al. (1994, 1995), the test relies on the observed incongruence length difference, $D_{xy}$, compared to a null distribution generated by pooling the $m + n$ characters from partitions (matrices) $x$ and $y$ and then randomly allocating these into two matrices of original sizes $m$ and $n$. The incongruence length difference, $D_{xy}$, is defined

$$D_{xy} = L_{(x+y)} - (L_x + L_y)$$

where $L_x$ and $L_y$ are the lengths of the shortest trees for matrices $x$ and $y$, and $L_{(x+y)}$ is the length of the shortest tree for the combined matrix. Farris et al. (1994) argued that $L_{(x+y)}$ did not need to be calculated because it was a common term. Thus the test becomes a comparison of the sum of observed tree lengths compared to the sum of tree lengths from random character partitions. If the data partitions are congruent, then the length-sums of the random partitions will be less than or equal to that of the observed partition. If the partitions are highly incongruent, then the length-sums of the random partitions will be greater than that of the observed partition, because random partitions will tend to produce (longer) trees with more homoplasy.
PAUP* determines the significance of the test by $P = 1 - (S/W)$, where $S$ is the number of replicates in which the length-sum is greater than the length-sum for the observed partition, and $W$ is the total number of observed and random partitions. Farris et al. (1994) noted that the exact lengths were not crucial and approximate parsimony calculations (e.g., a "one-pass" heuristic search) were sufficient, but because of the small number of taxa we used heuristic searches with TBR branch-swapping. Partition-homogeneity tests were done for all pairwise comparisons of data partitions and a simultaneous five-partition test, with 1,000 iterations for each test.
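The arithmetic of the test statistic and its P value, as described in the two preceding paragraphs, can be written out as a small Python sketch. The function names are hypothetical and the tree lengths are assumed to have been obtained beforehand from parsimony searches; nothing here reproduces PAUP*'s internals.

```python
def incongruence_length_difference(len_combined, len_x, len_y):
    """D_xy = L_(x+y) - (L_x + L_y), from shortest-tree lengths."""
    return len_combined - (len_x + len_y)


def partition_homogeneity_p(observed_sum, random_sums):
    """P = 1 - S/W, following the description in the text.

    observed_sum: L_x + L_y for the original partitions.
    random_sums:  the same sum for each random repartition of the characters.
    """
    s = sum(1 for total in random_sums if total > observed_sum)
    w = len(random_sums) + 1  # random partitions plus the observed one
    return 1 - s / w
```

With hypothetical values, an observed length-sum of 100 and random length-sums of 101, 102, 99, and 105 give S = 3 and W = 5, so P = 0.4; a small P arises only when nearly every random repartition yields a longer length-sum than the observed partitions.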
Compatibility of data partitions with suboptimal trees. Even though two data partitions strongly support different trees, it may be that one partition is compatible (does not conflict) with the other (suboptimal) tree. Such compatibility was tested using Templeton's test and the compare-2 T-PTP.
Templeton's test (Templeton, 1983; Larson, 1994) is a Wilcoxon signed ranks test (Zar, 1974) of the difference in lengths of characters when a data partition is optimized on one tree versus another. Its results can be interpreted as a statement about the compatibility of a data partition with a suboptimal tree, rather than a statement about two tree topologies. The more conservative two-tailed test was used (Felsenstein, 1985), although it can be argued that the one-tailed test is appropriate.
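The rank sums underlying a Wilcoxon signed ranks comparison of two trees can be computed as in the sketch below. It is a minimal illustration with a hypothetical function name: the inputs are assumed to be the number of steps each character requires on each tree, characters with equal step counts are dropped, and tied absolute differences receive their average rank. Converting the rank sums to a probability (from tables or a statistics package) is left out.

```python
def signed_rank_statistic(steps_tree1, steps_tree2):
    """Wilcoxon signed-rank sums for per-character step counts on two trees.

    Returns (w_plus, w_minus, n), where n is the number of characters whose
    step counts differ between the trees.
    """
    diffs = [a - b for a, b in zip(steps_tree1, steps_tree2) if a != b]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of tied absolute differences.
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus, len(ordered)


# Hypothetical step counts for five characters on two trees.
result = signed_rank_statistic([1, 2, 1, 3, 1], [1, 1, 2, 1, 1])
```

In this example three characters differ; the two one-step differences share rank 1.5 and the two-step difference takes rank 3, giving rank sums of 4.5 and 1.5.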
The compare-2 T-PTP was suggested by Faith (1991) and is implemented in PAUP*. A data set is optimized using parsimony on each of two constraint trees, and the difference in length is used as a statistic and compared to a null distribution of length differences from randomly permuted data. If one of the constraint trees is the shortest tree, then the test reflects the compatibility of the data partition with the second, suboptimal tree.
Strength of support for suboptimal trees. It is of interest whether a data partition gives significant support to a suboptimal topology, in addition to being compatible with it. This was assessed using a constrained-tree T-PTP as described earlier.
Other considerations. The T-PTP permutation tests are implemented in PAUP* as a priori tests (Faith, 1991) in which no particular hypothesis of monophyly is being tested. In cases where a particular hypothesis of monophyly is tested, the a posteriori test is more appropriate. Using the a priori test can increase Type I error (wrongly rejecting the null hypothesis). The constrained-tree test can be performed as an a priori test because there was no expectation of particular monophyletic groups.
However, it is not clear that the compare-2 tests are properly executed as a priori tests. In the case of the test for monophyly of a clade, the a posteriori monophyly test is performed by subtracting the minimum length under a monophyly constraint from the length under nonmonophyly; the length differences are calculated for the observed and many permuted data matrices. However, for a particular permuted matrix the length difference is calculated using the largest value found for all groupings of taxa the same size as the clade of interest (Faith, 1991). Thus, the length difference would be evaluated, for example, for each of the 35 combinations of three taxa from the seven ingroup taxa, for each permuted matrix.
The T-PTP tests used herein (both the constrained-tree and compare-2) differ from the monophyly test in that the entire tree is constrained, and Faith's (1991) procedure of evaluating clades of equal size amounts to examining alternative trees, as is done in the a priori test.
Thus, it would seem that if the entire tree is constrained, there is no operational difference between a priori and a posteriori tests. However, we feel that the issue deserves further examination (e.g., Swofford et al., 1996), and because a solution is not obvious, we have performed all permutation tests as a priori tests. One of the purposes of this paper is to examine the behavior of these tests, and the results of these tests are very consistent with other tests (see Results).
We have used the COMBINED data set as if it were any other data partition. However, this introduces a degree of nonindependence in pairwise comparisons. Curiosity about the behavior of the COMBINED partition in these tests outweighs our concerns about nonindependence, and the results can be readily interpreted.
A sequential Bonferroni correction (Rice, 1989) was applied to the tables of probability values resulting from the pairwise procedures.
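For readers unfamiliar with the sequential Bonferroni procedure, a minimal Python sketch follows. The function name and the default table-wide significance level of 0.05 are assumptions made for the example; the logic is the standard step-down rule in which the smallest of k probability values is compared with alpha/k, the next with alpha/(k − 1), and so on until a comparison fails.

```python
def sequential_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans marking which tests remain significant
    after a sequential (step-down) Bonferroni correction."""
    k = len(p_values)
    order = sorted(range(k), key=lambda idx: p_values[idx])
    significant = [False] * k
    for rank, idx in enumerate(order):
        threshold = alpha / (k - rank)
        if p_values[idx] <= threshold:
            significant[idx] = True
        else:
            break  # every larger P value is also declared nonsignificant
    return significant
```

With hypothetical P values of 0.001, 0.04, 0.03, and 0.2, only the first survives: 0.001 is below 0.05/4, but 0.03 exceeds 0.05/3, so testing stops there.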
RESULTS
The statistics for the call variables and the coding for each are shown in Table 3.
The allele frequencies for the presumptive loci are presented in Table 4.
Phylogenetic Analysis
Phylogenetic signal and phylogeny estimation. The PTP test indicated that each data partition had significant phylogenetic structure (Table 5). Statistics from the results of the separate and combined phylogenetic analyses are shown in Table 5 and Figure 1. Either one or two most parsimonious trees were found for each partition. The COMBINED data set and the 12S partition produced the same tree.
Weighting transversions twice as much as transitions yielded the same shortest trees for the COMBINED, 12S, and COI partitions. Weighting transversions five times as much as transitions yielded the same shortest trees for the COMBINED and 12S partitions, and for the COI partition yielded one of the two trees found in the unweighted analysis, the one with the ((P. coloradorum, pustulatus), (sp. B, sp. C)) topology.
For the 12S data partition, all maximum-likelihood analyses yielded