Post-molecular Systematics And The Future Of Phylogenetics


Opinion

Post-molecular systematics and the future of phylogenetics

R. Alexander Pyron
Department of Biological Sciences, The George Washington University, 2023 G St NW, Washington, DC 20052, USA

The time is past when a research program in systematics should be based on only a few genes, extant taxa, and ultrametric trees. Cheap genome sequencing, powerful statistical methods, and new fossil discoveries promise to reinvigorate research programs in evolutionary biology. Population genetics, phylogeography, and species delimitation all benefit from genomic data, not just tree building alone. Null-hypothesis testing and power analysis via simulation can increase the confidence and robustness of phylogenetic comparative methods. Merging morphological and molecular datasets for fossil and extant taxa gives a more complete view of the Tree of Life. Combined, these developments can foster a post-molecular systematics, integrating phylogenetic signal from the population up based on DNA and through time based on direct observation rather than inference.

Molecular systematics in the 21st century

For several years, molecular systematics has been the dominant phylogenetic paradigm [1]. By this, I mean an era in which the primary objective was obtaining DNA sequence data, primarily from protein-coding or ribosomal genes, to generate molecular phylogenies of extant taxa. This was facilitated by fast, cheap sequencing technologies, powerful statistical methods for inferring and analyzing phylogenies, and renewed interest in natural history collections for tissue samples. The result is that highly complete phylogenies are now available for groups such as plants [2], fishes [3], birds [4], amphibians [5], squamates [6], and mammals [7]. These phylogenies are increasingly supported by backbones of genomic data [8] and are being used to answer fundamental questions in ecology and evolution.

This tree-building effort has plateaued, or will naturally plateau, since genomic information and species diversity are more or less finite.
Of course, groups that are hyper-diverse and under-sampled (such as arthropods), as well as those whose evolutionary history cannot easily be represented by a dichotomous phylogeny (such as prokaryotes), have not reached this plateau. While much more work remains to be done in such groups, the developments discussed below remain relevant for such taxa. Once a clade is fully sampled for species and characters, the question becomes what problems, limitations, and future directions face systematic efforts? In most cases, genomic data have not offered a full-scale revolution in phylogenetic understanding but have often simply reinforced results or increased support from smaller datasets.

Corresponding author: Pyron, R.A. (rpyron@colubroid.org).
Keywords: genomes; fossils; phylogenies; comparative methods; total evidence; systematics.
0169-5347/© 2015 Elsevier Ltd. All rights reserved. Trends in Ecology & Evolution, July 2015, Vol. 30, No. 7

Concurrently, these trees have been analyzed using increasingly sophisticated comparative methods designed for dated trees of extant species. A large proportion of recent molecular systematics literature follows a similar pattern. A tree is constructed based on DNA sequence data. This tree is then dated using clock-based methods, typically with internal node-age constraints derived from the post hoc association of fossils with extant clades. Then, comparative methods are applied to the dated tree, for aims such as estimating diversification rates, ancestral areas, community assembly processes, ancestral character states, and models of character evolution or some interaction between these, such as the effect of a trait on diversification.
However, recent results suggest that many of these methods are underpowered and biased, and suffer from non-independence in ways not easily fixable, which may affect significant proportions of studies using them [9–12].

In general, as more groups approach phylogenetic completeness and stability, and comparative methods grow more complex, the era of molecular systematics as an end unto itself is ending. Now, I suggest we are entering an age of post-molecular systematics that will: (i) critically examine the utility and dimensionality of genomic data above and below the species level; (ii) reexamine the use of phylogenetic comparative methods in terms of scale and power; and (iii) place a renewed emphasis on total-evidence phylogenetics, including fossil species based on morphological data.

The strengths and weaknesses of genome-scale data

The genomics revolution seemed poised to offer massive benefits to molecular systematics. Hundreds or thousands of loci would hopefully provide unambiguous resolution and support for most, if not all, branches in the Tree of Life. The use of explicit species-tree methods would hopefully offer analytical stability and confidence in these estimates [13]. We might expect two scenarios to have arisen: (i) whole-genome data overturn long-held hypotheses about what were previously considered strongly supported branches in the Tree of Life; and (ii) genomic data unambiguously resolve and strongly support previously contentious rapid radiations in the Tree of Life. In general, these expectations seem not to have been met [14–16].

In the first instance, the early inception of molecular data did provoke a phylogenetic revolution, overturning numerous long-held phylogenetic hypotheses in groups such as basal animals [17], amniotes [18], birds [19], squamates [20], and amphibians [21]. The subsequent development of genomic datasets for these groups has not substantially altered these initial molecular estimates in many cases. For instance, over 1000 loci cement the placement of turtles with archosaurs but merely reaffirm early results from mitochondria or a few nuclear loci [22]. In squamates, sampling 44 nuclear genes for hundreds of taxa [23], or 15 mitochondrial and nuclear loci for thousands of species [6], yields essentially the same results as the initial tree from two nuclear loci [20].

This result is borne out by numerous empirical and simulation studies using species-tree methods, which show that topologies usually stabilize with a relatively small ( 20) number of loci in most instances [24]. Although the number of potential resolutions is large but finite, the number of realistic or likely resolutions is usually small, and phylogenetic signal is usually strong and broadly distributed in sampled loci. Thus, such a result is not really unexpected. It is difficult to imagine a case where 1000 loci provide strong support for a novel rearrangement of a branch that was not found in previous datasets of 500, 50, or even five loci [14].

Similarly, many contentious nodes in rapid radiations have not been fully resolved by genome-scale data [14,15,25–29]. In these studies, some previously uncertain relationships were strongly resolved by phylogenomic data, but others were not [27,30].
This issue has been noted for some time: if speciation events occurred too rapidly and branching events are tightly clustered, too few substitutions may become fixed to resolve relationships, regardless of the volume of data used to address the problem [31–33]. A possible avenue of investigation to resolve such nodes is phylogenetic inference based on gene order and synteny mapping, although these models and analyses remain in their infancy [34].

Thus, genomic data may be useful for providing a strongly supported backbone, even when that backbone has been contentious, provided phylogenetic signal is present [29,30]. However, this is not guaranteed for more recalcitrant radiations [15,35]. Species-tree methods can help to resolve some ambiguous branches [36], but poorly supported nodes may persist across the Tree of Life due to various evolutionary mechanisms [14,32]. Even then, taxon sampling is likely to have as strong an impact at the phylogenomic scale as it does for smaller concatenated datasets. Thus, the effort already expended to build current phylogenies would need to be equaled or exceeded to leverage fully the power of phylogenomics to resolve nodes with both extensive character and taxon sampling.

By contrast, genome-scale data seem to be somewhat underappreciated in the context of species delimitation, phylogeography, and population genetics [37]. For these applications, increases in the amount of data seem to increase accuracy and precision more linearly as more variable sites from more regions of the genome are included. Investigating species boundaries of morphologically similar groups [38], phylogeographic structure [39], and parameters such as gene flow and effective population size [40] is boosted substantially by phylogenomic datasets.

Thus, genome-scale data have at least three major applications for post-molecular systematics that have previously been underappreciated:

Studies below the species level.
Coalescent species delimitation, multilocus phylogeography, and population genetics are all significantly improved by genome-scale datasets, particularly with new models for analyzing large-scale genomic variation [37,41–46].

Adaptive molecular evolution. Relatively few studies have leveraged the power of genome-scale data to investigate molecular evolution across the genome itself, and within and among lineages through time, to highlight processes such as adaptation and convergence [47,48]. This may be particularly relevant when adaptive convergence in phenotypic traits is driven by adaptive convergence in the genome [49,50]. While a few genomes have been analyzed thus, this awaits a larger number of well-annotated genomes for large-scale comparison.

Genomes as traits. A final interesting and under-studied potential of genomics is treating the genome itself as a phenotypic character expressed by the organism. While the genome sequences of closely related species might be similar, the functional expression patterns of those genomes can vary widely among tissues, developmental stages, and environmental conditions. Comparative transcriptomics can use this information to analyze adaptation in functional traits, patterns of selection, and processes of genome evolution [51].

The necessities and failures of phylogenetic comparative methods

The ascendance of phylogenetics as a dominant paradigm in biology stemmed from the realization that species are not epistemologically or statistically independent; all comparative observations in biology are affected by shared ancestry. Even a simple bivariate correlation needs a phylogenetic term when it involves multiple species [52]. Such methods have become increasingly complex, including models of trait evolution [9], character correlation [53], estimation of speciation and extinction rates [54], and the association of traits and diversification rates [54].
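
The point that even a simple bivariate correlation needs a phylogenetic term [52] can be illustrated with a small simulation. The sketch below is not taken from the article. It assumes a deliberately extreme, hypothetical tree of two clades of 20 species each, evolves two independent traits by Brownian motion, and counts how often an ordinary Pearson correlation test that ignores the phylogeny rejects the null at the 5% level. The function names, branch lengths, and replicate count are illustrative assumptions.

```python
import math
import random

N_PER_CLADE = 20   # hypothetical clade size; 40 species in total
T_CRIT = 2.024     # two-sided 5% critical value of Student's t for df = 38


def pearson_r(x, y):
    """Ordinary Pearson correlation; treats every species as independent."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_trait(rng, stem, tip):
    """Brownian-motion trait on a two-clade tree.

    Each clade shares a stem branch of length `stem`; every species then
    evolves independently along a terminal branch of length `tip`.
    With stem = 0 this reduces to a star phylogeny (independent species).
    """
    values = []
    for _ in range(2):
        clade_value = rng.gauss(0.0, math.sqrt(stem))
        values += [clade_value + rng.gauss(0.0, math.sqrt(tip))
                   for _ in range(N_PER_CLADE)]
    return values


def false_positive_rate(stem, tip, reps=1000, seed=1):
    """Share of replicates in which two unrelated traits look correlated."""
    rng = random.Random(seed)
    df = 2 * N_PER_CLADE - 2
    r_crit = T_CRIT / math.sqrt(df + T_CRIT ** 2)
    hits = 0
    for _ in range(reps):
        x = simulate_trait(rng, stem, tip)
        y = simulate_trait(rng, stem, tip)  # simulated independently of x
        if abs(pearson_r(x, y)) > r_crit:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    print("star phylogeny:", false_positive_rate(stem=0.0, tip=1.0))
    print("two deep clades:", false_positive_rate(stem=0.9, tip=0.1))
```

Under these assumptions, the star-phylogeny rate should sit near the nominal 5%, whereas the two-clade rate is expected to be many times higher, because the species within each clade are close to being pseudoreplicates of a single evolutionary event.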
These models permit sophisticated statistical analyses based on traditional model-fitting techniques such as analysis of variance (ANOVA) as well as situations (such as diversification rate estimation) specific to a phylogenetic context [55].

Worryingly, recent results suggest that many of these methods have low overall power [56]. Methods for estimating speciation regimes may be relatively robust [57] but the outlook for extinction is typically weak even under optimal conditions [54,58]. Summary statistics that were popular in the past, such as γ, are now known to be relatively inaccurate and underpowered, while diversification scenarios are often so complex that analytical likelihoods cannot be calculated [59]. Other methods for estimating rate shifts such as modeling evolutionary diversity using stepwise AIC (MEDUSA) appear to have excessive Type I error rates and biased parameter estimates that essentially preclude their use [60]. Other authors have pointed out the shaky epistemological and methodological footings on which many analyses have been based, as the models used are often limited in their ability to answer the desired questions and the questions themselves are often poorly defined [61].

The limitations of ancestral-state reconstruction have long been known [62]. For instance, the power to reconstruct a character at the root of placental mammals using a tree of 4507 species modeled using Brownian motion (BM) is equivalent to a measurement from five data points [9]. For more complex models such as Ornstein–Uhlenbeck (OU), current approaches are non-unique or unidentifiable, often inaccurate, and pathologically biased toward excessively complex scenarios [9]. In general, available models such as BM and OU appear to be relatively inadequate and poor absolute fits to real clades when examined from an empirical perspective [10,63]. Methods linking ancestral-state estimation to quantitative trait variation using BM and OU models have also been proposed [53,64] but it is unclear whether the methods are robust to the fundamental shortcomings that seem to plague parameter estimation using the models [10,63].

In a particularly spectacular case, the use of state-dependent speciation and extinction (SSE-based) methods [specifically binary (BiSSE)] to estimate the relationship between a binary trait and speciation and extinction rates yields Type I error rates of 100% [11]. This is not, apparently, due to any conceptual or mathematical flaws in the method or algorithm, but simply extreme sensitivity to assumptions of rate constancy across lineages, which are nearly always violated in empirical cases. The effect is so severe that even characters such as the length of a species’ Latin name are found by the method to influence speciation and extinction significantly.
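
The earlier remark that a 4507-species tree can be worth only about five data points at the root [9] can be made concrete with a toy calculation. The sketch below is not the analysis from [9]. It assumes BM with unit rate, tips measured without error, strictly positive branch lengths, and a hypothetical nested-list tree format. It computes the variance of the generalized least-squares estimate of the root state by inverse-variance weighting up the tree, then converts it to an effective sample size: the number of independent tips on a star tree of the same height that would give the same variance.

```python
def root_variance(node):
    """Variance of the best linear unbiased estimate of the state at `node`.

    A tip is any non-list object (here the string "tip"); an internal node
    is a list of (child, branch_length) pairs. Assumes BM with rate 1.
    """
    if not isinstance(node, list):
        return 0.0  # tips are assumed to be observed without error
    # Each child subtree gives an independent estimate of this node's state
    # with variance (child's own variance + connecting branch length);
    # independent estimates combine by adding their precisions.
    precision = sum(1.0 / (root_variance(child) + length)
                    for child, length in node)
    return 1.0 / precision


def effective_sample_size(tree, height):
    """Independent tips at distance `height` giving the same root variance."""
    return height / root_variance(tree)


def balanced_tree(levels, branch=1.0):
    """Perfectly balanced binary tree with 2**levels tips (illustrative)."""
    if levels == 0:
        return "tip"
    subtree = balanced_tree(levels - 1, branch)
    return [(subtree, branch), (subtree, branch)]


def star_tree(n_tips, height):
    """All tips radiate independently from the root."""
    return [("tip", height)] * n_tips


if __name__ == "__main__":
    for levels in (3, 6, 12):
        ess = effective_sample_size(balanced_tree(levels), float(levels))
        print(2 ** levels, "tips ->", round(ess, 3))
```

For a perfectly balanced tree with unit branch lengths, this gives an effective sample size of about 3.4 for 8 tips and about 12 for 4096 tips, whereas a star tree of n tips gives exactly n. Under these assumptions, adding species adds very little information about the root, because most of their history is shared.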
All of these methods have been used in dozens or hundreds of studies, potentially calling into question the conclusions from a large proportion of molecular systematics literature from the past 15 years.

The relevant question for post-molecular systematics is twofold: (i) what can we reasonably expect to learn from phylogenies; and (ii) how confident can we be in these inferences? The information contained in a dated phylogeny generally reduces to n – 1 node ages and 2(n – 1) branch lengths, which are subject to numerous upstream biases of phylogenetic and temporal inference that can strongly affect the results of comparative methods [65–67]. Then, we demand from this graph information on speciation, extinction, ancestral states, ancestral areas, links between traits or areas and diversification process, community assembly processes, phylogenetically informed regressions and ANOVAs, etc. Perhaps our favorite clade has only 50–100 species; is all of this information contained in those branches? Or when analyzing large-scale phylogenies, do our methods adequately capture among-lineage variation and the diversity of processes such as biotic interactions like competition, predation, and facilitation?

A way forward has been suggested by several recent papers testing comparative methods, implemented by a few studies introducing new methods, but rarely employed by any empirical studies. This is the evaluation of model adequacy and absolute fit via a posteriori simulations and the use of summary statistics [9,10,60]. For any model-based scenario, ranging from estimating speciation and fitting a model of trait evolution to linking traits to speciation, it will be possible to generate simulated phylogenies within the estimated parameter space of the best-fit model. These can be compared with the observed data via summary statistics to evaluate how well the model captures the empirical conditions [60] and to evaluate the power of the test to reject null or alternative models [10,11].

Even for tests as simple as comparing BM and OU for a trait, this approach should be used to estimate power and model adequacy. Nearly all comparative-methods packages include the facility for simulations, yet few empirical studies employ this approach. While tedious, it seems absolutely necessary for future investigations of complex evolutionary dynamics, given the frequent and substantial failings that many commonly used comparative methods exhibit. Furthermore, the potential for exhausting the available degrees of freedom in a phylogeny through numerous tests of different rates, characters, and scenarios should not be overlooked [9,12] and, if possible, should be simulated and assessed jointly. Clearly [11], it is often possible to retrieve strong significance from comparative analysis of large-scale phylogenies. However, recent results suggest that it is often difficult to determine the biological importance of this significance or whether it is real or artifactual in empirical datasets.

The resurgence and importance of total-evidence phylogenetics

Morphological systematics commonly included both extinct and extant taxa, as there was little epistemological distinction between them in terms of the available data [68]. The ascendance of molecular systematics saw a decline in total-evidence phylogenetics incorporating morphological data, primarily due to the ease of capturing large amounts of DNA sequence data quickly and the difficulty of generating large morphological matrices [69].
However, recent methods allow the integration of morphological and molecular data for fossil and living taxa in a unified, dated phylogenetic framework [70,71]. It is time for a reunion of paleontology and neontology and a resurrection of total-evidence phylogenetics. The Tree of Life contains both extinct and extant branches across time, and including fossil taxa and morphological data can improve topological and branch-length estimation [68,72–74].

Importantly, many of the limitations of phylogenetic comparative methods described above are lessened or alleviated when fossil lineages are included in analyses [9,75]. Ancient DNA can also provide an avenue for merging datasets of extinct and extant species [76]. Fossil lineages give direct observations of speciation and extinction rates and character states through time. For a tree with fossil lineages, speciation and extinction can be estimated with much higher precision and accuracy [77]. When trait data are available, ancestral states and areas can be observed much closer to nodes of interest [78] and character models such as OU and BM can be assessed far more powerfully [9].

Of course, this is not a problem-free directive. Many traits, particularly ecological variables, cannot be measured easily for extinct species. The true diversity of extinct lineages is vastly under-sampled and stochastically limited by the availability of sediments in appropriate areas and the time spent describing fossil material. Finally, the preparation of morphological matrices is tedious and time-consuming and requires substantial expertise possessed by relatively few. There are substantial ontological arguments regarding character states and homology statements across large phylogenetic scales [79]. This is in addition to empirical problems arising from taphonomic artifacts that may mislead phylogenetic inference [80].

There is no easy way forward here, other than to continue to bolster paleontological and morphological expertise and to begin constructing larger joint matrices of fossil and living taxa. Issues such as data incongruence and homoplasy are well known [81]. The impact of these issues on total-evidence phylogenetics has been assessed in only a few groups [72] but suggests that incongruence may be resolved through combined analyses [82]. Epistemological issues of homology, character-state definition, and treatment of multistate characters all have a long history in the literature, with which many younger systematists may be unfamiliar, and these issues will need to be revisited and revised [83]. Finally, little methodological work has been done since 2001 on the models used for phylogenetic inference of discrete morphological data [84]. Improving on these might substantially enhance efforts to infer total-evidence phylogenies as matrices grow larger and more complex [85].

Concluding remarks

For more than three deca
