Building a Foundation To Enable Semantic Technologies For Phylogenetically-Based Comparative Analyses

In revealing historical relationships among genes and species, phylogenies provide a unifying context across the life sciences for investigating the diversification of biological form and function. The utility of phylogenies for addressing a wide variety of biological questions is evident in the rapidly increasing number of published gene and species trees, and this trend is certain to accelerate with the explosion of data generated by next-generation sequencing technologies. The impact that this deluge of species and gene tree estimates will have on our understanding of the forces that shape biodiversity will be limited by the accessibility of these trees and of the underlying data and methods of analysis. The true structure of species trees and gene trees is rarely known; rather, estimates are obtained by applying increasingly sophisticated phylogenetic inference methods to increasingly large and complicated datasets. The need for a Minimum Information about Phylogenetic Analyses (MIAPA) reporting standard is clear, but specification of the standard has been hampered by the absence of controlled vocabularies to describe phylogenetic methodologies and workflows. PhylOnt is an extensible ontology being developed to describe the methods employed to estimate trees given a data matrix, and thus to support specification of MIAPA. PhylOnt will be linked with the Comparative Data Analysis Ontology (CDAO) to provide a comprehensive set of concepts relating to phylogeny estimation that can be used by searchable tree databases and web services. Moreover, we aim to use PhylOnt/CDAO concepts that describe tree estimation procedures to explicitly relate tree descriptions to data matrices within NeXML files. We view this as an important step in the development and specification of MIAPA.


This invited talk was given at iEvoBio 2011, Oklahoma, USA, June 21-22, 2011.