High throughput glycoproteomics, similar to genomics and proteomics, involves extremely large volumes of distributed, heterogeneous data as a basis for identification and quantification of a structurally diverse collection of biomolecules. The ability to share, compare, query for and most critically correlate datasets using the native biological relationships are some of the challenges being faced by glycobiology researchers. As a solution for these challenges, we are building a semantic structure, using a suite of ontologies, which supports management of data and information at each step of the experimental lifecycle. This framework will enable researchers to leverage the large scale of glycoproteomics data to their benefit. In this paper, we focus on the design of these biological ontology schemas with an emphasis on relationships between biological concepts, on the use of novel approaches to populate these complex ontologies including integrating extremely large datasets (~500MB) as part of the instance base and on the evaluation of ontologies using OntoQA  metrics. The application of these ontologies in providing informatics solutions, for high throughput glycoproteomics experimental domain, is also discussed. We present our experience as a use case of developing two ontologies in one domain, to be part of a set of use cases, which are used in the development of an emergent framework for building and deploying biological ontologies.
Sahoo, S. S.,
Sheth, A. P.,
York, W. S.,
& Tartir, S.
(2006). Knowledge Modeling and Its Application in Life Sciences: A Tale of Two Ontologies. 15th International World Wide Web Conference (WWW2006).