Document Type
Presentation
Publication Date
2008
Abstract
Scientific experimental data generated by all the bionomic technologies is characterized by heterogeneity in its representation formats, constituents, and generation processes and, therefore, also in its usage. Using the proteomics domain, we demonstrate the important role of provenance information in managing, interpreting, and analyzing experimental data. We present a novel approach that employs an ontology as a knowledge model to automatically create semantic provenance information for high-throughput mass spectrometry (MS) data in the glycoproteomics domain. The Semantic Provenance Annotation of Data in protEomics (SPADE) implementation is based on the ProPreO ontology, a large process ontology (~500 classes, 40 named relationships with 170 class-level restrictions, and 3.1 million instances) that models the complete experimental protocol for MS-based glycoproteomics data analysis. The semantic provenance information created in SPADE enables biologists to query it and retrieve exact data using “train-of-thought” expressive queries in the SPARQL query language. We also discuss our current work on extending the ProPreO ontology to support toxicological metabolomics experimentation using Nuclear Magnetic Resonance (NMR) spectroscopy. Our strategic goal is for pattern recognition and data mining algorithms to use semantic provenance information in comparative or correlation analysis of Liquid Chromatography MS (LCMS) and NMR spectroscopy experimental data as part of toxicological metabolomics studies.
Repository Citation
Sahoo, S. S., Raymer, M. L., Henson, C. A., Sheth, A. P., & York, W. S. (2008). Ontology Driven Semantic Provenance for Heterogeneous Bionomics Experimental Data.
https://corescholar.libraries.wright.edu/knoesis/122
Included in
Bioinformatics Commons, Communication Technology and New Media Commons, Databases and Information Systems Commons, OS and Networks Commons, Science and Technology Studies Commons
Comments
An oral presentation at the Ohio Collaborative Conference on Bioinformatics (OCCBIO), University of Toledo, OH, July 2008.