Provenance information in eScience is metadata that's critical to effectively manage the exponentially increasing volumes of scientific data from industrial-scale experiment protocols. Semantic provenance, based on domain-specific provenance ontologies, lets software applications unambiguously interpret data in the correct context. The semantic provenance framework for eScience data comprises expressive provenance information and domain-specific provenance ontologies and applies this information to data management. The authors' "two degrees of separation" approach advocates the creation of high-quality provenance information using specialized services. In contrast to workflow engines generating provenance information as a core functionality, the specialized provenance services are integrated into a scientific workflow on demand. This article describes an implementation of the semantic provenance framework for glycoproteomics.
Sahoo, S. S.,
Sheth, A. P.,
& Henson, C. A.
(2008). Semantic Provenance for eScience: Managing the Deluge of Scientific Data. IEEE Internet Computing, 12 (4), 46-54.