IntelliGEN: A Distributed Workflow System for Discovering Protein-Protein Interactions
A large genomics project involves many researchers and technicians performing dozens of tasks, which may be manual (e.g., performing laboratory experiments), computer-assisted (e.g., looking for genes in the GENBANK database), or fully automated (e.g., sequence assembly). Managing such projects has proven difficult, and the resulting problems may lead to results of lower or even unacceptable quality, or to drastically increased project costs. In this paper, we present the design and an initial implementation of a distributed workflow system created to schedule and support activities in a genomics laboratory. The laboratory's activities focus on discovering protein-protein interactions in fungi, specifically Neurospora crassa. We present our approach to developing, adapting, and applying workflow technology in the genomics lab and illustrate it using one distinct part of a larger workflow for discovering protein-protein interactions. Novel features of our system include the ability to monitor the quality and timeliness of results and, if necessary, to suggest and incorporate changes to the selected tasks and their scheduling.
Sheth, A. P., Arpinar, I. B., & Cardoso, J. (2003). IntelliGEN: A Distributed Workflow System for Discovering Protein-Protein Interactions. Distributed and Parallel Databases, 13(1), 43-72.