Workflow Management Systems (WFMSs) can be used to re-engineer, streamline, automate, and track organizational processes involving humans and automated information systems. However, the state of the art in workflow technology suffers from a number of limitations that prevent it from being widely used in large-scale, mission-critical applications. Error handling is one such limitation. The task is challenging because errors can arise in many components of a complex distributed application execution environment, including the WFMS components themselves, workflow application tasks of different types, and the heterogeneous computing infrastructure.

In this paper, we discuss a top-down approach to dealing with errors in the context of ORBWork, a CORBA-based, fully distributed workflow enactment service for the METEOR2 WFMS. We discuss the types of errors that might occur, including those involving the infrastructure of the enactment environment and the system architecture of the workflow enactment service. In the context of the underlying workflow model for METEOR, we then present a three-level error model that provides a unified approach to the specification, detection, and runtime recovery of errors in ORBWork. Implementation issues are also discussed. We expect the model and many of the techniques to be relevant and adaptable to other WFMS implementations.


Technical Report #UGA-CS-TR-97-002