A simple model is used to study how fault-tolerance techniques and system design affect system availability. The model is built on a generic multiprocessor architecture that can be configured in different ways to represent alternative system architectures. The parameters studied include the system architecture, hardware fault-tolerance techniques, the mean time to failure of basic components, database size and distribution, and interconnect capacity. A quantitative analysis compares the relative effect of different parameter values and shows that their effect on system availability can be very significant. System architecture, the use of hardware fault tolerance (particularly mirroring), and data storage methods emerge as very important parameters under the system designer's control.
Sheth, A. P. (1989). Fault Tolerance in a Very Large Database System: A Strawman Analysis. Proceedings of the 9th International Conference on Distributed Computing Systems, 227-236.