Publication Date


Document Type


Committee Members

Michael Raymer (Advisor)

Degree Name

Master of Science (MS)


Prediction of protein tertiary structure based on amino acid sequence is one of the most challenging open questions in computational molecular biology. Experimental methods for protein structure determination remain relatively time-consuming and expensive, and are not applicable to all proteins. While a diverse array of algorithms has been developed for prediction of protein structure from amino acid sequence information, the accuracy and reliability of these methods are not yet comparable to experimental structure determination techniques. Computational models of protein structure can, however, be improved by the incorporation of experimental information. Relatively rapid and inexpensive protein modification experiments can be used to probe the physical and chemical features of specific amino acid residues. The information gained from these experiments can be incorporated into computational structure prediction techniques to increase the confidence and accuracy of these methods. Analysis of protein modification experiments, however, presents another array of computational challenges. An important step in this analysis is the determination of reaction rate constants from experimental data. This thesis examines the problem of reaction rate constant determination for protein modification and digestion experiments using mass spectrometry for fragment quantification. A computational framework is developed for curve-fitting limited proteolysis and chemical modification experimental data and for calculating confidence intervals for the resulting reaction rate constant estimates. A stochastic simulation is employed to formulate and test a mathematical model for proteolysis. Several methods for nonlinear curve-fitting, including the Gauss-Newton and Nelder-Mead simplex methods, are explored for fitting the model to experimental results.
In addition, the use of Monte Carlo simulation and model comparison methods for confidence interval estimation with protein modification data is investigated. The results of these analyses are applied to multiple experiments on cytochrome c, and the findings are compared with the crystallographically determined structure of this protein. This case study demonstrates the capability of the methods developed here as a framework for the automated analysis of experimental data for protein structure determination.
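
As a rough illustration of the kind of analysis the abstract describes, the sketch below fits a single rate constant with a Gauss-Newton iteration and then estimates a confidence interval by Monte Carlo simulation. It is not the thesis's framework: the pseudo-first-order decay model f(t) = exp(-k t), the function names, the Gaussian noise assumption, and the percentile-based interval are all assumptions made here for illustration.

```python
import math
import random


def model(k, t):
    """Assumed model: fraction of intact protein under first-order decay."""
    return math.exp(-k * t)


def fit_rate_constant(times, fractions, k0=0.1, iters=50, tol=1e-12):
    """One-parameter Gauss-Newton least-squares fit of k (hypothetical helper)."""
    k = k0
    for _ in range(iters):
        # Residuals between observed and predicted fractions.
        resid = [y - model(k, t) for t, y in zip(times, fractions)]
        # Derivative of the model with respect to k at each time point.
        jac = [-t * model(k, t) for t in times]
        denom = sum(j * j for j in jac)
        if denom == 0.0:
            break  # no information about k (e.g. all times are zero)
        step = sum(j * r for j, r in zip(jac, resid)) / denom
        k += step
        if abs(step) < tol:
            break
    return k


def monte_carlo_interval(times, fractions, n_sims=500, level=0.95, seed=0):
    """Parametric Monte Carlo interval for k, assuming Gaussian noise."""
    rng = random.Random(seed)
    k_hat = fit_rate_constant(times, fractions)
    resid = [y - model(k_hat, t) for t, y in zip(times, fractions)]
    dof = max(len(times) - 1, 1)  # one fitted parameter
    sigma = math.sqrt(sum(r * r for r in resid) / dof)
    estimates = []
    for _ in range(n_sims):
        # Simulate a data set from the fitted curve, then refit it.
        synthetic = [model(k_hat, t) + rng.gauss(0.0, sigma) for t in times]
        estimates.append(fit_rate_constant(times, synthetic, k0=k_hat))
    estimates.sort()
    lo_idx = int((1.0 - level) / 2.0 * (n_sims - 1))
    hi_idx = int((1.0 + level) / 2.0 * (n_sims - 1))
    return k_hat, estimates[lo_idx], estimates[hi_idx]


if __name__ == "__main__":
    # Made-up time points and noise level, for demonstration only.
    demo_rng = random.Random(1)
    times = [0.0, 5.0, 10.0, 20.0, 40.0, 80.0]
    observed = [model(0.05, t) + demo_rng.gauss(0.0, 0.02) for t in times]
    k_hat, lower, upper = monte_carlo_interval(times, observed)
    print(f"k = {k_hat:.4f}, 95% interval = ({lower:.4f}, {upper:.4f})")
```

A derivative-free method such as Nelder-Mead could replace the Gauss-Newton step without changing the Monte Carlo part, since the interval estimate only needs a routine that maps a data set to a fitted rate constant.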

Page Count


Department or Program

Department of Computer Science

Year Degree Awarded