Publication Date


Document Type


Committee Members

Travis Doom (Advisor)

Degree Name

Doctor of Philosophy (PhD)


Thousands of copies of short interspersed repeats (SINEs) are scattered essentially randomly through the human genome. Although copies of each repeat subfamily are identical at the time of their insertion, they become subject to individual substitutions after insertion. As the relative time of insertion is known for many of these repeats, such "junk DNA" can be used to provide a sizeable number of time-series data points for studying substitution effects in a variety of genomic contexts. This dissertation specifically discusses the usefulness of the Alu family of SINE repeats towards addressing open problems in genomics, population genetics, and biology in general.

Alus and other repeat elements have been used successfully to approach many open questions in biology. However, a more complete analysis of the statistical properties of the repeats themselves is necessary to categorize the confounding factors that must be considered when reporting repeat-based results. This dissertation provides a study of potentially confounding statistical properties underlying Alu repeats in various genomic contexts.

The utility of this statistical approach to the use of repeat elements as time-series data is illustrated by furthering investigations designed to elucidate the driving force behind evolution: environmental factors or replication error. Specifically, this dissertation addresses the degree of replication-based error by providing tighter statistical boundaries on the male-to-female mutation ratio, alpha, in humans. This goal is accomplished by performing a whole-genome analysis of substitutions in characterized Alu repeats. Existing mathematical models are used to provide strict confidence bounds on the human alpha.
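
The abstract does not name the mathematical models it relies on. As an illustration only, one widely used family of models for alpha is the set of relations from Miyata et al. (1987), which link alpha to the ratio of substitution rates between chromosome classes: Y/A = 2α/(1+α), X/A = (2/3)(2+α)/(1+α), and Y/X = 3α/(2+α). The sketch below inverts each relation to recover alpha from an observed rate ratio. The function names are hypothetical, and whether the dissertation uses these exact relations is an assumption.

```python
# Minimal sketch: invert the Miyata et al. (1987) relations to estimate
# alpha, the male-to-female mutation ratio, from an observed ratio of
# substitution rates between chromosome classes.
# ASSUMPTION: the abstract does not say which models it uses; these
# relations and all function names are illustrative, not the author's code.


def alpha_from_y_to_autosome(ratio: float) -> float:
    """Solve Y/A = 2a / (1 + a) for a. Valid for 0 <= ratio < 2."""
    if not 0.0 <= ratio < 2.0:
        raise ValueError("Y/A ratio must lie in [0, 2) under this model")
    return ratio / (2.0 - ratio)


def alpha_from_x_to_autosome(ratio: float) -> float:
    """Solve X/A = (2/3)(2 + a) / (1 + a) for a. Valid for 2/3 < ratio <= 4/3."""
    if not 2.0 / 3.0 < ratio <= 4.0 / 3.0:
        raise ValueError("X/A ratio must lie in (2/3, 4/3] under this model")
    return (4.0 - 3.0 * ratio) / (3.0 * ratio - 2.0)


def alpha_from_y_to_x(ratio: float) -> float:
    """Solve Y/X = 3a / (2 + a) for a. Valid for 0 <= ratio < 3."""
    if not 0.0 <= ratio < 3.0:
        raise ValueError("Y/X ratio must lie in [0, 3) under this model")
    return 2.0 * ratio / (3.0 - ratio)


if __name__ == "__main__":
    # Made-up ratios chosen for easy arithmetic, not data from the dissertation.
    print(alpha_from_y_to_autosome(1.5))
    print(alpha_from_y_to_x(1.5))
    print(alpha_from_x_to_autosome(1.0))
```

Under these relations a ratio of exactly 1 corresponds to alpha = 1 (no sex bias), and each ratio saturates as alpha grows, so estimates become very sensitive to small errors in the observed ratio near the upper (Y/A, Y/X) or lower (X/A) limit. That sensitivity is one reason a large sample of substitutions, such as a whole-genome set of repeats, helps tighten confidence bounds.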

Finally, this dissertation validates the elucidated value of alpha by performing a whole-genome analysis of replication-induced insertion and deletion events in characterized Alu repeats. This analysis provides a better-characterized general mechanism for studying questions in human population genetics as well as a more accurate consideration of the specific genomic issues relevant to the driving force of evolution.

Page Count


Department or Program

Department of Computer Science and Engineering

Year Degree Awarded