Publication Date


Document Type


Committee Members

Gerald Alter (Committee Member), Oleg Paliy (Committee Chair), Michael Raymer (Committee Member), Nicholas Reo (Committee Member)

Degree Name

Doctor of Philosophy (PhD)


Ground-breaking advancements in molecular and analytical techniques in the past decade have enabled researchers to accumulate data at an extraordinary rate. Especially in the field of microbial ecology, the introduction of technologies such as high-throughput sequencing, quantitative microarrays, nuclear magnetic resonance, and mass spectrometry has led to the interrogation of diverse and previously unexplored microbial communities at unparalleled depth. Analysis and interpretation of patterns within datasets acquired with such high-throughput methods require powerful statistical approaches. A class of such techniques, multivariate statistical analyses, is an excellent choice for the analysis of complex microbiota-related datasets. This field of statistics is constantly evolving as new techniques and procedures are developed and applied to explore and interpret the underlying patterns both statistically and visually. As a result, choosing the technique that best suits the scientific question and the dataset is no longer trivial. Additionally, current trends in the use of multivariate statistics in microbial ecology indicate a strong preference toward exploratory analyses, which limits the biological interpretations that can be drawn. In order to facilitate a more extensive integration of multivariate statistics in microbial ecology, I apply a diverse set of analytical methods to human-associated microbial and metabolite datasets, allowing biologically relevant inferences to be drawn. Specifically, I use indirect gradient analyses to show that the largest gradients of variability correspond to the separation of samples based on sample groups. I use direct gradient analyses to explain a significant portion of the overall variability present within the response variables using independently measured environmental variables. I use classifier techniques to build highly accurate discriminant models based on the differences in the response variables across sample groups and to identify the variables that contribute the most to sample group separation. Using correlation-based bipartite analyses, I identify statistically significant associations between two different sets of response variables that were measured for the same set of samples. Finally, I integrate the analytical insights from the above approaches into a generalized protocol for the analysis of multivariate datasets in the field of microbial ecology.

Page Count


Department or Program

Biomedical Sciences

Year Degree Awarded


Creative Commons License

Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License