Publication Date
2016
Document Type
Dissertation
Committee Members
Gerald Alter (Committee Member), Oleg Paliy (Committee Chair), Michael Raymer (Committee Member), Nicholas Reo (Committee Member)
Degree Name
Doctor of Philosophy (PhD)
Abstract
Ground-breaking advancements in molecular and analytical techniques in the past decade have enabled researchers to accumulate data at an extraordinary rate. Especially in the field of microbial ecology, the introduction of technologies such as high-throughput sequencing, quantitative microarrays, nuclear magnetic resonance and mass spectrometry has led to the interrogation of diverse and previously unexplored microbial communities at unparalleled depth. Analysis and interpretation of patterns within datasets acquired with such high-throughput methods require powerful statistical approaches. A class of such techniques called multivariate statistical analyses is an excellent choice for analysis of complex microbiota-related datasets. This field of statistics is constantly evolving as new techniques and procedures are being developed and applied to explore and interpret the underlying patterns both statistically and visually. As a result, the decision-making process involved in the choice of the technique that best suits the scientific question and the dataset is no longer trivial. Additionally, the current trends in the use of multivariate statistics in microbial ecology indicate a strong preference toward exploratory analyses, resulting in limitations to possible biological interpretations. In order to facilitate a more extensive integration of multivariate statistics in microbial ecology, I apply a diverse set of analytical methods to human-associated microbial and metabolite datasets and use them to draw biologically relevant inferences. Specifically, I use indirect gradient analyses to show that the largest gradients of variability correspond to the separation of samples based on sample groups. I use direct gradient analyses to explain a significant portion of the overall variability present within the response variables using independently measured environmental variables.
I use classifier techniques to build highly accurate discriminant models based on the differences in the response variables across sample groups and identify the variables that contribute the most to sample group separation. Using correlation-based bipartite analyses, I identify statistically significant associations between two different sets of response variables that were measured for the same set of samples. Finally, I integrate the analytical insights from the above approaches into a generalized protocol for the analysis of multivariate datasets in the field of microbial ecology.
Page Count
110
Department or Program
Biomedical Sciences
Year Degree Awarded
2016
Copyright
Copyright 2016, some rights reserved. My ETD may be copied and distributed only for non-commercial purposes and may not be modified. All use must give me credit as the original author.
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.