Publication Date
2022
Document Type
Dissertation
Committee Members
Derek Doran, Ph.D. (Advisor); Michael Raymer, Ph.D. (Committee Member); Vincent Schmidt, Ph.D. (Committee Member); Nikolaos Bourbakis, Ph.D. (Committee Member); Thomas Wischgoll, Ph.D. (Committee Member)
Degree Name
Doctor of Philosophy (PhD)
Abstract
Hierarchical clustering is a class of algorithms commonly used in exploratory data analysis (EDA) and supervised learning. However, these algorithms suffer from several drawbacks, including the difficulty of interpreting the resulting dendrogram, arbitrariness in the choice of cut used to obtain a flat clustering, and the lack of an obvious way to compare individual clusters. In this dissertation, we develop the notion of a topological hierarchy on recursively defined subsets of a metric space. We look to the field of topological data analysis (TDA) for the mathematical background needed to associate topological structures, such as simplicial complexes and maps of covers, with clusters in a hierarchy. Our main results include the definition of a novel hierarchical algorithm for constructing a topological hierarchy, an implementation of the MAPPER algorithm and our topological hierarchies in pure Python code, and a web app dashboard for exploratory data analysis. We show that the algorithm scales well to high-dimensional data because most TDA methods use dimensionality reduction, and we analyze the worst-case time complexity of MAPPER and our hierarchical decomposition algorithm. Finally, we give a use case for exploratory data analysis with our techniques.
Page Count
145
Department or Program
Department of Computer Science and Engineering
Year Degree Awarded
2022
Copyright
Copyright 2022, some rights reserved. My ETD may be copied and distributed only for non-commercial purposes and may not be modified. All use must give me credit as the original author.
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.