Publication Date

2022

Document Type

Dissertation

Committee Members

Derek Doran, Ph.D. (Advisor); Michael Raymer, Ph.D. (Committee Member); Vincent Schmidt, Ph.D. (Committee Member); Nikolaos Bourbakis, Ph.D. (Committee Member); Thomas Wischgoll, Ph.D. (Committee Member)

Degree Name

Doctor of Philosophy (PhD)

Abstract

Hierarchical clustering is a class of algorithms commonly used in exploratory data analysis (EDA) and unsupervised learning. However, these algorithms suffer from several drawbacks, including the difficulty of interpreting the resulting dendrogram, arbitrariness in the choice of cut to obtain a flat clustering, and the lack of an obvious way to compare individual clusters. In this dissertation, we develop the notion of a topological hierarchy on recursively defined subsets of a metric space. We look to the field of topological data analysis (TDA) for the mathematical background needed to associate topological structures, such as simplicial complexes and maps of covers, with clusters in a hierarchy. Our main results include the definition of a novel hierarchical algorithm for constructing a topological hierarchy, an implementation of the MAPPER algorithm and our topological hierarchies in pure Python, and a web app dashboard for exploratory data analysis. We show that the algorithm scales well to high-dimensional data because most TDA methods rely on dimensionality reduction, and we analyze the worst-case time complexity of MAPPER and of our hierarchical decomposition algorithm. Finally, we present a use case for exploratory data analysis with our techniques.

Page Count

145

Department or Program

Department of Computer Science and Engineering

Year Degree Awarded

2022

Creative Commons License

Creative Commons Attribution-Noncommercial-Share Alike 3.0 License