Guozhu Dong, Ph.D. (Advisor); Keke Chen, Ph.D. (Committee Member); Hemant Purohit, Ph.D. (Committee Member); Michael Raymer, Ph.D. (Committee Member); Krishnaprasad Thirunarayan, Ph.D. (Committee Member)
Doctor of Philosophy (PhD)
Classification is an important branch of machine learning that impacts many areas of modern life. Many classification algorithms (classifiers, for short) have been developed, and they differ widely in sophistication and classification accuracy; classification problems likewise differ widely in hardness and complexity. Practitioners of classification modeling need a better understanding of these algorithms in order to select the best algorithm for a given classification problem, and researchers need new insight into where given classifiers are weak and how they can be improved by correcting their classification errors. This dissertation introduces new tools and concepts for analyzing classifier weakness and provides new insights on classifier weakness and classifier error correctability. Three tools are introduced to discover such insights. (i) The primary tool is a novel algorithm called Pattern-Aided Mixed-Type Modeling (PAMM). This tool produces a structural model revealing the shape and structure of a classifier's error space, opening new analytical possibilities. (ii) Based on the structural model thus produced, new weakness metrics are introduced that incorporate structural properties of the error space and the correctability of classification errors. (iii) This study uses Corrective Method Sets (CMS), which are sets of popular, simple classifiers, to characterize a classifier's weakness by how much of the classifier's errors the CMS can correct. Using these three tools, two families of valuable insights about 11 popular classifiers are obtained. (i) The 11 popular classifiers are ranked by how structured their error spaces are and how correctable their classification errors are, giving insight into classifier weakness and correctability. These rankings are also compared against purely accuracy-based classifier rankings, shedding light on the relationship between poor classifier accuracy and classifier error correctability. (ii) Top-ranked CMS of three types are provided: those applicable to all 11 classifiers, those applicable to each given classifier, and those applicable to each given classifier on data sets with certain characteristics. In summary, this dissertation offers insights on how many opportunities classifiers leave on the table and how easily their classification errors can be corrected using simple corrective methods.
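The CMS-based correctability idea described above can be illustrated with a minimal toy sketch: given a main classifier's errors on labeled data, correctability is the fraction of those errors that at least one simple classifier in the set predicts correctly. The data, threshold rules, and names below are hypothetical illustrations, not the dissertation's actual classifiers or metrics.

```python
# Hypothetical sketch of a CMS-style correctability measure.
# A "main" classifier's errors are checked against a small set of
# simple corrective rules; correctability = fraction of the main
# classifier's errors that at least one simple rule gets right.

# Toy 1-D labeled data: (feature, true label)
data = [(0.1, 0), (0.3, 0), (0.45, 1), (0.6, 1), (0.8, 1), (0.9, 0)]

# Main classifier: a single threshold rule (illustrative only)
main = lambda x: 0 if x < 0.5 else 1

# A hypothetical CMS: two simple threshold classifiers
cms = [
    lambda x: 0 if x < 0.4 else 1,   # simple rule A
    lambda x: 0 if x < 0.55 else 1,  # simple rule B
]

# Errors of the main classifier, then those some CMS member corrects
errors = [(x, y) for x, y in data if main(x) != y]
corrected = [(x, y) for x, y in errors if any(c(x) == y for c in cms)]
correctability = len(corrected) / len(errors)

print(f"errors: {len(errors)}, correctability: {correctability:.2f}")
# On this toy data the main rule errs on (0.45, 1) and (0.9, 0);
# rule A corrects the first, neither rule corrects the second,
# so correctability is 0.50.
```

The dissertation's CMS analysis operates on real classifiers and data sets; this sketch only shows the shape of the computation.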
Department or Program
Department of Computer Science and Engineering
Year Degree Awarded
Copyright 2021, all rights reserved. My ETD will be available under the "Fair Use" terms of copyright law.