Publication Date
2023
Document Type
Thesis
Committee Members
Krishnaprasad Thirunarayan, Ph.D. (Advisor); Shu Schiller, Ph.D. (Committee Member); Michael Raymer, Ph.D. (Committee Member)
Degree Name
Master of Science (MS)
Abstract
Obtaining accurate inferences from deep neural networks is difficult when models are trained on instances with conflicting labels. Algorithmic recognition of online hate speech illustrates this. No human annotator is perfectly reliable, so multiple annotators evaluate and label online posts in a corpus. Labeling scheme limitations, differences in annotators' beliefs, and limits to annotators' honesty and carefulness cause some labels to disagree. Consequently, decisive and accurate inferences become less likely. Some practical applications, such as social research, can tolerate some indecisiveness. However, an online platform using an indecisive classifier for automated content moderation could create more problems than it solves. Disagreements can be addressed in training by using the label a majority of annotators assigned (majority vote), by training only with unanimously annotated cases (clean filtering), or by representing training labels as probabilities (soft labeling). This study shows that clean filtering occasionally outperforms majority voting, and that soft labeling outperforms both.
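The three strategies named in the abstract can be sketched as follows. This is a minimal illustration, not code from the thesis; the toy annotations and function names are hypothetical, assuming a binary hate-speech labeling task where each post is rated by several annotators.

```python
# Hypothetical sketch of three label-aggregation strategies for
# disagreeing annotations (1 = hate speech, 0 = not hate speech).
# The posts and votes below are illustrative, not thesis data.
from collections import Counter

annotations = {
    "post_a": [1, 1, 1],  # unanimous agreement
    "post_b": [1, 1, 0],  # 2-1 disagreement
    "post_c": [0, 0, 0],  # unanimous agreement
}

def majority_vote(labels):
    """Hard label: the label assigned by the most annotators."""
    return Counter(labels).most_common(1)[0][0]

def clean_filter(all_annotations):
    """Keep only posts whose annotators agreed unanimously."""
    return {post: labels[0]
            for post, labels in all_annotations.items()
            if len(set(labels)) == 1}

def soft_label(labels):
    """Probabilistic label: fraction of annotators who chose 1."""
    return sum(labels) / len(labels)

hard_labels = {p: majority_vote(ls) for p, ls in annotations.items()}
clean_set = clean_filter(annotations)            # drops post_b entirely
soft_labels = {p: soft_label(ls) for p, ls in annotations.items()}
```

Under majority voting every post keeps a hard 0/1 label; clean filtering shrinks the training set to the unanimous cases; soft labeling retains every post but encodes the 2-1 split on `post_b` as a probability, which is the representation the study found most effective.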
Page Count
58
Department or Program
Department of Computer Science and Engineering
Year Degree Awarded
2023
Copyright
Copyright 2023, some rights reserved. My ETD may be copied and distributed only for non-commercial purposes and may not be modified. All use must give me credit as the original author.
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
ORCID ID
0000-0003-3332-4485