Publication Date

2021

Document Type

Thesis

Committee Members

Valerie Shalin, Ph.D. (Advisor); Debra Steele-Johnson, Ph.D. (Committee Member); T. K. Prasad, Ph.D. (Committee Member)

Degree Name

Master of Science (MS)

Abstract

Qualitative data result from observation, video, and dialogue. These types of data are flexible and allow us to study behavior without imposing potentially disruptive data collection methods. However, subsequent quantitative analysis requires a time-consuming, labor-intensive initial coding process and a second manual coding to calculate inter-rater reliability. I examined the use of machine learning algorithms to reduce the amount of manual annotation work required to perform inter-rater reliability measures on text data. By comparing machine-human and human-human raters using Cohen’s Kappa statistic and an informal analysis of the features used in machine learning classification, I identify the promise and limitations of machine rating for conducting the second coding effort used to determine reliability. I found that machine learning algorithms can serve as useful second coders for supporting inter-rater reliability, but limitations associated with the class balance of the data may restrict their use.
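A minimal sketch of the machine-human agreement comparison described above, assuming scikit-learn's cohen_kappa_score; the coding labels are hypothetical, not drawn from the thesis data:

# Compare a human coder's labels against a machine "second coder"
# using Cohen's Kappa. Labels here are illustrative placeholders.
from sklearn.metrics import cohen_kappa_score

human_codes   = ["task", "task", "social", "task", "social", "task"]
machine_codes = ["task", "social", "social", "task", "social", "task"]

# Cohen's Kappa corrects raw agreement for chance agreement:
#   kappa = (p_o - p_e) / (1 - p_e)
# where p_o is observed agreement and p_e is expected agreement.
kappa = cohen_kappa_score(human_codes, machine_codes)
print(f"Cohen's Kappa: {kappa:.2f}")

Because Kappa subtracts the expected chance agreement p_e, a heavily imbalanced class distribution inflates p_e and can depress Kappa even when raw agreement is high, which illustrates the class-balance limitation noted in the abstract.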

Page Count

58

Department or Program

Department of Psychology

Year Degree Awarded

2021

ORCID ID

0000-0001-6089-8893

