Publication Date
2021
Document Type
Thesis
Committee Members
Valerie Shalin, Ph.D. (Advisor); Debra Steele-Johnson, Ph.D. (Committee Member); T. K. Prasad, Ph.D. (Committee Member)
Degree Name
Master of Science (MS)
Abstract
Qualitative data result from observation, video, and dialogue. These types of data are flexible and allow us to study behavior without imposing potentially disruptive data collection methods. However, subsequent quantitative analysis requires a time-consuming, labor-intensive initial coding process and a second manual coding to calculate inter-rater reliability. I examined the use of machine learning algorithms to reduce the amount of manual annotation work required to perform inter-rater reliability measures on text data. By comparing machine-human and human-human raters using Cohen's Kappa statistic and an informal analysis of the features used in machine learning classification, I identified the promise and limitations of machine rating for conducting the second coding effort used to determine reliability. I found that machine learning algorithms can be useful tools for supporting inter-rater reliability as a second coder, but there are limitations associated with the class balance of the data that may restrict their use.
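As an illustrative sketch (not drawn from the thesis itself), the machine-human comparison described in the abstract could be set up with scikit-learn: train a text classifier on one human coder's annotations, then score its agreement with a second human coder's codes using Cohen's Kappa. The example texts, code labels, and model choice below are hypothetical placeholders.

```python
# Hypothetical sketch: a machine learning classifier acts as the "second coder",
# and machine-human agreement is measured with Cohen's Kappa.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import cohen_kappa_score
from sklearn.pipeline import make_pipeline

# Placeholder data: text segments with codes assigned by the first human coder.
train_texts = [
    "we should check the valve",
    "I think the reading is fine",
    "turn it off now",
    "that estimate seems high",
]
train_codes = ["directive", "assessment", "directive", "assessment"]

# Held-out segments coded independently by a second human coder.
test_texts = ["please verify the reading", "looks okay to me"]
human_codes = ["directive", "assessment"]

# Train a simple text classifier on the first coder's annotations.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_texts, train_codes)

# The classifier plays the role of the second coder on the held-out segments.
machine_codes = model.predict(test_texts)

# Cohen's Kappa gives chance-corrected agreement between the two "raters";
# a human-human Kappa would be computed the same way from two human label sets.
print("Machine-human Kappa:", cohen_kappa_score(human_codes, machine_codes))
```

With heavily imbalanced codes, Kappa can be low even when raw agreement is high, which reflects the class-balance limitation the abstract notes.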
Page Count
58
Department or Program
Department of Psychology
Year Degree Awarded
2021
Copyright
Copyright 2021, all rights reserved. My ETD will be available under the "Fair Use" terms of copyright law.
ORCID ID
0000-0001-6089-8893