This paper explores the problem of predicting specific reading mistakes, called miscues, on a given word. Characterizing likely miscues tells an automated reading tutor what to anticipate, detect, and remediate. As training and test data, we use a database of over 100,000 miscues transcribed by University of Colorado researchers. We explore approaches that exploit different sources of predictive power: the uneven distribution of words in text, and the fact that most miscues are real words. We compare the approaches’ ability to predict miscues of other readers on other text. A simple rote method does best on the most frequent 100 words of English, while an extrapolative method for predicting real-word miscues performs well on less frequent words, including words not in the training data
Winter, S. V.,
& Wang, S.
(2002). Predicting Oral Reading Miscues. .