Publication Date
2013
Document Type
Thesis
Committee Members
John Gallagher (Committee Member), Mateen Rizki (Committee Member), Shaojun Wang (Advisor)
Degree Name
Master of Science (MS)
Abstract
I present a composite language model in which an n-gram language model is integrated with the Latent Dirichlet Allocation topic clustering model. I also describe a parallel architecture that allows this model to be trained over large corpora, and I present experimental results that show how the composite model compares to a standard n-gram model over corpora of varying size.
Page Count
36
Department or Program
Department of Computer Science
Year Degree Awarded
2013
Copyright
Copyright 2013, all rights reserved. This open access ETD is published by Wright State University and OhioLINK.