Publication Date

2013

Document Type

Thesis

Committee Members

John Gallagher (Committee Member), Mateen Rizki (Committee Member), Shaojun Wang (Advisor)

Degree Name

Master of Science (MS)

Abstract

I present a composite language model in which an n-gram language model is integrated with the Latent Dirichlet Allocation topic clustering model. I also describe a parallel architecture that allows this model to be trained over large corpora and present experimental results that show how the composite model compares to a standard n-gram model over corpora of varying size.

Page Count

36

Department or Program

Department of Computer Science

Year Degree Awarded

2013

