Publication Date


Document Type


Committee Members

John Flach (Committee Member), Ion Juvina (Committee Member), Valerie Shalin (Committee Chair), Amit Sheth (Committee Member)

Degree Name

Doctor of Philosophy (PhD)


Social media data promise to inform the disaster response community, but effective mining remains elusive. To assist in the analysis of community reports on disaster from social media, I draw on an integrated model of psycholinguistic theory to investigate the patterns by which language use changes as a function of environmental influence. Using social media corpora from several disasters and non-disasters, I examine variations in patterns of lexical choice between domain-independent paired antonyms with respect to an Internet-specific base rate to determine generic sentinels of breach of canonicity. I examine social media content with respect to disaster proximity and examine relative proportions of actionable content in messages containing words that indicate breach. Results indicate a preliminary set of antonym pairs that vary consistently with respect to breach. Despite the absence of correlation with actionable content density, two related findings support the role of a psycholinguistic perspective on the mining of social media data. First, several diagnostic pairs reflect human function in an environment independent of sentiment. Second, the analysis of sentiment by spatial proximity suggests an increase in positive sentiment with proximity. Both findings motivate the continued study of how human behavior contributes to the production of social media messages, and hence to the analysis of those messages. I note several methodological contributions resulting from this work, including the expanded set of informative domain-independent lexical items, consideration of base rates that both enables detection of departure from canonicity and reduces reliance on anonymous reporting, and a complement to sentiment analysis that is sensitive to environmental variability. Theoretical contributions include consolidation of disparate threads of language production research (including a focus on grounding). Finally, I identify several limitations in my own analysis, and more general concerns regarding the mining of social media data, to guide future work.

Page Count


Department or Program

Department of Psychology

Year Degree Awarded