Publication Date


Document Type


Committee Members

Amit Sheth, Ph.D. (Advisor); Keke Chen, Ph.D. (Committee Member); Krishnaprasad Thirunarayan, Ph.D. (Committee Member); Valerie Shalin, Ph.D. (Committee Member); Brandon Minnery, Ph.D. (Committee Member)

Degree Name

Doctor of Philosophy (PhD)


The wisdom of the crowd is a well-known example of collective intelligence wherein the aggregated judgment of a group of individuals is superior to that of any single individual. The aggregated judgment is surprisingly accurate at predicting outcomes across a range of tasks, from geopolitical forecasting to stock price prediction. Recent research has shown that participants' previous performance data helps identify a subset of participants who can collectively predict an outcome accurately. In the absence of such performance data, researchers have explored the role of human-perceived diversity, i.e., whether a human considers a crowd to be diverse, in assembling an intelligent crowd. In fact, diversity among participants and independent decision making are the two most important criteria for a crowd to provide an accurate aggregated judgment. However, crowd selection based on perceived diversity does not scale. This dissertation explores whether diversity and independence can be inferred from user-generated social network data to inform intelligent crowd selection. It first provides a data-driven, bottom-up diversity measure and shows that participant diversity can be inferred from social media data and used to select a diverse crowd. It then provides a diverse crowd selection method that applies multi-objective optimization to this measure. The results show that diverse crowds significantly outperform both randomly selected crowds and expert crowds. A top-down approach then provides explainable diversity measures for selecting such a diverse crowd. The data-driven diversity measures do not use social media profile and link information. Community detection based on either shared content or link information can inform diverse crowd selection. However, existing methods do not consider "contextual" similarity, which could play a crucial role in identifying and characterizing contextual communities.
This dissertation provides a state-of-the-art contextual similarity measure and a knowledge graph-enhanced community detection approach to select a diverse crowd and to explain the domain-specific diversity that could affect crowd wisdom. It shows that such a diverse crowd can accurately predict the outcome of real-world events. These results have implications for numerous domains that rely on aggregated judgments, from consumer reviews and econometrics to geopolitical forecasting and intelligence analysis.

Page Count


Year Degree Awarded