As social media continues to grow as a mode of communication and information exchange in daily life, these systems give authors a platform to share their thoughts, feelings, and experiences on many topics. Harnessing this publicly expressed information can be powerful: public perceptions, signals, and data about specific topics could be extracted from these posts and studied. However, collecting topic-specific information from social media involves a common trade-off: generally, the more specific the topic, the more challenging it is to extract meaningful information. This is because, at first glance, social media posts are simply too noisy: authors are forced to pack meaning into a short length (140 characters on Twitter). This work presents a nontrivial methodology to overcome this problem. It uses state-of-the-art programming and data storage technologies, stop-word dictionaries, author filters, and Twitter bot detectors. Evaluating the authenticity of the collected tweets is left to future work using Amazon Mechanical Turk evaluators; short of that, we demonstrate how our methodology extracts specific, meaningful tweets about topics related to chronic diseases and medication.
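As a rough illustration of how the filtering components named in the abstract might fit together, the Python sketch below chains a stop-word filter, an author filter, and a very crude bot heuristic. It is not the authors' implementation: the `Tweet` type, the word lists, the blocked-author set, and the posting-rate threshold are all hypothetical placeholders chosen only for this example.

```python
import re
from dataclasses import dataclass

# All values below are hypothetical; none are taken from the paper.
STOP_WORDS = {"a", "an", "the", "is", "my", "for", "to", "of", "and", "i"}
TOPIC_TERMS = {"diabetes", "insulin", "metformin"}
BLOCKED_AUTHORS = {"pharma_ads"}   # assumed author filter (e.g. known advertisers)
MAX_TWEETS_PER_DAY = 100           # assumed crude stand-in for a bot detector


@dataclass
class Tweet:
    author: str
    text: str
    author_tweets_per_day: float


def tokenize(text: str) -> list[str]:
    """Lowercase the text, split it into word tokens, and drop stop words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def is_relevant(tweet: Tweet) -> bool:
    """Keep a tweet only if it passes the author, bot, and topic checks."""
    if tweet.author in BLOCKED_AUTHORS:
        return False
    if tweet.author_tweets_per_day > MAX_TWEETS_PER_DAY:
        return False
    return any(tok in TOPIC_TERMS for tok in tokenize(tweet.text))


def filter_tweets(tweets: list[Tweet]) -> list[Tweet]:
    """Return the tweets that survive every filter, in their original order."""
    return [t for t in tweets if is_relevant(t)]
```

In this sketch each filter is an independent early exit, so a real bot detector or a larger stop-word dictionary could replace the placeholder without changing the rest of the pipeline.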
Arts and Humanities | Engineering | Life Sciences | Medicine and Health Sciences | Physical Sciences and Mathematics | Social and Behavioral Sciences
Duberstein, S. J., Asamoah, D., Doran, D., & Schiller, S. Z. (2016). Finding Specific, Topic Related Information from a Sea of Social Media Posts.
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 4.0 License.