Document Type

Presentation

Description

As social media becomes an increasingly important mode of everyday communication and information exchange, these platforms give authors a place to share their thoughts, feelings, and experiences on many topics. Harnessing the information expressed publicly through these channels can be powerful: public perceptions, signals, and data about specific topics can be extracted and studied from these posts. However, collecting information about a topic from social media involves a common trade-off: the more specific the topic, generally, the more challenging it is to extract meaningful information. This is because, at first glance, social media posts are simply too noisy: authors must compress their meaning into very short posts (140 characters on Twitter). This work presents a nontrivial methodology to overcome this problem. It combines state-of-the-art programming and data storage technologies with stop-word dictionaries, author filters, and Twitter bot detectors. Evaluating the authenticity of the collected tweets is left to future work, where it will be done by Amazon Mechanical Turk evaluators; here we demonstrate how our methodology extracts specific, meaningful tweets about topics related to chronic diseases and medication.
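
The filtering stages named in the description (stop-word dictionaries, author filters, bot detection) could be chained roughly as in the sketch below. This is only an illustration under stated assumptions, not the presenters' implementation: the stop-word list, the topic terms, the posts-per-author cutoff used as a stand-in for a bot detector, and the function names tokenize and filter_tweets are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical values: the description does not specify any of these.
STOP_WORDS = {"the", "a", "an", "and", "or", "is", "to", "of", "for", "my", "i", "in", "on"}
TOPIC_TERMS = {"insulin", "diabetes", "metformin"}  # illustrative topic vocabulary
MAX_POSTS_PER_AUTHOR = 3  # assumed cutoff for a crude bot heuristic


def tokenize(text):
    """Lowercase a tweet, split it into word tokens, and drop stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def filter_tweets(tweets, blocked_authors=frozenset()):
    """Keep tweets that mention a topic term and pass the author and bot checks.

    `tweets` is a list of (author, text) pairs.
    """
    posts_per_author = Counter(author for author, _ in tweets)
    kept = []
    for author, text in tweets:
        if author in blocked_authors:
            continue  # author filter: skip explicitly excluded accounts
        if posts_per_author[author] > MAX_POSTS_PER_AUTHOR:
            continue  # stand-in bot check: unusually prolific accounts are dropped
        if TOPIC_TERMS & set(tokenize(text)):
            kept.append((author, text))  # at least one topic term survived tokenizing
    return kept
```

Calling filter_tweets on a list of (author, text) pairs returns only the pairs that survive all three checks. A real bot detector would likely use richer signals than posting volume; the cutoff here only marks where such a component would sit in the pipeline.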

Publication Date

4-15-2016

Disciplines

Arts and Humanities | Engineering | Life Sciences | Medicine and Health Sciences | Physical Sciences and Mathematics | Social and Behavioral Sciences

Comments

Presented at the Seventh Annual Celebration of Research, Scholarship, and Creative Activities, Dayton, OH, April 15, 2016.

Creative Commons License

Creative Commons Attribution-Noncommercial 4.0 License
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 4.0 License.