Publication Date
2017
Document Type
Thesis
Committee Members
Valerie Shalin (Committee Member), Amit Sheth (Committee Member), Krishnaprasad Thirunarayan (Advisor)
Degree Name
Master of Science (MS)
Abstract
Social media has brought people closer than ever before, but its use has also brought a risk of online harassment. Such harassment can seriously affect a person, for example by causing low self-esteem and depression. Past research on detecting harassment on social media has relied primarily on the content of the messages exchanged. Relying on a single post, without its context, can result in a high rate of false alarms. In this study, I focus on the reliable detection of harassment on Twitter by better understanding the context in which a pair of users exchanges messages, thereby improving precision. Specifically, I use a comprehensive set of features involving content, the profiles of the users exchanging messages, and the sequence of messages. By analyzing the conversation between users, with features such as change of behavior during the conversation, length of the conversation, and frequency of curse words, I find that the detection of harassment can be improved significantly over using content features and user profile information alone. Experimental results demonstrate that a supervised machine learning classifier using this comprehensive feature set achieves an F-score of 88.2 and an area under the Receiver Operating Characteristic (ROC) curve (AUC) of 94.3.
Page Count
72
Department or Program
Department of Computer Science and Engineering
Year Degree Awarded
2017
Copyright
Copyright 2017, some rights reserved. My ETD may be copied and distributed only for non-commercial purposes and may not be modified. All use must give me credit as the original author.
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.