Publication Date

2017

Document Type

Thesis

Committee Members

Valerie Shalin (Committee Member), Amit Sheth (Committee Member), Krishnaprasad Thirunarayan (Advisor)

Degree Name

Master of Science (MS)

Abstract

Social media has brought people closer than ever before, but its use has also brought a risk of online harassment. Such harassment can seriously affect a person, for example by causing low self-esteem and depression. Past research on detecting harassment on social media has relied primarily on the content of the messages exchanged. Relying on a single post, without its context, can produce a high rate of false alarms. In this study, I focus on the reliable detection of harassment on Twitter by better understanding the context in which a pair of users exchanges messages, thereby improving precision. Specifically, I use a comprehensive set of features involving content, the profiles of the users exchanging messages, and the sequence of messages. By analyzing the conversation between users, with features such as change of behavior during the conversation, length of the conversation, and frequency of curse words, I find that the detection of harassment can be improved significantly over using content features and user profile information alone. Experimental results demonstrate that the comprehensive set of features I use in my supervised machine learning classifier achieves an F-score of 88.2 and an Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) of 94.3.

Page Count

72

Department or Program

Department of Computer Science and Engineering

Year Degree Awarded

2017

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.

