Publication Date

2016

Document Type

Thesis

Committee Members

Valerie Shalin (Committee Member), Amit Sheth (Committee Member), Krishnaprasad Thirunarayan (Advisor)

Degree Name

Master of Science (MS)

Abstract

Harassment on social media has become a critical problem, and social media content depicting harassment is becoming commonplace. Video-sharing websites such as YouTube contain content that may be offensive to certain communities, insulting to certain religions, races, etc., or that makes fun of disabilities. These videos can also provoke and promote altercations leading to online harassment of individuals and groups. In this thesis, we present a system that identifies offensive videos on YouTube. Our goal is to determine features that can be used to detect offensive videos efficiently and reliably. We conducted experiments using the content and metadata available for each YouTube video, such as comments, title, description, and number of views, to develop Naive Bayes and Support Vector Machine classifiers. We used a training dataset of 300 videos and a test dataset of 86 videos and obtained a classification F-score of 0.86. Surprisingly, the sentiment and content of the comments were less effective in detecting offensive videos than the unigrams and bigrams in the video title, and other feature combinations did not improve the performance appreciably. Thus, the simplicity of these features contributes to the efficiency of computation and implies that uploaders provide good titles.

Page Count

51

Department or Program

Department of Computer Science and Engineering

Year Degree Awarded

2016

