Valerie Shalin (Committee Member), Amit Sheth (Committee Member), Krishnaprasad Thirunarayan (Advisor)
Master of Science (MS)
Harassment on social media has become a critical problem, and social media content depicting harassment is becoming commonplace. Video-sharing websites such as YouTube contain content that may be offensive to certain communities, insulting to certain religions, races, etc., or that makes fun of disabilities. These videos can also provoke and promote altercations, leading to online harassment of individuals and groups. In this thesis, we present a system that identifies offensive videos on YouTube. Our goal is to determine features that can be used to detect offensive videos efficiently and reliably. We conducted experiments using the content and metadata available for each YouTube video, such as comments, title, description, and number of views, to develop Naive Bayes and Support Vector Machine classifiers. We used a training dataset of 300 videos and a test dataset of 86 videos, and obtained a classification F-score of 0.86. Surprisingly, the sentiment and content of the comments were less effective in detecting offensive videos than the unigrams and bigrams in the video title, and other feature combinations did not improve performance appreciably. Thus, the simplicity of these features contributes to the efficiency of computation and implies that uploaders provide good titles.
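To make the title-based approach concrete, the following is a minimal sketch of a multinomial Naive Bayes classifier over title unigrams and bigrams, together with an F-score computation. It is not the thesis implementation: the whitespace tokenizer, the add-one smoothing, the names `title_ngrams`, `TitleNaiveBayes`, and `f_score`, and the four toy titles are all assumptions made for illustration, and the Support Vector Machine classifier and the other metadata features are omitted.

```python
import math
from collections import Counter, defaultdict


def title_ngrams(title):
    """Return the unigrams and bigrams of a title (lowercased, whitespace-split)."""
    tokens = title.lower().split()
    bigrams = [a + " " + b for a, b in zip(tokens, tokens[1:])]
    return tokens + bigrams


class TitleNaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, titles, labels):
        self.class_counts = Counter(labels)          # videos per class
        self.feature_counts = defaultdict(Counter)   # n-gram counts per class
        self.vocab = set()
        for title, label in zip(titles, labels):
            feats = title_ngrams(title)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, title):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            score = math.log(n / total)              # log prior
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for feat in title_ngrams(title):
                if feat in self.vocab:               # skip unseen n-grams
                    count = self.feature_counts[label][feat]
                    score += math.log((count + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def f_score(y_true, y_pred, positive):
    """F1 score (harmonic mean of precision and recall) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy titles, invented for this sketch (not from the thesis dataset).
train_titles = [
    "insulting rant mocking people",
    "mocking people prank gone wrong",
    "relaxing piano music",
    "piano tutorial for beginners",
]
train_labels = ["offensive", "offensive", "benign", "benign"]

model = TitleNaiveBayes().fit(train_titles, train_labels)
```

Calling `model.predict(title)` on a new title returns the more probable of the two labels, and `f_score` can then be applied to the predictions on a held-out test set. With only four training titles the output means nothing statistically; the sketch only shows how title n-grams alone can drive a classifier.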
Department or Program
Department of Computer Science and Engineering
Year Degree Awarded
Copyright 2016, all rights reserved. My ETD will be available under the "Fair Use" terms of copyright law.