Publication Date

2023

Document Type

Dissertation

Committee Members

Bin Wang, Ph.D. (Advisor); Soon M. Chung, Ph.D. (Committee Member); Meilin Liu, Ph.D. (Committee Member); Zhiqiang Wu, Ph.D. (Committee Member)

Degree Name

Doctor of Philosophy (PhD)

Abstract

Insider threats to information security have become a burden for organizations. Understanding insider activities improves the identification of insider attacks and limits the threats they pose. This dissertation presents three systems for detecting insider threats. The aims are to reduce the false negative rate (FNR), make better use of the dataset, and reduce the effects of dimensionality and zero padding. The systems use deep learning techniques and are evaluated on the CERT 4.2 dataset. The dataset is analyzed and restructured so that each row represents a variable-length sample of user activities. Two data representations are implemented to model the extracted features: gray encoding (GE) and a kernel density estimator (KDE) with a cumulative distribution function (CDF). Additionally, sentiment analysis and unique coding are applied to each category of user activities so that the detection model can distinguish the activities, the correlations between them, and their temporal characteristics. The first detection system is a long short-term memory (LSTM) network. It reduced the FNR, but its performance degraded as the size of the dataset increased. The second detection system combines a convolutional neural network (CNN) with an LSTM network. Processing and modeling the dataset created two problems that hindered the performance of these two detection systems: (1) high dimensionality and (2) the vanishing of short rows caused by zero padding. The third detection system aims to mitigate both the curse of dimensionality and the vanishing of short rows. It uses two neural models, an embedding layer and an autoencoder. The embedding layer removes the padded zeros and produces a dense embedded output. The autoencoder compresses the input data samples to a shorter length and feeds the processed samples to the detection model. All detection systems performed well in classifying users' activities and detecting insider threats.
The first detection system attained an AUC of 0.97, the second an AUC of 0.74, and the third an AUC of 0.94. Future work will incorporate modeling users' activities, analyzing email and website content, developing finer-grained detection models, and investigating the development of a balanced insider threat dataset.
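The two preprocessing ideas named in the abstract, the KDE-with-CDF representation and zero padding of variable-length rows, can be illustrated with a minimal sketch. This is not the dissertation's implementation: the Gaussian kernel, the bandwidth value, the function names, and the sample activity codes are all assumptions chosen for illustration.

```python
import math
from typing import List


def kde_cdf(x: float, samples: List[float], bandwidth: float) -> float:
    """CDF of a Gaussian kernel density estimate (kernel choice is an assumption).

    The KDE is an equal-weight mixture of normals centred on each sample,
    so its CDF is the mean of the individual normal CDFs.
    """
    scale = bandwidth * math.sqrt(2.0)
    return sum(0.5 * (1.0 + math.erf((x - s) / scale)) for s in samples) / len(samples)


def zero_pad(rows: List[List[int]], pad_value: int = 0) -> List[List[int]]:
    """Right-pad every row with pad_value up to the length of the longest row."""
    longest = max(len(row) for row in rows)
    return [row + [pad_value] * (longest - len(row)) for row in rows]


def padding_fraction(row: List[int], pad_value: int = 0) -> float:
    """Share of a padded row that is padding rather than real activity codes."""
    return sum(1 for v in row if v == pad_value) / len(row)


# Hypothetical activity codes; real codes must be nonzero so that
# they cannot be confused with the padding value.
rows = [[3, 1, 2], [5], [4, 4]]
padded = zero_pad(rows)

# Maps a raw feature value into [0, 1] relative to the observed samples.
feature = kde_cdf(2.0, [1.0, 2.0, 3.0], bandwidth=0.5)
```

In this sketch a short row such as `[5]` becomes mostly padding once it is padded to the longest row, which is one way to read the "vanishing short rows" problem the abstract describes. The CDF transform maps each feature value into the interval [0, 1] regardless of its original scale.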

Page Count

183

Department or Program

Department of Computer Science and Engineering

Year Degree Awarded

2023

ORCID ID

0000-0003-4364-8766
