Publication Date

2023

Document Type

Dissertation

Committee Members

Bin Wang, Ph.D. (Advisor); Soon M. Chung, Ph.D. (Committee Member); Liu Meilin, Ph.D. (Committee Member); Wu Zhiqiang, Ph.D. (Committee Member)

Degree Name

Doctor of Philosophy (PhD)

Abstract

The Internet of Things (IoT) is used in many fields that generate sensitive data, such as healthcare and surveillance. Increased reliance on IoT has raised serious information security concerns. This dissertation presents three systems for analyzing and classifying IoT traffic using Deep Learning (DL) models, and a large dataset is built for training and evaluating these systems. The first system studies the effect of combining raw data and engineered features to optimize the classification of encrypted and compressed IoT traffic using Engineered Features Classification (EFC), Raw Data Classification (RDC), and combined Raw Data and Engineered Features Classification (RDEFC) approaches. Our results demonstrate that the EFC, RDC, and RDEFC models achieve classification accuracies of 80.94%, 86.45%, and 90.55%, respectively, outperforming systems reported in the literature with similar configurations. The second system uses three density estimation approaches, namely histogram, Kernel Density Estimation (KDE), and Cumulative Distribution Function (CDF), to enhance the classification of encrypted and compressed variable-size IoT traffic. The results demonstrate that the KDE approach attains a significantly higher accuracy of 90.92%, compared to 86.66% and 82.6% for the histogram and CDF approaches, respectively. Furthermore, the KDE approach outperforms our RDEFC model in three aspects: variable file length, dataset complexity, and dimensionality reduction. The third system proposes a novel approach for classifying the file types of fragments in a compressed archive file for digital forensic investigation. Existing research in the literature classifies these files as archive file formats, such as .zip, with no further investigation of the compressed file types. In this system, an optimized modification of the Inception network is implemented. Two sets of filter sizes are evaluated, and the attained accuracies are 73.18% and 75.24%, respectively.
For future work, we suggest including more encryption or compression algorithms, collecting real-world traffic, classifying encrypted file types, and using our trained models for transfer learning in prospective security domains.
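As an illustration of the density-estimation idea in the second system, the following minimal sketch shows how a histogram, a Gaussian KDE, and an empirical CDF over byte values can each map a variable-length payload to a fixed-length vector. It is not the dissertation's implementation: the function names, the 256-point grid, and the bandwidth of 2.0 are hypothetical choices made only for this example, and the abstract does not say how the actual features were computed.

```python
import math
from collections import Counter


def byte_histogram(data: bytes) -> list[float]:
    """Normalized 256-bin histogram of byte values (entries sum to 1)."""
    if not data:
        raise ValueError("data must be non-empty")
    counts = Counter(data)
    n = len(data)
    return [counts.get(v, 0) / n for v in range(256)]


def byte_cdf(data: bytes) -> list[float]:
    """Empirical CDF over byte values: running sum of the histogram."""
    cdf, running = [], 0.0
    for p in byte_histogram(data):
        running += p
        cdf.append(running)
    return cdf


def byte_kde(data: bytes, bandwidth: float = 2.0, grid_size: int = 256) -> list[float]:
    """Gaussian KDE of byte values, evaluated on an even grid over [0, 255].

    The bandwidth is an assumed value for illustration, not a tuned one.
    """
    if bandwidth <= 0 or grid_size < 2:
        raise ValueError("bandwidth must be > 0 and grid_size >= 2")
    hist = byte_histogram(data)
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    grid = [255.0 * i / (grid_size - 1) for i in range(grid_size)]
    # Weight each distinct byte value by its relative frequency, so the cost
    # depends on the grid size rather than on the payload length.
    return [
        norm * sum(
            p * math.exp(-0.5 * ((g - v) / bandwidth) ** 2)
            for v, p in enumerate(hist)
            if p
        )
        for g in grid
    ]


# Payloads of different lengths yield feature vectors of the same length.
short_payload = bytes([0, 1, 1, 255])
long_payload = bytes(range(256)) * 10
features = [
    (byte_histogram(p), byte_kde(p), byte_cdf(p))
    for p in (short_payload, long_payload)
]
```

In this sketch, each of the three vectors has a fixed length regardless of payload size, which is what lets a fixed-input classifier consume variable-size data.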

Page Count

120

Department or Program

Department of Computer Science and Engineering

Year Degree Awarded

2023

