Publication Date
2023
Document Type
Dissertation
Committee Members
Bin Wang, Ph.D. (Advisor); Soon M. Chung, Ph.D. (Committee Member); Liu Meilin, Ph.D. (Committee Member); Wu Zhiqiang, Ph.D. (Committee Member)
Degree Name
Doctor of Philosophy (PhD)
Abstract
The Internet of Things (IoT) is used in many fields that generate sensitive data, such as healthcare and surveillance. Increased reliance on IoT has raised serious information security concerns. This dissertation presents three systems for analyzing and classifying IoT traffic using Deep Learning (DL) models, and builds a large dataset for training and evaluating them. The first system studies the effect of combining raw data and engineered features to optimize the classification of encrypted and compressed IoT traffic using Engineered Features Classification (EFC), Raw Data Classification (RDC), and combined Raw Data and Engineered Features Classification (RDEFC) approaches. Our results demonstrate that the EFC, RDC, and RDEFC models achieve classification accuracies of 80.94%, 86.45%, and 90.55%, respectively, outperforming systems reported in the literature with similar configurations. The second system uses three density estimation approaches, namely histogram, Kernel Density Estimation (KDE), and Cumulative Distribution Function (CDF), to enhance the classification of encrypted and compressed variable-size IoT traffic. The results demonstrate that the KDE approach attains a significantly higher accuracy of 90.92%, compared to 86.66% and 82.6% for the histogram and CDF approaches, respectively. Furthermore, the KDE approach outperforms our RDEFC model in three aspects: variable file length, dataset complexity, and dimensionality reduction. The third system proposes a novel approach for classifying the file types of fragments in a compressed archive file for digital forensic investigation. Existing research in the literature classifies such files only by their archive file format, such as .zip, with no further investigation of the compressed file types. In this system, an optimized modification of the Inception network is implemented. Two sets of filter sizes are evaluated, and the attained accuracies are 73.18% and 75.24%, respectively.
For future work, we suggest including more encryption or compression algorithms, collecting real-world traffic, classifying encrypted file types, and using our trained models for transfer learning in prospective security domains.
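As an illustration of the second system's idea, the sketch below shows one way the three representations named in the abstract (histogram, KDE, and CDF) could turn a byte sequence of any length into a fixed-length feature vector. It is a minimal sketch under stated assumptions, not the dissertation's implementation: the function names, the 64-point grid, the Gaussian kernel, and the bandwidth of 8.0 are all hypothetical choices made for this example.

```python
import math
from typing import List


def histogram_features(data: bytes, bins: int = 256) -> List[float]:
    """Normalized histogram of byte values; sums to 1 for non-empty input."""
    counts = [0] * bins
    for b in data:
        counts[b * bins // 256] += 1
    total = len(data)
    if total == 0:
        return [0.0] * bins
    return [c / total for c in counts]


def cdf_features(data: bytes, bins: int = 256) -> List[float]:
    """Empirical CDF, computed as the running sum of the normalized histogram."""
    cdf: List[float] = []
    running = 0.0
    for p in histogram_features(data, bins):
        running += p
        cdf.append(running)
    return cdf


def kde_features(
    data: bytes,
    grid_points: int = 64,   # assumed output size, not from the source
    bandwidth: float = 8.0,  # assumed kernel bandwidth, not from the source
) -> List[float]:
    """Gaussian KDE of byte values evaluated on an even grid over [0, 255].

    The byte histogram is used as kernel weights, which is equivalent to
    summing one kernel per byte but costs the same for any input length.
    """
    hist = histogram_features(data, 256)
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    step = 255.0 / (grid_points - 1)
    features: List[float] = []
    for i in range(grid_points):
        x = i * step
        density = sum(
            p * norm * math.exp(-0.5 * ((x - value) / bandwidth) ** 2)
            for value, p in enumerate(hist)
            if p
        )
        features.append(density)
    return features


if __name__ == "__main__":
    short_fragment = bytes([0, 0, 255, 128])
    long_fragment = bytes(range(256)) * 10
    # Both inputs map to vectors of the same length despite different sizes.
    print(len(kde_features(short_fragment)), len(kde_features(long_fragment)))
```

In this sketch the KDE vector is shorter than the 256-bin histogram, which is one way a density-based representation can handle variable input length while also reducing dimensionality.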
Page Count
120
Department or Program
Department of Computer Science and Engineering
Year Degree Awarded
2023
Copyright
Copyright 2023, some rights reserved. My ETD may be copied and distributed only for non-commercial purposes and may not be modified. All use must give me credit as the original author.
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.