Junjie Zhang, Ph.D. (Advisor); Lingwei Chen, Ph.D. (Committee Member); Meilin Liu, Ph.D. (Committee Member)
Master of Science in Cyber Security (MSCS)
Source code that contains vulnerabilities can be compiled for multiple processor architectures, carrying those vulnerabilities into many different devices. Security researchers frequently have no way to obtain this source code to analyze it for vulnerabilities, so the ability to analyze binary code effectively is essential. Similarity detection is one facet of binary code analysis. Because the same source code can be compiled for different architectures, similarity must sometimes be detected across architectures. This need is especially apparent when analyzing firmware from embedded computing environments such as Internet of Things devices, where the processor architecture depends on the product and cannot be controlled by the researcher. In this thesis, we propose a system for cross-architecture binary similarity detection and present an implementation. Our system simplifies the process by lifting the binary code into an intermediate representation provided by Ghidra before analyzing it with a neural network. This eliminates the noise that can result from analyzing two disparate instruction sets simultaneously. Our tool shows a high degree of accuracy when comparing basic blocks. In future work, we hope to extend it to capture function-level control flow data.
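The idea in the abstract above, comparing basic blocks after both have been lifted into one shared intermediate representation, can be illustrated with a small sketch. It is not the thesis implementation. The opcode sequences are hand-written for illustration rather than taken from Ghidra output, the function names are hypothetical, and a plain cosine similarity over opcode counts stands in for the neural network.

```python
from collections import Counter
from math import sqrt


def opcode_histogram(ir_ops):
    """Count how often each IR opcode appears in a basic block."""
    return Counter(ir_ops)


def cosine_similarity(hist_a, hist_b):
    """Cosine similarity between two opcode histograms (0.0 to 1.0)."""
    shared = hist_a.keys() & hist_b.keys()
    dot = sum(hist_a[op] * hist_b[op] for op in shared)
    norm_a = sqrt(sum(c * c for c in hist_a.values()))
    norm_b = sqrt(sum(c * c for c in hist_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty block is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Assumption: illustrative opcode sequences for one block of logic as it
# might look after lifting from two architectures, plus an unrelated block.
# The mnemonics follow P-code naming, but the sequences are made up.
BLOCK_FROM_ARM = ["LOAD", "INT_ADD", "STORE", "INT_EQUAL", "CBRANCH"]
BLOCK_FROM_X86 = ["LOAD", "INT_ADD", "COPY", "STORE", "INT_EQUAL", "CBRANCH"]
BLOCK_UNRELATED = ["CALL", "COPY", "COPY", "RETURN"]

same_logic = cosine_similarity(
    opcode_histogram(BLOCK_FROM_ARM), opcode_histogram(BLOCK_FROM_X86)
)
different_logic = cosine_similarity(
    opcode_histogram(BLOCK_FROM_ARM), opcode_histogram(BLOCK_UNRELATED)
)

print(f"same logic, different architectures: {same_logic:.3f}")
print(f"unrelated blocks: {different_logic:.3f}")
```

Because both blocks are expressed in one opcode vocabulary, the comparison never has to reconcile two native instruction sets. A learned model would replace the histogram and cosine step with features that also reflect operand and ordering information.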
Department or Program
Department of Computer Science and Engineering
Year Degree Awarded
Copyright 2022, some rights reserved. My ETD may be copied and distributed only for non-commercial purposes and may not be modified. All use must give me credit as the original author.
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.