Publication Date
2024
Document Type
Thesis
Committee Members
Cogan Shimizu, Ph.D. (Advisor); Lingwei Chen, Ph.D. (Committee Member); Krishnaprasad Thirunarayan, Ph.D. (Committee Member)
Degree Name
Master of Science (MS)
Abstract
This research explores the integration of knowledge graphs with large language models that have already been trained on vast pools of unstructured text data. Large language models trained on this type of data tend to hallucinate and produce factually inaccurate results, primarily because the training corpus is a large, unstructured body of text and the model generates responses through predictive text methods. These issues can be addressed by applying Retrieval Augmented Generation and fine-tuning to large language models, backed by an underlying domain-specific knowledge graph. Integrating knowledge graphs with large language models combines the explicit semantics of the graph with the contextual understanding provided by the model, resulting in an efficient system capable of interpreting and responding to user queries in a more contextually aware manner while reducing hallucinations. In this thesis, we conducted experiments to improve the efficiency of large language models using knowledge graphs, Retrieval Augmented Generation, and fine-tuning strategies, where the large language model responds to user queries with the help of either Retrieval Augmented Generation or fine-tuning techniques. The results of this study show that Retrieval Augmented Generation outperformed fine-tuning in terms of generating precise responses to user queries.
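For illustration only, the sketch below shows one way a knowledge-graph-backed Retrieval Augmented Generation pipeline might be wired: facts are retrieved from a domain-specific graph and prepended to the prompt so the model answers from grounded context. The thesis does not publish its implementation, so the toy triple store, the keyword-overlap retriever, and the stubbed final LLM call are all hypothetical, not the author's actual method.

```python
# Minimal sketch of knowledge-graph-backed Retrieval Augmented Generation.
# All names and data here are illustrative assumptions, not the thesis's stack.

# A toy domain-specific knowledge graph as (subject, predicate, object) triples.
KG_TRIPLES = [
    ("aspirin", "treats", "headache"),
    ("aspirin", "interactsWith", "warfarin"),
    ("ibuprofen", "treats", "inflammation"),
]

def retrieve(query: str, triples, k: int = 3):
    """Return up to k triples whose terms overlap the query (keyword match)."""
    words = set(query.lower().split())
    scored = []
    for s, p, o in triples:
        overlap = len(words & {s.lower(), p.lower(), o.lower()})
        if overlap:
            scored.append((overlap, (s, p, o)))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [triple for _, triple in scored[:k]]

def build_prompt(query: str, facts) -> str:
    """Ground the prompt in retrieved facts to curb hallucination."""
    context = "\n".join(f"- {s} {p} {o}" for s, p, o in facts)
    return (
        "Answer using only the facts below.\n"
        f"Facts:\n{context}\n"
        f"Question: {query}\n"
    )

query = "What does aspirin treat?"
prompt = build_prompt(query, retrieve(query, KG_TRIPLES))
print(prompt)  # In a full pipeline, this prompt would be sent to the LLM.
```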
Page Count
53
Department or Program
Department of Computer Science and Engineering
Year Degree Awarded
2024
Copyright
Copyright 2024, some rights reserved. My ETD may be copied and distributed only for non-commercial purposes and may not be modified. All use must give me credit as the original author.
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
ORCID ID
0009-0006-1064-141X