Publication Date

2024

Document Type

Thesis

Committee Members

Cogan Shimizu, Ph.D. (Advisor); Lingwei Chen, Ph.D. (Committee Member); Krishnaprasad Thirunarayan, Ph.D. (Committee Member)

Degree Name

Master of Science (MS)

Abstract

This research explores the integration of knowledge graphs with large language models that have already been trained on a vast pool of unstructured text data. Large language models trained on such data tend to hallucinate and produce factually inaccurate results, primarily because the training corpus is a huge body of unstructured text and the model generates responses through predictive text methods. These issues can be addressed by applying Retrieval Augmented Generation and fine-tuning to large language models, employing an underlying domain-specific knowledge graph. Integrating knowledge graphs with large language models combines the graph's underlying semantics with the contextual understanding provided by the model, resulting in an efficient system capable of interpreting and responding to user queries in a more contextually aware manner while reducing hallucinations. In this thesis, we conducted experiments to improve the effectiveness of large language models using knowledge graphs, Retrieval Augmented Generation, and fine-tuning strategies, where the large language model responds to user queries with the help of either Retrieval Augmented Generation or fine-tuning. The results of this study show that Retrieval Augmented Generation outperformed fine-tuning in generating precise responses to user queries.
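
To make the Retrieval Augmented Generation pipeline described above concrete, the sketch below shows the general pattern of grounding a model in a knowledge graph: retrieve triples relevant to the user query, then prepend them as facts to the prompt. This is a minimal illustration, not the thesis's implementation; the toy triples, the keyword-overlap retriever, and the prompt template are hypothetical stand-ins.

    # Minimal sketch of knowledge-graph-backed Retrieval Augmented Generation.
    # Illustrative only: the triples, retriever, and prompt template below are
    # hypothetical examples, not the system evaluated in this thesis.

    # A domain-specific knowledge graph as (subject, predicate, object) triples.
    KNOWLEDGE_GRAPH = [
        ("aspirin", "treats", "headache"),
        ("aspirin", "interactsWith", "warfarin"),
        ("ibuprofen", "treats", "inflammation"),
    ]

    def retrieve_triples(query, graph, k=3):
        """Rank triples by word overlap with the query and return the top k."""
        words = set(query.lower().split())
        scored = []
        for s, p, o in graph:
            overlap = len(words & {s.lower(), o.lower()})
            if overlap:
                scored.append((overlap, (s, p, o)))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [triple for _, triple in scored[:k]]

    def build_prompt(query, triples):
        """Ground the model by prepending retrieved facts to the user query."""
        facts = "\n".join(f"- {s} {p} {o}" for s, p, o in triples)
        return ("Answer using only the facts below.\n"
                f"Facts:\n{facts}\n"
                f"Question: {query}")

    if __name__ == "__main__":
        query = "What does aspirin interact with?"
        prompt = build_prompt(query, retrieve_triples(query, KNOWLEDGE_GRAPH))
        print(prompt)  # This grounded prompt would be sent to the language model.

In this pattern the model answers from retrieved structured facts rather than from its parametric memory alone, which is the mechanism by which the abstract attributes reduced hallucination to Retrieval Augmented Generation.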

Page Count

53

Department or Program

Department of Computer Science and Engineering

Year Degree Awarded

2024

ORCID ID

0009-0006-1064-141X
