Publication Date
2021
Document Type
Thesis
Committee Members
Nikolaos G. Bourbakis, Ph.D. (Advisor); Euripides G.M. Petrakis, Ph.D. (Committee Member); Soon M. Chung, Ph.D. (Committee Member)
Degree Name
Master of Science (MS)
Abstract
A large proportion of documents in scientific and engineering disciplines include mathematical formulas, algorithms, or both. In exploring the mathematical formulas in technical documents, we focus on the associations among mathematical operations, their syntactic correctness, and the mapping of these components into attributed graphs and Stochastic Petri Nets (SPN). We also introduce a formal language to generate mathematical formulas and evaluate their syntactic correctness. The main contribution of this work is the automatic segmentation of mathematical documents for the parsing and analysis of detected algorithmic components. To achieve this, we present a synergy of methods, including string parsing according to mathematical rules, formal language modeling, optical analysis of technical documents in the form of images, structural analysis of text in images, and mapping to graphs and Stochastic Petri Nets. Finally, for the recognition of algorithms, we enrich our rule-based model with machine learning techniques to improve results.
Page Count
127
Department or Program
Department of Computer Science and Engineering
Year Degree Awarded
2021
Copyright
Copyright 2021, all rights reserved. My ETD will be available under the "Fair Use" terms of copyright law.