Document Type
Conference Proceeding
Publication Date
6-2011
Abstract
This paper describes a minimally guided approach to automatic domain model creation. The first step is to carve an area of interest out of the Wikipedia hierarchy based on a simple query or other starting point. The second step is to connect the concepts in this domain hierarchy with named relationships. A starting point is provided by Linked Open Data, such as DBpedia. Based on these community-generated facts, we train a pattern-based fact-extraction algorithm to augment a domain hierarchy with previously unknown relationship occurrences. The algorithm learns pattern vectors that represent occurrences of relationships between concepts. The process described can be fully automated, and the number of relationships that can be learned grows as the community adds more information. Unlike approaches that aim to find single, highly indicative patterns, we use the cumulative score of many pattern occurrences to increase extraction recall. The relationship identification process itself is based on positive-only classification of training facts.
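The idea of scoring by many pattern occurrences, rather than by one highly indicative pattern, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy facts, the textual patterns, and the normalized-frequency weighting are all assumptions made for illustration. Seed facts stand in for community-generated facts such as those in DBpedia, and only positive examples are used for training.

```python
from collections import Counter, defaultdict


def learn_pattern_vectors(seed_facts, occurrences):
    """Build one pattern vector per relationship from positive examples only.

    seed_facts:  {relation: set of (subject, object)} known facts (hypothetical data).
    occurrences: list of (subject, pattern, object) observations from text.
    The weighting (normalized pattern frequency) is an assumption of this sketch.
    """
    pair_to_relations = defaultdict(set)
    for relation, pairs in seed_facts.items():
        for pair in pairs:
            pair_to_relations[pair].add(relation)

    # Count how often each pattern co-occurs with a known fact of each relation.
    counts = defaultdict(Counter)
    for subj, pattern, obj in occurrences:
        for relation in pair_to_relations.get((subj, obj), ()):
            counts[relation][pattern] += 1

    vectors = {}
    for relation, counter in counts.items():
        total = sum(counter.values())
        vectors[relation] = {p: c / total for p, c in counter.items()}
    return vectors


def score_pair(vectors, pair_patterns):
    """Accumulate the weights of every pattern occurrence seen for one concept pair."""
    return {
        relation: sum(vector.get(p, 0.0) for p in pair_patterns)
        for relation, vector in vectors.items()
    }


# Toy training data (hypothetical).
seed_facts = {
    "capital_of": {("Paris", "France"), ("Berlin", "Germany")},
    "located_in": {("Lyon", "France")},
}
occurrences = [
    ("Paris", "X is the capital of Y", "France"),
    ("Paris", "X , capital of Y", "France"),
    ("Berlin", "X is the capital of Y", "Germany"),
    ("Lyon", "X is a city in Y", "France"),
    ("Lyon", "X , Y", "France"),
]

vectors = learn_pattern_vectors(seed_facts, occurrences)

# Patterns observed for a concept pair that is not among the seed facts.
candidate_patterns = ["X is the capital of Y", "X , capital of Y", "X , Y"]
scores = score_pair(vectors, candidate_patterns)
best_relation = max(scores, key=scores.get)
print(best_relation, scores)
```

In this sketch no single pattern decides the outcome: each observed pattern contributes its weight, so a pair supported by several weaker patterns can still be assigned a relationship, which is the recall argument made in the abstract.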
Repository Citation
Thomas, C., Mehra, P., Wang, W., Sheth, A. P., Weikum, G., & Chan, V. (2011). Automatic Domain Model Creation Using Pattern-Based Fact Extraction.
https://corescholar.libraries.wright.edu/knoesis/1001
Included in
Bioinformatics Commons, Communication Technology and New Media Commons, Databases and Information Systems Commons, OS and Networks Commons, Science and Technology Studies Commons
Comments
Submitted to the Sixth International Conference on Knowledge Capture, Banff, Alberta, Canada, June 25-29, 2011.