Document Type


Publication Date



Discovering patterns in graphs has long been an area of interest. In most contemporary approaches to such pattern discovery either quantitative anomalies or frequency of substructure is used to measure the interestingness of a pattern. In this paper we address the issue of discovering informative sub-graphs within RDF graphs. We motivate our work with an example related to Semantic Search. A user might pose a question of the form: ' What are the most relevant ways in which entity X is related to entity Y?' the response to which is a subgraph connecting X to Y. Relevance of the discovered subgraph therefore will depend on the amount of useful information conveyed to the user. This in turn depends on the meaning of the edges in the subgraph. We introduce heuristics that guide a discovery algorithm away from banal paths towards more in-formative ones. This guidance is based on weighting mechanisms (driven by edge semantics) for the edges in the RDF graph. We present an analysis of the quality of the subgraphs generated with respect to path ranking metrics. We then conclude presenting intuitions about which of our weighting schemes and heuristics produce higher quality subgraphs.


University of Georgia, Athens, Department of Computer Science Technical Report 05-001