Semantic Heterogeneity in Global Information Systems: The Role of Metadata, Context and Ontologies

Semantic heterogeneity has been identified as one of the most important and most difficult problems in achieving interoperability and cooperation among multiple databases. It was first studied in the context of exchanging, sharing and integrating data, especially during the schema/view analysis phase of schema or view integration, or when writing a view or query in a multidatabase language. With the advent of global interconnectivity, we must now deal with more heterogeneous information resources comprising a variety of digital data, and the scale of the problem has grown from a few databases to millions of information resources, making it more important than ever to address. It is also recognized that the problem has become harder, and that simplistic solutions involving only the representational or structural components of data will not work beyond a very restricted set of cases.

In this chapter, we explore approaches to tackling the semantic heterogeneity problem in the context of Global Information Systems (GIS), which are systems geared to handle information requests on the Global Information Infrastructure (GII). These approaches are based on the capture and representation of metadata, contexts and ontologies. To handle information overload, it is advantageous to abstract out the representational details of the underlying data and capture the information content using domain-specific metadata. The next important step is understanding the context of the query, using metadata to construct that context, and identifying the data relevant to it. Another critical issue is that different vocabularies are used to characterize similar information. We present an approach that deals with this problem at the metadata/context level by using terms from domain-specific ontologies to construct the metadata and contexts. We address semantic heterogeneity at this level and propose an approach that uses terminological relationships between ontology terms to achieve semantic interoperability.
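As a rough illustration of how terminological relationships might bridge two vocabularies, the following minimal sketch translates a term from one ontology into terms of another. Everything in it is an assumption made for illustration: the relationship kinds (synonym, hyponym, hypernym), the `Relationship` type, the `translate` helper, the sample terms, and the preference order are hypothetical and are not taken from the chapter.

```python
from dataclasses import dataclass

# Hypothetical relationship kinds between a source-ontology term
# and a target-ontology term (assumed for this sketch).
SYNONYM = "synonym"    # target means the same as the source term
HYPONYM = "hyponym"    # target is narrower than the source term
HYPERNYM = "hypernym"  # target is broader than the source term


@dataclass(frozen=True)
class Relationship:
    source: str
    kind: str
    target: str


# Made-up inter-ontology relationships, for illustration only.
RELATIONSHIPS = [
    Relationship("crop", SYNONYM, "agricultural-product"),
    Relationship("land-cover", HYPONYM, "forest"),
    Relationship("land-cover", HYPONYM, "cropland"),
    Relationship("county", HYPERNYM, "region"),
]


def translate(term, relationships=RELATIONSHIPS):
    """Map `term` to target-ontology terms.

    Assumed preference order: synonyms first (no change in meaning),
    then hyponyms (narrower terms, so some answers may be missed),
    then hypernyms (broader terms, so some answers may be irrelevant).
    Returns (kind, targets), or (None, []) if no relationship is known.
    """
    for kind in (SYNONYM, HYPONYM, HYPERNYM):
        targets = [r.target for r in relationships
                   if r.source == term and r.kind == kind]
        if targets:
            return kind, targets
    return None, []
```

Returning the relationship kind alongside the translated terms lets a caller judge how faithful the translation is: a synonym preserves the meaning of the request, whereas a narrower or broader substitute changes which data would be considered relevant.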

