Conferences in Research and Practice in Information Technology
  

Online Version - Last Updated - 20 Jan 2012

 

 
 
    

A Unifying Semantic Distance Model for Determining the Similarity of Attribute Values

Roddick, J.F., Hornsby, K. and de Vries, D.

    The relative difference between two data values is of interest in a number of application domains including temporal and spatial applications, schema versioning, data warehousing (particularly data preparation), internet searching, validation and error correction, and data mining. Moreover, consistency across systems in determining such distances and the robustness of such calculations are essential in some domains and useful in many. Despite this, there is no generally adopted approach to determining such distances and no accommodation of distance within SQL or any commercially available DBMS. For non-numeric data values, calculating the difference between values often requires application-specific support, but even for numeric values the practical distance between two values may not simply be their numeric difference or Euclidean distance. In this paper, a model of semantic distance is developed in which a graph-based approach is used to quantify the distance between two data values. The approach facilitates a notion of distance, both as a simple traversal distance and as weighted arcs. Transition costs, as an additional expense of passing through a node, are also accommodated. Furthermore, multiple distance measures can be incorporated, and a method of 'localisation' is discussed which allows relevant information to take precedence over less relevant information. Some results from our investigations, including our SQL-based implementation, are presented.
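
    The graph-based distance described above can be illustrated with a minimal sketch. This is not the authors' SQL-based implementation; it assumes the distance between two values is the cost of the cheapest path between them, where each arc has a weight and each intermediate node adds a transition cost. The function name, the colour values and all of the costs below are hypothetical, chosen only for illustration.

```python
import heapq


def semantic_distance(arcs, transition_cost, source, target):
    """Return the cheapest path cost from source to target.

    arcs: dict mapping a node to a list of (neighbour, arc_weight) pairs.
    transition_cost: dict mapping a node to the cost of passing through it.
        It is charged only for intermediate nodes, not for the endpoints.
    All weights and costs are assumed to be non-negative.
    """
    if source == target:
        return 0.0
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        # Leaving an intermediate node incurs its transition cost.
        through = 0.0 if node == source else transition_cost.get(node, 0.0)
        for neighbour, weight in arcs.get(node, []):
            candidate = dist + through + weight
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return float("inf")  # no path: the values are unrelated in this graph


# Hypothetical graph of colour values with weighted, undirected arcs.
colour_arcs = {
    "red": [("orange", 1.0), ("yellow", 3.0)],
    "orange": [("red", 1.0), ("yellow", 1.0)],
    "yellow": [("orange", 1.0), ("red", 3.0)],
}

# Passing through "orange" is cheap here, so the indirect path wins.
cheap = semantic_distance(colour_arcs, {"orange": 0.5}, "red", "yellow")

# A higher transition cost makes the direct arc the shorter route.
costly = semantic_distance(colour_arcs, {"orange": 2.0}, "red", "yellow")
```

    Setting every arc weight to 1 and every transition cost to 0 reduces this sketch to a simple traversal distance, that is, the number of arcs on the shortest path.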
Cite as: Roddick, J.F., Hornsby, K. and de Vries, D. (2003). A Unifying Semantic Distance Model for Determining the Similarity of Attribute Values. In Proc. Twenty-Sixth Australasian Computer Science Conference (ACSC2003), Adelaide, Australia. CRPIT, 16. Oudshoorn, M. J., Ed. ACS. 111-118.
 

 

© Copyright Australian Computer Society Inc. 2001-2014.
Comments should be sent to the webmaster at crpit@scem.uws.edu.au.
This page last updated 16 Nov 2007