Conferences in Research and Practice in Information Technology
  

Online Version - Last Updated - 20 Jan 2012

 

 
Automatically Generated Consumer Health Metadata Using Semantic Spaces

Chen, G., Warren, J.R. and Evans, J.

    The continual growth of the World Wide Web presents the (also growing) population of health information seekers with the challenge of finding reliable information that is appropriate to their needs. Metadata about consumer health websites can provide a guide for end users and domain-specific search tools. In this paper we present and demonstrate a method for automatically inferring a non-trivial metadata attribute that has been encoded for breast cancer websites: whether the site is 'medical' or 'supportive' in tone. We induce decision trees to distinguish Medical vs. Supportive sites based on feature vectors of word co-occurrence patterns, founded in a semantic space model called Hyperspace Analog to Language (HAL). We achieve 82% (95% CI: 74% to 91%) classification accuracy. This should already be a useful capability for human metadata coders or to support on-the-fly queries, and it inspires us to further investigate metadata classifiers based on HAL features.
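As a rough illustration of the kind of word co-occurrence features the abstract refers to, the following minimal sketch builds a HAL-style weighted co-occurrence table with a sliding window, then concatenates a word's row (preceding context) and column (following context) into a feature vector. The function names, the window size, and the toy vocabulary are assumptions made for illustration. The abstract does not state the parameters used in the paper, and the decision-tree induction step is not shown.

```python
from collections import defaultdict


def hal_matrix(tokens, window=5):
    """Build HAL-style weighted co-occurrence counts (illustrative sketch).

    weights[(target, context)] accumulates (window - distance + 1) for each
    context word found `distance` positions before the target word, so
    closer words contribute larger weights.
    """
    weights = defaultdict(int)
    for i, target in enumerate(tokens):
        for distance in range(1, window + 1):
            j = i - distance
            if j < 0:
                break  # ran off the start of the text
            weights[(target, tokens[j])] += window - distance + 1
    return weights


def feature_vector(weights, word, vocabulary):
    """Concatenate the row (words preceding `word`) and the column
    (words following `word`) over a fixed vocabulary (hypothetical helper)."""
    row = [weights.get((word, v), 0) for v in vocabulary]
    col = [weights.get((v, word), 0) for v in vocabulary]
    return row + col


# Toy usage: window size 2 is an arbitrary choice for readability.
tokens = "breast cancer support group".split()
w = hal_matrix(tokens, window=2)
vec = feature_vector(w, "cancer", ["breast", "support", "group"])
```

Vectors of this kind, computed for selected words on each site, could then be passed to any decision-tree learner as input features.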
Cite as: Chen, G., Warren, J.R. and Evans, J. (2008). Automatically Generated Consumer Health Metadata Using Semantic Spaces. In Proc. Second Australasian Workshop on Health Data and Knowledge Management (HDKM 2008), Wollongong, NSW, Australia. CRPIT, 80. Warren, J. R., Yu, P., Yearwood, J. and Patrick, J. D., Eds. ACS. 9-15.
 

 

© Copyright Australian Computer Society Inc. 2001-2014.
This page last updated 16 Nov 2007