Conferences in Research and Practice in Information Technology
  

Online Version - Last Updated - 20 Jan 2012

 

 
 
    

Sensor Fusion Weighting Measures in Audio-Visual Speech Recognition

Lewis, T.W. and Powers, D.M.W.

    Audio-Visual Speech Recognition (AVSR) uses vision to enhance speech recognition but also introduces the problem of how to fuse the two signals. Mainstream research achieves this using a weighted product of the outputs of the phoneme classifiers for both modalities. This paper analyses current weighting measures and compares them to several new measures proposed by the authors. Most importantly, when calculating the dispersion of the output there is a shift from analysing the variance to analysing the skewness of the distribution. Experiments in AVSR using neural networks raise questions about the utility of such measures, with some intriguing results.
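
    The weighted-product fusion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy posterior vectors, and in particular the rule that turns a dispersion measure into a fusion weight (`reliability_weight`) are assumptions made here for illustration, since the abstract does not define the measures themselves. The sketch only shows how a variance-based and a skewness-based dispersion could each drive the exponent of a weighted product over two classifiers' phoneme outputs.

```python
def variance(p):
    """Population variance of a classifier's output vector."""
    mean = sum(p) / len(p)
    return sum((x - mean) ** 2 for x in p) / len(p)


def skewness(p):
    """Population skewness (third standardized moment); 0.0 for flat output."""
    mean = sum(p) / len(p)
    var = variance(p)
    if var == 0:
        return 0.0
    third = sum((x - mean) ** 3 for x in p) / len(p)
    return third / var ** 1.5


def reliability_weight(audio, visual, measure):
    """Hypothetical rule: the audio weight is the audio share of total dispersion.

    Negative dispersion values are clipped to zero; this mapping is an
    assumption, not a measure taken from the paper.
    """
    r_audio = max(measure(audio), 0.0)
    r_visual = max(measure(visual), 0.0)
    if r_audio + r_visual == 0:
        return 0.5  # no evidence either way: weight both modalities equally
    return r_audio / (r_audio + r_visual)


def fuse(audio, visual, weight):
    """Weighted product p_a^w * p_v^(1 - w), renormalized to sum to 1."""
    scores = [a ** weight * v ** (1.0 - weight) for a, v in zip(audio, visual)]
    total = sum(scores)
    return [s / total for s in scores]


# Toy posteriors over four phoneme classes (made-up numbers).
audio_out = [0.7, 0.1, 0.1, 0.1]   # peaked: the audio classifier is confident
visual_out = [0.3, 0.3, 0.2, 0.2]  # flat: the visual classifier is unsure

w_var = reliability_weight(audio_out, visual_out, variance)
w_skew = reliability_weight(audio_out, visual_out, skewness)
fused_var = fuse(audio_out, visual_out, w_var)
fused_skew = fuse(audio_out, visual_out, w_skew)
```

    In this toy case both measures favour the peaked audio output, so the fused decision follows the audio classifier; whether such measures help on real data is exactly the question the paper raises.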
Cite as: Lewis, T.W. and Powers, D.M.W. (2004). Sensor Fusion Weighting Measures in Audio-Visual Speech Recognition. In Proc. Twenty-Seventh Australasian Computer Science Conference (ACSC2004), Dunedin, New Zealand. CRPIT, 26. Estivill-Castro, V., Ed. ACS. 305-314.
 

 

© Copyright Australian Computer Society Inc. 2001-2014.
Comments should be sent to the webmaster at crpit@scem.uws.edu.au.
This page last updated 16 Nov 2007