Conferences in Research and Practice in Information Technology
Audio-Visual Speaker Verification using Continuous Fused HMMs

Dean, D., Sridharan, S. and Wark, T.

This paper examines audio-visual speaker verification using a novel adaptation of fused hidden Markov models, in comparison to output fusion of individual classifiers in the audio and video modalities. A comparison of hidden Markov model (HMM) and Gaussian mixture model (GMM) classifiers in each modality under output fusion shows that the choice of audio classifier is more important than the choice of video classifier. Although temporal information allows an HMM to outperform a GMM individually in video, this temporal information does not carry through to output fusion with an audio classifier, where the difference between the two video classifiers is minor. An adaptation of fused hidden Markov models, designed to be more robust to within-speaker variation, is used to show that the temporal relationship between video observations and audio states can be harnessed to reduce errors in audio-visual speaker verification when compared to output fusion.
Cite as: Dean, D., Sridharan, S. and Wark, T. (2006). Audio-Visual Speaker Verification using Continuous Fused HMMs. In Proc. HCSNet Workshop on the Use of Vision in Human-Computer Interaction (VisHCI 2006), Canberra, Australia. CRPIT, 56. Goecke, R., Robles-Kelly, A. and Caelli, T., Eds. ACS. 87-92.
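
The abstract contrasts output fusion of separate audio and video classifiers with the fused-HMM approach. The paper itself does not give an implementation; the following is a minimal sketch of what score-level output fusion means here, assuming a simple linear weighting of per-utterance log-likelihood-ratio scores. The weight, threshold, and score values are illustrative only and are not taken from the paper.

```python
import numpy as np

def fuse_scores(audio_llr, video_llr, audio_weight=0.7):
    """Weighted-sum (output) fusion of per-utterance log-likelihood-ratio
    scores produced independently by an audio and a video classifier."""
    audio_llr = np.asarray(audio_llr, dtype=float)
    video_llr = np.asarray(video_llr, dtype=float)
    return audio_weight * audio_llr + (1.0 - audio_weight) * video_llr

def verify(fused_score, threshold=0.0):
    """Accept the claimed identity when the fused score clears the threshold."""
    return fused_score >= threshold

# Example: three test utterances scored against a claimed speaker model.
# The scores stand in for GMM- or HMM-based log-likelihood ratios.
audio_scores = [1.8, -0.4, 0.9]   # audio-modality scores (hypothetical)
video_scores = [0.6, -1.1, -0.2]  # video-modality scores (hypothetical)

fused = fuse_scores(audio_scores, video_scores, audio_weight=0.7)
print(fused)          # fused scores per utterance
print(verify(fused))  # accept/reject decisions
```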