Conferences in Research and Practice in Information Technology
  

Online Version - Last Updated - 20 Jan 2012

 

 

Voiceless Speech Recognition Using Dynamic Visual Speech Features

Yau, W.C., Kumar, D.K. and Arjunan, S.P.

    This paper describes a voiceless speech recognition technique that uses dynamic visual features to represent facial movements during phonation. The dynamic features, extracted from video of the mouth, are used to classify utterances without any acoustic data. The audio signals of consonants are more easily confused than those of vowels, whereas the facial movements involved in pronouncing consonants are more discernible; this paper therefore focuses on identifying consonants from visual information. It adopts a visual speech model that categorizes utterances into sequences of the smallest visually distinguishable units, known as visemes, and is based on the viseme model of the Moving Picture Experts Group 4 (MPEG-4) standard. The facial movements are segmented from the video data using motion history images (MHI). An MHI is a spatio-temporal template (a grayscale image) generated from the video data by accumulative image subtraction. The proposed approach combines the discrete stationary wavelet transform (SWT) with Zernike moments to extract rotation-invariant features from the MHI. A feedforward multilayer perceptron (MLP) neural network classifies the features based on the patterns of visible facial movements. Preliminary experimental results indicate that the proposed technique is suitable for recognizing English consonants.
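
    The motion history image described in the abstract can be illustrated with a minimal sketch. This is one common formulation of accumulative image subtraction, not necessarily the exact procedure used in the paper: consecutive grayscale frames are subtracted, the difference is binarised, and each moving pixel is stamped with the index of the most recent frame in which it moved. The function name `motion_history_image`, the `threshold` value, and the scaling to 0-255 are assumptions made for illustration.

```python
import numpy as np


def motion_history_image(frames, threshold=25):
    """Build a grayscale MHI from a stack of frames shaped (N, H, W).

    Illustrative sketch only: `threshold` and the 0-255 scaling are
    assumptions, not values taken from the paper.
    """
    frames = np.asarray(frames, dtype=np.int32)
    if frames.ndim != 3 or frames.shape[0] < 2:
        raise ValueError("expected at least two grayscale frames, shape (N, H, W)")

    n = frames.shape[0]
    mhi = np.zeros(frames.shape[1:], dtype=np.int32)

    for t in range(1, n):
        # Binarised difference of consecutive frames marks moving pixels.
        moving = np.abs(frames[t] - frames[t - 1]) >= threshold
        # Later motion overwrites earlier motion with a larger time stamp.
        mhi[moving] = t

    # Scale so that motion in the final frame pair maps to intensity 255.
    return (mhi * 255 // (n - 1)).astype(np.uint8)
```

    In the resulting image, brighter pixels correspond to more recent movement, so a single grayscale template summarises both where and when the mouth moved during the utterance. Feature extraction (SWT and Zernike moments) and MLP classification would operate on this template and are not shown here.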
Cite as: Yau, W.C., Kumar, D.K. and Arjunan, S.P. (2006). Voiceless Speech Recognition Using Dynamic Visual Speech Features. In Proc. HCSNet Workshop on the Use of Vision in Human-Computer Interaction, (VisHCI 2006), Canberra, Australia. CRPIT, 56. Goecke, R., Robles-Kelly, A. and Caelli, T., Eds. ACS. 93-101.

© Copyright Australian Computer Society Inc. 2001-2014.
This page last updated 16 Nov 2007