Learning Models for English Speech Recognition

Xie, H., Andreae, P., Zhang, M. and Warren, P.

    This paper reports on an experiment to determine the optimal parameters for a speech recogniser that is part of a computer-aided instruction system for learners of English as a Second Language. The recogniser uses Hidden Markov Model (HMM) technology. To find the best choice of parameters for the recogniser, an exhaustive experiment with 2370 combinations of parameters was performed on a data set of 1119 different English utterances produced by 6 female adults. The experiment was carried out on a client-server computer network. The experimental results show a clear preference for certain sets of parameters. An analysis of the results also identified some of the causes of errors, and the paper proposes two approaches to reducing them.
Cite as: Xie, H., Andreae, P., Zhang, M. and Warren, P. (2004). Learning Models for English Speech Recognition. In Proc. Twenty-Seventh Australasian Computer Science Conference (ACSC2004), Dunedin, New Zealand. CRPIT, 26. Estivill-Castro, V., Ed. ACS. 323-329.