Ramachandrula, Sitaram and Thippur, Sreenivas (1997) Connected phoneme HMMs with implicit duration modelling for better speech recognition. In: International Conference on Information, Communications and Signal Processing, ICICS, 9-12 September, Singapore, pp. 1024-1028.
The duration of speech units is an important cue in speech recognition, but most current HMM-based speech recognizers do not use this durational information satisfactorily. Previously, duration was incorporated into HMM-based systems by modifying the state duration modelling ability of the HMM, but this approach has some limitations. We propose a better way of using the duration of speech units in HMM-based systems: the implicit duration modelling ability of the whole HMM is exploited. Connected phoneme HMMs with implicit duration modelling are proposed as better word models than those trained using whole-word speech. In a speaker-independent, isolated-word recognition experiment with confusable words in the vocabulary, word HMMs formed by concatenating phoneme HMMs with duration modelling improved word recognition accuracy by 7-8% compared to word HMMs trained using whole-word speech.
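The idea of a whole HMM modelling duration implicitly can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the model topology, so the code assumes a strict left-to-right chain with no skip transitions, and the self-loop probabilities and the names `state_duration_pmf`, `chain_duration_pmf` and `concatenate_phonemes` are hypothetical. Under that assumption a single state has a geometric duration distribution, while the duration of a chain of states is the convolution of the per-state distributions, and a word model built from phoneme models is simply the longer chain.

```python
import numpy as np


def state_duration_pmf(self_loop, max_len):
    """Geometric duration pmf of one state: P(d) = a**(d-1) * (1 - a), d >= 1.

    Index d of the returned array holds P(d); index 0 is always 0.
    """
    pmf = np.zeros(max_len + 1)
    d = np.arange(1, max_len + 1)
    pmf[1:] = self_loop ** (d - 1) * (1.0 - self_loop)
    return pmf


def chain_duration_pmf(self_loops, max_len):
    """Duration pmf of a strict left-to-right chain (assumed topology).

    The total duration is the sum of independent per-state durations,
    so its pmf is the convolution of the per-state geometric pmfs,
    truncated to max_len frames.
    """
    total = np.zeros(max_len + 1)
    total[0] = 1.0  # no frames consumed before the first state
    for a in self_loops:
        total = np.convolve(total, state_duration_pmf(a, max_len))[: max_len + 1]
    return total


def concatenate_phonemes(phoneme_models):
    """Form a word-level chain by joining phoneme chains end to end."""
    return [a for model in phoneme_models for a in model]


# Hypothetical phoneme models, each a list of per-state self-loop probabilities.
phoneme_a = [0.5, 0.5, 0.5]
phoneme_b = [0.8, 0.8]
word = concatenate_phonemes([phoneme_a, phoneme_b])

word_pmf = chain_duration_pmf(word, max_len=400)
frames = np.arange(len(word_pmf))
print("most likely word duration (frames):", int(np.argmax(word_pmf)))
print("mean word duration (frames):", float(np.sum(frames * word_pmf)))
```

In this sketch the word chain cannot be shorter than its number of states, and its duration distribution is peaked rather than monotonically decaying, which is the sense in which the model as a whole constrains duration even though no state has an explicit duration model.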
|Item Type:||Conference Paper|
|Additional Information:||© 1997 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.|
|Department/Centre:||Division of Electrical Sciences > Electrical Communication Engineering|
|Date Deposited:||25 Aug 2008|
|Last Modified:||19 Sep 2010 04:35|