Ramasubramanian, V and Sreenivas, TV (2004) Automatically Derived Units for Segment Vocoders. In: 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '04), 17-21 May, Quebec, Canada, Vol. 1, 473-476.
Segment vocoders play a special role in very low bitrate speech coding, achieving intelligible speech at bitrates of 300 bits/sec. In this paper, we explore the definition and use of automatically derived units for segment quantization in segment vocoders. We consider three automatic segmentation techniques, namely spectral transition measures (STM), unconstrained maximum-likelihood (ML) segmentation, and duration-constrained ML segmentation, for defining diphone-like and phone-like units. We show that the ML segmentations yield phone-like units that are significantly better than those obtained by STM, both in match accuracy against the TIMIT phone segmentation and in actual vocoder performance measured by segmental SNR. Moreover, the phone-like units of the ML segmentations also outperform the diphone-like units obtained using STM in early vocoders. We also show that the segment vocoder can operate at very high intelligibility when used in single-speaker mode.
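The abstract reports vocoder performance as segmental SNR but does not say how the paper computes it. As a rough illustration only, the sketch below uses the generic textbook definition: the signal is cut into fixed-length frames, an SNR in dB is computed per frame, and the per-frame values are averaged. The function name `segmental_snr`, the frame length of 160 samples, and the choice to skip silent frames are all assumptions made for this example, not details taken from the paper, which may well apply the measure to a different signal representation.

```python
import math


def segmental_snr(reference, decoded, frame_len=160, eps=1e-12):
    """Mean per-frame SNR in dB between a reference and a decoded signal.

    Hypothetical helper: frame_len and the silent-frame rule are
    illustrative assumptions, not values from the paper.
    """
    n = min(len(reference), len(decoded))
    frame_snrs = []
    # Walk over non-overlapping full frames; a trailing partial frame is ignored.
    for start in range(0, n - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal = sum(r * r for r in ref)
        noise = sum((r - d) ** 2 for r, d in zip(ref, dec))
        if signal <= eps:
            continue  # skip frames with no reference energy
        # Floor the noise energy so a perfect frame does not divide by zero.
        frame_snrs.append(10.0 * math.log10(signal / max(noise, eps)))
    if not frame_snrs:
        raise ValueError("no non-silent full frames to score")
    return sum(frame_snrs) / len(frame_snrs)


if __name__ == "__main__":
    ref = [1.0] * 320
    dec = [0.9] * 320
    print(round(segmental_snr(ref, dec), 3))
```

Averaging in the dB domain, rather than computing one SNR over the whole signal, keeps loud frames from dominating the score, which is the usual reason this measure is preferred for speech.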
|Item Type:||Conference Paper|
|Additional Information:||©1990 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.|
|Department/Centre:||Division of Electrical Sciences > Electrical Communication Engineering|
|Date Deposited:||21 Dec 2005|
|Last Modified:||19 Sep 2010 04:22|