Sarkar, Anindya and Sreenivas, TV (2005) Automatic Speech Segmentation Using Average Level Crossing Rate Information. In: IEEE International Conference on Acoustics, Speech, and Signal Processing: ICASSP '05, 18-23 March, 2005, Philadelphia, PA, USA, Vol.1, 397-400.
We explore new methods of determining automatically derived units for classification of speech into segments. For detecting signal changes, temporal features are more reliable than the standard feature vector domain methods, since both magnitude and phase information are retained. Motivated by auditory models, we present a method based on the average level crossing rate (ALCR) of the signal to detect significant temporal changes in the signal. The technique uses an adaptive level allocation scheme that allocates levels according to the signal pdf and SNR. We compare the segmentation performance to manual phonemic segmentation and to that provided by Maximum Likelihood (ML) segmentation for 100 TIMIT sentences. The ALCR method matches the best segmentation performance without requiring a priori knowledge of the number of segments, which ML segmentation needs.
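As a rough illustration of the level crossing idea in the abstract, the following minimal sketch computes an ALCR contour over a sliding window. It is not the authors' implementation. The function names, the window and hop sizes, and the uniformly spaced levels are all assumptions made for illustration; in particular, uniform spacing stands in for the paper's adaptive, pdf- and SNR-dependent level allocation, which the abstract does not specify in detail.

```python
import math


def level_crossings(frame, level):
    """Count how many times `frame` crosses `level` (in either direction)."""
    count = 0
    prev = frame[0] - level
    for x in frame[1:]:
        cur = x - level
        # A crossing occurs when consecutive samples lie on opposite sides.
        if (prev < 0 <= cur) or (cur < 0 <= prev):
            count += 1
        prev = cur
    return count


def uniform_levels(signal, n_levels):
    """Hypothetical stand-in for adaptive allocation: evenly spaced levels
    strictly between the signal minimum and maximum."""
    lo, hi = min(signal), max(signal)
    return [lo + (hi - lo) * (k + 1) / (n_levels + 1) for k in range(n_levels)]


def alcr(signal, levels, win, hop):
    """Average level crossing rate per level and per sample, one value per window."""
    rates = []
    for start in range(0, len(signal) - win + 1, hop):
        frame = signal[start:start + win]
        total = sum(level_crossings(frame, lv) for lv in levels)
        rates.append(total / (len(levels) * win))
    return rates


# Toy signal (an assumption, not TIMIT data): a slow sinusoid followed by a
# faster one, so the temporal structure changes at sample 200.
signal = [math.sin(2 * math.pi * n / 50) for n in range(200)]
signal += [math.sin(2 * math.pi * n / 10) for n in range(200, 400)]

levels = uniform_levels(signal, 5)
rates = alcr(signal, levels, win=50, hop=50)
```

In this toy case the contour in `rates` is markedly higher for windows drawn from the second half of the signal. A jump of that kind in the contour is the sort of temporal change a segmentation scheme could mark as a candidate boundary.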
|Item Type:||Conference Paper|
|Additional Information:||Copyright 2005 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.|
|Department/Centre:||Division of Electrical Sciences > Electrical Communication Engineering|
|Date Deposited:||30 Nov 2007|
|Last Modified:||19 Sep 2010 04:35|