Murthy, A Sreenivasa and Sekhar, S Chandra and Sreenivas, TV (2007) Robust and High-resolution Voiced/Unvoiced Classification in Noisy Speech Using A Signal Smoothness Criterion. In: Interspeech Conference 2007, August 27-31, 2007, Antwerp, Belgium, pp. 2260-2263.
Full text not available from this repository.
We propose a novel technique for robust voiced/unvoiced segment detection in noisy speech, based on local polynomial regression. The local polynomial model is well suited to voiced segments of speech, whereas unvoiced segments are noise-like and do not exhibit any smooth structure. This smoothness property is used to devise a new metric, called the variance ratio metric, which, after thresholding, indicates the voiced/unvoiced boundaries with 75% accuracy at a global signal-to-noise ratio (SNR) of 0 dB. A novelty of our algorithm is that it processes the signal continuously, sample by sample rather than frame by frame. Simulation results on the TIMIT speech database (downsampled to 8 kHz) are presented for various SNRs to illustrate the performance of the new algorithm. The results indicate that the algorithm remains robust even at high noise levels.
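The abstract does not define the variance ratio metric, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes the metric can be approximated as the residual variance of a local polynomial fit divided by the variance of the windowed signal, computed once per sample. The function names `variance_ratio` and `classify_voiced`, the window half-width, the polynomial order, and the threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def variance_ratio(x, half_width=20, order=3):
    """Assumed smoothness score per sample (not the paper's definition):
    residual variance of a local polynomial fit divided by the variance
    of the samples in the same window. Low values suggest smooth structure."""
    x = np.asarray(x, dtype=float)
    width = 2 * half_width + 1
    # Design matrix of a polynomial in the normalised window position.
    t = np.linspace(-1.0, 1.0, width)
    A = np.vander(t, order + 1, increasing=True)
    # Matrix that maps a window to the residual of its least-squares fit.
    R = np.eye(width) - A @ np.linalg.pinv(A)
    # One window centred on every sample; edges are padded by reflection.
    padded = np.pad(x, half_width, mode="reflect")
    idx = np.arange(len(x))[:, None] + np.arange(width)[None, :]
    windows = padded[idx]
    residual = windows @ R.T
    # The fit includes a constant term, so the residual has zero mean.
    res_var = np.mean(residual ** 2, axis=1)
    tot_var = np.var(windows, axis=1) + 1e-12  # guard against division by zero
    return res_var / tot_var


def classify_voiced(x, threshold=0.5, **kwargs):
    """Label a sample as voiced when the assumed score is below the threshold."""
    return variance_ratio(x, **kwargs) < threshold
```

Under these assumptions, a smooth segment is fitted well by the local polynomial and scores close to 0, while a noise-like segment leaves most of its variance in the residual and scores close to 1. The window length, polynomial order, and threshold would need tuning against labelled data; none of the values here are taken from the paper.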
|Item Type:||Conference Paper|
|Additional Information:||Copyright for this article belongs to ISCA-International Speech Communication Association.|
|Keywords:||voiced; unvoiced; local polynomial model; regression; signal-to-noise ratio|
|Department/Centre:||Division of Electrical Sciences > Electrical Communication Engineering|
|Date Deposited:||08 Jan 2010 05:44|
|Last Modified:||08 Jan 2010 05:47|