Research bed for unit selection based text to speech synthesis

Sarathy, Partha K and Ramakrishnan, AG (2008) Research bed for unit selection based text to speech synthesis. In: Proc. II IEEE Spoken Language Technology (SLT) workshop.

Official URL: http://ieeexplore.ieee.org/search/srchabstract.jsp...

Abstract

The paper describes a modular, unit selection based TTS framework that can be used as a research bed for developing TTS in any new language, as well as for studying the effect of changing any parameter during synthesis. Using this framework, TTS has been developed for Tamil. The synthesis database consists of 1027 phonetically rich prerecorded sentences. The framework has already been tested for Kannada. Our TTS synthesizes intelligible and acceptably natural speech, as supported by high mean opinion scores. The framework is further optimized to suit embedded applications such as mobile phones and PDAs. We compressed the synthesis speech database with standard speech compression algorithms used in commercial GSM phones and evaluated the quality of the resulting synthesized sentences. Even with a highly compressed database, the synthesized output is perceptually close to that obtained with the uncompressed database. Through experiments, we explored the ambiguities in human perception when listening to Tamil phones and syllables uttered in isolation, and we propose exploiting these misperceptions to substitute for missing phone contexts in the database. Listening experiments have been conducted on sentences synthesized by deliberately replacing phones with the ones they are confused with.

Item Type: Conference Paper
Additional Information: Copyright 2009 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Keywords: speech synthesis; speech codecs; intelligibility; naturalness; perception
Department/Centre: Division of Electrical Sciences > Electrical Engineering
Date Deposited: 20 Sep 2011 08:35
Last Modified: 20 Sep 2011 08:35
URI: http://eprints.iisc.ernet.in/id/eprint/40538
