Text To Speech Synthesis Systems for Indian Languages (TTS-IL)
1. Project Overview
This project aims to develop text-to-speech (TTS) synthesis systems for six Indian languages: Hindi, Bengali, Marathi, Telugu, Tamil and Malayalam. It is sponsored by the Department of Information Technology, MCIT, Government of India, with effect from April 2009. It is to be executed by a consortium of academic and research institutions and industry partners. The members of this TTS consortium are:
- IIT Madras (coordination)
- C-DAC Mumbai
- C-DAC Thiruvananthapuram
- IIIT Hyderabad
- IIT Kharagpur
- STQC DIT
C-DAC Mumbai is primarily responsible for the vertical task of developing the system for Marathi.
2. Background Information
The proliferation of the internet and telecom has brought a new dimension to information access. Palmtops and cell phones now come with internet access, allowing users to receive information from anywhere across the globe. Cell phones, however, have very small displays that make reading difficult. Text-to-speech synthesis systems can go a long way towards easing access, since information can be read aloud to the user. TTS systems can also be extremely useful in helping the visually challenged join mainstream society.
Text-to-speech synthesis systems are available for a number of languages across the globe. These systems have become part of screen readers for the visually challenged (http://www.freedomscientific.com). A number of different TTS systems have been developed for Indian languages (by institutions like IIT Madras, IIIT Hyderabad, C-DAC Noida, C-DAC Kolkata, C-DAC Thiruvananthapuram, etc.), but each of these systems uses different technologies, and they do not work within a common framework. None of these systems provides a JAWS-like interface that enables a visually challenged person to use the computer with ease. Most of them suffer from shortcomings such as a foreign accent, lack of intonation and inadequate quality.
The focus of this effort is therefore to bring together the expertise across these organizations, provide a COMMON platform and interface that enables others to seamlessly integrate the synthesis systems into their products, and enhance the quality of TTS for Indian languages.
3. Expected Outcome
- A SAPI-compliant speech synthesis engine supporting Telugu, Tamil, Hindi, Bengali, Marathi and Malayalam. This will enable a user to directly integrate the engine with other applications. The input text will be Unicode, encoded as UTF-8.
- The synthesis engine will be platform independent. The engine will be developed using the Festival speech synthesis system, which works on most platforms (see http://www.cstr.ac.uk/projects/festival).
- A JAWS-like application that works on both Windows and Linux.
- Speech synthesis quality with a mean opinion score (MOS) of at least 3.
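The MOS target in the last item can be illustrated with a minimal sketch. It assumes the usual convention in which listeners rate each synthesized utterance on a 1 to 5 scale and the MOS is the arithmetic mean of those ratings. The function name and the ratings below are hypothetical and are not results from this project.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Return the arithmetic mean of listener ratings on a 1-5 scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie between 1 and 5")
    return mean(ratings)


# Hypothetical ratings from five listeners for one synthesized utterance.
ratings = [3, 4, 3, 2, 4]
mos = mean_opinion_score(ratings)

# The expected outcome asks for a MOS of at least 3.
meets_target = mos >= 3
print(f"MOS = {mos:.2f}, meets target: {meets_target}")
```

In practice the ratings would be pooled over many listeners and utterances per language before comparing against the target.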
4. Current Status
As part of this ongoing work, the team is now in the process of data collection. A prototype system using one hour of Marathi speech data has already been built.
5. Team members (for the Mumbai node)
Dr. M Sasikumar (Chief Investigator)
Bira Chandra Singh
Pranaw Kumar
Ritesh Shah