[Sugar-devel] Google summer of Code proposal: Speech synthesis in core sugar (Chirag Jain)

chirag jain chiragjain1989 at gmail.com
Sat Mar 28 10:31:34 EDT 2009


Hi!
I am an undergraduate student at Netaji Subhas Institute of
Technology, New Delhi. I am interested in the speech synthesis project
for GSoC.
My project aims to add speech synthesis to core sugar. That is, it will
provide basic functionality in sugar through which any text can be
spoken aloud. The project also aims to provide a UI for configuring
speech.
This can be a great language learning tool for children aged 6-15.

I discussed this at length with alsroot, assimd and besmac on IRC. The
main points of the discussion were:

1. The main aim for speech synthesis in sugar is to integrate speech
into core sugar.

2. Integrating speech in core sugar means providing a speech generator
as basic functionality in sugar. Thus, if any window containing text
is open in sugar, the selected text can be read out by an application
running in the background.

3. The other aim is to develop a GUI for speech configuration, which
will also act as a configuration management tool.

4. This tool can include basic facilities such as changing the volume,
pitch, voice, accent, language, etc.

5. Choosing the accent according to the locale is another important
feature that sugar aims for in speech synthesis. Espeak already
provides different accents for different languages.

6. Another nice idea, suggested by assimd, is a keyboard speaker:
whenever the user presses a key, the activity speaks it out.
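
To make points 4 and 5 a little more concrete, here is a minimal sketch
of how the configuration tool might hold the speech settings and pick a
voice from the locale. Everything in it is an assumption made for
illustration: the names SpeechSettings and voice_for_locale, the
numeric ranges, and the voice names in the example are hypothetical,
and the real set of voices would have to be queried from espeak on the
target system.

```python
from dataclasses import dataclass

# Assumed fallback voice name; not taken from any sugar or espeak source.
DEFAULT_VOICE = "en"


def voice_for_locale(locale_name, available_voices):
    """Pick a voice for a locale string such as 'en_US.UTF-8'.

    Tries 'language-country' first (e.g. 'en-us'), then the bare
    language (e.g. 'en'), then falls back to DEFAULT_VOICE.
    """
    base = locale_name.split(".")[0].split("@")[0]  # drop encoding/modifier
    if not base or base in ("C", "POSIX"):
        return DEFAULT_VOICE
    parts = base.lower().split("_")
    candidates = []
    if len(parts) > 1:
        candidates.append(parts[0] + "-" + parts[1])
    candidates.append(parts[0])
    for name in candidates:
        if name in available_voices:
            return name
    return DEFAULT_VOICE


@dataclass
class SpeechSettings:
    """Hypothetical settings record for the configuration GUI."""
    voice: str = DEFAULT_VOICE
    pitch: int = 50     # assumed range 0..99
    volume: int = 100   # assumed range 0..200

    def clamped(self):
        """Return a copy with pitch and volume forced into range."""
        return SpeechSettings(
            voice=self.voice,
            pitch=max(0, min(99, self.pitch)),
            volume=max(0, min(200, self.volume)),
        )
```

For example, with the voices {"en", "en-us", "hi"} available, the
locale "hi_IN" would map to "hi", and a locale with no matching voice
would fall back to the default.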

Some rough ideas for implementation:

7. There are two options for a layer over the TTS engine espeak: one
is speech dispatcher, which was OLPC's GSoC project last year, and the
other is the gstreamer espeak plugin.

8. Both of these use espeak. Listen and Spell uses speechd. But when I
discussed it with alsroot on IRC, he told me that using speechd is a
bad idea because it has become a system daemon and requires root
privileges to work. Therefore using the gstreamer plugin is the only
workable option, and the best one.

9. Implementing speech in sugar core is a tough task that is yet to be
worked out. One idea is to use the clipboard module, which takes care
of copy and paste in sugar. Using this module, the entire selected
text can be sent to the speech activity so that it can speak it out.

10. For the keyboard speaker, we can simply store the keystrokes in a
file and then send the file to the speech generator.

11. The basic idea is to provide a read button in core sugar (like the
home button) that is always present, so that if the user selects any
text in the current window and presses the button, it is spoken out.
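
The ideas in points 7 to 11 could be sketched roughly as below. This is
only an illustration under stated assumptions, not a worked-out design:
it assumes the gstreamer espeak plugin provides an element called
"espeak" with "text", "voice", "pitch" and "rate" properties, and the
function names and the table of key names are hypothetical. The code
only prepares the values; actually creating and playing the pipeline is
left out.

```python
# Assumed pipeline description for the gstreamer espeak plugin.
PIPELINE = "espeak name=src ! autoaudiosink"

# Hypothetical spoken names for keys that have no readable character.
KEY_NAMES = {
    " ": "space",
    "\n": "enter",
    "\t": "tab",
    "\b": "backspace",
}


def speech_request(selected_text, voice="en", pitch=0, rate=0):
    """Turn the selected (clipboard) text into values for the speech element.

    Whitespace is collapsed so that line breaks inside the selection do
    not matter. Returns None when there is nothing to speak.
    """
    text = " ".join(selected_text.split())
    if not text:
        return None
    return {"text": text, "voice": voice, "pitch": pitch, "rate": rate}


def spoken_name_for_key(key):
    """Return the word the keyboard speaker would say for one keystroke."""
    return KEY_NAMES.get(key, key)
```

If the assumptions about the plugin hold, the read button handler would
take the current selection, call speech_request(), set the returned
values as properties on the element named "src", and start the
pipeline.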

Issues:

12. How to provide karaoke-style coloring?
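
One possible starting point, assuming the speech engine can report the
character offset of the word it is currently speaking (whether the
gstreamer plugin exposes this still needs to be checked), is to
precompute the span of every word so the UI knows which range to color.
The function names here are hypothetical.

```python
import re


def word_spans(text):
    """Return (start, end) character offsets for each word in text."""
    return [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def span_at(spans, offset):
    """Return the word span containing offset, or None if between words."""
    for start, end in spans:
        if start <= offset < end:
            return (start, end)
    return None
```

The text widget would then highlight the span returned by span_at()
each time the engine reports a new offset.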

I am still working on this. If you have any ideas or suggestions,
please do reply.

Regards

Chirag

