[sugar] An Update about Speech Synthesis
Wed Feb 20 04:28:36 EST 2008
On Feb 19, 2008 4:45 PM, Samuel Klein <sj at laptop.org> wrote:
> Hemant and James,
> Can you write something about this at a [[spoken texts]] page on the
> wiki ('hear and read'? some other more creative name... )? The
> Google Literacy Project is highlighting a number of literacy efforts
> for the upcoming World Book Day, and your work would be fine
> suggestions for that list.
You can use my article,
Let me know when you have something, and I'll drop in on the Wiki page
and see if I can add anything useful to your account.
> On Feb 19, 2008 1:13 PM, Hemant Goyal <goyal.hemant at gmail.com> wrote:
> > Hi,
> > > I'd like to see an eSpeak literacy project written up -- Once we have
> > > a play button, with text highlighting, we have most of the pieces to
> > > make a great read + speak platform that can load in texts and
> > > highlight words/sentences as they are being read. Ping had a nice
> > > mental model for this a while back.
> > Great idea :). The button will soon be there :D. I had never expected this
> > to turn into something this big :). There are lots of things I want to get
> > done wrt this project and hope to accomplish them one by one.
> > > Thanks for the info Hemant! Can you tell me more about your experiences
> > > with speech dispatcher and which version you are using? The things I'm
> > > interested in are stability, ease of configuration, completeness of
> > > implementation, etc.
> > I'll try to tell whatever I am capable of explaining (I am not an expert
> > like you all :) ). Well we had initially started out with a speech-synthesis
> > DBUS API that directly connected to eSpeak. Those results are available on
> > the wiki page [http://wiki.laptop.org/go/Screen_Reader]. From that point
> > onwards we found out about speech-dispatcher and decided to analyze it for
> > our requirements primarily keeping the following things in mind:
> > - An API that provides configuration control on a per-client basis.
> > - A feature like printf(), but for speech, that developers can call. That
> >   is precisely how Free(b)soft describes its approach to speech-dispatcher.
> > - A Python interface for speech synthesis.
> > - Callbacks for developers after certain events.
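To make the "printf() for speech" requirement concrete, here is a minimal sketch in Python. It assumes a client object with a speak(text) method, as speech-dispatcher's Python bindings (speechd.SSIPClient) provide; the helper name say() is hypothetical and not part of any library.

```python
def say(client, fmt, *args):
    """Format a message printf-style and hand it to the speech client.

    `client` is assumed to expose speak(text), like speechd.SSIPClient.
    `say` itself is a hypothetical helper, not a speech-dispatcher call.
    """
    # Only apply %-formatting when arguments are given, so a literal
    # percent sign in a plain message does not raise an error.
    text = fmt % args if args else fmt
    client.speak(text)
    return text
```

With the real bindings installed, a caller would create the client with speechd.SSIPClient("my-activity") and then call say(client, "Battery at %d percent", 40).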
> > At this moment I am in a position to comment about the following:
> > Which modules to use - I found it extremely easy to configure
> > speech-dispatcher to use eSpeak as a TTS engine. There are configuration
> > files available to simply select/unselect which TTS module needs to be used.
> > I have described how an older version of speech-dispatcher can be made to
> > run on the XO here
> > http://wiki.laptop.org/go/Screen_Reader#Installing_speech-dispatcher_on_the_xo
> > There were major issues with using eSpeak with the ALSA sound system some
> > time back [http://dev.laptop.org/ticket/5769, http://dev.laptop.org/ticket/4002].
> > This issue is resolved by using speech-dispatcher, as it supports both ALSA
> > and OSS. So in case OLPC ever shifts to OSS we are safe. I am guessing
> > speech-dispatcher does not directly let a TTS engine write to a sound device
> > but instead accepts the audio buffer and then routes it to the Audio Sub
> > System.
> > Another major issue we had to tackle was providing callbacks while providing
> > the DBUS interface. The present implementation of speech-dispatcher provides
> > callbacks for various events that are important wrt speech-synthesis. I have
> > tested these out in python and they were working quite nicely. In case you
> > have not, you might be interested in checking out their Python API
> > [http://cvs.freebsoft.org/repository/speechd/src/python/speechd/client.py?hideattic=0&view=markup].
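The callback mechanism described above can be sketched roughly as follows. This assumes the client behaves like speechd.SSIPClient, whose speak() accepts callback and event_types arguments; the event names "begin" and "end" are assumed to match speechd.CallbackType.BEGIN and END, and speak_with_events() is a hypothetical wrapper written for illustration.

```python
def speak_with_events(client, text, on_event, event_types=("begin", "end")):
    """Queue `text` and forward synthesis events to `on_event`.

    Assumptions: `client.speak(text, callback=..., event_types=...)` exists,
    as in speech-dispatcher's Python bindings, and the callback receives the
    event type as its first argument.
    """
    def _callback(event_type, **kwargs):
        # Extra keyword arguments (for example an index mark) are ignored.
        on_event(event_type, text)

    client.speak(text, callback=_callback, event_types=event_types)
```

An activity could pass an on_event function that highlights the sentence on "begin" and clears the highlight on "end".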
> > Voice Configuration and language selection - The API provides us options to
> > control voice parameters such as pitch, volume, voice etc for each client.
> > Message Priorities and Queuing - speech-dispatcher provides various
> > levels of priority for speech synthesis, so we can give a higher priority
> > to a message played by Sugar than to one played by an Activity.
> > Compatibility with orca - I installed orca and used speech-dispatcher as the
> > speech synth engine. It worked fine. We wanted to make sure that the speech
> > synth server would work with orca if it was ported to XO in the future.
> > Documentation - speech-dispatcher has a lot of documentation at the moment,
> > and hence it's quite easy to find our way and figure out how to do things we
> > really want to. I had intended to explore gnome-speech as well, however the
> > lack of documentation and examples turned me away.
> > The analysis that I did was mostly from a user point of view or simple
> > developer requirements that we realized had to be fulfilled wrt
> > speech-synthesis, and it was definitely not as detailed as you probably
> > might expect from me.
> > We are presently using speech-dispatcher 0.6.6.
> > A dedicated eSpeak module has been provided in the newer versions of
> > speech-dispatcher and that is a big advantage for us. In the older version
> > eSpeak was called and various parameters were passed as command line
> > arguments, which was not very efficient on the XO.
> > Stability - I think the main point that I tested here was how well
> > speech-dispatcher responds to long strings. The latest release of
> > speech-dispatcher 0.6.6 has some tests in which an entire story is read out
> > [http://cvs.freebsoft.org/repository/speechd/src/tests/long_message.c?view=markup].
> > However I still need to run this test on the XO. I will do so once I have
> > RPM packages to install on the XO.
> > In particular speech-dispatcher is quite customizable, easily controlled
> > through programming languages, provides callback support, and has
> > specialized support for eSpeak that makes it a good option for the XO.
> > All in all speech-dispatcher is very promising for our requirements wrt XO.
> > While I am not able to project all possible problems that will come wrt
> > speech-synthesis at this stage, it is the best option that is available at
> > present as opposed to our original plans of providing a DBUS API :P. I am
> > preparing myself to possibly delve deeper and test speech-dispatcher 0.6.6
> > on the XO once its RPMs are accepted by the Fedora community. As we progress
> > I will surely find limitations of speech-dispatcher, and I will report them
> > and/or help fix them along with the Free(b)Soft team.
> > I hope you find this useful. I can try to answer more specific questions.
> > Thanks!
> > Hemant
End Poverty at a Profit by teaching children business
"The best way to predict the future is to invent it."--Alan Kay