[Sugar-devel] GSOC 2010: Speech Recognition in Sugar
christoph.derndorfer at gmail.com
Tue Apr 6 10:29:30 EDT 2010
On Sun, Apr 4, 2010 at 9:49 AM, chirag jain <chiragjain1989 at gmail.com> wrote:
> On Sat, Apr 3, 2010 at 7:37 AM, Benjamin M. Schwartz <
> bmschwar at fas.harvard.edu> wrote:
>> I think your proposal is very interesting. It contains a number of
>> different ideas. One major division is between Voice Commands and Speech
>> Recognition. Each of these contains many other possibilities. My biggest
>> suggestion is to specify further which possibilities you want to work on.
>> I recommend you schedule the _easiest_ thing first, before moving on to
>> the hard things. Most GSoC students are too ambitious and never produce
>> anything useful.
> Thanks, Benjamin, for the quick reply and for providing me with some very useful
>> Some specific ideas:
>> Voice Commands:
>> - integrate with a text-command system like Gnome Do, so that the
>> commands are accessible through the keyboard as well as the microphone. Also
>> look at Perlbox. (Note that neither Gnome Do nor Perlbox can be used
>> - integrate with GnomeVoiceControl, which already uses PocketSphinx
>> and should be highly compatible with Sugar. This could allow voice
>> control of unmodified Activities.
> I have already gone through Gnome Voice Control, which I think is the best
> option for integrating into Sugar. The reason is that it uses PocketSphinx,
> which is lightweight and thus should be compatible with devices like the
> XO-1.0. The run-time memory requirements of PocketSphinx are up to 20 MB.
> During the next few days, I will be testing the functionality of PocketSphinx
> in Sugar and familiarizing myself more with Gnome Voice Control.
>> Speech Recognition:
>> - supply text to any unmodified activity
>> - control input language easily for multilingual users
>>  http://do.davebsd.com/index.shtml
>>  http://perlbox.sourceforge.net/
>>  http://live.gnome.org/GnomeVoiceControl
> I have broken the proposal into the following parts, which should be done in
> a) My first priority this summer is to enable "Sugar Voice Control". This
> 1. Testing PocketSphinx on Sugar.
> 2. Studying Gnome Voice Control in more depth.
> 3. Sugarizing Gnome Voice Control.
> 4. A command-line interface that starts speech recognition in the
> background and begins taking "Speech Commands".
> b) After the successful implementation of Sugar Voice Control, we can then
> look into providing speech-recognized text to unmodified Sugar activities.
> Activities like Write could then take their input either from the
> keyboard or through the microphone. This includes:
> 1. Providing a speech recognition button in the Sugar frame (for example,
> on the top right-hand side) which, when clicked, automatically starts
> recognizing speech in the background. Clicking the same button again
> stops the recognition process.
> 2. A keyboard shortcut like Alt+S for starting speech recognition.
> 3. A speech recognition control panel for controlling the various parameters.
> c) The last part can be creating an API that gives activity developers
> easy access to speech recognition.
> My aim is to at least achieve part a) this summer, and if time permits I
> would also like to implement part b). Part c) can be taken care of later.
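As a rough illustration of step 4 in part a), the loop that turns recognized phrases into actions could look something like the sketch below. It assumes the recognizer (PocketSphinx or anything else) simply hands over plain-text hypotheses; the names `normalize`, `make_dispatcher` and `run`, as well as the example command phrases, are made up for illustration and are not part of Gnome Voice Control, PocketSphinx or Sugar.

```python
from typing import Callable, Dict, Iterable, List


def normalize(phrase: str) -> str:
    """Lower-case a phrase and collapse runs of whitespace."""
    return " ".join(phrase.lower().split())


def make_dispatcher(commands: Dict[str, Callable[[], None]]) -> Callable[[str], bool]:
    """Build a function that runs the action registered for a spoken phrase.

    `commands` maps a phrase (hypothetical examples: "open write") to a
    zero-argument callable. The returned function reports whether the
    phrase matched a known command.
    """
    table = {normalize(phrase): action for phrase, action in commands.items()}

    def dispatch(phrase: str) -> bool:
        action = table.get(normalize(phrase))
        if action is None:
            return False
        action()
        return True

    return dispatch


def run(hypotheses: Iterable[str], dispatch: Callable[[str], bool]) -> List[str]:
    """Feed recognizer output to the dispatcher; return phrases that matched nothing."""
    return [phrase for phrase in hypotheses if not dispatch(phrase)]


if __name__ == "__main__":
    # Stand-in for recognizer output; a real setup would read from the microphone.
    commands = {
        "open write": lambda: print("launching Write"),
        "go home": lambda: print("switching to the home view"),
    }
    leftover = run(["Open Write", "go   home", "make tea"], make_dispatcher(commands))
    print("not understood:", leftover)
```

Keeping the phrase table separate from the recognizer means the same table could later be driven from the keyboard as well, which fits Benjamin's point about making the commands reachable without the microphone.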
I just looked at your updated proposal and it's looking very good indeed.
I also think that Benjamin's comments are spot-on and so achieving (a) in
combination with supporting not only English but also Spanish (arguably the
most important language when you look at current OLPC / Sugar deployments)
would certainly be a big success and a great foundation for follow-up
e-mail: christoph at olpcnews.com