[Sugar-devel] GSOC 2010: Speech Recognition in Sugar

chirag jain chiragjain1989 at gmail.com
Tue Apr 6 11:36:58 EDT 2010


Hi Christoph,

Thanks for the encouraging words! :)

Yes, after English, creating language models for Spanish is a great idea,
so that we can cover a larger share of users. In fact, I have decided to
cover the following four languages over the summer: English, Spanish,
German and Hindi.

Although I know that there are very few Hindi users, I would still like to
implement it because it would make it easier for me to test the framework
locally.

Regards

On Tue, Apr 6, 2010 at 6:29 AM, Christoph Derndorfer <
christoph.derndorfer at gmail.com> wrote:

> On Sun, Apr 4, 2010 at 9:49 AM, chirag jain <chiragjain1989 at gmail.com> wrote:
>
>> Hi,
>>
>>
>>
>> On Sat, Apr 3, 2010 at 7:37 AM, Benjamin M. Schwartz <
>> bmschwar at fas.harvard.edu> wrote:
>>
>>> I think your proposal is very interesting.  It contains a number of
>>> different ideas.  One major division is between Voice Commands and Speech
>>> Recognition.  Each of these contains many other possibilities. My biggest
>>> suggestion is to specify further which possibilities you want to work on.
>>>  I recommend you schedule the _easiest_ thing first, before moving on to
>>> the hard things.  Most GSoC students are too ambitious and never produce
>>> anything useful.
>>>
>> Thanks, Benjamin, for the quick reply and for the very useful
>> suggestions.
>>
>>
>>> Some specific ideas:
>>>
>>> Voice Commands:
>>>  - integrate with a text-command system like Gnome Do [1], so that the
>>> commands are accessible through the keyboard as well as microphone.  Also
>>> look at Perlbox [2].  (Note that neither Gnome Do nor Perlbox can be used
>>> directly.)
>>>  - integrate with GnomeVoiceControl [3], which already uses PocketSphinx
>>> and should be highly compatible with Sugar.   This could allow voice
>>> control of unmodified Activities.
>>>
>> I have already gone through GnomeVoiceControl, which I think is the best
>> option for integrating into Sugar, because it uses PocketSphinx, which is
>> lightweight and thus should be compatible with devices like the XO-1.0.
>> The run-time memory requirements of PocketSphinx are up to 20 MB. Over the
>> next few days, I will be testing PocketSphinx in Sugar and familiarizing
>> myself further with GnomeVoiceControl.
>>
>>
>>> Speech Recognition:
>>>  - supply text to any unmodified activity
>>>  - control input language easily for multilingual users
>>>
>>> [1] http://do.davebsd.com/index.shtml
>>> [2] http://perlbox.sourceforge.net/
>>> [3] http://live.gnome.org/GnomeVoiceControl
>>>
>> I have broken the proposal into the following parts, which should be done
>> in sequence:
>>
>> a) My first priority this summer is to enable "Sugar Voice Control". This
>> includes:
>>
>> 1. Testing PocketSphinx on Sugar.
>> 2. Studying GnomeVoiceControl in more detail.
>> 3. Sugarizing GnomeVoiceControl.
>> 4. A command-line interface that starts speech recognition in the
>> background and begins taking "Speech Commands".
>>
>> b) After the successful implementation of Sugar Voice Control, we can then
>> look into providing speech-recognized text to unmodified Sugar activities.
>> Activities like Write could then take their input either from the
>> keyboard or from the microphone. This includes:
>>
>> 1. Providing a speech recognition button in the Sugar frame (for example,
>> on the top right-hand side) which, when clicked, starts recognizing speech
>> in the background. Clicking the same button again stops the recognition
>> process.
>>
>> 2. A keyboard shortcut like Alt+S for starting speech recognition.
>>
>> 3. A speech recognition control panel for controlling the various
>> parameters.
>>
>> c) The last part is creating an API that gives activity developers easy
>> access to speech recognition.
>>
>> My aim is to achieve at least part a) this summer and, if time permits, I
>> would also like to implement part b). Part c) can be taken care of later.
>>
>
> Hi,
>
> I just looked at your updated proposal and it's looking very good indeed.
>
> I also think that Benjamin's comments are spot-on and so achieving (a) in
> combination with supporting not only English but also Spanish (arguably the
> most important language when you look at current OLPC / Sugar deployments)
> would certainly be a big success and a great foundation for follow-up
> projects.
>
> Cheers,
> Christoph
>
> --
> Christoph Derndorfer
> co-editor, olpcnews
> url: www.olpcnews.com
> e-mail: christoph at olpcnews.com
>



-- 
Chirag Jain

Undergraduate Student
Netaji Subhas Institute of Technology
New Delhi