[Gsoc] Summer of Code Proposal: Furthering Speech Recognition in Sugar.

Assim Deodia assim.deodia at gmail.com
Mon Mar 23 15:03:42 EDT 2009

Hi Satya,

Your idea is indeed very good. It would be great to have speech as another
system-wide input mechanism.

On Mon, Mar 23, 2009 at 3:05 AM, satya komaragiri <
satya.komaragiri at gmail.com> wrote:

> Hello,
> I am a final year student from India. I wish to apply to GSoC this
> year by building upon my current work. I had discussed the feasibility
> and advantages of having Speech Recognition for an OLPC with the devel
> list [1] in September 2008 and have been working on it since then as a
> part of the Sarai Fellowship. My progress on that project can be
> tracked on its wiki page [2]. I am also working on a dictation
> activity under it though it might take some time as I am currently
> gathering the speech corpus spoken by children.

I see that you have already compiled Julius on the XO. Were you able to test
it, given that it only supports Japanese?

> This summer, I would like to implement a generic speech library that
> could be used by any activity so that children can interact with Sugar
> using voice rather than typing. I spoke to Mr. Assim Deodia(cc'ed in
> this mail) who developed an activity called 'Listen and Spell' in GSoC
> last year and is interested in this idea.
> I can showcase one of its potential usages by integrating speech
> capabilities to the 'Listen and Spell' activity where the child can
> spell out the word verbally. I want to let the children speak out the
> spelling rather than type it out. As the alphabet of any language is
> limited (26 in the case of English, extension to any language would
> just mean getting a few people to read out the alphabet of that
> language).

What you can do (as Sean DALY suggested) is have a specific key held down
while speaking; this, I guess, will greatly improve the efficiency of the
engine. Starting out with the recognition of just 26 letters, and with the
boundaries marked too, would not put much load on the XO, and it would
enhance the activity greatly.
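
To make the key-press idea concrete, here is a rough sketch in Python. It
is only an illustration, not how Julius or Sugar actually work: the class
name, its methods, and the `recognize` callback are all hypothetical
stand-ins for whatever the real engine and key handling would provide. The
point is that audio is collected only while the key is held, so each press
marks the boundaries of one letter, and any guess outside the 26-letter
vocabulary is dropped.

```python
import string
from typing import Callable, List, Optional, Sequence

# Restricted vocabulary: the 26 letters of the English alphabet.
ALPHABET = frozenset(string.ascii_lowercase)


class PushToTalkSpeller:
    """Collects audio only while the talk key is held, one letter per press.

    Hypothetical sketch: `recognize` stands in for the real speech engine.
    """

    def __init__(self, recognize: Callable[[Sequence[bytes]], str]) -> None:
        self._recognize = recognize
        self._frames: List[bytes] = []
        self._key_down = False
        self.letters: List[str] = []

    def key_pressed(self) -> None:
        # Start of an utterance: begin with an empty buffer.
        self._key_down = True
        self._frames = []

    def feed(self, frame: bytes) -> None:
        # Audio arriving while the key is up is ignored entirely.
        if self._key_down:
            self._frames.append(frame)

    def key_released(self) -> Optional[str]:
        # End of an utterance: hand the buffered frames to the engine.
        self._key_down = False
        if not self._frames:
            return None
        guess = self._recognize(self._frames).strip().lower()
        self._frames = []
        # Accept the guess only if it is a single letter of the alphabet.
        if guess in ALPHABET:
            self.letters.append(guess)
            return guess
        return None

    def word(self) -> str:
        # The spelling accumulated so far.
        return "".join(self.letters)
```

The activity would call key_pressed() and key_released() from its key
event handlers and compare word() against the expected spelling.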

Some basic commands like close, open <activity name> etc can also be
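
As a rough illustration of how small such a command set is, recognized
text for "close" and "open <activity name>" could be mapped to actions
along these lines. The function name and the returned tuples are
hypothetical, and this assumes the engine already hands back plain text.

```python
from typing import Optional, Tuple


def parse_command(utterance: str) -> Optional[Tuple[str, Optional[str]]]:
    """Map recognized text to (action, argument), or None if not a command."""
    words = utterance.lower().split()
    if not words:
        return None
    if words == ["close"]:
        return ("close", None)
    # "open" needs an activity name after it.
    if words[0] == "open" and len(words) > 1:
        return ("open", " ".join(words[1:]))
    return None
```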

> Having a generic library will make system-wide integration easier by
> abstracting the interactions with the speech engine via DBUS etc.  All
> the activities can use the speech capability as they see fit  (spoken
> commands to control the activity is the most straightforward
> application that strikes me).
> It would be really nice if the community could provide us with some
> feedback on this proposal. :)

I suggest you draft your proposal on a wiki page and put it under this
category: http://wiki.sugarlabs.org/go/Category:2009_GSoC_applications. Your
application is great. Hoping to see it in this year's GSoC. Best of luck.

> Regards
> Satya Komaragiri
> [1] Discussion on the devel list:
> http://lists.laptop.org/pipermail/devel/2008-September/019136.html
> [2] Speech recognition project page:
> http://wiki.laptop.org/go/Speech_to_Text

Assim Deodia | http://nsitonline.in/assim
Undergraduate Student, Netaji Subhas Institute of Technology