[Gsoc] Summer of Code Proposal: Furthering Speech Recognition in Sugar.
satya komaragiri
satya.komaragiri at gmail.com
Mon Mar 23 16:01:26 EDT 2009
Hi Assim,
On Tue, Mar 24, 2009 at 12:33 AM, Assim Deodia <assim.deodia at gmail.com> wrote:
>
> Hi Satya,
>
> Your idea is indeed very good. It would be great to have speech as another
> system-wide input mechanism.
>
> I see that you have already compiled Julius on the XO. Were you able to test
> it, given that it only supports Japanese?
>
Yes. Julius uses acoustic models created with the HTK toolkit, so
models for other languages that were built with HTK can also be used
with Julius. I have tested it with the acoustic models and the speech
corpus made available by VoxForge. The efficiency is not very high as
of now, but it's more of an acoustic model problem that arises from
the limited speech corpus than a problem with the engine itself. It
might also be because I speak 'Indian' English. I am recording my own
corpus by asking children to speak. I am confident we'll get better
results as we keep recording more and more data.
>
> What you can do (as Sean DALY suggested) is to have a specific key pressed
> while speaking; this, I guess, will greatly improve the efficiency of the
> engine. Starting out with the recognition of just the 26 letters, and with
> boundaries marked at that, would not be much of a load for the XO, and it
> will enhance the activity greatly.
> Some basic commands like close, open <activity name>, etc. can also be
> implemented.
True. I am confident we can achieve very high efficiency with this.
> I suggest you draft your proposal on a wiki page and put it under this
> category: http://wiki.sugarlabs.org/go/Category:2009_GSoC_applications. Your
> application is great. Hoping to see it in this year's GSoC. Best of luck.
>
Sure, I'll put up a proposal on the wiki. Thanks a lot for your
encouragement. I'm praying the project gets selected too.
Regards,
Satya