[Sugar-devel] Fwd: Summer of Code Proposal: Furthering Speech Recognition in Sugar.

satya komaragiri satya.komaragiri at gmail.com
Mon Mar 23 08:43:52 EDT 2009


Hello Sean,

On Mon, Mar 23, 2009 at 4:13 AM, Sean DALY <sdaly.be at gmail.com> wrote:
> Greetings Satya,
>
> In the early 1990s I did some tests for a speech recognition system
> and I found a shortcut for reliably acquiring and classing words and
> even phonemes: asking the subject to click the mouse or the spacebar
> to mark the boundary between words. Phonemes were more difficult, but
> some subjects did manage to become very proficient in demarcating
> boundaries. I seem to remember reading that most of the world's
> languages draw from a pool of only 50-60 phonemes. Of course machines
> were many times less powerful then compared to today but this little
> shortcut simplified realtime processing considerably.

Oh, that's interesting! Can I contact you off-list for more details?
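
To make the idea concrete, here is a rough sketch of how click-marked
boundaries could be used to cut a recording into per-word segments. It
is only an illustration under my own assumptions, not Sean's original
system: the function name split_at_marks, the plain-list sample buffer,
and the mark times given in seconds are all hypothetical.

```python
def split_at_marks(samples, sample_rate, mark_times):
    """Split a sample buffer into segments at user-marked boundary times.

    samples     -- sequence of audio samples (hypothetical plain list)
    sample_rate -- samples per second
    mark_times  -- times in seconds at which the speaker clicked
    """
    n = len(samples)
    # Convert each click time to a sample index, clamped to the buffer.
    cuts = {min(max(int(round(t * sample_rate)), 0), n) for t in mark_times}
    # Keep only interior cut points, in order, and add the two buffer ends.
    edges = [0] + sorted(c for c in cuts if 0 < c < n) + [n]
    # Each pair of neighbouring edges delimits one segment.
    return [samples[a:b] for a, b in zip(edges, edges[1:])]


if __name__ == "__main__":
    # Ten dummy samples at 10 samples per second, clicks at 0.3 s and 0.7 s.
    print(split_at_marks(list(range(10)), 10, [0.3, 0.7]))
```

Because the speaker supplies the boundaries, no automatic segmentation
is needed before each piece goes to the recognizer, which is presumably
where the saving in realtime processing comes from.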
>
> As a former audio engineer I had also explored digitally-controlled
> analog audio processing, in particular equalization, as another method
> of reducing processing power requirements. However, my results were
> inconclusive, due I believe to the slow response time of the equipment
> I had available. I recall finally resorting to a first sample to
> capture the speech, then conversion to analog for shaping, then
> conversion to digital again for analysis. Modern realtime plugin
> effects modules for recording platforms such as Pro Tools would
> certainly do the job better and in a single step, but the best of
> these are very highly priced proprietary closed code (example:
> http://www.sonnoxplugins.com).
>
> I did find recognition rates varied very widely with the model of
> microphone used. I remember obtaining interesting results from the
> proximity effect of a common studio dynamic mic, the Shure SM57; the
> "colored" analog sound allowed faster transformations (this is what
> led me to shaping sound in the analog domain). Recording very close in
> also eliminated ambient room noise. The stumbling block I encountered
> there was encouraging subjects to speak right into the mic, which some
> found intrusive. I did not test with cheapo soundcard mics as
> interfacing these with pro sound equipment for testing was a headache.
>

I have been asking children to record with a mouthpiece in their
normal home environments. Children will be using Sugar at home and at
school, and models built from studio-quality recordings might work
less well in such noisy environments. Recording with a mouthpiece
also guarantees that the children speak into the mic.

> It's a fascinating subject
>
Indeed!
> Sean
>

Regards,
Satya

