[Sugar-devel] Sugar-devel Digest, Vol 6, Issue 7

James Simmons jim.simmons at walgreens.com
Wed Apr 1 18:24:16 EDT 2009


Chirag,

Since you have been working with Aleksey Lim you probably know about 
text to speech with highlighting in Read Etexts.  I wrote the original 
TTS code that used speech-dispatcher with some assistance from Hemant 
Goyal and the folks on the speech-dispatcher project.  Aleksey 
refactored my code so it could work with either speech-dispatcher or his 
own gstreamer espeak plugin.  Not only does his plugin need no 
configuration to work, it also does a LOT better at producing timely 
callbacks as it reads each word.

Since I've labored in these vineyards for a while, my opinion might be 
worth something.  I think your proposal is fine as written.  I just 
wonder if you can deliver what you're promising, and how you'd go about it.

As you point out in your proposal, highlighting the word as it is spoken 
is a big part of the benefit of what you're proposing.  If all you 
wanted to do was capture some highlighted text in the clipboard and have 
it spoken in a voice you can configure in a control panel, that would be 
easy, even trivial.  It's the highlighting that's difficult.  When I 
added speech to Read Etexts I deliberately tried for the simplest 
approach that would get the job done.  It reads only the current page.  
It always starts either at the first word on the page, or if speech has 
been paused, it resumes with the last word spoken.  You can't choose the 
word to start on.  The Activity itself receives the callbacks as each 
word is spoken and takes care of doing the highlight and scrolling the 
textarea so the highlighted word stays on the screen.
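
The bookkeeping described above can be sketched roughly as follows.  This 
is only an illustration, not the Read Etexts code: the class and method 
names are hypothetical, and it assumes the speech engine reports each word 
by its index on the page, which real engines may not do.

```python
class PageReader:
    """Tracks which word on the current page is being spoken (hypothetical)."""

    def __init__(self, words):
        self.words = words      # list of (start, end) character offsets
        self.current = 0        # index of the last word spoken
        self.highlight = None   # span the Activity should highlight

    def words_to_speak(self):
        """Words left to speak: the first word of the page on a fresh
        start, or the last word spoken onward after a pause."""
        return self.words[self.current:]

    def on_word(self, index):
        """Callback assumed to fire as the engine starts word `index`."""
        self.current = index
        self.highlight = self.words[index]
        return self.highlight   # the Activity would highlight and scroll here

    def new_page(self, words):
        """Turning the page resets speech to the first word."""
        self.words = words
        self.current = 0
        self.highlight = None


reader = PageReader([(0, 4), (5, 8), (9, 15)])
reader.on_word(1)               # engine reports the second word
resumed = reader.words_to_speak()
```

There is deliberately no way here to pick a starting word; speech begins 
at the top of the page or resumes where it was paused.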

If I had to write a facility that did what Read Etexts does outside of 
the Activity I wouldn't know how to do it.  It seems to me that 
highlighting is best done by the Activity itself.  I can't deny that it 
would be useful to have all this work done as you have described without 
the Activity knowing anything about it, but it doesn't seem feasible.  
You'd have to have something that could work with gtk textareas, the 
evince component Read uses, Abiword, and everything else that came along.

Another thing you'd have to deal with is PDFs composed of scanned-in 
book pages.  There are a lot of these around (the Internet Archive is 
full of them) and somehow the kid trying to select words on a scanned-in 
page would have to be clued in that these words are not selectable.

I suppose you could make an Activity that grabbed whatever text was in 
the clipboard, displayed it in a textarea, and highlighted the words in 
that textarea as it spoke them.  I'm pretty sure that wasn't what you 
had in mind.

Splitting sentences into separate words will be a challenge.  I just use 
spaces as delimiters and filter out characters like asterisks, vertical 
bars, etc.  That works OK for English but not for other languages.  If I 
wanted Read Etexts to do highlighting on the Bhagavad-Gita in the 
original Sanskrit it wouldn't work.  Even in English I get tripped up by 
double hyphens (--).  It would be nice if Gutenberg etexts put spaces 
around double hyphens but they don't.
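
A minimal sketch of that kind of splitting, to show where the double 
hyphen bites.  It is not the actual Read Etexts code: the function name, 
the exact set of filtered characters, and the break_on_double_hyphen 
option are assumptions made for illustration, and it splits on any 
whitespace rather than on spaces only.

```python
import re

# Characters stripped from each word before it is spoken (assumed set).
NOISE = str.maketrans("", "", "*|")


def split_words(text, break_on_double_hyphen=False):
    """Return (start, end, word) tuples; the offsets locate the original
    token in `text` so it can be highlighted."""
    if break_on_double_hyphen:
        # Runs of two or more hyphens become separate tokens and are
        # dropped below, so they act as word breaks.
        pattern = r"--+|(?:(?!--)\S)+"
    else:
        pattern = r"\S+"
    words = []
    for match in re.finditer(pattern, text):
        token = match.group().translate(NOISE)
        if token.strip("-"):    # skip empty and hyphen-only tokens
            words.append((match.start(), match.end(), token))
    return words


line = "dark--and *stormy* night"
plain = split_words(line)
split = split_words(line, break_on_double_hyphen=True)
```

With whitespace as the only delimiter, "dark--and" comes back as a single 
word; with the extra rule it becomes "dark" and "and", while a 
single-hyphen word such as "well-known" stays whole.  Neither variant 
does anything for scripts that do not separate words with spaces.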

It looks like you've picked a challenging project, and I would love to be 
proven wrong about everything I've mentioned here.  Good luck with this,

James Simmons


Date: Wed, 1 Apr 2009 22:00:02 +0530
From: chirag jain <chiragjain1989 at gmail.com>
Subject: [Sugar-devel] GSoC proposal: Speech Synnthesis
To: sugar-devel at lists.sugarlabs.org
Message-ID:
	<e116096a0904010930l23312712ha5fd4128efe7dd99 at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Modified the proposal a lot.

http://wiki.sugarlabs.org/go/speech-synthesis

want some feedback



