[Sugar-devel] [Sugar] Implement text to speech in Sugar feature - v2

Gonzalo Odiard godiard at sugarlabs.org
Wed Jan 25 11:46:01 EST 2012


On Mon, Jan 23, 2012 at 1:08 PM, Sascha Silbe <silbe at activitycentral.com>wrote:

> Excerpts from godiard's message of 2012-01-17 22:49:31 +0100:
>
> > Added controls to pause/stop and addressed suggestions
> > in the review process.
>
> Please put the changelog below the marker line, i.e. below this one:
>
> > ---
>
> Please add a description. Besides other things I'm missing a clear
> mention that the (currently undocumented) dependency on the OLPC Speech
> Server [10] gets replaced by a new dependency [1] on gst-plugins-espeak.
> The latter still isn't packaged for many distros, including Debian.
> Given that the OLPC Speech Server isn't packaged either, it's no reason
> to reject the patch, but it's definitely worth mentioning.
>
>
Ok


> [data/sugar.schemas.in]
> > +    <schema>
> > +      <key>/schemas/desktop/sugar/speech/pitch</key>
> > +      <applyto>/desktop/sugar/speech/pitch</applyto>
> > +      <owner>sugar</owner>
> > +      <type>int</type>
> > +      <default>50</default>
> > +      <locale name="C">
> > +        <short>Default pitch to the speech sugar service</short>
>
> Probably a typo: s/to/for/.
>
> As can be seen from the code, we retain user changes over a restart of
> Sugar, so we should drop the mention of "default".
>
>
Ok

>
> > +        <long>Pitch value used by the speech service in Sugar,
> > +        can be changed by the user, with controls in a icon
> > +        in the frame</long>
>
> If we mention UI details here, we'd need to update them whenever the UI
> changes. Please state only what the setting affects.
>
>
>
Ok


> [...]
> > +      <key>/schemas/desktop/sugar/speech/rate</key>
> [...]
>
> Same as above.
>
>
> [extensions/deviceicon/speech.py]
> > +from gettext import gettext as _
> > +import gconf
>
> gconf isn't part of the standard Python libraries, so it should be moved
> to the following block of third-party module imports:
>
> > +import glib
> > +import gtk
>
>
>
Ok


> [...]
> > +class SpeechDeviceView(TrayIcon):
> > +
> > +    FRAME_POSITION_RELATIVE = 105
>
> Please consider using a number that leaves us free to add more Device
> Icons in between without changing existing code. 150 might be a good
> choice.
>
>
Ok


>
> > +    def __init__(self):
> > +        client = gconf.client_get_default()
> > +        self._color =
> XoColor(client.get_string('/desktop/sugar/user/color'))
> > +
> > +        TrayIcon.__init__(self, icon_name=_ICON_NAME,
> xo_color=self._color)
> > +
> > +        self.set_palette_invoker(FrameWidgetInvoker(self))
> > +
> > +        self._manager = speech.get_speech_manager()
> > +
> > +        self.connect('expose-event', self.__expose_event_cb)
> > +
> > +        self._icon_widget.connect('button-release-event',
> > +                                  self.__button_release_event_cb)
>
> Please don't spread the code this much using blank lines. In fact the
> above code would be at least as readable even without any blank line.
>
>
This is funny, but ok :)


>
> [...]
> > +    def __button_release_event_cb(self, widget, event):
> > +        if event.button != 1:
> > +            return False
>
> As the Palette doesn't have a default action, left-click should open the
> Palette, too.
>
>
Good point.


>
> > +    def __expose_event_cb(self, *args):
> > +        self._update_info()
>
> Where does _update_info() get defined? What's the purpose of the expose
> callback?
>
>
You are right, old code.


>
> > +class SpeechPalette(Palette):
> > +
> > +    def __init__(self, primary_text, manager):
> > +        Palette.__init__(self, label=primary_text)
> > +
> > +
> > +
>
> Please use pep8 and pylint [2] to check your code prior to submission
> [3,4].
>

I have used it. If you see any problem tell me.



> [...]
> > +        """
> > +        self._play_item = MenuItem('Say selected text')
> > +        self._play_icon = Icon(icon_name='player_play',
> > +                icon_size=gtk.ICON_SIZE_MENU)
> > +        self._pause_icon = Icon(icon_name='player_pause',
> > +                icon_size=gtk.ICON_SIZE_MENU)
> > +        self._play_item.set_image(self._play_icon)
> > +        self.menu.append(self._play_item)
> > +        self._play_item.show()
> > +        self._play_item.connect('activate', self.__play_clicked_cb)
> > +
> > +        self._stop_item = MenuItem('Stop talking')
> > +        self._stop_icon = Icon(icon_name='player_stop',
> > +                icon_size=gtk.ICON_SIZE_MENU)
> > +        self._stop_item.set_image(self._stop_icon)
> > +        self.menu.append(self._stop_item)
> > +        self._stop_item.show()
> > +        self._stop_item.connect('activate', self.__stop_clicked_cb)
> > +        self._stop_item.set_sensitive(False)
> > +        """
>
> If you don't intend to execute code, please remove it entirely (applies
> to several other parts of your patch as well). Also docstrings are a
> particularly bad way of marking code to not get executed: without syntax
> highlighting (e.g. in emails and a lot of editors) it's hard to notice
> that the above, long code block doesn't get executed. Using comments
> makes it obvious as each line is marked individually. Some text editors
> even have shortcuts to comment in/out the currently selected code block
> to make this trivial.
>
>
This code was commented because I proposed two different UI,
and wanted to do easy to other testers/reviewers try the two options.
I will remove the option with the MenuItem beacuse gtk3 limitations (see
Simon mail)




> [...]
> > +        pitch_step = 10
> > +        self._adj_pitch =
> gtk.Adjustment(value=self._manager.get_pitch(),
> > +                                          lower=self._manager.MIN_PITCH,
> > +                                          upper=self._manager.MAX_PITCH,
> > +                                          step_incr=pitch_step,
> > +                                          page_incr=pitch_step,
> > +                                          page_size=pitch_step)
>
> Why do you disallow coarser steps when using the keyboard to change the
> value?
>
>
Hmm, I thought would be easier, but may be is not a good option.



>
> > +        self._hscale_pitch = gtk.HScale(self._adj_pitch)
> > +        self._hscale_pitch.set_digits(0)
>
> > +        self._hscale_pitch.set_draw_value(False)
>
> This is up to the Design Team to decide, but I'd imagine it to be useful
> to see the current value as a number, so the learner can reproducibly
> set the same value (both as before and as other learners).
>

Good point, I will add it and then Design can can try it.



>
> [Rate adjustment]
>
> Same comments as for the pitch adjustment.
>
>
> [...]
> > +    def __play_clicked_cb(self, widget):
> > +        if self._manager.is_paused:
> > +            self._manager.restart()
> > +        else:
> > +            if not self._manager.is_playing:
> > +                self._manager.say_selected_text()
> > +            else:
> > +                self._manager.pause()
>
> This could be streamlined by using elif.
>
>
> > +    def __stop_clicked_cb(self, widget):
> > +        self._manager.stop()
>
> Why do we need a Stop button in addition to Play/Pause? And where does
> this callback get invoked? AFAICT the hook-up is commented out.
>
>
If you have a long text been played, and want finish or start again
from the beginning, you need a stop button.


>
> [extensions/globalkey/speech.py]
> [...]
> > +BOUND_KEYS = ['<alt>s']
>
> This will collide with shortcuts from existing applications and
> activities, especially since GTK uses <Alt> for mnemonics [7]. To give a
> concrete example of a non-mnemonic case: the Scan activity [8] uses
> <Alt>+S to start or stop scanning [9].
>
> Why didn't you keep the existing shortcut (<Alt><Shift>S)?
>
>
I have changed it by Simon suggestion. Please read the other reviews.
Scan activity? Is not in ASLO yet. You can change it ;)



> [src/jarabe/model/speech.py]
> > +import gconf
>
> Like in extensions/deviceicon/speech.py, the gconf import should be
> moved to the block of third-party module imports.
>
> > +
> > +import gst
> > +import gtk
> > +import gobject
> > +
> > +import os
> > +from gettext import gettext as _
> > +import logging
>
> Standard Python library modules should be listed first, before
> third-party modules (PEP-8 section "Imports", second item).
>
>
> > +# TRANS: The language pitch (range [0 - 99], default 50 for English)
> > +# Look at http://espeak.sourceforge.net/commands.html for details
> > +DEFAULT_PITCH = int(_('50'))
> > +
> > +
> > +# TRANS: The diction speed, in average words per minute (range [80 -
> 390],
> > +# default 170 for English).
> > +# Look at http://espeak.sourceforge.net/commands.html for details
> > +DEFAULT_RATE = int(_('170'))
>
> Why did you choose a different default for rate than espeak did (170 vs.
> 175)?
>

Hmmm, I can't find why :) Changing to 175


>
> Do these values really need to be localised? AFAICT from the document
> you referenced, the upstream default is always the same and espeak
> voices will interpret the value as needed.
>
>
Ok, I will remove i18n here.



>
> [...]
> [class SpeechManager]
> > +    def __init__(self, **kwargs):
> > +        gobject.GObject.__init__(self, **kwargs)
> > +        self._player = AudioGrabGst()
> > +        self._player.connect('play', self._emit_signal, 'play')
> > +        self._player.connect('stop', self._emit_signal, 'stop')
> > +        self._player.connect('pause', self._emit_signal, 'pause')
> > +        logging.debug('SpeechManager setting default parameters')
>
> Does this log message actually aid in debugging? The values always get
> set on instantiation and before instantiation there's no SpeechManager
> to access.
>
>

Not needed now. Removed.


>
> [...]
> > +        try:
> > +            self._loading = True
> > +            self.restore()
> > +            self._loading = False
> > +        except:
> > +            pass
>
> Please don't catch exceptions unconditionally except in very specific
> cases. I'm not even sure we should catch exceptions here at all, as
> anything that could happen in self.restore() would be the result of a
> severe problem and should be reported to the user, rather than ignored
> silently.
>
> You're leaving self._loading set if restoring the settings fails, which
> will also prevent changes to be saved later on. But you can get rid of
> self._loading completely if you set the internal variables (self._pitch
> and self._rate) in self.restore() directly rather than using the
> accessors.
>
>
>
It's true. Thanks.


> > +    def _emit_signal(self, player, signal):
> > +        self._is_playing = (signal == 'play')
> > +        self._is_paused = (signal == 'pause')
> > +        self.emit(signal)
>
> This method does more than just emitting a signal. Something like
> _update_state() would be a better name.
>
>
>
Ok


> [...]
> > +    def save(self):
> > +        client = gconf.client_get_default()
> > +        client.set_int('/desktop/sugar/speech/pitch', self.get_pitch())
> > +        client.set_int('/desktop/sugar/speech/rate',
> > +                self.get_rate())
>
> You can access the private variables directly here; that would get rid
> of the line break.
>
>
Ok.


> Also, if we save on every change, we should listen for gconf updates so
> that outside changes (e.g. using gconf-editor or gconftool-2) will be
> used by Sugar instead of getting overwritten on the next invocation of
> save().
>
>
We are not doing this in sound or network settings, right?



> > +class AudioGrabGst(gobject.GObject):
>
> What's the purpose of this class? What does it grab? Please consider
> renaming the class and / or providing a useful docstring.
>
> > +
> > +    def __init__(self):
> > +        gobject.GObject.__init__(self)
> > +        self._pipeline = None
> > +        self._quiet = True
>
> What do we need self._quiet for? AFAICT it's only set, never read.
>

It's true. Removed.


>
>
> > +    def restart_sound_device(self):
> > +        if self._pipeline is None:
> > +            return
>
> Calling restart_sound_device() shouldn't be allowed to be called if
> there's no pipeline set up. The UI should deactivate the buttons while
> there's no text to be spoken. restart_sound_device() et al. should
> throw an error when called out-of-order.
>
>
The UI is deactivating the buttons, but I think does not harm have a
control here.
I will add a log message.


>
> [...]
> > +    def make_pipeline(self, cmd):
>
> Re. "cmd": Please don't abbreviate unless you have a good reason to do
> so [6].
>
>
Ok.


>
> [...]
> > +    def __pipe_message_cb(self, bus, message):
> > +        logging.error('Gst message %s' % message)
>
> Why do you mark every GStreamer message as error? AFAICT you will
> receive these messages as part of normal operation.
>
> Also please pass parameters to the logging function instead of doing the
> formatting yourself. To quote [5]: "This is more robust and formatting
> only happens if we actually log something."
>
>
Ok. Is only for testing I will remove it in the final version.


> > +        if message.structure.get_name() == 'espeak-mark' and \
> > +                message.structure['mark'] == 'end':
> > +            self.emit('stop')
>
> Do we need to set the pipeline state to gst.STATE_NULL like in
> self.stop_sound_device() or does that happen automatically?
>
>
Automatically? How?



>
> > +    def speak(self, pitch, rate, voice_name, text):
>
> Why do we need to pass the parameters in every time, rather than
> settings the properties a) on init and b) when the gconf / UI ?
>
>
This class does not have this variables. Should copy almost the same code in
the SpeechManager class. I am looking if are really needed two classes or if
I can move part of this to sugar-toolkit, to enable the activities to use
tts.


>
> > +        # XXX workaround for http://bugs.sugarlabs.org/ticket/1801
>
> Please get rid of the XXX, it does not add any value. If we intend to
> remove this some time after SL#1801 gets fixed, we should mark it as
> TODO.
>
>
Ok.


>
> > +        if not [i for i in text if i.isalnum()]:
> > +            return
>
> The user will be rather puzzled why saying some texts works and others
> simply get ignored. We should filter out the individual characters that
> gst-plugins-espeak doesn't like, rather than bailing out completely.
>
>
I understand the text is ignored if there are only non alpha chars.
But yes, the code is a little cryptic.


>
> [...]
> > +    def get_all_voices(self):
>
> This should probably be a private method (i.e. be called
> _get_all_voices()).
>
>
Like before, I am trying to see what is needed by activities,
and moving to sugar-toolkit.



>
> > +        all_voices = {}
> > +        for i in gst.element_factory_make('espeak').props.voices:
> > +            name, language, dialect = i
>
>         for voice in gst.element_factory_make('espeak').props.voices:
>
> Please consider using "voice" or a similar speaking name in place of
> "i". The latter is traditionally used for _i_ntegers (or _i_ndexes,
> which are also integers).
>
>
Ok


>
> > +            #if name in ('en-rhotic','english_rp','english_wmids'):
> > +                # these voices don't produce sound
> > +             #   continue
>
> Please remove dead code.
>
>
Ok.



>
>
>
> +            all_voices[language] = name
>
> Are the voices in ...props.voices ordered such that the preferred voice
> for a given language is listed last? Can the user influence the
> ordering? If not, we should make the voice configurable (which is easy,
> given that we already pass it through SpeechManager), even if just as a
> gconf setting with no UI in Sugar.
>
>
>
Yes. I am tempted of adding the voice selection to the UI too.
In particular, there are 8 different english voices, two spanish and two
portuguese,
and I found difficult found the right voice every time.



> [...]
> > +    def get_default_voice(self):
> > +        """Try to figure out the default voice, from the current locale
> ($LANG)
> > +           Fall back to espeak's voice called Default."""
> > +        voices = self.get_all_voices()
> > +
> > +        try:
> > +            lang = os.environ['LANG']
> > +            if lang.find('.') > --1:
> > +                lang = lang[0:lang.find('.')]
> > +                lang = lang.replace('_', '-').lower()
> > +        except:
> > +            lang = ""
> > +
> > +        best = "default"
> > +
> > +        try:
> > +            best = voices[lang]
> > +        except:
> > +            try:
> > +                lang = lang[0:lang.find('-')]
> > +                best = voices[lang]
> > +            except:
> > +                pass
>
> The following would be more robust, not masking coding mistakes (like
> "--1") by catching exceptions unconditionally. Not to mention that it's
> a lot less code:
>
>        locale = os.environ.get('LANG', '')
>        language_location = locale.split('.', 1)[0].replace('_',
> '-').lower()
>        language = language_location.split('-')[0]
>        return voices.get(language_location) or voices.get(language) or
> 'default'
>
>
Thanks. I found other problems making the language list,
I will try to do a better list and search of the best voice.


>
> > +def get_speech_manager():
> > +    global _speech_manager
> > +
> > +    if _speech_manager == None:
>
> PEP8, "Programming Recommendations", second item: "Comparisons to
> singletons like None should always be done with 'is' or 'is not', never
> the equality operators."
>
>
>
Ok.

Thanks by the exhaustive review. I will do another try this week,
to have a good candidate before the deadline.

Gonzalo


> Sascha
>
> [1]
> https://wiki.sugarlabs.org/go/Development_Team/Code_Review#Dependencies
> [2] https://wiki.sugarlabs.org/go/Development_Team/Code_guidelines#Tools
> [3] https://wiki.sugarlabs.org/go/Development_Team/Code_Review#Proposal
> [4] https://wiki.sugarlabs.org/go/Development_Team/Code_guidelines
> [5]
> https://wiki.sugarlabs.org/go/Development_Team/Code_guidelines#Miscellaneous
> [6]
> https://wiki.sugarlabs.org/go/Development_Team/Code_guidelines#Code_style
> [7]
> http://www.gtk.org/api/2.6/gtk/GtkButton.html#gtk-button-new-with-mnemonic
> [8] http://git.sugarlabs.org/scan-activity
> [9]
> http://git.sugarlabs.org/scan-activity/scan-activity/blobs/master/scan.py#line655
> [10] http://wiki.laptop.org/go/Speech_Server
> --
> http://sascha.silbe.org/
> http://www.infra-silbe.de/
>



-- 
Gonzalo Odiard
SugarLabs Argentina
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.sugarlabs.org/archive/sugar-devel/attachments/20120125/c88d8ac2/attachment-0001.html>


More information about the Sugar-devel mailing list