[Sugar-devel] Language section debugging, use ICU?
cjlhomeaddress at gmail.com
Thu Mar 28 20:47:13 EDT 2013
On Tue, Mar 26, 2013 at 3:17 PM, Manuel Quiñones <manuq at laptop.org> wrote:
> I have ongoing work on polishing the Language section of the Sugar
> settings panel. I'm sharing my findings to open discussion, to start
> bringing back discussions to the mailing list, and to encourage
> testing of my patches.
> First, an introduction to the issues, ordered by priority as I understand it:
> 1. list of languages: how could I select my language if it is
> displayed in another language?
> 2. list of languages: if there are two languages and English is the
> first one, the list is displayed as the second one
> 3. many language names are not translated
> 4. the section takes a lot of time to start, we should display a
> watch/busy cursor while it is loading
> The issues are interconnected as you will find if you try to read the
> many comments, which mix them all through the years :)
> A brief summary of the current implementation:
> A. we parse the output of 'locale -av' command and create a list of
> (language name, territory name, locale code)
> B. we call gettext with ISO-639 to translate the language name
> C. we call gettext with ISO-3166 to translate the territory/country name
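Steps A to C can be sketched roughly as follows. This is a minimal illustration, not the actual Sugar code: the helper names `parse_locale_av` and `translate_names` are hypothetical, the parser assumes the usual glibc `locale -av` layout (a `locale: <code>` header followed by `key | value` lines), and the gettext domain names `iso_639` and `iso_3166` assume the iso-codes catalogs are installed.

```python
import gettext
import re


def parse_locale_av(text):
    """Return (language, territory, locale code) tuples from `locale -av` output."""
    entries = []
    code = None
    fields = {}
    for line in text.splitlines():
        header = re.match(r'locale:\s*(\S+)', line)
        if header:
            # A new "locale:" header closes the previous entry, if any.
            if code is not None:
                entries.append(
                    (fields.get('language'), fields.get('territory'), code))
            code = header.group(1)
            fields = {}
        elif '|' in line and code is not None:
            # Detail lines look like "   language | Spanish".
            key, _, value = line.partition('|')
            fields[key.strip()] = value.strip()
    if code is not None:
        entries.append((fields.get('language'), fields.get('territory'), code))
    return entries


def translate_names(language, territory, target_lang):
    """Look up both names in the iso-codes catalogs (assumed domain names)."""
    # fallback=True returns the input unchanged when no catalog is found.
    lang_catalog = gettext.translation(
        'iso_639', languages=[target_lang], fallback=True)
    terr_catalog = gettext.translation(
        'iso_3166', languages=[target_lang], fallback=True)
    return lang_catalog.gettext(language), terr_catalog.gettext(territory)
```

Note that gettext also returns the input unchanged whenever the string is not an exact msgid in the catalog, which is one way a name can end up untranslated.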
> Number 2 becomes obsolete if we solve no. 1. The problem in 2. is that
> we should not use gettext to translate strings to English, as they are
> already in English. It is illustrated in this comment:
> Number 3 is a flaw of the current implementation: the output of
> 'locale -av' does not match 100% the strings in the po files for the
> given gettext domains. See comment by Chris Leonard on this:
> Number 4 has a general patch that works for all sections, except for
> Modem section. We actually have code that displays a busy cursor, but
> it is shown only for an instant, because (fortunately) the UI doesn't
> block the program. The patch wraps the initialization of the section
> in a GObject.idle_add call, to make it work. As said before, this
> conflicts with the current implementation of the Modem section. So..
> still work to be done on this one.
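The idle-callback approach described above might look something like the sketch below. It is not the patch itself: `load_section_when_idle`, `set_busy` and `build_section` are hypothetical names, and `GLib.idle_add` is used as the PyGObject counterpart of the `GObject.idle_add` call mentioned in the message. The idea is that the busy cursor is set first, control returns to the main loop so the cursor change can be drawn, and only then does the slow initialization run.

```python
def load_section_when_idle(set_busy, build_section, idle_add=None):
    """Show a busy state, then build the section from an idle callback.

    set_busy(bool) and build_section() are hypothetical callables supplied
    by the caller; idle_add defaults to GLib.idle_add when PyGObject is
    available.
    """
    if idle_add is None:
        # Imported lazily so the helper can be exercised without GTK.
        from gi.repository import GLib
        idle_add = GLib.idle_add

    set_busy(True)

    def _build():
        try:
            build_section()
        finally:
            # Restore the normal cursor even if the section fails to load.
            set_busy(False)
        # Returning False tells the main loop not to call us again.
        return False

    return idle_add(_build)
```

Passing the scheduler in as a parameter is only a convenience for testing; in a real panel the default would presumably be used.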
> By the way, number 4. gets less relevant if we speed up the section.
> That leads me to talk about my findings on issue number 1: I have
> investigated alternatives to our current gettext implementation.
> - using gettext domain ISO 639-3 instead of ISO 639 for translating
> the language name
> - using external libraries Babel and PyICU
> Here is a table of my results:
> And here is the result of profiling them:
> So, it looks like the ICU project is very fast and provides good
> output. And judging by the project homepage, it looks to be in good
> shape too. Can we consider a switch to it?
> .. manuq ..
First, thank you for exploring the question of improving the selection
of languages from the control panel and the several issues involved.
As you correctly note in your message, there are actually a series of
intertwined issues and it is, of course, important to be clear about
which issue is being targeted by any given approach proposed and the
impact of a given approach on the related issues.
Starting this line of investigation from bug #4449:
"Spanish" and other language names are not translated in My Settings
The questions that you have most directly addressed with your testing
so far are:
1) How does Sugar currently populate the list of language names (and
territory names) in the language selection Control Panel?
For which you describe a series of lookups from 1) the glibc locale
itself, 2) ISO-639 for the language name, and 3) ISO-3166 for the
territory name.
2) Is this the most efficient (fastest) method of retrieving a
localized language name?
You describe your experimentation with ICU. It is important to note
that the source of information for ICU on localized language and
territory names is the CLDR locale, and so, at least in part, this
comes down to a discussion about glibc locales versus CLDR locales.
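For readers who want to try the CLDR-backed lookups themselves, here is a minimal sketch using Babel and PyICU, both of which draw on CLDR data. The function names are hypothetical, each function returns None if its library is not installed, and the PyICU call assumes that `Locale.getDisplayLanguage()` accepts a display `Locale` argument. Locale codes are assumed to have any codeset suffix (such as `.utf8`) already stripped.

```python
def language_name_babel(locale_code, display_locale=None):
    """Language name via Babel, shown in display_locale (default: itself)."""
    try:
        from babel import Locale
    except ImportError:
        return None  # Babel is an external library and may be missing.
    locale = Locale.parse(locale_code)
    return locale.get_language_name(display_locale or locale)


def language_name_icu(locale_code, display_locale=None):
    """Language name via PyICU, shown in display_locale (default: itself)."""
    try:
        import icu
    except ImportError:
        return None  # PyICU is an external library and may be missing.
    locale = icu.Locale(locale_code)
    display = icu.Locale(display_locale) if display_locale else locale
    return locale.getDisplayLanguage(display)


if __name__ == '__main__':
    for code in ('es_AR', 'pt_BR', 'en_US'):
        print(code, language_name_babel(code), language_name_icu(code))
```

Defaulting the display locale to the locale itself shows each language under its own name, which is relevant to issue 1 in the quoted message.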
I would like to suggest that there is another question (issue) in
play, which is
3) What is the most authoritative and complete source of localized
language names available (without regard to the performance cost of
My concern with using CLDR locales via ICU is that although they may
have some nice features, like containing chunks of ISO-639 and
ISO-3166 (which probably accounts for some of the observed speed-up),
the set of language and territory names that are actually translated
can vary widely from locale to locale.
Using glibc locales is the "Linux" standard, whereas CLDR locales are
more of a Mozilla or Android approach. Although the amount of L10n on
the ISO-639 and ISO-3166 projects can also be variable by language, at
least the untranslated values are there in the PO files, pointing out
the need for more work.
glibc locales versus CLDR locales were the subject of a very brief
discussion on libc-alpha (the main glibc list) [1-5]