[Sugar-devel] Unicode strings in translations

Martin Langhoff martin.langhoff at gmail.com
Wed Aug 15 21:27:15 EDT 2012


On Wed, Aug 15, 2012 at 7:12 PM, Manuel Kaufmann <humitos at gmail.com> wrote:
> Take a look at this. Following what I understood from your email, if I
...

We are veering far, far off topic from the subject. But string
encoding is an important topic, so I'll go off topic too.

> [humitos at michifus ~]$ cat test.py
> #!/usr/bin/python
> # -*- coding: utf-8 -*-

What is the context? Where are you typing this script? An xterm? A VT?
In OFW? Over a serial port? Over SSH?

In all cases, keystrokes have to be interpreted before you get the ó,
and the OS needs to decide what input it'll give to the editor, and
the editor needs to decide whether it will apply any translation.

> s = 'camión'

Think about this: this line in your script could be stored on disk in
a number of encodings!  ISO 8859-1 ("Latin-1"), UTF-8, UTF-16, UTF-32,
just to list the ones _you_ see most often.

But even in Unicode, there are _two_ ways to say ó -- you can say
"letter o with acute" (U+00F3, precomposed) or "letter o + combining
acute" (U+006F followed by U+0301). Oops!

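A minimal sketch of those two spellings, assuming Python 3 and the
standard unicodedata module. The variable names are only for
illustration:

```python
import unicodedata

precomposed = "\u00f3"    # LATIN SMALL LETTER O WITH ACUTE
decomposed = "o\u0301"    # LATIN SMALL LETTER O + COMBINING ACUTE ACCENT

# Both display as the same glyph, but the code point sequences differ.
same_without_normalizing = (precomposed == decomposed)

# Normalizing both sides to NFC (the composed form) makes them equal.
same_after_nfc = (unicodedata.normalize("NFC", precomposed)
                  == unicodedata.normalize("NFC", decomposed))

print(same_without_normalizing, same_after_nfc)
print(len(precomposed), len(decomposed))
```

So two strings that look identical on screen can fail an == test
unless both are normalized to the same form first.
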
Try to install Yudit, or use iconv to transcode your nice python
script to a few other encodings -- then look at it with a hex viewer.
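
If you have no hex viewer handy, the same experiment can be sketched
from Python itself. This assumes Python 3; hex_dump is a hypothetical
helper, not something from Sugar or the standard library:

```python
def hex_dump(text, encoding):
    """Return the bytes of `text` in `encoding` as a hex string."""
    return text.encode(encoding).hex()

line = "cami\u00f3n"  # 'camión', written with an escape on purpose

# The "-le" variants are used so that no byte order mark is prepended.
for encoding in ("iso-8859-1", "utf-8", "utf-16-le", "utf-32-le"):
    print("%-11s %s" % (encoding, hex_dump(line, encoding)))
```

The ó comes out as the single byte f3 in ISO 8859-1 and as the two
bytes c3 b3 in UTF-8, while the plain ASCII letters are the same in
both.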

The thing is, when the python interpreter starts up and reads _the
script_, it doesn't know what encoding it is in. For plain ASCII text,
UTF-8 and ISO 8859-1 are byte-for-byte identical, and any byte
sequence is valid ISO 8859-1 -- so it cannot decide by looking at the
bytes alone.

You say, I'm in a modern system, it'll be utf-8! But perhaps it's an
old script. Or your text editor is old and pre-unicode. How will
Python know?
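
A small sketch of the ambiguity, assuming Python 3. The bytes below
are what a UTF-8 editor would save for the quoted line; both decodes
succeed, and only one gives the text you meant:

```python
raw = b"s = 'cami\xc3\xb3n'"   # the quoted line as saved in UTF-8

as_utf8 = raw.decode("utf-8")         # the intended text
as_latin1 = raw.decode("iso-8859-1")  # also "valid", but garbled

print(as_utf8)
print(as_latin1)

# The other direction is not symmetric: ISO 8859-1 bytes are usually
# not valid UTF-8, so this decode raises instead of guessing.
try:
    b"cami\xf3n".decode("utf-8")
    latin1_bytes_rejected = False
except UnicodeDecodeError:
    latin1_bytes_rejected = True
```

Nothing in the bytes themselves says which reading is right, which is
why the interpreter depends on a declaration such as the coding line
at the top of the quoted script.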

The same applies to data files. That CSV file you are opening may come
from MS Excel on Windows 3.11, German edition. Oops! But we are more
used to thinking about data files -- everything you know about data
files also applies to Python program files (they are string data!) and
to user input in the UI (except that you can usually ask the UI
toolkit or the env what encoding it's feeding you).
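
For the data file case, a hedged sketch in Python 3: decode with an
encoding you chose explicitly, then parse. The sample bytes and the
choice of cp1252 for an old German Windows export are assumptions made
up for the example, and read_csv_rows is a hypothetical helper:

```python
import csv
import io
import locale
import sys

def read_csv_rows(raw_bytes, encoding):
    """Decode raw CSV bytes with an explicit encoding, then parse them."""
    text = raw_bytes.decode(encoding)
    return list(csv.reader(io.StringIO(text, newline="")))

# Made-up sample: "Jürgen,Köln" as cp1252 bytes (assumed encoding).
sample = b"name,city\r\nJ\xfcrgen,K\xf6ln\r\n"
rows = read_csv_rows(sample, "cp1252")
print(rows)

# For user input you can at least ask the environment what it claims.
print(locale.getpreferredencoding(False))
print(sys.stdin.encoding)
```

The environment's answer is only a claim about the terminal or locale;
for a file from somewhere else you still have to know, or be told,
what encoding it was written in.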

hth,



m
-- 
 martin.langhoff at gmail.com
 martin at laptop.org -- Software Architect - OLPC
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff

