[Sugar-devel] Unicode strings in translations
S. Daniel Francis
francis at sugarlabs.org
Mon Aug 13 19:15:46 EDT 2012
2012/8/13 Gonzalo Odiard <gonzalo at laptop.org>:
> You can read a utf8 encoded file with codecs.open too.
>
> http://docs.python.org/library/codecs.html
>
> Gonzalo
I see that some people need to know more about Unicode:
By default, Python 2 assumes the strings in a source file are ASCII,
but ASCII can't represent all the Unicode characters. This is where
utf-8 comes in: with the coding line (# -*- coding: utf-8 -*-) at the
top of the file, Python reads the strings in the file as utf-8.
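A minimal sketch of what the coding line allows, assuming the file is saved as utf-8 (the names `name` and `encoded` are only for illustration): the non-ASCII literal is accepted, and the utf-8 form of the text is longer in bytes than the text is in characters.

```python
# -*- coding: utf-8 -*-
# The line above declares the encoding of this source file (PEP 263).
# Without it, Python 2 refuses non-ASCII characters in the source.

name = u'Ñandú'                 # illustrative unicode literal
encoded = name.encode('utf-8')  # the same text as utf-8 bytes

print(len(name))     # number of characters
print(len(encoded))  # number of bytes, larger because of Ñ and ú
```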
If we have a variable of type unicode:
my_unicode = u'Hello World'
you can get a utf-8 encoded string from it with the following line:
utf8_string = my_unicode.encode('utf-8')
To get a unicode object back from the string:
new_unicode = utf8_string.decode('utf-8')
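As a small sketch of the round trip, here is the same pair of calls with one non-ASCII character added (the extra Ñ and the `ascii_ok` flag are my additions for illustration). It also shows that encoding the same text to ASCII fails, which is why utf-8 is needed.

```python
# -*- coding: utf-8 -*-
my_unicode = u'Hello World \u00d1'   # \u00d1 is the character Ñ

# unicode -> utf-8 bytes -> unicode again
utf8_string = my_unicode.encode('utf-8')
new_unicode = utf8_string.decode('utf-8')
assert new_unicode == my_unicode

# ASCII has no code for Ñ, so this encode raises an error
try:
    my_unicode.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(ascii_ok)
```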
When do you need unicode?
In utf-8 some characters take 2 or more bytes in memory, so when you
iterate over a plain string you get bytes, not characters. That works
well with 1-byte characters, but it will not work as expected with the
others. With a unicode object you iterate over the text character by
character.
Code example:
>>> for i in "My string with Ñ": print i
M
y
s
t
r
i
n
g
w
i
t
h
�
>>> for x in [i.encode('utf-8') for i in u"My string with Ñ"]: print x
M
y
s
t
r
i
n
g
w
i
t
h
Ñ
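The same difference can be checked by counting instead of printing. This is only a sketch with names of my own choosing (`byte_values`, `characters`); it uses bytearray so that it behaves the same on Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
text = u'My string with \u00d1'   # \u00d1 is the character Ñ
data = text.encode('utf-8')       # the utf-8 encoded form

# Iterating the encoded form walks over single bytes
byte_values = list(bytearray(data))

# Iterating the unicode object walks over whole characters
characters = list(text)

print(len(byte_values))   # Ñ counts as two bytes here
print(len(characters))    # Ñ counts as one character here
```

The last two entries of byte_values are the two bytes of Ñ, which is why the first loop above prints garbage for it.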
http://wiki.python.org/moin/Unicode
Regards.
~danielf