[Sugar-devel] Unicode strings in translations

S. Daniel Francis francis at sugarlabs.org
Wed Aug 15 11:40:21 EDT 2012


2012/8/14 Gonzalo Odiard <gonzalo at laptop.org>:
>> - Strings with format
>> Example:
>> button.set_tooltip(_('Append %s') % _('something'))
>>
>
> The problem with this example is when you have language like Spanish,
> where some of the characters can be encoded in ascii, but not all.
> In this case, gettext will return a str or a PyUnicode depending of the
> case,
> and if are not compatibles, the format will break.
>

Well, the PyUnicode type is recommended for indexing and modifying a
string, and Manuel gave a good example, but activities don't have
enough such cases to justify using PyUnicode as the default for
translations. Non-ASCII characters often need more than one byte in
memory, and the Python str class indexes byte by byte, not character
by character. That's the main difference between the Python types str
and PyUnicode.
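
A minimal sketch of that difference, under some assumptions: it is
written for Python 3, where bytes plays the role of the str type
discussed here and Python 3 str plays the role of PyUnicode. The
Spanish word is only an illustrative value, not taken from any
activity.

```python
# Python 3 sketch: `bytes` stands in for the byte-indexed str type,
# and Python 3 `str` stands in for the PyUnicode type.

# Hypothetical translated string; \u00f1 is the letter n with tilde.
text = u'A\u00f1adir'

# The same text stored as UTF-8 bytes.
raw = text.encode('utf-8')

char_count = len(text)   # counts characters
byte_count = len(raw)    # counts bytes; the non-ASCII letter needs two

second_char = text[1]    # character indexing returns the whole letter
second_byte = raw[1:2]   # byte indexing returns only its first byte
```

Slicing the byte string in the middle of a multi-byte character gives
bytes that are not valid UTF-8 on their own, which is why indexing and
modifying is safer on the Unicode type.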

Python str literals are encoded in UTF-8 by default thanks to the
coding header line in the source file. Another use of the PyUnicode
type is to produce an encoded string of type str.
So, Python strings can be encoded in a Unicode-compatible charset like
UTF-8. The Python Unicode type is a way to represent a string if you
don't want to add a header, and it is the recommended way to work
inside the program. You shouldn't use it for output, though: encode
the PyUnicode content into a PyString with the UTF-8 charset before
output, and it won't cause any conflict.
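
The idea of working with Unicode internally and encoding only at
output could look roughly like this. It is a sketch, not code from any
activity: render_tooltip and write_utf8 are hypothetical helper names,
the strings are made-up stand-ins for what gettext would return, and
it is written for Python 3, where str is the Unicode type and bytes is
the encoded form.

```python
import io


def render_tooltip(template, item):
    """Format the tooltip while both pieces are still Unicode text."""
    return template % item


def write_utf8(stream, text):
    """Encode Unicode text as UTF-8 only at the moment of output."""
    stream.write(text.encode('utf-8'))


# Hypothetical translated strings; \u00f1 is the letter n with tilde.
tooltip = render_tooltip(u'A\u00f1adir %s', u'algo')

# A binary stream stands in for a file or any other byte-based output.
buffer = io.BytesIO()
write_utf8(buffer, tooltip)
output_bytes = buffer.getvalue()
```

Because the format operation happens while everything is still
Unicode, there is no mix of encoded and decoded strings to break it.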

Cheers.

