[Sugar-devel] [PATCH] OLPC #8857 - Browse fails to download some files with non-ascii characters

Gonzalo Odiard gonzalo at laptop.org
Fri Nov 19 15:09:35 EST 2010


I had a chat with silbe on IRC, but we didn't agree on a solution.
The problem we have right now is that we are trying to use a file name
that is not valid UTF-8. D-Bus returns an error, and you can't download the
file.
The proposed solution uses the encoding of the page to decode the file
name.
It is the same approach we (and Mozilla) use to display and manipulate the
URL from the page (browser.py line 300).
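
To make the problem concrete, here is a minimal Python 3 sketch (not the
actual patch). The helper name filename_from_url is made up for this
example, and ISO-8859-1 is only an assumed page encoding for the test URL.
It shows that the bytes behind the percent-escapes are not valid UTF-8 but
do decode with a single-byte page charset:

```python
import posixpath
from urllib.parse import urlsplit, unquote_to_bytes


def raw_filename(url):
    """Return the last path segment of url as raw, percent-decoded bytes."""
    return unquote_to_bytes(posixpath.basename(urlsplit(url).path))


def filename_from_url(url, page_charset):
    """Decode the file name with the charset of the page (hypothetical helper)."""
    return raw_filename(url).decode(page_charset)


url = "http://www.cerlyn.com/q/test2%f1%e2%e0.odt"

try:
    raw_filename(url).decode("utf-8")
except UnicodeDecodeError:
    # This is the situation that ends in a D-Bus error: the bytes are not UTF-8.
    print("file name is not valid UTF-8")

# Assumption: the page declares ISO-8859-1 as its encoding.
print(filename_from_url(url, "iso-8859-1"))
```

Decoding with the page charset gives a proper Unicode string, which can
then be sent as UTF-8 without error.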

More ideas?

Gonzalo


<silbe> gonzalo_: so it happens if you try to follow ("download") a link
where the URL has characters in it that are not valid in UTF-8?
<silbe> I.e. if you try to follow <a href="
http://www.cerlyn.com/q/test2%f1%e2%e0.odt">foo</a> ?
 or is it the filename= MIME attribute that gets sent by the web server?
<gonzalo_> silbe: yes, if you go to http://www.cerlyn.com/q/ and try to
download the last file
<Cerlyn> from a user's perspective you get the file downloaded but sugar
doesn't show any filename in the download dialogs or as part of the journal
name ("File{two spaces}from ...")
<-- acaleechurn has quit (Quit: Ex-Chat)
<gonzalo_> silbe, cerlyn: but internally there are errors, because dbus does
not accept the non-UTF-8 string in the name
 cerlyn: do you see the file saved in the journal?
<Cerlyn> gonzalo_: At least in os353 it gets saved
<Cerlyn> it just lacks a name
<gonzalo_> cerlyn: ok
<Cerlyn> useful name anyway
 the journal entry is labeled with the text that should surround the
downloaded file's original name
 or maybe only in some cases.  Clicking on it just shows "Download
completed" with no further info, and Show journal doesn't work, because
there is no journal entry
<silbe> gonzalo_: what does interfaces.nsITextToSubURI do if the URL is in a
different encoding than the web page (which I guess is where the value of
uri.originCharset comes from)?
<Cerlyn> Right clicking and choosing download attempts to download an
"Untitled" thing to the journal which doesn't complete.  Maybe I had more
luck yesterday for some odd reason
--> timClicks (~tim at 219-89-80-120.adsl.xtra.co.nz) has joined #sugar
<gonzalo_> silbe: i don't know
 cerlyn: i could not download the file without the patch
<silbe> gonzalo_: My fear is that it would break in the same way. And I
don't know of any requirement that URLs have to use the same encoding as the
medium they're used in (in this case HTML).
<gonzalo_> silbe: but i think the link that Browse sees is in the encoding
of the page
--> m_anishh (qwebirc at gateway/shell/sugarlabs.org/x-srydaokmidwatxhk) has
joined #sugar
<gonzalo_> silbe: think about it, Browse reads a link with the encoding
of the page
<silbe> gonzalo_: exactly. but as URLs don't specify encodings, where does
uri.originCharset get its value from?
<silbe> oh, wait, I misread what you wrote.
<gonzalo_> silbe: this case is different from putting an arbitrary link in
the location bar
<silbe> gonzalo_: I don't think we can rely on HTML pages containing links
_only_ to stuff in the same encoding as they use themselves.
 gonzalo_: e.g. if I take your link to the ODT file and put it on my web
page (which uses UTF-8) it would still break. And I think that's a valid use
case. :)
<gonzalo_> silbe: but until we find a hypothetical case where it can be
broken, why not fix what is broken now?
<silbe> gonzalo_: because if we don't do it the right way now, we'll have to
fix it _again_ later
<Cerlyn> Well where do we get the name value anyway?  The server can suggest
a name in an HTTP header, and I'd have to look up what encoding it is there
<silbe> Cerlyn: that was exactly my question. do we get it from the link
(href=) or from something the server sent us?
<gonzalo_> cerlyn, silbe: i don't know, it is in the nsURI object
<silbe> In the former case, we cannot assume anything about the encoding
(though we might try some guesses, but those can go wrong). In the latter
case we can probably (would need to cross-check with the standard) assume
some specific encoding.
<gonzalo_> cerlyn, silbe: what i am doing now in the download of the file is
the same thing we do in browser.py to get the url from the uri
 cerlyn, silbe: only in the download we were arbitrarily setting the
encoding to utf8
 silbe: can you look at browser.py line 301
<silbe> gonzalo_: sorry, I need to stop here. I'd like to finish the
second version of the clean-up series today and need to go to bed soon. Can
you summarise our discussion (or just cut&paste it) and post it to
sugar-devel (as reply to the patch), please?
<gonzalo_> silbe: ok
<silbe> gonzalo_: thx
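
One possible way to cover silbe's objection (a link whose bytes do not match
the encoding of the page it sits on) is a fallback chain. The sketch below is
Python 3 and only an illustration, not what Browse does: decode_filename and
its origin_charset parameter are made-up names. It tries strict UTF-8 first,
then the charset taken from the page, and finally replaces the undecodable
bytes, so the result is always a string that can be sent as valid UTF-8. As
noted in the chat, the guesses can still pick the wrong encoding.

```python
def decode_filename(raw, origin_charset=None):
    """Decode raw file name bytes, guessing the charset (hypothetical helper).

    raw            -- the percent-decoded bytes of the file name
    origin_charset -- charset of the page the link came from, if known
    """
    candidates = ["utf-8"]
    if origin_charset:
        candidates.append(origin_charset)

    for charset in candidates:
        try:
            return raw.decode(charset)
        except (UnicodeDecodeError, LookupError):
            # Wrong guess or unknown charset name: try the next candidate.
            continue

    # Last resort: keep the decodable part, mark the rest with U+FFFD.
    return raw.decode("utf-8", errors="replace")


raw = b"test2\xf1\xe2\xe0.odt"

# Link found on a page that (by assumption) uses ISO-8859-1.
print(decode_filename(raw, "iso-8859-1"))

# Same link copied to a UTF-8 page: no guess fits, but we still get a name.
print(decode_filename(raw, "utf-8"))
```

With this, the download would at least keep a usable (if imperfect) name
instead of failing or ending up untitled.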