[Bugs] #1348 UNSP: infoSlicer not able to download new articles
Sugar Labs Bugs
bugtracker-noreply at sugarlabs.org
Wed Oct 28 19:22:42 EDT 2009
#1348: infoSlicer not able to download new articles
------------------------------------------+---------------------------------
Reporter: walter | Owner: walter
Type: defect | Status: new
Priority: Unspecified by Maintainer | Milestone: Unspecified by Release Team
Component: InfoSlicer | Version: Unspecified
Severity: Blocker | Keywords:
Distribution: Unspecified | Status_field: Unconfirmed
------------------------------------------+---------------------------------
Changes (by jpichon):
* cc: jpichon (added)
Comment:
I'm attaching a patch that fixes the article retrieval issue. Afterwards I
noticed that most headings were missing from articles from the English
Wikipedia, and a few headings were missing in the other Wikipedias as well.
The 2nd patch fixes this by treating more tags as having relevant
content.
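To illustrate the idea behind the 2nd patch, here is a minimal sketch, not the InfoSlicer code itself. It uses the standard library's html.parser rather than BeautifulSoup so it runs on its own, and the names CONTENT_TAGS, ContentExtractor and extract are hypothetical. The point is only that text is kept when its enclosing tag is in a whitelist, so headings disappear if the whitelist leaves out the heading tags.

```python
from html.parser import HTMLParser

# Hypothetical whitelist; the real patch may use a different set of tags.
CONTENT_TAGS = {"p", "h1", "h2", "h3", "h4", "h5", "h6"}


class ContentExtractor(HTMLParser):
    """Collect (tag, text) pairs for tags treated as relevant content."""

    def __init__(self, content_tags=CONTENT_TAGS):
        super().__init__()
        self.content_tags = content_tags
        self.blocks = []       # finished (tag, text) pairs
        self._open_tag = None  # whitelisted tag currently being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        # Only start collecting when no whitelisted tag is already open.
        if self._open_tag is None and tag in self.content_tags:
            self._open_tag = tag
            self._text = []

    def handle_data(self, data):
        if self._open_tag is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == self._open_tag:
            text = "".join(self._text).strip()
            if text:
                self.blocks.append((tag, text))
            self._open_tag = None


def extract(html, content_tags=CONTENT_TAGS):
    """Return the (tag, text) blocks found in an HTML string."""
    parser = ContentExtractor(content_tags)
    parser.feed(html)
    parser.close()
    return parser.blocks
```

Calling extract() with a whitelist of only {"p"} drops the headings, which is roughly the symptom described above; adding the h1 to h6 tags brings them back.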
There's still another, aesthetic problem: a few blank lines appear at the
top of newly downloaded articles. I haven't been able to fix that yet; I
just know that it's related to the pre_parse function in HTML_Parser.py.
The only workaround I have for now is to reinitialise self.input with
BeautifulSoup after calling pre_parse(), and I'm not sure that would be
appropriate for a patch. I still hope to figure out what the real problem
is.
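For comparison, a different stopgap from the one described above would be to trim the blank lines from the generated text after parsing. This is only a sketch under that assumption: strip_leading_blank_lines is a hypothetical helper, it is not part of HTML_Parser.py, and it hides the symptom rather than fixing whatever pre_parse() does.

```python
def strip_leading_blank_lines(text):
    """Drop empty or whitespace-only lines from the start of text."""
    lines = text.splitlines(keepends=True)
    start = 0
    # Skip lines that contain nothing but whitespace.
    while start < len(lines) and not lines[start].strip():
        start += 1
    return "".join(lines[start:])
```

Blank lines further down in the article are left alone, so paragraph spacing inside the text is not affected.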
--
Ticket URL: <http://bugs.sugarlabs.org/ticket/1348#comment:1>
Sugar Labs <http://sugarlabs.org/>
Sugar Labs bug tracking system