[Bugs] #3660 Wikipedia UNSP: Wikipedia: Chars # and " in the article title break data generation process

Sugar Labs Bugs bugtracker-noreply at sugarlabs.org
Wed May 30 12:02:53 EDT 2012


#3660: Wikipedia: Chars # and " in the article title break data generation process
------------------------------------------+---------------------------------
    Reporter:  godiard                    |          Owner:  godiard                    
        Type:  defect                     |         Status:  new                        
    Priority:  Unspecified by Maintainer  |      Milestone:  Unspecified by Release Team
   Component:  Wikipedia                  |        Version:  Unspecified                
    Severity:  Unspecified                |       Keywords:                             
Distribution:  Unspecified                |   Status_field:  Unconfirmed                
------------------------------------------+---------------------------------
 After running pages_parser.py, the .links file contains links with '"' and
 '#' characters, and make_selection.py then adds those links to
 pages_selected-level-1.

 The '"' character produces errors when inserting into the SQL database. A
 '#' points to a section inside another article, so those links should be
 ignored.
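
 As an illustration of the quoting problem only: the sketch below assumes
 an SQLite database and a hypothetical "articles" table, neither of which
 is confirmed by this ticket. It shows that passing titles as bound
 parameters, rather than building the SQL string by hand, lets a '"' in a
 title through as plain data. This is a different technique from removing
 the characters, and might complement it.

```python
import sqlite3


def insert_titles(conn, titles):
    """Insert titles with '?' placeholders so quotes are treated as data."""
    # Hypothetical table name and schema, used only for this example.
    conn.execute("CREATE TABLE IF NOT EXISTS articles (title TEXT)")
    conn.executemany(
        "INSERT INTO articles (title) VALUES (?)",
        [(title,) for title in titles],
    )
    conn.commit()


# In-memory database so the example leaves no file behind.
conn = sqlite3.connect(":memory:")
insert_titles(conn, ['The_"Real"_Thing', 'Plain_title'])
rows = [row[0] for row in conn.execute("SELECT title FROM articles ORDER BY rowid")]
```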

 Some of the errors were fixed (and others were avoided by editing the
 pages_selected file by hand), but these characters should be removed
 earlier in the process (probably in make_selection.py).
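
 A minimal sketch of what such an early cleanup step could look like. The
 function names clean_link and clean_links are hypothetical and are not
 taken from make_selection.py; the sketch assumes the titles are available
 as a list of strings.

```python
def clean_link(title):
    """Return a cleaned title, or None if the link should be ignored."""
    # A '#' marks a section anchor inside an article, so skip the link.
    if "#" in title:
        return None
    # Drop double quotes, the character reported to break the SQL insert.
    cleaned = title.replace('"', "").strip()
    # An empty result is not a usable title.
    return cleaned or None


def clean_links(titles):
    """Clean a list of titles, keeping order and dropping duplicates."""
    seen = set()
    result = []
    for title in titles:
        cleaned = clean_link(title)
        if cleaned is not None and cleaned not in seen:
            seen.add(cleaned)
            result.append(cleaned)
    return result


selected = clean_links(['Foo', 'Bar#History', 'The_"Real"_Thing', 'Foo'])
```

 Note that stripping '"' changes the title, so a cleaned title may no
 longer match the article it referred to; whether to strip the character
 or to skip the link entirely is a design choice this sketch does not
 settle.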

-- 
Ticket URL: <http://bugs.sugarlabs.org/ticket/3660>
Sugar Labs <http://sugarlabs.org/>
Sugar Labs bug tracking system

