[Sugar-devel] [ASLO] Release Wikipedia-33.5

Daniel Drake dsd at laptop.org
Mon Mar 19 10:56:56 EDT 2012


On Sat, Mar 17, 2012 at 1:29 PM, Gonzalo Odiard <gonzalo at laptop.org> wrote:
> It's a consequence of updating the data.
> The English Wikipedia has grown from 1.5M articles in 2007 to almost 4M now
> [1], and the articles are longer too.

If I remember correctly, the process used originally was:
1. Rank all Wikipedia pages according to some measure of popularity
(page views?)
2. Take articles from the top of the list, one by one, going down the
list until 100 MB of space was used.
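
As a rough illustration of those two steps, here is a minimal Python sketch.
It is not the original tooling: the Article record, its field names and the
select_articles function are all hypothetical, page views are assumed as the
popularity measure, and the size is assumed to be whatever each article
occupies once stored. It stops at the first article that would exceed the
budget, which is one reading of "until 100 MB of space was used".

```python
from typing import Iterable, List, NamedTuple


class Article(NamedTuple):
    # Hypothetical record; the field names are assumptions for illustration.
    title: str
    page_views: int   # assumed popularity measure
    size_bytes: int   # assumed size of the article as stored


def select_articles(articles: Iterable[Article],
                    budget_bytes: int = 100 * 1024 * 1024) -> List[Article]:
    """Pick the most popular articles until the space budget is used up."""
    # Step 1: rank by popularity, most viewed first.
    ranked = sorted(articles, key=lambda a: a.page_views, reverse=True)

    # Step 2: walk down the list, stopping at the first article that
    # would push the total over the budget.
    chosen: List[Article] = []
    used = 0
    for article in ranked:
        if used + article.size_bytes > budget_bytes:
            break
        chosen.append(article)
        used += article.size_bytes
    return chosen
```

Running this again on fresh page-view and size data would give a new
selection under the same budget. A variant could skip an oversized article
and keep going instead of stopping, which fills the budget more tightly but
no longer follows the ranking strictly.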

Perhaps the same process can be run again, so that it not only brings in
new content but also takes updated popularity into account?

This would also solve the space usage issue. The growth of Wikipedia
and WikipediaES means that we can no longer produce 2GB disk images,
because the contents no longer fit. Some of this is undoubtedly due to
F17 bloat, but the combined 90 MB growth of these two activities is a
lot to ask for in any case.

Daniel

