[Systems] [IAEP] wiki.sugarlabs.org ongoing maintenance

Bernie Innocenti bernie at sugarlabs.org
Mon Apr 22 21:47:42 EDT 2013


On 04/17/2013 09:23 PM, Frederick Grose wrote:
> On Wed, Apr 17, 2013 at 9:19 AM, Bernie Innocenti <bernie at sugarlabs.org
> <mailto:bernie at sugarlabs.org>> wrote:
> 
>     On 04/16/13 20:39, Frederick Grose wrote:
>     > There was a page,
>     >
>     http://wiki.sugarlabs.org/go/Sugar_on_a_Stick/%CA%BB%C5%8Chelo_%CA%BBai
>     >
>     > that can no longer be found.
> 
>     Hmm... an encoding issue, maybe? Do you remember some of its content, to
>     see if the page pops up in search under a different title?
> 
>     --
>     Bernie Innocenti
>     Sugar Labs Infrastructure Team
>     http://wiki.sugarlabs.org/go/Infrastructure_Team
> 
> 
> It was mostly like
> http://wiki.sugarlabs.org/go/Sugar_on_a_Stick/Quandong
> 
> Notice also the last two subpage links listed on the subpage index for  
> http://wiki.sugarlabs.org/go/Sugar_on_a_Stick
> <http://wiki.sugarlabs.org/go/Sugar_on_a_Stick/Quandong>
> (in the [show ▼] frame)
> which behave like nonexistent pages.

Ok, I figured out what happened: the mysqldump & restore that I did
prior to the upgrade somehow trashed the encoding.

MySQL offers a plethora of options that allow setting a per-table and
per-database character encoding that can differ from the encoding used
by the server and the one used by the client. Strings should supposedly
be "automatically" converted back and forth, but in practice getting it
right in all weird cases is difficult and involves a certain amount of
prayer.

Our main MediaWiki instance has always been configured to use utf8, but
Debian's default encoding is latin1 and evidently some data was being
interpreted as latin1 even though the tables were configured as utf8.
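
To illustrate this kind of mix-up, here is a small Python sketch. It is
not the actual MySQL code path and the variable names are made up for the
example. It only shows that UTF-8 bytes read back as latin1 turn each
non-ASCII character into two or more unrelated ones, and that this
particular kind of damage can be reversed as long as no bytes were lost:

```python
# Illustration only: UTF-8 bytes misread as latin1, then repaired.
title = "Sugar_on_a_Stick/ʻŌhelo_ʻai"

utf8_bytes = title.encode("utf-8")

# Misinterpretation: every single byte is treated as one latin1 character.
garbled = utf8_bytes.decode("latin-1")

# Repair: turn the characters back into the same bytes, then decode them
# with the encoding they were written in.
repaired = garbled.encode("latin-1").decode("utf-8")

print(len(title), len(garbled))  # the garbled title has more characters
print(repaired == title)
```

Whether the titles in our database were damaged in exactly this
reversible way is an assumption; a dump and restore can also lose bytes
outright, in which case no round trip brings them back.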

Bah. Anyway, the actual *content* of all pages seems ok, while the page
titles which contained non-ASCII characters got screwed. I manually
fixed the ʻŌhelo_ʻai pages and User:Ignacio_Rodríguez. These are the
incantations that did the trick:

  update page set page_title='Sugar_on_a_Stick/ʻŌhelo_ʻai'
    where page_title like 'Sugar_on_a_Stick/%helo_%ai';
  update user set user_name='Ignacio Rodríguez'
    where user_name like 'Ignacio Rodr%';

I'm not sure how to get a list of all pages containing weird characters.
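
One possible approach is sketched below in Python. It assumes the titles
have already been fetched from the page table into a list of strings;
the function name and the sample data are hypothetical, and it needs
Python 3.7 or later for str.isascii():

```python
def non_ascii_titles(titles):
    """Return the titles that contain at least one non-ASCII character."""
    return [t for t in titles if not t.isascii()]


# Hypothetical sample; real input would come from the page table.
sample = [
    "Sugar_on_a_Stick/Quandong",
    "Sugar_on_a_Stick/ʻŌhelo_ʻai",
    "Infrastructure_Team",
]

for t in non_ascii_titles(sample):
    print(t)
```

This only narrows the list down to candidates. A title with non-ASCII
characters may be perfectly fine, so each hit still has to be checked by
eye against what the page should be called.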

-- 
Bernie Innocenti
Sugar Labs Infrastructure Team
http://wiki.sugarlabs.org/go/Infrastructure_Team

