[Systems] [IAEP] ALERT JUSTICE DOWN [24 hours without response] (!)

Bernie Innocenti bernie at sugarlabs.org
Thu Jun 18 15:09:08 EDT 2015


On 06/18/2015 03:01 PM, Gonzalo Odiard wrote:
> Any chance to check whether the disks are dying, or whether there is some
> other reason for these instabilities?

smartctl reports nothing odd, and in any case the server would keep
responding to pings even if both disks in the RAID array were dead.

So I'm thinking it's either a kernel bug, or unstable hardware.
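
For anyone who wants to repeat the disk-side check, here is a minimal
sketch of a degraded-array test. It assumes Linux md software RAID (the
thread does not say which RAID implementation justice uses) and the usual
/proc/mdstat layout with arrays named md0, md1, and so on. The function
name degraded_arrays is made up for illustration.

```python
import re


def degraded_arrays(mdstat_text):
    """Return the names of md arrays that report a failed or missing member.

    Assumes the usual /proc/mdstat layout: a header line such as
    "md0 : active raid1 sdb1[1] sda1[0]" followed by a status line
    ending in something like "[2/2] [UU]".
    """
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # In "[2/1] [U_]" each "U" is a healthy member, each "_" a dead one.
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            if "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded
```

On the host itself this would be fed the contents of /proc/mdstat; an
empty list means every array still has all of its members.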


> Gonzalo
> 
> On Thu, Jun 18, 2015 at 3:56 PM, Bernie Innocenti
> <bernie at sugarlabs.org> wrote:
> 
>     +systems@
> 
>     I rebooted justice from the management console and it's now responding
>     to pings.
> 
>     I couldn't view the screen capture and I had no time to go to the Media
>     Lab to physically inspect the machine, so I don't understand the
>     root cause.
> 
>     As reported by Dogi, Justice seems to crash every 1-2 months. I suggest
>     we try the following steps:
> 
>     1. Upgrade justice to Ubuntu 14.04 (as we did with freedom a year ago).
> 
>     2. If crashes continue, go to the server room and swap the drives with
>     freedom (which is our hot-swap server and doesn't currently run anything
>     critical).
> 
>     3. Ask the ML again to give one of us physical access to the server
>     room. I work nearby, but I have trouble leaving during office hours on a
>     personal errand, and if anything happens over a weekend we're in
>     trouble.
> 
>     Sebastian: you should at least get access to the management console.
>     Ping me on IRC and I'll send you the credentials on a secure channel.
> 
> 
>     On 06/18/2015 10:40 AM, Sebastian Silva wrote:
>     > Hello Sugar Oversight Board, Sugar Labs Members,
>     >
>     > Our main production server virtual machine host is down and I can't
>     > reach it.
>     > We have several systems that depend on this infrastructure, including
>     > the Pootle server, which was actively being used by translators of the
>     > Aymara and Awajun native languages.
>     >
>     > I respectfully request that you phone whoever has physical access to
>     > this machine so that we can try to bring it back online. I think this
>     > should be either Bernie Innocenti or Stefan Unterhauser.
>     >
>     > Also, I would like to request that more volunteers from the
>     > infrastructure team get virtual terminal access to these machines (not
>     > just ssh), or that we move them to a proper colocation service where we
>     > can get some support.
>     >
>     > Thanks in advance for your help.
>     > Sebastian
>     >
>     > On 17/06/15 20:55, Sebastian Silva wrote:
>     >> Affected services:
>     >> translate.sugarlabs.org <http://translate.sugarlabs.org>
>     >> git.sugarlabs.org <http://git.sugarlabs.org>
>     >> packages.sugarlabs.org <http://packages.sugarlabs.org>
>     >>
>     >>
>     >>
>     >> On 17/06/15 20:48, Sebastian Silva wrote:
>     >>> We can't reach it.
>     >>>
>     >>> Anybody with physical access to the machine please respond.
>     >>>
>     >>>
>     >>> Regards,
>     >>> Sebastian
> 
>     --
>     Bernie Innocenti
>     Sugar Labs Infrastructure Team
>     http://wiki.sugarlabs.org/go/Infrastructure_Team
>     _______________________________________________
>     IAEP -- It's An Education Project (not a laptop project!)
>     IAEP at lists.sugarlabs.org
>     http://lists.sugarlabs.org/listinfo/iaep
> 
> 
> 
> 
> -- 
> Gonzalo Odiard
> 
> SugarLabs - Software for children learning 


-- 
Bernie Innocenti
Sugar Labs Infrastructure Team
http://wiki.sugarlabs.org/go/Infrastructure_Team

