[Systems] Mostly offline until 2014-04-13

Bernie Innocenti bernie at codewiz.org
Tue Apr 1 16:11:16 EDT 2014


Thanks to both you and Roberto for the kind offer. I'll count on you guys!


On 04/01/2014 01:56 PM, Sebastian Silva wrote:
> Hi guys,
> Aleksey, thanks for the heads up. I'll keep an eye.
> 
> Bernie,
> I can look over the farm in your absence too.
> 
> A backup guy would be good, since I'm sometimes offline for up to half
> a day and I have two babies.

*TWO*? I must have missed the second one. Well, congrats! :-)


> While you're still here, maybe you can point me to how to set up a
> monitoring service that will alert us in the event the services hosted
> at network.sugarlabs.org suffer an outage?
>
> We've been experiencing some issues, and we consider this production,
> for use by end users. Specifically, the services sometimes become
> really slow, and sometimes they stop responding (something
> about "too many open files" in the logs, see
> http://tareas.somosazucar.org/hxp/issue71).
>
> I think we use munin, or something? It would be a good chance for me
> to get better acquainted with the infra.

There are various munin plugins you could use. Sunjammer is already
monitoring jita, so all you have to do is go to jita and enable a munin
plugin that sends a GET request to the website and checks the response.
You can probably also graph the response time. I think there's already
something in the default munin installation.
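
If nothing in the default installation fits, a hand-written plugin is
small. The following is only a rough sketch in Python, not a stock
munin plugin. It relies on the usual munin plugin convention (print
graph metadata when called with "config", print "field.value" lines
otherwise). The URL, the timeout, and the field names response_time
and up are my assumptions; adjust them to the real service.

```python
#!/usr/bin/env python3
"""Sketch of a munin plugin: time an HTTP GET and report up/down."""
import sys
import time
import urllib.error
import urllib.request

# Assumption: the endpoint to probe; replace with the real service URL.
URL = "http://network.sugarlabs.org/"
TIMEOUT = 10  # seconds, arbitrary choice


def config_lines():
    """Graph metadata munin asks for when the plugin runs with 'config'."""
    return [
        "graph_title HTTP response time",
        "graph_vlabel seconds",
        "graph_category network",
        "response_time.label response time",
        "up.label service up (1) or down (0)",
    ]


def probe(url, timeout=TIMEOUT, opener=urllib.request.urlopen):
    """Send one GET request; return (ok, elapsed_seconds)."""
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout) as resp:
            resp.read()
            ok = 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Covers HTTP errors, refused connections and timeouts.
        ok = False
    return ok, time.monotonic() - start


def value_lines(ok, elapsed):
    """Format one sample; 'U' is munin's marker for an unknown value."""
    timing = "response_time.value %.3f" % elapsed if ok else "response_time.value U"
    return [timing, "up.value %d" % (1 if ok else 0)]


def main(argv):
    if len(argv) > 1 and argv[1] == "config":
        lines = config_lines()
    else:
        lines = value_lines(*probe(URL))
    print("\n".join(lines))
    return 0

# When installed as a plugin, end the file with: sys.exit(main(sys.argv))
```

Once the file is executable and linked into the node's plugin directory,
running it by hand with and without the "config" argument is a quick way
to check the output before munin starts graphing it.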

Once the plugin is running, you can set up thresholds to get notified
by email. You can also point the email address at an SMS gateway to get
notified on your phone.
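
As a rough illustration of the threshold part: a plugin can declare
limits in its "config" output with "field.warning" and "field.critical"
lines, and munin raises a notification when a value leaves the range.
The field names and limits below are hypothetical, picked only for the
example. The mail recipient itself is set on the munin master (a
"contact" entry in munin.conf), which this sketch does not cover.

```python
def threshold_lines(field, warning=None, critical=None):
    """Return munin config lines declaring alert thresholds for one field."""
    lines = []
    if warning is not None:
        lines.append("%s.warning %s" % (field, warning))
    if critical is not None:
        lines.append("%s.critical %s" % (field, critical))
    return lines


# Hypothetical limits: warn above 2 s, critical above 5 s.
for line in threshold_lines("response_time", warning=2, critical=5):
    print(line)

# Ranges are written min:max; "1:" means at least 1, with no upper bound,
# so an up/down field reporting 0 would be critical.
for line in threshold_lines("up", critical="1:"):
    print(line)
```

These lines would be printed along with the rest of the plugin's config
output, next to the labels for the same fields.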

If the version of munin running on jita is too old, try upgrading to
munin 2.0. There are packages on launchpad. There are also extra plugins
that you can install by hand.

Ping me on #sugar for more details. But not now, I'm busy busy busy.


> Regards,
> Sebastian
> 
> On Sat, Mar 29, 2014 at 11:17 PM, Bernie Innocenti
> <bernie at codewiz.org> wrote:
>> On 03/29/2014 12:15 AM, Aleksey Lim wrote:
>>
>>     Hi all, I might not manage to react to server issues in time until
>>     2014-04-13. So, please, keep an eye on jita status. The list of
>>     services is on the wiki[1]; in most cases, restart them from
>>     /etc/init.d/*
>>
>>     [1] http://wiki.sugarlabs.org/go/User:Alsroot#services
>>
>> Thanks for the heads up, I'll keep an eye on jita for you. By the way:
>> I'm going on a trip to Nepal from Apr 25 to May 12, and most of the
>> time I'll be completely unreachable, not even by phone. So I'm looking
>> for someone to step up to monitor our services and respond to user
>> requests during this period.
>> -- 
>>  _ // Bernie Innocenti
>>  \X/  http://codewiz.org
>> _______________________________________________
>> Systems mailing list
>> Systems at lists.sugarlabs.org
>> http://lists.sugarlabs.org/listinfo/systems


-- 
 _ // Bernie Innocenti
 \X/  http://codewiz.org

