[Systems] solarsail: http down
Ivan Krstić
krstic at solarsail.hcs.harvard.edu
Sat Mar 14 22:05:28 EDT 2009
On Mar 14, 2009, at 11:37 PM, Sascha Silbe wrote:
> Both dev.sugarlabs.org and wiki.sugarlabs.org were down for over an
> hour.
A third-party service starts flooding my cell phone with alerts if
anything is down for an hour, and I didn't receive any notifications
about this before the kernel problem kicked in. I'll have to
investigate further; the machine probably slowed down at first but
kept working.
In general, downtime on this machine has been exceedingly rare, so I
haven't felt compelled to improve upon the combination of hourly
monitoring and a 20-minute watchdog. It's now time to rethink this.
Tomorrow, I will set up off-site Nagios with a 60-second check interval
for all of SL infra and post the details here.
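For illustration only, here is a rough Python sketch of what a
60-second HTTP availability check amounts to. It is not the Nagios
configuration, which has yet to be written; the function names
(is_up, down_hosts), the 10-second timeout, and the choice to treat
any status below 400 as "up" are assumptions made for the example.
The two URLs are simply the hosts mentioned in this thread.

```python
import urllib.error
import urllib.request

# Hosts named in this thread; the interval mirrors the planned 60 s checks.
URLS = ["http://dev.sugarlabs.org/", "http://wiki.sugarlabs.org/"]
INTERVAL_SECONDS = 60


def is_up(url, timeout=10, opener=urllib.request.urlopen):
    """Return True if the URL answers with an HTTP status below 400.

    The opener parameter is a hypothetical hook so the check can be
    exercised without touching the network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # Covers HTTP errors (4xx/5xx), DNS failures, refused
        # connections, timeouts, and malformed URLs.
        return False


def down_hosts(urls, check=is_up):
    """Run one round of checks and return the URLs that failed."""
    return [url for url in urls if not check(url)]
```

A driver would call down_hosts(URLS) in a loop, sleep for
INTERVAL_SECONDS between rounds, and send an alert whenever the
returned list is non-empty. Running it off-site matters: a checker on
the monitored machine goes quiet at the same time the machine does.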
Cheers,
--
Ivan Krstić <krstic at solarsail.hcs.harvard.edu> | http://radian.org