[Systems] [Sugar-devel] git.sugarlabs.org down for unplanned maintenance

Bernie Innocenti bernie at sugarlabs.org
Mon Apr 14 09:12:05 EDT 2014


On 04/12/2014 02:07 AM, Sebastian Silva wrote:
> I just got home. Sorry for the inconvenience I might have caused.
> 
> Bernie, do you know which log was/is growing out of hand?

Both access.log and node.sugarlabs.org.log. I discarded the first and
compressed the second (it compresses very well). You can still examine
it by doing:

  xzless access.log-20140411.xz | tail

You'll see lines like this one:

node.sugarlabs.org:80 181.65.159.107 - - [11/Apr/2014:19:51:36 -0400]
"GET /?cmd=subscribe HTTP/1.1" 200 232 "-" "python-requests/1.2.1
CPython/2.7.0 Linux/2.6.35.13_xo1.5-20120508.1139.olpc.eb0c7a8"


The problem seems to be that laptops retry the connection to
/context.atom and /feedback.atom in quick succession. The evidence is
probably near the end of the file, though. Don't try to uncompress the
whole file because it's over 2GB.
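
If you'd rather get a summary than page through the file, here is a
minimal Python sketch that streams the compressed log and counts
requests per client and path, so nothing gets uncompressed to disk. The
regex assumes every entry looks like the sample line above (vhost:port
first, then the client address); the function names are just
illustrative, and I haven't run this against the real file.

```python
import lzma
import re
from collections import Counter

# Assumed layout, taken from the sample entry above:
#   vhost:port client - - [timestamp] "METHOD path PROTO" status size ...
LINE_RE = re.compile(
    r'^(?P<vhost>\S+) (?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)


def count_requests(lines):
    """Count (client, path) pairs in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:  # lines that don't fit the assumed layout are skipped
            counts[(match.group("client"), match.group("path"))] += 1
    return counts


def count_requests_in_xz(path):
    """Stream an .xz log file line by line and count its requests."""
    # lzma.open decompresses incrementally, so the full uncompressed
    # log is never held in memory or written to disk.
    with lzma.open(path, "rt", encoding="utf-8", errors="replace") as log:
        return count_requests(log)
```

Something like count_requests_in_xz("access.log-20140411.xz").most_common(10)
should then list the noisiest client/path pairs.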


> Here's a report on everything I know about the issue.
> We've been experiencing some performance degradation and also some
> downtime in Sugar Network services (this is documented at
> http://tareas.somosazucar.org/hxp/issue71 ).
> We've seen a burst in users since deploying OS images with Sugar
> Network features (user_total at http://network.sugarlabs.org/stats-viewer/
> is growing pretty fast).
> There is a notification feature that is polling the sugar network node
> service.
> This was causing the allocation and exhaustion of resources (open
> files). Crashes got to a frequency of every hour or so.
> It's code I don't understand really well, but I went ahead and patched
> the Sugar Network with:
> http://tareas.somosazucar.org/hxp/file66/sn_disable_notifications.patch 
> This made the SN much snappier and it stopped crashing. However logs
> were saving a traceback several times per second. I thought I had
> contained the log issue but apparently I missed some other logs (I guess
> apache logs but they seem clean now).
> 
> I took a glance at jita and could not find the growing log.
> 
> Let me know where I can help with mitigation.
> 
> Regards
> Sebastian
> 
> 
> On Fri, Apr 11, 2014 at 7:51 PM, Bernie Innocenti
> <bernie at sugarlabs.org> wrote:
>> I was notified that git.sugarlabs.org was showing errors. After some
>> head scratching I realized that the root filesystem on jita was full. I
>> looked around and found giant request logs containing millions of
>> requests apparently originating from XOs located in Peru. We've been
>> DDOSed by our own creature :-) Anyway, the machine also had a giant,
>> very fragmented mysql database that I'm currently cleaning up.
>> Gitorious will be back online in less than 1 hour. Contact me on IRC
>> if this is blocking your work, I can postpone the maintenance.
>> -- 
>> Bernie Innocenti
>> Sugar Labs Infrastructure Team
>> http://wiki.sugarlabs.org/go/Infrastructure_Team


-- 
Bernie Innocenti
Sugar Labs Infrastructure Team
http://wiki.sugarlabs.org/go/Infrastructure_Team

