[Systems] Migrating to the Media Lab

Bernie Innocenti bernie at sugarlabs.org
Fri Jan 13 19:34:22 EST 2012


Hello everyone,

as anticipated long ago, the FSF is losing its rack at GNAPS, so we'll
be relocating sunjammer and treehouse to a new server room at the Media
Lab. The absolute deadline to move out is Feb 29, but it would be safest
to complete the migration sooner than that, possibly by mid February.

Here's the current plan:


== Phase 1: treehouse -> housetree ==

To minimize downtime, we'll temporarily migrate all our VMs to
housetree, a server which is already racked in E15.

The question is whether housetree can take the load of 13 additional
virtual machines:

Id Name                 State
----------------------------------
 54 aslo-web             running
130 zatoichi             running
180 schooltool           running
185 anno                 running
227 booki                running
228 subuntu              running
233 lightwave            running
234 rt                   running
240 pootle               running
243 ole                  running
247 identity             running
248 idea                 running
249 monitoring           running

The most critical machines for Sugar Labs are lightwave, aslo-web and
pootle. rt is not essential, but it would be nice to keep it running. We
can temporarily turn off subuntu and schooltool.

I need to know from the PyEdu folks if zatoichi is still in production.

All the others (anno, booki, ole, identity, idea and monitoring) belong
to dogi.

== Housetree preparations ==

Dogi and I have been working to optimize housetree. Last week, the load
was peaking at over 10 with almost nothing running on it.

We stopped a couple of unused VMs (openqwaq & template-squeeze) and
solved a few issues with munin and jita. Two VMs (ole2 and munin) are
still causing an abnormally high load, which I suspect comes from poor
I/O performance on a fragmented qcow2 file. We'll migrate them to LVM
partitions over the week-end.

Housetree also has a broken drive. We've already bought a spare, but we
decided to postpone the replacement until after the migration is over.


== IPv4 and IPv6 ==

Currently we have only 7 IPv4 addresses assigned to housetree, which is
not nearly enough for all our VMs. Dogi noted that DHCP at the Media
Lab gives out leases that remain stable for long periods of time, but
in the long term we need a subnet with 32 or, better, 64 IPs dedicated
to Sugar Labs, possibly with reverse DNS delegation. Dogi will ask
Michailis.
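
For reference, here is a rough sketch of what a 32- or 64-address
allocation looks like in CIDR terms. The 192.0.2.0 base is only a
documentation placeholder (RFC 5737), not an address we have been
assigned, and prefix_for_block is a made-up helper name.

```python
import ipaddress


def prefix_for_block(total_addresses: int) -> int:
    """Return the longest IPv4 prefix whose block holds total_addresses."""
    bits = max(total_addresses - 1, 0).bit_length()
    return 32 - bits


# Placeholder base address from the RFC 5737 documentation range.
for size in (32, 64):
    prefix = prefix_for_block(size)
    net = ipaddress.ip_network(f"192.0.2.0/{prefix}")
    # Network and broadcast addresses are not usable for hosts.
    usable = net.num_addresses - 2
    print(f"{size} IPs -> /{prefix}, {usable} usable host addresses")
```

Either block is smaller than a /24, so reverse DNS delegation would
probably have to be done classless-style (RFC 2317), assuming the Media
Lab network admins are willing to set that up.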

As for IPv6, currently the Media Lab does not provide real IPv6 transit,
but we can still use 6to4.
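
As a reminder of how 6to4 addressing works: each public IPv4 address
maps to a /48 under 2002::/16, with the 32 bits of the IPv4 address
placed right after the 2002 prefix. A minimal sketch follows; the
address 192.0.2.1 is a documentation placeholder and sixto4_prefix is a
made-up helper name.

```python
import ipaddress


def sixto4_prefix(ipv4: str) -> ipaddress.IPv6Network:
    """Derive the 6to4 /48 prefix (2002:V4ADDR::/48) for a public IPv4 address."""
    v4 = int(ipaddress.IPv4Address(ipv4))
    # 2002 occupies the top 16 bits, the IPv4 address the next 32 bits.
    prefix_int = (0x2002 << 112) | (v4 << 80)
    return ipaddress.IPv6Network((prefix_int, 48))


# Placeholder address, not one of housetree's real IPs.
print(sixto4_prefix("192.0.2.1"))
```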


== DNS adjustments ==

The TTL in our DNS is currently 3600 seconds. I'll lower it to 30
seconds before starting the migration.
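
Note that the lower TTL only helps once the old one has expired from
resolver caches, so the change has to go in at least 3600 seconds before
the first IP switch. A small sketch of the arithmetic, with a made-up
timestamp and a hypothetical helper name; it ignores any delay in
propagating the zone to secondary name servers.

```python
from datetime import datetime, timedelta


def earliest_safe_cutover(ttl_lowered_at: datetime, old_ttl_seconds: int) -> datetime:
    """Earliest time at which no resolver should still cache the old TTL."""
    return ttl_lowered_at + timedelta(seconds=old_ttl_seconds)


# Hypothetical example: TTL lowered at noon, old TTL of 3600 seconds.
lowered = datetime(2012, 2, 1, 12, 0)
print(earliest_safe_cutover(lowered, 3600))
```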


== Sunjammer ==

Sunjammer will move last, after treehouse is back online in the Media
Lab. Currently, it's a Xen domU running on an FSF machine which is more
or less as fast as treehouse. We can use the daily backups to speed up
the filesystem migration.


== activities.sugarlabs.org ==

An additional complication is that aslo-web requires a low-latency
connection to sunjammer for NFS, which won't be available during the
transition.

Tonight I've removed aslo-web from the load balancer to see if sunjammer
survives the extra load. I don't expect any trouble during the week-end,
and I'll be on vacation on Monday so I can monitor the situation.


== Physical access ==

We currently have no access to the machine room, so if something happens
on a Friday night we're screwed for the whole week-end. Walter has an
MIT pass and Michailis should be able to get him access to the machine
room as well.

It seems that we could also request a special card type called "DLC /
Unofficial Members of the MIT Community" from some office in the Media
Lab.

-- 
Bernie Innocenti
Sugar Labs Infrastructure Team
http://wiki.sugarlabs.org/go/Infrastructure_Team


