[Systems] aslo cluster progress

David Farning dfarning at gmail.com
Fri Mar 12 16:02:51 EST 2010


On Fri, Mar 12, 2010 at 2:48 PM, Stefan Unterhauser <dogi at sugarlabs.org> wrote:

> hi david
>
> On Fri, Mar 12, 2010 at 7:48 PM, David Farning <dfarning at gmail.com> wrote:
> > We had a pretty big success yesterday.  We brought up a third aslo node,
> > taking it from stock Ubuntu server to fully functioning aslo node with
> > one command.  The goal is to reduce aslo cluster management as much as
> > possible:
> > 1. No backups.  It is easier to rebuild a machine than restore it from
> > backup.
>
> ++1
> the recipe finally is more important than the backup
>

BTW, I have been turning off the machines from http://18.214.0.179 when they
are not in use to reduce the noise, heat and energy usage for our generous
hosts (Pika).  If anyone wants to test/develop, please feel free to turn
the machines on as needed.

david

> > 2. Easy to replace machines.  Replacing a dead machine or adding a machine
> > to the cluster is easy and foolproof.  (Easy and foolproof is important
> > to anything I maintain.)
> > 3. If we need to change a configuration, that change is automatically
> > propagated to all machines.
> >
> > We had only one problem that needed manual intervention.  I screwed up the
> > order of disabling the apache default site and installing apache.  Had
> > to go back in and remove the default site by hand :)  Still, it was pretty
> > cool.
> > Nodes two or three can die without affecting service.
> >
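
A rough sketch of the ordering issue described above, in Python rather
than Puppet.  The step names and the dependency table are made up for
illustration and are not taken from the actual aslo recipe; the only point
is that "disable the default site" has to be declared as depending on
"install apache" so the two can never run in the wrong order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical provisioning steps.  Each step maps to the set of steps
# that must have finished before it may run.
steps = {
    "install_apache": set(),
    "disable_default_site": {"install_apache"},
    "enable_aslo_site": {"install_apache", "disable_default_site"},
    "reload_apache": {"enable_aslo_site"},
}


def provisioning_order(deps):
    """Return the step names in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


order = provisioning_order(steps)
print(order)
```

In Puppet itself the same relationship is normally declared with the
`require` or `before` metaparameters on the resources involved, so the
tool works out the order instead of relying on where things appear in the
file.
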
> > Node one is still a bottleneck for:
> > 1. The database.
> > 2. The shared file system.
> > 3. The loadbalancer.
> >
> > Work this weekend will focus on setting up redundant 'masters' for the
> > database, filesystem, and loadbalancer.  The goal will be for an admin
> > (human) to be able to manually switch control of the cluster from one
> > node to the other.  From there it will be a matter of adding the High
> > Availability (HA) function so the cluster can pass control around on its
> > own.
> >
> > The interesting problem is not switching control from one machine to
> > another machine.  The main problem is ensuring that the original 'master'
> > does not wake up and think that it is still in control.
> >
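
One common way to deal with a stale master is a fencing token: every
promotion bumps a counter, and the shared resource refuses writes that
carry an older counter value.  The sketch below only illustrates that
idea.  It is an assumption, not how the aslo cluster is set up, and the
class and node names are hypothetical.

```python
class SharedResource:
    """Stand-in for the database or shared file system (hypothetical)."""

    def __init__(self):
        self.highest_epoch = 0
        self.log = []

    def write(self, epoch, node, data):
        # Refuse any write that carries an epoch older than the newest seen.
        if epoch < self.highest_epoch:
            raise PermissionError(
                f"{node} is fenced: epoch {epoch} < {self.highest_epoch}"
            )
        self.highest_epoch = epoch
        self.log.append((epoch, node, data))


class Cluster:
    """Tracks which node is master and hands out a new epoch per promotion."""

    def __init__(self):
        self.epoch = 0
        self.master = None

    def promote(self, node):
        self.epoch += 1
        self.master = node
        return self.epoch


resource = SharedResource()
cluster = Cluster()

old_epoch = cluster.promote("node1")
resource.write(old_epoch, "node1", "a")

# An admin switches control to node2; node1 has not noticed yet.
new_epoch = cluster.promote("node2")
resource.write(new_epoch, "node2", "b")

# node1 "wakes up" and tries to write with its old epoch.
try:
    resource.write(old_epoch, "node1", "c")
    stale_rejected = False
except PermissionError:
    stale_rejected = True
```

The check lives in the resource, not in the old master, so it still works
when the old master has no idea it was replaced.
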
> > After a couple of slow weeks learning Puppet it is nice to be moving
> > forward again.
> >
> > david