[Systems] aslo cluster progress
David Farning
dfarning at gmail.com
Fri Mar 12 16:02:51 EST 2010
On Fri, Mar 12, 2010 at 2:48 PM, Stefan Unterhauser <dogi at sugarlabs.org> wrote:
> hi david
>
> On Fri, Mar 12, 2010 at 7:48 PM, David Farning <dfarning at gmail.com> wrote:
> > We had a pretty big success yesterday. We brought up a third aslo node,
> > going from stock Ubuntu server to a fully functioning aslo node with one
> > command. The goal is to reduce aslo cluster management as much as
> > possible:
> > 1. No backups. It is easier to rebuild a machine than restore it from
> > backup.
>
> ++1
> the recipe finally is more important than the backup
>
BTW, I have been turning off the machines from http://18.214.0.179 when they
are not in use to reduce the noise, heat and energy usage for our generous
hosts (Pika). If anyone wants to test/develop, please feel free to turn
the machines on as needed.
david
> > 2. Easy to replace machines. Replacing a dead machine or adding a machine
> > to the cluster is easy and foolproof. (Easy and foolproof is important to
> > anything I maintain.)
> > 3. If we need to change a configuration, that change is automatically
> > propagated to all machines.
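A rough Python sketch of the idea behind points 1 to 3: every node is driven toward one shared description of the desired state, so a stock machine and a long-running one end up identical. This is only an illustration, not the actual Puppet recipe; the resource names and the `converge` helper are hypothetical.

```python
# Hypothetical desired state shared by every node; names are illustrative only.
DESIRED = {"apache2": "installed", "default-site": "disabled"}


def converge(node_state, desired):
    """Bring one node's state in line with the desired state.

    Returns the list of changes made. An empty list means the node
    already matched, so running this repeatedly is harmless.
    """
    changes = []
    for resource, wanted in desired.items():
        if node_state.get(resource) != wanted:
            node_state[resource] = wanted
            changes.append((resource, wanted))
    return changes


# Three nodes in different starting conditions, including a stock machine.
nodes = {
    "node1": {"apache2": "installed", "default-site": "disabled"},
    "node2": {"apache2": "installed", "default-site": "enabled"},
    "node3": {},  # freshly installed, nothing configured yet
}

for name, state in nodes.items():
    print(name, converge(state, DESIRED))
```

Changing `DESIRED` and running the loop again is the whole "propagation" step in this toy model, which is also why rebuilding a node can stand in for restoring it from backup.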
> >
> > We had only one problem that needed manual intervention. I screwed up the
> > order of disabling the Apache default site and installing Apache. Had to
> > go back in and remove the default site by hand :) Still, it was pretty
> > cool.
> > Nodes two or three can die without affecting service.
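The ordering slip above is a dependency problem: the default site cannot be disabled before Apache is installed. Puppet expresses this with relationships such as the `require` metaparameter. The sketch below shows the same idea in Python using the standard library's `graphlib` (Python 3.9+); the step names are hypothetical and are not taken from the real aslo recipe.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each step maps to the set of steps that must finish before it runs.
# Step names are hypothetical, chosen only to mirror the problem described.
steps = {
    "install_apache": set(),
    "disable_default_site": {"install_apache"},
    "enable_aslo_site": {"install_apache"},
    "restart_apache": {"disable_default_site", "enable_aslo_site"},
}

# static_order() yields every step after all of its prerequisites.
order = list(TopologicalSorter(steps).static_order())
print(order)
```

Declaring the dependency once, rather than relying on the order in which steps happen to be written, is what keeps a one-command rebuild from needing a manual fix afterwards.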
> >
> > Node one is still a bottleneck for:
> > 1. The database.
> > 2. The shared file system.
> > 3. The loadbalancer.
> >
> > Work this weekend will focus on setting up redundant 'masters' for the
> > database, filesystem, and loadbalancer. The goal will be for an admin
> > (human) to be able to manually switch control of the cluster from one node
> > to the other. From there it will be a matter of adding the High
> > Availability (HA) function so the cluster can pass control around on its
> > own.
> >
> > The interesting problem is not switching control from one machine to
> > another machine. The main problem is ensuring that the original 'master'
> > does not wake up and think that it is still in control.
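One common way to handle a stale master is a fencing token: each handover bumps an epoch number, and the shared resource refuses writes that carry an older epoch. The sketch below is a minimal illustration of that technique under assumed names (`SharedResource`, `Master`, `epoch`). It is not a description of what the aslo cluster actually does or will do.

```python
class SharedResource:
    """Stands in for the database or shared filesystem (hypothetical)."""

    def __init__(self):
        self.highest_epoch = 0
        self.log = []

    def write(self, epoch, data):
        # Reject anything from a master whose term is older than one already seen.
        if epoch < self.highest_epoch:
            return False
        self.highest_epoch = epoch
        self.log.append(data)
        return True


class Master:
    """A node that believes it is in control during the given epoch."""

    def __init__(self, name, epoch):
        self.name = name
        self.epoch = epoch

    def write(self, resource, data):
        return resource.write(self.epoch, f"{self.name}: {data}")


resource = SharedResource()

old = Master("node1", epoch=1)
print(old.write(resource, "a"))  # original master, still in control

new = Master("node2", epoch=2)   # admin switches control to node2
print(new.write(resource, "b"))

print(old.write(resource, "c"))  # node1 wakes up with a stale epoch
```

The point of the design is that the old master does not have to notice it was replaced. The resource it writes to enforces the handover for it.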
> >
> > After a couple of slow weeks learning Puppet, it is nice to be moving
> > forward again.
> >
> > david