[Systems] Discovery_One Was a.sl.o migration.

David Farning dfarning at sugarlabs.org
Wed Nov 4 18:46:54 EST 2009


It looks like we are hitting our peak load capacity on sunjammer.  That
capacity seems to be about 1.0K Apache accesses per minute.  Above that
the system load seems to increase quickly.
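
One way to check the accesses-per-minute figure is to count hits per
minute in the Apache access log.  The sketch below is only an
illustration: it assumes the log uses the Common Log Format timestamp
(e.g. [04/Nov/2009:18:46:54 -0500]), and the function names and the
1000-hit threshold are my own choices, not anything sunjammer runs.

```python
import re
from collections import Counter

# Timestamp of a Common Log Format line, captured down to the minute.
# Assumption: the server logs in this format.
TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}):\d{2} [+-]\d{4}\]")


def accesses_per_minute(lines):
    """Return a Counter mapping 'dd/Mon/yyyy:HH:MM' to the number of hits."""
    counts = Counter()
    for line in lines:
        match = TIMESTAMP.search(line)
        if match:  # lines without a recognizable timestamp are skipped
            counts[match.group(1)] += 1
    return counts


def busy_minutes(counts, threshold=1000):
    """Return the minutes whose hit count exceeds the threshold."""
    return sorted(minute for minute, hits in counts.items() if hits > threshold)
```

Feeding it an open log file (for example
accesses_per_minute(open(path)), with path being wherever the access
log lives) and comparing the busy minutes with the load at those times
would show whether 1.0K per minute really is the knee.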

Got a chance to talk to the remora infrastructure guys again today.
They use Zeus, a commercial load balancing / traffic management
system / HA system.  I think that we can match their system using open
source products.

The basic a.sl.o system design is:

Apache - php server
MySQL - database server

This is basically what we use.  We add an additional distributed
caching layer for the database:

Apache - php server
Memcached - distributed PHP object cache
MySQL - database server
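
The usual way an object cache sits in front of a database is the
"cache-aside" pattern: look in the cache first and only run the query
on a miss.  A minimal sketch, in Python rather than PHP, with a
dictionary-backed FakeCache standing in for a real Memcached client;
FakeCache, cached_query and run_query are hypothetical names used only
for illustration.

```python
class FakeCache:
    """In-process stand-in for a Memcached client (get/set only)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        # Returns None on a miss, like most Memcached clients do.
        return self._store.get(key)

    def set(self, key, value):
        self._store[key] = value


def cached_query(cache, key, run_query):
    """Return the cached value for key, calling run_query() only on a miss."""
    value = cache.get(key)
    if value is None:
        value = run_query()      # the expensive database round trip
        cache.set(key, value)    # remember the result for the next request
    return value
```

A real deployment would also need expiry times and invalidation when
the underlying rows change; the sketch leaves both out.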

The first step will be to add a Squid caching proxy in front of the
PHP server.  It looks like we will be able to cache all of the images,
html, JavaScript, and css files.  This will give us:

Squid - caching proxy
Apache - php server
Memcached - distributed PHP object cache
MySQL - database server
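
To get a rough idea of how much traffic the proxy could take off
Apache, we could measure what share of requested URLs are static
files.  The sketch below classifies by file extension only, which is
an assumption of mine: Squid itself decides what to cache mostly from
the response headers, so this is an estimate, not a description of
Squid's behaviour.  The extension list and function names are
illustrative.

```python
import posixpath
from urllib.parse import urlsplit

# Assumed set of "static" file types: images, html, JavaScript and css.
STATIC_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".ico",
                     ".html", ".js", ".css"}


def is_static(url):
    """True if the URL path ends in one of the static extensions."""
    path = urlsplit(url).path          # drops any ?query string
    extension = posixpath.splitext(path)[1].lower()
    return extension in STATIC_EXTENSIONS


def static_share(urls):
    """Fraction of the given URLs that look like static files (0.0 if none)."""
    urls = list(urls)
    if not urls:
        return 0.0
    return sum(is_static(url) for url in urls) / len(urls)
```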

The second step will be to add a load balancer to enable us to use
multiple php servers.  I am leaning towards Perlbal but HAProxy also
looks good.  (This stack can live on a single machine, except for the
php nodes.)  This will give us:

Squid - caching proxy
????? - load balancer
X * Apache - php server
Memcached - distributed PHP object cache
MySQL - database server
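
Whichever balancer we pick, the core job is the same: hand each
request to the next php server that is up.  A minimal round-robin
sketch follows; it is not how Perlbal or HAProxy are implemented or
configured, and the class name and the backend names in the usage note
are made up.

```python
import itertools


class RoundRobinBalancer:
    """Cycle through backends in order, skipping ones marked down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._cycle = itertools.cycle(self.backends)

    def mark_down(self, backend):
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def next_backend(self):
        # Look at each backend at most once per call.
        for _ in range(len(self.backends)):
            candidate = next(self._cycle)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends")
```

With backends "php1", "php2" and "php3", successive calls to
next_backend() walk through them in order; after mark_down("php2")
that node is skipped until mark_up("php2") is called.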

We should be able to handle our scalability needs for the next couple
of years by adding machines running additional php servers.

While this is happening we will want to work on HA.  This requires:
(This stack can live on two machines, except for the php nodes.  As
bottlenecks occur, we can split services off to their own machines.)

2 * load balancer + heartbeat - Health monitor
2 + n Squid - caching proxy
2 * load balancer + heartbeat - Health monitor
2 + n Apache - php server
2 + n Memcached - distributed PHP object cache
2 * MySQL - database server in Master - Master configuration
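
The idea behind pairing each load balancer with a heartbeat is that
the standby takes over when the primary stops reporting in.  A toy
sketch of that decision follows.  It is not the Linux-HA heartbeat
software or its API; the class, the node names and the 3-second
timeout are assumptions for illustration, and timestamps are passed in
explicitly so the logic is easy to follow.

```python
class HeartbeatMonitor:
    """Pick the active node of a primary/standby pair from heartbeats."""

    def __init__(self, primary, standby, timeout=3.0):
        self.primary = primary
        self.standby = standby
        self.timeout = timeout      # seconds of silence before a node is dead
        self.last_seen = {}

    def beat(self, node, now):
        """Record that a heartbeat from node arrived at time now."""
        self.last_seen[node] = now

    def is_alive(self, node, now):
        seen = self.last_seen.get(node)
        return seen is not None and now - seen <= self.timeout

    def active(self, now):
        """Prefer the primary; fail over to the standby; None if both are dead."""
        if self.is_alive(self.primary, now):
            return self.primary
        if self.is_alive(self.standby, now):
            return self.standby
        return None
```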

I _think_ that our current bottleneck is our Apache server.  Because
we are running many services, we have several Apache mod_* modules
enabled.  This gives us heavy Apache processes.  Because we are not
using a caching proxy in front of Apache, all of our static content is
being handled by these processes.

Adding Squid and removing the unneeded mod_* modules should increase
overall system throughput.

david