[Clusterusers] cluster planning
Wm. Josiah Erikson
wjerikson at hampshire.edu
Mon May 15 10:32:51 EDT 2006
Hello all,
For anybody who doesn't already know:
Right now, fly's head node is serving most of fly AND most of hex (a
couple of dead motherboards and hard drives are the only reason that
isn't ALL - hardware is on order to remedy this), proving that
non-identical hardware can coexist in the same cluster just fine with
ROCKS, the clustering software I'm currently using and have fallen in
love with.
I'm going to go forward with cluster planning assuming that
everybody thinks it would be great if both clusters were permanently the
same, as that's the universal response I got from everybody when we
first started talking about getting fly back up and running. This is
possible and even relatively simple with ROCKS, as I have just proven :)
I'm going to put new hard drives, in a RAID 1, in hex's current
master node - 250GB hard drives, an upgrade from the current 120GB hard
drives - the old ones are well past their life expectancy, which makes
me nervous. Hex's master node will serve all 40 compute nodes. The
question is: Should I put hex at fly's IP (directly accessible from the
outside world via SSH and HTTP), or hex's IP (only accessible directly
from inside Hampshire's network)? I would argue for the former... I also
think fly is a cooler name than hex, but I don't actually care :) I can
keep fly up-to-date security-wise for those services that are available,
and the rest are firewalled at the kernel level as well as at the edge
of Hampshire's network, so I don't think we're exposing ourselves to
anything big and scary by giving ourselves a globally valid IP.
In short: I plan on putting both clusters together into one, using
hex's current master node to be the head node, calling it fly, and
making it globally accessible. Does anybody have a problem with this?
Here are the services/programs that I know need to be installed.
Please add to this list:
-breve
-Maya and Pixar license servers
-Pixar stuff (RenderManProServer and rat)
-MPI
-build tools (gcc, etc - v4 as well as v3)
-X on the head node
-Maya
-cmucl
-All the standard ROCKS stuff that is now on fly. This means that
the new combined cluster will look very very much indeed like fly does
currently. Nearly identical, in fact, except for the addition of gcc 4.0
and license servers, and a head node that will be faster and RAIDed. We're
working on a backup scheme for the homedirs, and until then, I would
continue to use the FireWire drive that is currently being used for backup.
I will take everything that people have on both their hex and their
fly homedirs and put them into the new cluster, if necessary. It would
be nice if people would clean up what isn't needed before then.
Also, please tell me when you next plan to use the cluster and for
what, so that I can plan what should be up and working and how at that
point.
Thanks - send any comments, concerns, or "what the hell do you think
you're doing"'s my way.
-Josiah