[Clusterusers] cluster planning

Wm. Josiah Erikson wjerikson at hampshire.edu
Mon May 15 10:32:51 EDT 2006


Hello all,
    For anybody who doesn't already know:

    Right now, fly's head node is serving most of fly AND most of hex (a 
couple of dead motherboards and hard drives are the only reason that 
isn't ALL - hardware is on order to remedy this), proving that 
non-identical hardware can coexist in the same cluster just fine with 
ROCKS, the clustering software I'm currently using and have fallen in 
love with.
   
    I'm going to go forward with cluster planning assuming that 
everybody thinks it would be great if both clusters were permanently 
merged into one, as that's the response I got from everybody when we 
first started talking about getting fly back up and running. This is 
possible and even relatively simple with ROCKS, as I have just proven :)
    I'm going to put new hard drives, in a RAID 1, in hex's current 
master node - 250GB hard drives, an upgrade from the current 120GB hard 
drives - the old ones are well past their life expectancy, which makes 
me nervous. Hex's master node will serve all 40 compute nodes. The 
question is: Should I put hex at fly's IP (directly accessible from the 
outside world via SSH and HTTP), or hex's IP (only accessible directly 
from inside Hampshire's network)? I would argue for the former... I also 
think fly is a cooler name than hex, but I don't actually care :) I can 
keep fly up-to-date security-wise for those services that are available, 
and the rest are firewalled at the kernel level as well as at the edge 
of Hampshire's network, so I don't think we're exposing ourselves to 
anything big and scary by giving ourselves a globally valid IP.

    In short: I plan on putting both clusters together into one, using 
hex's current master node to be the head node, calling it fly, and 
making it globally accessible. Does anybody have a problem with this?

    Here are the services/programs that I know need to be installed. 
Please add to this list:

    -breve
    -Maya and Pixar license servers
    -Pixar stuff (RenderManProServer and rat)
    -MPI
    -build tools (gcc, etc. - v4 as well as v3)
    -X on the head node
    -Maya
    -cmucl
    -All the standard ROCKS stuff that is now on fly. This means that 
the new combined cluster will look very much like fly does currently. 
Nearly identical, in fact, except for the addition of gcc 4.0 and the 
license servers, and a head node that is faster and RAIDed. We're 
working on a backup scheme for the homedirs, and until then, I would 
continue to use the FireWire drive that is currently being used for backup.

    I will take everything that people have in both their hex and their 
fly homedirs and put it into the new cluster, if necessary. It would 
be nice if people would clean up what isn't needed before then.

    Also, please tell me when you next plan to use the cluster and for 
what, so that I can plan what should be up and working and how at that 
point.

    Thanks - send any comments, concerns, or "what the hell do you think 
you're doing"'s my way.

    -Josiah







