[Clusterusers] cluster planning

Wm. Josiah Erikson wjerikson at hampshire.edu
Mon May 15 13:06:16 EDT 2006


Yes, this is all still possible, though slightly different (and 
possibly, or probably... better). You might want to read this:

http://fly.hampshire.edu/rocks-documentation/4.1/start-computing.html

(or 
http://www.rocksclusters.org/rocks-documentation/4.1/start-computing.html )
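
The short version, in case it helps before you get through the docs
(treat this as a sketch from memory, not gospel): under ROCKS the
compute nodes typically get names like compute-0-0, compute-0-1, and so
on, and the head node ships with a cluster-fork command that runs
whatever you give it on every compute node over ssh, e.g.:

    # run a command on every compute node, one after another
    cluster-fork 'ps -u lspector'

Your own scripts should keep working too, since you can still ssh from
the head node to any compute node without a password - the main thing
you'd have to change is anywhere they assume the old node names.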

    -Josiah


Lee Spector wrote:

>
> Josiah,
>
> This all sounds fantastic to me. On the IP/access/firewall stuff --  
> if I'll be able to ssh/scp into the cluster head node from the  
> outside world I'll be very happy, assuming we won't be putting the  
> cluster at unreasonable risk.
>
> I expect that my next major use of the cluster will involve breve,  
> and possibly a new build, for which we'll have to talk to Jon.
>
> I do have a sequence of probably ignorant questions, which can be  
> summarized as: Will I still be able to run/stop cluster-wide lisp  
> jobs with my shell scripts or some reasonable replacements? I guess  
> there are a couple of parts to this, on the possibly-silly assumption  
> that I WOULD still be using shell scripts:
>
> - Can I still get a list of all of the node names and stick them in a  
> local file with something like:
>
> /usr/bin/gstat -l -1 | grep n | cut -f 1 -d ' ' | sort > ~/pssc/rshnodes
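>
> (I'm guessing the only change there would be the node-name pattern: if
> the new nodes end up named something like compute-0-0 -- just my
> assumption about the naming -- then maybe
>
> /usr/bin/gstat -l -1 | grep compute | cut -f 1 -d ' ' | sort > ~/pssc/rshnodes
>
> would do it, but correct me if the names or the gstat output are
> different now.)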
>
> - Can I still then fork processes on all of the nodes (AT THE SAME  
> TIME) with something like:
>
> forkall "/opt/bin/cmucl -dynamic-space-size 1600 -load /home/lspector/ 
> $1/load -quiet > /tmp/output"
>
> where my "forkall" is defined as:
>
> #!/bin/sh
> while read nodename ; do
>         ssh $nodename "$@" < /dev/null &
> done < ~/pssc/rshnodes
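>
> (And if some nodes are down -- like the dead-motherboard ones you
> mentioned -- I imagine I could make forkall give up on unreachable
> nodes rather than hang, using standard ssh options, something like:
>
> #!/bin/sh
> # like forkall, but skip nodes that don't answer within 5 seconds
> while read nodename ; do
>         ssh -o ConnectTimeout=5 -o BatchMode=yes $nodename "$@" < /dev/null &
> done < ~/pssc/rshnodes
>
> assuming the ssh on the cluster supports those options.)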
>
> - Can I still then kill all of my lisp processes across the cluster  
> with something like:
>
> forkall source ~/bin/kill-lisps
>
> where my kill-lisps is:
>
> kill -9 `ps -ax | grep lisp | gawk '{ print $1 }'`
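>
> (or, if pkill is installed on the nodes, maybe just
>
> forkall pkill -9 -u lspector lisp
>
> which should only match my own lisp processes and wouldn't try to kill
> the grep itself -- again, assuming pkill is there.)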
>
> If the answer is something like "NO, you can't do any of that in such  
> a goofy, neanderthal sort of way, but there are perfectly good and in  
> fact simpler ways to do this with ROCKS" then of course I'd be fine  
> with that, although I'll need some pointers about the new way to do  
> it. If, on the other hand, there's some major snag in doing this sort  
> of thing in any way, then I'm worried and we need to talk.
>
> Thanks,
>
>  -Lee
>
>
>
>
>
> On May 15, 2006, at 10:32 AM, Wm. Josiah Erikson wrote:
>
>> Hello all,
>>    For anybody who doesn't already know:
>>
>>    Right now, fly's head node is serving most of fly AND most of  hex 
>> (a couple of dead motherboards and hard drives are the only reason
>> that isn't ALL - hardware is on order to remedy this), proving that
>> non-identical hardware can coexist in the same cluster just fine with
>> ROCKS, the clustering software I'm currently using and have fallen in
>> love with.
>>      I'm going to go forward with cluster planning assuming that  
>> everybody thinks it would be great if both clusters were  permanently 
>> the same, as that's the universal response I got from  everybody when 
>> we first started talking about getting fly back up  and running. This 
>> is possible and even relatively simple with  ROCKS, as I have just 
>> proven :)
>>    I'm going to put new hard drives, in a RAID 1, in hex's current  
>> master node - 250GB hard drives, an upgrade from the current 120GB  
>> hard drives - the old ones are well past their life expectancy,  
>> which makes me nervous. Hex's master node will serve all 40 compute  
>> nodes. The question is: Should I put hex at fly's IP (directly
>> accessible from the outside world via SSH and HTTP), or hex's IP
>> (only accessible directly from inside Hampshire's network)? I would
>> argue for the former... I also think fly is a cooler name than hex,
>> but I don't actually care :) I can keep fly up-to-date security-wise
>> for those services that are available, and the rest are firewalled at
>> the kernel level as well as at the edge of Hampshire's network, so I
>> don't think we're exposing ourselves to anything big and scary by
>> giving ourselves a globally valid IP.
>>
>>    In short: I plan on putting both clusters together into one,  
>> using hex's current master node to be the head node, calling it  fly, 
>> and making it globally accessible. Does anybody have a problem  with 
>> this?
>>
>>    Here are the services/programs that I know need to be installed.  
>> Please add to this list:
>>
>>    -breve
>>    -Maya and Pixar license servers
>>    -Pixar stuff (RenderManProServer and rat)
>>    -MPI
>>    -build tools (gcc, etc - v4 as well as v3)
>>    -X on the head node
>>    -Maya
>>    -cmucl
>>    -All the standard ROCKS stuff that is now on fly. This means  that 
>> the new combined cluster will look very very much indeed like  fly 
>> does currently. Nearly identical, in fact, except for the  addition 
>> of gcc 4.0 , license servers, and the head node will be  faster and 
>> RAIDed. We're working on a backup scheme for the  homedirs, and until 
>> then, I would continue to use the FireWire  drive that is currently 
>> being used for backup.
>>
>>    I will take everything that people have on both their hex and  
>> their fly homedirs and put them into the new cluster, if necessary.  
>> It would be nice if people would clean up what isn't needed before  
>> then.
>>
>>    Also, please tell me when you next plan to use the cluster and  
>> for what, so that I can plan what should be up and working and how  
>> at that point.
>>
>>    Thanks - send any comments, concerns, or "what the hell do you  
>> think you're doing"'s my way.
>>
>>    -Josiah
>>
>>
>>
>>
>>
>
>
> -- 
> Lee Spector, Professor of Computer Science
> School of Cognitive Science, Hampshire College
> 893 West Street, Amherst, MA 01002-3359
> lspector at hampshire.edu, http://hampshire.edu/lspector/
> Phone: 413-559-5352, Fax: 413-559-5438
>



