[Clusterusers] now something seems really strange on fly

Kyle Harrington kharrington at hampshire.edu
Mon Oct 15 08:09:58 EDT 2007


I didn't have time to work on divisionblocks this weekend (or much during
last week either), so I don't think this was caused by the networking
code. Also, I have only been running it for a few minutes at a time, just
enough to test block transfer.

~Kyle

Lee Spector wrote:
> 
> Josiah and clusterusers,
> 
> I tried earlier today to kill all of my cluster jobs that had been
> running for the previous day or so. I thought it had worked, but just
> noticed that I had gotten a timeout after the first node. I tried again,
> and my kill script now hangs on ALL nodes, but continues to the next one
> if I command-C (over and over again to get to each node). I've never seen that
> before. And when I got up to compute-1-13, compute-1-15, and
> compute-1-16, which might not even have had my jobs launched on them
> (because of a previous problem -- I think I wrote about this to Josiah
> but not to the cluster list), they asked me for a password. I've had to
> deal with node/password issues before, but I don't see any reason why
> this should have cropped up now.
> 
> So something seems to be pretty hosed at the moment. All of my breve
> processes do seem to have been killed, assuming cluster-top is telling
> the truth, but something's wrong.
> 
> All of which (along with the other recent problems) is really puzzling
> me because I've been running essentially this same code for months --
> it's just breve/PushGP on a straightforward problem, using the same
> shell scripts that I've been using for months -- and I've never had
> problems anything like this before the last couple of weeks. What's
> changed? Is it the simultaneous use of the cluster for lots of
> rendering? But hasn't that been about the same for a while too? Could it
> be Kyle's new work with breve -- Kyle, is it possible that your
> block-transport code is consuming a lot of OS network resources of some
> kind? Any other ideas?
> 
>  -Lee
> 
> -- 
> Lee Spector, Professor of Computer Science
> School of Cognitive Science, Hampshire College
> 893 West Street, Amherst, MA 01002-3359
> lspector at hampshire.edu, http://hampshire.edu/lspector/
> Phone: 413-559-5352, Fax: 413-559-5438
> 



