[Clusterusers] [Ci-lab] request for CPU time

Wm. Josiah Erikson wjerikson at hampshire.edu
Thu Jun 27 14:36:03 EDT 2013


Also, on July 1 I will have another $5K to spend, and I plan to get a 
whole bunch more C6100s (rack2), since they've been such a success! $5K 
should get us around 6 more of those units, which means 24 more nodes... 
if I can find 12U and the requisite power somewhere for them. Both are 
real problems, but nice problems to have :)
We can definitely get another 3 units, or 12 more nodes - I've got the 
rack space and power for that.
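(Back-of-envelope math, if anyone wants to check me - the
4-nodes-in-2U-per-chassis figure is what the numbers above imply, and
the per-unit price is just the budget split six ways:)

    # Rough capacity math for the proposed C6100 purchase (Python).
    # Assumes 4 nodes and 2U per chassis, consistent with
    # "6 units -> 24 nodes / 12U" above.
    BUDGET = 5000                    # dollars available July 1
    NODES_PER_CHASSIS = 4
    RACK_UNITS_PER_CHASSIS = 2
    PRICE_PER_CHASSIS = BUDGET / 6   # ~$833 if 6 units fit the budget

    for units in (3, 6):
        print(f"{units} units: {units * NODES_PER_CHASSIS} nodes, "
              f"{units * RACK_UNITS_PER_CHASSIS}U, "
              f"~${units * PRICE_PER_CHASSIS:.0f}")

Running that gives "3 units: 12 nodes, 6U, ~$2500" and "6 units: 24
nodes, 12U, ~$5000", which matches the counts above.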
     -Josiah


On 6/27/13 2:26 PM, Thomas Helmuth wrote:
> Hi Jaime,
>
> I think I'm the main culprit right now. I have quite a few long, 
> important, but not very time-sensitive runs going on the cluster 
> currently. Some have been going for 5 to 10 days, so I'd prefer not 
> to cancel them and lose that work, even though they could take 
> another 5 to 10 days to complete. I'd be happy to hold off on 
> launching any new runs for now, but I'd prefer to let the 
> already-started runs finish if at all possible.
>
> Also, for figuring out the future sharing of the cluster, I'd be happy 
> to have my runs hog only certain nodes, or to set things up some other 
> way so that they get lower priority when others need the cluster. 
> Let's talk about it at the lab meeting.
>
> -Tom
>
>
> On Thu, Jun 27, 2013 at 2:23 PM, Wm. Josiah Erikson 
> <wjerikson at hampshire.edu> wrote:
>
>     The issue right now is that many of Tom's Digital Multiplier
>     processes seem to be taking around 9 days to finish. Once
>     they've grabbed a slot, they hold on to it for a very long
>     time, and since we don't have any kicking-people-out algorithm,
>     that effectively means nobody else can use the cluster while
>     they're running - everybody else's jobs finish in more like 20
>     minutes. The solution, of course, is to change the "tom" tag to
>     use only every other slot, or to run only two processes per
>     machine, or something like that... but then his runs will take
>     even longer to finish.
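>     (As a sketch of what I mean - nothing like this exists yet, and
>     the wrapper and the tag lookup here are hypothetical - a launch
>     script could just refuse to start a tagged job on a box that
>     already has that tag's quota running:)
>
>         # Hypothetical per-tag cap on concurrent processes per
>         # machine (Python sketch). Assumes each job's tag appears
>         # somewhere in its command line so pgrep -f can find it.
>         import subprocess
>
>         MAX_PER_TAG = {"tom": 2}   # tags not listed are uncapped
>
>         def running_for_tag(tag):
>             """Count processes on this machine whose command line
>             mentions the tag."""
>             out = subprocess.run(["pgrep", "-cf", tag],
>                                  capture_output=True, text=True)
>             return int(out.stdout.strip() or 0)
>
>         def can_launch(tag):
>             cap = MAX_PER_TAG.get(tag)
>             return cap is None or running_for_tag(tag) < cap
>
>     The launcher would check can_launch("tom") before starting
>     anything new and otherwise leave the job queued, so the long
>     runs keep their two slots per box and everyone else gets the
>     rest.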
>     Discussing tomorrow at the meeting seems like a good plan.
>         -Josiah
>
>
>
>     On 6/27/13 2:19 PM, Lee Spector wrote:
>
>         Hi Jaime,
>
>         Fine with me personally, but I'll check with my lab group to
>         see what everyone's expected needs are. I'm also not sure
>         exactly how to implement the idea if people do want to run
>         some other things... maybe by having you use a subset of
>         machines that everyone else excludes?
>
>           -Lee
>
>         On Jun 27, 2013, at 2:14 PM, Jaime Davila wrote:
>
>             Greetings everyone,
>
>             I wanted to check whether it would be possible for me to
>             grab some more CPU cycles out of the cluster for a week or
>             so. I just placed a new algorithm on my account and have it
>             running, but it's fairly different from what I was doing
>             before, and I would rather find out quickly whether I need
>             to tweak or change things, as opposed to waiting a week to
>             realize I need to make a 10-minute change.
>
>             The last time the system load dropped, I noticed that my
>             processes ran at top speed once the fraction of CPUs loaded
>             to their maximum fell to about 75%, as opposed to the 97%
>             it sits at now. Maybe things will work out that way this
>             time around, maybe not. In either case, grabbing more CPU
>             cycles right now would be extremely useful.
>
>             Thoughts?
>
>             Thanks a lot,
>
>             Jaime
>
>             -- 
>             ******************************************************
>             Jaime J. Dávila
>             Associate Professor of Computer Science
>             Hampshire College
>             jdavila at hampshire dot edu
>             http://helios.hampshire.edu/jdavila
>             *******************************************************
>
>         --
>         Lee Spector, Professor of Computer Science
>         Cognitive Science, Hampshire College
>         893 West Street, Amherst, MA 01002-3359
>         lspector at hampshire.edu, http://hampshire.edu/lspector/
>         Phone: 413-559-5352, Fax: 413-559-5438
>
>
>     -- 
>     Wm. Josiah Erikson
>     Assistant Director of IT, Infrastructure Group
>     System Administrator, School of CS
>     Hampshire College
>     Amherst, MA 01002
>     (413) 559-6091
>
>
>

-- 
Wm. Josiah Erikson
Assistant Director of IT, Infrastructure Group
System Administrator, School of CS
Hampshire College
Amherst, MA 01002
(413) 559-6091


