[Clusterusers] [Ci-lab] request for CPU time
Wm. Josiah Erikson
wjerikson at hampshire.edu
Thu Jun 27 14:36:03 EDT 2013
Also, on July 1 I will have another $5K to spend, and I plan to get a
whole bunch more C6100s (rack2), since they've been such a success! $5K
should get us around 6 more of those units, which means 24 more nodes...
if I can find 12U and the requisite power for them somewhere, which are
both actual problems, but nice problems to have :)
We can definitely get another 3 units, or 12 more nodes - I've got the
rackspace and power for that.
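Quick sanity check on that math (assuming, as the numbers above imply, that
each C6100 chassis is 2U and holds 4 nodes; the chassis-count-per-$5K figure
is only an estimate):

    # Back-of-the-envelope capacity math for the proposed C6100 purchase.
    # Assumption: each C6100 chassis is 2U and holds 4 nodes, matching the
    # 6-chassis -> 24-node -> 12U figures above.
    NODES_PER_CHASSIS = 4
    RACK_UNITS_PER_CHASSIS = 2

    def added_capacity(chassis_count):
        """Nodes and rack units added by buying `chassis_count` chassis."""
        return (chassis_count * NODES_PER_CHASSIS,
                chassis_count * RACK_UNITS_PER_CHASSIS)

    print(added_capacity(6))  # (24, 12): 24 nodes, 12U
    print(added_capacity(3))  # (12, 6): the 3 units we have room for now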
-Josiah
On 6/27/13 2:26 PM, Thomas Helmuth wrote:
> Hi Jaime,
>
> I think I'm the main culprit right now. I have quite a few long,
> important, but not very time-sensitive runs going on the cluster
> currently. Some have been going for 5 to 10 days now, so I'd prefer
> not to cancel them and lose that work, even though they could be going
> another 5 to 10 days until completion. I'd be happy to pause any new
> launches of runs for now, but I'd prefer to allow the already-started
> runs to finish if at all possible.
>
> Also, for figuring out how to share the cluster going forward, I'd be
> happy to have my runs hog only certain nodes, or to have some other
> way of giving them lower priority when others need to use it. Let's talk
> about it at the lab meeting.
>
> -Tom
>
>
> On Thu, Jun 27, 2013 at 2:23 PM, Wm. Josiah Erikson
> <wjerikson at hampshire.edu> wrote:
>
> The issue right now is that many of Tom's Digital Multiplier
> processes seem to be taking around 9 days to finish, so once
> they've grabbed a slot, they hold on to it for a very long time.
> We don't have any kicking-people-out algorithm, and everybody
> else's jobs finish in more like 20 minutes, so in effect nobody
> else can use the cluster while Tom's runs are going.
> The solution, of course, is to change the "tom" tag to use only
> every other slot, or to run only two processes per machine, or
> something along those lines... but then his runs will take even
> longer to finish.
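> Roughly the kind of thing I mean (purely a hypothetical sketch, not
> our actual launcher; the 8-slots-per-node figure and the tag names
> are made up for illustration, and the cap of 2 is just the
> two-processes-per-machine idea above):
>
>     # Hypothetical per-tag slot cap -- a sketch of the idea, not the
>     # real launcher script.
>     SLOTS_PER_HOST = 8          # assumed slots per node, for illustration
>     PER_TAG_CAP = {"tom": 2}    # tags not listed keep the full node
>
>     def slots_for(tag, host_slots=SLOTS_PER_HOST):
>         """Max slots on one host that a job with this tag may occupy."""
>         return min(host_slots, PER_TAG_CAP.get(tag, host_slots))
>
>     # On an 8-slot node, "tom" jobs get 2 slots; everyone else gets 8.
>     assert slots_for("tom") == 2
>     assert slots_for("lee") == 8
>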
> Discussing tomorrow at the meeting seems like a good plan.
> -Josiah
>
>
>
> On 6/27/13 2:19 PM, Lee Spector wrote:
>
> Hi Jaime,
>
> Fine with me personally, but I'll check with my lab group to
> see what everyone's expected needs are. I'm also not sure
> exactly how to implement the idea if people do want to run
> some other things... maybe by having you use a subset of
> machines that everyone else excludes?
>
> -Lee
>
> On Jun 27, 2013, at 2:14 PM, Jaime Davila wrote:
>
> Greetings everyone,
>
> I wanted to check to see if it would be possible for me to grab
> some more CPU cycles out of the cluster for a week or so. I just
> placed a new algorithm on my account and have it running, but
> it's fairly different from what I was doing before, and I would
> rather find out quickly whether I need to tweak or change things,
> as opposed to having to wait a week to realize I need to make a
> 10-minute change.
>
> Last time the system load dropped a bit, I noticed that my
> processes run at their top speed once the fraction of fully
> loaded CPUs falls to about 75%, as opposed to the 97% it is at
> now. Maybe things will be that way this time around, maybe not?
> Either way, my grabbing more CPU cycles right now would be
> extremely useful.
>
> Thoughts?
>
> Thanks a lot,
>
> Jaime
>
> --
> ******************************************************
> Jaime J. Dávila
> Associate Professor of Computer Science
> Hampshire College
> jdavila at hampshire dot edu
> http://helios.hampshire.edu/jdavila
> *******************************************************
>
> --
> Lee Spector, Professor of Computer Science
> Cognitive Science, Hampshire College
> 893 West Street, Amherst, MA 01002-3359
> lspector at hampshire.edu, http://hampshire.edu/lspector/
> Phone: 413-559-5352, Fax: 413-559-5438
>
> --
> Wm. Josiah Erikson
> Assistant Director of IT, Infrastructure Group
> System Administrator, School of CS
> Hampshire College
> Amherst, MA 01002
> (413) 559-6091
>
--
Wm. Josiah Erikson
Assistant Director of IT, Infrastructure Group
System Administrator, School of CS
Hampshire College
Amherst, MA 01002
(413) 559-6091