[Clusterusers] [Ci-lab] request for CPU time

Thomas Helmuth thelmuth at cs.umass.edu
Thu Jun 27 14:26:41 EDT 2013


Hi Jaime,

I think I'm the main culprit right now. I have quite a few long, important,
but not very time-sensitive runs going on the cluster currently. Some have
been going for 5 to 10 days now, so I'd prefer not to cancel them and lose
that work, even though they could be going another 5 to 10 days until
completion. I'd be happy to pause any new launches of runs for now, but I'd
prefer to allow the already-started runs to finish if at all possible.

Also, for figuring out the future sharing of the cluster, I'd be happy to
only have my runs hog certain nodes, or have some other way of having them
get lower priority when others need to use it. Let's talk about it at the
lab meeting.

-Tom


On Thu, Jun 27, 2013 at 2:23 PM, Wm. Josiah Erikson <wjerikson at hampshire.edu> wrote:

> The issue right now is that many of Tom's Digital Multiplier processes
> seem to be taking around 9 days to finish, so once they've grabbed a slot,
> they hold on to it for a very long time, which effectively means that
> nobody else can use the cluster while they are running, since we don't have
> any kicking-people-out algorithm, and everybody else's jobs finish in more
> like 20 minutes.
> The solution, of course, is to change the "tom" tag to only use every
> other slot or something like that, or only run two processes per machine or
> something... but then they'll take even longer to finish.
> Discussing tomorrow at the meeting seems like a good plan.
>     -Josiah
>
>
>
> On 6/27/13 2:19 PM, Lee Spector wrote:
>
>> Hi Jaime,
>>
>> Fine with me personally but I'll check with my lab group to see what
>> everyone's expected needs are. I'm also not sure exactly how to implement
>> the idea if people do want to run some other things... maybe by having you
>> use a subset of machines that everyone else excludes?
>>
>>   -Lee
>>
>> On Jun 27, 2013, at 2:14 PM, Jaime Davila wrote:
>>
>>> Greetings everyone,
>>>
>>> I wanted to check to see if it was possible for me to grab some more CPU
>>> cycles out of the cluster for a week or so. I just placed a new algorithm
>>> on my account, and have run it, but it's fairly different from what I
>>> was doing before, and I would rather detect quickly if I need to tweak or
>>> change things, as opposed to having to wait a week to realize I need to
>>> make a 10 minute change.
>>>
>>> Last time that the system load diminished some, I noticed that my
>>> processes run at their top speed if the number of CPUs loaded to their
>>> maximum drops to about 75%, as opposed to the 97% where they are now. Maybe
>>> things will be that way this time around, maybe not? In either case, my
>>> grabbing more cpu cycles right now would be extremely useful.
>>>
>>> Thoughts?
>>>
>>> Thanks a lot,
>>>
>>> Jaime
>>>
>>> --
>>> ********************************************************
>>> Jaime J. Dávila
>>> Associate Professor of Computer Science
>>> Hampshire College
>>> jdavila at hampshire dot edu
>>> http://helios.hampshire.edu/jdavila
>>> *********************************************************
>>>
>>> _______________________________________________
>>> Clusterusers mailing list
>>> Clusterusers at lists.hampshire.edu
>>> https://lists.hampshire.edu/mailman/listinfo/clusterusers
>>>
>> --
>> Lee Spector, Professor of Computer Science
>> Cognitive Science, Hampshire College
>> 893 West Street, Amherst, MA 01002-3359
>> lspector at hampshire.edu, http://hampshire.edu/lspector/
>> Phone: 413-559-5352, Fax: 413-559-5438
>>
>
> --
> Wm. Josiah Erikson
> Assistant Director of IT, Infrastructure Group
> System Administrator, School of CS
> Hampshire College
> Amherst, MA 01002
> (413) 559-6091
>
>
> _______________________________________________
> Ci-lab mailing list
> Ci-lab at lists.hampshire.edu
> https://lists.hampshire.edu/mailman/listinfo/ci-lab
>