<div dir="ltr"><div><div>Hi Jaime,<br><br></div>I think I'm the main culprit right now.

 I have quite a few long, important, but not very time-sensitive runs
going on the cluster currently. Some have been going for 5 to 10 days
now, so I'd prefer not to cancel them and lose that work, even though
they could take another 5 to 10 days to complete. I'd be happy
to pause any new launches for now, but I'd prefer to let the
already-started runs finish if at all possible.<br><br></div><div>Also, for figuring out how to share the cluster in the future, I'd be happy to have my runs hog only certain nodes, or to have some other way of giving them lower priority when others need to use it. Let's talk about it at the lab meeting.<br>

</div><div><br></div>-Tom</div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Jun 27, 2013 at 2:23 PM, Wm. Josiah Erikson <span dir="ltr">&lt;<a href="mailto:wjerikson@hampshire.edu" target="_blank">wjerikson@hampshire.edu</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">The issue right now is that many of Tom's Digital Multiplier processes seem to be taking around 9 days to finish, so once they've grabbed a slot, they hold on to it for a very long time. Since we don't have any kicking-people-out algorithm, and everybody else's jobs finish in more like 20 minutes, this effectively means that nobody else can use the cluster while they are running.<br>

The solution, of course, is to change the "tom" tag to use only every other slot, or to run only two processes per machine, or something like that... but then they'll take even longer to finish.<br>

Discussing tomorrow at the meeting seems like a good plan.<br>

    -Josiah<div class="HOEnZb"><div class="h5"><br>

<br>

<br>

On 6/27/13 2:19 PM, Lee Spector wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Hi Jaime,<br>

<br>

Fine with me personally but I'll check with my lab group to see what everyone's expected needs are. I'm also not sure exactly how to implement the idea if people do want to run some other things... maybe by having you use a subset of machines that everyone else excludes?<br>

<br>

  -Lee<br>

<br>

On Jun 27, 2013, at 2:14 PM, Jaime Davila wrote:<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Greetings everyone,<br>

<br>

I wanted to check whether it would be possible for me to grab some more CPU cycles out of the cluster for a week or so. I just placed a new algorithm on my account and have run it, but it's fairly different from what I was doing before, and I would rather detect quickly if I need to tweak or change things, as opposed to having to wait a week to realize I need to make a 10-minute change.<br>

<br>

The last time the system load diminished some, I noticed that my processes run at their top speed if the number of CPUs loaded to their maximum drops to about 75%, as opposed to the 97% where they are now. Maybe things will be that way this time around, maybe not? In either case, my grabbing more CPU cycles right now would be extremely useful.<br>

<br>

Thoughts?<br>

<br>

Thanks a lot,<br>

<br>

Jaime<br>

<br>

-- <br>

******************************************************<br>

Jaime J. Dávila<br>

Associate Professor of Computer Science<br>

Hampshire College<br>

jdavila at hampshire dot edu<br>

<a href="http://helios.hampshire.edu/jdavila" target="_blank">http://helios.hampshire.edu/jdavila</a><br>

*******************************************************<br>

<br>

_______________________________________________<br>

Clusterusers mailing list<br>

<a href="mailto:Clusterusers@lists.hampshire.edu" target="_blank">Clusterusers@lists.hampshire.edu</a><br>

<a href="https://lists.hampshire.edu/mailman/listinfo/clusterusers" target="_blank">https://lists.hampshire.edu/mailman/listinfo/clusterusers</a><br>

</blockquote>

--<br>

Lee Spector, Professor of Computer Science<br>

Cognitive Science, Hampshire College<br>

893 West Street, Amherst, MA 01002-3359<br>

<a href="mailto:lspector@hampshire.edu" target="_blank">lspector@hampshire.edu</a>, <a href="http://hampshire.edu/lspector/" target="_blank">http://hampshire.edu/lspector/</a><br>

Phone: <a href="tel:413-559-5352" value="+14135595352" target="_blank">413-559-5352</a>, Fax: <a href="tel:413-559-5438" value="+14135595438" target="_blank">413-559-5438</a><br>

<br>


</blockquote>

<br></div></div><span class="HOEnZb"><font color="#888888">

-- <br>

Wm. Josiah Erikson<br>

Assistant Director of IT, Infrastructure Group<br>

System Administrator, School of CS<br>

Hampshire College<br>

Amherst, MA 01002<br>

<a href="tel:%28413%29%20559-6091" value="+14135596091" target="_blank">(413) 559-6091</a></font></span><div class="HOEnZb"><div class="h5"><br>

<br>

_______________________________________________<br>

Ci-lab mailing list<br>

<a href="mailto:Ci-lab@lists.hampshire.edu" target="_blank">Ci-lab@lists.hampshire.edu</a><br>

<a href="https://lists.hampshire.edu/mailman/listinfo/ci-lab" target="_blank">https://lists.hampshire.edu/mailman/listinfo/ci-lab</a><br>

</div></div></blockquote></div><br></div>