<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Nope, don't think so. It errored because it ran out of RAM. Keep in
mind it has 24GB of that, so something is probably up with your
scene still. Maybe if you got lucky and went to one of the really
really big nodes that frame made it through :) I just rebooted it
and I will un-NIMBY it.<br>
-Josiah<br>
<br>
<br>
<br>
<div class="moz-cite-prefix">On 4/22/15 3:16 PM, Piper Odegard
wrote:<br>
</div>
<blockquote
cite="mid:28373996efd37d5d12457ff3ae976cea@mail.hampshire.edu"
type="cite">
<p>Hey Josiah,<br>
<br>
I just NIMBY'd compute 2-1 because it errored out 3 tasks in a
row on one of my jobs. In the future, is there a way on the fly
to tell a job not to use a particular machine for the rest of
its run?<br>
<br>
- Piper</p>
<p> </p>
<div> </div>
<p>On 2015-04-20 18:59, Thomas Helmuth wrote:</p>
<blockquote type="cite" style="padding-left:5px;
border-left:#1010ff 2px solid; margin-left:5px">
<div dir="ltr">
<div>Awesome, thanks Chris! I should be able to merge that
idea into my current launching scripts to get Lee going, but
we'll page you if we run into trouble.<br>
<br>
</div>
Tom</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Mon, Apr 20, 2015 at 6:57 PM, Lee
Spector <span><<a moz-do-not-send="true"
href="mailto:lspector@hampshire.edu">lspector@hampshire.edu</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin: 0 0 0 .8ex;
border-left: 1px #ccc solid; padding-left: 1ex;">
<div style="word-wrap: break-word;">
<div>Thanks Chris!</div>
<div> </div>
<div> -Lee</div>
<div>
<div class="h5">
<div> </div>
<br>
<div>
<blockquote type="cite" style="padding-left:5px;
border-left:#1010ff 2px solid; margin-left:5px">
<div>On Apr 20, 2015, at 6:47 PM, Chris Perry
<<a moz-do-not-send="true"
href="mailto:perry@hampshire.edu">perry@hampshire.edu</a>>
wrote:</div>
<br>
<div>
<div style="word-wrap: break-word;">
<div> </div>
My forum post resulted in a solution that
could work for us if we keep the slots as
they are. The RemoteCmd line to use is:
<div> </div>
<div>RemoteCmd { command } -samehost 1
-atmost 4 -atleast 4 -service { blah }</div>
<div> </div>
<div>That will check out exactly 4 slots on
a single node. So all Lee needs is to know
how many slots are on the machine he wants
and we (should be) good to go using a
RemoteCmd structure like this. </div>
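<div>A minimal sketch of how a launching script could build that line once the slot count of the target machine is known. The helper name <code>remote_cmd_line</code>, its arguments, and the example command are hypothetical; only the RemoteCmd flags are taken from the line above.</div>

<pre>
```python
def remote_cmd_line(command, slots, service):
    """Build a RemoteCmd line that checks out `slots` slots on a single host.

    Hypothetical helper: the flag names are copied from the RemoteCmd line
    quoted above; the function itself is only an illustration.
    """
    slots = int(slots)
    # Rejects zero and negative slot counts.
    if max(slots, 1) != slots:
        raise ValueError("slots must be a positive integer")
    # Setting -atmost and -atleast to the same value asks for exactly
    # that many slots; -samehost 1 keeps them all on one node.
    return (
        f"RemoteCmd {{{command}}} -samehost 1 "
        f"-atmost {slots} -atleast {slots} -service {{{service}}}"
    )


# Example: a node with 4 slots, using placeholder command and service key.
print(remote_cmd_line("echo hello", 4, "blah"))
```
</pre>

<div>The slot count would have to match what the chosen node actually offers, since the line asks for exactly that many.</div>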
<div> </div>
<div>I’m happy to help you guys make these
scripts work if and when the time comes.</div>
<div> </div>
<div>- chris</div>
<div>
<div><br>
<div>
<div>On Apr 20, 2015, at 5:28 PM, Lee
Spector <<a
moz-do-not-send="true"
href="mailto:lspector@hampshire.edu">lspector@hampshire.edu</a>>
wrote:</div>
<br>
<blockquote type="cite"
style="padding-left:5px;
border-left:#1010ff 2px solid;
margin-left:5px">
<div style="word-wrap: break-word;">
<div> </div>
<div>This, in conjunction with our
observations of weird
multithreading performance
pathologies on the JVM, makes me
think that we SHOULDN'T change
the slots per node from whatever
we're currently doing, at least
not universally or by default.
Is there a way to make this work
as it currently does most of the
time, but for me to grab a
machine in 1-slot mode when I
want one?</div>
<div> </div>
<div>My use case is the exception,
not the rule, and I usually need
one or a small handful of these
sessions going at a time (I run
them individually and watch
them). At the moment I'm just
using my desktop mac, and I
think that actually runs my code
faster than any fly node at the
moment (although 1-4 seemed to
perform well). I may want to run
on a fly node or two at some
point too, but I wouldn't want
the performance of other work
(especially Tom's many many
large runs) to take a hit for
this. </div>
<div> </div>
<div> -Lee</div>
<div> </div>
<div> </div>
<br>
<div>
<blockquote type="cite"
style="padding-left:5px;
border-left:#1010ff 2px solid;
margin-left:5px">
<div>On Apr 20, 2015, at 4:59
PM, Thomas Helmuth <<a
moz-do-not-send="true"
href="mailto:thelmuth@cs.umass.edu">thelmuth@cs.umass.edu</a>>
wrote:</div>
<br>
<div>
<div dir="ltr">I agree with
the others that it should
be pretty easy to do
everything you need. And I
can walk you through how
to easily do it in Clojush
with the python script I
have to launch runs.<br>
<div><br>
I'm not sure if I'd get
more or less performance
with one slot per node,
but I'm guessing less --
I often get multiple
runs going on a node, so
I'd need to get 4x or 8x
speedups from
multithreading in a
single run to get the
same amount of work
done. But, maybe this
would just happen
automatically with how
we have it set up in
Clojush.<br>
<br>
</div>
<div>Tom</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On
Mon, Apr 20, 2015 at
4:19 PM, Chris Perry <span><<a
moz-do-not-send="true" href="mailto:perry@hampshire.edu">perry@hampshire.edu</a>></span>
wrote:<br>
<blockquote
class="gmail_quote"
style="margin: 0 0 0
.8ex; border-left: 1px
#ccc solid;
padding-left: 1ex;"><br>
I just wrote a post to
the tractor developer
forum asking about the
one piece of this that
I’m not 100% sure
about, but if we go
with Josiah’s
recommendation of one
slot per blade then my
post will be
irrelevant and YES we
can do exactly what
you want.<br>
<br>
And Josiah, I do think
that you’re right
about one slot per
node, not only because
of the great
multithreading that
everyone’s doing, but
because our fileserver
can’t keep up with too
many concurrent
reads/writes anyway.<br>
<span><span
style="color:
#888888;"><br>
- chris<br>
</span></span>
<div>
<div><br>
On Apr 20, 2015,
at 3:55 PM, Wm.
Josiah Erikson
<<a
moz-do-not-send="true"
href="mailto:wjerikson@hampshire.edu">wjerikson@hampshire.edu</a>>
wrote:<br>
<br>
><br>
><br>
> On 4/20/15
3:43 PM, Lee
Spector wrote:<br>
>> Maybe
it's finally time
for me to do this.
As Tom points out
I should have a
better handle on
it before he
leaves. But the
runs that I do are
a little different
from what he
usually does.<br>
>><br>
>> Is it now
possible to do
this with all of
these properties?:<br>
>><br>
>> - My run
starts
immediately.<br>
> Yes. If you
spool with a high
enough priority,
it will actually<br>
> terminate and
re-spool other
jobs that are in
your way.<br>
>> - It runs
on a node that I
specify.<br>
> Yes. The
service key would
just be the node
you want.<br>
>> - No
other jobs run on
that node until my
run is done, even
if mine
temporarily goes
to low cpu
utilization.<br>
> We would have
to make it so that
your node had one
slot, and then you<br>
> did something
like this if you
had more than one
command:<br>
> <a
moz-do-not-send="true"
href="http://fly.hampshire.edu/docs/Tractor/scripting.html#sharing-one-server-check-out-among-several-commands">http://fly.hampshire.edu/docs/Tractor/scripting.html#sharing-one-server-check-out-among-several-commands</a>.<br>
> I think that
most things that
we are doing these
days are
multithreaded<br>
> enough that
one slot per node
might be a valid
choice
cluster-wide... at<br>
> least at the
moment with
Maxwell and prman.
What say others?<br>
>> - I can
see stderr as well
as std out, which
I'll be piping to
a file (which has
to be somewhere I
can grep it).<br>
> Just pipe
each one to
whatever file you
want. yes.<br>
>> - I can
easily and
instantly kill and
restart my runs.<br>
> Yes.
Right-click and
restart, interrupt
or delete.<br>
>><br>
>> If so
then yes, I guess
I should try to
switch to a
tractor-based
workflow for my
work on fly.<br>
>><br>
>> -Lee<br>
>><br>
>><br>
>>> On
Apr 20, 2015, at
3:26 PM, Chris
Perry <<a
moz-do-not-send="true"
href="mailto:perry@hampshire.edu">perry@hampshire.edu</a>> wrote:<br>
>>><br>
>>><br>
>>> I
have a different
ideal solution to
propose: use
tractor!<br>
>>><br>
>>> What
you are looking to
do with your NIMBY
mechanism is to
pull machines out
of tractor
temporarily so
they can run your
job(s). But this
is exactly what
tractor does: it
gives a job to an
idle machine then
keeps that machine
from running new
jobs until the
first job is
killed or
completed. If you
spool with high
enough priority,
tractor kills and
respools a lower
priority job so as
to free up the
machine, so you
don’t have to wait
if immediate
access is a
concern of yours.<br>
>>><br>
>>> Not
to mention, the
tractor interface
will show running
jobs so there
would be a “Lee’s
job” item in green
on the list. This
might help keep
you less likely to
lose track in the
way you describe
happening now.<br>
>>><br>
>>> Worth
considering? I’m
sure there are a
few kinks to
figure out (such
as how to tag your
jobs so that they
have the full
machine,
guaranteed) but I
feel confident
that we can do
this.<br>
>>><br>
>>> -
chris<br>
>>><br>
>>> On
Apr 20, 2015, at
2:31 PM, Lee
Spector <<a
moz-do-not-send="true"
href="mailto:lspector@hampshire.edu">lspector@hampshire.edu</a>>
wrote:<br>
>>><br>
>>>>
Some of these were
probably my doing,
but I only recall
nimbying 1-4 and
4-5 in the recent
past.<br>
>>>><br>
>>>>
It's not a problem
with a node that
causes me to do
this, it's an
interest in having
total control of a
node for a
compute-intensive
multithreaded run,
with no chance
that any other
processes will be
allocated on it.
Sometimes I'll
start one of
these, check in on
it regularly for a
while, and then
check in less
frequently if it's
not doing anything
really interesting
but I'm not ready
to kill it in case
it might still do
something good.
Then sometimes I
lose track. Right
now I have nothing
running, and any
nimbying that I've
done can be
unnimbyed,
although I'm not
100% sure what I
may have left
nimbyed.<br>
>>>><br>
>>>>
Ideally, I guess,
we'd have a nimby
command that
records who
nimbyed and maybe
periodically asks
them if they still
want it nimbyed.
I'm not sure how
difficult that
would be. If
somebody is
inspired to do
this, then it
would also be nice
if the command for
nimbying/unnimbying
was more
straightforward
than the current
one (which takes a
1 or a 0 as an
argument, which is
a little
confusing), if it
returned a value
that made it more
clear what the new
status is, and if
there was a simple
way just to check
the status (which
maybe there is
now? if so I don't
know it).<br>
>>>><br>
>>>>
-Lee<br>
>>>><br>
>>>><br>
>>>>>
On Apr 20, 2015,
at 1:36 PM, Wm.
Josiah Erikson
<<a
moz-do-not-send="true"
href="mailto:wjerikson@hampshire.edu">wjerikson@hampshire.edu</a>>
wrote:<br>
>>>>><br>
>>>>>
Hi all,<br>
>>>>>
Why are nodes
1-17, 1-18, 1-2,
1-4, and 1-9
NIMBYed? If you
are<br>
>>>>>
having a problem
with a node that
causes you to need
to NIMBY it,
please<br>
>>>>>
let me know,
because maybe it
just means I
screwed up the
service keys<br>
>>>>>
or something. I'm
not omniscient :)<br>
>>>>><br>
>>>>>
--<br>
>>>>>
Wm. Josiah Erikson<br>
>>>>>
Assistant Director
of IT,
Infrastructure
Group<br>
>>>>>
System
Administrator,
School of CS<br>
>>>>>
Hampshire College<br>
>>>>>
Amherst, MA 01002<br>
>>>>>
<a
moz-do-not-send="true"
href="tel:%28413%29%20559-6091">(413) 559-6091</a><br>
>>>>><br>
>>>>>
_______________________________________________<br>
>>>>>
Clusterusers
mailing list<br>
>>>>>
<a
moz-do-not-send="true"
href="mailto:Clusterusers@lists.hampshire.edu">Clusterusers@lists.hampshire.edu</a><br>
>>>>>
<a
moz-do-not-send="true"
href="https://lists.hampshire.edu/mailman/listinfo/clusterusers">https://lists.hampshire.edu/mailman/listinfo/clusterusers</a><br>
>>>>
--<br>
>>>>
Lee Spector,
Professor of
Computer Science<br>
>>>>
Director,
Institute for
Computational
Intelligence<br>
>>>>
Cognitive Science,
Hampshire College<br>
>>>>
893 West Street,
Amherst, MA
01002-3359<br>
>>>> <a
moz-do-not-send="true" href="mailto:lspector@hampshire.edu">lspector@hampshire.edu</a>,
<a
moz-do-not-send="true"
href="http://hampshire.edu/lspector/">http://hampshire.edu/lspector/</a><br>
>>>>
Phone: <a
moz-do-not-send="true"
href="tel:413-559-5352">413-559-5352</a>, Fax: <a
moz-do-not-send="true"
href="tel:413-559-5438">413-559-5438</a><br>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
Wm. Josiah Erikson
Assistant Director of IT, Infrastructure Group
System Administrator, School of CS
Hampshire College
Amherst, MA 01002
(413) 559-6091
</pre>
</body>
</html>