[Clusterusers] long running renders

Thomas Helmuth thelmuth at cs.umass.edu
Wed Jun 17 13:17:45 EDT 2015


I just spooled some runs to test it out, and it seems to have worked for
me. So this isn't a general Tractor failure.
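
If it's only failing from your machine, one quick thing to check is whether
the Tractor engine hostname still resolves from there -- "hostname lookup
failed" is usually a name-resolution problem rather than a farm problem.
Something like this would tell you (the engine name below is only a
placeholder, not the real one):

    import socket

    engine = "tractor-engine"  # placeholder; use the actual engine hostname
    try:
        print(engine, "->", socket.gethostbyname(engine))
    except socket.gaierror as err:
        print("lookup failed:", err)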

Tom

On Wed, Jun 17, 2015 at 1:09 PM, Jordan Miron <jordanamiron at gmail.com>
wrote:

> So I was trying to spool a couple of new renders today, and the terminal
> gave me the error "hostname lookup failed." This is the first time I've
> tried to spool anything since the 4th. Is there something that's been
> changed about the farm since then that could be causing this? And does
> anyone know how I can fix it? This is the command I used:
>
> /helga/global/wc/scripts/launcher.py \
>     jordanD3 scenes shot1ins4 hotbeef maya2015maxwell31 1-100 \
>     -proj /helga/tmp/jordanD3/ \
>     -file /helga/tmp/jordanD3/scenes/shot01.mb \
>     -sl 16 -time 2160 -dumpmxs false
>
> -Jordan
>
> On Sat, Jun 13, 2015 at 1:59 PM, Bassam Kurdali <bassam at urchn.org> wrote:
>
>> Hmm, I think it is going OK for now; I mainly just thought that 1
>> day per node-frame seemed like a pathological render time. If that is just
>> what Maxwell does, maybe that's OK.
>> Our current title render is probably sub-optimal, but it seems to be going
>> OK.
>> For the next batch of renders from tube we're going to be rendering
>> 1/3-resolution, low-sample renders, so I *think* we'll be OK; I'll post
>> here if things seem to go too slow. In that case we could look at
>> restricting nodes, temporarily pausing and restarting long renders, playing
>> priority games, or using external render farms (for tube). In the latter
>> case I'll try to script some downloads so the rendered frames go into
>> helga/tube/renders the same way they do when they render on the cluster.
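>>
>> Roughly, the download step could be as simple as an rsync pull into the
>> same directory layout -- something like this (untested; the remote host
>> and shot directory here are placeholders, not real paths):
>>
>>     import subprocess
>>
>>     # Hypothetical source on the external farm; the local side follows
>>     # the helga/tube/renders layout mentioned above.
>>     REMOTE = "render-farm:/outgoing/tube/shot_010/"
>>     LOCAL = "/helga/tube/renders/shot_010/"
>>
>>     # -a keeps timestamps/permissions, -z compresses in transit,
>>     # --partial lets interrupted frame transfers resume.
>>     subprocess.run(["rsync", "-az", "--partial", REMOTE, LOCAL], check=True)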
>>
>> On Fri, 2015-06-12 at 23:22 -0400, Lee Spector wrote:
>>
>>
>> It's possible that my recent behavior has had some impact too -- for the
>> last week or so I've been doing some big, long runs on 4 nodes (1-1, 1-2,
>> 1-17, and 1-18) outside of Tractor. I didn't nimby the nodes, and I saw
>> other things running on them at various times, but at present it looks
>> like nothing else is getting allocated to them except one run of Tom's on
>> 1-18. I may scale this back after examining progress tomorrow.
>>
>>  -Lee
>>
>> On Jun 12, 2015, at 11:12 PM, Bassam Kurdali <bassam at urchn.org> wrote:
>>
>> We can hold off changing anything for now - I really feel something has to
>> be wrong with the files to get such long render times, but I don't have
>> experience with Maxwell. I'd never use it if that were 'normal'.
>> I'm going to try to get some free nodes from RenderStreet - I just emailed
>> them (they typically offer free rendering to open movie projects) - and
>> perhaps I'll try Qarnot as well, though I don't have a relationship with
>> the latter. If it works out we won't need to use the farm as much. I'm
>> trying a render on it right now and it doesn't seem too, too terrible;
>> we're getting about 45 minutes to an hour per frame on this particular
>> file at about 14 nodes concurrent. At this rate our shot should finish in
>> 4 days; I would probably have expected a day or less in the old days, but
>> it is hard to compare since it has been a while.
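>>
>> Back-of-the-envelope, with the frame count as a made-up example (I haven't
>> pulled the real shot length), the arithmetic is roughly:
>>
>>     # Rough wall-clock estimate: frames spread evenly over concurrent nodes.
>>     def estimate_days(frames, minutes_per_frame, concurrent_nodes):
>>         rounds = frames / concurrent_nodes        # passes over the node pool
>>         return rounds * minutes_per_frame / (60 * 24)
>>
>>     # e.g. a hypothetical 1300-frame shot at ~60 min/frame on 14 nodes:
>>     print(round(estimate_days(1300, 60, 14), 1))  # -> about 3.9 days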
>>
>> I'm probably going to pause this particular render - based on the
>> settings, it looks like it is optimized for GPU - and prioritize another
>> shot from another artist. He's going to do a preview-quality render, so
>> I'm hoping we can get a turnaround of a day; if that works out, it is
>> acceptable.
>> cheers
>> Bassam
>>
>>
>> On Fri, 2015-06-12 at 22:16 -0400, Thomas Helmuth wrote:
>>
>> Yes, those long renders have been going for a few weeks, I think. But at a
>> max of 40 launches, they haven't interfered too much with what I'm doing
>> -- I can still get 80+ GP runs launched at once, which is good since I'm
>> finishing up some experiments for my dissertation. If there's any change,
>> the one that would help me the most is moving them to only (or primarily)
>> use the ash nodes, since I can't use those myself. But I don't consider
>> them to be a problem unless someone else needs to get some heavy use in
>> soon.
>>
>> Tom
>>
>> On Fri, Jun 12, 2015 at 10:11 PM, Chris Perry <perry at hampshire.edu>
>> wrote:
>>
>>
>> Jordan seems to have spooled with a time limit of 2160 minutes, which,
>> yes, is 36 hours per frame max, though many frames seem to be finishing in
>> less time than that.
>>
>> If this is hogging the farm too much, I think we could and should limit
>> the Maxwell launches. Right now we’re at 40. I could lower it to 20 or so,
>> which would still give Jordan cycles but not at the expense of others
>> needing to render.
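>>
>> For reference, I believe that cap lives in Tractor's limits.config as a
>> site-wide limit, so the change would look roughly like this (the "maxwell"
>> tag name is my assumption -- check the actual config for the real key):
>>
>>     {
>>         "Limits": {
>>             "maxwell": { "SiteMax": 20 }
>>         }
>>     }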
>>
>> LMK what you all think. If I can’t deal with this before I leave town
>> (tomorrow), Josiah can definitely change the limit too.
>>
>> - chris
>>
>>
>> On Jun 12, 2015, at 9:21 PM, Bassam Kurdali <bassam at urchn.org> wrote:
>>
>> > I've noticed there are some renders on the farm that are taking a day
>> > per frame (or more) running on 40 or so nodes - is there something wrong
>> > with the farm or with these renders? They seem to be Maxwell jobs.
>> >
>> > There are a few hundred frames left, so they would finish a few months
>> > from now - assuming no one else renders in that time frame.
>> > Cheers
>> > Bassam
>>
>>
>> --
>> Lee Spector, Professor of Computer Science
>> Director, Institute for Computational Intelligence
>> Cognitive Science, Hampshire College
>> 893 West Street, Amherst, MA 01002-3359
>> lspector at hampshire.edu, http://hampshire.edu/lspector/
>> Phone: 413-559-5352, Fax: 413-559-5438
>>
>>
>