[Clusterusers] long-running renders

Jordan Miron jordanamiron at gmail.com
Wed Jun 17 13:09:58 EDT 2015


So I was trying to spool a couple of new renders today, and the terminal gave
me the error "hostname lookup failed." This is the first time I've tried to
spool anything since the 4th. Has something changed about the farm since then
that could be causing this? And does anyone know how I can fix it? This is
the command I used:

/helga/global/wc/scripts/launcher.py \
    jordanD3 scenes shot1ins4 hotbeef maya2015maxwell31 1-100 \
    -proj /helga/tmp/jordanD3/ \
    -file /helga/tmp/jordanD3/scenes/shot01.mb \
    -sl 16 -time 2160 -dumpmxs false
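
For reference, a quick way to check whether this is a name-resolution problem
on the submitting machine ("tractor-engine" below is just a placeholder for
whatever host launcher.py actually spools to):

    #!/usr/bin/env python
    # Minimal check for a "hostname lookup failed" style error:
    # can this machine resolve the engine host at all?
    import socket

    host = "tractor-engine"  # placeholder; use the real engine hostname
    try:
        print(host, "->", socket.gethostbyname(host))
    except socket.gaierror as err:
        print("lookup failed:", err)
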
-Jordan

On Sat, Jun 13, 2015 at 1:59 PM, Bassam Kurdali <bassam at urchn.org> wrote:

> Hmm, I think it is going OK for now; I mainly just thought that 1
> day/node-frame seemed like a pathological render time. If that is what
> Maxwell just does, maybe that's OK.
> Our current title render is probably sub-optimal, but it seems to be going
> OK.
> For the next batch of renders from tube we're going to be rendering
> 1/3-resolution, low-sample renders, so I *think* we'll be OK; I'll post
> here if things seem to go too slow. In that case we could look at
> restricting nodes, temporarily pausing and restarting long renders, playing
> priority games, or using external render farms (for tube). In the latter
> case I'll try to script some downloads so the rendered frames go into
> helga/tube/renders the same way they do when they render on the cluster
> (rough sketch below).
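>
> Something like this is what I have in mind; every specific here (the
> remote host, the remote path, rsync over ssh) is a guess, not how
> renderstreet actually delivers frames:
>
>     #!/usr/bin/env python
>     # Hypothetical sync script: pull finished frames from an external
>     # farm into /helga/tube/renders so they land where cluster renders
>     # would. Host and paths are placeholders.
>     import subprocess
>
>     REMOTE = "render@farm.example.com:/renders/tube/shot01/"  # placeholder
>     LOCAL = "/helga/tube/renders/shot01/"
>
>     # --partial keeps interrupted transfers resumable;
>     # --ignore-existing avoids re-pulling frames we already have.
>     subprocess.check_call([
>         "rsync", "-av", "--partial", "--ignore-existing",
>         REMOTE, LOCAL,
>     ])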
>
> On Fri, 2015-06-12 at 23:22 -0400, Lee Spector wrote:
>
>
> It's possible that my recent behavior has had some impact too -- for the
> last week or so I've been doing some big long runs on 4 nodes (1-1, 1-2,
> 1-17, and 1-18) outside of Tractor. I didn't nimby the nodes (mark them as
> off-limits to Tractor), and I saw other things running on them at various
> times, but at present it looks like nothing else is getting allocated to
> them except one run of Tom's on 1-18. I may scale this back after examining
> progress tomorrow.
>
>  -Lee
>
> On Jun 12, 2015, at 11:12 PM, Bassam Kurdali <bassam at urchn.org> wrote:
>
> We can hold off changing anything for now - I really feel something has to
> be wrong with the files to get such long render times, but I don't have
> experience with Maxwell. I'd never use it if that were 'normal'.
> I'm going to try to get some free nodes from RenderStreet - I just emailed
> them (they typically offer free rendering to open movie projects) - and
> perhaps I'll try Qarnot as well, though I don't have a relationship with
> the latter. If it works out, we won't need to use the farm as much. I'm
> trying a render on the farm right now and it doesn't seem too terrible;
> we're getting about 45 minutes to an hour per frame on this particular file
> at about 14 nodes concurrent. At this rate our shot should finish in 4 days
> (rough math below); I would probably have expected a day or less in the old
> days, but it is hard to compare since it has been a while.
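>
> Back-of-the-envelope arithmetic for estimates like this (the frame count
> is hypothetical, just to show the calculation):
>
>     # wall time ~= frames * hours_per_frame / concurrent_nodes
>     frames = 1300          # hypothetical shot length
>     hours_per_frame = 1.0  # ~45-60 min/frame, as above
>     nodes = 14
>     days = frames * hours_per_frame / nodes / 24
>     print(round(days, 1))  # ~3.9, consistent with the 4-day estimate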
>
> I'm probably going to pause this particular render (based on the settings,
> it looks like it is optimized for GPU) and prioritize another shot from
> another artist. He's going to do a preview-quality render, so I'm hoping we
> can get a turnaround of a day; if that works out, it is acceptable.
> cheers
> Bassam
>
>
> On Fri, 2015-06-12 at 22:16 -0400, Thomas Helmuth wrote:
>
> Yes, those long renders have been going for a few weeks, I think. But at a
> max of 40 launches, they haven't interfered too much with what I'm doing --
> I can still get 80+ GP runs launched at once, which is good since I'm
> finishing up some experiments for my dissertation. If anything changes, the
> change that would help me the most is moving the Maxwell jobs to use only
> (or primarily) the ash nodes, since I can't use those myself (a sketch of
> one way to do that is below). But I don't consider them to be a problem
> unless someone else needs heavy use soon.
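>
> If the farm is a standard Tractor 2 setup, one way to steer those jobs
> might be a dedicated blade profile; the profile name, host pattern, and
> service key below are all made up for illustration:
>
>     # blade.config sketch (JSON-style, Tractor 2 conventions assumed)
>     "BladeProfiles": [
>         {
>             "ProfileName": "ash-maxwell",      # hypothetical name
>             "Hosts": { "Name": ["ash-*"] },    # match the ash nodes
>             "Provides": ["Maxwell"]            # advertised service key
>         }
>     ]
>
> The Maxwell jobs would then request the "Maxwell" service key at spool
> time, so they only match blades that provide it.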
>
> Tom
>
> On Fri, Jun 12, 2015 at 10:11 PM, Chris Perry <perry at hampshire.edu> wrote:
>
>
> Jordan seems to have spooled with a time limit of 2160 minutes, which, yes,
> is a 36-hour-per-frame maximum, though many frames seem to be finishing in
> less time than that.
>
> If this is hogging the farm too much, I think we could and should limit the
> Maxwell launches. Right now we’re at 40. I could lower it to 20 or so,
> which would still give Jordan cycles but not at the expense of others
> needing to render (see the sketch below).
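>
> If we’re using Tractor’s standard limits mechanism, the change would be
> something like this in limits.config (the "maxwell" limit name is an
> assumption about how our config is keyed):
>
>     "Limits": {
>         "maxwell": { "SiteMax": 20 }   # was 40
>     }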
>
> LMK what you all think. If I can’t deal with this before I leave town
> (tomorrow), Josiah can definitely change the limit too.
>
> - chris
>
>
> On Jun 12, 2015, at 9:21 PM, Bassam Kurdali <bassam at urchn.org> wrote:
>
> > I've noticed there are some renders on the farm that are taking a day a
> > frame (or more) running on 40 or so nodes - is there something wrong
> > with the farm or with these renders? They seem to be Maxwell jobs.
> >
> > There are a few hundred frames left, so they would finish a few months
> > from now - assuming no one else renders in that time frame.
> > Cheers
> > Bassam
>
> --
> Lee Spector, Professor of Computer Science
> Director, Institute for Computational Intelligence
> Cognitive Science, Hampshire College
> 893 West Street, Amherst, MA 01002-3359
> lspector at hampshire.edu, http://hampshire.edu/lspector/
> Phone: 413-559-5352, Fax: 413-559-5438
>
>