[Clusterusers] More keg rebuilding

Wm. Josiah Erikson wjerikson at hampshire.edu
Wed Jan 15 11:08:29 EST 2014


OK, first keg reboot done and was a success. It doesn't look like 
anything was interrupted or crashed - at least, not that I've noticed yet :)
     -Josiah

On 1/15/14 11:01 AM, Thomas Helmuth wrote:
> Hi Josiah,
>
> I'm now ready for the keg reboot whenever you are. It would be "nice" to
> have the last few runs finish, but it's definitely not necessary if the
> reboot kills them. Let me know when it's done, since I have some more
> runs I want to start.
>
> Thanks,
> Tom
>
>
> On Tue, Jan 14, 2014 at 8:55 PM, Thomas Helmuth
> <thelmuth at cs.umass.edu> wrote:
>
>     Hi Josiah,
>
>     I'm hoping most of my runs should be done by tomorrow, either
>     early or late morning. Even if a few are still going, they aren't
>     as critical as ones that finished today. So, I think it would be a
>     great time to reboot keg if it works for you. I'll take a look at
>     what's still running in the morning and will be in touch.
>
>     -Tom
>
>
>     On Mon, Jan 13, 2014 at 3:28 PM, Wm. Josiah Erikson
>     <wjerikson at hampshire.edu> wrote:
>
>         OK, let me know. What I'm doing is long-term maintenance, not
>         time-sensitive research :)
>             -Josiah
>
>
>         On 1/13/14 3:24 PM, Thomas Helmuth wrote:
>>         Well, that would certainly work. Currently, I'd really like
>>         all of my green runs to finish, which I'm hoping will be
>>         within a day or two, but after that it might actually be a
>>         nice break point for the reboot. We'll see if it works
>>         out well to reboot sometime tomorrow or Wednesday.
>>
>>         -Tom
>>
>>
>>         On Mon, Jan 13, 2014 at 3:19 PM, Wm. Josiah Erikson
>>         <wjerikson at hampshire.edu> wrote:
>>
>>             You know, I could just wait on all of this until
>>             February, too, if you're feeling the crunch. Would that
>>             be better?
>>
>>                 -Josiah
>>
>>
>>             On 1/13/14 9:27 AM, Thomas Helmuth wrote:
>>>             Sure. I doubt the runs I have currently will be done by
>>>             the end of the day. The most important are the Pagie
>>>             runs; the others can be paused until after the keg
>>>             reboot. I'm guessing they'll be done sometime tomorrow,
>>>             but it's hard to say.
>>>
>>>             -Tom
>>>
>>>
>>>             On Mon, Jan 13, 2014 at 9:24 AM, Wm. Josiah Erikson
>>>             <wjerikson at hampshire.edu> wrote:
>>>
>>>                 Keg is the license server for the Pixar stuff, so if
>>>                 keg goes down for too long, the runs sometimes crash
>>>                 due to license checkout failures for tractor.
>>>
>>>                 We should coordinate about rebooting keg - maybe
>>>                 when your current runs finish, before you start new
>>>                 ones? Hopefully at the end of the day or something.
>>>
>>>                     -Josiah
>>>
>>>
>>>
>>>                 On 1/13/14 9:17 AM, Thomas Helmuth wrote:
>>>>                 Hi Josiah,
>>>>
>>>>                 I don't really know what keg is (maybe the HDDs for
>>>>                 fly?), but I'll assume you have everything under
>>>>                 control. I do have a paper deadline coming up at
>>>>                 the end of January, and the runs currently going
>>>>                 are very important for it, so I guess be extra
>>>>                 careful that runs/data aren't lost.
>>>>
>>>>                 -Tom
>>>>
>>>>
>>>>                 On Mon, Jan 13, 2014 at 9:05 AM, Wm. Josiah Erikson
>>>>                 <wjerikson at hampshire.edu> wrote:
>>>>
>>>>                     Hi guys,
>>>>                         So I'm going to try again with resizing
>>>>                     keg's RAID array. I am going to be a little
>>>>                     more careful this time. The first thing I'm
>>>>                     going to do, today, is to remove the fourth
>>>>                     drive from the RAID array, rebuild its
>>>>                     partition table with GPT, which can handle
>>>>                     partitions larger than 2TB, and then re-add it
>>>>                     to the array and let it rebuild. Then I'm going
>>>>                     to reboot keg and make sure everything comes
>>>>                     back up fine. I will wait to reboot, though,
>>>>                     until Tom's jobs are done, if that would be
>>>>                     helpful. It looks like they are relatively
>>>>                     short-running jobs, Tom (I mean, for you... heh)?
>>>>                         I'll ping you again before I reboot it,
>>>>                     which "shouldn't" hose your jobs, but things
>>>>                     haven't gone as planned twice recently, so I'm
>>>>                     less confident about that :)
>>>>                         Of course, now that I'm saying that,
>>>>                     everything will go as planned. It's like
>>>>                     bringing an umbrella to ensure it doesn't rain,
>>>>                     probably :)
>>>>
>>>>                     -- 
>>>>                     Wm. Josiah Erikson
>>>>                     Assistant Director of IT, Infrastructure Group
>>>>                     System Administrator, School of CS
>>>>                     Hampshire College
>>>>                     Amherst, MA 01002
>>>>                     (413) 559-6091
>>>>
>>>>                     _______________________________________________
>>>>                     Clusterusers mailing list
>>>>                     Clusterusers at lists.hampshire.edu
>>>>                     https://lists.hampshire.edu/mailman/listinfo/clusterusers
>>>>

-- 
Wm. Josiah Erikson
Assistant Director of IT, Infrastructure Group
System Administrator, School of CS
Hampshire College
Amherst, MA 01002
(413) 559-6091
