[Clusterusers] Keg's NFS share down

Wm. Josiah Erikson wjerikson at hampshire.edu
Wed Jan 29 13:51:52 EST 2014


Bassam and I just went in and looked at this together, after I
discovered that the RAID 5 array had spontaneously resized itself and
then resynced to approximately 2TB, when it's supposed to be ~9TB.
However, the FILESYSTEM still thinks it's the right size. So we told the
array to resize itself again; it responded appropriately and is now
resyncing, and when we mount the filesystem, we no longer get the
input/output errors we did before. Cross your fingers - if this works,
it will be one of the more harrowing narrow escapes I've ever had.
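
For anyone who wants to check for this kind of mismatch themselves, here
is a minimal Python sketch of the comparison, not the procedure we
actually ran. It assumes you already know the filesystem's block count
and block size (for example from its superblock), and that the array
shows up under sysfs with a name like "md0", which is a hypothetical
name here. The sizes in the example are the round figures from this
message, not measured values.

```python
from pathlib import Path

SECTOR_BYTES = 512  # sysfs reports block device sizes in 512-byte sectors


def device_size_bytes(name):
    """Read a block device's size from sysfs (Linux only).

    `name` is a kernel device name such as "md0" (hypothetical here).
    """
    sectors = int(Path("/sys/class/block", name, "size").read_text())
    return sectors * SECTOR_BYTES


def filesystem_overruns_device(fs_block_count, fs_block_size, device_bytes):
    """Return True if the filesystem claims more space than the device has."""
    return fs_block_count * fs_block_size > device_bytes


if __name__ == "__main__":
    TB = 10**12
    fs_block_size = 4096                      # assumed block size
    fs_blocks = (9 * TB) // fs_block_size     # filesystem sized for ~9TB
    shrunken_array = 2 * TB                   # array after the bad resize
    print(filesystem_overruns_device(fs_blocks, fs_block_size, shrunken_array))
```

If the function returns True, reads past the end of the device are
likely to fail, which would be consistent with the input/output errors
we were seeing.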

We'll know tomorrow, as I'm leaving today before it finishes resyncing.

     -Josiah


On 1/29/14 12:46 PM, Bassam Kurdali wrote:
> On Wed, 2014-01-29 at 10:36 -0500, Wm. Josiah Erikson wrote:
>> Well... the reshape finished, I figured out why the NFS errors were
>> happening, and the online resize finished successfully as well. So I
>> rebooted to re-seat the RAM, which worked fine, and we're back to 6GB.
>> But when the machine came back up, I had lots of input/output errors on
>> the filesystem. It's currently booted from a rescue CD and resyncing
>> again (why? I don't know); after that I'll fsck the filesystem and see
>> if I can rescue it. If not, we're looking at several days of downtime,
>> rebuilding from backups, and losing any renders that people didn't have
>> backed up.
> eek! I hope that doesn't happen, too. We have our renders backed up to
> a certain point, though the backup is probably a month or two old by now.
> What about svn and the stuff in git / sparkleshare? We don't have a
> backup of those repos (now I feel stupid).
>
>> I hope that doesn't happen - I'll keep you updated.
>>       -Josiah
>>
>>
>> On 1/28/14 11:04 PM, Wm. Josiah Erikson wrote:
>>> In the middle of the reshape, I got some really strange NFS errors
>>> that were making me nervous, so in the interest of not corrupting any
>>> data, I have taken keg's NFS offline. It will be back online tomorrow
>>> morning when the reshape finishes. This will render a bunch of things
>>> inoperable, but mainly nobody will be able to render.
>>>
>
> _______________________________________________
> Clusterusers mailing list
> Clusterusers at lists.hampshire.edu
> https://lists.hampshire.edu/mailman/listinfo/clusterusers

-- 
Wm. Josiah Erikson
Assistant Director of IT, Infrastructure Group
System Administrator, School of CS
Hampshire College
Amherst, MA 01002
(413) 559-6091


