[Clusterusers] Stress-testing the cluster tonight
Wm. Josiah Erikson
wjerikson at hampshire.edu
Sat May 31 17:14:36 EDT 2014
Actually, you don't even have to do that. If fly1.hampshire.edu
stops pinging but fly.hampshire.edu is still pinging, just log into
fly and run "sudo ifdown eth2" followed by "sudo ifup eth2". That should
fix it.
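
The check above can be sketched as a small shell helper. This is only a
rough sketch, not something that is installed on fly: the function names
are made up for illustration, the ping options assume Linux iputils
(-c for count, -W for timeout in seconds), and the "both down" branch
simply falls back to the reboot mentioned earlier in this thread.

```shell
#!/bin/sh
# Sketch only. Hostnames, eth2, and tractor-engine come from this thread;
# is_up, state, and choose_fix are hypothetical helper names.

# is_up HOST: succeeds if HOST answers a single ping within 2 seconds.
# Assumes Linux iputils ping (-c count, -W timeout in seconds).
is_up() {
    ping -c 1 -W 2 "$1" >/dev/null 2>&1
}

# state HOST: print "up" or "down" for HOST.
state() {
    if is_up "$1"; then echo up; else echo down; fi
}

# choose_fix FLY1_STATE FLY_STATE: print the suggested action.
# Each argument is the string "up" or "down".
choose_fix() {
    case "$1:$2" in
        up:*)    echo "eth2 looks fine" ;;
        down:up) echo "bounce eth2" ;;
        *)       echo "reboot, then start tractor-engine" ;;
    esac
}
```

To use it, something like
choose_fix "$(state fly1.hampshire.edu)" "$(state fly.hampshire.edu)"
prints the suggestion. On "bounce eth2", log into fly and run the
ifdown/ifup pair by hand; the sketch deliberately does not run sudo itself.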
-Josiah
On 5/31/14 5:09 PM, Wm. Josiah Erikson wrote:
> Yeah, you can reboot it and then run
> "sudo /etc/init.d/tractor-engine start" on the head node.
> -Josiah
>
>
> On 5/31/14 12:06 PM, Bassam Kurdali wrote:
>> Is there something we can do to goose it while you're away like
>> rebooting (if it goes down again)?
>>
>> "Wm. Josiah Erikson" <wjerikson at hampshire.edu> wrote:
>>
>>> This was that eth2 problem. I hope the kernel update I just did will
>>> keep it from happening again... but the kernel parameter line I added
>>> before was supposed to fix it even without the kernel update. Keep your
>>> fingers crossed...
>>> -Josiah
>>>
>>> On 5/30/14 5:06 PM, Bassam Kurdali wrote:
>>>> Hmm, maybe fly is having problems as a result? Tractor just 'went away'.
>>>> I can still SSH into fly, though. Is there anything I can do to restart
>>>> tractor? Otherwise we are down for a week :(
>>>> On Thu, 2014-05-29 at 23:05 -0400, Wm. Josiah Erikson wrote:
>>>>> I am ferreting out any weak nodes by running dnetc on anything that
>>>>> isn't doing anything else. My goal is to get rid of/fix any nodes that
>>>>> are not reliable. Please let me know if you feel like this is causing
>>>>> any problems for you. I will stop this tomorrow morning - it's just
>>>>> overnight, as I'm going on vacation for a week starting Saturday. My
>>>>> tentative belief at this point is that all of the nodes are fully
>>>>> functional and reliable.... testing that belief :)
>>>>>
>>>> _______________________________________________
>>>> Clusterusers mailing list
>>>> Clusterusers at lists.hampshire.edu
>>>> https://lists.hampshire.edu/mailman/listinfo/clusterusers
>
--
-----
Wm. Josiah Erikson
Head, Systems and Networking
Hampshire College
Amherst, MA 01002