[Clusterusers] Stress-testing the cluster tonight
Bassam Kurdali
bassam at urchn.org
Sun Jun 1 13:22:54 EDT 2014
One problem: I don't have permissions as anim to do that! (PS: it is
down now.)
On Sat, 2014-05-31 at 17:14 -0400, Wm. Josiah Erikson wrote:
> Though, you don't even have to do that, actually. If fly1.hampshire.edu
> stops pinging but fly.hampshire.edu is still pinging, then just log into
> fly, do a "sudo ifdown eth2" and then "sudo ifup eth2" and that should
> fix it.
> -Josiah
>
>
> On 5/31/14 5:09 PM, Wm. Josiah Erikson wrote:
> > Yeah, you can reboot it and then execute
> > "sudo /etc/init.d/tractor-engine start" on the head node.
> > -Josiah
> >
> >
> > On 5/31/14 12:06 PM, Bassam Kurdali wrote:
> >> Is there something we can do to goose it while you're away like
> >> rebooting (if it goes down again)?
> >>
> >> "Wm. Josiah Erikson" <wjerikson at hampshire.edu> wrote:
> >>
> >>> This was that eth2 problem. I hope the kernel update I just did will
> >>> keep it from happening again... but the kernel parameter line I added
> >>> before was supposed to fix it even without the kernel update. Keep your
> >>> fingers crossed...
> >>> -Josiah
> >>>
> >>> On 5/30/14 5:06 PM, Bassam Kurdali wrote:
> >>>> Hmm, maybe fly is having problems as a result? Tractor just 'went
> >>>> away'. I can still SSH into fly, though. Is there anything I can do to
> >>>> restart tractor? Otherwise we are down for a week :(
> >>>> On Thu, 2014-05-29 at 23:05 -0400, Wm. Josiah Erikson wrote:
> >>>>> I am ferreting out any weak nodes by running dnetc on anything that
> >>>>> isn't doing anything else. My goal is to get rid of/fix any nodes that
> >>>>> are not reliable. Please let me know if you feel like this is causing
> >>>>> any problems for you. I will stop this tomorrow morning - it's just
> >>>>> overnight, as I'm going on vacation for a week starting Saturday. My
> >>>>> tentative belief at this point is that all of the nodes are fully
> >>>>> functional and reliable.... testing that belief :)
> >>>>>
> >>>> _______________________________________________
> >>>> Clusterusers mailing list
> >>>> Clusterusers at lists.hampshire.edu
> >>>> https://lists.hampshire.edu/mailman/listinfo/clusterusers
> >>> --
> >>> -----
> >>> Wm. Josiah Erikson
> >>> Head, Systems and Networking
> >>> Hampshire College
> >>> Amherst, MA 01002
> >>>
> >
>
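
For reference, the recovery steps quoted above can be sketched as a small
shell helper. This is only a sketch under assumptions: the function names
(host_up, state, decide_action, advise) are made up for illustration, the
ping flags assume Linux iputils, and treating "both hosts down" as the
reboot case is a reading of the thread rather than something stated
outright. It only prints advice; the sudo commands themselves still need an
account with sudo rights on fly.

```shell
#!/bin/sh
# Sketch only. Host names and the suggested commands come from the thread;
# every function name here is hypothetical.

# host_up HOST: succeed if HOST answers one ping within 2 seconds.
# Assumes Linux iputils ping, where -W is the reply timeout in seconds.
host_up() {
    ping -c 1 -W 2 "$1" >/dev/null 2>&1
}

# state HOST: print "up" or "down" depending on the ping result.
state() {
    if host_up "$1"; then echo up; else echo down; fi
}

# decide_action FLY_STATE FLY1_STATE: each argument is "up" or "down".
# Prints a short label for the step the thread suggests.
decide_action() {
    case "$1:$2" in
        up:up)     echo "none" ;;
        up:down)   echo "bounce-eth2" ;;               # fly pings, fly1 does not
        down:down) echo "reboot-then-start-tractor" ;; # assumption: reboot case
        *)         echo "unknown" ;;                   # not covered in the thread
    esac
}

# advise: ping both names and print the suggested next step.
# It never runs sudo itself.
advise() {
    action=$(decide_action "$(state fly.hampshire.edu)" "$(state fly1.hampshire.edu)")
    case "$action" in
        none)
            echo "Both names answer ping; nothing to do." ;;
        bounce-eth2)
            echo "Log into fly and run: sudo ifdown eth2 && sudo ifup eth2" ;;
        reboot-then-start-tractor)
            echo "Reboot the head node, then run: sudo /etc/init.d/tractor-engine start" ;;
        *)
            echo "Situation not covered by the thread; check by hand." ;;
    esac
}
```

Calling advise from any machine that can reach the cluster prints the
suggested next step; decide_action can also be called directly with "up" or
"down" arguments to see the mapping without touching the network.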