[Clusterusers] compute-1-18 and 1-3

Wm. Josiah Erikson wjerikson at hampshire.edu
Tue Jan 31 11:10:15 EST 2017


These two nodes weren't running tractor: they had rebooted themselves
and reinstalled. Compute-1-18 shows an uptime of 5 days, 11:39 and
compute-1-3 an uptime of 1 day, 5:18. I'm not sure why. Neither has
anything in the logs, and neither shows anything particularly suspicious
in ganglia. I have restarted tractor and noted this occurrence to see
whether it's a pattern or random. Jobs do sometimes trigger reboots when
they use up all the RAM and invoke the oom-killer, but that usually
leaves something in the logs.
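
For anyone who wants to check a node's logs for oom-killer activity
themselves, here is a minimal sketch in Python. It only matches the
message text the Linux kernel normally writes around an OOM kill; the
function name and the log path in the usage note are assumptions for
illustration, not something taken from these nodes.

```python
import re

# Phrases the Linux kernel typically logs around an OOM kill:
#   "<proc> invoked oom-killer: ..."
#   "Out of memory: Kill process <pid> (<name>) ..."   (older kernels)
#   "Out of memory: Killed process <pid> (<name>) ..." (newer kernels)
OOM_PATTERN = re.compile(
    r"invoked oom-killer|Out of memory: Kill(?:ed)? process",
    re.IGNORECASE,
)


def find_oom_lines(lines):
    """Return the lines (trailing newline stripped) that look like OOM-killer output.

    `lines` is any iterable of strings, e.g. an open log file.
    The name find_oom_lines is hypothetical, chosen for this sketch.
    """
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]
```

To use it, pass an open log file, for example
find_oom_lines(open("/var/log/messages", errors="replace")), assuming
that is where the kernel messages end up on the node. An empty result
only means no matching lines survived in that file, not that the
oom-killer never ran.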


-- 
Wm. Josiah Erikson
Assistant Director of IT, Infrastructure Group
System Administrator, School of CS
Hampshire College
Amherst, MA 01002
(413) 559-6091


