SUMMARY: Abnormal shut down cause

From: Patricio Mora (pmora@cgob.junta-andalucia.es)
Date: Tue Nov 28 2000 - 04:22:54 CST


I didn't think a summary was needed until I received a similar question. Even
though my post drew only a few answers, they may be of interest to
someone...

In short: most likely a sudden panic, watchdog reset, or similar, with no logs or core.

Thanks to Doug Winter, Eric van de Meerakker, Rene Occelli, Matthew
Hannigan, David Mitchell, Reggie Stuart, David M. Spreitzer, Jim Taylor,
Mike Moehlman, & Annie Austin

------------------------
Original question
------------------------

This is an urgent 'internal security' issue: can I find out whether a
Sun UE 3000 has been powered off in an abnormal way?

We suspect someone has pulled out the cables or turned off the power key, which
is on and enabled. I've searched all the logs but haven't found the cause of the
reboot; the usual boot messages simply start without warning.

All inet network services are disabled except for rpc.metad and rpc.metamhd,
which are used by the SDS metatool. The only remote access is through OpenSSH.

Is there any place where I can find traces of what really happened?

Additional info.
I've found that /var/adm/messages is dated 'Nov 15 07:38' while
the last line says: Nov 15 07:32:04 machinename last message repeated 1 time

Is this 6-minute difference normal?

Anyway, I don't consider the user suspected of shutting down the
machine to be experienced enough to delete lastlog, wtmp, and
messages entries, or to find and change the dates of modified files.

The first answers suggest that this will turn out to be a spontaneous reboot after all.

--------------------------------------------------------
Answers (nothing conclusive was found)
--------------------------------------------------------

----------(Doug Winter)----------

$ last reboot
$ uptime

if they don't match, then it has rebooted without running the rc finish
scripts (which write to wtmpx).

> (they matched)

----------(Eric van de Meerakker)----------

For what it's worth: if the cable was pulled or the power key turned off, you
likely won't find much. Security without regard for physical security doesn't go
very far!

However, if it was a system crash, you may try to run 'savecore' manually, even
if you haven't enabled it in the startup scripts (/etc/init.d/sysetup). Depending
on how heavily the swap space is used, the savecore command may still find an
intact system core dump and be able to rescue it. From that dump you may just be
able to find out whether the system crashed and, if so, why....

> savecore enabled now, didn't try hot savecore. (Physical security here depends
> on things far away from real needs)
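
A minimal sketch of the manual check, assuming savecore's usual unix.N /
vmcore.N file naming and the /var/crash/`uname -n` directory used in the
commented-out sysetup lines. The helper name has_crash_dump is hypothetical.

```shell
#!/bin/sh
# Report whether a directory holds a saved kernel dump.
# Assumes savecore's usual vmcore.N naming; the directory argument is
# whatever directory was passed to savecore.
has_crash_dump() {
    dir=$1
    for f in "$dir"/vmcore.*; do
        # If the glob matched nothing, $f is the literal pattern and
        # the -f test fails.
        if [ -f "$f" ]; then
            return 0
        fi
    done
    return 1
}

# Hypothetical usage on the affected host (as root):
#   savecore /var/crash/`uname -n`
#   has_crash_dump /var/crash/`uname -n` && echo "dump found"
```

The sooner after the reboot this is tried the better, since swap activity can
overwrite the dump.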

----------(Rene Occelli, David Spreitzer, Mike Moehlman, Darren Dunham)----------

Yup. When the power goes, you're not going to log that fact. You'll
also lose core, so no coredump. In fact, a reboot with no logs is
almost an indication of poweroff/powerloss.

> I don't know how power faults are logged, but prtdiag shows them. It wasn't
> a general power loss (we have a UPS and the other machines didn't notice
> anything strange), which is why I suspected someone had pulled the cable.

Some people have reported sudden crashes and reboots without any message,
but this only concerns some CPUs with a large cache (400 MHz).

> Not my case

Try prtdiag -v

You can see the most recent AC power failure. So, in your case, you can
check whether a cable was disconnected.

> prtdiag shows no AC failures
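
To pull just the power-failure entry out of the prtdiag report, something like
the following may help. It is only a sketch: it assumes the report contains a
heading with the words "AC Power Failure" followed by a rule of "=" characters
and then the timestamp, which may not hold on every platform or release. The
function name ac_power_failure is hypothetical.

```shell
#!/bin/sh
# Read `prtdiag -v` output on stdin and print the first non-blank line
# that follows the "AC Power Failure" heading, skipping "====" rules.
# Prints nothing if the heading is absent. The heading text is an
# assumption about the report layout.
ac_power_failure() {
    awk '
        /AC Power Failure/   { found = 1; next }
        found && /^[= \t]*$/ { next }          # underline or blank line
        found                { print; exit }
    '
}

# Hypothetical usage:
#   prtdiag -v | ac_power_failure
```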

I know (I've read) that messages are buffered before being written to the
messages file. This is why one can see the famous message:
Nov 15 07:32:04 machinename last message repeated 1 time

So I think the date difference is due to that.

> Mike Moehlman provided an rc script to look for and log both kinds of reboot
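
That script is not reproduced in this summary. As an illustration only (this
is not Mike's script), one common way to tell the two kinds of reboot apart is
a marker file: a shutdown (K) script writes it, and a startup (S) script
checks for it. The marker path and function names below are hypothetical.

```shell
#!/bin/sh
# Hypothetical marker path; it should live on a local filesystem that
# is already mounted when the startup script runs.
MARKER=${MARKER:-/var/adm/clean_shutdown}

# Call from a shutdown (K) script: record an orderly shutdown.
mark_clean_shutdown() {
    date > "$MARKER"
}

# Call from a startup (S) script: classify the previous shutdown, then
# remove the marker so the next boot starts from a known state.
check_last_shutdown() {
    if [ -f "$MARKER" ]; then
        rm -f "$MARKER"
        echo "clean"
    else
        echo "abnormal"
    fi
}

# Hypothetical use in the startup script, once syslogd is running:
#   logger -p daemon.notice "previous shutdown: `check_last_shutdown`"
```

A missing marker only says the shutdown scripts did not run; it cannot
distinguish a pulled cable from a panic or a watchdog reset.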

----------(David Mitchell)----------

If a machine appears to suddenly reboot with no logs or messages indicating
any sort of shutdown, it's most likely losing its power one way or another.

You can tell how many times the machine has powered up from scratch
as follows:

server# eeprom | grep power-cycles
#power-cycles=58

if this keeps going up, someone or something is switching it off or
you've got a power supply problem.

> power-cycles showed a ~10 digit number, so I discarded its value
> without further investigation
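
To watch the counter over time, its value can be extracted and compared with
one saved earlier. A rough sketch, assuming the "power-cycles=N" line format
shown above (with or without the leading "#"); the function names and the
saved-value file are hypothetical.

```shell
#!/bin/sh
# Extract the power-cycles counter from `eeprom` output on stdin.
# Tolerates the leading "#" seen in the listing above.
power_cycles() {
    sed -n 's/^#*power-cycles=\([0-9][0-9]*\).*$/\1/p'
}

# Compare a previously saved counter ($1) with the current one ($2).
# Prints "increased" or "unchanged"; both arguments are plain integers.
power_cycles_changed() {
    saved=$1
    current=$2
    if [ "$current" -gt "$saved" ]; then
        echo increased
    else
        echo unchanged
    fi
}

# Hypothetical usage (the file name is made up):
#   now=`eeprom | power_cycles`
#   power_cycles_changed "`cat /var/adm/power-cycles.last`" "$now"
#   echo "$now" > /var/adm/power-cycles.last
```

Even an implausibly large value may still be useful for this purpose, since
only the change between two readings matters.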

----------(Stuart, Reggie)----------

You can look in /var/crash and see if a core file was written at reboot. I
don't know, but I suspect your time stamp discrepancy would be because
syslogd opened the file to write a message just as the lights went out.

What services did this machine provide (rhetorical, no response expected)?
Are there any other machines (nfs clients, nis+ replicas, etc.) that might have
log messages regarding this machine?

> Only NFS client and Networker client, nothing reported at that hour

----------(Jim Taylor)----------

In one case where a machine of mine suffered "spontaneous
reboots" with no logs to help me it was due to a watchdog
reset (and I don't mean Netscape's watchdog). It may be
a low probability, but if you want to check it, you can
set an obp variable to slightly change things...

There is an OBP variable specific to auto reboots after
watchdog resets. If you want to investigate it this way,
change it to false (stays at ok> prompt) if it isn't already
and IF you can afford the time the box is down until you
manually boot it.

Just one thing to check...not necessarily the right one.

Also check your crash dump area if you haven't already.
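
Jim does not name the variable; the sketch below assumes he means the OBP
setting watchdog-reboot?, which can be read and changed from a running system
with eeprom. The helper name watchdog_reboot_setting is hypothetical.

```shell
#!/bin/sh
# Read `eeprom` output on stdin and print the value of watchdog-reboot?
# (assumed here to be the variable meant above). In a basic regular
# expression the "?" is an ordinary character.
watchdog_reboot_setting() {
    sed -n 's/^watchdog-reboot?=\(.*\)$/\1/p'
}

# Hypothetical usage on the host (as root); quote the name so the
# shell does not treat "?" as a wildcard:
#   eeprom | watchdog_reboot_setting
#   eeprom 'watchdog-reboot?=false'
```

With the value set to false, the box should stay at the ok prompt after a
watchdog reset, which makes the event visible at the cost of downtime until
someone boots it by hand.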

----------(Annie Austin)----------

Just to note. I had a machine go down on me last week and the logs had
nothing in them about the method that was used to take it down.

The box has a VT100 terminal on it, which if turned off sends a break,
putting the box at the OK prompt. We suspected this had happened.



