How to Troubleshoot a Soft Kernel Panic
Soft panic symptoms:
1. Much less severe than a hard panic.
2. Usually results in a segmentation fault.
3. An oops message is logged; search /var/log/messages for the string "Oops".
4. The machine is still somewhat usable (but should be rebooted after information is collected).
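The log search in symptom 3 can be scripted. What follows is a minimal Python sketch, not part of the original procedure: the function name find_oops_lines is made up for this example, and it assumes the log is plain text with one message per line.

```python
def find_oops_lines(lines):
    """Return (line_number, text) pairs for every line that mentions 'Oops'.

    `lines` can be any iterable of strings, such as an open log file.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        if "Oops" in line:
            hits.append((number, line.rstrip("\n")))
    return hits

# Possible usage (reading /var/log/messages usually requires root):
#   with open("/var/log/messages", errors="replace") as log:
#       for number, text in find_oops_lines(log):
#           print(number, text)
```

The line numbers make it easier to go back and copy out the full stack trace that follows each oops.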
Soft panic causes:
- Almost anything that causes a module to crash when it is not within an interrupt handler can cause a soft panic.
- In this case, the driver itself will crash, but it will not cause catastrophic system failure since it was not locked in the interrupt handler.
- The same possible causes exist for soft panics as for hard panics (e.g. accessing a null pointer during runtime).
Soft panic information to collect:
- When a soft panic occurs, the kernel generates a dump that contains kernel symbols. This information is logged in /var/log/messages.
- To begin troubleshooting, use the ksymoops utility to turn kernel symbols into meaningful data.
To generate a ksymoops file:
- Create a new file from the text of the stack trace found in /var/log/messages. Make sure to strip off timestamps, otherwise ksymoops will fail.
- Run ksymoops on the new stack trace file:
  Generic: ksymoops -o [location of Dialogic drivers] filename
  Example: ksymoops -o /lib/modules/2.4.18-5/misc ksymoops.log
  All other defaults should work fine.
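Stripping the timestamps by hand is tedious for a long trace, so here is a small Python sketch of that step. It assumes the classic syslog layout in which each line starts with "Mon DD HH:MM:SS "; the names TIMESTAMP, strip_timestamp, and strip_timestamps are invented for this example. It removes only the timestamp; whether the hostname and "kernel:" tag also need to go is not stated above, so they are left alone.

```python
import re

# Assumed prefix format: month abbreviation, day, HH:MM:SS, then whitespace,
# for example "Jan  5 14:03:22 ".
TIMESTAMP = re.compile(r"^[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+")

def strip_timestamp(line):
    """Remove a leading syslog-style timestamp; other lines pass through."""
    return TIMESTAMP.sub("", line, count=1)

def strip_timestamps(lines):
    """Apply strip_timestamp to every line and return the results as a list."""
    return [strip_timestamp(line) for line in lines]

# Possible usage, writing the cleaned trace to the file name from the example:
#   with open("trace.txt") as src, open("ksymoops.log", "w") as dst:
#       dst.writelines(strip_timestamps(src))
```

Here trace.txt is a placeholder for wherever you saved the raw stack trace.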
For details, see the ksymoops man page.
So you are trying to start Linux for the first time, and what you get are messages like:
- Unable to mount root device.
- Kernel panic - not syncing.
What to do now?
(1) The first part of the system that starts running is the boot loader, usually Grub. This is the program that loads Linux, and/or Windows if you so desire. (The master boot record, or MBR, enables the computer to load Grub.)
(2) The first thing that Grub needs to know is: where is the kernel? It gets this from the /boot/grub/grub.conf file.
(3) The way that you specify the correct drive and partition in Grub, like (hd0,0), is a little different from what you use in ordinary Linux. The kernel will be in some file named vmlinuz.
(4) Once Grub has loaded the kernel into memory, the first thing that the kernel needs to know is: where is the root filesystem? The root= parameter is passed to the kernel to provide this information.
(5) Notice that now you are talking to Linux, and you identify devices in Linux terms, like /dev/hda2.
(6) Given this information, Linux is going to try to mount the root filesystem, that is, prepare it for use.
(7) The most common mistake at this point is that you have specified the wrong device in the root= parameter (step #4).
(8) Unfortunately, the message that results is rather nasty looking. When Linux doesn't know how to proceed, as in this case, it says "kernel panic" and it stops.
(9) But, even then, it tries to go down gracefully. It tries to write anything to disk that hasn't been written out (an operation called "syncing," for some darn-fool reason), and if it succeeds in doing so it will say "not syncing."
(10) What's totally misleading about this message combination is that it implies, incorrectly, that the reason for the panic is "not syncing," when actually the reason for the panic will be found in the preceding few lines.
(11) You might see the message "tried to kill init." That really means that a program called init died, which it is not allowed to ever do.
(12) init is a very special program in Linux: the first program created when the machine starts.
(13) So, basically, when you get these messages on startup, the situation is really a lot more dreadful looking than it actually is.
(14) You have probably just made a typo when entering the information in grub.conf. (Another common place to make a typo is in /etc/fstab, which tells Linux where all the other drives are.)
(15) So what do you do? If you are doing a first-time install, you can just start over. Otherwise, you need to boot a separate CD-ROM, which will give you a stand-alone Linux installation from which you can edit the offending files.
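To make the two naming schemes from steps 3 to 5 concrete, here is a small Python sketch. It relies on two things: Grub's (hdX,Y) notation counts both disks and partitions from 0, while Linux letters the disks from "a" and numbers partitions from 1. The helper names grub_to_linux and root_param are invented for this example, the "hd" prefix assumes IDE-style names like the /dev/hda2 mentioned above, and the sketch assumes the BIOS disk order matches the order Linux sees, which is not guaranteed.

```python
import re

def grub_to_linux(grub_dev, prefix="hd"):
    """Translate a Grub '(hdX,Y)' device into a Linux-style name.

    Assumes Grub disk X maps to the X-th Linux disk letter.
    """
    match = re.fullmatch(r"\(hd(\d+),(\d+)\)", grub_dev.strip())
    if match is None:
        raise ValueError("not a (hdX,Y) device: " + grub_dev)
    disk, part = int(match.group(1)), int(match.group(2))
    if disk > 25:
        raise ValueError("disk index too large for a single letter")
    # Grub counts partitions from 0, Linux from 1.
    return f"/dev/{prefix}{chr(ord('a') + disk)}{part + 1}"

def root_param(kernel_line):
    """Return the value of root= on a grub.conf kernel line, or None."""
    for token in kernel_line.split():
        if token.startswith("root="):
            return token[len("root="):]
    return None
```

For instance, grub_to_linux("(hd0,1)") gives "/dev/hda2". Comparing that against what root_param() finds on the kernel line is a quick way to spot the kind of typo described in step 14.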
Explained: kernel panic - not syncing - attempted to kill init
(16) When the kernel gets into a situation where it does not know how to proceed (most often during booting, but also at other times), it issues a kernel panic by calling the panic(msg) routine defined in kernel/panic.c. (Good name, huh?)
(17) This is a call from which No One Ever Returns. The panic() routine adds text to the front of the message, telling you more about what the system was actually doing when the panic occurred: basically, how big and bad the trail of debris in the filesystem is likely to be.
(18) This is where the "not syncing" part comes from, and when you see that, it's good. (panic() does try to issue a sync() system call to push all buffered data out to the hard disks before it goes down.)
(19) The second part of the message is what was provided by the original call to panic(). For example, we find panic("Tried to kill init") in kernel/exit.c.
(20) So, what does this actually mean? Well, in this case it really doesn't mean that someone tried to kill the magical init process (process #), but simply that it tried to die.
(21) This process is not allowed to die or to be killed. When you see this message, it's almost always at boot time, and the real messages (the cause of the actual failure) will be found in the startup messages immediately preceding this one.
(22) This is often the case with kernel panics.
(23) init encountered something really bad, and it didn't know what to do, so it died, so the kernel died too.
(24) BTW, the kernel-panic code is rather cute. It can blink lights and beep the system speaker in Morse code. It can reboot the system automatically. Obviously the people who wrote this stuff encountered it a lot.
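Since the useful information is the part of the message that came from the original panic() call, plus the lines just before it, a captured boot log can be picked apart with a few lines of Python. This is only a sketch: it assumes the message reads "Kernel panic - not syncing: <reason>", and the names PANIC_PREFIX, panic_reason, and lines_before_panic are invented for the example.

```python
# Assumed message layout: "Kernel panic - not syncing: <reason>".
PANIC_PREFIX = "Kernel panic - not syncing: "

def panic_reason(line):
    """Return the text that follows the panic prefix, or None if absent."""
    index = line.find(PANIC_PREFIX)
    if index < 0:
        return None
    return line[index + len(PANIC_PREFIX):].strip()

def lines_before_panic(lines, before=5):
    """Return (preceding_lines, panic_line) for the first panic in a list.

    `lines` must be a list of strings. Returns ([], None) when no line
    mentions a kernel panic.
    """
    for i, line in enumerate(lines):
        if "Kernel panic" in line:
            return lines[max(0, i - before):i], line
    return [], None
```

The preceding lines returned by lines_before_panic() are where the real cause of the failure is most likely to show up.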