Problem Description:
Kernel panics on Linux are hard to identify and troubleshoot. Troubleshooting kernel panics often requires
reproducing a situation that occurs rarely and collecting data that is difficult to gather.
Solution Summary:
This document outlines several techniques that will help reduce the amount of time necessary to troubleshoot a
kernel panic.
Technical Discussion:
What is a kernel panic?
As the name implies, the Linux kernel gets into a situation where it doesn’t know what to do next. When this
happens, the kernel prints as much information as it can about the problem. How much it can report depends on
what caused the panic.
Since hard panics and soft panics are different in nature, we will discuss how to deal with each separately.
1. /var/log/messages — sometimes the entire kernel panic stack trace will be logged there
2. Application / Library logs (RTF, cheetah, etc.) – may show what was happening before the panic
3. Other information about what happened just prior to the panic, or how to reproduce
4. Screen dump from console. Since the OS is locked, you cannot cut and paste from the screen. There are
two common ways to get this info:
Digital Picture of screen (preferred, since it’s quicker and easier)
Copying screen with pen and paper or typing to another computer
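As a rough sketch of item 1, the shell function below prints the lines surrounding any panic message in a log file. The helper name panic_context and the 40-line and 5-line windows are assumptions chosen for illustration; adjust them to the length of your trace.

```shell
# Sketch: show the lines around each "Kernel panic" or "Oops" entry in a log file.
# panic_context is a hypothetical helper name; 40 and 5 are arbitrary window sizes.
panic_context() {
    log_file=${1:-/var/log/messages}
    # -n prints line numbers; -B and -A print context before and after each match.
    grep -n -B 40 -A 5 -E 'Kernel panic|Oops' "$log_file"
}

# Possible usage (reading /var/log/messages normally requires root):
#   panic_context /var/log/messages > panic-trace.txt
```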
If the dump is not available either in /var/log/messages or on the screen, follow these tips to get a dump:
1. If in GUI mode, switch to full console mode – no dump info is passed to the GUI (not even to GUI shell).
2. Make sure screen stays on during full test run – if a screen saver kicks in, the screen won’t return after a
kernel panic. Use these settings to ensure the screen stays on.
setterm -blank 0
setterm -powerdown 0
setvesablank off
3. From console, copy dump from screen (see above).
If the culprit is a Dialogic driver you will see a module name with:
streams-xxxxDriver (xxxx = dlgn, dvbm, mercd, etc.)
Hard panic – partial trace example (note there is no line with EIP information)
[] ip_rcv [kernel] 0x357
[] sramintr [streams_dlgnDriver] 0x32d
[] lis_spin_lock_irqsave_fcn [streams] 0x7d
[] inthw_lock [streams_dlgnDriver] 0x1c
[] pwswtbl [streams_dlgnDriver] 0x0
[] dlgnintr [streams_dlgnDriver] 0x4b
[] Gn_Maxpm [streams_dlgnDriver] 0x7ae
[] __run_timers [kernel] 0xd1
[] handle_IRQ_event [kernel] 0x5e
[] do_IRQ [kernel] 0xa4
[] default_idle [kernel] 0x0
[] default_idle [kernel] 0x0
[] call_do_IRQ [kernel] 0x5
[] default_idle [kernel] 0x0
[] default_idle [kernel] 0x0
[] default_idle [kernel] 0x2d
[] cpu_idle [kernel] 0x2d
[] __call_console_drivers [kernel] 0x4b
[] call_console_drivers [kernel] 0xeb
Code: 8b 50 0c 85 d2 74 31 f6 42 0a 02 74 04 89 44 24 08 31 f6 0f
<0> Kernel panic: Aiee, killing interrupt handler!
In interrupt handler – not syncing
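To spot which driver modules appear in a trace like the one above, a small filter along these lines may help. It is only a sketch: trace_modules is a hypothetical helper name, and it assumes module names appear in square brackets exactly as in the example trace.

```shell
# Sketch: list the non-kernel modules named in a stack trace read from stdin.
# trace_modules is a hypothetical helper name.
trace_modules() {
    # Module names appear in square brackets, e.g. [streams_dlgnDriver].
    # Empty [] fields and address fields such as [<c0123456>] do not match.
    grep -oE '\[[A-Za-z_][A-Za-z0-9_-]*\]' \
        | sed 's/^\[//; s/\]$//' \
        | LC_ALL=C sort -u \
        | grep -v '^kernel$'
}

# Possible usage, with the trace saved in a file of your choosing:
#   trace_modules < panic-trace.txt
```

Any module printed by this filter was on the call path at the time of the panic, which makes it a candidate for closer inspection rather than a confirmed culprit.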
…wordpress.com/…/linux-kernel-panic-… 3/11
6/22/2010 Linux “Kernel Panic” — Prevent Cardiac…
Hard panics – using kernel debugger (KDB)
If only a partial trace is available and the supporting information is not sufficient to isolate root cause, it may be
useful to use KDB. KDB is a tool that is compiled into the kernel that causes the kernel to break into a shell
rather than lock up when a panic occurs. This enables you to collect additional information about the panic,
which is often useful in determining root cause.
1. If this is a potential Dialogic issue, technical support should be contacted prior to the use of KDB
2. Must use base kernel – i.e. 2.4.18 kernel instead of 2.4.18-5 from RedHat. This is because KDB is only
available for the base kernels, and not the builds created by RedHat. While this does create a slight
deviation from the original configuration, it usually does not interfere with root cause analysis.
3. Need different Dialogic drivers compiled to handle the specific kernel.
1. Create new file from text of stack trace found in /var/log/messages. Make sure to strip off timestamps,
otherwise ksymoops will fail.
2. Run ksymoops on new stack trace file:
Generic: ksymoops -o [location of Dialogic drivers] filename
Example: ksymoops -o /lib/modules/2.4.18-5/misc ksymoops.log
All other defaults should work fine
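Step 1 can be sketched as a small filter. It assumes the classic syslog line layout ("Mon DD HH:MM:SS host kernel: message"); if your syslog daemon uses a different format, the pattern needs adjusting. The name strip_syslog_prefix is a hypothetical helper, not part of ksymoops.

```shell
# Sketch: strip the syslog prefix ("Mon DD HH:MM:SS host kernel: ") from each
# line on stdin. Lines without that prefix pass through unchanged.
strip_syslog_prefix() {
    sed -E 's/^[A-Z][a-z]{2} +[0-9]{1,2} [0-9]{2}:[0-9]{2}:[0-9]{2} [^ ]+ kernel: ?//'
}

# Possible usage, producing the file named in the example above:
#   grep 'kernel:' /var/log/messages | strip_syslog_prefix > ksymoops.log
```

Check the resulting file by eye before running ksymoops, since unrelated kernel messages from the same log will also be included.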
####################################################################
So you’re trying to start Linux for the first time and … wham! You get messages like:
(1) The first part of the system that starts running is the “boot loader,” usually grub. This is the program that
loads Linux, and/or Windows if you so desire. (The “master boot record,” or MBR, enables the computer to
load grub.)
(2) The first thing that Grub needs to know is … “where is the kernel?” It gets this from the /boot/grub/grub.conf
file. The way that you specify the correct drive and partition in Grub, like “(hd0,0)”, is a little different from what
you use in ordinary Linux. The kernel will be in some file named “vmlinuz-…”
(3) Once Grub has loaded the kernel into memory, the first thing that the kernel needs to know is, “where is the
root filesystem?” The root= parameter is passed to the kernel to provide this information. Notice that now you
are talking “to Linux,” and you identify devices “in Linux’s terms,” like “/dev/hda2”.
(4) Given this information, Linux is going to try to mount the root filesystem … prepare it for use. The most
common mistake at this point is that you’ve specified the wrong device in step #3. Unfortunately, the message
that results is rather nasty looking…
When Linux doesn’t know how to proceed, as in this case, it says “kernel panic” and it stops. But, even then, it
tries to go down gracefully. Where it is safe to do so, it tries to write anything to disk that hasn’t been written out
(an operation called “syncing”). When it cannot safely do that, for example inside an interrupt handler, it says
“not syncing.” What’s totally misleading about this message combination is that it implies, incorrectly, that the
reason for the panic is “not syncing,” when actually the reason for the panic will be found in the preceding few lines.
You might see the message, “tried to kill ‘init’.” That really means that a program called init died… which it is
not allowed to ever do. init is a very special program in Linux… the first program created when the machine
starts.
So, basically, when you get these messages on startup … the situation is really a lot more dreadful looking than it
actually is. You have probably just made a “tpyo” when entering the information in grub.conf.
(Another common place to make a typo is in /etc/fstab, which tells Linux where all the other drives are.)
So what do you do? If you’re doing a first-time install you can just start over. Otherwise, you need to boot a
separate CD-ROM, which will give you a stand-alone Linux installation from which you can edit the offending
files.
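Once you have booted from the rescue CD and mounted the installed system, the two files mentioned above can be checked quickly. The following is a minimal sketch, and the helper names grub_root_params and fstab_missing_devices are hypothetical. Note one assumption in particular: device names seen from the rescue environment may differ from those the installed system uses, so a device reported as missing is a hint to look closer, not proof of a typo.

```shell
# Sketch: print the value of each root= parameter found on the kernel lines
# of a grub.conf-style file.
grub_root_params() {
    grep -E '^[[:space:]]*kernel[[:space:]]' "$1" \
        | tr ' \t' '\n\n' \
        | sed -n 's/^root=//p'
}

# Sketch: print every /dev/... entry in the first column of an fstab-style
# file that does not exist on the running system. Commented lines are skipped
# because their first field is "#", not a /dev path.
fstab_missing_devices() {
    awk '$1 ~ /^\/dev\// { print $1 }' "$1" | while read -r dev; do
        [ -e "$dev" ] || echo "$dev"
    done
}

# Possible usage, assuming the installed system is mounted at /mnt/sysimage
# (the mount point is an assumption and varies between rescue CDs):
#   grub_root_params /mnt/sysimage/boot/grub/grub.conf
#   fstab_missing_devices /mnt/sysimage/etc/fstab
```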
When the kernel gets into a situation where it does not know how to proceed (most often during booting, but at
other times), it issues a kernel panic by calling the panic(msg) routine defined in kernel/panic.c. (Good name,
huh?) This is a call from which No One Ever Returns.
The panic() routine adds text to the front of the message, telling you more about what the system was actually
doing when the panic occurred … basically how big and bad the trail of debris in the filesystem is likely to be.
This is where the “not syncing” part comes from: it tells you that buffered data was not flushed before the kernel
went down. (When it is safe to do so, panic() in the 2.4 kernels does try to issue a sync() system-call to push all
buffered data out to the hard-disks.)
The second part of the message is what was provided by the original call to panic(). For example, we find
panic(“Tried to kill init!”) in kernel/exit.c.
So, what does this actually mean? Well, in this case it really doesn’t mean that someone tried to kill the magical
init process (process #1…), but simply that it tried to die. This process is not allowed to die or to be killed.
When you see this message, it’s almost always at boot-time, and the real messages … the cause of the actual
failure … will be found in the startup messages immediately preceding this one. This is often the case with
kernel-panics. init encountered something “really bad,” and it didn’t know what to do, so it died, so the kernel
died too.
BTW, the kernel-panic code is rather cute. It can blink lights and beep the system-speaker in Morse code. It can
reboot the system automagically. Obviously the people who wrote this stuff encountered it a lot…
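The automatic reboot is controlled by a timeout that the kernel exposes in /proc/sys/kernel/panic. The sketch below only reads and describes that value. The helper name describe_panic_timeout is hypothetical, and the optional file argument exists purely so the function can be tried against a test file.

```shell
# Sketch: describe the kernel's panic timeout setting.
# A positive value is the number of seconds the kernel waits before rebooting
# after a panic; 0 means it does not reboot automatically.
describe_panic_timeout() {
    timeout_file=${1:-/proc/sys/kernel/panic}
    secs=$(cat "$timeout_file") || return 1
    if [ "$secs" -gt 0 ]; then
        echo "automatic reboot $secs seconds after a panic"
    elif [ "$secs" -eq 0 ]; then
        echo "no automatic reboot after a panic"
    else
        echo "panic timeout is $secs"
    fi
}

# To change the setting (as root), write a number of seconds to the file:
#   echo 10 > /proc/sys/kernel/panic
# or pass panic=10 on the kernel command line.
```

Leave the timeout at 0 while you are trying to capture a dump from the console, otherwise the screen contents are lost on reboot.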
In diagnosing, or at least understanding, kernel-panics, I find it extremely helpful to have on-hand a copy of the
Linux source-code, which is usually stored someplace like /usr/src/linux-2.x. You can use the grep utility to
locate the actual code which caused the panic to occur.
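A minimal sketch of that grep search follows. The helper name find_panic_source is hypothetical, and the source tree path must be supplied by you since its location differs between systems.

```shell
# Sketch: find the C source lines that contain a given panic message.
#   $1 = text of the panic message (or a distinctive part of it)
#   $2 = top of the kernel source tree
find_panic_source() {
    message=$1
    source_tree=$2
    # -r recurses, -n shows line numbers, -F treats the message as a fixed
    # string so characters like "!" or "(" need no escaping.
    grep -rnF --include='*.c' -- "$message" "$source_tree"
}

# Possible usage (substitute the path of your own source tree):
#   find_panic_source 'kill init' /path/to/kernel/source
```

Searching for a short, distinctive fragment of the message tends to work better than the full line, because the kernel often builds the final text from a format string.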
19 comments
1.
2.
Wonderful!!!
3.
mind blowing!!!!!!!
4.
Thanks a ton !!
This really helped.
6.
i get the same message kernel panic -not syncing trying to kill init ..
i got from ur explaination tht y this prob arises .. but i dint got the soln how to proceed further…
7.
There can be lot of reasons for this. There is no straight forward solution. can you send a screenshot of the
error. Are you sure that mount points in /etc/fstab are assigned properly? what is your OS ?
When did you error occur? Did you try a kernel upgrade ? Can you post your grub.conf or lilo.conf?
8.
Is there something I could put in the grub / kernel load line that records everything from absolute start?
All in all, thank you for the article, given me a few insights to think about.
Cheers.
9.
Dear sir
10.
Great writeup! I’m trying to debug a random kernel panic right now and this gave me just what I needed:
where to start.
Thanks a lot!
11.
Great post, very useful in helping me understand a panic I am having with lpfc cards …
12.
Great One!
Thanks.
13.
14.
hello. i read your article and think its great! i actually have a kernel panic right now and cannot for the life
of me figure out how to fix it. i cant do a new install because all my files are on the partition and i dont
want to lose them. the error im getting is that it says “kernel panic – not syncing – attempted to kill init”. i
am running jaunty 9.04 and the 2.6.28-15 kernel. my grub conf is not in the /boot/grub folder, nor is
lilo.conf in the /etc folder. you mentioned upgrading my kernel. how can i do that? im running off my livecd
now so can install a kernel onto my harddrive through the livecd?
thanks!
Any changes you recently made? what does grub.conf show? what does /etc/fstab show?
what does fdisk -l ( small L ) show ?
It would be tricky, if not difficult to compile kernel off a live CD, I am not sure, if that is possible. It
may be possible, if you mount the partitions properly ( /boot , /etc, /usr , /var , /tmp atleast off the
old drive ) and compile, but again, I have never tested it physically.
i did a few updates that ubuntu prompted me to do. it said several times that i could only do
a partial upgrade because something was missing (not quite sure what). other than that, no
changes. i cant find grub.conf on the disk at all, but here is a link to my fstab:
http://www.4shared.com/file/127384232/bb3dc519/fstab.html
i tried the link but when i entered find /grub/stage2 it said “error: file missing”. im really at a
loss as how to fix this and anything you can suggest would be so greatly appreciated.
oh, and when i did fdisk -l in the terminal nothing happened either. just went to the next line in
the terminal
15.
16. [...] That's interesting! I'd never heard of the 'soft panic' before. This site seems to have some sensible
troubleshooting suggestions though: http://rhcelinuxguide.wordpress.com/…ardiac-arrest/ [...]