Acknowledgements
The author would like to thank Guy Streeter, Wade Mealing, and Masahiro Matsuya for reviewing and suggesting improvements for several drafts of this article.
About the author
Eugene Teo is Technical Account Manager of Red Hat’s Asia Pacific Global Support Services. Eugene received his bachelor’s degree in Computing from the National University of Singapore. In his spare time, Eugene enjoys learning new things, auditing the Linux kernel source code, and contributing kernel fixes.
This entry was posted by The editorial team on Wednesday, August 15th, 2007 at 10:25 am and is filed under Red Hat Enterprise Linux, technical.
Setup Diskdump
Diskdump is one of the two different crash dump facilities that we shipped with Red Hat Enterprise Linux 3 and 4. This article will not cover Netdump or Kdump.
Before you begin setting up Diskdump, read /usr/share/doc/diskdumputils-version/README to make sure that your machine has a supported storage adapter.
Assign a disk device to dump memory. It may be:
Define the disk device in /etc/sysconfig/diskdump. In this example, we will use /dev/sda2:
# vi /etc/sysconfig/diskdump
add the line "DEVICE=/dev/sda2"
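If you prefer not to open an editor, the same change can be scripted. The following is only a minimal sketch: set_dump_device is a hypothetical helper (it is not part of diskdumputils), and it assumes the configuration file uses plain KEY=value lines as in the example above.

```shell
# Hypothetical helper: set DEVICE= in a diskdump configuration file.
# $1: configuration file, e.g. /etc/sysconfig/diskdump
# $2: dump device, e.g. /dev/sda2
set_dump_device() {
    conf="$1"
    dev="$2"
    touch "$conf" || return 1
    # Keep every line except an existing DEVICE= entry.
    # grep exits non-zero when nothing is left, which is fine here.
    grep -v '^DEVICE=' "$conf" > "$conf.tmp" || true
    # Append the new DEVICE= entry and replace the original file.
    echo "DEVICE=$dev" >> "$conf.tmp"
    mv "$conf.tmp" "$conf"
}
```

As root, you would call it as: set_dump_device /etc/sysconfig/diskdump /dev/sda2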
Load the kernel module:
# tail -f /var/log/messages &
# modprobe diskdump
Apr 30 21:29:20 kerndev kernel: disk_dump: Maximum block size: 16384
Apr 30 21:29:20 kerndev kernel: disk_dump: total blocks required: 261770
(header 3 + bitmap 8 + memory 261759)
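The total in the log message is simply the sum of the three parts in the parentheses: 3 + 8 + 261759 = 261770. A small sketch of that check follows; check_block_sum is a hypothetical helper written for this illustration, and it assumes the breakdown line contains only the three counts as numbers.

```shell
# Hypothetical helper: verify that the numbers in the breakdown line
# add up to the reported total.
# $1: reported total, $2: breakdown text copied from the log
check_block_sum() {
    total="$1"
    sum=0
    # Turn every run of non-digits into a space, then add up the numbers.
    for n in $(printf '%s' "$2" | tr -cs '0-9' ' '); do
        sum=$((sum + n))
    done
    [ "$sum" -eq "$total" ]
}

# Values copied from the disk_dump messages above.
check_block_sum 261770 "(header 3 + bitmap 8 + memory 261759)" \
    && echo "breakdown matches total"
```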
See /proc/diskdump after loading diskdump kernel module:
# cat /proc/diskdump
# sample_rate: 8
# block_order: 2
# fallback_on_err: 1
# allow_risky_dumps: 1
# dump_level: 0
# total_blocks: 261770
#
Format the diskdump device:
# service diskdump initialformat
/dev/sda2: [100.0%]
See /proc/diskdump after formatting:
# tail -n2 /proc/diskdump
sda2 102398310 10233405
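If you want to read these values from a script, something like the following may help. It is a sketch only: diskdump_setting and diskdump_devices are hypothetical helper names, and the parsing assumes the layout shown in the output above ("# name: value" lines for settings, and one line per registered device starting with the device name).

```shell
# Hypothetical helper: print the value of one setting.
# $1: setting name, $2: file to read (defaults to /proc/diskdump)
diskdump_setting() {
    sed -n "s/^# $1: //p" "${2:-/proc/diskdump}"
}

# Hypothetical helper: print the name of each registered dump device,
# that is, the first field of every non-comment, non-empty line.
# $1: file to read (defaults to /proc/diskdump)
diskdump_devices() {
    awk '!/^#/ && NF { print $1 }' "${1:-/proc/diskdump}"
}
```

For example, diskdump_setting total_blocks prints the total_blocks value, and diskdump_devices prints nothing until a device has been formatted and registered.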
Enable Diskdump service:
# chkconfig diskdump on
# service diskdump start
Starting diskdump: [ OK ]
# Apr 30 21:31:19 kerndev diskdump: activating succeeded
Test that Diskdump works. The following commands will crash your machine:
# echo 1 > /proc/sys/kernel/sysrq
# echo c > /proc/sysrq-trigger
Make sure that you run the above two commands from a text console (press Ctrl + Alt + F1), so that you can see what happens when your system crashes. You need to do this to get a vmcore file for following the rest of this article. It will be located in /var/crash.
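After the machine has rebooted, you can look for the dump from the shell. This is a minimal sketch: find_vmcores is a hypothetical helper, and it assumes only that the file is named vmcore somewhere below /var/crash, without relying on any particular subdirectory layout.

```shell
# Hypothetical helper: list every file named vmcore below a crash directory.
# $1: directory to search (defaults to /var/crash)
find_vmcores() {
    find "${1:-/var/crash}" -type f -name vmcore 2>/dev/null
}
```

Running find_vmcores with no arguments searches /var/crash; if it prints nothing, no dump was saved.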
15 responses to “A quick overview of Linux kernel crash dump analysis”
1. Basil Gohar says:
August 15th, 2007 at 12:27 pm
The link “ftp://ftp.redhat.com/pub” is formatted very strangely. The quotes for the tag’s href attribute appear to be “smart quotes” style or something other than a standard quote.
2. Gabriel Donnell says:
August 24th, 2007 at 10:44 pm
The URLs for the Red Hat Knowledge Base are bad. The URL and text before and after the FAQ URL is the problem.
How do I configure a Netdump Server and a Netdump Client?
- Bad: http://www.redhatmagazine.com/2007/08/15/a-quick-overview-of-linux-kernel-crash-dump-analysis/%E2%80%9Dhttp://kbase.redhat.com/faq/FAQ_43_2467.shtm%E2%80%9D
- Good: http://kbase.redhat.com/faq/FAQ_43_2467.shtm
How do I configure kexec/kdump on Red Hat Enterprise Linux 5?
- Bad: http://www.redhatmagazine.com/2007/08/15/a-quick-overview-of-linux-kernel-crash-dump-analysis/%E2%80%9Dhttp://kbase.redhat.com/faq/FAQ_105_9036.shtm%E2%80%9D
- Good: http://kbase.redhat.com/faq/FAQ_105_9036.shtm
3. Eugene says:
August 25th, 2007 at 4:51 am
Thanks guys. I have informed the editor about the formatting problems.
4. Ashish Barot says:
September 4th, 2007 at 10:23 am
It is a good article.
Can anyone post a few more details on crash dumps on different hardware platforms,
so that we can learn about more errors?
Thanks.
Ashish Barot.
5. Safir says:
September 20th, 2007 at 8:40 am
Hi,
Not very usable because of the formatting.
It would be better to provide also a pdf version.
Thanks
6. Eugene says:
September 27th, 2007 at 8:02 am
Ashish, thanks for the suggestion.
Safir, I will provide a pdf version, and post the link here soon. Stay tuned.
7. Philip Pokorny says:
October 9th, 2007 at 6:30 pm
Where can we get the kernel-debuginfo RPM’s for the production Red Hat kernels? YUM, UP2DATE and RHN don’t seem to have them available.
Do I need to rebuild the kernel from the SRPM?
8. Eugene says:
October 9th, 2007 at 6:34 pm
Hi Philip,
You can download the packages at ftp://ftp.redhat.com/pub. No you do not need to rebuild the kernel from the source RPM.
Thanks.
9. Eugene says:
October 9th, 2007 at 6:57 pm
Hi Philip,
If you are using yum, take a look at the .repo files in /etc/yum.repos.d/. For example, in Fedora 7, you can change enabled=0 to enabled=1 under [fedora-debuginfo] in the /etc/yum.repos.d/fedora.repo file. Once you have done that, type “yum clean all” on the command line, and then use yum -y install to install the kernel-debuginfo RPM you need.
Hope this info helps.
Eugene
10. Mag says:
October 17th, 2008 at 2:28 pm
This is nice but do you have something a little bit more recent? It seems “printing eip:” has been deprecated in Redhat 5.2. Unless I am doing something wrong.
TIA
11. Red Hat Magazine | Linux kernel crash dump analysis overview says:
November 16th, 2008 at 11:41 pm
[...] A quick overview of Linux kernel crash dump analysis [...]
12. Eugene says:
November 18th, 2008 at 6:58 pm
Thanks Mag. I will try to update the article soon. Will keep you posted.
13. Manish Sarori says:
November 25th, 2008 at 10:20 pm
In the calltrace, does anyone know what the numbers after the + mean (see below)? For example, after __handle_sysrq is 0x58/0xc6. Is this some sort of offset into the function where the call was made? I need to pinpoint an exception in a lengthy function. Thanks in advance.
Call Trace:
[] __handle_sysrq+0x58/0xc6
[] write_sysrq_trigger+0x23/0x29
14. Kaiwan B says:
December 3rd, 2008 at 9:50 pm
Manish, reg:
+x/y
x represents the approx offset into the function .
y represents the approx total length of that function. Actually, y is the distance to the next global symbol. Therefore, these are approximations and not exact – but they do give you a “good enough” idea of where the faulting code lies. Typically, you could now try using objdump to disassemble and look at the offsets that show up, matching to the closest ‘x’ offset above.
15. Manish Sarori says:
December 16th, 2008 at 6:31 pm
Thanks Kaiwan!
Linux-Kernel-Crash-Dump-Analysis.pdf
First comment: http://people.redhat.com/anderson/.crash_whitepaper/