[lug] Linux kernel crash course

David Ahern dsahern at gmail.com
Fri Jul 30 07:18:38 MDT 2010



On 07/29/10 18:08, David L. Willson wrote:
> I'll throw some ideas:
> 
>  - I have a hung server. I want to force it to dump core, or attempt to, and reboot, or attempt to. How do I do that?

A hang can be caused by high priority processes starving the ones that
respond to user requests, e.g., mingetty and sshd, so the box only looks
dead from the console and the network.

With console access you can use sysrq. Exactly which letters do what can
vary by vendor (e.g., Red Hat adds on to kernel.org options). See
Documentation/sysrq.txt in a kernel tree.
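
Before relying on sysrq it is worth checking what the kernel will accept
from the keyboard. A minimal sketch in Python, assuming the kernel.org
bitmask meanings for /proc/sys/kernel/sysrq (vendor kernels may differ, and
the helper names here are made up for illustration):

```python
# Bit values as described in the kernel.org sysrq documentation.
# Assumption: the running kernel follows the kernel.org meanings.
SYSRQ_BITS = {
    2: "console log level control",
    4: "keyboard control (SAK, unraw)",
    8: "debugging dumps of processes",
    16: "sync",
    32: "remount read-only",
    64: "signalling of processes (term, kill, oom-kill)",
    128: "reboot/poweroff",
    256: "nicing of all RT tasks",
}


def decode_sysrq(value):
    """Translate the kernel.sysrq setting into a list of enabled functions."""
    if value == 0:
        return ["disabled"]
    if value == 1:
        return ["all functions enabled"]
    return [name for bit, name in sorted(SYSRQ_BITS.items()) if value & bit]


def read_sysrq(path="/proc/sys/kernel/sysrq"):
    """Read the current setting; the path is the usual procfs location."""
    with open(path) as f:
        return int(f.read().strip())
```

On a live box, decode_sysrq(read_sysrq()) lists what the keyboard combo is
allowed to do; a value of 0 means the setting has to be changed before the
server hangs, not after.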

To get a core you will need to have configured diskdump/netdump or kdump
first.
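
For the kdump case, a rough way to see whether the setup was done is to look
for a crashkernel= reservation on the kernel command line and check whether a
crash kernel is loaded. The sketch below assumes a kernel that exposes
/sys/kernel/kexec_crash_loaded; it says nothing about diskdump/netdump, and
the function names are illustrative only:

```python
def crashkernel_arg(cmdline):
    """Return the value of crashkernel= from a kernel command line, or None."""
    for token in cmdline.split():
        if token.startswith("crashkernel="):
            return token.split("=", 1)[1]
    return None


def kdump_loaded(path="/sys/kernel/kexec_crash_loaded"):
    """True if the kernel reports a loaded crash kernel.

    Assumption: the sysfs file exists on this kernel; if it is missing
    or unreadable we report False rather than guess.
    """
    try:
        with open(path) as f:
            return f.read().strip() == "1"
    except OSError:
        return False


def current_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read()
```

If crashkernel_arg(current_cmdline()) is None or kdump_loaded() is False, a
forced crash is unlikely to leave a vmcore behind.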

>  - If it won't reboot, what sorts of things can be done to find out why?

Not a whole lot, and it can be really frustrating. From past experience,
it could be a high priority thread (e.g., SCHED_FIFO at priority 99)
stuck in a loop, problems exiting SMM (System Management Mode) caused by
vendor agents (e.g., HP and IBM have S/W agents for monitoring the
hardware), or a kernel deadlock.

I have not used kgdb yet, but it may allow probing the system, provided
kgdb itself still responds.

If the "server" is a VM running on KVM you can use qemu's gdbserver stub
to have a look at guest side memory. The 'info registers' command in the
qemu monitor can also provide hints.

>  - If it will dump core, what sorts of things can I do with the core dump?

crash is the best tool.

>  - If a server can't be physically reset, how do you get it to abandon a queued IO request?

You can try to force a sync with the sysrq key.
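
If a root shell still responds (say, over a session that was already open),
the same request can be made without the keyboard by writing the letter to
/proc/sysrq-trigger. A small sketch, assuming the kernel.org letter 's' for
sync; send_sysrq is a made-up helper, and this needs root:

```python
def send_sysrq(letter, trigger_path="/proc/sysrq-trigger"):
    """Write a single sysrq command letter to the trigger file.

    Assumption: 's' means emergency sync, as in kernel.org kernels.
    Letters such as 'b' (reboot) or 'c' (crash) act immediately,
    so only one validated character is ever written.
    """
    if len(letter) != 1 or not letter.isalnum():
        raise ValueError("sysrq command must be a single letter or digit")
    with open(trigger_path, "w") as f:
        f.write(letter)
```

Calling send_sysrq("s") asks the kernel to sync; the result shows up in the
kernel log. It will not help if the IO is stuck below the filesystem, since
the sync itself can then block.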

David


> 
> It's a poor list, but it's what I have for now.
> _______________________________________________
> Web Page:  http://lug.boulder.co.us
> Mailing List: http://lists.lug.boulder.co.us/mailman/listinfo/lug
> Join us on IRC: irc.hackingsociety.org port=6667 channel=#hackingsociety


