kexec is a set of system calls that allows you to load another kernel
from the currently executing Linux kernel.  The current implementation
has only been tested, and had the kinks worked out, on x86, but the
generic code should work on any architecture.

Some machines have BIOSes that are either extremely slow to reboot,
or that cannot reliably perform a reboot.  In those cases kexec
may be the only way to reboot in a reliable and timely manner.

The patch is archived at:
http://www.xmission.com/~ebiederm/files/kexec/

It is currently kept in two pieces.
The pure system call:
http://www.xmission.com/~ebiederm/files/kexec/linux-2.5.48.x86kexec.diff

And the set of hardware fixes known to help kexec.
http://www.xmission.com/~ebiederm/files/kexec/linux-2.5.48.x86kexec-hwfixes.diff

A compatible user space is at:
http://www.xmission.com/~ebiederm/files/kexec/kexec-tools-1.8.tar.gz
This code boots either a static ELF executable or a bzImage.

As of version 1.6 /sbin/kexec works much more like /sbin/reboot.
It is recommended that you place /sbin/kexec -e in /etc/init.d/reboot
just before the call to /sbin/reboot.  If you haven't called
/sbin/kexec previously it will fail, and you can then call
/sbin/reboot.  Given the similarity, the plan is now to merge
reboot via kexec into /sbin/reboot.

One bug was fixed in the move to 2.5.48.  Previously I had failed to
clear the PAE and PSE bits in %cr4.  This caused reboot failures when
CONFIG_HIGHMEM_64G was enabled, as the new kernel would fail when
enabling paging because these bits remained set.  Is %cr4 present on
all 386+ Intel CPUs, or do I need to conditionalize the code that
accesses it?
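
As a minimal sketch of the fix described above (not the code from the
patch): PSE and PAE are bits 4 and 5 of %cr4, so the value handed to
the new kernel needs both masked off.  The function name and the
has_cr4 flag are hypothetical.  Whether %cr4 exists on every 386+ CPU
is the open question above, so the sketch takes that as a
caller-supplied flag rather than answering it.

```c
#include <assert.h>
#include <stdint.h>

/* CR4 bit positions on x86: PSE is bit 4, PAE is bit 5. */
#define X86_CR4_PSE (UINT32_C(1) << 4)  /* page size extensions */
#define X86_CR4_PAE (UINT32_C(1) << 5)  /* physical address extensions */

static_assert((X86_CR4_PSE | X86_CR4_PAE) == 0x30,
              "PSE and PAE are CR4 bits 4 and 5");

/*
 * Compute the CR4 value to use before jumping to the new kernel.
 * has_cr4 is a hypothetical flag: nonzero if the CPU has a %cr4
 * register at all.  When it is zero the value is returned untouched
 * and the caller is expected not to access %cr4.
 */
uint32_t cr4_for_handoff(uint32_t cr4, int has_cr4)
{
    if (!has_cr4)
        return cr4;
    /* Clear both bits so the new kernel can enable paging from scratch. */
    return cr4 & ~(X86_CR4_PSE | X86_CR4_PAE);
}
```

Reading and writing the register itself needs privileged inline
assembly and is left out here; only the bit manipulation is shown.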

As of version 1.6 /sbin/kexec, when presented with a bzImage, by
default avoids all BIOS calls and jumps directly to the kernel's
32-bit entry point.  The information it would usually get from the
BIOS is instead collected from the current kernel.  Accurately getting
things like the BIOS memory map from the current kernel is a challenge
that still needs to be addressed.  Safe defaults have been provided
for the cases where I do not currently have good code to gather the
information from the running kernel.
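
One possible way to recover a memory map from a running kernel is to
read the top-level ranges in /proc/iomem.  The sketch below parses a
single line of that file.  Using /proc/iomem as the source is an
assumption for illustration, not a description of what kexec-tools
does, and struct mem_range and parse_iomem_line are hypothetical
names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record, loosely modelled on a BIOS memory-map entry. */
struct mem_range {
    unsigned long long start;  /* first byte of the range */
    unsigned long long size;   /* length in bytes */
    int is_ram;                /* 1 if the range is "System RAM" */
};

/*
 * Parse one top-level /proc/iomem line such as
 *   "00100000-1ffeffff : System RAM"
 * Returns 1 on success, 0 if the line is indented (a nested resource)
 * or does not match the expected layout.
 */
int parse_iomem_line(const char *line, struct mem_range *out)
{
    unsigned long long start, end;
    char name[64];

    assert(line != NULL && out != NULL);

    /* Nested resources are indented; skip them. */
    if (line[0] == ' ' || line[0] == '\t')
        return 0;
    if (sscanf(line, "%llx-%llx : %63[^\n]", &start, &end, name) != 3)
        return 0;
    if (end < start)
        return 0;

    out->start = start;
    out->size = end - start + 1;  /* the end address is inclusive */
    out->is_ram = (strcmp(name, "System RAM") == 0);
    return 1;
}
```

A caller would feed it each line from fgets() on /proc/iomem and keep
the entries for which it returns 1.  Anything the file does not
describe would still need the safe defaults mentioned above.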

In bug reports please include the serial console output of
kexec kexec_test.  kexec_test exercises most of the interesting code
paths that are needed to load a kernel (mainly BIOS calls) with lots
of debugging print statements, so it is easy to see where a hang
occurs.


A kernel reformatter that bypasses setup.S in favor of a version that
uses fewer BIOS calls (increasing the reliability) is at:
ftp://ftp.lnxi.com/pub/mkelfImage/mkelfImage-1.18.tar.gz

I have been using this technique for the last several years, and
what the kernel needs to do is well understood and currently
implemented.  I have also been working with etherboot, which is a very
minimal kernel whose sole purpose is to download a kernel over the
network and boot it.

From etherboot it is possible to download a DOS image and run it.
I have not yet reached this level of firmware reliability after using
the kexec syscall, but the fact that etherboot does it shows it can be
done.  Having worked with etherboot as a primary bootloader I had
forgotten how challenging it is, in theory, to shut down a kernel
that uses its own hardware drivers and still have the firmware work
reliably.

My expectation is that I can have complete BIOS functionality when
I compile a kernel whose sole user space is a ramdisk, the kernel has
no hardware drivers, and I use a bootloader that does not mess up
the BIOS.  And I expect that using kernel drivers will only mess up
the firmware drivers where they work with the same hardware.

So far failures have fallen into three categories:  kernels that are
broken even without kexec; kernels that hang when running setup.S
during bootup; and kernels that don't quite boot because a kernel
driver cannot reinitialize the hardware from the state that same
driver left the hardware in.

For failures during setup.S I have seen three cases:  DOS, running
before loadlin, modified the real mode IDT so as to be useless without
DOS; interrupt 0x13 ah=0x15 (print dasd type); and the new EDD code.
These failures fall into the expected cases, but except for the
loadlin case, which is hopeless, they have not been completely tracked
down.

In the plan forward there are several goals:  to find what it takes
to have the BIOS work reliably after a kexec system call; to boot
bzImages reliably by bypassing setup.S completely; and to fix the
Linux device drivers that prevent a kernel called with sys_kexec from
initializing.

An important point is that none of these future projects require
modifying the current kexec system call.  The needed work is either in
user space or in the device drivers.
