diff options
author | Olaf Hering <olaf@aepfle.de> | 2012-07-17 17:43:35 +0200 |
---|---|---|
committer | Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> | 2012-07-19 15:52:05 -0400 |
commit | 00e37bdb0113a98408de42db85be002f21dbffd3 (patch) | |
tree | 13207109cddbc8c3550659eb67e3345ed6cca9a7 /drivers/xen | |
parent | 4ff2d06255461390ad685843d0d7364aaa6642d2 (diff) | |
download | op-kernel-dev-00e37bdb0113a98408de42db85be002f21dbffd3.zip op-kernel-dev-00e37bdb0113a98408de42db85be002f21dbffd3.tar.gz |
xen PVonHVM: move shared_info to MMIO before kexec
Currently kexec in a PVonHVM guest fails with a triple fault because the
new kernel overwrites the shared info page. The exact failure depends on
the size of the kernel image. This patch moves the pfn from RAM into
MMIO space before the kexec boot.
The pfn containing the shared_info is located somewhere in RAM. This
will cause trouble if the current kernel is doing a kexec boot into a
new kernel. The new kernel (and its startup code) can not know where the
pfn is, so it can not reserve the page. The hypervisor will continue to
update the pfn, and as a result memory corruption occours in the new
kernel.
One way to work around this issue is to allocate a page in the
xen-platform pci device's BAR memory range. But pci init is done very
late and the shared_info page is already in use very early to read the
pvclock. So moving the pfn from RAM to MMIO is racy because some code
paths on other vcpus could access the pfn during the small window when
the old pfn is moved to the new pfn. There is even a small window were
the old pfn is not backed by a mfn, and during that time all reads
return -1.
Because it is not known upfront where the MMIO region is located it can
not be used right from the start in xen_hvm_init_shared_info.
To minimise trouble the move of the pfn is done shortly before kexec.
This does not eliminate the race because all vcpus are still online when
the syscore_ops will be called. But hopefully there is no work pending
at this point in time. Also the syscore_op is run last which reduces the
risk further.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Diffstat (limited to 'drivers/xen')
-rw-r--r-- | drivers/xen/platform-pci.c | 15 |
1 files changed, 15 insertions, 0 deletions
diff --git a/drivers/xen/platform-pci.c b/drivers/xen/platform-pci.c index 97ca359..d4c50d6 100644 --- a/drivers/xen/platform-pci.c +++ b/drivers/xen/platform-pci.c @@ -101,6 +101,19 @@ static int platform_pci_resume(struct pci_dev *pdev) return 0; } +static void __devinit prepare_shared_info(void) +{ +#ifdef CONFIG_KEXEC + unsigned long addr; + struct shared_info *hvm_shared_info; + + addr = alloc_xen_mmio(PAGE_SIZE); + hvm_shared_info = ioremap(addr, PAGE_SIZE); + memset(hvm_shared_info, 0, PAGE_SIZE); + xen_hvm_prepare_kexec(hvm_shared_info, addr >> PAGE_SHIFT); +#endif +} + static int __devinit platform_pci_init(struct pci_dev *pdev, const struct pci_device_id *ent) { @@ -138,6 +151,8 @@ static int __devinit platform_pci_init(struct pci_dev *pdev, platform_mmio = mmio_addr; platform_mmiolen = mmio_len; + prepare_shared_info(); + if (!xen_have_vector_callback) { ret = xen_allocate_irq(pdev); if (ret) { |