path: root/arch/x86/xen/mmu.c
Commit message (author, date, files changed, lines removed/added)
*---. Merge branches 'stable/irq', 'stable/p2m.bugfixes', 'stable/e820.bugfixes' and 'stable/mmu.bugfixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen (Linus Torvalds, 2011-05-19, 1 file, -1/+1)
|\ \ \
    * 'stable/irq' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
      xen: do not clear and mask evtchns in __xen_evtchn_do_upcall

    * 'stable/p2m.bugfixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
      xen/p2m: Create entries in the P2M_MFN trees's to track 1-1 mappings

    * 'stable/e820.bugfixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
      xen/setup: Fix for incorrect xen_extra_mem_start initialization under 32-bit
      xen/setup: Ignore E820_UNUSABLE when setting 1-1 mappings.

    * 'stable/mmu.bugfixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
      xen mmu: fix a race window causing leave_mm BUG()
| | | * xen mmu: fix a race window causing leave_mm BUG() (Tian, Kevin, 2011-05-12, 1 file, -1/+1)
    There's a race window in xen_drop_mm_ref, where the remote CPU may exit the dirty bitmap between the check on this CPU and the point where the remote CPU handles the drop request. So in drop_other_mm_ref we need to check whether the TLB state is still lazy before calling into leave_mm.

    This bug was rarely observed in earlier kernels, but is exacerbated by commit 831d52bc153971b70e64eccfbed2b232394f22f8 ("x86, mm: avoid possible bogus tlb entries by clearing prev mm_cpumask after switching mm"), which clears the bitmap after changing the TLB state.

    The call trace is as below:
    ---------------------------------
    kernel BUG at arch/x86/mm/tlb.c:61!
    invalid opcode: 0000 [#1] SMP
    last sysfs file: /sys/devices/system/xen_memory/xen_memory0/info/current_kb
    CPU 1
    Modules linked in: 8021q garp xen_netback xen_blkback blktap blkback_pagemap nbd bridge stp llc autofs4 ipmi_devintf ipmi_si ipmi_msghandler lockd sunrpc bonding ipv6 xenfs dm_multipath video output sbs sbshc parport_pc lp parport ses enclosure snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device serio_raw bnx2 snd_pcm_oss snd_mixer_oss snd_pcm snd_timer iTCO_wdt snd soundcore snd_page_alloc i2c_i801 iTCO_vendor_support i2c_core pcspkr pata_acpi ata_generic ata_piix shpchp mptsas mptscsih mptbase [last unloaded: freq_table]
    Pid: 25581, comm: khelper Not tainted 2.6.32.36fixxen #1 Tecal RH2285
    RIP: e030:[<ffffffff8103a3cb>] [<ffffffff8103a3cb>] leave_mm+0x15/0x46
    RSP: e02b:ffff88002805be48 EFLAGS: 00010046
    RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88015f8e2da0
    RDX: ffff88002805be78 RSI: 0000000000000000 RDI: 0000000000000001
    RBP: ffff88002805be48 R08: ffff88009d662000 R09: dead000000200200
    R10: dead000000100100 R11: ffffffff814472b2 R12: ffff88009bfc1880
    R13: ffff880028063020 R14: 00000000000004f6 R15: 0000000000000000
    FS: 00007f62362d66e0(0000) GS:ffff880028058000(0000) knlGS:0000000000000000
    CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 0000003aabc11909 CR3: 000000009b8ca000 CR4: 0000000000002660
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process khelper (pid: 25581, threadinfo ffff88007691e000, task ffff88009b92db40)
    Stack:
     ffff88002805be68 ffffffff8100e4ae 0000000000000001 ffff88009d733b88
    <0> ffff88002805be98 ffffffff81087224 ffff88002805be78 ffff88002805be78
    <0> ffff88015f808360 00000000000004f6 ffff88002805bea8 ffffffff81010108
    Call Trace:
     <IRQ>
     [<ffffffff8100e4ae>] drop_other_mm_ref+0x2a/0x53
     [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
     [<ffffffff81010108>] xen_call_function_single_interrupt+0x13/0x28
     [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
     [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
     [<ffffffff8128c1c0>] __xen_evtchn_do_upcall+0x1ab/0x27d
     [<ffffffff8128dd11>] xen_evtchn_do_upcall+0x33/0x46
     [<ffffffff81013efe>] xen_do_hypervisor_callback+0x1e/0x30
     <EOI>
     [<ffffffff814472b2>] ? _spin_unlock_irqrestore+0x15/0x17
     [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
     [<ffffffff81113f71>] ? flush_old_exec+0x3ac/0x500
     [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
     [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
     [<ffffffff8115115d>] ? load_elf_binary+0x398/0x17ef
     [<ffffffff81042fcf>] ? need_resched+0x23/0x2d
     [<ffffffff811f4648>] ? process_measurement+0xc0/0xd7
     [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
     [<ffffffff81113094>] ? search_binary_handler+0xc8/0x255
     [<ffffffff81114362>] ? do_execve+0x1c3/0x29e
     [<ffffffff8101155d>] ? sys_execve+0x43/0x5d
     [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
     [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0
     [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
     [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
     [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e
     [<ffffffff81013daa>] ? child_rip+0xa/0x20
     [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
     [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b
     [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6
     [<ffffffff81013da0>] ? child_rip+0x0/0x20
    Code: 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 e8 17 ff ff ff c9 c3 55 48 89 e5 0f 1f 44 00 00 65 8b 04 25 c8 55 01 00 ff c8 75 04 <0f> 0b eb fe 65 48 8b 34 25 c0 55 01 00 48 81 c6 b8 02 00 00 e8
    RIP [<ffffffff8103a3cb>] leave_mm+0x15/0x46
    RSP <ffff88002805be48>
    ---[ end trace ce9cee6832a9c503 ]---

    Tested-by: Maoxiaoyun<tinnycloud@hotmail.com>
    Signed-off-by: Kevin Tian <kevin.tian@intel.com>
    [v1: Fleshed out the git description a bit]
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
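    The check this fix describes can be sketched in user-space C. This is a toy model under stated assumptions, not the kernel code: the struct layout, the sim_ names, and the boolean return standing in for BUG() are all hypothetical. It only shows why re-checking the lazy TLB state on the remote CPU avoids calling leave_mm when the state has already gone back to TLBSTATE_OK.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's per-CPU TLB state. */
enum sim_tlb_state { SIM_TLBSTATE_OK = 1, SIM_TLBSTATE_LAZY = 2 };

struct sim_mm { int id; };

struct sim_cpu_tlbstate {
    struct sim_mm *active_mm;
    enum sim_tlb_state state;
    int leave_mm_calls;          /* counts simulated leave_mm() calls */
};

/*
 * Simulated leave_mm(): the commit message shows the real function
 * hitting BUG(); here that case just returns false.
 */
bool sim_leave_mm(struct sim_cpu_tlbstate *cpu)
{
    if (cpu->state == SIM_TLBSTATE_OK)
        return false;            /* would be BUG() in the kernel */
    cpu->active_mm = NULL;
    cpu->leave_mm_calls++;
    return true;
}

/*
 * Simulated drop request handler: only call leave_mm() when this CPU
 * still holds the mm AND is still in lazy TLB mode.
 */
bool sim_drop_other_mm_ref(struct sim_cpu_tlbstate *cpu, struct sim_mm *mm)
{
    if (cpu->active_mm == mm && cpu->state != SIM_TLBSTATE_OK)
        return sim_leave_mm(cpu);
    return true;                 /* nothing to drop, no BUG() */
}
```

    With the extra state check, a CPU that left lazy mode between the sender's check and the IPI handler simply skips the drop.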
* | | | arch/x86/xen/mmu: Cleanup code/data sections definitions (Daniel Kiper, 2011-05-19, 1 file, -17/+17)
    Clean up code/data section definitions according to include/linux/init.h.

    Signed-off-by: Daniel Kiper <dkiper@net-space.pl>
    [v1: Rebased on top of latest linus's to include fixes in mmu.c]
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | | | x86,xen: introduce x86_init.mapping.pagetable_reserve (Stefano Stabellini, 2011-05-12, 1 file, -0/+15)
    Introduce a new x86_init hook called pagetable_reserve that is used at the end of init_memory_mapping to reserve a range of memory addresses for the kernel pagetable pages we used and free the other ones.

    On native it just calls memblock_x86_reserve_range, while on Xen it also takes care of setting the spare memory previously allocated for kernel pagetable pages from RO to RW, so that it can be used for other purposes.

    A detailed explanation of the reason why this hook is needed follows.

    As a consequence of the commit:

      commit 4b239f458c229de044d6905c2b0f9fe16ed9e01e
      Author: Yinghai Lu <yinghai@kernel.org>
      Date: Fri Dec 17 16:58:28 2010 -0800

          x86-64, mm: Put early page table high

    at some point init_memory_mapping is going to reach the pagetable pages area and map those pages too (mapping them as normal memory that falls in the range of addresses passed to init_memory_mapping as argument). Some of those pages are already pagetable pages (they are in the range pgt_buf_start-pgt_buf_end), therefore they are going to be mapped RO and everything is fine. Some of these pages are not pagetable pages yet (they fall in the range pgt_buf_end-pgt_buf_top; for example the page at pgt_buf_end), so they are going to be mapped RW. When these pages become pagetable pages and are hooked into the pagetable, Xen will find that the guest already has a RW mapping of them somewhere and fail the operation.

    The reason Xen requires pagetables to be RO is that the hypervisor needs to verify that the pagetables are valid before using them. The validation operations are called "pinning" (more details in arch/x86/xen/mmu.c).

    In order to fix the issue we mark all the pages in the entire range pgt_buf_start-pgt_buf_top as RO; however, when the pagetable allocation is completed only the range pgt_buf_start-pgt_buf_end is reserved by init_memory_mapping. Hence the kernel is going to crash as soon as one of the pages in the range pgt_buf_end-pgt_buf_top is reused (b/c those ranges are RO).

    For this reason we need a hook to reserve the kernel pagetable pages we used and free the other ones so that they can be reused for other purposes. On native it just means calling memblock_x86_reserve_range; on Xen it also means marking RW the pagetable pages that we allocated before but that haven't been used.

    Another way to fix this without using the hook is to add an 'if (xen_pv_domain)' check in the 'init_memory_mapping' code and call the Xen counterpart, but that is just nasty.

    Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
    Acked-by: Yinghai Lu <yinghai@kernel.org>
    Acked-by: H. Peter Anvin <hpa@zytor.com>
    Cc: Ingo Molnar <mingo@elte.hu>
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
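    The split between the native and Xen behaviour of the hook can be sketched with a toy user-space model. Everything here is an assumption for illustration: the sim_ names, the page-frame-granular flags, and the hook signature (the real hook works on addresses inside the kernel). It is meant only to show the idea that native reserves [start, end) while Xen additionally turns the spare [end, top) pages back to RW.

```c
#include <stdbool.h>
#include <stdint.h>

#define SIM_MAX_PFN 64  /* size of the toy physical memory, in pages */

/* Toy memory model: one flag pair per page frame (hypothetical). */
struct sim_mem {
    bool reserved[SIM_MAX_PFN];
    bool read_only[SIM_MAX_PFN];
};

/* Early pagetable buffer: [start, end) is used, [end, top) is spare. */
struct sim_pgt_buf {
    uint64_t start, end, top;
};

/* Hook type, loosely modeled on x86_init.mapping.pagetable_reserve. */
typedef void (*sim_pagetable_reserve_fn)(struct sim_mem *mem,
                                         const struct sim_pgt_buf *buf);

/* Native: only reserve the pagetable pages that were actually used. */
void sim_native_pagetable_reserve(struct sim_mem *mem,
                                  const struct sim_pgt_buf *buf)
{
    for (uint64_t pfn = buf->start; pfn < buf->end && pfn < SIM_MAX_PFN; pfn++)
        mem->reserved[pfn] = true;
}

/* Xen: reserve the used pages, then flip the spare ones back to RW. */
void sim_xen_pagetable_reserve(struct sim_mem *mem,
                               const struct sim_pgt_buf *buf)
{
    sim_native_pagetable_reserve(mem, buf);
    for (uint64_t pfn = buf->end; pfn < buf->top && pfn < SIM_MAX_PFN; pfn++)
        mem->read_only[pfn] = false;
}
```

    Routing the call through a function pointer is what lets the generic mapping code stay free of Xen-specific conditionals, which is the design choice the commit message argues for.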
* | | | Revert "xen/mmu: Add workaround "x86-64, mm: Put early page table high"" (Konrad Rzeszutek Wilk, 2011-05-12, 1 file, -123/+0)
|/ / /
    This reverts commit a38647837a411f7df79623128421eef2118b5884.

    It does not work with certain AMD machines.

    last_pfn = 0x100000 max_arch_pfn = 0x400000000
    initial memory mapped : 0 - 02c3a000
    Base memory trampoline at [ffff88000009b000] 9b000 size 20480
    init_memory_mapping: 0000000000000000-0000000100000000
     0000000000 - 0100000000 page 4k
    kernel direct mapping tables up to 100000000 @ ff7fb000-100000000
    init_memory_mapping: 0000000100000000-00000001e0800000
     0100000000 - 01e0800000 page 4k
    kernel direct mapping tables up to 1e0800000 @ 1df0f3000-1e0000000
    xen: setting RW the range fffdc000 - 100000000
    RAMDISK: 0203b000 - 02c3a000
    No NUMA configuration found
    Faking a node at 0000000000000000-00000001e0800000
    NUMA: Using 63 for the hash shift.
    Initmem setup node 0 0000000000000000-00000001e0800000
     NODE_DATA [00000001dfffb000 - 00000001dfffffff]
    BUG: unable to handle kernel NULL pointer dereference at (null)
    IP: [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea
    PGD 0
    Oops: 0003 [#1] SMP
    last sysfs file:
    CPU 0
    Modules linked in:

    Pid: 0, comm: swapper Not tainted 2.6.39-0-virtual #6~smb1
    RIP: e030:[<ffffffff81cf6a75>] [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea
    RSP: e02b:ffffffff81c01e38 EFLAGS: 00010046
    RAX: 0000000000000000 RBX: 00000001e0800000 RCX: 0000000000001040
    RDX: 0000000000004100 RSI: 0000000000000000 RDI: ffff8801dfffb000
    RBP: ffffffff81c01e58 R08: 0000000000000020 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
    R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000bfe400
    FS: 0000000000000000(0000) GS:ffffffff81cca000(0000) knlGS:0000000000000000
    CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 0000000001c03000 CR4: 0000000000000660
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process swapper (pid: 0, threadinfo ffffffff81c00000, task ffffffff81c0b020)
    Stack:
     0000000000000040 0000000000000001 0000000000000000 ffffffffffffffff
     ffffffff81c01e88 ffffffff81cf6c25 0000000000000000 0000000000000000
     ffffffff81cf687f 0000000000000000 ffffffff81c01ea8 ffffffff81cf6e45
    Call Trace:
     [<ffffffff81cf6c25>] numa_register_memblks.constprop.3+0x150/0x181
     [<ffffffff81cf687f>] ? numa_add_memblk+0x7c/0x7c
     [<ffffffff81cf6e45>] numa_init.part.2+0x1c/0x7c
     [<ffffffff81cf687f>] ? numa_add_memblk+0x7c/0x7c
     [<ffffffff81cf6f67>] numa_init+0x6c/0x70
     [<ffffffff81cf7057>] initmem_init+0x39/0x3b
     [<ffffffff81ce5865>] setup_arch+0x64e/0x769
     [<ffffffff815e43c1>] ? printk+0x51/0x53
     [<ffffffff81cdf92b>] start_kernel+0xd4/0x3f3
     [<ffffffff81cdf388>] x86_64_start_reservations+0x132/0x136
     [<ffffffff81ce2ed4>] xen_start_kernel+0x588/0x58f
    Code: 41 00 00 48 8b 3c c5 a0 24 cc 81 31 c0 40 f6 c7 01 74 05 aa 66 ba ff 40 40 f6 c7 02 74 05 66 ab 83 ea 02 89 d1 c1 e9 02 f6 c2 02 <f3> ab 74 02 66 ab 80 e2 01 74 01 aa 49 63 c4 48 c1 eb 0c 44 89
    RIP [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea
     RSP <ffffffff81c01e38>
    CR2: 0000000000000000
    ---[ end trace a7919e7f17c0a725 ]---
    Kernel panic - not syncing: Attempted to kill the idle task!
    Pid: 0, comm: swapper Tainted: G D 2.6.39-0-virtual #6~smb1

    Reported-by: Stefan Bader <stefan.bader@canonical.com>
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | | xen: mask_rw_pte mark RO all pagetable pages up to pgt_buf_top (Stefano Stabellini, 2011-05-02, 1 file, -1/+1)
    mask_rw_pte currently treats a pfn as a pagetable page if it falls in the range pgt_buf_start - pgt_buf_end, but that is incorrect because pgt_buf_end is a moving target: pgt_buf_top is the real boundary.

    Acked-by: "H. Peter Anvin" <hpa@zytor.com>
    Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
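    The boundary change can be illustrated with a small user-space sketch. The struct and the sim_ names are hypothetical, and the real mask_rw_pte does more than this (for example the early_ioremap checks mentioned elsewhere in this log). The only point shown is that the RO range is bounded by pgt_buf_top rather than pgt_buf_end.

```c
#include <stdint.h>

#define SIM_PAGE_RW 0x2ULL  /* on x86 the RW flag is bit 1 of the PTE */

/* Hypothetical snapshot of the early pagetable buffer, in pfns. */
struct sim_pgt_buf {
    uint64_t start;  /* first pfn of the buffer */
    uint64_t end;    /* next free pfn; moves as pages are allocated */
    uint64_t top;    /* one past the last pfn of the buffer */
};

/*
 * Clear the RW bit for any pfn inside the whole buffer, including the
 * not-yet-allocated tail [end, top), since those pages may become
 * pagetable pages later.
 */
uint64_t sim_mask_rw_pte(uint64_t pte_flags, uint64_t pfn,
                         const struct sim_pgt_buf *buf)
{
    if (pfn >= buf->start && pfn < buf->top)
        pte_flags &= ~SIM_PAGE_RW;
    return pte_flags;
}
```

    A pfn between end and top is the interesting case: checking against end would have left it RW.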
* | | xen/mmu: Add workaround "x86-64, mm: Put early page table high" (Konrad Rzeszutek Wilk, 2011-05-02, 1 file, -0/+123)
    As a consequence of the commit:

      commit 4b239f458c229de044d6905c2b0f9fe16ed9e01e
      Author: Yinghai Lu <yinghai@kernel.org>
      Date: Fri Dec 17 16:58:28 2010 -0800

          x86-64, mm: Put early page table high

    it causes the Linux kernel to crash under Xen:

    mapping kernel into physical memory
    Xen: setup ISA identity maps
    about to get started...
    (XEN) mm.c:2466:d0 Bad type (saw 7400000000000001 != exp 1000000000000000) for mfn b1d89 (pfn bacf7)
    (XEN) mm.c:3027:d0 Error while pinning mfn b1d89
    (XEN) traps.c:481:d0 Unhandled invalid opcode fault/trap [#6] on VCPU 0 [ec=0000]
    (XEN) domain_crash_sync called from entry.S
    (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
    ...

    The reason is that at some point init_memory_mapping is going to reach the pagetable pages area and map those pages too (mapping them as normal memory that falls in the range of addresses passed to init_memory_mapping as argument). Some of those pages are already pagetable pages (they are in the range pgt_buf_start-pgt_buf_end) therefore they are going to be mapped RO and everything is fine. Some of these pages are not pagetable pages yet (they fall in the range pgt_buf_end-pgt_buf_top; for example the page at pgt_buf_end) so they are going to be mapped RW. When these pages become pagetable pages and are hooked into the pagetable, xen will find that the guest has already a RW mapping of them somewhere and fail the operation.

    The reason Xen requires pagetables to be RO is that the hypervisor needs to verify that the pagetables are valid before using them. The validation operations are called "pinning" (more details in arch/x86/xen/mmu.c).

    In order to fix the issue we mark all the pages in the entire range pgt_buf_start-pgt_buf_top as RO, however when the pagetable allocation is completed only the range pgt_buf_start-pgt_buf_end is reserved by init_memory_mapping. Hence the kernel is going to crash as soon as one of the pages in the range pgt_buf_end-pgt_buf_top is reused (b/c those ranges are RO).

    For this reason, this function is introduced which is called _after_ the init_memory_mapping has completed (in a perfect world we would call this function from init_memory_mapping, but lets ignore that).

    Because we are called _after_ init_memory_mapping the pgt_buf_[start,end,top] have all changed to new values (b/c another init_memory_mapping is called). Hence, the first time we enter this function, we save away the pgt_buf_start value and update the pgt_buf_[end,top].

    When we detect that the "old" pgt_buf_start through pgt_buf_end PFNs have been reserved (so memblock_x86_reserve_range has been called), we immediately set out to RW the "old" pgt_buf_end through pgt_buf_top.

    And then we update those "old" pgt_buf_[end|top] with the new ones so that we can redo this on the next pagetable.

    Acked-by: "H. Peter Anvin" <hpa@zytor.com>
    Reviewed-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
    [v1: Updated with Jeremy's comments]
    [v2: Added the crash output]
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | | xen: mask_rw_pte: do not apply the early_ioremap checks on x86_32 (Stefano Stabellini, 2011-04-20, 1 file, -4/+9)
    The two "is_early_ioremap_ptep" checks in mask_rw_pte are only used on x86_64; in fact early_ioremap is not used at all to set up the initial pagetable on x86_32.

    Moreover, on x86_32 the two checks are wrong because the range pgt_buf_start..pgt_buf_end should initially be mapped RW: the pages in the range are not pagetable pages yet and haven't been cleared yet. Afterwards, considering that pgt_buf_start..pgt_buf_end is part of the initial mapping, xen_alloc_pte is capable of turning the ptes RO when they become pagetable pages.

    Fix the issue and improve the readability of the code by providing two different implementations of mask_rw_pte for x86_32 and x86_64.

    Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | | xen/debug: Don't be so verbose with WARN on 1-1 mapping errors. (Konrad Rzeszutek Wilk, 2011-04-04, 1 file, -2/+2)
|/ /
    There are valid situations in which this error is not a warning. Mainly when QEMU maps a guest memory and uses the VM_IO flag to set the MFNs. For right now make the WARN be WARN_ONCE.

    In the future we will:
     1). Remove the VM_IO code handling..
     2). .. which will also remove this debug facility.

    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
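    The difference between warning every time and warning once can be sketched in user-space C. This is not the kernel's WARN_ONCE macro; SIM_WARN_ONCE, the counter, and sim_check_mapping are hypothetical names, and the sketch only models the "print at most once per call site" behaviour.

```c
#include <stdbool.h>
#include <stdio.h>

/* Counts how many warnings were actually printed (for illustration). */
int sim_warnings_printed;

/*
 * Simplified once-only warning: each expansion gets its own static
 * flag, so a given call site prints at most one message.
 */
#define SIM_WARN_ONCE(cond, msg)                        \
    do {                                                \
        static bool sim_warned_;                        \
        if ((cond) && !sim_warned_) {                   \
            sim_warned_ = true;                         \
            sim_warnings_printed++;                     \
            fprintf(stderr, "WARNING: %s\n", (msg));    \
        }                                               \
    } while (0)

/* Hypothetical check that may fire on every mapped page. */
void sim_check_mapping(bool mismatch)
{
    SIM_WARN_ONCE(mismatch, "1-1 mapping mismatch");
}
```

    Calling sim_check_mapping(true) repeatedly logs a single message, which is the reduction in noise the commit is after.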
* | Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip (Linus Torvalds, 2011-03-22, 1 file, -9/+12)
|\ \
    * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
      xen: update mask_rw_pte after kernel page tables init changes
      xen: set max_pfn_mapped to the last pfn mapped
      x86: Cleanup highmap after brk is concluded

    Fix up trivial conflict (added header file includes) in arch/x86/mm/init_64.c
| * | xen: update mask_rw_pte after kernel page tables init changes (Stefano Stabellini, 2011-03-19, 1 file, -3/+5)
    After "x86-64, mm: Put early page table high", already existing kernel page table pages can be mapped using early_ioremap too, so we need to update mask_rw_pte to make sure these pages are still mapped RO.

    The reason why we have to do that is explained by the commit message of fef5ba797991f9335bcfc295942b684f9bf613a1:

    "Xen requires that all pages containing pagetable entries to be mapped read-only. If pages used for the initial pagetable are already mapped then we can change the mapping to RO. However, if they are initially unmapped, we need to make sure that when they are later mapped, they are also mapped RO.

    ..SNIP..

    the pagetable setup code early_ioremaps the pages to write their entries, so we must make sure that mappings created in the early_ioremap fixmap area are mapped RW. (Those mappings are removed before the pages are presented to Xen as pagetable pages.)"

    We accomplish all this in mask_rw_pte by mapping RO all the pages mapped using early_ioremap apart from the last one that has been allocated, because it is not a page table page yet (it has not been hooked into the page tables yet).

    Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
    Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    LKML-Reference: <alpine.DEB.2.00.1103171739050.3382@kaball-desktop>
    Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | xen: set max_pfn_mapped to the last pfn mapped (Stefano Stabellini, 2011-03-19, 1 file, -6/+7)
    Do not set max_pfn_mapped to the end of the initial memory mappings, which also contain pages that don't belong in pfn space (like the mfn list).

    Set max_pfn_mapped to the last real pfn mapped in the initial memory mappings, that is, the pfn backing _end.

    Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
    Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    LKML-Reference: <alpine.DEB.2.00.1103171739050.3382@kaball-desktop>
    Signed-off-by: H. Peter Anvin <hpa@zytor.com>
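    The "pfn backing _end" is just the physical address of the end of the kernel image shifted down to a page frame number. A minimal sketch follows, assuming 4 KiB pages; the sim_ names and the idea of passing the physical address in as a parameter are illustrative assumptions, not the kernel's code.

```c
#include <stdint.h>

#define SIM_PAGE_SHIFT 12  /* assumes 4 KiB pages */

/* Round a physical address down to its page frame number. */
uint64_t sim_pfn_down(uint64_t phys_addr)
{
    return phys_addr >> SIM_PAGE_SHIFT;
}

/*
 * Hypothetical helper: given the physical address that backs the end
 * of the kernel image, return the value max_pfn_mapped would take.
 */
uint64_t sim_max_pfn_mapped(uint64_t phys_end_of_kernel)
{
    return sim_pfn_down(phys_end_of_kernel);
}
```

    Anything mapped beyond that address in the initial mappings (such as the mfn list) is deliberately left out of the result.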
* | | Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip (Linus Torvalds, 2011-03-18, 1 file, -1/+1)
|\ \ \
| |/ /
    * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
      x86: Flush TLB if PGD entry is changed in i386 PAE mode
      x86, dumpstack: Correct stack dump info when frame pointer is available
      x86: Clean up csum-copy_64.S a bit
      x86: Fix common misspellings
      x86: Fix misspelling and align params
      x86: Use PentiumPro-optimized partial_csum() on VIA C7
| * | x86: Fix common misspellings (Lucas De Marchi, 2011-03-18, 1 file, -1/+1)
    They were generated by 'codespell' and then manually reviewed.

    Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
    Cc: trivial@kernel.org
    LKML-Reference: <1300389856-1099-3-git-send-email-lucas.demarchi@profusion.mobi>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | |
| \ \
*-. \ \ Merge branches 'stable/hvc-console', 'stable/gntalloc.v6' and 'stable/balloon' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen (Linus Torvalds, 2011-03-17, 1 file, -2/+1)
|\ \ \ \
| |_|/ /
|/| | /
| | |/
| |/|
    * 'stable/hvc-console' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
      xen/hvc: Disable probe_irq_on/off from poking the hvc-console IRQ line.

    * 'stable/gntalloc.v6' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
      xen: gntdev: fix build warning
      xen/p2m/m2p/gnttab: do not add failed grant maps to m2p override
      xen-gntdev: Add cast to pointer
      xen-gntdev: Fix incorrect use of zero handle
      xen: change xen/[gntdev/gntalloc] to default m
      xen-gntdev: prevent using UNMAP_NOTIFY_CLEAR_BYTE on read-only mappings
      xen-gntdev: Avoid double-mapping memory
      xen-gntdev: Avoid unmapping ranges twice
      xen-gntdev: Use map->vma for checking map validity
      xen-gntdev: Fix unmap notify on PV domains
      xen-gntdev: Fix memory leak when mmap fails
      xen/gntalloc,gntdev: Add unmap notify ioctl
      xen-gntalloc: Userspace grant allocation driver
      xen-gntdev: Support mapping in HVM domains
      xen-gntdev: Add reference counting to maps
      xen-gntdev: Use find_vma rather than iterating our vma list manually
      xen-gntdev: Change page limit to be global instead of per-open

    * 'stable/balloon' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (24 commits)
      xen-gntdev: Use ballooned pages for grant mappings
      xen-balloon: Add interface to retrieve ballooned pages
      xen-balloon: Move core balloon functionality out of module
      xen/balloon: Remove pr_info's and don't alter retry_count
      xen/balloon: Protect against CPU exhaust by event/x process
      xen/balloon: Migration from mod_timer() to schedule_delayed_work()
      xen/balloon: Removal of driver_pages
| | * xen/balloon: Removal of driver_pages (Daniel Kiper, 2011-03-14, 1 file, -2/+1)
| |/
    Removal of driver_pages (I have not seen any references to it).

    Signed-off-by: Daniel Kiper <dkiper@net-space.pl>
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip (Linus Torvalds, 2011-03-15, 1 file, -1/+1)
|\ \
    * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (93 commits)
      x86, tlb, UV: Do small micro-optimization for native_flush_tlb_others()
      x86-64, NUMA: Don't call numa_set_distanc() for all possible node combinations during emulation
      x86-64, NUMA: Don't assume phys node 0 is always online in numa_emulation()
      x86-64, NUMA: Clean up initmem_init()
      x86-64, NUMA: Fix numa_emulation code with node0 without RAM
      x86-64, NUMA: Revert NUMA affine page table allocation
      x86: Work around old gas bug
      x86-64, NUMA: Better explain numa_distance handling
      x86-64, NUMA: Fix distance table handling
      mm: Move early_node_map[] reverse scan helpers under HAVE_MEMBLOCK
      x86-64, NUMA: Fix size of numa_distance array
      x86: Rename e820_table_* to pgt_buf_*
      bootmem: Move __alloc_memory_core_early() to nobootmem.c
      bootmem: Move contig_page_data definition to bootmem.c/nobootmem.c
      bootmem: Separate out CONFIG_NO_BOOTMEM code into nobootmem.c
      x86-64, NUMA: Seperate out numa_alloc_distance() from numa_set_distance()
      x86-64, NUMA: Add proper function comments to global functions
      x86-64, NUMA: Move NUMA emulation into numa_emulation.c
      x86-64, NUMA: Prepare numa_emulation() for moving NUMA emulation into a separate file
      x86-64, NUMA: Do not scan two times for setup_node_bootmem()
      ...

    Fix up conflicts in arch/x86/kernel/smpboot.c
| * \ Merge commit 'v2.6.38' into x86/mm (Ingo Molnar, 2011-03-15, 1 file, -6/+4)
| |\ \
    Conflicts:
      arch/x86/mm/numa_64.c

    Merge reason: Resolve the conflict, update the branch to .38.

    Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | x86: Rename e820_table_* to pgt_buf_* (Yinghai Lu, 2011-02-24, 1 file, -1/+1)
| | |/
| |/|
    e820_table_{start|end|top}, which are used to buffer page table allocation during early boot, are now derived from memblock and don't have much to do with e820. Change the names so that they reflect what they're used for.

    This patch doesn't introduce any behavior change.

    -v2: Ingo found that earlier patch "x86: Use early pre-allocated page table buffer top-down" caused crash on 32bit and needed to be dropped. This patch was updated to reflect the change.

    -tj: Updated commit description.

    Signed-off-by: Yinghai Lu <yinghai@kernel.org>
    Signed-off-by: Tejun Heo <tj@kernel.org>
| | |
| \ \
*-. \ \ Merge branches 'stable/p2m-identity.v4.9.1' and 'stable/e820' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen (Linus Torvalds, 2011-03-15, 1 file, -3/+69)
|\ \ \ \
| | |/ /
| |_| /
|/| |
    * 'stable/p2m-identity.v4.9.1' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
      xen/m2p: Check whether the MFN has IDENTITY_FRAME bit set..
      xen/m2p: No need to catch exceptions when we know that there is no RAM
      xen/debug: WARN_ON when identity PFN has no _PAGE_IOMAP flag set.
      xen/debugfs: Add 'p2m' file for printing out the P2M layout.
      xen/setup: Set identity mapping for non-RAM E820 and E820 gaps.
      xen/mmu: WARN_ON when racing to swap middle leaf.
      xen/mmu: Set _PAGE_IOMAP if PFN is an identity PFN.
      xen/mmu: Add the notion of identity (1-1) mapping.
      xen: Mark all initial reserved pages for the balloon as INVALID_P2M_ENTRY.

    * 'stable/e820' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
      xen/e820: Don't mark balloon memory as E820_UNUSABLE when running as guest and fix overflow.
      xen/setup: Inhibit resource API from using System RAM E820 gaps as PCI mem gaps.
| * | xen/debug: WARN_ON when identity PFN has no _PAGE_IOMAP flag set. (Konrad Rzeszutek Wilk, 2011-03-14, 1 file, -0/+38)
    Only enabled if XEN_DEBUG is enabled. We print a warning when:

     pfn_to_mfn(pfn) == pfn, but no VM_IO (_PAGE_IOMAP) flag set (and pfn is an identity mapped pfn)
     pfn_to_mfn(pfn) != pfn, and VM_IO flag is set. (ditto, pfn is an identity mapped pfn)

    [v2: Make it dependent on CONFIG_XEN_DEBUG instead of ..DEBUG_FS]
    [v3: Fix compiler warning]

    Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
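    The two warning conditions listed in this commit can be written as a single predicate. The sketch below is a user-space illustration only: the function name and the idea of passing the lookup results in as plain parameters are assumptions, standing in for whatever the kernel derives from the P2M and the PTE flags.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when the debug check described above would warn.
 *   pfn, mfn     : the frame and the result of a pfn-to-mfn lookup
 *   is_identity  : whether the pfn is recorded as identity mapped
 *   has_iomap    : whether the mapping carries the IOMAP (VM_IO) flag
 */
bool sim_identity_flag_mismatch(uint64_t pfn, uint64_t mfn,
                                bool is_identity, bool has_iomap)
{
    if (!is_identity)
        return false;            /* both conditions require identity */
    if (mfn == pfn && !has_iomap)
        return true;             /* 1-1 frame but flag missing */
    if (mfn != pfn && has_iomap)
        return true;             /* flag set but frame is not 1-1 */
    return false;
}
```

    For an identity pfn the flag and the 1-1 property must agree; any disagreement is what the debug facility reports.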
| * | xen/debugfs: Add 'p2m' file for printing out the P2M layout. (Konrad Rzeszutek Wilk, 2011-03-14, 1 file, -0/+14)
    We walk over the whole P2M tree and construct a simplified view of which PFN regions belong to what level and what type they are.

    Only enabled if CONFIG_XEN_DEBUG_FS is set.

    [v2: UNKN->UNKNOWN, use uninitialized_var]
    [v3: Rebased on top of mmu->p2m code split]
    [v4: Fixed the else if]
    Reviewed-by: Ian Campbell <Ian.Campbell@eu.citrix.com>
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen/mmu: Set _PAGE_IOMAP if PFN is an identity PFN. (Konrad Rzeszutek Wilk, 2011-03-14, 1 file, -2/+16)
    If we find that the PFN is within the P2M as an identity PFN make sure to tack on the _PAGE_IOMAP flag.

    Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen: Mark all initial reserved pages for the balloon as INVALID_P2M_ENTRY. (Konrad Rzeszutek Wilk, 2011-03-03, 1 file, -1/+1)
| |/
    With this patch, we diligently set regions that will be used by the balloon driver to be INVALID_P2M_ENTRY and under the ownership of the balloon driver. We are OK using __set_phys_to_machine as we do not expect to be allocating any P2M middle or entry pages. set_phys_to_machine has the side effect of potentially allocating new pages, and we do not want that at this stage.

    We can do this because xen_build_mfn_list_list will have already allocated all such pages up to xen_max_p2m_pfn.

    We also move the check for auto translated physmap down the stack so it is present in __set_phys_to_machine.

    [v2: Rebased with mmu->p2m code split]
    Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | x86/mm: Fix pgd_lock deadlockAndrea Arcangeli2011-03-101-6/+4
|/ It's forbidden to take the page_table_lock with IRQs disabled: if there's contention, the IPIs (for TLB flushes) sent with the page_table_lock held will never run, leading to a deadlock. Nobody takes the pgd_lock from irq context, so the _irqsave can be removed. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: <stable@kernel.org> LKML-Reference: <201102162345.p1GNjMjm021738@imap1.linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* xen: export arbitrary_virt_to_machineStephen Rothwell2011-01-141-0/+1
| | | | | | | | | Fixes this build error: ERROR: "arbitrary_virt_to_machine" [drivers/xen/xen-gntdev.ko] undefined! Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* xen: move p2m handling to separate fileJeremy Fitzhardinge2011-01-111-365/+0
| | | | | Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* vmalloc: eagerly clear ptes on vunmapJeremy Fitzhardinge2010-12-021-2/+0
| On stock 2.6.37-rc4, running:
|
|     # mount lilith:/export /mnt/lilith
|     # find /mnt/lilith/ -type f -print0 | xargs -0 file
|
| crashes the machine fairly quickly under Xen. Often it results in oops messages, but the couple of times I tried just now, it just hung quietly and made Xen print some rude messages:
|
|     (XEN) mm.c:2389:d80 Bad type (saw 7400000000000001 != exp 3000000000000000) for mfn 1d7058 (pfn 18fa7)
|     (XEN) mm.c:964:d80 Attempt to create linear p.t. with write perms
|     (XEN) mm.c:2389:d80 Bad type (saw 7400000000000010 != exp 1000000000000000) for mfn 1d2e04 (pfn 1d1fb)
|     (XEN) mm.c:2965:d80 Error while pinning mfn 1d2e04
|
| Which means the domain tried to map a pagetable page RW, which would allow it to map arbitrary memory, so Xen stopped it. This is because vm_unmap_ram() left some pages mapped in the vmalloc area after NFS had finished with them, and those pages got recycled as pagetable pages while still having these RW aliases.
|
| Removing those mappings immediately removes the Xen-visible aliases, and so it has no problem with those pages being reused as pagetable pages. Deferring the TLB flush doesn't upset Xen because it can flush the TLB itself as needed to maintain its invariants.
|
| When unmapping a region in the vmalloc space, clear the ptes immediately. There's no point in deferring this because there's no amortization benefit. The TLBs are left dirty, and they are flushed lazily to amortize the cost of the IPIs.
|
| The specific motivation for this patch is an oops-causing regression since 2.6.36 when using NFS under Xen, triggered by the NFS client's use of vm_map_ram() introduced in 56e4ebf877b60 ("NFS: readdir with vmapped pages"). XFS also uses vm_map_ram() and could cause similar problems.
|
| Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Bryan Schumaker <bjschuma@netapp.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Alex Elder <aelder@sgi.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'for-linus' of ↵Linus Torvalds2010-11-251-13/+56
|\ | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen: remove duplicated #include xen: x86/32: perform initial startup on initial_page_table
| * xen: x86/32: perform initial startup on initial_page_tableIan Campbell2010-11-241-13/+56
| | Only make swapper_pg_dir readonly and pinned when generic x86 architecture code (which also starts on initial_page_table) switches to it. This helps ensure that the generic setup paths work on Xen unmodified. In particular clone_pgd_range writes directly to the destination pgd entries and is used to initialise swapper_pg_dir so we need to ensure that it remains writeable until the last possible moment during bring up. This is complicated slightly by the need to avoid sharing kernel PMD entries when running under Xen, therefore the Xen implementation must make a copy of the kernel PMD (which is otherwise referred to by both initial_page_table and swapper_pg_dir) before switching to swapper_pg_dir. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| |
| \
*-. \ Merge branches 'upstream/core', 'upstream/xenfs' and 'upstream/evtchn' into ↵Jeremy Fitzhardinge2010-11-221-1/+2
|\ \ \ | | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | upstream/for-linus * upstream/core: xen/events: Use PIRQ instead of GSI value when unmapping MSI/MSI-X irqs. xen: set IO permission early (before early_cpu_init()) xen: re-enable boot-time ballooning xen/balloon: make sure we only include remaining extra ram xen/balloon: the balloon_lock is useless xen: add extra pages to balloon xen/events: use locked set|clear_bit() for cpu_evtchn_mask xen/evtchn: clear secondary CPUs' cpu_evtchn_mask[] after restore xen: implement XENMEM_machphys_mapping * upstream/xenfs: Revert "xen/privcmd: create address space to allow writable mmaps" xen/xenfs: update xenfs_mount for new prototype xen: fix header export to userspace xen: set vma flag VM_PFNMAP in the privcmd mmap file_op xen: xenfs: privcmd: check put_user() return code * upstream/evtchn: xen: make evtchn's name less generic xen/evtchn: the evtchn device is non-seekable xen/evtchn: add missing static xen/evtchn: Fix name of Xen event-channel device xen/evtchn: don't do unbind_from_irqhandler under spinlock xen/evtchn: remove spurious barrier xen/evtchn: ports start enabled xen/evtchn: dynamically allocate port_user array xen/evtchn: track enabled state for each port
| * | Merge commit 'v2.6.37-rc2' into upstream/xenfsJeremy Fitzhardinge2010-11-161-88/+416
| |\ \ | | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * commit 'v2.6.37-rc2': (10093 commits) Linux 2.6.37-rc2 capabilities/syslog: open code cap_syslog logic to fix build failure i2c: Sanity checks on adapter registration i2c: Mark i2c_adapter.id as deprecated i2c: Drivers shouldn't include <linux/i2c-id.h> i2c: Delete unused adapter IDs i2c: Remove obsolete cleanup for clientdata include/linux/kernel.h: Move logging bits to include/linux/printk.h Fix gcc 4.5.1 miscompiling drivers/char/i8k.c (again) hwmon: (w83795) Check for BEEP pin availability hwmon: (w83795) Clear intrusion alarm immediately hwmon: (w83795) Read the intrusion state properly hwmon: (w83795) Print the actual temperature channels as sources hwmon: (w83795) List all usable temperature sources hwmon: (w83795) Expose fan control method hwmon: (w83795) Fix fan control mode attributes hwmon: (lm95241) Check validity of input values hwmon: Change mail address of Hans J. Koch PCI: sysfs: fix printk warnings GFS2: Fix inode deallocation race ...
| * | xen: set vma flag VM_PFNMAP in the privcmd mmap file_opStefano Stabellini2010-11-111-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Set VM_PFNMAP in the privcmd mmap file_op, rather than later in xen_remap_domain_mfn_range when it is too late because vma_wants_writenotify has already been called and vm_page_prot has already been modified. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
* | | xen: implement XENMEM_machphys_mappingIan Campbell2010-11-121-0/+14
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | This hypercall allows Xen to specify a non-default location for the machine to physical mapping. This capability is used when running a 32 bit domain 0 on a 64 bit hypervisor to shrink the hypervisor hole to exactly the size required. [ Impact: add Xen hypercall definitions ] Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
* | xen: correct size of level2_kernel_pgtIan Campbell2010-10-291-1/+1
| | | | | | | | | | | | | | | | | | | | sizeof(pmd_t *) is 4 bytes on 32-bit PAE leading to an allocation of only 2048 bytes. The correct size is sizeof(pmd_t) giving us a full page allocation. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
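The arithmetic behind this fix can be checked with a small sketch. The type and function names below are hypothetical stand-ins, not kernel definitions: a 4-byte integer models a 32-bit pointer and an 8-byte integer models a PAE pmd_t, and the table is assumed to hold 512 entries (which is what 2048 bytes at 4 bytes per entry implies).

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for 32-bit PAE sizes (assumptions for illustration only). */
typedef uint32_t pae_pointer_t;  /* models sizeof(pmd_t *) on 32-bit */
typedef uint64_t pae_pmd_t;      /* models sizeof(pmd_t) under PAE   */

enum { ENTRIES_PER_TABLE = 512 };  /* assumed entry count per table */

/* Size produced by the buggy expression: entries * sizeof(pointer). */
size_t buggy_table_size(void)
{
    return ENTRIES_PER_TABLE * sizeof(pae_pointer_t);
}

/* Size produced by the corrected expression: entries * sizeof(entry). */
size_t fixed_table_size(void)
{
    return ENTRIES_PER_TABLE * sizeof(pae_pmd_t);
}
```

Under these assumptions the buggy expression yields 2048 bytes, half a page, while the corrected one yields the full 4096-byte page.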
* | Merge branch 'stable/xen-pcifront-0.8.2' of ↵Linus Torvalds2010-10-281-3/+44
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen and branch 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm * 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm: xen: register xen pci notifier xen: initialize cpu masks for pv guests in xen_smp_init xen: add a missing #include to arch/x86/pci/xen.c xen: mask the MTRR feature from the cpuid xen: make hvc_xen console work for dom0. xen: add the direct mapping area for ISA bus access xen: Initialize xenbus for dom0. xen: use vcpu_ops to setup cpu masks xen: map a dummy page for local apic and ioapic in xen_set_fixmap xen: remap MSIs into pirqs when running as initial domain xen: remap GSIs as pirqs when running as initial domain xen: introduce XEN_DOM0 as a silent option xen: map MSIs into pirqs xen: support GSI -> pirq remapping in PV on HVM guests xen: add xen hvm acpi_register_gsi variant acpi: use indirect call to register gsi in different modes xen: implement xen_hvm_register_pirq xen: get the maximum number of pirqs from xen xen: support pirq != irq * 'stable/xen-pcifront-0.8.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (27 commits) X86/PCI: Remove the dependency on isapnp_disable. xen: Update Makefile with CONFIG_BLOCK dependency for biomerge.c MAINTAINERS: Add myself to the Xen Hypervisor Interface and remove Chris Wright. x86: xen: Sanitse irq handling (part two) swiotlb-xen: On x86-32 builts, select SWIOTLB instead of depending on it. MAINTAINERS: Add myself for Xen PCI and Xen SWIOTLB maintainer. xen/pci: Request ACS when Xen-SWIOTLB is activated. xen-pcifront: Xen PCI frontend driver. 
xenbus: prevent warnings on unhandled enumeration values xenbus: Xen paravirtualised PCI hotplug support. xen/x86/PCI: Add support for the Xen PCI subsystem x86: Introduce x86_msi_ops msi: Introduce default_[teardown|setup]_msi_irqs with fallback. x86/PCI: Export pci_walk_bus function. x86/PCI: make sure _PAGE_IOMAP it set on pci mappings x86/PCI: Clean up pci_cache_line_size xen: fix shared irq device passthrough xen: Provide a variant of xen_poll_irq with timeout. xen: Find an unbound irq number in reverse order (high to low). xen: statically initialize cpu_evtchn_mask_p ... Fix up trivial conflicts in drivers/pci/Makefile
| * | xen: add the direct mapping area for ISA bus accessJuan Quintela2010-10-221-0/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | add the direct mapping area for ISA bus access when running as initial domain Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen: map a dummy page for local apic and ioapic in xen_set_fixmapJeremy Fitzhardinge2010-10-221-3/+20
| | | | | | | | | | | | | | | | | | Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| | |
| \ \
*-. \ \ Merge branches 'upstream/xenfs' and 'upstream/core' of ↵Linus Torvalds2010-10-261-76/+425
|\ \ \ \ | | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen * 'upstream/xenfs' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen: xen/privcmd: make privcmd visible in domU xen/privcmd: move remap_domain_mfn_range() to core xen code and export. privcmd: MMAPBATCH: Fix error handling/reporting xenbus: export xen_store_interface for xenfs xen/privcmd: make sure vma is ours before doing anything to it xen/privcmd: print SIGBUS faults xen/xenfs: set_page_dirty is supposed to return true if it dirties xen/privcmd: create address space to allow writable mmaps xen: add privcmd driver xen: add variable hypercall caller xen: add xen_set_domain_pte() xen: add /proc/xen/xsd_{kva,port} to xenfs * 'upstream/core' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen: (29 commits) xen: include xen/xen.h for definition of xen_initial_domain() xen: use host E820 map for dom0 xen: correctly rebuild mfn list list after migration. 
xen: improvements to VIRQ_DEBUG output xen: set up IRQ before binding virq to evtchn xen: ensure that all event channels start off bound to VCPU 0 xen/hvc: only notify if we actually sent something xen: don't add extra_pages for RAM after mem_end xen: add support for PAT xen: make sure xen_max_p2m_pfn is up to date xen: limit extra memory to a certain ratio of base xen: add extra pages for E820 RAM regions, even if beyond mem_end xen: make sure xen_extra_mem_start is beyond all non-RAM e820 xen: implement "extra" memory to reserve space for pages not present at boot xen: Use host-provided E820 map xen: don't map missing memory xen: defer building p2m mfn structures until kernel is mapped xen: add return value to set_phys_to_machine() xen: convert p2m to a 3 level tree xen: make install_p2mtop_page() static ... Fix up trivial conflict in arch/x86/xen/mmu.c, and fix the use of 'reserve_early()' - in the new memblock world order it is now 'memblock_x86_reserve_range()' instead. Pointed out by Jeremy.
| | * | xen: correctly rebuild mfn list list after migration.Ian Campbell2010-10-221-13/+37
| | | | Otherwise the second migration attempt fails because the mfn_list_list still refers to all the old mfns. We need to update the entries in both p2m_top_mfn and the mid_mfn pages which p2m_top_mfn refers to. In order to do this we need to keep track of the virtual addresses mapping the p2m_mid_mfn pages, because we cannot rely on mfn_to_virt(p2m_top_mfn[idx]): p2m_top_mfn[idx] will still contain the old MFN after a migration, which may now belong to another domain and hence have a different mapping in the m2p. Therefore add and maintain a third top level page, p2m_top_mfn_p[], which tracks the virtual addresses of the mfns contained in p2m_top_mfn[]. We also need to update the content of the p2m_mid_missing_mfn page on resume to refer to the page's new mfn. p2m_missing does not need updating since the migration process takes care of the leaf p2m pages for us. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
| | * | xen: add support for PATJeremy Fitzhardinge2010-10-221-3/+50
| | | | Convert Linux PAT entries into Xen ones when constructing ptes. Linux doesn't use _PAGE_PAT for ptes, so the only difference in the first 4 entries is that Linux uses _PAGE_PWT for WC, whereas Xen (and default) use it for WT. xen_pte_val does the inverse conversion. We hard-code assumptions about Linux's current PAT layout, but a warning on the wrmsr to MSR_IA32_CR_PAT should point out any problems. If necessary we could go to a more general table-based conversion between Linux and Xen PAT entries. hugetlbfs poses a problem at the moment: the x86 architecture uses the same flag for _PAGE_PAT and _PAGE_PSE, which changes meaning depending on which pagetable level we're using. At the moment this should be OK so long as nobody tries to do a pte_val on a hugetlbfs pte. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
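The conversion described above can be sketched as a pair of bit manipulations. This is an illustration under stated assumptions rather than the kernel's implementation: the function names are hypothetical, and it assumes that Xen exposes write-combining at the PAT slot selected by _PAGE_PAT alone, which the message does not state explicitly. The bit positions are the architectural x86 ones.

```c
#include <stdint.h>

/* x86 PTE cache-control bits (architectural bit positions). */
#define PAGE_PWT   (UINT64_C(1) << 3)
#define PAGE_PCD   (UINT64_C(1) << 4)
#define PAGE_PAT   (UINT64_C(1) << 7)
#define CACHE_MASK (PAGE_PWT | PAGE_PCD | PAGE_PAT)

/*
 * Linux -> Xen: Linux encodes WC as PWT alone, which Xen would read as WT.
 * Assumption: Xen's WC slot is selected by PAT alone, so swap the bits.
 */
uint64_t linux_to_xen_cache_bits(uint64_t pte)
{
    if ((pte & CACHE_MASK) == PAGE_PWT)
        pte = (pte & ~CACHE_MASK) | PAGE_PAT;
    return pte;
}

/* Xen -> Linux: the inverse mapping, PAT alone becomes PWT alone. */
uint64_t xen_to_linux_cache_bits(uint64_t pte)
{
    if ((pte & CACHE_MASK) == PAGE_PAT)
        pte = (pte & ~CACHE_MASK) | PAGE_PWT;
    return pte;
}
```

Every other cache encoding passes through unchanged, which matches the message's point that WC is the only difference among the entries Linux uses.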
| | * | xen: make sure xen_max_p2m_pfn is up to dateJeremy Fitzhardinge2010-10-221-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Keep xen_max_p2m_pfn up to date with the end of the extra memory we're adding. It is possible that it will be too high since memory may be truncated by a "mem=" option on the kernel command line, but that won't matter. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
| | * | xen: don't map missing memoryJeremy Fitzhardinge2010-10-221-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | When setting up a pte for a missing pfn (no matching mfn), just create an empty pte rather than a junk mapping. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
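A minimal sketch of the behaviour described above, using a toy lookup table in place of the real P2M. The table contents, the function names, and the simplified pte layout (frame number shifted by 12 bits, flags in the low 12 bits) are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define INVALID_P2M_ENTRY (~UINT64_C(0))
#define PAGE_SHIFT 12
#define FLAGS_MASK ((UINT64_C(1) << PAGE_SHIFT) - 1)

/* Toy pfn -> mfn table (hypothetical values); pfn 1 has no backing mfn. */
static const uint64_t toy_p2m[] = { 0x100, INVALID_P2M_ENTRY, 0x2a0 };

/* Look up the mfn for a pfn; out-of-range pfns count as missing. */
uint64_t toy_pfn_to_mfn(uint64_t pfn)
{
    if (pfn >= sizeof(toy_p2m) / sizeof(toy_p2m[0]))
        return INVALID_P2M_ENTRY;
    return toy_p2m[pfn];
}

/* Build a pte for pfn; a missing pfn yields an empty (zero) pte. */
uint64_t toy_make_pte(uint64_t pfn, uint64_t flags)
{
    uint64_t mfn = toy_pfn_to_mfn(pfn);

    if (mfn == INVALID_P2M_ENTRY)
        return 0;  /* empty pte instead of a junk mapping */

    return (mfn << PAGE_SHIFT) | (flags & FLAGS_MASK);
}
```

With this table, `toy_make_pte(1, 0x3)` returns 0 because pfn 1 is missing, while `toy_make_pte(0, 0x3)` encodes mfn 0x100 with the given flags.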
| | * | xen: defer building p2m mfn structures until kernel is mappedJeremy Fitzhardinge2010-10-221-3/+0
| | | | When building mfn parts of p2m structure, we rely on being able to use mfn_to_virt, which in turn requires the kernel to be mapped into the linear area (which is distinct from the kernel image mapping on 64-bit). Defer calling xen_build_mfn_list_list() until after xen_setup_kernel_pagetable(). Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
| | * | xen: add return value to set_phys_to_machine()Jeremy Fitzhardinge2010-10-221-5/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | set_phys_to_machine() can return false on failure, which means a memory allocation failure for the p2m structure. It can only fail if setting the mfn for a pfn in previously unused address space. It is guaranteed to succeed if you're setting a mapping to INVALID_P2M_ENTRY or updating the mfn for an existing pfn. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
| | * | xen: convert p2m to a 3 level treeJeremy Fitzhardinge2010-10-221-76/+242
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make the p2m structure a 3 level tree which covers the full possible physical space. The p2m structure contains mappings from the domain's pfns to system-wide mfns. The structure has 3 levels and two roots. The first root is for the domain's own use, and is linked with virtual addresses. The second is all mfn references, and is used by Xen on save/restore to allow it to update the p2m mapping for the domain. At boot, the domain builder provides a simple flat p2m array for all the initially present pages. We construct the two levels above that using the early_brk allocator. After early boot time, set_phys_to_machine() will allocate any missing levels using the normal kernel allocator (at GFP_KERNEL, so it must be called in a normal blocking context). Because the early_brk() API requires us to pre-reserve the maximum amount of memory we could allocate, there is still a CONFIG_XEN_MAX_DOMAIN_MEMORY config option, but its only negative side-effect is to increase the kernel's apparent bss size. However, since all unused brk memory is returned to the heap, there's no real downside to making it large. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
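To make the three-level layout concrete, the sketch below splits a pfn into top, mid, and leaf indices. It is an illustration only: the struct and function names are hypothetical, and it assumes 4096-byte pages holding 512 eight-byte entries at every level, as on a 64-bit build.

```c
#include <stdint.h>

/* Assumed geometry: 4096-byte pages with 8-byte entries at each level. */
enum { ENTRIES_PER_PAGE = 512 };

/* Position of a pfn within the three levels of the tree. */
struct p2m_index {
    unsigned top;   /* index into the top-level page   */
    unsigned mid;   /* index into the mid-level page   */
    unsigned leaf;  /* index into the leaf page of mfns */
};

/* Split a pfn into per-level indices, lowest level first. */
struct p2m_index p2m_split(uint64_t pfn)
{
    struct p2m_index idx;

    idx.leaf = (unsigned)(pfn % ENTRIES_PER_PAGE);
    idx.mid  = (unsigned)((pfn / ENTRIES_PER_PAGE) % ENTRIES_PER_PAGE);
    idx.top  = (unsigned)(pfn / ((uint64_t)ENTRIES_PER_PAGE * ENTRIES_PER_PAGE));
    return idx;
}
```

Under these assumptions one leaf page covers 512 pfns and one mid page covers 512 * 512 = 262144 pfns, so pfn 262144 is the first entry reached through the second top-level slot.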
| | * | xen: make install_p2mtop_page() staticJeremy Fitzhardinge2010-10-221-2/+2
| | | | | | | | | | | | | | | | Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
| | * | xen: set the actual extent of the mfn_list_listJeremy Fitzhardinge2010-10-221-1/+1
| | | | | | | | | | | | | | | | Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
| | * | xen: set shared_info->arch.max_pfn to max_p2m_pfnJeremy Fitzhardinge2010-10-221-1/+1
| | | | | | | | | | | | | | | | Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
| | * | xen: allocate level1_ident_pgtJeremy Fitzhardinge2010-10-221-2/+6
| | | | | | | | | | | | | | | | Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>