summaryrefslogtreecommitdiffstats
path: root/include/xen
Commit message (Collapse)AuthorAgeFilesLines
* x86/xen: Support kexec/kdump in HVM guests by doing a soft resetVitaly Kuznetsov2015-09-281-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently there is a number of issues preventing PVHVM Xen guests from doing successful kexec/kdump: - Bound event channels. - Registered vcpu_info. - PIRQ/emuirq mappings. - shared_info frame after XENMAPSPACE_shared_info operation. - Active grant mappings. Basically, newly booted kernel stumbles upon already set up Xen interfaces and there is no way to reestablish them. In Xen-4.7 a new feature called 'soft reset' is coming. A guest performing kexec/kdump operation is supposed to call SCHEDOP_shutdown hypercall with SHUTDOWN_soft_reset reason before jumping to new kernel. Hypervisor (with some help from toolstack) will do full domain cleanup (but keeping its memory and vCPU contexts intact) returning the guest to the state it had when it was first booted and thus allowing it to start over. Doing SHUTDOWN_soft_reset on Xen hypervisors which don't support it is probably OK as by default all unknown shutdown reasons cause domain destroy with a message in toolstack log: 'Unknown shutdown reason code 5. Destroying domain.' which gives a clue to what the problem is and eliminates false expectations. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
* Merge tag 'for-linus-4.3-rc0b-tag' of ↵Linus Torvalds2015-09-102-7/+7
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen terminology fixes from David Vrabel: "Use the correct GFN/BFN terms more consistently" * tag 'for-linus-4.3-rc0b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/xenbus: Rename the variable xen_store_mfn to xen_store_gfn xen/privcmd: Further s/MFN/GFN/ clean-up hvc/xen: Further s/MFN/GFN clean-up video/xen-fbfront: Further s/MFN/GFN clean-up xen/tmem: Use xen_page_to_gfn rather than pfn_to_gfn xen: Use correctly the Xen memory terminologies arm/xen: implement correctly pfn_to_mfn xen: Make clear that swiotlb and biomerge are dealing with DMA address
| * xen/privcmd: Further s/MFN/GFN/ clean-upJulien Grall2015-09-081-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The privcmd code is mixing the usage of GFN and MFN within the same functions which make the code difficult to understand when you only work with auto-translated guests. The privcmd driver is only dealing with GFN so replace all the mention of MFN into GFN. The ioctl structure used to map foreign change has been left unchanged given that the userspace is using it. Nonetheless, add a comment to explain the expected value within the "mfn" field. Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
| * xen: Use correctly the Xen memory terminologiesJulien Grall2015-09-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Based on include/xen/mm.h [1], Linux is mistakenly using MFN when GFN is meant, I suspect this is because the first support for Xen was for PV. This resulted in some misimplementation of helpers on ARM and confused developers about the expected behavior. For instance, with pfn_to_mfn, we expect to get an MFN based on the name. Although, if we look at the implementation on x86, it's returning a GFN. For clarity and avoid new confusion, replace any reference to mfn with gfn in any helpers used by PV drivers. The x86 code will still keep some reference of pfn_to_mfn which may be used by all kind of guests No changes as been made in the hypercall field, even though they may be invalid, in order to keep the same as the defintion in xen repo. Note that page_to_mfn has been renamed to xen_page_to_gfn to avoid a name to close to the KVM function gfn_to_page. Take also the opportunity to simplify simple construction such as pfn_to_mfn(page_to_pfn(page)) into xen_page_to_gfn. More complex clean up will come in follow-up patches. [1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=e758ed14f390342513405dd766e874934573e6cb Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
* | Merge tag 'for-linus-4.3-rc0-tag' of ↵Linus Torvalds2015-09-085-18/+136
|\ \ | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from David Vrabel: "Xen features and fixes for 4.3: - Convert xen-blkfront to the multiqueue API - [arm] Support binding event channels to different VCPUs. - [x86] Support > 512 GiB in a PV guests (off by default as such a guest cannot be migrated with the current toolstack). - [x86] PMU support for PV dom0 (limited support for using perf with Xen and other guests)" * tag 'for-linus-4.3-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (33 commits) xen: switch extra memory accounting to use pfns xen: limit memory to architectural maximum xen: avoid another early crash of memory limited dom0 xen: avoid early crash of memory limited dom0 arm/xen: Remove helpers which are PV specific xen/x86: Don't try to set PCE bit in CR4 xen/PMU: PMU emulation code xen/PMU: Intercept PMU-related MSR and APIC accesses xen/PMU: Describe vendor-specific PMU registers xen/PMU: Initialization code for Xen PMU xen/PMU: Sysfs interface for setting Xen PMU mode xen: xensyms support xen: remove no longer needed p2m.h xen: allow more than 512 GB of RAM for 64 bit pv-domains xen: move p2m list if conflicting with e820 map xen: add explicit memblock_reserve() calls for special pages mm: provide early_memremap_ro to establish read-only mapping xen: check for initrd conflicting with e820 map xen: check pre-allocated page tables for conflict with memory map xen: check for kernel memory conflicting with memory layout ...
| * xen: switch extra memory accounting to use pfnsJuergen Gross2015-09-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | Instead of using physical addresses for accounting of extra memory areas available for ballooning switch to pfns as this is much less error prone regarding partial pages. Reported-by: Roger Pau Monné <roger.pau@citrix.com> Tested-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
| * xen/PMU: Intercept PMU-related MSR and APIC accessesBoris Ostrovsky2015-08-201-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | Provide interfaces for recognizing accesses to PMU-related MSRs and LVTPC APIC and process these accesses in Xen PMU code. (The interrupt handler performs XENPMU_flush right away in the beginning since no PMU emulation is available. It will be added with a later patch). Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
| * xen/PMU: Initialization code for Xen PMUBoris Ostrovsky2015-08-202-0/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Map shared data structure that will hold CPU registers, VPMU context, V/PCPU IDs of the CPU interrupted by PMU interrupt. Hypervisor fills this information in its handler and passes it to the guest for further processing. Set up PMU VIRQ. Now that perf infrastructure will assume that PMU is available on a PV guest we need to be careful and make sure that accesses via RDPMC instruction don't cause fatal traps by the hypervisor. Provide a nop RDPMC handler. For the same reason avoid issuing a warning on a write to APIC's LVTPC. Both of these will be made functional in later patches. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
| * xen/PMU: Sysfs interface for setting Xen PMU modeBoris Ostrovsky2015-08-202-0/+60
| | | | | | | | | | | | | | | | Set Xen's PMU mode via /sys/hypervisor/pmu/pmu_mode. Add XENPMU hypercall. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
| * xen: xensyms supportBoris Ostrovsky2015-08-201-0/+18
| | | | | | | | | | | | | | | | | | Export Xen symbols to dom0 via /proc/xen/xensyms (similar to /proc/kallsyms). Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
| * xen: sync with xen headersJuergen Gross2015-08-201-15/+20
| | | | | | | | | | | | | | | | | | | | Use the newest headers from the xen tree to get some new structure layouts. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
| * xen/events: Support event channel rebind on ARMJulien Grall2015-08-201-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the event channel rebind code is gated with the presence of the vector callback. The virtual interrupt controller on ARM has the concept of per-CPU interrupt (PPI) which allow us to support per-VCPU event channel. Therefore there is no need of vector callback for ARM. Xen is already using a free PPI to notify the guest VCPU of an event. Furthermore, the xen code initialization in Linux (see arch/arm/xen/enlighten.c) is requesting correctly a per-CPU IRQ. Introduce new helper xen_support_evtchn_rebind to allow architecture decide whether rebind an event is support or not. It will always return true on ARM and keep the same behavior on x86. This is also allow us to drop the usage of xen_have_vector_callback entirely in the ARM code. Signed-off-by: Julien Grall <julien.grall@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
* | xen-netback: add support for multicast controlPaul Durrant2015-09-021-1/+7
|/ | | | | | | | | | | | | | | | | | | | | | | | Xen's PV network protocol includes messages to add/remove ethernet multicast addresses to/from a filter list in the backend. This allows the frontend to request the backend only forward multicast packets which are of interest thus preventing unnecessary noise on the shared ring. The canonical netif header in git://xenbits.xen.org/xen.git specifies the message format (two more XEN_NETIF_EXTRA_TYPEs) so the minimal necessary changes have been pulled into include/xen/interface/io/netif.h. To prevent the frontend from extending the multicast filter list arbitrarily a limit (XEN_NETBK_MCAST_MAX) has been set to 64 entries. This limit is not specified by the protocol and so may change in future. If the limit is reached then the next XEN_NETIF_EXTRA_TYPE_MCAST_ADD sent by the frontend will be failed with NETIF_RSP_ERROR. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* xen/events: don't bind non-percpu VIRQs with percpu chipDavid Vrabel2015-05-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | A non-percpu VIRQ (e.g., VIRQ_CONSOLE) may be freed on a different VCPU than it is bound to. This can result in a race between handle_percpu_irq() and removing the action in __free_irq() because handle_percpu_irq() does not take desc->lock. The interrupt handler sees a NULL action and oopses. Only use the percpu chip/handler for per-CPU VIRQs (like VIRQ_TIMER). # cat /proc/interrupts | grep virq 40: 87246 0 xen-percpu-virq timer0 44: 0 0 xen-percpu-virq debug0 47: 0 20995 xen-percpu-virq timer1 51: 0 0 xen-percpu-virq debug1 69: 0 0 xen-dyn-virq xen-pcpu 74: 0 0 xen-dyn-virq mce 75: 29 0 xen-dyn-virq hvc_console Signed-off-by: David Vrabel <david.vrabel@citrix.com> Cc: <stable@vger.kernel.org>
* xen: Suspend ticks on all CPUs during suspendBoris Ostrovsky2015-04-291-0/+1
| | | | | | | | | | | | | | | | | | | | | Commit 77e32c89a711 ("clockevents: Manage device's state separately for the core") decouples clockevent device's modes from states. With this change when a Xen guest tries to resume, it won't be calling its set_mode op which needs to be done on each VCPU in order to make the hypervisor aware that we are in oneshot mode. This happens because clockevents_tick_resume() (which is an intermediate step of resuming ticks on a processor) doesn't call clockevents_set_state() anymore and because during suspend clockevent devices on all VCPUs (except for the one doing the suspend) are left in ONESHOT state. As result, during resume the clockevents state machine will assume that device is already where it should be and doesn't need to be updated. To avoid this problem we should suspend ticks on all VCPUs during suspend. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
* xen/grant: introduce func gnttab_unmap_refs_sync()Bob Liu2015-04-271-0/+1
| | | | | | | | | | | There are several place using gnttab async unmap and wait for completion, so move the common code to a function gnttab_unmap_refs_sync(). Signed-off-by: Bob Liu <bob.liu@oracle.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
* xenbus_client: Extend interface to support multi-page ringWei Liu2015-04-151-6/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | Originally Xen PV drivers only use single-page ring to pass along information. This might limit the throughput between frontend and backend. The patch extends Xenbus driver to support multi-page ring, which in general should improve throughput if ring is the bottleneck. Changes to various frontend / backend to adapt to the new interface are also included. Affected Xen drivers: * blkfront/back * netfront/back * pcifront/back * scsifront/back * vtpmfront The interface is documented, as before, in xenbus_client.c. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: Bob Liu <bob.liu@oracle.com> Cc: Konrad Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
* xen/privcmd: improve performance of MMAPBATCH_V2David Vrabel2015-03-161-4/+41
| | | | | | | | | | | | | | | | | | | | | | | | Make the IOCTL_PRIVCMD_MMAPBATCH_V2 (and older V1 version) map multiple frames at a time rather than one at a time, despite the pages being non-consecutive GFNs. xen_remap_foreign_mfn_array() is added which maps an array of GFNs (instead of a consecutive range of GFNs). Since per-frame errors are returned in an array, privcmd must set the MMAPBATCH_V1 error bits as part of the "report errors" phase, after all the frames are mapped. Migrate times are significantly improved (when using a PV toolstack domain). For example, for an idle 12 GiB PV guest: Before After real 0m38.179s 0m26.868s user 0m15.096s 0m13.652s sys 0m28.988s 0m18.732s Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
* xen: unify foreign GFN map/unmap for auto-xlated physmap guestsDavid Vrabel2015-03-161-0/+8
| | | | | | | | | | | | | | | Auto-translated physmap guests (arm, arm64 and x86 PVHVM/PVH) map and unmap foreign GFNs using the same method (updating the physmap). Unify the two arm and x86 implementations into one commont one. Note that on arm and arm64, the correct error code will be returned (instead of always -EFAULT) and map/unmap failure warnings are no longer printed. These changes are required if the foreign domain is paging (-ENOENT failures are expected and must be propagated up to the caller). Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
* xen: synchronize include/xen/interface/xen.h with xenJuergen Gross2015-03-161-1/+5
| | | | | | | | | The header include/xen/interface/xen.h doesn't contain all definitions from Xen's version of that header. Update it accordingly. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
* xen: Remove trailing semicolon from xenbus_register_frontend() definitionYuval Shaia2015-03-021-2/+2
| | | | | Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
* x86/xen: allow privcmd hypercalls to be preemptedDavid Vrabel2015-02-231-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hypercalls submitted by user space tools via the privcmd driver can take a long time (potentially many 10s of seconds) if the hypercall has many sub-operations. A fully preemptible kernel may deschedule such as task in any upcall called from a hypercall continuation. However, in a kernel with voluntary or no preemption, hypercall continuations in Xen allow event handlers to be run but the task issuing the hypercall will not be descheduled until the hypercall is complete and the ioctl returns to user space. These long running tasks may also trigger the kernel's soft lockup detection. Add xen_preemptible_hcall_begin() and xen_preemptible_hcall_end() to bracket hypercalls that may be preempted. Use these in the privcmd driver. When returning from an upcall, call xen_maybe_preempt_hcall() which adds a schedule point if if the current task was within a preemptible hypercall. Since _cond_resched() can move the task to a different CPU, clear and set xen_in_preemptible_hcall around the call. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds2015-02-101-0/+5
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking updates from David Miller: 1) More iov_iter conversion work from Al Viro. [ The "crypto: switch af_alg_make_sg() to iov_iter" commit was wrong, and this pull actually adds an extra commit on top of the branch I'm pulling to fix that up, so that the pre-merge state is ok. - Linus ] 2) Various optimizations to the ipv4 forwarding information base trie lookup implementation. From Alexander Duyck. 3) Remove sock_iocb altogether, from CHristoph Hellwig. 4) Allow congestion control algorithm selection via routing metrics. From Daniel Borkmann. 5) Make ipv4 uncached route list per-cpu, from Eric Dumazet. 6) Handle rfs hash collisions more gracefully, also from Eric Dumazet. 7) Add xmit_more support to r8169, e1000, and e1000e drivers. From Florian Westphal. 8) Transparent Ethernet Bridging support for GRO, from Jesse Gross. 9) Add BPF packet actions to packet scheduler, from Jiri Pirko. 10) Add support for uniqu flow IDs to openvswitch, from Joe Stringer. 11) New NetCP ethernet driver, from Muralidharan Karicheri and Wingman Kwok. 12) More sanely handle out-of-window dupacks, which can result in serious ACK storms. From Neal Cardwell. 13) Various rhashtable bug fixes and enhancements, from Herbert Xu, Patrick McHardy, and Thomas Graf. 14) Support xmit_more in be2net, from Sathya Perla. 15) Group Policy extensions for vxlan, from Thomas Graf. 16) Remove Checksum Offload support for vxlan, from Tom Herbert. 17) Like ipv4, support lockless transmit over ipv6 UDP sockets. From Vlad Yasevich. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1494+1 commits) crypto: fix af_alg_make_sg() conversion to iov_iter ipv4: Namespecify TCP PMTU mechanism i40e: Fix for stats init function call in Rx setup tcp: don't include Fast Open option in SYN-ACK on pure SYN-data openvswitch: Only set TUNNEL_VXLAN_OPT if VXLAN-GBP metadata is set ipv6: Make __ipv6_select_ident static ipv6: Fix fragment id assignment on LE arches. bridge: Fix inability to add non-vlan fdb entry net: Mellanox: Delete unnecessary checks before the function call "vunmap" cxgb4: Add support in cxgb4 to get expansion rom version via ethtool ethtool: rename reserved1 memeber in ethtool_drvinfo for expansion ROM version net: dsa: Remove redundant phy_attach() IB/mlx4: Reset flow support for IB kernel ULPs IB/mlx4: Always use the correct port for mirrored multicast attachments net/bonding: Fix potential bad memory access during bonding events tipc: remove tipc_snprintf tipc: nl compat add noop and remove legacy nl framework tipc: convert legacy nl stats show to nl compat tipc: convert legacy nl net id get to nl compat tipc: convert legacy nl net id set to nl compat ...
| * Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2015-01-151-0/+51
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/xen-netfront.c Minor overlapping changes in xen-netfront.c, mostly to do with some buffer management changes alongside the split of stats into TX and RX. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | xen: add page_to_mfn()David Vrabel2015-01-141-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | pfn_to_mfn(page_to_pfn(p)) is a common use case so add a generic helper for it. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | xen/gntdev: mark userspace PTEs as special on x86 PV guestsDavid Vrabel2015-01-282-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In an x86 PV guest, get_user_pages_fast() on a userspace address range containing foreign mappings does not work correctly because the M2P lookup of the MFN from a userspace PTE may return the wrong page. Force get_user_pages_fast() to fail on such addresses by marking the PTEs as special. If Xen has XENFEAT_gnttab_map_avail_bits (available since at least 4.0), we can do so efficiently in the grant map hypercall. Otherwise, it needs to be done afterwards. This is both inefficient and racy (the mapping is visible to the task before we fixup the PTEs), but will be fine for well-behaved applications that do not use the mapping until after the mmap() system call returns. Guests with XENFEAT_auto_translated_physmap (ARM and x86 HVM or PVH) do not need this since get_user_pages() has always worked correctly for them. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
* | | xen/grant-table: add a mechanism to safely unmap pages that are in useJennifer Herbert2015-01-281-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce gnttab_unmap_refs_async() that can be used to safely unmap pages that may be in use (ref count > 1). If the pages are in use the unmap is deferred and retried later. This polling is not very clever but it should be good enough if the cases where the delay is necessary are rare. The initial delay is 5 ms and is increased linearly on each subsequent retry (to reduce load if the page is in use for a long time). This is needed to allow block backends using grant mapping to safely use network storage (block or filesystem based such as iSCSI or NFS). The network storage driver may complete a block request whilst there is a queued network packet retry (because the ack from the remote end races with deciding to queue the retry). The pages for the retried packet would be grant unmapped and the network driver (or hardware) would access the unmapped page. Signed-off-by: Jennifer Herbert <jennifer.herbert@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
* | | xen: mark grant mapped pages as foreignJennifer Herbert2015-01-281-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the "foreign" page flag to mark pages that have a grant map. Use page->private to store information of the grant (the granting domain and the grant reference). Signed-off-by: Jennifer Herbert <jennifer.herbert@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
* | | xen/grant-table: add helpers for allocating pagesDavid Vrabel2015-01-281-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | Add gnttab_alloc_pages() and gnttab_free_pages() to allocate/free pages suitable to for granted maps. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
* | | xen/grant-table: pre-populate kernel unmap ops for xen_gnttab_unmap_refs()David Vrabel2015-01-281-1/+1
| |/ |/| | | | | | | | | | | | | | | | | | | When unmapping grants, instead of converting the kernel map ops to unmap ops on the fly, pre-populate the set of unmap ops. This allows the grant unmap for the kernel mappings to be trivially batched in the future. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
* | x86/xen: properly retrieve NMI reasonJan Beulich2015-01-131-0/+51
|/ | | | | | | | | | | | | | | | | | | | Using the native code here can't work properly, as the hypervisor would normally have cleared the two reason bits by the time Dom0 gets to see the NMI (if passed to it at all). There's a shared info field for this, and there's an existing hook to use - just fit the two together. This is particularly relevant so that NMIs intended to be handled by APEI / GHES actually make it to the respective handler. Note that the hook can (and should) be used irrespective of whether being in Dom0, as accessing port 0x61 in a DomU would be even worse, while the shared info field would just hold zero all the time. Note further that hardware NMI handling for PVH doesn't currently work anyway due to missing code in the hypervisor (but it is expected to work the native rather than the PV way). Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
* xen/arm: introduce GNTTABOP_cache_flushStefano Stabellini2014-12-041-0/+19
| | | | | | | | | | | | | Introduce support for new hypercall GNTTABOP_cache_flush. Use it to perform cache flashing on pages used for dma when necessary. If GNTTABOP_cache_flush is supported by the hypervisor, we don't need to bounce dma map operations that involve foreign grants and non-coherent devices. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
* xen/arm: remove handling of XENFEAT_grant_map_identityStefano Stabellini2014-12-041-3/+0
| | | | | | | | | | The feature has been removed from Xen. Also Linux cannot use it on ARM32 without CONFIG_ARM_LPAE. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
* xen: remove DEFINE_XENBUS_DRIVER() macroDavid Vrabel2014-10-061-9/+12
| | | | | | | | | | The DEFINE_XENBUS_DRIVER() macro looks a bit weird and causes sparse errors. Replace the uses with standard structure definitions instead. This is similar to pci and usb device registration. Signed-off-by: David Vrabel <david.vrabel@citrix.com>
* xen: sync some headers with xen treeJuergen Gross2014-10-032-26/+294
| | | | | | | | | | | | | | To be able to use an initially unmapped initrd with xen the following header files must be synced to a newer version from the xen tree: include/xen/interface/elfnote.h include/xen/interface/xen.h As the KEXEC and DUMPCORE related ELFNOTES are not relevant for the kernel they are omitted from elfnote.h. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
* xen: Add Xen pvSCSI protocol descriptionJuergen Gross2014-09-231-0/+229
| | | | | | | | | | | | | | | | | Add the definition of pvSCSI protocol used between the pvSCSI frontend in a XEN domU and the pvSCSI backend in a XEN driver domain (usually Dom0). This header was originally provided by Fujitsu for Xen based on Linux 2.6.18. Changes are: - Added comments. - Adapt to Linux style guide. - Add support for larger SG-lists by putting them in an own granted page. - Remove stale definitions. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
* xen/events: support threaded irqs for interdomain event channelsJuergen Gross2014-09-231-0/+2
| | | | | | | | | | | | | | Export bind_interdomain_evtchn_to_irq() so drivers can use threaded interrupt handlers with: irq = bind_interdomain_evtchn_to_irq(remote_dom, remote_port); if (irq < 0) /* error */ ret = request_threaded_irq(...); Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
* xen/arm: introduce XENFEAT_grant_map_identityStefano Stabellini2014-09-111-0/+3
| | | | | | | | | | | | The flag tells us that the hypervisor maps a grant page to guest physical address == machine address of the page in addition to the normal grant mapping address. It is needed to properly issue cache maintenance operation at the completion of a DMA operation involving a foreign grant. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Tested-by: Denis Schneider <v1ne2go@gmail.com>
* Merge tag 'stable/for-linus-3.17-rc0-tag' of ↵Linus Torvalds2014-08-071-29/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull Xen updates from David Vrabel: - remove unused V2 grant table support - note that Konrad is xen-blkkback/front maintainer - add 'xen_nopv' option to disable PV extentions for x86 HVM guests - misc minor cleanups * tag 'stable/for-linus-3.17-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen-pciback: Document the 'quirks' sysfs file xen/pciback: Fix error return code in xen_pcibk_attach() xen/events: drop negativity check of unsigned parameter xen/setup: Remove Identity Map Debug Message xen/events/fifo: remove a unecessary use of BM() xen/events/fifo: ensure all bitops are properly aligned even on x86 xen/events/fifo: reset control block and local HEADs on resume xen/arm: use BUG_ON xen/grant-table: remove support for V2 tables x86/xen: safely map and unmap grant frames when in atomic context MAINTAINERS: Make me the Xen block subsystem (front and back) maintainer xen: Introduce 'xen_nopv' to disable PV extensions for HVM guests.
| * xen/grant-table: remove support for V2 tablesDavid Vrabel2014-07-141-29/+1
| | | | | | | | | | | | | | | | Since 11c7ff17c9b6dbf3a4e4f36be30ad531a6cf0ec9 (xen/grant-table: Force to use v1 of grants.) the code for V2 grant tables is not used. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * x86/xen: safely map and unmap grant frames when in atomic contextDavid Vrabel2014-07-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | arch_gnttab_map_frames() and arch_gnttab_unmap_frames() are called in atomic context but were calling alloc_vm_area() which might sleep. Also, if a driver attempts to allocate a grant ref from an interrupt and the table needs expanding, then the CPU may already by in lazy MMU mode and apply_to_page_range() will BUG when it tries to re-enable lazy MMU mode. These two functions are only used in PV guests. Introduce arch_gnttab_init() to allocates the virtual address space in advance. Avoid the use of apply_to_page_range() by using saving and using the array of PTE addresses from the alloc_vm_area() call. N.B. 'alloc_vm_area' pre-allocates the pagetable so there is no need to worry about having to do a PGD/PUD/PMD walk (like apply_to_page_range does) and we can instead do set_pte. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> ---- [v2: Add comment about alloc_vm_area] [v3: Fix compile error found by 0-day bot]
* | Merge branch 'x86-efi-for-linus' of ↵Linus Torvalds2014-08-042-0/+134
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI changes from Ingo Molnar: "Main changes in this cycle are: - arm64 efi stub fixes, preservation of FP/SIMD registers across firmware calls, and conversion of the EFI stub code into a static library - Ard Biesheuvel - Xen EFI support - Daniel Kiper - Support for autoloading the efivars driver - Lee, Chun-Yi - Use the PE/COFF headers in the x86 EFI boot stub to request that the stub be loaded with CONFIG_PHYSICAL_ALIGN alignment - Michael Brown - Consolidate all the x86 EFI quirks into one file - Saurabh Tangri - Additional error logging in x86 EFI boot stub - Ulf Winkelvos - Support loading initrd above 4G in EFI boot stub - Yinghai Lu - EFI reboot patches for ACPI hardware reduced platforms" * 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits) efi/arm64: Handle missing virtual mapping for UEFI System Table arch/x86/xen: Silence compiler warnings xen: Silence compiler warnings x86/efi: Request desired alignment via the PE/COFF headers x86/efi: Add better error logging to EFI boot stub efi: Autoload efivars efi: Update stale locking comment for struct efivars arch/x86: Remove efi_set_rtc_mmss() arch/x86: Replace plain strings with constants xen: Put EFI machinery in place xen: Define EFI related stuff arch/x86: Remove redundant set_bit(EFI_MEMMAP) call arch/x86: Remove redundant set_bit(EFI_SYSTEM_TABLES) call efi: Introduce EFI_PARAVIRT flag arch/x86: Do not access EFI memory map if it is not available efi: Use early_mem*() instead of early_io*() arch/ia64: Define early_memunmap() x86/reboot: Add EFI reboot quirk for ACPI Hardware Reduced flag efi/reboot: Allow powering off machines using EFI efi/reboot: Add generic wrapper around EfiResetSystem() ...
| * | xen: Silence compiler warningsDaniel Kiper2014-07-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add inline keyword to silence the following compiler warnings if xen_efi_probe() is not used: CC arch/x86/xen/setup.o In file included from arch/x86/xen/xen-ops.h:7:0, from arch/x86/xen/setup.c:31: include/xen/xen-ops.h:43:35: warning: ‘xen_efi_probe’ defined but not used [-Wunused-function] Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
| * | xen: Put EFI machinery in placeDaniel Kiper2014-07-181-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch enables EFI usage under Xen dom0. Standard EFI Linux Kernel infrastructure cannot be used because it requires direct access to EFI data and code. However, in dom0 case it is not possible because above mentioned EFI stuff is fully owned and controlled by Xen hypervisor. In this case all calls from dom0 to EFI must be requested via special hypercall which in turn executes relevant EFI code in behalf of dom0. When dom0 kernel boots it checks for EFI availability on a machine. If it is detected then artificial EFI system table is filled. Native EFI callas are replaced by functions which mimics them by calling relevant hypercall. Later pointer to EFI system table is passed to standard EFI machinery and it continues EFI subsystem initialization taking into account that there is no direct access to EFI boot services, runtime, tables, structures, etc. After that system runs as usual. This patch is based on Jan Beulich and Tang Liang work. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Tang Liang <liang.tang@oracle.com> Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com Signed-off-by: Matt Fleming <matt.fleming@intel.com>
| * | xen: Define EFI related stuffDaniel Kiper2014-07-181-0/+123
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | Define constants and structures which are needed to properly execute EFI related hypercall in Xen dom0. This patch is based on Jan Beulich and Tang Liang work. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Tang Liang <liang.tang@oracle.com> Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
* | Merge tag 'stable/for-linus-3.16-rc7-tag' of ↵Linus Torvalds2014-07-301-0/+1
|\ \ | |/ |/| | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull Xen fix from David Vrabel: "Fix BUG when trying to expand the grant table. This seems to occur often during boot with Ubuntu 14.04 PV guests" * tag 'stable/for-linus-3.16-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/xen: safely map and unmap grant frames when in atomic context
| * x86/xen: safely map and unmap grant frames when in atomic contextDavid Vrabel2014-07-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | arch_gnttab_map_frames() and arch_gnttab_unmap_frames() are called in atomic context but were calling alloc_vm_area() which might sleep. Also, if a driver attempts to allocate a grant ref from an interrupt and the table needs expanding, then the CPU may already by in lazy MMU mode and apply_to_page_range() will BUG when it tries to re-enable lazy MMU mode. These two functions are only used in PV guests. Introduce arch_gnttab_init() to allocates the virtual address space in advance. Avoid the use of apply_to_page_range() by using saving and using the array of PTE addresses from the alloc_vm_area() call (which ensures that the required page tables are pre-allocated). Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds2014-06-121-0/+53
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking updates from David Miller: 1) Seccomp BPF filters can now be JIT'd, from Alexei Starovoitov. 2) Multiqueue support in xen-netback and xen-netfront, from Andrew J Benniston. 3) Allow tweaking of aggregation settings in cdc_ncm driver, from Bjørn Mork. 4) BPF now has a "random" opcode, from Chema Gonzalez. 5) Add more BPF documentation and improve test framework, from Daniel Borkmann. 6) Support TCP fastopen over ipv6, from Daniel Lee. 7) Add software TSO helper functions and use them to support software TSO in mvneta and mv643xx_eth drivers. From Ezequiel Garcia. 8) Support software TSO in fec driver too, from Nimrod Andy. 9) Add Broadcom SYSTEMPORT driver, from Florian Fainelli. 10) Handle broadcasts more gracefully over macvlan when there are large numbers of interfaces configured, from Herbert Xu. 11) Allow more control over fwmark used for non-socket based responses, from Lorenzo Colitti. 12) Do TCP congestion window limiting based upon measurements, from Neal Cardwell. 13) Support busy polling in SCTP, from Neal Horman. 14) Allow RSS key to be configured via ethtool, from Venkata Duvvuru. 15) Bridge promisc mode handling improvements from Vlad Yasevich. 16) Don't use inetpeer entries to implement ID generation any more, it performs poorly, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits) rtnetlink: fix userspace API breakage for iproute2 < v3.9.0 tcp: fixing TLP's FIN recovery net: fec: Add software TSO support net: fec: Add Scatter/gather support net: fec: Increase buffer descriptor entry number net: fec: Factorize feature setting net: fec: Enable IP header hardware checksum net: fec: Factorize the .xmit transmit function bridge: fix compile error when compiling without IPv6 support bridge: fix smatch warning / potential null pointer dereference via-rhine: fix full-duplex with autoneg disable bnx2x: Enlarge the dorq threshold for VFs bnx2x: Check for UNDI in uncommon branch bnx2x: Fix 1G-baseT link bnx2x: Fix link for KR with swapped polarity lane sctp: Fix sk_ack_backlog wrap-around problem net/core: Add VF link state control policy net/fsl: xgmac_mdio is dependent on OF_MDIO net/fsl: Make xgmac_mdio read error message useful net_sched: drr: warn when qdisc is not work conserving ...
| * | xen-net{back, front}: Document multi-queue feature in netif.hAndrew J. Bennieston2014-06-041-0/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Document the multi-queue feature in terms of XenStore keys to be written by the backend and by the frontend. Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge branch 'for-3.16/drivers' of git://git.kernel.dk/linux-block into nextLinus Torvalds2014-06-021-1/+1
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block driver changes from Jens Axboe: "Now that the core bits are in, here's the pull request for the driver related changes for 3.16. Nothing out of the ordinary here, mostly business as usual. There are a few pulls of for-3.16/core into this branch, which were done when the blk-mq was modified after the mtip32xx conversion was put in. The pull request contains: - skd and cciss converted to use pci_enable_msix_exact(). From Alexander Gordeev. - A few mtip32xx fixes from Asai @ Micron. - The conversion of mtip32xx from make_request_fn to blk-mq, and a later small fix for that conversion on quiescing for non-queued IO. From me. - A fix for bsg to use an exported function to check whether this driver is request based or not. Needed updating for blk-mq, which is request based, but does not have a request_fn hook. From me. - Small floppy bug fix from Jiri. - A series of cleanups for the cdrom uniform layer from Joe Perches. Gets rid of various old ugly macros, making the code conform more to the modern coding style. - A series of patches for drbd from the drbd crew (Lars Ellenberg and Philipp Reisner). - A use-after-free fix for null_blk from Ming Lei. - Also from Ming Lei is a performance patch for virtio-blk, which can net us a 3x win on kvm platforms where world notification is expensive. - Ming Lei also fixed a stall issue in virtio-blk, due to a race between queue start/stop and resource limits. - A small batch of fixes for xen-blk{back,front} from Olaf Hering and Valentin Priescu" * 'for-3.16/drivers' of git://git.kernel.dk/linux-block: (54 commits) block: virtio_blk: don't hold spin lock during world switch xen-blkback: defer freeing blkif to avoid blocking xenwatch xen blkif.h: fix comment typo in discard-alignment xen/blkback: disable discard feature if requested by toolstack xen-blkfront: remove type check from blkfront_setup_discard floppy: do not corrupt bio.bi_flags when reading block 0 mtip32xx: move error handling to service thread virtio_blk: fix race between start and stop queue mtip32xx: stop block hardware queues before quiescing IO mtip32xx: blk_mq_init_queue() returns an ERR_PTR mtip32xx: convert to use blk-mq cdrom: Remove unnecessary prototype for cdrom_get_disc_info cdrom: Remove unnecessary prototype for cdrom_mrw_exit cdrom: Remove cdrom_count_tracks prototype cdrom: Remove cdrom_get_next_writeable prototype cdrom: Remove cdrom_get_last_written prototype cdrom: Move mmc_ioctls above cdrom_ioctl to remove unnecessary prototype cdrom: Remove unnecessary sanitize_format prototype cdrom: Remove unnecessary check_for_audio_disc prototype cdrom: Remove prototype for open_for_data ...
OpenPOWER on IntegriCloud