summaryrefslogtreecommitdiffstats
path: root/arch/s390/include
Commit message (Collapse)AuthorAgeFilesLines
* perf kvm: Add stat support on s390Alexander Yarygin2014-07-162-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | On s390, the vmexit event has a tree-like structure: between exit_event_begin and exit_event_end several other events may happen and with each of them refining the previous ones. This patch adds a decoder for such events to the generic code and also the files <asm/kvm_perf.h> and kvm-stat.c for s390. Commands 'perf kvm stat record', 'report' and 'live' are supported. Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1404397747-20939-5-git-send-email-yarygin@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* s390/compat: correct ucontext layout for high gprsMartin Schwidefsky2014-06-161-2/+6
| | | | | | | | | | | | | The uc_sigmask definition in the kernel differs from the one in the glibc, the kernel uc_sigmask has 64 bits while the glibc verison is 1024 bits. The extension of the ucontext structure for 64-bit register support for 31-bit compat processes added a new field uc_gprs_high which starts 8 bytes after the uc_sigmask field. As the glibc view of the ucontext assumes a size of 128 bytes for uc_sigmask add a 120 byte padding to the kernel structure ucontext_extended after the 8 byte uc_sigmask. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/uaccess: always load the kernel ASCE after task switchMartin Schwidefsky2014-06-102-21/+16
| | | | | | | | | | | | | | | This patch fixes a problem introduced with git commit beef560b4cdfafb2 "s390/uaccess: simplify control register updates". The switch_mm function is not called if the next process is a kernel thread without an attached mm or is a nop if the mm does not change. But CR1 still needs to be loaded with the kernel ASCE in case the code returns to a uaccess function that uses the secondary space mode. In addition move the set_fs call from finish_arch_switch to finish_arch_post_lock_switch and then remove finish_arch_switch. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm into nextLinus Torvalds2014-06-0411-108/+579
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull KVM updates from Paolo Bonzini: "At over 200 commits, covering almost all supported architectures, this was a pretty active cycle for KVM. Changes include: - a lot of s390 changes: optimizations, support for migration, GDB support and more - ARM changes are pretty small: support for the PSCI 0.2 hypercall interface on both the guest and the host (the latter acked by Catalin) - initial POWER8 and little-endian host support - support for running u-boot on embedded POWER targets - pretty large changes to MIPS too, completing the userspace interface and improving the handling of virtualized timer hardware - for x86, a larger set of changes is scheduled for 3.17. Still, we have a few emulator bugfixes and support for running nested fully-virtualized Xen guests (para-virtualized Xen guests have always worked). And some optimizations too. The only missing architecture here is ia64. It's not a coincidence that support for KVM on ia64 is scheduled for removal in 3.17" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (203 commits) KVM: add missing cleanup_srcu_struct KVM: PPC: Book3S PR: Rework SLB switching code KVM: PPC: Book3S PR: Use SLB entry 0 KVM: PPC: Book3S HV: Fix machine check delivery to guest KVM: PPC: Book3S HV: Work around POWER8 performance monitor bugs KVM: PPC: Book3S HV: Make sure we don't miss dirty pages KVM: PPC: Book3S HV: Fix dirty map for hugepages KVM: PPC: Book3S HV: Put huge-page HPTEs in rmap chain for base address KVM: PPC: Book3S HV: Fix check for running inside guest in global_invalidates() KVM: PPC: Book3S: Move KVM_REG_PPC_WORT to an unused register number KVM: PPC: Book3S: Add ONE_REG register names that were missed KVM: PPC: Add CAP to indicate hcall fixes KVM: PPC: MPIC: Reset IRQ source private members KVM: PPC: Graciously fail broken LE hypercalls PPC: ePAPR: Fix hypercall on LE guest KVM: PPC: BOOK3S: Remove open coded make_dsisr in alignment handler KVM: PPC: BOOK3S: Always use the saved DAR value PPC: KVM: Make NX bit available with magic page KVM: PPC: Disable NX for old magic page using guests KVM: PPC: BOOK3S: HV: Add mixed page-size support for guest ...
| * KVM: s390: Intercept the tprot instructionMatthew Rosato2014-05-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Based on original patch from Jeng-fang (Nick) Wang When standby memory is specified for a guest Linux, but no virtual memory has been allocated on the Qemu host backing that guest, the guest memory detection process encounters a memory access exception which is not thrown from the KVM handle_tprot() instruction-handler function. The access exception comes from sie64a returning EFAULT, which then passes an addressing exception to the guest. Unfortunately this does not the proper PSW fixup (nullifying vs. suppressing) so the guest will get a fault for the wrong address. Let's just intercept the tprot instruction all the time to do the right thing and not go the page fault handler path for standby memory. tprot is only used by Linux during startup so some exits should be ok. Without this patch, standby memory cannot be used with KVM. Signed-off-by: Nick Wang <jfwang@us.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Tested-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * KVM: s390: split SIE state guest prefix fieldMichael Mueller2014-05-161-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch splits the SIE state guest prefix at offset 4 into a prefix bit field. Additionally it provides the access functions: - kvm_s390_get_prefix() - kvm_s390_set_prefix() to access the prefix per vcpu. Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * s390/sclp: add sclp_get_ibc functionMichael Mueller2014-05-161-0/+1
| | | | | | | | | | | | | | | | | | | | The patch adds functionality to retrieve the IBC configuration by means of function sclp_get_ibc(). Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * KVM: s390: interpretive execution of SIGP EXTERNAL CALLDavid Hildenbrand2014-05-161-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the sigp interpretation facility is installed, most SIGP EXTERNAL CALL operations will be interpreted instead of intercepted. A partial execution interception will occurr at the sending cpu only if the target cpu is in the wait state ("W" bit in the cpuflags set). Instruction interception will only happen in error cases (e.g. cpu addr invalid). As a sending cpu might set the external call interrupt pending flags at the target cpu at every point in time, we can't handle this kind of interrupt using our kvm interrupt injection mechanism. The injection will be done automatically by the SIE when preparing the start of the target cpu. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> CC: Thomas Huth <thuth@linux.vnet.ibm.com> [Adopt external call injection to check for sigp interpretion] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * KVM: s390: decoder of SIE intercepted instructionsAlexander Yarygin2014-05-161-0/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a new decoder of SIE intercepted instructions. The decoder implemented as a macro and potentially can be used in both kernelspace and userspace. Note that this simplified instruction decoder is only intended to be used with the subset of instructions that may cause a SIE intercept. Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * KVM: s390: add sie exit reasons tablesAlexander Yarygin2014-05-161-0/+212
| | | | | | | | | | | | | | | | | | | | This patch defines tables of reasons for exiting from SIE mode in a new sie.h header file. Tables contain SIE intercepted codes, intercepted instructions and program interruptions codes. Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * KVM: s390: Fix external interrupt interceptionThomas Huth2014-05-061-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The external interrupt interception can only occur in rare cases, e.g. when the PSW of the interrupt handler has a bad value. The old handler for this interception simply ignored these events (except for increasing the exit_external_interrupt counter), but for proper operation we either have to inject the interrupts manually or we should drop to userspace in case of errors. Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * KVM: s390: enable IBS for single running VCPUsDavid Hildenbrand2014-04-291-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch enables the IBS facility when a single VCPU is running. The facility is dynamically turned on/off as soon as other VCPUs enter/leave the stopped state. When this facility is operating, some instructions can be executed faster for single-cpu guests. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * KVM: s390: hardware support for guest debuggingDavid Hildenbrand2014-04-222-4/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support to debug the guest using the PER facility on s390. Single-stepping, hardware breakpoints and hardware watchpoints are supported. In order to use the PER facility of the guest without it noticing it, the control registers of the guest have to be patched and access to them has to be intercepted(stctl, stctg, lctl, lctlg). All PER program interrupts have to be intercepted and only the relevant PER interrupts for the guest have to be given back. Special care has to be taken about repeated exits on the same hardware breakpoint. The intervention of the host in the guests PER configuration is not fully transparent. PER instruction nullification can not be used by the guest and too many storage alteration events may be reported to the guest (if it is activated for special address ranges only) when the host concurrently debugging it. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * KVM: s390: kernel header addition for guest debuggingDavid Hildenbrand2014-04-221-0/+20
| | | | | | | | | | | | | | | | This patch adds the structs to the kernel headers needed to pass information from/to userspace in order to debug a guest on s390 with hardware support. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * KVM: s390: emulate stctl and stctgDavid Hildenbrand2014-04-221-0/+2
| | | | | | | | | | | | | | | | Introduce the methods to emulate the stctl and stctg instruction. Added tracing code. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * KVM: s390: deliver program irq parameters and use correct ilcDavid Hildenbrand2014-04-221-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | When a program interrupt was to be delivered until now, no program interrupt parameters were stored in the low-core of the target vcpu. This patch enables the delivery of those program interrupt parameters, takes care of concurrent PER events which can be injected in addition to any program interrupt and uses the correct instruction length code (depending on the interception code) for the injection of program interrupts. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * KVM: s390: extract irq parameters of intercepted program irqsDavid Hildenbrand2014-04-221-3/+15
| | | | | | | | | | | | | | | | | | | | Whenever a program interrupt is intercepted, some parameters are stored in the sie control block. These parameters have to be extracted in order to be reinjected correctly. This patch also takes care of intercepted PER events which can occurr in addition to any program interrupt. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * s390: rename and split lowcore field per_perc_atmidJens Freimann2014-04-221-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | per_perc_atmid is currently a two-byte field that combines two fields, the PER code and the PER Addressing-and-Translation-Mode Identification (ATMID) Let's make them accessible indepently and also rename per_cause to per_code. Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * s390: fix name of lowcore field at offset 0xa3Jens Freimann2014-04-221-2/+2
| | | | | | | | | | | | | | | | | | | | According to the Principles of Operation, at offset 0xA3 in the lowcore we have the "Architectural-Mode identification", not an "access identification". Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * KVM: s390: make use of ipte lockHeiko Carstens2014-04-221-1/+11
| | | | | | | | | | | | Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * KVM: s390/sclp: correctly set eca siif bitHeiko Carstens2014-04-221-1/+6
| | | | | | | | | | | | | | | | Check if siif is available before setting. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * KVM: s390: add 'pgm' member to kvm_vcpu_arch and helper functionHeiko Carstens2014-04-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Add a 'struct kvm_s390_pgm_info pgm' member to kvm_vcpu_arch. This structure will be used if during instruction emulation in the context of a vcpu exception data needs to be stored somewhere. Also add a helper function kvm_s390_inject_prog_cond() which can inject vcpu's last exception if needed. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * s390/ctl_reg: add union type for control register 0Heiko Carstens2014-04-221-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | Add 'union ctlreg0_bits' to easily allow setting and testing bits of control register 0 bits. This patch only adds the bits needed for the new guest access functions. Other bits and control registers can be added when needed. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * s390/ptrace: add struct psw and accessor functionHeiko Carstens2014-04-221-0/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a 'struct psw' which makes it easier to decode and test if certain bits in a psw are set or are not set. In addition also add a 'psw_bits()' helper define which allows to directly modify and test a psw_t structure. E.g. psw_t psw; psw_bits(psw).t = 1; /* set dat bit */ Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * KVM: s390: allow injecting every kind of interruptJens Freimann2014-04-221-7/+52
| | | | | | | | | | | | | | | | Add a new data structure and function that allows to inject all kinds of interrupt as defined in the PoP Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * KVM: s390: Exploiting generic userspace interface for cmmaDominik Dingel2014-04-221-0/+7
| | | | | | | | | | | | | | | | To enable CMMA and to reset its state we use the vm kvm_device ioctls, encapsulating attributes within the KVM_S390_VM_MEM_CTRL group. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * KVM: s390: make cmma usage conditionallyDominik Dingel2014-04-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When userspace reset the guest without notifying kvm, the CMMA state of the pages might be unused, resulting in guest data corruption. To avoid this, CMMA must be enabled only if userspace understands the implications. CMMA must be enabled before vCPU creation. It can't be switched off once enabled. All subsequently created vCPUs will be enabled for CMMA according to the CMMA state of the VM. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> [remove now unnecessary calls to page_table_reset_pgste]
| * KVM: s390/mm: new gmap_test_and_clear_dirty functionDominik Dingel2014-04-221-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | For live migration kvm needs to test and clear the dirty bit of guest pages. That for is ptep_test_and_clear_user_dirty, to be sure we are not racing with other code, we protect the pte. This needs to be done within the architecture memory management code. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * KVM: s390/mm: use software dirty bit detection for user dirty trackingMartin Schwidefsky2014-04-221-79/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Switch the user dirty bit detection used for migration from the hardware provided host change-bit in the pgste to a fault based detection method. This reduced the dependency of the host from the storage key to a point where it becomes possible to enable the RCP bypass for KVM guests. The fault based dirty detection will only indicate changes caused by accesses via the guest address space. The hardware based method can detect all changes, even those caused by I/O or accesses via the kernel page table. The KVM/qemu code needs to take this into account. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * KVM: s390: Don't enable skeys by defaultDominik Dingel2014-04-222-1/+4
| | | | | | | | | | | | | | | | | | | | | | The first invocation of storage key operations on a given cpu will be intercepted. On these intercepts we will enable storage keys for the guest and remove the previously added intercepts. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * KVM: s390: Allow skeys to be enabled for the current processDominik Dingel2014-04-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a new function s390_enable_skey(), which enables storage key handling via setting the use_skey flag in the mmu context. This function is only useful within the context of kvm. Note that enabling storage keys will cause a one-time hickup when walking the page table; however, it saves us special effort for cases like clear reset while making it possible for us to be architecture conform. s390_enable_skey() takes the page table lock to prevent reseting storage keys triggered from multiple vcpus. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * KVM: s390: Clear storage keysDominik Dingel2014-04-221-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | page_table_reset_pgste() already does a complete page table walk to reset the pgste. Enhance it to initialize the storage keys to PAGE_DEFAULT_KEY if requested by the caller. This will be used for lazy storage key handling. Also provide an empty stub for !CONFIG_PGSTE Lets adopt the current code (diag 308) to not clear the keys. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * KVM: s390: Adding skey bit to mmu contextDominik Dingel2014-04-223-14/+30
| | | | | | | | | | | | | | | | | | | | | | | | For lazy storage key handling, we need a mechanism to track if the process ever issued a storage key operation. This patch adds the basic infrastructure for making the storage key handling optional, but still leaves it enabled for now by default. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* | Merge branch 'sched-core-for-linus' of ↵Linus Torvalds2014-06-031-12/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next Pull scheduler updates from Ingo Molnar: "The main scheduling related changes in this cycle were: - various sched/numa updates, for better performance - tree wide cleanup of open coded nice levels - nohz fix related to rq->nr_running use - cpuidle changes and continued consolidation to improve the kernel/sched/idle.c high level idle scheduling logic. As part of this effort I pulled cpuidle driver changes from Rafael as well. - standardized idle polling amongst architectures - continued work on preparing better power/energy aware scheduling - sched/rt updates - misc fixlets and cleanups" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (49 commits) sched/numa: Decay ->wakee_flips instead of zeroing sched/numa: Update migrate_improves/degrades_locality() sched/numa: Allow task switch if load imbalance improves sched/rt: Fix 'struct sched_dl_entity' and dl_task_time() comments, to match the current upstream code sched: Consolidate open coded implementations of nice level frobbing into nice_to_rlimit() and rlimit_to_nice() sched: Initialize rq->age_stamp on processor start sched, nohz: Change rq->nr_running to always use wrappers sched: Fix the rq->next_balance logic in rebalance_domains() and idle_balance() sched: Use clamp() and clamp_val() to make sys_nice() more readable sched: Do not zero sg->cpumask and sg->sgp->power in build_sched_groups() sched/numa: Fix initialization of sched_domain_topology for NUMA sched: Call select_idle_sibling() when not affine_sd sched: Simplify return logic in sched_read_attr() sched: Simplify return logic in sched_copy_attr() sched: Fix exec_start/task_hot on migrated tasks arm64: Remove TIF_POLLING_NRFLAG metag: Remove TIF_POLLING_NRFLAG sched/idle: Make cpuidle_idle_call() void sched/idle: Reflow cpuidle_idle_call() sched/idle: Delay clearing the polling bit ...
| * \ Merge tag 'v3.15-rc6' into sched/core, to pick up the latest fixesIngo Molnar2014-05-222-2/+13
| |\ \ | | | | | | | | | | | | Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | sched, s390: Create a dedicated topology tableVincent Guittot2014-05-071-10/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BOOK level is only relevant for s390 so we create a dedicated topology table with BOOK level and remove it from default table. Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Philipp Hachtmann <phacht@linux.vnet.ibm.com> Cc: cmetcalf@tilera.com Cc: benh@kernel.crashing.org Cc: dietmar.eggemann@arm.com Cc: preeti@linux.vnet.ibm.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: linux390@de.ibm.com Cc: linux-s390@vger.kernel.org Link: http://lkml.kernel.org/r/1397209481-28542-3-git-send-email-vincent.guittot@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | sched: Rework sched_domain topology definitionVincent Guittot2014-05-071-2/+0
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We replace the old way to configure the scheduler topology with a new method which enables a platform to declare additionnal level (if needed). We still have a default topology table definition that can be used by platform that don't want more level than the SMT, MC, CPU and NUMA ones. This table can be overwritten by an arch which either wants to add new level where a load balance make sense like BOOK or powergating level or wants to change the flags configuration of some levels. For each level, we need a function pointer that returns cpumask for each cpu, a function pointer that returns the flags for the level and a name. Only flags that describe topology, can be set by an architecture. The current topology flags are: SD_SHARE_CPUPOWER SD_SHARE_PKG_RESOURCES SD_NUMA SD_ASYM_PACKING Then, each level must be a subset on the next one. The build sequence of the sched_domain will take care of removing useless levels like those with 1 CPU and those with the same CPU span and no more relevant information for load balancing than its children. Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: David S. Miller <davem@davemloft.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hanjun Guo <hanjun.guo@linaro.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jason Low <jason.low2@hp.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux390@de.ibm.com Cc: linux-ia64@vger.kernel.org Cc: linux-s390@vger.kernel.org Link: http://lkml.kernel.org/r/1397209481-28542-2-git-send-email-vincent.guittot@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | Merge branch 'locking-core-for-linus' of ↵Linus Torvalds2014-06-032-7/+3
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next Pull core locking updates from Ingo Molnar: "The main changes in this cycle were: - reduced/streamlined smp_mb__*() interface that allows more usecases and makes the existing ones less buggy, especially in rarer architectures - add rwsem implementation comments - bump up lockdep limits" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits) rwsem: Add comments to explain the meaning of the rwsem's count field lockdep: Increase static allocations arch: Mass conversion of smp_mb__*() arch,doc: Convert smp_mb__*() arch,xtensa: Convert smp_mb__*() arch,x86: Convert smp_mb__*() arch,tile: Convert smp_mb__*() arch,sparc: Convert smp_mb__*() arch,sh: Convert smp_mb__*() arch,score: Convert smp_mb__*() arch,s390: Convert smp_mb__*() arch,powerpc: Convert smp_mb__*() arch,parisc: Convert smp_mb__*() arch,openrisc: Convert smp_mb__*() arch,mn10300: Convert smp_mb__*() arch,mips: Convert smp_mb__*() arch,metag: Convert smp_mb__*() arch,m68k: Convert smp_mb__*() arch,m32r: Convert smp_mb__*() arch,ia64: Convert smp_mb__*() ...
| * | | arch,s390: Convert smp_mb__*()Peter Zijlstra2014-04-182-7/+3
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As per the existing implementation; implement the new one using smp_mb(). AFAICT the s390 compare-and-swap does imply a barrier, however there are some immediate ops that seem to be singly-copy atomic and do not imply a barrier. One such is the "ni" op (which would be and-immediate) which is used for the constant clear_bit implementation. Therefore s390 needs full barriers for the {before,after} atomic ops. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/n/tip-kme5dz5hcobpnufnnkh1ech2@git.kernel.org Cc: Chen Gang <gang.chen@asianux.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux390@de.ibm.com Cc: linux-kernel@vger.kernel.org Cc: linux-s390@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | s390/lowcore: replace lowcore irb array with a per-cpu variableMartin Schwidefsky2014-05-281-10/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | Remove the 96-byte irb array from the lowcore and create a per-cpu variable instead. That way we will pick up any change in the definition of the struct irb automatically. Acked-By: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | s390/lowcore: reserve 96 bytes for IRB in lowcoreChristian Borntraeger2014-05-281-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The IRB might be 96 bytes if the extended-I/O-measurement facility is used. This feature is currently not used by Linux, but struct irb already has the emw defined. So let's make the irb in lowcore match the size of the internal data structure to be future proof. We also have to add a pad, to correctly align the paste. The bigger irb field also circumvents a bug in some QEMU versions that always write the emw field on test subchannel and therefore destroy the paste definitions of this CPU. Running under these QEMU version broke some timing functions in the VDSO and all users of these functions, e.g. some JREs. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: stable@vger.kernel.org
* | | s390/spinlock,rwlock: always to a load-and-test firstMartin Schwidefsky2014-05-201-20/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case a lock is contended it is better to do a load-and-test first before trying to get the lock with compare-and-swap. This helps to avoid unnecessary cache invalidations of the cacheline for the lock if the CPU has to wait for the lock. For an uncontended lock doing the compare-and-swap directly is a bit better, if the CPU does not have the cacheline in its cache yet the compare-and-swap will get it read-write immediately while a load-and-test would get it read-only first. Always to the load-and-test first to avoid the cacheline invalidations for the contended case outweight the potential read-only to read-write cacheline upgrade for the uncontended case. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | s390/cio: fix multiple structure definitionsSebastian Ott2014-05-202-1/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix multiple definitions of struct channel_path_desc by moving it to asm/chpid.h . Also change ccw_device_get_chp_desc to use proper types. Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | s390/uaccess: provide inline variants of get_user/put_userHeiko Carstens2014-05-201-0/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This shortens the code by ~17k (performace_defconfig, march=z196). The number of exception table entries however increases from 164 entries to 2500 entries (+~18k). However the executed code is shorter and also faster since we save the branches to the out-of-line copy_to/from_user implementations. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | s390/pci: add some new arch specific pci attributesSebastian Ott2014-05-202-3/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | Add a bunch of s390 specific pci attributes to help identifying pci functions. Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | s390/pci: use pdev->dev.groups for attribute creationSebastian Ott2014-05-201-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let the driver core handle attribute creation by putting all s390 specific pci attributes in an attribute group which is referenced by pdev->dev.groups in pcibios_add_device. Link: https://lkml.kernel.org/r/alpine.LFD.2.11.1404141101500.1529@denkbrett Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | s390: split TIF bits into CIF, PIF and TIF bitsMartin Schwidefsky2014-05-206-26/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The oi and ni instructions used in entry[64].S to set and clear bits in the thread-flags are not guaranteed to be atomic in regard to other CPUs. Split the TIF bits into CPU, pt_regs and thread-info specific bits. Updates on the TIF bits are done with atomic instructions, updates on CPU and pt_regs bits are done with non-atomic instructions. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | s390/uaccess: simplify control register updatesMartin Schwidefsky2014-05-204-32/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Always switch to the kernel ASCE in switch_mm. Load the secondary space ASCE in finish_arch_post_lock_switch after checking that any pending page table operations have completed. The primary ASCE is loaded in entry[64].S. With this the update_primary_asce call can be removed from the switch_to macro and from the start of switch_mm function. Remove the load_primary argument from update_user_asce/clear_user_asce, rename update_user_asce to set_user_asce and rename update_primary_asce to load_kernel_asce. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | s390/smp: Avoid busy loop after halt and "begin" on z/VMMichael Holzheu2014-05-201-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the smp_stop_cpu() function for SMP kernels enters a busy loop when "begin" is entered on the z/VM console after Linux is halted. To avoid this behavior, use the non-SMP variant of smp_stop_cpu() which stops the CPU again after "begin" is entered. As a side effect we now have consistent behavior for SMP and non-SMP Linux. Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | s390: fix new ccwgroup.h kernel-doc warningRandy Dunlap2014-05-201-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix new s390 kernel-doc warning: Warning(arch/s390/include/asm/ccwgroup.h:27): No description found for parameter 'ungroup_work' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Cc: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Cc: linux-s390@vger.kernel.org Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux390@de.ibm.com Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
OpenPOWER on IntegriCloud