summaryrefslogtreecommitdiffstats
path: root/arch/s390/lib/uaccess_mvcos.c
Commit message (Collapse)AuthorAgeFilesLines
* s390/uaccess: rework uaccess code - fix locking issuesHeiko Carstens2014-04-031-263/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current uaccess code uses a page table walk in some circumstances, e.g. in case of the in atomic futex operations or if running on old hardware which doesn't support the mvcos instruction. However it turned out that the page table walk code does not correctly lock page tables when accessing page table entries. In other words: a different cpu may invalidate a page table entry while the current cpu inspects the pte. This may lead to random data corruption. Adding correct locking however isn't trivial for all uaccess operations. Especially copy_in_user() is problematic since that requires to hold at least two locks, but must be protected against ABBA deadlock when a different cpu also performs a copy_in_user() operation. So the solution is a different approach where we change address spaces: User space runs in primary address mode, or access register mode within vdso code, like it currently already does. The kernel usually also runs in home space mode, however when accessing user space the kernel switches to primary or secondary address mode if the mvcos instruction is not available or if a compare-and-swap (futex) instruction on a user space address is performed. KVM however is special, since that requires the kernel to run in home address space while implicitly accessing user space with the sie instruction. So we end up with: User space: - runs in primary or access register mode - cr1 contains the user asce - cr7 contains the user asce - cr13 contains the kernel asce Kernel space: - runs in home space mode - cr1 contains the user or kernel asce -> the kernel asce is loaded when a uaccess requires primary or secondary address mode - cr7 contains the user or kernel asce, (changed with set_fs()) - cr13 contains the kernel asce In case of uaccess the kernel changes to: - primary space mode in case of a uaccess (copy_to_user) and uses e.g. the mvcp instruction to access user space. However the kernel will stay in home space mode if the mvcos instruction is available - secondary space mode in case of futex atomic operations, so that the instructions come from primary address space and data from secondary space In case of kvm the kernel runs in home space mode, but cr1 gets switched to contain the gmap asce before the sie instruction gets executed. When the sie instruction is finished cr1 will be switched back to contain the user asce. A context switch between two processes will always load the kernel asce for the next process in cr1. So the first exit to user space is a bit more expensive (one extra load control register instruction) than before, however keeps the code rather simple. In sum this means there is no need to perform any error prone page table walks anymore when accessing user space. The patch seems to be rather large, however it mainly removes the the page table walk code and restores the previously deleted "standard" uaccess code, with a couple of changes. The uaccess without mvcos mode can be enforced with the "uaccess_primary" kernel parameter. Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/uaccess: introduce 'uaccesspt' kernel parameterHeiko Carstens2014-02-211-1/+14
| | | | | | | | | The uaccesspt kernel parameter allows to enforce using the uaccess page table walk variant. This is mainly for debugging purposes, so this mode can also be enabled on machines which support the mvcos instruction. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/setup: get rid of MACHINE_HAS_MVCOS machine flagHeiko Carstens2014-02-211-1/+2
| | | | | | | | MACHINE_HAS_MVCOS is used exactly once when the machine is brought up. There is no need to cache the flag in the machine_flags. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/uaccess: consistent typesHeiko Carstens2014-02-211-19/+20
| | | | | | | | | | | | | The types 'size_t' and 'unsigned long' have been used randomly for the uaccess functions. This looks rather confusing. So let's change all functions to use unsigned long instead and get rid of size_t in order to have a consistent interface. The only exception is strncpy_from_user() which uses 'long' since it may return a signed value (-EFAULT). Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/uaccess: get rid of indirect function callsHeiko Carstens2014-02-211-19/+70
| | | | | | | | | | | | | | | There are only two uaccess variants on s390 left: the version that is used if the mvcos instruction is available, and the page table walk variant. So there is no need for expensive indirect function calls. By default the mvcos variant will be called. If the mvcos instruction is not available it will call the page table walk variant. For minimal performance impact the "if (mvcos_is_available)" is implemented with a jump label, which will be a six byte nop on machines with mvcos. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/uaccess: normalize order of parameters of indirect uaccess function callsHeiko Carstens2014-02-211-10/+10
| | | | | | | | | | | | | | | | For some unknown reason the indirect uaccess functions on s390 implement a different parameter order than what is usual. e.g.: unsigned long copy_to_user(void *to, const void *from, unsigned long n); vs. size_t (*copy_to_user)(size_t n, void __user * to, const void *from); Let's get rid of this confusing parameter reordering. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/uaccess: always run the kernel in home spaceMartin Schwidefsky2013-10-241-30/+0
| | | | | | | Simplify the uaccess code by removing the user_mode=home option. The kernel will now always run in the home space mode. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/uaccess: fix strncpy_from_user/strnlen_user zero maxlen caseHeiko Carstens2013-02-281-0/+2
| | | | | | | | | | | | | | | | If the maximum length specified for the to be accessed string for strncpy_from_user() and strnlen_user() is zero the following incorrect values would be returned or incorrect memory accesses would happen: strnlen_user_std() and strnlen_user_pt() incorrectly return "1" strncpy_from_user_pt() would incorrectly access "dst[maxlen - 1]" strncpy_from_user_mvcos() would incorrectly return "-EFAULT" Fix all these oddities by adding early checks. Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/uaccess: shorten strncpy_from_user/strnlen_userHeiko Carstens2013-02-281-12/+12
| | | | | | | | | | | | | Always stay within page boundaries when copying from user within strlen_user_mvcos()/strncpy_from_user_mvcos(). This allows to shorten the code a bit and may prevent unnecessary faults, since we copy quite large amounts of memory to kernel space. Also directly call the mvcos variants of copy_from_user() to avoid indirect branches. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/comments: unify copyright messages and remove file namesHeiko Carstens2012-07-201-3/+1
| | | | | | | | | | | | | | Remove the file name from the comment at top of many files. In most cases the file name was wrong anyway, so it's rather pointless. Also unify the IBM copyright statement. We did have a lot of sightly different statements and wanted to change them one after another whenever a file gets touched. However that never happened. Instead people start to take the old/"wrong" statements to use as a template for new files. So unify all of them in one go. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
* s390/headers: replace __s390x__ with CONFIG_64BIT where possibleHeiko Carstens2012-05-241-1/+1
| | | | | | | | Replace __s390x__ with CONFIG_64BIT in all places that are not exported to userspace or guarded with #ifdef __KERNEL__. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] Improve address space mode selection.Martin Schwidefsky2009-12-071-4/+0
| | | | | | | | | | | | | | | | | | Introduce user_mode to replace the two variables switch_amode and s390_noexec. There are three valid combinations of the old values: 1) switch_amode == 0 && s390_noexec == 0 2) switch_amode == 1 && s390_noexec == 0 3) switch_amode == 1 && s390_noexec == 1 They get replaced by 1) user_mode == HOME_SPACE_MODE 2) user_mode == PRIMARY_SPACE_MODE 3) user_mode == SECONDARY_SPACE_MODE The new kernel parameter user_mode=[primary,secondary,home] lets you choose the address space mode the user space processes should use. In addition the CONFIG_S390_SWITCH_AMODE config option is removed. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] Add EX_TABLE for addressing exception in usercopy functions.Gerald Schaefer2009-10-061-6/+6
| | | | | | | | | | | | This patch adds an EX_TABLE entry to mvc{p|s|os} usercopy functions that may be called with KERNEL_DS. In combination with collaborative memory management, kernel pages marked as unused may trigger an adressing exception in the usercopy functions. This fixes an unhandled addressing exception bug where strncpy_from_user() is used with len > strnlen and KERNEL_DS, crossing a page boundary to an unused page. Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] uaccess_mvcos: #ifdef config dependent code.Heiko Carstens2008-04-301-0/+2
| | | | | | | | | | arch/s390/lib/uaccess_mvcos.c:166: warning: 'strnlen_user_mvcos' defined but not used arch/s390/lib/uaccess_mvcos.c:186: warning: 'strncpy_from_user_mvcos' defined but not used Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] noexec protectionGerald Schaefer2007-02-051-0/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This provides a noexec protection on s390 hardware. Our hardware does not have any bits left in the pte for a hw noexec bit, so this is a different approach using shadow page tables and a special addressing mode that allows separate address spaces for code and data. As a special feature of our "secondary-space" addressing mode, separate page tables can be specified for the translation of data addresses (storage operands) and instruction addresses. The shadow page table is used for the instruction addresses and the standard page table for the data addresses. The shadow page table is linked to the standard page table by a pointer in page->lru.next of the struct page corresponding to the page that contains the standard page table (since page->private is not really private with the pte_lock and the page table pages are not in the LRU list). Depending on the software bits of a pte, it is either inserted into both page tables or just into the standard (data) page table. Pages of a vma that does not have the VM_EXEC bit set get mapped only in the data address space. Any try to execute code on such a page will cause a page translation exception. The standard reaction to this is a SIGSEGV with two exceptions: the two system call opcodes 0x0a77 (sys_sigreturn) and 0x0aad (sys_rt_sigreturn) are allowed. They are stored by the kernel to the signal stack frame. Unfortunately, the signal return mechanism cannot be modified to use an SA_RESTORER because the exception unwinding code depends on the system call opcode stored behind the signal stack frame. This feature requires that user space is executed in secondary-space mode and the kernel in home-space mode, which means that the addressing modes need to be switched and that the noexec protection only works for user space. After switching the addressing modes, we cannot use the mvcp/mvcs instructions anymore to copy between kernel and user space. A new mvcos instruction has been added to the z9 EC/BC hardware which allows to copy between arbitrary address spaces, but on older hardware the page tables need to be walked manually. Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] Get rid of a lot of sparse warnings.Heiko Carstens2007-02-051-16/+11
| | | | | Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] Add dynamic size check for usercopy functions.Gerald Schaefer2006-12-041-6/+21
| | | | | | | | | | | Use a wrapper for copy_to/from_user to chose the best usercopy method. The mvcos instruction is better for sizes greater than 256 bytes, if mvcos is not available a page table walk is better for sizes greater than 1024 bytes. Also removed the redundant copy_to/from_user_std_small functions. Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] user readable uninitialised kernel memory.Martin Schwidefsky2006-09-281-6/+16
| | | | | | | | | | A user space program can read uninitialised kernel memory by appending to a file from a bad address and then reading the result back. The cause is the copy_from_user function that does not clear the remaining bytes of the kernel buffer after it got a fault on the user space address. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] Use alternative user-copy operations for new hardware.Gerald Schaefer2006-09-201-0/+156
This introduces new user-copy operations which are optimized for copying more than 256 Bytes on new hardware. Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
OpenPOWER on IntegriCloud