summaryrefslogtreecommitdiffstats
path: root/arch/parisc/kernel
Commit message (Collapse)AuthorAgeFilesLines
* parisc: Move die_if_kernel() prototype into traps.h headerHelge Deller2016-06-051-2/+1
| | | | Signed-off-by: Helge Deller <deller@gmx.de>
* parisc: Fix pagefault crash in unaligned __get_user() callHelge Deller2016-06-051-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | One of the debian buildd servers had this crash in the syslog without any other information: Unaligned handler failed, ret = -2 clock_adjtime (pid 22578): Unaligned data reference (code 28) CPU: 1 PID: 22578 Comm: clock_adjtime Tainted: G E 4.5.0-2-parisc64-smp #1 Debian 4.5.4-1 task: 000000007d9960f8 ti: 00000001bde7c000 task.ti: 00000001bde7c000 YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI PSW: 00001000000001001111100000001111 Tainted: G E r00-03 000000ff0804f80f 00000001bde7c2b0 00000000402d2be8 00000001bde7c2b0 r04-07 00000000409e1fd0 00000000fa6f7fff 00000001bde7c148 00000000fa6f7fff r08-11 0000000000000000 00000000ffffffff 00000000fac9bb7b 000000000002b4d4 r12-15 000000000015241c 000000000015242c 000000000000002d 00000000fac9bb7b r16-19 0000000000028800 0000000000000001 0000000000000070 00000001bde7c218 r20-23 0000000000000000 00000001bde7c210 0000000000000002 0000000000000000 r24-27 0000000000000000 0000000000000000 00000001bde7c148 00000000409e1fd0 r28-31 0000000000000001 00000001bde7c320 00000001bde7c350 00000001bde7c218 sr00-03 0000000001200000 0000000001200000 0000000000000000 0000000001200000 sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000402d2e84 00000000402d2e88 IIR: 0ca0d089 ISR: 0000000001200000 IOR: 00000000fa6f7fff CPU: 1 CR30: 00000001bde7c000 CR31: ffffffffffffffff ORIG_R28: 00000002369fe628 IAOQ[0]: compat_get_timex+0x2dc/0x3c0 IAOQ[1]: compat_get_timex+0x2e0/0x3c0 RP(r2): compat_get_timex+0x40/0x3c0 Backtrace: [<00000000402d4608>] compat_SyS_clock_adjtime+0x40/0xc0 [<0000000040205024>] syscall_exit+0x0/0x14 This means the userspace program clock_adjtime called the clock_adjtime() syscall and then crashed inside the compat_get_timex() function. Syscalls should never crash programs, but instead return EFAULT. The IIR register contains the executed instruction, which disassebles into "ldw 0(sr3,r5),r9". This load-word instruction is part of __get_user() which tried to read the word at %r5/IOR (0xfa6f7fff). This means the unaligned handler jumped in. The unaligned handler is able to emulate all ldw instructions, but it fails if it fails to read the source e.g. because of page fault. The following program reproduces the problem: #define _GNU_SOURCE #include <unistd.h> #include <sys/syscall.h> #include <sys/mman.h> int main(void) { /* allocate 8k */ char *ptr = mmap(NULL, 2*4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0); /* free second half (upper 4k) and make it invalid. */ munmap(ptr+4096, 4096); /* syscall where first int is unaligned and clobbers into invalid memory region */ /* syscall should return EFAULT */ return syscall(__NR_clock_adjtime, 0, ptr+4095); } To fix this issue we simply need to check if the faulting instruction address is in the exception fixup table when the unaligned handler failed. If it is, call the fixup routine instead of crashing. While looking at the unaligned handler I found another issue as well: The target register should not be modified if the handler was unsuccessful. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org
* parisc: Fix printk time during bootHelge Deller2016-06-052-7/+3
| | | | | | | Avoid showing invalid printk time stamps during boot. Signed-off-by: Helge Deller <deller@gmx.de> Reviewed-by: Aaro Koskinen <aaro.koskinen@iki.fi>
* parisc: Fix backtrace on PA-RISCMikulas Patocka2016-06-041-8/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes backtrace on PA-RISC There were several problems: 1) The code that decodes instructions handles instructions that subtract from the stack pointer incorrectly. If the instruction subtracts the number X from the stack pointer the code increases the frame size by (0x100000000-X). This results in invalid accesses to memory and recursive page faults. 2) Because gcc reorders blocks, handling instructions that subtract from the frame pointer is incorrect. For example, this function int f(int a) { if (__builtin_expect(a, 1)) return a; g(); return a; } is compiled in such a way, that the code that decreases the stack pointer for the first "return a" is placed before the code for "g" call. If we recognize this decrement, we mistakenly believe that the frame size for the "g" call is zero. To fix problems 1) and 2), the patch doesn't recognize instructions that decrease the stack pointer at all. To further safeguard the unwind code against nonsense values, we don't allow frame size larger than Total_frame_size. 3) The backtrace is not locked. If stack dump races with module unload, invalid table can be accessed. This patch adds a spinlock when processing module tables. Note, that for correct backtrace, you need recent binutils. Binutils 2.18 from Debian 5 produce garbage unwind tables. Binutils 2.21 work better (it sometimes forgets function frames, but at least it doesn't generate garbage). Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Helge Deller <deller@gmx.de>
* parisc: Use long jump to reach ftrace_return_to_handler()Helge Deller2016-05-231-1/+10
| | | | | | | | Depending on config options we will need to use a long jump to reach ftrace_return_to_handler(). Additionally only compile the parisc_return_to_handler code when CONFIG_FUNCTION_GRAPH_TRACER is set. Signed-off-by: Helge Deller <deller@gmx.de>
* parisc: Merge ftrace C-helper and assembler functions into .text.hot sectionHelge Deller2016-05-222-3/+6
| | | | | | | | | | | When enabling all-branch ftrace support (CONFIG_PROFILE_ALL_BRANCHES) the kernel gets really huge and some ftrace assembler functions like mcount can't reach the ftrace helper functions which are written in C. Avoid this problem of too distant branches by moving the ftrace C-helper functions into the .text.hot section which is put in front of the standard .text section by the linker. Signed-off-by: Helge Deller <deller@gmx.de>
* parisc: Add native high-resolution sched_clock() implementationHelge Deller2016-05-221-1/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a native implementation for the sched_clock() function which utilizes the processor-internal cycle counter (Control Register 16) as high-resolution time source. With this patch we now get much more fine-grained resolutions in various in-kernel time measurements (e.g. when viewing the function tracing logs), and probably a more accurate scheduling on SMP systems. There are a few specific implementation details in this patch: 1. On a 32bit kernel we emulate the higher 32bits of the required 64-bit resolution of sched_clock() by increasing a per-cpu counter at every wrap-around of the 32bit cycle counter. 2. In a SMP system, the cycle counters of the various CPUs are not syncronized (similiar to the TSC in a x86_64 system). To cope with this we define HAVE_UNSTABLE_SCHED_CLOCK and let the upper layers do the adjustment work. 3. Since we need HAVE_UNSTABLE_SCHED_CLOCK, we need to provide a cmpxchg64() function even on a 32-bit kernel. 4. A 64-bit SMP kernel which is started on a UP system will mark the sched_clock() implementation as "stable", which means that we don't expect any jumps in the returned counter. This is true because we then run only on one CPU. Signed-off-by: Helge Deller <deller@gmx.de>
* parisc: Add ARCH_TRACEHOOK and regset supportHelge Deller2016-05-221-2/+354
| | | | | | | | | | | By adding TRACEHOOK support we now get a clean user interface to access registers via PTRACE_GETREGS, PTRACE_SETREGS, PTRACE_GETFPREGS and PTRACE_SETFPREGS. The user-visible regset struct user_regs_struct and user_fp_struct are modelled similiar to x86 and can be accessed via PTRACE_GETREGSET. Signed-off-by: Helge Deller <deller@gmx.de>
* parisc: Add syscall tracepoint supportHelge Deller2016-05-222-0/+13
| | | | | | | | This patch adds support for the TIF_SYSCALL_TRACEPOINT on the parisc architecture. Basically, it calls the appropriate tracepoints on syscall entry and exit. Signed-off-by: Helge Deller <deller@gmx.de>
* exit_thread: remove empty bodiesJiri Slaby2016-05-201-7/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Define HAVE_EXIT_THREAD for archs which want to do something in exit_thread. For others, let's define exit_thread as an empty inline. This is a cleanup before we change the prototype of exit_thread to accept a task parameter. [akpm@linux-foundation.org: fix mips] Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: "David S. Miller" <davem@davemloft.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chen Liqin <liqin.linux@gmail.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Chris Zankel <chris@zankel.net> Cc: David Howells <dhowells@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Jonas Bonn <jonas@southpole.se> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Mikael Starvik <starvik@axis.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Rich Felker <dalias@libc.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@arm.linux.org.uk> Cc: Steven Miao <realmz6@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* parisc: fix a bug when syscall number of tracee is __NR_Linux_syscallsDmitry V. Levin2016-05-061-1/+1
| | | | | | | | | | | | | | Do not load one entry beyond the end of the syscall table when the syscall number of a traced process equals to __NR_Linux_syscalls. Similar bug with regular processes was fixed by commit 3bb457af4fa8 ("[PARISC] Fix bug when syscall nr is __NR_Linux_syscalls"). This bug was found by strace test suite. Cc: stable@vger.kernel.org Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Acked-by: Helge Deller <deller@gmx.de> Signed-off-by: Helge Deller <deller@gmx.de>
* Merge branch 'parisc-4.6-4' of ↵Linus Torvalds2016-04-154-146/+106
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc ftrace fixes from Helge Deller: "This is (most likely) the last pull request for v4.6 for the parisc architecture. It fixes the FTRACE feature for parisc, which is horribly broken since quite some time and doesn't even compile. This patch just fixes the bare minimum (it actually removes more lines than it adds), so that the function tracer works again on 32- and 64bit kernels. I've queued up additional patches on top of this patch which e.g. add the syscall tracer, but those have to wait for the merge window for v4.7." * 'parisc-4.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Fix ftrace function tracer
| * parisc: Fix ftrace function tracerHelge Deller2016-04-144-146/+106
| | | | | | | | | | | | | | | | | | | | Fix the FTRACE function tracer for 32- and 64-bit kernel. The former code was horribly broken. Reimplement most coding in assembly and utilize optimizations, e.g. put mcount() and ftrace_stub() into one L1 cacheline. Signed-off-by: Helge Deller <deller@gmx.de>
* | parisc: Unbreak handling exceptions from kernel modulesHelge Deller2016-04-081-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Handling exceptions from modules never worked on parisc. It was just masked by the fact that exceptions from modules don't happen during normal use. When a module triggers an exception in get_user() we need to load the main kernel dp value before accessing the exception_data structure, and afterwards restore the original dp value of the module on exit. Noticed-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org
* | parisc: Fix kernel crash with reversed copy_from_user()Helge Deller2016-04-081-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | The kernel module testcase (lib/test_user_copy.c) exhibited a kernel crash on parisc if the parameters for copy_from_user were reversed ("illegal reversed copy_to_user" testcase). Fix this potential crash by checking the fault handler if the faulting address is in the exception table. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org Cc: Kees Cook <keescook@chromium.org>
* | parisc: Avoid function pointers for kernel exception routinesHelge Deller2016-04-081-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | We want to avoid the kernel module loader to create function pointers for the kernel fixup routines of get_user() and put_user(). Changing the external reference from function type to int type fixes this. This unbreaks exception handling for get_user() and put_user() when called from a kernel module. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org
* | parisc: Handle R_PARISC_PCREL32 relocations in kernel modulesHelge Deller2016-04-081-0/+8
| | | | | | | | | | | | | | | | | | | | Commit 0de7985 (parisc: Use generic extable search and sort routines) changed the exception tables to use 32bit relative offsets. This patch now adds support to the kernel module loader to handle such R_PARISC_PCREL32 relocations for 32- and 64-bit modules. Signed-off-by: Helge Deller <deller@gmx.de>
* | mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macrosKirill A. Shutemov2016-04-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge branch 'parisc-4.6-2' of ↵Linus Torvalds2016-03-313-2/+14
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc fixes from Helge Deller: "Fix seccomp filter support and SIGSYS signals on compat kernel. Both patches are tagged for v4.5 stable kernel" * 'parisc-4.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Fix and enable seccomp filter support parisc: Fix SIGSYS signals in compat case
| * | parisc: Fix and enable seccomp filter supportHelge Deller2016-03-312-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The seccomp filter support requires careful handling of task registers. This includes reloading of the return value (%r28) and proper syscall exit if secure_computing() returned -1. Additionally we need to sign-extend the syscall number from signed 32bit to signed 64bit in do_syscall_trace_enter() since the ptrace interface only allows storing 32bit values in compat mode. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # v4.5
| * | parisc: Fix SIGSYS signals in compat caseHelge Deller2016-03-311-0/+5
| |/ | | | | | | | | Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # v4.5
* | arch, ftrace: for KASAN put hard/soft IRQ entries into separate sectionsAlexander Potapenko2016-03-251-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | KASAN needs to know whether the allocation happens in an IRQ handler. This lets us strip everything below the IRQ entry point to reduce the number of unique stack traces needed to be stored. Move the definition of __irq_entry to <linux/interrupt.h> so that the users don't need to pull in <linux/ftrace.h>. Also introduce the __softirq_entry macro which is similar to __irq_entry, but puts the corresponding functions to the .softirqentry.text section. Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Andrey Konovalov <adech.fo@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Konstantin Serebryany <kcc@google.com> Cc: Dmitry Chernenkov <dmitryc@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | parisc: Wire up preadv2 and pwritev2 syscallsHelge Deller2016-03-231-0/+2
| | | | | | | | Signed-off-by: Helge Deller <deller@gmx.de>
* | parisc: Panic immediately when panic_on_oopsAaro Koskinen2016-03-231-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | PA-RISC wants to sleep 5 seconds before panicking when panic_on_oops is set, with no apparent reason. Remove this feature, since some users may want their systems to fail as quickly as possible. Users who want to delay reboot after panic can use PANIC_TIMEOUT. Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Acked-by: Helge Deller <deller@gmx.de> Signed-off-by: Helge Deller <deller@gmx.de>
* | parisc: Drop alloc_hugepages and free_hugepages syscallsHelge Deller2016-03-232-12/+2
| | | | | | | | | | | | | | | | The system calls alloc_hugepages() and free_hugepages() were introduced in Linux 2.5.36 and removed again in 2.5.54. They were never implemented on parisc, so let's drop them now. Signed-off-by: Helge Deller <deller@gmx.de>
* | Merge branch 'smp-hotplug-for-linus' of ↵Linus Torvalds2016-03-151-1/+1
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull cpu hotplug updates from Thomas Gleixner: "This is the first part of the ongoing cpu hotplug rework: - Initial implementation of the state machine - Runs all online and prepare down callbacks on the plugged cpu and not on some random processor - Replaces busy loop waiting with completions - Adds tracepoints so the states can be followed" More detailed commentary on this work from an earlier email: "What's wrong with the current cpu hotplug infrastructure? - Asymmetry The hotplug notifier mechanism is asymmetric versus the bringup and teardown. This is mostly caused by the notifier mechanism. - Largely undocumented dependencies While some notifiers use explicitely defined notifier priorities, we have quite some notifiers which use numerical priorities to express dependencies without any documentation why. - Control processor driven Most of the bringup/teardown of a cpu is driven by a control processor. While it is understandable, that preperatory steps, like idle thread creation, memory allocation for and initialization of essential facilities needs to be done before a cpu can boot, there is no reason why everything else must run on a control processor. Before this patch series, bringup looks like this: Control CPU Booting CPU do preparatory steps kick cpu into life do low level init sync with booting cpu sync with control cpu bring the rest up - All or nothing approach There is no way to do partial bringups. That's something which is really desired because we waste e.g. at boot substantial amount of time just busy waiting that the cpu comes to life. That's stupid as we could very well do preparatory steps and the initial IPI for other cpus and then go back and do the necessary low level synchronization with the freshly booted cpu. - Minimal debuggability Due to the notifier based design, it's impossible to switch between two stages of the bringup/teardown back and forth in order to test the correctness. So in many hotplug notifiers the cancel mechanisms are either not existant or completely untested. - Notifier [un]registering is tedious To [un]register notifiers we need to protect against hotplug at every callsite. There is no mechanism that bringup/teardown callbacks are issued on the online cpus, so every caller needs to do it itself. That also includes error rollback. What's the new design? The base of the new design is a symmetric state machine, where both the control processor and the booting/dying cpu execute a well defined set of states. Each state is symmetric in the end, except for some well defined exceptions, and the bringup/teardown can be stopped and reversed at almost all states. So the bringup of a cpu will look like this in the future: Control CPU Booting CPU do preparatory steps kick cpu into life do low level init sync with booting cpu sync with control cpu bring itself up The synchronization step does not require the control cpu to wait. That mechanism can be done asynchronously via a worker or some other mechanism. The teardown can be made very similar, so that the dying cpu cleans up and brings itself down. Cleanups which need to be done after the cpu is gone, can be scheduled asynchronously as well. There is a long way to this, as we need to refactor the notion when a cpu is available. Today we set the cpu online right after it comes out of the low level bringup, which is not really correct. The proper mechanism is to set it to available, i.e. cpu local threads, like softirqd, hotplug thread etc. can be scheduled on that cpu, and once it finished all booting steps, it's set to online, so general workloads can be scheduled on it. The reverse happens on teardown. First thing to do is to forbid scheduling of general workloads, then teardown all the per cpu resources and finally shut it off completely. This patch series implements the basic infrastructure for this at the core level. This includes the following: - Basic state machine implementation with well defined states, so ordering and prioritization can be expressed. - Interfaces to [un]register state callbacks This invokes the bringup/teardown callback on all online cpus with the proper protection in place and [un]installs the callbacks in the state machine array. For callbacks which have no particular ordering requirement we have a dynamic state space, so that drivers don't have to register an explicit hotplug state. If a callback fails, the code automatically does a rollback to the previous state. - Sysfs interface to drive the state machine to a particular step. This is only partially functional today. Full functionality and therefor testability will be achieved once we converted all existing hotplug notifiers over to the new scheme. - Run all CPU_ONLINE/DOWN_PREPARE notifiers on the booting/dying processor: Control CPU Booting CPU do preparatory steps kick cpu into life do low level init sync with booting cpu sync with control cpu wait for boot bring itself up Signal completion to control cpu In a previous step of this work we've done a full tree mechanical conversion of all hotplug notifiers to the new scheme. The balance is a net removal of about 4000 lines of code. This is not included in this series, as we decided to take a different approach. Instead of mechanically converting everything over, we will do a proper overhaul of the usage sites one by one so they nicely fit into the symmetric callback scheme. I decided to do that after I looked at the ugliness of some of the converted sites and figured out that their hotplug mechanism is completely buggered anyway. So there is no point to do a mechanical conversion first as we need to go through the usage sites one by one again in order to achieve a full symmetric and testable behaviour" * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) cpu/hotplug: Document states better cpu/hotplug: Fix smpboot thread ordering cpu/hotplug: Remove redundant state check cpu/hotplug: Plug death reporting race rcu: Make CPU_DYING_IDLE an explicit call cpu/hotplug: Make wait for dead cpu completion based cpu/hotplug: Let upcoming cpu bring itself fully up arch/hotplug: Call into idle with a proper state cpu/hotplug: Move online calls to hotplugged cpu cpu/hotplug: Create hotplug threads cpu/hotplug: Split out the state walk into functions cpu/hotplug: Unpark smpboot threads from the state machine cpu/hotplug: Move scheduler cpu_online notifier to hotplug core cpu/hotplug: Implement setup/removal interface cpu/hotplug: Make target state writeable cpu/hotplug: Add sysfs state interface cpu/hotplug: Hand in target state to _cpu_up/down cpu/hotplug: Convert the hotplugged cpu work to a state machine cpu/hotplug: Convert to a state machine for the control processor cpu/hotplug: Add tracepoints ...
| * arch/hotplug: Call into idle with a proper stateThomas Gleixner2016-03-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let the non boot cpus call into idle with the corresponding hotplug state, so the hotplug core can handle the further bringup. That's a first step to convert the boot side of the hotplugged cpus to do all the synchronization with the other side through the state machine. For now it'll only start the hotplug thread and kick the full bringup of the cpu. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: Rik van Riel <riel@redhat.com> Cc: Rafael Wysocki <rafael.j.wysocki@intel.com> Cc: "Srivatsa S. Bhat" <srivatsa@mit.edu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: http://lkml.kernel.org/r/20160226182341.614102639@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | parisc: Wire up copy_file_range syscallHelge Deller2016-03-011-0/+1
| | | | | | | | Signed-off-by: Helge Deller <deller@gmx.de>
* | parisc: Fix ptrace syscall number and return value modificationHelge Deller2016-03-012-6/+15
|/ | | | | | | | | | | | | | | | | | | | | | | | | Mike Frysinger reported that his ptrace testcase showed strange behaviour on parisc: It was not possible to avoid a syscall and the return value of a syscall couldn't be changed. To modify a syscall number, we were missing to save the new syscall number to gr20 which is then picked up later in assembly again. The effect that the return value couldn't be changed is a side-effect of another bug in the assembly code. When a process is ptraced, userspace expects each syscall to report entrance and exit of a syscall. If a syscall number was given which doesn't exist, we jumped to the normal syscall exit code instead of informing userspace that the (non-existant) syscall exits. This unexpected behaviour confuses userspace and thus the bug was misinterpreted as if we can't change the return value. This patch fixes both problems and was tested on 64bit kernel with 32bit userspace. Signed-off-by: Helge Deller <deller@gmx.de> Cc: Mike Frysinger <vapier@gentoo.org> Cc: stable@vger.kernel.org # v4.0+ Tested-by: Mike Frysinger <vapier@gentoo.org>
* parisc: convert to dma_map_opsChristoph Hellwig2016-01-202-41/+53
| | | | | | | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Helge Deller <deller@gmx.de> Acked-by: Helge Deller <deller@gmx.de> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'parisc-4.5-1' of ↵Linus Torvalds2016-01-173-5/+57
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parsic updates from Helge Deller: "This patchset includes two major fixes which are both scheduled for stable: First, __ARCH_SI_PREAMBLE_SIZE was defined with a wrong value. Second, huge page pte and TLB changes needed protection with a spinlock. Other than that there are just some trivial optimizations and cleanups" * 'parisc-4.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Protect huge page pte changes with spinlocks parisc: Imporove debug info about space registers and TLB configuration parisc: Drop parisc-specific NSIGTRAP define parisc: Fix __ARCH_SI_PREAMBLE_SIZE parisc: Reduce overhead of parisc_requires_coherency() parisc: Initialize PCI bridge cache line and default latency
| * parisc: Imporove debug info about space registers and TLB configurationHelge Deller2016-01-121-4/+22
| | | | | | | | Signed-off-by: Helge Deller <deller@gmx.de>
| * parisc: Reduce overhead of parisc_requires_coherency()Helge Deller2016-01-121-1/+9
| | | | | | | | Signed-off-by: Helge Deller <deller@gmx.de>
| * parisc: Initialize PCI bridge cache line and default latencyHelge Deller2016-01-121-0/+26
| | | | | | | | | | | | | | | | | | | | PCI controllers and pci-pci bridges may have not been fully initialized regarding cache line and defaul latency. This partly reverts commit 5f0e9b4 ("parisc: Remove unused pcibios_init_bus()") Signed-off-by: Helge Deller <deller@gmx.de>
* | Merge branch 'for-linus' of ↵Linus Torvalds2016-01-141-16/+16
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching Pull livepatching updates from Jiri Kosina: - RO/NX attribute fixes for patch module relocations from Josh Poimboeuf. As part of this effort, module.c has been cleaned up as well and livepatching is piggy-backing on this cleanup. Rusty is OK with this whole lot going through livepatching tree. - symbol disambiguation support from Chris J Arges. That series is also Reviewed-by: Miroslav Benes <mbenes@suse.cz> but this came in only after I've alredy pushed out. Didn't want to rebase because of that, hence I am mentioning it here. - symbol lookup fix from Miroslav Benes * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching: livepatch: Cleanup module page permission changes module: keep percpu symbols in module's symtab module: clean up RO/NX handling. module: use a structure to encapsulate layout. gcov: use within_module() helper. module: Use the same logic for setting and unsetting RO/NX livepatch: function,sympos scheme in livepatch sysfs directory livepatch: add sympos as disambiguator field to klp_reloc livepatch: add old_sympos as disambiguator field to klp_func
| * module: use a structure to encapsulate layout.Rusty Russell2015-12-041-16/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Makes it easier to handle init vs core cleanly, though the change is fairly invasive across random architectures. It simplifies the rbtree code immediately, however, while keeping the core data together in the same cachline (now iff the rbtree code is enabled). Acked-by: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* | parisc: Fix syscall restartsHelge Deller2015-12-211-12/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On parisc syscalls which are interrupted by signals sometimes failed to restart and instead returned -ENOSYS which in the worst case lead to userspace crashes. A similiar problem existed on MIPS and was fixed by commit e967ef02 ("MIPS: Fix restart of indirect syscalls"). On parisc the current syscall restart code assumes that all syscall callers load the syscall number in the delay slot of the ble instruction. That's how it is e.g. done in the unistd.h header file: ble 0x100(%sr2, %r0) ldi #syscall_nr, %r20 Because of that assumption the current code never restored %r20 before returning to userspace. This assumption is at least not true for code which uses the glibc syscall() function, which instead uses this syntax: ble 0x100(%sr2, %r0) copy regX, %r20 where regX depend on how the compiler optimizes the code and register usage. This patch fixes this problem by adding code to analyze how the syscall number is loaded in the delay branch and - if needed - copy the syscall number to regX prior returning to userspace for the syscall restart. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
* | parisc: Wire up mlock2 syscallHelge Deller2015-12-121-0/+1
| | | | | | | | Signed-off-by: Helge Deller <deller@gmx.de>
* | parisc: Remove unused pcibios_init_bus()Bjorn Helgaas2015-12-121-18/+0
|/ | | | | | | There are no callers of pcibios_init_bus(), so remove it. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Helge Deller <deller@gmx.de>
* Merge branch 'parisc-4.4-2' of ↵Linus Torvalds2015-11-227-50/+80
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc update from Helge Deller: "This patchset adds Huge Page and HUGETLBFS support for parisc" Honestly, the hugepage support should have gone through in the merge window, and is not really an rc-time fix. But it only touches arch/parisc, and I cannot find it in myself to care. If one of the three parisc users notices a breakage, I will point at Helge and make rude farting noises. * 'parisc-4.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Map kernel text and data on huge pages parisc: Add Huge Page and HUGETLBFS support parisc: Use long branch to do_syscall_trace_exit parisc: Increase initial kernel mapping to 32MB on 64bit kernel parisc: Initialize the fault vector earlier in the boot process. parisc: Add defines for Huge page support parisc: Drop unused MADV_xxxK_PAGES flags from asm/mman.h parisc: Drop definition of start_thread_som for HP-UX SOM binaries parisc: Fix wrong comment regarding first pmd entry flags
| * parisc: Map kernel text and data on huge pagesHelge Deller2015-11-222-3/+14
| | | | | | | | | | | | | | Adjust the linker script and map_pages() to map kernel text and data on physical 1MB huge/large pages. Signed-off-by: Helge Deller <deller@gmx.de>
| * parisc: Add Huge Page and HUGETLBFS supportHelge Deller2015-11-222-15/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds huge page support to allow userspace to allocate huge pages and to use hugetlbfs filesystem on 32- and 64-bit Linux kernels. A later patch will add kernel support to map kernel text and data on huge pages. The only requirement is, that the kernel needs to be compiled for a PA8X00 CPU (PA2.0 architecture). Older PA1.X CPUs do not support variable page sizes. 64bit Kernels are compiled for PA2.0 by default. Technically on parisc multiple physical huge pages may be needed to emulate standard 2MB huge pages. Signed-off-by: Helge Deller <deller@gmx.de>
| * parisc: Use long branch to do_syscall_trace_exitHelge Deller2015-11-221-2/+2
| | | | | | | | | | | | | | | | Use the 22bit instead of the 17bit branch instruction on a 64bit kernel to reach the do_syscall_trace_exit function from the gateway page. A huge page enabled kernel may need the additional branch distance bits. Signed-off-by: Helge Deller <deller@gmx.de>
| * parisc: Increase initial kernel mapping to 32MB on 64bit kernelHelge Deller2015-11-221-2/+2
| | | | | | | | | | | | | | | | | | | | | | For the 64bit kernel the initially 16 MB kernel memory might become too small if you build a kernel with many modules built-in and with kernel text and data areas mapped on huge pages. This patch increases the initial mapping to 32MB for 64bit kernels and keeps 16MB for 32bit kernels. Signed-off-by: Helge Deller <deller@gmx.de>
| * parisc: Initialize the fault vector earlier in the boot process.Helge Deller2015-11-223-28/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A fault vector on parisc needs to be 2K aligned. Furthermore the checksum of the fault vector needs to sum up to 0 which is being calculated and written at runtime. Up to now we aligned both PA20 and PA11 fault vectors on the same 4K page in order to easily write the checksum after having mapped the kernel read-only (by mapping this page only as read-write). But when we want to map the kernel text and data on huge pages this makes things harder. So, simplify it by aligning both fault vectors on 2K boundries and write the checksum before we map the page read-only. Signed-off-by: Helge Deller <deller@gmx.de>
* | parisc: Wire up userfaultfd syscallHelge Deller2015-10-221-0/+1
| | | | | | | | Signed-off-by: Helge Deller <deller@gmx.de>
* | parisc: allocate sys_membarrier system call numberMathieu Desnoyers2015-10-221-0/+1
|/ | | | | | | | | | Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Tested-by: Helge Deller <deller@gmx.de> CC: Andrew Morton <akpm@linux-foundation.org> CC: linux-api@vger.kernel.org CC: "James E.J. Bottomley" <jejb@parisc-linux.org> CC: linux-parisc@vger.kernel.org Signed-off-by: Helge Deller <deller@gmx.de>
* parisc: Use platform_device_register_simple("rtc-generic")Helge Deller2015-09-081-10/+4
| | | | Signed-off-by: Helge Deller <deller@gmx.de>
* parisc: Drop CONFIG_SMP around update_cr16_clocksource()Helge Deller2015-09-081-7/+0
| | | | | | | No need to use CONFIG_SMP around update_cr16_clocksource(). It checks for num_online_cpus() beeing greater than 1, which is always 1 in UP builds. Signed-off-by: Helge Deller <deller@gmx.de>
* parisc: Use double word condition in 64bit CAS operationJohn David Anglin2015-09-081-1/+1
| | | | | | | | | | | | The attached change fixes the condition used in the "sub" instruction. A double word comparison is needed. This fixes the 64-bit LWS CAS operation on 64-bit kernels. I can now enable 64-bit atomic support in GCC. Cc: <stable@vger.kernel.org> Signed-off-by: John David Anglin <dave.anglin> Signed-off-by: Helge Deller <deller@gmx.de>
OpenPOWER on IntegriCloud