summaryrefslogtreecommitdiffstats
path: root/arch/x86/kernel/alternative.c
Commit message (Collapse)AuthorAgeFilesLines
* x86: Call stop_machine_text_poke() on all CPUsRabin Vincent2011-11-141-1/+1
| | | | | | | | | | | | | | It appears that stop_machine_text_poke() wants to be called on all CPUs, like it's done from text_poke_smp(). Fix text_poke_smp_batch() to do this. Signed-off-by: Rabin Vincent <rabin@rab.in> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Jason Baron <jbaron@redhat.com> Link: http://lkml.kernel.org/r/1319702072-32676-1-git-send-email-rabin@rab.in Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86-64: Move vread_tsc and vread_hpet into the vDSOAndy Lutomirski2011-07-141-8/+0
| | | | | | | | | The vsyscall page now consists entirely of trap instructions. Cc: John Stultz <johnstul@us.ibm.com> Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/637648f303f2ef93af93bae25186e9a1bea093f5.1310639973.git.luto@mit.edu Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* x86: Make alternative instruction pointers relativeAndy Lutomirski2011-07-131-8/+13
| | | | | | | | | This save a few bytes on x86-64 and means that future patches can apply alternatives to unrelocated code. Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/ff64a6b9a1a3860ca4a7b8b6dc7b4754f9491cd7.1310563276.git.luto@mit.edu Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* Merge branch 'x86-cpu-for-linus' of ↵Linus Torvalds2011-05-191-82/+112
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, cpu: Fix detection of Celeron Covington stepping A1 and B0 Documentation, ABI: Update L3 cache index disable text x86, AMD, cacheinfo: Fix L3 cache index disable checks x86, AMD, cacheinfo: Fix fallout caused by max3 conversion x86, cpu: Change NOP selection for certain Intel CPUs x86, cpu: Clean up and unify the NOP selection infrastructure x86, percpu: Use ASM_NOP4 instead of hardcoding P6_NOP4 x86, cpu: Move AMD Elan Kconfig under "Processor family" Fix up trivial conflicts in alternative handling (commit dc326fca2b64 "x86, cpu: Clean up and unify the NOP selection infrastructure" removed some hacky 5-byte instruction stuff, while commit d430d3d7e646 "jump label: Introduce static_branch() interface" renamed HAVE_JUMP_LABEL to CONFIG_JUMP_LABEL in the code that went away)
| * x86, cpu: Change NOP selection for certain Intel CPUsH. Peter Anvin2011-04-181-1/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to a decoder implementation quirk, some specific Intel CPUs actually perform better with the "k8_nops" than with the SDM-recommended NOPs. For runtime-selected NOPs, if we detect those specific CPUs then use the k8_nops instead of the ones we would normally use. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Tejun Heo <tj@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jason Baron <jbaron@redhat.com> Link: http://lkml.kernel.org/r/1303166160-10315-4-git-send-email-hpa@linux.intel.com
| * x86, cpu: Clean up and unify the NOP selection infrastructureH. Peter Anvin2011-04-181-82/+100
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Clean up and unify the NOP selection infrastructure: - Make the atomic 5-byte NOP a part of the selection system. - Pick NOPs once during early boot and then be done with it. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Tejun Heo <tj@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jason Baron <jbaron@redhat.com> Link: http://lkml.kernel.org/r/1303166160-10315-3-git-send-email-hpa@linux.intel.com
* | Merge branch 'x86/mem' into perf/coreIngo Molnar2011-05-181-0/+9
|\ \ | | | | | | | | | | | | | | | | | | Merge reason: memcpy_64.S changes an assumption perf bench has, so merge this here so we can fix it. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | x86, alternative, doc: Add comment for applying alternatives orderFenghua Yu2011-05-171-0/+9
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some string operation functions may be patched twice, e.g. on enhanced REP MOVSB /STOSB processors, memcpy is patched first by fast string alternative function, then it is patched by enhanced REP MOVSB/STOSB alternative function. Add comment for applying alternatives order to warn people who may change the applying alternatives order for any reason. [ Documentation-only patch ] Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1305671358-14478-4-git-send-email-fenghua.yu@intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* | jump label: Introduce static_branch() interfaceJason Baron2011-04-041-1/+1
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce: static __always_inline bool static_branch(struct jump_label_key *key); instead of the old JUMP_LABEL(key, label) macro. In this way, jump labels become really easy to use: Define: struct jump_label_key jump_key; Can be used as: if (static_branch(&jump_key)) do unlikely code enable/disale via: jump_label_inc(&jump_key); jump_label_dec(&jump_key); that's it! For the jump labels disabled case, the static_branch() becomes an atomic_read(), and jump_label_inc()/dec() are simply atomic_inc(), atomic_dec() operations. We show testing results for this change below. Thanks to H. Peter Anvin for suggesting the 'static_branch()' construct. Since we now require a 'struct jump_label_key *key', we can store a pointer into the jump table addresses. In this way, we can enable/disable jump labels, in basically constant time. This change allows us to completely remove the previous hashtable scheme. Thanks to Peter Zijlstra for this re-write. Testing: I ran a series of 'tbench 20' runs 5 times (with reboots) for 3 configurations, where tracepoints were disabled. jump label configured in avg: 815.6 jump label *not* configured in (using atomic reads) avg: 800.1 jump label *not* configured in (regular reads) avg: 803.4 Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20110316212947.GA8792@redhat.com> Signed-off-by: Jason Baron <jbaron@redhat.com> Suggested-by: H. Peter Anvin <hpa@linux.intel.com> Tested-by: David Daney <ddaney@caviumnetworks.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* x86: Fix common misspellingsLucas De Marchi2011-03-181-1/+1
| | | | | | | | | They were generated by 'codespell' and then manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi> Cc: trivial@kernel.org LKML-Reference: <1300389856-1099-3-git-send-email-lucas.demarchi@profusion.mobi> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86: stop_machine_text_poke() should issue sync_core()Mathieu Desnoyers2011-03-151-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Intel Archiecture Software Developer's Manual section 7.1.3 specifies that a core serializing instruction such as "cpuid" should be executed on _each_ core before the new instruction is made visible. Failure to do so can lead to unspecified behavior (Intel XMC erratas include General Protection Fault in the list), so we should avoid this at all cost. This problem can affect modified code executed by interrupt handlers after interrupt are re-enabled at the end of stop_machine, because no core serializing instruction is executed between the code modification and the moment interrupts are reenabled. Because stop_machine_text_poke performs the text modification from the first CPU decrementing stop_machine_first, modified code executed in thread context is also affected by this problem. To explain why, we have to split the CPUs in two categories: the CPU that initiates the text modification (calls text_poke_smp) and all the others. The scheduler, executed on all other CPUs after stop_machine, issues an "iret" core serializing instruction, and therefore handles core serialization for all these CPUs. However, the text modification initiator can continue its execution on the same thread and access the modified text without any scheduler call. Given that the CPU that initiates the code modification is not guaranteed to be the one actually performing the code modification, it falls into the XMC errata. Q: Isn't this executed from an IPI handler, which will return with IRET (a serializing instruction) anyway? A: No, now stop_machine uses per-cpu workqueue, so that handler will be executed from worker threads. There is no iret anymore. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> LKML-Reference: <20110303160137.GB1590@Krystal> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: <stable@kernel.org> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* x86: Fix text_poke_smp_batch() deadlockPeter Zijlstra2011-02-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix this deadlock - we are already holding the mutex: ======================================================= [ INFO: possible circular locking dependency detected ] 2.6.38-rc4-test+ #1 ------------------------------------------------------- bash/1850 is trying to acquire lock: (text_mutex){+.+.+.}, at: [<ffffffff8100a9c1>] return_to_handler+0x0/0x2f but task is already holding lock: (smp_alt){+.+...}, at: [<ffffffff8100a9c1>] return_to_handler+0x0/0x2f which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (smp_alt){+.+...}: [<ffffffff81082d02>] lock_acquire+0xcd/0xf8 [<ffffffff8192e119>] __mutex_lock_common+0x4c/0x339 [<ffffffff8192e4ca>] mutex_lock_nested+0x3e/0x43 [<ffffffff8101050f>] alternatives_smp_switch+0x77/0x1d8 [<ffffffff81926a6f>] do_boot_cpu+0xd7/0x762 [<ffffffff819277dd>] native_cpu_up+0xe6/0x16a [<ffffffff81928e28>] _cpu_up+0x9d/0xee [<ffffffff81928f4c>] cpu_up+0xd3/0xe7 [<ffffffff82268d4b>] kernel_init+0xe8/0x20a [<ffffffff8100ba24>] kernel_thread_helper+0x4/0x10 -> #1 (cpu_hotplug.lock){+.+.+.}: [<ffffffff81082d02>] lock_acquire+0xcd/0xf8 [<ffffffff8192e119>] __mutex_lock_common+0x4c/0x339 [<ffffffff8192e4ca>] mutex_lock_nested+0x3e/0x43 [<ffffffff810568cc>] get_online_cpus+0x41/0x55 [<ffffffff810a1348>] stop_machine+0x1e/0x3e [<ffffffff819314c1>] text_poke_smp_batch+0x3a/0x3c [<ffffffff81932b6c>] arch_optimize_kprobes+0x10d/0x11c [<ffffffff81933a51>] kprobe_optimizer+0x152/0x222 [<ffffffff8106bb71>] process_one_work+0x1d3/0x335 [<ffffffff8106cfae>] worker_thread+0x104/0x1a4 [<ffffffff810707c4>] kthread+0x9d/0xa5 [<ffffffff8100ba24>] kernel_thread_helper+0x4/0x10 -> #0 (text_mutex){+.+.+.}: other info that might help us debug this: 6 locks held by bash/1850: #0: (&buffer->mutex){+.+.+.}, at: [<ffffffff8100a9c1>] return_to_handler+0x0/0x2f #1: (s_active#75){.+.+.+}, at: [<ffffffff8100a9c1>] return_to_handler+0x0/0x2f #2: (x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [<ffffffff8100a9c1>] return_to_handler+0x0/0x2f #3: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff8100a9c1>] return_to_handler+0x0/0x2f #4: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8100a9c1>] return_to_handler+0x0/0x2f #5: (smp_alt){+.+...}, at: [<ffffffff8100a9c1>] return_to_handler+0x0/0x2f stack backtrace: Pid: 1850, comm: bash Not tainted 2.6.38-rc4-test+ #1 Call Trace: [<ffffffff81080eb2>] print_circular_bug+0xa8/0xb7 [<ffffffff8192e4ca>] mutex_lock_nested+0x3e/0x43 [<ffffffff81010302>] alternatives_smp_unlock+0x3d/0x93 [<ffffffff81010630>] alternatives_smp_switch+0x198/0x1d8 [<ffffffff8102568a>] native_cpu_die+0x65/0x95 [<ffffffff818cc4ec>] _cpu_down+0x13e/0x202 [<ffffffff8117a619>] sysfs_write_file+0x108/0x144 [<ffffffff8111f5a2>] vfs_write+0xac/0xff [<ffffffff8111f7a9>] sys_write+0x4a/0x6e Reported-by: Steven Rostedt <rostedt@goodmis.org> Tested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: mathieu.desnoyers@efficios.com Cc: rusty@rustcorp.com.au Cc: ananth@in.ibm.com Cc: masami.hiramatsu.pt@hitachi.com Cc: fweisbec@gmail.com Cc: jbeulich@novell.com Cc: jbaron@redhat.com Cc: mhiramat@redhat.com LKML-Reference: <1297458466.5226.93.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
*---. Merge branches 'x86-alternatives-for-linus', 'x86-fpu-for-linus', ↵Linus Torvalds2011-01-061-1/+2
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'x86-hwmon-for-linus', 'x86-paravirt-for-linus', 'core-locking-for-linus' and 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-alternatives-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, suspend: Avoid unnecessary smp alternatives switch during suspend/resume * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86-64, asm: Use fxsaveq/fxrestorq in more places * 'x86-hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, hwmon: Add core threshold notification to therm_throt.c * 'x86-paravirt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, paravirt: Use native_halt on a halt, not native_safe_halt * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: locking, lockdep: Convert sprintf_symbol to %pS * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: irq: Better struct irqaction layout
| * | | x86, suspend: Avoid unnecessary smp alternatives switch during suspend/resumeSuresh Siddha2010-12-131-1/+2
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During suspend, we disable all the non boot cpus. And during resume we bring them all back again. So no need to do alternatives_smp_switch() in between. On my core 2 based laptop, this speeds up the suspend path by 15msec and the resume path by 5 msec (suspend/resume speed up differences can be attributed to the different P-states that the cpu is in during suspend/resume). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <1290557500.4946.8.camel@sbsiddha-MOBL3.sc.intel.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* | | x86: Introduce text_poke_smp_batch() for batch-code modifyingMasami Hiramatsu2010-12-061-9/+40
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce text_poke_smp_batch(). This function modifies several text areas with one stop_machine() on SMP. Because calling stop_machine() is heavy task, it is better to aggregate text_poke requests. ( Note: I've talked with Rusty about this interface, and he would not like to expand stop_machine() interface, since it is not for generic use. ) Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Jan Beulich <jbeulich@novell.com> Cc: 2nddept-manager@sdl.hitachi.co.jp LKML-Reference: <20101203095422.2961.51217.stgit@ltc236.sdl.hitachi.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | Merge branches 'perf-fixes-for-linus' and 'x86-fixes-for-linus' of ↵Linus Torvalds2010-10-301-54/+15
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: jump label: Add work around to i386 gcc asm goto bug x86, ftrace: Use safe noops, drop trap test jump_label: Fix unaligned traps on sparc. jump label: Make arch_jump_label_text_poke_early() optional jump label: Fix error with preempt disable holding mutex oprofile: Remove deprecated use of flush_scheduled_work() oprofile: Fix the hang while taking the cpu offline jump label: Fix deadlock b/w jump_label_mutex vs. text_mutex jump label: Fix module __init section race * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Check irq_remapped instead of remapping_enabled in destroy_irq()
| * | x86, ftrace: Use safe noops, drop trap testH. Peter Anvin2010-10-291-54/+15
| |/ | | | | | | | | | | | | | | | | | | | | | | Always use a safe 5-byte noop sequence. Drop the trap test, since it is known to return false negatives on some virtualization platforms on 32 bits. The resulting code is both simpler and safer. Cc: Daniel Drake <dsd@laptop.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* | x86, alternative: Call stop_machine_text_poke() on all cpusJason Baron2010-10-291-1/+1
|/ | | | | | | | | | | | | | | | | Currently, text_poke_smp() passes a NULL as the third argument to __stop_machine(), which will only run stop_machine_text_poke() on 1 cpu. Change NULL -> cpu_online_mask, as stop_machine_text_poke() is intended to be run on all cpus. I actually didn't notice any problems with stop_machine_text_poke() only being called on 1 cpu, but found this via code inspection. Signed-off-by: Jason Baron <jbaron@redhat.com> LKML-Reference: <20101028152026.GB2875@redhat.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* x86: Use __stop_machine() in text_poke_smp()Masami Hiramatsu2010-10-141-1/+2
| | | | | | | | | | | | | | Use __stop_machine() in text_poke_smp() because the caller must get online_cpus before calling text_poke_smp(), but stop_machine() do it again. We don't need it. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: 2nddept-manager@sdl.hitachi.co.jp Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> LKML-Reference: <20101014031036.4100.83989.stgit@ltc236.sdl.hitachi.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* jump label: Base patch for jump labelJason Baron2010-09-221-1/+1
| | | | | | | | | | | | | | | | | base patch to implement 'jump labeling'. Based on a new 'asm goto' inline assembly gcc mechanism, we can now branch to labels from an 'asm goto' statment. This allows us to create a 'no-op' fastpath, which can subsequently be patched with a jump to the slowpath code. This is useful for code which might be rarely used, but which we'd like to be able to call, if needed. Tracepoints are the current usecase that these are being implemented for. Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jason Baron <jbaron@redhat.com> LKML-Reference: <ee8b3595967989fdaf84e698dc7447d315ce972a.1284733808.git.jbaron@redhat.com> [ cleaned up some formating ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* jump label: Make text_poke_early() globally visibleJason Baron2010-09-201-2/+2
| | | | | | | | | | Make text_poke_early available outside of alternative.c. The jump label patchset wants to make use of it in order to set up the optimal no-op sequences at run-time. Signed-off-by: Jason Baron <jbaron@redhat.com> LKML-Reference: <04cfddf2ba77bcabfc3e524f1849d871d6a1cf9d.1284733808.git.jbaron@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* jump label: Make dynamic no-op selection available outside of ftraceJason Baron2010-09-201-0/+64
| | | | | | | | | | Move Steve's code for finding the best 5-byte no-op from ftrace.c to alternative.c. The idea is that other consumers (in this case jump label) want to make use of that code. Signed-off-by: Jason Baron <jbaron@redhat.com> LKML-Reference: <96259ae74172dcac99c0020c249743c523a92e18.1284733808.git.jbaron@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* x86, alternatives: BUG on encountering an invalid CPU feature numberH. Peter Anvin2010-07-131-0/+1
| | | | | | | | | | Make the alternatives-patching code BUG on encountering an invalid CPU feature number. Should have done this a long time ago. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Yinghai Lu <yinhai@kernel.org> Cc: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <tip-df378ccfc4dd04e263426ad805516915874774aa@git.kernel.org>
* Merge branch 'x86/asm' into x86/atomicH. Peter Anvin2010-04-291-19/+103
|\ | | | | | | | | | | | | | | | | | | | | | | Merge reason: Conflict between LOCK_PREFIX_HERE and relative alternatives pointers Resolved Conflicts: arch/x86/include/asm/alternative.h arch/x86/kernel/alternative.c Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * x86-64: Reduce SMP locks table sizeJan Beulich2010-04-281-20/+25
| | | | | | | | | | | | | | | | | | Reduce the SMP locks table size by using relative pointers instead of absolute ones, thus cutting the table size by half. Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <4BCF30FE020000780003B3B6@vpn.id2.novell.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo2010-03-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
| * Merge branch 'perf-probes-for-linus-2' of ↵Linus Torvalds2010-03-051-0/+60
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-probes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Issue at least one memory barrier in stop_machine_text_poke() perf probe: Correct probe syntax on command line help perf probe: Add lazy line matching support perf probe: Show more lines after last line perf probe: Check function address range strictly in line finder perf probe: Use libdw callback routines perf probe: Use elfutils-libdw for analyzing debuginfo perf probe: Rename probe finder functions perf probe: Fix bugs in line range finder perf probe: Update perf probe document perf probe: Do not show --line option without dwarf support kprobes: Add documents of jump optimization kprobes/x86: Support kprobes jump optimization on x86 x86: Add text_poke_smp for SMP cross modifying code kprobes/x86: Cleanup save/restore registers kprobes/x86: Boost probes when reentering kprobes: Jump optimization sysctl interface kprobes: Introduce kprobes jump optimization kprobes: Introduce generic insn_slot framework kprobes/x86: Cleanup RELATIVEJUMP_INSTRUCTION to RELATIVEJUMP_OPCODE
| | * x86: Issue at least one memory barrier in stop_machine_text_poke()Masami Hiramatsu2010-03-041-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix stop_machine_text_poke() to issue smp_mb() before exiting waiting loop, and use cpu_relax() for waiting. Changes in v2: - Don't use ACCESS_ONCE(). Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Jason Baron <jbaron@redhat.com> LKML-Reference: <20100304033850.3819.74590.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * x86: Add text_poke_smp for SMP cross modifying codeMasami Hiramatsu2010-02-251-0/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add generic text_poke_smp for SMP which uses stop_machine() to synchronize modifying code. This stop_machine() method is officially described at "7.1.3 Handling Self- and Cross-Modifying Code" on the intel's software developer's manual 3A. Since stop_machine() can't protect code against NMI/MCE, this function can not modify those handlers. And also, this function is basically for modifying multibyte-single-instruction. For modifying multibyte-multi-instructions, we need another special trap & detour code. This code originaly comes from immediate values with stop_machine() version. Thanks Jason and Mathieu! Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Anders Kaseorg <andersk@ksplice.com> Cc: Tim Abbott <tabbott@ksplice.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> LKML-Reference: <20100225133438.6725.80273.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | Merge branch 'x86-asm-for-linus' of ↵Linus Torvalds2010-02-281-1/+3
| |\ \ | | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Mark atomic irq ops raw for 32bit legacy x86: Merge show_regs() x86: Macroise x86 cache descriptors x86-32: clean up rwsem inline asm statements x86: Merge asm/atomic_{32,64}.h x86: Sync asm/atomic_32.h and asm/atomic_64.h x86: Split atomic64_t functions into seperate headers x86-64: Modify memcpy()/memset() alternatives mechanism x86-64: Modify copy_user_generic() alternatives mechanism x86: Lift restriction on the location of FIX_BTMAP_* x86, core: Optimize hweight32()
| * | x86/alternatives: Fix build warningMasami Hiramatsu2010-02-071-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes these warnings: arch/x86/kernel/alternative.c: In function 'alternatives_text_reserved': arch/x86/kernel/alternative.c:402: warning: comparison of distinct pointer types lacks a cast arch/x86/kernel/alternative.c:402: warning: comparison of distinct pointer types lacks a cast arch/x86/kernel/alternative.c:405: warning: comparison of distinct pointer types lacks a cast arch/x86/kernel/alternative.c:405: warning: comparison of distinct pointer types lacks a cast Caused by: 2cfa197: ftrace/alternatives: Introducing *_text_reserved functions Changes in v2: - Use local variables to compare, instead of type casts. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> LKML-Reference: <20100205171647.15750.37221.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | ftrace/alternatives: Introducing *_text_reserved functionsMasami Hiramatsu2010-02-041-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introducing *_text_reserved functions for checking the text address range is partially reserved or not. This patch provides checking routines for x86 smp alternatives and dynamic ftrace. Since both functions modify fixed pieces of kernel text, they should reserve and protect those from other dynamic text modifier, like kprobes. This will also be extended when introducing other subsystems which modify fixed pieces of kernel text. Dynamic text modifiers should avoid those. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: przemyslaw@pawelczyk.it Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org> Cc: Jason Baron <jbaron@redhat.com> LKML-Reference: <20100202214911.4694.16587.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | x86: Add support for lock prefix in alternativesLuca Barbieri2010-02-251-2/+4
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current lock prefix UP/SMP alternative code doesn't allow LOCK_PREFIX to be used in alternatives code. This patch solves the problem by adding a new LOCK_PREFIX_ALTERNATIVE_PATCH macro that only records the lock prefix location but does not emit the prefix. The user of this macro can then start any alternative sequence with "lock" and have it UP/SMP patched. To make this work, the UP/SMP alternative code is changed to do the lock/DS prefix switching only if the byte actually contains a lock or DS prefix. Thus, if an alternative without the "lock" is selected, it will now do nothing instead of clobbering the code. Changes in v2: - Naming change - Change label to not conflict with alternatives Signed-off-by: Luca Barbieri <luca@luca-barbieri.com> LKML-Reference: <1267005265-27958-2-git-send-email-luca@luca-barbieri.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* | x86-64: Modify copy_user_generic() alternatives mechanismJan Beulich2009-12-301-1/+3
|/ | | | | | | | | | | | | | | | | | In order to avoid unnecessary chains of branches, rather than implementing copy_user_generic() as a function consisting of just a single (possibly patched) branch, instead properly deal with patching call instructions in the alternative instructions framework, and move the patching into the callers. As a follow-on, one could also introduce something like __EXPORT_SYMBOL_ALT() to avoid patching call sites in modules. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <4B2BB8180200007800026AE7@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* Merge branch 'x86-cpu-for-linus' of ↵Linus Torvalds2009-09-141-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (22 commits) x86: Fix code patching for paravirt-alternatives on 486 x86, msr: change msr-reg.o to obj-y, and export its symbols x86: Use hard_smp_processor_id() to get apic id for AMD K8 cpus x86, sched: Workaround broken sched domain creation for AMD Magny-Cours x86, mcheck: Use correct cpumask for shared bank4 x86, cacheinfo: Fixup L3 cache information for AMD multi-node processors x86: Fix CPU llc_shared_map information for AMD Magny-Cours x86, msr: Fix msr-reg.S compilation with gas 2.16.1, on 32-bit too x86: Move kernel_fpu_using to irq_fpu_usable in asm/i387.h x86, msr: fix msr-reg.S compilation with gas 2.16.1 x86, msr: Export the register-setting MSR functions via /dev/*/msr x86, msr: Create _on_cpu helpers for {rw,wr}msr_safe_regs() x86, msr: Have the _safe MSR functions return -EIO, not -EFAULT x86, msr: CFI annotations, cleanups for msr-reg.S x86, asm: Make _ASM_EXTABLE() usable from assembly code x86, asm: Add 32-bit versions of the combined CFI macros x86, AMD: Disable wrongly set X86_FEATURE_LAHF_LM CPUID bit x86, msr: Rewrite AMD rd/wrmsr variants x86, msr: Add rd/wrmsr interfaces with preset registers x86: add specific support for Intel Atom architecture ...
| * x86: Fix code patching for paravirt-alternatives on 486Ben Hutchings2009-09-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As reported in <http://bugs.debian.org/511703> and <http://bugs.debian.org/515982>, kernels with paravirt-alternatives enabled crash in text_poke_early() on at least some 486-class processors. The problem is that text_poke_early() itself uses inline functions affected by paravirt-alternatives and so will modify instructions that have already been prefetched. Pentium and later processors will invalidate the prefetched instructions in this case, but 486-class processors do not. Change sync_core() to limit prefetching on 486-class (and 386-class) processors, and move the call to sync_core() above the call to the modifiable local_irq_restore(). Signed-off-by: Ben Hutchings <ben@decadent.org.uk> LKML-Reference: <1252547631.3423.134.camel@localhost> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* | x86: properly annotate alternatives.cJan Beulich2009-08-211-24/+32
|/ | | | | | | | | | | | | | | Some of the NOPs tables aren't used on 64-bits, quite some code and data is needed post-init for module loading only, and a couple of functions aren't used outside that file (i.e. can be static, and don't need to be exported). The change to __INITDATA/__INITRODATA is needed to avoid an assembler warning. Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <4A8BC8A00200007800010823@vpn.id2.novell.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* x86: expand irq-off region in text_poke()Masami Hiramatsu2009-03-101-2/+2
| | | | | | | | Expand irq-off region to cover fixmap using code and cache synchronizing. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> LKML-Reference: <49B54688.8090403@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86: implement atomic text_poke() via fixmapMasami Hiramatsu2009-03-061-9/+15
| | | | | | | | | | | | | Use fixmaps instead of vmap/vunmap in text_poke() for avoiding page allocation and delayed unmapping. At the result of above change, text_poke() becomes atomic and can be called from stop_machine() etc. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> LKML-Reference: <49B14352.2040705@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* tracing, Text Edit Lock - SMP alternatives supportMasami Hiramatsu2009-03-061-0/+5
| | | | | | | | | | Use the mutual exclusion provided by the text edit lock in alternatives code. Since alternative_smp_* will be called from module init code, etc, we'd better protect it from other subsystems. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> LKML-Reference: <49B14332.9030109@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* Merge branch 'x86/core' into x86/mce2H. Peter Anvin2009-03-031-3/+3
|\
| * x86: make vmap yell louder when it is used under irqs_disabled()Peter Zijlstra2009-02-251-3/+3
| | | | | | | | Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | x86, mce: don't disable machine checks during code patchingAndi Kleen2009-02-171-6/+11
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: low priority bug fix This removes part of a a patch I added myself some time ago. After some consideration the patch was a bad idea. In particular it stopped machine check exceptions during code patching. To quote the comment: * MCEs only happen when something got corrupted and in this * case we must do something about the corruption. * Ignoring it is worse than a unlikely patching race. * Also machine checks tend to be broadcast and if one CPU * goes into machine check the others follow quickly, so we don't * expect a machine check to cause undue problems during to code * patching. So undo the machine check related parts of 8f4e956b313dcccbc7be6f10808952345e3b638c NMIs are still disabled. This only removes code, the only additions are a new comment. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* x86: improve UP kernel when CPU-hotplug and SMP is enabledThomas Gleixner2008-10-131-1/+1
| | | | | | | | | | | | num_possible_cpus() can be > 1 when disabled CPUs have been accounted. Disabled CPUs are not in the cpu_present_map, so we can use num_present_cpus() as a safe indicator to switch to UP alternatives. Reported-by: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
*-. Merge branches 'x86/alternatives', 'x86/cleanups', 'x86/commandline', ↵Ingo Molnar2008-10-061-4/+4
|\ \ | | | | | | | | | 'x86/crashdump', 'x86/debug', 'x86/defconfig', 'x86/doc', 'x86/exports', 'x86/fpu', 'x86/gart', 'x86/idle', 'x86/mm', 'x86/mtrr', 'x86/nmi-watchdog', 'x86/oprofile', 'x86/paravirt', 'x86/reboot', 'x86/sparse-fixes', 'x86/tsc', 'x86/urgent' and 'x86/vmalloc' into x86-v28-for-linus-phase1
| * | x86: alternatives : fix LOCK_PREFIX race with preemptible kernel and CPU hotplugMathieu Desnoyers2008-08-151-4/+4
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a kernel thread is preempted in single-cpu mode right after the NOP (nop about to be turned into a lock prefix), then we CPU hotplug a CPU, and then the thread is scheduled back again, a SMP-unsafe atomic operation will be used on shared SMP variables, leading to corruption. No corruption would happen in the reverse case : going from SMP to UP is ok because we split a bit instruction into tiny pieces, which does not present this condition. Changing the 0x90 (single-byte nop) currently used into a 0x3E DS segment override prefix should fix this issue. Since the default of the atomic instructions is to use the DS segment anyway, it should not affect the behavior. The exception to this are references that use ESP/RSP and EBP/RBP as the base register (they will use the SS segment), however, in Linux (a) DS == SS at all times, and (b) we do not distinguish between segment violations reported as #SS as opposed to #GP, so there is no need to disassemble the instruction to figure out the suitable segment. This patch assumes that the 0x3E prefix will leave atomic operations as-is (thus assuming they normally touch data in the DS segment). Since there seem to be no obvious ill-use of other segment override prefixes for atomic operations, it should be safe. It can be verified with a quick grep -r LOCK_PREFIX include/asm-x86/ grep -A 1 -r LOCK_PREFIX arch/x86/ Taken from This source : AMD64 Architecture Programmer's Manual Volume 3: General-Purpose and System Instructions States "Instructions that Reference a Non-Stack Segment—If an instruction encoding references any base register other than rBP or rSP, or if an instruction contains an immediate offset, the default segment is the data segment (DS). These instructions can use the segment-override prefix to select one of the non-default segments, as shown in Table 1-5." Therefore, forcing the DS segment on the atomic operations, which already use the DS segment, should not change. This source : http://wiki.osdev.org/X86_Instruction_Encoding States "In 64-bit the CS, SS, DS and ES segment overrides are ignored." Confirmed by "AMD 64-Bit Technology" A.7 http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/x86-64_overview.pdf "In 64-bit mode, the DS, ES, SS and CS segment-override prefixes have no effect. These four prefixes are no longer treated as segment-override prefixes in the context of multipleprefix rules. Instead, they are treated as null prefixes." This patch applies to 2.6.27-rc2, but would also have to be applied to earlier kernels (2.6.26, 2.6.25, ...). Performance impact of the fix : tests done on "xaddq" and "xaddl" shows it actually improves performances on Intel Xeon, AMD64, Pentium M. It does not change the performance on Pentium II, Pentium 3 and Pentium 4. Xeon E5405 2.0GHz : NR_TESTS 10000000 test empty cycles : 162207948 test test 1-byte nop xadd cycles : 170755422 test test DS override prefix xadd cycles : 170000118 * test test LOCK xadd cycles : 472012134 AMD64 2.0GHz : NR_TESTS 10000000 test empty cycles : 146674549 test test 1-byte nop xadd cycles : 150273860 test test DS override prefix xadd cycles : 149982382 * test test LOCK xadd cycles : 270000690 Pentium 4 3.0GHz NR_TESTS 10000000 test empty cycles : 290001195 test test 1-byte nop xadd cycles : 310000560 test test DS override prefix xadd cycles : 310000575 * test test LOCK xadd cycles : 1050103740 Pentium M 2.0GHz NR_TESTS 10000000 test empty cycles : 180000523 test test 1-byte nop xadd cycles : 320000345 test test DS override prefix xadd cycles : 310000374 * test test LOCK xadd cycles : 480000357 Pentium 3 550MHz NR_TESTS 10000000 test empty cycles : 510000231 test test 1-byte nop xadd cycles : 620000128 test test DS override prefix xadd cycles : 620000110 * test test LOCK xadd cycles : 800000088 Pentium II 350MHz NR_TESTS 10000000 test empty cycles : 200833494 test test 1-byte nop xadd cycles : 340000130 test test DS override prefix xadd cycles : 340000126 * test test LOCK xadd cycles : 530000078 Speed test modules can be found at http://ltt.polymtl.ca/svn/trunk/tests/kernel/test-prefix-speed-32.c http://ltt.polymtl.ca/svn/trunk/tests/kernel/test-prefix-speed.c Macro-benchmarks 2.0GHz E5405 Core 2 dual Quad-Core Xeon Summary * replace smp lock prefixes with DS segment selector prefixes no lock prefix (s) with lock prefix (s) Speedup make -j1 kernel/ 33.94 +/- 0.07 34.91 +/- 0.27 2.8 % hackbench 50 2.99 +/- 0.01 3.74 +/- 0.01 25.1 % * replace smp lock prefixes with 0x90 nops no lock prefix (s) with lock prefix (s) Speedup make -j1 kernel/ 34.16 +/- 0.32 34.91 +/- 0.27 2.2 % hackbench 50 3.00 +/- 0.01 3.74 +/- 0.01 24.7 % Detail : 1 CPU, replace smp lock prefixes with DS segment selector prefixes make -j1 kernel/ real 0m34.067s user 0m30.630s sys 0m2.980s real 0m33.867s user 0m30.582s sys 0m3.024s real 0m33.939s user 0m30.738s sys 0m2.876s real 0m33.913s user 0m30.806s sys 0m2.808s avg : 33.94s std. dev. : 0.07s hackbench 50 Time: 2.978 Time: 2.982 Time: 3.010 Time: 2.984 Time: 2.982 avg : 2.99 std. dev. : 0.01 1 CPU, noreplace-smp make -j1 kernel/ real 0m35.326s user 0m30.630s sys 0m3.260s real 0m34.325s user 0m30.802s sys 0m3.084s real 0m35.568s user 0m30.722s sys 0m3.168s real 0m34.435s user 0m30.886s sys 0m2.996s avg.: 34.91s std. dev. : 0.27s hackbench 50 Time: 3.733 Time: 3.750 Time: 3.761 Time: 3.737 Time: 3.741 avg : 3.74 std. dev. : 0.01 1 CPU, replace smp lock prefixes with 0x90 nops make -j1 kernel/ real 0m34.139s user 0m30.782s sys 0m2.820s real 0m34.010s user 0m30.630s sys 0m2.976s real 0m34.777s user 0m30.658s sys 0m2.916s real 0m33.924s user 0m30.634s sys 0m2.924s real 0m33.962s user 0m30.774s sys 0m2.800s real 0m34.141s user 0m30.770s sys 0m2.828s avg : 34.16 std. dev. : 0.32 hackbench 50 Time: 2.999 Time: 2.994 Time: 3.004 Time: 2.991 Time: 2.988 avg : 3.00 std. dev. : 0.01 I did more runs (20 runs of each) to compare the nop case to the DS prefix case. Results in seconds. They actually does not seems to show a significant difference. NOP 34.155 33.955 34.012 35.299 35.679 34.141 33.995 35.016 34.254 33.957 33.957 34.008 35.013 34.494 33.893 34.295 34.314 34.854 33.991 34.132 DS 34.080 34.304 34.374 35.095 34.291 34.135 33.940 34.208 35.276 34.288 33.861 33.898 34.610 34.709 33.851 34.256 35.161 34.283 33.865 35.078 Used http://www.graphpad.com/quickcalcs/ttest1.cfm?Format=C to do the T-test (yeah, I'm lazy) : Group Group One (DS prefix) Group Two (nops) Mean 34.37815 34.37070 SD 0.46108 0.51905 SEM 0.10310 0.11606 N 20 20 P value and statistical significance: The two-tailed P value equals 0.9620 By conventional criteria, this difference is considered to be not statistically significant. Confidence interval: The mean of Group One minus Group Two equals 0.00745 95% confidence interval of this difference: From -0.30682 to 0.32172 Intermediate values used in calculations: t = 0.0480 df = 38 standard error of difference = 0.155 So, unless these calculus are completely bogus, the difference between the nop and the DS case seems not to be statistically significant. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Acked-by: H. Peter Anvin <hpa@zytor.com> CC: Linus Torvalds <torvalds@linux-foundation.org> CC: Jeremy Fitzhardinge <jeremy@goop.org> CC: Roland McGrath <roland@redhat.com> CC: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> CC: Steven Rostedt <srostedt@redhat.com> CC: Thomas Gleixner <tglx@linutronix.de> CC: Peter Zijlstra <peterz@infradead.org> CC: Andrew Morton <akpm@linux-foundation.org> CC: David Miller <davem@davemloft.net> CC: Ulrich Drepper <drepper@redhat.com> CC: Rusty Russell <rusty@rustcorp.com.au> CC: Gregory Haskins <ghaskins@novell.com> CC: Arnaldo Carvalho de Melo <acme@redhat.com> CC: "Luis Claudio R. Goncalves" <lclaudio@uudg.org> CC: Clark Williams <williams@redhat.com> CC: Christoph Lameter <cl@linux-foundation.org> CC: Andi Kleen <andi@firstfloor.org> CC: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* | x86: use X86_FEATURE_NOPL in alternativesH. Peter Anvin2008-09-051-23/+13
|/ | | | | | | Use X86_FEATURE_NOPL to determine if it is safe to use P6 NOPs in alternatives. Also, replace table and loop with simple if statement. Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* x86: fix SMP alternatives: use mutex instead of spinlock, text_poke is sleepablePekka Paalanen2008-05-231-9/+9
| | | | | | | | | text_poke is sleepable. The original fix by Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>. Signed-off-by: Pekka Paalanen <pq@iki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* ftrace: use nops instead of jmpSteven Rostedt2008-05-231-2/+2
| | | | | | | | | This patch patches the call to mcount with nops instead of a jmp over the mcount call. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* x86: harden kernel code patchingIngo Molnar2008-04-251-0/+1
| | | | Signed-off-by: Ingo Molnar <mingo@elte.hu>
OpenPOWER on IntegriCloud