summaryrefslogtreecommitdiffstats
path: root/arch/x86
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'idle-release' of ↵Linus Torvalds2011-05-298-26/+32
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6 * 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6: x86 idle: deprecate mwait_idle() and "idle=mwait" cmdline param x86 idle: deprecate "no-hlt" cmdline param x86 idle APM: deprecate CONFIG_APM_CPU_IDLE x86 idle floppy: deprecate disable_hlt() x86 idle: EXPORT_SYMBOL(default_idle, pm_idle) only when APM demands it x86 idle: clarify AMD erratum 400 workaround idle governor: Avoid lock acquisition to read pm_qos before entering idle cpuidle: menu: fixed wrapping timers at 4.294 seconds
| * x86 idle: deprecate mwait_idle() and "idle=mwait" cmdline paramLen Brown2011-05-291-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mwait_idle() is a C1-only idle loop intended to be more efficient than HLT on SMP hardware that supports it. But mwait_idle() has been replaced by the more general mwait_idle_with_hints(), which handles both C1 and deeper C-states. ACPI uses only mwait_idle_with_hints(), and never uses mwait_idle(). Deprecate mwait_idle() and the "idle=mwait" cmdline param to simplify the x86 idle code. After this change, kernels configured with (!CONFIG_ACPI=n && !CONFIG_INTEL_IDLE=n) when run on hardware that support MWAIT will simply use HLT. If MWAIT is desired on those systems, cpuidle and the cpuidle drivers above can be used. cc: x86@kernel.org cc: stable@kernel.org # .39.x Signed-off-by: Len Brown <len.brown@intel.com>
| * x86 idle: deprecate "no-hlt" cmdline paramLen Brown2011-05-291-0/+1
| | | | | | | | | | | | | | | | | | | | | | We'd rather that modern machines not check if HLT works on every entry into idle, for the benefit of machines that had marginal electricals 15-years ago. If those machines are still running the upstream kernel, they can use "idle=poll". The only difference will be that they'll now invoke HLT in machine_hlt(). cc: x86@kernel.org # .39.x Signed-off-by: Len Brown <len.brown@intel.com>
| * x86 idle APM: deprecate CONFIG_APM_CPU_IDLELen Brown2011-05-291-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We don't want to export the pm_idle function pointer to modules. Currently CONFIG_APM_CPU_IDLE w/ CONFIG_APM_MODULE forces us to. CONFIG_APM_CPU_IDLE is of dubious value, it runs only on 32-bit uniprocessor laptops that are over 10 years old. It calls into the BIOS during idle, and is known to cause a number of machines to fail. Removing CONFIG_APM_CPU_IDLE and will allow us to stop exporting pm_idle. Any systems that were calling into the APM BIOS at run-time will simply use HLT instead. cc: x86@kernel.org cc: Jiri Kosina <jkosina@suse.cz> cc: stable@kernel.org # .39.x Signed-off-by: Len Brown <len.brown@intel.com>
| * x86 idle: EXPORT_SYMBOL(default_idle, pm_idle) only when APM demands itLen Brown2011-05-291-1/+3
| | | | | | | | | | | | | | | | | | | | In the long run, we don't want default_idle() or (pm_idle)() to be exported outside of process.c. Start by not exporting them to modules, unless the APM build demands it. cc: x86@kernel.org cc: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Len Brown <len.brown@intel.com>
| * x86 idle: clarify AMD erratum 400 workaroundLen Brown2011-05-296-25/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The workaround for AMD erratum 400 uses the term "c1e" falsely suggesting: 1. Intel C1E is somehow involved 2. All AMD processors with C1E are involved Use the string "amd_c1e" instead of simply "c1e" to clarify that this workaround is specific to AMD's version of C1E. Use the string "e400" to clarify that the workaround is specific to AMD processors with Erratum 400. This patch is text-substitution only, with no functional change. cc: x86@kernel.org Acked-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Len Brown <len.brown@intel.com>
* | Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds2011-05-2815-1194/+2283
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, asm: Clean up desc.h a bit x86, amd: Do not enable ARAT feature on AMD processors below family 0x12 x86: Move do_page_fault()'s error path under unlikely() x86, efi: Retain boot service code until after switching to virtual mode x86: Remove unnecessary check in detect_ht() x86: Reorder mm_context_t to remove x86_64 alignment padding and thus shrink mm_struct x86, UV: Clean up uv_tlb.c x86, UV: Add support for SGI UV2 hub chip x86, cpufeature: Update CPU feature RDRND to RDRAND
| * | x86, asm: Clean up desc.h a bitIngo Molnar2011-05-271-76/+76
| | | | | | | | | | | | | | | | | | | | | | | | I have looked at this file and found it rather ugly - improve readability a bit. No change in functionality. Link: http://lkml.kernel.org/n/tip-incpt6y26yd8586idx65t9ll@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | x86, amd: Do not enable ARAT feature on AMD processors below family 0x12Boris Ostrovsky2011-05-261-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit b87cf80af3ba4b4c008b4face3c68d604e1715c6 added support for ARAT (Always Running APIC timer) on AMD processors that are not affected by erratum 400. This erratum is present on certain processor families and prevents APIC timer from waking up the CPU when it is in a deep C state, including C1E state. Determining whether a processor is affected by this erratum may have some corner cases and handling these cases is somewhat complicated. In the interest of simplicity we won't claim ARAT support on processor families below 0x12 and will go back to broadcasting timer when going idle. Signed-off-by: Boris Ostrovsky <ostr@amd64.org> Link: http://lkml.kernel.org/r/1306423192-19774-1-git-send-email-ostr@amd64.org Tested-by: Boris Petkov <borislav.petkov@amd.com> Cc: Hans Rosenfeld <Hans.Rosenfeld@amd.com> Cc: Andreas Herrmann <Andreas.Herrmann3@amd.com> Cc: Chuck Ebbert <cebbert@redhat.com> Cc: stable@kernel.org # 32.x, 38.x, 39.x Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | x86: Move do_page_fault()'s error path under unlikely()KOSAKI Motohiro2011-05-261-15/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ingo suggested SIGKILL check should be moved into slowpath function. This will reduce the page fault fastpath impact of this recent commit: 37b23e0525d3: x86,mm: make pagefault killable Suggested-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: kamezawa.hiroyu@jp.fujitsu.com Cc: minchan.kim@gmail.com Cc: willy@linux.intel.com Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/4DDE0B5C.9050907@jp.fujitsu.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | Merge branch 'linus' into x86/urgentIngo Molnar2011-05-269-51/+20
| |\ \ | | | | | | | | | | | | | | | | | | | | Merge reason: we want to queue up a dependent patch. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | x86, efi: Retain boot service code until after switching to virtual modeMatthew Garrett2011-05-253-3/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | UEFI stands for "Unified Extensible Firmware Interface", where "Firmware" is an ancient African word meaning "Why do something right when you can do it so wrong that children will weep and brave adults will cower before you", and "UEI" is Celtic for "We missed DOS so we burned it into your ROMs". The UEFI specification provides for runtime services (ie, another way for the operating system to be forced to depend on the firmware) and we rely on these for certain trivial tasks such as setting up the bootloader. But some hardware fails to work if we attempt to use these runtime services from physical mode, and so we have to switch into virtual mode. So far so dreadful. The specification makes it clear that the operating system is free to do whatever it wants with boot services code after ExitBootServices() has been called. SetVirtualAddressMap() can't be called until ExitBootServices() has been. So, obviously, a whole bunch of EFI implementations call into boot services code when we do that. Since we've been charmingly naive and trusted that the specification may be somehow relevant to the real world, we've already stuffed a picture of a penguin or something in that address space. And just to make things more entertaining, we've also marked it non-executable. This patch allocates the boot services regions during EFI init and makes sure that they're executable. Then, after SetVirtualAddressMap(), it discards them and everyone lives happily ever after. Except for the ones who have to work on EFI, who live sad lives haunted by the knowledge that someone's eventually going to write yet another firmware specification. [ hpa: adding this to urgent with a stable tag since it fixes currently-broken hardware. However, I do not know what the dependencies are and so I do not know which -stable versions this may be a candidate for. ] Signed-off-by: Matthew Garrett <mjg@redhat.com> Link: http://lkml.kernel.org/r/1306331593-28715-1-git-send-email-mjg@redhat.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: <stable@kernel.org>
| * | | x86: Remove unnecessary check in detect_ht()Nikhil P Rao2011-05-251-7/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes a check that causes incorrect scheduler domain setup (SMP instead of SMT) and bootlog warning messages when cpuid extensions for topology enumeration are not supported and the number of processors reported to the OS is smaller than smp_num_siblings. Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Nikhil P Rao <nikhil.rao@intel.com> Link: http://lkml.kernel.org/r/1306343921.19325.1.camel@fedora13 Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | x86: Reorder mm_context_t to remove x86_64 alignment padding and thus shrink ↵Richard Kennedy2011-05-251-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mm_struct Reorder mm_context_t to remove alignment padding on 64 bit builds shrinking its size from 64 to 56 bytes. This allows mm_struct to shrink from 840 to 832 bytes, so using one fewer cache lines, and getting more objects per slab when using slub. slabinfo mm_struct reports before :- Sizes (bytes) Slabs ----------------------------------- Object : 840 Total : 7 SlabObj: 896 Full : 1 SlabSiz: 16384 Partial: 4 Loss : 56 CpuSlab: 2 Align : 64 Objects: 18 after :- Sizes (bytes) Slabs ---------------------------------- Object : 832 Total : 7 SlabObj: 832 Full : 1 SlabSiz: 16384 Partial: 4 Loss : 0 CpuSlab: 2 Align : 64 Objects: 19 Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Cc: wilsons@start.ca Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Pekka Enberg <penberg@kernel.org> Link: http://lkml.kernel.org/r/1306244999.1999.5.camel@castor.rsk Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | x86, UV: Clean up uv_tlb.cCliff Wickman2011-05-252-899/+1099
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SGI UV's uv_tlb.c driver has become rather hard to read, with overly large functions, non-standard coding style and (way) too long variable, constant and function names and non-obvious code flow sequences. This patch improves the readability and maintainability of the driver significantly, by doing the following strict code cleanups with no side effects: - Split long functions into shorter logical functions. - Shortened some variable and structure member names. - Added special functions for reads and writes of MMR regs with very long names. - Added the 'tunables' table to shortened tunables_write(). - Added the 'stat_description' table to shorten uv_ptc_proc_write(). - Pass fewer 'stat' arguments where it can be derived from the 'bcp' argument. - Function definitions consistent on one line, and inline in few (short) cases. - Moved some small structures and an atomic inline function to the header file. - Moved some local variables to the blocks where they are used. - Updated the copyright date. - Shortened uv_write_global_mmr64() etc. using some aliasing; no line breaks. Renamed many uv_.. functions that are not exported. - Aligned structure fields. [ note that not all structures are aligned the same way though; I'd like to keep the extensive commenting in some of them. ] - Shortened some long structure names. - Standard pass/fail exit from init_per_cpu() - Vertical alignment for mass initializations. - More separation between blocks of code. Tested on a 16-processor Altix UV. Signed-off-by: Cliff Wickman <cpw@sgi.com> Cc: penberg@kernel.org Link: http://lkml.kernel.org/r/E1QOw12-0004MN-Lp@eag09.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | x86, UV: Add support for SGI UV2 hub chipJack Steiner2011-05-256-238/+1075
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for a new version of the SGI UV hub chip. The hub chip is the node controller that connects multiple blades into a larger coherent SSI. For the most part, UV2 is compatible with UV1. The majority of the changes are in the addresses of MMRs and in a few cases, the contents of MMRs. These changes are the result in changes in the system topology such as node configuration, processor types, maximum nodes, physical address sizes, etc. Signed-off-by: Jack Steiner <steiner@sgi.com> Link: http://lkml.kernel.org/r/20110511175028.GA18006@sgi.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | x86, cpufeature: Update CPU feature RDRND to RDRANDKees Cook2011-05-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Intel manual changed the name of the CPUID bit to match the instruction name. We should follow suit for sanity's sake. (See Intel SDM Volume 2, Table 3-20 "Feature Information Returned in the ECX Register".) [ hpa: we can only do this at this time because there are currently no CPUs with this feature on the market, hence this is pre-hardware enabling. However, Cc:'ing stable so that stable can present a consistent ABI. ] Signed-off-by: Kees Cook <kees.cook@canonical.com> Link: http://lkml.kernel.org/r/20110524232926.GA27728@outflux.net Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: <stable@kernel.org> v2.6.36-39
* | | | Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds2011-05-282-47/+60
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (25 commits) perf: Fix SIGIO handling perf top: Don't stop if no kernel symtab is found perf top: Handle kptr_restrict perf top: Remove unused macro perf events: initialize fd array to -1 instead of 0 perf tools: Make sure kptr_restrict warnings fit 80 col terms perf tools: Fix build on older systems perf symbols: Handle /proc/sys/kernel/kptr_restrict perf: Remove duplicate headers ftrace: Add internal recursive checks tracing: Update btrfs's tracepoints to use u64 interface tracing: Add __print_symbolic_u64 to avoid warnings on 32bit machine ftrace: Set ops->flag to enabled even on static function tracing tracing: Have event with function tracer check error return ftrace: Have ftrace_startup() return failure code jump_label: Check entries limit in __jump_label_update ftrace/recordmcount: Avoid STT_FUNC symbols as base on ARM scripts/tags.sh: Add magic for trace-events for etags too scripts/tags.sh: Fix ctags for DEFINE_EVENT() x86/ftrace: Fix compiler warning in ftrace.c ...
| * \ \ \ Merge branch 'tip/perf/urgent' of ↵Ingo Molnar2011-05-271-6/+6
| |\ \ \ \ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/urgent
| | * | | | x86/ftrace: Fix compiler warning in ftrace.cRakib Mullick2011-05-251-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to commit dc326fca2b64 (x86, cpu: Clean up and unify the NOP selection infrastructure), we get the following warning: arch/x86/kernel/ftrace.c: In function ‘ftrace_make_nop’: arch/x86/kernel/ftrace.c:308:6: warning: assignment discards qualifiers from pointer target type arch/x86/kernel/ftrace.c: In function ‘ftrace_make_call’: arch/x86/kernel/ftrace.c:318:6: warning: assignment discards qualifiers from pointer target type ftrace_nop_replace() now returns const unsigned char *, so change its associated function/variable to its compatible type to keep compiler clam. Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Link: http://lkml.kernel.org/r/1305221620.7986.4.camel@localhost.localdomain [ updated for change of const void *src in probe_kernel_write() ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | | Merge branch 'urgent' of ↵Ingo Molnar2011-05-271-41/+54
| |\ \ \ \ \ | | |_|_|/ / | |/| | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile into perf/urgent
| | * | | | oprofile, x86: Enable preemption during pci device setup in IBS initRobert Richter2011-05-201-41/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | IBS initialization is a mix of per-core register access and per-node pci device setup. Register access should be pinned to the cpu, but pci setup must run with preemption enabled. This patch better separates the code into non-/preemptible sections and fixes sleeping with preemption disabled. See bug message below. Fixes also freeing the eilvt entry by introducing put_eilvt(). BUG: sleeping function called from invalid context at mm/slub.c:824 in_atomic(): 1, irqs_disabled(): 0, pid: 32357, name: modprobe INFO: lockdep is turned off. Pid: 32357, comm: modprobe Not tainted 2.6.39-rc7+ #14 Call Trace: [<ffffffff8104bdc8>] __might_sleep+0x112/0x117 [<ffffffff81129693>] kmem_cache_alloc_trace+0x4b/0xe7 [<ffffffff81278f14>] kzalloc.constprop.0+0x29/0x2b [<ffffffff81278f4c>] pci_get_subsys+0x36/0x78 [<ffffffff81022689>] ? setup_APIC_eilvt+0xfb/0x139 [<ffffffff81278fa4>] pci_get_device+0x16/0x18 [<ffffffffa06c8b5d>] op_amd_init+0xd3/0x211 [oprofile] [<ffffffffa064d000>] ? 0xffffffffa064cfff [<ffffffffa064d298>] op_nmi_init+0x21e/0x26a [oprofile] [<ffffffffa064d062>] oprofile_arch_init+0xe/0x26 [oprofile] [<ffffffffa064d010>] oprofile_init+0x10/0x42 [oprofile] [<ffffffff81002099>] do_one_initcall+0x7f/0x13a [<ffffffff81096524>] sys_init_module+0x132/0x281 [<ffffffff814cc682>] system_call_fastpath+0x16/0x1b Reported-by: Dave Jones <davej@redhat.com> Cc: <stable@kernel.org> [2.6.37.x] Signed-off-by: Robert Richter <robert.richter@amd.com>
* | | | | | Merge branch 'setns'Linus Torvalds2011-05-284-1/+6
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * setns: ns: Wire up the setns system call Done as a merge to make it easier to fix up conflicts in arm due to addition of sendmmsg system call
| * | | | | | ns: Wire up the setns system callEric W. Biederman2011-05-284-1/+6
| |/ / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 32bit and 64bit on x86 are tested and working. The rest I have looked at closely and I can't find any problems. setns is an easy system call to wire up. It just takes two ints so I don't expect any weird architecture porting problems. While doing this I have noticed that we have some architectures that are very slow to get new system calls. cris seems to be the slowest where the last system calls wired up were preadv and pwritev. avr32 is weird in that recvmmsg was wired up but never declared in unistd.h. frv is behind with perf_event_open being the last syscall wired up. On h8300 the last system call wired up was epoll_wait. On m32r the last system call wired up was fallocate. mn10300 has recvmmsg as the last system call wired up. The rest seem to at least have syncfs wired up which was new in the 2.6.39. v2: Most of the architecture support added by Daniel Lezcano <dlezcano@fr.ibm.com> v3: ported to v2.6.36-rc4 by: Eric W. Biederman <ebiederm@xmission.com> v4: Moved wiring up of the system call to another patch v5: ported to v2.6.39-rc6 v6: rebased onto parisc-next and net-next to avoid syscall conflicts. v7: ported to Linus's latest post 2.6.39 tree. >  arch/blackfin/include/asm/unistd.h     |    3 ++- >  arch/blackfin/mach-common/entry.S      |    1 + Acked-by: Mike Frysinger <vapier@gentoo.org> Oh - ia64 wiring looks good. Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | Merge branch 'for-linus' of ↵Linus Torvalds2011-05-272-7/+0
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: PM: Fix PM QOS's user mode interface to work with ASCII input PM / Hibernate: Update kerneldoc comments in hibernate.c PM / Hibernate: Remove arch_prepare_suspend() PM / Hibernate: Update some comments in core hibernate code
| * | | | | | PM / Hibernate: Remove arch_prepare_suspend()Rafael J. Wysocki2011-05-242-7/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All architectures supporting hibernation define arch_prepare_suspend() as an empty function, so remove it. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
* | | | | | | Merge branch 'upstream/tidy-xen-mmu-2.6.39' of ↵Linus Torvalds2011-05-262-271/+50
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen * 'upstream/tidy-xen-mmu-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen: xen: fix compile without CONFIG_XEN_DEBUG_FS Use arbitrary_virt_to_machine() to deal with ioremapped pud updates. Use arbitrary_virt_to_machine() to deal with ioremapped pmd updates. xen/mmu: remove all ad-hoc stats stuff xen: use normal virt_to_machine for ptes xen: make a pile of mmu pvop functions static vmalloc: remove vmalloc_sync_all() from alloc_vm_area() xen: condense everything onto xen_set_pte xen: use mmu_update for xen_set_pte_at() xen: drop all the special iomap pte paths.
| * | | | | | | xen: fix compile without CONFIG_XEN_DEBUG_FSJeremy Fitzhardinge2011-05-201-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
| * | | | | | | Use arbitrary_virt_to_machine() to deal with ioremapped pud updates.Jeremy Fitzhardinge2011-05-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
| * | | | | | | Use arbitrary_virt_to_machine() to deal with ioremapped pmd updates.Jeremy Fitzhardinge2011-05-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
| * | | | | | | xen/mmu: remove all ad-hoc stats stuffJeremy Fitzhardinge2011-05-201-138/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To make way for tracing. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
| * | | | | | | xen: use normal virt_to_machine for ptesJeremy Fitzhardinge2011-05-201-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We no longer support HIGHPTE allocations, so ptes should always be within the kernel's direct map, and don't need pagetable walks to convert to machine addresses. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
| * | | | | | | xen: make a pile of mmu pvop functions staticJeremy Fitzhardinge2011-05-202-60/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
| * | | | | | | xen: condense everything onto xen_set_pteJeremy Fitzhardinge2011-05-201-46/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | xen_set_pte_at and xen_clear_pte are essentially identical to xen_set_pte, so just make them all common. When batched set_pte and pte_clear are the same, but the unbatch operation must be different: they need to update the two halves of the pte in different order. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
| * | | | | | | xen: use mmu_update for xen_set_pte_at()Jeremy Fitzhardinge2011-05-201-15/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In principle update_va_mapping is a good match for set_pte_at, since it gets the address being mapped, which allows Xen to use its linear pagetable mapping. However that assumes that the pmd for the address is attached to the current pagetable, which may not be true for a given user address space because the kernel pmd is not shared (at least on 32-bit guests). Normally the kernel will automatically sync a missing part of the pagetable with the init_mm pagetable transparently via faults, but that fails when a missing address is passed to Xen. And while the linear pagetable mapping is very useful for 32-bit Xen (as it avoids an explicit domain mapping), 32-bit Xen is deprecated. 64-bit Xen has all memory mapped all the time, so it makes no real difference. The upshot is that we should use mmu_update, since it can operate on non-current pagetables or detached pagetables. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
| * | | | | | | xen: drop all the special iomap pte paths.Jeremy Fitzhardinge2011-05-201-25/+0
| | |_|/ / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Xen can work out when we're doing IO mappings for itself, so we don't need to do anything special, and the extra tests just clog things up. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
* | | | | | | arch: remove CONFIG_GENERIC_FIND_{NEXT_BIT,BIT_LE,LAST_BIT}Akinobu Mita2011-05-261-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | By the previous style change, CONFIG_GENERIC_FIND_NEXT_BIT, CONFIG_GENERIC_FIND_BIT_LE, and CONFIG_GENERIC_FIND_LAST_BIT are not used to test for existence of find bitops anymore. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Greg Ungerer <gerg@uclinux.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | kgdbts: unify/generalize gdb breakpoint adjustmentMike Frysinger2011-05-261-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Blackfin arch, like the x86 arch, needs to adjust the PC manually after a breakpoint is hit as normally this is handled by the remote gdb. However, rather than starting another arch ifdef mess, create a common GDB_ADJUSTS_BREAK_OFFSET define for any arch to opt-in via their kgdb.h. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Acked-by: Dongdong Deng <dongdong.deng@windriver.com> Cc: Sergei Shtylyov <sshtylyov@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | x86: convert to asm-generic ptrace.hMike Frysinger2011-05-261-13/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Sergei Shtylyov <sshtylyov@mvista.com> Cc: Dongdong Deng <dongdong.deng@windriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | cgroup: remove the ns_cgroupDaniel Lezcano2011-05-262-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ns_cgroup is an annoying cgroup at the namespace / cgroup frontier and leads to some problems: * cgroup creation is out-of-control * cgroup name can conflict when pids are looping * it is not possible to have a single process handling a lot of namespaces without falling in a exponential creation time * we may want to create a namespace without creating a cgroup The ns_cgroup was replaced by a compatibility flag 'clone_children', where a newly created cgroup will copy the parent cgroup values. The userspace has to manually create a cgroup and add a task to the 'tasks' file. This patch removes the ns_cgroup as suggested in the following thread: https://lists.linux-foundation.org/pipermail/containers/2009-June/018616.html The 'cgroup_clone' function is removed because it is no longer used. This is a userspace-visible change. Commit 45531757b45c ("cgroup: notify ns_cgroup deprecated") (merged into 2.6.27) caused the kernel to emit a printk warning users that the feature is planned for removal. Since that time we have heard from XXX users who were affected by this. Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Jamal Hadi Salim <hadi@cyberus.ca> Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Paul Menage <menage@google.com> Acked-by: Matt Helsley <matthltc@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | Merge branch 'x86-vdso-for-linus' of ↵Linus Torvalds2011-05-2618-185/+202
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: vdso: Remove unused variable x86-64: Optimize vDSO time() x86-64: Add time to vDSO x86-64: Turn off -pg and turn on -foptimize-sibling-calls for vDSO x86-64: Move vread_tsc into a new file with sensible options x86-64: Vclock_gettime(CLOCK_MONOTONIC) can't ever see nsec < 0 x86-64: Don't generate cmov in vread_tsc x86-64: Remove unnecessary barrier in vread_tsc x86-64: Clean up vdso/kernel shared variables
| * | | | | | | x86: vdso: Remove unused variableThomas Gleixner2011-05-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@mit.edu>
| * | | | | | | x86-64: Optimize vDSO time()Andy Lutomirski2011-05-241-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This function just reads a 64-bit variable that's updated atomically, so we don't need any locks. Signed-off-by: Andy Lutomirski <luto@mit.edu> Cc: Andi Kleen <andi@firstfloor.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Borislav Petkov <bp@amd64.org> Link: http://lkml.kernel.org/r/%3C40e2700f8cda4d511e5910be1e633025d28b36c2.1306156808.git.luto%40mit.edu%3E Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | | | | x86-64: Add time to vDSOAndy Lutomirski2011-05-242-1/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The only fast implementation of time(2) we expose is through the vsyscall page and we want to get userspace to stop using the vsyscall page. So make it available through the vDSO as well. This is essentially a cut-n-paste job. Signed-off-by: Andy Lutomirski <luto@mit.edu> Cc: Andi Kleen <andi@firstfloor.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Borislav Petkov <bp@amd64.org> Link: http://lkml.kernel.org/r/%3Cbf963bac5207de4b29613f27c42705e4371812a8.1306156808.git.luto%40mit.edu%3E Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | | | | x86-64: Turn off -pg and turn on -foptimize-sibling-calls for vDSOAndy Lutomirski2011-05-241-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The vDSO isn't part of the kernel, so profiling and kernel backtraces don't really matter. Signed-off-by: Andy Lutomirski <luto@mit.edu> Cc: Andi Kleen <andi@firstfloor.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Borislav Petkov <bp@amd64.org> Link: http://lkml.kernel.org/r/%3C23087b738c037342abb53f2f07b9bef89ceaeea3.1306156808.git.luto%40mit.edu%3E Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | | | | x86-64: Move vread_tsc into a new file with sensible optionsAndy Lutomirski2011-05-244-37/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vread_tsc is short and hot, and it's userspace code so the usual reasons to enable -pg and turn off sibling calls don't apply. (OK, turning off sibling calls has no effect. But it might someday...) As an added benefit, tsc.c is profilable now. Signed-off-by: Andy Lutomirski <luto@mit.edu> Cc: Andi Kleen <andi@firstfloor.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Borislav Petkov <bp@amd64.org> Link: http://lkml.kernel.org/r/%3C99c6d7f5efa3ccb65b4ac6eb443e1ab7bad47d7b.1306156808.git.luto%40mit.edu%3E Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | | | | x86-64: Vclock_gettime(CLOCK_MONOTONIC) can't ever see nsec < 0Andy Lutomirski2011-05-241-18/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vclock_gettime's do_monotonic helper can't ever generate a negative nsec value, so it doesn't need to check whether it's negative. In the CLOCK_MONOTONIC_COARSE case, ns can't ever exceed 2e9-1, so we can avoid the loop entirely. This saves a single easily-predicted branch. Signed-off-by: Andy Lutomirski <luto@mit.edu> Cc: Andi Kleen <andi@firstfloor.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Borislav Petkov <bp@amd64.org> Link: http://lkml.kernel.org/r/%3Cd6d528d32c7a21618057cfc9005942a0fe5cb54a.1306156808.git.luto%40mit.edu%3E Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | | | | x86-64: Don't generate cmov in vread_tscAndy Lutomirski2011-05-241-2/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vread_tsc checks whether rdtsc returns something less than cycle_last, which is an extremely predictable branch. GCC likes to generate a cmov anyway, which is several cycles slower than a predicted branch. This saves a couple of nanoseconds. Signed-off-by: Andy Lutomirski <luto@mit.edu> Cc: Andi Kleen <andi@firstfloor.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Borislav Petkov <bp@amd64.org> Link: http://lkml.kernel.org/r/%3C561280649519de41352fcb620684dfb22bad6bac.1306156808.git.luto%40mit.edu%3E Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | | | | x86-64: Remove unnecessary barrier in vread_tscAndy Lutomirski2011-05-241-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | RDTSC is completely unordered on modern Intel and AMD CPUs. The Intel manual says that lfence;rdtsc causes all previous instructions to complete before the tsc is read, and the AMD manual says to use mfence;rdtsc to do the same thing. From a decent amount of testing [1] this is enough to make rdtsc be ordered with respect to subsequent loads across a wide variety of CPUs. On Sandy Bridge (i7-2600), this improves a loop of clock_gettime(CLOCK_MONOTONIC) by more than 5 ns/iter. [1] https://lkml.org/lkml/2011/4/18/350 Signed-off-by: Andy Lutomirski <luto@mit.edu> Cc: Andi Kleen <andi@firstfloor.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Borislav Petkov <bp@amd64.org> Link: http://lkml.kernel.org/r/%3C1c158b9d74338aa5361f96dd473d0e6a58235302.1306156808.git.luto%40mit.edu%3E Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | | | | x86-64: Clean up vdso/kernel shared variablesAndy Lutomirski2011-05-2415-145/+91
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Variables that are shared between the vdso and the kernel are currently a bit of a mess. They are each defined with their own magic, they are accessed differently in the kernel, the vsyscall page, and the vdso, and one of them (vsyscall_clock) doesn't even really exist. This changes them all to use a common mechanism. All of them are delcared in vvar.h with a fixed address (validated by the linker script). In the kernel (as before), they look like ordinary read-write variables. In the vsyscall page and the vdso, they are accessed through a new macro VVAR, which gives read-only access. The vdso is now loaded verbatim into memory without any fixups. As a side bonus, access from the vdso is faster because a level of indirection is removed. While we're at it, pack jiffies and vgetcpu_mode into the same cacheline. Signed-off-by: Andy Lutomirski <luto@mit.edu> Cc: Andi Kleen <andi@firstfloor.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Borislav Petkov <bp@amd64.org> Link: http://lkml.kernel.org/r/%3C7357882fbb51fa30491636a7b6528747301b7ee9.1306156808.git.luto%40mit.edu%3E Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
OpenPOWER on IntegriCloud