summaryrefslogtreecommitdiffstats
path: root/init/Kconfig
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'xfs-for-linus-v3.12-rc1' of git://oss.sgi.com/xfs/xfsLinus Torvalds2013-09-091-13/+0
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull xfs updates from Ben Myers: "For 3.12-rc1 there are a number of bugfixes in addition to work to ease usage of shared code between libxfs and the kernel, the rest of the work to enable project and group quotas to be used simultaneously, performance optimisations in the log and the CIL, directory entry file type support, fixes for log space reservations, some spelling/grammar cleanups, and the addition of user namespace support. - introduce readahead to log recovery - add directory entry file type support - fix a number of spelling errors in comments - introduce new Q_XGETQSTATV quotactl for project quotas - add USER_NS support - log space reservation rework - CIL optimisations - kernel/userspace libxfs rework" * tag 'xfs-for-linus-v3.12-rc1' of git://oss.sgi.com/xfs/xfs: (112 commits) xfs: XFS_MOUNT_QUOTA_ALL needed by userspace xfs: dtype changed xfs_dir2_sfe_put_ino to xfs_dir3_sfe_put_ino Fix wrong flag ASSERT in xfs_attr_shortform_getvalue xfs: finish removing IOP_* macros. xfs: inode log reservations are too small xfs: check correct status variable for xfs_inobt_get_rec() call xfs: inode buffers may not be valid during recovery readahead xfs: check LSN ordering for v5 superblocks during recovery xfs: btree block LSN escaping to disk uninitialised XFS: Assertion failed: first <= last && last < BBTOB(bp->b_length), file: fs/xfs/xfs_trans_buf.c, line: 568 xfs: fix bad dquot buffer size in log recovery readahead xfs: don't account buffer cancellation during log recovery readahead xfs: check for underflow in xfs_iformat_fork() xfs: xfs_dir3_sfe_put_ino can be static xfs: introduce object readahead to log recovery xfs: Simplify xfs_ail_min() with list_first_entry_or_null() xfs: Register hotcpu notifier after initialization xfs: add xfs sb v4 support for dirent filetype field xfs: Add write support for dirent filetype field xfs: Add read-only support for dirent filetype field ...
| * enable building user namespace with xfsDwight Engen2013-08-151-13/+0
| | | | | | | | | | | | | | | | Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Dwight Engen <dwight.engen@oracle.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* | Merge branch 'timers-nohz-for-linus' of ↵Linus Torvalds2013-09-041-6/+22
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timers/nohz changes from Ingo Molnar: "It mostly contains fixes and full dynticks off-case optimizations, by Frederic Weisbecker" * 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) nohz: Include local CPU in full dynticks global kick nohz: Optimize full dynticks's sched hooks with static keys nohz: Optimize full dynticks state checks with static keys nohz: Rename a few state variables vtime: Always debug check snapshot source _before_ updating it vtime: Always scale generic vtime accounting results vtime: Optimize full dynticks accounting off case with static keys vtime: Describe overriden functions in dedicated arch headers m68k: hardirq_count() only need preempt_mask.h hardirq: Split preempt count mask definitions context_tracking: Split low level state headers vtime: Fix racy cputime delta update vtime: Remove a few unneeded generic vtime state checks context_tracking: User/kernel broundary cross trace events context_tracking: Optimize context switch off case with static keys context_tracking: Optimize guest APIs off case with static key context_tracking: Optimize main APIs off case with static key context_tracking: Ground setup for static key use context_tracking: Remove full dynticks' hacky dependency on wide context tracking nohz: Only enable context tracking on full dynticks CPUs ...
| * | context_tracking: Remove full dynticks' hacky dependency on wide context ↵Frederic Weisbecker2013-08-131-6/+22
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tracking Now that the full dynticks subsystem only enables the context tracking on full dynticks CPUs, lets remove the dependency on CONTEXT_TRACKING_FORCE This dependency was a hack to enable the context tracking widely for the full dynticks susbsystem until the latter becomes able to enable it in a more CPU-finegrained fashion. Now CONTEXT_TRACKING_FORCE only stands for testing on archs that work on support for the context tracking while full dynticks can't be used yet due to unmet dependencies. It simulates a system where all CPUs are full dynticks so that RCU user extended quiescent states and dynticks cputime accounting can be tested on the given arch. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kevin Hilman <khilman@linaro.org>
* | Merge branch 'rcu/next' of ↵Ingo Molnar2013-09-031-0/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: " * Update RCU documentation. These were posted to LKML at https://lkml.org/lkml/2013/8/19/611. * Miscellaneous fixes. These were posted to LKML at https://lkml.org/lkml/2013/8/19/619. * Full-system idle detection. This is for use by Frederic Weisbecker's adaptive-ticks mechanism. Its purpose is to allow the timekeeping CPU to shut off its tick when all other CPUs are idle. These were posted to LKML at https://lkml.org/lkml/2013/8/19/648. * Improve rcutorture test coverage. These were posted to LKML at https://lkml.org/lkml/2013/8/19/675. " Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | rcu: Select IRQ_WORK from TREE_PREEMPT_RCUJames Hogan2013-08-181-0/+1
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TREE_RCU and TREE_PREEMPT_RCU both cause kernel/rcutree.c to be built, but only TREE_RCU selects IRQ_WORK, which can result in an undefined reference to irq_work_queue for some (random) configs: kernel/built-in.o In function `rcu_start_gp_advanced': kernel/rcutree.c:1564: undefined reference to `irq_work_queue' Select IRQ_WORK from TREE_PREEMPT_RCU too to fix this. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
* | memcg: get rid of swapaccount leftoversMichal Hocko2013-08-231-1/+1
|/ | | | | | | | | | | | | | | The swapaccount kernel parameter without any values has been removed by commit a2c8990aed5a ("memsw: remove noswapaccount kernel parameter") but it seems that we didn't get rid of all the left overs. Make sure that menuconfig help text and kernel-parameters.txt are clear about value for the paramter and remove the stalled comment which is not very much useful on its own. Signed-off-by: Michal Hocko <mhocko@suse.cz> Reported-by: Gergely Risko <gergely@risko.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'slab/for-linus' of ↵Linus Torvalds2013-07-141-0/+11
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux Pull slab update from Pekka Enberg: "Highlights: - Fix for boot-time problems on some architectures due to init_lock_keys() not respecting kmalloc_caches boundaries (Christoph Lameter) - CONFIG_SLUB_CPU_PARTIAL requested by RT folks (Joonsoo Kim) - Fix for excessive slab freelist draining (Wanpeng Li) - SLUB and SLOB cleanups and fixes (various people)" I ended up editing the branch, and this avoids two commits at the end that were immediately reverted, and I instead just applied the oneliner fix in between myself. * 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux slub: Check for page NULL before doing the node_match check mm/slab: Give s_next and s_stop slab-specific names slob: Check for NULL pointer before calling ctor() slub: Make cpu partial slab support configurable slab: add kmalloc() to kernel API documentation slab: fix init_lock_keys slob: use DIV_ROUND_UP where possible slub: do not put a slab to cpu partial list when cpu_partial is 0 mm/slub: Use node_nr_slabs and node_nr_objs in get_slabinfo mm/slub: Drop unnecessary nr_partials mm/slab: Fix /proc/slabinfo unwriteable for slab mm/slab: Sharing s_next and s_stop between slab and slub mm/slab: Fix drain freelist excessively slob: Rework #ifdeffery in slab.h mm, slab: moved kmem_cache_alloc_node comment to correct place
| * slub: Make cpu partial slab support configurableJoonsoo Kim2013-07-071-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | CPU partial support can introduce level of indeterminism that is not wanted in certain context (like a realtime kernel). Make it configurable. This patch is based on Christoph Lameter's "slub: Make cpu partial slab support configurable V2". Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
* | lib: add support for LZ4-compressed kernelKyungsik Lee2013-07-091-1/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for extracting LZ4-compressed kernel images, as well as LZ4-compressed ramdisk images in the kernel boot process. Signed-off-by: Kyungsik Lee <kyungsik.lee@lge.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Borislav Petkov <bp@alien8.de> Cc: Florian Fainelli <florian@openwrt.org> Cc: Yann Collet <yann.collet.73@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge branch 'timers-core-for-linus' of ↵Linus Torvalds2013-07-061-0/+3
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer core updates from Thomas Gleixner: "The timer changes contain: - posix timer code consolidation and fixes for odd corner cases - sched_clock implementation moved from ARM to core code to avoid duplication by other architectures - alarm timer updates - clocksource and clockevents unregistration facilities - clocksource/events support for new hardware - precise nanoseconds RTC readout (Xen feature) - generic support for Xen suspend/resume oddities - the usual lot of fixes and cleanups all over the place The parts which touch other areas (ARM/XEN) have been coordinated with the relevant maintainers. Though this results in an handful of trivial to solve merge conflicts, which we preferred over nasty cross tree merge dependencies. The patches which have been committed in the last few days are bug fixes plus the posix timer lot. The latter was in akpms queue and next for quite some time; they just got forgotten and Frederic collected them last minute." * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (59 commits) hrtimer: Remove unused variable hrtimers: Move SMP function call to thread context clocksource: Reselect clocksource when watchdog validated high-res capability posix-cpu-timers: don't account cpu timer after stopped thread runtime accounting posix_timers: fix racy timer delta caching on task exit posix-timers: correctly get dying task time sample in posix_cpu_timer_schedule() selftests: add basic posix timers selftests posix_cpu_timers: consolidate expired timers check posix_cpu_timers: consolidate timer list cleanups posix_cpu_timer: consolidate expiry time type tick: Sanitize broadcast control logic tick: Prevent uncontrolled switch to oneshot mode tick: Make oneshot broadcast robust vs. CPU offlining x86: xen: Sync the CMOS RTC as well as the Xen wallclock x86: xen: Sync the wallclock when the system time is set timekeeping: Indicate that clock was set in the pvclock gtod notifier timekeeping: Pass flags instead of multiple bools to timekeeping_update() xen: Remove clock_was_set() call in the resume path hrtimers: Support resuming with two or more CPUs online (but stopped) timer: Fix jiffies wrap behavior of round_jiffies_common() ...
| * \ Merge branch 'timers/posix-cpu-timers-for-tglx' ofThomas Gleixner2013-07-041-0/+1
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into timers/core Frederic sayed: "Most of these patches have been hanging around for several month now, in -mmotm for a significant chunk. They already missed a few releases." Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | sched_clock: Make ARM's sched_clock generic for all architecturesStephen Boyd2013-06-121-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Nothing about the sched_clock implementation in the ARM port is specific to the architecture. Generalize the code so that other architectures can use it by selecting GENERIC_SCHED_CLOCK. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> [jstultz: Merge minor collisions with other patches in my tree] Signed-off-by: John Stultz <john.stultz@linaro.org>
* | | | memcg: Kconfig info updateSergey Dyasly2013-07-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now there are only 2 members in struct page_cgroup. Update config MEMCG description accordingly. Signed-off-by: Sergey Dyasly <dserrg@gmail.com> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds2013-07-021-19/+27
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The major changes: - Simplify RCU's grace-period and callback processing based on the new numbering for callbacks. - Removal of TINY_PREEMPT_RCU in favor of TREE_PREEMPT_RCU for single-CPU low-latency systems. - SRCU-related changes and fixes. - Miscellaneous fixes, including converting a few remaining printk() calls to pr_*(). - Documentation updates" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits) rcu: Shrink TINY_RCU by reworking CPU-stall ifdefs rcu: Shrink TINY_RCU by moving exit_rcu() rcu: Remove TINY_PREEMPT_RCU tracing documentation rcu: Consolidate rcutiny_plugin.h ifdefs rcu: Remove rcu_preempt_note_context_switch() rcu: Remove the CONFIG_TINY_RCU ifdefs in rcutiny.h rcu: Remove check_cpu_stall_preempt() rcu: Simplify RCU_TINY RCU callback invocation rcu: Remove rcu_preempt_process_callbacks() rcu: Remove rcu_preempt_remove_callbacks() rcu: Remove rcu_preempt_check_callbacks() rcu: Remove show_tiny_preempt_stats() rcu: Remove TINY_PREEMPT_RCU powerpc,kvm: fix imbalance srcu_read_[un]lock() rcu: Remove srcu_read_lock_raw() and srcu_read_unlock_raw(). rcu: Apply Dave Jones's NOCB Kconfig help feedback rcu: Merge adjacent identical ifdefs rcu: Drive quiescent-state-forcing delay from HZ rcu: Remove "Experimental" flags kthread: Add kworker kthreads to OS-jitter documentation ...
| | \ \ \
| | \ \ \
| *-. \ \ \ Merge branches 'cbnum.2013.06.10a', 'doc.2013.06.10a', 'fixes.2013.06.10a', ↵Paul E. McKenney2013-06-101-19/+27
| |\ \ \ \ \ | | |_|_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'srcu.2013.06.10a' and 'tiny.2013.06.10a' into HEAD cbnum.2013.06.10a: Apply simplifications stemming from the new callback numbering. doc.2013.06.10a: Documentation updates. fixes.2013.06.10a: Miscellaneous fixes. srcu.2013.06.10a: Updates to SRCU. tiny.2013.06.10a: Eliminate TINY_PREEMPT_RCU.
| | | * | | rcu: Remove TINY_PREEMPT_RCUPaul E. McKenney2013-06-101-9/+1
| | |/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TINY_PREEMPT_RCU adds significant code and complexity, but does not offer commensurate benefits. People currently using TINY_PREEMPT_RCU can get much better memory footprint with TINY_RCU, or, if they really need preemptible RCU, they can use TREE_PREEMPT_RCU with a relatively minor degradation in memory footprint. Please note that this move has been widely publicized on LKML (https://lkml.org/lkml/2012/11/12/545) and on LWN (http://lwn.net/Articles/541037/). This commit therefore removes TINY_PREEMPT_RCU. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Updated to eliminate #else in rcutiny.h as suggested by Josh ] Reviewed-by: Josh Triplett <josh@joshtriplett.org>
| | * | | rcu: Apply Dave Jones's NOCB Kconfig help feedbackPaul E. McKenney2013-06-101-9/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Kconfig help text for the RCU_NOCB_CPU_NONE, RCU_NOCB_CPU_ZERO, and RCU_NOCB_CPU_ALL Kconfig options was unclear, so this commit adds a bit more detail. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
| | * | | rcu: Remove "Experimental" flagsPaul E. McKenney2013-06-101-1/+1
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After a release or two, features are no longer experimental. Therefore, this commit removes the "Experimental" tag from them. Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
* | | | build some drivers only when compile-testingJiri Slaby2013-06-241-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some drivers can be built on more platforms than they run on. This is a burden for users and distributors who package a kernel. They have to manually deselect some (for them useless) drivers when updating their configs via oldconfig. And yet, sometimes it is even impossible to disable the drivers without patching the kernel. Introduce a new config option COMPILE_TEST and make all those drivers to depend on the platform they run on, or on the COMPILE_TEST option. Now, when users/distributors choose COMPILE_TEST=n they will not have the drivers in their allmodconfig setups, but developers still can compile-test them with COMPILE_TEST=y. Now the drivers where we use this new option: * PTP_1588_CLOCK_PCH: The PCH EG20T is only compatible with Intel Atom processors so it should depend on x86. * FB_GEODE: Geode is 32-bit only so only enable it for X86_32. * USB_CHIPIDEA_IMX: The OF_DEVICE dependency will be met on powerpc systems -- which do not actually support the hardware via that method. * INTEL_MID_PTI: It is specific to the Penwell type of Intel Atom device. [v2] * remove EXPERT dependency [gregkh - remove chipidea portion, as it's incorrect, and also doesn't apply to my driver-core tree] Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jeff Mahoney <jeffm@suse.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: linux-usb@vger.kernel.org Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de> Cc: linux-geode@lists.infradead.org Cc: linux-fbdev@vger.kernel.org Cc: Richard Cochran <richardcochran@gmail.com> Cc: netdev@vger.kernel.org Cc: Ben Hutchings <ben@decadent.org.uk> Cc: "Keller, Jacob E" <jacob.e.keller@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | | | Merge 3.10-rc6 into driver-core-nextGreg Kroah-Hartman2013-06-171-0/+1
|\ \ \ \ | |/ / / | | | | | | | | | | | | | | | | We want these fixes here too. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | rcu: Don't call wakeup() with rcu_node structure ->lock heldSteven Rostedt2013-06-101-0/+1
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit fixes a lockdep-detected deadlock by moving a wake_up() call out from a rnp->lock critical section. Please see below for the long version of this story. On Tue, 2013-05-28 at 16:13 -0400, Dave Jones wrote: > [12572.705832] ====================================================== > [12572.750317] [ INFO: possible circular locking dependency detected ] > [12572.796978] 3.10.0-rc3+ #39 Not tainted > [12572.833381] ------------------------------------------------------- > [12572.862233] trinity-child17/31341 is trying to acquire lock: > [12572.870390] (rcu_node_0){..-.-.}, at: [<ffffffff811054ff>] rcu_read_unlock_special+0x9f/0x4c0 > [12572.878859] > but task is already holding lock: > [12572.894894] (&ctx->lock){-.-...}, at: [<ffffffff811390ed>] perf_lock_task_context+0x7d/0x2d0 > [12572.903381] > which lock already depends on the new lock. > > [12572.927541] > the existing dependency chain (in reverse order) is: > [12572.943736] > -> #4 (&ctx->lock){-.-...}: > [12572.960032] [<ffffffff810b9851>] lock_acquire+0x91/0x1f0 > [12572.968337] [<ffffffff816ebc90>] _raw_spin_lock+0x40/0x80 > [12572.976633] [<ffffffff8113c987>] __perf_event_task_sched_out+0x2e7/0x5e0 > [12572.984969] [<ffffffff81088953>] perf_event_task_sched_out+0x93/0xa0 > [12572.993326] [<ffffffff816ea0bf>] __schedule+0x2cf/0x9c0 > [12573.001652] [<ffffffff816eacfe>] schedule_user+0x2e/0x70 > [12573.009998] [<ffffffff816ecd64>] retint_careful+0x12/0x2e > [12573.018321] > -> #3 (&rq->lock){-.-.-.}: > [12573.034628] [<ffffffff810b9851>] lock_acquire+0x91/0x1f0 > [12573.042930] [<ffffffff816ebc90>] _raw_spin_lock+0x40/0x80 > [12573.051248] [<ffffffff8108e6a7>] wake_up_new_task+0xb7/0x260 > [12573.059579] [<ffffffff810492f5>] do_fork+0x105/0x470 > [12573.067880] [<ffffffff81049686>] kernel_thread+0x26/0x30 > [12573.076202] [<ffffffff816cee63>] rest_init+0x23/0x140 > [12573.084508] [<ffffffff81ed8e1f>] start_kernel+0x3f1/0x3fe > [12573.092852] [<ffffffff81ed856f>] x86_64_start_reservations+0x2a/0x2c > [12573.101233] [<ffffffff81ed863d>] x86_64_start_kernel+0xcc/0xcf > [12573.109528] > -> #2 (&p->pi_lock){-.-.-.}: > [12573.125675] [<ffffffff810b9851>] lock_acquire+0x91/0x1f0 > [12573.133829] [<ffffffff816ebe9b>] _raw_spin_lock_irqsave+0x4b/0x90 > [12573.141964] [<ffffffff8108e881>] try_to_wake_up+0x31/0x320 > [12573.150065] [<ffffffff8108ebe2>] default_wake_function+0x12/0x20 > [12573.158151] [<ffffffff8107bbf8>] autoremove_wake_function+0x18/0x40 > [12573.166195] [<ffffffff81085398>] __wake_up_common+0x58/0x90 > [12573.174215] [<ffffffff81086909>] __wake_up+0x39/0x50 > [12573.182146] [<ffffffff810fc3da>] rcu_start_gp_advanced.isra.11+0x4a/0x50 > [12573.190119] [<ffffffff810fdb09>] rcu_start_future_gp+0x1c9/0x1f0 > [12573.198023] [<ffffffff810fe2c4>] rcu_nocb_kthread+0x114/0x930 > [12573.205860] [<ffffffff8107a91d>] kthread+0xed/0x100 > [12573.213656] [<ffffffff816f4b1c>] ret_from_fork+0x7c/0xb0 > [12573.221379] > -> #1 (&rsp->gp_wq){..-.-.}: > [12573.236329] [<ffffffff810b9851>] lock_acquire+0x91/0x1f0 > [12573.243783] [<ffffffff816ebe9b>] _raw_spin_lock_irqsave+0x4b/0x90 > [12573.251178] [<ffffffff810868f3>] __wake_up+0x23/0x50 > [12573.258505] [<ffffffff810fc3da>] rcu_start_gp_advanced.isra.11+0x4a/0x50 > [12573.265891] [<ffffffff810fdb09>] rcu_start_future_gp+0x1c9/0x1f0 > [12573.273248] [<ffffffff810fe2c4>] rcu_nocb_kthread+0x114/0x930 > [12573.280564] [<ffffffff8107a91d>] kthread+0xed/0x100 > [12573.287807] [<ffffffff816f4b1c>] ret_from_fork+0x7c/0xb0 Notice the above call chain. rcu_start_future_gp() is called with the rnp->lock held. Then it calls rcu_start_gp_advance, which does a wakeup. You can't do wakeups while holding the rnp->lock, as that would mean that you could not do a rcu_read_unlock() while holding the rq lock, or any lock that was taken while holding the rq lock. This is because... (See below). > [12573.295067] > -> #0 (rcu_node_0){..-.-.}: > [12573.309293] [<ffffffff810b8d36>] __lock_acquire+0x1786/0x1af0 > [12573.316568] [<ffffffff810b9851>] lock_acquire+0x91/0x1f0 > [12573.323825] [<ffffffff816ebc90>] _raw_spin_lock+0x40/0x80 > [12573.331081] [<ffffffff811054ff>] rcu_read_unlock_special+0x9f/0x4c0 > [12573.338377] [<ffffffff810760a6>] __rcu_read_unlock+0x96/0xa0 > [12573.345648] [<ffffffff811391b3>] perf_lock_task_context+0x143/0x2d0 > [12573.352942] [<ffffffff8113938e>] find_get_context+0x4e/0x1f0 > [12573.360211] [<ffffffff811403f4>] SYSC_perf_event_open+0x514/0xbd0 > [12573.367514] [<ffffffff81140e49>] SyS_perf_event_open+0x9/0x10 > [12573.374816] [<ffffffff816f4dd4>] tracesys+0xdd/0xe2 Notice the above trace. perf took its own ctx->lock, which can be taken while holding the rq lock. While holding this lock, it did a rcu_read_unlock(). The perf_lock_task_context() basically looks like: rcu_read_lock(); raw_spin_lock(ctx->lock); rcu_read_unlock(); Now, what looks to have happened, is that we scheduled after taking that first rcu_read_lock() but before taking the spin lock. When we scheduled back in and took the ctx->lock, the following rcu_read_unlock() triggered the "special" code. The rcu_read_unlock_special() takes the rnp->lock, which gives us a possible deadlock scenario. CPU0 CPU1 CPU2 ---- ---- ---- rcu_nocb_kthread() lock(rq->lock); lock(ctx->lock); lock(rnp->lock); wake_up(); lock(rq->lock); rcu_read_unlock(); rcu_read_unlock_special(); lock(rnp->lock); lock(ctx->lock); **** DEADLOCK **** > [12573.382068] > other info that might help us debug this: > > [12573.403229] Chain exists of: > rcu_node_0 --> &rq->lock --> &ctx->lock > > [12573.424471] Possible unsafe locking scenario: > > [12573.438499] CPU0 CPU1 > [12573.445599] ---- ---- > [12573.452691] lock(&ctx->lock); > [12573.459799] lock(&rq->lock); > [12573.467010] lock(&ctx->lock); > [12573.474192] lock(rcu_node_0); > [12573.481262] > *** DEADLOCK *** > > [12573.501931] 1 lock held by trinity-child17/31341: > [12573.508990] #0: (&ctx->lock){-.-...}, at: [<ffffffff811390ed>] perf_lock_task_context+0x7d/0x2d0 > [12573.516475] > stack backtrace: > [12573.530395] CPU: 1 PID: 31341 Comm: trinity-child17 Not tainted 3.10.0-rc3+ #39 > [12573.545357] ffffffff825b4f90 ffff880219f1dbc0 ffffffff816e375b ffff880219f1dc00 > [12573.552868] ffffffff816dfa5d ffff880219f1dc50 ffff88023ce4d1f8 ffff88023ce4ca40 > [12573.560353] 0000000000000001 0000000000000001 ffff88023ce4d1f8 ffff880219f1dcc0 > [12573.567856] Call Trace: > [12573.575011] [<ffffffff816e375b>] dump_stack+0x19/0x1b > [12573.582284] [<ffffffff816dfa5d>] print_circular_bug+0x200/0x20f > [12573.589637] [<ffffffff810b8d36>] __lock_acquire+0x1786/0x1af0 > [12573.596982] [<ffffffff810918f5>] ? sched_clock_cpu+0xb5/0x100 > [12573.604344] [<ffffffff810b9851>] lock_acquire+0x91/0x1f0 > [12573.611652] [<ffffffff811054ff>] ? rcu_read_unlock_special+0x9f/0x4c0 > [12573.619030] [<ffffffff816ebc90>] _raw_spin_lock+0x40/0x80 > [12573.626331] [<ffffffff811054ff>] ? rcu_read_unlock_special+0x9f/0x4c0 > [12573.633671] [<ffffffff811054ff>] rcu_read_unlock_special+0x9f/0x4c0 > [12573.640992] [<ffffffff811390ed>] ? perf_lock_task_context+0x7d/0x2d0 > [12573.648330] [<ffffffff810b429e>] ? put_lock_stats.isra.29+0xe/0x40 > [12573.655662] [<ffffffff813095a0>] ? delay_tsc+0x90/0xe0 > [12573.662964] [<ffffffff810760a6>] __rcu_read_unlock+0x96/0xa0 > [12573.670276] [<ffffffff811391b3>] perf_lock_task_context+0x143/0x2d0 > [12573.677622] [<ffffffff81139070>] ? __perf_event_enable+0x370/0x370 > [12573.684981] [<ffffffff8113938e>] find_get_context+0x4e/0x1f0 > [12573.692358] [<ffffffff811403f4>] SYSC_perf_event_open+0x514/0xbd0 > [12573.699753] [<ffffffff8108cd9d>] ? get_parent_ip+0xd/0x50 > [12573.707135] [<ffffffff810b71fd>] ? trace_hardirqs_on_caller+0xfd/0x1c0 > [12573.714599] [<ffffffff81140e49>] SyS_perf_event_open+0x9/0x10 > [12573.721996] [<ffffffff816f4dd4>] tracesys+0xdd/0xe2 This commit delays the wakeup via irq_work(), which is what perf and ftrace use to perform wakeups in critical sections. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
* | | Finally eradicate CONFIG_HOTPLUGStephen Rothwell2013-06-031-3/+0
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ever since commit 45f035ab9b8f ("CONFIG_HOTPLUG should be always on"), it has been basically impossible to build a kernel with CONFIG_HOTPLUG turned off. Remove all the remaining references to it. Cc: Russell King <linux@arm.linux.org.uk> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | Merge branch 'timers-nohz-for-linus' of ↵Linus Torvalds2013-05-051-6/+6
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull 'full dynticks' support from Ingo Molnar: "This tree from Frederic Weisbecker adds a new, (exciting! :-) core kernel feature to the timer and scheduler subsystems: 'full dynticks', or CONFIG_NO_HZ_FULL=y. This feature extends the nohz variable-size timer tick feature from idle to busy CPUs (running at most one task) as well, potentially reducing the number of timer interrupts significantly. This feature got motivated by real-time folks and the -rt tree, but the general utility and motivation of full-dynticks runs wider than that: - HPC workloads get faster: CPUs running a single task should be able to utilize a maximum amount of CPU power. A periodic timer tick at HZ=1000 can cause a constant overhead of up to 1.0%. This feature removes that overhead - and speeds up the system by 0.5%-1.0% on typical distro configs even on modern systems. - Real-time workload latency reduction: CPUs running critical tasks should experience as little jitter as possible. The last remaining source of kernel-related jitter was the periodic timer tick. - A single task executing on a CPU is a pretty common situation, especially with an increasing number of cores/CPUs, so this feature helps desktop and mobile workloads as well. The cost of the feature is mainly related to increased timer reprogramming overhead when a CPU switches its tick period, and thus slightly longer to-idle and from-idle latency. Configuration-wise a third mode of operation is added to the existing two NOHZ kconfig modes: - CONFIG_HZ_PERIODIC: [formerly !CONFIG_NO_HZ], now explicitly named as a config option. This is the traditional Linux periodic tick design: there's a HZ tick going on all the time, regardless of whether a CPU is idle or not. - CONFIG_NO_HZ_IDLE: [formerly CONFIG_NO_HZ=y], this turns off the periodic tick when a CPU enters idle mode. - CONFIG_NO_HZ_FULL: this new mode, in addition to turning off the tick when a CPU is idle, also slows the tick down to 1 Hz (one timer interrupt per second) when only a single task is running on a CPU. The .config behavior is compatible: existing !CONFIG_NO_HZ and CONFIG_NO_HZ=y settings get translated to the new values, without the user having to configure anything. CONFIG_NO_HZ_FULL is turned off by default. This feature is based on a lot of infrastructure work that has been steadily going upstream in the last 2-3 cycles: related RCU support and non-periodic cputime support in particular is upstream already. This tree adds the final pieces and activates the feature. The pull request is marked RFC because: - it's marked 64-bit only at the moment - the 32-bit support patch is small but did not get ready in time. - it has a number of fresh commits that came in after the merge window. The overwhelming majority of commits are from before the merge window, but still some aspects of the tree are fresh and so I marked it RFC. - it's a pretty wide-reaching feature with lots of effects - and while the components have been in testing for some time, the full combination is still not very widely used. That it's default-off should reduce its regression abilities and obviously there are no known regressions with CONFIG_NO_HZ_FULL=y enabled either. - the feature is not completely idempotent: there is no 100% equivalent replacement for a periodic scheduler/timer tick. In particular there's ongoing work to map out and reduce its effects on scheduler load-balancing and statistics. This should not impact correctness though, there are no known regressions related to this feature at this point. - it's a pretty ambitious feature that with time will likely be enabled by most Linux distros, and we'd like you to make input on its design/implementation, if you dislike some aspect we missed. Without flaming us to crisp! :-) Future plans: - there's ongoing work to reduce 1Hz to 0Hz, to essentially shut off the periodic tick altogether when there's a single busy task on a CPU. We'd first like 1 Hz to be exposed more widely before we go for the 0 Hz target though. - once we reach 0 Hz we can remove the periodic tick assumption from nr_running>=2 as well, by essentially interrupting busy tasks only as frequently as the sched_latency constraints require us to do - once every 4-40 msecs, depending on nr_running. I am personally leaning towards biting the bullet and doing this in v3.10, like the -rt tree this effort has been going on for too long - but the final word is up to you as usual. More technical details can be found in Documentation/timers/NO_HZ.txt" * 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits) sched: Keep at least 1 tick per second for active dynticks tasks rcu: Fix full dynticks' dependency on wide RCU nocb mode nohz: Protect smp_processor_id() in tick_nohz_task_switch() nohz_full: Add documentation. cputime_nsecs: use math64.h for nsec resolution conversion helpers nohz: Select VIRT_CPU_ACCOUNTING_GEN from full dynticks config nohz: Reduce overhead under high-freq idling patterns nohz: Remove full dynticks' superfluous dependency on RCU tree nohz: Fix unavailable tick_stop tracepoint in dynticks idle nohz: Add basic tracing nohz: Select wide RCU nocb for full dynticks nohz: Disable the tick when irq resume in full dynticks CPU nohz: Re-evaluate the tick for the new task after a context switch nohz: Prepare to stop the tick on irq exit nohz: Implement full dynticks kick nohz: Re-evaluate the tick from the scheduler IPI sched: New helper to prevent from stopping the tick in full dynticks sched: Kick full dynticks CPU that have more than one task enqueued. perf: New helper to prevent full dynticks CPUs from stopping tick perf: Kick full dynticks CPU if events rotation is needed ...
| * | rcu: Fix full dynticks' dependency on wide RCU nocb modeFrederic Weisbecker2013-05-041-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 0637e029392386e6996f5d6574aadccee8315efa ("nohz: Select wide RCU nocb for full dynticks") intended to force CONFIG_RCU_NOCB_CPU_ALL=y when full dynticks is enabled. However this option is part of a choice menu and Kconfig's "select" instruction has no effect on such targets. Fix this by using reverse dependencies on the targets we don't want instead. Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Christoph Lameter <cl@linux.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kevin Hilman <khilman@linaro.org> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de>
| * | Merge commit '8700c95adb03' into timers/nohzFrederic Weisbecker2013-05-021-19/+58
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The full dynticks tree needs the latest RCU and sched upstream updates in order to fix some dependencies. Merge a common upstream merge point that has these updates. Conflicts: include/linux/perf_event.h kernel/rcutree.h kernel/rcutree_plugin.h Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
| * | | nohz: Select VIRT_CPU_ACCOUNTING_GEN from full dynticks configFrederic Weisbecker2013-04-261-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Turn the full dynticks passive dependency on VIRT_CPU_ACCOUNTING_GEN to an active one. The full dynticks Kconfig is currently hidden behind the full dynticks cputime accounting, which is an awkward and counter-intuitive layout: the user first has to select the dynticks cputime accounting in order to make the full dynticks feature to be visible. We definetly want it the other way around. The usual way to perform this kind of active dependency is use "select" on the depended target. Now we can't use the Kconfig "select" instruction when the target is a "choice". So this patch inspires on how the RCU subsystem Kconfig interact with its dependencies on SMP and PREEMPT: we make sure that cputime accounting can't propose another option than VIRT_CPU_ACCOUNTING_GEN when NO_HZ_FULL is selected by using the right "depends on" instruction for each cputime accounting choices. v2: Keep full dynticks cputime accounting available even without full dynticks, as per Paul McKenney's suggestion. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Christoph Lameter <cl@linux.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kevin Hilman <khilman@linaro.org> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de>
| * | | nohz: Rename CONFIG_NO_HZ to CONFIG_NO_HZ_COMMONFrederic Weisbecker2013-04-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We are planning to convert the dynticks Kconfig options layout into a choice menu. The user must be able to easily pick any of the following implementations: constant periodic tick, idle dynticks, full dynticks. As this implies a mutual exclusion, the two dynticks implementions need to converge on the selection of a common Kconfig option in order to ease the sharing of a common infrastructure. It would thus seem pretty natural to reuse CONFIG_NO_HZ to that end. It already implements all the idle dynticks code and the full dynticks depends on all that code for now. So ideally the choice menu would propose CONFIG_NO_HZ_IDLE and CONFIG_NO_HZ_EXTENDED then both would select CONFIG_NO_HZ. On the other hand we want to stay backward compatible: if CONFIG_NO_HZ is set in an older config file, we want to enable CONFIG_NO_HZ_IDLE by default. But we can't afford both at the same time or we run into a circular dependency: 1) CONFIG_NO_HZ_IDLE and CONFIG_NO_HZ_EXTENDED both select CONFIG_NO_HZ 2) If CONFIG_NO_HZ is set, we default to CONFIG_NO_HZ_IDLE We might be able to support that from Kconfig/Kbuild but it may not be wise to introduce such a confusing behaviour. So to solve this, create a new CONFIG_NO_HZ_COMMON option which gathers the common code between idle and full dynticks (that common code for now is simply the idle dynticks code) and select it from their referring Kconfig. Then we'll later create CONFIG_NO_HZ_IDLE and map CONFIG_NO_HZ to it for backward compatibility. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kevin Hilman <khilman@linaro.org> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de>
* | | | init/Kconfig: re-order CONFIG_EXPERT options to fix menuconfig displayMike Frysinger2013-04-301-40/+40
| |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The kconfig language requires that dependent options all follow the menuconfig symbol in order to be collapsed below it. Recently some hidden options were added below the EXPERT menuconfig, but did not depend on EXPERT (because hidden options can't). This broke the display. So re-order all these options, and while we're here stick the PCI quirks under the EXPERT menu (since it isn't sitting with any related options). Before this commit, we get: [*] Configure standard kernel features (expert users) ---> [ ] Sysctl syscall support [*] Load all symbols for debugging/ksymoops ... [ ] Embedded system Now we get the older (and correct) behavior: [*] Configure standard kernel features (expert users) ---> [ ] Embedded system And if you go into the expert menu you get the expert options: [ ] Sysctl syscall support [*] Load all symbols for debugging/ksymoops ... Signed-off-by: Mike Frysinger <vapier@gentoo.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Cc: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Cc: Michal Marek <mmarek@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge branch 'sched-core-for-linus' of ↵Linus Torvalds2013-04-301-0/+1
|\ \ \ | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler changes from Ingo Molnar: "The main changes in this development cycle were: - full dynticks preparatory work by Frederic Weisbecker - factor out the cpu time accounting code better, by Li Zefan - multi-CPU load balancer cleanups and improvements by Joonsoo Kim - various smaller fixes and cleanups" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits) sched: Fix init NOHZ_IDLE flag sched: Prevent to re-select dst-cpu in load_balance() sched: Rename load_balance_tmpmask to load_balance_mask sched: Move up affinity check to mitigate useless redoing overhead sched: Don't consider other cpus in our group in case of NEWLY_IDLE sched: Explicitly cpu_idle_type checking in rebalance_domains() sched: Change position of resched_cpu() in load_balance() sched: Fix wrong rq's runnable_avg update with rt tasks sched: Document task_struct::personality field sched/cpuacct/UML: Fix header file dependency bug on the UML build cgroup: Kill subsys.active flag sched/cpuacct: No need to check subsys active state sched/cpuacct: Initialize cpuacct subsystem earlier sched/cpuacct: Initialize root cpuacct earlier sched/cpuacct: Allocate per_cpu cpuusage for root cpuacct statically sched/cpuacct: Clean up cpuacct.h sched/cpuacct: Remove redundant NULL checks in cpuacct_acount_field() sched/cpuacct: Remove redundant NULL checks in cpuacct_charge() sched/cpuacct: Add cpuacct_acount_field() sched/cpuacct: Add cpuacct_init() ...
| * | context_tracking: Enable probes by default for selftestingFrederic Weisbecker2013-03-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Until we provide the nohz_mask boot parameter, keeping the context tracking probes disabled by default is pointless since what we want is to runtime test this code anyway. It's furthermore confusing for the users which don't expect the probes to be off when they select RCU user mode or full dynticks cputime accounting. Let's enable these probes selftests by default for now. Suggested: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Kevin Hilman <khilman@linaro.org> Cc: Mats Liljegren <mats.liljegren@enea.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
* | | Merge branch 'rcu/next' of ↵Ingo Molnar2013-04-101-15/+58
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: * Remove restrictions on no-CBs CPUs, make RCU_FAST_NO_HZ take advantage of numbered callbacks, do additional callback accelerations based on numbered callbacks. Posted to LKML at https://lkml.org/lkml/2013/3/18/960. * RCU documentation updates. Posted to LKML at https://lkml.org/lkml/2013/3/18/570. * Miscellaneous fixes. Posted to LKML at https://lkml.org/lkml/2013/3/18/594. Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | rcu: Make RCU_FAST_NO_HZ take advantage of numbered callbacksPaul E. McKenney2013-03-261-7/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Because RCU callbacks are now associated with the number of the grace period that they must wait for, CPUs can now take advance callbacks corresponding to grace periods that ended while a given CPU was in dyntick-idle mode. This eliminates the need to try forcing the RCU state machine while entering idle, thus reducing the CPU intensiveness of RCU_FAST_NO_HZ, which should increase its energy efficiency. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
| * | | rcu: Distinguish "rcuo" kthreads by RCU flavorPaul E. McKenney2013-03-261-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the per-no-CBs-CPU kthreads are named "rcuo" followed by the CPU number, for example, "rcuo". This is problematic given that there are either two or three RCU flavors, each of which gets a per-CPU kthread with exactly the same name. This commit therefore introduces a one-letter abbreviation for each RCU flavor, namely 'b' for RCU-bh, 'p' for RCU-preempt, and 's' for RCU-sched. This abbreviation is used to distinguish the "rcuo" kthreads, for example, for CPU 0 we would have "rcuob/0", "rcuop/0", and "rcuos/0". Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
| * | | rcu: Provide compile-time control for no-CBs CPUsPaul E. McKenney2013-03-261-0/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the only way to specify no-CBs CPUs is via the rcu_nocbs kernel command-line parameter. This is inconvenient in some cases, particularly for randconfig testing, so this commit adds a new set of kernel configuration parameters. CONFIG_RCU_NOCB_CPU_NONE (the default) retains the old behavior, CONFIG_RCU_NOCB_CPU_ZERO offloads callback processing from CPU 0 (along with any other CPUs specified by the rcu_nocbs boot-time parameter), and CONFIG_RCU_NOCB_CPU_ALL offloads callback processing from all CPUs. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
| * | | rcu: Remove restrictions on no-CBs CPUsPaul E. McKenney2013-03-121-2/+2
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, CPU 0 is constrained to not be a no-CBs CPU, and furthermore at least one no-CBs CPU must remain online at any given time. These restrictions are problematic in some situations, such as cases where all CPUs must run a real-time workload that needs to be insulated from OS jitter and latencies due to RCU callback invocation. This commit therefore provides no-CBs CPUs a (very crude and energy-inefficient) way to start and to wait for grace periods independently of the normal RCU callback mechanisms. This approach allows any or all of the CPUs to be designated as no-CBs CPUs, and allows any proper subset of the CPUs (whether no-CBs CPUs or not) to be offlined. This commit also provides a fix for a locking bug spotted by Xie ChanglongX <changlongx.xie@intel.com>. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
* | | final removal of CONFIG_EXPERIMENTALKees Cook2013-03-121-4/+0
|/ / | | | | | | | | | | | | | | Remove "config EXPERIMENTAL" itself, now that every "depends on" it has been removed from the tree. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | Merge tag 'arc-v3.9-rc1-late' of ↵Linus Torvalds2013-03-021-0/+8
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull new ARC architecture from Vineet Gupta: "Initial ARC Linux port with some fixes on top for 3.9-rc1: I would like to introduce the Linux port to ARC Processors (from Synopsys) for 3.9-rc1. The patch-set has been discussed on the public lists since Nov and has received a fair bit of review, specially from Arnd, tglx, Al and other subsystem maintainers for DeviceTree, kgdb... The arch bits are in arch/arc, some asm-generic changes (acked by Arnd), a minor change to PARISC (acked by Helge). The series is a touch bigger for a new port for 2 main reasons: 1. It enables a basic kernel in first sub-series and adds ptrace/kgdb/.. later 2. Some of the fallout of review (DeviceTree support, multi-platform- image support) were added on top of orig series, primarily to record the revision history. This updated pull request additionally contains - fixes due to our GNU tools catching up with the new syscall/ptrace ABI - some (minor) cross-arch Kconfig updates." * tag 'arc-v3.9-rc1-late' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (82 commits) ARC: split elf.h into uapi and export it for userspace ARC: Fixup the current ABI version ARC: gdbserver using regset interface possibly broken ARC: Kconfig cleanup tracking cross-arch Kconfig pruning in merge window ARC: make a copy of flat DT ARC: [plat-arcfpga] DT arc-uart bindings change: "baud" => "current-speed" ARC: Ensure CONFIG_VIRT_TO_BUS is not enabled ARC: Fix pt_orig_r8 access ARC: [3.9] Fallout of hlist iterator update ARC: 64bit RTSC timestamp hardware issue ARC: Don't fiddle with non-existent caches ARC: Add self to MAINTAINERS ARC: Provide a default serial.h for uart drivers needing BASE_BAUD ARC: [plat-arcfpga] defconfig for fully loaded ARC Linux ARC: [Review] Multi-platform image #8: platform registers SMP callbacks ARC: [Review] Multi-platform image #7: SMP common code to use callbacks ARC: [Review] Multi-platform image #6: cpu-to-dma-addr optional ARC: [Review] Multi-platform image #5: NR_IRQS defined by ARC core ARC: [Review] Multi-platform image #4: Isolate platform headers ARC: [Review] Multi-platform image #3: switch to board callback ...
| * | sysctl: Enable PARISC "unaligned-trap" to be used cross-archVineet Gupta2013-02-151-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PARISC defines /proc/sys/kernel/unaligned-trap to runtime toggle unaligned access emulation. The exact mechanics of enablig/disabling are still arch specific, we can make the sysctl usable by other arches. Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Acked-by: Helge Deller <deller@gmx.de> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Serge Hallyn <serge.hallyn@canonical.com>
* | | Merge branch 'for-linus' of ↵Linus Torvalds2013-02-251-13/+7
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace and namespace infrastructure changes from Eric W Biederman: "This set of changes starts with a few small enhnacements to the user namespace. reboot support, allowing more arbitrary mappings, and support for mounting devpts, ramfs, tmpfs, and mqueuefs as just the user namespace root. I do my best to document that if you care about limiting your unprivileged users that when you have the user namespace support enabled you will need to enable memory control groups. There is a minor bug fix to prevent overflowing the stack if someone creates way too many user namespaces. The bulk of the changes are a continuation of the kuid/kgid push down work through the filesystems. These changes make using uids and gids typesafe which ensures that these filesystems are safe to use when multiple user namespaces are in use. The filesystems converted for 3.9 are ceph, 9p, afs, ocfs2, gfs2, ncpfs, nfs, nfsd, and cifs. The changes for these filesystems were a little more involved so I split the changes into smaller hopefully obviously correct changes. XFS is the only filesystem that remains. I was hoping I could get that in this release so that user namespace support would be enabled with an allyesconfig or an allmodconfig but it looks like the xfs changes need another couple of days before it they are ready." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (93 commits) cifs: Enable building with user namespaces enabled. cifs: Convert struct cifs_ses to use a kuid_t and a kgid_t cifs: Convert struct cifs_sb_info to use kuids and kgids cifs: Modify struct smb_vol to use kuids and kgids cifs: Convert struct cifsFileInfo to use a kuid cifs: Convert struct cifs_fattr to use kuid and kgids cifs: Convert struct tcon_link to use a kuid. cifs: Modify struct cifs_unix_set_info_args to hold a kuid_t and a kgid_t cifs: Convert from a kuid before printing current_fsuid cifs: Use kuids and kgids SID to uid/gid mapping cifs: Pass GLOBAL_ROOT_UID and GLOBAL_ROOT_GID to keyring_alloc cifs: Use BUILD_BUG_ON to validate uids and gids are the same size cifs: Override unmappable incoming uids and gids nfsd: Enable building with user namespaces enabled. nfsd: Properly compare and initialize kuids and kgids nfsd: Store ex_anon_uid and ex_anon_gid as kuids and kgids nfsd: Modify nfsd4_cb_sec to use kuids and kgids nfsd: Handle kuids and kgids in the nfs4acl to posix_acl conversion nfsd: Convert nfsxdr to use kuids and kgids nfsd: Convert nfs3xdr to use kuids and kgids ...
| * | | cifs: Enable building with user namespaces enabled.Eric W. Biederman2013-02-131-1/+0
| | | | | | | | | | | | | | | | | | | | Cc: Steve French <smfrench@gmail.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * | | nfsd: Enable building with user namespaces enabled.Eric W. Biederman2013-02-131-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that the kuids and kgids conversion have propogated through net/sunrpc/ and the fs/nfsd/ it is safe to enable building nfsd when user namespaces are enabled. Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * | | nfs: Enable building with user namespaces enabled.Eric W. Biederman2013-02-131-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that the kuids and kgids conversion have propogated through net/sunrpc/ and the fs/nfs/ it is safe to enable building nfs when user namespaces are enabled. Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * | | ncpfs: Support interacting with multiple user namespacesEric W. Biederman2013-02-131-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ncpfs does not natively support uids and gids so this conversion was simply a matter of updating the the type of the mounteduid, the uid and the gid on the superblock. Fixing the ioctls that read them, updating the mount option parser and the mount option printer. Cc: Petr Vandrovec <petr@vandrovec.name> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
| * | | gfs2: Enable building with user namespaces enabledEric W. Biederman2013-02-131-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that all of the necessary work has been done to push kuids and kgids throughout gfs2 and to convert between kuids and kgids when reading and writing the on disk structures it is safe to enable gfs2 when multiple user namespaces are enabled. Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * | | ocfs2: Enable building with user namespaces enabledEric W. Biederman2013-02-131-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that ocfs2 has been converted to store uids and gids in kuid_t and kgid_t and all of the conversions have been added to the appropriate places it is safe to allow building and using ocfs2 with user namespace support enabled. Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * | | coda: Allow coda to be built when user namespace support is enabledEric W. Biederman2013-02-131-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that the coda kernel to userspace has been modified to convert between kuids and kgids and uids and gids, and all internal coda structures have be modified to store uids and gids as kuids and kgids it is safe to allow code to be built with user namespace support enabled. Cc: Jan Harkes <jaharkes@cs.cmu.edu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * | | afs: Support interacting with multiple user namespacesEric W. Biederman2013-02-131-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Modify struct afs_file_status to store owner as a kuid_t and group as a kgid_t. In xdr_decode_AFSFetchStatus as owner is now a kuid_t and group is now a kgid_t don't use the EXTRACT macro. Instead perform the work of the extract macro explicitly. Read the value with ntohl and convert it to the appropriate type with make_kuid or make_kgid. Test if the value is different from what is stored in status and update changed. Update the value in status. In xdr_encode_AFS_StoreStatus call from_kuid or from_kgid as we are computing the on the wire encoding. Initialize uids with GLOBAL_ROOT_UID instead of 0. Initialize gids with GLOBAL_ROOT_GID instead of 0. Cc: David Howells <dhowells@redhat.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
| * | | 9p: Allow building 9p with user namespaces enabled.Eric W. Biederman2013-02-121-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that the uid_t -> kuid_t, gid_t -> kgid_t conversion has been completed in 9p allow 9p to be built when user namespaces are enabled. Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Ron Minnich <rminnich@gmail.com> Cc: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * | | ceph: Enable building when user namespaces are enabled.Eric W. Biederman2013-02-121-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that conversions happen from kuids and kgids when generating ceph messages and conversion happen to kuids and kgids after receiving celph messages, and all intermediate data structures store uids and gids as type kuid_t and kgid_t it is safe to enable ceph with user namespace support enabled. Cc: Sage Weil <sage@inktank.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
OpenPOWER on IntegriCloud