summaryrefslogtreecommitdiffstats
path: root/lib
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'akpm' (patches from Andrew)Linus Torvalds2015-04-142-0/+65
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge first patchbomb from Andrew Morton: - arch/sh updates - ocfs2 updates - kernel/watchdog feature - about half of mm/ * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (122 commits) Documentation: update arch list in the 'memtest' entry Kconfig: memtest: update number of test patterns up to 17 arm: add support for memtest arm64: add support for memtest memtest: use phys_addr_t for physical addresses mm: move memtest under mm mm, hugetlb: abort __get_user_pages if current has been oom killed mm, mempool: do not allow atomic resizing memcg: print cgroup information when system panics due to panic_on_oom mm: numa: remove migrate_ratelimited mm: fold arch_randomize_brk into ARCH_HAS_ELF_RANDOMIZE mm: split ET_DYN ASLR from mmap ASLR s390: redefine randomize_et_dyn for ELF_ET_DYN_BASE mm: expose arch_mmap_rnd when available s390: standardize mmap_rnd() usage powerpc: standardize mmap_rnd() usage mips: extract logic for mmap_rnd() arm64: standardize mmap_rnd() usage x86: standardize mmap_rnd() usage arm: factor out mmap ASLR into mmap_rnd ...
| * Kconfig: memtest: update number of test patterns up to 17Vladimir Murzin2015-04-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Additional test patterns for memtest were introduced since commit 63823126c221 ("x86: memtest: add additional (regular) test patterns"), but looks like Kconfig was not updated that time. Update Kconfig entry with the actual number of maximum test patterns. Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * mm: move memtest under mmVladimir Murzin2015-04-141-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Memtest is a simple feature which fills the memory with a given set of patterns and validates memory contents, if bad memory regions is detected it reserves them via memblock API. Since memblock API is widely used by other architectures this feature can be enabled outside of x86 world. This patch set promotes memtest to live under generic mm umbrella and enables memtest feature for arm/arm64. It was reported that this patch set was useful for tracking down an issue with some errant DMA on an arm64 platform. This patch (of 6): There is nothing platform dependent in the core memtest code, so other platforms might benefit from this feature too. [linux@roeck-us.net: MEMTEST depends on MEMBLOCK] Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * x86, mm: support huge KVA mappings on x86Toshi Kani2015-04-141-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement huge KVA mapping interfaces on x86. On x86, MTRRs can override PAT memory types with a 4KB granularity. When using a huge page, MTRRs can override the memory type of the huge page, which may lead a performance penalty. The processor can also behave in an undefined manner if a huge page is mapped to a memory range that MTRRs have mapped with multiple different memory types. Therefore, the mapping code falls back to use a smaller page size toward 4KB when a mapping range is covered by non-WB type of MTRRs. The WB type of MTRRs has no affect on the PAT memory types. pud_set_huge() and pmd_set_huge() call mtrr_type_lookup() to see if a given range is covered by MTRRs. MTRR_TYPE_WRBACK indicates that the range is either covered by WB or not covered and the MTRR default value is set to WB. 0xFF indicates that MTRRs are disabled. HAVE_ARCH_HUGE_VMAP is selected when X86_64 or X86_32 with X86_PAE is set. X86_32 without X86_PAE is not supported since such config can unlikey be benefited from this feature, and there was an issue found in testing. [fengguang.wu@intel.com: ioremap_pud_capable can be static] Signed-off-by: Toshi Kani <toshi.kani@hp.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Robert Elliott <Elliott@hp.com> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * mm: change ioremap to set up huge I/O mappingsToshi Kani2015-04-141-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ioremap_pud_range() and ioremap_pmd_range() are changed to create huge I/O mappings when their capability is enabled, and a request meets required conditions -- both virtual & physical addresses are aligned by their huge page size, and a requested range fufills their huge page size. When pud_set_huge() or pmd_set_huge() returns zero, i.e. no-operation is performed, the code simply falls back to the next level. The changes are only enabled when CONFIG_HAVE_ARCH_HUGE_VMAP is defined on the architecture. Signed-off-by: Toshi Kani <toshi.kani@hp.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Robert Elliott <Elliott@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * lib/ioremap.c: add huge I/O map capability interfacesToshi Kani2015-04-141-0/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add ioremap_pud_enabled() and ioremap_pmd_enabled(), which return 1 when I/O mappings with pud/pmd are enabled on the kernel. ioremap_huge_init() calls arch_ioremap_pud_supported() and arch_ioremap_pmd_supported() to initialize the capabilities at boot-time. A new kernel option "nohugeiomap" is also added, so that user can disable the huge I/O map capabilities when necessary. Signed-off-by: Toshi Kani <toshi.kani@hp.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Robert Elliott <Elliott@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge branch 'for-linus-1' of ↵Linus Torvalds2015-04-141-0/+83
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs update from Al Viro: "Part one: - struct filename-related cleanups - saner iov_iter_init() replacements (and switching the syscalls to use of those) - ntfs switch to ->write_iter() (Anton) - aio cleanups and splitting iocb into common and async parts (Christoph) - assorted fixes (me, bfields, Andrew Elble) There's a lot more, including the completion of switchover to ->{read,write}_iter(), d_inode/d_backing_inode annotations, f_flags race fixes, etc, but that goes after #for-davem merge. David has pulled it, and once it's in I'll send the next vfs pull request" * 'for-linus-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (35 commits) sg_start_req(): use import_iovec() sg_start_req(): make sure that there's not too many elements in iovec blk_rq_map_user(): use import_single_range() sg_io(): use import_iovec() process_vm_access: switch to {compat_,}import_iovec() switch keyctl_instantiate_key_common() to iov_iter switch {compat_,}do_readv_writev() to {compat_,}import_iovec() aio_setup_vectored_rw(): switch to {compat_,}import_iovec() vmsplice_to_user(): switch to import_iovec() kill aio_setup_single_vector() aio: simplify arguments of aio_setup_..._rw() aio: lift iov_iter_init() into aio_setup_..._rw() lift iov_iter into {compat_,}do_readv_writev() NFS: fix BUG() crash in notify_change() with patch to chown_common() dcache: return -ESTALE not -EBUSY on distributed fs race NTFS: Version 2.1.32 - Update file write from aio_write to write_iter. VFS: Add iov_iter_fault_in_multipages_readable() drop bogus check in file_open_root() switch security_inode_getattr() to struct path * constify tomoyo_realpath_from_path() ...
| * \ Merge branch 'iov_iter' into for-nextAl Viro2015-04-111-0/+57
| |\ \
| | * | saner iov_iter initialization primitivesAl Viro2015-03-301-0/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | iovec-backed iov_iter instances are assumed to satisfy several properties: * no more than UIO_MAXIOV elements in iovec array * total size of all ranges is no more than MAX_RW_COUNT * all ranges pass access_ok(). The problem is, invariants of data structures should be established in the primitives creating those data structures, not in the code using those primitives. And iov_iter_init() violates that principle. For a while we managed to get away with that, but once the use of iov_iter started to spread, it didn't take long for shit to hit the fan - missed check in sys_sendto() had introduced a roothole. We _do_ have primitives for importing and validating iovecs (both native and compat ones) and those primitives are almost always followed by shoving the resulting iovec into iov_iter. Life would be considerably simpler (and safer) if we combined those primitives with initializing iov_iter. That gives us two new primitives - import_iovec() and compat_import_iovec(). Calling conventions: iovec = iov_array; err = import_iovec(direction, uvec, nr_segs, ARRAY_SIZE(iov_array), &iovec, &iter); imports user vector into kernel space (into iov_array if it fits, allocated if it doesn't fit or if iovec was NULL), validates it and sets iter up to refer to it. On success 0 is returned and allocated kernel copy (or NULL if the array had fit into caller-supplied one) is returned via iovec. On failure all allocations are undone and -E... is returned. If the total size of ranges exceeds MAX_RW_COUNT, the excess is silently truncated. compat_import_iovec() expects uvec to be a pointer to user array of compat_iovec; otherwise it's identical to import_iovec(). Finally, import_single_range() sets iov_iter backed by single-element iovec covering a user-supplied range - err = import_single_range(direction, address, size, iovec, &iter); does validation and sets iter up. Again, size in excess of MAX_RW_COUNT gets silently truncated. Next commits will be switching the things up to use of those and reducing the amount of iov_iter_init() instances. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | VFS: Add iov_iter_fault_in_multipages_readable()Anton Altaparmakov2015-04-111-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | simillar to iov_iter_fault_in_readable() but differs in that it is not limited to faulting in the first iovec and instead faults in "bytes" bytes iterating over the iovecs as necessary. Also, instead of only faulting in the first and last page of the range, all pages are faulted in. This function is needed by NTFS when it does multi page file writes. Signed-off-by: Anton Altaparmakov <anton@tuxera.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | | | Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds2015-04-141-10/+25
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU changes from Ingo Molnar: "The main changes in this cycle were: - changes permitting use of call_rcu() and friends very early in boot, for example, before rcu_init() is invoked. - add in-kernel API to enable and disable expediting of normal RCU grace periods. - improve RCU's handling of (hotplug-) outgoing CPUs. - NO_HZ_FULL_SYSIDLE fixes. - tiny-RCU updates to make it more tiny. - documentation updates. - miscellaneous fixes" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits) cpu: Provide smpboot_thread_init() on !CONFIG_SMP kernels as well cpu: Defer smpboot kthread unparking until CPU known to scheduler rcu: Associate quiescent-state reports with grace period rcu: Yet another fix for preemption and CPU hotplug rcu: Add diagnostics to grace-period cleanup rcutorture: Default to grace-period-initialization delays rcu: Handle outgoing CPUs on exit from idle loop cpu: Make CPU-offline idle-loop transition point more precise rcu: Eliminate ->onoff_mutex from rcu_node structure rcu: Process offlining and onlining only at grace-period start rcu: Move rcu_report_unblock_qs_rnp() to common code rcu: Rework preemptible expedited bitmask handling rcu: Remove event tracing from rcu_cpu_notify(), used by offline CPUs rcutorture: Enable slow grace-period initializations rcu: Provide diagnostic option to slow down grace-period initialization rcu: Detect stalls caused by failure to propagate up rcu_node tree rcu: Eliminate empty HOTPLUG_CPU ifdef rcu: Simplify sync_rcu_preempt_exp_init() rcu: Put all orphan-callback-related code under same comment rcu: Consolidate offline-CPU callback initialization ...
| * \ \ \ Merge branch 'for-mingo' of ↵Ingo Molnar2015-03-271-10/+25
| |\ \ \ \ | | |/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: - Documentation updates. - Changes permitting use of call_rcu() and friends very early in boot, for example, before rcu_init() is invoked. - Miscellaneous fixes. - Add in-kernel API to enable and disable expediting of normal RCU grace periods. - Improve RCU's handling of (hotplug-) outgoing CPUs. Note: ARM support is lagging a bit here, and these improved diagnostics might generate (harmless) splats. - NO_HZ_FULL_SYSIDLE fixes. - Tiny RCU updates to make it more tiny. Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | | | |
| | | \ \
| | *-. \ \ Merge branches 'doc.2015.02.26a', 'earlycb.2015.03.03a', ↵Paul E. McKenney2015-03-201-10/+25
| | |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'fixes.2015.03.03a', 'gpexp.2015.02.26a', 'hotplug.2015.03.20a', 'sysidle.2015.02.26b' and 'tiny.2015.02.26a' into HEAD doc.2015.02.26a: Documentation changes earlycb.2015.03.03a: Permit early-boot RCU callbacks fixes.2015.03.03a: Miscellaneous fixes gpexp.2015.02.26a: In-kernel expediting of normal grace periods hotplug.2015.03.20a: CPU hotplug fixes sysidle.2015.02.26b: NO_HZ_FULL_SYSIDLE fixes tiny.2015.02.26a: TINY_RCU fixes
| | | | * | | rcutorture: Default to grace-period-initialization delaysPaul E. McKenney2015-03-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Given that CPU-hotplug events are now applied only at the starts of grace periods, it makes sense to unconditionally enable slow grace-period initialization for rcutorture testing. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
| | | | * | | rcu: Provide diagnostic option to slow down grace-period initializationPaul E. McKenney2015-03-111-0/+24
| | | |/ / / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Grace-period initialization normally proceeds quite quickly, so that it is very difficult to reproduce races against grace-period initialization. This commit therefore allows grace-period initialization to be artificially slowed down, increasing race-reproduction probability. A pair of new Kconfig parameters are provided, CONFIG_RCU_TORTURE_TEST_SLOW_INIT to enable the slowdowns, and CONFIG_RCU_TORTURE_TEST_SLOW_INIT_DELAY to specify the number of jiffies of slowdown to apply. A boot-time parameter named rcutree.gp_init_delay allows boot-time delay to be specified. By default, no delay will be applied even if CONFIG_RCU_TORTURE_TEST_SLOW_INIT is set. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
| | | * | | rcu: Drive PROVE_RCU directly off of PROVE_LOCKINGPaul E. McKenney2015-02-261-10/+1
| | |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the past, it has been useful to enable PROVE_LOCKING without also enabling PROVE_RCU. However, experience with PROVE_RCU over the past few years has demonstrated its usefulness, so this commit makes PROVE_LOCKING directly imply PROVE_RCU. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
* | | | | Merge branch 'for-linus' of ↵Linus Torvalds2015-04-141-1/+1
|\ \ \ \ \ | |_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree from Jiri Kosina: "Usual trivial tree updates. Nothing outstanding -- mostly printk() and comment fixes and unused identifier removals" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: goldfish: goldfish_tty_probe() is not using 'i' any more powerpc: Fix comment in smu.h qla2xxx: Fix printks in ql_log message lib: correct link to the original source for div64_u64 si2168, tda10071, m88ds3103: Fix firmware wording usb: storage: Fix printk in isd200_log_config() qla2xxx: Fix printk in qla25xx_setup_mode init/main: fix reset_device comment ipwireless: missing assignment goldfish: remove unreachable line of code coredump: Fix do_coredump() comment stacktrace.h: remove duplicate declaration task_struct smpboot.h: Remove unused function prototype treewide: Fix typo in printk messages treewide: Fix typo in printk messages mod_devicetable: fix comment for match_flags
| * | | | lib: correct link to the original source for div64_u64Heinrich Schuchardt2015-03-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code refers to an invalid url http://www.hackersdelight.org/HDcode/newCode/divDouble.c.txt The correct url is http://www.hackersdelight.org/hdcodetxt/divDouble.c.txt Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* | | | | Merge tag 'driver-core-4.1-rc1' of ↵Linus Torvalds2015-04-132-12/+13
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here's the driver-core / kobject / lz4 tree update for 4.1-rc1. Everything here has been in linux-next for a while with no reported issues. It's mostly just coding style cleanups, with other minor changes in here as well, nothing big" * tag 'driver-core-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (32 commits) debugfs: allow bad parent pointers to be passed in stable_kernel_rules: Add clause about specification of kernel versions to patch. kobject: WARN as tip when call kobject_get() to a kobject not initialized lib/lz4: Pull out constant tables drivers: platform: parse IRQ flags from resources driver core: Make probe deferral more quiet drivers/core/of: Add symlink to device-tree from devices with an OF node device: Add dev_of_node() accessor drivers: base: fw: fix ret value when loading fw firmware: Avoid manual device_create_file() calls drivers/base: cacheinfo: validate device node for all the caches drivers/base: use tabs where possible in code indentation driver core: add missing blank line after declaration drivers: base: node: Delete space after pointer declaration drivers: base: memory: Use tabs instead of spaces firmware_class: Fix whitespace and indentation drivers: base: dma-mapping: Erase blank space after pointer drivers: base: class: Add a blank line after declarations attribute_container: fix missing blank lines after declarations drivers: base: memory: Fix switch indent ...
| * | | | | kobject: WARN as tip when call kobject_get() to a kobject not initializedEthan Zhao2015-03-251-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | call kobject_get() to kojbect that is not initalized or released will only leave following like call trace to us: -----------[ cut here ]------------ [ 54.545816] WARNING: CPU: 0 PID: 213 at include/linux/kref.h:47 kobject_get+0x41/0x50() [ 54.642595] Modules linked in: i2c_i801(+) mfd_core shpchp(+) acpi_cpufreq(+) edac_core ioatdma(+) xfs libcrc32c ast syscopyarea ixgbe sysfillrect sysimgblt sr_mod sd_mod drm_kms_helper igb mdio cdrom e1000e ahci dca ttm libahci uas drm i2c_algo_bit ptp megaraid_sas libata usb_storage i2c_core pps_core dm_mirror dm_region_hash dm_log dm_mod [ 55.007264] CPU: 0 PID: 213 Comm: kworker/0:2 Not tainted 3.18.5 [ 55.099970] Hardware name: Oracle Corporation SUN FIRE X4170 M2 SERVER /ASSY,MOTHERBOARD,X4170, BIOS 08120104 05/08/2012 [ 55.239736] Workqueue: kacpi_notify acpi_os_execute_deferred [ 55.308598] 0000000000000000 00000000bd730b61 ffff88046742baf8 ffffffff816b7edb [ 55.398305] 0000000000000000 0000000000000000 ffff88046742bb38 ffffffff81078ae1 [ 55.488040] ffff88046742bbd8 ffff8806706b3000 0000000000000292 0000000000000000 [ 55.577776] Call Trace: [ 55.608228] [<ffffffff816b7edb>] dump_stack+0x46/0x58 [ 55.670895] [<ffffffff81078ae1>] warn_slowpath_common+0x81/0xa0 [ 55.743952] [<ffffffff81078bfa>] warn_slowpath_null+0x1a/0x20 [ 55.814929] [<ffffffff8130d0d1>] kobject_get+0x41/0x50 [ 55.878654] [<ffffffff8153e955>] cpufreq_cpu_get+0x75/0xc0 [ 55.946528] [<ffffffff8153f37e>] cpufreq_update_policy+0x2e/0x1f0 The above issue was casued by a race condition, if there is a WARN in kobject_get() of the kobject is not initialized, that would save us much time to debug it. Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | | | lib/lz4: Pull out constant tablesRasmus Villemoes2015-03-251-11/+7
| | |/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's no reason to allocate the dec{32,64}table on the stack; it just wastes a bunch of instructions setting them up and, of course, also consumes quite a bit of stack. Using size_t for such small integers is a little excessive. $ scripts/bloat-o-meter /tmp/built-in.o lib/built-in.o add/remove: 2/2 grow/shrink: 2/0 up/down: 1304/-1548 (-244) function old new delta lz4_decompress_unknownoutputsize 55 718 +663 lz4_decompress 55 632 +577 dec64table - 32 +32 dec32table - 32 +32 lz4_uncompress 747 - -747 lz4_uncompress_unknownoutputsize 801 - -801 The now inlined lz4_uncompress functions used to have a stack footprint of 176 bytes (according to -fstack-usage); their inlinees have increased their stack use from 32 bytes to 48 and 80 bytes, respectively. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | | | | Merge branch 'timers-core-for-linus' of ↵Linus Torvalds2015-04-131-0/+13
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Ingo Molnar: "The main changes in this cycle were: - clockevents state machine cleanups and enhancements (Viresh Kumar) - clockevents broadcast notifier horror to state machine conversion and related cleanups (Thomas Gleixner, Rafael J Wysocki) - clocksource and timekeeping core updates (John Stultz) - clocksource driver updates and fixes (Ben Dooks, Dmitry Osipenko, Hans de Goede, Laurent Pinchart, Maxime Ripard, Xunlei Pang) - y2038 fixes (Xunlei Pang, John Stultz) - NMI-safe ktime_get_raw_fast() and general refactoring of the clock code, in preparation to perf's per event clock ID support (Peter Zijlstra) - generic sched/clock fixes, optimizations and cleanups (Daniel Thompson) - clockevents cpu_down() race fix (Preeti U Murthy)" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (94 commits) timers/PM: Drop unnecessary braces from tick_freeze() timers/PM: Fix up tick_unfreeze() timekeeping: Get rid of stale comment clockevents: Cleanup dead cpu explicitely clockevents: Make tick handover explicit clockevents: Remove broadcast oneshot control leftovers sched/idle: Use explicit broadcast oneshot control function ARM: Tegra: Use explicit broadcast oneshot control function ARM: OMAP: Use explicit broadcast oneshot control function intel_idle: Use explicit broadcast oneshot control function ACPI/idle: Use explicit broadcast control function ACPI/PAD: Use explicit broadcast oneshot control function x86/amd/idle, clockevents: Use explicit broadcast oneshot control functions clockevents: Provide explicit broadcast oneshot control functions clockevents: Remove the broadcast control leftovers ARM: OMAP: Use explicit broadcast control function intel_idle: Use explicit broadcast control function cpuidle: Use explicit broadcast control function ACPI/processor: Use explicit broadcast control function ACPI/PAD: Use explicit broadcast control function ...
| * \ \ \ \ Merge tag 'v4.0-rc6' into timers/core, before applying new patchesIngo Molnar2015-03-316-40/+810
| |\ \ \ \ \ | | |/ / / / | | | | | | | | | | | | Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | timekeeping: Add debugging checks to warn if we see delaysJohn Stultz2015-03-131-0/+13
| | |_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recently there's been requests for better sanity checking in the time code, so that it's more clear when something is going wrong, since timekeeping issues could manifest in a large number of strange ways in various subsystems. Thus, this patch adds some extra infrastructure to add a check to update_wall_time() to print two new warnings: 1) if we see the call delayed beyond the 'max_cycles' overflow point, 2) or if we see the call delayed beyond the clocksource's 'max_idle_ns' value, which is currently 50% of the overflow point. This extra infrastructure is conditional on a new CONFIG_DEBUG_TIMEKEEPING option, also added in this patch - default off. Tested this a bit by halting qemu for specified lengths of time to trigger the warnings. Signed-off-by: John Stultz <john.stultz@linaro.org> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Stephen Boyd <sboyd@codeaurora.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1426133800-29329-5-git-send-email-john.stultz@linaro.org [ Improved the changelog and the messages a bit. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | | | Merge branch 'locking-core-for-linus' of ↵Linus Torvalds2015-04-131-1/+1
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core locking changes from Ingo Molnar: "Main changes: - jump label asm preparatory work for PowerPC (Anton Blanchard) - rwsem optimizations and cleanups (Davidlohr Bueso) - mutex optimizations and cleanups (Jason Low) - futex fix (Oleg Nesterov) - remove broken atomicity checks from {READ,WRITE}_ONCE() (Peter Zijlstra)" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: powerpc, jump_label: Include linux/jump_label.h to get HAVE_JUMP_LABEL define jump_label: Allow jump labels to be used in assembly jump_label: Allow asm/jump_label.h to be included in assembly locking/mutex: Further simplify mutex_spin_on_owner() locking: Remove atomicy checks from {READ,WRITE}_ONCE locking/rtmutex: Rename argument in the rt_mutex_adjust_prio_chain() documentation as well locking/rwsem: Fix lock optimistic spinning when owner is not running locking: Remove ACCESS_ONCE() usage locking/rwsem: Check for active lock before bailing on spinning locking/rwsem: Avoid deceiving lock spinners locking/rwsem: Set lock ownership ASAP locking/rwsem: Document barrier need when waking tasks locking/futex: Check PF_KTHREAD rather than !p->mm to filter out kthreads locking/mutex: Refactor mutex_spin_on_owner() locking/mutex: In mutex_spin_on_owner(), return true when owner changes
| * | | | | locking: Remove ACCESS_ONCE() usageDavidlohr Bueso2015-02-241-1/+1
| |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the new standardized functions, we can replace all ACCESS_ONCE() calls across relevant locking - this includes lockref and seqlock while at it. ACCESS_ONCE() does not work reliably on non-scalar types. For example gcc 4.6 and 4.7 might remove the volatile tag for such accesses during the SRA (scalar replacement of aggregates) step: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145 Update the new calls regardless of if it is a scalar type, this is cleaner than having three alternatives. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1424662301.6539.18.camel@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | | | Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds2015-04-031-0/+11
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block layer fix from Jens Axboe: "Just one patch in this pull request, fixing a regression caused by a 'mathematically correct' change to lcm()" * 'for-linus' of git://git.kernel.dk/linux-block: block: fix blk_stack_limits() regression due to lcm() change
| * | | | | block: fix blk_stack_limits() regression due to lcm() changeMike Snitzer2015-03-311-0/+11
| | |/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Linux 3.19 commit 69c953c ("lib/lcm.c: lcm(n,0)=lcm(0,n) is 0, not n") caused blk_stack_limits() to not properly stack queue_limits for stacked devices (e.g. DM). Fix this regression by establishing lcm_not_zero() and switching blk_stack_limits() over to using it. DM uses blk_set_stacking_limits() to establish the initial top-level queue_limits that are then built up based on underlying devices' limits using blk_stack_limits(). In the case of optimal_io_size (io_opt) blk_set_stacking_limits() establishes a default value of 0. With commit 69c953c, lcm(0, n) is no longer n, which compromises proper stacking of the underlying devices' io_opt. Test: $ modprobe scsi_debug dev_size_mb=10 num_tgts=1 opt_blks=1536 $ cat /sys/block/sde/queue/optimal_io_size 786432 $ dmsetup create node --table "0 100 linear /dev/sde 0" Before this fix: $ cat /sys/block/dm-5/queue/optimal_io_size 0 After this fix: $ cat /sys/block/dm-5/queue/optimal_io_size 786432 Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org # 3.19+ Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* | | | | netlink: pad nla_memcpy dest buffer with zeroesJiri Benc2015-03-311-0/+2
|/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is especially important in cases where the kernel allocs a new structure and expects a field to be set from a netlink attribute. If such attribute is shorter than expected, the rest of the field is left containing previous data. When such field is read back by the user space, kernel memory content is leaked. Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | LZ4 : fix the data abort issueJeHyeon Yeon2015-03-161-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the part of the compression data are corrupted, or the compression data is totally fake, the memory access over the limit is possible. This is the log from my system usning lz4 decompression. [6502]data abort, halting [6503]r0 0x00000000 r1 0x00000000 r2 0xdcea0ffc r3 0xdcea0ffc [6509]r4 0xb9ab0bfd r5 0xdcea0ffc r6 0xdcea0ff8 r7 0xdce80000 [6515]r8 0x00000000 r9 0x00000000 r10 0x00000000 r11 0xb9a98000 [6522]r12 0xdcea1000 usp 0x00000000 ulr 0x00000000 pc 0x820149bc [6528]spsr 0x400001f3 and the memory addresses of some variables at the moment are ref:0xdcea0ffc, op:0xdcea0ffc, oend:0xdcea1000 As you can see, COPYLENGH is 8bytes, so @ref and @op can access the momory over @oend. Signed-off-by: JeHyeon Yeon <tom.yeon@windriver.com> Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | | | Merge branch 'gadget' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds2015-03-132-1/+769
|\ \ \ \ | | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull gadgetfs fixes from Al Viro: "Assorted fixes around AIO on gadgetfs: leaks, use-after-free, troubles caused by ->f_op flipping" * 'gadget' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: gadgetfs: really get rid of switching ->f_op gadgetfs: get rid of flipping ->f_op in ep_config() gadget: switch ep_io_operations to ->read_iter/->write_iter gadgetfs: use-after-free in ->aio_read() gadget/function/f_fs.c: switch to ->{read,write}_iter() gadget/function/f_fs.c: use put iov_iter into io_data gadget/function/f_fs.c: close leaks move iov_iter.c from mm/ to lib/ new helper: dup_iter()
| * | | move iov_iter.c from mm/ to lib/Al Viro2015-02-172-1/+769
| | | | | | | | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | | | Merge tag 'trace-fixes-v4.0-rc2-2' of ↵Linus Torvalds2015-03-091-2/+2
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull seq-buf/ftrace fixes from Steven Rostedt: "This includes fixes for seq_buf_bprintf() truncation issue. It also contains fixes to ftrace when /proc/sys/kernel/ftrace_enabled and function tracing are started. Doing the following causes some issues: # echo 0 > /proc/sys/kernel/ftrace_enabled # echo function_graph > /sys/kernel/debug/tracing/current_tracer # echo 1 > /proc/sys/kernel/ftrace_enabled # echo nop > /sys/kernel/debug/tracing/current_tracer # echo function_graph > /sys/kernel/debug/tracing/current_tracer As well as with function tracing too. Pratyush Anand first reported this issue to me and supplied a patch. When I tested this on my x86 test box, it caused thousands of backtraces and warnings to appear in dmesg, which also caused a denial of service (a warning for every function that was listed). I applied Pratyush's patch but it did not fix the issue for me. I looked into it and found a slight problem with trampoline accounting. I fixed it and sent Pratyush a patch, but he said that it did not fix the issue for him. I later learned tha Pratyush was using an ARM64 server, and when I tested on my ARM board, I was able to reproduce the same issue as Pratyush. After applying his patch, it fixed the problem. The above test uncovered two different bugs, one in x86 and one in ARM and ARM64. As this looked like it would affect PowerPC, I tested it on my PPC64 box. It too broke, but neither the patch that fixed ARM or x86 fixed this box (the changes were all in generic code!). The above test, uncovered two more bugs that affected PowerPC. Again, the changes were only done to generic code. It's the way the arch code expected things to be done that was different between the archs. Some where more sensitive than others. The rest of this series fixes the PPC bugs as well" * tag 'trace-fixes-v4.0-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ftrace: Fix ftrace enable ordering of sysctl ftrace_enabled ftrace: Fix en(dis)able graph caller when en(dis)abling record via sysctl ftrace: Clear REGS_EN and TRAMP_EN flags on disabling record via sysctl seq_buf: Fix seq_buf_bprintf() truncation seq_buf: Fix seq_buf_vprintf() truncation
| * | | | seq_buf: Fix seq_buf_bprintf() truncationSteven Rostedt (Red Hat)2015-03-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In seq_buf_bprintf(), bstr_printf() is used to copy the format into the buffer remaining in the seq_buf structure. The return of bstr_printf() is the amount of characters written to the buffer excluding the '\0', unless the line was truncated! If the line copied does not fit, it is truncated, and a '\0' is added to the end of the buffer. But in this case, '\0' is included in the length of the line written. To know if the buffer had overflowed, the return length will be the same or greater than the length of the buffer passed in. The check in seq_buf_bprintf() only checked if the length returned from bstr_printf() would fit in the buffer, as the seq_buf_bprintf() is only to be an all or nothing command. It either writes all the string into the seq_buf, or none of it. If the string is truncated, the pointers inside the seq_buf must be reset to what they were when the function was called. This is not the case. On overflow, it copies only part of the string. The fix is to change the overflow check to see if the length returned from bstr_printf() is less than the length remaining in the seq_buf buffer, and not if it is less than or equal to as it currently does. Then seq_buf_bprintf() will know if the write from bstr_printf() was truncated or not. Link: http://lkml.kernel.org/r/1425500481.2712.27.camel@perches.com Cc: stable@vger.kernel.org Reported-by: Joe Perches <joe@perches.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | seq_buf: Fix seq_buf_vprintf() truncationSteven Rostedt (Red Hat)2015-03-041-1/+1
| | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In seq_buf_vprintf(), vsnprintf() is used to copy the format into the buffer remaining in the seq_buf structure. The return of vsnprintf() is the amount of characters written to the buffer excluding the '\0', unless the line was truncated! If the line copied does not fit, it is truncated, and a '\0' is added to the end of the buffer. But in this case, '\0' is included in the length of the line written. To know if the buffer had overflowed, the return length will be the same as the length of the buffer passed in. The check in seq_buf_vprintf() only checked if the length returned from vsnprintf() would fit in the buffer, as the seq_buf_vprintf() is only to be an all or nothing command. It either writes all the string into the seq_buf, or none of it. If the string is truncated, the pointers inside the seq_buf must be reset to what they were when the function was called. This is not the case. On overflow, it copies only part of the string. The fix is to change the overflow check to see if the length returned from vsnprintf() is less than the length remaining in the seq_buf buffer, and not if it is less than or equal to as it currently does. Then seq_buf_vprintf() will know if the write from vsnpritnf() was truncated or not. Cc: stable@vger.kernel.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2015-03-032-37/+36
|\ \ \ \ | |/ / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: 1) If an IPVS tunnel is created with a mixed-family destination address, it cannot be removed. Fix from Alexey Andriyanov. 2) Fix module refcount underflow in netfilter's nft_compat, from Pablo Neira Ayuso. 3) Generic statistics infrastructure can reference variables sitting on a released function stack, therefore use dynamic allocation always. Fix from Ignacy Gawędzki. 4) skb_copy_bits() return value test is inverted in ip_check_defrag(). 5) Fix network namespace exit in openvswitch, we have to release all of the per-net vports. From Pravin B Shelar. 6) Fix signedness bug in CAIF's cfpkt_iterate(), from Dan Carpenter. 7) Fix rhashtable grow/shrink behavior, only expand during inserts and shrink during deletes. From Daniel Borkmann. 8) Netdevice names with semicolons should never be allowed, because they serve as a separator. From Matthew Thode. 9) Use {,__}set_current_state() where appropriate, from Fabian Frederick. 10) Revert byte queue limits support in r8169 driver, it's causing regressions we can't figure out. 11) tcp_should_expand_sndbuf() erroneously uses tp->packets_out to measure packets in flight, properly use tcp_packets_in_flight() instead. From Neal Cardwell. 12) Fix accidental removal of support for bluetooth in CSR based Intel wireless cards. From Marcel Holtmann. 13) We accidently added a behavioral change between native and compat tasks, wrt testing the MSG_CMSG_COMPAT bit. Just ignore it if the user happened to set it in a native binary as that was always the behavior we had. From Catalin Marinas. 14) Check genlmsg_unicast() return valud in hwsim netlink tx frame handling, from Bob Copeland. 15) Fix stale ->radar_required setting in mac80211 that can prevent starting new scans, from Eliad Peller. 16) Fix memory leak in nl80211 monitor, from Johannes Berg. 17) Fix race in TX index handling in xen-netback, from David Vrabel. 18) Don't enable interrupts in amx-xgbe driver until all software et al. state is ready for the interrupt handler to run. From Thomas Lendacky. 19) Add missing netlink_ns_capable() checks to rtnl_newlink(), from Eric W Biederman. 20) The amount of header space needed in macvtap was not calculated properly, fix it otherwise we splat past the beginning of the packet. From Eric Dumazet. 21) Fix bcmgenet TCP TX perf regression, from Jaedon Shin. 22) Don't raw initialize or mod timers, use setup_timer() and mod_timer() instead. From Vaishali Thakkar. 23) Fix software maintained statistics in bcmgenet and systemport drivers, from Florian Fainelli. 24) DMA descriptor updates in sh_eth need proper memory barriers, from Ben Hutchings. 25) Don't do UDP Fragmentation Offload on RAW sockets, from Michal Kubecek. 26) Openvswitch's non-masked set actions aren't constructed properly into netlink messages, fix from Joe Stringer. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits) openvswitch: Fix serialization of non-masked set actions. gianfar: Reduce logging noise seen due to phy polling if link is down ibmveth: Add function to enable live MAC address changes net: bridge: add compile-time assert for cb struct size udp: only allow UFO for packets from SOCK_DGRAM sockets sh_eth: Really fix padding of short frames on TX Revert "sh_eth: Enable Rx descriptor word 0 shift for r8a7790" sh_eth: Fix RX recovery on R-Car in case of RX ring underrun sh_eth: Ensure proper ordering of descriptor active bit write/read net/mlx4_en: Disbale GRO for incoming loopback/selftest packets net/mlx4_core: Fix wrong mask and error flow for the update-qp command net: systemport: fix software maintained statistics net: bcmgenet: fix software maintained statistics rxrpc: don't multiply with HZ twice rxrpc: terminate retrans loop when sending of skb fails net/hsr: Fix NULL pointer dereference and refcnt bugs when deleting a HSR interface. net: pasemi: Use setup_timer and mod_timer net: stmmac: Use setup_timer and mod_timer net: 8390: axnet_cs: Use setup_timer and mod_timer net: 8390: pcnet_cs: Use setup_timer and mod_timer ...
| * | | rhashtable: use cond_resched()Eric Dumazet2015-02-271-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a hash table has 128 slots and 16384 elems, expand to 256 slots takes more than one second. For larger sets, a soft lockup is detected. Holding cpu for that long, even in a work queue is a show stopper for non preemptable kernels. cond_resched() at strategic points to allow process scheduler to reschedule us. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | rhashtable: remove indirection for grow/shrink decision functionsDaniel Borkmann2015-02-272-39/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, all real users of rhashtable default their grow and shrink decision functions to rht_grow_above_75() and rht_shrink_below_30(), so that there's currently no need to have this explicitly selectable. It can/should be generic and private inside rhashtable until a real use case pops up. Since we can make this private, we'll save us this additional indirection layer and can improve insertion/deletion time as well. Reference: http://patchwork.ozlabs.org/patch/443040/ Suggested-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | rhashtable: unconditionally grow when max_shift is not specifiedDaniel Borkmann2015-02-272-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While commit c0c09bfdc415 ("rhashtable: avoid unnecessary wakeup for worker queue") rightfully moved part of the decision making of whether we should expand or shrink from the expand/shrink functions themselves into insert/delete functions in order to avoid unnecessary worker wake-ups, it however introduced a regression by doing so. Before that change, if no max_shift was specified (= 0) on rhashtable initialization, rhashtable_expand() would just grow unconditionally and lets the available memory be the limiting factor. After that change, if no max_shift was specified, there would be _no_ expansion step at all. Given that netlink and tipc have a max_shift specified, it was not visible there, but Josh Hunt reported that if nft that starts out with a default element hint of 3 if not otherwise provided, would slow i.e. inserts down trememdously as it cannot grow larger to relax table occupancy. Given that the test case verifies shrinks/expands manually, we also must remove pointer to the helper functions to explicitly avoid parallel resizing on insertions/deletions. test_bucket_stats() and test_rht_lookup() could also be wrapped around rhashtable mutex to explicitly synchronize a walk from resizing, but I think that defeats the actual test case which intended to have explicit test steps, i.e. 1) inserts, 2) expands, 3) shrinks, 4) deletions, with object verification after each stage. Reported-by: Josh Hunt <johunt@akamai.com> Fixes: c0c09bfdc415 ("rhashtable: avoid unnecessary wakeup for worker queue") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Ying Xue <ying.xue@windriver.com> Cc: Josh Hunt <johunt@akamai.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | rhashtable: initialize all rhashtable walker membersSasha Levin2015-02-231-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit f2dba9c6ff ("rhashtable: Introduce rhashtable_walk_*") forgot to initialize the members of struct rhashtable_walker after allocating it, which caused an undefined value for 'resize' which is used later on. Fixes: f2dba9c6ff ("rhashtable: Introduce rhashtable_walk_*") Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | rhashtable: allow to unload test moduleDaniel Borkmann2015-02-201-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's no good reason why to disallow unloading of the rhashtable test case module. Commit 9d6dbe1bbaf8 moved the code from a boot test into a stand-alone module, but only converted the subsys_initcall() handler into a module_init() function without a related exit handler, and thus preventing the test module from unloading. Fixes: 9d6dbe1bbaf8 ("rhashtable: Make selftest modular") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | rhashtable: better high order allocation attemptsDaniel Borkmann2015-02-201-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When trying to allocate future tables via bucket_table_alloc(), it seems overkill on large table shifts that we probe for kzalloc() unconditionally first, as it's likely to fail. Only probe with kzalloc() for more reasonable table sizes and use vzalloc() either as a fallback on failure or directly in case of large table sizes. Fixes: 7e1e77636e36 ("lib: Resizable, Scalable, Concurrent Hash Table") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | rhashtable: don't test for shrink on insert, expansion on deleteDaniel Borkmann2015-02-201-9/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Restore pre 54c5b7d311c8 behaviour and only probe for expansions on inserts and shrinks on deletes. Currently, it will happen that on initial inserts into a sparse hash table, we may i.e. shrink it first simply because it's not fully populated yet, only to later realize that we need to grow again. This however is counter intuitive, e.g. an initial default size of 64 elements is already small enough, and in case an elements size hint is given to the hash table by a user, we should avoid unnecessary expansion steps, so a shrink is clearly unintended here. Fixes: 54c5b7d311c8 ("rhashtable: introduce rhashtable_wakeup_worker helper function") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Ying Xue <ying.xue@windriver.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | rhashtable: don't allocate ht structure on stack in test_rht_initDaniel Borkmann2015-02-201-1/+2
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With object runtime debugging enabled, the rhashtable test suite will rightfully throw a warning "ODEBUG: object is on stack, but not annotated" from rhashtable_init(). This is because run_work is (correctly) being initialized via INIT_WORK(), and not annotated by INIT_WORK_ONSTACK(). Meaning, rhashtable_init() is okay as is, we just need to move ht e.g., into global scope. It never triggered anything, since test_rhashtable is rather a controlled environment and effectively runs to completion, so that stack memory is not vanishing underneath us, we shouldn't confuse any testers with it though. Fixes: 7e1e77636e36 ("lib: Resizable, Scalable, Concurrent Hash Table") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge branch 'kconfig' of ↵Linus Torvalds2015-02-191-15/+15
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull kconfig updates from Michal Marek: "Yann E Morin was supposed to take over kconfig maintainership, but this hasn't happened. So I'm sending a few kconfig patches that I collected: - Fix for missing va_end in kconfig - merge_config.sh displays used if given too few arguments - s/boolean/bool/ in Kconfig files for consistency, with the plan to only support bool in the future" * 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: kconfig: use va_end to match corresponding va_start merge_config.sh: Display usage if given too few arguments kconfig: use bool instead of boolean for type definition attributes
| * | | kconfig: use bool instead of boolean for type definition attributesChristoph Jaeger2015-01-071-15/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Support for keyword 'boolean' will be dropped later on. No functional change. Reference: http://lkml.kernel.org/r/cover.1418003065.git.cj@linux.com Signed-off-by: Christoph Jaeger <cj@linux.com> Signed-off-by: Michal Marek <mmarek@suse.cz>
* | | | Merge tag 'virtio-next-for-linus' of ↵Linus Torvalds2015-02-181-5/+30
|\ \ \ \ | |_|/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull virtio updates from Rusty Russell: "OK, this has the big virtio 1.0 implementation, as specified by OASIS. On top of tht is the major rework of lguest, to use PCI and virtio 1.0, to double-check the implementation. Then comes the inevitable fixes and cleanups from that work" * tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (80 commits) virtio: don't set VIRTIO_CONFIG_S_DRIVER_OK twice. virtio_net: unconditionally define struct virtio_net_hdr_v1. tools/lguest: don't use legacy definitions for net device in example launcher. virtio: Don't expose legacy net features when VIRTIO_NET_NO_LEGACY defined. tools/lguest: use common error macros in the example launcher. tools/lguest: give virtqueues names for better error messages tools/lguest: more documentation and checking of virtio 1.0 compliance. lguest: don't look in console features to find emerg_wr. tools/lguest: don't start devices until DRIVER_OK status set. tools/lguest: handle indirect partway through chain. tools/lguest: insert driver references from the 1.0 spec (4.1 Virtio Over PCI) tools/lguest: insert device references from the 1.0 spec (4.1 Virtio Over PCI) tools/lguest: rename virtio_pci_cfg_cap field to match spec. tools/lguest: fix features_accepted logic in example launcher. tools/lguest: handle device reset correctly in example launcher. virtual: Documentation: simplify and generalize paravirt_ops.txt lguest: remove NOTIFY call and eventfd facility. lguest: remove NOTIFY facility from demonstration launcher. lguest: use the PCI console device's emerg_wr for early boot messages. lguest: always put console in PCI slot #1. ...
| * | | pci: add pci_iomap_rangeMichael S. Tsirkin2015-01-211-5/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Virtio drivers should map the part of the BAR they need, not necessarily all of it. Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: linux-pci@vger.kernel.org Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* | | | Merge branch 'for-linus' of ↵Linus Torvalds2015-02-171-8/+28
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc VFS updates from Al Viro: "This cycle a lot of stuff sits on topical branches, so I'll be sending more or less one pull request per branch. This is the first pile; more to follow in a few. In this one are several misc commits from early in the cycle (before I went for separate branches), plus the rework of mntput/dput ordering on umount, switching to use of fs_pin instead of convoluted games in namespace_unlock()" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: switch the IO-triggering parts of umount to fs_pin new fs_pin killing logics allow attaching fs_pin to a group not associated with some superblock get rid of the second argument of acct_kill() take count and rcu_head out of fs_pin dcache: let the dentry count go down to zero without taking d_lock pull bumping refcount into ->kill() kill pin_put() mode_t whack-a-mole: chelsio file->f_path.dentry is pinned down for as long as the file is open... get rid of lustre_dump_dentry() gut proc_register() a bit kill d_validate() ncpfs: get rid of d_validate() nonsense selinuxfs: don't open-code d_genocide()
| * | | | dcache: let the dentry count go down to zero without taking d_lockLinus Torvalds2015-01-251-8/+28
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | We can be more aggressive about this, if we are clever and careful. This is subtle. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
OpenPOWER on IntegriCloud