summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* modpost: fix module autoloading for OF devices with generic compatible propertyPhilipp Zabel2016-05-051-24/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the wildcard at the end of OF module aliases is gone, autoloading of modules that don't match a device's last (most generic) compatible value fails. For example the CODA960 VPU on i.MX6Q has the SoC specific compatible "fsl,imx6q-vpu" and the generic compatible "cnm,coda960". Since the driver currently only works with knowledge about the SoC specific integration, it doesn't list "cnm,cod960" in the module device table. This results in the device compatible "of:NvpuT<NULL>Cfsl,imx6q-vpuCcnm,coda960" not matching the module alias "of:N*T*Cfsl,imx6q-vpu" anymore, whereas before commit 2f632369ab79 ("modpost: don't add a trailing wildcard for OF module aliases") it matched the module alias "of:N*T*Cfsl,imx6q-vpu*". This patch adds two module aliases for each compatible, one without the wildcard and one with "C*" appended. $ modinfo coda | grep imx6q alias: of:N*T*Cfsl,imx6q-vpuC* alias: of:N*T*Cfsl,imx6q-vpu Fixes: 2f632369ab79 ("modpost: don't add a trailing wildcard for OF module aliases") Link: http://lkml.kernel.org/r/1462203339-15340-1-git-send-email-p.zabel@pengutronix.de Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Cc: Javier Martinez Canillas <javier@osg.samsung.com> Cc: Brian Norris <computersforpeace@gmail.com> Cc: Sjoerd Simons <sjoerd.simons@collabora.co.uk> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: <stable@vger.kernel.org> [4.5+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* proc: prevent accessing /proc/<PID>/environ until it's readyMathias Krause2016-05-051-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If /proc/<PID>/environ gets read before the envp[] array is fully set up in create_{aout,elf,elf_fdpic,flat}_tables(), we might end up trying to read more bytes than are actually written, as env_start will already be set but env_end will still be zero, making the range calculation underflow, allowing to read beyond the end of what has been written. Fix this as it is done for /proc/<PID>/cmdline by testing env_end for zero. It is, apparently, intentionally set last in create_*_tables(). This bug was found by the PaX size_overflow plugin that detected the arithmetic underflow of 'this_len = env_end - (env_start + src)' when env_end is still zero. The expected consequence is that userland trying to access /proc/<PID>/environ of a not yet fully set up process may get inconsistent data as we're in the middle of copying in the environment variables. Fixes: https://forums.grsecurity.net/viewtopic.php?f=3&t=4363 Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=116461 Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Emese Revfy <re.emese@gmail.com> Cc: Pax Team <pageexec@freemail.hu> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Mateusz Guzik <mguzik@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Jarod Wilson <jarod@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm/zswap: provide unique zpool nameDan Streetman2016-05-051-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of using "zswap" as the name for all zpools created, add an atomic counter and use "zswap%x" with the counter number for each zpool created, to provide a unique name for each new zpool. As zsmalloc, one of the zpool implementations, requires/expects a unique name for each pool created, zswap should provide a unique name. The zsmalloc pool creation does not fail if a new pool with a conflicting name is created, unless CONFIG_ZSMALLOC_STAT is enabled; in that case, zsmalloc pool creation fails with -ENOMEM. Then zswap will be unable to change its compressor parameter if its zpool is zsmalloc; it also will be unable to change its zpool parameter back to zsmalloc, if it has any existing old zpool using zsmalloc with page(s) in it. Attempts to change the parameters will result in failure to create the zpool. This changes zswap to provide a unique name for each zpool creation. Fixes: f1c54846ee45 ("zswap: dynamic pool creation") Signed-off-by: Dan Streetman <ddstreet@ieee.org> Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Dan Streetman <dan.streetman@canonical.com> Cc: Minchan Kim <minchan@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: thp: kvm: fix memory corruption in KVM with THP enabledAndrea Arcangeli2016-05-053-3/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After the THP refcounting change, obtaining a compound pages from get_user_pages() no longer allows us to assume the entire compound page is immediately mappable from a secondary MMU. A secondary MMU doesn't want to call get_user_pages() more than once for each compound page, in order to know if it can map the whole compound page. So a secondary MMU needs to know from a single get_user_pages() invocation when it can map immediately the entire compound page to avoid a flood of unnecessary secondary MMU faults and spurious atomic_inc()/atomic_dec() (pages don't have to be pinned by MMU notifier users). Ideally instead of the page->_mapcount < 1 check, get_user_pages() should return the granularity of the "page" mapping in the "mm" passed to get_user_pages(). However it's non trivial change to pass the "pmd" status belonging to the "mm" walked by get_user_pages up the stack (up to the caller of get_user_pages). So the fix just checks if there is not a single pte mapping on the page returned by get_user_pages, and in turn if the caller can assume that the whole compound page is mapped in the current "mm" (in a pmd_trans_huge()). In such case the entire compound page is safe to map into the secondary MMU without additional get_user_pages() calls on the surrounding tail/head pages. In addition of being faster, not having to run other get_user_pages() calls also reduces the memory footprint of the secondary MMU fault in case the pmd split happened as result of memory pressure. Without this fix after a MADV_DONTNEED (like invoked by QEMU during postcopy live migration or balloning) or after generic swapping (with a failure in split_huge_page() that would only result in pmd splitting and not a physical page split), KVM would map the whole compound page into the shadow pagetables, despite regular faults or userfaults (like UFFDIO_COPY) may map regular pages into the primary MMU as result of the pte faults, leading to the guest mode and userland mode going out of sync and not working on the same memory at all times. Any other secondary MMU notifier manager (KVM is just one of the many MMU notifier users) will need the same information if it doesn't want to run a flood of get_user_pages_fast and it can support multiple granularity in the secondary MMU mappings, so I think it is justified to be exposed not just to KVM. The other option would be to move transparent_hugepage_adjust to mm/huge_memory.c but that currently has all kind of KVM data structures in it, so it's definitely not a cut-and-paste work, so I couldn't do a fix as cleaner as this one for 4.6. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: "Li, Liang Z" <liang.z.li@intel.com> Cc: Amit Shah <amit.shah@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* MAINTAINERS: fix Rajendra Nayak's addressEric Engestrom2016-05-051-1/+1
| | | | | | | | Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Cc: Rajendra Nayak <rnayak@codeaurora.org> Cc: Afzal Mohammed <afzal.mohd.ma@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm, cma: prevent nr_isolated_* counters from going negativeHugh Dickins2016-05-051-9/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | /proc/sys/vm/stat_refresh warns nr_isolated_anon and nr_isolated_file go increasingly negative under compaction: which would add delay when should be none, or no delay when should delay. The bug in compaction was due to a recent mmotm patch, but much older instance of the bug was also noticed in isolate_migratepages_range() which is used for CMA and gigantic hugepage allocations. The bug is caused by putback_movable_pages() in an error path decrementing the isolated counters without them being previously incremented by acct_isolated(). Fix isolate_migratepages_range() by removing the error-path putback, thus reaching acct_isolated() with migratepages still isolated, and leaving putback to caller like most other places do. Fixes: edc2ca612496 ("mm, compaction: move pageblock checks up from isolate_migratepages_range()") [vbabka@suse.cz: expanded the changelog] Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: update min_free_kbytes from khugepaged after core initializationJason Baron2016-05-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Khugepaged attempts to raise min_free_kbytes if its set too low. However, on boot khugepaged sets min_free_kbytes first from subsys_initcall(), and then the mm 'core' over-rides min_free_kbytes after from init_per_zone_wmark_min(), via a module_init() call. Khugepaged used to use a late_initcall() to set min_free_kbytes (such that it occurred after the core initialization), however this was removed when the initialization of min_free_kbytes was integrated into the starting of the khugepaged thread. The fix here is simply to invoke the core initialization using a core_initcall() instead of module_init(), such that the previous initialization ordering is restored. I didn't restore the late_initcall() since start_stop_khugepaged() already sets min_free_kbytes via set_recommended_min_free_kbytes(). This was noticed when we had a number of page allocation failures when moving a workload to a kernel with this new initialization ordering. On an 8GB system this restores min_free_kbytes back to 67584 from 11365 when CONFIG_TRANSPARENT_HUGEPAGE=y is set and either CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y or CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y. Fixes: 79553da293d3 ("thp: cleanup khugepaged startup") Signed-off-by: Jason Baron <jbaron@akamai.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* huge pagecache: mmap_sem is unlocked when truncation splits pmdHugh Dickins2016-05-051-9/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | zap_pmd_range()'s CONFIG_DEBUG_VM !rwsem_is_locked(&mmap_sem) BUG() will be invalid with huge pagecache, in whatever way it is implemented: truncation of a hugely-mapped file to an unhugely-aligned size would easily hit it. (Although anon THP could in principle apply khugepaged to private file mappings, which are not excluded by the MADV_HUGEPAGE restrictions, in practice there's a vm_ops check which excludes them, so it never hits this BUG() - there's no interface to "truncate" an anonymous mapping.) We could complicate the test, to check i_mmap_rwsem also when there's a vm_file; but my inclination was to make zap_pmd_range() more readable by simply deleting this check. A search has shown no report of the issue in the years since commit e0897d75f0b2 ("mm, thp: print useful information when mmap_sem is unlocked in zap_pmd_range") expanded it from VM_BUG_ON() - though I cannot point to what commit I would say then fixed the issue. But there are a couple of other patches now floating around, neither yet in the tree: let's agree to retain the check as a VM_BUG_ON_VMA(), as Matthew Wilcox has done; but subject to a vma_is_anonymous() check, as Kirill Shutemov has done. And let's get this in, without waiting for any particular huge pagecache implementation to reach the tree. Matthew said "We can reproduce this BUG() in the current Linus tree with DAX PMDs". Signed-off-by: Hugh Dickins <hughd@google.com> Tested-by: Matthew Wilcox <willy@linux.intel.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andres Lagar-Cavilla <andreslc@google.com> Cc: Yang Shi <yang.shi@linaro.org> Cc: Ning Qu <quning@gmail.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Andres Lagar-Cavilla <andreslc@google.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* rapidio/mport_cdev: fix uapi type definitionsAlexandre Bounine2016-05-052-120/+139
| | | | | | | | | | | | | | | | | | | | Fix problems in uapi definitions reported by Gabriel Laskar: (see https://lkml.org/lkml/2016/4/5/205 for details) - move public header file rio_mport_cdev.h to include/uapi/linux directory - change types in data structures passed as IOCTL parameters - improve parameter checking in some IOCTL service routines Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Reported-by: Gabriel Laskar <gabriel@lse.epita.fr> Tested-by: Barry Wood <barry.wood@idt.com> Cc: Gabriel Laskar <gabriel@lse.epita.fr> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com> Cc: Barry Wood <barry.wood@idt.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: memcontrol: let v2 cgroups follow changes in system swappinessJohannes Weiner2016-05-051-0/+4
| | | | | | | | | | | | | | | Cgroup2 currently doesn't have a per-cgroup swappiness setting. We might want to add one later - that's a different discussion - but until we do, the cgroups should always follow the system setting. Otherwise it will be unchangeably set to whatever the ancestor inherited from the system setting at the time of cgroup creation. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com> Cc: <stable@vger.kernel.org> [4.5] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: thp: correct split_huge_pages file permissionYang Shi2016-05-051-2/+2
| | | | | | | | | | | | | | | split_huge_pages doesn't support get method at all, so the read permission sounds confusing, change the permission to write only. And, add "\n" to the output of set method to make it more readable. Signed-off-by: Yang Shi <yang.shi@linaro.org> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'for-linus' of ↵Linus Torvalds2016-05-051-11/+14
|\ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull userns fix from Eric Biederman: "This contains just a single fix for a nasty oops" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: propogate_mnt: Handle the first propogated copy being a slave
| * propogate_mnt: Handle the first propogated copy being a slaveEric W. Biederman2016-05-051-11/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the first propgated copy was a slave the following oops would result: > BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 > IP: [<ffffffff811fba4e>] propagate_one+0xbe/0x1c0 > PGD bacd4067 PUD bac66067 PMD 0 > Oops: 0000 [#1] SMP > Modules linked in: > CPU: 1 PID: 824 Comm: mount Not tainted 4.6.0-rc5userns+ #1523 > Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 > task: ffff8800bb0a8000 ti: ffff8800bac3c000 task.ti: ffff8800bac3c000 > RIP: 0010:[<ffffffff811fba4e>] [<ffffffff811fba4e>] propagate_one+0xbe/0x1c0 > RSP: 0018:ffff8800bac3fd38 EFLAGS: 00010283 > RAX: 0000000000000000 RBX: ffff8800bb77ec00 RCX: 0000000000000010 > RDX: 0000000000000000 RSI: ffff8800bb58c000 RDI: ffff8800bb58c480 > RBP: ffff8800bac3fd48 R08: 0000000000000001 R09: 0000000000000000 > R10: 0000000000001ca1 R11: 0000000000001c9d R12: 0000000000000000 > R13: ffff8800ba713800 R14: ffff8800bac3fda0 R15: ffff8800bb77ec00 > FS: 00007f3c0cd9b7e0(0000) GS:ffff8800bfb00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 0000000000000010 CR3: 00000000bb79d000 CR4: 00000000000006e0 > Stack: > ffff8800bb77ec00 0000000000000000 ffff8800bac3fd88 ffffffff811fbf85 > ffff8800bac3fd98 ffff8800bb77f080 ffff8800ba713800 ffff8800bb262b40 > 0000000000000000 0000000000000000 ffff8800bac3fdd8 ffffffff811f1da0 > Call Trace: > [<ffffffff811fbf85>] propagate_mnt+0x105/0x140 > [<ffffffff811f1da0>] attach_recursive_mnt+0x120/0x1e0 > [<ffffffff811f1ec3>] graft_tree+0x63/0x70 > [<ffffffff811f1f6b>] do_add_mount+0x9b/0x100 > [<ffffffff811f2c1a>] do_mount+0x2aa/0xdf0 > [<ffffffff8117efbe>] ? strndup_user+0x4e/0x70 > [<ffffffff811f3a45>] SyS_mount+0x75/0xc0 > [<ffffffff8100242b>] do_syscall_64+0x4b/0xa0 > [<ffffffff81988f3c>] entry_SYSCALL64_slow_path+0x25/0x25 > Code: 00 00 75 ec 48 89 0d 02 22 22 01 8b 89 10 01 00 00 48 89 05 fd 21 22 01 39 8e 10 01 00 00 0f 84 e0 00 00 00 48 8b 80 d8 00 00 00 <48> 8b 50 10 48 89 05 df 21 22 01 48 89 15 d0 21 22 01 8b 53 30 > RIP [<ffffffff811fba4e>] propagate_one+0xbe/0x1c0 > RSP <ffff8800bac3fd38> > CR2: 0000000000000010 > ---[ end trace 2725ecd95164f217 ]--- This oops happens with the namespace_sem held and can be triggered by non-root users. An all around not pleasant experience. To avoid this scenario when finding the appropriate source mount to copy stop the walk up the mnt_master chain when the first source mount is encountered. Further rewrite the walk up the last_source mnt_master chain so that it is clear what is going on. The reason why the first source mount is special is that it it's mnt_parent is not a mount in the dest_mnt propagation tree, and as such termination conditions based up on the dest_mnt mount propgation tree do not make sense. To avoid other kinds of confusion last_dest is not changed when computing last_source. last_dest is only used once in propagate_one and that is above the point of the code being modified, so changing the global variable is meaningless and confusing. Cc: stable@vger.kernel.org fixes: f2ebb3a921c1ca1e2ddd9242e95a1989a50c4c68 ("smarter propagate_mnt()") Reported-by: Tycho Andersen <tycho.andersen@canonical.com> Reviewed-by: Seth Forshee <seth.forshee@canonical.com> Tested-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* | Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds2016-05-052-2/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | Pull virtio/qemu fixes from Michael Tsirkin: "A couple of fixes for virtio and for the new QEMU fw cfg driver" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio: Silence uninitialized variable warning firmware: qemu_fw_cfg.c: potential unintialized variable
| * | virtio: Silence uninitialized variable warningDan Carpenter2016-05-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Smatch complains that we might not initialize "queue". The issue is callers like setup_vq() from virtio_pci_modern.c where "num" could be something like 2 and "vring_align" is 64. In that case, vring_size() is less than PAGE_SIZE. It won't happen in real life, but we're getting the value of "num" from a register so it's not really possible to tell what value it holds with static analysis. Let's just silence the warning. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * | firmware: qemu_fw_cfg.c: potential unintialized variableDan Carpenter2016-04-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | It acpi_acquire_global_lock() return AE_NOT_CONFIGURED then "glk" isn't initialized, which, if you got very unlucky, could cause a bug. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | | Merge branch 'for-linus' of ↵Linus Torvalds2016-05-043-24/+24
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: atmel_mxt_ts - use mxt_acquire_irq in mxt_soft_reset Input: zforce_ts - fix dual touch recognition Input: twl6040-vibra - fix atomic schedule panic
| * | | Input: atmel_mxt_ts - use mxt_acquire_irq in mxt_soft_resetNick Dyer2016-04-251-14/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If using IRQF_TRIGGER_FALLING, then there is a race here: if the reset completes before we enable the IRQ, then CHG is already low and touch will be broken. This has been seen on Chromebook Pixel 2. A workaround is to reconfig T18 COMMSCONFIG to enable the RETRIGEN bit using mxt-app: mxt-app -W -T18 44 mxt-app --backup Tested-by: Tom Rini <trini@konsulko.com> Signed-off-by: Nick Dyer <nick.dyer@itdev.co.uk> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
| * | | Input: zforce_ts - fix dual touch recognitionKnut Wohlrab2016-04-251-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A wrong decoding of the touch coordinate message causes a wrong touch ID. Touch ID for dual touch must be 0 or 1. According to the actual Neonode nine byte touch coordinate coding, the state is transported in the lower nibble and the touch ID in the higher nibble of payload byte five. Signed-off-by: Knut Wohlrab <Knut.Wohlrab@de.bosch.com> Signed-off-by: Oleksij Rempel <linux@rempel-privat.de> Signed-off-by: Dirk Behme <dirk.behme@de.bosch.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
| * | | Input: twl6040-vibra - fix atomic schedule panicH. Nikolaus Schaller2016-04-251-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit c6f39257c952 ("mfd: twl6040: Use regmap for register cache") did remove the private cache for the vibra control registers and replaced access within twl6040_get_vibralr_status() by calls to regmap. This is OK, as long as twl6040_get_vibralr_status() uses already cached values or is not called from interrupt context. But we call this in vibra_play() for checking that the vibrator is not configured for audio mode. The result is a "BUG: scheduling while atomic" if the first use of the twl6040 is a vibra effect, because the first fetch is by reading the twl6040 registers through (blocking) i2c and not from the cache. As soon as the regmap has cached the status, further calls are fine. The solution is to move the condition to the work() function which runs in context that can block. The original code returns -EBUSY, but the return value of ->play() functions is ignored anyways. Hence, we do not loose functionality by not returning an error but just reporting the issue to INFO loglevel. Tested-on: Pyra (omap5) prototype Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
* | | | Merge branch 'for-linus' of ↵Linus Torvalds2016-05-041-2/+2
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull IMA fix from James Morris. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: ima: fix the string representation of the LSM/IMA hook enumeration ordering
| * | | | ima: fix the string representation of the LSM/IMA hook enumeration orderingMimi Zohar2016-05-041-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes the string representation of the LSM/IMA hook enumeration ordering used for displaying the IMA policy. Fixes: d9ddf077bb85 ("ima: support for kexec image and initramfs") Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Tested-by: Eric Richter <erichte@linux.vnet.ibm.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
* | | | | Merge tag 'for-linus-4.6-rc6-tag' of ↵Linus Torvalds2016-05-043-14/+26
|\ \ \ \ \ | |/ / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen regression fixes from David Vrabel: - Fix two regressions causing crashes in 32-bit PV guests - Fix a regression in the evtchn driver * tag 'for-linus-4.6-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/evtchn: fix ring resize when binding new events xen/balloon: Fix crash when ballooning on x86 32 bit PAE xen: Fix page <-> pfn conversion on 32 bit systems
| * | | | xen/evtchn: fix ring resize when binding new eventsJan Beulich2016-05-041-12/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The copying of ring data was wrong for two cases: For a full ring nothing got copied at all (as in that case the canonicalized producer and consumer indexes are identical). And in case one or both of the canonicalized (after the resize) indexes would point into the second half of the buffer, the copied data ended up in the wrong (free) part of the new buffer. In both cases uninitialized data would get passed back to the caller. Fix this by simply copying the old ring contents twice: Once to the low half of the new buffer, and a second time to the high half. This addresses the inability to boot a HVM guest with 64 or more vCPUs. This regression was caused by 8620015499101090 (xen/evtchn: dynamically grow pending event channel ring). Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: <stable@vger.kernel.org> # 4.4+ Signed-off-by: David Vrabel <david.vrabel@citrix.com>
| * | | | xen/balloon: Fix crash when ballooning on x86 32 bit PAERoss Lagerwall2016-04-061-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 55b3da98a40dbb3776f7454daf0d95dde25c33d2 (xen/balloon: find non-conflicting regions to place hotplugged memory) caused a regression in 4.4. When ballooning on an x86 32 bit PAE system with close to 64 GiB of memory, the address returned by allocate_resource may be above 64 GiB. When using CONFIG_SPARSEMEM, this setup is limited to using physical addresses < 64 GiB. When adding memory at this address, it runs off the end of the mem_section array and causes a crash. Instead, fail the ballooning request. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Cc: <stable@vger.kernel.org> # 4.4+ Signed-off-by: David Vrabel <david.vrabel@citrix.com>
| * | | | xen: Fix page <-> pfn conversion on 32 bit systemsRoss Lagerwall2016-04-061-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 1084b1988d22dc165c9dbbc2b0e057f9248ac4db (xen: Add Xen specific page definition) caused a regression in 4.4. The xen functions to convert between pages and pfns fail due to an overflow on systems where a physical address may not fit in an unsigned long (e.g. x86 32 bit PAE systems). Rework the conversion to avoid overflow. This should also result in simpler object code. This bug manifested itself as disk corruption with Linux 4.4 when using blkfront in a Xen HVM x86 32 bit guest with more than 4 GiB of memory. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Cc: <stable@vger.kernel.org> # 4.4+ Signed-off-by: David Vrabel <david.vrabel@citrix.com>
* | | | | Merge tag 'trace-fixes-v4.6-rc6' of ↵Linus Torvalds2016-05-031-2/+7
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fix from Steven Rostedt: "Chunyu Hu noticed that if one writes into the trigger files within the ftrace subsystem of events that it can cause an oops. This file is only writable by root, but still is a bug that needs to be fixed" * tag 'trace-fixes-v4.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Don't display trigger file for events that can't be enabled
| * | | | | tracing: Don't display trigger file for events that can't be enabledChunyu Hu2016-05-031-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently register functions for events will be called through the 'reg' field of event class directly without any check when seting up triggers. Triggers for events that don't support register through debug fs (events under events/ftrace are for trace-cmd to read event format, and most of them don't have a register function except events/ftrace/functionx) can't be enabled at all, and an oops will be hit when setting up trigger for those events, so just not creating them is an easy way to avoid the oops. Link: http://lkml.kernel.org/r/1462275274-3911-1-git-send-email-chuhu@redhat.com Cc: stable@vger.kernel.org # 3.14+ Fixes: 85f2b08268c01 ("tracing: Add basic event trigger framework") Signed-off-by: Chunyu Hu <chuhu@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2016-05-0324-127/+259
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: "Some straggler bug fixes: 1) Batman-adv DAT must consider VLAN IDs when choosing candidate nodes, from Antonio Quartulli. 2) Fix botched reference counting of vlan objects and neigh nodes in batman-adv, from Sven Eckelmann. 3) netem can crash when it sees GSO packets, the fix is to segment then upon ->enqueue. Fix from Neil Horman with help from Eric Dumazet. 4) Fix VXLAN dependencies in mlx5 driver Kconfig, from Matthew Finlay. 5) Handle VXLAN ops outside of rcu lock, via a workqueue, in mlx5, since it can sleep. Fix also from Matthew Finlay. 6) Check mdiobus_scan() return values properly in pxa168_eth and macb drivers. From Sergei Shtylyov. 7) If the netdevice doesn't support checksumming, disable segmentation. From Alexandery Duyck. 8) Fix races between RDS tcp accept and sending, from Sowmini Varadhan. 9) In macb driver, probe MDIO bus before we register the netdev, otherwise we can try to open the device before it is really ready for that. Fix from Florian Fainelli. 10) Netlink attribute size for ILA "tunnels" not calculated properly, fix from Nicolas Dichtel" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: ipv6/ila: fix nlsize calculation for lwtunnel net: macb: Probe MDIO bus before registering netdev RDS: TCP: Synchronize accept() and connect() paths on t_conn_lock. RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock vxlan: Add checksum check to the features check function net: Disable segmentation if checksumming is not supported net: mvneta: Remove superfluous SMP function call macb: fix mdiobus_scan() error check pxa168_eth: fix mdiobus_scan() error check net/mlx5e: Use workqueue for vxlan ops net/mlx5e: Implement a mlx5e workqueue net/mlx5: Kconfig: Fix MLX5_EN/VXLAN build issue net/mlx5: Unmap only the relevant IO memory mapping netem: Segment GSO packets on enqueue batman-adv: Fix reference counting of hardif_neigh_node object for neigh_node batman-adv: Fix reference counting of vlan object for tt_local_entry batman-adv: B.A.T.M.A.N V - make sure iface is reactivated upon NETDEV_UP event batman-adv: fix DAT candidate selection (must use vid)
| * | | | | | ipv6/ila: fix nlsize calculation for lwtunnelNicolas Dichtel2016-05-031-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The handler 'ila_fill_encap_info' adds one attribute: ILA_ATTR_LOCATOR. Fixes: 65d7ab8de582 ("net: Identifier Locator Addressing module") CC: Tom Herbert <tom@herbertland.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | net: macb: Probe MDIO bus before registering netdevFlorian Fainelli2016-05-031-12/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current sequence makes us register for a network device prior to registering and probing the MDIO bus which could lead to some unwanted consequences, like a thread of execution calling into ndo_open before register_netdev() returns, while the MDIO bus is not ready yet. Rework the sequence to register for the MDIO bus, and therefore attach to a PHY prior to calling register_netdev(), which implies reworking the error path a bit. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | Merge branch 'rds-fixes'David S. Miller2016-05-034-19/+50
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sowmini Varadhan says: ==================== RDS: TCP: sychronization during connection startup This patch series ensures that the passive (accept) side of the TCP connection used for RDS-TCP is correctly synchronized with any concurrent active (connect) attempts for a given pair of peers. Patch 1 in the series makes sure that the t_sock in struct rds_tcp_connection is only reset after any threads in rds_tcp_xmit have completed (otherwise a null-ptr deref may be encountered). Patch 2 synchronizes rds_tcp_accept_one() with the rds_tcp*connect() path. v2: review comments from Santosh Shilimkar, other spelling corrections ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | RDS: TCP: Synchronize accept() and connect() paths on t_conn_lock.Sowmini Varadhan2016-05-034-10/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | An arbitration scheme for duelling SYNs is implemented as part of commit 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an outgoing socket in rds_tcp_accept_one()") which ensures that both nodes involved will arrive at the same arbitration decision. However, this needs to be synchronized with an outgoing SYN to be generated by rds_tcp_conn_connect(). This commit achieves the synchronization through the t_conn_lock mutex in struct rds_tcp_connection. The rds_conn_state is checked in rds_tcp_conn_connect() after acquiring the t_conn_lock mutex. A SYN is sent out only if the RDS connection is not already UP (an UP would indicate that rds_tcp_accept_one() has completed 3WH, so no SYN needs to be generated). Similarly, the rds_conn_state is checked in rds_tcp_accept_one() after acquiring the t_conn_lock mutex. The only acceptable states (to allow continuation of the arbitration logic) are UP (i.e., outgoing SYN was SYN-ACKed by peer after it sent us the SYN) or CONNECTING (we sent outgoing SYN before we saw incoming SYN). Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sockSowmini Varadhan2016-05-032-17/+25
| |/ / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a race condition between rds_send_xmit -> rds_tcp_xmit and the code that deals with resolution of duelling syns added by commit 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an outgoing socket in rds_tcp_accept_one()"). Specifically, we may end up derefencing a null pointer in rds_send_xmit if we have the interleaving sequence: rds_tcp_accept_one rds_send_xmit conn is RDS_CONN_UP, so invoke rds_tcp_xmit tc = conn->c_transport_data rds_tcp_restore_callbacks /* reset t_sock */ null ptr deref from tc->t_sock The race condition can be avoided without adding the overhead of additional locking in the xmit path: have rds_tcp_accept_one wait for rds_tcp_xmit threads to complete before resetting callbacks. The synchronization can be done in the same manner as rds_conn_shutdown(). First set the rds_conn_state to something other than RDS_CONN_UP (so that new threads cannot get into rds_tcp_xmit()), then wait for RDS_IN_XMIT to be cleared in the conn->c_flags indicating that any threads in rds_tcp_xmit are done. Fixes: 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an outgoing socket in rds_tcp_accept_one()") Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | Merge branch 'tunnel-csum-and-sg-offloads'David S. Miller2016-05-033-2/+9
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Alexander Duyck says: ==================== Fixes for tunnel checksum and segmentation offloads This patch series is a subset of patches I had submitted for net-next. I plan to drop these two patches from the v3 of "Fix Tunnel features and enable GSO partial for several drivers" and I am instead submitting them for net since these are truly fixes and likely will need to be backported to stable branches. This series addresses 2 specific issues. The first is that we could request TSO on a v4 inner header while not supporting checksum offload of the outer IPv6 header. The second is that we could request an IPv6 inner checksum offload without validating that we could actually support an inner IPv6 checksum offload. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | vxlan: Add checksum check to the features check functionAlexander Duyck2016-05-032-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need to perform an additional check on the inner headers to determine if we can offload the checksum for them. Previously this check didn't occur so we would generate an invalid frame in the case of an IPv6 header encapsulated inside of an IPv4 tunnel. To fix this I added a secondary check to vxlan_features_check so that we can verify that we can offload the inner checksum. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | net: Disable segmentation if checksumming is not supportedAlexander Duyck2016-05-031-1/+1
| |/ / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the case of the mlx4 and mlx5 driver they do not support IPv6 checksum offload for tunnels. With this being the case we should disable GSO in addition to the checksum offload features when we find that a device cannot perform a checksum on a given packet type. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | net: mvneta: Remove superfluous SMP function callAnna-Maria Gleixner2016-05-031-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 3b9d6da67e11 ("cpu/hotplug: Fix rollback during error-out in __cpu_disable()") it is ensured that callbacks of CPU_ONLINE and CPU_DOWN_PREPARE are processed on the hotplugged CPU. Due to this SMP function calls are no longer required. Replace smp_call_function_single() with a direct call to mvneta_percpu_enable() or mvneta_percpu_disable(). The functions do not require to be called with interrupts disabled, therefore the smp_call_function_single() calling convention is not preserved. Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Cc: netdev@vger.kernel.org Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | macb: fix mdiobus_scan() error checkSergei Shtylyov2016-05-031-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now mdiobus_scan() returns ERR_PTR(-ENODEV) instead of NULL if the PHY device ID was read as all ones. As this was not an error before, this value should be filtered out now in this driver. Fixes: b74766a0a0fe ("phylib: don't return NULL from get_phy_device()") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | pxa168_eth: fix mdiobus_scan() error checkSergei Shtylyov2016-05-031-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since mdiobus_scan() returns either an error code or NULL on error, the driver should check for both, not only for NULL, otherwise a crash is imminent... Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | Merge branch 'mlx5-fixes'David S. Miller2016-05-036-29/+74
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Saeed Mahameed says: ==================== Mellanox 100G mlx5 fixes for 4.6-rc This small series provides some bug fixes for mlx5 driver. A small bug fix for iounmap of a null pointer, which dumps a warning on some archs. One patch to fix the VXLAN/MLX5_EN dependency issue reported by Arnd. Two patches to fix the scheduling while atomic issue for ndo_add/del_vxlan_port NDOs. The first will add an internal mlx5e workqueue and the second will delegate vxlan ports add/del requests to that workqueue. Note: ('net/mlx5: Kconfig: Fix MLX5_EN/VXLAN build issue') is only needed for net and not net-next as the issue was globally fixed for all device drivers by: b7aade15485a ('vxlan: break dependency with netdev drivers') in net-next. Applied on top: f27337e16f2d ('ip_tunnel: fix preempt warning in ip tunnel creation/updating') ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | net/mlx5e: Use workqueue for vxlan opsMatthew Finlay2016-05-033-16/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The vxlan add/delete port NDOs are called under rcu lock. The current mlx5e implementation can potentially block in these calls, which is not allowed. Move to using the mlx5e workqueue to handle these NDOs. Fixes: b3f63c3d5e2c ('net/mlx5e: Add netdev support for VXLAN tunneling') Signed-off-by: Matthew Finlay <matt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | net/mlx5e: Implement a mlx5e workqueueMatthew Finlay2016-05-032-11/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement a mlx5e workqueue to handle all mlx5e specific tasks. Move all tasks currently using the system workqueue to the new workqueue. This is in preparation for vxlan using the mlx5e workqueue in order to schedule port add/remove operations. Signed-off-by: Matthew Finlay <matt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | net/mlx5: Kconfig: Fix MLX5_EN/VXLAN build issueMatthew Finlay2016-05-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When MLX5_EN=y MLX5_CORE=y and VXLAN=m there is a linker error for vxlan_get_rx_port() due to the fact that VXLAN is a module. Change Kconfig to select VXLAN when MLX5_CORE=y. When MLX5_CORE=m there is no dependency on the value of VXLAN. Fixes: b3f63c3d5e2c ('net/mlx5e: Add netdev support for VXLAN tunneling') Signed-off-by: Matthew Finlay <matt@mellanox.com> Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | net/mlx5: Unmap only the relevant IO memory mappingGal Pressman2016-05-031-2/+4
| |/ / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When freeing UAR the driver tries to unmap uar->map and uar->bf_map which are mutually exclusive thus always unmapping a NULL pointer. Make sure we only call iounmap() once, for the actual mapping. Fixes: 0ba422410bbf ('net/mlx5: Fix global UAR mapping') Signed-off-by: Gal Pressman <galp@mellanox.com> Reported-by: Doron Tsur <doront@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | netem: Segment GSO packets on enqueueNeil Horman2016-05-031-2/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was recently reported to me, and reproduced on the latest net kernel, when attempting to run netperf from a host that had a netem qdisc attached to the egress interface: [ 788.073771] ---------------------[ cut here ]--------------------------- [ 788.096716] WARNING: at net/core/dev.c:2253 skb_warn_bad_offload+0xcd/0xda() [ 788.129521] bnx2: caps=(0x00000001801949b3, 0x0000000000000000) len=2962 data_len=0 gso_size=1448 gso_type=1 ip_summed=3 [ 788.182150] Modules linked in: sch_netem kvm_amd kvm crc32_pclmul ipmi_ssif ghash_clmulni_intel sp5100_tco amd64_edac_mod aesni_intel lrw gf128mul glue_helper ablk_helper edac_mce_amd cryptd pcspkr sg edac_core hpilo ipmi_si i2c_piix4 k10temp fam15h_power hpwdt ipmi_msghandler shpchp acpi_power_meter pcc_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ahci ata_generic pata_acpi ttm libahci crct10dif_pclmul pata_atiixp tg3 libata crct10dif_common drm crc32c_intel ptp serio_raw bnx2 r8169 hpsa pps_core i2c_core mii dm_mirror dm_region_hash dm_log dm_mod [ 788.465294] CPU: 16 PID: 0 Comm: swapper/16 Tainted: G W ------------ 3.10.0-327.el7.x86_64 #1 [ 788.511521] Hardware name: HP ProLiant DL385p Gen8, BIOS A28 12/17/2012 [ 788.542260] ffff880437c036b8 f7afc56532a53db9 ffff880437c03670 ffffffff816351f1 [ 788.576332] ffff880437c036a8 ffffffff8107b200 ffff880633e74200 ffff880231674000 [ 788.611943] 0000000000000001 0000000000000003 0000000000000000 ffff880437c03710 [ 788.647241] Call Trace: [ 788.658817] <IRQ> [<ffffffff816351f1>] dump_stack+0x19/0x1b [ 788.686193] [<ffffffff8107b200>] warn_slowpath_common+0x70/0xb0 [ 788.713803] [<ffffffff8107b29c>] warn_slowpath_fmt+0x5c/0x80 [ 788.741314] [<ffffffff812f92f3>] ? ___ratelimit+0x93/0x100 [ 788.767018] [<ffffffff81637f49>] skb_warn_bad_offload+0xcd/0xda [ 788.796117] [<ffffffff8152950c>] skb_checksum_help+0x17c/0x190 [ 788.823392] [<ffffffffa01463a1>] netem_enqueue+0x741/0x7c0 [sch_netem] [ 788.854487] [<ffffffff8152cb58>] dev_queue_xmit+0x2a8/0x570 [ 788.880870] [<ffffffff8156ae1d>] ip_finish_output+0x53d/0x7d0 ... The problem occurs because netem is not prepared to handle GSO packets (as it uses skb_checksum_help in its enqueue path, which cannot manipulate these frames). The solution I think is to simply segment the skb in a simmilar fashion to the way we do in __dev_queue_xmit (via validate_xmit_skb), with some minor changes. When we decide to corrupt an skb, if the frame is GSO, we segment it, corrupt the first segment, and enqueue the remaining ones. tested successfully by myself on the latest net kernel, to which this applies Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Jamal Hadi Salim <jhs@mojatatu.com> CC: "David S. Miller" <davem@davemloft.net> CC: netem@lists.linux-foundation.org CC: eric.dumazet@gmail.com CC: stephen@networkplumber.org Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-mergeDavid S. Miller2016-05-036-56/+41
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Antonio Quartulli says: ==================== In this small batch of patches you have: - a fix for our Distributed ARP Table that makes sure that the input provided to the hash function during a query is the same as the one provided during an insert (so to prevent false negatives), by Antonio Quartulli - a fix for our new protocol implementation B.A.T.M.A.N. V that ensures that a hard interface is properly re-activated when it is brought down and then up again, by Antonio Quartulli - two fixes respectively to the reference counting of the tt_local_entry and neigh_node objects, by Sven Eckelmann. Such bug is rather severe as it would prevent the netdev objects references by batman-adv from being released after shutdown. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | batman-adv: Fix reference counting of hardif_neigh_node object for neigh_nodeSven Eckelmann2016-04-292-11/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The batadv_neigh_node was specific to a batadv_hardif_neigh_node and held an implicit reference to it. But this reference was never stored in form of a pointer in the batadv_neigh_node itself. Instead batadv_neigh_node_release depends on a consistent state of hard_iface->neigh_list and that batadv_hardif_neigh_get always returns the batadv_hardif_neigh_node object which it has a reference for. But batadv_hardif_neigh_get cannot guarantee that because it is working only with rcu_read_lock on this list. It can therefore happen that a neigh_addr is in this list twice or that batadv_hardif_neigh_get cannot find the batadv_hardif_neigh_node for an neigh_addr due to some other list operations taking place at the same time. Instead add a batadv_hardif_neigh_node pointer directly in batadv_neigh_node which will be used for the reference counter decremented on release of batadv_neigh_node. Fixes: cef63419f7db ("batman-adv: add list of unique single hop neighbors per hard-interface") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <a@unstable.cc>
| | * | | | | | batman-adv: Fix reference counting of vlan object for tt_local_entrySven Eckelmann2016-04-292-38/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The batadv_tt_local_entry was specific to a batadv_softif_vlan and held an implicit reference to it. But this reference was never stored in form of a pointer in the tt_local_entry itself. Instead batadv_tt_local_remove, batadv_tt_local_table_free and batadv_tt_local_purge_pending_clients depend on a consistent state of bat_priv->softif_vlan_list and that batadv_softif_vlan_get always returns the batadv_softif_vlan object which it has a reference for. But batadv_softif_vlan_get cannot guarantee that because it is working only with rcu_read_lock on this list. It can therefore happen that an vid is in this list twice or that batadv_softif_vlan_get cannot find the batadv_softif_vlan for an vid due to some other list operations taking place at the same time. Instead add a batadv_softif_vlan pointer directly in batadv_tt_local_entry which will be used for the reference counter decremented on release of batadv_tt_local_entry. Fixes: 35df3b298fc8 ("batman-adv: fix TT VLAN inconsistency on VLAN re-add") Signed-off-by: Sven Eckelmann <sven@narfation.org> Acked-by: Antonio Quartulli <a@unstable.cc> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <a@unstable.cc>
| | * | | | | | batman-adv: B.A.T.M.A.N V - make sure iface is reactivated upon NETDEV_UP eventAntonio Quartulli2016-04-293-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At the moment there is no explicit reactivation of an hard-interface upon NETDEV_UP event. In case of B.A.T.M.A.N. IV the interface is reactivated as soon as the next OGM is scheduled for sending, but this mechanism does not work with B.A.T.M.A.N. V. The latter does not rely on the same scheduling mechanism as its predecessor and for this reason the hard-interface remains deactivated forever after being brought down once. This patch fixes the reactivation mechanism by adding a new routing API which explicitly allows each algorithm to perform any needed operation upon interface re-activation. Such API is optional and is implemented by B.A.T.M.A.N. V only and it just takes care of setting the iface status to ACTIVE Signed-off-by: Antonio Quartulli <a@unstable.cc> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
OpenPOWER on IntegriCloud