summaryrefslogtreecommitdiffstats
path: root/lib
Commit message (Collapse)AuthorAgeFilesLines
...
| * | | | | | | lib/vsprintf: Move pointer_string() upperAndy Shevchenko2018-04-111-14/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As preparatory patch to further clean up. No functional change. Link: http://lkml.kernel.org/r/20180216210711.79901-5-andriy.shevchenko@linux.intel.com To: "Tobin C . Harding" <me@tobin.cc> To: linux@rasmusvillemoes.dk To: Joe Perches <joe@perches.com> To: linux-kernel@vger.kernel.org To: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Petr Mladek <pmladek@suse.com>
| * | | | | | | lib/vsprintf: Make flag_spec globalAndy Shevchenko2018-04-111-13/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are places where default specification to print flags as number is in use. Make it global and convert existing users. Link: http://lkml.kernel.org/r/20180216210711.79901-4-andriy.shevchenko@linux.intel.com To: "Tobin C . Harding" <me@tobin.cc> To: linux@rasmusvillemoes.dk To: Joe Perches <joe@perches.com> To: linux-kernel@vger.kernel.org To: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Petr Mladek <pmladek@suse.com>
| * | | | | | | lib/vsprintf: Make strspec globalAndy Shevchenko2018-04-111-12/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are places where default specification to print strings is in use. Make it global and convert existing users. Link: http://lkml.kernel.org/r/20180216210711.79901-3-andriy.shevchenko@linux.intel.com To: "Tobin C . Harding" <me@tobin.cc> To: linux@rasmusvillemoes.dk To: Joe Perches <joe@perches.com> To: linux-kernel@vger.kernel.org To: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Petr Mladek <pmladek@suse.com>
| * | | | | | | lib/vsprintf: Make dec_spec globalAndy Shevchenko2018-04-111-12/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are places where default specification to print decimal numbers is in use. Make it global and convert existing users. Link: http://lkml.kernel.org/r/20180216210711.79901-2-andriy.shevchenko@linux.intel.com To: "Tobin C . Harding" <me@tobin.cc> To: linux@rasmusvillemoes.dk To: Joe Perches <joe@perches.com> To: linux-kernel@vger.kernel.org To: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Petr Mladek <pmladek@suse.com>
| * | | | | | | lib/test_printf: Mark big constant with ULAndy Shevchenko2018-04-111-1/+1
| |/ / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sparse complains that constant is so big for unsigned long on 64-bit architecture. lib/test_printf.c:217:54: warning: constant 0xffff0123456789ab is so big it is unsigned long lib/test_printf.c:246:54: warning: constant 0xffff0123456789ab is so big it is unsigned long To satisfy everyone, mark the constant with UL. Link: http://lkml.kernel.org/r/20180216210711.79901-1-andriy.shevchenko@linux.intel.com To: "Tobin C . Harding" <me@tobin.cc> To: linux@rasmusvillemoes.dk To: Joe Perches <joe@perches.com> To: linux-kernel@vger.kernel.org To: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> [pmladek@suse.com: Changed from ULL to UL as suggested by Luc Van Oostenryck <luc.vanoostenryck@gmail.com>] Signed-off-by: Petr Mladek <pmladek@suse.com>
* | | | | | | Merge tag 'rslib-v4.18-rc1' of ↵Linus Torvalds2018-06-053-130/+159
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull reed-salomon library updates from Kees Cook: "Refactors rslib and callers to provide a per-instance allocation area instead of performing VLAs on the stack" * tag 'rslib-v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: rslib: Allocate decoder buffers to avoid VLAs mtd: rawnand: diskonchip: Allocate rs control per instance rslib: Split rs control struct rslib: Simplify error path rslib: Remove GPL boilerplate rslib: Add SPDX identifiers rslib: Cleanup top level comments rslib: Cleanup whitespace damage dm/verity_fec: Use GFP aware reed solomon init rslib: Add GFP aware init function
| * | | | | | | rslib: Allocate decoder buffers to avoid VLAsThomas Gleixner2018-04-242-8/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To get rid of the variable length arrays on stack in the RS decoder it's necessary to allocate the decoder buffers per control structure instance. All usage sites have been checked for potential parallel decoder usage and fixed where necessary. Kees confirmed that the pstore decoding is strictly single threaded so there should be no surprises. Allocate them in the rs control structure sized depending on the number of roots for the chosen codec and adapt the decoder code to make use of them. Document the fact that decode operations based on a particular rs control instance cannot run in parallel and the caller has to ensure that as it's not possible to provide a proper locking construct which fits all use cases. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Kees Cook <keescook@chromium.org> Cc: Boris Brezillon <boris.brezillon@free-electrons.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Kernel Hardening <kernel-hardening@lists.openwall.com> Cc: Richard Weinberger <richard@nod.at> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Anton Vorontsov <anton@enomsg.org> Cc: Colin Cross <ccross@android.com> Cc: Andrew Morton <akpm@linuxfoundation.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Alasdair Kergon <agk@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | | | | | rslib: Split rs control structThomas Gleixner2018-04-243-58/+79
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The decoder library uses variable length arrays on stack. To get rid of them it would be simple to allocate fixed length arrays on stack, but those might become rather large. The other solution is to allocate the buffers in the rs control structure, but this cannot be done as long as the structure can be shared by several users. Sharing is desired because the RS polynom tables are large and initialization is time consuming. To solve this split the codec information out of the control structure and have a pointer to a shared codec in it. Instantiate the control structure for each user, create a new codec if no shareable is avaiable yet. Adjust all affected usage sites to the new scheme. This allows to add per instance decoder buffers to the control structure later on. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Boris Brezillon <boris.brezillon@bootlin.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Kernel Hardening <kernel-hardening@lists.openwall.com> Cc: Richard Weinberger <richard@nod.at> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Anton Vorontsov <anton@enomsg.org> Cc: Colin Cross <ccross@android.com> Cc: Andrew Morton <akpm@linuxfoundation.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Alasdair Kergon <agk@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | | | | | rslib: Simplify error pathThomas Gleixner2018-04-241-11/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The four error path labels in rs_init() can be reduced to one by allocating the struct with kzalloc so the pointers in the struct are NULL and can be unconditionally handed in to kfree() because they either point to an allocation or are NULL. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | | | | | rslib: Remove GPL boilerplateThomas Gleixner2018-04-241-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that SPDX identifiers are in place, remove the GPL boiler plate text. Leave the notices which document that Phil Karn granted permission in place (encode/decode source code). The modified files are code written for the kernel by me. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Boris Brezillon <boris.brezillon@free-electrons.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Kernel Hardening <kernel-hardening@lists.openwall.com> Cc: Richard Weinberger <richard@nod.at> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Anton Vorontsov <anton@enomsg.org> Cc: Colin Cross <ccross@android.com> Cc: Andrew Morton <akpm@linuxfoundation.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Alasdair Kergon <agk@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | | | | | rslib: Add SPDX identifiersThomas Gleixner2018-04-243-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Reed-Solomon library is based on code from Phil Karn who granted permission to import it into the kernel under the GPL V2. See commit 15b5423757a7 ("Shared Reed-Solomon ECC library") in the history git tree at: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git ... The encoder/decoder code is lifted from the GPL'd userspace RS-library written by Phil Karn. I modified/wrapped it to provide the different functions which we need in the MTD/NAND code. ... Signed-Off-By: Thomas Gleixner <tglx@linutronix.de> Signed-Off-By: David Woodhouse <dwmw2@infradead.org> "No objections at all. Just keep the authorship notices." -- Phil Karn Add the proper SPDX identifiers according to Documentation/process/license-rules.rst. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Boris Brezillon <boris.brezillon@free-electrons.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Kernel Hardening <kernel-hardening@lists.openwall.com> Cc: Richard Weinberger <richard@nod.at> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Anton Vorontsov <anton@enomsg.org> Cc: Colin Cross <ccross@android.com> Cc: Andrew Morton <akpm@linuxfoundation.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Alasdair Kergon <agk@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | | | | | rslib: Cleanup top level commentsThomas Gleixner2018-04-243-29/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | File references and stale CVS ids are really not useful. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Boris Brezillon <boris.brezillon@free-electrons.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Kernel Hardening <kernel-hardening@lists.openwall.com> Cc: Richard Weinberger <richard@nod.at> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Anton Vorontsov <anton@enomsg.org> Cc: Colin Cross <ccross@android.com> Cc: Andrew Morton <akpm@linuxfoundation.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Alasdair Kergon <agk@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | | | | | rslib: Cleanup whitespace damageThomas Gleixner2018-04-241-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of mixing the whitespace cleanup into functional changes, mop it up first. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Boris Brezillon <boris.brezillon@free-electrons.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Kernel Hardening <kernel-hardening@lists.openwall.com> Cc: Richard Weinberger <richard@nod.at> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Anton Vorontsov <anton@enomsg.org> Cc: Colin Cross <ccross@android.com> Cc: Andrew Morton <akpm@linuxfoundation.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Alasdair Kergon <agk@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | | | | | rslib: Add GFP aware init functionThomas Gleixner2018-04-241-20/+23
| | |_|/ / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The rslib usage in dm/verity_fec is broken because init_rs() can nest in GFP_NOIO mempool allocations as init_rs() is invoked from the mempool alloc callback. Provide a variant which takes gfp_t flags as argument. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: Neil Brown <neilb@suse.com> Signed-off-by: Kees Cook <keescook@chromium.org>
* | | | | | | Merge branch 'x86-dax-for-linus' of ↵Linus Torvalds2018-06-041-0/+61
|\ \ \ \ \ \ \ | | |_|_|_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 dax updates from Ingo Molnar: "This contains x86 memcpy_mcsafe() fault handling improvements the nvdimm tree would like to make more use of" * 'x86-dax-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asm/memcpy_mcsafe: Define copy_to_iter_mcsafe() x86/asm/memcpy_mcsafe: Add write-protection-fault handling x86/asm/memcpy_mcsafe: Return bytes remaining x86/asm/memcpy_mcsafe: Add labels for __memcpy_mcsafe() write fault handling x86/asm/memcpy_mcsafe: Remove loop unrolling
| * | | | | | x86/asm/memcpy_mcsafe: Define copy_to_iter_mcsafe()Dan Williams2018-05-151-0/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the updated memcpy_mcsafe() implementation to define copy_user_mcsafe() and copy_to_iter_mcsafe(). The most significant difference from typical copy_to_iter() is that the ITER_KVEC and ITER_BVEC iterator types can fail to complete a full transfer. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: hch@lst.de Cc: linux-fsdevel@vger.kernel.org Cc: linux-nvdimm@lists.01.org Link: http://lkml.kernel.org/r/152539239150.31796.9189779163576449784.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | | | | | Merge tag 'dma-mapping-4.18' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds2018-06-049-321/+232
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull dma-mapping updates from Christoph Hellwig: - replace the force_dma flag with a dma_configure bus method. (Nipun Gupta, although one patch is іncorrectly attributed to me due to a git rebase bug) - use GFP_DMA32 more agressively in dma-direct. (Takashi Iwai) - remove PCI_DMA_BUS_IS_PHYS and rely on the dma-mapping API to do the right thing for bounce buffering. - move dma-debug initialization to common code, and apply a few cleanups to the dma-debug code. - cleanup the Kconfig mess around swiotlb selection - swiotlb comment fixup (Yisheng Xie) - a trivial swiotlb fix. (Dan Carpenter) - support swiotlb on RISC-V. (based on a patch from Palmer Dabbelt) - add a new generic dma-noncoherent dma_map_ops implementation and use it for arc, c6x and nds32. - improve scatterlist validity checking in dma-debug. (Robin Murphy) - add a struct device quirk to limit the dma-mask to 32-bit due to bridge/system issues, and switch x86 to use it instead of a local hack for VIA bridges. - handle devices without a dma_mask more gracefully in the dma-direct code. * tag 'dma-mapping-4.18' of git://git.infradead.org/users/hch/dma-mapping: (48 commits) dma-direct: don't crash on device without dma_mask nds32: use generic dma_noncoherent_ops nds32: implement the unmap_sg DMA operation nds32: consolidate DMA cache maintainance routines x86/pci-dma: switch the VIA 32-bit DMA quirk to use the struct device flag x86/pci-dma: remove the explicit nodac and allowdac option x86/pci-dma: remove the experimental forcesac boot option Documentation/x86: remove a stray reference to pci-nommu.c core, dma-direct: add a flag 32-bit dma limits dma-mapping: remove unused gfp_t parameter to arch_dma_alloc_attrs dma-debug: check scatterlist segments c6x: use generic dma_noncoherent_ops arc: use generic dma_noncoherent_ops arc: fix arc_dma_{map,unmap}_page arc: fix arc_dma_sync_sg_for_{cpu,device} arc: simplify arc_dma_sync_single_for_{cpu,device} dma-mapping: provide a generic dma-noncoherent implementation dma-mapping: simplify Kconfig dependencies riscv: add swiotlb support riscv: only enable ZONE_DMA32 for 64-bit ...
| * | | | | | | dma-direct: don't crash on device without dma_maskChristoph Hellwig2018-05-311-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Print a useful warning instead. Reported-by: Finn Thain <fthain@telegraphics.com.au> Tested-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | | | | | | core, dma-direct: add a flag 32-bit dma limitsChristoph Hellwig2018-05-281-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Various PCI bridges (VIA PCI, Xilinx PCIe) limit DMA to only 32-bits even if the device itself supports more. Add a single bit flag to struct device (to be moved into the dma extension once we get to it) to flag such devices and reject larger DMA to them. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | | | | | dma-debug: check scatterlist segmentsRobin Murphy2018-05-242-0/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Drivers/subsystems creating scatterlists for DMA should be taking care to respect the scatter-gather limitations of the appropriate device, as described by dma_parms. A DMA API implementation cannot feasibly split a scatterlist into *more* entries than originally passed, so it is not well defined what they should do when given a segment larger than the limit they are also required to respect. Conversely, devices which are less limited than the rather conservative defaults, or indeed have no limitations at all (e.g. GPUs with their own internal MMU), should be encouraged to set appropriate dma_parms, as they may get more efficient DMA mapping performance out of it. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | | | | | | dma-mapping: provide a generic dma-noncoherent implementationChristoph Hellwig2018-05-194-4/+127
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new dma_map_ops implementation that uses dma-direct for the address mapping of streaming mappings, and which requires arch-specific implemenations of coherent allocate/free. Architectures have to provide flushing helpers to ownership trasnfers to the device and/or CPU, and can provide optional implementations of the coherent mmap functionality, and the cache_flush routines for non-coherent long term allocations. Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Alexey Brodkin <abrodkin@synopsys.com> Acked-by: Vineet Gupta <vgupta@synopsys.com>
| * | | | | | | dma-mapping: simplify Kconfig dependenciesChristoph Hellwig2018-05-191-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ARCH_DMA_ADDR_T_64BIT is always true for 64-bit architectures now, so we can skip the clause requiring it. 'n' is the default default, so no need to explicitly state it. Tested-by: Alexey Brodkin <abrodkin@synopsys.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | | | | | | swiotlb: update comments to refer to physical instead of virtual addressesYisheng Xie2018-05-091-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | swiotlb use physical address of bounce buffer when do map and unmap, therefore, related comment should be updated. Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | | | | | | swiotlb: remove the CONFIG_DMA_DIRECT_OPS ifdefsChristoph Hellwig2018-05-091-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | swiotlb now selects the DMA_DIRECT_OPS config symbol, so this will always be true. Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | | | | | | swiotlb: move the SWIOTLB config symbol to lib/KconfigChristoph Hellwig2018-05-091-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This way we have one central definition of it, and user can select it as needed. The new option is not user visible, which is the behavior it had in most architectures, with a few notable exceptions: - On x86_64 and mips/loongson3 it used to be user selectable, but defaulted to y. It now is unconditional, which seems like the right thing for 64-bit architectures without guaranteed availablity of IOMMUs. - on powerpc the symbol is user selectable and defaults to n, but many boards select it. This change assumes no working setup required a manual selection, but if that turned out to be wrong we'll have to add another select statement or two for the respective boards. Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | | | | | | arch: define the ARCH_DMA_ADDR_T_64BIT config symbol in lib/KconfigChristoph Hellwig2018-05-091-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Define this symbol if the architecture either uses 64-bit pointers or the PHYS_ADDR_T_64BIT is set. This covers 95% of the old arch magic. We only need an additional select for Xen on ARM (why anyway?), and we now always set ARCH_DMA_ADDR_T_64BIT on mips boards with 64-bit physical addressing instead of only doing it when highmem is set. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: James Hogan <jhogan@kernel.org>
| * | | | | | | dma-mapping: move the NEED_DMA_MAP_STATE config symbol to lib/KconfigChristoph Hellwig2018-05-092-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This way we have one central definition of it, and user can select it as needed. Note that we now also always select it when CONFIG_DMA_API_DEBUG is select, which fixes some incorrect checks in a few network drivers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
| * | | | | | | scatterlist: move the NEED_SG_DMA_LENGTH config symbol to lib/KconfigChristoph Hellwig2018-05-091-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This way we have one central definition of it, and user can select it as needed. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
| * | | | | | | iommu-helper: move the IOMMU_HELPER config symbol to lib/Christoph Hellwig2018-05-091-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This way we have one central definition of it, and user can select it as needed. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
| * | | | | | | iommu-helper: mark iommu_is_span_boundary as inlineChristoph Hellwig2018-05-091-11/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This avoids selecting IOMMU_HELPER just for this function. And we only use it once or twice in normal builds so this often even is a size reduction. Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | | | | | | iommu-helper: unexport iommu_area_allocChristoph Hellwig2018-05-091-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This function is only used by built-in code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
| * | | | | | | iommu-common: move to arch/sparcChristoph Hellwig2018-05-092-268/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This code is only used by sparc, and all new iommu drivers should use the drivers/iommu/ framework. Also remove the unused exports. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: David S. Miller <davem@davemloft.net> Reviewed-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
| * | | | | | | dma-debug: remove CONFIG_HAVE_DMA_API_DEBUGChristoph Hellwig2018-05-081-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no arch specific code required for dma-debug, so there is no need to opt into the support either. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
| * | | | | | | dma-debug: unexport dma_debug_resize_entries and debug_dma_dump_mappingsChristoph Hellwig2018-05-081-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Only used by the AMD GART and Intel VT-D drivers, which must be built in. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
| * | | | | | | dma-debug: simplify counting of preallocated requestsChristoph Hellwig2018-05-081-16/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Just keep a single variable with a descriptive name instead of two with confusing names. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
| * | | | | | | dma-debug: move initialization to common codeChristoph Hellwig2018-05-081-7/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most mainstream architectures are using 65536 entries, so lets stick to that. If someone is really desperate to override it that can still be done through <asm/dma-mapping.h>, but I'd rather see a really good rationale for that. dma_debug_init is now called as a core_initcall, which for many architectures means much earlier, and provides dma-debug functionality earlier in the boot process. This should be safe as it only relies on the memory allocator already being available. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
| * | | | | | | PCI: remove PCI_DMA_BUS_IS_PHYSChristoph Hellwig2018-05-071-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was used by the ide, scsi and networking code in the past to determine if they should bounce payloads. Now that the dma mapping always have to support dma to all physical memory (thanks to swiotlb for non-iommu systems) there is no need to this crude hack any more. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Palmer Dabbelt <palmer@sifive.com> (for riscv) Reviewed-by: Jens Axboe <axboe@kernel.dk>
| * | | | | | | dma-direct: try reallocation with GFP_DMA32 if possibleTakashi Iwai2018-05-071-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As the recent swiotlb bug revealed, we seem to have given up the direct DMA allocation too early and felt back to swiotlb allocation. The reason is that swiotlb allocator expected that dma_direct_alloc() would try harder to get pages even below 64bit DMA mask with GFP_DMA32, but the function doesn't do that but only deals with GFP_DMA case. This patch adds a similar fallback reallocation with GFP_DMA32 as we've done with GFP_DMA. The condition is that the coherent mask is smaller than 64bit (i.e. some address limitation), and neither GFP_DMA nor GFP_DMA32 is set beforehand. Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | | | | | | swiotlb: remove an unecessary NULL checkDan Carpenter2018-05-071-1/+1
| | |_|_|/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Smatch complains here: lib/swiotlb.c:730 swiotlb_alloc_buffer() warn: variable dereferenced before check 'dev' (see line 716) "dev" isn't ever NULL in this function so we can just remove the check. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
* | | | | | | Merge tag 'for-4.18/block-20180603' of git://git.kernel.dk/linux-blockLinus Torvalds2018-06-041-32/+81
|\ \ \ \ \ \ \ | |_|_|_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block updates from Jens Axboe: - clean up how we pass around gfp_t and blk_mq_req_flags_t (Christoph) - prepare us to defer scheduler attach (Christoph) - clean up drivers handling of bounce buffers (Christoph) - fix timeout handling corner cases (Christoph/Bart/Keith) - bcache fixes (Coly) - prep work for bcachefs and some block layer optimizations (Kent). - convert users of bio_sets to using embedded structs (Kent). - fixes for the BFQ io scheduler (Paolo/Davide/Filippo) - lightnvm fixes and improvements (Matias, with contributions from Hans and Javier) - adding discard throttling to blk-wbt (me) - sbitmap blk-mq-tag handling (me/Omar/Ming). - remove the sparc jsflash block driver, acked by DaveM. - Kyber scheduler improvement from Jianchao, making it more friendly wrt merging. - conversion of symbolic proc permissions to octal, from Joe Perches. Previously the block parts were a mix of both. - nbd fixes (Josef and Kevin Vigor) - unify how we handle the various kinds of timestamps that the block core and utility code uses (Omar) - three NVMe pull requests from Keith and Christoph, bringing AEN to feature completeness, file backed namespaces, cq/sq lock split, and various fixes - various little fixes and improvements all over the map * tag 'for-4.18/block-20180603' of git://git.kernel.dk/linux-block: (196 commits) blk-mq: update nr_requests when switching to 'none' scheduler block: don't use blocking queue entered for recursive bio submits dm-crypt: fix warning in shutdown path lightnvm: pblk: take bitmap alloc. out of critical section lightnvm: pblk: kick writer on new flush points lightnvm: pblk: only try to recover lines with written smeta lightnvm: pblk: remove unnecessary bio_get/put lightnvm: pblk: add possibility to set write buffer size manually lightnvm: fix partial read error path lightnvm: proper error handling for pblk_bio_add_pages lightnvm: pblk: fix smeta write error path lightnvm: pblk: garbage collect lines with failed writes lightnvm: pblk: rework write error recovery path lightnvm: pblk: remove dead function lightnvm: pass flag on graceful teardown to targets lightnvm: pblk: check for chunk size before allocating it lightnvm: pblk: remove unnecessary argument lightnvm: pblk: remove unnecessary indirection lightnvm: pblk: return NVM_ error on failed submission lightnvm: pblk: warn in case of corrupted write buffer ...
| * | | | | | blk-mq: avoid starving tag allocation after allocating process migratesMing Lei2018-05-241-14/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the allocation process is scheduled back and the mapped hw queue is changed, fake one extra wake up on previous queue for compensating wake up miss, so other allocations on the previous queue won't be starved. This patch fixes one request allocation hang issue, which can be triggered easily in case of very low nr_request. The race is as follows: 1) 2 hw queues, nr_requests are 2, and wake_batch is one 2) there are 3 waiters on hw queue 0 3) two in-flight requests in hw queue 0 are completed, and only two waiters of 3 are waken up because of wake_batch, but both the two waiters can be scheduled to another CPU and cause to switch to hw queue 1 4) then the 3rd waiter will wait for ever, since no in-flight request is in hw queue 0 any more. 5) this patch fixes it by the fake wakeup when waiter is scheduled to another hw queue Cc: <stable@vger.kernel.org> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Modified commit message to make it clearer, and make it apply on top of the 4.18 branch. Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | | | | sbitmap: fix race in wait batch accountingJens Axboe2018-05-141-10/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we have multiple callers of sbq_wake_up(), we can end up in a situation where the wait_cnt will continually go more and more negative. Consider the case where our wake batch is 1, hence wait_cnt will start out as 1. wait_cnt == 1 CPU0 CPU1 atomic_dec_return(), cnt == 0 atomic_dec_return(), cnt == -1 cmpxchg(-1, 0) (succeeds) [wait_cnt now 0] cmpxchg(0, 1) (fails) This ends up with wait_cnt being 0, we'll wakeup immediately next time. Going through the same loop as above again, and we'll have wait_cnt -1. For the case where we have a larger wake batch, the only difference is that the starting point will be higher. We'll still end up with continually smaller batch wakeups, which defeats the purpose of the rolling wakeups. Always reset the wait_cnt to the batch value. Then it doesn't matter who wins the race. But ensure that whomever does win the race is the one that increments the ws index and wakes up our batch count, loser gets to call __sbq_wake_up() again to account his wakeups towards the next active wait state index. Fixes: 6c0ca7ae292a ("sbitmap: fix wakeup hang after sbq resize") Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | | | | sbitmap: warn if using smaller shallow depth than was setupOmar Sandoval2018-05-101-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make sure the user passed the right value to sbitmap_queue_min_shallow_depth(). Acked-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | | | | sbitmap: fix missed wakeups caused by sbitmap_queue_get_shallow()Omar Sandoval2018-05-101-9/+40
| |/ / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The sbitmap queue wake batch is calculated such that once allocations start blocking, all of the bits which are already allocated must be enough to fulfill the batch counters of all of the waitqueues. However, the shallow allocation depth can break this invariant, since we block before our full depth is being utilized. Add sbitmap_queue_min_shallow_depth(), which saves the minimum shallow depth the sbq will use, and update sbq_calc_wake_batch() to take it into account. Acked-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | | | | | idr: fix invalid ptr dereference on item deleteMatthew Wilcox2018-05-251-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the radix tree underlying the IDR happens to be full and we attempt to remove an id which is larger than any id in the IDR, we will call __radix_tree_delete() with an uninitialised 'slot' pointer, at which point anything could happen. This was easiest to hit with a single entry at id 0 and attempting to remove a non-0 id, but it could have happened with 64 entries and attempting to remove an id >= 64. Roman said: The syzcaller test boils down to opening /dev/kvm, creating an eventfd, and calling a couple of KVM ioctls. None of this requires superuser. And the result is dereferencing an uninitialized pointer which is likely a crash. The specific path caught by syzbot is via KVM_HYPERV_EVENTD ioctl which is new in 4.17. But I guess there are other user-triggerable paths, so cc:stable is probably justified. Matthew added: We have around 250 calls to idr_remove() in the kernel today. Many of them pass an ID which is embedded in the object they're removing, so they're safe. Picking a few likely candidates: drivers/firewire/core-cdev.c looks unsafe; the ID comes from an ioctl. drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c is similar drivers/atm/nicstar.c could be taken down by a handcrafted packet Link: http://lkml.kernel.org/r/20180518175025.GD6361@bombadil.infradead.org Fixes: 0a835c4f090a ("Reimplement IDR and IDA using the radix tree") Reported-by: <syzbot+35666cba7f0a337e2e79@syzkaller.appspotmail.com> Debugged-by: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds2018-05-211-2/+2
|\ \ \ \ \ \ | |_|_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull vfs fixes from Al Viro: "Assorted fixes all over the place" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: aio: fix io_destroy(2) vs. lookup_ioctx() race ext2: fix a block leak nfsd: vfs_mkdir() might succeed leaving dentry negative unhashed cachefiles: vfs_mkdir() might succeed leaving dentry negative unhashed unfuck sysfs_mount() kernfs: deal with kernfs_fill_super() failures cramfs: Fix IS_ENABLED typo befs_lookup(): use d_splice_alias() affs_lookup: switch to d_splice_alias() affs_lookup(): close a race with affs_remove_link() fix breakage caused by d_find_alias() semantics change fs: don't scan the inode cache before SB_BORN is set do d_instantiate/unlock_new_inode combinations safely iov_iter: fix memory leak in pipe_get_pages_alloc() iov_iter: fix return type of __pipe_get_pages()
| * | | | | iov_iter: fix memory leak in pipe_get_pages_alloc()Ilya Dryomov2018-05-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make n signed to avoid leaking the pages array if __pipe_get_pages() fails to allocate any pages. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | | | iov_iter: fix return type of __pipe_get_pages()Ilya Dryomov2018-05-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It returns -EFAULT and happens to be a helper for pipe_get_pages() whose return type is ssize_t. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | | | | | radix tree: fix multi-order iteration raceRoss Zwisler2018-05-181-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a race in the multi-order iteration code which causes the kernel to hit a GP fault. This was first seen with a production v4.15 based kernel (4.15.6-300.fc27.x86_64) utilizing a DAX workload which used order 9 PMD DAX entries. The race has to do with how we tear down multi-order sibling entries when we are removing an item from the tree. Remember for example that an order 2 entry looks like this: struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling] where 'entry' is in some slot in the struct radix_tree_node, and the three slots following 'entry' contain sibling pointers which point back to 'entry.' When we delete 'entry' from the tree, we call : radix_tree_delete() radix_tree_delete_item() __radix_tree_delete() replace_slot() replace_slot() first removes the siblings in order from the first to the last, then at then replaces 'entry' with NULL. This means that for a brief period of time we end up with one or more of the siblings removed, so: struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling] This causes an issue if you have a reader iterating over the slots in the tree via radix_tree_for_each_slot() while only under rcu_read_lock()/rcu_read_unlock() protection. This is a common case in mm/filemap.c. The issue is that when __radix_tree_next_slot() => skip_siblings() tries to skip over the sibling entries in the slots, it currently does so with an exact match on the slot directly preceding our current slot. Normally this works: V preceding slot struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling] ^ current slot This lets you find the first sibling, and you skip them all in order. But in the case where one of the siblings is NULL, that slot is skipped and then our sibling detection is interrupted: V preceding slot struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling] ^ current slot This means that the sibling pointers aren't recognized since they point all the way back to 'entry', so we think that they are normal internal radix tree pointers. This causes us to think we need to walk down to a struct radix_tree_node starting at the address of 'entry'. In a real running kernel this will crash the thread with a GP fault when you try and dereference the slots in your broken node starting at 'entry'. We fix this race by fixing the way that skip_siblings() detects sibling nodes. Instead of testing against the preceding slot we instead look for siblings via is_sibling_entry() which compares against the position of the struct radix_tree_node.slots[] array. This ensures that sibling entries are properly identified, even if they are no longer contiguous with the 'entry' they point to. Link: http://lkml.kernel.org/r/20180503192430.7582-6-ross.zwisler@linux.intel.com Fixes: 148deab223b2 ("radix-tree: improve multiorder iterators") Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reported-by: CR, Sapthagirish <sapthagirish.cr@intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Matthew Wilcox <willy@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | lib/test_bitmap.c: fix bitmap optimisation tests to report errors correctlyMatthew Wilcox2018-05-181-6/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I had neglected to increment the error counter when the tests failed, which made the tests noisy when they fail, but not actually return an error code. Link: http://lkml.kernel.org/r/20180509114328.9887-1-mpe@ellerman.id.au Fixes: 3cc78125a081 ("lib/test_bitmap.c: add optimisation tests") Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reported-by: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Yury Norov <ynorov@caviumnetworks.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: <stable@vger.kernel.org> [4.13+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OpenPOWER on IntegriCloud