summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* s390: fix save and restore of the floating-point-control registerMartin Schwidefsky2013-10-2410-111/+175
| | | | | | | | | | | | | | | | | | | | | | The FPC_VALID_MASK has been used to check the validity of the value to be loaded into the floating-point-control register. With the introduction of the floating-point extension facility and the decimal-floating-point additional bits have been defined which need to be checked in a non straight forward way. So far these bits have been ignored which can cause an incorrect results for decimal- floating-point operations, e.g. an incorrect rounding mode to be set after signal return. The static check with the FPC_VALID_MASK is replaced with a trial load of the floating-point-control value, see test_fp_ctl. In addition an information leak with the padding word between the floating-point-control word and the floating-point registers in the s390_fp_regs is fixed. Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/3270: use diagnose 0x210 for device sensing under z/VMMartin Schwidefsky2013-10-241-1/+1
| | | | | | | | | There is a debugging leftover from git commit 4d334fd155b53adf "s390/3270: asynchronous size sensing" in raw3270_reset_device_cb. Under z/VM the diagnose 0x210 can be used to find the correct size of the 3270 terminal. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/crypto: fix aes_s390 crypto module unload problemIngo Tuchscherer2013-10-241-3/+12
| | | | | | | | | | If a machine has no hardware support for the xts-aes or ctr-aes algorithms they are not registered in aes_s390_init. But aes_s390_fini unconditionally unregisters the algorithms which causes crypto_remove_alg to crash. Add two flag variables to remember if xts-aes and ctr-aes have been added. Signed-off-by: Ingo Tuchscherer <ingo.tuchscherer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/3270: remove unnecessary pointer checkMartin Schwidefsky2013-10-241-1/+1
| | | | | | | | | Make smatch happy and remove this warning: drivers/s390/char/raw3270.c:347 raw3270_irq() error: we previously assumed 'rq' could be null (see line 342) Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/monwriter: fix smatch warning for strcpy()Gerald Schaefer2013-10-241-1/+1
| | | | | | | | | | | | | This patch fixes the following smatch warning: monwrite_diag() error: strcpy() '"LNXAPPL"' too large for 'id.prod_nr' (8 vs 7) Using strcpy() is wrong, because it also copies the terminating null byte, but in this case the extra copied null byte will be overwritten right after the strcpy(), so there is no real problem here. Use strncpy() to fix the warning. Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/appldata: make copy_from_user() invocations provably correctHeiko Carstens2013-10-241-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Just change the type of "len" to unsigned int so the compiler can prove that we don't have a buffer overflow (and generates less code). We get rid of these: In function 'copy_from_user', inlined from 'appldata_interval_handler' at arch/s390/appldata/appldata_base.c:265: uaccess.h:303: warning: call to 'copy_from_user_overflow' declared with attribute warning: copy_from_user() buffer size is not provably correct In function 'copy_from_user', inlined from 'appldata_timer_handler' at arch/s390/appldata/appldata_base.c:225: uaccess.h:303: warning: call to 'copy_from_user_overflow' declared with attribute warning: copy_from_user() buffer size is not provably correct In function 'copy_from_user', inlined from 'appldata_generic_handler' at arch/s390/appldata/appldata_base.c:333: uaccess.h:303: warning: call to 'copy_from_user_overflow' declared with attribute warning: copy_from_user() buffer size is not provably correct Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/cmm: make copy_from_user() invocation provably correctHeiko Carstens2013-10-241-2/+2
| | | | | | | | | | | | | | | | | | | | | Get rid of these two warnings: In function 'copy_from_user', inlined from 'cmm_timeout_handler' at arch/s390/mm/cmm.c:310: uaccess.h:303: warning: call to 'copy_from_user_overflow' declared with attribute warning: copy_from_user() buffer size is not provably correct In function 'copy_from_user', inlined from 'cmm_pages_handler' at arch/s390/mm/cmm.c:270: uaccess.h:303: warning: call to 'copy_from_user_overflow' declared with attribute warning: copy_from_user() buffer size is not provably correct Change the "len" type to unsigned int, so we can make sure that there is no buffer overflow. This also generates less code. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/cache: get rid of compile warningHeiko Carstens2013-10-241-3/+2
| | | | | | | | | | | Get rid of this one: arch/s390/kernel/cache.c: In function 'cache_build_info': arch/s390/kernel/cache.c:144: warning: 'private' may be used uninitialized in this function Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/compat,signal: change return values to -EFAULTHeiko Carstens2013-10-242-17/+19
| | | | | | | | | | | | | Instead of returnin the number of bytes not copied and/or -EFAULT let the signal handler helper functions always return -EFAULT if a user space access failed. This doesn't fix a bug in the current code, but makes is harder to get it wrong in the future. Also "smatch" won't complain anymore about the fact that the number of remaining bytes gets returned instead of -EFAULT. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390: Remove unused declaration of zfcpdump_prefix_array[]Michael Holzheu2013-10-241-1/+0
| | | | | Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/cio: fix error-prone definesPeter Oberparleiter2013-10-241-19/+19
| | | | | | | | Missing parenthesis may cause problems when using the defines together with operations of higher precedence. Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390: Remove zfcpdump NR_CPUS dependencyMichael Holzheu2013-10-245-26/+49
| | | | | | | | | | Currently zfpcdump can only collect registers for up to CONFIG_NR_CPUS CPUss. This dependency is not necessary. So remove it by dynamically allocating the save area array. Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/ftrace: prepare_ftrace_return() function call orderHeiko Carstens2013-10-241-5/+4
| | | | | | | | | | | | | | | Steven Rostedt noted that s390 is the only architecture which calls ftrace_push_return_trace() before ftrace_graph_entry() and therefore has the small advantage that trace.depth gets initialized automatically. However this small advantage isn't worth the difference and possible subtle breakage that may result from this. So change s390 to have the same function call order like all other architectures: first ftrace_graph_entry(), then ftrace_push_return_trace() Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/crashdump: remove unused variableHeiko Carstens2013-10-241-1/+0
| | | | | | | | | | | | | Get rid of this compile warning: arch/s390/kernel/crash_dump.c: In function 'copy_from_realmem': arch/s390/kernel/crash_dump.c:48:6: warning: unused variable 'rc' [-Wunused-variable] int rc; ^ Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/atomic: use 'unsigned int' instead of 'unsigned long' for atomic_*_mask()Chen Gang2013-10-241-2/+2
| | | | | | | | | | The type of 'v->counter' is always 'int', and related inline assembly code also process 'int', so use 'unsigned int' instead of 'unsigned long' for the 'mask'. Signed-off-by: Chen Gang <gang.chen@asianux.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/gup: handle zero nr_pages case correctlyHeiko Carstens2013-10-241-1/+1
| | | | | | | | | | | If [__]get_user_pages_fast() gets called with nr_pages == 0, the current code would walk the page tables and pin as many pages until the first invalid pte (or the kernel crashed while writing struct page pointers to the pages array). So let's handle at least the nr_pages == 0 case correctly and exit early. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/gup: reduce code duplication between [__]get_user_pages_fast functionsHeiko Carstens2013-10-241-58/+23
| | | | | | | | | Just call __get_user_pages_fast() from get_user_pages_fast() like powerpc. This saves a lot of duplicated code. Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/mm: do not initialize storage keysMartin Schwidefsky2013-10-243-2/+11
| | | | | | | | | | | With dirty and referenced bits implemented in software it is unnecessary to initialize the storage key for every page. With this patch not a single storage key operation is done for a system that does not use KVM. For KVM set_pte_at/pgste_set_key will do the initialization for the guest view of the storage key when the mapping for the page is established in the host. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/bpf,jit: fix prolog oddityMartin Schwidefsky2013-10-241-2/+2
| | | | | | | | | | | The prolog of functions generated by the bpf jit compiler uses an instruction sequence with an "ahi" instruction to create stack space instead of using an "aghi" instruction. Using the 32-bit "ahi" is not wrong as the stack we are operating on is an order-4 allocation which is always aligned to 16KB. But it is more consistent to use an "aghi" as the stack pointer is a 64-bit value. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390: cleanup and add sanity checks to control register macrosHeiko Carstens2013-10-241-61/+51
| | | | | | | | | - turn some macros into functions - merge two almost identical versions for 32/64 bit - add BUILD_BUG_ON() check to make sure the passed in array is large enough Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/eadm_sch: improve quiesce handlingSebastian Ott2013-10-242-3/+28
| | | | | | | | | When quiescing an eadm subchannel make sure that outstanding IO is cleared and potential timeout handlers are canceled. Reviewed-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/pci: implement hibernation hooksSebastian Ott2013-10-241-0/+41
| | | | | | | | Implement architecture-specific functionality when a PCI device is doing a hibernate transition. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/uaccess: always run the kernel in home spaceMartin Schwidefsky2013-10-2421-459/+51
| | | | | | | Simplify the uaccess code by removing the user_mode=home option. The kernel will now always run in the home space mode. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/bitops: rename find_first_bit_left() to find_first_bit_inv()Heiko Carstens2013-10-243-22/+46
| | | | | | | | | | | | | | | | | | find_first_bit_left() and friends have nothing to do with the normal LSB0 bit numbering for big endian machines used in Linux (least significant bit has bit number 0). Instead they use MSB0 bit numbering, where the most signficant bit has bit number 0. So rename find_first_bit_left() and friends to find_first_bit_inv(), to avoid any confusion. Also provide inv versions of set_bit, clear_bit and test_bit. This also removes the confusing use of e.g. set_bit() in airq.c which uses a "be_to_le" bit number conversion, which could imply that instead set_bit_le() could be used. But that is entirely wrong since the _le bitops variant uses yet another bit numbering scheme. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/bitops: use flogr instruction to implement __ffs, ffs, __fls, fls and fls64Heiko Carstens2013-10-241-0/+125
| | | | | | | | | | | Since z9 109 we have the flogr instruction which can be used to implement optimized versions of __ffs, ffs, __fls, fls and fls64. So implement and use them, instead of the generic variants. This reduces the size of the kernel image (defconfig, -march=z9-109) by 19,648 bytes. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/bitops: use generic find bit functions / reimplement _left variantHeiko Carstens2013-10-246-604/+126
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Just like all other architectures we should use out-of-line find bit operations, since the inline variant bloat the size of the kernel image. And also like all other architecures we should only supply optimized variants of the __ffs, ffs, etc. primitives. Therefore this patch removes the inlined s390 find bit functions and uses the generic out-of-line variants instead. The optimization of the primitives follows with the next patch. With this patch also the functions find_first_bit_left() and find_next_bit_left() have been reimplemented, since logically, they are nothing else but a find_first_bit()/find_next_bit() implementation that use an inverted __fls() instead of __ffs(). Also the restriction that these functions only work on machines which support the "flogr" instruction is gone now. This reduces the size of the kernel image (defconfig, -march=z9-109) by 144,482 bytes. Alone the size of the function build_sched_domains() gets reduced from 7 KB to 3,5 KB. We also git rid of unused functions like find_first_bit_le()... Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/s390dbf: use debug_level_enabled() where applicableHendrik Brueckner2013-10-2410-43/+13
| | | | | | | | Refactor direct debug level comparisons with the (internal) s390db->level member. Use the debug_level_enabled() function instead. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/s390dbf: add debug_level_enabled() functionHendrik Brueckner2013-10-242-0/+15
| | | | | | | | | Add the debug_level_enabled() function to check if debug events for a particular level would be logged. This might help to save cycles for debug events that require additional information collection. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/bitops: optimize set_bit() for constant valuesHeiko Carstens2013-10-242-1/+37
| | | | | | | | | | | | | | | | | Since zEC12 we have the interlocked-access facility 2 which allows to use the instructions ni/oi/xi to update a single byte in storage with compare-and-swap semantics. So change set_bit(), clear_bit() and change_bit() to generate such code instead of a compare-and-swap loop (or using the load-and-* instruction family), if possible. This reduces the text segment by yet another 8KB (defconfig). Alternatively the long displacement variants niy/oiy/xiy could have been used, but the extended displacement field is usually not needed and therefore would only increase the size of the text segment again. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/bitops: remove CONFIG_SMP / simplify non-atomic bitopsHeiko Carstens2013-10-242-217/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove CONFIG_SMP from bitops code. This reduces the C code significantly but also generates better code for the SMP case. This means that for !CONFIG_SMP set_bit() and friends now also have compare and swap semantics (read: more code). However nobody really cares for !CONFIG_SMP and this is the trade-off to simplify the SMP code which we do care about. The non-atomic bitops like __set_bit() now generate also better code because the old code did not have a __builtin_contant_p() check for the CONFIG_SMP case and therefore always generated the inline assembly variant. However the inline assemblies for the non-atomic case now got completely removed since gcc can produce better code, which accesses less memory operands. test_bit() got also a bit simplified since it did have a __builtin_constant_p() check, however two identical code pathes for each case (written differently). In result this mainly reduces the to be maintained code but is not very relevant for code generation, since there are not many non-atomic bitops usages that we care about. (code reduction defconfig kernel image before/after: 560 bytes). Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/atomic: various small cleanupsHeiko Carstens2013-10-241-18/+24
| | | | | | | | | - add a typecheck to the defines to make sure they operate on an atomic_t - simplify inline assembly constraints - keep variable names common between functions Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/atomic: optimize atomic_add() for constant valuesHeiko Carstens2013-10-241-8/+40
| | | | | | | | | | | If the interlocked-access facility 1 is available we can use the asi and agsi instructions for interlocked updates if the to be added value is a contanst and small (in the range of -128..127). asi and agsi do not not return the old or new value, therefore these instructions can only be used for atomic_(add|sub|inc|dec)[64]. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/qdio: fix atomic_sub() misusageHeiko Carstens2013-10-241-1/+1
| | | | | | | | | | | | | get_inbound_buffer_frontier() makes use of the return value of atomic_sub() which shouldn't work, since atomic_sub() is supposed to return void. This only works on s390 because atomic_sub() gets mapped to atomic_sub_return() with a define without changing it's return value to void. So use atomic_sub_return() instead of atomic_sub() in qeth code before fixing atomic ops. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/kprobes: allow kprobes only on known instructionsHeiko Carstens2013-10-243-1/+9
| | | | | | | | Since we have an in-kernel disassembler we can make sure that there won't be any kprobes set on random data. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/kprobes: use insn_length helper functionHeiko Carstens2013-10-241-4/+5
| | | | | Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/dis: move disassembler function prototypes to proper header fileHeiko Carstens2013-10-245-6/+7
| | | | | | | | Now that the in-kernel disassembler has an own header file move the disassembler related function prototypes to that header file. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/dis: move common definitions to a header fileSuzuki K. Poulose2013-10-242-28/+43
| | | | | | | | | | | | | The patch moves some of the definitions to a header file. No functional changes involved. I have retained the Copyright Statement from the original file. Signed-off-by: Suzuki K Poulose <suzuki@in.ibm.com> [Heiko Carstens: rename s390-dis.h to dis.h] Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/dis: rename structures for unique typesSuzuki K. Poulose2013-10-241-28/+28
| | | | | | | | | | | | | | | | | | | | | | Rename 'insn' and 'operand' structures to more canonical names to avoid conflicts. struct insn represents information about an instruction, including the mnemonics, format and opcode. struct operand represents the 'properties' and information on howto interpret the operand value and doesn't contain the value. We rename these structures for avoiding a global conflict. i.e, 1,$s/struct insn/struct s390_insn/g 1,$s/struct operand/struct s390_operand/g Signed-off-by: Suzuki K Poulose <suzuki@in.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/atomic: make use of interlocked-access facility 1 instructionsHeiko Carstens2013-10-241-12/+65
| | | | | | | | | Same as for bitops: make use of the interlocked-access facility 1 instructions which allow to atomically update storage locations without a compare-and-swap loop. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/atomic: implement atomic_sub_return() with atomic_add_return()Heiko Carstens2013-10-241-21/+2
| | | | | | | | | | Get rid of the own atomic_sub_return() implementation. Otherwise we can't make use of the interlocked-access facility 1 instructions for atomic_sub_return(), since there is no "load and subtract" instruction available. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/kprobes: have more correct if statement in s390_get_insn_slot()Heiko Carstens2013-10-241-1/+1
| | | | | | | | | | When checking the insn address wether it is a kernel image or module address it should be an if-else-if statement not two independent if statements. This doesn't really fix a bug, but matches s390_free_insn_slot(). Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390: always set -march compiler optionHeiko Carstens2013-10-241-7/+7
| | | | | | | | | | | | | | | | | Currently we only set the -march compiler option if the kbuild system figured out that the compiler actually supports the selected architecture (cc-option test). In result this means that no -march compiler option is set when an unsupported cpu architecture of the current compiler is selected. The kernel compile will afterwards succeed but with the default architecture instead of the (unsupported) selected one. Change this behaviour, so compiles will fail if the compiler does not support the selected cpu architecture. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/bitops: make use of interlocked-access facility 1 instructionsHeiko Carstens2013-10-241-17/+49
| | | | | | | | | | | | | | | Make use of the interlocked-access facility 1 that got added with the z196 architecure. This facilility added new instructions which can atomically update a storage location without a compare-and-swap loop. E.g. setting a bit within a "long" can be done with a single instruction. The size of the kernel image gets ~30kb smaller. Considering that there are appr. 1900 bitops call sites this means that each one saves about 15-16 bytes per call site which is expected. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* Merge tag 'md/3.12-fixes' of git://neil.brown.name/mdLinus Torvalds2013-10-244-2/+25
|\ | | | | | | | | | | | | | | | | | | | | | | | | Pull md bugfixes from Neil Brown: "Assorted md bug-fixes for 3.12. All tagged for -stable releases too" * tag 'md/3.12-fixes' of git://neil.brown.name/md: raid5: avoid finding "discard" stripe raid5: set bio bi_vcnt 0 for discard request md: avoid deadlock when md_set_badblocks. md: Fix skipping recovery for read-only arrays.
| * raid5: avoid finding "discard" stripeShaohua Li2013-10-241-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | SCSI discard will damage discard stripe bio setting, eg, some fields are changed. If the stripe is reused very soon, we have wrong bios setting. We remove discard stripe from hash list, so next time the strip will be fully initialized. Suitable for backport to 3.7+. Cc: <stable@vger.kernel.org> (3.7+) Signed-off-by: Shaohua Li <shli@fusionio.com> Signed-off-by: NeilBrown <neilb@suse.de>
| * raid5: set bio bi_vcnt 0 for discard requestShaohua Li2013-10-241-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | SCSI layer will add new payload for discard request. If two bios are merged to one, the second bio has bi_vcnt 1 which is set in raid5. This will confuse SCSI and cause oops. Suitable for backport to 3.7+ Cc: stable@vger.kernel.org (v3.7+) Reported-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Shaohua Li <shli@fusionio.com> Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
| * md: avoid deadlock when md_set_badblocks.Bian Yu2013-10-241-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When operate harddisk and hit errors, md_set_badblocks is called after scsi_restart_operations which already disabled the irq. but md_set_badblocks will call write_sequnlock_irq and enable irq. so softirq can preempt the current thread and that may cause a deadlock. I think this situation should use write_sequnlock_irqsave/irqrestore instead. I met the situation and the call trace is below: [ 638.919974] BUG: spinlock recursion on CPU#0, scsi_eh_13/1010 [ 638.921923] lock: 0xffff8800d4d51fc8, .magic: dead4ead, .owner: scsi_eh_13/1010, .owner_cpu: 0 [ 638.923890] CPU: 0 PID: 1010 Comm: scsi_eh_13 Not tainted 3.12.0-rc5+ #37 [ 638.925844] Hardware name: To be filled by O.E.M. To be filled by O.E.M./MAHOBAY, BIOS 4.6.5 03/05/2013 [ 638.927816] ffff880037ad4640 ffff880118c03d50 ffffffff8172ff85 0000000000000007 [ 638.929829] ffff8800d4d51fc8 ffff880118c03d70 ffffffff81730030 ffff8800d4d51fc8 [ 638.931848] ffffffff81a72eb0 ffff880118c03d90 ffffffff81730056 ffff8800d4d51fc8 [ 638.933884] Call Trace: [ 638.935867] <IRQ> [<ffffffff8172ff85>] dump_stack+0x55/0x76 [ 638.937878] [<ffffffff81730030>] spin_dump+0x8a/0x8f [ 638.939861] [<ffffffff81730056>] spin_bug+0x21/0x26 [ 638.941836] [<ffffffff81336de4>] do_raw_spin_lock+0xa4/0xc0 [ 638.943801] [<ffffffff8173f036>] _raw_spin_lock+0x66/0x80 [ 638.945747] [<ffffffff814a73ed>] ? scsi_device_unbusy+0x9d/0xd0 [ 638.947672] [<ffffffff8173fb1b>] ? _raw_spin_unlock+0x2b/0x50 [ 638.949595] [<ffffffff814a73ed>] scsi_device_unbusy+0x9d/0xd0 [ 638.951504] [<ffffffff8149ec47>] scsi_finish_command+0x37/0xe0 [ 638.953388] [<ffffffff814a75e8>] scsi_softirq_done+0xa8/0x140 [ 638.955248] [<ffffffff8130e32b>] blk_done_softirq+0x7b/0x90 [ 638.957116] [<ffffffff8104fddd>] __do_softirq+0xfd/0x330 [ 638.958987] [<ffffffff810b964f>] ? __lock_release+0x6f/0x100 [ 638.960861] [<ffffffff8174a5cc>] call_softirq+0x1c/0x30 [ 638.962724] [<ffffffff81004c7d>] do_softirq+0x8d/0xc0 [ 638.964565] [<ffffffff8105024e>] irq_exit+0x10e/0x150 [ 638.966390] [<ffffffff8174ad4a>] smp_apic_timer_interrupt+0x4a/0x60 [ 638.968223] [<ffffffff817499af>] apic_timer_interrupt+0x6f/0x80 [ 638.970079] <EOI> [<ffffffff810b964f>] ? __lock_release+0x6f/0x100 [ 638.971899] [<ffffffff8173fa6a>] ? _raw_spin_unlock_irq+0x3a/0x50 [ 638.973691] [<ffffffff8173fa60>] ? _raw_spin_unlock_irq+0x30/0x50 [ 638.975475] [<ffffffff81562393>] md_set_badblocks+0x1f3/0x4a0 [ 638.977243] [<ffffffff81566e07>] rdev_set_badblocks+0x27/0x80 [ 638.978988] [<ffffffffa00d97bb>] raid5_end_read_request+0x36b/0x4e0 [raid456] [ 638.980723] [<ffffffff811b5a1d>] bio_endio+0x1d/0x40 [ 638.982463] [<ffffffff81304ff3>] req_bio_endio.isra.65+0x83/0xa0 [ 638.984214] [<ffffffff81306b9f>] blk_update_request+0x7f/0x350 [ 638.985967] [<ffffffff81306ea1>] blk_update_bidi_request+0x31/0x90 [ 638.987710] [<ffffffff813085e0>] __blk_end_bidi_request+0x20/0x50 [ 638.989439] [<ffffffff8130862f>] __blk_end_request_all+0x1f/0x30 [ 638.991149] [<ffffffff81308746>] blk_peek_request+0x106/0x250 [ 638.992861] [<ffffffff814a62a9>] ? scsi_kill_request.isra.32+0xe9/0x130 [ 638.994561] [<ffffffff814a633a>] scsi_request_fn+0x4a/0x3d0 [ 638.996251] [<ffffffff813040a7>] __blk_run_queue+0x37/0x50 [ 638.997900] [<ffffffff813045af>] blk_run_queue+0x2f/0x50 [ 638.999553] [<ffffffff814a5750>] scsi_run_queue+0xe0/0x1c0 [ 639.001185] [<ffffffff814a7721>] scsi_run_host_queues+0x21/0x40 [ 639.002798] [<ffffffff814a2e87>] scsi_restart_operations+0x177/0x200 [ 639.004391] [<ffffffff814a4fe9>] scsi_error_handler+0xc9/0xe0 [ 639.005996] [<ffffffff814a4f20>] ? scsi_unjam_host+0xd0/0xd0 [ 639.007600] [<ffffffff81072f6b>] kthread+0xdb/0xe0 [ 639.009205] [<ffffffff81072e90>] ? flush_kthread_worker+0x170/0x170 [ 639.010821] [<ffffffff81748cac>] ret_from_fork+0x7c/0xb0 [ 639.012437] [<ffffffff81072e90>] ? flush_kthread_worker+0x170/0x170 This bug was introduce in commit 2e8ac30312973dd20e68073653 (the first time rdev_set_badblock was call from interrupt context), so this patch is appropriate for 3.5 and subsequent kernels. Cc: <stable@vger.kernel.org> (3.5+) Signed-off-by: Bian Yu <bianyu@kedacom.com> Reviewed-by: Jianpeng Ma <majianpeng@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>
| * md: Fix skipping recovery for read-only arrays.Lukasz Dorau2013-10-242-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since: commit 7ceb17e87bde79d285a8b988cfed9eaeebe60b86 md: Allow devices to be re-added to a read-only array. spares are activated on a read-only array. In case of raid1 and raid10 personalities it causes that not-in-sync devices are marked in-sync without checking if recovery has been finished. If a read-only array is degraded and one of its devices is not in-sync (because the array has been only partially recovered) recovery will be skipped. This patch adds checking if recovery has been finished before marking a device in-sync for raid1 and raid10 personalities. In case of raid5 personality such condition is already present (at raid5.c:6029). Bug was introduced in 3.10 and causes data corruption. Cc: stable@vger.kernel.org Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com> Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | Merge tag 'scsi-fixes' of ↵Linus Torvalds2013-10-244-10/+19
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is a set of two fixes which cause oopses (Buslogic, qla2xxx) and one fix which may cause a hang because of request miscounting (sd)" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: [SCSI] sd: call blk_pm_runtime_init before add_disk [SCSI] qla2xxx: Fix request queue null dereference. [SCSI] BusLogic: Fix an oops when intializing multimaster adapter
| * [SCSI] sd: call blk_pm_runtime_init before add_diskAaron Lu2013-10-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sujit has found a race condition that would make q->nr_pending unbalanced, it occurs as Sujit explained: " sd_probe_async() -> add_disk() -> disk_add_event() -> schedule(disk_events_workfn) sd_revalidate_disk() blk_pm_runtime_init() return; Let's say the disk_events_workfn() calls sd_check_events() which tries to send test_unit_ready() and because of sd_revalidate_disk() trying to send another commands the test_unit_ready() might be re-queued as the tagged command queuing is disabled. So the race condition is - Thread 1 | Thread 2 sd_revalidate_disk() | sd_check_events() ...nr_pending = 0 as q->dev = NULL| scsi_queue_insert() blk_runtime_pm_init() | blk_pm_requeue_request() -> | nr_pending = -1 since | q->dev != NULL " The problem is, the test_unit_ready request doesn't get counted the first time it is queued, so the later decrement of q->nr_pending in blk_pm_requeue_request makes it unbalanced. Fix this by calling blk_pm_runtime_init before add_disk so that all requests initiated there will all be counted. Signed-off-by: Aaron Lu <aaron.lu@intel.com> Reported-and-tested-by: Sujit Reddy Thumma <sthumma@codeaurora.org> Cc: stable@vger.kernel.org Signed-off-by: James Bottomley <JBottomley@Parallels.com>
OpenPOWER on IntegriCloud