summaryrefslogtreecommitdiffstats
path: root/drivers/scsi
Commit message (Collapse)AuthorAgeFilesLines
* [SCSI] fcoe: fix fcoe in a DCB environment by adding DCB notifiers to set ↵john fastabend2011-12-152-0/+119
| | | | | | | | | | | | | | | | skb priority Use DCB notifiers to set the skb priority to allow packets to be steered and tagged correctly over DCB enabled drivers that setup traffic classes. This allows queue_mapping() routines to be removed in these drivers that were previously inspecting the ethertype of every skb to mark FCoE/FIP frames. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] bnx2i: Fixed kernel panic caused by unprotected task->sc->request derefEddie Wai2011-12-141-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During session recovery, the conn_stop call will trigger a flush to all outstanding SCSI cmds in the xmit queue. This will set all outstanding task->sc to NULL prior to the session_teardown call which frees the task memory. In the bnx2i SCSI response processing path, only the task was being checked for NULL under the session lock before the task->sc->request dereferencing. If there are outstanding SCSI cmd responses pending for process, the following kernel panic can be exposed where task->sc was found to be NULL. Call Trace: [ 69.720205] [<ffffffffa040d0d0>] bnx2i_process_new_cqes+0x290/0x3c0 [bnx2i] [ 69.804289] [<ffffffffa040d233>] bnx2i_fastpath_notification+0x33/0xa0 [bnx2 i] [ 69.891490] [<ffffffffa040d37b>] bnx2i_indicate_kcqe+0xdb/0x330 [bnx2i] [ 69.971427] [<ffffffffa03eac5e>] service_kcqes+0x16e/0x1d0 [cnic] [ 70.045132] [<ffffffffa03eacea>] cnic_service_bnx2x_kcq+0x2a/0x50 [cnic] [ 70.126105] [<ffffffffa03ead53>] cnic_service_bnx2x_bh+0x43/0x140 [cnic] [ 70.207081] [<ffffffff81060676>] tasklet_action+0x66/0x110 [ 70.273521] [<ffffffff8106025f>] __do_softirq+0xef/0x220 [ 70.337887] [<ffffffff81447ebc>] call_softirq+0x1c/0x30 This patch adds the !task->sc check and also protects the sc dereferencing under the session lock. Signed-off-by: Eddie Wai <eddie.wai@broadcom.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] qla4xxx: check for failed conn setupMike Christie2011-12-141-0/+3
| | | | | | | | iscsi_conn_setup can fail so we must check for NULL being returned. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] qla4xxx: a small loop fixTomas Henzl2011-12-141-1/+3
| | | | | | | | | | | | | | | | | | | | | | When the qla4xxx_get_fwddb_entry returns QLA_ERROR the nex_idx is not updated, for (idx = 0; idx < max_ddbs; idx = next_idx) { ret = qla4xxx_get_fwddb_entry(ha, idx, NULL, 0, NULL, &next_idx, &state, &conn_err, NULL, NULL); if (ret == QLA_ERROR) continue; This means there is a risk that the 'idx < max_ddbs' condition will never met and the loop will loop forever. Fix this by explicitly increasing the next_idx in the error condition. Maybe a break instead of continue is more appropriate, leaving the decision on the qlogic maintainer. Signed-off-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] qla4xxx: fix flash/ddb supportMike Christie2011-12-147-97/+1315
| | | | | | | | | | | | | With open-iscsi support, target entries persisted in the FLASH were not login. Added support in the qla4xxx driver to do the login on probe time to the target entries saved in the FLASH by user. With this changes upgrade to the new kernel with open-iscsi support in qla4xxx will ensure users original target entries login on driver load Signed-off-by: Manish Rangankar <manish.rangankar@qlogic.com> Signed-off-by: Ravi Anand <ravi.anand@qlogic.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] fcoe: Fix preempt count leak in fcoe_filter_frames()Thomas Gleixner2011-12-141-0/+1
| | | | | | | | | The error exit path leaks preempt count. Add the missing put_cpu(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Yi Zou <yi.zou@intel.com> Cc: stable@kernel.org Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] qla2xxx: Update version number to 8.03.07.12-k.Chad Dupuis2011-12-121-1/+1
| | | | | Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] qla2xxx: Submit all chained IOCBs for passthrough commands on request ↵Giridhar Malavali2011-12-121-6/+8
| | | | | | | | queue 0. Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] qla2xxx: Correct fc_host port_state display.Saurav Kashyap2011-12-121-4/+23
| | | | | | | | | | [jejb: checkpatch fixes] Add more fine grain parsing of vha->loop_state to export a more accurate fc_host port_state. Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] qla2xxx: Disable generating pause frames when firmware hang detected ↵Giridhar Malavali2011-12-124-4/+28
| | | | | | | | for ISP82xx. Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] qla2xxx: Clear mailbox busy flag during premature mailbox completion ↵Giridhar Malavali2011-12-122-0/+3
| | | | | | | | for ISP82xx. Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] qla2xxx: Encapsulate prematurely completing mailbox commands during ↵Chad Dupuis2011-12-124-30/+22
| | | | | | | | ISP82xx firmware hang. Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] qla2xxx: Display IPE error message for ISP82xx.Chad Dupuis2011-12-121-0/+5
| | | | | | [jejb: fixup checkpatch error] Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] qla2xxx: Return the correct value for a mailbox command if 82xx is in ↵Andrew Vasquez2011-12-121-2/+1
| | | | | | | | | | | reset recovery. We need to return QLA_FUNCTION_TIMEOUT immediately otherwise we mess up the mailbox command state machine. Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] qla2xxx: Enable Minidump by default with default capture mask 0x1f.Giridhar Malavali2011-12-121-3/+3
| | | | | | Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] qla2xxx: Stop unconditional completion of mailbox commands issued in ↵Giridhar Malavali2011-12-122-2/+8
| | | | | | | | interrupt mode during firmware hang. Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] qla2xxx: Revert back the request queue mapping to request queue 0.Giridhar Malavali2011-12-121-0/+1
| | | | | | | | | | If there is an error creating multiple response queues then we need to revert the request queue mapping back to request queue 0. Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] qla2xxx: Don't call alloc_fw_dump for ISP82XX.Saurav Kashyap2011-12-121-1/+2
| | | | | | Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] qla2xxx: Check for SCSI status on underruns.Arun Easi2011-12-121-1/+1
| | | | | | Signed-off-by: Arun Easi <arun.easi@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] qla2xxx: Remove qla2x00_wait_for_loop_ready function.Saurav Kashyap2011-12-121-66/+4
| | | | | | | | | | | | This function can wait for 5min under certain scenarios. One of them is when the port is down from switch and bus reset is issued. The bus reset used to wait for 5 minutes for the loop and upper layer callers used to hang and give stack trace because of getting stuck for 120 sec. It is legacy code that was used when the driver used to do queuing of the commands. Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com> Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] mpt2sas: _scsih_smart_predicted_fault uses GFP_KERNEL in interrupt ↵Anton Blanchard2011-12-121-1/+1
| | | | | | | | | | | context _scsih_smart_predicted_fault is called in an interrupt and therefore must allocate memory using GFP_ATOMIC. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: <stable@kernel.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] hpsa: Disable ASPMMatthew Garrett2011-11-141-0/+5
| | | | | | | | | | The Windows driver .inf disables ASPM on hpsa devices. Do the same because the selection of a non default ASPM policy can cause the device to hang. Signed-off-by: Matthew Garrett <mjg@redhat.com> Cc: stable@kernel.org Acked-by: Mike Miller <mike.miller@hp.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] aacraid: controller hangs if kernel uses non-default ASPM policyVasily Averin2011-11-111-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Aacraid controller can hang on some nodes if kernel uses non-default (powersave) ASPM policy. Controller hangs shortly after successful load and hardware detection. Scsi error handler detects this hang and tries to restart hardware but it does not help. Initially it was noticed on RHEL6-based openVZ kernel after backporting aacraid driver from mainline (RHEL6 kernel with original driver works well) http://bugzilla.openvz.org/show_bug.cgi?id=2043 This issue happens because default ASPM policy was changed in Red Hat kernels. Therefore guys from Red Hat have noticed this problem long time ago: on Fedora 12 https://bugzilla.redhat.com/show_bug.cgi?id=540478 on Fedora 14 https://bugzilla.redhat.com/show_bug.cgi?id=679385 In RHEL6 kernel this issue was fixed, ASPM was disabled in aacraid driver. In kernel changelog I've found that seems it was done by Matthew Garrett: - [scsi] aacraid: Disable ASPM by default (Matthew Garrett) [599735] However seems this patch was not submitted to mainline. I've reproduced this issue on vanilla 3.1.0 kernel booted with "pcie_aspm.policy=powersave" option, So I believe it makes sense to do it now. Signed-off-by: Vasily Averin <vvs@sw.ru> [mjg: Checking the Windows drivers indicates that they disable ASPM under all circumstances, so:] Acked-by: Matthew Garrett <mjg@redhat.com> Acked-by: Achim Leubner <Achim_Leubner@pmc-sierra.com> Cc: stable@kernel.org Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] mpt2sas: add missing allocation.Dan Carpenter2011-11-101-0/+5
| | | | | | | | | | | There was supposed to be a kzalloc() here and the compiler complained about it. mpt2sas_scsih.c: In function ‘mpt2sas_scsih_reset_handler’: mpt2sas_scsih.c:2807:21: warning: ‘fw_event’ may be used uninitialized in this function [-Wuninitialized] Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: "Nandigama, Nagalakshmi" <Nagalakshmi.Nandigama@lsi.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] Silencing 'killing requests for dead queue'Hannes Reinecke2011-11-091-1/+2
| | | | | | | | | | | | | When we tear down a device we try to flush all outstanding commands in scsi_free_queue(). However the check in scsi_request_fn() is imperfect as it only signals that we _might start_ aborting commands, not that we've actually aborted some. So move the printk inside the scsi_kill_request function, this will also give us a hint about which commands are aborted. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] fix WARNING: at drivers/scsi/scsi_lib.c:1704James Bottomley2011-11-091-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On Mon, 2011-11-07 at 17:24 +1100, Stephen Rothwell wrote: > Hi all, > > Starting some time last week I am getting the following during boot on > our PPC970 blade: > > calling .ipr_init+0x0/0x68 @ 1 > ipr: IBM Power RAID SCSI Device Driver version: 2.5.2 (April 27, 2011) > ipr 0000:01:01.0: Found IOA with IRQ: 26 > ipr 0000:01:01.0: Starting IOA initialization sequence. > ipr 0000:01:01.0: Adapter firmware version: 06160039 > ipr 0000:01:01.0: IOA initialized. > scsi0 : IBM 572E Storage Adapter > ------------[ cut here ]------------ > WARNING: at drivers/scsi/scsi_lib.c:1704 > Modules linked in: > NIP: c00000000053b3d4 LR: c00000000053e5b0 CTR: c000000000541d70 > REGS: c0000000783c2f60 TRAP: 0700 Not tainted (3.1.0-autokern1) > MSR: 8000000000029032 <EE,ME,CE,IR,DR> CR: 24002024 XER: 20000002 > TASK = c0000000783b8000[1] 'swapper' THREAD: c0000000783c0000 CPU: 0 > GPR00: 0000000000000001 c0000000783c31e0 c000000000cf38b0 c00000000239a9d0 > GPR04: c000000000cbe8f8 0000000000000000 c0000000783c3040 0000000000000000 > GPR08: c000000075daf488 c000000078a3b7ff c000000000bcacc8 0000000000000000 > GPR12: 0000000044002028 c000000007ffb000 0000000002e40000 000000000099b800 > GPR16: 0000000000000000 c000000000bba5fc c000000000a61db8 0000000000000000 > GPR20: 0000000001b77200 0000000000000000 c000000078990000 0000000000000001 > GPR24: c000000002396828 0000000000000000 0000000000000000 c000000078a3b938 > GPR28: fffffffffffffffa c0000000008ad2c0 c000000000c7faa8 c00000000239a9d0 > NIP [c00000000053b3d4] .scsi_free_queue+0x24/0x90 > LR [c00000000053e5b0] .scsi_alloc_sdev+0x280/0x2e0 > Call Trace: > [c0000000783c31e0] [c000000000c7faa8] wireless_seq_fops+0x278d0/0x2eb88 (unreliable) > [c0000000783c3270] [c00000000053e5b0] .scsi_alloc_sdev+0x280/0x2e0 > [c0000000783c3330] [c00000000053eba0] .scsi_probe_and_add_lun+0x390/0xb40 > [c0000000783c34a0] [c00000000053f7ec] .__scsi_scan_target+0x16c/0x650 > [c0000000783c35f0] [c00000000053fd90] .scsi_scan_channel+0xc0/0x100 > [c0000000783c36a0] [c00000000053fefc] .scsi_scan_host_selected+0x12c/0x1c0 > [c0000000783c3750] [c00000000083dcb4] .ipr_probe+0x2c0/0x390 > [c0000000783c3830] [c0000000003f50b4] .local_pci_probe+0x34/0x50 > [c0000000783c38a0] [c0000000003f5f78] .pci_device_probe+0x148/0x150 > [c0000000783c3950] [c0000000004e1e8c] .driver_probe_device+0xdc/0x210 > [c0000000783c39f0] [c0000000004e20cc] .__driver_attach+0x10c/0x110 > [c0000000783c3a80] [c0000000004e1228] .bus_for_each_dev+0x98/0xf0 > [c0000000783c3b30] [c0000000004e1bf8] .driver_attach+0x28/0x40 > [c0000000783c3bb0] [c0000000004e07d8] .bus_add_driver+0x218/0x340 > [c0000000783c3c60] [c0000000004e2a2c] .driver_register+0x9c/0x1b0 > [c0000000783c3d00] [c0000000003f62d4] .__pci_register_driver+0x64/0x140 > [c0000000783c3da0] [c000000000b99f88] .ipr_init+0x4c/0x68 > [c0000000783c3e20] [c00000000000ad24] .do_one_initcall+0x1a4/0x1e0 > [c0000000783c3ee0] [c000000000b512d0] .kernel_init+0x14c/0x1fc > [c0000000783c3f90] [c000000000022468] .kernel_thread+0x54/0x70 > Instruction dump: > ebe1fff8 7c0803a6 4e800020 7c0802a6 fba1ffe8 fbe1fff8 7c7f1b78 f8010010 > f821ff71 e8030398 3120ffff 7c090110 <0b000000> e86303b0 482de065 60000000 > ---[ end trace 759bed76a85e8dec ]--- > scsi 0:0:1:0: Direct-Access IBM-ESXS MAY2036RC T106 PQ: 0 ANSI: 5 > ------------[ cut here ]------------ > > I get lots more of these. The obvious commit to point the finger at > is 3308511c93e6 ("[SCSI] Make scsi_free_queue() kill pending SCSI > commands") but the root cause may be something different. Caused by commit f7c9c6bb14f3104608a3a83cadea10a6943d2804 Author: Anton Blanchard <anton@samba.org> Date: Thu Nov 3 08:56:22 2011 +1100 [SCSI] Fix block queue and elevator memory leak in scsi_alloc_sdev Doesn't completely do the teardown. The true fix is to do a proper teardown instead of hand rolling it Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Tested-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: stable@kernel.org #2.6.38+ Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* Merge branch 'modsplit-Oct31_2011' of ↵Linus Torvalds2011-11-0641-0/+42
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits) Revert "tracing: Include module.h in define_trace.h" irq: don't put module.h into irq.h for tracking irqgen modules. bluetooth: macroize two small inlines to avoid module.h ip_vs.h: fix implicit use of module_get/module_put from module.h nf_conntrack.h: fix up fallout from implicit moduleparam.h presence include: replace linux/module.h with "struct module" wherever possible include: convert various register fcns to macros to avoid include chaining crypto.h: remove unused crypto_tfm_alg_modname() inline uwb.h: fix implicit use of asm/page.h for PAGE_SIZE pm_runtime.h: explicitly requires notifier.h linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h miscdevice.h: fix up implicit use of lists and types stop_machine.h: fix implicit use of smp.h for smp_processor_id of: fix implicit use of errno.h in include/linux/of.h of_platform.h: delete needless include <linux/module.h> acpi: remove module.h include from platform/aclinux.h miscdevice.h: delete unnecessary inclusion of module.h device_cgroup.h: delete needless include <linux/module.h> net: sch_generic remove redundant use of <linux/module.h> net: inet_timewait_sock doesnt need <linux/module.h> ... Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in - drivers/media/dvb/frontends/dibx000_common.c - drivers/media/video/{mt9m111.c,ov6650.c} - drivers/mfd/ab3550-core.c - include/linux/dmaengine.h
| * scsi: Fix up files implicitly depending on module.h inclusionPaul Gortmaker2011-10-3125-0/+25
| | | | | | | | | | | | | | | | | | The module.h header was implicitly present everywhere, so files with no explicit include of the module infrastructure would build anyway. We are now removing the implicit include, and so we need to call out the module.h file that we need explicitly. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
| * scsi: Add export.h for EXPORT_SYMBOL/THIS_MODULE as requiredPaul Gortmaker2011-10-3116-0/+17
| | | | | | | | | | | | | | | | For the basic SCSI infrastructure files that are exporting symbols but not modules themselves, add in the basic export.h header file to allow the exports. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
* | Merge branch 'trivial' of ↵Linus Torvalds2011-11-061-14/+0
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild * 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: scsi: drop unused Kconfig symbol pci: drop unused Kconfig symbol stmmac: drop unused Kconfig symbol x86: drop unused Kconfig symbol powerpc: drop unused Kconfig symbols powerpc: 40x: drop unused Kconfig symbol mips: drop unused Kconfig symbols openrisc: drop unused Kconfig symbols arm: at91: drop unused Kconfig symbol samples: drop unused Kconfig symbol m32r: drop unused Kconfig symbol score: drop unused Kconfig symbols sh: drop unused Kconfig symbol um: drop unused Kconfig symbol sparc: drop unused Kconfig symbol alpha: drop unused Kconfig symbol Fix up trivial conflict in drivers/net/ethernet/stmicro/stmmac/Kconfig as per Michal: the STMMAC_DUAL_MAC config variable is still unused and should be deleted.
| * | scsi: drop unused Kconfig symbolPaul Bolle2011-10-311-14/+0
| | | | | | | | | | | | | | | Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Michal Marek <mmarek@suse.cz>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6Linus Torvalds2011-11-0543-1073/+1751
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (45 commits) [SCSI] Fix block queue and elevator memory leak in scsi_alloc_sdev [SCSI] scsi_dh_alua: Fix the time inteval for alua rtpg commands [SCSI] scsi_transport_iscsi: Fix documentation os parameter [SCSI] mv_sas: OCZ RevoDrive3 & zDrive R4 support [SCSI] libfc: improve flogi retries to avoid lport stuck [SCSI] libfc: avoid exchanges collision during lport reset [SCSI] libfc: fix checking FC_TYPE_BLS [SCSI] edd: Treat "XPRS" host bus type the same as "PCI" [SCSI] isci: overriding max_concurr_spinup oem parameter by max(oem, user) [SCSI] isci: revert bcn filtering [SCSI] isci: Fix hard reset timeout conditions. [SCSI] isci: No need to manage the pending reset bit on pending requests. [SCSI] isci: Remove redundant isci_request.ttype field. [SCSI] isci: Fix task management for SMP, SATA and on dev remove. [SCSI] isci: No task_done callbacks in error handler paths. [SCSI] isci: Handle task request timeouts correctly. [SCSI] isci: Fix tag leak in tasks and terminated requests. [SCSI] isci: Immediately fail I/O to removed devices. [SCSI] isci: Lookup device references through requests in completions. [SCSI] ipr: add definitions for additional adapter ...
| * | | [SCSI] Fix block queue and elevator memory leak in scsi_alloc_sdevAnton Blanchard2011-11-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When looking at memory consumption issues I noticed quite a lot of memory in the kmalloc-2048 bucket: OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME 6561 6471 98% 2.30K 243 27 15552K kmalloc-2048 Over 15MB. slub debug shows that cfq is responsible for almost all of it: # sort -nr /sys/kernel/slab/kmalloc-2048/alloc_calls 6402 .cfq_init_queue+0xec/0x460 age=43423/43564/43655 pid=1 cpus=4,11,13 In scsi_alloc_sdev we do scsi_alloc_queue but if slave_alloc fails we don't free it with scsi_free_queue. The patch below fixes the issue: OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME 135 72 53% 2.30K 5 27 320K kmalloc-2048 # cat /sys/kernel/slab/kmalloc-2048/alloc_calls 3 .cfq_init_queue+0xec/0x460 age=3811/3876/3925 pid=1 cpus=4,11,13 Signed-off-by: Anton Blanchard <anton@samba.org> Cc: <stable@kernel.org> #2.6.38+ Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * | | [SCSI] scsi_dh_alua: Fix the time inteval for alua rtpg commandsMoger, Babu2011-11-031-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch corrects the retry interval for alua rtpg command. Purpose was to retry the commands in seconds. But that was not happening. Reason is msleep takes argument in milliseconds. Also added minor text after successful attach. Signed-off-by: Babu Moger <babu.moger@netapp.com> Acked-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * | | [SCSI] scsi_transport_iscsi: Fix documentation os parameterMarcos Paulo de Souza2011-11-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes documentation of a parameter of iscsi_bsg_host_add function to silence to make htmldocs Signed-off-by: Marcos Paulo de Souza <marcos.mage@gmail.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * | | [SCSI] mv_sas: OCZ RevoDrive3 & zDrive R4 supportRobin H. Johnson2011-10-311-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the OCZ RevoDrive3/zDrive R4 series, the "OCZ SuperScale Storage Controller" with "Virtualized Controller Architecture 2.0" really seems to be a Marvell 88SE9485 part, with OCZ firmware/BIOS. Developed and tested on OCZ RevoDrive3 120GB [PCI 1b85:1021] Should work on: - OCZ RevoDrive3 (2x SandForce 2281) - OCZ RevoDrive3 X2 (4x SandForce 2281) - OCZ zDrive R4 CM84 (4x SandForce 2281) - OCZ zDrive R4 CM88 (8x SandForce 2281) - OCZ zDrive R4 RM84 (4x SandForce 2582) - OCZ zDrive R4 RM88 (8x SandForce 2582) All of this because a friend recently bought a OCZ RevoDrive3 and was bitten by the lack of Linux support. Notes from testing: ------------------- - SMART works. - VPD Device Identification is "OCZ-REVODRIVE3" - Thin provisioning/TRIM seems to be implemented as WRITE SAME UNMAP, with deterministic (non-zero) read after TRIM, but I'm not sure if it works 100% in my testing. - Some of the tuning in the firmware seems to ensure much better performance when in a RAID0 setup than using the two devices seperately. I have not tested booting from the SSD, because all of this was developed and tested remotely from the actual hardware. Signed-off-by: Robin H. Johnson <robbat2@gentoo.org> Thanks-To: Gordon Pritchard <gordp@sfu.ca> Acked-by: Xiangliang Yu <yuxiangl@marvell.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * | | [SCSI] libfc: improve flogi retries to avoid lport stuckVasu Dev2011-10-312-49/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adds more cases to do flogi retry, now also retry on getting bad response due to either no ELS response or flogi response payload length not large enough. In those cases flogi was not retried and that was leaving lport offline. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com> Signed-off-by: Yi Zou <yi.zou@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * | | [SCSI] libfc: avoid exchanges collision during lport resetVasu Dev2011-10-312-9/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently timer delay is large and is using msleep to avoid avoid exchanges collision across lport reset, so instead do this by initializing exches pool indexes during reset also. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com> Signed-off-by: Yi Zou <yi.zou@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * | | [SCSI] libfc: fix checking FC_TYPE_BLSVasu Dev2011-10-311-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Its checked after skb freed, so instead have fh_type cached and then check FC_TYPE_BLS against cached fh_type value. This wrong check was causing double exch locking as reported by Bhanu at https://lists.open-fcoe.org/pipermail/devel/2011-October/011793.html Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com> Signed-off-by: Yi Zou <yi.zou@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * | | [SCSI] isci: overriding max_concurr_spinup oem parameter by max(oem, user)Andrzej Jakowski2011-10-313-10/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes bug where max_concurr_spinup oem parameter should be overriden by max_concurr_spinup user parameter. Override should happen only when max_concurr_spinup user parameter is specified in command line (greater than 0). Also this fix shortens variables representing max_conxurr_spinup for oem and user parameters. Signed-off-by: Andrzej Jakowski <andrzej.jakowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * | | [SCSI] isci: revert bcn filteringDan Williams2011-10-313-281/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The initial bcn filtering implementation was validated on a kernel baseline that predated the switch to new libata error handling. Also, prior to that conversion we borrowed the mvsas MVS_DEV_EH approach to prevent the unwanted extra ap->ops->phy_reset(ap) that occurred in the ata_bus_probe() path. After the conversion to new libata eh resets at discovery are more frequent and get filtered prematurely by IDEV_EH. The result is that our bcn filtering has been blocked from running and at discovery and it appears to stall discovery completion to the point of triggering hung task timeouts. So, revert the implementation for now. When it returns it will go into libsas proper. The domain rediscovery that takes place due to ->lldd_I_T_nexus_reset() events should now be properly waited for by the ata_port_wait_eh() call in ata_port_probe(). So the hard coded delay in the isci ->lldd_I_T_nexus_reset() and other libsas drivers should help debounce the libsas thread from seeing temporary device removals. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * | | [SCSI] isci: Fix hard reset timeout conditions.Jeff Skirvin2011-10-312-42/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A hard reset can timeout before or after the last phy in the port goes away. If after, then notify the OS that the last phy has failed. The recovery for the failed hard reset has been removed. This recovery code was unecessary in that the link would recover from the failure normally by a new link reset sequence or hotplug of the remote device. Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * | | [SCSI] isci: No need to manage the pending reset bit on pending requests.Jeff Skirvin2011-10-313-40/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The lldd does not need to look at or manage the pending device reset bit in pending sas_tasks. Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * | | [SCSI] isci: Remove redundant isci_request.ttype field.Jeff Skirvin2011-10-314-60/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the existing IREQ_TMF flag as a request type indicator. Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * | | [SCSI] isci: Fix task management for SMP, SATA and on dev remove.Jeff Skirvin2011-10-314-184/+67
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | libsas uses the LLDD abort task interface to handle I/O timeouts in the SATA/STP and SMP discovery paths, so this change will terminate STP/SMP requests. Also, if the device is gone, the lldd will prevent libsas from further escalations in the error handler. Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * | | [SCSI] isci: No task_done callbacks in error handler paths.Jeff Skirvin2011-10-311-55/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | libsas will cleanup pending sas_tasks after error handler path functions are called; do not call task_done callbacks. Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * | | [SCSI] isci: Handle task request timeouts correctly.Jeff Skirvin2011-10-312-42/+109
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the case where "task" requests timeout (note that this class of requests can also include SATA/STP soft reset FIS transmissions), handle the case where the task was being managed by some call to terminate the task request by completing both the tmf and the aborting process. Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * | | [SCSI] isci: Fix tag leak in tasks and terminated requests.Jeff Skirvin2011-10-311-20/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make sure terminated requests and completed task tags are freed. Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * | | [SCSI] isci: Immediately fail I/O to removed devices.Jeff Skirvin2011-10-311-10/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the case where an I/O fails to start in isci_request_execute, only allow retries if the device is not already gone. Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * | | [SCSI] isci: Lookup device references through requests in completions.Jeff Skirvin2011-10-311-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The LLDD needs to obtain a reference to the device through the request itself and not through the domain_device, because the domain_device.lldd_dev is set to NULL early in the lldd_dev_gone call. This relies on the fact that the isci_remote_device object is keeping a seperate reference count of outstanding requests. TODO: unify the request count tracking with the isci_remote_device kref. The failure signature of this condition looks like the following log, where the important bits are the call to lldd_dev_gone followed by a crash in isci_terminate_request_core: [ 229.151541] isci 0000:0b:00.0: isci_remote_device_gone: domain_device = ffff8801492d4800, isci_device = ffff880143c657d0, isci_port = ffff880143c63658 [ 229.166007] isci 0000:0b:00.0: isci_remote_device_stop: isci_device = ffff880143c657d0 [ 229.175317] isci 0000:0b:00.0: isci_terminate_pending_requests: idev=ffff880143c657d0 request=ffff88014741f000; task=ffff8801470f46c0 old_state=2 [ 229.189702] isci 0000:0b:00.0: isci_terminate_request_core: device = ffff880143c657d0; request = ffff88014741f000 [ 229.201339] isci 0000:0b:00.0: isci_terminate_request_core: before completion wait (ffff88014741f000/ffff880149715ad0) [ 229.213414] isci 0000:0b:00.0: sci_controller_process_completions: completion queue entry:0x8000a0e9 [ 229.214401] BUG: unable to handle kernel NULL pointer dereference at 0000000000000228 [ 229.214401] IP:jdskirvi-testlbo [<ffffffffa00a58be>] sci_request_completed_state_enter+0x50/0xafb [isci] [ 229.214401] PGD 13d19e067 PUD 13d104067 PMD 0 [ 229.214401] Oops: 0000 [#1] SMP [ 229.214401] CPU 0 x kernel: [ 226 [ 229.214401] Modules linked in: ipv6 dm_multipath uinput nouveau snd_hda_codec_realtek snd_hda_intel ttm drm_kms_helper drm snd_hda_codec snd_hwdep snd_pcm snd_timer i2c_algo_bit isci snd libsas ioatdma mxm_wmi iTCO_wdt soundcore snd_page_alloc scsi_transport_sas iTCO_vendor_support wmi dca video i2c_i801 i2c_core [last unloaded: speedstep_lib] [ 229.214401] [ 229.214401] Pid: 5, comm: kworker/u:0 Not tainted 3.0.0-isci-11.7.29+ #30.353196] Buffer Intel Corporation Stoakley/Pearlcity Workstation [ 229.214401] RIP: 0010:[<ffffffffa00a58be>] I/O error on dev [<ffffffffa00a58be>] sci_request_completed_state_enter+0x50/0xafb [isci] [ 229.214401] RSP: 0018:ffff88014fc03d20 EFLAGS: 00010046 [ 229.214401] RAX: 0000000000000000 RBX: ffff88014741f000 RCX: 0000000000000000 [ 229.214401] RDX: ffffffffa00b2c90 RSI: 0000000000000017 RDI: ffff88014741f0a0 [ 229.214401] RBP: ffff88014fc03d90 R08: 0000000000000018 R09: 0000000000000000 [ 229.214401] R10: 0000000000000000 R11: ffffffff81a17d98 R12: 000000000000001d [ 229.214401] R13: ffff8801470f46c0 R14: 0000000000000000 R15: 0000000000008000 [ 229.214401] FS: 0000000000000000(0000) GS:ffff88014fc00000(0000) knlGS:0000000000000000 [ 229.214401] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 229.214401] CR2: 0000000000000228 CR3: 000000013ceaa000 CR4: 00000000000406f0 [ 229.214401] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 229.214401] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 229.214401] Process kworker/u:0 (pid: 5, threadinfo ffff880149714000, task ffff880149718000) [ 229.214401] Call Trace: [ 229.214401] <IRQ> [ 229.214401] [<ffffffffa00aa6ce>] sci_change_state+0x4a/0x4f [isci] [ 229.214401] [<ffffffffa00a4ca6>] sci_io_request_tc_completion+0x79c/0x7a0 [isci] [ 229.214401] [<ffffffffa00acf35>] sci_controller_process_completions+0x14f/0x396 [isci] [ 229.214401] [<ffffffffa00abbda>] ? spin_lock_irq+0xe/0x10 [isci] [ 229.214401] [<ffffffffa00ad2cf>] isci_host_completion_routine+0x71/0x2be [isci] [ 229.214401] [<ffffffff8107c6b3>] ? mark_held_locks+0x52/0x70 [ 229.214401] [<ffffffff810538e8>] tasklet_action+0x90/0xf1 [ 229.214401] [<ffffffff81054050>] __do_softirq+0xe5/0x1bf [ 229.214401] [<ffffffff8106d9d1>] ? hrtimer_interrupt+0x129/0x1bb [ 229.214401] [<ffffffff814ff69c>] call_softirq+0x1c/0x30 [ 229.214401] [<ffffffff8100bb67>] do_softirq+0x4b/0xa3 [ 229.214401] [<ffffffff81053d84>] irq_exit+0x53/0xb4 [ 229.214401] [<ffffffff814fffe7>] smp_apic_timer_interrupt+0x83/0x91 [ 229.214401] [<ffffffff814fee53>] apic_timer_interrupt+0x13/0x20 [ 229.214401] <EOI> [ 229.214401] [<ffffffff814f7ad4>] ? retint_restore_args+0x13/0x13 [ 229.214401] [<ffffffff8107af29>] ? trace_hardirqs_off+0xd/0xf [ 229.214401] [<ffffffff8104ea71>] ? vprintk+0x40b/0x452 [ 229.214401] [<ffffffff814f4b5a>] printk+0x41/0x47 [ 229.214401] [<ffffffff81314484>] __dev_printk+0x78/0x7a [ 229.214401] [<ffffffff8131471e>] dev_printk+0x45/0x47 [ 229.214401] [<ffffffffa00ae2a3>] isci_terminate_request_core+0x15d/0x317 [isci] [ 229.214401] [<ffffffffa00af1ad>] isci_terminate_pending_requests+0x1a4/0x204 [isci] [ 229.214401] [<ffffffffa00229f6>] ? sas_phye_oob_error+0xc3/0xc3 [libsas] [ 229.214401] [<ffffffffa00a7d9e>] isci_remote_device_nuke_requests+0xa6/0xff [isci] [ 229.214401] [<ffffffffa00a811a>] isci_remote_device_stop+0x7c/0x166 [isci] [ 229.214401] [<ffffffffa00229f6>] ? sas_phye_oob_error+0xc3/0xc3 [libsas] [ 229.214401] [<ffffffffa00a827a>] isci_remote_device_gone+0x76/0x7e [isci] [ 229.214401] [<ffffffffa002363e>] sas_notify_lldd_dev_gone+0x34/0x36 [libsas] [ 229.214401] [<ffffffffa0023945>] sas_unregister_dev+0x57/0x9c [libsas] [ 229.214401] [<ffffffffa00239c0>] sas_unregister_domain_devices+0x36/0x65 [libsas] [ 229.214401] [<ffffffffa0022cb8>] sas_deform_port+0x72/0x1ac [libsas] [ 229.214401] [<ffffffffa00229f6>] ? sas_phye_oob_error+0xc3/0xc3 [libsas] [ 229.214401] [<ffffffffa0022a34>] sas_phye_loss_of_signal+0x3e/0x42 [libsas] Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
OpenPOWER on IntegriCloud