summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* block: fix lockdep warning on io_context release put_io_context()Tejun Heo2012-02-111-7/+14
| | | | | | | | | | | | | | | | 11a3122f6c "block: strip out locking optimization in put_io_context()" removed ioc_lock depth lockdep annoation along with locking optimization; however, while recursing from put_io_context() is no longer possible, ioc_release_fn() may still end up putting the last reference of another ioc through elevator, which wlil grab ioc->lock triggering spurious (as the ioc is always different one) A-A deadlock warning. As this can only happen one time from ioc_release_fn(), using non-zero subclass from ioc_release_fn() is enough. Use subclass 1. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* relay: prevent integer overflow in relay_open()Dan Carpenter2012-02-101-2/+8
| | | | | | | | | "subbuf_size" and "n_subbufs" come from the user and they need to be capped to prevent an integer overflow. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: stable@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
* loop: zero fill bio instead of return -EIO for partial readDave Young2012-02-081-11/+13
| | | | | | | | | | | | | | | | | | commit 8268f5a741 ("deny partial write for loop dev fd") tried to fix the loop device partial read information leak problem. But it changed the semantics of read behavior. When we read beyond the end of the device we should get 0 bytes, which is normal behavior, we should not just return -EIO Instead of returning -EIO, zero out the bio to avoid information leak in case of partail read. Signed-off-by: Dave Young <dyoung@redhat.com> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Tested-by: Jeff Moyer <jmoyer@redhat.com> Cc: Dmitry Monakhov <dmonakhov@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* bio: don't overflow in bio_get_nr_vecs()Kent Overstreet2012-02-081-7/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | There were two places bio_get_nr_vecs() could overflow: First, it did a left shift to convert from sectors to bytes immediately before dividing by PAGE_SIZE. If PAGE_SIZE ever was less than 512 a great many things would break, so dividing by PAGE_SIZE >> 9 is safe and will generate smaller code too. The nastier overflow was in the DIV_ROUND_UP() (that's what the code was effectively doing, anyways). If n + d overflowed, the whole thing would return 0 which breaks things rather effectively. bio_get_nr_vecs() doesn't claim to give an exact value anyways, so the DIV_ROUND_UP() is silly; we could do a straight divide except if a device's queue_max_sectors was less than PAGE_SIZE we'd return 0. So we just add 1; this should always be safe - things will break badly if bio_get_nr_vecs() returns > BIO_MAX_PAGES (bio_alloc() will suddenly start failing) but it's queue_max_segments that must guard against this, if queue_max_sectors is preventing this from happen things are going to explode on architectures with different PAGE_SIZE. Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Tejun Heo <tj@kernel.org> Acked-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* floppy: Fix a crash during rmmodVivek Goyal2012-02-081-0/+9
| | | | | | | | | | | floppy driver does not call add_disk() on all the drives hence we don't take gendisk reference on request queue for these drives. Don't call put_disk() with disk->queue set, otherwise we try to put the reference we never took. Reported-and-tested-by: Dirk Gouders <gouders@et.bocholt.fh-gelsenkirchen.de> Signed-off-by: Vivek Goyal<vgoyal@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* floppy: Cleanup disk->queue before caling put_disk() if add_disk() was never ↵Vivek Goyal2012-02-081-1/+7
| | | | | | | | | | | | | | | | | | | | called add_disk() takes gendisk reference on request queue. If driver failed during initialization and never called add_disk() then that extra reference is not taken. That reference is put in put_disk(). floppy driver allocates the disk, allocates queue, sets disk->queue and then relizes that floppy controller is not present. It tries to tear down everything and tries to put a reference down in put_disk() which was never taken. In such error cases cleanup disk->queue before calling put_disk() so that we never try to put down a reference which was never taken in first place. Reported-and-tested-by: Suresh Jayaraman <sjayaraman@suse.com> Tested-by: Dirk Gouders <gouders@et.bocholt.fh-gelsenkirchen.de> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* cdrom: move shared static to cdrom_device_infoPaolo Bonzini2012-02-082-8/+7
| | | | | | | | | | | | | The keeplocked variable in the cdrom driver is shared across multiple drives, but set in per-device ioctls. Move it to the per-device struct, avoiding that the setting on one drive affects the driver's behavior when closing another. [ Impact: limit udev's confusion to one drive when a CD burning program unlocks the CD door at the end of burning. ] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* bsg: fix sysfs link remove warningStanislaw Gruszka2012-02-081-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We create "bsg" link if q->kobj.sd is not NULL, so remove it only when the same condition is true. Fixes: WARNING: at fs/sysfs/inode.c:323 sysfs_hash_and_remove+0x2b/0x77() sysfs: can not remove 'bsg', no directory Call Trace: [<c0429683>] warn_slowpath_common+0x6a/0x7f [<c0537a68>] ? sysfs_hash_and_remove+0x2b/0x77 [<c042970b>] warn_slowpath_fmt+0x2b/0x2f [<c0537a68>] sysfs_hash_and_remove+0x2b/0x77 [<c053969a>] sysfs_remove_link+0x20/0x23 [<c05d88f1>] bsg_unregister_queue+0x40/0x6d [<c0692263>] __scsi_remove_device+0x31/0x9d [<c069149f>] scsi_forget_host+0x41/0x52 [<c0689fa9>] scsi_remove_host+0x71/0xe0 [<f7de5945>] quiesce_and_remove_host+0x51/0x83 [usb_storage] [<f7de5a1e>] usb_stor_disconnect+0x18/0x22 [usb_storage] [<c06c29de>] usb_unbind_interface+0x4e/0x109 [<c067a80f>] __device_release_driver+0x6b/0xa6 [<c067a861>] device_release_driver+0x17/0x22 [<c067a46a>] bus_remove_device+0xd6/0xe6 [<c06785e2>] device_del+0xf2/0x137 [<c06c101f>] usb_disable_device+0x94/0x1a0 Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* block: don't call elevator callbacks for plug mergesTejun Heo2012-02-083-27/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Plug merge calls two elevator callbacks outside queue lock - elevator_allow_merge_fn() and elevator_bio_merged_fn(). Although attempt_plug_merge() suggests that elevator is guaranteed to be there through the existing request on the plug list, nothing prevents plug merge from calling into dying or initializing elevator. For regular merges, bypass ensures elvpriv count to reach zero, which in turn prevents merges as all !ELVPRIV requests get REQ_SOFTBARRIER from forced back insertion. Plug merge doesn't check ELVPRIV, and, as the requests haven't gone through elevator insertion yet, it doesn't have SOFTBARRIER set allowing merges on a bypassed queue. This, for example, leads to the following crash during elevator switch. BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 IP: [<ffffffff813b34e9>] cfq_allow_merge+0x49/0xa0 PGD 112cbc067 PUD 115d5c067 PMD 0 Oops: 0000 [#1] PREEMPT SMP CPU 1 Modules linked in: deadline_iosched Pid: 819, comm: dd Not tainted 3.3.0-rc2-work+ #76 Bochs Bochs RIP: 0010:[<ffffffff813b34e9>] [<ffffffff813b34e9>] cfq_allow_merge+0x49/0xa0 RSP: 0018:ffff8801143a38f8 EFLAGS: 00010297 RAX: 0000000000000000 RBX: ffff88011817ce28 RCX: ffff880116eb6cc0 RDX: 0000000000000000 RSI: ffff880118056e20 RDI: ffff8801199512f8 RBP: ffff8801143a3908 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: ffff880118195708 R13: ffff880118052aa0 R14: ffff8801143a3d50 R15: ffff880118195708 FS: 00007f19f82cb700(0000) GS:ffff88011fc80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000008 CR3: 0000000112c6a000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process dd (pid: 819, threadinfo ffff8801143a2000, task ffff880116eb6cc0) Stack: ffff88011817ce28 ffff880118195708 ffff8801143a3928 ffffffff81391bba ffff88011817ce28 ffff880118195708 ffff8801143a3948 ffffffff81391bf1 ffff88011817ce28 0000000000000000 ffff8801143a39a8 ffffffff81398e3e Call Trace: [<ffffffff81391bba>] elv_rq_merge_ok+0x4a/0x60 [<ffffffff81391bf1>] elv_try_merge+0x21/0x40 [<ffffffff81398e3e>] blk_queue_bio+0x8e/0x390 [<ffffffff81396a5a>] generic_make_request+0xca/0x100 [<ffffffff81396b04>] submit_bio+0x74/0x100 [<ffffffff811d45c2>] __blockdev_direct_IO+0x1ce2/0x3450 [<ffffffff811d0dc7>] blkdev_direct_IO+0x57/0x60 [<ffffffff811460b5>] generic_file_aio_read+0x6d5/0x760 [<ffffffff811986b2>] do_sync_read+0xe2/0x120 [<ffffffff81199345>] vfs_read+0xc5/0x180 [<ffffffff81199501>] sys_read+0x51/0x90 [<ffffffff81aeac12>] system_call_fastpath+0x16/0x1b There are multiple ways to fix this including making plug merge check ELVPRIV; however, * Calling into elevator outside queue lock is confusing and error-prone. * Requests on plug list aren't known to the elevator. They aren't on the elevator yet, so there's no elevator specific state to update. * Given the nature of plug merges - collecting bio's for the same purpose from the same issuer - elevator specific restrictions aren't applicable. So, simply don't call into elevator methods from plug merge by moving elv_bio_merged() from bio_attempt_*_merge() to blk_queue_bio(), and using blk_try_merge() in attempt_plug_merge(). This is based on Jens' patch to skip elevator_allow_merge_fn() from plug merge. Note that this makes per-cgroup merged stats skip plug merging. Signed-off-by: Tejun Heo <tj@kernel.org> LKML-Reference: <4F16F3CA.90904@kernel.dk> Original-patch-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* block: separate out blk_rq_merge_ok() and blk_try_merge() from elevator ↵Tejun Heo2012-02-085-55/+46
| | | | | | | | | | | | | | | | | | | | | | | | | functions blk_rq_merge_ok() is the elevator-neutral part of merge eligibility test. blk_try_merge() determines merge direction and expects the caller to have tested elv_rq_merge_ok() previously. elv_rq_merge_ok() now wraps blk_rq_merge_ok() and then calls elv_iosched_allow_merge(). elv_try_merge() is removed and the two callers are updated to call elv_rq_merge_ok() explicitly followed by blk_try_merge(). While at it, make rq_merge_ok() functions return bool. This is to prepare for plug merge update and doesn't introduce any behavior change. This is based on Jens' patch to skip elevator_allow_merge_fn() from plug merge. Signed-off-by: Tejun Heo <tj@kernel.org> LKML-Reference: <4F16F3CA.90904@kernel.dk> Original-patch-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* mtip32xx: removed the irrelevant argument of mtip_hw_submit_io() and the ↵Asai Thambi S P2012-02-072-11/+5
| | | | | | | | | | | | unused member of struct driver_data Removed the following: * irrelevant argument 'barrier' of mtip_hw_submit_io() * unused member 'eh_active' of struct driver_data Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com> Signed-off-by: Sam Bradshaw <sbradshaw@micron.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* block: strip out locking optimization in put_io_context()Tejun Heo2012-02-078-92/+18
| | | | | | | | | | | | | | | | | | put_io_context() performed a complex trylock dancing to avoid deferring ioc release to workqueue. It was also broken on UP because trylock was always assumed to succeed which resulted in unbalanced preemption count. While there are ways to fix the UP breakage, even the most pathological microbench (forced ioc allocation and tight fork/exit loop) fails to show any appreciable performance benefit of the optimization. Strip it out. If there turns out to be workloads which are affected by this change, simpler optimization from the discussion thread can be applied later. Signed-off-by: Tejun Heo <tj@kernel.org> LKML-Reference: <1328514611.21268.66.camel@sli10-conroe> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* cdrom: use copy_to_user() without the underscoresDan Carpenter2012-02-061-7/+1
| | | | | | | | | | | | "nframes" comes from the user and "nframes * CD_FRAMESIZE_RAW" can wrap on 32 bit systems. That would have been ok if we used the same wrapped value for the copy, but we use a shifted value. We should just use the checked version of copy_to_user() because it's not going to make a difference to the speed. Cc: stable@vger.kernel.com Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* block: fix ioc locking warningShaohua Li2012-02-061-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Meelis reported a warning: WARNING: at kernel/timer.c:1122 run_timer_softirq+0x199/0x1ec() Hardware name: 939Dual-SATA2 timer: cfq_idle_slice_timer+0x0/0xaa preempt leak: 00000102 -> 00000103 Modules linked in: sr_mod cdrom videodev media drm_kms_helper ohci_hcd ehci_hcd v4l2_compat_ioctl32 usbcore i2c_ali15x3 snd_seq drm snd_timer snd_seq Pid: 0, comm: swapper Not tainted 3.3.0-rc2-00110-gd125666 #176 Call Trace: <IRQ> [<ffffffff81022aaa>] warn_slowpath_common+0x7e/0x96 [<ffffffff8114c485>] ? cfq_slice_expired+0x1d/0x1d [<ffffffff81022b56>] warn_slowpath_fmt+0x41/0x43 [<ffffffff8114c526>] ? cfq_idle_slice_timer+0xa1/0xaa [<ffffffff8114c485>] ? cfq_slice_expired+0x1d/0x1d [<ffffffff8102c124>] run_timer_softirq+0x199/0x1ec [<ffffffff81047a53>] ? timekeeping_get_ns+0x12/0x31 [<ffffffff810145fd>] ? apic_write+0x11/0x13 [<ffffffff81027475>] __do_softirq+0x74/0xfa [<ffffffff812f337a>] call_softirq+0x1a/0x30 [<ffffffff81002ff9>] do_softirq+0x31/0x68 [<ffffffff810276cf>] irq_exit+0x3d/0xa3 [<ffffffff81014aca>] smp_apic_timer_interrupt+0x6b/0x77 [<ffffffff812f2de9>] apic_timer_interrupt+0x69/0x70 <EOI> [<ffffffff81040136>] ? sched_clock_cpu+0x73/0x7d [<ffffffff81040136>] ? sched_clock_cpu+0x73/0x7d [<ffffffff8100801f>] ? default_idle+0x1e/0x32 [<ffffffff81008019>] ? default_idle+0x18/0x32 [<ffffffff810008b1>] cpu_idle+0x87/0xd1 [<ffffffff812de861>] rest_init+0x85/0x89 [<ffffffff81659a4d>] start_kernel+0x2eb/0x2f8 [<ffffffff8165926e>] x86_64_start_reservations+0x7e/0x82 [<ffffffff81659362>] x86_64_start_kernel+0xf0/0xf7 this_q == locked_q is possible. There are two problems here: 1. In UP case, there is preemption counter issue as spin_trylock always successes. 2. In SMP case, the loop breaks too earlier. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Reported-by: Meelis Roos <mroos@linux.ee> Reported-by: Knut Petersen <Knut_Petersen@t-online.de> Tested-by: Knut Petersen <Knut_Petersen@t-online.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* block: fix NULL icq_cache referenceShaohua Li2012-01-191-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Vivek reported a kernel crash: [ 94.217015] BUG: unable to handle kernel NULL pointer dereference at 000000000000001c [ 94.218004] IP: [<ffffffff81142fae>] kmem_cache_free+0x5e/0x200 [ 94.218004] PGD 13abda067 PUD 137d52067 PMD 0 [ 94.218004] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC [ 94.218004] CPU 0 [ 94.218004] Modules linked in: [last unloaded: scsi_wait_scan] [ 94.218004] [ 94.218004] Pid: 0, comm: swapper/0 Not tainted 3.2.0+ #16 Hewlett-Packard HP xw6600 Workstation/0A9Ch [ 94.218004] RIP: 0010:[<ffffffff81142fae>] [<ffffffff81142fae>] kmem_cache_free+0x5e/0x200 [ 94.218004] RSP: 0018:ffff88013fc03de0 EFLAGS: 00010006 [ 94.218004] RAX: ffffffff81e0d020 RBX: ffff880138b3c680 RCX: 00000001801c001b [ 94.218004] RDX: 00000000003aac1d RSI: ffff880138b3c680 RDI: ffffffff81142fae [ 94.218004] RBP: ffff88013fc03e10 R08: ffff880137830238 R09: 0000000000000001 [ 94.218004] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 94.218004] R13: ffffea0004e2cf00 R14: ffffffff812f6eb6 R15: 0000000000000246 [ 94.218004] FS: 0000000000000000(0000) GS:ffff88013fc00000(0000) knlGS:0000000000000000 [ 94.218004] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 94.218004] CR2: 000000000000001c CR3: 00000001395ab000 CR4: 00000000000006f0 [ 94.218004] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 94.218004] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 94.218004] Process swapper/0 (pid: 0, threadinfo ffffffff81e00000, task ffffffff81e0d020) [ 94.218004] Stack: [ 94.218004] 0000000000000102 ffff88013fc0db20 ffffffff81e22700 ffff880139500f00 [ 94.218004] 0000000000000001 000000000000000a ffff88013fc03e20 ffffffff812f6eb6 [ 94.218004] ffff88013fc03e90 ffffffff810c8da2 ffffffff81e01fd8 ffff880137830240 [ 94.218004] Call Trace: [ 94.218004] <IRQ> [ 94.218004] [<ffffffff812f6eb6>] icq_free_icq_rcu+0x16/0x20 [ 94.218004] [<ffffffff810c8da2>] __rcu_process_callbacks+0x1c2/0x420 [ 94.218004] [<ffffffff810c9038>] rcu_process_callbacks+0x38/0x250 [ 94.218004] [<ffffffff810405ee>] __do_softirq+0xce/0x3e0 [ 94.218004] [<ffffffff8108ed04>] ? clockevents_program_event+0x74/0x100 [ 94.218004] [<ffffffff81090104>] ? tick_program_event+0x24/0x30 [ 94.218004] [<ffffffff8183ed1c>] call_softirq+0x1c/0x30 [ 94.218004] [<ffffffff8100422d>] do_softirq+0x8d/0xc0 [ 94.218004] [<ffffffff81040c3e>] irq_exit+0xae/0xe0 [ 94.218004] [<ffffffff8183f4be>] smp_apic_timer_interrupt+0x6e/0x99 [ 94.218004] [<ffffffff8183e330>] apic_timer_interrupt+0x70/0x80 Once a queue is quiesced, it's not supposed to have any elvpriv data or icq's, and elevator switching depends on that. Request alloc path followed the rule for elvpriv data but forgot apply it to icq's leading to the following crash during elevator switch. Fix it by not allocating icq's if ELVPRIV is not set for the request. Reported-by: Vivek Goyal <vgoyal@redhat.com> Tested-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Shaohua Li <shaohua.li@intel.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* block,cfq: change code orderShaohua Li2012-01-191-3/+4
| | | | | | | | | | | cfq_slice_expired will change saved_workload_slice. It should be called first so saved_workload_slice is correctly set to 0 after workload type is changed. This fixes the code order changed by 54b466e44b1c7. Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* uml: fix compile for x86-64Linus Torvalds2012-01-181-0/+5
| | | | | | | | | | | | | | | | | | | | | | Randy Dunlap reports that we get arch/x86/um/shared/sysdep/ptrace.h:7:20: error: redefinition of 'regs_return_value' arch/x86/um/shared/sysdep/ptrace.h:7:20: note: previous definition of 'regs_return_value' was here when compiling UML for x86-64. Stephen Rothwell root-caused it and says: "Caused by commit d7e7528bcd45 ("Audit: push audit success and retcode into arch ptrace.h") (another patch that was never in linux-next :-(). This file now needs protection against double inclusion." so let's do as the man says. Reported-by: Randy Dunlap <rdunlap@xenotime.net> Analyzed-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'for-next-merge' of ↵Linus Torvalds2012-01-187-0/+4672
|\ | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending * 'for-next-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: ib_srpt: Initial SRP Target merge for v3.3-rc1
| * ib_srpt: Initial SRP Target merge for v3.3-rc1Bart Van Assche2011-12-167-0/+4672
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the kernel module ib_srpt SCSI RDMA Protocol (SRP) target implementation conforming to the SRP r16a specification for the mainline drivers/target infrastructure. This driver was originally developed by Vu Pham and has been optimized by Bart Van Assche and merged into upstream LIO based on his srpt-lio-4.1 branch here: https://github.com/bvanassche/srpt-lio/commits/srpt-lio-4.1/ This updated patch also contains the following two changes from lio-core-2.6.git/master. One is to fix a bug with 1 >= task->task_sg[] chained mappings in ib_srpt, and the other to convert the configfs control plane to reference IB Port GUID and struct srpt_port directly following mainline v4.x target_core_fabric_configfs.c convertion for ib_srpt to work with rtslib/rtsadmin v2 code. These seperate patches can be found here: ib_srpt: Fix bug with chainged SGLs in srpt_map_sg_to_ib_sge http://www.risingtidesystems.com/git/?p=lio-core-2.6.git;a=commitdiff;h=ea485147563b6555a97dbf811825fbb586519252 ib_srpt: Convert se_wwn endpoint reference to struct srpt_port->port_wwn http://www.risingtidesystems.com/git/?p=lio-core-2.6.git;a=commitdiff;h=4e544a210acb227df1bb4ca5086e65bdf4e648ea This also includes the following recent v1 -> v2 review changes: ib_srpt: Fix potential out-of-bounds array access ib_srpt: Avoid failed multipart RDMA transfers ib_srpt: Fix srpt_alloc_fabric_acl failure case return value ib_srpt: Update comments to reference $driver/$port layout ib_srpt: Fix sport->port_guid formatting code ib_srpt: Remove legacy use_port_guid_in_session_name module parameter ib_srpt: Convert srp_max_rdma_size into per port configfs attribute ib_srpt: Convert srp_max_rsp_size into per port configfs attribute ib_srpt: Convert srpt_sq_size into per port configfs attribute and v2 -> v3 review changes: ib_srpt: Fix possible race with srp_sq_size in srpt_create_ch_ib ib_srpt: Fix possible race with srp_max_rsp_size in srpt_release_channel_work ib_srpt: Fix up MAX_SRPT_RDMA_SIZE define ib_srpt: Make srpt_map_sg_to_ib_sge() failure case return -EAGAIN ib_srpt: Convert port_guid to use subnet_prefix + interface_id formatting ib_srpt: Make srpt_check_stop_free return kref_put status ib_srpt: Make compilation with BUG=n proceed` ib_srpt: Use new target_core_fabric.h include ib_srpt: Check hex2bin() return code to silence build warning Cc: Bart Van Assche <bvanassche@acm.org> Cc: Roland Dreier <roland@purestorage.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Vu Pham <vu@mellanox.com> Cc: David Dillow <dillowda@ornl.gov> Signed-off-by: Nicholas A. Bellinger <nab@risingtidesystems.com>
* | Merge branch 'for-next' of ↵Linus Torvalds2012-01-1851-1014/+829
|\ \ | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (26 commits) target: Set additional sense length field in sense data target: Remove legacy device status check from transport_execute_tasks target: Remove __transport_execute_tasks() for each processing context target: Remove extra se_device->execute_task_lock access in fast path target: Drop se_device TCQ queue_depth usage from I/O path target: Fix possible NULL pointer with __transport_execute_tasks target: Remove TFO->check_release_cmd() fabric API caller tcm_fc: Convert ft_send_work to use target_submit_cmd target: Add target_submit_cmd() for process context fabric submission target: Make target_put_sess_cmd use target_release_cmd_kref target: Set response format in INQUIRY response target: tcm_mod_builder: small fixups Documentation/target: Fix tcm_mod_builder.py build breakage target: remove overagressive ____cacheline_aligned annoations tcm_loop: bump max_sectors target/configs: remove trailing newline from udev_path and alias iscsi-target: fix chap identifier simple_strtoul usage target: remove useless casts target: simplify target_check_cdb_and_preempt target: Move core_scsi3_check_cdb_abort_and_preempt ...
| * target: Set additional sense length field in sense dataRoland Dreier2011-12-162-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The target code was not setting the additional sense length field in the sense data it returned, which meant that at least the Linux stack ignored the ASC/ASCQ fields. For example, without this patch, on a tcm_loop device: # sg_raw -v /dev/sda 2 0 0 0 0 0 gives cdb to send: 02 00 00 00 00 00 SCSI Status: Check Condition Sense Information: Fixed format, current; Sense key: Illegal Request Raw sense data (in hex): 70 00 05 00 00 00 00 00 while after the patch we correctly get the following (which matches what a regular disk returns): cdb to send: 02 00 00 00 00 00 SCSI Status: Check Condition Sense Information: Fixed format, current; Sense key: Illegal Request Additional sense: Invalid command operation code Raw sense data (in hex): 70 00 05 00 00 00 00 0a 00 00 00 00 20 00 00 00 00 00 Signed-off-by: Roland Dreier <roland@purestorage.com> Cc: stable@kernel.org Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * target: Remove legacy device status check from transport_execute_tasksNicholas Bellinger2011-12-141-7/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes a legacy se_dev_check_online() check from within transport_execute_tasks() that should no longer be necessary as transport_lookup_cmd_lun() is already making this call. Using transport_cmd_check_stop() from transport_execute_tasks() should already be checking per se_cmd context for each descriptor upon active I/O shutdown, so no need to acquire dev->dev_status_lock again while executing se_task submission. Cc: Christoph Hellwig <hch@lst.de> Cc: Roland Dreier <roland@purestorage.com> Cc: Joern Engel <joern@logfs.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * target: Remove __transport_execute_tasks() for each processing contextNicholas Bellinger2011-12-141-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes the original usage of __transport_execute_tasks() ahead of every transport_get_cmd_from_queue() call in transport_processing_thread(). This helps reduce se_device->execute_task_lock contention between qla2xxx wq with target_submit_cmd() for READs and transport_processing_thread() context servicing WRITEs with full payloads for I/O submission. It also adds a __transport_execute_tasks() to kick the task queue again without a *se_cmd descriptor with existing queue full logic, but this may end up not being necessary. Cc: Christoph Hellwig <hch@lst.de> Cc: Roland Dreier <roland@purestorage.com> Cc: Joern Engel <joern@logfs.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * target: Remove extra se_device->execute_task_lock access in fast pathNicholas Bellinger2011-12-142-17/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes __transport_execute_tasks() perform the addition of tasks to dev->execute_task_list via __transport_add_tasks_from_cmd() while holding dev->execute_task_lock during normal I/O fast path submission. It effectively removes the unnecessary re-acquire of dev->execute_task_lock during transport_execute_tasks() -> transport_add_tasks_from_cmd() ahead of calling __transport_execute_tasks() to queue tasks for the passed *se_cmd descriptor. (v2: Re-add goto check_depth usage for multi-task submission for now..) Cc: Christoph Hellwig <hch@lst.de> Cc: Roland Dreier <roland@purestorage.com> Cc: Joern Engel <joern@logfs.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * target: Drop se_device TCQ queue_depth usage from I/O pathNicholas Bellinger2011-12-144-46/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Historically, pSCSI devices have been the ones that required target-core to enforce a per se_device->depth_left. This patch changes target-core to no longer (by default) enforce a per se_device->depth_left or sleep in transport_tcq_window_closed() when we out of queue slots for all backend export cases. Cc: Christoph Hellwig <hch@lst.de> Cc: Roland Dreier <roland@purestorage.com> Cc: Joern Engel <joern@logfs.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * target: Fix possible NULL pointer with __transport_execute_tasksNicholas Bellinger2011-12-141-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | This patch makes __transport_execute_tasks() use a local *se_dev reference to prevent direct se_cmd->se_dev access after transport_cmd_check_stop() -> transport_add_tasks_from_cmd() has been called, as in the current implementation we can expect __transport_execute_tasks() may be called from another context that may have already completed the I/O. Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * target: Remove TFO->check_release_cmd() fabric API callerNicholas Bellinger2011-12-141-4/+0
| | | | | | | | | | | | | | | | Remove the now unused target_core_fabric_ops->check_release_cmd() as target_core handles this directly for se_cmd->cmd_kref objects now. Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * tcm_fc: Convert ft_send_work to use target_submit_cmdNicholas Bellinger2011-12-141-38/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch converts the main ft_send_work() I/O path to use target_submit_cmd() with a single se_cmd->cmd_kref reference that is released via the existing ft_check_stop_free() response path callback. It also makes ft_send_tm() use transport_init_se_cmd() and target_get_sess_cmd() to also use single se_cmd->cmd_kref reference. Cc: Christoph Hellwig <hch@lst.de> Cc: Kiran Patil <kiran.patil@intel.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * target: Add target_submit_cmd() for process context fabric submissionNicholas Bellinger2011-12-143-3/+92
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a target_submit_cmd() caller that can be used by fabrics to submit an uninitialized se_cmd descriptor to an struct se_session + unpacked_lun from workqueue process context. This call will invoke the following steps: - transport_init_se_cmd() to setup se_cmd specific pointers - Obtain se_cmd->cmd_kref references with target_get_sess_cmd() - set se_cmd->t_tasks_bidi - transport_lookup_cmd_lun() to setup struct se_cmd->se_lun from the passed unpacked_lun - transport_generic_allocate_tasks() to setup the passed *cdb, and - transport_handle_cdb_direct() handle READ dispatch or WRITE ready-to-transfer callback to fabric v2 changes from hch feedback: *) Add target_sc_flags_table for target_submit_cmd flags *) Rename bidi parameter to flags, add TARGET_SCF_BIDI_OP *) Convert checks to BUG_ON *) Add out_check_cond for transport_send_check_condition_and_sense usage v3 changes: *) Add TARGET_SCF_ACK_KREF for target_submit_cmd into target_get_sess_cmd to determine when the fabric caller is expecting a second kref_put() from fabric packet acknowledgement. Cc: Christoph Hellwig <hch@lst.de> Cc: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * target: Make target_put_sess_cmd use target_release_cmd_krefNicholas Bellinger2011-12-142-15/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch moves target_put_sess_cmd() to use a se_cmd->cmd_kref callback target_release_cmd_kref when performing driver release of fabric->se_cmd descriptor memory. It sets the default cmd_kref count value to '2' within target_get_sess_cmd() setup, and currently assumes TFO->check_stop_free() usage. It drops se_tfo->check_release_cmd() usage in the main transport_release_cmd codepath. Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * target: Set response format in INQUIRY responseRoland Dreier2011-12-141-0/+12
| | | | | | | | | | | | | | | | | | | | Current SCSI specs say that the "response format" field in the standard INQUIRY response should be set to 2, and all the real SCSI devices I have do put 2 here. So let's do that too. Signed-off-by: Roland Dreier <roland@purestorage.com> Cc: stable@kernel.org Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * target: tcm_mod_builder: small fixupsSebastian Andrzej Siewior2011-12-141-12/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This includes: - remove on _ in "__NAMELEN" in $fabric _make_tport - target_fabric_configfs_init() returns an error pointer and not NULL anymore. Consider that. - replace (!(var_name)) with (!var_name). The extra () are not required - remove #ifdef MODULE. If the code is builtin it needs an init function or the code is useless - put exit/clean functions into __exit Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * Documentation/target: Fix tcm_mod_builder.py build breakageNicholas Bellinger2011-12-141-21/+2
| | | | | | | | | | | | | | | | | | This patch fixes TFO->release_cmd() and removes legacy pack_lun() usage and new_cmd_failure when generating new TCM fabric skeleton from the tcm_mod_builder.py script. Reported-by: Stefan Bergstrand <stefan.bergstrand@sdsab.se> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * target: remove overagressive ____cacheline_aligned annoationsChristoph Hellwig2011-12-141-31/+31
| | | | | | | | | | | | | | | | | | If we want dynamically allocated objects to be cacheline aligned we need to tell that to the slab allocator by using the proper flags and not by liberally sprinkling annotations onto all structures. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * tcm_loop: bump max_sectorsChristoph Hellwig2011-12-142-14/+5
| | | | | | | | | | | | | | | | | | | | There is not reason to artifically limit max_sectors in tcm_loop, set it to UINT_MAX to allow stressing the large I/O handling in the target core using the loopback driver. Also remove various superflous defines hiding the values set in the host template. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * target/configs: remove trailing newline from udev_path and aliasSebastian Andrzej Siewior2011-12-141-0/+6
| | | | | | | | | | | | | | | | This patch strips the trailing newline from backend device udev_path and alias attributes. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * iscsi-target: fix chap identifier simple_strtoul usageNicholas Bellinger2011-12-141-3/+7
| | | | | | | | | | | | | | | | | | This patch makes chap_server_compute_md5() use proper unsigned long usage for the CHAP_I (identifier) and check for values beyond 255 as per RFC-1994. Reported-by: Joern Engel <joern@logfs.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * target: remove useless castsJörn Engel2011-12-1418-101/+90
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A reader should spend an extra moment whenever noticing a cast, because either something special is going on that deserves extra attention or, as is all too often the case, the code is wrong. These casts, afaics, have all been useless. They cast a foo* to a foo*, cast a void* to the assigned type, cast a foo* to void*, before assigning it to a void* variable, etc. In a few cases I also removed an additional &...[0], which is equally useless. Lastly I added three FIXMEs where, to the best of my judgement, the code appears to have a bug. It would be good if someone could check these. Signed-off-by: Joern Engel <joern@logfs.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * target: simplify target_check_cdb_and_preemptJörn Engel2011-12-141-16/+10
| | | | | | | | | | | | | | | | | | - rename to target_check_cdb_and_preempt - use non-safe list_for_each_entry - move common check into callee (simplifying callers) Signed-off-by: Joern Engel <joern@logfs.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * target: Move core_scsi3_check_cdb_abort_and_preemptJörn Engel2011-12-143-17/+15
| | | | | | | | | | | | | | And make it static afterwards. Signed-off-by: Joern Engel <joern@logfs.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * target: use \n as a separator for configurationSebastian Andrzej Siewior2011-12-145-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The command | echo rd_pages=32768 > ramdisk/control Does not work because it writes "rd_pages=32768\n" and the parser which matches for "rd_pages=%d" does not recognize it due to the \n. One way of fixing this would be using "echo -n" instead. This patch adds \n to the list of separators so we don't have to use the -n argument which I find is more convinient. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * target: make the se_task task_state_active a normal boolChristoph Hellwig2011-12-143-23/+22
| | | | | | | | | | | | | | | | | | | | | | There is no need to make task_state_active an atomic_t given that it is always set under the execute_task_lock so we can make it a simple bool. Also rename it to t_state_active to be closer to the list it guards, and make sure all checks before the list addion/removal actually happen under execute_task_lock. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * target: remove the se_task task_error_status fieldChristoph Hellwig2011-12-142-8/+1
| | | | | | | | | | | | | | | | We only reach transport_complete_task once per task, so the test and set on task_error_status is never going to have an effect. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * target: fold se_task.task_sense into task_flagsChristoph Hellwig2011-12-142-4/+4
| | | | | | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * target: header reshuffle, part2Christoph Hellwig2011-12-1444-462/+288
| | | | | | | | | | | | | | | | | | | | | | | | | | This reorganized the headers under include/target into: - target_core_base.h stays as is with all target-wide data stuctures and defines - target_core_backend.h contains the whole interface to I/O backends - target_core_fabric.h contains the whole interface to fabric modules Except for those only the various configfs macro headers stay around. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * target: reshuffle headersChristoph Hellwig2011-12-1421-186/+170
| | | | | | | | | | | | | | | | | | | | | | Create a new headers, drivers/target/target_core_internal.h that is supposed to hold all target_core-internal prototypes. Move all non-exported includes from include/target to it, and merge the smaller prototype-only includes inside drivers/target into it as well. Mark functions that were found to not be called outside their implementation file static. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* | Merge branch 'release' of ↵Linus Torvalds2012-01-18182-795/+4824
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux This includes initial support for the recently published ACPI 5.0 spec. In particular, support for the "hardware-reduced" bit that eliminates the dependency on legacy hardware. APEI has patches resulting from testing on real hardware. Plus other random fixes. * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (52 commits) acpi/apei/einj: Add extensions to EINJ from rev 5.0 of acpi spec intel_idle: Split up and provide per CPU initialization func ACPI processor: Remove unneeded variable passed by acpi_processor_hotadd_init V2 ACPI processor: Remove unneeded cpuidle_unregister_driver call intel idle: Make idle driver more robust intel_idle: Fix a cast to pointer from integer of different size warning in intel_idle ACPI: kernel-parameters.txt : Add intel_idle.max_cstate intel_idle: remove redundant local_irq_disable() call ACPI processor: Fix error path, also remove sysdev link ACPI: processor: fix acpi_get_cpuid for UP processor intel_idle: fix API misuse ACPI APEI: Convert atomicio routines ACPI: Export interfaces for ioremapping/iounmapping ACPI registers ACPI: Fix possible alignment issues with GAS 'address' references ACPI, ia64: Use SRAT table rev to use 8bit or 16/32bit PXM fields (ia64) ACPI, x86: Use SRAT table rev to use 8bit or 32bit PXM fields (x86/x86-64) ACPI: Store SRAT table revision ACPI, APEI, Resolve false conflict between ACPI NVS and APEI ACPI, Record ACPI NVS regions ACPI, APEI, EINJ, Refine the fix of resource conflict ...
| | \
| | \
| | \
| | \
| | \
| | \
| | \
| | \
| *-------. \ Merge branches 'einj', 'intel_idle', 'misc', 'srat' and 'turbostat-ivb' into ↵Len Brown2012-01-1812-111/+345
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | release
| | | | | | * | tools turbostat: recognize and run properly on IVBLen Brown2011-11-181-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Len Brown <len.brown@intel.com>
| | | | | * | | ACPI, ia64: Use SRAT table rev to use 8bit or 16/32bit PXM fields (ia64)Kurt Garloff2012-01-171-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In SRAT v1, we had 8bit proximity domain (PXM) fields; SRAT v2 provides 32bits for these. The new fields were reserved before. According to the ACPI spec, the OS must disregrard reserved fields. ia64 did handle the PXM fields almost consistently, but depending on sgi's sn2 platform. This patch leaves the sn2 logic in, but does also use 16/32 bits for PXM if the SRAT has rev 2 or higher. The patch also adds __init to the two pxm accessor functions, as they access __initdata now and are called from an __init function only anyway. Note that the code only uses 16 bits for the PXM field in the processor proximity field; the patch does not address this as 16 bits are more than enough. Signed-off-by: Kurt Garloff <kurt@garloff.de> Signed-off-by: Len Brown <len.brown@intel.com>
OpenPOWER on IntegriCloud