path: root/drivers/infiniband/ulp
Commit message [Author; Age; Files; Lines]
* RDMA/iser: don't send an rkey if all data is written as immadiate-data [Sagi Grimberg; 2017-07-20; 1 file; -2/+4]
    We might get some bogus error completions in case the target will
    remotely invalidate the rkey and the HCA will need to retransmit from
    this buffer.

    Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/IPoIB: Fix error code in ipoib_add_port() [Dan Carpenter; 2017-07-20; 1 file; -0/+1]
    We accidentally don't see the error code on some of these error paths.
    It means we return ERR_PTR(0) which is NULL and it results in a NULL
    dereference in the caller.

    This bug dates to pre-git days.

    Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
    Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
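The failure mode described in this commit can be reproduced outside the kernel. The following is a minimal userspace sketch, not the kernel's code: `err_ptr_sketch()`, `is_err_sketch()` and `ptr_err_sketch()` are simplified stand-ins for the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() helpers, and `add_port_sketch()` is a hypothetical function whose error path can forget to assign the error code.

```c
#include <stddef.h>
#include <stdint.h>
#include <errno.h>

/* Simplified userspace stand-ins for ERR_PTR()/PTR_ERR()/IS_ERR(). */
#define MAX_ERRNO_SKETCH 4095

static inline void *err_ptr_sketch(long error)
{
    return (void *)(intptr_t)error;
}

static inline long ptr_err_sketch(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

static inline int is_err_sketch(const void *ptr)
{
    /* Error pointers live in the last MAX_ERRNO_SKETCH addresses. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

static int dummy_port_sketch;

/*
 * Hypothetical error path. With set_result == 0 the error code is never
 * assigned, so the function returns err_ptr_sketch(0), which is NULL and
 * is not recognised as an error by is_err_sketch().
 */
static void *add_port_sketch(int fail, int set_result)
{
    int result = 0;

    if (fail) {
        if (set_result)
            result = -ENOMEM;
        return err_ptr_sketch(result);
    }
    return &dummy_port_sketch;
}
```

A caller that only checks `is_err_sketch()` treats the NULL return as success and dereferences it, which is the pattern the one-line fix closes by assigning the error code before the jump to the error label.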
* IB/ipoib: Let lower driver handle get_stats64 call [Erez Shitrit; 2017-07-17; 1 file; -0/+12]
    The driver checks if the lower level driver supports get_stats, and if
    so calls it to get the updated statistics, otherwise takes from the
    current netdevice stats object.

    Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
    Reviewed-by: Alex Vesker <valex@mellanox.com>
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
    Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
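The delegation described in this commit follows a common optional-callback pattern. The sketch below is a userspace illustration under stated assumptions: every type and function name in it (`stats_sketch`, `lower_ops_sketch`, `dev_sketch`, `get_stats_sketch`) is hypothetical and much smaller than the kernel's netdev structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-ins for the kernel's netdev types. */
struct stats_sketch {
    uint64_t rx_packets;
    uint64_t tx_packets;
};

struct lower_ops_sketch {
    /* Optional callback; NULL when the lower driver keeps no counters. */
    void (*get_stats64)(void *lower, struct stats_sketch *out);
};

struct dev_sketch {
    const struct lower_ops_sketch *lower_ops;
    void *lower;                 /* lower driver's private state */
    struct stats_sketch stats;   /* counters kept by the generic device */
};

/* Ask the lower driver if it can answer, otherwise use our own counters. */
static void get_stats_sketch(const struct dev_sketch *dev,
                             struct stats_sketch *out)
{
    if (dev->lower_ops && dev->lower_ops->get_stats64)
        dev->lower_ops->get_stats64(dev->lower, out);
    else
        *out = dev->stats;
}

/* Example lower-driver callback, used only for illustration. */
static void lower_get_stats_example(void *lower, struct stats_sketch *out)
{
    *out = *(const struct stats_sketch *)lower;
}
```

The point of the check is that devices without a lower driver callback keep working unchanged, while accelerated devices report the counters their hardware path maintains.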
* IB/IPoIB: Convert IPoIB to memalloc_noio_* calls [Leon Romanovsky; 2017-07-17; 1 file; -9/+7]
    Commit 21caf2fc1931 ("mm: teach mm by current context info to not do
    I/O during memory allocation") added the memalloc_noio_(save|restore)
    functions to enable people to modify the MM behavior by disabling I/O
    during memory allocation. This was further extended in commit
    934f3072c17c ("mm: clear __GFP_FS when PF_MEMALLOC_NOIO is set").
    The memalloc_noio_* functions prevent allocation paths from recursing
    back into the filesystem without explicitly changing the flags for
    every allocation site.

    However, IPoIB has not kept up with these changes and completely
    missed the memalloc_noio_* calls. This led to allocation sites being
    updated with a special QP creation flag, see commit 09b93088d750
    ("IB: Add a QP creation flag to use GFP_NOIO allocations"), although
    this flag is supported by only a small number of drivers in the IB
    stack.

    Let's change it by updating to memalloc_noio_* calls and allow every
    driver underneath to enjoy NOIO allocations.

    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
    Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
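The scoped save/restore idea behind memalloc_noio_* can be modelled in a few lines of userspace C. This is only a sketch: the flag values are illustrative rather than the kernel's, `task_flags_sketch` stands in for the per-task flags word, and all `*_sketch` names are hypothetical.

```c
/* Userspace model only; flag values are illustrative, not the kernel's. */
#define PF_NOIO_SKETCH 0x1u   /* stands in for PF_MEMALLOC_NOIO */
#define GFP_IO_SKETCH  0x2u   /* stands in for __GFP_IO */
#define GFP_FS_SKETCH  0x4u   /* stands in for __GFP_FS */

static unsigned int task_flags_sketch;   /* stands in for current->flags */

/* Enter a NOIO scope and return the previous state of the flag. */
static unsigned int noio_save_sketch(void)
{
    unsigned int old = task_flags_sketch & PF_NOIO_SKETCH;

    task_flags_sketch |= PF_NOIO_SKETCH;
    return old;
}

/* Leave the scope, putting the flag back to what the caller saved. */
static void noio_restore_sketch(unsigned int old)
{
    task_flags_sketch = (task_flags_sketch & ~PF_NOIO_SKETCH) | old;
}

/* What an allocation site effectively sees while a scope is active. */
static unsigned int effective_gfp_sketch(unsigned int gfp)
{
    if (task_flags_sketch & PF_NOIO_SKETCH)
        gfp &= ~(GFP_IO_SKETCH | GFP_FS_SKETCH);
    return gfp;
}
```

Because the saved value is restored rather than the flag being cleared unconditionally, scopes nest correctly, and every allocation made underneath (including those inside lower drivers) is affected without touching each call site.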
* IB/IPoIB: Forward MTU change to driver below [Erez Shitrit; 2017-07-17; 1 file; -2/+17]
    This patch checks if there is a driver below that needs to be updated
    on the new MTU and calls it accordingly.

    Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
    Reviewed-by: Alex Vesker <valex@mellanox.com>
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
    Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB: Convert msleep below 20ms to usleep_range [Leon Romanovsky; 2017-07-17; 2 files; -3/+3]
    msleep(1) may not sleep for 1 ms as expected and can sleep longer.
    A simple conversion from msleep to usleep_range between 1 ms and 2 ms
    solves the issue.

    The full and comprehensive explanation can be found at [1] and [2].

    [1] https://lkml.org/lkml/2007/8/3/250
    [2] Documentation/timers/timers-howto.txt

    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
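The size of the overshoot depends on the timer tick rate. The function below is a rough arithmetic model, assuming msleep() rounds the request up to whole jiffies and then adds one more tick; `msleep_upper_bound_ms_sketch()` and its `hz` parameter are hypothetical names for illustration, not kernel API.

```c
/*
 * Rough model of the longest sleep a jiffies-based msleep() may produce.
 * msecs: requested sleep in milliseconds
 * hz:    assumed timer tick rate (ticks per second)
 */
static unsigned int msleep_upper_bound_ms_sketch(unsigned int msecs,
                                                 unsigned int hz)
{
    unsigned int ticks = (msecs * hz + 999) / 1000;  /* round up to ticks */

    ticks += 1;                  /* extra tick assumed to be added */
    return ticks * 1000 / hz;    /* convert back to milliseconds */
}
```

Under this model a 1 ms request is bounded by 20 ms at 100 ticks per second and by 8 ms at 250 ticks per second, which is why short sleeps are better expressed with a high-resolution range such as 1 ms to 2 ms.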
* IB/iser: Fix connection teardown race condition [Vladimir Neyelov; 2017-07-17; 1 file; -0/+11]
    Under heavy iser target (scst) start/stop stress during login/logout,
    the call trace below was observed on the iser initiator side.

    In iscsi_iser_slave_alloc() the iser_conn pointer can be NULL, because
    iscsi_iser_conn_stop() can be called earlier and free the iser
    connection. Let's protect that flow by introducing a global mutex.

    BUG: unable to handle kernel paging request at 0000000000001018
    IP: [<ffffffffc0426f7e>] iscsi_iser_slave_alloc+0x1e/0x50 [ib_iser]
    Call Trace:
     ? scsi_alloc_sdev+0x242/0x300
     scsi_probe_and_add_lun+0x9e1/0xea0
     ? kfree_const+0x21/0x30
     ? kobject_set_name_vargs+0x76/0x90
     ? __pm_runtime_resume+0x5b/0x70
     __scsi_scan_target+0xf6/0x250
     scsi_scan_target+0xea/0x100
     iscsi_user_scan_session.part.13+0x101/0x130 [scsi_transport_iscsi]
     ? iscsi_user_scan_session.part.13+0x130/0x130 [scsi_transport_iscsi]
     iscsi_user_scan_session+0x1e/0x30 [scsi_transport_iscsi]
     device_for_each_child+0x50/0x90
     iscsi_user_scan+0x44/0x60 [scsi_transport_iscsi]
     store_scan+0xa8/0x100
     ? common_file_perm+0x5d/0x1c0
     dev_attr_store+0x18/0x30
     sysfs_kf_write+0x37/0x40
     kernfs_fop_write+0x12c/0x1c0
     __vfs_write+0x18/0x40
     vfs_write+0xb5/0x1a0
     SyS_write+0x55/0xc0

    Fixes: 318d311e8f01 ("iser: Accept arbitrary sg lists mapping if the device supports it")
    Cc: <stable@vger.kernel.org> # v4.5+
    Signed-off-by: Vladimir Neyelov <vladimirn@mellanox.com>
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
    Reviewed-by: Sagi Grimberg <sagi@grimbeg.me>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
* Merge tag 'v4.13-rc1' into k.o/for-4.13-rc [Doug Ledford; 2017-07-17; 6 files; -11/+13]
|\
    Linux v4.13-rc1
| * Merge branch 'for-next' of ↵ [Linus Torvalds; 2017-07-13; 2 files; -3/+3]
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "It's been usually busy for summer, with most of the efforts centered around TCMU developments and various target-core + fabric driver bug fixing activities. Not particularly large in terms of LoC, but lots of smaller patches from many different folks. The highlights include: - ibmvscsis logical partition manager support (Michael Cyr + Bryant Ly) - Convert target/iblock WRITE_SAME to blkdev_issue_zeroout (hch + nab) - Add support for TMR percpu LUN reference counting (nab) - Fix a potential deadlock between EXTENDED_COPY and iscsi shutdown (Bart) - Fix COMPARE_AND_WRITE caw_sem leak during se_cmd quiesce (Jiang Yi) - Fix TMCU module removal (Xiubo Li) - Fix iser-target OOPs during login failure (Andrea Righi + Sagi) - Breakup target-core free_device backend driver callback (mnc) - Perform TCMU add/delete/reconfig synchronously (mnc) - Fix TCMU multiple UIO open/close sequences (mnc) - Fix TCMU CHECK_CONDITION sense handling (mnc) - Fix target-core SAM_STAT_BUSY + TASK_SET_FULL handling (mnc + nab) - Introduce TYPE_ZBC support in PSCSI (Damien Le Moal) - Fix possible TCMU memory leak + OOPs when recalculating cmd base size (Xiubo Li + Bryant Ly + Damien Le Moal + mnc) - Add login_keys_workaround attribute for non RFC initiators (Robert LeBlanc + Arun Easi + nab)" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (68 commits) iscsi-target: Add login_keys_workaround attribute for non RFC initiators Revert 
"qla2xxx: Fix incorrect tcm_qla2xxx_free_cmd use during TMR ABORT" tcmu: clean up the code and with one small fix tcmu: Fix possbile memory leak / OOPs when recalculating cmd base size target: export lio pgr/alua support as device attr target: Fix return sense reason in target_scsi3_emulate_pr_out target: Fix cmd size for PR-OUT in passthrough_parse_cdb tcmu: Fix dev_config_store target: pscsi: Introduce TYPE_ZBC support target: Use macro for WRITE_VERIFY_32 operation codes target: fix SAM_STAT_BUSY/TASK_SET_FULL handling target: remove transport_complete pscsi: finish cmd processing from pscsi_req_done tcmu: fix sense handling during completion target: add helper to copy sense to se_cmd buffer target: do not require a transport_complete for SCF_TRANSPORT_TASK_SENSE target: make device_mutex and device_list static tcmu: Fix flushing cmd entry dcache page tcmu: fix multiple uio open/close sequences tcmu: drop configured check in destroy ...
| | * iser-target: Avoid isert_conn->cm_id dereference in isert_login_recv_done [Nicholas Bellinger; 2017-07-06; 1 file; -1/+1]
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a NULL pointer dereference in isert_login_recv_done() of isert_conn->cm_id due to isert_cma_handler() -> isert_connect_error() resetting isert_conn->cm_id = NULL during a failed login attempt. As per Sagi, we will always see the completion of all recv wrs posted on the qp (given that we assigned a ->done handler), this is a FLUSH error completion, we just don't get to verify that because we deref NULL before. The issue here, was the assumption that dereferencing the connection cm_id is always safe, which is not true since: commit 4a579da2586bd3b79b025947ea24ede2bbfede62 Author: Sagi Grimberg <sagig@mellanox.com> Date: Sun Mar 29 15:52:04 2015 +0300 iser-target: Fix possible deadlock in RDMA_CM connection error As I see it, we have a direct reference to the isert_device from isert_conn which is the one-liner fix that we actually need like we do in isert_rdma_read_done() and isert_rdma_write_done(). Reported-by: Andrea Righi <righi.andrea@gmail.com> Tested-by: Andrea Righi <righi.andrea@gmail.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Cc: <stable@vger.kernel.org> # 3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| | * IB/srpt: Make a debug statement in srpt_abort_cmd() more informative [Bart Van Assche; 2017-07-06; 1 file; -2/+2]
    Do not only report the state of the I/O context before
    srpt_abort_cmd() was called but also the new state assigned by
    srpt_abort_cmd().

    Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
    Reviewed-by: Hannes Reinecke <hare@suse.com>
    Cc: Doug Ledford <dledford@redhat.com>
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: Andy Grover <agrover@redhat.com>
    Cc: David Disseldorp <ddiss@suse.de>
    Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | Merge tag 'for-linus' of ↵ [Linus Torvalds; 2017-07-06; 6 files; -21/+19]
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma update from Doug Ledford: "This includes two bugs against the newly added opa vnic that were found by turning on the debug kernel options: - sleeping while holding a lock, so a one line fix where they switched it from GFP_KERNEL allocation to a GFP_ATOMIC allocation - a case where they had an isolated caller of their code that could call them in an atomic context so they had to switch their use of a mutex to a spinlock to be safe, so this was considerably more lines of diff because all uses of that lock had to be switched In addition, the bug that was discussed with you already about an out of bounds array access in ib_uverbs_modify_qp and ib_uverbs_create_ah and is only seven lines of diff. And finally, one fix to an earlier fix in the -rc cycle that broke hfi1 and qib in regards to IPoIB (this one is, unfortunately, larger than I would like for a -rc7 submission, but fixing the problem required that we not treat all devices as though they had allocated a netdev universally because it isn't true, and it took 70 lines of diff to resolve the issue, but the final patch has been vetted by Intel and Mellanox and they've both given their approval to the fix). Summary: - Two fixes for OPA found by debug kernel - Fix for user supplied input causing kernel problems - Fix for the IPoIB fixes submitted around -rc4" [ Doug sent this having not noticed the 4.12 release, so I guess I'll be getting another rdma pull request with the actuakl merge window updates and not just fixes. 
Oh well - it would have been nice if this small update had been the merge window one. - Linus ] * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: IB/core, opa_vnic, hfi1, mlx5: Properly free rdma_netdev RDMA/uverbs: Check port number supplied by user verbs cmds IB/opa_vnic: Use spinlock instead of mutex for stats_lock IB/opa_vnic: Use GFP_ATOMIC while sending trap
| * | | net: add netlink_ext_ack argument to rtnl_link_ops.changelink [Matthias Schiffer; 2017-06-26; 1 file; -3/+4]
    Add support for extended error reporting.

    Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
    Acked-by: David Ahern <dsahern@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: add netlink_ext_ack argument to rtnl_link_ops.newlink [Matthias Schiffer; 2017-06-26; 1 file; -1/+2]
    Add support for extended error reporting.

    Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
    Acked-by: David Ahern <dsahern@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net [David S. Miller; 2017-06-21; 3 files; -7/+20]
| |\ \ \
    Two entries being added at the same time to the IFLA policy table,
    whilst parallel bug fixes to decnet routing dst handling overlapping
    with the dst gc removal in net-next.

    Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | networking: make skb_push & __skb_push return void pointers [Johannes Berg; 2017-06-16; 3 files; -4/+4]
    It seems like a historic accident that these return unsigned char *,
    and in many places that means casts are required, more often than not.

    Make these functions return void * and remove all the casts across
    the tree, adding a (u8 *) cast only where the unsigned char pointer
    was used directly, all done with the following spatch:

        @@
        expression SKB, LEN;
        typedef u8;
        identifier fn = { skb_push, __skb_push, skb_push_rcsum };
        @@
        - *(fn(SKB, LEN))
        + *(u8 *)fn(SKB, LEN)

        @@
        expression E, SKB, LEN;
        identifier fn = { skb_push, __skb_push, skb_push_rcsum };
        type T;
        @@
        - E = ((T *)(fn(SKB, LEN)))
        + E = fn(SKB, LEN)

        @@
        expression SKB, LEN;
        identifier fn = { skb_push, __skb_push, skb_push_rcsum };
        @@
        - fn(SKB, LEN)[0]
        + *(u8 *)fn(SKB, LEN)

    Note that the last part there converts from push(...)[0] to the
    more idiomatic *(u8 *)push(...).

    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
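The effect of the return-type change on callers can be shown with a miniature buffer. This is a userspace sketch, not the kernel's struct sk_buff: `buf_sketch`, `hdr_sketch`, `buf_push_sketch()` and `add_header_sketch()` are hypothetical names chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature buffer; not the kernel's struct sk_buff. */
struct buf_sketch {
    uint8_t storage[64];
    uint8_t *data;        /* start of valid data, moves down on push */
};

struct hdr_sketch {
    uint8_t type;
    uint8_t len;
};

static void buf_init_sketch(struct buf_sketch *b)
{
    b->data = b->storage + sizeof(b->storage);
}

/* Returning void * lets callers assign to any header type without a cast. */
static void *buf_push_sketch(struct buf_sketch *b, size_t len)
{
    b->data -= len;
    return b->data;
}

static void add_header_sketch(struct buf_sketch *b, uint8_t type, uint8_t len)
{
    struct hdr_sketch *h = buf_push_sketch(b, sizeof(*h));  /* no cast */

    h->type = type;
    h->len = len;
    /* Direct byte access is the one place that needs an explicit cast. */
    *(uint8_t *)buf_push_sketch(b, 1) = 0xff;
}
```

The two call sites in `add_header_sketch()` correspond to the two cases the semantic patch handles: typed assignments lose their cast, and direct byte dereferences gain a `(u8 *)`-style cast.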
* | | | | IB/iser: Handle lack of memory management extentions correctly [Mike Marciniszyn; 2017-07-11; 1 file; -2/+8]
| |_|/ /
|/| | |
    max_fast_reg_page_list_len is only valid when the memory management
    extensions are signaled by the underlying driver.

    Fix by adjusting iser_calc_scsi_params() to use
    ISCSI_ISER_MAX_SG_TABLESIZE when the extensions are not indicated.

    Reported-by: Thomas Rosenstein <thomas.rosenstein@creamfinance.com>
    Fixes: Commit df749cdc45d9 ("IB/iser: Support up to 8MB data transfer in a single command")
    Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
    Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
    Acked-by: Sagi Grimberg <sagi@grimberg.me>
    Tested-by: Thomas Rosenstein <thomas.rosenstein@creamfinance.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
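The capability check this commit describes can be sketched as a small selection function. This is one plausible reading of the fix, not the driver's actual code: the flag and limit values are placeholders, and `sg_tablesize_sketch()` with its parameters is a hypothetical name.

```c
/* Placeholder values; not the real kernel constants. */
#define MEM_MGT_EXT_SKETCH      0x1u    /* stands in for the mem-mgt extensions capability bit */
#define MAX_SG_TABLESIZE_SKETCH 4096u   /* stands in for ISCSI_ISER_MAX_SG_TABLESIZE */

/*
 * Pick the scatter-gather table size.
 * cap_flags:        device capability bits
 * max_fast_reg_len: device-reported limit, meaningful only when the
 *                   extensions bit is set
 * wanted:           size the upper layer asked for
 */
static unsigned int sg_tablesize_sketch(unsigned int cap_flags,
                                        unsigned int max_fast_reg_len,
                                        unsigned int wanted)
{
    unsigned int limit;

    if (cap_flags & MEM_MGT_EXT_SKETCH)
        limit = max_fast_reg_len;          /* field is valid, trust it */
    else
        limit = MAX_SG_TABLESIZE_SKETCH;   /* field is not valid, use the fixed cap */

    return wanted < limit ? wanted : limit;
}
```

Without the capability test, a device that leaves the field at zero would clamp every request to zero; with it, such a device falls back to the fixed upper bound.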
* | | | IB/core, opa_vnic, hfi1, mlx5: Properly free rdma_netdev [Niranjana Vishwanathapura; 2017-07-05; 2 files; -8/+8]
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | IPOIB is calling free_rdma_netdev even though alloc_rdma_netdev has returned -EOPNOTSUPP. Move free_rdma_netdev from ib_device structure to rdma_netdev structure thus ensuring proper cleanup function is called for the rdma net device. Fix the following trace: ib0: Failed to modify QP to ERROR state BUG: unable to handle kernel paging request at 0000000000001d20 IP: hfi1_vnic_free_rn+0x26/0xb0 [hfi1] Call Trace: ipoib_remove_one+0xbe/0x160 [ib_ipoib] ib_unregister_device+0xd0/0x170 [ib_core] rvt_unregister_device+0x29/0x90 [rdmavt] hfi1_unregister_ib_device+0x1a/0x100 [hfi1] remove_one+0x4b/0x220 [hfi1] pci_device_remove+0x39/0xc0 device_release_driver_internal+0x141/0x200 driver_detach+0x3f/0x80 bus_remove_driver+0x55/0xd0 driver_unregister+0x2c/0x50 pci_unregister_driver+0x2a/0xa0 hfi1_mod_cleanup+0x10/0xf65 [hfi1] SyS_delete_module+0x171/0x250 do_syscall_64+0x67/0x150 entry_SYSCALL64_slow_path+0x25/0x25 Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | | | IB/opa_vnic: Use spinlock instead of mutex for stats_lock [Vishwanathapura, Niranjana; 2017-06-29; 4 files; -12/+10]
    Stats can be read from atomic context, hence make stats_lock as a
    spinlock. Fix the following trace with debug kernel.

    BUG: sleeping function called from invalid context at kernel/locking/mutex.c:238
    in_atomic(): 1, irqs_disabled(): 0, pid: 6487, name: sadc
    Call Trace:
     dump_stack+0x63/0x90
     ___might_sleep+0xda/0x130
     __might_sleep+0x4a/0x90
     mutex_lock+0x20/0x50
     opa_vnic_get_stats64+0x56/0x140 [opa_vnic]
     dev_get_stats+0x74/0x130
     dev_seq_printf_stats+0x37/0x120
     dev_seq_show+0x14/0x30
     seq_read+0x26d/0x3d0

    Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
    Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
    Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
* | | | IB/opa_vnic: Use GFP_ATOMIC while sending trap [Vishwanathapura, Niranjana; 2017-06-29; 1 file; -1/+1]
| |/ /
|/| |
    Pass GFP_ATOMIC flag to ib_create_send_mad() while sending trap as it
    can be triggered from the atomic context. Fix the following trace with
    debug kernel.

    BUG: sleeping function called from invalid context at mm/slab.h:432
    in_atomic(): 1, irqs_disabled(): 0, pid: 1771, name: NetworkManager
    Call Trace:
     dump_stack+0x63/0x90
     ___might_sleep+0xda/0x130
     __might_sleep+0x4a/0x90
     __kmalloc+0x19e/0x220
     ? ib_create_send_mad+0xea/0x390 [ib_core]
     ib_create_send_mad+0xea/0x390 [ib_core]
     opa_vnic_vema_send_trap+0x17b/0x460 [opa_vnic]
     opa_vnic_vema_report_event+0x57/0x80 [opa_vnic]
     opa_vnic_mac_send_event+0xaa/0xf0 [opa_vnic]
     opa_vnic_set_rx_mode+0x17/0x30 [opa_vnic]
     __dev_set_rx_mode+0x52/0x90
     dev_set_rx_mode+0x26/0x40
     __dev_open+0xe8/0x140

    Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
    Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
    Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
* | | IB/ipoib: Fix memory leak in create child syscall [Feras Daoud; 2017-06-14; 1 file; -3/+3]
    The flow of creating a new child goes through ipoib_vlan_add which
    allocates a new interface and checks the rtnl_lock.

    If the lock is taken, restart_syscall will be called to restart the
    system call again. In this case we are not releasing the already
    allocated interface, causing a leak.

    Fixes: 9baa0b036410 ("IB/ipoib: Add rtnl_link_ops support")
    Signed-off-by: Feras Daoud <ferasda@mellanox.com>
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
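The leak pattern in this commit is an early return that skips cleanup. The following userspace sketch shows the corrected shape under stated assumptions: `child_sketch`, `child_add_sketch()` and the `lock_available` parameter are hypothetical, and `-EAGAIN` merely stands in for the syscall restart.

```c
#include <stdlib.h>
#include <errno.h>

static int live_allocs_sketch;   /* counts outstanding allocations */

struct child_sketch {
    int pkey;
};

static struct child_sketch *child_alloc_sketch(int pkey)
{
    struct child_sketch *c = malloc(sizeof(*c));

    if (c) {
        c->pkey = pkey;
        live_allocs_sketch++;
    }
    return c;
}

static void child_free_sketch(struct child_sketch *c)
{
    if (c) {
        free(c);
        live_allocs_sketch--;
    }
}

/*
 * lock_available models whether the (hypothetical) lock could be taken.
 * When it cannot, release what was allocated before asking for a retry.
 */
static int child_add_sketch(int pkey, int lock_available,
                            struct child_sketch **out)
{
    struct child_sketch *c = child_alloc_sketch(pkey);

    if (!c)
        return -ENOMEM;
    if (!lock_available) {
        child_free_sketch(c);   /* the release that the leaky path skipped */
        return -EAGAIN;         /* stands in for restarting the syscall */
    }
    *out = c;
    return 0;
}
```

Since the restarted call allocates a fresh interface, anything allocated before the failed lock attempt has to be released on that path, otherwise each retry leaks one object.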
* | | IB/ipoib: Fix access to un-initialized napi struct [Alex Vesker; 2017-06-14; 1 file; -1/+0]
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no need to re-enable napi since we set the initialized flag before calling ipoib_ib_dev_stop which will disable napi, disabling napi twice is harmless in case it was already disabled. One more reason for this fix is that when using IPoIB new device driver napi is not added to priv, this can lead to kernel panic when rn_ops ndo_open fails. [ 289.755840] invalid opcode: 0000 [#1] SMP [ 289.757111] task: ffff880036964440 ti: ffff880178ee8000 task.ti: ffff880178ee8000 [ 289.757111] RIP: 0010:[<ffffffffa05368d6>] [<ffffffffa05368d6>] napi_enable.part.24+0x4/0x6 [ib_ipoib] [ 289.757111] RSP: 0018:ffff880178eeb6d8 EFLAGS: 00010246 [ 289.757111] RAX: 0000000000000000 RBX: ffff880177a80010 RCX: 000000007fffffff [ 289.757111] RDX: ffffffff81d5f118 RSI: 0000000000000000 RDI: ffff880177a80010 [ 289.757111] RBP: ffff880178eeb6d8 R08: 0000000000000082 R09: 0000000000000283 [ 289.757111] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880175a00000 [ 289.757111] R13: ffff880177a80080 R14: 0000000000000000 R15: 0000000000000001 [ 289.757111] FS: 00007fe2ee346880(0000) GS:ffff88017fc00000(0000) knlGS:0000000000000000 [ 289.757111] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 289.757111] CR2: 00007fffca979020 CR3: 00000001792e4000 CR4: 00000000000006f0 [ 289.757111] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 289.757111] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 289.757111] Stack: [ 289.796027] ffff880178eeb6f0 ffffffffa05251f5 ffff880177a80000 ffff880178eeb718 [ 289.796027] ffffffffa0528505 ffff880175a00000 ffff880177a80000 0000000000000000 [ 289.796027] ffff880178eeb748 ffffffffa051f0ab ffff880175a00000 ffffffffa0537d60 [ 289.796027] Call Trace: [ 289.796027] 
[<ffffffffa05251f5>] napi_enable+0x25/0x30 [ib_ipoib] [ 289.796027] [<ffffffffa0528505>] ipoib_ib_dev_open+0x175/0x190 [ib_ipoib] [ 289.796027] [<ffffffffa051f0ab>] ipoib_open+0x4b/0x160 [ib_ipoib] [ 289.796027] [<ffffffff814fe33f>] _dev_open+0xbf/0x130 [ 289.796027] [<ffffffff814fe62d>] __dev_change_flags+0x9d/0x170 [ 289.796027] [<ffffffff814fe729>] dev_change_flags+0x29/0x60 [ 289.796027] [<ffffffff8150caf7>] do_setlink+0x397/0xa40 Fixes: cd565b4b51e5 ('IB/IPoIB: Support acceleration options callbacks') Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | | IB/ipoib: Delete napi in device uninit default [Alex Vesker; 2017-06-14; 1 file; -0/+3]
    This patch makes init_default and uninit_default symmetric with a call
    to delete napi. Additionally, uninit_default gained a delete napi call
    in case init_default fails.

    Fixes: 515ed4f3aab4 ('IB/IPoIB: Separate control and data related initializations')
    Signed-off-by: Alex Vesker <valex@mellanox.com>
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
* | | IB/ipoib: Limit call to free rdma_netdev for capable devices [Alex Vesker; 2017-06-14; 1 file; -1/+4]
    Limit calls to free_rdma_netdev() for capable devices only.

    Fixes: cd565b4b51e5 ('IB/IPoIB: Support acceleration options callbacks')
    Signed-off-by: Alex Vesker <valex@mellanox.com>
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
* | | IB/ipoib: Fix memory leaks for child interfaces priv [Alex Vesker; 2017-06-14; 2 files; -2/+10]
|/ /
    There is a need to free priv explicitly and not just to release the
    device. Child priv is freed explicitly on the remove flow; this patch
    also adds the priv free on the error flow in P_key creation and in
    add_port.

    Fixes: cd565b4b51e5 ('IB/IPoIB: Support acceleration options callbacks')
    Signed-off-by: Alex Vesker <valex@mellanox.com>
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
* | RDMA/SA: Fix kernel panic in CMA request handler flow [Majd Dibbiny; 2017-06-01; 1 file; -1/+1]
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 9fdca4da4d8c (IB/SA: Split struct sa_path_rec based on IB and ROCE specific fields) moved the service_id to be specific attribute for IB and OPA SA Path Record, and thus wasn't assigned for RoCE. This caused to the following kernel panic in the CMA request handler flow: [ 27.074594] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 [ 27.074731] IP: __radix_tree_lookup+0x1d/0xe0 ... [ 27.075356] Workqueue: ib_cm cm_work_handler [ib_cm] [ 27.075401] task: ffff88022e3b8000 task.stack: ffffc90001298000 [ 27.075449] RIP: 0010:__radix_tree_lookup+0x1d/0xe0 ... [ 27.075979] Call Trace: [ 27.076015] radix_tree_lookup+0xd/0x10 [ 27.076055] cma_ps_find+0x59/0x70 [rdma_cm] [ 27.076097] cma_id_from_event+0xd2/0x470 [rdma_cm] [ 27.076144] ? ib_init_ah_from_path+0x39a/0x590 [ib_core] [ 27.076193] cma_req_handler+0x25/0x480 [rdma_cm] [ 27.076237] cm_process_work+0x25/0x120 [ib_cm] [ 27.076280] ? cm_get_bth_pkey.isra.62+0x3c/0xa0 [ib_cm] [ 27.076350] cm_req_handler+0xb03/0xd40 [ib_cm] [ 27.076430] ? sched_clock_cpu+0x11/0xb0 [ 27.076478] cm_work_handler+0x194/0x1588 [ib_cm] [ 27.076525] process_one_work+0x160/0x410 [ 27.076565] worker_thread+0x137/0x4a0 [ 27.076614] kthread+0x112/0x150 [ 27.076684] ? max_active_store+0x60/0x60 [ 27.077642] ? kthread_park+0x90/0x90 [ 27.078530] ret_from_fork+0x2c/0x40 This patch moves it back to the common SA Path Record structure and removes the redundant setter and getter. Tested on Connect-IB and Connect-X4 in Infiniband and RoCE respectively. Fixes: 9fdca4da4d8c (IB/SA: Split struct sa_path_rec based on IB ands ROCE specific fields) Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | RDMA/srp: Fix NULL deref at srp_destroy_qp() [Israel Rukshin; 2017-06-01; 1 file; -1/+1]
    If srp_init_qp() fails at srp_create_ch_ib() then ch->send_cq may be
    NULL. Calling directly to ib_destroy_qp() is sufficient because no
    work requests were posted on the created qp.

    Fixes: 9294000d6d89 ("IB/srp: Drain the send queue before destroying a QP")
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Israel Rukshin <israelr@mellanox.com>
    Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
    Reviewed-by: Bart van Assche <bart.vanassche@sandisk.com>--
    Signed-off-by: Doug Ledford <dledford@redhat.com>
* | RDMA/IPoIB: Limit the ipoib_dev_uninit_default scope [Leon Romanovsky; 2017-06-01; 1 file; -1/+1]
    ipoib_dev_uninit_default() call is used in ipoib_main.c file only and
    it generates the following warning from smatch tool:

    drivers/infiniband/ulp/ipoib/ipoib_main.c:1593:6: warning: symbol
    'ipoib_dev_uninit_default' was not declared. Should it be static?

    so let's declare that function as static.

    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
* | RDMA/IPoIB: Replace netdev_priv with ipoib_priv for ipoib_get_link_ksettings [Honggang Li; 2017-06-01; 1 file; -1/+1]
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ipoib_dev_init accesses the wrong private data for the IPoIB device. Commit cd565b4b51e5 (IB/IPoIB: Support acceleration options callbacks) changed ipoib_priv from being identical to netdev_priv to being an area inside of, but not the same pointer as, the netdev_priv pointer. As such, the struct we want is the ipoib_priv area, not the netdev_priv area, so use the right accessor, otherwise we kernel panic. [ 27.271938] IPv6: ADDRCONF(NETDEV_CHANGE): mlx5_ib0.8006: link becomes ready [ 28.156790] BUG: unable to handle kernel NULL pointer dereference at 000000000000067c [ 28.166309] IP: ib_query_port+0x30/0x180 [ib_core] ... [ 28.306282] RIP: 0010:ib_query_port+0x30/0x180 [ib_core] ... [ 28.393337] Call Trace: [ 28.397594] ipoib_get_link_ksettings+0x66/0xe0 [ib_ipoib] [ 28.405274] __ethtool_get_link_ksettings+0xa0/0x1c0 [ 28.412353] speed_show+0x74/0xa0 [ 28.417503] dev_attr_show+0x20/0x50 [ 28.422922] ? mutex_lock+0x12/0x40 [ 28.428179] sysfs_kf_seq_show+0xbf/0x1a0 [ 28.434002] kernfs_seq_show+0x21/0x30 [ 28.439470] seq_read+0x116/0x3b0 [ 28.444445] ? do_filp_open+0xa5/0x100 [ 28.449774] kernfs_fop_read+0xff/0x180 [ 28.455220] __vfs_read+0x37/0x150 [ 28.460167] ? security_file_permission+0x9d/0xc0 [ 28.466560] vfs_read+0x8c/0x130 [ 28.471318] SyS_read+0x55/0xc0 [ 28.475950] do_syscall_64+0x67/0x150 [ 28.481163] entry_SYSCALL64_slow_path+0x25/0x25 ... [ 28.584493] ---[ end trace 3549968a4bf0aa5d ]--- Fixes: cd565b4b51e5 (IB/IPoIB: Support acceleration options callbacks) Fixes: 0d7e2d2166f6 (IB/ipoib: add get_link_ksettings in ethtool) Signed-off-by: Honggang Li <honli@redhat.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* Merge branch 'for-next' of ↵ [Linus Torvalds; 2017-05-12; 1 file; -6/+3]
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "Things were a lot more calm than previously expected. It's primarily fixes in various areas, with most of the new functionality centering around TCMU backend driver work that Xiubo Li has been driving. Here's the summary on the feature side: - Make T10-PI verify configurable for emulated (FILEIO + RD) backends (Dmitry Monakhov) - Allow target-core/TCMU pass-through to use in-kernel SPC-PR logic (Bryant Ly + MNC) - Add TCMU support for growing ring buffer size (Xiubo Li + MNC) - Add TCMU support for global block data pool (Xiubo Li + MNC) and on the bug-fix side: - Fix COMPARE_AND_WRITE non GOOD status handling for READ phase failures (Gary Guo + nab) - Fix iscsi-target hang with explicitly changing per NodeACL CmdSN number depth with concurrent login driven session reinstatement. (Gary Guo + nab) - Fix ibmvscsis fabric driver ABORT task handling (Bryant Ly) - Fix target-core/FILEIO zero length handling (Bart Van Assche) Also, there was an OOPs introduced with the WRITE_VERIFY changes that I ended up reverting at the last minute, because as not unusual Bart and I could not agree on the fix in time for -rc1. Since it's specific to a conformance test, it's been reverted for now. 
There is a separate patch in the queue to address the underlying control CDB write overflow regression in >= v4.3 separate from the WRITE_VERIFY revert here, that will be pushed post -rc1" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (30 commits) Revert "target: Fix VERIFY and WRITE VERIFY command parsing" IB/srpt: Avoid that aborting a command triggers a kernel warning IB/srpt: Fix abort handling target/fileio: Fix zero-length READ and WRITE handling ibmvscsis: Do not send aborted task response tcmu: fix module removal due to stuck thread target: Don't force session reset if queue_depth does not change iscsi-target: Set session_fall_back_to_erl0 when forcing reinstatement target: Fix compare_and_write_callback handling for non GOOD status tcmu: Recalculate the tcmu_cmd size to save cmd area memories tcmu: Add global data block pool support tcmu: Add dynamic growing data area feature support target: fixup error message in target_tg_pt_gp_tg_pt_gp_id_store() target: fixup error message in target_tg_pt_gp_alua_access_type_store() target/user: PGR Support target: Add WRITE_VERIFY_16 Documentation/target: add an example script to configure an iSCSI target target: Use kmalloc_array() in transport_kmap_data_sg() target: Use kmalloc_array() in compare_and_write_callback() target: Improve size determinations in two functions ...
| * IB/srpt: Avoid that aborting a command triggers a kernel warning [Bart Van Assche, 2017-05-07, files: 1, lines: -1/+2]
    Avoid that the following warning is triggered:

    WARNING: CPU: 10 PID: 166 at ../drivers/infiniband/ulp/srpt/ib_srpt.c:2674 srpt_release_cmd+0x139/0x140 [ib_srpt]
    CPU: 10 PID: 166 Comm: kworker/u24:8 Not tainted 4.9.4-1-default #1
    Workqueue: tmr-fileio target_tmr_work [target_core_mod]
    Call Trace:
     [<ffffffffaa3c4f70>] dump_stack+0x63/0x83
     [<ffffffffaa0844eb>] __warn+0xcb/0xf0
     [<ffffffffaa0845dd>] warn_slowpath_null+0x1d/0x20
     [<ffffffffc06ba429>] srpt_release_cmd+0x139/0x140 [ib_srpt]
     [<ffffffffc06e4377>] target_release_cmd_kref+0xb7/0x120 [target_core_mod]
     [<ffffffffc06e4d7f>] target_put_sess_cmd+0x2f/0x60 [target_core_mod]
     [<ffffffffc06e15e0>] core_tmr_lun_reset+0x340/0x790 [target_core_mod]
     [<ffffffffc06e4816>] target_tmr_work+0xe6/0x140 [target_core_mod]
     [<ffffffffaa09e4d3>] process_one_work+0x1f3/0x4d0
     [<ffffffffaa09e7f8>] worker_thread+0x48/0x4e0
     [<ffffffffaa09e7b0>] ? process_one_work+0x4d0/0x4d0
     [<ffffffffaa0a46da>] kthread+0xca/0xe0
     [<ffffffffaa0a4610>] ? kthread_park+0x60/0x60
     [<ffffffffaa71b775>] ret_from_fork+0x25/0x30

    Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
    Reviewed-by: Hannes Reinecke <hare@suse.com>
    Cc: Doug Ledford <dledford@redhat.com>
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: David Disseldorp <ddiss@suse.de>
    Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * IB/srpt: Fix abort handling [Bart Van Assche, 2017-05-07, files: 1, lines: -5/+1]
    Let the target core check the CMD_T_ABORTED flag instead of the SRP
    target driver. Hence remove the transport_check_aborted_status() call.
    Since state == SRPT_STATE_CMD_RSP_SENT is something that really should
    not happen, do not try to recover if srpt_queue_response() is called
    for an I/O context that is in that state. This patch is a bug fix
    because the srpt_abort_cmd() call is misplaced - if that function is
    called from srpt_queue_response() it should either be called before the
    command state is changed or after the response has been sent.

    Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
    Reviewed-by: Hannes Reinecke <hare@suse.com>
    Cc: Doug Ledford <dledford@redhat.com>
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: Andy Grover <agrover@redhat.com>
    Cc: David Disseldorp <ddiss@suse.de>
    Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* | IB/ipoib: add get_link_ksettings in ethtool [Zhu Yanjun, 2017-05-04, files: 1, lines: -0/+59]
    In order to let the bonding driver report the correct speed of the
    underlying interfaces when they are IPoIB, the ethtool function
    get_link_ksettings() is implemented in the IPoIB driver.

    Cc: Joe Jin <joe.jin@oracle.com>
    Cc: Junxiao Bi <junxiao.bi@oracle.com>
    Suggested-by: Håkon Bugge <Haakon.Bugge@oracle.com>
    Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/SA: Add support to query OPA path records [Dasaratharaman Chandramouli, 2017-05-01, files: 2, lines: -2/+9]
    When bit 26 of the capmask2 field in the OPA classport info query is
    set, SA will query for OPA path records instead of querying for IB
    path records. Note that OPA path records can only be queried by kernel
    ULPs. Userspace clients continue to query IB path records.

    Reviewed-by: Don Hiatt <don.hiatt@intel.com>
    Reviewed-by: Ira Weiny <ira.weiny@intel.com>
    Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
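    The selection rule in this commit message can be sketched as a plain
    bit test. This is a user-space illustration only: the macro, enum and
    function names are hypothetical, and capmask2 is assumed to have
    already been extracted into a 32-bit value. Only the bit position (26)
    and the kernel-ULP restriction come from the message itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name; the commit message only gives the bit position. */
#define CAPMASK2_OPA_PATH_REC_BIT 26u

enum path_rec_kind { PATH_REC_IB, PATH_REC_OPA };

/* True when bit 26 of the (already extracted) capmask2 value is set. */
static inline bool capmask2_has_opa_path_rec(uint32_t capmask2)
{
    return ((capmask2 >> CAPMASK2_OPA_PATH_REC_BIT) & 1u) != 0;
}

/* OPA path records are chosen only for kernel ULPs and only when the
 * capability bit is set; every other caller keeps using IB records. */
static inline enum path_rec_kind choose_path_rec(uint32_t capmask2,
                                                 bool kernel_ulp)
{
    if (kernel_ulp && capmask2_has_opa_path_rec(capmask2))
        return PATH_REC_OPA;
    return PATH_REC_IB;
}

/* Small usage example covering the three cases from the message. */
static inline void capmask2_self_check(void)
{
    assert(choose_path_rec(UINT32_C(1) << 26, true) == PATH_REC_OPA);
    assert(choose_path_rec(UINT32_C(1) << 26, false) == PATH_REC_IB);
    assert(choose_path_rec(0, true) == PATH_REC_IB);
}
```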
* | IB/SA: Add OPA path record type [Dasaratharaman Chandramouli, 2017-05-01, files: 3, lines: -9/+9]
    Add opa_sa_path_rec to sa_path_rec data structure. The 'type' field in
    sa_path_rec identifies the type of the path record.

    Reviewed-by: Don Hiatt <don.hiatt@intel.com>
    Reviewed-by: Ira Weiny <ira.weiny@intel.com>
    Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/SA: Split struct sa_path_rec based on IB and ROCE specific fields [Dasaratharaman Chandramouli, 2017-05-01, files: 3, lines: -10/+14]
    sa_path_rec now contains a union of sa_path_rec_ib and sa_path_rec_roce
    based on the type of the path record. Note that fields applicable to
    path record type ROCE v1 and ROCE v2 fall under sa_path_rec_roce.
    Accessor functions are added to these fields so the caller doesn't have
    to know the type.

    Reviewed-by: Don Hiatt <don.hiatt@intel.com>
    Reviewed-by: Ira Weiny <ira.weiny@intel.com>
    Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
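    The "union plus accessors" layout described here is a tagged union.
    The following user-space sketch shows the general pattern under stated
    assumptions: all struct, field and function names are hypothetical and
    are not taken from the kernel headers. The accessors check the type
    tag so that callers never touch the union members directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type tag; RoCE v1 and v2 share one union member. */
enum path_rec_type { REC_TYPE_IB, REC_TYPE_ROCE_V1, REC_TYPE_ROCE_V2 };

/* Hypothetical IB-only fields. */
struct path_rec_ib {
    uint16_t dlid;
    uint16_t slid;
};

/* Hypothetical RoCE-only fields. */
struct path_rec_roce {
    uint8_t dmac[6];
};

struct path_rec {
    enum path_rec_type type;
    union {
        struct path_rec_ib ib;
        struct path_rec_roce roce;
    } u;
};

static inline int path_rec_is_roce(const struct path_rec *rec)
{
    assert(rec != NULL);
    return rec->type == REC_TYPE_ROCE_V1 || rec->type == REC_TYPE_ROCE_V2;
}

/* Setter is a no-op for record types that have no dlid field. */
static inline void path_rec_set_dlid(struct path_rec *rec, uint16_t dlid)
{
    assert(rec != NULL);
    if (rec->type == REC_TYPE_IB)
        rec->u.ib.dlid = dlid;
}

/* Getter returns 0 for record types that have no dlid field. */
static inline uint16_t path_rec_get_dlid(const struct path_rec *rec)
{
    assert(rec != NULL);
    return rec->type == REC_TYPE_IB ? rec->u.ib.dlid : 0;
}
```

    Returning 0 for a field that does not apply is a choice made for this
    sketch; a real implementation might instead warn or return an error.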
* | IB/SA: Rename ib_sa_path_rec to sa_path_rec [Dasaratharaman Chandramouli, 2017-05-01, files: 5, lines: -7/+7]
    Rename ib_sa_path_rec to a more generic sa_path_rec. This is part of
    extending ib_sa to also support OPA path records in addition to the IB
    defined path records.

    Reviewed-by: Don Hiatt <don.hiatt@intel.com>
    Reviewed-by: Ira Weiny <ira.weiny@intel.com>
    Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/core: Define 'ib' and 'roce' rdma_ah_attr types [Dasaratharaman Chandramouli, 2017-05-01, files: 2, lines: -0/+2]
    rdma_ah_attr can now be either ib or roce, allowing core components to
    use one type or the other and also to define attributes unique to a
    specific type. struct ib_ah is also initialized with the type when it
    is first created. This ensures that calls such as modify_ah don't
    modify the type of the address handle attribute.

    Reviewed-by: Ira Weiny <ira.weiny@intel.com>
    Reviewed-by: Don Hiatt <don.hiatt@intel.com>
    Reviewed-by: Sean Hefty <sean.hefty@intel.com>
    Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
    Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/core: Use rdma_ah_attr accessor functions [Dasaratharaman Chandramouli, 2017-05-01, files: 2, lines: -36/+33]
    Modify core and driver components to use accessor functions introduced
    to access individual fields of rdma_ah_attr

    Reviewed-by: Ira Weiny <ira.weiny@intel.com>
    Reviewed-by: Don Hiatt <don.hiatt@intel.com>
    Reviewed-by: Sean Hefty <sean.hefty@intel.com>
    Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
    Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/core: Rename ib_destroy_ah to rdma_destroy_ah [Dasaratharaman Chandramouli, 2017-05-01, files: 3, lines: -6/+6]
    Rename ib_destroy_ah to rdma_destroy_ah so it's in sync with the
    rename of the ib address handle attribute

    Reviewed-by: Ira Weiny <ira.weiny@intel.com>
    Reviewed-by: Don Hiatt <don.hiatt@intel.com>
    Reviewed-by: Sean Hefty <sean.hefty@intel.com>
    Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
    Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/core: Rename ib_create_ah to rdma_create_ah [Dasaratharaman Chandramouli, 2017-05-01, files: 2, lines: -2/+2]
    Rename ib_create_ah to rdma_create_ah so it's in sync with the rename
    of the ib address handle attribute

    Reviewed-by: Ira Weiny <ira.weiny@intel.com>
    Reviewed-by: Don Hiatt <don.hiatt@intel.com>
    Reviewed-by: Sean Hefty <sean.hefty@intel.com>
    Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
    Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/core: Rename struct ib_ah_attr to rdma_ah_attr [Dasaratharaman Chandramouli, 2017-05-01, files: 5, lines: -5/+5]
    This patch simply renames struct ib_ah_attr to rdma_ah_attr as these
    fields specify attributes that are not necessarily specific to IB.

    Reviewed-by: Ira Weiny <ira.weiny@intel.com>
    Reviewed-by: Don Hiatt <don.hiatt@intel.com>
    Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
    Reviewed-by: Sean Hefty <sean.hefty@intel.com>
    Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/IPoIB: Remove 'else' when the 'if' has a return. [Dasaratharaman Chandramouli, 2017-05-01, files: 1, lines: -10/+9]
    This patch fixes a checkpatch issue related to not having to use an
    'else' if the 'if' path returns from the function.

    Reviewed-by: Ira Weiny <ira.weiny@intel.com>
    Reviewed-by: Don Hiatt <don.hiatt@intel.com>
    Reviewed-by: Sean Hefty <sean.hefty@intel.com>
    Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/core: Move opa_class_port_info definition to header file [Dasaratharaman Chandramouli, 2017-04-28, files: 1, lines: -25/+0]
    Both opa_vnic and the hfi driver use the same opa_classport_info
    definition. We will also have ib_sa capable of querying opa class port
    info and would need this definition. Move it to ib_mad.h for everyone
    to use.

    Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/SA: Modify SA to implicitly cache Class Port info [Dasaratharaman Chandramouli, 2017-04-28, files: 3, lines: -78/+3]
    SA will query and cache class port info as part of its initialization.
    SA will also invalidate and refresh the cache based on specific events.
    Callers such as IPoIB and CM can query the SA to get the classportinfo
    information. Apart from making the caller code much simpler, this
    change puts the onus on the SA to query and maintain classportinfo
    much like how it maintains the address handle to the SM.

    Reviewed-by: Ira Weiny <ira.weiny@intel.com>
    Reviewed-by: Don Hiatt <don.hiatt@intel.com>
    Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/iser: fix spelling mistake: "unexepected" -> "unexpected" [Colin Ian King, 2017-04-25, files: 1, lines: -1/+1]
    trivial fix to spelling mistake in iser_err error message

    Signed-off-by: Colin Ian King <colin.king@canonical.com>
    Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
    Acked-by: Sagi Grimberg <sagi@grimberg.me>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/ipoib: Fix deadlock between ipoib_stop and mcast join flow [Feras Daoud, 2017-04-21, files: 1, lines: -6/+5]
    Before calling ipoib_stop, rtnl_lock should be taken, then the flow
    clears the IPOIB_FLAG_ADMIN_UP and IPOIB_FLAG_OPER_UP flags, and waits
    for mcast completion if IPOIB_MCAST_FLAG_BUSY is set.

    On the other hand, the flow of multicast join task initializes a mcast
    completion, sets the IPOIB_MCAST_FLAG_BUSY and calls ipoib_mcast_join.
    If IPOIB_FLAG_OPER_UP flag is not set, this call returns EINVAL without
    setting the mcast completion and leads to a deadlock.

        ipoib_stop                          |
            |                               |
        clear_bit(IPOIB_FLAG_ADMIN_UP)      |
            |                               |
        Context Switch                      |
            |                       ipoib_mcast_join_task
            |                               |
            |                       spin_lock_irq(lock)
            |                               |
            |                       init_completion(mcast)
            |                               |
            |                       set_bit(IPOIB_MCAST_FLAG_BUSY)
            |                               |
            |                       Context Switch
            |                               |
        clear_bit(IPOIB_FLAG_OPER_UP)       |
            |                               |
        spin_lock_irqsave(lock)             |
            |                               |
        Context Switch                      |
            |                       ipoib_mcast_join
            |                       return (-EINVAL)
            |                               |
            |                       spin_unlock_irq(lock)
            |                               |
            |                       Context Switch
            |                               |
        ipoib_mcast_dev_flush               |
        wait_for_completion(mcast)          |

    ipoib_stop will wait for mcast completion for ever, and will not
    release the rtnl_lock. As a result panic occurs with the following
    trace:

    [13441.639268] Call Trace:
    [13441.640150] [<ffffffff8168b579>] schedule+0x29/0x70
    [13441.641038] [<ffffffff81688fc9>] schedule_timeout+0x239/0x2d0
    [13441.641914] [<ffffffff810bc017>] ? complete+0x47/0x50
    [13441.642765] [<ffffffff810a690d>] ? flush_workqueue_prep_pwqs+0x16d/0x200
    [13441.643580] [<ffffffff8168b956>] wait_for_completion+0x116/0x170
    [13441.644434] [<ffffffff810c4ec0>] ? wake_up_state+0x20/0x20
    [13441.645293] [<ffffffffa05af170>] ipoib_mcast_dev_flush+0x150/0x190 [ib_ipoib]
    [13441.646159] [<ffffffffa05ac967>] ipoib_ib_dev_down+0x37/0x60 [ib_ipoib]
    [13441.647013] [<ffffffffa05a4805>] ipoib_stop+0x75/0x150 [ib_ipoib]

    Fixes: 08bc327629cb ("IB/ipoib: fix for rare multicast join race condition")
    Signed-off-by: Feras Daoud <ferasda@mellanox.com>
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/ipoib: Update broadcast object if PKey value was changed in index 0 [Feras Daoud, 2017-04-21, files: 1, lines: -0/+13]
    Update the broadcast address in the priv->broadcast object when the
    PKey value changes in index 0, otherwise the multicast GID value will
    keep the previous value of the PKey, and will not be updated. This
    leads to interface state down because the interface will keep the old
    PKey value.

    For example, in an SR-IOV environment, if the PF changes the value of
    PKey index 0 for one of the VFs, then the VF receives a PKey change
    event that triggers a heavy flush. This flush calls update_parent_pkey,
    which updates the broadcast object and its relevant members. If in this
    case the multicast GID is not updated, the interface state will be
    down.

    Fixes: c2904141696e ("IPoIB: Fix pkey change flow for virtualization environments")
    Signed-off-by: Feras Daoud <ferasda@mellanox.com>
    Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
    Reviewed-by: Alex Vesker <valex@mellanox.com>
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/IPoIB: Support acceleration options callbacks [Erez Shitrit, 2017-04-20, files: 7, lines: -76/+188]
    IPoIB driver now uses the new set of callback functions. If the
    hardware provider supports the new ipoib_options implementation, the
    driver uses the callbacks in its data path flows, otherwise it uses the
    driver default implementation for all data flows in its code.

    The default implementation wasn't changed and it is exactly as it was
    before the introduction of acceleration support.

    Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
    Reviewed-by: Alex Vesker <valex@mellanox.com>
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/IPoIB: Use defined function for netdev_priv function [Erez Shitrit, 2017-04-20, files: 10, lines: -129/+136]
    Make ipoib_priv point to netdev_priv where the code calls netdev_priv.

    Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
    Reviewed-by: Alex Vesker <valex@mellanox.com>
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
    Signed-off-by: Doug Ledford <dledford@redhat.com>