path: root/drivers/infiniband/hw/cxgb4
Commit message | Author | Age | Files | Lines
* Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband | Linus Torvalds | 2015-02-21 | 2 | -16/+22
|\
| |     Pull InfiniBand/RDMA updates from Roland Dreier:
| |
| |      - Re-enable on-demand paging changes with stable ABI
| |      - Fairly large set of ocrdma HW driver fixes
| |      - Some qib HW driver fixes
| |      - Other miscellaneous changes
| |
| |     * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (43 commits)
| |       IB/qib: Add blank line after declaration
| |       IB/qib: Fix checkpatch warnings
| |       IB/mlx5: Enable the ODP capability query verb
| |       IB/core: Add on demand paging caps to ib_uverbs_ex_query_device
| |       IB/core: Add support for extended query device caps
| |       RDMA/cxgb4: Don't hang threads forever waiting on WR replies
| |       RDMA/ocrdma: Fix off by one in ocrdma_query_gid()
| |       RDMA/ocrdma: Use unsigned for bit index
| |       RDMA/ocrdma: Help gcc generate better code for ocrdma_srq_toggle_bit
| |       RDMA/ocrdma: Update the ocrdma module version string
| |       RDMA/ocrdma: set vlan present bit for user AH
| |       RDMA/ocrdma: remove reference of ocrdma_dev out of ocrdma_qp structure
| |       RDMA/ocrdma: Add support for interrupt moderation
| |       RDMA/ocrdma: Honor return value of ocrdma_resolve_dmac
| |       RDMA/ocrdma: Allow expansion of the SQ CQEs via buddy CQ expansion of the QP
| |       RDMA/ocrdma: Discontinue support of RDMA-READ-WITH-INVALIDATE
| |       RDMA/ocrdma: Host crash on destroying device resources
| |       RDMA/ocrdma: Report correct state in ibv_query_qp
| |       RDMA/ocrdma: Debugfs enhancments for ocrdma driver
| |       RDMA/ocrdma: Report correct count of interrupt vectors while registering ocrdma device
| |       ...
| * RDMA/cxgb4: Don't hang threads forever waiting on WR replies | Hariprasad S | 2015-02-18 | 1 | -15/+14
| |     In c4iw_wait_for_reply(), if a FW6_MSG WR reply is not received after C4IW_WR_TO seconds, fail the WR operation and mark the device as fatally dead. Further, if the device is marked fatally dead, then fail the WR wait immediately.
| |
| |     Also change the timeout to 60 seconds.
| |
| |     Signed-off-by: Steve Wise <swise@opengridcomputing.com>
| |     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
| |     Signed-off-by: Roland Dreier <roland@purestorage.com>
| * RDMA/cxgb4: Serialize CQ event upcalls with CQ destruction | Hariprasad S | 2015-02-13 | 1 | -1/+8
| |     A race exists where the application can be destroying the CQ concurrently with a HW interrupt indicating a completion has been inserted into the CQ. This can cause an event notification upcall to the application after the CQ has been destroyed.
| |
| |     The solution is to serialize looking up the CQ in the IDR table and referencing the CQ in c4iw_ev_handler() with removing the CQID from the IDR table and blocking until the refcnt reaches 0 in c4iw_destroy_cq().
| |
| |     Signed-off-by: Steve Wise <swise@opengridcomputing.com>
| |     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
| |     Signed-off-by: Roland Dreier <roland@purestorage.com>
* | Merge branch 'debugfs_automount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs | Linus Torvalds | 2015-02-17 | 1 | -24/+11
|\ \
| | |     Pull debugfs patches from Al Viro:
| | |      "debugfs patches, mostly to make it possible for something like tracefs to be transparently automounted on given directory in debugfs.
| | |
| | |       New primitive in there is debugfs_create_automount(name, parent, func, arg), which creates a directory and makes its ->d_automount() return func(arg). Another missing primitive was debugfs_create_file_size() - open-coded in quite a few places. Dave's patch adds it and converts the open-code instances to calling it"
| | |
| | |     * 'debugfs_automount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
| | |       debugfs: Provide a file creation function that also takes an initial size
| | |       new primitive: debugfs_create_automount()
| | |       debugfs: split end_creating() into success and failure cases
| | |       debugfs: take mode-dependent parts of debugfs_get_inode() into callers
| | |       fold debugfs_mknod() into callers
| | |       fold debugfs_create() into caller
| | |       fold debugfs_mkdir() into caller
| | |       debugfs_mknod(): get rid useless arguments
| | |       fold debugfs_link() into caller
| | |       debugfs: kill __create_file()
| | |       debugfs: split the beginning and the end of __create_file() off
| | |       debugfs_{mkdir,create,link}(): get rid of redundant argument
| * | debugfs: Provide a file creation function that also takes an initial size | David Howells | 2015-02-17 | 1 | -24/+11
| |/
| |     Provide a file creation function that also takes an initial size so that the caller doesn't have to set i_size, which means that we don't have to deal with ->d_inode in the callers.
| |
| |     Signed-off-by: David Howells <dhowells@redhat.com>
| |     Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | iw_cxgb4: Cleanup register defines/MACROS defined in t4fw_ri_api.h | Hariprasad Shenai | 2015-01-16 | 7 | -483/+483
| |     Cleanup all the MACROS that are defined in t4fw_ri_api.h and affected files
| |
| |     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | iw_cxgb4: Cleanup register defines/MACROS defined in t4.h | Hariprasad Shenai | 2015-01-16 | 3 | -70/+70
| |     Cleanup all the MACROS defined in t4.h and the affected files
| |
| |     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | iw_cxgb4/cxgb4/cxgb4vf/cxgb4i/csiostor: Cleanup register defines/macros related to all other cpl messages | Hariprasad Shenai | 2015-01-12 | 2 | -12/+12
| |     This patch cleans up all other macros/register defines related to CPL messages that are defined in t4_msg.h and the affected files
| |
| |     Signed-off-by: Anish Bhatt <anish@chelsio.com>
| |     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | iw_cxgb4/cxgb4/cxgb4i: Cleanup register defines/MACROS related to CM CPL messages | Hariprasad Shenai | 2015-01-12 | 1 | -39/+39
| |     This patch cleans up all macros/register defines related to connection management CPL messages that are defined in t4_msg.h and the affected files
| |
| |     Signed-off-by: Anish Bhatt <anish@chelsio.com>
| |     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | RDMA/cxgb4/cxgb4vf/csiostor: Cleanup SGE register defines | Hariprasad Shenai | 2015-01-05 | 1 | -13/+13
|/
|     This patch cleans up all SGE related macros/register defines that are defined in t4_regs.h and the affected files.
|
|     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
|     Signed-off-by: David S. Miller <davem@davemloft.net>
* RDMA/cxgb4: Handle NET_XMIT return codes | Hariprasad S | 2014-12-15 | 1 | -0/+4
|     cxgb4_create_server() and cxgb4_create_server6() return NET_XMIT_* values or a negative errno. iw_cxgb4 needs to handle this correctly.
|
|     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
|     Signed-off-by: Roland Dreier <roland@purestorage.com>
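The mixed return convention described in this commit message can be illustrated with a small user-space sketch. It is not the driver's code: the `XMIT_*` constants are local stand-ins assumed to mirror the kernel's `NET_XMIT_*` values, and `normalize_server_err()` is a hypothetical helper that follows the same idea as the kernel's `net_xmit_errno()` macro.

```c
#include <errno.h>

/* Local stand-ins, assumed to mirror the kernel's NET_XMIT_* codes. */
enum { XMIT_SUCCESS = 0x00, XMIT_DROP = 0x01, XMIT_CN = 0x02 };

/* Hypothetical helper: fold a mixed return value into 0 or a negative errno. */
static int normalize_server_err(int ret)
{
	if (ret <= 0)
		return ret;	/* success, or already a negative errno */
	if (ret == XMIT_CN)
		return 0;	/* congestion notification: not a failure */
	return -ENOBUFS;	/* any other positive code is treated as a drop */
}
```

A caller that only tests for a negative value would treat a positive drop code as success, which is why the positive range needs its own branch.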
* RDMA/cxgb4: Wake up waiters after flushing the qp | Steve Wise | 2014-12-15 | 1 | -1/+1
|     When transitioning into ERROR state, the QP was getting flushed after waking up any waiters. This can cause applications to miss flushed work requests, which can stall an NFS mount.
|
|     Signed-off-by: Steve Wise <swise@opengridcomputing.com>
|     Signed-off-by: Roland Dreier <roland@purestorage.com>
* RDMA/cxgb4: Limit MRs to < 8GB for T4/T5 devices | Hariprasad Shenai | 2014-12-15 | 1 | -0/+22
|     T4/T5 hardware can't handle MRs >= 8GB due to a hardware bug. So limit registrations to < 8GB for these devices.
|
|     Based on original work by Steve Wise <swise@opengridcomputing.com>.
|
|     Signed-off-by: Steve Wise <swise@opengridcomputing.com>
|     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
|     Signed-off-by: Roland Dreier <roland@purestorage.com>
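The size bound in this commit message amounts to a single comparison. The sketch below is a user-space illustration under stated assumptions: `mr_len_ok()` and its `has_mr_limit` flag are hypothetical names, and 8GB is taken to mean 8 * 2^30 bytes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: "8GB" in the commit message means 8 * 2^30 bytes. */
#define MR_LEN_LIMIT (8ULL << 30)

/* Hypothetical check; has_mr_limit would be true for T4/T5 devices. */
static bool mr_len_ok(bool has_mr_limit, uint64_t length)
{
	return !has_mr_limit || length < MR_LEN_LIMIT;
}
```

The comparison is strict because the message says the hardware cannot handle MRs of exactly 8GB either.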
* RDMA/cxgb4: Fix locking issue in process_mpa_request | Hariprasad Shenai | 2014-12-15 | 1 | -1/+2
|     Fix the following lockdep report:
|
|     =============================================
|     [ INFO: possible recursive locking detected ]
|     3.17.0+ #3 Tainted: G E
|     ---------------------------------------------
|     kworker/u64:3/299 is trying to acquire lock:
|      (&epc->mutex){+.+.+.}, at: [<ffffffffa074e07a>] process_mpa_request+0x1aa/0x3e0 [iw_cxgb4]
|
|     but task is already holding lock:
|      (&epc->mutex){+.+.+.}, at: [<ffffffffa074e34e>] rx_data+0x9e/0x1f0 [iw_cxgb4]
|
|     other info that might help us debug this:
|      Possible unsafe locking scenario:
|
|            CPU0
|            ----
|       lock(&epc->mutex);
|       lock(&epc->mutex);
|
|      *** DEADLOCK ***
|
|      May be due to missing lock nesting notation
|
|     3 locks held by kworker/u64:3/299:
|      #0: ("%s""iw_cxgb4"){.+.+.+}, at: [<ffffffff8106f14d>] process_one_work+0x13d/0x4d0
|      #1: (skb_work){+.+.+.}, at: [<ffffffff8106f14d>] process_one_work+0x13d/0x4d0
|      #2: (&epc->mutex){+.+.+.}, at: [<ffffffffa074e34e>] rx_data+0x9e/0x1f0 [iw_cxgb4]
|
|     stack backtrace:
|     CPU: 2 PID: 299 Comm: kworker/u64:3 Tainted: G E 3.17.0+ #3
|     Hardware name: Dell Inc. PowerEdge T110/0X744K, BIOS 1.2.1 01/28/2010
|     Workqueue: iw_cxgb4 process_work [iw_cxgb4]
|      ffff8800b91593d0 ffff8800b8a2f9f8 ffffffff815df107 0000000000000001
|      ffff8800b9158750 ffff8800b8a2fa28 ffffffff8109f0e2 ffff8800bb768a00
|      ffff8800b91593d0 ffff8800b9158750 0000000000000000 ffff8800b8a2fa88
|     Call Trace:
|      [<ffffffff815df107>] dump_stack+0x49/0x62
|      [<ffffffff8109f0e2>] print_deadlock_bug+0xf2/0x100
|      [<ffffffff810a0f04>] validate_chain+0x454/0x700
|      [<ffffffff810a1574>] __lock_acquire+0x3c4/0x580
|      [<ffffffffa074e07a>] ? process_mpa_request+0x1aa/0x3e0 [iw_cxgb4]
|      [<ffffffff810a17cc>] lock_acquire+0x9c/0x110
|      [<ffffffffa074e07a>] ? process_mpa_request+0x1aa/0x3e0 [iw_cxgb4]
|      [<ffffffff815e111b>] mutex_lock_nested+0x4b/0x360
|      [<ffffffffa074e07a>] ? process_mpa_request+0x1aa/0x3e0 [iw_cxgb4]
|      [<ffffffff810c181a>] ? del_timer_sync+0xaa/0xd0
|      [<ffffffff810c1770>] ? try_to_del_timer_sync+0x70/0x70
|      [<ffffffffa074e07a>] process_mpa_request+0x1aa/0x3e0 [iw_cxgb4]
|      [<ffffffffa074a3ec>] ? update_rx_credits+0xec/0x140 [iw_cxgb4]
|      [<ffffffffa074e381>] rx_data+0xd1/0x1f0 [iw_cxgb4]
|      [<ffffffff8109ff23>] ? mark_held_locks+0x73/0xa0
|      [<ffffffff815e4b90>] ? _raw_spin_unlock_irqrestore+0x40/0x70
|      [<ffffffff810a020d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
|      [<ffffffff810a02dd>] ? trace_hardirqs_on+0xd/0x10
|      [<ffffffffa074c931>] process_work+0x51/0x80 [iw_cxgb4]
|      [<ffffffff8106f1c8>] process_one_work+0x1b8/0x4d0
|      [<ffffffff8106f14d>] ? process_one_work+0x13d/0x4d0
|      [<ffffffff8106f600>] worker_thread+0x120/0x3c0
|      [<ffffffff8106f4e0>] ? process_one_work+0x4d0/0x4d0
|      [<ffffffff81074a0e>] kthread+0xde/0x100
|      [<ffffffff815e4b40>] ? _raw_spin_unlock_irq+0x30/0x40
|      [<ffffffff81074930>] ? __init_kthread_worker+0x70/0x70
|      [<ffffffff815e512c>] ret_from_fork+0x7c/0xb0
|      [<ffffffff81074930>] ? __init_kthread_worker+0x70/0x70
|
|     Based on original work by Steve Wise <swise@opengridcomputing.com>.
|
|     Signed-off-by: Steve Wise <swise@opengridcomputing.com>
|     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
|     Signed-off-by: Roland Dreier <roland@purestorage.com>
* RDMA/cxgb4: Configure 0B MRs to match HW implementation | Pramod Kumar | 2014-12-15 | 1 | -2/+4
|     0B MRs need some tweaks to work correctly with HW. When writing the TPTE, if the MR length is zero we now:
|
|     1) turn off all permissions
|     2) set the length to -1
|
|     While functionality/capabilities of the MR are the same with these changes, it resolves a dapltest 0B RDMA Read test failure.
|
|     Based on original work by Steve Wise <swise@opengridcomputing.com>.
|
|     Signed-off-by: Pramod Kumar <pramod@chelsio.com>
|     Signed-off-by: Steve Wise <swise@opengridcomputing.com>
|     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
|     Signed-off-by: Roland Dreier <roland@purestorage.com>
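The two tweaks listed in this commit message can be sketched as a small transformation applied before the TPTE is written. This is only a model: `struct tpte_fields` is a hypothetical, reduced stand-in for the real TPTE layout, and the 64-bit width of the length field is an assumption.

```c
#include <stdint.h>

struct tpte_fields {		/* hypothetical subset of a TPTE */
	uint32_t perms;		/* access permission bits */
	uint64_t len;		/* MR length; the field width is an assumption */
};

/* Build the fields to write, applying the zero-length special case. */
static struct tpte_fields tpte_for_mr(uint32_t perms, uint64_t len)
{
	struct tpte_fields t = { perms, len };

	if (len == 0) {
		t.perms = 0;		/* 1) turn off all permissions */
		t.len = UINT64_MAX;	/* 2) set the length to -1 (all ones) */
	}
	return t;
}
```

Non-zero lengths pass through unchanged, so only the 0B case behaves differently.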
* RDMA/cxgb4: Increase epd buff size for debug interface | Pramod Kumar | 2014-12-15 | 1 | -1/+1
|     IPv6 address string lengths require increasing the buffer size for debugfs handlers.
|
|     Signed-off-by: Pramod Kumar <pramod@chelsio.com>
|     Signed-off-by: Roland Dreier <roland@purestorage.com>
* RDMA/cxgb4/cxgb4vf/csiostor: Cleanup macros/register defines related to PCIE, RSS and FW | Hariprasad Shenai | 2014-11-22 | 1 | -4/+4
|     This patch cleans up all PCIE, RSS & FW related macros/register defines that are defined in t4fw_api.h and the affected files.
|
|     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
|     Signed-off-by: David S. Miller <davem@davemloft.net>
* RDMA/cxgb4/csiostor: Cleansup FW related macros/register defines for PF/VF and LDST | Hariprasad Shenai | 2014-11-22 | 1 | -1/+1
|     This patch cleans up PF/VF and LDST related macros/register defines that are defined in t4fw_api.h and the affected files.
|
|     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
|     Signed-off-by: David S. Miller <davem@davemloft.net>
* RDMA/cxgb4: Cleanup Filter related macros/register defines | Hariprasad Shenai | 2014-11-22 | 1 | -7/+7
|     This patch cleans up all filter related macros/register defines that are defined in t4fw_api.h and the affected files.
|
|     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
|     Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4i/cxgb4 : Refactor macros to conform to uniform standards | Anish Bhatt | 2014-11-13 | 2 | -62/+62
|     Refactored all macros used in cxgb4i as part of the previously started cxgb4 macro names cleanup. Makes them more uniform and avoids namespace collision. Minor changes in other drivers where required, as some of these macros are used by multiple drivers; affected drivers are iw_cxgb4, cxgb4(vf) & csiostor
|
|     Signed-off-by: Anish Bhatt <anish@chelsio.com>
|     Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4: Cleanup macros so they follow the same style and look consistent, part 2 | Hariprasad Shenai | 2014-11-10 | 4 | -52/+52
|     Various patches have ended up changing the style of the symbolic macros/register defines to different styles.
|
|     As a result, the current kernel.org files are a mix of different macro styles. Since these macro/register defines are used by different drivers, a few patch series have ended up adding duplicate macro/register define entries with different styles. This makes these register define/macro files a complete mess and we want to make them clean and consistent. This patch cleans up a part of it.
|
|     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
|     Signed-off-by: David S. Miller <davem@davemloft.net>
* RDMA/cxgb4: Fix ntuple calculation for ipv6 and remove duplicate line | Hariprasad S | 2014-10-14 | 1 | -4/+3
|     This fixes the ntuple calculation for IPv6 active open requests on the T5 adapter. It also removes a duplicate line which got added in commit 92e7ae71726c ("iw_cxgb4: Choose appropriate hw mtu index and ISS for iWARP connections")
|
|     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
|     Signed-off-by: Roland Dreier <roland@purestorage.com>
* RDMA/cxgb4: Add missing neigh_release in find_route | Hariprasad S | 2014-10-14 | 1 | -0/+1
|     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
|     Signed-off-by: Roland Dreier <roland@purestorage.com>
* RDMA/cxgb4: Take IPv6 into account for best_mtu and set_emss | Hariprasad S | 2014-10-14 | 1 | -8/+16
|     best_mtu and set_emss were not considering the IPv6 header in the IPv6 case.
|
|     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
|     Signed-off-by: Roland Dreier <roland@purestorage.com>
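Why the address family matters for these calculations can be shown with the standard fixed header sizes: 20 bytes for IPv4, 40 bytes for IPv6, and 20 bytes for TCP without options. The sketch below is a simplified user-space model; `emss_for_mtu()` is a hypothetical name and it does not reproduce the driver's handling of TCP options or minimum sizes.

```c
#include <stdbool.h>

/* Standard fixed header sizes, without IP or TCP options. */
enum { IPV4_HDR_LEN = 20, IPV6_HDR_LEN = 40, TCP_HDR_LEN = 20 };

/* Hypothetical helper: payload bytes that fit in one segment for a given MTU. */
static unsigned int emss_for_mtu(unsigned int mtu, bool ipv6)
{
	unsigned int overhead = (ipv6 ? IPV6_HDR_LEN : IPV4_HDR_LEN) + TCP_HDR_LEN;

	return mtu > overhead ? mtu - overhead : 0;
}
```

Using the IPv4 overhead for an IPv6 connection would overstate the segment size by 20 bytes, which is the kind of error the commit message describes.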
* RDMA/cxgb4: Make c4iw_wr_log_size_order static | Steve Wise | 2014-10-14 | 1 | -1/+1
|     This fixes a sparse warning.
|
|     Signed-off-by: Steve Wise <swise@opengridcomputing.com>
|     Signed-off-by: Roland Dreier <roland@purestorage.com>
* Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband | Linus Torvalds | 2014-08-14 | 3 | -12/+37
|\
| |     Pull infiniband/rdma updates from Roland Dreier:
| |      "Main set of InfiniBand/RDMA updates for 3.17 merge window:
| |
| |        - MR reregistration support
| |        - MAD support for RMPP in userspace
| |        - iSER and SRP initiator updates
| |        - ocrdma hardware driver updates
| |        - other fixes..."
| |
| |     * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (52 commits)
| |       IB/srp: Fix return value check in srp_init_module()
| |       RDMA/ocrdma: report asic-id in query device
| |       RDMA/ocrdma: Update sli data structure for endianness
| |       RDMA/ocrdma: Obtain SL from device structure
| |       RDMA/uapi: Include socket.h in rdma_user_cm.h
| |       IB/srpt: Handle GID change events
| |       IB/mlx5: Use ARRAY_SIZE instead of sizeof/sizeof[0]
| |       IB/mlx4: Use ARRAY_SIZE instead of sizeof/sizeof[0]
| |       RDMA/amso1100: Check for integer overflow in c2_alloc_cq_buf()
| |       IPoIB: Remove unnecessary test for NULL before debugfs_remove()
| |       IB/mad: Add user space RMPP support
| |       IB/mad: add new ioctl to ABI to support new registration options
| |       IB/mad: Add dev_notice messages for various umad/mad registration failures
| |       IB/mad: Update module to [pr|dev]_* style print messages
| |       IB/ipoib: Avoid multicast join attempts with invalid P_key
| |       IB/umad: Update module to [pr|dev]_* style print messages
| |       IB/ipoib: Avoid flushing the workqueue from worker context
| |       IB/ipoib: Use P_Key change event instead of P_Key polling mechanism
| |       IB/ipath: Add P_Key change event support
| |       mlx4_core: Add support for secure-host and SMP firewall
| |       ...
| * RDMA/cxgb4: Only call CQ completion handler if it is armed | Steve Wise | 2014-08-01 | 3 | -12/+37
| |     The function __flush_qp() always calls the ULP's CQ completion handler functions even if the CQ was not armed. This can crash the system if the function pointer is NULL. The iSER ULP behaves this way: no completion handler and never arm the CQ for notification. So now we track whether the CQ is armed at flush time and only call the completion handlers if their CQs were armed.
| |
| |     Also, if the RCQ and SCQ are the same CQ, the completion handler is getting called twice. It should only be called once after all SQ and RQ WRs are flushed from the QP. So rearrange the logic to fix this.
| |
| |     Signed-off-by: Steve Wise <swise@opengridcomputing.com>
| |     Signed-off-by: Roland Dreier <roland@purestorage.com>
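The two rules in this commit message (call the handler only for an armed CQ, and only once when the receive and send CQs are the same object) can be modeled in a few lines of user-space C. All names here are hypothetical; `struct cq_model` is a stand-in, not the driver's CQ type.

```c
#include <stdbool.h>
#include <stddef.h>

struct cq_model {			/* hypothetical stand-in for a CQ */
	bool armed;			/* did the ULP request notification? */
	void (*comp_handler)(struct cq_model *cq);	/* NULL if the ULP never arms */
	int upcalls;			/* bookkeeping for this example only */
};

/* Example handler that just counts how often it was invoked. */
static void count_upcall(struct cq_model *cq)
{
	cq->upcalls++;
}

/* An unarmed CQ is never called back, so a NULL handler is never touched. */
static void notify_if_armed(struct cq_model *cq)
{
	if (cq->armed)
		cq->comp_handler(cq);
}

/* Run after all SQ and RQ work requests have been flushed. */
static void notify_after_flush(struct cq_model *rcq, struct cq_model *scq)
{
	notify_if_armed(rcq);
	if (scq != rcq)			/* shared CQ: one upcall, not two */
		notify_if_armed(scq);
}
```

The pointer comparison is what prevents the double upcall; the `armed` test is what protects a ULP that registers no handler.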
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net | David S. Miller | 2014-07-22 | 3 | -11/+26
|\ \
| |/
| |     Conflicts:
| |       drivers/infiniband/hw/cxgb4/device.c
| |
| |     The cxgb4 conflict was simply overlapping changes.
| |
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
| * RDMA/cxgb4: Call iwpm_init() only once | Steve Wise | 2014-07-13 | 3 | -9/+12
| |     We need to only register with the iwpm core once. Currently it is being done for every adapter, which causes a failure for each adapter but the first, making multiple adapters unusable.
| |
| |     Fixes: 9eccfe109b27 ("RDMA/cxgb4: Add support for iWARP Port Mapper user space service")
| |     Signed-off-by: Steve Wise <swise@opengridcomputing.com>
| |     Signed-off-by: Roland Dreier <roland@purestorage.com>
| * RDMA/cxgb4: Initialize the device status page | Steve Wise | 2014-07-08 | 1 | -0/+1
| |     The status page is mapped to user processes and allows sharing the device state between the kernel and user processes. This state isn't getting initialized and thus intermittently causes problems. Namely, the user process can mistakenly think the user doorbell writes are disabled, which causes SQ work requests to never get fetched by HW.
| |
| |     Fixes: 05eb23893c2c ("cxgb4/iw_cxgb4: Doorbell Drop Avoidance Bug Fixes").
| |     Signed-off-by: Steve Wise <swise@opengridcomputing.com>
| |     Cc: <stable@vger.kernel.org> # v3.15
| |     Signed-off-by: Roland Dreier <roland@purestorage.com>
| * RDMA/cxgb4: Clean up connection on ARP error | Hariprasad S | 2014-07-08 | 1 | -1/+10
| |     Based on original work by Steve Wise <swise@opengridcomputing.com>
| |
| |     Signed-off-by: Steve Wise <swise@opengridcomputing.com>
| |     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
| |     Signed-off-by: Roland Dreier <roland@purestorage.com>
| * RDMA/cxgb4: Fix skb_leak in reject_cr() | Hariprasad S | 2014-07-08 | 1 | -1/+0
| |     Based on original work by Steve Wise <swise@opengridcomputing.com>
| |
| |     Signed-off-by: Steve Wise <swise@opengridcomputing.com>
| |     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
| |     Signed-off-by: Roland Dreier <roland@purestorage.com>
* | iw_cxgb4: Don't limit TPTE count to 32KB | Hariprasad Shenai | 2014-07-21 | 2 | -2/+1
| |     Use the size advertised by FW
| |
| |     Signed-off-by: Steve Wise <swise@opengridcomputing.com>
| |     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | iw_cxgb4: advertise the correct device max attributes | Hariprasad Shenai | 2014-07-21 | 5 | -33/+32
| |     Advertise the actual max limits for things like qp depths, number of qps, cqs, etc.
| |
| |     Clean up the queue allocation for qps and cqs.
| |
| |     Signed-off-by: Steve Wise <swise@opengridcomputing.com>
| |     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | iw_cxgb4: Support query_qp() verb | Hariprasad Shenai | 2014-07-21 | 1 | -0/+6
| |     Signed-off-by: Steve Wise <swise@opengridcomputing.com>
| |     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | iw_cxgb4: log detailed warnings for negative advice | Hariprasad Shenai | 2014-07-21 | 1 | -6/+23
| |     Signed-off-by: Steve Wise <swise@opengridcomputing.com>
| |     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | iw_cxgb4: fix for 64-bit integer division | Hariprasad Shenai | 2014-07-17 | 1 | -1/+2
| |     Fixed an error introduced in commit id 7730b4c ("cxgb4/iw_cxgb4: work request logging feature") when compiling on 32-bit architectures, as reported by kbuild.
| |
| |     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | cxgb4/iw_cxgb4: Move common defines to cxgb4 | Anish Bhatt | 2014-07-17 | 1 | -1/+0
| |     This define is used by cxgb4i and iw_cxgb4; moving it to avoid code duplication
| |
| |     Signed-off-by: Anish Bhatt <anish@chelsio.com>
| |     Acked-by: Steve Wise <swise@opengridcomputing.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | cxgb4/iw_cxgb4: work request logging feature | Hariprasad Shenai | 2014-07-15 | 5 | -0/+174
| |     This commit enhances the iwarp driver to optionally keep a log of rdma work request timing data for kernel mode QPs. If the iw_cxgb4 module option c4iw_wr_log is set to non-zero, each work request is tracked and timing data maintained in a rolling log that is 4096 entries deep by default. Module option c4iw_wr_log_size_order allows specifying a log2 size to use instead of the default order of 12 (4096 entries). Both module options are read-only and must be passed in at module load time to set them. For example:
| |
| |       modprobe iw_cxgb4 c4iw_wr_log=1 c4iw_wr_log_size_order=10
| |
| |     The timing data is viewable via the iw_cxgb4 debugfs file "wr_log". Writing anything to this file will clear all the timing data. Data tracked includes:
| |
| |     - The host time when the work request was posted, just before ringing the doorbell. The host time when the completion was polled by the application. This is also the time the log entry is created. The delta of these two times is the amount of time taken to process the work request.
| |
| |     - The qid of the EQ used to post the work request.
| |
| |     - The work request opcode.
| |
| |     - The cqe wr_id field. For SQ completions this is the swsqe index. For recv completions this is the MSN of the ingress SEND. This value can be used to match log entries from this log with firmware flowc event entries.
| |
| |     - The sge timestamp value just before ringing the doorbell when posting, the sge timestamp value just after polling the completion, and the CQE.timestamp field from the completion itself. With these three timestamps we can track the latency from post to poll, and the amount of time the completion resided in the CQ before being reaped by the application. With debug firmware, the sge timestamp is also logged by firmware in its flowc history so that we can compute the latency from posting the work request until the firmware sees it.
| |
| |     Signed-off-by: Steve Wise <swise@opengridcomputing.com>
| |     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
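A rolling log sized as a power of two, as described in this commit message, is commonly implemented with a mask so that the write index wraps without a division. The following user-space sketch shows that technique under stated assumptions: the struct layout, field names, and function names are all hypothetical, and only the post and poll host times are modeled.

```c
#include <stdint.h>
#include <stdlib.h>

struct wr_log_entry {		/* hypothetical, reduced set of fields */
	uint64_t post_ns;	/* host time when the WR was posted */
	uint64_t poll_ns;	/* host time when the completion was polled */
};

struct wr_log {
	struct wr_log_entry *entries;
	uint32_t mask;		/* (1 << order) - 1 */
	uint32_t next;		/* count of entries ever written */
};

/* Allocate 2^order entries; order 12 would give the 4096-entry default. */
static int wr_log_init(struct wr_log *log, unsigned int order)
{
	log->entries = calloc((size_t)1 << order, sizeof(*log->entries));
	if (!log->entries)
		return -1;
	log->mask = (1u << order) - 1;
	log->next = 0;
	return 0;
}

/* Record one completed work request, overwriting the oldest slot when full. */
static void wr_log_add(struct wr_log *log, uint64_t post_ns, uint64_t poll_ns)
{
	struct wr_log_entry *e = &log->entries[log->next++ & log->mask];

	e->post_ns = post_ns;
	e->poll_ns = poll_ns;
}

/* Post-to-poll latency of the entry currently stored in a slot. */
static uint64_t wr_log_latency(const struct wr_log *log, uint32_t slot)
{
	const struct wr_log_entry *e = &log->entries[slot & log->mask];

	return e->poll_ns - e->post_ns;
}
```

With an order of 2 the fifth entry lands back in slot 0, which is the "rolling" behavior the message refers to. The caller owns `entries` and releases it with `free()`.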
* | cxgb4/iw_cxgb4: display TPTE on errors | Hariprasad Shenai | 2014-07-15 | 3 | -11/+76
| |     With ingress WRITE or READ RESPONSE errors, HW provides the offending stag from the packet. This patch adds logic to log the parsed TPTE in this case. cxgb4 now exports a function to read a TPTE entry from adapter memory.
| |
| |     Signed-off-by: Steve Wise <swise@opengridcomputing.com>
| |     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | cxgb4/iw_cxgb4: use firmware ord/ird resource limits | Hariprasad Shenai | 2014-07-15 | 5 | -34/+117
| |     Advertise a larger max read queue depth for qps, and gather the resource limits from fw and use them to avoid exhausting all the resources.
| |
| |     Design:
| |
| |     cxgb4:
| |
| |     Obtain the max_ordird_qp and max_ird_adapter device params from FW at init time and pass them up to the ULDs when they attach. If these parameters are not available, due to older firmware, then hard-code the values based on the known values for older firmware.
| |
| |     iw_cxgb4:
| |
| |     Fix the c4iw_query_device() to report these correct values based on adapter parameters. ibv_query_device() will always return:
| |
| |       max_qp_rd_atom = max_qp_init_rd_atom = min(module_max, max_ordird_qp)
| |       max_res_rd_atom = max_ird_adapter
| |
| |     Bump up the per qp max module option to 32, allowing it to be increased by the user up to the device max of max_ordird_qp. 32 seems to be sufficient to maximize throughput for streaming read benchmarks.
| |
| |     Fail connection setup if the negotiated IRD exhausts the available adapter ird resources. So the driver will track the amount of ird resource in use and not send an RI_WR/INIT to FW that would reduce the available ird resources below zero.
| |
| |     Signed-off-by: Steve Wise <swise@opengridcomputing.com>
| |     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
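The limit formula and the adapter-wide IRD accounting from this commit message can be sketched as follows. This is a single-threaded user-space model with hypothetical names (`struct ird_pool`, `ird_reserve()`, `ird_release()`); a driver would additionally need locking or atomics around the counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* max_qp_rd_atom = max_qp_init_rd_atom = min(module_max, max_ordird_qp) */
static uint32_t max_qp_rd_atom(uint32_t module_max, uint32_t max_ordird_qp)
{
	return module_max < max_ordird_qp ? module_max : max_ordird_qp;
}

struct ird_pool {		/* hypothetical adapter-wide accounting */
	uint32_t avail;		/* starts at max_ird_adapter */
};

/* Reserve IRD for a new connection; refuse rather than go below zero. */
static bool ird_reserve(struct ird_pool *pool, uint32_t ird)
{
	if (ird > pool->avail)
		return false;	/* caller fails connection setup */
	pool->avail -= ird;
	return true;
}

/* Return IRD to the pool when the connection goes away. */
static void ird_release(struct ird_pool *pool, uint32_t ird)
{
	pool->avail += ird;
}
```

Checking before subtracting keeps the unsigned counter from wrapping, which is the "below zero" condition the message rules out.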
* | iw_cxgb4: Detect Ing. Padding Boundary at run-time | Hariprasad Shenai | 2014-07-15 | 6 | -16/+43
| |     Updates iw_cxgb4 to determine the Ingress Padding Boundary from cxgb4_lld_info, and take subsequent actions.
| |
| |     Signed-off-by: Steve Wise <swise@opengridcomputing.com>
| |     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
* | rdma/cxgb4: Fixes cxgb4 probe failure in VM when PF is exposed through PCI Passthrough | Hariprasad Shenai | 2014-07-01 | 1 | -1/+2
|/
|     Change logic which determines our Physical Function at PCI Probe time. Now we read the PL_WHOAMI register and get the Physical Function.
|
|     Pass Physical Function to Upper Layer Drivers in lld_info structure in the new field "pf" added to lld_info. This is useful for the cases where the PF, say PF4, is attached to a Virtual Machine via some form of "PCI Pass Through" technology and the PCI Function shows up as PF0 in the VM.
|
|     Based on original work by Casey Leedom <leedom@chelsio.com>
|
|     Signed-off-by: Casey Leedom <leedom@chelsio.com>
|     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
|     Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next | Linus Torvalds | 2014-06-12 | 6 | -17/+123
|\
| |     Pull networking updates from David Miller:
| |
| |      1) Seccomp BPF filters can now be JIT'd, from Alexei Starovoitov.
| |      2) Multiqueue support in xen-netback and xen-netfront, from Andrew J Benniston.
| |      3) Allow tweaking of aggregation settings in cdc_ncm driver, from Bjørn Mork.
| |      4) BPF now has a "random" opcode, from Chema Gonzalez.
| |      5) Add more BPF documentation and improve test framework, from Daniel Borkmann.
| |      6) Support TCP fastopen over ipv6, from Daniel Lee.
| |      7) Add software TSO helper functions and use them to support software TSO in mvneta and mv643xx_eth drivers. From Ezequiel Garcia.
| |      8) Support software TSO in fec driver too, from Nimrod Andy.
| |      9) Add Broadcom SYSTEMPORT driver, from Florian Fainelli.
| |     10) Handle broadcasts more gracefully over macvlan when there are large numbers of interfaces configured, from Herbert Xu.
| |     11) Allow more control over fwmark used for non-socket based responses, from Lorenzo Colitti.
| |     12) Do TCP congestion window limiting based upon measurements, from Neal Cardwell.
| |     13) Support busy polling in SCTP, from Neal Horman.
| |     14) Allow RSS key to be configured via ethtool, from Venkata Duvvuru.
| |     15) Bridge promisc mode handling improvements from Vlad Yasevich.
| |     16) Don't use inetpeer entries to implement ID generation any more, it performs poorly, from Eric Dumazet.
| |
| |     * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits)
| |       rtnetlink: fix userspace API breakage for iproute2 < v3.9.0
| |       tcp: fixing TLP's FIN recovery
| |       net: fec: Add software TSO support
| |       net: fec: Add Scatter/gather support
| |       net: fec: Increase buffer descriptor entry number
| |       net: fec: Factorize feature setting
| |       net: fec: Enable IP header hardware checksum
| |       net: fec: Factorize the .xmit transmit function
| |       bridge: fix compile error when compiling without IPv6 support
| |       bridge: fix smatch warning / potential null pointer dereference
| |       via-rhine: fix full-duplex with autoneg disable
| |       bnx2x: Enlarge the dorq threshold for VFs
| |       bnx2x: Check for UNDI in uncommon branch
| |       bnx2x: Fix 1G-baseT link
| |       bnx2x: Fix link for KR with swapped polarity lane
| |       sctp: Fix sk_ack_backlog wrap-around problem
| |       net/core: Add VF link state control policy
| |       net/fsl: xgmac_mdio is dependent on OF_MDIO
| |       net/fsl: Make xgmac_mdio read error message useful
| |       net_sched: drr: warn when qdisc is not work conserving
| |       ...
| * iw_cxgb4: don't truncate the recv window size | Hariprasad Shenai | 2014-06-10 | 2 | -4/+52
| |     Fixed a bug that shows up with recv window sizes that exceed the size of the RCV_BUFSIZ field in opt0 (>= 1024K). If the recv window exceeds this, then we specify the max possible in opt0, and add the rest via RX_DATA_ACK credits.
| |
| |     Signed-off-by: Steve Wise <swise@opengridcomputing.com>
| |     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
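The split described in this commit message can be sketched as a clamp plus a remainder. The field encoding is an assumption made for illustration: the sketch treats RCV_BUFSIZ as a 10-bit field counted in 1 KB units, which is consistent with the ">= 1024K" threshold in the message but is not stated there. `split_rcv_win()` and its struct are hypothetical names.

```c
#include <stdint.h>

/* Assumed encoding: 10-bit field, in units of 1 KB (max 1023 KB). */
#define RCV_BUFSIZ_MAX 0x3FFu

struct rcv_win_split {
	uint32_t opt0_units;	/* value to place in the opt0 field */
	uint32_t extra_bytes;	/* remainder to grant via RX_DATA_ACK credits */
};

static struct rcv_win_split split_rcv_win(uint32_t rcv_win)
{
	struct rcv_win_split s;
	uint32_t units = rcv_win >> 10;

	if (units > RCV_BUFSIZ_MAX) {
		s.opt0_units = RCV_BUFSIZ_MAX;	/* max the field can express */
		s.extra_bytes = rcv_win - (RCV_BUFSIZ_MAX << 10);
	} else {
		s.opt0_units = units;		/* fits: nothing left over */
		s.extra_bytes = 0;
	}
	return s;
}
```

For a 2 MB window under these assumptions, the field carries 1023 KB and the remaining 1049600 bytes would be granted later as credits, instead of being silently truncated.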
| * iw_cxgb4: Choose appropriate hw mtu index and ISS for iWARP connections | Hariprasad Shenai | 2014-06-10 | 2 | -11/+63
| |     Select the appropriate hw mtu index and initial sequence number to optimize hw memory performance.
| |
| |     Add new cxgb4_best_aligned_mtu() which allows callers to provide enough information to be used to [possibly] select an MTU which will result in the TCP Data Segment Size (AKA Maximum Segment Size) to be an aligned value.
| |
| |     If an RTR message exchange is required, then align the ISS to 8B - 1 + 4, so that after the SYN the send seqno will align on a 4B boundary. The RTR message exchange will leave the send seqno aligned on an 8B boundary. If an RTR is not required, then align the ISS to 8B - 1. The goal is to have the send seqno be 8B aligned when we send the first FPDU.
| |
| |     Based on original work by Casey Leedom <leeedom@chelsio.com> and Steve Wise <swise@opengridcomputing.com>
| |
| |     Signed-off-by: Casey Leedom <leedom@chelsio.com>
| |     Signed-off-by: Steve Wise <swise@opengridcomputing.com>
| |     Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
| |     Signed-off-by: David S. Miller <davem@davemloft.net>
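The ISS alignment rule in this commit message can be worked through with plain unsigned arithmetic. The sketch is a user-space illustration: `aligned_iss()` is a hypothetical name, and `random` stands for whatever random value the caller starts from.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Pick an initial send sequence number from a random value.
 * The SYN consumes one sequence number, so the first data byte is iss + 1.
 */
static uint32_t aligned_iss(uint32_t random, bool rtr_required)
{
	uint32_t iss = (random & ~7u) - 1;	/* 8B - 1 */

	if (rtr_required)
		iss += 4;			/* 8B - 1 + 4 */
	return iss;
}
```

Without an RTR, `iss + 1` is a multiple of 8. With an RTR, `iss + 1` is 4 modulo 8, that is, 4B aligned, and per the message the RTR exchange then brings the send seqno to an 8B boundary. The arithmetic relies on `uint32_t` wrap-around, so a masked value of 0 yields an ISS of 0xFFFFFFFF and a first data byte at 0.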
| * iw_cxgb4: Allocate and use IQs specifically for indirect interruptsHariprasad Shenai2014-06-103-2/+8
| | Currently indirect interrupts for RDMA CQs funnel through the LLD's RDMA RXQs, which also handle direct interrupts for offload CPLs during RDMA connection setup/teardown. The intended T4 usage model, however, is to have indirect interrupts flow through dedicated IQs, i.e., not to mix indirect interrupts with CPL messages in an IQ. This patch adds the concept of RDMA concentrator IQs, or CIQs, set up and maintained by the LLD and exported to iw_cxgb4 for use when creating CQs. RDMA CPLs will flow through the LLD's RDMA RXQs, and CQ interrupts flow through the CIQs. Design: cxgb4 creates and exports an array of CIQs for the RDMA ULD. These IQs are sized according to the maximum number of CQs available at adapter init. In addition, these IQs don't need FL buffers since they only service indirect interrupts. One CIQ is set up per RX channel, similar to the RDMA RXQs. iw_cxgb4 will utilize these CIQs based on the vector value passed into create_cq(). The num_comp_vectors advertised by iw_cxgb4 will be the number of CIQs configured, and thus the vector value will be the index into the array of CIQs. Based on original work by Steve Wise <swise@opengridcomputing.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
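The vector-to-CIQ mapping in the design note above amounts to an array lookup. The sketch below is illustrative only: `struct ciq_table` and `ciq_for_vector` are hypothetical names, and the bounds check is an assumption about how an out-of-range vector might be rejected.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the CIQ array exported by the LLD to the RDMA ULD. */
struct ciq_table {
    const uint16_t *ids; /* one CIQ id per RX channel */
    size_t count;        /* doubles as the advertised num_comp_vectors */
};

/*
 * Resolve the completion vector passed to create_cq() to a CIQ id.
 * Returns 0 on success, -1 if the vector is out of range.
 */
static int ciq_for_vector(const struct ciq_table *t, unsigned int vector,
                          uint16_t *ciq_id)
{
    if (vector >= t->count)
        return -1;
    /* The vector value is used directly as the index into the array. */
    *ciq_id = t->ids[vector];
    return 0;
}
```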
| |
| \
*-. \ Merge branches 'core', 'cxgb3', 'cxgb4', 'iser', 'iwpm', 'misc', 'mlx4', ↵Roland Dreier2014-06-106-56/+279
|\ \ \ | |_|/ |/| | | | | 'mlx5', 'noio', 'ocrdma', 'qib', 'srp' and 'usnic' into for-next
| | * RDMA/cxgb4: Add support for iWARP Port Mapper user space serviceSteve Wise2014-06-103-43/+262
| |/ |/| | | | | | | | | | | | | | | | | Based on original work by Vipul Pandya. Signed-off-by: Steve Wise <swise@opengridcomputing.com> [ Fix htons -> ntohs to make sparse happy. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>
| * RDMA/cxgb4: add missing padding at end of struct c4iw_alloc_ucontext_respYann Droneaud2014-06-052-2/+4
| | The i386 ABI disagrees with most other ABIs regarding alignment of data types larger than 4 bytes: on most ABIs a padding must be added at end of the structures, while it is not required on i386. So for most ABIs struct c4iw_alloc_ucontext_resp gets implicitly padded to be aligned on an 8-byte multiple, while for i386, such padding is not added.

    The tool pahole can be used to find such implicit padding:

      $ pahole --anon_include \
               --nested_anon_include \
               --recursive \
               --class_name c4iw_alloc_ucontext_resp \
               drivers/infiniband/hw/cxgb4/iw_cxgb4.o

    Then, structure layout can be compared between i386 and x86_64:

      +++ obj-i386/drivers/infiniband/hw/cxgb4/iw_cxgb4.o.pahole.txt 2014-03-28 11:43:05.547432195 +0100
      --- obj-x86_64/drivers/infiniband/hw/cxgb4/iw_cxgb4.o.pahole.txt 2014-03-28 10:55:10.990133017 +0100
      @@ -2,9 +2,8 @@ struct c4iw_alloc_ucontext_resp {
           __u64 status_page_key; /* 0 8 */
           __u32 status_page_size; /* 8 4 */
      -    /* size: 12, cachelines: 1, members: 2 */
      -    /* last cacheline: 12 bytes */
      +    /* size: 16, cachelines: 1, members: 2 */
      +    /* padding: 4 */
      +    /* last cacheline: 16 bytes */
       };

    This ABI disagreement will make an x86_64 kernel try to write past the buffer provided by an i386 binary. Once boundary checks are implemented, the x86_64 kernel will refuse to write past the i386 userspace provided buffer and the uverb will fail. If the structure is on a page boundary and the next page is not mapped, ib_copy_to_udata() will fail and the uverb will fail.

    Additionally, as reported by Dan Carpenter, without the implicit padding being properly cleared, an information leak would take place in most architectures.

    This patch adds an explicit padding to struct c4iw_alloc_ucontext_resp, and, like 92b0ca7cb149 ("IB/mlx5: Fix stack info leak in mlx5_ib_alloc_ucontext()"), makes c4iw_alloc_ucontext() not write this padding field to userspace. This way, an x86_64 kernel will be able to write struct c4iw_alloc_ucontext_resp as expected by unpatched and patched i386 libcxgb4.

    Link: http://marc.info/?i=cover.1399309513.git.ydroneaud@opteya.com
    Link: http://marc.info/?i=1395848977.3297.15.camel@localhost.localdomain
    Link: http://marc.info/?i=20140328082428.GH25192@mwanda
    Cc: <stable@vger.kernel.org>
    Fixes: 05eb23893c2c ("cxgb4/iw_cxgb4: Doorbell Drop Avoidance Bug Fixes")
    Reported-by: Yann Droneaud <ydroneaud@opteya.com>
    Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
    Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
    Acked-by: Steve Wise <swise@opengridcomputing.com>
    Signed-off-by: Roland Dreier <roland@purestorage.com>
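The padding fix described in this commit message can be sketched in userspace C. This is an illustration under stated assumptions, not the kernel patch: the struct mirrors the two fields named in the message plus an explicit `reserved` member, while `struct alloc_ucontext_resp` and `copy_resp` are hypothetical names and `memcpy` stands in for the copy to userspace.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* With the explicit padding, the layout is 16 bytes on both i386 and x86_64. */
struct alloc_ucontext_resp {
    uint64_t status_page_key;  /* offset 0, size 8 */
    uint32_t status_page_size; /* offset 8, size 4 */
    uint32_t reserved;         /* offset 12, explicit padding, never copied */
};

/*
 * Copy only the meaningful fields, stopping before the padding, so a
 * 12-byte buffer from an unpatched i386 caller is not overrun and no
 * uninitialised padding bytes are leaked.
 * Returns the number of bytes copied, or 0 if the buffer is too small.
 */
static size_t copy_resp(void *dst, size_t dst_len,
                        const struct alloc_ucontext_resp *resp)
{
    size_t len = offsetof(struct alloc_ucontext_resp, reserved);

    if (dst_len < len)
        return 0;
    memcpy(dst, resp, len);
    return len;
}
```

Stopping at `offsetof(..., reserved)` keeps the copied length at 12 bytes regardless of which ABI compiled the caller, which is the property the commit message relies on for unpatched i386 binaries.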