path: root/sys/ofed
Commit message  Author  Age  Files  Lines
* MF11 r320947; MFC r320876:  hselasky  2017-07-13  1  -1/+7
| Make sure the mlx4en RX DMA ring gets stamped with software ownership in order to prevent the QP from going into the error state in the firmware once UPDATE_QP is called.
|
| Approved by: re (marius)
| Sponsored by: Mellanox Technologies
* MFC r319972:  hselasky  2017-06-18  1  -2/+2
| Use static device numbering instead of dynamic numbering when creating mlx4en network interfaces. This prevents infinite unit number growth, typically when the mlx4en driver is used inside virtual machines which support runtime PCI attach and detach.
|
| Approved by: re (gjb)
| Sponsored by: Mellanox Technologies
* MFC r319413:  hselasky  2017-06-04  1  -3/+2
| Free the hardware queue resource after the port is stopped in the mlx4en(4) driver. Otherwise, if the port is still up, the resource might still be busy and the MTT free will fail.
|
| PR: 216493
| Approved by: re (kib)
| Sponsored by: Mellanox Technologies
* MFC r319414:  hselasky  2017-06-04  1  -30/+12
| Allow communication between functions on the same host when using the mlx4en(4) driver in SRIOV mode.
|
| Place a copy of the destination MAC address in the send WQE only under SRIOV/eSwitch configuration or when the device is in selftest. This allows communication between functions on the same host.
|
| PR: 216493
| Approved by: re (kib)
| Sponsored by: Mellanox Technologies
* MFC r314131:  np  2017-05-24  1  -2/+2
| Avoid NULL dereference in a couple of sysctl handlers in ibcore. iw_cxgbe sets ib_device->dma_device to NULL (since r311880).
|
| Sponsored by: Chelsio Communications
* MFC r318531:  hselasky  2017-05-22  2  -5/+4
| mlx4: Use the CQ quota for SRIOV when creating completion EQs
|
| When creating EQs to handle CQ completion events for the PF or for VFs, we create enough EQE entries to handle completions for the max number of CQs that can use that EQ.
|
| When SRIOV is activated, the max number of CQs a VF (or the PF) can obtain is its CQ quota (determined by the Hypervisor resource tracker). Therefore, when creating an EQ, the number of EQE entries that the VF should request for that EQ is the CQ quota value (and not the total number of CQs available in the firmware).
|
| Under SRIOV, the PF must also use its CQ quota, because the resource tracker also controls how many CQs the PF can obtain.
|
| Using the firmware total CQs instead of the CQ quota when creating EQs resulted in wasting MTT entries, due to allocating more EQEs than were needed.
|
| Sponsored by: Mellanox Technologies
* MFC r317505:  hselasky  2017-05-19  2  -8/+8
| Don't free uninitialized sysctl contexts in the mlx4en driver. This can cause NULL pointer panics during failed device attach.
|
| Differential Revision: https://reviews.freebsd.org/D8876
| Sponsored by: Mellanox Technologies
* MFC r313555:  hselasky  2017-05-19  6  -24/+167
| Flexible and asymmetric allocation of EQs and MSI-X vectors for PF/VFs.
|
| Previously, the mlx4 driver queried the firmware in order to get the number of supported EQs. Under SRIOV, since this was done before the driver notified the firmware how many VFs it actually needs, the firmware had to take into account a worst case scenario and always allocated four EQs per VF, where one was used for events while the others were used for completions.
|
| Now, when the firmware supports the asymmetric allocation scheme, denoted by exposing num_sys_eqs > 0 (--> MLX4_DEV_CAP_FLAG2_SYS_EQS), we use the QUERY_FUNC command to query the firmware before enabling SRIOV. Thus we can get more EQs and MSI-X vectors per function.
|
| Moreover, when running in the new firmware/driver mode, the limitation that the number of EQs should be a power of two is lifted.
|
| Obtained from: Linux (dual BSD/GPLv2 licensed)
| Submitted by: Dexuan Cui @ microsoft . com
| Differential Revision: https://reviews.freebsd.org/D8867
| Sponsored by: Mellanox Technologies
* MFC r313556:  hselasky  2017-05-19  6  -2/+54
| Change mlx4 QP allocation scheme.
|
| When using Blue-Flame (BF), the QPN overrides the VLAN, CV, and SV fields in the WQE. Thus, BF may only be used for QPNs with bits 6 and 7 unset.
|
| The current ethernet driver code reserves a TX QP range with 256b alignment.
|
| This is wrong because if there are more than 64 TX QPs in use, QPNs >= base + 65 will have bits 6/7 set.
|
| This problem is not specific to the Ethernet driver: any entity that tries to reserve more than 64 BF-enabled QPs should fail. Also, using ranges is not necessary here and is wasteful.
|
| The new mechanism introduced here will support reservation for "Eth QPs eligible for BF" for all drivers: bare-metal, multi-PF, and VFs (when hypervisors support WC in VMs). The flow we use is:
|
| 1. In mlx4_en, allocate Tx QPs one by one instead of a range allocation, and request "BF enabled QPs" if BF is supported for the function.
|
| 2. In the ALLOC_RES FW command, change param1 to:
|    a. param1[23:0] - number of QPs
|    b. param1[31:24] - flags controlling QPs reservation
|
| Bit 31 refers to Eth blueflame supported QPs. Those QPs must have bits 6 and 7 unset in order to be used in Ethernet.
|
| Bits 24-30 of the flags are currently reserved.
|
| When a function tries to allocate a QP, it states the required attributes for this QP. Those attributes are considered "best-effort". If an attribute, such as Ethernet BF enabled QP, is a must-have attribute, the function has to check that the attribute is supported before trying to do the allocation.
|
| In a lower layer of the code, mlx4_qp_reserve_range masks out the bits which are unsupported. If SRIOV is used, the PF validates those attributes and masks out unsupported attributes as well. In order to notify VFs which attributes are supported, the VF uses the QUERY_FUNC_CAP command. This command's mailbox is filled by the PF, which notifies which QP allocation attributes it supports.
|
| Obtained from: Linux (dual BSD/GPLv2 licensed)
| Submitted by: Dexuan Cui @ microsoft . com
| Differential Revision: https://reviews.freebsd.org/D8868
| Sponsored by: Mellanox Technologies
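The bit layout described in this commit message can be sketched in a few lines of C. This is only an illustration under stated assumptions: the macro and function names below are hypothetical and are not taken from the mlx4 sources; only the bit positions (QPN bits 6 and 7, param1[23:0] for the count, param1 bit 31 for the Eth BF flag) come from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; only the bit positions come from the commit message. */
#define QPN_BF_FORBIDDEN_BITS 0xC0u       /* bits 6 and 7 of the QPN */
#define ALLOC_COUNT_MASK      0x00FFFFFFu /* param1[23:0]: number of QPs */
#define ALLOC_FLAG_ETH_BF     (1u << 31)  /* param1 bit 31: Eth BF-capable QPs */

/* A QPN may be used with Blue-Flame only if bits 6 and 7 are both clear. */
static bool qpn_bf_eligible(uint32_t qpn)
{
    return (qpn & QPN_BF_FORBIDDEN_BITS) == 0;
}

/* Pack the QP count and the optional Eth BF request flag into one word. */
static uint32_t alloc_res_param1(uint32_t count, bool want_eth_bf)
{
    assert(count <= ALLOC_COUNT_MASK); /* count must fit in 24 bits */
    return count | (want_eth_bf ? ALLOC_FLAG_ETH_BF : 0u);
}
```

With a 256-aligned base such as 0x100, the check passes for the first 64 QPNs and fails once bit 6 becomes set, which is why a contiguous range cannot hold more than 64 BF-enabled QPs.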
* MFC r310232:  dim  2017-03-15  1  -1/+1
| After r310171, the kernel version of sscanf() has format string checking enabled. This results in a -Werror warning in mlx4ib:
|
| sys/dev/mlx4/mlx4_ib/mlx4_ib_sysfs.c:90:22: error: format specifies type 'unsigned long long *' but the argument has type 'u64 *' (aka 'unsigned long *') [-Werror,-Wformat]
|         sscanf(buf, "%llx", &sysadmin_ag_val);
|                      ~~~~   ^~~~~~~~~~~~~~~~
|
| Change sysadmin_ag_val to unsigned long long to avoid the warning.
|
| Reviewed by: hselasky
| Differential Revision: https://reviews.freebsd.org/D8831
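The fix described above amounts to making the argument type match the "%llx" conversion. A minimal userland sketch follows; the helper name parse_hex_ull is hypothetical and this is standard C sscanf(), not the kernel version mentioned in the commit.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse a hexadecimal value from buf into *out.
 * Returns 1 on success and 0 if no value could be converted.
 * "%llx" requires a pointer to unsigned long long, so the
 * destination is declared with exactly that type.
 */
static int parse_hex_ull(const char *buf, unsigned long long *out)
{
    assert(buf != NULL && out != NULL);
    return sscanf(buf, "%llx", out) == 1;
}
```

Declaring the destination as unsigned long long, rather than a typedef that may resolve to unsigned long on some platforms, keeps the format string and the argument in agreement everywhere.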
* MFC r313778:  hselasky  2017-03-14  1  -1/+1
| Improve code readability and fix compilation error when using clang 4.x.
|
| Found by: emaste @
| Sponsored by: Mellanox Technologies
* MFC r314400:  np  2017-03-03  2  -19/+10
| cxgbe/iw_cxgbe: fix various double-close panics with iWARP sockets.
|
| Sockets representing the TCP endpoints for iWARP connections are allocated by the ibcore module. Before this revision they were closed either by the ibcore module or the iw_cxgbe hardware driver depending on the state transitions during connection teardown. This is error-prone and there were cases where both iw_cxgbe and ibcore closed the socket, leading to double-free panics. The fix is to let ibcore close the sockets it creates and never do it in the driver.
|
| - Use sodisconnect instead of soclose (preceded by solinger = 0) in the driver to tear down an RDMA connection abruptly. This does what's intended without releasing the socket's fd reference.
|
| - Close the socket in ibcore when the iWARP iw_cm_id is destroyed. This works for all kinds of sockets: clients that initiate connections, listeners, and sockets accepted off of listeners.
|
| Sponsored by: Chelsio Communications
* MFC r310058:  hselasky  2017-01-09  1  -27/+41
| Fix initialisation of mlx4_pci_table's .driver_data fields.
|
| Differential Revision: https://reviews.freebsd.org/D8791
| Sponsored by: Mellanox Technologies
| Submitted by: Dexuan Cui <decui@microsoft.com>
* MFC 304838:  jhb  2016-12-01  1  -0/+1
| Do not free an uninitialized pointer on soaccept failure in the iWARP connection manager.
|
| Sponsored by: Chelsio Communications
* MFC r308031:  hselasky  2016-11-07  1  -8/+7
| Fix indentation and remove duplicate queue stopped stats increment.
|
| Found by: Ryan Stone <rysto32@gmail.com>
| Sponsored by: Mellanox Technologies
* MFC r306454:  hselasky  2016-10-10  1  -0/+1
| Set hardware stats flag to avoid double counting the number of incoming bytes.
|
| Found by: Ben RUBSON <ben.rubson@gmail.com>
| Sponsored by: Mellanox Technologies
* MFC r304342:  hselasky  2016-08-26  1  -0/+16
| Add support for setting blocking and non-blocking mode on /dev/rdma_cm by returning success on FIONBIO and FIOASYNC IOCTLs. The actual flags handling is done by the kern_ioctl() function.
|
| Reported by: Alex Bowden <alex.bowden@outlook.com>
| Sponsored by: Mellanox Technologies
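The idea of "returning success" for the two mode-setting requests can be illustrated with a small dispatch function. This is a userland sketch, not the driver code: the function name is hypothetical, and returning ENOTTY for every other request is an assumption made only to complete the example.

```c
#include <errno.h>
#include <sys/ioctl.h>

/*
 * Hypothetical ioctl dispatcher sketch.
 * FIONBIO and FIOASYNC are accepted without further work, on the
 * premise (from the commit message) that the generic ioctl layer
 * handles the actual flag changes. Everything else is rejected with
 * ENOTTY, which is an assumption for illustration only.
 */
static int rdma_cm_ioctl_sketch(unsigned long cmd)
{
    switch (cmd) {
    case FIONBIO:
    case FIOASYNC:
        return 0;
    default:
        return ENOTTY;
    }
}
```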
* MFC r303786  markj  2016-08-14  2  -0/+18
| mthca: Add a wrapper for the firmware's DIAG_RPRT command.
* Fix bug in iwcm that caused a panic in iw_cm_wq when krping is run  np  2016-06-14  1  -2/+11
| repeatedly in a tight loop.
|
| Approved by: re (gjb@)
| Obtained from: hselasky@ (part of larger changes in D5791)
* net: Use M_HASHTYPE_OPAQUE_HASH if the mbuf flowid has hash properties  sephe  2016-06-07  1  -1/+1
| Reviewed by: hps, erj, tuexen
| Sponsored by: Microsoft OSTC
| Differential Revision: https://reviews.freebsd.org/D6688
* Fix up the Infiniband code to handle the new arpresolve.  gnn  2016-06-02  2  -4/+4
|
* Prepare for activation of LinuxKPI module parameters as read-only  hselasky  2016-05-25  19  -2/+42
| tunable SYSCTLs.
|
| Linux module parameters are associated with the module they belong to. FreeBSD does not share this concept of a parent module. Instead, add macros which define the prefix to use for the module parameters in the LinuxKPI consumers.
|
| While at it, convert all "bool" LinuxKPI module parameters to "byte" type, because we don't have a "bool" type of SYSCTL in FreeBSD.
|
| Sponsored by: Mellanox Technologies
| MFC after: 1 week
* Don't repeat the the word 'the'  eadler  2016-05-17  3  -3/+3
| (one manual change to fix grammar)
|
| Confirmed With: db
| Approved by: secteam (not really, but this is a comment typo fix)
* sys/ofed: minor spelling fix.  pfg  2016-05-06  1  -1/+1
| No functional change.
|
| Reviewed by: hselasky
* ofed/drivers: minor spelling fixes.  pfg  2016-05-06  10  -14/+14
| No functional change.
|
| Reviewed by: hselasky
* Fix NOIP kernels to compile.  bz  2016-04-24  1  -0/+2
|
* More fixes for using IPv6 addresses with RDMA:  hselasky  2016-04-22  7  -24/+103
| - Added check that the SCOPE ID is only restored for IPv6 link-local addresses.
|
| - Changes made by r237263 in the "cma_bind_addr()" function did not check if the socket address was of type IPv6 and used the IPv4 socket address for IPv6 addresses. This caused the function to fail. Fixed this.
|
| - In the "rdma_gid2ip()" function and some other places the "sin6_len" and "sin6_scope_id" fields were not set for IPv6 socket addresses. Fixed this.
|
| - The scope ID is not stored as part of the GID entries and must be passed as an argument to "rdma_gid2ip()".
|
| - Added new method to "struct ib_device" which returns a pointer to the network interface which belongs to the given infiniband device. This is needed to be able to get the scope ID for IPv6 addresses via the associated ethernet interface.
|
| - Added convenience function, "rdma_get_ipv6_scope_id()", to get the scope ID for IPv6 addresses.
|
| - Implemented new "get_netdev" method for mlx4ib. Other IB controller drivers which want to support IPv6 addresses need to implement this as well.
|
| - Bumped the FreeBSD version due to changing "struct ib_device".
|
| Sponsored by: Mellanox Technologies
| MFC after: 1 week
* Add KASSERT() and set error code in dead code case to help static code  hselasky  2016-04-22  1  -0/+2
| analysis tools.
|
| Suggested by: ngie@
| Sponsored by: Mellanox Technologies
| MFC after: 1 week
* Add missing set of the current VNET when inputting IP packets in IPoIB.  hselasky  2016-04-22  1  -2/+7
| This fixes a kernel panic when using IPoIB with VIMAGE and infiniband.
|
| PR: 208957
| Sponsored by: Mellanox Technologies
| Tested by: Justin Clift <justin@postgresql.org>
| MFC after: 1 week
* Fix for using IPv6 addresses with RDMA:  hselasky  2016-04-21  1  -39/+108
| IPv6 addresses have a scope ID which is sometimes stored in the "sin6_scope_id" field of "struct sockaddr_in6" and sometimes as part of the IPv6 address itself, depending on the context. If the scope ID is not in the expected location, the IPv6 address lookups in the so-called GID table will fail. Some code refactoring has been done to achieve a clean exit of the "addr_resolve" function via a common "done" label.
|
| Sponsored by: Mellanox Technologies
| Submitted by: Shani Michaeli <shanim@mellanox.com>
| MFC after: 1 week
* Fix for resolving MAC address when the destination address is a gateway.  hselasky  2016-04-21  1  -4/+5
| Remove some dead code while at it.
|
| Sponsored by: Mellanox Technologies
| MFC after: 1 week
* Properly setup arguments for if_resolvemulti() callback.  hselasky  2016-04-21  1  -1/+7
| Sponsored by: Mellanox Technologies
| MFC after: 1 week
* Fix inverted priv check calls. Priv check returns zero on success and  hselasky  2016-04-20  1  -3/+3
| an error code on failure. Refer to priv_check(9).
|
| Sponsored by: Mellanox Technologies
| MFC after: 1 week
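The convention stated above (zero on success, an error code on failure) is easy to invert by accident. The sketch below shows the intended pattern with a userland stand-in; priv_check_stub and guarded_operation are hypothetical names, and the stub only imitates the return convention rather than calling the real kernel interface.

```c
#include <errno.h>

/* Stand-in for a priv_check(9)-style call: 0 means the privilege is held. */
static int priv_check_stub(int has_priv)
{
    return has_priv ? 0 : EPERM;
}

/* Returns 0 if the operation may proceed, or the error code otherwise. */
static int guarded_operation(int has_priv)
{
    int error = priv_check_stub(has_priv);

    if (error != 0)   /* non-zero means the check failed: deny */
        return error;
    /* ... privileged work would go here ... */
    return 0;
}
```

Writing the test as `if (!priv_check_stub(...)) return EPERM;` would be the inverted form: it denies callers that hold the privilege and admits those that do not.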
* ofed: for pointers replace 0 with NULL.  pfg  2016-04-15  2  -2/+2
| These are mostly cosmetic, no functional change.
|
| Found with devel/coccinelle.
|
| Reviewed by: hselasky
* Remove some unused fields.  hselasky  2016-04-14  3  -13/+0
| Sponsored by: Mellanox Technologies
| MFC after: 1 week
* Ensure the received IP header is 32-bit aligned.  hselasky  2016-04-14  2  -4/+12
| FreeBSD's TCP/IP stack assumes that the IP header is 32-bit aligned when decoding it. Otherwise, unaligned 32-bit memory accesses can happen, which not all processor architectures support.
|
| Sponsored by: Mellanox Technologies
| MFC after: 1 week
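Why alignment becomes an issue can be shown with a little arithmetic. The sketch below assumes plain Ethernet framing with a 14-byte header, which the commit message does not state, and it does not claim to reflect how the driver change was implemented; the names are hypothetical.

```c
#include <stddef.h>

#define ETH_HDR_LEN 14u /* assumed: untagged Ethernet header length */

/*
 * Number of padding bytes to place before the frame so that the
 * byte following the Ethernet header (the start of the IP header)
 * lands on a 4-byte boundary, given the frame's starting offset.
 */
static size_t ip_align_pad(size_t start_offset)
{
    return (4u - ((start_offset + ETH_HDR_LEN) % 4u)) % 4u;
}
```

A frame starting on a 4-byte boundary puts the IP header at offset 14, which is only 2-byte aligned, so two bytes of padding are needed; a frame starting at offset 2 needs none.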
* Add missing port_up checks.  hselasky  2016-04-14  1  -2/+17
| When downing an mlxen network adapter we need to check the port_up variable to ensure we don't continue to transmit data or restart timers which can reside in freed memory.
|
| Sponsored by: Mellanox Technologies
| MFC after: 1 week
* Cleanup unnecessary semicolons from the kernel.  pfg  2016-04-10  3  -3/+3
| Found with devel/coccinelle.
* tcp/lro: Use tcp_lro_flush_all in device drivers to avoid code duplication  sephe  2016-04-01  1  -7/+1
| And factor out tcp_lro_rx_done, which deduplicates the same logic with netinet/tcp_lro.c
|
| Reviewed by: gallatin (1st version), hps, zbb, np, Dexuan Cui <decui microsoft com>
| Sponsored by: Microsoft OSTC
| Differential Revision: https://reviews.freebsd.org/D5725
* Remove unnecessary dequeue_mutex (added in r294610) from the iWARP  np  2016-03-30  1  -5/+0
| connection manager. Examining so_comp without synchronization with iw_so_event_handler is a harmless race.
|
| Submitted by: Krishnamraju Eraparaju @ Chelsio
| Reviewed by: Steve Wise @ Open Grid Computing
| Sponsored by: Chelsio Communications
* Add missing curly brackets in for loop.  hselasky  2016-03-17  1  -2/+3
| Sponsored by: Mellanox Technologies
| MFC after: 1 week
* Use hardware computed Toeplitz hash for incoming flowids  hselasky  2016-03-15  2  -2/+3
| Use the Toeplitz hash value as source for the flowid. This makes the hash value more suitable for so-called hash bucket algorithms which are used in FreeBSD's TCP/IP stack when RSS is enabled.
|
| Sponsored by: Mellanox Technologies
| MFC after: 1 week
* Fix witness panic in the ipoib_ioctl() function when unloading the  hselasky  2016-03-15  2  -0/+7
| ipoib module.
|
| The bpfdetach() function is trying to turn off promiscuous mode on the network interface it is attached to while holding a mutex. The fix consists of ignoring any further calls to the ipoib_ioctl() function when the network interface is going to be detached. The ipoib_ioctl() function might sleep.
|
| Sponsored by: Mellanox Technologies
| MFC after: 1 week
* Use SI_SUB_LAST instead of SI_SUB_SMP as the "catch-all" subsystem.  jhb  2016-03-11  3  -3/+3
| Reviewed by: kib
| Sponsored by: Netflix
| Differential Revision: https://reviews.freebsd.org/D5515
* Whitespace fixes.  hselasky  2016-03-04  10  -95/+95
| MFC after: 1 week
| Sponsored by: Mellanox Technologies
* LinuxKPI list updates:  hselasky  2016-01-26  1  -2/+2
| - Add some new hlist macros.
| - Update existing hlist macros removing the need for a temporary iteration variable.
| - Properly define the RCU hlist macros to be SMP safe with regard to RCU.
| - Safeguard list macro arguments by adding a pair of parentheses.
| - Prefix the _list_add() and _list_splice() functions with "linux" to reflect they are LinuxKPI internal functions.
|
| Obtained from: Linux
| MFC after: 1 week
| Sponsored by: Mellanox Technologies
* Fix for iWARP servers that listen on INADDR_ANY.  np  2016-01-22  4  -9/+373
| The iWARP Connection Manager (CM) on FreeBSD creates a TCP socket to represent an iWARP endpoint when the connection is over TCP. For servers the current approach is to invoke create_listen callback for each iWARP RNIC registered with the CM. This doesn't work too well for INADDR_ANY because a listen on any TCP socket already notifies all hardware TOEs/RNICs of the new listener. This patch fixes the server side of things for FreeBSD. We've tried to keep all these modifications in the iWARP/TCP specific parts of the OFED infrastructure as much as possible.
|
| Submitted by: Krishnamraju Eraparaju @ Chelsio (with design inputs from Steve Wise)
| Sponsored by: Chelsio Communications
| Differential Revision: https://reviews.freebsd.org/D4801
* Finish r275196: do not dereference rtentry in if_output() routines.  melifaro  2016-01-09  1  -6/+2
| The only piece of information that is required is a subset of rt_flags.
|
| In particular, if_loop() requires the RTF_REJECT and RTF_BLACKHOLE flags to check if this particular mbuf needs to be dropped (and what error should be returned). Note that if_loop() will always return EHOSTUNREACH for "reject" routes regardless of RTF_HOST flag existence. This is due to upcoming routing changes where the RTF_HOST value won't be available as a lookup result.
|
| All other functions require the RTF_GATEWAY flag to check if they need to return EHOSTUNREACH instead of EHOSTDOWN error.
|
| There are 11 places where a non-zero 'struct route' is passed to if_output(). Most of the callers (forwarding, bpf, arp) do not care about the exact error value. In fact, the only place where this result is propagated is ip_output(). (ip6_output() passes a NULL route to nd6_output_ifp()).
|
| Given that, add 3 new 'struct route' flags (RT_REJECT, RT_BLACKHOLE and RT_IS_GW) and an inline function (rt_update_ro_flags()) to copy the necessary rte flags to ro_flags. Call this function in ip_output() after looking up/verifying the rte.
|
| Reviewed by: ae
* Make it possible for sbappend() to preserve M_NOTREADY on mbufs, just like  glebius  2016-01-08  1  -1/+1
| sbappendstream() does. Although M_NOTREADY may appear only on SOCK_STREAM sockets, due to sendfile(2) supporting only the latter, there is a corner case of an AF_UNIX/SOCK_STREAM socket that still uses records for the sake of control data, despite being a stream socket.
|
| Provide a private version of m_clrprotoflags(), which understands PRUS_NOTREADY, similar to m_demote().
* Remove unused file.  hselasky  2016-01-07  1  -1/+0
|