| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
Make sure the mlx4en RX DMA ring gets stamped with software ownership
in order to prevent the flow of QP to error in the firmware once
UPDATE_QP is called.
Approved by: re (marius)
Sponsored by: Mellanox Technologies
|
|
|
|
|
|
|
|
|
|
| |
Use static device numbering instead of dynamic one when creating
mlx4en network interfaces. This prevents infinite unit number growth
typically when the mlx4en driver is used inside virtual machines which
support runtime PCI attach and detach.
Approved by: re (gjb)
Sponsored by: Mellanox Technologies
|
|
|
|
|
|
|
|
|
|
| |
Free hardware queue resource after port is stopped in the mlx4en(4)
driver. Else if the port is up the resource might still be busy and
the MTT free will fail.
PR: 216493
Approved by: re (kib)
Sponsored by: Mellanox Technologies
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allow communication between functions on the same host when using the
mlx4en(4) driver in SRIOV mode.
Place a copy of the destination MAC address in the send WQE only under
SRIOV/eSwitch configuration or when the device is in selftest. This
allows communication between functions on the same host.
PR: 216493
Approved by: re (kib)
Sponsored by: Mellanox Technologies
|
|
|
|
|
|
|
| |
Avoid NULL dereference in a couple of sysctl handlers in ibcore.
iw_cxgbe sets ib_device->dma_device to NULL (since r311880).
Sponsored by: Chelsio Communications
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
mlx4: Use the CQ quota for SRIOV when creating completion EQs
When creating EQs to handle CQ completion events for the PF or for
VFs, we create enough EQE entries to handle completions for the max
number of CQs that can use that EQ.
When SRIOV is activated, the max number of CQs a VF (or the PF) can
obtain is its CQ quota (determined by the Hypervisor resource
tracker). Therefore, when creating an EQ, the number of EQE entries
that the VF should request for that EQ is the CQ quota value (and not
the total number of CQs available in the firmware).
Under SRIOV, the PF, also must use its CQ quota, because the resource
tracker also controls how many CQs the PF can obtain.
Using the firmware total CQs instead of the CQ quota when creating EQs
resulted wasting MTT entries, due to allocating more EQEs than were
needed.
Sponsored by: Mellanox Technologies
|
|
|
|
|
|
|
|
| |
Don't free uninitialized sysctl contexts in the mlx4en driver. This
can cause NULL pointer panics during failed device attach.
Differential Revision: https://reviews.freebsd.org/D8876
Sponsored by: Mellanox Technologies
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Flexible and asymmetric allocation of EQs and MSI-X vectors for PF/VFs.
Previously, the mlx4 driver queried the firmware in order to get the
number of supported EQs. Under SRIOV, since this was done before the
driver notified the firmware how many VFs it actually needs, the
firmware had to take into account a worst case scenario and always
allocated four EQs per VF, where one was used for events while the
others were used for completions. Now, when the firmware supports the
asymmetric allocation scheme, denoted by exposing num_sys_eqs > 0 (-->
MLX4_DEV_CAP_FLAG2_SYS_EQS), we use the QUERY_FUNC command to query
the firmware before enabling SRIOV. Thus we can get more EQs and MSI-X
vectors per function. Moreover, when running in the new
firmware/driver mode, the limitation that the number of EQs should be
a power of two is lifted.
Obtained from: Linux (dual BSD/GPLv2 licensed)
Submitted by: Dexuan Cui @ microsoft . com
Differential Revision: https://reviews.freebsd.org/D8867
Sponsored by: Mellanox Technologies
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change mlx4 QP allocation scheme.
When using Blue-Flame, BF, the QPN overrides the VLAN, CV, and SV
fields in the WQE. Thus, BF may only be used for QPNs with bits 6,7
unset.
The current ethernet driver code reserves a TX QP range with 256b
alignment.
This is wrong because if there are more than 64 TX QPs in use, QPNs >=
base + 65 will have bits 6/7 set.
This problem is not specific for the Ethernet driver, any entity that
tries to reserve more than 64 BF-enabled QPs should fail. Also, using
ranges is not necessary here and is wasteful.
The new mechanism introduced here will support reservation for "Eth
QPs eligible for BF" for all drivers: bare-metal, multi-PF, and VFs
(when hypervisors support WC in VMs). The flow we use is:
1. In mlx4_en, allocate Tx QPs one by one instead of a range allocation,
and request "BF enabled QPs" if BF is supported for the function
2. In the ALLOC_RES FW command, change param1 to:
a. param1[23:0] - number of QPs
b. param1[31-24] - flags controlling QPs reservation
Bit 31 refers to Eth blueflame supported QPs. Those QPs must have bits
6 and 7 unset in order to be used in Ethernet.
Bits 24-30 of the flags are currently reserved.
When a function tries to allocate a QP, it states the required
attributes for this QP. Those attributes are considered "best-effort".
If an attribute, such as Ethernet BF enabled QP, is a must-have
attribute, the function has to check that attribute is supported
before trying to do the allocation.
In a lower layer of the code, mlx4_qp_reserve_range masks out the bits
which are unsupported. If SRIOV is used, the PF validates those
attributes and masks out unsupported attributes as well. In order to
notify VFs which attributes are supported, the VF uses QUERY_FUNC_CAP
command. This command's mailbox is filled by the PF, which notifies
which QP allocation attributes it supports.
Obtained from: Linux (dual BSD/GPLv2 licensed)
Submitted by: Dexuan Cui @ microsoft . com
Differential Revision: https://reviews.freebsd.org/D8868
Sponsored by: Mellanox Technologies
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After r310171, the kernel version of sscanf() has format string checking
enabled. This results in a -Werror warning in mlx4ib:
sys/dev/mlx4/mlx4_ib/mlx4_ib_sysfs.c:90:22: error: format specifies type 'unsigned long long *' but the argument has type 'u64 *' (aka 'unsigned long *') [-Werror,-Wformat]
sscanf(buf, "%llx", &sysadmin_ag_val);
~~~~ ^~~~~~~~~~~~~~~~
Change sysadmin_ag_val to unsigned long long to avoid the warning.
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D8831
|
|
|
|
|
|
|
| |
Improve code readability and fix compilation error when using clang 4.x.
Found by: emaste @
Sponsored by: Mellanox Technologies
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cxgbe/iw_cxgbe: fix various double-close panics with iWARP sockets.
Sockets representing the TCP endpoints for iWARP connections are
allocated by the ibcore module. Before this revision they were closed
either by the ibcore module or the iw_cxgbe hardware driver depending on
the state transitions during connection teardown. This is error prone
and there were cases where both iw_cxgbe and ibcore closed the socket
leading to double-free panics. The fix is to let ibcore close the
sockets it creates and never do it in the driver.
- Use sodisconnect instead of soclose (preceded by solinger = 0) in the
driver to tear down an RDMA connection abruptly. This does what's
intended without releasing the socket's fd reference.
- Close the socket in ibcore when the iWARP iw_cm_id is destroyed. This
works for all kinds of sockets: clients that initiate connections,
listeners, and sockets accepted off of listeners.
Sponsored by: Chelsio Communications
|
|
|
|
|
|
|
|
| |
Fix initialisation of mlx4_pci_table's .driver_data fields.
Differential Revision: https://reviews.freebsd.org/D8791
Sponsored by: Mellanox Technologies
Submitted by: Dexuan Cui <decui@microsoft.com>
|
|
|
|
|
|
|
| |
Do not free an uninitialized pointer on soaccept failure in the iWARP
connection manager.
Sponsored by: Chelsio Communications
|
|
|
|
|
|
|
| |
Fix indentation and remove duplicate queue stopped stats increment.
Found by: Ryan Stone <rysto32@gmail.com>
Sponsored by: Mellanox Technologies
|
|
|
|
|
|
|
| |
Set hardware stats flag to avoid double counting the number of incoming bytes.
Found by: Ben RUBSON <ben.rubson@gmail.com>
Sponsored by: Mellanox Technologies
|
|
|
|
|
|
|
|
|
| |
Add support for setting blocking and non-blocking mode on /dev/rdma_cm
by returning success on FIONBIO and FIOASYNC IOCTLs. The actual flags
handling is done by the kern_ioctl() function.
Reported by: Alex Bowden <alex.bowden@outlook.com>
Sponsored by: Mellanox Technologies
|
|
|
|
| |
mthca: Add a wrapper for the firmware's DIAG_RPRT command.
|
|
|
|
|
|
|
| |
repeatedly in a tight loop.
Approved by: re (gjb@)
Obtained from: hselasky@ (part of larger changes in D5791)
|
|
|
|
|
|
| |
Reviewed by: hps, erj, tuexen
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D6688
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
tunable SYSCTL's. Linux module parameters are associated with the
module they belong to. FreeBSD does not share this concept of a parent
module. Instead add macros which define the prefix to use for the
module parameters in the LinuxKPI consumers.
While at it convert all "bool" LinuxKPI module parameters to "byte"
type, because we don't have a "bool" type of SYSCTL in FreeBSD.
Sponsored by: Mellanox Technologies
MFC after: 1 week
|
|
|
|
|
|
|
| |
(one manual change to fix grammar)
Confirmed With: db
Approved by: secteam (not really, but this is a comment typo fix)
|
|
|
|
|
|
| |
No functional change.
Reviewed by: hselasky
|
|
|
|
|
|
| |
No functional change.
Reviewed by: hselasky
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Added check that the SCOPE ID is only restored for IPv6 linklocal
addresses.
- Changes made by r237263 in the "cma_bind_addr()" function did not
check if the socket address was of type IPv6 and used the IPv4
socket address for IPv6 addresses. This caused the function to
fail. Fixed this.
- In the "rdma_gid2ip()" function and some other places the "sin6_len"
and "sin6_scope_id" fields were not set for IPv6 socket
addresses. Fixed this.
- The scope ID is not stored as part of the GID entries and must be
passed as an argument to "rdma_gid2ip()".
- Added new method to "struct ib_device" which returns a pointer to
the network interface which belongs to the given infiniband
device. This is needed to be able to get the scope ID for IPv6
addresses via the associated ethernet interface.
- Added convenience function, "rdma_get_ipv6_scope_id()", to get the
scope ID for IPv6 addresses.
- Implemented new "get_netdev" method for mlx4ib. Other IB controller
drivers which want to support IPv6 addresses needs to implement this
aswell.
- Bumped the FreeBSD version due to changing "struct ib_device".
Sponsored by: Mellanox Technologies
MFC after: 1 week
|
|
|
|
|
|
|
|
| |
analysis tools.
Suggested by: ngie@
Sponsored by: Mellanox Technologies
MFC after: 1 week
|
|
|
|
|
|
|
|
|
| |
This fixes a kernel panic when using IPoIB with VIMAGE and infiniband.
PR: 208957
Sponsored by: Mellanox Technologies
Tested by: Justin Clift <justin@postgresql.org>
MFC after: 1 week
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
IPv6 addresses has a scope ID which sometimes is stored in the
"sin6_scope_id" field of "struct sockaddr_in6" and sometimes as part
of the IPv6 address itself depending on the context. If the scope ID
is not in the expected location, the IPv6 address lookups in the
so-called GID table will fail. Some code factoring has been made to
achieve a clean exit of the "addr_resolve" function via a common
"done" label.
Sponsored by: Mellanox Technologies
Submitted by: Shani Michaeli <shanim@mellanox.com>
MFC after: 1 week
|
|
|
|
|
|
|
| |
Remove some dead code while at it.
Sponsored by: Mellanox Technologies
MFC after: 1 week
|
|
|
|
|
| |
Sponsored by: Mellanox Technologies
MFC after: 1 week
|
|
|
|
|
|
|
| |
an error code on failure. Refer to man 9 priv_check .
Sponsored by: Mellanox Technologies
MFC after: 1 week
|
|
|
|
|
|
|
| |
These are mostly cosmetical, no functional change.
Found with devel/coccinelle.
Reviewed by: hselasky
|
|
|
|
|
| |
Sponsored by: Mellanox Technologies
MFC after: 1 week
|
|
|
|
|
|
|
|
|
| |
The FreeBSD's TCP/IP stack assumes that the IP-header is 32-bits aligned
when decoding it. Else unaligned 32-bit memory access can happen, which
not all processor architectures support.
Sponsored by: Mellanox Technologies
MFC after: 1 week
|
|
|
|
|
|
|
|
|
| |
When downing a mlxen network adapter we need to check the port_up variable
to ensure we don't continue to transmit data or restart timers which can
reside in freed memory.
Sponsored by: Mellanox Technologies
MFC after: 1 week
|
|
|
|
| |
Found with devel/coccinelle.
|
|
|
|
|
|
|
|
|
| |
And factor out tcp_lro_rx_done, which deduplicates the same logic with
netinet/tcp_lro.c
Reviewed by: gallatin (1st version), hps, zbb, np, Dexuan Cui <decui microsoft com>
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5725
|
|
|
|
|
|
|
|
|
| |
connection manager. Examining so_comp without synchronization with
iw_so_event_handler is a harmless race.
Submitted by: Krishnamraju Eraparaju @ Chelsio
Reviewed by: Steve Wise @ Open Grid Computing
Sponsored by: Chelsio Communications
|
|
|
|
|
| |
Sponsored by: Mellanox Technologies
MFC after: 1 week
|
|
|
|
|
|
|
|
|
| |
Use the Toeplitz hash value as source for the flowid. This makes the
hash value more suitable for so-called hash bucket algorithms which
are used in the FreeBSD's TCP/IP stack when RSS is enabled.
Sponsored by: Mellanox Technologies
MFC after: 1 week
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ipoib module.
The bpfdetach() function is trying to turn off promiscious mode on the
network interface it is attached to while holding a mutex. The fix
consists of ignoring any further calls to the ipoib_ioctl() function
when the network interface is going to be detached. The ipoib_ioctl()
function might sleep.
Sponsored by: Mellanox Technologies
MFC after: 1 week
|
|
|
|
|
|
| |
Reviewed by: kib
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D5515
|
|
|
|
|
| |
MFC after: 1 week
Sponsored by: Mellanox Technologies
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Add some new hlist macros.
- Update existing hlist macros removing the need for a temporary
iteration variable.
- Properly define the RCU hlist macros to be SMP safe with regard
to RCU.
- Safe list macro arguments by adding a pair of parentheses.
- Prefix the _list_add() and _list_splice() functions with "linux"
to reflect they are LinuxKPI internal functions.
Obtained from: Linux
MFC after: 1 week
Sponsored by: Mellanox Technologies
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The iWARP Connection Manager (CM) on FreeBSD creates a TCP socket to
represent an iWARP endpoint when the connection is over TCP. For
servers the current approach is to invoke create_listen callback for
each iWARP RNIC registered with the CM. This doesn't work too well for
INADDR_ANY because a listen on any TCP socket already notifies all
hardware TOEs/RNICs of the new listener. This patch fixes the server
side of things for FreeBSD. We've tried to keep all these modifications
in the iWARP/TCP specific parts of the OFED infrastructure as much as
possible.
Submitted by: Krishnamraju Eraparaju @ Chelsio (with design inputs from Steve Wise)
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D4801
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The only piece of information that is required is rt_flags subset.
In particular, if_loop() requires RTF_REJECT and RTF_BLACKHOLE flags
to check if this particular mbuf needs to be dropped (and what
error should be returned).
Note that if_loop() will always return EHOSTUNREACH for "reject" routes
regardless of RTF_HOST flag existence. This is due to upcoming routing
changes where RTF_HOST value won't be available as lookup result.
All other functions require RTF_GATEWAY flag to check if they need
to return EHOSTUNREACH instead of EHOSTDOWN error.
There are 11 places where non-zero 'struct route' is passed to if_output().
For most of the callers (forwarding, bpf, arp) does not care about exact
error value. In fact, the only place where this result is propagated
is ip_output(). (ip6_output() passes NULL route to nd6_output_ifp()).
Given that, add 3 new 'struct route' flags (RT_REJECT, RT_BLACKHOLE and
RT_IS_GW) and inline function (rt_update_ro_flags()) to copy necessary
rte flags to ro_flags. Call this function in ip_output() after looking up/
verifying rte.
Reviewed by: ae
|
|
|
|
|
|
|
|
|
|
| |
sbappendstream() does. Although, M_NOTREADY may appear only on SOCK_STREAM
sockets, due to sendfile(2) supporting only the latter, there is a corner
case of AF_UNIX/SOCK_STREAM socket, that still uses records for the sake
of control data, albeit being stream socket.
Provide private version of m_clrprotoflags(), which understands PRUS_NOTREADY,
similar to m_demote().
|
| |
|