summaryrefslogtreecommitdiffstats
path: root/sys/dev/cxgbe
Commit message (Collapse)AuthorAgeFilesLines
* MFC 312906:jhb2017-02-035-6/+30
| | | | | | Unregister CPL handlers for TOE-related messages when unloading TOM. Sponsored by: Chelsio Communications
* MFC r312368:np2017-01-201-0/+1
| | | | | cxgbe/tom: Fix a case where do_pass_accept_req wasn't properly restoring the VNET.
* Fix mismerge in r312117. This is a direct commit to stable/10.np2017-01-171-1/+0
|
* MFC r311848:np2017-01-141-0/+1
| | | | cxgbe(4): Attach to the 2x25 debug card. This is for internal use only.
* MFC r311831 and r311832.np2017-01-141-2/+3
| | | | | | | | | | r311831: cxgbe(4): The wraparound logic in start_wrq_wr() should not get involved in work requests that end at the end of the descriptor ring, even though the pidx wraps around to 0. r311832: cxgbe(4): Enable automatic cidx flush for all control queues.
* MFC r311569, r311657, and r311949.np2017-01-146-68/+137
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | r311569: Fix comment in t4_tom. No functional change. r311657: cxgbe/t4_tom: Fix tid accounting. An offloaded IPv6 connection uses 2 tids, not 1, in the hardware. r311949: cxgbe/tom: Add VIMAGE support to the TOE driver. Active Open: - Save the socket's vnet at the time of the active open (t4_connect) and switch to it when processing the reply (do_act_open_rpl or do_act_establish). Passive Open: - Save the listening socket's vnet in the driver's listen_ctx and switch to it when processing incoming SYNs for the socket. - Reject SYNs that arrive on an ifnet that's not in the same vnet as the listening socket. CLIP (Compressed Local IPv6) table: - Add only those IPv6 addresses to the CLIP that are in a vnet associated with one of the card's ifnets. Misc: - Set vnet from the toepcb when processing TCP state transitions. - The kernel sets the vnet when calling the driver's output routine so t4_push_frames runs in proper vnet context already. One exception is when incoming credits trigger tx within the driver's ithread. Set the vnet explicitly in do_fw4_ack for that case. Sponsored by: Chelsio Communications
* MFC r310151 and r311173.np2017-01-0610-29741/+29478
| | | | | | | | | | | | | | r310151: cxgbe(4): Changes to the default T6 firmware configuration file. - Disable features that are not supported or not used on FreeBSD. - Increase the RSS table slice per interface. - Increase the share of the TCAM reserved for filtering. r311173: cxgbe(4): Update T4, T5 and T6 firmwares to 1.16.26.0. Sponsored by: Chelsio Communications
* MFC r309666, r310033, r310049, r310100, r310152, and r310807.np2017-01-047-111/+232
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | r309666: cxgbe(4): unsigned short isn't large enough to store link speed (which is in Mbps) for 100Gbps links. r310033: cxgbe(4): Retire t4_bus_space_read_8 and t4_bus_space_write_8. r310049: cxgbe(4): Fix the tid range shown for T6 cards in misc.tids. r310100: cxgbe(4): Deal with compressed error vectors. r310152: cxgbe(4): Fix typo in an unused macro. r310807: cxgbe(4): Updates to link configuration. - Update struct link_settings and associated shared code. - Add tunables to control FEC and autonegotiation. All ports inherit these values as their initial settings. hw.cxgbe.fec hw.cxgbe.autoneg - Add per-port sysctls to control FEC and autonegotiation. These can be modified at any time. dev.<port>.<n>.fec dev.<port>.<n>.autoneg
* MFC 309613: cxgbe(4): Update firmwares from version 1.16.12.0 to 1.16.22.0.jhb2016-12-0911-29149/+29298
| | | | Sponsored by: Chelsio Communications
* MFC 308066: cxgbe(4): Accurate statistics for all chip settings.jhb2016-12-051-8/+14
| | | | | | | There are 4 independent knobs in T5+ chips to include or exclude PAUSE frames from the "total frames" and "multicast frames" counters in either direction. This change lets the driver deal with any combination of these settings.
* MFC 307876:jhb2016-12-051-25/+8
| | | | | | | | | cxgbe(4): Fix bug in the calculation of the number of physically contiguous regions in an mbuf chain. If the payload of an mbuf ends at a page boundary count_mbuf_nsegs would incorrectly consider the next mbuf's payload physically contiguous based solely on a KVA comparison.
* MFC 307759: cxgbe(4): Dump any mailbox command that times out.jhb2016-12-051-0/+15
|
* MFC 307233:jhb2016-12-051-1/+1
| | | | | cxgbe(4): Allow the interface MTU to be set as high as the actual hardware limit.
* MFC 306821,306823: Permit updating firmware config file in flash.jhb2016-12-052-0/+38
| | | | | | | | | 306821: cxgbe(4): Add an ioctl to copy a firmware config file to the card's flash. 306823: cxgbetool: Add a loadcfg subcommand to allow a user to upload a firmware configuration file to the card.
* MFC 306277:jhb2016-12-051-7/+39
| | | | | cxgbe(4): Make the location/length of all descriptor rings available in the sysctl MIB.
* MFC 305695,305696,305699,305702,305703,305713,305715,305827,305852,305906,jhb2016-12-0524-20278/+32029
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 305908,306062,306063,306137,306138,306206,306216,306273,306295,306301, 306465,309302: Add support for adapters using the Terminator T6 ASIC. 305695: cxgbe(4): Set up fl_starve_threshold2 accurately for T6. 305696: cxgbe(4): Use correct macro for header length with T6 ASICs. This affects the transmit of the VF driver only. 305699: cxgbe(4): Update the pad_boundary calculation for T6, which has a different range of boundaries. 305702: cxgbe(4): Use smaller min/max bursts for fl descriptors with a T6. 305703: cxgbe(4): Deal with the slightly different SGE_STAT_CFG in T6. 305713: cxgbe(4): Add support for additional port types and link speeds. 305715: cxgbe(4): Catch up with the rename of tlscaps -> cryptocaps. TLS is one of the capabilities of the crypto engine in T6. 305827: cxgbe(4): Use the interface's viid to calculate the PF/VF/VFValid fields to use in tx work requests. 305852: cxgbe(4): Attach to cards with the Terminator 6 ASIC. T6 cards will come up as 't6nex' nexus devices with 'cc' ports hanging off them. The T6 firmware and configuration files will be added as soon as they are released. For now the driver will try to work with whatever firmware and configuration is on the card's flash. 305906: cxgbe/t4_tom: The SMAC entry for a VI is at a different location in the T6. 305908: cxgbe/t4_tom: Update the active/passive open code to support T6. Data path works as-is. 306062: cxgbe(4): Show wcwr_stats for T6 cards. 306063: cxgbe(4): Setup congestion response for T6 rx queues. 306137: cxgbetool: Add T6 support to the SGE context decoder. 306138: Fix typo. 306206: cxgbe(4): Catch up with the different layout of WHOAMI in T6. Note that the code moved below t4_prep_adapter() as part of this change because now it needs a working chip_id(). 306216: cxgbe(4): Fix the output of the "tids" sysctl on T6. 306273: cxgbe(4): Fix netmap with T6, which doesn't encapsulate SGE_EGR_UPDATE message inside a FW_MSG. The base NIC already deals with updates in either form. 306295: cxgbe(4): Support SIOGIFXMEDIA so that ifconfig displays correct media for 25Gbps and 100Gbps ports. This should have been part of r305713, which is when the driver first started reporting extended media types. 306301: cxgbe(4): Use the port's top speed to figure out whether it is "high speed" or not (for the purpose of calculating the number of queues etc.) This does the right thing for 25Gbps and 100Gbps ports. 306465: cxgbe(4): Claim the T6 -DBG card. 309302: cxgbe(4): Include firmware for T6 cards in the driver. Update all firmwares to 1.16.12.0. Sponsored by: Chelsio Communications
* MFC 305667:jhb2016-12-051-0/+2
| | | | | cxgbe(4): Avoid a NULL dereference in the clearstats ioctl handler. Port softc's are not initialized when the adapter is in recovery mode.
* MFC 305652: cxgbe(4): Do not prescreen frames before attempting LRO.jhb2016-12-051-2/+1
|
* MFC 305433:jhb2016-12-051-1/+1
| | | | | | cxgbe/t4_tom: toepcb should be all-zero on allocation because the code that cleans up on failure assumes that non-NULL values indicate initialized items.
* MFC 303688,303750,305166,305167: Centralize and rework page pod handling.jhb2016-12-053-110/+341
| | | | | | | | | | | | | | | | | | | | | | | | | | Note that the TOE DDP code in 10 is different from 11 and later and had to be updated directly. 303688: cxgbe/t4_tom: Read the chip's DDP page sizes and save them in a per-adapter data structure. This replaces a global array with hardcoded page sizes. 303750: cxgbe/t4_tom: The page pod arena allocates from pod address space and not index space. The minimum valid allocation out of this arena is the size of a single page pod. 305166: cxgbe/t4_tom: Add general purpose routines to deal with page pod regions and allocations within them. Switch to these routines to manage the TOE DDP region. 305167: cxgbe/t4_tom: Two new routines to allocate and write page pods for a buffer in the kernel's address space. Sponsored by: Chelsio Communications
* Add sys/systm.h to have critical_enter() defined, required bykib2016-12-041-0/+1
| | | | | | | | machine/counter.h on i386. This is a direct commit to stable/10. Sponsored by: The FreeBSD Foundation
* MFC 303348:jhb2016-12-031-5/+4
| | | | | | | cxgbe(4): Initialize the adapter queues (fwq and mgmtq) instead of returning EAGAIN if they aren't available when the user tries to program a filter. Do this after validating the filter so that the driver doesn't bring up the queues if it doesn't have to.
* MFC 302440,304873,305704,305985,306787,307531: Fixes for sysctls.jhb2016-12-036-42/+193
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 302440: cxgbe(4): Add sysctl to display the RSS indirection table size for an interface. dev.cxl.<n>.rss_size dev.vcxl.<n>.rss_size 304873: cxgbe(4): Provide more details about the card in the sysctl MIB. dev.t5nex.0.%desc: Chelsio T580-CR dev.t5nex.0.hw_revision: 1 dev.t5nex.0.sn: PT13140042 dev.t5nex.0.pn: 110117150A0 dev.t5nex.0.ec: 0000000000000000 dev.t5nex.0.na: 0007432AF490 dev.t5nex.0.vpd_version: 3 dev.t5nex.0.scfg_version: 53255 dev.t5nex.0.bs_version: 1.1.0.0 dev.t5nex.0.er_version: 1.0.0.68 dev.t5nex.0.tp_version: 0.1.4.9 dev.t5nex.0.firmware_version: 1.16.2.0 305704: cxgbe(4): Rename the debug_flags driver tunable/sysctl to dflags. Tunables that end with _flags are special. 305985: cxgbe(4): Fixes to wrq stats. - Increment tx_wrs_copied in the correct place. - Add tx_wrs_sspace to the sysctl MIB. 306787: cxgbe(4): Fix whitespace in the pm_stats display. 307531: cxgbe(4): Adjust whitespace to line up the column titles in cim_qcfg with the values displayed. Sponsored by: Chelsio Communications
* MFC 304854: cxgbe/iw_cxgbe: Various fixes to the iWARP driver.jhb2016-12-034-126/+200
| | | | | | | | | | - Return appropriate error code instead of ENOMEM when sosend() fails in send_mpa_req. - Fix for problematic race during destroy_qp. - Abortive close in the failure of send_mpa_reject() instead of normal close. - Remove the unnecessary doorbell flowcontrol logic. Sponsored by: Chelsio Communications
* MFC 303859,305851: Fix a typo and some whitespace nits.jhb2016-12-022-4/+4
|
* MFC 303454: Mark spg_len and fl_pktshift static.jhb2016-12-021-2/+2
| | | | These variables are no longer exported to t4_netmap.c after r296478.
* MFC 303522,303647,303860,303880,304168-304170,304479,304482,304485,305548,jhb2016-12-0211-208/+2027
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 305549: Chelsio T4/T5 VF driver. 303522: Various fixes to the t4/5nex character device. - Remove null open/close methods. - Don't set d_flags to 0 explicitly. - Remove t5_cdevsw as the .d_name member isn't really used and doesn't warrant a separate cdevsw just for the name. - Use ENOTTY as the error value for an unknown ioctl request. - Use make_dev_s() to close race with setting si_drv1. 303647: Store the offset of the KDOORBELL and GTS registers in the softc. VF devices use a different register layout than PF devices. Storing the offset in a value in the softc allows code to be shared between the PF and VF drivers. 303860: Reserve an adapter flag IS_VF to mark VF devices vs PF devices. 303880: Track the base absolute ID of ingress and egress queues. Use this to map an absolute queue ID to a logical queue ID in interrupt handlers. For the regular cxgbe/cxl drivers this should be a no-op as the base absolute ID should be zero. VF devices have a non-zero base absolute ID and require this change. While here, export the absolute ID of egress queues via a sysctl. 304168: Make SGE parameter handling more VF-friendly. Add fields to hold the SGE control register and free list buffer sizes to the sge_params structure. Populate these new fields in t4_init_sge_params() for PF devices and change t4_read_chip_settings() to pull these values out of the params structure instead of reading registers directly. This will permit t4_read_chip_settings() to be reused for VF devices which cannot read SGE registers directly. While here, move the call to t4_init_sge_params() to get_params__post_init(). The VF driver will populate the SGE parameters structure via a different method before calling t4_read_chip_settings(). 304169: Update mailbox writes to work with VF devices. - Use alternate register locations for the data and control registers for VFs. - Do a dummy read to force the writes to the mailbox data registers to post before the write to the control register on VFs. - Do not check the PCI-e firmware register for errors on VFs. 304170: Add support for register dumps on VF devices. - Add handling of VF register sets to t4_get_regs_len() and t4_get_regs(). - While here, use t4_get_regs_len() in the ioctl handler for regdump instead of inlining it. 304479: Add structures for VF-specific adapter parameters. While here, mark which parameters are PF-specific and which are VF-specific. 304482: Adjust t4_port_init() to work with VF devices. Specifically, the FW_PORT_CMD may or may not work for a VF (the PF driver can choose whether or not to permit access to this command), so don't attempt to fetch port information on a VF if permission is denied by the PF. 304485: Reorder sysctls so that nodes shared with the VF driver are added first. This permits a single early return for VF devices in the routines that add sysctl nodes. 305548: Don't break out of the m_advance() loop if len drops to zero. If a packet contains the Ethernet header (14 bytes) in the first mbuf and the payload (IP + UDP + data) in the second mbuf, then the attempt to fetch the l3hdr will return a NULL pointer. The first loop iteration will drop len to zero and exit the loop without setting 'p'. However, the desired data is at the start of the second mbuf, so the correct behavior is to loop around and let the conditional set 'p' to m_data of the next mbuf (and leave offset as 0). 305549: Chelsio T4/T5 VF driver. The cxgbev/cxlv driver supports Virtual Function devices for Chelsio T4 and T4 adapters. The VF devices share most of their code with the existing PF4 driver (cxgbe/cxl) and as such the VF device driver currently depends on the PF4 driver. Similar to the cxgbe/cxl drivers, the VF driver includes a t4vf/t5vf PCI device driver that attaches to the VF device. It then creates child cxgbev/cxlv devices representing ports assigned to the VF. By default, the PF driver assigns a single port to each VF. t4vf_hw.c contains VF-specific routines from the shared code used to fetch VF-specific parameters from the firmware. t4_vf.c contains the VF-specific PCI device driver and includes its own attach routine. VF devices are required to use a different firmware request when transmitting packets (which in turn requires a different CPL message to encapsulate messages). This alternate firmware request does not permit chaining multiple packets in a single message, so each packet results in a firmware request. In addition, the different CPL message requires more detailed information when enabling hardware checksums, so parse_pkt() on VF devices must examine L2 and L3 headers for all packets (not just TSO packets) for VF devices. Finally, L2 checksums on non-UDP/non-TCP packets do not work reliably (the firmware trashes the IPv4 fragment field), so IPv4 checksums for such packets are calculated in software. Most of the other changes in the non-VF-specific code are to expose various variables and functions private to the PF driver so that they can be used by the VF driver. Note that a limited subset of cxgbetool functions are supported on VF devices including register dumps, scheduler classes, and clearing of statistics. In addition, TOE is not supported on VF devices, only for the PF interfaces. Sponsored by: Chelsio Communications
* Fix build without INVARIANTS.jhb2016-12-022-0/+4
| | | | This is a direct commit to stable/10.
* MFC 303204: Install a handler for firmware work request error messages.jhb2016-12-021-0/+67
| | | | | | | If a driver sends an malformed or disallowed work request, the firmware responds with a work request error. Previously the driver treated this is as an unexpected message and panicked. Now it decodes the error message to aid in debugging.
* MFC 302339:jhb2016-12-0216-291/+311
| | | | | | | | | | | | | | | | | | | | | | | | | | | cxgbe(4): Changes to the CPL-handler registration mechanism and code related to "shared" CPLs. a) Combine t4_set_tcb_field and t4_set_tcb_field_rpl into a single function. Allow callers to direct the response to any iq. Tidy up set_ulp_mode_iscsi while there to use names from t4_tcb.h instead of magic constants. b) Remove all CPL handler tables from struct adapter. This reduces its size by around 2KB. All handlers are now registered at MOD_LOAD instead of attach or some kind of initialization/activation. The registration functions do not need an adapter parameter any more. c) Add per-iq handlers to deal with CPLs whose destination cannot be determined solely from the opcode. There are 2 such CPLs in use right now: SET_TCB_RPL and L2T_WRITE_RPL. The base driver continues to send filter and L2T_WRITEs over the mgmtq and solicits the reply on fwq. t4_tom (including the DDP code) now uses the port's ctrlq to send L2T_WRITEs and SET_TCB_FIELDs and solicits the reply on an ofld_rxq. fwq and ofld_rxq have different handlers that know what kind of tid to expect in the reply. Update t4_write_l2e and callers to to support any wrq/iq combination. Sponsored by: Chelsio Communications
* MFC 292736:jhb2016-12-027-238/+249
| | | | | | | | | | cxgbe(4): Updates to the base NIC driver and t4_tom to support the iSCSI offload driver. These changes come from projects/cxl_iscsi. Note that these changes make use of the mbufq API from 11.0, but that API is not present in 10.x in the same form. Borrow an implementation from the CAM CTL ha code that uses m_nextpkt to implement mbufq for use in 10.
* MFC 297797:jhb2016-12-021-1/+2
| | | | | | | cxgbe(4): Provide an explicit value for nqpcq in the firmware configuration file. Sponsored by: Chelsio Communications
* MFC 273806,289103,289201,289338,289578,293185,294474,294610,297124,297368,jhb2016-12-0110-304/+364
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 297406,300875,300888,301158,301896,301897,304838: Pull in most of the Chelsio and iWARP related changes from stable/11 into stable/10. A few changes from 278886 (OFED 1.2) were also included though the full merge is not: - The find_gid_port() function in infiband/core/cma.c. - Addition of the 'ord' and 'ird' fields to 'struct iw_cm_event'. 273806: Userspace library for Chelsio's Terminator 5 based iWARP RNICs (pretty much every T5 card that does _not_ have "-SO" in its name is RDMA capable). This plugs into the OFED verbs framework and allows userspace RDMA applications to work over T5 RNICs. Tested with rping. 289103: iw_cxgbe: fix for page fault in cm_close_handler(). This is roughly the iw_cxgbe equivalent of https://github.com/torvalds/linux/commit/be13b2dff8c4e41846477b22cc5c164ea5a6ac2e ----------------- RDMA/cxgb4: Connect_request_upcall fixes When processing an MPA Start Request, if the listening endpoint is DEAD, then abort the connection. If the IWCM returns an error, then we must abort the connection and release resources. Also abort_connection() should not post a CLOSE event, so clean that up too. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com> ----------------- 289201: iw_cxgbe: MPA v2 is always available. 289338: iw_cxgbe: use correct RFC number. 289578: Merge LinuxKPI changes from DragonflyBSD: - Define the kref structure identical to the one found in Linux. - Update clients referring inside the kref structure. - Implement kref_sub() for FreeBSD. 293185: iw_cxgbe: Shut down the socket but do not close the fd in case of error. The fd is closed later in this case. This fixes a "SS_NOFDREF on enter" panic. 294474: iw_cxgbe: fix a couple of problems int the RDMA_TERMINATE handler. a) Look for the CPL in the payload buffer instead of the descriptor. b) Retrieve the socket associated with the tid with the inpcb lock held. 294610: Fix for iWARP servers that listen on INADDR_ANY. The iWARP Connection Manager (CM) on FreeBSD creates a TCP socket to represent an iWARP endpoint when the connection is over TCP. For servers the current approach is to invoke create_listen callback for each iWARP RNIC registered with the CM. This doesn't work too well for INADDR_ANY because a listen on any TCP socket already notifies all hardware TOEs/RNICs of the new listener. This patch fixes the server side of things for FreeBSD. We've tried to keep all these modifications in the iWARP/TCP specific parts of the OFED infrastructure as much as possible. 297124: iw_cxgbe/libcxgb4: Pull in many applicable fixes from the upstream Linux iWARP driver and userspace library to the FreeBSD iw_cxgbe and libcxgb4. This commit includes internal changesets 6785 8111 8149 8478 8617 8648 8650 9110 9143 9440 9511 9894 10164 10261 10450 10980 10981 10982 11730 11792 12218 12220 12222 12223 12225 12226 12227 12228 12229 12654. 297368: cxgbe/iw_cxgbe: Fix for stray "start_ep_timer timer already started!" messages. 297406: Remove unnecessary dequeue_mutex (added in r294610) from the iWARP connection manager. Examining so_comp without synchronization with iw_so_event_handler is a harmless race. 300875: iw_cxgbe: Use vmem(9) to manage PBL and RQT allocations. 300888: iw_cxgbe: Plug a lock leak in process_mpa_request(). If the parent is DEAD or connect_request_upcall() fails, the parent mutex is left locked. This leads to a hang when process_mpa_request() is called again for another child of the listening endpoint. 301158: iw_cxgbe: Fix panic that occurs when c4iw_ev_handler tries to acquire comp_handler_lock but c4iw_destroy_cq has already freed the CQ memory (which is where the lock resides). 301896: Fix bug in iwcm that caused a panic in iw_cm_wq when krping is run repeatedly in a tight loop. 301897: iw_cxgbe: Make sure that send_abort results in a TCP RST and not a FIN. Release the hold on ep->com immediately after sending the RST. This fixes a bug that sometimes leaves userspace iWARP tools hung when the user presses ^C. 304838: Do not free an uninitialized pointer on soaccept failure in the iWARP connection manager. Submitted by: Krishnamraju Eraparaju @ Chelsio (original patch) Sponsored by: Chelsio Communications
* MFC r286227, r286443:jch2016-11-243-26/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | r286227: Decompose TCP INP_INFO lock to increase short-lived TCP connections scalability: - The existing TCP INP_INFO lock continues to protect the global inpcb list stability during full list traversal (e.g. tcp_pcblist()). - A new INP_LIST lock protects inpcb list actual modifications (inp allocation and free) and inpcb global counters. It allows to use TCP INP_INFO_RLOCK lock in critical paths (e.g. tcp_input()) and INP_INFO_WLOCK only in occasional operations that walk all connections. PR: 183659 Differential Revision: https://reviews.freebsd.org/D2599 Reviewed by: jhb, adrian Tested by: adrian, nitroboost-gmail.com Sponsored by: Verisign, Inc. r286443: Fix a kernel assertion issue introduced with r286227: Avoid too strict INP_INFO_RLOCK_ASSERT checks due to tcp_notify() being called from in6_pcbnotify(). Reported by: Larry Rosenman <ler@lerctr.org> Submitted by: markj, jch
* MFC 302313:jhb2016-11-041-2/+1
| | | | | | cxgbe(4): Avoid a NULL dereference while dumping the L2 table. Entries used by switching filters that rewrite L2 information do not have any associated ifnet.
* MFC 301516,301520,301531,301535,301540,301542,301628: Traffic schedulingjhb2016-11-044-146/+363
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | updates. 301516: cxgbetool: Allow max-rate > 10Gbps for rate-limited traffic. 301520: cxgbe(4): Create a reusable struct type for scheduling class parameters. 301531: cxgbe(4): Break up set_sched_class. Validate the channel number and min/max rates against their actual limits (which are chip and port specific) instead of hardcoded constants. 301535: cxgbe(4): Track the state of the hardware traffic schedulers in the driver. This works as long as everyone uses set_sched_class_params to program them. 301540: cxgbe(4): Provide information about traffic classes in the sysctl mib. 301542: cxgbe(4): A couple of fixes to set_sched_queue. - Validate the scheduling class against the actual limit (which is chip specific) instead of a magic number. - Return an error if an attempt is made to manipulate the tx queues of a VI that hasn't been initialized. 301628: cxgbe(4): Add a sysctl to manage the binding of a txq to a traffic class. Sponsored by: Chelsio Communications
* MFC 297883:jhb2016-11-041-2/+3
| | | | | cxgbe(4): Always dispatch all work requests that have been written to the descriptor ring before leaving drain_wrq_wr_list.
* MFC 297875: cxgbe(4): Always read the entire mailbox into the reply buffer.jhb2016-11-041-1/+1
| | | | | | The size of the reply can be different from the size of the command in case a debug firmware asserts. fw_asrt() needs the entire reply in order to decode the location of the assert.
* MFC 297776,297777,297779: Add DDB commands to cxgbe(4).jhb2016-11-042-25/+178
| | | | | | | | | | | | | | | | | | | | | | | 297776: Add a function to lookup a device_t object by name. This just walks the global list of devices looking for one with the requested name. The one use case outside of devctl2's implementation is for DDB commands that wish to lookup devices by name. 297777: Add a 'show t4 tcb <nexus> <tid>' command to dump a TCB from DDB. This allows the contents of a TCB to be extracted from a T4/T5 card in DDB after a panic. 297779: Add a 'show t4 devlog <nexus>' DDB command. This command displays the adapter's firmware device log similar to the dev.<nexus>.misc.devlog sysctl. Sponsored by: Chelsio Communications
* MFC 297194:jhb2016-11-041-1/+1
| | | | | | cxgbe(4): Be consistent and call ETHER_BPF_MTAP before writing anything to the descriptor ring no matter what path the frame takes within the driver's tx.
* MFC 296975: cxgbe(4): Tidy up PAUSE frame accounting.jhb2016-11-042-6/+21
| | | | | | | | | | | | | Figure out if the chip is counting PAUSE frames in the "normal" stats and take them out if it is. This fixes a bug in the tx stats because the default hardware behavior is different for Tx and Rx but the driver was treating both the same way. The result was that OPACKETS, OBYTES, and OMCASTS were under-reported (if tx_pause > 0) before this change. Note that the mac_stats sysctl still gives you the raw value of these statistics straight from the device registers. Sponsored by: Chelsio Communications
* MFC 296950,296951: Configuration updates.jhb2016-11-043-37/+89
| | | | | | | | | | | | | 296950: cxgbe(4): Update some register settings in the default configuration files to match the "uwire" configuration. 296951: cxgbe(4): Enable additional capabilities in the default configuration files. All features with FreeBSD drivers of some kind are now in the default configuration. Sponsored by: Chelsio Communications
* MFC 296018,296640,296641,296689,296735,296949: Fixes for sysctl handlers.jhb2016-11-043-6/+217
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 296018: cxgbe(4): Add a sysctl to retrieve the maximum speed/bandwidth supported by a port. dev.cxgbe.<n>.max_speed dev.cxl.<n>.max_speed 296640: cxgbe(4): Add a sysctl for the event capture mask of the TP block's logic analyzer. dev.t5nex.<n>.misc.tp_la_mask dev.t4nex.<n>.misc.tp_la_mask 296641: cxgbe(4): Add sysctls to display the TP microcode version and the expansion rom version (if there's one). trantor:~# sysctl dev.t4nex dev.t5nex | grep _version dev.t4nex.0.firmware_version: 1.15.28.0 dev.t4nex.0.tp_version: 0.1.9.4 dev.t5nex.0.firmware_version: 1.15.28.0 dev.t5nex.0.exprom_version: 1.0.0.68 dev.t5nex.0.tp_version: 0.1.4.9 296689: cxgbe(4): sysctls to display the TOE's TCP timers. cask:~# sysctl -d dev.t5nex.0.toe dev.t5nex.0.toe.finwait2_timer: FINWAIT2 timer (us) dev.t5nex.0.toe.initial_srtt: Initial SRTT (us) dev.t5nex.0.toe.keepalive_intvl: Keepidle interval (us) dev.t5nex.0.toe.keepalive_idle: Keepidle idle timer (us) dev.t5nex.0.toe.persist_max: Persist timer max (us) dev.t5nex.0.toe.persist_min: Persist timer min (us) dev.t5nex.0.toe.rexmt_max: Retransmit max (us) dev.t5nex.0.toe.rexmt_min: Retransmit min (us) dev.t5nex.0.toe.dack_timer: DACK timer (us) dev.t5nex.0.toe.dack_tick: DACK tick (us) dev.t5nex.0.toe.timestamp_tick: TCP timestamp tick (us) dev.t5nex.0.toe.timer_tick: TP timer tick (us) ... cask:~# sysctl dev.t5nex.0.toe dev.t5nex.0.toe.finwait2_timer: 9765440 dev.t5nex.0.toe.initial_srtt: 244128 dev.t5nex.0.toe.keepalive_intvl: 73240800 dev.t5nex.0.toe.keepalive_idle: 7031116800 dev.t5nex.0.toe.persist_max: 9765440 dev.t5nex.0.toe.persist_min: 976544 dev.t5nex.0.toe.rexmt_max: 9765440 dev.t5nex.0.toe.rexmt_min: 244128 dev.t5nex.0.toe.dack_timer: 19520 dev.t5nex.0.toe.dack_tick: 32.768 dev.t5nex.0.toe.timestamp_tick: 1048.576 dev.t5nex.0.toe.timer_tick: 32.768 ... 296735: Fix the following gcc warnings on sparc64, when TCP_OFFLOAD is not defined: sys/dev/cxgbe/t4_main.c:7474: warning: 'sysctl_tp_tick' defined but not used sys/dev/cxgbe/t4_main.c:7505: warning: 'sysctl_tp_dack_timer' defined but not used sys/dev/cxgbe/t4_main.c:7519: warning: 'sysctl_tp_timer' defined but not used This just adds a bunch of #ifdef TCP_OFFLOAD in the right places. 296949: cxgbe(4): Remove a couple of pointless assignments in sysctl_meminfo. Do not display range if start = stop (this is a workaround for some unused regions). Sponsored by: Chelsio Communications
* MFC 296552,296596,296603,296624,296627: Fixes related to memory windows.jhb2016-11-043-198/+330
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 296552: cxgbe(4): Rename regwin_lock to reg_lock. It is used to protect access to indirect registers only. 296596: cxgbe(4): Allow the addr/len pair that is being validated in validate_mem_range to span multiple memory types. Update validate_mt_off_len to use validate_mem_range. 296603: cxgbe(4): Add general purpose routines that offer safe access to the chip's memory windows. Convert existing users of these windows to the new routines. 296624: cxgbe(4): Fix bug in r296603. The memory window needs to be repositioned if the start address isn't in the window already. One of the bounds check used the end address instead. 296627: cxgbe(4): Improvements to the code that deals with the firmware's log. - Query the location of the log very early during attach. Refresh the location later after establishing contact with the firmware. - Save the log's location as a flat address in devlog_params. - Use a memory window instead of backdoor access to the EDC/MC to read the log. Sponsored by: Chelsio Communications
* MFC 295778,296249,296333,296383,296471,296478,296481,296485,296488-296491,jhb2016-11-0426-22790/+47786
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 296493-296496,296544,296710-296711,297863,299685: Catch up to changes to the internal shared code. Note that this merge includes two different firmware updates, but the effective change is to update to the last version (1.15.37.0). As such, I've trimmed the log message of the first update (1.15.28.0). In addition, the M_WAIT macro added in t4_regs.h had to be renamed to CXGBE_M_WAIT to avoid a collision on 10.x that is not present on 11. 295778: cxgbe: catch up with the latest hardware-related definitions. 296249: cxgbe(4): Update T5 and T4 firmwares to 1.15.28.0. 296333: cxgbe(4): First of many changes to reduce diffs with internal shared code: - Rename some CamelCase variables. - s/t4_link_start/t4_link_l1cfg/g - Pull in t4_get_port_type_description. - Move t4_wait_op_done to t4_hw.c. - Flip the order of the RDMA stats. - Remove unsused function t4_iq_start_stop. - Move t4_wait_op_done and t4_wait_op_done_val to t4_hw.c 296383: cxgbe(4): Very basic T6 awareness. This is part of ongoing work to update to the latest internal shared code. - Add a chip_params structure to keep track of hardware constants for all generations of Terminators handled by cxgbe. - Update t4_hw_pci_read_cfg4 to work with T6. - Update the hardware debug sysctls (hidden within dev.<tNnex>.<n>.misc.*) to work with T6. Most of the changes are in the decoders for the CIM logic analyzer and the MPS TCAM. - Acquire the regwin lock around indirect register accesses. 296471: cxgbe(4): Updated register dumps. - Get the list of registers to read during a regdump from the shared code instead of the OS specific code. This follows a similar move internally. The shared code includes the list for T6. - Update cxgbetool to be able to decode T5 VF, T6, and T6 VF register dumps (and catch up with some updates to T4 and T5 register decode). 296478: cxgbe(4): Add a struct sge_params to store per-adapter SGE parameters. Move the code that reads all the parameters to t4_init_sge_params in the shared code. Use these per-adapter values instead of globals. 296481: cxgbe(4): Overhaul the shared code that deals with the chip's TP block, which is responsible for filtering and RSS. Add the ability to use filters that match on PF/VF (aka "VNIC id") while here. This is mutually exclusive with filtering on outer VLAN tag with Q-in-Q. 296485: cxgbe(4): Update the interrupt handlers for hardware errors. 296488: cxgbe(4): Updates to mailbox routines in the shared code. 296489: cxgbe(4): Updates to the shared routines that deal with the serial EEPROM, flash, and VPD. 296490: cxgbe(4): Remove __devinit and SPEED_<foo> as part of catch up with internal shared code. 296491: cxgbe(4): Updates to shared routines that get/set various parameters via the firmware. 296493: cxgbe(4): Use t4_link_down_rc_str in shared code to decode the reason the link is down, instead of doing it in OS specific code. 296494: cxgbe(4): Many new functions in the shared code, unused at this time. 296495: cxgbe(4): Fix t4_tp_get_rdma_stats. 296496: cxgbe(4): Minor updates to the shared routines that deal with firmware images. 296544: cxgbe(4): Reshuffle and rototill t4_hw.c, solely to reduce diffs with the internal shared code. 296710: cxgbe(4): Catch up with the latest list of card capabilities as reported by the firmware. 296711: cxgbe(4): Fix typo in previous commit. 297863: Rename the 'M_B' macro in t4_regs.h to 'CXGBE_M_B'. This fixes a conflict with the M_B macro in powerpc's <machine/db_machdep.h> exposed by the recent addition of DDB commands to the cxgbe driver. 299685: cxgbe(4): Update T5 and T4 firmwares to 1.15.37.0. These firmwares were obtained from the "Chelsio T5/T4 Unified Wire v2.12.0.3 for Linux" release. Changes since 1.14.4.0 (which is the firmware in -STABLE branches) are in the "Release Notes" accompanying the Unified Wire release and are copy-pasted here as well. 22.1. T5 Firmware +++++++++++++++++++++++++++++++++ Version : 1.15.37.0 Date : 04/27/2016 ================================================================================ FIXES ----- BASE: - Fixed an issue in FW_RSS_VI_CONFIG_CMD handling where the default ingress queue was ignored. - Fixed an issue where adapter failed to load fw by adjusting DRAM frequency. - Fixed an issue in watchdog which was causing VM bring-up failure after reboot. - Fixed 40G link failures with some switches when auto-negotiation enabled. - Fixed to improve on link bring-up time. - Per port buffer groups size doubled to improve performance. - Fixed an issue where bogus d3hot bits were set causing traffic stall. - Fixed an issue where sometimes adapter was not seen after reboot. - Fixed an issue where iWARP was crashing in conjunction with traffic management. - Fixed an issue where link failed to come up after removing twinax cable and inserting optical module. ETH - Fixed a link flap issue on T580-CR. OFLD - Fixed a potential iSCSI data corruption issue by disabling RxFragEn flag. FOiSCSI - Fixed an issue in recovery path where connection was getting closed before recovery processing was done. - Fixed an issue in TCP port reuse. - Fixed an issue in recovery path when large number (>64) of iSCSI connections were in use. - Returned ENETUNREACH if IP was not been provisioned yet and driver tried to use given inerface. - Fixed an issue where fw was sending ENETUNREACH event for normal tcp disconnection. DCBX - Fixed an issue where iscsi tlv is sent incorrectly to host. (DCBX CEE) - Fixed an issue where apply bit set for APP id was affecting the ETS and PFC settings.(DCBX IEEE) - Fixed an issue where app priority values are not handled correctly in fw. (DCBX IEEE) - Fixed an issue where enable/disable dcbx can cause crash. (DCBX CEE,DCBX IEEE) FOFCoE - Removed BB6 support. ENHANCEMENTS ------------ BASE: - Added new interface to program DCA settings in SGE contexts; allow 32-byte IQE size - Added PTP interface fw_ptp_ts to support PTP Frequeny and Offset adjustment. - Added MPS raw interface. ETH: - New mailbox command FW_DCB_IEEE_CMD api added for IEEE dcbx. OFLD: - WR opcode is returned to host in cqe error response. 22.2. T4 Firmware +++++++++++++++++ Version : 1.15.37.0 Date : 04/27/2016 ================================================================================ FIXES ----- BASE: - Fixed an issue in FW_RSS_VI_CONFIG_CMD handling where default ingress queue was ignored. - Fixed an issue in watchdog which was causing VM bring-up failure after reboot. - Per port buffer groups size doubled to improve performance. - Fixed an issue where iWARP was crashing in conjunction with traffic management. FOiSCSI: - Fixed an issue in recovery path where connection was getting closed before recovery processing was done. - Fixed an issue in TCP port reuse. - Fixed an issue in recovery path when large number (>64) of iSCSI connections were in use. - Returned ENETUNREACH if IP had not been provisioned yet and driver tried to use given inerface. DCBX - Fixed an issue where iscsi tlv is sent incorrectly to host.(DCBX CEE) - Fixed an issue where enable/disable dcbx can cause crash in firmware.(DCBX CEE) FOiSCSI - Fixes an issue where fw was sending ENETUNREACH event for normal tcp disconnection. FOFCoE - Removed BB6 support. ENHANCEMENTS ------------ BASE: - Added MPS raw interface. ETH: - New mailbox command FW_DCB_IEEE_CMD api added for IEEE dcbx. ================================================================================ Sponsored by: Chelsio Communications
* MFC 295573: Remove duplicate definition (CPL_TRACE_PKT_T5).jhb2016-11-042-2/+1
| | | | Sponsored by: Chelsio Communications
* MFC 301932: Use sbused() instead of sbspace() to avoid signed issues.jhb2016-11-041-5/+3
| | | | | | | | | | | | | | | Inserting a full mbuf with an external cluster into the socket buffer resulted in sbspace() returning -MLEN. However, since sb_hiwat is unsigned, the -MLEN value was converted to unsigned in comparisons. As a result, the socket buffer was never autosized. Note that sb_lowat is signed to permit direct comparisons with sbspace(), but sb_hiwat is unsigned. Follow suit with what tcp_output() does and compare the value of sbused() with sb_hiwat instead. Note: Since stable/10 does not include sbused(), this uses sb->sb_cc instead. Sponsored by: Chelsio Communications
* MFC 290175,290633,299206,300895,301898: Various TOE fixes.jhb2016-11-044-4/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 290175: cxgbe/tom: decide whether to shove segments or not only if there is payload to transmit. 290633: cxgbe/t4_tom: add a knob to the default configuration file to tune the TOE for LAN operation. It is possible to set this to other values (cluster for networks with little loss and really tight RTTs, and wan for relatively large RTTs and/or lossy networks) depending on the environment in which the TOE is being used. None of this affects plain NIC operation in any way. 299206: Set the correct vnet in TOE event handlers. 300895: cxgbe/t4_tom: Exempt RDMA connections from a TCP sanity test for now, to avoid panicking debug kernels. t4_tom does not keep track of a connection once it switches to ULP mode iWARP. If the connection falls out of ULP mode the driver/hardware seq# etc. are out of sync. A better fix would be to figure out what the current seq# are, update the driver's state, and perform all sanity checks as usual. 301898: cxgbe/t4_tom: Fix inverted assertion in r300895. It is RDMA connections and not others that are allowed to fail the receive window check.
* MFC 277763,280146,287631: Various fixes to DDP.jhb2016-11-043-16/+51
| | | | | | | | | | | | | | | | | | 277763: Lock the socket buffer before jumping to the 'out' label if sblock() fails in t4_soreceive_ddp(). 280146: Move special DDP handling for closing a connection into a new handle_ddp_close() function in t4_ddp.c as the logic is similar to handle_ddp_data(). This allows all knowledge of the special DDP mbufs to be private to t4_ddp.c as well. 287631: Add a comment to clarify how to determine the amount of received DDP data. Sponsored by: Chelsio Communications
* MFC 291665,291685,291856,297467,302110,302263: Add support for VIs.jhb2016-10-3112-1272/+1532
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 291665: Add support for configuring additional virtual interfaces (VIs) on a port. Each virtual interface has its own MAC address, queues, and statistics. The dedicated netmap interfaces (ncxgbeX / ncxlX) were already implemented as additional VIs on each port. This change allows additional non-netmap interfaces to be configured on each port. Additional virtual interfaces use the naming scheme vcxgbeX or vcxlX. Additional VIs are enabled by setting the hw.cxgbe.num_vis tunable to a value greater than 1 before loading the cxgbe(4) or cxl(4) driver. NB: The first VI on each port is the "main" interface (cxgbeX or cxlX). T4/T5 NICs provide a limited number of MAC addresses for each physical port. As a result, a maximum of six VIs can be configured on each port (including the "main" interface and the netmap interface when netmap is enabled). One user-visible result is that when netmap is enabled, packets received or transmitted via the netmap interface are no longer counted in the stats for the "main" interface, but are not accounted to the netmap interface. The netmap interfaces now also have a new-bus device and export various information sysctl nodes via dev.n(cxgbe|cxl).X. The cxgbetool 'clearstats' command clears the stats for all VIs on the specified port along with the port's stats. There is currently no way to clear the stats of an individual VI. 291685: Fix build for !TCP_OFFLOAD case. 291856: Fix RSS build. 297467: Remove #ifdef's from various structures used in the cxgbe/cxl driver. This provides a constant ABI and layout for these structures (especially struct adapter) avoiding some foot shooting. 302110: cxgbe(4): Merge netmap support from the ncxgbe/ncxl interfaces to the vcxgbe/vcxl interfaces and retire the 'n' interfaces. The main cxgbe/cxl interfaces and tunables related to them are not affected by any of this and will continue to operate as usual. The driver used to create an additional 'n' interface for every cxgbe/cxl interface if "device netmap" was in the kernel. The 'n' interface shared the wire with the main interface but was otherwise autonomous (with its own MAC address, etc.). It did not have normal tx/rx but had a specialized netmap-only data path. r291665 added another set of virtual interfaces (the 'v' interfaces) to the driver. These had normal tx/rx but no netmap support. This revision consolidates the features of both the interfaces into the 'v' interface which now has a normal data path, TOE support, and native netmap support. The 'v' interfaces need to be created explicitly with the hw.cxgbe.num_vis tunable. This means "device netmap" will not result in the automatic creation of any virtual interfaces. The following tunables can be used to override the default number of queues allocated for each 'v' interface. nofld* = 0 will disable TOE on the virtual interface and nnm* = 0 to will disable native netmap support. # number of normal NIC queues hw.cxgbe.ntxq_vi hw.cxgbe.nrxq_vi # number of TOE queues hw.cxgbe.nofldtxq_vi hw.cxgbe.nofldrxq_vi # number of netmap queues hw.cxgbe.nnmtxq_vi hw.cxgbe.nnmrxq_vi hw.cxgbe.nnm{t,r}xq{10,1}g tunables have been removed. --- tl;dr version --- The workflow for netmap on cxgbe starting with FreeBSD 11 is: 1) "device netmap" in the kernel config. 2) "hw.cxgbe.num_vis=2" in loader.conf. num_vis > 2 is ok too, you'll end up with multiple autonomous netmap-capable interfaces for every port. 3) "dmesg | grep vcxl | grep netmap" to verify that the interface has netmap queues. 4) Use any of the 'v' interfaces for netmap. pkt-gen -i vcxl<n>... . One major improvement is that the netmap interface has a normal data path as expected. 5) Just ignore the cxl interfaces if you want to use netmap only. No need to bring them up. The vcxl interfaces are completely independent and everything should just work. --------------------- 302263: cxgbe(4): Do not bring up an interface when IFCAP_TOE is enabled on it. The interface's queues are functional after VI_INIT_DONE (which is short of interface-up) and that's all that's needed for t4_tom to communicate with the chip. Relnotes: yes Sponsored by: Chelsio Communications
OpenPOWER on IntegriCloud