path: root/sys/net/if_var.h
Commit message | Author | Age | Files | Lines
* Persistently store NIC's hardware MAC address, and add a way to retrieve it | rpokala | 2017-05-19 | 1 | -2/+2
    jhb pointed out that (struct ifnet) is part of the network driver KBI, and thus the offsets of internal fields must not change. Therefore, move the new "if_hw_addr" field to the end, and consume one of the "if_pspare"s; that's what they're there for.

    Because netmap on stable/10 uses "if_pspare[0]", the new field replaces the *last* element of that array; that way, offsetof(if_pspare) is unchanged compared to before r318430.

    PR: 194386
    Reviewed by: jhb
    Pointyhat to: rpokala
    Sponsored by: Panasas
    (cherry picked from commit 2f103d239c07e4f88b9852f3b8689f100d7a31d0)
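The layout argument in this commit can be checked at compile time. The following is a minimal sketch, not the real struct ifnet: both structures and the spare count of four are hypothetical and only model the idea of turning the last spare pointer into a named field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "before" layout: four spare pointers reserved at the end. */
struct toy_ifnet_before {
	int	 if_index;
	void	*if_pspare[4];
};

/* Hypothetical "after" layout: the last spare becomes the new field. */
struct toy_ifnet_after {
	int	 if_index;
	void	*if_pspare[3];	/* one spare consumed */
	void	*if_hw_addr;	/* sits in the slot of the old if_pspare[3] */
};

/* if_pspare[0] keeps its offset, so code that already uses it is unaffected. */
static_assert(offsetof(struct toy_ifnet_before, if_pspare) ==
    offsetof(struct toy_ifnet_after, if_pspare), "if_pspare moved");

/* The new field lands exactly where the last spare used to be. */
static_assert(offsetof(struct toy_ifnet_after, if_hw_addr) ==
    offsetof(struct toy_ifnet_before, if_pspare) + 3 * sizeof(void *),
    "if_hw_addr is not in the last spare slot");

/* The overall structure size does not change either. */
static_assert(sizeof(struct toy_ifnet_before) ==
    sizeof(struct toy_ifnet_after), "structure size changed");
```

Had the new field replaced the first spare instead, the first assertion would fail, which is the situation the commit avoids for netmap.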
* MFC r318160, 318176: Persistently store NIC's hardware MAC address, and add a way to retrieve it | rpokala | 2017-05-19 | 1 | -0/+2
    NOTE: Due to restructuring, the merges didn't apply cleanly; the resulting change is almost identical to what went into stable/11, but in some cases in different locations.

    The MAC address reported by `ifconfig ${nic} ether' does not always match the address in the hardware, as reported by the driver during attach. In particular, NICs which are components of a lagg(4) interface all report the same MAC.

    When attaching, the NIC driver passes the MAC address it read from the hardware as an argument to ether_ifattach(). Keep a second copy of it, and create ioctl(SIOCGHWADDR) to return it. Teach `ifconfig' to report it along with the active MAC address.

    PR: 194386
    (cherry picked from commit 2ce46e31d62424593e08c3853efe8c1e9283aba2)
* MFC r287775: | hselasky | 2015-10-08 | 1 | -5/+6
    Update TSO limits to include all headers.

    To make driver programming easier, the TSO limits are changed to reflect the values used in the BUSDMA tag a network adapter driver is using. The TCP/IP network stack will subtract space for all link-level and protocol-level headers and ensure that the full mbuf chain passed to the network adapter fits within the given limits.

    See r287775 for a more detailed description.

    Differential Revision: https://reviews.freebsd.org/D3477
    Reviewed by: rmacklem
* Add if_inc_counter() and if_get_counter_default() functions that access ifnet counters, for code compatibility with FreeBSD 11. | ae | 2014-12-23 | 1 | -0/+18
    This is a direct commit to stable/10.

    Discussed with: glebius@, arch@
* MFC r274376: | hselasky | 2014-11-19 | 1 | -5/+17
    Fix some minor TSO issues:
    - Improve description of TSO limits.
    - Remove an unneeded KASSERT().
    - Remove some unneeded variable casts.

    Sponsored by: Mellanox Technologies
* MFC r271946 and r272595: | hselasky | 2014-11-03 | 1 | -5/+23
    Improve the TCP segmentation offload (TSO) algorithm in general. This change allows all HCAs from Mellanox Technologies to function properly when TSO is enabled. See r271946 and r272595 for more details about this commit.

    Sponsored by: Mellanox Technologies
* MFC changes relating to running multiple interfaces on different fibs but with addresses on the same subnet. | asomers | 2014-06-06 | 1 | -0/+2

    MFC r266860
    Fix unintended KBI change from r264905. Add _fib versions of ifa_ifwithnet() and ifa_ifwithdstaddr(). The legacy functions will call the _fib() versions with RT_ALL_FIBS, preserving legacy behavior.

    sys/net/if_var.h
    sys/net/if.c
        Add legacy-compatible functions as described above. Ensure legacy behavior when RT_ALL_FIBS is passed as fibnum.

    sys/netinet/in_pcb.c
    sys/netinet/ip_output.c
    sys/netinet/ip_options.c
    sys/net/route.c
    sys/net/rtsock.c
    sys/netinet6/nd6.c
        Call the _fib() functions if we must use a specific fib, or the legacy functions otherwise.

    tests/sys/netinet/fibs_test.sh
    tests/sys/netinet/udp_dontroute.c
        Improve the udp_dontroute test. The bug that this test exercises is that ifa_ifwithnet() will return the wrong address if multiple interfaces have addresses on the same subnet but with different fibs. The previous version of the test only considered one possible failure mode: that ifa_ifwithnet_fib() might fail to find any suitable address at all. The new version also checks whether ifa_ifwithnet_fib() finds the correct address by checking where the ARP request goes.

    MFC r264917
    Style fixes, mostly trailing whitespace elimination. No functional change.

    MFC r264905
    Fix subnet and default routes on different FIBs on the same subnet. These two bugs are closely related. The root cause is that ifa_ifwithnet does not consider FIBs when searching for an interface address.

    sys/net/if_var.h
    sys/net/if.c
        Add a fib argument to ifa_ifwithnet and ifa_ifwithdstaddr. Those functions will only return an address whose interface fib equals the argument.

    sys/net/route.c
        Update calls to ifa_ifwithnet and ifa_ifwithdstaddr with fib arguments.

    sys/netinet/in.c
        Update in_addprefix to consider the interface fib when adding prefixes. This will prevent it from not adding a subnet route when one already exists on a different fib.

    sys/net/rtsock.c
    sys/netinet/in_pcb.c
    sys/netinet/ip_output.c
    sys/netinet/ip_options.c
    sys/netinet6/nd6.c
        Add RT_DEFAULT_FIB arguments to ifa_ifwithdstaddr and ifa_ifwithnet. In some cases there wasn't a clear specific fib number to use. In others, I was unable to test those functions, so I chose RT_DEFAULT_FIB to minimize divergence from current behavior. I will fix some of the latter changes along with PR kern/187553.

    tests/sys/netinet/fibs_test.sh
    tests/sys/netinet/udp_dontroute.c
    tests/sys/netinet/Makefile
        Revert r263738. The udp_dontroute test was right all along. However, bugs kern/187550 and kern/187553 cancelled each other out when it came to this test. Because of kern/187553, ifa_ifwithnet searched the default fib instead of the requested one, but because of kern/187550, there was an applicable subnet route on the default fib. The new test added in r263738 doesn't work right, however. I can verify with dtrace that ifa_ifwithnet returned the wrong address before I applied this commit, but route(8) miraculously found the correct interface to use anyway. I don't know how.

        Clear expected failure messages for kern/187550 and kern/187552.

    MFC r263738
    tests/sys/netinet/Makefile
    tests/sys/netinet/fibs.sh
        Replace fibs:udp_dontroute with fibs:src_addr_selection_by_subnet. The original test was poorly written; it was actually testing kern/167947 instead of the desired kern/187553. The root cause of the bug is that ifa_ifwithnet did not have a fib argument. The new test more directly targets that behavior.

    tests/sys/netinet/udp_dontroute.c
        Delete the auxiliary binary used by the old test.
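The compatibility pattern described under r266860 (a fib-aware lookup plus a legacy wrapper that passes RT_ALL_FIBS) can be sketched as follows. This is an illustration under stated assumptions, not the kernel code: the toy_ifaddr table, the function names, and the sentinel value -1 are hypothetical, and the lookup returns the first match rather than the kernel's preferred match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_RT_ALL_FIBS	(-1)	/* assumed sentinel meaning "any fib" */

/* Hypothetical interface address record (host-order IPv4). */
struct toy_ifaddr {
	uint32_t addr;
	uint32_t mask;
	int	 fib;	/* fib of the owning interface */
};

/* Fib-aware lookup: only entries on the requested fib are considered. */
static const struct toy_ifaddr *
toy_ifwithnet_fib(const struct toy_ifaddr *tab, size_t n, uint32_t dst,
    int fibnum)
{
	for (size_t i = 0; i < n; i++) {
		if (fibnum != TOY_RT_ALL_FIBS && tab[i].fib != fibnum)
			continue;	/* wrong fib, skip */
		if ((dst & tab[i].mask) == (tab[i].addr & tab[i].mask))
			return (&tab[i]);
	}
	return (NULL);
}

/* Legacy entry point: same signature as before, searches every fib. */
static const struct toy_ifaddr *
toy_ifwithnet(const struct toy_ifaddr *tab, size_t n, uint32_t dst)
{
	return (toy_ifwithnet_fib(tab, n, dst, TOY_RT_ALL_FIBS));
}
```

Keeping the old signature as a thin wrapper is what lets existing callers continue to compile and behave as before, while callers that know their fib pass it explicitly.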
* Fix typo: minmum -> minimum. | cperciva | 2013-07-05 | 1 | -1/+1
    Submitted by: @z3ndrag0n
* Allow drivers to specify a maximum TSO length in bytes if they are limited in the amount of data they can handle at once. | andre | 2013-06-03 | 1 | -0/+5
    Drivers can set ifp->if_hw_tsomax before calling ether_ifattach() to change the limit.

    The lowest allowable size is IP_MAXPACKET / 8 (8192 bytes), as anything less wouldn't be very useful anymore. The upper limit is still at IP_MAXPACKET (65536 bytes). Raising it requires further auditing of the IPv4/v6 code paths, as the length field in the IP header would overflow, leading to confusion in firewalls and other packet handlers about the real size of the packet.

    The placement into "struct ifnet" is a bit hackish but the best place that was found. When the stack/driver boundary is updated it should be handled in a better way.

    Submitted by: cperciva (earlier version)
    Reviewed by: cperciva
    Tested by: cperciva
    MFC after: 1 week (using spare struct members to preserve ABI)
* Back out r249318, r249320 and r249327 due to a heisenbug most likely related to a race condition in the ipi_hash_lock, with the exact cause currently unknown but under investigation. | andre | 2013-05-06 | 1 | -3/+3
* Add const qualifier to the dst parameter of the ifnet if_output method. | glebius | 2013-04-26 | 1 | -1/+1
* Fix build. | glebius | 2013-04-10 | 1 | -1/+1
* Change certain heavily used network related mutexes and rwlocks to reside on their own cache line to prevent false sharing with other nearby structures, especially for those in the .bss segment. | andre | 2013-04-09 | 1 | -2/+2
    NB: Those mutexes and rwlocks with variables next to them that get changed on every invocation do not benefit from their own cache line. Actually it may be net negative because two cache misses would be incurred in those cases.
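Giving a lock its own cache line is usually done with an alignment attribute. The sketch below uses standard C11 alignas rather than the kernel's own macros, and assumes a 64-byte line; the toy_lock type and all names are hypothetical.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CACHE_LINE	64	/* assumed cache line size in bytes */

/* Hypothetical stand-in for a kernel mutex or rwlock. */
struct toy_lock {
	volatile int word;
};

/* The alignment pads the wrapper so no other object shares its line. */
struct toy_padded_lock {
	alignas(TOY_CACHE_LINE) struct toy_lock lock;
};

/* Two adjacent locks, e.g. in .bss, now sit on different cache lines. */
static struct toy_padded_lock toy_locks[2];

static_assert(sizeof(struct toy_padded_lock) == TOY_CACHE_LINE,
    "wrapper should fill exactly one line");
static_assert(alignof(struct toy_padded_lock) == TOY_CACHE_LINE,
    "wrapper should start on a line boundary");

/* Byte distance between the two locks; expected to equal the line size. */
static size_t
toy_lock_stride(void)
{
	return ((size_t)((uintptr_t)&toy_locks[1] - (uintptr_t)&toy_locks[0]));
}
```

The NB in the commit message still applies: if a variable that is written on every lock acquisition lives next to the lock, separating them costs a second cache miss instead of saving one.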
* Resolve source address selection in presence of CARP. | glebius | 2013-02-11 | 1 | -1/+1
    Add a couple of helper functions:
    - carp_master() - boolean function which is true if an address is in the MASTER state.
    - ifa_preferred() - boolean function that compares two addresses, and is aware of CARP.

    Utilize ifa_preferred() in ifa_ifwithnet().

    The previous version of the patch also changed source address selection logic in jails using carp_master(), but we failed to negotiate this part with Bjoern. Maybe we will approach this problem again later.

    Reported & tested by: Anton Yuzhaninov <citrin citrin.ru>
    Sponsored by: Nginx, Inc
* This fixes an out-of-order problem with several of the newer drivers. | rrs | 2013-02-07 | 1 | -1/+52
    The basic problem was that the driver was pulling the mbuf off the drbr ring and then, when sending with xmit(), encountering a full transmit ring. Thus the lower layer xmit() function would return an error, and the drivers would then append the data back on to the ring. For TCP this is a horrible scenario sure to bring on a fast-retransmit.

    The fix is to use drbr_peek() to pull the data pointer but not remove it from the ring. If it fails then we either call the new drbr_putback or drbr_advance method. Advance moves it forward (we do this sometimes when the xmit() function frees the mbuf). When we succeed we always call advance. The putback will always copy the mbuf back to the top of the ring. Note that the putback *cannot* be used with a drbr_dequeue(), only with drbr_peek(). Most of the time, in putback, we would not need to copy it back since most likely the mbuf is still the same, but sometimes xmit() functions will change the mbuf via a pullup or other call. So the optimal case for the single consumer is to always copy it back. If we ever do a multiple_consumer (for lagg?) we will need a test and atomic in the put back, possibly a separate putback_mc() in the ring buf.

    Reviewed by: jhb@freebsd.org, jlv@freebsd.org
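The peek/advance/putback discipline described in this commit can be modeled with a small single-consumer ring. This is a toy sketch, not the drbr(9) or buf_ring(9) implementation: every name is hypothetical, packets are opaque pointers, and the "hardware" is a counter of free transmit descriptors.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_RING_SIZE	8	/* power of two, so index wraparound is safe */

struct toy_ring {
	void	*slot[TOY_RING_SIZE];
	unsigned head;	/* consumer index */
	unsigned tail;	/* producer index */
};

static int
toy_ring_enqueue(struct toy_ring *r, void *m)
{
	if (r->tail - r->head == TOY_RING_SIZE)
		return (-1);			/* ring full */
	r->slot[r->tail++ % TOY_RING_SIZE] = m;
	return (0);
}

/* Look at the head entry without removing it. */
static void *
toy_ring_peek(struct toy_ring *r)
{
	return (r->head == r->tail ? NULL : r->slot[r->head % TOY_RING_SIZE]);
}

/* Consume the entry previously returned by peek. */
static void
toy_ring_advance(struct toy_ring *r)
{
	assert(r->head != r->tail);
	r->head++;
}

/* Store a possibly changed pointer back into the peeked slot. */
static void
toy_ring_putback(struct toy_ring *r, void *m)
{
	assert(r->head != r->tail);
	r->slot[r->head % TOY_RING_SIZE] = m;
}

/* Hypothetical hardware: fails when no transmit descriptors are free. */
static int toy_tx_free;

static int
toy_xmit(void **mp)
{
	(void)mp;	/* a real xmit may replace *mp or free it and set NULL */
	if (toy_tx_free == 0)
		return (-1);
	toy_tx_free--;
	return (0);
}

/* Drain loop: order is preserved because a failed packet never leaves. */
static unsigned
toy_drain(struct toy_ring *r)
{
	unsigned sent = 0;
	void *m;

	while ((m = toy_ring_peek(r)) != NULL) {
		if (toy_xmit(&m) != 0) {
			if (m == NULL)
				toy_ring_advance(r);	/* xmit freed it */
			else
				toy_ring_putback(r, m);	/* keep at head */
			break;
		}
		toy_ring_advance(r);
		sent++;
	}
	return (sent);
}
```

Because the failed packet stays at the head of the ring, packets enqueued while the hardware was full cannot overtake it, which is the reordering the commit removes.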
* Mechanically remove the last stray remains of spl* calls from net*/*. | andre | 2012-10-18 | 1 | -1/+1
    They have been no-ops for a long time now.
* provide helper if_initbaudrate() to set if_baudrate and if_baudrate_pf. | emax | 2012-10-17 | 1 | -0/+12
    again, use ixgbe(4) as an example of how to use the new helper function.

    Reviewed by: jhb
    MFC after: 1 week
* introduce concept of ifi_baudrate power factor. | emax | 2012-10-16 | 1 | -0/+1
    the idea is to work around the problem where high speed interfaces (such as ixgbe(4)) are not able to report real ifi_baudrate. basically, take a spare byte from struct if_data and use it to store ifi_baudrate power factor. in other words,

    real ifi_baudrate = ifi_baudrate * 10 ^ ifi_baudrate power factor

    this should be backwards compatible with old binaries. use ixgbe(4) as an example on how drivers would set ifi_baudrate power factor

    Discussed with: kib, scottl, glebius
    MFC after: 1 week
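The formula above can be illustrated with a small encode/decode pair. This is a sketch only: the toy_if_data structure, the 32-bit width of the baud rate field, and both function names are assumptions made for the example, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two if_data fields discussed above. */
struct toy_if_data {
	uint32_t ifi_baudrate;		/* assumed too narrow for >4.29 Gb/s */
	uint8_t	 ifi_baudrate_pf;	/* power-of-ten exponent */
};

/* Scale the rate down by tens until it fits, recording the exponent. */
static void
toy_initbaudrate(struct toy_if_data *d, uint64_t baud)
{
	d->ifi_baudrate_pf = 0;
	while (baud > UINT32_MAX) {
		baud /= 10;		/* truncates; low digits may be lost */
		d->ifi_baudrate_pf++;
	}
	d->ifi_baudrate = (uint32_t)baud;
}

/* real rate = ifi_baudrate * 10 ^ ifi_baudrate_pf */
static uint64_t
toy_real_baudrate(const struct toy_if_data *d)
{
	uint64_t baud = d->ifi_baudrate;

	for (uint8_t i = 0; i < d->ifi_baudrate_pf; i++)
		baud *= 10;
	return (baud);
}
```

An old binary that ignores the exponent byte simply reads the scaled-down value, which is why the change stays compatible with existing consumers of the structure.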
* The drbr(9) API appeared to be so unclear that most drivers in tree used it incorrectly, which led to inaccurate, overrated if_obytes accounting. | glebius | 2012-09-28 | 1 | -17/+3
    The drbr(9) used to update ifnet stats on drbr_enqueue(), which is not accurate since enqueuing doesn't imply successful processing by the driver. Neither does dequeuing. Most drivers also called drbr_stats_update(), which did accounting again, leading to doubled if_obytes statistics. And in case of severe transmitting, when a packet could be enqueued and dequeued several times, it could have been accounted several times.

    o Thus, make drbr(9) API thinner. Now drbr(9) merely chooses between ALTQ queueing or buf_ring(9) queueing.
      - It doesn't touch the buf_ring stats any more.
      - It doesn't touch ifnet stats anymore.
      - drbr_stats_update() no longer exists.

    o buf_ring(9) handles its stats itself:
      - It handles br_drops itself.
      - br_prod_bytes stats are dropped. Rationale: no one ever reads them, but update of a common counter on every packet negatively affects performance due to excessive cache invalidation.
      - buf_ring_enqueue_bytes() reduced to buf_ring_enqueue(), since we no longer account bytes.

    o Drivers handle their stats themselves: if_obytes, if_omcasts.

    o mlx4(4), igb(4), em(4), vxge(4), oce(4) and ixv(4) no longer use drbr_stats_update(), and update ifnet stats themselves.

    o bxe(4) was the most correct driver; it didn't call drbr_stats_update(), thus it was the only driver accurate under moderate load. Now it also maintains stats itself.

    o ixgbe(4) had already taken stats from hardware, so just
      - drop software stats updating.
      - take multicast packet count from hardware as well.

    o mxge(4) just no longer needs the NO_SLOW_STATS define.

    o cxgb(4), cxgbe(4) need no change, since they obtain stats from hardware.

    Reviewed by: jfv, gnn
* Fix the build broken by r240099. | melifaro | 2012-09-04 | 1 | -0/+2
    Hide link_pfil_hook under _KERNEL macro.

    MFC after: 3 weeks
* Introduce new link-layer PFIL hook V_link_pfil_hook. | melifaro | 2012-09-04 | 1 | -0/+3
    Merge ether_ipfw_chk() and part of bridge_pfil() into unified ipfw_check_frame() function called by PFIL. This change was suggested by rwatson? @ DevSummit.

    Remove ipfw headers from ether/bridge code since they are unneeded now.

    Note this change introduces some (temporary) performance penalty since the PFIL read lock has to be acquired for every link-level packet.

    MFC after: 3 weeks
* Fix races between in_lltable_prefix_free(), lla_lookup(), llentry_free() and arptimer(): | glebius | 2012-08-02 | 1 | -0/+2
    o Use callout_init_rw() for lle timeout; this allows us to safely disestablish them.
      - This allows us to simplify the arptimer() and make it race safe.
    o Consistently use ifp->if_afdata_lock to lock access to linked lists in the lle hashes.
    o Introduce new lle flag LLE_LINKED, which marks an entry that is attached to the hash.
      - Use LLE_LINKED to avoid double unlinking via subsequent calls to llentry_free().
      - Mark lle with LLE_DELETED via |= operation instead of =, so that other flags won't be lost.
    o Make LLE_ADDREF(), LLE_REMREF() and LLE_FREE_LOCKED() more consistent and provide more informative KASSERTs.

    The patch is a collaborative work of all submitters and myself.

    PR: kern/165863
    Submitted by: Andrey Zonov <andrey zonov.org>
    Submitted by: Ryan Stone <rysto32 gmail.com>
    Submitted by: Eric van Gyzen <eric_van_gyzen dell.com>
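Two of the points in this list, guarding against a second unlink with a "linked" flag and setting "deleted" with |= rather than =, can be shown in a few lines. The flag values, the toy_lle structure, and toy_lle_unlink() are hypothetical; the real llentry code is more involved and also deals with locking and reference counts.

```c
#include <assert.h>

/* Hypothetical flag bits; the kernel's actual values are not shown here. */
#define TOY_LLE_DELETED	0x0001	/* entry has been removed */
#define TOY_LLE_STATIC	0x0002	/* some unrelated flag that must survive */
#define TOY_LLE_LINKED	0x0004	/* entry is attached to the hash */

struct toy_lle {
	unsigned flags;
};

/*
 * Unlink an entry at most once. Returns 1 if this call did the unlink,
 * 0 if the entry was already unlinked by an earlier call.
 */
static int
toy_lle_unlink(struct toy_lle *lle)
{
	if ((lle->flags & TOY_LLE_LINKED) == 0)
		return (0);			/* second call is a no-op */
	lle->flags &= ~TOY_LLE_LINKED;
	lle->flags |= TOY_LLE_DELETED;		/* |= keeps the other bits */
	return (1);
}
```

With a plain assignment (flags = TOY_LLE_DELETED) the TOY_LLE_STATIC bit in this sketch would be wiped, which is the kind of loss the commit message refers to.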
* - Updated TOE support in the kernel. | np | 2012-06-19 | 1 | -0/+2
    - Stateful TCP offload drivers for Terminator 3 and 4 (T3 and T4) ASICs. These are available as t3_tom and t4_tom modules that augment cxgb(4) and cxgbe(4) respectively. The cxgb/cxgbe drivers continue to work as usual with or without these extra features.
    - iWARP driver for Terminator 3 ASIC (kernel verbs). T4 iWARP is in the works and will follow soon.

    Build-tested with make universe.

    30s overview
    ============
    What interfaces support TCP offload? Look for TOE4 and/or TOE6 in the capabilities of an interface:
    # ifconfig -m | grep TOE

    Enable/disable TCP offload on an interface (just like any other ifnet capability):
    # ifconfig cxgbe0 toe
    # ifconfig cxgbe0 -toe

    Which connections are offloaded? Look for toe4 and/or toe6 in the output of netstat and sockstat:
    # netstat -np tcp | grep toe
    # sockstat -46c | grep toe

    Reviewed by: bz, gnn
    Sponsored by: Chelsio communications.
    MFC after: ~3 months (after 9.1, and after ensuring MFC is feasible)
* Retire the IF_ADDR_LOCK() and IF_ADDR_UNLOCK() compat macros from HEAD. | jhb | 2012-03-19 | 1 | -3/+0
    The new [RW]LOCK macros are merged back to 8.x so should be suitable for new code in HEAD even if it is to be MFC'd.
* g/c last bit of old ipv6 prefix management. | pluknet | 2012-02-08 | 1 | -16/+1
    Reviewed by: bz
    Obtained from: NetBSD, net/if.h, rev 1.80
* Convert the per-interface address list lock from a mutex to a reader/writer lock. | jhb | 2012-01-09 | 1 | -11/+10
    Reviewed by: bz
* Add new variants of the IF_ADDR_*LOCK*() macros used for protecting interface address lists that distinguish read locks from write locks. | jhb | 2012-01-05 | 1 | -2/+8
    To preserve the KPI, the previous operations are mapped to the write lock macros. The lock is still kept as a mutex for now.

    Reviewed by: bz
    MFC after: 2 weeks
* A major overhaul of the CARP implementation. | glebius | 2011-12-16 | 1 | -0/+2
    The ip_carp.c was started from scratch, copying needed functionality from the old implementation on demand, with a thorough review of all code. The main change is that the interface layer has been removed from CARP. Now redundant addresses are configured exactly on the interfaces they run on.

    The CARP configuration itself is, as before, configured and read via SIOCSVH/SIOCGVH ioctls. A new prefix created with SIOCAIFADDR or SIOCAIFADDR_IN6 may now be configured to a particular virtual host id, which makes the prefix redundant.

    ifconfig(8) semantics have been changed too: now one doesn't need to clone a carpXX interface; he/she should directly configure a vhid on an Ethernet interface.

    To supply vhid data from the kernel to an application, the getifaddrs(3) function has been changed to pass ifam_data with each address. [1]

    The new implementation definitely closes all PRs related to carp(4) being an interface, and may close several others. It also allows running a single redundant IP per interface.

    Big thanks to Bjoern Zeeb for his help with the inet6 part of the patch, for the idea of using ifam_data and for several rounds of reviewing!

    PR: kern/117000, kern/126945, kern/126714, kern/120130, kern/117448
    Reviewed by: bz
    Submitted by: bz [1]
* Remove the unused if_free_type() function. | brooks | 2011-12-09 | 1 | -1/+0
    X-MFC after: never
* Add macro IF_DEQUEUE_ALL(ifq, m), that takes the entire mbuf chain off the queue. | glebius | 2011-10-27 | 1 | -0/+12
    It can be utilized in queue processing to avoid multiple locking/unlocking.
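The benefit of taking the whole chain at once can be seen in a toy queue that counts lock acquisitions. Nothing here is the kernel macro: the structures, the function names, and the lock counter standing in for a real mutex are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical packet with an intrusive next pointer, like an mbuf chain. */
struct toy_pkt {
	struct toy_pkt *next;
	int id;
};

struct toy_ifqueue {
	struct toy_pkt *head;
	struct toy_pkt *tail;
	int len;
	int lock_ops;	/* counts acquisitions; stands in for a mutex */
};

static void toy_q_lock(struct toy_ifqueue *q)   { q->lock_ops++; }
static void toy_q_unlock(struct toy_ifqueue *q) { (void)q; }

static void
toy_q_enqueue(struct toy_ifqueue *q, struct toy_pkt *p)
{
	toy_q_lock(q);
	p->next = NULL;
	if (q->tail != NULL)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;
	q->len++;
	toy_q_unlock(q);
}

/* Detach the entire chain under a single lock acquisition. */
static struct toy_pkt *
toy_q_dequeue_all(struct toy_ifqueue *q)
{
	struct toy_pkt *chain;

	toy_q_lock(q);
	chain = q->head;
	q->head = q->tail = NULL;
	q->len = 0;
	toy_q_unlock(q);
	return (chain);		/* caller walks the chain without the lock */
}
```

Dequeuing n packets one at a time would take the lock n times; the bulk variant takes it once and lets the consumer process the chain unlocked.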
* Add spares to the network stack for FreeBSD-9: | bz | 2011-07-17 | 1 | -3/+3
    - TCP keep* timers
    - TCP UTO (adjust from what was there already)
    - netmap
    - route caching
    - user cookie (temporary to allow for the real fix)

    Slightly re-shuffle struct ifnet, moving fields out of the middle of spares and to better align.

    Discussed with: rwatson (slightly earlier version)
* Remove extra white space to comply with style for the rest of the struct. | bz | 2011-07-03 | 1 | -2/+2
    MFC after: 2 weeks
* Add infrastructure to allow all frames/packets received on an interface to be assigned to a non-default FIB instance. | bz | 2011-07-03 | 1 | -0/+1
    You may need to recompile world or ports due to the change of struct ifnet.

    Submitted by: cjsp
    Submitted by: Alexander V. Chernikov (melifaro ipfw.ru) (original versions)
    Reviewed by: julian
    Reviewed by: Alexander V. Chernikov (melifaro ipfw.ru)
    MFC after: 2 weeks
    X-MFC: use spare in struct ifnet
* - Merge changes to the base system to support OFED. | jeff | 2011-03-21 | 1 | -0/+3
    These include a wider arg2 for sysctl, updates to vlan code, IFT_INFINIBAND, and other miscellaneous small features.
* This patch fixes the problem where proxy ARP entries cannot be added over the if_ng interface. | qingli | 2010-05-25 | 1 | -1/+1
    MFC after: 3 days
* Fix a small bug in drbr_dequeue_cond spotted while preparing MFC of r203834. | mlaier | 2010-03-15 | 1 | -1/+1
    MFC after: 3 days
* Fix drbr and altq interaction: | mlaier | 2010-02-13 | 1 | -13/+23
    - introduce drbr_needs_enqueue that returns whether the interface/br needs an enqueue operation: returns true if altq is enabled or there are already packets in the ring (as we need to maintain packet order)
    - update all drbr consumers
    - fix drbr_flush
    - avoid using the driver queue (IFQ_DRV_*) in the altq case as the multiqueue consumer does not provide enough protection, serialize altq interaction with the main queue lock
    - make drbr_dequeue_cond work with altq

    Discussed with: kmacy, yongari, jfv
    MFC after: 4 weeks
* Revised revision 199201 (add interface description capability as inspired by OpenBSD), based on comments from many, including rwatson, jhb, brooks and others. | delphij | 2010-01-27 | 1 | -1/+2
    Sponsored by: iXsystems, Inc.
    MFC after: 1 month
* While flushing the multicast filter of an interface, do not zero the relevant ifmultiaddr structures' reference to the parent interface, unless the parent interface is really detaching. | syrinx | 2010-01-24 | 1 | -1/+1
    While here, program only link layer multicast filters to a wlan's hardware parent interface.

    PR: kern/142391, kern/142392
    Reviewed by: sam, rpaolo, bms
    MFC after: 1 week
* Declare a new EVENTHANDLER called iflladdr_event which signals that the L2 address on an interface has changed. | thompsa | 2010-01-18 | 1 | -0/+3
    This lets stacked interfaces such as vlan(4) detect that their lower interface has changed and adjust things in order to keep working. Previously this situation broke at least vlan(4) and lagg(4) configurations.

    The EVENTHANDLER_INVOKE call was not placed within if_setlladdr() due to the risk of a loop.

    PR: kern/142927
    Submitted by: Nikolay Denev
* Remove a deleted comment line that was brought back by my previous commit. | qingli | 2009-12-31 | 1 | -1/+0
    MFC after: 5 days
* The proxy arp entries could not be added into the system over the IFF_POINTOPOINT link types. | qingli | 2009-12-30 | 1 | -0/+2
    The reason was that the routing entry returned from the kernel covering the remote end is of an interface type that does not support ARP. This patch fixes this problem by providing a hint to the kernel routing code, which indicates the prefix route instead of the PPP host route should be returned to the caller.

    Since a host route to the local end point is also added into the routing table, and there could be multiple such instantiations because multiple PPP links can be created with the same local end IP address, this patch also fixes the loopback route installation failure problem observed prior to this patch. The reference count of the loopback route to the local end would be either incremented or decremented. The first instantiation would create the entry and the last removal would delete the route entry.

    MFC after: 5 days
* Remove commented out prototype for ifinit(). | jhb | 2009-12-21 | 1 | -1/+0
    This prototype has been commented out since 1.1 and has not been present in <sys/systm.h> since at least 1.1 of that file. It is also not needed in FreeBSD due to SYSINIT().
* Remove if_timer/if_watchdog now that they are no longer used. | jhb | 2009-11-30 | 1 | -3/+1
    The space used by if_timer is reserved for expanding if_index to an int in the future.

    Reviewed by: rwatson, brooks
* Revert revision 199201 for now as it has introduced a kernel vulnerability and requires more polishing. | delphij | 2009-11-12 | 1 | -2/+1
* Add interface description capability as inspired by OpenBSD. | delphij | 2009-11-11 | 1 | -1/+2
    MFC after: 3 months
* Self pointing routes are installed for configured interface addresses and address aliases. | qingli | 2009-09-15 | 1 | -0/+3
    After an interface is brought down and brought back up again, those self pointing routes disappeared. This patch ensures that after an interface is brought back up, the loopback routes are reinstalled properly.

    Reviewed by: bz
    MFC after: immediately
* Make if_grow static -- it's not used outside of if.c, and with the internals destined to change, it's better if it remains that way. | rwatson | 2009-08-24 | 1 | -1/+0
    MFC after: 3 days
* Rework global locks for interface list and index management, correcting several critical bugs, including race conditions and lock order issues: | rwatson | 2009-08-23 | 1 | -8/+34
    Replace the single rwlock, ifnet_lock, with two locks, an rwlock and an sxlock. Either can be held to stabilize the lists and indexes, but both are required to write. This allows the list to be held stable in both network interrupt contexts and sleepable user threads across sleeping memory allocations or device driver interactions. As before, writes to the interface list must occur from sleepable contexts.

    Reviewed by: bz, julian
    MFC after: 3 days
* Remove unused if_rawoutput() macro; it has been unused since at least FreeBSD 2. | rwatson | 2009-08-15 | 1 | -1/+0
    Approved by: re (kib)