summaryrefslogtreecommitdiffstats
path: root/sys/net/if.c
Commit message (Collapse)AuthorAgeFilesLines
* MFC: r312687, r312916dexuan2017-02-231-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Approved by: sephe (mentor) r312687 ifnet: introduce event handlers for ifup/ifdown events Hyper-V's NIC SR-IOV implementation needs a Hyper-V synthetic NIC and a VF NIC to work together, mainly to support seamless live migration. When the VF device becomes UP (or DOWN), the synthetic NIC driver needs to switch the data path from the synthetic NIC to the VF (or the opposite). So the synthetic NIC driver needs to know when a VF device is becoming UP or DOWN and hence the patch is made. Reviewed by: sephe Approved by: sephe (mentor) Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D8963 r312916 ifnet: move the new ifnet_event EVENTHANDLER_DECLARE to net/if_var.h Thank glebius for pointing this out: "The network stuff shall not be added to sys/eventhandler.h" Reviewed by: David_A_Bright_DELL.com, sephe, glebius Approved by: sephe (mentor) Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D9345
* MFC 311475sephe2017-02-211-2/+4
| | | | | | | | | | | if: Defer the if_up until the ifnet.if_ioctl is called. This ensures the interface is initialized by the interface driver before it can be used by the rest of the system. Reviewed by: jhb, karels, gnn Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D8905
* MFC 307078sephe2016-10-191-1/+1
| | | | | | | | | | ifnet: Use if_link_state snapshot to invoke ifnet_link_event So that everyone in this task have consistent view of link state. Reviewed by: ae Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D8214
* MFC 300670 (slightly adapted for 10-STABLE):n_hibma2016-06-011-1/+2
| | | | | | | | Change net.link.log_promisc_mode_change to a read-only tunable. PR: 166255 Submitted by: eugen.grosbein.net Obtained from: hselasky
* MFC 299559:n_hibma2016-05-231-4/+14
| | | | | Allow silencing of 'promiscuous mode enabled/disabled' messages.
* MFC r299865truckman2016-05-201-0/+5
| | | | | When handling SIOCSIFNAME ensure that the new interface name is NUL terminated. Reject the rename attempt if the name is too long.
* Merge r285713 (by zec@) from head:glebius2016-04-111-3/+4
| | | | Prevent null-pointer dereferencing.
* MFC r292604:bz2016-01-211-3/+17
| | | | | | | | | | | If vnets are torn down while ifconfig runs an ioctl to say, destroy an epair(4), we may hit if_detach_internal() without holding a lock and by the time we aquire it the interface might be gone. We should not panic() in this case as it is our fault for not holding the lock all the way. It is not ideal to return silently without error to user space, but other callers will all ignore the return values so do not change the entire KPI for little benefit for now. The ifp will be dealt with one way or another still.
* MFC r279538:hrs2015-07-231-8/+26
| | | | | | | | | | | | | | | | | | Fix group membership of cloned interfaces when one is moved by if_vmove(). In if_vmove(), if_detach_internal() and if_attach_internal() were called in series to detach and reattach the interface. When detaching, if_delgroup() was called and the interface leaves all of the group membership. And then upon attachment, if_addgroup(ifp, IFG_ALL) was called and it joined only "all" group again. This had a problem. Normally, a cloned interface automatically joins a group whose name is ifc_name of the cloner in addition to "all" upon creation. However, if_vmove() removed the membership and did not restore upon attachment. Approved by: re (gjb)
* MFC r281236 -- extended media types in if_media.h.erj2015-05-291-0/+1
| | | | Approved by: jfv (mentor)
* MFC r279920:ae2015-03-191-1/+11
| | | | | | | | | | Add if_input_default() method, that will be used for if_input initialization, when no input method specified before if_attach(). This prevents panics when if_input() method called directly e.g. from bpf(4) code. PR: 192426
* Add if_inc_counter() and if_get_counter_default() functions that doae2014-12-231-0/+94
| | | | | | | | access to ifnet counters for code compatibility with FreeBSD 11. This is direct commit to stable/10. Discussed with: glebius@, arch@
* MFC r274376:hselasky2014-11-191-7/+0
| | | | | | | | | Fix some minor TSO issues: - Improve description of TSO limits. - Remove a not needed KASSERT() - Remove some not needed variable casts. Sponsored by: Mellanox Technologies
* MFC r271946 and r272595:hselasky2014-11-031-6/+80
| | | | | | | | | Improve transmit sending offload, TSO, algorithm in general. This change allows all HCAs from Mellanox Technologies to function properly when TSO is enabled. See r271946 and r272595 for more details about this commit. Sponsored by: Mellanox Technologies
* MFC r268787:kevlo2014-07-241-1/+1
| | | | Deprecate m_act. Use m_nextpkt always.
* MFC r264887asomers2014-06-061-2/+2
| | | | | | | | | | | | | | | | | | | | | Fix host and network routes for new interfaces when net.add_addr_allfibs=0 sys/net/route.c In rtinit1, use the interface fib instead of the process fib. The latter wasn't very useful because ifconfig(8) is usually invoked with the default process fib. Changing ifconfig(8) to use setfib(2) would be redundant, because it already sets the interface fib. tests/sys/netinet/fibs_test.sh Clear the expected ATF failure sys/net/if.c Pass the interface fib in calls to rtrequest1_fib and rtalloc1_fib sys/netinet/in.c sys/net/if_var.h Add a fibnum argument to ifa_switch_loopback_route, a subroutine of in_scrubprefix. Pass it the interface fib.
* MFC changes relating to running multiple interfaces on different fibs butasomers2014-06-061-5/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | with addresses on the same subnet. MFC r266860 Fix unintended KBI change from r264905. Add _fib versions of ifa_ifwithnet() and ifa_ifwithdstaddr() The legacy functions will call the _fib() versions with RT_ALL_FIBS, preserving legacy behavior. sys/net/if_var.h sys/net/if.c Add legacy-compatible functions as described above. Ensure legacy behavior when RT_ALL_FIBS is passed as fibnum. sys/netinet/in_pcb.c sys/netinet/ip_output.c sys/netinet/ip_options.c sys/net/route.c sys/net/rtsock.c sys/netinet6/nd6.c Call with _fib() functions if we must use a specific fib, or the legacy functions otherwise. tests/sys/netinet/fibs_test.sh tests/sys/netinet/udp_dontroute.c Improve the udp_dontroute test. The bug that this test exercises is that ifa_ifwithnet() will return the wrong address, if multiple interfaces have addresses on the same subnet but with different fibs. The previous version of the test only considered one possible failure mode: that ifa_ifwithnet_fib() might fail to find any suitable address at all. The new version also checks whether ifa_ifwithnet_fib() finds the correct address by checking where the ARP request goes. MFC r264917 Style fixes, mostly trailing whitespace elimination. No functional change. MFC r264905 Fix subnet and default routes on different FIBs on the same subnet. These two bugs are closely related. The root cause is that ifa_ifwithnet does not consider FIBs when searching for an interface address. sys/net/if_var.h sys/net/if.c Add a fib argument to ifa_ifwithnet and ifa_ifwithdstadddr. Those functions will only return an address whose interface fib equals the argument. sys/net/route.c Update calls to ifa_ifwithnet and ifa_ifwithdstaddr with fib arguments. sys/netinet/in.c Update in_addprefix to consider the interface fib when adding prefixes. This will prevent it from not adding a subnet route when one already exists on a different fib. sys/net/rtsock.c sys/netinet/in_pcb.c sys/netinet/ip_output.c sys/netinet/ip_options.c sys/netinet6/nd6.c Add RT_DEFAULT_FIB arguments to ifa_ifwithdstaddr and ifa_ifwithnet. In some cases it there wasn't a clear specific fib number to use. In others, I was unable to test those functions so I chose RT_DEFAULT_FIB to minimize divergence from current behavior. I will fix some of the latter changes along with PR kern/187553. tests/sys/netinet/fibs_test.sh tests/sys/netinet/udp_dontroute.c tests/sys/netinet/Makefile Revert r263738. The udp_dontroute test was right all along. However, bugs kern/187550 and kern/187553 cancelled each other out when it came to this test. Because of kern/187553, ifa_ifwithnet searched the default fib instead of the requested one, but because of kern/187550, there was an applicable subnet route on the default fib. The new test added in r263738 doesn't work right, however. I can verify with dtrace that ifa_ifwithnet returned the wrong address before I applied this commit, but route(8) miraculously found the correct interface to use anyway. I don't know how. Clear expected failure messages for kern/187550 and kern/187552. MFC r263738 tests/sys/netinet/Makefile tests/sys/netinet/fibs.sh Replace fibs:udp_dontroute with fibs:src_addr_selection_by_subnet. The original test was poorly written; it was actually testing kern/167947 instead of the desired kern/187553. The root cause of the bug is that ifa_ifwithnet did not have a fib argument. The new test more directly targets that behavior. tests/sys/netinet/udp_dontroute.c Delete the auxilliary binary used by the old test
* MFC: r264630rmacklem2014-05-061-1/+3
| | | | | | | | | | | | | | | | | | | | | For NFS mounts using rsize,wsize=65536 over TSO enabled network interfaces limited to 32 transmit segments, there are two known issues. The more serious one is that for an I/O of slightly less than 64K, the net device driver prepends an ethernet header, resulting in a TSO segment slightly larger than 64K. Since m_defrag() copies this into 33 mbuf clusters, the transmit fails with EFBIG. A tester indicated observing a similar failure using iSCSI. The second less critical problem is that the network device driver must copy the mbuf chain via m_defrag() (m_collapse() is not sufficient), resulting in measurable overhead. This patch reduces the default size of if_hw_tsomax slightly, so that the first issue is avoided. Fixing the second issue will require a way for the network device driver to inform tcp_output() that it is limited to 32 transmit segments.
* Fix the length calculation for the final block of a sendfile(2)des2013-09-101-2/+16
| | | | | | | | | | | | | | | | | | | | transmission which could be tricked into rounding up to the nearest page size, leaking up to a page of kernel memory. [13:11] In IPv6 and NetATM, stop SIOCSIFADDR, SIOCSIFBRDADDR, SIOCSIFDSTADDR and SIOCSIFNETMASK at the socket layer rather than pass them on to the link layer without validation or credential checks. [SA-13:12] Prevent cross-mount hardlinks between different nullfs mounts of the same underlying filesystem. [SA-13:13] Security: CVE-2013-5666 Security: FreeBSD-SA-13:11.sendfile Security: CVE-2013-5691 Security: FreeBSD-SA-13:12.ifioctl Security: CVE-2013-5710 Security: FreeBSD-SA-13:13.nullfs Approved by: re
* PR: 168520 170096rodrigc2013-07-151-3/+6
| | | | | | | | | | | | | | | | | | | | Submitted by: adrian, zec Fix multiple kernel panics when VIMAGE is enabled in the kernel. These fixes are based on patches submitted by Adrian Chadd and Marko Zec. (1) Set curthread->td_vnet to vnet0 in device_probe_and_attach() just before calling device_attach(). This fixes multiple VIMAGE related kernel panics when trying to attach Bluetooth or USB Ethernet devices because curthread->td_vnet is NULL. (2) Set curthread->td_vnet in if_detach(). This fixes kernel panics when detaching networking interfaces, especially USB Ethernet devices. (3) Use VNET_DOMAIN_SET() in ng_btsocket.c (4) In ng_unref_node() set curthread->td_vnet. This fixes kernel panics when detaching Netgraph nodes.
* Fix build with both INET and INET6 disabled.jhb2013-06-041-0/+2
|
* Allow drivers to specify a maximum TSO length in bytes if they areandre2013-06-031-6/+13
| | | | | | | | | | | | | | | | | | | | | | | limited in the amount of data they can handle at once. Drivers can set ifp->if_hw_tsomax before calling ether_ifattach() to change the limit. The lowest allowable size is IP_MAXPACKET / 8 (8192 bytes) as anything less wouldn't be very useful anymore. The upper limit is still at IP_MAXPACKET (65536 bytes). Raising it requires further auditing of the IPv4/v6 code path's as the length field in the IP header would overflow leading to confusion in firewalls and others packet handler on the real size of the packet. The placement into "struct ifnet" is a bit hackish but the best place that was found. When the stack/driver boundary is updated it should be handled in a better way. Submitted by: cperciva (earlier version) Reviewed by: cperciva Tested by: cperciva MFC after: 1 week (using spare struct members to preserve ABI)
* Back out r249318, r249320 and r249327 due to a heisenbug mostandre2013-05-061-1/+1
| | | | | likely related to a race condition in the ipi_hash_lock with the exact cause currently unknown but under investigation.
* Add const qualifier to the dst parameter of the ifnet if_output method.glebius2013-04-261-1/+1
|
* Change certain heavily used network related mutexes and rwlocks toandre2013-04-091-1/+1
| | | | | | | | | | reside on their own cache line to prevent false sharing with other nearby structures, especially for those in the .bss segment. NB: Those mutexes and rwlocks with variables next to them that get changed on every invocation do not benefit from their own cache line. Actually it may be net negative because two cache misses would be incurred in those cases.
* Fix long-standing issue with interface routes being unprotected:melifaro2013-03-081-1/+2
| | | | | | | | | | Use RTM_PINNED flag to mark route as immutable. Forbid deleting immutable routes without special rtrequest1_fib() flag. Adding interface address with prefix already in route table is handled by atomically deleting old prefix and adding interface one. Discussed with: andre, eri MFC after: 3 weeks
* Resolve source address selection in presense of CARP. Add a coupleglebius2013-02-111-3/+21
| | | | | | | | | | | | | | | | | | of helper functions: - carp_master() - boolean function which is true if an address is in the MASTER state. - ifa_preferred() - boolean function that compares two addresses, and is aware of CARP. Utilize ifa_preferred() in ifa_ifwithnet(). The previous version of patch also changed source address selection logic in jails using carp_master(), but we failed to negotiate this part with Bjoern. May be we will approach this problem again later. Reported & tested by: Anton Yuzhaninov <citrin citrin.ru> Sponsored by: Nginx, Inc
* Update to previous r241688 to use __func__ instead of spelled out functionandre2012-10-191-2/+2
| | | | | | name in log(9) message. Suggested by: glebius
* Use LOG_WARNING level in in_attachdomain1() instead of printf().andre2012-10-181-2/+2
| | | | Submitted by: vijju.singh-at-gmail.com
* Mechanically remove the last stray remains of spl* calls from net*/*.andre2012-10-181-24/+2
| | | | They have been Noop's for a long time now.
* Merge the projects/pf/head branch, that was worked on for last six months,glebius2012-09-081-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into head. The most significant achievements in the new code: o Fine grained locking, thus much better performance. o Fixes to many problems in pf, that were specific to FreeBSD port. New code doesn't have that many ifdefs and much less OpenBSDisms, thus is more attractive to our developers. Those interested in details, can browse through SVN log of the projects/pf/head branch. And for reference, here is exact list of revisions merged: r232043, r232044, r232062, r232148, r232149, r232150, r232298, r232330, r232332, r232340, r232386, r232390, r232391, r232605, r232655, r232656, r232661, r232662, r232663, r232664, r232673, r232691, r233309, r233782, r233829, r233830, r233834, r233835, r233836, r233865, r233866, r233868, r233873, r234056, r234096, r234100, r234108, r234175, r234187, r234223, r234271, r234272, r234282, r234307, r234309, r234382, r234384, r234456, r234486, r234606, r234640, r234641, r234642, r234644, r234651, r235505, r235506, r235535, r235605, r235606, r235826, r235991, r235993, r236168, r236173, r236179, r236180, r236181, r236186, r236223, r236227, r236230, r236252, r236254, r236298, r236299, r236300, r236301, r236397, r236398, r236399, r236499, r236512, r236513, r236525, r236526, r236545, r236548, r236553, r236554, r236556, r236557, r236561, r236570, r236630, r236672, r236673, r236679, r236706, r236710, r236718, r237154, r237155, r237169, r237314, r237363, r237364, r237368, r237369, r237376, r237440, r237442, r237751, r237783, r237784, r237785, r237788, r237791, r238421, r238522, r238523, r238524, r238525, r239173, r239186, r239644, r239652, r239661, r239773, r240125, r240130, r240131, r240136, r240186, r240196, r240212. I'd like to thank people who participated in early testing: Tested by: Florian Smeets <flo freebsd.org> Tested by: Chekaluk Vitaly <artemrts ukr.net> Tested by: Ben Wilber <ben desync.com> Tested by: Ian FREISLICH <ianf cloudseed.co.za>
* Add linkstate to bridge(4), set the link to up when at least one underlyingthompsa2012-04-201-2/+2
| | | | | | | | | interface is up, otherwise the link is down. This, among other things, allows carp to work on a bridge. Prodded by: glebius Tested by: Alexander Lunev
* Remove KASSERTS, they do not add any value here since the pointer is about tothompsa2012-04-181-6/+2
| | | | be derefernced anyway.
* g/c last bit of old ipv6 prefix management.pluknet2012-02-081-1/+0
| | | | | Reviewed by: bz Obtained from: NetBSD, net/if.h, rev 1.80
* Fix typo in r231010.glebius2012-02-051-1/+1
| | | | Submitted by: linimon
* Better comment for ifa_init(), ifa_ref(), ifa_free().glebius2012-02-051-1/+1
|
* In ifa_init() initialize if_data.ifi_datalen. This would beglebius2012-02-051-0/+1
| | | | | | required after upcoming changes from bz@. Discussed with: bz
* Convert all users of IF_ADDR_LOCK to use new locking macros that specifyjhb2012-01-051-53/+53
| | | | | | | either a read lock or write lock. Reviewed by: bz MFC after: 2 weeks
* Restore a feature that was present in 5.x and 6.x, and was cleared inglebius2011-12-201-0/+1
| | | | | | | | | | | | | | | | | | | | | | | 7.x, 8.x and 9.x with pf(4) imports: pfsync(4) should suppress CARP preemption, while it is running its bulk update. However, reimplement the feature in more elegant manner, that is partially inspired by newer OpenBSD: - Rename term "suppression" to "demotion", to match with OpenBSD. - Keep a global demotion factor, that can be raised by several conditions, for now these are: - interface goes down - carp(4) has problems with ip_output() or ip6_output() - pfsync performs bulk update - Unlike in OpenBSD the demotion factor isn't a counter, but is actual value added to advskew. The adjustment values for particular error conditions are also configurable, and their defaults are maximum advskew value, so a single failure bumps demotion to maximum. This is for POLA compatibility, and should satisfy most users. - Demotion factor is a writable sysctl, so user can do foot shooting, if he desires to.
* A major overhaul of the CARP implementation. The ip_carp.c was startedglebius2011-12-161-5/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | from scratch, copying needed functionality from the old implemenation on demand, with a thorough review of all code. The main change is that interface layer has been removed from the CARP. Now redundant addresses are configured exactly on the interfaces, they run on. The CARP configuration itself is, as before, configured and read via SIOCSVH/SIOCGVH ioctls. A new prefix created with SIOCAIFADDR or SIOCAIFADDR_IN6 may now be configured to a particular virtual host id, which makes the prefix redundant. ifconfig(8) semantics has been changed too: now one doesn't need to clone carpXX interface, he/she should directly configure a vhid on a Ethernet interface. To supply vhid data from the kernel to an application the getifaddrs(8) function had been changed to pass ifam_data with each address. [1] The new implementation definitely closes all PRs related to carp(4) being an interface, and may close several others. It also allows to run a single redundant IP per interface. Big thanks to Bjoern Zeeb for his help with inet6 part of patch, for idea on using ifam_data and for several rounds of reviewing! PR: kern/117000, kern/126945, kern/126714, kern/120130, kern/117448 Reviewed by: bz Submitted by: bz [1]
* Remove the unused if_free_type() function.brooks2011-12-091-21/+2
| | | | X-MFC after: never
* Improve logging:glebius2011-11-221-2/+2
| | | | | | - don't hardcode function name - use LOG_DEBUG for such a debug message - print error value
* Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs.ed2011-11-071-2/+2
| | | | This means that their use is restricted to a single C file.
* Add infrastructure to allow all frames/packets received on an interfacebz2011-07-031-0/+16
| | | | | | | | | | | | | | to be assigned to a non-default FIB instance. You may need to recompile world or ports due to the change of struct ifnet. Submitted by: cjsp Submitted by: Alexander V. Chernikov (melifaro ipfw.ru) (original versions) Reviewed by: julian Reviewed by: Alexander V. Chernikov (melifaro ipfw.ru) MFC after: 2 weeks X-MFC: use spare in struct ifnet
* Update ifc_len field of struct ifconf passed for the ioctl SIOCGIFCONF32pluknet2011-06-281-0/+2
| | | | | | | | | (i.e. under COMPAT_FREEBSD32) in case ifconf() returned success to match the native SIOCGIFCONF behavior. PR: kern/158369 Reported by: Paul Procacci <pprocacci att gmail com> MFC after: 1 week
* When removing ifnets, we should first remove the reference to ifnetglebius2011-04-041-9/+10
| | | | | | | | | | | from the interface index, then decrease refcount, not vice versa. Otherwise there is a race (reproducible) when if_free_internal() contests on IFNET_WLOCK(), and we got a zero-refed ifnet in the index for a long time. It may be picked by some other thread, that runs ifnet_byindex_ref(), who takes the ifnet from index, and bumps refcount. When reader drops the lock, if_free_internal() proceeds with free. Then reader tries to free it a second time.
* - Merge changes to the base system to support OFED. These includejeff2011-03-211-0/+6
| | | | | a wider arg2 for sysctl, updates to vlan code, IFT_INFINIBAND, and other miscellaneous small features.
* Mfp4 CH=177274,177280,177284-177285,177297,177324-177325bz2011-02-161-15/+34
| | | | | | | | | | | | | | | | | | | | | | VNET socket push back: try to minimize the number of places where we have to switch vnets and narrow down the time we stay switched. Add assertions to the socket code to catch possibly unset vnets as seen in r204147. While this reduces the number of vnet recursion in some places like NFS, POSIX local sockets and some netgraph, .. recursions are impossible to fix. The current expectations are documented at the beginning of uipc_socket.c along with the other information there. Sponsored by: The FreeBSD Foundation Sponsored by: CK Software GmbH Reviewed by: jhb Tested by: zec Tested by: Mikolaj Golub (to.my.trociny gmail.com) MFC after: 2 weeks
* Mfp4 CH=177255:bz2011-02-111-2/+4
| | | | | | | | | | | | | | | | | Make VNET_ASSERT() available with either VNET_DEBUG or INVARIANTS. Change the syntax to match KASSERT() to allow more flexible panic messages rather than having a printf with hardcoded arguments before panic. Adjust the few assertions we have to the new format (and enhance the output). Sponsored by: The FreeBSD Foundation Sponsored by: CK Software GmbH Reviewed by: jhb MFC after: 2 weeks
* Fix a LOR by dropping the global ifnet locks while allocating a new ifnetjhb2011-01-241-6/+20
| | | | | | | | table in if_grow(). The order of the SYSINIT's for ifnet state were swapped so that the various locks were initialized before being used. Reviewed by: pluknet, bz MFC after: 2 weeks
OpenPOWER on IntegriCloud