summaryrefslogtreecommitdiffstats
path: root/sys/netinet/ip_carp.c
Commit message (Collapse)AuthorAgeFilesLines
...
* Acquire IF_ADDR_LOCK() around most iterations over ifp->if_addrheadrwatson2009-04-261-0/+17
| | | | | | | | (colloquially known as if_addrlist). Currently not acquired around interface address loops that call out to the routing code due to potential lock order issues. MFC after: 3 weeks
* Change if_output to take a struct route as its fourth argument in orderkmacy2009-04-161-2/+5
| | | | | | to allow passing a cached struct llentry * down to L2 Reviewed by: rwatson
* Update stats in struct carpstats using two new macros: CARPSTATS_ADD()rwatson2009-04-121-20/+20
| | | | | | | | | and CARPSTATS_INC(), rather than directly manipulating the fields of the structure. This will make it easier to change the implementation of these statistics, such as using per-CPU versions of the data structure. MFC after: 3 days
* This main goals of this project are:qingli2008-12-151-8/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. separating L2 tables (ARP, NDP) from the L3 routing tables 2. removing as much locking dependencies among these layers as possible to allow for some parallelism in the search operations 3. simplify the logic in the routing code, The most notable end result is the obsolescent of the route cloning (RTF_CLONING) concept, which translated into code reduction in both IPv4 ARP and IPv6 NDP related modules, and size reduction in struct rtentry{}. The change in design obsoletes the semantics of RTF_CLONING, RTF_WASCLONE and RTF_LLINFO routing flags. The userland applications such as "arp" and "ndp" have been modified to reflect those changes. The output from "netstat -r" shows only the routing entries. Quite a few developers have contributed to this project in the past: Glebius Smirnoff, Luigi Rizzo, Alessandro Cerri, and Andre Oppermann. And most recently: - Kip Macy revised the locking code completely, thus completing the last piece of the puzzle, Kip has also been conducting active functional testing - Sam Leffler has helped me improving/refactoring the code, and provided valuable reviews - Julian Elischer setup the perforce tree for me and has helped me maintaining that branch before the svn conversion
* In a case of CARP status change run through the if_link_state_change()glebius2008-12-051-4/+5
| | | | routine, so that devd(8) and others are notified about link state change.
* Rather than using hidden includes (with cicular dependencies),bz2008-12-021-0/+2
| | | | | | | | | | | directly include only the header files needed. This reduces the unneeded spamming of various headers into lots of files. For now, this leaves us with very few modules including vnet.h and thus needing to depend on opt_route.h. Reviewed by: brooks, gnn, des, zec, imp Sponsored by: The FreeBSD Foundation
* Retire the MALLOC and FREE macros. They are an abomination unto style(9).des2008-10-231-9/+9
| | | | MFC after: 3 months
* Step 1.5 of importing the network stack virtualization infrastructurezec2008-10-021-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | from the vimage project, as per plan established at devsummit 08/08: http://wiki.freebsd.org/Image/Notes200808DevSummit Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator macros, and CURVNET_SET() context setting macros, all currently resolving to NOPs. Prepare for virtualization of selected SYSCTL objects by introducing a family of SYSCTL_V_*() macros, currently resolving to their global counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT(). Move selected #defines from sys/sys/vimage.h to newly introduced header files specific to virtualized subsystems (sys/net/vnet.h, sys/netinet/vinet.h etc.). All the changes are verified to have zero functional impact at this point in time by doing MD5 comparision between pre- and post-change object files(*). (*) netipsec/keysock.c did not validate depending on compile time options. Implemented by: julian, bz, brooks, zec Reviewed by: julian, bz, brooks, kris, rwatson, ... Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
* Commit step 1 of the vimage project, (network stack)bz2008-08-171-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | virtualization work done by Marko Zec (zec@). This is the first in a series of commits over the course of the next few weeks. Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only. We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again. Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch
* Fix carp(4) panics that can occur during carp interface configuration.eri2008-07-141-0/+8
| | | | | | Approved by: mlaier (mentor) Reported by: Scott Ullrich MFC after: 1 week
* Sort IP addresses before hashing them for the signature. Otherwise carp ismlaier2008-06-021-13/+39
| | | | | | | | | sensitive to address configuration order. PR: kern/121574 Reported by: Douglas K. Rand, Wouter de Jong Obtained from: OpenBSD (rev 1.114 + fixes) MFC after: 2 weeks
* If the vhid already present, return EEXIST instead ofglebius2008-02-071-1/+1
| | | | non-informative EINVAL.
* Add FBSDID to all files in netinet so that people can moresilby2007-10-071-2/+3
| | | | | | easily include file version information in bug reports. Approved by: re (kensmith)
* Replace references to NET_CALLOUT_MPSAFE with CALLOUT_MPSAFE, and removerwatson2007-07-281-3/+3
| | | | | | | | definition of NET_CALLOUT_MPSAFE, which is no longer required now that debug.mpsafenet has been removed. The once over: bz Approved by: re (kensmith)
* Import rewrite of IPv4 socket multicast layer to support source-specificbms2007-06-121-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | and protocol-independent host mode multicast. The code is written to accomodate IPv6, IGMPv3 and MLDv2 with only a little additional work. This change only pertains to FreeBSD's use as a multicast end-station and does not concern multicast routing; for an IGMPv3/MLDv2 router implementation, consider the XORP project. The work is based on Wilbert de Graaf's IGMPv3 code drop for FreeBSD 4.6, which is available at: http://www.kloosterhof.com/wilbert/igmpv3.html Summary * IPv4 multicast socket processing is now moved out of ip_output.c into a new module, in_mcast.c. * The in_mcast.c module implements the IPv4 legacy any-source API in terms of the protocol-independent source-specific API. * Source filters are lazy allocated as the common case does not use them. They are part of per inpcb state and are covered by the inpcb lock. * struct ip_mreqn is now supported to allow applications to specify multicast joins by interface index in the legacy IPv4 any-source API. * In UDP, an incoming multicast datagram only requires that the source port matches the 4-tuple if the socket was already bound by source port. An unbound socket SHOULD be able to receive multicasts sent from an ephemeral source port. * The UDP socket multicast filter mode defaults to exclusive, that is, sources present in the per-socket list will be blocked from delivery. * The RFC 3678 userland functions have been added to libc: setsourcefilter, getsourcefilter, setipv4sourcefilter, getipv4sourcefilter. * Definitions for IGMPv3 are merged but not yet used. * struct sockaddr_storage is now referenced from <netinet/in.h>. It is therefore defined there if not already declared in the same way as for the C99 types. * The RFC 1724 hack (specify 0.0.0.0/8 addresses to IP_MULTICAST_IF which are then interpreted as interface indexes) is now deprecated. * A patch for the Rhyolite.com routed in the FreeBSD base system is available in the -net archives. This only affects individuals running RIPv1 or RIPv2 via point-to-point and/or unnumbered interfaces. * Make IPv6 detach path similar to IPv4's in code flow; functionally same. * Bump __FreeBSD_version to 700048; see UPDATING. This work was financially supported by another FreeBSD committer. Obtained from: p4://bms_netdev Submitted by: Wilbert de Graaf (original work) Reviewed by: rwatson (locking), silence from fenner, net@ (but with encouragement)
* Do not leak lock in the case of EEXIST error.glebius2007-06-061-2/+6
| | | | | PR: kern/92776 Submitted by: Ed Schouten <Ed.Schouten tunix.nl>
* Since rev. 1.94 of netinet/in.c, the netinet layer frees all itsglebius2007-02-021-6/+24
| | | | | | | | multicast memberships, when interface is detached. Thus, when an underlying interface is detached, we do not need to free our multicast memberships. Reviewed by: bms
* Make it possible that carpdetach() unlocks on return. Then, inglebius2007-01-251-8/+7
| | | | | | | carp_clone_destroy() we are on a safe side, we don't need to unlock the cif, that can me already non-existent at this point. Reported by: Anton Yuzhaninov <citrin rambler-co.ru>
* Spacing.glebius2007-01-251-5/+5
|
* Sweep kernel replacing suser(9) calls with priv(9) calls, assigningrwatson2006-11-061-2/+5
| | | | | | | | | | | | | specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>
* Set scope on MC address so IPv6 carp advertisement will not get droppedbz2006-10-071-2/+7
| | | | | | | | | | | in ip6_output. In case this fails handle the error directly and log it[1]. In addition permit CARP over v6 in ip_fw2. PR: kern/98622 Similar patch by: suz Discussed with: glebius [1] Tested by: Paul.Dekkers surfnet.nl, Philippe.Pegon crc.u-strasbg.fr MFC after: 3 days
* Fix an incompatibility between CARP and IPv4 multicast routing, wherebybms2006-09-251-0/+1
| | | | | | | | | | | | | the VRRPv2 advertisements will originate from the wrong source address. This only affects kernels compiled with MROUTING and after the MRT_INIT ioctl() has been issued. Set imo_multicast_vif in carp's softc to the invalid value -1 after it is zeroed by softc allocation, to stop the ip_output() path looking up the incorrect source address thinking a vif is set. PR: kern/100532 Submitted by: Bohus Plucinsky MFC after: 1 week
* Revise network interface cloning to take an optional opaquesam2006-07-091-2/+2
| | | | | | | | | parameter that can specify configuration parameters: o rev cloner api's to add optional parameter block o add SIOCCREATE2 that accepts parameter data o rev vlan support to use new api (maintain old code) Reviewed by: arch@
* Make in-kernel multicast protocols for pfsync and carp work after enablingmlaier2006-07-081-0/+5
| | | | | | | | dynamic resizing of multicast membership array. Reported and testing by: Maxim Konovalov, Scott Ullrich Reminded by: thompsa MFC after: 2 weeks
* Fix the following bpf(4) race condition which can result in a panic:csjp2006-06-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (1) bpf peer attaches to interface netif0 (2) Packet is received by netif0 (3) ifp->if_bpf pointer is checked and handed off to bpf (4) bpf peer detaches from netif0 resulting in ifp->if_bpf being initialized to NULL. (5) ifp->if_bpf is dereferenced by bpf machinery (6) Kaboom This race condition likely explains the various different kernel panics reported around sending SIGINT to tcpdump or dhclient processes. But really this race can result in kernel panics anywhere you have frequent bpf attach and detach operations with high packet per second load. Summary of changes: - Remove the bpf interface's "driverp" member - When we attach bpf interfaces, we now set the ifp->if_bpf member to the bpf interface structure. Once this is done, ifp->if_bpf should never be NULL. [1] - Introduce bpf_peers_present function, an inline operation which will do a lockless read bpf peer list associated with the interface. It should be noted that the bpf code will pickup the bpf_interface lock before adding or removing bpf peers. This should serialize the access to the bpf descriptor list, removing the race. - Expose the bpf_if structure in bpf.h so that the bpf_peers_present function can use it. This also removes the struct bpf_if; hack that was there. - Adjust all consumers of the raw if_bpf structure to use bpf_peers_present Now what happens is: (1) Packet is received by netif0 (2) Check to see if bpf descriptor list is empty (3) Pickup the bpf interface lock (4) Hand packet off to process From the attach/detach side: (1) Pickup the bpf interface lock (2) Add/remove from bpf descriptor list Now that we are storing the bpf interface structure with the ifnet, there is is no need to walk the bpf interface list to locate the correct bpf interface. We now simply look up the interface, and initialize the pointer. This has a nice side effect of changing a bpf interface attach operation from O(N) (where N is the number of bpf interfaces), to O(1). [1] From now on, we can no longer check ifp->if_bpf to tell us whether or not we have any bpf peers that might be interested in receiving packets. In collaboration with: sam@ MFC after: 1 month
* o Introduce carp_multicast_cleanup(), which removes and freesglebius2006-03-211-84/+101
| | | | | | | | | | | | | | | | | | | | | | multicast addresses from carp interface. [1] o Rewrite carpdetach(), so that it does the following things: [1] - Stops callouts. - Decrements carp_suppress_preempt, if needed. - Downs interface and sets CARP state to INIT. - Calls carp_multicast_cleanup(). - Detaches softc from carp_if and if we are the last frees the carp_if. o Use new carpdetach() in carp_clone_destroy(). o In carp_ifdetach() acquire the carp_if lock and cleanup all interfaces hanging on carp_if. [1] o Make carp_ifdetach() static and use EVENT(9) to call it from if_detach(). [2] o In carp_setrun() exit if the softc doesn't have a valid pointer to parent. [1] Obtained from: OpenBSD [1] Submitted by: Dan Lukes <dan obluda.cz> [2] PR: kern/82908 [2]
* MFOpenBSD 1.62:glebius2005-11-171-2/+4
| | | | | | | Prevent backup CARP hosts from replying to arp requests, fixes strangeness with some layer-3 switches. From Bill Marquette. Tested by: Kazuaki Oda <kaakun highway.ne.jp>
* Unbreak for !INET6 case.ru2005-11-141-1/+1
|
* - Store pointer to the link-level address right in "struct ifnet"ru2005-11-111-13/+13
| | | | | | | | | | rather than in ifindex_table[]; all (except one) accesses are through ifp anyway. IF_LLADDR() works faster, and all (except one) ifaddr_byindex() users were converted to use ifp->if_addr. - Stop storing a (pointer to) Ethernet address in "struct arpcom", and drop the IFP2ENADDR() macro; all users have been converted to use IF_LLADDR() instead.
* Move the cloned interface list management in to if_clone. For some drivers thethompsa2005-11-081-3/+0
| | | | | | | | | | softc lists and associated mutex are now unused so these have been removed. Calling if_clone_detach() will now destroy all the cloned interfaces for the driver and in most cases is all thats needed to unload. Idea by: brooks Reviewed by: brooks
* Since carp(4) interfaces presently are kinda fake yet possessyar2005-10-261-1/+1
| | | | | | | IP addresses, mark them with LOOPBACK so that routing daemons take them easy for link-state routing protocols. Reviewed by: glebius
* Fix build after in6_joingroup change. It remains unclear if DAD breaks CARPmlaier2005-10-221-2/+2
| | | | or not.
* Change the reference counting to count the number of cloned interfaces for eachthompsa2005-10-121-1/+2
| | | | | | | | | | | | | | | cloner. This ensures that ifc->ifc_units is not prematurely freed in if_clone_detach() before the clones are destroyed, resulting in memory modified after free. This could be triggered with if_vlan. Assert that all cloners have been destroyed when freeing the memory. Change all simple cloners to destroy their clones with ifc_simple_destroy() on module unload so the reference count is properly updated. This also cleans up the interface destroy routines and allows future optimisation. Discussed with: brooks, pjd, -current Reviewed by: brooks
* When a carp(4) interface is being destroyed and is in a promiscous mode,glebius2005-09-091-0/+1
| | | | | | | | | | first interface is detached from parent and then bpfdetach() is called. If the interface was the last carp(4) interface attached to parent, then the mutex on parent is destroyed. When bpfdetach() calls if_setflags() we panic on destroyed mutex. To prevent the above scenario, clear pointer to parent, when we detach ourselves from parent.
* Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE andrwatson2005-08-091-23/+28
| | | | | | | | | | | | | | IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to ifnet.if_drv_flags. Device drivers are now responsible for synchronizing access to these flags, as they are in if_drv_flags. This helps prevent races between the network stack and device driver in maintaining the interface flags field. Many __FreeBSD__ and __FreeBSD_version checks maintained and continued; some less so. Reviewed by: pjd, bz MFC after: 7 days
* include netinet6/scope6_var.h.ume2005-07-251-0/+1
|
* scope cleanup. with this changeume2005-07-251-18/+17
| | | | | | | | | | | | | | | | | | | - most of the kernel code will not care about the actual encoding of scope zone IDs and won't touch "s6_addr16[1]" directly. - similarly, most of the kernel code will not care about link-local scoped addresses as a special case. - scope boundary check will be stricter. For example, the current *BSD code allows a packet with src=::1 and dst=(some global IPv6 address) to be sent outside of the node, if the application do: s = socket(AF_INET6); bind(s, "::1"); sendto(s, some_global_IPv6_addr); This is clearly wrong, since ::1 is only meaningful within a single node, but the current implementation of the *BSD kernel cannot reject this attempt. Submitted by: JINMEI Tatuya <jinmei__at__isl.rdc.toshiba.co.jp> Obtained from: KAME
* When doing ARP load balancing source IP is taken in network byte order,glebius2005-07-011-1/+1
| | | | | | | | so residue of division for all hosts on net is the same, and thus only one VHID answers. Change source IP in host byte order. Reviewed by: mlaier Approved by: re (scottl)
* Fix some long standing bugs in writing to the BPF device attached todwmalone2005-06-261-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | a DLT_NULL interface. In particular: 1) Consistently use type u_int32_t for the header of a DLT_NULL device - it continues to represent the address family as always. 2) In the DLT_NULL case get bpf_movein to store the u_int32_t in a sockaddr rather than in the mbuf, to be consistent with all the DLT types. 3) Consequently fix a bug in bpf_movein/bpfwrite which only permitted packets up to 4 bytes less than the MTU to be written. 4) Fix all DLT_NULL devices to have the code required to allow writing to their bpf devices. 5) Move the code to allow writing to if_lo from if_simloop to looutput, because it only applies to DLT_NULL devices but was being applied to other devices that use if_simloop possibly incorrectly. PR: 82157 Submitted by: Matthew Luckie <mjl@luckie.org.nz> Approved by: re (scottl)
* Stop embedding struct ifnet at the top of driver softcs. Instead thebrooks2005-06-101-94/+99
| | | | | | | | | | | | | | | | | | | | struct ifnet or the layer 2 common structure it was embedded in have been replaced with a struct ifnet pointer to be filled by a call to the new function, if_alloc(). The layer 2 common structure is also allocated via if_alloc() based on the interface type. It is hung off the new struct ifnet member, if_l2com. This change removes the size of these structures from the kernel ABI and will allow us to better manage them as interfaces come and go. Other changes of note: - Struct arpcom is no longer referenced in normal interface code. Instead the Ethernet address is accessed via the IFP2ENADDR() macro. To enforce this ac_enaddr has been renamed to _ac_enaddr. - The second argument to ether_ifattach is now always the mac address from driver private storage rather than sometimes being ac_enaddr. Reviewed by: sobomax, sam
* - When carp interface is destroyed, and it affects global preemptionglebius2005-05-151-1/+12
| | | | | | | | | suppresion counter, decrease the latter. [1] - Add sysctl to monitor preemption suppression. PR: kern/80972 [1] Submitted by: Frank Volf [1] MFC after: 1 week
* Remove anti-LOR bandaid, it is not needed now.glebius2005-04-201-21/+0
| | | | Sponsored by: Rambler
* When several carp interfaces are attached to Ethernet interface,glebius2005-03-301-27/+37
| | | | | | | | | | | | carp_carpdev_state_locked() is called every time carp interface is attached. The first call backs up flags of the first interface, and the second call backs up them again, erasing correct values. To solve this, a carp_sc_state_locked() function is introduced. It is called when interface is attached to parent, instead of calling carp_carpdev_state_locked. carp_carpdev_state_locked() calls carp_sc_state_locked() for each sc in chain. Reported by: Yuriy N. Shkandybin, sem
* If vhid exists return more informative EEXIST instead of EINVAL. While hereglebius2005-03-181-3/+2
| | | | remove redundant brackets.
* Fix a potential crash that could occur when CARP_LOG is being used.glebius2005-03-181-2/+1
| | | | Obtained from: OpenBSD (pat)
* Fix typo. Unbreak build. Take pointy hat.glebius2005-03-021-1/+1
|
* Add more locking when reading/writing to carp softc. When carp softc isglebius2005-03-011-31/+140
| | | | | | | | | | | | | | attached to a parent interface we use its mutex to lock the softc. This means that in several places like carp_ioctl() we lock softc conditionaly. This should be redesigned. To avoid LORs when MII announces us a link state change, we schedule a quick callout and call carp_carpdev_state_locked() from it. Initialize callouts using NET_CALLOUT_MPSAFE. Sponsored by: Rambler Reviewed by: mlaier
* - Add carp_mtx. Use it to protect list of all carp interfaces.glebius2005-03-011-18/+20
| | | | | | | | - In carp_send_ad_all() walk through list of all carp interfaces instead of walking through list of all interfaces. Sponsored by: Rambler Reviewed by: mlaier
* Revert change to struct ifnet. Use ifnet pointer in softc. Embeddingglebius2005-03-011-1/+1
| | | | | | ifnet into smth will soon be removed. Requested by: brooks
* Remove debugging printf.glebius2005-03-011-1/+0
| | | | Reviewed by: mlaier
OpenPOWER on IntegriCloud