path: root/sys/netinet/ip_carp.c
Commit message | Author | Age | Files | Lines
* Set scope on MC address so IPv6 carp advertisement will not get dropped | bz | 2006-10-07 | 1 | -2/+7
  in ip6_output. In case this fails handle the error directly and log it [1].
  In addition permit CARP over v6 in ip_fw2.
  PR: kern/98622
  Similar patch by: suz
  Discussed with: glebius [1]
  Tested by: Paul.Dekkers surfnet.nl, Philippe.Pegon crc.u-strasbg.fr
  MFC after: 3 days
* Fix an incompatibility between CARP and IPv4 multicast routing, whereby | bms | 2006-09-25 | 1 | -0/+1
  the VRRPv2 advertisements will originate from the wrong source address.
  This only affects kernels compiled with MROUTING and after the MRT_INIT
  ioctl() has been issued.
  Set imo_multicast_vif in carp's softc to the invalid value -1 after it is
  zeroed by softc allocation, to stop the ip_output() path looking up the
  incorrect source address thinking a vif is set.
  PR: kern/100532
  Submitted by: Bohus Plucinsky
  MFC after: 1 week
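The fix above relies on the fact that a zero-filled softc leaves the vif field at 0, which looks like a valid vif index. A minimal sketch of that idea follows; `struct mc_opts`, `struct softc_sketch`, `softc_alloc` and `vif_is_set` are hypothetical stand-ins for illustration, not the kernel's real structures or functions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the multicast options kept in the softc. */
struct mc_opts {
    int multicast_vif;   /* -1 means "no vif selected"; 0 is a valid index */
};

/* Hypothetical stand-in for the carp softc. */
struct softc_sketch {
    struct mc_opts imo;
};

/* Allocate a zeroed softc, then explicitly mark the vif as unset. */
struct softc_sketch *softc_alloc(void)
{
    struct softc_sketch *sc = calloc(1, sizeof(*sc));

    if (sc == NULL)
        return NULL;
    /* calloc() left multicast_vif == 0, which would look like a real vif. */
    sc->imo.multicast_vif = -1;
    return sc;
}

/* An output path should consult the vif table only when a vif is set. */
int vif_is_set(const struct softc_sketch *sc)
{
    assert(sc != NULL);
    return sc->imo.multicast_vif != -1;
}
```

Without the explicit assignment, `vif_is_set()` would report a vif for every freshly allocated softc in this sketch.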
* Revise network interface cloning to take an optional opaque | sam | 2006-07-09 | 1 | -2/+2
  parameter that can specify configuration parameters:
  o rev cloner api's to add optional parameter block
  o add SIOCCREATE2 that accepts parameter data
  o rev vlan support to use new api (maintain old code)
  Reviewed by: arch@
* Make in-kernel multicast protocols for pfsync and carp work after enabling | mlaier | 2006-07-08 | 1 | -0/+5
  dynamic resizing of multicast membership array.
  Reported and testing by: Maxim Konovalov, Scott Ullrich
  Reminded by: thompsa
  MFC after: 2 weeks
* Fix the following bpf(4) race condition which can result in a panic: | csjp | 2006-06-02 | 1 | -1/+1
  (1) bpf peer attaches to interface netif0
  (2) Packet is received by netif0
  (3) ifp->if_bpf pointer is checked and handed off to bpf
  (4) bpf peer detaches from netif0 resulting in ifp->if_bpf being
      initialized to NULL.
  (5) ifp->if_bpf is dereferenced by bpf machinery
  (6) Kaboom
  This race condition likely explains the various different kernel panics
  reported around sending SIGINT to tcpdump or dhclient processes. But really
  this race can result in kernel panics anywhere you have frequent bpf attach
  and detach operations with high packet per second load.
  Summary of changes:
  - Remove the bpf interface's "driverp" member
  - When we attach bpf interfaces, we now set the ifp->if_bpf member to the
    bpf interface structure. Once this is done, ifp->if_bpf should never be
    NULL. [1]
  - Introduce bpf_peers_present function, an inline operation which will
    do a lockless read of the bpf peer list associated with the interface.
    It should be noted that the bpf code will pick up the bpf_interface lock
    before adding or removing bpf peers. This should serialize the access
    to the bpf descriptor list, removing the race.
  - Expose the bpf_if structure in bpf.h so that the bpf_peers_present
    function can use it. This also removes the struct bpf_if; hack that was
    there.
  - Adjust all consumers of the raw if_bpf structure to use
    bpf_peers_present
  Now what happens is:
  (1) Packet is received by netif0
  (2) Check to see if bpf descriptor list is empty
  (3) Pick up the bpf interface lock
  (4) Hand packet off to process
  From the attach/detach side:
  (1) Pick up the bpf interface lock
  (2) Add/remove from bpf descriptor list
  Now that we are storing the bpf interface structure with the ifnet, there
  is no need to walk the bpf interface list to locate the correct bpf
  interface. We now simply look up the interface, and initialize the pointer.
  This has a nice side effect of changing a bpf interface attach operation
  from O(N) (where N is the number of bpf interfaces), to O(1).
  [1] From now on, we can no longer check ifp->if_bpf to tell us whether or
      not we have any bpf peers that might be interested in receiving
      packets.
  In collaboration with: sam@
  MFC after: 1 month
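The receive and attach/detach sequences described above can be sketched in user space as follows. This is only an illustration under assumptions: `struct bpf_if_sketch`, `struct peer_sketch` and every function here are hypothetical names, a pthread mutex stands in for the kernel lock, and a real implementation would need the lockless check to be a properly atomic read.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct peer_sketch {
    struct peer_sketch *next;
    int packets_seen;
};

struct bpf_if_sketch {
    pthread_mutex_t lock;        /* serializes changes to the peer list */
    struct peer_sketch *peers;   /* NULL when nobody is listening */
};

void bpf_if_sketch_init(struct bpf_if_sketch *bp)
{
    pthread_mutex_init(&bp->lock, NULL);
    bp->peers = NULL;
}

/* Cheap check without the lock: is anyone attached at all? */
int peers_present(const struct bpf_if_sketch *bp)
{
    return bp->peers != NULL;
}

/* Attach side: take the lock, then add to the list. */
void peer_attach(struct bpf_if_sketch *bp, struct peer_sketch *p)
{
    pthread_mutex_lock(&bp->lock);
    p->next = bp->peers;
    bp->peers = p;
    pthread_mutex_unlock(&bp->lock);
}

/* Detach side: take the lock, then unlink from the list. */
void peer_detach(struct bpf_if_sketch *bp, struct peer_sketch *p)
{
    struct peer_sketch **pp;

    pthread_mutex_lock(&bp->lock);
    for (pp = &bp->peers; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == p) {
            *pp = p->next;
            break;
        }
    }
    pthread_mutex_unlock(&bp->lock);
}

/* Receive side: bp is never NULL; only the peer list may be empty. */
int tap_packet(struct bpf_if_sketch *bp)
{
    struct peer_sketch *p;
    int delivered = 0;

    assert(bp != NULL);
    if (!peers_present(bp))
        return 0;
    pthread_mutex_lock(&bp->lock);
    for (p = bp->peers; p != NULL; p = p->next) {
        p->packets_seen++;
        delivered++;
    }
    pthread_mutex_unlock(&bp->lock);
    return delivered;
}
```

The point of the design is that the interface structure stays valid for the lifetime of the interface, so the receive path tests the list rather than a pointer that a detaching peer could reset.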
* o Introduce carp_multicast_cleanup(), which removes and frees | glebius | 2006-03-21 | 1 | -84/+101
    multicast addresses from carp interface. [1]
  o Rewrite carpdetach(), so that it does the following things: [1]
    - Stops callouts.
    - Decrements carp_suppress_preempt, if needed.
    - Downs interface and sets CARP state to INIT.
    - Calls carp_multicast_cleanup().
    - Detaches softc from carp_if and if we are the last frees the carp_if.
  o Use new carpdetach() in carp_clone_destroy().
  o In carp_ifdetach() acquire the carp_if lock and cleanup all interfaces
    hanging on carp_if. [1]
  o Make carp_ifdetach() static and use EVENT(9) to call it from
    if_detach(). [2]
  o In carp_setrun() exit if the softc doesn't have a valid pointer to
    parent. [1]
  Obtained from: OpenBSD [1]
  Submitted by: Dan Lukes <dan obluda.cz> [2]
  PR: kern/82908 [2]
* MFOpenBSD 1.62: | glebius | 2005-11-17 | 1 | -2/+4
  Prevent backup CARP hosts from replying to arp requests, fixes strangeness
  with some layer-3 switches. From Bill Marquette.
  Tested by: Kazuaki Oda <kaakun highway.ne.jp>
* Unbreak for !INET6 case. | ru | 2005-11-14 | 1 | -1/+1
* - Store pointer to the link-level address right in "struct ifnet" | ru | 2005-11-11 | 1 | -13/+13
    rather than in ifindex_table[]; all (except one) accesses are
    through ifp anyway. IF_LLADDR() works faster, and all (except
    one) ifaddr_byindex() users were converted to use ifp->if_addr.
  - Stop storing a (pointer to) Ethernet address in "struct arpcom",
    and drop the IFP2ENADDR() macro; all users have been converted
    to use IF_LLADDR() instead.
* Move the cloned interface list management into if_clone. For some drivers the | thompsa | 2005-11-08 | 1 | -3/+0
  softc lists and associated mutex are now unused so these have been removed.
  Calling if_clone_detach() will now destroy all the cloned interfaces for the
  driver and in most cases is all that's needed to unload.
  Idea by: brooks
  Reviewed by: brooks
* Since carp(4) interfaces presently are kinda fake yet possess | yar | 2005-10-26 | 1 | -1/+1
  IP addresses, mark them with LOOPBACK so that routing daemons
  take them easy for link-state routing protocols.
  Reviewed by: glebius
* Fix build after in6_joingroup change. It remains unclear if DAD breaks CARP | mlaier | 2005-10-22 | 1 | -2/+2
  or not.
* Change the reference counting to count the number of cloned interfaces for each | thompsa | 2005-10-12 | 1 | -1/+2
  cloner. This ensures that ifc->ifc_units is not prematurely freed in
  if_clone_detach() before the clones are destroyed, resulting in memory
  modified after free. This could be triggered with if_vlan.
  Assert that all cloners have been destroyed when freeing the memory.
  Change all simple cloners to destroy their clones with ifc_simple_destroy()
  on module unload so the reference count is properly updated. This also
  cleans up the interface destroy routines and allows future optimisation.
  Discussed with: brooks, pjd, -current
  Reviewed by: brooks
* When a carp(4) interface is being destroyed and is in promiscuous mode, | glebius | 2005-09-09 | 1 | -0/+1
  first the interface is detached from its parent and then bpfdetach() is
  called. If the interface was the last carp(4) interface attached to the
  parent, then the mutex on the parent is destroyed. When bpfdetach() calls
  if_setflags() we panic on the destroyed mutex.
  To prevent the above scenario, clear the pointer to the parent when we
  detach ourselves from the parent.
* Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and | rwatson | 2005-08-09 | 1 | -23/+28
  IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
  ifnet.if_drv_flags. Device drivers are now responsible for
  synchronizing access to these flags, as they are in if_drv_flags. This
  helps prevent races between the network stack and device driver in
  maintaining the interface flags field.
  Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
  some less so.
  Reviewed by: pjd, bz
  MFC after: 7 days
* include netinet6/scope6_var.h. | ume | 2005-07-25 | 1 | -0/+1
* scope cleanup. with this change | ume | 2005-07-25 | 1 | -18/+17
  - most of the kernel code will not care about the actual encoding of
    scope zone IDs and won't touch "s6_addr16[1]" directly.
  - similarly, most of the kernel code will not care about link-local
    scoped addresses as a special case.
  - scope boundary check will be stricter. For example, the current
    *BSD code allows a packet with src=::1 and dst=(some global IPv6
    address) to be sent outside of the node, if the application does:
      s = socket(AF_INET6);
      bind(s, "::1");
      sendto(s, some_global_IPv6_addr);
    This is clearly wrong, since ::1 is only meaningful within a single
    node, but the current implementation of the *BSD kernel cannot
    reject this attempt.
  Submitted by: JINMEI Tatuya <jinmei__at__isl.rdc.toshiba.co.jp>
  Obtained from: KAME
* When doing ARP load balancing source IP is taken in network byte order, | glebius | 2005-07-01 | 1 | -1/+1
  so residue of division for all hosts on net is the same, and thus only
  one VHID answers. Convert the source IP to host byte order.
  Reviewed by: mlaier
  Approved by: re (scottl)
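The byte-order effect described above can be seen with a small worked example. This is a simplified sketch, not the code in ip_carp.c: `host_order_key`, `unswapped_key` and `pick_responder` are hypothetical helpers, and the selection is reduced to a plain modulo over two virtual hosts. `unswapped_key` builds the integer a little-endian machine would see if it read the network-order bytes without conversion.

```c
#include <assert.h>
#include <stdint.h>

/* IPv4 address a.b.c.d as a host-order integer: 0xAABBCCDD. */
uint32_t host_order_key(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8) | (uint32_t)d;
}

/*
 * The same address as a little-endian machine sees it when the
 * network-order bytes are read as an integer without conversion:
 * the first octet (network part) lands in the low-order bits.
 */
uint32_t unswapped_key(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)d << 24) | ((uint32_t)c << 16) |
           ((uint32_t)b << 8) | (uint32_t)a;
}

/* Pick which of `count` virtual hosts answers for this source address. */
unsigned pick_responder(uint32_t key, unsigned count)
{
    assert(count > 0);
    return (unsigned)(key % count);
}
```

With two virtual hosts, sources 10.0.0.1 and 10.0.0.2 both map to responder 0 under the unswapped key, because the low-order byte is the shared first octet. Under the host-order key they map to responders 1 and 0, so the load is spread.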
* Fix some long standing bugs in writing to the BPF device attached to | dwmalone | 2005-06-26 | 1 | -0/+9
  a DLT_NULL interface. In particular:
  1) Consistently use type u_int32_t for the header of a DLT_NULL
     device - it continues to represent the address family as always.
  2) In the DLT_NULL case get bpf_movein to store the u_int32_t in a
     sockaddr rather than in the mbuf, to be consistent with all the
     DLT types.
  3) Consequently fix a bug in bpf_movein/bpfwrite which only permitted
     packets up to 4 bytes less than the MTU to be written.
  4) Fix all DLT_NULL devices to have the code required to allow writing
     to their bpf devices.
  5) Move the code to allow writing to if_lo from if_simloop to looutput,
     because it only applies to DLT_NULL devices but was being applied
     to other devices that use if_simloop possibly incorrectly.
  PR: 82157
  Submitted by: Matthew Luckie <mjl@luckie.org.nz>
  Approved by: re (scottl)
* Stop embedding struct ifnet at the top of driver softcs. Instead the | brooks | 2005-06-10 | 1 | -94/+99
  struct ifnet or the layer 2 common structure it was embedded in have
  been replaced with a struct ifnet pointer to be filled by a call to the
  new function, if_alloc(). The layer 2 common structure is also allocated
  via if_alloc() based on the interface type. It is hung off the new
  struct ifnet member, if_l2com.
  This change removes the size of these structures from the kernel ABI and
  will allow us to better manage them as interfaces come and go.
  Other changes of note:
  - Struct arpcom is no longer referenced in normal interface code.
    Instead the Ethernet address is accessed via the IFP2ENADDR() macro.
    To enforce this ac_enaddr has been renamed to _ac_enaddr.
  - The second argument to ether_ifattach is now always the mac address
    from driver private storage rather than sometimes being ac_enaddr.
  Reviewed by: sobomax, sam
* - When carp interface is destroyed, and it affects global preemption | glebius | 2005-05-15 | 1 | -1/+12
    suppression counter, decrease the latter. [1]
  - Add sysctl to monitor preemption suppression.
  PR: kern/80972 [1]
  Submitted by: Frank Volf [1]
  MFC after: 1 week
* Remove anti-LOR bandaid, it is not needed now. | glebius | 2005-04-20 | 1 | -21/+0
  Sponsored by: Rambler
* When several carp interfaces are attached to Ethernet interface, | glebius | 2005-03-30 | 1 | -27/+37
  carp_carpdev_state_locked() is called every time a carp interface is
  attached. The first call backs up flags of the first interface, and the
  second call backs them up again, erasing correct values.
  To solve this, a carp_sc_state_locked() function is introduced. It is
  called when an interface is attached to its parent, instead of calling
  carp_carpdev_state_locked(). carp_carpdev_state_locked() calls
  carp_sc_state_locked() for each sc in chain.
  Reported by: Yuriy N. Shkandybin, sem
* If vhid exists return more informative EEXIST instead of EINVAL. While here | glebius | 2005-03-18 | 1 | -3/+2
  remove redundant brackets.
* Fix a potential crash that could occur when CARP_LOG is being used. | glebius | 2005-03-18 | 1 | -2/+1
  Obtained from: OpenBSD (pat)
* Fix typo. Unbreak build. Take pointy hat. | glebius | 2005-03-02 | 1 | -1/+1
* Add more locking when reading/writing to carp softc. When carp softc is | glebius | 2005-03-01 | 1 | -31/+140
  attached to a parent interface we use its mutex to lock the softc. This
  means that in several places like carp_ioctl() we lock softc conditionally.
  This should be redesigned.
  To avoid LORs when MII announces us a link state change, we schedule
  a quick callout and call carp_carpdev_state_locked() from it.
  Initialize callouts using NET_CALLOUT_MPSAFE.
  Sponsored by: Rambler
  Reviewed by: mlaier
* - Add carp_mtx. Use it to protect list of all carp interfaces. | glebius | 2005-03-01 | 1 | -18/+20
  - In carp_send_ad_all() walk through list of all carp interfaces
    instead of walking through list of all interfaces.
  Sponsored by: Rambler
  Reviewed by: mlaier
* Revert change to struct ifnet. Use ifnet pointer in softc. Embedding | glebius | 2005-03-01 | 1 | -1/+1
  ifnet into something will soon be removed.
  Requested by: brooks
* Remove debugging printf. | glebius | 2005-03-01 | 1 | -1/+0
  Reviewed by: mlaier
* Support running carp(4) over a vlan(4) parent interface. | yar | 2005-02-28 | 1 | -1/+2
  Encouraged by: glebius
* Remove unused field from carp softc. | glebius | 2005-02-28 | 1 | -3/+0
  OK'ed by: mcbride@OpenBSD
* Fix tcpdump(8) on carp(4) interface: | glebius | 2005-02-28 | 1 | -16/+5
  - Use our loop DLT type, not OpenBSD. [1]
  - The fields that are converted to network byte order are not 32-bit
    fields but 16-bit fields, so htons should be used instead of htonl. [1]
  - Secondly, ip_input changes ip->ip_len into its value without
    the ip-header length. So, restore the length to make bpf happy. [1]
  - Use bpf_mtap2(), use temporary af1, since bpf_mtap2 doesn't
    understand uint8_t af identifier.
  Submitted by: Frank Volf [1]
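Two of the points above, using the 16-bit conversion and restoring the header length, can be illustrated with a short sketch. `store_total_len` is a hypothetical helper written for this example and is not taken from ip_carp.c; it only assumes the standard `htons()` from `<arpa/inet.h>`.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Add back a header length that was subtracted earlier, then write the
 * total as a 16-bit big-endian field. htons() matches the field width;
 * htonl() would yield a 32-bit value that does not fit the field and
 * truncates incorrectly on little-endian hosts.
 */
void store_total_len(uint8_t out[2], uint16_t payload_len, uint16_t header_len)
{
    uint16_t wire;

    assert((uint32_t)payload_len + header_len <= UINT16_MAX);
    wire = htons((uint16_t)(payload_len + header_len));
    memcpy(out, &wire, sizeof(wire));
}
```

For a 1480-byte payload and a 20-byte header the two output bytes are 0x05 and 0xdc, the big-endian encoding of 1500, regardless of the host's byte order.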
* Unbreak the build. carp_iamatch6 and carp_macmatch6 are not supposed to be | mlaier | 2005-02-27 | 1 | -2/+2
  static as they are used elsewhere.
* Remove carp_softc.sc_ifp member in favor of union pointers in struct ifnet. | glebius | 2005-02-26 | 1 | -21/+21
  Obtained from: OpenBSD
* Staticize local functions. | glebius | 2005-02-26 | 1 | -53/+53
* New lines when logging. | glebius | 2005-02-25 | 1 | -17/+18
* Embrace macros with do {} while (0) | glebius | 2005-02-25 | 1 | -2/+4
  Submitted by: maxim
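The reason for wrapping macros this way is that a multi-statement macro then behaves as a single statement. A minimal sketch follows; `COUNT_AND_MARK` and `record_event` are hypothetical names chosen for the example and do not appear in ip_carp.c.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Wrapped in do { } while (0), the two statements expand to one
 * statement, so the macro is safe as the body of an unbraced if/else.
 */
#define COUNT_AND_MARK(counter, flag) do {  \
    (counter)++;                            \
    (flag) = 1;                             \
} while (0)

int record_event(int enabled, int *counter)
{
    int flag = 0;

    assert(counter != NULL);
    if (enabled)
        COUNT_AND_MARK(*counter, flag);
    else
        flag = -1;
    return flag;
}
```

Without the wrapper, the bare-braces form `{ ...; ...; }` followed by the caller's semicolon would end the `if` early and the `else` would fail to compile.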
* Call carp_carpdev_state() from carp_set_addr6(). See log for rev 1.4. | glebius | 2005-02-25 | 1 | -1/+1
  Sponsored by: Rambler
* Improve logging: | glebius | 2005-02-25 | 1 | -38/+52
  - Simplify CARP_LOG() and make it work (we don't have addlog in FreeBSD).
  - Introduce CARP_DEBUG() which logs with LOG_DEBUG severity when
    net.inet.carp.log > 1
  - Use CARP_DEBUG to log state changes of carp interfaces.
  After CARP_LOG() cleanup it appeared that carp_input_c() does not need sc
  argument. Remove it.
  Sponsored by: Rambler
* Fix problem when master comes up with one interface down, and preempts | glebius | 2005-02-24 | 1 | -2/+2
  mastering on all other interfaces:
  - call carp_carpdev_state() on initialize instead of just setting to INIT
  - in carp_carpdev_state() check that interface is UP, instead of checking
    that it is not DOWN, because a rebooted machine may have interface in
    UNKNOWN state.
  Sponsored by: Rambler
  Obtained from: OpenBSD (partially)
* Unbreak CARP build on 64-bit architectures. | mux | 2005-02-23 | 1 | -1/+1
  Tested on: sparc64
* Remove promisc counter from parent interface in carp_clone_destroy(), | glebius | 2005-02-22 | 1 | -0/+1
  so that parent interface is not left in promiscuous mode after carp
  interface is destroyed.
  This is not perfect, since promisc counter is added when carp interface
  is assigned an IP address. However, when address is removed parent
  interface is still in promiscuous mode. Only removal of carp interface
  removes promisc from parent. Same way in OpenBSD.
  Sponsored by: Rambler
* Add CARP (Common Address Redundancy Protocol), which allows multiple | glebius | 2005-02-22 | 1 | -0/+2032
  hosts to share an IP address, providing high availability and load
  balancing.
  Original work on CARP done by Michael Shalayeff, with many
  additions by Marco Pfatschbacher and Ryan McBride.
  FreeBSD port done solely by Max Laier.
  Patch by: mlaier
  Obtained from: OpenBSD (mickey, mcbride)