path: root/sys/netinet/in.c
Commit message | Author | Age | Files | Lines
...
* Remove the reference held on the loopback route when the interface address is being deleted. Only the last reference holder deletes the loopback route. All other delete operations just clear the IFA_RTSELF flag.
  PR: kern/159601
  Submitted by: pluknet
  Reviewed by: discussed on net@
  MFC after: 3 days
  Author: qingli | Age: 2011-10-07 | Files: 1 | Lines: -1/+3
* A system may have multiple physical interfaces, all of which are on the same prefix. Since a single route entry is installed for the prefix (without RADIX_MPATH), incoming packets on the interfaces that are not associated with the prefix route may trigger an error message about being unable to allocate an LLE entry, and fail at L2. This patch makes sure a valid route is present in the system, allows the aforementioned condition to exist, and treats it as valid.
  Reviewed by: bz
  MFC after: 5 days
  Author: qingli | Age: 2011-10-03 | Files: 1 | Lines: -5/+34
* This patch allows ARP to work properly in the presence of self-referencing routes. This patch is a rework of r223862.
  Reviewed by: bz, zec
  MFC after: 5 days
  Author: qingli | Age: 2011-10-03 | Files: 1 | Lines: -14/+21
* When an interface address route is removed from the system, another route with the same prefix is searched for as a replacement. The current code did not bypass routes that have non-operational interfaces. This patch fixes that bug and will find a replacement route with an active interface.
  PR: kern/159603
  Submitted by: pluknet, ambrisko at ambrisko dot com
  Reviewed by: discussed on net@
  Approved by: re (bz)
  MFC after: 3 days
  Author: qingli | Age: 2011-08-28 | Files: 1 | Lines: -1/+2
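The replacement search described in this entry can be sketched as follows. This is a minimal illustration, not the code in in.c: `struct demo_ifaddr`, its fields, and `demo_find_replacement()` are hypothetical stand-ins for the kernel's address and interface structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in; not the kernel's struct in_ifaddr. */
struct demo_ifaddr {
    uint32_t addr;   /* IPv4 address, host byte order */
    uint32_t mask;   /* prefix mask, host byte order */
    bool     if_up;  /* true when the owning interface is operational */
};

/*
 * Return the first address other than `removed` that shares its prefix and
 * sits on an operational interface, or NULL when no such address exists.
 */
const struct demo_ifaddr *
demo_find_replacement(const struct demo_ifaddr *list, size_t n,
    const struct demo_ifaddr *removed)
{
    uint32_t prefix = removed->addr & removed->mask;

    for (size_t i = 0; i < n; i++) {
        const struct demo_ifaddr *ia = &list[i];

        if (ia == removed)
            continue;
        if (ia->mask != removed->mask || (ia->addr & ia->mask) != prefix)
            continue;   /* different prefix */
        if (!ia->if_up)
            continue;   /* bypass non-operational interfaces */
        return ia;
    }
    return NULL;
}
```

Skipping candidates whose interface is down is the step the entry says was missing before the fix.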
* If the RTF_HOST flag is specified, then we are interested in the destination address.
  PR: kern/159600
  Submitted by: Svatopluk Kraus <onwahe at gmail dot com>
  Approved by: re (hrs)
  Author: kevlo | Age: 2011-08-10 | Files: 1 | Lines: -1/+1
* Permit ARP to proceed for IPv4 host routes for which the gateway is the same as the host address. This already works fine for INET6 and ND6.
  While here, remove two function pointers from struct lltable which are only initialized but never used.
  MFC after: 3 days
  Author: zec | Age: 2011-07-08 | Files: 1 | Lines: -2/+12
* Supply the LLE_STATIC flag bit to in_ifscrub() when scrubbing an interface address so that proper cleanup will take place in the routing code. This patch fixes the bootp panic on startup problem.
  Also, add more error handling and logging code in function in_scrubprefix().
  MFC after: 5 days
  Author: qingli | Age: 2011-05-29 | Files: 1 | Lines: -8/+17
* The statically configured (permanent) ARP entries are removed when an interface is brought down, even though the interface address is still valid. This patch maintains the permanent ARP entries as long as the interface address (having the same prefix as that of the ARP entries) is valid.
  Reviewed by: delphij
  MFC after: 5 days
  Author: qingli | Age: 2011-05-20 | Files: 1 | Lines: -16/+24
* Reference ifaddr object before unlocking as it can be freed from another context at the moment of later access.
  PR: kern/155555
  Submitted by: Andrew Boyer <aboyer att averesystems.com>
  Approved by: avg (mentor)
  MFC after: 2 weeks
  Author: pluknet | Age: 2011-03-21 | Files: 1 | Lines: -2/+3
* Use time_uptime instead of non-monotonic time_second to drive ARP timeouts.
  Suggested by: bde
  Author: glebius | Age: 2010-11-30 | Files: 1 | Lines: -1/+1
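A userland analogy may help show why a monotonic clock suits timeouts. The sketch below uses POSIX `clock_gettime(CLOCK_MONOTONIC)` as an assumed stand-in for the kernel's `time_uptime`; `struct demo_entry` and the `demo_*` helpers are hypothetical and do not come from the ARP code.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical entry; the kernel's link-layer entry is not reproduced here. */
struct demo_entry {
    time_t expire;  /* absolute deadline on the monotonic clock, in seconds */
};

/* Seconds since an arbitrary fixed point; unaffected by wall-clock changes. */
time_t demo_uptime(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec;
}

/* Arm the entry to expire `lifetime` seconds from now. */
void demo_arm(struct demo_entry *e, time_t lifetime)
{
    e->expire = demo_uptime() + lifetime;
}

/* True once the deadline has been reached. */
bool demo_expired(const struct demo_entry *e)
{
    return demo_uptime() >= e->expire;
}
```

Because the deadline is computed and checked on the same monotonic clock, stepping the wall clock forward or backward (for example by NTP or an administrator) cannot expire an entry early or keep it alive too long.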
* After some off-list discussion, revert a number of changes to the DPCPU_DEFINE and VNET_DEFINE macros, as these cause problems for various people working on the affected files. A better long-term solution is still being considered. This reversal may give some modules empty set_pcpu or set_vnet sections, but these are harmless.
  Changes reverted:
  ------------------------------------------------------------------------
  r215318 | dim | 2010-11-14 21:40:55 +0100 (Sun, 14 Nov 2010) | 4 lines
  Instead of unconditionally emitting .globl's for the __start_set_xxx and __stop_set_xxx symbols, only emit them when the set_vnet or set_pcpu sections are actually defined.
  ------------------------------------------------------------------------
  r215317 | dim | 2010-11-14 21:38:11 +0100 (Sun, 14 Nov 2010) | 3 lines
  Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout the tree.
  ------------------------------------------------------------------------
  r215316 | dim | 2010-11-14 21:23:02 +0100 (Sun, 14 Nov 2010) | 2 lines
  Add macros to define static instances of VNET_DEFINE and DPCPU_DEFINE.
  Author: dim | Age: 2010-11-22 | Files: 1 | Lines: -2/+2
* Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout the tree.
  Author: dim | Age: 2010-11-14 | Files: 1 | Lines: -2/+2
* Add a queue to hold packets while we await an ARP reply.
  When a fast machine first brings up some non-TCP networking program it is quite possible that we will drop packets due to the fact that only one packet can be held per ARP entry. This leads to packets being missed when a program starts or restarts if the ARP data is not currently in the ARP cache.
  This code adds a new sysctl, net.link.ether.inet.maxhold, which defines a system-wide maximum number of packets to be held in each ARP entry. Up to maxhold packets are queued until an ARP reply is received or the ARP times out. The default setting is the old value of 1, which has been part of the BSD networking code since time immemorial.
  Expose the time we hold an incomplete ARP entry by adding the sysctl net.link.ether.inet.wait, which defaults to 20 seconds, the value used when the new ARP code was added.
  Reviewed by: bz, rpaulo
  MFC after: 3 weeks
  Author: gnn | Age: 2010-11-12 | Files: 1 | Lines: -1/+7
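The bounded hold queue can be illustrated with a small sketch. Everything here is hypothetical: `struct demo_pkt` stands in for a packet buffer, `struct demo_hold` for the per-entry queue, and the choice to drop the oldest packet when the limit is reached is an assumption, since the entry above only states that up to maxhold packets are queued.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical packet and queue types; not the kernel's mbuf or llentry. */
struct demo_pkt {
    int id;
    struct demo_pkt *next;
};

struct demo_hold {
    struct demo_pkt *head;  /* oldest held packet */
    struct demo_pkt *tail;  /* newest held packet */
    int numheld;            /* packets currently queued */
    int dropped;            /* packets discarded because the queue was full */
};

/*
 * Queue a packet while address resolution is pending. When maxhold packets
 * are already held, the oldest one is freed first (assumed drop policy).
 * Returns 0 on success, -1 if allocation fails.
 */
int demo_hold_enqueue(struct demo_hold *h, int id, int maxhold)
{
    struct demo_pkt *p = malloc(sizeof(*p));

    if (p == NULL)
        return -1;
    p->id = id;
    p->next = NULL;

    if (h->numheld >= maxhold && h->head != NULL) {
        struct demo_pkt *old = h->head;

        h->head = old->next;
        if (h->head == NULL)
            h->tail = NULL;
        free(old);
        h->numheld--;
        h->dropped++;
    }
    if (h->tail != NULL)
        h->tail->next = p;
    else
        h->head = p;
    h->tail = p;
    h->numheld++;
    return 0;
}

/* Release every held packet in arrival order; returns how many were held. */
int demo_hold_flush(struct demo_hold *h)
{
    int n = 0;

    while (h->head != NULL) {
        struct demo_pkt *p = h->head;

        h->head = p->next;
        free(p);    /* a real stack would transmit the packet here */
        n++;
    }
    h->tail = NULL;
    h->numheld = 0;
    return n;
}
```

With a limit of 1 this degenerates to the old behavior of holding a single packet per entry.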
* MfP4 CH182763 (original version):
  Make it harder to exploit certain in_control() related races between the initial lookup at the beginning and the time we will remove the entry from the lists by re-checking that entry is still in the list before trying to remove it.
  (*) It is believed that with the current code and locking strategy we cannot completely fix all races.
  Reported by: Nima Misaghian (nima_misa hotmail.com) on net@ 20100817
  Tested by: Nima Misaghian (nima_misa hotmail.com) (original version)
  PR: kern/146250
  Submitted by: Mikolaj Golub (to.my.trociny gmail.com) (different version)
  MFC after: 1 week
  Author: bz | Age: 2010-10-16 | Files: 1 | Lines: -0/+15
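The re-check idea can be sketched with the `sys/queue.h` tail queue macros. The types and the `demo_remove_if_present()` helper are hypothetical; in a kernel setting the walk and the removal would happen under the list's lock, which this single-threaded sketch leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical address record linked on a tail queue. */
struct demo_addr {
    int id;
    TAILQ_ENTRY(demo_addr) link;
};
TAILQ_HEAD(demo_addrhead, demo_addr);

/*
 * Remove `target` only if it is still linked on `head`. Returns true when
 * the removal happened and false when the entry was already gone, for
 * example because another context unlinked it after the initial lookup.
 */
bool demo_remove_if_present(struct demo_addrhead *head,
    struct demo_addr *target)
{
    struct demo_addr *it;

    TAILQ_FOREACH(it, head, link) {
        if (it == target) {
            TAILQ_REMOVE(head, target, link);
            return true;
        }
    }
    return false;
}
```

A second removal attempt on the same entry returns false instead of corrupting the list, which is the property the entry relies on. As the entry notes, this narrows the race rather than eliminating it.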
* In case of RADIX_MPATH do not leak the IN_IFADDR read lock on early return.
  MFC after: 3 days
  Author: bz | Age: 2010-09-04 | Files: 1 | Lines: -2/+3
* Allow carp(4) to be loaded as a kernel module. Follow precedent set by bridge(4), lagg(4) etc. and make use of function pointers and pf_proto_register() to hook carp into the network stack.
  Currently, because of the uncertainty about whether the unload path is free of race condition panics, unloads are disallowed by default. Compiling with CARPMOD_CAN_UNLOAD in CFLAGS removes this anti foot shooting measure.
  This commit requires IP6PROTOSPACER, introduced in r211115.
  Reviewed by: bz, simon
  Approved by: ken (mentor)
  MFC after: 2 weeks
  Author: will | Age: 2010-08-11 | Files: 1 | Lines: -6/+3
* This patch fixes the problem where proxy ARP entries cannot be added over the if_ng interface.
  MFC after: 3 days
  Author: qingli | Age: 2010-05-25 | Files: 1 | Lines: -2/+3
* MFP4: @176978-176982, 176984, 176990-176994, 177441
  "Whitespace" churn after the VIMAGE/VNET whirls.
  Remove the need for some "init" functions within the network stack, like pim6_init(), icmp_init() or significantly shorten others like ip6_init() and nd6_init(), using static initialization again where possible and formerly missed.
  Move (most) variables back to the place they used to be before the container structs and VIMAGE_GLOBALS (before r185088) and try to reduce the diff to stable/7 and earlier as much as possible, to help out-of-tree consumers to update from 6.x or 7.x to 8 or 9.
  This also removes some header file pollution for putatively static global variables.
  Revert VIMAGE specific changes in ipfilter::ip_auth.c, that are no longer needed.
  Reviewed by: jhb
  Discussed with: rwatson
  Sponsored by: The FreeBSD Foundation
  Sponsored by: CK Software GmbH
  MFC after: 6 days
  Author: bz | Age: 2010-04-29 | Files: 1 | Lines: -6/+5
* Plug reference leaks in the link-layer code ("new-arp") that previously prevented the link-layer entry from being freed.
  In both in.c and in6.c (though that code path seems to be basically dead) plug a reference leak in case of a pending callout being drained.
  In if_ether.c consistently add a reference before resetting the callout and in case we canceled a pending one remove the reference for that. In the final case in arptimer, before freeing the expired entry, remove the reference again and explicitly call callout_stop() to clear the active flag.
  In nd6.c:nd6_free() we are only ever called from the callout function and thus need to remove the reference there as well before calling into llentry_free().
  In if_llatbl.c when freeing entire tables make sure that in case we cancel a pending callout to remove the reference as well.
  Reviewed by: qingli (earlier version)
  MFC after: 10 days
  Problem observed, patch tested by: simon on ipv6gw.f.o, Christian Kratzer (ck cksoft.de), Evgenii Davidov (dado korolev-net.ru)
  PR: kern/144564
  Configurations still affected: with options FLOWTABLE
  Author: bz | Age: 2010-04-11 | Files: 1 | Lines: -1/+5
* One of the advantages of enabling ECMP (a.k.a. RADIX_MPATH) is to allow for connection load balancing across interfaces. Currently the address alias handling method is colliding with the ECMP code. For example, when two interfaces are configured on the same prefix, only one prefix route is installed. So connection load balancing among the available interfaces is not possible.
  The other advantage of ECMP is for failover. The issue with the current code is that the interface link-state is not reflected in the route entry. For example, if there are two interfaces on the same prefix and the cable on one interface is unplugged, new and existing connections should switch over to the other interface. This is not done today and packets go into a black hole.
  Also, there is a small bug in the kernel where deleting ECMP routes in the userland will always return an error even though the command is successfully executed.
  MFC after: 5 days
  Author: qingli | Age: 2010-03-09 | Files: 1 | Lines: -0/+8
* Some of the existing ppp and vpn related scripts create and set the IP addresses of the tunnel end points to the same value. In these cases the loopback route is not installed for the local end.
  Verified by: avg
  MFC after: 5 days
  Author: qingli | Age: 2010-02-02 | Files: 1 | Lines: -0/+6
* Ensure an address is removed from the interface address list when the installation of that address fails.
  PR: 139559
  Author: qingli | Age: 2010-01-08 | Files: 1 | Lines: -1/+1
* Consolidate the route message generation code for when address aliases were added or deleted. The announced route entry for an address alias is no longer empty because this empty route entry was causing some route daemon to fail and exit abnormally.
  MFC after: 5 days
  Author: qingli | Age: 2009-12-30 | Files: 1 | Lines: -48/+49
* The proxy arp entries could not be added into the system over the IFF_POINTOPOINT link types. The reason was that the routing entry returned from the kernel covering the remote end is of an interface type that does not support ARP. This patch fixes this problem by providing a hint to the kernel routing code, which indicates the prefix route instead of the PPP host route should be returned to the caller.
  Since a host route to the local end point is also added into the routing table, and there could be multiple such instantiations because multiple PPP links can be created with the same local end IP address, this patch also fixes the loopback route installation failure problem observed prior to this patch. The reference count of the loopback route to the local end would be either incremented or decremented. The first instantiation would create the entry and the last removal would delete the route entry.
  MFC after: 5 days
  Author: qingli | Age: 2009-12-30 | Files: 1 | Lines: -8/+44
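The create-on-first, delete-on-last behavior of the shared loopback route can be modeled in a few lines. This is only a sketch under assumed names: `struct demo_lo_route` and the two helpers are hypothetical and do not correspond to functions in in.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one loopback host route shared by several links. */
struct demo_lo_route {
    bool installed;  /* true while the route exists in the table */
    int  refcnt;     /* number of address instances using the route */
};

/* Called for each address instance that shares the same local IP. */
void demo_lo_route_add(struct demo_lo_route *rt)
{
    if (rt->refcnt++ == 0)
        rt->installed = true;   /* first instance creates the route */
}

/* Drop one instance; returns true when this call deleted the route. */
bool demo_lo_route_del(struct demo_lo_route *rt)
{
    assert(rt->refcnt > 0);
    if (--rt->refcnt == 0) {
        rt->installed = false;  /* last removal deletes the route */
        return true;
    }
    return false;
}
```

Two PPP links with the same local address would call the add helper twice, and the route would survive until both have been removed.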
* Use the correct option name in the preprocessor command to enable or disable diagnostic messages.
  Reviewed by: ru
  MFC after: 3 days
  Author: qingli | Age: 2009-10-23 | Files: 1 | Lines: -3/+3
* This patch fixes the following issues in the ARP operation:
  1. There is a regression issue in the ARP code. The incomplete ARP entry was timing out too quickly (1 second timeout), as such, a new entry is created each time arpresolve() is called. Therefore the maximum attempts made is always 1. Consequently the error code returned to the application is always 0.
  2. Set the expiration of each incomplete entry to a 20-second lifetime.
  3. Return "incomplete" entries to the application.
  Reviewed by: kmacy
  MFC after: 3 days
  Author: qingli | Age: 2009-10-15 | Files: 1 | Lines: -3/+8
* Remove a log message from production code. This log message can be triggered by a misconfigured host that is sending out gratuitous ARPs. This log message can also be triggered during a network renumbering event when multiple prefixes co-exist on a single network segment.
  MFC after: immediately
  Author: qingli | Age: 2009-10-02 | Files: 1 | Lines: -0/+2
* Previously, if an address alias is configured on an interface, and this address alias has a prefix matching that of another address configured on the same interface, then the ARP entry for the alias is not deleted from the ARP table when that address alias is removed. This patch fixes the aforementioned issue.
  PR: kern/139113
  MFC after: 3 days
  Author: qingli | Age: 2009-10-02 | Files: 1 | Lines: -2/+2
* Self-pointing routes are installed for configured interface addresses and address aliases. After an interface is brought down and brought back up again, those self-pointing routes disappeared. This patch ensures that after an interface is brought back up, the loopback routes are reinstalled properly.
  Reviewed by: bz
  MFC after: immediately
  Author: qingli | Age: 2009-09-15 | Files: 1 | Lines: -37/+5
* The bootp code installs an interface address and the nfs client module tries to install the same address again. This extra code is removed, which was discovered by the removal of a call to in_ifscrub() in r196714. This call to in_ifscrub is put back here because the SIOCAIFADDR command can be used to change the prefix length of an existing alias.
  Reviewed by: kmacy
  Author: qingli | Age: 2009-09-15 | Files: 1 | Lines: -0/+11
* Add arp_update_event. This replaces route_arp_update_event, which has not worked since the arp-v2 rewrite.
  The event handler will be called with the llentry write-locked and can examine la_flags to determine whether the entry is being added or removed.
  Reviewed by: gnn, kmacy
  Approved by: gnn (mentor)
  MFC after: 1 month
  Author: np | Age: 2009-09-08 | Files: 1 | Lines: -0/+1
* This patch fixes the following issues:
  - Routing messages are not generated when adding and removing interface address aliases.
  - Loopback route installed for an interface address alias is not deleted from the routing table when that address alias is removed from the associated interface.
  - Function in_ifscrub() is called extraneously.
  Reviewed by: gnn, kmacy, sam
  MFC after: 3 days
  Author: qingli | Age: 2009-08-31 | Files: 1 | Lines: -5/+48
* Use locks specific to the lltable code, rather than borrow the ifnet list/index locks, to protect link layer address tables. This avoids lock order issues during interface teardown, but maintains the bug that sysctl copy routines may be called while a non-sleepable lock is held.
  Reviewed by: bz, kmacy
  MFC after: 3 days
  Author: rwatson | Age: 2009-08-25 | Files: 1 | Lines: -1/+1
* Rework global locks for interface list and index management, correcting several critical bugs, including race conditions and lock order issues:
  Replace the single rwlock, ifnet_lock, with two locks, an rwlock and an sxlock. Either can be held to stabilize the lists and indexes, but both are required to write. This allows the list to be held stable in both network interrupt contexts and sleepable user threads across sleeping memory allocations or device driver interactions. As before, writes to the interface list must occur from sleepable contexts.
  Reviewed by: bz, julian
  MFC after: 3 days
  Author: rwatson | Age: 2009-08-23 | Files: 1 | Lines: -6/+1
* Merge the remainder of kern_vimage.c and vimage.h into vnet.c and vnet.h; we now use jails (rather than vimages) as the abstraction for virtualization management, and what remained was specific to virtual network stacks. Minor cleanups are done in the process, and comments updated to reflect these changes.
  Reviewed by: bz
  Approved by: re (vimage blanket)
  Author: rwatson | Age: 2009-08-01 | Files: 1 | Lines: -1/+0
* This patch does the following:
  - Allow loopback route to be installed for address assigned to interface of IFF_POINTOPOINT type.
  - Install loopback route for an IPv4 interface address when the "useloopback" sysctl variable is enabled. Similarly, install loopback route for an IPv6 interface address when the sysctl variable "nd6_useloopback" is enabled. Deleting loopback routes for interface addresses is unconditional in case these sysctl variables were disabled after an interface address has been assigned.
  Reviewed by: bz
  Approved by: re
  Author: qingli | Age: 2009-07-27 | Files: 1 | Lines: -2/+13
* Remove unused VNET_SET() and related macros; only VNET_GET() is ever actually used. Rename VNET_GET() to VNET() to shorten variable references.
  Discussed with: bz, julian
  Reviewed by: bz
  Approved by: re (kensmith, kib)
  Author: rwatson | Age: 2009-07-16 | Files: 1 | Lines: -3/+3
* Build on Jeff Roberson's linker-set based dynamic per-CPU allocator (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables.
  Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing them in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of the kernel linker.
  Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided.
  This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS.
  Bump __FreeBSD_version and update UPDATING.
  Portions submitted by: bz
  Reviewed by: bz, zec
  Discussed with: gnn, jamie, jeff, jhb, julian, sam
  Suggested by: peter
  Approved by: re (kensmith)
  Author: rwatson | Age: 2009-07-14 | Files: 1 | Lines: -19/+13
* Add a new global rwlock, in_ifaddr_lock, which will synchronize use of the in_ifaddrhead and INADDR_HASH address lists.
  Previously, these lists were used unsynchronized as they were effectively never changed in steady state, but we've seen increasing reports of writer-writer races on very busy VPN servers as core count has gone up (and similar configurations where address lists change frequently and concurrently).
  For the time being, use rwlocks rather than rmlocks in order to take advantage of their better lock debugging support. As a result, we don't enable ip_input()'s read-locking of INADDR_HASH until an rmlock conversion is complete and a performance analysis has been done. This means that one class of reader-writer races still exists.
  MFC after: 6 weeks
  Reviewed by: bz
  Author: rwatson | Age: 2009-06-25 | Files: 1 | Lines: -15/+46
* Modify most routines returning 'struct ifaddr *' to return references rather than pointers, requiring callers to properly dispose of those references. The following routines now return references:
    ifaddr_byindex
    ifa_ifwithaddr
    ifa_ifwithbroadaddr
    ifa_ifwithdstaddr
    ifa_ifwithnet
    ifaof_ifpforaddr
    ifa_ifwithroute
    ifa_ifwithroute_fib
    rt_getifa
    rt_getifa_fib
    IFP_TO_IA
    ip_rtaddr
    in6_ifawithifp
    in6ifa_ifpforlinklocal
    in6ifa_ifpwithaddr
    in6_ifadd
    carp_iamatch6
    ip6_getdstifaddr
  Remove unused macro which didn't have required referencing:
    IFP_TO_IA6
  This closes many small races in which changes to interface or address lists while an ifaddr was in use could lead to use of freed memory (etc). In a few cases, add missing if_addr_list locking required to safely acquire references.
  Because of a lack of deep copying support, we accept a race in which an in6_ifaddr pointed to by mbuf tags and extracted with ip6_getdstifaddr() doesn't hold a reference while in transmit. Once we have mbuf tag deep copy support, this can be fixed.
  Reviewed by: bz
  Obtained from: Apple, Inc. (portions)
  MFC after: 6 weeks (portions)
  Author: rwatson | Age: 2009-06-23 | Files: 1 | Lines: -65/+50
* Clean up common ifaddr management:
  - Unify reference count and lock initialization in a single function, ifa_init().
  - Move tear-down from a macro (IFAFREE) to a function ifa_free().
  - Move reference count bump from a macro (IFAREF) to a function ifa_ref().
  - Switch from a u_int protected by a mutex to refcount(9) for reference count management.
  The ifa_mtx is now used for exactly one ioctl, and possibly should be removed.
  MFC after: 3 weeks
  Author: rwatson | Age: 2009-06-21 | Files: 1 | Lines: -3/+2
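The move from a mutex-protected counter to an atomic reference count can be illustrated in portable C11. This sketch does not use refcount(9) itself; `struct demo_ifa` and the `demo_ifa_*` functions are hypothetical, and starting the count at 1 on allocation is an assumption made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical reference-counted object; not the kernel's struct ifaddr. */
struct demo_ifa {
    atomic_uint refcnt;
    int payload;
};

/* Allocate an object holding one reference owned by the caller. */
struct demo_ifa *demo_ifa_alloc(void)
{
    struct demo_ifa *ifa = calloc(1, sizeof(*ifa));

    if (ifa != NULL)
        atomic_init(&ifa->refcnt, 1);
    return ifa;
}

/* Take an additional reference; no lock is needed. */
void demo_ifa_ref(struct demo_ifa *ifa)
{
    atomic_fetch_add_explicit(&ifa->refcnt, 1, memory_order_relaxed);
}

/*
 * Drop one reference and free the object when the last one goes away.
 * Returns true when this call freed the object.
 */
bool demo_ifa_free(struct demo_ifa *ifa)
{
    if (atomic_fetch_sub_explicit(&ifa->refcnt, 1,
        memory_order_acq_rel) == 1) {
        free(ifa);
        return true;
    }
    return false;
}
```

Keeping acquire and release in functions rather than macros, as the entry does with ifa_ref() and ifa_free(), gives one place to change the counting strategy later.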
* After r193232 rt_tables in vnet.h are no longer indirectly dependent on the ROUTETABLES kernel option thus there is no need to include opt_route.h anymore in all consumers of vnet.h and no longer depend on it for module builds.
  Remove the hidden include in flowtable.h as well and leave the two explicit #includes in ip_input.c and ip_output.c.
  Author: bz | Age: 2009-06-08 | Files: 1 | Lines: -1/+0
* If including vnet.h one has to include opt_route.h as well. This is because struct vnet_net holds the rt_tables[][] for MRT and array size is compile time dependent. If you had ROUTETABLES set to >1 after r192011 V_loif was pointing into nonsense leading to strange results or even panics for some people.
  Reviewed by: mz
  Author: bz | Age: 2009-05-22 | Files: 1 | Lines: -0/+1
* When an interface address is removed and the last prefix route is also being deleted, the link-layer address table (arp or nd6) will flush those L2 llinfo entries that match the removed prefix.
  Reviewed by: kmacy
  Author: qingli | Age: 2009-05-20 | Files: 1 | Lines: -0/+44
* Unbreak options VIMAGE builds, in a followup to r192011 which did not introduce INIT_VNET_NET() initializers necessary for accessing V_loif.
  Submitted by: zec
  Reviewed by: julian
  Author: bz | Age: 2009-05-17 | Files: 1 | Lines: -0/+2
* Ignore the INADDR_ANY address inserted/deleted by DHCP when installing a loopback route to the interface address.
  Author: qingli | Age: 2009-05-14 | Files: 1 | Lines: -1/+5
* This patch adds a host route to an interface address (that is assigned to a non-loopback/ppp link type) through the loopback interface. Prior to the new L2/L3 rewrite, this host route was implicitly added by the L2 code during RTM_RESOLVE of that interface address. This host route is deleted when that interface is removed.
  Reviewed by: kmacy
  Author: qingli | Age: 2009-05-12 | Files: 1 | Lines: -1/+47
* In preparation for turning on options VIMAGE in next commits, rearrange / replace / adjust several INIT_VNET_* initializer macros, all of which currently resolve to whitespace.
  Reviewed by: bz (an older version of the patch)
  Approved by: julian (mentor)
  Author: zec | Age: 2009-04-26 | Files: 1 | Lines: -1/+0
* Expand coverage of IF_ADDR_LOCK() in in_control() from point of initial lookup of 'ia' from if_addrhead through most use. Note that we currently have to drop it prematurely in some cases due to calls out to the routing and interface code while using 'ia', but this closes many races. Annotate several potential races that persist after this change.
  Move to using M_NOWAIT for allocating new interface addresses due to lock(s) being held.
  MFC after: 3 weeks
  Author: rwatson | Age: 2009-04-25 | Files: 1 | Lines: -31/+81
* In in_purgemaddrs(), remove the inm being freed from the address list before freeing it, rather than vice versa, to avoid potential use after free.
  Reviewed by: bms
  Author: rwatson | Age: 2009-04-24 | Files: 1 | Lines: -1/+1
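The unlink-then-free ordering can be shown with the `sys/queue.h` list macros. The record type and `demo_purge()` are hypothetical and only mirror the ordering described in this entry, not the body of in_purgemaddrs().

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical multicast membership record linked on a list. */
struct demo_inm {
    int group;
    LIST_ENTRY(demo_inm) link;
};
LIST_HEAD(demo_inmhead, demo_inm);

/* Free every record on the list; returns how many were released. */
int demo_purge(struct demo_inmhead *head)
{
    int n = 0;

    while (!LIST_EMPTY(head)) {
        struct demo_inm *inm = LIST_FIRST(head);

        LIST_REMOVE(inm, link);  /* unlink first, while the memory is valid */
        free(inm);               /* then free; the list no longer points at it */
        n++;
    }
    return n;
}
```

Reversing the two steps would make LIST_REMOVE read and write the link fields of memory that has already been freed.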