summaryrefslogtreecommitdiffstats
path: root/sys/netinet/in.c
Commit message (Collapse)AuthorAgeFilesLines
...
* Relocate permissions checking code in in_control() to before the bodyrwatson2009-04-241-14/+25
| | | | | | | | | | | | | of the implementation of ioctls. This makes the mapping of ioctls to specific privileges more explicit, and also simplifies the implementation by reducing the use of FALLTHROUGH handling in switch. While this is not intended to be a functional change, it does mean that certain privilege checks are now performed earlier, so EPERM might be returned in preference to EADDRNOTAVAIL for management ioctls that could have failed for both reasons. MFC after: 3 weeks
* Reorganize in_control() so that invariants are more obvious, and sorwatson2009-04-231-33/+51
| | | | | | | | | | | | | that it is easier to lock: - Handle the unsupported ioctl case at the beginning of in_control(), handing off to ifp->if_ioctl, rather than looking up interfaces and addresses unnecessarily in this case. - Make it an invariant that ifp is always non-NULL when running in_control()-implemented ioctls, simplifying the code structure. MFC after: 3 weeks
* Protect against some writer-writer races in in_control() by acquiringrwatson2009-04-191-3/+7
| | | | | | | the interface address list lock around interface address list modifications. More to do here. MFC after: 2 weeks
* Deal with the case where ifma_protospec may be NULL, duringbms2009-03-171-1/+8
| | | | | | | | | | | | | | | | | | any IPv4 multicast operations which reference it. There is a potential race because ifma_protospec is set to NULL when we discover the underlying ifnet has gone away. This write is not covered by the IF_ADDR_LOCK, and it's difficult to widen its scope without making it a recursive lock. It isn't clear why this manifests more quickly with 802.11 interfaces, but does not seem to manifest at all with wired interfaces. With this change, the 802.11 related panics reported by sam@ and cokane@ should go away. It is not the right fix, that requires more thought before 8.0. Idea from: sam Tested by: cokane
* Remove IFF_NEEDSGIANT, a compatibility infrastructure introducedrwatson2009-03-151-10/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | in FreeBSD 5.x to allow network device drivers to run with Giant despite the network stack being Giant-free. This significantly simplifies calls into ioctl() on network interfaces, especially in the multicast code, as well as eliminates deferred invocation of interface if_start routines. Disable the build on device drivers still depending on IFF_NEEDSGIANT as they no longer compile. They will be removed in a few weeks if they haven't been made MPSAFE in that time. Disabled drivers: if_ar if_axe if_aue if_cdce if_cue if_kue if_ray if_rue if_rum if_sr if_udav if_ural if_zyd Drivers that were already disabled because of tty changes: if_ppp if_sl Discussed on: arch@
* Fix uninitialized use of ifp for ii.bms2009-03-091-1/+3
| | | | Found by: Peter Holm
* Merge IGMPv3 and Source-Specific Multicast (SSM) to the FreeBSDbms2009-03-091-36/+74
| | | | | | | | | | | IPv4 stack. Diffs are minimized against p4. PCS has been used for some protocol verification, more widespread testing of recorded sources in Group-and-Source queries is needed. sizeof(struct igmpstat) has changed. __FreeBSD_version is bumped to 800070.
* Standardize the various prison_foo_ip[46] functions and prison_if tojamie2009-02-051-5/+4
| | | | | | | | | | | | | | | return zero on success and an error code otherwise. The possible errors are EADDRNOTAVAIL if an address being checked for doesn't match the prison, and EAFNOSUPPORT if the prison doesn't have any addresses in that address family. For most callers of these functions, use the returned error code instead of e.g. a hard-coded EADDRNOTAVAIL or EINVAL. Always include a jailed() check in these functions, where a non-jailed cred always returns success (and makes no changes). Remove the explicit jailed() checks that preceded many of the function calls. Approved by: bz (mentor)
* remove too noisy DIAGNOSTIC codesam2009-01-181-3/+0
| | | | Reviewed by: qingli
* Restrict arp, ndp and theoretically the FIB listing (if notbz2009-01-091-0/+4
| | | | | | | | | | | | | | | | | read with libkvm) to the addresses of a prison, when inside a jail. [1] As the patch from the PR was pre-'new-arp', add checks to the llt_dump handlers as well. While touching RTM_GET in route_output(), consistently use curthread credentials rather than the creds from the socket there. [2] PR: kern/68189 Submitted by: Mark Delany <sxcg2-fuwxj@qmda.emu.st> [1] Discussed with: rwatson [2] Reviewed by: rwatson MFC after: 4 weeks
* Make SIOCGIFADDR and related, as well as SIOCGIFADDR_IN6 and relatedbz2009-01-091-1/+9
| | | | | | | | | | | | | | | | jail-aware. Up to now we returned the first address of the interface for SIOCGIFADDR w/o an ifr_addr in the query. This caused problems for programs querying for an address but running inside a jail, as the address returned usually did not belong to the jail. Like for v6, if there was an ifr_addr given on v4, you could probe for more addresses on the interfaces that you were not allowed to see from inside a jail. Return an error (EADDRNOTAVAIL) in that case now unless the address is on the given interface and valid for the jail. PR: kern/114325 Reviewed by: rwatson MFC after: 4 weeks
* Set a minimum of information in the routing message (like version and type)harti2009-01-091-0/+4
| | | | | so that generic routing message parsing code can parse the messages for L2 info that are retrieved via the sysctl interface.
* Some modules such as SCTP supplies a valid route entry as an input argumentqingli2009-01-031-1/+2
| | | | | | | | | | | | | | | to ip_output(). The destionation is represented in a sockaddr{} object that may contain other pieces of information, e.g., port number. This same destination sockaddr{} object may be passed into L2 code, which could be used to create a L2 entry. Since there exists a L2 table per address family, the L2 lookup function can make address family specific comparison instead of the generic bcmp() operation over the entire sockaddr{} structure. Note in the IPv6 case the sin6_scope_id is not compared because the address is currently stored in the embedded form inside the kernel. The in6_lltable_lookup() has to account for the scope-id if this storage format were to change in the future.
* For consistency use LLE_IS_VALID() in this 4th place that is actuallybz2008-12-281-1/+1
| | | | | interested in the (void *)-1 return value hack. This way we can easily identify those special parts of the code.
* This checkin addresses a couple of issues:qingli2008-12-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. The "route" command allows route insertion through the interface-direct option "-iface". During if_attach(), an sockaddr_dl{} entry is created for the interface and is part of the interface address list. This sockaddr_dl{} entry describes the interface in detail. The "route" command selects this entry as the "gateway" object when the "-iface" option is present. The "arp" and "ndp" commands also interact with the kernel through the routing socket when adding and removing static L2 entries. The static L2 information is also provided through the "gateway" object with an AF_LINK family type, similar to what is provided by the "route" command. In order to differentiate between these two types of operations, a RTF_LLDATA flag is introduced. This flag is set by the "arp" and "ndp" commands when issuing the add and delete commands. This flag is also set in each L2 entry returned by the kernel. The "arp" and "ndp" command follows a convention where a RTM_GET is issued first followed by a RTM_ADD/DELETE. This RTM_GET request fills in the fields for a "rtm" object, which is reinjected into the kernel by a subsequent RTM_ADD/DELETE command. The entry returend from RTM_GET is a prefix route, so the RTF_LLDATA flag must be specified when issuing the RTM_ADD/DELETE messages. 2. Enforce the convention that NET_RT_FLAGS with a 0 w_arg is the specification for retrieving L2 information. Also optimized the code logic. Reviewed by: julian
* unlock and destroy an llentry's lock before freeingkmacy2008-12-161-0/+2
| | | | Found by: sam
* This main goals of this project are:qingli2008-12-151-0/+240
| | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. separating L2 tables (ARP, NDP) from the L3 routing tables 2. removing as much locking dependencies among these layers as possible to allow for some parallelism in the search operations 3. simplify the logic in the routing code, The most notable end result is the obsolescent of the route cloning (RTF_CLONING) concept, which translated into code reduction in both IPv4 ARP and IPv6 NDP related modules, and size reduction in struct rtentry{}. The change in design obsoletes the semantics of RTF_CLONING, RTF_WASCLONE and RTF_LLINFO routing flags. The userland applications such as "arp" and "ndp" have been modified to reflect those changes. The output from "netstat -r" shows only the routing entries. Quite a few developers have contributed to this project in the past: Glebius Smirnoff, Luigi Rizzo, Alessandro Cerri, and Andre Oppermann. And most recently: - Kip Macy revised the locking code completely, thus completing the last piece of the puzzle, Kip has also been conducting active functional testing - Sam Leffler has helped me improving/refactoring the code, and provided valuable reviews - Julian Elischer setup the perforce tree for me and has helped me maintaining that branch before the svn conversion
* Rather than using hidden includes (with cicular dependencies),bz2008-12-021-0/+1
| | | | | | | | | | | directly include only the header files needed. This reduces the unneeded spamming of various headers into lots of files. For now, this leaves us with very few modules including vnet.h and thus needing to depend on opt_route.h. Reviewed by: brooks, gnn, des, zec, imp Sponsored by: The FreeBSD Foundation
* Unhide declarations of network stack virtualization structs fromzec2008-11-281-1/+0
| | | | | | | | | | | | | | | | | | underneath #ifdef VIMAGE blocks. This change introduces some churn in #include ordering and nesting throughout the network stack and drivers but is not expected to cause any additional issues. In the next step this will allow us to instantiate the virtualization container structures and switch from using global variables to their "containerized" counterparts. Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
* Change the initialization methodology for global variables scheduledzec2008-11-191-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | for virtualization. Instead of initializing the affected global variables at instatiation, assign initial values to them in initializer functions. As a rule, initialization at instatiation for such variables should never be introduced again from now on. Furthermore, enclose all instantiations of such global variables in #ifdef VIMAGE_GLOBALS blocks. Essentialy, this change should have zero functional impact. In the next phase of merging network stack virtualization infrastructure from p4/vimage branch, the new initialization methology will allow us to switch between using global variables and their counterparts residing in virtualization containers with minimum code churn, and in the long run allow us to intialize multiple instances of such container structures. Discussed at: devsummit Strassburg Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
* Style changes only:bz2008-10-261-44/+45
| | | | | | | | | | - Consistently add parentheses to return statements. - Use NULL instead of 0 when comparing pointers, also avoiding unnecessary casts. - Do not use pointers as booleans. Reviewed by: rwatson (earlier version) MFC after: 2 months
* Step 1.5 of importing the network stack virtualization infrastructurezec2008-10-021-4/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | from the vimage project, as per plan established at devsummit 08/08: http://wiki.freebsd.org/Image/Notes200808DevSummit Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator macros, and CURVNET_SET() context setting macros, all currently resolving to NOPs. Prepare for virtualization of selected SYSCTL objects by introducing a family of SYSCTL_V_*() macros, currently resolving to their global counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT(). Move selected #defines from sys/sys/vimage.h to newly introduced header files specific to virtualized subsystems (sys/net/vnet.h, sys/netinet/vinet.h etc.). All the changes are verified to have zero functional impact at this point in time by doing MD5 comparision between pre- and post-change object files(*). (*) netipsec/keysock.c did not validate depending on compile time options. Implemented by: julian, bz, brooks, zec Reviewed by: julian, bz, brooks, kris, rwatson, ... Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
* Commit step 1 of the vimage project, (network stack)bz2008-08-171-11/+12
| | | | | | | | | | | | | | | | | | | | | | | | virtualization work done by Marko Zec (zec@). This is the first in a series of commits over the course of the next few weeks. Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only. We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again. Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch
* In case of interface initialization failure remove struct in_ifaddr* fromgonzo2008-06-241-0/+8
| | | | | | | | | | in_ifaddrhashtbl in in_ifinit because error handler in in_control removes entries only for AF_INET addresses. If in_ifinit is called for the cloned inteface that has just been created its address family is not AF_INET and therefor LIST_REMOVE is not called for respective LIST_INSERT_HEAD and freed entries remain in in_ifaddrhashtbl and lead to memory corruption. PR: kern/124384
* Differentiate between addifaddr and delifaddr for the privilege check.bz2008-01-241-1/+2
| | | | | Reviewed by: rwatson MFC after: 2 weeks
* Add FBSDID to all files in netinet so that people can moresilby2007-10-071-1/+3
| | | | | | easily include file version information in bug reports. Approved by: re (kensmith)
* Simplification to quiet a gcc4.2 warning. Just by setting match.s_addrmjacob2007-06-171-14/+9
| | | | | | | to nonzero you fulfill the same function as the variable 'cmp'. so you might as well zero match and test against it later. Reviewed by: timeout on review request
* Import rewrite of IPv4 socket multicast layer to support source-specificbms2007-06-121-164/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | and protocol-independent host mode multicast. The code is written to accomodate IPv6, IGMPv3 and MLDv2 with only a little additional work. This change only pertains to FreeBSD's use as a multicast end-station and does not concern multicast routing; for an IGMPv3/MLDv2 router implementation, consider the XORP project. The work is based on Wilbert de Graaf's IGMPv3 code drop for FreeBSD 4.6, which is available at: http://www.kloosterhof.com/wilbert/igmpv3.html Summary * IPv4 multicast socket processing is now moved out of ip_output.c into a new module, in_mcast.c. * The in_mcast.c module implements the IPv4 legacy any-source API in terms of the protocol-independent source-specific API. * Source filters are lazy allocated as the common case does not use them. They are part of per inpcb state and are covered by the inpcb lock. * struct ip_mreqn is now supported to allow applications to specify multicast joins by interface index in the legacy IPv4 any-source API. * In UDP, an incoming multicast datagram only requires that the source port matches the 4-tuple if the socket was already bound by source port. An unbound socket SHOULD be able to receive multicasts sent from an ephemeral source port. * The UDP socket multicast filter mode defaults to exclusive, that is, sources present in the per-socket list will be blocked from delivery. * The RFC 3678 userland functions have been added to libc: setsourcefilter, getsourcefilter, setipv4sourcefilter, getipv4sourcefilter. * Definitions for IGMPv3 are merged but not yet used. * struct sockaddr_storage is now referenced from <netinet/in.h>. It is therefore defined there if not already declared in the same way as for the C99 types. * The RFC 1724 hack (specify 0.0.0.0/8 addresses to IP_MULTICAST_IF which are then interpreted as interface indexes) is now deprecated. * A patch for the Rhyolite.com routed in the FreeBSD base system is available in the -net archives. This only affects individuals running RIPv1 or RIPv2 via point-to-point and/or unnumbered interfaces. * Make IPv6 detach path similar to IPv4's in code flow; functionally same. * Bump __FreeBSD_version to 700048; see UPDATING. This work was financially supported by another FreeBSD committer. Obtained from: p4://bms_netdev Submitted by: Wilbert de Graaf (original work) Reviewed by: rwatson (locking), silence from fenner, net@ (but with encouragement)
* Move universally to ANSI C function declarations, with relativelyrwatson2007-05-101-41/+17
| | | | consistent style(9)-ish layout.
* Fix a bug in IPv4 address configuration exposed by refcounting.bms2007-03-291-13/+40
| | | | | | | | | | | | | | | | * Join the IPv4 all-hosts multicast group 224.0.0.1 once only; that is, when an IPv4 address is first configured on an interface. * Do not join it for subsequent IPv4 addresses as this violates IGMP. * Be sure to leave the group when all IPv4 addresses have been removed from the interface. * Add two DIAGNOSTIC printfs related to the issue. Further care and attention is needed in this area; it is suggested that netinet's attachment to the ifnet structure be compartmentalized and non-implicit. Bug found by: andre MFC after: 1 month
* Implement reference counting for ifmultiaddr, in_multi, and in6_multibms2007-03-201-74/+114
| | | | | | | | | | | | | | | | | | | structures. Detect when ifnet instances are detached from the network stack and perform appropriate cleanup to prevent memory leaks. This has been implemented in such a way as to be backwards ABI compatible. Kernel consumers are changed to use if_delmulti_ifma(); in_delmulti() is unable to detect interface removal by design, as it performs searches on structures which are removed with the interface. With this architectural change, the panics FreeBSD users have experienced with carp and pfsync should be resolved. Obtained from: p4 branch bms_netdev Reviewed by: andre Sponsored by: Garance A Drosehn Idea from: NetBSD MFC after: 1 month
* In regular forwarding path, reject packets destined for 169.254.0.0/16bms2007-02-031-1/+1
| | | | link-local addresses. See RFC 3927 section 2.7.
* Sweep kernel replacing suser(9) calls with priv(9) calls, assigningrwatson2006-11-061-7/+29
| | | | | | | | | | | | | specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>
* The IPv4 code should clean up multicast group state when an interfacebms2006-09-281-2/+31
| | | | | | | | | | | | goes away. Without this change, it leaks in_multi (and often ether_multi state) if many clonable interfaces are created and destroyed in quick succession. The concept of this fix is borrowed from KAME. Detailed information about this behaviour, as well as test cases, are available in the PR. PR: kern/78227 MFC after: 1 week
* In in_control() remove the temporary in_ifaddr structure from theandre2006-01-241-1/+2
| | | | | | | | | ia_hash only if it actually is an AF_INET address. All other places test for sa_family == AF_INET but this one. PR: kern/92091 Submitted by: Seth Kingsley <sethk-at-meowfishies.com> MFC after: 3 days
* First fill in structure with valid values, and only then attach itglebius2005-10-281-2/+2
| | | | | | to the global list. Reviewed by: rwatson
* In in_addprefix() compare not only route addresses, but their masks,glebius2005-10-221-8/+13
| | | | | | | | too. This fixes problem when connected prefixes overlap. Obtained from: OpenBSD (rev. 1.40 by claudio); [ I came to this fix myself, and then found out that OpenBSD had already fixed it the same way.]
* Unlock Giant symmetrically with respect to lock acquire order as that'srwatson2005-10-031-1/+1
| | | | | | | generally nicer. Spotted by: johan MFC after: 1 week
* Acquire Giant conditionally in in_addmulti() and in_delmulti() based onrwatson2005-10-031-0/+9
| | | | | | | | | | | whether the interface being accessed is IFF_NEEDSGIANT or not. This avoids lock order reversals when calling into the interface ioctl handler, which could potentially lead to deadlock. The long term solution is to eliminate non-MPSAFE network drivers. Discussed with: jhb MFC after: 1 week
* Take a first cut at cleaning up ifnet removal and multicast socketrwatson2005-09-181-8/+12
| | | | | | | | | | | | | | | | | | | | | | | panics, which occur when stale ifnet pointers are left in struct moptions hung off of inpcbs: - Add in_ifdetach(), which matches in6_ifdetach(), and allows the protocol to perform early tear-down on the interface early in if_detach(). - Annotate that if_detach() needs careful consideration. - Remove calls to in_pcbpurgeif0() in the handling of SIOCDIFADDR -- this is not the place to detect interface removal! This also removes what is basically a nasty (and now unnecessary) hack. - Invoke in_pcbpurgeif0() from in_ifdetach(), in both raw and UDP IPv4 sockets. It is now possible to run the msocket_ifnet_remove regression test using HEAD without panicking. MFC after: 3 days
* In order to support CARP interfaces kernel was taught to handle moreglebius2005-08-181-2/+12
| | | | | | | | | | | | | than one interface in one subnet. However, some userland apps rely on the believe that this configuration is impossible. Add a sysctl switch net.inet.ip.same_prefix_carp_only. If the switch is on, then kernel will refuse to add an additional interface to already connected subnet unless the interface is CARP. Default value is off. PR: bin/82306 In collaboration with: mlaier
* Introduce in_multi_mtx, which will protect IPv4-layer multicast addressrwatson2005-08-031-10/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | lists, as well as accessor macros. For now, this is a recursive mutex due code sequences where IPv4 multicast calls into IGMP calls into ip_output(), which then tests for a multicast forwarding case. For support macros in in_var.h to check multicast address lists, assert that in_multi_mtx is held. Acquire in_multi_mtx around iteration over the IPv4 multicast address lists, such as in ip_input() and ip_output(). Acquire in_multi_mtx when manipulating the IPv4 layer multicast addresses, as well as over the manipulation of ifnet multicast address lists in order to keep the two layers in sync. Lock down accesses to IPv4 multicast addresses in IGMP, or assert the lock when performing IGMP join/leave events. Eliminate spl's associated with IPv4 multicast addresses, portions of IGMP that weren't previously expunged by IGMP locking. Add in_multi_mtx, igmp_mtx, and if_addr_mtx lock order to hard-coded lock order in WITNESS, in that order. Problem reported by: Ed Maste <emaste at phaedrus dot sandvine dot ca> MFC after: 10 days
* Use IFF_LOCKGIANT/IFF_UNLOCKGIANT around calls to the interfaceiedowse2005-06-021-14/+26
| | | | | | if_ioctl routine. This should fix a number of code paths through soo_ioctl() that could call into Giant-locked network drivers without first acquiring Giant.
* ifma_protospec is a pointer. Use NULL when assigning or compating it.glebius2005-03-201-2/+2
|
* Remove a workaround from previos revision. It proved to be incorrect.glebius2005-03-201-7/+16
| | | | | | | | Add two another workarounds for carp(4) interfaces: - do not add connected route when address is assigned to carp(4) interface - do not add connected route when other interface goes down Embrace workarounds with #ifdef DEV_CARP
* Add antifootshooting workaround, which will make all routes "connected"glebius2005-03-101-0/+6
| | | | | to carp(4) interfaces host routes. This prevents a problem, when connected network is routed to carp(4) interface.
* /* -> /*- for license, minor formatting changesimp2005-01-071-1/+1
|
* Fix host route addition for more than one address to a loopback interfacemlaier2004-11-171-1/+1
| | | | | | | | after allowing more than one address with the same prefix. Reported by: Vladimir Grebenschikov <vova NO fbsd SPAM ru> Submitted by: ru (also NetBSD rev. 1.83) Pointyhat to: mlaier
* Merge copyright notices.mlaier2004-11-131-28/+1
| | | | Requested by: njl
* Change the way we automatically add prefix routes when adding a new address.mlaier2004-11-121-27/+147
| | | | | | | | | | | | | | | | This makes it possible to have more than one address with the same prefix. The first address added is used for the route. On deletion of an address with IFA_ROUTE set, we try to find a "fallback" address and hand over the route if possible. I plan to MFC this in 4 weeks, hence I keep the - now obsolete - argument to in_ifscrub as it must be considered KAPI as it is not static in in.c. I will clean this after the MFC. Discussed on: arch, net Tested by: many testers of the CARP patches Nits from: ru, Andrea Campi <andrea+freebsd_arch webcom it> Obtained from: WIDE via OpenBSD MFC after: 1 month
OpenPOWER on IntegriCloud