summaryrefslogtreecommitdiffstats
path: root/sys/netinet/ip_mroute.c
Commit message (Collapse)AuthorAgeFilesLines
* MFC r313821 r315277 r315286vangyzen2017-03-171-19/+20
| | | | | | | | | | | | | | Use inet_ntoa_r() instead of inet_ntoa() throughout the kernel. inet_ntoa() cannot be used safely in a multithreaded environment because it uses a static local buffer. Instead, use inet_ntoa_r() with a buffer on the caller's stack, except for KTR messages. KTR can correctly log the immediate integral values passed to it, as well as constant strings, but not non-constant strings, since they might change by the time ktrdump retrieves them. Therefore, use hex notation in KTR messages. Sponsored by: Dell EMC
* Get closer to a VIMAGE network stack teardown from top to bottom ratherbz2016-06-211-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | than removing the network interfaces first. This change is rather larger and convoluted as the ordering requirements cannot be separated. Move the pfil(9) framework to SI_SUB_PROTO_PFIL, move Firewalls and related modules to their own SI_SUB_PROTO_FIREWALL. Move initialization of "physical" interfaces to SI_SUB_DRIVERS, move virtual (cloned) interfaces to SI_SUB_PSEUDO. Move Multicast to SI_SUB_PROTO_MC. Re-work parts of multicast initialisation and teardown, not taking the huge amount of memory into account if used as a module yet. For interface teardown we try to do as many of them as we can on SI_SUB_INIT_IF, but for some this makes no sense, e.g., when tunnelling over a higher layer protocol such as IP. In that case the interface has to go along (or before) the higher layer protocol is shutdown. Kernel hhooks need to go last on teardown as they may be used at various higher layers and we cannot remove them before we cleaned up the higher layers. For interface teardown there are multiple paths: (a) a cloned interface is destroyed (inside a VIMAGE or in the base system), (b) any interface is moved from a virtual network stack to a different network stack ("vmove"), or (c) a virtual network stack is being shut down. All code paths go through if_detach_internal() where we, depending on the vmove flag or the vnet state, make a decision on how much to shut down; in case we are destroying a VNET the individual protocol layers will cleanup their own parts thus we cannot do so again for each interface as we end up with, e.g., double-frees, destroying locks twice or acquiring already destroyed locks. When calling into protocol cleanups we equally have to tell them whether they need to detach upper layer protocols ("ulp") or not (e.g., in6_ifdetach()). Provide or enahnce helper functions to do proper cleanup at a protocol rather than at an interface level. Approved by: re (hrs) Obtained from: projects/vnet Reviewed by: gnn, jhb Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D6747
* netinet: for pointers replace 0 with NULL.pfg2016-04-151-1/+1
| | | | | | | | These are mostly cosmetical, no functional change. Found with devel/coccinelle. Reviewed by: ae. tuexen
* Remove now-unused wrappers for various routing functions.melifaro2016-01-141-1/+1
|
* Remove sys/eventhandler.h from net/route.hmelifaro2016-01-091-0/+1
| | | | Reviewed by: ae
* CALLOUT_MPSAFE has lost its meaning since r141428, i.e., for more than tenjkim2015-05-221-3/+3
| | | | | | | | | | years for head. However, it is continuously misused as the mpsafe argument for callout_init(9). Deprecate the flag and clean up callout_init() calls to make them more consistent. Differential Revision: https://reviews.freebsd.org/D2613 Reviewed by: jhb MFC after: 2 weeks
* o Use new function ip_fillid() in all places throughout the kernel,glebius2015-04-011-1/+1
| | | | | | | | | | | | | | | | | where we want to create a new IP datagram. o Add support for RFC6864, which allows to set IP ID for atomic IP datagrams to any value, to improve performance. The behaviour is controlled by net.inet.ip.rfc6864 sysctl knob, which is enabled by default. o In case if we generate IP ID, use counter(9) to improve performance. o Gather all code related to IP ID into ip_id.c. Differential Revision: https://reviews.freebsd.org/D2177 Reviewed by: adrian, cy, rpaulo Tested by: Emeric POUPON <emeric.poupon stormshield.eu> Sponsored by: Netflix Sponsored by: Nginx, Inc. Relnotes: yes
* Remove SYSCTL_VNET_* macros, and simply put CTLFLAG_VNET where needed.glebius2014-11-071-1/+1
| | | | Sponsored by: Nginx, Inc.
* When deciding whether to call m_pullup() even though there is adequaterwatson2014-10-121-4/+3
| | | | | | | | | | | | | | | | data in an mbuf, use M_WRITABLE() instead of a direct test of M_EXT; the latter both unnecessarily exposes mbuf-allocator internals in the protocol stack and is also insufficient to catch all cases of non-writability. (NB: m_pullup() does not actually guarantee that a writable mbuf is returned, so further refinement of all of these code paths continues to be required.) Reviewed by: bz MFC after: 3 days Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D900
* Change pr_output's prototype to avoid the need for explicit casts.kevlo2014-08-151-1/+1
| | | | | | | This is a follow up to r269699. Phabric: D564 Reviewed by: jhb
* Merge 'struct ip6protosw' and 'struct protosw' into one. Now we havekevlo2014-08-081-16/+25
| | | | | | | only one protocol switch structure that is shared between ipv4 and ipv6. Phabric: D476 Reviewed by: jhb
* Fix fallout from r241923. Calculate length of payload inglebius2014-01-221-5/+3
| | | | | | | | | | pim_input() properly. While here, remove extra variable and incorrect condition before m_pullup(). Reported by: Olivier Cochard-Labbé <olivier cochard.me> Sponsored by: Nginx, Inc.
* The r48589 promised to remove implicit inclusion of if_var.h soon. Prepareglebius2013-10-261-0/+1
| | | | | | | | to this event, adding if_var.h to files that do need it. Also, include all includes that now are included due to implicit pollution via if_var.h Sponsored by: Netflix Sponsored by: Nginx, Inc.
* Use LIST_FOREACH_SAFE() instead of doing it by hand.jhb2013-09-051-7/+5
|
* Use an unsigned long when indexing into mfchashtbl[] and mf6ctable[]. Thisjhb2013-09-051-4/+4
| | | | | | | | | matches the types used when computing hash indices and the type of the maximum size of mfchashtbl[]. PR: kern/181821 Submitted by: Sven-Thorsten Dietrich <sven@vyatta.com> (IPv4) MFC after: 1 week
* Remove unused code and sort variables declarations.ae2013-09-051-8/+2
| | | | | PR: kern/181822 MFC after: 1 week
* Migrate structs arpstat, icmpstat, mrtstat, pimstat and udpstat to PCPUae2013-07-091-10/+11
| | | | counters.
* Fix incomplete printf.delphij2013-04-161-1/+2
| | | | | | PR: kern/177889 Submitted by: Sven-Thorsten Dietrich <sven vyatta com> MFC after: 1 week
* Don't leak lock when returning.delphij2013-04-161-0/+1
| | | | | | PR: kern/177888 Submitted by: Sven-Thorsten Dietrich <sven vyatta com> MFC after: 1 week
* Use m_get/m_gethdr instead of compat macros.glebius2013-03-151-4/+3
| | | | Sponsored by: Nginx, Inc.
* Mechanically substitute flags from historic mbuf allocator withglebius2012-12-051-6/+6
| | | | | | | | | malloc(9) flags within sys. Exceptions: - sys/contrib not touched - sys/mbuf.h edited manually
* o Remove last argument to ip_fragment(), and obtain all needed informationglebius2012-10-261-1/+2
| | | | | | | | | | | on checksums directly from mbuf flags. This simplifies code. o Clear CSUM_IP from the mbuf in ip_fragment() if we did checksums in hardware. Some driver may not announce CSUM_IP in theur if_hwassist, although try to do checksums if CSUM_IP set on mbuf. Example is em(4). o While here, consistently use CSUM_IP instead of its alias CSUM_DELAY_IP. After this change CSUM_DELAY_IP vanishes from the stack. Submitted by: Sebastian Kuzminsky <seb lineratesystems.com>
* Switch the entire IPv4 stack to keep the IP packet headerglebius2012-10-221-11/+7
| | | | | | | | | | | | | | | | | | | | | | | in network byte order. Any host byte order processing is done in local variables and host byte order values are never[1] written to a packet. After this change a packet processed by the stack isn't modified at all[2] except for TTL. After this change a network stack hacker doesn't need to scratch his head trying to figure out what is the byte order at the given place in the stack. [1] One exception still remains. The raw sockets convert host byte order before pass a packet to an application. Probably this would remain for ages for compatibility. [2] The ip_input() still subtructs header len from ip->ip_len, but this is planned to be fixed soon. Reviewed by: luigi, Maxim Dounin <mdounin mdounin.ru> Tested by: ray, Olivier Cochard-Labbe <olivier cochard.me>
* Revert previous commit...kevlo2012-10-101-1/+1
| | | | Pointyhat to: kevlo (myself)
* Prefer NULL over 0 for pointerskevlo2012-10-091-1/+1
|
* After r241245 it appeared that in_delayed_cksum(), which still expectsglebius2012-10-081-0/+3
| | | | | | | | | | | | | | host byte order, was sometimes called with net byte order. Since we are moving towards net byte order throughout the stack, the function was converted to expect net byte order, and its consumers fixed appropriately: - ip_output(), ipfilter(4) not changed, since already call in_delayed_cksum() with header in net byte order. - divert(4), ng_nat(4), ipfw_nat(4) now don't need to swap byte order there and back. - mrouting code and IPv6 ipsec now need to switch byte order there and back, but I hope, this is temporary solution. - In ipsec(4) shifted switch to net byte order prior to in_delayed_cksum(). - pf_route() catches up on r241245 changes to ip_output().
* Remove route caching from IP multicast routing code. There is noglebius2012-07-021-2/+1
| | | | | | | reason to do that, and also, cached route never got unreferenced, which meant a reference leak. Reviewed by: bms
* Change SYSINIT priorities so that ip_mroute_modevent() is executedzec2012-03-041-2/+2
| | | | | | | | | | before vnet_mroute_init(), since vnet_mroute_init() depends on mfchashsize tunable to be set, and that is done in in ip_mroute_modevent(). Apparently I broke that ordering with r208744 almost 2 years ago... PR: kern/162201 Submitted by: Stevan Markovic (mcafee.com) MFC after: 3 days
* Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.ed2011-11-071-3/+3
| | | | | | The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.
* After some off-list discussion, revert a number of changes to thedim2010-11-221-18/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | DPCPU_DEFINE and VNET_DEFINE macros, as these cause problems for various people working on the affected files. A better long-term solution is still being considered. This reversal may give some modules empty set_pcpu or set_vnet sections, but these are harmless. Changes reverted: ------------------------------------------------------------------------ r215318 | dim | 2010-11-14 21:40:55 +0100 (Sun, 14 Nov 2010) | 4 lines Instead of unconditionally emitting .globl's for the __start_set_xxx and __stop_set_xxx symbols, only emit them when the set_vnet or set_pcpu sections are actually defined. ------------------------------------------------------------------------ r215317 | dim | 2010-11-14 21:38:11 +0100 (Sun, 14 Nov 2010) | 3 lines Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout the tree. ------------------------------------------------------------------------ r215316 | dim | 2010-11-14 21:23:02 +0100 (Sun, 14 Nov 2010) | 2 lines Add macros to define static instances of VNET_DEFINE and DPCPU_DEFINE.
* Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughoutdim2010-11-141-18/+18
| | | | the tree.
* Virtualize the IPv4 multicast routing code.zec2010-06-021-202/+258
| | | | | | Submitted by: iprebeg Reviewed by: bms, bz, Pavlin Radoslavov MFC after: 30 days
* No need to include security/mac/mac_framework.h here.pjd2010-02-181-2/+0
|
* Make sure the multicast forwarding cache entry's stall queue is properlysyrinx2009-12-301-0/+9
| | | | | | | | initialized before trying to insert an entry into it. PR: kern/142052 Reviewed by: bms MFC after: now
* In expire_mfc(), add an assert on the multicast forwarding cache mutex.bms2009-09-131-0/+2
| | | | PR: 138666
* Merge the remainder of kern_vimage.c and vimage.h into vnet.c andrwatson2009-08-011-1/+1
| | | | | | | | | | vnet.h, we now use jails (rather than vimages) as the abstraction for virtualization management, and what remained was specific to virtual network stacks. Minor cleanups are done in the process, and comments updated to reflect these changes. Reviewed by: bz Approved by: re (vimage blanket)
* Build on Jeff Roberson's linker-set based dynamic per-CPU allocatorrwatson2009-07-141-12/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables. Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing the in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of a the kernel linker. Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided. This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS. Bump __FreeBSD_version and update UPDATING. Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)
* Modify most routines returning 'struct ifaddr *' to return referencesrwatson2009-06-231-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | rather than pointers, requiring callers to properly dispose of those references. The following routines now return references: ifaddr_byindex ifa_ifwithaddr ifa_ifwithbroadaddr ifa_ifwithdstaddr ifa_ifwithnet ifaof_ifpforaddr ifa_ifwithroute ifa_ifwithroute_fib rt_getifa rt_getifa_fib IFP_TO_IA ip_rtaddr in6_ifawithifp in6ifa_ifpforlinklocal in6ifa_ifpwithaddr in6_ifadd carp_iamatch6 ip6_getdstifaddr Remove unused macro which didn't have required referencing: IFP_TO_IA6 This closes many small races in which changes to interface or address lists while an ifaddr was in use could lead to use of freed memory (etc). In a few cases, add missing if_addr_list locking required to safely acquire references. Because of a lack of deep copying support, we accept a race in which an in6_ifaddr pointed to by mbuf tags and extracted with ip6_getdstifaddr() doesn't hold a reference while in transmit. Once we have mbuf tag deep copy support, this can be fixed. Reviewed by: bz Obtained from: Apple, Inc. (portions) MFC after: 6 weeks (portions)
* Switch cmd argument to u_long. This matches what if_ethersubr.c does andrdivacky2009-06-211-2/+2
| | | | | | | allows the code to compile cleanly on amd64 with clang. Reviewed by: rwatson Approved by: ed (mentor)
* Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERICrwatson2009-06-051-1/+0
| | | | | | | | and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include. Discussed with: pjd
* Use KTR_INET for MROUTING CTRs.bms2009-04-291-1/+1
|
* In preparation for turning on options VIMAGE in next commits,zec2009-04-261-0/+1
| | | | | | | | rearrange / replace / adjust several INIT_VNET_* initializer macros, all of which currently resolve to whitespace. Reviewed by: bz (an older version of the patch) Approved by: julian (mentor)
* Update stats in struct pimstat using two new macros: PIMSTAT_ADD()rwatson2009-04-121-16/+16
| | | | | | | | | and PIMSTAT_INC(), rather than directly manipulating the fields of the structure. This will make it easier to change the implementation of these statistics, such as using per-CPU versions of the data structure. MFC after: 3 days
* Update stats in struct mrtstat using two new macros: MRTSTAT_ADD()rwatson2009-04-121-15/+14
| | | | | | | | | and MRTSTAT_INC(), rather than directly manipulating the fields of the structure. This will make it easier to change the implementation of these statistics, such as using per-CPU versions of the data structure. MFC after: 3 days
* Fix brainos introduced during mechanical KTR change.bms2009-03-201-5/+6
| | | | Pointy hat to: bms
* Cleanup: Nuke debug.mrtdebug, and replace it with KTR.bms2009-03-191-142/+68
|
* Introduce a number of changes to the MROUTING code.bms2009-03-191-612/+411
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is purely a forwarding plane cleanup; no control plane code is involved. Summary: * Split IPv4 and IPv6 MROUTING support. The static compile-time kernel option remains the same, however, the modules may now be built for IPv4 and IPv6 separately as ip_mroute_mod and ip6_mroute_mod. * Clean up the IPv4 multicast forwarding code to use BSD queue and hash table constructs. Don't build our own timer abstractions when ratecheck() and timevalclear() etc will do. * Expose the multicast forwarding cache (MFC) and virtual interface table (VIF) as sysctls, to reduce netstat's dependence on libkvm for this information for running kernels. * bandwidth meters however still require libkvm. * Make the MFC hash table size a boot/load-time tunable ULONG, net.inet.ip.mfchashsize (defaults to 256). * Remove unused members from struct vif and struct mfc. * Kill RSVP support, as no current RSVP implementation uses it. These stubs could be moved to raw_ip.c. * Don't share locks or initialization between IPv4 and IPv6. * Don't use a static struct route_in6 in ip6_mroute.c. The v6 code is still using a cached struct route_in6, this is moved to mif6 for the time being. * More cleanup remains to be merged from ip_mroute.c to ip6_mroute.c. v4 path tested using ports/net/mcast-tools. v6 changes are mostly mechanical locking and *have not* been tested. As these changes partially break some kernel ABIs, they will not be MFCed. There is a lot more work to be done here. Reviewed by: Pavlin Radoslavov
* Rather than using hidden includes (with cicular dependencies),bz2008-12-021-0/+2
| | | | | | | | | | | directly include only the header files needed. This reduces the unneeded spamming of various headers into lots of files. For now, this leaves us with very few modules including vnet.h and thus needing to depend on opt_route.h. Reviewed by: brooks, gnn, des, zec, imp Sponsored by: The FreeBSD Foundation
* Step 1.5 of importing the network stack virtualization infrastructurezec2008-10-021-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | from the vimage project, as per plan established at devsummit 08/08: http://wiki.freebsd.org/Image/Notes200808DevSummit Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator macros, and CURVNET_SET() context setting macros, all currently resolving to NOPs. Prepare for virtualization of selected SYSCTL objects by introducing a family of SYSCTL_V_*() macros, currently resolving to their global counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT(). Move selected #defines from sys/sys/vimage.h to newly introduced header files specific to virtualized subsystems (sys/net/vnet.h, sys/netinet/vinet.h etc.). All the changes are verified to have zero functional impact at this point in time by doing MD5 comparision between pre- and post-change object files(*). (*) netipsec/keysock.c did not validate depending on compile time options. Implemented by: julian, bz, brooks, zec Reviewed by: julian, bz, brooks, kris, rwatson, ... Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
* A bunch of formatting fixes brough to light by, or created by the Vimage commitjulian2008-08-201-1/+1
| | | | a few days ago.
OpenPOWER on IntegriCloud