FreeBSD-src - Raptor Engineering's fork of pfsense FreeBSD src with pfSense changes

	Commit message (Collapse)	Author	Age	Files	Lines
*	Add inpcb pointer to struct ipsec_ctx_data and pass it to the pfil hook	ae	2017-08-01	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	from enc_hhook(). This should solve the problem when pf is used with if_enc(4) interface, and outbound packet with existing PCB checked by pf, and this leads to deadlock due to pf does its own PCB lookup and tries to take rlock when wlock is already held. Now we pass PCB pointer if it is known to the pfil hook, this helps to avoid extra PCB lookup and thus rlock acquiring is not needed. For inbound packets it is safe to pass NULL, because we do not held any PCB locks yet. PR: 220217 MFC after: 3 weeks Sponsored by: Yandex LLC (cherry picked from commit 4c7d86cbec231dc1fe6814f0e5e2db63394a2bcb)
*	MFC r319118:	ae	2017-06-05	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Disable IPsec debugging code by default when IPSEC_DEBUG kernel option is not specified. Due to the long call chain IPsec code can produce the kernel stack exhaustion on the i386 architecture. The debugging code usually is not used, but it requires a lot of stack space to keep buffers for strings formatting. This patch conditionally defines macros to disable building of IPsec debugging code. IPsec currently has two sysctl variables to configure debug output: * net.key.debug variable is used to enable debug output for PF_KEY protocol. Such debug messages are produced by KEYDBG() macro and usually they can be interesting for developers. * net.inet.ipsec.debug variable is used to enable debug output for DPRINTF() macro and ipseclog() function. DPRINTF() macro usually is used for development debugging. ipseclog() function is used for debugging by administrator. The patch disables KEYDBG() and DPRINTF() macros, and formatting buffers declarations when IPSEC_DEBUG is not present in kernel config. This reduces stack requirement for up to several hundreds of bytes. The net.inet.ipsec.debug variable still can be used to enable ipseclog() messages by administrator. PR: 219476 MFC r319412: Build kdebug_secreplay() function only when IPSEC_DEBUG is defined. This should fix the build on sparc. Approved by: re (kib)
*	MFC r318734:	ae	2017-06-02	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix possible double releasing for SA reference. There are two possible ways how crypto callback are called: directly from caller and deffered from crypto thread. For inbound packets the direct call chain is the following: IPSEC_INPUT() method -> ipsec_common_input() -> xform_input() -> -> crypto_dispatch() -> crypto_invoke() -> crypto_done() -> -> xform_input_cb() -> ipsec[46]_common_input_cb() -> netisr_queue(). The SA reference is held while crypto processing is not finished. The error handling code wrongly expected that crypto callback always called from the crypto thread context, and it did SA reference releasing in xform_input_cb(). But when the crypto callback called directly, in case of error (e.g. data authentification failed) the error handling in ipsec_common_input() also did SA reference releasing. To fix this, remove error handling from ipsec_common_input() and do it in xform_input() before crypto_dispatch(). PR: 219356 MFC r318738: Fix possible double releasing for SA and SP references. There are two possible ways how crypto callback are called: directly from caller and deffered from crypto thread. For outbound packets the direct call chain is the following: IPSEC_OUTPUT() method -> ipsec[46]_common_output() -> -> ipsec[46]_perform_request() -> xform_output() -> -> crypto_dispatch() -> crypto_invoke() -> crypto_done() -> -> xform_output_cb() -> ipsec_process_done() -> ip[6]_output(). The SA and SP references are held while crypto processing is not finished. The error handling code wrongly expected that crypto callback always called from the crypto thread context, and it did references releasing in xform_output_cb(). But when the crypto callback called directly, in case of error the error handling code in ipsec[46]_perform_request() also did references releasing. To fix this, remove error handling from ipsec[46]_perform_request() and do it in xform_output() before crypto_dispatch(). Approved by: re (kib)
*	MFC r304572 (by bz):	ae	2017-03-18	1	-395/+151
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove the kernel optoion for IPSEC_FILTERTUNNEL, which was deprecated more than 7 years ago in favour of a sysctl in r192648. MFC r305122: Remove redundant sanity checks from ipsec[46]_common_input_cb(). This check already has been done in the each protocol callback. MFC r309144,309174,309201 (by fabient): IPsec RFC6479 support for replay window sizes up to 2^32 - 32 packets. Since the previous algorithm, based on bit shifting, does not scale with large replay windows, the algorithm used here is based on RFC 6479: IPsec Anti-Replay Algorithm without Bit Shifting. The replay window will be fast to be updated, but will cost as many bits in RAM as its size. The previous implementation did not provide a lock on the replay window, which may lead to replay issues. Obtained from: emeric.poupon@stormshield.eu Sponsored by: Stormshield Differential Revision: https://reviews.freebsd.org/D8468 MFC r309143,309146 (by fabient): In a dual processor system (26 cores) during IPSec throughput tests, we see a lot of contention on the arc4 lock, used to generate the IV of the ESP output packets. The idea of this patch is to split this mutex in order to reduce the contention on this lock. Update r309143 to prevent false sharing. Reviewed by: delphij, markm, ache Approved by: so Obtained from: emeric.poupon@stormshield.eu Sponsored by: Stormshield Differential Revision: https://reviews.freebsd.org/D8130 MFC r313330: Merge projects/ipsec into head/. Small summary ------------- o Almost all IPsec releated code was moved into sys/netipsec. o New kernel modules added: ipsec.ko and tcpmd5.ko. New kernel option IPSEC_SUPPORT added. It enables support for loading and unloading of ipsec.ko and tcpmd5.ko kernel modules. o IPSEC_NAT_T option was removed. Now NAT-T support is enabled by default. The UDP_ENCAP_ESPINUDP_NON_IKE encapsulation type support was removed. Added TCP/UDP checksum handling for inbound packets that were decapsulated by transport mode SAs. setkey(8) modified to show run-time NAT-T configuration of SA. o New network pseudo interface if_ipsec(4) added. For now it is build as part of ipsec.ko module (or with IPSEC kernel). It implements IPsec virtual tunnels to create route-based VPNs. o The network stack now invokes IPsec functions using special methods. The only one header file <netipsec/ipsec_support.h> should be included to declare all the needed things to work with IPsec. o All IPsec protocols handlers (ESP/AH/IPCOMP protosw) were removed. Now these protocols are handled directly via IPsec methods. o TCP_SIGNATURE support was reworked to be more close to RFC. o PF_KEY SADB was reworked: - now all security associations stored in the single SPI namespace, and all SAs MUST have unique SPI. - several hash tables added to speed up lookups in SADB. - SADB now uses rmlock to protect access, and concurrent threads can do SA lookups in the same time. - many PF_KEY message handlers were reworked to reflect changes in SADB. - SADB_UPDATE message was extended to support new PF_KEY headers: SADB_X_EXT_NEW_ADDRESS_SRC and SADB_X_EXT_NEW_ADDRESS_DST. They can be used by IKE daemon to change SA addresses. o ipsecrequest and secpolicy structures were cardinally changed to avoid locking protection for ipsecrequest. Now we support only limited number (4) of bundled SAs, but they are supported for both INET and INET6. o INPCB security policy cache was introduced. Each PCB now caches used security policies to avoid SP lookup for each packet. o For inbound security policies added the mode, when the kernel does check for full history of applied IPsec transforms. o References counting rules for security policies and security associations were changed. The proper SA locking added into xform code. o xform code was also changed. Now it is possible to unregister xforms. tdb_xxx structures were changed and renamed to reflect changes in SADB/SPDB, and changed rules for locking and refcounting. Obtained from: Yandex LLC Relnotes: yes Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D9352 MFC r313331: Add removed headers into the ObsoleteFiles.inc. MFC r313561 (by glebius): Move tcp_fields_to_net() static inline into tcp_var.h, just below its friend tcp_fields_to_host(). There is third party code that also uses this inline. MFC r313697: Remove IPsec related PCB code from SCTP. The inpcb structure has inp_sp pointer that is initialized by ipsec_init_pcbpolicy() function. This pointer keeps strorage for IPsec security policies associated with a specific socket. An application can use IP_IPSEC_POLICY and IPV6_IPSEC_POLICY socket options to configure these security policies. Then ip[6]_output() uses inpcb pointer to specify that an outgoing packet is associated with some socket. And IPSEC_OUTPUT() method can use a security policy stored in the inp_sp. For inbound packet the protocol-specific input routine uses IPSEC_CHECK_POLICY() method to check that a packet conforms to inbound security policy configured in the inpcb. SCTP protocol doesn't specify inpcb for ip[6]_output() when it sends packets. Thus IPSEC_OUTPUT() method does not consider such packets as associated with some socket and can not apply security policies from inpcb, even if they are configured. Since IPSEC_CHECK_POLICY() method is called from protocol-specific input routine, it can specify inpcb pointer and associated with socket inbound policy will be checked. But there are two problems: 1. Such check is asymmetric, becasue we can not apply security policy from inpcb for outgoing packet. 2. IPSEC_CHECK_POLICY() expects that caller holds INPCB lock and access to inp_sp is protected. But for SCTP this is not correct, becasue SCTP uses own locks to protect inpcb. To fix these problems remove IPsec related PCB code from SCTP. This imply that IP_IPSEC_POLICY and IPV6_IPSEC_POLICY socket options will be not applicable to SCTP sockets. To be able correctly check inbound security policies for SCTP, mark its protocol header with the PR_LASTHDR flag. Differential Revision: https://reviews.freebsd.org/D9538 MFC r313746: Add missing check to fix the build with IPSEC_SUPPORT and without MAC. MFC r313805: Fix LINT build for powerpc. Build kernel modules support only when both IPSEC and TCP_SIGNATURE are not defined. MFC r313922: For translated packets do not adjust UDP checksum if it is zero. In case when decrypted and decapsulated packet is an UDP datagram, check that its checksum is not zero before doing incremental checksum adjustment. MFC r314339: Document that the size of AH ICV for HMAC-SHA2-NNN should be half of NNN bits as described in RFC4868. PR: 215978 MFC r314812: Introduce the concept of IPsec security policies scope. Currently are defined three scopes: global, ifnet, and pcb. Generic security policies that IKE daemon can add via PF_KEY interface or an administrator creates with setkey(8) utility have GLOBAL scope. Such policies can be applied by the kernel to outgoing packets and checked agains inbound packets after IPsec processing. Security policies created by if_ipsec(4) interfaces have IFNET scope. Such policies are applied to packets that are passed through if_ipsec(4) interface. And security policies created by application using setsockopt() IP_IPSEC_POLICY option have PCB scope. Such policies are applied to packets related to specific socket. Currently there is no way to list PCB policies via setkey(8) utility. Modify setkey(8) and libipsec(3) to be able distinguish the scope of security policies in the `setkey -DP` listing. Add two optional flags: '-t' to list only policies related to virtual tunneling* interfaces, i.e. policies with IFNET scope, and '-g' to list only policies with GLOBAL scope. By default policies from all scopes are listed. To implement this PF_KEY's sadb_x_policy structure was modified. sadb_x_policy_reserved field is used to pass the policy scope from the kernel to userland. SADB_SPDDUMP message extended to support filtering by scope: sadb_msg_satype field is used to specify bit mask of requested scopes. For IFNET policies the sadb_x_policy_priority field of struct sadb_x_policy is used to pass if_ipsec's interface if_index to the userland. For GLOBAL policies sadb_x_policy_priority is used only to manage order of security policies in the SPDB. For IFNET policies it is not used, so it can be used to keep if_index. After this change the output of `setkey -DP` now looks like: # setkey -DPt 0.0.0.0/0[any] 0.0.0.0/0[any] any in ipsec esp/tunnel/87.250.242.144-87.250.242.145/unique:145 spid=7 seq=3 pid=58025 scope=ifnet ifname=ipsec0 refcnt=1 # setkey -DPg ::/0 ::/0 icmp6 135,0 out none spid=5 seq=1 pid=872 scope=global refcnt=1 Obtained from: Yandex LLC Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D9805 PR: 212018 Relnotes: yes Sponsored by: Yandex LLC
*	Overhaul if_enc(4) and make it loadable in run-time.	ae	2015-11-25	1	-52/+22
\| \| \| \| \| \| \| \|	Use hhook(9) framework to achieve ability of loading and unloading if_enc(4) kernel module. INET and INET6 code on initialization registers two helper hooks points in the kernel. if_enc(4) module uses these helper hook points and registers its hooks. IPSEC code uses these hhook points to call helper hooks implemented in if_enc(4).
*	IPSEC, remove variable argument function its already due.	eri	2015-07-21	1	-22/+7
\| \| \| \| \| \|	Differential Revision: https://reviews.freebsd.org/D3080 Reviewed by: gnn, ae Approved by: gnn(mentor)
*	Since PFIL can change mbuf pointer, we should update pointers after	ae	2015-04-28	1	-0/+1
\| \| \| \| \| \|	calling ipsec_filter(). Sponsored by: Yandex LLC
*	Change ipsec_address() and ipsec_logsastr() functions to take two	ae	2015-04-18	1	-8/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	additional arguments - buffer and size of this buffer. ipsec_address() is used to convert sockaddr structure to presentation format. The IPv6 part of this function returns pointer to the on-stack buffer and at the moment when it will be used by caller, it becames invalid. IPv4 version uses 4 static buffers and returns pointer to new buffer each time when it called. But anyway it is still possible to get corrupted data when several threads will use this function. ipsec_logsastr() is used to format string about SA entry. It also uses static buffer and has the same problem with concurrent threads. To fix these problems add the buffer pointer and size of this buffer to arguments. Now each caller will pass buffer and its size to these functions. Also convert all places where these functions are used (except disabled code). And now ipsec_address() uses inet_ntop() function from libkern. PR: 185996 Differential Revision: https://reviews.freebsd.org/D2321 Reviewed by: gnn Sponsored by: Yandex LLC
*	Requeue mbuf via netisr when we use IPSec tunnel mode and IPv6.	ae	2015-04-18	1	-1/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ipsec6_common_input_cb() uses partial copy of ip6_input() to parse headers. But this isn't correct, when we use tunnel mode IPSec. When we stripped outer IPv6 header from the decrypted packet, it can become IPv4 packet and should be handled by ip_input. Also when we use tunnel mode IPSec with IPv6 traffic, we should pass decrypted packet with inner IPv6 header to ip6_input, it will correctly handle it and also can decide to forward it. The "skip" variable points to offset where payload starts. In tunnel mode we reset it to zero after stripping the outer header. So, when it is zero, we should requeue mbuf via netisr. Differential Revision: https://reviews.freebsd.org/D2306 Reviewed by: adrian, gnn Sponsored by: Yandex LLC
*	Fix handling of scoped IPv6 addresses in IPSec code.	ae	2015-04-18	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* in ipsec_encap() embed scope zone ids into link-local addresses in the new IPv6 header, this helps ip6_output() disambiguate the scope; * teach key_ismyaddr6() use in6_localip(). in6_localip() is less strict than key_sockaddrcmp(). It doesn't compare all fileds of struct sockaddr_in6, but it is faster and it should be safe, because all SA's data was checked for correctness. Also, since IPv6 link-local addresses in the &V_in6_ifaddrhead are stored in kernel-internal form, we need to embed scope zone id from SA into the address before calling in6_localip. * in ipsec_common_input() take scope zone id embedded in the address and use it to initialize sin6_scope_id, then use this sockaddr structure to lookup SA, because we keep addresses in the SADB without embedded scope zone id. Differential Revision: https://reviews.freebsd.org/D2304 Reviewed by: gnn Sponsored by: Yandex LLC
*	Remove now unused mtag argument from ipsec*_common_input_cb.	ae	2014-12-11	1	-23/+8
\| \| \| \| \|	Obtained from: Yandex LLC Sponsored by: Yandex LLC
*	Remove route chaching support from ipsec code. It isn't used for some time.	ae	2014-12-02	1	-1/+0
\| \| \| \| \| \| \| \| \|	* remove sa_route_union declaration and route_cache member from struct secashead; * remove key_sa_routechange() call from ICMP and ICMPv6 code; * simplify ip_ipsec_mtu(); * remove #include <net/route.h>; Sponsored by: Yandex LLC
*	Strip IP header only when we act in tunnel mode.	ae	2014-11-13	1	-29/+30
\| \| \| \| \|	MFC after: 1 week Sponsored by: Yandex LLC
*	Pass mbuf to pfil processing before stripping outer IP header as it	ae	2014-11-07	1	-17/+6
\| \| \| \| \| \| \|	is described in if_enc(4). MFC after: 2 week Sponsored by: Yandex LLC
*	When mode isn't explicitly specified (wildcard) and inner protocol isn't	ae	2014-11-06	1	-1/+10
\| \| \| \| \| \| \| \|	IPv4 or IPv6, assume it is the transport mode. Reported by: jmg MFC after: 1 week Sponsored by: Yandex LLC
*	Do not strip outer header when operating in transport mode.	ae	2014-10-02	1	-2/+10
\| \| \| \| \| \| \| \| \|	Instead requeue mbuf back to IPv4 protocol handler. If there is one extra IP-IP encapsulation, it will be handled with tunneling interface. And thus proper interface will be exposed into mbuf's rcvif. Also, tcpdump that listens on tunneling interface will see packets in both directions. Sponsored by: Yandex LLC
*	Mechanically convert to if_inc_counter().	glebius	2014-09-19	1	-4/+4
\|
*	Merge 'struct ip6protosw' and 'struct protosw' into one. Now we have	kevlo	2014-08-08	1	-6/+31
\| \| \| \| \| \| \|	only one protocol switch structure that is shared between ipv4 and ipv6. Phabric: D476 Reviewed by: jhb
*	Fixed IPv4-in-IPv6 and IPv6-in-IPv4 IPsec tunnels.	vanhu	2014-05-28	1	-47/+89
\| \| \| \| \| \| \| \| \| \| \| \| \|	For IPv6-in-IPv4, you may need to do the following command on the tunnel interface if it is configured as IPv4 only: ifconfig <interface> inet6 -ifdisabled Code logic inspired from NetBSD. PR: kern/169438 Submitted by: emeric.poupon@netasq.com Reviewed by: fabient, ae Obtained from: NETASQ
*	Initialize prot variable.	ae	2013-11-11	1	-0/+1
\| \| \| \| \|	PR: 177417 MFC after: 1 week
*	The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare	glebius	2013-10-26	1	-0/+1
\| \| \| \| \| \| \| \|	to this event, adding if_var.h to files that do need it. Also, include all includes that now are included due to implicit pollution via if_var.h Sponsored by: Netflix Sponsored by: Nginx, Inc.
*	Use corresponding macros to update statistics for AH, ESP, IPIP, IPCOMP,	ae	2013-06-20	1	-54/+30
\| \| \| \| \| \|	PFKEY. MFC after: 2 weeks
*	Use IP6STAT_INC/IP6STAT_DEC macros to update ip6 stats.	ae	2013-04-09	1	-2/+2
\| \| \| \|	MFC after: 1 week
*	- Fix one more miss from r241913.	glebius	2012-10-23	1	-2/+4
\| \| \| \| \|	- Add XXX comment about necessity of the entire block, that "fixes up" the IP header.
*	Merge the projects/pf/head branch, that was worked on for last six months,	glebius	2012-09-08	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	into head. The most significant achievements in the new code: o Fine grained locking, thus much better performance. o Fixes to many problems in pf, that were specific to FreeBSD port. New code doesn't have that many ifdefs and much less OpenBSDisms, thus is more attractive to our developers. Those interested in details, can browse through SVN log of the projects/pf/head branch. And for reference, here is exact list of revisions merged: r232043, r232044, r232062, r232148, r232149, r232150, r232298, r232330, r232332, r232340, r232386, r232390, r232391, r232605, r232655, r232656, r232661, r232662, r232663, r232664, r232673, r232691, r233309, r233782, r233829, r233830, r233834, r233835, r233836, r233865, r233866, r233868, r233873, r234056, r234096, r234100, r234108, r234175, r234187, r234223, r234271, r234272, r234282, r234307, r234309, r234382, r234384, r234456, r234486, r234606, r234640, r234641, r234642, r234644, r234651, r235505, r235506, r235535, r235605, r235606, r235826, r235991, r235993, r236168, r236173, r236179, r236180, r236181, r236186, r236223, r236227, r236230, r236252, r236254, r236298, r236299, r236300, r236301, r236397, r236398, r236399, r236499, r236512, r236513, r236525, r236526, r236545, r236548, r236553, r236554, r236556, r236557, r236561, r236570, r236630, r236672, r236673, r236679, r236706, r236710, r236718, r237154, r237155, r237169, r237314, r237363, r237364, r237368, r237369, r237376, r237440, r237442, r237751, r237783, r237784, r237785, r237788, r237791, r238421, r238522, r238523, r238524, r238525, r239173, r239186, r239644, r239652, r239661, r239773, r240125, r240130, r240131, r240136, r240186, r240196, r240212. I'd like to thank people who participated in early testing: Tested by: Florian Smeets <flo freebsd.org> Tested by: Chekaluk Vitaly <artemrts ukr.net> Tested by: Ben Wilber <ben desync.com> Tested by: Ian FREISLICH <ianf cloudseed.co.za>
*	Update packet filter (pf) code to OpenBSD 4.5.	bz	2011-06-28	1	-0/+2
\| \| \| \| \| \| \| \|	You need to update userland (world and ports) tools to be in sync with the kernel. Submitted by: mlaier Submitted by: eri
*	Make IPsec compile without INET adding appropriate #ifdef checks.	bz	2011-04-27	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Unfold the IPSEC_COMMON_INPUT_CB() macro in xform_{ah,esp,ipcomp}.c to not need three different versions depending on INET, INET6 or both. Mark two places preparing for not yet supported functionality with IPv6. Reviewed by: gnn Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems MFC after: 4 days
*	Fix typo in comment.	thomas	2010-10-25	1	-1/+1
\|
*	MFp4 @178283:	bz	2010-05-24	1	-1/+1
\| \| \| \| \| \| \| \| \|	Improve IPsec flow distribution for better netisr parallelism. Instead of using the pointer that would have the last bits masked in a % statement in netisr_select_cpuid() to select the queue, use the SPI. Reviewed by: rwatson MFC after: 4 weeks
*	Merge the remainder of kern_vimage.c and vimage.h into vnet.c and	rwatson	2009-08-01	1	-1/+0
\| \| \| \| \| \| \| \| \| \|	vnet.h, we now use jails (rather than vimages) as the abstraction for virtualization management, and what remained was specific to virtual network stacks. Minor cleanups are done in the process, and comments updated to reflect these changes. Reviewed by: bz Approved by: re (vimage blanket)
*	Build on Jeff Roberson's linker-set based dynamic per-CPU allocator	rwatson	2009-07-14	1	-6/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables. Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing the in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of a the kernel linker. Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided. This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS. Bump __FreeBSD_version and update UPDATING. Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)
*	Added support for NAT-Traversal (RFC 3948) in IPsec stack.	vanhu	2009-06-12	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Thanks to (no special order) Emmanuel Dreyfus (manu@netbsd.org), Larry Baird (lab@gta.com), gnn, bz, and other FreeBSD devs, Julien Vanherzeele (julien.vanherzeele@netasq.com, for years of bug reporting), the PFSense team, and all people who used / tried the NAT-T patch for years and reported bugs, patches, etc... X-MFC: never Reviewed by: bz Approved by: gnn(mentor) Obtained from: NETASQ
*	Properly hide IPv4 only variables and functions under #ifdef INET.	bz	2009-06-10	1	-0/+2
\|
*	Reimplement the netisr framework in order to support parallel netisr	rwatson	2009-06-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	threads: - Support up to one netisr thread per CPU, each processings its own workstream, or set of per-protocol queues. Threads may be bound to specific CPUs, or allowed to migrate, based on a global policy. In the future it would be desirable to support topology-centric policies, such as "one netisr per package". - Allow each protocol to advertise an ordering policy, which can currently be one of: NETISR_POLICY_SOURCE: packets must maintain ordering with respect to an implicit or explicit source (such as an interface or socket). NETISR_POLICY_FLOW: make use of mbuf flow identifiers to place work, as well as allowing protocols to provide a flow generation function for mbufs without flow identifers (m2flow). Falls back on NETISR_POLICY_SOURCE if now flow ID is available. NETISR_POLICY_CPU: allow protocols to inspect and assign a CPU for each packet handled by netisr (m2cpuid). - Provide utility functions for querying the number of workstreams being used, as well as a mapping function from workstream to CPU ID, which protocols may use in work placement decisions. - Add explicit interfaces to get and set per-protocol queue limits, and get and clear drop counters, which query data or apply changes across all workstreams. - Add a more extensible netisr registration interface, in which protocols declare 'struct netisr_handler' structures for each registered NETISR_ type. These include name, handler function, optional mbuf to flow ID function, optional mbuf to CPU ID function, queue limit, and ordering policy. Padding is present to allow these to be expanded in the future. If no queue limit is declared, then a default is used. - Queue limits are now per-workstream, and raised from the previous IFQ_MAXLEN default of 50 to 256. - All protocols are updated to use the new registration interface, and with the exception of netnatm, default queue limits. Most protocols register as NETISR_POLICY_SOURCE, except IPv4 and IPv6, which use NETISR_POLICY_FLOW, and will therefore take advantage of driver- generated flow IDs if present. - Formalize a non-packet based interface between interface polling and the netisr, rather than having polling pretend to be two protocols. Provide two explicit hooks in the netisr worker for start and end events for runs: netisr_poll() and netisr_pollmore(), as well as a function, netisr_sched_poll(), to allow the polling code to schedule netisr execution. DEVICE_POLLING still embeds single-netisr assumptions in its implementation, so for now if it is compiled into the kernel, a single and un-bound netisr thread is enforced regardless of tunable configuration. In the default configuration, the new netisr implementation maintains the same basic assumptions as the previous implementation: a single, un-bound worker thread processes all deferred work, and direct dispatch is enabled by default wherever possible. Performance measurement shows a marginal performance improvement over the old implementation due to the use of batched dequeue. An rmlock is used to synchronize use and registration/unregistration using the framework; currently, synchronized use is disabled (replicating current netisr policy) due to a measurable 3%-6% hit in ping-pong micro-benchmarking. It will be enabled once further rmlock optimization has taken place. However, in practice, netisrs are rarely registered or unregistered at runtime. A new man page for netisr will follow, but since one doesn't currently exist, it hasn't been updated. This change is not appropriate for MFC, although the polling shutdown handler should be merged to 7-STABLE. Bump __FreeBSD_version. Reviewed by: bz
*	Rather than using hidden includes (with cicular dependencies),	bz	2008-12-02	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	directly include only the header files needed. This reduces the unneeded spamming of various headers into lots of files. For now, this leaves us with very few modules including vnet.h and thus needing to depend on opt_route.h. Reviewed by: brooks, gnn, des, zec, imp Sponsored by: The FreeBSD Foundation
*	Step 1.5 of importing the network stack virtualization infrastructure	zec	2008-10-02	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	from the vimage project, as per plan established at devsummit 08/08: http://wiki.freebsd.org/Image/Notes200808DevSummit Introduce INIT_VNET_() initializer macros, VNET_FOREACH() iterator macros, and CURVNET_SET() context setting macros, all currently resolving to NOPs. Prepare for virtualization of selected SYSCTL objects by introducing a family of SYSCTL_V_() macros, currently resolving to their global counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT(). Move selected #defines from sys/sys/vimage.h to newly introduced header files specific to virtualized subsystems (sys/net/vnet.h, sys/netinet/vinet.h etc.). All the changes are verified to have zero functional impact at this point in time by doing MD5 comparision between pre- and post-change object files(). () netipsec/keysock.c did not validate depending on compile time options. Implemented by: julian, bz, brooks, zec Reviewed by: julian, bz, brooks, kris, rwatson, ... Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
*	Commit step 1 of the vimage project, (network stack)	bz	2008-08-17	1	-57/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	virtualization work done by Marko Zec (zec@). This is the first in a series of commits over the course of the next few weeks. Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only. We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again. Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch
*	Increase statistic counters for enc0 interface when enabled	vanhu	2008-08-12	1	-0/+11
\| \| \| \| \| \| \|	and processing IPSec traffic. Approved by: gnn (mentor) MFC after: 1 week
*	In addition to the ipsec_osdep.h removal a week ago, now also eliminate	bz	2008-05-24	1	-2/+0
\| \| \| \|	IPSEC_SPLASSERT_SOFTNET which has been 'unused' since FreeBSD 5.0.
*	Add sysctls to if_enc(4) to control whether the firewalls or	bz	2007-11-28	1	-2/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bpf will see inner and outer headers or just inner or outer headers for incoming and outgoing IPsec packets. This is useful in bpf to not have over long lines for debugging or selcting packets based on the inner headers. It also properly defines the behavior of what the firewalls see. Last but not least it gives you if_enc(4) for IPv6 as well. [ As some auxiliary state was not available in the later input path we save it in the tdbi. That way tcpdump can give a consistent view of either of (authentic,confidential) for both before and after states. ] Discussed with: thompsa (2007-04-25, basic idea of unifying paths) Reviewed by: thompsa, gnn
*	Fix for an infinite loop in processing ESP, IPv6 packets.	gnn	2007-09-12	1	-4/+17
\| \| \| \| \| \| \| \|	The control input routine passes a NULL as its void argument when it has reached the innermost header, which terminates the loop. Reported by: Pawel Worach <pawel.worach@gmail.com> Approved by: re
*	Replace hard coded options by their defined PFIL_{IN,OUT} names.	bz	2007-07-19	1	-1/+2
\| \| \| \|	Approved by: re (hrs)
*	Looking at {ah,esp}_input_cb it seems we might be able to end up	bz	2007-06-15	1	-1/+1
\| \| \| \| \| \| \| \| \|	without an mtag in ipsec4_common_input_cb. So in case of !IPCOMP (AH,ESP) only change the m_tag_id if an mtag was passed to ipsec4_common_input_cb. Found with: Coverity Prevent(tm) CID: 2523
*	s,#,*, in a multi-line comment. This is C.	bz	2007-06-15	1	-1/+1
\| \| \| \|	No functional change.
*	Though we are only called for the three security protocols we can	bz	2007-06-15	1	-0/+4
\| \| \| \| \| \| \| \|	handle, document those sprotos using an IPSEC_ASSERT so that it will be clear that 'spi' will always be initialized when used the first time. Found with: Coverity Prevent(tm) CID: 2533
*	s,#if INET6,#ifdef INET6,	bz	2006-12-14	1	-1/+1
\| \| \| \| \| \|	This unbreaks the build for FAST_IPSEC && !INET6 and was wrong anyway. Reported by: Dmitry Pryanishnikov <dmitry atlantis.dp.ua>
*	MFp4: 92972, 98913 + one more change	bz	2006-12-12	1	-2/+10
\| \| \| \| \| \| \|	In ip6_sprintf no longer use and return one of eight static buffers for printing/logging ipv6 addresses. The caller now has to hand in a sufficiently large buffer as first argument.
*	Add a pseudo interface for packet filtering IPSec connections before or after	thompsa	2006-06-26	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \|	encryption. There are two functions, a bpf tap which has a basic header with the SPI number which our current tcpdump knows how to display, and handoff to pfil(9) for packet filtering. Obtained from: OpenBSD Based on: kern/94829 No objections: arch, net MFC after: 1 month
*	Change '#if INET' and '#if INET6' to '#ifdef INET' and '#ifdef INET6'.	pjd	2006-06-04	1	-1/+1
\| \| \| \|	This unbreaks compiling a kernel with FAST_IPSEC and no INET6.
*	Extend the notdef #ifdef to cover the packet copy as there is no point in ↵	gnn	2006-06-04	1	-8/+4
\| \| \| \| \| \| \|	doing that if we're not doing the rest of the work. Submitted by: thompsa MFC after: 1 week