summaryrefslogtreecommitdiffstats
path: root/sys/netipsec
Commit message (Collapse)AuthorAgeFilesLines
* After some off-list discussion, revert a number of changes to thedim2010-11-223-20/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | DPCPU_DEFINE and VNET_DEFINE macros, as these cause problems for various people working on the affected files. A better long-term solution is still being considered. This reversal may give some modules empty set_pcpu or set_vnet sections, but these are harmless. Changes reverted: ------------------------------------------------------------------------ r215318 | dim | 2010-11-14 21:40:55 +0100 (Sun, 14 Nov 2010) | 4 lines Instead of unconditionally emitting .globl's for the __start_set_xxx and __stop_set_xxx symbols, only emit them when the set_vnet or set_pcpu sections are actually defined. ------------------------------------------------------------------------ r215317 | dim | 2010-11-14 21:38:11 +0100 (Sun, 14 Nov 2010) | 3 lines Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout the tree. ------------------------------------------------------------------------ r215316 | dim | 2010-11-14 21:23:02 +0100 (Sun, 14 Nov 2010) | 2 lines Add macros to define static instances of VNET_DEFINE and DPCPU_DEFINE.
* Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughoutdim2010-11-143-20/+20
| | | | the tree.
* Announce both IPsec and UDP Encap (NAT-T) if available forbz2010-10-301-0/+5
| | | | | | | | | | feature_present(3) checks. This will help to run-time detect and conditionally handle specific optionas of either feature in user space (i.e. in libipsec). Descriptions read by: rwatson MFC after: 2 weeks
* Fix typo in comment.thomas2010-10-251-1/+1
|
* Make the IPsec SADB embedded route cache a union to be able to hold both thebz2010-10-233-6/+13
| | | | | | | | | legacy and IPv6 route destination address. Previously in case of IPv6, there was a memory overwrite due to not enough space for the IPv6 address. PR: kern/122565 MFC After: 2 weeks
* Remove dead code:bz2010-10-141-3/+1
| | | | | | assignment to a local variable not used anywhere after that. MFC after: 3 days
* Style: make the asterisk go with the variable name, not the type.bz2010-10-141-1/+1
| | | | MFC after: 3 days
* MFp4 @178283:bz2010-05-241-1/+1
| | | | | | | | | Improve IPsec flow distribution for better netisr parallelism. Instead of using the pointer that would have the last bits masked in a % statement in netisr_select_cpuid() to select the queue, use the SPI. Reviewed by: rwatson MFC after: 4 weeks
* Set SA's natt_type before calling key_mature() in key_add(),vanhu2010-05-051-6/+6
| | | | | | | as the SA may be used as soon as key_mature() has been done. Obtained from: NETASQ MFC after: 1 week
* Update SA's NAT-T stuff before calling key_mature() in key_update(),vanhu2010-05-051-6/+6
| | | | | | | as SA may be used as soon as key_mature() has been called. Obtained from: NETASQ MFC after: 1 week
* MFP4: @176978-176982, 176984, 176990-176994, 177441bz2010-04-299-44/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | "Whitspace" churn after the VIMAGE/VNET whirls. Remove the need for some "init" functions within the network stack, like pim6_init(), icmp_init() or significantly shorten others like ip6_init() and nd6_init(), using static initialization again where possible and formerly missed. Move (most) variables back to the place they used to be before the container structs and VIMAGE_GLOABLS (before r185088) and try to reduce the diff to stable/7 and earlier as good as possible, to help out-of-tree consumers to update from 6.x or 7.x to 8 or 9. This also removes some header file pollution for putatively static global variables. Revert VIMAGE specific changes in ipfilter::ip_auth.c, that are no longer needed. Reviewed by: jhb Discussed with: rwatson Sponsored by: The FreeBSD Foundation Sponsored by: CK Software GmbH MFC after: 6 days
* Locks SPTREE when setting some SP entries to state DEAD.vanhu2010-04-151-0/+6
| | | | | | | | This can prevent kernel panics when updating SPs while there is some traffic for them. Obtained from: NETASQ MFC after: 1m
* Fix a logic error in ipsec code that extractseri2010-04-021-1/+1
| | | | | | | | information from the packets. Reviewed by: bz, mlaier Approved by: mlaier(mentor) MFC after: 1 month
* When tearing down IPsec as part of a (virtual) network stack,bz2010-03-281-7/+9
| | | | | | | | do not try to free the same list twice but free both the acquiring list and the security policy acquiring list. Reviewed by: anchie MFC after: 3 days
* Correct typo in comment.pjd2010-02-181-1/+1
|
* Enable IPcomp by default.bz2009-11-291-1/+1
| | | | | PR: kern/123587 MFC after: 5 days
* Add more statistics variables for IPcomp.bz2009-11-292-3/+19
| | | | | | | Try to version the struct in a backward compatible way. People asked for the versioning of the stats structs in general before. MFC after: 5 days
* Assimilate very similar input and output code pathsbz2009-11-291-4/+2
| | | | | | (no real functional change). MFC after: 5 days
* Only add the IPcomp header if crypto reported success and we have a lowerbz2009-11-291-51/+53
| | | | | | | | | | | | payload size. Before we had always added the header, no matter if we actually send out compressed data or not. With this, after the opencrypto/deflate changes, IPcomp starts to work apart from edge cases. Leave it disabled by default until those are fixed as well. PR: kern/123587 MFC after: 5 days
* Remove whitespace.bz2009-11-281-3/+3
| | | | MFC after: 6 days
* Directly send data uncompressed if the packet payload size is lower thanbz2009-11-281-0/+11
| | | | | | the compression algorithm threshold. MFC after: 6 days
* Correct a typo.bz2009-11-281-1/+1
| | | | MFC after: 6 days
* fixed two race conditions when inserting/removing SAs via PFKey,vanhu2009-11-171-2/+3
| | | | | | | | which can both lead to a kernel panic when adding/removing quickly a lot of SAs. Obtained from: NETASQ MFC after: 2w (MFC on 8 before 8.0 release ???)
* Changed an IPSEC_ASSERT to a simple test, as such invalid packetsvanhu2009-10-011-3/+9
| | | | | | | | | may come from outside without being discarded before. Submitted by: aurelien.ansel@netasq.com Reviewed by: bz (secteam) Obtained from: NETASQ MFC after: 1m
* When checking traffic endpoint's adresses families in key_spdadd(),vanhu2009-09-161-12/+2
| | | | | | | | | | compare them together instead of comparing each one with respective tunnel endpoint. PR: kern/138439 Submitted by: aurelien.ansel@netasq.com Obtained from: NETASQ MFC after: 1 m
* Silent gcc? Yeah, you wish. What I ment was to silence gcc.pjd2009-09-061-2/+2
| | | | Spotted by: julian
* Initialize state_valid and arraysize variable so gcc won't complain.pjd2009-09-061-1/+3
| | | | Reported by: bz
* Improve code a bit by eliminating goto and having one unlock per lock.pjd2009-09-061-4/+3
|
* Correct typo in comment.pjd2009-09-061-1/+1
|
* Rework global locks for interface list and index management, correctingrwatson2009-08-231-4/+4
| | | | | | | | | | | | | | several critical bugs, including race conditions and lock order issues: Replace the single rwlock, ifnet_lock, with two locks, an rwlock and an sxlock. Either can be held to stablize the lists and indexes, but both are required to write. This allows the list to be held stable in both network interrupt contexts and sleepable user threads across sleeping memory allocations or device driver interactions. As before, writes to the interface list must occur from sleepable contexts. Reviewed by: bz, julian MFC after: 3 days
* Merge the remainder of kern_vimage.c and vimage.h into vnet.c andrwatson2009-08-0111-11/+0
| | | | | | | | | | vnet.h, we now use jails (rather than vimages) as the abstraction for virtualization management, and what remained was specific to virtual network stacks. Minor cleanups are done in the process, and comments updated to reflect these changes. Reviewed by: bz Approved by: re (vimage blanket)
* Introduce and use a sysinit-based initialization scheme for virtualrwatson2009-07-232-26/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | network stacks, VNET_SYSINIT: - Add VNET_SYSINIT and VNET_SYSUNINIT macros to declare events that will occur each time a network stack is instantiated and destroyed. In the !VIMAGE case, these are simply mapped into regular SYSINIT/SYSUNINIT. For the VIMAGE case, we instead use SYSINIT's to track their order and properties on registration, using them for each vnet when created/ destroyed, or immediately on module load for already-started vnets. - Remove vnet_modinfo mechanism that existed to serve this purpose previously, as well as its dependency scheme: we now just use the SYSINIT ordering scheme. - Implement VNET_DOMAIN_SET() to allow protocol domains to declare that they want init functions to be called for each virtual network stack rather than just once at boot, compiling down to DOMAIN_SET() in the non-VIMAGE case. - Walk all virtualized kernel subsystems and make use of these instead of modinfo or DOMAIN_SET() for init/uninit events. In some cases, convert modular components from using modevent to using sysinit (where appropriate). In some cases, do minor rejuggling of SYSINIT ordering to make room for or better manage events. Portions submitted by: jhb (VNET_SYSINIT), bz (cleanup) Discussed with: jhb, bz, julian, zec Reviewed by: bz Approved by: re (VIMAGE blanket)
* Garbage collect vnet module registrations that have neither constructorsrwatson2009-07-203-23/+0
| | | | | | | | | | | | | | | nor destructors, as there's no actual work to do. In most cases, the constructors weren't needed because of the existing protocol initialization functions run by net_init_domain() as part of VNET_MOD_NET, or they were eliminated when support for static initialization of virtualized globals was added. Garbage collect dependency references to modules without constructors or destructors, notably VNET_MOD_INET and VNET_MOD_INET6. Reviewed by: bz Approved by: re (vimage blanket)
* Reimplement and/or implement vnet list locking by replacing a mostlyrwatson2009-07-191-2/+2
| | | | | | | | | | | | | | | | | | | | | | unused custom mutex/condvar-based sleep locks with two locks: an rwlock (for non-sleeping use) and sxlock (for sleeping use). Either acquired for read is sufficient to stabilize the vnet list, but both must be acquired for write to modify the list. Replace previous no-op read locking macros, used in various places in the stack, with actual locking to prevent race conditions. Callers must declare when they may perform unbounded sleeps or not when selecting how to lock. Refactor vnet sysinits so that the vnet list and locks are initialized before kernel modules are linked, as the kernel linker will use them for modules loaded by the boot loader. Update various consumers of these KPIs based on whether they may sleep or not. Reviewed by: bz Approved by: re (kib)
* Remove unused VNET_SET() and related macros; only VNET_GET() isrwatson2009-07-1611-51/+51
| | | | | | | | | ever actually used. Rename VNET_GET() to VNET() to shorten variable references. Discussed with: bz, julian Reviewed by: bz Approved by: re (kensmith, kib)
* Build on Jeff Roberson's linker-set based dynamic per-CPU allocatorrwatson2009-07-1421-713/+274
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables. Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing the in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of a the kernel linker. Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided. This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS. Bump __FreeBSD_version and update UPDATING. Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)
* Add address list locking for in6_ifaddrhead/ia_link: as with lockingrwatson2009-06-251-2/+8
| | | | | | | | | | | for in_ifaddrhead, we stick with an rwlock for the time being, which we will revisit in the future with a possible move to rmlocks. Some pieces of code require significant further reworking to be safe from all classes of writer-writer races. Reviewed by: bz MFC after: 6 weeks
* Add a new global rwlock, in_ifaddr_lock, which will synchronize use of therwatson2009-06-251-0/+3
| | | | | | | | | | | | | | | | | | | in_ifaddrhead and INADDR_HASH address lists. Previously, these lists were used unsynchronized as they were effectively never changed in steady state, but we've seen increasing reports of writer-writer races on very busy VPN servers as core count has gone up (and similar configurations where address lists change frequently and concurrently). For the time being, use rwlocks rather than rmlocks in order to take advantage of their better lock debugging support. As a result, we don't enable ip_input()'s read-locking of INADDR_HASH until an rmlock conversion is complete and a performance analysis has been done. This means that one class of reader-writer races still exists. MFC after: 6 weeks Reviewed by: bz
* Convert netinet6 to using queue(9) rather than hand-crafted linked listsrwatson2009-06-241-1/+1
| | | | | | | | for the global IPv6 address list (in6_ifaddr -> in6_ifaddrhead). Adopt the code styles and conventions present in netinet where possible. Reviewed by: gnn, bz MFC after: 6 weeks (possibly not MFCable?)
* Move setting of ports from NAT-T below key_getsah() and actuallybz2009-06-191-8/+9
| | | | | | | | | | below key_setsaval(). Without that, the lookup for the SA had failed as we were looking for a SA with the new, updated port numbers instead of the old ones and were comparing the ports in key_cmpsaidx(). This makes updating the remote -> local SA on the initiator work again. Problem introduced with: p4 changeset 152114
* Add the explicit include of vimage.h to another five .c files stillbz2009-06-171-0/+1
| | | | | | | missing it. Remove the "hidden" kernel only include of vimage.h from ip_var.h added with the very first Vimage commit r181803 to avoid further kernel poisoning.
* Added support for NAT-Traversal (RFC 3948) in IPsec stack.vanhu2009-06-125-5/+691
| | | | | | | | | | | | | | Thanks to (no special order) Emmanuel Dreyfus (manu@netbsd.org), Larry Baird (lab@gta.com), gnn, bz, and other FreeBSD devs, Julien Vanherzeele (julien.vanherzeele@netasq.com, for years of bug reporting), the PFSense team, and all people who used / tried the NAT-T patch for years and reported bugs, patches, etc... X-MFC: never Reviewed by: bz Approved by: gnn(mentor) Obtained from: NETASQ
* Properly hide IPv4 only variables and functions under #ifdef INET.bz2009-06-103-0/+6
|
* After r193232 rt_tables in vnet.h are no longer indirectly dependent onbz2009-06-082-2/+0
| | | | | | | | | the ROUTETABLES kernel option thus there is no need to include opt_route.h anymore in all consumers of vnet.h and no longer depend on it for module builds. Remove the hidden include in flowtable.h as well and leave the two explicit #includes in ip_input.c and ip_output.c.
* Introduce an infrastructure for dismantling vnet instances.zec2009-06-084-3/+91
| | | | | | | | | | | | | | | | | | | | | | | | | Vnet modules and protocol domains may now register destructor functions to clean up and release per-module state. The destructor mechanisms can be triggered by invoking "vimage -d", or a future equivalent command which will be provided via the new jail framework. While this patch introduces numerous placeholder destructor functions, many of those are currently incomplete, thus leaking memory or (even worse) failing to stop all running timers. Many of such issues are already known and will be incrementaly fixed over the next weeks in smaller incremental commits. Apart from introducing new fields in structs ifnet, domain, protosw and vnet_net, which requires the kernel and modules to be rebuilt, this change should have no impact on nooptions VIMAGE builds, since vnet destructors can only be called in VIMAGE kernels. Moreover, destructor functions should be in general compiled in only in options VIMAGE builds, except for kernel modules which can be safely kldunloaded at run time. Bump __FreeBSD_version to 800097. Reviewed by: bz, julian Approved by: rwatson, kib (re), julian (mentor)
* Reimplement the netisr framework in order to support parallel netisrrwatson2009-06-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | threads: - Support up to one netisr thread per CPU, each processings its own workstream, or set of per-protocol queues. Threads may be bound to specific CPUs, or allowed to migrate, based on a global policy. In the future it would be desirable to support topology-centric policies, such as "one netisr per package". - Allow each protocol to advertise an ordering policy, which can currently be one of: NETISR_POLICY_SOURCE: packets must maintain ordering with respect to an implicit or explicit source (such as an interface or socket). NETISR_POLICY_FLOW: make use of mbuf flow identifiers to place work, as well as allowing protocols to provide a flow generation function for mbufs without flow identifers (m2flow). Falls back on NETISR_POLICY_SOURCE if now flow ID is available. NETISR_POLICY_CPU: allow protocols to inspect and assign a CPU for each packet handled by netisr (m2cpuid). - Provide utility functions for querying the number of workstreams being used, as well as a mapping function from workstream to CPU ID, which protocols may use in work placement decisions. - Add explicit interfaces to get and set per-protocol queue limits, and get and clear drop counters, which query data or apply changes across all workstreams. - Add a more extensible netisr registration interface, in which protocols declare 'struct netisr_handler' structures for each registered NETISR_ type. These include name, handler function, optional mbuf to flow ID function, optional mbuf to CPU ID function, queue limit, and ordering policy. Padding is present to allow these to be expanded in the future. If no queue limit is declared, then a default is used. - Queue limits are now per-workstream, and raised from the previous IFQ_MAXLEN default of 50 to 256. - All protocols are updated to use the new registration interface, and with the exception of netnatm, default queue limits. Most protocols register as NETISR_POLICY_SOURCE, except IPv4 and IPv6, which use NETISR_POLICY_FLOW, and will therefore take advantage of driver- generated flow IDs if present. - Formalize a non-packet based interface between interface polling and the netisr, rather than having polling pretend to be two protocols. Provide two explicit hooks in the netisr worker for start and end events for runs: netisr_poll() and netisr_pollmore(), as well as a function, netisr_sched_poll(), to allow the polling code to schedule netisr execution. DEVICE_POLLING still embeds single-netisr assumptions in its implementation, so for now if it is compiled into the kernel, a single and un-bound netisr thread is enforced regardless of tunable configuration. In the default configuration, the new netisr implementation maintains the same basic assumptions as the previous implementation: a single, un-bound worker thread processes all deferred work, and direct dispatch is enabled by default wherever possible. Performance measurement shows a marginal performance improvement over the old implementation due to the use of batched dequeue. An rmlock is used to synchronize use and registration/unregistration using the framework; currently, synchronized use is disabled (replicating current netisr policy) due to a measurable 3%-6% hit in ping-pong micro-benchmarking. It will be enabled once further rmlock optimization has taken place. However, in practice, netisrs are rarely registered or unregistered at runtime. A new man page for netisr will follow, but since one doesn't currently exist, it hasn't been updated. This change is not appropriate for MFC, although the polling shutdown handler should be merged to 7-STABLE. Bump __FreeBSD_version. Reviewed by: bz
* Lock SPTREE before parsing it in key_spddump()vanhu2009-05-271-1/+5
| | | | | | Approved by: gnn(mentor) Obtained from: NETASQ MFC after: 2 weeks
* Only decrease refcnt once when flushing SPD entries, tovanhu2009-05-271-4/+14
| | | | | | | | avoid flushing entries which are still used. Approved by: gnn(mentor) Obtained from: NETASQ MFC after: 1 month
* Add sysctls to toggle the behaviour of the (former) IPSEC_FILTERTUNNELbz2009-05-234-0/+22
| | | | | | | | | | | | kernel option. This also permits tuning of the option per virtual network stack, as well as separately per inet, inet6. The kernel option is left for a transition period, marked deprecated, and will be removed soon. Initially requested by: phk (1 year 1 day ago) MFC after: 4 weeks
* Change the curvnet variable from a global const struct vnet *,zec2009-05-051-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | previously always pointing to the default vnet context, to a dynamically changing thread-local one. The currvnet context should be set on entry to networking code via CURVNET_SET() macros, and reverted to previous state via CURVNET_RESTORE(). Recursions on curvnet are permitted, though strongly discuouraged. This change should have no functional impact on nooptions VIMAGE kernel builds, where CURVNET_* macros expand to whitespace. The curthread->td_vnet (aka curvnet) variable's purpose is to be an indicator of the vnet context in which the current network-related operation takes place, in case we cannot deduce the current vnet context from any other source, such as by looking at mbuf's m->m_pkthdr.rcvif->if_vnet, sockets's so->so_vnet etc. Moreover, so far curvnet has turned out to be an invaluable consistency checking aid: it helps to catch cases when sockets, ifnets or any other vnet-aware structures may have leaked from one vnet to another. The exact placement of the CURVNET_SET() / CURVNET_RESTORE() macros was a result of an empirical iterative process, whith an aim to reduce recursions on CURVNET_SET() to a minimum, while still reducing the scope of CURVNET_SET() to networking only operations - the alternative would be calling CURVNET_SET() on each system call entry. In general, curvnet has to be set in three typicall cases: when processing socket-related requests from userspace or from within the kernel; when processing inbound traffic flowing from device drivers to upper layers of the networking stack, and when executing timer-driven networking functions. This change also introduces a DDB subcommand to show the list of all vnet instances. Approved by: julian (mentor)
OpenPOWER on IntegriCloud