summaryrefslogtreecommitdiffstats
path: root/sys/netinet6
Commit message (Collapse)AuthorAgeFilesLines
* Support for VNET in SCTP (hopefully)rrs2009-09-171-4/+4
|
* Self pointing routes are installed for configured interface addressesqingli2009-09-151-39/+5
| | | | | | | | | | and address aliases. After an interface is brought down and brought back up again, those self pointing routes disappeared. This patch ensures after an interface is brought back up, the loopback routes are reinstalled properly. Reviewed by: bz MFC after: immediately
* Improve flexibility of receiving Router Advertisement andhrs2009-09-128-34/+115
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | automatic link-local address configuration: - Convert a sysctl net.inet6.ip6.accept_rtadv to one for the default value of a per-IF flag ND6_IFF_ACCEPT_RTADV, not a global knob. The default value of the sysctl is 0. - Add a new per-IF flag ND6_IFF_AUTO_LINKLOCAL and convert a sysctl net.inet6.ip6.auto_linklocal to one for its default value. The default value of the sysctl is 1. - Make ND6_IFF_IFDISABLED more robust. It can be used to disable IPv6 functionality of an interface now. - Receiving RA is allowed if ip6_forwarding==0 *and* ND6_IFF_ACCEPT_RTADV is set on that interface. The former condition will be revisited later to support a "host + router" box like IPv6 CPE router. The current behavior is compatible with the older releases of FreeBSD. - The ifconfig(8) now supports these ND6 flags as well as "nud", "prefer_source", and "disabled" in ndp(8). The ndp(8) now supports "auto_linklocal". Discussed with: bz and jinmei Reviewed by: bz MFC after: 3 days
* The addresses that are assigned to the loopback interfaceqingli2009-09-051-4/+7
| | | | | | | should be part of the kernel routing table. Reviewed by: bz MFC after: immediately
* This patch fixes an address scope violation. Considering theqingli2009-09-051-0/+4
| | | | | | | | | | | | | | scenario where an anycast address is assigned on one interface, and a global address with the same scope is assigned on another interface. In other words, the interface owns the anycast address has only the link-local address as one other address. Without this patch, "ping6" the anycast address from another station will observe the source address of the returned ICMP6 echo reply has the link-local address, not the global address that exists on the other interface in the same node. Reviewed by: bz MFC after: immediately
* This patch fixes the following issues:qingli2009-09-053-21/+35
| | | | | | | | | | | | | | | | | | | | | | | | - Interface link-local address is not reachable within the node that owns the interface, this is due to the mismatch in address scope as the result of the installed interface address loopback route. Therefore for each interface address loopback route, the rt_gateway field (of AF_LINK type) will be used to track which interface a given address belongs to. This will aid the address source to use the proper interface for address scope/zone validation. - The loopback address is not reachable. The root cause is the same as the above. - Empty nd6 entries are created for the IPv6 loopback addresses only for validation reason. Doing so will eliminate as much of the special case (loopback addresses) handling code as possible, however, these empty nd6 entries should not be returned to the userland applications such as the "ndp" command. Since both of the above issues contain common files, these files are committed together. Reviewed by: bz MFC after: immediately
* Prefix on-link verification is being performed on staticallyqingli2009-08-301-0/+9
| | | | | | | | | | configured prefixes. Since these statically configured prefixes do not have any associated advertising routers, these prefixes are treated as unreachable and those prefix routes are deleted from the routing table. Therefore bypass prefixes that are not learned from router advertisements during prefix on-link check. Reviewed by: hrs
* When multiple interfaces exist in the system, with each interface havingqingli2009-08-261-2/+25
| | | | | | | | | | | | | | an IPv6 address assigned to it, and if an incoming packet received on one interface has a packet destination address that belongs to another interface, the routing table is consulted to determine how to reach this packet destination. Since the packet destination is an interface address, the route table will return a host route with the loopback interface as rt_ifp. The input code must recognize this fact, instead of using the loopback interface, the input code performs a search to find the right interface that owns the given IPv6 address. Reviewed by: bz, gnn, kmacy MFC after: immediately
* Use locks specific to the lltable code, rather than borrow the ifnetrwatson2009-08-251-1/+1
| | | | | | | | | list/index locks, to protect link layer address tables. This avoids lock order issues during interface teardown, but maintains the bug that sysctl copy routines may be called while a non-sleepable lock is held. Reviewed by: bz, kmacy MFC after: 3 days
* Rework global locks for interface list and index management, correctingrwatson2009-08-234-19/+14
| | | | | | | | | | | | | | several critical bugs, including race conditions and lock order issues: Replace the single rwlock, ifnet_lock, with two locks, an rwlock and an sxlock. Either can be held to stablize the lists and indexes, but both are required to write. This allows the list to be held stable in both network interrupt contexts and sleepable user threads across sleeping memory allocations or device driver interactions. As before, writes to the interface list must occur from sleepable contexts. Reviewed by: bz, julian MFC after: 3 days
* A piece of code was added to install a host route when an IPv6 interfaceqingli2009-08-121-12/+3
| | | | | | | | | | | address is configured with a /128 prefix. This is no longer necessary due to r192011. In fact that code conflicts with r192011. This patch removes the host route installation when detecting the /128 prefix, and instead let the code added by r192011 to install the loopback route for that IPv6 interface address. Reviewed by: bz Approved by: re
* Many network stack subsystems use a single global data structure to holdrwatson2009-08-021-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | all pertinent statatistics for the subsystem. These structures are sometimes "borrowed" by kernel modules that require a place to store statistics for similar events. Add KPI accessor functions for statistics structures referenced by kernel modules so that they no longer encode certain specifics of how the data structures are named and stored. This change is intended to make it easier to move to per-CPU network stats following 8.0-RELEASE. The following modules are affected by this change: if_bridge if_cxgb if_gif ip_mroute ipdivert pf In practice, most of these statistics consumers should, in fact, maintain their own statistics data structures rather than borrowing structures from the base network stack. However, that change is too agressive for this point in the release cycle. Reviewed by: bz Approved by: re (kib)
* Merge the remainder of kern_vimage.c and vimage.h into vnet.c andrwatson2009-08-0124-24/+2
| | | | | | | | | | vnet.h, we now use jails (rather than vimages) as the abstraction for virtualization management, and what remained was specific to virtual network stacks. Minor cleanups are done in the process, and comments updated to reflect these changes. Reviewed by: bz Approved by: re (vimage blanket)
* This patch does the following:qingli2009-07-271-2/+6
| | | | | | | | | | | | | | - Allow loopback route to be installed for address assigned to interface of IFF_POINTOPOINT type. - Install loopback route for an IPv4 interface addreess when the "useloopback" sysctl variable is enabled. Similarly, install loopback route for an IPv6 interface address when the sysctl variable "nd6_useloopback" is enabled. Deleting loopback routes for interface addresses is unconditional in case these sysctl variables were disabled after an interface address has been assigned. Reviewed by: bz Approved by: re
* Introduce and use a sysinit-based initialization scheme for virtualrwatson2009-07-232-42/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | network stacks, VNET_SYSINIT: - Add VNET_SYSINIT and VNET_SYSUNINIT macros to declare events that will occur each time a network stack is instantiated and destroyed. In the !VIMAGE case, these are simply mapped into regular SYSINIT/SYSUNINIT. For the VIMAGE case, we instead use SYSINIT's to track their order and properties on registration, using them for each vnet when created/ destroyed, or immediately on module load for already-started vnets. - Remove vnet_modinfo mechanism that existed to serve this purpose previously, as well as its dependency scheme: we now just use the SYSINIT ordering scheme. - Implement VNET_DOMAIN_SET() to allow protocol domains to declare that they want init functions to be called for each virtual network stack rather than just once at boot, compiling down to DOMAIN_SET() in the non-VIMAGE case. - Walk all virtualized kernel subsystems and make use of these instead of modinfo or DOMAIN_SET() for init/uninit events. In some cases, convert modular components from using modevent to using sysinit (where appropriate). In some cases, do minor rejuggling of SYSINIT ordering to make room for or better manage events. Portions submitted by: jhb (VNET_SYSINIT), bz (cleanup) Discussed with: jhb, bz, julian, zec Reviewed by: bz Approved by: re (VIMAGE blanket)
* sysctl_msec_to_ticks is used with both virtualized andbz2009-07-211-10/+2
| | | | | | | | | | | | | | | non-vrtiualized sysctls so we cannot used one common function. Add a macro to convert the arg1 in the virtualized case to vnet.h to not expose the maths to all over the code. Add a wrapper for the single virtualized call, properly handling arg1 and call the default implementation from there. Convert the two over places to use the new macro. Reviewed by: rwatson Approved by: re (kib)
* Garbage collect vnet module registrations that have neither constructorsrwatson2009-07-202-21/+0
| | | | | | | | | | | | | | | nor destructors, as there's no actual work to do. In most cases, the constructors weren't needed because of the existing protocol initialization functions run by net_init_domain() as part of VNET_MOD_NET, or they were eliminated when support for static initialization of virtualized globals was added. Garbage collect dependency references to modules without constructors or destructors, notably VNET_MOD_INET and VNET_MOD_INET6. Reviewed by: bz Approved by: re (vimage blanket)
* Reimplement and/or implement vnet list locking by replacing a mostlyrwatson2009-07-192-9/+11
| | | | | | | | | | | | | | | | | | | | | | unused custom mutex/condvar-based sleep locks with two locks: an rwlock (for non-sleeping use) and sxlock (for sleeping use). Either acquired for read is sufficient to stabilize the vnet list, but both must be acquired for write to modify the list. Replace previous no-op read locking macros, used in various places in the stack, with actual locking to prevent race conditions. Callers must declare when they may perform unbounded sleeps or not when selecting how to lock. Refactor vnet sysinits so that the vnet list and locks are initialized before kernel modules are linked, as the kernel linker will use them for modules loaded by the boot loader. Update various consumers of these KPIs based on whether they may sleep or not. Reviewed by: bz Approved by: re (kib)
* Fix a problem, whereby misbehaving IPv6 applications, which don't includebms2009-07-181-2/+12
| | | | | | | | | a valid zone ID or interface identifier in a v6 multicast leave, would trigger a fairly paranoid KASSERT(). Observed with Boost++ regression tests on ref8.freebsd.org. Approved by: re (kib)
* Remove unused VNET_SET() and related macros; only VNET_GET() isrwatson2009-07-1619-100/+100
| | | | | | | | | ever actually used. Rename VNET_GET() to VNET() to shorten variable references. Discussed with: bz, julian Reviewed by: bz Approved by: re (kensmith, kib)
* Build on Jeff Roberson's linker-set based dynamic per-CPU allocatorrwatson2009-07-1430-858/+412
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables. Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing the in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of a the kernel linker. Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided. This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS. Bump __FreeBSD_version and update UPDATING. Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)
* This patch adds a host route to an interface address (that is assignedqingli2009-07-121-0/+46
| | | | | | | | | | to a non loopback/ppp link type) through the loopback interface. Prior to the new L2/L3 rewrite, this host route was explicitly created when processing the IPv6 address assignment. This loopback host route is deleted when that IPv6 address is removed from the interface. Reviewed by: bz, gnn Approved by: re
* Fix "options VIMAGE_GLOBALS" build following introduction ofrwatson2009-06-291-1/+1
| | | | | | in6_ifaddrhead. Approved by: re (kib)
* In in6_update_ifa(), jump to 'cleanup' rather than returning directlyrwatson2009-06-271-4/+7
| | | | | | | | | | | | | | in one additional case, avoiding an ifaddr reference leak. Defer releasing the in6_ifaddr's in6_ifaddrhead reference until the end of in6_unlink_ifa(), as callers are inconsistent regarding whether or not they hold a reference across the call. This avoids using the ifaddr after it may have been freed. Reported by: tegge Reviewed by: tegge Approved by: re (blanket) MFC after: 6 weeks
* Add address list locking for in6_ifaddrhead/ia_link: as with lockingrwatson2009-06-257-5/+46
| | | | | | | | | | | for in_ifaddrhead, we stick with an rwlock for the time being, which we will revisit in the future with a possible move to rmlocks. Some pieces of code require significant further reworking to be safe from all classes of writer-writer races. Reviewed by: bz MFC after: 6 weeks
* Clean up reference management in in6_update_ifa and in6_unlink_ifa, andrwatson2009-06-251-7/+3
| | | | | | | | | in particular, add a reference for in6_ifaddrhead since we do remove a reference for it when an IPv6 address is removed. This fixes ifconfig delete of an IPv6 alias. Reported by: tegge MFC after: 6 weeks
* Convert netinet6 to using queue(9) rather than hand-crafted linked listsrwatson2009-06-249-68/+35
| | | | | | | | for the global IPv6 address list (in6_ifaddr -> in6_ifaddrhead). Adopt the code styles and conventions present in netinet where possible. Reviewed by: gnn, bz MFC after: 6 weeks (possibly not MFCable?)
* Make callers to in6_selectsrc() and in6_pcbladdr() pass in memorybz2009-06-238-89/+84
| | | | | | | | | to save the selected source address rather than returning an unreferenced copy to a pointer that might long be gone by the time we use the pointer for anything meaningful. Asked for by: rwatson Reviewed by: rwatson
* Modify most routines returning 'struct ifaddr *' to return referencesrwatson2009-06-2314-86/+213
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | rather than pointers, requiring callers to properly dispose of those references. The following routines now return references: ifaddr_byindex ifa_ifwithaddr ifa_ifwithbroadaddr ifa_ifwithdstaddr ifa_ifwithnet ifaof_ifpforaddr ifa_ifwithroute ifa_ifwithroute_fib rt_getifa rt_getifa_fib IFP_TO_IA ip_rtaddr in6_ifawithifp in6ifa_ifpforlinklocal in6ifa_ifpwithaddr in6_ifadd carp_iamatch6 ip6_getdstifaddr Remove unused macro which didn't have required referencing: IFP_TO_IA6 This closes many small races in which changes to interface or address lists while an ifaddr was in use could lead to use of freed memory (etc). In a few cases, add missing if_addr_list locking required to safely acquire references. Because of a lack of deep copying support, we accept a race in which an in6_ifaddr pointed to by mbuf tags and extracted with ip6_getdstifaddr() doesn't hold a reference while in transmit. Once we have mbuf tag deep copy support, this can be fixed. Reviewed by: bz Obtained from: Apple, Inc. (portions) MFC after: 6 weeks (portions)
* After cleaning up rt_tables from vnet.h and cleaning up opt_route.hbz2009-06-232-2/+0
| | | | | a lot of files no longer need route.h either. Garbage collect them. While here remove now unneeded vnet.h #includes as well.
* In r194702 I meant to remove vnet.h which is no longer needed, not route.h.bz2009-06-231-1/+1
|
* in6_rtqdrain() has been unused. Cleanup.bz2009-06-231-23/+0
| | | | As this was the only consumer of net/route.h left remove that as well.
* Clean up common ifaddr management:rwatson2009-06-213-10/+9
| | | | | | | | | | | | | | - Unify reference count and lock initialization in a single function, ifa_init(). - Move tear-down from a macro (IFAFREE) to a function ifa_free(). - Move reference count bump from a macro (IFAREF) to a function ifa_ref(). - Instead of using a u_int protected by a mutex to refcount(9) for reference count management. The ifa_mtx is now used for exactly one ioctl, and possibly should be removed. MFC after: 3 weeks
* Switch cmd argument to u_long. This matches what if_ethersubr.c does andrdivacky2009-06-213-4/+4
| | | | | | | allows the code to compile cleanly on amd64 with clang. Reviewed by: rwatson Approved by: ed (mentor)
* Add explicit includes for jail.h to the files that need them andbz2009-06-171-0/+1
| | | | remove the "hidden" one from vimage.h.
* Rename the host-related prison fields to be the same as the host.*jamie2009-06-132-8/+10
| | | | | | | parameters they represent, and the variables they replaced, instead of abbreviated versions of them. Approved by: bz (mentor)
* Remove unnecessary #ifdef lines and code.zec2009-06-121-7/+0
| | | | Approved by: julian (mentor)
* Prevent integer overflow in direct pipe write code from circumventingcperciva2009-06-101-1/+1
| | | | | | | | | | | | | | virtual-to-physical page lookups. [09:09] Add missing permissions check for SIOCSIFINFO_IN6 ioctl. [09:10] Fix buffer overflow in "autokey" negotiation in ntpd(8). [09:11] Approved by: so (cperciva) Approved by: re (not really, but SVN wants this...) Security: FreeBSD-SA-09:09.pipe Security: FreeBSD-SA-09:10.ipv6 Security: FreeBSD-SA-09:11.ntpd
* After r193232 rt_tables in vnet.h are no longer indirectly dependent onbz2009-06-0813-16/+0
| | | | | | | | | the ROUTETABLES kernel option thus there is no need to include opt_route.h anymore in all consumers of vnet.h and no longer depend on it for module builds. Remove the hidden include in flowtable.h as well and leave the two explicit #includes in ip_input.c and ip_output.c.
* Introduce an infrastructure for dismantling vnet instances.zec2009-06-087-7/+55
| | | | | | | | | | | | | | | | | | | | | | | | | Vnet modules and protocol domains may now register destructor functions to clean up and release per-module state. The destructor mechanisms can be triggered by invoking "vimage -d", or a future equivalent command which will be provided via the new jail framework. While this patch introduces numerous placeholder destructor functions, many of those are currently incomplete, thus leaking memory or (even worse) failing to stop all running timers. Many of such issues are already known and will be incrementaly fixed over the next weeks in smaller incremental commits. Apart from introducing new fields in structs ifnet, domain, protosw and vnet_net, which requires the kernel and modules to be rebuilt, this change should have no impact on nooptions VIMAGE builds, since vnet destructors can only be called in VIMAGE kernels. Moreover, destructor functions should be in general compiled in only in options VIMAGE builds, except for kernel modules which can be safely kldunloaded at run time. Bump __FreeBSD_version to 800097. Reviewed by: bz, julian Approved by: rwatson, kib (re), julian (mentor)
* Fix and add a workaround on an issue of EtherIP packet with reversedhrs2009-06-071-2/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | version field sent via gif(4)+if_bridge(4). The EtherIP implementation found on FreeBSD 6.1, 6.2, 6.3, 7.0, 7.1, and 7.2 had an interoperability issue because it sent the incorrect EtherIP packets and discarded the correct ones. This change introduces the following two flags to gif(4): accept_rev_ethip_ver: accepts both correct EtherIP packets and ones with reversed version field, if enabled. If disabled, the gif accepts the correct packets only. This flag is enabled by default. send_rev_ethip_ver: sends EtherIP packets with reversed version field intentionally, if enabled. If disabled, the gif sends the correct packets only. This flag is disabled by default. These flags are stored in struct gif_softc and can be set by ifconfig(8) on per-interface basis. Note that this is an incompatible change of EtherIP with the older FreeBSD releases. If you need to interoperate older FreeBSD boxes and new versions after this commit, setting "send_rev_ethip_ver" is needed. Reviewed by: thompsa and rwatson Spotted by: Shunsuke SHINOMIYA PR: kern/125003 MFC after: 2 weeks
* Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERICrwatson2009-06-054-5/+0
| | | | | | | | and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include. Discussed with: pjd
* V_loif is not an array but a pure pointer, so treat it as such.zec2009-06-011-1/+1
| | | | | Reviewed by: bz Approved by: julian (mentor)
* Remove an #undef MIN that slipped under the radar and led me tozec2009-06-011-9/+0
| | | | | | | hastily introduce an #define MIN() a few lines below in r191816. Approved by: julian (mentor) Discussed with: bz
* Convert the two dimensional array to be malloced and introducebz2009-06-013-17/+29
| | | | | | | | | | | | | | | | an accessor function to get the correct rnh pointer back. Update netstat to get the correct pointer using kvm_read() as well. This not only fixes the ABI problem depending on the kernel option but also permits the tunable to overwrite the kernel option at boot time up to MAXFIBS, enlarging the number of FIBs without having to recompile. So people could just use GENERIC now. Reviewed by: julian, rwatson, zec X-MFC: not possible
* Reimplement the netisr framework in order to support parallel netisrrwatson2009-06-012-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | threads: - Support up to one netisr thread per CPU, each processings its own workstream, or set of per-protocol queues. Threads may be bound to specific CPUs, or allowed to migrate, based on a global policy. In the future it would be desirable to support topology-centric policies, such as "one netisr per package". - Allow each protocol to advertise an ordering policy, which can currently be one of: NETISR_POLICY_SOURCE: packets must maintain ordering with respect to an implicit or explicit source (such as an interface or socket). NETISR_POLICY_FLOW: make use of mbuf flow identifiers to place work, as well as allowing protocols to provide a flow generation function for mbufs without flow identifers (m2flow). Falls back on NETISR_POLICY_SOURCE if now flow ID is available. NETISR_POLICY_CPU: allow protocols to inspect and assign a CPU for each packet handled by netisr (m2cpuid). - Provide utility functions for querying the number of workstreams being used, as well as a mapping function from workstream to CPU ID, which protocols may use in work placement decisions. - Add explicit interfaces to get and set per-protocol queue limits, and get and clear drop counters, which query data or apply changes across all workstreams. - Add a more extensible netisr registration interface, in which protocols declare 'struct netisr_handler' structures for each registered NETISR_ type. These include name, handler function, optional mbuf to flow ID function, optional mbuf to CPU ID function, queue limit, and ordering policy. Padding is present to allow these to be expanded in the future. If no queue limit is declared, then a default is used. - Queue limits are now per-workstream, and raised from the previous IFQ_MAXLEN default of 50 to 256. - All protocols are updated to use the new registration interface, and with the exception of netnatm, default queue limits. Most protocols register as NETISR_POLICY_SOURCE, except IPv4 and IPv6, which use NETISR_POLICY_FLOW, and will therefore take advantage of driver- generated flow IDs if present. - Formalize a non-packet based interface between interface polling and the netisr, rather than having polling pretend to be two protocols. Provide two explicit hooks in the netisr worker for start and end events for runs: netisr_poll() and netisr_pollmore(), as well as a function, netisr_sched_poll(), to allow the polling code to schedule netisr execution. DEVICE_POLLING still embeds single-netisr assumptions in its implementation, so for now if it is compiled into the kernel, a single and un-bound netisr thread is enforced regardless of tunable configuration. In the default configuration, the new netisr implementation maintains the same basic assumptions as the previous implementation: a single, un-bound worker thread processes all deferred work, and direct dispatch is enabled by default wherever possible. Performance measurement shows a marginal performance improvement over the old implementation due to the use of batched dequeue. An rmlock is used to synchronize use and registration/unregistration using the framework; currently, synchronized use is disabled (replicating current netisr policy) due to a measurable 3%-6% hit in ping-pong micro-benchmarking. It will be enabled once further rmlock optimization has taken place. However, in practice, netisrs are rarely registered or unregistered at runtime. A new man page for netisr will follow, but since one doesn't currently exist, it hasn't been updated. This change is not appropriate for MFC, although the polling shutdown handler should be merged to 7-STABLE. Bump __FreeBSD_version. Reviewed by: bz
* - Rename IP_NONLOCALOK IP socket option to IP_BINDANY, to be more consistentpjd2009-06-013-2/+21
| | | | | | | | | | | | | with OpenBSD (and BSD/OS originally). We can't easly do it SOL_SOCKET option as there is no more space for more SOL_SOCKET options, but this option also fits better as an IP socket option, it seems. - Implement this functionality also for IPv6 and RAW IP sockets. - Always compile it in (don't use additional kernel options). - Remove sysctl to turn this functionality on and off. - Introduce new privilege - PRIV_NETINET_BINDANY, which allows to use this functionality (currently only unjail root can use it). Discussed with: julian, adrian, jhb, rwatson, kmacy
* Place hostnames and similar information fully under the prison system.jamie2009-05-292-37/+42
| | | | | | | | | | | | | | | | | The system hostname is now stored in prison0, and the global variable "hostname" has been removed, as has the hostname_mtx mutex. Jails may have their own host information, or they may inherit it from the parent/system. The proper way to read the hostname is via getcredhostname(), which will copy either the hostname associated with the passed cred, or the system hostname if you pass NULL. The system hostname can still be accessed directly (and without locking) at prison0.pr_host, but that should be avoided where possible. The "similar information" referred to is domainname, hostid, and hostuuid, which have also become prison parameters and had their associated global variables removed. Approved by: bz (mentor)
* Merge final round of MLD changes from p4:bms2009-05-275-166/+371
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ip6_input.c, in6.h: * Add netinet6-specific mbuf flag M_RTALERT_MLD, shadowing M_PROTO6. * Always set this flag if HBH Router Alert option is present for MLD, even when not forwarding. icmp6.c: * In icmp6_input(), spell m->m_pkthdr.rcvif as ifp to be consistent. * Use scope ID for verifying input. Do not apply SSM filters here, no inpcb. * Check for M_RTALERT_MLD when validating MLD traffic, as we can't see IPv6 hop options outside of ip6_input(). in6_mcast.c: * Use KAME scope/zone ID in in6_multi. * Update net.inet6.ip6.mcast.filters implementation to use scope IDs for comparisons. * Fix scope ID treatment in multicast socket option processing. Scope IDs passed in from userland will be ignored as other less ambiguous APIs exist for specifying the link. * Tighten userland input checks in IPv6 SSM delta and full-state ops. * Source filter embedded scope IDs need to be revisited, for now just clear them and ignore them on input. * Adapt KAME behaviour of looking up the scope ID in the default zone for multicast leaves, when the interface is ambiguous. mld6.c: * Tighten origin checks on MLD traffic as per RFC3810 Section 6.2: * ip6_src MAY be the unspecified address for MLDv1 reports. * ip6_src MAY have link-local address scope for MLDv1 reports, MLDv1 queries, and MLDv2 queries. * Perform address field validation *before* accepting queries. * Use KAME scope/zone ID in query/report processing. * Break const correctness for mld_v1_input_report(), mld_v1_input_query() as we temporarily modify the input mbuf chain. * Clear the scope ID before handoff to userland MLD daemon. * Fix MLDv1 old querier present timer processing. With the protocol defaults, hosts should revert to MLDv2 after 260s. * Add net.inet6.mld.v1enable sysctl, default to on. ifmcstat.c: * Use sysctl by default; -K requests kvm(3) if so compiled. mld.4: * Connect man page to build. Tested using PCS.
* Add hierarchical jails. A jail may further virtualize its environmentjamie2009-05-273-15/+31
| | | | | | | | | | | | | | | | | | | | | | by creating a child jail, which is visible to that jail and to any parent jails. Child jails may be restricted more than their parents, but never less. Jail names reflect this hierarchy, being MIB-style dot-separated strings. Every thread now points to a jail, the default being prison0, which contains information about the physical system. Prison0's root directory is the same as rootvnode; its hostname is the same as the global hostname, and its securelevel replaces the global securelevel. Note that the variable "securelevel" has actually gone away, which should not cause any problems for code that properly uses securelevel_gt() and securelevel_ge(). Some jail-related permissions that were kept in global variables and set via sysctls are now per-jail settings. The sysctls still exist for backward compatibility, used only by the now-deprecated jail(2) system call. Approved by: bz (mentor)
OpenPOWER on IntegriCloud