summaryrefslogtreecommitdiffstats
path: root/sys/net
Commit message (Collapse)AuthorAgeFilesLines
* Introduce and use a sysinit-based initialization scheme for virtualrwatson2009-07-2310-193/+271
| | | | | | | | | | | | | | | | | | | | | | | | | | | | network stacks, VNET_SYSINIT: - Add VNET_SYSINIT and VNET_SYSUNINIT macros to declare events that will occur each time a network stack is instantiated and destroyed. In the !VIMAGE case, these are simply mapped into regular SYSINIT/SYSUNINIT. For the VIMAGE case, we instead use SYSINIT's to track their order and properties on registration, using them for each vnet when created/ destroyed, or immediately on module load for already-started vnets. - Remove vnet_modinfo mechanism that existed to serve this purpose previously, as well as its dependency scheme: we now just use the SYSINIT ordering scheme. - Implement VNET_DOMAIN_SET() to allow protocol domains to declare that they want init functions to be called for each virtual network stack rather than just once at boot, compiling down to DOMAIN_SET() in the non-VIMAGE case. - Walk all virtualized kernel subsystems and make use of these instead of modinfo or DOMAIN_SET() for init/uninit events. In some cases, convert modular components from using modevent to using sysinit (where appropriate). In some cases, do minor rejuggling of SYSINIT ordering to make room for or better manage events. Portions submitted by: jhb (VNET_SYSINIT), bz (cleanup) Discussed with: jhb, bz, julian, zec Reviewed by: bz Approved by: re (VIMAGE blanket)
* sysctl_msec_to_ticks is used with both virtualized andbz2009-07-211-0/+6
| | | | | | | | | | | | | | | non-vrtiualized sysctls so we cannot used one common function. Add a macro to convert the arg1 in the virtualized case to vnet.h to not expose the maths to all over the code. Add a wrapper for the single virtualized call, properly handling arg1 and call the default implementation from there. Convert the two over places to use the new macro. Reviewed by: rwatson Approved by: re (kib)
* Garbage collect vnet module registrations that have neither constructorsrwatson2009-07-201-1/+0
| | | | | | | | | | | | | | | nor destructors, as there's no actual work to do. In most cases, the constructors weren't needed because of the existing protocol initialization functions run by net_init_domain() as part of VNET_MOD_NET, or they were eliminated when support for static initialization of virtualized globals was added. Garbage collect dependency references to modules without constructors or destructors, notably VNET_MOD_INET and VNET_MOD_INET6. Reviewed by: bz Approved by: re (vimage blanket)
* Add macros VNET_SETNAME and VNET_SYMPREFIX, and expose to userspace ifrwatson2009-07-201-3/+10
| | | | | | | | _WANT_VNET is defined. This way we don't need separate definitions in libkvm. Reviewed by: bz Approved by: re (vimage blanket)
* Normalize field naming for struct vnet, fix two debugging printfs thatrwatson2009-07-191-2/+2
| | | | | | | print them. Reviewed by: bz Approved by: re (kensmith, kib)
* Reimplement and/or implement vnet list locking by replacing a mostlyrwatson2009-07-191-2/+2
| | | | | | | | | | | | | | | | | | | | | | unused custom mutex/condvar-based sleep locks with two locks: an rwlock (for non-sleeping use) and sxlock (for sleeping use). Either acquired for read is sufficient to stabilize the vnet list, but both must be acquired for write to modify the list. Replace previous no-op read locking macros, used in various places in the stack, with actual locking to prevent race conditions. Callers must declare when they may perform unbounded sleeps or not when selecting how to lock. Refactor vnet sysinits so that the vnet list and locks are initialized before kernel modules are linked, as the kernel linker will use them for modules loaded by the boot loader. Update various consumers of these KPIs based on whether they may sleep or not. Reviewed by: bz Approved by: re (kib)
* Remove the interim vimage containers, struct vimage and struct procg,jamie2009-07-171-15/+2
| | | | | | and the ioctl-based interface that supported them. Approved by: re (kib), bz (mentor)
* Remove unused VNET_SET() and related macros; only VNET_GET() isrwatson2009-07-1612-50/+45
| | | | | | | | | ever actually used. Rename VNET_GET() to VNET() to shorten variable references. Discussed with: bz, julian Reviewed by: bz Approved by: re (kensmith, kib)
* Add missing license line for vnet.h, correct white space nit.rwatson2009-07-151-1/+2
| | | | Approved by: re (kensmith) (implicit)
* Build on Jeff Roberson's linker-set based dynamic per-CPU allocatorrwatson2009-07-1426-388/+609
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables. Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing the in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of a the kernel linker. Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided. This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS. Bump __FreeBSD_version and update UPDATING. Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)
* Re-factoring for adding weighted routes introduced akmacy2009-07-111-1/+11
| | | | | | | fairly irritating bug where the system will panic when RADIX_MPATH is enabled. This change fixes this. Approved by: re@
* Implementation of the upcoming Wireless Mesh standard, 802.11s, on therpaulo2009-07-111-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | net80211 wireless stack. This work is based on the March 2009 D3.0 draft standard. This standard is expected to become final next year. This includes two main net80211 modules, ieee80211_mesh.c which deals with peer link management, link metric calculation, routing table control and mesh configuration and ieee80211_hwmp.c which deals with the actually routing process on the mesh network. HWMP is the mandatory routing protocol on by the mesh standard, but others, such as RA-OLSR, can be implemented. Authentication and encryption are not implemented. There are several scripts under tools/tools/net80211/scripts that can be used to test different mesh network topologies and they also teach you how to setup a mesh vap (for the impatient: ifconfig wlan0 create wlandev ... wlanmode mesh). A new build option is available: IEEE80211_SUPPORT_MESH and it's enabled by default on GENERIC kernels for i386, amd64, sparc64 and pc98. Drivers that support mesh networks right now are: ath, ral and mwl. More information at: http://wiki.freebsd.org/WifiMesh Please note that this work is experimental. Also, please note that bridging a mesh vap with another network interface is not yet supported. Many thanks to the FreeBSD Foundation for sponsoring this project and to Sam Leffler for his support. Also, I would like to thank Gateworks Corporation for sending me a Cambria board which was used during the development of this project. Reviewed by: sam Approved by: re (kensmith) Obtained from: projects/mesh11s
* In case we cannot queue a packet reaching the queue limit, retain thebz2009-06-301-0/+1
| | | | | | | | semantics netisr_queue() always had and free the mbuf along with returning the error. Reviewed by: rwatson Approved by: re (kensmith)
* Remove support for the /dev/net/* per-interface devices. They servebrooks2009-06-293-180/+7
| | | | | | | | | | | | | | | | | little purpose and are unused in the base system. The IOCTL functionality is entirely duplicated and routing sockets provide a richer interface than the kqueue functionality. Further, it is not practical for these devices to be made sensible in the face of VIMAGE. Bump __FreeBSD_version on the off chance that there is any code out there that actually uses this stuff. Reviewed by: rwatson Discussed with: bz, zec Approved by: re@ (kensmith)
* Remove unnecessary include of kdb.h that snuck in during ifaddr refcountrwatson2009-06-271-1/+0
| | | | | | | work. Reported by: pluknet <pluknet at gmail.com> Approved by: re (kib)
* In light of DPCPU use by netisr, revise various for loops from usingrwatson2009-06-261-17/+16
| | | | | | | | | MAXCPU to mp_maxid, and handling and reporting of requests to use more threads than we have CPUs to run them on. Reviewed by: bz Approved by: re (kib) MFC after: 6 weeks
* Use if_addr_rlock/if_addr_runlock for if_spp when iterating if_addrhead,rwatson2009-06-261-8/+8
| | | | | | | as it is loadable as a module. Approved by: re (kib) MFC after: 6 weeks
* Update if_stf and if_tun to use if_addr_rlock()/if_addr_runlock() ratherrwatson2009-06-262-5/+5
| | | | | | than IF_ADDR_LOCK()/IF_ADDR_UNLOCK() when iterating ifp->if_addrhead. MFC after: 6 weeks
* Define four wrapper functions for interface address locking,rwatson2009-06-262-0/+44
| | | | | | | | | | | if_addr_rlock() and if_addr_runlock() for regular address lists, and if_maddr_rlock() and if_maddr_runlock() for multicast address lists. We will use these in various kernel modules to avoid encoding specific type and locking strategy information into modules that currently use IF_ADDR_LOCK() and IF_ADDR_UNLOCK() directly. MFC after: 6 weeks
* Convert netisr to use dynamic per-CPU storage (DPCPU) instead of sizingrwatson2009-06-261-21/+40
| | | | | | | | | arrays to [MAXCPU], offering moderate memory savings. In some places, this requires using CPU_ABSENT() to handle less common platforms with sparse CPU IDs. In several places, assert that the selected CPUID for work placement or statistics is not CPU_ABSENT() to be on the safe side. Discussed with: bz, jeff
* Change the type of uio_resid member of struct uio from int to ssize_t.kib2009-06-252-2/+2
| | | | | | | | Note that this does not actually enable full-range i/o requests for 64 architectures, and is done now to update KBI only. Tested by: pho Reviewed by: jhb, bde (as part of the review of the bigger patch)
* Add a new global rwlock, in_ifaddr_lock, which will synchronize use of therwatson2009-06-252-1/+7
| | | | | | | | | | | | | | | | | | | in_ifaddrhead and INADDR_HASH address lists. Previously, these lists were used unsynchronized as they were effectively never changed in steady state, but we've seen increasing reports of writer-writer races on very busy VPN servers as core count has gone up (and similar configurations where address lists change frequently and concurrently). For the time being, use rwlocks rather than rmlocks in order to take advantage of their better lock debugging support. As a result, we don't enable ip_input()'s read-locking of INADDR_HASH until an rmlock conversion is complete and a performance analysis has been done. This means that one class of reader-writer races still exists. MFC after: 6 weeks Reviewed by: bz
* Merge from p4: CH154790,154793,154874bz2009-06-241-0/+728
| | | | | | | | | Import if_epair(4), a virtual cross-over Ethernet-like interface pair. Note these files are 1:1 from p4 and not yet connected to the build not knowing about the new netisr interface. Sponsored by: The FreeBSD Foundation
* Add 10Gbase-T to known ethernet media types.np2009-06-241-0/+3
| | | | | Approved by: gnn (mentor) MFC after: 1 week.
* About to add 10Gbase-T to known media types, this is just a whitespacenp2009-06-241-9/+9
| | | | | | cleanup before that commit. No functional impact. Approved by: gnn (mentor)
* In if_setlladdr(), use IF_ADDR_LOCK() and ifaddr references to improverwatson2009-06-241-3/+15
| | | | | | the safety of link layer address manipulation. MFC after: 6 weeks
* Break at_ifawithnet() into two variants:rwatson2009-06-242-1/+6
| | | | | | | | | | | | | | | - at_ifawithnet(), which acquires an locks it needs and returns an at_ifaddr reference. - at_ifawithnet_locked(), which relies on the caller locking at_ifaddr_list, and returns a pointer rather than a reference. Update various consumers to prefer one or the other, including ether and fddi output, to properly release at_ifaddr references. Rework at_control() to manage locking and references in a manner identical to in_control(). MFC after: 6 weeks
* Lock if_addrhead when iterating, and where necessary acquire and releaserwatson2009-06-241-21/+27
| | | | | | ifadr references in if_sppp. MFC after: 6 weeks
* Make stf_getsrcifa6() return a reference to an in6_ifaddr rather thanrwatson2009-06-241-1/+9
| | | | | | a pointer, and dispose of the references when no longer needed. MFC after: 6 weeks
* Modify most routines returning 'struct ifaddr *' to return referencesrwatson2009-06-233-20/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | rather than pointers, requiring callers to properly dispose of those references. The following routines now return references: ifaddr_byindex ifa_ifwithaddr ifa_ifwithbroadaddr ifa_ifwithdstaddr ifa_ifwithnet ifaof_ifpforaddr ifa_ifwithroute ifa_ifwithroute_fib rt_getifa rt_getifa_fib IFP_TO_IA ip_rtaddr in6_ifawithifp in6ifa_ifpforlinklocal in6ifa_ifpwithaddr in6_ifadd carp_iamatch6 ip6_getdstifaddr Remove unused macro which didn't have required referencing: IFP_TO_IA6 This closes many small races in which changes to interface or address lists while an ifaddr was in use could lead to use of freed memory (etc). In a few cases, add missing if_addr_list locking required to safely acquire references. Because of a lack of deep copying support, we accept a race in which an in6_ifaddr pointed to by mbuf tags and extracted with ip6_getdstifaddr() doesn't hold a reference while in transmit. Once we have mbuf tag deep copy support, this can be fixed. Reviewed by: bz Obtained from: Apple, Inc. (portions) MFC after: 6 weeks (portions)
* After cleaning up rt_tables from vnet.h and cleaning up opt_route.hbz2009-06-236-6/+0
| | | | | a lot of files no longer need route.h either. Garbage collect them. While here remove now unneeded vnet.h #includes as well.
* Remove duplicate #include <net/route.h> from the middle of the file.bz2009-06-231-1/+0
|
* V_irtualize flowtable state.zec2009-06-222-100/+192
| | | | | | | | | | | | This change should make options VIMAGE kernel builds usable again, to some extent at least. Note that the size of struct vnet_inet has changed, though in accordance with one-bump-per-day policy we didn't update the __FreeBSD_version number, given that it has already been touched by r194640 a few hours ago. Reviewed by: bz Approved by: julian (mentor)
* Updates after r194640:bz2009-06-221-4/+0
| | | | | | - shrink size guards for vnet_net. vnet_rtable does not need size guards as it is self-contained. - remove a bunch of defines from vnet.h no longer valid.
* Move virtualization of routing related variables into their ownbz2009-06-223-16/+47
| | | | | | | | | | | Vimage module, which had been there already but now is stateful. All variables are now file local; so this further limits the global spreading of routing related things throughout the kernel. Add a missing function local variable in case of MPATHing. Reviewed by: zec
* Collect all VIMAGE_GLOBALS variables in one place.bz2009-06-222-9/+4
| | | | | | | | | | | | No longer export rt_tables as all lookups go through rt_tables_get_rnh(). We cannot make rt_tables (and rtstat, rttrash[1]) static as netstat -r (-rs[1]) would stop working on a stripped VIMAGE_GLOBALS kernel. Reviewed by: zec Presumably broken by: phk 13.5y ago in r12820 [1]
* Add a new function, ifa_ifwithaddr_check(), which rather than returningrwatson2009-06-223-3/+18
| | | | | | | | | | a pointer to an ifaddr matching the passed socket address, returns a boolean indicating whether one was present. In the (near) future, ifa_ifwithaddr() will return a referenced ifaddr rather than a raw ifaddr pointer, and the new wrapper will allow callers that care only about the boolean condition to avoid having to free that reference. MFC after: 3 weeks
* After the update to fxp(4) in r194573 we should no longer needbz2009-06-221-1/+0
| | | | | | | | this DELAY(100) hack introduced in r56938. Thanks to: yongari MFC after: 6 weeks X-MFC note: not before the fxp(4) changes
* Clean up common ifaddr management:rwatson2009-06-214-35/+47
| | | | | | | | | | | | | | - Unify reference count and lock initialization in a single function, ifa_init(). - Move tear-down from a macro (IFAFREE) to a function ifa_free(). - Move reference count bump from a macro (IFAREF) to a function ifa_ref(). - Instead of using a u_int protected by a mutex to refcount(9) for reference count management. The ifa_mtx is now used for exactly one ioctl, and possibly should be removed. MFC after: 3 weeks
* Switch cmd argument to u_long. This matches what if_ethersubr.c does andrdivacky2009-06-219-9/+9
| | | | | | | allows the code to compile cleanly on amd64 with clang. Reviewed by: rwatson Approved by: ed (mentor)
* In non-debugging mode make this define (void)0 instead of nothing. Thisrdivacky2009-06-211-1/+1
| | | | | | | | | | helps to catch bugs like the below with clang. if (cond); <--- note the trailing ; something(); Approved by: ed (mentor) Discussed on: current@
* add helper function for flushing software queueskmacy2009-06-191-1/+15
|
* Implement the -z (zero counters) option for the various bpf counters.csjp2009-06-191-1/+45
| | | | | | | | Add necessary changes to the kernel for this (basically introduce a bpf_zero_counters() function). As well, update the man page. MFC after: 1 month Discussed with: rwatson
* Add explicit includes for jail.h to the files that need them andbz2009-06-172-0/+2
| | | | remove the "hidden" one from vimage.h.
* Add the explicit include of vimage.h to another five .c files stillbz2009-06-172-0/+2
| | | | | | | missing it. Remove the "hidden" kernel only include of vimage.h from ip_var.h added with the very first Vimage commit r181803 to avoid further kernel poisoning.
* r193336 moved ifq_detach to if_free which broke if_alloc followedsam2009-06-152-7/+6
| | | | | | | | by if_free (w/o doing if_attach); move ifq_attach to if_alloc and rename ifq_attach/detach to ifq_init/ifq_delete to better identify their purpose Reviewed by: jhb, kmacy
* Get vnets from creds instead of threads where they're available, and fromjamie2009-06-151-1/+1
| | | | | | | passed threads instead of curthread. Reviewed by: zec, julian Approved by: bz (mentor)
* Manage vnets via the jail system. If a jail is given the booleanjamie2009-06-152-2/+18
| | | | | | | | | | | parameter "vnet" when it is created, a new vnet instance will be created along with the jail. Networks interfaces can be moved between prisons with an ioctl similar to the one that moves them between vimages. For now vnets will co-exist under both jails and vimages, but soon struct vimage will be going away. Reviewed by: zec, julian Approved by: bz (mentor)
* Add an optional callback function that will be invoked when a per-CPUbz2009-06-142-0/+6
| | | | | | | | | | queue was drained. It will never fire for a directly dispatched packet. You will most likely never want to use this for any ordinary netisr usage and you will never blame netisr in case you try to use it and it does not work as expected. Reviewed by: rwatson
* Garbage collect an extern for a non-existent variable.bz2009-06-121-4/+2
| | | | | | While here let the comment end in a '.' and mark the #endif of _KERNEL. Reviewed by: rwatson (as part of a larger patch)
OpenPOWER on IntegriCloud