summaryrefslogtreecommitdiffstats
path: root/sys/net/if.c
Commit message (Collapse)AuthorAgeFilesLines
* Use the DB_SHOW_ALL_COMMAND() macro to register the formerly 'show ifnets'bz2010-02-241-1/+1
| | | | | | | | in the db_show_all_table as 'show all ifnets' and with that follow the convention for showing complete lists. Submitted by: thompsa MFC after: 3 days
* Start to implement ifnet DDB support:bz2010-02-201-0/+81
| | | | | | | | | | | | | | - 'show ifnets' prints a list of ifnet *s per virtual network stack, - 'show ifnet <struct ifnet *>' prints fields matching the given ifp. We do not yet print the complete set of fields and might want to factor this out to an extra if_debug.c file in case this grows a lot[1]. We may also want to grow 'show ifnet <if_xname>' support[1]. Sponsored by: ISPsystem Suggested by: rwatson [1] Reviewed by: rwatson MFC after: 5 days
* Enhance a panic string to contain more useful debugging information.bz2010-02-201-1/+2
| | | | | | Sponsored by: ISPsystem Reviewed by: rwatson MFC after: 5 days
* Revised revision 199201 (add interface description capability as inspireddelphij2010-01-271-0/+70
| | | | | | | | by OpenBSD), based on comments from many, including rwatson, jhb, brooks and others. Sponsored by: iXsystems, Inc. MFC after: 1 month
* While flushing the multicast filter of an interface, do not zero the relevantsyrinx2010-01-241-2/+19
| | | | | | | | | | ifmultiaddr structures' reference to the parent interface, unless the parent interface is really detaching. While here, program only link layer multicast filters to a wlan's hardware parent interface. PR: kern/142391, kern/142392 Reviewed by: sam, rpaolo, bms MFC after: 1 week
* Declare a new EVENTHANDLER called iflladdr_event which signals that the L2thompsa2010-01-181-0/+1
| | | | | | | | | | | | | address on an interface has changed. This lets stacked interfaces such as vlan(4) detect that their lower interface has changed and adjust things in order to keep working. Previously this situation broke at least vlan(4) and lagg(4) configurations. The EVENTHANDLER_INVOKE call was not placed within if_setlladdr() due to the risk of a loop. PR: kern/142927 Submitted by: Nikolay Denev
* The devices that supported EVFILT_NETDEV kqueue filters were removed inbrooks2009-12-311-9/+2
| | | | | | | | | | | | r195175. Remove all definitions, documentation, and usage. fifo_misc.c: Remove all kqueue tests as fifo_io.c performs all those that would have remained. Reviewed by: rwatson MFC after: 3 weeks X-MFC note: don't change vlan_link_state() function signature
* Change vlan interfaces to cope more usefully with the parent interface beingjhb2009-12-291-0/+10
| | | | | | | | | | | | | | renamed. Previously the vlan interfaces would lose their configuration as if the parent interface had been physically removed. Now vlan interfaces ignore rename events. - Add a new ifnet flag (IFF_RENAMING) that is set while an ifnet is being renamed. This flag can be checked in ifnet departure/arrival event handlers to treat rename events differently. - Change the ifnet departure event handler in the if_vlan(4) driver to ignore departure events due to a trunk interface being renamed. Reviewed by: brooks, rwatson MFC after: 1 week
* Remove if_timer/if_watchdog now that they are no longer used. The spacejhb2009-11-301-66/+0
| | | | | | | used by if_timer is reserved for expanding if_index to an int in the future. Reviewed by: rwatson, brooks
* Revert revision 199201 for now as it has introduced a kernel vulnerabilitydelphij2009-11-121-41/+0
| | | | and requires more polishing.
* Add interface description capability as inspired by OpenBSD.delphij2009-11-111-0/+41
| | | | MFC after: 3 months
* A wrong variable is used when setting up the interfaceqingli2009-09-201-2/+2
| | | | | | | | | address route, which broke source address selection in some code paths. Submitted by: noted by bz Reviewed by: hrs MFC after: immediately
* Self pointing routes are installed for configured interface addressesqingli2009-09-151-0/+53
| | | | | | | | | | and address aliases. After an interface is brought down and brought back up again, those self pointing routes disappeared. This patch ensures after an interface is brought back up, the loopback routes are reinstalled properly. Reviewed by: bz MFC after: immediately
* Add IFNET_HOLD reserved pointer value for the ifindex ifnet array,rwatson2009-08-261-13/+39
| | | | | | | | | | | | | | which allows an index to be reserved for an ifnet without making the ifnet available for management operations. Use this in if_alloc() while the ifnet lock is released between initial index allocation and completion of ifnet initialization. Add ifindex_free() to centralize the implementation of releasing an ifindex value. Use in if_free() and if_vmove(), as well as when releasing a held index in if_alloc(). Reviewed by: bz MFC after: 3 days
* Break out allocation of new ifindex values from if_alloc() and if_vmove(),rwatson2009-08-251-34/+42
| | | | | | | | | and centralize in a single function ifindex_alloc(). Assert the IFNET_WLOCK, and add missing IFNET_WLOCK in if_alloc(). This does not close all known races in this code. Reviewed by: bz MFC after: 3 days
* Make if_grow static -- it's not used outside of if.c, and with therwatson2009-08-241-1/+2
| | | | | | internals destined to change, it's better if it remains that way. MFC after: 3 days
* When moving ifnets from one vnet to another, and the ifnetzec2009-08-241-0/+15
| | | | | | | | | | | | has ifaddresses of AF_LINK type which thus have an embedded if_index "backpointer", we must update that if_index backpointer to reflect the new if_index that our ifnet just got assigned. This change affects only options VIMAGE builds. Submitted by: bz Reviewed by: bz Approved by: re (rwatson), julian (mentor)
* Rework global locks for interface list and index management, correctingrwatson2009-08-231-36/+58
| | | | | | | | | | | | | | several critical bugs, including race conditions and lock order issues: Replace the single rwlock, ifnet_lock, with two locks, an rwlock and an sxlock. Either can be held to stablize the lists and indexes, but both are required to write. This allows the list to be held stable in both network interrupt contexts and sleepable user threads across sleeping memory allocations or device driver interactions. As before, writes to the interface list must occur from sleepable contexts. Reviewed by: bz, julian MFC after: 3 days
* Appease VNET_DEBUG - in if_vmove we temporarily switch i.e.zec2009-08-141-1/+1
| | | | | | | recurse from one vnet to another which is OK, so no need to flood the console with warnings here. Approved by: re (rwatson), julian (mentor)
* Merge the remainder of kern_vimage.c and vimage.h into vnet.c andrwatson2009-08-011-1/+0
| | | | | | | | | | vnet.h, we now use jails (rather than vimages) as the abstraction for virtualization management, and what remained was specific to virtual network stacks. Minor cleanups are done in the process, and comments updated to reflect these changes. Reviewed by: bz Approved by: re (vimage blanket)
* Make the in-kernel logic for the SIOCSIFVNET, SIOCSIFRVNET ioctlsbz2009-07-261-2/+90
| | | | | | | | | | | | | | | | | | | | (ifconfig ifN (-)vnet <jname|jid>) work correctly. Move vi_if_move to if.c and split it up into two functions(*), one for each ioctl. In the reclaim case, correctly set the vnet before calling if_vmove. Instead of silently allowing a move of an interface from the current vnet to the current vnet, return an error. (*) There is some duplicate interface name checking before actually moving the interface between network stacks without locking and thus race prone. Ideally if_vmove will correctly and automagically handle these in the future. Suggested by: rwatson (*) Approved by: re (kib)
* Introduce and use a sysinit-based initialization scheme for virtualrwatson2009-07-231-35/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | network stacks, VNET_SYSINIT: - Add VNET_SYSINIT and VNET_SYSUNINIT macros to declare events that will occur each time a network stack is instantiated and destroyed. In the !VIMAGE case, these are simply mapped into regular SYSINIT/SYSUNINIT. For the VIMAGE case, we instead use SYSINIT's to track their order and properties on registration, using them for each vnet when created/ destroyed, or immediately on module load for already-started vnets. - Remove vnet_modinfo mechanism that existed to serve this purpose previously, as well as its dependency scheme: we now just use the SYSINIT ordering scheme. - Implement VNET_DOMAIN_SET() to allow protocol domains to declare that they want init functions to be called for each virtual network stack rather than just once at boot, compiling down to DOMAIN_SET() in the non-VIMAGE case. - Walk all virtualized kernel subsystems and make use of these instead of modinfo or DOMAIN_SET() for init/uninit events. In some cases, convert modular components from using modevent to using sysinit (where appropriate). In some cases, do minor rejuggling of SYSINIT ordering to make room for or better manage events. Portions submitted by: jhb (VNET_SYSINIT), bz (cleanup) Discussed with: jhb, bz, julian, zec Reviewed by: bz Approved by: re (VIMAGE blanket)
* Normalize field naming for struct vnet, fix two debugging printfs thatrwatson2009-07-191-2/+2
| | | | | | | print them. Reviewed by: bz Approved by: re (kensmith, kib)
* Reimplement and/or implement vnet list locking by replacing a mostlyrwatson2009-07-191-2/+2
| | | | | | | | | | | | | | | | | | | | | | unused custom mutex/condvar-based sleep locks with two locks: an rwlock (for non-sleeping use) and sxlock (for sleeping use). Either acquired for read is sufficient to stabilize the vnet list, but both must be acquired for write to modify the list. Replace previous no-op read locking macros, used in various places in the stack, with actual locking to prevent race conditions. Callers must declare when they may perform unbounded sleeps or not when selecting how to lock. Refactor vnet sysinits so that the vnet list and locks are initialized before kernel modules are linked, as the kernel linker will use them for modules loaded by the boot loader. Update various consumers of these KPIs based on whether they may sleep or not. Reviewed by: bz Approved by: re (kib)
* Remove the interim vimage containers, struct vimage and struct procg,jamie2009-07-171-15/+2
| | | | | | and the ioctl-based interface that supported them. Approved by: re (kib), bz (mentor)
* Remove unused VNET_SET() and related macros; only VNET_GET() isrwatson2009-07-161-2/+2
| | | | | | | | | ever actually used. Rename VNET_GET() to VNET() to shorten variable references. Discussed with: bz, julian Reviewed by: bz Approved by: re (kensmith, kib)
* Build on Jeff Roberson's linker-set based dynamic per-CPU allocatorrwatson2009-07-141-69/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables. Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing the in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of a the kernel linker. Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided. This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS. Bump __FreeBSD_version and update UPDATING. Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)
* Remove support for the /dev/net/* per-interface devices. They servebrooks2009-06-291-172/+5
| | | | | | | | | | | | | | | | | little purpose and are unused in the base system. The IOCTL functionality is entirely duplicated and routing sockets provide a richer interface than the kqueue functionality. Further, it is not practical for these devices to be made sensible in the face of VIMAGE. Bump __FreeBSD_version on the off chance that there is any code out there that actually uses this stuff. Reviewed by: rwatson Discussed with: bz, zec Approved by: re@ (kensmith)
* Remove unnecessary include of kdb.h that snuck in during ifaddr refcountrwatson2009-06-271-1/+0
| | | | | | | work. Reported by: pluknet <pluknet at gmail.com> Approved by: re (kib)
* Define four wrapper functions for interface address locking,rwatson2009-06-261-0/+34
| | | | | | | | | | | if_addr_rlock() and if_addr_runlock() for regular address lists, and if_maddr_rlock() and if_maddr_runlock() for multicast address lists. We will use these in various kernel modules to avoid encoding specific type and locking strategy information into modules that currently use IF_ADDR_LOCK() and IF_ADDR_UNLOCK() directly. MFC after: 6 weeks
* In if_setlladdr(), use IF_ADDR_LOCK() and ifaddr references to improverwatson2009-06-241-3/+15
| | | | | | the safety of link layer address manipulation. MFC after: 6 weeks
* Modify most routines returning 'struct ifaddr *' to return referencesrwatson2009-06-231-11/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | rather than pointers, requiring callers to properly dispose of those references. The following routines now return references: ifaddr_byindex ifa_ifwithaddr ifa_ifwithbroadaddr ifa_ifwithdstaddr ifa_ifwithnet ifaof_ifpforaddr ifa_ifwithroute ifa_ifwithroute_fib rt_getifa rt_getifa_fib IFP_TO_IA ip_rtaddr in6_ifawithifp in6ifa_ifpforlinklocal in6ifa_ifpwithaddr in6_ifadd carp_iamatch6 ip6_getdstifaddr Remove unused macro which didn't have required referencing: IFP_TO_IA6 This closes many small races in which changes to interface or address lists while an ifaddr was in use could lead to use of freed memory (etc). In a few cases, add missing if_addr_list locking required to safely acquire references. Because of a lack of deep copying support, we accept a race in which an in6_ifaddr pointed to by mbuf tags and extracted with ip6_getdstifaddr() doesn't hold a reference while in transmit. Once we have mbuf tag deep copy support, this can be fixed. Reviewed by: bz Obtained from: Apple, Inc. (portions) MFC after: 6 weeks (portions)
* Remove duplicate #include <net/route.h> from the middle of the file.bz2009-06-231-1/+0
|
* Move virtualization of routing related variables into their ownbz2009-06-221-3/+0
| | | | | | | | | | | Vimage module, which had been there already but now is stateful. All variables are now file local; so this further limits the global spreading of routing related things throughout the kernel. Add a missing function local variable in case of MPATHing. Reviewed by: zec
* Add a new function, ifa_ifwithaddr_check(), which rather than returningrwatson2009-06-221-2/+16
| | | | | | | | | | a pointer to an ifaddr matching the passed socket address, returns a boolean indicating whether one was present. In the (near) future, ifa_ifwithaddr() will return a referenced ifaddr rather than a raw ifaddr pointer, and the new wrapper will allow callers that care only about the boolean condition to avoid having to free that reference. MFC after: 3 weeks
* After the update to fxp(4) in r194573 we should no longer needbz2009-06-221-1/+0
| | | | | | | | this DELAY(100) hack introduced in r56938. Thanks to: yongari MFC after: 6 weeks X-MFC note: not before the fxp(4) changes
* Clean up common ifaddr management:rwatson2009-06-211-6/+33
| | | | | | | | | | | | | | - Unify reference count and lock initialization in a single function, ifa_init(). - Move tear-down from a macro (IFAFREE) to a function ifa_free(). - Move reference count bump from a macro (IFAREF) to a function ifa_ref(). - Instead of using a u_int protected by a mutex to refcount(9) for reference count management. The ifa_mtx is now used for exactly one ioctl, and possibly should be removed. MFC after: 3 weeks
* Switch cmd argument to u_long. This matches what if_ethersubr.c does andrdivacky2009-06-211-1/+1
| | | | | | | allows the code to compile cleanly on amd64 with clang. Reviewed by: rwatson Approved by: ed (mentor)
* r193336 moved ifq_detach to if_free which broke if_alloc followedsam2009-06-151-5/+4
| | | | | | | | by if_free (w/o doing if_attach); move ifq_attach to if_alloc and rename ifq_attach/detach to ifq_init/ifq_delete to better identify their purpose Reviewed by: jhb, kmacy
* Manage vnets via the jail system. If a jail is given the booleanjamie2009-06-151-2/+16
| | | | | | | | | | | parameter "vnet" when it is created, a new vnet instance will be created along with the jail. Networks interfaces can be moved between prisons with an ioctl similar to the one that moves them between vimages. For now vnets will co-exist under both jails and vimages, but soon struct vimage will be going away. Reviewed by: zec, julian Approved by: bz (mentor)
* carp(4) allows people to share a set of IP addresses and can onlybz2009-06-111-0/+8
| | | | | | | | | | | use IPv4/v6 for inter-node communication (according to my reading). Properly wrap the carp callouts in INET || INET6 and refelect this in sys/conf/files as well. While in theory this should be ok, it might be a bit optimistic to think that carp could build with inet6 only[1]. Discussed with: mlaier [1]
* Adapt vfs kqfilter to the shared vnode lock used by zfs write vop. Usekib2009-06-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | vnode interlock to protect the knote fields [1]. The locking assumes that shared vnode lock is held, thus we get exclusive access to knote either by exclusive vnode lock protection, or by shared vnode lock + vnode interlock. Do not use kl_locked() method to assert either lock ownership or the fact that curthread does not own the lock. For shared locks, ownership is not recorded, e.g. VOP_ISLOCKED can return LK_SHARED for the shared lock not owned by curthread, causing false positives in kqueue subsystem assertions about knlist lock. Remove kl_locked method from knlist lock vector, and add two separate assertion methods kl_assert_locked and kl_assert_unlocked, that are supposed to use proper asserts. Change knlist_init accordingly. Add convenience function knlist_init_mtx to reduce number of arguments for typical knlist initialization. Submitted by: jhb [1] Noted by: jhb [2] Reviewed by: jhb Tested by: rnoland
* After r193232 rt_tables in vnet.h are no longer indirectly dependent onbz2009-06-081-1/+0
| | | | | | | | | the ROUTETABLES kernel option thus there is no need to include opt_route.h anymore in all consumers of vnet.h and no longer depend on it for module builds. Remove the hidden include in flowtable.h as well and leave the two explicit #includes in ip_input.c and ip_output.c.
* Introduce an infrastructure for dismantling vnet instances.zec2009-06-081-1/+25
| | | | | | | | | | | | | | | | | | | | | | | | | Vnet modules and protocol domains may now register destructor functions to clean up and release per-module state. The destructor mechanisms can be triggered by invoking "vimage -d", or a future equivalent command which will be provided via the new jail framework. While this patch introduces numerous placeholder destructor functions, many of those are currently incomplete, thus leaking memory or (even worse) failing to stop all running timers. Many of such issues are already known and will be incrementaly fixed over the next weeks in smaller incremental commits. Apart from introducing new fields in structs ifnet, domain, protosw and vnet_net, which requires the kernel and modules to be rebuilt, this change should have no impact on nooptions VIMAGE builds, since vnet destructors can only be called in VIMAGE kernels. Moreover, destructor functions should be in general compiled in only in options VIMAGE builds, except for kernel modules which can be safely kldunloaded at run time. Bump __FreeBSD_version to 800097. Reviewed by: bz, julian Approved by: rwatson, kib (re), julian (mentor)
* Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERICrwatson2009-06-051-1/+0
| | | | | | | | and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include. Discussed with: pjd
* move ifq_detach from if_detach to if_free; this permits callers tosam2009-06-021-3/+1
| | | | | | | reference if_snd in the period between detach+free which helps simplify detach code Reviewed by: jhb, rwatson
* Convert the two dimensional array to be malloced and introducebz2009-06-011-1/+2
| | | | | | | | | | | | | | | | an accessor function to get the correct rnh pointer back. Update netstat to get the correct pointer using kvm_read() as well. This not only fixes the ABI problem depending on the kernel option but also permits the tunable to overwrite the kernel option at boot time up to MAXFIBS, enlarging the number of FIBs without having to recompile. So people could just use GENERIC now. Reviewed by: julian, rwatson, zec X-MFC: not possible
* Introduce an interm userland-kernel API for creating vnets andzec2009-05-311-0/+15
| | | | | | | | | | | | | | | | | | | | | | assigning ifnets from one vnet to another. Deletion of vnets is not yet supported. The interface is implemented as an ioctl extension so that no syscalls had to be introduced. This should be acceptable given that the new interface will be used for a short / interim period only, until the new jail management framwork gains the capability of managing vnets. This method for managing vimages / vnets has been in use for the past 7 years without any observable issues. The userland tool to be used in conjunction with the interim API can be found in p4: //depot/projects/vimage-commit2/src/usr.sbin/vimage/... and will most probably never get commited to svn. While here, bump copyright notices in kern_vimage.c and vimage.h to cover work done in year 2009. Approved by: julian (mentor) Discussed with: bz, rwatson
* Set ifp->if_afdata_initialized to 0 while holding IF_AFDATA_LOCK on ifp,zec2009-05-221-1/+1
| | | | | | | not after the lock has been released. Reviewed by: bz Discussed with: rwatson
* Introduce the if_vmove() function, which will be used in the futurezec2009-05-221-80/+175
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | for reassigning ifnets from one vnet to another. if_vmove() works by calling a restricted subset of actions normally executed by if_detach() on an ifnet in the current vnet, and then switches to the target vnet and executes an appropriate subset of if_attach() actions there. if_attach() and if_detach() have become wrapper functions around if_attach_internal() and if_detach_internal(), where the later variants have an additional argument, a flag indicating whether a full attach or detach sequence is to be executed, or only a restricted subset suitable for moving an ifnet from one vnet to another. Hence, if_vmove() will not call if_detach() and if_attach() directly, but will call the if_detach_internal() and if_attach_internal() variants instead, with the vmove flag set. While here, staticize ifnet_setbyindex() since it is not referenced from outside of sys/net/if.c. Also rename ifccnt field in struct vimage to ifcnt, and do some minor whitespace garbage collection where appropriate. This change should have no functional impact on nooptions VIMAGE kernel builds. Reviewed by: bz, rwatson, brooks? Approved by: julian (mentor)
OpenPOWER on IntegriCloud