path: root/sys/net
Commit message | Author | Age | Files | Lines
* Garbage collect never used global, sysctl, externs. | bz | 2011-06-21 | 2 | -8/+0
  MFC after: 1 week
* Leave an extra comment about flowtable and IPv6 support rectifying a | bz | 2011-06-20 | 1 | -0/+1
  previous comment.

  MFC after: 1 week
* gre(4) was using a field in the softc to detect possible recursion. | bz | 2011-06-18 | 2 | -13/+72
  On MP systems this is not a usable solution anymore and could easily
  lead to false positives triggering enough logging that even the console
  was no longer usable (multiple parallel ping -f runs can do it).

  Switch to the suggested solution of using mbuf tags to carry per packet
  state between gre_output() invocations. Contrary to the proposed
  solution modelled after gif(4), only allocate one mbuf tag per packet
  rather than one per packet and per gre_output() pass through.

  As the sysctl to control the possible valid (gre in gre) nestings does
  no sanity checks, make sure to always allocate space in the mbuf tag
  for at least one, and at most 255 possible gre interfaces to detect
  loops in addition to the counter.

  Submitted by: Cristian KLEIN (cristi net.utcluj.ro) (original version)
  PR: kern/114714
  Reviewed by: Cristian KLEIN (cristi net.utcluj.ro)
  Reviewed by: Wooseog Choi (ben_choi hotmail.com)
  Sponsored by: Sandvine Incorporated
  MFC after: 1 week
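The per-packet loop detection described in the gre(4) entry above can be sketched in userspace C roughly as follows. This is only an illustration under stated assumptions: `struct loop_tag`, `loop_tag_init()` and `loop_tag_enter()` are hypothetical names, the real change stores its state in an mbuf tag, and the choice of `EIO` as the error value is an assumption rather than something the commit message states.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define LOOP_TAG_MAX 255	/* upper bound named in the commit message */

/* Hypothetical per-packet state; the kernel change keeps this in an mbuf tag. */
struct loop_tag {
	unsigned int count;		/* interfaces traversed so far */
	unsigned int max;		/* allowed nesting, clamped to 1..255 */
	const void *seen[LOOP_TAG_MAX];	/* interfaces already traversed */
};

/* Allocate-once step: the nesting limit is untrusted, so clamp it. */
void
loop_tag_init(struct loop_tag *t, int max_nesting)
{
	if (max_nesting < 1)
		max_nesting = 1;
	if (max_nesting > LOOP_TAG_MAX)
		max_nesting = LOOP_TAG_MAX;
	t->count = 0;
	t->max = (unsigned int)max_nesting;
}

/*
 * Called once per output pass. Returns 0 if the packet may go through
 * ifp, or EIO (assumed error code) on a loop or too deep a nesting.
 */
int
loop_tag_enter(struct loop_tag *t, const void *ifp)
{
	unsigned int i;

	for (i = 0; i < t->count; i++)
		if (t->seen[i] == ifp)
			return (EIO);	/* same interface twice: loop */
	if (t->count >= t->max)
		return (EIO);		/* counter exceeded */
	t->seen[t->count++] = ifp;
	return (0);
}
```

Because the state travels with the packet instead of living in the softc, two CPUs sending through the same interface at the same time do not see each other's state, which is the false positive the entry describes.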
* Grab one of the ifcap bits for netmap, and enable printing in ifconfig. | luigi | 2011-06-14 | 1 | -0/+10
  Document the fact that we might want an IFCAP_CANTCHANGE mask, even
  though the value is not yet used in sys/net/if.c (asked on -current a
  week ago, no feedback so I assume no objection).
* Set curvnet context in a callout-triggered code path. | zec | 2011-06-07 | 2 | -0/+6
  MFC after: 3 days
* Properly return an ENOBUFS error if a write to a tun(4) device fails | jhb | 2011-06-03 | 1 | -10/+7
  due to m_uiotombuf() failing. While here, trim unneeded error handling
  related to tuninit() since it can never fail.

  Submitted by: Martin Birgmeier la5lbtyi aon at
  Reviewed by: glebius
  MFC after: 1 week
* Add an optional netisr dispatch point at ether_input(), but set the | rwatson | 2011-06-01 | 1 | -1/+41
  default dispatch method to NETISR_DISPATCH_DIRECT in order to force
  direct dispatch. This adds a fairly negligible overhead without
  changing default behavior, but in the future will allow deferred or
  hybrid dispatch to other worker threads before link layer processing
  has taken place.

  For example, this could allow redistribution using RSS hashes without
  ethernet header cache line hits, if the NIC was unable to adequately
  implement load balancing due to too small a number of input queues,
  perhaps due to hard queueset counts of 1, 3, or 8, but in a modern
  system with 16-128 threads. This can happen on highly threaded systems,
  where you want an ithread per core, redistributing work to other
  queues, but also on virtualised systems where hardware hashing is (or
  is not) available, but only a single queue has been directed to one
  VCPU on a VM.

  Note: this adds a previously non-present assertion about the
  equivalence of the ifnet from which the packet is received, and the
  ifnet stamped in the mbuf header. I believe this assertion to generally
  be true, but we'll find out soon. If it's not, we might have to add
  additional overhead in some cases to add an m_tag with the originating
  ifnet pointer stored in it.

  Reviewed by: bz
  MFC after: 3 weeks
  Sponsored by: Juniper Networks, Inc.
* On multi-core, multi-threaded PPC systems, it is important that the threads | nwhitehorn | 2011-05-31 | 1 | -1/+1
  be brought up in the order they are enumerated in the device tree (in
  particular, that thread 0 on each core be brought up first). The SLIST
  through which we loop to start the CPUs has all of its entries added
  with SLIST_INSERT_HEAD(), which means it is in reverse order of
  enumeration and so AP startup would always fail in such situations
  (causing a machine check or RTAS failure). Fix this by changing the
  SLIST into an STAILQ, and inserting new CPUs at the end.

  Reviewed by: jhb
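The ordering difference behind the PPC entry above can be reproduced with the queue(3) macros from `<sys/queue.h>`. The sketch below is a userspace illustration only: `struct cpu_ref` and the two helper functions are hypothetical and are not the kernel's data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

#define MAX_CPUS 8

/* Hypothetical stand-in for a per-CPU record. */
struct cpu_ref {
	int id;				/* enumeration order */
	SLIST_ENTRY(cpu_ref) s_link;
	STAILQ_ENTRY(cpu_ref) q_link;
};

SLIST_HEAD(cpu_slist, cpu_ref);
STAILQ_HEAD(cpu_stailq, cpu_ref);

/* Head insertion: returns the id that a list walk would visit first. */
int
first_started_slist(int n)
{
	struct cpu_ref cpus[MAX_CPUS];
	struct cpu_slist head;
	int i;

	assert(n > 0 && n <= MAX_CPUS);
	SLIST_INIT(&head);
	for (i = 0; i < n; i++) {
		cpus[i].id = i;
		SLIST_INSERT_HEAD(&head, &cpus[i], s_link);
	}
	return (SLIST_FIRST(&head)->id);	/* last enumerated CPU */
}

/* Tail insertion: the walk starts with the first enumerated CPU. */
int
first_started_stailq(int n)
{
	struct cpu_ref cpus[MAX_CPUS];
	struct cpu_stailq head;
	int i;

	assert(n > 0 && n <= MAX_CPUS);
	STAILQ_INIT(&head);
	for (i = 0; i < n; i++) {
		cpus[i].id = i;
		STAILQ_INSERT_TAIL(&head, &cpus[i], q_link);
	}
	return (STAILQ_FIRST(&head)->id);
}
```

With four CPUs enumerated as 0..3, the SLIST walk starts at CPU 3 while the STAILQ walk starts at CPU 0. An SLIST has no tail pointer, so appending in enumeration order needs a different list type, which is why the fix switches to STAILQ.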
* Rework netisr policy mechanism so that per-protocol dispatch policies can | rwatson | 2011-05-24 | 3 | -57/+247
  be represented:

  - A single policy namespace is defined, consisting of four possible
    policies: "default" to use the global default, "deferred" to force
    deferred dispatch, "direct" to employ direct dispatch where possible,
    and "hybrid" which makes a dynamic decision based on CPU affinity,
    ordering, etc. Routines are implemented to convert between strings
    and an integer namespace.

  - A new global variable, netisr_dispatch_policy, subsumes existing
    global variables for direct dispatch, forced direct dispatch, etc,
    and is used for explicit policy interpretation and composition. Old
    variables remain so that they can be exported by legacy sysctls for
    use by old netstat(1) binaries. A new sysctl and tunable,
    netisr.dispatch.policy, accepts the above strings for specifying a
    global policy default.

  - The protocol registration structure, netisr_handler, grows an
    nh_dispatch field, which accepts a per-protocol policy override. The
    default value is '0', which corresponds to "default", meaning that
    protocols will accept the global default policy unless otherwise
    specified.

  - Policies are now interpreted and composed explicitly at various
    points in packet dispatch; protocol policies override global
    policies.

  - Protocols grow the ability to express a non-opinion about affinity
    even when implementing m2cpuid by returning NETISR_CPUID_NONE. In
    that case, the framework falls back on source ordering, rather than
    simply using the current CPU.

  These changes are in support of allowing link layer re-dispatch based
  on RSS or similar hashes provided by NICs, especially in the case where
  the number of hardware receive queues matches hardware core count,
  rather than hardware thread count, requiring further software
  redistribution (i.e., on RMI XLR).

  MFC after: 3 weeks
  Reviewed by: bz
  Sponsored by: Juniper Networks, Inc.
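The policy namespace and the "protocol overrides global" composition from the netisr entry above might look roughly like this. It is a sketch, not the kernel code: the `POLICY_*` constants and the three functions are hypothetical names, and apart from "default" being 0 (which the commit message states) the integer values are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical integer namespace; only DEFAULT == 0 comes from the log. */
#define POLICY_INVALID	(-1)
#define POLICY_DEFAULT	0
#define POLICY_DEFERRED	1
#define POLICY_DIRECT	2
#define POLICY_HYBRID	3

static const char *const policy_names[] = {
	"default", "deferred", "direct", "hybrid"
};
#define POLICY_COUNT	(sizeof(policy_names) / sizeof(policy_names[0]))

/* String to integer; returns POLICY_INVALID for unknown strings. */
int
policy_from_str(const char *s)
{
	size_t i;

	for (i = 0; i < POLICY_COUNT; i++)
		if (strcmp(s, policy_names[i]) == 0)
			return ((int)i);
	return (POLICY_INVALID);
}

/* Integer to string; returns NULL for values outside the namespace. */
const char *
policy_to_str(int policy)
{
	if (policy < 0 || (size_t)policy >= POLICY_COUNT)
		return (NULL);
	return (policy_names[policy]);
}

/*
 * Composition: a protocol that registered "default" accepts the global
 * policy; any other protocol value overrides it. This sketch assumes
 * the global policy itself is never POLICY_DEFAULT.
 */
int
policy_effective(int global_policy, int proto_policy)
{
	return (proto_policy != POLICY_DEFAULT ? proto_policy : global_policy);
}
```

Keeping "default" at 0 means a zero-initialized registration structure inherits the global policy without the protocol having to set anything.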
* Allow for vlan(4) interfaces with MTU of 1500 bytes to be configured | zec | 2011-05-24 | 1 | -0/+4
  on top of epair(4) virtual interfaces, since there's no physical
  hardware associated with epair interfaces which would imply any
  constraints on MTU sizes.

  MFC after: 3 days
* Let epair(4) virtual interfaces report fake link / media status, | zec | 2011-05-24 | 1 | -0/+37
  by borrowing the skeleton of if_media manipulation and reporting code
  from if_lagg(4). The main motivation behind this change is to allow for
  epair(4) interfaces to participate in STP if_bridge(4) configurations.

  Reviewed by: bz
  MFC after: 3 days
* The statically configured (permanent) ARP entries are removed when an | qingli | 2011-05-20 | 2 | -4/+6
  interface is brought down, even though the interface address is still
  valid. This patch maintains the permanent ARP entries as long as the
  interface address (having the same prefix as that of the ARP entries)
  is valid.

  Reviewed by: delphij
  MFC after: 5 days
* - Add 10baseT as an alias for 10baseT/UTP. | marius | 2011-05-15 | 1 | -0/+25
  - Add shorthand aliases for common media+option combinations as
    announced by miibus(4) so that one can actually supply the media
    strings found in the dmesg output to ifconfig(8).

  Obtained from: NetBSD (in principle)
  MFC after: 2 weeks
* Fix white space nits and style | yongari | 2011-05-06 | 1 | -9/+7
* Do not increment collision counter if transmit has failed. | yongari | 2011-05-06 | 1 | -3/+1
  A transmission error in tun(4) is a queueing error (i.e. ENOBUFS) and
  it has nothing to do with collisions.

  Reported by: Zeus V Panchenko (zeus <> ibs dot dn dot ua)
* LACP frames must not be sent VLAN-tagged, check for that before processing. | thompsa | 2011-04-30 | 1 | -1/+1
  PR: kern/156743
  Submitted by: Dmitrij Tejblum
  MFC after: 1 week
* Make various (pseudo) interfaces compile without INET in the kernel | bz | 2011-04-27 | 2 | -7/+15
  by adding appropriate #ifdefs. For module builds the framework needs
  adjustments for at least carp.

  Reviewed by: gnn
  Sponsored by: The FreeBSD Foundation
  Sponsored by: iXsystems
  MFC after: 4 days
* When removing ifnets, we should first remove the reference to ifnet | glebius | 2011-04-04 | 1 | -9/+10
  from the interface index, then decrease refcount, not vice versa.

  Otherwise there is a race (reproducible) when if_free_internal()
  contends on IFNET_WLOCK(), and we get a zero-refed ifnet in the index
  for a long time. It may be picked up by some other thread running
  ifnet_byindex_ref(), which takes the ifnet from the index and bumps the
  refcount. When the reader drops the lock, if_free_internal() proceeds
  with the free. Then the reader tries to free it a second time.
* - Merge changes to the base system to support OFED. These include | jeff | 2011-03-21 | 7 | -65/+232
    a wider arg2 for sysctl, updates to vlan code, IFT_INFINIBAND, and
    other miscellaneous small features.
* Remove dead code. | dchagin | 2011-03-20 | 1 | -17/+0
  MFC after: 1 Week
* Ouch, newrt is used on the return path, my fault. | dchagin | 2011-03-19 | 1 | -1/+1
  Partially revert the previous change.

  MFC after: 1 Week
* Rearrange the rtalloc1_fib() code a bit. | dchagin | 2011-03-19 | 1 | -6/+6
  Initialize a variable when it is really needed. To avoid code
  duplication, move the miss label a line up and jump to it.

  MFC after: 1 Week
* Remove a now unused variable. | dchagin | 2011-03-19 | 1 | -2/+1
  MFC after: 1 Week
* Fix a panic that can happen when trying to destroy a lagg(4) with scheduler | eri | 2011-03-04 | 1 | -1/+2
  set to none.

  Approved by: thompsa (mentor)
  MFC after: 1 week
* Hide the outer IP addresses of tunnel interfaces (gif(4), gre(4)) | bz | 2011-03-02 | 2 | -0/+26
  from processes inside jails if the addresses do not belong to the jail.

  Originally reported by: Pieter de Boer via remko
  PR: kern/151119
  Tested by: Piotr KUCHARSKI (nospam 42.pl) [gif]
  MFC after: 1 week
* Fix typos - remove duplicate "the". | brucec | 2011-02-21 | 2 | -2/+2
  PR: bin/154928
  Submitted by: Eitan Adler <lists at eitanadler.com>
  MFC after: 3 days
* Mfp4 CH=177274,177280,177284-177285,177297,177324-177325 | bz | 2011-02-16 | 1 | -15/+34
  VNET socket push back: try to minimize the number of places where we
  have to switch vnets and narrow down the time we stay switched. Add
  assertions to the socket code to catch possibly unset vnets as seen in
  r204147.

  While this reduces the number of vnet recursions in some places like
  NFS, POSIX local sockets and some netgraph, .. recursions are
  impossible to fix.

  The current expectations are documented at the beginning of
  uipc_socket.c along with the other information there.

  Sponsored by: The FreeBSD Foundation
  Sponsored by: CK Software GmbH
  Reviewed by: jhb
  Tested by: zec
  Tested by: Mikolaj Golub (to.my.trociny gmail.com)
  MFC after: 2 weeks
* Mfp4 CH=177255: | bz | 2011-02-11 | 1 | -3/+12
  Re-sort the CURVNET_SET* macros in the non-VNET_DEBUG case to match the
  call order of the VNET_DEBUG case. Add the VNET_ASSERT() to the
  non-VNET_DEBUG case as well so that INVARIANTS will still catch
  problems.

  Sponsored by: The FreeBSD Foundation
  Sponsored by: CK Software GmbH
  Reviewed by: jhb
  MFC after: 2 weeks
* Mfp4 CH=177255: | bz | 2011-02-11 | 3 | -15/+24
  Make VNET_ASSERT() available with either VNET_DEBUG or INVARIANTS.
  Change the syntax to match KASSERT() to allow more flexible panic
  messages rather than having a printf with hardcoded arguments before
  panic. Adjust the few assertions we have to the new format (and enhance
  the output).

  Sponsored by: The FreeBSD Foundation
  Sponsored by: CK Software GmbH
  Reviewed by: jhb
  MFC after: 2 weeks
* Mfp4 CH=177255: | bz | 2011-02-11 | 1 | -2/+2
  Use __func__ rather than __FUNCTION__.

  MFC after: 2 weeks
* As info.rti_info[RTAX_DST] can point inside of rtm we must not free the rtm | mlaier | 2011-02-10 | 1 | -1/+3
  until rt_dispatch is done with the sockaddr.

  Found by: memguard
  MFC after: 3 days
* Fix a LOR by dropping the global ifnet locks while allocating a new ifnet | jhb | 2011-01-24 | 1 | -6/+20
  table in if_grow(). The order of the SYSINITs for ifnet state was
  swapped so that the various locks were initialized before being used.

  Reviewed by: pluknet, bz
  MFC after: 2 weeks
* sysctl(8) should use the CTLTYPE to determine the type of data when | mdf | 2011-01-19 | 2 | -4/+5
  reading. (This was already done for writing to a sysctl). This requires
  all SYSCTL setups to specify a type. Most of them are now checked at
  compile-time.

  Remove SYSCTL_*X* sysctl additions as the print being in hex should be
  controlled by the -x flag to sysctl(8).

  Suggested by: bde
* sysctl(9) cleanup checkpoint: amd64 GENERIC builds cleanly. | mdf | 2011-01-12 | 3 | -7/+7
  Commit the net* piece.
* Remove unneeded includes of <sys/linker_set.h>. Other headers that use | jhb | 2011-01-11 | 1 | -1/+0
  it internally contain nested includes.

  Reviewed by: bde
* MfP4 CH=185246 [1]: | bz | 2011-01-09 | 1 | -0/+2
  Add FEATURE() to announce optional VIMAGE.

  MFC after: 3 days

  [1] for the moment put it in vnet.c.
* - Restore dropping the priority of syncer down to PPAUSE when it is idle. | jhb | 2011-01-06 | 1 | -0/+5
    This was lost when it was converted to using a condition variable
    instead of lbolt.
  - Drop the priority of flowtable down to PPAUSE when it is idle as well
    since it is a similar background task.

  MFC after: 2 weeks
* Teach ifconfig(8) the handy shared option shortcut aliases the NetBSD | marius | 2011-01-05 | 1 | -0/+9
  counterpart also takes, i.e. "fdx" for "full-duplex", "flow" for
  "flowcontrol", "hdx" for "half-duplex" as well as "loop" and "loopback"
  for "hw-loopback".

  MFC after: 1 week
* Fix whitespace. | marius | 2011-01-05 | 1 | -37/+35
  MFC after: 1 week
* Use NULL rather than 0 to invalidate a pointer. | bz | 2010-12-31 | 1 | -9/+2
  Rather than duplicating the LLE_FREE_LOCKED() macro code in LLE_FREE(),
  call it directly (like we do for the RT_* macros).

  Sponsored by: ISPsystem [1]
  Reviewed by: julian [1]
  MFC After: 1 week

  [1] Early 2010.
* Print the vnet pointer under DDB when iterating over flowtables of each | bz | 2010-12-31 | 1 | -0/+3
  virtual network stack instance.

  Sponsored by: ISPsystem [1]
  Reviewed by: julian [1]
  MFC after: 1 week

  [1] Early 2010.
* Move the increment operation under the lock and split the condition | bz | 2010-12-31 | 1 | -8/+10
  variable into two so that we can see on which one we are waiting. This
  might also more properly propagate the update of the flowclean_cycles
  flag and avoid "hangs" people were seeing.

  Suggested by: rwatson [1]
  Sponsored by: ISPsystem [1]
  Reviewed by: julian [1]
  Updated by: Mikolaj Golub (to.my.trociny gmail.com)
  Tested by: Mikolaj Golub (to.my.trociny gmail.com)
  MFC After: 1 week

  [1] Early 2010, initial version.
* Introduce and use a new VM interface for temporarily pinning pages. This | alc | 2010-12-25 | 1 | -6/+2
  new interface replaces the combined use of vm_fault_quick() and
  pmap_extract_and_hold() throughout the kernel.

  In collaboration with: kib@
* Add IFF_CANTCONFIG to IFF_CANTCHANGE so that it cannot be changed through | weongyo | 2010-12-07 | 1 | -1/+1
  ioctl(2).
* Introduce the IFF_CANTCONFIG interface flag to indicate that the interface | weongyo | 2010-12-07 | 1 | -1/+1
  isn't configurable in a meaningful way. This lets ifconfig(8) and other
  tools avoid code changes whenever IFT_USB-like interfaces are
  registered in the interface list.

  Reviewed by: brooks
  No objections: gavin, jkim
* o Swap descriptions for net.bpf.bufsize and net.bpf.maxbufsize. | maxim | 2010-11-24 | 1 | -2/+2
  PR: misc/152531
  MFC after: 1 week
* Allow for vlan(4) ifnets to have overlapping unit numbers if they are | zec | 2010-11-22 | 1 | -0/+42
  created in separate vnets.

  As a side-effect of having a separate if_cloner instance for each vnet,
  all vlan ifnets created in a vnet will be automatically destroyed when
  vnet teardown is initiated.

  Disallow SIOCSETVLAN and SIOCGETVLAN ioctls on vlan ifnets which are
  associated with physical ifnets residing in parent vnets.

  This is an interim vlan-specific solution which will be superseded by a
  more generic if_cloner V_irtualization change from p4.

  For nooptions VIMAGE builds, this should be a no-op change.

  Discussed with: bz
  MFC after: 3 days
* After some off-list discussion, revert a number of changes to the | dim | 2010-11-22 | 10 | -36/+32
  DPCPU_DEFINE and VNET_DEFINE macros, as these cause problems for
  various people working on the affected files. A better long-term
  solution is still being considered. This reversal may give some modules
  empty set_pcpu or set_vnet sections, but these are harmless.

  Changes reverted:

  ------------------------------------------------------------------------
  r215318 | dim | 2010-11-14 21:40:55 +0100 (Sun, 14 Nov 2010) | 4 lines

  Instead of unconditionally emitting .globl's for the __start_set_xxx
  and __stop_set_xxx symbols, only emit them when the set_vnet or
  set_pcpu sections are actually defined.

  ------------------------------------------------------------------------
  r215317 | dim | 2010-11-14 21:38:11 +0100 (Sun, 14 Nov 2010) | 3 lines

  Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout
  the tree.

  ------------------------------------------------------------------------
  r215316 | dim | 2010-11-14 21:23:02 +0100 (Sun, 14 Nov 2010) | 2 lines

  Add macros to define static instances of VNET_DEFINE and DPCPU_DEFINE.
* Add a missing ';' and change the debugging sysctl from xint to int. | bz | 2010-11-21 | 1 | -2/+2
  Submitted by: Mikolaj Golub (to.my.trociny gmail.com)
  MFC after: 3 days
* Instead of unconditionally emitting .globl's for the __start_set_xxx and | dim | 2010-11-14 | 1 | -3/+4
  __stop_set_xxx symbols, only emit them when the set_vnet or set_pcpu
  sections are actually defined.