summaryrefslogtreecommitdiffstats
path: root/sys/net
Commit message (Collapse)AuthorAgeFilesLines
* - Fix trivial typoeadler2012-01-144-4/+4
| | | | | Approved by: nwhitehorn MFC after: 3 days
* Clarify throughout the vlan(4) code the difference between a "tag" (therwatson2012-01-122-54/+61
| | | | | | | | | | | | | 802.1q-defined 16-bit VID, CFI, and PCP field in host by order) and a VLAN ID (VID). Tags go in packets. VIDs identify VLANs. No functional change is intended, so this should be safe to MFC. Further cleanup with functional changes will be committed separately (for example, renaming vlan_tag/vlan_tag_p, which modify the KPI and KBI). Reviewed by: bz Sponsored by: ADARA Networks, Inc. MFC after: 3 days
* Consumers of bpfdetach() expect it to remove all bpf_if structs from thelstewart2012-01-101-22/+31
| | | | | | | | | | | | | | | | | | | | | | | bpf_iflist list which reference the specified ifnet. The existing implementation only removes the first matching bpf_if found in the list, effectively leaking list entries if an ifnet has been bpfattach()ed multiple times with different DLTs. Fix the leak by performing the detach logic in a loop, stopping when all bpf_if structs referencing the specified ifnet have been detached and removed from the bpf_iflist list. Whilst here, also: - Remove the unnecessary "bp->bif_ifp == NULL" check, as a bpf_if should never exist in the list with a NULL ifnet pointer. - Except when INVARIANTS is in the kernel config, silently ignore the case where no bpf_if referencing the specified ifnet is found, as it is harmless and does not require user attention. Reviewed by: csjp MFC after: 1 week
* Convert the per-interface address list lock from a mutex to a reader/writerjhb2012-01-091-11/+10
| | | | | | lock. Reviewed by: bz
* Copy ifa->if_data to ifam->ifam_data. This was forgotten in r228571.glebius2012-01-081-0/+1
| | | | Submitted by: bz
* Move arprequest() declaration to if_ether.h.glebius2012-01-081-3/+0
|
* Since r228571 CARP is no longer an interface.glebius2012-01-061-8/+0
|
* Convert all users of IF_ADDR_LOCK to use new locking macros that specifyjhb2012-01-052-63/+63
| | | | | | | either a read lock or write lock. Reviewed by: bz MFC after: 2 weeks
* Add new variants of the IF_ADDR_*LOCK*() macros used for protectingjhb2012-01-051-2/+8
| | | | | | | | | interface address lists that distinguish read locks from write locks. To preserve the KPI, the previous operations are mapped to the write lock macros. The lock is still kept as a mutex for now. Reviewed by: bz MFC after: 2 weeks
* Refine last comment.rwatson2012-01-051-1/+1
| | | | | | Submitted by: joeld Sponsored by: ADARA Networks, Inc. MFC after: 3 days
* Add comment to the VLAN code about its integration with VIMAGE: we see whatrwatson2012-01-051-0/+7
| | | | | | | | | | the code is doing, we recognise the legitimacy of its goal, but we're not quite sure it's going about it the right way. More pondering is clearly required. Sponsored by: ADARA Networks, Inc. Discussed with: bz MFC after: 3 days
* Revert r228986 until it can be reworked to avoid panicing the kernel when thelstewart2011-12-312-193/+82
| | | | | | | same interface is attached multiple times with different DLTs, as is done in net80211 for example. Reported by: adrian
* - Introduce the net.bpf.tscfg sysctl tree and associated code so as to make onelstewart2011-12-302-82/+193
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | aspect of time stamp configuration per interface rather than per BPF descriptor. Prior to this, the order in which BPF devices were opened and the per descriptor time stamp configuration settings could cause non-deterministic and unintended behaviour with respect to time stamping. With the new scheme, a BPF attached interface's tscfg sysctl entry can be set to "default", "none", "fast", "normal" or "external". Setting "default" means use the system default option (set with the net.bpf.tscfg.default sysctl), "none" means do not generate time stamps for tapped packets, "fast" means generate time stamps for tapped packets using a hz granularity system clock read, "normal" means generate time stamps for tapped packets using a full timecounter granularity system clock read and "external" (currently unimplemented) means use the time stamp provided with the packet from an underlying source. - Utilise the recently introduced sysclock_getsnapshot() and sysclock_snap2bintime() KPIs to ensure the system clock is only read once per packet, regardless of the number of BPF descriptors and time stamp formats requested. Use the per BPF attached interface time stamp configuration to control if sysclock_getsnapshot() is called and whether the system clock read is fast or normal. The per BPF descriptor time stamp configuration is then used to control how the system clock snapshot is converted to a bintime by sysclock_snap2bintime(). - Remove all FAST related BPF descriptor flag variants. Performing a "fast" read of the system clock is now controlled per BPF attached interface using the net.bpf.tscfg sysctl tree. - Update the bpf.4 man page. Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project. For more information, see http://www.synclab.org/radclock/ In collaboration with: Julien Ridoux (jridoux at unimelb edu au)
* Update if_obytes and if_omcast after successful transmit.yongari2011-12-291-4/+8
| | | | | | | | | | While I'm here update if_oerrors if parent interface of vlan is not up and running. Previously it updated collision counter and it was confusing to interprete it. PR: kern/163478 Reviewed by: glebius, jhb Tested by: Joe Holden < lists <> rewt dot org dot uk >
* Provide ABI compatibility shim to enable configuring of addressesglebius2011-12-211-0/+8
| | | | | | with ifconfig(8) prior to r228571. Requested by: brooks
* Restore a feature that was present in 5.x and 6.x, and was cleared inglebius2011-12-201-0/+1
| | | | | | | | | | | | | | | | | | | | | | | 7.x, 8.x and 9.x with pf(4) imports: pfsync(4) should suppress CARP preemption, while it is running its bulk update. However, reimplement the feature in more elegant manner, that is partially inspired by newer OpenBSD: - Rename term "suppression" to "demotion", to match with OpenBSD. - Keep a global demotion factor, that can be raised by several conditions, for now these are: - interface goes down - carp(4) has problems with ip_output() or ip6_output() - pfsync performs bulk update - Unlike in OpenBSD the demotion factor isn't a counter, but is actual value added to advskew. The adjustment values for particular error conditions are also configurable, and their defaults are maximum advskew value, so a single failure bumps demotion to maximum. This is for POLA compatibility, and should satisfy most users. - Demotion factor is a writable sysctl, so user can do foot shooting, if he desires to.
* A major overhaul of the CARP implementation. The ip_carp.c was startedglebius2011-12-166-9/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | from scratch, copying needed functionality from the old implemenation on demand, with a thorough review of all code. The main change is that interface layer has been removed from the CARP. Now redundant addresses are configured exactly on the interfaces, they run on. The CARP configuration itself is, as before, configured and read via SIOCSVH/SIOCGVH ioctls. A new prefix created with SIOCAIFADDR or SIOCAIFADDR_IN6 may now be configured to a particular virtual host id, which makes the prefix redundant. ifconfig(8) semantics has been changed too: now one doesn't need to clone carpXX interface, he/she should directly configure a vhid on a Ethernet interface. To supply vhid data from the kernel to an application the getifaddrs(8) function had been changed to pass ifam_data with each address. [1] The new implementation definitely closes all PRs related to carp(4) being an interface, and may close several others. It also allows to run a single redundant IP per interface. Big thanks to Bjoern Zeeb for his help with inet6 part of patch, for idea on using ifam_data and for several rounds of reviewing! PR: kern/117000, kern/126945, kern/126714, kern/120130, kern/117448 Reviewed by: bz Submitted by: bz [1]
* Simplify rtrequest(RTM_ADD): ifa can't be NULL after rt_getifa_fib().glebius2011-12-151-9/+4
|
* Remove the unused if_free_type() function.brooks2011-12-092-22/+2
| | | | X-MFC after: never
* 1. Fix the handling of link reset while in netmap more.luigi2011-12-051-8/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A link reset now is completely transparent for the netmap client: even if the NIC resets its own ring (e.g. restarting from 0), the client will not see any change in the current rx/tx positions, because the driver will keep track of the offset between the two. 2. make the device-specific code more uniform across different drivers There were some inconsistencies in the implementation of the netmap support routines, now drivers have been aligned to a common code structure. 3. import netmap support for ixgbe . This is implemented as a very small patch for ixgbe.c (233 lines, 11 chunks, mostly comments: in total the patch has only 54 lines of new code) , as most of the code is in an external file sys/dev/netmap/ixgbe_netmap.h , following some initial comments from Jack Vogel about making changes less intrusive. (Note, i have emailed Jack multiple times asking if he had comments on this structure of the code; i got no reply so i assume he is fine with it). Support for other drivers (em, lem, re, igb) will come later. "ixgbe" is now the reference driver for netmap support. Both the external file (sys/dev/netmap/ixgbe_netmap.h) and the device-specific patches (in sys/dev/ixgbe/ixgbe.c) are heavily commented and should serve as a reference for other device drivers. Tested on i386 and amd64 with the pkt-gen program in tools/tools/netmap, the sender does 14.88 Mpps at 1050 Mhz and 14.2 Mpps at 900 MHz on an i7-860 with 4 cores and 82599 card. Haven't tried yet more aggressive optimizations such as adding 'prefetch' instructions in the time-critical parts of the code.
* Revert r227778 in preparation for committing reworked patches in its place.lstewart2011-11-292-165/+12
|
* Change the if_vlan driver to use if_transmit for forwarding packets to thejhb2011-11-281-83/+79
| | | | | | | | | parent interface. This avoids the overhead of queueing a packet to an IFQ only to immediately dequeue it again. Suggested by: np Reviewed by: brooks MFC after: 1 month
* - Use generic alloc_unr(9) allocator for if_clone, insteadglebius2011-11-282-78/+47
| | | | | | | | | | of hand-made. - When registering new cloner, check whether a cloner with same name already exist. - When allocating unit, also check with help of ifunit() whether such interface already exist or not. [1] PR: kern/162789 [1]
* Improve logging:glebius2011-11-221-2/+2
| | | | | | - don't hardcode function name - use LOG_DEBUG for such a debug message - print error value
* - When feed-forward clock support is compiled in, change the BPF header tolstewart2011-11-212-12/+165
| | | | | | | | | | | | | | | | | | contain both a regular timestamp obtained from the system clock and the current feed-forward ffcounter value. This enables new possibilities including comparison of timekeeping performance and timestamp correction during post processing. - Add the net.bpf.ffclock_tstamp sysctl to provide a choice between timestamping packets using the feedback or feed-forward system clock. Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project. For more information, see http://www.synclab.org/radclock/ Submitted by: Julien Ridoux (jridoux at unimelb edu au)
* Bring in support for netmap, a framework for very efficient packetluigi2011-11-172-0/+379
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I/O from userspace, capable of line rate at 10G, see http://info.iet.unipi.it/~luigi/netmap/ At this time I am bringing in only the generic code (sys/dev/netmap/ plus two headers under sys/net/), and some sample applications in tools/tools/netmap. There is also a manpage in share/man/man4 [1] In order to make use of the framework you need to build a kernel with "device netmap", and patch individual drivers with the code that you can find in sys/dev/netmap/head.diff The file will go away as the relevant pieces are committed to the various device drivers, which should happen in a few days after talking to the driver maintainers. Netmap support is available at the moment for Intel 10G and 1G cards (ixgbe, em/lem/igb), and for the Realtek 1G card ("re"). I have partial patches for "bge" and am starting to work on "cxgbe". Hopefully changes are trivial enough so interested third parties can submit their patches. Interested people can contact me for advice on how to add netmap support to specific devices. CREDITS: Netmap has been developed by Luigi Rizzo and other collaborators at the Universita` di Pisa, and supported by EU project CHANGE (http://www.change-project.eu/) The code is distributed under a BSD Copyright. [1] In my opinion is a bad idea to have all manpage in one directory. We should place kernel documentation in the same dir that contains the code, which would make it much simpler to keep doc and code in sync, reduce the clutter in share/man/ and incidentally is the policy used for all of userspace code. Makefiles and doc tools can be trivially adjusted to find the manpages in the relevant subdirs.
* Remove a few bits of FreeBSD 2.x compatibility code.rmh2011-11-141-3/+3
| | | | Approved by: kib (mentor)
* In r191367 the need for if_free_type() was removed and a new memberbrooks2011-11-114-8/+7
| | | | | | | | if_alloctype was used to store the origional interface type. Take advantage of this change by removing all existing uses of if_free_type() in favor of if_free(). MFC after: 1 Month
* Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.ed2011-11-0716-21/+26
| | | | | | The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.
* Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs.ed2011-11-078-11/+12
| | | | This means that their use is restricted to a single C file.
* Fix a use-after-free/redzone issue in the routing code.mlaier2011-11-031-12/+14
| | | | | | | Reported by (repeatedly): Mike Tancsa Prodded by (repeatedly): bz Forgotten by (repeatedly): mlaier MFC after: 2 weeks
* Add macro IF_DEQUEUE_ALL(ifq, m), that takes the entire mbuf chain offglebius2011-10-271-0/+12
| | | | | the queue. It can be utilized in queue processing to avoid multiple locking/unlocking.
* The host-id/interface-id can have a specific value and is properlyqingli2011-10-251-0/+5
| | | | | | | | | | masked out when adding a prefix route through the "route" command. However, when deleting the route, simply changing the command keyword from "add" to "delete" does not work. The failoure is observed in both IPv4 and IPv6 route insertion. The patch makes the route command behavior consistent between the "add" and the "delete" operation. MFC after: 1 week
* Add missing #includes.ed2011-10-211-0/+2
| | | | | | | | | According to POSIX, these two header files should be able to be included by themselves, not depending on other headers. The <net/if.h> header uses struct sockaddr when __BSD_VISIBLE=1, while <netinet/tcp.h> uses integer datatypes (u_int32_t, u_short, etc). MFC after: 2 months
* Get rid of D_PSEUDO.ed2011-10-182-2/+2
| | | | | | | | | | It seems the D_PSEUDO flag was meant to allow make_dev() to return NULL. Nowadays we have a different interface for that; make_dev_p(). There's no need to keep it there. While there, remove an unneeded D_NEEDMINOR from the gpio driver. Discussed with: gonzo@ (gpio)
* Pass the fibnum where we need filtering of the message on thebz2011-09-285-6/+88
| | | | | | | | | | | | | | | rtsock allowing routing daemons to filter routing updates on an rtsock per FIB. Adjust raw_input() and split it into wrapper and a new function taking an optional callback argument even though we only have one consumer [1] to keep the hackish flags local to rtsock.c. PR: kern/134931 Submitted by: multiple (see PR) Suggested by: rwatson [1] Reviewed by: rwatson MFC after: 3 days
* Make KBI changes required for future MFCing of inpcb rtentry / llentry caching.kmacy2011-09-203-2/+7
| | | | | Reviewed by: rwatson, bz Approved by: re (kib)
* In order to maximize the re-usability of kernel code in user space thiskmacy2011-09-161-1/+1
| | | | | | | | | | | | | patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel entry points and all places in the code that use them. It also fixes an additional name space collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls. Reviewed by: rwatson Approved by: re (bz)
* On the first loop for generating a bridge MAC address use the localthompsa2011-09-041-6/+21
| | | | | | | | | | hostid, this gives a good chance of keeping the same address over reboots. This is intended to help IPV6 and similar which generate their addresses from the mac. PR: kern/160300 Submitted by: mdodd Approved by: re (kib)
* When adding IPv6 fwd support to ipfw in r225044 these two files werebz2011-08-272-0/+2
| | | | | | | | | not committed. Initialize next_hop6 to align with the IPv4 code. PR: bin/117214 MFC after: 3 weeks X-MFC with: r225044 Approved by: re (kib)
* Fix a deficiency in the selinfo interface:attilio2011-08-253-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a selinfo object is recorded (via selrecord()) and then it is quickly destroyed, with the waiters missing the opportunity to awake, at the next iteration they will find the selinfo object destroyed, causing a PF#. That happens because the selinfo interface has no way to drain the waiters before to destroy the registered selinfo object. Also this race is quite rare to get in practice, because it would require a selrecord(), a poll request by another thread and a quick destruction of the selrecord()'ed selinfo object. Fix this by adding the seldrain() routine which should be called before to destroy the selinfo objects (in order to avoid such case), and fix the present cases where it might have already been called. Sometimes, the context is safe enough to prevent this type of race, like it happens in device drivers which installs selinfo objects on poll callbacks. There, the destruction of the selinfo object happens at driver detach time, when all the filedescriptors should be already closed, thus there cannot be a race. For this case, mfi(4) device driver can be set as an example, as it implements a full correct logic for preventing this from happening. Sponsored by: Sandvine Incorporated Reported by: rstone Tested by: pluknet Reviewed by: jhb, kib Approved by: re (bz) MFC after: 3 weeks
* When the RADIX_MPATH kernel option is enabled, the RADIX_MPATH code triesqingli2011-08-251-4/+1
| | | | | | | | | | | | to find the first route node of an ECMP chain before executing the route command. If the system has a default route, and the specific route argument to the command does not exist in the routing table, then the default route would be reached. The current code does not verify the reached node matches the given route argument, therefore erroneous removed the entry. This patch fixes that bug. Approved by: re MFC after: 3 days
* In rtinit1(), before rtrequest1_fib() is called, info.rti_flags iskevlo2011-08-081-1/+1
| | | | | | | | | | | | | initialized by flags (function argument) or-ed with ifa->ifa_flags. If both NIC has a loopback route to itself, so IFA_RTSELF is set on ifa(s). As IFA_RTSELF is defined by RTF_HOST, rtrequest1_fib() is called with RTF_HOST flag even if netmask is not NULL. Consequently, netmask is set to zero in rtrequest1_fib(), and request to add network route is changed under hands to request to add host route. Tested by: Andrew Boyer <aboyer at averesystems.com> Submitted by: Svatopluk Kraus <onwahe at gmail dot com> Approved by: re (hrs)
* Add missing MODULE_VERSION() definition to protect against duplicatingpluknet2011-08-011-0/+1
| | | | | | | | | | module loads. PR: kern/159345 Reported by: Eugene Grosbein <egrosbein att rdtc ru> Tested by: Eugene Grosbein <egrosbein att rdtc ru> Approved by: re (kib) MFC after: 1 week
* Add spares to the network stack for FreeBSD-9:bz2011-07-172-3/+4
| | | | | | | | | | | | | - TCP keep* timers - TCP UTO (adjust from what was there already) - netmap - route caching - user cookie (temporary to allow for the real fix) Slightly re-shuffle struct ifnet moving fields out of the middle of spares and to better align. Discussed with: rwatson (slightly earlier version)
* Clear the filter memory area before using it. Leaving it uninitialized maymp2011-07-141-0/+2
| | | | | | | | | leak previous kernel stack contents through a malicioius BPF filter. PR: kern/158880 Submitted by: Guy Harris Obtained from: OpenBSD MFC after: 1 week
* Permit ARP to proceed for IPv4 host routes for which the gateway is thezec2011-07-081-3/+0
| | | | | | | | | same as the host address. This already works fine for INET6 and ND6. While here, remove two function pointers from struct lltable which are only initialized but never used. MFC after: 3 days
* Grab the rlock before checking if our interface is enabled, it could bethompsa2011-07-071-1/+2
| | | | | | | | possible to hit a dead pointer when changing interfaces. PR: kern/156978 Submitted by: Andrew Boyer MFC after: 1 week
* Tag mbufs of all incoming frames or packets with the interface's FIBbz2011-07-0311-0/+13
| | | | | | | | | setting (either default or if supported as set by SIOCSIFFIB, e.g. from ifconfig). Submitted by: Alexander V. Chernikov (melifaro ipfw.ru) Reviewed by: julian MFC after: 2 weeks
* Remove extra white space to comply with style for the rest of the struct.bz2011-07-031-2/+2
| | | | MFC after: 2 weeks
OpenPOWER on IntegriCloud