summaryrefslogtreecommitdiffstats
path: root/sys/netinet/ip_dummynet.c
Commit message (Collapse)AuthorAgeFilesLines
* move kernel ipfw-related sources to a separate directory,luigi2009-06-051-2371/+0
| | | | | | | | | adjust conf/files and modules' Makefiles accordingly. No code or ABI changes so this and most of previous related changes can be easily MFC'ed MFC after: 5 days
* Small changes (no actual code changes) in preparation of moving ipfw-relatedluigi2009-06-051-4/+6
| | | | | | | | | | | stuff to its own directory, and cleaning headers and dependencies: In this commit: + remove one use of a typedef; + document dn_rule_delete(); + replace one usage of the DUMMYNET_LOADED macro with its value; No MFC planned until the cleanup is complete.
* Add emulation of delay profiles, which lets you model variousluigi2009-04-091-18/+112
| | | | | | | | | | | | | | | | | | | | types of MAC overheads such as preambles, link level retransmissions and more. Note- this commit changes the userland/kernel ABI for pipes (but not for ordinary firewall rules) so you need to rebuild kernel and /sbin/ipfw to use dummynet features. Please check the manpage for details on the new feature. The MFC would be trivial but it breaks the ABI, so it will be postponed until after 7.2 is released. Interested users are welcome to apply the patch manually to their RELENG_7 tree. Work supported by the European Commission, Projects Onelab and Onelab2 (contract 224263).
* curr_time is a 64 bit variable so SYSCTL_LONG is not appropriateluigi2009-03-021-0/+2
| | | | | | | | as a handler. The variable was exported only for debugging, but there is little reason to do it now that the timekeeping is supported by various other variables. For the time being just comment out the sysctl, but I think this should go away.
* remove unnecessary #include, and document some of the othersluigi2009-02-131-8/+8
|
* Conditionally compile out V_ globals while instantiating the appropriatezec2008-12-101-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | container structures, depending on VIMAGE_GLOBALS compile time option. Make VIMAGE_GLOBALS a new compile-time option, which by default will not be defined, resulting in instatiations of global variables selected for V_irtualization (enclosed in #ifdef VIMAGE_GLOBALS blocks) to be effectively compiled out. Instantiate new global container structures to hold V_irtualized variables: vnet_net_0, vnet_inet_0, vnet_inet6_0, vnet_ipsec_0, vnet_netgraph_0, and vnet_gif_0. Update the VSYM() macro so that depending on VIMAGE_GLOBALS the V_ macros resolve either to the original globals, or to fields inside container structures, i.e. effectively #ifdef VIMAGE_GLOBALS #define V_rt_tables rt_tables #else #define V_rt_tables vnet_net_0._rt_tables #endif Update SYSCTL_V_*() macros to operate either on globals or on fields inside container structs. Extend the internal kldsym() lookups with the ability to resolve selected fields inside the virtualization container structs. This applies only to the fields which are explicitly registered for kldsym() visibility via VNET_MOD_DECLARE() and vnet_mod_register(), currently this is done only in sys/net/if.c. Fix a few broken instances of MODULE_GLOBAL() macro use in SCTP code, and modify the MODULE_GLOBAL() macro to resolve to V_ macros, which in turn result in proper code being generated depending on VIMAGE_GLOBALS. De-virtualize local static variables in sys/contrib/pf/net/pf_subr.c which were prematurely V_irtualized by automated V_ prepending scripts during earlier merging steps. PF virtualization will be done separately, most probably after next PF import. Convert a few variable initializations at instantiation to initialization in init functions, most notably in ipfw. Also convert TUNABLE_INT() initializers for V_ variables to TUNABLE_FETCH_INT() in initializer functions. Discussed at: devsummit Strassburg Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
* Consistently check IPFW and DUMMYNET privileges in the configurationrwatson2008-05-221-0/+5
| | | | | | | | | routines for those modules, rather than in the raw socket code. This each privilege check to occur in exactly once place and avoids duplicate checks across layers. MFC after: 3 weeks Sponsored by: nCircle Network Security, Inc.
* Dummynet has a limit of 100 slots queue size (or 1MB, if you givedwmalone2008-02-271-2/+9
| | | | | | | | | | | | | | | the limit in bytes) hard coded into both the kernel and userland. Make both these limits a sysctl, so it is easy to change the limit. If the userland part of ipfw finds that the sysctls don't exist, it will just fall back to the traditional limits. (100 packets is quite a small limit these days. If you want to test TCP at 100Mbps, 100 packets can only accommodate a DBP of 12ms.) Note these sysctls in the man page and warn against increasing them without thinking first. MFC after: 3 weeks
* Workaround p->numbytes overflow, which can result in infinite loop insideoleg2007-12-251-7/+22
| | | | | | dummynet module (prerequisite is using queues with "fat" pipe). PR: kern/113548
* - New sysctl variable: net.inet.ip.dummynet.io_fastoleg2007-11-171-3/+6
| | | | | | | | | | | | | | If it is set to zero value (default) dummynet module will try to emulate real link as close as possible (bandwidth & latency): packet will not leave pipe faster than it should be on real link with given bandwidth. (This is original behaviour of dummynet which was altered in previous commit) If it is set to non-zero value only bandwidth is enforced: packet's latency can be lower comparing to real link with given bandwidth. - Document recently introduced dummynet(4) sysctl variables. Requested by: luigi, julian MFC after: 3 month
* 1) dummynet_io() declaration has changed.oleg2007-11-061-7/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | 2) Alter packet flow inside dummynet: allow certain packets to bypass dummynet scheduler. Benefits are: - lower latency: if packet flow does not exceed pipe bandwidth, packets will not be (up to tick) delayed (due to dummynet's scheduler granularity). - lower overhead: if packet avoids dummynet scheduler it shouldn't reenter ip stack later. Such packets can be fastforwarded. - recursion (which can lead to kernel stack exhaution) eliminated. This fix long existed panic, which can be triggered this way: kldload dummynet sysctl net.inet.ip.fw.one_pass=0 ipfw pipe 1 config bw 0 for i in `jot 30`; do ipfw add 1 pipe 1 icmp from any to any; done ping -c 1 localhost 3) Three new sysctl nodes are added: net.inet.ip.dummynet.io_pkt - packets passed to dummynet net.inet.ip.dummynet.io_pkt_fast - packets avoided dummynet scheduler net.inet.ip.dummynet.io_pkt_drop - packets dropped by dummynet P.S. Above comments are true only for layer 3 packets. Layer 2 packet flow is not changed yet. MFC after: 3 month
* style(9) cleanup.oleg2007-11-061-340/+350
| | | | MFC after: 3 month
* Add FBSDID to all files in netinet so that people can moresilby2007-10-071-2/+3
| | | | | | easily include file version information in bug reports. Approved by: re (kensmith)
* Remove the now-unused NET_{LOCK,UNLOCK,ASSERT}_GIANT() macros, whichrwatson2007-08-061-7/+1
| | | | | | | | | | | | | | | previously conditionally acquired Giant based on debug.mpsafenet. As that has now been removed, they are no longer required. Removing them significantly simplifies error-handling in the socket layer, eliminated quite a bit of unwinding of locking in error cases. While here clean up the now unneeded opt_net.h, which previously was used for the NET_WITH_GIANT kernel option. Clean up some related gotos for consistency. Reviewed by: bz, csjp Tested by: kris Approved by: re (kensmith)
* Replace references to NET_CALLOUT_MPSAFE with CALLOUT_MPSAFE, and removerwatson2007-07-281-1/+1
| | | | | | | | definition of NET_CALLOUT_MPSAFE, which is no longer required now that debug.mpsafenet has been removed. The once over: bz Approved by: re (kensmith)
* Replace incorrect local OFFSET_OF macro with the correct and genericmjacob2007-06-171-1/+1
| | | | offsetof macro.
* Move universally to ANSI C function declarations, with relativelyrwatson2007-05-101-5/+7
| | | | consistent style(9)-ish layout.
* - Use non-recursive mutex. MTX_RECURSE is unnecessary since rev. 1.70oleg2006-10-291-34/+31
| | | | | | | | | | | | - Pay respect to net.isr.direct: use netisr_dispatch() instead of ip_input() Reviewed by: glebius, rwatson - purge_flow_set(): - Do not leak memory while purging queues which are not bound to pipe. - style(9) cleanup MFC after: 2 months
* - Convertoleg2006-10-271-5/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | net.inet.ip.dummynet.curr_time net.inet.ip.dummynet.searches net.inet.ip.dummynet.search_steps to SYSCTL_LONG nodes. It will prevent frequent wrap around on 64bit archs. - Implement simple mechanics for dummynet(4) internal time correction. Under certain circumstances (system high load, dummynet lock contention, etc) dummynet's tick counter can be significantly slower than it should be. (I've observed up to 25% difference on one of my production servers). Since this counter used for packet scheduling, it's accuracy is vital for precise bandwidth limitation. Introduce new sysctl nodes: net.inet.ip.dummynet. tick_lost - number of ticks coalesced by taskqueue thread. tick_adjustment - number of time corrections done. tick_diff - adjusted vs non-adjusted tick counter difference tick_delta - last vs 'standard' tick differnece (usec). tick_delta_sum - accumulated (and not corrected yet) time difference (usec). Reviewed by: glebius MFC after: 2 month
* Use separate thread for servicing dummynet(4).oleg2006-10-271-3/+32
| | | | | | | Utilize taskqueue(9) API. Submitted by: glebius MFC after: 2 month
* style(9) cleanup.oleg2006-10-271-330/+353
| | | | MFC after: 2 month
* Fix following rules: pipe X (tag|altq) Y ...oleg2006-06-081-0/+4
| | | | | Approved by: glebius (mentor) MFC after: 2 weeks
* Obey opt_inet6.h in kernel build directory.ume2006-02-201-2/+0
| | | | | Reported by: Peter Losher <plosher-keyword-freebsd.a36e57__at__plosh.net> MFC after: 3 days
* When sending a packet from dummynet, indicate that we're forwardingru2006-02-141-5/+2
| | | | | | | | | | it so that ip_id etc. don't get overwritten. This fixes forwarding of fragmented IP packets through a dummynet pipe -- fragments came out with modified and different(!) ip_id's, making it impossible to reassemble a datagram at the receiver side. Submitted by: Alexander Karptsov (reworked by me) MFC after: 3 days
* Dropping the lock in the transmit_event() is not safe, because weglebius2006-02-031-94/+115
| | | | | | | | | | | | store some pipe pointers on stack. If user reconfigures dummynet in the interlock gap, we can work with freed pipes after relock. To fix this, we decided not to send packets in transmit_event(), but fill a queue. At the end of dummynet() and dummynet_io(), after the lock is dropped, if there is something in the queue we run dummynet_send() to process the queue. In collaboration with: ru
* Axe unused function.glebius2006-02-031-38/+0
|
* First step in removing welding between ipfw(4) and dummynet.glebius2005-11-291-285/+280
| | | | | | | | | | | | | | | | | | | | | o Do not use ipfw_insn_pipe->pipe_ptr in locate_flowset(). The _ipfw_insn_pipe isn't touched by this commit to preserve ABI compatibility. o To optimize the lookup of the pipe/flowset in locate_flowset() introduce hashes for pipes and queues: - To preserve ABI compatibility utilize the place of global list pointer for SLIST_ENTRY. - Introduce locate_flowset(queue nr) and locate_pipe(pipe nr). o Rework all the dummynet code to deal with the hashes, not global lists. Also did some style(9) changes in the code blocks that were touched by this sweep: - Be conservative about flowset and pipe variable names on stack, use "fs" and "pipe" everywhere. - Cleanup whitespaces. - Sort variables. - Give variables more meaningful names. - Uppercase and dots in comments. - ENOMEM when malloc(9) failed.
* Remove bridge(4) from the tree. if_bridge(4) is a full functionalmlaier2005-09-271-23/+0
| | | | | | | | replacement and has additional features which make it superior. Discussed on: -arch Reviewed by: thompsa X-MFC-after: never (RELENG_6 as transition period)
* Use monotonic 'time_uptime' instead of 'time_second' as timebaseandre2005-09-191-2/+2
| | | | for timeouts.
* Add dummynet(4) support to if_bridge, this code is largely based on bridge.c.thompsa2005-06-101-0/+10
| | | | | | | This is the final piece to match bridge.c in functionality, we can now be a drop-in replacement. Approved by: mlaier (mentor)
* IPFW version 2 is the only option in HEAD and RELENG_5.glebius2005-05-041-19/+0
| | | | Thus, cleanup unnecessary now ifdefs.
* Make DUMMYNET compile without INET6phk2005-04-191-0/+8
|
* Add IPv6 support to IPFW and Dummynet.brooks2005-04-181-14/+69
| | | | Submitted by: Mariano Tortoriello and Raffaele De Lorenzo (via luigi)
* Use ACTION_PTR(r) instead of (r->cmd + r->act_ofs).brooks2005-04-061-2/+2
| | | | Reviewed by: md5
* Make dummynet_flush() match its prototype.brooks2005-04-051-1/+1
|
* Use NET_CALLOUT_MPSAFE macro.glebius2005-03-011-1/+1
|
* - Reduce number of arguments passed to dummynet_io(), we already have cookieglebius2005-01-161-4/+2
| | | | | in struct ip_fw_args itself. - Remove redundant &= 0xffff from dummynet_io().
* /* -> /*- for license, minor formatting changesimp2005-01-071-1/+1
|
* Allocate memory when dumping pipes with M_WAITOK flag.pjd2004-08-251-9/+33
| | | | | | | | | | | | | On a system with huge number of pipes, M_NOWAIT failes almost always, because of memory fragmentation. My fix is different than the patch proposed by Pawel Malachowski, because in FreeBSD 5.x we cannot sleep while holding dummynet mutex (in 4.x there is no such lock). My fix is also ugly, but there is no easy way to prepare nice and clean fix. PR: kern/46557 Submitted by: Eugene Grosbein <eugen@grosbein.pp.ru> Reviewed by: mlaier
* Convert ipfw to use PFIL_HOOKS. This is change is transparent to userlandandre2004-08-171-35/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | and preserves the ipfw ABI. The ipfw core packet inspection and filtering functions have not been changed, only how ipfw is invoked is different. However there are many changes how ipfw is and its add-on's are handled: In general ipfw is now called through the PFIL_HOOKS and most associated magic, that was in ip_input() or ip_output() previously, is now done in ipfw_check_[in|out]() in the ipfw PFIL handler. IPDIVERT is entirely handled within the ipfw PFIL handlers. A packet to be diverted is checked if it is fragmented, if yes, ip_reass() gets in for reassembly. If not, or all fragments arrived and the packet is complete, divert_packet is called directly. For 'tee' no reassembly attempt is made and a copy of the packet is sent to the divert socket unmodified. The original packet continues its way through ip_input/output(). ipfw 'forward' is done via m_tag's. The ipfw PFIL handlers tag the packet with the new destination sockaddr_in. A check if the new destination is a local IP address is made and the m_flags are set appropriately. ip_input() and ip_output() have some more work to do here. For ip_input() the m_flags are checked and a packet for us is directly sent to the 'ours' section for further processing. Destination changes on the input path are only tagged and the 'srcrt' flag to ip_forward() is set to disable destination checks and ICMP replies at this stage. The tag is going to be handled on output. ip_output() again checks for m_flags and the 'ours' tag. If found, the packet will be dropped back to the IP netisr where it is going to be picked up by ip_input() again and the directly sent to the 'ours' section. When only the destination changes, the route's 'dst' is overwritten with the new destination from the forward m_tag. Then it jumps back at the route lookup again and skips the firewall check because it has been marked with M_SKIP_FIREWALL. ipfw 'forward' has to be compiled into the kernel with 'option IPFIREWALL_FORWARD' to enable it. DUMMYNET is entirely handled within the ipfw PFIL handlers. A packet for a dummynet pipe or queue is directly sent to dummynet_io(). Dummynet will then inject it back into ip_input/ip_output() after it has served its time. Dummynet packets are tagged and will continue from the next rule when they hit the ipfw PFIL handlers again after re-injection. BRIDGING and IPFW_ETHER are not changed yet and use ipfw_chk() directly as they did before. Later this will be changed to dedicated ETHER PFIL_HOOKS. More detailed changes to the code: conf/files Add netinet/ip_fw_pfil.c. conf/options Add IPFIREWALL_FORWARD option. modules/ipfw/Makefile Add ip_fw_pfil.c. net/bridge.c Disable PFIL_HOOKS if ipfw for bridging is active. Bridging ipfw is still directly invoked to handle layer2 headers and packets would get a double ipfw when run through PFIL_HOOKS as well. netinet/ip_divert.c Removed divert_clone() function. It is no longer used. netinet/ip_dummynet.[ch] Neither the route 'ro' nor the destination 'dst' need to be stored while in dummynet transit. Structure members and associated macros are removed. netinet/ip_fastfwd.c Removed all direct ipfw handling code and replace it with the new 'ipfw forward' handling code. netinet/ip_fw.h Removed 'ro' and 'dst' from struct ip_fw_args. netinet/ip_fw2.c (Re)moved some global variables and the module handling. netinet/ip_fw_pfil.c New file containing the ipfw PFIL handlers and module initialization. netinet/ip_input.c Removed all direct ipfw handling code and replace it with the new 'ipfw forward' handling code. ip_forward() does not longer require the 'next_hop' struct sockaddr_in argument. Disable early checks if 'srcrt' is set. netinet/ip_output.c Removed all direct ipfw handling code and replace it with the new 'ipfw forward' handling code. netinet/ip_var.h Add ip_reass() as general function. (Used from ipfw PFIL handlers for IPDIVERT.) netinet/raw_ip.c Directly check if ipfw and dummynet control pointers are active. netinet/tcp_input.c Rework the 'ipfw forward' to local code to work with the new way of forward tags. netinet/tcp_sack.c Remove include 'opt_ipfw.h' which is not needed here. sys/mbuf.h Remove m_claim_next() macro which was exclusively for ipfw 'forward' and is no longer needed. Approved by: re (scottl)
* Do a pass over all modules in the kernel and make them return EOPNOTSUPPphk2004-07-151-0/+1
| | | | | | | | for unknown events. A number of modules return EINVAL in this instance, and I have left those alone for now and instead taught MOD_QUIESCE to accept this as "didn't do anything".
* When asserting non-Giant locks in the network stack, also assertrwatson2004-06-241-1/+4
| | | | | | | | | | Giant if debug.mpsafenet=0, as any points that require synchronization in the SMPng world also required it in the Giant-world: - inpcb locks (including IPv6) - inpcbinfo locks (including IPv6) - dummynet subsystem lock - ipfw2 subsystem lock
* Add some missing DUMMYNET_UNLOCK() in config_pipe().mlaier2004-03-031-2/+7
| | | | | Noticed by: Simon Coggins Approved by: bms(mentor)
* Re-remove MT_TAGs. The problems with dummynet have been fixed now.mlaier2004-02-251-91/+128
| | | | | Tested by: -current, bms(mentor), me Approved by: bms(mentor), sam
* Backout MT_TAG removal (i.e. bring back MT_TAGs) for now, as dummynet ismlaier2004-02-181-125/+91
| | | | | | not working properly with the patch in place. Approved by: bms(mentor)
* This set of changes eliminates the use of MT_TAG "pseudo mbufs", replacingmlaier2004-02-131-91/+125
| | | | | | | | | | | them mostly with packet tags (one case is handled by using an mbuf flag since the linkage between "caller" and "callee" is direct and there's no need to incur the overhead of a packet tag). This is (mostly) work from: sam Silence from: -arch Approved by: bms(mentor), sam, rwatson
* o Fix a comment: softticks lives in sys/kern/kern_timeout.c.maxim2003-12-271-1/+1
| | | | | | PR: kern/60613 Submitted by: Gleb Smirnoff MFC after: 3 days
* Do not panic when flushing dummynet firewall rulesemax2003-12-061-1/+1
| | | | | Reviewed by: andre Approved by: re (scottl)
* Use MPSAFE callouts only when debug.mpsafenet is 1. Both timer routinessam2003-11-231-1/+1
| | | | | | | | potentially transmit packets that may enter KAME IPsec w/o Giant if the callouts are marked MPSAFE. Reviewed by: ume Approved by: re (rwatson)
* replace explicit changes to rt_refcnt by RT_ADDREF and RT_REMREFsam2003-11-081-3/+1
| | | | | | | macros that expand to include assertions when the system is built with INVARIANTS Supported by: FreeBSD Foundation
OpenPOWER on IntegriCloud