path: root/sys/netinet/ip_fw2.c
Commit message | Author | Age | Files | Lines
* Convert ipfw to use PFIL_HOOKS. This change is transparent to userland | andre | 2004-08-17 | 1 | -51/+14
  and preserves the ipfw ABI. The ipfw core packet inspection and filtering functions have not been changed; only how ipfw is invoked is different.

  However, there are many changes to how ipfw and its add-ons are handled:

  In general ipfw is now called through the PFIL_HOOKS, and most of the associated magic that was previously in ip_input() or ip_output() is now done in ipfw_check_[in|out]() in the ipfw PFIL handler.

  IPDIVERT is entirely handled within the ipfw PFIL handlers. A packet to be diverted is checked for fragmentation; if it is fragmented, ip_reass() is called for reassembly. If it is not fragmented, or all fragments have arrived and the packet is complete, divert_packet is called directly. For 'tee' no reassembly attempt is made and a copy of the packet is sent to the divert socket unmodified. The original packet continues its way through ip_input/output().

  ipfw 'forward' is done via m_tag's. The ipfw PFIL handlers tag the packet with the new destination sockaddr_in. A check is made whether the new destination is a local IP address, and the m_flags are set appropriately. ip_input() and ip_output() have some more work to do here. For ip_input() the m_flags are checked and a packet for us is sent directly to the 'ours' section for further processing. Destination changes on the input path are only tagged, and the 'srcrt' flag to ip_forward() is set to disable destination checks and ICMP replies at this stage. The tag is going to be handled on output. ip_output() again checks for m_flags and the 'ours' tag. If found, the packet will be dropped back to the IP netisr, where it is going to be picked up by ip_input() again and then sent directly to the 'ours' section. When only the destination changes, the route's 'dst' is overwritten with the new destination from the forward m_tag. Processing then jumps back to the route lookup and skips the firewall check, because the packet has been marked with M_SKIP_FIREWALL. ipfw 'forward' has to be compiled into the kernel with 'option IPFIREWALL_FORWARD' to enable it.

  DUMMYNET is entirely handled within the ipfw PFIL handlers. A packet for a dummynet pipe or queue is sent directly to dummynet_io(). Dummynet will then inject it back into ip_input/ip_output() after it has served its time. Dummynet packets are tagged and will continue from the next rule when they hit the ipfw PFIL handlers again after re-injection.

  BRIDGING and IPFW_ETHER are not changed yet and use ipfw_chk() directly as they did before. Later this will be changed to dedicated ETHER PFIL_HOOKS.

  More detailed changes to the code:

  conf/files
    Add netinet/ip_fw_pfil.c.
  conf/options
    Add IPFIREWALL_FORWARD option.
  modules/ipfw/Makefile
    Add ip_fw_pfil.c.
  net/bridge.c
    Disable PFIL_HOOKS if ipfw for bridging is active. Bridging ipfw is still directly invoked to handle layer2 headers, and packets would get a double ipfw when run through PFIL_HOOKS as well.
  netinet/ip_divert.c
    Removed divert_clone() function. It is no longer used.
  netinet/ip_dummynet.[ch]
    Neither the route 'ro' nor the destination 'dst' needs to be stored while in dummynet transit. Structure members and associated macros are removed.
  netinet/ip_fastfwd.c
    Removed all direct ipfw handling code and replaced it with the new 'ipfw forward' handling code.
  netinet/ip_fw.h
    Removed 'ro' and 'dst' from struct ip_fw_args.
  netinet/ip_fw2.c
    (Re)moved some global variables and the module handling.
  netinet/ip_fw_pfil.c
    New file containing the ipfw PFIL handlers and module initialization.
  netinet/ip_input.c
    Removed all direct ipfw handling code and replaced it with the new 'ipfw forward' handling code. ip_forward() no longer requires the 'next_hop' struct sockaddr_in argument. Disable early checks if 'srcrt' is set.
  netinet/ip_output.c
    Removed all direct ipfw handling code and replaced it with the new 'ipfw forward' handling code.
  netinet/ip_var.h
    Add ip_reass() as general function. (Used from ipfw PFIL handlers for IPDIVERT.)
  netinet/raw_ip.c
    Directly check if ipfw and dummynet control pointers are active.
  netinet/tcp_input.c
    Rework the 'ipfw forward' to local code to work with the new way of forward tags.
  netinet/tcp_sack.c
    Remove include 'opt_ipfw.h' which is not needed here.
  sys/mbuf.h
    Remove m_claim_next() macro which was exclusively for ipfw 'forward' and is no longer needed.

  Approved by: re (scottl)
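The divert and 'tee' handling described in this commit can be modelled in a few lines. This is only a user-space sketch in Python, not the kernel code: the `Packet` fields and the returned strings are hypothetical names chosen for illustration, and reassembly itself is reduced to a single boolean.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    is_fragment: bool = False
    # Hypothetical stand-in for the reassembly result: False while
    # fragments of the datagram are still missing.
    datagram_complete: bool = True


def divert_decision(pkt: Packet, tee: bool) -> str:
    """Model the choice the commit message describes for divert and tee."""
    if tee:
        # 'tee': no reassembly attempt; a copy goes to the divert socket
        # and the original continues through ip_input/ip_output.
        return "copy-to-socket-and-continue"
    if pkt.is_fragment and not pkt.datagram_complete:
        # Fragment of an incomplete datagram: wait for the remaining pieces.
        return "hold-for-reassembly"
    # Unfragmented, or the last missing fragment has arrived.
    return "divert"
```

The point of the sketch is the ordering: 'tee' is decided before fragmentation is even looked at, which matches the statement that no reassembly attempt is made for it.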
* Add the ability to associate ipfw rules with a specific prison ID. | csjp | 2004-08-12 | 1 | -1/+9
  Since the only thing truly unique about a prison is its ID, I figured this would be the most granular way of handling this.

  This commit makes the following changes:

  - Adds tokenizing and parsing for the ``jail'' command line option to the ipfw(8) userspace utility.
  - Appends O_JAIL to the ipfw opcode list.
  - While I am here, add a comment informing others that if they want to add additional opcodes, they should append them to the end of the list to avoid ABI breakage.
  - Add ``fw_prid'' to the ipfw ucred cache structure.
  - When initializing ucred cache, if the process is jailed, set fw_prid to the prison ID, otherwise set it to -1.
  - Update man page to reflect these changes.

  This change was a strong motivator behind the ucred caching mechanism in ipfw.

  A sample usage of this new functionality could be:

    ipfw add count ip from any to any jail 2

  It should be noted that because ucred based constraints are only implemented for TCP and UDP packets, the same applies for jail associations.

  Conceptual head nod by: pjd
  Reviewed by: rwatson
  Approved by: bmilekic (mentor)
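A small Python model of the fw_prid convention may help when reading the change. Only `fw_prid` and the -1 sentinel come from the commit message; the class name, the `fw_uid` field and both helper functions are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

NOT_JAILED = -1  # per the commit: fw_prid is -1 when the process is not jailed


@dataclass
class UcredCache:
    fw_uid: int   # hypothetical field, present only to make the cache look realistic
    fw_prid: int  # prison ID, or NOT_JAILED


def init_ucred_cache(uid: int, prison_id: Optional[int]) -> UcredCache:
    """Fill the cache once; prison_id is None for a process outside any jail."""
    prid = NOT_JAILED if prison_id is None else prison_id
    return UcredCache(fw_uid=uid, fw_prid=prid)


def jail_rule_matches(cache: UcredCache, rule_prison_id: int) -> bool:
    """A 'jail N' rule matches only traffic owned by a process in prison N."""
    return cache.fw_prid == rule_prison_id
```

With this model, the sample rule `jail 2` corresponds to `jail_rule_matches(cache, 2)`, and an unjailed process can never match because its cached ID is -1.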
* Only invoke verify_path() for verrevpath and versrcreach when we have an IP packet. | andre | 2004-08-11 | 1 | -4/+4
* New ipfw option "antispoof": | andre | 2004-08-09 | 1 | -0/+11
  For incoming packets, the packet's source address is checked to see whether it belongs to a directly connected network. If the network is directly connected, then the interface the packet came in on is compared to the interface the network is connected to. When the incoming interface and the directly connected interface are not the same, the packet does not match.

  Usage example:

    ipfw add deny ip from any to any not antispoof in

  Manpage education by: ru
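The antispoof test can be approximated in user space as follows. This is a sketch under assumptions, not the kernel implementation: the `connected` list of (prefix, interface) pairs is a hypothetical stand-in for the system's directly connected networks, and the behaviour for sources that are not directly connected (the packet matches) is an assumption, since the commit message only spells out the mismatch case.

```python
import ipaddress


def antispoof(src: str, in_iface: str, connected) -> bool:
    """Return True when the packet matches the 'antispoof' option.

    connected: iterable of (prefix, interface) pairs, e.g.
    ("192.0.2.0/24", "em0"). Hypothetical configuration format.
    """
    addr = ipaddress.ip_address(src)
    for prefix, iface in connected:
        if addr in ipaddress.ip_network(prefix):
            # Directly connected source: it must arrive on that interface.
            return iface == in_iface
    # Assumption: a source outside every connected network matches.
    return True
```

Read together with the usage example, `deny ... not antispoof in` drops exactly the packets for which this function returns False.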
* Extend versrcreach by checking against the rt_flags for RTF_REJECT and | andre | 2004-07-21 | 1 | -0/+6
  RTF_BLACKHOLE as well.

  To quote the submitter:

    The uRPF loose-check implementation by the industry vendors, at least on Cisco and possibly Juniper, will fail the check if the route of the source address is pointed to Null0 (on Juniper, discard or reject route). What this means is, even if uRPF Loose-check finds the route, if the route is pointed to blackhole, uRPF loose-check must fail. This allows people to utilize uRPF loose-check mode as a pseudo-packet-firewall without using any manual filtering configuration -- one can simply inject a IGP or BGP prefix with next-hop set to a static route that directs to null/discard facility. This results in uRPF Loose-check failing on all packets with source addresses that are within the range of the nullroute.

  Submitted by: James Jun <james@towardex.com>
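A loose uRPF check with this extension can be sketched in Python. The `Route` class and the flat list used as a routing table are hypothetical; the two booleans merely model RTF_REJECT and RTF_BLACKHOLE, and the rule that the default route does not count is taken from the original versrcreach commit further down this log.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Route:
    prefix: str
    reject: bool = False     # models RTF_REJECT
    blackhole: bool = False  # models RTF_BLACKHOLE


def versrcreach(src: str, routes) -> bool:
    """Loose uRPF: the source must have a usable, non-default route."""
    addr = ipaddress.ip_address(src)
    best_net, best_route = None, None
    for route in routes:
        net = ipaddress.ip_network(route.prefix)
        # Longest-prefix match over a plain list (a real table uses a radix tree).
        if addr in net and (best_net is None or net.prefixlen > best_net.prefixlen):
            best_net, best_route = net, route
    if best_route is None or best_net.prefixlen == 0:
        return False  # no route, or only the default route
    return not (best_route.reject or best_route.blackhole)
```

The last line is the part this commit adds: a route that exists but points at a reject or blackhole entry now fails the check, which is what lets a null-routed prefix act as a source filter.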
* Make M_SKIP_FIREWALL a global (and semantic) flag, preventing anything from | jmallett | 2004-07-17 | 1 | -12/+0
  using M_PROTO6 and possibly shooting someone's foot, as well as allowing the firewall to be used in multiple passes, or with a packet classifier frontend, that may need to explicitly allow a certain packet. Presently this is handled in the ipfw_chk code as before, though I have run with it moved to upper layers, and possibly it should apply to ipfilter and pf as well, though this has not been investigated.

  Discussed with: luigi, rwatson
* Do a pass over all modules in the kernel and make them return EOPNOTSUPP | phk | 2004-07-15 | 1 | -0/+1
  for unknown events.

  A number of modules return EINVAL in this instance, and I have left those alone for now and instead taught MOD_QUIESCE to accept this as "didn't do anything".
* When asserting non-Giant locks in the network stack, also assert | rwatson | 2004-06-24 | 1 | -1/+4
  Giant if debug.mpsafenet=0, as any points that require synchronization in the SMPng world also required it in the Giant-world:

  - inpcb locks (including IPv6)
  - inpcbinfo locks (including IPv6)
  - dummynet subsystem lock
  - ipfw2 subsystem lock
* Modify ipfw so that whenever UID or GID constraints exist in a | csjp | 2004-06-11 | 1 | -30/+77
  ruleset, the pcb is looked up once per ipfw_chk() activation. This is done by extracting the required information out of the PCB and caching it to the ipfw_chk() stack. This should greatly reduce PCB lookup contention and speed up the processing of UID/GID based firewall rules (especially with large UID/GID rulesets).

  Some very basic benchmarks were taken which compare the number of in_pcblookup_hash(9) activations to the number of firewall rules containing UID/GID based constraints before and after this patch. The results can be viewed here:

  o http://people.freebsd.org/~csjp/ip_fw_pcb.png

  Reviewed by: andre, luigi, rwatson
  Approved by: bmilekic (mentor)
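The "look up once per activation" idea is ordinary memoisation scoped to a single rule-walk. The Python sketch below illustrates it; `ChkActivation`, `pcb_lookup` and the dictionary-shaped credentials are hypothetical stand-ins, not the kernel's data structures.

```python
class ChkActivation:
    """State that lives for one pass over the ruleset (one ipfw_chk() call)."""

    _UNSET = object()  # distinguishes "not looked up yet" from "lookup found nothing"

    def __init__(self, pcb_lookup):
        self._pcb_lookup = pcb_lookup
        self._cred = self._UNSET
        self.lookups = 0

    def cred(self, pkt):
        if self._cred is self._UNSET:
            self.lookups += 1
            self._cred = self._pcb_lookup(pkt)
        return self._cred


def uid_rules_matching(pkt, rule_uids, pcb_lookup):
    """Evaluate every UID rule against pkt; return (matching uids, lookups done)."""
    act = ChkActivation(pcb_lookup)
    hits = [uid for uid in rule_uids if (act.cred(pkt) or {}).get("uid") == uid]
    return hits, act.lookups
```

However many UID rules the ruleset contains, the lookup counter stays at one per packet, which is the effect the benchmark in the commit message measures.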
* init_tables() must be run after sys/net/route.c:route_init(). | ru | 2004-06-10 | 1 | -1/+4
* Introduce a new feature to IPFW2: lookup tables. These are useful | ru | 2004-06-09 | 1 | -1/+324
  for handling large sparse address sets. Initial implementation by Vsevolod Lobko <seva@ip.net.ua>, refined by me.

  MFC after: 1 week
* Add some missing <sys/module.h> includes which are masked by the | phk | 2004-05-30 | 1 | -0/+1
  one on death-row in <sys/kernel.h>
* Add a super-user check to ipfw_ctl() to make sure that the calling | csjp | 2004-05-25 | 1 | -0/+4
  process is a non-prison root. The security.jail.allow_raw_sockets sysctl variable is disabled by default, however if the user enables raw sockets in prisons, prison-root should not be able to interact with firewall rule sets.

  Approved by: rwatson, bmilekic (mentor)
* Add the option versrcreach to verify that a valid route to the | andre | 2004-04-23 | 1 | -7/+31
  source address of a packet exists in the routing table. The default route is ignored because it would match everything and render the check pointless.

  This option is very useful for routers with a complete view of the Internet (BGP) in the routing table to reject packets with spoofed or unrouteable source addresses.

  Example:

    ipfw add 1000 deny ip from any to any not versrcreach

  also known in Cisco-speak as:

    ip verify unicast source reachable-via any

  Reviewed by: luigi
* Re-remove MT_TAGs. The problems with dummynet have been fixed now. | mlaier | 2004-02-25 | 1 | -5/+25
  Tested by: -current, bms(mentor), me
  Approved by: bms(mentor), sam
* Backout MT_TAG removal (i.e. bring back MT_TAGs) for now, as dummynet is | mlaier | 2004-02-18 | 1 | -25/+5
  not working properly with the patch in place.

  Approved by: bms(mentor)
* This set of changes eliminates the use of MT_TAG "pseudo mbufs", replacing | mlaier | 2004-02-13 | 1 | -5/+25
  them mostly with packet tags (one case is handled by using an mbuf flag since the linkage between "caller" and "callee" is direct and there's no need to incur the overhead of a packet tag).

  This is (mostly) work from: sam
  Silence from: -arch
  Approved by: bms(mentor), sam, rwatson
* NULL is not 0. | ume | 2003-12-24 | 1 | -1/+1
  Submitted by: "Bjoern A. Zeeb" <bzeeb-lists@lists.zabbadoz.net>
* o IN_MULTICAST wants an address in host byte order. | maxim | 2003-12-16 | 1 | -1/+1
  PR: kern/60304
  Submitted by: demon
  MFC after: 1 week
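Why the byte order matters for a multicast test can be shown with a short Python sketch. `in_multicast` below only mirrors the usual class-D test (top four bits equal to 1110, i.e. 224.0.0.0/4); it is not the kernel macro, and the helper names are hypothetical. Byte orders are made explicit with `struct` so the result does not depend on the machine running it.

```python
import socket
import struct


def in_multicast(host_order_addr: int) -> bool:
    """Class-D test on a 32-bit address given in host byte order."""
    return (host_order_addr & 0xF0000000) == 0xE0000000


def as_host_order(wire_bytes: bytes) -> int:
    """Correct conversion: interpret the four wire bytes as big-endian."""
    return struct.unpack("!I", wire_bytes)[0]


def as_misread_on_little_endian(wire_bytes: bytes) -> int:
    """What a little-endian machine sees if it skips the ntohl() step."""
    return struct.unpack("<I", wire_bytes)[0]


wire = socket.inet_aton("224.0.0.1")  # bytes exactly as they appear in the packet
ok = in_multicast(as_host_order(wire))
broken = in_multicast(as_misread_on_little_endian(wire))
```

In this model `ok` is true and `broken` is false: without the conversion, the bits being tested belong to the last octet of the address rather than the first.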
* Include opt_ipsec.h so IPSEC/FAST_IPSEC is defined and the appropriate | sam | 2003-12-02 | 1 | -0/+1
  code is compiled in to support the O_IPSEC operator. Previously no support was included and ipsec rules were always matching. Note that we do not return an error when an ipsec rule is added and the kernel does not have IPsec support compiled in; this is done intentionally but we may want to revisit this (document this in the man page).

  PR: 58899
  Submitted by: Bjoern A. Zeeb
  Approved by: re (rwatson)
* Fix verify_rev_path() function. The author of this function tried to | andre | 2003-11-27 | 1 | -13/+7
  cut corners which completely broke down when the routing table locking was introduced.

  Reviewed by: sam (mentor)
  Approved by: re (rwatson)
* Correct a problem where ipfw-generated packets were being returned | sam | 2003-11-24 | 1 | -5/+9
  for ipfw processing w/o an indication the packets were generated by ipfw--and so should not be processed (this manifested itself as a LOR.) The flag bit in the mbuf that was used to mark the packets was not listed in M_COPYFLAGS so if a packet had a header prepended (as done by IPsec) the flag was lost. Correct this by defining a new M_PROTO6 flag and use it to mark packets that need this processing.

  Reviewed by: bms
  Approved by: re (rwatson)
  MFC after: 2 weeks
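The lost-flag failure described here is a plain bitmask effect, which the following Python sketch reproduces. The bit values and the names `M_BCAST_MODEL` and `M_MARK_MODEL` are illustrative only and are not the kernel's definitions; the copy mask plays the role of M_COPYFLAGS.

```python
M_BCAST_MODEL = 0x0100  # illustrative bit, stands for any flag that is copied
M_MARK_MODEL = 0x4000   # hypothetical "generated by the firewall" marker


def prepend_header(flags: int, copy_mask: int) -> int:
    """Flags carried by the new leading buffer: only bits in the mask survive."""
    return flags & copy_mask


marked = M_BCAST_MODEL | M_MARK_MODEL

# Marker bit missing from the copy mask: it disappears when a header is prepended.
lost = prepend_header(marked, M_BCAST_MODEL)

# Marker bit included in the copy mask: it survives the prepend.
kept = prepend_header(marked, M_BCAST_MODEL | M_MARK_MODEL)
```

In the first case the packet comes back looking like ordinary traffic and is processed again; in the second the marker is still set, which is the property the new flag was chosen to guarantee.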
* Use MPSAFE callouts only when debug.mpsafenet is 1. Both timer routines | sam | 2003-11-23 | 1 | -1/+1
  potentially transmit packets that may enter KAME IPsec w/o Giant if the callouts are marked MPSAFE.

  Reviewed by: ume
  Approved by: re (rwatson)
* Introduce tcp_hostcache and remove the tcp specific metrics from | andre | 2003-11-20 | 1 | -3/+6
  the routing table. Move all usage and references in the tcp stack from the routing table metrics to the tcp hostcache.

  It caches measured parameters of past tcp sessions to provide better initial start values for following connections from or to the same source or destination. Depending on the network parameters to/from the remote host this can lead to significant speedups for new tcp connections after the first one because they inherit and shortcut the learning curve.

  tcp_hostcache is designed for multiple concurrent access in SMP environments with high contention and is hash indexed by remote ip address.

  It removes significant locking requirements from the tcp stack with regard to the routing table.

  Reviewed by: sam (mentor), bms
  Reviewed by: -net, -current, core@kame.net (IPv6 parts)
  Approved by: re (scottl)
* Remove RTF_PRCLONING from routing table and adjust users of it | andre | 2003-11-20 | 1 | -1/+1
  accordingly. The define is left intact for ABI compatibility with userland.

  This is a pre-step for the introduction of tcp_hostcache. The network stack remains fully useable with this change.

  Reviewed by: sam (mentor), bms
  Reviewed by: -net, -current, core@kame.net (IPv6 parts)
  Approved by: re (scottl)
* Fix an arguments order in check_uidgid() call. | maxim | 2003-11-20 | 1 | -2/+2
  PR: kern/59314
  Submitted by: Andrey V. Shytov
  Approved by: re (rwatson, jhb)
* Remove the global one-level rtcache variable and associated | andre | 2003-11-14 | 1 | -6/+1
  complex locking and rework ip_rtaddr() to do its own rtlookup. Adapt all its callers to this and make ip_output() callable with NULL rt pointer.

  Reviewed by: sam (mentor)
* Move uid/gid checking logic out of line and lock inpcb usage. This | sam | 2003-11-07 | 1 | -40/+60
  has a LOR between IPFW inpcb locks but I'm committing it now as the lesser of two evils (the other being unlocked use of in_pcblookup).

  Supported by: FreeBSD Foundation
* use ipsec_getnhist() instead of obsoleted ipsec_gethist(). | ume | 2003-11-07 | 1 | -1/+1
  Submitted by: "Bjoern A. Zeeb" <bzeeb-lists@lists.zabbadoz.net>
  Reviewed by: Ari Suutari <ari@suutari.iki.fi> (ipfw@)
* Replace the if_name and if_unit members of struct ifnet with new members | brooks | 2003-10-31 | 1 | -8/+9
  if_xname, if_dname, and if_dunit. if_xname is the name of the interface and if_dname/unit are the driver name and instance.

  This change paves the way for interface renaming and enhanced pseudo device creation and configuration semantics.

  Approved By: re (in principle)
  Reviewed By: njl, imp
  Tested On: i386, amd64, sparc64
  Obtained From: NetBSD (if_xname)
* Malloc buckets of size 128 have been having their 64-byte offset | mckusick | 2003-10-16 | 1 | -4/+7
  trashed after being freed. This has caused several panics including kern/42277 related to soft updates. Jim Kuhn tracked the problem down to ipfw limit rule processing. In the expiry of dynamic rules, it is possible for an O_LIMIT_PARENT rule to be removed when it still has live children. When the children eventually do expire, a pointer to the (long gone) parent is dereferenced and a count decremented. Since this memory can, and is, allocated for other purposes (in the case of kern/42277 an inodedep structure), chaos ensues. The offset in question in inodedep is the offset of the 16 bit count field in the ipfw2 ipfw_dyn_rule.

  Submitted by: Jim Kuhn <jkuhn@sandvine.com>
  Reviewed by: "Evgueni V. Gavrilov" <aquatique@rusunix.org>
  Reviewed by: Ben Pfountz <netprince@vt.edu>
  MFC after: 1 week
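The invariant behind this fix is that a parent must outlive every child that still points at it. The Python sketch below models that invariant only; it is not the actual patch, and `DynRule`, `sweep` and their fields are hypothetical names (the `count` field loosely corresponds to the 16 bit count mentioned above).

```python
class DynRule:
    def __init__(self, parent=None):
        self.parent = parent   # set for a child created under a limit parent
        self.count = 0         # number of live children (meaningful for a parent)
        self.expired = False
        if parent is not None:
            parent.count += 1


def sweep(rules):
    """One expiry pass that never frees a parent with live children."""
    kept = []
    # Pass 1: drop expired children and release their reference on the parent.
    for rule in rules:
        if rule.parent is not None and rule.expired:
            rule.parent.count -= 1
        else:
            kept.append(rule)
    # Pass 2: drop expired rules without a parent only once nothing refers to them.
    return [r for r in kept
            if not (r.parent is None and r.expired and r.count == 0)]
```

An expired parent with one live child survives the sweep; after the child expires as well, the next sweep removes both. The bug described in the commit is what happens when the second pass ignores `count`.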
* Bandaid locking change: mark static rule mutex recursive so re-entry when | sam | 2003-09-17 | 1 | -1/+2
  sending an ICMP packet doesn't cause a panic. A better solution is needed; possibly deferring the transmit to a dedicated thread.

  Observed by: "Aaron Wohl" <freebsd@soith.com>
* Add locking. | sam | 2003-09-17 | 1 | -164/+309
  o change timeout to MPSAFE callout
  o restructure rule deletion to deal with locking requirements
  o replace static buffer used for ipfw control operations with malloc'd storage

  Sponsored by: FreeBSD Foundation
* Allow set 31 to be used for rules other than 65535. | luigi | 2003-07-15 | 1 | -23/+27
  Set 31 is still special because rules belonging to it are not deleted by the "ipfw flush" command, but must be deleted explicitly with "ipfw delete set 31" or by individual rule numbers.

  This implements a flexible form of "persistent rules" which you might want to have available even after an "ipfw flush". Note that this change does not violate POLA, because you could not use set 31 in a ruleset before this change.

  sbin/ipfw changes to allow manipulation of set 31 will follow shortly.

  Suggested by: Paul Richards
* Implement comments embedded into ipfw2 instructions. | luigi | 2003-07-12 | 1 | -1/+1
  Since we already had 'O_NOP' instructions which always match, all I needed to do is allow the NOP command to have arbitrary length (i.e. move its label in a different part of the switch() which validates instructions).

  The kernel must know nothing about comments, everything else is done in userland (which will be described in the upcoming ipfw2.c commit).
* Merge the handlers of O_IP_SRC_MASK and O_IP_DST_MASK opcodes, and | luigi | 2003-07-08 | 1 | -17/+13
  support matching a list of addr/mask pairs so one can write more efficient rulesets which were not possible before e.g.

    add 100 skipto 1000 not src-ip 10.0.0.0/8,127.0.0.1/8,192.168.0.0/16

  The change is fully backward compatible. ipfw2 and manpage commit to follow.

  MFC after: 3 days
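Matching against a list of addr/mask pairs is a loop over "address AND mask equals network". The Python sketch below uses the list from the example rule; the function name is hypothetical and the code models the comparison only, not the opcode layout.

```python
import ipaddress


def matches_any(addr: str, pairs: str) -> bool:
    """True if addr falls inside any of the comma-separated addr/mask pairs."""
    a = int(ipaddress.IPv4Address(addr))
    for pair in pairs.split(","):
        # strict=False because an entry such as 127.0.0.1/8 has host bits set.
        net = ipaddress.ip_network(pair, strict=False)
        if a & int(net.netmask) == int(net.network_address):
            return True
    return False


PRIVATE = "10.0.0.0/8,127.0.0.1/8,192.168.0.0/16"
```

With a single rule carrying the whole list, `not src-ip ...` corresponds to `not matches_any(src, PRIVATE)`, where previously one rule per prefix would have been needed.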
* Implement the 'ipsec' option to match packets coming out of an ipsec tunnel. | luigi | 2003-07-04 | 1 | -0/+16
  Should work with both regular and fast ipsec (mutually exclusive). See manpage for more details.

  Submitted by: Ari Suutari (ari.suutari@syncrontech.com)
  Revised by: sam
  MFC after: 1 week
* whitespace fix | luigi | 2003-06-28 | 1 | -1/+1
* Remove whitespace at end of line. | luigi | 2003-06-23 | 1 | -4/+4
* Add support for multiple values and ranges for the "iplen", "ipttl", | luigi | 2003-06-22 | 1 | -12/+29
  "ipid" options. This feature has been requested by several users. In passing, fix some minor bugs in the parser. This change is fully backward compatible so if you have an old /sbin/ipfw and a new kernel you are not in trouble (but you need to update /sbin/ipfw if you want to use the new features).

  Document the changes in the manpage.

  Now you can write things like

    ipfw add skipto 1000 iplen 0-500

  which some people were asking for, to give preferential treatment to short packets.

  The 'MFC after' is just set as a reminder, because I still need to merge the Alpha/Sparc64 fixes for ipfw2 (which unfortunately change the size of certain kernel structures; not that it matters a lot since ipfw2 is entirely optional and not the default...)

  PR: bin/48015
  MFC after: 1 week
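The "multiple values and ranges" syntax can be modelled with a tiny parser and an inclusive comparison. This Python sketch is an illustration only: the function names are hypothetical, and treating both ends of a range as inclusive is an assumption, not something the commit message states.

```python
def parse_ranges(spec: str):
    """Turn '40,100-200' into [(40, 40), (100, 200)]."""
    ranges = []
    for part in spec.split(","):
        low, dash, high = part.partition("-")
        # A bare value is stored as a one-element range.
        ranges.append((int(low), int(high) if dash else int(low)))
    return ranges


def value_matches(value: int, ranges) -> bool:
    """Assumed semantics: a value matches if any range contains it, ends included."""
    return any(low <= value <= high for low, high in ranges)
```

For the example rule, `value_matches(ip_len, parse_ranges("0-500"))` is the test that sends short packets to rule 1000.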
* Change handling to support strong alignment architectures such as alpha and | ticso | 2003-06-04 | 1 | -6/+15
  sparc64.

  PR: alpha/50658
  Submitted by: rizzo
  Tested on: alpha
* Account for packets processed at layer-2 (i.e. net.link.ether.ipfw=1). | kbyanc | 2003-06-02 | 1 | -3/+6
  MFC after: 2 weeks
* Add a 'verrevpath' option that verifies the interface that a packet | cjc | 2003-03-15 | 1 | -0/+50
  comes in on is the same interface that we would route out of to get to the packet's source address. Essentially automates an anti-spoofing check using the information in the routing table.

  Experimental. The usage and rule format for the feature may still be subject to change.
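The reverse-path test amounts to "route back to the source, then compare interfaces". A Python sketch of that idea follows; the flat list of (prefix, interface) routes is a hypothetical stand-in for the routing table, and the behaviour when no route exists (no match) is an assumption.

```python
import ipaddress


def verrevpath(src: str, in_iface: str, routes) -> bool:
    """True if the best route towards src leaves through in_iface.

    routes: iterable of (prefix, interface) pairs, hypothetical format.
    """
    addr = ipaddress.ip_address(src)
    best = None
    for prefix, iface in routes:
        net = ipaddress.ip_network(prefix)
        # Longest-prefix match decides which interface we would route out of.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, iface)
    # Assumption: with no route at all, the packet does not match.
    return best is not None and best[1] == in_iface
```

Unlike the looser source-reachability check elsewhere in this log, the interface comparison is what makes this a strict check: a packet arriving on the "wrong" side fails even though a route to its source exists.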
* Back out M_* changes, per decision of the TRB. | imp | 2003-02-19 | 1 | -2/+2
  Approved by: trb
* o Fix ipfw uid rules: socheckuid() returns 0 when uid matches a socket | maxim | 2003-02-17 | 1 | -2/+2
  cr_uid.

  Note: we do not have socheckuid() in RELENG_4, ip_fw2.c uses its own macro for a similar purpose that is why ipfw2 in RELENG_4 processes uid rules correctly. I will MFC the diff for code consistency.

  Reported by: Oleg Baranov <ol@csa.ru>
  Reviewed by: luigi
  MFC after: 1 month
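The bug class here is the familiar one of treating an errno-style return value as a boolean. The Python model below shows the inversion; `socheckuid_model` is a hypothetical stand-in that returns 0 on a match as the commit states, and its use of EPERM for a mismatch is an assumption.

```python
import errno


def socheckuid_model(socket_uid: int, uid: int) -> int:
    """Errno-style result: 0 means the uid matches the socket's owner."""
    return 0 if socket_uid == uid else errno.EPERM  # EPERM is an assumption


def uid_rule_matches(socket_uid: int, rule_uid: int) -> bool:
    """Correct use: success is signalled by 0."""
    return socheckuid_model(socket_uid, rule_uid) == 0


def uid_rule_matches_inverted(socket_uid: int, rule_uid: int) -> bool:
    """The mistake: using the return value directly as 'matched'."""
    return bool(socheckuid_model(socket_uid, rule_uid))
```

With the inverted form, a rule for uid 1001 matches every socket except those owned by uid 1001, which is how such a bug makes uid rules appear to do the opposite of what was written.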
* Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. | alfred | 2003-01-21 | 1 | -2/+2
  Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
* If the first action is O_LOG adjust a pointer to the real one, unbreaks | maxim | 2003-01-20 | 1 | -0/+2
  skipto + log rules.

  Reported by: Wiktor Niesiobedzki <w@evip.pl>
  MFC after: 1 week
* Introduce the ability to flag a sysctl for operation at secure level 2 or 3 | dillon | 2003-01-14 | 1 | -3/+3
  in addition to secure level 1. The mask supports up to a secure level of 8 but only add defines through CTLFLAG_SECURE3 for now.

  As per the missive in the log entry for 1.11 of ip_fw2.c which added the secure flag to the IPFW sysctls in the first place, change the secure level requirement from 1 to 3 now that we have support for it.

  Reviewed by: imp
  With Design Suggestions by: imp
* Bridged packets are supplied to the firewall with their IP header | iedowse | 2002-12-27 | 1 | -2/+8
  in network byte order, but icmp_error() expects the IP header to be in host order and the code here did not perform the necessary swapping for the bridged case. This bug causes an "icmp_error: bad length" panic when certain length IP packets (e.g. ip_len == 0x100) are rejected by the firewall with an ICMP response.

  MFC after: 3 days
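The effect of a missing byte swap on a 16-bit length field is easy to reproduce in user space. The Python sketch below assumes a little-endian host and assumes that 0x100 is the true datagram length; the helper name is hypothetical and the code says nothing about the exact check inside icmp_error().

```python
import struct


def misread_len(true_len: int) -> int:
    """Value a little-endian host sees if it reads a network-order ip_len as-is."""
    wire = struct.pack("!H", true_len)    # the two bytes as they sit in the header
    return struct.unpack("<H", wire)[0]   # read without the ntohs() step


seen = misread_len(0x100)
```

For a 256-byte datagram the unswapped field reads as 1, a length smaller than any valid IP header, which is consistent with a "bad length" sanity check tripping on that particular size.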
* o De-anonymize dummynet(4) and ipfw(4) messages, prepend them | maxim | 2002-12-24 | 1 | -15/+16
  with 'dummynet: ' and 'ipfw: ' prefixes.

  PR: kern/41609