path: root/sys/netpfil
Commit message [Author, Age, Files, Lines]
* These files were getting sys/malloc.h and vm/uma.h with header pollution [glebius, 2016-02-01, 1 file, -1/+2]
| | | | via sys/mbuf.h
* cleanup and document in some detail the internals of the testing code [luigi, 2016-01-27, 5 files, -143/+199]
| | | | for dummynet schedulers
* the _Static_assert was not supposed to be in the commit. [luigi, 2016-01-27, 1 file, -1/+0]
* bugfix: the scheduler template (dn_schk) for the round robin scheduler [luigi, 2016-01-27, 1 file, -1/+2]
| | | | | | | | | | | | is followed by another structure (rr_schk) whose size must be set in the schk_datalen field of the descriptor. Not allocating the memory may cause other memory to be overwritten (though dn_schk is 192 bytes and rr_schk only 12 so we may be lucky and end up in the padding after the dn_schk). This is a merge candidate for stable and 10.3 MFC after: 3 days
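A minimal sketch of the layout this bugfix describes, assuming hypothetical stand-in types (`base_schk`, `rr_private`, `sched_desc`) rather than the real dummynet structures. The private scheduler data sits directly after the template, so the descriptor's `schk_datalen` must cover it when the memory is allocated:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; the real dn_schk and rr_schk differ in content. */
struct base_schk { char opaque[192]; };
struct rr_private { int min_q; int max_q; int q_bytes; };

/* Simplified descriptor: only the field relevant to the fix. */
struct sched_desc {
	const char *name;
	size_t schk_datalen;	/* bytes of private data following the template */
};

/* Allocate the template plus the trailing private area in one block. */
static void *
schk_alloc(const struct sched_desc *d)
{
	assert(d != NULL);
	return calloc(1, sizeof(struct base_schk) + d->schk_datalen);
}

/* The private data starts right after the template. */
static struct rr_private *
schk_private(void *base)
{
	return (struct rr_private *)((struct base_schk *)base + 1);
}
```

With `schk_datalen` left at zero, `schk_private()` would point past the end of the allocation, which is the overwrite the commit message warns about.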
* fix various warnings to compile the test code with -Wextra [luigi, 2016-01-26, 3 files, -3/+9]
* fix various warnings (signed/unsigned, printf types, unused arguments) [luigi, 2016-01-26, 1 file, -13/+16]
* prevent warnings for signed/unsigned comparisons and unused arguments. [luigi, 2016-01-26, 1 file, -6/+14]
| | | | Add checks for parameters overflowing 32 bit.
* prevent warning for unused argument [luigi, 2016-01-26, 1 file, -0/+1]
* avoid warnings for signed/unsigned comparison and unused arguments [luigi, 2016-01-26, 1 file, -1/+3]
* Revert one chunk from commit 285362, which introduced an off-by-one error [luigi, 2016-01-26, 1 file, -2/+6]
| | | | | | | | | in computing a shift index. The error was due to the mixed use of fls() and __fls() in another implementation of QFQ. To keep the problem from recurring, properly document which incarnation of the function we need. Note that the bug only affects QFQ in FreeBSD head since last July, as the patch was not merged to other versions.
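The off-by-one comes from two conventions for "find last set bit": FreeBSD's fls() is 1-based and returns 0 when no bit is set, while Linux's __fls() is 0-based and undefined for 0. A small sketch with portable stand-ins (the names `fls_1based` and `fls_0based` are illustrative, not kernel functions):

```c
#include <assert.h>

/* 1-based index of the highest set bit; 0 if no bit is set
 * (the convention of FreeBSD's fls()). */
static int
fls_1based(unsigned long x)
{
	int n = 0;

	while (x != 0) {
		n++;
		x >>= 1;
	}
	return n;
}

/* 0-based index of the highest set bit; undefined for 0
 * (the convention of Linux's __fls()). */
static int
fls_0based(unsigned long x)
{
	assert(x != 0);
	return fls_1based(x) - 1;
}
```

For any non-zero argument the two results differ by exactly one, so a shift index ported from one convention to the other needs an explicit adjustment.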
* MFP r287070,r287073: split radix implementation and route table structure. [melifaro, 2016-01-25, 2 files, -54/+55]
| There are a number of radix consumers in kernel land (pf, ipfw, nfs, route) with different requirements. In fact, the first 3 don't have _any_ requirements and the first 2 do not use radix locking. On the other hand, the routing structure does have these requirements (rnh_gen, multipath, custom to-be-added control plane functions, different locking). Additionally, radix should not know anything about its consumers' internals.
|
| So, radix code now uses a tiny 'struct radix_head' structure along with an internal 'struct radix_mask_head' instead of 'struct radix_node_head'. Existing consumers still use the same 'struct radix_node_head' with slight modifications: they need to pass a pointer to the (embedded) 'struct radix_head' to all radix callbacks.
|
| Routing code now uses the new 'struct rib_head' with different locking macros: the RADIX_NODE_HEAD prefix was renamed to RIB_ (which stands for routing information base).
|
| A new net/route_var.h header was added to hold routing subsystem internal data. 'struct rib_head' was placed there. 'struct rtentry' will also be moved there soon.
* Fix panic on table/table entry delete. The panic could have happened [melifaro, 2016-01-21, 1 file, -0/+1]
| | | | | | | | | | | | | | if more than 64 distinct values had been used. Table value code uses internal objhash API which requires unique key for each object. For value code, pointer to the actual value data is used. The actual problem arises from the fact that 'actual' e.g. runtime data is stored in array and that array is auto-growing. There is special hook (update_tvalue() function) which is used to update the pointers after the change. For some reason, object 'key' was not updated. Fix this by adding update code to the update_tvalue(). Sponsored by: Yandex LLC
* Initialize error value ta_lookup_kfib() by default to please compiler. [melifaro, 2016-01-10, 1 file, -3/+1]
* Initialize error after r293626 in case neither INET nor INET6 is [bz, 2016-01-10, 1 file, -0/+3]
| | | | | | compiled into the kernel. Ideally lots more code would just not be called (or compiled in) in that case but that requires a lot more surgery. For now try to make IP-less kernels compile again.
* Make ipfw addr:kfib lookup algo use new routing KPI. [melifaro, 2016-01-10, 1 file, -49/+72]
* Use already pre-calculated number of entries instead of tc->count. [melifaro, 2016-01-10, 1 file, -1/+1]
* Remove sys/eventhandler.h from net/route.h [melifaro, 2016-01-09, 1 file, -0/+1]
| | | | Reviewed by: ae
* Convert pf(4) to the new routing API. [melifaro, 2016-01-07, 1 file, -42/+89]
| | | | Differential Revision: https://reviews.freebsd.org/D4763
* Properly drain callouts in the IPFW subsystem to avoid use after free [hselasky, 2015-12-15, 3 files, -6/+12]
| panics when unloading the dummynet and IPFW modules:
| - The callout drain function can sleep and should not be called while holding a non-sleepable lock. Remove locks around "ipfw_dyn_uninit(0)".
| - Add a new "dn_gone" variable to prevent asynchronous restart of dummynet callouts when unloading the dummynet kernel module.
| - Call "dn_reschedule()" locked so that "dn_gone" can be set and checked atomically with regard to starting a new callout.
|
| Reviewed by: hiren
| MFC after: 1 week
| Differential Revision: https://reviews.freebsd.org/D3855
* Merge helper fib* functions used for basic lookups. [melifaro, 2015-12-08, 1 file, -59/+23]
| The vast majority of rtalloc(9) users require only basic info from the route table (e.g. "does the rtentry interface match the interface I have?", "what is the MTU?", "give me the IPv4 source address to use", etc.).
|
| Instead of hand-rolling lookups, checking if the rtentry is up and valid, dealing with the IPv6 MTU, and finding the "address" ifp (almost never done right), provide an easy-to-use API hiding all the complexity and returning the needed info in a small on-stack structure.
|
| This change also helps hide route subsystem internals (locking, direct rtentry accesses). Additionally, using this API improves lookup performance since the rtentry is not locked. (This is safe, since all rtentry changes happen under both the radix WLOCK and the rtentry WLOCK.)
|
| Sponsored by: Yandex LLC
* Add destroy_object callback to object rewriting framework. [ae, 2015-11-23, 2 files, -2/+11]
| | | | | | | | It is called when last reference to named object is going to be released and allows to do additional cleanup for implementation of named objects. Obtained from: Yandex LLC Sponsored by: Yandex LLC
* Fix dynamic IPv6 rules showing junk for non-specified address masks. [bdrewery, 2015-11-17, 1 file, -0/+3]
| For example:
|   00002 0 0 (19s) PARENT 1 tcp 10.10.0.5 0 <-> 0.0.0.0 0
|   00002 4 412 (1s) LIMIT tcp 10.10.0.5 25848 <-> 10.10.0.7 22
|   00002 10 777 (1s) LIMIT tcp 2001:894:5a24:653::503:1 52023 <-> 2001:894:5a24:653:ca0a:a9ff:fe04:3978 22
|   00002 0 0 (17s) PARENT 1 tcp 2001:894:5a24:653::503:1 0 <-> 80f3:70d:23fe:ffff:1005:: 0
|
| Fix this by zeroing the unused address, as is done for IPv4:
|   00002 0 0 (18s) PARENT 1 tcp 10.10.0.5 0 <-> 0.0.0.0 0
|   00002 36 14952 (1s) LIMIT tcp 10.10.0.5 25848 <-> 10.10.0.7 22
|   00002 0 0 (0s) PARENT 1 tcp 2001:894:5a24:653::503:1 0 <-> :: 0
|   00002 4 345 (274s) LIMIT tcp 2001:894:5a24:653::503:1 34131 <-> 2001:470:1f11:262:ca0a:a9ff:fe04:3978 22
|
| MFC after: 2 weeks
* Bring back the ability of passing cached route via nd6_output_ifp(). [melifaro, 2015-11-15, 1 file, -1/+1]
* This fixes several places where callout_stop's return is examined. The [rrs, 2015-11-13, 1 file, -2/+2]
| | | | | | | | | | new return codes of -1 were mistakenly being considered "true". Callout_stop now returns -1 to indicate the callout had either already completed or was not running and 0 to indicate it could not be stopped. Also update the manual page to make it more consistent no non-zero in the callout_stop or callout_reset descriptions. MFC after: 1 Month with associated callout change.
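A sketch of the mistake described above, assuming (as the message implies) that only a positive return means the callout was actually stopped. The helper names are hypothetical and no real callout API is called:

```c
#include <assert.h>

/* Buggy test: treats any non-zero return, including -1, as "stopped". */
static int
stopped_buggy(int ret)
{
	assert(ret >= -1 && ret <= 1);
	return ret != 0;
}

/* Fixed test: -1 (already completed or not running) and 0 (could not
 * be stopped) both mean that nothing was cancelled. */
static int
stopped_fixed(int ret)
{
	assert(ret >= -1 && ret <= 1);
	return ret > 0;
}
```

A caller that drops a reference or frees state only when the callout was cancelled would act wrongly on -1 with the first form.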
* Print proper setfib values in ipfw log. [melifaro, 2015-11-08, 1 file, -1/+1]
| | | | Submitted by: Denis Schneider <v1ne2go at gmail>
* Fix setfib target. [melifaro, 2015-11-08, 2 files, -3/+3]
| | | | | | Problem was introduced in r272840 when converting tablearg value to 0. Submitted by: Denis Schneider <v1ne2go at gmail>
* pf: Fix broken rule skip calculation [kp, 2015-11-07, 1 file, -2/+2]
| | | | | | | | r289932 accidentally broke the rule skip calculation. The address family argument to PF_ANEQ() is now important, and because it was set to 0 the macro always evaluated to false. This resulted in incorrect skip values, which in turn broke the rule evaluations.
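A simplified model of why an address-family argument of 0 makes the comparison always false. This is a sketch, not the actual PF_ANEQ() macro; `addr_sketch` and `addr_neq` are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>

/* Hypothetical address container: four 32-bit words, enough for IPv6. */
struct addr_sketch { uint32_t addr32[4]; };

/* "Addresses not equal", evaluated per address family. */
static int
addr_neq(const struct addr_sketch *a, const struct addr_sketch *b, int af)
{
	assert(a != NULL && b != NULL);
	if (af == AF_INET)
		return a->addr32[0] != b->addr32[0];
	if (af == AF_INET6)
		return a->addr32[0] != b->addr32[0] ||
		    a->addr32[1] != b->addr32[1] ||
		    a->addr32[2] != b->addr32[2] ||
		    a->addr32[3] != b->addr32[3];
	return 0;	/* unknown family: never reports a difference */
}
```

With `af` set to 0 neither branch matches, so two different addresses compare as "not unequal" and skip steps computed from that result are wrong.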
* Remove now obsolete KASSERT. [ae, 2015-11-03, 1 file, -6/+0]
| | | | | | Actually, object classify callbacks can skip some opcodes that could be rewritten. We will determine the real number of rewritten opcodes a bit later in this function. Reported by: David H. Wolfskill <david at catwhisker dot org>
* Eliminate any conditional increments of object_opcodes in the [ae, 2015-11-03, 2 files, -3/+9]
| | | | | | | | | check_ipfw_rule_body() function. This function is intended only to determine whether a rule has some opcodes that can be rewritten. Then the ref_rule_objects() function will determine the real number of rewritten opcodes using the classify callback. Reviewed by: melifaro Obtained from: Yandex LLC Sponsored by: Yandex LLC
* Add ipfw_check_object_name_generic() function to do basic checks for an [ae, 2015-11-03, 4 files, -30/+17]
| | | | | | | | | object name correctness. Each type of object can do more strict checking in own implementation. Do such checks for tables in check_table_name(). Reviewed by: melifaro Obtained from: Yandex LLC Sponsored by: Yandex LLC
* Implement `ipfw internal olist` command to list named objects. [ae, 2015-11-03, 2 files, -5/+63]
| | | | | | Reviewed by: melifaro Obtained from: Yandex LLC Sponsored by: Yandex LLC
* pf: Fix IPv6 checksums with route-to. [kp, 2015-10-29, 1 file, -0/+7]
| | | | | | | | | | | | | | When using route-to (or reply-to) pf sends the packet directly to the output interface. If that interface doesn't support checksum offloading the checksum has to be calculated in software. That was already done in the IPv4 case, but not for the IPv6 case. As a result we'd emit packets with pseudo-header checksums (i.e. incorrect checksums). This issue was exposed by the changes in r289316 when pf stopped performing full checksum calculations for all packets. Submitted by: Luoqi Chen MFC after: 1 week
* Eliminate last rtalloc_ign() caller. [melifaro, 2015-10-27, 1 file, -3/+0]
| | | | Differential Revision: https://reviews.freebsd.org/D3927
* pf: Fix TSO issues [kp, 2015-10-14, 3 files, -50/+88]
| | | | | | | | | | | | | | | | | | | | | In certain configurations (mostly but not exclusively as a VM on Xen) pf produced packets with an invalid TCP checksum. The problem was that pf could only handle packets with a full checksum. The FreeBSD IP stack produces TCP packets with a pseudo-header checksum (only addresses, length and protocol). Certain network interfaces expect to see the pseudo-header checksum, so they end up producing packets with invalid checksums. To fix this stop calculating the full checksum and teach pf to only update TCP checksums if TSO is disabled or the change affects the pseudo-header checksum. PR: 154428, 193579, 198868 Reviewed by: sbruno MFC after: 1 week Relnotes: yes Sponsored by: RootBSD Differential Revision: https://reviews.freebsd.org/D3779
* Bump number of prefixes in O_IP_<SRC|DST> from 15 to 31 (max possible). [melifaro, 2015-10-03, 1 file, -1/+1]
| | | | | | PR: 203459 Submitted by: groos at xiplink.com MFC after: 2 weeks
* Simplify the way of attaching IPv6 link-layer header. [melifaro, 2015-09-16, 1 file, -1/+1]
| Problem description:
| How do we currently perform layer 2 resolution and header imposition:
|
| For IPv4 we have the following chain:
|   ip_output() -> (ether|atm|whatever)_output() -> arpresolve()
|
| Lookup is done in the proper place (link-layer output routine) and it is possible to provide cached lle data.
|
| For IPv6 the situation is more complex:
|   ip6_output() -> nd6_output() -> nd6_output_ifp() -> (whatever)_output() -> nd6_storelladdr()
|
| We have ip6_output() which calls nd6_output() instead of the link output routine. nd6_output() does the following:
| * checks if lle exists, creates it if needed (similar to arpresolve())
| * performs lle state transitions (similar to arpresolve())
| * calls nd6_output_ifp() which pushes packets to the link output routine along with running SeND/MAC hooks regardless of lle state (e.g. works as run-hooks placeholder).
|
| After that, an iface output routine like ether_output() calls nd6_storelladdr() which performs the lle lookup once again.
|
| As a result, we perform the lookup twice for each outgoing packet for most types of interfaces. We also need to maintain a runtime-checked table of 'nd6-free' interfaces (see nd6_need_cache()).
|
| Fix this behavior by eliminating the first ND lookup. To be more specific:
| * make all nd6_output() consumers use nd6_output_ifp() instead
| * rename nd6_output[_slow]() to nd6_resolve[_slow]()
| * convert nd6_resolve() and nd6_resolve_slow() to arpresolve() semantics, e.g. copy the L2 address to a buffer instead of pushing the packet towards lower layers
| * make all nd6_storelladdr() users use nd6_resolve()
| * eliminate nd6_storelladdr()
|
| The resulting callchain is the following:
|   ip6_output() -> nd6_output_ifp() -> (whatever)_output() -> nd6_resolve()
|
| Error handling:
| Currently sending a packet to a non-existing la results in ip6_<output|forward> -> nd6_output() -> nd6_output_lle() which returns 0. In the new scenario the packet is propagated to <ether|whatever>_output() -> nd6_resolve() which will return EWOULDBLOCK, and that result will be converted to 0.
|
| (And EWOULDBLOCK is actually used by IB/TOE code).
|
| Sponsored by: Yandex LLC
| Differential Revision: https://reviews.freebsd.org/D1469
* pf: Fix misdetection of forwarding when net.link.bridge.pfil_bridge is set [kp, 2015-09-01, 1 file, -1/+11]
| | | | | | | | | | | | | | If net.link.bridge.pfil_bridge is set we can end up thinking we're forwarding in pf_test6() because the rcvif and the ifp (output interface) are different. In that case we're bridging though, and the rcvif is the bridge member on which the packet was received and ifp is the bridge itself. If we'd set dir to PF_FWD we'd end up calling ip6_forward() which is incorrect. Instead check if the rcvif is a member of the ifp bridge. (In other words, the if_bridge is the ifp's softc). If that's the case we're not forwarding but bridging. PR: 202351 Reviewed by: eri Differential Revision: https://reviews.freebsd.org/D3534
* pf: Remove support for 'scrub fragment crop|drop-ovl' [kp, 2015-08-27, 1 file, -479/+31]
| | | | | | | | | | | | | | The crop/drop-ovl fragment scrub modes are not very useful and likely to confuse users into making poor choices. It's also a fairly large amount of complex code, so just remove the support altogether. Users who have 'scrub fragment crop|drop-ovl' in their pf configuration will be implicitly converted to 'scrub fragment reassemble'. Reviewed by: gnn, eri Relnotes: yes Differential Revision: https://reviews.freebsd.org/D3466
* Fix packets/bytes accounting on i386. [melifaro, 2015-08-27, 1 file, -1/+1]
| | | | Spotted by: julian
* Reapply r196551 which was accidentally reverted by r223637 (update to [loos, 2015-08-24, 1 file, -1/+1]
| | | | | | | | | | | OpenBSD pf 4.5). Fix argument ordering to memcpy as well as the size of the copy in the (theoretical) case that pfi_buffer_cnt should be greater than ~_max. This fixes the failure when you hit the self table size and force it to be resized. MFC after: 3 days Sponsored by: Rubicon Communications (Netgate)
* Add ALTQ(9) support for the CoDel algorithm. [loos, 2015-08-21, 1 file, -0/+7]
| | | | | | | | | | | CoDel is a parameterless queue discipline that handles variable bandwidth and RTT. It can be used as the single queue discipline on an interface or as a sub discipline of existing queue disciplines such as PRIQ, CBQ, HFSC, FAIRQ. Differential Revision: https://reviews.freebsd.org/D3272 Reviewed by: rpaulo, gnn (previous version) Obtained from: pfSense Sponsored by: Rubicon Communications (Netgate)
* Fix the copy of addresses passed from userland in table replace command. [loos, 2015-08-17, 1 file, -2/+1]
| | | | | | | | | The size2 is the maximum userland buffer size (used when the addresses are copied back to userland). Obtained from: pfSense MFC after: 3 days Sponsored by: Rubicon Communications (Netgate)
* Use correct src/dst ports when removing states. [oshogbo, 2015-08-11, 1 file, -2/+2]
| | | | | | | | | Submitted by: Milosz Kaniewski <m.kaniewski@wheelsystems.com>, UMEZAWA Takeshi <umezawa@iij.ad.jp> (original) Reviewed by: glebius Approved by: pjd (mentor) Obtained from: OpenBSD MFC after: 3 days
* Reduce overhead of ipfw's me6 opcode. [ae, 2015-07-29, 1 file, -23/+27]
| | | | | | | | | | | | Skip checks for IPv6 multicast addresses. Use in6_localip() for global unicast. And for IPv6 link-local addresses do search in the IPv6 addresses list. Since LLA are stored in the kernel internal form, use IN6_ARE_MASKED_ADDR_EQUAL() macro with lla_mask for addresses comparison. lla_mask has zero bits in the second word, where we keep sin6_scope_id. Obtained from: Yandex LLC Sponsored by: Yandex LLC
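A sketch of a masked address comparison in the spirit of IN6_ARE_MASKED_ADDR_EQUAL(), using a hypothetical 16-byte `addr6_sketch` type. The mask below is an assumption made for illustration: it clears the second 16-bit word, where the message says the scope id is kept in the kernel-internal form:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 128-bit address as raw bytes. */
struct addr6_sketch { uint8_t b[16]; };

/* Illustrative mask: all bits significant except bytes 2 and 3. */
static const struct addr6_sketch lla_mask_sketch = {{
	0xff, 0xff, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
}};

/* Two addresses are equal under a mask when they agree on every
 * bit that the mask selects. */
static int
masked_equal(const struct addr6_sketch *x, const struct addr6_sketch *y,
    const struct addr6_sketch *m)
{
	assert(x != NULL && y != NULL && m != NULL);
	for (size_t i = 0; i < sizeof(x->b); i++)
		if (((x->b[i] ^ y->b[i]) & m->b[i]) != 0)
			return 0;
	return 1;
}
```

Two link-local addresses that differ only in the embedded scope bytes compare equal under this mask, while a difference anywhere else is still detected.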
* pf: Always initialise pf_fragment.fr_flags [kp, 2015-07-29, 1 file, -3/+1]
| | | | | | | | | | | | | | | When we allocate the struct pf_fragment in pf_fillup_fragment() we forgot to initialise the fr_flags field. As a result we sometimes mistakenly thought the fragment to not be a buffered fragment. This resulted in panics because we'd end up freeing the pf_fragment but not removing it from V_pf_fragqueue (believing it to be part of V_pf_cachequeue). The next time we iterated V_pf_fragqueue we'd use a freed object and panic. While here also fix a pf_fragment use after free in pf_normalize_ip(). pf_reassemble() frees the pf_fragment, so we can't use it any more. PR: 201879, 201932 MFC after: 5 days
* Simplify logic added in r285945 as suggested by glebius [garga, 2015-07-28, 1 file, -4/+2]
| | | | | | Approved by: glebius MFC after: 3 days Sponsored by: Netgate
* Respect pf rule log option before logging dropped packets with IP options or [garga, 2015-07-28, 1 file, -2/+4]
| | | | | | | | | | | dangerous v6 headers Reviewed by: gnn, eri Approved by: gnn Obtained from: pfSense MFC after: 3 days Sponsored by: Netgate Differential Revision: https://reviews.freebsd.org/D3222
* Fix a typo in r280169. Of course we are interested in deleting nsn only [glebius, 2015-07-28, 1 file, -1/+1]
| | | | | | if we have just created it and we were the last reference. Submitted by: dhartmei
* Add helper functions for IP checksum adjusting. Use these functions in [ae, 2015-07-20, 3 files, -17/+26]
| | | | | | | | dummynet code and for setdscp. This fixes wrong checksums in some cases. Obtained from: Yandex LLC MFC after: 2 weeks Sponsored by: Yandex LLC
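A sketch of incremental checksum adjustment as described in RFC 1624 (HC' = ~(~HC + ~m + m')), shown next to a full recomputation. The function names are hypothetical, not the helpers added by this commit, and words are handled in host byte order for simplicity:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit one's-complement sum. */
static uint16_t
fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Full Internet checksum over n 16-bit words
 * (the checksum field itself must be zero). */
static uint16_t
cksum_full(const uint16_t *words, size_t n)
{
	uint32_t sum = 0;

	assert(words != NULL);
	for (size_t i = 0; i < n; i++)
		sum += words[i];
	return (uint16_t)~fold16(sum);
}

/* Update a checksum after one 16-bit word changed from old_val
 * to new_val, without touching the rest of the header. */
static uint16_t
cksum_adjust(uint16_t cksum, uint16_t old_val, uint16_t new_val)
{
	uint32_t sum = (uint16_t)~cksum;

	sum += (uint16_t)~old_val;
	sum += new_val;
	return (uint16_t)~fold16(sum);
}
```

Changing a field such as the DSCP byte and then calling an adjust function of this kind keeps the header checksum valid without re-summing the whole header.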
* assorted algorithmic fixes from Paolo Valente (one of my qfq coauthors): [luigi, 2015-07-10, 1 file, -7/+13]
| - use 1ULL to avoid shift truncations
| - recompute the sum of weight dynamically to provide better fairness
| - fix an erroneous constant in the computation of the slot
| - preserve timestamp correctness when the old timestamp is stale.
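A sketch of the shift truncation that the 1ULL change avoids, assuming a platform where int is 32 bits wide. The helper names are illustrative; the point is that a shift is carried out in the type of its left operand, so widening the result afterwards is too late:

```c
#include <assert.h>
#include <stdint.h>

/* Shift performed in 32 bits, then widened: high bits are already lost. */
static uint64_t
shift_narrow(uint32_t v, unsigned int s)
{
	assert(s < 32);
	return v << s;
}

/* Widen first, then shift: the full 64-bit result is kept. */
static uint64_t
shift_wide(uint32_t v, unsigned int s)
{
	assert(s < 64);
	return (uint64_t)v << s;
}

/* Single-bit mask built from a 64-bit constant, as with 1ULL. */
static uint64_t
bit64(unsigned int s)
{
	assert(s < 64);
	return 1ULL << s;
}
```

Writing `1 << s` with a plain int constant has the same problem as `shift_narrow()` and is undefined once `s` reaches the width of int.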