| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
only one protocol switch structure that is shared between ipv4 and ipv6.
Phabric: D476
Reviewed by: jhb
|
|
|
|
|
|
|
| |
on stack to avoid unaligned access.
PR: 187381
Submitted by: Lytochkin Boris <lytboris gmail.com>
|
| |
|
|
|
|
|
|
|
|
|
|
| |
These changes prevent sysctl(8) from returning proper output,
such as:
1) no output from sysctl(8)
2) erroneously returning ENOMEM with tools like truss(1)
or uname(1)
truss: can not get etype: Cannot allocate memory
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating childrens of a static/extern SYSCTL
node. This operation should probably be made into a factored out
common macro, hence some device drivers use this. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading cludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.
Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, hence malloc()/free() is
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.
MFC after: 2 weeks
Sponsored by: Mellanox Technologies
|
|
|
|
|
|
| |
DNOLD_* flags are for compat with old binaries.
Suggested by: luigi
|
|
|
|
|
|
|
|
|
|
| |
Changes include both DCTCP and RFC 3168 ECN marking methodology.
DCTCP draft: http://tools.ietf.org/html/draft-bensley-tcpm-dctcp-00
Submitted by: Midori Kato (aoimidori27@gmail.com)
Worked with: Lars Eggert (lars@netapp.com)
Reviewed by: luigi, hiren
|
|
|
|
| |
not a maximum ID value (so it is a cap on mp_ncpus, not mp_maxid).
|
|
|
|
|
|
|
|
|
| |
in the mask when calling LibAliasSetMode() to properly clear unneeded
options.
PR: 189655
MFC after: 1 week
Sponsored by: Yandex LLC
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add `flags` u16 field to the hole in ipfw_table_xentry structure.
Kernel has been guessing address family for supplied record based
on xent length size.
Userland, however, has been getting fixed-size ipfw_table_xentry structures
guessing address family by checking address by IN6_IS_ADDR_V4COMPAT().
Fix this behavior by providing specific IPFW_TCF_INET flag for IPv4 records.
PR: bin/189471
Submitted by: Dennis Yusupoff <dyr@smartspb.net>
MFC after: 2 weeks
|
|
|
|
|
|
|
|
| |
!(PFRULE_FRAGCROP|PFRULE_FRAGDROP) case.
o In the (PFRULE_FRAGCROP|PFRULE_FRAGDROP) case we should allocate mtag
if we don't find any.
Tested by: Ian FREISLICH <ianf cloudseed.co.za>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- DIOCADDADDR adds addresses and puts them into V_pf_pabuf
- DIOCADDRULE takes all addresses from V_pf_pabuf and links
them into rule.
The ugly part is that if address is a table, then it is initialized
in DIOCADDRULE, because we need ruleset, and DIOCADDADDR doesn't
supply ruleset. But if address is a dynaddr, we need address family,
and address family could be different for different addresses in one
rule, so dynaddr is initialized in DIOCADDADDR.
This leads to the entangled state of addresses on V_pf_pabuf. Some are
initialized, and some not. That's why running pf_empty_pool(&V_pf_pabuf)
can lead to a panic on a NULL table address.
Since proper fix requires API/ABI change, for now simply plug the panic
in pf_empty_pool().
Reported by: danger
|
|
|
|
|
|
|
|
|
| |
De-virtualize UMA zone pf_mtag_z and move to global initialization part.
The m_tag struct does not know about vnet context and the pf_mtag_free()
callback is called unaware of current vnet. This causes a panic.
MFC after: 1 week
|
|
|
|
|
|
| |
PR: 188543
MFC after: 1 week
Sponsored by: Yandex LLC
|
|
|
|
|
|
|
| |
some regressions in ICMP handling, and right now me and Baptiste
are out of time on analyzing them.
PR: 188253
|
|
|
|
|
|
| |
CID: 1199366
Found with: Coverity Prevent(tm)
MFC after: 1 week
|
|
|
|
|
|
|
| |
Execute pf_overload_task() in vnet context. Fixes a vnet kernel panic.
Reviewed by: trociny
MFC after: 1 week
|
|
|
|
|
|
|
|
|
| |
De-vnet hash sizes and hash masks.
Submitted by: Nikos Vassiliadis <nvass gmx.com>
Reviewed by: trociny
MFC after: 1 month
|
|
|
|
|
| |
PR: kern/187665
Sponsored by: Nginx, Inc.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Use counter(9) for rt_pksent (former rt_rmx.rmx_pksent). This
removes another cache trashing ++ from packet forwarding path.
- Create zini/fini methods for the rtentry UMA zone. Via initialize
mutex and counter in them.
- Fix reporting of rmx_pksent to routing socket.
- Fix netstat(1) to report "Use" both in kvm(3) and sysctl(3) mode.
The change is mostly targeted for stable/10 merge. For head,
rt_pksent is expected to just disappear.
Discussed with: melifaro
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
|
|
|
|
|
|
|
| |
structure pf_rule, that are used when the structure is passed via
ioctl().
PR: 187074
|
|
|
|
|
| |
I am going to split this into two individual patches and test it with
the projects/pf branch that may get merged later.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Process V_pf_overloadqueue in vnet context [2]
This fixes two VIMAGE kernel panics and allows to simultaneously run host-pf
and vnet jails. pf inside jails remains broken.
PR: kern/182964
Submitted by: glebius@FreeBSD.org [2], myself [1]
Tested by: rodrigc@FreeBSD.org, myself
MFC after: 2 weeks
|
|
|
|
| |
longer stored in netinet but in netpfil.
|
|
|
|
|
|
| |
#ifdef INET6, since they are unused when INET6 is disabled.
MFC after: 3 days
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
race prone. Some just gather statistics, but some are later used in
different calculations.
A real problem was the race provoked underflow of the states_cur counter
on a rule. Once it goes below zero, it wraps to UINT32_MAX. Later this
value is used in pf_state_expires() and any state created by this rule
is immediately expired.
Thus, make fields states_cur, states_tot and src_nodes of struct
pf_rule be counter(9)s.
Thanks to Dennis for providing me shell access to problematic box and
his help with reproducing, debugging and investigating the problem.
Thanks to: Dennis Yusupoff <dyr smartspb.net>
Also reported by: dumbbell, pgj, Rambler
Sponsored by: Nginx, Inc.
|
|
|
|
|
|
|
|
| |
* move rarely-used fields down
* move uh_lock to different cacheline
* remove some usused fields
Sponsored by: Yandex LLC
|
|
|
|
| |
CID: 1009118
|
|
|
|
| |
CID: 1007035
|
|
|
|
| |
but this requires some more playing with IPFW_UH_WLOCK(). Leave till later.
|
|
|
|
|
|
|
|
|
| |
LibAliasSetAddress() uses its own mutex to serialize changes.
While here, convert ifp->if_xname access to if_name() function.
MFC after: 2 weeks
Sponsored by: Yandex LLC
|
|
|
|
|
|
| |
otherwise we will panic in pf_test_rule().
PR: 182557
|
|
|
|
|
|
|
| |
rnh_lookup is effectively the same as rnh_matchaddr if called with
empy network mask.
MFC after: 2 weeks
|
|
|
|
| |
in r257186. Found by clang 3.4.
|
|
|
|
|
|
|
| |
be unlinked.
Reported by: Konstantin Kukushkin <dark rambler-co.ru>
Sponsored by: Nginx, Inc.
|
|
|
|
|
|
|
|
|
|
|
| |
re-links dynamic states to default rule instead of
flushing on rule deletion.
This can be useful while performing ruleset reload
(think about `atomic` reload via changing sets).
Currently it is turned off by default.
MFC after: 2 weeks
Sponsored by: Yandex LLC
|
|
|
|
|
| |
MFC after: 2 weeks
Sponsored by: Yandex LLC
|
|
|
|
|
|
| |
Found by: Saychik Pavel <umka@localka.net>
MFC after: 2 weeks
Sponsored by: Yandex LLC
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"IPFW_WLOCK(chain);".
This lock gets deleted in sys/netpfil/ipfw/ip_fw2.c:vnet_ipfw_uninit().
Therefore, vnet_ipfw_nat_uninit() *must* be called before vnet_ipfw_uninit(),
but this doesn't always happen, because the VNET_SYSINIT order is the same for both functions.
In sys/net/netpfil/ipfw/ip_fw2.c and sys/net/netpfil/ipfw/ip_fw_nat.c,
IPFW_SI_SUB_FIREWALL == IPFW_NAT_SI_SUB_FIREWALL == SI_SUB_PROTO_IFATTACHDOMAIN
and
IPFW_MODULE_ORDER == IPFW_NAT_MODULE_ORDER
Consequently, if VIMAGE is enabled, and jails are created and destroyed,
the system sometimes crashes, because we are trying to use a deleted lock.
To reproduce the problem:
(1) Take a GENERIC kernel config, and add options for: VIMAGE, WITNESS,
INVARIANTS.
(2) Run this command in a loop:
jail -l -u root -c path=/ name=foo persist vnet && jexec foo ifconfig lo0 127.0.0.1/8 && jail -r foo
(see http://lists.freebsd.org/pipermail/freebsd-current/2010-November/021280.html )
Fix the problem by increasing the value of IPFW_NAT_SI_SUB_FIREWALL,
so that vnet_ipfw_nat_uninit() runs after vnet_ipfw_uninit().
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
where "m" is number of source nodes and "n" is number of states. Thus,
on heavy loaded router its processing consumed a lot of CPU time.
Reimplement it with O(m+n) complexity. We first scan through source
nodes and disconnect matching ones, putting them on the freelist and
marking with a cookie value in their expire field. Then we scan through
the states, detecting references to source nodes with a cookie, and
disconnect them as well. Then the freelist is passed to pf_free_src_nodes().
In collaboration with: Kajetan Staszkiewicz <kajetan.staszkiewicz innogames.de>
PR: kern/176763
Sponsored by: InnoGames GmbH
Sponsored by: Nginx, Inc.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Removed pf_remove_src_node().
- Introduce pf_unlink_src_node() and pf_unlink_src_node_locked().
These function do not proceed with freeing of a node, just disconnect
it from storage.
- New function pf_free_src_nodes() works on a list of previously
disconnected nodes and frees them.
- Utilize new API in pf_purge_expired_src_nodes().
In collaboration with: Kajetan Staszkiewicz <kajetan.staszkiewicz innogames.de>
Sponsored by: InnoGames GmbH
Sponsored by: Nginx, Inc.
|
|
|
|
| |
Sponsored by: Nginx, Inc.
|
|
|
|
| |
Sponsored by: Nginx, Inc.
|
|
|
|
| |
and add a block for userspace compiling.
|
| |
|
| |
|
|
|
|
| |
emulate the uma_zone for dynamic rules.
|
|
|
|
|
|
|
|
| |
so they can be used in the userspace version of ipfw/dummynet
(normally using netmap for the I/O path).
This is the first of a few commits to ease compiling the
ipfw kernel code in userspace.
|
|
|
|
|
|
|
| |
- Do not return blindly if proto isn't ICMP.
- The dport is in network order, so fix comparisons.
- Remove ridiculous htonl(arc4random()).
- Push local variable to a narrower block.
|