| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove the kernel optoion for IPSEC_FILTERTUNNEL, which was deprecated
more than 7 years ago in favour of a sysctl in r192648.
MFC r305122:
Remove redundant sanity checks from ipsec[46]_common_input_cb().
This check already has been done in the each protocol callback.
MFC r309144,309174,309201 (by fabient):
IPsec RFC6479 support for replay window sizes up to 2^32 - 32 packets.
Since the previous algorithm, based on bit shifting, does not scale
with large replay windows, the algorithm used here is based on
RFC 6479: IPsec Anti-Replay Algorithm without Bit Shifting.
The replay window will be fast to be updated, but will cost as many bits
in RAM as its size.
The previous implementation did not provide a lock on the replay window,
which may lead to replay issues.
Obtained from: emeric.poupon@stormshield.eu
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D8468
MFC r309143,309146 (by fabient):
In a dual processor system (2*6 cores) during IPSec throughput tests,
we see a lot of contention on the arc4 lock, used to generate the IV
of the ESP output packets.
The idea of this patch is to split this mutex in order to reduce the
contention on this lock.
Update r309143 to prevent false sharing.
Reviewed by: delphij, markm, ache
Approved by: so
Obtained from: emeric.poupon@stormshield.eu
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D8130
MFC r313330:
Merge projects/ipsec into head/.
Small summary
-------------
o Almost all IPsec releated code was moved into sys/netipsec.
o New kernel modules added: ipsec.ko and tcpmd5.ko. New kernel
option IPSEC_SUPPORT added. It enables support for loading
and unloading of ipsec.ko and tcpmd5.ko kernel modules.
o IPSEC_NAT_T option was removed. Now NAT-T support is enabled by
default. The UDP_ENCAP_ESPINUDP_NON_IKE encapsulation type
support was removed. Added TCP/UDP checksum handling for
inbound packets that were decapsulated by transport mode SAs.
setkey(8) modified to show run-time NAT-T configuration of SA.
o New network pseudo interface if_ipsec(4) added. For now it is
build as part of ipsec.ko module (or with IPSEC kernel).
It implements IPsec virtual tunnels to create route-based VPNs.
o The network stack now invokes IPsec functions using special
methods. The only one header file <netipsec/ipsec_support.h>
should be included to declare all the needed things to work
with IPsec.
o All IPsec protocols handlers (ESP/AH/IPCOMP protosw) were removed.
Now these protocols are handled directly via IPsec methods.
o TCP_SIGNATURE support was reworked to be more close to RFC.
o PF_KEY SADB was reworked:
- now all security associations stored in the single SPI namespace,
and all SAs MUST have unique SPI.
- several hash tables added to speed up lookups in SADB.
- SADB now uses rmlock to protect access, and concurrent threads
can do SA lookups in the same time.
- many PF_KEY message handlers were reworked to reflect changes
in SADB.
- SADB_UPDATE message was extended to support new PF_KEY headers:
SADB_X_EXT_NEW_ADDRESS_SRC and SADB_X_EXT_NEW_ADDRESS_DST. They
can be used by IKE daemon to change SA addresses.
o ipsecrequest and secpolicy structures were cardinally changed to
avoid locking protection for ipsecrequest. Now we support
only limited number (4) of bundled SAs, but they are supported
for both INET and INET6.
o INPCB security policy cache was introduced. Each PCB now caches
used security policies to avoid SP lookup for each packet.
o For inbound security policies added the mode, when the kernel does
check for full history of applied IPsec transforms.
o References counting rules for security policies and security
associations were changed. The proper SA locking added into xform
code.
o xform code was also changed. Now it is possible to unregister xforms.
tdb_xxx structures were changed and renamed to reflect changes in
SADB/SPDB, and changed rules for locking and refcounting.
Obtained from: Yandex LLC
Relnotes: yes
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D9352
MFC r313331:
Add removed headers into the ObsoleteFiles.inc.
MFC r313561 (by glebius):
Move tcp_fields_to_net() static inline into tcp_var.h, just below its
friend tcp_fields_to_host(). There is third party code that also uses
this inline.
MFC r313697:
Remove IPsec related PCB code from SCTP.
The inpcb structure has inp_sp pointer that is initialized by
ipsec_init_pcbpolicy() function. This pointer keeps strorage for IPsec
security policies associated with a specific socket.
An application can use IP_IPSEC_POLICY and IPV6_IPSEC_POLICY socket
options to configure these security policies. Then ip[6]_output()
uses inpcb pointer to specify that an outgoing packet is associated
with some socket. And IPSEC_OUTPUT() method can use a security policy
stored in the inp_sp. For inbound packet the protocol-specific input
routine uses IPSEC_CHECK_POLICY() method to check that a packet conforms
to inbound security policy configured in the inpcb.
SCTP protocol doesn't specify inpcb for ip[6]_output() when it sends
packets. Thus IPSEC_OUTPUT() method does not consider such packets as
associated with some socket and can not apply security policies
from inpcb, even if they are configured. Since IPSEC_CHECK_POLICY()
method is called from protocol-specific input routine, it can specify
inpcb pointer and associated with socket inbound policy will be
checked. But there are two problems:
1. Such check is asymmetric, becasue we can not apply security policy
from inpcb for outgoing packet.
2. IPSEC_CHECK_POLICY() expects that caller holds INPCB lock and
access to inp_sp is protected. But for SCTP this is not correct,
becasue SCTP uses own locks to protect inpcb.
To fix these problems remove IPsec related PCB code from SCTP.
This imply that IP_IPSEC_POLICY and IPV6_IPSEC_POLICY socket options
will be not applicable to SCTP sockets. To be able correctly check
inbound security policies for SCTP, mark its protocol header with
the PR_LASTHDR flag.
Differential Revision: https://reviews.freebsd.org/D9538
MFC r313746:
Add missing check to fix the build with IPSEC_SUPPORT and without MAC.
MFC r313805:
Fix LINT build for powerpc.
Build kernel modules support only when both IPSEC and TCP_SIGNATURE
are not defined.
MFC r313922:
For translated packets do not adjust UDP checksum if it is zero.
In case when decrypted and decapsulated packet is an UDP datagram,
check that its checksum is not zero before doing incremental checksum
adjustment.
MFC r314339:
Document that the size of AH ICV for HMAC-SHA2-NNN should be half of
NNN bits as described in RFC4868.
PR: 215978
MFC r314812:
Introduce the concept of IPsec security policies scope.
Currently are defined three scopes: global, ifnet, and pcb.
Generic security policies that IKE daemon can add via PF_KEY interface
or an administrator creates with setkey(8) utility have GLOBAL scope.
Such policies can be applied by the kernel to outgoing packets and checked
agains inbound packets after IPsec processing.
Security policies created by if_ipsec(4) interfaces have IFNET scope.
Such policies are applied to packets that are passed through if_ipsec(4)
interface.
And security policies created by application using setsockopt()
IP_IPSEC_POLICY option have PCB scope. Such policies are applied to
packets related to specific socket. Currently there is no way to list
PCB policies via setkey(8) utility.
Modify setkey(8) and libipsec(3) to be able distinguish the scope of
security policies in the `setkey -DP` listing. Add two optional flags:
'-t' to list only policies related to virtual *tunneling* interfaces,
i.e. policies with IFNET scope, and '-g' to list only policies with GLOBAL
scope. By default policies from all scopes are listed.
To implement this PF_KEY's sadb_x_policy structure was modified.
sadb_x_policy_reserved field is used to pass the policy scope from the
kernel to userland. SADB_SPDDUMP message extended to support filtering
by scope: sadb_msg_satype field is used to specify bit mask of requested
scopes.
For IFNET policies the sadb_x_policy_priority field of struct sadb_x_policy
is used to pass if_ipsec's interface if_index to the userland. For GLOBAL
policies sadb_x_policy_priority is used only to manage order of security
policies in the SPDB. For IFNET policies it is not used, so it can be used
to keep if_index.
After this change the output of `setkey -DP` now looks like:
# setkey -DPt
0.0.0.0/0[any] 0.0.0.0/0[any] any
in ipsec
esp/tunnel/87.250.242.144-87.250.242.145/unique:145
spid=7 seq=3 pid=58025 scope=ifnet ifname=ipsec0
refcnt=1
# setkey -DPg
::/0 ::/0 icmp6 135,0
out none
spid=5 seq=1 pid=872 scope=global
refcnt=1
Obtained from: Yandex LLC
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D9805
PR: 212018
Relnotes: yes
Sponsored by: Yandex LLC
|
|
|
|
|
|
|
|
| |
Print hostcache usage counts with TCP statistics.
PR: 196252
Submitted by: Anton Yuzhaninov <citrin+pr@citrin.ru>
MFC after: 3 weeks.
|
|
|
|
|
|
|
|
|
| |
Use strlcpy and snprintf in netstat(1).
Expand inet6name() line buffer to NI_MAXHOST and use strlcpy/snprintf
in various places.
Reported by: Anton Yuzhaninov <citrin citrin ru>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix a bug which results in a core dump when running netstat with
the -W option and having a listening SCTP socket.
The bug was introduced in r279122 when adding support for libxo.
MFC r302907:
When calling netstat -Laptcp the local address values are not aligned
with the corresponding entry in the table header. r295136
increased the value width from 14 to 32 without the corresponding
change to the table header. This commit adds the change to the table
header width.
MFC r302917:
Ensure that the -a, -W, -L options for SCTP behave similar
as for TCP.
MFC r302928:
Address a potential memory leak found a the clang static code analyzer
running on the userland stack.
MFC r302930:
Don't free a data chunk twice.
Found by the clang static code analyzer running for the userland stack.
MFC r302935:
Deal with a portential memory allocation failure, which was reported
by the clang static code analyzer.
Joint work with rrs@.
MFC r302942:
Add missing sctps_reasmusrmsgs counter.
Joint work with rrs@.
MFC r302945:
Don't duplicate code for SCTP, just use the ones used for UDP and TCP.
This fixes a bug with link local addresses. This will require and
upcoming change in the kernel to bring SCTP to the same behaviour
as UDP and TCP.
MFC r302949:
Fix the PR-SCTP behaviour.
This is done by rrs@.
MFC r302950:
Add a constant required by RFC 7496.
MFC r303024:
netstat and sockstat expect the IPv6 link local addresses to
have an embedded scope. So don't recover.
MFC r303025:
Use correct order of conditions to avoid NULL deref.
MFC r303073:
Fix a bug in deferred stream reset processing which results
in using a length field before it is set.
Thanks to Taylor Brandstetter for reporting the issue and
providing a fix.
Approved by: re (kib)
|
|
|
|
|
|
| |
Also malloc will return NULL if it cannot allocate memory.
MFC after: 2 weeks.
|
| |
|
|
|
|
|
|
|
|
| |
from short to int.
PR: 203922
Submitted by: White Knight <white_knight@2ch.net>
MFC After: 4 weeks
|
| |
|
| |
|
|
|
|
|
|
| |
duplicating roughly similar code for each protocol.
MFC after: 2 weeks
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
routepr() (-r flag). It is too narrow to show an IPv6 prefix
in most cases.
- Accept "local" as a synonym of "unix" in protocol family name.
- Show a prefix length in CIDR notation when name resolution failed in
netname().
- Make routename() and netname() AF-independent and remove
unnecessary typecasting from struct sockaddr.
- Use getnameinfo(3) to format L2 addr in intpr().
- Fix a bug which showed "Address" when -A flag is specfied in pr_rthdr().
- Replace cryptic GETSA() macro with SA_SIZE().
- Fix declarations shadowing local variables with the same names.
- Add more static, remove unused header files and variables.
MFC after: 1 week
|
|
|
|
|
|
|
| |
Obtained from: Phil Shafer <phil@juniper.net>
Ported to -current by: alfred@ (mostly), Kim Shrier
Formatting: marcel@
Sponsored by: Juniper Networks, Inc.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
o Introduce a notion of "not ready" mbufs in socket buffers. These
mbufs are now being populated by some I/O in background and are
referenced outside. This forces following implications:
- An mbuf which is "not ready" can't be taken out of the buffer.
- An mbuf that is behind a "not ready" in the queue neither.
- If sockbet buffer is flushed, then "not ready" mbufs shouln't be
freed.
o In struct sockbuf the sb_cc field is split into sb_ccc and sb_acc.
The sb_ccc stands for ""claimed character count", or "committed
character count". And the sb_acc is "available character count".
Consumers of socket buffer API shouldn't already access them directly,
but use sbused() and sbavail() respectively.
o Not ready mbufs are marked with M_NOTREADY, and ready but blocked ones
with M_BLOCKED.
o New field sb_fnrdy points to the first not ready mbuf, to avoid linear
search.
o New function sbready() is provided to activate certain amount of mbufs
in a socket buffer.
A special note on SCTP:
SCTP has its own sockbufs. Unfortunately, FreeBSD stack doesn't yet
allow protocol specific sockbufs. Thus, SCTP does some hacks to make
itself compatible with FreeBSD: it manages sockbufs on its own, but keeps
sb_cc updated to inform the stack of amount of data in them. The new
notion of "not ready" data isn't supported by SCTP. Instead, only a
mechanical substitute is done: s/sb_cc/sb_ccc/.
A proper solution would be to take away struct sockbuf from struct
socket and allow protocols to implement their own socket buffers, like
SCTP already does. This was discussed with rrs@.
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is intended to help in diagnostics and debugging of NIC and stack
flowid support.
Eventually this will grow another column (RSS CPU ID) but
that currently isn't cached in the inpcb.
There's also no clean flowtype -> flowtype identifier string. This is
the mbuf M_HASHTYPE_* values for RSS.
Here's some example output:
adrian@adrian-hackbox:~/work/freebsd/head/src % netstat -Rn | more
Active Internet connections
Proto Recv-Q Send-Q Local Address Foreign Address flowid ftype
tcp4 0 0 10.11.1.65.22 10.11.1.64.12409 29041942 2
udp4 0 0 127.0.0.1.123 *.* 00000000 0
udp6 0 0 fe80::1%lo0.123 *.* 00000000 0
udp6 0 0 ::1.123 *.* 00000000 0
udp4 0 0 10.11.1.65.123 *.* 00000000 0
Tested:
* amd64 system w/ igb NIC; local driver changes to expose RSS flowid in if_igb.
|
|
|
|
|
|
|
|
|
| |
same events that tcpstat's tcps_rcvmemdrop counter counts.
- Rename tcps_rcvmemdrop to tcps_rcvreassfull and improve its
description in netstat(1) output.
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
|
|
|
|
|
|
| |
TCP statistics output.
MFC after: 3 weeks
|
| |
|
|
|
|
| |
counters.
|
|
|
|
| |
Change interface of kread_counters() similar ot kread() in the netstat(1).
|
|
|
|
|
|
|
|
|
|
| |
Use uint64_t as type for all fields of structures.
Changed structures: ahstat, arpstat, espstat, icmp6_ifstat, icmp6stat,
in6_ifstat, ip6stat, ipcompstat, ipipstat, ipsecstat, mrt6stat, mrtstat,
pfkeystat, pim6stat, pimstat, rip6stat, udpstat.
Discussed with: arch@
|
|
|
|
|
|
| |
kernel core files.
Sponsored by: Nginx, Inc.
|
|
|
|
|
|
|
|
|
| |
Convert 'struct ipstat' and 'struct tcpstat' to counter(9).
This speeds up IP forwarding at extreme packet rates, and
makes accounting more precise.
Sponsored by: Nginx, Inc.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Stateful TCP offload drivers for Terminator 3 and 4 (T3 and T4) ASICs.
These are available as t3_tom and t4_tom modules that augment cxgb(4)
and cxgbe(4) respectively. The cxgb/cxgbe drivers continue to work as
usual with or without these extra features.
- iWARP driver for Terminator 3 ASIC (kernel verbs). T4 iWARP in the
works and will follow soon.
Build-tested with make universe.
30s overview
============
What interfaces support TCP offload? Look for TOE4 and/or TOE6 in the
capabilities of an interface:
# ifconfig -m | grep TOE
Enable/disable TCP offload on an interface (just like any other ifnet
capability):
# ifconfig cxgbe0 toe
# ifconfig cxgbe0 -toe
Which connections are offloaded? Look for toe4 and/or toe6 in the
output of netstat and sockstat:
# netstat -np tcp | grep toe
# sockstat -46c | grep toe
Reviewed by: bz, gnn
Sponsored by: Chelsio communications.
MFC after: ~3 months (after 9.1, and after ensuring MFC is feasible)
|
|
|
|
| |
MFC after: 1 month
|
|
|
|
|
|
|
|
|
|
| |
The index() and rindex() functions were marked LEGACY in the 2001
revision of POSIX and were subsequently removed from the 2008 revision.
The strchr() and strrchr() functions are part of the C standard.
This makes the source code a lot more consistent, as most of these C
files also call into other str*() routines. In fact, about a dozen
already perform strchr() calls.
|
|
|
|
|
|
| |
(I didn't try to fix negative TCP timers with -x.)
MFC after: 3 days
|
| |
|
|
|
|
|
|
|
|
|
| |
is in accordance with the information provided at
ftp://ftp.cs.berkeley.edu/pub/4bsd/README.Impt.License.Change
Also add $FreeBSD$ to a few files to keep svn happy.
Discussed with: imp, rwatson
|
|
|
|
|
| |
Pointed out by: brucec@
MFC after: 3 weeks
|
|
|
|
|
|
|
|
|
|
| |
Retransmitted Packets
Zero Window Advertisements
Out of Order Receives
These statistics are available via the -T argument to
netstat(1).
MFC after: 2 weeks
|
|
|
|
| |
Submitted by: Maxim Dounin
|
| |
|
|
|
|
|
|
|
|
|
| |
feature when you have a seemingly stuck socket and want to figure
out why it has not been closed yet.
No plans to MFC this, as it changes the netstat sysctl ABI.
Reviewed by: andre, rwatson, Eric Van Gyzen
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
New counters now exist for:
requests sent
replies sent
requests received
replies received
packets received
total packets dropped due to no ARP entry
entrys timed out
Duplicate IPs seen
The new statistics are seen in the netstat command
when it is given the -s command line switch.
MFC after: 2 weeks
In collaboration with: bz
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
certain flags that should have been in inp_flags ended up in inp_vflag,
meaning that they were inconsistently locked, and in one case,
interpreted. Move the following flags from inp_vflag to gaps in the
inp_flags space (and clean up the inp_flags constants to make gaps
more obvious to future takers):
INP_TIMEWAIT
INP_SOCKREF
INP_ONESBCAST
INP_DROPPED
Some aspects of this change have no effect on kernel ABI at all, as these
are UDP/TCP/IP-internal uses; however, netstat and sockstat detect
INP_TIMEWAIT when listing TCP sockets, so any MFC will need to take this
into account.
MFC after: 1 week (or after dependencies are MFC'd)
Reviewed by: bz
|
|
|
|
|
|
|
|
|
|
|
| |
IPv4 stack.
Diffs are minimized against p4.
PCS has been used for some protocol verification, more widespread
testing of recorded sources in Group-and-Source queries is needed.
sizeof(struct igmpstat) has changed.
__FreeBSD_version is bumped to 800070.
|
|
|
|
|
|
|
| |
by adding the -x flag earlier.
Submitted by: Anton Yuzhaninov
MFC after: 3 days
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
(all types) used per socket buffer.
Add support to netstat to print out all of the socket buffer
statistics.
Update the netstat manual page to describe the new -x flag
which gives the extended output.
Reviewed by: rwatson, julian
|
| |
|
| |
|
|
|
|
|
|
|
| |
+ kread is not a boolean, so check it as such
+ fix $FreeBSD$ Ids
+ denote copyrights with /*-
+ misc whitespace changes.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
general, when support was added to netstat for fetching data using sysctl,
no provision was left for fetching equivalent data from a core dump, and
in fact, netstat would _always_ fetch data from the live kernel using
sysctl even when -M was specified resulting in the user believing they
were getting data from coredumps when they actually weren't. Some specific
changes:
- Add a global 'live' variable that is true if netstat is running against
the live kernel and false if -M has been specified.
- Stop abusing the sysctl flag in the protocol tables to hold the protocol
number. Instead, the protocol is now its own field in the tables, and
it is passed as a separate parameter to the PCB and stat routines rather
than overloading the KVM offset parameter.
- Don't run PCB or stats functions who don't have a namelist offset if we
are being run against a crash dump (!live).
- For the inet and unix PCB routines, we generate the same buffer from KVM
that the sysctl usually generates complete with the header and trailer.
- Don't run bpf stats for !live (before it would just silently always run
live).
- kread() no longer trashes memory when opening the buffer if there is an
error on open and the passed in buffer is smaller than _POSIX2_LINE_MAX.
- The multicast routing code doesn't fallback to kvm on live kernels if
the sysctl fails. Keeping this made the code rather hairy, and netstat
is already tied to the kernel ABI anyway (even when using sysctl's since
things like xinpcb contain an inpcb) so any kernels this is run against
that have the multicast routing stuff should have the sysctls.
- Don't try to dig around in the kernel linker in the netgraph PCB routine
for core dumps.
Other notes:
- sctp's PCB routine only works on live kernels, it looked rather
complicated to generate all the same stuff via KVM. Someone can always
add it later if desired though.
- Fix the ipsec removal bug where N_xxx for IPSEC stats weren't renumbered.
- Use sysctlbyname() everywhere rather than hardcoded mib values.
MFC after: 1 week
Approved by: re (rwatson)
|