| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the flowid is available for the mbuf that finalised the creation
of a syncache connection, copy it into the inp_flowid field.
Without this, an incoming TCP connection won't have an inp_flowid marked
until some data comes in, and this means that things like the per-CPU
TCP timer option will choose a different CPU for the timer work.
(It also means that if one grabbed the flowid via an ioctl from userland,
it won't be available until some data has been received.)
Sponsored by: Netflix, Inc.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix ipfw fwd for IPv4 traffic broken by r249894.
Problem case:
Original lookup returns route with GW set, so gw points to
rte->rt_gateway.
After that we're changing dst and performing lookup another time.
Since fwd host is most probably directly reachable, resulting
rte does not contain rt_gateway, so gw is not set. Finally, we
end with packet transmitted to proper interface but wrong
link-layer address.
|
|
|
|
|
|
|
| |
Fix various places where we don't properly release a lock
PR: 185043
Submitted by: Michael Bentkofsky
|
| |
|
|
|
|
|
| |
Make TCP_KEEP* socket options readable. At least PostgreSQL wants
to read the values.
|
| |
|
|
|
|
| |
MFC slacker: adrian
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use an RLOCK here instead of an RWLOCK - matching all the other calls
to lla_lookup().
This drastically reduces the very high lock contention when doing parallel
TCP throughput tests (> 1024 sockets) with IPv6.
MFC r260187:
lla_lookup() does modification only when LLE_CREATE is specified.
Thus we can use IF_AFDATA_RLOCK() instead of IF_AFDATA_LOCK() when doing
lla_lookup() without LLE_CREATE flag.
MFC r260217:
Add IF_AFDATA_WLOCK_ASSERT() in case lla_lookup() is called with
LLE_CREATE flag.
|
|
|
|
| |
Pointy hat to: peter
|
|
|
|
| |
Address some warnings which showed up on the userland version.
|
|
|
|
| |
PR: kern/99188
|
|
|
|
|
|
|
| |
Fix regression from r249894. Now we pass "gw" as argument to if_output
method, thus for multicast case we need it to point at "dst".
PR: 185395
|
| |
|
|
|
|
|
|
| |
In sys/netinet/in_mcast.c, inm_is_ifp_detached() is only used whenever
KTR is defined, so put it between #ifdef KTR guards. This avoids a
warning about a unused function if KTR is not enabled.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Only initialize some mutexes for the default VNET.
In r208160, sctp_it_ctl was made a global variable, across all VNETs.
However, sctp_init() is called for every VNET that is created. This results
in the same global mutexes which are part of sctp_it_ctl being initialized. This can result
in crashes if many jails are created.
To reproduce the problem:
(1) Take a GENERIC kernel config, and add options for: VIMAGE, WITNESS,
INVARIANTS.
(2) Run this command in a loop:
jail -l -u root -c path=/ name=foo persist vnet && jexec foo ifconfig lo0 127.0.0.1/8 && jail -r foo
(see http://lists.freebsd.org/pipermail/freebsd-current/2010-November/021280.html )
Witness will warn about the same mutex being initialized.
Fix the problem by only initializing these mutexes in the default VNET.
MFC r258765:
In
http://svnweb.freebsd.org/changeset/base/258221
I introduced a bug which initialized global locks
whenever the SCTP stack initialized. This was fixed in
http://svnweb.freebsd.org/changeset/base/258574
by rodrigc@. He just initialized the locks for
the default vnet. This fix reverts to the old
behaviour before r258221, which explicitly makes
sure it is only called once, because this works also on
other platforms.
Approved by: re@ (gjb)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove a buggy comparision when setting manually the path MTU.
After fixing, the comparision would have become redundant.
Thanks to Andrew Galante for reporting the issue.
MFC r257272:
Fix compilation if SCTP_DONT_DO_PRIVADDR_SCOPE is defined.
The issue was reported by Andrew Galante.
MFC r257274:
Fix the value of *optlen when calling getsockopt() for
SCTP_REMOTE_UDP_ENCAPS_PORT.
This issue was reported by Andrew Galante.
MFC r257359:
Terminate a debug output with a \n.
MFC r257555:
Changes from upstream to improve compilation when INET or INET6
or none of them is defined.
MFC r257574:
Unlock the lock before destroying it.
This issue was reported by Andrew Galante.
MFC r257800:
Use htons()/ntohs() appropriately.
These issues were reported by Andrew Galante.
MFC r257803:
Make sure that we don't try to build an ASCONF-ACK chunk
larger than what fits in the the mbuf cluster.
This issue was reported by Andrew Galante.
MFC r257804:
Get rid of the artification limitation enforced by
SCTP_AUTH_RANDOM_SIZE_MAX.
This was suggested by Andrew Galante.
MFC r258221:
Cleanups which result in fixes which have been made upstream
and where partially suggested by Andrew Galante.
There is no functional change in FreeBSD.
MFC r258224:
When determining if an address belongs to an stcb, take the address family
into account for wildcard bound endpoints.
MFC r258228:
Remove a stray write operation.
MFC r258235:
Use SCTP_PR_SCTP_TTL when the user provides a positive
timetolive in sctp_sendmsg().
Approved by: re@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The TCP delayed ACK logic isn't aware of LRO passing up large aggregated
segments thinking it received only one segment. This causes it to enable
the delay the ACK for 100ms to wait for another segment which may never
come because all the data was received already.
Doing delayed ACK for LRO segments is bogus for two reasons: a) it pushes
us further away from acking every other packet; b) it introduces additional
delay in responding to the sender. The latter is especially bad because it
is in the nature of LRO to aggregated all segments of a burst with no more
coming until an ACK is sent back.
Change the delayed ACK logic to detect LRO segments by being larger than
the MSS for this connection and issuing an immediate ACK for them to keep
the ACK clock ticking without interruption.
Reported by: julian, cperciva
Tested by: cperciva
Reviewed by: lstewart
Approved by: re (glebius)
|
|
|
|
|
|
|
|
|
|
|
| |
sbdrop_locked() to cut acked mbufs from the socket buffer. Free this
chain a batch manner after the socket buffer lock is dropped.
This measurably reduces contention on socket buffer.
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
Approved by: re (marius)
|
|
|
|
|
|
|
|
|
| |
input path. These probes get some of the fields in host order, whereas the
output probes get them in network order, so a single translator isn't
enough. This workaround ensures that the problem is essentially invisble
to users: none of the probe arguments or their fields have changed.
Approved by: re (hrs)
|
|
|
|
|
|
|
|
|
| |
so that fixed TCP_SIGNATURE handling can later be merged.
This is derived from follow-up work to SVN r183001 posted to
net@ on Sep 13 2008.
Approved by: re (gjb)
|
|
|
|
|
| |
Discussed with: andre
Approved by: re (rodrigc)
|
|
|
|
|
|
|
| |
user initiated error cause (using SCTP_ABORT|SCTP_SENDALL).
Approved by: re (delphij)
MFC after: 1 week
|
|
|
|
|
| |
Reviewed by: glebius
Approved by: re (kib)
|
|
|
|
|
|
| |
receiver socket buffer size correctly into account.
MFC after: 1 week
|
| |
|
|
|
|
|
|
|
|
|
| |
matches the types used when computing hash indices and the type of the
maximum size of mfchashtbl[].
PR: kern/181821
Submitted by: Sven-Thorsten Dietrich <sven@vyatta.com> (IPv4)
MFC after: 1 week
|
|
|
|
|
| |
PR: kern/181822
MFC after: 1 week
|
|
|
|
| |
MFC after: 1 week
|
|
|
|
| |
MFC after: 1 week
|
|
|
|
|
|
|
|
|
| |
* Remove non working code related to SHA224.
* Remove support for non-standardised HMAC-IDs using SHA384 and SHA512.
* Prefer SHA256 over SHA1.
* Minor cleanup.
MFC after: 2 weeks
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a last-modified timestamp to each LRO entry and provide an interface
to flush all inactive entries. Drivers decide when to flush and what
the inactivity threshold should be.
Network drivers that process an rx queue to completion can enter a
livelock type situation when the rate at which packets are received
reaches equilibrium with the rate at which the rx thread is processing
them. When this happens the final LRO flush (normally when the rx
routine is done) does not occur. Pure ACKs and segments with total
payload < 64K can get stuck in an LRO entry. Symptoms are that TCP
tx-mostly connections' performance falls off a cliff during heavy,
unrelated rx on the interface.
Flushing only inactive LRO entries works better than any of these
alternates that I tried:
- don't LRO pure ACKs
- flush _all_ LRO entries periodically (every 'x' microseconds or every
'y' descriptors)
- stop rx processing in the driver periodically and schedule remaining
work for later.
Reviewed by: andre
|
|
|
|
|
|
|
|
| |
ever intended for use in sysctl(8) and it has not used them for many
years.
Reviewed by: bde
Tested by: exp-run by bdrewery
|
|
|
|
|
|
| |
connection state, not the IP header.
X-MFC with: r254889
|
|
|
|
|
|
|
|
|
| |
dynamic translation so that their arguments match the definitions for
these providers in Solaris and illumos. Thus, existing scripts for these
providers should work unmodified on FreeBSD.
Tested by: gnn, hiren
MFC after: 1 month
|
| |
|
|
|
|
|
|
| |
The upper 32bits are not occupied for now.
Sponsored by: The FreeBSD Foundation
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
features. The changes in particular are:
o Remove rarely used "header" pointer and replace it with a 64bit protocol/
layer specific union PH_loc for local use. Protocols can flexibly overlay
their own 8 to 64 bit fields to store information while the packet is
worked on.
o Mechanically convert IP reassembly, IGMP/MLD and ATM to use pkthdr.PH_loc
instead of pkthdr.header.
o Extend csum_flags to 64bits to allow for additional future offload
information to be carried (e.g. iSCSI, IPsec offload, and others).
o Move the RSS hash type enumerator from abusing m_flags to its own 8bit
rsstype field. Adjust accessor macros.
o Add cosqos field to store Class of Service / Quality of Service information
with the packet. It is not yet supported in any drivers but allows us to
get on par with Cisco/Juniper in routing applications (plus MPLS QoS) with
a modernized ALTQ.
o Add four 8 bit fields l[2-5]hlen to store the relative header offsets
from the start of the packet. This is important for various offload
capabilities and to relieve the drivers from having to parse the packet
and protocol headers to find out location of checksums and other
information. Header parsing in drivers is a lot of copy-paste and
unhandled corner cases which we want to avoid.
o Add another flexible 64bit union to map various additional persistent
packet information, like ether_vtag, tso_segsz and csum fields.
Depending on the csum_flags settings some fields may have different usage
making it very flexible and adaptable to future capabilities.
o Restructure the CSUM flags to better signify their outbound (down the
stack) and inbound (up the stack) use. The CSUM flags used to be a bit
chaotic and rather poorly documented leading to incorrect use in many
places. Bring clarity into their use through better naming.
Compatibility mappings are provided to preserve the API. The drivers
can be corrected one by one and MFC'd without issue.
o The size of pkthdr stays the same at 48/56bytes (32/64bit architectures).
Sponsored by: The FreeBSD Foundation
|
|
|
|
|
|
| |
Bump __FreeBSD_version to 1000048 since the
modified structure is user visible and used
by netstat, for example.
|
|
|
|
|
|
|
|
|
| |
When exporting to xinpcb, just export the lower
32-bit. Using there also 64-bits will break the
ABI and will be committed separetly.
MFC after: 2 weeks
X-MFC with: 254248
|
|
|
|
|
|
|
|
| |
can result in a buffer which is too small for the requested
operation.
Security: CVE-2013-3077
Security: FreeBSD-SA-13:09.ip_multicast
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
together.
Add M_FLAG_PRINTF for use with printf(9) %b indentifier.
Use the generic mbuf flags print names in the net80211 code and adjust
the protocol specific bits for their new positions.
Change SCTP M_PROTO mapping from 5 to 1 to fit within the 16bit field
they use internally to store some additional information.
Discussed with: trociny, glebius
|
|
|
|
|
|
|
|
| |
downwards layer crossings.
Consistently use it within IP, IPv6 and ethernet protocols.
Discussed with: trociny, glebius
|
|
|
|
|
|
|
| |
specific mbuf flag from sys/mbuf.h to netinet/sctp_os_bsd.h. It is
only relevant within SCTP.
Discussed with: tuexen
|
|
|
|
|
|
|
|
|
|
| |
flag instead. The flag is only used within the IP and IPv6 layer 3
protocols.
Because some firewall packages treat IPv4 and IPv6 packets the same the
flag should have the same value for both.
Discussed with: trociny, glebius
|
|
|
|
|
|
|
| |
specific flag instead. The flag is only relevant while the packet stays in
the IP reassembly queue.
Discussed with: trociny, glebius
|
|
|
|
|
| |
There wasn't any real driver (and hardware) support for it. Modern hardware
does full fragmentation/segmentation offload instead.
|
|
|
|
|
|
|
|
|
| |
using SDT_PROBE_ARGTYPE(). This will make it easy to extend the SDT(9) API
to allow probes with dynamically-translated types.
There is no functional change.
MFC after: 2 weeks
|
|
|
|
|
|
|
| |
every cookie on the wire. This bug was reported in
https://bugzilla.mozilla.org/show_bug.cgi?id=905080
MFC after: 3 days
|
|
|
|
| |
Reviewed by: ae, glebius
|
|
|
|
|
|
|
|
| |
This will allow an easier integration of the support
for NDATA.
While there, do also some minor cleanups.
Obtained from: rrs@
MFC after: 2 weeks
|