summaryrefslogtreecommitdiffstats
path: root/sys/netinet
Commit message (Collapse)AuthorAgeFilesLines
* Add the missing change from the last commit.Luiz Souza2017-07-201-0/+1
|
* Fix a bug where existing CARP alias addresses cannot be changed.Luiz Otavio O Souza2017-07-173-11/+13
| | | | | | | | When a existent address is delete with carp_detach() if it is the last address for that CARP vhid, the CARP vhid will be destroyed and the subsequent carp_attach() to add the new IP will fail. Ticket #6892 (cherry picked from commit 77805aa5fa51dbd2ed0b6c363c6235c892caee76)
* Merge remote-tracking branch 'origin/stable/11' into devel-11Luiz Souza2017-07-1721-206/+435
|\
| * MFC r320263:tuexen2017-06-285-72/+89
| | | | | | | | | | | | | | | | | | | | | | | | Use a longer buffer for messages in ERROR chunks. MFC r320264: Check the length of a COOKIE chunk before accessing fields in it. MFC r320300: Handle sctp_get_next_param() in a consistent way. Approved by: re (marius@)
| * MFC r319556:tuexen2017-06-073-37/+42
| | | | | | | | | | | | | | | | | | | | | | Fix the ICMP6 handling for TCP. The ICMP6 packets might not be contained in a single mbuf. So don't assume this. Keep the IPv4 and IPv6 code in sync and make explicit that the syncache code only need the TCP sequence number, not the complete TCP header. Approved by: re (marius)
| * When a SYN-ACK is received in SYN-SENT state, RFC 793 requires thetuexen2017-06-012-14/+38
| | | | | | | | | | | | | | | | | | | | | | validation of SEG.ACK as the first step. If the ACK is not acceptable, a RST segment should be sent and the segment should be dropped. Up to now, the segment was partially processed. This patch moves the check for the SEG.ACK validation up to the front as required. Reviewed by: hiren, gnn Differential Revision: https://reviews.freebsd.org/D10424
| * MFC r318649:tuexen2017-06-011-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | The connect() system call should return -1 and set errno to EAFNOSUPPORT if it is called on a TCP socket * with an IPv6 address and the socket is bound to an IPv4-mapped IPv6 address. * with an IPv4-mapped IPv6 address and the socket is bound to an IPv6 address. Thanks to Jonathan T. Leighton for reporting this issue. Reviewed by: bz, gnn Differential Revision: https://reviews.freebsd.org/D9163
| * MFC r317597:tuexen2017-06-014-16/+108
| | | | | | | | | | | | | | | | Allow SCTP to use the hostcache. This patch allows the MTU stored in the hostcache to be used as an initial value for SCTP paths. When an ICMP PTB message is received, store the MTU in the hostcache.
| * MFC r317592:tuexen2017-06-011-0/+8
| | | | | | | | Don't set the DF-bit on timer based retransmissions.
| * MFC r317558:tuexen2017-06-011-1/+1
| | | | | | | | Set the DF bit for responses to out-of-the-blue packets.
| * MFC 317464:tuexen2017-06-011-3/+3
| | | | | | | | | | Fix an issue with MTU calculation if an ICMP message is received for an SCTP/UDP packet.
| * MFC r317457:tuexen2017-06-013-5/+5
| | | | | | | | | | | | | | Use consistently uint32_t for mtu values. This does not change functionality, but this cleanup is need for further improvements of ICMP handling.
| * MFC r317244:tuexen2017-06-011-1/+2
| | | | | | | | | | | | | | | | | | Represent "a syncache overflow hasn't happend yet" by using -(SYNCOOKIE_LIFETIME + 1) instead of INT64_MIN, since it is good enough and works when time_t is int32 or int64. This fixes the issue reported by cy@ on i386. Reported by: cy
| * MFC r317208:tuexen2017-06-012-5/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Syncoockies can be used in combination with the syncache. If the cache overflows, syncookies are used. This patch restricts the usage of syncookies in this case: accept syncookies only if there was an overflow of the syncache recently. This mitigates a problem reported in PR217637, where is syncookie was accepted without any recent drops. Thanks to glebius@ for suggesting an improvement. PR: 217637 Reviewed by: gnn, glebius Differential Revision: https://reviews.freebsd.org/D10272
| * MFC r316743:tuexen2017-06-012-2/+46
| | | | | | | | | | | | | | | | | | | | | | The sysctl variable net.inet.tcp.drop_synfin is not honored in all states, for example not in SYN-SENT. This patch adds code to check the sysctl variable in other states than LISTEN. Thanks to ae and gnn for providing comments. Reviewed by: gnn Differential Revision: https://reviews.freebsd.org/D9894
| * MFC r314155:tuexen2017-06-011-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | TCP window updates are only sent if the window can be increased by at least 2 * MSS. However, if the receive buffer size is small, this might be impossible. Add back a criterion to send a TCP window update if the window can be increased by at least half of the receive buffer size. This condition was removed in r242252. This patch simply brings it back. PR: 211003 Reviewed by: gnn Differential Revision: https://reviews.freebsd.org/D9475
| * MFC r313032:tuexen2017-06-011-2/+2
| | | | | | | | Ensure that the variable bail is always initialized before used.
| * MFC r313031:tuexen2017-06-011-1/+1
| | | | | | | | | | | | | | Take the SCTP common header into account when computing the space available for chunks. This unbreaks the handling of ICMPV6 packets indicating "packet too big". It just worked for IPv4 since we are overbooking for IPv4.
| * MFC r313030:tuexen2017-06-011-6/+2
| | | | | | | | Remove a duplicate debug statement.
| * MFC r312722:tuexen2017-06-011-34/+41
| | | | | | | | Fix a bug where the overhead of the I-DATA chunk was not considered.
| * MFC r312063:tuexen2017-06-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Ensure that the buffer length and the length provided in the IPv4 header match when using a raw socket to send IPv4 packets and providing the header. If they don't match, let send return -1 and set errno to EINVAL. Before this patch is was only enforced that the length in the header is not larger then the buffer length. PR: 212283 Reviewed by: ae, gnn Differential Revision: https://reviews.freebsd.org/D9161
| * MFC r318399:ae2017-05-241-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Set M_BCAST and M_MCAST flags on mbuf sent via divert socket. r290383 has changed how mbufs sent by divert socket are handled. Previously they are always handled by slow path processing in ip_input(). Now ip_tryforward() is invoked from ip_input() before in_broadcast() check. Since diverted packet lost all mbuf flags, it passes the broadcast check in ip_tryforward() due to missing M_BCAST flag. In the result the broadcast packet is forwarded to the wire instead of be consumed by network stack. Add in_broadcast() check to the div_output() function. And restore the M_BCAST flag if destination address is broadcast for the given network interface. PR: 209491
| * MFC r317170, r317389, and r317390.np2017-05-241-8/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | r317170: Remove redundant assignment. r317389: Frames that are not considered for LRO should not be counted in LRO statistics. r317390: Flush the LRO ctrl as soon as lro_mbufs fills up. There is no need to wait for the next enqueue from the driver. Sponsored by: Chelsio Communications
| * MFC r318150:eugen2017-05-191-1/+3
| | | | | | | | | | | | | | Fix translation of transit PPtP/GRE connections for ipfw nat/natd "global" case. PR: 218968 Approved by: ae, vsevolod (mentor)
* | Add the timestamp of the last match, packet and byte counters to tableLuiz Souza2017-07-151-0/+5
| | | | | | | | | | | | | | | | | | | | entries with a new ipfw table command to zero the counters. Each table type implementation needs to be modified to add the support to this feature and the FIB backend is the only one that was not modified (because the backend does not have any local storage). (cherry picked from commit 3b06c382c8a2e04b7a64291bfb6b0ca0e5dd8dca)
* | Add ipfw support to MAC address tables.Luiz Otavio O Souza2017-07-141-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The l2 filter implementation on ipfw works with MAC address pairs as it happens on wire (first destination and then source). The table entries works in the same way, but the MAC address pair has to be passed in a single argument: $ ipfw table create l2 type mac $ ipfw table add "00:01:02:03:04:05 0a:0b:0c:0d:0e:0f" added: 00:01:02:03:04:05 0a:0b:0c:0d:0e:0f 0 $ ipfw table add "00:01:02:03:04:05 any" added: 00:01:02:03:04:05 any 0 $ ipfw table l2 add "any 0a:0b:0c:0d:0e:0f" added: any 0a:0b:0c:0d:0e:0f 0 The MAC tables can also hold an optinal value used to implement additional features (skipto, fib, pipe, tag, nat, ...). $ ipfw table l2 add "00:01:02:03:04:05 0a:0b:0c:0d:0e:ff" 1234 added: 00:01:02:03:04:05 0a:0b:0c:0d:0e:ff 1234 $ ipfw table l2 list --- table(l2), set(0) --- 00:01:02:03:04:05 0a:0b:0c:0d:0e:0f 0 any 0a:0b:0c:0d:0e:0f 0 00:01:02:03:04:05 any 0 00:01:02:03:04:05 0a:0b:0c:0d:0e:ff 1234 Rule example: $ ipfw add pass MAC 1:2:3:4:5:6 2:3:4:5:6:7 via igb0 00100 allow ip from any to any MAC 01:02:03:04:05:06 02:03:04:05:06:07 via igb0 $ ipfw add pass MAC table\(l2\) via igb0 00000 allow ip from any to any MAC table(l2) via igb0 $ ipfw list 00100 allow ip from any to any MAC 01:02:03:04:05:06 02:03:04:05:06:07 via igb0 00200 allow ip from any to any MAC table(l2) via igb0 00300 allow ip from any to any 65535 deny ip from any to any (cherry picked from commit 1fc9408b335ef6e8863019212c12a4bc99ed8e75)
* | Merge remote-tracking branch 'origin/stable/11' into devel-11Renato Botelho2017-05-1138-1603/+1123
|\ \ | |/
| * MFC r316676:smh2017-04-246-124/+82
| | | | | | | | | | | | | | Use estimated RTT for receive buffer auto resizing instead of timestamps Relnotes: Yes Sponsored by: Multiplay
| * MFC r316065: Enable route and LLE (ndp) caching in TCP/IPv6karels2017-04-221-7/+4
| | | | | | | | | | | | | | | | | | tcp_output.c was using a route on the stack for IPv6, which does not allow route caching or LLE/ndp caching. Switch to using the route (v6 flavor) in the in_pcb, which was already present, which caches both L3 and L2 lookups. Reviewed by: gnn hiren
| * MFC r316715:ae2017-04-182-4/+6
| | | | | | | | | | | | | | | | Make sysctl identifiers for direct netisr queue unique. Introduce IPCTL_INTRDQMAXLEN and IPCTL_INTRDQDROPS macros for this purpose. Reviewed by: gnn Differential Revision: https://reviews.freebsd.org/D10358
| * MFC r316434:ae2017-04-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add O_EXTERNAL_DATA opcode support. This opcode can be used to attach some data to external action opcode. And unlike to O_EXTERNAL_INSTANCE opcode, this opcode does not require creating of named instance to pass configuration arguments to external action handler. The data is coming just next to O_EXTERNAL_ACTION opcode. The userlevel part currenly supports formatting for opcode with ipfw_insn size, by default it expects u16 numeric value in the arg1. Obtained from: Yandex LLC Sponsored by: Yandex LLC
| * MFC r316313, r316328:smh2017-04-141-9/+9
| | | | | | | | | | | | | | | | Allow explicitly assigned IPv4 & IPv6 loopback addresses to be used in jails. Relnotes: Yes Sponsored by: Multiplay
| * MFC r303863:smh2017-04-141-0/+432
| | | | | | | | | | | | Move IPv4 & IPv6 specific jail functions to netinet and netinet6 files. Sponsored by: Multiplay
| * MFC r306829, r310286, r311695:markj2017-04-111-0/+4
| | | | | | | | Lock the ND prefix list and add refcounting for prefixes.
| * Fix reference count leak with L2 caching.karels2017-04-102-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MFC r315956 ip_forward, TCP/IPv6, and probably SCTP leaked references to L2 cache entry because they used their own routes on the stack, not in_pcb routes. The original model for route caching was callers that provided a route structure to ip{,6}input() would keep the route, and this model was used for L2 caching as well. Instead, change L2 caching to be done by default only when using a route structure in the in_pcb; the pcb deallocation code frees L2 as well as L3 cacches. A separate change will add route caching to TCP/IPv6. Another suggestion was to have the transport protocols indicate willingness to use L2 caching, but this approach keeps the changes in the network level Reviewed by: ae gnn MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D10059
| * MFC r304041:ae2017-04-031-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move logging via BPF support into separate file. * make interface cloner VNET-aware; * simplify cloner code and use if_clone_simple(); * migrate LOGIF_LOCK() to rmlock; * add ipfw_bpf_mtap2() function to pass mbuf to BPF; * introduce new additional ipfwlog0 pseudo interface. It differs from ipfw0 by DLT type used in bpfattach. This interface is intended to used by ipfw modules to dump packets with additional info attached. Currently pflog format is used. ipfw_bpf_mtap2() function uses second argument to determine which interface use for dumping. If dlen is equal to ETHER_HDR_LEN it uses old ipfw0 interface, if dlen is equal to PFLOG_HDRLEN - ipfwlog0 will be used. Obtained from: Yandex LLC Sponsored by: Yandex LLC MFC r304043: Add three helper function to manage tables from external modules. ipfw_objhash_lookup_table_kidx does lookup kernel index of table; ipfw_ref_table/ipfw_unref_table takes and releases reference to table. Obtained from: Yandex LLC Sponsored by: Yandex LLC MFC r304046, 304108: Add ipfw_nat64 module that implements stateless and stateful NAT64. The module works together with ipfw(4) and implemented as its external action module. Stateless NAT64 registers external action with name nat64stl. This keyword should be used to create NAT64 instance and to address this instance in rules. Stateless NAT64 uses two lookup tables with mapped IPv4->IPv6 and IPv6->IPv4 addresses to perform translation. A configuration of instance should looks like this: 1. Create lookup tables: # ipfw table T46 create type addr valtype ipv6 # ipfw table T64 create type addr valtype ipv4 2. Fill T46 and T64 tables. 3. Add rule to allow neighbor solicitation and advertisement: # ipfw add allow icmp6 from any to any icmp6types 135,136 4. Create NAT64 instance: # ipfw nat64stl NAT create table4 T46 table6 T64 5. Add rules that matches the traffic: # ipfw add nat64stl NAT ip from any to table(T46) # ipfw add nat64stl NAT ip from table(T64) to 64:ff9b::/96 6. Configure DNS64 for IPv6 clients and add route to 64:ff9b::/96 via NAT64 host. Stateful NAT64 registers external action with name nat64lsn. The only one option required to create nat64lsn instance - prefix4. It defines the pool of IPv4 addresses used for translation. A configuration of instance should looks like this: 1. Add rule to allow neighbor solicitation and advertisement: # ipfw add allow icmp6 from any to any icmp6types 135,136 2. Create NAT64 instance: # ipfw nat64lsn NAT create prefix4 A.B.C.D/28 3. Add rules that matches the traffic: # ipfw add nat64lsn NAT ip from any to A.B.C.D/28 # ipfw add nat64lsn NAT ip6 from any to 64:ff9b::/96 4. Configure DNS64 for IPv6 clients and add route to 64:ff9b::/96 via NAT64 host. Obtained from: Yandex LLC Relnotes: yes Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D6434 MFC r304048: Replace __noinline with special debug macro NAT64NOINLINE. MFC r304061: Use %ju to print unsigned 64-bit value. MFC r304076: Make statistics nat64lsn, nat64stl an nptv6 output netstat-like: "@value @description" and fix build due to -Wformat errors. MFC r304378 (by bz): Try to fix gcc compilation errors (which are right). nat64_getlasthdr() returns an int, which can be -1 in case of error, storing the result in an uint8_t and then comparing to < 0 is not helpful. Do what is done in the rest of the code and make proto an int here as well. MFC r309187: Fix ICMPv6 Time Exceeded error message translation. MFC r314718: Use new ipfw_lookup_table() in the nat64 too. MFC r315204,315233: Use memset with structure size.
| * MFC r303012:ae2017-04-031-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add ipfw_nptv6 module that implements Network Prefix Translation for IPv6 as defined in RFC 6296. The module works together with ipfw(4) and implemented as its external action module. When it is loaded, it registers as eaction and can be used in rules. The usage pattern is similar to ipfw_nat(4). All matched by rule traffic goes to the NPT module. Reviewed by: hrs Obtained from: Yandex LLC Relnotes: yes Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D6420 MFC r304049: Add `stats reset` command implementation to NPTv6 module to be able reset statistics counters. Obtained from: Yandex LLC Sponsored by: Yandex LLC MFC r304076: Make statistics nat64lsn, nat64stl an nptv6 output netstat-like: "@value @description" and fix build due to -Wformat errors. MFC r314507: Fix NPTv6 rule counters when one_pass is not enabled. Consider the rule matching when both @done and @retval values returned from ipfw_run_eaction() are zero. And modify ipfw_nptv6() to return IP_FW_DENY and @done=0 when addresses do not match. Obtained from: Yandex LLC Sponsored by: Yandex LLC
| * MFC r303018:ae2017-03-301-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add named dynamic states support to ipfw(4). The keep-state, limit and check-state now will have additional argument flowname. This flowname will be assigned to dynamic rule by keep-state or limit opcode. And then can be matched by check-state opcode or O_PROBE_STATE internal opcode. To reduce possible breakage and to maximize compatibility with old rulesets default flowname introduced. It will be assigned to the rules when user has omitted state name in keep-state and check-state opcodes. Also if name is ambiguous (can be evaluated as rule opcode) it will be replaced to default. Reviewed by: julian Obtained from: Yandex LLC Relnotes: yes Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D6674 MFC r304087: Do not warn about ambiguous state name when we inspect a comment token. MFC r304089: Add an ability to attach comment to check-state rules. MFC r310727 (by marius): Fix a bug in r272840; given that the optlen parameter of setsockopt(2) is a 32-bit socklen_t, do_get3() passes the kernel to access the wrong 32-bit half on big-endian LP64 machines when simply casting the 64-bit size_t optlen to a socklen_t pointer. While at it and given that the intention of do_get3() apparently is to hide/wrap the fact that socket options are used for communication with ipfw(4), change the optlen parameter of do_set3() to be of type size_t and as such more appropriate than uintptr_t, too. MFC r315305: Change the syntax of ipfw's named states. Since the state name is an optional argument, it often can conflict with other options. To avoid ambiguity now the state name must be prefixed with a colon. Sponsored by: Yandex LLC
| * MFC: 311225, 311243, 313045gnn2017-03-305-35/+32
| | | | | | | | | | | | | | | | Fix DTrace TCP tracepoints to not use mtod() as it is both unnecessary and dangerous. Those wanting data from an mbuf should use DTrace itself to get the data. Add an mbuf to ipinfo_t translator to finish cleanup of mbuf passing to TCP probes.
| * MFC r304572 (by bz):ae2017-03-1819-1257/+311
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove the kernel optoion for IPSEC_FILTERTUNNEL, which was deprecated more than 7 years ago in favour of a sysctl in r192648. MFC r305122: Remove redundant sanity checks from ipsec[46]_common_input_cb(). This check already has been done in the each protocol callback. MFC r309144,309174,309201 (by fabient): IPsec RFC6479 support for replay window sizes up to 2^32 - 32 packets. Since the previous algorithm, based on bit shifting, does not scale with large replay windows, the algorithm used here is based on RFC 6479: IPsec Anti-Replay Algorithm without Bit Shifting. The replay window will be fast to be updated, but will cost as many bits in RAM as its size. The previous implementation did not provide a lock on the replay window, which may lead to replay issues. Obtained from: emeric.poupon@stormshield.eu Sponsored by: Stormshield Differential Revision: https://reviews.freebsd.org/D8468 MFC r309143,309146 (by fabient): In a dual processor system (2*6 cores) during IPSec throughput tests, we see a lot of contention on the arc4 lock, used to generate the IV of the ESP output packets. The idea of this patch is to split this mutex in order to reduce the contention on this lock. Update r309143 to prevent false sharing. Reviewed by: delphij, markm, ache Approved by: so Obtained from: emeric.poupon@stormshield.eu Sponsored by: Stormshield Differential Revision: https://reviews.freebsd.org/D8130 MFC r313330: Merge projects/ipsec into head/. Small summary ------------- o Almost all IPsec releated code was moved into sys/netipsec. o New kernel modules added: ipsec.ko and tcpmd5.ko. New kernel option IPSEC_SUPPORT added. It enables support for loading and unloading of ipsec.ko and tcpmd5.ko kernel modules. o IPSEC_NAT_T option was removed. Now NAT-T support is enabled by default. The UDP_ENCAP_ESPINUDP_NON_IKE encapsulation type support was removed. Added TCP/UDP checksum handling for inbound packets that were decapsulated by transport mode SAs. setkey(8) modified to show run-time NAT-T configuration of SA. o New network pseudo interface if_ipsec(4) added. For now it is build as part of ipsec.ko module (or with IPSEC kernel). It implements IPsec virtual tunnels to create route-based VPNs. o The network stack now invokes IPsec functions using special methods. The only one header file <netipsec/ipsec_support.h> should be included to declare all the needed things to work with IPsec. o All IPsec protocols handlers (ESP/AH/IPCOMP protosw) were removed. Now these protocols are handled directly via IPsec methods. o TCP_SIGNATURE support was reworked to be more close to RFC. o PF_KEY SADB was reworked: - now all security associations stored in the single SPI namespace, and all SAs MUST have unique SPI. - several hash tables added to speed up lookups in SADB. - SADB now uses rmlock to protect access, and concurrent threads can do SA lookups in the same time. - many PF_KEY message handlers were reworked to reflect changes in SADB. - SADB_UPDATE message was extended to support new PF_KEY headers: SADB_X_EXT_NEW_ADDRESS_SRC and SADB_X_EXT_NEW_ADDRESS_DST. They can be used by IKE daemon to change SA addresses. o ipsecrequest and secpolicy structures were cardinally changed to avoid locking protection for ipsecrequest. Now we support only limited number (4) of bundled SAs, but they are supported for both INET and INET6. o INPCB security policy cache was introduced. Each PCB now caches used security policies to avoid SP lookup for each packet. o For inbound security policies added the mode, when the kernel does check for full history of applied IPsec transforms. o References counting rules for security policies and security associations were changed. The proper SA locking added into xform code. o xform code was also changed. Now it is possible to unregister xforms. tdb_xxx structures were changed and renamed to reflect changes in SADB/SPDB, and changed rules for locking and refcounting. Obtained from: Yandex LLC Relnotes: yes Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D9352 MFC r313331: Add removed headers into the ObsoleteFiles.inc. MFC r313561 (by glebius): Move tcp_fields_to_net() static inline into tcp_var.h, just below its friend tcp_fields_to_host(). There is third party code that also uses this inline. MFC r313697: Remove IPsec related PCB code from SCTP. The inpcb structure has inp_sp pointer that is initialized by ipsec_init_pcbpolicy() function. This pointer keeps strorage for IPsec security policies associated with a specific socket. An application can use IP_IPSEC_POLICY and IPV6_IPSEC_POLICY socket options to configure these security policies. Then ip[6]_output() uses inpcb pointer to specify that an outgoing packet is associated with some socket. And IPSEC_OUTPUT() method can use a security policy stored in the inp_sp. For inbound packet the protocol-specific input routine uses IPSEC_CHECK_POLICY() method to check that a packet conforms to inbound security policy configured in the inpcb. SCTP protocol doesn't specify inpcb for ip[6]_output() when it sends packets. Thus IPSEC_OUTPUT() method does not consider such packets as associated with some socket and can not apply security policies from inpcb, even if they are configured. Since IPSEC_CHECK_POLICY() method is called from protocol-specific input routine, it can specify inpcb pointer and associated with socket inbound policy will be checked. But there are two problems: 1. Such check is asymmetric, becasue we can not apply security policy from inpcb for outgoing packet. 2. IPSEC_CHECK_POLICY() expects that caller holds INPCB lock and access to inp_sp is protected. But for SCTP this is not correct, becasue SCTP uses own locks to protect inpcb. To fix these problems remove IPsec related PCB code from SCTP. This imply that IP_IPSEC_POLICY and IPV6_IPSEC_POLICY socket options will be not applicable to SCTP sockets. To be able correctly check inbound security policies for SCTP, mark its protocol header with the PR_LASTHDR flag. Differential Revision: https://reviews.freebsd.org/D9538 MFC r313746: Add missing check to fix the build with IPSEC_SUPPORT and without MAC. MFC r313805: Fix LINT build for powerpc. Build kernel modules support only when both IPSEC and TCP_SIGNATURE are not defined. MFC r313922: For translated packets do not adjust UDP checksum if it is zero. In case when decrypted and decapsulated packet is an UDP datagram, check that its checksum is not zero before doing incremental checksum adjustment. MFC r314339: Document that the size of AH ICV for HMAC-SHA2-NNN should be half of NNN bits as described in RFC4868. PR: 215978 MFC r314812: Introduce the concept of IPsec security policies scope. Currently are defined three scopes: global, ifnet, and pcb. Generic security policies that IKE daemon can add via PF_KEY interface or an administrator creates with setkey(8) utility have GLOBAL scope. Such policies can be applied by the kernel to outgoing packets and checked agains inbound packets after IPsec processing. Security policies created by if_ipsec(4) interfaces have IFNET scope. Such policies are applied to packets that are passed through if_ipsec(4) interface. And security policies created by application using setsockopt() IP_IPSEC_POLICY option have PCB scope. Such policies are applied to packets related to specific socket. Currently there is no way to list PCB policies via setkey(8) utility. Modify setkey(8) and libipsec(3) to be able distinguish the scope of security policies in the `setkey -DP` listing. Add two optional flags: '-t' to list only policies related to virtual *tunneling* interfaces, i.e. policies with IFNET scope, and '-g' to list only policies with GLOBAL scope. By default policies from all scopes are listed. To implement this PF_KEY's sadb_x_policy structure was modified. sadb_x_policy_reserved field is used to pass the policy scope from the kernel to userland. SADB_SPDDUMP message extended to support filtering by scope: sadb_msg_satype field is used to specify bit mask of requested scopes. For IFNET policies the sadb_x_policy_priority field of struct sadb_x_policy is used to pass if_ipsec's interface if_index to the userland. For GLOBAL policies sadb_x_policy_priority is used only to manage order of security policies in the SPDB. For IFNET policies it is not used, so it can be used to keep if_index. After this change the output of `setkey -DP` now looks like: # setkey -DPt 0.0.0.0/0[any] 0.0.0.0/0[any] any in ipsec esp/tunnel/87.250.242.144-87.250.242.145/unique:145 spid=7 seq=3 pid=58025 scope=ifnet ifname=ipsec0 refcnt=1 # setkey -DPg ::/0 ::/0 icmp6 135,0 out none spid=5 seq=1 pid=872 scope=global refcnt=1 Obtained from: Yandex LLC Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D9805 PR: 212018 Relnotes: yes Sponsored by: Yandex LLC
| * MFC r315050:ae2017-03-181-1/+1
| | | | | | | | | | | | | | | | | | Fix the L2 address printed in the "arp: %s moved from %*D" message. In the r292978 struct llentry was changed and the ll_addr field become the pointer. PR: 217667
| * MFC r313821 r315277 r315286vangyzen2017-03-1712-166/+204
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Use inet_ntoa_r() instead of inet_ntoa() throughout the kernel. inet_ntoa() cannot be used safely in a multithreaded environment because it uses a static local buffer. Instead, use inet_ntoa_r() with a buffer on the caller's stack, except for KTR messages. KTR can correctly log the immediate integral values passed to it, as well as constant strings, but not non-constant strings, since they might change by the time ktrdump retrieves them. Therefore, use hex notation in KTR messages. Sponsored by: Dell EMC
| * MFC r305093 (by mjg@):dchagin2017-03-151-7/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fd: add fdeget_locked and use in kern_descrip MFC r305756 (by oshogbo@): fd: add fget_cap and fget_cap_locked primitives. They can be used to obtain capabilities along with a referenced fp. MFC r306174 (by oshogbo@): capsicum: propagate rights on accept(2) Descriptor returned by accept(2) should inherits capabilities rights from the listening socket. PR: 201052 MFC r306184 (by oshogbo@): fd: simplify fgetvp_rights by using fget_cap_locked. MFC r306225 (by mjg@): fd: fix up fgetvp_rights after r306184 fget_cap_locked returns a referenced file, but the fgetvp_rights does not need it. Instead, due to the filedesc lock being held, it can ref the vnode after the file was looked up. Fix up fget_cap_locked to be consistent with other _locked helpers and not ref the file. This plugs a leak introduced in r306184. MFC r306232 (by oshogbo@): fd: fix up fget_cap If the kernel is not compiled with the CAPABILITIES kernel options fget_unlocked doesn't return the sequence number so fd_modify will always report modification, in that case we got infinity loop. MFC r311474 (by glebius@): Use getsock_cap() instead of fgetsock(). MFC r312079 (by glebius@): Use getsock_cap() instead of deprecated fgetsock(). MFC r312081 (by glebius@): Use getsock_cap() instead of deprecated fgetsock(). MFC r312087 (by glebius@): Remove deprecated fgetsock() and fputsock(). Bump __FreeBSD_version as getsock_cap changed and fgetsock/fputsock pair removed.
* | Merge remote-tracking branch 'origin/stable/11' into devel-11Renato Botelho2017-02-231-4/+4
|\ \ | |/
| * MFC r313401vangyzen2017-02-101-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | Fix garbage IP addresses in UDP log_in_vain messages If multiple threads emit a UDP log_in_vain message concurrently, or indeed call inet_ntoa() for any other reason, the IP addresses could be garbage due to concurrent usage of a single string buffer inside inet_ntoa(). Use inet_ntoa_r() with two stack buffers instead. Relnotes: yes Sponsored by: Dell EMC
* | Merge remote-tracking branch 'origin/stable/11' into devel-11Luiz Otavio O Souza2017-02-093-13/+29
|\ \ | |/
| * MFC r312982:cy2017-02-061-2/+2
| | | | | | | | Correct comment grammar and make it easier to understand.
| * Revert MFC of 310847 and 310864jpaetzel2017-01-201-110/+16
| | | | | | | | | | Requested by glebius who had questions about the original head commit that I didn't see.
| * MFC 310847 310864jpaetzel2017-01-191-16/+110
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Harden CARP against network loops. If there is a loop in the network a CARP that is in MASTER state will see it's own broadcasts, which will then cause it to assume BACKUP state. When it assumes BACKUP it will stop sending advertisements. In that state it will no longer see advertisements and will assume MASTER... We can't catch all the cases where we are seeing our own CARP broadcast, but we can catch the obvious case. Unbreak ip_carp with WITHOUT_INET6 enabled by conditionalizing all IPv6 structs under the INET6 #ifdef. Similarly (even though it doesn't seem to affect the build), conditionalize all IPv4 structs under the INET #ifdef This also unbreaks the LINT-NOINET6 tinderbox target on amd64; I have not verified other MACHINE/TARGET pairs (e.g. armv6/arm). Submitted by: torek Obtained from: FreeNAS Pointyhat fix: ngie
| * MFC r311453hiren2017-01-121-0/+5
| | | | | | | | | | | | | | sysctl net.inet.tcp.hostcache.list in a jail can see connections from other jails and the host. This commit fixes it. PR: 200361
OpenPOWER on IntegriCloud