summaryrefslogtreecommitdiffstats
path: root/net/xfrm
Commit message (Collapse)AuthorAgeFilesLines
...
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2017-11-023-2/+4
|\ \ \ | |/ / | | | | | | | | | | | | | | | | | | Smooth Cong Wang's bug fix into 'net-next'. Basically put the bulk of the tcf_block_put() logic from 'net' into tcf_block_put_ext(), but after the offload unbind. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | xfrm: Fix GSO for IPsec with GRE tunnel.Steffen Klassert2017-10-311-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We reset the encapsulation field of the skb too early in xfrm_output. As a result, the GRE GSO handler does not segment the packets. This leads to a performance drop down. We fix this by resetting the encapsulation field right before we do the transformation, when the inner headers become invalid. Fixes: f1bd7d659ef0 ("xfrm: Add encapsulation header offsets while SKB is not encrypted") Reported-by: Vicente De Luca <vdeluca@zendesk.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * | xfrm: Clear sk_dst_cache when applying per-socket policy.Jonathan Basseri2017-10-261-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a socket has a valid dst cache, then xfrm_lookup_route will get skipped. However, the cache is not invalidated when applying policy to a socket (i.e. IPV6_XFRM_POLICY). The result is that new policies are sometimes ignored on those sockets. (Note: This was broken for IPv4 and IPv6 at different times.) This can be demonstrated like so, 1. Create UDP socket. 2. connect() the socket. 3. Apply an outbound XFRM policy to the socket. (setsockopt) 4. send() data on the socket. Packets will continue to be sent in the clear instead of matching an xfrm or returning a no-match error (EAGAIN). This affects calls to send() and not sendto(). Invalidating the sk_dst_cache is necessary to correctly apply xfrm policies. Since we do this in xfrm_user_policy(), the sk_lock was already acquired in either do_ip_setsockopt() or do_ipv6_setsockopt(), and we may call __sk_dst_reset(). Performance impact should be negligible, since this code is only called when changing xfrm policy, and only affects the socket in question. Fixes: 00bc0ef5880d ("ipv6: Skip XFRM lookup if dst_entry in socket cache is valid") Tested: https://android-review.googlesource.com/517555 Tested: https://android-review.googlesource.com/418659 Signed-off-by: Jonathan Basseri <misterikkit@google.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * | xfrm: Fix xfrm_dst_cache memleakSteffen Klassert2017-10-241-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have a memleak whenever a flow matches a policy without a matching SA. In this case we generate a dummy bundle and take an additional refcount on the dst_entry. This was needed as long as we had the flowcache. The flowcache removal patches deleted all related refcounts but forgot the one for the dummy bundle case. Fix the memleak by removing this refcount. Fixes: 3ca28286ea80 ("xfrm_policy: bypass flow_cache_lookup") Reported-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | | Merge branch 'master' of ↵David S. Miller2017-11-011-47/+58
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2017-10-30 1) Change some variables that can't be negative from int to unsigned int. From Alexey Dobriyan. 2) Remove a redundant header initialization in esp6. From Colin Ian King. 3) Some BUG to BUG_ON conversions. From Gustavo A. R. Silva. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: xfrm_user: use BUG_ON instead of if condition followed by BUGGustavo A. R. Silva2017-10-261-18/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use BUG_ON instead of if condition followed by BUG. This issue was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * | | xfrm: eradicate size_tAlexey Dobriyan2017-09-251-21/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All netlink message sizes are a) unsigned, b) can't be >= 4GB in size because netlink doesn't support >= 64KB messages in the first place. All those size_t across the code are a scam especially across networking which likes to work with small numbers like 1500 or 65536. Propagate unsignedness and flip some "int" to "unsigned int" as well. This is preparation to switching nlmsg_new() to "unsigned int". Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * | | xfrm: make xfrm_replay_state_esn_len() return unsigned intAlexey Dobriyan2017-09-251-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replay detection bitmaps can't have negative length. Comparisons with nla_len() are left signed just in case negative value can sneak in there. Propagate unsignedness for code size savings: add/remove: 0/0 grow/shrink: 0/5 up/down: 0/-38 (-38) function old new delta xfrm_state_construct 1802 1800 -2 xfrm_update_ae_params 295 289 -6 xfrm_state_migrate 1345 1339 -6 xfrm_replay_notify_esn 349 337 -12 xfrm_replay_notify_bmp 345 333 -12 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * | | xfrm: make xfrm_alg_auth_len() return unsigned intAlexey Dobriyan2017-09-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Key lengths can't be negative. Comparison with nla_len() is left signed just in case negative value can sneak in there. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * | | xfrm: make xfrm_alg_len() return unsigned intAlexey Dobriyan2017-09-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Key lengths can't be negative. Comparison with nla_len() is left signed just in case negative value can sneak in there. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * | | xfrm: make aead_len() return unsigned intAlexey Dobriyan2017-09-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Key lengths can't be negative. Comparison with nla_len() is left signed just in case negative value can sneak in there. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2017-10-302-18/+23
|\ \ \ \ | | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several conflicts here. NFP driver bug fix adding nfp_netdev_is_nfp_repr() check to nfp_fl_output() needed some adjustments because the code block is in an else block now. Parallel additions to net/pkt_cls.h and net/sch_generic.h A bug fix in __tcp_retransmit_skb() conflicted with some of the rbtree changes in net-next. The tc action RCU callback fixes in 'net' had some overlap with some of the recent tcf_block reworking. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | ipsec: Fix aborted xfrm policy dump crashHerbert Xu2017-10-231-10/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | An independent security researcher, Mohamed Ghannam, has reported this vulnerability to Beyond Security's SecuriTeam Secure Disclosure program. The xfrm_dump_policy_done function expects xfrm_dump_policy to have been called at least once or it will crash. This can be triggered if a dump fails because the target socket's receive buffer is full. This patch fixes it by using the cb->start mechanism to ensure that the initialisation is always done regardless of the buffer situation. Fixes: 12a169e7d8f4 ("ipsec: Put dumpers on the dump list") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * | | ipsec: Fix dst leak in xfrm_bundle_create().David Miller2017-10-111-8/+8
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | If we cannot find a suitable inner_mode value, we will leak the currently allocated 'xdst'. The fix is to make sure it is linked into the chain before erroring out. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | | xfrm: Convert timers to use timer_setup()Kees Cook2017-10-181-9/+8
|/ / | | | | | | | | | | | | | | | | | | | | | | | | In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() helper to pass the timer pointer explicitly. Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | xfrm: don't call xfrm_policy_cache_flush under xfrm_state_lockArtem Savkov2017-09-281-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | I might be wrong but it doesn't look like xfrm_state_lock is required for xfrm_policy_cache_flush and calling it under this lock triggers both "sleeping function called from invalid context" and "possible circular locking dependency detected" warnings on flush. Fixes: ec30d78c14a8 xfrm: add xdst pcpu cache Signed-off-by: Artem Savkov <asavkov@redhat.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | vti: fix NULL dereference in xfrm_input()Alexey Kodanev2017-09-131-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Can be reproduced with LTP tests: # icmp-uni-vti.sh -p ah -a sha256 -m tunnel -S fffffffe -k 1 -s 10 IPv4: RIP: 0010:xfrm_input+0x7f9/0x870 ... Call Trace: <IRQ> vti_input+0xaa/0x110 [ip_vti] ? skb_free_head+0x21/0x40 vti_rcv+0x33/0x40 [ip_vti] xfrm4_ah_rcv+0x33/0x60 ip_local_deliver_finish+0x94/0x1e0 ip_local_deliver+0x6f/0xe0 ? ip_route_input_noref+0x28/0x50 ... # icmp-uni-vti.sh -6 -p ah -a sha256 -m tunnel -S fffffffe -k 1 -s 10 IPv6: RIP: 0010:xfrm_input+0x7f9/0x870 ... Call Trace: <IRQ> xfrm6_rcv_tnl+0x3c/0x40 vti6_rcv+0xd5/0xe0 [ip6_vti] xfrm6_ah_rcv+0x33/0x60 ip6_input_finish+0xee/0x460 ip6_input+0x3f/0xb0 ip6_rcv_finish+0x45/0xa0 ipv6_rcv+0x34b/0x540 xfrm_input() invokes xfrm_rcv_cb() -> vti_rcv_cb(), the last callback might call skb_scrub_packet(), which in turn can reset secpath. Fix it by adding a check that skb->sp is not NULL. Fixes: 7e9e9202bccc ("xfrm: Clear RX SKB secpath xfrm_offload") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | xfrm: Fix negative device refcount on offload failure.Steffen Klassert2017-09-111-0/+1
| | | | | | | | | | | | | | | | | | | | Reset the offload device at the xfrm_state if the device was not able to offload the state. Otherwise we drop the device refcount twice. Fixes: d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API") Reported-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | xfrm: Fix deletion of offloaded SAs on failure.Steffen Klassert2017-09-111-0/+1
|/ | | | | | | | | | When we off load a SA, it gets pushed to the NIC before we can add it. In case of a failure, we don't delete this SA from the NIC. Fix this by calling xfrm_dev_state_delete on failure. Fixes: d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API") Reported-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2017-09-013-2/+19
|\ | | | | | | | | | | Three cases of simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
| * xfrm_user: fix info leak in build_aevent()Mathias Krause2017-08-281-0/+1
| | | | | | | | | | | | | | | | | | | | | | The memory reserved to dump the ID of the xfrm state includes a padding byte in struct xfrm_usersa_id added by the compiler for alignment. To prevent the heap info leak, memset(0) the sa_id before filling it. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Fixes: d51d081d6504 ("[IPSEC]: Sync series - user") Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * xfrm_user: fix info leak in build_expire()Mathias Krause2017-08-281-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | The memory reserved to dump the expired xfrm state includes padding bytes in struct xfrm_user_expire added by the compiler for alignment. To prevent the heap info leak, memset(0) the remainder of the struct. Initializing the whole structure isn't needed as copy_to_user_state() already takes care of clearing the padding bytes within the 'state' member. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * xfrm_user: fix info leak in xfrm_notify_sa()Mathias Krause2017-08-281-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | The memory reserved to dump the ID of the xfrm state includes a padding byte in struct xfrm_usersa_id added by the compiler for alignment. To prevent the heap info leak, memset(0) the whole struct before filling it. Cc: Herbert Xu <herbert@gondor.apana.org.au> Fixes: 0603eac0d6b7 ("[IPSEC]: Add XFRMA_SA/XFRMA_POLICY for delete notification") Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * xfrm_user: fix info leak in copy_user_offload()Mathias Krause2017-08-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | The memory reserved to dump the xfrm offload state includes padding bytes of struct xfrm_user_offload added by the compiler for alignment. Add an explicit memset(0) before filling the buffer to avoid the heap info leak. Cc: Steffen Klassert <steffen.klassert@secunet.com> Fixes: d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API") Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * net: xfrm: don't double-hold dst when sk_policy in use.Lorenzo Colitti2017-08-241-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While removing dst_entry garbage collection, commit 52df157f17e5 ("xfrm: take refcnt of dst when creating struct xfrm_dst bundle") changed xfrm_resolve_and_create_bundle so it returns an xdst with a refcount of 1 instead of 0. However, it did not delete the dst_hold performed by xfrm_lookup when a per-socket policy is in use. This means that when a socket policy is in use, dst entries returned by xfrm_lookup have a refcount of 2, and are not freed when no longer in use. Cc: Wei Wang <weiwan@google.com> Fixes: 52df157f17 ("xfrm: take refcnt of dst when creating struct xfrm_dst bundle") Tested: https://android-review.googlesource.com/417481 Tested: https://android-review.googlesource.com/418659 Tested: https://android-review.googlesource.com/424463 Tested: https://android-review.googlesource.com/452776 passes on net-next Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Acked-by: Wei Wang <weiwan@google.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * xfrm: policy: check policy direction valueVladis Dronov2017-08-031-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The 'dir' parameter in xfrm_migrate() is a user-controlled byte which is used as an array index. This can lead to an out-of-bound access, kernel lockup and DoS. Add a check for the 'dir' value. This fixes CVE-2017-11600. References: https://bugzilla.redhat.com/show_bug.cgi?id=1474928 Fixes: 80c9abaabf42 ("[XFRM]: Extension for dynamic update of endpoint address(es)") Cc: <stable@vger.kernel.org> # v2.6.21-rc1 Reported-by: "bo Zhang" <zhangbo5891001@gmail.com> Signed-off-by: Vladis Dronov <vdronov@redhat.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * xfrm: fix null pointer dereference on state and tmpl sortKoichiro Den2017-08-021-0/+8
| | | | | | | | | | | | | | | | | | | | | | Creating sub policy that matches the same outer flow as main policy does leads to a null pointer dereference if the outer mode's family is ipv4. For userspace compatibility, this patch just eliminates the crash i.e., does not introduce any new sorting rule, which would fruitlessly affect all but the aforementioned case. Signed-off-by: Koichiro Den <den@klaipeden.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | xfrm: Fix return value check of copy_sec_ctx.Steffen Klassert2017-08-311-2/+2
| | | | | | | | | | | | | | | | | | | | A recent commit added an output_mark. When copying this output_mark, the return value of copy_sec_ctx is overwitten without a check. Fix this by copying the output_mark before the security context. Fixes: 077fbac405bf ("net: xfrm: support setting an output mark.") Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | xfrm: Add support for network devices capable of removing the ESP trailerYossi Kuperman2017-08-311-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In conjunction with crypto offload [1], removing the ESP trailer by hardware can potentially improve the performance by avoiding (1) a cache miss incurred by reading the nexthdr field and (2) the necessity to calculate the csum value of the trailer in order to keep skb->csum valid. This patch introduces the changes to the xfrm stack and merely serves as an infrastructure. Subsequent patch to mlx5 driver will put this to a good use. [1] https://www.mail-archive.com/netdev@vger.kernel.org/msg175733.html Signed-off-by: Yossi Kuperman <yossiku@mellanox.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | Merge branch 'master' of ↵David S. Miller2017-08-216-15/+41
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2017-08-21 1) Support RX checksum with IPsec crypto offload for esp4/esp6. From Ilan Tayari. 2) Fixup IPv6 checksums when doing IPsec crypto offload. From Yossi Kuperman. 3) Auto load the xfrom offload modules if a user installs a SA that requests IPsec offload. From Ilan Tayari. 4) Clear RX offload informations in xfrm_input to not confuse the TX path with stale offload informations. From Ilan Tayari. 5) Allow IPsec GSO for local sockets if the crypto operation will be offloaded. 6) Support setting of an output mark to the xfrm_state. This mark can be used to to do the tunnel route lookup. From Lorenzo Colitti. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: xfrm: support setting an output mark.Lorenzo Colitti2017-08-114-9/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On systems that use mark-based routing it may be necessary for routing lookups to use marks in order for packets to be routed correctly. An example of such a system is Android, which uses socket marks to route packets via different networks. Currently, routing lookups in tunnel mode always use a mark of zero, making routing incorrect on such systems. This patch adds a new output_mark element to the xfrm state and a corresponding XFRMA_OUTPUT_MARK netlink attribute. The output mark differs from the existing xfrm mark in two ways: 1. The xfrm mark is used to match xfrm policies and states, while the xfrm output mark is used to set the mark (and influence the routing) of the packets emitted by those states. 2. The existing mark is constrained to be a subset of the bits of the originating socket or transformed packet, but the output mark is arbitrary and depends only on the state. The use of a separate mark provides additional flexibility. For example: - A packet subject to two transforms (e.g., transport mode inside tunnel mode) can have two different output marks applied to it, one for the transport mode SA and one for the tunnel mode SA. - On a system where socket marks determine routing, the packets emitted by an IPsec tunnel can be routed based on a mark that is determined by the tunnel, not by the marks of the unencrypted packets. - Support for setting the output marks can be introduced without breaking any existing setups that employ both mark-based routing and xfrm tunnel mode. Simply changing the code to use the xfrm mark for routing output packets could xfrm mark could change behaviour in a way that breaks these setups. If the output mark is unspecified or set to zero, the mark is not set or changed. Tested: make allyesconfig; make -j64 Tested: https://android-review.googlesource.com/452776 Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * | xfrm: Clear RX SKB secpath xfrm_offloadIlan Tayari2017-08-021-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If an incoming packet undergoes XFRM crypto-offload, its secpath is filled with xfrm_offload struct denoting offload information. If the SKB is then forwarded to a device which supports crypto- offload, the stack wrongfully attempts to offload it (even though the output SA may not exist on the device) due to the leftover secpath xo. Clear the ingress xo by zeroizing secpath->olen just before delivering the decapsulated packet to the network stack. Fixes: d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API") Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * | xfrm: Auto-load xfrm offload modulesIlan Tayari2017-08-023-6/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | IPSec crypto offload depends on the protocol-specific offload module (such as esp_offload.ko). When the user installs an SA with crypto-offload, load the offload module automatically, in the same way that the protocol module is loaded (such as esp.ko) Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | | xfrm: check that cached bundle is still validFlorian Westphal2017-08-071-1/+2
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Quoting Ilan Tayari: 1. Set up a host-to-host IPSec tunnel (or transport, doesn't matter) 2. Ping over IPSec, or do something to populate the pcpu cache 3. Join a MC group, then leave MC group 4. Try to ping again using same CPU as before -> traffic doesn't egress the machine at all Ilan debugged the problem down to the fact that one of the path dsts devices point to lo due to earlier dst_dev_put(). In this case, dst is marked as DEAD and we cannot reuse the bundle. The cache only asserted that the requested policy and that of the cached bundle match, but its not enough - also verify the path is still valid. Fixes: ec30d78c14a813 ("xfrm: add xdst pcpu cache") Reported-by: Ayham Masood <ayhamm@mellanox.com> Tested-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | xfrm: add xdst pcpu cacheFlorian Westphal2017-07-183-3/+131
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | retain last used xfrm_dst in a pcpu cache. On next request, reuse this dst if the policies are the same. The cache will not help with strict RR workloads as there is no hit. The cache packet-path part is reasonably small, the notifier part is needed so we do not add long hangs when a device is dismantled but some pcpu xdst still holds a reference, there are also calls to the flush operation when userspace deletes SAs so modules can be removed (there is no hit. We need to run the dst_release on the correct cpu to avoid races with packet path. This is done by adding a work_struct for each cpu and then doing the actual test/release on each affected cpu via schedule_work_on(). Test results using 4 network namespaces and null encryption: ns1 ns2 -> ns3 -> ns4 netperf -> xfrm/null enc -> xfrm/null dec -> netserver what TCP_STREAM UDP_STREAM UDP_RR Flow cache: 14644.61 294.35 327231.64 No flow cache: 14349.81 242.64 202301.72 Pcpu cache: 14629.70 292.21 205595.22 UDP tests used 64byte packets, tests ran for one minute each, value is average over ten iterations. 'Flow cache' is 'net-next', 'No flow cache' is net-next plus this series but without this patch. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | xfrm: remove flow cacheFlorian Westphal2017-07-183-112/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | After rcu conversions performance degradation in forward tests isn't that noticeable anymore. See next patch for some numbers. A followup patcg could then also remove genid from the policies as we do not cache bundles anymore. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | xfrm_policy: make xfrm_bundle_lookup return xfrm dst objectFlorian Westphal2017-07-181-16/+12
| | | | | | | | | | | | | | This allows to remove flow cache object embedded in struct xfrm_dst. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | xfrm_policy: remove xfrm_policy_lookupFlorian Westphal2017-07-181-32/+4
| | | | | | | | | | | | | | | | This removes the wrapper and renames the __xfrm_policy_lookup variant to get rid of another place that used flow cache objects. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | xfrm_policy: kill flow to policy dir conversionFlorian Westphal2017-07-181-42/+4
| | | | | | | | | | | | | | | | | | XFRM_POLICY_IN/OUT/FWD are identical to FLOW_DIR_*, so gcc already removed this function as its just returns the argument. Again, no code change. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | xfrm_policy: remove always true/false branchesFlorian Westphal2017-07-181-60/+14
| | | | | | | | | | | | | | | | after previous change oldflo and xdst are always NULL. These branches were already removed by gcc, this doesn't change code. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | xfrm_policy: bypass flow_cache_lookupFlorian Westphal2017-07-181-9/+5
|/ | | | | | | | | | | Instead of consulting flow cache, call the xfrm bundle/policy lookup functions directly. This pretends the flow cache had no entry. This helps to gradually remove flow cache integration, followup commit will remove the dead code that this change adds. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net, xfrm: convert sec_path.refcnt from atomic_t to refcount_tReshetova, Elena2017-07-041-2/+2
| | | | | | | | | | | | | | refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net, xfrm: convert xfrm_policy.refcnt from atomic_t to refcount_tReshetova, Elena2017-07-041-2/+2
| | | | | | | | | | | | | | refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net, xfrm: convert xfrm_state.refcnt from atomic_t to refcount_tReshetova, Elena2017-07-041-2/+2
| | | | | | | | | | | | | | refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2017-06-304-6/+4
|\ | | | | | | | | | | | | A set of overlapping changes in macvlan and the rocker driver, nothing serious. Signed-off-by: David S. Miller <davem@davemloft.net>
| * xfrm: move xfrm_garbage_collect out of xfrm_policy_flushHangbin Liu2017-06-122-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now we will force to do garbage collection if any policy removed in xfrm_policy_flush(). But during xfrm_net_exit(). We call flow_cache_fini() first and set set fc->percpu to NULL. Then after we call xfrm_policy_fini() -> frxm_policy_flush() -> flow_cache_flush(), we will get NULL pointer dereference when check percpu_empty. The code path looks like: flow_cache_fini() - fc->percpu = NULL xfrm_policy_fini() - xfrm_policy_flush() - xfrm_garbage_collect() - flow_cache_flush() - flow_cache_percpu_empty() - fcp = per_cpu_ptr(fc->percpu, cpu) To reproduce, just add ipsec in netns and then remove the netns. v2: As Xin Long suggested, since only two other places need to call it. move xfrm_garbage_collect() outside xfrm_policy_flush(). v3: Fix subject mismatch after v2 fix. Fixes: 35db06912189 ("xfrm: do the garbage collection after flushing policy") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * xfrm: fix xfrm_dev_event() missing when compile without CONFIG_XFRM_OFFLOADHangbin Liu2017-06-072-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In commit d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API") we make xfrm_device.o only compiled when enable option CONFIG_XFRM_OFFLOAD. But this will make xfrm_dev_event() missing if we only enable default XFRM options. Then if we set down and unregister an interface with IPsec on it. there will no xfrm_garbage_collect(), which will cause dev usage count hold and get error like: unregister_netdevice: waiting for <dev> to become free. Usage count = 4 Fixes: d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | Merge branch 'master' of ↵David S. Miller2017-06-234-32/+55
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2017-06-23 1) Use memdup_user to spmlify xfrm_user_policy. From Geliang Tang. 2) Make xfrm_dev_register static to silence a sparse warning. From Wei Yongjun. 3) Use crypto_memneq to check the ICV in the AH protocol. From Sabrina Dubroca. 4) Remove some unused variables in esp6. From Stephen Hemminger. 5) Extend XFRM MIGRATE to allow to change the UDP encapsulation port. From Antony Antony. 6) Include the UDP encapsulation port to km_migrate announcements. From Antony Antony. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | xfrm: add UDP encapsulation port in migrate messageAntony Antony2017-06-073-9/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | Add XFRMA_ENCAP, UDP encapsulation port, to km_migrate announcement to userland. Only add if XFRMA_ENCAP was in user migrate request. Signed-off-by: Antony Antony <antony@phenome.org> Reviewed-by: Richard Guy Briggs <rgb@tricolour.ca> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * | xfrm: extend MIGRATE with UDP encapsulation portAntony Antony2017-06-073-14/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add UDP encapsulation port to XFRM_MSG_MIGRATE using an optional netlink attribute XFRMA_ENCAP. The devices that support IKE MOBIKE extension (RFC-4555 Section 3.8) could go to sleep for a few minutes and wake up. When it wake up the NAT mapping could have expired, the device send a MOBIKE UPDATE_SA message to migrate the IPsec SA. The change could be a change UDP encapsulation port, IP address, or both. Reported-by: Paul Wouters <pwouters@redhat.com> Signed-off-by: Antony Antony <antony@phenome.org> Reviewed-by: Richard Guy Briggs <rgb@tricolour.ca> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
OpenPOWER on IntegriCloud