summaryrefslogtreecommitdiffstats
path: root/include/net/xfrm.h
Commit message (Collapse)AuthorAgeFilesLines
* xfrm: add xdst pcpu cacheFlorian Westphal2017-07-181-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | retain last used xfrm_dst in a pcpu cache. On next request, reuse this dst if the policies are the same. The cache will not help with strict RR workloads as there is no hit. The cache packet-path part is reasonably small, the notifier part is needed so we do not add long hangs when a device is dismantled but some pcpu xdst still holds a reference, there are also calls to the flush operation when userspace deletes SAs so modules can be removed (there is no hit. We need to run the dst_release on the correct cpu to avoid races with packet path. This is done by adding a work_struct for each cpu and then doing the actual test/release on each affected cpu via schedule_work_on(). Test results using 4 network namespaces and null encryption: ns1 ns2 -> ns3 -> ns4 netperf -> xfrm/null enc -> xfrm/null dec -> netserver what TCP_STREAM UDP_STREAM UDP_RR Flow cache: 14644.61 294.35 327231.64 No flow cache: 14349.81 242.64 202301.72 Pcpu cache: 14629.70 292.21 205595.22 UDP tests used 64byte packets, tests ran for one minute each, value is average over ten iterations. 'Flow cache' is 'net-next', 'No flow cache' is net-next plus this series but without this patch. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: remove flow cacheFlorian Westphal2017-07-181-8/+0
| | | | | | | | | | | | | After rcu conversions performance degradation in forward tests isn't that noticeable anymore. See next patch for some numbers. A followup patcg could then also remove genid from the policies as we do not cache bundles anymore. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net, xfrm: convert sec_path.refcnt from atomic_t to refcount_tReshetova, Elena2017-07-041-3/+3
| | | | | | | | | | | | | | refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net, xfrm: convert xfrm_policy.refcnt from atomic_t to refcount_tReshetova, Elena2017-07-041-3/+3
| | | | | | | | | | | | | | refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net, xfrm: convert xfrm_state.refcnt from atomic_t to refcount_tReshetova, Elena2017-07-041-4/+5
| | | | | | | | | | | | | | refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2017-06-301-5/+2
|\ | | | | | | | | | | | | A set of overlapping changes in macvlan and the rocker driver, nothing serious. Signed-off-by: David S. Miller <davem@davemloft.net>
| * xfrm: fix xfrm_dev_event() missing when compile without CONFIG_XFRM_OFFLOADHangbin Liu2017-06-071-5/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In commit d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API") we make xfrm_device.o only compiled when enable option CONFIG_XFRM_OFFLOAD. But this will make xfrm_dev_event() missing if we only enable default XFRM options. Then if we set down and unregister an interface with IPsec on it. there will no xfrm_garbage_collect(), which will cause dev usage count hold and get error like: unregister_netdevice: waiting for <dev> to become free. Usage count = 4 Fixes: d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | Merge branch 'master' of ↵David S. Miller2017-06-231-4/+8
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2017-06-23 1) Use memdup_user to spmlify xfrm_user_policy. From Geliang Tang. 2) Make xfrm_dev_register static to silence a sparse warning. From Wei Yongjun. 3) Use crypto_memneq to check the ICV in the AH protocol. From Sabrina Dubroca. 4) Remove some unused variables in esp6. From Stephen Hemminger. 5) Extend XFRM MIGRATE to allow to change the UDP encapsulation port. From Antony Antony. 6) Include the UDP encapsulation port to km_migrate announcements. From Antony Antony. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * xfrm: add UDP encapsulation port in migrate messageAntony Antony2017-06-071-2/+4
| | | | | | | | | | | | | | | | | | Add XFRMA_ENCAP, UDP encapsulation port, to km_migrate announcement to userland. Only add if XFRMA_ENCAP was in user migrate request. Signed-off-by: Antony Antony <antony@phenome.org> Reviewed-by: Richard Guy Briggs <rgb@tricolour.ca> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * xfrm: extend MIGRATE with UDP encapsulation portAntony Antony2017-06-071-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add UDP encapsulation port to XFRM_MSG_MIGRATE using an optional netlink attribute XFRMA_ENCAP. The devices that support IKE MOBIKE extension (RFC-4555 Section 3.8) could go to sleep for a few minutes and wake up. When it wake up the NAT mapping could have expired, the device send a MOBIKE UPDATE_SA message to migrate the IPsec SA. The change could be a change UDP encapsulation port, IP address, or both. Reported-by: Paul Wouters <pwouters@redhat.com> Signed-off-by: Antony Antony <antony@phenome.org> Reviewed-by: Richard Guy Briggs <rgb@tricolour.ca> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | xfrm: fix stack access out of bounds with CONFIG_XFRM_SUB_POLICYSabrina Dubroca2017-05-041-10/+0
|/ | | | | | | | | | | | | | | | | | | When CONFIG_XFRM_SUB_POLICY=y, xfrm_dst stores a copy of the flowi for that dst. Unfortunately, the code that allocates and fills this copy doesn't care about what type of flowi (flowi, flowi4, flowi6) gets passed. In multiple code paths (from raw_sendmsg, from TCP when replying to a FIN, in vxlan, geneve, and gre), the flowi that gets passed to xfrm is actually an on-stack flowi4, so we end up reading stuff from the stack past the end of the flowi4 struct. Since xfrm_dst->origin isn't used anywhere following commit ca116922afa8 ("xfrm: Eliminate "fl" and "pol" args to xfrm_bundle_ok()."), just get rid of it. xfrm_dst->partner isn't used either, so get rid of that too. Fixes: 9d6ec938019c ("ipv4: Use flowi4 in public route lookup interfaces.") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* net: Add a xfrm validate function to validate_xmit_skbSteffen Klassert2017-04-141-0/+6
| | | | | | | | | | | | | When we do IPsec offloading, we need a fallback for packets that were targeted to be IPsec offloaded but rerouted to a device that does not support IPsec offload. For that we add a function that checks the offloading features of the sending device and and flags the requirement of a fallback before it calls the IPsec output function. The IPsec output function adds the IPsec trailer and does encryption if needed. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* xfrm: Add an IPsec hardware offloading APISteffen Klassert2017-04-141-1/+64
| | | | | | | | | | | | | | | | | | | This patch adds all the bits that are needed to do IPsec hardware offload for IPsec states and ESP packets. We add xfrmdev_ops to the net_device. xfrmdev_ops has function pointers that are needed to manage the xfrm states in the hardware and to do a per packet offloading decision. Joint work with: Ilan Tayari <ilant@mellanox.com> Guy Shapiro <guysh@mellanox.com> Yossi Kuperman <yossiku@mellanox.com> Signed-off-by: Guy Shapiro <guysh@mellanox.com> Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Yossi Kuperman <yossiku@mellanox.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* xfrm: Add mode handlers for IPsec on layer 2Steffen Klassert2017-04-141-0/+10
| | | | | | | | This patch adds a gso_segment and xmit callback for the xfrm_mode and implement these functions for tunnel and transport mode. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* xfrm: Move device notifications to a sepatate fileSteffen Klassert2017-04-141-0/+1
| | | | | | This is needed for the upcomming IPsec device offloading. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* xfrm: Add a xfrm type offload.Steffen Klassert2017-04-141-6/+22
| | | | | | | | We add a struct xfrm_type_offload so that we have the offloaded codepath separated to the non offloaded codepath. With this the non offloade and the offloaded codepath can coexist. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* xfrm: branchless addr4_match() on 64-bitAlexey Dobriyan2017-03-271-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current addr4_match() code has special test for /0 prefixes because of standard required undefined behaviour. However, it is possible to omit it on 64-bit because shifting can be done within a 64-bit register and then truncated to the expected value (which is 0 mask). Implicit truncation by htonl() fits nicely into R32-within-R64 model on x86-64. Space savings: none (coincidence) Branch savings: 1 Before: movzx eax,BYTE PTR [rdi+0x2a] # ->prefixlen_d test al,al jne xfrm_selector_match + 0x23f ... movzx eax,BYTE PTR [rbx+0x2b] # ->prefixlen_s test al,al je xfrm_selector_match + 0x1c7 After (no branches): mov r8d,0x20 mov rdx,0xffffffffffffffff mov esi,DWORD PTR [rsi+0x2c] mov ecx,r8d sub cl,BYTE PTR [rdi+0x2a] xor esi,DWORD PTR [rbx] mov rdi,rdx xor eax,eax shl rdi,cl bswap edi Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* xfrm: use "unsigned int" in addr_match()Alexey Dobriyan2017-03-241-3/+3
| | | | | | | | | | | | | | | x86_64 is zero-extending arch so "unsigned int" is preferred over "int" for address calculations and extending to size_t. Space savings: add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-24 (-24) function old new delta xfrm_state_walk 708 696 -12 xfrm_selector_match 918 906 -12 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* xfrm: remove unused struct xfrm_mgr::idAlexey Dobriyan2017-03-241-1/+0
| | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* esp: Add a software GRO codepathSteffen Klassert2017-02-151-0/+1
| | | | | | | | | | | | This patch adds GRO ifrastructure and callbacks for ESP on ipv4 and ipv6. In case the GRO layer detects an ESP packet, the esp{4,6}_gro_receive() function does a xfrm state lookup and calls the xfrm input layer if it finds a matching state. The packet will be decapsulated and reinjected it into layer 2. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* xfrm: Extend the sec_path for IPsec offloadingSteffen Klassert2017-02-151-0/+41
| | | | | | | | We need to keep per packet offloading informations across the layers. So we extend the sec_path to carry these for the input and output offload codepath. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* xfrm: Export xfrm_parse_spi.Steffen Klassert2017-02-151-0/+1
| | | | | | We need it in the ESP offload handlers, so export it. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* xfrm: Add a secpath_set helper.Steffen Klassert2017-02-151-0/+1
| | | | | | | | Add a new helper to set the secpath to the skb. This avoids code duplication, as this is used in multiple places. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* xfrm: policy: remove family fieldFlorian Westphal2017-02-091-3/+2
| | | | | | | Only needed it to register the policy backend at init time. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* xfrm: policy: remove garbage_collect callbackFlorian Westphal2017-02-091-1/+1
| | | | | | | | | Just call xfrm_garbage_collect_deferred() directly. This gets rid of a write to afinfo in register/unregister and allows to constify afinfo later on. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* xfrm: policy: xfrm_policy_unregister_afinfo can return voidFlorian Westphal2017-02-091-1/+1
| | | | | | | | | | Nothing checks the return value. Also, the errors returned on unregister are impossible (we only support INET and INET6, so no way xfrm_policy_afinfo[afinfo->family] can be anything other than 'afinfo' itself). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* xfrm: input: constify xfrm_input_afinfoFlorian Westphal2017-02-091-3/+2
| | | | | | | | | | Nothing writes to these structures (the module owner was not used). While at it, size xfrm_input_afinfo[] by the highest existing xfrm family (INET6), not AF_MAX. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* esp4: Avoid skb_cow_data whenever possibleSteffen Klassert2017-01-171-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | This patch tries to avoid skb_cow_data on esp4. On the encrypt side we add the IPsec tailbits to the linear part of the buffer if there is space on it. If there is no space on the linear part, we add a page fragment with the tailbits to the buffer and use separate src and dst scatterlists. On the decrypt side, we leave the buffer as it is if it is not cloned. With this, we can avoid a linearization of the buffer in most of the cases. Joint work with: Sowmini Varadhan <sowmini.varadhan@oracle.com> Ilan Tayari <ilant@mellanox.com> Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* xfrm: add and use xfrm_state_afinfo_get_rcuFlorian Westphal2017-01-101-0/+1
| | | | | | | | | | | | | | | | xfrm_init_tempstate is always called from within rcu read side section. We can thus use a simpler function that doesn't call rcu_read_lock again. While at it, also make xfrm_init_tempstate return value void, the return value was never tested. A followup patch will replace remaining callers of xfrm_state_get_afinfo with xfrm_state_afinfo_get_rcu variant and then remove the 'old' get_afinfo interface. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* xfrm: remove xfrm_state_put_afinfoFlorian Westphal2017-01-101-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | commit 44abdc3047aecafc141dfbaf1ed ("xfrm: replace rwlock on xfrm_state_afinfo with rcu") made xfrm_state_put_afinfo equivalent to rcu_read_unlock. Use spatch to replace it with direct calls to rcu_read_unlock: @@ struct xfrm_state_afinfo *a; @@ - xfrm_state_put_afinfo(a); + rcu_read_unlock(); old: text data bss dec hex filename 22570 72 424 23066 5a1a xfrm_state.o 1612 0 0 1612 64c xfrm_output.o new: 22554 72 424 23050 5a0a xfrm_state.o 1596 0 0 1596 63c xfrm_output.o Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2016-09-231-1/+3
|\
| * vti6: fix input pathNicolas Dichtel2016-09-211-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 1625f4529957, vti6 is broken, all input packets are dropped (LINUX_MIB_XFRMINNOSTATES is incremented). XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip6 is set by vti6_rcv() before calling xfrm6_rcv()/xfrm6_rcv_spi(), thus we cannot set to NULL that value in xfrm6_rcv_spi(). A new function xfrm6_rcv_tnl() that enables to pass a value to xfrm6_rcv_spi() is added, so that xfrm6_rcv() is not touched (this function is used in several handlers). CC: Alexey Kodanev <alexey.kodanev@oracle.com> Fixes: 1625f4529957 ("net/xfrm_input: fix possible NULL deref of tunnel.ip6->parms.i_key") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | xfrm: constify xfrm_replay structuresJulia Lawall2016-08-101-1/+1
|/ | | | | | | | | The xfrm_replay structures are never modified, so declare them as const. Done with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* net: xfrm: kill XFRM_INC_STATS_BH()Eric Dumazet2016-04-271-2/+0
| | | | | | | Not used anymore. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: snmp: kill various STATS_USER() helpersEric Dumazet2016-04-271-2/+0
| | | | | | | | | | | | | | | | | | | | | In the old days (before linux-3.0), SNMP counters were duplicated, one for user context, and one for BH context. After commit 8f0ea0fe3a03 ("snmp: reduce percpu needs by 50%") we have a single copy, and what really matters is preemption being enabled or disabled, since we use this_cpu_inc() or __this_cpu_inc() respectively. We therefore kill SNMP_INC_STATS_USER(), SNMP_ADD_STATS_USER(), NET_INC_STATS_USER(), NET_ADD_STATS_USER(), SCTP_INC_STATS_USER(), SNMP_INC_STATS64_USER(), SNMP_ADD_STATS64_USER(), TCP_ADD_STATS_USER(), UDP_INC_STATS_USER(), UDP6_INC_STATS_USER(), and XFRM_INC_STATS_USER() Following patches will rename __BH helpers to make clear their usage is not tied to BH being disabled. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: add rcu protection to sk->sk_policy[]Eric Dumazet2015-12-111-9/+15
| | | | | | | | | | | | | | XFRM can deal with SYNACK messages, sent while listener socket is not locked. We add proper rcu protection to __xfrm_sk_clone_policy() and xfrm_sk_policy_lookup() This might serve as the first step to remove xfrm.xfrm_policy_lock use in fast path. Fixes: fa76ce7328b2 ("inet: get rid of central tcp/dccp listener timer") Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: add rcu grace period in xfrm_policy_destroy()Eric Dumazet2015-12-111-0/+1
| | | | | | | | | | | | We will soon switch sk->sk_policy[] to RCU protection, as SYNACK packets are sent while listener socket is not locked. This patch simply adds RCU grace period before struct xfrm_policy freeing, and the corresponding rcu_head in struct xfrm_policy. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* dst: Pass net into dst->outputEric W. Biederman2015-10-081-3/+3
| | | | | | | | The network namespace is already passed into dst_output pass it into dst->output lwt->output and friends. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: Remove unused afinfo method init_dstEric W. Biederman2015-09-171-2/+0
| | | | | Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: Add oif to dst lookupsDavid Ahern2015-08-111-2/+5
| | | | | | | | | Rules can be installed that direct route lookups to specific tables based on oif. Plumb the oif through the xfrm lookups so it gets set in the flow struct and passed to the resolver routines. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* ipsec: Add IV generator information to xfrm_stateHerbert Xu2015-05-281-0/+1
| | | | | | | This patch adds IV generator information to xfrm_state. This is currently obtained from our own list of algorithm descriptions. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* xfrm: Add IV generator information to xfrm_algo_descHerbert Xu2015-05-281-0/+2
| | | | | | | | This patch adds IV generator information for each AEAD and block cipher to xfrm_algo_desc. This will be used to access the new AEAD interface. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* netfilter: Pass socket pointer down through okfn().David Miller2015-04-071-4/+4
| | | | | | | | | | | | | | | | | | | On the output paths in particular, we have to sometimes deal with two socket contexts. First, and usually skb->sk, is the local socket that generated the frame. And second, is potentially the socket used to control a tunneling socket, such as one the encapsulates using UDP. We do not want to disassociate skb->sk when encapsulating in order to fix this, because that would break socket memory accounting. The most extreme case where this can cause huge problems is an AF_PACKET socket transmitting over a vxlan device. We hit code paths doing checks that assume they are dealing with an ipv4 socket, but are actually operating upon the AF_PACKET one. Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: simplify xfrm_address_t useJiri Benc2015-03-311-3/+3
| | | | | | | | | | | In many places, the a6 field is typecasted to struct in6_addr. As the fields are in union anyway, just add in6_addr type to the union and get rid of the typecasting. Modifying the uapi header is okay, the union has still the same size. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Introduce possible_net_tEric W. Biederman2015-03-121-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Having to say > #ifdef CONFIG_NET_NS > struct net *net; > #endif in structures is a little bit wordy and a little bit error prone. Instead it is possible to say: > typedef struct { > #ifdef CONFIG_NET_NS > struct net *net; > #endif > } possible_net_t; And then in a header say: > possible_net_t net; Which is cleaner and easier to use and easier to test, as the possible_net_t is always there no matter what the compile options. Further this allows read_pnet and write_pnet to be functions in all cases which is better at catching typos. This change adds possible_net_t, updates the definitions of read_pnet and write_pnet, updates optional struct net * variables that write_pnet uses on to have the type possible_net_t, and finally fixes up the b0rked users of read_pnet and write_pnet. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: configure policy hash table thresholds by netlinkChristophe Gouault2014-09-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enable to specify local and remote prefix length thresholds for the policy hash table via a netlink XFRM_MSG_NEWSPDINFO message. prefix length thresholds are specified by XFRMA_SPD_IPV4_HTHRESH and XFRMA_SPD_IPV6_HTHRESH optional attributes (struct xfrmu_spdhthresh). example: struct xfrmu_spdhthresh thresh4 = { .lbits = 0; .rbits = 24; }; struct xfrmu_spdhthresh thresh6 = { .lbits = 0; .rbits = 56; }; struct nlmsghdr *hdr; struct nl_msg *msg; msg = nlmsg_alloc(); hdr = nlmsg_put(msg, NL_AUTO_PORT, NL_AUTO_SEQ, XFRMA_SPD_IPV4_HTHRESH, sizeof(__u32), NLM_F_REQUEST); nla_put(msg, XFRMA_SPD_IPV4_HTHRESH, sizeof(thresh4), &thresh4); nla_put(msg, XFRMA_SPD_IPV6_HTHRESH, sizeof(thresh6), &thresh6); nla_send_auto(sk, msg); The numbers are the policy selector minimum prefix lengths to put a policy in the hash table. - lbits is the local threshold (source address for out policies, destination address for in and fwd policies). - rbits is the remote threshold (destination address for out policies, source address for in and fwd policies). The default values are: XFRMA_SPD_IPV4_HTHRESH: 32 32 XFRMA_SPD_IPV6_HTHRESH: 128 128 Dynamic re-building of the SPD is performed when the thresholds values are changed. The current thresholds can be read via a XFRM_MSG_GETSPDINFO request: the kernel replies to XFRM_MSG_GETSPDINFO requests by an XFRM_MSG_NEWSPDINFO message, with both attributes XFRMA_SPD_IPV4_HTHRESH and XFRMA_SPD_IPV6_HTHRESH. Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* xfrm: Remove useless xfrm_audit struct.Tetsuo Handa2014-04-231-23/+19
| | | | | | | | | | | | | Commit f1370cc4 "xfrm: Remove useless secid field from xfrm_audit." changed "struct xfrm_audit" to have either { audit_get_loginuid(current) / audit_get_sessionid(current) } or { INVALID_UID / -1 } pair. This means that we can represent "struct xfrm_audit" as "bool". This patch replaces "struct xfrm_audit" argument with "bool". Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* xfrm: Remove useless secid field from xfrm_audit.Tetsuo Handa2014-04-221-19/+10
| | | | | | | | | | | | | | | | | | | | | | | | | It seems to me that commit ab5f5e8b "[XFRM]: xfrm audit calls" is doing something strange at xfrm_audit_helper_usrinfo(). If secid != 0 && security_secid_to_secctx(secid) != 0, the caller calls audit_log_task_context() which basically does secid != 0 && security_secid_to_secctx(secid) == 0 case except that secid is obtained from current thread's context. Oh, what happens if secid passed to xfrm_audit_helper_usrinfo() was obtained from other thread's context? It might audit current thread's context rather than other thread's context if security_secid_to_secctx() in xfrm_audit_helper_usrinfo() failed for some reason. Then, are all the caller of xfrm_audit_helper_usrinfo() passing either secid obtained from current thread's context or secid == 0? It seems to me that they are. If I didn't miss something, we don't need to pass secid to xfrm_audit_helper_usrinfo() because audit_log_task_context() will obtain secid from current thread's context. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* ipv4: add a sock pointer to dst->output() path.Eric Dumazet2014-04-151-3/+3
| | | | | | | | | | | | | | | | In the dst->output() path for ipv4, the code assumes the skb it has to transmit is attached to an inet socket, specifically via ip_mc_output() : The sk_mc_loop() test triggers a WARN_ON() when the provider of the packet is an AF_PACKET socket. The dst->output() method gets an additional 'struct sock *sk' parameter. This needs a cascade of changes so that this parameter can be propagated from vxlan to final consumer. Fixes: 8f646c922d55 ("vxlan: keep original skb ownership") Reported-by: lucien xin <lucien.xin@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2014-03-181-22/+28
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== One patch to rename a newly introduced struct. The rest is the rework of the IPsec virtual tunnel interface for ipv6 to support inter address family tunneling and namespace crossing. 1) Rename the newly introduced struct xfrm_filter to avoid a conflict with iproute2. From Nicolas Dichtel. 2) Introduce xfrm_input_afinfo to access the address family dependent tunnel callback functions properly. 3) Add and use a IPsec protocol multiplexer for ipv6. 4) Remove dst_entry caching. vti can lookup multiple different dst entries, dependent of the configured xfrm states. Therefore it does not make to cache a dst_entry. 5) Remove caching of flow informations. vti6 does not use the the tunnel endpoint addresses to do route and xfrm lookups. 6) Update the vti6 to use its own receive hook. 7) Remove the now unused xfrm_tunnel_notifier. This was used from vti and is replaced by the IPsec protocol multiplexer hooks. 8) Support inter address family tunneling for vti6. 9) Check if the tunnel endpoints of the xfrm state and the vti interface are matching and return an error otherwise. 10) Enable namespace crossing for vti devices. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
OpenPOWER on IntegriCloud