path: root/net/netfilter/ipvs/ip_vs_core.c
Commit message [Author, Age, Files, Lines]
* ipvs: use skb_to_full_sk() helper [Eric Dumazet, 2015-11-15, 1 file, -8/+8]
|   SYNACK packets might be attached to request sockets. Use skb_to_full_sk() helper to avoid illegal accesses to inet_sk(skb->sk)
|
|   Fixes: ca6fb0651883 ("tcp: attach SYNACK messages to request sockets instead of listener")
|   Signed-off-by: Eric Dumazet <edumazet@google.com>
|   Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Acked-by: Simon Horman <horms@verge.net.au>
|   Signed-off-by: David S. Miller <davem@davemloft.net>
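The idea behind the helper can be sketched in userspace C. This is a toy model, not the kernel code: every `toy_` type, field, and function below is a hypothetical stand-in, and the only behavior taken from the commit message is that a packet attached to a request socket must be resolved to a full socket before full-socket fields are read.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for kernel types; names and layout are illustrative only. */
enum toy_state { TOY_LISTEN, TOY_NEW_SYN_RECV, TOY_ESTABLISHED };

struct toy_sock {
	enum toy_state state;
	struct toy_sock *listener;	/* set only for request (mini) sockets */
};

struct toy_skb {
	struct toy_sock *sk;	/* NULL, a request socket, or a full socket */
};

/* Return the full socket behind skb->sk: a request socket maps to its listener. */
static struct toy_sock *toy_skb_to_full_sk(const struct toy_skb *skb)
{
	struct toy_sock *sk = skb->sk;

	if (sk && sk->state == TOY_NEW_SYN_RECV)
		sk = sk->listener;
	return sk;
}

/* Usage: a SYNACK attached to a request socket resolves to the listener. */
void toy_full_sk_demo(void)
{
	struct toy_sock listener = { TOY_LISTEN, NULL };
	struct toy_sock request = { TOY_NEW_SYN_RECV, &listener };
	struct toy_skb synack = { &request };
	struct toy_skb orphan = { NULL };

	assert(toy_skb_to_full_sk(&synack) == &listener);
	assert(toy_skb_to_full_sk(&orphan) == NULL);
}
```

Callers still have to handle a NULL result, since a packet may carry no socket at all.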
* Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next [Pablo Neira Ayuso, 2015-10-17, 1 file, -1/+1]
|\
| |   This merge resolves conflicts with 75aec9df3a78 ("bridge: Remove br_nf_push_frag_xmit_sk") as part of Eric Biederman's effort to improve netns support in the network stack that reached upstream via David's net-next tree.
| |
| |   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| |
| |   Conflicts: net/bridge/br_netfilter_hooks.c
| * ipv4: Pass struct net into ip_defrag and ip_check_defrag [Eric W. Biederman, 2015-10-12, 1 file, -1/+1]
| |   The function ip_defrag is called on both the input and the output paths of the networking stack. In particular, conntrack calls ip_defrag when it is tracking outbound packets from the local machine.
| |
| |   So add a struct net parameter and stop making ip_defrag guess which network namespace it needs to defragment packets in.
| |
| |   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| |   Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
* | netfilter: remove hook owner refcounting [Florian Westphal, 2015-10-16, 1 file, -12/+0]
| |   Since commit 8405a8fff3f8 ("netfilter: nf_qeueue: Drop queue entries on nf_unregister_hook") all pending queued entries are discarded.
| |
| |   So we can simply remove all of the owner handling -- when a module is removed it also needs to unregister all its hooks.
| |
| |   Signed-off-by: Florian Westphal <fw@strlen.de>
| |   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* | ipvs: Remove possibly unused variable from ip_vs_out [David Ahern, 2015-10-07, 1 file, -2/+1]
| |   Eric's net namespace changes in 1b75097dd7a26 leave net unreferenced if CONFIG_IP_VS_IPV6 is not enabled:
| |
| |   ../net/netfilter/ipvs/ip_vs_core.c: In function ‘ip_vs_out’:
| |   ../net/netfilter/ipvs/ip_vs_core.c:1177:14: warning: unused variable ‘net’ [-Wunused-variable]
| |
| |   After the net refactoring there is only 1 user; push the reference to the 1 user. While the line length slightly exceeds 80 it seems to be the best change.
| |
| |   Fixes: 1b75097dd7a26 ("ipvs: Pass ipvs into ip_vs_out")
| |   Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
| |   Acked-by: Julian Anastasov <ja@ssi.bg>
| |   [horms: updated subject]
| |   Signed-off-by: Simon Horman <horms@verge.net.au>
* | ipvs: Don't protect ip_vs_addr_is_unicast with CONFIG_SYSCTL [Eric W. Biederman, 2015-10-01, 1 file, -2/+0]
|/
|   I arranged the code so that the compiler can remove the unnecessary bits in ip_vs_leave when CONFIG_SYSCTL is unset, and removed an explicit CONFIG_SYSCTL.
|
|   Unfortunately when rebasing my work on top of that of Alex Gartrell I missed the fact that the newly added function ip_vs_addr_is_unicast was surrounded by CONFIG_SYSCTL.
|
|   So remove the now unnecessary CONFIG_SYSCTL guards around ip_vs_addr_is_unicast. They are causing build failures today when CONFIG_SYSCTL is not selected. Any self-respecting compiler will notice that sysctl_cache_bypass is always false without CONFIG_SYSCTL and will not include the logic from the function ip_vs_addr_is_unicast in the compiled code.
|
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipv6: Pass struct net into ip6_route_me_harder [Eric W. Biederman, 2015-09-29, 1 file, -1/+1]
|   Don't make ip6_route_me_harder guess which network namespace it is routing in, pass the network namespace in.
|
|   Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* ipv4: Pass struct net into ip_route_me_harder [Eric W. Biederman, 2015-09-29, 1 file, -1/+1]
|   Don't make ip_route_me_harder guess which network namespace it is routing in, pass the network namespace in.
|
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
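The difference between guessing a namespace and being handed one can be illustrated with a small userspace sketch. Everything prefixed `ns_` is hypothetical, and the guessing order shown (receiving device first, then the route's device) is an assumption made for illustration rather than a statement about any particular kernel helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct net, struct net_device and struct sk_buff. */
struct ns_net { int id; };
struct ns_dev { struct ns_net *net; };
struct ns_skb {
	struct ns_dev *dev;	/* receiving device, may be NULL on output */
	struct ns_dev *dst_dev;	/* device of the attached route, may be NULL */
};

/* Old style: infer the namespace from whatever the packet happens to carry. */
static struct ns_net *ns_guess_net(const struct ns_skb *skb)
{
	if (skb->dev)
		return skb->dev->net;
	if (skb->dst_dev)
		return skb->dst_dev->net;
	return NULL;	/* nothing to guess from */
}

/* New style: the caller states the namespace, so the helper never guesses. */
static int ns_route_me_harder(struct ns_net *net, const struct ns_skb *skb)
{
	(void)skb;
	return net->id;	/* stand-in for "route within this namespace" */
}

/* Usage: a packet with no device information defeats the guess, not the explicit call. */
void ns_demo(void)
{
	struct ns_net net = { 7 };
	struct ns_skb bare = { NULL, NULL };

	assert(ns_guess_net(&bare) == NULL);
	assert(ns_route_me_harder(&net, &bare) == 7);
}
```

The explicit parameter moves the decision to the caller, which in a netfilter hook already knows its namespace.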
* ipvs: Pass ipvs into ip_vs_gather_frags [Eric W. Biederman, 2015-09-24, 1 file, -4/+5]
|   This will be needed later when the network namespace guessing is removed from ip_defrag.
|
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Pass ipvs not net to ip_vs_protocol_net_(init|cleanup) [Eric W. Biederman, 2015-09-24, 1 file, -3/+3]
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Pass ipvs through ip_vs_route_me_harder into sysctl_snat_reroute [Eric W. Biederman, 2015-09-24, 1 file, -8/+7]
|   This removes the need to use the hack skb_net.
|
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Pass ipvs into ip_vs_out_icmp and ip_vs_out_icmp_v6 [Eric W. Biederman, 2015-09-24, 1 file, -8/+7]
|   This removes the need to compute ipvs with the hack "net_ipvs(skb_net(skb))".
|
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Pass ipvs into ip_vs_in_icmp and ip_vs_in_icmp_v6 [Eric W. Biederman, 2015-09-24, 1 file, -22/+14]
|   With ipvs passed into ip_vs_in_icmp and ip_vs_in_icmp_v6 they no longer need to call the hack that is skb_net. Additionally ip_vs_in_icmp no longer needs to call dev_net(skb->dev) and can use ipvs->net instead.
|
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Pass ipvs into ip_vs_in [Eric W. Biederman, 2015-09-24, 1 file, -9/+5]
|   Derive ipvs from state->net in the callers of ip_vs_in and pass it into ip_vs_in, removing the need to use the hack skb_net.
|
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Pass ipvs into ip_vs_out [Eric W. Biederman, 2015-09-24, 1 file, -9/+6]
|   Derive ipvs from state->net in the callers of ip_vs_out and pass it into ip_vs_out, removing the need to use the hack skb_net.
|
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Pass ipvs not net into sysctl_nat_icmp_send [Eric W. Biederman, 2015-09-24, 1 file, -4/+3]
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Simplify ipvs and net access in ip_vs_leave [Eric W. Biederman, 2015-09-24, 1 file, -6/+2]
|   Stop using the hack skb_net(skb) to compute the network namespace.
|
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Wrap sysctl_cache_bypass and remove ifdefs in ip_vs_leave [Eric W. Biederman, 2015-09-24, 1 file, -10/+3]
|   With sysctl_cache_bypass now a compile-time constant the compiler can figure out that it can eliminate all of the code that depends on sysctl_cache_bypass being true.
|
|   Also remove the duplicate computation of net previously necessitated by #ifdef CONFIG_SYSCTL.
|
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
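The wrapper pattern described above can be sketched as follows. This is a minimal userspace illustration under stated assumptions: `TOY_CONFIG_SYSCTL` stands in for CONFIG_SYSCTL, and `toy_ipvs`, `toy_cache_bypass` and `toy_leave` are hypothetical names, not the kernel's definitions. The point is that the #ifdef lives in one accessor, so call sites need no conditional compilation and the compiler can drop the dependent branch when the accessor always returns 0.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-namespace IPVS state. */
struct toy_ipvs { int sysctl_cache_bypass; };

/* Build with -DTOY_CONFIG_SYSCTL to model a kernel with sysctl support. */
#ifdef TOY_CONFIG_SYSCTL
static inline int toy_cache_bypass(const struct toy_ipvs *ipvs)
{
	return ipvs->sysctl_cache_bypass;	/* runtime tunable */
}
#else
static inline int toy_cache_bypass(const struct toy_ipvs *ipvs)
{
	(void)ipvs;
	return 0;	/* compile-time constant: feature is off */
}
#endif

/* Call site carries no #ifdef; returns 1 if the packet would bypass scheduling. */
static int toy_leave(const struct toy_ipvs *ipvs, int dest_is_unicast)
{
	if (toy_cache_bypass(ipvs) && dest_is_unicast)
		return 1;
	return 0;
}

/* Usage: with the tunable cleared, the bypass path is never taken. */
void toy_leave_demo(void)
{
	struct toy_ipvs off = { 0 };

	assert(toy_leave(&off, 1) == 0);
}
```

Without `TOY_CONFIG_SYSCTL`, the condition in `toy_leave` is constant false, so an optimizing compiler is free to discard the bypass branch entirely.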
* ipvs: Better derivation of ipvs in ip_vs_in_stats and ip_vs_out_stats [Eric W. Biederman, 2015-09-24, 1 file, -2/+2]
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Pass ipvs into .conn_schedule and ip_vs_try_to_schedule [Eric W. Biederman, 2015-09-24, 1 file, -5/+6]
|   This moves the hack "net_ipvs(skb_net(skb))" up one level where it will be easier to remove.
|
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Pass ipvs not net into ip_vs_conn_net_init and ip_vs_conn_net_cleanup [Eric W. Biederman, 2015-09-24, 1 file, -3/+3]
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Pass ipvs into conn_out_get [Eric W. Biederman, 2015-09-24, 1 file, -3/+5]
|   Move the hack of relying on "net_ipvs(skb_net(skb))" to derive the ipvs up a layer.
|
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Pass ipvs into .conn_in_get and ip_vs_conn_in_get_proto [Eric W. Biederman, 2015-09-24, 1 file, -4/+4]
|   Stop relying on "net_ipvs(skb_net(skb))" to derive the ipvs as skb_net is a hack.
|
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Pass ipvs not net into ip_vs_app_net_init and ip_vs_app_net_cleanup [Eric W. Biederman, 2015-09-24, 1 file, -3/+3]
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Pass ipvs not net to ip_vs_estimator_net_init and ip_vs_estimator_cleanup [Eric W. Biederman, 2015-09-24, 1 file, -3/+3]
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Pass ipvs not net into ip_vs_control_net_(init|cleanup) [Eric W. Biederman, 2015-09-24, 1 file, -3/+3]
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Pass ipvs not net to ip_vs_sync_net_cleanup [Eric W. Biederman, 2015-09-24, 1 file, -2/+3]
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Pass ipvs not net to ip_vs_sync_net_init [Eric W. Biederman, 2015-09-24, 1 file, -1/+1]
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Pass ipvs not net to ip_vs_sync_conn [Eric W. Biederman, 2015-09-24, 1 file, -1/+1]
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Pass ipvs not net to ip_vs_proto_data_get [Eric W. Biederman, 2015-09-24, 1 file, -4/+4]
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Cache ipvs in ip_vs_in_icmp and ip_vs_in_icmp_v6 [Eric W. Biederman, 2015-09-24, 1 file, -2/+6]
|   Store the value of net_ipvs in a variable named ipvs so that when there are more users of struct netns_ipvs in ip_vs_in_icmp and ip_vs_in_icmp_v6 they won't need to compute the value again.
|
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Pass ipvs not net to ip_vs_service_net_cleanup [Eric W. Biederman, 2015-09-24, 1 file, -2/+4]
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Pass ipvs not net to ip_vs_has_real_service [Eric W. Biederman, 2015-09-24, 1 file, -2/+4]
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Store ipvs not net in struct ip_vs_service [Eric W. Biederman, 2015-09-24, 1 file, -5/+5]
|   In practice struct netns_ipvs is as meaningful as struct net and more useful as it holds the ipvs specific data. So store a pointer to struct netns_ipvs.
|
|   Update the accesses of param->net to access param->ipvs->net instead.
|
|   In functions where we are searching for an svc and filtering by net, filter by ipvs instead.
|
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Pass ipvs not net to ip_vs_fill_conn [Eric W. Biederman, 2015-09-24, 1 file, -4/+4]
|   ipvs is what is actually desired, so change the parameter and modify the callers to pass struct netns_ipvs.
|
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Use state->net in the ipvs forward functions [Eric W. Biederman, 2015-09-24, 1 file, -6/+2]
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* netfilter: Pass priv instead of nf_hook_ops to netfilter hooks [Eric W. Biederman, 2015-09-18, 1 file, -12/+12]
|   Only pass the void *priv parameter out of the nf_hook_ops. That is all any of the functions are interested in now, and by limiting what is passed it becomes simpler to change implementation details.
|
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* ipvs: Read hooknum from state rather than ops->hooknum [Eric W. Biederman, 2015-09-18, 1 file, -8/+8]
|   This should be more cache efficient as state is more likely to be in core, and the netfilter core will stop passing in ops soon.
|
|   Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* ipvs: add sysctl to ignore tunneled packets [Alex Gartrell, 2015-09-17, 1 file, -1/+9]
|   This is a way to avoid nasty routing loops when multiple ipvs instances can forward to each other.
|
|   Signed-off-by: Alex Gartrell <agartrell@fb.com>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: attempt to schedule icmp packets [Alex Gartrell, 2015-09-01, 1 file, -9/+36]
|   Invoke the try_to_schedule logic from the icmp path and update it to the appropriate ip_vs_conn_put function. The schedule functions have been updated to reject the packets immediately for now.
|
|   Signed-off-by: Alex Gartrell <agartrell@fb.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Make ip_vs_schedule aware of inverse iph'es [Alex Gartrell, 2015-09-01, 1 file, -14/+36]
|   This is necessary to schedule icmp later.
|
|   Signed-off-by: Alex Gartrell <agartrell@fb.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: drop inverse argument to conn_{in,out}_get [Alex Gartrell, 2015-09-01, 1 file, -13/+18]
|   No longer necessary since the information is included in the ip_vs_iphdr itself.
|
|   Signed-off-by: Alex Gartrell <agartrell@fb.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: pull out ip_vs_try_to_schedule function [Alex Gartrell, 2015-09-01, 1 file, -21/+39]
|   This is necessary as we'll be trying to schedule icmp later and we'll want to share this code.
|
|   Signed-off-by: Alex Gartrell <agartrell@fb.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Handle inverse and icmp headers in ip_vs_leave [Alex Gartrell, 2015-09-01, 1 file, -12/+21]
|   Signed-off-by: Alex Gartrell <agartrell@fb.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: Add hdr_flags to iphdr [Alex Gartrell, 2015-09-01, 1 file, -10/+10]
|   These flags contain information like whether or not the addresses are inverted or from icmp. The first will allow us to drop an inverse param all over the place, and the second will later be useful in scheduling icmp.
|
|   Signed-off-by: Alex Gartrell <agartrell@fb.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
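Carrying the "inverse" and "icmp" facts inside the parsed header, rather than as an extra function argument, can be sketched like this. The `TOY_HDR_*` flag values, `struct toy_iphdr`, and the helper names are hypothetical and chosen for illustration; only the idea of a `hdr_flags` field recording inversion and ICMP origin comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits stored alongside the parsed addresses. */
#define TOY_HDR_INVERSE 0x1	/* addresses are swapped relative to the flow */
#define TOY_HDR_ICMP    0x2	/* header was extracted from an ICMP payload */

struct toy_iphdr {
	int hdr_flags;
	uint32_t saddr;
	uint32_t daddr;
};

static int toy_iph_inverse(const struct toy_iphdr *iph)
{
	return !!(iph->hdr_flags & TOY_HDR_INVERSE);
}

static int toy_iph_icmp(const struct toy_iphdr *iph)
{
	return !!(iph->hdr_flags & TOY_HDR_ICMP);
}

/* Source address of the flow for a lookup; no separate "inverse" argument needed. */
static uint32_t toy_lookup_src(const struct toy_iphdr *iph)
{
	return toy_iph_inverse(iph) ? iph->daddr : iph->saddr;
}

/* Usage: a header embedded in an ICMP error describes the flow back to front. */
void toy_hdr_flags_demo(void)
{
	struct toy_iphdr plain = { 0, 10, 20 };
	struct toy_iphdr embedded = { TOY_HDR_INVERSE | TOY_HDR_ICMP, 20, 10 };

	assert(toy_lookup_src(&plain) == 10);
	assert(toy_lookup_src(&embedded) == 10);
	assert(toy_iph_icmp(&embedded) && !toy_iph_icmp(&plain));
}
```

Both headers resolve to the same flow source, which is what lets lookup functions drop their explicit inverse parameter.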
* ipvs: replace ip_vs_fill_ip4hdr with ip_vs_fill_iph_skb_off [Alex Gartrell, 2015-09-01, 1 file, -44/+26]
|   This removes some duplicated code and makes the ICMPv6 path look more like the ICMP path.
|
|   Signed-off-by: Alex Gartrell <agartrell@fb.com>
|   Acked-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* ipvs: fix crash if scheduler is changed [Julian Anastasov, 2015-07-14, 1 file, -2/+14]
|   I overlooked the svc->sched_data usage from schedulers when the services were converted to RCU in 3.10. Now the rare ipvsadm -E command can change the scheduler but due to the reverse order of ip_vs_bind_scheduler and ip_vs_unbind_scheduler we provide new sched_data to the old scheduler resulting in a crash.
|
|   To fix it without changing the scheduler methods we have to use synchronize_rcu() only for the editing case. It means all svc->scheduler readers should expect a NULL value. To avoid breakage for the service listing and ipvsadm -R we can use the "none" name to indicate that scheduler is not assigned, a state when we drop new connections.
|
|   Reported-by: Alexander Vasiliev <a.vasylev@404-group.com>
|   Fixes: ceec4c381681 ("ipvs: convert services to rcu")
|   Signed-off-by: Julian Anastasov <ja@ssi.bg>
|   Signed-off-by: Simon Horman <horms@verge.net.au>
* netfilter: Make nf_hookfn use nf_hook_state. [David S. Miller, 2015-04-04, 1 file, -22/+10]
|   Pass the nf_hook_state all the way down into the hook functions themselves.
|
|   Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next [David S. Miller, 2015-03-02, 1 file, -19/+50]
|\
| |   Pablo Neira Ayuso says:
| |
| |   ====================
| |   Netfilter updates for net-next
| |
| |   A small batch with accumulated updates in nf-next, mostly IPVS updates, they are:
| |
| |   1) Add 64-bits stats counters to IPVS, from Julian Anastasov.
| |
| |   2) Move NETFILTER_XT_MATCH_ADDRTYPE out of NETFILTER_ADVANCED as docker seem to require this, from Anton Blanchard.
| |
| |   3) Use boolean instead of numeric value in set_match_v*(), from coccinelle via Fengguang Wu.
| |
| |   4) Allows rescheduling of new connections in IPVS when port reuse is detected, from Marcelo Ricardo Leitner.
| |
| |   5) Add missing bits to support arptables extensions from nft_compat, from Arturo Borrero.
| |
| |   Patrick is preparing a large batch to enhance the set infrastructure, named expressions among other things, that should follow up soon after this batch.
| |   ====================
| |
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * ipvs: allow rescheduling of new connections when port reuse is detected [Marcelo Ricardo Leitner, 2015-02-25, 1 file, -4/+29]
| |   Currently, when TCP/SCTP port reusing happens, IPVS will find the old entry and use it for the new one, behaving like a forced persistence. But if you consider a cluster with a heavy load of small connections, such reuse will happen often and may lead to a not optimal load balancing and might prevent a new node from getting a fair load.
| |
| |   This patch introduces a new sysctl, conn_reuse_mode, that allows controlling how to proceed when port reuse is detected. The default value will allow rescheduling of new connections only if the old entry was in TIME_WAIT state for TCP or CLOSED for SCTP.
| |
| |   Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
| |   Signed-off-by: Julian Anastasov <ja@ssi.bg>
| |   Signed-off-by: Simon Horman <horms@verge.net.au>
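The default policy described in that message can be written down as a small decision function. This is a sketch under stated assumptions: the enums and `reuse_default_may_reschedule` are hypothetical names, the function models only the default behavior quoted above, and it says nothing about other values of conn_reuse_mode.

```c
#include <assert.h>

/* Hypothetical protocol and connection-state labels for illustration. */
enum reuse_proto { REUSE_TCP, REUSE_SCTP };
enum reuse_state { REUSE_ESTABLISHED, REUSE_TIME_WAIT, REUSE_CLOSED };

/*
 * Default policy as described in the commit message: a new connection that
 * reuses a port may be rescheduled only if the old entry is in TIME_WAIT
 * (TCP) or CLOSED (SCTP). Otherwise the old entry keeps being used.
 */
static int reuse_default_may_reschedule(enum reuse_proto proto,
					enum reuse_state old_state,
					int is_new_connection)
{
	if (!is_new_connection)
		return 0;
	if (proto == REUSE_TCP)
		return old_state == REUSE_TIME_WAIT;
	return old_state == REUSE_CLOSED;
}

/* Usage: only a finished old entry lets the new connection pick a fresh server. */
void reuse_demo(void)
{
	assert(reuse_default_may_reschedule(REUSE_TCP, REUSE_TIME_WAIT, 1) == 1);
	assert(reuse_default_may_reschedule(REUSE_TCP, REUSE_ESTABLISHED, 1) == 0);
	assert(reuse_default_may_reschedule(REUSE_SCTP, REUSE_CLOSED, 1) == 1);
}
```

Keeping the check this narrow preserves the old "forced persistence" for live connections while letting short-lived ones spread across servers.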