summaryrefslogtreecommitdiffstats
path: root/net/bridge/br_netfilter.c
Commit message (Collapse)AuthorAgeFilesLines
* netfilter: bridge: use rcu hook to resolve br_netfilter dependencyPablo Neira Ayuso2015-03-101-2/+7
| | | | | | | | | | | | | | | | | e5de75b ("netfilter: bridge: move DNAT helper to br_netfilter") results in the following link problem: net/bridge/br_device.c:29: undefined reference to `br_nf_prerouting_finish_bridge` Moreover it creates a hard dependency between br_netfilter and the bridge core, which is what we've been trying to avoid so far. Resolve this problem by using a hook structure so we reduce #ifdef pollution and keep bridge netfilter specific code under br_netfilter.c which was the original intention. Reported-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netfilter: bridge: move DNAT helper to br_netfilterPablo Neira Ayuso2015-03-091-0/+32
| | | | | | | | | Only one caller, there is no need to keep this in a header. Move it to br_netfilter.c where this belongs to. Based on patch from Florian Westphal. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netfilter: bridge: refactor conditional in br_nf_dev_queue_xmitFlorian Westphal2015-03-091-3/+6
| | | | | | | simpilifies followup patch that re-works brnf ip_fragment handling. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netfilter: bridge: move nf_bridge_update_protocol to where its usedFlorian Westphal2015-03-091-0/+8
| | | | | | | no need to keep it in a header file. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* bridge: move mac header copying into br_netfilterFlorian Westphal2015-03-091-1/+28
| | | | | | | | | The mac header only has to be copied back into the skb for fragments generated by ip_fragment(), which only happens for bridge forwarded packets with nf-call-iptables=1 && active nf_defrag. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* bridge: netfilter: Move sysctl-specific error code inside #ifdefGeert Uytterhoeven2015-02-121-5/+2
| | | | | | | | | | | | | If CONFIG_SYSCTL=n: net/bridge/br_netfilter.c: In function ‘br_netfilter_init’: net/bridge/br_netfilter.c:996: warning: label ‘err1’ defined but not used Move the label and the code after it inside the existing #ifdef to get rid of the warning. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: rename vlan_tx_* helpers since "tx" is misleading thereJiri Pirko2015-01-131-6/+6
| | | | | | | The same macros are used for rx as well. So rename it. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller2014-11-241-0/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pablo Neira Ayuso says: ==================== netfilter/ipvs updates for net-next The following patchset contains Netfilter updates for your net-next tree, this includes the NAT redirection support for nf_tables, the cgroup support for nft meta and conntrack zone support for the connlimit match. Coming after those, a bunch of sparse warning fixes, missing netns bits and cleanups. More specifically, they are: 1) Prepare IPv4 and IPv6 NAT redirect code to use it from nf_tables, patches from Arturo Borrero. 2) Introduce the nf_tables redir expression, from Arturo Borrero. 3) Remove an unnecessary assignment in ip_vs_xmit/__ip_vs_get_out_rt(). Patch from Alex Gartrell. 4) Add nft_log_dereference() macro to the nf_log infrastructure, patch from Marcelo Leitner. 5) Add some extra validation when registering logger families, also from Marcelo. 6) Some spelling cleanups from stephen hemminger. 7) Fix sparse warning in nf_logger_find_get(). 8) Add cgroup support to nf_tables meta, patch from Ana Rey. 9) A Kconfig fix for the new redir expression and fix sparse warnings in the new redir expression. 10) Fix several sparse warnings in the netfilter tree, from Florian Westphal. 11) Reduce verbosity when OOM in nfnetlink_log. User can basically do nothing when this situation occurs. 12) Add conntrack zone support to xt_connlimit, again from Florian. 13) Add netnamespace support to the h323 conntrack helper, contributed by Vasily Averin. 14) Remove unnecessary nul-pointer checks before free_percpu() and module_put(), from Markus Elfring. 15) Use pr_fmt in nfnetlink_log, again patch from Marcelo Leitner. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * netfilter: fix various sparse warningsFlorian Westphal2014-11-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | net/bridge/br_netfilter.c:870:6: symbol 'br_netfilter_enable' was not declared. Should it be static? no; add include net/ipv4/netfilter/nft_reject_ipv4.c:22:6: symbol 'nft_reject_ipv4_eval' was not declared. Should it be static? yes net/ipv6/netfilter/nf_reject_ipv6.c:16:6: symbol 'nf_send_reset6' was not declared. Should it be static? no; add include net/ipv6/netfilter/nft_reject_ipv6.c:22:6: symbol 'nft_reject_ipv6_eval' was not declared. Should it be static? yes net/netfilter/core.c:33:32: symbol 'nf_ipv6_ops' was not declared. Should it be static? no; add include net/netfilter/xt_DSCP.c:40:57: cast truncates bits from constant value (ffffff03 becomes 3) net/netfilter/xt_DSCP.c:57:59: cast truncates bits from constant value (ffffff03 becomes 3) add __force, 3 is what we want. net/ipv4/netfilter/nf_log_arp.c:77:6: symbol 'nf_log_arp_packet' was not declared. Should it be static? yes net/ipv4/netfilter/nf_reject_ipv4.c:17:6: symbol 'nf_send_reset' was not declared. Should it be static? no; add include Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* | bridge: Do not compile options in br_parse_ip_optionsHerbert Xu2014-10-241-19/+5
|/ | | | | | | | | | | | | | | | | | | | | | Commit 462fb2af9788a82a534f8184abfde31574e1cfa0 bridge : Sanitize skb before it enters the IP stack broke when IP options are actually used because it mangles the skb as if it entered the IP stack which is wrong because the bridge is supposed to operate below the IP stack. Since nobody has actually requested for parsing of IP options this patch fixes it by simply reverting to the previous approach of ignoring all IP options, i.e., zeroing the IPCB. If and when somebody who uses IP options and actually needs them to be parsed by the bridge complains then we can revisit this. Reported-by: David Newall <davidn@davidnewall.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-10-081-0/+11
|\
| * bridge: Save frag_max_size between PRE_ROUTING and POST_ROUTINGHerbert Xu2014-10-071-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As we may defragment the packet in IPv4 PRE_ROUTING and refragment it after POST_ROUTING we should save the value of frag_max_size. This is still very wrong as the bridge is supposed to leave the packets intact, meaning that the right thing to do is to use the original frag_list for fragmentation. Unfortunately we don't currently guarantee that the frag_list is left untouched throughout netfilter so until this changes this is the best we can do. There is also a spot in FORWARD where it appears that we can forward a packet without going through fragmentation, mark it so that we can fix it later. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | netfilter: explicit module dependency between br_netfilter and physdevPablo Neira Ayuso2014-10-021-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | You can use physdev to match the physical interface enslaved to the bridge device. This information is stored in skb->nf_bridge and it is set up by br_netfilter. So, this is only available when iptables is used from the bridge netfilter path. Since 34666d4 ("netfilter: bridge: move br_netfilter out of the core"), the br_netfilter code is modular. To reduce the impact of this change, we can autoload the br_netfilter if the physdev match is used since we assume that the users need br_netfilter in place. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* | netfilter: bridge: move br_netfilter out of the corePablo Neira Ayuso2014-09-261-72/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jesper reported that br_netfilter always registers the hooks since this is part of the bridge core. This harms performance for people that don't need this. This patch modularizes br_netfilter so it can be rmmod'ed, thus, the hooks can be unregistered. I think the bridge netfilter should have been a separated module since the beginning, Patrick agreed on that. Note that this is breaking compatibility for users that expect that bridge netfilter is going to be available after explicitly 'modprobe bridge' or via automatic load through brctl. However, the damage can be easily undone by modprobing br_netfilter. The bridge core also spots a message to provide a clue to people that didn't notice that this has been deprecated. On top of that, the plan is that nftables will not rely on this software layer, but integrate the connection tracking into the bridge layer to enable stateful filtering and NAT, which is was bridge netfilter users seem to require. This patch still keeps the fake_dst_ops in the bridge core, since this is required by when the bridge port is initialized. So we can safely modprobe/rmmod br_netfilter anytime. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Florian Westphal <fw@strlen.de>
* | netfilter: bridge: nf_bridge_copy_header as static inline in headerPablo Neira Ayuso2014-09-261-28/+0
|/ | | | | | | | Move nf_bridge_copy_header() as static inline in netfilter_bridge.h header file. This patch prepares the modularization of the br_netfilter code. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* vlan: rename __vlan_find_dev_deep() to __vlan_find_dev_deep_rcu()dingtianhong2014-05-121-1/+1
| | | | | | | | | The __vlan_find_dev_deep should always called in RCU, according David's suggestion, rename to __vlan_find_dev_deep_rcu looks more reasonable. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: superfluous skb->nfct check in br_nf_dev_queue_xmitVasily Averin2014-05-051-2/+2
| | | | | | | | | | | | | | | | | | | | | Currently bridge can silently drop ipv4 fragments. If node have loaded nf_defrag_ipv4 module but have no nf_conntrack_ipv4, br_nf_pre_routing defragments incoming ipv4 fragments but nfct check in br_nf_dev_queue_xmit does not allow re-fragment combined packet back, and therefore it is dropped in br_dev_queue_push_xmit without incrementing of any failcounters It seems the only way to hit the ip_fragment code in the bridge xmit path is to have a fragment list whose reassembled fragments go over the mtu. This only happens if nf_defrag is enabled. Thanks to Florian Westphal for providing feedback to clarify this. Defragmentation ipv4 is required not only in conntracks but at least in TPROXY target and socket match, therefore #ifdef is changed from NF_CONNTRACK_IPV4 to NF_DEFRAG_IPV4 Signed-off-by: Vasily Averin <vvs@openvz.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* inet: remove now unused flag DST_NOPEERHannes Frederic Sowa2014-03-061-1/+1
| | | | | | | | | | | | | | | Commit e688a604807647 ("net: introduce DST_NOPEER dst flag") introduced DST_NOPEER because because of crashes in ipv6_select_ident called from udp6_ufo_fragment. Since commit 916e4cf46d0204 ("ipv6: reuse ip6_frag_id from ip6_ufo_append_data") we don't call ipv6_select_ident any more from ip6_ufo_append_data, thus this flag lost its purpose and can be removed. Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: netfilter: Use ether_addr_copyJoe Perches2014-02-241-1/+1
| | | | | | | | Convert the uses of memcpy to ether_addr_copy because for some architectures it is smaller and faster. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: change "foo* bar" to "foo *bar"tanxiaojun2013-12-191-1/+1
| | | | | | | "foo * bar" should be "foo *bar". Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2013-11-041-0/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== This is another batch containing Netfilter/IPVS updates for your net-next tree, they are: * Six patches to make the ipt_CLUSTERIP target support netnamespace, from Gao feng. * Two cleanups for the nf_conntrack_acct infrastructure, introducing a new structure to encapsulate conntrack counters, from Holger Eitzenberger. * Fix missing verdict in SCTP support for IPVS, from Daniel Borkmann. * Skip checksum recalculation in SCTP support for IPVS, also from Daniel Borkmann. * Fix behavioural change in xt_socket after IP early demux, from Florian Westphal. * Fix bogus large memory allocation in the bitmap port set type in ipset, from Jozsef Kadlecsik. * Fix possible compilation issues in the hash netnet set type in ipset, also from Jozsef Kadlecsik. * Define constants to identify netlink callback data in ipset dumps, again from Jozsef Kadlecsik. * Use sock_gen_put() in xt_socket to replace xt_socket_put_sk, from Eric Dumazet. * Improvements for the SH scheduler in IPVS, from Alexander Frolkin. * Remove extra delay due to unneeded rcu barrier in IPVS net namespace cleanup path, from Julian Anastasov. * Save some cycles in ip6t_REJECT by skipping checksum validation in packets leaving from our stack, from Stanislav Fomichev. * Fix IPVS_CMD_ATTR_MAX definition in IPVS, larger that required, from Julian Anastasov. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * bridge: netfilter: orphan skb before invoking ip netfilter hooksFlorian Westphal2013-10-271-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pekka Pietikäinen reports xt_socket behavioural change after commit 00028aa37098o (netfilter: xt_socket: use IP early demux). Reason is xt_socket now no longer does an unconditional sk lookup - it re-uses existing skb->sk if possible, assuming ->sk was set by ip early demux. However, when netfilter is invoked via bridge, this can cause 'bogus' sockets to be examined by the match, e.g. a 'tun' device socket. bridge netfilter should orphan the skb just like the routing path before invoking ipv4/ipv6 netfilter hooks to avoid this. Reported-and-tested-by: Pekka Pietikäinen <pp@ee.oulu.fi> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* | netfilter: pass hook ops to hookfnPatrick McHardy2013-10-141-8/+14
|/ | | | | | | | Pass the hook ops to the hookfn to allow for generic hook functions. This change is required by nf_tables. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* net: Convert uses of typedef ctl_table to struct ctl_tableJoe Perches2013-06-131-2/+2
| | | | | | | | | | | | | | | | Reduce the uses of this unnecessary typedef. Done via perl script: $ git grep --name-only -w ctl_table net | \ xargs perl -p -i -e '\ sub trim { my ($local) = @_; $local =~ s/(^\s+|\s+$)//g; return $local; } \ s/\b(?<!struct\s)ctl_table\b(\s*\*\s*|\s+\w+)/"struct ctl_table " . trim($1)/ge' Reflow the modified lines that now exceed 80 columns. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: vlan: add protocol argument to packet tagging functionsPatrick McHardy2013-04-191-1/+1
| | | | | | | | | | Add a protocol argument to the VLAN packet tagging functions. In case of HW tagging, we need that protocol available in the ndo_start_xmit functions, so it is stored in a new field in the skb. The new field fits into a hole (on 64 bit) and doesn't increase the sks's size. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: vlan: prepare for 802.1ad supportPatrick McHardy2013-04-191-1/+2
| | | | | | | | Make the encapsulation protocol value a property of VLAN devices and change the device lookup functions to take the protocol value into account. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: Pull ip header into skb->data before looking into ip header.Sarveshwar Bandi2012-10-101-0/+3
| | | | | | | | | If lower layer driver leaves the ip header in the skb fragment, it needs to be first pulled into skb->data before inspecting ip header length or ip version number. Signed-off-by: Sarveshwar Bandi <sarveshwar.bandi@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Pass optional SKB and SK arguments to dst_ops->{update_pmtu,redirect}()David S. Miller2012-07-171-2/+4
| | | | | | | | | | | | | | | | This will be used so that we can compose a full flow key. Even though we have a route in this context, we need more. In the future the routes will be without destination address, source address, etc. keying. One ipv4 route will cover entire subnets, etc. In this environment we have to have a way to possess persistent storage for redirects and PMTU information. This persistent storage will exist in the FIB tables, and that's why we'll need to be able to rebuild a full lookup flow key here. Using that flow key will do a fib_lookup() and create/update the persistent entry. Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Add dummy dst_ops->redirect method where needed.David S. Miller2012-07-121-0/+5
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* br_netfilter: Convert to dst_neigh_lookup_skb().David S. Miller2012-07-051-13/+23
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Add optional SKB arg to dst_ops->neigh_lookup().David S. Miller2012-07-051-1/+3
| | | | | | | Causes the handler to use the daddr in the ipv4/ipv6 header when the route gateway is unspecified (local subnet). Signed-off-by: David S. Miller <davem@davemloft.net>
* netfilter: bridge: switch hook PFs to nfprotoAlban Crequy2012-06-071-14/+14
| | | | | | | | | | This patch is a cleanup. Use NFPROTO_* for consistency with other netfilter code. Signed-off-by: Alban Crequy <alban.crequy@collabora.co.uk> Reviewed-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk> Reviewed-by: Vincent Sanders <vincent.sanders@collabora.co.uk> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* ipv6: correct the ipv6 option name - Pad0 to Pad1Eldad Zack2012-05-171-1/+1
| | | | | | | | | | The padding destination or hop-by-hop option is called Pad1 and not Pad0. See RFC2460 (4.2) or the IANA ipv6-parameters registry: http://www.iana.org/assignments/ipv6-parameters/ipv6-parameters.xml Signed-off-by: Eldad Zack <eldad@fogrefinery.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netfilter: bridge: optionally set indev to vlanPablo Neira Ayuso2012-05-081-2/+24
| | | | | | | | | | | | | | | | | | | | if net.bridge.bridge-nf-filter-vlan-tagged sysctl is enabled, bridge netfilter removes the vlan header temporarily and then feeds the packet to ip(6)tables. When the new "bridge-nf-pass-vlan-input-device" sysctl is on (default off), then bridge netfilter will also set the in-interface to the vlan interface; if such an interface exists. This is needed to make iptables REDIRECT target work with "vlan-on-top-of-bridge" setups and to allow use of "iptables -i" to match the vlan device name. Also update Documentation with current brnf default settings. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Bart De Schuymer <bdschuym@pandora.be> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2012-05-071-6/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/ethernet/intel/e1000e/param.c drivers/net/wireless/iwlwifi/iwl-agn-rx.c drivers/net/wireless/iwlwifi/iwl-trans-pcie-rx.c drivers/net/wireless/iwlwifi/iwl-trans.h Resolved the iwlwifi conflict with mainline using 3-way diff posted by John Linville and Stephen Rothwell. In 'net' we added a bug fix to make iwlwifi report a more accurate skb->truesize but this conflicted with RX path changes that happened meanwhile in net-next. In e1000e a conflict arose in the validation code for settings of adapter->itr. 'net-next' had more sophisticated logic so that logic was used. Signed-off-by: David S. Miller <davem@davemloft.net>
| * set fake_rtable's dst to NULL to avoid kernel OopsPeter Huang (Peng)2012-04-241-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | bridge: set fake_rtable's dst to NULL to avoid kernel Oops when bridge is deleted before tap/vif device's delete, kernel may encounter an oops because of NULL reference to fake_rtable's dst. Set fake_rtable's dst to NULL before sending packets out can solve this problem. v4 reformat, change br_drop_fake_rtable(skb) to {} v3 enrich commit header v2 introducing new flag DST_FAKE_RTABLE to dst_entry struct. [ Use "do { } while (0)" for nop br_drop_fake_rtable() implementation -DaveM ] Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Peter Huang <peter.huangpeng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: Convert all sysctl registrations to register_net_sysctlEric W. Biederman2012-04-201-7/+1
| | | | | | | | | | | | | | | | | | | | | | | | This results in code with less boiler plate that is a bit easier to read. Additionally stops us from using compatibility code in the sysctl core, hastening the day when the compatibility code can be removed. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: Move all of the network sysctls without a namespace into init_net.Eric W. Biederman2012-04-201-2/+2
|/ | | | | | | | | | | | | | | | This makes it clearer which sysctls are relative to your current network namespace. This makes it a little less error prone by not exposing sysctls for the initial network namespace in other namespaces. This is the same way we handle all of our other network interfaces to userspace and I can't honestly remember why we didn't do this for sysctls right from the start. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: netfilter: don't call iptables on vlan packets if sysctl is offFlorian Westphal2012-03-061-14/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When net.bridge.bridge-nf-filter-vlan-tagged is 0 (default), vlan packets arriving should not be sent to ip(6)tables by bridge netfilter. However, it turns out that we currently always send VLAN packets to netfilter, if .. a), CONFIG_VLAN_8021Q is enabled ; or b), CONFIG_VLAN_8021Q is not set but rx vlan offload is enabled on the bridge port. This is because bridge netfilter treats skb with skb->protocol == ETH_P_IP{V6} as "non-vlan packet". With rx vlan offload on or CONFIG_VLAN_8021Q=y, the vlan header has already been removed here, and we cannot rely on skb->protocol alone. Fix this by only using skb->protocol if the skb has no vlan tag, or if a vlan tag is present and filter-vlan-tagged bridge netfilter sysctl is enabled. We cannot remove the skb->protocol == htons(ETH_P_8021Q) test because the vlan tag is still around in the CONFIG_VLAN_8021Q=n && "ethtool -K $itf rxvlan off" case. reproducer: iptables -t raw -I PREROUTING -i br0 iptables -t raw -I PREROUTING -i br0.1 Then send packets to an ip address configured on br0.1 interface. Even with net.bridge.bridge-nf-filter-vlan-tagged=0, the 1st rule will match instead of the 2nd one. With this patch applied, the 2nd rule will match instead. In the non-local address case, netfilter won't be consulted after this patch unless the sysctl is switched on. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2011-12-231-1/+7
|\ | | | | | | | | | | | | | | | | | | Conflicts: net/bluetooth/l2cap_core.c Just two overlapping changes, one added an initialization of a local variable, and another change added a new local variable. Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: introduce DST_NOPEER dst flagEric Dumazet2011-12-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Chris Boot reported crashes occurring in ipv6_select_ident(). [ 461.457562] RIP: 0010:[<ffffffff812dde61>] [<ffffffff812dde61>] ipv6_select_ident+0x31/0xa7 [ 461.578229] Call Trace: [ 461.580742] <IRQ> [ 461.582870] [<ffffffff812efa7f>] ? udp6_ufo_fragment+0x124/0x1a2 [ 461.589054] [<ffffffff812dbfe0>] ? ipv6_gso_segment+0xc0/0x155 [ 461.595140] [<ffffffff812700c6>] ? skb_gso_segment+0x208/0x28b [ 461.601198] [<ffffffffa03f236b>] ? ipv6_confirm+0x146/0x15e [nf_conntrack_ipv6] [ 461.608786] [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77 [ 461.614227] [<ffffffff81271d64>] ? dev_hard_start_xmit+0x357/0x543 [ 461.620659] [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111 [ 461.626440] [<ffffffffa0379745>] ? br_parse_ip_options+0x19a/0x19a [bridge] [ 461.633581] [<ffffffff812722ff>] ? dev_queue_xmit+0x3af/0x459 [ 461.639577] [<ffffffffa03747d2>] ? br_dev_queue_push_xmit+0x72/0x76 [bridge] [ 461.646887] [<ffffffffa03791e3>] ? br_nf_post_routing+0x17d/0x18f [bridge] [ 461.653997] [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77 [ 461.659473] [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge] [ 461.665485] [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111 [ 461.671234] [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge] [ 461.677299] [<ffffffffa0379215>] ? nf_bridge_update_protocol+0x20/0x20 [bridge] [ 461.684891] [<ffffffffa03bb0e5>] ? nf_ct_zone+0xa/0x17 [nf_conntrack] [ 461.691520] [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge] [ 461.697572] [<ffffffffa0374812>] ? NF_HOOK.constprop.8+0x3c/0x56 [bridge] [ 461.704616] [<ffffffffa0379031>] ? nf_bridge_push_encap_header+0x1c/0x26 [bridge] [ 461.712329] [<ffffffffa037929f>] ? br_nf_forward_finish+0x8a/0x95 [bridge] [ 461.719490] [<ffffffffa037900a>] ? nf_bridge_pull_encap_header+0x1c/0x27 [bridge] [ 461.727223] [<ffffffffa0379974>] ? br_nf_forward_ip+0x1c0/0x1d4 [bridge] [ 461.734292] [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77 [ 461.739758] [<ffffffffa03748cc>] ? __br_deliver+0xa0/0xa0 [bridge] [ 461.746203] [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111 [ 461.751950] [<ffffffffa03748cc>] ? __br_deliver+0xa0/0xa0 [bridge] [ 461.758378] [<ffffffffa037533a>] ? NF_HOOK.constprop.4+0x56/0x56 [bridge] This is caused by bridge netfilter special dst_entry (fake_rtable), a special shared entry, where attaching an inetpeer makes no sense. Problem is present since commit 87c48fa3b46 (ipv6: make fragment identifications less predictable) Introduce DST_NOPEER dst flag and make sure ipv6_select_ident() and __ip_select_ident() fallback to the 'no peer attached' handling. Reported-by: Chris Boot <bootc@bootc.net> Tested-by: Chris Boot <bootc@bootc.net> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bridge: provide a mtu() method for fake_dst_opsEric Dumazet2011-12-221-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | Commit 618f9bc74a039da76 (net: Move mtu handling down to the protocol depended handlers) forgot the bridge netfilter case, adding a NULL dereference in ip_fragment(). Reported-by: Chris Boot <bootc@bootc.net> CC: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net:bridge: use IS_ENABLEDIgor Maravić2011-12-161-1/+1
| | | | | | | | | | | | | | | | Use IS_ENABLED(CONFIG_FOO) instead of defined(CONFIG_FOO) || defined (CONFIG_FOO_MODULE) Signed-off-by: Igor Maravić <igorm@etf.rs> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: Rename dst_get_neighbour{, _raw} to dst_get_neighbour_noref{, _raw}.David Miller2011-12-051-1/+1
|/ | | | | | | | To reflect the fact that a refrence is not obtained to the resulting neighbour entry. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Roland Dreier <roland@purestorage.com>
* net: Add ->neigh_lookup() operation to dst_opsDavid S. Miller2011-07-181-0/+6
| | | | | | | | In the future dst entries will be neigh-less. In that environment we need to have an easy transition point for current users of dst->neighbour outside of the packet output fast path. Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Abstract dst->neighbour accesses behind helpers.David S. Miller2011-07-171-1/+1
| | | | | | dst_{get,set}_neighbour() Signed-off-by: David S. Miller <davem@davemloft.net>
* neigh: Pass neighbour entry to output ops.David S. Miller2011-07-171-2/+2
| | | | | | | | | | This will get us closer to being able to do "neigh stuff" completely independent of the underlying dst_entry for protocols (ipv4/ipv6) that wish to do so. We will also be able to make dst entries neigh-less. Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Embed hh_cache inside of struct neighbour.David S. Miller2011-07-141-2/+4
| | | | | | | | | | | | | | | Now that there is a one-to-one correspondance between neighbour and hh_cache entries, we no longer need: 1) dynamic allocation 2) attachment to dst->hh 3) refcounting Initialization of the hh_cache entry is indicated by hh_len being non-zero, and such initialization is always done with the neighbour's lock held as a writer. Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: provide a cow_metrics method for fake_opsAlexander Holler2011-06-071-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Like in commit 0972ddb237 (provide cow_metrics() methods to blackhole dst_ops), we must provide a cow_metrics for bridges fake_dst_ops as well. This fixes a regression coming from commits 62fa8a846d7d (net: Implement read-only protection and COW'ing of metrics.) and 33eb9873a28 (bridge: initialize fake_rtable metrics) ip link set mybridge mtu 1234 --> [ 136.546243] Pid: 8415, comm: ip Tainted: P 2.6.39.1-00006-g40545b7 #103 ASUSTeK Computer Inc. V1Sn /V1Sn [ 136.546256] EIP: 0060:[<00000000>] EFLAGS: 00010202 CPU: 0 [ 136.546268] EIP is at 0x0 [ 136.546273] EAX: f14a389c EBX: 000005d4 ECX: f80d32c0 EDX: f80d1da1 [ 136.546279] ESI: f14a3000 EDI: f255bf10 EBP: f15c3b54 ESP: f15c3b48 [ 136.546285] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 [ 136.546293] Process ip (pid: 8415, ti=f15c2000 task=f4741f80 task.ti=f15c2000) [ 136.546297] Stack: [ 136.546301] f80c658f f14a3000 ffffffed f15c3b64 c12cb9c8 f80d1b80 ffffffa1 f15c3bbc [ 136.546315] c12da347 c12d9c7d 00000000 f7670b00 00000000 f80d1b80 ffffffa6 f15c3be4 [ 136.546329] 00000004 f14a3000 f255bf20 00000008 f15c3bbc c11d6cae 00000000 00000000 [ 136.546343] Call Trace: [ 136.546359] [<f80c658f>] ? br_change_mtu+0x5f/0x80 [bridge] [ 136.546372] [<c12cb9c8>] dev_set_mtu+0x38/0x80 [ 136.546381] [<c12da347>] do_setlink+0x1a7/0x860 [ 136.546390] [<c12d9c7d>] ? rtnl_fill_ifinfo+0x9bd/0xc70 [ 136.546400] [<c11d6cae>] ? nla_parse+0x6e/0xb0 [ 136.546409] [<c12db931>] rtnl_newlink+0x361/0x510 [ 136.546420] [<c1023240>] ? vmalloc_sync_all+0x100/0x100 [ 136.546429] [<c1362762>] ? error_code+0x5a/0x60 [ 136.546438] [<c12db5d0>] ? rtnl_configure_link+0x80/0x80 [ 136.546446] [<c12db27a>] rtnetlink_rcv_msg+0xfa/0x210 [ 136.546454] [<c12db180>] ? __rtnl_unlock+0x20/0x20 [ 136.546463] [<c12ee0fe>] netlink_rcv_skb+0x8e/0xb0 [ 136.546471] [<c12daf1c>] rtnetlink_rcv+0x1c/0x30 [ 136.546479] [<c12edafa>] netlink_unicast+0x23a/0x280 [ 136.546487] [<c12ede6b>] netlink_sendmsg+0x26b/0x2f0 [ 136.546497] [<c12bb828>] sock_sendmsg+0xc8/0x100 [ 136.546508] [<c10adf61>] ? __alloc_pages_nodemask+0xe1/0x750 [ 136.546517] [<c11d0602>] ? _copy_from_user+0x42/0x60 [ 136.546525] [<c12c5e4c>] ? verify_iovec+0x4c/0xc0 [ 136.546534] [<c12bd805>] sys_sendmsg+0x1c5/0x200 [ 136.546542] [<c10c2150>] ? __do_fault+0x310/0x410 [ 136.546549] [<c10c2c46>] ? do_wp_page+0x1d6/0x6b0 [ 136.546557] [<c10c47d1>] ? handle_pte_fault+0xe1/0x720 [ 136.546565] [<c12bd1af>] ? sys_getsockname+0x7f/0x90 [ 136.546574] [<c10c4ec1>] ? handle_mm_fault+0xb1/0x180 [ 136.546582] [<c1023240>] ? vmalloc_sync_all+0x100/0x100 [ 136.546589] [<c10233b3>] ? do_page_fault+0x173/0x3d0 [ 136.546596] [<c12bd87b>] ? sys_recvmsg+0x3b/0x60 [ 136.546605] [<c12bdd83>] sys_socketcall+0x293/0x2d0 [ 136.546614] [<c13629d0>] sysenter_do_call+0x12/0x26 [ 136.546619] Code: Bad EIP value. [ 136.546627] EIP: [<00000000>] 0x0 SS:ESP 0068:f15c3b48 [ 136.546645] CR2: 0000000000000000 [ 136.546652] ---[ end trace 6909b560e78934fa ]--- Signed-off-by: Alexander Holler <holler@ahsoftware.de> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: initialize fake_rtable metricsEric Dumazet2011-05-241-1/+5
| | | | | | | | | | bridge netfilter code uses a fake_rtable, and we must init its _metric field or risk NULL dereference later. Ref: https://bugzilla.kernel.org/show_bug.cgi?id=35672 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
OpenPOWER on IntegriCloud