Commit message (author, age, files changed, lines removed/added)
* gre: remove unnecessary rcu_read_lock/unlock (stephen hemminger, 2012-09-27, 2 files, -25/+10)
|
| The gre function pointers for receive and error handling are always called (from gre.c) with rcu_read_lock already held.
|
| Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* gre: fix handling of key 0 (stephen hemminger, 2012-09-27, 1 file, -10/+34)
|
| GRE driver incorrectly uses zero as a flag value. Zero is a perfectly valid value for key, and the tunnel should match packets with no key only with tunnels created without key, and vice versa.
|
| This is a slightly visible change, since previously it might have been possible to construct a working tunnel that sent key 0 and received only because of the key wildcard of zero, i.e. the sender sent a key of zero, but the tunnel was defined without a key.
|
| Note: using gre key 0 requires iproute2 utilities v3.2 or later. The original utility code was broken as well.
|
| Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
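The matching rule described in this message can be illustrated with a small sketch. This is not the driver code; the `Tunnel` class and `lookup` helper are hypothetical names, and `None` stands in for "tunnel created without a key" so that 0 remains an ordinary key value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tunnel:
    name: str
    # None means "created without a key"; 0 is a real, valid key.
    key: Optional[int]


def matches(tunnel: Tunnel, packet_key: Optional[int]) -> bool:
    """Keyless packets match only keyless tunnels, and vice versa."""
    return tunnel.key == packet_key


def lookup(tunnels, packet_key):
    """Return the first tunnel whose key state matches the packet, else None."""
    for tunnel in tunnels:
        if matches(tunnel, packet_key):
            return tunnel
    return None


tunnels = [Tunnel("gre-nokey", None), Tunnel("gre-key0", 0)]
```

Keeping "has a key" separate from the key value is the point of the sketch: a packet carrying key 0 no longer falls through to a tunnel that was defined without a key.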
* sparc: bpf_jit_comp: add XOR instruction for BPF JIT JIT (Daniel Borkmann, 2012-09-27, 1 file, -0/+4)
|
| This patch is a follow-up for patch "filter: add XOR instruction for use with X/K" that implements BPF SPARC JIT parts for the BPF XOR operation.
|
| Signed-off-by: Daniel Borkmann <daniel.borkmann@tik.ee.ethz.ch>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* lxt PHY: Support for the buggy LXT973 rev A2 (LEROY Christophe, 2012-09-27, 1 file, -0/+127)
|
| This patch adds proper handling of the buggy revision A2 of the LXT973 PHY, adding precautions linked to ERRATA Item 4:
|
| Revision A2 of the LXT973 chip randomly returns the contents of the previous even register when you read an odd register regularly.
|
| Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
| Acked-by: Richard Cochran <richardcochran@gmail.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4: Don't attempt to upgrade T4 firmware when cxgb4 will end up as a slave (Vipul Pandya, 2012-09-27, 5 files, -5/+206)
|
| This patch adds a new common code routine to upgrade an adapter's firmware. This routine handles all of the complexities of working with the existing adapter firmware in order to quiesce the adapter and uP, etc.
|
| For an automatic upgrade it will send a HELLO command to check whether cxgb4 wants to and can upgrade the firmware, i.e. whether cxgb4 is MASTER and has newer firmware that it wants to load, and then call the new common code routine t4_fw_upgrade. Note that it should not issue a RESET command after a successful firmware upgrade.
|
| Signed-off-by: Jay Hernandez <jay@chelsio.com>
| Signed-off-by: Vipul Pandya <vipul@chelsio.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4: Inform caller if driver didn't upgrade firmware (Vipul Pandya, 2012-09-27, 1 file, -0/+6)
|
| If a card had already been initialized and, on reloading the cxgb4 driver, the firmware required an upgrade, the upgrade did not happen. In that case a mailbox timeout would occur during T4 configuration file processing. The fix is to let the caller know the firmware was not upgraded, so that a reset is issued before starting the T4 configuration.
|
| Signed-off-by: Jay Hernandez <jay@chelsio.com>
| Signed-off-by: Vipul Pandya <vipul@chelsio.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4: Add support for T4 hardwired driver configuration settings (Vipul Pandya, 2012-09-27, 5 files, -32/+442)
|
| If neither the user-defined configuration file at /lib/firmware/cxgb4/t4-config.txt nor the factory default configuration file written to FLASH is present, the driver uses hardwired configuration settings.
|
| Signed-off-by: Jay Hernandez <jay@chelsio.com>
| Signed-off-by: Vipul Pandya <vipul@chelsio.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4: Add support for T4 configuration file (Vipul Pandya, 2012-09-27, 6 files, -133/+920)
|
| Starting with T4 firmware version 1.3.11.0 the firmware now supports device configuration via a Firmware Configuration File. The Firmware Configuration File was primarily developed in order to centralize all of the configuration, resource allocation, etc. for Unified Wire operation where multiple Physical / Virtual Function Drivers would be using a T4 adapter simultaneously.
|
| The Firmware Configuration file can live in three locations as shown below in order of precedence.
|
| 1) User defined configuration file: /lib/firmware/cxgb4/t4-config.txt
| 2) Factory Default configuration file written to FLASH within the manufacturing process.
| 3) Hardwired driver configuration.
|
| Signed-off-by: Jay Hernandez <jay@chelsio.com>
| Signed-off-by: Vipul Pandya <vipul@chelsio.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
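The precedence order listed in this message amounts to a simple fallback chain. The following is only a sketch of that ordering; the function name, its boolean parameters, and the returned labels are hypothetical and do not correspond to identifiers in the driver.

```python
def pick_config_source(user_file_present: bool, flash_config_present: bool) -> str:
    """Choose a configuration source in the order of precedence from the commit text."""
    if user_file_present:
        # 1) /lib/firmware/cxgb4/t4-config.txt supplied by the user
        return "user file"
    if flash_config_present:
        # 2) factory default written to FLASH during manufacturing
        return "flash default"
    # 3) nothing else available: fall back to settings built into the driver
    return "hardwired"
```

A user-supplied file wins even when a FLASH default exists, and the hardwired settings are only reached when both of the others are missing.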
* cxgb4/cxgb4vf: Code cleanup to enable T4 Configuration File support (Vipul Pandya, 2012-09-27, 5 files, -71/+365)
|
| This patch adds new enums and macros to enable T4 configuration file support. It also removes duplicate macro definitions.
|
| It fixes the build failure in the cxgb4vf driver introduced by the removal of old macro definitions.
|
| It also performs SGE initialization based on whether a T4 configuration file is provided. If one is provided, it uses the parameters from that file; otherwise it uses hard coded values.
|
| Signed-off-by: Jay Hernandez <jay@chelsio.com>
| Signed-off-by: Vipul Pandya <vipul@chelsio.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4: Add functions to read memory via PCIE memory window (Vipul Pandya, 2012-09-27, 3 files, -0/+219)
|
| This patch implements two new functions, t4_mem_win_read and t4_memory_read. These new functions can be used to read memory via the PCIE memory window. Please note that for proper execution of these functions the PCIE_MEM_ACCESS_BASE_WIN registers must be set up correctly, as setup_memwin in the cxgb4 driver does.
|
| Signed-off-by: Jay Hernandez <jay@chelsio.com>
| Signed-off-by: Vipul Pandya <vipul@chelsio.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4: Fix incorrect values for MEMWIN*_APERTURE and MEMWIN*_BASE (Vipul Pandya, 2012-09-27, 1 file, -4/+4)
|
| Signed-off-by: Jay Hernandez <jay@chelsio.com>
| Signed-off-by: Vipul Pandya <vipul@chelsio.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* ipconfig: fix trivial build error (Andy Shevchenko, 2012-09-25, 1 file, -11/+11)
|
| The commit 5e953778a2aab04929a5e7b69f53dc26e39b079e ("ipconfig: add nameserver IPs to kernel-parameter ip=") introduces ic_nameservers_predef(), which is defined only for BOOTP. However, it is used by ip_auto_config_setup() as well. This patch moves it outside of #ifdef BOOTP.
|
| Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
| Cc: Christoph Fritz <chf.fritz@googlemail.com>
| Cc: David S. Miller <davem@davemloft.net>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* net: raw: revert unrelated change (Eric Dumazet, 2012-09-25, 1 file, -12/+7)
|
| Commit 5640f7685831 ("net: use a per task frag allocator") accidentally contained an unrelated change to net/ipv4/raw.c, later committed (without the pr_err() debugging bits) in net tree as commit ab43ed8b749 (ipv4: raw: fix icmp_filter())
|
| This patch reverts this glitch, noticed by Stephen Rothwell.
|
| Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
| Signed-off-by: Eric Dumazet <edumazet@google.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* x86: bpf_jit_comp: add XOR instruction for BPF JIT (Daniel Borkmann, 2012-09-24, 1 file, -0/+9)
|
| This patch is a follow-up for patch "filter: add XOR instruction for use with X/K" that implements BPF x86 JIT parts for the BPF XOR operation.
|
| Signed-off-by: Daniel Borkmann <daniel.borkmann@tik.ee.ethz.ch>
| Acked-by: Eric Dumazet <edumazet@google.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* filter: add XOR instruction for use with X/K (Daniel Borkmann, 2012-09-24, 2 files, -3/+12)
|
| SKF_AD_ALU_XOR_X was added a while ago, but as an 'ancillary' operation that is invoked through a negative offset in K within BPF load operations. Since BPF_MOD has recently been added, BPF_XOR should also be part of the common ALU operations. Removing SKF_AD_ALU_XOR_X might not be an option since it is exposed to user space.
|
| Signed-off-by: Daniel Borkmann <daniel.borkmann@tik.ee.ethz.ch>
| Acked-by: Eric Dumazet <edumazet@google.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
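As a rough illustration of what an XOR ALU operation with an X or K source means for a classic BPF program, the sketch below emulates just that one instruction on a 32-bit accumulator. The opcode values are the classic BPF encodings as I understand them from the kernel headers and should be checked against linux/filter.h; the `alu_xor` helper is a hypothetical name and not the in-kernel interpreter.

```python
# Classic BPF opcode fields (assumed values; verify against linux/filter.h).
BPF_ALU = 0x04  # instruction class: arithmetic/logic
BPF_XOR = 0xA0  # ALU operation: exclusive or
BPF_K = 0x00    # source operand: the immediate constant K
BPF_X = 0x08    # source operand: the index register X

MASK32 = 0xFFFFFFFF  # the accumulator A is 32 bits wide


def alu_xor(code: int, a: int, x: int, k: int) -> int:
    """Return the new accumulator value after A ^= X or A ^= K."""
    if code == BPF_ALU | BPF_XOR | BPF_X:
        return (a ^ x) & MASK32
    if code == BPF_ALU | BPF_XOR | BPF_K:
        return (a ^ k) & MASK32
    raise ValueError("not a BPF_ALU|BPF_XOR instruction: %#x" % code)
```

With this shape, XOR sits next to the other common ALU operations and no longer needs the negative-offset load trick that SKF_AD_ALU_XOR_X relies on.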
* net: mipsnet: Remove the MIPSsim Ethernet driver. (Steven J. Hill, 2012-09-24, 3 files, -355/+0)
|
| The MIPSsim platform is no longer supported or used. This patch deletes the Ethernet driver.
|
| Signed-off-by: Steven J. Hill <sjhill@mips.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* net: use a per task frag allocator (Eric Dumazet, 2012-09-24, 13 files, -200/+167)
|
| We currently use a per socket order-0 page cache for tcp_sendmsg() operations.
|
| This page is used to build fragments for skbs.
|
| It's done to increase the probability of coalescing small write() calls into single segments in skbs still in the write queue (not yet sent).
|
| But it wastes a lot of memory for applications handling many mostly idle sockets, since each socket holds one page in sk->sk_sndmsg_page.
|
| It's also quite inefficient to build TSO 64KB packets, because we need about 16 pages per skb on arches where PAGE_SIZE = 4096, so we hit the page allocator more than wanted.
|
| This patch adds a per task frag allocator and uses bigger pages, if available. An automatic fallback is done in case of memory pressure.
|
| (up to 32768 bytes per frag, that's order-3 pages on x86)
|
| This increases TCP stream performance by 20% on the loopback device, but also benefits other network devices, since 8x fewer frags are mapped on transmit and unmapped on tx completion. Alexander Duyck mentioned a probable performance win on systems with IOMMU enabled.
|
| It's possible some SG enabled hardware can't cope with bigger fragments, but their ndo_start_xmit() should already handle this, splitting a fragment into sub fragments, since some arches have PAGE_SIZE=65536.
|
| Successfully tested on various ethernet devices. (ixgbe, igb, bnx2x, tg3, mellanox mlx4)
|
| Signed-off-by: Eric Dumazet <edumazet@google.com>
| Cc: Ben Hutchings <bhutchings@solarflare.com>
| Cc: Vijay Subramanian <subramanian.vijay@gmail.com>
| Cc: Alexander Duyck <alexander.h.duyck@intel.com>
| Tested-by: Vijay Subramanian <subramanian.vijay@gmail.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
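The fragment counts quoted in this message follow from simple arithmetic, and the fallback under memory pressure can be pictured as trying successively smaller page orders. The sketch below is only an illustration under those assumptions: `frags_needed`, `alloc_frag_page`, and the `try_alloc` callback are hypothetical names, not kernel functions.

```python
PAGE_SIZE = 4096       # x86 page size used in the commit text
TSO_BYTES = 64 * 1024  # one 64KB TSO packet


def frags_needed(payload_bytes: int, frag_bytes: int) -> int:
    """Number of fragments needed to hold the payload (ceiling division)."""
    return -(-payload_bytes // frag_bytes)


def alloc_frag_page(try_alloc, max_order: int = 3):
    """Try the largest page order first and fall back toward order 0.

    try_alloc(order) is a caller-supplied stand-in for the page allocator;
    it returns True when 2**order contiguous pages are available.
    Returns the order that succeeded, or None if every attempt failed.
    """
    for order in range(max_order, -1, -1):
        if try_alloc(order):
            return order
    return None


small = frags_needed(TSO_BYTES, PAGE_SIZE)       # order-0 pages
large = frags_needed(TSO_BYTES, PAGE_SIZE << 3)  # order-3 pages, 32768 bytes
```

Under these numbers a 64KB packet needs 16 order-0 fragments but only 2 order-3 fragments, which is where the factor of 8 in the message comes from.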
* gianfar: Change default HW Tx queue scheduling mode (Claudiu Manoil, 2012-09-24, 2 files, -2/+20)
|
| This is primarily to address transmission timeout occurrences when multiple H/W Tx queues are being used concurrently. Because in the priority scheduling mode the controller does not service the Tx queues equally (but in ascending index order), Tx timeouts are triggered right away for a basic test with multiple simultaneous connections like:
|
|     iperf -c <server_ip> -n 100M -P 8
|
| resulting in the kernel trace:
|
|     NETDEV WATCHDOG: eth1 (fsl-gianfar): transmit queue <X> timed out
|     ------------[ cut here ]------------
|     WARNING: at net/sched/sch_generic.c:255
|     ...
|
| and a controller reset during intense traffic, and possibly further complications.
|
| This patch changes the default H/W Tx scheduling setting (TXSCHED) for multi-queue devices from priority scheduling mode to a weighted round robin mode with equal weights for all H/W Tx queues, and addresses the issue above.
|
| Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
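Why priority scheduling can starve higher-numbered queues, while round robin with equal weights does not, can be seen in a toy model. This is a sketch only: the `service` function, its mode strings, and the queue counts are made up for illustration and say nothing about how the controller implements TXSCHED.

```python
def service(pending, slots, mode):
    """Send up to `slots` packets and return how many each queue got to send.

    pending: number of packets waiting in each Tx queue (index 0 first).
    mode: "priority" serves the lowest-index non-empty queue every time;
          "round_robin" takes one packet from each non-empty queue in turn.
    """
    pending = list(pending)
    sent = [0] * len(pending)
    next_queue = 0
    for _ in range(slots):
        if mode == "priority":
            ready = [i for i, count in enumerate(pending) if count > 0]
            if not ready:
                break
            chosen = ready[0]
        else:
            for step in range(len(pending)):
                chosen = (next_queue + step) % len(pending)
                if pending[chosen] > 0:
                    break
            else:
                break  # every queue is empty
            next_queue = chosen + 1
        pending[chosen] -= 1
        sent[chosen] += 1
    return sent


by_priority = service([100, 100, 100, 100], 8, "priority")
by_round_robin = service([100, 100, 100, 100], 8, "round_robin")
```

In the priority run every slot goes to queue 0 while the other queues wait, which is the kind of waiting that can end in a transmit watchdog timeout; in the round robin run each of the four queues sends two packets.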
* net: loopback: set default mtu to 64K (Eric Dumazet, 2012-09-24, 1 file, -1/+1)
|
| The current loopback mtu of 16436 bytes allows no more than 3 MSS TCP segments per frame, or 48 Kbytes. Changing the mtu to 64K allows the TCP stack to build large frames and significantly reduces stack overhead.
|
| The performance boost on bulk TCP transfers can be up to 30%, partly because we now have one ACK message for two 64KB segments, and a lower probability of hitting the /proc/sys/net/ipv4/tcp_reordering default limit.
|
| Signed-off-by: Eric Dumazet <edumazet@google.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
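The "3 MSS segments, or 48 Kbytes" figure can be reproduced with a short calculation. The sketch assumes IPv4 and TCP headers of 20 bytes each with no options, and a 65535-byte ceiling on one aggregated frame; neither assumption is stated in the commit message, and the helper names are hypothetical.

```python
IPV4_HEADER = 20         # bytes, assuming no IP options
TCP_HEADER = 20          # bytes, assuming no TCP options (e.g. no timestamps)
MAX_FRAME_BYTES = 65535  # assumed ceiling on the payload of one large frame


def mss(mtu: int) -> int:
    """TCP payload bytes that fit in one MTU-sized packet."""
    return mtu - IPV4_HEADER - TCP_HEADER


def segments_per_frame(mtu: int) -> int:
    """Whole MSS-sized segments that fit under the frame ceiling."""
    return MAX_FRAME_BYTES // mss(mtu)


old_mtu = 16436
old_segments = segments_per_frame(old_mtu)  # whole segments per frame
old_payload = old_segments * mss(old_mtu)   # payload bytes per frame
```

With the old MTU the MSS is 16396 bytes, three of which total 49188 bytes (about 48 Kbytes); a fourth segment would exceed the ceiling, so roughly a quarter of the possible frame size goes unused.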
* net: Remove unnecessary NULL check in scm_destroy(). (David S. Miller, 2012-09-24, 1 file, -1/+1)
|
| All callers provide a non-NULL scm argument.
|
| Signed-off-by: David S. Miller <davem@davemloft.net>
* bnx2x: Improve code around bnx2x_tests_str_arr (Merav Sicron, 2012-09-24, 1 file, -9/+4)
|
| This patch changes the definition of bnx2x_tests_str_arr from a static char pointer to a static const char two-dimensional array. Also the bnx2x_get_strings function is simplified.
|
| Reported-by: Joe Perches <joe@perches.com>
| Reported-by: David Laight <David.Laight@ACULAB.COM>
| Signed-off-by: Merav Sicron <meravs@broadcom.com>
| Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* ctcm: fix error return code (Peter Senna Tschudin, 2012-09-24, 1 file, -1/+1)
|
| Convert a nonnegative error return code to a negative one, as returned elsewhere in the function.
|
| A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/)
|
| // <smpl>
| (
| if@p1 (\(ret < 0\|ret != 0\))
|  { ... return ret; }
| |
| ret@p1 = 0
| )
| ... when != ret = e1
|     when != &ret
| *if(...)
| {
|   ... when != ret = e2
|       when forall
|  return ret;
| }
| // </smpl>
|
| Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com>
| Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
| Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* drivers/s390/net: removes unnecessary semicolon (Peter Senna Tschudin, 2012-09-24, 2 files, -3/+3)
|
| removes unnecessary semicolon
|
| Found by Coccinelle: http://coccinelle.lip6.fr/
|
| Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com>
| Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
| Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* qeth: fix possible memory leak in qeth_l3_add_[vipa|rxip]() (Wei Yongjun, 2012-09-24, 1 file, -0/+2)
|
| ipaddr is allocated in function qeth_l3_add_vipa() but is not freed before leaving in the error handling cases. The same problem also exists in function qeth_l3_add_rxip().
|
| spatch with a semantic match was used to find this problem. (http://coccinelle.lip6.fr/)
|
| Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
| Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* lcs: ensure proper ccw setup (Sebastian Ott, 2012-09-24, 1 file, -1/+1)
|
| Make sure that all ccws used for writing are initialized with zeros - especially since the last ccw contains a TIC for which the unused fields have to be zeros.
|
| Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
| Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* qeth: cleanup channel path descriptor function (Sebastian Ott, 2012-09-24, 1 file, -33/+41)
|
| Clean up the qeth_get_channel_path_desc function and rename it to qeth_update_from_chp_desc. No functional change.
|
| Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
| Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
| Acked-by: Ursula Braun <ursula.braun@de.ibm.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of git://1984.lsi.us.es/nf-next (David S. Miller, 2012-09-24, 37 files, -670/+773)
|\
| | Pablo Neira Ayuso says:
| |
| | ====================
| | This patchset contains updates for your net-next tree, they are:
| |
| | * Mostly fixes for the recently pushed IPv6 NAT support:
| |
| |   - Fix crash while removing nf_nat modules from Patrick McHardy.
| |   - Fix unbalanced rcu_read_unlock from Ulrich Weber.
| |   - Merge NETMAP and REDIRECT into one single xt_target module, from Jan Engelhardt.
| |   - Fix Kconfig for IPv6 NAT, which allows inconsistent configurations, from myself.
| |
| | * Updates for ipset, all of them from Jozsef Kadlecsik:
| |
| |   - Add the new "nomatch" option to obtain reverse set matching.
| |   - Support for /0 CIDR in hash:net,iface set type.
| |   - One non-critical fix for a rare crash due to passing really wrong configuration parameters.
| |   - Coding style cleanups.
| |   - Sparse fixes.
| |   - Add set revision supported via modinfo.
| |
| | * One extension for the xt_time match, to support matching during the transition between two days with one single rule, from Florian Westphal.
| |
| | * Fix maximum packet length supported by nfnetlink_queue and add NFQA_CAP_LEN attribute, from myself.
| |
| | You can notice that this batch contains a couple of fixes that may go to 3.6-rc, but I don't consider them critical enough to push them:
| |
| | * The ipset fix for the /0 cidr case, which is triggered with one inconsistent command line invocation of ipset.
| |
| | * The nfnetlink_queue maximum packet length supported, since it requires the new NFQA_CAP_LEN attribute to provide a full workaround for the described problem.
| | ====================
| |
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * netfilter: nfnetlink_queue: add NFQA_CAP_LEN attribute (Pablo Neira Ayuso, 2012-09-24, 2 files, -2/+8)
| |
| | This patch adds the NFQA_CAP_LEN attribute that allows us to know what is the real packet size from user-space (even if we decided to retrieve just a few bytes from the packet instead of all of it).
| |
| | Security software that inspects packets should always check for this new attribute to make sure that it is inspecting the entire packet.
| |
| | This also helps to provide a workaround for the problem described in: http://marc.info/?l=netfilter-devel&m=134519473212536&w=2
| |
| | Original idea from Florian Westphal.
| |
| | Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: nfnetlink_queue: fix maximum packet length to userspace (Pablo Neira Ayuso, 2012-09-24, 1 file, -3/+7)
| |
| | The packets that we send via NFQUEUE are encapsulated in the NFQA_PAYLOAD attribute. The length of the packet in userspace is obtained via the attr->nla_len field. This field contains the size of the Netlink attribute header plus the packet length.
| |
| | If the maximum packet length is specified, i.e. 65535 bytes, and packets in the range of (65531,65535] are sent to userspace, attr->nla_len overflows and reports bogus lengths to the application.
| |
| | To fix this, this patch limits the maximum packet length to 65531 bytes. If a larger packet length is specified, the packet that we send to user-space is truncated to 65531 bytes.
| |
| | To support 65535-byte packets, we have to revisit the idea of a 32-bit Netlink attribute length.
| |
| | Reported-by: Florian Westphal <fw@strlen.de>
| | Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
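The overflow described in this message is a 16-bit wraparound: the attribute length field counts the 4-byte attribute header plus the payload. The sketch below models only that arithmetic; `nla_len_field` and `clamp_copy_range` are hypothetical helper names, and the 4-byte header size is taken from the 65531 limit stated above.

```python
NLA_HDRLEN = 4    # attribute header size, implied by 65535 - 65531
U16_MAX = 0xFFFF  # nla_len is a 16-bit field


def nla_len_field(payload_len: int) -> int:
    """Value that ends up in a 16-bit length field for a given payload size."""
    return (NLA_HDRLEN + payload_len) & U16_MAX


def clamp_copy_range(requested: int) -> int:
    """Largest payload whose header-plus-payload length still fits in 16 bits."""
    return min(requested, U16_MAX - NLA_HDRLEN)
```

A 65531-byte payload yields a length field of 65535, the largest value that still fits, while a 65532-byte payload wraps to 0 and a 65535-byte payload wraps to 3, which is the bogus length the application would see.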
| * netfilter: nf_ct_ftp: add sequence tracking pickup facility for injected entries (Pablo Neira Ayuso, 2012-09-24, 4 files, -3/+31)
| |
| | This patch allows the FTP helper to pick up the sequence tracking from the first packet seen. This is useful to fix the breakage of the first FTP command after the failover while using conntrackd to synchronize states.
| |
| | The seq_aft_nl_num field in struct nf_ct_ftp_info has been shrunk to 16 bits (enough for what it does), so we can use the remaining 16 bits to store the flags while using the same size for the private FTP helper data.
| |
| | Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: xt_time: add support to ignore day transition (Florian Westphal, 2012-09-24, 2 files, -1/+28)
| |
| | Currently, if you want to do something like: "match Monday, starting 23:00, for two hours", you need two rules, one for Mon 23:00 to 0:00 and one for Tue 0:00-1:00.
| |
| | The rule:
| |
| |     --weekdays Mo --timestart 23:00 --timestop 01:00
| |
| | looks correct, but it will first match on Monday from midnight to 1 a.m. and then again for another hour from 23:00 onwards.
| |
| | This permits userspace to explicitly ignore the day transition and match for a single, continuous time period instead.
| |
| | Signed-off-by: Florian Westphal <fw@strlen.de>
| | Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
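The difference between the two behaviours can be made concrete with a small model of a weekday plus time-range match. This is a sketch under my own assumptions, not the xt_time code: the `time_match` function and its `ignore_day_transition` parameter are hypothetical names, weekdays are numbered 0 (Monday) to 6 (Sunday), and times are minutes since midnight.

```python
MON, TUE = 0, 1


def time_match(weekday, minutes, days, start, stop, ignore_day_transition=False):
    """Return True if (weekday, minutes) falls inside the rule's time window."""
    if start <= stop:
        # Ordinary window that does not cross midnight.
        return weekday in days and start <= minutes <= stop
    # Window wraps past midnight, e.g. 23:00 to 01:00.
    if not (minutes >= start or minutes <= stop):
        return False
    if ignore_day_transition and minutes <= stop:
        # After midnight: judge the weekday by the day the window started on.
        weekday = (weekday - 1) % 7
    return weekday in days


START, STOP = 23 * 60, 1 * 60  # 23:00 and 01:00
```

Without the flag, Monday 00:30 matches and Tuesday 00:30 does not, which is the split the message describes; with the flag set, the window runs continuously from Monday 23:00 to Tuesday 01:00 and Monday 00:30 no longer matches.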
| * netfilter: ipset: Support to match elements marked with "nomatch" (Jozsef Kadlecsik, 2012-09-22, 7 files, -20/+54)
| |
| | Exceptions can now be matched and we can branch according to the possible cases:
| |
| | a. match in the set if the element is not flagged as "nomatch"
| | b. match in the set if the element is flagged with "nomatch"
| | c. no match
| |
| | i.e.
| |
| |     iptables ... -m set --match-set ... -j ...
| |     iptables ... -m set --match-set ... --nomatch-entries -j ...
| |     ...
| |
| | Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
| * netfilter: ipset: Coding style fixes (Jozsef Kadlecsik, 2012-09-22, 4 files, -8/+12)
| |
| | Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
| * netfilter: ipset: Include supported revisions in module description (Jozsef Kadlecsik, 2012-09-22, 12 files, -39/+78)
| |
| | Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
| * netfilter: ipset: Add /0 network support to hash:net,iface type (Jozsef Kadlecsik, 2012-09-22, 1 file, -23/+21)
| |
| | Now it is possible to set up a single hash:net,iface type of set and a single ip6?tables match which covers all egress/ingress filtering.
| |
| | Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
| * netfilter: ipset: Rewrite cidr book keeping to handle /0 (Jozsef Kadlecsik, 2012-09-22, 1 file, -49/+55)
| |
| | Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
| * netfilter: ipset: Check and reject crazy /0 input parameters (Jozsef Kadlecsik, 2012-09-21, 6 files, -10/+13)
| |
| | The bitmap:ip and bitmap:ip,mac types did not reject such a crazy range at creation time, and using such a set results in a kernel crash. The hash types just silently ignored such parameters.
| |
| | Reject invalid /0 input parameters explicitly.
| |
| | Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
| * netfilter: ipset: Fix sparse warnings "incorrect type in assignment" (Jozsef Kadlecsik, 2012-09-21, 7 files, -33/+39)
| |
| | Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
| * netfilter: combine ipt_REDIRECT and ip6t_REDIRECT (Jan Engelhardt, 2012-09-21, 9 files, -230/+207)
| |
| | Combine more modules since the actual code is so small anyway that the kmod metadata and the module in its loaded state totally outweigh the combined actual code size.
| |
| | IP_NF_TARGET_REDIRECT becomes a compat option; IP6_NF_TARGET_REDIRECT is completely eliminated since it has not seen a release yet.
| |
| | Signed-off-by: Jan Engelhardt <jengelh@inai.de>
| | Acked-by: Patrick McHardy <kaber@trash.net>
| | Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: combine ipt_NETMAP and ip6t_NETMAP (Jan Engelhardt, 2012-09-21, 9 files, -212/+181)
| |
| | Combine more modules since the actual code is so small anyway that the kmod metadata and the module in its loaded state totally outweigh the combined actual code size.
| |
| | IP_NF_TARGET_NETMAP becomes a compat option; IP6_NF_TARGET_NETMAP is completely eliminated since it has not seen a release yet.
| |
| | Signed-off-by: Jan Engelhardt <jengelh@inai.de>
| | Acked-by: Patrick McHardy <kaber@trash.net>
| | Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: nf_nat: remove obsolete rcu_read_unlock call (Ulrich Weber, 2012-09-21, 1 file, -3/+1)
| |
| | hlist walk in find_appropriate_src() is not protected anymore by rcu_read_lock(), so rcu_read_unlock() is unnecessary if in_range() matches.
| |
| | This bug was added in (c7232c9 netfilter: add protocol independent NAT core).
| |
| | Signed-off-by: Ulrich Weber <ulrich.weber@sophos.com>
| | Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: nf_nat: fix oops when unloading protocol modules (Patrick McHardy, 2012-09-21, 2 files, -0/+4)
| |
| | When unloading a protocol module nf_ct_iterate_cleanup() is used to remove all conntracks using the protocol from the bysource hash and clean their NAT sections. Since the conntrack isn't actually killed, the NAT callback is invoked twice, once for each direction, which causes an oops when trying to delete it from the bysource hash for the second time.
| |
| | The same oops can also happen when removing both an L3 and L4 protocol since the cleanup function doesn't check whether the conntrack has already been cleaned up.
| |
| | Pid: 4052, comm: modprobe Not tainted 3.6.0-rc3-test-nat-unload-fix+ #32 Red Hat KVM
| | RIP: 0010:[<ffffffffa002c303>] [<ffffffffa002c303>] nf_nat_proto_clean+0x73/0xd0 [nf_nat]
| | RSP: 0018:ffff88007808fe18 EFLAGS: 00010246
| | RAX: 0000000000000000 RBX: ffff8800728550c0 RCX: ffff8800756288b0
| | RDX: dead000000200200 RSI: ffff88007808fe88 RDI: ffffffffa002f208
| | RBP: ffff88007808fe28 R08: ffff88007808e000 R09: 0000000000000000
| | R10: dead000000200200 R11: dead000000100100 R12: ffffffff81c6dc00
| | R13: ffff8800787582b8 R14: ffff880078758278 R15: ffff88007808fe88
| | FS: 00007f515985d700(0000) GS:ffff88007cd00000(0000) knlGS:0000000000000000
| | CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
| | CR2: 00007f515986a000 CR3: 000000007867a000 CR4: 00000000000006e0
| | DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
| | DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
| | Process modprobe (pid: 4052, threadinfo ffff88007808e000, task ffff8800756288b0)
| | Stack:
| |  ffff88007808fe68 ffffffffa002c290 ffff88007808fe78 ffffffff815614e3
| |  ffffffff00000000 00000aeb00000246 ffff88007808fe68 ffffffff81c6dc00
| |  ffff88007808fe88 ffffffffa00358a0 0000000000000000 000000000040f5b0
| | Call Trace:
| |  [<ffffffffa002c290>] ? nf_nat_net_exit+0x50/0x50 [nf_nat]
| |  [<ffffffff815614e3>] nf_ct_iterate_cleanup+0xc3/0x170
| |  [<ffffffffa002c55a>] nf_nat_l3proto_unregister+0x8a/0x100 [nf_nat]
| |  [<ffffffff812a0303>] ? compat_prepare_timeout+0x13/0xb0
| |  [<ffffffffa0035848>] nf_nat_l3proto_ipv4_exit+0x10/0x23 [nf_nat_ipv4]
| |  ...
| |
| | To fix this,
| |
| | - check whether the conntrack has already been cleaned up in nf_nat_proto_clean
| |
| | - change nf_ct_iterate_cleanup() to only invoke the callback function once for each conntrack (IP_CT_DIR_ORIGINAL).
| |
| | The second change doesn't affect other callers since when conntracks are actually killed, both directions are removed from the hash immediately and the callback is already only invoked once. If it is not killed, the second callback invocation will always return the same decision not to kill it.
| |
| | Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
| | Signed-off-by: Patrick McHardy <kaber@trash.net>
| | Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
| | Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: fix IPv6 NAT dependencies in Kconfig (Pablo Neira Ayuso, 2012-09-21, 1 file, -55/+55)
| |
| | * NF_NAT_IPV6 requires IP6_NF_IPTABLES
| |
| | * IP6_NF_TARGET_MASQUERADE, IP6_NF_TARGET_NETMAP, IP6_NF_TARGET_REDIRECT and IP6_NF_TARGET_NPT require NF_NAT_IPV6.
| |
| | This change just mirrors what IPv4 does in Kconfig, for consistency.
| |
| | Reported-by: Randy Dunlap <rdunlap@xenotime.net>
| | Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* | ixgbevf: Return error on failure to enable VLAN (Alexander Duyck, 2012-09-24, 2 files, -7/+34)
| |
| | With recent kernel changes we can now return errors on a failure to set up a VLAN filter. This patch takes advantage of that opportunity so that we can return either an EIO error in the case of a mailbox failure, or an EACCES error in the case of being denied access to the VLAN filter table by the PF.
| |
| | Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
| | Tested-by: Robert Garrett <robertx.e.garrett@intel.com>
| | Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* | ixgbevf: Add fix to VF to handle multi-descriptor buffers (Alexander Duyck, 2012-09-24, 2 files, -2/+17)
| |
| | This change fixes the ixgbevf driver so that it can correctly drop a frame should it receive a jumbo frame.
| |
| | Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
| | Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* | ixgbevf: Fix AIM (Adaptive Interrupt Moderation) (Greg Rose, 2012-09-24, 1 file, -0/+4)
| |
| | While fixing up a patch from Alex Duyck to use q_vectors in ring containers to update the ITR I bungled it and missed actually updating the counters in the ring container q_vectors. This patch fixes my mistake and makes interrupt moderation actually work.
| |
| | Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
| | Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* | ixgbevf - Remove unused parameter in ixgbevf_receive_skb (Narendra K, 2012-09-24, 1 file, -3/+1)
| |
| | Remove 'rx_ring' parameter as it is not used in ixgbevf_receive_skb
| |
| | Signed-off-by: Narendra K <narendra_k@dell.com>
| | Acked-by: Greg Rose <gregory.v.rose@intel.com>
| | Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* | ixgbe: Do not read the spoofed packets counter when not in IOV mode (Greg Rose, 2012-09-24, 1 file, -2/+3)
| |
| | The counter is not valid unless the controller is running in IOV mode.
| |
| | Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
| | Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
| | Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* | ixgbevf: Fix code for handling timeout (Alexander Duyck, 2012-09-24, 3 files, -52/+58)
| |
| | The VF driver was not designed to correctly handle a message timeout. As a result it is possible for one bad message to invalidate all messages following it until the part is reset. Instead we should copy the example in igbvf of how to handle a mailbox event and message timeout.
| |
| | Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
| | Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
| | Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* | netlink: Rearrange netlink_kernel_cfg to save space on 64-bit. (David S. Miller, 2012-09-23, 1 file, -1/+1)
| |
| | Suggested by Jan Engelhardt.
| |
| | Signed-off-by: David S. Miller <davem@davemloft.net>