op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message (Collapse)	Author	Age	Files	Lines
*	sit: fix use after free of fb_tunnel_dev	Willem de Bruijn	2013-11-14	1	-4/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Bug: The fallback device is created in sit_init_net and assumed to be freed in sit_exit_net. First, it is dereferenced in that function, in sit_destroy_tunnels: struct net *net = dev_net(sitn->fb_tunnel_dev); Prior to this, rtnl_unlink_register has removed all devices that match rtnl_link_ops == sit_link_ops. Commit 205983c43700 added the line + sitn->fb_tunnel_dev->rtnl_link_ops = &sit_link_ops; which cases the fallback device to match here and be freed before it is last dereferenced. Fix: This commit adds an explicit .delllink callback to sit_link_ops that skips deallocation at rtnl_unlink_register for the fallback device. This mechanism is comparable to the one in ip_tunnel. It also modifies sit_destroy_tunnels and its only caller sit_exit_net to avoid the offending dereference in the first place. That double lookup is more complicated than required. Test: The bug is only triggered when CONFIG_NET_NS is enabled. It causes a GPF only when CONFIG_DEBUG_SLAB is enabled. Verified that this bug exists at the mentioned commit, at davem-net HEAD and at 3.11.y HEAD. Verified that it went away after applying this patch. Fixes: 205983c43700 ("sit: allow to use rtnl ops on fb tunnel") Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net:fec: fix WARNING caused by lack of calls to dma_mapping_error()	Duan Fugang-B38611	2013-11-14	1	-6/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The driver fails to check the results of DMA mapping and results in the following warning: (with kernel config "CONFIG_DMA_API_DEBUG" enable) ------------[ cut here ]------------ WARNING: at lib/dma-debug.c:937 check_unmap+0x43c/0x7d8() fec 2188000.ethernet: DMA-API: device driver failed to check map error[device address=0x00000000383a8040] [size=2048 bytes] [mapped as single] Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.17-16827-g9cdb0ba-dirty #188 [<80013c4c>] (unwind_backtrace+0x0/0xf8) from [<80011704>] (show_stack+0x10/0x14) [<80011704>] (show_stack+0x10/0x14) from [<80025614>] (warn_slowpath_common+0x4c/0x6c) [<80025614>] (warn_slowpath_common+0x4c/0x6c) from [<800256c8>] (warn_slowpath_fmt+0x30/0x40) [<800256c8>] (warn_slowpath_fmt+0x30/0x40) from [<8026bfdc>] (check_unmap+0x43c/0x7d8) [<8026bfdc>] (check_unmap+0x43c/0x7d8) from [<8026c584>] (debug_dma_unmap_page+0x6c/0x78) [<8026c584>] (debug_dma_unmap_page+0x6c/0x78) from [<8038049c>] (fec_enet_rx_napi+0x254/0x8a8) [<8038049c>] (fec_enet_rx_napi+0x254/0x8a8) from [<804dc8c0>] (net_rx_action+0x94/0x160) [<804dc8c0>] (net_rx_action+0x94/0x160) from [<8002c758>] (__do_softirq+0xe8/0x1d0) [<8002c758>] (__do_softirq+0xe8/0x1d0) from [<8002c8e8>] (do_softirq+0x4c/0x58) [<8002c8e8>] (do_softirq+0x4c/0x58) from [<8002cb50>] (irq_exit+0x90/0xc8) [<8002cb50>] (irq_exit+0x90/0xc8) from [<8000ea88>] (handle_IRQ+0x3c/0x94) [<8000ea88>] (handle_IRQ+0x3c/0x94) from [<8000855c>] (gic_handle_irq+0x28/0x5c) [<8000855c>] (gic_handle_irq+0x28/0x5c) from [<8000de00>] (__irq_svc+0x40/0x50) Exception stack(0x815a5f38 to 0x815a5f80) 5f20: 815a5f80 3b9aca00 5f40: 0fe52383 00000002 0dd8950e 00000002 81e7b080 00000000 00000000 815ac4d8 5f60: 806032ec 00000000 00000017 815a5f80 80059028 8041fc4c 60000013 ffffffff [<8000de00>] (__irq_svc+0x40/0x50) from [<8041fc4c>] (cpuidle_enter_state+0x50/0xf0) [<8041fc4c>] (cpuidle_enter_state+0x50/0xf0) from [<8041fd94>] (cpuidle_idle_call+0xa8/0x14c) [<8041fd94>] (cpuidle_idle_call+0xa8/0x14c) from [<8000edac>] (arch_cpu_idle+0x10/0x4c) [<8000edac>] (arch_cpu_idle+0x10/0x4c) from [<800582f8>] (cpu_startup_entry+0x60/0x130) [<800582f8>] (cpu_startup_entry+0x60/0x130) from [<80bc7a48>] (start_kernel+0x2d0/0x328) [<80bc7a48>] (start_kernel+0x2d0/0x328) from [<10008074>] (0x10008074) ---[ end trace c6edec32436e0042 ]--- Because dma-debug add new interfaces to debug dma mapping errors, pls refer to: http://lwn.net/Articles/516640/ After dma mapping, it must call dma_mapping_error() to check mapping error, otherwise the map_err_type alway is MAP_ERR_NOT_CHECKED, check_unmap() define the mapping is not checked and dump the error msg. So,add dma_mapping_error() checking to fix the WARNING And RX DMA buffers are used repeatedly and the driver copies it into an skb, fec_enet_rx() should not map or unmap, use dma_sync_single_for_cpu()/dma_sync_single_for_device() instead of dma_map_single()/dma_unmap_single(). There have another potential issue: fec_enet_rx() passes the DMA address to __va(). Physical and DMA addresses are not the same thing. They may differ if the device is behind an IOMMU or bounce buffering was required, or just because there is a fixed offset between the device and host physical addresses. Also fix it in this patch. ============================================= V2: add net_ratelimit() to limit map err message. use dma_sync_single_for_cpu() instead of dma_map_single(). fix the issue that pass DMA addresses to __va() to get virture address. V1: initial send ============================================= Signed-off-by: Fugang Duan <B38611@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: sctp: bug-fixing: retran_path not set properly after transports ↵	Chang Xiangzhong	2013-11-14	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	recovering (v3) When a transport recovers due to the new coming sack, SCTP should iterate all of its transport_list to locate the __two__ most recently used transport and set to active_path and retran_path respectively. The exising code does not find the two properly - In case of the following list: [most-recent] -> [2nd-most-recent] -> ... Both active_path and retran_path would be set to the 1st element. The bug happens when: 1) multi-homing 2) failure/partial_failure transport recovers Both active_path and retran_path would be set to the same most-recent one, in other words, retran_path would not take its role - an end user might not even notice this issue. Signed-off-by: Chang Xiangzhong <changxiangzhong@gmail.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net-tcp: fix panic in tcp_fastopen_cache_set()	Eric Dumazet	2013-11-14	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We had some reports of crashes using TCP fastopen, and Dave Jones gave a nice stack trace pointing to the error. Issue is that tcp_get_metrics() should not be called with a NULL dst Fixes: 1fe4c481ba637 ("net-tcp: Fast Open client - cookie cache") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dave Jones <davej@redhat.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Tested-by: Dave Jones <davej@fedoraproject.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	bonding: fix two race conditions in bond_store_updelay/downdelay	Nikolay Aleksandrov	2013-11-14	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes two race conditions between bond_store_updelay/downdelay and bond_store_miimon which could lead to division by zero as miimon can be set to 0 while either updelay/downdelay are being set and thus miss the zero check in the beginning, the zero div happens because updelay/downdelay are stored as new_value / bond->params.miimon. Use rtnl to synchronize with miimon setting. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tcp: tsq: restore minimal amount of queueing	Eric Dumazet	2013-11-14	3	-10/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After commit c9eeec26e32e ("tcp: TSQ can use a dynamic limit"), several users reported throughput regressions, notably on mvneta and wifi adapters. 802.11 AMPDU requires a fair amount of queueing to be effective. This patch partially reverts the change done in tcp_write_xmit() so that the minimal amount is sysctl_tcp_limit_output_bytes. It also remove the use of this sysctl while building skb stored in write queue, as TSO autosizing does the right thing anyway. Users with well behaving NICS and correct qdisc (like sch_fq), can then lower the default sysctl_tcp_limit_output_bytes value from 128KB to 8KB. This new usage of sysctl_tcp_limit_output_bytes permits each driver authors to check how their driver performs when/if the value is set to a minimum of 4KB. Normally, line rate for a single TCP flow should be possible, but some drivers rely on timers to perform TX completion and too long TX completion delays prevent reaching full throughput. Fixes: c9eeec26e32e ("tcp: TSQ can use a dynamic limit") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Sujith Manoharan <sujith@msujith.org> Reported-by: Arnaud Ebalard <arno@natisbad.org> Tested-by: Sujith Manoharan <sujith@msujith.org> Cc: Felix Fietkau <nbd@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'hwtstamp'	David S. Miller	2013-11-14	6	-54/+30
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Ben Hutchings says: ==================== net_tstamp: Validate hwtstamp_config completely before applying it This series fixes very similar bugs in 6 implementations of SIOCSHWTSTAMP. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ixp4xx_eth: Validate hwtstamp_config completely before applying it	Ben Hutchings	2013-11-14	1	-9/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	hwtstamp_ioctl() should validate all fields of hwtstamp_config before making any changes. Currently it sets the TX configuration before validating the rx_filter field. Untested as I don't have a cross-compiler to hand. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ti_cpsw: Validate hwtstamp_config completely before applying it	Ben Hutchings	2013-11-14	1	-10/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	cpsw_hwtstamp_ioctl() should validate all fields of hwtstamp_config, and the hardware version, before making any changes. Currently it sets the TX configuration before validating the rx_filter field or that the hardware supports timestamping. Also correct the error code for hardware versions that don't support timestamping. ENOTSUPP is used by the NFS implementation and is not part of userland API; we want EOPNOTSUPP (which glibc also calls ENOTSUP, with one 'P'). Untested as I don't have a cross-compiler to hand. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Mugunthan V N <mugunthanvnm@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	stmmac: Validate hwtstamp_config completely before applying it	Ben Hutchings	2013-11-14	1	-9/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	stmmac_hwtstamp_ioctl() should validate all fields of hwtstamp_config before making any changes. Currently it sets the TX configuration before validating the rx_filter field. Compile-tested only. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	pch_gbe: Validate hwtstamp_config completely before applying it	Ben Hutchings	2013-11-14	1	-9/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	hwtstamp_ioctl() should validate all fields of hwtstamp_config before making any changes. Currently it sets the TX configuration before validating the rx_filter field. Compile-tested only. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	e1000e: Validate hwtstamp_config completely before applying it	Ben Hutchings	2013-11-14	1	-8/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	e1000e_hwtstamp_ioctl() should validate all fields of hwtstamp_config before making any changes. Currently it copies the configuration to the e1000_adapter structure before validating it at all. Change e1000e_config_hwtstamp() to take a pointer to the hwstamp_config and to copy the config after validating it. Compile-tested only. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tg3: Validate hwtstamp_config completely before applying it	Ben Hutchings	2013-11-14	1	-9/+7
\|/ \| \| \| \| \| \| \| \| \| \| \|	tg3_hwtstamp_ioctl() should validate all fields of hwtstamp_config before making any changes. Currently it sets the TX configuration before validating the rx_filter field. Compile-tested only. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Nithin Nayak Sujir <nsujir@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	bridge: Fix memory leak when deleting bridge with vlan filtering enabled	Toshiaki Makita	2013-11-14	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We currently don't call br_vlan_flush() when deleting a bridge, which leads to memory leak if br->vlan_info is allocated. Steps to reproduce: while : do brctl addbr br0 bridge vlan add dev br0 vid 10 self brctl delbr br0 done We can observe the cache size of corresponding slab entry (as kmalloc-2048 in SLUB) is increased. kmemleak output: unreferenced object 0xffff8800b68a7000 (size 2048): comm "bridge", pid 2086, jiffies 4295774704 (age 47.656s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 48 9b 36 00 88 ff ff .........H.6.... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff815eb6ae>] kmemleak_alloc+0x4e/0xb0 [<ffffffff8116a1ca>] kmem_cache_alloc_trace+0xca/0x220 [<ffffffffa03eddd6>] br_vlan_add+0x66/0xe0 [bridge] [<ffffffffa03e543c>] br_setlink+0x2dc/0x340 [bridge] [<ffffffff8150e481>] rtnl_bridge_setlink+0x101/0x200 [<ffffffff8150d9d9>] rtnetlink_rcv_msg+0x99/0x260 [<ffffffff81528679>] netlink_rcv_skb+0xa9/0xc0 [<ffffffff8150d938>] rtnetlink_rcv+0x28/0x30 [<ffffffff81527ccd>] netlink_unicast+0xdd/0x190 [<ffffffff8152807f>] netlink_sendmsg+0x2ff/0x740 [<ffffffff814e8368>] sock_sendmsg+0x88/0xc0 [<ffffffff814e8ac8>] ___sys_sendmsg.part.14+0x298/0x2b0 [<ffffffff814e91de>] __sys_sendmsg+0x4e/0x90 [<ffffffff814e922e>] SyS_sendmsg+0xe/0x10 [<ffffffff81601669>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
*	bridge: Call vlan_vid_del for all vids at nbp_vlan_flush	Toshiaki Makita	2013-11-14	1	-0/+4
\| \| \| \| \| \| \| \|	We should call vlan_vid_del for all vids at nbp_vlan_flush to prevent vid_info->refcount from being leaked when detaching a bridge port. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
*	bridge: Use vlan_vid_[add/del] instead of direct ndo_vlan_rx_[add/kill]_vid ↵	Toshiaki Makita	2013-11-14	1	-14/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	calls We should use wrapper functions vlan_vid_[add/del] instead of ndo_vlan_rx_[add/kill]_vid. Otherwise, we might be not able to communicate using vlan interface in a certain situation. Example of problematic case: vconfig add eth0 10 brctl addif br0 eth0 bridge vlan add dev eth0 vid 10 bridge vlan del dev eth0 vid 10 brctl delif br0 eth0 In this case, we cannot communicate via eth0.10 because vlan 10 is filtered by NIC that has the vlan filtering feature. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
*	random32: use msecs_to_jiffies for reseed timer	Daniel Borkmann	2013-11-14	1	-2/+6
\| \| \| \| \| \| \| \| \| \|	Use msecs_to_jiffies, for these calculations as different HZ considerations are taken into account for conversion of the timer shot, and also it makes the code more readable. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	random32: add __init prefix to prandom_start_seed_timer	Daniel Borkmann	2013-11-14	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	We only call that in functions annotated with __init, so add __init prefix in prandom_start_seed_timer() as well, so that the kernel can make use of this hint and we can possibly free up resources after it's usage. And since it's an internal function rename it to __prandom_start_seed_timer(). Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	macvtap: limit head length of skb allocated	Jason Wang	2013-11-14	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We currently use hdr_len as a hint of head length which is advertised by guest. But when guest advertise a very big value, it can lead to an 64K+ allocating of kmalloc() which has a very high possibility of failure when host memory is fragmented or under heavy stress. The huge hdr_len also reduce the effect of zerocopy or even disable if a gso skb is linearized in guest. To solves those issues, this patch introduces an upper limit (PAGE_SIZE) of the head, which guarantees an order 0 allocation each time. Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tuntap: limit head length of skb allocated	Jason Wang	2013-11-14	1	-1/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We currently use hdr_len as a hint of head length which is advertised by guest. But when guest advertise a very big value, it can lead to an 64K+ allocating of kmalloc() which has a very high possibility of failure when host memory is fragmented or under heavy stress. The huge hdr_len also reduce the effect of zerocopy or even disable if a gso skb is linearized in guest. To solves those issues, this patch introduces an upper limit (PAGE_SIZE) of the head, which guarantees an order 0 allocation each time. Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: mv643xx_eth: potential NULL dereference in probe()	Dan Carpenter	2013-11-14	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \|	We assume that "mp->phy" can be NULL a couple lines before the dereference. Fixes: 1cce16d37d0f ('net: mv643xx_eth: Add missing phy_addr_set in DT mode') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Acked-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: cdc_ncm: cleanup a type issue in cdc_ncm_setup()	Dan Carpenter	2013-11-14	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	This is harmless but cdc_ncm_setup() returns negative error codes truncated to u8 values. There is only one caller and treats all non-zero returns as errors but doesn't store the the return code. So the code works correctly but it's messy and upsets the static checkers. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	core/dev: do not ignore dmac in dev_forward_skb()	Alexei Starovoitov	2013-11-14	2	-7/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 06a23fe31ca3 ("core/dev: set pkt_type after eth_type_trans() in dev_forward_skb()") and refactoring 64261f230a91 ("dev: move skb_scrub_packet() after eth_type_trans()") are forcing pkt_type to be PACKET_HOST when skb traverses veth. which means that ip forwarding will kick in inside netns even if skb->eth->h_dest != dev->dev_addr Fix order of eth_type_trans() and skb_scrub_packet() in dev_forward_skb() and in ip_tunnel_rcv() Fixes: 06a23fe31ca3 ("core/dev: set pkt_type after eth_type_trans() in dev_forward_skb()") CC: Isaku Yamahata <yamahatanetdev@gmail.com> CC: Maciej Zenczykowski <zenczykowski@gmail.com> CC: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	usbnet: fix status interrupt urb handling	Felix Fietkau	2013-11-14	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since commit 7b0c5f21f348a66de495868b8df0284e8dfd6bbf "sierra_net: keep status interrupt URB active", sierra_net triggers status interrupt polling before the net_device is opened (in order to properly receive the sync message response). To be able to receive further interrupts, the interrupt urb needs to be re-submitted, so this patch removes the bogus check for netif_running(). Signed-off-by: Felix Fietkau <nbd@openwrt.org> Tested-by: Dan Williams <dcbw@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	bonding: don't permit to use ARP monitoring in 802.3ad mode	Veaceslav Falico	2013-11-14	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently the ARP monitoring is not supported with 802.3ad, and it's prohibited to use it via the module params. However we still can set it afterwards via sysfs, cause we only check for *LB modes there. To fix this - add a check for 802.3ad mode in bonding_store_arp_interval. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	MAINTAINERS: cxgb3: Update cxgb3 maintainer entry	Divy Le Ray	2013-11-14	1	-1/+1
\| \| \| \| \| \| \|	Santosh raspatur is taking over the maintenance of cxgb3. Signed-off-by: Divy Le Ray <divy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next	Linus Torvalds	2013-11-13	1331	-32347/+78900
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pull networking updates from David Miller: 1) The addition of nftables. No longer will we need protocol aware firewall filtering modules, it can all live in userspace. At the core of nftables is a, for lack of a better term, virtual machine that executes byte codes to inspect packet or metadata (arriving interface index, etc.) and make verdict decisions. Besides support for loading packet contents and comparing them, the interpreter supports lookups in various datastructures as fundamental operations. For example sets are supports, and therefore one could create a set of whitelist IP address entries which have ACCEPT verdicts attached to them, and use the appropriate byte codes to do such lookups. Since the interpreted code is composed in userspace, userspace can do things like optimize things before giving it to the kernel. Another major improvement is the capability of atomically updating portions of the ruleset. In the existing netfilter implementation, one has to update the entire rule set in order to make a change and this is very expensive. Userspace tools exist to create nftables rules using existing netfilter rule sets, but both kernel implementations will need to co-exist for quite some time as we transition from the old to the new stuff. Kudos to Patrick McHardy, Pablo Neira Ayuso, and others who have worked so hard on this. 2) Daniel Borkmann and Hannes Frederic Sowa made several improvements to our pseudo-random number generator, mostly used for things like UDP port randomization and netfitler, amongst other things. In particular the taus88 generater is updated to taus113, and test cases are added. 3) Support 64-bit rates in HTB and TBF schedulers, from Eric Dumazet and Yang Yingliang. 4) Add support for new 577xx tigon3 chips to tg3 driver, from Nithin Sujir. 5) Fix two fatal flaws in TCP dynamic right sizing, from Eric Dumazet, Neal Cardwell, and Yuchung Cheng. 6) Allow IP_TOS and IP_TTL to be specified in sendmsg() ancillary control message data, much like other socket option attributes. From Francesco Fusco. 7) Allow applications to specify a cap on the rate computed automatically by the kernel for pacing flows, via a new SO_MAX_PACING_RATE socket option. From Eric Dumazet. 8) Make the initial autotuned send buffer sizing in TCP more closely reflect actual needs, from Eric Dumazet. 9) Currently early socket demux only happens for TCP sockets, but we can do it for connected UDP sockets too. Implementation from Shawn Bohrer. 10) Refactor inet socket demux with the goal of improving hash demux performance for listening sockets. With the main goals being able to use RCU lookups on even request sockets, and eliminating the listening lock contention. From Eric Dumazet. 11) The bonding layer has many demuxes in it's fast path, and an RCU conversion was started back in 3.11, several changes here extend the RCU usage to even more locations. From Ding Tianhong and Wang Yufen, based upon suggestions by Nikolay Aleksandrov and Veaceslav Falico. 12) Allow stackability of segmentation offloads to, in particular, allow segmentation offloading over tunnels. From Eric Dumazet. 13) Significantly improve the handling of secret keys we input into the various hash functions in the inet hashtables, TCP fast open, as well as syncookies. From Hannes Frederic Sowa. The key fundamental operation is "net_get_random_once()" which uses static keys. Hannes even extended this to ipv4/ipv6 fragmentation handling and our generic flow dissector. 14) The generic driver layer takes care now to set the driver data to NULL on device removal, so it's no longer necessary for drivers to explicitly set it to NULL any more. Many drivers have been cleaned up in this way, from Jingoo Han. 15) Add a BPF based packet scheduler classifier, from Daniel Borkmann. 16) Improve CRC32 interfaces and generic SKB checksum iterators so that SCTP's checksumming can more cleanly be handled. Also from Daniel Borkmann. 17) Add a new PMTU discovery mode, IP_PMTUDISC_INTERFACE, which forces using the interface MTU value. This helps avoid PMTU attacks, particularly on DNS servers. From Hannes Frederic Sowa. 18) Use generic XPS for transmit queue steering rather than internal (re-)implementation in virtio-net. From Jason Wang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1622 commits) random32: add test cases for taus113 implementation random32: upgrade taus88 generator to taus113 from errata paper random32: move rnd_state to linux/random.h random32: add prandom_reseed_late() and call when nonblocking pool becomes initialized random32: add periodic reseeding random32: fix off-by-one in seeding requirement PHY: Add RTL8201CP phy_driver to realtek xtsonic: add missing platform_set_drvdata() in xtsonic_probe() macmace: add missing platform_set_drvdata() in mace_probe() ethernet/arc/arc_emac: add missing platform_set_drvdata() in arc_emac_probe() ipv6: protect for_each_sk_fl_rcu in mem_check with rcu_read_lock_bh vlan: Implement vlan_dev_get_egress_qos_mask as an inline. ixgbe: add warning when max_vfs is out of range. igb: Update link modes display in ethtool netfilter: push reasm skb through instead of original frag skbs ip6_output: fragment outgoing reassembled skb properly MAINTAINERS: mv643xx_eth: take over maintainership from Lennart net_sched: tbf: support of 64bit rates ixgbe: deleting dfwd stations out of order can cause null ptr deref ixgbe: fix build err, num_rx_queues is only available with CONFIG_RPS ...
\| *	Merge branch 'prandom'	David S. Miller	2013-11-11	5	-46/+294
\| \|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	prandom fixes/improvements ==================== It would be great if you could still consider this series that fixes and improves prandom for 3.13. We have sent it to netdev as prandom() originally came from net/core/utils.c and networking is its main user. For a detailled description, please see individual patches. For patch 3 in this series, there will be a minor merge conflict with the random tree that is for 3.13. See below how to resolve it. ==== Hannes says: on merge with the random tree I would suggest to resolve the conflict in drivers/char/random.c like this: if (r->entropy_total > 128) { r->initialized = 1; r->entropy_total = 0; if (r == &nonblocking_pool) { prandom_reseed_late(); pr_notice("random: %s pool is initialized\n", r->name); } } So it won't generate a warning if DEBUG_RANDOM_BOOT gets activated. ==== Patch 1 should probably also go to -stable. Set tested on 32 and 64 bit machines. Thanks a lot! Ref. original discussion: http://patchwork.ozlabs.org/patch/289951/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| *	random32: add test cases for taus113 implementation	Daniel Borkmann	2013-11-11	2	-6/+196
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We generated a battery of 100 test cases from GSL taus113 implemention and compare the results from a particular seed and a particular iteration with our implementation in the kernel. We have verified on 32 and 64 bit machines that our taus113 kernel implementation gives same results as GSL taus113 implementation: [ 0.147370] prandom: seed boundary self test passed [ 0.148078] prandom: 100 self tests passed This is a Kconfig option that is disabled on default, just like the crc32 init selftests in order to not unnecessary slow down boot process. We also refactored out prandom_seed_very_weak() as it's now used in multiple places in order to reduce redundant code. GSL code we used for generating test cases: int i, j; srand(time(NULL)); for (i = 0; i < 100; ++i) { int iteration = 500 + (rand() % 500); gsl_rng_default_seed = rand() + 1; gsl_rng *r = gsl_rng_alloc(gsl_rng_taus113); printf("\t{ %lu, ", gsl_rng_default_seed); for (j = 0; j < iteration - 1; ++j) gsl_rng_get(r); printf("%u, %lu },\n", iteration, gsl_rng_get(r)); gsl_rng_free(r); } Joint work with Hannes Frederic Sowa. Cc: Florian Weimer <fweimer@redhat.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| *	random32: upgrade taus88 generator to taus113 from errata paper	Daniel Borkmann	2013-11-11	2	-39/+52
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since we use prandom() functions quite often in networking code i.e. in UDP port selection, netfilter code, etc, upgrade the PRNG from Pierre L'Ecuyer's original paper "Maximally Equidistributed Combined Tausworthe Generators", Mathematics of Computation, 65, 213 (1996), 203--213 to the version published in his errata paper [1]. The Tausworthe generator is a maximally-equidistributed generator, that is fast and has good statistical properties [1]. The version presented there upgrades the 3 state LFSR to a 4 state LFSR with increased periodicity from about 2^88 to 2^113. The algorithm is presented in [1] by the very same author who also designed the original algorithm in [2]. Also, by increasing the state, we make it a bit harder for attackers to "guess" the PRNGs internal state. See also discussion in [3]. Now, as we use this sort of weak initialization discussed in [3] only between core_initcall() until late_initcall() time [] for prandom32() users, namely in prandom_init(), it is less relevant from late_initcall() onwards as we overwrite seeds through prandom_reseed() anyways with a seed source of higher entropy, that is, get_random_bytes(). In other words, a exhaustive keysearch of 96 bit would be needed. Now, with the help of this patch, this state-search increases further to 128 bit. Initialization needs to make sure that s1 > 1, s2 > 7, s3 > 15, s4 > 127. taus88 and taus113 algorithm is also part of GSL. I added a test case in the next patch to verify internal behaviour of this patch with GSL and ran tests with the dieharder 3.31.1 RNG test suite: $ dieharder -g 052 -a -m 10 -s 1 -S 4137730333 #taus88 $ dieharder -g 054 -a -m 10 -s 1 -S 4137730333 #taus113 With this seed configuration, in order to compare both, we get the following differences: algorithm taus88 taus113 rands/second [] 1.61e+08 1.37e+08 sts_serial(4, 1st run) WEAK PASSED sts_serial(9, 2nd run) WEAK PASSED rgb_lagged_sum(31) WEAK PASSED We took out diehard_sums test as according to the authors it is considered broken and unusable [4]. Despite that and the slight decrease in performance (which is acceptable), taus113 here passes all 113 tests (only rgb_minimum_distance_5 in WEAK, the rest PASSED). In general, taus/taus113 is considered "very good" by the authors of dieharder [5]. The papers [1][2] states a single warm-up step is sufficient by running quicktaus once on each state to ensure proper initialization of ~s_{0}: Our selection of (s) according to Table 1 of [1] row 1 holds the condition L - k <= r - s, that is, (32 32 32 32) - (31 29 28 25) <= (25 27 15 22) - (18 2 7 13) with r = k - q and q = (6 2 13 3) as also stated by the paper. So according to [2] we are safe with one round of quicktaus for initialization. However we decided to include the warm-up phase of the PRNG as done in GSL in every case as a safety net. We also use the warm up phase to make the output of the RNG easier to verify by the GSL output. In prandom_init(), we also mix random_get_entropy() into it, just like drivers/char/random.c does it, jiffies ^ random_get_entropy(). random-get_entropy() is get_cycles(). xor is entropy preserving so it is fine if it is not implemented by some architectures. Note, this PRNG is not* used for cryptography in the kernel, but rather as a fast PRNG for various randomizations i.e. in the networking code, or elsewhere for debugging purposes, for example. []: In order to generate some "sort of pseduo-randomness", since get_random_bytes() is not yet available for us, we use jiffies and initialize states s1 - s3 with a simple linear congruential generator (LCG), that is x <- x 69069; and derive s2, s3, from the 32bit initialization from s1. So the above quote from [3] accounts only for the time from core to late initcall, not afterwards. [**] Single threaded run on MacBook Air w/ Intel Core i5-3317U [1] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme2.ps [2] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme.ps [3] http://thread.gmane.org/gmane.comp.encryption.general/12103/ [4] http://code.google.com/p/dieharder/source/browse/trunk/libdieharder/diehard_sums.c?spec=svn490&r=490#20 [5] http://www.phy.duke.edu/~rgb/General/dieharder.php Joint work with Hannes Frederic Sowa. Cc: Florian Weimer <fweimer@redhat.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| *	random32: move rnd_state to linux/random.h	Daniel Borkmann	2013-11-11	2	-7/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	struct rnd_state got mistakenly pulled into uapi header. It is not used anywhere and does also not belong there! Commit 5960164fde ("lib/random32: export pseudo-random number generator for modules"), the last commit on rnd_state before it got moved to uapi, says: This patch moves the definition of struct rnd_state and the inline __seed() function to linux/random.h. It renames the static __random32() function to prandom32() and exports it for use in modules. Hence, the structure was moved from lib/random32.c to linux/random.h so that it can be used within modules (FCoE-related code in this case), but not from user space. However, it seems to have been mistakenly moved to uapi header through the uapi script. Since no-one should make use of it from the linux headers, move the structure back to the kernel for internal use, so that it can be modified on demand. Joint work with Hannes Frederic Sowa. Cc: Joe Eykholt <jeykholt@cisco.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| *	random32: add prandom_reseed_late() and call when nonblocking pool becomes ↵	Hannes Frederic Sowa	2013-11-11	3	-2/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	initialized The Tausworthe PRNG is initialized at late_initcall time. At that time the entropy pool serving get_random_bytes is not filled sufficiently. This patch adds an additional reseeding step as soon as the nonblocking pool gets marked as initialized. On some machines it might be possible that late_initcall gets called after the pool has been initialized. In this situation we won't reseed again. (A call to prandom_seed_late blocks later invocations of early reseed attempts.) Joint work with Daniel Borkmann. Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| *	random32: add periodic reseeding	Hannes Frederic Sowa	2013-11-11	1	-0/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current Tausworthe PRNG is never reseeded with truly random data after the first attempt in late_initcall. As this PRNG is used for some critical random data as e.g. UDP port randomization we should try better and reseed the PRNG once in a while with truly random data from get_random_bytes(). When we reseed with prandom_seed we now make also sure to throw the first output away. This suffices the reseeding procedure. The delay calculation is based on a proposal from Eric Dumazet. Joint work with Daniel Borkmann. Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| *	random32: fix off-by-one in seeding requirement	Daniel Borkmann	2013-11-11	2	-10/+10
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For properly initialising the Tausworthe generator [1], we have a strict seeding requirement, that is, s1 > 1, s2 > 7, s3 > 15. Commit 697f8d0348 ("random32: seeding improvement") introduced a __seed() function that imposes boundary checks proposed by the errata paper [2] to properly ensure above conditions. However, we're off by one, as the function is implemented as: "return (x < m) ? x + m : x;", and called with __seed(X, 1), __seed(X, 7), __seed(X, 15). Thus, an unwanted seed of 1, 7, 15 would be possible, whereas the lower boundary should actually be of at least 2, 8, 16, just as GSL does. Fix this, as otherwise an initialization with an unwanted seed could have the effect that Tausworthe's PRNG properties cannot not be ensured. Note that this PRNG is not used for cryptography in the kernel. [1] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme.ps [2] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme2.ps Joint work with Hannes Frederic Sowa. Fixes: 697f8d0348a6 ("random32: seeding improvement") Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Florian Weimer <fweimer@redhat.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	PHY: Add RTL8201CP phy_driver to realtek	Jonas Jensen	2013-11-11	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add RTL8201CP phy_driver. Signed-off-by: Jonas Jensen <jonas.jensen@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	xtsonic: add missing platform_set_drvdata() in xtsonic_probe()	Wei Yongjun	2013-11-11	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add missing platform_set_drvdata() in xtsonic_probe(), otherwise calling platform_get_drvdata() in xtsonic_device_remove() may returns NULL. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	macmace: add missing platform_set_drvdata() in mace_probe()	Wei Yongjun	2013-11-11	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add missing platform_set_drvdata() in mace_probe(), otherwise calling platform_get_drvdata() in mac_mace_device_remove() may returns NULL. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ethernet/arc/arc_emac: add missing platform_set_drvdata() in arc_emac_probe()	Wei Yongjun	2013-11-11	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add missing platform_set_drvdata() in arc_emac_probe(), otherwise calling platform_get_drvdata() in arc_emac_remove() may returns NULL. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ipv6: protect for_each_sk_fl_rcu in mem_check with rcu_read_lock_bh	Hannes Frederic Sowa	2013-11-11	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes a suspicious rcu derference warning. Cc: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	vlan: Implement vlan_dev_get_egress_qos_mask as an inline.	David S. Miller	2013-11-11	3	-107/+99
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is to avoid very silly Kconfig dependencies for modules using this routine. Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ixgbe: add warning when max_vfs is out of range.	Jacob Keller	2013-11-11	2	-8/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The max_vfs parameter has a limit of 63 and silently fails (adding 0 vfs) when it is out of range. This patch adds a warning so that the user knows something went wrong. Also, this patch moves the warning in ixgbe_enable_sriov() to where max_vfs is checked, so that even an out of range value will show the deprecated warning. Previously, an out of range parameter didn't even warn the user to use the new sysfs interface instead. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	igb: Update link modes display in ethtool	Carolyn Wyborny	2013-11-11	1	-24/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes multiple problems in the link modes display in ethtool. Newer parts have more complicated methods to determine actual link capabilities. Older parts cannot communicate with their SFP modules. Finally, all the available defines are not displayed by ethtool. This updates the link modes to be as accurate as possible depending on what data is available to the driver at any given time. Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	netfilter: push reasm skb through instead of original frag skbs	Jiri Pirko	2013-11-11	9	-203/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pushing original fragments through causes several problems. For example for matching, frags may not be matched correctly. Take following example: <example> On HOSTA do: ip6tables -I INPUT -p icmpv6 -j DROP ip6tables -I INPUT -p icmpv6 -m icmp6 --icmpv6-type 128 -j ACCEPT and on HOSTB you do: ping6 HOSTA -s2000 (MTU is 1500) Incoming echo requests will be filtered out on HOSTA. This issue does not occur with smaller packets than MTU (where fragmentation does not happen) </example> As was discussed previously, the only correct solution seems to be to use reassembled skb instead of separete frags. Doing this has positive side effects in reducing sk_buff by one pointer (nfct_reasm) and also the reams dances in ipvs and conntrack can be removed. Future plan is to remove net/ipv6/netfilter/nf_conntrack_reasm.c entirely and use code in net/ipv6/reassembly.c instead. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ip6_output: fragment outgoing reassembled skb properly	Jiri Pirko	2013-11-11	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If reassembled packet would fit into outdev MTU, it is not fragmented according the original frag size and it is send as single big packet. The second case is if skb is gso. In that case fragmentation does not happen according to the original frag size. This patch fixes these. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	MAINTAINERS: mv643xx_eth: take over maintainership from Lennart	Sebastian Hesselbarth	2013-11-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Lennart says: "I haven't been able to spend time on mv643xx_eth for a while now, so if you want to take over maintainership, I'd be fine with that." Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Acked-by: Lennert Buytenhek <buytenh@wantstofly.org> Acked-by: Jason Cooper <jason@lakedaemon.net> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net_sched: tbf: support of 64bit rates	Yang Yingliang	2013-11-09	2	-4/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With psched_ratecfg_precompute(), tbf can deal with 64bit rates. Add two new attributes so that tc can use them to break the 32bit limit. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Suggested-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ixgbe: deleting dfwd stations out of order can cause null ptr deref	John Fastabend	2013-11-08	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The number of stations in use is kept in the num_rx_pools counter in the ixgbe_adapter structure. This is in turn used by the queue allocation scheme to determine how many queues are needed to support the number of pools in use with the current feature set. This works as long as the pools are added and destroyed in order because (num_rx_pools * queues_per_pool) is equal to the last queue in use by a pool. But as soon as you delete a pool out of order this is no longer the case. So the above multiplication allocates to few queues and a pool may reference a ring that has not been allocated/initialized. To resolve use the bit mask of in use pools to determine the final pool being used and allocate enough queues so that we don't inadvertently remove its queues. # ip link add link eth2 \ numtxqueues 4 numrxqueues 4 txqueuelen 50 type macvlan # ip link set dev macvlan0 up # ip link add link eth2 \ numtxqueues 4 numrxqueues 4 txqueuelen 50 type macvlan # ip link set dev macvlan1 up # for i in {0..100}; do ip link set dev macvlan0 down; ip link set dev macvlan0 up; done; Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ixgbe: fix build err, num_rx_queues is only available with CONFIG_RPS	John Fastabend	2013-11-08	1	-6/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the recent support for layer 2 hardware acceleration, I added a few references to real_num_rx_queues and num_rx_queues which are only available with CONFIG_RPS. The fix is first to remove unnecessary references to num_rx_queues. Because the hardware offload case is limited to cases where RX queues and TX queues are equal we only need a single check. Then wrap the single case in an ifdef. The patch that introduce this is here, commit a6cc0cfa72e0b6d9f2c8fd858aacc32313c4f272 Author: John Fastabend <john.r.fastabend@intel.com> Date: Wed Nov 6 09:54:46 2013 -0800 net: Add layer 2 hardware acceleration operations for macvlan devices Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ipv6: use rt6_get_dflt_router to get default router in rt6_route_rcv	Duan Jiong	2013-11-08	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As the rfc 4191 said, the Router Preference and Lifetime values in a ::/0 Route Information Option should override the preference and lifetime values in the Router Advertisement header. But when the kernel deals with a ::/0 Route Information Option, the rt6_get_route_info() always return NULL, that means that overriding will not happen, because those default routers were added without flag RTF_ROUTEINFO in rt6_add_dflt_router(). In order to deal with that condition, we should call rt6_get_dflt_router when the prefix length is 0. Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	nfnetlink: do not ack malformed messages	Jiri Benc	2013-11-08	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 0628b123c96d ("netfilter: nfnetlink: add batch support and use it from nf_tables") introduced a bug leading to various crashes in netlink_ack when netlink message with invalid nlmsg_len was sent by an unprivileged user. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>