op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message (Collapse)	Author	Age	Files	Lines
*	inet: fix a UFO regression	Eric Dumazet	2013-11-08	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While testing virtio_net and skb_segment() changes, Hannes reported that UFO was sending wrong frames. It appears this was introduced by a recent commit : 8c3a897bfab1 ("inet: restore gso for vxlan") The old condition to perform IP frag was : tunnel = !!skb->encapsulation; ... if (!tunnel && proto == IPPROTO_UDP) { So the new one should be : udpfrag = !skb->encapsulation && proto == IPPROTO_UDP; ... if (udpfrag) { Initialization of udpfrag must be done before call to ops->callbacks.gso_segment(skb, features), as skb_udp_tunnel_segment() clears skb->encapsulation (We want udpfrag to be true for UFO, false for VXLAN) With help from Alexei Starovoitov Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'pskb_put'	David S. Miller	2013-11-07	5	-32/+31
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Mathias Krause says: ==================== move pskb_put (was: IPsec improvements) This series moves pskb_put() to the core code, making the code duplication in caif obsolete (patches 1 and 2). Patch 3 fixes a few kernel-doc issues. v2 of this series does no longer contain the skb_cow_data() patch and therefore no performance improvements for IPsec. The change is still under discussion, but otherwise independent from the above changes. Please apply! v2: - kernel-doc fixes for pskb_put, as noticed by Ben - dropped skb_cow_data patch as it's still discussed - added a kernel-doc fixes patch (patch 3) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: skbuff - kernel-doc fixes	Mathias Krause	2013-11-07	2	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use "@" to refer to parameters in the kernel-doc description. According to Documentation/kernel-doc-nano-HOWTO.txt "&" shall be used to refer to structures only. Signed-off-by: Mathias Krause <mathias.krause@secunet.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	caif: use pskb_put() instead of reimplementing its functionality	Mathias Krause	2013-11-07	1	-11/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Also remove the warning for fragmented packets -- skb_cow_data() will linearize the buffer, removing all fragments. Signed-off-by: Mathias Krause <mathias.krause@secunet.com> Cc: Dmitry Tarnyagin <dmitry.tarnyagin@lockless.no> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: move pskb_put() to core code	Mathias Krause	2013-11-07	4	-15/+24
\|/ \| \| \| \| \| \| \| \| \| \| \|	This function has usage beside IPsec so move it to the core skbuff code. While doing so, give it some documentation and change its return type to 'unsigned char *' to be in line with skb_put(). Signed-off-by: Mathias Krause <mathias.krause@secunet.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: calxedaxgmac: Fix panic caused by MTU change of active interface	Andreas Herrmann	2013-11-07	1	-6/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Changing MTU size of an xgmac network interface while it is active can cause a panic like skbuff: skb_over_panic: text:c03bc62c len:1090 put:1090 head:edfb6900 data:edfb6942 tail:0xedfb6d84 end:0xedfb6bc0 dev:eth0 ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:126! Internal error: Oops - BUG: 0 [#1] SMP ARM Modules linked in: CPU: 0 PID: 762 Comm: python Tainted: G W 3.10.0-00015-g3e33cd7 #309 task: edcfe000 ti: ed67e000 task.ti: ed67e000 PC is at skb_panic+0x64/0x70 LR is at wake_up_klogd+0x5c/0x68 This happens because xgmac_change_mtu modifies dev->mtu before the network interface is quiesced. And thus there still might be buffers in use which have a buffer size based on the old MTU. To fix this I moved the change of dev->mtu after the call to xgmac_stop. Another modification is required (in xgmac_stop) to ensure that xgmac_xmit is really not called anymore (xgmac_tx_complete might wake up the queue again). I've tested the fix by switching MTU size every second between 600 and 1500 while network traffic was going on. The test box survived a test of several hours (until I've stopped it) whereas w/o this fix above panic occurs after several minutes (at most). Change since v1: - remove call to netif_stop_queue at beginning of xgmac_stop - use netif_tx_disable instead of locking+netif_stop_queue Signed-off-by: Andreas Herrmann <andreas.herrmann@calxeda.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'mlx4'	David S. Miller	2013-11-07	22	-228/+383
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Amir Vadai says: ==================== net/mlx4: Mellanox driver update 07-11-2013 This patchset contains some enhancements and bug fixes for the mlx4_* drivers. Patchset was applied and tested against commit: "9bb8ca8 virtio-net: switch to use XPS to choose txq" ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net/mlx4_en: Datapath structures are allocated per NUMA node	Eugenia Emantayev	2013-11-07	7	-34/+71
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For each RX/TX ring and its CQ, allocation is done on a NUMA node that corresponds to the core that the data structure should operate on. The assumption is that the core number is reflected by the ring index. The affected allocations are the ring/CQ data structures, the TX/RX info and the shared HW/SW buffer. For TX rings, each core has rings of all UPs. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net/mlx4_core: ICM pages are allocated on device NUMA node	Eugenia Emantayev	2013-11-07	3	-12/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is done to optimize FW/HW access to host memory. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net/mlx4_en: Datapath resources allocated dynamically	Eugenia Emantayev	2013-11-07	8	-111/+178
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently all TX/RX rings and completion queues are part of the netdev priv structure and are allocated statically. This patch will change the priv to hold only arrays of pointers and therefore all TX/RX rings and completetion queues will be allocated dynamically. This is in preparation for NUMA aware allocations. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net/mlx4_core: Add immediate activate for VGT->VST->VGT	Rony Efraim	2013-11-07	2	-32/+81
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allow immediate activate of VGT->VST and VST->VGT transitions, without the need of rebinding in mlx4_master_immediate_activate_vlan_qos(). Also in struct res_qp: add qp parameters (vlan_index,fvl,vlan_cntrol..) to the saved set, in order to restore when move to VGT. - Clear at mlx4_RST2INIT_QP_wrapper() - Save at mlx4_INIT2RTR_QP_wrapper() - Restore at mlx4_vf_immed_vlan_work_handler() Update mlx4_vf_immed_vlan_work_handler() to support VGT. Signed-off-by: Rony Efraim <ronye@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net/mlx4_core: Initialize all mailbox buffers to zero before use	Jack Morgenstein	2013-11-07	10	-43/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To guarantee that all unused fields in all FW commands for both inboxes and outboxes are zeroed out, initialize the mailbox buffer to all zeroes. This is especially important for SRIOV comm-channel virtual commands (such as QUERY_FUNC_CAP), where if new fields are added to support new features, the driver can depend on older kernels passing zeroes in these fields. In addition to zeroing out the mailbox buffer at allocation time, all (now unnecessary) calls to memset by the callers of mlx4_alloc_cmd_mailbox() are removed. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net/mlx4_en: Add RFS support in UDP	Eyal Perry	2013-11-07	1	-11/+34
\|/ \| \| \| \| \| \| \|	Modify RFS code to support applying filters for incoming UDP streams. Signed-off-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'macvlan_hwaccel'	David S. Miller	2013-11-07	12	-97/+532
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	John Fastabend says: ==================== l2 hardware accelerated macvlans This patch adds support to offload macvlan net_devices to the hardware. With these patches packets are pushed to the macvlan net_device directly and do not pass through the lower dev. The patches here have made it through multiple iterations each with a slightly different focus. First I tried to push these as a new link type called "VMDQ". The patches shown here, http://comments.gmane.org/gmane.linux.network/237617 Following this implementation I renamed the link type "VSI" and addressed various comments. Finally Neil Horman picked up the patches and integrated the offload into the macvlan code. Here, http://permalink.gmane.org/gmane.linux.network/285658 The attached series is clean-up of his patches, with a few fixes. If folks find this series acceptable there are a few items we can work on next. First broadcast and multicast will use the hardware even for local traffic with this series. It would be best (I think) to use the software path for macvlan to macvlan traffic and save the PCIe bus. This depends on how much you value CPU time vs PCIE bandwidth. This will need another patch series to flush out. Also this series only allows for layer 2 mac forwarding where some hardware supports more interesting forwarding capabilities. Integrating with OVS may be useful here. As always any comments/feedback welcome. My basic I/O test is here but I've also done some link testing, SRIOV/DCB with macvlans and others, Changelog: v2: two fixes to ixgbe when all features DCB, FCoE, SR-IOV are enabled with macvlans. A VMDQ_P() reference should have been accel->pool and do not set the offset of the ring index from dfwd add call. The offset is used by SR-IOV so clearing it can cause SR-IOV quue index's to go sideways. With these fixes testing macvlan's with SRIOV enabled was successful. v3: addressed Neil's comments in ixgbe fixed error path on dfwd_add_station() in ixgbe fixed ixgbe to allow SRIOV and accelerated macvlans to coexist. v4: Dave caught some strange indentation, fixed it here ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ixgbe: enable l2 forwarding acceleration for macvlans	John Fastabend	2013-11-07	4	-89/+443
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that l2 acceleration ops are in place from the prior patch, enable ixgbe to take advantage of these operations. Allow it to allocate queues for a macvlan so that when we transmit a frame, we can do the switching in hardware inside the ixgbe card, rather than in software. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: Add layer 2 hardware acceleration operations for macvlan devices	John Fastabend	2013-11-07	8	-8/+89
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a operations structure that allows a network interface to export the fact that it supports package forwarding in hardware between physical interfaces and other mac layer devices assigned to it (such as macvlans). This operaions structure can be used by virtual mac devices to bypass software switching so that forwarding can be done in hardware more efficiently. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net/mlx4_en: Fixed crash when port type is changed	Amir Vadai	2013-11-07	1	-4/+4
\| \| \| \| \| \| \| \| \|	timecounter_init() was was called only after first potential timecounter_read(). Moved mlx4_en_init_timestamp() before mlx4_en_init_netdev() Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	isdn: icn: NULL dereference printing error message	Dan Carpenter	2013-11-07	1	-2/+1
\| \| \| \| \| \| \| \|	"card2" is NULL here so I have changed it to use "id2" instead of "card2->interface.id". Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: make ndev->irq signed for error handling	Dan Carpenter	2013-11-07	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	There is a bug in cpsw_probe() where we do: ndev->irq = platform_get_irq(pdev, 0); if (ndev->irq < 0) { The problem is that "ndev->irq" is unsigned so the error handling doesn't work. I have changed it to a regular int. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	6lowpan: release device on error path	Dan Carpenter	2013-11-07	1	-1/+3
\| \| \| \| \| \| \| \|	We recently added a new error path and it needs a dev_put(). Fixes: 7adac1ec8198 ('6lowpan: Only make 6lowpan links to IEEE802154 devices') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	RDMA/cma: Set IBoE SL (user-priority) by egress map when using vlans	Eyal Perry	2013-11-07	1	-5/+21
\| \| \| \| \| \| \| \| \| \| \| \| \|	On top of commit 366cddb40 "IB/rdma_cm: TOS <=> UP mapping for IBoE", add support for case vlan egress map is used. When the IBoE session is being set over a vlan, inherit the socket priority to vlan priority mapping which was configured for the vlan device egress map. Signed-off-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net/vlan: Provide read access to the vlan egress map	Eyal Perry	2013-11-07	2	-7/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Provide a method for read-only access to the vlan device egress mapping. Do this by refactoring vlan_dev_get_egress_qos_mask() such that now it receives as an argument the skb priority instead of pointer to the skb. Such an access is needed for the IBoE stack where the control plane goes through the network stack. This is an add-on step on top of commit d4a968658c "net/route: export symbol ip_tos2prio" which allowed the RDMA-CM to use ip_tos2prio. Signed-off-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tg3: avoid double-freeing of rx data memory	Ivan Vecera	2013-11-07	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	If build_skb fails the memory associated with the ring buffer is freed but the ri->data member is not zeroed in this case. This causes a double-free of this memory in tg3_free_rings->... path. The patch moves this block after setting ri->data to NULL. It would be nice to fix this bug also in stable >= v3.4 trees. Cc: Nithin Nayak Sujir <nsujir@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Acked-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	MAINTAINERS: Update bnx2x maintainer	Eilon Greenstein	2013-11-07	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Ariel Elior will take over the bnx2x maintenance. It's been a pleasure! Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Acked-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: x86: bpf: don't forget to free sk_filter (v2)	Andrey Vagin	2013-11-07	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	sk_filter isn't freed if bpf_func is equal to sk_run_filter. This memory leak was introduced by v3.12-rc3-224-gd45ed4a4 "net: fix unsafe set_memory_rw from softirq". Before this patch sk_filter was freed in sk_filter_release_rcu, now it should be freed in bpf_jit_free. Here is output of kmemleak: unreferenced object 0xffff8800b774eab0 (size 128): comm "systemd", pid 1, jiffies 4294669014 (age 124.062s) hex dump (first 32 bytes): 00 00 00 00 0b 00 00 00 20 63 7f b7 00 88 ff ff ........ c...... 60 d4 55 81 ff ff ff ff 30 d9 55 81 ff ff ff ff `.U.....0.U..... backtrace: [<ffffffff816444be>] kmemleak_alloc+0x4e/0xb0 [<ffffffff811845af>] __kmalloc+0xef/0x260 [<ffffffff81534028>] sock_kmalloc+0x38/0x60 [<ffffffff8155d4dd>] sk_attach_filter+0x5d/0x190 [<ffffffff815378a1>] sock_setsockopt+0x991/0x9e0 [<ffffffff81531bd6>] SyS_setsockopt+0xb6/0xd0 [<ffffffff8165f3e9>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff v2: add extra { } after else Fixes: d45ed4a4e33a ("net: fix unsafe set_memory_rw from softirq") Acked-by: Daniel Borkmann <dborkman@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Eric Dumazet <edumazet@google.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrey Vagin <avagin@openvz.org> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'tipc_fragmentation'	David S. Miller	2013-11-07	6	-145/+80
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Erik Hugne says: ==================== tipc: message reassembly using fragment chain We introduce a new reassembly algorithm that improves performance and eliminates the risk of causing out-of-memory situations. v3: -Use skb_try_coalesce, and revert to fraglist if this does not succeed. -Make sure reassembly list head is uncloned. v2: -Rebased on Ying's indentation fix. -Node unlock call in msg_fragmenter case moved from patch #2 to #1. ('continue' with this lock held would cause spinlock recursion if only patch #1 is used) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tipc: reassembly failures should cause link reset	Erik Hugne	2013-11-07	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If appending a received fragment to the pending fragment chain in a unicast link fails, the current code tries to force a retransmission of the fragment by decrementing the 'next received sequence number' field in the link. This is done under the assumption that the failure is caused by an out-of-memory situation, an assumption that does not hold true after the previous patch in this series. A failure to append a fragment can now only be caused by a protocol violation by the sending peer, and it must hence be assumed that it is either malicious or buggy. Either way, the correct behavior is now to reset the link instead of trying to revert its sequence number. So, this is what we do in this commit. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tipc: message reassembly using fragment chain	Erik Hugne	2013-11-07	6	-142/+74
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When the first fragment of a long data data message is received on a link, a reassembly buffer large enough to hold the data from this and all subsequent fragments of the message is allocated. The payload of each new fragment is copied into this buffer upon arrival. When the last fragment is received, the reassembled message is delivered upwards to the port/socket layer. Not only is this an inefficient approach, but it may also cause bursts of reassembly failures in low memory situations. since we may fail to allocate the necessary large buffer in the first place. Furthermore, after 100 subsequent such failures the link will be reset, something that in reality aggravates the situation. To remedy this problem, this patch introduces a different approach. Instead of allocating a big reassembly buffer, we now append the arriving fragments to a reassembly chain on the link, and deliver the whole chain up to the socket layer once the last fragment has been received. This is safe because the retransmission layer of a TIPC link always delivers packets in strict uninterrupted order, to the reassembly layer as to all other upper layers. Hence there can never be more than one fragment chain pending reassembly at any given time in a link, and we can trust (but still verify) that the fragments will be chained up in the correct order. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tipc: don't reroute message fragments	Erik Hugne	2013-11-07	2	-3/+6
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a message fragment is received in a broadcast or unicast link, the reception code will append the fragment payload to a big reassembly buffer through a call to the function tipc_recv_fragm(). However, after the return of that call, the logics goes on and passes the fragment buffer to the function tipc_net_route_msg(), which will simply drop it. This behavior is a remnant from the now obsolete multi-cluster functionality, and has no relevance in the current code base. Although currently harmless, this unnecessary call would be fatal after applying the next patch in this series, which introduces a completely new reassembly algorithm. So we change the code to eliminate the redundant call. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	phy: Add MOXA MDIO driver	Jonas Jensen	2013-11-07	3	-0/+209
\| \| \| \| \| \| \| \| \| \|	The MOXA UC-711X hardware(s) has an ethernet controller that seem to be developed internally. The IC used is "RTL8201CP". This patch adds an MDIO driver which handles the MII bus. Signed-off-by: Jonas Jensen <jonas.jensen@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	bonding: document the new packets_per_slave option	Nikolay Aleksandrov	2013-11-07	1	-0/+9
\| \| \| \| \| \| \| \|	Add new documentation for the packets_per_slave option available for balance-rr mode. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	bonding: extend round-robin mode with packets_per_slave	Nikolay Aleksandrov	2013-11-07	3	-6/+101
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch aims to extend round-robin mode with a new option called packets_per_slave which can have the following values and effects: 0 - choose a random slave 1 (default) - standard round-robin, 1 packet per slave >1 - round-robin when >1 packets have been transmitted per slave The allowed values are between 0 and 65535. This patch also fixes the comment style in bond_xmit_roundrobin(). Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	qeth: avoid buffer overflow in snmp ioctl	Ursula Braun	2013-11-07	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \|	Check user-defined length in snmp ioctl request and allow request only if it fits into a qeth command buffer. Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com> Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com> Reviewed-by: Heiko Carstens <heicars2@linux.vnet.ibm.com> Reported-by: Nico Golde <nico@ngolde.de> Reported-by: Fabian Yamaguchi <fabs@goesec.de> Cc: <stable@vger.kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net:drivers/net: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO	Duan Jiong	2013-11-07	1	-3/+1
\| \| \| \| \| \| \| \|	This patch fixes coccinelle error regarding usage of IS_ERR and PTR_ERR instead of PTR_ERR_OR_ZERO. Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	smsc: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO	Duan Jiong	2013-11-07	1	-3/+1
\| \| \| \| \| \| \| \|	This patch fixes coccinelle error regarding usage of IS_ERR and PTR_ERR instead of PTR_ERR_OR_ZERO. Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	udp: Remove unnecessary semicolon from do{}while (0) macro	Joe Perches	2013-11-07	1	-7/+7
\| \| \| \| \| \| \| \| \|	Just an unnecessary semicolon that should be removed... Whitespace neatening of macro too. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	smsc9420: Use netif_<level>	Joe Perches	2013-11-07	1	-78/+77
\| \| \| \| \| \| \| \| \| \| \|	Use a more standard logging style. Convert smsc_<level> macros to use netif_<level>. Remove unused #define PFX Add pr_fmt and neaten pr_<level> uses. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	jme: Remove unused #define PFX	Joe Perches	2013-11-07	1	-1/+0
\| \| \| \| \| \| \|	It's unused, remove it. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	virtio-net: switch to use XPS to choose txq	Jason Wang	2013-11-05	1	-46/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We used to use a percpu structure vq_index to record the cpu to queue mapping, this is suboptimal since it duplicates the work of XPS and loses all other XPS functionality such as allowing user to configure their own transmission steering strategy. So this patch switches to use XPS and suggest a default mapping when the number of cpus is equal to the number of queues. With XPS support, there's no need for keeping per-cpu vq_index and .ndo_select_queue(), so they were removed also. Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Michael S. Tsirkin <mst@redhat.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv6: drop the judgement in rt6_alloc_cow()	Duan Jiong	2013-11-05	1	-5/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Now rt6_alloc_cow() is only called by ip6_pol_route() when rt->rt6i_flags doesn't contain both RTF_NONEXTHOP and RTF_GATEWAY, and rt->rt6i_flags hasn't been changed in ip6_rt_copy(). So there is no neccessary to judge whether rt->rt6i_flags contains RTF_GATEWAY or not. Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv6: fix headroom calculation in udp6_ufo_fragment	Hannes Frederic Sowa	2013-11-05	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 1e2bd517c108816220f262d7954b697af03b5f9c ("udp6: Fix udp fragmentation for tunnel traffic.") changed the calculation if there is enough space to include a fragment header in the skb from a skb->mac_header dervived one to skb_headroom. Because we already peeled off the skb to transport_header this is wrong. Change this back to check if we have enough room before the mac_header. This fixes a panic Saran Neti reported. He used the tbf scheduler which skb_gso_segments the skb. The offsets get negative and we panic in memcpy because the skb was erroneously not expanded at the head. Reported-by: Saran Neti <Saran.Neti@telus.com> Cc: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: mv643xx_eth: Add missing phy_addr_set in DT mode	Jason Gunthorpe	2013-11-05	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit cc9d4598 'net: mv643xx_eth: use of_phy_connect if phy_node present' made the call to phy_scan optional, if the DT has a link to the phy node. However phy_scan has the side effect of calling phy_addr_set, which writes the phy MDIO address to the ethernet controller. If phy_addr_set is not called, and the bootloader has not set the correct address then the driver will fail to function. Tested on Kirkwood. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Acked-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Tested-by: Arnaud Ebalard <arno@natisbad.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv4: introduce new IP_MTU_DISCOVER mode IP_PMTUDISC_INTERFACE	Hannes Frederic Sowa	2013-11-05	7	-9/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Sockets marked with IP_PMTUDISC_INTERFACE won't do path mtu discovery, their sockets won't accept and install new path mtu information and they will always use the interface mtu for outgoing packets. It is guaranteed that the packet is not fragmented locally. But we won't set the DF-Flag on the outgoing frames. Florian Weimer had the idea to use this flag to ensure DNS servers are never generating outgoing fragments. They may well be fragmented on the path, but the server never stores or usees path mtu values, which could well be forged in an attack. (The root of the problem with path MTU discovery is that there is no reliable way to authenticate ICMP Fragmentation Needed But DF Set messages because they are sent from intermediate routers with their source addresses, and the IMCP payload will not always contain sufficient information to identify a flow.) Recent research in the DNS community showed that it is possible to implement an attack where DNS cache poisoning is feasible by spoofing fragments. This work was done by Amir Herzberg and Haya Shulman: <https://sites.google.com/site/hayashulman/files/fragmentation-poisoning.pdf> This issue was previously discussed among the DNS community, e.g. <http://www.ietf.org/mail-archive/web/dnsext/current/msg01204.html>, without leading to fixes. This patch depends on the patch "ipv4: fix DO and PROBE pmtu mode regarding local fragmentation with UFO/CORK" for the enforcement of the non-fragmentable checks. If other users than ip_append_page/data should use this semantic too, we have to add a new flag to IPCB(skb)->flags to suppress local fragmentation and check for this in ip_finish_output. Many thanks to Florian Weimer for the idea and feedback while implementing this patch. Cc: David S. Miller <davem@davemloft.net> Suggested-by: Florian Weimer <fweimer@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'huawei_cdc_ncm'	David S. Miller	2013-11-05	5	-13/+253
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Bjørn Mork says: ==================== The huawei_cdc_ncm driver. Enrico has been kind enough to let me repost his driver with the changes requested by Oliver Neukum during the last review of this series. The changes I have made from Enricos original v5 series to this version are: v6: - fix to avoid corrupting drvstate->pmcount - fix error return value from huawei_cdc_ncm_suspend() - drop redundant testing for subdriver->suspend during resume - broke a few lines to keep within the 80 columns recommendation - rebased on top of current net-next Enrico's orginal introduction to the v5 series follows below. It explains the background much better than I can. Bjørn [quote Enrico Mioso] So this is a new, revised, edition of the huawei_cdc_ncm.c driver, which supports devices resembling the NCM standard, but using it also as a mean to encapsulate other protocols, as is the case for the Huawei E3131 and E3251 modem devices. Some precisations are needed however - and I encourage discussion on this: and that's why I'm sending this message with a broader CC. Merging those patches might change: - the way Modem Manager interacts with those devices - some regressions might be possible if there are some unknown firmware variants around (Franko?) First of all: I observed the behaviours of two devices. Huawei E3131: this device doesn't accept NDIS setup requests unless they're sent via the embedded AT channel exposed by this driver. So actually we gain funcionality in this case! The second case, is the Huawei E3251: which works with standard NCM driver, still exposing an AT embedded channel. Whith this patch set applied, you gain some funcionality, loosing the ability to catch standard NCM events for now. The device will work in both ways with no problems, but this has to be acknowledged and discussed. Might be we can develop this driver further to change this, when more devices are tested. We where thinking Huawei changed their interfaces on new devices - but probably this driver only works around a nice firmware bug present in E3131, which prevented the modem from being used in NDIS mode. I think committing this is definitely wortth-while, since it will allow for more Huawei devices to be used without serial connection. Some devices like the E3251 also, reports some status information only via the embedded AT channel, at least in my case. Note: I'm not subscribed to any list except the Modem Manager's one, so please CC me, thanks!! [/quote] Enrico Mioso (3): net: cdc_ncm: Export cdc_ncm_{tx,rx}_fixup functions for re-use net: huawei_cdc_ncm: Introduce the huawei_cdc_ncm driver net: cdc_ncm: remove non-standard NCM device IDs ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: cdc_ncm: remove non-standard NCM device IDs	Enrico Mioso	2013-11-05	1	-11/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove device IDs of NCM-like (but not NCM-conformant) devices, that are handled by the huawwei_cdc_ncm driver now. Signed-off-by: Enrico Mioso <mrkiko.rs@gmail.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: huawei_cdc_ncm: Introduce the huawei_cdc_ncm driver	Enrico Mioso	2013-11-05	3	-0/+246
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This driver supports devices using the NCM protocol as an encapsulation layer for other protocols, like the E3131 Huawei 3G modem. This drivers approach was heavily inspired by the qmi_wwan/cdc_mbim approach & code model. Signed-off-by: Enrico Mioso <mrkiko.rs@gmail.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: cdc_ncm: Export cdc_ncm_{tx, rx}_fixup functions for re-use	Enrico Mioso	2013-11-05	2	-2/+7
\|/ \| \| \| \| \| \| \| \| \| \|	Some drivers implementing NCM-like protocols, may re-use those functions, as is the case in the huawei_cdc_ncm driver. Export them via EXPORT_SYMBOL_GPL, in accordance with how other functions have been exported. Signed-off-by: Enrico Mioso <mrkiko.rs@gmail.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv6: remove old conditions on flow label sharing	Florent Fourcot	2013-11-05	1	-33/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The code of flow label in Linux Kernel follows the rules of RFC 1809 (an informational one) for conditions on flow label sharing. There rules are not in the last proposed standard for flow label (RFC 6437), or in the previous one (RFC 3697). Since this code does not follow any current or old standard, we can remove it. With this removal, the ipv6_opt_cmp function is now a dead code and it can be removed too. Changelog to v1: * add justification for the change * remove the condition on IPv6 options [ Remove ipv6_hdr_cmp and it is now unused as well. -DaveM ] Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'for-davem' of ↵	David S. Miller	2013-11-05	186	-4189/+15085
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== Please accept the following pull request intended for the 3.13 tree... I had intended to pass most of these to you as much as two weeks ago. Unfortunately, I failed to account for the effects of bad Internet connections and my own fatique/laziness while traveling. On the bright side, at least these have been baking in linux-next for some time! For the mac80211 bits, Johannes says: "This time I have two fixes for P2P (which requires not using CCK rates) and a workaround for APs with broken WMM information." For the iwlwifi bits, Johannes says: "I have a few fixes for warnings/issues: one from Alex, fixing scan timings, one from Emmanuel fixing a WARN_ON in the DVM driver, one from Stanislaw removing a trigger-happy WARN_ON in the MVM driver and a change from myself to try to recover when the device isn't processing commands quickly." And: "For this round, I have a lot of changes: * power management improvements * BT coexistence improvements/updates * new device support * VHT support * IBSS support (though due to a small bug it requires new firmware) * various other fixes/improvements." For the Bluetooth bits, Gustavo says: "More patches for 3.12, busy times for Bluetooth. More than a 100 commits since the last pull. The bulk of work comes from Johan and Marcel, they are doing fixes and improvements all over the Bluetooth subsystem, as the diffstat can show." For the ath10k and ath6kl bits, Kalle says: "Bartosz added support to ath10k for our 10.x AP firmware branch, which gives us AP specific features and fixes. We still support the main firmware branch as well just like before, ath10k detects runtime what firmware is used. Unfortunately the firmware interface in 10.x branch is somewhat different so there was quite a lot of changes in ath10k for this. Michal and Sujith did some performance improvements in ath10k. Vladimir fixed a compiler warning and Fengguang removed an extra semicolon." For the NFC bits, Samuel says: "It's a fairly big one, with the following highlights: - NFC digital layer implementation: Most NFC chipsets implement the NFC digital layer in firmware, but others have more basic functionalities and expect the host to implement the digital layer. This layer sits below the NFC core. - Sony's port100 support: This is "soft" NFC USB dongle that expects the digital layer to be implemented on the host. This is the first user of our NFC digital stack implementation. - Secure element API: We now provide a netlink API for enabling, disabling and discovering NFC attached (embedded or UICC ones) secure elements. With some userspace help, this allows us to support NFC payments. Only the pn544 driver currently supports that API. - NCI SPI fixes and improvements: In order to support NCI devices over SPI, we fixed and improved our NCI/SPI implementation. The currently most deployed NFC NCI chipset, Broadcom's bcm2079x, supports that mode and we're planning to use our NCI/SPI framework to implement a driver for it. - pn533 fragmentation support in target mode: This was the only missing feature from our pn533 impementation. We now support fragmentation in both Tx and Rx modes, in target mode." On top of all that, brcmfmac and rt2x00 both get the usual flurry of updates. A few other drivers get hit here or there as well. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	Merge branch 'master' of ↵	John W. Linville	2013-11-04	186	-4189/+15085
\| \|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem Conflicts: drivers/net/wireless/brcm80211/brcmfmac/sdio_host.h