Commit message (Author, Age, Files, Lines)
* cfg80211: Fix 160 MHz channels with 80+80 and 160 MHz drivers (Jouni Malinen, 2014-12-12, 1 file, -3/+6)
|
| The VHT supported channel width field is a two-bit integer, not a bitfield. cfg80211_chandef_usable() was interpreting it incorrectly and ended up rejecting the 160 MHz channel width if the driver indicated support for both 160 and 80+80 MHz channels.
|
| Cc: stable@vger.kernel.org (3.16+)
| Fixes: 3d9d1d6656a73 ("nl80211/cfg80211: support VHT channel configuration") (however, no real drivers had 160 MHz support until 3.16)
| Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
| Signed-off-by: Johannes Berg <johannes.berg@intel.com>
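The difference between the two readings of the field can be sketched in plain C. This is a hedged illustration, not the kernel patch: the constant and function names are local stand-ins, and the values assume the two-bit field sits at bits 2-3 of the VHT capability word, with field value 1 meaning 160 MHz only and field value 2 meaning both 160 and 80+80 MHz.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins (assumed values) for the VHT "supported channel width" field. */
#define VHT_WIDTH_MASK       0x0000000Cu /* two-bit field at bits 2-3 */
#define VHT_WIDTH_160        0x00000004u /* field value 1: 160 MHz */
#define VHT_WIDTH_160_80P80  0x00000008u /* field value 2: 160 and 80+80 MHz */

/* Buggy reading: treats the field as independent flag bits. */
static bool supports_160_as_bitfield(uint32_t cap)
{
    return (cap & VHT_WIDTH_160) != 0;
}

/* Integer reading: extract the field, then compare whole values. */
static bool supports_160_as_integer(uint32_t cap)
{
    uint32_t width = cap & VHT_WIDTH_MASK;

    return width == VHT_WIDTH_160 || width == VHT_WIDTH_160_80P80;
}
```

With field value 2, the bitfield reading sees bit 2 clear and rejects 160 MHz, while the integer reading accepts it.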
* mac80211: fix multicast LED blinking and counter (Andreas Müller, 2014-12-12, 1 file, -5/+6)
|
| As multicast-frames can't be fragmented, "dot11MulticastReceivedFrameCount" stopped being incremented after the use-after-free fix. Furthermore, the RX-LED will be triggered by every multicast frame (which wouldn't happen before) which wouldn't allow the LED to rest at all.
|
| Fixes https://bugzilla.kernel.org/show_bug.cgi?id=89431 which also had the patch.
|
| Cc: stable@vger.kernel.org
| Fixes: b8fff407a180 ("mac80211: fix use-after-free in defragmentation")
| Signed-off-by: Andreas Müller <goo@stapelspeicher.org>
| [rewrite commit message]
| Signed-off-by: Johannes Berg <johannes.berg@intel.com>
* mac80211: avoid using uninitialized stack data (Jes Sorensen, 2014-12-12, 1 file, -0/+1)
|
| Avoid a case where we would access uninitialized stack data if the AP advertises HT support without 40MHz channel support.
|
| Cc: stable@vger.kernel.org
| Fixes: f3000e1b43f1 ("mac80211: fix broken use of VHT/20Mhz with some APs")
| Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
| Signed-off-by: Johannes Berg <johannes.berg@intel.com>
* r8169:update rtl8168g pcie ephy parameter (Chun-Hao Lin, 2014-12-11, 1 file, -3/+21)
|
| Add ephy parameters for rtl8168g.
|
| Also rename the common rtl8168g function from "rtl_hw_start_8168g_1" to "rtl_hw_start_8168g". The function "rtl_hw_start_8168g_1" is now used for setting rtl8168g hardware parameters.
|
| The hardware parameter changes are explained below.
|
| rtl8168g may erroneously judge the PCIe signal quality and show the error bit in PCI configuration space when in PCIe low power mode. The following ephy parameters address this issue.
|     { 0x00, 0x0000, 0x0008 }
|     { 0x0c, 0x37d0, 0x0820 }
|     { 0x1e, 0x0000, 0x0001 }
|
| rtl8168g may return to PCIe L0 from the PCIe L0s low power mode too slowly. The following ephy parameter addresses this issue.
|     { 0x19, 0x8000, 0x0000 }
|
| Signed-off-by: Chunhao Lin <hau@realtek.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
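One way to read the parameter triples above is as read-modify-write instructions. The sketch below assumes each triple means { register offset, bits to clear, bits to set }; the commit message does not spell this out, so the struct layout, field names, and helper are hypothetical.

```c
#include <stdint.h>

/* Hypothetical layout of one table entry: { offset, mask, bits }. */
struct ephy_entry {
    uint16_t offset; /* ephy register index */
    uint16_t mask;   /* bits cleared before writing (assumption) */
    uint16_t bits;   /* bits set afterwards (assumption) */
};

/* The four entries quoted in the commit message. */
static const struct ephy_entry rtl8168g_entries[] = {
    { 0x00, 0x0000, 0x0008 },
    { 0x0c, 0x37d0, 0x0820 },
    { 0x1e, 0x0000, 0x0001 },
    { 0x19, 0x8000, 0x0000 },
};

/* Compute the new register value from the old one: clear mask, then set bits. */
static uint16_t ephy_update(uint16_t old, struct ephy_entry e)
{
    return (uint16_t)((old & ~e.mask) | e.bits);
}
```

Under this reading, the { 0x19, 0x8000, 0x0000 } entry only clears bit 15 of register 0x19 and leaves every other bit untouched.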
* net: dsa: bcm_sf2: force link for all fixed PHY devices (Florian Fainelli, 2014-12-11, 1 file, -10/+13)
|
| For ports of the switch that we define as "fixed PHYs" such as MoCA, we would have our Port 7 special handling that would allow us to assert the link status indication.
|
| For other ports, e.g. RGMII_1 connected to a cable modem, we would rely on whatever the bootloader has left configured, which is a bad assumption to make; we really need to force the link status indication here.
|
| Fixes: 246d7f773c13 ("net: dsa: add Broadcom SF2 switch driver")
| Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'dma_mb' (David S. Miller, 2014-12-11, 18 files, -179/+258)
|\
| | Alexander Duyck says:
| |
| | ====================
| | arch: Add lightweight memory barriers for coherent memory access
| |
| | These patches introduce two new primitives for synchronizing cache coherent memory writes and reads. These two new primitives are:
| |
| |     dma_rmb()
| |     dma_wmb()
| |
| | The first patch cleans up some unnecessary overhead related to the definition of read_barrier_depends, smp_read_barrier_depends, and comments related to the barrier.
| |
| | The second patch adds the primitives for the applicable architectures and asm-generic.
| |
| | The third patch adds the barriers to r8169, which turns out to be a good example of where the new barriers might be useful since the driver uses full rmb()/wmb() barriers to order accesses to the descriptors and the DescOwn bit.
| |
| | The fourth patch adds support for dma_rmb() to the Intel fm10k, igb, and ixgbe drivers. Testing with the ixgbe driver has shown a processing time reduction of at least 7ns per 64B frame on a Core i7-4930K.
| |
| | This patch series is essentially the v7 for:
| | v4-7: Add lightweight memory barriers for coherent memory access
| | v3: Add lightweight memory barriers fast_rmb() and fast_wmb()
| | v2: Introduce load_acquire() and store_release()
| | v1: Introduce read_acquire()
| |
| | The key changes in this patch series versus the earlier patches are:
| | v7 resubmit:
| |   - Added Acked-by: Ben Herrenschmidt from v5 to dma_rmb/wmb patch
| |   - No code changes from previous set, still applies cleanly and builds.
| | v7:
| |   - Dropped test/debug patch that was accidentally slipped in
| | v6:
| |   - Replaced "memory based device I/O" with "consistent memory" in docs
| |   - Added reference to DMA-API.txt to explain consistent memory
| | v5:
| |   - Renamed barriers dma_rmb and dma_wmb
| |   - Undid smp_wmb changes in x86 and PowerPC
| |   - Defined smp_rmb as __lwsync for SMP case on PowerPC
| | v4:
| |   - Renamed barriers coherent_rmb and coherent_wmb
| |   - Added smp_lwsync for use in smp_load_acquire/smp_store_release
| | v3:
| |   - Moved away from acquire()/store() and instead focused on barriers
| |   - Added cleanup of read_barrier_depends
| |   - Added change in r8169 to fix cur_tx/DescOwn ordering
| |   - Simplified changes to just replacing/moving barriers in r8169
| |   - Added update to documentation with code example
| | v2:
| |   - Renamed read_acquire() to be consistent with smp_load_acquire()
| |   - Changed barrier used to be consistent with smp_load_acquire()
| |   - Updated PowerPC code to use __lwsync based on IBM article
| |   - Added store_release() as this is a viable use case for drivers
| |   - Added r8169 patch which is able to fully use primitives
| |   - Added fm10k/igb/ixgbe patch which is able to test performance
| | ====================
| |
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * fm10k/igb/ixgbe: Use dma_rmb on Rx descriptor reads (Alexander Duyck, 2014-12-11, 3 files, -11/+10)
| |
| | This change makes it so that dma_rmb is used when reading the Rx descriptor. The advantage of dma_rmb is that it allows for a much lower cost barrier on x86, powerpc, arm, and arm64 architectures than a traditional memory barrier when dealing with reads that only have to synchronize to coherent memory.
| |
| | In addition I have updated the code so that it checks whether any bits have been set instead of just the DD bit. Since the DD bit will always be set as part of a descriptor write-back, we only need to check for a non-zero value at that memory location rather than for any specific bit. This allows the code itself to appear much cleaner and allows the compiler more room to optimize.
| |
| | Cc: Matthew Vick <matthew.vick@intel.com>
| | Cc: Don Skidmore <donald.c.skidmore@intel.com>
| | Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| | Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * r8169: Use dma_rmb() and dma_wmb() for DescOwn checks (Alexander Duyck, 2014-12-11, 1 file, -8/+21)
| |
| | The r8169 driver uses a pair of wmb() calls when setting up the descriptor rings. The first is to synchronize the descriptor data with the descriptor status, and the second is to synchronize the descriptor status with the use of the MMIO doorbell to notify the device that descriptors are ready. This can come at a heavy price on some systems, and is not really necessary on systems such as x86 as a simple barrier() would suffice to order store/store accesses. As such we can replace the first memory barrier with dma_wmb() to reduce the cost for these accesses.
| |
| | In addition the r8169 driver uses an rmb() to prevent compiler optimization in the cleanup paths; however, by moving the barrier down a few lines and replacing it with a dma_rmb() we should be able to use it to guarantee descriptor accesses do not occur until the device has updated the DescOwn bit from its end.
| |
| | One last change I made is to move the update of cur_tx in the xmit path to after the wmb. This way we can guarantee the device and all CPUs should see the DescOwn update before they see the cur_tx value update.
| |
| | Cc: Realtek linux nic maintainers <nic_swsd@realtek.com>
| | Cc: Francois Romieu <romieu@fr.zoreil.com>
| | Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
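The ordering described for the descriptor data and the DescOwn bit can be imitated in user space. The following is only an analogy under stated assumptions: C11 fences stand in for dma_wmb() and dma_rmb(), which are kernel primitives with their own semantics, and the descriptor layout, the DESC_OWN value, and all function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DESC_OWN 0x80000000u /* hypothetical "owned by device" bit */

struct desc {
    uint64_t addr;           /* descriptor data */
    _Atomic uint32_t status; /* ownership bit plus length */
};

/* Hand a descriptor to the device: data first, then status. */
static void desc_publish(struct desc *d, uint64_t addr, uint32_t len)
{
    d->addr = addr;
    atomic_thread_fence(memory_order_release); /* stand-in for dma_wmb() */
    atomic_store_explicit(&d->status, DESC_OWN | len, memory_order_relaxed);
}

/* Reclaim a descriptor: check ownership first, then read the data. */
static bool desc_reclaim(struct desc *d, uint64_t *addr)
{
    uint32_t status = atomic_load_explicit(&d->status, memory_order_relaxed);

    if (status & DESC_OWN)
        return false; /* device still owns it */
    atomic_thread_fence(memory_order_acquire); /* stand-in for dma_rmb() */
    *addr = d->addr;
    return true;
}

/* Single-threaded walk-through; the "device" write-back is simulated. */
static bool desc_demo(void)
{
    struct desc d = { 0 };
    uint64_t addr = 0;

    desc_publish(&d, 0x1000, 64);
    if (desc_reclaim(&d, &addr))
        return false; /* must still be owned by the device */
    atomic_store_explicit(&d.status, 64, memory_order_release);
    return desc_reclaim(&d, &addr) && addr == 0x1000;
}
```

The point of the placement is that the fence sits between the status access and the data access on both sides, which is the same shape the commit message describes for the first wmb() and the relocated rmb().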
| * arch: Add lightweight memory barriers dma_rmb() and dma_wmb() (Alexander Duyck, 2014-12-11, 12 files, -26/+99)
| |
| | There are a number of situations where the mandatory barriers rmb() and wmb() are used to order memory/memory operations in the device drivers and those barriers are much heavier than they actually need to be. For example in the case of PowerPC wmb() calls the heavy-weight sync instruction when for coherent memory operations all that is really needed is an lwsync or eieio instruction.
| |
| | This commit adds a coherent only version of the mandatory memory barriers rmb() and wmb(). In most cases this should result in the barrier being the same as the SMP barriers for the SMP case, however in some cases we use a barrier that is somewhere in between rmb() and smp_rmb(). For example on ARM the rmb barriers break down as follows:
| |
| |   Barrier    Call      Explanation
| |   ---------  --------  ----------------------------------
| |   rmb()      dsb()     Data synchronization barrier - system
| |   dma_rmb()  dmb(osh)  data memory barrier - outer shareable
| |   smp_rmb()  dmb(ish)  data memory barrier - inner shareable
| |
| | These new barriers are not as safe as the standard rmb() and wmb(). Specifically they do not guarantee ordering between coherent and incoherent memories. The primary use case for these would be to enforce ordering of reads and writes when accessing coherent memory that is shared between the CPU and a device.
| |
| | It may also be noted that there is no dma_mb(). Most architectures don't provide a good mechanism for performing a coherent only full barrier without resorting to the same mechanism used in mb(). As such there isn't much to be gained in trying to define such a function.
| |
| | Cc: Frederic Weisbecker <fweisbec@gmail.com>
| | Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
| | Cc: Michael Ellerman <michael@ellerman.id.au>
| | Cc: Michael Neuling <mikey@neuling.org>
| | Cc: Russell King <linux@arm.linux.org.uk>
| | Cc: Geert Uytterhoeven <geert@linux-m68k.org>
| | Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
| | Cc: Linus Torvalds <torvalds@linux-foundation.org>
| | Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
| | Cc: Tony Luck <tony.luck@intel.com>
| | Cc: Oleg Nesterov <oleg@redhat.com>
| | Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
| | Cc: Peter Zijlstra <peterz@infradead.org>
| | Cc: Ingo Molnar <mingo@kernel.org>
| | Cc: David Miller <davem@davemloft.net>
| | Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| | Acked-by: Will Deacon <will.deacon@arm.com>
| | Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * arch: Cleanup read_barrier_depends() and comments (Alexander Duyck, 2014-12-11, 10 files, -135/+129)
|/
| This patch is meant to cleanup the handling of read_barrier_depends and smp_read_barrier_depends. In multiple spots in the kernel headers read_barrier_depends is defined as "do {} while (0)", however we then go into the SMP vs non-SMP sections and have the SMP version reference read_barrier_depends, and the non-SMP define it as yet another empty do/while.
|
| With this commit I went through and cleaned out the duplicate definitions and reduced the number of definitions down to 2 per header. In addition I moved the 50-line comments for the macro from the x86 and mips headers that defined it as an empty do/while to those that were actually defining the macro, alpha and blackfin.
|
| Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'dsa' (David S. Miller, 2014-12-11, 1 file, -3/+13)
|\
| | Florian Fainelli says:
| |
| | ====================
| | net: dsa: two small bug fixes
| |
| | Here are two small fixes for the DSA slave interface creation code:
| |
| | - first patch fixes a null pointer de-reference with an invalid PHY device pointer while calling phy_connect_direct()
| |
| | - second patch propagates the dsa_slave_phy_setup() error code up to its caller: dsa_slave_create
| | ====================
| |
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: dsa: propagate error code from dsa_slave_phy_setup (Florian Fainelli, 2014-12-11, 1 file, -4/+11)
| |
| | In case we cannot attach to our slave netdevice PHY, error out and propagate that error up to the caller: dsa_slave_create().
| |
| | Fixes: 0d8bcdd383b8 ("net: dsa: allow for more complex PHY setups")
| | Signed-off-by: Andrey Volkov <andrey.volkov@nexvision.fr>
| | Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: dsa: handle non-existing PHYs on switch internal bus (Florian Fainelli, 2014-12-11, 1 file, -0/+3)
|/
| In case there is no PHY at the designated address on the internal switch, we would basically de-reference a null pointer here:
|
|     dsa_slave_phy_setup(...)
|     {
|         p->phy = ds->slave_mii_bus->phy_map[p->port];
|         phy_connect_direct(slave_dev, p->phy, dsa_slave_adjust_link,
|                                       ^------
|
| This can be triggered when the platform configuration (platform_data or Device Tree) indicates there should be a PHY device at this address, but the HW is non-responsive, such that we cannot attach a PHY device at this specific location.
|
| Fix this by checking the return value prior to calling phy_connect_direct().
|
| CC: Andrew Lunn <andrew@lunn.ch>
| Fixes: b31f65fb4383 ("net: dsa: slave: Fix autoneg for phys on switch MDIO bus")
| Reported-by: Brian Norris <computersforpeace@gmail.com>
| Signed-off-by: Andrey Volkov <andrey.volkov@nexvision.fr>
| Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
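The two DSA fixes follow a common pattern: check a looked-up pointer before using it, and pass the resulting error up the call chain. A minimal user-space sketch of that pattern follows; every type and function here is a hypothetical stand-in (not the kernel's structures), and the choice of -ENODEV is an assumption.

```c
#include <errno.h>
#include <stddef.h>

#define PHY_SLOTS 32

/* Hypothetical stand-ins for the PHY device and the switch-internal bus. */
struct phy_dev { int addr; };
struct slave_bus { struct phy_dev *phy_map[PHY_SLOTS]; };

/* Stub for the connect step; it would crash on NULL without the check below. */
static int phy_connect_stub(struct phy_dev *phy)
{
    return phy->addr >= 0 ? 0 : -EINVAL;
}

static int slave_phy_setup(struct slave_bus *bus, unsigned int port)
{
    struct phy_dev *phy;

    if (port >= PHY_SLOTS)
        return -EINVAL;
    phy = bus->phy_map[port];
    if (!phy)
        return -ENODEV; /* no PHY at this address: bail out before connecting */
    return phy_connect_stub(phy);
}

/* The caller propagates the setup error instead of ignoring it. */
static int slave_create(struct slave_bus *bus, unsigned int port)
{
    int ret = slave_phy_setup(bus, port);

    if (ret)
        return ret;
    return 0;
}

/* Usage: an empty slot yields an error, a populated slot succeeds. */
static int demo_create(int populated)
{
    struct phy_dev phy = { .addr = 1 };
    struct slave_bus bus = { { NULL } };

    if (populated)
        bus.phy_map[1] = &phy;
    return slave_create(&bus, 1);
}
```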
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next (Linus Torvalds, 2014-12-11, 1336 files, -29147/+70816)
|\
| | Pull networking updates from David Miller:
| |
| | 1) New offloading infrastructure and example 'rocker' driver for offloading of switching and routing to hardware. This work was done by a large group of dedicated individuals, not limited to: Scott Feldman, Jiri Pirko, Thomas Graf, John Fastabend, Jamal Hadi Salim, Andy Gospodarek, Florian Fainelli, Roopa Prabhu
| | 2) Start making the networking operate on IOV iterators instead of modifying iov objects in-situ during transfers. Thanks to Al Viro and Herbert Xu.
| | 3) A set of new netlink interfaces for the TIPC stack, from Richard Alpe.
| | 4) Remove unnecessary looping during ipv6 routing lookups, from Martin KaFai Lau.
| | 5) Add PAUSE frame generation support to gianfar driver, from Matei Pavaluca.
| | 6) Allow for larger reordering levels in TCP, which are easily achievable in the real world right now, from Eric Dumazet.
| | 7) Add a variant of napi_schedule that doesn't need to disable cpu interrupts, from Eric Dumazet.
| | 8) Use a doubly linked list to optimize neigh_parms_release(), from Nicolas Dichtel.
| | 9) Various enhancements to the kernel BPF verifier, and allow eBPF programs to actually be attached to sockets. From Alexei Starovoitov.
| | 10) Support TSO/LSO in sunvnet driver, from David L Stevens.
| | 11) Allow controlling ECN usage via routing metrics, from Florian Westphal.
| | 12) Remote checksum offload, from Tom Herbert.
| | 13) Add split-header receive, BQL, and xmit_more support to amd-xgbe driver, from Thomas Lendacky.
| | 14) Add MPLS support to openvswitch, from Simon Horman.
| | 15) Support wildcard tunnel endpoints in ipv6 tunnels, from Steffen Klassert.
| | 16) Do gro flushes on a per-device basis using a timer, from Eric Dumazet. This tries to resolve the conflicting goals between the desired handling of bulk vs. RPC-like traffic.
| | 17) Allow userspace to ask for the CPU on which a packet was received/steered, via SO_INCOMING_CPU. From Eric Dumazet.
| | 18) Limit GSO packets to half the current congestion window, from Eric Dumazet.
| | 19) Add a generic helper so that all drivers set their RSS keys in a consistent way, from Eric Dumazet.
| | 20) Add xmit_more support to enic driver, from Govindarajulu Varadarajan.
| | 21) Add VLAN packet scheduler action, from Jiri Pirko.
| | 22) Support configurable RSS hash functions via ethtool, from Eyal Perry.
| |
| | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1820 commits)
| |   Fix race condition between vxlan_sock_add and vxlan_sock_release
| |   net/macb: fix compilation warning for print_hex_dump() called with skb->mac_header
| |   net/mlx4: Add support for A0 steering
| |   net/mlx4: Refactor QUERY_PORT
| |   net/mlx4_core: Add explicit error message when rule doesn't meet configuration
| |   net/mlx4: Add A0 hybrid steering
| |   net/mlx4: Add mlx4_bitmap zone allocator
| |   net/mlx4: Add a check if there are too many reserved QPs
| |   net/mlx4: Change QP allocation scheme
| |   net/mlx4_core: Use tasklet for user-space CQ completion events
| |   net/mlx4_core: Mask out host side virtualization features for guests
| |   net/mlx4_en: Set csum level for encapsulated packets
| |   be2net: Export tunnel offloads only when a VxLAN tunnel is created
| |   gianfar: Fix dma check map error when DMA_API_DEBUG is enabled
| |   cxgb4/csiostor: Don't use MASTER_MUST for fw_hello call
| |   net: fec: only enable mdio interrupt before phy device link up
| |   net: fec: clear all interrupt events to support i.MX6SX
| |   net: fec: reset fep link status in suspend function
| |   net: sock: fix access via invalid file descriptor
| |   net: introduce helper macro for_each_cmsghdr
| |   ...
| * Fix race condition between vxlan_sock_add and vxlan_sock_release (Marcelo Leitner, 2014-12-11, 1 file, -7/+3)
| |
| | Currently, when trying to reuse a socket, vxlan_sock_add will grab vn->sock_lock, locate a reusable socket, inc refcount and release vn->sock_lock.
| |
| | But vxlan_sock_release() will first decrement refcount, and then grab that lock. refcnt operations are atomic but as currently we have deferred works which hold vs->refcnt each, this might happen, leading to a use after free (especially after vxlan_igmp_leave):
| |
| |   CPU 1                            CPU 2
| |
| |   deferred work                    vxlan_sock_add
| |     ...                              ...
| |                                      spin_lock(&vn->sock_lock)
| |                                      vs = vxlan_find_sock();
| |     vxlan_sock_release
| |       dec vs->refcnt, reaches 0
| |       spin_lock(&vn->sock_lock)
| |                                      vxlan_sock_hold(vs), refcnt=1
| |                                      spin_unlock(&vn->sock_lock)
| |       hlist_del_rcu(&vs->hlist);
| |       vxlan_notify_del_rx_port(vs)
| |       spin_unlock(&vn->sock_lock)
| |
| | So when we look for a reusable socket, we check if it wasn't freed already before reusing it.
| |
| | Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
| | Fixes: 7c47cedf43a8b3 ("vxlan: move IGMP join/leave to work queue")
| | Signed-off-by: David S. Miller <davem@davemloft.net>
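Checking "if it wasn't freed already" before reuse amounts to taking a reference only when the count has not yet dropped to zero. The sketch below shows that idea with C11 atomics in user space; it is an illustration under that assumption, not the vxlan code, and the function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Take a reference only if the object is still live (count != 0). */
static bool ref_get_unless_zero(atomic_int *refcnt)
{
    int old = atomic_load(refcnt);

    while (old != 0) {
        /* On failure, 'old' is reloaded with the current value. */
        if (atomic_compare_exchange_weak(refcnt, &old, old + 1))
            return true;
    }
    return false; /* already released: the caller must not reuse the object */
}

/* Usage: returns the new count, or -1 when the reference was refused. */
static int ref_demo(int initial)
{
    atomic_int refcnt;

    atomic_init(&refcnt, initial);
    if (!ref_get_unless_zero(&refcnt))
        return -1;
    return atomic_load(&refcnt);
}
```

A plain increment would turn a count of 0 back into 1, which is the refcnt=1 step on CPU 2 in the diagram above; the conditional increment refuses that transition.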
| * net/macb: fix compilation warning for print_hex_dump() called with skb->mac_header (Cyrille Pitchen, 2014-12-11, 1 file, -1/+1)
| |
| | Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch 'mlx4-next' (David S. Miller, 2014-12-11, 18 files, -170/+1342)
| |\
| | | Or Gerlitz says:
| | |
| | | ====================
| | | mlx4 driver update
| | |
| | | This series from Matan, Jenny, Dotan and myself is mostly about adding support for a new performance optimized flow steering mode (patches 4-10).
| | |
| | | The first two patches are small fixes (one for VXLAN and one for SRIOV), and the third patch is a fix to avoid a hard-lockup situation when many (hundreds of) processes holding user-space QPs/CQs get events.
| | |
| | | Matan and Or.
| | | ====================
| | |
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net/mlx4: Add support for A0 steering (Matan Barak, 2014-12-11, 7 files, -19/+191)
| | |
| | | Add the required firmware commands for A0 steering and a way to enable that. The firmware support focuses on INIT_HCA, QUERY_HCA, QUERY_PORT, QUERY_DEV_CAP and QUERY_FUNC_CAP commands. Those commands are used to configure and query the device.
| | |
| | | The different A0 DMFS (steering) modes are:
| | |
| | | Static - optimized performance, but flow steering rules are limited. This mode should be chosen explicitly by the user in order to be used.
| | |
| | | Dynamic - this mode should be explicitly chosen by the user. In this mode, the FW works in optimized steering mode as long as it can and afterwards automatically drops to classic (full) DMFS.
| | |
| | | Disable - this mode should be explicitly chosen by the user. The user instructs the system not to use optimized steering, even if the FW supports Dynamic A0 DMFS (and thus will be able to use optimized steering in Default A0 DMFS mode).
| | |
| | | Default - this mode is implicitly chosen. In this mode, if the FW supports Dynamic A0 DMFS, it'll work in this mode. Otherwise, it'll work in Disable A0 DMFS mode.
| | |
| | | Under SRIOV configuration, when the A0 steering mode is enabled, older guest VF drivers that aren't using the RX QP allocation flag (MLX4_RESERVE_A0_QP) will get a QP from the general range and fail when attempting to register a steering rule. To avoid that, the PF context behaviour is changed once in A0 static mode, to require support for the allocation flag in VF drivers too.
| | |
| | | In order to enable A0 steering, we use the log_num_mgm_entry_size param. If the value of the parameter is not positive, we treat the absolute value of log_num_mgm_entry_size as a bit field. Setting bit 2 of this bit field enables static A0 steering.
| | |
| | | Signed-off-by: Matan Barak <matanb@mellanox.com>
| | | Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
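The parameter decoding described above can be sketched as a small pure function. This is an illustration only: the macro and function names are hypothetical, and "bit 2" is assumed to be zero-indexed, that is, the value 1 << 2.

```c
#include <stdbool.h>

/* Assumption: "bit 2" of the bit field is zero-indexed, i.e. value 4. */
#define A0_STATIC_BIT (1u << 2)

/*
 * Decode a log_num_mgm_entry_size-style parameter: positive values are a
 * plain size, non-positive values carry flags in their absolute value.
 */
static bool a0_static_requested(int param)
{
    unsigned int flags;

    if (param > 0)
        return false;
    flags = 0u - (unsigned int)param; /* absolute value without signed overflow */
    return (flags & A0_STATIC_BIT) != 0;
}
```

For example, a value of -4 would request static A0 steering under these assumptions, while 4 is treated as an ordinary positive size.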
| | * net/mlx4: Refactor QUERY_PORT (Matan Barak, 2014-12-11, 3 files, -95/+154)
| | |
| | | Currently QUERY_PORT is done as a part of the QUERY_DEV_CAP firmware command. Since we would like to use it without querying all device capabilities, extract this part to be a function of its own.
| | |
| | | Signed-off-by: Matan Barak <matanb@mellanox.com>
| | | Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net/mlx4_core: Add explicit error message when rule doesn't meet configuration (Matan Barak, 2014-12-11, 1 file, -3/+18)
| | |
| | | When a given flow steering rule is invalid with respect to the current steering configuration, print the correct error message to the system log.
| | |
| | | Signed-off-by: Matan Barak <matanb@mellanox.com>
| | | Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net/mlx4: Add A0 hybrid steering (Matan Barak, 2014-12-11, 8 files, -25/+300)
| | |
| | | A0 hybrid steering is a form of high performance flow steering. By using this mode, mlx4 cards use a fast limited table based steering, in order to enable fast steering of unicast packets to a QP.
| | |
| | | In order to implement A0 hybrid steering we allocate resources from different zones:
| | | (1) General range
| | | (2) Special MAC-assigned QPs [RSS, Raw-Ethernet]; each has its own region.
| | |
| | | When we create an RSS QP or a raw ethernet (A0 steerable and BF ready) QP, we try hard to allocate the QP from range (2). Otherwise, we try hard not to allocate from this range. However, when the system is pushed to its limits and one needs every resource, the allocator uses every region it can.
| | |
| | | Meaning, when we run out of raw-eth QPs, the allocator allocates from the general range (and the special-A0 area is no longer active). If we run out of RSS QPs, the mechanism tries to allocate from the raw-eth QP zone. If that is also exhausted, the allocator will allocate from the general range (and the A0 region is no longer active).
| | |
| | | Note that if a raw-eth QP is allocated from the general range, it attempts to allocate the range such that bits 6 and 7 (blueflame bits) in the QP number are not set.
| | |
| | | When the feature is used in SRIOV, the VF has to notify the PF what kind of QP attributes it needs. In order to do that, along with the "Eth QP blueflame" bit, we reserve a new "A0 steerable QP". According to the combination of these bits, the PF tries to allocate a suitable QP.
| | |
| | | In order to maintain backward compatibility (with older PFs), the PF notifies which QP attributes it supports via the QUERY_FUNC_CAP command.
| | |
| | | Signed-off-by: Matan Barak <matanb@mellanox.com>
| | | Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net/mlx4: Add mlx4_bitmap zone allocator (Matan Barak, 2014-12-11, 2 files, -0/+451)
| | |
| | | The zone allocator is a mechanism which manages a few mlx4_bitmaps. When allocating a resource, the user indicates the desired zone from which this resource should be allocated. If possible, the resource will be allocated from this zone. Otherwise, the resource will be allocated from a less-than, equal-to, or higher-than priority zone, according to the desired zone's properties, in that respective allocation order.
| | |
| | | Signed-off-by: Matan Barak <matanb@mellanox.com>
| | | Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net/mlx4: Add a check if there are too many reserved QPs (Dotan Barak, 2014-12-11, 1 file, -1/+7)
| | |
| | | The number of reserved QPs is affected by both the firmware's and the driver's requirements. This patch adds a check that validates that this number is indeed feasible.
| | |
| | | Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
| | | Signed-off-by: Matan Barak <matanb@mellanox.com>
| | | Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net/mlx4: Change QP allocation scheme (Eugenia Emantayev, 2014-12-11, 14 files, -38/+137)
| | |
| | | When using BF (Blue-Flame), the QPN overrides the VLAN, CV, and SV fields in the WQE. Thus, BF may only be used for QPNs with bits 6,7 unset.
| | |
| | | The current Ethernet driver code reserves a Tx QP range with 256b alignment.
| | |
| | | This is wrong because if there are more than 64 Tx QPs in use, QPNs >= base + 64 will have bits 6/7 set.
| | |
| | | This problem is not specific to the Ethernet driver; any entity that tries to reserve more than 64 BF-enabled QPs should fail. Also, using ranges is not necessary here and is wasteful.
| | |
| | | The new mechanism introduced here will support reservation for "Eth QPs eligible for BF" for all drivers: bare-metal, multi-PF, and VFs (when hypervisors support WC in VMs). The flow we use is:
| | |
| | | 1. In mlx4_en, allocate Tx QPs one by one instead of a range allocation, and request "BF enabled QPs" if BF is supported for the function
| | |
| | | 2. In the ALLOC_RES FW command, change param1 to:
| | |    a. param1[23:0] - number of QPs
| | |    b. param1[31:24] - flags controlling QPs reservation
| | |
| | | Bit 31 refers to Eth blueflame supported QPs. Those QPs must have bits 6 and 7 unset in order to be used in Ethernet.
| | |
| | | Bits 24-30 of the flags are currently reserved.
| | |
| | | When a function tries to allocate a QP, it states the required attributes for this QP. Those attributes are considered "best-effort". If an attribute, such as Ethernet BF enabled QP, is a must-have attribute, the function has to check that attribute is supported before trying to do the allocation.
| | |
| | | In a lower layer of the code, mlx4_qp_reserve_range masks out the bits which are unsupported. If SRIOV is used, the PF validates those attributes and masks out unsupported attributes as well. In order to notify VFs which attributes are supported, the VF uses the QUERY_FUNC_CAP command. This command's mailbox is filled by the PF, which notifies which QP allocation attributes it supports.
| | |
| | | Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il>
| | | Signed-off-by: Matan Barak <matanb@mellanox.com>
| | | Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
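The two bit-level rules in this message (BF needs QPN bits 6 and 7 clear, and param1 packs a 24-bit count with 8 flag bits) can be sketched as follows. The macro and function names are hypothetical, chosen only for this illustration; the bit positions are taken from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

#define QPN_BF_BITS      0x000000C0u /* bits 6 and 7 of the QP number */
#define RES_COUNT_MASK   0x00FFFFFFu /* param1[23:0]: number of QPs */
#define RES_FLAG_ETH_BF  0x80u       /* flag byte bit 7, i.e. param1 bit 31 */

/* A QPN can be used with Blue-Flame only if bits 6 and 7 are both unset. */
static bool qpn_bf_eligible(uint32_t qpn)
{
    return (qpn & QPN_BF_BITS) == 0;
}

/* Pack the QP count and the reservation flags into one 32-bit parameter. */
static uint32_t alloc_res_param1(uint32_t count, uint8_t flags)
{
    return ((uint32_t)flags << 24) | (count & RES_COUNT_MASK);
}
```

With a 256-aligned base such as 0x100, the first 64 QPNs (base + 0 through base + 63) pass the check and base + 64 is the first one that fails, which is why a contiguous range cannot hold more than 64 BF-capable QPs.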
| | * net/mlx4_core: Use tasklet for user-space CQ completion events (Matan Barak, 2014-12-11, 5 files, -2/+86)
| | |
| | | Previously, we've fired all our completion callbacks straight from our ISR.
| | |
| | | Some of those callbacks were lightweight (for example, mlx4_en's and IPoIB napi callbacks), but some of them did more work (for example, the user-space RDMA stack uverbs' completion handler). Besides that, doing more than the minimal work in an ISR is generally considered wrong; it could even lead to a hard lockup of the system. When a lot of completion events are generated by the hardware, the loop over those events could take so long that we hit a hard lockup detected by the system watchdog.
| | |
| | | In order to avoid that, add a new way of invoking completion event callbacks. In the interrupt itself, we add the CQs which receive a completion event to a per-EQ list and schedule a tasklet. In the tasklet context we loop over all the CQs in the list and invoke the user callback.
| | |
| | | Signed-off-by: Matan Barak <matanb@mellanox.com>
| | | Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net/mlx4_core: Mask out host side virtualization features for guests (Or Gerlitz, 2014-12-11, 1 file, -1/+11)
| | |
| | | When VFs (guests in this context) issue the QUERY_DEV_CAP command, they need not be told that host side virtualization features such as VST, FSM (MAC anti-spoofing) and running > 80 VFs are supported by the device.
| | |
| | | Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net/mlx4_en: Set csum level for encapsulated packets (Or Gerlitz, 2014-12-11, 1 file, -1/+2)
| |/
| | This was dropped by mistake for the napi_gro_frags flow; fix that.
| |
| | Fixes: dd65beac48a5 ('net/mlx4_en: Extend usage of napi_gro_frags')
| | Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * be2net: Export tunnel offloads only when a VxLAN tunnel is created (Sriharsha Basavapatna, 2014-12-11, 2 files, -10/+33)
| |
| | The encapsulated offload flags shouldn't be unconditionally exported to the stack. The stack expects offloading to work across all tunnel types when those flags are set. This would break other tunnels (like GRE) since be2net currently supports tunnel offload for VxLAN only.
| |
| | Also, with VxLANs Skyhawk-R can offload only 1 UDP dport. If more than 1 UDP port is added, we should disable offloads in that case too.
| |
| | Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@emulex.com>
| | Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * gianfar: Fix dma check map error when DMA_API_DEBUG is enabledKevin Hao2014-12-111-28/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need to use dma_mapping_error() to check the dma address returned by dma_map_single/page(). Otherwise we would get a warning like this: WARNING: at lib/dma-debug.c:1140 Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc2-next-20141029 #196 task: c0834300 ti: effe6000 task.ti: c0874000 NIP: c02b2c98 LR: c02b2c98 CTR: c030abc4 REGS: effe7d70 TRAP: 0700 Not tainted (3.18.0-rc2-next-20141029) MSR: 00021000 <CE,ME> CR: 22044022 XER: 20000000 GPR00: c02b2c98 effe7e20 c0834300 00000098 00021000 00000000 c030b898 00000003 GPR08: 00000001 00000000 00000001 749eec9d 22044022 1001abe0 00000020 ef278678 GPR16: ef278670 ef278668 ef278660 070a8040 c087f99c c08cdc60 00029000 c0840d44 GPR24: c08be6e8 c0840000 effe7e78 ef041340 00000600 ef114e10 00000000 c08be6e0 NIP [c02b2c98] check_unmap+0x51c/0x9e4 LR [c02b2c98] check_unmap+0x51c/0x9e4 Call Trace: [effe7e20] [c02b2c98] check_unmap+0x51c/0x9e4 (unreliable) [effe7e70] [c02b31d8] debug_dma_unmap_page+0x78/0x8c [effe7ed0] [c03d1640] gfar_clean_rx_ring+0x208/0x488 [effe7f40] [c03d1a9c] gfar_poll_rx_sq+0x3c/0xa8 [effe7f60] [c04f8714] net_rx_action+0xc0/0x178 [effe7f90] [c00435a0] __do_softirq+0x100/0x1fc [effe7fe0] [c0043958] irq_exit+0xa4/0xc8 [effe7ff0] [c000d14c] call_do_irq+0x24/0x3c [c0875e90] [c00048a0] do_IRQ+0x8c/0xf8 [c0875eb0] [c000ed10] ret_from_except+0x0/0x18 For TX, we need to unmap the pages which have already been mapped and free the skb before returning. For RX, move the dma mapping and error check to gfar_new_skb(). We reuse the original skb in the rx ring when either the skb allocation or the dma mapping fails. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * cxgb4/csiostor: Don't use MASTER_MUST for fw_hello callHariprasad Shenai2014-12-113-16/+3
| | | | | | | | | | | | | | | | | | Remove use of calls into t4_fw_hello() with MASTER_MUST, which results in FW_HELLO_CMD_MASTERFORCE being set. The firmware doesn't support this and of course any existing PF Drivers will totally go for a toss. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch 'fec-next'David S. Miller2014-12-101-2/+11
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fugang Duan says: ==================== net: fec: driver code clean and bug fix The patch series includes code cleanup and bug fixes: Patch#1: avoid dummy operation during suspend/resume test. Patch#2: bug fix for i.MX6SX SOC that clears all interrupt events during the MAC initialization process. Patch#3: before phy device link status is up, only enable MDIO bus interrupt. V2: - Modify the comment form per David's suggestion. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net: fec: only enable mdio interrupt before phy device link upNimrod Andy2014-12-101-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | Before phy device link up, we only enable FEC mdio interrupt, which is more reasonable. Signed-off-by: Fugang Duan <B38611@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net: fec: clear all interrupt events to support i.MX6SXNimrod Andy2014-12-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | For the i.MX6SX FEC controller, the interrupt mask and event fields are extended. To support the FEC on all SOCs, we clear all interrupt events during the MAC initialization process. Signed-off-by: Fugang Duan <B38611@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net: fec: reset fep link status in suspend functionNimrod Andy2014-12-101-0/+6
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On some i.MX6 series boards, phy power and reference clock are supplied or controlled by the SOC. During a suspend/resume test, the power and clock are disabled, so the phy device link goes down. In the current driver, fep->link still reports up, which causes extra operations like the code below. To avoid the dummy operation, we set fep->link to down when the phy device is really down. ... if (fep->link) { napi_disable(&fep->napi); netif_tx_lock_bh(ndev); fec_stop(ndev); netif_tx_unlock_bh(ndev); napi_enable(&fep->napi); fep->link = phy_dev->link; status_change = 1; } ... Signed-off-by: Fugang Duan <B38611@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: sock: fix access via invalid file descriptorAlexei Starovoitov2014-12-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | 0day robot reported the following crash: [ 21.233581] BUG: unable to handle kernel NULL pointer dereference at 0000000000000007 [ 21.234709] IP: [<ffffffff8156ebda>] sk_attach_bpf+0x39/0xc2 It's due to bpf_prog_get() returning ERR_PTR. Check it properly. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Fixes: 89aa075832b0 ("net: sock: allow eBPF programs to be attached to sockets") Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
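The failure mode in this message (a function reports failure through an encoded error pointer, so a plain NULL check lets the bad value through) can be reproduced in userspace. The sketch below re-implements the ERR_PTR()/PTR_ERR()/IS_ERR() idea locally. prog_get() and attach_prog() are hypothetical stand-ins for bpf_prog_get() and its caller, not the kernel code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local re-implementation of the kernel's error-pointer convention:
 * the top MAX_ERRNO values of the address space encode -errno. */
#define MAX_ERRNO 4095

static void *ERR_PTR(long error)
{
    return (void *)(intptr_t)error;
}

static long PTR_ERR(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

static int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct prog {
    int id;
};

static struct prog demo_prog = { 42 };

/* Hypothetical lookup: fails with an encoded errno, never with NULL. */
static struct prog *prog_get(int fd)
{
    if (fd < 0)
        return ERR_PTR(-EBADF);
    return &demo_prog;
}

/* Caller that checks the result the way the fix describes. */
static int attach_prog(int fd, int *id_out)
{
    struct prog *p = prog_get(fd);

    if (IS_ERR(p))              /* "if (!p)" would miss this case */
        return (int)PTR_ERR(p);
    *id_out = p->id;
    return 0;
}
```

A caller that only tested for NULL would go on to dereference the encoded error value, which is the kind of access the crash report above shows.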
| * net: introduce helper macro for_each_cmsghdrGu Zheng2014-12-1010-16/+15
| | | | | | | | | | | | | | | | Introduce helper macro for_each_cmsghdr as a wrapper of the enumerating cmsghdr from msghdr, just cleanup. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
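A userspace analogue of such a wrapper can be built on the standard CMSG_FIRSTHDR()/CMSG_NXTHDR() macros. The macro body below is an assumption modelled on the commit description, and fill_and_sum() is a hypothetical helper written only to exercise it. The sketch assumes a Linux/glibc-style <sys/socket.h> where CMSG_SPACE() is a constant expression.

```c
#include <string.h>
#include <sys/socket.h>

/* Assumed shape of the helper: a for-loop over the control messages. */
#define for_each_cmsghdr(cmsg, msg) \
    for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg))

/* Hypothetical demo: write `count` int-sized control messages (values
 * 1..count), then walk them again and return the sum of the payloads.
 * *seen receives the number of headers visited on the second walk. */
static int fill_and_sum(int count, int *seen)
{
    union {
        char buf[CMSG_SPACE(sizeof(int)) * 4];
        struct cmsghdr align;       /* forces suitable alignment */
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int i = 0, sum = 0;

    if (count < 0 || count > 4)
        return -1;

    /* CMSG_NXTHDR() inspects cmsg_len, so start from zeroed memory. */
    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_control = u.buf;
    msg.msg_controllen = CMSG_SPACE(sizeof(int)) * (size_t)count;

    for_each_cmsghdr(cmsg, &msg) {
        int v = ++i;

        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;
        cmsg->cmsg_len = CMSG_LEN(sizeof(int));
        memcpy(CMSG_DATA(cmsg), &v, sizeof(v));
    }

    *seen = 0;
    for_each_cmsghdr(cmsg, &msg) {
        int v;

        memcpy(&v, CMSG_DATA(cmsg), sizeof(v));
        sum += v;
        (*seen)++;
    }
    return sum;
}
```

The benefit is purely one of readability: every open-coded three-clause for loop over the control buffer collapses to a single macro invocation.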
| * cxgb4/cxgb4vf: global named must be uniqueStephen Rothwell2014-12-104-6/+6
| | | | | | | | | | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-12-1077-330/+540
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/ethernet/amd/xgbe/xgbe-desc.c drivers/net/ethernet/renesas/sh_eth.c Overlapping changes in both conflict cases. Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net: fix suspicious rcu_dereference_check in net/sched/sch_fq_codel.cValdis.Kletnieks@vt.edu2014-12-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 46e5da40ae (net: qdisc: use rcu prefix and silence sparse warnings) triggers a spurious warning: net/sched/sch_fq_codel.c:97 suspicious rcu_dereference_check() usage! The code should be using the _bh variant of rcu_dereference. Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * xen-netfront: use correct linear area after linearizing an skbDavid Vrabel2014-12-091-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 97a6d1bb2b658ac85ed88205ccd1ab809899884d (xen-netfront: Fix handling packets on compound pages with skb_linearize) attempted to fix a problem where an skb that would have required too many slots would be dropped causing TCP connections to stall. However, it filled in the first slot using the original buffer and not the new one and would use the wrong offset and grant access to the wrong page. Netback would notice the malformed request and stop all traffic on the VIF, reporting: vif vif-3-0 vif3.0: txreq.offset: 85e, size: 4002, end: 6144 vif vif-3-0 vif3.0: fatal error; disabling device Reported-by: Anthony Wright <anthony@overnetdata.com> Tested-by: Anthony Wright <anthony@overnetdata.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * tcp: fix more NULL deref after prequeue changesEric Dumazet2014-12-092-16/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When I cooked commit c3658e8d0f1 ("tcp: fix possible NULL dereference in tcp_vX_send_reset()") I missed other spots we could deref a NULL skb_dst(skb) Again, if a socket is provided, we do not need skb_dst() to get a pointer to network namespace : sock_net(sk) is good enough. Reported-by: Dann Frazier <dann.frazier@canonical.com> Bisected-by: Dann Frazier <dann.frazier@canonical.com> Tested-by: Dann Frazier <dann.frazier@canonical.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: ca777eff51f7 ("tcp: remove dst refcount false sharing for prequeue mode") Signed-off-by: David S. Miller <davem@davemloft.net>
| | * netback: don't store invalid vif pointerJan Beulich2014-12-091-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When xenvif_alloc() fails, it returns a non-NULL error indicator. To avoid eventual races, we shouldn't store that into struct backend_info as readers of it only check for NULL. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * Merge tag 'linux-can-fixes-for-3.18-20141207' of ↵David S. Miller2014-12-094-51/+54
| | |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://gitorious.org/linux-can/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2014-12-07 this is a pull request of three patches by Stephane Grosjean which fix several bugs in the peak_usb CAN drivers. Please queue, if possible for 3.18, if it's too late these patches takes the slow lane via net-next. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | | * can: peak_usb: fix multi-byte values endianessStephane Grosjean2014-12-074-42/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes the endianness definition as well as the usage of the multi-byte fields in the data structures exchanged with the PEAK-System USB adapters. By fixing the endianness, this patch also fixes the wrong usage of a 32-bit local variable for handling the 16-bit error status field, in function pcan_usb_pro_handle_error(). Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
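As an illustration of handling little-endian wire fields independently of host byte order, here is a small userspace sketch. The 6-byte record layout and all names (get_le16(), get_le32(), put_le16(), struct err_rec) are hypothetical and are not taken from the peak_usb driver, which would use the kernel's own __le16/__le32 types and conversion helpers.

```c
#include <stdint.h>

/* Decode little-endian fields byte by byte, so the result is the same
 * on little-endian and big-endian hosts. */
static uint16_t get_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] |
           ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}

static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

/* Hypothetical record: 16-bit status followed by a 32-bit timestamp. */
struct err_rec {
    uint16_t status;    /* kept 16 bits wide, matching the wire field */
    uint32_t ts;
};

static struct err_rec parse_err_rec(const uint8_t wire[6])
{
    struct err_rec r;

    r.status = get_le16(wire);
    r.ts = get_le32(wire + 2);
    return r;
}
```

Keeping the host-side variable the same width as the wire field avoids reading or converting more bytes than the device actually sent.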
| | | * can: peak_usb: fix cleanup sequence order in case of error during initStephane Grosjean2014-12-061-7/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch sets the correct reverse sequence order to the instructions set to run, when any failure occurs during the initialization steps. It also adds the missing unregistration call of the can device if the failure appears after having been registered. Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
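The reverse-order unwinding this message refers to is commonly written with goto labels. The userspace sketch below records each step in a trace string so the order can be inspected. The three init steps and the dev_init() name are hypothetical, not the driver's actual sequence.

```c
#include <string.h>

/* Trace of executed steps: upper case = setup, lower case = teardown. */
static char trace[16];

static void step(const char *s)
{
    strcat(trace, s);
}

/* Hypothetical init: fail_at selects which step fails (0 = none). */
static int dev_init(int fail_at)
{
    trace[0] = '\0';

    if (fail_at == 1)
        return -1;              /* nothing set up yet, nothing to undo */
    step("A");                  /* allocate buffers */

    if (fail_at == 2)
        goto err_free;
    step("R");                  /* register the device */

    if (fail_at == 3)
        goto err_unregister;
    step("S");                  /* start the device */
    return 0;

err_unregister:
    step("r");                  /* undo the registration first */
err_free:
    step("a");                  /* then release the buffers */
    return -1;
}
```

Because the labels fall through, a failure after registration unregisters and then frees, while an earlier failure only frees. For example, dev_init(3) leaves the trace "ARra".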
| | | * can: peak_usb: fix memset() usageStephane Grosjean2014-12-061-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a misplaced call to memset() that fills the request buffer with 0. The problem was that, when sending PCAN_USBPRO_REQ_FCT requests, the content set by the caller was lost. With this patch, the memory area is zeroed only when requesting info from the device. Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
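The difference between the misplaced and the corrected memset() can be shown with a minimal userspace model. REQ_INFO and REQ_FCT are hypothetical stand-ins for the driver's request IDs, and both functions are illustrative only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical request kinds: INFO reads from the device into buf,
 * FCT sends the caller's buf to the device. */
enum req_id { REQ_INFO, REQ_FCT };

/* Misplaced clear: wipes the payload the caller prepared for FCT. */
static void prepare_req_buggy(enum req_id id, unsigned char *buf, size_t len)
{
    (void)id;
    memset(buf, 0, len);
}

/* Clear only when the buffer is about to receive data. */
static void prepare_req(enum req_id id, unsigned char *buf, size_t len)
{
    if (id == REQ_INFO)
        memset(buf, 0, len);
}
```

With the corrected version an outgoing payload survives the preparation step, while an incoming buffer still starts out zeroed.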
| | * | bnx2x: Implement ndo_gso_check()Joe Stringer2014-12-091-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Use vxlan_gso_check() to advertise offload support for this NIC. Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | amd-xgbe: Prevent Tx cleanup stallLendacky, Thomas2014-12-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When performing Tx cleanup, the dirty index counter is compared to the current index counter as one of the tests used to determine when to stop cleanup. The "less than" test will fail when the current index counter rolls over to zero causing cleanup to never occur again. Update the test to a "not equal" to avoid this situation. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
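The rollover problem described here is easy to reproduce with free-running counters. The sketch below uses 8-bit counters so the wrap happens quickly. The function names and the counter width are assumptions for illustration and do not reflect the amd-xgbe ring layout.

```c
#include <stdint.h>

/* "Less than" test: stops immediately once cur has wrapped past dirty. */
static unsigned int cleanup_lt(uint8_t cur, uint8_t dirty)
{
    unsigned int cleaned = 0;

    while (dirty < cur) {
        dirty++;
        cleaned++;
    }
    return cleaned;
}

/* "Not equal" test: keeps going across the wrap until dirty == cur. */
static unsigned int cleanup_ne(uint8_t cur, uint8_t dirty)
{
    unsigned int cleaned = 0;

    while (dirty != cur) {
        dirty++;                /* wraps from 255 to 0 */
        cleaned++;
    }
    return cleaned;
}
```

With dirty at 250 and ten descriptors outstanding, cur has wrapped to 4. The "less than" loop cleans nothing, while the "not equal" loop cleans all ten. Before any wrap, both variants behave the same.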
| | * | Update old iproute2 and Xen Remus linksAndrew Shewmaker2014-12-092-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Andrew Shewmaker <agshew@gmail.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | Merge tag 'master-2014-12-01' of ↵David S. Miller2014-12-094-5/+27
| | |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== pull request: wireless 2014-12-03 One last(?) batch of fixes hoping to make 3.18... In this episode, we have another trio of rtlwifi fixes repairing a little more damage from the major update of the rtlwifi-family of drivers. These editing mistakes caused some memory corruption and missed a flag critical to proper interrupt handling. Together, these fix the kernel regression reported at https://bugzilla.kernel.org/show_bug.cgi?id=88951 by Catalin Iacob. Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>