path: root/drivers/net/hyperv
Commit message  Author  Age  Files  Lines
* hyperv: Add IPv6 into the hash computation for vRSSHaiyang Zhang2014-10-302-2/+5
| | | | | | | | This will allow the workload spreading via vRSS for IPv6. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* hyperv: Fix the total_data_buflen in send pathHaiyang Zhang2014-10-221-0/+1
| | | | | | | | | | | | | total_data_buflen is used by netvsc_send() to decide if a packet can be put into send buffer. It should also include the size of RNDIS message before the Ethernet frame. Otherwise, a message with total size bigger than send_section_size may be copied into the send buffer, and cause data corruption. [Request to include this patch to the Stable branches] Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
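The size check described in this commit can be illustrated with a small user-space sketch. It is not the driver's code: `struct pkt_sketch`, both helper names, and the `<=` comparison are assumptions made for illustration; only the idea that the total must include the RNDIS message placed before the Ethernet frame comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's packet descriptor. */
struct pkt_sketch {
    size_t rndis_msg_size;    /* bytes of RNDIS message placed before the frame */
    size_t frame_len;         /* bytes of the Ethernet frame itself */
    size_t total_data_buflen; /* what the send path compares against the section size */
};

/* Count the RNDIS message as well as the frame, as the fix requires. */
static void pkt_sketch_set_total(struct pkt_sketch *p)
{
    assert(p != NULL);
    p->total_data_buflen = p->rndis_msg_size + p->frame_len;
}

/* Decide whether the packet may be copied into one send-buffer section.
 * The exact comparison used by the driver is an assumption here. */
static bool fits_send_section(const struct pkt_sketch *p, size_t send_section_size)
{
    assert(p != NULL);
    return p->total_data_buflen <= send_section_size;
}
```

With made-up sizes, a 6100-byte frame would appear to fit a 6144-byte section on its own, but once a 100-byte RNDIS message is counted the packet is correctly rejected for the copy path.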
* hyperv: Add handling of IP header with option field in netvsc_set_hash()Haiyang Zhang2014-10-171-16/+10
| | | | | | | | | | When the IP header ends with an options field, this patch reads the port numbers after that field and computes the hash. The general parser skb_flow_dissect() is used here. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
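To see why options matter, note that the IPv4 header length is variable: the low four bits of the first byte (IHL) give the length in 32-bit words, so the TCP/UDP ports do not always start at byte 20. The following user-space sketch shows that calculation only. It is not the kernel's skb_flow_dissect(), the function name `ipv4_l4_ports` is hypothetical, and it ignores fragments for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Locate the L4 source/destination ports in a raw IPv4 packet.
 * Returns 0 on success, -1 if the buffer is too short, the header is
 * malformed, or the protocol is neither TCP (6) nor UDP (17).
 */
static int ipv4_l4_ports(const uint8_t *ip, size_t len,
                         uint16_t *sport, uint16_t *dport)
{
    size_t ihl;

    assert(ip != NULL && sport != NULL && dport != NULL);

    if (len < 20 || (ip[0] >> 4) != 4)
        return -1;

    /* IHL counts 32-bit words; anything above 5 means options are present. */
    ihl = (size_t)(ip[0] & 0x0f) * 4;
    if (ihl < 20 || len < ihl + 4)
        return -1;

    if (ip[9] != 6 && ip[9] != 17)
        return -1;

    /* Ports are the first four bytes after the (possibly extended) header. */
    *sport = (uint16_t)((ip[ihl] << 8) | ip[ihl + 1]);
    *dport = (uint16_t)((ip[ihl + 2] << 8) | ip[ihl + 3]);
    return 0;
}
```

A parser that assumed a fixed 20-byte header would read option bytes as ports for such packets, which is the situation the commit avoids by delegating to the general flow dissector.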
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-10-081-7/+8
|\
| * hyperv: Fix a bug in netvsc_send()KY Srinivasan2014-10-051-7/+8
| | | | | | | | | | | | | | | | | | | | | | | | After the packet is successfully sent, we should not touch the packet as it may have been freed. This patch is based on the work done by Long Li <longli@microsoft.com>. David, please queue this up for stable. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-10-021-1/+2
|\ \ | |/ | | | | | | | | | | | | | | | | Conflicts: drivers/net/usb/r8152.c net/netfilter/nfnetlink.c Both r8152 and nfnetlink conflicts were simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
| * hyperv: Fix a bug in netvsc_start_xmit()KY Srinivasan2014-09-301-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | After the packet is successfully sent, we should not touch the skb as it may have been freed. This patch is based on the work done by Long Li <longli@microsoft.com>. In this version of the patch I have fixed issues pointed out by David. David, please queue this up for stable. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Tested-by: Long Li <longli@microsoft.com> Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | hyperv: NULL dereference on errorDan Carpenter2014-09-051-4/+2
| | | | | | | | | | | | | | | | | | We try to call free_netvsc_device(net_device) when "net_device" is NULL. It leads to an Oops. Fixes: f90251c8a6d0 ('hyperv: Increase the buffer length for netvsc_channel_cb()') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | hyperv: Increase the buffer length for netvsc_channel_cb()Haiyang Zhang2014-08-222-4/+16
|/ | | | | | | | | When the buffer is too small for a packet from VMBus, a bigger buffer is allocated in netvsc_channel_cb() and the read from VMBus is retried. Increasing this buffer size will reduce the retry overhead. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* hyperv: Adjust the size of sendbuf region to support ws2008r2KY Srinivasan2014-08-061-1/+1
| | | | | | | | | WS2008R2 is a supported platform and it turns out that the maximum sendbuf size that ws2008R2 can support is only 15MB. Make the necessary adjustment. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Drivers: net-next: hyperv: Increase the size of the sendbuf regionKY Srinivasan2014-08-042-6/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Intel did some benchmarking on our network throughput when Linux on Hyper-V is used as a gateway. This fix gave us almost 1 Gbps of additional throughput on top of the roughly 5 Gbps base throughput we had prior to increasing the sendbuf size. The sendbuf mechanism is a copy based transport that we have which is clearly more optimal than the copy-free page flipping mechanism (for small packets). In the forwarding scenario, we deal only with MTU sized packets, and increasing the size of the sendbuf area gave us the additional performance. For what it is worth, Windows guests on Hyper-V, I am told, use a similar sendbuf size as well. The exact value of sendbuf I think is less important than the fact that it needs to be larger than what Linux can allocate as physically contiguous memory. Thus the change over to allocating via vmalloc(). We currently allocate 16MB receive buffer and we use vmalloc there for allocation. Also the low level channel code has already been modified to deal with physically discontiguous memory in the ringbuffer setup. Based on experimentation Intel did, they say there was some improvement in throughput as the sendbuf size was increased up to 16MB and there was no effect on throughput beyond 16MB. Thus I have chosen 16MB here. Increasing the sendbuf value makes a material difference in small packet handling. In this version of the patch, based on David's feedback, I have added additional details in the commit log. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-07-301-1/+3
|\ | | | | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * hyperv: Fix error return code in netvsc_init_buf()Wei Yongjun2014-07-231-1/+3
| | | | | | | | | | | | | | | | | | Fix to return -ENOMEM from the kalloc error handling case instead of 0. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
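The class of bug fixed here is an error path that falls through with the return variable still at its initial value of 0. A minimal user-space sketch of the corrected pattern follows. The struct and function names are hypothetical, and calloc() stands in for the kernel allocator, so this only illustrates the control flow, not the driver's actual code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-device send-buffer bookkeeping. */
struct sendbuf_sketch {
    unsigned long *section_map; /* one bit per send-buffer section */
    size_t map_words;
};

static int sendbuf_sketch_init(struct sendbuf_sketch *sb, size_t map_words)
{
    int ret = 0;

    assert(sb != NULL && map_words > 0);

    sb->section_map = calloc(map_words, sizeof(*sb->section_map));
    if (sb->section_map == NULL) {
        /* Without this assignment the cleanup path would return 0,
         * and the caller would treat a failed setup as success. */
        ret = -ENOMEM;
        goto cleanup;
    }

    sb->map_words = map_words;
    return 0;

cleanup:
    sb->map_words = 0;
    return ret;
}
```

Callers can then rely on a nonzero return to abort device setup instead of continuing with a NULL map.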
* | hyperv: Add netpoll supportRichard Weinberger2014-07-091-0/+11
| | | | | | | | | | | | | | | | | | | | | | In order to have at least a netconsole to debug kernel issues on Windows Azure this patch implements netpoll support. Sending packets is easy, netvsc_start_xmit() already does everything needed. Signed-off-by: Richard Weinberger <richard@nod.at> Acked-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | drivers/net/hyperv/netvsc.c: remove unnecessary null test before kfreeFabian Frederick2014-07-021-3/+1
| | | | | | | | | | | | | | | | | | | | Fix checkpatch warning: WARNING: kfree(NULL) is safe this check is probably not required Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: netdev@vger.kernel.org Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-06-251-1/+1
|\ \ | |/
| * hyperv: fix apparent cut-n-paste error in send path teardownDave Jones2014-06-161-1/+1
| | | | | | | | | | | | | | | | | | c25aaf814a63: "hyperv: Enable sendbuf mechanism on the send path" added some teardown code that looks like it was copied from the receive path above, but missed a variable name replacement. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | hyperv: Add handler for RNDIS_STATUS_NETWORK_CHANGE eventHaiyang Zhang2014-06-193-25/+29
|/ | | | | | | | | The RNDIS_STATUS_NETWORK_CHANGE event is received after the Hyper-V host sleeps or hibernates. We refresh the network at this time. MS-TFS: 135162 Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* hyperv: Add hash value into RNDIS Per-packet infoHaiyang Zhang2014-05-232-4/+18
| | | | | | | | | It passes the hash value as the RNDIS Per-packet info to the Hyper-V host, so that the send completion notices can be spread across multiple channels. MS-TFS: 140273 Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: get rid of SET_ETHTOOL_OPSWilfried Klaebe2014-05-131-1/+1
| net: get rid of SET_ETHTOOL_OPS
|
| Dave Miller mentioned he'd like to see SET_ETHTOOL_OPS gone. This does that.
|
| Mostly done via coccinelle script:
|
| @@
| struct ethtool_ops *ops;
| struct net_device *dev;
| @@
| - SET_ETHTOOL_OPS(dev, ops);
| + dev->ethtool_ops = ops;
|
| Compile tested only, but I'd seriously wonder if this broke anything.
|
| Suggested-by: Dave Miller <davem@davemloft.net>
| Signed-off-by: Wilfried Klaebe <w-lkml@lebenslange-mailadresse.de>
| Acked-by: Felipe Balbi <balbi@ti.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-05-121-0/+4
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/ethernet/altera/altera_sgdma.c net/netlink/af_netlink.c net/sched/cls_api.c net/sched/sch_api.c The netlink conflict dealt with moving to netlink_capable() and netlink_ns_capable() in the 'net' tree vs. supporting 'tc' operations in non-init namespaces. These were simple transformations from netlink_capable to netlink_ns_capable. The Altera driver conflict was simply code removal overlapping some void pointer cast cleanups in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>
| * hyperv: Properly handle checksum offloadKY Srinivasan2014-04-301-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Do checksum offload only if the client of the driver wants checksum to be offloaded. In V1 version of this patch, I addressed comments from Stephen Hemminger <stephen@networkplumber.org> and Eric Dumazet <eric.dumazet@gmail.com>. In this version of the patch I have addressed comments from David Miller. This patch fixes a bug that is exposed in gateway scenarios. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Add support for netvsc build without CONFIG_SYSFS flagHaiyang Zhang2014-05-121-4/+1
| | | | | | | | | | | | | | | | | | | | This change ensures the driver can be built successfully without the CONFIG_SYSFS flag. MS-TFS: 182270 Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | hyperv: Enable sendbuf mechanism on the send pathKY Srinivasan2014-04-303-9/+234
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We send packets using a copy-free mechanism (this is the Guest to Host transport via VMBUS). While this is obviously optimal for large packets, it may not be optimal for small packets. Hyper-V host supports a second mechanism for sending packets that is "copy based". We implement that mechanism in this patch. In this version of the patch I have addressed a comment from David Miller. With this patch (and all of the other offload and VRSS patches), we are now able to almost saturate a 10G interface between Linux VMs on Hyper-V on different hosts - close to 9 Gbps as measured via iperf. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | hyperv: Simplify the send_completion variablesHaiyang Zhang2014-04-234-16/+11
| | | | | | | | | | | | | | | | The union contains only one member now, so we use the variables in it directly. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | hyperv: Remove recv_pkt_list and lockHaiyang Zhang2014-04-234-198/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Removed recv_pkt_list and lock, and updated related code, so that the locking overhead is reduced especially when multiple channels are in use. The recv_pkt_list isn't actually necessary because the packets are processed sequentially in each channel. It has been replaced by a local variable, and the related lock for this list is also removed. The is_data_pkt field is not used in receive path, so its assignment is cleaned up. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | hyperv: Add support for virtual Receive Side Scaling (vRSS)Haiyang Zhang2014-04-214-34/+504
|/ | | | | | | | | This feature allows multiple channels to be used by each virtual NIC. It is available on Hyper-V host 2012 R2. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Drivers: net: hyperv: Address UDP checksum issuesKY Srinivasan2014-04-113-2/+37
| | | | | | | | | | | | | | ws2008r2 does not support UDP checksum offload. Thus, we cannot turn on UDP offload in the host. Also, on ws2012 and ws2012 r2, there appears to be an issue with UDP checksum offload. Fix this issue by computing the UDP checksum in the Hyper-V driver. Based on Dave Miller's comments, in this version, I have COWed the skb before modifying the UDP header (the checksum field). Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
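For readers unfamiliar with what "computing the UDP checksum" in software involves, here is a user-space sketch of the standard Internet checksum (the ones-complement sum described in RFC 1071). It is not the driver's code, which works on an skb with kernel helpers; the function name `inet_checksum` is hypothetical. For a real UDP datagram the input must also cover the pseudo-header, and a computed value of 0 is transmitted as 0xffff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Ones-complement Internet checksum over a byte buffer, treating the
 * data as big-endian 16-bit words. Suitable for buffers up to 128 KiB;
 * larger inputs could overflow the 32-bit accumulator before folding.
 */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    assert(data != NULL || len == 0);

    while (len > 1) {
        sum += (uint32_t)((data[0] << 8) | data[1]);
        data += 2;
        len -= 2;
    }

    /* An odd trailing byte is padded with a zero low byte. */
    if (len == 1)
        sum += (uint32_t)(data[0] << 8);

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

The commit's note about copy-on-write applies to the step after this computation: the result is written into the UDP header, so the buffer must not be shared when it is modified.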
* Drivers: net: hyperv: Negotiate suitable ndis version for offload supportKY Srinivasan2014-04-111-1/+1
| | | | | | | | | Ws2008R2 supports ndis_version 6.1 and 6.1 is the minimal version required for various offloads. Negotiate ndis_version 6.1 when on ws2008r2. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Drivers: net: hyperv: Allocate memory for all possible per-pecket informationKY Srinivasan2014-04-111-1/+3
| | | | | | | | | An outgoing packet can potentially need per-packet information for all the offloads and VLAN tagging. Fix this issue. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-03-142-1/+24
|\ | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/usb/r8152.c drivers/net/xen-netback/netback.c Both the r8152 and netback conflicts were simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
| * hyperv: Move state setting for link queryHaiyang Zhang2014-03-052-1/+24
| | | | | | | | | | | | | | | | | | It moves the state setting for query into rndis_filter_receive_response(). All callbacks including query-complete and status-callback are synchronized by channel->inbound_lock. This prevents a potential race between them. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | hyperv: Change the receive buffer size for legacy hostsHaiyang Zhang2014-03-102-1/+6
| | | | | | | | | | | | | | | | Due to a bug in the Hyper-V host version 2008R2, we need to use a slightly smaller receive buffer size, otherwise the buffer will not be accepted by the legacy hosts. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Drivers: net: hyperv: Enable large send offloadKY Srinivasan2014-03-102-4/+74
| | | | | | | | | | | | | | | | Enable segmentation offload. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Drivers: net: hyperv: Enable send side checksum offloadKY Srinivasan2014-03-102-2/+77
| | | | | | | | | | | | | | | | Enable send side checksum offload. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Drivers: net: hyperv: Enable receive side IP checksum offloadKY Srinivasan2014-03-103-6/+50
| | | | | | | | | | | | | | | | Enable receive side checksum offload. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Drivers: net: hyperv: Enable offloads on the hostKY Srinivasan2014-03-102-0/+135
| | | | | | | | | | | | | | | | Prior to enabling guest side offloads, enable the offloads on the host. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Drivers: net: hyperv: Cleanup the send pathKY Srinivasan2014-03-103-90/+71
| | | | | | | | | | | | | | | | In preparation for enabling offloads, cleanup the send path. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Drivers: net: hyperv: Enable scatter gather I/OKY Srinivasan2014-03-101-39/+114
| | | | | | | | | | | | | | | | Cleanup the code and enable scatter gather I/O. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | hyperv: Add latest NetVSP versions to auto negotiationHaiyang Zhang2014-02-193-10/+70
| | | | | | | | | | | | | | | | It auto negotiates the highest NetVSP version supported by both guest and host. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
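Negotiating the highest version supported by both sides can be sketched as a loop that offers versions from newest to oldest and stops at the first one the host accepts. The sketch below is a user-space illustration under assumptions: the function names, the callback type, and the numeric version values used in any example are hypothetical, and the real driver exchanges messages with the host rather than calling a local function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: returns true if the host accepts this version. */
typedef bool (*try_version_fn)(uint32_t version, void *ctx);

/*
 * Offer versions from highest to lowest. `versions` must be sorted in
 * ascending order. Returns the accepted version, or 0 if none matched.
 */
static uint32_t negotiate_highest(const uint32_t *versions, size_t n,
                                  try_version_fn try_version, void *ctx)
{
    assert(versions != NULL && try_version != NULL);

    for (size_t i = n; i-- > 0; ) {
        if (try_version(versions[i], ctx))
            return versions[i];
    }
    return 0;
}

/* Stub host for demonstration: accepts any version up to *ctx. */
static bool host_accepts_up_to(uint32_t version, void *ctx)
{
    const uint32_t *host_max = ctx;
    return version <= *host_max;
}
```

Trying the newest version first means an older host costs only a few extra round trips at probe time, while a newer host is matched on the first attempt.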
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-02-191-15/+38
|\ \ | |/ | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/bonding/bond_3ad.h drivers/net/bonding/bond_main.c Two minor conflicts in bonding, both of which were overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
| * hyperv: Fix the carrier status settingHaiyang Zhang2014-02-161-15/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | Without this patch, the "cat /sys/class/net/ethN/operstate" shows "unknown", and "ethtool ethN" shows "Link detected: yes", when VM boots up with or without vNIC connected. This patch fixed the problem. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Drivers: net: hyperv: Cleanup the netvsc receive callback functioKY Srinivasan2014-02-172-23/+12
| | | | | | | | | | | | | | Get rid of the buffer allocation in the receive path for normal packets. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Drivers: net: hyperv: Cleanup the receive pathKY Srinivasan2014-02-171-16/+13
| | | | | | | | | | | | | | | | | | Make the receive path a little more efficient by parameterizing the required state rather than re-establishing that state. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Drivers: net: hyperv: Get rid of the rndis_filter_packet structureKY Srinivasan2014-02-173-45/+4
|/ | | | | | | | | This structure is redundant; get rid of it to make the code a little more efficient by removing the unnecessary indirection. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* hyperv: Add support for physically discontinuous receive bufferHaiyang Zhang2014-01-272-6/+3
| | | | | | | | | This will allow us to use bigger receive buffer, and prevent allocation failure due to fragmented memory. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-01-061-12/+8
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c net/ipv6/ip6_tunnel.c net/ipv6/ip6_vti.c ipv6 tunnel statistic bug fixes conflicting with consolidation into generic sw per-cpu net stats. qlogic conflict between queue counting bug fix and the addition of multiple MAC address support. Signed-off-by: David S. Miller <davem@davemloft.net>
| * hyperv: Fix race between probe and open callsHaiyang Zhang2013-12-211-12/+8
| | | | | | | | | | | | | | | | | | Move register_netdev to the end of probe to prevent a possible open call from happening before NetVSP is connected. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2013-12-181-1/+0
|\ \ | |/ | | | | | | | | | | | | | | | | Conflicts: drivers/net/ethernet/intel/i40e/i40e_main.c drivers/net/macvtap.c Both minor merge hassles, simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
| * netvsc: don't flush peers notifying work during setting mtuJason Wang2013-12-171-1/+0
| | There's a possible deadlock if we flush the peers notifying work while setting the MTU:
| |
| | [ 22.991149] ======================================================
| | [ 22.991173] [ INFO: possible circular locking dependency detected ]
| | [ 22.991198] 3.10.0-54.0.1.el7.x86_64.debug #1 Not tainted
| | [ 22.991219] -------------------------------------------------------
| | [ 22.991243] ip/974 is trying to acquire lock:
| | [ 22.991261] ((&(&net_device_ctx->dwork)->work)){+.+.+.}, at: [<ffffffff8108af95>] flush_work+0x5/0x2e0
| | [ 22.991307] but task is already holding lock:
| | [ 22.991330] (rtnl_mutex){+.+.+.}, at: [<ffffffff81539deb>] rtnetlink_rcv+0x1b/0x40
| | [ 22.991367] which lock already depends on the new lock.
| | [ 22.991398] the existing dependency chain (in reverse order) is:
| | [ 22.991426] -> #1 (rtnl_mutex){+.+.+.}:
| | [ 22.991449] [<ffffffff810dfdd9>] __lock_acquire+0xb19/0x1260
| | [ 22.991477] [<ffffffff810e0d12>] lock_acquire+0xa2/0x1f0
| | [ 22.991501] [<ffffffff81673659>] mutex_lock_nested+0x89/0x4f0
| | [ 22.991529] [<ffffffff815392b7>] rtnl_lock+0x17/0x20
| | [ 22.991552] [<ffffffff815230b2>] netdev_notify_peers+0x12/0x30
| | [ 22.991579] [<ffffffffa0340212>] netvsc_send_garp+0x22/0x30 [hv_netvsc]
| | [ 22.991610] [<ffffffff8108d251>] process_one_work+0x211/0x6e0
| | [ 22.991637] [<ffffffff8108d83b>] worker_thread+0x11b/0x3a0
| | [ 22.991663] [<ffffffff81095e5d>] kthread+0xed/0x100
| | [ 22.991686] [<ffffffff81681c6c>] ret_from_fork+0x7c/0xb0
| | [ 22.991715] -> #0 ((&(&net_device_ctx->dwork)->work)){+.+.+.}:
| | [ 22.991715] [<ffffffff810de817>] check_prevs_add+0x967/0x970
| | [ 22.991715] [<ffffffff810dfdd9>] __lock_acquire+0xb19/0x1260
| | [ 22.991715] [<ffffffff810e0d12>] lock_acquire+0xa2/0x1f0
| | [ 22.991715] [<ffffffff8108afde>] flush_work+0x4e/0x2e0
| | [ 22.991715] [<ffffffff8108e1b5>] __cancel_work_timer+0x95/0x130
| | [ 22.991715] [<ffffffff8108e303>] cancel_delayed_work_sync+0x13/0x20
| | [ 22.991715] [<ffffffffa03404e4>] netvsc_change_mtu+0x84/0x200 [hv_netvsc]
| | [ 22.991715] [<ffffffff815233d4>] dev_set_mtu+0x34/0x80
| | [ 22.991715] [<ffffffff8153bc2a>] do_setlink+0x23a/0xa00
| | [ 22.991715] [<ffffffff8153d054>] rtnl_newlink+0x394/0x5e0
| | [ 22.991715] [<ffffffff81539eac>] rtnetlink_rcv_msg+0x9c/0x260
| | [ 22.991715] [<ffffffff8155cdd9>] netlink_rcv_skb+0xa9/0xc0
| | [ 22.991715] [<ffffffff81539dfa>] rtnetlink_rcv+0x2a/0x40
| | [ 22.991715] [<ffffffff8155c41d>] netlink_unicast+0xdd/0x190
| | [ 22.991715] [<ffffffff8155c807>] netlink_sendmsg+0x337/0x750
| | [ 22.991715] [<ffffffff8150d219>] sock_sendmsg+0x99/0xd0
| | [ 22.991715] [<ffffffff8150d63e>] ___sys_sendmsg+0x39e/0x3b0
| | [ 22.991715] [<ffffffff8150eba2>] __sys_sendmsg+0x42/0x80
| | [ 22.991715] [<ffffffff8150ebf2>] SyS_sendmsg+0x12/0x20
| | [ 22.991715] [<ffffffff81681d19>] system_call_fastpath+0x16/0x1b
| |
| | This happens because rtnl_lock() is held before ndo_change_mtu() is called, and netvsc_change_mtu() then tries to flush the work. Meanwhile, netdev_notify_peers() may be called from the worker and also try to take rtnl_lock, so the flush never completes.
| |
| | Solve this by not canceling and flushing the work. This is safe because the transmission done by NETDEV_NOTIFY_PEERS is synchronized with the netif_tx_disable() called by netvsc_change_mtu().
| |
| | Reported-by: Yaju Cao <yacao@redhat.com>
| | Tested-by: Yaju Cao <yacao@redhat.com>
| | Cc: K. Y. Srinivasan <kys@microsoft.com>
| | Cc: Haiyang Zhang <haiyangz@microsoft.com>
| | Signed-off-by: Jason Wang <jasowang@redhat.com>
| | Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>