Commit message    Author    Age    Files    Lines
* Merge branch 'minnow/net-next' of git://git.infradead.org/users/dvhart/linux-2.6 into minnow    David S. Miller    2013-07-27    4    -1/+164

    Darren Hart says:

    ====================
    Add support for the MinnowBoard in the pch_gbe driver.

    This was originally sent to LKML as part of the MinnowBoard support series. That is now partially merged and this version of the patch has been isolated from those changes and is now completely self-contained.
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>
| * pch_gbe: Add MinnowBoard support    Darren Hart    2013-07-25    4    -0/+163

    The MinnowBoard uses an AR803x PHY with the PCH GBE which requires special handling. Use the MinnowBoard PCI Subsystem ID to detect this and add a pci_device_id.driver_data structure and functions to handle platform setup.

    The AR803x does not implement the RGMII 2ns TX clock delay in the trace routing nor via strapping. Add a detection method for the board and the PHY and enable the TX clock delay via the registers.

    This PHY will hibernate without link for 10 seconds. Ensure the PHY is awake for probe and then disable hibernation. A future improvement would be to convert pch_gbe to using PHYLIB and making sure we can wake the PHY at the necessary times rather than permanently disabling it.

    Signed-off-by: Darren Hart <dvhart@linux.intel.com>
    Cc: "David S. Miller" <davem@davemloft.net>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: Peter Waskiewicz <peter.p.waskiewicz.jr@intel.com>
    Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    Cc: Joe Perches <joe@perches.com>
    Cc: netdev@vger.kernel.org
| * pch_gbe: Use PCH_GBE_PHY_REGS_LEN instead of 32    Darren Hart    2013-07-24    1    -1/+1

    Avoid using magic numbers when we have perfectly good defines just lying around.

    Signed-off-by: Darren Hart <dvhart@linux.intel.com>
    Cc: "David S. Miller" <davem@davemloft.net>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: Peter Waskiewicz <peter.p.waskiewicz.jr@intel.com>
    Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    Cc: netdev@vger.kernel.org
* | USBNET: increase max rx/tx qlen for improving USB3 thoughtput    Ming Lei    2013-07-27    1    -0/+9

    The default RX_QLEN()/TX_QLEN() didn't consider super speed USB devices, so at most 4 URBs are scheduled at the same time for tx/rx, and a USB3 NIC can't perform very well.

    With this patch, both rx and tx throughput are increased by more than 100Mbps when doing an iperf test on the ax88179_178a USB 3.0 NIC.

    Signed-off-by: Ming Lei <ming.lei@canonical.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* | USBNET: centralize computing of max rx/tx qlen    Ming Lei    2013-07-27    4    -8/+46

    This patch centralizes computing of max rx/tx qlen, because:

    - RX_QLEN()/TX_QLEN() is called in hot path
    - computing depends on device's usb speed, now we have ls/fs, hs, ss, so more checks need to be involved
    - in fact, max rx/tx qlen should not only depend on device USB speed, but also depend on ethernet link speed, so we need to consider that in future.
    - if SG support is done, max tx qlen may need change too

    Generally, hard_mtu and rx_urb_size are changed in the bind(), reset() and link_reset() callbacks, and in the change-MTU network operation. This patch introduces the usbnet_update_max_qlen() API and calls it in the above paths.

    Signed-off-by: Ming Lei <ming.lei@canonical.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
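As a rough illustration of the idea above, the sketch below computes both queue lengths once per configuration change instead of on every packet. The function name, the speed labels, the byte budget and the super-speed scale factor are assumptions made for this example and are not taken from usbnet; only the fallback of 4 URBs comes from the commit text.

```python
# Hypothetical model of "compute qlen once, outside the hot path".
# QUEUE_BUDGET_BYTES and the factor 5 are illustrative assumptions.
QUEUE_BUDGET_BYTES = 64 * 1024
DEFAULT_QLEN = 4  # the old fixed limit mentioned in the commit message


def max_qlen(speed, rx_urb_size, hard_mtu):
    """Return (rx_qlen, tx_qlen) for a device with the given USB speed."""
    if speed == "high":
        budget = QUEUE_BUDGET_BYTES
    elif speed == "super":
        budget = 5 * QUEUE_BUDGET_BYTES  # assumed: allow more in-flight data
    else:
        # low/full speed keeps the small fixed queue
        return DEFAULT_QLEN, DEFAULT_QLEN
    # Never return 0, even if a single buffer exceeds the budget.
    rx_qlen = max(1, budget // rx_urb_size)
    tx_qlen = max(1, budget // hard_mtu)
    return rx_qlen, tx_qlen
```

In this model the result would be recomputed whenever `rx_urb_size` or `hard_mtu` changes, which corresponds to the callbacks listed in the message.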
* | tuntap: hardware vlan tx support    Jason Wang    2013-07-27    1    -4/+36

    Inspired by commit f09e2249c4f5c7c13261ec73f5a7807076af0c8e (macvtap: restore vlan header on user read). This patch adds hardware vlan tx support for tuntap. This is done by copying vlan header directly into userspace in tun_put_user() instead of doing it through __vlan_put_tag() in dev_hard_start_xmit(). This eliminates one unnecessary memmove() in vlan_insert_tag() for 802.1ad and 802.1q traffic.

    pktgen test shows about 20% improvement for 802.1q traffic:

    Before:
      662149pps 317Mb/sec (317831520bps) errors: 0
    After:
      801033pps 384Mb/sec (384495840bps) errors: 0

    Cc: Basil Gor <basil.gor@gmail.com>
    Cc: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
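The quoted pktgen figures can be cross-checked with a few lines of arithmetic. This is only a sanity check on the numbers in the message; the helper name is made up for the example.

```python
# Figures copied from the pktgen output in the commit message.
before_pps, after_pps = 662149, 801033
before_bps, after_bps = 317831520, 384495840


def percent_gain(before, after):
    """Relative improvement of `after` over `before`, in percent."""
    return 100.0 * (after - before) / before


gain = percent_gain(before_pps, after_pps)
bits_per_packet = before_bps // before_pps

print(f"packet rate gain: {gain:.1f}%")              # prints "packet rate gain: 21.0%"
print(f"frame size: {bits_per_packet // 8} bytes")   # prints "frame size: 60 bytes"
```

Both runs work out to the same 60-byte frame size, so the bit rates and packet rates in the message are consistent with each other and with the "about 20%" claim.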
* | net/sctp: Refactor SCTP skb checksum computation    Joe Stringer    2013-07-27    4    -36/+20

    This patch consolidates the SCTP checksum calculation code from various places to a single new function, sctp_compute_cksum(skb, offset).

    Signed-off-by: Joe Stringer <joe@wand.net.nz>
    Reviewed-by: Julian Anastasov <ja@ssi.bg>
    Acked-by: Simon Horman <horms@verge.net.au>
    Signed-off-by: David S. Miller <davem@davemloft.net>
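For readers unfamiliar with the SCTP checksum, the sketch below shows the general shape of such a helper in user space: a CRC32c over the packet starting at the SCTP header, with the 4-byte checksum field (bytes 8 to 11 of the common header) treated as zero. This is a plain, slow bitwise illustration operating on `bytes`, not the kernel function; the name `sctp_checksum` and its parameters are assumptions for this example.

```python
def crc32c(data):
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF


def sctp_checksum(packet, offset=0):
    """CRC32c of packet[offset:], with the SCTP checksum field zeroed.

    `offset` is where the SCTP common header starts inside `packet`
    (hypothetical parameter mirroring the offset argument in the commit).
    """
    sctp = bytearray(packet[offset:])
    sctp[8:12] = b"\x00\x00\x00\x00"  # checksum field is excluded from the sum
    return crc32c(bytes(sctp))
```

Because the checksum field is zeroed before the calculation, the result does not depend on whatever value is currently stored there, which is what makes a single shared helper usable for both generating and verifying checksums.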
* | virtio-net: put virtio net header inline with data    Michael S. Tsirkin    2013-07-27    2    -9/+37

    For small packets we can simplify xmit processing by linearizing buffers with the header: most packets seem to have enough head room we can use for this purpose.

    Since existing hypervisors require that header is the first s/g element, we need a feature bit for this.

    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* | bond: cleanup netpoll code    stephen hemminger    2013-07-26    1    -7/+1

    This started out with fixing a sparse warning, then I realized that the wrapper function bond_netpoll_info could just be removed by rolling it into the enable code.

    Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
    Reviewed-by: Jiri Pirko <jiri@resnulli.us>
    Acked-by: Neil Horman <nhorman@tuxdriver.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* | team: cleanup netpoll clode    stephen hemminger    2013-07-26    1    -17/+8

    This started out with fixing a sparse warning, then I realized that the wrapper function team_netpoll_info could just be collapsed away by rolling it into the enable code.

    Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
    Acked-by: Jiri Pirko <jiri@resnulli.us>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* | bridge: cleanup netpoll code    stephen hemminger    2013-07-26    3    -17/+8

    This started out with fixing a sparse warning, then I realized that the wrapper function br_netpoll_info could just be collapsed away by rolling it into the enable code.

    Also, eliminate unnecessary goto's

    Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
    Reviewed-by: Jiri Pirko <jiri@resnulli.us>
    Acked-by: Neil Horman <nhorman@tuxdriver.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: use pre-defined macro in bond_mode_name instead of magic number 0    Wang Sheng-Hui    2013-07-26    1    -1/+1

    We have BOND_MODE_ROUNDROBIN pre-defined as 0, and it's the lowest mode number. Use it to check the arg lower bound instead of magic number 0 in bond_mode_name.

    Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* | drivers/net/ethernet/stmicro/stmmac: don't check resource with devm_ioremap_resource    Wolfram Sang    2013-07-24    1    -3/+0

    devm_ioremap_resource does sanity checks on the given resource. No need to duplicate this in the driver.

    Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Make devnet_rename_seq static    Thomas Gleixner    2013-07-24    2    -4/+1

    No users outside net/core/dev.c.

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Acked-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: TCP_NOTSENT_LOWAT socket option    Eric Dumazet    2013-07-24    10    -6/+61

    Idea of this patch is to add optional limitation of number of unsent bytes in TCP sockets, to reduce usage of kernel memory.

    TCP receiver might announce a big window, and TCP sender autotuning might allow a large amount of bytes in write queue, but this has little performance impact if a large part of this buffering is wasted:

    Write queue needs to be large only to deal with large BDP, not necessarily to cope with scheduling delays (incoming ACKs make room for the application to queue more bytes)

    For most workloads, using a value of 128 KB or less is OK to give applications enough time to react to POLLOUT events in time (or being awakened in a blocking sendmsg())

    This patch adds two ways to set the limit:

    1) Per socket option TCP_NOTSENT_LOWAT

    2) A sysctl (/proc/sys/net/ipv4/tcp_notsent_lowat) for sockets not using TCP_NOTSENT_LOWAT socket option (or setting a zero value). Default value being UINT_MAX (0xFFFFFFFF), meaning this has no effect.

    This changes poll()/select()/epoll() to report POLLOUT only if number of unsent bytes is below tp->notsent_lowat

    Note this might increase number of sendmsg()/sendfile() calls when using non blocking sockets, and increase number of context switches for blocking sockets.

    Note this is not related to SO_SNDLOWAT (as SO_SNDLOWAT is defined as: Specify the minimum number of bytes in the buffer until the socket layer will pass the data to the protocol)

    Tested:

    netperf sessions, and watching /proc/net/protocols "memory" column for TCP

    With 200 concurrent netperf -t TCP_STREAM sessions, amount of kernel memory used by TCP buffers shrinks by ~55 % (20567 pages instead of 45458)

      lpq83:~# echo -1 >/proc/sys/net/ipv4/tcp_notsent_lowat
      lpq83:~# (super_netperf 200 -t TCP_STREAM -H remote -l 90 &); sleep 60 ; grep TCP /proc/net/protocols
      TCPv6     1880      2  45458   no     208   yes  ipv6        y  y  y  y  y  y  y  y  y  y  y  y  y  n  y  y  y  y  y
      TCP       1696    508  45458   no     208   yes  kernel      y  y  y  y  y  y  y  y  y  y  y  y  y  n  y  y  y  y  y

      lpq83:~# echo 131072 >/proc/sys/net/ipv4/tcp_notsent_lowat
      lpq83:~# (super_netperf 200 -t TCP_STREAM -H remote -l 90 &); sleep 60 ; grep TCP /proc/net/protocols
      TCPv6     1880      2  20567   no     208   yes  ipv6        y  y  y  y  y  y  y  y  y  y  y  y  y  n  y  y  y  y  y
      TCP       1696    508  20567   no     208   yes  kernel      y  y  y  y  y  y  y  y  y  y  y  y  y  n  y  y  y  y  y

    Using 128KB has no bad effect on the throughput or cpu usage of a single flow, although there is an increase of context switches.

    A bonus is that we hold socket lock for a shorter amount of time and should improve latencies of ACK processing.

      lpq83:~# echo -1 >/proc/sys/net/ipv4/tcp_notsent_lowat
      lpq83:~# perf stat -e context-switches ./netperf -H 7.7.7.84 -t omni -l 20 -c -i10,3
      OMNI Send TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.7.84 () port 0 AF_INET : +/-2.500% @ 99% conf.
      Local       Remote      Local  Elapsed Throughput Throughput  Local Local  Remote Remote Local   Remote  Service
      Send Socket Recv Socket Send   Time               Units       CPU   CPU    CPU    CPU    Service Service Demand
      Size        Size        Size   (sec)                          Util  Util   Util   Util   Demand  Demand  Units
      Final       Final                                             %     Method %      Method
      1651584     6291456     16384  20.00   17447.90   10^6bits/s  3.13  S      -1.00  U      0.353   -1.000  usec/KB

       Performance counter stats for './netperf -H 7.7.7.84 -t omni -l 20 -c -i10,3':

                 412,514 context-switches

           200.034645535 seconds time elapsed

      lpq83:~# echo 131072 >/proc/sys/net/ipv4/tcp_notsent_lowat
      lpq83:~# perf stat -e context-switches ./netperf -H 7.7.7.84 -t omni -l 20 -c -i10,3
      OMNI Send TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.7.84 () port 0 AF_INET : +/-2.500% @ 99% conf.
      Local       Remote      Local  Elapsed Throughput Throughput  Local Local  Remote Remote Local   Remote  Service
      Send Socket Recv Socket Send   Time               Units       CPU   CPU    CPU    CPU    Service Service Demand
      Size        Size        Size   (sec)                          Util  Util   Util   Util   Demand  Demand  Units
      Final       Final                                             %     Method %      Method
      1593240     6291456     16384  20.00   17321.16   10^6bits/s  3.35  S      -1.00  U      0.381   -1.000  usec/KB

       Performance counter stats for './netperf -H 7.7.7.84 -t omni -l 20 -c -i10,3':

               2,675,818 context-switches

           200.029651391 seconds time elapsed

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Neal Cardwell <ncardwell@google.com>
    Cc: Yuchung Cheng <ycheng@google.com>
    Acked-By: Yuchung Cheng <ycheng@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
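From user space, the per-socket variant described above could be used roughly as follows. This is a minimal sketch, assuming a Linux kernel that includes this patch; the helper names are made up for the example, and the numeric fallback 25 is the option value from the Linux UAPI headers, used only when the Python build does not export `socket.TCP_NOTSENT_LOWAT`. The second function is a simplified model of the poll() condition quoted in the message, not kernel code.

```python
import socket

# Fallback value is an assumption valid for Linux only.
TCP_NOTSENT_LOWAT = getattr(socket, "TCP_NOTSENT_LOWAT", 25)


def limit_unsent_bytes(sock, limit=128 * 1024):
    """Ask the kernel to cap not-yet-sent bytes queued on `sock`.

    128 KB is the value the commit message suggests for most workloads.
    """
    sock.setsockopt(socket.IPPROTO_TCP, TCP_NOTSENT_LOWAT, limit)


def reports_pollout(unsent_bytes, notsent_lowat):
    """Simplified model: POLLOUT only while unsent bytes are below the limit."""
    return unsent_bytes < notsent_lowat
```

An application would call `limit_unsent_bytes(sock)` once after creating the socket and then rely on poll()/select()/epoll() as usual; the trade-off named in the message (more wakeups and context switches in exchange for less kernel memory) applies.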
* net: add sk_stream_is_writeable() helper    Eric Dumazet    2013-07-24    7    -8/+12

    Several call sites use the following hardcoded condition:

    sk_stream_wspace(sk) >= sk_stream_min_wspace(sk)

    Let's use a helper because TCP_NOTSENT_LOWAT support will change this condition for TCP sockets.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Neal Cardwell <ncardwell@google.com>
    Cc: Yuchung Cheng <ycheng@google.com>
    Acked-by: Neal Cardwell <ncardwell@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
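The refactoring can be pictured with a small user-space model. The condition itself is the one quoted in the message; the two formulas for free space and minimum free space are assumptions about how those helpers were defined at the time (send buffer minus queued bytes, and half of the queued bytes), and all Python names are invented for this sketch.

```python
def stream_wspace(sndbuf, wmem_queued):
    """Free space left in the send buffer (assumed definition)."""
    return sndbuf - wmem_queued


def stream_min_wspace(wmem_queued):
    """Minimum free space required before waking writers (assumed definition)."""
    return wmem_queued >> 1


def stream_is_writeable(sndbuf, wmem_queued):
    """Single place that holds the condition formerly repeated at call sites."""
    return stream_wspace(sndbuf, wmem_queued) >= stream_min_wspace(wmem_queued)
```

With the test in one helper, a later change such as the TCP_NOTSENT_LOWAT check only has to touch that helper rather than every call site.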
* net: sctp: trivial: add uapi/linux/sctp.h into maintainers    Daniel Borkmann    2013-07-24    1    -0/+1

    After this file has moved to the uapi section, we also need to update this in the maintainers file.

    Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
    Acked-by: Neil Horman <nhorman@tuxdriver.com>
    Acked-by: Vlad Yasevich <vyasevich@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* net: sctp: trivial: update mailing list address    Daniel Borkmann    2013-07-24    38    -42/+39

    The SCTP mailing list address to send patches or questions to is linux-sctp@vger.kernel.org and not lksctp-developers@lists.sourceforge.net anymore. Therefore, update all occurrences.

    Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
    Acked-by: Neil Horman <nhorman@tuxdriver.com>
    Acked-by: Vlad Yasevich <vyasevich@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* drivers: net: cpsw: add support to show hw stats via ethtool    Mugunthan V N    2013-07-24    1    -2/+200

    Add support to show CPSW hardware statistics to the user via ethtool, so the user can find out whether any errors were reported by the hardware or whether the system is overloaded during high data rate transfers.

    Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* bonding: Fixed up a error "do not initialise statics to 0 or NULL" in bond_main.c    dingtianhong    2013-07-24    1    -1/+1

    The error is found by the checkpatch.pl tools.

    Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
    Cc: Jay Vosburgh <fubar@us.ibm.com>
    Cc: Andy Gospodarek <andy@greyhouse.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* bonding: add rtnl protection for bonding_store_fail_over_mac    dingtianhong    2013-07-24    1    -4/+11

    We need rtnl protection while reading slave_cnt and updating the .fail_over_mac, and it also follows the logic "don't change anything slave-related without rtnl". :)

    Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
    Cc: Jay Vosburgh <fubar@us.ibm.com>
    Cc: Andy Gospodarek <andy@greyhouse.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* bonding: bond_sysfs.c checkpatch cleanup    dingtianhong    2013-07-24    1    -4/+2

    net/bonding/bond_sysfs.c:1302: ERROR: else should follow close brace '}'
    net/bonding/bond_sysfs.c:1314: ERROR: else should follow close brace '}'

    Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
    Cc: Jay Vosburgh <fubar@us.ibm.com>
    Cc: Andy Gospodarek <andy@greyhouse.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* bonding: don't call slave_xxx_netpoll under spinlocks    dingtianhong    2013-07-24    1    -12/+3

    The slave_xxx_netpoll will call synchronize_rcu_bh(), so the function may schedule and sleep; it shouldn't be called under spinlocks.

    bond_netpoll_setup() and bond_netpoll_cleanup() are always protected by the rtnl lock, so there is no need to take the read lock, as the slave list can't be changed outside the rtnl lock.

    Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
    Cc: Jay Vosburgh <fubar@us.ibm.com>
    Cc: Andy Gospodarek <andy@greyhouse.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* drivers/net: enic: Move ethtool code to a separate file    Neel Patel    2013-07-24    5    -270/+309

    This patch moves all enic ethtool hooks from enic_main.c to a new file enic_ethtool.c

    Signed-off-by: Neel Patel <neepatel@cisco.com>
    Signed-off-by: Christian Benvenuti <benve@cisco.com>
    Signed-off-by: Nishank Trivedi <nistrive@cisco.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* net: trans_rdma: remove unused function    Andi Shyti    2013-07-24    1    -11/+0

    This patch gets rid of the following warning:

      net/9p/trans_rdma.c:594:12: warning: ‘rdma_cancelled’ defined but not used [-Wunused-function]
       static int rdma_cancelled(struct p9_client *client, struct p9_req_t *req)

    The rdma_cancelled function is not called anywhere in the kernel

    Signed-off-by: Andi Shyti <andi@etezian.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'be2net'    David S. Miller    2013-07-24    3    -120/+218

    Sathya Perla says:

    ====================
    The following patches are mostly for providing MAC filtering ability for VFs. Pls apply. Thanks!
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>
| * be2net: delete primary MAC address while unloading    Sathya Perla    2013-07-24    1    -3/+5

    Currently the UC-list is being deleted from the HW MAC table, but the primary MAC is not.

    Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * be2net: use SET/GET_MAC_LIST for SH-R    Sathya Perla    2013-07-24    3    -36/+50

    On SH-R and Lancer-R, GET_MAC_LIST cmd is better supported (instead of NTWK_MAC_QUERY cmd) to query provisioned MAC addresses. Similarly, (on SH-R and Lancer-R) SET_MAC_LIST must be used by the PF to provision a permanent MAC address to the VF.

    Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * be2net: refactor MAC-addr setup code    Sathya Perla    2013-07-24    3    -58/+48

    The code to configure the permanent MAC in be_setup() has become quite complicated, with different FW cmds being used for BEx, SH-R and Lancer. Simplify the logic by moving some of this complexity to be_cmds.c. This makes the code in be_setup() a little more readable.

    Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * be2net: fix pmac_id for BE3 VFs    Sathya Perla    2013-07-24    1    -0/+4

    For BE3 VFs, the permanent MAC is added by its PF. The VF can retrieve its pmac_id only via the IFACE_CREATE cmd. This is not true for Lancer and SH-R VFs, which get the pmac_id by issuing an ADD_IFACE_MAC cmd. So, use this hack only for BE3 VFs.

    Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * be2net: allow VFs to program MAC and VLAN filters    Sathya Perla    2013-07-24    3    -0/+53

    In the current design VFs were not allowed to program MAC/VLAN filters. Only the PF driver was allowed to configure/provision MAC and transparent VLANs to a VF. Change this to support MAC/VLAN filtering on a VF by a VM.

    Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * be2net: fix MAC address modification for VF    Sathya Perla    2013-07-24    3    -38/+73

    Currently, the VFs by default don't have the privilege to modify MAC address. This will change in a subsequent fix wherein VFs will have the ability to modify MAC/VLAN filters. Fix be_mac_addr_set() logic to support MAC address modification on a privileged VF too.

    Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* sh_eth: Add support for r8a7790 SoC    Simon Horman    2013-07-24    2    -1/+22

    This is a copy of support for r8a7778/9 with the .rmiimode mode bit of struct sh_eth_cpu_data set. Also update R8A7779 to R8A777x.

    Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* sh_eth: add support for RMIIMODE register    Simon Horman    2013-07-24    2    -0/+6

    This register is present on the r8a7790 SoC.

    Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* net: ipv6 eliminate parameter "int addrlen" in function fib6_add_1    fan.du    2013-07-24    1    -8/+8

    The "int addrlen" in fib6_add_1 is redundant, as we can get it from parameter "struct in6_addr *addr" once we modify its type. Also fix some coding style issues in fib6_add_1.

    Signed-off-by: Fan Du <fan.du@windriver.com>
    Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* net ipv6: Remove rebundant rt6i_nsiblings initialization    fan.du    2013-07-24    1    -1/+0

    Setting rt->rt6i_nsiblings to zero is redundant, because the memset above zeroed the rest of rt excluding the first dst member.

    Signed-off-by: Fan Du <fan.du@windriver.com>
    Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
    Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
    Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
    Cc: James Morris <jmorris@namei.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* vti: switch to new ip tunnel code    Amerigo Wang    2013-07-23    1    -476/+52

    GRE tunnel and IPIP tunnel already switched to the new ip tunnel code, VTI tunnel can use it too.

    Cc: Pravin B Shelar <pshelar@nicira.com>
    Cc: Stephen Hemminger <stephen@networkplumber.org>
    Cc: Saurabh Mohan <saurabh.mohan@vyatta.com>
    Cc: "David S. Miller" <davem@davemloft.net>
    Signed-off-by: Cong Wang <amwang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* ip6mr: change the prototype of ip6_mr_forward().    Rami Rosen    2013-07-23    1    -6/+5

    This patch changes the prototype of the ip6_mr_forward() method to return void instead of int.

    The ip6_mr_forward() method always returns 0; moreover, the return value of this method is not checked anywhere.

    Signed-off-by: Rami Rosen <ramirose@gmail.com>
    Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* ipmr: change the prototype of ip_mr_forward().    Rami Rosen    2013-07-23    1    -8/+7

    This patch changes the prototype of the ip_mr_forward() method to return void instead of int.

    The ip_mr_forward() method always returns 0; moreover, the return value of this method is not checked anywhere.

    Signed-off-by: Rami Rosen <ramirose@gmail.com>
    Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'team' ("add support for peer notifications and igmp rejoins for team")    David S. Miller    2013-07-23    8    -43/+245

    Jiri Pirko says:

    ====================
    The middle patch adjusts core infrastructure so the bonding code can be generalized and reused by team.

    v1->v2: using msecs_to_jiffies() as suggested by Eric

    Jiri Pirko (3):
      team: add peer notification
      net: convert resend IGMP to notifier event
      team: add support for sending multicast rejoins
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>
| * team: add support for sending multicast rejoins    Jiri Pirko    2013-07-23    2    -0/+91

    Similar to what is implemented in bonding. The user is able to ask the team driver to send IGMP rejoins when a port is enabled or disabled. This uses the previously introduced netdev notifier.

    Signed-off-by: Jiri Pirko <jiri@resnulli.us>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: convert resend IGMP to notifier event    Jiri Pirko    2013-07-23    7    -42/+60

    Until now, bond_resend_igmp_join_requests() manually looks for vlans attached to the bonding device and for a bridge where the bonding device acts as a port. It does not take care of other scenarios, like stacked bonds or a team device above.

    Make this more generic and use a netdev notifier to propagate the event to upper devices and to actually call ip_mc_rejoin_groups().

    Signed-off-by: Jiri Pirko <jiri@resnulli.us>
    Acked-by: Veaceslav Falico <vfalico@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * team: add peer notification    Jiri Pirko    2013-07-23    2    -1/+94

    When a port is enabled or disabled, allow notifying peers by unsolicited NAs or gratuitous ARPs. Disabled by default.

    Signed-off-by: Jiri Pirko <jiri@resnulli.us>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* macvlan fdb replace support    Thomas Richter    2013-07-23    1    -0/+3

    Add support for iproute2 command 'bridge fdb replace ...'. The rtnetlink callback function ndo_fdb_add will be called with the NLM_F_REPLACE flag set. Simply return -EOPNOTSUPP.

    Resubmitted because net-next was closed last week.

    Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* vxlan fdb replace an existing entry    Thomas Richter    2013-07-23    1    -0/+38

    Add support to replace an existing entry found in the vxlan fdb database. The entry in question is identified by its unicast mac address and the destination information is changed. If the entry is not found, it is added in the forwarding database. This is similar to changing an entry in the neighbour table.

    Multicast mac addresses can not be changed with the replace option.

    This is useful for virtual machine migration when the destination of a target virtual machine changes. The replace feature can be used instead of delete followed by add.

    Resubmitted because net-next was closed last week.

    Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
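The replace semantics described above can be modelled with a dictionary keyed by MAC address. This is a toy sketch of the behaviour stated in the message (update if present, add if absent, refuse multicast addresses); the function names, the string return values and the use of `ValueError` are choices made for this example and do not reflect the driver's data structures or error codes.

```python
def is_multicast_mac(mac):
    """A MAC address is multicast when the low bit of its first octet is set."""
    return bool(int(mac.split(":")[0], 16) & 1)


def fdb_replace(fdb, mac, remote_ip):
    """Point `mac` at `remote_ip`, adding the entry if it does not exist.

    `fdb` is a plain dict standing in for the forwarding database.
    """
    if is_multicast_mac(mac):
        raise ValueError("multicast entries cannot be replaced")
    existed = mac in fdb
    fdb[mac] = remote_ip  # one step instead of delete followed by add
    return "replaced" if existed else "added"
```

For the migration case mentioned in the message, the caller would issue a single replace with the new destination once the virtual machine has moved, so there is no window in which the entry is missing.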
* Merge branch 'tcp'    David S. Miller    2013-07-22    5    -80/+57

    Yuchung Cheng says:

    ====================
    This patch series improves RTT sampling in three ways:

    1. Sample RTT during fast recovery and reordering events.
    2. Favor ack-based RTT to timestamps because of broken TS ECR fields
    3. Consolidate the RTT measurement logic.
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>
| * tcp: use RTT from SACK for RTO    Yuchung Cheng    2013-07-22    1    -9/+14

    If RTT is not available because Karn's check has failed or no new packet is acked, use the RTT measured from SACK to estimate the RTO. The sender can continue to estimate the RTO during loss recovery or reordering event upon receiving non-partial ACKs.

    This also changes when the RTO is re-armed. Previously it was only re-armed when some data was cumulatively acknowledged (i.e., SND.UNA advances), but now it is re-armed whenever the RTT estimator is updated. This feature is particularly useful to reduce spurious timeouts for buffer bloat including cellular carriers [1], and RTT estimation on reordering events.

    [1] "An In-depth Study of LTE: Effect of Network Protocol and Application Behavior on Performance", In Proc. of SIGCOMM 2013

    Signed-off-by: Yuchung Cheng <ycheng@google.com>
    Acked-by: Neal Cardwell <ncardwell@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
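To see why every additional RTT sample matters for the RTO, here is a minimal estimator in the style of RFC 6298 (smoothed RTT plus four times the RTT variance). It is a textbook sketch, not the kernel's implementation: the class name is invented, times are plain milliseconds, and clock granularity and minimum/maximum RTO clamping are deliberately left out.

```python
class RtoEstimator:
    """RFC 6298 style RTO computation; all times in milliseconds."""

    ALPHA = 1 / 8  # gain applied to the smoothed RTT
    BETA = 1 / 4   # gain applied to the RTT variance

    def __init__(self):
        self.srtt = None
        self.rttvar = None

    def update(self, sample_ms):
        """Feed one RTT sample (from an ACK or a SACK) and return the new RTO."""
        if self.srtt is None:
            # First measurement initialises both estimators.
            self.srtt = sample_ms
            self.rttvar = sample_ms / 2
        else:
            # Variance is updated first, using the previous SRTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - sample_ms)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample_ms
        return self.rto()

    def rto(self):
        return self.srtt + 4 * self.rttvar
```

For example, a first sample of 100 ms gives an RTO of 300 ms, and a following sample of 200 ms raises it to 362.5 ms. If samples stop arriving during loss recovery, the estimate stays frozen at its last value, which is the situation the SACK-derived samples are meant to avoid.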
| * tcp: measure RTT from new SACK    Yuchung Cheng    2013-07-22    1    -6/+17

    Take an RTT sample if an ACK selectively acks some sequences that have never been retransmitted. Karn's algorithm does not apply even if that ACK (s)acks other retransmitted sequences, because it must have been generated by an original but perhaps out-of-order packet. There is no ambiguity. In case multiple blocks are newly sacked because of ACK losses, the earliest block is used to measure RTT, similar to cumulative ACKs.

    Such RTT samples allow the sender to estimate the RTO during loss recovery and packet reordering events. It is still useful even with TCP timestamps. That's because during these events the SND.UNA may not advance, preventing RTT samples from TS ECR (thus the FLAG_ACKED check before calling tcp_ack_update_rtt()). Therefore this new RTT source is complementary to existing ACK and TS RTT mechanisms.

    This patch does not update the RTO. It is done in the next patch.

    Signed-off-by: Yuchung Cheng <ycheng@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
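The sampling rule in this message can be sketched as a small function. It assumes the caller already knows, for each segment newly covered by SACK blocks in the current ACK, when it was sent and whether it was ever retransmitted; that input format and the function name are inventions for this example.

```python
def sack_rtt_sample(now_ms, newly_sacked):
    """Return an RTT sample in ms, or None if no unambiguous sample exists.

    `newly_sacked` is an iterable of (sent_time_ms, was_retransmitted)
    pairs for segments newly covered by SACK blocks in this ACK.
    """
    # Only never-retransmitted segments give an unambiguous send time.
    eligible = [sent for sent, retransmitted in newly_sacked if not retransmitted]
    if not eligible:
        return None
    # Several blocks may be newly sacked at once if earlier ACKs were lost;
    # the earliest-sent one is used.
    return now_ms - min(eligible)
```

Retransmitted segments are skipped because the ACK could refer to either transmission, which is the ambiguity Karn's algorithm guards against.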
| * tcp: prefer packet timing to TS-ECR for RTT    Yuchung Cheng    2013-07-22    2    -50/+18

    Prefer packet timings to TS-ecr for RTT measurements when both sources are available. That's because broken middle-boxes and remote peer can return packets with corrupted TS ECR fields. Similarly most congestion controls that require RTT signals favor timing-based sources as well. Also check for bad TS ECR values to avoid RTT blow-ups. It has happened on production Web servers.

    Signed-off-by: Yuchung Cheng <ycheng@google.com>
    Acked-by: Neal Cardwell <ncardwell@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
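The preference order can be expressed as a short selection function. This is a sketch under stated assumptions: `None` stands for "source not available", a negative value is treated as the sign of a bad timestamp echo, and the function name is hypothetical. The message does not spell out exactly which TS ECR values count as bad, so the validity check here is only illustrative.

```python
def choose_rtt_sample(packet_timing_rtt, ts_ecr_rtt):
    """Pick one RTT sample in ms, preferring packet timing over TS ECR.

    Either argument may be None when that source is unavailable.
    """
    if packet_timing_rtt is not None and packet_timing_rtt >= 0:
        return packet_timing_rtt
    # Fall back to the timestamp echo only if it looks sane
    # (assumed check: a negative RTT indicates a corrupted TS ECR field).
    if ts_ecr_rtt is not None and ts_ecr_rtt >= 0:
        return ts_ecr_rtt
    return None
```

Returning `None` lets the caller skip the estimator update entirely instead of feeding it a nonsensical value.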
| * tcp: consolidate SYNACK RTT sampling    Yuchung Cheng    2013-07-22    5    -21/+14

    The first patch consolidates SYNACK and other RTT measurement to use a central function tcp_ack_update_rtt(). A (small) bonus is that SYNACK RTT measurement now happens after the PAWS check, potentially reducing the impact of RTO seeding on bad TCP timestamp values.

    Signed-off-by: Yuchung Cheng <ycheng@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>