summaryrefslogtreecommitdiffstats
path: root/drivers/net/bonding/bond_main.c
Commit message (Collapse)AuthorAgeFilesLines
* bonding: Support macvlans on top of tlb/rlb mode bondsVlad Yasevich2014-06-041-3/+3
| | | | | | | | | | | | | | | | To make TLB mode work, the patch allows learning packets to be sent using mac addresses assigned to macvlan devices, also taking into an account vlans that may be between the bond and macvlan device. To make RLB work, all we have to do is accept ARP packets for addresses added to the bond dev->uc list. Since RLB mode will take care to update the peers directly with correct mac addresses, learning packets for these addresses do not have be send to switch. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bonding: Turn on IFF_UNICAST_FLT on bond devicesVlad Yasevich2014-06-041-1/+1
| | | | | | | | | | | | | | | Bonding devices manage the unicast filters of the underlying interfaces, but do not turn on IFF_UNICAST_FLT flag. Thus anytime a unicast address is added to the bond, the bond is places in promiscuous mode. Turn on IFF_UNICAST_FLT on the bond device so that the bond does not go into promiscuous mode needlesly. If an underlying device does not support unicast filtering, that device will automaticall enter promiscuous mode already. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-05-241-69/+65
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/bonding/bond_alb.c drivers/net/ethernet/altera/altera_msgdma.c drivers/net/ethernet/altera/altera_sgdma.c net/ipv6/xfrm6_output.c Several cases of overlapping changes. The xfrm6_output.c has a bug fix which overlaps the renaming of skb->local_df to skb->ignore_df. In the Altera TSE driver cases, the register access cleanups in net-next overlapped with bug fixes done in net. Similarly a bug fix to send ALB packets in the bonding driver using the right source address overlaps with cleanups in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>
| * bonding: Fix stacked device detection in arp monitoringVlad Yasevich2014-05-161-69/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prior to commit fbd929f2dce460456807a51e18d623db3db9f077 bonding: support QinQ for bond arp interval the arp monitoring code allowed for proper detection of devices stacked on top of vlans. Since the above commit, the code can still detect a device stacked on top of single vlan, but not a device stacked on top of Q-in-Q configuration. The search will only set the inner vlan tag if the route device is the vlan device. However, this is not always the case, as it is possible to extend the stacked configuration. With this patch it is possible to provision devices on top Q-in-Q vlan configuration that should be used as a source of ARP monitoring information. For example: ip link add link bond0 vlan10 type vlan proto 802.1q id 10 ip link add link vlan10 vlan100 type vlan proto 802.1q id 100 ip link add link vlan100 type macvlan Note: This patch limites the number of stacked VLANs to 2, just like before. The original, however had another issue in that if we had more then 2 levels of VLANs, we would end up generating incorrectly tagged traffic. This is no longer possible. Fixes: fbd929f2dce460456807a51e18d623db3db9f077 (bonding: support QinQ for bond arp interval) CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@redhat.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: Ding Tianhong <dingtianhong@huawei.com> CC: Patric McHardy <kaber@trash.net> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: populate essential new_slave->bond/dev earlyVeaceslav Falico2014-05-221-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new bond_free_slave() needs new_slave->bond to verify if additional structures were allocated, so populate it early so that, in case of failure in bond_enslave(), we would be able to get it. Also populate the new_slave->dev field, as it's too one of the most needed things to assign early. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Acked-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: fix vlan_features computingMichal Kubeček2014-05-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | bond_compute_features() uses netdev_increment_features() to combine vlan_features of slaves into vlan_features of the bond. As netdev_increment_features() only adds most features and we start with BOND_VLAN_FEATURES, we can end up with features none of the slaves provided. If there is at least one slave, initialize vlan_features only with the flags in NETIF_F_ALL_FOR_ALL. Right now there is none in BOND_VLAN_FEATURES but stating it explicitely will make the code more future proof. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: replace SLAVE_IS_OK() with bond_slave_can_tx()Veaceslav Falico2014-05-161-2/+2
| | | | | | | | | | | | | | | | | | | | They're verifying the same thing (except of IFF_UP, which is implied for netif_running(), which is also a prerequisite). CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: rename {, bond_}slave_can_tx and clean it upVeaceslav Falico2014-05-161-4/+4
| | | | | | | | | | | | | | CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: convert IS_UP(slave->dev) to inline functionVeaceslav Falico2014-05-161-8/+8
| | | | | | | | | | | | | | | | | | | | Also, remove the IFF_UP verification cause we can't be netif_running() with being also IFF_UP. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: make IS_IP_TARGET_UNUSABLE_ADDRESS an inline functionVeaceslav Falico2014-05-161-1/+1
| | | | | | | | | | | | | | | | | | Also, use standard IP primitives to check the address. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: create a macro for bond mode and use itVeaceslav Falico2014-05-161-33/+33
| | | | | | | | | | | | | | CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: make USES_PRIMARY inline functionsVeaceslav Falico2014-05-161-19/+19
| | | | | | | | | | | | | | | | | | | | | | Change the name a bit to better reflect its scope, and update some comments. Two functions added - one which takes bond as a param and the other which takes the mode. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: make BOND_NO_USES_ARP an inline functionVeaceslav Falico2014-05-161-1/+1
| | | | | | | | | | | | | | | | | | | | Also, change its name to better reflect its scope, and skip the "no" part. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: make TX_QUEUE_OVERRIDE() macro an inline functionVeaceslav Falico2014-05-161-4/+3
| | | | | | | | | | | | | | | | | | | | Also, make it accept bonding as a parameter and change the name a bit to better reflect its scope. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: alloc the structure ad_info dynamically in per slavedingtianhong2014-05-141-6/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | The struct ad_slave_info is very huge, and only be used for 802.3ad mode, so alloc the structure dynamically could save 356 Bits for every slave in non 802.3ad mode. Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Acked-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: simplify the slave_do_arp_validate_only()dingtianhong2014-05-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | The argument slave is not used for slave_do_arp_validate_only(), so no need to keep it, make the function more simple. Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: Add tlb_dynamic_lb parameter for tlb modeMahesh Bandewar2014-04-241-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The aggresive load balancing causes packet re-ordering as active flows are moved from a slave to another within the group. Sometime this aggresive lb is not necessary if the preference is for less re-ordering. This parameter if used with value "0" disables this dynamic flow shuffling minimizing packet re-ordering. Of course the side effect is that it has to live with the static load balancing that the hashing distribution provides. This impact is less severe if the correct xmit-hashing-policy is used for the tlb setup. The default value of the parameter is set to "1" mimicing the earlier behavior. Ran the netperf test with 200 stream for 1 min between two hosts with 4x1G trunk (xmit-lb mode with xmit-policy L3+4) before and after these changes. Following was the command used for those 200 instances - netperf -t TCP_RR -l 60 -s 5 -H <host> -- -r81920,81920 Transactions per second: Before change: 1,367.11 After change: 1,470.65 Change-Id: Ie3f75c77282cf602e83a6e833c6eb164e72a0990 Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: Added bond_tlb_xmit() for tlb mode.Mahesh Bandewar2014-04-241-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Re-organized the xmit function for the lb mode separating tlb xmit from the alb mode. This will enable use of the hashing policies like 802.3ad mode. Also extended use of xmit-hash-policy to tlb mode. Now the tlb-mode defaults to BOND_XMIT_POLICY_LAYER2 if the xmit policy module parameter is not set (just like 802.3ad, or Xor mode). Change-Id: I140257403d272df75f477b380207338d0f04963e Signed-off-by: Mahesh Bandewar <maheshb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: Changed hashing function to just provide hashMahesh Bandewar2014-04-241-6/+4
|/ | | | | | | | | | | Modified the hash function to return just hash separating from the modulo operation that can be performed by the caller. This is to make way for the tlb mode to use the same hashing policies that are used in the 802.3ad and Xor mode. Change-Id: I276609e87e0ca213c4d1b17b79c5e0b0f3d0dd6f Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bonding: Remove debug_fs files when module init failsThomas Richter2014-04-111-0/+1
| | | | | | | | | | | Remove the bonding debug_fs entries when the module initialization fails. The debug_fs entries should be removed together with all other already allocated resources. Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Signed-off-by: Jay Vosburgh <j.vosburgh@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bonding: Inactive slaves should keep inactive flag's valuezheng.li2014-04-041-1/+1
| | | | | | | | | | | | | | | | | | | bond_open is not setting the inactive flag correctly for some modes (alb and tlb), resulting in error behavior if the bond has been administratively set down and then back up. This effect should not occur when slaves are added while the bond is up; it's something that only happens after a down/up bounce of the bond. For example, in bond tlb or alb mode, domu send some ARP request which go out from dom0 bond's active slave, then the ARP broadcast request packets go back to inactive slave from switch, because the inactive slave's inactive flag is zero, kernel will receive the packets and pass them to bridge that cause dom0's bridge map domu's MAC address to port of bond, bridge should map domu's MAC to port of vif. Signed-off-by: Zheng Li <zheng.x.li@oracle.com> Signed-off-by: Jay Vosburgh <j.vosburgh@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netpoll: Remove gfp parameter from __netpoll_setupEric W. Biederman2014-03-291-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The gfp parameter was added in: commit 47be03a28cc6c80e3aa2b3e8ed6d960ff0c5c0af Author: Amerigo Wang <amwang@redhat.com> Date: Fri Aug 10 01:24:37 2012 +0000 netpoll: use GFP_ATOMIC in slave_enable_netpoll() and __netpoll_setup() slave_enable_netpoll() and __netpoll_setup() may be called with read_lock() held, so should use GFP_ATOMIC to allocate memory. Eric suggested to pass gfp flags to __netpoll_setup(). Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> The reason for the gfp parameter was removed in: commit c4cdef9b7183159c23c7302aaf270d64c549f557 Author: dingtianhong <dingtianhong@huawei.com> Date: Tue Jul 23 15:25:27 2013 +0800 bonding: don't call slave_xxx_netpoll under spinlocks The slave_xxx_netpoll will call synchronize_rcu_bh(), so the function may schedule and sleep, it should't be called under spinlocks. bond_netpoll_setup() and bond_netpoll_cleanup() are always protected by rtnl lock, it is no need to take the read lock, as the slave list couldn't be changed outside rtnl lock. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net> Nothing else that calls __netpoll_setup or ndo_netpoll_setup requires a gfp paramter, so remove the gfp parameter from both of these functions making the code clearer. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bonding: add net_ratelimt to avoid spam in arp intervaldingtianhong2014-03-261-7/+9
| | | | | | | | | | | | Remove the unnecessary log and add net_ratelimit to the others, in order to avoid spam the log. Cc: Joe Perches <joe@perches.com> Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bonding: support QinQ for bond arp intervaldingtianhong2014-03-261-18/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The bond send arp request to indicate that the slave is active, and if the bond dev is a vlan dev, it will set the vlan tag in skb to notice the vlan group, but the bond could only send a skb with 802.1q proto, not support for QinQ. So add outer tag for lower vlan tag and inner tag for upper vlan tag to support QinQ, The new skb will be consist of two vlan tag just like this: dst mac | src mac | outer vlan tag | inner vlan tag | data | ..... If We don't need QinQ, the inner vlan tag could be set to 0 and use outer vlan tag as a normal vlan group. Using "ip link" to configure the bond for QinQ and add test log: ip link add link bond0 bond0.20 type vlan proto 802.1ad id 20 ip link add link bond0.20 bond0.20.200 type vlan proto 802.1q id 200 ifconfig bond0.20 11.11.20.36/24 ifconfig bond0.20.200 11.11.200.36/24 echo +11.11.200.37 > /sys/class/net/bond0/bonding/arp_ip_target 90:e2:ba:07:4a:5c (oui Unknown) > Broadcast, ethertype 802.1Q-QinQ (0x88a8),length 50: vlan 20, p 0,ethertype 802.1Q, vlan 200, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 11.11.200.37 tell 11.11.200.36, length 28 90:e2:ba:06:f9:86 (oui Unknown) > 90:e2:ba:07:4a:5c (oui Unknown), ethertype 802.1Q-QinQ (0x88a8), length 50: vlan 20, p 0, ethertype 802.1Q, vlan 200, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Reply 11.11.200.37 is-at 90:e2:ba:06:f9:86 (oui Unknown), length 28 v1->v2: remove the comment "TODO: QinQ?". Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bonding: ratelimit pr_err() for bond xmit broadcastdingtianhong2014-03-261-2/+2
| | | | | | | It may spam if the system is out of the memory, add ratelimit for it. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bonding: slight optimization for bond xmit pathdingtianhong2014-03-261-3/+3
| | | | | | | | Add unlikely() micro to the unlikely conditions in the bond xmit path for slight optimization. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bonding: Call dev_kfree_skby_any instead of kfree_skb.Eric W. Biederman2014-03-121-5/+5
| | | | | | | | Replace kfree_skb with dev_kfree_skb_any in functions that can be called in hard irq and other contexts. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bonding: options handling cleanupstephen hemminger2014-03-061-1/+2
| | | | | | | | Make local functions static (ie. only used in bond_options.c) Make bond options parsing tables constant. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* bonding: remove dead codestephen hemminger2014-03-061-46/+0
| | | | | | | | | These functions are defined but no longer used. Compile tested only. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-03-051-54/+70
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/wireless/ath/ath9k/recv.c drivers/net/wireless/mwifiex/pcie.c net/ipv6/sit.c The SIT driver conflict consists of a bug fix being done by hand in 'net' (missing u64_stats_init()) whilst in 'net-next' a helper was created (netdev_alloc_pcpu_stats()) which takes care of this. The two wireless conflicts were overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
| * bonding: disallow enslaving a bond to itselfJiri Bohac2014-02-261-0/+5
| | | | | | | | | | | | | | | | Enslaving a bond to itself leads to an endless loop and hangs the kernel. Signed-off-by: Jiri Bohac <jbohac@suse.cz> Tested-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bonding: fix a div error caused by the slave release pathNikolay Aleksandrov2014-02-261-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's a bug in the slave release function which leads the transmit functions which use the bond->slave_cnt to a div by 0 because we might just have released our last slave and made slave_cnt == 0 but at the same time we may have a transmitter after the check for an empty list which will fetch it and use it in the slave id calculation. Fix it by moving the slave_cnt after synchronize_rcu so if this was our last slave any new transmitters will see an empty slave list which is checked after rcu lock but before calling the mode transmit functions which rely on bond->slave_cnt. Fixes: 278b208375 ("bonding: initial RCU conversion") CC: Veaceslav Falico <vfalico@redhat.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: Jay Vosburgh <fubar@us.ibm.com> CC: David S. Miller <davem@davemloft.net> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for ab arp monitordingtianhong2014-02-261-39/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Veaceslav has reported and fix this problem by commit f2ebd477f141bc0 (bonding: restructure locking of bond_ab_arp_probe()). According Jay's opinion, the current solution is not very well, because the notification is to indicate that the interface has actually changed state in a meaningful way, but these calls in the ab ARP monitor are internal settings of the flags to allow the ARP monitor to search for a slave to become active when there are no active slaves. The flag setting to active or backup is to permit the ARP monitor's response logic to do the right thing when deciding if the test slave (current_arp_slave) is up or not. So the best way to fix the problem is that we should not send a notification when the slave is in testing state, and check the state at the end of the monitor, if the slave's state recover, avoid to send pointless notification twice. And RTNL is really a big lock, hold it regardless the slave's state changed or not when the current_active_slave is null will loss performance (every 100ms), so we should hold it only when the slave's state changed and need to notify. I revert the old commit and add new modifications. Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for 802.3ad modedingtianhong2014-02-261-15/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The problem was introduced by the commit 1d3ee88ae0d (bonding: add netlink attributes to slave link dev). The bond_set_active_slave() and bond_set_backup_slave() will use rtmsg_ifinfo to send slave's states, so these two functions should be called in RTNL. In 802.3ad mode, acquiring RTNL for the __enable_port and __disable_port cases is difficult, as those calls generally already hold the state machine lock, and cannot unconditionally call rtnl_lock because either they already hold RTNL (for calls via bond_3ad_unbind_slave) or due to the potential for deadlock with bond_3ad_adapter_speed_changed, bond_3ad_adapter_duplex_changed, bond_3ad_link_change, or bond_3ad_update_lacp_rate. All four of those are called with RTNL held, and acquire the state machine lock second. The calling contexts for __enable_port and __disable_port already hold the state machine lock, and may or may not need RTNL. According to the Jay's opinion, I don't think it is a problem that the slave don't send notify message synchronously when the status changed, normally the state machine is running every 100 ms, send the notify message at the end of the state machine if the slave's state changed should be better. I fix the problem through these steps: 1). add a new function bond_set_slave_state() which could change the slave's state and call rtmsg_ifinfo() according to the input parameters called notify. 2). Add a new slave parameter which called should_notify, if the slave's state changed and don't notify yet, the parameter will be set to 1, and then if the slave's state changed again, the param will be set to 0, it indicate that the slave's state has been restored, no need to notify any one. 3). the __enable_port and __disable_port should not call rtmsg_ifinfo in the state machine lock, any change in the state of slave could set a flag in the slave, it will indicated that an rtmsg_ifinfo should be called at the end of the state machine. Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: send arp requests even if there's no route to themVeaceslav Falico2014-03-021-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we're only sending arp requests if we have a route to the target (and, thus, can find out the source ip address). There are some use cases, however, where we don't want/need to set an ip address (or set up a specific route) for bonding to use arp monitoring *for traffic generation*. We can easily send arp probes (arp requests with src ip == 0) to generate arp broadcast responses from the target ip and use them for determining if the target is up. This, obviously, won't work with arp validation - because we don't have the ip address set and, thus, will filter out the responses. So in that case - print a warning. CC: François CACHEREUL <f.cachereul@alphalink.fr> CC: Zhenjie Chen <zhchen@redhat.com> CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: remove no longer needed lock for bond_xxx_info_query()dingtianhong2014-02-241-4/+0
| | | | | | | | | | | | | | | | | | | | | | The bond_xxx_info_query() was already in RTNL, so no need to use bond lock to protect the bond slave list, so remove it. Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: netpoll: remove unwanted slave_dev_support_netpoll()dingtianhong2014-02-241-8/+0
| | | | | | | | | | | | | | | | | | | | | | | | The __netpoll_setup() will check the slave's flag and ndo_poll_controller just like the slave_dev_support_netpoll() does, and slave_dev_support_netpoll() was not used by any place, so remove it. Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: fix bond_arp_rcv() race of curr_active_slaveVeaceslav Falico2014-02-201-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | bond->curr_active_slave can be changed between its deferences, even to NULL, and thus we might panic. We're always holding the rcu (rx_handler->bond_handle_frame()->bond_arp_rcv()) so fix this by rcu_dereferencing() it and using the saved. Reported-by: Ding Tianhong <dingtianhong@huawei.com> Fixes: aeea64a ("bonding: don't trust arp requests unless active slave really works") CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Acked-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: More use of ether_addr_copyJoe Perches2014-02-191-3/+3
| | | | | | | | | | | | | | | | It's smaller and faster for some architectures. Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-02-191-1/+8
|\ \ | |/ | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/bonding/bond_3ad.h drivers/net/bonding/bond_main.c Two minor conflicts in bonding, both of which were overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
| * netdevice: add queue selection fallback handler for ndo_select_queueDaniel Borkmann2014-02-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new argument for ndo_select_queue() callback that passes a fallback handler. This gets invoked through netdev_pick_tx(); fallback handler is currently __netdev_pick_tx() as most drivers invoke this function within their customized implementation in case for skbs that don't need any special handling. This fallback handler can then be replaced on other call-sites with different queue selection methods (e.g. in packet sockets, pktgen etc). This also has the nice side-effect that __netdev_pick_tx() is then only invoked from netdev_pick_tx() and export of that function to modules can be undone. Suggested-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bonding: Fix deadlock in bonding driver when using netpolldingtianhong2014-02-131-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The bonding driver take write locks and spin locks that are shared by the tx path in enslave processing and notification processing, If the netconsole is in use, the bonding can call printk which puts us in the netpoll tx path, if the netconsole is attached to the bonding driver, result in deadlock. So add protection for these place, by checking the netpoll_block_tx state, we can defer the sending of the netconsole frames until a later time using the retransmit feature of netpoll_send_skb that is triggered on the return code NETDEV_TX_BUSY. Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: rename last_arp_rx to last_rxVeaceslav Falico2014-02-181-6/+6
| | | | | | | | | | | | | | | | | | To reflect the new meaning. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: trivial: rename slave->jiffies to ->last_link_upVeaceslav Falico2014-02-181-10/+10
| | | | | | | | | | | | | | | | | | | | slave->jiffies is updated every time the slave becomes active, which, for bonding, means that its link is 'up'. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: remove useless updating of slave->dev->last_rxVeaceslav Falico2014-02-181-3/+0
| | | | | | | | | | | | | | | | | | | | | | Now that all the logic is handled via last_arp_rx, we don't need to use last_rx. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: use last_arp_rx in bond_loadbalance_arp_mon()Veaceslav Falico2014-02-181-2/+2
| | | | | | | | | | | | | | | | | | | | Now that last_arp_rx correctly show the last time we've received an ARP, we can use it safely instead of slave->dev->last_rx. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: use the new options to correctly set last_arp_rxVeaceslav Falico2014-02-181-7/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that the options are in place - arp_validate can be set to receive all the traffic or only arp packets to verify if the slave is up, when the slave isn't validated. CC: Rob Landley <rob@landley.net> CC: "David S. Miller" <davem@davemloft.net> CC: Nikolay Aleksandrov <nikolay@redhat.com> CC: Ding Tianhong <dingtianhong@huawei.com> CC: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: always set recv_probe to bond_arp_rcv in arp monitorVeaceslav Falico2014-02-181-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we only set bond_arp_rcv() if we're using arp_validate, however this makes us skip updating last_arp_rx if we're not validating incoming ARPs - thus, if arp_validate is off, last_arp_rx will never be updated. Fix this by always setting up recv_probe = bond_arp_rcv, even if we're not using arp_validate. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: always update last_arp_rx on packet recieveVeaceslav Falico2014-02-181-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we're updating the last_arp_rx only when we've validate the packet, however afterwards we use it as 'ANY last packet received', but not only validated ARPs. Fix this by updating it in case of any packet received. It won't break the arp_validation=0 because we, anyway, return the correct slave->dev->last_rx in slave_last_rx(). CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: permit using arp_validate with non-ab modesVeaceslav Falico2014-02-181-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently it's disabled because it's sometimes hard, in typical configs, to make it work - because of the nature how the loadbalance modes work - as it's hard to deliver valid arp replies to correct slaves by the switch. However we still can use arp_validation in loadbalance with several other configs, per example with arp_validate == 2 for backup with one broadcast domain, without the switch(es) doing any balancing - this way we'd be (a bit more) sure that the slave is up. So, enable it to let users decide which one works/suits them best. Also correct the mode limitation from BOND_OPT_ARP_VALIDATE. CC: Nikolay Aleksandrov <nikolay@redhat.com> CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Acked-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
OpenPOWER on IntegriCloud