summaryrefslogtreecommitdiffstats
path: root/drivers/net/bonding/bonding.h
diff options
context:
space:
mode:
authorJay Vosburgh <fubar@us.ibm.com>2011-10-28 15:42:50 +0000
committerDavid S. Miller <davem@davemloft.net>2011-10-30 03:13:14 -0400
commite6d265e8504ab4a3368b8645d318b344ee88b280 (patch)
tree6fd2bc16819bae491e56be2658fc3298b7efa92c /drivers/net/bonding/bonding.h
parent10ee0faed92c8af4baebd633372136a6608a41ea (diff)
downloadop-kernel-dev-e6d265e8504ab4a3368b8645d318b344ee88b280.zip
op-kernel-dev-e6d265e8504ab4a3368b8645d318b344ee88b280.tar.gz
bonding: eliminate bond_close race conditions
This patch resolves two sets of race conditions. Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com> reported the first, as follows: The bond_close() calls cancel_delayed_work() to cancel delayed works. It, however, cannot cancel works that were already queued in workqueue. The bond_open() initializes work->data, and proccess_one_work() refers get_work_cwq(work)->wq->flags. The get_work_cwq() returns NULL when work->data has been initialized. Thus, a panic occurs. He included a patch that converted the cancel_delayed_work calls in bond_close to flush_delayed_work_sync, which eliminated the above problem. His patch is incorporated, at least in principle, into this patch. In this patch, we use cancel_delayed_work_sync in place of flush_delayed_work_sync, and also convert bond_uninit in addition to bond_close. This conversion to _sync, however, opens new races between bond_close and three periodically executing workqueue functions: bond_mii_monitor, bond_alb_monitor and bond_activebackup_arp_mon. The race occurs because bond_close and bond_uninit are always called with RTNL held, and these workqueue functions may acquire RTNL to perform failover-related activities. If bond_close or bond_uninit is waiting in cancel_delayed_work_sync, deadlock occurs. These deadlocks are resolved by having the workqueue functions acquire RTNL conditionally. If the rtnl_trylock() fails, the functions reschedule and return immediately. For the cases that are attempting to perform link failover, a delay of 1 is used; for the other cases, the normal interval is used (as those activities are not as time critical). Additionally, the bond_mii_monitor function now stores the delay in a variable (mimicing the structure of activebackup_arp_mon). Lastly, all of the above renders the kill_timers sentinel moot, and therefore it has been removed. Tested-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'drivers/net/bonding/bonding.h')
-rw-r--r--drivers/net/bonding/bonding.h1
1 files changed, 0 insertions, 1 deletions
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 82fec5f..1aecc37 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -222,7 +222,6 @@ struct bonding {
struct slave *);
rwlock_t lock;
rwlock_t curr_slave_lock;
- s8 kill_timers;
u8 send_peer_notif;
s8 setup_by_slave;
s8 igmp_retrans;
OpenPOWER on IntegriCloud