summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorJohan Hedberg <johan.hedberg@intel.com>2014-11-18 09:00:14 +0200
committerMarcel Holtmann <marcel@holtmann.org>2014-11-18 08:32:08 +0100
commit76727c02c1e14a2b561b806fa1d08acc1619ad27 (patch)
treefb8c465d0f10a2f50a9840a73becb1ea7442807e
parent38da1703060a520e69b00405f9bdf765d1396cd0 (diff)
downloadop-kernel-dev-76727c02c1e14a2b561b806fa1d08acc1619ad27.zip
op-kernel-dev-76727c02c1e14a2b561b806fa1d08acc1619ad27.tar.gz
Bluetooth: Call drain_workqueue() before resetting state
Doing things like hci_conn_hash_flush() while holding the hdev lock is risky since its synchronous pending work cancellation could cause the L2CAP layer to try to reacquire the hdev lock. Right now there doesn't seem to be any obvious places where this would for certain happen but it's already enough to cause lockdep to start warning against the hdev and the work struct locks being taken in the "wrong" order: [ +0.000373] mgmt-tester/1603 is trying to acquire lock: [ +0.000292] ((&conn->pending_rx_work)){+.+.+.}, at: [<c104266d>] flush_work+0x0/0x181 [ +0.000270] but task is already holding lock: [ +0.000000] (&hdev->lock){+.+.+.}, at: [<c13b9a80>] hci_dev_do_close+0x166/0x359 [ +0.000000] which lock already depends on the new lock. [ +0.000000] the existing dependency chain (in reverse order) is: [ +0.000000] -> #1 (&hdev->lock){+.+.+.}: [ +0.000000] [<c105ea8f>] lock_acquire+0xe3/0x156 [ +0.000000] [<c140c663>] mutex_lock_nested+0x54/0x375 [ +0.000000] [<c13d644b>] l2cap_recv_frame+0x293/0x1a9c [ +0.000000] [<c13d7ca4>] process_pending_rx+0x50/0x5e [ +0.000000] [<c1041a3f>] process_one_work+0x21c/0x436 [ +0.000000] [<c1041e3d>] worker_thread+0x1be/0x251 [ +0.000000] [<c1045a22>] kthread+0x94/0x99 [ +0.000000] [<c140f801>] ret_from_kernel_thread+0x21/0x30 [ +0.000000] -> #0 ((&conn->pending_rx_work)){+.+.+.}: [ +0.000000] [<c105e158>] __lock_acquire+0xa07/0xc89 [ +0.000000] [<c105ea8f>] lock_acquire+0xe3/0x156 [ +0.000000] [<c1042696>] flush_work+0x29/0x181 [ +0.000000] [<c1042864>] __cancel_work_timer+0x76/0x8f [ +0.000000] [<c104288c>] cancel_work_sync+0xf/0x11 [ +0.000000] [<c13d4c18>] l2cap_conn_del+0x72/0x183 [ +0.000000] [<c13d8953>] l2cap_disconn_cfm+0x49/0x55 [ +0.000000] [<c13be37a>] hci_conn_hash_flush+0x7a/0xc3 [ +0.000000] [<c13b9af6>] hci_dev_do_close+0x1dc/0x359 [ +0.012038] [<c13bbe38>] hci_unregister_dev+0x6e/0x1a3 [ +0.000000] [<c12d33c1>] vhci_release+0x28/0x47 [ +0.000000] [<c10dd6a9>] __fput+0xd6/0x154 [ +0.000000] [<c10dd757>] ____fput+0xd/0xf [ +0.000000] [<c1044bb2>] task_work_run+0x6b/0x8d [ +0.000000] [<c1001bd2>] do_notify_resume+0x3c/0x3f [ +0.000000] [<c140fa70>] work_notifysig+0x29/0x31 [ +0.000000] other info that might help us debug this: [ +0.000000] Possible unsafe locking scenario: [ +0.000000] CPU0 CPU1 [ +0.000000] ---- ---- [ +0.000000] lock(&hdev->lock); [ +0.000000] lock((&conn->pending_rx_work)); [ +0.000000] lock(&hdev->lock); [ +0.000000] lock((&conn->pending_rx_work)); [ +0.000000] *** DEADLOCK *** Fully fixing this would require some quite heavy refactoring to change how the hdev lock and hci_conn instances are handled together. A simpler solution for now which this patch takes is to try ensure that the hdev workqueue is empty before proceeding with the various cleanup calls, including hci_conn_hash_flush(). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
-rw-r--r--net/bluetooth/hci_core.c10
1 files changed, 10 insertions, 0 deletions
diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index bbefb4e..d786958 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -2561,6 +2561,11 @@ static int hci_dev_do_close(struct hci_dev *hdev)
if (test_bit(HCI_MGMT, &hdev->dev_flags))
cancel_delayed_work_sync(&hdev->rpa_expired);
+ /* Avoid potential lockdep warnings from the *_flush() calls by
+ * ensuring the workqueue is empty up front.
+ */
+ drain_workqueue(hdev->workqueue);
+
hci_dev_lock(hdev);
hci_inquiry_cache_flush(hdev);
hci_pend_le_actions_clear(hdev);
@@ -2684,6 +2689,11 @@ int hci_dev_reset(__u16 dev)
skb_queue_purge(&hdev->rx_q);
skb_queue_purge(&hdev->cmd_q);
+ /* Avoid potential lockdep warnings from the *_flush() calls by
+ * ensuring the workqueue is empty up front.
+ */
+ drain_workqueue(hdev->workqueue);
+
hci_dev_lock(hdev);
hci_inquiry_cache_flush(hdev);
hci_conn_hash_flush(hdev);
OpenPOWER on IntegriCloud