summaryrefslogtreecommitdiffstats
path: root/kernel/watchdog.c
diff options
context:
space:
mode:
authorUlrich Obergfell <uobergfe@redhat.com>2015-11-05 18:44:27 -0800
committerLinus Torvalds <torvalds@linux-foundation.org>2015-11-05 19:34:48 -0800
commitd283c640cee6472852b95036ddd512c2ba0c1139 (patch)
tree85a91987682e4dae5280318aef1452c18ce1b61a /kernel/watchdog.c
parent451637e454f0b41689cd07cdc3fa53388c22890d (diff)
downloadop-kernel-dev-d283c640cee6472852b95036ddd512c2ba0c1139.zip
op-kernel-dev-d283c640cee6472852b95036ddd512c2ba0c1139.tar.gz
watchdog: fix error handling in proc_watchdog_thresh()
The original watchdog_park_threads() function that was introduced by commit 81a4beef91ba ("watchdog: introduce watchdog_park_threads() and watchdog_unpark_threads()") takes a very simple approach to handle errors returned by kthread_park(): It attempts to roll back all watchdog threads to the unparked state. However, this may be undesired behaviour from the perspective of the caller which may want to handle errors as appropriate in its specific context. Currently, there are two possible call chains: - watchdog suspend/resume interface lockup_detector_suspend watchdog_park_threads - write to parameters in /proc/sys/kernel proc_watchdog_update watchdog_enable_all_cpus update_watchdog_all_cpus watchdog_park_threads Instead of 'blindly' attempting to unpark the watchdog threads if a kthread_park() call fails, the new approach is to disable the lockup detectors in the above call chains. Failure becomes visible to the user as follows: - error messages from lockup_detector_suspend() or watchdog_enable_all_cpus() - the state that can be read from /proc/sys/kernel/watchdog_enabled - the 'write' system call in the latter call chain returns an error I did not experience kthread_park() failures in practice, I used some instrumentation to fake error returns from kthread_park() in order to test the patches. This patch (of 5): Restore the previous value of watchdog_thresh _and_ sample_period if proc_watchdog_update() returns an error. The variables must be consistent to avoid false positives of the lockup detectors. Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com> Reviewed-by: Aaron Tomlin <atomlin@redhat.com> Acked-by: Don Zickus <dzickus@redhat.com> Cc: Ulrich Obergfell <uobergfe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'kernel/watchdog.c')
-rw-r--r--kernel/watchdog.c7
1 files changed, 4 insertions, 3 deletions
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 568ba64..284f0e6 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -914,13 +914,14 @@ int proc_watchdog_thresh(struct ctl_table *table, int write,
goto out;
/*
- * Update the sample period.
- * Restore 'watchdog_thresh' on failure.
+ * Update the sample period. Restore on failure.
*/
set_sample_period();
err = proc_watchdog_update();
- if (err)
+ if (err) {
watchdog_thresh = old;
+ set_sample_period();
+ }
out:
mutex_unlock(&watchdog_proc_mutex);
return err;
OpenPOWER on IntegriCloud