summaryrefslogtreecommitdiffstats
path: root/kernel
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'core-locking-for-linus' of ↵Linus Torvalds2011-07-226-8/+11
|\ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: lockdep: Fix lockdep_no_validate against IRQ states mutex: Make mutex_destroy() an inline function plist: Remove the need to supply locks to plist heads lockup detector: Fix reference to the non-existent CONFIG_DETECT_SOFTLOCKUP option
| * lockdep: Fix lockdep_no_validate against IRQ statesPeter Zijlstra2011-07-211-0/+3
| | | | | | | | | | | | | | | | | | | | | | Thomas noticed that a lock marked with lockdep_set_novalidate_class() will still trigger warnings for IRQ inversions. Cure this by skipping those when marking irq state. Reported-and-tested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-2dp5vmpsxeraqm42kgww6ge2@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * plist: Remove the need to supply locks to plist headsDima Zavin2011-07-085-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was legacy code brought over from the RT tree and is no longer necessary. Signed-off-by: Dima Zavin <dima@android.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Daniel Walker <dwalker@codeaurora.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Link: http://lkml.kernel.org/r/1310084879-10351-2-git-send-email-dima@android.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | Merge branch 'for-linus' of ↵Linus Torvalds2011-07-223-10/+23
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: (51 commits) PM: Improve error code of pm_notifier_call_chain() PM: Add "RTC" to PM trace time stamps to avoid confusion PM / Suspend: Export suspend_set_ops, suspend_valid_only_mem PM / Suspend: Add .suspend_again() callback to suspend_ops PM / OPP: Introduce function to free cpufreq table ARM / shmobile: Return -EBUSY from A4LC power off if A3RV is active PM / Domains: Take .power_off() error code into account ARM / shmobile: Use genpd_queue_power_off_work() ARM / shmobile: Use pm_genpd_poweroff_unused() PM / Domains: Introduce function to power off all unused PM domains OMAP: PM: disable idle on suspend for GPIO and UART OMAP: PM: omap_device: add API to disable idle on suspend OMAP: PM: omap_device: add system PM methods for PM domain handling OMAP: PM: omap_device: conditionally use PM domain runtime helpers PM / Runtime: Add new helper function: pm_runtime_status_suspended() PM / Domains: Queue up power off work only if it is not pending PM / Domains: Improve handling of wakeup devices during system suspend PM / Domains: Do not restore all devices on power off error PM / Domains: Allow callbacks to execute all runtime PM helpers PM / Domains: Do not execute device callbacks under locks ...
| * \ Merge branch 'pm-domains' into for-linusRafael J. Wysocki2011-07-151-2/+6
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * pm-domains: (33 commits) ARM / shmobile: Return -EBUSY from A4LC power off if A3RV is active PM / Domains: Take .power_off() error code into account ARM / shmobile: Use genpd_queue_power_off_work() ARM / shmobile: Use pm_genpd_poweroff_unused() PM / Domains: Introduce function to power off all unused PM domains PM / Domains: Queue up power off work only if it is not pending PM / Domains: Improve handling of wakeup devices during system suspend PM / Domains: Do not restore all devices on power off error PM / Domains: Allow callbacks to execute all runtime PM helpers PM / Domains: Do not execute device callbacks under locks PM / Domains: Make failing pm_genpd_prepare() clean up properly PM / Domains: Set device state to "active" during system resume ARM: mach-shmobile: sh7372 A3RV requires A4LC PM / Domains: Export pm_genpd_poweron() in header ARM: mach-shmobile: sh7372 late pm domain off ARM: mach-shmobile: Runtime PM late init callback ARM: mach-shmobile: sh7372 D4 support ARM: mach-shmobile: sh7372 A4MP support ARM: mach-shmobile: sh7372: make sure that fsi is peripheral of spu2 ARM: mach-shmobile: sh7372 A3SG support ...
| | * | PM: Allow the clocks management code to be used during system suspendRafael J. Wysocki2011-07-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The common clocks management code in drivers/base/power/clock_ops.c is going to be used during system-wide power transitions as well as for runtime PM, so it shouldn't depend on CONFIG_PM_RUNTIME. However, the suspend/resume functions provided by it for CONFIG_PM_RUNTIME unset, to be used during system-wide power transitions, should not behave in the same way as their counterparts defined for CONFIG_PM_RUNTIME set, because in that case the clocks are managed differently at run time. The names of the functions still contain the word "runtime" after this change, but that is going to be modified by a separate patch later. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Reviewed-by: Kevin Hilman <khilman@ti.com>
| | * | PM / Domains: Support for generic I/O PM domains (v8)Rafael J. Wysocki2011-07-021-0/+4
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce common headers, helper functions and callbacks allowing platforms to use simple generic power domains for runtime power management. Introduce struct generic_pm_domain to be used for representing power domains that each contain a number of devices and may be parent domains or subdomains with respect to other power domains. Among other things, this structure includes callbacks to be provided by platforms for performing specific tasks related to power management (i.e. ->stop_device() may disable a device's clocks, while ->start_device() may enable them, ->power_off() is supposed to remove power from the entire power domain and ->power_on() is supposed to restore it). Introduce functions that can be used as power domain runtime PM callbacks, pm_genpd_runtime_suspend() and pm_genpd_runtime_resume(), as well as helper functions for the initialization of a power domain represented by a struct generic_power_domain object, adding a device to or removing a device from it and adding or removing subdomains. Introduce configuration option CONFIG_PM_GENERIC_DOMAINS to be selected by the platforms that want to use the new code. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Reviewed-by: Kevin Hilman <khilman@ti.com>
| * | PM: Improve error code of pm_notifier_call_chain()Akinobu Mita2011-07-151-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This enables pm_notifier_call_chain() to get the actual error code in the callback rather than always assume -EINVAL by converting all PM notifier calls to return encapsulate error code with notifier_from_errno(). Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
| * | PM / Suspend: Export suspend_set_ops, suspend_valid_only_memKevin Hilman2011-07-151-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some platforms wish to implement their PM core suspend code as modules. To do so, these functions need to be exported to modules. [rjw: Replaced EXPORT_SYMBOL with EXPORT_SYMBOL_GPL] Reported-by: Jean Pihet <j-pihet@ti.com> Signed-off-by: Kevin Hilman <khilman@ti.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
| * | PM / Suspend: Add .suspend_again() callback to suspend_opsMyungJoo Ham2011-07-151-6/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A system or a device may need to control suspend/wakeup events. It may want to wakeup the system after a predefined amount of time or at a predefined event decided while entering suspend for polling or delayed work. Then, it may want to enter suspend again if its predefined wakeup condition is the only wakeup reason and there is no outstanding events; thus, it does not wakeup the userspace unnecessary or unnecessary devices and keeps suspended as long as possible (saving the power). Enabling a system to wakeup after a specified time can be easily achieved by using RTC. However, to enter suspend again immediately without invoking userland and unrelated devices, we need additional features in the suspend framework. Such need comes from: 1. Monitoring a critical device status without interrupts that can wakeup the system. (in-suspend polling) An example is ambient temperature monitoring that needs to shut down the system or a specific device function if it is too hot or cold. The temperature of a specific device may be needed to be monitored as well; e.g., a charger monitors battery temperature in order to stop charging if overheated. 2. Execute critical "delayed work" at suspend. A driver or a system/board may have a delayed work (or any similar things) that it wants to execute at the requested time. For example, some chargers want to check the battery voltage some time (e.g., 30 seconds) after the battery is fully charged and the charger has stopped. Then, the charger restarts charging if the voltage has dropped more than a threshold, which is smaller than "restart-charger" voltage, which is a threshold to restart charging regardless of the time passed. This patch allows to add "suspend_again" callback at struct platform_suspend_ops and let the "suspend_again" callback return true if the system is required to enter suspend again after the current instance of wakeup. Device-wise suspend_again implemented at dev_pm_ops or syscore is not done because: a) suspend_again feature is usually under platform-wise decision and controls the behavior of the whole platform and b) There are very limited devices related to the usage cases of suspend_again; chargers and temperature sensors are mentioned so far. With suspend_again callback registered at struct platform_suspend_ops suspend_ops in kernel/power/suspend.c with suspend_set_ops by the platform, the suspend framework tries to enter suspend again by looping suspend_enter() if suspend_again has returned true and there has been no errors in the suspending sequence or pending wakeups (by pm_wakeup_pending). Tested at Exynos4-NURI. [rjw: Fixed up kerneldoc comment for suspend_enter().] Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
* | | Merge branch 'for-3.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds2011-07-221-28/+53
|\ \ \ | | | | | | | | | | | | | | | | | | | | * 'for-3.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: separate out drain_workqueue() from destroy_workqueue() workqueue: remove cancel_rearming_delayed_work[queue]()
| * | | workqueue: separate out drain_workqueue() from destroy_workqueue()Tejun Heo2011-05-201-28/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are users which want to drain workqueues without destroying it. Separate out drain functionality from destroy_workqueue() into drain_workqueue() and make it accessible to workqueue users. To guarantee forward-progress, only chain queueing is allowed while drain is in progress. If a new work item which isn't chained from the running or pending work items is queued while draining is in progress, WARN_ON_ONCE() is triggered. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
* | | | Merge branch 'ptrace' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/miscLinus Torvalds2011-07-224-261/+485
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'ptrace' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc: (39 commits) ptrace: do_wait(traced_leader_killed_by_mt_exec) can block forever ptrace: fix ptrace_signal() && STOP_DEQUEUED interaction connector: add an event for monitoring process tracers ptrace: dont send SIGSTOP on auto-attach if PT_SEIZED ptrace: mv send-SIGSTOP from do_fork() to ptrace_init_task() ptrace_init_task: initialize child->jobctl explicitly has_stopped_jobs: s/task_is_stopped/SIGNAL_STOP_STOPPED/ ptrace: make former thread ID available via PTRACE_GETEVENTMSG after PTRACE_EVENT_EXEC stop ptrace: wait_consider_task: s/same_thread_group/ptrace_reparented/ ptrace: kill real_parent_is_ptracer() in in favor of ptrace_reparented() ptrace: ptrace_reparented() should check same_thread_group() redefine thread_group_leader() as exit_signal >= 0 do not change dead_task->exit_signal kill task_detached() reparent_leader: check EXIT_DEAD instead of task_detached() make do_notify_parent() __must_check, update the callers __ptrace_detach: avoid task_detached(), check do_notify_parent() kill tracehook_notify_death() make do_notify_parent() return bool ptrace: s/tracehook_tracer_task()/ptrace_parent()/ ...
| * | | | ptrace: fix ptrace_signal() && STOP_DEQUEUED interactionOleg Nesterov2011-07-211-6/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Simple test-case, int main(void) { int pid, status; pid = fork(); if (!pid) { pause(); assert(0); return 0x23; } assert(ptrace(PTRACE_ATTACH, pid, 0,0) == 0); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGSTOP); kill(pid, SIGCONT); // <--- also clears STOP_DEQUEUD assert(ptrace(PTRACE_CONT, pid, 0,0) == 0); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGCONT); assert(ptrace(PTRACE_CONT, pid, 0, SIGSTOP) == 0); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGSTOP); kill(pid, SIGKILL); return 0; } Without the patch it hangs. After the patch SIGSTOP "injected" by the tracer is not ignored and stops the tracee. Note also that if this test-case uses, say, SIGWINCH instead of SIGCONT, everything works without the patch. This can't be right, and this is confusing. The problem is that SIGSTOP (or any other sig_kernel_stop() signal) has no effect without JOBCTL_STOP_DEQUEUED. This means it is simply ignored after PTRACE_CONT unless JOBCTL_STOP_DEQUEUED was set "by accident", say it wasn't cleared after initial SIGSTOP sent by PTRACE_ATTACH. At first glance we could change ptrace_signal() to add STOP_DEQUEUED after return from ptrace_stop(), but this is not right in case when the tracer does not change the reported SIGSTOP and SIGCONT comes in between. This is even more wrong with PT_SEIZED, SIGCONT adds JOBCTL_TRAP_NOTIFY which will be "lost" during the TRAP_STOP | TRAP_NOTIFY report. So lets add STOP_DEQUEUED _before_ we report the signal. It has no effect unless sig_kernel_stop() == T after the tracer resumes us, and in the latter case the pending STOP_DEQUEUED means no SIGCONT in between, we should stop. Note also that if SIGCONT was sent, PT_SEIZED tracee will correctly report PTRACE_EVENT_STOP/SIGTRAP and thus the tracer can notice the fact SIGSTOP was cancelled. Also, move the current->ptrace check from ptrace_signal() to its caller, get_signal_to_deliver(), this looks more natural. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Tejun Heo <tj@kernel.org>
| * | | | connector: add an event for monitoring process tracersVladimir Zapolskiy2011-07-181-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change adds a procfs connector event, which is emitted on every successful process tracer attach or detach. If some process connects to other one, kernelspace connector reports process id and thread group id of both these involved processes. On disconnection null process id is returned. Such an event allows to create a simple automated userspace mechanism to be aware about processes connecting to others, therefore predefined process policies can be applied to them if needed. Note, a detach signal is emitted only in case, if a tracer process explicitly executes PTRACE_DETACH request. In other cases like tracee or tracer exit detach event from proc connector is not reported. Signed-off-by: Vladimir Zapolskiy <vzapolskiy@gmail.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
| * | | | ptrace: mv send-SIGSTOP from do_fork() to ptrace_init_task()Oleg Nesterov2011-07-171-12/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the new child is traced, do_fork() adds the pending SIGSTOP. It assumes that either it is traced because of auto-attach or the tracer attached later, in both cases sigaddset/set_thread_flag is correct even if SIGSTOP is already pending. Now that we have PTRACE_SEIZE this is no longer right in the latter case. If the tracer does PTRACE_SEIZE after copy_process() makes the child visible the queued SIGSTOP is wrong. We could check PT_SEIZED bit and change ptrace_attach() to set both PT_PTRACED and PT_SEIZED bits simultaneously but see the next patch, we need to know whether this child was auto-attached or not anyway. So this patch simply moves this code to ptrace_init_task(), this way we can never race with ptrace_attach(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Tejun Heo <tj@kernel.org>
| * | | | has_stopped_jobs: s/task_is_stopped/SIGNAL_STOP_STOPPED/Oleg Nesterov2011-07-171-7/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | has_stopped_jobs() naively checks task_is_stopped(group_leader). This was always wrong even without ptrace, group_leader can be dead. And given that ptrace can change the state to TRACED this is wrong even in the single-threaded case. Change the code to check SIGNAL_STOP_STOPPED and simplify the code, retval + break/continue doesn't make this trivial code more readable. We could probably add the usual "|| signal->group_stop_count" check but I don't think this makes sense, the task can start the group-stop right after the check anyway. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Tejun Heo <tj@kernel.org>
| * | | | ptrace: wait_consider_task: s/same_thread_group/ptrace_reparented/Oleg Nesterov2011-06-271-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | wait_consider_task() checks same_thread_group(parent, real_parent), this is the open-coded ptrace_reparented(). __ptrace_detach() remains the only function which has to check this by hand, although we could reorganize the code to delay __ptrace_unlink. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Tejun Heo <tj@kernel.org>
| * | | | ptrace: kill real_parent_is_ptracer() in in favor of ptrace_reparented()Oleg Nesterov2011-06-271-16/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Kill real_parent_is_ptracer() and update the callers to use ptrace_reparented(), after the previous patch they do the same. Remove the unnecessary ->ptrace != 0 check in get_signal_to_deliver(), if ptrace_reparented() == T then the task must be ptraced. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Tejun Heo <tj@kernel.org>
| * | | | do not change dead_task->exit_signalOleg Nesterov2011-06-272-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __ptrace_detach() and do_notify_parent() set task->exit_signal = -1 to mark the task dead. This is no longer needed, nobody checks exit_signal to detect the EXIT_DEAD task. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Tejun Heo <tj@kernel.org>
| * | | | kill task_detached()Oleg Nesterov2011-06-271-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Upadate the last user of task_detached(), wait_task_zombie(), to use thread_group_leader() and kill task_detached(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Tejun Heo <tj@kernel.org>
| * | | | reparent_leader: check EXIT_DEAD instead of task_detached()Oleg Nesterov2011-06-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change reparent_leader() to check ->exit_state instead of ->exit_signal, this matches the similar EXIT_DEAD check in wait_consider_task() and allows us to cleanup the do_notify_parent/task_detached logic. task_detached() was really needed during reparenting before 9cd80bbb "do_wait() optimization: do not place sub-threads on ->children list" to filter out the sub-threads. After this change task_detached(p) can only be true if p is the dead group_leader and its parent ignores SIGCHLD, in this case the caller of do_notify_parent() is going to reap this task and it should set EXIT_DEAD. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Tejun Heo <tj@kernel.org>
| * | | | make do_notify_parent() __must_check, update the callersOleg Nesterov2011-06-271-21/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change other callers of do_notify_parent() to check the value it returns, this makes the subsequent task_detached() unnecessary. Mark do_notify_parent() as __must_check. Use thread_group_leader() instead of !task_detached() to check if we need to notify the real parent in wait_task_zombie(). Remove the stale comment in release_task(). "just for sanity" is no longer true, we have to set EXIT_DEAD to avoid the races with do_wait(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Tejun Heo <tj@kernel.org>
| * | | | __ptrace_detach: avoid task_detached(), check do_notify_parent()Oleg Nesterov2011-06-271-15/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __ptrace_detach() relies on the current obscure behaviour of do_notify_parent(tsk) which changes tsk->exit_signal if this child should be silently reaped. That is why we check task_detached(), it is true if the task is sub-thread, or it is the group_leader but its exit_signal was changed by do_notify_parent(). This is confusing, change the code to rely on !thread_group_leader() or the value returned by do_notify_parent(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Tejun Heo <tj@kernel.org>
| * | | | kill tracehook_notify_death()Oleg Nesterov2011-06-271-8/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Kill tracehook_notify_death(), reimplement the logic in its caller, exit_notify(). Also, change the exec_id's check to use thread_group_leader() instead of task_detached(), this is more clear. This logic only applies to the exiting leader, a sub-thread must never change its exit_signal. Note: when the traced group leader exits the exit_signal-or-SIGCHLD logic looks really strange: - we notify the tracer even if !thread_group_empty() but do_wait(WEXITED) can't work until all threads exit - if the tracer is real_parent, it is not clear why can't we use ->exit_signal event if !thread_group_empty() -v2: do not try to fix the 2nd oddity to avoid the subtle behavior change mixed with reorganization, suggested by Tejun. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Tejun Heo <tj@kernel.org>
| * | | | make do_notify_parent() return boolOleg Nesterov2011-06-272-11/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - change do_notify_parent() to return a boolean, true if the task should be reaped because its parent ignores SIGCHLD. - update the only caller which checks the returned value, exit_notify(). This temporary uglifies exit_notify() even more, will be cleanuped by the next change. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Tejun Heo <tj@kernel.org>
| * | | | ptrace: kill clone/exec tracehooksTejun Heo2011-06-221-9/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At this point, tracehooks aren't useful to mainline kernel and mostly just add an extra layer of obfuscation. Although they have comments, without actual in-kernel users, it is difficult to tell what are their assumptions and they're actually trying to achieve. To mainline kernel, they just aren't worth keeping around. This patch kills the following clone and exec related tracehooks. tracehook_prepare_clone() tracehook_finish_clone() tracehook_report_clone() tracehook_report_clone_complete() tracehook_unsafe_exec() The changes are mostly trivial - logic is moved to the caller and comments are merged and adjusted appropriately. The only exception is in check_unsafe_exec() where LSM_UNSAFE_PTRACE* are OR'd to bprm->unsafe instead of setting it, which produces the same result as the field is always zero on entry. It also tests p->ptrace instead of (p->ptrace & PT_PTRACED) for consistency, which also gives the same result. This doesn't introduce any behavior change. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
| * | | | ptrace: kill trivial tracehooksTejun Heo2011-06-223-10/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At this point, tracehooks aren't useful to mainline kernel and mostly just add an extra layer of obfuscation. Although they have comments, without actual in-kernel users, it is difficult to tell what are their assumptions and they're actually trying to achieve. To mainline kernel, they just aren't worth keeping around. This patch kills the following trivial tracehooks. * Ones testing whether task is ptraced. Replace with ->ptrace test. tracehook_expect_breakpoints() tracehook_consider_ignored_signal() tracehook_consider_fatal_signal() * ptrace_event() wrappers. Call directly. tracehook_report_exec() tracehook_report_exit() tracehook_report_vfork_done() * ptrace_release_task() wrapper. Call directly. tracehook_finish_release_task() * noop tracehook_prepare_release_task() tracehook_report_death() This doesn't introduce any behavior change. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
| * | | | ptrace: kill task_ptrace()Tejun Heo2011-06-222-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | task_ptrace(task) simply dereferences task->ptrace and isn't even used consistently only adding confusion. Kill it and directly access ->ptrace instead. This doesn't introduce any behavior change. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
| * | | | ptrace: implement PTRACE_LISTENTejun Heo2011-06-163-8/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The previous patch implemented async notification for ptrace but it only worked while trace is running. This patch introduces PTRACE_LISTEN which is suggested by Oleg Nestrov. It's allowed iff tracee is in STOP trap and puts tracee into quasi-running state - tracee never really runs but wait(2) and ptrace(2) consider it to be running. While ptracer is listening, tracee is allowed to re-enter STOP to notify an async event. Listening state is cleared on the first notification. Ptracer can also clear it by issuing INTERRUPT - tracee will re-trap into STOP with listening state cleared. This allows ptracer to monitor group stop state without running tracee - use INTERRUPT to put tracee into STOP trap, issue LISTEN and then wait(2) to wait for the next group stop event. When it happens, PTRACE_GETSIGINFO provides information to determine the current state. Test program follows. #define PTRACE_SEIZE 0x4206 #define PTRACE_INTERRUPT 0x4207 #define PTRACE_LISTEN 0x4208 #define PTRACE_SEIZE_DEVEL 0x80000000 static const struct timespec ts1s = { .tv_sec = 1 }; int main(int argc, char **argv) { pid_t tracee, tracer; int i; tracee = fork(); if (!tracee) while (1) pause(); tracer = fork(); if (!tracer) { siginfo_t si; ptrace(PTRACE_SEIZE, tracee, NULL, (void *)(unsigned long)PTRACE_SEIZE_DEVEL); ptrace(PTRACE_INTERRUPT, tracee, NULL, NULL); repeat: waitid(P_PID, tracee, NULL, WSTOPPED); ptrace(PTRACE_GETSIGINFO, tracee, NULL, &si); if (!si.si_code) { printf("tracer: SIG %d\n", si.si_signo); ptrace(PTRACE_CONT, tracee, NULL, (void *)(unsigned long)si.si_signo); goto repeat; } printf("tracer: stopped=%d signo=%d\n", si.si_signo != SIGTRAP, si.si_signo); if (si.si_signo != SIGTRAP) ptrace(PTRACE_LISTEN, tracee, NULL, NULL); else ptrace(PTRACE_CONT, tracee, NULL, NULL); goto repeat; } for (i = 0; i < 3; i++) { nanosleep(&ts1s, NULL); printf("mother: SIGSTOP\n"); kill(tracee, SIGSTOP); nanosleep(&ts1s, NULL); printf("mother: SIGCONT\n"); kill(tracee, SIGCONT); } nanosleep(&ts1s, NULL); kill(tracer, SIGKILL); kill(tracee, SIGKILL); return 0; } This is identical to the program to test TRAP_NOTIFY except that tracee is PTRACE_LISTEN'd instead of PTRACE_CONT'd when group stopped. This allows ptracer to monitor when group stop ends without running tracee. # ./test-listen tracer: stopped=0 signo=5 mother: SIGSTOP tracer: SIG 19 tracer: stopped=1 signo=19 mother: SIGCONT tracer: stopped=0 signo=5 tracer: SIG 18 mother: SIGSTOP tracer: SIG 19 tracer: stopped=1 signo=19 mother: SIGCONT tracer: stopped=0 signo=5 tracer: SIG 18 mother: SIGSTOP tracer: SIG 19 tracer: stopped=1 signo=19 mother: SIGCONT tracer: stopped=0 signo=5 tracer: SIG 18 -v2: Moved JOBCTL_LISTENING check in wait_task_stopped() into task_stopped_code() as suggested by Oleg. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com>
| * | | | ptrace: implement TRAP_NOTIFY and use it for group stop eventsTejun Heo2011-06-161-3/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently there's no way for ptracer to find out whether group stop finished other than polling with INTERRUPT - GETSIGINFO - CONT sequence. This patch implements group stop notification for ptracer using STOP traps. When group stop state of a seized tracee changes, JOBCTL_TRAP_NOTIFY is set, which schedules a STOP trap which is sticky - it isn't cleared by other traps and at least one STOP trap will happen eventually. STOP trap is synchronization point for event notification and the tracer can determine the current group stop state by looking at the signal number portion of exit code (si_status from waitid(2) or si_code from PTRACE_GETSIGINFO). Notifications are generated both on start and end of group stops but, because group stop participation always happens before STOP trap, this doesn't cause an extra trap while tracee is participating in group stop. The symmetry will be useful later. Note that this notification works iff tracee is not trapped. Currently there is no way to be notified of group stop state changes while tracee is trapped. This will be addressed by a later patch. An example program follows. #define PTRACE_SEIZE 0x4206 #define PTRACE_INTERRUPT 0x4207 #define PTRACE_SEIZE_DEVEL 0x80000000 static const struct timespec ts1s = { .tv_sec = 1 }; int main(int argc, char **argv) { pid_t tracee, tracer; int i; tracee = fork(); if (!tracee) while (1) pause(); tracer = fork(); if (!tracer) { siginfo_t si; ptrace(PTRACE_SEIZE, tracee, NULL, (void *)(unsigned long)PTRACE_SEIZE_DEVEL); ptrace(PTRACE_INTERRUPT, tracee, NULL, NULL); repeat: waitid(P_PID, tracee, NULL, WSTOPPED); ptrace(PTRACE_GETSIGINFO, tracee, NULL, &si); if (!si.si_code) { printf("tracer: SIG %d\n", si.si_signo); ptrace(PTRACE_CONT, tracee, NULL, (void *)(unsigned long)si.si_signo); goto repeat; } printf("tracer: stopped=%d signo=%d\n", si.si_signo != SIGTRAP, si.si_signo); ptrace(PTRACE_CONT, tracee, NULL, NULL); goto repeat; } for (i = 0; i < 3; i++) { nanosleep(&ts1s, NULL); printf("mother: SIGSTOP\n"); kill(tracee, SIGSTOP); nanosleep(&ts1s, NULL); printf("mother: SIGCONT\n"); kill(tracee, SIGCONT); } nanosleep(&ts1s, NULL); kill(tracer, SIGKILL); kill(tracee, SIGKILL); return 0; } In the above program, tracer keeps tracee running and gets notification of each group stop state changes. # ./test-notify tracer: stopped=0 signo=5 mother: SIGSTOP tracer: SIG 19 tracer: stopped=1 signo=19 mother: SIGCONT tracer: stopped=0 signo=5 tracer: SIG 18 mother: SIGSTOP tracer: SIG 19 tracer: stopped=1 signo=19 mother: SIGCONT tracer: stopped=0 signo=5 tracer: SIG 18 mother: SIGSTOP tracer: SIG 19 tracer: stopped=1 signo=19 mother: SIGCONT tracer: stopped=0 signo=5 tracer: SIG 18 Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com>
| * | | | ptrace: implement PTRACE_INTERRUPTTejun Heo2011-06-161-2/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, there's no way to trap a running ptracee short of sending a signal which has various side effects. This patch implements PTRACE_INTERRUPT which traps ptracee without any signal or job control related side effect. The implementation is almost trivial. It uses the group stop trap - SIGTRAP | PTRACE_EVENT_STOP << 8. A new trap flag JOBCTL_TRAP_INTERRUPT is added, which is set on PTRACE_INTERRUPT and cleared when any trap happens. As INTERRUPT should be useable regardless of the current state of tracee, task_is_traced() test in ptrace_check_attach() is skipped for INTERRUPT. PTRACE_INTERRUPT is available iff tracee is attached with PTRACE_SEIZE. Test program follows. #define PTRACE_SEIZE 0x4206 #define PTRACE_INTERRUPT 0x4207 #define PTRACE_SEIZE_DEVEL 0x80000000 static const struct timespec ts100ms = { .tv_nsec = 100000000 }; static const struct timespec ts1s = { .tv_sec = 1 }; static const struct timespec ts3s = { .tv_sec = 3 }; int main(int argc, char **argv) { pid_t tracee; tracee = fork(); if (tracee == 0) { nanosleep(&ts100ms, NULL); while (1) { printf("tracee: alive pid=%d\n", getpid()); nanosleep(&ts1s, NULL); } } if (argc > 1) kill(tracee, SIGSTOP); nanosleep(&ts100ms, NULL); ptrace(PTRACE_SEIZE, tracee, NULL, (void *)(unsigned long)PTRACE_SEIZE_DEVEL); if (argc > 1) { waitid(P_PID, tracee, NULL, WSTOPPED); ptrace(PTRACE_CONT, tracee, NULL, NULL); } nanosleep(&ts3s, NULL); printf("tracer: INTERRUPT and DETACH\n"); ptrace(PTRACE_INTERRUPT, tracee, NULL, NULL); waitid(P_PID, tracee, NULL, WSTOPPED); ptrace(PTRACE_DETACH, tracee, NULL, NULL); nanosleep(&ts3s, NULL); printf("tracer: exiting\n"); kill(tracee, SIGKILL); return 0; } When called without argument, tracee is seized from running state, interrupted and then detached back to running state. # ./test-interrupt tracee: alive pid=4546 tracee: alive pid=4546 tracee: alive pid=4546 tracer: INTERRUPT and DETACH tracee: alive pid=4546 tracee: alive pid=4546 tracee: alive pid=4546 tracer: exiting When called with argument, tracee is seized from stopped state, continued, interrupted and then detached back to stopped state. # ./test-interrupt 1 tracee: alive pid=4548 tracee: alive pid=4548 tracee: alive pid=4548 tracer: INTERRUPT and DETACH tracer: exiting Before PTRACE_INTERRUPT, once the tracee was running, there was no way to trap tracee and do PTRACE_DETACH without causing side effect. -v2: Updated to use task_set_jobctl_pending() so that it doesn't end up scheduling TRAP_STOP if child is dying which may make the child unkillable. Spotted by Oleg. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com>
| * | | | ptrace: implement PTRACE_SEIZETejun Heo2011-06-162-15/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PTRACE_ATTACH implicitly issues SIGSTOP on attach which has side effects on tracee signal and job control states. This patch implements a new ptrace request PTRACE_SEIZE which attaches a tracee without trapping it or affecting its signal and job control states. The usage is the same with PTRACE_ATTACH but it takes PTRACE_SEIZE_* flags in @data. Currently, the only defined flag is PTRACE_SEIZE_DEVEL which is a temporary flag to enable PTRACE_SEIZE. PTRACE_SEIZE will change ptrace behaviors outside of attach itself. The changes will be implemented gradually and the DEVEL flag is to prevent programs which expect full SEIZE behavior from using it before all the behavior modifications are complete while allowing unit testing. The flag will be removed once SEIZE behaviors are completely implemented. * PTRACE_SEIZE, unlike ATTACH, doesn't force tracee to trap. After attaching tracee continues to run unless a trap condition occurs. * PTRACE_SEIZE doesn't affect signal or group stop state. * If PTRACE_SEIZE'd, group stop uses PTRACE_EVENT_STOP trap which uses exit_code of (signr | PTRACE_EVENT_STOP << 8) where signr is one of the stopping signals if group stop is in effect or SIGTRAP otherwise, and returns usual trap siginfo on PTRACE_GETSIGINFO instead of NULL. Seizing sets PT_SEIZED in ->ptrace of the tracee. This flag will be used to determine whether new SEIZE behaviors should be enabled. Test program follows. #define PTRACE_SEIZE 0x4206 #define PTRACE_SEIZE_DEVEL 0x80000000 static const struct timespec ts100ms = { .tv_nsec = 100000000 }; static const struct timespec ts1s = { .tv_sec = 1 }; static const struct timespec ts3s = { .tv_sec = 3 }; int main(int argc, char **argv) { pid_t tracee; tracee = fork(); if (tracee == 0) { nanosleep(&ts100ms, NULL); while (1) { printf("tracee: alive\n"); nanosleep(&ts1s, NULL); } } if (argc > 1) kill(tracee, SIGSTOP); nanosleep(&ts100ms, NULL); ptrace(PTRACE_SEIZE, tracee, NULL, (void *)(unsigned long)PTRACE_SEIZE_DEVEL); if (argc > 1) { waitid(P_PID, tracee, NULL, WSTOPPED); ptrace(PTRACE_CONT, tracee, NULL, NULL); } nanosleep(&ts3s, NULL); printf("tracer: exiting\n"); return 0; } When the above program is called w/o argument, tracee is seized while running and remains running. When tracer exits, tracee continues to run and print out messages. # ./test-seize-simple tracee: alive tracee: alive tracee: alive tracer: exiting tracee: alive tracee: alive When called with an argument, tracee is seized from stopped state and continued, and returns to stopped state when tracer exits. # ./test-seize tracee: alive tracee: alive tracee: alive tracer: exiting # ps -el|grep test-seize 1 T 0 4720 1 0 80 0 - 941 signal ttyS0 00:00:00 test-seize -v2: SEIZE doesn't schedule TRAP_STOP and leaves tracee running as Jan suggested. -v3: PTRACE_EVENT_STOP traps now report group stop state by signr. If group stop is in effect the stop signal number is returned as part of exit_code; otherwise, SIGTRAP. This was suggested by Denys and Oleg. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jan Kratochvil <jan.kratochvil@redhat.com> Cc: Denys Vlasenko <vda.linux@googlemail.com> Cc: Oleg Nesterov <oleg@redhat.com>
| * | | | job control: introduce JOBCTL_TRAP_STOP and use it for group stop trapTejun Heo2011-06-162-32/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | do_signal_stop() implemented both normal group stop and trap for group stop while ptraced. This approach has been enough but scheduled changes require trap mechanism which can be used in more generic manner and using group stop trap for generic trap site simplifies both userland visible interface and implementation. This patch adds a new jobctl flag - JOBCTL_TRAP_STOP. When set, it triggers a trap site, which behaves like group stop trap, in get_signal_to_deliver() after checking for pending signals. While ptraced, do_signal_stop() doesn't stop itself. It initiates group stop if requested and schedules JOBCTL_TRAP_STOP and returns. The caller - get_signal_to_deliver() - is responsible for checking whether TRAP_STOP is pending afterwards and handling it. ptrace_attach() is updated to use JOBCTL_TRAP_STOP instead of JOBCTL_STOP_PENDING and __ptrace_unlink() to clear all pending trap bits and TRAPPING so that TRAP_STOP and future trap bits don't linger after detach. While at it, add proper function comment to do_signal_stop() and make it return bool. -v2: __ptrace_unlink() updated to clear JOBCTL_TRAP_MASK and TRAPPING instead of JOBCTL_PENDING_MASK. This avoids accidentally clearing JOBCTL_STOP_CONSUME. Spotted by Oleg. -v3: do_signal_stop() updated to return %false without dropping siglock while ptraced and TRAP_STOP check moved inside for(;;) loop after group stop participation. This avoids unnecessary relocking and also will help avoiding unnecessary traps by consuming group stop before handling pending traps. -v4: Jobctl trap handling moved into a separate function - do_jobctl_trap(). Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com>
| * | | | signal: remove three noop tracehooksTejun Heo2011-06-041-30/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove the following three noop tracehooks in signals.c. * tracehook_force_sigpending() * tracehook_get_signal() * tracehook_finish_jctl() The code area is about to be updated and these hooks don't do anything other than obfuscating the logic. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
| * | | | ptrace: use bit_waitqueue for TRAPPING instead of wait_chldexitTejun Heo2011-06-042-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ptracer->signal->wait_chldexit was used to wait for TRAPPING; however, ->wait_chldexit was already complicated with waker-side filtering without adding TRAPPING wait on top of it. Also, it unnecessarily made TRAPPING clearing depend on the current ptrace relationship - if the ptracee is detached, wakeup is lost. There is no reason to use signal->wait_chldexit here. We're just waiting for JOBCTL_TRAPPING bit to clear and given the relatively infrequent use of ptrace, bit_waitqueue can serve it perfectly. This patch makes JOBCTL_TRAPPING wait use bit_waitqueue instead of signal->wait_chldexit. -v2: Use JOBCTL_*_BIT macros instead of ilog2() as suggested by Linus. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
| * | | | job control: introduce task_set_jobctl_pending()Tejun Heo2011-06-042-9/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | task->jobctl currently hosts JOBCTL_STOP_PENDING and will host TRAP pending bits too. Setting pending conditions on a dying task may make the task unkillable. Currently, each setting site is responsible for checking for the condition but with to-be-added job control traps this becomes too fragile. This patch adds task_set_jobctl_pending() which should be used when setting task->jobctl bits to schedule a stop or trap. The function performs the followings to ease setting pending bits. * Sanity checks. * If fatal signal is pending or PF_EXITING is set, no bit is set. * STOP_SIGMASK is automatically cleared if new value is being set. do_signal_stop() and ptrace_attach() are updated to use task_set_jobctl_pending() instead of setting STOP_PENDING explicitly. The surrounding structures around setting are changed to fit task_set_jobctl_pending() better but there should be no userland visible behavior difference. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
| * | | | job control: make task_clear_jobctl_pending() clear TRAPPING automaticallyTejun Heo2011-06-041-5/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | JOBCTL_TRAPPING indicates that ptracer is waiting for tracee to (re)transit into TRACED. task_clear_jobctl_pending() must be called when either tracee enters TRACED or the transition is cancelled for some reason. The former is achieved by explicitly calling task_clear_jobctl_pending() in ptrace_stop() and the latter by calling it at the end of do_signal_stop(). Calling task_clear_jobctl_trapping() at the end of do_signal_stop() limits the scope TRAPPING can be used and is fragile in that seemingly unrelated changes to tracee's control flow can lead to stuck TRAPPING. We already have task_clear_jobctl_pending() calls on those cancelling events to clear JOBCTL_STOP_PENDING. Cancellations can be handled by making those call sites use JOBCTL_PENDING_MASK instead and updating task_clear_jobctl_pending() such that task_clear_jobctl_trapping() is called automatically if no stop/trap is pending. This patch makes the above changes and removes the fallback task_clear_jobctl_trapping() call from do_signal_stop(). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
| * | | | job control: introduce JOBCTL_PENDING_MASK and task_clear_jobctl_pending()Tejun Heo2011-06-041-10/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces JOBCTL_PENDING_MASK and replaces task_clear_jobctl_stop_pending() with task_clear_jobctl_pending() which takes an extra @mask argument. JOBCTL_PENDING_MASK is currently equal to JOBCTL_STOP_PENDING but future patches will add more bits. recalc_sigpending_tsk() is updated to use JOBCTL_PENDING_MASK instead. task_clear_jobctl_pending() takes @mask which in subset of JOBCTL_PENDING_MASK and clears the relevant jobctl bits. If JOBCTL_STOP_PENDING is set, other STOP bits are cleared together. All task_clear_jobctl_stop_pending() users are updated to call task_clear_jobctl_pending() with JOBCTL_STOP_PENDING which is functionally identical to task_clear_jobctl_stop_pending(). This patch doesn't cause any functional change. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
| * | | | ptrace: relocate set_current_state(TASK_TRACED) in ptrace_stop()Tejun Heo2011-06-041-15/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In ptrace_stop(), after arch hook is done, the task state and jobctl bits are updated while holding siglock. The ordering requirement there is that TASK_TRACED is set before JOBCTL_TRAPPING is cleared to prevent ptracer waiting on TRAPPING doesn't end up waking up TRACED is actually set and sees TASK_RUNNING in wait(2). Move set_current_state(TASK_TRACED) to the top of the block and reorganize comments. This makes the ordering more obvious (TASK_TRACED before other updates) and helps future updates to group stop participation. This patch doesn't cause any functional change. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
| * | | | ptrace: ptrace_check_attach(): rename @kill to @ignore_state and add commentsTejun Heo2011-06-041-5/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PTRACE_INTERRUPT is going to be added which should also skip task_is_traced() check in ptrace_check_attach(). Rename @kill to @ignore_state and make it bool. Add function comment while at it. This patch doesn't introduce any behavior difference. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
| * | | | job control: rename signal->group_stop and flags to jobctl and update themTejun Heo2011-06-042-51/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | signal->group_stop currently hosts mostly group stop related flags; however, it's gonna be used for wider purposes and the GROUP_STOP_ flag prefix becomes confusing. Rename signal->group_stop to signal->jobctl and rename all GROUP_STOP_* flags to JOBCTL_*. Bit position macros JOBCTL_*_BIT are defined and JOBCTL_* flags are defined in terms of them to allow using bitops later. While at it, reassign JOBCTL_TRAPPING to bit 22 to better accomodate future additions. This doesn't cause any functional change. -v2: JOBCTL_*_BIT macros added as suggested by Linus. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
| * | | | ptrace: remove silly wait_trap variable from ptrace_attach()Tejun Heo2011-06-041-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove local variable wait_trap which determines whether to wait for !TRAPPING or not and simply wait for it if attach was successful. -v2: Oleg pointed out wait should happen iff attach was successful. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
* | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds2011-07-221-0/+29
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1287 commits) icmp: Fix regression in nexthop resolution during replies. net: Fix ppc64 BPF JIT dependencies. acenic: include NET_SKB_PAD headroom to incoming skbs ixgbe: convert to ndo_fix_features ixgbe: only enable WoL for magic packet by default ixgbe: remove ifdef check for non-existent define ixgbe: Pass staterr instead of re-reading status and error bits from descriptor ixgbe: Move interrupt related values out of ring and into q_vector ixgbe: add structure for containing RX/TX rings to q_vector ixgbe: inline the ixgbe_maybe_stop_tx function ixgbe: Update ATR to use recorded TX queues instead of CPU for routing igb: Fix for DH89xxCC near end loopback test e1000: always call e1000_check_for_link() on e1000_ce4100 MACs. netxen: add fw version compatibility check be2net: request native mode each time the card is reset ipv4: Constrain UFO fragment sizes to multiples of 8 bytes virtio_net: Fix panic in virtnet_remove ipv6: make fragment identifications less predictable ipv6: unshare inetpeers can: make function can_get_bittiming static ...
| * \ \ \ \ Merge branch 'master' of ↵David S. Miller2011-07-213-5/+39
| |\ \ \ \ \ | | | |_|/ / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/bluetooth/l2cap_core.c
| * | | | | Merge branch 'master' of ↵David S. Miller2011-07-211-0/+29
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6
| | * | | | | netfilter: add SELinux context support to AUDIT targetMr Dash Four2011-06-301-0/+29
| | | |/ / / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In this revision the conversion of secid to SELinux context and adding it to the audit log is moved from xt_AUDIT.c to audit.c with the aid of a separate helper function - audit_log_secctx - which does both the conversion and logging of SELinux context, thus also preventing internal secid number being leaked to userspace. If conversion is not successful an error is raised. With the introduction of this helper function the work done in xt_AUDIT.c is much more simplified. It also opens the possibility of this helper function being used by other modules (including auditd itself), if desired. With this addition, typical (raw auditd) output after applying the patch would be: type=NETFILTER_PKT msg=audit(1305852240.082:31012): action=0 hook=1 len=52 inif=? outif=eth0 saddr=10.1.1.7 daddr=10.1.2.1 ipid=16312 proto=6 sport=56150 dport=22 obj=system_u:object_r:ssh_client_packet_t:s0 type=NETFILTER_PKT msg=audit(1306772064.079:56): action=0 hook=3 len=48 inif=eth0 outif=? smac=00:05:5d:7c:27:0b dmac=00:02:b3:0a:7f:81 macproto=0x0800 saddr=10.1.2.1 daddr=10.1.1.7 ipid=462 proto=6 sport=22 dport=3561 obj=system_u:object_r:ssh_server_packet_t:s0 Acked-by: Eric Paris <eparis@redhat.com> Signed-off-by: Mr Dash Four <mr.dash.four@googlemail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
* | | | | | Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds2011-07-204-28/+100
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: signal: align __lock_task_sighand() irq disabling and RCU softirq,rcu: Inform RCU of irq_exit() activity sched: Add irq_{enter,exit}() to scheduler_ipi() rcu: protect __rcu_read_unlock() against scheduler-using irq handlers rcu: Streamline code produced by __rcu_read_unlock() rcu: Fix RCU_BOOST race handling current->rcu_read_unlock_special rcu: decrease rcu_report_exp_rnp coupling with scheduler
| * \ \ \ \ \ Merge branch 'rcu/urgent' of ↵Ingo Molnar2011-07-204-28/+100
| |\ \ \ \ \ \ | | |_|_|/ / / | |/| | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu into core/urgent
| | * | | | | signal: align __lock_task_sighand() irq disabling and RCUPaul E. McKenney2011-07-201-6/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The __lock_task_sighand() function calls rcu_read_lock() with interrupts and preemption enabled, but later calls rcu_read_unlock() with interrupts disabled. It is therefore possible that this RCU read-side critical section will be preempted and later RCU priority boosted, which means that rcu_read_unlock() will call rt_mutex_unlock() in order to deboost itself, but with interrupts disabled. This results in lockdep splats, so this commit nests the RCU read-side critical section within the interrupt-disabled region of code. This prevents the RCU read-side critical section from being preempted, and thus prevents the attempt to deboost with interrupts disabled. It is quite possible that a better long-term fix is to make rt_mutex_unlock() disable irqs when acquiring the rt_mutex structure's ->wait_lock. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
OpenPOWER on IntegriCloud