path: root/sys/kern/kern_thread.c
Commit message  (author, date, files changed, lines -removed/+added)
* /* -> /*- for copyright notices, minor format tweaks as necessary  (imp, 2005-01-06, 1 file, -1/+1)
* - Garbage collect several unused members of struct kse and struct ksegrp.  (jeff, 2004-12-14, 1 file, -2/+0)
  As best as I can tell, some of these were never used.
* Remove local definitions of RANGEOF() and use __rangeof() instead.  (das, 2004-11-20, 1 file, -2/+0)
  Also remove a few bogus casts.
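  For readers unfamiliar with the macro this commit switches to: __rangeof() expresses
  the byte range spanned by a run of struct members, which kern_thread.c uses to clear or
  copy whole groups of fields in one call. The sketch below only restates the idea; the
  real macro lives in <sys/cdefs.h> and is built on __offsetof(), so treat the exact
  spelling as approximate.

    /*
     * Approximate restatement of __rangeof(): the number of bytes from the
     * start of member 'start' up to the start of member 'end' within 'type'.
     * The in-tree version is built on __offsetof(); details may differ.
     */
    #define rangeof_sketch(type, start, end) \
            (offsetof(type, end) - offsetof(type, start))

    /*
     * Typical use in this file (illustrative): wipe a run of fields at once.
     * bzero(&td->td_startzero,
     *     rangeof_sketch(struct thread, td_startzero, td_endzero));
     */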
* Respect TDF_SINTR, don't suspend an uninterruptible thread.  (davidxu, 2004-11-05, 1 file, -4/+3)
* Backout previous commit, the P_STOPPED_BOUNDARY flag was already  (davidxu, 2004-11-05, 1 file, -1/+1)
  cleared at the beginning of thread_single() when needed.
* Don't forget to turn off P_SINGLE_BOUNDARY for thread_single(SINGLE_EXIT),  (davidxu, 2004-11-04, 1 file, -1/+1)
  otherwise a threaded process which calls execv() will hang in the kernel and
  cannot be killed!
* Whitespace fix.  (jhb, 2004-10-12, 1 file, -1/+1)
* In the original kern_execve() code, at the start of the function, it forces  (davidxu, 2004-10-06, 1 file, -23/+60)
  all other threads to suicide. The problem is that execve() can fail, and a
  failed execve() would change a threaded process to an unthreaded one; this
  side effect is unexpected. The new code introduces a new single-threading
  mode, SINGLE_BOUNDARY, in which all threads except the singler suspend
  themselves at the user boundary. We cannot use SINGLE_NO_EXIT because we
  want to start from a clean state if execve() is successful; suspending other
  threads at an unknown point, later resuming them from there, and forcing
  them to exit at the user boundary may cause the process to start from a
  dirty state. If execve() is successful, the current thread upgrades to
  SINGLE_EXIT mode and forces the other threads to suicide at the user
  boundary; otherwise the other threads are resumed and their interrupted
  syscalls are restarted. (A sketch of this flow follows below.)
  Reviewed by: julian
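  A minimal sketch of the flow described above, with locking and helpers simplified;
  do_execve_stub() and its argument are stand-ins, and the real error handling is more
  involved than shown here.

    /*
     * Illustrative sketch only, not the committed code: the execve() flow
     * described above, using the single-threading modes it introduces.
     */
    static int
    execve_singlethread_sketch(struct thread *td, void *exec_args)
    {
            struct proc *p = td->td_proc;
            int error;

            PROC_LOCK(p);
            /* Park every other thread at the user boundary, but keep them alive. */
            if ((error = thread_single(SINGLE_BOUNDARY)) != 0) {
                    PROC_UNLOCK(p);
                    return (error);
            }
            PROC_UNLOCK(p);

            error = do_execve_stub(td, exec_args);  /* may fail */

            PROC_LOCK(p);
            if (error == 0)
                    thread_single(SINGLE_EXIT);     /* upgrade: others exit at the boundary */
            else
                    thread_single_end();            /* resume them; their syscalls restart */
            PROC_UNLOCK(p);
            return (error);
    }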
* Slight cleanup in the single threading code.  (julian, 2004-10-05, 1 file, -6/+5)
  MFC after: 4 days
* Break out to a separate function the code to revert a multithreaded  (julian, 2004-10-05, 1 file, -14/+26)
  process back to officially being a non-threaded program.
  MFC after: 4 days
* Always start out with an initialised ksegrp structure.  (julian, 2004-10-03, 1 file, -3/+3)
  MFC after: 3 days
* Use the universal 'threaded process' flag rather than the  (julian, 2004-09-25, 1 file, -1/+1)
  specific tests for different threading systems.
  MFC after: 1 week
* Various small style fixes.  (jhb, 2004-09-22, 1 file, -1/+1)
* Try harder to get back to being a non-threaded process.  (julian, 2004-09-15, 1 file, -1/+11)
  Submitted by: DavidXu
  MFC after: 3 days
* Refactor a bunch of scheduler code to give basically the same behaviour  (julian, 2004-09-05, 1 file, -241/+153)
  but with slightly cleaned up interfaces.

  The KSE structure has become the same as the "per thread scheduler private
  data" structure. In order to not make the diffs too great, one is #defined
  as the other at this time. The KSE (or td_sched) structure is now allocated
  per thread and has no allocation code of its own. Concurrency for a KSEGRP
  is now tracked via a simple pair of counters rather than by using KSE
  structures as tokens. Since the KSE structure is different in each
  scheduler, kern_switch.c is now included at the end of each scheduler.
  Nothing outside the scheduler knows the contents of the KSE (aka td_sched)
  structure.

  The fields in the ksegrp structure that relate to the scheduler's queueing
  mechanisms have moved to the kg_sched structure (the per-ksegrp scheduler
  private data structure). In other words, how the scheduler queues and keeps
  track of threads is no one's business except the scheduler's. This should
  allow people to write experimental schedulers with completely different
  internal structuring.

  A scheduler call sched_set_concurrency(kg, N) has been added that notifies
  the scheduler that no more than N threads from that ksegrp should be allowed
  to be scheduled concurrently. This is also used to enforce 'fairness' at
  this time, so that a ksegrp with 10000 threads cannot swamp the run queue
  and force out a process with 1 thread, since the current code will not set
  the concurrency above NCPU, and both schedulers will not allow more than
  that many onto the system run queue at a time. Each scheduler should
  eventually develop its own method to do this now that they are effectively
  separated.

  Rejig libthr's kernel interface to follow the same code paths as libkse for
  scope system threads. This has slightly hurt libthr's performance but I will
  work to recover as much of it as I can.

  Thread exit code has been cleaned up greatly. exit and exec code now
  transition a process back to 'standard non-threaded mode' before taking the
  next step.
  Reviewed by: scottl, peter
  MFC after: 1 week
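  A hedged illustration of the sched_set_concurrency() call named above. The wrapper
  function is invented for the example, and imin()/mp_ncpus are used here only to show
  the NCPU clamp the message describes.

    /*
     * Illustrative only: tell the scheduler how many threads from this
     * ksegrp may be scheduled at once, clamped to the number of CPUs as
     * the commit message describes. The wrapper name is hypothetical.
     */
    static void
    ksegrp_update_concurrency(struct ksegrp *kg, int nthreads)
    {
            sched_set_concurrency(kg, imin(nthreads, mp_ncpus));
    }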
* Only test return_instead if P_SINGLE_EXIT is set, otherwise a fork()  (davidxu, 2004-08-29, 1 file, -1/+1)
  syscall can interrupt another thread's syscall in sleepq_catch_signals().
  Currently, all callers know thread_suspend_check() may suspend the thread
  itself, so we needn't check return_instead for normal suspension flags
  (no P_SINGLE_EXIT set).
  Tested by: deischen
  Reported by: Maarten L. Hekkelman <m.hekkelman@cmbi.kun.nl>
* Now that the return value semantics of cv's for multithreaded processes  (jhb, 2004-08-19, 1 file, -0/+19)
  have been unified with those of msleep(9), further refine the sleepq
  interface and consolidate some duplicated code:
  - Move the pre-sleep checks for threaded processes into a
    thread_sleep_check() function in kern_thread.c.
  - Move all handling of TDF_SINTR to be internal to subr_sleepqueue.c.
    Specifically, if a thread is awakened by something other than a signal
    while checking for signals before going to sleep, clear TDF_SINTR in
    sleepq_catch_signals(). This removes a sched_lock lock/unlock combo in
    that edge case during an interruptible sleep. Also, fix
    sleepq_check_signals() to properly handle the condition if TDF_SINTR is
    clear rather than requiring the callers of the sleepq API to notice this
    edge case and call a non-_sig variant of sleepq_wait().
  - Clarify the flags arguments to sleepq_add(), sleepq_signal() and
    sleepq_broadcast() by creating an explicit submask for sleepq types.
    Also, add an explicit SLEEPQ_MSLEEP type rather than a magic number of 0.
    Also, add a SLEEPQ_INTERRUPTIBLE flag for use with sleepq_add() and move
    the setting of TDF_SINTR to sleepq_add() if this flag is set rather than
    sleepq_catch_signals(). Note that it is the caller's responsibility to
    ensure that sleepq_catch_signals() is called if and only if this flag is
    passed to the preceding sleepq_add(). Note that this also removes a
    sched_lock lock/unlock pair from sleepq_catch_signals(). It also ensures
    that for an interruptible sleep, TDF_SINTR is always set when
    TD_ON_SLEEPQ() is true. (See the sketch after this entry for how a caller
    composes these flags.)
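  A rough sketch of how a caller composes the new flags. The sleepq function signatures
  shown here are simplified and should not be read as the exact API of that revision;
  the point is only the SLEEPQ_MSLEEP/SLEEPQ_INTERRUPTIBLE pairing with
  sleepq_catch_signals() and sleepq_wait_sig().

    /*
     * Simplified, approximate sketch of an interruptible msleep()-style
     * sleep after this change. Real signatures and locking differ.
     */
    static int
    interruptible_sleep_sketch(void *wchan, struct mtx *lock, const char *wmesg)
    {
            /* TDF_SINTR is now set inside sleepq_add() because of the flag. */
            sleepq_add(wchan, lock, wmesg, SLEEPQ_MSLEEP | SLEEPQ_INTERRUPTIBLE);

            /* Mandatory when SLEEPQ_INTERRUPTIBLE was passed to sleepq_add(). */
            sleepq_catch_signals(wchan);

            /* Returns 0, EINTR or ERESTART depending on how we were woken. */
            return (sleepq_wait_sig(wchan));
    }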
* Whitespace nit.  (julian, 2004-08-14, 1 file, -1/+1)
* Increase the amount of data exported by KTR in the KTR_RUNQ setting.  (julian, 2004-08-09, 1 file, -3/+2)
  This extra data is needed to really follow what is going on in the
  threaded case.
* In thread_exit(), include more information about the thread/process  (rwatson, 2004-08-06, 1 file, -1/+2)
  context in the KTR trace record. In particular, include the same
  information as passed for mi_switch() and fork_exit() KTR trace records.
* * Add a "how" argument to uma_zone constructors and initialization  (green, 2004-08-02, 1 file, -8/+12)
    functions, so that they know whether the allocation is supposed to be
    able to sleep or not.
  * Allow uma_zone constructors and initialization functions to return either
    success or error. Almost all of the ones in the tree currently return
    success unconditionally, but mbuf is a notable exception: the packet zone
    constructor wants to be able to fail if it cannot suballocate an mbuf
    cluster, and the mbuf allocators want to be able to fail in general in a
    MAC kernel if the MAC mbuf initializer fails. This fixes the panics people
    are seeing when they run out of memory for mbuf clusters.
  * Allow debug.nosleepwithlocks on WITNESS to be disabled, without changing
    the default.
  Both bmilekic and jeff have reviewed the changes made to make failable zone
  allocations work. (A hedged sketch of the new constructor shape follows.)
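  A hedged sketch of what a failable constructor looks like after this change. struct
  example, its fields, and EX_BUFSIZE are invented for illustration, and the exact
  prototype is reconstructed from memory of the post-commit API rather than quoted from
  the tree.

    /*
     * Hedged sketch of a failable UMA zone constructor: it receives a
     * "how"/flags argument and may now return an error instead of panicking.
     */
    static int
    example_zone_ctor(void *mem, int size, void *arg, int how)
    {
            struct example *ex = mem;

            /* 'how' is M_WAITOK or M_NOWAIT, so nested allocations honor it. */
            ex->ex_buf = malloc(EX_BUFSIZE, M_TEMP, how | M_ZERO);
            if (ex->ex_buf == NULL)
                    return (ENOMEM);        /* constructors may now fail */
            return (0);
    }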
* When calling scheduler entrypoints for creating new threads and processes,  (julian, 2004-07-18, 1 file, -2/+2)
  specify "us" as the thread, not the process/ksegrp/kse. You can always find
  the others from the thread but the converse is not true. Theoretically this
  would lead to runtime being allocated to the wrong entity in some cases,
  though it is not clear how often this actually happened. (It would only
  affect threaded processes and would probably be pretty benign, but it WAS a
  bug.)
  Reviewed by: peter
* Whitespace fix.  (jhb, 2004-07-16, 1 file, -1/+1)
* Add code to support debugging threaded processes.  (davidxu, 2004-07-13, 1 file, -1/+4)
  1. Add tm_lwpid to kse_thr_mailbox to indicate which kernel thread the
     current user thread is running on. Add tm_dflags to kse_thr_mailbox; the
     flags are written by the debugger and tell the UTS and kernel what should
     be done when the process is being debugged. Currently there are two
     flags, TMDF_SSTEP and TMDF_DONOTRUNUSER. TMDF_SSTEP tells the kernel to
     turn on single stepping (or turn it off if the flag is not set).
     TMDF_DONOTRUNUSER tells the kernel to schedule an upcall whenever
     possible; to the UTS it means do not run the user thread until the
     debugger clears it. This behaviour is necessary because gdb wants to
     resume only one thread when that thread's pc is at a breakpoint and the
     thread needs to go forward; to keep other threads from sneaking past the
     breakpoints it removes the breakpoint and lets only the one thread go.
     Also, add km_lwp to kse_mailbox; the lwp id is copied to kse_thr_mailbox
     at context switch time when the process is not being debugged, so when
     the process is attached, the debugger can map kernel threads to user
     threads.
  2. Add p_xthread to the proc structure and td_xsig to the thread structure.
     p_xthread is used by a thread when it wants to report an event to the
     debugger; every thread can set the pointer, and in particular when it is
     used in ptracestop, the last thread reporting an event wins the race.
     Every thread has a td_xsig to exchange signals with the debugger; a
     thread uses the TDF_XSIG flag to indicate it is reporting a signal to
     the debugger, and if the flag is not cleared the thread keeps retrying
     until the debugger clears it. p_xthread may be used by the debugger to
     indicate the CURRENT thread. p_xstat is still in the proc structure to
     keep wait() working; in future we may just use td_xsig.
  3. Add the TDF_DBSUSPEND flag, which the debugger uses to suspend a thread.
     When the process stops, the debugger can set the flag for a thread; the
     thread will check the flag in thread_suspend_check and enter a loop until
     it is cleared by the debugger, the process is detached, or the process is
     exiting. The flag is also checked in ptracestop, so the debugger can
     temporarily suspend a thread even if the thread wants to exchange a
     signal.
  4. Currently, in ptrace, we always resume all threads, but if a thread
     already has the TDF_DBSUSPEND flag set by the debugger, it won't run.
  Encouraged by: marcel, julian, deischen
* - Change mi_switch() and sched_switch() to accept an optional thread to  (jhb, 2004-07-02, 1 file, -2/+2)
    switch to. If a non-NULL thread pointer is passed in, then the CPU will
    switch to that thread directly rather than calling choosethread() to pick
    a thread to switch to.
  - Make sched_switch() aware of idle threads and know to do
    TD_SET_CAN_RUN() instead of sticking them on the run queue, rather than
    requiring all callers of mi_switch() to know to do this if they can be
    called from an idlethread.
  - Move constants for arguments to mi_switch() and thread_single() out of
    the middle of the function prototypes and up above into their own
    section.
* Allocate TIDs in thread_init() and deallocate them in thread_fini().  (marcel, 2004-06-26, 1 file, -71/+57)
  The overhead of unconditionally allocating TIDs (and likewise,
  unconditionally deallocating them) is amortized across multiple thread
  creations by the way UMA makes it possible to have type-stable storage.
  Previously the cost was kept down by having threads created as part of a
  fork operation use the process' PID as the TID. While this had some nice
  properties, it also introduced complexity in the way TIDs were allocated.
  Most importantly, by using the type-stable storage that UMA gives us this
  was also unnecessary.

  This change affects how core dumps are created and in particular how the
  PRSTATUS notes are dumped. Since we don't have a thread with a TID equalling
  the PID, we now need a different way to preserve the previous behavior. We
  do this by having the given thread (i.e. the thread passed to the core dump
  code in td) dump its state first and fill in pr_pid with the actual PID. All
  other threads will have pr_pid contain their TIDs. The upshot of all this is
  that the debugger will now likely select the right LWP (=TID) as the initial
  thread. (A hedged sketch of the init/fini pairing follows below.)
  Credits to: julian@ for spotting how we can utilize UMA.
  Thanks to: all who provided julian@ with test results.
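  A hedged sketch of the idea: because UMA gives type-stable storage, a TID allocated in
  the zone's init hook stays with the backing object across many thread create/destroy
  cycles. tid_alloc()/tid_free() are stand-in names and the hook signatures are
  approximate for that era.

    /*
     * Sketch only: pay for TID allocation once per UMA-backed object,
     * not once per thread creation.
     */
    static void
    thread_init_sketch(void *mem, int size)
    {
            struct thread *td = mem;

            td->td_tid = tid_alloc();       /* paid once per backing object */
            /* ... other one-time initialization ... */
    }

    static void
    thread_fini_sketch(void *mem, int size)
    {
            struct thread *td = mem;

            tid_free(td->td_tid);           /* only when UMA frees the backing pages */
    }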
* Mark the thread in an exiting program as inactive.  (julian, 2004-06-21, 1 file, -1/+1)
  This is not really used by the process, but it's confusing to some status
  readers to see zombie processes with "running" threads.
  Pointed out by: Don Lewis <truckman@FreeBSD.org>
* Define __lwpid_t as an int32_t in <sys/_types.h> and define lwpid_t  (marcel, 2004-06-19, 1 file, -4/+6)
  as an __lwpid_t in <sys/types.h>. Retype td_tid from an int to a lwpid_t
  and change related definitions accordingly.
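  The resulting definitions, restated from the commit message above (header guard and
  visibility details omitted):

    /* <sys/_types.h> */
    typedef __int32_t       __lwpid_t;      /* thread (LWP) ID */

    /* <sys/types.h> */
    typedef __lwpid_t       lwpid_t;

    /* struct thread: td_tid changes from int to lwpid_t */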
* If the thread singler wants to terminate other threads, make sure it  (davidxu, 2004-06-18, 1 file, -2/+16)
  includes all threads except itself.
  Obtained from: julian
* Shuffle some code around.  (julian, 2004-06-11, 1 file, -1/+42)
* Add a comment explaining td_critnest's initial state and its life from that  (jmallett, 2004-06-09, 1 file, -0/+13)
  point on, as it happens relatively indirectly, and in a codepath the casual
  reader may not be acquainted with or find obvious.
  Glanced at by: jhb
* Split kern_thread.c into 2 parts: kern_kse.c and kern_thread.c.  (julian, 2004-06-07, 1 file, -1209/+13)
  Kern_kse has already been committed. This separates out the KSE threading
  ABI from generic thread support.
* Move TDF_SA from td_flags to td_pflags (and rename it accordingly)  (tjr, 2004-06-02, 1 file, -10/+10)
  so that it is no longer necessary to hold sched_lock while manipulating it.
  Reviewed by: davidxu
* Clear KSE thread flags after KSE thread mode is ended. The side effect  (davidxu, 2004-05-21, 1 file, -1/+1)
  of not clearing the flags for the execv() syscall is that a new program
  would run in KSE thread mode without enabling it.
  Submitted by: tjr
  Modified by: davidxu
* Keep track of threads waiting in kse_release() to avoid a race  (deischen, 2004-04-28, 1 file, -16/+37)
  condition where kse_wakeup() doesn't yet see them in (interruptible) sleep
  queues. Also add an upcall check to sleepqueue_catch_signals() suggested by
  jhb. This commit should fix recent mysql hangs.
  Reviewed by: jhb, davidxu
  Mysql'd by: Robin P. Blanchard <robin.blanchard at gactr uga edu>
* Assign thread IDs to kernel threads. The purpose of the thread ID (tid)  (marcel, 2004-04-03, 1 file, -2/+98)
  is twofold:
  1. When a 1:1 or M:N threaded process dumps core, we need to put the
     register state of each of its kernel threads in the core file. This can
     only be done by differentiating the pid field in the respective note.
     For this we need the tid.
  2. When thread support is present for remote debugging the kernel with
     gdb(1), threads need to be identified by an integer due to limitations
     in the remote protocol. This requires having a tid.
  To minimize the impact of having thread IDs, threads that are created as
  part of a fork (i.e. the initial thread in a process) will inherit the
  process ID (i.e. tid=pid). Subsequent threads will have IDs larger than
  PID_MAX to avoid interference with the pid allocation algorithm. The
  assignment of tids is handled by thread_new_tid().
  The thread ID allocation algorithm has been written with 3 assumptions in
  mind:
  1. IDs need to be created as fast as possible,
  2. Reuse of IDs may happen instantaneously,
  3. Someone else will write a better algorithm.
* Massively up the (artificial) limit on system scope threads  (julian, 2004-03-21, 1 file, -2/+2)
  in a process from 50 to 500. Also up the number of process scope threads
  allowed to be in the kernel at one time from 150 to 1500 (per process).
* Push Giant down a little further:  (peter, 2004-03-13, 1 file, -8/+5)
  - no longer serialize on Giant for thread_single*() and family in fork,
    exit and exec
  - thread_wait() is mpsafe, assert no Giant
  - reduce scope of Giant in exit to not cover thread_wait and just do
    vm_waitproc()
  - assert that the thread_single() family are not called with Giant
  - remove the DROP/PICKUP_GIANT macros from the thread_single() family
  - assert that thread_suspend_check() is not called with Giant
  - remove manual drop_giant hack in thread_suspend_check since we know it
    isn't held
  - remove the DROP/PICKUP_GIANT macros from the thread_suspend_check() family
  - mark kse_create() mpsafe
* Check for TDF_SINTR before calling sleepq_abort() as there is a narrow  (jhb, 2004-03-01, 1 file, -1/+1)
  race in between sleepq_add() and sleepq_catch_signals() in that setting
  td_wchan and TDF_SINTR is not atomic to sched_lock but only to the sleepq
  lock. This band-aid will stop assertion failures, but there is perhaps a
  larger problem with the sleepq_add/sleepq_catch_signals race that I am not
  sure how to solve. For the signals case the race is harmless because we
  always call cursig() after setting TDF_SINTR. However, KSE doesn't do
  anything in sleepq_catch_signals() to check that this race was lost, so I
  am unsure if this race is harmful for this specific abort.
* Switch the sleep/wakeup and condition variable implementations to use the  (jhb, 2004-02-27, 1 file, -11/+7)
  sleep queue interface:
  - Sleep queues attempt to merge some of the benefits of both sleep queues
    and condition variables. Having sleep queues in a hash table avoids
    having to allocate a queue head for each wait channel. Thus, struct cv
    has shrunk down to just a single char * pointer now. However, the hash
    table does not hold threads directly, but queue heads. This means that
    once you have located a queue in the hash bucket, you no longer have to
    walk the rest of the hash chain looking for threads. Instead, you have a
    list of all the threads sleeping on that wait channel.
  - Outside of the sleepq code and the sleep/cv code the kernel no longer
    differentiates between cv's and sleep/wakeup. For example, calls to
    abortsleep() and cv_abort() are replaced with a call to sleepq_abort().
    Thus, the TDF_CVWAITQ flag is removed. Also, calls to unsleep() and
    cv_waitq_remove() have been replaced with calls to sleepq_remove().
  - The sched_sleep() function no longer accepts a priority argument as
    sleeps no longer inherently bump the priority. Instead, this is solely a
    property of msleep() which explicitly calls sched_prio() before blocking.
  - The TDF_ONSLEEPQ flag has been dropped as it was never used. The
    associated TDF_SET_ONSLEEPQ and TDF_CLR_ON_SLEEPQ macros have also been
    dropped and replaced with a single explicit clearing of td_wchan.
    TD_SET_ONSLEEPQ() would really have only made sense if it had taken the
    wait channel and message as arguments anyway. Now that that only happens
    in one place, a macro would be overkill.
* Use mtx_assert() rather than using a home-rolled version.  (jhb, 2004-01-28, 1 file, -1/+1)
* - Add a flags parameter to mi_switch. The value of flags may be SW_VOL or  (jeff, 2004-01-25, 1 file, -4/+2)
    SW_INVOL. Assert that one of these is set in mi_switch() and properly
    adjust the rusage statistics. This is to simplify the large number of
    users of this interface which were previously all required to adjust the
    proper counter prior to calling mi_switch(). This also facilitates more
    switch and locking optimizations.
  - Change all callers of mi_switch() to pass the appropriate parameter and
    remove direct references to the process statistics.
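  A hedged example of a caller after this change. The helper name is invented; note that
  the second, optional "switch-to" thread argument shown here was only added by the later
  2004-07-02 mi_switch()/sched_switch() commit listed above, and before that mi_switch()
  took only the flags.

    /*
     * Illustrative voluntary context switch: the SW_VOL flag makes
     * mi_switch() charge the switch to the voluntary rusage counter.
     */
    static void
    yield_sketch(void)
    {
            mtx_lock_spin(&sched_lock);     /* mi_switch() expects sched_lock held */
            mi_switch(SW_VOL, NULL);        /* let choosethread() pick the next thread */
            mtx_unlock_spin(&sched_lock);
    }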
* Reduce gratuitous includes: don't include jail.h if it's not needed.  (rwatson, 2004-01-21, 1 file, -1/+0)
  Presumably, at some point, you had to include jail.h if you included
  proc.h, but that is no longer required.
  Result of: self injury involving adding something to struct prison
* s/Muliple/Multiple  (schweikh, 2004-01-10, 1 file, -48/+46)
  Removed whitespace at EOL and EOF.
* Don't use NULL (pointer) when we mean 0 (integer) for the number of ticks  (peter, 2003-12-23, 1 file, -1/+1)
  in msleep.
* Write the thread pointer (val) in the kse mailbox (loc) before we  (marcel, 2003-12-10, 1 file, -2/+2)
  set the new context in kse_switchin(2). This allows us to return an error
  to the calling context when the suword() fails.
* Add kse_switchin(2). This syscall can be used by KSE implementations  (marcel, 2003-12-07, 1 file, -0/+24)
  to have the kernel switch to a new thread, instead of doing it in
  userland. It is in fact needed on ia64 where syscall restarts do not
  return to userland first. It's completely handled inside the kernel. As
  such, any context created by the kernel as part of an upcall and caused by
  some syscall needs to be restored by the kernel.
* - Giant is no longer required by vm_thread_new().  (alc, 2003-12-07, 1 file, -2/+0)
* Add an implementation of turnstiles and change the sleep mutex code to use  (jhb, 2003-11-11, 1 file, -0/+3)
  turnstiles to implement blocking instead of implementing a thread queue
  directly. These turnstiles are somewhat similar to those used in Solaris 7
  as described in Solaris Internals but are also different.

  Turnstiles do not come out of a fixed-size pool. Rather, each thread is
  assigned a turnstile when it is created that it frees when it is destroyed.
  When a thread blocks on a lock, it donates its turnstile to that lock to
  serve as a queue of blocked threads. The queue associated with a given lock
  is found by a lookup in a simple hash table. The turnstile itself is
  protected by a lock associated with its entry in the hash table. This means
  that sched_lock is no longer needed to contest on a mutex. Instead,
  sched_lock is only used when manipulating run queues or thread priorities.
  Turnstiles also implement priority propagation inherently.

  Currently turnstiles only support mutexes. Eventually, however, turnstiles
  may grow two queues to support a non-sleepable reader/writer lock
  implementation. For more details, see the comments in sys/turnstile.h and
  kern/subr_turnstile.c.

  The two primary advantages from the turnstile code include: 1) the size of
  struct mutex shrinks by four pointers as it no longer stores the thread
  queue linkages directly, and 2) less contention on sched_lock in SMP
  systems, including the ability for multiple CPUs to contend on different
  locks simultaneously (not that this last detail is necessarily that much of
  a big win).

  Note that 1) means that this commit is a kernel ABI breaker, so don't mix
  old modules with a new kernel and vice versa.
  Tested on: i386 SMP, sparc64 SMP, alpha SMP
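  A hedged sketch of the per-thread side of this change as it touches kern_thread.c:
  every thread carries a turnstile from creation to destruction and donates it to a lock
  while blocked. The helper names below are invented for illustration and the hook
  signatures are approximate.

    /*
     * Sketch only: give each thread its turnstile up front, reclaim it when
     * the thread object is finally torn down.
     */
    static void
    thread_init_turnstile_sketch(struct thread *td)
    {
            td->td_turnstile = turnstile_alloc();   /* owned until destruction */
    }

    static void
    thread_fini_turnstile_sketch(struct thread *td)
    {
            turnstile_free(td->td_turnstile);       /* always handed back by now */
    }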
* Let SA processes work under the ULE scheduler; originally they would panic  (davidxu, 2003-08-26, 1 file, -3/+16)
  the kernel.
  Reviewed by: jeff