summaryrefslogtreecommitdiffstats
path: root/sys/kern/kern_kse.c
Commit message (Collapse)AuthorAgeFilesLines
* - Remove setrunqueue and replace it with direct calls to sched_add().jeff2007-01-231-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | setrunqueue() was mostly empty. The few asserts and thread state setting were moved to the individual schedulers. sched_add() was chosen to displace it for naming consistency reasons. - Remove adjustrunqueue, it was 4 lines of code that was ifdef'd to be different on all three schedulers where it was only called in one place each. - Remove the long ifdef'd out remrunqueue code. - Remove the now redundant ts_state. Inspect the thread state directly. - Don't set TSF_* flags from kern_switch.c, we were only doing this to support a feature in one scheduler. - Change sched_choose() to return a thread rather than a td_sched. Also, rely on the schedulers to return the idlethread. This simplifies the logic in choosethread(). Aside from the run queue links kern_switch.c mostly does not care about the contents of td_sched. Discussed with: julian - Move the idle thread loop into the per scheduler area. ULE wants to do something different from the other schedulers. Suggested by: jhb Tested on: x86/amd64 sched_{4BSD, ULE, CORE}.
* Fix a potential point of confusion. Art Ironport we've seen this end upjulian2006-12-121-3/+6
| | | | with an infinite loop in and out of the kernel during process shutdown.
* Threading cleanup.. part 2 of several.julian2006-12-061-215/+114
| | | | | | | | | | | | | | | | | | | | | | Make part of John Birrell's KSE patch permanent.. Specifically, remove: Any reference of the ksegrp structure. This feature was never fully utilised and made things overly complicated. All code in the scheduler that tried to make threaded programs fair to unthreaded programs. Libpthread processes will already do this to some extent and libthr processes already disable it. Also: Since this makes such a big change to the scheduler(s), take the opportunity to rename some structures and elements that had to be moved anyhow. This makes the code a lot more readable. The ULE scheduler compiles again but I have no idea if it works. The 4bsd scheduler still reqires a little cleaning and some functions that now do ALMOST nothing will go away, but I thought I'd do that as a separate commit. Tested by David Xu, and Dan Eischen using libthr and libpthread.
* Make KSE a kernel option, turned on by default in all GENERICjb2006-10-261-0/+28
| | | | | | | kernel configs except sun4v (which doesn't process signals properly with KSE). Reviewed by: davidxu@
* Close some races between procfs/ptrace and exit(2):jhb2006-02-221-2/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Reorder the events in exit(2) slightly so that we trigger the S_EXIT stop event earlier. After we have signalled that, we set P_WEXIT and then wait for any processes with a hold on the vmspace via PHOLD to release it. PHOLD now KASSERT()'s that P_WEXIT is clear when it is invoked, and PRELE now does a wakeup if P_WEXIT is set and p_lock drops to zero. - Change proc_rwmem() to require that the processing read from has its vmspace held via PHOLD by the caller and get rid of all the junk to screw around with the vmspace reference count as we no longer need it. - In ptrace() and pseudofs(), treat a process with P_WEXIT set as if it doesn't exist. - Only do one PHOLD in kern_ptrace() now, and do it earlier so it covers FIX_SSTEP() (since on alpha at least this can end up calling proc_rwmem() to clear an earlier single-step simualted via a breakpoint). We only do one to avoid races. Also, by making the EINVAL error for unknown requests be part of the default: case in the switch, the various switch cases can now just break out to return which removes a _lot_ of duplicated PRELE and proc unlocks, etc. Also, it fixes at least one bug where a LWP ptrace command could return EINVAL with the proc lock still held. - Changed the locking for ptrace_single_step(), ptrace_set_pc(), and ptrace_clear_single_step() to always be called with the proc lock held (it was a mixed bag previously). Alpha and arm have to drop the lock while the mess around with breakpoints, but other archs avoid extra lock release/acquires in ptrace(). I did have to fix a couple of other consumers in kern_kse and a few other places to hold the proc lock and PHOLD. Tested by: ps (1 mostly, but some bits of 2-4 as well) MFC after: 1 week
* Fix a long standing race between sleep queue and threaddavidxu2006-02-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | suspension code. When a thread A is going to sleep, it calls sleepq_catch_signals() to detect any pending signals or thread suspension request, if nothing happens, it returns without holding process lock or scheduler lock, this opens a race window which allows thread B to come in and do process suspension work, however since A is still at running state, thread B can do nothing to A, thread A continues, and puts itself into actually sleeping state, but B has never seen it, and it sits there forever until B is woken up by other threads sometimes later(this can be very long delay or never happen). Fix this bug by forcing sleepq_catch_signals to return with scheduler lock held. Fix sleepq_abort() by passing it an interrupted code, previously, it worked as wakeup_one(), and the interruption can not be identified correctly by sleep queue code when the sleeping thread is resumed. Let thread_suspend_check() returns EINTR or ERESTART, so sleep queue no longer has to use SIGSTOP as a hack to build a return value. Reviewed by: jhb MFC after: 1 week
* - Always call exec_free_args() in kern_execve() instead of doing it in alljhb2006-02-061-1/+0
| | | | | | the callers if the exec either succeeds or fails early. - Move the code to call exit1() if the exec fails after the vmspace is gone to the bottom of kern_execve() to cut down on some code duplication.
* Cleanup some signal interfaces. Now the tdsignal function acceptsdavidxu2005-11-031-1/+1
| | | | | | | | | both proc pointer and thread pointer, if thread pointer is NULL, tdsignal automatically finds a thread, otherwise it sends signal to given thread. Add utility function psignal_event to send a realtime sigevent to a process according to the delivery requirement specified in struct sigevent.
* 1. Change prototype of trapsignal and sendsig to use ksiginfo_t *, mostdavidxu2005-10-141-8/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | changes in MD code are trivial, before this change, trapsignal and sendsig use discrete parameters, now they uses member fields of ksiginfo_t structure. For sendsig, this change allows us to pass POSIX realtime signal value to user code. 2. Remove cpu_thread_siginfo, it is no longer needed because we now always generate ksiginfo_t data and feed it to libpthread. 3. Add p_sigqueue to proc structure to hold shared signals which were blocked by all threads in the proc. 4. Add td_sigqueue to thread structure to hold all signals delivered to thread. 5. i386 and amd64 now return POSIX standard si_code, other arches will be fixed. 6. In this sigqueue implementation, pending signal set is kept as before, an extra siginfo list holds additional siginfo_t data for signals. kernel code uses psignal() still behavior as before, it won't be failed even under memory pressure, only exception is when deleting a signal, we should call sigqueue_delete to remove signal from sigqueue but not SIGDELSET. Current there is no kernel code will deliver a signal with additional data, so kernel should be as stable as before, a ksiginfo can carry more information, for example, allow signal to be delivered but throw away siginfo data if memory is not enough. SIGKILL and SIGSTOP have fast path in sigqueue_add, because they can not be caught or masked. The sigqueue() syscall allows user code to queue a signal to target process, if resource is unavailable, EAGAIN will be returned as specification said. Just before thread exits, signal queue memory will be freed by sigqueue_flush. Current, all signals are allowed to be queued, not only realtime signals. Earlier patch reviewed by: jhb, deischen Tested on: i386, amd64
* Fox a LOR of sleep and sched_lock by using a timeout waitdavidxu2005-09-301-1/+1
| | | | | | when process reaches maximum number of threads. MFC after: 3 days
* Add witness warnings to panic if a thread tries to exit while holding anyjhb2005-09-021-0/+2
| | | | | | | locks. Requested by: jeff MFC after: 3 days
* Add missing brackets.davidxu2005-08-191-1/+1
| | | | Noticed by: stefanf@
* Fix a LOR between sched_lock and sleep queue lock.davidxu2005-08-191-5/+4
|
* Fix a typo in a comment.jhb2005-06-231-1/+1
| | | | Approved by: re (scottl)
* Remove thread_upcall_check, it was used to avoid race bug in earlierdavidxu2005-05-271-10/+0
| | | | | day's sleep queue code, today the bug no longer exists. please see 04/25/2004 freebsd-threads@ mailing list archive.
* Change cpu_set_kse_upcall to more generic style, so we can reuse itdavidxu2005-04-231-2/+4
| | | | | | | in other codes. Add cpu_set_user_tls, use it to tweak user register and setup user TLS. I ever wanted to merge it into cpu_set_kse_upcall, but since cpu_set_kse_upcall is also used by M:N threads which may not need this feature, so I wrote a separated cpu_set_user_tls.
* Drop bzero and shove the responsibility of zeroing the kse upcallcsjp2005-02-241-2/+1
| | | | | | | | | | | object on to the zone allocator. It should be noted that uma_zalloc(9) uses bzero to zero out the object so there probably wont be any real performance benefit. If UMA grows the ability to supply zeroed zones more efficiently in the future, we will not have to modify all the existing consumers. Discussed with: rwatson,julian MFC after: 1 week
* o Split out kernel part of execve(2) syscall into two parts: one thatsobomax2005-01-291-1/+7
| | | | | | | | | | | copies arguments into the kernel space and one that operates completely in the kernel space; o use kernel-only version of execve(2) to kill another stackgap in linuxlator/i386. Obtained from: DragonFlyBSD (partially) MFC after: 2 weeks
* /* -> /*- for copyright notices, minor format tweaks as necessaryimp2005-01-061-1/+1
|
* - Remove a 4BSD specific hack since this will work on ULE too.jeff2004-12-261-4/+0
|
* Remove local definitions of RANGEOF() and use __rangeof() instead.das2004-11-201-7/+5
| | | | Also remove a few bogus casts.
* Add an execve command for kse_thr_interrupt to allow libpthread todavidxu2004-10-071-0/+17
| | | | | | restore signal mask correctly, this is required by POSIX. Reviewed by: deischen
* Restore some code removed in revision 1.193 and 1.194, julian saiddavidxu2004-10-061-4/+23
| | | | he'd like to keep these code.
* light rearrangement of some code to get some lockingjulian2004-10-051-15/+27
| | | | | | more correct MFC after: 4 days
* Break out to a separate function, the code to revert a multithreadedjulian2004-10-051-2/+1
| | | | | | process back to officially being a non-threaded program. MFC after: 4 days
* - Assert sched_lock in upcall_remove() since it is needed there and alljhb2004-09-231-1/+3
| | | | | | callers already lock it there. - Lock sched_lock slightly earlier in kse_create() so that it covers kg_numupcalls.
* Various small style fixes.jhb2004-09-221-7/+7
|
* Refactor a bunch of scheduler code to give basically the same behaviourjulian2004-09-051-36/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | but with slightly cleaned up interfaces. The KSE structure has become the same as the "per thread scheduler private data" structure. In order to not make the diffs too great one is #defined as the other at this time. The KSE (or td_sched) structure is now allocated per thread and has no allocation code of its own. Concurrency for a KSEGRP is now kept track of via a simple pair of counters rather than using KSE structures as tokens. Since the KSE structure is different in each scheduler, kern_switch.c is now included at the end of each scheduler. Nothing outside the scheduler knows the contents of the KSE (aka td_sched) structure. The fields in the ksegrp structure that are to do with the scheduler's queueing mechanisms are now moved to the kg_sched structure. (per ksegrp scheduler private data structure). In other words how the scheduler queues and keeps track of threads is no-one's business except the scheduler's. This should allow people to write experimental schedulers with completely different internal structuring. A scheduler call sched_set_concurrency(kg, N) has been added that notifies teh scheduler that no more than N threads from that ksegrp should be allowed to be on concurrently scheduled. This is also used to enforce 'fainess' at this time so that a ksegrp with 10000 threads can not swamp a the run queue and force out a process with 1 thread, since the current code will not set the concurrency above NCPU, and both schedulers will not allow more than that many onto the system run queue at a time. Each scheduler should eventualy develop their own methods to do this now that they are effectively separated. Rejig libthr's kernel interface to follow the same code paths as linkse for scope system threads. This has slightly hurt libthr's performance but I will work to recover as much of it as I can. Thread exit code has been cleaned up greatly. exit and exec code now transitions a process back to 'standard non-threaded mode' before taking the next step. Reviewed by: scottl, peter MFC after: 1 week
* Give setrunqueue() and sched_add() more of a clue as tojulian2004-09-011-2/+2
| | | | | | where they are coming from and what is expected from them. MFC after: 2 days
* Remove TDP_USTATCLOCK, we no longer need it because we now alwaysdavidxu2004-08-311-55/+30
| | | | | | | | | update tick count for userland in thread_userret. This change also removes a "no upcall owned" panic because fuword() schedules an upcall under heavily loaded, and code assumes there is no upcall can occur. Reported and Tested by: Peter Holm <peter@holm.cc>
* Remove an unneeded argument..julian2004-08-311-2/+2
| | | | | | | | | The removed argument could trivially be derived from the remaining one. That in turn should be the same as curthread, but it is possible that curthread could be expensive to derive on some syste,s so leave it as an argument. Having both proc and thread as an argumen tjust gives an opportunity for them to get out sync. MFC after: 3 days
* 1. try to use existing mailbox address in thread_update_usr_ticks.davidxu2004-08-281-4/+6
| | | | 2. remove '\n' in KASSERT.
* Move TDF_CAN_UNBIND to thread private flags td_pflags, this eliminatesdavidxu2004-08-281-19/+4
| | | | | | | need of sched_lock in some places. Also in thread_userret, remove spare thread allocation code, it is already done in thread_user_enter. Reviewed by: julian
* Remove checking of single exit flag in thread_user_enter(), this isdavidxu2004-08-231-12/+0
| | | | generic code for threaded process, should not be here.
* Slight changes to comments and some whitespace changes.julian2004-08-091-3/+10
|
* 1.Add KSE_INTR_DBSUSPEND command for kse_thr_interrupt to suspend a bounddavidxu2004-08-081-29/+46
| | | | | | | | | thread, after the bound thread leaves critical region, the thread should check debug flag may suspend itself by using the command. 2.Schedule upcall after thread is suspended by debugger 3.Wakeup upcall thread after process suspension. Reviewed by: deischen
* s/TMDF_DONOTRUNUSER/TMDF_SUSPEND/gdavidxu2004-08-031-2/+2
| | | | Dicussed with: deischen
* Repeat after me:julian2004-08-031-0/+1
| | | | "Do not apply your tested patches to your commit tree by hand"
* Remove an argument that is never used.julian2004-08-021-7/+6
|
* Add what appears to be a missing '*/' at the end of a comment.rwatson2004-08-021-0/+1
|
* Comment kse_create() and make a few minor code cleanupsjulian2004-08-011-47/+121
| | | | Reviewed by: davidxu
* When calling scheduler entrypoints for creating new threads and processes,julian2004-07-181-2/+2
| | | | | | | | | | | specify "us" as the thread not the process/ksegrp/kse. You can always find the others from the thread but the converse is not true. Theorotically this would lead to runtime being allocated to the wrong entity in some cases though it is not clear how often this actually happenned. (would only affect threaded processes and would probably be pretty benign, but it WAS a bug..) Reviewed by: peter
* - Move TDF_OWEPREEMPT, TDF_OWEUPC, and TDF_USTATCLOCK over to td_pflagsjhb2004-07-161-5/+4
| | | | | | | | | since they are only accessed by curthread and thus do not need any locking. - Move pr_addr and pr_ticks out of struct uprof (which is per-process) and directly into struct thread as td_profil_addr and td_profil_ticks as these variables are really per-thread. (They are used to defer an addupc_intr() that was too "hard" until ast()).
* Add code to support debugging threaded process.davidxu2004-07-131-75/+153
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. Add tm_lwpid into kse_thr_mailbox to indicate which kernel thread current user thread is running on. Add tm_dflags into kse_thr_mailbox, the flags is written by debugger, it tells UTS and kernel what should be done when the process is being debugged, current, there two flags TMDF_SSTEP and TMDF_DONOTRUNUSER. TMDF_SSTEP is used to tell kernel to turn on single stepping, or turn off if it is not set. TMDF_DONOTRUNUSER is used to tell kernel to schedule upcall whenever possible, to UTS, it means do not run the user thread until debugger clears it, this behaviour is necessary because gdb wants to resume only one thread when the thread's pc is at a breakpoint, and thread needs to go forward, in order to avoid other threads sneak pass the breakpoints, it needs to remove breakpoint, only wants one thread to go. Also, add km_lwp to kse_mailbox, the lwp id is copied to kse_thr_mailbox at context switch time when process is not being debugged, so when process is attached, debugger can map kernel thread to user thread. 2. Add p_xthread to proc strcuture and td_xsig to thread structure. p_xthread is used by a thread when it wants to report event to debugger, every thread can set the pointer, especially, when it is used in ptracestop, it is the last thread reporting event will win the race. Every thread has a td_xsig to exchange signal with debugger, thread uses TDF_XSIG flag to indicate it is reporting signal to debugger, if the flag is not cleared, thread will keep retrying until it is cleared by debugger, p_xthread may be used by debugger to indicate CURRENT thread. The p_xstat is still in proc structure to keep wait() to work, in future, we may just use td_xsig. 3. Add TDF_DBSUSPEND flag, the flag is used by debugger to suspend a thread. When process stops, debugger can set the flag for thread, thread will check the flag in thread_suspend_check, enters a loop, unless it is cleared by debugger, process is detached or process is existing. The flag is also checked in ptracestop, so debugger can temporarily suspend a thread even if the thread wants to exchange signal. 4. Current, in ptrace, we always resume all threads, but if a thread has already a TDF_DBSUSPEND flag set by debugger, it won't run. Encouraged by: marcel, julian, deischen
* Change kse_switchin to accept kse_thr_mailbox pointer, the syscalldavidxu2004-07-121-9/+20
| | | | | | will be used heavily in debugging KSE threads. This breaks libpthread on IA64, but because libpthread was not in 5.2.1 release, I would like to change it so we needn't to introduce another syscall.
* Allocate TIDs in thread_init() and deallocate them in thread_fini().marcel2004-06-261-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | The overhead of unconditionally allocating TIDs (and likewise, unconditionally deallocating them), is amortized across multiple thread creations by the way UMA makes it possible to have type-stable storage. Previously the cost was kept down by having threads created as part of a fork operation use the process' PID as the TID. While this had some nice properties, it also introduced complexity in the way TIDs were allocated. Most importantly, by using the type-stable storage that UMA gives us this was also unnecessary. This change affects how core dumps are created and in particular how the PRSTATUS notes are dumped. Since we don't have a thread with a TID equalling the PID, we now need a different way to preserve the old and previous behavior. We do this by having the given thread (i.e. the thread passed to the core dump code in td) dump it's state first and fill in pr_pid with the actual PID. All other threads will have pr_pid contain their TIDs. The upshot of all this is that the debugger will now likely select the right LWP (=TID) as the initial thread. Credits to: julian@ for spotting how we can utilize UMA. Thanks to: all who provided julian@ with test results.
* Shuffle some code around.julian2004-06-111-42/+25
|
* Move the KSE ABI specific code here and separate it from code thatjulian2004-06-071-977/+19
| | | | | | is generic to any threading system. This commit does not link this file to the build yet, nor does it remove these functions from their current location in kern_thread.c. (that commit coming up after further review)
* Move TDF_SA from td_flags to td_pflags (and rename it accordingly)tjr2004-06-021-10/+10
| | | | | | | so that it is no longer necessary to hold sched_lock while manipulating it. Reviewed by: davidxu
* Clear KSE thread flags after KSE thread mode is ended. The side effectdavidxu2004-05-211-1/+1
| | | | | | | | of not clearing the flags for execv() syscall will result that a new program runs in KSE thread mode without enabling it. Submitted by: tjr Modified by: davidxu
OpenPOWER on IntegriCloud