path: root/sys/kern
Commit message [author, date, files changed, lines -removed/+added]
* Always acquire the UNIX domain socket subsystem lock (UNP lock)
  before dereferencing sotounpcb() and checking its value, as so_pcb
  is protected by protocol locking, not subsystem locking. This
  prevents races during close() by one thread and use of the socket
  in another. unp_bind() now asserts the UNP lock, and uipc_bind()
  now acquires the lock around calls to unp_bind().
  [rwatson, 2004-08-16, 1 file, -46/+107]
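  A minimal sketch of the locking pattern this entry describes, using
  the UNP_LOCK()/UNP_UNLOCK() macros from uipc_usrreq.c of this era;
  this is illustrative, not the committed diff:

      static int
      uipc_example(struct socket *so)
      {
              struct unpcb *unp;

              UNP_LOCK();              /* subsystem lock first */
              unp = sotounpcb(so);     /* so_pcb is stable only now */
              if (unp == NULL) {
                      /* Raced with close() in another thread. */
                      UNP_UNLOCK();
                      return (EINVAL);
              }
              /* ... operate on unp under the UNP lock ... */
              UNP_UNLOCK();
              return (0);
      }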
* Add the missing knote_fdclose(). [green, 2004-08-16, 1 file, -2/+4]
* Allocate the marker, when scanning a kqueue, from the "heap"
  instead of the stack. When swapped out, a process's kernel stack
  would be unavailable, and we could get a page fault when scanning
  the same kqueue.
  PR: kern/61849
  [green, 2004-08-16, 1 file, -6/+12]
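  A sketch of the heap-allocated marker approach (assumed shape; the
  committed code in kern_event.c may differ in detail):

      struct knote *marker;

      /*
       * Stack memory can be swapped out along with the process;
       * malloc(9) memory stays resident, so another thread scanning
       * the same kqueue cannot fault on our marker.
       */
      marker = malloc(sizeof(*marker), M_KQUEUE, M_WAITOK | M_ZERO);
      /* ... link marker into the kqueue's knote list and scan ... */
      free(marker, M_KQUEUE);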
* Annotate the current UNIX domain socket locking strategies, order,
  strengths, and weaknesses in a comment. Assert a copyright over
  the changes made as part of the locking work.
  [rwatson, 2004-08-16, 1 file, -0/+21]
* Major enhancements to pipe memory usage:
  - pipespace is now able to resize non-empty pipes; this allows for
    many more resizing opportunities
  - Backing is no longer pre-allocated for the reverse direction of
    pipes. This direction is rarely (if ever) used, so this cuts the
    amount of map space allocated to a pipe in half.
  - Pipe growth is now much more dynamic; a pipe will now grow when
    the total amount of data it contains and the size of the write
    are larger than the size of the pipe. Previously, only individual
    writes greater than the size of the pipe would cause growth (see
    the sketch after this entry).
  - In low memory situations, pipes will now shrink during both read
    and write operations, where possible. Once the memory shortage
    ends, the growth code will cause these pipes to grow back to an
    appropriate size.
  - If the full PIPE_SIZE allocation fails when a new pipe is
    created, the allocation will be retried with SMALL_PIPE_SIZE.
    This helps to deal with the situation of a fragmented map after
    a low memory period has ended.
  - Minor documentation + code changes to support the above.
  In total, these changes increase the total number of pipes that
  can be allocated simultaneously, drastically reducing the chances
  that pipe allocation will fail. Performance appears unchanged due
  to dynamic resizing.
  [silby, 2004-08-16, 1 file, -55/+136]
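  A sketch of the growth test described in the third bullet (names
  taken from sys/kern/sys_pipe.c; the committed logic may differ in
  detail):

      /*
       * Grow when buffered data plus this write exceed the pipe
       * size, not only when a single write is larger than the pipe.
       */
      if (wpipe->pipe_buffer.cnt + uio->uio_resid >
          wpipe->pipe_buffer.size &&
          wpipe->pipe_buffer.size < BIG_PIPE_SIZE) {
              /* pipespace() may fail; the pipe keeps its old size. */
              (void)pipespace(wpipe, BIG_PIPE_SIZE);
      }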
* Yet another tweak to the shutdown messages in boot():
  Don't count busy buffers before the initial call to sync() and
  don't skip the initial sync() if no busy buffers were counted.
  Always call sync() at least once if syncing is requested. This
  defers the "Syncing disks, buffers remaining..." message until
  after the initial sync() call and the first count of busy buffers.
  This backs out changes in kern_shutdown 1.162.
  Print a different message when there are no busy buffers after the
  initial sync(), which is now the expected situation.
  Print an additional message when syncing has completed
  successfully in the unusual situation where the work of syncing
  was done by boot().
  Uppercase one message to make it consistent with all of the other
  kernel shutdown messages.
  Discussed with: bde (in a much earlier form, prior to 1.162)
  Reviewed by: njl (in an earlier form)
  [truckman, 2004-08-15, 1 file, -15/+12]
* Add locking to the kqueue subsystem. This also makes the kqueue
  subsystem a more complete subsystem, and removes the knowledge of
  how things are implemented from the drivers. Include locking
  around filter ops, so a module like aio will know when not to be
  unloaded if there are outstanding knotes using its filter ops.
  Currently, it uses MTX_DUPOK even though it is not always safe to
  acquire duplicate locks. Witness currently doesn't support the
  ability to discover if a dup lock is ok (in some cases).
  Reviewed by: green, rwatson (both earlier versions)
  [jmg, 2004-08-15, 18 files, -409/+1059]
* Add a new sysctl, debug.kdb.stop_cpus, which controls whether or
  not we attempt to IPI other cpus when entering the debugger in
  order to stop them while in the debugger. The default remains to
  issue the stop; however, that can result in a hang if another cpu
  has interrupts disabled and is spinning, since the IPI won't be
  received and the KDB will wait indefinitely. We probably need to
  add a timeout, but this is a useful stopgap in the meantime.
  Reviewed by: marcel
  [rwatson, 2004-08-15, 1 file, -2/+19]
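  A sketch of how such a knob is typically wired up (assuming the
  _debug_kdb sysctl node; not the committed diff):

      static int kdb_stop_cpus = 1;
      SYSCTL_INT(_debug_kdb, OID_AUTO, stop_cpus, CTLFLAG_RW,
          &kdb_stop_cpus, 0,
          "stop other CPUs when entering the debugger");

      /* ... at debugger entry ... */
      if (kdb_stop_cpus)
              stop_cpus(PCPU_GET(other_cpus));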
* Cause pfind() not to return processes in the PRS_NEW state. As a
  result, threads consuming the result of pfind() will not need to
  check for a NULL credential pointer or other signs of an
  incompletely created process. However, this also means that
  pfind() cannot be used to test for the existence of, or find, such
  a process. Annotate pfind() to indicate that this is the case. A
  review of current consumers seems to indicate that this is not a
  problem for any of them.
  This closes a number of race conditions that could result in NULL
  pointer dereferences and related failure modes. Other related
  races continue to exist, especially during iteration of the
  allproc list without due caution.
  Discussed with: tjr, green
  [rwatson, 2004-08-14, 1 file, -1/+8]
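  A simplified sketch of the new behavior (close to, but not
  necessarily identical to, the code in sys/kern/kern_proc.c):

      struct proc *
      pfind(pid_t pid)
      {
              struct proc *p;

              sx_slock(&allproc_lock);
              LIST_FOREACH(p, PIDHASH(pid), p_hash)
                      if (p->p_pid == pid) {
                              if (p->p_state == PRS_NEW) {
                                      /* Not fully created: pretend absent. */
                                      p = NULL;
                                      break;
                              }
                              PROC_LOCK(p);
                              break;
                      }
              sx_sunlock(&allproc_lock);
              return (p);
      }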
* Add some KASSERTs. [phk, 2004-08-14, 1 file, -0/+3]
* Whitespace nit. [julian, 2004-08-14, 1 file, -1/+1]
* After completing a name lookup for a target UNIX domain socket to
  connect to, re-check that the local UNIX domain socket hasn't been
  closed while we slept, and if so, return EINVAL. This affects the
  system running both with and without Giant over the network stack,
  and recent ULE changes appear to cause it to trigger more
  frequently than previously under load.
  While here, improve catching of possibly closed UNIX domain
  sockets in one or two additional circumstances. I have a much
  larger set of related changes in Perforce, but they require more
  testing before they can be merged.
  One debugging printf is left in place to indicate when such a race
  takes place: this is typically triggered by a buggy application
  that simultaneously connect()'s and close()'s a UNIX domain socket
  file descriptor. I'll remove this at some point in the future, but
  am interested in seeing how frequently this is reported.
  In the case of Martin's reported problem, it appears to be a
  result of a non-thread-safe syslog() implementation in the C
  library, which does not synchronize access to its logging file
  descriptor.
  Reported by: mbr
  [rwatson, 2004-08-14, 1 file, -5/+18]
* clean up whitespace... [jmg, 2004-08-13, 1 file, -55/+55]
* looks like rwatson forgot tabs... :) [jmg, 2004-08-13, 1 file, -2/+2]
* Don't keep evaluating our own cpu mask; it's not likely to have
  changed...
  [julian, 2004-08-13, 1 file, -2/+3]
* Trim trailing white space. [rwatson, 2004-08-12, 1 file, -11/+11]
* Minor formatting fixes for lines > 80 characters.
  [imp, 2004-08-12, 2 files, -29/+31]
* - Introduce a new flag KEF_HOLD that prevents sched_add() from
    doing a migration. Use this in sched_prio() and sched_switch()
    to stop us from migrating threads that are in short term sleeps
    or are runnable. These extra migrations were added in the
    patches to support KSE.
  - Only set NEEDRESCHED if the thread we're adding in sched_add()
    is a lower priority and is being placed on the current queue.
  - Fix some minor whitespace problems.
  [jeff, 2004-08-12, 1 file, -7/+19]
* Properly keep track of how many kses are on the system run
  queue(s).
  [julian, 2004-08-11, 1 file, -2/+3]
* Replace a reference to splnet() with a reference to locking in a
  comment.
  [rwatson, 2004-08-11, 1 file, -1/+1]
* Add __elfN(dump_thread). This function is called from
  __elfN(coredump) to allow dumping per-thread machine-specific
  notes. On ia64 we use this function to flush the dirty registers
  onto the backingstore before we write out the PRSTATUS notes.
  Tested on: alpha, amd64, i386, ia64 & sparc64
  Not tested on: arm, powerpc
  [marcel, 2004-08-11, 1 file, -2/+5]
* In v_addpollinfo(), we allocate storage to back vp->v_pollinfo.
  However, we may sleep when doing so; check that we didn't race
  with another thread allocating storage for the vnode after
  allocation is made to a local pointer, and only update the vnode
  pointer if it's still NULL. Otherwise, accept that another thread
  got there first, and release the local storage.
  Discussed with: jmg
  [rwatson, 2004-08-11, 1 file, -1/+7]
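  A sketch of the allocate-then-publish pattern described here (the
  committed code may differ in detail):

      void
      v_addpollinfo(struct vnode *vp)
      {
              struct vpollinfo *vi;

              /* Allocation may sleep; do it before checking the vnode. */
              vi = uma_zalloc(vnodepoll_zone, M_WAITOK);
              mtx_init(&vi->vpi_lock, "vnode pollinfo", NULL, MTX_DEF);
              VI_LOCK(vp);
              if (vp->v_pollinfo == NULL) {
                      /* Still unset: we won the race; publish ours. */
                      vp->v_pollinfo = vi;
              } else {
                      /* Another thread got there first; discard ours. */
                      mtx_destroy(&vi->vpi_lock);
                      uma_zfree(vnodepoll_zone, vi);
              }
              VI_UNLOCK(vp);
      }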
* Eliminate the acquisition and release of Giant within physio().
  Remove the spl calls.
  Reviewed by: phk@
  Discussed with: scottl@
  [alc, 2004-08-10, 1 file, -6/+0]
* Synchronize the extra SA threading checks and return value
  handling of condition variables with that of msleep().
  Reviewed by: davidxu
  [jhb, 2004-08-10, 1 file, -24/+50]
* - Use a new flag, KEF_XFERABLE, to record with certainty that this
    kse had contributed to the transferable load count. This
    prevents any potential problems with sched_pin() being used
    around calls to setrunqueue().
  - Change the sched_add() load balancing algorithm to try to
    migrate on wakeup. This attempts to place threads that
    communicate with each other on the same CPU.
  - Don't clear the idle counts in kseq_transfer(); let the cpus do
    that when they call sched_add() from kseq_assign().
  - Correct a few out of date comments.
  - Make sure the ke_cpu field is correct when we preempt.
  - Call kseq_assign() from sched_clock() to catch any assignments
    that were done without IPI. Presently all assignments are done
    with an IPI, but I'm trying a patch that limits that.
  - Don't migrate a thread if it is still runnable in sched_add().
    Previously, this could only happen for KSE threads, but due to
    changes to sched_switch() all threads went through this path.
  - Remove some code that was added with preemption but is not
    necessary.
  [jeff, 2004-08-10, 1 file, -34/+76]
* Skip the syncing disks loop if there are no dirty buffers. Remove
  a variable used to flag the initial printf.
  Submitted by: truckman (earlier version)
  [njl, 2004-08-10, 2 files, -6/+14]
* Add a temporary debugging hack to detect a deadlock in
  setrunqueue(). This is here so that we can gather stats on the
  nature of the recent rash of hard lockups, and in this particular
  case panic the machine instead of letting it deadlock forever.
  [scottl, 2004-08-10, 1 file, -0/+7]
* Slight changes to comments and some whitespace changes.
  [julian, 2004-08-09, 1 file, -3/+10]
* Make kg->kg_runnable actually count runnable threads in the ksegrp
  run queue instead of only doing it sometimes. This is not used
  outside of debugging code in the current code, but that will
  probably change.
  [julian, 2004-08-09, 1 file, -4/+5]
* Remove typos in KASSERT messages. [julian, 2004-08-09, 1 file, -3/+3]
* Normalize the VM wiring done with SPARSE_MAPPING: check for
  errors, and unmap when done. For whatever reason, SPARSE_MAPPING
  is not even a config option, so this is dead code.
  [green, 2004-08-09, 1 file, -10/+17]
* Increase the amount of data exported by KTR in the KTR_RUNQ
  setting. This extra data is needed to really follow what is going
  on in the threaded case.
  [julian, 2004-08-09, 5 files, -21/+28]
* Add an option to automatically mark core dumps with the nodump
  flag.
  PR: 57065
  Submitted by: Walter C. Pelissero
  [jmg, 2004-08-09, 1 file, -0/+6]
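  A sketch of one way to apply the flag (assumed shape, not the
  committed diff): after the core file is written, set UF_NODUMP on
  its vnode so dump(8) will skip it:

      struct vattr vattr;

      VATTR_NULL(&vattr);
      /* A real implementation would OR UF_NODUMP into the existing
       * flags rather than overwrite them, as done here. */
      vattr.va_flags = UF_NODUMP;
      vn_lock(vp, LK_EXCLUSIVE | LK_RETRY, td);
      (void)VOP_SETATTR(vp, &vattr, td->td_ucred, td);
      VOP_UNLOCK(vp, 0, td);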
* 1. Add KSE_INTR_DBSUSPEND command for kse_thr_interrupt to suspend
     a bound thread; after the bound thread leaves the critical
     region, the thread should check the debug flag and may suspend
     itself by using the command.
  2. Schedule an upcall after a thread is suspended by the debugger.
  3. Wake up the upcall thread after process suspension.
  Reviewed by: deischen
  [davidxu, 2004-08-08, 1 file, -29/+46]
* Call thread_user_enter() for M:N threads; ast() should be treated
  as another entrance into the kernel.
  [davidxu, 2004-08-08, 1 file, -0/+2]
* Add pl_flags to ptrace_lwpinfo; two flags, PL_FLAG_SA and
  PL_FLAG_BOUND, indicate that a thread is in the UTS critical
  region.
  Reviewed by: deischen
  Approved by: marcel
  [davidxu, 2004-08-08, 1 file, -0/+7]
* Make sure that AT_PHDR has a useful value even for static
  programs.
  [dfr, 2004-08-08, 1 file, -0/+11]
* Rearrange some code that handles the thread taskqueue so that it
  is more generic. Introduce a new define, TASKQUEUE_DEFINE_THREAD,
  that takes a single arg, which is the name of the queue.
  Document these changes.
  [jmg, 2004-08-08, 1 file, -13/+16]
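  A usage sketch for the new macro (my_tq, my_task, and my_handler
  are hypothetical names):

      #include <sys/param.h>
      #include <sys/kernel.h>
      #include <sys/malloc.h>
      #include <sys/taskqueue.h>

      static struct task my_task;

      static void
      my_handler(void *arg, int pending)
      {
              /* Deferred work runs in the taskqueue's thread. */
      }

      /* Creates taskqueue_my_tq and a kernel thread to service it. */
      TASKQUEUE_DEFINE_THREAD(my_tq);

      /* Later, e.g. from an interrupt handler: */
      TASK_INIT(&my_task, 0, my_handler, NULL);
      taskqueue_enqueue(taskqueue_my_tq, &my_task);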
* We're not yet ready to assert !Giant in kern_fcntl(), as it's
  called with Giant from ABI wrappers such as Linux emulation.
  Foot shoot off: phk
  [rwatson, 2004-08-07, 1 file, -5/+4]
* Flag a broad range of VFS operations as GIANT_REQUIRED in order to
  catch leaking into VFS without Giant.
  Inch Giant a little lower in several file descriptor operations on
  vnodes to cover only VFS operations that need it, rather than file
  flag reading, etc.
  [rwatson, 2004-08-06, 1 file, -2/+24]
* In thread_exit(), include more information about the
  thread/process context in the KTR trace record. In particular,
  include the same information as passed for mi_switch() and
  fork_exit() KTR trace records.
  [rwatson, 2004-08-06, 1 file, -1/+2]
* Push UIDINFO_UNLOCK() slightly earlier in chgsbsize(), as it's not
  needed if we print the local variable version of the limit rather
  than the shared version.
  [rwatson, 2004-08-06, 1 file, -2/+2]
* Avoid acquiring Giant for some common light-weight or already
  MPSAFE fcntl() operations, including:
    F_DUPFD  dup() alias
    F_GETFD  retrieve close-on-exec flag
    F_SETFD  set close-on-exec flag
    F_GETFL  retrieve file descriptor flags
  For the remaining fcntl() operations, do acquire Giant, especially
  where we call into fo_ioctl() as a result. We're not yet ready to
  push Giant into fo_ioctl(). Once we do, this can all become quite
  a bit prettier.
  [rwatson, 2004-08-06, 1 file, -2/+31]
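  A sketch of the dispatch shape this describes (simplified;
  kern_fcntl() stands in for the common implementation):

      switch (cmd) {
      case F_DUPFD:
      case F_GETFD:
      case F_SETFD:
      case F_GETFL:
              /* MPSAFE: these touch only the file descriptor table. */
              error = kern_fcntl(td, fd, cmd, arg);
              break;
      default:
              /* May reach fo_ioctl(), which still needs Giant. */
              mtx_lock(&Giant);
              error = kern_fcntl(td, fd, cmd, arg);
              mtx_unlock(&Giant);
              break;
      }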
* Cut a KTR record whenever a callout is invoked. Mark whether it
  runs with Giant or not, and include the function pointer so it can
  be looked up against the kernel symbol table during trace
  analysis.
  [rwatson, 2004-08-06, 1 file, -0/+4]
* Don't scare users with a warning about preemption being off when
  it isn't yet safe to have it on by default.
  [jhb, 2004-08-06, 1 file, -0/+2]
* In ithread_schedule(), when we plan to go harvest some entropy as
  a result of scheduling an ithread, cut a KTR_INTR trace record so
  that it's clear in tracing interrupt activity where and when the
  entropy harvesting code is invoked.
  [rwatson, 2004-08-06, 1 file, -2/+4]
* When resetting a pending callout, perform the deregistration in
  callout_reset() rather than calling callout_stop(). This results
  in a few lines of code duplication, but it provides a significant
  performance improvement because it avoids recursing on
  callout_lock.
  Requested by: rwatson
  [cperciva, 2004-08-06, 1 file, -2/+16]
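  A sketch of the idea (simplified from sys/kern/kern_timeout.c;
  details assumed): dequeue the pending callout inline while
  callout_lock is already held, instead of recursing into
  callout_stop():

      mtx_lock_spin(&callout_lock);
      if (c->c_flags & CALLOUT_PENDING) {
              /* Was: callout_stop(c), which re-takes callout_lock. */
              TAILQ_REMOVE(&callwheel[c->c_time & callwheelmask], c,
                  c_links.tqe);
              c->c_flags &= ~CALLOUT_PENDING;
      }
      /* ... then (re)register the callout as usual ... */
      mtx_unlock_spin(&callout_lock);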
* Fix the code in rman that merges adjacent unallocated resources to
  use a better check for 'adjacent'. The old code assumed that if
  two resources were adjacent in the linked list, they were also
  adjacent range-wise. This is not true when a resource manager has
  to manage disparate regions. For example, the current interrupt
  code on i386/amd64 will instruct irq_rman to manage two disjoint
  regions: 0-1 and 3-15 for the non-APIC case. If IRQs 1 and 3 were
  allocated and then released, the old code would coalesce across
  the 1 to 3 boundary because the resources were adjacent in the
  linked list, thus adding 2 to the area of resources that irq_rman
  managed as a side effect. The fix adds extra checks so that
  adjacent unallocated resources are only merged with the resource
  being freed if the start and end values of the resources also
  match up. The patch also consolidates the checks for adjacent
  resources being allocated.
  [jhb, 2004-08-05, 1 file, -5/+12]
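  A sketch of the tightened test (s is the list neighbor of the
  resource r being freed; assumed names):

      /*
       * List adjacency alone is not enough: the address ranges must
       * also abut before two free resources may be coalesced.
       */
      if (s != NULL && (s->r_flags & RF_ALLOCATED) == 0 &&
          s->r_end + 1 == r->r_start) {
              /* Safe to merge s and r. */
      }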
* Remove a potential deadlock on i386 SMP by changing the lazypmap
  IPI and spin-wait code to use the same spin mutex (smp_tlb_mtx) as
  the TLB IPI and spin-wait code snippets, so that you can't get
  into the situation of one CPU doing a TLB shootdown to another CPU
  that is doing a lazy pmap shootdown, each of which is waiting on
  the other. With this change, only one of the CPUs would do an IPI
  and spin-wait at a time.
  [jhb, 2004-08-04, 1 file, -1/+0]
* Work around a possible deadlock on SMP due to a spin lock LOR by
  disabling the immediate awakening of proc0 (the scheduler kproc,
  which controls swapping processes in and out). The scheduler
  process periodically awakens already, so this will not result in
  processes not being swapped in; there will just be more latency
  between a thread being made runnable and the scheduler waking up
  to swap the affected process back in.
  [jhb, 2004-08-04, 1 file, -0/+6]