path: root/sys/kern
Commit message [Author, Age, Files, Lines]
...
* - Push down Giant in exit() and wait(). [jhb, 2004-03-05, 2 files, -26/+42]
| | | | | | | | - Push Giant down a bit in coredump() and call coredump() with the proc lock already held rather than unlocking it only to turn around and relock it. Requested by: peter
* Lock Giant around the single threading code in exec() to satisfy an [jhb, 2004-03-05, 1 file, -0/+3]
| | | | assertion in the single threading code.
* - Grab a share lock of the proctree lock while looking for a pid due to the [jhb, 2004-03-05, 1 file, -13/+25]
| | | | | | | | process group and session dereferences. Also, check that p_pgrp and p_session are non-NULL before dereferencing them. - Push down Giant in fork1(). Requested by: peter
* Undo the merger of mlock()/vslock and munlock()/vsunlock() and the [truckman, 2004-03-05, 1 file, -3/+3]
| | | | | | | | | | | | | | | | | | | | | | | | | | introduction of kern_mlock() and kern_munlock() in src/sys/kern/kern_sysctl.c 1.150 src/sys/vm/vm_extern.h 1.69 src/sys/vm/vm_glue.c 1.190 src/sys/vm/vm_mmap.c 1.179 because different resource limits are appropriate for transient and "permanent" page wiring requests. Retain the kern_mlock() and kern_munlock() API in the revived vslock() and vsunlock() functions. Combine the best parts of each of the original sets of implementations with further code cleanup. Make the mlock() and vslock() implementations as similar as possible. Retain the RLIMIT_MEMLOCK check in mlock(). Move the most stringent test, which can return EAGAIN, last so that requests that have no hope of ever being satisfied will not be retried unnecessarily. Disable the test that can return EAGAIN in the vslock() implementation because it will cause the sysctl code to wedge. Tested by: Cy Schubert <Cy.Schubert AT komquats.com>
* The roundrobin callout from sched_4bsd is MPSAFE, so set up the [rwatson, 2004-03-05, 1 file, -1/+1]
| | | | | | callout as MPSAFE to avoid grabbing Giant. Reviewed by: jhb
* Put "failed to set signal flags properly for ast()" check under [rwatson, 2004-03-05, 1 file, -1/+1]
| | | | | | | | | | DIAGNOSTIC instead of INVARIANTS. INVARIANTS is intended for tests that don't substantially change code flow or behavior (passive), but this test required locking both the proc lock and scheduler lock in order to execute. It also appears to be a very advisory diagnostic as opposed to an invariant violation. Following discussion with: bde
* Just because the timecounter reads the same value on two samples [phk, 2004-03-04, 1 file, -4/+0]
| | | | after each other doesn't mean that nothing happened.
* Fixed some style bugs (mainly English usage errors in comments). [bde, 2004-03-04, 1 file, -9/+10]
|
* Fixed some style bugs (mainly misplaced comments, and totally disordered [bde, 2004-03-04, 1 file, -15/+18]
| | | | declarations in acct_process()).
* Remove unneeded label 'done2' from socket(). We now grab Giant [rwatson, 2004-03-04, 1 file, -2/+1]
| | | | | | | only around socreate(), and don't need it for file descriptor accesses. Submitted by: sam
* Use different dummy wait channels to avoid panic in msleep(). [des, 2004-03-03, 1 file, -3/+3]
| | | | Reviewed by: jhb
* Always assert that the passed in lock is the same as the saved lock in the [jhb, 2004-03-02, 1 file, -19/+1]
| | | | sleep queue now that the one abnormal case has been fixed.
* Correct handling of PDROP in msleep() to just skip the mtx_lock() rather [jhb, 2004-03-02, 1 file, -3/+1]
| | | | | than clear the lock pointer so that sleepq_add() still gets the correct lock pointer and doesn't bogusly trip an assertion.
* Check for TDF_SINTR before calling sleepq_abort() as there is a narrow [jhb, 2004-03-01, 2 files, -2/+2]
| | | | | | | | | | | race in between sleepq_add() and sleepq_catch_signals() in that setting td_wchan and TDF_SINTR is not atomic to sched_lock but only to the sleepq lock. This band-aid will stop assertion failures, but there is perhaps a larger problem with the sleepq_add/sleepq_catch_signals race that I am not sure how to solve. For the signals case the race is harmless because we always call cursig() after setting TDF_SINTR. However, KSE doesn't do anything in sleepq_catch_signals() to check that this race was lost, so I am unsure if this race is harmful for this specific abort.
* Rename dup_sockaddr() to sodupsockaddr() for consistency with other [rwatson, 2004-03-01, 4 files, -24/+22]
| | | | | | | | | | | | functions in kern_socket.c. Rename the "canwait" field to "mflags" and pass M_WAITOK and M_NOWAIT in from the caller context rather than "1" or "0". Correct mflags pass into mac_init_socket() from previous commit to not include M_ZERO. Submitted by: sam
* Convert the other use of flags to mflags in soalloc(). [scottl, 2004-03-01, 1 file, -1/+1]
|
* Modify soalloc() API so that it accepts a malloc flags argument rather [rwatson, 2004-02-29, 3 files, -12/+5]
| | | | | | | | | than a "waitok" argument. Callers now pass M_WAITOK or M_NOWAIT rather than 0 or 1. This simplifies the soalloc() logic, and also makes the waiting behavior of soalloc() clearer in the calling context. Submitted by: sam
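The API change described in this entry can be pictured with a small userland sketch. It is not the kernel code: the `sk_` names, the flag values and the `so_mflags` field are hypothetical stand-ins chosen for illustration, and `sk_malloc()` only imitates the zeroing behavior of malloc(9).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's M_NOWAIT, M_WAITOK and M_ZERO. */
#define SK_NOWAIT 0x0001
#define SK_WAITOK 0x0002
#define SK_ZERO   0x0100

struct sk_socket {
	int so_count;
	int so_mflags;	/* recorded only so the sketch can be inspected */
};

/* Userland stand-in for malloc(9): honors only the zeroing flag. */
void *
sk_malloc(size_t size, int mflags)
{
	void *p = malloc(size);

	if (p != NULL && (mflags & SK_ZERO) != 0)
		memset(p, 0, size);
	return (p);
}

/* Old shape: a boolean that the callee must translate into flags. */
struct sk_socket *
sk_soalloc_old(int waitok)
{
	int mflags = waitok ? SK_WAITOK : SK_NOWAIT;
	struct sk_socket *so = sk_malloc(sizeof(*so), mflags | SK_ZERO);

	if (so != NULL)
		so->so_mflags = mflags;
	return (so);
}

/* New shape: the caller states its waiting policy directly. */
struct sk_socket *
sk_soalloc(int mflags)
{
	struct sk_socket *so = sk_malloc(sizeof(*so), mflags | SK_ZERO);

	if (so != NULL)
		so->so_mflags = mflags;
	return (so);
}
```

With the second form, a call such as `sk_soalloc(SK_NOWAIT)` documents at the call site that the caller cannot sleep, which a bare `0` did not.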
* Loudly announce WITNESS and DIAGNOSTIC options and warn about reduced [phk, 2004-02-29, 1 file, -0/+14]
| | | | performance.
* Make sure to disable the watchdog if we cannot honour the timeout. [phk, 2004-02-28, 1 file, -3/+2]
|
* Rename the WATCHDOG option to SW_WATCHDOG and make it use the [phk, 2004-02-28, 1 file, -33/+29]
| | | | | | | | | | | generic watchdog(9) interface. Make watchdogd(8) perform as watchdog(8) as well, and make it possible to specify a check command to run, timeout and sleep periods. Update watchdog(4) to talk about the generic interface and add new watchdog(8) page.
* Switch the sleep/wakeup and condition variable implementations to use the [jhb, 2004-02-27, 9 files, -569/+157]
| | | | | | | | | | | | | | | | | | | | | | | | | | | sleep queue interface: - Sleep queues attempt to merge some of the benefits of both sleep queues and condition variables. Having sleep queues in a hash table avoids having to allocate a queue head for each wait channel. Thus, struct cv has shrunk down to just a single char * pointer now. However, the hash table does not hold threads directly, but queue heads. This means that once you have located a queue in the hash bucket, you no longer have to walk the rest of the hash chain looking for threads. Instead, you have a list of all the threads sleeping on that wait channel. - Outside of the sleepq code and the sleep/cv code the kernel no longer differentiates between cv's and sleep/wakeup. For example, calls to abortsleep() and cv_abort() are replaced with a call to sleepq_abort(). Thus, the TDF_CVWAITQ flag is removed. Also, calls to unsleep() and cv_waitq_remove() have been replaced with calls to sleepq_remove(). - The sched_sleep() function no longer accepts a priority argument as sleeps no longer inherently bump the priority. Instead, this is solely a property of msleep() which explicitly calls sched_prio() before blocking. - The TDF_ONSLEEPQ flag has been dropped as it was never used. The associated TDF_SET_ONSLEEPQ and TDF_CLR_ON_SLEEPQ macros have also been dropped and replaced with a single explicit clearing of td_wchan. TD_SET_ONSLEEPQ() would really have only made sense if it had taken the wait channel and message as arguments anyway. Now that that only happens in one place, a macro would be overkill.
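The data-structure point in the first bullet (the hash table holds queue heads, and each head holds the sleepers for one wait channel) can be sketched as follows. This is an illustrative userland model, not the contents of kern/subr_sleepqueue.c: all `sk_` names, the table size and the hash function are assumptions, locking is omitted, and the caller-supplied `spare` queue head is a simplification of how storage for a new head is obtained.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_TABLESIZE 16	/* hypothetical bucket count */

struct sk_thread {
	const char *td_name;
	struct sk_thread *td_next;	/* next sleeper on the same channel */
};

struct sk_sleepq {
	const void *sq_wchan;		/* wait channel this head serves */
	struct sk_thread *sq_threads;	/* all threads sleeping on it */
	struct sk_sleepq *sq_hash;	/* next queue head in this bucket */
};

struct sk_sleepq *sk_table[SK_TABLESIZE];

size_t
sk_hash(const void *wchan)
{
	return (((uintptr_t)wchan >> 4) % SK_TABLESIZE);
}

/* Walk one hash chain of queue heads; threads are never scanned here. */
struct sk_sleepq *
sk_lookup(const void *wchan)
{
	struct sk_sleepq *sq;

	for (sq = sk_table[sk_hash(wchan)]; sq != NULL; sq = sq->sq_hash)
		if (sq->sq_wchan == wchan)
			return (sq);
	return (NULL);
}

/* Queue td on wchan, linking in the spare head if none exists yet. */
struct sk_sleepq *
sk_add(const void *wchan, struct sk_thread *td, struct sk_sleepq *spare)
{
	struct sk_sleepq *sq = sk_lookup(wchan);

	if (sq == NULL) {
		size_t h = sk_hash(wchan);

		sq = spare;
		sq->sq_wchan = wchan;
		sq->sq_threads = NULL;
		sq->sq_hash = sk_table[h];
		sk_table[h] = sq;
	}
	td->td_next = sq->sq_threads;
	sq->sq_threads = td;
	return (sq);
}

/* Count sleepers on wchan by walking only that channel's list. */
int
sk_sleepers(const void *wchan)
{
	struct sk_sleepq *sq = sk_lookup(wchan);
	struct sk_thread *td;
	int n = 0;

	if (sq != NULL)
		for (td = sq->sq_threads; td != NULL; td = td->td_next)
			n++;
	return (n);
}
```

Once `sk_lookup()` has found the head, a wakeup-style operation only touches `sq_threads`, which is the benefit the commit message describes.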
* Drop sched_lock around the wakeup of the parent process after setting [jhb, 2004-02-27, 1 file, -4/+9]
| | | | | | the process state to zombie when a process exits to avoid a lock order reversal with the sleepqueue locks. This appears to be the only place that we call wakeup() with sched_lock held.
* Add an implementation of a generic sleep queue abstraction that is used [jhb, 2004-02-27, 3 files, -5/+777]
| | | | | | | | | | | | | | | | to queue threads sleeping on a wait channel similar to how turnstiles are used to queue threads waiting for a lock. This subsystem will be used as the backend for sleep/wakeup and condition variables initially. Eventually it will also be used to replace the ithread-specific iwait thread inhibitor. Sleep queues are also not locked by sched_lock, so this splits sched_lock up a bit further increasing concurrency within the scheduler. Sleep queues also natively support timeouts on sleeps and interruptible sleeps allowing for the reduction of a lot of duplicated code between the sleep/wakeup and condition variable implementations. For more details on the sleep queue implementation, check the comments in sys/sleepqueue.h and kern/subr_sleepqueue.c.
* Add sysctl_move_oid() which reparents an existing OID. [des, 2004-02-27, 1 file, -0/+20]
|
* Clarify and tweak some comments. [jhb, 2004-02-27, 1 file, -3/+3]
|
* Fix _sx_assert() to panic() rather than printf() when an assertion fails [jhb, 2004-02-27, 1 file, -3/+5]
| | | | and ignore assertions if we have already panicked.
* Replace the ktrace queue's semaphore with a condition variable instead as [jhb, 2004-02-26, 1 file, -5/+5]
| | | | | it is slightly more efficient since we already have a mutex to protect the queue. Ktrace originally used a semaphore more as a proof of concept.
* Split the mlock() kernel code into two parts, mlock(), which unpacks [truckman, 2004-02-26, 6 files, -24/+47]
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the syscall arguments and does the suser() permission check, and kern_mlock(), which does the resource limit checking and calls vm_map_wire(). Split munlock() in a similar way. Enable the RLIMIT_MEMLOCK checking code in kern_mlock(). Replace calls to vslock() and vsunlock() in the sysctl code with calls to kern_mlock() and kern_munlock() so that the sysctl code will obey the wired memory limits. Nuke the vslock() and vsunlock() implementations, which are no longer used. Add a member to struct sysctl_req to track the amount of memory that is wired to handle the request. Modify sysctl_wire_old_buffer() to return an error if its call to kern_mlock() fails. Only wire the minimum of the length specified in the sysctl request and the length specified in its argument list. It is recommended that sysctl handlers that use sysctl_wire_old_buffer() should specify reasonable estimates for the amount of data they want to return so that only the minimum amount of memory is wired no matter what length has been specified by the request. Modify the callers of sysctl_wire_old_buffer() to look for the error return. Modify sysctl_old_user to obey the wired buffer length and clean up its implementation. Reviewed by: bms
* Assert pipe mutex in pipeselwakeup(), as we manipulate pipe_state [rwatson, 2004-02-26, 1 file, -0/+1]
| | | | | in a non-atomic manner. It appears to always be called with the mutex (good).
* Update comment regarding MAC labels: we no longer pass endpoints [rwatson, 2004-02-25, 1 file, -7/+3]
| | | | | | | into the MAC Framework, just the pipe pair. GC 'hadpeer' used in pipedestroy(), which is no longer needed as we check pipe_present flags on the pair.
* Whitespace cleanup [des, 2004-02-24, 1 file, -4/+4]
|
* Fix two oversights here: don't trash the freelist, and properly cleanup [phk, 2004-02-23, 1 file, -1/+4]
| | | | | | the cdevsw{}. Submitted by: tegge
* Correct some major SMP-harmful problems in the pipe implementation. First [green, 2004-02-22, 1 file, -41/+66]
| | | | | | | | | | | | of all, PIPE_EOF is not checked pervasively after everything that can drop the pipe mutex and msleep(), so fix. Additionally, though it might not harm anything, pipelock() and pipeunlock() are not used consistently. Third, the kqueue support functions do not use the pipe mutex correctly. Last, but absolutely not least, is a race: if pipe_busy is not set on the closing side of the pipe, the other side that is trying to write to that will crash BECAUSE PIPE_EOF IS NOT SET! Unconditionally set PIPE_EOF, and get rid of all the lockups/crashes I have seen trying to build ports.
* Add sysctls to allow showing threads for pgrp, tty, uid, ruid, [deischen, 2004-02-22, 1 file, -7/+31]
| | | | and pid.
* Reimplement sysctls handling by MAC framework. [pjd, 2004-02-22, 1 file, -15/+15]
| | | | | | | | | | | Now I believe it is done in the right way. Removed some XXMAC cases, we now assume 'high' integrity level for all sysctls, except those with CTLFLAG_ANYBODY flag set. No more magic. Reviewed by: rwatson Approved by: rwatson, scottl (mentor) Tested with: LINT (compilation), mac_biba(4) (functionality)
* If we're going to panic(), do it before dereferencing a NULL pointer. [cperciva, 2004-02-22, 1 file, -1/+1]
| | | | | Reported by: "Ted Unangst" <tedu@coverity.com> Approved by: rwatson (mentor)
* Update my personal copyrights and NETA copyrights in the kernel [rwatson, 2004-02-22, 4 files, -5/+5]
| | | | | | | | to use the "year1-year3" format, as opposed to "year1, year2, year3". This seems to make lawyers more happy, but also prevents the lines from getting excessively long as the years start to add up. Suggested by: imp
* Check for NODEV return from udev2dev() [phk, 2004-02-21, 1 file, -0/+2]
|
* Device megapatch 6/6: [phk, 2004-02-21, 3 files, -33/+163]
| | | | | | | | | | | | | | | | | | | | | | | | | This is what we came here for: Hang dev_t's from their cdevsw, refcount cdevsw and dev_t and generally keep track of things a lot better than we used to: Hold a cdevsw reference around all entrances into the device driver, this will be necessary to safely determine when we can unload driver code. Hold a dev_t reference while the device is open. KASSERT that we do not enter the driver on a non-referenced dev_t. Remove old D_NAG code, anonymous dev_t's are not a problem now. When destroy_dev() is called on a referenced dev_t, move it to dead_cdevsw's list. When the refcount drops, free it. Check that cdevsw->d_version is correct. If not, set all methods to the dead_*() methods to prevent entrance into driver. Print warning on console to this effect. The device driver may still explode if it is also incompatible with newbus, but in that case we probably didn't get this far in the first place.
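The version check and the reference held around driver entry, as described in this entry, might look roughly like the sketch below. It is a userland illustration only: the `sk_` names, the single `d_open` method, the plain integer reference count and the ENXIO return from the dead method are assumptions, and no locking is shown.

```c
#include <errno.h>
#include <stdio.h>

#define SK_D_VERSION 1	/* hypothetical stand-in for D_VERSION */

struct sk_cdevsw {
	int d_version;
	const char *d_name;
	int (*d_open)(void);
	int d_refs;	/* references held while inside the driver */
};

/* Stand-in for a dead_*() method: refuses entry into the driver. */
int
sk_dead_open(void)
{
	return (ENXIO);
}

/* Example driver method, used to show the normal path. */
int
sk_example_open(void)
{
	return (0);
}

/* On a version mismatch, warn and route the method to the dead one. */
void
sk_prep_cdevsw(struct sk_cdevsw *sw)
{
	if (sw->d_version != SK_D_VERSION) {
		fprintf(stderr, "WARNING: driver \"%s\" has wrong d_version\n",
		    sw->d_name);
		sw->d_open = sk_dead_open;
	}
}

/* Hold a reference for the duration of the call into the driver. */
int
sk_enter_open(struct sk_cdevsw *sw)
{
	int error;

	sw->d_refs++;
	error = sw->d_open();
	sw->d_refs--;
	return (error);
}
```

A driver structure initialized with the wrong version then fails cleanly at `sk_enter_open()` instead of running code built against a different interface.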
* Device megapatch 5/6: [phk, 2004-02-21, 2 files, -10/+13]
| | | | | | | | | | | | Remove the unused second argument from udev2dev(). Convert all remaining users of makedev() to use udev2dev(). The semantic difference is that udev2dev() will only locate a pre-existing dev_t, it will not like makedev() create a new one. Apart from the tiny well controlled window in D_PSEUDO drivers, there should no longer be any "anonymous" dev_t's in the system now, only dev_t's created with make_dev() and make_dev_alias().
* Device megapatch 4/6: [phk, 2004-02-21, 8 files, -4/+18]
| | | | | | | | Introduce d_version field in struct cdevsw, this must always be initialized to D_VERSION. Flip sense of D_NOGIANT flag to D_NEEDGIANT, this involves removing four D_NOGIANT flags and adding 145 D_NEEDGIANT flags.
* Device megapatch 3/6: [phk, 2004-02-21, 3 files, -4/+26]
| | | | | | | | | | | | Add missing D_TTY flags to various drivers. Complete asserts that dev_t's passed to ttyread(), ttywrite(), ttypoll() and ttykqwrite() have (d_flags & D_TTY) and a struct tty pointer. Make ttyread(), ttywrite(), ttypoll() and ttykqwrite() the default cdevsw methods for D_TTY drivers and remove the explicit initializations in various drivers cdevsw structures.
* Device megapatch 2/6: [phk, 2004-02-21, 1 file, -8/+129]
| | | | | | | | | | | | | | | | | | | | | | | | | | This commit adds a couple of functions for pseudodrivers to use for implementing cloning in a manner we will be able to lock down (shortly). Basically what happens is that pseudo drivers get a way to ask for "give me the dev_t with this unit number" or alternatively "give me a dev_t with the lowest guaranteed free unit number" (there is unfortunately a lot of non-POLA in the exact numeric value of this number, just live with it for now). Managing the unit number space this way removes the need to use rman(9) to do so in the drivers; this greatly simplifies the code in the drivers because even using rman(9) they still needed to manage their dev_t's anyway. I have taken the if_tun, if_tap, snp and nmdm drivers through the mill, partly because they (ab)used makedev(), but mostly because together they represent three different problems for device-cloning: if_tun and snp is the plain case: just give me a device. if_tap has two kinds of devices, with a flag for device type. nmdm has paired devices (a la pty) and you can clone either of them.
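The two requests quoted in this entry ("the dev_t with this unit number" and "a dev_t with the lowest free unit number") can be modeled with a sorted list, as in the sketch below. The function name `sk_clone_get()`, the convention that a negative unit means "lowest free", and the list layout are all hypothetical; the commit's actual functions and their numbering quirks are not reproduced here.

```c
#include <stdlib.h>

struct sk_clone {
	int unit;
	struct sk_clone *next;	/* list is kept sorted by unit */
};

/*
 * unit >= 0: return the entry with that unit, creating it if needed.
 * unit <  0: create an entry with the lowest unit not yet in the list.
 */
struct sk_clone *
sk_clone_get(struct sk_clone **head, int unit)
{
	struct sk_clone **pp, *n;
	int low = 0;

	for (pp = head; *pp != NULL; pp = &(*pp)->next) {
		if (unit >= 0) {
			if ((*pp)->unit == unit)
				return (*pp);	/* already exists */
			if ((*pp)->unit > unit)
				break;		/* insertion point */
		} else {
			if ((*pp)->unit > low)
				break;		/* found a gap at "low" */
			low = (*pp)->unit + 1;
		}
	}
	n = malloc(sizeof(*n));
	if (n == NULL)
		return (NULL);
	n->unit = (unit >= 0) ? unit : low;
	n->next = *pp;
	*pp = n;
	return (n);
}
```

Keeping the list sorted lets a single pass both find an existing unit and detect the first gap, so the driver itself never has to track which numbers are in use.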
* Make sure to wake up any select waiters when closing a kqueue (also, not [green, 2004-02-20, 1 file, -0/+4]
| | | | | | | crash). I am fairly sure that only people with SMP and multi-threaded apps using kqueue will be affected by this, so I have a stress-testing program on my web site: <URL:http://green.homeunix.org/~green/getaddrinfo-pthreads-stresstest.c>
* Tidy up the thread taskqueue implementation and close a lost wakeup race. [jhb, 2004-02-19, 1 file, -14/+9]
| | | | | | | Instead of creating a mutex that we msleep on but don't actually lock when doing the corresponding wakeup(), in the kthread, lock the mutex associated with our taskqueue and msleep while the queue is empty. Assert that the queue is locked when the callback function is called to wake the kthread.
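The pattern this entry describes (hold the queue's own lock, sleep while the queue is empty, and wake the sleeper with the same lock held) is the standard way to avoid a lost wakeup. The sketch below shows it with POSIX threads in place of the kernel's mutex(9) and msleep(9); every `sk_` name is hypothetical and the kernel code itself is not reproduced.

```c
#include <pthread.h>
#include <stddef.h>

struct sk_task {
	struct sk_task *next;
	int id;
};

struct sk_taskqueue {
	pthread_mutex_t lock;		/* protects head and tail */
	pthread_cond_t nonempty;	/* signaled when a task is queued */
	struct sk_task *head, *tail;
};

void
sk_tq_init(struct sk_taskqueue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	pthread_cond_init(&q->nonempty, NULL);
	q->head = q->tail = NULL;
}

/* Producer: queue the task and wake the worker with the lock held. */
void
sk_tq_enqueue(struct sk_taskqueue *q, struct sk_task *t)
{
	pthread_mutex_lock(&q->lock);
	t->next = NULL;
	if (q->tail != NULL)
		q->tail->next = t;
	else
		q->head = t;
	q->tail = t;
	pthread_cond_signal(&q->nonempty);
	pthread_mutex_unlock(&q->lock);
}

/*
 * Worker: the emptiness test and the sleep happen under the same lock,
 * so an enqueue cannot slip in between the check and the wait.
 */
struct sk_task *
sk_tq_dequeue(struct sk_taskqueue *q)
{
	struct sk_task *t;

	pthread_mutex_lock(&q->lock);
	while (q->head == NULL)
		pthread_cond_wait(&q->nonempty, &q->lock);
	t = q->head;
	q->head = t->next;
	if (q->head == NULL)
		q->tail = NULL;
	pthread_mutex_unlock(&q->lock);
	return (t);
}
```

The `while` loop around the wait re-checks the condition after every wakeup, which also covers spurious wakeups.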
* Rework jail_attach(2) so that an already jailed process cannot hop [nectar, 2004-02-19, 1 file, -12/+12]
| | | | | | to another jail. Submitted by: rwatson
* Added sysctl security.jail.jailed. [pjd, 2004-02-19, 1 file, -0/+13]
| | | | | | | | | | | | It returns 1 if the process is inside a jail and 0 if it is not. Whether we are in a jail or not is not a secret; there are plenty of ways to discover it. Many people use their own hacks to check this, and this will be the legitimate way from now on. It would be great if our startup scripts took advantage of this sysctl to allow a clean "boot" inside a jail. Approved by: rwatson, scottl (mentor)
* Simplify check. We are only able to check exclusive lock and if [pjd, 2004-02-19, 1 file, -1/+5]
| | | | | | 2nd condition is true, first one is true for sure. Approved by: jhb, scottl (mentor)
* When reparenting a process in the PT_DETACH code, only set p_sigparent [truckman, 2004-02-19, 1 file, -1/+2]
| | | | | | to SIGCHLD if the new parent process is initproc. MFC after: 2 weeks
* A Linux thread created using clone() should not send SIGCHLD to its [truckman, 2004-02-19, 1 file, -3/+3]
| | | | | | | parent if no signal is specified in the clone() flags argument. PR: 42457 MFC after: 2 weeks