path: root/sys/kern
...
* Invoke label initialization, creation, cleanup, and tear-down MAC
  Framework entry points for System V IPC message queues.
  Submitted by: Dandekar Hrishikesh <rishi_dandekar at sbcglobal dot net>
  Obtained from: TrustedBSD Project
  Sponsored by: DARPA, SPAWAR, McAfee Research
  (rwatson, 2005-01-22; 1 file, -0/+162)
* Bring in MemGuard, a very simple and small replacement allocator
  designed to help detect tamper-after-free scenarios, a problem more
  and more common and likely with multithreaded kernels where race
  conditions are more prevalent.

  Currently MemGuard can only take over malloc()/realloc()/free() for
  a particular malloc type (or types), and the code brought in with
  this change manually instruments it to take over M_SUBPROC
  allocations as an example. If you are planning to use it, for now
  you must:

  1) Put "options DEBUG_MEMGUARD" in your kernel config.
  2) Edit src/sys/kern/kern_malloc.c manually, look for "XXX CHANGEME"
     and replace the M_SUBPROC comparison with the appropriate malloc
     type (this might require additional but small/simple code
     modification if, say, the malloc type is declared out of scope).
  3) Build and install your kernel. Tune the vm.memguard_divisor
     boot-time tunable, which is used to scale how much of kmem_map
     you want to allot for MemGuard's use. The default is 10, so
     kmem_size/10.

  ToDo:
  1) Bring in a memguard(9) man page.
  2) Better instrumentation (e.g., boot-time) of MemGuard taking over
     malloc types.
  3) Teach UMA about MemGuard to allow MemGuard to override zone
     allocations too.
  4) Improve MemGuard if necessary.

  This work is partly based on some old patches from Ian Dowse.
  (bmilekic, 2005-01-21; 1 file, -0/+53)
* Make "c->c_func = NULL" conditional on CALLOUT_LOCAL_ALLOC in bothcperciva2005-01-191-1/+1
| | | | | | | places where it occurs, not just one. :-) Pointed out by: glebius Pointy had to: cperciva
* Make "c->c_func = NULL" conditional on the CALLOUT_LOCAL_ALLOC flag,cperciva2005-01-191-1/+1
| | | | | | | i.e., only clear c->c_func if the callout c is being used via the old timeout(9) interface. Requested by: glebius
* Clarify the description of the callout_active() macro: It is cleared
  by callout_stop(), callout_drain(), and callout_deactivate(), but is
  not automatically cleared when a callout returns.
  (cperciva, 2005-01-19; 1 file, -1/+3)
* Move kern_nanosleep() to sys/syscallsubr.h.
  Requested by: jhb
  (ps, 2005-01-19; 1 file, -0/+1)
* Add a 32-bit syscall wrapper for modstat.
  Obtained from: Yahoo!
  (ps, 2005-01-19; 1 file, -0/+81)
* - Rename nanosleep1() to kern_nanosleep().
  - Add a 32-bit syscall entry for nanosleep.
  Reviewed by: peter
  Obtained from: Yahoo!
  (ps, 2005-01-19; 1 file, -5/+3)
* Introduce bus_free_resource(). It is a convenience function which
  wraps bus_release_resource() by grabbing the rid from the resource.
  (imp, 2005-01-19; 1 file, -0/+8)
* Revert my previous errno hack. That is certainly an issue, and
  always has been, but the system call itself returns errno in a
  register, so the problem is really a function of libc, not the
  system call.
  Discussed with: Matthew Dillon <dillon@apollo.backplane.com>
  (davidxu, 2005-01-18; 1 file, -2/+1)
* Detect sign-extension bugs in the ioctl(2) command argument:
  truncate it to 32 bits and print a warning.
  (phk, 2005-01-18; 1 file, -0/+6)
* Rearrange the kninit() calls for both directions of a pipe so that
  they both happen before pipe backing allocation occurs. Previously,
  a pipe memory shortage would cause a panic due to a KNOTE call on an
  uninitialized si_note.
  Reported by: Peter Holm
  MFC after: 1 week
  (silby, 2005-01-17; 1 file, -1/+3)
* Fix a bug I introduced in 1.561 which has caused considerable
  filesystem unhappiness lately. As far as I can tell, no files that
  have made it safely to disk have been endangered, but stuff in
  transit has been in peril.
  Pointy hat: phk
  (phk, 2005-01-16; 1 file, -5/+5)
* Make the umtx timeout relative so userland can select a different
  clock type, e.g., CLOCK_REALTIME or CLOCK_MONOTONIC. Merge
  umtx_wait and umtx_timedwait into a single function.
  (davidxu, 2005-01-14; 1 file, -46/+51)
* Eliminate the unused and unnecessary "cred" argument from
  vinvalbuf().
  (phk, 2005-01-14; 2 files, -6/+5)
* Ditch vfs_object_create() and make the callers call
  VOP_CREATEVOBJECT() directly.
  (phk, 2005-01-13; 6 files, -25/+9)
* Change the generated VOP_ macro implementations to improve type
  checking and KASSERT coverage. After this change there is only one
  "nasty" cast in this code, but there is a KASSERT to protect against
  the wrong argument structure behind that cast.

  Un-inlining the meat of VOP_FOO() saves 35kB of text segment on a
  typical kernel with no change in performance. We also now run the
  checking and tracing on VOPs which have been layered by nullfs,
  umapfs, deadfs, or unionfs.

  Add new (non-inline) VOP_FOO_AP() functions which take a "struct
  foo_args" argument and do everything the VOP_FOO() macros used to do
  with checks and debugging code. Add a KASSERT to VOP_FOO_AP() to
  check that the argument type is correct.

  Slim down the VOP_FOO() inline functions to just stuff arguments
  into the struct foo_args and call VOP_FOO_AP().

  Put a function pointer to VOP_FOO_AP() into the vop_foo_desc
  structure and make VCALL() use it instead of the current offsetof()
  hack. Retire vcall(), which implemented the offsetof() hack.

  Make deadfs and unionfs use VOP_FOO_AP() calls instead of VCALL();
  we already know which specific call we want. Remove unneeded
  arguments to VCALL() in the nullfs and umapfs bypass functions.
  Remove the unused vdesc_offset and VOFFSET(). Generally improve the
  style/readability of the generated code.
  (phk, 2005-01-13; 1 file, -26/+0)
* When re-connecting an already connected datagram socket, make sure
  to clean up its pending error state, which may be set in some rare
  conditions, resulting in the connect() syscall returning that bogus
  error and making the application believe that the attempt to change
  the association has failed, while in fact it has not. There is a
  sockets/reconnect regression test which exercises this bug.
  MFC after: 2 weeks
  (sobomax, 2005-01-12; 1 file, -2/+11)
* Comment out a debugging printf which doesn't compile on amd64.
  (phk, 2005-01-12; 1 file, -0/+2)
* Let _umtx_op directly return the error code rather than going
  through errno, because errno can potentially be tampered with by a
  nested signal handler. Now all error codes are returned as negative
  values; positive values are reserved for future expansion.
  (davidxu, 2005-01-12; 1 file, -12/+23)
* Add BO_SYNC() and add a default which uses the secret vnode pointer
  and VOP_FSYNC() for now.
  (phk, 2005-01-11; 2 files, -1/+9)
* More vnode -> bufobj migration.
  (phk, 2005-01-11; 1 file, -12/+13)
* Give flushbuflist() a struct bufv as first argument and avoid
  home-rolling TAILQ_FOREACH_SAFE(). Lose the error pointer argument
  and return any errors the normal way. Return EAGAIN for the case
  where more work needs to be done.
  (phk, 2005-01-11; 1 file, -36/+21)
* Remove the unused credential argument from VOP_FSYNC() and
  VFS_SYNC().

  I'm not sure why a credential was added to these in the first place;
  it is not used anywhere and it doesn't make much sense:

  The credentials for syncing a file (ability to write to the file)
  should be checked at the system call level.

  Credentials for syncing one or more filesystems ("none") should be
  checked at the system call level as well.

  If the filesystem implementation needs a particular credential to
  carry out the syncing, it would logically have to be the cached
  mount credential, or a credential cached along with any delayed
  write data.
  Discussed with: rwatson
  (phk, 2005-01-11; 8 files, -18/+13)
* Break out of the loop earlier if it did not time out.
  (davidxu, 2005-01-08; 1 file, -1/+1)
* In acct_process(), do a lockless read of acctvp to see if it's NULL
  before deciding to do more expensive locking to account for process
  exit. This acceptable minor race avoids two mutex operations in the
  highly common case of accounting not being enabled.
  MFC after: 2 weeks
  (rwatson, 2005-01-08; 1 file, -1/+12)
* In kern_wait(), let the compiler copy the rusage structure rather
  than using an explicit bcopy() -- it probably does a better job.
  (rwatson, 2005-01-08; 1 file, -1/+1)
* Adjust two of my comments to the new world order: Indent protection
  in the first column is performed using /**, not /*-.
  (cperciva, 2005-01-07; 1 file, -2/+2)
* /* -> /*- for license, minor formatting changes.
  (imp, 2005-01-07; 2 files, -2/+2)
* /* -> /*- for copyright notices, minor format tweaks as necessary.
  (imp, 2005-01-06; 89 files, -93/+97)
* Expand COPYRIGHT inline, per Matthew Dillon's earlier approval.
  (imp, 2005-01-06; 1 file, -4/+24)
* Return ETIMEDOUT when a thread times out, since the POSIX thread
  APIs expect ETIMEDOUT, not EAGAIN; this simplifies userland code a
  bit.
  (davidxu, 2005-01-06; 1 file, -5/+7)
* - Move the function prototypes for kern_setrlimit() and kern_wait()
    to sys/syscallsubr.h where all the other kern_foo() prototypes
    live.
  - Re-sort kern_execve() while I'm there.
  (jhb, 2005-01-05; 1 file, -0/+1)
* Rework the optimization for spinlocks on UP to be slightly less
  drastic and turn it back on. Specifically, the actual changes are
  now less intrusive in that the _get_spin_lock() and _rel_spin_lock()
  macros now have their contents changed for UP vs SMP kernels, which
  centralizes the changes. Also, UP kernels do not use
  _mtx_lock_spin() and no longer include it. The UP versions of the
  spin lock functions do not use any atomic operations, but simple
  compares and stores, which allow mtx_owned() to still work for spin
  locks while removing the overhead of atomic operations.
  Tested on: i386, alpha
  (jhb, 2005-01-05; 1 file, -8/+2)
* Since we do not support forceful unmount of DEVFS, we can do away
  with the partially implemented vnode-readoption code in vgonechrl().
  (phk, 2005-01-04; 1 file, -45/+3)
* Regen.
  (marcel, 2005-01-03; 2 files, -3/+3)
* uuidgen(2) is MP safe.
  (marcel, 2005-01-03; 1 file, -1/+1)
* Implement device_quiesce. This method means "you are about to be
  unloaded, clean up, or return EBUSY if that's inconvenient." The
  default module handler for newbus will now call this when we get a
  MOD_QUIESCE event, but in the future may call this at other times.
  This shouldn't change any actual behavior until drivers start to use
  it.
  (imp, 2004-12-31; 2 files, -0/+132)
* Be consistent and always use the form 'return (value);' instead of
  'return value;'. Before this change we had 84 lines where it was
  style(9)-clean and 15 lines where it was not.
  (pjd, 2004-12-31; 1 file, -15/+15)
* Fix a typo and two whitespace nits.
  (jhb, 2004-12-30; 1 file, -3/+3)
* Rework the interface between priority propagation (lending) and the
  schedulers a bit to ensure more correct handling of priorities and
  fewer priority inversions:

  - Add two functions to the sched(9) API to handle priority lending:
    sched_lend_prio() and sched_unlend_prio(). The turnstile code uses
    these functions to ask the scheduler to lend a thread a set
    priority and to tell the scheduler when it thinks it is ok for a
    thread to stop borrowing priority. The unlend case is slightly
    complex in that the turnstile code tells the scheduler what the
    minimum priority of the thread needs to be to satisfy the
    requirements of any other threads blocked on locks owned by the
    thread in question. The scheduler then decides whether the thread
    can go back to normal mode (if its normal priority is high enough
    to satisfy the pending lock requests) or if it should continue to
    use the priority specified to the sched_unlend_prio() call. This
    involves adding a new per-thread flag TDF_BORROWING that replaces
    the ULE-only kse flag for priority elevation.
  - Schedulers now refuse to lower the priority of a thread that is
    currently borrowing another thread's priority.
  - If a scheduler changes the priority of a thread that is currently
    sitting on a turnstile, it will call a new function
    turnstile_adjust() to inform the turnstile code of the change.
    This function re-sorts the thread on the priority list of the
    turnstile if needed, and if the thread ends up at the head of the
    list (due to having the highest priority) and its priority was
    raised, then it will propagate that new priority to the owner of
    the lock it is blocked on.

  Some additional fixes specific to the 4BSD scheduler include:

  - Common code for updating the priority of a thread when the user
    priority of its associated kse group has changed has been
    consolidated in a new static function resetpriority_thread(). One
    change to this function is that it will now only adjust the
    priority of a thread if it already has a time sharing priority,
    thus preserving any boosts from a tsleep() until the thread
    returns to userland. Also, resetpriority() no longer calls
    maybe_resched() on each thread in the group. Instead, the code
    calling resetpriority() is responsible for calling
    resetpriority_thread() on any threads that need to be updated.
  - schedcpu() now uses resetpriority_thread() instead of just calling
    sched_prio() directly after it updates a kse group's user
    priority.
  - sched_clock() now uses resetpriority_thread() rather than writing
    directly to td_priority.
  - sched_nice() now updates all the priorities of the threads after
    the group priority has been adjusted.
  Discussed with: bde
  Reviewed by: ups, jeffr
  Tested on: 4bsd, ule
  Tested on: i386, alpha, sparc64
  (jhb, 2004-12-30; 3 files, -105/+299)
* Whitespace fix.
  (jhb, 2004-12-30; 1 file, -0/+1)
* Stop explicitly touching td_base_pri outside of the scheduler and
  simply set a thread's priority via sched_prio() when that is the
  desired action. The schedulers will start managing td_base_pri
  internally shortly.
  (jhb, 2004-12-30; 3 files, -7/+3)
* Call tty_close() at the very end of ttyclose(), since otherwise NULL
  dereferences can occur because tty_close() may end up freeing the
  tty structure if it drops the last reference to it.
  Glanced at by: phk
  (jhb, 2004-12-30; 1 file, -1/+1)
* Make the sysctls kern.ipc.msgmnb and kern.ipc.msgtql into tunables,
  as is the case for most other sysctls in the System V IPC message
  queue implementation.
  PR: 75541
  Submitted by: Sergiy Vyshnevetskiy <serg at vostok dot net>
  MFC after: 2 weeks
  (rwatson, 2004-12-30; 1 file, -2/+4)
* Make umtx_wait and umtx_wake more like the Linux futex interface;
  this is more general than the previous design and also lets me
  implement cancelation points in the thread library. In theory,
  umtx_lock and umtx_unlock can be implemented using umtx_wait and
  umtx_wake, with all atomic operations done in userland without the
  kernel's casuptr() function.
  (davidxu, 2004-12-30; 1 file, -41/+9)
* Eliminate (now) unnecessary acquisition and release of the global
  page queues lock.
  (alc, 2004-12-29; 1 file, -4/+0)
* - Up the WITNESS_COUNT macro from 200 to 1024 to support the growing
    number of lock types in the kernel. This results in an increase of
    witness data usage from ~145k to ~280k on i386 for kernels with
    'options WITNESS'.
  - Remove the unused witness malloc bucket.
  Submitted by: Michal Mertl <mime at traveller dot cz> (1)
  (jhb, 2004-12-28; 1 file, -2/+1)
* Attempt to slightly refine the printout from "show alllocks": list
  the process and thread numbers/names on the same line rather than on
  separate lines, and print the thread pointer, not just the tid.
  (rwatson, 2004-12-27; 1 file, -2/+2)
* Do not vput(9) an unlocked vnode, and do not VREF it with the sole
  purpose of vputting it back immediately.
  Complained about by: DEBUG_VFS_LOCKS
  (kan, 2004-12-27; 1 file, -2/+0)