path: root/sys/kern
Commit message | Author | Age | Files | Lines
* Further system call comment cleanup:
|   - Remove also "MP SAFE" after prior "MPSAFE" pass. (suggested by bde)
|   - Remove extra blank lines in some cases.
|   - Add extra blank lines in some cases.
|   - Remove no-op comments consisting solely of the function name, the
|     word "syscall", or the system call name.
|   - Add punctuation.
|   - Re-wrap some comments.
| Author: rwatson; Age: 2007-03-05; Files: 30; Lines: -186/+86
* Change these descriptions of memory types used in malloc(9), as their
| current, rather long strings make output from vmstat -m look unpleasant.
| Approved by: cognet (mentor)
| Author: wkoszek; Age: 2007-03-05; Files: 1; Lines: -1/+1
* Use msleep(9) instead of tsleep(9) surrounded by lock acquisition and
| release.
| Approved by: cognet (mentor)
| Author: wkoszek; Age: 2007-03-04; Files: 1; Lines: -6/+2
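The msleep(9) change above replaces a hand-written unlock/sleep/relock sequence with a single call that drops the lock and goes to sleep atomically. Kernel code cannot run standalone, so the sketch below is only a userland analogy built on POSIX threads: pthread_cond_wait() plays the role of msleep(9). The names struct event, event_init(), event_signal(), and event_wait() are hypothetical and do not appear in the commit.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userland stand-in for a kernel object with its own lock. */
struct event {
	pthread_mutex_t lock;
	pthread_cond_t  cv;
	bool            ready;
};

static void
event_init(struct event *ev)
{
	pthread_mutex_init(&ev->lock, NULL);
	pthread_cond_init(&ev->cv, NULL);
	ev->ready = false;
}

/* Producer side: set the condition under the lock, then wake waiters. */
static void
event_signal(struct event *ev)
{
	pthread_mutex_lock(&ev->lock);
	ev->ready = true;
	pthread_cond_broadcast(&ev->cv);
	pthread_mutex_unlock(&ev->lock);
}

/*
 * Consumer side: pthread_cond_wait() releases the mutex and sleeps in
 * one atomic step, then reacquires the mutex before returning. A manual
 * "unlock, sleep, lock" sequence would leave a window in which a wakeup
 * could be missed.
 */
static void
event_wait(struct event *ev)
{
	pthread_mutex_lock(&ev->lock);
	while (!ev->ready)
		pthread_cond_wait(&ev->cv, &ev->lock);
	pthread_mutex_unlock(&ev->lock);
}
```

The loop around the wait re-checks the condition after every wakeup, which is the usual guard against spurious wakeups.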
* Remove 'MPSAFE' annotations from the comments above most system calls: all
| system calls now enter without Giant held, and then in some cases, acquire
| Giant explicitly.
| Remove a number of other MPSAFE annotations in the credential code and
| tweak one or two other adjacent comments.
| Author: rwatson; Age: 2007-03-04; Files: 30; Lines: -644/+38
* Move to ANSI C function headers. Re-wrap some comments.
| Author: rwatson; Age: 2007-03-04; Files: 1; Lines: -45/+25
* - Don't do the interrupt storm protection stuff for software interrupt
|   handlers.
| - Use pause() when throttling during an interrupt storm.
| Reported by: kris (1)
| Author: jhb; Age: 2007-03-02; Files: 1; Lines: -2/+3
* lock stats updates need to be protected by the lock
| Author: kmacy; Age: 2007-03-02; Files: 2; Lines: -44/+8
* Rename PRIV_VFS_CLEARSUGID to PRIV_VFS_RETAINSUGID, which seems to better
| describe the privilege.
| OK'ed by: rwatson
| Author: pjd; Age: 2007-03-01; Files: 1; Lines: -1/+1
* Do not dispatch SIGPIPE from the generic write path for a socket; with
| this patch the code behaves according to the comment on the line above.
| Without this patch, a socket could cause SIGPIPE to be delivered to its
| process, once with SO_NOSIGPIPE set, and twice without.
| With this patch, the kernel now passes the sigpipe regression test.
| Tested by: Anton Yuzhaninov
| MFC after: 1 week
| Author: bms; Age: 2007-03-01; Files: 1; Lines: -1/+1
* Evidently I've overestimated gcc's ability to peek inside inline functions
| and optimize away unused stack values. The 48 bytes that the
| lock_profile_object adds to the stack evidently have a measurable
| performance impact on certain workloads.
| Author: kmacy; Age: 2007-03-01; Files: 2; Lines: -6/+16
* Remove two simultaneous acquisitions of multiple unpcb locks from
| uipc_send in cases where only a global read lock is held by breaking them
| out and avoiding the unpcb lock acquire in the common case. This avoids
| deadlocks which manifested with X11, and should also marginally further
| improve performance.
| Reported by: sepotvin, brooks
| Author: rwatson; Age: 2007-03-01; Files: 1; Lines: -22/+19
* Lock unp2 after checking for a non-NULL unp2 pointer in uipc_send() on
| datagram UNIX domain sockets, not before.
| Author: rwatson; Age: 2007-02-28; Files: 1; Lines: -1/+1
* Print tid's rather than thread pointers in KTR_PROC traces.
| Author: jhb; Age: 2007-02-27; Files: 1; Lines: -8/+8
* Use pause() rather than tsleep() on stack variables and function pointers.
| Author: jhb; Age: 2007-02-27; Files: 1; Lines: -2/+1
* Use pause() rather than tsleep() on explicit global dummy variables.
| Author: jhb; Age: 2007-02-27; Files: 1; Lines: -3/+1
* Do not execute filter only handlers in ithread_execute_handlers():
| this fixes the panics when filter only and ithread only handlers were
| sharing the same irq.
| Author: piso; Age: 2007-02-27; Files: 1; Lines: -0/+4
* Further improvements to LOCK_PROFILING:
|   - Fix missing initialization in kern_rwlock.c causing bogus times to be
|     collected
|   - Move updates to the lock hash to after the lock is released for spin
|     mutexes, sleep mutexes, and sx locks
|   - Add new kernel build option LOCK_PROFILE_FAST - only update lock
|     profiling statistics when an acquisition is contended. This reduces
|     the overhead of LOCK_PROFILING to a 20%-25% increase in system time,
|     which on "make -j8 kernel-toolchain" on a dual woodcrest is
|     unmeasurable in terms of wall-clock time. Contrast this to enabling
|     lock profiling without LOCK_PROFILE_FAST, where I see a 5x-6x slowdown
|     in wall-clock time.
| Author: kmacy; Age: 2007-02-27; Files: 3; Lines: -10/+36
* Revise locking strategy used for UNIX domain sockets in order to improve
| concurrency:
|   - Add per-unpcb mutexes protecting unpcb connection state, fields, etc.
|   - Replace global UNP mutex with a global UNP rwlock, which will protect
|     the UNIX domain socket connection topology, v_socket, and be acquired
|     exclusively before acquiring more than one per-unpcb lock at a time in
|     order to avoid lock order issues.
| In performance measurements involving MySQL, this change has little or no
| overhead on UP (+/- 1%), but leads to a significant (5%-30%) improvement in
| multi-processor measurements using the sysbench and supersmack benchmarks.
| Much testing by: kris
| Approved by: re (kensmith)
| Author: rwatson; Age: 2007-02-26; Files: 1; Lines: -223/+469
* Use NULL rather than 0 for various pointer constants.
| Author: jhb; Age: 2007-02-26; Files: 1; Lines: -26/+26
* Add rw_wowned() interface to rwlock(9), allowing a kernel thread to
| determine if it holds an exclusive rwlock reference or not. This is
| non-ideal, but recursion scenarios in the network stack currently require
| it.
| Approved by: jhb
| Author: rwatson; Age: 2007-02-26; Files: 1; Lines: -0/+7
* Mark the kernel linker file as linked so that it is visible to the various
| kld*() syscalls.
| Tested by: piso
| Author: jhb; Age: 2007-02-26; Files: 1; Lines: -0/+1
* Fix a comment.
| Author: jhb; Age: 2007-02-26; Files: 1; Lines: -2/+2
* Don't block on the socket zone limit during the socket()
| call which can easily lock up a system otherwise; instead,
| return ENOBUFS as documented in a manpage, thus reverting
| us to the FreeBSD 4.x behavior.
| Reviewed by: rwatson
| MFC after: 2 weeks
| Author: ru; Age: 2007-02-26; Files: 1; Lines: -5/+5
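The behavior described above, failing fast with ENOBUFS instead of sleeping until the zone has room, can be illustrated with a small userland sketch. This is not the kernel's zone allocator; struct zone, zone_alloc_nowait(), and zone_free() are hypothetical names chosen for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bounded pool: at most `limit` live allocations. */
struct zone {
	size_t limit;
	size_t in_use;
};

/*
 * Allocate without waiting. When the zone is at its limit (or malloc
 * fails) report ENOBUFS to the caller instead of blocking until another
 * allocation is released.
 */
static int
zone_alloc_nowait(struct zone *z, size_t size, void **out)
{
	void *p;

	if (z->in_use >= z->limit)
		return (ENOBUFS);
	p = malloc(size);
	if (p == NULL)
		return (ENOBUFS);
	z->in_use++;
	*out = p;
	return (0);
}

/* Return an allocation to the zone and free its memory. */
static void
zone_free(struct zone *z, void *p)
{
	free(p);
	z->in_use--;
}
```

Returning an error leaves the retry policy to the caller, which is what keeps a saturated limit from stalling every thread that asks for a new object.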
* general LOCK_PROFILING cleanup
|   - only collect timestamps when a lock is contested - this reduces the
|     overhead of collecting profiles from 20x to 5x
|   - remove unused function from subr_lock.c
|   - generalize cnt_hold and cnt_lock statistics to be kept for all locks
|   - NOTE: rwlock profiling generates invalid statistics (and most likely
|     always has) someone familiar with that should review
| Author: kmacy; Age: 2007-02-26; Files: 6; Lines: -111/+40
* Close race conditions between fork() and [sg]etpriority()'s
| PRIO_USER case, possibly also other places that dereference
| p_ucred.
|
| In the past, we insert a new process into the allproc list right
| after PID allocation, and release the allproc_lock sx. Because
| most content in new proc's structure is not yet initialized,
| this could lead to undefined result if we do not handle PRS_NEW
| with care.
|
| The problem with PRS_NEW state is that it does not provide fine
| grained information about how much initialization is done for a
| new process. By definition, after PRIO_USER setpriority(), all
| processes that belong to given user should have their nice value
| set to the specified value. Therefore, if p_{start,end}copy
| section was done for a PRS_NEW process, we can not safely ignore
| it because p_nice is in this area. On the other hand, we should
| be careful on PRS_NEW processes because we do not allow non-root
| users to lower their nice values, and without a successful copy
| of the copy section, we can get stale values that are inherited
| from the uninitialized area of the process structure.
|
| This commit tries to close the race condition by grabbing proc
| mutex *before* we release allproc_lock xlock, and do copy as
| well as zero immediately after the allproc_lock xunlock. This
| guarantees that the new process would have its p_copy and p_zero
| sections, as well as user credential information initialized. In
| getpriority() case, instead of grabbing PROC_LOCK for a PRS_NEW
| process, we just skip the process in question, because it does
| not affect the final result of the call, as the p_nice value
| would be copied from its parent, and we will see it during
| allproc traverse.
|
| Other potential solutions are still under evaluation.
|
| Discussed with: davidxu, jhb, rwatson
| PR: kern/108071
| MFC after: 2 weeks
| Author: delphij; Age: 2007-02-26; Files: 2; Lines: -5/+17
* Fix a case in rman_manage_region() where the resource list would get missorted.
| This would in turn confuse rman_reserve_resource(). This was only seen for
| MSI resources that can get allocated and deallocated after boot.
| Author: scottl; Age: 2007-02-23; Files: 1; Lines: -6/+7
* Drop the global kernel linker lock while executing the sysinit's for a
| freshly-loaded kernel module. To avoid various unload races, hide linker
| files whose sysinit's are being run from userland so that they can't be
| kldunloaded until after all the sysinit's have finished.
| Tested by: gallatin
| Author: jhb; Age: 2007-02-23; Files: 1; Lines: -15/+21
* Add a new kernel sleep function pause(9). pause(9) is for places that
| want an equivalent of DELAY(9) that sleeps instead of spins. It accepts
| a wmesg and a timeout and is not interrupted by signals. It uses a private
| wait channel that should never be woken up by wakeup(9) or wakeup_one(9).
| Glanced at by: phk
| Author: jhb; Age: 2007-02-23; Files: 1; Lines: -1/+21
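The two properties described above, sleeping rather than spinning and not returning early because of a signal, can be shown with a rough userland analogy. This sketch is not the kernel implementation of pause(9): it uses POSIX nanosleep() and restarts the sleep when a signal interrupts it. The function name sleep_ms_uninterrupted() is hypothetical.

```c
#include <errno.h>
#include <time.h>

/*
 * Sleep for `ms` milliseconds without busy-waiting. If a signal cuts the
 * sleep short, nanosleep() fails with EINTR and stores the unslept time
 * in `rem`; the loop then resumes sleeping for that remainder.
 * Returns 0 on success, -1 on any other error (for example EINVAL).
 */
static int
sleep_ms_uninterrupted(long ms)
{
	struct timespec req, rem;

	req.tv_sec = ms / 1000;
	req.tv_nsec = (ms % 1000) * 1000000L;
	while (nanosleep(&req, &rem) == -1) {
		if (errno != EINTR)
			return (-1);
		req = rem;
	}
	return (0);
}
```

Unlike a DELAY-style spin loop, the calling thread gives up the CPU for the whole interval.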
* o break newbus api: add a new argument of type driver_filter_t to
|   bus_setup_intr()
| o add an int return code to all fast handlers
| o retire INTR_FAST/IH_FAST
| For more info: http://docs.freebsd.org/cgi/getmsg.cgi?fetch=465712+0+current/freebsd-current
| Reviewed by: many
| Approved by: re@
| Author: piso; Age: 2007-02-23; Files: 3; Lines: -24/+23
* Use LIST_EMPTY() instead of unrolled version (LIST_FIRST() [!=]= NULL)
| Author: delphij; Age: 2007-02-22; Files: 1; Lines: -5/+5
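For readers unfamiliar with the queue(3) macros, the two forms named in the commit above are equivalent. The sketch below uses the userland <sys/queue.h> header; struct item and the two helper functions are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical list element; `link` holds the list pointers. */
struct item {
	int value;
	LIST_ENTRY(item) link;
};

LIST_HEAD(item_list, item);

/* Unrolled form: compare the first element against NULL by hand. */
static int
list_is_empty_unrolled(struct item_list *head)
{
	return (LIST_FIRST(head) == NULL);
}

/* Preferred form: LIST_EMPTY() states the intent directly. */
static int
list_is_empty(struct item_list *head)
{
	return (LIST_EMPTY(head));
}
```

Both helpers return the same result for any list; the macro form simply reads better and hides the head's layout.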
* Add an additional MAC check to the UNIX domain socket connect path:
| check that the subject has read/write access to the vnode using the
| vnode MAC check.
| MFC after: 3 weeks
| Submitted by: Spencer Minear <spencer_minear at securecomputing dot com>
| Obtained from: TrustedBSD Project
| Author: rwatson; Age: 2007-02-22; Files: 1; Lines: -0/+5
* Remove unnecessary privilege and privilege check for WITNESS sysctl.
| Head nod: jhb
| Author: rwatson; Age: 2007-02-20; Files: 1; Lines: -6/+0
* Break introductory comment into two paragraphs to separate material on the
| garbage collection complications from general discussion of UNIX domain
| sockets.
| Staticize unp_addsockcred().
| Remove XXX comment regarding Giant and v_socket -- v_socket is protected
| by the global UNIX domain socket lock.
| Author: rwatson; Age: 2007-02-20; Files: 1; Lines: -12/+9
* Remove unused PRIV_IPC_EXEC. Renumbers System V IPC privilege.
| Author: rwatson; Age: 2007-02-20; Files: 1; Lines: -1/+0
* Sync up PRIV_IPC_{ADMIN,READ,WRITE} priv checks in ipcperm() with
| kern_jail.c: allow jailed root these privileges. This only has an effect
| if System V IPC is administratively enabled for the jail.
| Author: rwatson; Age: 2007-02-20; Files: 1; Lines: -3/+6
* Restore sysv_ipc.c:1.30, which was backed out due to interactions with
| System V shared memory, now believed fixed in sysv_shm.c:1.109:
|
|   date: 2006/11/06 13:42:01; author: rwatson; state: Exp; lines: +65 -37
|   Sweep kernel replacing suser(9) calls with priv(9) calls, assigning
|   specific privilege names to a broad range of privileges. These may
|   require some future tweaking.
|   Sponsored by: nCircle Network Security, Inc.
|   Obtained from: TrustedBSD Project
|   Discussed on: arch@
|   Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
|     Alex Lyashkov <umka at sevcity dot net>,
|     Skip Ford <skip dot ford at verizon dot net>,
|     Antoine Brodin <antoine dot brodin at laposte dot net>
|
| This restores fine-grained privilege support to System V IPC.
| PR: 106078
| Author: rwatson; Age: 2007-02-19; Files: 1; Lines: -38/+66
* Remove call to ipcperm() in shmget_existing(). The flags argument is
| ignored on other systems I investigated when accessing an existing memory
| segment rather than creating a new one. This call to ipcperm() is the only
| one to pass in a complete mode flag to the permission checks rather than
| a simple access request mask, and caused problems for the revised
| ipcperm() based on the priv(9) interface, which can now be restored.
| PR: 106078
| Author: rwatson; Age: 2007-02-19; Files: 1; Lines: -3/+0
* Rename three quota privileges from the UFS privilege namespace to the
| VFS privilege namespace: exceedquota, getquota, and setquota. Leave
| UFS-specific quota configuration privileges in the UFS name space.
| This renumbers VFS and UFS privileges, so requires rebuilding modules
| if you are using security policies aware of privilege identifiers.
| This is likely no one at this point since none of the committed MAC
| policies use the privilege checks.
| Author: rwatson; Age: 2007-02-19; Files: 1; Lines: -2/+2
* Limit quota privileges in jail to PRIV_UFS_GETQUOTA and
| PRIV_UFS_SETQUOTA.
| Author: rwatson; Age: 2007-02-19; Files: 1; Lines: -5/+2
* Do allow privilege to create over-sized messages on System V IPC
| message queues in jail.
| Author: rwatson; Age: 2007-02-19; Files: 1; Lines: -1/+2
* Use priv_check(9) instead of suser(9) for checking the privilege to
| set real-time priority on a thread. It looks like this suser(9) call
| was introduced after my first pass through replacing superuser checks
| with named privilege checks.
| Author: rwatson; Age: 2007-02-19; Files: 1; Lines: -1/+1
* For now, reflect practical reality that Audit system calls aren't
| allowed in Jail: return a privilege error.
| Author: rwatson; Age: 2007-02-19; Files: 1; Lines: -0/+2
* Remove union_dircheckp hook, it is not needed by new unionfs code anymore.
| As a consequence, getdirentries() no longer needs to drop/reacquire the
| directory vnode lock, which would allow it to be reclaimed in between.
| Reported and tested by: Peter Holm
| Approved by: rodrigc (unionfs)
| MFC after: 1 week
| Author: kib; Age: 2007-02-19; Files: 1; Lines: -65/+33
* Remove VFS_VPTOFH entirely. API is already broken and it is a good time to
| do it.
| Suggested by: rwatson
| Author: pjd; Age: 2007-02-16; Files: 2; Lines: -21/+1
* Move vnode-to-file-handle translation from vfs_vptofh to vop_vptofh method.
| This way we may support multiple structures in v_data vnode field within
| one file system without using black magic.
|
| Vnode-to-file-handle should be VOP in the first place, but was made VFS
| operation to keep interface as compatible as possible with SUN's VFS.
| BTW. Now Solaris also implements vnode-to-file-handle as VOP operation.
|
| VFS_VPTOFH() was left for API backward compatibility, but is marked for
| removal before 8.0-RELEASE.
|
| Approved by: mckusick
| Discussed with: many (on IRC)
| Tested with: ufs, msdosfs, cd9660, nullfs and zfs
| Author: pjd; Age: 2007-02-15; Files: 5; Lines: -3/+26
* Cleanup and document the implementation of firmware(9) based on
| a version that I posted earlier on the -current mailing list,
| and subsequent feedback received.
|
| The core of the change is just in sys/firmware.h and
| kern/subr_firmware.c, while other files are just adaptation of the
| clients to the ABI change (const-ification of some parameters and
| hiding of internal info, so this is fully compatible at the binary
| level).
|
| In detail:
|   - reduce the amount of information exported to clients in struct
|     firmware, and constify the pointer;
|   - internally, document and simplify the implementation of the various
|     functions, and make sure error conditions are dealt with properly.
|
| The diffs are large, but the code is really straightforward now (I hope).
|
| Note also that there is a subtle issue with the implementation of
| firmware_register(): currently, as in the previous version, we just
| store a reference to the 'imagename' argument, but we should rather
| copy it because there is no guarantee that this is a static string.
| I realised this while testing this code, but I prefer to fix it in
| a later commit -- there is no regression with respect to the past.
|
| Note, too, that the version in RELENG_6 has various bugs including
| missing locks around the module release calls, mishandling of modules
| loaded by /boot/loader, and so on, so an MFC is absolutely necessary
| there. I was just postponing it until this cleanup to avoid doing
| things twice.
|
| MFC after: 1 week
| Author: luigi; Age: 2007-02-15; Files: 1; Lines: -153/+272
* Catch up file descriptor printing function in DDB to the addition of kqueues
| and POSIX message queues.
| Author: rwatson; Age: 2007-02-15; Files: 1; Lines: -0/+4
* Break file descriptor printing logic out of db_show_files() into
| db_print_file(), and add a new "show file <ptr>" DDB command, which can
| be used to print out file descriptors referenced in stack traces.
| Author: rwatson; Age: 2007-02-15; Files: 1; Lines: -9/+32
* Rename somaxconn_sysctl() to sysctl_somaxconn() so that I will be able to
| claim that sofoo() functions all accept a socket as their first argument.
| Author: rwatson; Age: 2007-02-15; Files: 1; Lines: -3/+3
* If both ISDOTDOT and NOCROSSMOUNT are set then lookup() might break out
| of the special handling for ".." and perform an ISDOTDOT VOP_LOOKUP()
| for a filesystem root vnode. Handle this case inside lookup().
| Submitted by: tegge
| PR: 92785
| MFC after: 1 week
| Author: kib; Age: 2007-02-15; Files: 1; Lines: -3/+4