path: root/sys/kern
Commit message | Author | Age | Files | Lines
* Add the ksyms(4) pseudo driver. The ksyms driver allows a process to | sson | 2009-05-26 | 3 | -0/+78
  get a quick snapshot of the kernel's symbol table, including the symbols from any loaded modules (the symbols are all merged into one symbol table). Unlike other implementations, this ksyms driver maps memory in the process memory space to store the snapshot at the time /dev/ksyms is opened. It also checks whether the process already has a snapshot open, and won't allow it to open /dev/ksyms again until it closes the first one. This prevents kernel and process memory from being exhausted. Note that /dev/ksyms is used by the lockstat(1) command.
  Reviewed by: gallatin kib (freebsd-arch)
  Approved by: gnn (mentor)
* Add the OpenSolaris dtrace lockstat provider. The lockstat provider | sson | 2009-05-26 | 6 | -29/+339
  adds probes for mutexes, reader/writer and shared/exclusive locks to gather contention statistics and other locking information for dtrace scripts, the lockstat(1M) command and other potential consumers.
  Reviewed by: attilio jhb jb
  Approved by: gnn (mentor)
* Get rid of M_TEMP. | ed | 2009-05-26 | 1 | -2/+2
* Add missing socket options. | pjd | 2009-05-26 | 1 | -0/+8
* The advisory lock may be activated or activated and removed during the | kib | 2009-05-24 | 1 | -2/+15
  sleep waiting for conditions when the lock may be granted. To prevent lf_setlock() from accessing possibly freed memory, add reference counting to struct lockf_entry. Bump the refcount around the sleep. Make lf_free_lock() return non-zero when the structure was freed, and use this after the sleep to return EINTR to the caller. The error code might need a clarification, but we cannot return success to usermode, since the lock is not owned anymore.
  Reviewed by: dfr
  Tested by: pho
  MFC after: 1 month
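The reference-counting pattern described in this commit message can be sketched in user-space C. This is a minimal, single-threaded illustration under stated assumptions, not the kernel code: `struct entry`, `entry_alloc()`, `entry_release()` and `wait_for_lock()` are hypothetical names, and the sleep is modelled by a callback that may or may not drop the owner's reference.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct lockf_entry; only the refcount matters. */
struct entry {
	int refs;
};

static struct entry *
entry_alloc(void)
{
	struct entry *e = malloc(sizeof(*e));

	if (e != NULL)
		e->refs = 1;	/* the owner's reference */
	return (e);
}

/* Drop one reference; return non-zero if the structure was freed. */
static int
entry_release(struct entry *e)
{
	assert(e->refs > 0);
	if (--e->refs > 0)
		return (0);
	free(e);
	return (1);
}

/* Models a sleep during which nothing happens to the lock. */
static void
sleep_quiet(struct entry *e)
{
	(void)e;
}

/* Models a sleep during which another thread removes the lock. */
static void
sleep_removed(struct entry *e)
{
	(void)entry_release(e);	/* drops the owner's reference */
}

static int
wait_for_lock(struct entry *e, void (*sleep_fn)(struct entry *))
{
	e->refs++;		/* keep e valid across the sleep */
	sleep_fn(e);
	if (entry_release(e))
		return (EINTR);	/* we held the last reference: lock is gone */
	return (0);
}
```

The extra reference guarantees that the memory is still valid when the sleeper wakes up, and the return value of the release tells it whether the entry disappeared in the meantime.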
* In lf_purgelocks(), assert that state->ls_pending is empty after we | kib | 2009-05-24 | 1 | -1/+3
  weeded out threads, and clean ls_active instead of ls_pending.
  Reviewed by: dfr
  Tested by: pho
  MFC after: 1 month
* In lf_advlockasync(), recheck for doomed vnode after the state->ls_lock | kib | 2009-05-24 | 1 | -2/+17
  is acquired. In lf_purgelocks(), assert that the vnode is doomed and set *statep to NULL before clearing the ls_pending list. Otherwise, we allow the thread executing lf_advlockasync() to put a new pending entry after state->ls_lock is dropped in lf_purgelocks().
  Reviewed by: dfr
  Tested by: pho
  MFC after: 1 month
* Block when initially opening a TTY multiple times. | ed | 2009-05-24 | 1 | -5/+11
  In the original MPSAFE TTY code, I changed the behaviour by returning EBUSY. I thought this made more sense, because it's basically a race to see who gets the TTY first. It turns out this is not a good change, because it also causes EBUSY to be returned when another process is closing the TTY. This can happen during startup, when /etc/rc (or one of its children) is still busy draining its data and /sbin/init is attempting to open the TTY to spawn a getty.
  Reported by: bz
  Tested by: bz
* Replace the while statement with the if for clarity. The loop body | kib | 2009-05-24 | 1 | -1/+1
  cannot be executed more than once.
  Reviewed by: dfr
  Tested by: pho
  MFC after: 1 month
* V_irtualize the if_clone framework, thus allowing for clonable ifnets | zec | 2009-05-23 | 1 | -0/+5
  to optionally have overlapping unit numbers if attached in different vnets. At this stage if_loop is the only clonable ifnet class that has been extended to allow for such overlapping allocation of unit numbers, i.e. in each vnet it is possible to have a lo0 interface. Other clonable ifnet classes continue to operate with traditional semantics, i.e. each instance of a clonable ifnet will be assigned a globally unique unit number, regardless of the vnet in which such an ifnet is instantiated.
  While here, garbage collect unused _lo_list field in struct vnet_net, as well as improve indentation for #defines in sys/net/vnet.h.
  The layout of struct vnet_net has changed, therefore bump __FreeBSD_version.
  This change has no functional impact on nooptions VIMAGE kernel builds.
  Reviewed by: bz, brooks
  Approved by: julian (mentor)
* Delay an error message until the variable it uses gets initialized. | jamie | 2009-05-23 | 1 | -8/+6
  Found with: Coverity Prevent(tm)
  CID: 4316
  Reported by: trasz
  Approved by: bz (mentor)
* Introduce the if_vmove() function, which will be used in the future | zec | 2009-05-22 | 1 | -1/+1
  for reassigning ifnets from one vnet to another.
  if_vmove() works by calling a restricted subset of actions normally executed by if_detach() on an ifnet in the current vnet, and then switches to the target vnet and executes an appropriate subset of if_attach() actions there.
  if_attach() and if_detach() have become wrapper functions around if_attach_internal() and if_detach_internal(), where the latter variants have an additional argument, a flag indicating whether a full attach or detach sequence is to be executed, or only a restricted subset suitable for moving an ifnet from one vnet to another. Hence, if_vmove() will not call if_detach() and if_attach() directly, but will call the if_detach_internal() and if_attach_internal() variants instead, with the vmove flag set.
  While here, staticize ifnet_setbyindex() since it is not referenced from outside of sys/net/if.c.
  Also rename ifccnt field in struct vimage to ifcnt, and do some minor whitespace garbage collection where appropriate.
  This change should have no functional impact on nooptions VIMAGE kernel builds.
  Reviewed by: bz, rwatson, brooks?
  Approved by: julian (mentor)
* Make 'struct acl' larger, as required to support NFSv4 ACLs. Provide | trasz | 2009-05-22 | 2 | -7/+142
  compatibility interfaces in both kernel and libc.
  Reviewed by: rwatson
* Enable secure TTY input buffer flushing by default. | ed | 2009-05-21 | 1 | -1/+1
  I'm leaving the sysctl there. If people really notice a slowdown, they can revert to the old behaviour.
  Discussed with: kib
* Add a new sysctl: kern.tty_inq_flush_secure. | ed | 2009-05-21 | 1 | -14/+8
  When enabled, all TTY input queue buffers are zeroed when flushing or closing the TTY. Because TTY input queues are also used to store typed-in passwords, this may be an interesting switch to enable for security-minded people.
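The effect of such a switch can be illustrated with a small user-space sketch. It is not the TTY code: `struct inq`, `inq_write()` and `inq_flush()` are hypothetical names, and the `secure` parameter merely plays the role of the sysctl.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define INQ_SIZE 32	/* hypothetical queue size */

struct inq {
	char	buf[INQ_SIZE];
	size_t	len;
};

/* Append up to n bytes; returns the number of bytes stored. */
static size_t
inq_write(struct inq *q, const char *data, size_t n)
{
	size_t room = INQ_SIZE - q->len;

	if (n > room)
		n = room;
	memcpy(q->buf + q->len, data, n);
	q->len += n;
	return (n);
}

/* Discard queued input; when secure is set, also zero the storage. */
static void
inq_flush(struct inq *q, bool secure)
{
	assert(q->len <= INQ_SIZE);
	if (secure)
		memset(q->buf, 0, sizeof(q->buf));
	q->len = 0;
}
```

Without the zeroing step, a flush only resets the length and the old bytes stay in memory until they are overwritten. In user-space code a compiler is allowed to drop a memset() of memory that is never read again, so a real implementation would need a zeroing primitive that cannot be optimized away.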
* Only use the ABI compat shim for vfs.bufspace if the old buffer is smaller | jhb | 2009-05-21 | 1 | -1/+1
  than a long.
  PR: amd64/134786
  Submitted by: Emil Mikulic emikulic| gmail
  MFC after: 3 days
* Move the M_WAITOK flag in notify() into an M_NOWAIT one in order to match | attilio | 2009-05-21 | 1 | -1/+3
  the behaviour already present with the further malloc() call in devctl_notify(). This fixes a bug in the CAM layer where the camisr handler ended up calling camperiphfree() (and subsequently destroy_dev(), resulting in a new dev notify) while the xpt lock is held.
  PR: kern/130330
  Tested by: Riccardo Torrini <riccardo dot torrini at esaote dot com>
* Set the umask in a new file descriptor table earlier in fdcopy() to remove | jhb | 2009-05-20 | 1 | -4/+2
  two lock operations.
* Remove an obsolete assertion. We always wake up all waiters when unlocking | jhb | 2009-05-20 | 1 | -2/+0
  a mutex and never set the lock cookie == MTX_CONTESTED.
* Fix a typo. | jhb | 2009-05-20 | 1 | -1/+1
* We no longer need to use d_thread_t for portability here, switch to | imp | 2009-05-20 | 1 | -4/+4
  struct thread *.
* Add minimal ZFS lock hierarchy | kmacy | 2009-05-20 | 1 | -0/+7
* With SMPng, DEVICE_POLLING uses its own idle threads, rather than the | rwatson | 2009-05-19 | 1 | -2/+1
  system idle loop, to run ether_poll(), so make ether_poll() static.
  MFC after: 1 week
* sysctl_rman: report shared resources to devinfo | avg | 2009-05-19 | 1 | -24/+34
  shared uses of a resource are recorded on a sub-list hanging off a main resource object on a main resource list; without this change a shared resource (e.g. irq) is reported only once by devinfo -r/-u; with this change the resource is reported for each driver that allocates it (which is even more than what vmstat -i -a reports).
  Approved by: jhb (mentor)
* Binding interrupts to a CPU consists of two parts: setting up CPU | rwatson | 2009-05-18 | 1 | -1/+13
  affinity for the interrupt thread, and requesting that underlying hardware direct interrupts to the CPU. For software interrupt threads, implement a no-op interrupt event binder that returns success, so that the interrupt management code will just set the ithread's affinity and succeed.
  Reviewed by: jhb
  MFC after: 1 week
* Mark the clock sysctls as MPSAFE. | ed | 2009-05-18 | 1 | -3/+4
  These sysctls don't need any form of locking. At least cp_times is used by powerd very often, which means I get 50% fewer calls to non-MPSAFE sysctls on my system. The other 50% is consumed by dev.cpu.0.freq, but this seems to need Giant for Newbus.
* Several changes to vfs_bio_clrbuf(): | alc | 2009-05-17 | 1 | -13/+11
  Provide a more descriptive comment.
  Eliminate dead code. The page cannot possibly have PG_ZERO set.
  Eliminate unnecessary blank lines.
  Reviewed by: tegge
* Introduce vfs_bio_set_valid() and use it from ffs_realloccg(). This | alc | 2009-05-17 | 1 | -0/+38
  eliminates the misuse of vfs_bio_clrbuf() by ffs_realloccg().
  In collaboration with: tegge
* Print an extra newline when not at the first column already. | ed | 2009-05-17 | 1 | -1/+2
  This makes siginfo output look a lot better when pressing it the first time when in sh(1), for example:

    $ load: 0.00 cmd: sh 1945 [ttyin] 3.94r 0.00u 0.00s 0% 1960k
    load: 0.00 cmd: sh 1945 [ttyin] 4.19r 0.00u 0.00s 0% 1960k

  will now become:

    $
    load: 0.00 cmd: sh 1945 [ttyin] 3.94r 0.00u 0.00s 0% 1960k
    load: 0.00 cmd: sh 1945 [ttyin] 4.19r 0.00u 0.00s 0% 1960k
* Several cleanups to tty_info(), better known as Ctrl-T. | ed | 2009-05-17 | 1 | -25/+25
  - Only pick up PROC_LOCK once, which means we can drop the PGRP_LOCK right after picking up PROC_LOCK for the first time.
  - Print the process real time, making it consistent with tools like time(1).
  - Use `p' and `td' to reference the process/thread we are going to print. Only use pick-variables inside the loops. We already did this for the threads, but not the processes.
* Remove do-nothing code that was required to dirty the old buffer on Alpha. | des | 2009-05-15 | 1 | -12/+1
  Coverity ID: 838
  Approved by: jhb, alc
* Revert r192094. The revision caused problems for sysctl(3) consumers | kib | 2009-05-15 | 2 | -7/+18
  that expect that oldlen is filled with the required buffer length even when the supplied buffer is too short and the returned error is ENOMEM.
  Redo the fix for kern.proc.filedesc, by reverting req->oldidx when the remaining buffer space is too short for the current kinfo_file structure. Also, only ignore ENOMEM. We have to convert ENOMEM to a no-error condition to keep the existing interface for the sysctl, though.
  Reported by: ed, Florian Smeets <flo kasimir com>
  Tested by: pho
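The "revert the index on a short buffer" idea can be modelled in a few lines of user-space C. Everything here is an assumption made for illustration: `struct req`, `struct rec`, `old_copy()` and `emit_records()` are hypothetical names, and `old_copy()` is only a simplified stand-in for the kernel's copy-out helper (it copies what fits, always advances the index by the full length, and reports ENOMEM when the buffer is short).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical request state: user buffer, its size, bytes produced. */
struct req {
	char	*oldptr;
	size_t	 oldlen;
	size_t	 oldidx;
};

/* Hypothetical fixed-size record standing in for a kinfo structure. */
struct rec {
	int fd;
	int type;
};

/* Simplified copy-out model: always advances oldidx by the full length. */
static int
old_copy(struct req *r, const void *p, size_t l)
{
	size_t room = r->oldidx < r->oldlen ? r->oldlen - r->oldidx : 0;
	size_t n = l < room ? l : room;

	if (r->oldptr != NULL && n > 0)
		memcpy(r->oldptr + r->oldidx, p, n);
	r->oldidx += l;
	if (r->oldptr != NULL && n < l)
		return (ENOMEM);
	return (0);
}

/* Emit whole records only; a record that does not fit is not counted. */
static int
emit_records(struct req *r, const struct rec *recs, size_t nrecs)
{
	for (size_t i = 0; i < nrecs; i++) {
		size_t saved = r->oldidx;
		int error = old_copy(r, &recs[i], sizeof(recs[i]));

		if (error == ENOMEM) {
			r->oldidx = saved;	/* no half-filled record */
			return (0);		/* ENOMEM becomes "no error" */
		}
		if (error != 0)
			return (error);
	}
	assert(r->oldptr == NULL || r->oldidx <= r->oldlen);
	return (0);
}
```

With this shape, a caller that passes no buffer still learns the total length required, while a caller with a short buffer sees only complete records in the reported length.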
* - Use a separate sx lock to try to limit the number of concurrent userland | jhb | 2009-05-14 | 1 | -7/+16
    sysctl requests to avoid wiring too much user memory. Only grab this lock if the user's old buffer is larger than a page as a tradeoff to allow more concurrency for common small requests.
  - Just use a shared lock on the sysctl tree for user sysctl requests now.
  MFC after: 1 week
* Do not advance req->oldidx when sysctl_old_user returning an | kib | 2009-05-14 | 1 | -3/+5
  error due to copyout failure or short buffer.
  The latter breaks the usermode iterators of the sysctl results that pack an arbitrary number of variable-sized structures. The iterator expects that the kernel filled exactly oldlen bytes, and tries to interpret a half-filled or garbage structure at the end of the buffer. In particular, kinfo_getfile(3) segfaulted.
  Reported and tested by: pho
  MFC after: 3 weeks
* - Implement a lockless file descriptor lookup algorithm in | jeff | 2009-05-14 | 5 | -77/+113
    fget_unlocked().
  - Save old file descriptor tables created on expansion until the entire descriptor table is freed so that pointers may be followed without regard for expanders.
  - Mark the file zone as NOFREE so we may attempt to reference potentially freed files.
  - Convert several fget_locked() users to fget_unlocked(). This requires us to manage reference counts explicitly but reduces locking overhead in the common case.
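A general way to do a lock-free lookup of this kind is "load the pointer, take a reference only if the count is non-zero, then recheck that the slot still holds the same pointer". The sketch below shows that general technique with C11 atomics; it is not the fget_unlocked() implementation, and `struct file`, `struct fdtable`, `file_tryhold()`, `file_drop()` and `lookup_unlocked()` are hypothetical names. It also assumes, as the commit message does, that the memory behind a stale pointer is never handed back to the allocator.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NSLOTS 8	/* hypothetical table size */

struct file {
	atomic_uint refs;
};

struct fdtable {
	_Atomic(struct file *) slots[NSLOTS];
};

/* Take a reference only if the object is still live (count != 0). */
static int
file_tryhold(struct file *fp)
{
	unsigned old = atomic_load(&fp->refs);

	while (old != 0) {
		if (atomic_compare_exchange_weak(&fp->refs, &old, old + 1))
			return (1);
	}
	return (0);
}

/* Drop a reference; a full version would free the object on the last drop. */
static void
file_drop(struct file *fp)
{
	unsigned old = atomic_fetch_sub(&fp->refs, 1);

	assert(old > 0);
	(void)old;
}

static struct file *
lookup_unlocked(struct fdtable *t, size_t fd)
{
	if (fd >= NSLOTS)
		return (NULL);
	for (;;) {
		struct file *fp = atomic_load(&t->slots[fd]);

		if (fp == NULL)
			return (NULL);
		if (!file_tryhold(fp))
			continue;	/* being torn down; reload the slot */
		if (atomic_load(&t->slots[fd]) == fp)
			return (fp);	/* slot unchanged: reference is valid */
		file_drop(fp);		/* slot was replaced under us; retry */
	}
}
```

The caller owns the returned reference and has to drop it explicitly, which is the extra bookkeeping the last bullet of the commit message refers to. The retry on a zero count assumes that a concurrent closer will clear the slot shortly afterwards.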
* Eliminate page queues locking from bufdone_finish() through the | alc | 2009-05-13 | 1 | -11/+36
  following changes:
  Rename vfs_page_set_valid() to vfs_page_set_validclean() to reflect what this function actually does. Suggested by: tegge
  Introduce a new version of vfs_page_set_valid() that does no more than what the function's name implies. Specifically, it does not update the page's dirty mask, and thus it does not require the page queues lock to be held.
  Update two of the three callers to the old vfs_page_set_valid() to call vfs_page_set_validclean() instead because they actually require the page's dirty mask to be cleared.
  Introduce vm_page_set_valid().
  Reviewed by: tegge
* Add missing 'break' statement. | trasz | 2009-05-12 | 1 | -0/+1
  Found with: Coverity Prevent(tm)
  CID: 3919
* Prevent overflow of uio_resid. | kib | 2009-05-11 | 1 | -0/+3
  Noted by: jhb
  MFC after: 3 days
* Fix a kernel compilation error, introduced after r191990, by defining | attilio | 2009-05-11 | 1 | -0/+3
  thread with curthread in the AUDIT case.
  Reported by: dchagin
* Remove the thread argument from the FSD (File-System Dependent) parts of | attilio | 2009-05-11 | 10 | -53/+49
  the VFS. Now all the VFS_* functions and related parts no longer take the context, since it always refers to curthread.
  In some places, in particular when dealing with VOPs and functions living in the same namespace (e.g. vflush) which still need to be converted, pass curthread explicitly in order to retain the old behaviour. Such loose ends will be fixed ASAP.
  While here fix a bug: now, UFS_EXTATTR can be compiled alone without the UFS_EXTATTR_AUTOSTART option.
  The VFS KPI is heavily changed by this commit, so third-party modules need to be recompiled. Bump __FreeBSD_version in order to signal this situation.
* Revert CVS revision 1.94 (svn r16840). Current pmap implementations don't | alc | 2009-05-11 | 1 | -5/+7
  suffer from the race condition that motivated revision 1.94. Consequently, the work-around that was implemented by revision 1.94 is no longer needed. Moreover, reverting this work-around eliminates the need for vfs_busy_pages() to acquire the page queues lock when preparing a buffer for read.
  Reviewed by: tegge
* Spell NULL properly, use (void) rather than () for functions with no | imp | 2009-05-09 | 1 | -12/+12
  parameters. Mark two items as static that aren't used elsewhere...
* Retire kern.vm.kmem.size. It was marked as obsolete prior to 5.2, so | imp | 2009-05-09 | 1 | -4/+0
  it can go.
* Do not embed struct ucred into larger netcred parent structures. | kan | 2009-05-09 | 1 | -20/+24
  The credential might need to hang around longer than its parent and be used outside of the mnt_explock scope controlling the netcred lifetime. Use a separately allocated, reference-counted ucred instead.
  While there, extend mnt_explock coverage in vfs_stdexpcheck and clean up some unused declarations in the new NFS code.
  Reported by: John Hickey
  PR: kern/133439
  Reviewed by: dfr, kib
* A NOP change: style / whitespace cleanup of the noise that slipped | zec | 2009-05-08 | 3 | -4/+4
  into r191816.
  Spotted by: bz
  Approved by: julian (mentor) (an earlier version of the diff)
* Introduce a new virtualization container, provisionally named vprocg, to hold | zec | 2009-05-08 | 8 | -3/+66
  virtualized instances of hostname and domainname, as well as a new top-level virtualization struct vimage, which holds pointers to struct vnet and struct vprocg. Struct vprocg is likely to be replaced in the near future with a new jail management API import.
  As a consequence of this change, change struct ucred to point to a struct vimage, instead of directly pointing to a vnet.
  Merge vnet / vimage / ucred refcounting infrastructure from p4 / vimage branch.
  Permit kldload / kldunload operations to be executed only from the default vimage context.
  This change should have no functional impact on nooptions VIMAGE kernel builds.
  Reviewed by: bz
  Approved by: julian (mentor)
* Move the per-prison Linux MIB from a private one-off pointer to the new | jamie | 2009-05-07 | 1 | -1/+0
  OSD-based jail extensions. This allows the Linux MIB to be accessed via jail_set and jail_get, and serves as a demonstration of adding jail support to a module.
  Reviewed by: dchagin, kib
  Approved by: bz (mentor)
* Eliminate the loop and the call to pause(9) in vfs_vget_ino(). If | kib | 2009-05-07 | 1 | -6/+8
  vfs_busy(MBF_NOWAIT) failed, unlock the vnode and sleep in vfs_busy().
  Suggested and reviewed by: jeff
  Tested by: pho
  MFC after: 3 weeks
* If we have a regular rint handler, never go into rint_bypass mode. | ed | 2009-05-07 | 1 | -3/+6
  It turns out if we called cfmakeraw() on a TTY with only a rint handler in place, it could inject data into the TTY, even though it should be redirected. Always take a look at the hooks before looking at the termios flags.
* Change the curvnet variable from a global const struct vnet *, | zec | 2009-05-05 | 9 | -15/+142
  previously always pointing to the default vnet context, to a dynamically changing thread-local one. The curvnet context should be set on entry to networking code via CURVNET_SET() macros, and reverted to the previous state via CURVNET_RESTORE(). Recursions on curvnet are permitted, though strongly discouraged.
  This change should have no functional impact on nooptions VIMAGE kernel builds, where CURVNET_* macros expand to whitespace.
  The curthread->td_vnet (aka curvnet) variable's purpose is to be an indicator of the vnet context in which the current network-related operation takes place, in case we cannot deduce the current vnet context from any other source, such as by looking at mbuf's m->m_pkthdr.rcvif->if_vnet, socket's so->so_vnet etc. Moreover, so far curvnet has turned out to be an invaluable consistency checking aid: it helps to catch cases when sockets, ifnets or any other vnet-aware structures may have leaked from one vnet to another.
  The exact placement of the CURVNET_SET() / CURVNET_RESTORE() macros was a result of an empirical iterative process, with an aim to reduce recursions on CURVNET_SET() to a minimum, while still reducing the scope of CURVNET_SET() to networking-only operations; the alternative would be calling CURVNET_SET() on each system call entry. In general, curvnet has to be set in three typical cases: when processing socket-related requests from userspace or from within the kernel; when processing inbound traffic flowing from device drivers to upper layers of the networking stack; and when executing timer-driven networking functions.
  This change also introduces a DDB subcommand to show the list of all vnet instances.
  Approved by: julian (mentor)
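The set-and-restore discipline for a thread-local context pointer can be sketched in portable C11. This is an illustration of the general pattern only, under stated assumptions: `CTX_SET()`, `CTX_RESTORE()`, `cur`, `inner()` and `outer()` are hypothetical names, and the `struct vnet` below is a toy type, not the kernel structure or the real CURVNET_* macros.

```c
#include <assert.h>
#include <stddef.h>

/* Toy context type; the real structure carries far more state. */
struct vnet {
	const char *name;
};

/* Thread-local "current context" pointer, NULL outside networking code. */
static _Thread_local struct vnet *cur;

/* Save the previous context in a local variable, then switch. */
#define CTX_SET(v)	struct vnet *saved_ctx = cur; cur = (v)
/* Switch back to whatever was current when CTX_SET() ran. */
#define CTX_RESTORE()	cur = saved_ctx

static const char *
inner(struct vnet *v)
{
	CTX_SET(v);
	assert(cur != NULL);
	const char *n = cur->name;
	CTX_RESTORE();
	return (n);
}

static const char *
outer(struct vnet *a, struct vnet *b, const char **seen_inner)
{
	CTX_SET(a);
	*seen_inner = inner(b);		/* recursion: works, but best avoided */
	const char *n = cur->name;	/* context is a again after inner() */
	CTX_RESTORE();
	return (n);
}
```

Because the saved value lives in a local variable of the calling function, nested uses unwind in the right order, which is why recursion is tolerated; each function must still pair every set with a restore on all return paths.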