path: root/sys/kern/vfs_subr.c
Commit message | Author | Age | Files | Lines
...
* Merge first in a series of TrustedBSD MAC Framework KPI changes (rwatson, 2007-10-24, 1 file, -3/+3)
  from Mac OS X Leopard--rationalize naming for entry points to the following general forms:

    mac_<object>_<method/action>
    mac_<object>_check_<method/action>

  The previous naming scheme was inconsistent and mostly reversed from the new scheme. Also, make object types more consistent and remove spaces from object types that contain multiple parts ("posix_sem" -> "posixsem") to make mechanical parsing easier. Introduce a new "netinet" object type for certain IPv4/IPv6-related methods. Also simplify, slightly, some entry point names.

  All MAC policy modules will need to be recompiled, and modules not updated as part of this commit will need to be modified to conform to the new KPI.

  Sponsored by: SPARTA (original patches against Mac OS X)
  Obtained from: TrustedBSD Project, Apple Computer
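The two naming forms described in this commit message can be illustrated with a small userspace sketch. The helper `mac_entry_name()` below is hypothetical and is not part of the MAC Framework; it only shows how a name is assembled from an object type, a method, and the optional `check` infix. The object and method strings passed to it are illustrative.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a MAC Framework function): build an entry
 * point name in one of the two general forms from the commit message,
 *   mac_<object>_<method>        when is_check is zero
 *   mac_<object>_check_<method>  when is_check is non-zero
 * Returns 0 on success, -1 if the name does not fit in buf.
 */
static int
mac_entry_name(char *buf, size_t size, const char *object,
    const char *method, int is_check)
{
	int n;

	assert(buf != NULL && size > 0);
	n = snprintf(buf, size, is_check ? "mac_%s_check_%s" : "mac_%s_%s",
	    object, method);
	/* snprintf() reports the length it wanted; detect truncation. */
	return ((n < 0 || (size_t)n >= size) ? -1 : 0);
}
```

Because object types no longer contain separators of their own ("posixsem" rather than "posix_sem"), a name built this way can be split back into its parts on the underscore, which is presumably the mechanical parsing the message refers to.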
* Rename the kthread_xxx (e.g. kthread_create()) calls (julian, 2007-10-20, 1 file, -2/+2)
  to kproc_xxx as they actually make whole processes. This makes way for us to add REAL kthread_create() and friends that actually make threads. It turns out that most of these calls actually end up being moved back to the thread version when it's added, but we need to make this cosmetic change first.

  I'd LOVE to do this rename in 7.0 so that we can eventually MFC the new kthread_xxx() calls.
* When restoring the mount after umount failed, the MNTK_UNMOUNT flag (kib, 2007-09-12, 1 file, -1/+1)
  prevents insmntque() from placing the reallocated syncer vnode on the mount list, which causes a panic in vfs_allocate_syncvnode().

  Introduce the MNTK_NOINSMNTQ flag, which marks the period when insmntque() is not allowed to succeed, instead of MNTK_UNMOUNT. MNTK_NOINSMNTQ is set and cleared simultaneously with MNTK_UNMOUNT, except on the umount error path, where it is cleared just before the syncer vnode is going to be allocated.

  Reported by: Peter Jeremy <peterjeremy optushome com au>
  Suggested by: tegge
  Approved by: re (rwatson)
* Improve vn_printf() by: (pjd, 2007-08-13, 1 file, -7/+45)
  - adding missing vnode flags,
  - printing unknown flags as numbers,
  - using strlcat() instead of strcat().

  Approved by: re (bmah)
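The three improvements listed above can be sketched in userspace as follows. This is not the vn_printf() code: the flag names and values are hypothetical, and `append_bounded()` is a portable stand-in for strlcat() built on snprintf(), since strlcat() is not available in every C library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flag bits for illustration; not the real vnode flag values. */
#define	XF_ROOT		0x0001u
#define	XF_TEXT		0x0002u
#define	XF_DOOMED	0x0004u

/*
 * Bounded append: never writes past dst[size - 1] and always leaves dst
 * NUL-terminated, truncating src if needed. dst must already hold a
 * NUL-terminated string shorter than size.
 */
static void
append_bounded(char *dst, size_t size, const char *src)
{
	size_t len = strlen(dst);

	if (len < size)
		snprintf(dst + len, size - len, "%s", src);
}

/*
 * Render the named flags as "|NAME|NAME", then append any remaining
 * bits as a hexadecimal number so that unknown flags stay visible.
 */
static void
describe_flags(unsigned int flags, char *buf, size_t size)
{
	char unknown[32];

	assert(buf != NULL && size > 0);
	buf[0] = '\0';
	if (flags & XF_ROOT)
		append_bounded(buf, size, "|XF_ROOT");
	if (flags & XF_TEXT)
		append_bounded(buf, size, "|XF_TEXT");
	if (flags & XF_DOOMED)
		append_bounded(buf, size, "|XF_DOOMED");
	flags &= ~(XF_ROOT | XF_TEXT | XF_DOOMED);
	if (flags != 0) {
		snprintf(unknown, sizeof(unknown), "|(0x%x)", flags);
		append_bounded(buf, size, unknown);
	}
}
```

With an unbounded strcat(), adding more flag names to such a function could silently overrun a fixed-size buffer; the bounded form truncates the output instead.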
* Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in (rwatson, 2007-06-12, 1 file, -5/+5)
  some cases, move to priv_check() if it was an operation on a thread and no other flags were present.

  Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c.

  We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h.

  Reviewed by: csjp
  Obtained from: TrustedBSD Project
* Revert VMCNT_* operations introduction. (attilio, 2007-05-31, 1 file, -4/+3)
  Probably, a general approach is not the better solution here, so we should solve the sched_lock protection problems separately.

  Requested by: alc
  Approved by: jeff (mentor)
* Universally adopt most conventional spelling of acquire. (rwatson, 2007-05-27, 1 file, -1/+1)
* Since renaming of vop_lock to _vop_lock, pre- and post-condition (kib, 2007-05-18, 1 file, -3/+3)
  function calls are no longer generated for vop_lock. Rename _vop_lock to vop_lock1 to satisfy the tools/vnode_if.awk assumption about vop naming conventions. This restores pre/post-condition calls.
* - define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating (jeff, 2007-05-18, 1 file, -3/+4)
  vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines.

  Contributed by: Attilio Rao <attilio@FreeBSD.org>
* Fix jails and jail-friendly file systems handling: (pjd, 2007-04-13, 1 file, -0/+24)
  - We need to allow for PRIV_VFS_MOUNT_OWNER inside a jail.
  - Move security checks to vfs_suser() and deny unmounting and updating for jailed root from different jails, etc.

  OK'ed by: rwatson
* When we are running low on vnodes, there is currently no way to ask other (pjd, 2007-04-13, 1 file, -0/+1)
  subsystems to release some vnodes. Implement backpressure based on vfs_lowvnodes event (similar to vm_lowmem for memory).
* Minor style cleanups (mostly removal of trailing whitespaces). (pjd, 2007-04-10, 1 file, -22/+22)
* Correct typos. (pjd, 2007-04-10, 1 file, -1/+1)
* Now that the vdropl() function is public, assert that the vnode interlock (pjd, 2007-04-01, 1 file, -0/+1)
  is held.
* Make vdropl() public; zfs needs it. There is also plenty of existing (des, 2007-03-31, 1 file, -2/+1)
  file system code (mostly *_reclaim()) which look like this:

    VOP_LOCK(vp);
    /* examine vp */
    VOP_UNLOCK(vp);
    vdrop(vp);

  This can now be rewritten to:

    VOP_LOCK(vp);
    /* examine vp */
    vdropl(vp); /* will unlock vp */

  MFC after: 1 week
* PowerPC is the only architecture with mpsafe_vfs=0. This is now (marcel, 2007-03-27, 1 file, -4/+0)
  broken. Rudimentary tests show that PowerPC can run with mpsafe_vfs=1. Make it so...
* Make insmntque() externally visible and allow it to fail (e.g. during (tegge, 2007-03-13, 1 file, -5/+40)
  late stages of unmount). On failure, the vnode is recycled.

  Add insmntque1(), to allow for file system specific cleanup when recycling vnode on failure.

  Change getnewvnode() to no longer call insmntque(). Previously, embryonic vnodes were put onto the list of vnodes belonging to a file system, which is unsafe for a file system marked MPSAFE.

  Change vfs_hash_insert() to no longer lock the vnode. The caller now has that responsibility.

  Change most file systems to lock the vnode and call insmntque() or insmntque1() after a new vnode has been sufficiently set up. Handle failed insmntque*() calls by propagating errors to callers, possibly after some file system specific cleanup.

  Approved by: re (kensmith)
  Reviewed by: kib
  In collaboration with: kib
* change vop_lock handling to allow tracking of callers' file and line for (kmacy, 2006-11-13, 1 file, -3/+3)
  acquisition of lockmgr locks

  Approved by: scottl (standing in for mentor rwatson)
* Simplify operations with sync_mtx in sched_sync(): (jhb, 2006-11-07, 1 file, -7/+3)
  - Don't drop the lock just to reacquire it again to check rushjob, this only wastes time.
  - Use msleep() to drop the mutex while sleeping instead of explicitly unlocking around tsleep.

  Reviewed by: pjd
* Fix comment typo and function declaration. (jhb, 2006-11-07, 1 file, -2/+2)
* Sweep kernel replacing suser(9) calls with priv(9) calls, assigning (rwatson, 2006-11-06, 1 file, -40/+23)
  specific privilege names to a broad range of privileges. These may require some future tweaking.

  Sponsored by: nCircle Network Security, Inc.
  Obtained from: TrustedBSD Project
  Discussed on: arch@
  Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>
* Typo, 'from' vnode is locked here, not 'to' vnode. (pjd, 2006-11-04, 1 file, -1/+1)
* Add gjournal specific code to the UFS file system: (pjd, 2006-10-31, 1 file, -0/+2)
  - Add FS_GJOURNAL flag which enables gjournal support on a file system.
  - Add cg_unrefs field to the cylinder group structure which holds the number of unreferenced (orphaned) inodes in the given cylinder group.
  - Add fs_unrefs field to the super block structure which holds the total number of unreferenced (orphaned) inodes.
  - When a file or a directory is orphaned (last reference is removed, but object is still open), increase fs_unrefs and cg_unrefs fields, which is a hint for fsck in which cylinder groups to look for such (orphaned) objects.
  - When a file is last closed, decrease {fs,cg}_unrefs fields.
  - Add VV_DELETED vnode flag which points at orphaned objects.

  Sponsored by: home.pl
* Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h (rwatson, 2006-10-22, 1 file, -1/+2)
  begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead.

  This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd.

  Obtained from: TrustedBSD Project
  Sponsored by: SPARTA
* Correct the comment: numvnodes is decreased on vdestroying the vnode. (kib, 2006-10-02, 1 file, -1/+2)
  OKed by: tegge
  Approved by: pjd (mentor)
  MFC after: 1 week
* Add mnt_noasync counter to better handle interleaved calls to nmount(), (tegge, 2006-09-26, 1 file, -5/+6)
  sync() and sync_fsync() without losing MNT_ASYNC. Add MNTK_ASYNC flag which is set only when MNT_ASYNC is set and mnt_noasync is zero, and check that flag instead of MNT_ASYNC before initiating async io.
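The relationship between MNT_ASYNC, mnt_noasync and MNTK_ASYNC stated in this message can be sketched as a small userspace model. Everything here is an assumption made for illustration: the structure, the helper names and the bit values are hypothetical, and the locking that the kernel would need around these updates is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit values chosen for illustration only. */
#define	X_MNT_ASYNC	0x0040u		/* models MNT_ASYNC */
#define	X_MNTK_ASYNC	0x0002u		/* models MNTK_ASYNC */

struct xmount {
	unsigned int	flag;		/* models mnt_flag */
	unsigned int	kern_flag;	/* models mnt_kern_flag */
	int		noasync;	/* models mnt_noasync */
};

/* Derived flag: set only when async is requested and nobody forbids it. */
static void
xmount_update_async(struct xmount *mp)
{
	if ((mp->flag & X_MNT_ASYNC) != 0 && mp->noasync == 0)
		mp->kern_flag |= X_MNTK_ASYNC;
	else
		mp->kern_flag &= ~X_MNTK_ASYNC;
}

/* A caller that needs synchronous behaviour for a while bumps the counter. */
static void
xmount_noasync_enter(struct xmount *mp)
{
	assert(mp != NULL);
	mp->noasync++;
	xmount_update_async(mp);
}

/* Dropping the last such request lets the derived flag come back. */
static void
xmount_noasync_exit(struct xmount *mp)
{
	assert(mp != NULL && mp->noasync > 0);
	mp->noasync--;
	xmount_update_async(mp);
}

/* I/O paths test the derived flag rather than the user-visible one. */
static int
xmount_async_allowed(const struct xmount *mp)
{
	return ((mp->kern_flag & X_MNTK_ASYNC) != 0);
}
```

The point of the counter in this model is that the user-visible flag is never cleared and restored by the temporary callers, so interleaved enter/exit pairs cannot lose it.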
* Use mount interlock to protect all changes to mnt_flag and mnt_kern_flag. (tegge, 2006-09-26, 1 file, -0/+4)
  This eliminates a race where MNT_UPDATE flag could be lost when nmount() raced against sync(), sync_fsync() or quotactl().
* Add 'show vnode <addr>' DDB command. (pjd, 2006-09-04, 1 file, -2/+18)
* getnewvnode() can be called with NULL mp. (pjd, 2006-08-10, 1 file, -1/+1)
  Found by: Coverity Prevent (tm)
  Coverity ID: 1521
  Confirmed by: phk
* Add a bandaid to avoid a deadlock in a situation, when we are trying to suspend (pjd, 2006-08-09, 1 file, -0/+10)
  a file system, but need to obtain a vnode. We may not be able to do it, because all vnodes could be already in use and other processes cannot release them, because they are waiting in "suspfs" state.

  In such a situation, we allow the vnode to be allocated anyway.

  This is a temporary fix - there is no backpressure to free vnodes allocated in those circumstances.

  MFC after: 1 week
  Reviewed by: tegge
* Improve commenting of vaccess(), making sure to be clear that the ifdef (rwatson, 2006-08-06, 1 file, -4/+10)
  capabilities code is there for reference and never actually used. Slight style tweak.
* Enable debug.mpsafevfs by default on arm. Since every architecture except (alc, 2006-07-15, 1 file, -2/+1)
  powerpc has debug.mpsafevfs enabled by default, it is shorter to enumerate the architectures on which debug.mpsafevfs is off.

  Tested by: cognet@
* Back out my rev. 1.674. The better fix (rev. 1.637) is already in tree. (kib, 2006-07-05, 1 file, -3/+3)
  Approved by: kan (mentor)
* Backed out the change by request from rwatson. (babkin, 2006-06-26, 1 file, -72/+0)
  PR: kern/14584
* The common UID/GID space implementation. It has been discussed on -arch (babkin, 2006-06-25, 1 file, -0/+72)
  in 1999, and there are changes to the sysctl names compared to PR, according to that discussion. The description is in sys/conf/NOTES. Lines in the GENERIC files are added in commented-out form. I'll attach the test script I've used to PR.

  PR: kern/14584
  Submitted by: babkin
* Fix the LOR that occurs when the MAC is compiled into the kernel (kib, 2006-06-08, 1 file, -3/+3)
  and a vnode is destroyed.

  Reviewed by: rwatson
  LOR: 189
  MFC after: 2 weeks
  Approved by: kan (mentor)
* Do not set B_NOCACHE on buffers when releasing them in flushbuflist(). (ups, 2006-05-25, 1 file, -1/+1)
  If B_NOCACHE is set the pages of vm backed buffers will be invalidated. However clean buffers can be backed by dirty VM pages so invalidating them can lead to data loss.

  Add support for flushing dirty pages in the data invalidation function of some network file systems.

  This fixes data losses during vnode recycling (and other code paths using invalbuf(*,V_SAVE,*,*)) for data written using an mmapped file.

  Collaborative effort by: jhb@,mohans@,peter@,ps@,ups@
  Reviewed by: tegge@
  MFC after: 7 days
* Remove various bits of conditional Alpha code and fixup a few comments. (jhb, 2006-05-12, 1 file, -1/+1)
* vn_start_write()/vn_finished_write() is not needed here, because (pjd, 2006-04-29, 1 file, -2/+0)
  vn_start_write() is always called earlier in the code path and calling the function recursively may lead to a deadlock.

  Confirmed by: tegge
  MFC after: 2 weeks
* - Add a BO_NEEDSGIANT flag to the bufobj. This flag forces all child (jeff, 2006-04-28, 1 file, -1/+2)
    buffers to go on the buf daemon's DIRTYGIANT queue.
  - Set BO_NEEDSGIANT on ffs's devvp since the ffs_copyonwrite handler runs in the context of the buf daemon and may require Giant.
* - VFS_LOCK_GIANT when recycling a vnode via getnewvnode. We may be (jeff, 2006-04-04, 1 file, -0/+3)
    recycling for an unrelated filesystem. I really don't like potentially acquiring giant in the context of a giantless filesystem but there are reasonable objections to removing the recycling from this path.

  Sponsored by: Isilon Systems, Inc.
* - Add an assert to vgone. It is illegal to call vgone without a reference (jeff, 2006-03-31, 1 file, -3/+0)
    to the vnode. Without a reference the vnode will never be vdestroy'd and the memory will never be reclaimed.

  Sponsored by: Isilon Systems, Inc.
* - Hold a reference from the time vfs_busy starts until vfs_unbusy is (jeff, 2006-03-31, 1 file, -3/+9)
    called.
  - vfs_getvfs has to return a reference to prevent the returned mountpoint from changing identities.
  - Release references acquired via vfs_getvfs.

  Discussed with: tegge
  Tested by: kris
  Sponsored by: Isilon Systems, Inc.
* - Add the B_NEEDSGIANT flag which is only set if the vnode that owns a buf (jeff, 2006-03-31, 1 file, -0/+3)
    requires Giant. It is set in bgetvp and cleared in brelvp.
  - Create QUEUE_DIRTY_GIANT for dirty buffers that require giant.
  - In the buf daemon, only grab giant when processing QUEUE_DIRTY_GIANT and only if we think there are buffers in that queue.

  Sponsored by: Isilon Systems, Inc.
* - Correct an assert in vop_rename_pre. fdvp may be locked if it is either (jeff, 2006-03-19, 1 file, -1/+1)
    the target directory or file. This case should fail in the filesystem anyway and perhaps kern_rename() should catch it.

  Sponsored by: Isilon Systems, Inc.
* Use vn_start_secondary_write() and vn_finished_secondary_write() as a (tegge, 2006-03-08, 1 file, -4/+20)
  replacement for vn_write_suspend_wait() to better account for secondary write processing.

  Close race where secondary writes could be started after ffs_sync() returned but before the file system was marked as suspended.

  Detect if secondary writes or softdep processing occurred during vnode sync loop in ffs_sync() and retry the loop if needed.
* Eliminate a deadlock when creating snapshots. Blocking vn_start_write() must (tegge, 2006-03-02, 1 file, -0/+2)
  be called without any vnode locks held. Remove calls to vn_start_write() and vn_finished_write() in vnode_pager_putpages() and add these calls before the vnode lock is obtained to most of the callers that don't already have them.
* Don't try to show marker nodes. (tegge, 2006-03-02, 1 file, -1/+1)
* - Move softdep from using a global worklist to per-mount worklists. This (jeff, 2006-03-02, 1 file, -10/+0)
    has many positive effects including improved smp locking, reducing interdependencies between mounts that can lead to deadlocks, etc.
  - Add the softdep worklist and various counters to the ufsmnt structure.
  - Add a mount pointer to the workitem and remove mount pointers from the various structures derived from the workitem as they are now redundant.
  - Remove the poor-man's semaphore protecting softdep_process_worklist and softdep_flushworklist. Several threads may now process the list simultaneously.
  - Add softdep_waitidle() to block the thread until all pending dependencies being operated on by other threads have been flushed.
  - Use softdep_waitidle() in unmount and snapshots to block either operation until the fs is stable.
  - Remove softdep worklist processing from the syncer and move it into the softdep_flush() thread. This thread processes all softdep mounts once each second and when it is called via the new softdep_speedup() when there is a resource shortage. This removes the softdep hook from the kernel and various hacks in header files to support it.

  Reviewed by/Discussed with: tegge, truckman, mckusick
  Tested by: kris
* - Release the mount ref once the vnode has been recycled rather than once (jeff, 2006-02-23, 1 file, -3/+2)
    the last reference is dropped. I forgot that vnodes can stick around for a very long time until processes discover that they are dead. This means that a vnode reference is not sufficient to keep the mount referenced and even more code will be required to ref mount points.

  Discovered by: kris