path: root/sys/kern/vfs_subr.c
Commit message [Author, Age, Files, Lines]
...
* - Grab a mnt ref in vfs_busy() before dropping the interlock. This will prevent the mount point from going away while we're waiting on the lock. The ref does not need to persist once we have the lock because the lock prevents the mount point from being unmounted.
  MFC After: 1 week
  [jeff, 2006-02-22, files: 1, lines: -1/+6]
* - Add a ref count to the mount structure. Sleep for up to 3 seconds in vfs_mount_destroy waiting for this ref to hit 0. We don't print an error if we are rebooting as the root mount always retains some references by init proc.
  - Acquire a mnt ref for every vnode allocated to a mount point. Drop this ref only once vdestroy() has been called and the mount has been freed.
  - No longer NULL the v_mount pointer in delmntque() so that we may release the ref after vgone() has been called. This allows us to guarantee that the mount point structure will be valid until the last vnode has lost its last ref.
  - Fix a few places that rely on checking v_mount to detect recycling.
  Sponsored by: Isilon Systems, Inc.
  MFC After: 1 week
  [jeff, 2006-02-06, files: 1, lines: -6/+8]
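The reference-counting rule in the entry above (one mount reference per allocated vnode, dropped only when the vnode is destroyed) can be sketched in userland C. This is a minimal illustration, not the kernel code: the struct and function names are hypothetical, and the locking that the real mount structure needs is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct mount_sketch {
    int mnt_ref;                    /* outstanding references to this mount */
};

struct vnode_sketch {
    struct mount_sketch *v_mount;   /* stays set until the vnode is destroyed */
};

/* Allocating a vnode on a mount point takes one reference on the mount. */
static void
vnode_attach(struct vnode_sketch *vp, struct mount_sketch *mp)
{
    vp->v_mount = mp;
    mp->mnt_ref++;
}

/* Destroying the vnode is the only place that reference is dropped. */
static void
vnode_destroy(struct vnode_sketch *vp)
{
    assert(vp->v_mount != NULL && vp->v_mount->mnt_ref > 0);
    vp->v_mount->mnt_ref--;
    vp->v_mount = NULL;
}

/* The mount structure may be freed only once no references remain. */
static int
mount_can_destroy(const struct mount_sketch *mp)
{
    return (mp->mnt_ref == 0);
}
```

Because the vnode keeps its mount pointer until it is destroyed, the mount structure is guaranteed to outlive every vnode that still points at it, which is the property the commit message describes.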
* - Solve a race where we could lose a call to VOP_INACTIVE. If vget() waiting on a lock held the last usecount ref on a vnode and the lock failed we would not call INACTIVE. Solve this by only holding a holdcnt to prevent the vnode from disappearing while we wait on vn_lock. Other callers may now VOP_INACTIVE while we are waiting on the lock, however this race is acceptable, while losing INACTIVE is not.
  Discussed with: kan, pjd
  Tested by: kkenn
  Sponsored by: Isilon Systems, Inc.
  MFC After: 1 week
  [jeff, 2006-02-01, files: 1, lines: -12/+30]
* Back out r1.653; it turns out that the race (or at least the printf) is actually not hard to trigger, and it can cause a lot of console spam.
  Approved by: kan
  [kris, 2006-01-28, files: 1, lines: -20/+0]
* Convert remaining functions in vfs_subr.c from K&R prototypes to ANSI C prototypes, as the majority of new functions added have been in this style. Changing prototype style now results in gcc noticing that the implementation of vn_pollrecord() has a 'short' argument instead of 'int' as prototyped in vnode.h, so correct that definition. In practice this didn't matter as only poll flags in the lower 16 bits are used.
  MFC after: 1 week
  [rwatson, 2006-01-21, files: 1, lines: -82/+34]
* Add marker vnodes to ensure that all vnodes associated with the mount point are iterated over when using MNT_VNODE_FOREACH.
  Reviewed by: truckman
  [tegge, 2006-01-09, files: 1, lines: -22/+17]
* Print a warning when we miss a vinactive() call because of a race in vget(). The race is very real, but the conditions needed for triggering it are rather hard to meet now. When gjournal is committed (where it is quite easy to trigger) we need to fix it. For now, verify if it is really hard to trigger.
  Discussed with: kan
  [pjd, 2005-12-29, files: 1, lines: -0/+20]
* This is a workaround for a complicated issue involving VFS cookies and devfs. The PR and patch have the details. The ultimate fix requires architectural changes and clarifications to the VFS API, but this will prevent the system from panicking when someone does "ls /dev" while running in a shell under the linuxulator. This issue affects HEAD and RELENG_6 only.
  PR: 88249
  Submitted by: "Devon H. O'Dell" <dodell@ixsystems.com>
  MFC after: 3 days
  [dwhite, 2005-11-09, files: 1, lines: -0/+4]
* Normalize a significant number of kernel malloc type names:
  - Prefer '_' to ' ', as it results in more easily parsed results in memory monitoring tools such as vmstat.
  - Remove punctuation that is incompatible with using memory type names as file names, such as '/' characters.
  - Disambiguate some collisions by adding subsystem prefixes to some memory types.
  - Generally prefer lower case to upper case.
  - If the same type is defined in multiple architecture directories, attempt to use the same name in additional cases.
  Not all instances were caught in this change, so more work is required to finish this conversion. Similar changes are required for UMA zone names.
  [rwatson, 2005-10-31, files: 1, lines: -1/+1]
* mpsafevm has been stable and defaulted to 1 on sparc64 for over 6 months, so we are ready for mpsafevfs=1 by default on sparc64 too. I have been running this on all my sparc64 machines for over 6 months, and have not encountered MD problems.
  MFC after: 1 week
  [kris, 2005-10-14, files: 1, lines: -1/+1]
* Move execve's access time update functionality into a new vfs_mark_atime() function, and use the new function for performing efficient atime updates in mmap().
  Reviewed by: bde
  MFC after: 2 weeks
  [dds, 2005-10-12, files: 1, lines: -0/+17]
* Un-staticize runningbufwakeup() and staticize updateproc.

  Add a new private thread flag to indicate that the thread should not sleep if runningbufspace is too large.

  Set this flag on the bufdaemon and syncer threads so that they skip the waitrunningbufspace() call in bufwrite() rather than checking the proc pointer vs. the known proc pointers for these two threads. A way of preventing these threads from being starved for I/O but still placing limits on their outstanding I/O would be desirable.

  Set this flag in ffs_copyonwrite() to prevent bufwrite() calls from blocking on the runningbufspace check while holding snaplk. This prevents snaplk from being held for an arbitrarily long period of time if runningbufspace is high and greatly reduces the contention for snaplk. The disadvantage is that ffs_copyonwrite() can start a large amount of I/O if there are a large number of snapshots, which could cause a deadlock in other parts of the code.

  Call runningbufwakeup() in ffs_copyonwrite() to decrement runningbufspace before attempting to grab snaplk so that I/O requests waiting on snaplk are not counted in runningbufspace as being in-progress. Increment runningbufspace again before actually launching the original I/O request.

  Prior to the above two changes, the system could deadlock if enough I/O requests were blocked by snaplk to prevent runningbufspace from falling below lorunningspace and one of the bawrite() calls in ffs_copyonwrite() blocked in waitrunningbufspace() while holding snaplk.

  See <http://www.holm.cc/stress/log/cons143.html>
  [truckman, 2005-09-30, files: 1, lines: -1/+2]
* Break out of loop if next buffer pointer has become invalid while flushing current buffer.
  Reviewed by: kan
  [tegge, 2005-09-16, files: 1, lines: -0/+15]
* In vfs_kqfilter(), return EINVAL instead of 1 (EPERM) when an unsupported kqueue filter type is requested on a vnode.
  MFC after: 3 days
  [rwatson, 2005-09-12, files: 1, lines: -1/+1]
* use monotonic `time_uptime' instead of `time_second'
  Approved by: anholt (mentor)
  Discussed on: arch
  [jkim, 2005-09-12, files: 1, lines: -4/+4]
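The motivation for the entry above is that a wall-clock value can jump when the system time is adjusted, while an uptime-style counter only moves forward. The kernel variables `time_uptime` and `time_second` are not available in userland, so the sketch below uses the POSIX `CLOCK_MONOTONIC` clock as an analogous source; the helper names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Seconds from a clock that never goes backwards (userland analogue of uptime). */
static time_t
monotonic_seconds(void)
{
    struct timespec ts;
    int error;

    error = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(error == 0);
    (void)error;
    return (ts.tv_sec);
}

/*
 * Interval measured against the monotonic clock.  The result cannot be
 * negative, which is not guaranteed when both readings come from wall time.
 */
static time_t
seconds_since(time_t start)
{
    return (monotonic_seconds() - start);
}
```

Timeouts and age calculations computed this way stay correct even if the administrator or a time daemon steps the wall clock between the two readings.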
* Introduce vfs_read_dirent() which can help VOP_READDIR() implementations by handling all the cookie stuff.
  [phk, 2005-09-12, files: 1, lines: -0/+27]
* Fix a typo in vop_rename_pre() where we ended up using vholdl() instead of vhold(), even though the vnode interlock is unlocked.
  MFC after: 3 days
  [ssouhlal, 2005-08-28, files: 1, lines: -1/+1]
* Back out the removal of LK_NOWAIT from the VOP_LOCK() call in vlrureclaim() in vfs_subr.c 1.636 because waiting for the vnode lock aggravates an existing race condition. It is also undesirable according to the commit log for 1.631.

  Fix the tiny race condition that remains by rechecking the vnode state after grabbing the vnode lock and grabbing the vnode interlock.

  Fix the problem of other threads being starved (which 1.636 attempted to fix by removing LK_NOWAIT) by calling uio_yield() periodically in vlrureclaim(). This should be more deterministic than hoping that VOP_LOCK() without LK_NOWAIT will block, which may not happen in this loop.

  Reviewed by: kan
  MFC after: 5 days
  [truckman, 2005-08-23, files: 1, lines: -7/+37]
* Silence "busy" warnings when unmounting devfs at system shutdown. This is a workaround for non-symmetric teardown of the file systems at shutdown with respect to the mount order at boot. The proper long term fix is to properly detach devfs from the root mount before unmounting each, and should be implemented, but since the problem is non-harmful, this temporary band-aid will prevent false positive bug reports and unnecessary error output for 6.0-RELEASE.
  MFC after: 3 days
  Tested by: pav, pjd
  [rwatson, 2005-08-20, files: 1, lines: -6/+16]
* Make mpsafe_vfs=1 the default on ia64.
  [marcel, 2005-08-13, files: 1, lines: -1/+2]
* Do not drop the vnode interlock if vdropl is called on an already doomed vnode. vdropl callers expect it to return with the interlock still being held.
  MFC after: 2 days
  [kan, 2005-08-10, files: 1, lines: -3/+1]
* Holding a vnode doesn't prevent v_mount from disappearing (when the vnode is inactivated), possibly leading to a NULL dereference when checking if the mount wants knotes to be activated in the VOP hooks. So, we add a new vnode flag VV_NOKNOTE that is only set in getnewvnode(), if necessary, and check it when activating knotes. Since the flags are not erased when a vnode is being held, we can safely read them.
  Reviewed by: kris@
  MFC after: 3 days
  [ssouhlal, 2005-08-06, files: 1, lines: -0/+2]
* - Unlock before we call mac_destroy_vnode to prevent a lock order reversal.
  Found by: trhodes
  [jeff, 2005-08-03, files: 1, lines: -0/+1]
* - Allow vnlru to drop giant if the filesystem does not require it. The vnlru proc is extremely inefficient, potentially iterating over tens of thousands of vnodes without blocking. Dropping Giant allows other threads to preempt us although we should revisit the algorithm to fix the runtime problems especially since this may hold up all vnode allocations.
  - Remove the LK_NOWAIT from the VOP_LOCK in vlrureclaim. This provides a natural blocking point to help alleviate the situation described above although it may not technically be desirable.
  - yield after we make a pass on all mount points to prevent us from blocking other threads which require Giant.
  MFC after: 2 weeks
  [jeff, 2005-07-20, files: 1, lines: -2/+11]
* Fix one "wrong b_bufobj" panic in reassignbuf() by moving VI_UNLOCK(vp) below KASSERT()s, which means there was no real problem here, we just needed better locking for assertions.
  OK'ed by: jeff
  Approved by: re (scottl)
  [pjd, 2005-07-05, files: 1, lines: -1/+1]
* Fix the recent panics/LORs/hangs created by my kqueue commit by:
  - Introducing the possibility of using locks different than mutexes for the knlist locking. In order to do this, we add three arguments to knlist_init() to specify the functions to use to lock, unlock and check if the lock is owned. If these arguments are NULL, we assume mtx_lock, mtx_unlock and mtx_owned, respectively.
  - Using the vnode lock for the knlist locking, when doing kqueue operations on a vnode. This way, we don't have to lock the vnode while holding a mutex, in filt_vfsread.
  Reviewed by: jmg
  Approved by: re (scottl), scottl (mentor override)
  Pointyhat to: ssouhlal
  Will be happy: everyone
  [ssouhlal, 2005-07-01, files: 1, lines: -23/+49]
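The pattern described in the entry above, an init function that takes lock, unlock and ownership callbacks and falls back to defaults when they are NULL, can be sketched as follows. This is an illustration only: `knlist_sketch`, its field names and the toy lock are hypothetical, and the toy lock stands in for the mutex defaults named in the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Toy lock used only for this illustration; it is not a real mutex. */
struct toy_lock {
    int held;
};

static void toy_lock_acquire(void *arg) { ((struct toy_lock *)arg)->held = 1; }
static void toy_lock_release(void *arg) { ((struct toy_lock *)arg)->held = 0; }
static int  toy_lock_owned(void *arg)   { return (((struct toy_lock *)arg)->held); }

/* Hypothetical list header carrying its own locking callbacks. */
struct knlist_sketch {
    void *lock_arg;                 /* object passed to each callback */
    void (*lock_fn)(void *);
    void (*unlock_fn)(void *);
    int  (*owned_fn)(void *);
};

/* NULL callbacks select the defaults, mirroring the behaviour described above. */
static void
knlist_sketch_init(struct knlist_sketch *kl, void *lock_arg,
    void (*lock_fn)(void *), void (*unlock_fn)(void *), int (*owned_fn)(void *))
{
    kl->lock_arg = lock_arg;
    kl->lock_fn = (lock_fn != NULL) ? lock_fn : toy_lock_acquire;
    kl->unlock_fn = (unlock_fn != NULL) ? unlock_fn : toy_lock_release;
    kl->owned_fn = (owned_fn != NULL) ? owned_fn : toy_lock_owned;
}

/* Callers never name a lock type; they go through the callbacks. */
static int
knlist_sketch_locked_op(struct knlist_sketch *kl)
{
    int owned;

    kl->lock_fn(kl->lock_arg);
    owned = kl->owned_fn(kl->lock_arg);
    kl->unlock_fn(kl->lock_arg);
    return (owned);
}
```

Routing every lock operation through the callbacks is what lets one subsystem pass a different lock type (the vnode lock in the commit) without the list code knowing about it.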
* - Try to catch the wrong bufobj panics a little earlier. I believe they are actually caused by a buf with both VNCLEAN and VNDIRTY set. In the traces it is clear that the buf is removed from the dirty queue while it is actually on the clean queue which leaves the tail pointer set. Assert that both flags are not set in buf_vlist_add and buf_vlist_remove.
  Sponsored by: Isilon Systems, Inc.
  Approved by: re (blanket vfs)
  [jeff, 2005-06-18, files: 1, lines: -0/+5]
* - Change holdcnt use around vnode recycling. We now always keep a holdcnt ref while we're calling vgone(). This prevents transient refs from re-adding us to the free list. Previously, a vfree() triggered via vinvalbuf() getting rid of all of a vnode's pages could place a partially destructed vnode on the free list where vtryrecycle() could find it. The first call to vtryrecycle would hang up on the vnode lock, but when it failed it would place a now dead vnode onto the free list, and another call to vtryrecycle() would free an already free vnode. There were many complications of having a zero ref count while freeing which can now go away.
  - Change vdropl() to release the interlock before returning. All callers now respect this, so vdropl() directly frees VI_DOOMED vnodes once the last ref is dropped. This means that we'll never have VI_DOOMED vnodes on the free list.
  - Separate v_incr_usecount() into v_incr_usecount(), v_decr_usecount() and v_decr_useonly(). The incr/decr split is so that incr usecount can return with the interlock still held while decr drops the interlock so it can call vdropl() which will potentially free the vnode. The calling function can't drop the lock of an already free'd node. v_decr_useonly() drops a usecount without dropping the hold count. This is done so the usecount reaches zero in vput() before we recycle, however the holdcount is still 1 which prevents any new references from placing the vnode back on the free list.
  - Fix vnlrureclaim() to vhold the vnode since it doesn't do a vget(). We wouldn't want vnlrureclaim() to bump the usecount since this has different semantics. Also change vnlrureclaim() to do a NOWAIT on the vn_lock. When this function runs we're usually in a desperate situation and we wouldn't want to wait for any specific vnode to be released.
  - Fix a bunch of misc comments to reflect the new behavior.
  - Add vhold() and vdrop() to vflush() for the same reasons that we do in vlrureclaim(). Previously we held no reference and a vnode could have been freed while we were waiting on the lock.
  - Get rid of vlruvp() and vfreehead(). Neither are used. vlruvp() should really be rethought before it's reintroduced.
  - vgonel() always returns with the vnode locked now and never puts the vnode back on a free list. The vnode will be freed as soon as the last reference is released.
  Sponsored by: Isilon Systems, Inc.
  Debugging help from: Kris Kennaway, Peter Holm
  Approved by: re (blanket vfs)
  [jeff, 2005-06-16, files: 1, lines: -202/+198]
* - In reassignbuf() add many asserts to validate the head and tail pointers of the clean and dirty lists. This is in an attempt to catch the wrong bufobj problem sooner.
  - In vgonel() don't acquire an extra reference in the active case, the vnode lock and VI_DOOMED protect us from recursively cleaning.
  - Also in vgonel() clean up some stale comments.
  Sponsored by: Isilon Systems, Inc.
  Approved by: re (blanket vfs)
  [jeff, 2005-06-14, files: 1, lines: -18/+29]
* - Don't make vgonel() globally visible, we want to change its prototype anyway and it's not used outside of vfs_subr.c.
  - Change vgonel() to accept a parameter which determines whether or not we'll put the vnode on the free list when we're done.
  - Use the new vgonel() parameter rather than VI_DOOMED to signal our intentions in vtryrecycle().
  - In vgonel() return if VI_DOOMED is already set, this vnode has already been reclaimed.
  Sponsored by: Isilon Systems, Inc.
  [jeff, 2005-06-13, files: 1, lines: -36/+19]
* - Add KTR_VFS events to vdestroy, vtruncbuf, vinvalbuf, vfreehead.
  Sponsored by: Isilon Systems, Inc.
  [jeff, 2005-06-13, files: 1, lines: -0/+4]
* - Assert that we're not in the name cache anymore in vdestroy().
  Sponsored by: Isilon Systems, Inc.
  [jeff, 2005-06-11, files: 1, lines: -0/+2]
* - Add KTR_VFS tracing to track the life of vnodes. Eventually KTR_VFS events could be added to cover other interesting details.
  - Add some VNASSERTs to discover places where we access vnodes after they have been uma_zfree'd before we try to free them again.
  - Add a few more VNASSERTs to vdestroy() to be certain that the vnode is really unused.
  Sponsored by: Isilon Systems, Inc.
  [jeff, 2005-06-11, files: 1, lines: -1/+20]
* Allow EVFILT_VNODE events to work on every filesystem type, not just UFS by:
  - Making the pre and post hooks for the VOP functions work even when DEBUG_VFS_LOCKS is not defined.
  - Moving the KNOTE activations into the corresponding VOP hooks.
  - Creating a MNTK_NOKNOTE flag for the mnt_kern_flag field of struct mount that permits filesystems to disable the new behavior.
  - Creating a default VOP_KQFILTER function: vfs_kqfilter()
  My benchmarks have not revealed any performance degradation.
  Reviewed by: jeff, bde
  Approved by: rwatson, jmg (kqueue changes), grehan (mentor)
  [ssouhlal, 2005-06-09, files: 1, lines: -1/+232]
* - Clear OWEINACT prior to calling VOP_INACTIVE to remove the possibility of a vget causing another call to INACTIVE before we're finished.
  [jeff, 2005-06-07, files: 1, lines: -1/+2]
* If we are going to
  1. Copy a NULL-terminated string into a fixed-length buffer, and
  2. copyout that buffer to userland,
  we really ought to
  0. Zero the entire buffer first.
  Security: FreeBSD-SA-05:08.kmem
  [cperciva, 2005-05-06, files: 1, lines: -0/+3]
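The three steps in the entry above can be sketched in plain C. The point is that copying a short string into a larger buffer leaves the remaining bytes holding whatever was there before, and exporting the whole buffer would leak them. The buffer size and function name below are hypothetical, and the kernel's copyout() is not modelled; the sketch only prepares the buffer that would be exported.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_BUF_LEN 16     /* hypothetical fixed buffer size */

/*
 * Prepare a fixed-length buffer that will later be exported in full.
 * Step 0: zero every byte so no stale data survives.
 * Step 1: copy the string, truncating so a terminating NUL always remains.
 */
static void
fill_name_buffer(char buf[NAME_BUF_LEN], const char *name)
{
    size_t len;

    len = strlen(name);
    if (len > NAME_BUF_LEN - 1)
        len = NAME_BUF_LEN - 1;
    memset(buf, 0, NAME_BUF_LEN);
    memcpy(buf, name, len);
}
```

Without the memset() call, the bytes after the terminator would keep their previous contents, which is exactly the kind of disclosure the advisory referenced in the commit addresses.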
* - A vnode may have made its way onto the free list while it was being vgone'd. We must remove it from the freelist before returning in vtryrecycle() or we may get a duplicate free.
  Reported by: kkenn
  [jeff, 2005-05-03, files: 1, lines: -0/+2]
* Since it is not possible for curthread to be NULL in this context, drop the check+initialization for a straight initialization. Also assert that curthread will never be NULL just to be sure.
  Discussed with: rwatson, peter
  MFC after: 1 week
  [csjp, 2005-05-02, files: 1, lines: -4/+2]
* - All buffers should either be clean or dirty. If neither of these flags are set when we attempt to remove a buffer from a queue we should panic. Hopefully this will catch the source of the wrong bufobj panics.
  Sponsored by: Isilon Systems, Inc.
  [jeff, 2005-05-01, files: 1, lines: -0/+4]
* - In vnlru_free() remove the vnode from the free list before we call vtryrecycle(). We could sometimes get into situations where two threads could try to recycle the same vnode before this.
  - vtryrecycle() is now responsible for returning the vnode to the free list if it fails and someone else hasn't done it.
  - Make a new function vfreehead() which moves a vnode to the head of the free list and use it in vgone() to clean up that code a bit.
  Sponsored by: Isilon Systems, Inc.
  Reported by: pho, kkenn
  [jeff, 2005-04-30, files: 1, lines: -33/+51]
* - Don't vgonel() via vgone() or vrecycle() if the vnode is already doomed. This fixes forced unmounts via nullfs.
  Reported by: kkenn
  Sponsored by: Isilon Systems, Inc.
  [jeff, 2005-04-27, files: 1, lines: -1/+8]
* - Stop setting vxthread, we've asserted that it was useless for several weeks now.
  [jeff, 2005-04-27, files: 1, lines: -2/+0]
* - Disable code which allows getnewvnode() to fail. Many ffs_vget() callers do not correctly deal with failures. This presently risks deadlock problems if dependency processing is held up by failures to allocate a vnode, however, this is better than the situation with the failures.
  Sponsored by: Isilon Systems, Inc.
  [jeff, 2005-04-22, files: 1, lines: -0/+2]
* Initialize mountlist_mtx with an MTX_SYSINIT(), we need it to be ready earlier.
  [phk, 2005-04-18, files: 1, lines: -1/+0]
* - Change vop_lookup_post assertions to reflect recent vfs_lookup changes.
  Sponsored by: Isilon Systems, Inc.
  [jeff, 2005-04-13, files: 1, lines: -12/+2]
* - Enable ASSERT_VOP_ELOCKED and assert_vop_elocked() now that vnode_if.awk uses it.
  Sponsored by: Isilon Systems, Inc.
  [jeff, 2005-04-11, files: 1, lines: -1/+1]
* - Change the VOP_LOCK UPGRADE in vput() to do a LK_NOWAIT to avoid a potential lock order reversal. Also, don't unlock the vnode if this fails, lockmgr has already unlocked it for us.
  - Restructure vget() now that vn_lock() does all of VI_DOOMED checking for us and also handles the case where there is no real lock type.
  - If VI_OWEINACT is set, we need to upgrade the lock request to EXCLUSIVE so that we can call inactive. It's not legal to vget a vnode that hasn't had INACTIVE called yet.
  Sponsored by: Isilon Systems, Inc.
  [jeff, 2005-04-11, files: 1, lines: -39/+43]
* - Assert that the bufobj matches in flushbuflists. I still haven't gotten to root cause on exactly how this happens.
  - If the assert is disabled, we presently try to handle this case, but the BUF_UNLOCK was missing. Thus, if this condition ever hit we would leak a buf lock.
  Many thanks to Peter Holm for all his help in finding this bug. He really put more effort into it than I did.
  [jeff, 2005-04-06, files: 1, lines: -0/+3]
* - Move NDFREE() from vfs_subr to vfs_lookup where namei() is.
  [jeff, 2005-04-05, files: 1, lines: -38/+0]
* - Add a missing unlock of the vnode_free_list_mtx.
  Spotted by: Antoine Brodin
  [jeff, 2005-04-04, files: 1, lines: -1/+3]