path: root/sys/kern/vfs_cache.c
Commit message [Author, Age, Files, Lines]
* Correctly handle unlock for the !MAKEENTRY case. After a successful attempt of lock upgrade, the cache shall be unlocked from write.
  Reported by: Lucius Windschuh <lwindschuh googlemail com>
  Reviewed by: kan
  Approved by: re (rwatson)
  [kib, 2009-08-14, 1 file, -1/+2]
* Add explicit struct ucred * argument for VOP_VPTOCNP, to be used by vn_open_cred in default implementation. Valid struct ucred is needed for audit and MAC, and curthread credentials may be wrong. This further requires modifying the interface of vn_fullpath(9), but it is out of scope of this change.
  Reviewed by: rwatson
  [kib, 2009-06-21, 1 file, -7/+9]
* Unlock the cache lock before returning when we run out of buffer space trying to fill in the full path name.
  Reported by: David Naylor <naylor.b.david@gmail.com>
  Approved by: kib
  [marcus, 2009-06-05, 1 file, -1/+4]
* Unbreak the build. Add missed probes.
  Reviewed by: rwatson
  Pointy hat to: me
  [kib, 2009-05-31, 1 file, -6/+12]
* Eliminate code duplication in vn_fullpath1() around the cache lookups and calls to vn_vptocnp() by moving more of the common code to vn_vptocnp(). Rename vn_vptocnp() to vn_vptocnp_locked() to signify that cache is locked around the call. Do not track buffer position by both the pointer and offset, use only buflen to record the start of the free space. Export vn_vptocnp() for external consumers as a wrapper around vn_vptocnp_locked() that locks the cache and handles hold counts.
  Tested by: pho
  [kib, 2009-05-31, 1 file, -85/+75]
* More fallout from negative dotdot caching. Negative entries should be removed from and reinserted into the proper ncneg list.
  Reported by: pho
  Submitted by: kib
  [kan, 2009-04-17, 1 file, -8/+13]
* Redo previous change using a simpler patch that also happens to be more correct.
  Submitted by: tor
  [kan, 2009-04-14, 1 file, -9/+3]
* Fix yet another negative dotdot entry fallout.
  Reported by: pho
  [kan, 2009-04-14, 1 file, -0/+12]
* Fix v_cache_dd handling for negative entries. The v_cache_dd pointer was not populated in the parent directory if a negative entry was being created, yet the entry itself was added to the nc_neg list. It was possible for the parent vnode to get discarded later, leaving the negative entry pointing to a now unused memory block.
  Reported by: dho
  Reviewed by: kib
  [kan, 2009-04-11, 1 file, -13/+14]
* When zapping v_cache_dd for !MAKEENTRY case in cache_lookup(), we shall lock cache as writer.
  Reviewed by: kan
  [kib, 2009-04-11, 1 file, -0/+2]
* Cache_lookup() for DOTDOT drops dvp vnode lock, allowing dvp to be reclaimed. Check the condition and return ENOENT then. In nfs_lookup(), respect ENOENT return from cache_lookup() when it is caused by dvp reclaim.
  Reported and tested by: pho
  [kib, 2009-04-10, 1 file, -1/+8]
* Nul-terminate strings in the VFS name cache, which negligibly changes the size and cost of name cache entries, but makes adding debugging and tracing easier.
  Add SDT DTrace probes for various namecache events:
    vfs:namecache:enter:done - new entry in the name cache, passed parent directory vnode pointer, name added to the cache, and child vnode pointer.
    vfs:namecache:enter_negative:done - new negative entry in the name cache, passed parent vnode pointer, name added to the cache.
    vfs:namecache:fullpath:enter - call to vn_fullpath1() is made, passed the vnode to resolve to a name.
    vfs:namecache:fullpath:hit - vn_fullpath1() successfully resolved a search for the parent of an object using the namecache, passed the discovered parent directory vnode pointer, name, and child vnode pointer.
    vfs:namecache:fullpath:miss - vn_fullpath1() failed to resolve a search for the parent of an object using the namecache, passed the child vnode pointer.
    vfs:namecache:fullpath:return - vn_fullpath1() has completed, passed the error number, and if that is zero, the vnode to resolve, and the returned path.
    vfs:namecache:lookup:hit - positive name cache entry hit, passed the parent directory vnode pointer, name, and child vnode pointer.
    vfs:namecache:lookup:hit_negative - negative name cache entry hit, passed the parent directory vnode pointer and name.
    vfs:namecache:lookup:miss - name cache miss, passed the parent directory pointer and the full remaining component name (not terminated after the cache miss component).
    vfs:namecache:purge:done - name cache purge for a vnode, passed the vnode pointer to purge.
    vfs:namecache:purge_negative:done - name cache purge of negative entries for children of a vnode, passed the vnode pointer to purge.
    vfs:namecache:purgevfs - name cache purge for a mountpoint, passed the mount pointer. Separate probes will also be invoked for each cache entry zapped.
    vfs:namecache:zap:done - name cache entry zapped, passed the parent directory vnode pointer, name, and child vnode pointer.
    vfs:namecache:zap_negative:done - negative name cache entry zapped, passed the parent directory vnode pointer and name.
  For any probes involving an extant name cache entry (enter, hit, zap), we use the nul-terminated string for the name component. For misses, the remainder of the path, including later components, is provided as an argument instead since there is no handy nul-terminated version of the string around. This is arguably a bug.
  MFC after: 1 month
  Sponsored by: Google, Inc.
  Reviewed by: jhb, kan, kib (earlier version)
  [rwatson, 2009-04-07, 1 file, -10/+96]
* Revert change 190655 temporarily. It breaks many setups where nullfs is used and needs to be revisited.
  [kan, 2009-04-04, 1 file, -1/+1]
* vn_vptocnp() unlocks the name cache and forgets to re-lock it before returning in one error case, and mistakenly unlocks it for the umount -f case.
  [peter, 2009-04-02, 1 file, -1/+1]
* Replace v_dd vnode pointer with v_cache_dd pointer to struct namecache in directory vnodes. Allow namecache dotdot entry to be created pointing from child vnode to parent vnode if no existing links in opposite direction exist. Use direct link from parent to child for dotdot lookups otherwise. This restores more efficient dotdot caching in NFS filesystems which was lost when vnodes stopped being type stable.
  Reviewed by: kib
  [kan, 2009-03-29, 1 file, -33/+90]
* When a file lookup fails due to encountering a doomed vnode from a forced unmount, consistently return ENOENT rather than EBADF.
  Reviewed by: kib
  MFC after: 1 month
  [jhb, 2009-03-24, 1 file, -3/+3]
* Do not underflow the buffer and then report the problem. Check for the condition before the buffer write. Also, since buflen is unsigned, the previous check never fired.
  Reviewed by: marcus
  Tested by: pho
  [kib, 2009-03-20, 1 file, -6/+6]
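The underflow fix above can be illustrated with a minimal userland sketch. It is not the kernel code: prepend_component() and its parameters are hypothetical names, and the only point shown is that an unsigned free-space counter must be compared before it is decremented.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from vfs_cache.c): copy one name component
 * in front of the text already stored at buf[*buflen..].
 * *buflen is the unsigned count of free bytes at the start of buf.
 */
static int
prepend_component(char *buf, size_t *buflen, const char *name, size_t namelen)
{
	/*
	 * Test before subtracting. With an unsigned counter, a check
	 * such as "*buflen - namelen < 0" can never be true.
	 */
	if (*buflen < namelen)
		return (ENOMEM);
	*buflen -= namelen;
	memcpy(buf + *buflen, name, namelen);
	return (0);
}
```

A caller would start with buflen equal to the buffer size minus one, store the terminating nul at that offset, and then prepend components from the leaf toward the root.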
* Remove unneeded braces to reduce used vertical screen space. The location was missed in r190140.
  [kib, 2009-03-20, 1 file, -2/+1]
* Do not forget to adjust buflen for the first resolution of the path from namecache. While there, compare pointers for equality.
  Reviewed by: marcus
  Tested by: pho
  [kib, 2009-03-20, 1 file, -1/+2]
* The nc_nlen member of the struct namecache contains the length of the cached name, not the length + 1.
  PR: 132620, 132542
  Reported by: bf2006a yahoo com
  Tested by: bf2006a, pho
  Reviewed by: marcus
  [kib, 2009-03-20, 1 file, -1/+1]
* When ktracing namei operations, log the result of __getcwd().
  MFC after: 1 week
  [kib, 2009-03-20, 1 file, -0/+9]
* Remove unneeded braces to reduce used vertical screen space.
  [kib, 2009-03-20, 1 file, -4/+2]
* Move the debug.hashstat sysctl tree under DIAGNOSTIC. I measured the debug.hashstat.rawnchash sysctl in particular as taking 7 milliseconds on a 3GHz Intel Xeon (4x2) running 7.1. It accounted for almost a quarter of the total runtime of 'sysctl -a'. It also performs lots of copyout's while holding the namecache lock (this does not attempt to fix that).
  MFC after: 2 weeks
  [jhb, 2009-03-09, 1 file, -0/+2]
* Enable caching of negative pathname lookups in the NFS client. To avoid stale entries, we save a copy of the directory's modification time when the first negative cache entry was added in the directory's NFS node. When a negative cache entry is hit during a pathname lookup, the parent directory's modification time is checked. If it has changed, all of the negative cache entries for that parent are purged and the lookup falls back to using the RPC. This required adding a new cache_purge_negative() method to the name cache to purge only negative cache entries for a given directory.
  Submitted by: mohans, Rick Macklem, Ricardo Labiaga @ NetApp
  Reviewed by: mohans
  [jhb, 2009-02-19, 1 file, -0/+18]
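The staleness check described in this entry can be sketched as follows. This is a toy model under stated assumptions, not the NFS client code: struct toy_dir, its fields, and the helper functions are all hypothetical, and a plain array stands in for the name cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <time.h>

#define MAXNEG 8

/* Hypothetical per-directory state; field names are illustrative only. */
struct toy_dir {
	time_t mtime;			/* current modification time */
	time_t neg_mtime;		/* mtime saved with the first negative entry */
	const char *neg[MAXNEG];	/* names known not to exist */
	int nneg;
};

static void
enter_negative(struct toy_dir *d, const char *name)
{
	if (d->nneg == 0)
		d->neg_mtime = d->mtime;	/* remember when caching began */
	if (d->nneg < MAXNEG)
		d->neg[d->nneg++] = name;
}

/* Drop only the negative entries of this directory. */
static void
purge_negative(struct toy_dir *d)
{
	d->nneg = 0;
}

/*
 * Returns true if the cache alone may answer "no such file".
 * Returns false when the caller has to fall back to asking the server.
 */
static bool
negative_hit(struct toy_dir *d, const char *name)
{
	for (int i = 0; i < d->nneg; i++) {
		if (strcmp(d->neg[i], name) != 0)
			continue;
		if (d->mtime != d->neg_mtime) {
			/* Directory changed: every negative entry is suspect. */
			purge_negative(d);
			return (false);
		}
		return (true);
	}
	return (false);
}
```

Purging all negative entries at once, rather than only the one that was hit, matches the commit's description: a single saved modification time covers every negative entry in the directory.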
* Convert the global mutex protecting the directory lookup name cache from a mutex to a reader/writer lock. Lookup operations first grab a read lock and perform the lookup. If the operation results in a need to modify the cache, then it tries to do an upgrade. If that fails, it drops the read lock, obtains a write lock, and redoes the lookup.
  [jhb, 2009-01-28, 1 file, -46/+81]
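The read, try-upgrade, then relock-and-redo sequence can be sketched with a toy single-threaded lock. Everything here is hypothetical: struct toy_rwlock is not the kernel's rwlock, and the fail_upgrade flag exists only to simulate contention from another reader so the fallback path can be exercised.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy stand-in for a reader/writer lock; it only tracks state. */
struct toy_rwlock {
	int readers;
	bool writer;
	bool fail_upgrade;	/* simulate a competing reader */
};

static void rlock(struct toy_rwlock *l)   { l->readers++; }
static void runlock(struct toy_rwlock *l) { l->readers--; }
static void wlock(struct toy_rwlock *l)   { l->writer = true; }
static void wunlock(struct toy_rwlock *l) { l->writer = false; }

/* Upgrade succeeds only for a sole reader with no simulated contention. */
static bool
try_upgrade(struct toy_rwlock *l)
{
	if (l->fail_upgrade || l->readers != 1)
		return (false);
	l->readers = 0;
	l->writer = true;
	return (true);
}

#define SLOTS 4
struct toy_cache {
	struct toy_rwlock lock;
	const char *names[SLOTS];
	int used;
	int relookups;		/* how often the lookup had to be redone */
};

static bool
find(const struct toy_cache *c, const char *name)
{
	for (int i = 0; i < c->used; i++)
		if (strcmp(c->names[i], name) == 0)
			return (true);
	return (false);
}

/* Returns true on a cache hit; on a miss the name is entered. */
static bool
lookup_or_enter(struct toy_cache *c, const char *name)
{
	bool hit;

	rlock(&c->lock);
	hit = find(c, name);
	if (hit) {
		runlock(&c->lock);
		return (true);
	}
	if (!try_upgrade(&c->lock)) {
		/*
		 * The cache may change between dropping the read lock
		 * and obtaining the write lock, so look again.
		 */
		runlock(&c->lock);
		wlock(&c->lock);
		c->relookups++;
		hit = find(c, name);
	}
	if (!hit && c->used < SLOTS)
		c->names[c->used++] = name;
	wunlock(&c->lock);
	return (hit);
}
```

The second lookup is what makes the fallback safe: without it, two threads that both missed under the read lock could each insert the same name.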
* - Mark all standalone INT/LONG/QUAD sysctl's MPSAFE. This is done inside the SYSCTL() macros and thus does not need to be done for all of the nodes scattered across the source tree.
  - Mark the name-cache related sysctl's (including debug.hashstat.*) MPSAFE.
  - Mark vm.loadavg MPSAFE.
  - Remove GIANT_REQUIRED from vmtotal() (everything in this routine already has sufficient locking) and mark vm.vmtotal MPSAFE.
  - Mark the vm.stats.(sys|vm).* sysctls MPSAFE.
  [jhb, 2009-01-23, 1 file, -6/+8]
* Add a limit on namecache entries.
  In normal operation, the number of cache entries is roughly equal to the number of active vnodes. However, when most of the recently accessed vnodes have many hard links, the number of cache entries can be 32000 times as large, exhausting kernel memory and provoking a panic in kmem_malloc().
  MFC after: 2 weeks
  [mckay, 2009-01-20, 1 file, -0/+6]
* In r185557, the check for an existing negative entry for the given name did not compare nc_dvp with the supplied parent directory vnode pointer. Add the check and note that now the branches for vp != NULL and vp == NULL are the same, thus can be merged.
  Reported and reviewed by: kan
  Tested by: pho
  MFC after: 2 weeks
  [kib, 2008-12-30, 1 file, -22/+11]
* Do not KASSERT when vp->v_dd is NULL. Only directories which have had ".." looked up would have v_dd set to a non-NULL value. This fixes a panic seen when running installworld on a diskless system with a separate /usr file system.
  Submitted by: cracauer
  Approved by: kib
  [marcus, 2008-12-23, 1 file, -1/+1]
* Keep the hold on the vnode during VOP_VPTOCNP() call, allowing the vop implementation to drop vnode lock, if needed.
  Reported and tested by: pho
  [kib, 2008-12-23, 1 file, -1/+1]
* Add a new VOP, VOP_VPTOCNP, which translates a vnode to its component name on a best-effort basis. Teach vn_fullpath to use this new VOP if a regular VFS cache lookup fails. This VOP is designed to supplement the VFS cache to provide a better chance that a vnode-to-name lookup will succeed.
  Currently, an implementation for devfs is being committed. The default implementation is to return ENOENT.
  A big thanks to kib for the mentorship on this, and to pho for running it through his stress test suite.
  Reviewed by: arch
  Approved by: kib
  [marcus, 2008-12-12, 1 file, -24/+77]
* Shared lookup makes it possible to create several negative cache entries for one name. Then, creating an inode with that name would remove one entry, leaving the others dormant. Reclaiming the vnode would uncover the negative entries, causing a false return of ENOENT from calls like stat that do not create an inode.
  Prevent creation of the duplicated negative entries.
  Reported and debugged with: pho
  Reviewed by: jhb
  X-MFC: after shared lookup changes
  [kib, 2008-12-02, 1 file, -4/+11]
* Move vn_fullpath1() outside of FILEDESC locking. This is being done in advance of teaching vn_fullpath1() how to query file systems for vnode-to-name mappings when cache lookups fail.
  Thanks to kib for guidance and patience on this process.
  Reviewed by: kib
  Approved by: kib
  [marcus, 2008-11-25, 1 file, -5/+21]
* Part 1 of making shared lookups more resilient with respect to forced unmounts. When we upgrade a vnode lock from shared to exclusive during a name cache lookup, fail the lookup with EBADF if the vnode is invalidated while we are waiting for the exclusive lock. Also, for correctness (though I'm not sure it can occur in practice), downgrade an exclusively locked vnode if it should be share locked.
  Tested by: pho
  [jhb, 2008-09-24, 1 file, -8/+18]
* Sort includes.
  [jhb, 2008-09-18, 1 file, -8/+8]
* Fix a race condition with concurrent LOOKUP namecache operations for a vnode not in the namecache when shared lookups are enabled (vfs.lookup_shared=1, it is currently off by default) and the filesystem supports shared lookups (e.g. NFS client). Specifically, if multiple concurrent LOOKUPs miss in the name cache in parallel, each of the lookups may end up adding an entry to the namecache, resulting in duplicate entries in the namecache for the same pathname. A subsequent removal of the mapping of that pathname to that vnode (via remove or rename) would only evict one of the entries from the name cache. As a result, subsequent lookups for that pathname would still return the old vnode.
  This race was observed with shared lookups over NFS where a file was updated by writing a new file out to a temporary file name and then renaming that temporary file to the "real" file to effect atomic updates of a file. Other processes on the same client that were periodically reading the file would occasionally receive an ESTALE error from open(2) because the VOP_GETATTR() in nfs_open() would receive that error when given the stale vnode.
  The fix here is to check for duplicates in cache_enter() and just return if an entry for this same directory and leaf file name for this vnode is already in the cache. The check for duplicates is done by walking the per-vnode list of name cache entries. It is expected that this list should be very small in the common case (usually 0 or 1 entries during a cache_enter() since most files only have 1 "leaf" name).
  Reviewed by: ups, scottl
  MFC after: 2 months
  [jhb, 2008-08-23, 1 file, -9/+33]
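The duplicate check described in this entry can be sketched in userland as a walk over a per-vnode list. The types and the function toy_cache_enter() are hypothetical simplifications, not the kernel's struct namecache or cache_enter().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct toy_vnode;

/* One (directory, name) pair that resolves to a vnode. */
struct toy_ncentry {
	struct toy_vnode *dvp;		/* parent directory */
	const char *name;		/* leaf name within dvp */
	struct toy_ncentry *next;
};

struct toy_vnode {
	struct toy_ncentry *names;	/* entries that name this vnode */
};

/*
 * Enter (dvp, name) -> vp unless an identical entry already exists.
 * Returns true if a new entry was added.
 */
static bool
toy_cache_enter(struct toy_vnode *dvp, struct toy_vnode *vp, const char *name)
{
	struct toy_ncentry *n;

	/* The list is expected to be short: usually 0 or 1 entries. */
	for (n = vp->names; n != NULL; n = n->next)
		if (n->dvp == dvp && strcmp(n->name, name) == 0)
			return (false);		/* duplicate, just return */

	n = malloc(sizeof(*n));
	if (n == NULL)
		return (false);
	n->dvp = dvp;
	n->name = name;
	n->next = vp->names;
	vp->names = n;
	return (true);
}
```

A second name in the same directory, or the same name in a different directory, still produces a new entry, which is how hard links remain representable.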
* Prevent crashes due to unlocked access to hash buckets in two sysctls. Use CACHE_LOCK to prevent crashes.
  Sysctls fixed: debug.hashstat.nchash and debug.hashstat.rawnchash.
  Obtained from: Juniper Networks
  MFC After: 1 week
  [alfred, 2008-08-16, 1 file, -0/+4]
* Currently, BSM audit pathname token generation for chrooted or jailed processes is not producing absolute pathname tokens. It is required that audited pathnames are generated relative to the global root mount point. This modification changes our implementation of audit_canon_path(9) and introduces a new function, vn_fullpath_global(9), which performs a vnode -> pathname translation relative to the global mount point based on the contents of the name cache. Much like vn_fullpath, vn_fullpath_global is a wrapper function which calls vn_fullpath1.
  Further, the string parsing routines have been converted to use the sbuf(9) framework. This change also removes the conditional acquisition of Giant, since the vn_fullpath1 method will not dip into file system dependent code.
  The vnode locking was modified to use vhold()/vdrop() instead of vref() and vrele(). This will modify the hold count instead of modifying the user count. This makes more sense since it's the kernel that requires the reference to the vnode. This also makes sure that the vnode does not get recycled while we hold the reference to it. [1]
  Discussed with: rwatson
  Reviewed by: kib [1]
  MFC after: 2 weeks
  [csjp, 2008-07-31, 1 file, -0/+26]
* - Use LK_TYPE_MASK where needed. Actually after sys/sys/lockmgr.h:1.69 it is no longer needed, but for now we still want to be consistent with other similar checks in the tree.
  - Call ASSERT_VOP_ELOCKED() only when vget() returns 0.
  Reviewed by: jeff
  [pjd, 2008-04-09, 1 file, -3/+5]
* Add the utility function vn_commname() to retrieve the command name from the vfs namecache, when available.
  Reviewed by: rwatson, rdivacky
  Tested by: pho
  [kib, 2008-03-31, 1 file, -0/+19]
* In keeping with style(9)'s recommendations on macros, use a ';' after each SYSINIT() macro invocation. This makes a number of lightweight C parsers much happier with the FreeBSD kernel source, including cflow's prcc and lxr.
  MFC after: 1 month
  Discussed with: imp, rink
  [rwatson, 2008-03-16, 1 file, -1/+1]
* Axe the 'thread' argument from VOP_ISLOCKED() and lockstatus() as it is always curthread.
  As the KPI gets broken by this patch, manpages and __FreeBSD_version will be updated by further commits.
  Tested by: Andrea Barberio <insomniac at slackware dot it>
  [attilio, 2008-02-25, 1 file, -5/+3]
* VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjunction with 'thread' argument passing, which is always curthread. Remove the unneeded extra argument and pass curthread explicitly to lower layer functions, when necessary.
  The KPI is broken by this change, which should affect several ports, so the version bump and manpage update will be committed separately.
  Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
  [attilio, 2008-01-13, 1 file, -1/+1]
* vn_lock() is currently only used with 'curthread' passed as argument. Remove this argument and pass curthread directly to the underlying VOP_LOCK1() VFS method. This change makes the code cleaner and in particular removes an annoying dependence, helping the next lockmgr() cleanup. The KPI is, obviously, changed.
  Manpage and FreeBSD_version will be updated through further commits.
  As a side note, the next commits will address a similar cleanup of VFS methods, in particular vop_lock1 and vop_unlock.
  Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>
  [attilio, 2008-01-10, 1 file, -2/+2]
* Remove remaining Giant acquisition around vn_fullpath1. This was missed in r1.106 and has not been required for some years now.
  Reviewed by: jeff
  MFC After: 1 week
  [kris, 2007-11-22, 1 file, -2/+0]
* Fix some locking cases where we ask for an exclusively locked vnode, but we get a share-locked vnode instead when vfs.lookup_shared is set to 1.
  Discussed with: kib, kris
  Tested by: kris
  Approved by: re (kensmith)
  [pjd, 2007-09-21, 1 file, -4/+17]
* We only flush entries related to the given file system. Currently there are no 'invalid' cache entries - the file system is responsible for keeping it that way. The comment should have been updated in rev.1.25.
  [pjd, 2007-06-18, 1 file, -3/+0]
* To avoid a deadlock when handling .. directory during a lookup, we unlock parent vnode and relock it after locking child vnode. The problem was that we always relock it exclusively, even when it was share-locked.
  Discussed with: jeff
  [pjd, 2007-05-25, 1 file, -3/+6]
* We no longer need to put namecache entries onto temporary mplist. It was useful in revision 1.86, but should have been removed in 1.89.
  [pjd, 2007-05-25, 1 file, -11/+3]
* The cache_leaf_test() function seems to be unused, so remove it.
  [pjd, 2007-05-25, 1 file, -31/+0]