path: root/sys/kern/vfs_cache.c
Commit message | Author | Age | Files | Lines
...
* Do not forget to adjust buflen for the first resolution of the path | kib | 2009-03-20 | 1 | -1/+2
  from namecache. While there, compare pointers for equality.
  Reviewed by: marcus
  Tested by: pho
* The nc_nlen member of the struct namecache contains the length of the cached | kib | 2009-03-20 | 1 | -1/+1
  name, not the length + 1.
  PR: 132620, 132542
  Reported by: bf2006a yahoo com
  Tested by: bf2006a, pho
  Reviewed by: marcus
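The two fixes above both concern the arithmetic of assembling a path backwards in a fixed buffer. The following is a minimal userspace sketch of that arithmetic, not the kernel code: `toy_ncentry`, `toy_prepend` and `toy_fullpath` are hypothetical names, and the only property taken from the commit messages is that the stored length excludes any terminator, so each component costs exactly `nlen + 1` bytes (the name plus one `/`).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the cached-name fields of struct namecache. */
struct toy_ncentry {
	const char *name;	/* component name */
	size_t nlen;		/* length of name, NOT length + 1 */
};

/*
 * Prepend "/name" in front of *bp.  Each component consumes nlen bytes
 * for the name plus one byte for the separator, and buflen must be
 * adjusted by exactly that amount.
 */
static int
toy_prepend(const struct toy_ncentry *nc, char **bp, size_t *buflen)
{
	if (*buflen < nc->nlen + 1)
		return (-1);
	*bp -= nc->nlen;
	memcpy(*bp, nc->name, nc->nlen);
	*--(*bp) = '/';
	*buflen -= nc->nlen + 1;
	return (0);
}

/*
 * Build a path from a leaf-first chain of components, filling buf from
 * the end.  Returns a pointer to the start of the path, or NULL if the
 * buffer is too small.
 */
static char *
toy_fullpath(const struct toy_ncentry *chain, size_t n, char *buf, size_t size)
{
	char *bp;
	size_t buflen, i;

	if (size == 0)
		return (NULL);
	bp = buf + size - 1;
	*bp = '\0';
	buflen = size - 1;	/* the terminator is already accounted for */
	for (i = 0; i < n; i++)
		if (toy_prepend(&chain[i], &bp, &buflen) != 0)
			return (NULL);
	return (bp);
}
```

With the chain `vfs_cache.c`, `kern`, `sys`, the result needs 21 characters plus the terminator, so a 22-byte buffer is an exact fit and a 21-byte buffer fails. An off-by-one in either the stored length or the buflen adjustment would change that boundary.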
* When ktracing namei operations, log the result of the __getcwd(). | kib | 2009-03-20 | 1 | -0/+9
  MFC after: 1 week
* Remove unneeded braces to reduce used vertical screen space. | kib | 2009-03-20 | 1 | -4/+2
* Move the debug.hashstat sysctl tree under DIAGNOSTIC. I measured the | jhb | 2009-03-09 | 1 | -0/+2
  debug.hashstat.rawnchash sysctl in particular as taking 7 milliseconds on a 3GHz Intel Xeon (4x2) running 7.1. It accounted for almost a quarter of the total runtime of 'sysctl -a'. It also performs lots of copyouts while holding the namecache lock (this does not attempt to fix that).
  MFC after: 2 weeks
* Enable caching of negative pathname lookups in the NFS client. To avoid | jhb | 2009-02-19 | 1 | -0/+18
  stale entries, we save a copy of the directory's modification time when the first negative cache entry was added in the directory's NFS node. When a negative cache entry is hit during a pathname lookup, the parent directory's modification time is checked. If it has changed, all of the negative cache entries for that parent are purged and the lookup falls back to using the RPC. This required adding a new cache_purge_negative() method to the name cache to purge only negative cache entries for a given directory.
  Submitted by: mohans, Rick Macklem, Ricardo Labiaga @ NetApp
  Reviewed by: mohans
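The validation scheme described in this entry can be modelled in a few lines. This is a toy userspace sketch under stated assumptions, not the NFS client or name cache code: `toy_dir` and every `toy_*` function are hypothetical, the negative entries are a small fixed array, and "falls back to the RPC" is represented simply by returning `false`.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <time.h>

#define TOY_MAXNEG 8

/* Hypothetical per-directory state; the real state lives in the NFS node. */
struct toy_dir {
	time_t mtime;			/* directory's current modification time */
	time_t neg_mtime;		/* mtime saved with the first negative entry */
	int nneg;			/* number of negative entries */
	const char *neg[TOY_MAXNEG];	/* names known not to exist */
};

/* Drop only the negative entries of this directory. */
static void
toy_purge_negative(struct toy_dir *d)
{
	d->nneg = 0;
}

/* Record that "name" does not exist in d. */
static void
toy_enter_negative(struct toy_dir *d, const char *name)
{
	if (d->nneg == TOY_MAXNEG)
		return;
	if (d->nneg == 0)
		d->neg_mtime = d->mtime;	/* remember when the set became valid */
	d->neg[d->nneg++] = name;
}

/*
 * Returns true if the lookup may be answered "no such file" from the
 * cache.  Returns false if the caller has to do a real lookup, either
 * because there is no entry or because the directory changed.
 */
static bool
toy_negative_hit(struct toy_dir *d, const char *name)
{
	int i;

	for (i = 0; i < d->nneg; i++) {
		if (strcmp(d->neg[i], name) != 0)
			continue;
		if (d->mtime != d->neg_mtime) {
			/* Directory was modified: every negative entry is suspect. */
			toy_purge_negative(d);
			return (false);
		}
		return (true);
	}
	return (false);
}
```

Saving a single timestamp per directory, rather than one per entry, is what makes it necessary to purge all negative entries of that directory at once when the timestamp no longer matches.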
* Convert the global mutex protecting the directory lookup name cache from a | jhb | 2009-01-28 | 1 | -46/+81
  mutex to a reader/writer lock. Lookup operations first grab a read lock and perform the lookup. If the operation results in a need to modify the cache, then it tries to do an upgrade. If that fails, it drops the read lock, obtains a write lock, and redoes the lookup.
* - Mark all standalone INT/LONG/QUAD sysctl's MPSAFE. This is done | jhb | 2009-01-23 | 1 | -6/+8
    inside the SYSCTL() macros and thus does not need to be done for all of the nodes scattered across the source tree.
  - Mark the name-cache related sysctl's (including debug.hashstat.*) MPSAFE.
  - Mark vm.loadavg MPSAFE.
  - Remove GIANT_REQUIRED from vmtotal() (everything in this routine already has sufficient locking) and mark vm.vmtotal MPSAFE.
  - Mark the vm.stats.(sys|vm).* sysctls MPSAFE.
* Add a limit on namecache entries. | mckay | 2009-01-20 | 1 | -0/+6
  In normal operation, the number of cache entries is roughly equal to the number of active vnodes. However, when most of the recently accessed vnodes have many hard links, the number of cache entries can be 32000 times as large, exhausting kernel memory and provoking a panic in kmem_malloc().
  MFC after: 2 weeks
* In r185557, the check for existing negative entry for the given name | kib | 2008-12-30 | 1 | -22/+11
  did not compare nc_dvp with the supplied parent directory vnode pointer. Add the check and note that now the branches for vp != NULL and vp == NULL are the same, and thus can be merged.
  Reported and reviewed by: kan
  Tested by: pho
  MFC after: 2 weeks
* Do not KASSERT when vp->v_dd is NULL. Only directories which have had ".." | marcus | 2008-12-23 | 1 | -1/+1
  looked up would have v_dd set to a non-NULL value. This fixes a panic seen when running installworld on a diskless system with a separate /usr file system.
  Submitted by: cracauer
  Approved by: kib
* Keep the hold on the vnode during VOP_VPTOCNP() call, allowing the vop | kib | 2008-12-23 | 1 | -1/+1
  implementation to drop the vnode lock, if needed.
  Reported and tested by: pho
* Add a new VOP, VOP_VPTOCNP, which translates a vnode to its component name | marcus | 2008-12-12 | 1 | -24/+77
  on a best-effort basis. Teach vn_fullpath to use this new VOP if a regular VFS cache lookup fails. This VOP is designed to supplement the VFS cache to provide a better chance that a vnode-to-name lookup will succeed. Currently, an implementation for devfs is being committed. The default implementation is to return ENOENT.
  A big thanks to kib for the mentorship on this, and to pho for running it through his stress test suite.
  Reviewed by: arch
  Approved by: kib
* Shared lookup makes it possible to create several negative cache | kib | 2008-12-02 | 1 | -4/+11
  entries for one name. Then, creating an inode with that name would remove one entry, leaving the others dormant. Reclaiming the vnode would uncover the negative entries, causing a false return of ENOENT from calls like stat that do not create an inode. Prevent creation of the duplicated negative entries.
  Reported and debugged with: pho
  Reviewed by: jhb
  X-MFC: after shared lookup changes
* Move vn_fullpath1() outside of FILEDESC locking. This is being done in | marcus | 2008-11-25 | 1 | -5/+21
  advance of teaching vn_fullpath1() how to query file systems for vnode-to-name mappings when cache lookups fail.
  Thanks to kib for guidance and patience on this process.
  Reviewed by: kib
  Approved by: kib
* Part 1 of making shared lookups more resilient with respect to forced | jhb | 2008-09-24 | 1 | -8/+18
  unmounts. When we upgrade a vnode lock from shared to exclusive during a name cache lookup, fail the lookup with EBADF if the vnode is invalidated while we are waiting for the exclusive lock. Also, for correctness (though I'm not sure it can occur in practice), downgrade an exclusively locked vnode if it should be share locked.
  Tested by: pho
* Sort includes. | jhb | 2008-09-18 | 1 | -8/+8
* Fix a race condition with concurrent LOOKUP namecache operations for a vnode | jhb | 2008-08-23 | 1 | -9/+33
  not in the namecache when shared lookups are enabled (vfs.lookup_shared=1, it is currently off by default) and the filesystem supports shared lookups (e.g. NFS client).
  Specifically, if multiple concurrent LOOKUPs miss in the name cache in parallel, each of the lookups may end up adding an entry to the namecache, resulting in duplicate entries in the namecache for the same pathname. A subsequent removal of the mapping of that pathname to that vnode (via remove or rename) would only evict one of the entries from the name cache. As a result, subsequent lookups for that pathname would still return the old vnode.
  This race was observed with shared lookups over NFS where a file was updated by writing a new file out to a temporary file name and then renaming that temporary file to the "real" file to effect atomic updates of a file. Other processes on the same client that were periodically reading the file would occasionally receive an ESTALE error from open(2) because the VOP_GETATTR() in nfs_open() would receive that error when given the stale vnode.
  The fix here is to check for duplicates in cache_enter() and just return if an entry for this same directory and leaf file name for this vnode is already in the cache. The check for duplicates is done by walking the per-vnode list of name cache entries. It is expected that this list should be very small in the common case (usually 0 or 1 entries during a cache_enter() since most files only have 1 "leaf" name).
  Reviewed by: ups, scottl
  MFC after: 2 months
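The duplicate check described in the last paragraph of this entry can be sketched as a walk over a per-vnode linked list. This is a simplified userspace model and not the body of cache_enter(): `toy_vnode`, `toy_ncentry` and `toy_cache_enter` are hypothetical, there is no hashing or locking, and names are stored in a fixed 32-byte field purely for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct toy_vnode;

/* Hypothetical cache entry: "name" in directory dvp resolves to some vnode. */
struct toy_ncentry {
	struct toy_ncentry *next;	/* next entry naming the same vnode */
	struct toy_vnode *dvp;		/* parent directory */
	char name[32];			/* leaf name */
};

struct toy_vnode {
	struct toy_ncentry *names;	/* all entries that resolve to this vnode */
};

/*
 * Add (dvp, name) -> vp unless an identical entry already exists.
 * Returns 1 if an entry was added and 0 otherwise.
 */
static int
toy_cache_enter(struct toy_vnode *dvp, struct toy_vnode *vp, const char *name)
{
	struct toy_ncentry *nc;

	if (strlen(name) >= sizeof(nc->name))
		return (0);
	/* The per-vnode list is expected to hold 0 or 1 entries in the common case. */
	for (nc = vp->names; nc != NULL; nc = nc->next)
		if (nc->dvp == dvp && strcmp(nc->name, name) == 0)
			return (0);	/* a racing lookup already added it */
	nc = malloc(sizeof(*nc));
	if (nc == NULL)
		return (0);
	nc->dvp = dvp;
	strcpy(nc->name, name);
	nc->next = vp->names;
	vp->names = nc;
	return (1);
}
```

Entering the same directory and name twice leaves one entry, while a hard link (same vnode, different directory or name) still gets its own entry. In the real cache the check only helps because it runs under the cache lock that also covers the insertion.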
* Prevent crashes due to unlocked access to hash buckets in two sysctls. | alfred | 2008-08-16 | 1 | -0/+4
  Use CACHE_LOCK to prevent crashes.
  Sysctls fixed: debug.hashstat.nchash and debug.hashstat.rawnchash.
  Obtained from: Juniper Networks
  MFC After: 1 week
* Currently, BSM audit pathname token generation for chrooted or jailed | csjp | 2008-07-31 | 1 | -0/+26
  processes is not producing absolute pathname tokens. It is required that audited pathnames are generated relative to the global root mount point. This modification changes our implementation of audit_canon_path(9) and introduces a new function, vn_fullpath_global(9), which performs a vnode -> pathname translation relative to the global mount point based on the contents of the name cache. Much like vn_fullpath, vn_fullpath_global is a wrapper function which calls vn_fullpath1.
  Further, the string parsing routines have been converted to use the sbuf(9) framework. This change also removes the conditional acquisition of Giant, since the vn_fullpath1 method will not dip into file system dependent code.
  The vnode locking was modified to use vhold()/vdrop() instead of vref() and vrele(). This will modify the hold count instead of modifying the user count. This makes more sense since it's the kernel that requires the reference to the vnode. This also makes sure that the vnode does not get recycled while we hold the reference to it. [1]
  Discussed with: rwatson
  Reviewed by: kib [1]
  MFC after: 2 weeks
* - Use LK_TYPE_MASK where needed. Actually after sys/sys/lockmgr.h:1.69 it is | pjd | 2008-04-09 | 1 | -3/+5
    no longer needed, but for now we still want to be consistent with other similar checks in the tree.
  - Call ASSERT_VOP_ELOCKED() only when vget() returns 0.
  Reviewed by: jeff
* Add the utility function vn_commname() to retrieve the command name | kib | 2008-03-31 | 1 | -0/+19
  from the vfs namecache, when available.
  Reviewed by: rwatson, rdivacky
  Tested by: pho
* In keeping with style(9)'s recommendations on macros, use a ';' | rwatson | 2008-03-16 | 1 | -1/+1
  after each SYSINIT() macro invocation. This makes a number of lightweight C parsers much happier with the FreeBSD kernel source, including cflow's prcc and lxr.
  MFC after: 1 month
  Discussed with: imp, rink
* Axe the 'thread' argument from VOP_ISLOCKED() and lockstatus() as it is | attilio | 2008-02-25 | 1 | -5/+3
  always curthread.
  As the KPI gets broken by this patch, manpages and __FreeBSD_version will be updated by further commits.
  Tested by: Andrea Barberio <insomniac at slackware dot it>
* VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in | attilio | 2008-01-13 | 1 | -1/+1
  conjunction with 'thread' argument passing which is always curthread. Remove the unneeded extra argument and pass curthread explicitly to lower layer functions, when necessary.
  The KPI is broken by this change, which should affect several ports, so a version bump and manpage update will be committed separately.
  Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
* vn_lock() is currently only used with the 'curthread' passed as argument. | attilio | 2008-01-10 | 1 | -2/+2
  Remove this argument and pass curthread directly to the underlying VOP_LOCK1() VFS method. This modification makes the code cleaner and, in particular, removes an annoying dependency, helping the next lockmgr() cleanup. The KPI is, obviously, changed.
  Manpage and FreeBSD_version will be updated through further commits.
  As a side note, subsequent commits will address a similar cleanup of VFS methods, in particular vop_lock1 and vop_unlock.
  Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>
* Remove remaining Giant acquisition around vn_fullpath1. This was missed | kris | 2007-11-22 | 1 | -2/+0
  in r1.106 and has not been required for some years now.
  Reviewed by: jeff
  MFC After: 1 week
* Fix some locking cases where we ask for exclusively locked vnode, but we get | pjd | 2007-09-21 | 1 | -4/+17
  a shared locked vnode instead when vfs.lookup_shared is set to 1.
  Discussed with: kib, kris
  Tested by: kris
  Approved by: re (kensmith)
* We only flush entries related to the given file system. Currently there are | pjd | 2007-06-18 | 1 | -3/+0
  no 'invalid' cache entries - the file system is responsible for keeping it that way. The comment should have been updated in rev.1.25.
* To avoid a deadlock when handling .. directory during a lookup, we unlock | pjd | 2007-05-25 | 1 | -3/+6
  parent vnode and relock it after locking child vnode. The problem was that we always relock it exclusively, even when it was share-locked.
  Discussed with: jeff
* We no longer need to put namecache entries onto temporary mplist. | pjd | 2007-05-25 | 1 | -11/+3
  It was useful in revision 1.86, but should have been removed in 1.89.
* The cache_leaf_test() function seems to be unused, so remove it. | pjd | 2007-05-25 | 1 | -31/+0
* - Remove redundant initialization. | pjd | 2007-05-22 | 1 | -2/+1
  - Compare pointer with NULL.
* Replace custom file descriptor array sleep lock constructed using a mutex | rwatson | 2007-04-04 | 1 | -4/+4
  and flags with an sxlock. This leads to a significant and measurable performance improvement as a result of access to shared locking for frequent lookup operations, reduced general overhead, and reduced overhead in the event of contention. All of these are important for threaded applications where simultaneous access to a shared file descriptor array occurs frequently. Kris has reported 2x-4x transaction rate improvements on 8-core MySQL benchmarks; smaller improvements can be expected for many workloads as a result of reduced overhead.
  - Generally eliminate the distinction between "fast" and regular acquisition of the filedesc lock; the plan is that they will now all be fast. Change all locking instances to either shared or exclusive locks.
  - Correct a bug (pointed out by kib) in fdfree() where previously msleep() was called without the mutex held; sx_sleep() is now always called with the sxlock held exclusively.
  - Universally hold the struct file lock over changes to struct file, rather than the filedesc lock or no lock. Always update the f_ops field last. A further memory barrier is required here in the future (discussed with jhb).
  - Improve locking and reference management in linux_at(), which fails to properly acquire vnode references before using vnode pointers. Annotate improper use of vn_fullpath(), which will be replaced at a future date.
  In fcntl(), we conservatively acquire an exclusive lock, even though in some cases a shared lock may be sufficient, which should be revisited. The dropping of the filedesc lock in fdgrowtable() is no longer required as the sxlock can be held over the sleep operation; we should consider removing that (pointed out by attilio).
  Tested by: kris
  Discussed with: jhb, kris, attilio, jeff
* Further system call comment cleanup: | rwatson | 2007-03-05 | 1 | -1/+1
  - Remove also "MP SAFE" after prior "MPSAFE" pass. (suggested by bde)
  - Remove extra blank lines in some cases.
  - Add extra blank lines in some cases.
  - Remove no-op comments consisting solely of the function name, the word "syscall", or the system call name.
  - Add punctuation.
  - Re-wrap some comments.
* Axe Giant from vn_fullpath(9). The vnode -> pathname lookup should be | csjp | 2006-06-16 | 1 | -4/+0
  filesystem agnostic. We are not touching any file system specific functions in this code path. Since we have a cache lock, there is really no need to keep Giant around here. This eliminates Giant acquisitions for any syscall which is auditing pathnames.
  Discussed with: jeff
* remove duplicate sizeof vnode entry (debug.sizeof.vnode already existed)... | jmg | 2006-04-16 | 1 | -2/+2
  move ncsize into debug.sizeof and rename to namecache...
* - Don't check v_mount for NULL to determine if a vnode has been recycled. | jeff | 2006-02-06 | 1 | -1/+1
    Use the more appropriate VI_DOOMED flag instead.
  Sponsored by: Isilon Systems, Inc.
  MFC After: 1 week
* - Fix a leaked reference to a vnode via v_dd. We rely on cache_purge() and | jeff | 2005-06-17 | 1 | -1/+11
    cache_zap() to clear the v_dd pointers when a directory vnode is forcibly discarded. For this to work, all vnodes with v_dd pointers to a directory must also have name cache entries linked via v_cache_dst to that dvp, otherwise we could not find them at cache_purge() time. The following code snippet could break this guarantee by unlinking a directory before fetching its dotdot. The dotdot lookup would initialize the v_dd field of the unlinked directory which could never be cleared. To fix this we don't initialize v_dd for orphaned vnodes.

        printf("rmdir: %d\n", rmdir("../foo")); /* foo is cwd */
        printf("chdir: %d\n", chdir(".."));
        printf("%s\n", getwd(NULL));

  Sponsored by: Isilon Systems, Inc.
  Discovered by: kkenn
  Approved by: re (blanket vfs)
* - Clear v_dd in cache_zap() instead of cache_purge() as cache_purge() may | jeff | 2005-06-13 | 1 | -13/+3
    not be called in all cases where we free the cnp.
  Sponsored by: Isilon Systems, Inc.
* - Add KTR_VFS messages for various name cache related events. | jeff | 2005-06-13 | 1 | -0/+9
  Sponsored by: Isilon Systems, Inc.
* - Assert that we're not adding a doomed vnode to the name cache. | jeff | 2005-06-11 | 1 | -0/+3
  Sponsored by: Isilon Systems, Inc.
* - Change all filesystems and vfs_cache to relock the dvp once the child is | jeff | 2005-04-13 | 1 | -3/+5
    locked in the ISDOTDOT case. See vfs_lookup.c r1.79 for details.
  Sponsored by: Isilon Systems, Inc.
* Eliminate v_id and v_ddid. The name cache now holds references to | das | 2005-03-30 | 1 | -36/+10
  vnodes whose names it caches, so we no longer need a `generation number' to tell us if a referenced vnode is invalid. Replace the use of the parent's v_id in the hash function with the address of the parent vnode.
  Tested by: Peter Holm
  Glanced at by: jeff, phk
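The last sentence of this entry, mixing the parent vnode's address into the hash in place of a generation number, can be illustrated as follows. This is a sketch under stated assumptions: the commit message does not say which hash function is used, so an FNV-1 style 32-bit byte hash is assumed here, and `toy_fnv1_32` and `toy_nchash` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit FNV-1 over a byte buffer, continuing from hval (assumed hash). */
static uint32_t
toy_fnv1_32(const void *buf, size_t len, uint32_t hval)
{
	const unsigned char *p = buf;

	while (len-- != 0) {
		hval *= 16777619u;	/* FNV 32-bit prime */
		hval ^= *p++;
	}
	return (hval);
}

/*
 * Hash a (parent directory, component name) pair.  The bytes of the
 * parent pointer itself are folded in after the name, so the same name
 * in two directories generally lands in different buckets.
 */
static uint32_t
toy_nchash(const void *dvp, const char *name, size_t nlen)
{
	uint32_t h;

	h = toy_fnv1_32(name, nlen, 2166136261u);	/* FNV-1 offset basis */
	return (toy_fnv1_32(&dvp, sizeof(dvp), h));
}
```

Using the address is only sound because, as the entry says, the cache holds a reference on the vnodes it names, so a parent's address cannot be reused while entries hashed with it still exist.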
* Merge kern___cwd() and vn_fullpath(), which were virtually identical, | das | 2005-03-30 | 1 | -132/+89
  except for places where people forget to update one of them. We now collect only one set of stats for both of these routines. Other changes in this commit include:
  - Start acquiring Giant again in vn_fullpath(), since it is required when crossing a mount point.
  - Expand the scope of the cache lock to avoid dropping it and picking it up again for every pathname component. This also makes it trivial to avoid races in stats collection.
  - Assert that nc_dvp == v_dd for directories instead of returning an error to userland when this is not true. AFAIK, it should always be true when v_dd is non-null.
  - For vn_fullpath(), handle the first (non-directory) vnode separately.
  Glanced at by: jeff, phk
* - Move the logic that locks and refs the new vnode from vfs_cache_lookup() | jeff | 2005-03-29 | 1 | -35/+33
    to cache_lookup(). This allows us to acquire the vnode interlock before dropping the cache lock. This protects the vnode's identity until we have locked it.
  Sponsored by: Isilon Systems, Inc.
* - Get rid of the old LOOKUP_SHARED code. namei() now supplies the | jeff | 2005-03-29 | 1 | -30/+6
    proper lock flags via cn_lkflag.
  Sponsored by: Isilon Systems, Inc.
* - Invalidate the children's v_dd pointers when we cache_purge() a directory. | jeff | 2005-03-29 | 1 | -8/+15
    Otherwise the stale pointer may be accessed after a vnode is freed.
  Sponsored by: Isilon Systems, Inc.
* - Remove an unused variable. | jeff | 2005-03-28 | 1 | -2/+0
  Sponsored by: Isilon Systems, Inc.
* - We no longer have to bother with PDIRUNLOCK, lookup() handles it for us. | jeff | 2005-03-28 | 1 | -21/+4
  Sponsored by: Isilon Systems, Inc.