author | mjg <mjg@FreeBSD.org> | 2016-12-31 12:32:50 +0000 |
---|---|---|
committer | mjg <mjg@FreeBSD.org> | 2016-12-31 12:32:50 +0000 |
commit | 6fe205a80e552bb298e677612dea8c87d6afe6e4 (patch) | |
tree | 6078c133ed387749fdac45fad3346cc4df168c81 /sys/cddl | |
parent | 1af6bcb527e97aaddea67934bd4fcbd6cb8a1d01 (diff) | |
MFC r305378,r305379,r305386,r305684,r306224,r306608,r306803,r307650,r307685,
r308407,r308665,r308667,r309067:
cache: put all negative entry management code into dedicated functions
==
cache: manage negative entry list with a dedicated lock
Since negative entries are managed with an LRU list, a hit requires a
modification.
Currently the code tries to upgrade the global lock if needed and is
forced to retry the lookup if it fails.
Provide a dedicated lock for use when the cache is only shared-locked.
==
cache: defer freeing entries until after the global lock is dropped
This also defers vdrop for held vnodes.
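
The pattern can be sketched in userland as follows. This is only an illustration under assumptions: pthreads and malloc/free stand in for the kernel primitives, and cache_add and cache_purge are hypothetical names. Entries are unlinked onto a local list while the lock is held and released only after it is dropped.

```c
#include <pthread.h>
#include <stdlib.h>
#include <sys/queue.h>

struct entry {
	TAILQ_ENTRY(entry) link;
};
TAILQ_HEAD(entry_head, entry);

static struct entry_head cache = TAILQ_HEAD_INITIALIZER(cache);
static pthread_mutex_t cache_mtx = PTHREAD_MUTEX_INITIALIZER;

/* Allocate one entry and link it in. Returns 0 on success, -1 on failure. */
static int
cache_add(void)
{
	struct entry *e = malloc(sizeof(*e));

	if (e == NULL)
		return (-1);
	pthread_mutex_lock(&cache_mtx);
	TAILQ_INSERT_TAIL(&cache, e, link);
	pthread_mutex_unlock(&cache_mtx);
	return (0);
}

/*
 * Unlink every entry while holding the lock, but free them only after
 * the lock is released. Returns the number of entries freed.
 */
static int
cache_purge(void)
{
	struct entry_head doomed;
	struct entry *e;
	int freed = 0;

	TAILQ_INIT(&doomed);
	pthread_mutex_lock(&cache_mtx);
	while ((e = TAILQ_FIRST(&cache)) != NULL) {
		TAILQ_REMOVE(&cache, e, link);
		TAILQ_INSERT_TAIL(&doomed, e, link);
	}
	pthread_mutex_unlock(&cache_mtx);

	/* The potentially slow part runs without the lock held. */
	while ((e = TAILQ_FIRST(&doomed)) != NULL) {
		TAILQ_REMOVE(&doomed, e, link);
		free(e);
		freed++;
	}
	return (freed);
}
```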
==
cache: improve scalability by introducing bucket locks
An array of bucket locks is added.
All modifications still require the global cache_lock to be held for
writing. However, most readers only need the relevant bucket lock and in
effect can run concurrently with the writer as long as they use a
different lock. See the added comment for more details.
This is an intermediate step towards removal of the global lock.
==
cache: get rid of the global lock
Add a table of vnode locks and use them along with bucket locks to provide
concurrent modification support. The approach taken is to preserve the
current behaviour of the namecache and just lock all relevant parts before
any changes are made.
Lookups still require the relevant bucket to be locked.
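
Locking "all relevant parts" up front needs a consistent acquisition order to avoid deadlock. The sketch below is one possible way to do that and is an assumption, not necessarily what the commit does: vnodes hash into a hypothetical lock table and the two locks are always taken in ascending index order, with the shared-lock case handled once. All names and the hash function are illustrative.

```c
#include <pthread.h>
#include <stdint.h>

#define NVNODELOCKS	16	/* hypothetical table size */

static pthread_mutex_t vnodelocks[NVNODELOCKS];

static void
vnodelocks_init(void)
{
	int i;

	for (i = 0; i < NVNODELOCKS; i++)
		pthread_mutex_init(&vnodelocks[i], NULL);
}

/* Map a vnode address to a slot in the lock table (illustrative hash). */
static unsigned
vp_to_lock(const void *vp)
{
	return (((uintptr_t)vp >> 8) % NVNODELOCKS);
}

/*
 * Take the locks covering both vnodes before any modification.
 * Locks are acquired lowest index first. Returns how many distinct
 * locks were taken (1 when both vnodes map to the same slot).
 */
static int
lock_vnodes(const void *dvp, const void *vp)
{
	unsigned a = vp_to_lock(dvp), b = vp_to_lock(vp), t;

	if (a == b) {
		pthread_mutex_lock(&vnodelocks[a]);
		return (1);
	}
	if (a > b) {
		t = a;
		a = b;
		b = t;
	}
	pthread_mutex_lock(&vnodelocks[a]);
	pthread_mutex_lock(&vnodelocks[b]);
	return (2);
}

static void
unlock_vnodes(const void *dvp, const void *vp)
{
	unsigned a = vp_to_lock(dvp), b = vp_to_lock(vp);

	pthread_mutex_unlock(&vnodelocks[a]);
	if (b != a)
		pthread_mutex_unlock(&vnodelocks[b]);
}
```

Because both callers of lock_vnodes(x, y) and lock_vnodes(y, x) end up acquiring the same lock first, two threads working on the same pair cannot deadlock against each other.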
==
cache: ignore purgevfs requests for filesystems with few vnodes
purgevfs is purely optional and induces lock contention in workloads
which frequently mount and unmount filesystems.
In particular, poudriere will do this for filesystems with 4 vnodes or
fewer. A full cache scan is clearly wasteful.
Since there is no explicit counter for namecache entries, the number of
vnodes used by the target fs is checked.
The default limit is the number of bucket locks.
== (by kib)
Limit scope of the optimization in r306608 to dounmount() caller only.
Other uses of cache_purgevfs() do rely on the cache purge for correct
operations, when paths are invalidated without unmount.
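
Taken together, the two changes above amount to a small decision function, sketched here. The names purge_min_vnodes and purgevfs_should_scan are hypothetical, the threshold value is arbitrary, and treating the boolean passed by callers as a "force" flag is an assumption based on the description.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the limit. The commit message says the
 * default is the number of bucket locks; 8 is an arbitrary value here.
 */
static unsigned purge_min_vnodes = 8;

/*
 * Decide whether a full cache scan is worth doing for a filesystem
 * that currently uses mnt_nvnodes vnodes. Callers that rely on the
 * purge for correctness pass force = true and always get the scan.
 */
static bool
purgevfs_should_scan(unsigned mnt_nvnodes, bool force)
{
	if (!force && mnt_nvnodes <= purge_min_vnodes)
		return (false);
	return (true);
}
```

Under this sketch only a caller that passes false, such as the unmount path, would skip the scan for small filesystems. Every other caller keeps the previous behaviour.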
==
cache: split negative entry LRU into multiple lists
This splits the ncneg_mtx lock while preserving the hit ratio at least
during buildworld.
Create N dedicated lists for new negative entries.
Entries with at least one hit get promoted to the hot list, where they
get requeued every M hits.
Shrinking demotes one hot entry and performs a round-robin shrinking of
regular lists.
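
The policy can be modelled single-threaded as below. This is a sketch of the described behaviour only: locking is omitted, the values chosen for N and M are arbitrary, entries are assigned to a regular list by a hypothetical key, and all names are illustrative.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

#define NLISTS		4	/* "N" in the text; arbitrary here */
#define REQUEUE_HITS	2	/* "M" in the text; arbitrary here */

struct neg_entry {
	TAILQ_ENTRY(neg_entry) link;
	unsigned key;		/* selects the regular list */
	unsigned hits;
	bool hot;
};
TAILQ_HEAD(neg_head, neg_entry);

static struct neg_head neglists[NLISTS];
static struct neg_head hotlist;
static unsigned shrink_cursor;

static void
neg_init(void)
{
	int i;

	for (i = 0; i < NLISTS; i++)
		TAILQ_INIT(&neglists[i]);
	TAILQ_INIT(&hotlist);
	shrink_cursor = 0;
}

/* New entries start on one of the regular lists. */
static void
neg_insert(struct neg_entry *e)
{
	e->hits = 0;
	e->hot = false;
	TAILQ_INSERT_TAIL(&neglists[e->key % NLISTS], e, link);
}

/* First hit promotes to the hot list; later hits requeue every M hits. */
static void
neg_hit(struct neg_entry *e)
{
	e->hits++;
	if (!e->hot) {
		TAILQ_REMOVE(&neglists[e->key % NLISTS], e, link);
		TAILQ_INSERT_TAIL(&hotlist, e, link);
		e->hot = true;
	} else if (e->hits % REQUEUE_HITS == 0) {
		TAILQ_REMOVE(&hotlist, e, link);
		TAILQ_INSERT_TAIL(&hotlist, e, link);
	}
}

/* Demote one hot entry, then evict from the regular lists round-robin. */
static struct neg_entry *
neg_shrink(void)
{
	struct neg_entry *e;
	struct neg_head *h;
	unsigned i;

	e = TAILQ_FIRST(&hotlist);
	if (e != NULL) {
		TAILQ_REMOVE(&hotlist, e, link);
		e->hot = false;
		TAILQ_INSERT_TAIL(&neglists[e->key % NLISTS], e, link);
	}
	for (i = 0; i < NLISTS; i++) {
		h = &neglists[shrink_cursor];
		shrink_cursor = (shrink_cursor + 1) % NLISTS;
		e = TAILQ_FIRST(h);
		if (e != NULL) {
			TAILQ_REMOVE(h, e, link);
			return (e);	/* caller disposes of the entry */
		}
	}
	return (NULL);
}
```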
==
cache: fix up a corner case in r307650
If no negative entry is found on the last list, the ncp pointer will be
left uninitialized and a non-null value will make the function assume an
entry was found.
Fix the problem by initializing to NULL on entry.
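
The shape of the bug can be shown with a reduced example. It is a sketch with hypothetical types and names rather than the affected function: the loop skips empty lists without assigning the pointer, so when every list is empty the result is whatever the variable was initialized to.

```c
#include <stddef.h>

struct entry {
	int id;
};

struct list {
	struct entry *first;	/* NULL when the list is empty */
};

/* Return the first entry found on any list, or NULL if all are empty. */
static struct entry *
first_entry(const struct list *lists, size_t nlists)
{
	struct entry *ncp = NULL;	/* the fix: start from a known value */
	size_t i;

	for (i = 0; i < nlists; i++) {
		if (lists[i].first == NULL)
			continue;	/* skipped lists never assign ncp */
		ncp = lists[i].first;
		break;
	}
	/* Without the initializer this is indeterminate when nothing matched. */
	return (ncp);
}
```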
== (by kib)
vn_fullpath1() checked VV_ROOT and then dereferenced
vp->v_mount->mnt_vnodecovered unlocked. This allowed unmount to race.
Lock vnode after we noticed the VV_ROOT flag. See comments for
explanation why unlocked check for the flag is considered safe.
==
cache: fix a race between entry removal and demotion
The negative list shrinker can demote an entry with only hotlist + neglist
locks held. On the other hand, entry removal possibly sets the NCF_DVDROP
flag without the aforementioned locks held prior to detaching it from the
respective neglist, which can lose the update made by the shrinker.
==
cache: plug a write-only variable in cache_negative_zap_one
==
cache: ensure that the number of bucket locks does not exceed hash size
The size can be changed by side effect of modifying kern.maxvnodes.
Since numbucketlocks was not modified, setting a sufficiently low value
would give more locks than actual buckets, which would then lead to
corruption.
Force the number of buckets to be no smaller than the number of bucket locks.
Note this should not matter for real world cases.
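
A minimal sketch of the clamp follows, assuming power-of-two sizes and mask-based indexing for both buckets and bucket locks. The names roundup_pow2 and pick_nbuckets are hypothetical. With more locks than buckets, two hashes that share a bucket could select different locks, so the bucket would not be consistently protected.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next power of two; v must be in [1, 2^31]. */
static uint32_t
roundup_pow2(uint32_t v)
{
	uint32_t p = 1;

	assert(v >= 1 && v <= 0x80000000u);
	while (p < v)
		p <<= 1;
	return (p);
}

/*
 * Pick the number of hash buckets for a requested size, never going
 * below the number of bucket locks.
 *
 * Example of the problem being avoided: with 4 buckets and 16 locks,
 * hashes 1 and 5 share bucket 1 (1 & 3 == 5 & 3) but would use locks
 * 1 and 5 (1 & 15 != 5 & 15).
 */
static uint32_t
pick_nbuckets(uint32_t requested, uint32_t nbucketlocks)
{
	uint32_t n = roundup_pow2(requested > 0 ? requested : 1);

	if (n < nbucketlocks)
		n = nbucketlocks;
	return (n);
}
```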
Diffstat (limited to 'sys/cddl')
-rw-r--r-- | sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c
index 855a7b9..dbcdd69 100644
--- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c
+++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c
@@ -1843,7 +1843,7 @@ zfsvfs_teardown(zfsvfs_t *zfsvfs, boolean_t unmounting)
 		 */
 		(void) dnlc_purge_vfsp(zfsvfs->z_parent->z_vfs, 0);
 #ifdef FREEBSD_NAMECACHE
-		cache_purgevfs(zfsvfs->z_parent->z_vfs);
+		cache_purgevfs(zfsvfs->z_parent->z_vfs, true);
 #endif
 	}