summaryrefslogtreecommitdiffstats
path: root/sys/kern/vfs_cache.c
Commit message (Collapse)AuthorAgeFilesLines
* Turn #ifdef LOOKUP_SHARED into #ifndef LOOKUP_EXCLUSIVE to enable thisjeff2002-04-091-4/+4
| | | | | | | | | behavior by default. Also, change the options line to reflect this. If there are no problems reported this will become the only behavior and the knob will be removed in a month or so. Demanded by: obrien
* Remove a comment which relates to the old name cache code, whichdwmalone2002-04-071-4/+0
| | | | | | was replaced in 1997. Approved by: phk
* Remove __P.alfred2002-03-191-1/+1
|
* This patch adds the "LOCKSHARED" option to namei which causes it to only ↵jeff2002-03-121-0/+63
| | | | | | | | | | | | | | | | acquire shared locks on leafs. The stat() and open() calls have been changed to make use of this new functionality. Using shared locks in these cases is sufficient and can significantly reduce their latency if IO is pending to these vnodes. Also, this reduces the number of exclusive locks that are floating around in the system, which helps reduce the number of deadlocks that occur. A new kernel option "LOOKUP_SHARED" has been added. It defaults to off so this patch can be turned on for testing, and should eventually go away once it is proven to be stable. I have personally been running this patch for over a year now, so it is believed to be fully stable. Reviewed by: jake, obrien Approved by: jake
* Document all functions, global and static variables, and sysctls.eivind2002-03-051-4/+17
| | | | | | | | Includes some minor whitespace changes, and re-ordering to be able to document properly (e.g, grouping of variables and the SYSCTL macro calls for them, where the documentation has been added.) Reviewed by: phk (but all errors are mine)
* Remove cache_purgeleafdirs(), it has been #if 0 for quite some time.phk2002-02-171-60/+0
|
* Include sys/_lock.h and sys/_mutex.h to reduce namespace pollution.alfred2002-01-131-0/+1
| | | | Requested by: jhb
* SMP Lock struct file, filedesc and the global file list.alfred2002-01-131-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Seigo Tanimura (tanimura) posted the initial delta. I've polished it quite a bit reducing the need for locking and adapting it for KSE. Locks: 1 mutex in each filedesc protects all the fields. protects "struct file" initialization, while a struct file is being changed from &badfileops -> &pipeops or something the filedesc should be locked. 1 mutex in each struct file protects the refcount fields. doesn't protect anything else. the flags used for garbage collection have been moved to f_gcflag which was the FILLER short, this doesn't need locking because the garbage collection is a single threaded container. could likely be made to use a pool mutex. 1 sx lock for the global filelist. struct file * fhold(struct file *fp); /* increments reference count on a file */ struct file * fhold_locked(struct file *fp); /* like fhold but expects file to locked */ struct file * ffind_hold(struct thread *, int fd); /* finds the struct file in thread, adds one reference and returns it unlocked */ struct file * ffind_lock(struct thread *, int fd); /* ffind_hold, but returns file locked */ I still have to smp-safe the fget cruft, I'll get to that asap.
* Convert textvp_fullpath() into the more generic vn_fullpath() which takes ades2001-10-211-9/+9
| | | | | | struct thread * and a struct vnode * instead of a struct proc *. Temporarily add a textvp_fullpath macro for compatibility.
* After extensive testing it has been determined that adding complexitydillon2001-10-011-0/+31
| | | | | | | | | | | | | | | | | | to avoid removing higher level directory vnodes from the namecache has no perceivable effect and will be removed. This is especially true when vmiodirenable is turned on, which it is by default now. ( vmiodirenable makes a huge difference in directory caching ). The vfs.vmiodirenable and vfs.nameileafonly sysctls have been left in to allow further testing, but I expect to rip out vfs.nameileafonly soon too. I have also determined through testing that the real problem with numvnodes getting too large is due to the VM Page cache preventing the vnode from being reclaimed. The directory stuff made only a tiny dent relative to Poul's original code, enough so that some tests succeeded. But tests with several million small files show that the bigger problem is the VM Page cache. This will have to be addressed by a future commit. MFC after: 3 days
* KSE Milestone 2julian2001-09-121-12/+12
| | | | | | | | | | | | | | Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha
* Fix a memory leak in __getcwd() that can occur after a filesystemiedowse2001-09-041-1/+3
| | | | | | | | | | has been forcibly unmounted. If the filesystem root vnode is reached and it has no associated mountpoint (vp->v_mount == NULL), __getcwd would return without freeing 'buf'. Add the missing free() call. PR: kern/30306 Submitted by: Mike Potanin <potanin@mccme.ru> MFC after: 1 week
* Undo part of the tangle of having sys/lock.h and sys/mutex.h included inmarkm2001-05-011-0/+1
| | | | | | | | | | | other "system" header files. Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h form kernel .c files. Sort sys/*.h includes where possible in affected files. OK'ed by: bde (with reservations)
* Revert consequences of changes to mount.h, part 2.grog2001-04-291-2/+0
| | | | Requested by: bde
* Correct #includes to work with fixed sys/mount.h.grog2001-04-231-0/+2
|
* Reclaim directory vnodes held in namecache if few free vnodes aretanimura2001-04-181-2/+66
| | | | | | | | | | | | | | | available. Only directory vnodes holding no child directory vnodes held in v_cache_src are recycled, so that directory vnodes near the root of the filesystem hierarchy remain in namecache and directory vnodes are not reclaimed in cascade. The period of vnode reclaiming attempt and the number of vnodes attempted to reclaim can be tuned via sysctl(2). Suggested by: tegge Approved by: phk
* Create debug.hashstat.[raw]nchash and debug.hashstat.[raw]nfsnode topeter2001-04-111-0/+80
| | | | | | | | | | | enable easy access to the hash chain stats. The raw prefixed versions dump an integer array to userland with the chain lengths. This cheats and calls it an array of 'struct int' rather than 'int' or sysctl -a faithfully dumps out the 128K array on an average machine. The non-raw versions return 4 integers: count, number of chains used, maximum chain length, and percentage utilization (fixed point, multiplied by 100). The raw forms are more useful for analyzing the hash distribution, while the other form can be read easily by humans and stats loggers.
* Use the same API as the example code.peter2001-03-201-6/+8
| | | | | | | | | | Allow the initial hash value to be passed in, as the examples do. Incrementally hash in the dvp->v_id (using the official api) rather than add it. This seems to help power-of-two predictable filename trees where the filenames repeat on a power-of-two cycle and the directory trees have power-of-two components in it. The simple add then mask was causing things like 12000+ entry collision chains while most other entries have between 0 and 3 entries each. This way seems to improve things.
* Use a generic implementation of the Fowler/Noll/Vo hash (FNV hash).peter2001-03-171-13/+6
| | | | | | | | | | | | | | | | | Make the name cache hash as well as the nfsnode hash use it. As a special tweak, create an unsigned version of register_t. This allows us to use a special tweak for the 64 bit versions that significantly speeds up the i386 version (ie: int64 XOR int64 is slower than int64 XOR int32). The code layout is a little strange for the string function, but I was able to get between 5 to 10% improvement over the original version I started with. The layout affects gcc code generation choices and this way was fastest on x86 and alpha. Note that 'CPUTYPE=p3' etc makes a fair difference to this. It is around 45% faster with -march=pentiumpro on a p6 cpu.
* Staticize some malloc M_ instances.phk2000-12-081-1/+1
|
* Untangle vfsinit() a bit. Use seperate sysinit functions rather thanpeter2000-12-061-3/+5
| | | | having a super-function calling bits all over the place.
* o Export nchstats ("VFS cache effectiveness statistics") usingrwatson2000-11-201-0/+4
| | | | | SYSCTL_OPAQUE. This removes a reason that systat requires setgid kmem. More to come.
* Add new flag PDIRUNLOCK to the component.cn_flags which should be set bybp2000-09-171-7/+18
| | | | | | | | | | | | | | | | | | filesystem lookup() routine if it unlocks parent directory. This flag should be carefully tracked by filesystems if they want to work properly with nullfs and other stacked filesystems. VFS takes advantage of this flag to perform symantically correct usage of vrele() instead of vput() if parent directory already unlocked. If filesystem fails to track this flag then previous codepath in VFS left unchanged. Convert UFS code to set PDIRUNLOCK flag if necessary. Other filesystmes will be changed after some period of testing. Reviewed in general by: mckusick, dillon, adrian Obtained from: NetBSD
* Change variable naming to be consistent with the rest of VFS code.bp2000-09-101-25/+23
| | | | Reduce number of indirections by using already fetched values.
* Support for unsigned integer and long sysctl variables. Update thejhb2000-07-051-9/+6
| | | | | | | | | SYSCTL_LONG macro to be consistent with other integer sysctl variables and require an initial value instead of assuming 0. Update several sysctl variables to use the unsigned types. PR: 15251 Submitted by: Kelly Yancey <kbyanc@posi.net>
* Back out the previous change to the queue(3) interface.jake2000-05-261-5/+5
| | | | | | It was not discussed and should probably not happen. Requested by: msmith and others
* Change the way that the queue(3) structures are declared; don't assume thatjake2000-05-231-5/+5
| | | | | | | | the type argument to *_HEAD and *_ENTRY is a struct. Suggested by: phk Reviewed by: phk Approved by: mdodd
* Move procfs_fullpath() to vfs_cache.c, with a rename to textvp_fullpath().green2000-04-261-0/+110
| | | | | | | | | | There's no excuse to have code in synthetic filestores that allows direct references to the textvp anymore. Feature requested by: msmith Feature agreed to by: warner Move requested by: phk Move agreed to by: bde
* Move the declaration of "struct namecache" to vnode.h, as it can be usefulgreen2000-04-221-16/+0
| | | | | elsewhere. Note, of course, that in an ideal world nothing should need to see our VFS implementation :-/
* Avoid a panic in __getcwd(2) when combined with umount -f.peter2000-02-141-0/+2
|
* Before we start to mess with the VFS name-cache clean things up a little bit:phk1999-10-031-13/+142
| | | | Isolate the namecache in its own file, and give it a dedicated malloc type.
* $Id$ -> $FreeBSD$peter1999-08-281-1/+1
|
* Fix a braino in the v_id wraparound code. Give more (current) detailsphk1999-04-241-9/+17
| | | | | | | in comment. PR: 11307 Spotted by: Ville-Pertti Keinonen <will@iki.fi>
* Don't use CTL_VFS at the wrong level. This caused loops in the sysctlbde1998-09-091-3/+2
| | | | | tree if CTL_VFS happened to get assigned as a type number to a vfs that has some vfs sysctls.
* Removed some bogus casts.bde1997-12-191-3/+3
|
* Remove a bunch of variables which were unused both in GENERIC and LINT.phk1997-11-071-5/+2
| | | | Found by: -Wunused
* VFS mega cleanup commit (x/N)phk1997-10-161-3/+2
| | | | | | | | | | | | | | | | | | | | | | | 1. Add new file "sys/kern/vfs_default.c" where default actions for VOPs go. Implement proper defaults for ABORTOP, BWRITE, LEASE, POLL, REVOKE and STRATEGY. Various stuff spread over the entire tree belongs here. 2. Change VOP_BLKATOFF to a normal function in cd9660. 3. Kill VOP_BLKATOFF, VOP_TRUNCATE, VOP_VFREE, VOP_VALLOC. These are private interface functions between UFS and the underlying storage manager layer (FFS/LFS/MFS/EXT2FS). The functions now live in struct ufsmount instead. 4. Remove a kludge of VOP_ functions in all filesystems, that did nothing but obscure the simplicity and break the expandability. If a filesystem doesn't implement VOP_FOO, it shouldn't have an entry for it in its vnops table. The system will try to DTRT if it is not implemented. There are still some cruft left, but the bulk of it is done. 5. Fix another VCALL in vfs_cache.c (thanks Bruce!)
* vnops megacommitphk1997-10-151-3/+2
| | | | | | | | | | 1. Use the default function to access all the specfs operations. 2. Use the default function to access all the fifofs operations. 3. Use the default function to access all the ufs operations. 4. Fix VCALL usage in vfs_cache.c 5. Use VOCALL to access specfs functions in devfs_vnops.c 6. Staticize most of the spec and fifofs vnops functions. 7. Make UFS panic if it lacks bits of the underlying storage handling.
* Add one more counter so we can truly find out how good our name cachephk1997-09-241-2/+7
| | | | | is. If we don't find something and don't what to have found something, it's actually a success.
* A couple of handles to tweak, more statistics.phk1997-09-241-1/+31
|
* Revert to the previous hashing, double the hashtable size instead.phk1997-09-041-4/+7
|
* Use 2^N hash sizes rather than primesize, this replaces a divisionphk1997-09-031-12/+10
| | | | | | | | with an and. (Submitted by davidg) Preemptively record ".." values. Reviewed by: phk
* Removed unused #includes.bde1997-09-021-3/+1
|
* Change the 0xdeadb hack to a flag called VDOOMED.phk1997-08-311-7/+8
| | | | | | | | | | | | | | | | | | | | | Introduce VFREE which indicates that vnode is on freelist. Rename vholdrele() to vdrop(). Create vfree() and vbusy() to add/delete vnode from freelist. Add vfree()/vbusy() to keep (v_holdcnt != 0 || v_usecount != 0) vnodes off the freelist. Generalize vhold()/v_holdcnt to mean "do not recycle". Fix reassignbuf()s lack of use of vhold(). Use vhold() instead of checking v_cache_src list. Remove vtouch(), the vnodes are always vget'ed soon enough after for it to have any measuable effect. Add sysctl debug.freevnodes to keep track of things. Move cache_purge() up in getnewvnodes to avoid race. Decrement v_usecount after VOP_INACTIVE(), put a vhold() on it during VOP_INACTIVE() Unmacroize vhold()/vdrop() Print out VDOOMED and VFREE flags (XXX: should use %b) Reviewed by: dyson
* Uncut&paste cache_lookup().phk1997-08-261-1/+84
| | | | | | | | | | | | | | | This unifies several times in theory indentical 50 lines of code. The filesystems have a new method: vop_cachedlookup, which is the meat of the lookup, and use vfs_cache_lookup() for their vop_lookup method. vfs_cache_lookup() will check the namecache and pass on to the vop_cachedlookup method in case of a miss. It's still the task of the individual filesystems to populate the namecache with cache_enter(). Filesystems that do not use the namecache will just provide the vop_lookup method as usual.
* remove unused MAXVNODEUSE macro.phk1997-08-041-2/+1
|
* 1. Add a {pointer, v_id} pair to the vnode to store the reference to thephk1997-05-041-107/+105
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ".." vnode. This is cheaper storagewise than keeping it in the namecache, and it makes more sense since it's a 1:1 mapping. 2. Also handle the case of "." more intelligently rather than stuff the namecache with pointless entries. 3. Add two lists to the vnode and hang namecache entries which go from or to this vnode. When cleaning a vnode, delete all namecache entries it invalidates. 4. Never reuse namecache enties, malloc new ones when we need it, free old ones when they die. No longer a hard limit on how many we can have. 5. Remove the upper limit on namelength of namecache entries. 6. Make a global list for negative namecache entries, limit their number to a sysctl'able (debug.ncnegfactor) fraction of the total namecache. Currently the default fraction is 1/16th. (Suggestions for better default wanted!) 7. Assign v_id correctly in the face of 32bit rollover. 8. Remove the LRU list for namecache entries, not needed. Remove the #ifdef NCH_STATISTICS stuff, it's not needed either. 9. Use the vnode freelist as a true LRU list, also for namecache accesses. 10. Reuse vnodes more aggresively but also more selectively, if we can't reuse, malloc a new one. There is no longer a hard limit on their number, they grow to the point where we don't reuse potentially usable vnodes. A vnode will not get recycled if still has pages in core or if it is the source of namecache entries (Yes, this does indeed work :-) "." and ".." are not namecache entries any longer...) 11. Do not overload the v_id field in namecache entries with whiteout information, use a char sized flags field instead, so we can get rid of the vpid and v_id fields from the namecache struct. Since we're linked to the vnodes and purged when they're cleaned, we don't have to check the v_id any more. 12. NFS knew about the limitation on name length in the namecache, it shouldn't and doesn't now. Bugs: The namecache statistics no longer includes the hits for ".." and "." hits. Performance impact: Generally in the +/- 0.5% for "normal" workstations, but I hope this will allow the system to be selftuning over a bigger range of "special" applications. The case where RAM is available but unused for cache because we don't have any vnodes should be gone. Future work: Straighten out the namecache statistics. "desiredvnodes" is still used to (bogusly ?) size hash tables in the filesystems. I have still to find a way to safely free unused vnodes back so their number can shrink when not needed. There is a few uses of the v_id field left in the filesystems, scheduled for demolition at a later time. Maybe a one slot cache for unused namecache entries should be implemented to decrease the malloc/free frequency.
* Fixed the hash formula. Lite2 doesn't have phashinit(), so Lite2's hashbde1997-03-081-14/+14
| | | | | | | | | | | | | | | | | | | | | | | | formula uses `& nchash'. This is very broken when nchash is a prime number instead of 1 less than a power of 2, but the Lite2 formula was merged in. Merged some cosmetic changes from Lite2, rev.1.21 and Lite1. The merge was difficult because the Lite2 code is essentially ours (phk's) except where Lite2 improved or broke it. Summary of the Lite2 changes: - in the copyright, phk's rights have been transferred to the Regents. This change should be reviewed. - nchENOENT went away; the "no" vnode is now simply 0. - comments were improved. - style was "improved". - goto instead of Fanatism (sic) was considered bad :-). - there are some small changes to support whiteouts. - new cache entries are added in more cases. More work is required near here to change the hash table size if kern.desiredvnodes is changed using sysctl. - rescanning of the hash bucket in cache_purgevfs() was removed. This change should be reviewed.
* Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are notpeter1997-02-221-1/+1
| | | | ready for it yet.
* This is the kernel Lite/2 commit. There are some requisite userlanddyson1997-02-101-71/+87
| | | | | | | | | | | | | | | changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes. The system boots and can mount UFS filesystems. Untested: ext2fs, msdosfs, NFS Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed. Reviewed by: various people Submitted by: Jeffery Hsu <hsu@freebsd.org>
OpenPOWER on IntegriCloud