summaryrefslogtreecommitdiffstats
path: root/sys/kern/vfs_default.c
Commit message (Collapse)AuthorAgeFilesLines
* Remove references to vm_zone.h and switch over to the new uma API.jeff2002-03-201-1/+0
| | | | | Also, remove maxsockets. If you look carefully you'll notice that the old zone allocator never honored this anyway.
* Remove __P.alfred2002-03-191-2/+2
|
* Add a flags parameter to VFS_VGET to pass through the desiredmckusick2002-03-171-1/+2
| | | | | | | | | | | | locking flags when acquiring a vnode. The immediate purpose is to allow polling lock requests (LK_NOWAIT) needed by soft updates to avoid deadlock when enlisting other processes to help with the background cleanup. For the future it will allow the use of shared locks for read access to vnodes. This change touches a lot of files as it affects most filesystems within the system. It has been well tested on FFS, loopback, and CD-ROM filesystems. only lightly on the others, so if you find a problem there, please let me (mckusick@mckusick.com) know.
* Document all functions, global and static variables, and sysctls.eivind2002-03-051-5/+44
| | | | | | | | Includes some minor whitespace changes, and re-ordering to be able to document properly (e.g, grouping of variables and the SYSCTL macro calls for them, where the documentation has been added.) Reviewed by: phk (but all errors are mine)
* Fix incorrect double-termination of vm_object. When a vm_object isdillon2001-10-231-1/+12
| | | | | | | | | | | | terminated and flushes pending dirty pages it is possible for the object to be ref'd (0->1) and then deref'd (1->0) during termination. We do not terminate the object a second time. Document vop_stdgetvobject() to explicitly allow it to be called without the vnode interlock held (for upcoming sync_msync() and ffs_sync() performance optimizations) MFC after: 3 days
* KSE Milestone 2julian2001-09-121-38/+38
| | | | | | | | | | | | | | Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha
* With Alfred's permission, remove vm_mtx in favor of a fine-grained approachdillon2001-07-041-12/+5
| | | | | | | | | (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
* Don't release the vm lock just to turn around and grab it again.jhb2001-05-231-3/+0
|
* Introduce a global lock for the vm subsystem (vm_mtx).alfred2001-05-191-1/+15
| | | | | | | | | | | | | | | | | | | vm_mtx does not recurse and is required for most low level vm operations. faults can not be taken without holding Giant. Memory subsystems can now call the base page allocators safely. Almost all atomic ops were removed as they are covered under the vm mutex. Alpha and ia64 now need to catch up to i386's trap handlers. FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties). Reviewed (partially) by: jake, jhb
* When calling poll() on a fd associated with a filesystem, let POLLIN/POLLOUTjlemon2001-05-141-3/+3
| | | | | | | | behave identically to POLLRDNORM/POLLWRNORM. Submitted by: bde PR: 27287 merge after: 1 week
* Fix return type of vop_stdputpages()phk2001-05-061-1/+2
| | | | Noticed by: rwatson
* Implement vop_std{get|put}pages() and add them to the default vop[].phk2001-05-011-0/+33
| | | | | Un-copy&paste all the VOP_{GET|PUT}PAGES() functions which do nothing but the default.
* Add a vop_stdbmap(), and make it part of the default vop vector.phk2001-04-291-0/+25
| | | | | | Make 7 filesystems which don't really know about VOP_BMAP rely on the default vector, rather than more or less complete local vop_nopbmap() implementations.
* Revert consequences of changes to mount.h, part 2.grog2001-04-291-2/+0
| | | | Requested by: bde
* Move the netexport structure from the fs-specific mountstructurephk2001-04-251-10/+0
| | | | | | | | | | | | | | to struct mount. This makes the "struct netexport *" paramter to the vfs_export and vfs_checkexport interface unneeded. Consequently that all non-stacking filesystems can use vfs_stdcheckexp(). At the same time, make it a pointer to a struct netexport in struct mount, so that we can remove the bogus AF_MAX and #include <net/radix.h> from <sys/mount.h>
* Correct #includes to work with fixed sys/mount.h.grog2001-04-231-0/+2
|
* This patch removes the VOP_BWRITE() vector.phk2001-04-171-8/+0
| | | | | | | | | | | | | VOP_BWRITE() was a hack which made it possible for NFS client side to use struct buf with non-bio backing. This patch takes a more general approach and adds a bp->b_op vector where more methods can be added. The success of this patch depends on bp->b_op being initialized all relevant places for some value of "relevant" which is not easy to determine. For now the buffers have grown a b_magic element which will make such issues a tiny bit easier to debug.
* o Rename "namespace" argument to "attrnamespace" as namespace is a C++rwatson2001-03-191-2/+2
| | | | | | | reserved word. Submitted by: jkh Obtained from: TrustedBSD Project
* o Change the API and ABI of the Extended Attribute kernel interfaces torwatson2001-03-151-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | introduce a new argument, "namespace", rather than relying on a first- character namespace indicator. This is in line with more recent thinking on EA interfaces on various mailing lists, including the posix1e, Linux acl-devel, and trustedbsd-discuss forums. Two namespaces are defined by default, EXTATTR_NAMESPACE_SYSTEM and EXTATTR_NAMESPACE_USER, where the primary distinction lies in the access control model: user EAs are accessible based on the normal MAC and DAC file/directory protections, and system attributes are limited to kernel-originated or appropriately privileged userland requests. o These API changes occur at several levels: the namespace argument is introduced in the extattr_{get,set}_file() system call interfaces, at the vnode operation level in the vop_{get,set}extattr() interfaces, and in the UFS extended attribute implementation. Changes are also introduced in the VFS extattrctl() interface (system call, VFS, and UFS implementation), where the arguments are modified to include a namespace field, as well as modified to advoid direct access to userspace variables from below the VFS layer (in the style of recent changes to mount by adrian@FreeBSD.org). This required some cleanup and bug fixing regarding VFS locks and the VFS interface, as a vnode pointer may now be optionally submitted to the VFS_EXTATTRCTL() call. Updated documentation for the VFS interface will be committed shortly. o In the near future, the auto-starting feature will be updated to search two sub-directories to the ".attribute" directory in appropriate file systems: "user" and "system" to locate attributes intended for those namespaces, as the single filename is no longer sufficient to indicate what namespace the attribute is intended for. Until this is committed, all attributes auto-started by UFS will be placed in the EXTATTR_NAMESPACE_SYSTEM namespace. o The default POSIX.1e attribute names for ACLs and Capabilities have been updated to no longer include the '$' in their filename. As such, if you're using these features, you'll need to rename the attribute backing files to the same names without '$' symbols in front. o Note that these changes will require changes in userland, which will be committed shortly. These include modifications to the extended attribute utilities, as well as to libutil for new namespace string conversion routines. Once the matching userland changes are committed, a buildworld is recommended to update all the necessary include files and verify that the kernel and userland environments are in sync. Note: If you do not use extended attributes (most people won't), upgrading is not imperative although since the system call API has changed, the new userland extended attribute code will no longer compile with old include files. o Couple of minor cleanups while I'm there: make more code compilation conditional on FFS_EXTATTR, which should recover a bit of space on kernels running without EA's, as well as update copyright dates. Obtained from: TrustedBSD Project
* Added a dummy lookup vop. Specfs was broken by removing its dummybde2001-02-181-4/+17
| | | | | | | | | | lookup vop so that it defaulted to using vop_eopnotsupp for strange lookups like the ones for open("/dev/null/", ...) and stat("/dev/null/", ...). This mainly caused the wrong errno to be returned by vfs syscalls (EOPNOTSUPP is not in POSIX, and is not documented in connection with specfs in open.2 and is not documented in stat.2 at all). Also, lookup vops are apparently required to set *ap->a_vpp to NULL on error, but vop_eopnotsupp is too broken to do this.
* Change and clean the mutex lock interface.bmilekic2001-02-091-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mtx_enter(lock, type) becomes: mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized) similarily, for releasing a lock, we now have: mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument. The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind. Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two: MTX_QUIET and MTX_NOSWITCH The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers: mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively. Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case. Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled. Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those. Finally, caught up to the interface changes in all sys code. Contributors: jake, jhb, jasone (in no particular order)
* Give vop_mmap an untimely death. The opportunity to give it a timelyeivind2000-11-011-1/+0
| | | | death timed out in 1996.
* Catch up to moving headers:jhb2000-10-201-1/+1
| | | | | - machine/ipl.h -> sys/ipl.h - machine/mutex.h -> sys/mutex.h
* Convert lockmgr locks from using simple locks to using mutexes.jasone2000-10-041-2/+3
| | | | | | Add lockdestroy() and appropriate invocations, which corresponds to lockinit() and must be called to clean up after a lockmgr lock is no longer needed.
* Add a lock structure to vnode structure. Previously it was either allocatedbp2000-09-251-52/+18
| | | | | | | | | | | | | | | | | | | separately (nfs, cd9660 etc) or keept as a first element of structure referenced by v_data pointer(ffs). Such organization leads to known problems with stacked filesystems. From this point vop_no*lock*() functions maintain only interlock lock. vop_std*lock*() functions maintain built-in v_lock structure using lockmgr(). vop_sharedlock() is compatible with vop_stdunlock(), but maintains a shared lock on vnode. If filesystem wishes to export lockmgr compatible lock, it can put an address of this lock to v_vnlock field. This indicates that the upper filesystem can take advantage of it and use single lock structure for entire (or part) of stack of vnodes. This field shouldn't be examined or modified by VFS code except for initialization purposes. Reviewed in general by: mckusick
* Add three new VOPs: VOP_CREATEVOBJECT, VOP_DESTROYVOBJECT and VOP_GETVOBJECT.bp2000-09-121-0/+116
| | | | | | | They will be used by nullfs and other stacked filesystems to support full cache coherency. Reviewed in general by: mckusick, dillon
* Introduce vop_stdinactive() and make it the default if no vop_inactivephk2000-08-181-6/+13
| | | | | | is declared. Sort and prune a few vop_op[].
* Add snapshots to the fast filesystem. Most of the changes supportmckusick2000-07-111-0/+15
| | | | | | | | | | | | | | | | | | | | the gating of system calls that cause modifications to the underlying filesystem. The gating can be enabled by any filesystem that needs to consistently suspend operations by adding the vop_stdgetwritemount to their set of vnops. Once gating is enabled, the function vfs_write_suspend stops all new write operations to a filesystem, allows any filesystem modifying system calls already in progress to complete, then sync's the filesystem to disk and returns. The function vfs_write_resume allows the suspended write operations to begin again. Gating is not added by default for all filesystems as for SMP systems it adds two extra locks to such critical kernel paths as the write system call. Thus, gating should only be added as needed. Details on the use and current status of snapshots in FFS can be found in /sys/ufs/ffs/README.snapshot so for brevity and timelyness is not included here. Unless and until you create a snapshot file, these changes should have no effect on your system (famous last words).
* Separate the struct bio related stuff out of <sys/buf.h> intophk2000-05-051-0/+1
| | | | | | | | | | | | | | | <sys/bio.h>. <sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall not be made a nested include according to bdes teachings on the subject of nested includes. Diskdrivers and similar stuff below specfs::strategy() should no longer need to include <sys/buf.> unless they need caching of data. Still a few bogus uses of struct buf to track down. Repocopy by: peter
* Complete the bio/buf divorce for all code below devfs::strategyphk2000-04-151-1/+1
| | | | | | | | | | Exceptions: Vinum untouched. This means that it cannot be compiled. Greg Lehey is on the case. CCD not converted yet, casts to struct buf (still safe) atapi-cd casts to struct buf to examine B_PHYS
* Move B_ERROR flag to b_ioflags and call it BIO_ERROR.phk2000-04-021-3/+3
| | | | | | | | | | | | | (Much of this done by script) Move B_ORDERED flag to b_ioflags and call it BIO_ORDERED. Move b_pblkno and b_iodone_chain to struct bio while we transition, they will be obsoleted once bio structs chain/stack. Add bio_queue field for struct bio aware disksort. Address a lot of stylistic issues brought up by bde.
* Remove B_READ, B_WRITE and B_FREEBUF and replace them with a newphk2000-03-201-4/+4
| | | | | | | | | | | | | | | | | | | | | field in struct buf: b_iocmd. The b_iocmd is enforced to have exactly one bit set. B_WRITE was bogusly defined as zero giving rise to obvious coding mistakes. Also eliminate the redundant struct buf flag B_CALL, it can just as efficiently be done by comparing b_iodone to NULL. Should you get a panic or drop into the debugger, complaining about "b_iocmd", don't continue. It is likely to write on your disk where it should have been reading. This change is a step in the direction towards a stackable BIO capability. A lot of this patch were machine generated (Thanks to style(9) compliance!) Vinum users: Greg has not had time to test this yet, be careful.
* Fix bde'isms in acl/extattr syscall interface, renaming syscalls torwatson2000-01-191-1/+1
| | | | | | prettier (?) names, adding some const's around here, et al. Reviewed by: bde
* Second pass commit to introduce new ACL and Extended Attribute systemrwatson1999-12-191-0/+16
| | | | | | | calls, vnops, vfsops, both in /kern, and to individual file systems that require a vfsop_ array entry. Reviewed by: eivind
* Introduce NDFREE (and remove VOP_ABORTOP)eivind1999-12-151-1/+0
|
* Lock reporting and assertion changes.eivind1999-12-111-2/+4
| | | | | | | | | | | | | | | * lockstatus() and VOP_ISLOCKED() gets a new process argument and a new return value: LK_EXCLOTHER, when the lock is held exclusively by another process. * The ASSERT_VOP_(UN)LOCKED family is extended to use what this gives them * Extend the vnode_if.src format to allow more exact specification than locked/unlocked. This commit should not do any semantic changes unless you are using DEBUG_VFS_LOCKS. Discussed with: grog, mch, peter, phk Reviewed by: peter
* Make vop_panic() a little more informative.phk1999-11-071-1/+3
|
* Seperate the export check in VFS_FHTOVP, exports are now checked viaalfred1999-09-111-3/+10
| | | | | | | | | VFS_CHECKEXP. Add fh(open|stat|stafs) syscalls to allow userland to query filesystems based on (network) filehandle. Obtained from: NetBSD
* All unimplemented VFS ops now have entries in kern/vfs_default.c that returnalfred1999-09-071-0/+116
| | | | | | | | | | | | | reasonable defaults. This avoids confusing and ugly casting to eopnotsupp or making dummy functions. Bogus casting of filesystem sysctls to eopnotsupp() have been removed. This should make *_vfsops.c more readable and reduce bloat. Reviewed by: msmith, eivind Approved by: phk Tested by: Jeroen Ruigrok/Asmodai <asmodai@wxs.nl>
* $Id$ -> $FreeBSD$peter1999-08-281-1/+1
|
* Add Id stringsgpalmer1999-06-171-0/+2
|
* The VFS/BIO subsystem contained a number of hacks in order to optimizealc1999-05-021-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | piecemeal, middle-of-file writes for NFS. These hacks have caused no end of trouble, especially when combined with mmap(). I've removed them. Instead, NFS will issue a read-before-write to fully instantiate the struct buf containing the write. NFS does, however, optimize piecemeal appends to files. For most common file operations, you will not notice the difference. The sole remaining fragment in the VFS/BIO system is b_dirtyoff/end, which NFS uses to avoid cache coherency issues with read-merge-write style operations. NFS also optimizes the write-covers-entire-buffer case by avoiding the read-before-write. There is quite a bit of room for further optimization in these areas. The VM system marks pages fully-valid (AKA vm_page_t->valid = VM_PAGE_BITS_ALL) in several places, most noteably in vm_fault. This is not correct operation. The vm_pager_get_pages() code is now responsible for marking VM pages all-valid. A number of VM helper routines have been added to aid in zeroing-out the invalid portions of a VM page prior to the page being marked all-valid. This operation is necessary to properly support mmap(). The zeroing occurs most often when dealing with file-EOF situations. Several bugs have been fixed in the NFS subsystem, including bits handling file and directory EOF situations and buf->b_flags consistancy issues relating to clearing B_ERROR & B_INVAL, and handling B_DONE. getblk() and allocbuf() have been rewritten. B_CACHE operation is now formally defined in comments and more straightforward in implementation. B_CACHE for VMIO buffers is based on the validity of the backing store. B_CACHE for non-VMIO buffers is based simply on whether the buffer is B_INVAL or not (B_CACHE set if B_INVAL clear, and vise-versa). biodone() is now responsible for setting B_CACHE when a successful read completes. B_CACHE is also set when a bdwrite() is initiated and when a bwrite() is initiated. VFS VOP_BWRITE routines (there are only two - nfs_bwrite() and bwrite()) are now expected to set B_CACHE. This means that bowrite() and bawrite() also set B_CACHE indirectly. There are a number of places in the code which were previously using buf->b_bufsize (which is DEV_BSIZE aligned) when they should have been using buf->b_bcount. These have been fixed. getblk() now clears B_DONE on return because the rest of the system is so bad about dealing with B_DONE. Major fixes to NFS/TCP have been made. A server-side bug could cause requests to be lost by the server due to nfs_realign() overwriting other rpc's in the same TCP mbuf chain. The server's kernel must be recompiled to get the benefit of the fixes. Submitted by: Matthew Dillon <dillon@apollo.backplane.com>
* Added a used #include (don't depend on "vnode_if.h" including <sys/buf.h>).bde1999-02-251-0/+1
|
* Add 'options DEBUG_LOCKS', which stores extra information in structeivind1999-01-201-1/+11
| | | | | | | | lock, and add some macros and function parameters to make sure that the information get to the point where it can be put in the lock structure. While I'm here, add DEBUG_VFS_LOCKS to LINT.
* Make the vnode opv vector construction fully dynamic. Previously wepeter1998-11-101-0/+7
| | | | | | | | | | | | | | | | | | | | leaked memory on each unload and were limited to items referenced in the kernel copy of vnode_if.c. Now a kernel module is free to create it's own VOP_FOO() routines and the rest of the system will happily deal with it, including passthrough layers like union/umap/etc. Have VFS_SET() call a common vfs_modevent() handler rather than inline duplicating the common code all over the place. Have VNODEOP_SET() have the vnodeops removed at unload time (assuming a module) so that the vop_t ** vector is reclaimed. Slightly adjust the vop_t ** vectors so that calling slot 0 is a panic rather than a page fault. This could happen if VOP_something() was called without *any* handlers being present anywhere (including in vfs_default.c). slot 1 becomes the default vector for the vnodeop table. TODO: reclaim zones on unload (eg: nfs code)
* VOP_STRATEGY grows an (struct vnode *) argumentjulian1998-07-041-0/+1
| | | | | | as the value in b_vp is often not really what you want. (and needs to be frobbed). more cleanups will follow this. Reviewed by: Bruce Evans <bde@freebsd.org>
* As described by the submitter:msmith1998-05-061-23/+0
| | | | | | | | | | | | | Reverse the VFS_VRELE patch. Reference counting of vnodes does not need to be done per-fs. I noticed this while fixing vfs layering violations. Doing reference counting in generic code is also the preference cited by John Heidemann in recent discussions with him. The implementation of alternative vnode management per-fs is still a valid requirement for some filesystems but will be revisited sometime later, most likely using a different framework. Submitted by: Michael Hancock <michaelh@cet.co.jp>
* This mega-commit is meant to fix numerous interrelated problems. Theredyson1998-03-071-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | has been some bitrot and incorrect assumptions in the vfs_bio code. These problems have manifest themselves worse on NFS type filesystems, but can still affect local filesystems under certain circumstances. Most of the problems have involved mmap consistancy, and as a side-effect broke the vfs.ioopt code. This code might have been committed seperately, but almost everything is interrelated. 1) Allow (pmap_object_init_pt) prefaulting of buffer-busy pages that are fully valid. 2) Rather than deactivating erroneously read initial (header) pages in kern_exec, we now free them. 3) Fix the rundown of non-VMIO buffers that are in an inconsistent (missing vp) state. 4) Fix the disassociation of pages from buffers in brelse. The previous code had rotted and was faulty in a couple of important circumstances. 5) Remove a gratuitious buffer wakeup in vfs_vmio_release. 6) Remove a crufty and currently unused cluster mechanism for VBLK files in vfs_bio_awrite. When the code is functional, I'll add back a cleaner version. 7) The page busy count wakeups assocated with the buffer cache usage were incorrectly cleaned up in a previous commit by me. Revert to the original, correct version, but with a cleaner implementation. 8) The cluster read code now tries to keep data associated with buffers more aggressively (without breaking the heuristics) when it is presumed that the read data (buffers) will be soon needed. 9) Change to filesystem lockmgr locks so that they use LK_NOPAUSE. The delay loop waiting is not useful for filesystem locks, due to the length of the time intervals. 10) Correct and clean-up spec_getpages. 11) Implement a fully functional nfs_getpages, nfs_putpages. 12) Fix nfs_write so that modifications are coherent with the NFS data on the server disk (at least as well as NFS seems to allow.) 13) Properly support MS_INVALIDATE on NFS. 14) Properly pass down MS_INVALIDATE to lower levels of the VM code from vm_map_clean. 15) Better support the notion of pages being busy but valid, so that fewer in-transit waits occur. (use p->busy more for pageouts instead of PG_BUSY.) Since the page is fully valid, it is still usable for reads. 16) It is possible (in error) for cached pages to be busy. Make the page allocation code handle that case correctly. (It should probably be a printf or panic, but I want the system to handle coding errors robustly. I'll probably add a printf.) 17) Correct the design and usage of vm_page_sleep. It didn't handle consistancy problems very well, so make the design a little less lofty. After vm_page_sleep, if it ever blocked, it is still important to relookup the page (if the object generation count changed), and verify it's status (always.) 18) In vm_pageout.c, vm_pageout_clean had rotted, so clean that up. 19) Push the page busy for writes and VM_PROT_READ into vm_pageout_flush. 20) Fix vm_pager_put_pages and it's descendents to support an int flag instead of a boolean, so that we can pass down the invalidate bit.
* The intent is to get rid of WILLRELE in vnode_if.src by makingmsmith1998-03-011-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | a complement to all ops that return a vpp, VFS_VRELE. This is initially only for file systems that implement the following ops that do a WILLRELE: vop_create, vop_whiteout, vop_mknod, vop_remove, vop_link, vop_rename, vop_mkdir, vop_rmdir, vop_symlink This is initial DNA that doesn't do anything yet. VFS_VRELE is implemented but not called. A default vfs_vrele was created for fs implementations that use the standard vnode management routines. VFS_VRELE implementations were made for the following file systems: Standard (vfs_vrele) ffs mfs nfs msdosfs devfs ext2fs Custom union umapfs Just EOPNOTSUPP fdesc procfs kernfs portal cd9660 These implementations may change as VOP changes are implemented. In the next phase, in the vop implementations calls to vrele and the vrele part of vput will be moved to the top layer vfs_vnops and made visible to all layers. vput will be replaced by unlock in these cases. Unlocking will still be done in the per fs layer but the refcount decrement will be triggered at the top because it doesn't hurt to hold a vnode reference a little longer. This will have minimal impact on the structure of the existing code. This will only be done for vnode arguments that are released by the various fs vop implementations. Wider use of VFS_VRELE will likely require restructuring of the code. Reviewed by: phk, dyson, terry et. al. Submitted by: Michael Hancock <michaelh@cet.co.jp>
* Make our v_usecount vnode reference count work identically to thedyson1998-01-061-3/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | original BSD code. The association between the vnode and the vm_object no longer includes reference counts. The major difference is that vm_object's are no longer freed gratuitiously from the vnode, and so once an object is created for the vnode, it will last as long as the vnode does. When a vnode object reference count is incremented, then the underlying vnode reference count is incremented also. The two "objects" are now more intimately related, and so the interactions are now much less complex. When vnodes are now normally placed onto the free queue with an object still attached. The rundown of the object happens at vnode rundown time, and happens with exactly the same filesystem semantics of the original VFS code. There is absolutely no need for vnode_pager_uncache and other travesties like that anymore. A side-effect of these changes is that SMP locking should be much simpler, the I/O copyin/copyout optimizations work, NFS should be more ponderable, and further work on layered filesystems should be less frustrating, because of the totally coherent management of the vnode objects and vnodes. Please be careful with your system while running this code, but I would greatly appreciate feedback as soon a reasonably possible.
OpenPOWER on IntegriCloud