summaryrefslogtreecommitdiffstats
path: root/sys/fs/deadfs
Commit message (Collapse)AuthorAgeFilesLines
* The deadfs VOPs for vop_ioctl and vop_bmap call itself recursively,kib2012-09-131-41/+2
| | | | | | | | | | | | | | | which is an elaborate way to cause kernel panic. Change the VOPs implementation to return EBADF for a reclaimed vnode. While the calls to vop_bmap should not reach deadfs, it is indeed possible for vop_ioctl, because the VOP locking protocol is to pass the vnode to VOP unlocked. The actual panic was observed when ioctl was called on procfs filedescriptor which pointed to an exited process. Reported by: zont Tested by: pho MFC after: 1 week
* Add function vop_rename_fail(9) that performs needed cleanup for lockskib2010-04-021-8/+2
| | | | | | | | and references of the VOP_RENAME(9) arguments. Use vop_rename_fail() in deadfs_rename(). Tested by: Mikolaj Golub MFC after: 1 week
* Add a simple VOP_VPTOCNP implementation for deadfs which returns EBADF.marcus2008-12-121-0/+1
| | | | | Reviewed by: arch Approved by: kib
* Below is slightly edited description of the LOR by Tor Egge:kib2007-01-221-1/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | -------------------------- [Deadlock] is caused by a lock order reversal in vfs_lookup(), where [some] process is trying to lock a directory vnode, that is the parent directory of covered vnode) while holding an exclusive vnode lock on covering vnode. A simplified scenario: root fs var fs / A / (/var) D /var B /log (/var/log) E vfs lock C vfs lock F Within each file system, the lock order is clear: C->A->B and F->D->E When traversing across mounts, the system can choose between two lock orders, but everything must then follow that lock order: L1: C->A->B | +->F->D->E L2: F->D->E | +->C->A->B The lookup() process for namei("/var") mixes those two lock orders: VOP_LOOKUP() obtains B while A is held vfs_busy() obtains a shared lock on F while A and B are held (follows L1, violates L2) vput() releases lock on B VOP_UNLOCK() releases lock on A VFS_ROOT() obtains lock on D while shared lock on F is held vfs_unbusy() releases shared lock on F vn_lock() obtains lock on A while D is held (violates L1, follows L2) dounmount() follows L1 (B is locked while F is drained). Without unmount activity, vfs_busy() will always succeed without blocking and the deadlock isn't triggered (the system behaves as if L2 is followed). With unmount, you can get 4 processes in a deadlock: p1: holds D, want A (in lookup()) p2: holds shared lock on F, want D (in VFS_ROOT()) p3: holds B, want drain lock on F (in dounmount()) p4: holds A, want B (in VOP_LOOKUP()) You can have more than one instance of p2. The reversal was introduced in revision 1.81 of src/sys/kern/vfs_lookup.c and MFCed to revision 1.80.2.1, probably to avoid a cascade of vnode locks when nfs servers are dead (VFS_ROOT() just hangs) spreading to the root fs root vnode. - Tor Egge To fix the LOR, ups@ noted that when crossing the mount point, ni_dvp is actually not used by the callers of namei. Thus, placeholder deadfs vnode vp_crossmp is introduced that is filled into ni_dvp. Idea by: ups Reviewed by: tegge, ups, jeff, rwatson (mac interaction) Tested by: Peter Holm MFC after: 2 weeks
* - Deadfs should not use the std GETWRITEMOUNT routine. Add one that alwaysjeff2006-02-221-0/+14
| | | | | | returns NULL. MFC After: 1 week
* - Deadfs may now use the standard vop lock, get rid of dead_lock().jeff2005-03-131-40/+0
| | | | | | | - We no longer have to take the XLOCK state into consideration in any routines. Sponsored by: Isilon Systems, Inc.
* Introduce vx_wait{l}() and use it instead of home-rolled versions.phk2005-02-171-26/+5
|
* Whitespace in vop_vector{} initializations.phk2005-01-131-0/+1
|
* Change the generated VOP_ macro implementations to improve type checkingphk2005-01-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | and KASSERT coverage. After this check there is only one "nasty" cast in this code but there is a KASSERT to protect against the wrong argument structure behind that cast. Un-inlining the meat of VOP_FOO() saves 35kB of text segment on a typical kernel with no change in performance. We also now run the checking and tracing on VOP's which have been layered by nullfs, umapfs, deadfs or unionfs. Add new (non-inline) VOP_FOO_AP() functions which take a "struct foo_args" argument and does everything the VOP_FOO() macros used to do with checks and debugging code. Add KASSERT to VOP_FOO_AP() check for argument type being correct. Slim down VOP_FOO() inline functions to just stuff arguments into the struct foo_args and call VOP_FOO_AP(). Put function pointer to VOP_FOO_AP() into vop_foo_desc structure and make VCALL() use it instead of the current offsetoff() hack. Retire vcall() which implemented the offsetoff() Make deadfs and unionfs use VOP_FOO_AP() calls instead of VCALL(), we know which specific call we want already. Remove unneeded arguments to VCALL() in nullfs and umapfs bypass functions. Remove unused vdesc_offset and VOFFSET(). Generally improve style/readability of the generated code.
* /* -> /*- for copyright notices, minor format tweaks as necessaryimp2005-01-061-1/+1
|
* Back when VOP_* was introduced, we did not have new-style structphk2004-12-011-33/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | initializations but we did have lofty goals and big ideals. Adjust to more contemporary circumstances and gain type checking. Replace the entire vop_t frobbing thing with properly typed structures. The only casualty is that we can not add a new VOP_ method with a loadable module. History has not given us reason to belive this would ever be feasible in the the first place. Eliminate in toto VOCALL(), vop_t, VNODEOP_SET() etc. Give coda correct prototypes and function definitions for all vop_()s. Generate a bit more data from the vnode_if.src file: a struct vop_vector and protype typedefs for all vop methods. Add a new vop_bypass() and make vop_default be a pointer to another struct vop_vector. Remove a lot of vfs_init since vop_vector is ready to use from the compiler. Cast various vop_mumble() to void * with uppercase name, for instance VOP_PANIC, VOP_NULL etc. Implement VCALL() by making vdesc_offset the offsetof() the relevant function pointer in vop_vector. This is disgusting but since the code is generated by a script comparatively safe. The alternative for nullfs etc. would be much worse. Fix up all vnode method vectors to remove casts so they become typesafe. (The bulk of this is generated by scripts)
* Mechanically change prototypes for vnode operations to use the new typedefs.phk2004-12-011-8/+8
|
* Make VOP_BMAP return a struct bufobj for the underlying storage devicephk2004-11-151-2/+2
| | | | | | | | | instead of a vnode for it. The vnode_pager does not and should not have any interest in what the filesystem uses for backend. (vfs_cluster doesn't use the backing store argument.)
* Remove advertising clause from University of California Regent'simp2004-04-071-4/+0
| | | | | | | license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson. Approved by: core, peter, alc, rwatson
* Finish cleanup of vprint() which was begun with changing v_tag to a string.njl2003-03-031-1/+0
| | | | | | Remove extraneous uses of vop_null, instead defering to the default op. Rename vnode type "vfs" to the more descriptive "syncer". Fix formatting for various filesystems that use vop_print.
* Fix comments and one resulting code confusion about the type of thephk2002-10-161-1/+2
| | | | | | "command" argument to VOP_IOCTL. Spotted by: FlexeLint.
* Be consistent about "static" functions: if the function is markedphk2002-09-281-1/+1
| | | | | | static in its prototype, mark it static at the definition too. Inspired by: FlexeLint warning #512
* Remove any VOP_PRINT that redundantly prints the tag.njl2002-09-181-17/+1
| | | | | | Move lockmgr_printinfo() into vprint() for everyone's benefit. Suggested by: bde
* Remove all use of vnode->v_tag, replacing with appropriate substitutes.njl2002-09-141-1/+1
| | | | | | | | | | | | v_tag is now const char * and should only be used for debugging. Additionally: 1. All users of VT_NTS now check vfsconf->vf_type VFCF_NETWORK 2. The user of VT_PROCFS now checks for the new flag VV_PROCDEP, which is propagated by pseudofs to all child vnodes if the fs sets PFS_PROCDEP. Suggested by: phk Reviewed by: bde, rwatson (earlier version)
* - Replace v_flag with v_iflag and v_vflagjeff2002-08-041-4/+6
| | | | | | | | | | | | | | | - v_vflag is protected by the vnode lock and is used when synchronization with VOP calls is needed. - v_iflag is protected by interlock and is used for dealing with vnode management issues. These flags include X/O LOCK, FREE, DOOMED, etc. - All accesses to v_iflag and v_vflag have either been locked or marked with mp_fixme's. - Many ASSERT_VOP_LOCKED calls have been added where the locking was not clear. - Many functions in vfs_subr.c were restructured to provide for stronger locking. Idea stolen from: BSD/OS
* Use vop_panic() instead of rolling our own.phk2002-05-021-20/+8
|
* Remove __P.alfred2002-03-191-11/+11
|
* Undo part of the tangle of having sys/lock.h and sys/mutex.h included inmarkm2001-05-011-3/+2
| | | | | | | | | | | other "system" header files. Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h form kernel .c files. Sort sys/*.h includes where possible in affected files. OK'ed by: bde (with reservations)
* Change and clean the mutex lock interface.bmilekic2001-02-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mtx_enter(lock, type) becomes: mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized) similarily, for releasing a lock, we now have: mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument. The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind. Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two: MTX_QUIET and MTX_NOSWITCH The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers: mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively. Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case. Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled. Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those. Finally, caught up to the interface changes in all sys code. Contributors: jake, jhb, jasone (in no particular order)
* Give vop_mmap an untimely death. The opportunity to give it a timelyeivind2000-11-011-1/+0
| | | | death timed out in 1996.
* Convert lockmgr locks from using simple locks to using mutexes.jasone2000-10-041-1/+3
| | | | | | Add lockdestroy() and appropriate invocations, which corresponds to lockinit() and must be called to clean up after a lockmgr lock is no longer needed.
* Remove unneeded <sys/buf.h> includes.phk2000-04-181-1/+0
| | | | | Due to some interesting cpp tricks in lockmgr, the LINT kernel shrinks by 924 bytes.
* $Id$ -> $FreeBSD$peter1999-08-281-1/+1
|
* Return ENOTTY instead of EBADF for ioctls on dead vnodes. This fixesbde1998-11-221-2/+2
| | | | | tcsetpgrp() on controlling terminals that are no longer associated with the session of the calling process, not to mention ioctl.2.
* Enabled Lite2 fix for reading from dead ttys.bde1998-08-231-10/+1
|
* Add support for poll(2) on files. vop_nopoll() now returns POLLNVALwollman1997-12-151-5/+20
| | | | | | | | | | | | | | | | | | | if one of the new poll types is requested; hopefully this will not break any existing code. (This is done so that programs have a dependable way of determining whether a filesystem supports the extended poll types or not.) The new poll types added are: POLLWRITE - file contents may have been modified POLLNLINK - file was linked, unlinked, or renamed POLLATTRIB - file's attributes may have been changed POLLEXTEND - file was extended Note that the internal operation of poll() means that it is impossible for two processes to reliably poll for the same event (this could be fixed but may not be worth it), so it is not possible to rewrite `tail -f' to use poll at this time.
* Don't include <sys/lock.h> in headers when only `struct simplelock' isbde1997-12-051-1/+2
| | | | required. Fixed everything that depended on the pollution.
* VFS interior redecoration.phk1997-10-261-22/+11
| | | | | | | | | | | | | Rename vn_default_error to vop_defaultop all over the place. Move vn_bwrite from vfs_bio.c to vfs_default.c and call it vop_stdbwrite. Use vop_null instead of nullop. Move vop_nopoll from vfs_subr.c to vfs_default.c Move vop_sharedlock from vfs_subr.c to vfs_default.c Move vop_nolock from vfs_subr.c to vfs_default.c Move vop_nounlock from vfs_subr.c to vfs_default.c Move vop_noislocked from vfs_subr.c to vfs_default.c Use vop_ebadf instead of *_ebadf. Add vop_defaultop for getpages on master vnode in MFS.
* VFS clean up "hekto commit"phk1997-10-161-3/+1
| | | | | | | | | | 1. Add defaults for more VOPs VOP_LOCK vop_nolock VOP_ISLOCKED vop_noislocked VOP_UNLOCK vop_nounlock and remove direct reference in filesystems. 2. Rename the nfsv2 vnop tables to improve sorting order.
* Another VFS cleanup "kilo commit"phk1997-10-161-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. Remove VOP_UPDATE, it is (also) an UFS/{FFS,LFS,EXT2FS,MFS} intereface function, and now lives in the ufsmount structure. 2. Remove VOP_SEEK, it was unused. 3. Add mode default vops: VOP_ADVLOCK vop_einval VOP_CLOSE vop_null VOP_FSYNC vop_null VOP_IOCTL vop_enotty VOP_MMAP vop_einval VOP_OPEN vop_null VOP_PATHCONF vop_einval VOP_READLINK vop_einval VOP_REALLOCBLKS vop_eopnotsupp And remove identical functionality from filesystems 4. Add vop_stdpathconf, which returns the canonical stuff. Use it in the filesystems. (XXX: It's probably wrong that specfs and fifofs sets this vop, shouldn't it come from the "host" filesystem, for instance ufs or cd9660 ?) 5. Try to make system wide VOP functions have vop_* names. 6. Initialize the um_* vectors in LFS. (Recompile your LKMS!!!)
* VFS mega cleanup commit (x/N)phk1997-10-161-82/+25
| | | | | | | | | | | | | | | | | | | | | | | 1. Add new file "sys/kern/vfs_default.c" where default actions for VOPs go. Implement proper defaults for ABORTOP, BWRITE, LEASE, POLL, REVOKE and STRATEGY. Various stuff spread over the entire tree belongs here. 2. Change VOP_BLKATOFF to a normal function in cd9660. 3. Kill VOP_BLKATOFF, VOP_TRUNCATE, VOP_VFREE, VOP_VALLOC. These are private interface functions between UFS and the underlying storage manager layer (FFS/LFS/MFS/EXT2FS). The functions now live in struct ufsmount instead. 4. Remove a kludge of VOP_ functions in all filesystems, that did nothing but obscure the simplicity and break the expandability. If a filesystem doesn't implement VOP_FOO, it shouldn't have an entry for it in its vnops table. The system will try to DTRT if it is not implemented. There are still some cruft left, but the bulk of it is done. 5. Fix another VCALL in vfs_cache.c (thanks Bruce!)
* Hmm, realign the vnops into two columns.phk1997-10-151-8/+8
|
* Stylistic overhaul of vnops tables.phk1997-10-151-49/+42
| | | | | | | 1. Remove comment stating the blatantly obvious. 2. Align in two columns. 3. Sort all but the default element alphabetically. 4. Remove XXX comments pointing out entries not needed.
* Convert select -> poll.peter1997-09-141-21/+10
| | | | | Delete 'always succeed' select/poll handlers, replaced with generic call. Flag missing vnode op table entries.
* Removed unused #includes.bde1997-09-021-5/+1
|
* Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are notpeter1997-02-221-1/+1
| | | | ready for it yet.
* This is the kernel Lite/2 commit. There are some requisite userlanddyson1997-02-101-4/+24
| | | | | | | | | | | | | | | changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes. The system boots and can mount UFS filesystems. Untested: ext2fs, msdosfs, NFS Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed. Reviewed by: various people Submitted by: Jeffery Hsu <hsu@freebsd.org>
* Make the long-awaited change from $Id$ to $FreeBSD$jkh1997-01-141-1/+1
| | | | | | | | This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long. Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
* staticize.phk1995-12-021-27/+27
|
* Removed unsed function dead_nullop().bde1995-11-111-13/+3
| | | | Converted incomplete function declarations to prototypes.
* Introduced a type `vop_t' for vnode operation functions and usedbde1995-11-091-44/+44
| | | | | | | | | | | | | | | it 1138 times (:-() in casts and a few more times in declarations. This change is null for the i386. The type has to be `typedef int vop_t(void *)' and not `typedef int vop_t()' because `gcc -Wstrict-prototypes' warns about the latter. Since vnode op functions are called with args of different (struct pointer) types, neither of these function types is any use for type checking of the arg, so it would be preferable not to use the complete function type, especially since using the complete type requires adding 1138 casts to avoid compiler warnings and another 40+ casts to reverse the function pointer conversions before calling the functions.
* Added VOP_GETPAGES/VOP_PUTPAGES and also the "backwards" block countdyson1995-09-041-2/+3
| | | | for VOP_BMAP. Updated affected filesystems...
* Cosmetics: added a #include and a static prototype to silence gcc.phk1994-10-081-1/+3
|
* Use tsleep() rather than sleep so that 'ps' is more informative aboutdg1994-10-061-2/+2
| | | | the wait.
* Implemented loadable VFS modules, and made most existing filesystemswollman1994-09-211-1/+4
| | | | loadable. (NFS is a notable exception.)
OpenPOWER on IntegriCloud