path: root/sys/kern/vfs_subr.c
Commit message [Author, Age, Files, Lines]
* Be more friendly for DDB pager. [pjd, 2008-05-18, 1 file, -1/+6]
| | | | Educated by: jhb's BSDCan presentation
* sync_vnode() has some messy code about locking in order to deal with [attilio, 2008-05-04, 1 file, -39/+37]
| | | | | | | | | | | mount fs needing Giant to be held when processing bufobjs. Use a different subqueue for pending workitems on filesystems requiring Giant. This simplifies the code notably and also reduces the number of Giant acquisitions (and the whole processing cost). Suggested by: jeff Reviewed by: kib Tested by: pho
* Implement 'show mount' command in DDB. Without argument, it prints short [pjd, 2008-04-26, 1 file, -0/+152]
| | | | | | | info about all currently mounted file systems. When an address is given as an argument, prints detailed info about the given mount point. MFC after: 2 weeks
* Allow the vnode zone to return the unused memory. The vnode reference [kib, 2008-04-24, 1 file, -2/+2]
| | | | | | | | count is/shall be properly maintained for the long time, and VFS shall be safe against the vnode memory reclamation. Proposed by: jeff Tested by: pho
* Move the head of byte-level advisory lock list from the [kib, 2008-04-16, 1 file, -0/+5]
| | | | | | | | | | | | | | | | | | | | | | filesystem-specific vnode data to the struct vnode. Provide the default implementation for the vop_advlock and vop_advlockasync. Purge the locks on the vnode reclaim by using the lf_purgelocks(). The default implementation is augmented for the nfs and smbfs. In the nfs_advlock, push the Giant inside the nfs_dolock. Before the change, the vop_advlock and vop_advlockasync have taken the unlocked vnode and dereferenced the fs-private inode data, racing with the vnode reclamation due to forced unmount. Now, the vop_getattr under the shared vnode lock is used to obtain the inode size, and later, in the lf_advlockasync, after locking the vnode interlock, the VI_DOOMED flag is checked to prevent an operation on the doomed vnode. The implementation of the lf_purgelocks() is submitted by dfr. Reported by: kris Tested by: kris, pho Discussed with: jeff, dfr MFC after: 2 weeks
* - Destroy the bo mtx when the vnode is destroyed. [jeff, 2008-04-02, 1 file, -0/+1]
|
* b_waiters cannot be adequately protected by the interlock because it is [attilio, 2008-03-28, 1 file, -5/+1]
| | | | | | | | | | | | | | | | dropped after the call to lockmgr() so just revert this approach using something similar to the precedent one: BUF_LOCKWAITERS() just checks if there are waiters (not the actual number of them) and it is based on newly introduced lockmgr_waiters() which returns if the lockmgr has waiters or not. The name has been chosen to differ from the old lockwaiters() in order not to confuse them. KPI results enriched by this commit so __FreeBSD_version bumping and manpage update will be happening soon. 'struct buf' also changes, so kernel ABI is disturbed. Bug found by: jeff Approved by: jeff, kib
* - Greatly simplify vget() by removing the guarantee that any new [jeff, 2008-03-24, 1 file, -32/+18]
| | | | | | | | | | references to a vnode with VI_OWEINACT set will force the vinactive() call. The kernel makes no guarantees about which reference was the last to close a file or when the actual inactive processing will happen. The previous code was designed to preserve existing semantics in the face of shared locks, however, this was unnecessary. Discussed with: mckusick
* - Only return 1 from sync_vnode() in cases where the vnode is still [jeff, 2008-03-23, 1 file, -1/+1]
| | | | | | | at the head of the sync list. This prevents sched_sync() from re-queueing a vnode which may have been freed already. Discussed with: kib
* - Pass BO_MTX(bo) to lockmgr in vtruncbuf, we don't own the vnode [jeff, 2008-03-23, 1 file, -1/+1]
| | | | | | interlock here anymore. Reported by: kris
* - Complete part of the unfinished bufobj work by consistently using [jeff, 2008-03-22, 1 file, -25/+29]
| | | | | | | | | | | | | | | | | BO_LOCK/UNLOCK/MTX when manipulating the bufobj. - Create a new lock in the bufobj to lock bufobj fields independently. This leaves the vnode interlock as an 'identity' lock while the bufobj is an io lock. The bufobj lock is ordered before the vnode interlock and also before the mnt ilock. - Exploit this new lock order to simplify softdep_check_suspend(). - A few sync related functions are marked with a new XXX to note that we may not properly interlock against a non-zero bv_cnt when attempting to sync all vnodes on a mountlist. I do not believe this race is important. If I'm wrong this will make these locations easier to find. Reviewed by: kib (earlier diff) Tested by: kris, pho (earlier diff)
* In keeping with style(9)'s recommendations on macros, use a ';' [rwatson, 2008-03-16, 1 file, -3/+4]
| | | | | | | | | after each SYSINIT() macro invocation. This makes a number of lightweight C parsers much happier with the FreeBSD kernel source, including cflow's prcc and lxr. MFC after: 1 month Discussed with: imp, rink
* - Handle buffer lock waiters count directly in the buffer cache instead [attilio, 2008-03-01, 1 file, -1/+5]
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | of relying on the lockmgr support [1]: * bump the waiters only if the interlock is held * let brelvp() return the waiters count * rely on brelvp() instead of BUF_LOCKWAITERS() in order to check for the waiters number - Remove a namespace pollution introduced recently with lockmgr.h including lock.h by including lock.h directly in the consumers and making it mandatory for using lockmgr. - Modify flags accepted by lockinit(): * introduce LK_NOPROFILE which disables lock profiling for the specified lockmgr * introduce LK_QUIET which disables ktr tracing for the specified lockmgr [2] * disallow LK_SLEEPFAIL and LK_NOWAIT to be passed there so that it can only be used on a per-instance basis - Remove BUF_LOCKWAITERS() and lockwaiters() as they are no longer used This patch breaks KPI so __FreeBSD_version will be bumped and manpages updated by further commits. Additionally, 'struct buf' changes result in a disturbed ABI also. [2] Really, currently there is no ktr tracing in the lockmgr, but it will be added soon. [1] Submitted by: kib Tested by: pho, Andrea Barberio <insomniac at slackware dot it>
* Axe the 'thread' argument from VOP_ISLOCKED() and lockstatus() as it is [attilio, 2008-02-25, 1 file, -13/+11]
| | | | | | | | | always curthread. As KPI gets broken by this patch, manpages and __FreeBSD_version will be updated by further commits. Tested by: Andrea Barberio <insomniac at slackware dot it>
* Convert all explicit instances to VOP_ISLOCKED(arg, NULL) into [attilio, 2008-02-08, 1 file, -5/+6]
| | | | | | | | VOP_ISLOCKED(arg, curthread). Now, VOP_ISLOCKED() and lockstatus() should only acquire curthread as argument; this will lead to axing the additional argument from both functions, making the code cleaner. Reviewed by: jeff, kib
* Cleanup lockmgr interface and exported KPI: [attilio, 2008-01-24, 1 file, -2/+2]
| | | | | | | | | | | | | | | | | | | | - Remove the "thread" argument from the lockmgr() function as it is always curthread now - Axe lockcount() function as it is no longer used - Axe LOCKMGR_ASSERT() as it is bogus really and not currently used. Hopefully this will soon be replaced by something suitable for it. - Remove the prototype for dumplockinfo() as the function is no longer present Additionally: - Introduce a KASSERT() in lockstatus() in order to let it accept only curthread or NULL as they should only be passed - Do a little bit of style(9) cleanup on lockmgr.h KPI results heavily broken by this change, so manpages and FreeBSD_version will be modified accordingly by further commits. Tested by: matteo
* - Introduce the function lockmgr_recursed() which returns true if the [attilio, 2008-01-19, 1 file, -1/+1]
| | | | | | | | | | | | | | | | | | | lockmgr lkp, when held in exclusive mode, is recursed - Introduce the function BUF_RECURSED() which does the same for bufobj locks based on the top of lockmgr_recursed() - Introduce the function BUF_ISLOCKED() which works like the counterpart VOP_ISLOCKED(9), showing the state of lockmgr linked with the bufobj BUF_RECURSED() and BUF_ISLOCKED() entirely replace the usage of bogus BUF_REFCNT() in a more explicative and SMP-compliant way. This allows us to axe out BUF_REFCNT() and leaving the function lockcount() totally unused in our stock kernel. Further commits will axe lockcount() as well as part of lockmgr() cleanup. KPI results, obviously, broken so further commits will update manpages and freebsd version. Tested by: kris (on UFS and NFS)
* VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in [attilio, 2008-01-13, 1 file, -20/+19]
| | | | | | | | | | | conjunction with 'thread' argument passing which is always curthread. Remove the unuseful extra-argument and pass explicitly curthread to lower layer functions, when necessary. KPI results broken by this change, which should affect several ports, so version bumping and manpage update will be further committed. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
* vn_lock() is currently only used with the 'curthread' passed as argument. [attilio, 2008-01-10, 1 file, -6/+6]
| | | | | | | | | | | | | | | | Remove this argument and pass curthread directly to underlying VOP_LOCK1() VFS method. This modification makes the code cleaner and in particular removes an annoying dependence, helping the next lockmgr() cleanup. KPI results, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, would be valuable to say that next commits will address a similar cleanup about VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>
* In "show lockedvnods" DDB command, use db_printf() rather than printf() [rwatson, 2007-12-28, 1 file, -1/+1]
| | | | | | | | | so that the results end up in the DDB output stream rather than the console output stream. This should likely also be done for the vprint() function it calls. MFC after: 3 months
* As LK_EXCLUPGRADE is used in conjunction with LK_NOWAIT, LK_UPGRADE becomes [attilio, 2007-12-27, 1 file, -1/+1]
| | | | | | | | | | equivalent with this and so operate the switch. That call is the only one remaining LK_EXCLUPGRADE consumer and removing it will prepare the ground for LK_EXCLUPGRADE axing and further lockmgr improvements. Discussed with: jeff, ups
* Add a new 'why' argument to kdb_enter(), and a set of constants to use [rwatson, 2007-12-25, 1 file, -2/+2]
| | | | | | | | | for that argument. This will allow DDB to detect the broad category of reason why the debugger has been entered, which it can use for the purposes of deciding which DDB script to run. Assign approximate why values to all current consumers of the kdb_enter() interface.
* Use curthread instead of the FIRST_THREAD_IN_PROC for vnlru and syncer, [kib, 2007-12-05, 1 file, -15/+42]
| | | | | | | | | | | | | | | | | when applicable. Acquire Giant slightly later for vnlru. In the syncer, acquire the Giant only when a vnode belongs to the non-MPsafe fs. In both speedup_syncer() and syncer_shutdown(), remove the syncer thread from the lbolt sleep queue after the syncer state is modified, not before. Herded by: attilio Tested by: Peter Holm Reviewed by: ups MFC after: 1 week
* Merge first in a series of TrustedBSD MAC Framework KPI changes [rwatson, 2007-10-24, 1 file, -3/+3]
| | | | | | | | | | | | | | | | | | | | | | | from Mac OS X Leopard--rationalize naming for entry points to the following general forms: mac_<object>_<method/action> mac_<object>_check_<method/action> The previous naming scheme was inconsistent and mostly reversed from the new scheme. Also, make object types more consistent and remove spaces from object types that contain multiple parts ("posix_sem" -> "posixsem") to make mechanical parsing easier. Introduce a new "netinet" object type for certain IPv4/IPv6-related methods. Also simplify, slightly, some entry point names. All MAC policy modules will need to be recompiled, and modules not updated as part of this commit will need to be modified to conform to the new KPI. Sponsored by: SPARTA (original patches against Mac OS X) Obtained from: TrustedBSD Project, Apple Computer
* Rename the kthread_xxx (e.g. kthread_create()) calls [julian, 2007-10-20, 1 file, -2/+2]
| | | | | | | | | | | to kproc_xxx as they actually make whole processes. This makes way for us to add REAL kthread_create() and friends that actually make threads. It turns out that most of these calls actually end up being moved back to the thread version when it's added, but we need to make this cosmetic change first. I'd LOVE to do this rename in 7.0 so that we can eventually MFC the new kthread_xxx() calls.
* When restoring the mount after umount failed, the MNTK_UNMOUNT flag [kib, 2007-09-12, 1 file, -1/+1]
| | | | | | | | | | | | | | | prevents insmntque() from placing reallocated syncer vnode on mount list, that causes panic in vfs_allocate_syncvnode(). Introduce MNTK_NOINSMNTQ flag, that marks the period when insmntque() is not allowed to succeed, instead of MNTK_UNMOUNT. The MNTK_NOINSMNTQ is set and cleared simultaneously with MNTK_UNMOUNT, except on umount error path, where it is cleared just before the syncer vnode is going to be allocated. Reported by: Peter Jeremy <peterjeremy optushome com au> Suggested by: tegge Approved by: re (rwatson)
* Improve vn_printf() by: [pjd, 2007-08-13, 1 file, -7/+45]
| | | | | | | | - adding missing vnode flags, - printing unknown flags as numbers, - using strlcat() instead of strcat(). Approved by: re (bmah)
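The three improvements listed above (naming the known flags, printing leftover bits as a number, and bounding every append) can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel's vn_printf(): the `XV_*` flag names and values, `bounded_append()` and `format_flags()` are hypothetical, and `snprintf()` stands in for `strlcat()` because the latter is not available in every libc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flag bits for illustration; not the kernel's values. */
#define XV_ROOT   0x0001u
#define XV_TEXT   0x0002u
#define XV_SYSTEM 0x0004u

struct flag_name {
	unsigned int bit;
	const char *name;
};

static const struct flag_name flag_names[] = {
	{ XV_ROOT, "XV_ROOT" },
	{ XV_TEXT, "XV_TEXT" },
	{ XV_SYSTEM, "XV_SYSTEM" },
};

/*
 * Append src to the NUL-terminated string in dst without writing past
 * size bytes in total. Output is silently truncated if it does not fit.
 */
static void
bounded_append(char *dst, size_t size, const char *src)
{
	size_t used = strlen(dst);

	assert(used < size);
	snprintf(dst + used, size - used, "%s", src);
}

/*
 * Render flags into buf as "NAME|NAME|0x...": known bits by name,
 * any remaining unknown bits as one hexadecimal number.
 */
static void
format_flags(unsigned int flags, char *buf, size_t size)
{
	char num[32];
	size_t i;

	assert(size > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(flag_names) / sizeof(flag_names[0]); i++) {
		if ((flags & flag_names[i].bit) == 0)
			continue;
		if (buf[0] != '\0')
			bounded_append(buf, size, "|");
		bounded_append(buf, size, flag_names[i].name);
		flags &= ~flag_names[i].bit;	/* mark this bit as handled */
	}
	if (flags != 0) {
		/* Whatever is left has no name; show it numerically. */
		snprintf(num, sizeof(num), "%s0x%x",
		    buf[0] != '\0' ? "|" : "", flags);
		bounded_append(buf, size, num);
	}
}
```

Clearing each named bit as it is printed is what makes the "unknown flags" case fall out naturally: anything still set after the loop has no entry in the table.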
* Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in [rwatson, 2007-06-12, 1 file, -5/+5]
| | | | | | | | | | | | | | | some cases, move to priv_check() if it was an operation on a thread and no other flags were present. Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c. We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h. Reviewed by: csjp Obtained from: TrustedBSD Project
* Revert VMCNT_* operations introduction. [attilio, 2007-05-31, 1 file, -4/+3]
| | | | | | | | Probably, a general approach is not the best solution here, so we should solve the sched_lock protection problems separately. Requested by: alc Approved by: jeff (mentor)
* Universally adopt most conventional spelling of acquire. [rwatson, 2007-05-27, 1 file, -1/+1]
|
* Since renaming of vop_lock to _vop_lock, pre- and post-condition [kib, 2007-05-18, 1 file, -3/+3]
| | | | | | function calls are no more generated for vop_lock. Rename _vop_lock to vop_lock1 to satisfy tools/vnode_if.awk assumption about vop naming conventions. This restores pre/post-condition calls.
* - define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating [jeff, 2007-05-18, 1 file, -3/+4]
| | | | | | | | vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines. Contributed by: Attilio Rao <attilio@FreeBSD.org>
* Fix jails and jail-friendly file systems handling: [pjd, 2007-04-13, 1 file, -0/+24]
| | | | | | | | - We need to allow for PRIV_VFS_MOUNT_OWNER inside a jail. - Move security checks to vfs_suser() and deny unmounting and updating for jailed root from different jails, etc. OK'ed by: rwatson
* When we are running low on vnodes, there is currently no way to ask other [pjd, 2007-04-13, 1 file, -0/+1]
| | | | | subsystems to release some vnodes. Implement backpressure based on vfs_lowvnodes event (similar to vm_lowmem for memory).
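The backpressure idea in the entry above can be illustrated with a small callback registry in userspace C. This is a sketch under stated assumptions and does not use the kernel's event handler interface: every name here (`lowvnodes_register()`, `lowvnodes_notify()`, `struct demo_cache`) is hypothetical, and locking is omitted.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HANDLERS 8

/* Callback type: shortfall is how many vnodes the caller would like freed. */
typedef void (*lowvnodes_fn)(void *arg, int shortfall);

struct lowvnodes_handler {
	lowvnodes_fn fn;
	void *arg;
};

static struct lowvnodes_handler handlers[MAX_HANDLERS];
static size_t nhandlers;

/* Register a subsystem callback; returns 0 on success, -1 if the table is full. */
static int
lowvnodes_register(lowvnodes_fn fn, void *arg)
{
	if (nhandlers == MAX_HANDLERS)
		return (-1);
	handlers[nhandlers].fn = fn;
	handlers[nhandlers].arg = arg;
	nhandlers++;
	return (0);
}

/*
 * Called by the allocator. If in_use exceeds limit, ask every registered
 * subsystem to release vnodes. Returns the number of handlers notified.
 */
static size_t
lowvnodes_notify(int in_use, int limit)
{
	size_t i;

	if (in_use <= limit)
		return (0);
	for (i = 0; i < nhandlers; i++)
		handlers[i].fn(handlers[i].arg, in_use - limit);
	return (nhandlers);
}

/* Example subsystem: a cache that drops up to shortfall cached entries. */
struct demo_cache {
	int cached;
};

static void
demo_cache_lowvnodes(void *arg, int shortfall)
{
	struct demo_cache *c = arg;
	int n = shortfall < c->cached ? shortfall : c->cached;

	c->cached -= n;
}
```

The allocator does not need to know which subsystems hold vnodes; it only announces the shortage, and each subscriber decides how much it can give back.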
* Minor style cleanups (mostly removal of trailing whitespaces). [pjd, 2007-04-10, 1 file, -22/+22]
|
* Correct typos. [pjd, 2007-04-10, 1 file, -1/+1]
|
* Now that the vdropl() function is public, assert that the vnode interlock [pjd, 2007-04-01, 1 file, -0/+1]
| | | | is held.
* Make vdropl() public; zfs needs it. There is also plenty of existing [des, 2007-03-31, 1 file, -2/+1]
| | | | | | | | | | | | | | | | | file system code (mostly *_reclaim()) which look like this: VOP_LOCK(vp); /* examine vp */ VOP_UNLOCK(vp); vdrop(vp); This can now be rewritten to: VOP_LOCK(vp); /* examine vp */ vdropl(vp); /* will unlock vp */ MFC after: 1 week
* PowerPC is the only architecture with mpsafe_vfs=0. This is now [marcel, 2007-03-27, 1 file, -4/+0]
| | | | | broken. Rudimentary tests show that PowerPC can run with mpsafe_vfs=1. Make it so...
* Make insmntque() externally visible and allow it to fail (e.g. during [tegge, 2007-03-13, 1 file, -5/+40]
| | | | | | | | | | | | | | | | | | | | | | late stages of unmount). On failure, the vnode is recycled. Add insmntque1(), to allow for file system specific cleanup when recycling vnode on failure. Change getnewvnode() to no longer call insmntque(). Previously, embryonic vnodes were put onto the list of vnode belonging to a file system, which is unsafe for a file system marked MPSAFE. Change vfs_hash_insert() to no longer lock the vnode. The caller now has that responsibility. Change most file systems to lock the vnode and call insmntque() or insmntque1() after a new vnode has been sufficiently set up. Handle failed insmntque*() calls by propagating errors to callers, possibly after some file system specific cleanup. Approved by: re (kensmith) Reviewed by: kib In collaboration with: kib
* change vop_lock handling to allow tracking of callers' file and line for [kmacy, 2006-11-13, 1 file, -3/+3]
| | | | | | acquisition of lockmgr locks Approved by: scottl (standing in for mentor rwatson)
* Simplify operations with sync_mtx in sched_sync(): [jhb, 2006-11-07, 1 file, -7/+3]
| | | | | | | | | - Don't drop the lock just to reacquire it again to check rushjob, this only wastes time. - Use msleep() to drop the mutex while sleeping instead of explicitly unlocking around tsleep. Reviewed by: pjd
* Fix comment typo and function declaration. [jhb, 2006-11-07, 1 file, -2/+2]
|
* Sweep kernel replacing suser(9) calls with priv(9) calls, assigning [rwatson, 2006-11-06, 1 file, -40/+23]
| | | | | | | | | | | | | specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>
* Typo, 'from' vnode is locked here, not 'to' vnode. [pjd, 2006-11-04, 1 file, -1/+1]
|
* Add gjournal specific code to the UFS file system: [pjd, 2006-10-31, 1 file, -0/+2]
| | | | | | | | | | | | | | | | - Add FS_GJOURNAL flag which enables gjournal support on a file system. - Add cg_unrefs field to the cylinder group structure which holds number of unreferenced (orphaned) inodes in the given cylinder group. - Add fs_unrefs field to the super block structure which holds total number of unreferenced (orphaned) inodes. - When file or a directory is orphaned (last reference is removed, but object is still open), increase fs_unrefs and cg_unrefs fields, which is a hint for fsck in which cylinder groups to look for such (orphaned) objects. - When file is last closed, decrease {fs,cg}_unrefs fields. - Add VV_DELETED vnode flag which points at orphaned objects. Sponsored by: home.pl
* Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h [rwatson, 2006-10-22, 1 file, -1/+2]
| | | | | | | | | | | | | begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead. This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd. Obtained from: TrustedBSD Project Sponsored by: SPARTA
* Correct the comment: numvnodes is decreased on vdestroying the vnode. [kib, 2006-10-02, 1 file, -1/+2]
| | | | | | OKed by: tegge Approved by: pjd (mentor) MFC after: 1 week
* Add mnt_noasync counter to better handle interleaved calls to nmount(), [tegge, 2006-09-26, 1 file, -5/+6]
| | | | | | sync() and sync_fsync() without losing MNT_ASYNC. Add MNTK_ASYNC flag which is set only when MNT_ASYNC is set and mnt_noasync is zero, and check that flag instead of MNT_ASYNC before initiating async io.
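The rule stated above, that the derived flag is set only while MNT_ASYNC is set and the mnt_noasync counter is zero, can be sketched in userspace C. This is a minimal illustration under stated assumptions rather than the committed code: `struct xmount`, the `XMNT_*` constants and all function names are hypothetical, and the locking a real implementation needs is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define XMNT_ASYNC  0x1u	/* hypothetical: async mode was requested */
#define XMNTK_ASYNC 0x1u	/* hypothetical: async I/O may be started now */

struct xmount {
	unsigned int flag;	/* requested mount flags */
	unsigned int kern_flag;	/* derived, internal flags */
	int noasync;		/* callers currently forbidding async I/O */
};

/* Recompute the derived flag from the requested flag and the counter. */
static void
xmount_update_async(struct xmount *mp)
{
	if ((mp->flag & XMNT_ASYNC) != 0 && mp->noasync == 0)
		mp->kern_flag |= XMNTK_ASYNC;
	else
		mp->kern_flag &= ~XMNTK_ASYNC;
}

/* A caller (for example a sync pass) temporarily forbids async I/O. */
static void
xmount_noasync_enter(struct xmount *mp)
{
	mp->noasync++;
	xmount_update_async(mp);
}

/* The caller is done; async resumes only when the last caller leaves. */
static void
xmount_noasync_exit(struct xmount *mp)
{
	assert(mp->noasync > 0);
	mp->noasync--;
	xmount_update_async(mp);
}

/* I/O paths test the derived flag, never the requested one. */
static bool
xmount_async_allowed(const struct xmount *mp)
{
	return ((mp->kern_flag & XMNTK_ASYNC) != 0);
}
```

Because callers only touch the counter, the requested flag is never cleared and restored, so interleaved callers cannot lose it.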
* Use mount interlock to protect all changes to mnt_flag and mnt_kern_flag. [tegge, 2006-09-26, 1 file, -0/+4]
| | | | | This eliminates a race where MNT_UPDATE flag could be lost when nmount() raced against sync(), sync_fsync() or quotactl().