path: root/sys/kern/vfs_subr.c
Commit message / Author / Age / Files / Lines
* Tidy up a capabilities-related comment.
  This comment refers to an #ifdef that hasn't been merged [yet?]; remove it.
  Approved by: rwatson
  (jonathan, 2011-06-24, 1 file, -3/+0)
* Use a name instead of a magic number for kern_yield(9) when the priority should not change. Fetch the td_user_pri under the thread lock. This is probably not necessary but a magic number also seems preferable to knowing the implementation details here.
  Requested by: Jason Behmer < jason DOT behmer AT isilon DOT com >
  (mdf, 2011-05-13, 1 file, -2/+2)
* Add watchdog patting during the (shutdown-time) disk syncing and disk dumping. With the option SW_WATCHDOG on, these operations are doomed to let the watchdog fire if they take too long. I implemented the stubs this way because I really want the wdog_kern_* KPI not to depend on SW_WATCHDOG being on (and really, the option only enables watchdog activation in hardclock) and also to avoid calling them when not necessary (avoiding involuntary watchdog activations).
  Sponsored by: Sandvine Incorporated
  Discussed with: emaste, des
  MFC after: 2 weeks
  (attilio, 2011-04-28, 1 file, -0/+8)
* Fix a LOR in vfs_busy() where, after msleeping, it would lock the mutexes in the wrong order for the case where MBF_MNTLSTLOCK is set. I believe this did have the potential for deadlock, for example if multiple nfsd threads called vfs_busyfs(), which calls vfs_busy() with MBF_MNTLSTLOCK. Thanks go to pho for catching this during his testing.
  Tested by: pho
  Submitted by: kib
  MFC after: 2 weeks
  (rmacklem, 2011-04-23, 1 file, -1/+2)
* Remove malloc type M_NETADDR, unused since splitting into vfs_subr.c and vfs_export.c.
  MFC after: 1 week
  (pluknet, 2011-04-04, 1 file, -2/+0)
* Do not assert buffer lock in VFS_STRATEGY() when the kernel has already panicked.
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  (kib, 2011-03-08, 1 file, -1/+1)
* Based on discussions on the svn-src mailing list, rework r218195:
  - entirely eliminate some calls to uio_yield() as being unnecessary, such as in a sysctl handler.
  - move should_yield() and maybe_yield() to kern_synch.c and move the prototypes from sys/uio.h to sys/proc.h
  - add a slightly more generic kern_yield() that can replace the functionality of uio_yield().
  - replace source uses of uio_yield() with the functional equivalent, or in some cases do not change the thread priority when switching.
  - fix a logic inversion bug in vlrureclaim(), pointed out by bde@.
  - instead of using the per-cpu last switched ticks, use a per thread variable for should_yield(). With PREEMPTION, the only reasonable use of this is to determine if a lock has been held a long time and relinquish it. Without PREEMPTION, this is essentially the same as the per-cpu variable.
  (mdf, 2011-02-08, 1 file, -4/+4)
* Put the general logic for being a CPU hog into a new function should_yield(). Use this in various places. Encapsulate the common case of check-and-yield into a new function maybe_yield(). Change several checks for a magic number of iterations to use should_yield() instead.
  MFC after: 1 week
  (mdf, 2011-02-02, 1 file, -2/+2)
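As a rough illustration of the check-and-yield pattern this commit describes, the following userspace sketch models should_yield() and maybe_yield(). The demo_ names, the explicit tick counter, the threshold of 4 ticks, and the boolean return value are assumptions made for the example and are not the kernel's actual code; sched_yield(2) stands in for the kernel's yield primitive.

```c
#include <assert.h>
#include <sched.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-thread state: the tick at which the thread last yielded. */
struct demo_thread {
	unsigned int switch_ticks;
};

/* Stand-in for a global tick counter (assumption for this sketch). */
static unsigned int demo_ticks;

/* Assumed threshold: a thread that has run this many ticks is a "CPU hog". */
static const unsigned int demo_hogticks = 4;

/* Report whether the thread has run long enough that it ought to yield. */
static bool
demo_should_yield(const struct demo_thread *td)
{
	assert(td != NULL);
	/* Unsigned subtraction keeps the comparison valid across wraparound. */
	return (demo_ticks - td->switch_ticks >= demo_hogticks);
}

/* Check-and-yield in one call; returns true if the thread actually yielded. */
static bool
demo_maybe_yield(struct demo_thread *td)
{
	if (!demo_should_yield(td))
		return (false);
	sched_yield();
	td->switch_ticks = demo_ticks;
	return (true);
}
```

A long-running loop would call demo_maybe_yield() once per iteration instead of counting iterations against a magic number, which is the substitution the commit message describes.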
* When vtruncbuf() iterates over the vnode buffer list, lock the buffer object before checking the validity of the next buffer pointer. Otherwise, the buffer might be reclaimed after the check, causing the iteration to run into the wrong buffer.
  Reported and tested by: pho
  MFC after: 1 week
  (kib, 2011-01-25, 1 file, -2/+5)
* Specify a CTLTYPE_FOO so that a future sysctl(8) change does not need to rely on the format string.
  (mdf, 2011-01-18, 1 file, -2/+4)
* sysctl(9) cleanup checkpoint: amd64 GENERIC builds cleanly. Commit the kernel changes.
  (mdf, 2011-01-12, 1 file, -4/+4)
* - Restore dropping the priority of syncer down to PPAUSE when it is idle. This was lost when it was converted to using a condition variable instead of lbolt.
  - Drop the priority of flowtable down to PPAUSE when it is idle as well since it is a similar background task.
  MFC after: 2 weeks
  (jhb, 2011-01-06, 1 file, -0/+7)
* Teach ddb "show mount" about MNTK_SUJ flag.
  (kib, 2010-12-27, 1 file, -0/+1)
* Allow shared-locked vnode to be passed to vunref(9). When a shared-locked vnode is supplied as an argument to vunref(9) and the resulting usecount is 0, set VI_OWEINACT and do not try to upgrade the vnode lock. The latter could cause vnode unlock, allowing the vnode to be reclaimed in the meantime.
  Tested by: pho
  MFC after: 1 week
  (kib, 2010-11-24, 1 file, -5/+15)
* Remove prtactive variable and related printf()s in the vop_inactive and vop_reclaim() methods. They seem to be unused, and the reported situation is normal for the forced unmount.
  MFC after: 1 week
  X-MFC-note: keep prtactive symbol in vfs_subr.c
  (kib, 2010-11-19, 1 file, -3/+0)
* Fix some more style(9) issues.
  (brucec, 2010-11-14, 1 file, -15/+19)
* Fix style(9) issues from r215281 and r215282.
  MFC after: 1 week
  (brucec, 2010-11-14, 1 file, -8/+17)
* Add descriptions to some more sysctls.
  PR: kern/148510
  MFC after: 1 week
  (brucec, 2010-11-14, 1 file, -13/+13)
* Protect mnt_syncer with the sync_mtx. This prevents a (rare) vnode leak when mount and update are executed in parallel. Encapsulate syncer vnode deallocation into the helper function vfs_deallocate_syncvnode(), to not externalize sync_mtx from vfs_subr.c.
  Found and reviewed by: jh (previous version of the patch)
  Tested by: pho
  MFC after: 3 weeks
  (kib, 2010-09-11, 1 file, -4/+27)
* As long as we are going to panic anyway, there's no need to hide additional information behind DIAGNOSTIC.
  (emaste, 2010-09-01, 1 file, -2/+0)
* execve(2) has a special check for file permissions: a file must have at least one execute bit set, otherwise execve(2) will return EACCES even for a user with PRIV_VFS_EXEC privilege. Add the check also to vaccess(9), vaccess_acl_nfs4(9) and vaccess_acl_posix1e(9). This makes access(2) agree better with execve(2). Because ZFS doesn't use vaccess(9) for VEXEC, add the check to zfs_freebsd_access() too. There may be other file systems which are not using vaccess*() functions and need to be handled separately.
  PR: kern/125009
  Reviewed by: bde, trasz
  Approved by: pjd (ZFS part)
  (jh, 2010-08-30, 1 file, -0/+6)
* There is a bug in vfs_allocate_syncvnode() failure handling in mount code. Actually it is hard to properly handle such a failure, especially in MNT_UPDATE case. The only reason for the vfs_allocate_syncvnode() function to fail is getnewvnode() failure. Fortunately it is impossible for current implementation of getnewvnode() to fail, so we can assert this and make vfs_allocate_syncvnode() void. This in turn frees us from handling its failures in the mount code.
  Reviewed by: kib
  MFC after: 1 month
  (pjd, 2010-08-28, 1 file, -7/+5)
* The buffer's b_vflags field is not always properly protected by the bufobj lock. If b_bufobj is not NULL, then the bufobj lock should be held when manipulating the flags. Not doing this sometimes leaves BV_BKGRDINPROG erroneously set, causing softdep's getdirtybuf() to get stuck indefinitely in "getbuf" sleep, waiting for a background write to finish which is not actually performed. Add BO_LOCK() in the cases where it was missed.
  In collaboration with: pho
  Tested by: bz
  Reviewed by: jeff
  MFC after: 1 month
  (kib, 2010-08-12, 1 file, -0/+10)
* In order for MAXVNODES_MAX to be an "int" on powerpc and sparc, we must cast PAGE_SIZE to an "int". (Powerpc and sparc, unlike the other architectures, define PAGE_SIZE as a "long".)
  Submitted by: Andreas Tobler
  (alc, 2010-08-04, 1 file, -1/+1)
* Update the "desiredvnodes" calculation. In particular, make the part of the calculation that is based on the kernel's heap size more conservative. Hopefully, this will eliminate the need for MAXVNODES_MAX, but for the time being set MAXVNODES_MAX to a large value.
  Reviewed by: jhb@
  MFC after: 6 weeks
  (alc, 2010-08-02, 1 file, -8/+19)
* Use ISO C99 integer types in sys/kern where possible. There are only about 100 occurrences of the BSD-specific u_int*_t datatypes in sys/kern. The ISO C99 integer types are used here more often.
  (ed, 2010-06-21, 1 file, -2/+2)
* Backout r207970 for now, it can lead to deadlocks.
  Reported by: kan
  MFC after: 3 days
  (pjd, 2010-06-17, 1 file, -13/+0)
* Sometimes vnodes share the lock despite being different vnodes on different mount points, e.g. the nullfs vnode and the covered vnode from the lower filesystem. In this case, the existing assertion in vop_rename_pre() may be triggered. Check for equality of the vnode locks instead of the vnodes themselves to not trip over the situation.
  Submitted by: Mikolaj Golub <to.my.trociny@gmail.com>
  Tested by: pho
  MFC after: 2 weeks
  (kib, 2010-06-03, 1 file, -2/+3)
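The distinction between comparing vnodes and comparing their locks can be shown with a toy model. The struct layout and the demo_ names below are assumptions for illustration only; the sketch merely mirrors the idea that a stacked vnode may point at the lock of the vnode it covers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, greatly simplified lock and vnode types. */
struct demo_lock {
	int held;
};

struct demo_vnode {
	struct demo_lock  v_lock;	/* the vnode's own lock storage */
	struct demo_lock *v_vnlock;	/* the lock actually used for this vnode */
};

/* Two distinct vnodes are treated as "the same" for locking if they share a lock. */
static bool
demo_vnodes_share_lock(const struct demo_vnode *a, const struct demo_vnode *b)
{
	assert(a != NULL && b != NULL);
	return (a->v_vnlock == b->v_vnlock);
}
```

An assertion written as a pointer comparison of the two vnodes would treat a stacked vnode and its lower vnode as unrelated, whereas demo_vnodes_share_lock() reports that locking one already locks the other.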
* Add VOP_ADVLOCKPURGE so that the file system is called when purging locks (in the case where the VFS impl isn't using lf_*)
  Submitted by: Matthew Fleming <matthew.fleming@isilon.com>
  Reviewed by: zml, dfr
  (zml, 2010-05-12, 1 file, -1/+1)
* When there is no memory or KVA, try to help by reclaiming some vnodes. This helps with 'kmem_map too small' panics.
  No objections from: kib
  Tested by: Alexander V. Ribchansky <shurik@zk.informjust.ua>
  MFC after: 1 week
  (pjd, 2010-05-12, 1 file, -0/+13)
* I added vfs_lowvnodes event, but it was only used for a short while and now it is totally unused. Remove it.
  MFC after: 3 days
  (pjd, 2010-05-11, 1 file, -1/+0)
* - Merge soft-updates journaling from projects/suj/head into head. This brings in support for an optional intent log which eliminates the need for background fsck on unclean shutdown.
  Sponsored by: iXsystems, Yahoo!, and Juniper.
  With help from: McKusick and Peter Holm
  (jeff, 2010-04-24, 1 file, -0/+1)
* Add missing MNT_NFS4ACLS.
  (jh, 2010-04-04, 1 file, -0/+1)
* Fix some whitespace nits.
  (pjd, 2010-04-03, 1 file, -7/+5)
* Add missing mnt_kern_flag flags in 'show mount' output.
  (pjd, 2010-04-03, 1 file, -1/+5)
* Add function vop_rename_fail(9) that performs needed cleanup for locks and references of the VOP_RENAME(9) arguments. Use vop_rename_fail() in deadfs_rename().
  Tested by: Mikolaj Golub
  MFC after: 1 week
  (kib, 2010-04-02, 1 file, -0/+14)
* Add new function vunref(9) that decrements vnode use count (and hold count) while vnode is exclusively locked. The code for vput(9), vrele(9) and vunref(9) is merged.
  In collaboration with: pho
  Reviewed by: alc
  MFC after: 3 weeks
  (kib, 2010-01-17, 1 file, -70/+53)
* Add a knob to allow reclaim of the directory vnodes that are the source of the namecache records. The reclamation is not enabled by default because for a typical workload it would make the namecache unusable, but a large nested directory tree easily puts any process that accesses the filesystem into a 1 second wait for vlru.
  Reported by: yar (long time ago)
  MFC after: 3 days
  (kib, 2009-12-28, 1 file, -2/+10)
* Now that all the callers seem to be fixed, add KASSERTs to make sure VAPPEND is not being used improperly.
  (trasz, 2009-12-26, 1 file, -0/+2)
* VI_OBJDIRTY vnode flag mirrors the state of OBJ_MIGHTBEDIRTY vm object flag. Besides providing redundant information, the need to update both vnode and object flags causes more acquisitions of the vnode interlock. OBJ_MIGHTBEDIRTY is only checked for vnode-backed vm objects. Remove VI_OBJDIRTY and make sure that OBJ_MIGHTBEDIRTY is set only for vnode-backed vm objects.
  Suggested and reviewed by: alc
  Tested by: pho
  MFC after: 3 weeks
  (kib, 2009-12-21, 1 file, -4/+3)
* Extend ddb(4) "show mount" command to print active string mount options. Note that only option names are printed, not values.
  Reviewed by: pjd
  Approved by: trasz (mentor)
  MFC after: 2 weeks
  (jh, 2009-11-19, 1 file, -0/+13)
* Provide default implementation for VOP_ACCESS(9), so that filesystems which want to provide VOP_ACCESSX(9) don't have to implement both. Note that this commit makes implementation of either of these two mandatory.
  Reviewed by: kib
  (trasz, 2009-10-01, 1 file, -0/+3)
* Use C99 initialization for struct filterops.
  Obtained from: Mac OS X
  Sponsored by: Apple Inc.
  MFC after: 3 weeks
  (rwatson, 2009-09-12, 1 file, -8/+21)
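For readers unfamiliar with the C99 feature this commit refers to, the sketch below shows designated initializers on a table of function pointers. struct demo_filterops is a simplified, hypothetical stand-in and should not be read as the kernel's definition of struct filterops; the demo_ functions are likewise invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified operations table. */
struct demo_filterops {
	int	f_isfd;
	int	(*f_attach)(void *kn);
	void	(*f_detach)(void *kn);
	int	(*f_event)(void *kn, long hint);
};

static void
demo_detach(void *kn)
{
	(void)kn;
}

static int
demo_event(void *kn, long hint)
{
	(void)kn;
	return (hint != 0);
}

/*
 * C99 designated initializers: each member is named explicitly, so the
 * table does not depend on member order, and members that are not
 * named (f_attach here) are implicitly zero-initialized to NULL.
 */
static const struct demo_filterops demo_read_filtops = {
	.f_isfd = 1,
	.f_detach = demo_detach,
	.f_event = demo_event,
};

/* Invoke the event callback, checking that the table provides one. */
static int
demo_filter_fire(const struct demo_filterops *ops, void *kn, long hint)
{
	assert(ops != NULL && ops->f_event != NULL);
	return (ops->f_event(kn, hint));
}
```

With positional initialization, a member added to or reordered within the struct would silently shift every later value; naming the members avoids that class of mistake.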
* In vfs_mark_atime(9), be resistant against reclaimed vnodes. Assert that necessary locks are taken, since vop might not be called.
  Tested by: pho
  MFC after: 3 days
  (kib, 2009-09-09, 1 file, -1/+5)
* Call prison_check from vfs_suser rather than re-implementing it.
  Approved by: re (kib), bz (mentor)
  (jamie, 2009-07-02, 1 file, -2/+1)
* Adapt vfs kqfilter to the shared vnode lock used by zfs write vop. Use vnode interlock to protect the knote fields [1]. The locking assumes that shared vnode lock is held, thus we get exclusive access to knote either by exclusive vnode lock protection, or by shared vnode lock + vnode interlock.
  Do not use kl_locked() method to assert either lock ownership or the fact that curthread does not own the lock. For shared locks, ownership is not recorded, e.g. VOP_ISLOCKED can return LK_SHARED for the shared lock not owned by curthread, causing false positives in kqueue subsystem assertions about knlist lock.
  Remove kl_locked method from knlist lock vector, and add two separate assertion methods kl_assert_locked and kl_assert_unlocked, that are supposed to use proper asserts. Change knlist_init accordingly.
  Add convenience function knlist_init_mtx to reduce number of arguments for typical knlist initialization.
  Submitted by: jhb [1]
  Noted by: jhb [2]
  Reviewed by: jhb
  Tested by: rnoland
  (kib, 2009-06-10, 1 file, -8/+39)
* Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include.
  Discussed with: pjd
  (rwatson, 2009-06-05, 1 file, -1/+0)
* Remove the now invalid (and possibly unused) debug.mpsafevfs sysctl/tunable.
  Reviewed by: emaste
  Sponsored by: Sandvine Incorporated
  (attilio, 2009-05-30, 1 file, -9/+0)
* Add VOP_ACCESSX, which can be used to query for newly added V* permissions, such as VWRITE_ACL. For filesystems that don't implement it, there is a default implementation, which works as a wrapper around VOP_ACCESS.
  Reviewed by: rwatson@
  (trasz, 2009-05-30, 1 file, -0/+47)
* Add hierarchical jails. A jail may further virtualize its environment by creating a child jail, which is visible to that jail and to any parent jails. Child jails may be restricted more than their parents, but never less. Jail names reflect this hierarchy, being MIB-style dot-separated strings.
  Every thread now points to a jail, the default being prison0, which contains information about the physical system. Prison0's root directory is the same as rootvnode; its hostname is the same as the global hostname, and its securelevel replaces the global securelevel. Note that the variable "securelevel" has actually gone away, which should not cause any problems for code that properly uses securelevel_gt() and securelevel_ge().
  Some jail-related permissions that were kept in global variables and set via sysctls are now per-jail settings. The sysctls still exist for backward compatibility, used only by the now-deprecated jail(2) system call.
  Approved by: bz (mentor)
  (jamie, 2009-05-27, 1 file, -13/+5)
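The naming scheme mentioned above (MIB-style dot-separated strings) lends itself to a simple prefix test. The function below is a hypothetical illustration of how such names could be compared; it is not taken from the jail code, and demo_jail_is_descendant() and the example names are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Report whether `name` denotes a jail nested (at any depth) below
 * `ancestor`, given dot-separated hierarchical names such as "foo.bar".
 */
static bool
demo_jail_is_descendant(const char *name, const char *ancestor)
{
	size_t n;

	assert(name != NULL && ancestor != NULL);
	n = strlen(ancestor);
	/*
	 * The ancestor's full name must be a prefix, followed directly by a
	 * dot, so that "foobar" is not mistaken for a child of "foo".
	 */
	return (strncmp(name, ancestor, n) == 0 && name[n] == '.');
}
```

Under this scheme a jail named "foo.bar.baz" would be a descendant of both "foo" and "foo.bar", while a jail is never counted as a descendant of itself.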