summaryrefslogtreecommitdiffstats
path: root/sys/fs/nullfs/null_vnops.c
Commit message (Collapse)AuthorAgeFilesLines
* MFC r295717:kib2016-02-241-0/+9
| | | | | | | | After nullfs rmdir operation, reclaim the directory vnode which was unlinked. Otherwise the vnode stays cached, causing leak. This is similar to r292961 for regular files. Approved by: re (marius)
* MFC r292961:kib2016-01-061-3/+5
| | | | | Force nullfs vnode reclaim after unlinking, to potentially unlink lower vnode.
* MFC r269708:kib2014-08-221-4/+40
| | | | | Unlock ldvp and lock dvp to compensate for possible ldvp unlock in lower VOP_LOOKUP() and dvp reclamation. Use cached value of dvp->v_mount.
* MFC r269187:kib2014-08-041-0/+4
| | | | | Assert that nullfs vnode has VV_ROOT set whenever lower vnode has. Assert that dotdot lookup on the root vnode is not performed.
* MFC r268764:kib2014-07-301-10/+0
| | | | | Check for the cross-device cross-link attempt in the VFS, instead of VOP_LINK() implemenations.
* MFC r269081:kib2014-07-271-1/+1
| | | | Fix typo.
* Fix the length calculation for the final block of a sendfile(2)des2013-09-101-0/+10
| | | | | | | | | | | | | | | | | | | | transmission which could be tricked into rounding up to the nearest page size, leaking up to a page of kernel memory. [13:11] In IPv6 and NetATM, stop SIOCSIFADDR, SIOCSIFBRDADDR, SIOCSIFDSTADDR and SIOCSIFNETMASK at the socket layer rather than pass them on to the link layer without validation or credential checks. [SA-13:12] Prevent cross-mount hardlinks between different nullfs mounts of the same underlying filesystem. [SA-13:13] Security: CVE-2013-5666 Security: FreeBSD-SA-13:11.sendfile Security: CVE-2013-5691 Security: FreeBSD-SA-13:12.ifioctl Security: CVE-2013-5710 Security: FreeBSD-SA-13:13.nullfs Approved by: re
* The tvp vnode on rename is usually unlinked. Drop the cached nullkib2013-07-041-1/+6
| | | | | | | | | vnode for tvp to allow the free of the lower vnode, if needed. PR: kern/180236 Tested by: smh Sponsored by: The FreeBSD Foundation MFC after: 1 week
* - Fix nullfs vnode reference leak in nullfs_reclaim_lowervp(). Thekib2013-05-111-5/+14
| | | | | | | | | | | | | | | | | | | | | | | null_hashget() obtains the reference on the nullfs vnode, which must be dropped. - Fix a wart which existed from the introduction of the nullfs caching, do not unlock lower vnode in the nullfs_reclaim_lowervp(). It should be innocent, but now it is also formally safe. Inform the nullfs_reclaim() about this using the NULLV_NOUNLOCK flag set on nullfs inode. - Add a callback to the upper filesystems for the lower vnode unlinking. When inactivating a nullfs vnode, check if the lower vnode was unlinked, indicated by nullfs flag NULLV_DROP or VV_NOSYNC on the lower vnode, and reclaim upper vnode if so. This allows nullfs to purge cached vnodes for the unlinked lower vnode, avoiding excessive caching. Reported by: G??ran L??wkrantz <goran.lowkrantz@ismobile.com> Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
* When nullfs mount is forcibly unmounted and nullfs vnode is reclaimed,kib2013-01-101-0/+8
| | | | | | | | | | get back the leased write reference from the lower vnode. There is no other path which can correct v_writecount on the lowervp. Reported by: flo Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 3 days
* Add the "nocache" nullfs mount option, which disables the caching ofkib2013-01-031-0/+15
| | | | | | | | | | | | | the free nullfs vnodes, switching nullfs behaviour to pre-r240285. The option is mostly intended as the last-resort when higher pressure on the vnode cache due to doubling of the vnode counts is not desirable. Note that disabling the cache costs more than 2x wall time in the metadata-hungry scenarious. The default is "cache". Tested and benchmarked by: pho (previous version) MFC after: 2 weeks
* The r241025 fixed the case when a binary, executed from nullfs mount,kib2012-11-021-0/+21
| | | | | | | | | | | | | | | | | | | | | | | was still possible to open for write from the lower filesystem. There is a symmetric situation where the binary could already has file descriptors opened for write, but it can be executed from the nullfs overlay. Handle the issue by passing one v_writecount reference to the lower vnode if nullfs vnode has non-zero v_writecount. Note that only one write reference can be donated, since nullfs only keeps one use reference on the lower vnode. Always use the lower vnode v_writecount for the checks. Introduce the VOP_GET_WRITECOUNT to read v_writecount, which is currently always bypassed to the lower vnode, and VOP_ADD_WRITECOUNT to manipulate the v_writecount value, which manages a single bypass reference to the lower vnode. Caling the VOPs instead of directly accessing v_writecount provide the fix described in the previous paragraph. Tested by: pho MFC after: 3 weeks
* Grammar fixes.kib2012-10-141-3/+3
| | | | | Submitted by: bf MFC after: 1 week
* Replace the XXX comment with the proper description.kib2012-10-141-1/+3
| | | | MFC after: 1 week
* Allow shared lookups for nullfs mounts, if lower filesystem supportskib2012-09-091-19/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | it. There are two problems which shall be addressed for shared lookups use to have measurable effect on nullfs scalability: 1. When vfs_lookup() calls VOP_LOOKUP() for nullfs, which passes lookup operation to lower fs, resulting vnode is often only shared-locked. Then null_nodeget() cannot instantiate covering vnode for lower vnode, since insmntque1() and null_hashins() require exclusive lock on the lower. Change the assert that lower vnode is exclusively locked to only require any lock. If null hash failed to find pre-existing nullfs vnode for lower vnode and the vnode is shared-locked, the lower vnode lock is upgraded. 2. Nullfs reclaims its vnodes on deactivation. This is due to nullfs inability to detect reclamation of the lower vnode. Reclamation of a nullfs vnode at deactivation time prevents a reference to the lower vnode to become stale. Change nullfs VOP_INACTIVE to not reclaim the vnode, instead use the VFS_RECLAIM_LOWERVP to get notification and reclaim upper vnode together with the reclamation of the lower vnode. Note that nullfs reclamation procedure calls vput() on the lowervp vnode, temporary unlocking the vnode being reclaimed. This seems to be fine for MPSAFE filesystems, but not-MPSAFE code often put partially initialized vnode on some globally visible list, and later can decide that half-constructed vnode is not needed. If nullfs mount is created above such filesystem, then other threads might catch such not properly initialized vnode. Instead of trying to overcome this case, e.g. by recursing the lower vnode lock in null_reclaim_lowervp(), I decided to rely on nearby removal of the support for non-MPSAFE filesystems. In collaboration with: pho MFC after: 3 weeks
* Remove unused thread argument to vrecycle().trasz2012-04-231-2/+1
| | | | Reviewed by: kib
* In null_reclaim(), assert that reclaimed vnode is fully constructed,kib2012-02-291-9/+12
| | | | | | | | | | | instead of accepting half-constructed vnode. Previous code cannot decide what to do with such vnode anyway, and although processing it for hash removal, paniced later when getting rid of nullfs reference on lowervp. While there, remove initializations from the declaration block. Tested by: pho MFC after: 1 week
* Do the vput() for the lowervp in the null_nodeget() for error case too.kib2012-01-031-6/+2
| | | | | | | | | Several callers of null_nodeget() did the cleanup itself, but several missed it, most prominent being null_bypass(). Remove the cleanup from the callers, now null_nodeget() handles lowervp free itself. Reported and tested by: pho MFC after: 1 week
* Existing VOP_VPTOCNP() interface has a fatal flow that is critical forkib2011-11-191-4/+3
| | | | | | | | | | | | | | | | | | | | | nullfs. The problem is that resulting vnode is only required to be held on return from the successfull call to vop, instead of being referenced. Nullfs VOP_INACTIVE() method reclaims the vnode, which in combination with the VOP_VPTOCNP() interface means that the directory vnode returned from VOP_VPTOCNP() is reclaimed in advance, causing vn_fullpath() to error with EBADF or like. Change the interface for VOP_VPTOCNP(), now the dvp must be referenced. Convert all in-tree implementations of VOP_VPTOCNP(), which is trivial, because vhold(9) and vref(9) are similar in the locking prerequisites. Out-of-tree fs implementation of VOP_VPTOCNP(), if any, should have no trouble with the fix. Tested by: pho Reviewed by: mckusick MFC after: 3 weeks (subject of re approval)
* Do not use NULLVPTOLOWERVP() in the null_print(). If diagnostic is compiledkib2011-11-191-1/+1
| | | | | | | in, and show vnode is used from ddb on the faulty nullfs vnode, we get panic instead of vnode dump. MFC after: 1 week
* Fix typos - remove duplicate "is".brucec2011-02-231-1/+1
| | | | | | PR: docs/154934 Submitted by: Eitan Adler <lists at eitanadler.com> MFC after: 3 days
* Add a null_remove() function to nullfs, so that the v_usecountrmacklem2010-08-311-0/+27
| | | | | | | | | | | | | | | | | | | | | of the lower level vnode is incremented to greater than 1 when the upper level vnode's v_usecount is greater than one. This is necessary for the NFS clients, so that they will do a silly rename of the file instead of actually removing it when the file is still in use. It is "racy", since the v_usecount is incremented in many places in the kernel with minimal synchronization, but an extraneous silly rename is preferred to not doing a silly rename when it is required. The only other file systems that currently check the value of v_usecount in their VOP_REMOVE() functions are nwfs and smbfs. These file systems choose to fail a remove when the v_usecount is greater than 1 and I believe will function more correctly with this patch, as well. Tested by: to.my.trociny at gmail.com Submitted by: to.my.trociny at gmail.com (earlier version) Reviewed by: kib MFC after: 2 weeks
* Disable bypass for the vop_advlockpurge(). The vop is called afterkib2010-05-161-0/+1
| | | | | | vop_revoke(), the v_data is already destroyed. Reported and tested by: ed
* Add explicit struct ucred * argument for VOP_VPTOCNP, to be used bykib2009-06-211-1/+2
| | | | | | | | | | vn_open_cred in default implementation. Valid struct ucred is needed for audit and MAC, and curthread credentials may be wrong. This further requires modifying the interface of vn_fullpath(9), but it is out of scope of this change. Reviewed by: rwatson
* Implement the bypass routine for VOP_VPTOCNP in nullfs.kib2009-05-311-1/+50
| | | | | | | | Among other things, this makes procfs <pid>/file working for executables started from nullfs mount. Tested by: pho PR: 94269, 104938
* Lock the real null vnode lock before substitution of vp->v_vnlock.kib2009-05-311-3/+4
| | | | | | | | This should not really matter for correctness, since vp->v_lock is not locked before the call, and null_lock() holds the interlock, but makes the control flow for reclaim more clear. Tested by: pho
* Add VOP_ACCESSX, which can be used to query for newly added V*trasz2009-05-301-0/+27
| | | | | | | | permissions, such as VWRITE_ACL. For a filsystems that don't implement it, there is a default implementation, which works as a wrapper around VOP_ACCESS. Reviewed by: rwatson@
* Do not use null_bypass for VOP_ISLOCKED, directly call defaultpho2009-03-181-0/+1
| | | | | | | implementation. null_bypass cannot work for the !nullfs-vnodes, in particular, for VBAD vnodes. In collaboration with: kib
* Remove the null_islocked() overloaded vop because the standard one doesattilio2009-03-131-9/+0
| | | | the same.
* Do not use bypass for vop_vptocnp() from nullfs, call standardkib2009-03-101-0/+1
| | | | | | | | | implementation instead. The bypass does not assume that returned vnode is only held. Reported by: Paul B. Mahol <onemda gmail com>, pluknet <pluknet gmail com> Reviewed by: jhb Tested by: pho, pluknet <pluknet gmail com>
* Remove unused local variables.bz2009-01-311-2/+0
| | | | | | Submitted by: Christoph Mallon christoph.mallon@gmx.de Reviewed by: kib MFC after: 2 weeks
* In null_lookup(), do the needed cleanup instead of panicing sayingkib2008-11-261-5/+4
| | | | | | | | the cleanup is needed. Reported by: kris, pho Tested by: pho MFC after: 2 weeks
* Introduce accmode_t. This is required for NFSv4 ACLs - it will be neccessarytrasz2008-10-281-2/+2
| | | | | | | to add more V* constants, and the variables changed by this patch were often being assigned to mode_t variables, which is 16 bit. Approved by: rwatson (mentor)
* Retire the MALLOC and FREE macros. They are an abomination unto style(9).des2008-10-231-1/+1
| | | | MFC after: 3 months
* Fix two small typo's in comments in the nullfs vnops code.ed2008-09-111-2/+2
| | | | Submitted by: Jille Timmermans <jille quis cx>
* Axe the 'thread' argument from VOP_ISLOCKED() and lockstatus() as it isattilio2008-02-251-2/+1
| | | | | | | | | always curthread. As KPI gets broken by this patch, manpages and __FreeBSD_version will be updated by further commits. Tested by: Andrea Barberio <insomniac at slackware dot it>
* Cleanup lockmgr interface and exported KPI:attilio2008-01-241-2/+1
| | | | | | | | | | | | | | | | | | | | - Remove the "thread" argument from the lockmgr() function as it is always curthread now - Axe lockcount() function as it is no longer used - Axe LOCKMGR_ASSERT() as it is bogus really and no currently used. Hopefully this will be soonly replaced by something suitable for it. - Remove the prototype for dumplockinfo() as the function is no longer present Addictionally: - Introduce a KASSERT() in lockstatus() in order to let it accept only curthread or NULL as they should only be passed - Do a little bit of style(9) cleanup on lockmgr.h KPI results heavilly broken by this change, so manpages and FreeBSD_version will be modified accordingly by further commits. Tested by: matteo
* VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used inattilio2008-01-131-6/+4
| | | | | | | | | | | conjuction with 'thread' argument passing which is always curthread. Remove the unuseful extra-argument and pass explicitly curthread to lower layer functions, when necessary. KPI results broken by this change, which should affect several ports, so version bumping and manpage update will be further committed. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
* This changes give nullfs correctly work with latest unionfs.daichi2007-10-141-5/+18
| | | | | | | Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week
* Where I previously removed calls to kdb_enter(), now remove include ofrwatson2007-05-291-1/+0
| | | | | | kdb.h. Pointed out by: bde
* Since renaming of vop_lock to _vop_lock, pre- and post-conditionkib2007-05-181-2/+2
| | | | | | function calls are no more generated for vop_lock. Rename _vop_lock to vop_lock1 to satisfy tools/vnode_if.awk assumption about vop naming conventions. This restores pre/post-condition calls.
* Move vnode-to-file-handle translation from vfs_vptofh to vop_vptofh method.pjd2007-02-151-0/+10
| | | | | | | | | | | | | | | | This way we may support multiple structures in v_data vnode field within one file system without using black magic. Vnode-to-file-handle should be VOP in the first place, but was made VFS operation to keep interface as compatible as possible with SUN's VFS. BTW. Now Solaris also implements vnode-to-file-handle as VOP operation. VFS_VPTOFH() was left for API backward compatibility, but is marked for removal before 8.0-RELEASE. Approved by: mckusick Discussed with: many (on IRC) Tested with: ufs, msdosfs, cd9660, nullfs and zfs
* change vop_lock handling to allowing tracking of callers' file and line forkmacy2006-11-131-2/+2
| | | | | | acquisition of lockmgr locks Approved by: scottl (standing in for mentor rwatson)
* - Define a null_getwritemount to get the mount-point for the lowerjeff2006-03-121-2/+26
| | | | | | | filesystem so that nullfs doesn't permit you to circumvent snapshots. Discussed with: tegge Sponsored by: Isilon Systems, Inc.
* - spell VOP_LOCK(vp, LK_RELEASE... VOP_UNLOCK(vp,... so that asserts injeff2006-02-221-7/+8
| | | | | | | | | | | vop_lock_post do not trigger. - Rearrange null_inactive to null_hashrem earlier so there is no chance of finding the null node on the hash list after the locks have been switched. - We should never have a NULL lowervp in null_reclaim() so there is no need to handle this situation. panic instead. MFC After: 1 week
* Handle a race condition where NULLFS vnode can be cleaned while threadskan2005-09-151-4/+28
| | | | | | | can still be asleep waiting for lowervp lock. Tested by: kkenn Discussed with: ssouhlal, jeffr
* Use vput() instead of vrele() in null_reclaim() since the lower vnodessouhlal2005-09-021-4/+5
| | | | | | is locked. MFC after: 3 days
* - As this is presently the one and only place where duplicate acquires ofjeff2005-04-221-1/+1
| | | | | | | the vnode interlock are allowed mark it by passing MTX_DUPOK to this lock operation only. Sponsored by: Isilon Systems, Inc.
* - Lock the clearing of v_data so it is safe to inspect it with thejeff2005-03-171-1/+7
| | | | | | interlock. Sponsored by: Isilon Systems, Inc.
* - Assume that all lower filesystems now support proper locking. Assertjeff2005-03-151-130/+41
| | | | | | | | | | | | | | | | | that they set v->v_vnlock. This is true for all filesystems in the tree. - Remove all uses of LK_THISLAYER. If the lower layer is locked, the null layer is locked. We only use vget() to get a reference now. null essentially does no locking. This fixes LOOKUP_SHARED with nullfs. - Remove the special LK_DRAIN considerations, I do not believe this is needed now as LK_DRAIN doesn't destroy the lower vnode's lock, and it's hardly used anymore. - Add one well commented hack to prevent the lowervp from going away while we're in it's VOP_LOCK routine. This can only happen if we're forcibly unmounted while some callers are waiting in the lock. In this case the lowervp could be recycled after we drop our last ref in null_reclaim(). Prevent this with a vhold().
OpenPOWER on IntegriCloud