path: root/sys/fs/nullfs
Commit message [Author, Age, Files, Lines]
* MFC r295717: [kib, 2016-02-24, files: 1, lines: -0/+9]
| After nullfs rmdir operation, reclaim the directory vnode which was unlinked. Otherwise the vnode stays cached, causing leak. This is similar to r292961 for regular files.
| Approved by: re (marius)
* MFC r292961: [kib, 2016-01-06, files: 1, lines: -3/+5]
| Force nullfs vnode reclaim after unlinking, to potentially unlink lower vnode.
* MFC: r281562 [rmacklem, 2015-04-30, files: 1, lines: -0/+2]
| File systems that do not use the buffer cache (such as ZFS) must use VOP_FSYNC() to perform the NFS server's Commit operation. This patch adds a mnt_kern_flag called MNTK_USES_BCACHE which is set by file systems that use the buffer cache. If this flag is not set, the NFS server always does a VOP_FSYNC(). This should be ok for old file system modules that do not set MNTK_USES_BCACHE, since calling VOP_FSYNC() is correct, although it might not be optimal for file systems that use the buffer cache.
* MFC r269708: [kib, 2014-08-22, files: 1, lines: -4/+40]
| Unlock ldvp and lock dvp to compensate for possible ldvp unlock in lower VOP_LOOKUP() and dvp reclamation. Use cached value of dvp->v_mount.
* MFC r269187: [kib, 2014-08-04, files: 1, lines: -0/+4]
| Assert that nullfs vnode has VV_ROOT set whenever lower vnode has. Assert that dotdot lookup on the root vnode is not performed.
* MFC r268764: [kib, 2014-07-30, files: 1, lines: -10/+0]
| Check for the cross-device cross-link attempt in the VFS, instead of VOP_LINK() implementations.
* MFC r269081: [kib, 2014-07-27, files: 1, lines: -1/+1]
| Fix typo.
* Fix the length calculation for the final block of a sendfile(2) [des, 2013-09-10, files: 1, lines: -0/+10]
| transmission which could be tricked into rounding up to the nearest page size, leaking up to a page of kernel memory. [13:11]
| In IPv6 and NetATM, stop SIOCSIFADDR, SIOCSIFBRDADDR, SIOCSIFDSTADDR and SIOCSIFNETMASK at the socket layer rather than pass them on to the link layer without validation or credential checks. [SA-13:12]
| Prevent cross-mount hardlinks between different nullfs mounts of the same underlying filesystem. [SA-13:13]
| Security: CVE-2013-5666
| Security: FreeBSD-SA-13:11.sendfile
| Security: CVE-2013-5691
| Security: FreeBSD-SA-13:12.ifioctl
| Security: CVE-2013-5710
| Security: FreeBSD-SA-13:13.nullfs
| Approved by: re
* The tvp vnode on rename is usually unlinked. Drop the cached null [kib, 2013-07-04, files: 1, lines: -1/+6]
| vnode for tvp to allow the free of the lower vnode, if needed.
| PR: kern/180236
| Tested by: smh
| Sponsored by: The FreeBSD Foundation
| MFC after: 1 week
* Do not leak the NULLV_NOUNLOCK flag from the nullfs_unlink_lowervp(), [kib, 2013-05-21, files: 1, lines: -7/+19]
| for the case when the nullfs vnode is not reclaimed. Otherwise, later reclamation would not unlock the lower vnode.
| Reported by: antoine
| Tested by: pho
| Sponsored by: The FreeBSD Foundation
| MFC after: 1 week
* - Fix nullfs vnode reference leak in nullfs_reclaim_lowervp(). The [kib, 2013-05-11, files: 4, lines: -7/+50]
|   null_hashget() obtains the reference on the nullfs vnode, which must be dropped.
| - Fix a wart which existed from the introduction of the nullfs caching, do not unlock lower vnode in the nullfs_reclaim_lowervp(). It should be innocent, but now it is also formally safe. Inform the nullfs_reclaim() about this using the NULLV_NOUNLOCK flag set on nullfs inode.
| - Add a callback to the upper filesystems for the lower vnode unlinking. When inactivating a nullfs vnode, check if the lower vnode was unlinked, indicated by nullfs flag NULLV_DROP or VV_NOSYNC on the lower vnode, and reclaim upper vnode if so. This allows nullfs to purge cached vnodes for the unlinked lower vnode, avoiding excessive caching.
| Reported by: Göran Löwkrantz <goran.lowkrantz@ismobile.com>
| Tested by: pho
| Sponsored by: The FreeBSD Foundation
| MFC after: 2 weeks
* nullfs: Improve f_flags in statfs(). [jilles, 2013-03-02, files: 1, lines: -1/+2]
| Include some flags of the nullfs mount itself: MNT_RDONLY, MNT_NOEXEC, MNT_NOSUID, MNT_UNION, MNT_NOSYMFOLLOW. This allows userland code calling statfs() or fstatfs() to see these flags. In particular, this allows opendir() to detect that a -t nullfs -o union mount needs deduplication (otherwise at least . and .. are returned twice) and allows rtld to detect a -t nullfs -o noexec mount as noexec.
| Turn off the MNT_ROOTFS flag from the underlying filesystem because the nullfs mount is definitely not the root filesystem.
| Reviewed by: kib
| MFC after: 1 week
* Remove the filtering of the acceptable mount options for nullfs, added [kib, 2013-01-16, files: 1, lines: -11/+0]
| in r245004. Although the report was for noatime option which is non-functional for the nullfs, other standard options like nosuid or noexec are useful with it.
| Reported by: Dewayne Geraghty <dewayne.geraghty@heuristicsystems.com.au>
| MFC after: 3 days
* The current default size of the nullfs hash table used to lookup the [kib, 2013-01-14, files: 1, lines: -10/+6]
| existing nullfs vnode by the lower vnode is only 16 slots. Since the default mode for the nullfs is to cache the vnodes, hash has extremely huge chains.
| Size the nullfs hashtbl based on the current value of desiredvnodes. Use vfs_hash_index() to calculate the hash bucket for a given vnode.
| Pointy hat to: kib
| Diagnosed and reviewed by: peter
| Tested by: peter, pho (previous version)
| Sponsored by: The FreeBSD Foundation
| MFC after: 5 days
* When nullfs mount is forcibly unmounted and nullfs vnode is reclaimed, [kib, 2013-01-10, files: 1, lines: -0/+8]
| get back the leased write reference from the lower vnode. There is no other path which can correct v_writecount on the lowervp.
| Reported by: flo
| Tested by: pho
| Sponsored by: The FreeBSD Foundation
| MFC after: 3 days
* Fix reversed condition in the assertion. [kib, 2013-01-04, files: 1, lines: -1/+1]
| Pointy hat to: kib
| MFC after: 13 days
* Add the "nocache" nullfs mount option, which disables the caching of [kib, 2013-01-03, files: 4, lines: -13/+63]
| the free nullfs vnodes, switching nullfs behaviour to pre-r240285. The option is mostly intended as the last-resort when higher pressure on the vnode cache due to doubling of the vnode counts is not desirable. Note that disabling the cache costs more than 2x wall time in the metadata-hungry scenarios.
| The default is "cache".
| Tested and benchmarked by: pho (previous version)
| MFC after: 2 weeks
* Remove the check and panic for an impossible condition. The NULL [kib, 2012-11-20, files: 1, lines: -2/+0]
| lowervp vnode v_vnlock would cause panic due to NULL pointer dereference much earlier.
| MFC after: 1 week
* r16312 is not any longer real since many years (likely since when VFS [attilio, 2012-11-19, files: 1, lines: -4/+0]
| received granular locking) but the comment present in UFS has been copied all over other filesystems code incorrectly for several times.
| Remove comments that make no sense now.
| Reviewed by: kib
| MFC after: 3 days
* Complete MPSAFE VFS interface and remove MNTK_MPSAFE flag. [attilio, 2012-11-09, files: 1, lines: -2/+1]
| Porters should refer to __FreeBSD_version 1000021 for this change as it may have happened at the same timeframe.
* The r241025 fixed the case when a binary, executed from nullfs mount, [kib, 2012-11-02, files: 1, lines: -0/+21]
| was still possible to open for write from the lower filesystem. There is a symmetric situation where the binary could already have file descriptors opened for write, but it can be executed from the nullfs overlay.
| Handle the issue by passing one v_writecount reference to the lower vnode if nullfs vnode has non-zero v_writecount. Note that only one write reference can be donated, since nullfs only keeps one use reference on the lower vnode. Always use the lower vnode v_writecount for the checks.
| Introduce the VOP_GET_WRITECOUNT to read v_writecount, which is currently always bypassed to the lower vnode, and VOP_ADD_WRITECOUNT to manipulate the v_writecount value, which manages a single bypass reference to the lower vnode. Calling the VOPs instead of directly accessing v_writecount provides the fix described in the previous paragraph.
| Tested by: pho
| MFC after: 3 weeks
* Grammar fixes. [kib, 2012-10-14, files: 1, lines: -3/+3]
| Submitted by: bf
| MFC after: 1 week
* Replace the XXX comment with the proper description. [kib, 2012-10-14, files: 1, lines: -1/+3]
| MFC after: 1 week
* Allow shared lookups for nullfs mounts, if lower filesystem supports [kib, 2012-09-09, files: 4, lines: -46/+67]
| it. There are two problems which shall be addressed for shared lookups use to have measurable effect on nullfs scalability:
| 1. When vfs_lookup() calls VOP_LOOKUP() for nullfs, which passes lookup operation to lower fs, resulting vnode is often only shared-locked. Then null_nodeget() cannot instantiate covering vnode for lower vnode, since insmntque1() and null_hashins() require exclusive lock on the lower.
|    Change the assert that lower vnode is exclusively locked to only require any lock. If null hash failed to find pre-existing nullfs vnode for lower vnode and the vnode is shared-locked, the lower vnode lock is upgraded.
| 2. Nullfs reclaims its vnodes on deactivation. This is due to nullfs inability to detect reclamation of the lower vnode. Reclamation of a nullfs vnode at deactivation time prevents a reference to the lower vnode to become stale.
|    Change nullfs VOP_INACTIVE to not reclaim the vnode, instead use the VFS_RECLAIM_LOWERVP to get notification and reclaim upper vnode together with the reclamation of the lower vnode.
| Note that nullfs reclamation procedure calls vput() on the lowervp vnode, temporary unlocking the vnode being reclaimed. This seems to be fine for MPSAFE filesystems, but not-MPSAFE code often put partially initialized vnode on some globally visible list, and later can decide that half-constructed vnode is not needed. If nullfs mount is created above such filesystem, then other threads might catch such not properly initialized vnode. Instead of trying to overcome this case, e.g. by recursing the lower vnode lock in null_reclaim_lowervp(), I decided to rely on nearby removal of the support for non-MPSAFE filesystems.
| In collaboration with: pho
| MFC after: 3 weeks
* Remove unused thread argument to vrecycle(). [trasz, 2012-04-23, files: 1, lines: -2/+1]
| Reviewed by: kib
* Use NULL instead of 0 [kevlo, 2012-03-13, files: 1, lines: -2/+2]
* Do not expose unlocked unconstructed nullfs vnode on mount list. [kib, 2012-03-02, files: 1, lines: -1/+1]
| Lock the native nullfs vnode lock before switching the locks.
| Tested by: pho
| MFC after: 1 week
* Allow shared locks for reads when lower filesystem accept shared locking. [kib, 2012-02-29, files: 1, lines: -1/+2]
| Tested by: pho
| MFC after: 1 week
* Document that null_nodeget() cannot take shared-locked lowervp due to [kib, 2012-02-29, files: 1, lines: -1/+5]
| insmntque() requirements.
| Tested by: pho
| MFC after: 1 week
* In null_reclaim(), assert that reclaimed vnode is fully constructed, [kib, 2012-02-29, files: 1, lines: -9/+12]
| instead of accepting half-constructed vnode. Previous code cannot decide what to do with such vnode anyway, and although processing it for hash removal, panicked later when getting rid of nullfs reference on lowervp.
| While there, remove initializations from the declaration block.
| Tested by: pho
| MFC after: 1 week
* Always request exclusive lock for the lower vnode in nullfs_vget(). [kib, 2012-02-29, files: 1, lines: -0/+6]
| The null_nodeget() requires exclusive lock on lowervp to be able to insmntque() new vnode.
| Reported by: rea
| Tested by: pho
| MFC after: 1 week
* Move the code to destroy half-constructed nullfs vnode into helper [kib, 2012-02-29, files: 1, lines: -6/+13]
| function null_destroy_proto() from null_insmntque_dtr(). Also apply null_destroy_proto() in null_nodeget() when we raced and a vnode is found in the hash, so the currently allocated protonode shall be destroyed. Lock the vnode interlock around reassigning the v_vnlock.
| In fact, this path will not be exercised after several later commits, since null_nodeget() cannot take shared-locked lowervp at all due to insmntque() requirements.
| Reported by: rea
| Tested by: pho
| MFC after: 1 week
* Merge a split multi-line comment. [kib, 2012-02-29, files: 1, lines: -4/+1]
| MFC after: 1 week
* To improve control over the use of mount(8) inside a jail(8), introduce [mm, 2012-02-23, files: 1, lines: -0/+5]
| a new jail parameter node with the following parameters:
|   allow.mount.devfs: allow mounting the devfs filesystem inside a jail
|   allow.mount.nullfs: allow mounting the nullfs filesystem inside a jail
| Both parameters are disabled by default (equals the behavior before devfs and nullfs in jails). Administrators have to explicitly allow mounting devfs and nullfs for each jail. The value "-1" of the devfs_ruleset parameter is removed in favor of the new allow setting.
| Reviewed by: jamie
| Suggested by: pjd
| MFC after: 2 weeks
* Allow mounting nullfs(5) inside jails. [mm, 2012-02-09, files: 1, lines: -1/+1]
| This is now possible thanks to r230129.
| MFC after: 1 month
* Subject: NULLFS: properly destroy node hash [rea, 2012-01-18, files: 1, lines: -1/+1]
| Use hashdestroy() instead of naive free().
| Approved by: kib
| MFC after: 2 weeks
* In sys/fs/nullfs/null_subr.c, in a KASSERT, output the correct vnode [dim, 2012-01-05, files: 1, lines: -1/+1]
| pointer 'lowervp' instead of 'vp', which is uninitialized at that point.
| Reviewed by: kib
| MFC after: 1 week
* Do the vput() for the lowervp in the null_nodeget() for error case too. [kib, 2012-01-03, files: 3, lines: -8/+6]
| Several callers of null_nodeget() did the cleanup itself, but several missed it, most prominent being null_bypass(). Remove the cleanup from the callers, now null_nodeget() handles lowervp free itself.
| Reported and tested by: pho
| MFC after: 1 week
* Document the state of the lowervp vnode for null_nodeget(). [kib, 2012-01-03, files: 1, lines: -0/+3]
| Tested by: pho
| MFC after: 1 week
* Existing VOP_VPTOCNP() interface has a fatal flaw that is critical for [kib, 2011-11-19, files: 1, lines: -4/+3]
| nullfs. The problem is that resulting vnode is only required to be held on return from the successful call to vop, instead of being referenced.
| Nullfs VOP_INACTIVE() method reclaims the vnode, which in combination with the VOP_VPTOCNP() interface means that the directory vnode returned from VOP_VPTOCNP() is reclaimed in advance, causing vn_fullpath() to error with EBADF or like.
| Change the interface for VOP_VPTOCNP(), now the dvp must be referenced. Convert all in-tree implementations of VOP_VPTOCNP(), which is trivial, because vhold(9) and vref(9) are similar in the locking prerequisites. Out-of-tree fs implementation of VOP_VPTOCNP(), if any, should have no trouble with the fix.
| Tested by: pho
| Reviewed by: mckusick
| MFC after: 3 weeks (subject of re approval)
* Do not use NULLVPTOLOWERVP() in the null_print(). If diagnostic is compiled [kib, 2011-11-19, files: 1, lines: -1/+1]
| in, and show vnode is used from ddb on the faulty nullfs vnode, we get panic instead of vnode dump.
| MFC after: 1 week
* Use the plain panic calls, without additional printing around them. [kib, 2011-11-19, files: 1, lines: -14/+4]
| The debugger and dumping support is adequate.
| Tested by: pho
| MFC after: 1 week
* The use of VOP_ISLOCKED() without a check for the return values can cause [kib, 2011-10-24, files: 1, lines: -4/+1]
| false positives. Replace the #ifdef block with the proper ASSERT_VOP_UNLOCKED() assert.
| Tested by: pho
| MFC after: 1 week
* The only possible error return from null_nodeget() is due to insmntque1 [kib, 2011-10-24, files: 1, lines: -1/+0]
| failure (the getnewvnode cannot return an error). In this case, the null_insmntque_dtr() already unlocked the reclaimed vnode, so VOP_UNLOCK() in the nullfs_mount() after null_nodeget() failure is wrong.
| Tested by: pho
| MFC after: 1 week
* The covered vnode must be relocked if it was unlocked. Remove VOP_ISLOCKED [kib, 2011-10-24, files: 1, lines: -1/+1]
| test because of this and also because it can lead to false positives.
| Tested by: pho
| MFC after: 1 week
* Only unlock if the lock is exclusive. [pho, 2011-10-24, files: 1, lines: -3/+2]
| Reported by: Subbsd <subbsd gmail com>
| Discussed with: kib
* Add a lock flags argument to the VFS_FHTOVP() file system [rmacklem, 2011-05-22, files: 1, lines: -2/+4]
| method, so that callers can indicate the minimum vnode locking requirement. This will allow some file systems to choose to return a LK_SHARED locked vnode when LK_SHARED is specified for the flags argument. This patch only adds the flag. It does not change any file system to use it and all callers specify LK_EXCLUSIVE, so file system semantics are not changed.
| Reviewed by: kib
* Fix typos - remove duplicate "is". [brucec, 2011-02-23, files: 1, lines: -1/+1]
| PR: docs/154934
| Submitted by: Eitan Adler <lists at eitanadler.com>
| MFC after: 3 days
* Add a null_remove() function to nullfs, so that the v_usecount [rmacklem, 2010-08-31, files: 1, lines: -0/+27]
| of the lower level vnode is incremented to greater than 1 when the upper level vnode's v_usecount is greater than one. This is necessary for the NFS clients, so that they will do a silly rename of the file instead of actually removing it when the file is still in use. It is "racy", since the v_usecount is incremented in many places in the kernel with minimal synchronization, but an extraneous silly rename is preferred to not doing a silly rename when it is required.
| The only other file systems that currently check the value of v_usecount in their VOP_REMOVE() functions are nwfs and smbfs. These file systems choose to fail a remove when the v_usecount is greater than 1 and I believe will function more correctly with this patch, as well.
| Tested by: to.my.trociny at gmail.com
| Submitted by: to.my.trociny at gmail.com (earlier version)
| Reviewed by: kib
| MFC after: 2 weeks
* Disable bypass for the vop_advlockpurge(). The vop is called after [kib, 2010-05-16, files: 1, lines: -0/+1]
| vop_revoke(), the v_data is already destroyed.
| Reported and tested by: ed