| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Do not enable nullfs vnode caching over nfs v4 mounts.
|
|
|
|
|
| |
Pass MNTK_NO_IOPF and MNTK_UNMAPPED_BUFS flags from the lower
filesystem to the nullfs mount.
|
|
|
|
|
|
|
|
| |
After nullfs rmdir operation, reclaim the directory vnode which was
unlinked. Otherwise the vnode stays cached, causing leak. This is
similar to r292961 for regular files.
Approved by: re (marius)
|
|
|
|
|
| |
Force nullfs vnode reclaim after unlinking, to potentially unlink
lower vnode.
|
|
|
|
|
|
|
|
|
|
|
| |
File systems that do not use the buffer cache (such as ZFS) must
use VOP_FSYNC() to perform the NFS server's Commit operation.
This patch adds a mnt_kern_flag called MNTK_USES_BCACHE which
is set by file systems that use the buffer cache. If this flag
is not set, the NFS server always does a VOP_FSYNC().
This should be ok for old file system modules that do not set
MNTK_USES_BCACHE, since calling VOP_FSYNC() is correct, although
it might not be optimal for file systems that use the buffer cache.
|
|
|
|
|
| |
Unlock ldvp and lock dvp to compensate for possible ldvp unlock in lower
VOP_LOOKUP() and dvp reclamation. Use cached value of dvp->v_mount.
|
|
|
|
|
| |
Assert that nullfs vnode has VV_ROOT set whenever lower vnode has.
Assert that dotdot lookup on the root vnode is not performed.
|
|
|
|
|
| |
Check for the cross-device cross-link attempt in the VFS, instead of
VOP_LINK() implemenations.
|
|
|
|
| |
Fix typo.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
transmission which could be tricked into rounding up to the nearest
page size, leaking up to a page of kernel memory. [13:11]
In IPv6 and NetATM, stop SIOCSIFADDR, SIOCSIFBRDADDR, SIOCSIFDSTADDR
and SIOCSIFNETMASK at the socket layer rather than pass them on to the
link layer without validation or credential checks. [SA-13:12]
Prevent cross-mount hardlinks between different nullfs mounts of the
same underlying filesystem. [SA-13:13]
Security: CVE-2013-5666
Security: FreeBSD-SA-13:11.sendfile
Security: CVE-2013-5691
Security: FreeBSD-SA-13:12.ifioctl
Security: CVE-2013-5710
Security: FreeBSD-SA-13:13.nullfs
Approved by: re
|
|
|
|
|
|
|
|
|
| |
vnode for tvp to allow the free of the lower vnode, if needed.
PR: kern/180236
Tested by: smh
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
|
|
|
|
|
|
|
|
|
|
| |
for the case when the nullfs vnode is not reclaimed. Otherwise, later
reclamation would not unlock the lower vnode.
Reported by: antoine
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
null_hashget() obtains the reference on the nullfs vnode, which must
be dropped.
- Fix a wart which existed from the introduction of the nullfs
caching, do not unlock lower vnode in the nullfs_reclaim_lowervp().
It should be innocent, but now it is also formally safe. Inform the
nullfs_reclaim() about this using the NULLV_NOUNLOCK flag set on
nullfs inode.
- Add a callback to the upper filesystems for the lower vnode
unlinking. When inactivating a nullfs vnode, check if the lower
vnode was unlinked, indicated by nullfs flag NULLV_DROP or VV_NOSYNC
on the lower vnode, and reclaim upper vnode if so. This allows
nullfs to purge cached vnodes for the unlinked lower vnode, avoiding
excessive caching.
Reported by: G??ran L??wkrantz <goran.lowkrantz@ismobile.com>
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Include some flags of the nullfs mount itself:
MNT_RDONLY, MNT_NOEXEC, MNT_NOSUID, MNT_UNION, MNT_NOSYMFOLLOW.
This allows userland code calling statfs() or fstatfs() to see these flags.
In particular, this allows opendir() to detect that a -t nullfs -o union
mount needs deduplication (otherwise at least . and .. are returned twice)
and allows rtld to detect a -t nullfs -o noexec mount as noexec.
Turn off the MNT_ROOTFS flag from the underlying filesystem because the
nullfs mount is definitely not the root filesystem.
Reviewed by: kib
MFC after: 1 week
|
|
|
|
|
|
|
|
|
| |
in r245004. Although the report was for noatime option which is
non-functional for the nullfs, other standard options like nosuid or
noexec are useful with it.
Reported by: Dewayne Geraghty <dewayne.geraghty@heuristicsystems.com.au>
MFC after: 3 days
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
existing nullfs vnode by the lower vnode is only 16 slots. Since the
default mode for the nullfs is to cache the vnodes, hash has extremely
huge chains.
Size the nullfs hashtbl based on the current value of
desiredvnodes. Use vfs_hash_index() to calculate the hash bucket for a
given vnode.
Pointy hat to: kib
Diagnosed and reviewed by: peter
Tested by: peter, pho (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 5 days
|
|
|
|
|
|
|
|
|
|
| |
get back the leased write reference from the lower vnode. There is no
other path which can correct v_writecount on the lowervp.
Reported by: flo
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
|
|
|
|
|
| |
Pointy hat to: kib
MFC after: 13 days
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the free nullfs vnodes, switching nullfs behaviour to pre-r240285.
The option is mostly intended as the last-resort when higher pressure
on the vnode cache due to doubling of the vnode counts is not
desirable.
Note that disabling the cache costs more than 2x wall time in the
metadata-hungry scenarious. The default is "cache".
Tested and benchmarked by: pho (previous version)
MFC after: 2 weeks
|
|
|
|
|
|
|
| |
lowervp vnode v_vnlock would cause panic due to NULL pointer
dereference much earlier.
MFC after: 1 week
|
|
|
|
|
|
|
|
|
|
| |
received granular locking) but the comment present in UFS has been
copied all over other filesystems code incorrectly for several times.
Removes comments that makes no sense now.
Reviewed by: kib
MFC after: 3 days
|
|
|
|
|
| |
Porters should refer to __FreeBSD_version 1000021 for this change as
it may have happened at the same timeframe.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
was still possible to open for write from the lower filesystem. There
is a symmetric situation where the binary could already has file
descriptors opened for write, but it can be executed from the nullfs
overlay.
Handle the issue by passing one v_writecount reference to the lower
vnode if nullfs vnode has non-zero v_writecount. Note that only one
write reference can be donated, since nullfs only keeps one use
reference on the lower vnode. Always use the lower vnode v_writecount
for the checks.
Introduce the VOP_GET_WRITECOUNT to read v_writecount, which is
currently always bypassed to the lower vnode, and VOP_ADD_WRITECOUNT
to manipulate the v_writecount value, which manages a single bypass
reference to the lower vnode. Caling the VOPs instead of directly
accessing v_writecount provide the fix described in the previous
paragraph.
Tested by: pho
MFC after: 3 weeks
|
|
|
|
|
| |
Submitted by: bf
MFC after: 1 week
|
|
|
|
| |
MFC after: 1 week
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
it. There are two problems which shall be addressed for shared
lookups use to have measurable effect on nullfs scalability:
1. When vfs_lookup() calls VOP_LOOKUP() for nullfs, which passes lookup
operation to lower fs, resulting vnode is often only shared-locked. Then
null_nodeget() cannot instantiate covering vnode for lower vnode, since
insmntque1() and null_hashins() require exclusive lock on the lower.
Change the assert that lower vnode is exclusively locked to only
require any lock. If null hash failed to find pre-existing nullfs
vnode for lower vnode and the vnode is shared-locked, the lower vnode
lock is upgraded.
2. Nullfs reclaims its vnodes on deactivation. This is due to nullfs
inability to detect reclamation of the lower vnode. Reclamation of a
nullfs vnode at deactivation time prevents a reference to the lower
vnode to become stale.
Change nullfs VOP_INACTIVE to not reclaim the vnode, instead use the
VFS_RECLAIM_LOWERVP to get notification and reclaim upper vnode
together with the reclamation of the lower vnode.
Note that nullfs reclamation procedure calls vput() on the lowervp
vnode, temporary unlocking the vnode being reclaimed. This seems to be
fine for MPSAFE filesystems, but not-MPSAFE code often put partially
initialized vnode on some globally visible list, and later can decide
that half-constructed vnode is not needed. If nullfs mount is created
above such filesystem, then other threads might catch such not
properly initialized vnode. Instead of trying to overcome this case,
e.g. by recursing the lower vnode lock in null_reclaim_lowervp(), I
decided to rely on nearby removal of the support for non-MPSAFE
filesystems.
In collaboration with: pho
MFC after: 3 weeks
|
|
|
|
| |
Reviewed by: kib
|
| |
|
|
|
|
|
|
|
| |
Lock the native nullfs vnode lock before switching the locks.
Tested by: pho
MFC after: 1 week
|
|
|
|
|
| |
Tested by: pho
MFC after: 1 week
|
|
|
|
|
|
|
| |
insmntque() requirements.
Tested by: pho
MFC after: 1 week
|
|
|
|
|
|
|
|
|
|
|
| |
instead of accepting half-constructed vnode. Previous code cannot decide
what to do with such vnode anyway, and although processing it for hash
removal, paniced later when getting rid of nullfs reference on lowervp.
While there, remove initializations from the declaration block.
Tested by: pho
MFC after: 1 week
|
|
|
|
|
|
|
|
|
| |
The null_nodeget() requires exclusive lock on lowervp to be able to
insmntque() new vnode.
Reported by: rea
Tested by: pho
MFC after: 1 week
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
function null_destroy_proto() from null_insmntque_dtr(). Also
apply null_destroy_proto() in null_nodeget() when we raced and a vnode
is found in the hash, so the currently allocated protonode shall be
destroyed.
Lock the vnode interlock around reassigning the v_vnlock.
In fact, this path will not be exercised after several later commits,
since null_nodeget() cannot take shared-locked lowervp at all due to
insmntque() requirements.
Reported by: rea
Tested by: pho
MFC after: 1 week
|
|
|
|
| |
MFC after: 1 week
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
a new jail parameter node with the following parameters:
allow.mount.devfs:
allow mounting the devfs filesystem inside a jail
allow.mount.nullfs:
allow mounting the nullfs filesystem inside a jail
Both parameters are disabled by default (equals the behavior before
devfs and nullfs in jails). Administrators have to explicitly allow
mounting devfs and nullfs for each jail. The value "-1" of the
devfs_ruleset parameter is removed in favor of the new allow setting.
Reviewed by: jamie
Suggested by: pjd
MFC after: 2 weeks
|
|
|
|
|
|
| |
This is now possible thanks to r230129.
MFC after: 1 month
|
|
|
|
|
|
|
| |
Use hashdestroy() instead of naive free().
Approved by: kib
MFC after: 2 weeks
|
|
|
|
|
|
|
| |
pointer 'lowervp' instead of 'vp', which is uninitialized at that point.
Reviewed by: kib
MFC after: 1 week
|
|
|
|
|
|
|
|
|
| |
Several callers of null_nodeget() did the cleanup itself, but several
missed it, most prominent being null_bypass(). Remove the cleanup from
the callers, now null_nodeget() handles lowervp free itself.
Reported and tested by: pho
MFC after: 1 week
|
|
|
|
|
| |
Tested by: pho
MFC after: 1 week
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
nullfs. The problem is that resulting vnode is only required to be
held on return from the successfull call to vop, instead of being
referenced.
Nullfs VOP_INACTIVE() method reclaims the vnode, which in combination
with the VOP_VPTOCNP() interface means that the directory vnode
returned from VOP_VPTOCNP() is reclaimed in advance, causing
vn_fullpath() to error with EBADF or like.
Change the interface for VOP_VPTOCNP(), now the dvp must be
referenced. Convert all in-tree implementations of VOP_VPTOCNP(),
which is trivial, because vhold(9) and vref(9) are similar in the
locking prerequisites. Out-of-tree fs implementation of VOP_VPTOCNP(),
if any, should have no trouble with the fix.
Tested by: pho
Reviewed by: mckusick
MFC after: 3 weeks (subject of re approval)
|
|
|
|
|
|
|
| |
in, and show vnode is used from ddb on the faulty nullfs vnode, we get
panic instead of vnode dump.
MFC after: 1 week
|
|
|
|
|
|
|
| |
The debugger and dumping support is adequate.
Tested by: pho
MFC after: 1 week
|
|
|
|
|
|
|
|
| |
false positives. Replace the #ifdef block with the proper
ASSERT_VOP_UNLOCKED() assert.
Tested by: pho
MFC after: 1 week
|
|
|
|
|
|
|
|
|
| |
failure (the getnewvnode cannot return an error). In this case, the
null_insmntque_dtr() already unlocked the reclaimed vnode, so VOP_UNLOCK()
in the nullfs_mount() after null_nodeget() failure is wrong.
Tested by: pho
MFC after: 1 week
|
|
|
|
|
|
|
| |
test because of this and also because it can lead to false positives.
Tested by: pho
MFC after: 1 week
|
|
|
|
|
| |
Reported by: Subbsd <subbsd gmail com>
Discussed with: kib
|
|
|
|
|
|
|
|
|
|
|
| |
method, so that callers can indicate the minimum vnode
locking requirement. This will allow some file systems to choose
to return a LK_SHARED locked vnode when LK_SHARED is specified
for the flags argument. This patch only adds the flag. It
does not change any file system to use it and all callers
specify LK_EXCLUSIVE, so file system semantics are not changed.
Reviewed by: kib
|
|
|
|
|
|
| |
PR: docs/154934
Submitted by: Eitan Adler <lists at eitanadler.com>
MFC after: 3 days
|