| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
Limit the accesses to file' f_advice member to VREG vnodes only.
Recheck that f_advice is not NULL after lock is taken.
Approved by: re (marius)
|
|
|
|
|
|
| |
Correct handling of open("name", O_DIRECTORY | O_CREAT).
PR: 202892
|
|
|
|
| |
Provide a prefaulting for the userspace i/o buffers, disabled by default.
|
|
|
|
|
| |
Add V_MNTREF flag, to indicate that caller of vn_start*_write() already
owns a reference on the mount point, and the functions can consume it.
|
|
|
|
|
| |
Add VN_OPEN_NAMECACHE flag for vn_open_cred(9), which requests that
the created file name was cached. Use the flag for core dumps.
|
|
|
|
|
| |
Set NOCACHE flag for CREATE namei() calls, do not specially handle
MAKEENTRY in VOP_LOOKUP().
|
|
|
|
|
| |
Only sleep interruptible while waiting for suspension end when
filesystem specified VFCF_SBDRY flag, i.e. for NFS.
|
|
|
|
|
| |
In vfs_write_suspend_umnt(), if suspension cannot be established, do
not forget to restore write ops count when returning the error.
|
|
|
|
|
| |
Add IO_RANGELOCKED flag for vn_rdwr(9), which specifies that vnode is
not locked, but range is.
|
|
|
|
|
| |
Slightly reword comment. Move code, which is described by the
comment, after it.
|
|
|
|
|
|
|
|
|
|
| |
Fix up proc_realparent to always return correct process.
Prior to the change it would always return initproc for non-traced processes.
This fixes a regression in inferior().
Approved by: re (marius)
|
|
|
|
|
|
|
| |
Add helper helper vfs_write_suspend_umnt().
Fix the bug in the FFS unmount, when suspension failed, the ufs
extattrs were not reinitialized.
|
|
|
|
|
|
| |
Generalize vn_get_ino() to allow filesystems to use custom vnode
producer. Convert inline copies of vn_get_ino() in msdosfs and cd9660
into the uses of vn_get_ino_gen().
|
|
|
|
| |
Use vn_io_fault for the writes from core dumping code.
|
|
|
|
|
| |
In msdosfs_setattr(), add a check for result of the utimes(2) permissions test.
Refactor the permission checks for utimes(2).
|
|
|
|
|
|
| |
If vn_open_vnode() succeeded in opening the vnode, but subsequent
advisory lock cannot be obtained, prevent double-close of the vnode in
vn_close() called from the fdrop(), by resetting file' f_ops methods.
|
|
|
|
|
|
|
|
| |
Avoid overflow for the page counts.
MFC r258365:
Revert back to use int for the page counts.
Rearrange the checks to correctly handle overflowing address arithmetic.
|
|
|
|
|
| |
Change VFS_PROLOGUE() to evaluate the mp once, convert
MNTK_SHARED_WRITES and MNTK_EXTENDED_SHARED tests into inline functions.
|
|
|
|
|
|
|
|
|
|
|
| |
exclusively. Filesystems are assumed to disable shared locking for
the fifo vnode locks, but some do not.
Reported and tested by: olgeni
Discussed with: avg
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Approved by: re (glebius)
|
|
|
|
|
| |
Tested by: pho
Sponsored by: The FreeBSD Foundation
|
|
|
|
|
|
|
| |
Provide valid .fo_sendfile method for several missed struct fileops.
Reviewed by: glebius
Sponsored by: The FreeBSD Foundation
|
|
|
|
|
|
|
|
| |
vnode backed file descriptors have this method implemented.
Reviewed by: kib
Sponsored by: Nginx, Inc.
Sponsored by: Netflix
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vfs_busy(mp);
vfs_write_suspend(mp);
which are problematic if other thread starts unmount between two
calls. The unmount starts a write, while vfs_write_suspend() drain
writers. On the other hand, unmount drains busy references, causing
the deadlock.
Add a flag argument to vfs_write_suspend and require the callers of it
to specify VS_SKIP_UNMOUNT flag, when the call is performed not in the
mount path, i.e. the covered vnode is not locked. The suspension is
not attempted if VS_SKIP_UNMOUNT is specified and unmount is in
progress.
Reported and tested by: Andreas Longwitz <longwitz@incore.de>
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
|
|
|
|
| |
Suggested by: bde
|
|
|
|
|
|
|
|
| |
it was being passed down to VOP_IOCTL() where it promptly resulted in
ENOTTY due to a missing else for the past 8 years. While here, use a
shared vnode lock while fetching the current file's size.
MFC after: 1 week
|
|
|
|
| |
MFC after: 1 week
|
|
|
|
|
| |
Requested by: joel
MFC after: 2 weeks
|
|
|
|
|
|
|
| |
and f_offset locking. Add required Foundation notice for r248319.
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
|
|
|
|
|
|
|
|
|
|
|
| |
the filesystem VOP_READ() and VOP_WRITE() implementations in the same
way as vn_io_fault_uiomove() over the unmapped buffers. Helper
provides the convenient wrapper over the pmap_copy_pages() for struct
uio consumers, taking care of the TDP_UIOHELD situations.
Sponsored by: The FreeBSD Foundation
Tested by: pho
MFC after: 2 weeks
|
|\ |
|
| | |
|
| |
| |
| |
| |
| |
| | |
their "write" versions.
Sponsored by: EMC / Isilon storage division
|
|/
|
|
|
|
|
|
| |
* VM_OBJECT_LOCK and VM_OBJECT_UNLOCK are mapped to write operations
* VM_OBJECT_SLEEP() is introduced as a general purpose primitve to
get a sleep operation using a VM_OBJECT_LOCK() as protection
* The approach must bear with vm_pager.h namespace pollution so many
files require including directly rwlock.h
|
|
|
|
|
|
|
|
| |
- use u_long consistently
- use SYSCTL_ULONG to match the type of variable
Reviewed by: kib
MFC after: 1 week
|
| |
|
|
|
|
|
|
| |
vfs_write_resume_flags().
Sponsored by: The FreeBSD Foundation
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
mount, which means that is must not be called while the snaplock is
owned. The vfs_write_resume(9) does call the function as the
VFS_SUSP_CLEAN() method, which is too early and falls into the region
still protected by snaplock.
Add yet another flag for the vfs_write_resume_flags() to avoid calling
suspension cleanup handler after the suspend is lifted, and use it in
the ffs_snapshot() call to vfs_write_resume.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the write start, by adding a variation of the vfs_write_resume(9)
which accepts flags.
Use the new function to prevent a deadlock between parallel suspension
and snapshotting a UFS mount. The ffs_snapshot() code performed
vfs_write_resume() followed by vn_start_write() while owning the
snaplock. If the suspension intervene between resume and
vn_start_write(), the deadlock occured after the suspending thread
tried to lock the snaplock, most typically during the write in the
ffs_copyonwrite().
Reported and tested by: Andreas Longwitz <longwitz@incore.de>
Reviewed by: mckusick
MFC after: 2 weeks
X-MFC-note: make the vfs_write_resume(9) function a macro after the MFC,
in HEAD
|
|
|
|
|
|
|
|
|
|
|
|
| |
is in capability mode.
- Add VN_OPEN_NOCAPCHECK flag for vn_open_cred() to will ne converted into
NOCAPCHECK namei flag.
This functionality will be used to enable core dumps for sandboxed processes.
Reviewed by: rwatson
Obtained from: WHEEL Systems
MFC after: 2 weeks
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
was still possible to open for write from the lower filesystem. There
is a symmetric situation where the binary could already has file
descriptors opened for write, but it can be executed from the nullfs
overlay.
Handle the issue by passing one v_writecount reference to the lower
vnode if nullfs vnode has non-zero v_writecount. Note that only one
write reference can be donated, since nullfs only keeps one use
reference on the lower vnode. Always use the lower vnode v_writecount
for the checks.
Introduce the VOP_GET_WRITECOUNT to read v_writecount, which is
currently always bypassed to the lower vnode, and VOP_ADD_WRITECOUNT
to manipulate the v_writecount value, which manages a single bypass
reference to the lower vnode. Caling the VOPs instead of directly
accessing v_writecount provide the fix described in the previous
paragraph.
Tested by: pho
MFC after: 3 weeks
|
|
|
|
|
|
|
|
|
|
|
|
| |
In particular, do not lock Giant conditionally when calling into the
filesystem module, remove the VFS_LOCK_GIANT() and related
macros. Stop handling buffers belonging to non-mpsafe filesystems.
The VFS_VERSION is bumped to indicate the interface change which does
not result in the interface signatures changes.
Conducted and reviewed by: attilio
Tested by: pho
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If you have a binary on a filesystem which is also mounted over by
nullfs, you could execute the binary from the lower filesystem, or
from the nullfs mount. When executed from lower filesystem, the lower
vnode gets VV_TEXT flag set, and the file cannot be modified while the
binary is active. But, if executed as the nullfs alias, only the
nullfs vnode gets VV_TEXT set, and you still can open the lower vnode
for write.
Add a set of VOPs for the VV_TEXT query, set and clear operations,
which are correctly bypassed to lower vnode.
Tested by: pho (previous version)
MFC after: 2 weeks
|
|
|
|
|
|
|
| |
so there is no need to check for it.
Sponsored by: FreeBSD Foundation
MFC after: 2 weeks
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
lock is obtained before the write count is increased during open() and the
lock is released after the write count is decreased during close().
The first change closes a race where an open() that will block with O_SHLOCK
or O_EXLOCK can increase the write count while it waits. If the process
holding the current lock on the file then tries to call exec() on the file
it has locked, it can fail with ETXTBUSY even though the advisory lock is
preventing other threads from succesfully completeing a writable open().
The second change closes a race where a read-only open() with O_SHLOCK or
O_EXLOCK may return successfully while the write count is non-zero due to
another descriptor that had the advisory lock and was blocking the open()
still being in the process of closing. If the process that completed the
open() then attempts to call exec() on the file it locked, it can fail with
ETXTBUSY even though the other process that held a write lock has closed
the file and released the lock.
Reviewed by: kib
MFC after: 1 month
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
now fully encapsulates all accesses to f_offset, and extends f_offset
locking to other consumers that need it, in particular, to lseek() and
variants of getdirentries().
Ensure that on 32bit architectures f_offset, which is 64bit quantity,
always read and written under the mtxpool protection. This fixes
apparently easy to trigger race when parallel lseek()s or lseek() and
read/write could destroy file offset.
The already broken ABI emulations, including iBCS and SysV, are not
converted (yet).
Tested by: pho
No objections from: jhb
MFC after: 3 weeks
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It seems that intended locking protocol for struct file f_offset field
was as follows: f_offset should always be changed under the vnode lock
(except fcntl(2) and lseek(2) did not followed the rules). Since
read(2) uses shared vnode lock, FOFFSET_LOCKED block is additionally
taken to serialize shared vnode lock owners.
This was broken first by enabling shared lock on writes, then by
fadvise changes, which moved f_offset assigned from under vnode lock,
and last by vn_io_fault() doing chunked i/o. More, due to uio_offset
not yet valid in vn_io_fault(), the range lock for reads was taken on
the wrong region.
Change the locking for f_offset to always use FOFFSET_LOCKED block,
which is placed before rangelocks in the lock order.
Extract foffset_lock() and foffset_unlock() functions which implements
FOFFSET_LOCKED lock, and consistently lock f_offset with it in the
vn_io_fault() both for reads and writes, even if MNTK_NO_IOPF flag is
not set for the vnode mount. Indicate that f_offset is already valid
for vn_read() and vn_write() calls from vn_io_fault() with FOF_OFFSET
flag, and assert that all callers of vn_read() and vn_write() follow
this protocol.
Extract get_advice() function to calculate the POSIX_FADV_XXX value
for the i/o region, and use it were appropriate.
Reviewed by: jhb
Tested by: pho
MFC after: 2 weeks
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
First, extend the changes in r230782 to better handle the common case
of using NOREUSE with sequential reads. A NOREUSE file descriptor
will now track the last implicit DONTNEED request it made as a result
of a NOREUSE read. If a subsequent NOREUSE read is adjacent to the
previous range, it will apply the DONTNEED request to the entire range
of both the previous read and the current read. The effect is that
each read of a file accessed sequentially will apply the DONTNEED
request to the entire range that has been read. This allows NOREUSE
to properly handle misaligned reads by flushing each buffer to cache
once it has been completely read.
Second, apply the same changes made to read(2) by r230782 and this
change to writes. This provides much better performance in the
sequential write case as it allows writes to still be clustered. It
also provides much better performance for misaligned writes. It does
mean that NOREUSE will be generally ineffective for non-sequential
writes as the current implementation relies on a future NOREUSE
write's implicit DONTNEED request to flush the dirty buffer from the
current write.
MFC after: 2 weeks
|
|
|
|
|
|
|
|
|
|
| |
a lookup or created via VOP_CREATE()) into a new vn_open_vnode() function
and use this function in fhopen() instead of duplicating code from
vn_open_cred() directly.
Tested by: pho
Reviewed by: kib
MFC after: 2 weeks
|
|
|
|
| |
MFC after: 1 month
|
|
|
|
| |
MFC after: 1 month
|