path: root/sys/kern/vfs_vnops.c
Commit message | Author | Age | Files | Lines
* MFC r294596:kib2016-02-141-2/+3
| | | | | | | Limit the accesses to file's f_advice member to VREG vnodes only. Recheck that f_advice is not NULL after lock is taken. Approved by: re (marius)
* MFC r287599:kib2015-09-161-1/+4
| | | | | | Correct handling of open("name", O_DIRECTORY | O_CREAT). PR: 202892
* MFC r286106:kib2015-08-071-0/+62
| | | | Provide prefaulting for the userspace i/o buffers, disabled by default.
* MFC r283601:kib2015-06-101-10/+10
| | | | | Add V_MNTREF flag, to indicate that caller of vn_start*_write() already owns a reference on the mount point, and the functions can consume it.
* MFC r276008:kib2015-01-041-0/+2
| | | | | Add VN_OPEN_NAMECACHE flag for vn_open_cred(9), which requests that the created file name be cached. Use the flag for core dumps.
* MFC r275897:kib2015-01-011-1/+5
| | | | | Set NOCACHE flag for CREATE namei() calls, do not specially handle MAKEENTRY in VOP_LOOKUP().
* MFC r275744:kib2014-12-201-5/+8
| | | | | Only sleep interruptible while waiting for suspension end when filesystem specified VFCF_SBDRY flag, i.e. for NFS.
* MFC r274501:kib2014-11-211-1/+3
| | | | | In vfs_write_suspend_umnt(), if suspension cannot be established, do not forget to restore write ops count when returning the error.
* MFC r272534:kib2014-10-181-7/+10
| | | | | Add IO_RANGELOCKED flag for vn_rdwr(9), which specifies that vnode is not locked, but range is.
* MFC r272538:kib2014-10-111-5/+4
| | | | | Slightly reword comment. Move code, which is described by the comment, after it.
* MFC r270993:mjg2014-09-261-1/+2
| | | | | | | | | | Fix up proc_realparent to always return correct process. Prior to the change it would always return initproc for non-traced processes. This fixes a regression in inferior(). Approved by: re (marius)
* MFC r268612:kib2014-07-281-0/+31
| | | | | | | Add helper vfs_write_suspend_umnt(). Fix the bug in the FFS unmount where, when suspension failed, the ufs extattrs were not reinitialized.
* MFC r268606:kib2014-07-281-4/+27
| | | | | | Generalize vn_get_ino() to allow filesystems to use custom vnode producer. Convert inline copies of vn_get_ino() in msdosfs and cd9660 into the uses of vn_get_ino_gen().
* MFC r267491:kib2014-06-291-52/+137
| | | | Use vn_io_fault for the writes from core dumping code.
* MFC r267564:kib2014-06-241-0/+24
| | | | | In msdosfs_setattr(), add a check for result of the utimes(2) permissions test. Refactor the permission checks for utimes(2).
* MFC r259522:kib2013-12-241-0/+3
| | | | | | If vn_open_vnode() succeeded in opening the vnode, but subsequent advisory lock cannot be obtained, prevent double-close of the vnode in vn_close() called from the fdrop(), by resetting file's f_ops methods.
* MFC r258039:kib2013-12-171-10/+9
| | | | | | | | Avoid overflow for the page counts. MFC r258365: Revert back to use int for the page counts. Rearrange the checks to correctly handle overflowing address arithmetic.
* MFC r257898:kib2013-12-131-2/+2
| | | | | Change VFS_PROLOGUE() to evaluate the mp once, convert MNTK_SHARED_WRITES and MNTK_EXTENDED_SHARED tests into inline functions.
* When opening or closing fifo, ensure that the vnode is lockedkib2013-09-131-1/+3
| | | | | | | | | | | exclusively. Filesystems are assumed to disable shared locking for the fifo vnode locks, but some do not. Reported and tested by: olgeni Discussed with: avg Sponsored by: The FreeBSD Foundation MFC after: 1 week Approved by: re (glebius)
* Make the seek a method of the struct fileops.kib2013-08-211-0/+71
| | | | | Tested by: pho Sponsored by: The FreeBSD Foundation
* Restore the previous sendfile(2) behaviour on the block devices.kib2013-08-161-1/+0
| | | | | | | Provide valid .fo_sendfile method for several missed struct fileops. Reviewed by: glebius Sponsored by: The FreeBSD Foundation
* Make sendfile() a method in the struct fileops. Currently onlyglebius2013-08-151-0/+2
| | | | | | | | vnode backed file descriptors have this method implemented. Reviewed by: kib Sponsored by: Nginx, Inc. Sponsored by: Netflix
* There are several code sequences likekib2013-07-091-2/+16
| | | | | | | | | | | | | | | | | | | vfs_busy(mp); vfs_write_suspend(mp); which are problematic if another thread starts unmount between the two calls. The unmount starts a write, while vfs_write_suspend() drains writers. On the other hand, unmount drains busy references, causing the deadlock. Add a flag argument to vfs_write_suspend and require the callers of it to specify VS_SKIP_UNMOUNT flag, when the call is performed not in the mount path, i.e. the covered vnode is not locked. The suspension is not attempted if VS_SKIP_UNMOUNT is specified and unmount is in progress. Reported and tested by: Andreas Longwitz <longwitz@incore.de> Sponsored by: The FreeBSD Foundation MFC after: 3 weeks
* Style fixes to vn_ioctl().jhb2013-05-311-14/+15
| | | | Suggested by: bde
* Fix FIONREAD on regular files. The computed result was being ignored andjhb2013-05-031-3/+2
| | | | | | | | it was being passed down to VOP_IOCTL() where it promptly resulted in ENOTTY due to a missing else for the past 8 years. While here, use a shared vnode lock while fetching the current file's size. MFC after: 1 week
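As a rough userspace illustration of what the FIONREAD fix above makes observable, the sketch below asks the kernel how many bytes lie between the current offset and end of file on a regular file. The helper name `bytes_remaining` is hypothetical and not part of the commit; the only system interface assumed is the standard `ioctl(fd, FIONREAD, &int)` call.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the commit): return the number of bytes
 * between the current file offset and end of file, as reported by
 * FIONREAD, or -1 if the ioctl fails.
 */
static int
bytes_remaining(int fd)
{
	int avail = 0;

	assert(fd >= 0);
	if (ioctl(fd, FIONREAD, &avail) == -1) {
		/* Before the fix, regular files ended up here with ENOTTY. */
		perror("ioctl(FIONREAD)");
		return (-1);
	}
	return (avail);
}
```

Called on a descriptor for a 10-byte regular file positioned at offset 4, the helper would be expected to return 6.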
* Use a shared lock for VOP_GETEXTATTR, as it is a read-like operation.mdf2013-03-301-1/+1
| | | | MFC after: 1 week
* Separate the copyright lines and the informational block by a blank line.kib2013-03-151-0/+1
| | | | | Requested by: joel MFC after: 2 weeks
* Add my copyright for the 2012 year work, in particular vn_io_fault()kib2013-03-151-0/+5
| | | | | | | and f_offset locking. Add required Foundation notice for r248319. Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
* Implement the helper function vn_io_fault_pgmove(), intended for use bykib2013-03-151-0/+39
| | | | | | | | | | | the filesystem VOP_READ() and VOP_WRITE() implementations in the same way as vn_io_fault_uiomove() over the unmapped buffers. Helper provides the convenient wrapper over the pmap_copy_pages() for struct uio consumers, taking care of the TDP_UIOHELD situations. Sponsored by: The FreeBSD Foundation Tested by: pho MFC after: 2 weeks
* MFCattilio2013-03-021-6/+2
|\
| * Remove unnecessary variables.pjd2013-03-011-6/+2
| |
* | Rename VM_OBJECT_LOCK(), VM_OBJECT_UNLOCK() and VM_OBJECT_TRYLOCK() toattilio2013-02-201-2/+2
| | | | | | | | | | | | their "write" versions. Sponsored by: EMC / Isilon storage division
* | Switch vm_object lock to be a rwlock.attilio2013-02-201-0/+1
|/ | | | | | | * VM_OBJECT_LOCK and VM_OBJECT_UNLOCK are mapped to write operations * VM_OBJECT_SLEEP() is introduced as a general purpose primitive to get a sleep operation using a VM_OBJECT_LOCK() as protection * The approach must bear with vm_pager.h namespace pollution so many files require including rwlock.h directly
* vn_io_faults_cnt:pluknet2013-02-151-2/+2
| | | | | | | | - use u_long consistently - use SYSCTL_ULONG to match the type of variable Reviewed by: kib MFC after: 1 week
* Simplify code a bit. This is leftover after Giant removal from VFS.pjd2013-01-311-3/+1
|
* Add flags argument to vfs_write_resume() and removekib2013-01-111-9/+2
| | | | | | vfs_write_resume_flags(). Sponsored by: The FreeBSD Foundation
* The process_deferred_inactive() function locks the vnodes of the ufskib2013-01-011-1/+2
| | | | | | | | | | | | | | | mount, which means that it must not be called while the snaplock is owned. The vfs_write_resume(9) does call the function as the VFS_SUSP_CLEAN() method, which is too early and falls into the region still protected by snaplock. Add yet another flag for the vfs_write_resume_flags() to avoid calling suspension cleanup handler after the suspend is lifted, and use it in the ffs_snapshot() call to vfs_write_resume. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
* Make it possible to atomically resume writes on the mount and accountkib2012-12-281-27/+52
| | | | | | | | | | | | | | | | | | | the write start, by adding a variation of the vfs_write_resume(9) which accepts flags. Use the new function to prevent a deadlock between parallel suspension and snapshotting a UFS mount. The ffs_snapshot() code performed vfs_write_resume() followed by vn_start_write() while owning the snaplock. If the suspension intervenes between resume and vn_start_write(), the deadlock occurred after the suspending thread tried to lock the snaplock, most typically during the write in the ffs_copyonwrite(). Reported and tested by: Andreas Longwitz <longwitz@incore.de> Reviewed by: mckusick MFC after: 2 weeks X-MFC-note: make the vfs_write_resume(9) function a macro after the MFC, in HEAD
* - Add NOCAPCHECK flag to namei that allows lookup to work even if the processpjd2012-11-271-0/+4
| | | | | | | | | | | | is in capability mode. - Add VN_OPEN_NOCAPCHECK flag for vn_open_cred() that will be converted into NOCAPCHECK namei flag. This functionality will be used to enable core dumps for sandboxed processes. Reviewed by: rwatson Obtained from: WHEEL Systems MFC after: 2 weeks
* The r241025 fixed the case when a binary, executed from nullfs mount,kib2012-11-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | was still possible to open for write from the lower filesystem. There is a symmetric situation where the binary could already have file descriptors opened for write, but it can be executed from the nullfs overlay. Handle the issue by passing one v_writecount reference to the lower vnode if nullfs vnode has non-zero v_writecount. Note that only one write reference can be donated, since nullfs only keeps one use reference on the lower vnode. Always use the lower vnode v_writecount for the checks. Introduce the VOP_GET_WRITECOUNT to read v_writecount, which is currently always bypassed to the lower vnode, and VOP_ADD_WRITECOUNT to manipulate the v_writecount value, which manages a single bypass reference to the lower vnode. Calling the VOPs instead of directly accessing v_writecount provides the fix described in the previous paragraph. Tested by: pho MFC after: 3 weeks
* Remove the support for using non-mpsafe filesystem modules.kib2012-10-221-57/+7
| | | | | | | | | | | | In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems. The VFS_VERSION is bumped to indicate the interface change which does not result in the interface signatures changes. Conducted and reviewed by: attilio Tested by: pho
* Fix the mis-handling of the VV_TEXT on the nullfs vnodes.kib2012-09-281-1/+1
| | | | | | | | | | | | | | | | If you have a binary on a filesystem which is also mounted over by nullfs, you could execute the binary from the lower filesystem, or from the nullfs mount. When executed from lower filesystem, the lower vnode gets VV_TEXT flag set, and the file cannot be modified while the binary is active. But, if executed as the nullfs alias, only the nullfs vnode gets VV_TEXT set, and you still can open the lower vnode for write. Add a set of VOPs for the VV_TEXT query, set and clear operations, which are correctly bypassed to lower vnode. Tested by: pho (previous version) MFC after: 2 weeks
* vn_write() always expects FOF_OFFSET flag, which is asserted at the beginning,pjd2012-09-251-4/+3
| | | | | | | so there is no need to check for it. Sponsored by: FreeBSD Foundation MFC after: 2 weeks
* Reorder the management of advisory locks on open files so that the advisoryjhb2012-07-311-5/+55
| | | | | | | | | | | | | | | | | | | | | | lock is obtained before the write count is increased during open() and the lock is released after the write count is decreased during close(). The first change closes a race where an open() that will block with O_SHLOCK or O_EXLOCK can increase the write count while it waits. If the process holding the current lock on the file then tries to call exec() on the file it has locked, it can fail with ETXTBUSY even though the advisory lock is preventing other threads from successfully completing a writable open(). The second change closes a race where a read-only open() with O_SHLOCK or O_EXLOCK may return successfully while the write count is non-zero due to another descriptor that had the advisory lock and was blocking the open() still being in the process of closing. If the process that completed the open() then attempts to call exec() on the file it locked, it can fail with ETXTBUSY even though the other process that held a write lock has closed the file and released the lock. Reviewed by: kib MFC after: 1 month
* Extend the KPI to lock and unlock f_offset member of struct file. Itkib2012-07-021-29/+73
| | | | | | | | | | | | | | | | | | now fully encapsulates all accesses to f_offset, and extends f_offset locking to other consumers that need it, in particular, to lseek() and variants of getdirentries(). Ensure that on 32bit architectures f_offset, which is a 64bit quantity, is always read and written under the mtxpool protection. This fixes an apparently easy to trigger race where parallel lseek()s or lseek() and read/write could destroy the file offset. The already broken ABI emulations, including iBCS and SysV, are not converted (yet). Tested by: pho No objections from: jhb MFC after: 3 weeks
* Fix locking for f_offset, vn_read() and vn_write() cases only, for now.kib2012-06-211-53/+80
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It seems that intended locking protocol for struct file f_offset field was as follows: f_offset should always be changed under the vnode lock (except fcntl(2) and lseek(2) did not follow the rules). Since read(2) uses shared vnode lock, FOFFSET_LOCKED block is additionally taken to serialize shared vnode lock owners. This was broken first by enabling shared lock on writes, then by fadvise changes, which moved the f_offset assignment out from under the vnode lock, and last by vn_io_fault() doing chunked i/o. More, due to uio_offset not yet valid in vn_io_fault(), the range lock for reads was taken on the wrong region. Change the locking for f_offset to always use FOFFSET_LOCKED block, which is placed before rangelocks in the lock order. Extract foffset_lock() and foffset_unlock() functions which implement FOFFSET_LOCKED lock, and consistently lock f_offset with it in the vn_io_fault() both for reads and writes, even if MNTK_NO_IOPF flag is not set for the vnode mount. Indicate that f_offset is already valid for vn_read() and vn_write() calls from vn_io_fault() with FOF_OFFSET flag, and assert that all callers of vn_read() and vn_write() follow this protocol. Extract get_advice() function to calculate the POSIX_FADV_XXX value for the i/o region, and use it where appropriate. Reviewed by: jhb Tested by: pho MFC after: 2 weeks
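The FOFFSET_LOCKED protocol described in the commit above can be pictured with a small userspace analogy. This is only a sketch under stated assumptions: `struct sketch_file`, `sketch_foffset_lock()` and `sketch_foffset_unlock()` are hypothetical names, and a pthread mutex plus condition variable stand in for the kernel's mutex and sleep primitives. It is not the kernel implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for the offset-related part of struct file. */
struct sketch_file {
	pthread_mutex_t	mtx;	/* protects 'locked' and 'offset' */
	pthread_cond_t	cv;	/* waiters sleep here while 'locked' is set */
	int		locked;	/* plays the role of the FOFFSET_LOCKED flag */
	int64_t		offset;	/* 64-bit file offset */
};

#define	SKETCH_FILE_INITIALIZER \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0 }

/* Take the offset lock and return the offset current at that moment. */
static int64_t
sketch_foffset_lock(struct sketch_file *fp)
{
	int64_t off;

	pthread_mutex_lock(&fp->mtx);
	while (fp->locked)
		pthread_cond_wait(&fp->cv, &fp->mtx);
	fp->locked = 1;
	off = fp->offset;
	pthread_mutex_unlock(&fp->mtx);
	return (off);
}

/* Publish the new offset, drop the offset lock, and wake one waiter. */
static void
sketch_foffset_unlock(struct sketch_file *fp, int64_t newoff)
{
	pthread_mutex_lock(&fp->mtx);
	assert(fp->locked);
	fp->offset = newoff;
	fp->locked = 0;
	pthread_cond_signal(&fp->cv);
	pthread_mutex_unlock(&fp->mtx);
}
```

In this sketch the flag is held across the whole i/o while the mutex is held only briefly, so a long read or write serializes other offset users without keeping a mutex locked during the transfer. The 64-bit offset is only ever read or written with the mutex held, which is the property that matters on 32-bit machines.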
* Further refine the implementation of POSIX_FADV_NOREUSE.jhb2012-06-191-11/+86
| | | | | | | | | | | | | | | | | | | | | | | | First, extend the changes in r230782 to better handle the common case of using NOREUSE with sequential reads. A NOREUSE file descriptor will now track the last implicit DONTNEED request it made as a result of a NOREUSE read. If a subsequent NOREUSE read is adjacent to the previous range, it will apply the DONTNEED request to the entire range of both the previous read and the current read. The effect is that each read of a file accessed sequentially will apply the DONTNEED request to the entire range that has been read. This allows NOREUSE to properly handle misaligned reads by flushing each buffer to cache once it has been completely read. Second, apply the same changes made to read(2) by r230782 and this change to writes. This provides much better performance in the sequential write case as it allows writes to still be clustered. It also provides much better performance for misaligned writes. It does mean that NOREUSE will be generally ineffective for non-sequential writes as the current implementation relies on a future NOREUSE write's implicit DONTNEED request to flush the dirty buffer from the current write. MFC after: 2 weeks
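The range tracking described in the commit above can be sketched as a small piece of bookkeeping. This is a simplified illustration only: `struct noreuse_state` and `noreuse_note_io()` are hypothetical names, and the sketch handles just the forward-sequential case in which a new request starts right after the previous one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-descriptor state: the last implicit DONTNEED range. */
struct noreuse_state {
	int	valid;		/* nonzero once a range has been recorded */
	int64_t	prev_start;	/* first byte of the previous range */
	int64_t	prev_end;	/* last byte of the previous range, inclusive */
};

/*
 * Record a NOREUSE i/o over [start, end] and compute the range that the
 * implicit DONTNEED request should cover.  If the i/o begins right after
 * the previous range, the request is widened back to that range's start.
 */
static void
noreuse_note_io(struct noreuse_state *st, int64_t start, int64_t end,
    int64_t *dn_start, int64_t *dn_end)
{
	assert(start <= end);
	if (st->valid && st->prev_end + 1 == start)
		start = st->prev_start;
	st->valid = 1;
	st->prev_start = start;
	st->prev_end = end;
	*dn_start = start;
	*dn_end = end;
}
```

With this bookkeeping, a run of sequential requests keeps growing one range from the first byte touched, while a request elsewhere in the file starts a fresh range.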
* Split the second half of vn_open_cred() (after a vnode has been found viajhb2012-06-081-32/+42
| | | | | | | | | | a lookup or created via VOP_CREATE()) into a new vn_open_vnode() function and use this function in fhopen() instead of duplicating code from vn_open_cred() directly. Tested by: pho Reviewed by: kib MFC after: 2 weeks
* Add a knob to disable vn_io_fault.kib2012-06-031-1/+5
| | | | MFC after: 1 month
* Count and export the number of prefaults that happen.kib2012-06-031-0/+5
| | | | MFC after: 1 month