path: root/sys/fs
Commit message | Author | Age | Files | Lines
* Add sysctl vfs.nfs.nfs_keep_dirty_on_error to switch the nfs client | kib | 2012-03-17 | 2 | -3/+10
  behaviour on error from write RPC back to behaviour of old nfs client. When set to a non-zero value, the pages for which write failed are kept dirty.
  PR: kern/165927
  Reviewed by: alc
  MFC after: 2 weeks
* Prevent tmpfs_rename() deadlock in a way similar to UFS | gleb | 2012-03-14 | 2 | -7/+179
  Unlock vnodes and try to lock them one by one. Relookup fvp and tvp.
  Approved by: mdf (mentor)
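The unlock-and-retry approach described for tmpfs_rename() can be modeled in user space. This is a minimal Python sketch under stated assumptions, not the kernel code: two plain locks stand in for vnode locks, `lock_pair` is a hypothetical helper, and the relookup of fvp and tvp is omitted.

```python
import threading


def lock_pair(first, second):
    """Acquire two locks without ever blocking while holding one of them.

    Hypothetical user-space model: `first` and `second` stand in for vnode
    locks. The relookup step mentioned in the commit is not modeled here.
    """
    while True:
        first.acquire()
        # Try the second lock without sleeping, so we never wait while
        # holding `first` (waiting at this point is what can deadlock).
        if second.acquire(blocking=False):
            return
        # Back off: drop what we hold, wait for the contended lock alone,
        # then release it and retry from the start.
        first.release()
        second.acquire()
        second.release()


a, b = threading.Lock(), threading.Lock()
lock_pair(a, b)
both_held = a.locked() and b.locked()
b.release()
a.release()
```

The design point is that a thread only ever sleeps on a lock while holding nothing else, so two renames taking the same pair in opposite order cannot wait on each other indefinitely.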
* Don't enforce LK_RETRY to get existing vnode in tmpfs_alloc_vp() | gleb | 2012-03-14 | 1 | -7/+11
  Doomed vnode is hardly of any use here, besides all callers handle error case. vfs_hash_get() does the same.
  Don't mess with vnode holdcount, vget() takes care of it already.
  Approved by: mdf (mentor)
* Use NULL instead of 0 | kevlo | 2012-03-13 | 4 | -7/+7
* Update comment. | kib | 2012-03-11 | 1 | -1/+1
  Submitted by: gianni
* Remove fifo.h. The only used function declaration from the header is | kib | 2012-03-11 | 7 | -45/+0
  migrated to sys/vnode.h.
  Submitted by: gianni
* Add support for ns timestamps and birthtime to the ext2/3 driver. | pfg | 2012-03-08 | 7 | -23/+74
  When using big inodes there is sufficient space in ext3 to keep extra resolution and birthtime (creation) timestamps. The appropriate fields in the on-disk inode have been approved for a long time but support for this in ext3 has not been widely distributed.
  In preparation for ext4 most linux distributions have enabled by default such bigger inodes and some people use nanosecond timestamps in ext3. We now support those when the inode is big enough and while we do recognize the EXT4F_ROCOMPAT_EXTRA_ISIZE, we maintain the extra timestamps even when they are not used.
  An additional note by Bruce Evans: We blindly accept unrepresentable tv_nsec in VOP_SETATTR(), but all file systems have always done that. When POSIX gets around to specifying the behaviour, it will probably require certain rounding to the fs's resolution and not rejecting the request. This unfortunately means that syscalls that set times can't really tell if they succeeded without reading back the times using stat() or similar and checking that they were set close enough.
  Reviewed by: bde
  Approved by: jhb (mentor)
  MFC after: 2 weeks
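The read-back check that Bruce Evans describes can be sketched from user space. This is a minimal Python illustration, assuming a POSIX-like system; `set_and_check_mtime` and the tolerance value are hypothetical choices for the example, not part of the commit.

```python
import os
import tempfile


def set_and_check_mtime(path, mtime_ns, tolerance_ns):
    """Set mtime, read it back, and report whether it landed close enough."""
    atime_ns = os.stat(path).st_atime_ns  # reuse the current access time
    os.utime(path, ns=(atime_ns, mtime_ns))
    stored_ns = os.stat(path).st_mtime_ns
    return abs(stored_ns - mtime_ns) <= tolerance_ns, stored_ns


with tempfile.NamedTemporaryFile() as tmp:
    # An arbitrary instant with a non-zero nanosecond part.
    wanted = 1_331_164_800_123_456_789
    ok, stored = set_and_check_mtime(tmp.name, wanted, tolerance_ns=2_000_000_000)
```

A file system with coarser resolution may store a rounded or truncated value, so `stored` can differ from `wanted` even when the call succeeds; the tolerance encodes what the caller considers "close enough".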
* Add KTR_VFS traces to track modifications to a vnode's writecount. | jhb | 2012-03-08 | 1 | -0/+4
* The pipe_poll() performs lockless access to the vnode to test | kib | 2012-03-07 | 2 | -14/+12
  fifo_iseof() condition, allowing the v_fifoinfo to be reset and freed by fifo_cleanup().
  Precalculate EOF at the places where fo_wgen is changed, and cache the state in a new pipe state flag PIPE_SAMEWGEN.
  Reported and tested by: bf
  Submitted by: gianni
  MFC after: 1 week (a backport)
* Apply inlined vn_vget_ino() algorithm for ".." lookup in pseudofs. | kib | 2012-03-05 | 1 | -5/+32
  Reported and tested by: pho
  MFC after: 2 weeks
* Remove unneeded cast to u_int. The values are small enough to fit into | kib | 2012-03-04 | 1 | -2/+1
  int, besides the use of the MIN macro, which performs type promotions.
  Submitted by: bde
  MFC after: 3 weeks
* Remove unnecessary casts | kevlo | 2012-03-04 | 1 | -2/+2
* Clean up style(9) nits | kevlo | 2012-03-04 | 3 | -10/+10
* The name caching changes of r230394 exposed an intermittent bug | rmacklem | 2012-03-03 | 1 | -0/+1
  in the new NFS server for NFSv4, where it would report ENOENT when the file actually existed on the server. This turned out to be caused by not initializing ni_topdir before calling lookup() and there was a rare case where the value on the stack location assigned to ni_topdir happened to be a pointer to a ".." entry, such that "dp == ndp->ni_topdir" succeeded in lookup(). This patch initializes ni_topdir to fix the problem.
  MFC after: 5 days
* Post r230394, the Lookup RPC counts for both NFS clients increased | rmacklem | 2012-03-03 | 2 | -22/+34
  significantly. Upon investigation this was caused by name cache misses for lookups of "..". For name cache entries for non-".." directories, the cache entry serves double duty. It maps both the named directory plus ".." for the parent of the directory. As such, two ctime values (one for each of the directory and its parent) need to be saved in the name cache entry. This patch adds an entry for ctime of the parent directory to the name cache. It also adds an additional uma zone for large entries with this time value, in order to minimize memory wastage. As well, it fixes a couple of cases where the mtime of the parent directory was being saved instead of ctime for positive name cache entries. With this patch, Lookup RPC counts return to values similar to pre-r230394 kernels.
  Reported by: bde
  Discussed with: kib
  Reviewed by: jhb
  MFC after: 2 weeks
* Similar to the fixes in 226967 and 226987, purge any name cache entries | jhb | 2012-03-02 | 1 | -0/+2
  associated with the previous vnode (if any) associated with the target of a rename(). Otherwise, a lookup of the target pathname concurrent with a rename() could re-add a name cache entry after the namei(RENAME) lookup in kern_renameat() had purged the target pathname.
  MFC after: 2 weeks
* Do not expose unlocked unconstructed nullfs vnode on mount list. | kib | 2012-03-02 | 1 | -1/+1
  Lock the native nullfs vnode lock before switching the locks.
  Tested by: pho
  MFC after: 1 week
* Fix the NFS clients so that they use copyin() instead of bcopy(), | rmacklem | 2012-03-01 | 1 | -1/+16
  when doing direct I/O. This direct I/O code is not enabled by default.
  Submitted by: kib (earlier version)
  Reviewed by: kib
  MFC after: 1 week
* Add "export" to devfs_opts[] and return EOPNOTSUPP if called with it. | mm | 2012-02-29 | 1 | -1/+4
  Fixes mountd warnings.
  Reported by: kib
  MFC after: 1 week
* Allow shared locks for reads when the lower filesystem accepts shared locking. | kib | 2012-02-29 | 1 | -1/+2
  Tested by: pho
  MFC after: 1 week
* Document that null_nodeget() cannot take shared-locked lowervp due to | kib | 2012-02-29 | 1 | -1/+5
  insmntque() requirements.
  Tested by: pho
  MFC after: 1 week
* In null_reclaim(), assert that reclaimed vnode is fully constructed, | kib | 2012-02-29 | 1 | -9/+12
  instead of accepting half-constructed vnode. Previous code could not decide what to do with such a vnode anyway, and although it processed the vnode for hash removal, it panicked later when getting rid of the nullfs reference on lowervp.
  While there, remove initializations from the declaration block.
  Tested by: pho
  MFC after: 1 week
* Always request exclusive lock for the lower vnode in nullfs_vget(). | kib | 2012-02-29 | 1 | -0/+6
  The null_nodeget() requires exclusive lock on lowervp to be able to insmntque() new vnode.
  Reported by: rea
  Tested by: pho
  MFC after: 1 week
* Move the code to destroy half-constructed nullfs vnode into helper | kib | 2012-02-29 | 1 | -6/+13
  function null_destroy_proto() from null_insmntque_dtr(). Also apply null_destroy_proto() in null_nodeget() when we raced and a vnode is found in the hash, so the currently allocated protonode shall be destroyed. Lock the vnode interlock around reassigning the v_vnlock.
  In fact, this path will not be exercised after several later commits, since null_nodeget() cannot take shared-locked lowervp at all due to insmntque() requirements.
  Reported by: rea
  Tested by: pho
  MFC after: 1 week
* Merge a split multi-line comment. | kib | 2012-02-29 | 1 | -4/+1
  MFC after: 1 week
* Add procfs to jail-mountable filesystems. | mm | 2012-02-29 | 2 | -3/+7
  Reviewed by: jamie
  MFC after: 1 week
* Remove an unused structure and unnecessary cast | kevlo | 2012-02-24 | 2 | -3/+1
* Check if the user has necessary permissions on the device | kevlo | 2012-02-24 | 1 | -6/+25
* To improve control over the use of mount(8) inside a jail(8), introduce | mm | 2012-02-23 | 2 | -15/+20
  a new jail parameter node with the following parameters:
  allow.mount.devfs: allow mounting the devfs filesystem inside a jail
  allow.mount.nullfs: allow mounting the nullfs filesystem inside a jail
  Both parameters are disabled by default (equals the behavior before devfs and nullfs in jails). Administrators have to explicitly allow mounting devfs and nullfs for each jail. The value "-1" of the devfs_ruleset parameter is removed in favor of the new allow setting.
  Reviewed by: jamie
  Suggested by: pjd
  MFC after: 2 weeks
* merge pipe and fifo implementations | kmacy | 2012-02-23 | 2 | -403/+76
  Also reviewed by: jhb, jilles (initial revision)
  Tested by: pho, jilles
  Submitted by: gianni
  Reviewed by: bde
* hrs@ reported a panic to freebsd-stable@ under the subject line | rmacklem | 2012-02-23 | 1 | -6/+5
  "panic in 8.3-PRERELEASE" on Feb. 22, 2012. This panic was caused by use of a mix of tsleep() and msleep() calls on the same event in the new NFS server DRC code. It did "mtx_unlock(); tsleep();" in two places, which kib@ noted introduced a slight risk that the wakeup() would occur before the tsleep(), resulting in a 10sec delay before waking up. This patch fixes the problem by replacing "mtx_unlock(); tsleep();" with mtx_sleep(..PDROP..). It also changes a nfsmsleep() call to mtx_sleep() so that the code uses mtx_sleep() consistently within the file.
  Tested by: hrs (in progress)
  Reviewed by: jhb
  MFC after: 5 days
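The race described here (a wakeup arriving between dropping the mutex and going to sleep) is the classic lost-wakeup problem. A minimal user-space analogy in Python, assuming `threading.Condition` as a stand-in for mtx_sleep(); the names `state`, `waiter`, and `waker` are hypothetical and this is not the NFS server code.

```python
import threading

lock = threading.Lock()
cond = threading.Condition(lock)  # wait() drops `lock` and sleeps in one step
state = {"done": False}


def waiter(result):
    with cond:
        # The predicate is checked under the lock, and wait() releases the
        # lock and blocks atomically, so a notify cannot slip in between.
        while not state["done"]:
            cond.wait(timeout=10.0)
        result.append(state["done"])


def waker():
    with cond:
        state["done"] = True
        cond.notify_all()


result = []
t = threading.Thread(target=waiter, args=(result,))
t.start()
waker()
t.join(timeout=5.0)
```

If the waiter instead released the lock first and then slept on a separate primitive, the waker could run in the gap and the waiter would sleep until its timeout, which matches the delay the commit message describes.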
* Use DOINGASYNC() to test for async allowance, to honor VFS syncing requests. | kib | 2012-02-22 | 3 | -7/+7
  Noted by: bde
  MFC after: 1 week
* Fix found places where uio_resid is truncated to int. | kib | 2012-02-21 | 9 | -19/+25
  Add the sysctl debug.iosize_max_clamp, enabled by default. Setting the sysctl to zero allows SSIZE_MAX-sized i/o requests from usermode.
  Discussed with: bde, das (previous versions)
  MFC after: 1 month
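To see why truncating a residual count to int is harmful, the arithmetic can be modeled directly. This is a hedged Python sketch: it assumes a 32-bit int and a 64-bit ssize_t, and `truncate_to_int` and `iosize_max` are hypothetical helpers that only mirror the behaviour described in the commit message, not the kernel's actual code.

```python
INT_MAX = 2**31 - 1
SSIZE_MAX = 2**63 - 1  # assumption: 64-bit ssize_t


def truncate_to_int(value):
    """Model storing a wide value in a 32-bit signed int (two's complement)."""
    return (value + 2**31) % 2**32 - 2**31


def iosize_max(clamp_enabled):
    """Largest accepted request size, modeled on the described sysctl."""
    return INT_MAX if clamp_enabled else SSIZE_MAX


big_request = 2**32 + 4096           # a request slightly over 4 GiB
truncated = truncate_to_int(big_request)  # only the low 32 bits survive
```

With these assumptions, a request of 4 GiB plus 4096 bytes would be seen as a 4096-byte request after truncation, and a request of exactly 2 GiB would become negative. Clamping requests to INT_MAX by default avoids exposing code paths that still carry such truncations.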
* Remove an unnecessary cast. | kevlo | 2012-02-20 | 1 | -1/+1
* Merge multi-FIB IPv6 support from projects/multi-fibv6/head/: | bz | 2012-02-17 | 2 | -4/+6
  Extend the so far IPv4-only support for multiple routing tables (FIBs) introduced in r178888 to IPv6 providing feature parity.
  This includes an extended rtalloc(9) KPI for IPv6, the necessary adjustments to the network stack, and user land support as in netstat.
  Sponsored by: Cisco Systems, Inc.
  Reviewed by: melifaro (basically)
  MFC after: 10 days
* Delete a couple of out of date comments that are no longer true in | rmacklem | 2012-02-16 | 2 | -8/+1
  the new NFS client.
  Requested by: bde
  MFC after: 1 week
* Replace PRIdMAX with "jd" in a printf call. Cast the corresponding value to | tijl | 2012-02-14 | 1 | -5/+2
  intmax_t instead of uintmax_t, because the original type is off_t.
* Merge si_name and __si_namebuf. | ed | 2012-02-10 | 1 | -1/+0
  The si_name pointer always points to the __si_namebuf member inside the same object. Remove it and rename __si_namebuf to si_name.
* Allow mounting nullfs(5) inside jails. | mm | 2012-02-09 | 1 | -1/+1
  This is now possible thanks to r230129.
  MFC after: 1 month
* Add support for mounting devfs inside jails. | mm | 2012-02-09 | 1 | -1/+13
  A new jail(8) option "devfs_ruleset" defines the ruleset enforcement for mounting devfs inside jails. A value of -1 disables mounting devfs in jails, a value of zero means no restrictions. Nested jails can only have mounting devfs disabled or inherit parent's enforcement as jails are not allowed to view or manipulate devfs(8) rules.
  Utilizes new functions introduced in r231265.
  Reviewed by: jamie
  MFC after: 1 month
* Introduce the "ruleset=number" option for devfs(5) mounts. | mm | 2012-02-09 | 3 | -1/+79
  Add support for updating the devfs mount (currently only changing the ruleset number is supported). Check mnt_optnew with vfs_filteropt(9).
  This new option sets the specified ruleset number as the active ruleset of the new devfs mount and applies all its rules at mount time. If the specified ruleset doesn't exist, a new empty ruleset is created.
  MFC after: 1 month
* Update the data structures with some fields reserved for | pfg | 2012-02-07 | 3 | -10/+29
  ext4 but that can be used in ext3 mode.
  Also adjust the internal inode to carry the birthtime, like in UFS, which is starting to get some use when big inodes are available.
  Right now these are just placeholders for features to come.
  Approved by: jhb (mentor)
  MFC after: 2 weeks
* r228827 fixed a problem where copying of NFSv4 open credentials into | rmacklem | 2012-02-07 | 1 | -2/+6
  a credential structure would corrupt it. This happened when the p argument was != NULL. However, I now realize that the copying of open credentials should only happen for p == NULL, since that indicates that it is a read-ahead or write-behind. This patch fixes this. After this commit, r228827 could be reverted, but I think the code is clearer and safer with the patch, so I am going to leave it in.
  Without this patch, it was possible that a NFSv4 VOP_SETATTR() could have changed the credentials of the caller. This would have happened if the process doing the VOP_SETATTR() did not have the file open, but some other process running as a different uid had the file open for writing at the same time.
  MFC after: 5 days
* Rename cache_lookup_times() to cache_lookup() and retire the old API and | jhb | 2012-02-06 | 3 | -3/+3
  ABI stub for cache_lookup().
* Current implementations of sync(2) and syncer vnode fsync() VOP uses | kib | 2012-02-06 | 2 | -4/+1
  mnt_noasync counter to temporarily remove MNTK_ASYNC mount option, which is needed to guarantee a synchronous completion of the initiated i/o before syscall or VOP return. Global removal of MNTK_ASYNC option is harmful because not only i/o started from corresponding thread becomes synchronous, but all i/o is synchronous on the filesystem which is initiated during sync(2) or syncer activity.
  Instead of removing MNTK_ASYNC from mnt_kern_flag, provide a local thread flag to disable async i/o for current thread only. Use the opportunity to move DOINGASYNC() macro into sys/vnode.h and consistently use it through places which tested for MNTK_ASYNC.
  Some testing demonstrated 60-70% improvements in run time for the metadata-intensive operations on async-mounted UFS volumes, but still with great deviation due to other reasons.
  Reviewed by: mckusick
  Tested by: scottl
  MFC after: 2 weeks
* When a "mount -u" switches an NFS mount point from TCP to UDP, | rmacklem | 2012-01-31 | 1 | -0/+13
  any thread doing an I/O RPC with a transfer size greater than NFS_UDPMAXDATA will be hung indefinitely, retrying the RPC. After a discussion on freebsd-fs@, I decided to add a warning message for this case, as suggested by Jeremy Chadwick.
  Suggested by: freebsd at jdc.parodius.com (Jeremy Chadwick)
  MFC after: 2 weeks
* A problem with respect to data read through the buffer cache for both | rmacklem | 2012-01-27 | 3 | -9/+7
  NFS clients was reported to freebsd-fs@ under the subject "NFS corruption in recent HEAD" on Nov. 26, 2011. This problem occurred when a TCP mounted root fs was changed to using UDP. I believe that this problem was caused by the change in mnt_stat.f_iosize that occurred because rsize was decreased to the maximum supported by UDP. This patch fixes the problem by using v_bufobj.bo_bsize instead of f_iosize, since the former is set to f_iosize when the vnode is allocated, but does not change for a given vnode when f_iosize changes.
  Reported by: pjd
  Reviewed by: kib
  MFC after: 2 weeks
* Revert r230516, since it doesn't really fix the problem. | rmacklem | 2012-01-26 | 1 | -17/+0
* Fix remaining calls to cache_enter() in both NFS clients to provide | kib | 2012-01-25 | 1 | -17/+18
  appropriate timestamps. Restore the assertions which verify that NCF_TS is set when timestamp is asked for.
  Reviewed by: jhb (previous version)
  MFC after: 2 weeks
* Add a timeout on positive name cache entries in the NFS client. That is, | jhb | 2012-01-25 | 3 | -11/+29
  we will only trust a positive name cache entry for a specified amount of time before falling back to a LOOKUP RPC, even if the ctime for the file handle matches the cached copy in the name cache entry. The timeout is configured via a new 'nametimeo' mount option and defaults to 60 seconds. It may be set to zero to disable positive name caching entirely.
  Reviewed by: rmacklem
  MFC after: 1 week
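The trust rule for positive name cache entries can be written as a small decision function. This is a minimal Python model, not the NFS client code: `CacheEntry` and `trust_positive_entry` are hypothetical names, and treating an entry as expired exactly at the timeout boundary is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    ctime: int        # ctime recorded when the entry was created
    cached_at: float  # time the entry was entered, in seconds


def trust_positive_entry(entry, current_ctime, now, nametimeo=60):
    """Return True if the cached name may be used without a LOOKUP RPC."""
    if nametimeo == 0:
        return False  # zero disables positive name caching entirely
    if now - entry.cached_at >= nametimeo:
        return False  # too old: fall back to a LOOKUP RPC
    # Within the timeout, the entry is still only trusted if ctime matches.
    return entry.ctime == current_ctime


entry = CacheEntry(ctime=100, cached_at=0.0)
fresh = trust_positive_entry(entry, current_ctime=100, now=30.0)
stale = trust_positive_entry(entry, current_ctime=100, now=61.0)
changed = trust_positive_entry(entry, current_ctime=101, now=30.0)
disabled = trust_positive_entry(entry, current_ctime=100, now=1.0, nametimeo=0)
```

The point of the timeout is visible in the `stale` case: the ctime still matches, yet the entry is no longer trusted once it is older than the configured limit.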