path: root/sys/fs
Commit message | Author | Age | Files | Lines
* Apply inlined vn_vget_ino() algorithm for ".." lookup in pseudofs. | kib | 2012-03-05 | 1 | -5/+32
  Reported and tested by: pho
  MFC after: 2 weeks
* Remove unneeded cast to u_int. The values are small enough to fit into | kib | 2012-03-04 | 1 | -2/+1
  int, besides the use of the MIN macro, which performs type promotions.
  Submitted by: bde
  MFC after: 3 weeks
* Remove unnecessary casts | kevlo | 2012-03-04 | 1 | -2/+2
* Clean up style(9) nits | kevlo | 2012-03-04 | 3 | -10/+10
* The name caching changes of r230394 exposed an intermittent bug | rmacklem | 2012-03-03 | 1 | -0/+1
  in the new NFS server for NFSv4, where it would report ENOENT when the file actually existed on the server. This turned out to be caused by not initializing ni_topdir before calling lookup() and there was a rare case where the value on the stack location assigned to ni_topdir happened to be a pointer to a ".." entry, such that "dp == ndp->ni_topdir" succeeded in lookup().
  This patch initializes ni_topdir to fix the problem.
  MFC after: 5 days
* Post r230394, the Lookup RPC counts for both NFS clients increased | rmacklem | 2012-03-03 | 2 | -22/+34
  significantly. Upon investigation this was caused by name cache misses for lookups of "..". For name cache entries for non-".." directories, the cache entry serves double duty. It maps both the named directory and ".." for the parent of the directory. As such, two ctime values (one for each of the directory and its parent) need to be saved in the name cache entry.
  This patch adds an entry for ctime of the parent directory to the name cache. It also adds an additional uma zone for large entries with this time value, in order to minimize memory wastage. As well, it fixes a couple of cases where the mtime of the parent directory was being saved instead of ctime for positive name cache entries. With this patch, Lookup RPC counts return to values similar to pre-r230394 kernels.
  Reported by: bde
  Discussed with: kib
  Reviewed by: jhb
  MFC after: 2 weeks
* Similar to the fixes in 226967 and 226987, purge any name cache entries | jhb | 2012-03-02 | 1 | -0/+2
  associated with the previous vnode (if any) associated with the target of a rename(). Otherwise, a lookup of the target pathname concurrent with a rename() could re-add a name cache entry after the namei(RENAME) lookup in kern_renameat() had purged the target pathname.
  MFC after: 2 weeks
* Do not expose unlocked unconstructed nullfs vnode on mount list. | kib | 2012-03-02 | 1 | -1/+1
  Lock the native nullfs vnode lock before switching the locks.
  Tested by: pho
  MFC after: 1 week
* Fix the NFS clients so that they use copyin() instead of bcopy(), | rmacklem | 2012-03-01 | 1 | -1/+16
  when doing direct I/O. This direct I/O code is not enabled by default.
  Submitted by: kib (earlier version)
  Reviewed by: kib
  MFC after: 1 week
* Add "export" to devfs_opts[] and return EOPNOTSUPP if called with it. | mm | 2012-02-29 | 1 | -1/+4
  Fixes mountd warnings.
  Reported by: kib
  MFC after: 1 week
* Allow shared locks for reads when the lower filesystem accepts shared locking. | kib | 2012-02-29 | 1 | -1/+2
  Tested by: pho
  MFC after: 1 week
* Document that null_nodeget() cannot take shared-locked lowervp due to | kib | 2012-02-29 | 1 | -1/+5
  insmntque() requirements.
  Tested by: pho
  MFC after: 1 week
* In null_reclaim(), assert that reclaimed vnode is fully constructed, | kib | 2012-02-29 | 1 | -9/+12
  instead of accepting half-constructed vnode. Previous code could not decide what to do with such a vnode anyway, and although it processed the vnode for hash removal, it panicked later when getting rid of the nullfs reference on lowervp.
  While there, remove initializations from the declaration block.
  Tested by: pho
  MFC after: 1 week
* Always request exclusive lock for the lower vnode in nullfs_vget(). | kib | 2012-02-29 | 1 | -0/+6
  null_nodeget() requires an exclusive lock on lowervp to be able to insmntque() the new vnode.
  Reported by: rea
  Tested by: pho
  MFC after: 1 week
* Move the code to destroy half-constructed nullfs vnode into helper | kib | 2012-02-29 | 1 | -6/+13
  function null_destroy_proto() from null_insmntque_dtr(). Also apply null_destroy_proto() in null_nodeget() when we raced and a vnode is found in the hash, so the currently allocated protonode shall be destroyed. Lock the vnode interlock around reassigning the v_vnlock.
  In fact, this path will not be exercised after several later commits, since null_nodeget() cannot take shared-locked lowervp at all due to insmntque() requirements.
  Reported by: rea
  Tested by: pho
  MFC after: 1 week
* Merge a split multi-line comment. | kib | 2012-02-29 | 1 | -4/+1
  MFC after: 1 week
* Add procfs to jail-mountable filesystems. | mm | 2012-02-29 | 2 | -3/+7
  Reviewed by: jamie
  MFC after: 1 week
* Remove an unused structure and unnecessary cast | kevlo | 2012-02-24 | 2 | -3/+1
* Check if the user has necessary permissions on the device | kevlo | 2012-02-24 | 1 | -6/+25
* To improve control over the use of mount(8) inside a jail(8), introduce | mm | 2012-02-23 | 2 | -15/+20
  a new jail parameter node with the following parameters:
    allow.mount.devfs: allow mounting the devfs filesystem inside a jail
    allow.mount.nullfs: allow mounting the nullfs filesystem inside a jail
  Both parameters are disabled by default (equals the behavior before devfs and nullfs in jails). Administrators have to explicitly allow mounting devfs and nullfs for each jail. The value "-1" of the devfs_ruleset parameter is removed in favor of the new allow setting.
  Reviewed by: jamie
  Suggested by: pjd
  MFC after: 2 weeks
* merge pipe and fifo implementations | kmacy | 2012-02-23 | 2 | -403/+76
  Also reviewed by: jhb, jilles (initial revision)
  Tested by: pho, jilles
  Submitted by: gianni
  Reviewed by: bde
* hrs@ reported a panic to freebsd-stable@ under the subject line | rmacklem | 2012-02-23 | 1 | -6/+5
  "panic in 8.3-PRERELEASE" on Feb. 22, 2012. This panic was caused by use of a mix of tsleep() and msleep() calls on the same event in the new NFS server DRC code. It did "mtx_unlock(); tsleep();" in two places, which kib@ noted introduced a slight risk that the wakeup() would occur before the tsleep(), resulting in a 10sec delay before waking up.
  This patch fixes the problem by replacing "mtx_unlock(); tsleep();" with mtx_sleep(..PDROP..). It also changes a nfsmsleep() call to mtx_sleep() so that the code uses mtx_sleep() consistently within the file.
  Tested by: hrs (in progress)
  Reviewed by: jhb
  MFC after: 5 days
* Use DOINGASYNC() to test for async allowance, to honor VFS syncing requests. | kib | 2012-02-22 | 3 | -7/+7
  Noted by: bde
  MFC after: 1 week
* Fix found places where uio_resid is truncated to int. | kib | 2012-02-21 | 9 | -19/+25
  Add the sysctl debug.iosize_max_clamp, enabled by default. Setting the sysctl to zero allows SSIZE_MAX-sized I/O requests from usermode.
  Discussed with: bde, das (previous versions)
  MFC after: 1 month
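The truncation described in this entry can be illustrated with a minimal user-space sketch. It is not the kernel change itself: resid_t, resid_as_int() and clamp_resid() are hypothetical names chosen for illustration, and the only assumption is that a residual count is kept in a type wider than int.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for a wide residual byte count such as uio_resid. */
typedef int64_t resid_t;

/*
 * Problem pattern: the residual is narrowed to int before use.
 * For values above INT_MAX the result is implementation-defined.
 */
static int
resid_as_int(resid_t resid)
{
	return ((int)resid);
}

/*
 * Safer pattern: compare in the wide type and clamp to an explicit limit,
 * so a large request is shortened instead of silently wrapped.
 */
static resid_t
clamp_resid(resid_t resid, resid_t limit)
{
	assert(resid >= 0 && limit >= 0);
	return (resid < limit ? resid : limit);
}
```

With a limit of INT_MAX this models the clamped behavior; a larger limit models the unclamped case the sysctl is said to allow.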
* Remove an unnecessary cast. | kevlo | 2012-02-20 | 1 | -1/+1
* Merge multi-FIB IPv6 support from projects/multi-fibv6/head/: | bz | 2012-02-17 | 2 | -4/+6
  Extend the so far IPv4-only support for multiple routing tables (FIBs) introduced in r178888 to IPv6 providing feature parity. This includes an extended rtalloc(9) KPI for IPv6, the necessary adjustments to the network stack, and user land support as in netstat.
  Sponsored by: Cisco Systems, Inc.
  Reviewed by: melifaro (basically)
  MFC after: 10 days
* Delete a couple of out of date comments that are no longer true in | rmacklem | 2012-02-16 | 2 | -8/+1
  the new NFS client.
  Requested by: bde
  MFC after: 1 week
* Replace PRIdMAX with "jd" in a printf call. Cast the corresponding value to | tijl | 2012-02-14 | 1 | -5/+2
  intmax_t instead of uintmax_t, because the original type is off_t.
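A minimal sketch of the formatting pattern this entry describes, assuming a POSIX system where off_t is a signed type. The helper name format_offset() is hypothetical and does not appear in the commit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Format a file offset as decimal text.
 * off_t is signed, so the value is cast to intmax_t (not uintmax_t)
 * and printed with the "j" length modifier.
 * Returns the value reported by snprintf().
 */
static int
format_offset(char *buf, size_t len, off_t off)
{
	assert(buf != NULL && len > 0);
	return (snprintf(buf, len, "%jd", (intmax_t)off));
}
```

Casting to uintmax_t and printing with a signed conversion would mismatch the argument type for negative offsets, which is the reason given for the intmax_t cast.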
* Merge si_name and __si_namebuf. | ed | 2012-02-10 | 1 | -1/+0
  The si_name pointer always points to the __si_namebuf member inside the same object. Remove it and rename __si_namebuf to si_name.
* Allow mounting nullfs(5) inside jails. | mm | 2012-02-09 | 1 | -1/+1
  This is now possible thanks to r230129.
  MFC after: 1 month
* Add support for mounting devfs inside jails. | mm | 2012-02-09 | 1 | -1/+13
  A new jail(8) option "devfs_ruleset" defines the ruleset enforcement for mounting devfs inside jails. A value of -1 disables mounting devfs in jails, a value of zero means no restrictions. Nested jails can only have mounting devfs disabled or inherit parent's enforcement as jails are not allowed to view or manipulate devfs(8) rules.
  Utilizes new functions introduced in r231265.
  Reviewed by: jamie
  MFC after: 1 month
* Introduce the "ruleset=number" option for devfs(5) mounts. | mm | 2012-02-09 | 3 | -1/+79
  Add support for updating the devfs mount (currently only changing the ruleset number is supported). Check mnt_optnew with vfs_filteropt(9).
  This new option sets the specified ruleset number as the active ruleset of the new devfs mount and applies all its rules at mount time. If the specified ruleset doesn't exist, a new empty ruleset is created.
  MFC after: 1 month
* Update the data structures with some fields reserved for | pfg | 2012-02-07 | 3 | -10/+29
  ext4 but that can be used in ext3 mode. Also adjust the internal inode to carry the birthtime, like in UFS, which is starting to get some use when big inodes are available.
  Right now these are just placeholders for features to come.
  Approved by: jhb (mentor)
  MFC after: 2 weeks
* r228827 fixed a problem where copying of NFSv4 open credentials into | rmacklem | 2012-02-07 | 1 | -2/+6
  a credential structure would corrupt it. This happened when the p argument was != NULL. However, I now realize that the copying of open credentials should only happen for p == NULL, since that indicates that it is a read-ahead or write-behind. This patch fixes this. After this commit, r228827 could be reverted, but I think the code is clearer and safer with the patch, so I am going to leave it in.
  Without this patch, it was possible that a NFSv4 VOP_SETATTR() could have changed the credentials of the caller. This would have happened if the process doing the VOP_SETATTR() did not have the file open, but some other process running as a different uid had the file open for writing at the same time.
  MFC after: 5 days
* Rename cache_lookup_times() to cache_lookup() and retire the old API and | jhb | 2012-02-06 | 3 | -3/+3
  ABI stub for cache_lookup().
* Current implementations of sync(2) and syncer vnode fsync() VOP use | kib | 2012-02-06 | 2 | -4/+1
  mnt_noasync counter to temporarily remove MNTK_ASYNC mount option, which is needed to guarantee a synchronous completion of the initiated i/o before syscall or VOP return. Global removal of MNTK_ASYNC option is harmful because not only i/o started from corresponding thread becomes synchronous, but all i/o is synchronous on the filesystem which is initiated during sync(2) or syncer activity.
  Instead of removing MNTK_ASYNC from mnt_kern_flag, provide a local thread flag to disable async i/o for current thread only. Use the opportunity to move DOINGASYNC() macro into sys/vnode.h and consistently use it throughout the places which tested for MNTK_ASYNC.
  Some testing demonstrated 60-70% improvements in run time for the metadata-intensive operations on async-mounted UFS volumes, but still with great deviation due to other reasons.
  Reviewed by: mckusick
  Tested by: scottl
  MFC after: 2 weeks
* When a "mount -u" switches an NFS mount point from TCP to UDP, | rmacklem | 2012-01-31 | 1 | -0/+13
  any thread doing an I/O RPC with a transfer size greater than NFS_UDPMAXDATA will be hung indefinitely, retrying the RPC. After a discussion on freebsd-fs@, I decided to add a warning message for this case, as suggested by Jeremy Chadwick.
  Suggested by: freebsd at jdc.parodius.com (Jeremy Chadwick)
  MFC after: 2 weeks
* A problem with respect to data read through the buffer cache for both | rmacklem | 2012-01-27 | 3 | -9/+7
  NFS clients was reported to freebsd-fs@ under the subject "NFS corruption in recent HEAD" on Nov. 26, 2011. This problem occurred when a TCP mounted root fs was changed to using UDP. I believe that this problem was caused by the change in mnt_stat.f_iosize that occurred because rsize was decreased to the maximum supported by UDP. This patch fixes the problem by using v_bufobj.bo_bsize instead of f_iosize, since bo_bsize is set to f_iosize when the vnode is allocated, but does not change for a given vnode when f_iosize changes.
  Reported by: pjd
  Reviewed by: kib
  MFC after: 2 weeks
* Revert r230516, since it doesn't really fix the problem. | rmacklem | 2012-01-26 | 1 | -17/+0
* Fix remaining calls to cache_enter() in both NFS clients to provide | kib | 2012-01-25 | 1 | -17/+18
  appropriate timestamps. Restore the assertions which verify that NCF_TS is set when timestamp is asked for.
  Reviewed by: jhb (previous version)
  MFC after: 2 weeks
* Add a timeout on positive name cache entries in the NFS client. That is, | jhb | 2012-01-25 | 3 | -11/+29
  we will only trust a positive name cache entry for a specified amount of time before falling back to a LOOKUP RPC, even if the ctime for the file handle matches the cached copy in the name cache entry. The timeout is configured via a new 'nametimeo' mount option and defaults to 60 seconds. It may be set to zero to disable positive name caching entirely.
  Reviewed by: rmacklem
  MFC after: 1 week
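The trust rule described in this entry can be sketched as a small predicate. This is an illustrative model, not the NFS client code: struct demo_ncentry, demo_entry_trusted() and the exact boundary comparison are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical positive name cache entry. */
struct demo_ncentry {
	time_t added;		/* when the entry was cached */
	time_t cached_ctime;	/* ctime recorded with the entry */
};

/*
 * Decide whether a cached positive entry may be used without a LOOKUP RPC.
 * nametimeo models the mount option: 0 disables positive name caching,
 * otherwise it is the maximum entry age in seconds.
 */
static bool
demo_entry_trusted(const struct demo_ncentry *e, time_t now,
    time_t current_ctime, time_t nametimeo)
{
	assert(e != NULL);
	if (nametimeo == 0)
		return (false);		/* caching disabled */
	if (now - e->added >= nametimeo)
		return (false);		/* too old, even if ctime matches */
	return (e->cached_ctime == current_ctime);
}
```

The age check comes first so that a matching ctime alone is never enough once the timeout has passed, which is the behavior the entry describes.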
* If a mount -u is done to either NFS client that switches it | rmacklem | 2012-01-25 | 1 | -0/+17
  from TCP to UDP and the rsize/wsize/readdirsize is greater than NFS_MAXDGRAMDATA, it is possible for a thread doing an I/O RPC to get stuck repeatedly doing retries. This happens because the RPC will use an rsize/wsize/readdirsize that won't work for UDP and, as such, it will keep failing indefinitely. This patch returns an error for this case, to avoid the problem.
  A discussion on freebsd-fs@ seemed to indicate that returning an error was preferable to silently ignoring the "udp"/"mntudp" option. This problem was discovered while investigating a problem reported by pjd@ via email.
  MFC after: 2 weeks
* Close a race in NFS lookup processing that could result in stale name cache | jhb | 2012-01-20 | 3 | -54/+41
  entries on one client when a directory was renamed on another client. The root cause for the stale entry being trusted is that each per-vnode nfsnode structure has a single 'n_ctime' timestamp used to validate positive name cache entries. However, if there are multiple entries for a single vnode, they all share a single timestamp. To fix this, extend the name cache to allow filesystems to optionally store a timestamp value in each name cache entry. The NFS clients now fetch the timestamp associated with each name cache entry and use that to validate cache hits instead of the timestamps previously stored in the nfsnode.
  Another part of the fix is that the NFS clients now use timestamps from the post-op attributes of RPCs when adding name cache entries rather than pulling the timestamps out of the file's attribute cache. The latter is subject to races with other lookups updating the attribute cache concurrently.
  Some more details:
  - Add a variant of nfsm_postop_attr() to the old NFS client that can return a vattr structure with a copy of the post-op attributes.
  - Handle lookups of "." as a special case in the NFS clients since the name cache does not store name cache entries for ".", so we cannot get a useful timestamp. It didn't really make much sense to recheck the attributes on the directory to validate the namecache hit for "." anyway.
  - ABI compat shims for the name cache routines are present in this commit so that it is safe to MFC.
  MFC after: 2 weeks
* Martin Cracauer reported a problem to freebsd-current@ under the | rmacklem | 2012-01-20 | 1 | -18/+31
  subject "Data corruption over NFS in -current". During investigation of this, I came across an ugly bogusity in the new NFS client where it replaced the cr_uid with the one used for the mount. This was done so that "system operations" like the NFSv4 Renew would be performed as the user that did the mount. However, if any other thread shares the credential with the one doing this operation, it could do an RPC (or just about anything else) as the wrong cr_uid.
  This patch fixes the above, by using the mount credentials instead of the one provided as an argument for this case. It appears to have fixed Martin's problem. This patch is needed for NFSv4 mounts and NFSv3 mounts against some non-FreeBSD servers that do not put post operation attributes in the NFSv3 Statfs RPC reply.
  Tested by: Martin Cracauer (cracauer at cons.org)
  Reviewed by: jhb
  MFC after: 2 weeks
* Subject: NULLFS: properly destroy node hash | rea | 2012-01-18 | 1 | -1/+1
  Use hashdestroy() instead of naive free().
  Approved by: kib
  MFC after: 2 weeks
* Return EOPNOTSUPP since we only support update mounts for NFS export. | kevlo | 2012-01-17 | 1 | -0/+4
  Spotted by: trociny
* Make sure all intermediate variables holding mount flags (mnt_flag) | mckusick | 2012-01-17 | 11 | -11/+11
  and all internal kernel calls passing mount flags are declared as uint64_t so that flags in the top 32 bits are not lost.
  MFC after: 2 weeks
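A minimal sketch of why a narrower intermediate variable loses flags in the upper half of a 64-bit flag word. The flag values and the function names are hypothetical; they are not the kernel's mount flags.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flags: one in the low 32 bits, one in the high 32 bits. */
#define DEMO_FLAG_LOW   (UINT64_C(1) << 3)
#define DEMO_FLAG_HIGH  (UINT64_C(1) << 40)

/*
 * Problem pattern: flags pass through a 32-bit intermediate.
 * Unsigned narrowing keeps only the low 32 bits, so the high flag is lost.
 */
static uint64_t
pass_flags_narrow(uint64_t flags)
{
	uint32_t tmp = (uint32_t)flags;

	/* A 32-bit value can never carry the high flag. */
	assert(((uint64_t)tmp & DEMO_FLAG_HIGH) == 0);
	return (tmp);
}

/* Fixed pattern: the intermediate has the same width as the flag word. */
static uint64_t
pass_flags_wide(uint64_t flags)
{
	uint64_t tmp = flags;

	return (tmp);
}
```

Passing `DEMO_FLAG_LOW | DEMO_FLAG_HIGH` through the narrow helper returns only the low flag, while the wide helper returns both.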
* Add nfs export support to tmpfs(5) | kevlo | 2012-01-16 | 1 | -4/+2
  Reviewed by: kib
* When tmpfs_write() resets an extended file to its original size after an | alc | 2012-01-16 | 3 | -7/+12
  error, we want tmpfs_reg_resize() to ignore I/O errors and unconditionally update the file's size.
  Reviewed by: kib
  MFC after: 3 weeks
* Abrogate nchr argument in proc_getargv() and proc_getenvv(): we always want | trociny | 2012-01-15 | 1 | -1/+1
  to read strings completely to know the actual size.
  As a side effect it fixes the issue with kern.proc.args and kern.proc.env sysctls, which didn't return the size of available data when calling sysctl(3) with the NULL argument for oldp.
  Note, in get_ps_strings(), which does actual work for proc_getargv() and proc_getenvv(), we still have a safety limit on the size of data read in case of a corrupted process stack.
  Suggested by: kib
  MFC after: 3 days