path: root/sys/fs
Commit message | Author | Age | Files | Lines
* After nullfs rmdir operation, reclaim the directory vnode which waskib2016-02-171-0/+9
| | | | | | | | | unlinked. Otherwise the vnode stays cached, causing leak. This is similar to r292961 for regular files. Reported and tested by: pho (previous version) Sponsored by: The FreeBSD Foundation MFC after: 1 week
* ext2fs: Remove panics for rename() race conditions.pfg2016-02-141-5/+8
| | | | | | | | | | | | Sync with r84642 from UFS: The panics are inappropriate because the IN_RENAME flag only fixes a few of the huge number of race conditions that can result in the source path becoming invalid even prior to the VOP_RENAME() call. Found accidentally while checking an issue from PVS Static Analysis. MFC after: 3 days
* cd9660: More "check for NULL" cleanups.pfg2016-02-121-10/+8
| | | | | | | | Cleanup some checks for NULL. Most of these were always unnecessary and starting with r294954 brelse() doesn't need any NULL checks at all. For now keep the checks somewhat consistent with NetBSD in case we want to merge the cleanups to older versions.
* Clear the cookie pointer on error in tmpfs_readdir().markj2016-02-121-1/+4
| | | | | | | | | | | | It is otherwise left dangling, and callers that request cookies always free the cookie buffer, even when VOP_READDIR(9) returns an error. This results in a double free if tmpfs_readdir() returns an error to the NFS server or the Linux getdents(2) emulation code. Reported by: pho MFC after: 1 week Security: double free of malloc(9)-backed memory Sponsored by: EMC / Isilon Storage Division
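The ownership rule behind this fix can be illustrated with a small userland sketch. It is not the tmpfs code: `fill_cookies` and `read_and_release` are hypothetical names, and the standard `malloc`/`free` stand in for the kernel's malloc(9). It only shows why a callee that frees its output buffer on error should also clear the caller-visible pointer when the caller frees unconditionally.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for a readdir-style routine that hands a cookie
 * array back through *cookiesp. The caller always frees *cookiesp.
 */
static int
fill_cookies(int fail, unsigned long **cookiesp, int *ncookiesp)
{
	unsigned long *cookies;

	cookies = malloc(4 * sizeof(*cookies));
	if (cookies == NULL)
		return (ENOMEM);
	*cookiesp = cookies;
	*ncookiesp = 4;

	if (fail) {
		/* Error path: release the buffer and clear the out pointer. */
		free(cookies);
		*cookiesp = NULL;	/* otherwise the caller frees it again */
		*ncookiesp = 0;
		return (EIO);
	}
	return (0);
}

/* Caller pattern: free unconditionally; free(NULL) is a no-op. */
static int
read_and_release(int fail)
{
	unsigned long *cookies = NULL;
	int error, ncookies = 0;

	error = fill_cookies(fail, &cookies, &ncookies);
	free(cookies);
	return (error);
}
```

If the `*cookiesp = NULL` assignment were dropped, `read_and_release(1)` would pass an already freed pointer to `free()`, which is the double free the commit message describes.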
* Ext4: Use boolean type instead of '0' and '1'pfg2016-02-112-8/+8
| | | | | | There are precedents of uses of bool in the kernel and it is incorrect style to use integers as replacement for a boolean type.
* Ext4: fix handling of files with sparse blocks before extent's index.pfg2016-02-112-15/+44
| | | | | | | | | | | | | | | | This is ongoing work from Damjan Jovanovic to improve ext4 read support with sparse files: Keep track of the first and last block in each extent as it descends down the extent tree, thus being able to work out that some blocks are sparse earlier. This solves an issue on r293680. In ext4_bmapext() start supporting the runb parameter, which appears to be the number of adjacent blocks prior to the block being converted in the same way that runp is the number of blocks after, speeding up random access to mmaped files. PR: 206652
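As a rough illustration of the runb/runp idea, the sketch below works on a flat array of extents rather than the on-disk ext4 extent tree. `struct extent_sketch` and `lookup_block` are hypothetical names, and the real ext4_bmapext() may bound the run lengths in ways not shown here. The point is only that knowing the first and last block of the covering extent yields both run counts, and that a block covered by no extent is sparse.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flattened extent: `len` blocks starting at logical block `first`. */
struct extent_sketch {
	uint32_t first;
	uint32_t len;
};

/*
 * Look up logical block lbn. Returns 1 and stores in *runb / *runp the
 * number of mapped blocks before / after lbn inside the same extent.
 * Returns 0 if lbn falls in a hole, i.e. the block is sparse.
 */
static int
lookup_block(const struct extent_sketch *ex, size_t n, uint32_t lbn,
    uint32_t *runb, uint32_t *runp)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (lbn >= ex[i].first && lbn - ex[i].first < ex[i].len) {
			*runb = lbn - ex[i].first;
			*runp = ex[i].len - 1 - (lbn - ex[i].first);
			return (1);
		}
	}
	return (0);
}
```

A caller that gets 0 back would treat the block as a hole and supply zeroes instead of issuing a disk read.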
* Revert r295359:pfg2016-02-071-1/+1
| | | | | | | | | | CID 1018688 is a false positive. The initialization is done by calling vn_start_write(... &mp, flags). mp is only an output parameter unless (flags & V_MNTREF), and fdesc doesn't put V_MNTREF in flags. Pointed out by: bde
* msdosfs_rename: yet another unused value.pfg2016-02-071-3/+0
| | | | | | | As with r295355, it seems to be left over from a cleanup in r33548. The code is not in NetBSD either. Thanks to bde for checking out the history.
* cd9660: Drop an unnecessary check for NULL.pfg2016-02-071-2/+1
| | | | | | | This was unnecessary and also confused Coverity. Confirmed on: NetBSD CID: 978558
* fdesc_setattr: uninitialized pointer readpfg2016-02-071-1/+1
| | | | CID: 1018688
* msdosfs_rename: Unused valuepfg2016-02-061-1/+0
| | | | | | The value assigned to pmp is immediately overwritten before it can be used. CID: 1304892
* Revert r294695:pfg2016-02-031-5/+7
| | | | | | | | | | ext2fs: passthrough any extra timestamps to the dinode struct. While it passed the classic testing, the change appears to have caused some regression and still requires some more precautions. PR: 206820 MFC after: 3 days
* ext2fs: passthrough any extra timestamps to the dinode struct.pfg2016-01-241-7/+5
| | | | | | | | | | | | | In general we don't trust any of the extended timestamps unless the EXT2F_ROCOMPAT_EXTRA_ISIZE feature is set. However, in the case where we freshly allocated a new inode the information is valid and it is better to pass it along instead of leaving the value undefined. This should have no practical effect but should reduce the amount of garbage if EXT2F_ROCOMPAT_EXTRA_ISIZE is set, like in cases where the filesystem is converted from ext3 to ext4. MFC after: 4 days
* ext2: rename some directory index constants.pfg2016-01-241-1/+1
| | | | | | Missed from r294653. Pointyhat: me
* Fix comment.pfg2016-01-241-1/+1
|
* Rename some directory index constants.pfg2016-01-244-7/+7
| | | | | | Directory index was introduced in ext3. We don't always use the prefix to denote the ext2 variant they belong to but when we do we should try to be accurate.
* ext2: Initialize i_flag after allocation.pfg2016-01-241-0/+1
| | | | | | | | | | | | | | We use i_flag to carry some flags like IN_E4INDEX which newer ext2fs variants use internally. fsck.ext3 rightfully complains after our implementation tags non-directory inodes with INDEX_FL. Initializing i_flag during allocation removes the noise factor and quiets down fsck. Patch from: Damjan Jovanovic PR: 206530
* When devfs dirent is freed, a vnode might still keep a pointer to it,kib2016-01-221-0/+7
| | | | | | | | apparently. Interlock and clear the pointer to avoid free memory dereference. Submitted by: bde (previous version) MFC after: 3 weeks
* ext2fs: Bring back the htree dir_index implementation.pfg2016-01-217-115/+1492
| | | | | | | | | | | | | | | | | | The htree dir_index is perhaps one of the most characteristic features of the linux ext3 implementation. It was removed in r281670, due to repeated bug reports. Damjan Jovanovic detected and fixed three bugs and did some stress testing by building Apache OpenOffice on top of it so it is now in good shape to bring back. Differential Revision: https://reviews.freebsd.org/D5007 Submitted by: Damjan Jovanovic Reviewed by: pfg Tested by: pho Relnotes: Yes MFC after: 2 months (only 10.x)
* Assert that the linkage between struct cdev_privdata and structkib2016-01-171-0/+2
| | | | | | | | file is consistent. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
* [PR 206224] bv_cnt is sometimes examined without holding the bufobj lockrpokala2016-01-171-0/+5
| | | | | | | | | | | Add locking around access to bv_cnt which is currently being done unlocked PR: 206224 Reviewed by: imp Approved by: jhb MFC after: 1 week Sponsored by: Panasas, Inc. Differential Revision: https://reviews.freebsd.org/D4931
* Unbreak NOIP builds after r294084.bz2016-01-151-1/+2
|
* Make nfscl_getmyip() use new routing KPI.melifaro2016-01-153-55/+52
| | | | | | | | * Use standard IPv6 SAS instead of rt->rt_ifa address. * Make address lookup work for IPv6 LLA. * Save address into buffer provided by caller instead of using static vars. Discussed with: rmacklem
* Make devfs_fpdrop() static. It was not a public KPI, and it has nokib2016-01-131-1/+1
| | | | | | | reason to remain exported for some time. Sponsored by: The FreeBSD Foundation MFC after: 1 week
* ext4: mount panic from freeing invalid pointerspfg2016-01-111-1/+1
| | | | | | | | | | Initialize the struct with those fields to zeroes on allocation, preventing the panic. Patch by: Damjan Jovanovic. PR: 206056 MFC after: 3 days
* ext4: add support for reading sparse filespfg2016-01-114-34/+80
| | | | | | | | | | Add support for sparse files in ext4. Also implement read-ahead, which greatly increases the performance when transferring files from ext4. Both features implemented by Damjan Jovanovic. PR: 205816 MFC after: 1 week
* Change the type of newsize argument in the smbfs_smb_setfsize() functionae2016-01-113-6/+8
| | | | | | | | | | | | | | from int to int64. MSDN says that SMB_SET_FILE_END_OF_FILE_INFO uses signed 64-bit integer to specify offset, but since smbfs_smb_setfsize() has used plain int, a value was truncated in case when offset was larger than 2G. https://msdn.microsoft.com/en-us/library/ff469975.aspx In particular, now `truncate -s 10G` will work correctly on the mounted SMB share. Reported and tested by: Eugene Grosbein <eugen at grosbein dot net> MFC after: 1 week
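The truncation is easy to check with a small sketch, assuming a platform where `int` is 32 bits wide (true of the common FreeBSD targets, but an assumption here). `fits_in_int` and `ten_gib` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Returns 1 if a 64-bit file size can be carried in a plain int without loss. */
static int
fits_in_int(int64_t newsize)
{
	return (newsize >= INT_MIN && newsize <= INT_MAX);
}

/* 10 GiB, the size requested by `truncate -s 10G`. */
static const int64_t ten_gib = INT64_C(10) * 1024 * 1024 * 1024;
```

With a 32-bit `int`, `fits_in_int(ten_gib)` is false, so an `int` parameter cannot carry the new size; widening the parameter to a 64-bit type avoids the loss.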
* ext2fs: reading mmaped file in Ext4 causes panicpfg2016-01-071-6/+13
| | | | | | | | | Always call brelse(path.ep_bp), fixing reading EXT4 files using mmap(). Patch by Damjan Jovanovic. PR: 205938 MFC after: 1 week
* Hide transient EBADF errors caused by the parallel revoke(2) or forcedkib2016-01-021-3/+3
| | | | | | | | | | | | | unmount of devfs mounts, by restarting the failed syscall. When restarted, failing syscalls eventually either stop finding the node and return ENOENT, or the vnode op vectors finally transition to the deadfs vops. The latter return EIO or another error more appropriate for the operation. Submitted by: bde Tested by: pho MFC after: 3 weeks
* Minor style cleanup.kib2016-01-011-1/+1
| | | | | Submitted by: bde MFC after: 1 week
* Force nullfs vnode reclaim after unlinking, to potentially unlinkkib2015-12-301-3/+5
| | | | | | | | | | lower vnode. Otherwise, reference to the lower vnode from the upper one prevents final unlink. PR: 178238 Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week
* ext2: recognize ext4 INCOMPAT_RECOVER flagpfg2015-12-291-0/+3
| | | | | | | | | This is a flag specific for journalling in ext4. Add it to the list of ext4 features we ignore for read-only purposes. PR: 205668 MFC after: 1 week
* Make it possible for the cdevsw d_close() driver method to detect lastkib2015-12-221-3/+9
| | | | | | | | | | | | | | | | close and close due to revoke(2)-like operation. A new FLASTCLOSE flag indicates that this is last close. FREVOKE is set for revokes, and FNONBLOCK is also set, same as is already done for VOP_CLOSE() call from vgonel(). The flags reuse user open(2) flags which are never stored in f_flag, to not consume bit space in the ABI visible way. Assert this with the static check. Requested and reviewed by: bde Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
* Keep devfs mount locked for the whole duration of the devfs_setattr(),kib2015-12-221-7/+14
| | | | | | | | and ensure that our dirent is instantiated. Reported and tested by: bde Sponsored by: The FreeBSD Foundation MFC after: 1 week
* Make CUSE usable with platforms where the size of "unsigned long" ishselasky2015-12-222-6/+6
| | | | different from the size of a pointer.
* Make CUSE usable with platforms where the size of "unsigned long" ishselasky2015-12-222-3/+3
| | | | different from the size of a pointer.
* Guard against the same process being both CUSE server and client athselasky2015-12-221-2/+13
| | | | | the same time. This can easily lead to a deadlock when destroying the character devices nodes.
* Fix breakage caused by r292373 in ZFS/FUSE/NFS/SMBFS.glebius2015-12-163-43/+30
| | | | | | | | | | With the new VOP_GETPAGES() KPI the "count" argument counts pages already, and doesn't need to be translated from bytes to pages. While here make it consistent that *rbehind and *rahead are updated only if we do not return an error. Pointy hat to: glebius
* A change to KPI of vm_pager_get_pages() and underlying VOP_GETPAGES().glebius2015-12-164-65/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | o With new KPI consumers can request contiguous ranges of pages, and unlike before, all pages will be kept busied on return, like it was done before with the 'reqpage' only. Now the reqpage goes away. With new interface it is easier to implement code protected from race conditions. Such arrayed requests for now should be preceded by a call to vm_pager_haspage() to make sure that request is possible. This could be improved later, making vm_pager_haspage() obsolete. Strengthening the promises on the busy state of the array of pages allows us to remove such hacks as swp_pager_free_nrpage() and vm_pager_free_nonreq(). o New KPI accepts two integer pointers that may optionally point at values for read ahead and read behind, that a pager may do, if it can. These pages are completely owned by pager, and not controlled by the caller. This shifts the UFS-specific readahead logic from vm_fault.c, which should be file system agnostic, into vnode_pager.c. It also removes one VOP_BMAP() request per hard fault. Discussed with: kib, alc, jeff, scottl Sponsored by: Nginx, Inc. Sponsored by: Netflix
* The cdevpriv_dtr_t typedef was not able to be used in a function prototypejhb2015-12-021-1/+1
| | | | | | | | | | | | | | | like the various d_*_t typedefs since it declared a function pointer rather than a function. Add a new d_priv_dtor_t typedef that declares the function and can be used as a function prototype. The previous typedef wasn't useful outside of the cdevpriv implementation, so retire it. The name d_priv_dtor_t was chosen to be more consistent with cdev methods since it is commonly used in place of d_close_t even though it is not a direct pointer in struct cdevsw. Reviewed by: kib, imp MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D4340
* Fix the memory leak that occurs when the nfscommon.ko module is unloaded.rmacklem2015-12-023-0/+52
| | | | | | | | This leak was introduced by r291527. Since the nfscommon.ko module is rarely unloaded, this leak would not have been much of an issue. MFC after: 2 weeks
* Delete the TUNABLE_INT() line. It was in r291527 so that it could bermacklem2015-11-301-1/+0
| | | | MFC'd to stable/10 and still work.
* Add kernel support to the NFS server for the "-manage-gids"rmacklem2015-11-306-220/+547
| | | | | | | | | | | | | | | | | | | option that will be added to the nfsuserd daemon in a future commit. It modifies the cache used by NFSv4 for name<-->id translation (both username/uid and group/gid) to support this. When "-manage-gids" is set, the server looks up each uid for the RPC and uses the list of groups cached in the server instead of the list of groups provided in the RPC request. The cached group list is acquired for the cache by the nfsuserd daemon via getgrouplist(3). This avoids the 16 groups limit for the list in the RPC request. Since the cache is now used for every RPC when "-manage-gids" is enabled, the code also modifies the cache to use a separate mutex for each hash list instead of a single global mutex. Suggested by: jpaetzel Tested by: jpaetzel MFC after: 2 weeks
* For performance reasons, it is useful to have a single string used asmckusick2015-11-293-3/+11
| | | | | | | | | | | | the name of a filesystem when setting it as the first parameter to the getnewvnode() function. Most filesystems call getnewvnode from just one place so can use a literal string as the first parameter. However, NFS calls getnewvnode from two places, so we create a global constant string that can be used by the two instances. This change also collapses two instances of getnewvnode() in the UFS filesystem to a single call. Reviewed by: kib Tested by: Peter Holm
* When the nfsd threads are terminated, the NFSv4 server statermacklem2015-11-213-7/+50
| | | | | | | | | | | | | | | | (opens, locks, etc) is retained, which I believe is correct behaviour. However, for NFSv4.1, the server also retained a reference to the xprt (RPC transport socket structure) for the backchannel. This caused svcpool_destroy() to not call SVC_DESTROY() for the xprt and allowed a socket upcall to occur after the mutexes in the svcpool were destroyed, causing a crash. This patch fixes the code so that the backchannel xprt structure is dereferenced just before svcpool_destroy() is called, so the code does do an SVC_DESTROY() on the xprt, which shuts down the socket upcall. Tested by: g_amanakis@yahoo.com PR: 204340 MFC after: 2 weeks
* Revert r283330 since it broke directory caching in the client.rmacklem2015-11-211-0/+38
| | | | | | | | | | At this time I cannot see a way to fix directory caching when it has partial blocks in the buffer cache, due to the fact that the syscall's uio_offset won't stay the same as the lblkno * NFS_DIRBLKSIZ offset. Reported by: bde MFC after: 2 weeks
* mnt_stat.f_iosize (which is used to set bo_bsize) must be set tormacklem2015-11-171-1/+3
| | | | | | | | | | | | | the largest size of buffer cache block or the mapping of the buffer is bogus. When a mount with rsize=4096,wsize=4096 was done, f_iosize would be set to 4096. This resulted in corrupted directory data, since the buffer cache block size for directories is NFS_DIRBLKSIZ (8192). This patch fixes the code so that it always sets f_iosize to at least NFS_DIRBLKSIZ. Tested by: krichy@cflinux.hu PR: 177971 MFC after: 2 weeks
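The clamping rule can be sketched in a few lines. This is an illustration, not the NFS client code: `compute_iosize` and `DIRBLKSIZ_SKETCH` are hypothetical names, the value 8192 is taken from the commit message, and the use of the larger of rsize and wsize as the starting point is an assumption.

```c
#include <assert.h>

#define DIRBLKSIZ_SKETCH 8192	/* NFS_DIRBLKSIZ, per the commit message */

static int
imax_sketch(int a, int b)
{
	return (a > b ? a : b);
}

/* Pick an I/O size that is never smaller than a directory block. */
static int
compute_iosize(int rsize, int wsize)
{
	int iosize;

	iosize = imax_sketch(rsize, wsize);	/* assumed starting point */
	return (imax_sketch(iosize, DIRBLKSIZ_SKETCH));
}
```

For the mount described above, `compute_iosize(4096, 4096)` yields 8192, so a directory block fits in one buffer cache block.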
* - Consistently use PROC_ASSERT_HELD() to verify that a process' hold countmarkj2015-11-082-2/+2
| | | | | | | | | is non-zero. - Include the process address in the PROC_ASSERT_HELD() and PROC_ASSERT_NOT_HELD() assertion messages so that the corresponding process can be found easily when debugging. MFC after: 1 week
* Ensure that when a blockable open of fifo returns success, a validkib2015-09-201-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | file descriptor opened for complementary access exists as well. The implementation of the guarantee is done by counting the generations of readers and writers opens. We return success and not EINTR or ERESTART error, when the sleep for complementary opening is interrupted, but the generation was changed during the sleep. Longer explanation: assume there are two threads, A doing open("fifo", O_RDONLY) and B doing open("fifo", O_WRONLY), and no other threads either trying to open the fifo, nor are there any file descriptors referencing the fifo. Before the change, it was possible e.g. for thread A to return a valid file descriptor, while thread B returned EINTR if a signal to B was delivered simultaneously with the wakeup from A. After the change, in this situation both A::open() and B::open() succeed and the signal is made "as if" it was noticed slightly later. Note that the signal actual delivery is not changed, it is done by ast on syscall return path, so signal handler is still executed before first instruction after syscall. See PR for the code demonstrating the issue. PR: 203162 Reported by: Victor Stinner victor.stinner@gmail.com Reviewed by: jilles Tested by: bapt, pho Sponsored by: The FreeBSD Foundation MFC after: 1 week
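The decision at the heart of this change can be sketched as a pure function. This is not the fifo code: `open_wait_result` and its parameters are hypothetical, and only EINTR is modelled because ERESTART is not normally visible to userland programs. The generation counts are assumed to be sampled before and after the interruptible sleep.

```c
#include <assert.h>
#include <errno.h>

/*
 * Decide what a blocking fifo open should return after its sleep.
 * sleep_error: result of the interruptible sleep (0 or EINTR here).
 * gen_before / gen_after: generation count of peer opens, sampled
 * before and after the sleep.
 */
static int
open_wait_result(int sleep_error, unsigned gen_before, unsigned gen_after)
{
	/* The peer opened while we slept: report success despite the signal. */
	if (sleep_error != 0 && gen_before != gen_after)
		return (0);
	return (sleep_error);
}
```

In the A/B scenario above, thread B sees a changed generation even though its sleep was interrupted, so its open succeeds and the signal is handled on the syscall return path.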
* Fix an NFS server bug that manifested in "ls -al" displaying a plustrasz2015-08-281-0/+2
| | | | | | | | | sign on every directory exported via NFSv4 with NFSv4 ACLs enabled. Reviewed by: rmacklem@ MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D3502