path: root/sys/fs
Commit message (Collapse)AuthorAgeFilesLines
* Add some basic definitions for a future htree implementation.pfg2012-08-242-1/+4
| | | | MFC after: 3 days
* Fix typokevlo2012-08-181-1/+1
|
* Remove unused member of struct indir (in_exists) from UFS and EXT2 code.mjg2012-08-172-4/+0
| | | | | | Reviewed by: mckusick Approved by: trasz (mentor) MFC after: 1 week
* Streamline use of cdevpriv and correct some corner cases.hselasky2012-08-151-0/+3
| | | | | | | | | | | | | | | | | | | | 1) It is not useful to call "devfs_clear_cdevpriv()" from "d_close" callbacks, since for example read, write, ioctl and so on might be sleeping at the time "d_close" is called, and the freed private data can then still be accessed. Examples: dtrace, linux_compat, ksyms (all fixed by this patch) 2) In sys/dev/drm* there are some cases in which memory will be freed twice if open fails: first by code in the open routine, then by the cdevpriv destructor. Move registration of the cdevpriv to the end of the drm open routines. 3) devfs_clear_cdevpriv() is not called if the "d_open" callback registered cdevpriv data and the "d_open" callback function returned an error. Fix this. Discussed with: phk MFC after: 2 weeks
* Do not leave invalid pages in the object after the short read for akib2012-08-143-6/+10
| | | | | | | | | | | | | | network file systems (not only NFS proper). Short reads cause pages other than the requested one, which were not filled by the read response, to stay invalid. Change the vm_page_readahead_finish() interface to not take the error code, but instead to make a decision to free or to (de)activate the page only by its validity. As a result, invalid pages that were not requested are freed even if the read RPC indicated success. Noted and reviewed by: alc MFC after: 1 week
* After the PHYS_TO_VM_PAGE() function was de-inlined, the main reasonkib2012-08-055-0/+5
| | | | | | | | | | | | | to pull vm_param.h was removed. The other big dependency of vm_page.h on vm_param.h is the PA_LOCK* definitions, which are only needed for in-kernel code, because modules use KBI-safe functions to lock the pages. Stop including vm_param.h into vm_page.h. Include vm_param.h explicitly for the kernel code which needs it. Suggested and reviewed by: alc MFC after: 2 weeks
* Reduce code duplication and exposure of direct access to structkib2012-08-043-90/+6
| | | | | | | | | vm_page oflags by providing helper function vm_page_readahead_finish(), which handles completed reads for pages with indexes other than the requested one, for VOP_GETPAGES(). Reviewed by: alc MFC after: 1 week
* The header uma_int.h is internal uma header, unused by this sourcekib2012-08-041-1/+0
| | | | | | | file. Do not include it needlessly. Reviewed by: alc MFC after: 1 week
* I am comparing current pipe code with the one in 8.3-STABLE r236165,davidxu2012-07-311-17/+4
| | | | | | | | | | | | | | | | | | | | | | | | I found that 8.3 is a historical BSD version that uses sockets to implement FIFO pipes; it compares a per-file seqcount with the writer generation stored in the per-pipe object. The concept is that after all writers are gone, the pipe enters the next generation, and all old readers that have not closed the pipe should get an indication that the pipe is disconnected: they should get EPIPE, SIGPIPE, or POLLHUP in poll(). But a newcomer should not know that previous writers were gone; it should treat the pipe as a fresh session. I am trying to bring the FIFO pipe back to its historical behavior. It is still unclear whether a single EOF flag can represent both SBS_CANTSENDMORE and SBS_CANTRCVMORE, which the socket-based version uses, but I have run the poll regression test in the tools directory, and the output is now the same as on 8.3-STABLE. I think the output "not ok 18 FIFO state 6b: poll result 0 expected 1. expected POLLHUP; got 0" might be bogus, because a newcomer should not know that old writers were gone. I got the same behavior on Linux. Our implementation always returns POLLIN for a disconnected pipe even when it should return POLLHUP, but I think it is not wise to remove POLLIN, for compatibility reasons: this is our historical behavior. Regression test: /usr/src/tools/regression/poll
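The generation idea described in this commit message can be sketched in userland C. This is only a simplified model under stated assumptions: the `example_*` names and both structures are hypothetical and do not come from the kernel pipe code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the per-pipe state; not the kernel structure. */
struct example_fifo {
	unsigned int wgen;	/* writer generation */
	unsigned int writers;	/* writers currently attached */
};

/* Hypothetical model of the per-file state kept by each reader. */
struct example_reader {
	unsigned int seq;	/* writer generation seen at open time */
};

void
example_writer_open(struct example_fifo *f)
{
	f->writers++;
}

/* When the last writer leaves, the pipe enters the next generation. */
void
example_writer_close(struct example_fifo *f)
{
	assert(f->writers > 0);
	if (--f->writers == 0)
		f->wgen++;
}

/* A reader records the current generation when it opens the FIFO. */
void
example_reader_open(const struct example_fifo *f, struct example_reader *r)
{
	r->seq = f->wgen;
}

/* Only readers from an older generation see the pipe as disconnected. */
bool
example_reader_disconnected(const struct example_fifo *f,
    const struct example_reader *r)
{
	return (r->seq != f->wgen);
}
```

In this model a reader that opened the FIFO before the last writer left reports a disconnect, while a reader that opens afterwards starts in the new generation and does not.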
* When a thread is blocked in direct write state, it only sets PIPE_DIRECTWdavidxu2012-07-311-2/+8
| | | | | | | | | | | | | | | | | | | | | flag but not PIPE_WANTW, and the FIFO pipe code does not understand this internal state. When a FIFO peer reader closes the pipe and wants to notify the writer, it checks PIPE_WANTW and, if it is not set, skips calling wakeup(), so the blocked writer never notices; in general, the writer should return from the syscall with the EPIPE error code and may get a SIGPIPE signal. Setting PIPE_WANTW fixes the problem; turning off direct write should fix it too. This bug was found via PR/170203. Another bug in the FIFO pipe code is that when a peer closes the pipe, the other end, if blocked in select() or poll(), is not notified, because pipeselwakeup() is not called. A third problem was found by the poll regression test: the existing code cannot pass tests 6b, 6c and 6d, but FreeBSD-4 does. This commit does not fix that problem; I still need to study more to find the cause. PR: 170203 Tested by: Garrett Copper < yanegomi at gmail dot com >
* Use NULL instead of 0 for pointerskevlo2012-07-225-9/+9
|
* Simplify error handling by moving the allocation of np down to where it isbrueffer2012-07-161-8/+6
| | | | | | | actually used. While here, improve style a little. Submitted by: mjg MFC after: 2 weeks
* Save a bzero() by using M_ZERO.brueffer2012-07-151-2/+1
| | | | | Obtained from: Dragonfly BSD (change 4faaf07c3d7ddd120deed007370aaf4d90b72ebb) MFC after: 2 weeks
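As a rough userland analogy only: in the kernel, malloc(9) accepts the M_ZERO flag to return zeroed memory, and the closest standard C counterpart is calloc(). The sketch below uses a hypothetical `example_node` structure and is not the code touched by this commit.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical structure used only for this sketch. */
struct example_node {
	int id;
	void *data;
};

/*
 * Allocate a zeroed node in one call. calloc() hands back zero-filled
 * memory, so no separate memset()/bzero() pass is needed afterwards.
 */
struct example_node *
example_node_alloc(void)
{
	return (calloc(1, sizeof(struct example_node)));
}

/* Self-check: a freshly allocated node has every field zeroed. */
void
example_node_selfcheck(void)
{
	struct example_node *n = example_node_alloc();

	assert(n != NULL);
	assert(n->id == 0);
	assert(n->data == NULL);
	free(n);
}
```

The point of the pattern is that the allocation and the zeroing cannot drift apart when the structure or its size changes.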
* Remove a check on MNTK_UPDATE that is not really necessary as it isattilio2012-07-101-39/+16
| | | | handled in a code snippet above.
* - Remove the unused and not completed write support for NTFS.attilio2012-07-104-306/+39
| | | | | | | - Fix a bug where vfs_mountedfrom() is called also when the filesystem is not mounted successfully. Tested by: pho
* Fix a typokevlo2012-07-031-1/+1
|
* Extend the KPI to lock and unlock f_offset member of struct file. Itkib2012-07-021-11/+4
| | | | | | | | | | | | | | | | | | now fully encapsulates all accesses to f_offset, and extends f_offset locking to other consumers that need it, in particular, to lseek() and variants of getdirentries(). Ensure that on 32bit architectures f_offset, which is 64bit quantity, always read and written under the mtxpool protection. This fixes apparently easy to trigger race when parallel lseek()s or lseek() and read/write could destroy file offset. The already broken ABI emulations, including iBCS and SysV, are not converted (yet). Tested by: pho No objections from: jhb MFC after: 3 weeks
* Do not override an error from uiomove() with (non-)error result fromkib2012-07-021-3/+6
| | | | | | | | | | bwrite(). VFS needs to know about EFAULT from uiomove() and does not care much that the writeback of a partially filled block after EFAULT was successful. An early return without error causes a short write to be reported to usermode. Reported and tested by: andreast MFC after: 3 weeks
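The error-preservation rule can be sketched as a tiny helper. This is a minimal illustration, not the committed change: `example_combine_errors` is a hypothetical name, and letting the copy error win when both steps fail is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>

/*
 * Combine the result of a copy step with the result of a later
 * write-back step. An error from the copy (for example EFAULT) must
 * survive even when the write-back itself succeeds.
 */
int
example_combine_errors(int copy_error, int writeback_error)
{
	/* errno-style codes are zero (success) or positive. */
	assert(copy_error >= 0 && writeback_error >= 0);

	if (copy_error != 0)
		return (copy_error);	/* never overwritten by a later 0 */
	return (writeback_error);
}
```

With this shape, a caller that sees EFAULT from the copy still reports EFAULT after a successful write-back, instead of reporting success for a partially filled block.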
* Enable deadlock avoidance code for NFS client.kib2012-06-212-3/+4
| | | | MFC after: 2 weeks
* Fix the NFSv4 client for the case where mmap'd files arermacklem2012-06-181-5/+15
| | | | | | | | | | | | | | | | written, but not msync'd by a process. A VOP_PUTPAGES() called when VOP_RECLAIM() happens will usually fail, since the NFSv4 Open has already been closed by VOP_INACTIVE(). Add a vm_object_page_clean() call to the NFSv4 client's VOP_INACTIVE(), so that the write happens before the NFSv4 Open is closed. kib@ suggested using vgone() instead and I will explore this, but this patch fixes things in the meantime. For some reason, the VOP_PUTPAGES() is still attempted in VOP_RECLAIM(), but having this fail doesn't cause any problems except a "stateid0 in write" being logged. Reviewed by: kib MFC after: 1 week
* Move the nfsrpc_close() call in ncl_reclaim() for the NFSv4 clientrmacklem2012-06-171-9/+9
| | | | | | | | to below the vnode_destroy_vobject() call, since that is where writes are flushed. Suggested by: kib MFC after: 1 week
* Improve handling of uiomove(9) errors for the NFS client.kib2012-06-061-8/+48
| | | | | | | | | | | | | | | | | | | | | Do not brelse() the buffer unconditionally with BIO_ERROR set if uiomove() failed. The brelse() treats most buffers with BIO_ERROR as B_INVAL, dropping their content. Instead, if the write request covered the whole buffer, remember the cached state and brelse() with BIO_ERROR set only if the buffer was not cached previously. Update the buffer dirtyoff/dirtyend based on the progress recorded by uiomove() in passed struct uio, even in the presence of error. Otherwise, usermode could see changed data in the backed pages, but later the buffer is destroyed without write-back. If uiomove() failed for IO_UNIT request, try to truncate the vnode back to the pre-write state, and rewind the progress in passed uio accordingly, following the FFS behaviour. Reviewed by: rmacklem (some time ago) Tested by: pho MFC after: 1 month
* Capitalize start of sentence.kib2012-05-301-1/+1
| | | | MFC after: 3 days
* Catch a corner case where ssegs could be 0 and thus i would be 0 andmarcel2012-05-281-8/+7
| | | | | | we index suinfo out of bounds (i.e. -1). Approved by: gber
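The class of bug described here, indexing the "last" element when the count is zero, can be sketched in isolation. The helper below is hypothetical and unrelated to the actual suinfo code; it only shows the guard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the last of `count` values, or `fallback` when there are none.
 * Without the count check, vals[count - 1] would index outside the
 * array whenever count is 0.
 */
uint64_t
example_last_or(const uint64_t *vals, size_t count, uint64_t fallback)
{
	assert(vals != NULL || count == 0);

	if (count == 0)
		return (fallback);
	return (vals[count - 1]);
}
```

Checking the count before computing `count - 1` keeps the empty case explicit rather than relying on callers never passing zero.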
* Fix style and consistency:ed2012-05-271-158/+158
| | | | | | | - Use tabs, not spaces. - Add tab after #define. - Don't mix the use of BSD and ISO C unsigned integer types. Prefer the ISO C ones.
* Use C99-style initialization for struct dirent in preparation forgleb2012-05-252-8/+29
| | | | | | changing the structure. Sponsored by: Google Summer of Code 2011
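Why C99-style initialization helps before a structure change can be shown with a small sketch. The structure below is a hypothetical stand-in (`example_dirent`), not the real struct dirent, and the field values are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a directory entry; not the real struct dirent. */
struct example_dirent {
	uint32_t d_fileno;	/* inode number */
	uint16_t d_reclen;	/* record length */
	uint8_t d_type;		/* file type */
	uint8_t d_namlen;	/* length of d_name */
	char d_name[8];		/* entry name */
};

/*
 * Designated initializers name each member, so the initialization stays
 * correct if members are later reordered, resized, or added. Members
 * that are not named are zero-initialized.
 */
static const struct example_dirent dot_entry = {
	.d_fileno = 2,
	.d_reclen = sizeof(struct example_dirent),
	.d_type = 4,	/* illustrative value for "directory" */
	.d_namlen = 1,
	.d_name = "."
};

/* Return the entry by value so callers can inspect it. */
struct example_dirent
example_dot_entry(void)
{
	/* Sanity check: the name, plus its terminator, fits in d_name. */
	assert(dot_entry.d_namlen < sizeof(dot_entry.d_name));
	return (dot_entry);
}
```

A positional initializer such as `{ 2, 16, 4, 1, "." }` would silently assign values to the wrong members if a field were inserted or moved, which is the risk this style avoids.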
* Revert devfs part of r235911. I was unaware about old but unfinishedmav2012-05-241-45/+0
| | | | discussion between kib@ and gibbs@ about it.
* MFprojects/zfsd:mav2012-05-241-0/+45
| | | | | | | | | | | Revamp the CAM enclosure services driver. This updated driver uses an in-kernel daemon to track state changes and publishes physical path location information for disk elements into the CAM device database. Sponsored by: Spectra Logic Corporation Sponsored by: iXsystems, Inc. Submitted by: gibbs, will, mav
* A problem with the NFSv4 server was reported by Andrew Leonardrmacklem2012-05-171-3/+1
| | | | | | | | | | | | to freebsd-fs@, where the setfacl of an NFSv4 acl would fail. This was caused by the VOP_ACLCHECK() call for ZFS replying EOPNOTSUPP. After discussion with rwatson@, it was determined that a call to VOP_ACLCHECK() before doing VOP_SETACL() is not required. This patch fixes the problem by deleting the VOP_ACLCHECK() call. Tested by: Andrew Leonard (previous version) MFC after: 1 week
* Import work done under project/nand (@235533) into head.gber2012-05-1719-0/+11832
| | | | | | | | | | | | | | The NAND Flash environment consists of several distinct components: - NAND framework (drivers harness for NAND controllers and NAND chips) - NAND simulator (NANDsim) - NAND file system (NAND FS) - Companion tools and utilities - Documentation (manual pages) This work is still experimental. Please use with caution. Obtained from: Semihalf Supported by: FreeBSD Foundation, Juniper Networks
* Fix a couple of issues that appear to be inherited from the oldpfg2012-05-162-12/+52
| | | | | | | | | | | | | | | 8.x code: - If the lock cannot be acquired immediately unlocks 'bar' vnode and then locks both vnodes in order. - wrong vnode type panics from cache_enter_time after calls by ext2_lookup. The fix merges the fixes from ufs/ufs_lookup.c. Submitted by: Mateusz Guzik Approved by: jhb@ (mentor) Reviewed by: kib@ MFC after: 1 week
* Skip directory entries with zero inode number during traversal.gleb2012-05-161-1/+1
| | | | | | | | Entries with zero inode number are considered placeholders by libc and UFS. Fix remaining uses of VOP_READDIR in kernel: vop_stdvptocnp, unionfs. Sponsored by: Google Summer of Code 2011
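The traversal rule from this commit, treating entries with a zero inode number as placeholders, can be sketched as follows. The `example_entry` type and `example_count_live` function are hypothetical and much simpler than a real VOP_READDIR consumer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified directory entry used only for this sketch. */
struct example_entry {
	uint32_t fileno;	/* inode number; 0 marks a placeholder */
	const char *name;
};

/*
 * Count the live entries in a directory listing, skipping placeholders
 * whose inode number is zero.
 */
size_t
example_count_live(const struct example_entry *ents, size_t n)
{
	size_t i, live = 0;

	assert(ents != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (ents[i].fileno == 0)
			continue;	/* placeholder, not a real entry */
		live++;
	}
	return (live);
}
```

Any consumer that walks directory entries would apply the same `fileno == 0` test before acting on an entry.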
* Fix two cases in the new NFS server where a tsleep() isrmacklem2012-05-121-6/+9
| | | | | | | | | | | | | used, when the code should actually protect the tested variable with a mutex. Since the tsleep()s had a 10sec timeout, the race would have only delayed the allocation of a new clientid for a client. The sleeps will also rarely occur in practice, since having a callback in progress when a fresh clientid is being acquired by a client is unlikely. MFC after: 1 month
* PR# 165923 reported intermittent write failures for dirtyrmacklem2012-05-124-1/+25
| | | | | | | | | | | | | | | | | | | memory mapped pages being written back on an NFS mount. Since any thread can call VOP_PUTPAGES() to write back a dirty page, the credentials of that thread may not have write access to the file on an NFS server. (Often the uid is 0, which may be mapped to "nobody" in the NFS server.) Although there is no completely correct fix for this (NFS servers check access on every write RPC instead of at open/mmap time), this patch avoids the common cases by holding onto a credential that recently opened the file for writing and uses that credential for the write RPCs being done by VOP_PUTPAGES() for both NFS clients. Tested by: Joel Ray Holveck (joelh at juniper.net) PR: kern/165923 Reviewed by: kib MFC after: 2 weeks
* Fix mount interlock oversights from the previous change in r234386.pluknet2012-05-101-2/+0
| | | | | | | Reported by: dougb Submitted by: Mateusz Guzik <mjguzik at gmail com> Reviewed by: Kirk McKusick Tested by: pho
* Use the common api helper routine instead of freeing the nameijwd2012-05-081-2/+1
| | | | | | | buffer directly. Approved by: rmacklem (mentor) MFC after: 1 month
* fixed a unionfs_readdir math issuedaichi2012-05-031-1/+1
| | | | | PR: 132987 Submitted by: Matthew Fleming <mfleming@isilon.com>
* - fixed a vnode lock hang-up issue.daichi2012-05-013-115/+213
| | | | | | | | | | | - fixed an incorrect lock status issue. - fixed an incorrect lock issue of unionfs root vnode removed. (pointed out by keith) - fixed an infinite loop issue. (pointed out by dumbbell) - changed to do LK_RELEASE expressly when unlocked. Submitted by: ozawa@ongs.co.jp
* It was reported via email that some non-FreeBSD NFS serversrmacklem2012-04-271-1/+4
| | | | | | | | | | | | do not include file attributes in the reply to an NFS create RPC under certain circumstances. This resulted in a vnode of type VNON that was not usable. This patch adds an NFS getattr RPC to nfs_create() for this case, to fix the problem. It was tested by the person that reported the problem and confirmed to fix this case for their server. Tested by: Steven Haber (steven.haber at isilon.com) MFC after: 2 weeks
* Fix a leak of namei lookup path buffers that occurs when armacklem2012-04-271-0/+4
| | | | | | | | | | | ZFS volume is exported via the new NFS server. The leak occurred because the new NFS server code didn't handle the case where a file system sets the SAVENAME flag in its VOP_LOOKUP() and ZFS does this for the DELETE case. Tested by: Oliver Brandmueller (ob at gruft.de), hrs PR: kern/167266 MFC after: 1 month
* Remove unused thread argument to vrecycle().trasz2012-04-238-12/+8
| | | | Reviewed by: kib
* Remove unused thread argument from vtruncbuf().trasz2012-04-237-15/+13
| | | | Reviewed by: kib
* This change creates a new list of active vnodes associated withmckusick2012-04-202-1/+3
| | | | | | | | | | | | | | | | | | | | a mount point. Active vnodes are those with a non-zero use or hold count, e.g., those vnodes that are not on the free list. Note that this list is in addition to the list of all the vnodes associated with a mount point. To avoid adding another set of linkage pointers to the vnode structure, the active list uses the existing linkage pointers used by the free list (previously named v_freelist, now renamed v_actfreelist). This update adds the MNT_VNODE_FOREACH_ACTIVE interface that loops over just the active vnodes associated with a mount point (typically less than 1% of the vnodes associated with the mount point). Reviewed by: kib Tested by: Peter Holm MFC after: 2 weeks
* Return EOPNOTSUPP rather than EPERM for the SF_SNAPSHOT flag becausejh2012-04-181-4/+1
| | | | | | tmpfs doesn't support snapshots. Suggested by: bde
* Replace the MNT_VNODE_FOREACH interface with MNT_VNODE_FOREACH_ALL.mckusick2012-04-175-56/+13
| | | | | | | | | | | | | | | | | | | | | The primary changes are that the user of the interface no longer needs to manage the mount-mutex locking and that the vnode that is returned has its mutex locked (thus avoiding the need to check to see if it is DOOMED or other possible end of life scenarios). To minimize compatibility issues for third-party developers, the old MNT_VNODE_FOREACH interface will remain available so that this change can be MFC'ed to 9. Following the MFC to 9, MNT_VNODE_FOREACH will be removed in head. The reason for this update is to prepare for the addition of the MNT_VNODE_FOREACH_ACTIVE interface that will loop over just the active vnodes associated with a mount point (typically less than 1% of the vnodes associated with the mount point). Reviewed by: kib Tested by: Peter Holm MFC after: 2 weeks
* Sync tmpfs_chflags() with the recent changes to UFS:jh2012-04-161-13/+13
| | | | | | - Add a check for unsupported file flags. - Return EPERM when a user without PRIV_VFS_SYSFLAGS privilege attempts to toggle SF_SETTABLE flags.
* tmpfs: Allow update mounts only for certain options.jh2012-04-162-6/+15
| | | | | | | | Since r230208 update mounts were allowed if the list of mount options contained the "export" option. This is not correct as tmpfs doesn't really support updating all options. Reviewed by: kevlo, trociny
* Provide better description for vfs.tmpfs.memory_reserved sysctl.gleb2012-04-151-1/+2
| | | | Suggested by: Anton Yuzhaninov <citrin@citrin.ru>
* Apply changes from r234103 to ext2fs:jh2012-04-131-8/+4
| | | | | | | | | | | | Return EPERM from ext2_setattr() when a user without PRIV_VFS_SYSFLAGS privilege attempts to toggle SF_SETTABLE flags. Flags are now stored to ip->i_flags in one place after all checks. Also, remove SF_NOUNLINK from the checks because ext2fs doesn't support that flag. Reviewed by: bde
* Restore the blank line incorrectly removed in r234104.jh2012-04-111-0/+1
| | | | Pointed out by: bde