summaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAgeFilesLines
* NFS: O_DIRECT async IO may lose contextTrond Myklebust2006-03-201-18/+12
| | | | | | | | | | | | The struct nfs_direct_req currently keeps a pointer to the file descriptor without referencing it. This may cause problems if the parent process is killed. The nfs_open_context should normally have all the information that we're currently using the filp for, and unlike fput(), is safe to release from an rpciod process context. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs: Use UNSTABLE + COMMIT for NFS O_DIRECT writesTrond Myklebust2006-03-201-25/+199
| | | | | | | | | | | | | | | | | | Currently NFS O_DIRECT writes use FILE_SYNC so that a COMMIT is not necessary. This simplifies the internal logic, but this could be a difficult workload for some servers. Instead, let's send UNSTABLE writes, and after they all complete, send a COMMIT for the dirty range. After the COMMIT returns successfully, then do the wake_up or fire off aio_complete(). Test plan: Async direct I/O tests against Solaris (or any server that requires committed unstable writes). Reboot server during test. Based on an earlier patch by Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Make nfs_commit_alloc() externTrond Myklebust2006-03-201-2/+2
| | | | | | We need to use nfs_commit_alloc() in fs/nfs/direct.c. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: fix data_update accounting in NFS direct I/O pathChuck Lever2006-03-201-0/+3
| | | | | | | | | | | ^C against "iozone -I" is hitting the assertion in nfs_clear_inode(). Test plan: "iozone -i0 -I -a -c" against a slow server, then control C. This should not cause an oops. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Replace atomic_t variables in nfs_direct_req with a single spin lockChuck Lever2006-03-201-34/+47
| | | | | | | | | | | | | | | | Three atomic_t variables cause a lot of bus locking. Because they are all used in the same places in the code, just use a single spin lock. Now that the atomic_t variables are gone, we can remove the request size limitation since the code no longer depends on the limited width of atomic_t on some platforms. Test plan: Compile with CONFIG_NFS and CONFIG_NFS_DIRECTIO enabled. Millions of fsx operations, iozone, OraSim. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: clean up comments and tab damage in direct.cChuck Lever2006-03-201-23/+31
| | | | | | | | | | | Clean up tab damage and comments. Replace "file_offset" with more commonly used "pos". Test plan: Compile with CONFIG_NFS and CONFIG_NFS_DIRECTIO enabled. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: support EIOCBQUEUED return in direct write pathChuck Lever2006-03-201-7/+13
| | | | | | | | | | | | | | | | | For async iocb's, the NFS direct write path now returns EIOCBQUEUED, and calls aio_complete when all the requested writes are finished. The synchronous part of the NFS direct write path behaves exactly as it was before. Shared mapped NFS files will have some coherency difficulties when accessed concurrently with aio+dio. Will need to explore how this is handled in the local file system case. Test plan: aio-stress with "-O". OraSim. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: make iocb available everywhere in direct write pathChuck Lever2006-03-201-32/+14
| | | | | | | | | | | | Pass the iocb argument all the way down to the direct write request scheduler, and make it available in nfs_direct_write_result. Test plan: Compile the kernel with CONFIG_NFS and CONFIG_NFS_DIRECTIO enabled. Millions of fsx-odirect ops. OraSim. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: remove support for multi-segment iovs in the direct write pathChuck Lever2006-03-201-51/+17
| | | | | | | | | | | | | Eliminate the persistent use of automatic storage in all parts of the NFS client's direct write path to pave the way for introducing support for aio against files opened with the O_DIRECT flag. Test plan: Compile the kernel with CONFIG_NFS and CONFIG_NFS_DIRECTIO enabled. Millions of fsx-odirect ops. OraSim. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: make direct write path generate write requests concurrentlyChuck Lever2006-03-202-83/+160
| | | | | | | | | | | | | | | | | Duplicate infrastructure from direct read path that will allow write path to generate multiple write requests concurrently. This will enable us to add support for aio in this path. Temporarily we will lose the ability to do UNSTABLE writes followed by a COMMIT in the direct write path. However, all applications I am aware of that use NFS O_DIRECT currently write in relatively small chunks, so this should not be inconvenient in any way. Test plan: Millions of fsx-odirect ops. OraSim. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: create common routine for handling direct I/O completionChuck Lever2006-03-201-20/+26
| | | | | | | | | | Factor out the common piece of completing an NFS direct I/O request. Test plan: Compile kernel with CONFIG_NFS and CONFIG_NFS_DIRECTIO enabled. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: create common routine for allocating nfs_direct_reqChuck Lever2006-03-201-8/+19
| | | | | | | | | | | Factor out a small common piece of the path that allocate nfs_direct_req structures. Test plan: Compile kernel with CONFIG_NFS and CONFIG_NFS_DIRECTIO enabled. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: create common routine for waiting for direct I/O to completeChuck Lever2006-03-201-31/+26
| | | | | | | | | | | | | | We're about to add asynchrony to the NFS direct write path. Begin by abstracting out the common pieces in the read path. The first piece is nfs_direct_read_wait, which works the same whether the process is waiting for a read or a write. Test plan: Compile kernel with CONFIG_NFS and CONFIG_NFS_DIRECTIO enabled. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: support EIOCBQUEUED return in direct read pathChuck Lever2006-03-201-4/+21
| | | | | | | | | | | | | For async iocb's, the NFS direct read path should return EIOCBQUEUED and call aio_complete when all the requested reads are finished. The synchronous part of the NFS direct read path behaves exactly as it was before. Test plan: aio-stress with "-O". OraSim. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: make iocb available everywhere in direct read pathChuck Lever2006-03-201-8/+12
| | | | | | | | | | | | Pass the iocb argument all the way down to the direct read request scheduler, and make it available in nfs_direct_read_result. Test plan: Compile the kernel with CONFIG_NFS and CONFIG_NFS_DIRECTIO enabled. Millions of fsx-odirect ops. OraSim. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: remove support for multi-segment iovs in the direct read pathChuck Lever2006-03-201-51/+15
| | | | | | | | | | | | | Eliminate the persistent use of automatic storage in all parts of the NFS client's direct read path to pave the way for introducing support for aio against files opened with the O_DIRECT flag. Test plan: Compile the kernel with CONFIG_NFS and CONFIG_NFS_DIRECTIO enabled. Millions of fsx-odirect ops. OraSim. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: use size_t type for holding rsize bytes in NFS O_DIRECT read pathChuck Lever2006-03-201-3/+3
| | | | | | | | | | | | | size_t is used for holding byte counts, so use it for variables storing rsize. Note that the write path will be updated as we add support for async O_DIRECT writes. Test plan: Need to verify that existing comparisons against new size_t variables behave correctly. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: update comments and function definitions in fs/nfs/direct.cChuck Lever2006-03-201-110/+15
| | | | | | | | | | | | Update to latest coding style standards. Remove block comments on statically defined functions, and place function definitions all on one line. Test plan: Compile kernel with CONFIG_NFS and CONFIG_NFS_DIRECTIO. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: clean up NFS client's a_ops->direct_IO methodChuck Lever2006-03-201-47/+23
| | | | | | | | | | | | | | | | | | The NFS client's a_ops->direct_IO method, nfs_direct_IO, is required to be present to allow NFS files to be opened with O_DIRECT, but is never called because the NFS client shunts reads and writes to files opened with O_DIRECT directly to its own routines. Gut the nfs_direct_IO function. This eliminates the only part of the NFS client's direct I/O path that requires support for multi-segment iovs, allowing further simplification in subsequent patches. Test plan: Compile the kernel with CONFIG_NFS and CONFIG_NFS_DIRECTIO enabled. Millions of fsx-odirect ops. OraSim. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Cleanup of NFS read codeTrond Myklebust2006-03-205-86/+74
| | | | | | | | Same callback hierarchy inversion as for the NFS write calls. This patch is not strictly speaking needed by the O_DIRECT code, but avoids confusing differences between the asynchronous read and write code. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Cleanup of NFS write code in preparation for asynchronous o_directTrond Myklebust2006-03-204-135/+101
| | | | | | | | | | | | | | | | | This patch inverts the callback hierarchy for NFS write calls. Instead of having the NFSv2/v3/v4-specific code set up the RPC callback ops, we allow the original caller to do so. This allows for more flexibility w.r.t. how to set up and tear down the nfs_write_data structure while still allowing the NFSv3/v4 code to perform error handling. The greater flexibility is needed by the asynchronous O_DIRECT code, which wants to be able to hold on to the original nfs_write_data structures after the WRITE RPC call has completed in order to be able to replay them if the COMMIT call determines that the server has rebooted. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* lockd: Remove FL_LOCKD flagJ. Bruce Fields2006-03-202-3/+1
| | | | | | | | | | | | | | | | | | Currently lockd identifies its own locks using the FL_LOCKD flag. This doesn't scale well to multiple lock managers--if we did this in nfsv4 too, for example, we'd be left with only one free flag bit. Instead, we just check whether the file manager ops (fl_lmops) set on this lock are our own. The only use for this is in nlm_traverse_locks, which uses it to find locks that need cleaning up when freeing a host or a file. In the long run it might be nice to do reference counting instead of traversing all the locks like this.... Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* locks,lockd: fix race in nlmsvc_testlockAndy Adamson2006-03-204-26/+27
| | | | | | | | | | | | | | posix_test_lock() returns a pointer to a struct file_lock which is unprotected and can be removed while in use by the caller. Move the conflicting lock from the return to a parameter, and copy the conflicting lock. In most cases the caller ends up putting the copy of the conflicting lock on the stack. On i386, sizeof(struct file_lock) appears to be about 100 bytes. We're assuming that's reasonable. Signed-off-by: Andy Adamson <andros@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* locks: remove unused posix_block_lockAndy Adamson2006-03-201-15/+0
| | | | | | | | | posix_lock_file() is used to add a blocked lock to Lockd's block, so posix_block_lock() is no longer needed. Signed-off-by: Andy Adamson <andros@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* lockd: make nlmsvc_lock use only posix_lock_fileAndy Adamson2006-03-201-19/+4
| | | | | | | | | | Reorganize nlmsvc_lock() to make full use of posix_lock_file(), which does eveything nlmsvc_lock() needs - no need to call posix_test_lock(), posix_locks_deadlock(), or posix_block_lock() separately. Signed-off-by: Andy Adamson <andros@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* lockd: simplify nlmsvc_grant_blockedAndy Adamson2006-03-201-11/+6
| | | | | | | | | | Reorganize nlmsvc_grant_blocked() to make full use of posix_lock_file(). Note that there's no need for separate calls to posix_test_lock(), posix_locks_deadlock(), or posix_block_lock(). Signed-off-by: Andy Adamson <andros@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* lockd: clean up nlmsvc_lockAndy Adamson2006-03-201-13/+21
| | | | | | | | Slightly more consistent dprintk error reporting, consolidate some up()'s. Signed-off-by: Andy Adamson <andros@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: directory trace messagesChuck Lever2006-03-201-30/+66
| | | | | | | | | | | | | | | | Reuse NFSDBG_DIRCACHE and NFSDBG_LOOKUPCACHE to provide additional diagnostic messages that trace the operation of the NFS client's directory behavior. A few new messages are now generated when NFSDBG_VFS is active, as well, to trace normal VFS activity. This compromise provides better trace debugging for those who use pre-built kernels, without adding a lot of extra noise to the standard debug settings. Test-plan: Enable NFS trace debugging with flags 1, 2, or 4. You should be able to see different types of trace messages with each flag setting. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* SUNRPC: eliminate rpc_call()Chuck Lever2006-03-205-68/+215
| | | | | | | | | | | | | | | Clean-up: replace rpc_call() helper with direct call to rpc_call_sync. This makes NFSv2 and NFSv3 synchronous calls more computationally efficient, and reduces stack consumption in functions that used to invoke rpc_call more than once. Test plan: Compile kernel with CONFIG_NFS enabled. Connectathon on NFS version 2, version 3, and version 4 mount points. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* SUNRPC: display human-readable procedure name in rpc_iostats outputChuck Lever2006-03-208-4/+26
| | | | | | | | | | | | | | | Add fields to the rpc_procinfo struct that allow the display of a human-readable name for each procedure in the rpc_iostats output. Also fix it so that the NFSv4 stats are broken up correctly by sub-procedure number. NFSv4 uses only two real RPC procedures: NULL, and COMPOUND. Test plan: Mount with NFSv2, NFSv3, and NFSv4, and do "cat /proc/self/mountstats". Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: add RPC I/O statistics to /proc/self/mountstatsChuck Lever2006-03-201-0/+4
| | | | | | | | | | | NFS client now shows various RPC I/O metrics in /proc/self/mountstats. Test plan: Mount/umount while doing "cat /proc/self/mountstats", multiple iterations of connectathon locking suite. Test with NFS version 2, 3, and 4. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: report how long an NFS file system has been mountedChuck Lever2006-03-201-0/+5
| | | | | | | | | | | | | Add a field in nfs_server to record a timestamp when a mount succeeds. Report the number of seconds the file system has been mounted via nfs_show_stats(). Test plan: Mount an NFS file system, watch the mountstats reports and compare with clock time. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: add hooks to account for NFSERR_JUKEBOX errorsChuck Lever2006-03-203-10/+27
| | | | | | | | | | | | | | Make an inode or an nfs_server struct available in the logic that handles JUKEBOX/DELAY type errors so the NFS client can account for them. This patch is split out from the main nfs iostat patch to highlight minor architectural changes required to support this statistic. Test plan: None. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: add I/O performance countersChuck Lever2006-03-206-11/+62
| | | | | | | | | | | | | | | | Invoke the byte and event counter macros where we want to count bytes and events. Clean-up: fix a possible NULL dereference in nfs_lock, and simplify nfs_file_open. Test-plan: fsx and iozone on UP and SMP systems, with and without pre-emption. Watch for memory overwrite bugs, and performance loss (significantly more CPU required per op). Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: introduce mechanism for tracking NFS client metricsChuck Lever2006-03-202-6/+249
| | | | | | | | | | | | | | | | | Add a per-superblock performance counter facility to the NFS client. This facility mimics the counters available for block devices and for networking. Expose these new counters via the new /proc/self/mountstats interface. Thanks to Andrew Morton and Trond Myklebust for their review and comments. Test plan: fsx and iozone on UP and SMP systems, with and without pre-emption. Watch for memory overwrite bugs, and performance loss (significantly more CPU required per op). Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: clean up some mount optionsChuck Lever2006-03-201-3/+2
| | | | | | | | | | Get rid of "lock" and "posix", and spell out "vers=". Test plan: None. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: show retransmit settings when displaying mount optionsChuck Lever2006-03-201-0/+8
| | | | | | | | | | | | | Sometimes it's important to know the exact RPC retransmit settings the kernel is using for an NFS mount point. Add this facility to the NFS client's show_options method. Test plan: Set various retransmit settings via the mount command, and check that the settings are reflected in /proc/mounts. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* VFS: New /proc file /proc/self/mountstatsChuck Lever2006-03-202-0/+77
| | | | | | | | | | | | | | | | | Create a new file under /proc/self, called mountstats, where mounted file systems can export information (configuration options, performance counters, and so on). Use a mechanism similar to /proc/mounts and s_ops->show_options. This mechanism does not violate namespace security, and is safe to use while other processes are unmounting file systems. Thanks to Mike Waychison for his review and comments. Test-plan: Test concurrent mount/unmount operations while cat'ing /proc/self/mountstats. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: sem2mutex idmap.cIngo Molnar2006-03-201-20/+21
| | | | | | | | | | | | semaphore to mutex conversion. the conversion was generated via scripts, and the result was validated automatically via a script as well. build and boot tested. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: kzalloc conversion in fs/nfsEric Sesterhenn2006-03-204-13/+6
| | | | | | | | this converts fs/nfs to kzalloc() usage. compile tested with make allyesconfig Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Kill braindead gcc warningsTrond Myklebust2006-03-202-13/+17
| | | | | | | | | nfs4_open_revalidate: 'res' may be used uninitialized nfs4_callback_compound: ‘hdr_res.nops’ may be used uninitialized 'op_nr’ may be used uninitialized encode_getattr_res: ‘savep’ may be used uninitialized Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Do not call rpciod_down() before call to destroy_nfsv4_state()Trond Myklebust2006-03-201-1/+2
| | | | | | | The reason is that the idmapper cleanup may call flush_workqueue() on rpciod_workqueue. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* SUNRPC: Ensure that rpc_mkpipe returns a refcounted dentryTrond Myklebust2006-03-201-0/+2
| | | | | | | If not, we cannot guarantee that idmap->idmap_dentry, gss_auth->dentry and clnt->cl_dentry are valid dentries. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: reduce the number of false cache invalidations.Trond Myklebust2006-03-201-5/+2
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: "const static" vs "static const" in nfs4Jesper Juhl2006-03-201-1/+1
| | | | | | | | My previous "const static" vs "static const" cleanup missed a single case, patch below takes care of it. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Don't invalidate cached attributes if change attribute is unchangedTrond Myklebust2006-03-201-20/+24
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: writes should not clobber utimes() callsTrond Myklebust2006-03-201-5/+3
| | | | | | | Ensure that we flush out writes in the case when someone calls utimes() in order to set the file times. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* lockd: Don't expose the process pid to the NLM serverTrond Myklebust2006-03-205-15/+33
| | | | | | Instead we use the nlm_lockowner->pid. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NLM: nlm_alloc_call should not immediately fail on signalTrond Myklebust2006-03-201-4/+5
| | | | | | | | Currently, nlm_alloc_call tests for a signal before it even tries to allocate memory. Fix it so that it tries at least once. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* VFS: Fix __posix_lock_file() copy of private lock areaTrond Myklebust2006-03-201-17/+36
| | | | | | | The struct file_lock->fl_u area must be copied using the fl_copy_lock() operation. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
OpenPOWER on IntegriCloud