summaryrefslogtreecommitdiffstats
path: root/sys/nfsclient/nfs_bio.c
Commit message (Collapse)AuthorAgeFilesLines
* Avoid an egcs pessimization for 64-bit signed division on i386's.bde1998-06-141-2/+2
| | | | | | | | | Pre-2.8 versions of gcc generate a call to __divdi3() for all 64-bit signed divisions, but egcs optimizes them to a shift and fixup when the divisor is a constant power of 2. Unfortunately, it generates a call to __cmpdi2() for the fixup, although all except possibly ancient versions of gcc and egcs do ordinary 64-bit comparisons inline.
* Make sure we go a nfs_fsinfo() in get/putpages before callingpeter1998-06-011-30/+70
| | | | | | | | readrpc/writerpc, since they assume it's already been done. This could break if the first read/write access to a nfs filesystem was an exec() or mmap() instead of a read(), write() syscall. (or statfs()). nfs_getpages() could return an errno (EOPNOTSUPP) instead of a VM_PAGER_* return code. Some layout tweaks for the get/putpages code.
* When using NFSv3, use the remote server's idea of the maximum file sizepeter1998-05-301-2/+7
| | | | | | | | | | | | | | | | rather than assuming 2^64. It may not like files that big. :-) On the nfs server, calculate and report the max file size as the point that the block numbers in the cache would turn negative. (ie: 1099511627775 bytes (1TB)). One of the things I'm worried about however, is that directory offsets are really cookies on a NFSv3 server and can be rather large, especially when/if the server generates the opaque directory cookies by using a local filesystem offset in what comes out as the upper 32 bits of the 64 bit cookie. (a server is free to do this, it could save byte swapping depending on the native 64 bit byte order) Obtained from: NetBSD
* A cleaner fix for PR#5102, clear nonsense flags at mount time rather thanpeter1998-05-201-3/+1
| | | | | | in the core of nfs_bio.c at the 11th hour. PR: 5102
* Allow control of the attribute cache timeouts at mount time.peter1998-05-191-3/+5
| | | | | | We had run out of bits in the nfs mount flags, I have moved the internal state flags into a seperate variable. These are no longer visible via statfs(), but I don't know of anything that looks at them.
* Don't allow the readdirplus routine to be used in NFS V2.steve1998-03-281-1/+3
| | | | | | PR: 5102 Reviewed by: msmith Submitted by: Dmitry Kohmanyuk <dk@farm.org>
* Reviewed by: dyson@freebsd.org (john Dyson), dg@root.com (david greenman)julian1998-03-081-1/+5
| | | | | Submitted by: Kirk McKusick (mcKusick@mckusick.com) Obtained from: WHistle development tree
* This mega-commit is meant to fix numerous interrelated problems. Theredyson1998-03-071-22/+145
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | has been some bitrot and incorrect assumptions in the vfs_bio code. These problems have manifest themselves worse on NFS type filesystems, but can still affect local filesystems under certain circumstances. Most of the problems have involved mmap consistancy, and as a side-effect broke the vfs.ioopt code. This code might have been committed seperately, but almost everything is interrelated. 1) Allow (pmap_object_init_pt) prefaulting of buffer-busy pages that are fully valid. 2) Rather than deactivating erroneously read initial (header) pages in kern_exec, we now free them. 3) Fix the rundown of non-VMIO buffers that are in an inconsistent (missing vp) state. 4) Fix the disassociation of pages from buffers in brelse. The previous code had rotted and was faulty in a couple of important circumstances. 5) Remove a gratuitious buffer wakeup in vfs_vmio_release. 6) Remove a crufty and currently unused cluster mechanism for VBLK files in vfs_bio_awrite. When the code is functional, I'll add back a cleaner version. 7) The page busy count wakeups assocated with the buffer cache usage were incorrectly cleaned up in a previous commit by me. Revert to the original, correct version, but with a cleaner implementation. 8) The cluster read code now tries to keep data associated with buffers more aggressively (without breaking the heuristics) when it is presumed that the read data (buffers) will be soon needed. 9) Change to filesystem lockmgr locks so that they use LK_NOPAUSE. The delay loop waiting is not useful for filesystem locks, due to the length of the time intervals. 10) Correct and clean-up spec_getpages. 11) Implement a fully functional nfs_getpages, nfs_putpages. 12) Fix nfs_write so that modifications are coherent with the NFS data on the server disk (at least as well as NFS seems to allow.) 13) Properly support MS_INVALIDATE on NFS. 14) Properly pass down MS_INVALIDATE to lower levels of the VM code from vm_map_clean. 15) Better support the notion of pages being busy but valid, so that fewer in-transit waits occur. (use p->busy more for pageouts instead of PG_BUSY.) Since the page is fully valid, it is still usable for reads. 16) It is possible (in error) for cached pages to be busy. Make the page allocation code handle that case correctly. (It should probably be a printf or panic, but I want the system to handle coding errors robustly. I'll probably add a printf.) 17) Correct the design and usage of vm_page_sleep. It didn't handle consistancy problems very well, so make the design a little less lofty. After vm_page_sleep, if it ever blocked, it is still important to relookup the page (if the object generation count changed), and verify it's status (always.) 18) In vm_pageout.c, vm_pageout_clean had rotted, so clean that up. 19) Push the page busy for writes and VM_PROT_READ into vm_pageout_flush. 20) Fix vm_pager_put_pages and it's descendents to support an int flag instead of a boolean, so that we can pass down the invalidate bit.
* Trivial filesystem getpages/putpages implementations, set the second.msmith1998-03-061-1/+16
| | | | | These should be considered the first steps in a work-in-progress. Submitted by: Terry Lambert <terry@freebsd.org>
* Back out DIAGNOSTIC changes.eivind1998-02-061-2/+1
|
* Turn DIAGNOSTIC into a new-style option.eivind1998-02-041-1/+2
|
* Release the buffer when an error occurs while reading directory entries.tegge1998-01-311-3/+6
|
* Various NFS fixes:dyson1998-01-251-87/+78
| | | | | | | | | | | | | | | | Make vfs_bio buffer mgmt work better. Buffers were being used after brelse. Make nfs_getpages work independently of other NFS interfaces. This eliminates some difficult recursion problems and decreases pagefault overhead. Remove an erroneous vfs_unbusy_pages. Fix a reentrancy problem, with nfs_vinvalbuf when vnode is already being rundown. Reassignbuf wasn't being called when needed under certain circumstances. (Thanks to Bill Paul for help.)
* Make our v_usecount vnode reference count work identically to thedyson1998-01-061-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | original BSD code. The association between the vnode and the vm_object no longer includes reference counts. The major difference is that vm_object's are no longer freed gratuitiously from the vnode, and so once an object is created for the vnode, it will last as long as the vnode does. When a vnode object reference count is incremented, then the underlying vnode reference count is incremented also. The two "objects" are now more intimately related, and so the interactions are now much less complex. When vnodes are now normally placed onto the free queue with an object still attached. The rundown of the object happens at vnode rundown time, and happens with exactly the same filesystem semantics of the original VFS code. There is absolutely no need for vnode_pager_uncache and other travesties like that anymore. A side-effect of these changes is that SMP locking should be much simpler, the I/O copyin/copyout optimizations work, NFS should be more ponderable, and further work on layered filesystems should be less frustrating, because of the totally coherent management of the vnode objects and vnodes. Please be careful with your system while running this code, but I would greatly appreciate feedback as soon a reasonably possible.
* Various of the ISP users have commented that the 1.41 version of thedyson1997-12-081-115/+19
| | | | | | nfs_bio.c code worked better than the 1.44. This commit reverts the important parts of 1.44 to 1.41, and we will fix it when we can get a handle on the problem.
* unifdef -U__NetBSD__ -D__FreeBSD__phk1997-09-101-5/+1
|
* Removed unused #includes.bde1997-08-021-3/+1
|
* Avoid small synchronous writes when an application does lots of random-accessdfr1997-06-251-19/+115
| | | | short writes within a block (e.g. ld).
* Upgrade NFS to support the new vfs_bio resource/buffer management.dyson1997-06-161-1/+2
|
* Fix a problem caused by removing large numbers of files from a directorydfr1997-06-061-2/+6
| | | | which could cause a bad size to be given to uiomove, causing a page fault.
* Fix some performance problems with the NFS mmap fixes.dfr1997-06-031-4/+4
|
* Fix a few bugs with NFS and mmap caused by NFS' use of b_validoffdfr1997-05-191-2/+83
| | | | | | | | and b_validend. The changes to vfs_bio.c are a bit ugly but hopefully can be tidied up later by a slight redesign. PR: kern/2573, kern/2754, kern/3046 (possibly) Reviewed by: dyson
* Check the B_CLUSTER flag when choosing whether to use unstable or filesyncdfr1997-05-131-2/+2
| | | | | | | writes. PR: kern/3438 Submitted by: Tor Egge <Tor.Egge@idi.ntnu.no>
* Fix a bug where a program which appended many small records to a file coulddfr1997-04-191-1/+3
| | | | | | | | | | | | | wind up writing zeros instead of real data when the file is on an NFSv2 mounted directory. While tracking this bug down, I noticed that nfs_asyncio was waking *all* the iods when a block was written instead of just one per block. Fixing this gives a 25% performance improvment for writes on v2 (less for v3). Both are 2.2 candidates. PR: kern/2774
* Don't allow partial buffers to be cluster-comitted.dfr1997-04-181-4/+7
| | | | | | | Zero the b_dirty{off,end} after cluster-comitting a group of buffers. With these fixes, I was able to complete a 'make world' with remote src and obj directories.
* The code which recovered from a modified directory situation did not checkdfr1997-04-031-1/+8
| | | | | for eof when re-caching the directory. This could cause it to loop forever if a directory was truncated.
* YAMInTheWrongDirectionF22 (part of rev.1.28.2.3: set B_CLUSTEROK forbde1997-03-091-2/+2
| | | | commits).
* Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are notpeter1997-02-221-1/+1
| | | | ready for it yet.
* This is the kernel Lite/2 commit. There are some requisite userlanddyson1997-02-101-21/+5
| | | | | | | | | | | | | | | changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes. The system boots and can mount UFS filesystems. Untested: ext2fs, msdosfs, NFS Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed. Reviewed by: various people Submitted by: Jeffery Hsu <hsu@freebsd.org>
* Make the long-awaited change from $Id$ to $FreeBSD$jkh1997-01-141-1/+1
| | | | | | | | This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long. Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
* Improve the queuing algorithms used by NFS' asynchronous i/o. Thedfr1996-11-061-35/+85
| | | | | | | | | | | | | | | | | | existing mechanism uses a global queue for some buffers and the vp->b_dirtyblkhd queue for others. This turns sequential writes into randomly ordered writes to the server, affecting both read and write performance. The existing mechanism also copes badly with hung servers, tending to block accesses to other servers when all the iods are waiting for a hung server. The new mechanism uses a queue for each mount point. All asynchronous i/o goes through this queue which preserves the ordering of requests. A simple mechanism ensures that the iods are shared out fairly between active mount points. This removes the sysctl variable vfs.nfs.dwrite since the new queueing mechanism removes the old delayed write code completely. This should go into the 2.2 branch.
* If a large (>4096 bytes) directory was modified, the old directorydfr1996-10-211-2/+6
| | | | | | | | | contents are discarded, including the cached seek cookies. Unfortunately, if the directory was larger than NFS_DIRBLKSIZ, then this confused nfs_readdirrpc(), making it appear as if the directory was truncated. Reviewed by: Karl Denninger <karl@Mcs.Net>
* Staticized `nfs_dwrite'.bde1996-10-121-2/+2
|
* This fixes a problem with the nfs socket handling code which happensdfr1996-10-111-1/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | if a single process is performing a large number of requests (in this case writing a large file). The writing process could monopolise the recieve lock and prevent any other processes from recieving their replies. It also adds a new sysctl variable 'vfs.nfs.dwrite' which controls the behaviour which originally pointed out the problem. When a process writes to a file over NFS, it usually arranges for another process (the 'iod') to perform the request. If no iods are available, then it turns the write into a 'delayed write' which is later picked up by the next iod to do a write request for that file. This can cause that particular iod to do a disproportionate number of requests from a single process which can harm performance on some NFS servers. The alternative is to perform the write synchronously in the context of the original writing process if no iod is avaiable for asynchronous writing. The 'delayed write' behaviour is selected when vfs.nfs.dwrite=1 and the non-delayed behaviour is selected when vfs.nfs.dwrite=0. The default is vfs.nfs.dwrite=1; if many people tell me that performance is better if vfs.nfs.dwrite=0 then I will change the default. Submitted by: Hidetoshi Shimokawa <simokawa@sat.t.u-tokyo.ac.jp>
* In sys/time.h, struct timespec is defined as:nate1996-09-191-5/+5
| | | | | | | | | | | | | | /* * Structure defined by POSIX.4 to be like a timeval. */ struct timespec { time_t ts_sec; /* seconds */ long ts_nsec; /* and nanoseconds */ }; The correct names of the fields are tv_sec and tv_nsec. Reminded by: James Drobina <jdrobina@infinet.com>
* Various fixes from frank@fwi.uva.nl (Frank van der Linden) viadfr1996-07-161-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | rick@snowhite.cis.uoguelph.ca: 1. Clear B_NEEDCOMMIT in nfs_write to make sure that dirty data is correctly send to the server. If a buffer was dirtied when it was in the B_DELWRI+B_NEEDCOMMIT state, the state of the buffer was left unchanged and when the buffer was later cleaned, just a commit rpc was made to the server to complete the previous write. Clearing B_NEEDCOMMIT ensures that another write is made to the server. 2. If a server returned a server (for whatever reason) returned an answer to a write RPC that implied that fewer bytes than requested were written, bad things would happen. 3. The setattr operation passed on the atime in stead of the mtime to the server. The fix is trivial. 4. XIDs always started at 0, but this caused some servers (older DEC OSF/1 3.0 so I've been told) who had very long-lasting XID caches to get confused if, after a reboot of a BSD client, RPCs came in with a XID that had in the past been used before from that client. Patch is to use the current time in seconds as a starting point for XIDs. The patch below is not perfect, because it requires the root fs to be mounted first. This is because of the check BSD systems do, comparing FS time to system time. Reviewed by: Bruce Evans, Terry Lambert. Obtained from: frank@fwi.uva.nl (Frank van der Linden) via rick@snowhite.cis.uoguelph.ca
* Clear flags before using an inactive buffer. This is a kludge, butpst1996-06-081-1/+2
| | | | | | matches the code in bread(). Reviewed by: bde
* Add a check to prevent a computation from underflowing and causingmpp1996-01-241-3/+4
| | | | | | | | | | | | | a panic due to an attaempt to allocate a buffer for a terabyte or so of data when an attempt is made to create sparse data (e.g. a holey file) more than 1 block past the end of the file. Note: some other areas of this code need to be looked at, since they might cause problems when the file size exceeds 2GB, due to storing results in ints when the computations are being done with quad sized variables. Reviewed by: bde
* Staticize.phk1995-12-171-3/+3
|
* Untangled the vm.h include file spaghetti.dg1995-12-071-1/+3
|
* Completed function declarations and/or added prototypes and/or movedbde1995-12-031-2/+4
| | | | prototypes to the right place.
* Second batch of cleanup changes.phk1995-10-291-4/+2
| | | | | This time mostly making a lot of things static and some unused variables here and there.
* Add support for amd direct maps.dfr1995-08-241-2/+2
| | | | Reviewed by: Thomas Graichen <graichen@sirius.physik.fu-berlin.de>
* Use a consistent blocksize for sizing bufs to avoid panicing the bio system.dfr1995-07-071-4/+4
|
* Changes to support version 3 of the NFS protocol.dfr1995-06-271-111/+181
| | | | | | | | | | | | | | | | | | The version 2 support has been tested (client+server) against FreeBSD-2.0, IRIX 5.3 and FreeBSD-current (using a loopback mount). The version 2 support is stable AFAIK. The version 3 support has been tested with a loopback mount and minimally against an IRIX 5.3 server. It needs more testing and may have problems. I have patched amd to support the new variable length filehandles although it will still only use version 2 of the protocol. Before booting a kernel with these changes, nfs clients will need to at least build and install /usr/sbin/mount_nfs. Servers will need to build and install /usr/sbin/mountd. NFS diskless support is untested. Obtained from: Rick Macklem <rick@snowhite.cis.uoguelph.ca>
* Remove trailing whitespace.rgrimes1995-05-301-3/+3
|
* Changes to fix the following bugs:dg1995-05-211-34/+64
| | | | | | | | | | | | | | | 1) Files weren't properly synced on filesystems other than UFS. In some cases, this lead to lost data. Most likely would be noticed on NFS. The fix is to make the VM page sync/object_clean general rather than in each filesystem. 2) Mixing regular and mmaped file I/O on NFS was very broken. It caused chunks of files to end up as zeroes rather than the intended contents. The fix was to fix several race conditions and to kludge up the "b_dirtyoff" and "b_dirtyend" that NFS relies upon - paying attention to page modifications that occurred via the mmapping. Reviewed by: David Greenman Submitted by: John Dyson
* Various fixes from John Dyson:dg1995-04-161-39/+19
| | | | | | | | 1) Rewrote screwy code that uses an incore buffer without making it busy. 2) Use B_CACHE instead of B_DONE in cases where it is appropriate. 3) Minor code optimization. This *might* fix kern/345 submitted by Heikki Suonsivu.
* Removed obsolete vtrace() remnants.dg1995-03-041-2/+1
|
* Removed a pile of vfs_unbusy_pages()...both unnecessary and wrong - resulteddg1995-02-031-6/+3
| | | | | | | in serious system instability. Changed a B_INVAL to a B_NOCACHE so that buffer data is properly disposed of. Submitted by: John Dyson, Rick Macklin, and ohki@gssm.otsuka.tsukuba.ac.jp
OpenPOWER on IntegriCloud