summaryrefslogtreecommitdiffstats
path: root/sys/ufs/ffs/ffs_vnops.c
Commit message (Collapse)AuthorAgeFilesLines
* Fix a file-rewrite performance case for UFS[2]. When rewriting portionsdillon2002-10-181-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | of a file in chunks that are less then the filesystem block size, if the data is not already cached the system will perform a read-before-write. The problem is that it does this on a block-by-block basis, breaking up the I/Os and making clustering impossible for the writes. Programs such as INN using cyclic file buffers suffer greatly. This problem is only going to get worse as we use larger and larger filesystem block sizes. The solution is to extend the sequential heuristic so UFS[2] can perform a far larger read and readahead when dealing with this case. (note: maximum disk write bandwidth is 27MB/sec thru filesystem) (note: filesystem blocksize in test is 8K (1K frag)) dd if=/dev/zero of=test.dat bs=1k count=2m conv=notrunc Before: (note half of these are reads) tty da0 da1 acd0 cpu tin tout KB/t tps MB/s KB/t tps MB/s KB/t tps MB/s us ni sy in id 0 76 14.21 598 8.30 0.00 0 0.00 0.00 0 0.00 0 0 7 1 92 0 76 14.09 813 11.19 0.00 0 0.00 0.00 0 0.00 0 0 9 5 86 0 76 14.28 821 11.45 0.00 0 0.00 0.00 0 0.00 0 0 8 1 91 After: (note half of these are reads) tty da0 da1 acd0 cpu tin tout KB/t tps MB/s KB/t tps MB/s KB/t tps MB/s us ni sy in id 0 76 63.62 434 26.99 0.00 0 0.00 0.00 0 0.00 0 0 18 1 80 0 76 63.58 424 26.30 0.00 0 0.00 0.00 0 0.00 0 0 17 2 82 0 76 63.82 438 27.32 0.00 0 0.00 0.00 0 0.00 1 0 19 2 79 Reviewed by: mckusick Approved by: re X-MFC after: immediately (was heavily tested in -stable for 4 months)
* When reading or writing the extended attributes of a special devicemckusick2002-10-141-3/+38
| | | | | | | | | | | | or fifo in UFS2, the normal ufs_strategy routine needs to be used rather than the spec_strategy or fifo_strategy routine. Thus the ffsext_strategy routine is interposed in the ffs_vnops vectors for special devices and fifo's to pick off this special case. Otherwise it simply falls through to the usual spec_strategy or fifo_strategy routine. Submitted by: Robert Watson <rwatson@FreeBSD.org> Sponsored by: DARPA & NAI Labs.
* size_t is not a struct (fix mislabelling in a comment).dd2002-10-021-1/+1
|
* Be consistent about "static" functions: if the function is markedphk2002-09-281-7/+7
| | | | | | static in its prototype, mark it static at the definition too. Inspired by: FlexeLint warning #512
* Use our mount-credential if we get a NOCRED when we try to write out EAphk2002-09-271-0/+2
| | | | | | | | | space back to disk. This is wrong in many ways, but not as wrong as a panic. Pancied on: rwatson & jmallet Sponsored by: DARPA & NAI Labs.
* - Convert locks to use standard macros.jeff2002-09-251-2/+8
| | | | | | - Lock access to the buflists. - Document broken locking. - Use vrefcnt().
* Implement the VOP_OPENEXTATTR() and VOP_CLOSEEXTATTR() methods.phk2002-09-051-31/+184
| | | | | | | | Use extattr_check_cred() to check access to EAs. This is still a WIP. Sponsored by: DARPA & NAI Labs.
* Include <sys/malloc.h> instead of depending on namespace pollution 2bde2002-09-051-9/+11
| | | | | | | | | layers deep in <sys/proc.h> or <sys/vnode.h>. Include <sys/vmmeter.h> instead of depending on namespace pollution in <sys/pcpu.h>. Sorted includes as much as possible.
* Correctly handle setting, getting and deleting EA's with zero length content.phk2002-08-301-12/+14
| | | | Sponsored by: DARPA & NAI Labs.
* o Retire vm_page_zero_fill() and vm_page_zero_fill_area(). Ever sincealc2002-08-251-1/+1
| | | | | | pmap_zero_page() and pmap_zero_page_area() were modified to accept a struct vm_page * instead of a physical address, vm_page_zero_fill() and vm_page_zero_fill_area() have served no purpose.
* Implement list of EA return functionality.phk2002-08-201-9/+33
| | | | | | Correctly delete EA's when the content length is set to zero. Sponsored by: DARPA & NAI Labs.
* First snapshot of UFS2 EA support.phk2002-08-191-7/+228
| | | | Sponsored by: DARPA & NAI Labs.
* Expand the arguments to ffs_ext{read,write}() to their componentphk2002-08-131-41/+17
| | | | | | | | | parts rather than use vop_{read,write}_args. Access to these functions will ultimately not be available through the "vop_{read,write}+IO_EXT" API but this functionality is retained for debugging purposes for now. Sponsored by: DARPA & NAI Labs.
* Unravel the UFS_EXTATTR incest between FFS and UFS: UFS_EXTATTR is anphk2002-08-131-2/+49
| | | | | | | | | UFS only thing, and FFS should in principle not know if it is enabled or not. This commit cleans ffs_vnops.c for such knowledge, but not ffs_vfsops.c Sponsored by: DARPA and NAI Labs.
* Stop pretending that the FFS file ufs_readwrite.c is a UFS file.phk2002-08-121-2/+1028
| | | | | | | Instead of #including it, pull it into ffs_vnops.c and name things correctly. Sponsored by: DARPA & NAI Labs.
* - Replace v_flag with v_iflag and v_vflagjeff2002-08-041-3/+5
| | | | | | | | | | | | | | | - v_vflag is protected by the vnode lock and is used when synchronization with VOP calls is needed. - v_iflag is protected by interlock and is used for dealing with vnode management issues. These flags include X/O LOCK, FREE, DOOMED, etc. - All accesses to v_iflag and v_vflag have either been locked or marked with mp_fixme's. - Many ASSERT_VOP_LOCKED calls have been added where the locking was not clear. - Many functions in vfs_subr.c were restructured to provide for stronger locking. Idea stolen from: BSD/OS
* This commit adds basic support for the UFS2 filesystem. The UFS2mckusick2002-06-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | filesystem expands the inode to 256 bytes to make space for 64-bit block pointers. It also adds a file-creation time field, an ability to use jumbo blocks per inode to allow extent like pointer density, and space for extended attributes (up to twice the filesystem block size worth of attributes, e.g., on a 16K filesystem, there is space for 32K of attributes). UFS2 fully supports and runs existing UFS1 filesystems. New filesystems built using newfs can be built in either UFS1 or UFS2 format using the -O option. In this commit UFS1 is the default format, so if you want to build UFS2 format filesystems, you must specify -O 2. This default will be changed to UFS2 when UFS2 proves itself to be stable. In this commit the boot code for reading UFS2 filesystems is not compiled (see /sys/boot/common/ufsread.c) as there is insufficient space in the boot block. Once the size of the boot block is increased, this code can be defined. Things to note: the definition of SBSIZE has changed to SBLOCKSIZE. The header file <ufs/ufs/dinode.h> must be included before <ufs/ffs/fs.h> so as to get the definitions of ufs2_daddr_t and ufs_lbn_t. Still TODO: Verify that the first level bootstraps work for all the architectures. Convert the utility ffsinfo to understand UFS2 and test growfs. Add support for the extended attribute storage. Update soft updates to ensure integrity of extended attribute storage. Switch the current extended attribute interfaces to use the extended attribute storage. Add the extent like functionality (framework is there, but is currently never used). Sponsored by: DARPA & NAI Labs. Reviewed by: Poul-Henning Kamp <phk@freebsd.org>
* Name ufs_vop_[gs]etextattr() consistently with the rest of our VOPs andphk2002-05-031-12/+0
| | | | | | | put then in the ufs_vnops where they belong, rather than in the ffs_vnops. Ok'ed by: rwatson Sponsored by: DARPA & NAI Labs.
* Remove __P.alfred2002-03-191-4/+4
|
* This corrects the first of two known deadlock conditions thatmckusick2002-03-141-12/+1
| | | | come from the presence of a snapshot file.
* KSE Milestone 2julian2001-09-121-3/+3
| | | | | | | | | | | | | | Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha
* Implement vop_std{get|put}pages() and add them to the default vop[].phk2001-05-011-2/+0
| | | | | Un-copy&paste all the VOP_{GET|PUT}PAGES() functions which do nothing but the default.
* VOP_BALLOC was never really a VOP in the first place, so convert itphk2001-04-291-1/+0
| | | | to UFS_BALLOC like the other "between UFS and FFS function interfaces".
* Revert consequences of changes to mount.h, part 2.grog2001-04-291-2/+0
| | | | Requested by: bde
* Correct #includes to work with fixed sys/mount.h.grog2001-04-231-0/+2
|
* o Change options FFS_EXTATTR and options FFS_EXTATTR_AUTOSTART torwatson2001-03-191-4/+4
| | | | | | | | | | | | | options UFS_EXTATTR and UFS_EXTATTR_AUTOSTART respectively. This change reflects the fact that our EA support is implemented entirely at the UFS layer (modulo FFS start/stop/autostart hooks for mount and unmount events). This also better reflects the fact that [shortly] MFS will also support EAs, as well as possibly IFS. o Consumers of the EA support in FFS are reminded that as a result, they must change kernel config files to reflect the new option names. Obtained from: TrustedBSD Project
* Fixes to track snapshot copy-on-write checking in the specinfomckusick2001-03-071-1/+0
| | | | | | structure rather than assuming that the device vnode would reside in the FFS filesystem (which is obviously a broken assumption with the device filesystem).
* Another round of the <sys/queue.h> FOREACH transmogriffer.phk2001-02-041-2/+1
| | | | | Created with: sed(1) Reviewed by: md5(1)
* Initial commit of IFS - a inode-namespaced FFS. Here is a shortadrian2000-10-141-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | description: How it works: -- Basically ifs is a copy of ffs, overriding some vfs/vnops. (Yes, hack.) I didn't see the need in duplicating all of sys/ufs/ffs to get this off the ground. File creation is done through a special file - 'newfile' . When newfile is called, the system allocates and returns an inode. Note that newfile is done in a cloning fashion: fd = open("newfile", O_CREAT|O_RDWR, 0644); fstat(fd, &st); printf("new file is %d\n", (int)st.st_ino); Once you have created a file, you can open() and unlink() it by its returned inode number retrieved from the stat call, ie: fd = open("5", O_RDWR); The creation permissions depend entirely if you have write access to the root directory of the filesystem. To get the list of currently allocated inodes, VOP_READDIR has been added which returns a directory listing of those currently allocated. -- What this entails: * patching conf/files and conf/options to include IFS as a new compile option (and since ifs depends upon FFS, include the FFS routines) * An entry in i386/conf/NOTES indicating IFS exists and where to go for an explanation * Unstaticize a couple of routines in src/sys/ufs/ffs/ which the IFS routines require (ffs_mount() and ffs_reload()) * a new bunch of routines in src/sys/ufs/ifs/ which implement the IFS routines. IFS replaces some of the vfsops, and a handful of vnops - most notably are VFS_VGET(), VOP_LOOKUP(), VOP_UNLINK() and VOP_READDIR(). Any other directory operation is marked as invalid. What this results in: * an IFS partition's create permissions are controlled by the perm/ownership of the root mount point, just like a normal directory * Each inode has perm and ownership too * IFS does *NOT* mean an FFS partition can be opened per inode. This is a completely seperate filesystem here * Softupdates doesn't work with IFS, and really I don't think it needs it. Besides, fsck's are FAST. (Try it :-) * Inodes 0 and 1 aren't allocatable because they are special (dump/swap IIRC). Inode 2 isn't allocatable since UFS/FFS locks all inodes in the system against this particular inode, and unravelling THAT code isn't trivial. Therefore, useful inodes start at 3. Enjoy, and feedback is definitely appreciated!
* Blow away the v_specmountpoint define, replacing it with what it waseivind2000-10-091-2/+2
| | | | defined as (rdev->si_mountpoint)
* o Permit UFS Extended Attributes to be associated with special devicesrwatson2000-09-211-0/+8
| | | | | | and FIFOs. Obtained from: TrustedBSD Project
* Add snapshots to the fast filesystem. Most of the changes supportmckusick2000-07-111-2/+12
| | | | | | | | | | | | | | | | | | | | the gating of system calls that cause modifications to the underlying filesystem. The gating can be enabled by any filesystem that needs to consistently suspend operations by adding the vop_stdgetwritemount to their set of vnops. Once gating is enabled, the function vfs_write_suspend stops all new write operations to a filesystem, allows any filesystem modifying system calls already in progress to complete, then sync's the filesystem to disk and returns. The function vfs_write_resume allows the suspended write operations to begin again. Gating is not added by default for all filesystems as for SMP systems it adds two extra locks to such critical kernel paths as the write system call. Thus, gating should only be added as needed. Details on the use and current status of snapshots in FFS can be found in /sys/ufs/ffs/README.snapshot so for brevity and timelyness is not included here. Unless and until you create a snapshot file, these changes should have no effect on your system (famous last words).
* Revert part of my bioops change which implemented panic(8).phk2000-06-161-2/+0
|
* Virtualizes & untangles the bioops operations vector.phk2000-06-161-2/+5
| | | | Ref: Message-ID: <18317.961014572@critter.freebsd.dk> To: current@
* Separate the struct bio related stuff out of <sys/buf.h> intophk2000-05-051-0/+1
| | | | | | | | | | | | | | | <sys/bio.h>. <sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall not be made a nested include according to bdes teachings on the subject of nested includes. Diskdrivers and similar stuff below specfs::strategy() should no longer need to include <sys/buf.> unless they need caching of data. Still a few bogus uses of struct buf to track down. Repocopy by: peter
* Introduce extended attribute support for FFS, allowing arbitraryrwatson2000-04-151-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | (name, value) pairs to be associated with inodes. This support is used for ACLs, MAC labels, and Capabilities in the TrustedBSD security extensions, which are currently under development. In this implementation, attributes are backed to data vnodes in the style of the quota support in FFS. Support for FFS extended attributes may be enabled using the FFS_EXTATTR kernel option (disabled by default). Userland utilities and man pages will be committed in the next batch. VFS interfaces and man pages have been in the repo since 4.0-RELEASE and are unchanged. o ufs/ufs/extattr.h: UFS-specific extattr defines o ufs/ufs/ufs_extattr.c: bulk of support routines o ufs/{ufs,ffs,mfs}/*.[ch]: hooks and extattr.h includes o contrib/softupdates/ffs_softdep.c: extattr.h includes o conf/options, conf/files, i386/conf/LINT: added FFS_EXTATTR o coda/coda_vfsops.c: XXX required extattr.h due to ufsmount.h (This should not be the case, and will be fixed in a future commit) Currently attributes are not supported in MFS. This will be fixed. Reviewed by: adrian, bp, freebsd-fs, other unthanked souls Obtained from: TrustedBSD Project
* Give vn_isdisk() a second argument where it can return a suitable errno.phk2000-01-101-3/+2
| | | | Suggested by: bde
* Several performance improvements for soft updates have been added:mckusick2000-01-101-19/+30
| | | | | | | | | | | | | | | 1) Fastpath deletions. When a file is being deleted, check to see if it was so recently created that its inode has not yet been written to disk. If so, the delete can proceed to immediately free the inode. 2) Background writes: No file or block allocations can be done while the bitmap is being written to disk. To avoid these stalls, the bitmap is copied to another buffer which is written thus leaving the original available for futher allocations. 3) Link count tracking. Constantly track the difference in i_effnlink and i_nlink so that inodes that have had no change other than i_effnlink need not be written. 4) Identify buffers with rollback dependencies so that the buffer flushing daemon can choose to skip over them.
* Convert various pieces of code to use vn_isdisk() rather than checkingphk1999-11-221-2/+2
| | | | | | | | for vp->v_type == VBLK. In ccd: we don't need to call VOP_GETATTR to find the type of a vnode. Reviewed by: sos
* useracc() the prequel:phk1999-10-291-1/+0
| | | | | | | | | | | Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs. This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE} as argument.
* $Id$ -> $FreeBSD$peter1999-08-281-1/+1
|
* Decommision miscfs/specfs/specdev.h. Most of it goes into <sys/conf.h>,phk1999-08-081-3/+2
| | | | | | a few lines into <sys/vnode.h>. Add a few fields to struct specinfo, paving the way for the fun part.
* Convert buffer locking from using the B_BUSY and B_WANTED flags to usingmckusick1999-06-261-6/+7
| | | | | | | lockmgr locks. This commit should be functionally equivalent to the old semantics. That is, all buffer locking is done with LK_EXCLUSIVE requests. Changes to take advantage of LK_SHARED and LK_RECURSIVE will be done in future commits.
* On our final pass through ffs_fsync, do all I/O synchronously so thatmckusick1999-06-181-7/+8
| | | | | we can find out if our flush is failing because of write errors. This change avoids a "flush failed" panic during unrecoverable disk errors.
* Add a hook to ffs_fsync to allow soft updates to get first chance at doingmckusick1999-05-141-1/+6
| | | | | | a sync on the block device for the filesystem. That allows it to push the bitmap blocks before the inode blocks which greatly reduces the number of inode rollbacks that need to be done.
* When fsync'ing a file on a filesystem using soft updates, we first trymckusick1999-03-021-10/+18
| | | | | | | | | | | | | | | | to write all the dirty blocks. If some of those blocks have dependencies, they will be remarked dirty when the I/O completes. On systems with really fast I/O systems, it is possible to get in an infinite loop trying to flush the buffers, because the I/O finishes before we can get all the dirty buffers off the v_dirtyblkhd list and into the I/O queue. (The previous algorithm looped over the v_dirtyblkhd list writing out buffers until the list emptied.) So, now we mark each buffer that we try to write so that we can distinguish the ones that are being remarked dirty from those that we have not yet tried to flush. Once we have tried to push every buffer once, we then push any associated metadata that is causing the remaining buffers to be redirtied. Submitted by: Matthew Dillon <dillon@apollo.backplane.com>
* Don't pass unused unused timestamp args to UFS_UPDATE() or wastebde1999-01-071-4/+2
| | | | | time initializing them. This almost finishes centralizing (in-core) timestamp updates in ufs_itimes().
* Use TAILQ macros for clean/dirty block list processing. Set b_xflagspeter1998-10-311-4/+4
| | | | rather than abusing the list next pointer with a magic number.
* Eliminate a race in VOP_FSYNC() when softupdates is enabled.luoqi1998-09-241-6/+2
| | | | | | | | | | Submitted by: Kirk McKusick <mckusick@McKusick.COM> Two minor changes are also included, 1. Remove gratuitious checks for error return from vn_lock with LK_RETRY set, vn_lock should always succeed in these cases. 2. Back out change rev. 1.36->1.37, which unnecessarily makes async mount a little more unstable. It also keeps us in sync with other BSDs. Suggested by: Bruce Evans <bde@zeta.org.au>
* Put the zombie ffs sysctl node in "notyet" state together with its fewbde1998-09-071-4/+1
| | | | remaining children. Prepare it for MOUNT_UFS going away.
OpenPOWER on IntegriCloud