summaryrefslogtreecommitdiffstats
path: root/sys/ufs/ffs
Commit message (Collapse)AuthorAgeFilesLines
* When deleting a file, the ordering of events imposed by soft updatesmckusick2000-11-141-15/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | is to first write the deleted directory entry to disk, second write the zero'ed inode to disk, and finally to release the freed blocks and the inode back to the cylinder-group map. As this ordering requires two disk writes to occur which are normally spaced about 30 seconds apart (except when memory is under duress), it takes about a minute from the time that a file is deleted until its inode and data blocks show up in the cylinder-group map for reallocation. If a file has had only a brief lifetime (less than 30 seconds from creation to deletion), neither its inode nor its directory entry may have been written to disk. If its directory entry has not been written to disk, then we need not wait for that directory block to be written as the on-disk directory block does not reference the inode. Similarly, if the allocated inode has never been written to disk, we do not have to wait for it to be written back either as its on-disk representation is still zero'ed out. Thus, in the case of a short lived file, we can simply release the blocks and inode to the cylinder-group map immediately. As the inode and its blocks are released immediately, they are immediately available for other uses. If they are not released for a minute, then other inodes and blocks must be allocated for short lived files, cluttering up the vnode and buffer caches. The previous code was a bit too aggressive in trying to release the blocks and inode back to the cylinder-group map resulting in their being made available when in fact the inode on disk had not yet been zero'ed. This patch takes a more conservative approach to doing the release which avoids doing the release prematurely.
* Initial commit of IFS - a inode-namespaced FFS. Here is a shortadrian2000-10-143-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | description: How it works: -- Basically ifs is a copy of ffs, overriding some vfs/vnops. (Yes, hack.) I didn't see the need in duplicating all of sys/ufs/ffs to get this off the ground. File creation is done through a special file - 'newfile' . When newfile is called, the system allocates and returns an inode. Note that newfile is done in a cloning fashion: fd = open("newfile", O_CREAT|O_RDWR, 0644); fstat(fd, &st); printf("new file is %d\n", (int)st.st_ino); Once you have created a file, you can open() and unlink() it by its returned inode number retrieved from the stat call, ie: fd = open("5", O_RDWR); The creation permissions depend entirely if you have write access to the root directory of the filesystem. To get the list of currently allocated inodes, VOP_READDIR has been added which returns a directory listing of those currently allocated. -- What this entails: * patching conf/files and conf/options to include IFS as a new compile option (and since ifs depends upon FFS, include the FFS routines) * An entry in i386/conf/NOTES indicating IFS exists and where to go for an explanation * Unstaticize a couple of routines in src/sys/ufs/ffs/ which the IFS routines require (ffs_mount() and ffs_reload()) * a new bunch of routines in src/sys/ufs/ifs/ which implement the IFS routines. IFS replaces some of the vfsops, and a handful of vnops - most notably are VFS_VGET(), VOP_LOOKUP(), VOP_UNLINK() and VOP_READDIR(). Any other directory operation is marked as invalid. What this results in: * an IFS partition's create permissions are controlled by the perm/ownership of the root mount point, just like a normal directory * Each inode has perm and ownership too * IFS does *NOT* mean an FFS partition can be opened per inode. This is a completely seperate filesystem here * Softupdates doesn't work with IFS, and really I don't think it needs it. Besides, fsck's are FAST. (Try it :-) * Inodes 0 and 1 aren't allocatable because they are special (dump/swap IIRC). Inode 2 isn't allocatable since UFS/FFS locks all inodes in the system against this particular inode, and unravelling THAT code isn't trivial. Therefore, useful inodes start at 3. Enjoy, and feedback is definitely appreciated!
* Blow away the v_specmountpoint define, replacing it with what it waseivind2000-10-093-9/+9
| | | | defined as (rdev->si_mountpoint)
* o Move initialization of ump from mp to the top of the function so thatrwatson2000-10-061-2/+1
| | | | | | | it is defined whenm used in ufs_extattr_uepm_destroy(), fixing a panic due to a NULL pointer dereference. Submitted by: Wesley Morgan <morganw@chemicals.tacorp.com>
* o Add call to ufs_extattr_uepm_destroy() in ffs_unmount() so as to cleanrwatson2000-10-041-0/+15
| | | | | | | up lock on extattrs. o Get for free a comment indicating where auto-starting of extended attributes will eventually occur, as it was in my commit tree also. No implementation change here, only a comment.
* Convert lockmgr locks from using simple locks to using mutexes.jasone2000-10-041-6/+8
| | | | | | Add lockdestroy() and appropriate invocations, which corresponds to lockinit() and must be called to clean up after a lockmgr lock is no longer needed.
* Add a lock structure to vnode structure. Previously it was either allocatedbp2000-09-251-1/+5
| | | | | | | | | | | | | | | | | | | separately (nfs, cd9660 etc) or keept as a first element of structure referenced by v_data pointer(ffs). Such organization leads to known problems with stacked filesystems. From this point vop_no*lock*() functions maintain only interlock lock. vop_std*lock*() functions maintain built-in v_lock structure using lockmgr(). vop_sharedlock() is compatible with vop_stdunlock(), but maintains a shared lock on vnode. If filesystem wishes to export lockmgr compatible lock, it can put an address of this lock to v_vnlock field. This indicates that the upper filesystem can take advantage of it and use single lock structure for entire (or part) of stack of vnodes. This field shouldn't be examined or modified by VFS code except for initialization purposes. Reviewed in general by: mckusick
* o Permit UFS Extended Attributes to be associated with special devicesrwatson2000-09-211-0/+8
| | | | | | and FIFOs. Obtained from: TrustedBSD Project
* Silence a warning.des2000-09-171-1/+1
|
* Cannot do MALLOC with M_WAITOK while holding ACQUIRE_LOCKmckusick2000-09-071-2/+2
| | | | Obtained from: Ethan Solomita <ethan@geocast.com>
* Major update to the way synchronization is done in the kernel. Highlightsjasone2000-09-072-2/+0
| | | | | | | | | | | | | | | include: * Mutual exclusion is used instead of spl*(). See mutex(9). (Note: The alpha port is still in transition and currently uses both.) * Per-CPU idle processes. * Interrupts are run in their own separate kernel threads and can be preempted (i386 only). Partially contributed by: BSDi (BSD/OS) Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh
* Initialize *countp to 0 in stub for softdep_flushworklist().tegge2000-08-091-0/+1
| | | | | | This allows ffs_fsync() to break out of a loop that might otherwise be infinite on kernels compiled without the SOFTUPDATES option. The observed symptom was a system hang at the first unmount attempt.
* Fix the lockmgr panic everyone is seeing at shutdown time.roberto2000-08-011-1/+2
| | | | | | | | | vput assumes curproc is the lock holder, but it's not true in this case. Thanks a lot Luoqi ! Submitted by: luoqi Tested by: phk
* Minor change: fix warning - move a 'struct vnode *vp' declaration inside apeter2000-07-281-0/+2
| | | | #ifdef DIAGNOSTIC to match its corresponding usage.
* Clean up the snapshot code so that it no longer depends on the use ofmckusick2000-07-262-8/+9
| | | | | | | | | | | | | | the SF_IMMUTABLE flag to prevent writing. Instead put in explicit checking for the SF_SNAPSHOT flag in the appropriate places. With this change, it is now possible to rename and link to snapshot files. It is also possible to set or clear any of the owner, group, or other read bits on the file, though none of the write or execute bits can be set. There is also an explicit test to prevent the setting or clearing of the SF_SNAPSHOT flag via chflags() or fchflags(). Note also that the modify time cannot be changed as it needs to accurately reflect the time that the snapshot was taken. Submitted by: Robert Watson <rwatson@FreeBSD.org>
* Add stub for softdep_flushworklist() so that kernels compiledmckusick2000-07-251-0/+10
| | | | | | without the SOFTUPDATES option will load correctly. Obtained from: John Baldwin <jhb@bsdi.com>
* This patch corrects the first round of panics and hangs reportedmckusick2000-07-244-68/+122
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | with the new snapshot code. Update addaliasu to correctly implement the semantics of the old checkalias function. When a device vnode first comes into existence, check to see if an anonymous vnode for the same device was created at boot time by bdevvp(). If so, adopt the bdevvp vnode rather than creating a new vnode for the device. This corrects a problem which caused the kernel to panic when taking a snapshot of the root filesystem. Change the calling convention of vn_write_suspend_wait() to be the same as vn_start_write(). Split out softdep_flushworklist() from softdep_flushfiles() so that it can be used to clear the work queue when suspending filesystem operations. Access to buffers becomes recursive so that snapshots can recursively traverse their indirect blocks using ffs_copyonwrite() when checking for the need for copy on write when flushing one of their own indirect blocks. This eliminates a deadlock between the syncer daemon and a process taking a snapshot. Ensure that softdep_process_worklist() can never block because of a snapshot being taken. This eliminates a problem with buffer starvation. Cleanup change in ffs_sync() which did not synchronously wait when MNT_WAIT was specified. The result was an unclean filesystem panic when doing forcible unmount with heavy filesystem I/O in progress. Return a zero'ed block when reading a block that was not in use at the time that a snapshot was taken. Normally, these blocks should never be read. However, the readahead code will occationally read them which can cause unexpected behavior. Clean up the debugging code that ensures that no blocks be written on a filesystem while it is suspended. Snapshots must explicitly label the blocks that they are writing during the suspension so that they do not cause a `write on suspended filesystem' panic. Reorganize ffs_copyonwrite() to eliminate a deadlock and also to prevent a race condition that would permit the same block to be copied twice. This change eliminates an unexpected soft updates inconsistency in fsck caused by the double allocation. Use bqrelse rather than brelse for buffers that will be needed soon again by the snapshot code. This improves snapshot performance.
* Prevent possible dereference of NULL pointer.bp2000-07-131-1/+1
| | | | Submitted by: Marius Bendiksen <mbendiks@eunet.no>
* Brain fault, forgot to update ffs_snapshot.c with the new calling conventionmckusick2000-07-121-4/+5
| | | | for vn_start_write.
* Add snapshots to the fast filesystem. Most of the changes supportmckusick2000-07-119-175/+1332
| | | | | | | | | | | | | | | | | | | | the gating of system calls that cause modifications to the underlying filesystem. The gating can be enabled by any filesystem that needs to consistently suspend operations by adding the vop_stdgetwritemount to their set of vnops. Once gating is enabled, the function vfs_write_suspend stops all new write operations to a filesystem, allows any filesystem modifying system calls already in progress to complete, then sync's the filesystem to disk and returns. The function vfs_write_resume allows the suspended write operations to begin again. Gating is not added by default for all filesystems as for SMP systems it adds two extra locks to such critical kernel paths as the write system call. Thus, gating should only be added as needed. Details on the use and current status of snapshots in FFS can be found in /sys/ufs/ffs/README.snapshot so for brevity and timelyness is not included here. Unless and until you create a snapshot file, these changes should have no effect on your system (famous last words).
* Clean up warning about undeclared function by declaring softdep_fsyncmckusick2000-07-111-0/+3
| | | | | | in mount.h instead of ffs_extern.h. The correct solution is to use an indirect function pointer so that the kernel does not have to be built with options FFS, but that will be left for another day.
* Delete README as it is now obsolete. Relevant information is inmckusick2000-07-081-322/+0
| | | | README.softupdates.
* Update to reflect current status.mckusick2000-07-081-4/+42
|
* Get userland visible flags added for snapshots to give a few daysmckusick2000-07-041-1/+26
| | | | | | advance preparation for them to get migrated into place so that subsequent changes in utilities will not fail to compile for lack of up-to-date header files in /usr/include.
* Make the two calls from kern/* into softupdates #ifdef SOFTUPDATES,phk2000-07-031-15/+0
| | | | | | | that is way cleaner than using the softupdates_stub stunt, which should be killed when convenient. Discussed with: mckusick
* Remove obsoleted info about linking from contribache2000-06-241-16/+2
|
* Update to new copyright.mckusick2000-06-222-46/+12
|
* When running with quotas enabled on a filesystem using soft updates,mckusick2000-06-181-2/+3
| | | | | | | | the system would panic when a user's inode quota was exceeded (see PR 18959 for details). This fixes that problem. PR: 18959 Submitted by: Jason Godsey <jason@unixguy.fidalgo.net>
* Some additional performance improvements. When freeing an inodemckusick2000-06-181-8/+18
| | | | | | | | | | check to see if it has been committed to disk. If it has never been written, it can be freed immediately. For short lived files this change allows the same inode to be reused repeatedly. Similarly, when upgrading a fragment to a larger size, if it has never been claimed by an inode on disk, it too can be freed immediately making it available for reuse often in the next slowly growing block of the same file.
* Revert part of my bioops change which implemented panic(8).phk2000-06-161-2/+0
|
* ARGH! I have too many source trees :-(phk2000-06-163-4/+4
| | | | Fix prototype errors in last commit.
* Virtualizes & untangles the bioops operations vector.phk2000-06-163-4/+21
| | | | Ref: Message-ID: <18317.961014572@critter.freebsd.dk> To: current@
* Remove a comment which should never have made it in.phk2000-06-141-1/+0
|
* o If FFS_EXTATTR is defined, don't print out an error message on unmountrwatson2000-06-041-3/+4
| | | | | | if an FFS partition returns EOPNOTSUPP, as it just means extended attributes weren't enabled on that partition. Prevents spurious warning per-partition at shutdown.
* Back out the previous change to the queue(3) interface.jake2000-05-262-25/+25
| | | | | | It was not discussed and should probably not happen. Requested by: msmith and others
* Change the way that the queue(3) structures are declared; don't assume thatjake2000-05-232-25/+25
| | | | | | | | the type argument to *_HEAD and *_ENTRY is a struct. Suggested by: phk Reviewed by: phk Approved by: mdodd
* s/ffs_unmonut/ffs_unmount/ in a gratuitous ufs_extattr printf.rwatson2000-05-071-1/+1
| | | | Reported by: knu
* Separate the struct bio related stuff out of <sys/buf.h> intophk2000-05-057-0/+7
| | | | | | | | | | | | | | | <sys/bio.h>. <sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall not be made a nested include according to bdes teachings on the subject of nested includes. Diskdrivers and similar stuff below specfs::strategy() should no longer need to include <sys/buf.> unless they need caching of data. Still a few bogus uses of struct buf to track down. Repocopy by: peter
* Remove unneeded #include <vm/vm_zone.h>phk2000-04-301-1/+0
| | | | Generated by: src/tools/tools/kerninclude
* s/biowait/bufwait/gphk2000-04-291-1/+1
| | | | Prodded by: several.
* Remove unneeded #include <sys/kernel.h>phk2000-04-291-1/+0
|
* Remove ~25 unneeded #include <sys/conf.h>phk2000-04-191-1/+0
| | | | Remove ~60 unneeded #include <sys/malloc.h>
* Introduce extended attribute support for FFS, allowing arbitraryrwatson2000-04-155-0/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | (name, value) pairs to be associated with inodes. This support is used for ACLs, MAC labels, and Capabilities in the TrustedBSD security extensions, which are currently under development. In this implementation, attributes are backed to data vnodes in the style of the quota support in FFS. Support for FFS extended attributes may be enabled using the FFS_EXTATTR kernel option (disabled by default). Userland utilities and man pages will be committed in the next batch. VFS interfaces and man pages have been in the repo since 4.0-RELEASE and are unchanged. o ufs/ufs/extattr.h: UFS-specific extattr defines o ufs/ufs/ufs_extattr.c: bulk of support routines o ufs/{ufs,ffs,mfs}/*.[ch]: hooks and extattr.h includes o contrib/softupdates/ffs_softdep.c: extattr.h includes o conf/options, conf/files, i386/conf/LINT: added FFS_EXTATTR o coda/coda_vfsops.c: XXX required extattr.h due to ufsmount.h (This should not be the case, and will be fixed in a future commit) Currently attributes are not supported in MFS. This will be fixed. Reviewed by: adrian, bp, freebsd-fs, other unthanked souls Obtained from: TrustedBSD Project
* Move B_ERROR flag to b_ioflags and call it BIO_ERROR.phk2000-04-022-2/+3
| | | | | | | | | | | | | (Much of this done by script) Move B_ORDERED flag to b_ioflags and call it BIO_ORDERED. Move b_pblkno and b_iodone_chain to struct bio while we transition, they will be obsoleted once bio structs chain/stack. Add bio_queue field for struct bio aware disksort. Address a lot of stylistic issues brought up by bde.
* Rename the existing BUF_STRATEGY() to DEV_STRATEGY()phk2000-03-202-13/+12
| | | | | | | | substitute BUF_WRITE(foo) for VOP_BWRITE(foo->b_vp, foo) substitute BUF_STRATEGY(foo) for VOP_STRATEGY(foo->b_vp, foo) This patch is machine generated except for the ccd.c and buf.h parts.
* Remove B_READ, B_WRITE and B_FREEBUF and replace them with a newphk2000-03-202-2/+2
| | | | | | | | | | | | | | | | | | | | | field in struct buf: b_iocmd. The b_iocmd is enforced to have exactly one bit set. B_WRITE was bogusly defined as zero giving rise to obvious coding mistakes. Also eliminate the redundant struct buf flag B_CALL, it can just as efficiently be done by comparing b_iodone to NULL. Should you get a panic or drop into the debugger, complaining about "b_iocmd", don't continue. It is likely to write on your disk where it should have been reading. This change is a step in the direction towards a stackable BIO capability. A lot of this patch were machine generated (Thanks to style(9) compliance!) Vinum users: Greg has not had time to test this yet, be careful.
* Use 64-bit math to calculate if we have hit our freespace limit.mckusick2000-03-171-1/+2
| | | | Necessary for coherent results on filesystems bigger than 0.5Tb.
* Use 64-bit math to decide if optimization needs to be changed.mckusick2000-03-151-30/+48
| | | | | | Necessary for coherent results on filesystems bigger than 0.5Tb. Submitted by: Paul Saab <ps@yahoo-inc.com>
* Fix a 'freeing free block' panic in UFS. The problem occurs when thedillon2000-02-241-1/+24
| | | | | | | | | | | | | | filesystem fills up. If the first indirect block exists and FFS is able to allocate deeper indirect blocks, but is not able to allocate the data block, FFS improperly unwinds the indirect blocks and leaves a block pointer hanging to a freed block. This will cause a panic later when the file is removed. The solution is to properly account for the first block-pointer-to-an-indirect-block we had to create in a balloc operation and then unwind it if a failure occurs. Detective work by: Ian Dowse <iedowse@maths.tcd.ie> Reviewed by: mckusick, Ian Dowse <iedowse@maths.tcd.ie> Approved by: jkh
* When writing out bitmap buffers, need to skip over ones that alreadymckusick2000-01-301-1/+2
| | | | | | | have a write in progress. Otherwise one can get in an infinite loop trying to get them all flushed. Submitted by: Matthew Dillon <dillon@apollo.backplane.com>
OpenPOWER on IntegriCloud