path: root/sys/ufs
Commit message | Author | Age | Files | Lines
* When writing out bitmap buffers, need to skip over ones that already have a write in progress. Otherwise one can get in an infinite loop trying to get them all flushed.
  Submitted by: Matthew Dillon <dillon@apollo.backplane.com>
  [Author: mckusick | Date: 2000-01-30 | Files: 1 | Lines: -1/+2]
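The fix described above amounts to one extra guard in the flush loop. A minimal sketch, assuming a hypothetical `struct sketch_buf` with `dirty` and `write_in_progress` flags; these names are illustrative and are not taken from the FreeBSD source:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bitmap buffer; not the kernel's struct buf. */
struct sketch_buf {
    int dirty;              /* needs to be written */
    int write_in_progress;  /* a write has already been started */
};

/*
 * Start writes for dirty buffers, skipping any whose write is already
 * under way. Returns the number of writes started by this pass.
 */
int flush_bitmaps(struct sketch_buf *bufs, size_t n)
{
    int started = 0;

    assert(bufs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!bufs[i].dirty)
            continue;
        if (bufs[i].write_in_progress)
            continue;   /* already being written: do not retry it */
        bufs[i].write_in_progress = 1;
        started++;
    }
    return started;
}
```

A caller that repeats the pass until no new write is started terminates in this model, because a buffer that is already being written is never picked up again.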
* During fastpath processing for removal of a short-lived inode, the set of restrictions for cancelling an inode dependency (inodedep) is somewhat stronger than originally coded. Since this check appears in two places, we codify it into the function check_inode_unwritten which we then call from the two sites, one freeing blocks and the other freeing directory entries.
  Submitted by: Steinar Haug via Matthew Dillon
  [Author: mckusick | Date: 2000-01-18 | Files: 1 | Lines: -48/+56]
* Need to reorganize the flushing of directory entry (pagedep) dependencies so that they never try to lock an inode corresponding to ".." as this can lead to deadlock. We observe that any inode with an updated link count is always pushed into its buffer at the time of the link count change, so we do not need to do a VOP_UPDATE, but merely find its buffer and write it. The only time we need to get the inode itself is from the result of a mkdir whose name will never be ".." and hence locking such an inode will never request a lock above us in the filesystem tree.
  Thanks to Brian Fundakowski Feldman for providing the test program that tickled soft updates into hanging in "inode" sleep.
  Submitted by: Brian Fundakowski Feldman <green@FreeBSD.org>
  [Author: mckusick | Date: 2000-01-18 | Files: 1 | Lines: -63/+62]
* Better bounding on softdep_flushfiles; other minor tweaks to checks.
  [Author: mckusick | Date: 2000-01-17 | Files: 1 | Lines: -7/+9]
* Must track multiple uncommitted renames until one ultimately gets committed to disk or is removed.
  [Author: mckusick | Date: 2000-01-17 | Files: 1 | Lines: -22/+65]
* Non-operational change, fix compiler warning.
  Reviewed by: mckusick
  [Author: dillon | Date: 2000-01-14 | Files: 1 | Lines: -1/+1]
* Confirming Peter's fix (locking 101: release the lock before you go to sleep). Locking 101, part 2: do not look at buffer contents after you have been asleep. There is no telling what wondrous changes may have occurred.
  [Author: mckusick | Date: 2000-01-13 | Files: 1 | Lines: -2/+0]
* Free the global softupdates lock prior to tsleep() in getdirtybuf(). This seems to be responsible for a bunch of panics where the process sleeps and something else finds softupdates "locked" when it shouldn't be.
  This commit is unreviewed, but has been a big help here. Previously my boxes would panic pretty much on the first fsync() that wrote something to disk.
  [Author: peter | Date: 2000-01-13 | Files: 1 | Lines: -0/+2]
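The two "locking 101" rules from this fix and its follow-up can be illustrated with a single-threaded model. This is only a sketch: `sketch_sleep()` stands in for tsleep() and simulates another thread changing the buffer, and every name in it is hypothetical rather than taken from getdirtybuf().

```c
#include <assert.h>

/* Hypothetical single-threaded model; not kernel code. */
struct sketch_buf {
    int busy;   /* someone else is using the buffer */
    int data;   /* contents, which may change while we sleep */
};

int lock_held;  /* models the global soft updates lock */

void acquire_lock(void) { assert(!lock_held); lock_held = 1; }
void release_lock(void) { assert(lock_held);  lock_held = 0; }

/*
 * Stand-in for tsleep(): models another thread that takes the lock,
 * rewrites the buffer and clears busy while we are asleep.
 */
void sketch_sleep(struct sketch_buf *bp)
{
    acquire_lock();     /* trips the assertion if the sleeper kept the lock */
    bp->data = 99;
    bp->busy = 0;
    release_lock();
}

int read_when_idle(struct sketch_buf *bp)
{
    int value;

    acquire_lock();
    while (bp->busy) {
        release_lock();     /* rule 1: drop the lock before sleeping */
        sketch_sleep(bp);
        acquire_lock();
        /* rule 2: the loop re-tests bp->busy; nothing is cached */
    }
    value = bp->data;       /* read only after the final re-check */
    release_lock();
    return value;
}
```

In this model the value returned is the one written during the sleep, not whatever the buffer held beforehand, which is the point of the second rule.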
* Because cylinder group blocks are now written in background, it is no longer sufficient to get a lock on a buffer to know that its write has been completed. We have to first get the lock on the buffer, then check to see if it is doing a background write. If it is doing a background write, we have to wait for the background write to finish, then check to see if that fulfilled our dependency, and if not, start another write. Luckily the explanation is longer than the fix.
  [Author: mckusick | Date: 2000-01-13 | Files: 1 | Lines: -3/+13]
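The sequence described above (lock, check for a background write, wait, re-check the dependency, write again if needed) can be sketched as a small model. All structure fields and helper names here are hypothetical, and writes complete synchronously for simplicity, which is not how the kernel behaves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a cylinder group buffer; not the kernel's struct buf. */
struct sketch_buf {
    int bg_write_active;   /* a background copy is currently being written */
    int bg_covers_dep;     /* that copy already contained our update */
    int dep_satisfied;     /* our dependency has reached the disk */
    int writes_started;    /* extra writes issued by the caller */
};

/* Stand-in for sleeping until the background write completes. */
void wait_for_bg_write(struct sketch_buf *bp)
{
    bp->bg_write_active = 0;
    if (bp->bg_covers_dep)
        bp->dep_satisfied = 1;
}

/* Stand-in for issuing a fresh write; it completes at once in this model. */
void start_write(struct sketch_buf *bp)
{
    bp->writes_started++;
    bp->dep_satisfied = 1;
}

/* Assumed to be called with the buffer lock already held. */
void flush_dependency(struct sketch_buf *bp)
{
    assert(bp != NULL);
    if (bp->bg_write_active)
        wait_for_bg_write(bp);      /* holding the lock is not enough */
    if (!bp->dep_satisfied)
        start_write(bp);            /* the background copy missed our update */
}
```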
* A panic occurs during an fsync when a dirty block associated with a vnode has not been written (which would clear certain of its dependencies). The problem arises because fsync with MNT_NOWAIT no longer pushes all the dirty blocks associated with a vnode. It skips those that require rollbacks, since they will just get instantly dirty again. Such skipped blocks are marked so that they will not be skipped a second time (otherwise circular dependencies would never clear). So, we fsync twice to ensure that everything will be written at least once.
  [Author: mckusick | Date: 2000-01-13 | Files: 1 | Lines: -4/+7]
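Why two passes suffice can be seen in a toy model of the skip-once marking. This is a sketch under simplifying assumptions: the `sketch_blk` fields are hypothetical, and a written block simply becomes clean instead of being re-dirtied by a rollback.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a dirty block; not the kernel's struct buf. */
struct sketch_blk {
    int dirty;
    int needs_rollback;   /* the first pass prefers to skip these */
    int skipped_once;     /* mark: do not skip a second time */
};

/* One MNT_NOWAIT-style pass. Returns the number of blocks written. */
int flush_pass(struct sketch_blk *blks, size_t n)
{
    int written = 0;

    assert(blks != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        struct sketch_blk *b = &blks[i];

        if (!b->dirty)
            continue;
        if (b->needs_rollback && !b->skipped_once) {
            b->skipped_once = 1;    /* skip now, but never again */
            continue;
        }
        b->dirty = 0;               /* model: the write succeeds */
        written++;
    }
    return written;
}

/* Two passes guarantee each dirty block is written at least once. */
int flush_twice(struct sketch_blk *blks, size_t n)
{
    int first = flush_pass(blks, n);
    int second = flush_pass(blks, n);

    return first + second;
}
```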
* The only known cause of this panic is running out of disk space. The problem occurs when an indirect block and a data block are being allocated at the same time. For example when the 13th block of the file is written, the filesystem needs to allocate the first indirect block and a data block. If the indirect block allocation succeeds, but the data block allocation fails, the error code deallocates the indirect block as it has nothing at which to point. Unfortunately, it does not deallocate the indirect block's associated dependencies which then fail when they find the block unexpectedly gone (ptr == 0 instead of its expected value).
  The fix is to fsync the file before doing the block rollback, as the fsync will flush out all of the dependencies. Once the rollback is done the file must be fsync'ed again so that the soft updates code does not find unexpected changes. This approach is much slower than writing the code to back out the extraneous dependencies, but running out of disk space is not expected to be a common occurrence, so just getting it right is the main criterion.
  PR: kern/15063
  Submitted by: Assar Westerlund <assar@stacken.kth.se>
  [Author: mckusick | Date: 2000-01-11 | Files: 1 | Lines: -3/+13]
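The "13th block" in the example above follows from a UFS inode holding 12 direct block pointers. A short sketch of the arithmetic, using zero-based logical block numbers; the function names are hypothetical and only the first (single) indirect level is modeled:

```c
#include <assert.h>

#define SKETCH_NDADDR 12   /* direct block pointers per UFS inode */

/* lbn is zero-based, so the 13th block of a file is lbn 12. */
int needs_indirect(long lbn)
{
    assert(lbn >= 0);
    return lbn >= SKETCH_NDADDR;
}

/*
 * Number of blocks that must be allocated to write block lbn.
 * Only the single-indirect range is modeled: a direct block costs one
 * allocation, and a block behind the first indirect block costs one
 * more if that indirect block does not exist yet.
 */
int blocks_to_allocate(long lbn, int have_first_indirect)
{
    if (!needs_indirect(lbn))
        return 1;
    return have_first_indirect ? 1 : 2;
}
```

The failure case in the commit message is the two-allocation path: the first allocation (the indirect block) succeeds and the second (the data block) fails.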
* We cannot proceed to free the blocks of the file until the dependencies have been cleaned up by deallocate_dependencies(). Once that is done, it is safe to post the request to free the blocks. A similar change is also needed for the freefile case.
  [Author: mckusick | Date: 2000-01-11 | Files: 1 | Lines: -29/+32]
* Give vn_isdisk() a second argument where it can return a suitable errno.
  Suggested by: bde
  [Author: phk | Date: 2000-01-10 | Files: 3 | Lines: -11/+9]
* Missing FREE_LOCK call before handle_workitem_freeblocks.
  Submitted by: "Kenneth D. Merry" <ken@kdm.org>
  [Author: mckusick | Date: 2000-01-10 | Files: 1 | Lines: -3/+5]
* Several performance improvements for soft updates have been added:
  1) Fastpath deletions. When a file is being deleted, check to see if it was so recently created that its inode has not yet been written to disk. If so, the delete can proceed to immediately free the inode.
  2) Background writes: No file or block allocations can be done while the bitmap is being written to disk. To avoid these stalls, the bitmap is copied to another buffer which is written, thus leaving the original available for further allocations.
  3) Link count tracking. Constantly track the difference in i_effnlink and i_nlink so that inodes that have had no change other than i_effnlink need not be written.
  4) Identify buffers with rollback dependencies so that the buffer flushing daemon can choose to skip over them.
  [Author: mckusick | Date: 2000-01-10 | Files: 8 | Lines: -115/+285]
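Item 2 above (background writes) is essentially copy-then-write. A minimal sketch, assuming a hypothetical fixed-size `struct sketch_bitmap`; the real code operates on cylinder group buffers, and none of these names come from the source:

```c
#include <assert.h>
#include <string.h>

#define SKETCH_BITMAP_BYTES 16   /* arbitrary size for the sketch */

struct sketch_bitmap {
    unsigned char bits[SKETCH_BITMAP_BYTES];
};

/* Copy the live bitmap; the copy is what would be written to disk. */
void snapshot_for_write(const struct sketch_bitmap *live,
                        struct sketch_bitmap *copy)
{
    assert(live != NULL && copy != NULL);
    memcpy(copy->bits, live->bits, sizeof copy->bits);
}

/* Mark block n allocated in the given bitmap. */
void allocate_block(struct sketch_bitmap *bm, unsigned n)
{
    assert(n < SKETCH_BITMAP_BYTES * 8);
    bm->bits[n / 8] |= (unsigned char)(1u << (n % 8));
}

/* Return 1 if block n is marked allocated, else 0. */
int is_allocated(const struct sketch_bitmap *bm, unsigned n)
{
    assert(n < SKETCH_BITMAP_BYTES * 8);
    return (bm->bits[n / 8] >> (n % 8)) & 1;
}
```

Allocations made in the live bitmap after the snapshot do not appear in the copy, so the write of the copy never has to block them.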
* Keep tighter control of removal dependencies by limiting the number of dirrem structures rather than the collaterally created freeblks and freefile structures. Limit the rate of buffer dirtying by the syncer process during periods of intense file removal.
  [Author: mckusick | Date: 2000-01-09 | Files: 1 | Lines: -20/+22]
* Reorganize softdep_fsync so that it only does the inode-is-flushed check before the inode is unlocked while grabbing its parent directory. Once it is unlocked, other operations may slip in that could make the inode-is-flushed check fail. Allowing other writes to the inode before returning from fsync does not break the semantics of fsync since we have flushed everything that was dirty at the time of the fsync call.
  [Author: mckusick | Date: 2000-01-09 | Files: 1 | Lines: -26/+22]
* Get rid of unreferenced function.
  [Author: mckusick | Date: 2000-01-09 | Files: 1 | Lines: -9/+0]
* Make static non-exported functions from soft updates.
  [Author: mckusick | Date: 2000-01-09 | Files: 2 | Lines: -11/+12]
* Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL" is an application space macro and the applications are supposed to be free to use it as they please (but cannot). This is consistent with the other BSDs, which made this change quite some time ago. More commits to come.
  [Author: peter | Date: 1999-12-29 | Files: 4 | Lines: -8/+8]
* Update the unclean flag for mount -u. I forgot to handle this case when I made the absence of the clean flag sticky in rev.1.88. This was a problem mainly for "mount /". There is no way to mount "/" for writing without using mount -u (normally implicitly), so after "mount -f /" of an unclean filesystem, the absence of the clean flag was sticky forever.
  [Author: bde | Date: 1999-12-23 | Files: 1 | Lines: -0/+2]
* Change incorrect NULLs to 0s
  [Author: eivind | Date: 1999-12-21 | Files: 1 | Lines: -1/+1]
* Second pass commit to introduce new ACL and Extended Attribute system calls, vnops, vfsops, both in /kern, and to individual file systems that require a vfsop_ array entry.
  Reviewed by: eivind
  [Author: rwatson | Date: 1999-12-19 | Files: 2 | Lines: -0/+4]
* The function request_cleanup() had a tsleep() with PCATCH. It is quite dangerous, since the process may hold locks at that point, and if it is stopped in that tsleep the machine may hang. Because the sleep is so short, the PCATCH is not required here, so it has been removed. For the future, the FreeBSD team needs to decide whether it is still reasonable to stop a process in tsleep, as that may affect any other code that uses PCATCH while holding kernel locks.
  Submitted by: Dmitrij Tejblum <tejblum@arc.hq.cti.ru>
  Reviewed by: Kirk McKusick <mckusick@mckusick.com>
  [Author: mckusick | Date: 1999-12-16 | Files: 1 | Lines: -1/+1]
* Introduce NDFREE (and remove VOP_ABORTOP)
  [Author: eivind | Date: 1999-12-15 | Files: 3 | Lines: -43/+5]
* Lock reporting and assertion changes.
    * lockstatus() and VOP_ISLOCKED() get a new process argument and a new return value: LK_EXCLOTHER, when the lock is held exclusively by another process.
    * The ASSERT_VOP_(UN)LOCKED family is extended to use what this gives them.
    * Extend the vnode_if.src format to allow more exact specification than locked/unlocked.
  This commit should not do any semantic changes unless you are using DEBUG_VFS_LOCKS.
  Discussed with: grog, mch, peter, phk
  Reviewed by: peter
  [Author: eivind | Date: 1999-12-11 | Files: 1 | Lines: -1/+1]
* Remove the 'alpha, use at your own risk' death-statement.
  Reviewed by: mckusick (verbally at FreeBSDcon)
  [Author: billf | Date: 1999-12-03 | Files: 1 | Lines: -4/+1]
* Fix typo, add $FreeBSD$
  [Author: billf | Date: 1999-12-03 | Files: 1 | Lines: -1/+3]
* Preferentially allocate the first indirect block in the same cylinder group as the inode. This makes a 15% difference in read speed for files in the 96K to 500K size range.
  [Author: mckusick | Date: 1999-12-01 | Files: 1 | Lines: -1/+1]
* Retire MFS_ROOT and MFS_ROOT_SIZE options from the MFS implementation.
  Add MD_ROOT and MD_ROOT_SIZE options to the md driver. Make the md driver handle MFS_ROOT and MFS_ROOT_SIZE options for compatibility. Add md driver to GENERIC, PCCARD and LINT.
  This is a cleanup which removes the need for some of the worse hacks in MFS: We really want to have a rootvnode but MFS on a preloaded image doesn't really have one. md is a true device, so it is less trouble.
  This has been tested with make release, and if people remember to add the "md" pseudo-device to their kernels, PicoBSD should be just fine as well.
  If people have no other use for MFS, it can be removed from the kernel.
  [Author: phk | Date: 1999-11-26 | Files: 1 | Lines: -121/+0]
* Convert various pieces of code to use vn_isdisk() rather than checking for vp->v_type == VBLK.
  In ccd: we don't need to call VOP_GETATTR to find the type of a vnode.
  Reviewed by: sos
  [Author: phk | Date: 1999-11-22 | Files: 3 | Lines: -8/+8]
* We do not have ffs_checkexp, so remove the prototype
  [Author: eivind | Date: 1999-11-20 | Files: 1 | Lines: -2/+0]
* struct mountlist and struct mount.mnt_list have no business being a CIRCLEQ. Change them to TAILQ_HEAD and TAILQ_ENTRY respectively. This removes ugly mp != (void*)&mountlist comparisons.
  Requested by: phk
  Submitted by: Jake Burkholder jake@checker.org
  PR: 14967
  [Author: phk | Date: 1999-11-20 | Files: 1 | Lines: -2/+1]
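The comparison this change removes comes from how the two queue types end a traversal: a CIRCLEQ traversal ends when the cursor points back at the list head, while a TAILQ traversal ends at NULL. A small sketch using the `<sys/queue.h>` TAILQ macros, with a hypothetical `struct sketch_mount` standing in for struct mount:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical element type; the real list links struct mount. */
struct sketch_mount {
    int id;
    TAILQ_ENTRY(sketch_mount) link;
};

TAILQ_HEAD(sketch_mountlist, sketch_mount);

/* Walk the list; the loop ends on NULL, with no cast of the head. */
int count_mounts(struct sketch_mountlist *head)
{
    int n = 0;
    struct sketch_mount *mp;

    assert(head != NULL);
    for (mp = TAILQ_FIRST(head); mp != NULL; mp = TAILQ_NEXT(mp, link))
        n++;
    return n;
}
```

With a CIRCLEQ the same loop condition would have to compare `mp` against the address of the head cast to a generic pointer, which is the idiom the commit message calls ugly.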
* Fix a warning (unused static declaration without MFS_ROOT)
  [Author: peter | Date: 1999-11-18 | Files: 1 | Lines: -0/+2]
* Remove WILLRELE from VOP_SYMLINK
  Note: Previous commit to these files (except coda_vnops and devfs_vnops) that claimed to remove WILLRELE from VOP_RENAME actually removed it from VOP_MKNOD.
  [Author: eivind | Date: 1999-11-13 | Files: 1 | Lines: -1/+2]
* Remove WILLRELE from VOP_RENAME
  [Author: eivind | Date: 1999-11-12 | Files: 1 | Lines: -2/+6]
* Next step in the device cleanup process.
  Correctly lock vnodes when calling VOP_OPEN() from filesystem mount code.
  Unify spec_open() for bdev and cdev cases.
  Remove the disabled bdev specific read/write code.
  [Author: phk | Date: 1999-11-09 | Files: 1 | Lines: -0/+2]
* Quick fix for breakage of ext2fs link counts as reported by stat(2) by the soft updates changes: only report the link count to be i_effnlink in ufs_getattr() for file systems that maintain i_effnlink.
  Tested by: Mike Dracopoulos <mdraco@math.uoa.gr>
  [Author: bde | Date: 1999-11-03 | Files: 3 | Lines: -1/+4]
* Make MFS work with the new root filesystem search process.
  In order to achieve this, root filesystem mount is moved from SI_ORDER_FIRST to SI_ORDER_SECOND in the SI_SUB_MOUNT_ROOT sysinit group. Now, modules which wish to usurp the default root mount can use SI_ORDER_FIRST.
  A compiled-in or preloaded MFS filesystem will become the root filesystem unless the vfs.root.mountfrom environment variable refers to a valid bootable device. This will normally only be the case when the kernel and MFS image have been loaded from a disk which has a valid /etc/fstab file. In this case, the variable should be manually overridden in the loader, or the kernel booted with -a. In either case "mfs:" should be supplied as the new value.
  Also fix a typo in one DFLTROOT case that would not have compiled.
  [Author: msmith | Date: 1999-11-03 | Files: 1 | Lines: -14/+30]
* Newline-terminate the complaint message about not being able to find the root vnode pointer.
  [Author: msmith | Date: 1999-11-01 | Files: 2 | Lines: -2/+2]
* Add sysctl debug.dircheck to allow directory sanity checking to be turned on with a sysctl. Fix two bugs in ufs_lookup that can cause deadlocks due to out-of-order locking. This fix was tested for a few days prior to commit.
  [Author: dillon | Date: 1999-10-30 | Files: 1 | Lines: -0/+11]
* useracc() the prequel:
  Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs.
  This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE} as argument.
  [Author: phk | Date: 1999-10-29 | Files: 2 | Lines: -2/+0]
* Remove the D_NOCLUSTER[RW] options which were added because vn had problems. Now that Matt has fixed vn, this can go. The vn driver should have used d_maxio (now si_iosize_max) anyway.
  [Author: phk | Date: 1999-09-30 | Files: 1 | Lines: -15/+0]
* Remove v_maxio from struct vnode. Replace it with mnt_iosize_max in struct mount.
  Nits from: bde
  [Author: phk | Date: 1999-09-29 | Files: 2 | Lines: -2/+5]
* sigset_t change (part 2 of 5)
  The core of the signalling code has been rewritten to operate on the new sigset_t. No methodological changes have been made. Most references to a sigset_t object are through macros (see signalvar.h) to create a level of abstraction and to provide a basis for further improvements.
  The NSIG constant has not been changed to reflect the maximum number of signals possible. The reason is that it breaks programs (especially shells) which assume that all signals have a non-null name in sys_signame. See src/bin/sh/trap.c for an example. Instead _SIG_MAXSIG has been introduced to hold the maximum signal possible with the new sigset_t.
  struct sigprop has been moved from signalvar.h to kern_sig.c because a) it is only used there, and b) access must be done through function sigprop(). The latter because the table doesn't hold properties for all signals, but only for the first NSIG signals.
  signal.h has been reorganized to make reading easier and to add the new and/or modified structures. The "old" structures are moved to signalvar.h to prevent namespace pollution.
  Especially the coda filesystem suffers from the change, because it contained lines like (p->p_sigmask == SIGIO), which is easy to do for integral types, but not for compound types.
  NOTE: kdump (and port linux_kdump) must be recompiled.
  Thanks to Garrett Wollman and Daniel Eischen for pressing the importance of changing sigreturn as well.
  [Author: marcel | Date: 1999-09-29 | Files: 1 | Lines: -3/+6]
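The coda remark above reflects a general property of compound signal sets: they cannot be compared with `==` and have to be queried through accessors. The sketch below is a userland analogy using the POSIX `sigset_t` functions, not the kernel macros from signalvar.h; the helper name `mask_contains` is hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/*
 * Return 1 if sig is a member of *mask, else 0.
 * sigismember() returns 1, 0, or -1 on error, so compare against 1.
 */
int mask_contains(const sigset_t *mask, int sig)
{
    assert(mask != NULL);
    return sigismember(mask, sig) == 1;
}
```

A set is built with `sigemptyset()` and `sigaddset()` and then queried with the helper, for example `mask_contains(&m, SIGINT)`; an expression such as `m == SIGINT` does not compile on systems where `sigset_t` is a structure.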
* Remove five now unused fields from struct cdevsw. They should never have been there in the first place. A GENERIC kernel shrinks almost 1k.
  Add a slightly different safetybelt under nostop for tty drivers.
  Add some missing FreeBSD tags
  [Author: phk | Date: 1999-09-25 | Files: 1 | Lines: -5/+0]
* More removals of vnode->v_lastr, replaced by preexisting seqcount heuristic to detect sequential operation.
  VM-related forced clustering code removed from ufs in preparation for a commit to vm/vm_fault.c that does it more generally.
  Reviewed by: David Greenman <dg@root.com>, Alan Cox <alc@cs.rice.edu>
  [Author: dillon | Date: 1999-09-20 | Files: 1 | Lines: -78/+11]
* Fix a harmless bug I introduced, simplify a bit more while here.
  [Author: phk | Date: 1999-09-20 | Files: 1 | Lines: -6/+4]
* Step one of replacing devsw->d_maxio with si_bsize_max.
  Rename dev->si_bsize_max to si_iosize_max and set it in spec_open if the device didn't.
  Set vp->v_maxio from dev->si_bsize_max in spec_open rather than in ufs_bmap.c
  [Author: phk | Date: 1999-09-20 | Files: 2 | Lines: -37/+6]
* Removed diskerr()'s unused d_name arg and updated callers. This fixes warnings caused by the arg having the wrong type (not const enough). The arg was also wrong (a full name instead of a short one) for calls from subr_diskmbr.c and pc98/diskslice_machdep.c.
  [Author: bde | Date: 1999-09-13 | Files: 1 | Lines: -2/+2]