path: root/sys/ufs
Commit log (each entry: commit message, then author, age, files changed, lines removed/added)
* Initialise the bioops vector hack at runtime rather than at link time. This avoids the use of common variables.
  Reviewed by: mckusick
  Author: msmith; Age: 2002-01-08; Files: 1; Lines: -8/+7
* Fix a BUF_TIMELOCK race against BUF_LOCK and fix a deadlock in vget() against VM_WAIT in the pageout code. Both fixes involve adjusting the lockmgr's timeout capability so locks obtained with timeouts do not interfere with locks obtained without a timeout.
  Hopefully MFC: before the 4.5 release
  Author: dillon; Age: 2001-12-20; Files: 2; Lines: -2/+2
* Change the atomic_set_char to atomic_set_int and atomic_clear_char to atomic_clear_int to ease the implementation for the sparc64.
  Requested by: Jake Burkholder <jake@locore.ca>
  Author: mckusick; Age: 2001-12-18; Files: 3; Lines: -13/+17
* Make sure we ignore the value of `fs_active' when reloading the superblock, and move the initialisation of it to beside where other pointer fields are initialised.
  Author: iedowse; Age: 2001-12-16; Files: 1; Lines: -1/+2
* Move the new superblock field `fs_active' into the region of the superblock that is already set up to handle pointer types. This fixes an accidental change in the superblock size on 64-bit platforms caused by revision 1.24.
  Author: iedowse; Age: 2001-12-16; Files: 1; Lines: -5/+7
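The size problem described in the entry above can be illustrated with a small C sketch. Everything here is hypothetical (`struct toy_sb`, `PTR_REGION_BYTES`, the field names); it is not the real superblock layout. The assumption is that a "region set up to handle pointer types" means a fixed number of bytes whose spare-slot count is derived from `sizeof(void *)`, so adding a pointer inside it consumes a spare slot instead of growing the structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: bytes reserved for in-core pointer fields. */
#define PTR_REGION_BYTES 128
/* Hypothetical: number of named pointer fields in the region. */
#define NPTRS 2
/* Spare slots shrink as pointers get wider, keeping the region fixed. */
#define NSPARE ((PTR_REGION_BYTES / sizeof(void *)) - NPTRS)

struct toy_sb {
    int32_t magic;
    int32_t pad0;           /* explicit padding before the pointer region */
    void   *ptr_a;          /* start of the pointer region */
    void   *ptr_b;
    void   *spare[NSPARE];  /* remainder of the pointer region */
    int32_t trailer;        /* first field after the region */
    int32_t pad1;
};

/* The field after the region sits at the same offset on 32- and 64-bit. */
static_assert(offsetof(struct toy_sb, trailer) == 8 + PTR_REGION_BYTES,
              "pointer region must occupy a fixed number of bytes");

/* Size of the pointer region as the compiler actually laid it out. */
size_t toy_sb_pointer_region_size(void)
{
    return offsetof(struct toy_sb, trailer) - offsetof(struct toy_sb, ptr_a);
}
```

Adding a third named pointer in this model means raising `NPTRS` to 3, which removes one spare slot and leaves every later offset unchanged. Placing the pointer outside the region would shift later fields by 4 bytes on one platform and 8 on another.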
* Minimize the time necessary to suspend operations on a filesystem when taking a snapshot. The two time consuming operations are scanning all the filesystem bitmaps to determine which blocks are in use and scanning all the other snapshots so as to be able to expunge their blocks from the view of the current snapshot.
  The bitmap scanning is broken into two passes. Before suspending the filesystem all bitmaps are scanned. After the suspension, those bitmaps that changed after being scanned the first time are rescanned. Typically there are few bitmaps that need to be rescanned.
  The expunging of other snapshots is now done after the suspension is released by observing that we can easily identify any blocks that were allocated to them after the suspension (they will be marked as `not needing to be copied' in the just created snapshot).
  For all the gory details, see the ``Running fsck in the Background'' paper in the Usenix BSDCon 2002 Conference Proceedings, pages 55-64.
  Author: mckusick; Age: 2001-12-14; Files: 4; Lines: -96/+209
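The two-pass bitmap scan described in the entry above can be modelled roughly as follows. This is a user-space sketch under stated assumptions, not the kernel code: the `toy_` names, the per-group `changed` flag and the fixed group count are all invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NCG 4  /* hypothetical number of cylinder groups */

struct toy_fs {
    bool changed[TOY_NCG];  /* set whenever a group's bitmap is modified */
    bool suspended;         /* true only while pass 2 is running */
};

struct toy_snapshot {
    size_t scans;           /* total bitmap scans performed so far */
};

/* Scan one group's bitmap and clear its "changed" flag. */
void toy_scan_group(struct toy_fs *fs, struct toy_snapshot *snap, size_t cg)
{
    assert(cg < TOY_NCG);
    fs->changed[cg] = false;
    snap->scans++;
}

/* Pass 1: scan every bitmap while the filesystem is still active. */
void toy_pass1(struct toy_fs *fs, struct toy_snapshot *snap)
{
    for (size_t cg = 0; cg < TOY_NCG; cg++)
        toy_scan_group(fs, snap, cg);
}

/* A write that lands between the passes marks its group as changed. */
void toy_modify(struct toy_fs *fs, size_t cg)
{
    assert(cg < TOY_NCG);
    fs->changed[cg] = true;
}

/* Pass 2: suspend, rescan only the changed groups, then resume.
 * Returns the number of groups that had to be rescanned. */
size_t toy_pass2(struct toy_fs *fs, struct toy_snapshot *snap)
{
    size_t rescanned = 0;

    fs->suspended = true;
    for (size_t cg = 0; cg < TOY_NCG; cg++) {
        if (fs->changed[cg]) {
            toy_scan_group(fs, snap, cg);
            rescanned++;
        }
    }
    fs->suspended = false;
    return rescanned;
}
```

In this model the work done under suspension is proportional to the number of groups touched between the passes, which matches the commit's observation that few bitmaps typically need rescanning.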
* When a file is partially truncated, we first check to see if the new file end will land in the middle of a file hole. Since the last block of a file must always be allocated, the hole is filled by allocating a block at that location. If the hole being filled is a direct block, then the truncation may eventually reduce the full sized block down to a fragment. When running with soft updates, it is necessary to FSYNC the file after allocating the block and before creating the fragment to avoid triggering a soft updates inconsistency when the block unexpectedly shrinks.
  Found by: Matthew Dillon <dillon@apollo.backplane.com>
  MFC after: 1 week
  Author: mckusick; Age: 2001-12-13; Files: 1; Lines: -0/+12
* Use 'mkdir -p /.attribute/system' instead of breaking it into two separate mkdir targets.
  Submitted by: jedgar
  Author: rwatson; Age: 2001-11-30; Files: 1; Lines: -1/+1
* Use 'mkdir -p /.attribute/system' instead of breaking it into two separate mkdir targets.
  Author: rwatson; Age: 2001-11-30; Files: 1; Lines: -1/+1
* README.extattr incorrectly specified sample command lines for UFS_EXTATTR_AUTOSTART. Insert the missing 'initattr' arguments to extattrctl.
  Noticed by: green
  Author: rwatson; Age: 2001-11-30; Files: 1; Lines: -2/+2
* When mkdir()-ing, the parent dir gets its link count increased. Fix VN_KNOTE to reflect that.
  Found by: tobez@freebsd.org
  MFC after: 2 days
  Author: guido; Age: 2001-11-22; Files: 1; Lines: -1/+1
* Oops, when trying the dirhash sequential-access optimisation, compare the slot offset against the predicted offset, not a boolean flag. This typo effectively disabled the sequential optimisation, but was otherwise harmless.
  Not surprisingly, fixing this improves performance in the sequential access case. I am seeing a 7% speedup on one machine here; using dirhash when sequentially looking up directory entries is now about 5% faster instead of 2% slower than the non-dirhash case.
  Submitted by: KOIE Hidetaka <koie@suri.co.jp>
  MFC after: 1 week
  Author: iedowse; Age: 2001-11-14; Files: 1; Lines: -1/+1
* Implement IO_NOWDRAIN and B_NOWDRAIN - prevents the buffer cache from blocking in wdrain during a write. This flag needs to be used in devices whose strategy routines turn around and issue another high level I/O, such as when MD turns around and issues a VOP_WRITE to vnode backing store, in order to avoid deadlocking the dirty buffer draining code.
  Remove a vprintf() warning from MD when the backing vnode is found to be in-use. The syncer or buf_daemon could be flushing the backing vnode at the time of an MD operation so the warning is not correct.
  MFC after: 1 week
  Author: dillon; Age: 2001-11-05; Files: 1; Lines: -0/+2
* o Update copyright dates.
  o Add reference to TrustedBSD Project in license header.
  o Update dated comments, including comment in extattr.h claiming that no file systems support extended attributes.
  o Improve comment consistency.
  Author: rwatson; Age: 2001-11-01; Files: 4; Lines: -7/+18
* o Although this is not specified in POSIX.1e, the UFS ACL implementation coerces the deletion of a default ACL on a directory when no default ACL EA is present to success. Because the UFS EA implementation doesn't distinguish the EA failure modes "that EA name has not been administratively enabled" from "that EA name has no defined data", there's a potential conflict in error return values. Normally, the lack of administratively configured EA support is coerced to EOPNOTSUPP to indicate that ACLs are not available; in this case, it is possible to get a successful return, even if ACLs are not available because EA support for them has not been enabled. Expand the comment in ufs_setacl() to identify this case.
  Obtained from: TrustedBSD Project
  Author: rwatson; Age: 2001-10-27; Files: 1; Lines: -1/+6
* o Clarify a comment about the locking condition of the vnode upon exit from ufs_extattr_enable_with_open().
  o Print auto-start notifications if (bootverbose). This was previously commented out since it didn't know how to check for bootverbose.
  o Drop in comments throughout indicating where ENOENT should be replaced with ENOATTR once that is available.
  Obtained from: TrustedBSD Project
  Author: rwatson; Age: 2001-10-27; Files: 1; Lines: -9/+15
* o The comment about ordering the destruction of the lock and the removal of the flag indicating that the structure was initialized didn't need an XXX, since it didn't need fixing.
  Obtained from: TrustedBSD Project
  Author: rwatson; Age: 2001-10-27; Files: 1; Lines: -1/+1
* o Wrap a number of long lines of code, many of which were introduced due to KSE-related (p) expansions.
  Obtained from: TrustedBSD Project
  Author: rwatson; Age: 2001-10-27; Files: 1; Lines: -9/+16
* Since namespace support was added to the UFS extended attribute implementation to replace single-character namespace prefixes, '$' is no longer an invalid attribute name, and the namespace is relevant to validity determination.
  o Remove '$' case from ufs_extattr_valid_attrname()
  o Add attrnamespace argument to ufs_extattr_valid_attrname(), and fill out appropriately. Currently no decisions are made based on the namespace argument, but may be in the future.
  Obtained from: TrustedBSD Project
  Author: rwatson; Age: 2001-10-27; Files: 1; Lines: -9/+7
* Implement kern.maxvnodes. Adjusting kern.maxvnodes now actually has a real effect.
  Optimize vfs_msync(). Avoid having to continually drop and re-obtain mutexes when scanning the vnode list. Improves looping case by 500%.
  Optimize ffs_sync(). Avoid having to continually drop and re-obtain mutexes when scanning the vnode list. This makes a couple of assumptions, which I believe are ok, in regards to vnode stability when the mount list mutex is held. Improves looping case by 500%.
  (more optimization work is needed on top of these fixes)
  MFC after: 1 week
  Author: dillon; Age: 2001-10-26; Files: 1; Lines: -16/+22
* Default to not performing ufs_dirhash's extensive directory-block sanity check after every directory modification. This check can be re-enabled at any time by setting the sysctl "vfs.ufs.dirhash_docheck" to 1.
  This group of sanity tests was there to ensure that any UFS_DIRHASH bugs could be caught by a panic before a potentially corrupted directory block would be written to disk. It has served its main purpose now, so disable it in the interest of performance.
  MFC after: 1 week
  Author: iedowse; Age: 2001-10-25; Files: 1; Lines: -1/+1
* Change the vnode list under the mount point from a LIST to a TAILQ in preparation for an implementation of limiting code for kern.maxvnodes.
  MFC after: 3 days
  Author: dillon; Age: 2001-10-23; Files: 2; Lines: -13/+13
* Change the kernel's ucred API as follows:
  - crhold() returns a reference to the ucred whose refcount it bumps.
  - crcopy() now simply copies the credentials from one credential to another and has no return value.
  - a new crshared() primitive is added which returns true if a ucred's refcount is > 1 and false (0) otherwise.
  Author: jhb; Age: 2001-10-11; Files: 2; Lines: -4/+2
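The three semantics listed in the entry above can be modelled in a few lines of user-space C. This is only a toy: `struct toy_cred` and the `toy_` functions are hypothetical stand-ins, there is no locking, and the real `struct ucred` has many more fields.

```c
#include <assert.h>

/* Hypothetical, heavily reduced stand-in for a credential structure. */
struct toy_cred {
    unsigned ref;  /* reference count */
    unsigned uid;
    unsigned gid;
};

/* Like the described crhold(): bump the refcount and hand the same
 * credential back, so callers can write "new = toy_crhold(old)". */
struct toy_cred *toy_crhold(struct toy_cred *cr)
{
    assert(cr->ref > 0);
    cr->ref++;
    return cr;
}

/* Like the described crcopy(): copy the credential data only.
 * The destination keeps its own refcount and nothing is returned. */
void toy_crcopy(struct toy_cred *dst, const struct toy_cred *src)
{
    dst->uid = src->uid;
    dst->gid = src->gid;
}

/* Like the described crshared(): nonzero if more than one reference exists. */
int toy_crshared(const struct toy_cred *cr)
{
    return cr->ref > 1;
}
```

A typical use of the shared test in this model is copy-on-write: a caller that wants to modify a credential first checks `toy_crshared()` and, if it returns true, copies into a private credential with `toy_crcopy()` before changing anything.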
* Add missing includes of sys/lock.h.
  Author: jhb; Age: 2001-10-11; Files: 2; Lines: -0/+2
* Remove panics for rename() race conditions. The panics are inappropriate because the IN_RENAME flag only fixes a few of the huge number of race conditions that can result in the source path becoming invalid even prior to the VOP_RENAME() call.
  The panics created a serious security issue whereby an attacker could fairly easily cause the panic to occur, crashing the machine.
  The correct solution requires a great deal of work in the namei path cache code.
  MFC after: 0 days
  Author: dillon; Age: 2001-10-08; Files: 1; Lines: -1/+16
* o Replace two direct uid!=0 comparisons with suser_xxx() calls.
  Obtained from: TrustedBSD Project
  Author: rwatson; Age: 2001-10-02; Files: 1; Lines: -2/+2
* o Replace two direct uid!=0 comparisons with suser_td() calls.
  Obtained from: TrustedBSD Project
  Author: rwatson; Age: 2001-10-02; Files: 1; Lines: -2/+2
* Back out the last commit. The problem is actually much worse than I first thought and may require serious work to the VOP_RENAME() api itself. Basically, by the time the VOP_RENAME() function is called, it's already too late.
  Author: dillon; Age: 2001-10-02; Files: 1; Lines: -8/+3
* IN_RENAME should only be cleared by the routine that set it. This fixes a rename/rmdir race that has been shown to cause a panic.
  Bug reported by: Yevgeniy Aleynikov <eugenea@infospace.com>
  MFC after: 3 days
  Author: dillon; Age: 2001-10-02; Files: 1; Lines: -3/+8
* - Fix some minor whitespace nits.
  - Move the SPECIAL_FLAG #define up next to the NOHOLDER #define and fix a little nit that caused it to be defined as -(sizeof (struct thread) + 1) instead of -2.
  Author: jhb; Age: 2001-09-27; Files: 1; Lines: -4/+4
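The commit above does not show how the wrong value arose. One plausible mechanism, offered here purely as an assumption, is C pointer arithmetic: subtracting 1 from a `struct thread *` moves the address by `sizeof(struct thread)` bytes, not by 1. The sketch below models that scaling on plain integers so that no invalid pointers are ever formed; `toy_scaled_sub` and the 64-byte element size are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model "ptr - n" on a typed pointer: the address moves by
 * n * sizeof(element), which is what the compiler does implicitly. */
uintptr_t toy_scaled_sub(uintptr_t addr, size_t n, size_t elem_size)
{
    assert(elem_size > 0);
    return addr - (uintptr_t)(n * elem_size);
}

/* Model the intended sentinel: cast the integer -2 directly,
 * with no pointer arithmetic involved. */
uintptr_t toy_sentinel(int value)
{
    return (uintptr_t)(intptr_t)value;
}
```

With a 64-byte element, `toy_scaled_sub(toy_sentinel(-1), 1, 64)` yields the address -(64 + 1), the same shape as the -(sizeof (struct thread) + 1) quoted in the commit, whereas `toy_sentinel(-2)` yields the intended -2.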
* o Re-enable support of system file flags in jail() by adding back the PRISON_ROOT to the suser_xxx() check. Since securelevels may now be raised in specific jails, use of system flags can still be restricted in jail(), but in a more configurable way.
  o Users of jail() expecting system flags (such as schg) to restrict jail()'s should be sure to set the securelevel appropriately in jail()'s.
  o This fixes activities involving automated system flag removal in jail(), including installkernel and friends.
  Obtained from: TrustedBSD Project
  Author: rwatson; Age: 2001-09-26; Files: 1; Lines: -1/+1
* o Modify ufs_setattr() so that it uses securelevel_gt() instead of direct variable access.
  Obtained from: TrustedBSD Project
  Author: rwatson; Age: 2001-09-26; Files: 1; Lines: -4/+6
* o Further clarify comment: at Udo's request, re-insert the 'if' referring to securelevels; also, update the unprivileged process text to better indicate the scope of actions permittable when any system flags are already set (limited).
  Submitted by: Udo Schweigert <udo.schweigert@siemens.com>
  Author: rwatson; Age: 2001-09-25; Files: 1; Lines: -3/+4
* o Parallelize the comment on the relationship between privileged un-jailed processes and the actual securelevel check: make the comment use '> 0' instead of inverted '<= 0'.
  Author: rwatson; Age: 2001-09-25; Files: 1; Lines: -2/+2
* The addition of i_dirhash to struct inode pushed RELENG_4's sizeof(struct inode) into a new malloc bucket on the i386. This didn't happen in -current due to the removal of i_lock, but it does no harm to apply the workaround to -current first.
  Reduce the size of the i_spare[] array in struct inode from 4 to 3 entries, and change ext2fs to use i_din.di_spare[1] so that it does not need i_spare[3].
  Reviewed by: bde
  MFC after: 3 days
  Author: iedowse; Age: 2001-09-24; Files: 1; Lines: -1/+1
* KSE Milestone 2
  Note ALL MODULES MUST BE RECOMPILED
  Make the kernel aware that there are smaller units of scheduling than the process (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process.
  Sorry john! (your next MFC will be a doosie!)
  Reviewed by: peter@freebsd.org, dillon@freebsd.org
  X-MFC after: ha ha ha ha
  Author: julian; Age: 2001-09-12; Files: 24; Lines: -511/+513
* The "dirpref" directory layout preference improvements make use of an array "fs_contigdirs[]" to avoid too many directories getting created in each cylinder group. The memory required for this and two other arrays (fs_csp[] and fs_maxcluster[]) is allocated with a single malloc() call, and divided up afterwards.
  However, the 'space' pointer is not advanced correctly, so fs_contigdirs and fs_maxcluster end up pointing to the same address. Add the missing code to advance the 'space' pointer, and remove an unnecessary update of the pointer that follows.
  This is likely to fix the "ffs_clusteralloc: map mismatch" panics that have been reported recently.
  Submitted by: Luke Mewburn <lukem@wasabisystems.com>
  Author: iedowse; Age: 2001-09-09; Files: 1; Lines: -1/+1
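The carve-up pattern behind the bug described in the entry above looks roughly like this in user-space C. The names (`struct toy_arrays`, `toy_carve`) and element types are assumptions made for illustration; only the general technique of dividing one allocation among three arrays comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical holder for three arrays that share one allocation. */
struct toy_arrays {
    void    *base;        /* start of the single allocation, for free() */
    int32_t *csp;         /* first array  */
    int32_t *maxcluster;  /* second array */
    uint8_t *contigdirs;  /* third array  */
};

/* Allocate once, then hand out consecutive slices of the block.
 * Returns 0 on success, -1 if the allocation fails. */
int toy_carve(struct toy_arrays *t, size_t ncg)
{
    assert(ncg > 0);

    size_t csp_bytes = ncg * sizeof(int32_t);
    size_t max_bytes = ncg * sizeof(int32_t);
    size_t dir_bytes = ncg * sizeof(uint8_t);

    uint8_t *space = malloc(csp_bytes + max_bytes + dir_bytes);
    if (space == NULL)
        return -1;
    t->base = space;

    t->csp = (int32_t *)space;
    space += csp_bytes;

    t->maxcluster = (int32_t *)space;
    space += max_bytes;  /* omitting this advance aliases the next array */

    t->contigdirs = space;
    return 0;
}
```

If the marked advance is left out, `contigdirs` and `maxcluster` point at the same bytes, so a write through one silently corrupts the other, which is consistent with the map-mismatch symptom the commit mentions.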
* Use ACL_PERM_NONE instead of hardcoding 0 when initializing ACL entry permissions.
  Reviewed by: rwatson
  Author: jedgar; Age: 2001-09-01; Files: 1; Lines: -3/+3
* o At some point, unmounting a non-EA file system with EA's compiled in got a bit broken: when ufs_extattr_stop() was called and failed, ufs_extattr_destroy() would panic. This makes the call to destroy() conditional on the success of stop().
  Submitted by: Christian Carstensen <cc@devcon.net>
  Obtained from: TrustedBSD Project
  Author: rwatson; Age: 2001-09-01; Files: 1; Lines: -2/+4
* If a file has been completely unlinked, stop automatically syncing the file. ffs will discard any pending dirty pages when it is closed, so we may as well not waste time trying to clean them. This doesn't stop other things from writing it out, eg: pageout, fsync(2) etc.
  Author: peter; Age: 2001-08-27; Files: 1; Lines: -0/+2
* Stop using dirhash when a directory is removed, and ensure that we never attempt to hash directories once they are deleted. This fixes a problem where operations on a deleted directory could trigger dirhash sanity panics.
  Author: iedowse; Age: 2001-08-26; Files: 2; Lines: -0/+12
* When compacting directories, ufs_direnter() always trusted DIRSIZ() to supply the number of bytes to be bcopy()'d to move an entry. If d_ino == 0 however, DIRSIZ() is not guaranteed to return a sensible length, so ufs_direnter could end up corrupting a directory during compaction. In practice I believe this can only happen after fsck_ffs has fixed a previously-corrupted directory.
  We now deal with any mid-block unused entries specially to avoid using DIRSIZ() or bcopy() on such entries. We also ensure that the variables 'dsize' and 'spacefree' contain meaningful values at all times. Add a few comments to describe better this intricate piece of code.
  The special handling of mid-block unused entries makes the dirhash-specific bugfix in the previous revision (1.53) now unnecessary, so this change removes it.
  Reviewed by: mckusick
  Author: iedowse; Age: 2001-08-26; Files: 1; Lines: -11/+29
* When compressing directory blocks, the dirhash code didn't check that the directory entry was in use before attempting to find it in the hash structures to change its offset. Normally, unused entries do not need to be moved, but fsck can leave behind some unused entries that do. A dirhash sanity panic resulted when the entry to be moved was not found. Add a check that stops entries with d_ino == 0 from being passed to ufsdirhash_move().
  Author: iedowse; Age: 2001-08-22; Files: 1; Lines: -1/+1
* Sigh. ufs_lookup() calls ffs_snapgone(), meaning that 'options EXT2FS' without 'options FFS' would fail to link.
  Author: peter; Age: 2001-08-18; Files: 1; Lines: -0/+5
* Two recent commits in sys/ufs/ufs interacted badly with ext2fs because it shares ufs code. In ufs_fhtovp(), the test on i_effnlink is invalid because ext2fs does not maintain this field. In ufs_close(), i_effnlink is also tested, to determine whether or not to call vn_start_write(). The ufs_fhtovp issue breaks NFS exporting of ext2fs filesystems; I believe the other is harmless.
  Fix both cases by checking um_i_effnlink_valid in the ufsmount struct, and use i_nlink if necessary.
  Noticed by: bde
  Reviewed by: mckusick, bde
  Author: iedowse; Age: 2001-07-29; Files: 2; Lines: -2/+5
* Disable the dirhash sanity check that panics if an unused directory entry (d_ino == 0) is found in a position that is not the start of a DIRBLKSIZ block. While such entries cannot occur normally (ufs always extends the previous entry to cover the free space instead), they do not cause problems and fsck does not fix them, so panicking is bad.
  Author: iedowse; Age: 2001-07-27; Files: 1; Lines: -0/+8
* Use a fixed type for times in on-disk structures for ufs rather than something that could potentially change like time_t.
  Author: peter; Age: 2001-07-16; Files: 2; Lines: -5/+5
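The motivation in the entry above can be shown with a short C sketch. The structure and function names are hypothetical, and the choice of `int32_t` is an assumption for illustration; the commit message does not say which fixed-width type was used.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical on-disk timestamp: fixed-width fields, so the layout does
 * not depend on how wide time_t happens to be on the build platform. */
struct toy_ondisk_time {
    int32_t sec;   /* seconds since the epoch */
    int32_t nsec;  /* nanoseconds within the second */
};

/* The on-disk record is the same 8 bytes everywhere. */
static_assert(sizeof(struct toy_ondisk_time) == 8,
              "on-disk timestamp must have a fixed size");

/* Convert at the boundary between in-core time_t and the on-disk form.
 * Values outside the int32_t range are not handled by this sketch. */
void toy_store_time(struct toy_ondisk_time *d, time_t t)
{
    d->sec = (int32_t)t;
    d->nsec = 0;
}

time_t toy_load_time(const struct toy_ondisk_time *d)
{
    return (time_t)d->sec;
}
```

Had the structure used `time_t` directly, a platform or release that widened `time_t` would have changed the size and field offsets of every record already written to disk.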
* Return a locked struct buf from ufsdirhash_lookup() to avoid one extra getblk/brelse sequence for each lookup. We already had this buf in ufsdirhash_lookup(), so there was no point in brelse'ing it only to have the caller immediately reacquire the same buffer.
  This should make the case of sequential lookups marginally faster; in my tests, sequential lookups with dirhash enabled are now only around 1% slower than without dirhash.
  Author: iedowse; Age: 2001-07-13; Files: 3; Lines: -9/+9
* Bring in dirhash, a simple hash-based lookup optimisation for large directories. When enabled via "options UFS_DIRHASH", in-core hash arrays are maintained for large directories. These allow all directory operations to take place quickly instead of requiring long linear searches. For now anyway, dirhash is not enabled by default.
  The in-core hash arrays have a memory requirement that is approximately half the size of the on-disk directory file. A number of new sysctl variables allow control over which directories get hashed and over the maximum amount of memory that dirhash will use:
    vfs.ufs.dirhash_minsize
      The minimum on-disk directory size for which hashing should be used. The default is 2560 (2.5k).
    vfs.ufs.dirhash_maxmem
      The system-wide maximum total memory to be used by dirhash data structures. The default is 2097152 (2MB).
  The current amount of memory being used by dirhash is visible through the read-only sysctl variable vfs.ufs.dirhash_mem.
  Finally, some extra sanity checks that are enabled by default, but which may have an impact on performance, can be disabled by setting vfs.ufs.dirhash_docheck to 0.
  Discussed on: -fs, -hackers
  Author: iedowse; Age: 2001-07-10; Files: 5; Lines: -2/+1276
* With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
  Author: dillon; Age: 2001-07-04; Files: 1; Lines: -30/+5