path: root/sys/ufs
Commit message | Author | Age | Files | Lines
* When taking a snapshot, we must check for active files that have been unlinked (e.g., with a zero link count). We have to expunge all trace of these files from the snapshot so that they are neither reclaimed prematurely by fsck nor saved unnecessarily by dump.
  Author: mckusick | Age: 2002-02-02 | Files: 6 | Lines: -272/+336
* Add a stub for softdep_request_cleanup() so that compilation without SOFTUPDATES option works properly.
  Submitted by: Benno Rice <benno@jeamland.net>
  Author: mckusick | Age: 2002-01-23 | Files: 1 | Lines: -0/+9
* This patch fixes a long standing complaint with soft updates in which small and/or nearly full filesystems would fail with `file system full' messages when trying to replace a number of existing files (for example during a system installation). When the allocation routines are about to fail with a file system full condition, they make a call to softdep_request_cleanup() which attempts to accelerate the flushing of pending deletion requests in an effort to free up space. In the face of filesystem I/O requests that exceed the available disk transfer capacity, the cleanup request could take an unbounded amount of time. Thus, the softdep_request_cleanup() routine will only try for tickdelay seconds (default 2 seconds) before giving up and returning a filesystem full error. Under typical conditions, the softdep_request_cleanup() routine is able to free up space in under fifty milliseconds.
  Author: mckusick | Age: 2002-01-22 | Files: 3 | Lines: -10/+64
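The bounded-retry idea described in this commit can be sketched roughly as follows. This is an illustration only, not the soft updates code: `fs_sketch`, `flush_one_pending()` and `request_cleanup()` are hypothetical names, and a simple attempt counter stands in for the tickdelay time limit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for filesystem state; not a FreeBSD structure. */
struct fs_sketch {
    long free_blocks;    /* blocks currently available */
    long pending_frees;  /* blocks held by queued deletion requests */
};

/* Flush one pending deletion; returns false when nothing is left to flush. */
static bool flush_one_pending(struct fs_sketch *fs)
{
    if (fs->pending_frees == 0)
        return false;
    fs->pending_frees--;
    fs->free_blocks++;
    return true;
}

/*
 * Try to make `needed` blocks available, but give up after `max_ticks`
 * attempts so the caller is never stalled indefinitely (the counter is an
 * assumed substitute for a real time limit).
 * Returns true if enough space is available on return.
 */
static bool request_cleanup(struct fs_sketch *fs, long needed, int max_ticks)
{
    assert(max_ticks >= 0);
    for (int tick = 0; tick < max_ticks && fs->free_blocks < needed; tick++) {
        if (!flush_one_pending(fs))
            break;
    }
    return fs->free_blocks >= needed;
}
```

A caller in this sketch would report "file system full" only when request_cleanup() returns false.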
* Fix a bug introduced in ffs_snapshot.c -r1.25 and fs.h -r1.26 which caused incomplete snapshots to be taken. When background fsck would run on these snapshots, the result would be files being incorrectly released which would subsequently panic the kernel with ``handle_workitem_freefile: inodedep survived'', ``handle_written_inodeblock: live inodedep'', and ``handle_workitem_remove: lost inodedep'' errors.
  Author: mckusick | Age: 2002-01-17 | Files: 2 | Lines: -4/+4
* Put write on read-only filesystem panic after we have weeded out block and character devices, FIFOs, etc.
  Submitted by: Bruce Evans <bde@zeta.org.au>
  Author: mckusick | Age: 2002-01-16 | Files: 1 | Lines: -2/+2
* When downgrading a filesystem from read-write to read-only, operations involving file removal or file update were not always being fully committed to disk. The result was lost files or corrupted file data. This change ensures that the filesystem is properly synced to disk before the filesystem is down-graded.
  This delta also fixes a long standing bug in which a file open for reading has been unlinked. When the last open reference to the file is closed, the inode is reclaimed by the filesystem. Previously, if the filesystem had been down-graded to read-only, the inode could not be reclaimed, and thus was lost and had to be later recovered by fsck. With this change, such files are found at the time of the down-grade. Normally they will result in the filesystem down-grade failing with `device busy'. If a forcible down-grade is done, then the affected files will be revoked causing the inode to be released and the open file descriptors to begin failing on attempts to read.
  Submitted by: "Sam Leffler" <sam@errno.com>
  Author: mckusick | Age: 2002-01-15 | Files: 4 | Lines: -8/+17
* SMP Lock struct file, filedesc and the global file list.
  Seigo Tanimura (tanimura) posted the initial delta. I've polished it quite a bit, reducing the need for locking and adapting it for KSE.
  Locks:
  1 mutex in each filedesc
    - protects all the fields.
    - protects "struct file" initialization: while a struct file is being changed from &badfileops -> &pipeops or something, the filedesc should be locked.
  1 mutex in each struct file
    - protects the refcount fields.
    - doesn't protect anything else.
    - the flags used for garbage collection have been moved to f_gcflag, which was the FILLER short; this doesn't need locking because the garbage collection is a single threaded container.
    - could likely be made to use a pool mutex.
  1 sx lock for the global filelist.
    struct file *fhold(struct file *fp); /* increments reference count on a file */
    struct file *fhold_locked(struct file *fp); /* like fhold but expects file to be locked */
    struct file *ffind_hold(struct thread *, int fd); /* finds the struct file in thread, adds one reference and returns it unlocked */
    struct file *ffind_lock(struct thread *, int fd); /* ffind_hold, but returns file locked */
  I still have to smp-safe the fget cruft, I'll get to that asap.
  Author: alfred | Age: 2002-01-13 | Files: 1 | Lines: -0/+3
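The split between fhold() and fhold_locked() in this commit can be illustrated with a small userland analogue. This is only a sketch under assumptions: `file_sketch`, `hold()` and `hold_locked()` are hypothetical names, and a POSIX mutex stands in for the kernel mutex that protects the reference count.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified userland analogue of "struct file". */
struct file_sketch {
    pthread_mutex_t f_mtx;  /* protects f_count only */
    int f_count;            /* reference count */
};

/* Analogue of fhold_locked(): the caller already holds f_mtx. */
static struct file_sketch *hold_locked(struct file_sketch *fp)
{
    assert(fp != NULL);
    fp->f_count++;
    return fp;
}

/* Analogue of fhold(): take the lock, bump the count, drop the lock. */
static struct file_sketch *hold(struct file_sketch *fp)
{
    pthread_mutex_lock(&fp->f_mtx);
    hold_locked(fp);
    pthread_mutex_unlock(&fp->f_mtx);
    return fp;
}
```

Returning the pointer lets a caller take a reference and store it in one expression.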
* When going to sleep, we must save our SPL so that it does not get lost if some other process uses the lock while we are sleeping. We restore it after we have slept. This functionality is provided by a new routine interlocked_sleep() that wraps the interlocking with functions that sleep. This function is then used in place of the old ACQUIRE_LOCK_INTERLOCKED() and FREE_LOCK_INTERLOCKED() macros.
  Submitted by: Debbie Chu <dchu@juniper.net>
  Author: mckusick | Age: 2002-01-12 | Files: 1 | Lines: -38/+76
* Must call drain_output() before checking the dirty block list in softdep_sync_metadata(). Otherwise we may miss dependencies that need to be flushed which will result in a later panic with the message ``vinvalbuf: dirty bufs''.
  Submitted by: Matthew Dillon <dillon@apollo.backplane.com>
  MFC after: 1 week
  Author: mckusick | Age: 2002-01-11 | Files: 1 | Lines: -8/+10
* Do not pull quota entries off the cache-list if they have already been removed from the cache-list as part of a previous unmount. This would result in panics (page fault in dqflush()) during subsequent umounts provided that enough distinct UIDs to actually make the hash do something are active. This can probably explain a number of weird quota related behaviours.
  PR: 32331 maybe more.
  Reproduced by: Søren Schrørder <sch@cybercity.dk>
  Author: phk | Age: 2002-01-10 | Files: 1 | Lines: -1/+2
* Initialise the bioops vector hack at runtime rather than at link time. This avoids the use of common variables.
  Reviewed by: mckusick
  Author: msmith | Age: 2002-01-08 | Files: 1 | Lines: -8/+7
* Fix a BUF_TIMELOCK race against BUF_LOCK and fix a deadlock in vget() against VM_WAIT in the pageout code. Both fixes involve adjusting the lockmgr's timeout capability so locks obtained with timeouts do not interfere with locks obtained without a timeout.
  Hopefully MFC: before the 4.5 release
  Author: dillon | Age: 2001-12-20 | Files: 2 | Lines: -2/+2
* Change the atomic_set_char to atomic_set_int and atomic_clear_char to atomic_clear_int to ease the implementation for the sparc64.
  Requested by: Jake Burkholder <jake@locore.ca>
  Author: mckusick | Age: 2001-12-18 | Files: 3 | Lines: -13/+17
* Make sure we ignore the value of `fs_active' when reloading the superblock, and move the initialisation of it to beside where other pointer fields are initialised.
  Author: iedowse | Age: 2001-12-16 | Files: 1 | Lines: -1/+2
* Move the new superblock field `fs_active' into the region of the superblock that is already set up to handle pointer types. This fixes an accidental change in the superblock size on 64-bit platforms caused by revision 1.24.
  Author: iedowse | Age: 2001-12-16 | Files: 1 | Lines: -5/+7
* Minimize the time necessary to suspend operations on a filesystem when taking a snapshot. The two time consuming operations are scanning all the filesystem bitmaps to determine which blocks are in use and scanning all the other snapshots so as to be able to expunge their blocks from the view of the current snapshot.
  The bitmap scanning is broken into two passes. Before suspending the filesystem all bitmaps are scanned. After the suspension, those bitmaps that changed after being scanned the first time are rescanned. Typically there are few bitmaps that need to be rescanned.
  The expunging of other snapshots is now done after the suspension is released by observing that we can easily identify any blocks that were allocated to them after the suspension (they will be marked as `not needing to be copied' in the just created snapshot).
  For all the gory details, see the ``Running fsck in the Background'' paper in the Usenix BSDCon 2002 Conference Proceedings, pages 55-64.
  Author: mckusick | Age: 2001-12-14 | Files: 4 | Lines: -96/+209
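The two-pass bitmap scan described in this commit might be sketched as below. This is a toy model, not the ffs_snapshot.c logic: `cg_sketch`, `scan_all()` and `rescan_changed()` are hypothetical names, a single integer stands in for each bitmap, and the `changed` flag is assumed to be set by whoever modifies a bitmap after the first pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NCG 4   /* hypothetical number of cylinder groups */

struct cg_sketch {
    unsigned bitmap;    /* stand-in for the block-allocation bitmap */
    unsigned snapshot;  /* copy recorded by the most recent scan */
    bool changed;       /* set by writers after the group has been scanned */
};

/* Record the current bitmap and clear the "changed since scan" mark. */
static void scan(struct cg_sketch *cg)
{
    cg->snapshot = cg->bitmap;
    cg->changed = false;
}

/* Pass 1: scan every group while the filesystem is still active. */
static void scan_all(struct cg_sketch cgs[], size_t n)
{
    for (size_t i = 0; i < n; i++)
        scan(&cgs[i]);
}

/* Pass 2: after suspension, rescan only groups modified since pass 1. */
static size_t rescan_changed(struct cg_sketch cgs[], size_t n)
{
    size_t rescanned = 0;

    for (size_t i = 0; i < n; i++) {
        if (cgs[i].changed) {
            scan(&cgs[i]);
            rescanned++;
        }
    }
    return rescanned;
}
```

The expensive full scan happens before suspension, so the time spent suspended depends only on how many groups changed in between.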
* When a file is partially truncated, we first check to see if the new file end will land in the middle of a file hole. Since the last block of a file must always be allocated, the hole is filled by allocating a block at that location. If the hole being filled is a direct block, then the truncation may eventually reduce the full sized block down to a fragment. When running with soft updates, it is necessary to FSYNC the file after allocating the block and before creating the fragment to avoid triggering a soft updates inconsistency when the block unexpectedly shrinks.
  Found by: Matthew Dillon <dillon@apollo.backplane.com>
  MFC after: 1 week
  Author: mckusick | Age: 2001-12-13 | Files: 1 | Lines: -0/+12
* Use 'mkdir -p /.attribute/system' instead of breaking it into two separate mkdir targets.
  Submitted by: jedgar
  Author: rwatson | Age: 2001-11-30 | Files: 1 | Lines: -1/+1
* Use 'mkdir -p /.attribute/system' instead of breaking it into two separate mkdir targets.
  Author: rwatson | Age: 2001-11-30 | Files: 1 | Lines: -1/+1
* README.extattr incorrectly specified sample command lines for UFS_EXTATTR_AUTOSTART. Insert the missing 'initattr' arguments to extattrctl.
  Noticed by: green
  Author: rwatson | Age: 2001-11-30 | Files: 1 | Lines: -2/+2
* When mkdir()-ing, the parent dir gets its link count increased. Fix VN_KNOTE to reflect that.
  Found by: tobez@freebsd.org
  MFC after: 2 days
  Author: guido | Age: 2001-11-22 | Files: 1 | Lines: -1/+1
* Oops, when trying the dirhash sequential-access optimisation, compare the slot offset against the predicted offset, not a boolean flag. This typo effectively disabled the sequential optimisation, but was otherwise harmless.
  Not surprisingly, fixing this improves performance in the sequential access case. I am seeing a 7% speedup on one machine here; using dirhash when sequentially looking up directory entries is now about 5% faster instead of 2% slower than the non-dirhash case.
  Submitted by: KOIE Hidetaka <koie@suri.co.jp>
  MFC after: 1 week
  Author: iedowse | Age: 2001-11-14 | Files: 1 | Lines: -1/+1
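The kind of typo this commit describes can be reproduced in a few lines. The exact form of the original comparison is not shown in the log, so the buggy variant below is an assumption, and `lookup_state`, `seq_hit_buggy()` and `seq_hit_fixed()` are hypothetical names rather than ufs_dirhash code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lookup state; not the ufs_dirhash data structure. */
struct lookup_state {
    bool seqopt;         /* caller asked for the sequential shortcut */
    long predicted_off;  /* offset expected if access is sequential */
};

/* Assumed buggy form: compares the slot offset with the flag (0 or 1). */
static bool seq_hit_buggy(const struct lookup_state *st, long slot_off)
{
    assert(st != 0);
    return st->seqopt && slot_off == (long)st->seqopt;
}

/* Fixed form: compares the slot offset with the predicted offset. */
static bool seq_hit_fixed(const struct lookup_state *st, long slot_off)
{
    assert(st != 0);
    return st->seqopt && slot_off == st->predicted_off;
}
```

Because the buggy comparison almost never matches a real offset, the shortcut is silently skipped and lookups fall back to the slower path, which fits the "disabled but otherwise harmless" description.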
* Implement IO_NOWDRAIN and B_NOWDRAIN - prevents the buffer cache from blocking in wdrain during a write. This flag needs to be used in devices whose strategy routines turn around and issue another high level I/O, such as when MD turns around and issues a VOP_WRITE to vnode backing store, in order to avoid deadlocking the dirty buffer draining code.
  Remove a vprintf() warning from MD when the backing vnode is found to be in-use. The syncer or buf_daemon could be flushing the backing vnode at the time of an MD operation so the warning is not correct.
  MFC after: 1 week
  Author: dillon | Age: 2001-11-05 | Files: 1 | Lines: -0/+2
* o Update copyright dates.
  o Add reference to TrustedBSD Project in license header.
  o Update dated comments, including comment in extattr.h claiming that no file systems support extended attributes.
  o Improve comment consistency.
  Author: rwatson | Age: 2001-11-01 | Files: 4 | Lines: -7/+18
* o Although this is not specified in POSIX.1e, the UFS ACL implementation coerces the deletion of a default ACL on a directory when no default ACL EA is present to success. Because the UFS EA implementation doesn't distinguish the EA failure modes "that EA name has not been administratively enabled" from "that EA name has no defined data", there's a potential conflict in error return values. Normally, the lack of administratively configured EA support is coerced to EOPNOTSUPP to indicate that ACLs are not available; in this case, it is possible to get a successful return, even if ACLs are not available because EA support for them has not been enabled. Expand the comment in ufs_setacl() to identify this case.
  Obtained from: TrustedBSD Project
  Author: rwatson | Age: 2001-10-27 | Files: 1 | Lines: -1/+6
* o Clarify a comment about the locking condition of the vnode upon exit from ufs_extattr_enable_with_open().
  o Print auto-start notifications if (bootverbose). This was previously commented out since it didn't know how to check for bootverbose.
  o Drop in comments throughout indicating where ENOENT should be replaced with ENOATTR once that is available.
  Obtained from: TrustedBSD Project
  Author: rwatson | Age: 2001-10-27 | Files: 1 | Lines: -9/+15
* o The comment about ordering the destruction of the lock and the removal of the flag indicating that the structure was initialized didn't need an XXX, since it didn't need fixing.
  Obtained from: TrustedBSD Project
  Author: rwatson | Age: 2001-10-27 | Files: 1 | Lines: -1/+1
* o Wrap a number of long lines of code, many of which were introduced due to KSE-related (p) expansions.
  Obtained from: TrustedBSD Project
  Author: rwatson | Age: 2001-10-27 | Files: 1 | Lines: -9/+16
* Since namespace support was added to the UFS extended attribute implementation to replace single-character namespace prefixes, '$' is no longer an invalid attribute name, and the namespace is relevant to validity determination.
  o Remove '$' case from ufs_extattr_valid_attrname()
  o Add attrnamespace argument to ufs_extattr_valid_attrname(), and fill out appropriately. Currently no decisions are made based on the namespace argument, but may be in the future.
  Obtained from: TrustedBSD Project
  Author: rwatson | Age: 2001-10-27 | Files: 1 | Lines: -9/+7
* Implement kern.maxvnodes. Adjusting kern.maxvnodes now actually has a real effect.
  Optimize vfs_msync(). Avoid having to continually drop and re-obtain mutexes when scanning the vnode list. Improves looping case by 500%.
  Optimize ffs_sync(). Avoid having to continually drop and re-obtain mutexes when scanning the vnode list. This makes a couple of assumptions, which I believe are ok, in regards to vnode stability when the mount list mutex is held. Improves looping case by 500%.
  (more optimization work is needed on top of these fixes)
  MFC after: 1 week
  Author: dillon | Age: 2001-10-26 | Files: 1 | Lines: -16/+22
* Default to not performing ufs_dirhash's extensive directory-block sanity check after every directory modification. This check can be re-enabled at any time by setting the sysctl "vfs.ufs.dirhash_docheck" to 1.
  This group of sanity tests was there to ensure that any UFS_DIRHASH bugs could be caught by a panic before a potentially corrupted directory block would be written to disk. It has served its main purpose now, so disable it in the interest of performance.
  MFC after: 1 week
  Author: iedowse | Age: 2001-10-25 | Files: 1 | Lines: -1/+1
* Change the vnode list under the mount point from a LIST to a TAILQ in preparation for an implementation of limiting code for kern.maxvnodes.
  MFC after: 3 days
  Author: dillon | Age: 2001-10-23 | Files: 2 | Lines: -13/+13
* Change the kernel's ucred API as follows:
  - crhold() returns a reference to the ucred whose refcount it bumps.
  - crcopy() now simply copies the credentials from one credential to another and has no return value.
  - a new crshared() primitive is added which returns true if a ucred's refcount is > 1 and false (0) otherwise.
  Author: jhb | Age: 2001-10-11 | Files: 2 | Lines: -4/+2
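The three behaviours listed in this commit can be modelled with a reduced credential type. This is a sketch only: `cred_sketch` and the `cred_*` functions are hypothetical stand-ins, locking is omitted, and leaving the destination's reference count untouched in the copy is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, much-reduced credential; not the kernel's struct ucred. */
struct cred_sketch {
    int ref;  /* reference count */
    int uid;
};

/* Bump the refcount and return the same credential, so calls can be chained. */
static struct cred_sketch *cred_hold(struct cred_sketch *cr)
{
    assert(cr != 0);
    cr->ref++;
    return cr;
}

/* Copy credential data only; dest keeps its own reference count (assumed). */
static void cred_copy(struct cred_sketch *dest, const struct cred_sketch *src)
{
    dest->uid = src->uid;
}

/* True when more than one holder references the credential. */
static bool cred_shared(const struct cred_sketch *cr)
{
    return cr->ref > 1;
}
```

With a hold function that returns its argument, an assignment such as `newp = cred_hold(oldp);` takes the reference and stores the pointer in one step.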
* Add missing includes of sys/lock.h.
  Author: jhb | Age: 2001-10-11 | Files: 2 | Lines: -0/+2
* Remove panics for rename() race conditions. The panics are inappropriate because the IN_RENAME flag only fixes a few of the huge number of race conditions that can result in the source path becoming invalid even prior to the VOP_RENAME() call.
  The panics created a serious security issue whereby an attacker could fairly easily cause the panic to occur, crashing the machine.
  The correct solution requires a great deal of work in the namei path cache code.
  MFC after: 0 days
  Author: dillon | Age: 2001-10-08 | Files: 1 | Lines: -1/+16
* o Replace two direct uid!=0 comparisons with suser_xxx() calls.
  Obtained from: TrustedBSD Project
  Author: rwatson | Age: 2001-10-02 | Files: 1 | Lines: -2/+2
* o Replace two direct uid!=0 comparisons with suser_td() calls.
  Obtained from: TrustedBSD Project
  Author: rwatson | Age: 2001-10-02 | Files: 1 | Lines: -2/+2
* Back out the last commit. The problem is actually much worse than I first thought and may require serious work to the VOP_RENAME() api itself. Basically, by the time the VOP_RENAME() function is called, it's already too late.
  Author: dillon | Age: 2001-10-02 | Files: 1 | Lines: -8/+3
* IN_RENAME should only be cleared by the routine that set it. This fixes a rename/rmdir race that has been shown to cause a panic.
  Bug reported by: Yevgeniy Aleynikov <eugenea@infospace.com>
  MFC after: 3 days
  Author: dillon | Age: 2001-10-02 | Files: 1 | Lines: -3/+8
* - Fix some minor whitespace nits.
  - Move the SPECIAL_FLAG #define up next to the NOHOLDER #define and fix a little nit that caused it to be defined as -(sizeof (struct thread) + 1) instead of -2.
  Author: jhb | Age: 2001-09-27 | Files: 1 | Lines: -4/+4
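One plausible way to end up with -(sizeof (struct thread) + 1) is pointer arithmetic on a sentinel pointer, where subtracting 1 moves back by a whole structure rather than one byte. The log does not show the original #define, so this is an assumption; `thread_sketch` and both helper functions are hypothetical, and the arithmetic is done on integers to avoid forming an invalid pointer.

```c
#include <assert.h>
#include <stdint.h>

struct thread_sketch { char pad[64]; };  /* hypothetical 64-byte structure */

/* Address obtained by casting the integer -2 directly to a pointer type. */
static uintptr_t flag_by_cast(void)
{
    return (uintptr_t)-2;
}

/*
 * Address that "(struct thread_sketch *)-1 - 1" would denote: pointer
 * subtraction is scaled by the element size, so one whole structure is
 * subtracted from the all-ones address.
 */
static uintptr_t flag_by_pointer_math(void)
{
    return (uintptr_t)-1 - sizeof(struct thread_sketch);
}
```

The two values differ by the size of the structure minus one, which is why writing the sentinel as a direct cast of -2 is the safer spelling.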
* o Re-enable support of system file flags in jail() by adding back the PRISON_ROOT to the suser_xxx() check. Since securelevels may now be raised in specific jails, use of system flags can still be restricted in jail(), but in a more configurable way.
  o Users of jail() expecting system flags (such as schg) to restrict jail()'s should be sure to set the securelevel appropriately in jail()'s.
  o This fixes activities involving automated system flag removal in jail(), including installkernel and friends.
  Obtained from: TrustedBSD Project
  Author: rwatson | Age: 2001-09-26 | Files: 1 | Lines: -1/+1
* o Modify ufs_setattr() so that it uses securelevel_gt() instead of direct variable access.
  Obtained from: TrustedBSD Project
  Author: rwatson | Age: 2001-09-26 | Files: 1 | Lines: -4/+6
* o Further clarify comment: at Udo's request, re-insert the 'if' referring to securelevels; also, update the unprivileged process text to better indicate the scope of actions permitted when any system flags are already set (limited).
  Submitted by: Udo Schweigert <udo.schweigert@siemens.com>
  Author: rwatson | Age: 2001-09-25 | Files: 1 | Lines: -3/+4
* o Parallelize the comment on the relationship between privileged un-jailed processes and the actual securelevel check: make the comment use '> 0' instead of inverted '<= 0'.
  Author: rwatson | Age: 2001-09-25 | Files: 1 | Lines: -2/+2
* The addition of i_dirhash to struct inode pushed RELENG_4's sizeof(struct inode) into a new malloc bucket on the i386. This didn't happen in -current due to the removal of i_lock, but it does no harm to apply the workaround to -current first.
  Reduce the size of the i_spare[] array in struct inode from 4 to 3 entries, and change ext2fs to use i_din.di_spare[1] so that it does not need i_spare[3].
  Reviewed by: bde
  MFC after: 3 days
  Author: iedowse | Age: 2001-09-24 | Files: 1 | Lines: -1/+1
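Why a few extra bytes can matter is easy to see with a power-of-two bucket allocator, which is the assumption behind this sketch. `bucket_size()` and the 16-byte minimum are hypothetical, and the sizes used in the usage note are illustrative rather than the real sizeof(struct inode).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Round a request up to the next power-of-two bucket.
 * The 16-byte minimum bucket is an assumed value for illustration.
 */
static size_t bucket_size(size_t request)
{
    size_t bucket = 16;

    assert(request <= ((size_t)1 << 30));  /* keep the shift loop bounded */
    while (bucket < request)
        bucket <<= 1;
    return bucket;
}
```

Under this model a 256-byte structure fits its bucket exactly, while a 260-byte one is served from the 512-byte bucket, nearly doubling the memory used per object. Trimming one spare field to get back under the boundary is the kind of workaround the commit describes.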
* KSE Milestone 2
  Note: ALL MODULES MUST BE RECOMPILED.
  Make the kernel aware that there are smaller units of scheduling than the process (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process.
  Sorry john! (your next MFC will be a doozy!)
  Reviewed by: peter@freebsd.org, dillon@freebsd.org
  X-MFC after: ha ha ha ha
  Author: julian | Age: 2001-09-12 | Files: 24 | Lines: -511/+513
* The "dirpref" directory layout preference improvements make use of an array "fs_contigdirs[]" to avoid too many directories getting created in each cylinder group. The memory required for this and two other arrays (fs_csp[] and fs_maxcluster[]) is allocated with a single malloc() call, and divided up afterwards.
  However, the 'space' pointer is not advanced correctly, so fs_contigdirs and fs_maxcluster end up pointing to the same address. Add the missing code to advance the 'space' pointer, and remove an unnecessary update of the pointer that follows.
  This is likely to fix the "ffs_clusteralloc: map mismatch" panics that have been reported recently.
  Submitted by: Luke Mewburn <lukem@wasabisystems.com>
  Author: iedowse | Age: 2001-09-09 | Files: 1 | Lines: -1/+1
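The allocation pattern in this commit, one malloc() carved into several arrays by advancing a cursor, can be sketched as follows. This is not the ffs code: `arrays_sketch`, `carve()`, and the element types are hypothetical, chosen only to show where the cursor must move.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical set of three arrays sharing one allocation. */
struct arrays_sketch {
    void    *base;        /* start of the single allocation, for free() */
    int32_t *csp;
    int32_t *maxcluster;
    uint8_t *contigdirs;
};

/* Carve one malloc() into three arrays, advancing `space` after each one. */
static int carve(struct arrays_sketch *a, size_t ncg)
{
    size_t csp_sz = ncg * sizeof(int32_t);
    size_t max_sz = ncg * sizeof(int32_t);
    size_t dir_sz = ncg * sizeof(uint8_t);
    uint8_t *space;

    assert(ncg > 0);
    space = malloc(csp_sz + max_sz + dir_sz);
    if (space == NULL)
        return -1;
    a->base = space;
    a->csp = (int32_t *)space;
    space += csp_sz;
    a->maxcluster = (int32_t *)space;
    space += max_sz;      /* the kind of advance the commit says was missing */
    a->contigdirs = space;
    return 0;
}
```

If the second advance is left out, the last two arrays alias the same memory, and writes through one silently corrupt the other, which matches the symptom described above.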
* Use ACL_PERM_NONE instead of hardcoding 0 when initializing ACL entry permissions.
  Reviewed by: rwatson
  Author: jedgar | Age: 2001-09-01 | Files: 1 | Lines: -3/+3
* o At some point, unmounting a non-EA file system with EA's compiled in got a bit broken: when ufs_extattr_stop() was called and failed, ufs_extattr_destroy() would panic. This makes the call to destroy() conditional on the success of stop().
  Submitted by: Christian Carstensen <cc@devcon.net>
  Obtained from: TrustedBSD Project
  Author: rwatson | Age: 2001-09-01 | Files: 1 | Lines: -2/+4
* If a file has been completely unlinked, stop automatically syncing the file. ffs will discard any pending dirty pages when it is closed, so we may as well not waste time trying to clean them. This doesn't stop other things from writing it out, e.g. pageout, fsync(2), etc.
  Author: peter | Age: 2001-08-27 | Files: 1 | Lines: -0/+2