path: root/sys/fs/specfs
Commit message | Author | Age | Files | Lines
* - Replace v_flag with v_iflag and v_vflag | jeff | 2002-08-04 | 1 | -8/+19
| - v_vflag is protected by the vnode lock and is used when synchronization
|   with VOP calls is needed.
| - v_iflag is protected by interlock and is used for dealing with vnode
|   management issues. These flags include X/O LOCK, FREE, DOOMED, etc.
| - All accesses to v_iflag and v_vflag have either been locked or marked
|   with mp_fixme's.
| - Many ASSERT_VOP_LOCKED calls have been added where the locking was not
|   clear.
| - Many functions in vfs_subr.c were restructured to provide for stronger
|   locking.
|
| Idea stolen from: BSD/OS
* - Explicitly state that specfs does not support locking by using | jeff | 2002-07-27 | 1 | -0/+3
| vop_no{lock,unlock,islocked}. This should be the only vnode opv that
| does so.
* o Lock page queue accesses by vm_page_activate() and vm_page_deactivate(). | alc | 2002-07-27 | 1 | -1/+2
|
* Remove a check of blocknumbers/offsets which will be pointless with | phk | 2002-05-18 | 1 | -12/+0
| 64 bit daddr_t.
|
| Sponsored by: DARPA & NAI Labs.
* Lock proctree_lock instead of pgrpsess_lock. | jhb | 2002-04-16 | 1 | -2/+2
|
* Remove __P. | alfred | 2002-03-19 | 1 | -15/+15
|
* If in strategy we find that we have no devsw on the device anymore we | phk | 2002-03-05 | 1 | -3/+6
| are probably talking about some disk-device which went away, so return
| ENXIO instead of panicking.
* Simple p_ucred -> td_ucred changes to start using the per-thread ucred | jhb | 2002-02-27 | 1 | -4/+4
| reference.
* Lock struct pgrp, session and sigio. | tanimura | 2002-02-23 | 1 | -2/+11
| New locks are:
| - pgrpsess_lock which locks the whole pgrps and sessions,
| - pg_mtx which protects the pgrp members, and
| - s_mtx which protects the session members.
|
| Please refer to sys/proc.h for the coverage of these locks.
|
| Changes on the pgrp/session interface:
| - pgfind() needs the pgrpsess_lock held.
| - The caller of enterpgrp() is responsible for allocating a new pgrp and
|   session.
| - Call enterthispgrp() in order to enter an existing pgrp.
| - pgsignal() requires a pgrp lock held.
|
| Reviewed by: jhb, alfred
| Tested on: cvsup.jp.FreeBSD.org
|            (which is a quad-CPU machine running -current)
* Various nit-picking, mostly of style(9) character. | phk | 2002-02-10 | 1 | -43/+41
| Obtained from: ~bde/sys.dif.gz
* Change the kernel's ucred API as follows: | jhb | 2001-10-11 | 1 | -5/+4
| - crhold() returns a reference to the ucred whose refcount it bumps.
| - crcopy() now simply copies the credentials from one credential to
|   another and has no return value.
| - a new crshared() primitive is added which returns true if a ucred's
|   refcount is > 1 and false (0) otherwise.
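The three calling conventions listed in this commit can be illustrated with a small userland model. This is only a sketch: `struct cred_sketch` and the `cr_*` helpers below are hypothetical stand-ins, not the kernel's `struct ucred` or its real functions, and the model omits the locking the kernel needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a reference-counted credential. */
struct cred_sketch {
    unsigned ref;   /* reference count */
    unsigned uid;   /* one representative credential field */
};

/* Allocate a credential holding a single reference. */
static struct cred_sketch *cr_new(unsigned uid)
{
    struct cred_sketch *cr = calloc(1, sizeof(*cr));
    assert(cr != NULL);
    cr->ref = 1;
    cr->uid = uid;
    return cr;
}

/* Like the described crhold(): bump the refcount and return the same
 * pointer, so a caller can write "dst = cr_hold(src)". */
static struct cred_sketch *cr_hold(struct cred_sketch *cr)
{
    cr->ref++;
    return cr;
}

/* Like the described crcopy(): copy credential data only, no return value.
 * The destination keeps its own reference count. */
static void cr_copy(struct cred_sketch *dest, const struct cred_sketch *src)
{
    dest->uid = src->uid;
}

/* Like the described crshared(): true if the refcount is greater than 1. */
static bool cr_shared(const struct cred_sketch *cr)
{
    return cr->ref > 1;
}

/* Drop one reference and free the credential when the last one goes. */
static void cr_release(struct cred_sketch *cr)
{
    if (--cr->ref == 0)
        free(cr);
}
```

A caller that wants to modify a credential would typically test `cr_shared()` first and, if it returns true, work on a fresh copy made with `cr_new()` plus `cr_copy()`.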
* o Modify generic specfs device open access control checks to use | rwatson | 2001-09-26 | 1 | -4/+8
| securelevel_ge() instead of direct securelevel variable checks.
|
| Obtained from: TrustedBSD Project
* KSE Milestone 2 | julian | 2001-09-12 | 1 | -25/+25
| Note ALL MODULES MUST BE RECOMPILED
|
| Make the kernel aware that there are smaller units of scheduling than the
| process (but only allow one thread per process at this time).
| This is functionally equivalent to the previous -current except
| that there is a thread associated with each process.
|
| Sorry john! (your next MFC will be a doozy!)
|
| Reviewed by: peter@freebsd.org, dillon@freebsd.org
|
| X-MFC after: ha ha ha ha
* With Alfred's permission, remove vm_mtx in favor of a fine-grained approach | dillon | 2001-07-04 | 1 | -3/+2
| (this commit is just the first stage). Also add various GIANT_ macros to
| formalize the removal of Giant, making it easy to test in a more piecemeal
| fashion. These macros will allow us to test fine-grained locks to a degree
| before removing Giant, and also after, and to remove Giant in a piecemeal
| fashion via sysctl's on those subsystems which the authors believe can
| operate without Giant.
* Don't acquire/release Giant around some of the places that need it in | jhb | 2001-05-23 | 1 | -2/+1
| spec_getpages(). Instead, assert that Giant is held by the caller.
* Introduce a global lock for the vm subsystem (vm_mtx). | alfred | 2001-05-19 | 1 | -0/+4
| vm_mtx does not recurse and is required for most low level
| vm operations.
|
| Faults cannot be taken without holding Giant.
|
| Memory subsystems can now call the base page allocators safely.
|
| Almost all atomic ops were removed as they are covered under the
| vm mutex.
|
| Alpha and ia64 now need to catch up to i386's trap handlers.
|
| FFS and NFS have been tested, other filesystems will need minor
| changes (grabbing the vm lock when twiddling page properties).
|
| Reviewed (partially) by: jake, jhb
* Backed out previous commit. It caused massive filesystem corruption, | bde | 2001-04-30 | 1 | -0/+1
| not to mention a compile-time warning about the critical function becoming
| unused, by replacing spec_bmap() with vop_stdbmap().
|
| ntfs seems to have the same bug.
|
| The factor for converting specfs block numbers to physical block numbers
| is 1, but vop_stdbmap() uses the bogus factor
| btodb(ap->a_vp->v_mount->mnt_stat.f_iosize), which is 16 for ffs with
| the default block size of 8K. This factor is bogus even for vop_stdbmap()
| -- the correct factor is related to the filesystem blocksize which is not
| necessarily the same as the optimal i/o size. vop_stdbmap() was apparently
| cloned from nfs where these sizes happen to be the same.
|
| There may also be a problem with a_vp->v_mount being null. spec_bmap()
| still checks for this, but I think the checks in specfs are dead code
| which used to support block devices.
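The factor of 16 quoted in this commit follows from simple arithmetic, which the sketch below reproduces. It assumes 512-byte device blocks (a shift of 9); the `SKETCH_` and `bmap_` names are hypothetical and are not the kernel's macros or functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: device blocks are 512 bytes, i.e. a shift of 9. */
#define SKETCH_DEV_BSHIFT 9

/* Bytes to device blocks, in the spirit of the btodb() named in the log. */
#define sketch_btodb(bytes) ((uint64_t)(bytes) >> SKETCH_DEV_BSHIFT)

/* The mapping the commit attributes to vop_stdbmap(): scale the block
 * number by btodb(f_iosize). */
static uint64_t bmap_scaled(uint64_t bn, uint64_t f_iosize)
{
    return bn * sketch_btodb(f_iosize);
}

/* The mapping the commit says specfs needs: a factor of 1. */
static uint64_t bmap_identity(uint64_t bn)
{
    return bn;
}
```

With an 8K I/O size, `sketch_btodb(8192)` is 16, so block 100 would be mapped to 1600 by the scaled version and to 100 by the identity version. That mismatch is the kind of error the commit blames for the corruption.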
* Add a vop_stdbmap(), and make it part of the default vop vector. | phk | 2001-04-29 | 1 | -1/+0
| Make 7 filesystems which don't really know about VOP_BMAP rely
| on the default vector, rather than more or less complete local
| vop_nopbmap() implementations.
* Revert consequences of changes to mount.h, part 2. | grog | 2001-04-29 | 1 | -2/+0
| Requested by: bde
* Correct #includes to work with fixed sys/mount.h. | grog | 2001-04-23 | 1 | -0/+2
|
* Fixes to track snapshot copy-on-write checking in the specinfo | mckusick | 2001-03-07 | 1 | -2/+2
| structure rather than assuming that the device vnode would reside
| in the FFS filesystem (which is obviously a broken assumption with
| the device filesystem).
* Extend kqueue down to the device layer. | jlemon | 2001-02-15 | 1 | -0/+19
| Backwards compatible approach suggested by: peter
* Another round of the <sys/queue.h> FOREACH transmogrifier. | phk | 2001-02-04 | 1 | -2/+1
| Created with: sed(1)
| Reviewed by: md5(1)
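The kind of mechanical rewrite this commit refers to can be sketched with the `TAILQ_FOREACH` macro from `<sys/queue.h>`. The log does not say which list type or which loops were converted, so the `struct node` list below is purely an illustrative assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical list element, used only for this illustration. */
struct node {
    int value;
    TAILQ_ENTRY(node) link;
};
TAILQ_HEAD(node_list, node);

/* Open-coded traversal, the style such a conversion replaces. */
static int sum_open_coded(struct node_list *head)
{
    int sum = 0;
    struct node *n;

    for (n = TAILQ_FIRST(head); n != NULL; n = TAILQ_NEXT(n, link))
        sum += n->value;
    return sum;
}

/* The same traversal written with the FOREACH macro. */
static int sum_foreach(struct node_list *head)
{
    int sum = 0;
    struct node *n;

    TAILQ_FOREACH(n, head, link)
        sum += n->value;
    return sum;
}
```

Both functions visit the elements in the same order, which is why a conversion of this kind lends itself to a sed(1) script.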
* Add a BUF_KERNPROC() in the BIO_DELETE path. | phk | 2001-01-30 | 1 | -0/+1
| This seems to fix the problem which md(4) backed filesystems exposed.
* This patch reestablishes the spec_fsync() guarantee that synchronous | dillon | 2001-01-29 | 1 | -7/+16
| fsyncs, which typically occur during unmounting, will drain all dirty
| buffers even if it takes multiple passes to do so. The guarantee was
| mangled by the last patch which solved a problem due to -current
| disabling interrupts while holding giant (which caused an infinite spin
| loop waiting for I/O to complete). -stable does not have either patch,
| but has a similar bug in the original spec_fsync() code which is
| triggered by a bug in the softupdates umount code, a fix for which will
| be committed to -current as soon as Kirk stamps it. Then both solutions
| will be MFC'd to -stable.
|
| -stable currently suffers from a combination of the softupdates bug and
| a small window of opportunity in the original spec_fsync() code, and
| -stable also suffers from the spin-loop bug but since interrupts are
| enabled the spin resolves itself in a few milliseconds.
* Fix a lockup problem that occurs with 'cvs update'. specfs's fsync can | dillon | 2000-12-30 | 1 | -0/+13
| get into the same sort of infinite loop that ffs's fsync used to get
| into, probably due to background bitmap writes. The solution is the same.
* This implements a better launder limiting solution. There was a solution | dillon | 2000-12-26 | 1 | -0/+2
| in 4.2-REL which I ripped out in -stable and -current when implementing
| the low-memory handling solution. However, maxlaunder turns out to be
| the saving grace in certain very heavily loaded systems (e.g. newsreader
| box). The new algorithm limits the number of pages laundered in the
| first pageout daemon pass. If that is not sufficient then successive
| passes will be run without any limit.
|
| Write I/O is now pipelined using two sysctls, vfs.lorunningspace and
| vfs.hirunningspace. This prevents excessive buffered writes in the
| disk queues which cause long (multi-second) delays for reads. It leads
| to more stable (less jerky) and generally faster I/O streaming to disk
| by allowing required read ops (e.g. for indirect blocks and such) to
| occur without interrupting the write stream, among other things.
|
| NOTE: eventually, filesystem write I/O pipelining needs to be done on a
| per-device basis. At the moment it is globalized.
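The log names the two sysctls but does not show how they interact. One common way to use a low/high pair is hysteresis: stop admitting writes once in-flight bytes exceed the high mark, and resume only after they fall to the low mark. The sketch below models that pattern under this assumption. `struct runspace` and the `rs_*` helpers are hypothetical and do not reflect the kernel's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping for in-flight write I/O. */
struct runspace {
    size_t lo;          /* low-water mark in bytes (cf. vfs.lorunningspace) */
    size_t hi;          /* high-water mark in bytes (cf. vfs.hirunningspace) */
    size_t running;     /* bytes of write I/O currently in flight */
    bool throttled;     /* true while new writes must wait */
};

/* May a new write be issued right now? */
static bool rs_may_start_write(const struct runspace *rs)
{
    return !rs->throttled;
}

/* Account for a write being issued; throttle once above the high mark. */
static void rs_write_started(struct runspace *rs, size_t bytes)
{
    rs->running += bytes;
    if (rs->running > rs->hi)
        rs->throttled = true;
}

/* Account for a completed write; release the throttle only once the
 * in-flight total has drained down to the low mark. */
static void rs_write_done(struct runspace *rs, size_t bytes)
{
    rs->running -= bytes;
    if (rs->throttled && rs->running <= rs->lo)
        rs->throttled = false;
}
```

The gap between the two marks keeps writers from oscillating at a single threshold, and the bounded queue depth is what leaves room for reads to be serviced promptly.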
* Take VBLK devices further out of their misery. | phk | 2000-11-02 | 1 | -9/+2
| This should fix the panic I introduced in my previous commit on this topic.
* Blow away the v_specmountpoint define, replacing it with what it was | eivind | 2000-10-09 | 1 | -5/+5
| defined as (rdev->si_mountpoint)
* Fix panic when removing open device (found by bp@) | phk | 2000-08-24 | 1 | -3/+11
| Implement subdirs.
| Build the full "devicename" for cloning functions.
| Fix panic when deleted device goes away.
| Collapse devfs_dir and devfs_dirent structures.
| Add proper cloning to the /dev/fd* "device-"driver.
| Fix a bug in make_dev_alias() handling which made aliases appear
| multiple times.
| Use devfs_clone to implement getdiskbyname()
| Make specfs maintain the stat(2) timestamps per dev_t
* Introduce vop_stdinactive() and make it the default if no vop_inactive | phk | 2000-08-18 | 1 | -15/+1
| is declared.
|
| Sort and prune a few vop_op[].
* This patch corrects the first round of panics and hangs reported | mckusick | 2000-07-24 | 1 | -2/+4
| with the new snapshot code.
|
| Update addaliasu to correctly implement the semantics of the old
| checkalias function. When a device vnode first comes into existence,
| check to see if an anonymous vnode for the same device was created
| at boot time by bdevvp(). If so, adopt the bdevvp vnode rather than
| creating a new vnode for the device. This corrects a problem which
| caused the kernel to panic when taking a snapshot of the root
| filesystem.
|
| Change the calling convention of vn_write_suspend_wait() to be the
| same as vn_start_write().
|
| Split out softdep_flushworklist() from softdep_flushfiles() so that
| it can be used to clear the work queue when suspending filesystem
| operations.
|
| Access to buffers becomes recursive so that snapshots can recursively
| traverse their indirect blocks using ffs_copyonwrite() when checking
| for the need for copy on write when flushing one of their own indirect
| blocks. This eliminates a deadlock between the syncer daemon and a
| process taking a snapshot.
|
| Ensure that softdep_process_worklist() can never block because of a
| snapshot being taken. This eliminates a problem with buffer starvation.
|
| Cleanup change in ffs_sync() which did not synchronously wait when
| MNT_WAIT was specified. The result was an unclean filesystem panic
| when doing forcible unmount with heavy filesystem I/O in progress.
|
| Return a zero'ed block when reading a block that was not in use at
| the time that a snapshot was taken. Normally, these blocks should
| never be read. However, the readahead code will occasionally read
| them which can cause unexpected behavior.
|
| Clean up the debugging code that ensures that no blocks be written
| on a filesystem while it is suspended. Snapshots must explicitly
| label the blocks that they are writing during the suspension so that
| they do not cause a `write on suspended filesystem' panic.
|
| Reorganize ffs_copyonwrite() to eliminate a deadlock and also to
| prevent a race condition that would permit the same block to be
| copied twice. This change eliminates an unexpected soft updates
| inconsistency in fsck caused by the double allocation.
|
| Use bqrelse rather than brelse for buffers that will be needed
| soon again by the snapshot code. This improves snapshot performance.
* Add snapshots to the fast filesystem. Most of the changes support | mckusick | 2000-07-11 | 1 | -4/+18
| the gating of system calls that cause modifications to the underlying
| filesystem. The gating can be enabled by any filesystem that needs
| to consistently suspend operations by adding the vop_stdgetwritemount
| to their set of vnops. Once gating is enabled, the function
| vfs_write_suspend stops all new write operations to a filesystem,
| allows any filesystem modifying system calls already in progress
| to complete, then sync's the filesystem to disk and returns. The
| function vfs_write_resume allows the suspended write operations to
| begin again. Gating is not added by default for all filesystems as
| for SMP systems it adds two extra locks to such critical kernel
| paths as the write system call. Thus, gating should only be added
| as needed.
|
| Details on the use and current status of snapshots in FFS can be
| found in /sys/ufs/ffs/README.snapshot so for brevity and timeliness
| are not included here. Unless and until you create a snapshot file,
| these changes should have no effect on your system (famous last words).
* Pull the rug out from under block mode devices. They return ENXIO on open(2) now. | phk | 2000-07-03 | 1 | -3/+3
|
* Virtualizes & untangles the bioops operations vector. | phk | 2000-06-16 | 1 | -3/+2
| Ref: Message-ID: <18317.961014572@critter.freebsd.dk> To: current@
* Before this commit, specfs reported disk partitions | jmb | 2000-06-12 | 1 | -1/+1
| using decimal major and minor numbers.
|
| "ls -l" reports disk partitions
| using decimal major numbers and hex minor numbers.
|
| Make specfs use decimal major numbers and hex minor numbers,
| just like "ls -l".
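The difference between the two conventions is easiest to see side by side. The log does not show the exact format string specfs uses, so the `"%u, %u"` and `"%u, 0x%x"` layouts below are assumptions chosen only to contrast decimal and hexadecimal minor numbers.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Old convention per the commit: decimal major, decimal minor.
 * The separator and spacing are illustrative assumptions. */
static int fmt_dev_old(char *buf, size_t len, unsigned major, unsigned minor)
{
    return snprintf(buf, len, "%u, %u", major, minor);
}

/* New convention per the commit: decimal major, hexadecimal minor. */
static int fmt_dev_new(char *buf, size_t len, unsigned major, unsigned minor)
{
    return snprintf(buf, len, "%u, 0x%x", major, minor);
}
```

For a minor number such as 0x20002 the old form yields `13, 131074`, while the new form yields `13, 0x20002`, which keeps any bit fields packed into the minor number readable.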
* Change the "bdev-whiner" to whine when open is attempted and extend | phk | 2000-05-09 | 1 | -0/+9
| the deadline a month.
* Separate the struct bio related stuff out of <sys/buf.h> into | phk | 2000-05-05 | 1 | -0/+1
| <sys/bio.h>.
|
| <sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall
| not be made a nested include according to bde's teachings on the
| subject of nested includes.
|
| Diskdrivers and similar stuff below specfs::strategy() should no
| longer need to include <sys/buf.h> unless they need caching of data.
|
| Still a few bogus uses of struct buf to track down.
|
| Repocopy by: peter
* Move B_ERROR flag to b_ioflags and call it BIO_ERROR. | phk | 2000-04-02 | 1 | -1/+1
| (Much of this done by script)
|
| Move B_ORDERED flag to b_ioflags and call it BIO_ORDERED.
|
| Move b_pblkno and b_iodone_chain to struct bio while we transition, they
| will be obsoleted once bio structs chain/stack.
|
| Add bio_queue field for struct bio aware disksort.
|
| Address a lot of stylistic issues brought up by bde.
* Rename the existing BUF_STRATEGY() to DEV_STRATEGY() | phk | 2000-03-20 | 1 | -3/+3
| substitute BUF_WRITE(foo) for VOP_BWRITE(foo->b_vp, foo)
| substitute BUF_STRATEGY(foo) for VOP_STRATEGY(foo->b_vp, foo)
|
| This patch is machine generated except for the ccd.c and buf.h parts.
* Remove B_READ, B_WRITE and B_FREEBUF and replace them with a new | phk | 2000-03-20 | 1 | -4/+4
| field in struct buf: b_iocmd. The b_iocmd is enforced to have
| exactly one bit set.
|
| B_WRITE was bogusly defined as zero giving rise to obvious coding
| mistakes.
|
| Also eliminate the redundant struct buf flag B_CALL, it can just
| as efficiently be done by comparing b_iodone to NULL.
|
| Should you get a panic or drop into the debugger, complaining about
| "b_iocmd", don't continue. It is likely to write on your disk
| where it should have been reading.
|
| This change is a step in the direction towards a stackable BIO capability.
|
| A lot of this patch was machine generated (Thanks to style(9) compliance!)
|
| Vinum users: Greg has not had time to test this yet, be careful.
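Two points in this commit lend themselves to a short illustration: why a flag defined as zero invites mistakes, and how an "exactly one bit set" rule can be checked. The constants below are hypothetical values picked for the sketch. They are not the kernel's definitions, and `iocmd_valid()` is an assumed helper, not a function from the source tree.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative old-style flags: "write" is merely the absence of the
 * read bit, so its value is zero. */
#define OLD_B_READ   0x1u
#define OLD_B_WRITE  0x0u

/* Illustrative new-style commands: each one is a distinct single bit. */
#define SK_BIO_READ   0x1u
#define SK_BIO_WRITE  0x2u
#define SK_BIO_DELETE 0x4u

/* The obvious-looking but broken test: masking with zero is always zero,
 * so this reports "not a write" for every buffer. */
static bool old_is_write_buggy(unsigned flags)
{
    return (flags & OLD_B_WRITE) != 0;
}

/* With single-bit commands the same style of test works as expected. */
static bool new_is_write(unsigned iocmd)
{
    return (iocmd & SK_BIO_WRITE) != 0;
}

/* "Exactly one bit set": non-zero and a power of two. */
static bool iocmd_valid(unsigned iocmd)
{
    return iocmd != 0 && (iocmd & (iocmd - 1)) == 0;
}
```

A check in the spirit of `iocmd_valid()` rejects both an unset command and a combination such as read plus write, which matches the commit's warning that a complaint about "b_iocmd" should be treated as fatal.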
* Eliminate the undocumented, experimental, non-delivering and highly | phk | 2000-03-16 | 1 | -2/+0
| dangerous MAX_PERF option.
* Give vn_isdisk() a second argument where it can return a suitable errno. | phk | 2000-01-10 | 1 | -5/+6
| Suggested by: bde
* Remove unused #includes. | phk | 1999-12-08 | 1 | -4/+0
| Obtained from: http://bogon.freebsd.dk/include
* Collect read and write counts for filesystems. This new code | mckusick | 1999-12-01 | 1 | -0/+21
| drops the counting in bwrite and puts it all in spec_strategy.
| I did some tests and verified that the counts collected for writes
| in spec_strategy are identical to the counts that we previously
| collected in bwrite. We now also get read counts (async reads
| come from requests for read-ahead blocks). Note that you need
| to compile a new version of mount to get the read counts printed out.
| The old mount binary is completely compatible, the only reason
| to install a new mount is to get the read counts printed.
|
| Submitted by: Craig A Soules <soules+@andrew.cmu.edu>
| Reviewed by: Kirk McKusick <mckusick@mckusick.com>
* Next step in the device cleanup process. | phk | 1999-11-09 | 1 | -321/+78
| Correctly lock vnodes when calling VOP_OPEN() from filesystem mount code.
|
| Unify spec_open() for bdev and cdev cases.
|
| Remove the disabled bdev specific read/write code.
* Oops, a bit too hasty there. | phk | 1999-11-08 | 1 | -3/+0
|
* Various cleanups. | phk | 1999-11-08 | 1 | -27/+14
|
* Use vop_panic() instead of spec_badop(). | phk | 1999-11-07 | 1 | -23/+11
|
* Remove the iskmemdev() function. Make it the responsibility of the mem.c | phk | 1999-11-07 | 1 | -3/+1
| drivers to enforce the securelevel checks.