path: root/sys/kern/vfs_export.c
Commit message | Author | Age | Files | Lines
* o Restructure vaccess() so as to check for DAC permission to modify the | rwatson | 2000-08-29 | 1 | -40/+92
|   object before falling back on privilege. Make vaccess() accept an additional optional argument, privused, to determine whether privilege was required for vaccess() to return 0. Add commented out capability checks for reference. Rename some variables to make it more clear which modes/uids/etc are associated with the object, and which with the access mode.
| o Update file system use of vaccess() to pass NULL as the optional privused argument. Once additional patches are applied, suser() will no longer set ASU, so privused will permit passing of privilege information up the stack to the caller.
| Reviewed by: bde, green, phk, -security, others
| Obtained from: TrustedBSD Project
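The control flow this entry describes (try the discretionary permission bits first, fall back on privilege only if they are insufficient, and report through `privused` whether privilege was needed) can be sketched in userspace C. This is a simplified model, not the kernel's vaccess(): the names `sk_vaccess`, `struct sk_cred`, and the `SK_*` bits are hypothetical, and the single `is_superuser` flag stands in for whatever privilege check the kernel performs.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical access-mode bits, laid out like one rwx permission triplet. */
#define SK_READ  04
#define SK_WRITE 02
#define SK_EXEC  01

/* Hypothetical, simplified credential; not the kernel's struct ucred. */
struct sk_cred {
    unsigned int uid;
    unsigned int gid;
    int is_superuser;
};

/*
 * Returns 0 if access is allowed, EACCES otherwise. If privused is not
 * NULL, it is set to 1 only when the DAC bits alone were insufficient
 * and privilege was what allowed the access.
 */
int
sk_vaccess(unsigned int file_mode, unsigned int file_uid,
    unsigned int file_gid, unsigned int acc_mode,
    const struct sk_cred *cred, int *privused)
{
    unsigned int dac_granted;

    if (privused != NULL)
        *privused = 0;

    /* Pick the owner, group, or other triplet of the object's mode. */
    if (cred->uid == file_uid)
        dac_granted = (file_mode >> 6) & 07;
    else if (cred->gid == file_gid)
        dac_granted = (file_mode >> 3) & 07;
    else
        dac_granted = file_mode & 07;

    /* DAC check first: no privilege needed if the bits suffice. */
    if ((dac_granted & acc_mode) == acc_mode)
        return (0);

    /* Only now fall back on privilege, and record that it was used. */
    if (cred->is_superuser) {
        if (privused != NULL)
            *privused = 1;
        return (0);
    }
    return (EACCES);
}
```

Callers that do not care whether privilege was exercised can pass NULL for `privused`, which matches the second bullet of the entry.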
* Fix typo in last commit. | phk | 2000-08-20 | 1 | -2/+1
|
* Centralize the canonical vop_access user/group/other check in vaccess(). | phk | 2000-08-20 | 1 | -0/+54
| Discussed with: bde
* This patch corrects the first round of panics and hangs reported | mckusick | 2000-07-24 | 1 | -3/+33
| with the new snapshot code.
| Update addaliasu to correctly implement the semantics of the old checkalias function. When a device vnode first comes into existence, check to see if an anonymous vnode for the same device was created at boot time by bdevvp(). If so, adopt the bdevvp vnode rather than creating a new vnode for the device. This corrects a problem which caused the kernel to panic when taking a snapshot of the root filesystem.
| Change the calling convention of vn_write_suspend_wait() to be the same as vn_start_write().
| Split out softdep_flushworklist() from softdep_flushfiles() so that it can be used to clear the work queue when suspending filesystem operations.
| Access to buffers becomes recursive so that snapshots can recursively traverse their indirect blocks using ffs_copyonwrite() when checking for the need for copy on write when flushing one of their own indirect blocks. This eliminates a deadlock between the syncer daemon and a process taking a snapshot.
| Ensure that softdep_process_worklist() can never block because of a snapshot being taken. This eliminates a problem with buffer starvation.
| Cleanup change in ffs_sync() which did not synchronously wait when MNT_WAIT was specified. The result was an unclean filesystem panic when doing forcible unmount with heavy filesystem I/O in progress.
| Return a zero'ed block when reading a block that was not in use at the time that a snapshot was taken. Normally, these blocks should never be read. However, the readahead code will occasionally read them which can cause unexpected behavior.
| Clean up the debugging code that ensures that no blocks be written on a filesystem while it is suspended. Snapshots must explicitly label the blocks that they are writing during the suspension so that they do not cause a `write on suspended filesystem' panic.
| Reorganize ffs_copyonwrite() to eliminate a deadlock and also to prevent a race condition that would permit the same block to be copied twice. This change eliminates an unexpected soft updates inconsistency in fsck caused by the double allocation.
| Use bqrelse rather than brelse for buffers that will be needed soon again by the snapshot code. This improves snapshot performance.
* Add snapshots to the fast filesystem. Most of the changes support | mckusick | 2000-07-11 | 1 | -3/+26
| the gating of system calls that cause modifications to the underlying filesystem. The gating can be enabled by any filesystem that needs to consistently suspend operations by adding the vop_stdgetwritemount to their set of vnops. Once gating is enabled, the function vfs_write_suspend stops all new write operations to a filesystem, allows any filesystem modifying system calls already in progress to complete, then syncs the filesystem to disk and returns. The function vfs_write_resume allows the suspended write operations to begin again. Gating is not added by default for all filesystems as for SMP systems it adds two extra locks to such critical kernel paths as the write system call. Thus, gating should only be added as needed.
| Details on the use and current status of snapshots in FFS can be found in /sys/ufs/ffs/README.snapshot so for brevity and timeliness are not included here.
| Unless and until you create a snapshot file, these changes should have no effect on your system (famous last words).
* Fix support for more than 256 simultaneous mounts. Theoretical limit | bp | 2000-07-07 | 1 | -2/+4
| is 2^16 mounts per fs type.
| Reported by: Troy Arie Cobb <tcobb@staff.circle.net> via phk
| Reviewed by: bde
* Previous commit changing SYSCTL_HANDLER_ARGS violated KNF. | phk | 2000-07-04 | 1 | -3/+3
| Pointed out by: bde
* Simplify and rationalise the management of the vnode free list | mckusick | 2000-07-04 | 1 | -74/+30
| (preparing the code to add snapshots).
* If a buffer flush fails when trying to reclaim a vnode, it is too | mckusick | 2000-07-04 | 1 | -4/+10
| late to save the vnode, so just toss any remaining unwritten buffers rather than leaving them lying around to make trouble in the future.
* Make the two calls from kern/* into softupdates #ifdef SOFTUPDATES, | phk | 2000-07-03 | 1 | -0/+3
| that is way cleaner than using the softupdates_stub stunt, which should be killed when convenient.
| Discussed with: mckusick
* Style police catches up with rev 1.26 of src/sys/sys/sysctl.h: | phk | 2000-07-03 | 1 | -4/+4
| Sanitize SYSCTL_HANDLER_ARGS so that simplistic tools can grog our sources:
|     -sysctl_vm_zone SYSCTL_HANDLER_ARGS
|     +sysctl_vm_zone (SYSCTL_HANDLER_ARGS)
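The diff in this entry suggests that the macro previously carried its own parentheses, so a handler definition did not look like a function definition to simple source tools. A minimal sketch of the newer convention, where the macro expands to a bare parameter list and the parentheses are written at the definition site, follows. `SK_HANDLER_ARGS` and `sk_bump` are hypothetical stand-ins, and the parameter list is invented for illustration; it is not the real SYSCTL_HANDLER_ARGS expansion.

```c
/*
 * Hypothetical stand-in for a handler-arguments macro: it expands to a
 * bare parameter list, so the definition below still has visible
 * parentheses and reads as an ordinary function definition.
 */
#define SK_HANDLER_ARGS int oid, int *value, int delta

/* Adds delta to *value; oid is accepted only to fill out the signature. */
int
sk_bump(SK_HANDLER_ARGS)
{
    (void)oid;
    *value += delta;
    return (0);
}
```

Written this way there is no space between the function name and the opening parenthesis, which is the KNF point raised by the follow-up commit of 2000-07-04.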
* Move prtactive to vfs from ufs. It is used all over the place. | phk | 2000-06-27 | 1 | -0/+1
|
* Virtualizes & untangles the bioops operations vector. | phk | 2000-06-16 | 1 | -2/+1
| Ref: Message-ID: <18317.961014572@critter.freebsd.dk> To: current@
* Back out the previous change to the queue(3) interface. | jake | 2000-05-26 | 1 | -4/+4
| It was not discussed and should probably not happen.
| Requested by: msmith and others
* Change the way that the queue(3) structures are declared; don't assume that | jake | 2000-05-23 | 1 | -4/+4
| the type argument to *_HEAD and *_ENTRY is a struct.
| Suggested by: phk
| Reviewed by: phk
| Approved by: mdodd
* Fix the rootmount code for now. | asmodai | 2000-05-14 | 1 | -1/+2
| This function will probably be rewritten/renamed to devpp.
| Submitted by: Assar Westerlund <assar@sics.se> on -current
| Confirmed to work: Steinar Haug <sthaug@nethelp.no>, Manfred Antar <mantar@pacbell.net>
| Reviewed by: phk
* Separate the struct bio related stuff out of <sys/buf.h> into | phk | 2000-05-05 | 1 | -0/+1
| <sys/bio.h>.
| <sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall not be made a nested include according to bde's teachings on the subject of nested includes.
| Disk drivers and similar stuff below specfs::strategy() should no longer need to include <sys/buf.h> unless they need caching of data.
| Still a few bogus uses of struct buf to track down.
| Repocopy by: peter
* Rename the existing BUF_STRATEGY() to DEV_STRATEGY() | phk | 2000-03-20 | 1 | -3/+3
| substitute BUF_WRITE(foo) for VOP_BWRITE(foo->b_vp, foo)
| substitute BUF_STRATEGY(foo) for VOP_STRATEGY(foo->b_vp, foo)
| This patch is machine generated except for the ccd.c and buf.h parts.
* In vn_isdisk(), check whether vp->v_rdev is NULL. If it is, then | chris | 2000-03-18 | 1 | -0/+5
| return ENXIO (Device not configured). Without this, vn_isdisk() could (and did in the case of lstat() under fdesc) pass a NULL pointer to devsw(), which caused a page fault.
| Reviewed by: alfred
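The guard described here, together with the errno out-parameter that the 2000-01-10 entry further down adds to vn_isdisk(), can be modelled as below. This is a userspace sketch under stated assumptions: `struct sk_vnode`, the `SK_V*` type constants, and `sk_isdisk` are hypothetical, and the use of ENOTBLK for a non-device vnode is an illustrative choice rather than something the log confirms. Only the ENXIO result for a NULL device pointer comes from the entry itself.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical vnode types and a cut-down vnode; not the kernel structures. */
enum sk_vtype { SK_VREG, SK_VBLK, SK_VCHR };

struct sk_vnode {
    enum sk_vtype type;
    const void *rdev;   /* device pointer; may be NULL if not configured */
};

/*
 * Returns 1 if vp looks like a usable disk device, 0 otherwise.
 * If errp is not NULL it receives a suitable errno (0 on success).
 */
int
sk_isdisk(const struct sk_vnode *vp, int *errp)
{
    int error = 0;

    if (vp->type != SK_VBLK && vp->type != SK_VCHR)
        error = ENOTBLK;        /* assumption: not a device node at all */
    else if (vp->rdev == NULL)
        error = ENXIO;          /* the NULL check this commit adds */

    if (errp != NULL)
        *errp = error;
    return (error == 0);
}
```

Checking the pointer before handing it to any lookup routine is the whole fix: the caller gets an error code instead of a fault.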
* Eliminate the undocumented, experimental, non-delivering and highly | phk | 2000-03-16 | 1 | -6/+0
| dangerous MAX_PERF option.
* Don't try so hard to make the lower 16 bits of fsids unique. It tended | bde | 2000-03-14 | 1 | -22/+13
| to recycle full fsids after only 16 mount/unmount's. This is probably too often for exported fsids. Now we recycle the full fsids only after 2^16 mount/umount's and only ensure uniqueness in the lower 16 bits if there have been <= 256 calls to vfs_getnewfsid() since the system started.
* Try harder to make the lower 16 bits of fsids unique. The vfs type | bde | 2000-03-12 | 1 | -15/+25
| number was packed very wastefully, giving perfect non-uniqueness in the lower 16 bits of fsids for filesystems with the same vfs type. This made linux_stat() return perfectly non-unique (broken) 16-bit st_dev's for nfs mount points, and effectively reduced mntid_base to 8 bits so that the vfs_getnewfsid() looped endlessly when there are already 256 mounted filesystems with the required vfs type.
| Approved by: jkh
* Do refcounting of open devices (more) correctly. | sos | 2000-02-07 | 1 | -0/+16
| count_dev function by phk.
* Remove static qualifier from vgonel, as it is needed by the Arla folk | rwatson | 2000-02-02 | 1 | -2/+1
| outside of vfs_subr.c.
| Submitted by: Assar Westerlund <assar@sics.se>
| Reviewed by: rwatson
| Approved by: jkh
* This patch fixes a locking bug that can result in deadlock if | rwatson | 2000-01-29 | 1 | -2/+17
| the codepath is followed. From the PR:
|     vclean calls vrele leading to deadlock (if usecount > 0)
|     vclean() calls vrele() if v_usecount of the node was higher than one. But before calling it, it sets the VXLOCK flag, which will make vn_lock called from vrele dead-lock.
| PR: kern/15117
| Submitted by: Assar Westerlund <assar@stacken.kth.se>
| Reviewed by: rwatson
| Obtained from: NetBSD
* Give vn_isdisk() a second argument where it can return a suitable errno. | phk | 2000-01-10 | 1 | -6/+18
| Suggested by: bde
* Remove the P_BUFEXHAUST flag from the syncer process (leaving | mckusick | 2000-01-10 | 1 | -2/+0
| it only on the buf_daemon process). The problem is that when the syncer process starts running the worklist, it wants to delete lots of files. It does this by VFS_VGET'ing the vnodes, clearing the blocks in them and bdwrite'ing the buffer. It can process close to a thousand files per second which generates a large number of dirty buffers. So, giving it special privilege at the buffer trough leads to trouble as the buf_daemon does occasionally need a free buffer to proceed and if the syncer has used every last one up, we are toast.
* Change NDFREE() from a macro to a function for the time being; the macro | eivind | 2000-01-08 | 1 | -0/+34
| version caused intolerable bloat (30k). I'm likely to revisit this with an attempt at a smarter macro.
| Bloat noticed by: bde
* Introduce a mechanism to suspend/resume system processes. Suspend syncer | luoqi | 2000-01-07 | 1 | -7/+14
| and bufdaemon prior to disk sync during system shutdown.
* Enhance reassignbuf(). When a buffer cannot be time-optimally inserted | dillon | 2000-01-05 | 1 | -2/+19
| into vnode dirtyblkhd we append it to the list instead of prepend it to the list in order to maintain a 'forward' locality of reference, which is arguably better than 'reverse'. The original algorithm did things this way too but at a huge time cost.
| Enhance the append interlock for NFS writes to handle intr/soft mounts better.
| Fix the hysteresis for NFS async daemon I/O requests to reduce the number of unnecessary context switches.
| Modify handling of NFS mount options. Any given user option that is too high now defaults to the kernel maximum for that option rather than the kernel default for that option.
| Reviewed by: Alfred Perlstein <bright@wintelcom.net>
* Prettiness police: Identify flags in b_xflags with BX_ to distinguish | mckusick | 1999-12-22 | 1 | -17/+19
| them from flags in b_flags which are prefixed with B_
* Add MAP_NOSYNC feature to mmap(), and MADV_NOSYNC and MADV_AUTOSYNC to | dillon | 1999-12-12 | 1 | -1/+1
| madvise().
| This feature prevents the update daemon from gratuitously flushing dirty pages associated with a mapped file-backed region of memory. The system pager will still page the memory as necessary and the VM system will still be fully coherent with the filesystem. Modifications made by other means to the same area of memory, for example by write(), are unaffected. The feature works on a page-granularity basis.
| MAP_NOSYNC allows one to use mmap() to share memory between processes without incurring any significant filesystem overhead, putting it in the same performance category as SysV Shared memory and anonymous memory.
| Reviewed by: julian, alc, dg
* Lock reporting and assertion changes. | eivind | 1999-12-11 | 1 | -3/+3
| * lockstatus() and VOP_ISLOCKED() get a new process argument and a new return value: LK_EXCLOTHER, when the lock is held exclusively by another process.
| * The ASSERT_VOP_(UN)LOCKED family is extended to use what this gives them
| * Extend the vnode_if.src format to allow more exact specification than locked/unlocked.
| This commit should not do any semantic changes unless you are using DEBUG_VFS_LOCKS.
| Discussed with: grog, mch, peter, phk
| Reviewed by: peter
* Remove vfs_getrootfsid() function (a temporary hack added a few months | dillon | 1999-11-29 | 1 | -17/+0
| ago to make BOOTP work again). It is no longer required by BOOTP and no longer used.
* Convert various pieces of code to use vn_isdisk() rather than checking | phk | 1999-11-22 | 1 | -3/+4
| for vp->v_type == VBLK.
| In ccd: we don't need to call VOP_GETATTR to find the type of a vnode.
| Reviewed by: sos
* struct mountlist and struct mount.mnt_list have no business being | phk | 1999-11-20 | 1 | -14/+14
| a CIRCLEQ. Change them to TAILQ_HEAD and TAILQ_ENTRY respectively.
| This removes ugly mp != (void*)&mountlist comparisons.
| Requested by: phk
| Submitted by: Jake Burkholder <jake@checker.org>
| PR: 14967
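The comparison this entry calls ugly arises because a circular queue terminates at its own head, so a traversal has to compare each element pointer against the head cast to the element type. A tail queue terminates at NULL, and the queue(3) macros hide the traversal entirely. The sketch below shows the TAILQ form in userspace C, assuming a `<sys/queue.h>` that provides TAILQ_FOREACH (the BSDs and glibc do); `struct sk_mount`, `sk_mountlist`, and `sk_count_mounts` are hypothetical names, not the kernel's mount structures.

```c
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical mount record linked on a tail queue. */
struct sk_mount {
    int id;
    TAILQ_ENTRY(sk_mount) link;     /* forward/backward linkage */
};

/* Declares struct sk_mountlist, the head type for the queue. */
TAILQ_HEAD(sk_mountlist, sk_mount);

/*
 * Counts the entries on the list. The loop ends when the iterator
 * becomes NULL, so no cast of the list head is needed.
 */
int
sk_count_mounts(struct sk_mountlist *head)
{
    struct sk_mount *mp;
    int n = 0;

    TAILQ_FOREACH(mp, head, link)
        n++;
    return (n);
}
```

A caller would initialise the head with TAILQ_INIT() and add records with TAILQ_INSERT_TAIL(); going through the macros rather than the head and entry fields directly is also what the PR 14914 entry below asks of code in sys/kern.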
* Commit the remaining part of PR14914: | phk | 1999-11-16 | 1 | -19/+18
| A lot of the code in sys/kern directly accesses the *Q_HEAD and *Q_ENTRY structures for list operations. This patch makes all list operations in sys/kern use the queue(3) macros, rather than directly accessing the *Q_{HEAD,ENTRY} structures.
| Reviewed by: phk
| Submitted by: Jake Burkholder <jake@checker.org>
| PR: 14914
* Next step in the device cleanup process. | phk | 1999-11-09 | 1 | -1/+1
| Correctly lock vnodes when calling VOP_OPEN() from filesystem mount code.
| Unify spec_open() for bdev and cdev cases.
| Remove the disabled bdev specific read/write code.
* useracc() the prequel: | phk | 1999-10-29 | 1 | -1/+0
| Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs.
| This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE} as argument.
* Trim unused options (or #ifdef for undoc options). | peter | 1999-10-11 | 1 | -1/+0
| Submitted by: phk
* Move the buffered read/write code out of spec_{read|write} and into | phk | 1999-10-04 | 1 | -3/+0
| two new functions spec_buf{read|write}.
| Add sysctl vfs.bdev_buffered which defaults to 1 == true. This sysctl can be used to experimentally turn buffered behaviour for bdevs off. It should not be changed while any block devices are open.
| Remove the misplaced sysctl vfs.enable_userblk_io.
| No other changes in behaviour.
* Remove v_maxio from struct vnode. | phk | 1999-09-29 | 1 | -1/+1
| Replace it with mnt_iosize_max in struct mount.
| Nits from: bde
* Final commit to remove vnode->v_lastr. vm_fault now handles read | dillon | 1999-09-21 | 1 | -1/+0
| clustering issues (replacing code that used to be in ufs/ufs/ufs_readwrite.c). vm_fault also now uses the new VM page counter inlines.
| This completes the changeover from vnode->v_lastr to vm_entry_t->v_lastr for VM, and fp->f_nextread and fp->f_seqcount (which have been in the tree for a while). Determination of the I/O strategy (sequential, random, and so forth) is now handled on a descriptor-by-descriptor basis for base I/O calls, and on a memory-region-by-memory-region and process-by-process basis for VM faults.
| Reviewed by: David Greenman <dg@root.com>, Alan Cox <alc@cs.rice.edu>
* Initialize vp->v_maxio to its default in getnewvnode() rather than | phk | 1999-09-20 | 1 | -1/+1
| four different places in vfs_cluster.c
* Fix BOOTP root FS mounts. Also cleanup vfs_getnewfsid() and collapse | dillon | 1999-09-19 | 1 | -21/+37
| addaliasu() into addalias() (no operational change) and clarify comments relating to a trick that vclean() uses.
| The fix to BOOTP is yet another hack. Actually, rootfsid handling is already a major hack. The whole thing needs to be cleaned up.
| Reviewed by: David Greenman <dg@root.com>, Alan Cox <alc@cs.rice.edu>
* Add vfs.enable_userblk_io sysctl to control whether user reads and writes | dillon | 1999-09-17 | 1 | -0/+3
| to buffered block devices are allowed. The default is to be backwards compatible, i.e. reads and writes are allowed.
| The idea is for a larger crowd to start running with this disabled and see what problems, if any, crop up, and then to change the default to off and see if any problems crop up in the next 6 months prior to potentially removing support entirely. There are still a few people, Julian and myself included, who believe the buffered block device access from usermode to be useful.
| Remove use of vnode->v_lastr from buffered block device I/O in preparation for removal of vnode->v_lastr field, replacing it with the already existing seqcount metric to detect sequential operation.
| Reviewed by: Alan Cox <alc@cs.rice.edu>, David Greenman <dg@root.com>
* Add dev_t freeing code. Controlled by sysctl debug.free_devt, default | phk | 1999-08-29 | 1 | -1/+2
| is off.
* remove unused variables. | phk | 1999-08-28 | 1 | -1/+1
|
* $Id$ -> $FreeBSD$ | peter | 1999-08-28 | 1 | -1/+1
|
* Simplify the handling of VCHR and VBLK vnodes using the new dev_t: | phk | 1999-08-26 | 1 | -218/+57
| Make the alias list a SLIST.
| Drop the "fast recycling" optimization of vnodes (including the returning of a preexisting but stale vnode from checkalias). It doesn't buy us anything now that we don't hardlimit vnodes anymore.
| Rename checkalias2() and checkalias() to addalias() and addaliasu() - which takes dev_t and udev_t arg respectively.
| Make the revoke syscalls use vcount() instead of VALIASED.
| Remove VALIASED flag, we don't need it now and it is faster to traverse the much shorter lists than to maintain the flag.
| vfs_mountedon() can check the dev_t directly, all the vnodes point to the same one.
| Print the devicename in specfs/vprint().
| Remove a couple of stale LFS vnode flags.
| Remove unimplemented/unused LK_DRAINED;