summaryrefslogtreecommitdiffstats
path: root/sys/ufs/ufs
Commit message (Collapse)AuthorAgeFilesLines
* o Introduce new VOP_ACCESS() flag VADMIN, allowing file systems to performrwatson2000-10-192-26/+48
| | | | | | | | | | | | | | | | | | | | "administrative" authorization checks. In most cases, the VADMIN test checks to make sure the credential effective uid is the same as the file owner. o Modify vaccess() to set VADMIN as an available right if the uid is appropriate. o Modify references to uid-based access control operations such that they now always invoke VOP_ACCESS() instead of using hard-coded policy checks. o This allows alternative UFS policies to be implemented by replacing only ufs_access() (such as mandatory system policies). o VOP_ACCESS() requires the caller to hold an exclusive vnode lock on the vnode: I believe that new invocations of VOP_ACCESS() are always called with the lock held. o Some direct checks of the uid remain, largely associated with the QUOTA and SUIDDIR code. Reviewed by: eivind Obtained from: TrustedBSD Project
* o Sanity check was inverted, resulting in a possible spurious panicrwatson2000-10-091-1/+1
| | | | | during unmount if extended attributes were in use. Correct by removing an unneeded (and undesirable) '!'.
* o Correct use of lockdestroy() by adding a new ufs_extattr_uepm_destroy()rwatson2000-10-042-2/+25
| | | | | | | | call, which should be the last thing down to a per-mount extattr management structure, after ufs_extattr_stop() on the file system. This currently has the effect only of destroying the per-mount lock on extended attributes, and clearing appropriate flags. o Remove inappropriate invocation in ufs_extattr_vnode_inactive().
* Convert lockmgr locks from using simple locks to using mutexes.jasone2000-10-045-9/+15
| | | | | | Add lockdestroy() and appropriate invocations, which corresponds to lockinit() and must be called to clean up after a lockmgr lock is no longer needed.
* Add a lock structure to vnode structure. Previously it was either allocatedbp2000-09-253-3/+2
| | | | | | | | | | | | | | | | | | | separately (nfs, cd9660 etc) or keept as a first element of structure referenced by v_data pointer(ffs). Such organization leads to known problems with stacked filesystems. From this point vop_no*lock*() functions maintain only interlock lock. vop_std*lock*() functions maintain built-in v_lock structure using lockmgr(). vop_sharedlock() is compatible with vop_stdunlock(), but maintains a shared lock on vnode. If filesystem wishes to export lockmgr compatible lock, it can put an address of this lock to v_vnlock field. This indicates that the upper filesystem can take advantage of it and use single lock structure for entire (or part) of stack of vnodes. This field shouldn't be examined or modified by VFS code except for initialization purposes. Reviewed in general by: mckusick
* o Disallow privileged processes in jail() from directly accessingrwatson2000-09-181-1/+9
| | | | | | | | system namespace extended attributes. o Document privilege/jail() interaction relating to extended attributes. Obtained from: TrustedBSD Project
* o Allow privileged processes in jail() to override sticky bit behaviorrwatson2000-09-181-2/+2
| | | | | | | | | | | on directories. o Allow privileged processes in jail() to create inodes with the setgid bit set even if they are not a member of the group denoted by the file creation gid. This occurs due to inherited gid's from parent directories on file creation, allowing a user to create a file with a gid that is not in the creating process's credentials. Obtained from: TrustedBSD Project
* o Add a comment clarifying interaction between jail(), privileged processes,rwatson2000-09-181-0/+5
| | | | | | | | | | | | | | and UFS file flags. Here's what the comment says, for reference: Privileged processes in jail() are permitted to modify arbitrary user flags on files, but are not permitted to modify system flags. In other words, privilege does allow a process in jail to modify user flags for objects that the process does not own, but privilege will not permit the setting of system flags on the file. Obtained from: TrustedBSD Project
* o Add missing PRISON_ROOT allowing a privileged process in a jail() to notrwatson2000-09-181-1/+1
| | | | | | | | remove the setuid/setgid bits by virtue of a change to a file with those bits set, even if the process doesn't own the file, or isn't a group member of the file's gid. Obtained from: TrustedBSD Project
* o Substitute suser() calls for direct credential checks, which is nowrwatson2000-09-184-8/+10
| | | | | | | | | | | | | | safe as suser() no longer sets ASU. o Note that in some cases, the PRISON_ROOT flag is used even though no process structure is passed, to indicate that if a process structure (and hence jail) was available, it would be ok. In the long run, the jail identifier should probably be moved to ucred, as the uidinfo information was. o Some uid 0 checks remain relating to the quota code, which I'll leave for another day. Reviewed by: phk, eivind Obtained from: TrustedBSD Project
* Add new flag PDIRUNLOCK to the component.cn_flags which should be set bybp2000-09-171-13/+30
| | | | | | | | | | | | | | | | | | filesystem lookup() routine if it unlocks parent directory. This flag should be carefully tracked by filesystems if they want to work properly with nullfs and other stacked filesystems. VFS takes advantage of this flag to perform symantically correct usage of vrele() instead of vput() if parent directory already unlocked. If filesystem fails to track this flag then previous codepath in VFS left unchanged. Convert UFS code to set PDIRUNLOCK flag if necessary. Other filesystmes will be changed after some period of testing. Reviewed in general by: mckusick, dillon, adrian Obtained from: NetBSD
* Remove a pointless casting of a gid_t to a gid_t.phk2000-09-161-1/+1
|
* o Variety of extended attribute fixesrwatson2000-09-121-26/+39
| | | | | | | | | | | | | | | | | | | | | | | - In ufs_extattr_enable(), return EEXIST instead of EOPNOTSUPP if the caller tries to configure an attribute name that is already configured - Throughout, add IO_NODELOCKED to VOP_{READ,WRITE} calls to indicate lock status of passed vnode. Apparently not a problem, but worth fixing. - For all writes, make use of IO_SYNC consistent. Really, IO_UNIT and combining of VOP_WRITE's should happen, but I don't have that tested. At least with this, it's consistent usage. (pointed out by: bde) - In ufs_extattr_get(), fixed nested locking of backing vnode (fine due to recursive lock support, but make it more consistent with other code) - In ufs_extattr_get(), clean up return code to set uio_resid more consistently with other pieces of code (worked fine, this is just a cleanup) - Fix ufs_extattr_rm(), which was broken--effectively a nop. - Minor comment and whitespace fixes. Obtained from: TrustedBSD Project
* Fix a 64-bitism. Use size_t instead of int for 4th argument to copyinstr.jhb2000-09-111-1/+2
| | | | Approved by: rwatson
* Major update to the way synchronization is done in the kernel. Highlightsjasone2000-09-071-1/+1
| | | | | | | | | | | | | | | include: * Mutual exclusion is used instead of spl*(). See mutex(9). (Note: The alpha port is still in transition and currently uses both.) * Per-CPU idle processes. * Interrupts are run in their own separate kernel threads and can be preempted (i386 only). Partially contributed by: BSDi (BSD/OS) Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh
* Modify extended attribute protection model to authorize based onrwatson2000-09-022-43/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | attribute namespace and DAC protection on file: - Attribute names beginning with '$' are in the system namespace - The attribute name "$" is reserved - System namespace attributes may only be read/set by suser() or by kernel (cred == NULL) - Other attribute names are in the application namespace - The attribute name "" is reserved - Application namespace attributes are protected in the manner of the target file permission o Kernel changes - Add ufs_extattr_valid_attrname() to check whether the requested attribute "set" or "enable" is appropriate (i.e., non-reserved) - Modify ufs_extattr_credcheck() to accept target file vnode, not to take inode uid - Modify ufs_extattr_credcheck() to check namespace, then enforce either kernel/suser for system namespace, or vaccess() for application namespace o EA backing file format changes - Remove permission fields from extended attribute backing file header - Bump extended attribute backing file header version to 3 o Update extattrctl.c and extattrctl.8 - Remove now deprecated -r and -w arguments to initattr, as permissions are now implicit - (unrelated) fix error reporting and unlinking during failed initattr to remove duplicate/inaccurate error messages, and to only unlink if the failure wasn't in the backing file open() Obtained from: TrustedBSD Project
* o Restructure vaccess() so as to check for DAC permission to modify therwatson2000-08-291-1/+1
| | | | | | | | | | | | | | | | object before falling back on privilege. Make vaccess() accept an additional optional argument, privused, to determine whether privilege was required for vaccess() to return 0. Add commented out capability checks for reference. Rename some variables to make it more clear which modes/uids/etc are associated with the object, and which with the access mode. o Update file system use of vaccess() to pass NULL as the optional privused argument. Once additional patches are applied, suser() will no longer set ASU, so privused will permit passing of privilege information up the stack to the caller. Reviewed by: bde, green, phk, -security, others Obtained from: TrustedBSD Project
* o Correct spelling of ufs_exttatr_find_attr -> ufs_extattr_find_attrrwatson2000-08-262-22/+22
| | | | | | | o Add "const" qualifier to attrname argument of various calls to remove warnings Obtained from: TrustedBSD Project
* Centralize the canonical vop_access user/group/other check in vaccess().phk2000-08-201-41/+3
| | | | Discussed with: bde
* Minor tweak - removed unused variable 'struct mount *mp';peter2000-07-281-1/+0
|
* Clean up the snapshot code so that it no longer depends on the use ofmckusick2000-07-262-4/+28
| | | | | | | | | | | | | | the SF_IMMUTABLE flag to prevent writing. Instead put in explicit checking for the SF_SNAPSHOT flag in the appropriate places. With this change, it is now possible to rename and link to snapshot files. It is also possible to set or clear any of the owner, group, or other read bits on the file, though none of the write or execute bits can be set. There is also an explicit test to prevent the setting or clearing of the SF_SNAPSHOT flag via chflags() or fchflags(). Note also that the modify time cannot be changed as it needs to accurately reflect the time that the snapshot was taken. Submitted by: Robert Watson <rwatson@FreeBSD.org>
* This patch corrects the first round of panics and hangs reportedmckusick2000-07-244-6/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | with the new snapshot code. Update addaliasu to correctly implement the semantics of the old checkalias function. When a device vnode first comes into existence, check to see if an anonymous vnode for the same device was created at boot time by bdevvp(). If so, adopt the bdevvp vnode rather than creating a new vnode for the device. This corrects a problem which caused the kernel to panic when taking a snapshot of the root filesystem. Change the calling convention of vn_write_suspend_wait() to be the same as vn_start_write(). Split out softdep_flushworklist() from softdep_flushfiles() so that it can be used to clear the work queue when suspending filesystem operations. Access to buffers becomes recursive so that snapshots can recursively traverse their indirect blocks using ffs_copyonwrite() when checking for the need for copy on write when flushing one of their own indirect blocks. This eliminates a deadlock between the syncer daemon and a process taking a snapshot. Ensure that softdep_process_worklist() can never block because of a snapshot being taken. This eliminates a problem with buffer starvation. Cleanup change in ffs_sync() which did not synchronously wait when MNT_WAIT was specified. The result was an unclean filesystem panic when doing forcible unmount with heavy filesystem I/O in progress. Return a zero'ed block when reading a block that was not in use at the time that a snapshot was taken. Normally, these blocks should never be read. However, the readahead code will occationally read them which can cause unexpected behavior. Clean up the debugging code that ensures that no blocks be written on a filesystem while it is suspended. Snapshots must explicitly label the blocks that they are writing during the suspension so that they do not cause a `write on suspended filesystem' panic. Reorganize ffs_copyonwrite() to eliminate a deadlock and also to prevent a race condition that would permit the same block to be copied twice. This change eliminates an unexpected soft updates inconsistency in fsck caused by the double allocation. Use bqrelse rather than brelse for buffers that will be needed soon again by the snapshot code. This improves snapshot performance.
* o Marius pointed out an unusually inconvenient upper bound on extendedrwatson2000-07-141-1/+0
| | | | | | | | | attribute data size. o Fortunately it turned out to be an unused constant left over from an earlier implementation, and is therefore being removed so as not to confuse casual observers. Submitted by: mbendiks@eunet.no
* Add snapshots to the fast filesystem. Most of the changes supportmckusick2000-07-116-9/+33
| | | | | | | | | | | | | | | | | | | | the gating of system calls that cause modifications to the underlying filesystem. The gating can be enabled by any filesystem that needs to consistently suspend operations by adding the vop_stdgetwritemount to their set of vnops. Once gating is enabled, the function vfs_write_suspend stops all new write operations to a filesystem, allows any filesystem modifying system calls already in progress to complete, then sync's the filesystem to disk and returns. The function vfs_write_resume allows the suspended write operations to begin again. Gating is not added by default for all filesystems as for SMP systems it adds two extra locks to such critical kernel paths as the write system call. Thus, gating should only be added as needed. Details on the use and current status of snapshots in FFS can be found in /sys/ufs/ffs/README.snapshot so for brevity and timelyness is not included here. Unless and until you create a snapshot file, these changes should have no effect on your system (famous last words).
* Finish repo-copy:phk2000-07-101-409/+0
| | | | | | Move ufs/ufs/ufs_disksubr.c to kern/subr_disklabel.c. These functions are not UFS specific and are in fact used all over the place.
* Move the truncation code out of vn_open and into the open system callmckusick2000-07-042-4/+6
| | | | | | | | | | after the acquisition of any advisory locks. This fix corrects a case in which a process tries to open a file with a non-blocking exclusive lock. Even if it fails to get the lock it would still truncate the file even though its open failed. With this change, the truncation is done only after the lock is successfully acquired. Obtained from: BSD/OS
* Move prtactive to vfs from ufs. It is used all over the place.phk2000-06-271-2/+0
|
* o Remove unneeded off_t variable to clean up compile warningrwatson2000-06-051-1/+1
| | | | Obtained from: TrustedBSD Project
* Back out the previous change to the queue(3) interface.jake2000-05-265-8/+8
| | | | | | It was not discussed and should probably not happen. Requested by: msmith and others
* Change the way that the queue(3) structures are declared; don't assume thatjake2000-05-235-8/+8
| | | | | | | | the type argument to *_HEAD and *_ENTRY is a struct. Suggested by: phk Reviewed by: phk Approved by: mdodd
* Separate the struct bio related stuff out of <sys/buf.h> intophk2000-05-054-0/+4
| | | | | | | | | | | | | | | <sys/bio.h>. <sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall not be made a nested include according to bdes teachings on the subject of nested includes. Diskdrivers and similar stuff below specfs::strategy() should no longer need to include <sys/buf.> unless they need caching of data. Still a few bogus uses of struct buf to track down. Repocopy by: peter
* Don't allow VOP_GETEXTATTR to set uio->uio_offset != 0, as we don'trwatson2000-05-031-12/+10
| | | | | | | | | | provide locking over extended attribute operations, requiring that individual operations be atomic. Allowing non-zero starting offsets permits applications/etc to put themselves at risk for inconsistent behavior. As VOP_SETEXTATTR already prohibited non-zero write offsets, this makes sense. Suggested by: Andreas Gruenbacher <a.gruenbacher@bestbits.at>
* Remove unneeded #include <vm/vm_zone.h>phk2000-04-302-2/+0
| | | | Generated by: src/tools/tools/kerninclude
* s/biowait/bufwait/gphk2000-04-292-5/+5
| | | | Prodded by: several.
* When files are given to users by root, the quota system failed tomckusick2000-04-281-0/+10
| | | | | | | | | | | | | | reset their grace timer as their ownership crossed the soft limit threshhold. Thus if they had been over their limit in the past, they were suddenly penalized as if they had been over their limit ever since. The fix is to check when root gives away files, that when the receiving user crosses their soft limit, their grace timer is reset. See the PR report for a detailed method of reproducing the bug. PR: kern/17128 Submitted by: Andre Albsmeier <andre.albsmeier@mchp.siemens.de> Reviewed by: Kirk McKusick <mckusick@mckusick.com>
* o Introduce an extended attribute backing file header magic numberrwatson2000-04-192-3/+20
| | | | o Introduce an extended attribute backing file header version number
* Remove ~25 unneeded #include <sys/conf.h>phk2000-04-191-1/+0
| | | | Remove ~60 unneeded #include <sys/malloc.h>
* o Cause attribute data writes to use IO_SYNC since this improves therwatson2000-04-192-1/+17
| | | | | | | | | | | | | | | | | | | | | | chances of consistency with other file/directory meta-data in a write. In the current set of extended attribute applications, this does not hurt much. This should be discussed again later when it comes time to optimize performance of attributes. o Include an inode generation number in the per-attribute header information. This allows consistency verification to catch when a crash occurs, or an inode is recycled while attributes are not properly configured. For now, an irritating error message is displayed when an inconsistency occurs. At some point, may introduce an ``extattrctl check ...'' which catches these before attributes are enabled. Not today. If you get this message, it means you somehow managed to get your attribute backing file out of synch with the file system. When this occurs, attribute not found is returned (== undefined). Writes will overwrite the value there correcting the problem. Might want to think about introducing a new errno or two to handle this kind of situation. Discussed with: kris
* Retire bufqdisksort(), all drivers use bioqdisksort now.phk2000-04-181-98/+0
|
* Remove unneeded cast.jlemon2000-04-171-1/+1
|
* Replace the POLLEXTEND extensions with the kqueue() mechanism.jlemon2000-04-162-25/+29
|
* Fix two bugs in extended attribute support for UFS/FFS:rwatson2000-04-161-2/+5
| | | | | | | | | | o Put back in {} removed during over-zealous cleanup of gratuitous debugging output during preparation for the commit. Due to the missing {}, writes on extended attributes always silently failed. Doh. o Don't unlock the target vnode if it's the backing vnode, as we don't lock the target vnode if it's the backing vnode.
* Complete the bio/buf divorce for all code below devfs::strategyphk2000-04-152-13/+13
| | | | | | | | | | Exceptions: Vinum untouched. This means that it cannot be compiled. Greg Lehey is on the case. CCD not converted yet, casts to struct buf (still safe) atapi-cd casts to struct buf to examine B_PHYS
* Introduce extended attribute support for FFS, allowing arbitraryrwatson2000-04-159-0/+899
| | | | | | | | | | | | | | | | | | | | | | | | | | | (name, value) pairs to be associated with inodes. This support is used for ACLs, MAC labels, and Capabilities in the TrustedBSD security extensions, which are currently under development. In this implementation, attributes are backed to data vnodes in the style of the quota support in FFS. Support for FFS extended attributes may be enabled using the FFS_EXTATTR kernel option (disabled by default). Userland utilities and man pages will be committed in the next batch. VFS interfaces and man pages have been in the repo since 4.0-RELEASE and are unchanged. o ufs/ufs/extattr.h: UFS-specific extattr defines o ufs/ufs/ufs_extattr.c: bulk of support routines o ufs/{ufs,ffs,mfs}/*.[ch]: hooks and extattr.h includes o contrib/softupdates/ffs_softdep.c: extattr.h includes o conf/options, conf/files, i386/conf/LINT: added FFS_EXTATTR o coda/coda_vfsops.c: XXX required extattr.h due to ufsmount.h (This should not be the case, and will be fixed in a future commit) Currently attributes are not supported in MFS. This will be fixed. Reviewed by: adrian, bp, freebsd-fs, other unthanked souls Obtained from: TrustedBSD Project
* Clone bio versions of certain bits of infrastructure:phk2000-04-021-0/+98
| | | | | | | | | | | | | devstat_end_transaction_bio() bioq_* versions of bufq_* incl bioqdisksort() the corresponding "buf" versions will disappear when no longer used. Move b_offset, b_data and b_bcount to struct bio. Add BIO_FORMAT as a hack for fd.c etc. We are now largely ready to start converting drivers to use struct bio instead of struct buf.
* Move B_ERROR flag to b_ioflags and call it BIO_ERROR.phk2000-04-023-3/+4
| | | | | | | | | | | | | (Much of this done by script) Move B_ORDERED flag to b_ioflags and call it BIO_ORDERED. Move b_pblkno and b_iodone_chain to struct bio while we transition, they will be obsoleted once bio structs chain/stack. Add bio_queue field for struct bio aware disksort. Address a lot of stylistic issues brought up by bde.
* Change the write-behind code to take more care when startingdillon2000-04-021-1/+3
| | | | | | | | | | | | | | async I/O's. The sequential read heuristic has been extended to cover writes as well. We continue to call cluster_write() normally, thus blocks in the file will still be reallocated for large (but still random) I/O's, but I/O will only be initiated for truely sequential writes. This solves a number of annoying situations, especially with DBM (hash method) writes, and also has the side effect of fixing a number of (stupid) benchmarks. Reviewed-by: mckusick
* diff, patch and cvs didn't like these three last time around, try again.phk2000-03-201-3/+3
|
* Rename the existing BUF_STRATEGY() to DEV_STRATEGY()phk2000-03-204-6/+6
| | | | | | | | substitute BUF_WRITE(foo) for VOP_BWRITE(foo->b_vp, foo) substitute BUF_STRATEGY(foo) for VOP_STRATEGY(foo->b_vp, foo) This patch is machine generated except for the ccd.c and buf.h parts.
* Remove B_READ, B_WRITE and B_FREEBUF and replace them with a newphk2000-03-202-7/+7
| | | | | | | | | | | | | | | | | | | | | field in struct buf: b_iocmd. The b_iocmd is enforced to have exactly one bit set. B_WRITE was bogusly defined as zero giving rise to obvious coding mistakes. Also eliminate the redundant struct buf flag B_CALL, it can just as efficiently be done by comparing b_iodone to NULL. Should you get a panic or drop into the debugger, complaining about "b_iocmd", don't continue. It is likely to write on your disk where it should have been reading. This change is a step in the direction towards a stackable BIO capability. A lot of this patch were machine generated (Thanks to style(9) compliance!) Vinum users: Greg has not had time to test this yet, be careful.
OpenPOWER on IntegriCloud