path: root/sys/kern
Commit message [Author, Age, Files, Lines]
...
* vfs_subr.c is getting rather fat. The underlying repocopy and this [phk, 2001-04-26, 2, -3198/+4]
| commit moves the filesystem export handling code to vfs_export.c
* Sendfile is documented to return 0 on success; however, when a [alfred, 2001-04-26, 1, -0/+7]
| sf_hdtr is used to provide writev(2)-style headers/trailers on the sent data, the return value is actually the result of writev(2) for the trailers, or for the headers if no trailers are specified. Fix sendfile to comply with the documentation by returning 0 on success.
| Ok'd by: dg
* Do not leave a process with no credential in zombproc. [tanimura, 2001-04-25, 1, -17/+14]
| Reviewed by: jhb
* When closing the last reference to an unlinked file, it is freed [mckusick, 2001-04-25, 1, -0/+9]
| by the inactive routine. Because the freeing causes the filesystem to be modified, the close must be held up during periods when the filesystem is suspended.
| For snapshots to be consistent across crashes, they must write blocks that they copy and claim those written blocks in their on-disk block pointers before the old blocks that they referenced can be allowed to be written.
| Close a loophole that allowed unwritten blocks to be skipped when doing ffs_sync with a request to wait for all I/O activity to be completed.
* Move the netexport structure from the fs-specific mount structure [phk, 2001-04-25, 3, -20/+122]
| to struct mount. This makes the "struct netexport *" parameter to the vfs_export and vfs_checkexport interface unneeded. Consequently, all non-stacking filesystems can use vfs_stdcheckexp().
| At the same time, make it a pointer to a struct netexport in struct mount, so that we can remove the bogus AF_MAX and #include <net/radix.h> from <sys/mount.h>
* Change uipc_sockaddr so that a sockaddr_un without a path is returned [tmm, 2001-04-24, 1, -0/+2]
| in nam for an unbound socket instead of leaving nam untouched in that case. This way, the getsockname() output can be used to determine the address family of such sockets (AF_LOCAL).
| Reviewed by: iedowse
| Approved by: rwatson
* Change the pfind() and zpfind() functions to lock the process that they [jhb, 2001-04-24, 11, -115/+173]
| find before releasing the allproc lock and returning.
| Reviewed by: -smp, dfr, jake
* Fix a bug introduced in the last commit: vaccess_acl_posix1e only checked [tmm, 2001-04-23, 3, -3/+3]
| the file gid against the egid of the accessing process for the ACL_GROUP_OBJ case, and ignored supplementary groups.
| Approved by: rwatson
* Correct #includes to work with fixed sys/mount.h. [grog, 2001-04-23, 18, -0/+35]
|
* o Remove comment indicating policy permits loop-back debugging, but [rwatson, 2001-04-21, 1, -1/+0]
|   semantics don't: in practice, both policy and semantics permit loop-back debugging operations, only it's just a subset of debugging operations (i.e., a proc can open its own /dev/mem), and that's at a higher layer.
* Spelling nit: acquring -> acquiring. [jhb, 2001-04-21, 1, -1/+1]
| Reported by: T. William Wells <bill@twwells.com>
* Assert that when using an interlock mutex it is not recursed when lockmgr() [alfred, 2001-04-20, 1, -1/+3]
| is called.
| Ok'd by: jhb
* Make the ap_boot_mtx mutex static. [jhb, 2001-04-20, 1, -1/+1]
|
* - Whoops, forgot to enable the clock lock in the spin order list on the [jhb, 2001-04-19, 1, -4/+2]
|   alpha.
| - Change the Debugger() functions to pass in the real function name.
* Fix inconsistency in setup of kernel_map: we need to make sure that [bmilekic, 2001-04-18, 2, -6/+7]
| we also reserve _adequate_ space for the mb_map submap; i.e. we need space for nmbclusters, nmbufs, _and_ nmbcnt. Furthermore, we need to rounddown, and not roundup, so that we are consistent.
| Pointed out by: bde
* Check validity of signal callback requested via aio routines. [alfred, 2001-04-18, 1, -2/+13]
| Also move the insertion of the request to after the request is validated. It still looks like there may be some problems if an invalid address is passed to the aio routines: a possible leak, or a not completely initialized structure on the queue, may still be possible.
| A new sig macro, _SIG_VALID, was made to check the validity of a signal; it would be advisable to use it from now on (in kern/kern_sig.c) rather than rolling your own.
| PR: kern/17152
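The validate-then-queue ordering described in the entry above can be sketched in user-space C. This is only an illustration: `SKETCH_SIG_MAXSIG`, `SKETCH_SIG_VALID`, `struct sketch_request`, and `sketch_submit` are hypothetical names invented here, not the kernel's definitions, and the upper bound of 128 is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: upper bound chosen for illustration only. */
#define SKETCH_SIG_MAXSIG 128

/* A signal number is treated as valid when it lies in 1..SKETCH_SIG_MAXSIG. */
#define SKETCH_SIG_VALID(sig) ((sig) > 0 && (sig) <= SKETCH_SIG_MAXSIG)

/* Hypothetical request: only the fields this sketch needs. */
struct sketch_request {
    int  signo;   /* signal to deliver on completion */
    bool queued;  /* set once the request is on the queue */
};

/* Validate first, queue second; returns 0 on success, -1 on a bad signal. */
static int sketch_submit(struct sketch_request *req)
{
    assert(req != NULL);
    if (!SKETCH_SIG_VALID(req->signo))
        return -1;          /* rejected before it ever reaches the queue */
    req->queued = true;     /* stands in for the real queue insertion */
    return 0;
}
```

Because the check runs before the insertion, a request carrying signal 0 or an out-of-range number never becomes visible on the queue in this sketch.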
* Reclaim directory vnodes held in namecache if few free vnodes are [tanimura, 2001-04-18, 3, -2/+118]
| available. Only directory vnodes holding no child directory vnodes held in v_cache_src are recycled, so that directory vnodes near the root of the filesystem hierarchy remain in namecache and directory vnodes are not reclaimed in cascade.
| The period of vnode reclaiming attempt and the number of vnodes attempted to reclaim can be tuned via sysctl(2).
| Suggested by: tegge
| Approved by: phk
* bread() is a special case of breadn(), so don't replicate code. [phk, 2001-04-18, 1, -23/+2]
|
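The idea of expressing a plain read as a read-with-readahead that has an empty readahead list can be sketched as follows. The names `sketch_breadn`, `sketch_bread`, and `struct sketch_io` are hypothetical, and the arguments are deliberately simplified; the real kernel functions take vnode, size, credential, and buffer arguments that are not modelled here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters so the effect of each call can be inspected. */
struct sketch_io {
    int blocks_read;         /* blocks read synchronously */
    int readaheads_started;  /* asynchronous readahead requests issued */
};

/* General form: read one block, then start readahead on cnt more blocks. */
static int sketch_breadn(int blkno, const int *rablkno, int cnt,
                         struct sketch_io *io)
{
    assert(io != NULL && cnt >= 0);
    (void)blkno;                      /* a real version would fetch this block */
    io->blocks_read += 1;
    for (int i = 0; i < cnt; i++) {
        (void)rablkno[i];             /* a real version would queue this block */
        io->readaheads_started += 1;
    }
    return 0;
}

/* Special case: no readahead list at all, so no duplicated read logic. */
static int sketch_bread(int blkno, struct sketch_io *io)
{
    return sketch_breadn(blkno, NULL, 0, io);
}
```

With the wrapper in place, any fix to the read path only has to be made in one function.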
* Make this driver play ball with devfs(5). [dd, 2001-04-17, 1, -3/+19]
| Reviewed by: brian
* Add a sanity check on ucred refcount. [alfred, 2001-04-17, 1, -0/+1]
| Submitted by: Terry Lambert <terry@lambert.org>
* Implement client side NFS locks. [alfred, 2001-04-17, 1, -1/+1]
| Obtained from: BSD/OS
| Import Ok'd by: mckusick, jkh, motd on builder.freebsd.org
* Write a switch statement as less obscure if statements. [phk, 2001-04-17, 1, -18/+8]
|
* Fix an old bug related to BETTER_CLOCK. Call forward_*clock if SMP [jhb, 2001-04-17, 1, -3/+3]
| and __i386__ are defined rather than if SMP and BETTER_CLOCK are defined. The removal of BETTER_CLOCK would have broken this except that kern_clock.c doesn't include <machine/smptests.h>, so it doesn't see the definition of BETTER_CLOCK, and forward_*clock aren't called, even on 4.x. This seems to fix the problem where an n-way SMP system would see 100 * n clk interrupts and 128 * n rtc interrupts.
* This patch removes the VOP_BWRITE() vector. [phk, 2001-04-17, 6, -19/+12]
| VOP_BWRITE() was a hack which made it possible for NFS client side to use struct buf with non-bio backing.
| This patch takes a more general approach and adds a bp->b_op vector where more methods can be added.
| The success of this patch depends on bp->b_op being initialized in all relevant places for some value of "relevant" which is not easy to determine. For now the buffers have grown a b_magic element which will make such issues a tiny bit easier to debug.
* Add debugging option to always read/write cylinder groups as full [mckusick, 2001-04-17, 1, -1/+5]
| sized blocks. To enable this option, use: `sysctl -w debug.bigcgs=1'.
| Add debugging option to disable background writes of cylinder groups. To enable this option, use: `sysctl -w debug.dobkgrdwrite=0'.
| These debugging options should be tried on systems that are panicking with corrupted cylinder group maps to see if they make the problem go away. The set of panics in question are:
|   ffs_clusteralloc: map mismatch
|   ffs_nodealloccg: map corrupted
|   ffs_nodealloccg: block not in map
|   ffs_alloccg: map corrupted
|   ffs_alloccg: block not in map
|   ffs_alloccgblk: cyl groups corrupted
|   ffs_alloccgblk: can't find blk in cyl
|   ffs_checkblk: partially free fragment
| The following panics are less likely to be related to this problem, but might be helped by these debugging options:
|   ffs_valloc: dup alloc
|   ffs_blkfree: freeing free block
|   ffs_blkfree: freeing free frag
|   ffs_vfree: freeing free inode
| If you try these options, please report whether they helped reduce your bitmap corruption panics to Kirk McKusick at <mckusick@mckusick.com> and to Matt Dillon <dillon@earth.backplane.com>.
* In my first reading of POSIX.1e, I misinterpreted handling of the [rwatson, 2001-04-17, 3, -135/+297]
| ACL_USER_OBJ and ACL_GROUP_OBJ fields, believing that modification of the access ACL could be used by privileged processes to change file/directory ownership. In fact, this is incorrect; ACL_*_OBJ (+ ACL_MASK and ACL_OTHER) should have undefined ae_id fields; this commit attempts to correct that misunderstanding.
| o Modify arguments to vaccess_acl_posix1e() to accept the uid and gid associated with the vnode, as those can no longer be extracted from the ACL passed as an argument. Perform all comparisons against the passed arguments. This actually has the effect of simplifying a number of components of this call, as well as reducing the indent level, but now separates handling of ACL_GROUP_OBJ from ACL_GROUP.
| o Modify acl_posix1e_check() to return EINVAL if the ae_id field of any of the ACL_{USER_OBJ,GROUP_OBJ,MASK,OTHER} entries is a value other than ACL_UNDEFINED_ID. As a temporary work-around to allow clean upgrades, set the ae_id field to ACL_UNDEFINED_ID before each check so that this cannot cause a failure in the short term (this work-around will be removed when the userland libraries and utilities are updated to take this change into account).
| o Modify ufs_sync_acl_from_inode() so that it forces ACL_{USER_OBJ,GROUP_OBJ,MASK,OTHER} ae_id fields to ACL_UNDEFINED_ID when synchronizing the ACL from the inode.
| o Modify ufs_sync_inode_from_acl to not propagate uid and gid information to the inode from the ACL during ACL update. Also modify the masking of permission bits that may be set from ALLPERMS to (S_IRWXU|S_IRWXG|S_IRWXO), as ACLs currently do not carry non-ACCESSPERMS (S_ISUID, S_ISGID, S_ISTXT).
| o Modify ufs_getacl() so that when it emulates an access ACL from the inode, it initializes the ae_id fields to ACL_UNDEFINED_ID.
| o Clean up ufs_setacl() substantially since it is no longer possible to perform chown/chgrp operations using vop_setacl(), so all the access control for that can be eliminated.
| o Modify ufs_access() so that it passes owner uid and gid information into vaccess_acl_posix1e().
| Pointed out by: jedgar
| Obtained from: TrustedBSD Project
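The rule stated for acl_posix1e_check() in the entry above (entries of the four "object" kinds must carry an undefined id, otherwise EINVAL) can be sketched like this. The enum, the `SKETCH_UNDEFINED_ID` value, and the struct layout are assumptions made for illustration; they are not the kernel's ACL types, and the temporary upgrade work-around is not modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the ACL entry tags named in the commit. */
enum sketch_tag {
    SK_USER_OBJ, SK_USER, SK_GROUP_OBJ, SK_GROUP, SK_MASK, SK_OTHER
};

/* Assumption: the "undefined" id is represented as all-ones here. */
#define SKETCH_UNDEFINED_ID ((unsigned int)-1)

struct sketch_entry {
    enum sketch_tag tag;
    unsigned int    id;   /* plays the role of ae_id */
};

/* Returns 0 if every *_OBJ/MASK/OTHER entry has an undefined id, else EINVAL. */
static int sketch_acl_check(const struct sketch_entry *e, size_t n)
{
    assert(e != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        switch (e[i].tag) {
        case SK_USER_OBJ:
        case SK_GROUP_OBJ:
        case SK_MASK:
        case SK_OTHER:
            if (e[i].id != SKETCH_UNDEFINED_ID)
                return EINVAL;   /* ownership must not be encoded in the ACL */
            break;
        default:
            break;               /* SK_USER and SK_GROUP carry real ids */
        }
    }
    return 0;
}
```

Only the named-user and named-group entries keep meaningful ids, which matches the commit's point that the ACL can no longer be used to change file ownership.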
* Blow away the panic mutex in favor of using a single atomic_cmpset() on a [jhb, 2001-04-17, 2, -5/+9]
| panic_cpu shared variable. I used a simple atomic operation here instead of a spin lock as it seemed to be excessive overhead. Also, this can avoid recursive panics if, for example, witness is broken.
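A single compare-and-set on a shared "which CPU is panicking" variable can be sketched with C11 atomics. Everything here is an assumption for illustration: `sketch_panic_cpu`, `SKETCH_NOCPU`, and `sketch_claim_panic` are invented names, the kernel's atomic_cmpset() is replaced by `atomic_compare_exchange_strong`, and treating a second claim from the same CPU as allowed is one plausible reading of the recursion remark, not something the commit message spells out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumption: -1 marks "no CPU currently owns the panic". */
#define SKETCH_NOCPU (-1)

static atomic_int sketch_panic_cpu = SKETCH_NOCPU;

/*
 * Try to become the panicking CPU. Returns true if this CPU owns the
 * panic (either it just claimed it or it already held it), false if a
 * different CPU got there first.
 */
static bool sketch_claim_panic(int cpu)
{
    int expected = SKETCH_NOCPU;

    assert(cpu >= 0);
    if (atomic_compare_exchange_strong(&sketch_panic_cpu, &expected, cpu))
        return true;           /* we swapped NOCPU -> cpu */
    return expected == cpu;    /* on failure, expected holds the current owner */
}
```

No lock object is involved, so nothing in this path depends on lock-debugging machinery being in a sane state.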
* Check to see if enroll() returns NULL in the witness initialization. This [jhb, 2001-04-17, 1, -0/+4]
| can happen if witness runs out of resources during initialization or if witness_skipspin is enabled.
| Sleuthing by: Peter Jeremy <peter.jeremy@alcatel.com.au>
* Exit and re-enter the critical section while spinning for a spinlock so [jhb, 2001-04-17, 2, -0/+6]
| that interrupts can come in while we are waiting for a lock.
* Update to the 2001-04-02 version of the nanokernel code from Dave Mills. [jhay, 2001-04-16, 1, -22/+28]
|
* Call strlen() once instead of twice. [brian, 2001-04-14, 1, -2/+2]
|
* o Since uid checks in p_cansignal() are now identical between P_SUGID [rwatson, 2001-04-13, 1, -28/+14]
|   and non-P_SUGID cases, simplify p_cansignal() logic so that the P_SUGID masking of possible signals is independent from uid checks, removing redundant code and generally improving readability.
| Reviewed by: tmm
| Obtained from: TrustedBSD Project
* convert if/panic -> KASSERT, explain what triggered the assertion [alfred, 2001-04-13, 1, -2/+4]
|
* Generate useful error messages. [murray, 2001-04-13, 1, -4/+4]
|
* Handle a rare but fatal race invoked sometimes when SIGSTOP is [markm, 2001-04-13, 2, -2/+2]
| invoked.
* - Add a comment at the start of the spin locks list. [jhb, 2001-04-13, 1, -1/+4]
| - The alpha SMP code uses an "ap boot" spinlock as well.
* o Disallow two "allow this" exceptions in p_cansignal() restricting [rwatson, 2001-04-13, 1, -5/+3]
|   the ability of unprivileged processes to deliver arbitrary signals to daemons temporarily taking on unprivileged effective credentials when P_SUGID is not set on the target process:
|   Removed:
|     (p1->p_cred->cr_ruid != ps->p_cred->cr_uid)
|     (p1->p_ucred->cr_uid != ps->p_cred->cr_uid)
| o Replace two "allow this" exceptions in p_cansignal() restricting the ability of unprivileged processes to deliver arbitrary signals to daemons temporarily taking on unprivileged effective credentials when P_SUGID is set on the target process:
|   Replaced:
|     (p1->p_cred->p_ruid != p2->p_ucred->cr_uid)
|     (p1->p_cred->cr_uid != p2->p_ucred->cr_uid)
|   With:
|     (p1->p_cred->p_ruid != p2->p_ucred->p_svuid)
|     (p1->p_ucred->cr_uid != p2->p_ucred->p_svuid)
| o These changes have the effect of making the uid-based handling of both P_SUGID and non-P_SUGID signal delivery consistent, following these four general cases:
|     p1's ruid equals p2's ruid
|     p1's euid equals p2's ruid
|     p1's ruid equals p2's svuid
|     p1's euid equals p2's svuid
|   The P_SUGID and non-P_SUGID cases can now be largely collapsed, and I'll commit this in a few days if no immediate problems are encountered with this set of changes.
| o These changes remove a number of warning cases identified by the proc_to_proc inter-process authorization regression test.
| o As these are new restrictions, we'll have to watch out carefully for possible side effects on running code: they seem reasonable to me, but it's possible this change might have to be backed out if problems are experienced.
| Submitted by: src/tools/regression/security/proc_to_proc/testuid
| Reviewed by: tmm
| Obtained from: TrustedBSD Project
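The four general uid cases listed in the entry above can be written as a single predicate. This is a sketch only: `struct sketch_cred` and `sketch_uid_may_signal` are hypothetical, the real check lives inside p_cansignal() alongside privilege, jail, and signal-number logic that is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical credential: real, effective, and saved uid only. */
struct sketch_cred {
    unsigned int ruid;
    unsigned int euid;
    unsigned int svuid;
};

/*
 * p1 is the subject (sender), p2 the target. The target's effective uid
 * is deliberately not consulted, so a daemon that temporarily takes on an
 * unprivileged effective uid is not exposed by that alone.
 */
static bool sketch_uid_may_signal(const struct sketch_cred *p1,
                                  const struct sketch_cred *p2)
{
    assert(p1 != NULL && p2 != NULL);
    return p1->ruid == p2->ruid  ||   /* p1's ruid equals p2's ruid  */
           p1->euid == p2->ruid  ||   /* p1's euid equals p2's ruid  */
           p1->ruid == p2->svuid ||   /* p1's ruid equals p2's svuid */
           p1->euid == p2->svuid;     /* p1's euid equals p2's svuid */
}
```

For example, a user with uid 1000 does not match a daemon whose real and saved uids are 0 but whose effective uid is temporarily 1000.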
* o Disable two "allow this" exceptions in p_cansched(), restricting the [rwatson, 2001-04-12, 1, -1/+4]
|   ability of unprivileged processes to modify the scheduling properties of daemons temporarily taking on unprivileged effective credentials. These cases, (p1->p_cred->p_ruid == p2->p_ucred->cr_uid) and (p1->p_ucred->cr_uid == p2->p_ucred->cr_uid), respectively permitted a subject process to influence the scheduling of a daemon if the subject process has the same real uid or effective uid as the daemon's effective uid. This removes a number of the warning cases identified by the proc_to_proc inter-process authorization regression test.
| o As these are new restrictions, we'll have to watch out carefully for possible side effects on running code: they seem reasonable to me, but it's possible this change might have to be backed out if problems are experienced.
| Reported by: src/tools/regression/security/proc_to_proc/testuid
| Obtained from: TrustedBSD Project
* o Make kqueue's filt_procattach() function use the error value returned [rwatson, 2001-04-12, 1, -2/+3]
|   by p_can(...P_CAN_SEE), rather than returning EACCES directly. This brings the error code used here into line with similar arrangements elsewhere, and prevents the leakage of pid usage information.
| Reviewed by: jlemon
| Obtained from: TrustedBSD Project
* o Limit process information leakage by introducing a p_can(...P_CAN_SEE...) [rwatson, 2001-04-12, 1, -0/+2]
|   in rtprio()'s RTP_LOOKUP implementation.
| Obtained from: TrustedBSD Project
* o Reduce information leakage into jails by adding invocations of [rwatson, 2001-04-12, 1, -0/+9]
|   p_can(...P_CAN_SEE...) to getpgid(), getsid(), and setpgid(), blocking these operations on processes that should not be visible to the requesting process. Required to reduce information leakage in MAC environments.
| Obtained from: TrustedBSD Project
* o Replace p_cankill() with p_cansignal(), remove wrappage of p_can() [rwatson, 2001-04-12, 2, -42/+74]
|   from signal authorization checking.
| o p_cansignal() takes three arguments: subject process, object process, and signal number, unlike p_cankill(), which only took into account the processes and not the signal number, improving the abstraction such that CANSIGNAL() from kern_sig.c can now also be eliminated; previously CANSIGNAL() special-cased the handling of SIGCONT based on process session. privused is now deprecated.
| o The new p_cansignal() further limits the set of signals that may be delivered to processes with P_SUGID set, and restructures the access control check to allow it to be extended more easily.
| o These changes take into account work done by the OpenBSD Project, as well as by Robert Watson and Thomas Moestl on the TrustedBSD Project.
| Obtained from: TrustedBSD Project
* o Regenerated following introduction of __setugid() system call for [rwatson, 2001-04-11, 2, -2/+4]
|   "options REGRESSION".
| Obtained from: TrustedBSD Project
* o Introduce a new system call, __setugid(), which allows a process to [rwatson, 2001-04-11, 2, -0/+24]
|   toggle the P_SUGID bit explicitly, rather than relying on it being set implicitly by other protection and credential logic. This feature is introduced to support inter-process authorization regression testing by simplifying userland credential management, allowing the easy isolation and reproduction of authorization events with specific security contexts. This feature is enabled only by "options REGRESSION" and is not intended to be used by applications. While the feature is not known to introduce security vulnerabilities, it does allow processes to enter previously inaccessible parts of the credential state machine, and is therefore disabled by default. It may not constitute a risk, and therefore in the future pending further analysis (and appropriate need) may become a published interface.
| Obtained from: TrustedBSD Project
* Stick proc0 in the PID hash table. [jhb, 2001-04-11, 1, -0/+1]
|
* Rename the IPI API from smp_ipi_* to ipi_* since the smp_ prefix is just [jhb, 2001-04-11, 1, -14/+14]
| "redundant noise" and to match the IPI constant namespace (IPI_*).
| Requested by: bde
* Correct the following defines to match the POSIX.1e spec: [jedgar, 2001-04-11, 3, -108/+108]
|   ACL_PERM_EXEC -> ACL_EXECUTE
|   ACL_PERM_READ -> ACL_READ
|   ACL_PERM_WRITE -> ACL_WRITE
| Obtained from: TrustedBSD
* Create debug.hashstat.[raw]nchash and debug.hashstat.[raw]nfsnode to [peter, 2001-04-11, 1, -0/+80]
| enable easy access to the hash chain stats. The raw prefixed versions dump an integer array to userland with the chain lengths. This cheats and calls it an array of 'struct int' rather than 'int' or sysctl -a faithfully dumps out the 128K array on an average machine. The non-raw versions return 4 integers: count, number of chains used, maximum chain length, and percentage utilization (fixed point, multiplied by 100). The raw forms are more useful for analyzing the hash distribution, while the other form can be read easily by humans and stats loggers.
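The four summary integers described in the entry above can be derived from an array of chain lengths roughly as follows. This is a sketch under assumptions: "count" is taken to mean the number of chains, the percentage is scaled by a further factor of 100 (so 50.00% is reported as 5000), and `struct sketch_hashstat` and `sketch_summarize` are hypothetical names rather than the kernel's sysctl handlers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the four values the non-raw form reports. */
struct sketch_hashstat {
    int count;      /* assumed: total number of hash chains */
    int used;       /* chains holding at least one entry */
    int maxlength;  /* length of the longest chain */
    int pct;        /* utilization percentage times 100 (fixed point) */
};

/* Summarize an array of per-chain lengths, as the raw form would dump it. */
static struct sketch_hashstat sketch_summarize(const int *chain_len, int nchains)
{
    struct sketch_hashstat st = { nchains, 0, 0, 0 };

    assert(chain_len != NULL && nchains > 0);
    for (int i = 0; i < nchains; i++) {
        if (chain_len[i] > 0)
            st.used++;
        if (chain_len[i] > st.maxlength)
            st.maxlength = chain_len[i];
    }
    /* Integer fixed point: percent (x100) scaled by another 100. */
    st.pct = (st.used * 100 * 100) / nchains;
    return st;
}
```

Keeping the percentage in scaled integer form avoids floating point, which is a natural fit for code that would run inside a kernel.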
* Remove the BETTER_CLOCK #ifdef's. The code is on by default and is here [jhb, 2001-04-10, 1, -14/+4]
| to stay for the foreseeable future.
| OK'd by: peter (the idea)
* Add an MI API for sending IPI's. I used the same API present on the alpha [jhb, 2001-04-10, 1, -12/+55]
| because:
| - it used a better namespace (smp_ipi_* rather than *_ipi),
| - it used better constant names for the IPI's (IPI_* rather than X*_OFFSET), and
| - this API also somewhat exists for both alpha and ia64 already.