summaryrefslogtreecommitdiffstats
path: root/sys/kern/kern_jail.c
Commit message (Collapse)AuthorAgeFilesLines
* Don't allow creating a socket with a protocol family that the currentjamie2009-02-051-0/+42
| | | | | | | | | | jail doesn't support. This involves a new function prison_check_af, like prison_check_ip[46] but that checks only the family. With this change, most of the errors generated by jailed sockets shouldn't ever occur, at least until jails are changeable. Approved by: bz (mentor)
* Standardize the various prison_foo_ip[46] functions and prison_if tojamie2009-02-051-70/+74
| | | | | | | | | | | | | | | return zero on success and an error code otherwise. The possible errors are EADDRNOTAVAIL if an address being checked for doesn't match the prison, and EAFNOSUPPORT if the prison doesn't have any addresses in that address family. For most callers of these functions, use the returned error code instead of e.g. a hard-coded EADDRNOTAVAIL or EINVAL. Always include a jailed() check in these functions, where a non-jailed cred always returns success (and makes no changes). Remove the explicit jailed() checks that preceded many of the function calls. Approved by: bz (mentor)
* Mark most often used sysctl's as MPSAFE.ed2009-01-281-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | After running a `make buildkernel', I noticed most of the Giant locks in sysctl are only caused by a very small amount of sysctl's: - sysctl.name2oid. This one is locked by SYSCTL_LOCK, just like sysctl.oidfmt. - kern.ident, kern.osrelease, kern.version, etc. These are just constant strings. - kern.arandom, used by the stack protector. It is already protected by arc4_mtx. I also saw the following sysctl's show up. Not as often as the ones above, but still quite often: - security.jail.jailed. Also mark security.jail.list as MPSAFE. They don't need locking or already use allprison_lock. - kern.devname, used by devname(3), ttyname(3), etc. This seems to reduce Giant locking inside sysctl by ~75% in my primitive test setup.
* For consistency with prison_{local,remote,check}_ipN renamebz2009-01-251-2/+2
| | | | | | | prison_getipN to prison_get_ipN. Submitted by: jamie (as part of a larger patch) MFC after: 1 week
* Back out r186615; the sanitizing of the pointers in the error casebz2009-01-041-2/+0
| | | | | | is not needed and seems that it will not be needed either. Pointy hat: mine, mine, mine and not pho's
* Added missing second part of cleaning j->ip[46] as requested by bzpho2008-12-301-0/+2
| | | | | Approved by: kib (mentor) Pointy hat: pho
* Make sure that unused j->ip[46] are clearedpho2008-12-301-2/+4
| | | | | Reviewed by: bz Approved by: kib (mentor)
* Correctly check the number of prison states to not access anythingbz2008-12-111-2/+2
| | | | | | | | | | | | outside the prison_states array. When checking if there is a name configured for the prison, check the first character to not be '\0' instead of checking if the char array is present, which it always is. Note, that this is different for the *jailname in the syscall. Found with: Coverity Prevent(tm) CID: 4156, 4155 MFC after: 4 weeks (just that I get the mail)
* Unbreak the no-networks (no INET/6) build that I broke withbz2008-11-291-0/+2
| | | | | | the commit in r185435. Pointyhat: no, but I could need a ski cap for the winter
* MFp4:bz2008-11-291-60/+848
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Bring in updated jail support from bz_jail branch. This enhances the current jail implementation to permit multiple addresses per jail. In addtion to IPv4, IPv6 is supported as well. Due to updated checks it is even possible to have jails without an IP address at all, which basically gives one a chroot with restricted process view, no networking,.. SCTP support was updated and supports IPv6 in jails as well. Cpuset support permits jails to be bound to specific processor sets after creation. Jails can have an unrestricted (no duplicate protection, etc.) name in addition to the hostname. The jail name cannot be changed from within a jail and is considered to be used for management purposes or as audit-token in the future. DDB 'show jails' command was added to aid debugging. Proper compat support permits 32bit jail binaries to be used on 64bit systems to manage jails. Also backward compatibility was preserved where possible: for jail v1 syscalls, as well as with user space management utilities. Both jail as well as prison version were updated for the new features. A gap was intentionally left as the intermediate versions had been used by various patches floating around the last years. Bump __FreeBSD_version for the afore mentioned and in kernel changes. Special thanks to: - Pawel Jakub Dawidek (pjd) for his multi-IPv4 patches and Olivier Houchard (cognet) for initial single-IPv6 patches. - Jeff Roberson (jeff) and Randall Stewart (rrs) for their help, ideas and review on cpuset and SCTP support. - Robert Watson (rwatson) for lots and lots of help, discussions, suggestions and review of most of the patch at various stages. - John Baldwin (jhb) for his help. - Simon L. Nielsen (simon) as early adopter testing changes on cluster machines as well as all the testers and people who provided feedback the last months on freebsd-jail and other channels. - My employer, CK Software GmbH, for the support so I could work on this. Reviewed by: (see above) MFC after: 3 months (this is just so that I get the mail) X-MFC Before: 7.2-RELEASE if possible
* With the permissions of phk@ change the license on kern_jail.cbz2008-11-281-6/+22
| | | | to a 2 clause BSD license.
* Update ZFS from version 6 to 13 and bring some FreeBSD-specific changes.pjd2008-11-171-234/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This bring huge amount of changes, I'll enumerate only user-visible changes: - Delegated Administration Allows regular users to perform ZFS operations, like file system creation, snapshot creation, etc. - L2ARC Level 2 cache for ZFS - allows to use additional disks for cache. Huge performance improvements mostly for random read of mostly static content. - slog Allow to use additional disks for ZFS Intent Log to speed up operations like fsync(2). - vfs.zfs.super_owner Allows regular users to perform privileged operations on files stored on ZFS file systems owned by him. Very careful with this one. - chflags(2) Not all the flags are supported. This still needs work. - ZFSBoot Support to boot off of ZFS pool. Not finished, AFAIK. Submitted by: dfr - Snapshot properties - New failure modes Before if write requested failed, system paniced. Now one can select from one of three failure modes: - panic - panic on write error - wait - wait for disk to reappear - continue - serve read requests if possible, block write requests - Refquota, refreservation properties Just quota and reservation properties, but don't count space consumed by children file systems, clones and snapshots. - Sparse volumes ZVOLs that don't reserve space in the pool. - External attributes Compatible with extattr(2). - NFSv4-ACLs Not sure about the status, might not be complete yet. Submitted by: trasz - Creation-time properties - Regression tests for zpool(8) command. Obtained from: OpenSolaris
* Retire the MALLOC and FREE macros. They are an abomination unto style(9).des2008-10-231-6/+6
| | | | MFC after: 3 months
* Step 1.5 of importing the network stack virtualization infrastructurezec2008-10-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | from the vimage project, as per plan established at devsummit 08/08: http://wiki.freebsd.org/Image/Notes200808DevSummit Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator macros, and CURVNET_SET() context setting macros, all currently resolving to NOPs. Prepare for virtualization of selected SYSCTL objects by introducing a family of SYSCTL_V_*() macros, currently resolving to their global counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT(). Move selected #defines from sys/sys/vimage.h to newly introduced header files specific to virtualized subsystems (sys/net/vnet.h, sys/netinet/vinet.h etc.). All the changes are verified to have zero functional impact at this point in time by doing MD5 comparision between pre- and post-change object files(*). (*) netipsec/keysock.c did not validate depending on compile time options. Implemented by: julian, bz, brooks, zec Reviewed by: julian, bz, brooks, kris, rwatson, ... Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
* Commit step 1 of the vimage project, (network stack)bz2008-08-171-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | virtualization work done by Marko Zec (zec@). This is the first in a series of commits over the course of the next few weeks. Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only. We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again. Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch
* MFp4 144659:bz2008-07-071-0/+4
| | | | | | | | Plug a memory leak with jail services. PR: 125257 Submitted by: Mateusz Guzik <mjguzik gmail.com> MFC after: 6 days
* Introduce a new lock, hostname_mtx, and use it to synchronize accessrwatson2008-07-051-1/+4
| | | | | | | | | | | | to global hostname and domainname variables. Where necessary, copy to or from a stack-local buffer before performing copyin() or copyout(). A few uses, such as in cd9660 and daemon_saver, remain under-synchronized and will require further updates. Correct a bug in which a failed copyin() of domainname would leave domainname potentially corrupted. MFC after: 3 weeks
* Revert rev. 178124 as requested by kris@. Having jail id not beingdelphij2008-06-191-18/+24
| | | | reused too frequently is useful for script controlled environment.
* Instead of rolling our own jail number allocation procedure, usedelphij2008-04-111-24/+18
| | | | | | | | alloc_unr() to do it. Submitted by: Ed Schouten <ed 80386 nl> PR: kern/122270 MFC after: 1 month
* Add the support for the AT_FDCWD and fd-relative name lookups to thekib2008-03-311-0/+1
| | | | | | | | | namei(9). Based on the submission by rdivacky, sponsored by Google Summer of Code 2007 Reviewed by: rwatson, rdivacky Tested by: pho
* Replace the last susers calls in netinet6/ with privilege checks.bz2008-01-241-0/+6
| | | | | | | | | Introduce a new privilege allowing to set certain IP header options (hop-by-hop, routing headers). Leave a few comments to be addressed later. Reviewed by: rwatson (older version, before addressing his comments)
* VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used inattilio2008-01-131-3/+3
| | | | | | | | | | | conjuction with 'thread' argument passing which is always curthread. Remove the unuseful extra-argument and pass explicitly curthread to lower layer functions, when necessary. KPI results broken by this change, which should affect several ports, so version bumping and manpage update will be further committed. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
* vn_lock() is currently only used with the 'curthread' passed as argument.attilio2008-01-101-1/+1
| | | | | | | | | | | | | | | | Remove this argument and pass curthread directly to underlying VOP_LOCK1() VFS method. This modify makes the code cleaner and in particular remove an annoying dependence helping next lockmgr() cleanup. KPI results, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, would be valuable to say that next commits will address a similar cleanup about VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>
* Merge first in a series of TrustedBSD MAC Framework KPI changesrwatson2007-10-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | from Mac OS X Leopard--rationalize naming for entry points to the following general forms: mac_<object>_<method/action> mac_<object>_check_<method/action> The previous naming scheme was inconsistent and mostly reversed from the new scheme. Also, make object types more consistent and remove spaces from object types that contain multiple parts ("posix_sem" -> "posixsem") to make mechanical parsing easier. Introduce a new "netinet" object type for certain IPv4/IPv6-related methods. Also simplify, slightly, some entry point names. All MAC policy modules will need to be recompiled, and modules not updates as part of this commit will need to be modified to conform to the new KPI. Sponsored by: SPARTA (original patches against Mac OS X) Obtained from: TrustedBSD Project, Apple Computer
* Add PRIV_VFS_STAT privilege, which will allow overriding policy limits onrwatson2007-10-211-0/+1
| | | | | | | the right to stat() a file, such as in mac_bsdextended. Obtained from: TrustedBSD Project MFC after: 3 months
* Fix jails and jail-friendly file systems handling:pjd2007-04-131-0/+1
| | | | | | | | - We need to allow for PRIV_VFS_MOUNT_OWNER inside a jail. - Move security checks to vfs_suser() and deny unmounting and updating for jailed root from different jails, etc. OK'ed by: rwatson
* Allow PRIV_NETINET_REUSEPORT in jail.rwatson2007-04-101-1/+3
|
* prison_free() can be called with a mutex held. This wasn't a problem untilpjd2007-04-081-11/+16
| | | | | | | | | | | I converted allprison_mtx mutex to allprison_lock sx lock. To fix this LOR, move prison removal to prison_complete() entirely. To ensure that noone will reference this prison before it's beeing removed from the list skip prisons with 'pr_ref == 0' in prison_find() and assert that pr_ref has to greater than 0 in prison_hold(). Reported by: kris OK'ed by: rwatson
* Only use prison mutex to protect the fields that need to be protected by it.pjd2007-04-081-2/+2
|
* pr_list is protected by the allprison_lock.pjd2007-04-081-1/+1
|
* Implement functionality I called 'jail services'.pjd2007-04-051-27/+244
| | | | | | | | | | | | | | | | | | | | | | It may be used for external modules to attach some data to jail's in-kernel structure. - Change allprison_mtx mutex to allprison_sx sx(9) lock. We will need to call external functions while holding this lock, which may want to allocate memory. Make use of the fact that this is shared-exclusive lock and use shared version when possible. - Implement the following functions: prison_service_register() - registers a service that wants to be noticed when a jail is created and destroyed prison_service_deregister() - deregisters service prison_service_data_add() - adds service-specific data to the jail structure prison_service_data_get() - takes service-specific data from the jail structure prison_service_data_del() - removes service-specific data from the jail structure Reviewed by: rwatson
* Make prison_find() globally accessible.pjd2007-04-051-2/+1
|
* Add security.jail.mount_allowed sysctl, which allows to mount andpjd2007-04-051-0/+17
| | | | | | | | | | | | | | | | | | unmount jail-friendly file systems from within a jail. Precisely it grants PRIV_VFS_MOUNT, PRIV_VFS_UNMOUNT and PRIV_VFS_MOUNT_NONUSER privileges for a jailed super-user. It is turned off by default. A jail-friendly file system is a file system which driver registers itself with VFCF_JAIL flag via VFS_SET(9) API. The lsvfs(1) command can be used to see which file systems are jail-friendly ones. There currently no jail-friendly file systems, ZFS will be the first one. In the future we may consider marking file systems like nullfs as jail-friendly. Reviewed by: rwatson
* Minor simplification.pjd2007-03-091-3/+1
|
* White space nits.pjd2007-03-071-4/+4
|
* Remove 'MPSAFE' annotations from the comments above most system calls: allrwatson2007-03-041-4/+0
| | | | | | | | system calls now enter without Giant held, and then in some cases, acquire Giant explicitly. Remove a number of other MPSAFE annotations in the credential code and tweak one or two other adjacent comments.
* Rename PRIV_VFS_CLEARSUGID to PRIV_VFS_RETAINSUGID, which seems to betterpjd2007-03-011-1/+1
| | | | | | describe the privilege. OK'ed by: rwatson
* Remove unused PRIV_IPC_EXEC. Renumbers System V IPC privilege.rwatson2007-02-201-1/+0
|
* Rename three quota privileges from the UFS privilege namespace to therwatson2007-02-191-2/+2
| | | | | | | | | | VFS privilege namespace: exceedquota, getquota, and setquota. Leave UFS-specific quota configuration privileges in the UFS name space. This renumbers VFS and UFS privileges, so requires rebuilding modules if you are using security policies aware of privilege identifiers. This is likely no one at this point since none of the committed MAC policies use the privilege checks.
* Limit quota privileges in jail to PRIV_UFS_GETQUOTA andrwatson2007-02-191-5/+2
| | | | PRIV_UFS_SETQUOTA.
* For now, reflect practical reality that Audit system calls aren'trwatson2007-02-191-0/+2
| | | | allowed in Jail: return a privilege error.
* Add a new priv(9) kernel interface for checking the availability ofrwatson2006-11-061-1/+168
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | privilege for threads and credentials. Unlike the existing suser(9) interface, priv(9) exposes a named privilege identifier to the privilege checking code, allowing more complex policies regarding the granting of privilege to be expressed. Two interfaces are provided, replacing the existing suser(9) interface: suser(td) -> priv_check(td, priv) suser_cred(cred, flags) -> priv_check_cred(cred, priv, flags) A comprehensive list of currently available kernel privileges may be found in priv.h. New privileges are easily added as required, but the comments on adding privileges found in priv.h and priv(9) should be read before doing so. The new privilege interface exposed sufficient information to the privilege checking routine that it will now be possible for jail to determine whether a particular privilege is granted in the check routine, rather than relying on hints from the calling context via the SUSER_ALLOWJAIL flag. For now, the flag is maintained, but a new jail check function, prison_priv_check(), is exposed from kern_jail.c and used by the privilege check routine to determine if the privilege is permitted in jail. As a result, a centralized list of privileges permitted in jail is now present in kern_jail.c. The MAC Framework is now also able to instrument privilege checks, both to deny privileges otherwise granted (mac_priv_check()), and to grant privileges otherwise denied (mac_priv_grant()), permitting MAC Policy modules to implement privilege models, as well as control a much broader range of system behavior in order to constrain processes running with root privilege. The suser() and suser_cred() functions remain implemented, now in terms of priv_check() and the PRIV_ROOT privilege, for use during the transition and possibly continuing use by third party kernel modules that have not been updated. The PRIV_DRIVER privilege exists to allow device drivers to check privilege without adopting a more specific privilege identifier. This change does not modify the actual security policy, rather, it modifies the interface for privilege checks so changes to the security policy become more feasible. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>
* Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.hrwatson2006-10-221-1/+2
| | | | | | | | | | | | | begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead. This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd. Obtained from: TrustedBSD Project Sponsored by: SPARTA
* Declare security and security.bsd sysctl hierarchies in sysctl.h alongrwatson2006-09-171-1/+0
| | | | | | | | with other commonly used sysctl name spaces, rather than declaring them all over the place. MFC after: 1 month Sponsored by: nCircle Network Security, Inc.
* Push Giant down in jails. Pass the MPSAFE flag to NDINIT, and keep trackcsjp2005-09-281-16/+15
| | | | | | | | | of whether or not Giant was picked up by the filesystem. Add VFS_LOCK_GIANT macros around vrele as it's possible that this can call in the VOP_INACTIVE filesystem specific code. Also while we are here, remove the Giant assertion. from the sysctl handler, we do not actually require Giant here so we shouldn't assert it. Doing so will just complicate things when Giant is removed from the sysctl framework.
* Actually only protect mount-point if security.jail.enforce_statfs is set to 2.pjd2005-06-231-1/+0
| | | | | | | If we don't return statistics about requested file systems, system tools may not work correctly or at all. Approved by: re (scottl)
* Rename sysctl security.jail.getfsstatroot_only to security.jail.enforce_statfspjd2005-06-091-11/+86
| | | | | | | | | | | | | | | and extend its functionality: value policy 0 show all mount-points without any restrictions 1 show only mount-points below jail's chroot and show only part of the mount-point's path (if jail's chroot directory is /jails/foo and mount-point is /jails/foo/usr/home only /usr/home will be shown) 2 show only mount-point where jail's chroot directory is placed. Default value is 2. Discussed with: rwatson
* - Use taskqueue_thread rather than taskqueue_swi since our task is goingjeff2005-04-051-1/+1
| | | | | to vrele, which may vop lock. This is not safe in a software interrupt context.
* Drop a bogus mp_fixme(). Adding a lock would do nothing to reduce userlandjhb2005-03-311-2/+0
| | | | races regarding changing of jail-related sysctls.
* Add a new sysctl, "security.jail.chflags_allowed", which controls thecperciva2005-02-081-0/+5
| | | | | | | | | | | | | behaviour of chflags within a jail. If set to 0 (the default), then a jailed root user is treated as an unprivileged user; if set to 1, then a jailed root user is treated the same as an unjailed root user. This is necessary to allow "make installworld" to work inside a jail, since it attempts to manipulate the system immutable flag on certain files. Discussed with: csjp, rwatson MFC after: 2 weeks
OpenPOWER on IntegriCloud