From b7d1206a4939a236b435bd6853f19469d02a7208 Mon Sep 17 00:00:00 2001 From: phk Date: Mon, 29 May 2000 18:18:07 +0000 Subject: The Jail paper, written jointly by rwatson & me. --- share/doc/papers/jail/Makefile | 10 + share/doc/papers/jail/future.ms | 104 ++++++++ share/doc/papers/jail/implementation.ms | 126 +++++++++ share/doc/papers/jail/jail01.eps | 234 +++++++++++++++++ share/doc/papers/jail/jail01.fig | 86 +++++++ share/doc/papers/jail/mgt.ms | 218 ++++++++++++++++ share/doc/papers/jail/paper.ms | 437 ++++++++++++++++++++++++++++++++ 7 files changed, 1215 insertions(+) create mode 100644 share/doc/papers/jail/Makefile create mode 100644 share/doc/papers/jail/future.ms create mode 100644 share/doc/papers/jail/implementation.ms create mode 100644 share/doc/papers/jail/jail01.eps create mode 100644 share/doc/papers/jail/jail01.fig create mode 100644 share/doc/papers/jail/mgt.ms create mode 100644 share/doc/papers/jail/paper.ms (limited to 'share/doc') diff --git a/share/doc/papers/jail/Makefile b/share/doc/papers/jail/Makefile new file mode 100644 index 0000000..174af30 --- /dev/null +++ b/share/doc/papers/jail/Makefile @@ -0,0 +1,10 @@ +# $FreeBSD$ +PRINTERDEVICE=ps +NODOCCOMPRESS=1 +VOLUME= papers +DOC= jail +SRCS= paper.ms +MACROS= -ms -U +OBJS= implementation.ms mgt.ms future.ms + +.include diff --git a/share/doc/papers/jail/future.ms b/share/doc/papers/jail/future.ms new file mode 100644 index 0000000..01c325d --- /dev/null +++ b/share/doc/papers/jail/future.ms @@ -0,0 +1,104 @@ +.\" +.\" $FreeBSD$ +.\" +.NH +Future Directions +.PP +The jail facility has already been deployed in numerous capacities and +a few opportunities for improvement have manifested themselves. +.NH 2 +Improved Virtualisation +.PP +As it stands, the jail code provides a strict subset of system resources +to the jail environment, based on access to processes, files, network +resources, and privileged services. 
+Virtualisation, or making the jail environments appear to be fully +functional FreeBSD systems, allows maximum application support and the +ability to offer a wide range of services within a jail environment. +However, there are a number of limitations on the degree of virtualisation +in the current code, and removing these limitations will enhance the +ability to offer services in a jail environment. +Two areas that deserve greater attention are the virtualisation of +network resources, and management of scheduling resources. +.PP +Currently, a single IP address may be allocated to each jail, and all +communication from the jail is limited to that IP address. +In particular, these addresses are IPv4 addresses. +There has been substantial interest in improving interface virtualisation, +allowing one or more addresses to be assigned to an interface, and +removing the requirement that the address be an IPv4 address, allowing +the use of IPv6. +Also, access to raw sockets is currently prohibited, as the current +implementation of raw sockets allows access to raw IP packets associated +with all interfaces. +Limiting the scope of the raw socket would allow its safe use within +a jail, re-enabling support for ping, and other network debugging and +evaluation tools. +.PP +Another area of great interest to the current consumers of the jail code +is the ability to limit the impact of one jail on the CPU resources +available for other jails. +Specifically, this would require that the jail of a process play a role in +its scheduling parameters. +Prior work in the area of lottery scheduling, currently available as +patches on FreeBSD 2.2.x, might be leveraged to allow some degree of +partitioning between jail environments \s-2[LOTTERY1] [LOTTERY2]\s+2.
+However, as the current scheduling mechanism is targeted at time +sharing, and FreeBSD does not currently support real time preemption +of processes in the kernel, complete partitioning is not possible within the +current framework. +.NH 2 +Improved Management +.PP +Management of jail environments is currently somewhat ad hoc--creating +and starting jails is a well-documented procedure, but day-to-day +management of jails, as well as special case procedures such as shutdown, +are not well analysed and documented. +The current kernel process management infrastructure does not have the +ability to manage pools of processes in a jail-centric way. +For example, it is possible, from within a jail, to deliver a signal to all +processes in that jail, but it is not possible to atomically target all +processes within a jail from outside of the jail. +If the jail code is to effectively limit the behaviour of a jail, the +ability to shut it down cleanly is paramount. +Similarly, shutting down a jail cleanly from within is also not well +defined, the traditional shutdown utilities having been written with +a host environment in mind. +This suggests a number of improvements, both in the kernel and in the +user-land utility set. +.PP +First, the ability to address kernel-centric management mechanisms at +jails is important. +One way in which this might be done is to assign a unique jail id, not +unlike a process id or process group id, at jail creation time. +A new jailkill() syscall would permit the direction of signals to +specific jailids, allowing for the effective termination of all processes +in the jail. +A unique jailid could also supplant the hostname as the unique +identifier for a jail, allowing the hostname to be changed by the +processes in the jail without interfering with jail management. +.PP +More carefully defining the user-land semantics of a jail during startup +and shutdown is also important.
+The traditional FreeBSD environment makes use of an init process to +bring the system up during the boot process, and to assist in shutdown. +A similar technique might be used for jail, in effect a jailinit, +formulated to handle the clean startup and shutdown, including calling +out to jail-local /etc/rc.shutdown, and other useful shutdown functions. +A jailinit would also present a central location for delivering +management requests to within a jail from the host environment, allowing +the host environment to request the shutdown of the jail cleanly, before +resorting to terminating processes, in the same style as the host +environment shutting down before killing all processes and halting the +kernel. +.PP +Improvements in the host environment would also assist in improving +jail management, possibly including automated runtime jail management tools, +tools to more easily construct the per-jail file system area, and +the inclusion of jail shutdown as part of normal system shutdown. +.PP +These improvements in the jail framework would improve both raw +functionality and usability from a management perspective. +The jail code has raised significant interest in the FreeBSD community, +and it is hoped that this type of improved functionality will be +available in upcoming releases of FreeBSD. diff --git a/share/doc/papers/jail/implementation.ms b/share/doc/papers/jail/implementation.ms new file mode 100644 index 0000000..eafc8f2 --- /dev/null +++ b/share/doc/papers/jail/implementation.ms @@ -0,0 +1,126 @@ +.\" +.\" $FreeBSD$ +.\" +.NH +Implementation of jail in the FreeBSD kernel. +.NH 2 +The jail(2) system call, allocation, refcounting and deallocation of +\fCstruct prison\fP. +.PP +The jail(2) system call is implemented as a non-optional system call +in FreeBSD. Other system calls are controlled by compile time options +in the kernel configuration file, but due to the minute footprint of +the jail implementation, it was decided to make it a standard +facility in FreeBSD.
+.PP +The implementation of the system call is straightforward: a data structure +is allocated and populated with the arguments provided. The data structure +is attached to the current process' \fCstruct proc\fP, its reference count +set to one and a call to the +chroot(2) syscall implementation completes the task. +.PP +Hooks in the code implementing process creation and destruction maintain +the reference count on the data structure and free it when the last reference +is lost. +Any new process created by a process in a jail will inherit a reference +to the jail, which effectively puts the new process in the same jail. +.PP +There is no way to modify the contents of the data structure describing +the jail after its creation, and no way to attach a process to an existing +jail if it was not created from inside that jail. +.NH 2 +Fortification of the chroot(2) facility for filesystem name scoping. +.PP +A number of ways to escape the confines of a chroot(2)-created subscope +of the filesystem view have been identified over the years. +chroot(2) was never intended to be a security mechanism as such, but even +so the ftp daemon largely depended on the security provided by +chroot(2) to provide the ``anonymous ftp'' access method. +.PP +Three classes of escape routes existed: recursive chroot(2) escapes, +``..'' based escapes and fchdir(2) based escapes. +All of these exploited the fact that chroot(2) didn't try sufficiently +hard to enforce the new root directory. +.PP +New code was added to detect and thwart these escapes, amongst +other things by tracking the directory of the first level of chroot(2) +experienced by a process and refusing backwards traversal across +this directory, as well as additional code to refuse chroot(2) if +file-descriptors were open referencing directories. +.NH 2 +Restriction of process visibility and interaction. +.PP +A macro was already available in the kernel to determine if one process +could affect another process.
This macro did the rather complex checking +of uid and gid values. It was felt that the complexity of the macro was +approaching the lower edge of IOCCC entrance criteria, and it was therefore +converted to a proper function named \fCp_trespass(p1, p2)\fP which does +all the previous checks and additionally checks the jail aspect of the access. +The check is implemented such that access fails if the origin process is jailed +but the target process is not in the same jail. +.PP +Process visibility is provided through two mechanisms in FreeBSD, +the \fCprocfs\fP file system and a sub-tree of the \fCsysctl\fP tree. +Both of these were modified to report only the processes in the same +jail to a jailed process. +.NH 2 +Restriction to one IP number. +.PP +Restricting TCP and UDP access to just one IP number was done almost +entirely in the code which manages ``protocol control blocks''. +When a jailed process binds to a socket, the IP number provided by +the process will not be used; instead the pre-configured IP number of +the jail is used. +.PP +BSD based TCP/IP network stacks sport a special interface, the loop-back +interface, which has the ``magic'' IP number 127.0.0.1. +This is often used by processes to contact servers on the local machine, +and consequently special handling for jails was needed. +To handle this case it was necessary to also intercept and modify the +behaviour of connection establishment, and when the 127.0.0.1 address +was seen from a jailed process, substitute the jail's configured IP number. +.PP +Finally the APIs through which the network configuration and connection +state may be queried were modified to report only information relevant +to the configured IP number of a jailed process. +.NH 2 +Adding jail awareness to selected device drivers. +.PP +A couple of device drivers needed to be taught about jails; the ``pty'' +driver is one of them.
The pty driver provides ``virtual terminals'' to +services like telnet, ssh, rlogin and X11 terminal window programs. +Therefore jails need access to the pty driver, and code had to be added +to enforce that a particular virtual terminal was not accessed from more +than one jail at the same time. +.NH 2 +General restriction of super-user powers for jailed super-users. +.PP +This item proved to be the simplest but most tedious to implement. +Tedious because a manual review of all places where the kernel allowed +the super user special powers was called for, +simple because very few places were required to let a jailed root through. +Of the approximately 260 checks in the FreeBSD 4.0 kernel, only +about 35 will let a jailed root through. +.PP +Since the default is for jailed roots to not receive privilege, new +code or drivers in the FreeBSD kernel are automatically jail-aware: they +will refuse jailed roots privilege. +The other part of this protection comes from the fact that a jailed +root cannot create new device nodes with the mknod(2) system call, so +unless the machine administrator creates device nodes for a particular +device inside the jail's filesystem tree, the driver in effect does +not exist in the jail. +.PP +As a side-effect of this work the suser(9) API was cleaned up and +extended to cater for not only the jail facility, but also to make room +for future partitioning facilities. +.NH 2 +Implementation statistics +.PP +The change of the suser(9) API modified approx 350 source lines +distributed over approx. 100 source files. The vast majority of +these changes were generated automatically with a script. +.PP +The implementation of the jail facility added approx 200 lines of +code in total, distributed over approx. 50 files, and about 200 lines +in two new kernel files.
diff --git a/share/doc/papers/jail/jail01.eps b/share/doc/papers/jail/jail01.eps new file mode 100644 index 0000000..ffcfa30 --- /dev/null +++ b/share/doc/papers/jail/jail01.eps @@ -0,0 +1,234 @@ +%!PS-Adobe-2.0 EPSF-2.0 +%%Title: jail01.eps +%%Creator: fig2dev Version 3.2 Patchlevel 1 +%%CreationDate: Fri Mar 24 20:37:59 2000 +%%For: $FreeBSD$ +%%Orientation: Portrait +%%BoundingBox: 0 0 425 250 +%%Pages: 0 +%%BeginSetup +%%EndSetup +%%Magnification: 1.0000 +%%EndComments +/$F2psDict 200 dict def +$F2psDict begin +$F2psDict /mtrx matrix put +/col-1 {0 setgray} bind def +/col0 {0.000 0.000 0.000 srgb} bind def +/col1 {0.000 0.000 1.000 srgb} bind def +/col2 {0.000 1.000 0.000 srgb} bind def +/col3 {0.000 1.000 1.000 srgb} bind def +/col4 {1.000 0.000 0.000 srgb} bind def +/col5 {1.000 0.000 1.000 srgb} bind def +/col6 {1.000 1.000 0.000 srgb} bind def +/col7 {1.000 1.000 1.000 srgb} bind def +/col8 {0.000 0.000 0.560 srgb} bind def +/col9 {0.000 0.000 0.690 srgb} bind def +/col10 {0.000 0.000 0.820 srgb} bind def +/col11 {0.530 0.810 1.000 srgb} bind def +/col12 {0.000 0.560 0.000 srgb} bind def +/col13 {0.000 0.690 0.000 srgb} bind def +/col14 {0.000 0.820 0.000 srgb} bind def +/col15 {0.000 0.560 0.560 srgb} bind def +/col16 {0.000 0.690 0.690 srgb} bind def +/col17 {0.000 0.820 0.820 srgb} bind def +/col18 {0.560 0.000 0.000 srgb} bind def +/col19 {0.690 0.000 0.000 srgb} bind def +/col20 {0.820 0.000 0.000 srgb} bind def +/col21 {0.560 0.000 0.560 srgb} bind def +/col22 {0.690 0.000 0.690 srgb} bind def +/col23 {0.820 0.000 0.820 srgb} bind def +/col24 {0.500 0.190 0.000 srgb} bind def +/col25 {0.630 0.250 0.000 srgb} bind def +/col26 {0.750 0.380 0.000 srgb} bind def +/col27 {1.000 0.500 0.500 srgb} bind def +/col28 {1.000 0.630 0.630 srgb} bind def +/col29 {1.000 0.750 0.750 srgb} bind def +/col30 {1.000 0.880 0.880 srgb} bind def +/col31 {1.000 0.840 0.000 srgb} bind def + +end +save +-117.0 298.0 translate +1 -1 scale + +/cp {closepath} bind def +/ef 
{eofill} bind def +/gr {grestore} bind def +/gs {gsave} bind def +/sa {save} bind def +/rs {restore} bind def +/l {lineto} bind def +/m {moveto} bind def +/rm {rmoveto} bind def +/n {newpath} bind def +/s {stroke} bind def +/sh {show} bind def +/slc {setlinecap} bind def +/slj {setlinejoin} bind def +/slw {setlinewidth} bind def +/srgb {setrgbcolor} bind def +/rot {rotate} bind def +/sc {scale} bind def +/sd {setdash} bind def +/ff {findfont} bind def +/sf {setfont} bind def +/scf {scalefont} bind def +/sw {stringwidth} bind def +/tr {translate} bind def +/tnt {dup dup currentrgbcolor + 4 -2 roll dup 1 exch sub 3 -1 roll mul add + 4 -2 roll dup 1 exch sub 3 -1 roll mul add + 4 -2 roll dup 1 exch sub 3 -1 roll mul add srgb} + bind def +/shd {dup dup currentrgbcolor 4 -2 roll mul 4 -2 roll mul + 4 -2 roll mul srgb} bind def +/$F2psBegin {$F2psDict begin /$F2psEnteredState save def} def +/$F2psEnd {$F2psEnteredState restore end} def +%%EndProlog + +$F2psBegin +10 setmiterlimit +n -1000 5962 m -1000 -1000 l 10022 -1000 l 10022 5962 l cp clip + 0.06000 0.06000 sc +/Courier-BoldOblique ff 180.00 scf sf +7725 3600 m +gs 1 -1 sc (10.0.0.2) dup sw pop neg 0 rm col0 sh gr +% Polyline +15.000 slw +n 9000 3300 m 9000 4275 l gs col0 s gr +% Polyline +2 slc +n 7875 3225 m 7800 3225 l gs col0 s gr +% Polyline +0 slc +n 7875 4125 m 7800 4125 l gs col0 s gr +% Polyline +n 7875 3225 m 7875 4425 l gs col0 s gr +% Polyline +n 7875 3825 m 7800 3825 l gs col0 s gr +% Polyline +n 7875 3525 m 7800 3525 l gs col0 s gr +% Polyline +n 8175 3825 m 7875 3825 l gs col0 s gr +% Polyline +2 slc +n 7875 4425 m 7800 4425 l gs col0 s gr +/Courier-Bold ff 180.00 scf sf +8700 3900 m +gs 1 -1 sc (fxp0) dup sw pop neg 0 rm col0 sh gr +% Polyline +0 slc +7.500 slw +n 2925 1425 m 3075 1425 l gs col0 s gr +% Polyline +15.000 slw +n 2475 1350 m 2472 1347 l 2465 1342 l 2453 1334 l 2438 1323 l 2420 1311 l + 2401 1299 l 2383 1289 l 2366 1281 l 2351 1275 l 2338 1274 l + 2325 1275 l 2314 1279 l 2303 1285 l 2291 
1293 l 2278 1303 l + 2264 1314 l 2250 1326 l 2236 1339 l 2222 1353 l 2209 1366 l + 2198 1379 l 2188 1391 l 2181 1403 l 2177 1414 l 2175 1425 l + 2177 1436 l 2181 1447 l 2188 1459 l 2198 1471 l 2209 1484 l + 2222 1497 l 2236 1511 l 2250 1524 l 2264 1536 l 2278 1547 l + 2291 1557 l 2303 1565 l 2314 1571 l 2325 1575 l 2338 1576 l + 2351 1575 l 2366 1569 l 2383 1561 l 2401 1551 l 2420 1539 l + 2438 1527 l 2453 1516 l 2465 1508 l 2472 1503 l 2475 1500 l gs col0 s gr +/Courier-Bold ff 180.00 scf sf +2550 1500 m +gs 1 -1 sc (lo0) col0 sh gr +/Courier-BoldOblique ff 180.00 scf sf +3075 1500 m +gs 1 -1 sc (127.0.0.1) col0 sh gr +% Polyline +7.500 slw +n 2100 3525 m 2250 3525 l gs col0 s gr +% Polyline +n 2550 2100 m 2250 2400 l 2250 4500 l 2550 4800 l gs col0 s gr +/Courier-Bold ff 180.00 scf sf +1950 3600 m +gs 1 -1 sc (/) col0 sh gr +/Courier-Bold ff 180.00 scf sf +2550 3900 m +gs 1 -1 sc (jail_1/) col0 sh gr +/Courier-Bold ff 180.00 scf sf +2550 4200 m +gs 1 -1 sc (jail_2/) col0 sh gr +/Courier-Bold ff 180.00 scf sf +2550 4500 m +gs 1 -1 sc (jail_3/) col0 sh gr +/Courier-Bold ff 180.00 scf sf +2550 2400 m +gs 1 -1 sc (dev/) col0 sh gr +/Courier-Bold ff 180.00 scf sf +2550 2700 m +gs 1 -1 sc (etc/) col0 sh gr +/Courier-Bold ff 180.00 scf sf +2550 3000 m +gs 1 -1 sc (usr/) col0 sh gr +/Courier-Bold ff 180.00 scf sf +2550 3300 m +gs 1 -1 sc (var/) col0 sh gr +/Courier-Bold ff 180.00 scf sf +2550 3600 m +gs 1 -1 sc (home/) col0 sh gr +% Polyline +n 3375 3825 m 3900 3825 l 4950 1800 l 5100 1800 l gs col0 s gr +% Polyline +n 3375 4125 m 3900 4125 l 4950 3900 l 5100 3900 l gs col0 s gr +% Polyline +n 5400 900 m 5100 1200 l 5100 2400 l 5400 2700 l gs col0 s gr +% Polyline +n 5400 3000 m 5100 3300 l 5100 4500 l 5400 4800 l gs col0 s gr +% Polyline +n 4650 825 m 4650 2775 l 6675 2775 l 6675 3375 l 7950 3375 l 7950 825 l + cp gs col0 s gr +% Polyline +n 4650 2775 m 4650 4950 l 6300 4950 l 6300 3675 l 7950 3675 l 7950 3375 l + 6675 3375 l 6675 2775 l cp gs col0 s gr +/Courier-Bold 
ff 180.00 scf sf +5400 1200 m +gs 1 -1 sc (dev/) col0 sh gr +/Courier-Bold ff 180.00 scf sf +5400 1500 m +gs 1 -1 sc (etc/) col0 sh gr +/Courier-Bold ff 180.00 scf sf +5400 1800 m +gs 1 -1 sc (usr/) col0 sh gr +/Courier-Bold ff 180.00 scf sf +5400 2100 m +gs 1 -1 sc (var/) col0 sh gr +/Courier-Bold ff 180.00 scf sf +5400 2400 m +gs 1 -1 sc (home/) col0 sh gr +/Courier-Bold ff 180.00 scf sf +5400 3300 m +gs 1 -1 sc (dev/) col0 sh gr +/Courier-Bold ff 180.00 scf sf +5400 3600 m +gs 1 -1 sc (etc/) col0 sh gr +/Courier-Bold ff 180.00 scf sf +5400 3900 m +gs 1 -1 sc (usr/) col0 sh gr +/Courier-Bold ff 180.00 scf sf +5400 4200 m +gs 1 -1 sc (var/) col0 sh gr +/Courier-Bold ff 180.00 scf sf +5400 4500 m +gs 1 -1 sc (home/) col0 sh gr +/Courier-BoldOblique ff 180.00 scf sf +7725 3300 m +gs 1 -1 sc (10.0.0.1) dup sw pop neg 0 rm col0 sh gr +/Courier-BoldOblique ff 180.00 scf sf +7725 4500 m +gs 1 -1 sc (10.0.0.5) dup sw pop neg 0 rm col0 sh gr +/Courier-BoldOblique ff 180.00 scf sf +7725 4200 m +gs 1 -1 sc (10.0.0.4) dup sw pop neg 0 rm col0 sh gr +/Courier-BoldOblique ff 180.00 scf sf +7725 3900 m +gs 1 -1 sc (10.0.0.3) dup sw pop neg 0 rm col0 sh gr +% Polyline +15.000 slw +n 9000 3825 m 8775 3825 l gs col0 s gr +$F2psEnd +rs diff --git a/share/doc/papers/jail/jail01.fig b/share/doc/papers/jail/jail01.fig new file mode 100644 index 0000000..d4ef165 --- /dev/null +++ b/share/doc/papers/jail/jail01.fig @@ -0,0 +1,86 @@ +#FIG 3.2 +# $FreeBSD$ +Landscape +Center +Inches +A4 +100.00 +Single +-2 +1200 2 +6 7725 3150 9075 4500 +6 8700 3225 9075 4350 +2 1 0 2 0 7 100 0 -1 0.000 0 0 -1 0 0 2 + 9000 3825 8775 3825 +2 1 0 2 0 7 100 0 -1 0.000 0 0 -1 0 0 2 + 9000 3300 9000 4275 +-6 +2 1 0 2 0 7 100 0 -1 0.000 0 2 -1 0 0 2 + 7875 3225 7800 3225 +2 1 0 2 0 7 100 0 -1 0.000 0 0 -1 0 0 2 + 7875 4125 7800 4125 +2 1 0 2 0 7 100 0 -1 0.000 0 0 -1 0 0 2 + 7875 3225 7875 4425 +2 1 0 2 0 7 100 0 -1 0.000 0 0 -1 0 0 2 + 7875 3825 7800 3825 +2 1 0 2 0 7 100 0 -1 0.000 0 0 -1 0 0 2 + 7875 3525 
7800 3525 +2 1 0 2 0 7 100 0 -1 0.000 0 0 -1 0 0 2 + 8175 3825 7875 3825 +2 1 0 2 0 7 100 0 -1 0.000 0 2 -1 0 0 2 + 7875 4425 7800 4425 +4 2 0 100 0 14 12 0.0000 4 180 420 8700 3900 fxp0\001 +-6 +6 2100 1200 4050 1650 +2 1 0 1 0 7 100 0 -1 0.000 0 0 -1 0 0 2 + 2925 1425 3075 1425 +3 2 0 2 0 7 100 0 -1 0.000 0 0 0 5 + 2475 1350 2325 1275 2175 1425 2325 1575 2475 1500 + 0.000 -1.000 -1.000 -1.000 0.000 +4 0 0 100 0 14 12 0.0000 4 135 315 2550 1500 lo0\001 +4 0 0 100 0 15 12 0.0000 4 135 945 3075 1500 127.0.0.1\001 +-6 +6 1950 2100 3300 4800 +2 1 0 1 0 7 100 0 -1 0.000 0 0 -1 0 0 2 + 2100 3525 2250 3525 +2 1 0 1 0 7 100 0 -1 0.000 0 0 -1 0 0 4 + 2550 2100 2250 2400 2250 4500 2550 4800 +4 0 0 100 0 14 12 0.0000 4 150 105 1950 3600 /\001 +4 0 0 100 0 14 12 0.0000 4 180 735 2550 3900 jail_1/\001 +4 0 0 100 0 14 12 0.0000 4 180 735 2550 4200 jail_2/\001 +4 0 0 100 0 14 12 0.0000 4 180 735 2550 4500 jail_3/\001 +4 0 0 100 0 14 12 0.0000 4 165 420 2550 2400 dev/\001 +4 0 0 100 0 14 12 0.0000 4 150 420 2550 2700 etc/\001 +4 0 0 100 0 14 12 0.0000 4 150 420 2550 3000 usr/\001 +4 0 0 100 0 14 12 0.0000 4 150 420 2550 3300 var/\001 +4 0 0 100 0 14 12 0.0000 4 165 525 2550 3600 home/\001 +-6 +2 1 0 1 0 7 100 0 -1 0.000 0 0 -1 0 0 4 + 3375 3825 3900 3825 4950 1800 5100 1800 +2 1 0 1 0 7 100 0 -1 0.000 0 0 -1 0 0 4 + 3375 4125 3900 4125 4950 3900 5100 3900 +2 1 0 1 0 7 100 0 -1 0.000 0 0 -1 0 0 4 + 5400 900 5100 1200 5100 2400 5400 2700 +2 1 0 1 0 7 100 0 -1 0.000 0 0 -1 0 0 4 + 5400 3000 5100 3300 5100 4500 5400 4800 +2 3 0 1 0 7 100 0 -1 0.000 0 0 -1 0 0 7 + 4650 825 4650 2775 6675 2775 6675 3375 7950 3375 7950 825 + 4650 825 +2 3 0 1 0 7 100 0 -1 0.000 0 0 -1 0 0 9 + 4650 2775 4650 4950 6300 4950 6300 3675 7950 3675 7950 3375 + 6675 3375 6675 2775 4650 2775 +4 0 0 100 0 14 12 0.0000 4 165 420 5400 1200 dev/\001 +4 0 0 100 0 14 12 0.0000 4 150 420 5400 1500 etc/\001 +4 0 0 100 0 14 12 0.0000 4 150 420 5400 1800 usr/\001 +4 0 0 100 0 14 12 0.0000 4 150 420 5400 2100 var/\001 +4 0 
0 100 0 14 12 0.0000 4 165 525 5400 2400 home/\001 +4 0 0 100 0 14 12 0.0000 4 165 420 5400 3300 dev/\001 +4 0 0 100 0 14 12 0.0000 4 150 420 5400 3600 etc/\001 +4 0 0 100 0 14 12 0.0000 4 150 420 5400 3900 usr/\001 +4 0 0 100 0 14 12 0.0000 4 150 420 5400 4200 var/\001 +4 0 0 100 0 14 12 0.0000 4 165 525 5400 4500 home/\001 +4 2 0 100 0 15 12 0.0000 4 135 840 7725 3300 10.0.0.1\001 +4 2 0 100 0 15 12 0.0000 4 135 840 7725 4500 10.0.0.5\001 +4 2 0 100 0 15 12 0.0000 4 135 840 7725 4200 10.0.0.4\001 +4 2 0 100 0 15 12 0.0000 4 135 840 7725 3900 10.0.0.3\001 +4 2 0 100 0 15 12 0.0000 4 135 840 7725 3600 10.0.0.2\001 diff --git a/share/doc/papers/jail/mgt.ms b/share/doc/papers/jail/mgt.ms new file mode 100644 index 0000000..b9b5b31 --- /dev/null +++ b/share/doc/papers/jail/mgt.ms @@ -0,0 +1,218 @@ +.\" +.\" $FreeBSD$ +.\" +.NH +Managing Jails and the Jail File System Environment +.NH 2 +Creating a Jail Environment +.PP +While the jail(2) call could be used in a number of ways, the expected +configuration creates a complete FreeBSD installation for each jail. +This includes copies of all relevant system binaries, data files, and its +own \fC/etc\fP directory. +Such a configuration maximises the independence of various jails, +and reduces the chances of interference between jails being possible, +especially when it is desirable to provide root access within a jail to +a less trusted user. +.PP +On a box making use of the jail facility, we refer to two types of +environment: the host environment, and the jail environment. +The host environment is the real operating system environment, which is +used to configure interfaces, and start up the jails. +There are then one or more jail environments, effectively virtual +FreeBSD machines. +When configuring Jail for use, it is necessary to configure both the +host and jail environments to prevent overlap. 
+.PP +As jailed virtual machines are generally bound to an IP address configured +using the normal IP alias mechanism, those jail IP addresses are also +accessible to host environment applications to use. +If the accessibility of some host applications in the jail environment is +not desirable, it is necessary to configure those applications to only +listen on appropriate addresses. +.PP +In most of the production environments where jail is currently in use, +one IP address is allocated to the host environment, and then a number +are allocated to jail boxes, with each jail box receiving a unique IP. +In this situation, it is sufficient to configure the networking applications +on the host to listen only on the host IP. +Generally, this consists of specifying the appropriate IP address to be +used by inetd and SSH, and disabling applications that are not capable +of limiting their address scope, such as sendmail, the port mapper, and +syslogd. +Other third party applications that have been installed on the host must also be +configured in this manner, or users connecting to the jailbox will +discover the host environment service, unless the jailbox has +specifically bound a service to that port. +In some situations, this can actually be the desirable behaviour. +.PP +The jail environments must also be custom-configured. +This consists of building and installing a miniature version of the +FreeBSD file system tree off of a subdirectory in the host environment, +usually \fC/usr/jail\fP, or \fC/data/jail\fP, with a subdirectory per jail. +Appropriate instructions for generating this tree are included in the +jail(8) man page, but generally this process may be automated using the +FreeBSD build environment. +.PP +One notable difference from the default FreeBSD install is that only +a limited set of device nodes should be created. +MAKEDEV(8) has been modified to accept a ``jail'' argument that creates +the correct set of nodes. 
+.PP +To improve storage efficiency, a fair number of the binaries in the system tree +may be deleted, as they are not relevant in a jail environment. +This includes the kernel, boot loader, and related files, as well as +hardware and network configuration tools. +.PP +After the creation of the jail tree, the easiest way to configure it is +to start up the jail in single-user mode. +The sysinstall admin tool may be used to help with the task, although +it is not installed by default as part of the system tree. +These tools should be run in the jail environment, or they will affect +the host environment's configuration. +.DS +.ft C +.ps -2 +# mkdir /data/jail/192.168.11.100/stand +# cp /stand/sysinstall /data/jail/192.168.11.100/stand +# jail /data/jail/192.168.11.100 testhostname 192.168.11.100 \e + /bin/sh +.ps +2 +.R +.DE +.PP +After running the jail command, the shell is now within the jail environment, +and all further commands +will be limited to the scope of the jail until the shell exits. +If the network alias has not yet been configured, then the jail will be +unable to access the network. +.PP +The startup configuration of the jail environment may be configured so +as to quell warnings from services that cannot run in the jail. +Also, any per-system configuration required for a normal FreeBSD system +is also required for each jailbox. +Typically, this includes: +.IP "" 5n +\(bu Create empty /etc/fstab +.IP +\(bu Disable portmapper +.IP +\(bu Run newaliases +.IP +\(bu Disable interface configuration +.IP +\(bu Configure the resolver +.IP +\(bu Set root password +.IP +\(bu Set timezone +.IP +\(bu Add any local accounts +.IP +\(bu Install any packages +.NH 2 +Starting Jails +.PP +Jails are typically started by executing their /etc/rc script in much +the same manner a shell was started in the previous section. +Before starting the jail, any relevant networking configuration +should also be performed.
+Typically, this involves adding an additional IP address to the +appropriate network interface, setting network properties for the +IP address using IP filtering, forwarding, and bandwidth shaping, +and mounting a process file system for the jail, if the ability to +debug processes from within the jail is desired. +.DS +.ft C +.ps -2 +# ifconfig ed0 inet add 192.168.11.100 netmask 255.255.255.255 +# mount -t procfs proc /data/jail/192.168.11.100/proc +# jail /data/jail/192.168.11.100 testhostname 192.168.11.100 \e + /bin/sh /etc/rc +.ps +2 +.ft P +.DE +.PP +A few warnings are generated for sysctl's that are not permitted +to be set within the jail, but the end result is a set of processes +in an isolated process environment, bound to a single IP address. +Normal procedures for accessing a FreeBSD machine apply: telneting in +through the network reveals a telnet prompt, login, and shell. +.DS +.ft C +.ps -2 +% ps ax + PID TT STAT TIME COMMAND + 228 ?? SsJ 0:18.73 syslogd + 247 ?? IsJ 0:00.05 inetd -wW + 249 ?? IsJ 0:28.43 cron + 252 ?? SsJ 0:30.46 sendmail: accepting connections on port 25 + 291 ?? IsJ 0:38.53 /usr/local/sbin/sshd +93694 ?? SJ 0:01.01 sshd: rwatson@ttyp0 (sshd) +93695 p0 SsJ 0:00.06 -csh (csh) +93700 p0 R+J 0:00.00 ps ax +.ps +2 +.ft P +.DE +.PP +It is immediately obvious that the environment is within a jailbox: there +is no init process, no kernel daemons, and a J flag is present beside all +processes indicating the presence of a jail. +.PP +As with any FreeBSD system, accounts may be created and deleted, +mail is delivered, logs are generated, packages may be added, and the +system may be hacked into if configured incorrectly, or running a buggy +version of a piece of software. +However, all of this happens strictly within the scope of the jail. 
+.NH 2 +Jail Management +.PP +Jail management is an interesting prospect, as there are two perspectives +from which a jail environment may be administered: from within the jail, +and from the host environment. +From within the jail, as described above, the process is remarkably similar +to any regular FreeBSD install, although certain actions are prohibited, +such as mounting file systems, modifying system kernel properties, etc. +The only area that really differs is that of shutting +the system down: the processes within the jail may deliver signals +between them, allowing all processes to be killed, but bringing the +system back up requires intervention from outside of the jailbox. +.PP +From outside of the jail, there is a range of capabilities, as well +as limitations. +The jail environment is, in effect, a subset of the host environment: +the jail file system appears as part of the host file system, and may +be directly modified by processes in the host environment. +Processes within the jail appear in the process listing of the host, +and may likewise be signalled or debugged. +The host process file system makes the hostname of the jail environment +accessible in /proc/procnum/status, allowing utilities in the host +environment to manage processes based on jailname. +However, the default configuration allows privileged processes within +jails to set the hostname of the jail, which makes the status file less +useful from a management perspective if the contents of the jail are +malicious. +To prevent a jail from changing its hostname, the +"jail.set_hostname_allowed" sysctl may be set to 0 prior to starting +any jails. +.PP +One aspect immediately observable in an environment with multiple jails +is that uids and gids are local to each jail environment: the uid associated +with a process in one jail may be for a different user than in another +jail.
+This collision of identifiers is only visible in the host environment, +as normally processes from one jail are never visible in an environment +with another scope for user/uid and group/gid mapping. +Managers in the host environment should understand these scoping issues, +or confusion and unintended consequences may result. +.PP +Jailed processes are subject to the normal restrictions present for +any processes, including resource limits, and limits placed by the network +code, including firewall rules. +By specifying firewall rules for the IP address bound to a jail, it is +possible to place connectivity and bandwidth limitations on individual +jails, restricting services that may be consumed or offered. +.PP +Management of jails is an area that will see further improvement in +future versions of FreeBSD. Some of these potential improvements are +discussed later in this paper. diff --git a/share/doc/papers/jail/paper.ms b/share/doc/papers/jail/paper.ms new file mode 100644 index 0000000..ce5096d --- /dev/null +++ b/share/doc/papers/jail/paper.ms @@ -0,0 +1,437 @@ +.\" +.\" $FreeBSD$ +.\" +.ds CH " +.nr PI 2n +.nr PS 12 +.nr LL 15c +.nr PO 3c +.nr FM 3.5c +.po 3c +.TL +Jails: Confining the omnipotent root. +.FS +This paper was presented at the 2nd International System Administration and Networking Conference "SANE 2000" May 22-25, 2000 in Maastricht, The Netherlands and is published in the proceedings. +.FE +.AU +Poul-Henning Kamp +.AU +Robert N. M. Watson +.AI +The FreeBSD Project +.FS +This work was sponsored by \fChttp://www.servetheweb.com/\fP and +donated to the FreeBSD Project for inclusion in the FreeBSD +OS. FreeBSD 4.0-RELEASE was the first release including this +code. +Follow-on work was sponsored by Safeport Network Services, +\fChttp://www.safeport.com/\fP +.FE +.AB +The traditional UNIX security model is simple but inexpressive.
+Adding fine-grained access control improves the expressiveness, +but often dramatically increases both the cost of system management +and implementation complexity. +In environments with a more complex management model, with delegation +of some management functions to parties under varying degrees of trust, +the base UNIX model and most natural +extensions are inappropriate at best. +Where multiple mutually un-trusting parties are introduced, +``inappropriate'' rapidly transitions to ``nightmarish'', especially +with regards to data integrity and privacy protection. +.PP +The FreeBSD ``Jail'' facility provides the ability to partition +the operating system environment, while maintaining the simplicity +of the UNIX ``root'' model. +In Jail, users with privilege find that the scope of their requests +is limited to the jail, allowing system administrators to delegate +management capabilities for each virtual machine +environment. +Creating virtual machines in this manner has many potential uses; the +most popular thus far has been for providing virtual machine services +in Internet Service Provider environments. +.AE +.NH +Introduction +.PP +The UNIX access control mechanism is designed for an environment with two +types of users: those with, and without administrative privilege. +Within this framework, every attempt is made to provide an open +system, allowing easy sharing of files and inter-process communication. +As a member of the UNIX family, FreeBSD inherits these +security properties. +Users of FreeBSD in non-traditional UNIX environments must balance +their need for strong application support, high network performance +and functionality, and low total cost of ownership with the need +for alternative security models that are difficult or impossible to +implement with the UNIX security mechanisms. 
+.PP +One such consideration is the desire to delegate some (but not all) +administrative functions to untrusted or less trusted parties, and +simultaneously impose system-wide mandatory policies on process +interaction and sharing. +Attempting to create such an environment in the current-day FreeBSD +security environment is both difficult and costly: in many cases, +the burden of implementing these policies falls on user +applications, which means an increase in the size and complexity +of the code base, in turn translating to higher development +and maintenance cost, as well as less overall flexibility. +.PP +This abstract risk becomes more clear when applied to a practical, +real-world example: +many web service providers turn to the FreeBSD +operating system to host customer web sites, as it provides a +high-performance, network-centric server environment. +However, these providers have a number of concerns on their plate, both in +terms of protecting the integrity and confidentiality of their own +files and services from their customers, as well as protecting the files +and services of one customer from (accidental or +intentional) access by any other customer. +At the same time, a provider would like to provide +substantial autonomy to customers, allowing them to install and +maintain their own software, and to manage their own services, +such as web servers and other content-related daemon programs. +.PP +This problem space points strongly in the direction of a partitioning +solution, in which customer processes and storage are isolated from those of +other customers, not only in terms of accidental disclosure of data or process +information, but also in terms of the ability to modify files or processes +outside of a compartment. +Delegation of management functions within the system must +be possible, but not at the cost of system-wide requirements, including +integrity and privacy protection between partitions.
+.PP +However, UNIX-style access control makes it notoriously difficult to +compartmentalise functionality. +While mechanisms such as chroot(2) provide a modest +level of compartmentalisation, it is well known +that these mechanisms have serious shortcomings, both in terms of the +scope of their functionality, and effectiveness at what they provide \s-2[CHROOT]\s+2. +.PP +In the case of the chroot(2) call, a process's visibility of +the file system name-space is limited to a single subtree. +However, the compartmentalisation does not extend to the process +or networking spaces and therefore both observation of and interference +with processes outside their compartment is possible. +.PP +To this end, we describe the new FreeBSD ``Jail'' facility, which +provides a strong partitioning solution, leveraging existing +mechanisms, such as chroot(2), to what effectively amounts to a +virtual machine environment. Processes in a jail are provided +full access to the files that they may manipulate, processes they +may influence, and network services they can make use of, and neither +access nor visibility of files, processes or network services outside +their partition. +.PP +Unlike other fine-grained security solutions, Jail does not +substantially increase the policy management requirements for the +system administrator, as each Jail is a virtual FreeBSD environment +permitting local policy to be independently managed, with much the +same properties as the main system itself, making Jail easy to use +for the administrator, and far more compatible with applications. +.NH +Traditional UNIX Security, or, ``God, root, what difference?'' \s-2[UF]\s+2. +.PP +The traditional UNIX access model assigns numeric uids to each user of the +system. In turn, each process ``owned'' by a user will be tagged with that +user's uid in an unforgeable manner.
The uids serve two purposes: first, +they determine how discretionary access control mechanisms will be applied, and +second, they are used to determine whether special privileges are accorded. +.PP +In the case of discretionary access controls, the primary object protected is +a file. The uid (and related gids indicating group membership) are mapped to +a set of rights for each object, courtesy the UNIX file mode, in effect acting +as a limited form of access control list. Jail is, in general, not concerned +with modifying the semantics of discretionary access control mechanisms, +although there are important implications from a management perspective. +.PP +For the purposes of determining whether special privileges are accorded to a +process, the check is simple: ``is the numeric uid equal to 0 ?''. +If so, the +process is acting with ``super-user privileges'', and all access checks are +granted, in effect allowing the process the ability to do whatever it wants +to \**. +.FS +\&... no matter how patently stupid it may be. +.FE +.PP +For the purposes of human convenience, uid 0 is canonically allocated +to the ``root'' user \s-2[ROOT]\s+2. +For the purposes of jail, this behaviour is extremely relevant: many of +these privileged operations can be used to manage system hardware and +configuration, file system name-space, and special network operations. +.PP +Many limitations to this model are immediately clear: the root user is a +single, concentrated source of privilege that is exposed to many pieces of +software, and as such an immediate target for attacks. In the event of a +compromise of the root capability set, the attacker has complete control over +the system. Even without an attacker, the risks of a single administrative +account are serious: delegating a narrow scope of capability to an +inexperienced administrator is difficult, as the granularity of delegation is +that of all system management abilities. 
These features make the omnipotent +root account a sharp, efficient and extremely dangerous tool. +.PP +The BSD family of operating systems has implemented the ``securelevel'' +mechanism which allows the administrator to block certain configuration +and management functions from being performed by root, +until the system is restarted and brought up into single-user mode. +While this does provide some amount of protection in the case of a root +compromise of the machine, it does nothing to address the need for +delegation of certain root abilities. +.NH +Other Solutions to the Root Problem +.PP +Many operating systems attempt to address these limitations by providing +fine-grained access controls for system resources \s-2[BIBA]\s+2. +These efforts vary in +degrees of success, but almost all suffer from at least three serious +limitations: +.PP +First, increasing the granularity of security controls increases the +complexity of the administration process, in turn increasing both the +opportunity for incorrect configuration, as well as the demand on +administrator time and resources. In many cases, the increased complexity +results in significant frustration for the administrator, which may result +in two +disastrous types of policy: ``all doors open as it's too much trouble'', and +``trust that the system is secure, when in fact it isn't''. +.PP +The extent of the trouble is best illustrated by the fact that an entire +niche industry has emerged providing tools to manage fine-grained security +controls \s-2[UAS]\s+2. +.PP +Second, usefully segregating capabilities and assigning them to running code +and users is very difficult. Many privileged operations in UNIX seem +independent, but are in fact closely related, and the handing out of one +privilege may, in effect, be transitive to the many others. For example, in +some trusted operating systems, a system capability may be assigned to a +running process to allow it to read any file, for the purposes of backup.
+However, this capability is, in effect, equivalent to the ability to switch to +any other account, as the ability to access any file provides access to system +keying material, which in turn provides the ability to authenticate as any +user. Similarly, many operating systems attempt to segregate management +capabilities from auditing capabilities. In a number of these operating +systems, however, ``management capabilities'' permit the administrator to +assign ``auditing capabilities'' to itself, or another account, circumventing +the segregation of capability. +.PP +Finally, introducing new security features often involves introducing new +security management APIs. When fine-grained capabilities are introduced to +replace the setuid mechanism in UNIX-like operating systems, applications that +previously did an ``appropriateness check'' to see if they were running as +root before executing must now be changed to know that they need not run as +root. In the case of applications running with privilege and executing other +programs, there is now a new set of privileges that must be voluntarily given +up before executing another program. These changes can introduce significant +incompatibility for existing applications, and make life more difficult for +application developers who may not be aware of differing security semantics on +different systems \s-2[POSIX1e]\s+2. +.NH +The Jail Partitioning Solution +.PP +Jail neatly side-steps the majority of these problems through partitioning. +Rather +than introduce additional fine-grained access control mechanism, we partition +a FreeBSD environment (processes, file system, network resources) into a +management environment, and optionally subset Jail environments. In doing so, +we simultaneously maintain the existing UNIX security model, allowing +multiple users and a privileged root user in each jail, while +limiting the scope of root's activities to his jail.
+Consequently the administrator of a +FreeBSD machine can partition the machine into separate jails, and provide +access to the super-user account in each of these without losing control of +the over-all environment. +.PP +A process in a partition is referred to as ``in jail''. When a FreeBSD +system is booted up after a fresh install, no processes will be in jail. +When +a process is placed in a jail, it, and any descendants of the process created +after the jail creation, will be in that jail. A process may be in only one +jail, and after creation, it cannot leave the jail. +Jails are created when a +privileged process calls the jail(2) syscall, with a description of the jail as an +argument to the call. Each call to jail(2) creates a new jail; the only way +for a new process to enter the jail is by inheriting access to the jail from +another process already in that jail. +Processes may never +leave the jail they created, or were created in. +.KF +.PSPIC jail01.eps 4i +.ce 1 +Fig. 1 \(em Schematic diagram of machine with two configured jails +.sp +.KE +.PP +Membership in a jail involves a number of restrictions: access to the file +name-space is restricted in the style of chroot(2), the ability to bind network +resources is limited to a specific IP address, the ability to manipulate +system resources and perform privileged operations is sharply curtailed, and +the ability to interact with other processes is limited to only processes +inside the same jail. +.PP +Jail takes advantage of the existing chroot(2) behaviour to limit access to the +file system name-space for jailed processes. When a jail is created, it is +bound to a particular file system root. +Processes are unable to manipulate files that they cannot address, +and as such the integrity and confidentiality of files outside of the jail +file system root are protected. Traditional mechanisms for breaking out of +chroot(2) have been blocked.
+In the expected and documented configuration, each jail is provided +with its exclusive file system root, and standard FreeBSD directory layout, +but this is not mandated by the implementation. +.PP +Each jail is bound to a single IP address: processes within the jail may not +make use of any other IP address for outgoing or incoming connections; this +includes the ability to restrict what network services a particular jail may +offer. As FreeBSD distinguishes attempts to bind all IP addresses from +attempts to bind a particular address, bind requests for all IP addresses are +redirected to the individual Jail address. Some network functionality +associated with privileged calls is wholesale disabled due to the nature of the +functionality offered, in particular facilities which would allow ``spoofing'' +of IP numbers or disruptive traffic to be generated have been disabled. +.PP +Processes running without root privileges will notice few, if any differences +between a jailed environment and an un-jailed environment. Processes running with +root privileges will find that many restrictions apply to the privileged calls +they may make. Some calls will now return an access error \(em for example, an +attempt to create a device node will now fail. Others will have a more +limited scope than normal \(em attempts to bind a reserved port number on all +available addresses will result in binding only the address associated with +the jail. Other calls will succeed as normal: root may read a file owned by +any uid, as long as it is accessible through the jail file system name-space. +.PP +Processes within the jail will find that they are unable to interact or +even verify the existence of +processes outside the jail \(em processes within the jail are +prevented from delivering signals to processes outside the jail, as well as +connecting to those processes with debuggers, or even seeing them in the +sysctl or process file system monitoring mechanisms.
Jail does not prevent, +nor is it intended to prevent, the use of covert channels or communications +mechanisms via accepted interfaces \(em for example, two processes may communicate +via sockets over the IP network interface. Nor does it attempt to provide +scheduling services based on the partition; however, it does prevent calls +that interfere with normal process operation. +.PP +As a result of these attempts to retain the standard FreeBSD API and +framework, almost all applications will run unaffected. Standard system +services such as Telnet, FTP, and SSH all behave normally, as do most third +party applications, including the popular Apache web server. +.NH +Jail Implementation +.PP +Processes running with root privileges in the jail find that there are serious +restrictions on what they are capable of doing \(em in particular, activities that +would extend outside of the jail: +.IP "" 5n +\(bu Modifying the running kernel by direct access and loading kernel +modules is prohibited. +.IP +\(bu Modifying any of the network configuration, interfaces, addresses, and +routing table is prohibited. +.IP +\(bu Mounting and unmounting file systems is prohibited. +.IP +\(bu Creating device nodes is prohibited. +.IP +\(bu Accessing raw, divert, or routing sockets is prohibited. +.IP +\(bu Modifying kernel runtime parameters, such as most sysctl settings, is +prohibited. +.IP +\(bu Changing securelevel-related file flags is prohibited. +.IP +\(bu Accessing network resources not associated with the jail is prohibited. +.bp +.PP +Other privileged activities are permitted as long as they are limited to the +scope of the jail: +.IP "" 5n +\(bu Signalling any process within the jail is permitted. +.IP +\(bu Changing the ownership and mode of any file within the jail is permitted, as +long as the file flags permit this. +.IP +\(bu Deleting any file within the jail is permitted, as long as the file flags +permit this.
+.IP +\(bu Binding reserved TCP and UDP port numbers on the jail's IP address is +permitted. (Attempts to bind TCP and UDP ports using INADDR_ANY will be +redirected to the jail's IP address.) +.IP +\(bu Functions which operate on the uid/gid space are all permitted since they +act as labels for filesystem objects of processes +which are partitioned off by other mechanisms. +.PP +These restrictions on root access limit the scope of root processes, enabling +most applications to run un-hindered, but preventing calls that might allow an +application to reach beyond the jail and influence other processes or +system-wide configuration. +.PP +.so implementation.ms +.so mgt.ms +.so future.ms +.NH +Conclusion +.PP +The jail facility provides FreeBSD with a conceptually simple security +partitioning mechanism, allowing the delegation of administrative rights +within virtual machine partitions. +.PP +The implementation relies on +restricting access within the jail environment to a well-defined subset +of the overall host environment. This includes limiting interaction +between processes, and to files, network resources, and privileged +operations. Administrative overhead is reduced through avoiding +fine-grained access control mechanisms, and maintaining a consistent +administrative interface across partitions and the host environment. +.PP +The jail facility has already seen widespread deployment in particular as +a vehicle for delivering "virtual private server" services. +.PP +The jail code is included in the base system as part of FreeBSD 4.0-RELEASE, +and fully documented in the jail(2) and jail(8) man-pages. +.bp +.SH +Notes & References +.IP \s-2[BIBA]\s+2 .5i +K. J. Biba, Integrity Considerations for Secure +Computer Systems, USAF Electronic Systems Division, 1977 +.IP \s-2[CHROOT]\s+2 .5i +Dr.
Marshall Kirk McKusick, private communication: +``According to the SCCS logs, the chroot call was added by Bill Joy +on March 18, 1982 approximately 1.5 years before 4.2BSD was released. +That was well before we had ftp servers of any sort (ftp did not +show up in the source tree until January 1983). My best guess as +to its purpose was to allow Bill to chroot into the /4.2BSD build +directory and build a system using only the files, include files, +etc contained in that tree. That was the only use of chroot that +I remember from the early days.'' +.IP \s-2[LOTTERY1]\s+2 .5i +David Petrou and John Milford. Proportional-Share Scheduling: +Implementation and Evaluation in a Widely-Deployed Operating System, +December 1997. +.nf +\s-2\fChttp://www.cs.cmu.edu/~dpetrou/papers/freebsd_lottery_writeup98.ps\fP\s+2 +\s-2\fChttp://www.cs.cmu.edu/~dpetrou/code/freebsd_lottery_code.tar.gz\fP\s+2 +.IP \s-2[LOTTERY2]\s+2 .5i +Carl A. Waldspurger and William E. Weihl. Lottery Scheduling: Flexible Proportional-Share Resource Management, Proceedings of the First Symposium on Operating Systems Design and Implementation (OSDI '94), pages 1-11, Monterey, California, November 1994. +.nf +\s-2\fChttp://www.research.digital.com/SRC/personal/caw/papers.html\fP\s+2 +.IP \s-2[POSIX1e]\s+2 .5i +Draft Standard for Information Technology \(em +Portable Operating System Interface (POSIX) \(em +Part 1: System Application Program Interface (API) \(em Amendment: +Protection, Audit and Control Interfaces [C Language] +IEEE Std 1003.1e Draft 17 Editor Casey Schaufler +.IP \s-2[ROOT]\s+2 .5i +Historically other names have been used at times, Zilog for instance +called the super-user account ``zeus''. +.IP \s-2[UAS]\s+2 .5i +One such niche product is the ``UAS'' system to maintain and audit +RACF configurations on MVS systems. +.nf +\s-2\fChttp://www.entactinfo.com/products/uas/\fP\s+2 +.IP \s-2[UF]\s+2 .5i +Quote from the User-Friendly cartoon by Illiad.
+.nf +\s-2\fChttp://www.userfriendly.org/cartoons/archives/98nov/19981111.html\fP\s+2