summaryrefslogtreecommitdiffstats
path: root/sys/kern/uipc_usrreq.c
Commit message (Collapse)AuthorAgeFilesLines
* Remove uidinfo hash table lookup and maintenance out of chgproccnt() andtruckman2000-09-051-2/+2
| | | | | | | | | | | | | | chgsbsize(), which are called rather frequently and may be called from an interrupt context in the case of chgsbsize(). Instead, do the hash table lookup and maintenance when credentials are changed, which is a lot less frequent. Add pointers to the uidinfo structures to the ucred and pcred structures for fast access. Pass a pointer to the credential to chgproccnt() and chgsbsize() instead of passing the uid. Add a reference count to the uidinfo structure and use it to decide when to free the structure rather than freeing the structure when the resource consumption drops to zero. Move the resource tracking code from kern_proc.c to kern_resource.c. Move some duplicate code sequences in kern_prot.c to separate helper functions. Change KASSERTs in this code to unconditional tests and calls to panic().
* Remove any possibility of hiwat-related race conditions by changinggreen2000-08-291-7/+10
| | | | | | | the chgsbsize() call to use a "subject" pointer (&sb.sb_hiwat) and a u_long target to set it to. The whole thing is splnet(). This fixes a problem that jdp has been able to provoke.
* Add snapshots to the fast filesystem. Most of the changes supportmckusick2000-07-111-4/+12
| | | | | | | | | | | | | | | | | | | | the gating of system calls that cause modifications to the underlying filesystem. The gating can be enabled by any filesystem that needs to consistently suspend operations by adding the vop_stdgetwritemount to their set of vnops. Once gating is enabled, the function vfs_write_suspend stops all new write operations to a filesystem, allows any filesystem modifying system calls already in progress to complete, then sync's the filesystem to disk and returns. The function vfs_write_resume allows the suspended write operations to begin again. Gating is not added by default for all filesystems as for SMP systems it adds two extra locks to such critical kernel paths as the write system call. Thus, gating should only be added as needed. Details on the use and current status of snapshots in FFS can be found in /sys/ufs/ffs/README.snapshot so for brevity and timelyness is not included here. Unless and until you create a snapshot file, these changes should have no effect on your system (famous last words).
* Previous commit changing SYSCTL_HANDLER_ARGS violated KNF.phk2000-07-041-1/+1
| | | | Pointed out by: bde
* Style police catches up with rev 1.26 of src/sys/sys/sysctl.h:phk2000-07-031-1/+1
| | | | | | | | Sanitize SYSCTL_HANDLER_ARGS so that simplistic tools can grog our sources: -sysctl_vm_zone SYSCTL_HANDLER_ARGS +sysctl_vm_zone (SYSCTL_HANDLER_ARGS)
* fix races in the uidinfo subsystem, several problems existed:alfred2000-06-221-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | 1) while allocating a uidinfo struct malloc is called with M_WAITOK, it's possible that while asleep another process by the same user could have woken up earlier and inserted an entry into the uid hash table. Having redundant entries causes inconsistancies that we can't handle. fix: do a non-waiting malloc, and if that fails then do a blocking malloc, after waking up check that no one else has inserted an entry for us already. 2) Because many checks for sbsize were done as "test then set" in a non atomic manner it was possible to exceed the limits put up via races. fix: instead of querying the count then setting, we just attempt to set the count and leave it up to the function to return success or failure. 3) The uidinfo code was inlining and repeating, lookups and insertions and deletions needed to be in their own functions for clarity. Reviewed by: green
* Enable SCM_RIGHTS on alpha. Allocate necessary buffer as conversion betweenshin2000-03-091-28/+108
| | | | | | | | | int and struct file *. Approved by: jkh Submitted by: brian Reviewed by: bde, brian, peter
* Introduce NDFREE (and remove VOP_ABORTOP)eivind1999-12-151-1/+3
|
* This is a partial commit of the patch from PR 14914:phk1999-11-161-8/+8
| | | | | | | | | | | | | Alot of the code in sys/kern directly accesses the *Q_HEAD and *Q_ENTRY structures for list operations. This patch makes all list operations in sys/kern use the queue(3) macros, rather than directly accessing the *Q_{HEAD,ENTRY} structures. This batch of changes compile to the same object files. Reviewed by: phk Submitted by: Jake Burkholder <jake@checker.org> PR: 14914
* Trim unused options (or #ifdef for undoc options).peter1999-10-111-1/+0
| | | | Submitted by: phk
* Implement RLIMIT_SBSIZE in the kernel. This is a per-uid sockbuf totalgreen1999-10-091-0/+4
| | | | usage limit.
* Do not follow symlinks when binding a unix domain socket.guido1999-09-291-1/+1
| | | | This fixes the ssh 1.2.27 vulnerability as reported in bugtraq.
* Change so_cred's type to a ucred, not a pcred. THis makes more sense, actually.green1999-09-191-1/+1
| | | | | | Make a sonewconn3() which takes an extra argument (proc) so new sockets created with sonewconn() from a user's system call get the correct credentials, not just the parent's credentials.
* Get rid of some evil defines (a pair of snd and rcv.)green1999-09-171-19/+12
|
* $Id$ -> $FreeBSD$peter1999-08-281-1/+1
|
* Divorce "dev_t" from the "major|minor" bitmap, which is now calledphk1999-05-111-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | udev_t in the kernel but still called dev_t in userland. Provide functions to manipulate both types: major() umajor() minor() uminor() makedev() umakedev() dev2udev() udev2dev() For now they're functions, they will become in-line functions after one of the next two steps in this process. Return major/minor/makedev to macro-hood for userland. Register a name in cdevsw[] for the "filedescriptor" driver. In the kernel the udev_t appears in places where we have the major/minor number combination, (ie: a potential device: we may not have the driver nor the device), like in inodes, vattr, cdevsw registration and so on, whereas the dev_t appears where we carry around a reference to a actual device. In the future the cdevsw and the aliased-from vnode will be hung directly from the dev_t, along with up to two softc pointers for the device driver and a few houskeeping bits. This will essentially replace the current "alias" check code (same buck, bigger bang). A little stunt has been provided to try to catch places where the wrong type is being used (dev_t vs udev_t), if you see something not working, #undef DEVT_FASCIST in kern/kern_conf.c and see if it makes a difference. If it does, please try to track it down (many hands make light work) or at least try to reproduce it as simply as possible, and describe how to do that. Without DEVT_FASCIST I belive this patch is a no-op. Stylistic/posixoid comments about the userland view of the <sys/*.h> files welcome now, from userland they now contain the end result. Next planned step: make all dev_t's refer to the same devsw[] which means convert BLK's to CHR's at the perimeter of the vnodes and other places where they enter the game (bootdev, mknod, sysctl).
* Fix descriptor leak provoked by KKIS.05051999.003b exploit code.truckman1999-05-101-1/+4
| | | | | | | | unp_internalize() takes a reference to the descriptor. If the send fails after unp_internalize(), the control mbuf would be freed ophaning the reference. Tested in -CURRENT by: Pierre Beyssac <beyssac@enst.fr>
* This Implements the mumbled about "Jail" feature.phk1999-04-281-2/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a seriously beefed up chroot kind of thing. The process is jailed along the same lines as a chroot does it, but with additional tough restrictions imposed on what the superuser can do. For all I know, it is safe to hand over the root bit inside a prison to the customer living in that prison, this is what it was developed for in fact: "real virtual servers". Each prison has an ip number associated with it, which all IP communications will be coerced to use and each prison has its own hostname. Needless to say, you need more RAM this way, but the advantage is that each customer can run their own particular version of apache and not stomp on the toes of their neighbors. It generally does what one would expect, but setting up a jail still takes a little knowledge. A few notes: I have no scripts for setting up a jail, don't ask me for them. The IP number should be an alias on one of the interfaces. mount a /proc in each jail, it will make ps more useable. /proc/<pid>/status tells the hostname of the prison for jailed processes. Quotas are only sensible if you have a mountpoint per prison. There are no privisions for stopping resource-hogging. Some "#ifdef INET" and similar may be missing (send patches!) If somebody wants to take it from here and develop it into more of a "virtual machine" they should be most welcome! Tools, comments, patches & documentation most welcome. Have fun... Sponsored by: http://www.rndassociates.com/ Run for almost a year by: http://www.servetheweb.com/
* More consistent with surrounding style. (Hey - it looked great in theeivind1999-04-121-2/+2
| | | | | | diff...) Prodded by: bde
* Staticize.eivind1999-04-111-2/+2
|
* * Change sysctl from using linker_set to construct its tree using SLISTs.dfr1999-02-161-1/+4
| | | | | | | | | | This makes it possible to change the sysctl tree at runtime. * Change KLD to find and register any sysctl nodes contained in the loaded file and to unregister them when the file is unloaded. Reviewed by: Archie Cobbs <archie@whistle.com>, Peter Wemm <peter@netplex.com.au> (well they looked at it anyway)
* The code that reclaims descriptors from in-transit unix domaindillon1999-01-211-1/+1
| | | | | | | descriptor-passing messages was calling sorflush() without checking to see if the descriptor was actually a socket. This can cause a crash by exiting programs that use the mechanism under certain circumstances.
* This is a rather large commit that encompasses the new swapper,dillon1999-01-211-3/+6
| | | | | | | | | | changes to the VM system to support the new swapper, VM bug fixes, several VM optimizations, and some additional revamping of the VM code. The specific bug fixes will be documented with additional forced commits. This commit is somewhat rough in regards to code cleanup issues. Reviewed by: "John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>
* Nitpicking and dusting performed on a train. Removes trivial warningsphk1998-10-251-2/+2
| | | | about unused variables, labels and other lint.
* Cast pointers to uintptr_t/intptr_t instead of to u_long/long,bde1998-07-151-2/+2
| | | | | | | respectively. Most of the longs should probably have been u_longs, but this changes is just to prevent warnings about casts between pointers and integers of different sizes, not to fix poorly chosen types.
* Convert socket structures to be type-stable and add a version number.wollman1998-05-151-23/+138
| | | | | | | | | | | | | | | | | | | Define a parameter which indicates the maximum number of sockets in a system, and use this to size the zone allocators used for sockets and for certain PCBs. Convert PF_LOCAL PCB structures to be type-stable and add a version number. Define an external format for infomation about socket structures and use it in several places. Define a mechanism to get all PF_LOCAL and PF_INET PCB lists through sysctl(3) without blocking network interrupts for an unreasonable length of time. This probably still has some bugs and/or race conditions, but it seems to work well enough on my machines. It is now possible for `netstat' to get almost all of its information via the sysctl(3) interface rather than reading kmem (changes to follow).
* In the words of the submitter:msmith1998-05-071-2/+4
| | | | | | | | | | | | | | | | | | | --------- Make callers of namei() responsible for releasing references or locks instead of having the underlying filesystems do it. This eliminates redundancy in all terminal filesystems and makes it possible for stacked transport layers such as umapfs or nullfs to operate correctly. Quality testing was done with testvn, and lat_fs from the lmbench suite. Some NFS client testing courtesy of Patrik Kudo. vop_mknod and vop_symlink still release the returned vpp. vop_rename still releases 4 vnode arguments before it returns. These remaining cases will be corrected in the next set of patches. --------- Submitted by: Michael Hancock <michaelh@cet.co.jp>
* Seventy-odd "its" / "it's" typos in comments fixed as per kern/6108.des1998-04-171-2/+2
|
* Back out DIAGNOSTIC changes.eivind1998-02-061-3/+1
|
* Turn DIAGNOSTIC into a new-style option.eivind1998-02-041-1/+3
|
* Fixed duplicate definitions of M_FILE (one static).bde1997-11-231-4/+2
|
* Remove a bunch of variables which were unused both in GENERIC and LINT.phk1997-11-071-2/+1
| | | | Found by: -Wunused
* Last major round (Unless Bruce thinks of somthing :-) of malloc changes.phk1997-10-121-1/+3
| | | | | | | | Distribute all but the most fundamental malloc types. This time I also remembered the trick to making things static: Put "static" in front of them. A couple of finer points by: bde
* Various select -> poll changespeter1997-09-141-2/+2
|
* Removed unused #includes.bde1997-09-021-3/+1
|
* Added used #include - don't depend on <sys/mbuf.h> includingbde1997-09-021-1/+2
| | | | <sys/malloc.h> (unless we only use the bogusly shared M*WAIT flags).
* Fix all areas of the system (or at least all those in LINT) to avoid storingwollman1997-08-161-57/+55
| | | | | | | | socket addresses in mbufs. (Socket buffers are the one exception.) A number of kernel APIs needed to get fixed in order to make this happen. Also, fix three protocol families which kept PCBs in mbufs to not malloc them instead. Delete some old compatibility cruft while we're at it, and add some new routines in the in_cksum family.
* The long-awaited mega-massive-network-code- cleanup. Part I.wollman1997-04-271-217/+300
| | | | | | | | | | | | | | | | | | | | | | | | This commit includes the following changes: 1) Old-style (pr_usrreq()) protocols are no longer supported, the compatibility glue for them is deleted, and the kernel will panic on boot if any are compiled in. 2) Certain protocol entry points are modified to take a process structure, so they they can easily tell whether or not it is possible to sleep, and also to access credentials. 3) SS_PRIV is no more, and with it goes the SO_PRIVSTATE setsockopt() call. Protocols should use the process pointer they are now passed. 4) The PF_LOCAL and PF_ROUTE families have been updated to use the new style, as has the `raw' skeleton family. 5) PF_LOCAL sockets now obey the process's umask when creating a socket in the filesystem. As a result, LINT is now broken. I'm hoping that some enterprising hacker with a bit more time will either make the broken bits work (should be easy for netipx) or dike them out.
* Don't #include <sys/fcntl.h> in <sys/file.h> if KERNEL is defined.bde1997-03-231-1/+2
| | | | | Fixed everything that depended on getting fcntl.h stuff from the wrong place. Most things don't depend on file.h stuff at all.
* Add support to sendmsg()/recvmsg() for passing credentials betweenwpaul1997-03-211-3/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | processes using AF_LOCAL sockets. This hack is going to be used with Secure RPC to duplicate a feature of STREAMS which has no real counterpart in sockets (with STREAMS/TLI, you can apparently use t_getinfo() to learn UID of a local process on the other side of a transport endpoint). What happens is this: the client sets up a sendmsg() call with ancillary data using the SCM_CREDS socket-level control message type. It does not need to fill in the structure. When the kernel notices the data, unp_internalize() fills in the cmesgcred structure with the sending process' credentials (UID, EUID, GID, and ancillary groups). This data is later delivered to the receiving process. The receiver can then perform the follwing tests: - Did the client send ancillary data? o Yes, proceed. o No, refuse to authenticate the client. - The the client send data of type SCM_CREDS? o Yes, proceed. o No, refuse to authenticate the client. - Is the cmsgcred structure the right size? o Yes, proceed. o No, signal a possible error. The receiver can now inspect the credential information and use it to authenticate the client.
* Create a new branch of the kernel MIB, kern.ipc, to storewollman1997-02-241-9/+20
| | | | | | | | | | | all of the configurables and instrumentation related to inter-process communication mechanisms. Some variables, like mbuf statistics, are instrumented here for the first time. For mbuf statistics: also keep track of m_copym() and m_pullup() failures, and provide for the user's inspection the compiled-in values of MSIZE, MHLEN, MCLBYTES, and MINCLSIZE.
* Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are notpeter1997-02-221-1/+1
| | | | ready for it yet.
* This is the kernel Lite/2 commit. There are some requisite userlanddyson1997-02-101-7/+7
| | | | | | | | | | | | | | | changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes. The system boots and can mount UFS filesystems. Untested: ext2fs, msdosfs, NFS Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed. Reviewed by: various people Submitted by: Jeffery Hsu <hsu@freebsd.org>
* Make the long-awaited change from $Id$ to $FreeBSD$jkh1997-01-141-1/+1
| | | | | | | | This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long. Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
* Add comments to hard-to-follow File descriptor handling codejulian1996-12-051-1/+68
|
* Move or add #include <queue.h> in preparation for upcoming struct socketdg1996-03-111-1/+2
| | | | changes.
* Merge in Lite2: LIST replacement for f_filef, f_fileb, and filehead.hsu1996-03-111-5/+6
| | | | Reviewed by: davidg & bde
* Another mega commit to staticize things.phk1995-12-141-21/+35
|
* Increase the size of the pipe buffer as denoted by PIPSIZ fromdyson1995-08-311-2/+4
| | | | | | 4k to 8k. This has a significant effect on the pipe performance. In the future it might be good to increase this to 16k. PIPSIZ is now tunable for experimentation.
* Make everything except the unsupported network sources compile cleanlybde1995-08-161-2/+1
| | | | with -Wnested-externs.
OpenPOWER on IntegriCloud