FreeBSD-src - Raptor Engineering's fork of pfsense FreeBSD src with pfSense changes

	Commit message (Collapse)	Author	Age	Files	Lines
*	Remove one zero from the double-0.	bz	2010-04-23	1	-2/+2
	This code doesn't have a license to kill.
	MFC after: 3 days
*	On the return path from F_RDAHEAD and F_READAHEAD fcntls, do not	kib	2009-11-20	1	-2/+3
	unlock Giant twice. While there, bring the conditions in the do/while loops closer to style, which also makes the lines fit into 80 columns.
	Reported and tested by: dougb
*	Add two new fcntls to enable/disable read-ahead:	delphij	2009-09-28	1	-0/+44
	- F_READAHEAD: specify the amount for sequential access. The amount is specified in bytes and is rounded up to the nearest block size.
	- F_RDAHEAD: Darwin-compatible version that uses 128KB as the sequential access size.
	A third argument of zero disables the read-ahead behavior.
	Please note that the read-ahead amount is also constrained by the sysctl variable vfs.read_max, which may need to be raised in order to better utilize this feature.
	Thanks to Igor Sysoev for proposing the feature and submitting the original version, and to kib@ for his valuable comments.
	Submitted by: Igor Sysoev <is rambler-co ru>
	Reviewed by: kib@
	MFC after: 1 month
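The rounding described above can be sketched as plain arithmetic. This is only an illustration, not the kernel code: the function names are hypothetical, and treating vfs.read_max as a simple cap counted in blocks is an assumption, since the log only says the amount is "also constrained" by it.

```python
DARWIN_RDAHEAD_BYTES = 128 * 1024  # size the log gives for F_RDAHEAD


def readahead_blocks(amount_bytes, block_size):
    """Round a byte count up to whole blocks; zero disables read-ahead."""
    if amount_bytes < 0 or block_size <= 0:
        raise ValueError("amount must be >= 0 and block size must be > 0")
    return (amount_bytes + block_size - 1) // block_size


def rdahead_blocks(arg, block_size):
    """F_RDAHEAD-style switch: any non-zero argument selects 128KB."""
    return readahead_blocks(DARWIN_RDAHEAD_BYTES if arg else 0, block_size)


def capped_blocks(blocks, read_max_blocks):
    """Assumption: model vfs.read_max as an upper bound in blocks."""
    return min(blocks, read_max_blocks)
```

With a 16KB block size, a request for 16385 bytes rounds up to two blocks, and the F_RDAHEAD variant yields eight blocks.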
*	Replace AUDIT_ARG() with variable argument macros with a set of more	rwatson	2009-06-27	1	-3/+3
	specific macros for each audit argument type. This makes it easier to follow call-graphs, especially for automated analysis tools (such as fxr).
	In MFC, we should leave the existing AUDIT_ARG() macros as they may be used by third-party kernel modules.
	Suggested by: brooks
	Approved by: re (kib)
	Obtained from: TrustedBSD Project
	MFC after: 1 week
*	- Similar to the previous commit, but for CURRENT: Fix a bug where a FIFO vnode	lulf	2009-06-24	1	-1/+0
	use count was increased twice, but only decreased once.
*	- Fix a bug where a FIFO vnode use count was increased twice, but only	lulf	2009-06-24	1	-1/+0
	decreased once.
	MFC after: 1 week
*	Add a new 'void closefrom(int lowfd)' system call. When called, it closes	jhb	2009-06-15	1	-0/+36
	any open file descriptors >= 'lowfd'. It is largely identical to the same function on other operating systems such as Solaris, DFly, NetBSD, and OpenBSD.
	One difference from other *BSD is that this closefrom() does not fail with any errors. In practice, while the manpages for NetBSD and OpenBSD claim that they return EINTR, they ignore internal errors from close() and never return EINTR. DFly does return EINTR, but for the common use case (closing fd's prior to execve()), the caller really wants all fd's closed and returning EINTR just forces callers to call closefrom() in a loop until it stops failing.
	Note that this implementation of closefrom(2) does not make any effort to resolve userland races with open(2) in other threads. As such, it is not multithread safe.
	Submitted by: rwatson (initial version)
	Reviewed by: rwatson
	MFC after: 2 weeks
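The semantics described above (close everything at or above 'lowfd', never report an error) can be approximated in userland. The sketch below is not the system call itself: it assumes that /dev/fd lists the calling process's open descriptors, which holds on Linux but on FreeBSD depends on fdescfs being mounted, and like the kernel version it makes no attempt to handle races with other threads.

```python
import os


def closefrom(lowfd):
    """Close every open descriptor >= lowfd, ignoring close() errors."""
    try:
        # Assumption: /dev/fd enumerates this process's descriptors.
        names = os.listdir("/dev/fd")
    except OSError:
        return
    for name in names:
        if not name.isdigit():
            continue
        fd = int(name)
        if fd >= lowfd:
            try:
                os.close(fd)
            except OSError:
                # Mirror the "does not fail" behaviour: errors are dropped.
                pass
```

A typical caller would invoke `closefrom(3)` in a child process just before exec, leaving only stdin, stdout and stderr open.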
*	- Use an acquire barrier to increment f_count in fget_unlocked and	jeff	2009-06-02	1	-2/+6
	remove the volatile cast. Describe the reason in detail in a comment.
	Discussed with: bde, jhb
*	Add hierarchical jails. A jail may further virtualize its environment	jamie	2009-05-27	1	-6/+30
	by creating a child jail, which is visible to that jail and to any parent jails. Child jails may be restricted more than their parents, but never less. Jail names reflect this hierarchy, being MIB-style dot-separated strings.
	Every thread now points to a jail, the default being prison0, which contains information about the physical system. Prison0's root directory is the same as rootvnode; its hostname is the same as the global hostname, and its securelevel replaces the global securelevel. Note that the variable "securelevel" has actually gone away, which should not cause any problems for code that properly uses securelevel_gt() and securelevel_ge().
	Some jail-related permissions that were kept in global variables and set via sysctls are now per-jail settings. The sysctls still exist for backward compatibility, used only by the now-deprecated jail(2) system call.
	Approved by: bz (mentor)
*	Set the umask in a new file descriptor table earlier in fdcopy() to remove	jhb	2009-05-20	1	-4/+2
	two lock operations.
*	Revert r192094. The revision caused problems for sysctl(3) consumers	kib	2009-05-15	1	-2/+15
	that expect oldlen to be filled with the required buffer length even when the supplied buffer is too short and the returned error is ENOMEM.
	Redo the fix for kern.proc.filedesc by reverting req->oldidx when the remaining buffer space is too short for the current kinfo_file structure. Also, only ignore ENOMEM. We have to convert ENOMEM to a no-error condition to keep the existing interface for the sysctl, though.
	Reported by: ed, Florian Smeets <flo kasimir com>
	Tested by: pho
*	- Implement a lockless file descriptor lookup algorithm in	jeff	2009-05-14	1	-33/+88
	fget_unlocked().
	- Save old file descriptor tables created on expansion until the entire descriptor table is freed so that pointers may be followed without regard for expanders.
	- Mark the file zone as NOFREE so we may attempt to reference potentially freed files.
	- Convert several fget_locked() users to fget_unlocked(). This requires us to manage reference counts explicitly but reduces locking overhead in the common case.
*	Update comment above _fget() for earlier change to FWRITE failures return	jhb	2009-04-15	1	-4/+2
	EBADF rather than EINVAL.
	Submitted by: Jaakko Heinonen jh saunalahti fi
	MFC after: 1 month
*	Remove the printf's when the vnode to be exported for procstat is not a VDIR.	marcus	2009-02-14	1	-4/+0
	If the file system backing a process' cwd is removed, and procstat -f PID is called, then these messages would have been printed. The extra verbosity is not required in this situation.
	Requested by: kib
	Approved by: kib
*	Change two KASSERTS to printfs and simple returns. Stress testing has	marcus	2009-02-14	1	-2/+12
	revealed that a process' current working directory can be VBAD if the directory is removed. This can trigger a panic when procstat -f PID is run.
	Tested by: pho
	Discovered by: phobot
	Reviewed by: kib
	Approved by: kib
*	Modify fdcopy() so that, during fork(2), it won't copy file descriptors	rwatson	2009-02-11	1	-1/+2
	from the parent to the child process if they have an operation vector of &badfileops. This narrows a set of races involving system calls that allocate a new file descriptor, potentially block for some extended period, and then return the file descriptor, when invoked by a threaded program that concurrently invokes fork(2). Similar approaches are used in both Solaris and Linux, and the wideness of this race was introduced in FreeBSD when we moved to a more optimistic implementation of accept(2) in order to simplify locking.
	A small race necessarily remains because the fork(2) might occur after the finit() in accept(2) but before the system call has returned, but that appears unavoidable using current APIs. However, this race is vastly narrower.
	The fix can be validated using the newfileops_on_fork regression test.
	PR: kern/130348
	Reported by: Ivan Shcheklein <shcheklein at gmail dot com>
	Reviewed by: jhb, kib
	MFC after: 1 week
*	Clear the pointers to the file in the struct filedesc before file is closed	kib	2008-12-30	1	-6/+8
	in fdfree. Otherwise, sysctl_kern_proc_filedesc may dereference stale struct file * values.
	Reported and tested by: pho
	MFC after: 1 month
*	Prune some whining.	peter	2008-12-02	1	-10/+0
*	Duplicate another few hundred lines of code in order to be compatible	peter	2008-12-01	1	-0/+2
	with unreleased binaries.
*	Properly wrap this giant block of duplicate code inside COMPAT_FREEBSD7	peter	2008-11-30	1	-2/+2
*	Implement copyout packing more along the lines of what I had in mind.	peter	2008-11-30	1	-4/+268
	Create a temporary duplicate implementation of old filedesc struct for pre-7.1 libgtop package.
	Todo: specific fd or addr request
*	WIP kinfo_file/kinfo_vmmentry tweaks. The idea:	peter	2008-11-29	1	-0/+4
	1) Get the 32- and 64-bit versions in sync so that no shims are needed. Valgrind in particular exercises this.
	2) Reduce the size of the copyout. On large processes this turns out to be a huge problem. Valgrind also suffers from this since it needs to do this in a context that can't malloc. I want to pack the records.
	3) Add new types: 'tell me about fd N' and 'tell me about addr N'.
*	Remove unnecessary locking around vn_fullpath(). The vnode lock for the	jhb	2008-11-04	1	-6/+4
	vnode in question does not need to be held. All the data structures used during the name lookup are protected by the global name cache lock. Instead, the caller merely needs to ensure a reference is held on the vnode (such as vhold()) to keep it from being freed.
	In the case of procfs' <pid>/file entry, grab the process lock while we gain a new reference (via vhold()) on p_textvp to fully close races with execve(2).
	For the kern.proc.vmmap sysctl handler, use a shared vnode lock around the call to VOP_GETATTR() rather than an exclusive lock.
	MFC after: 1 month
*	Use shared vnode locks instead of exclusive vnode locks for the access(),	jhb	2008-11-03	1	-1/+1
	chdir(), chroot(), eaccess(), fpathconf(), fstat(), fstatfs(), lseek() (when figuring out the current size of the file in the SEEK_END case), pathconf(), readlink(), and statfs() system calls.
	Submitted by: ups (mostly)
	Tested by: pho
	MFC after: 1 month
*	Fix a number of style issues in the MALLOC / FREE commit. I've tried to	des	2008-10-23	1	-1/+1
	be careful not to fix anything that was already broken; the NFSv4 code is particularly bad in this respect.
*	Retire the MALLOC and FREE macros. They are an abomination unto style(9).	des	2008-10-23	1	-12/+11
	MFC after: 3 months
*	Downgrade XXX to a Note for fgetsock() and fputsock().	rwatson	2008-10-12	1	-2/+2
	MFC after: 3 days
*	Integrate the new MPSAFE TTY layer to the FreeBSD operating system.	ed	2008-08-20	1	-0/+12
	The last half year I've been working on a replacement TTY layer for the FreeBSD kernel. The new TTY layer was designed to improve the following:
	- Improved driver model: The old TTY layer has a driver model that is not abstract enough to make it friendly to use. A good example is the output path, where the device drivers directly access the output buffers. This means that an in-kernel PPP implementation must always convert network buffers into TTY buffers. If a PPP implementation would be built on top of the new TTY layer (still needs a hooks layer, though), it would allow the PPP implementation to directly hand the data to the TTY driver.
	- Improved hotplugging: With the old TTY layer, it isn't entirely safe to destroy TTY's from the system. This implementation has a two-step destructing design, where the driver first abandons the TTY. After all threads have left the TTY, the TTY layer calls a routine in the driver, which can be used to free resources (unit numbers, etc). The pts(4) driver also implements this feature, which means posix_openpt() will now return PTY's that are created on the fly.
	- Improved performance: One of the major improvements is the per-TTY mutex, which is expected to improve scalability when compared to the old Giant locking. Another change is the unbuffered copying to userspace, which is both used on TTY device nodes and PTY masters.
	Upgrading should be quite straightforward. Unlike previous versions, existing kernel configuration files do not need to be changed, except when they reference device drivers that are listed in UPDATING.
	Obtained from: //depot/projects/mpsafetty/...
	Approved by: philip (ex-mentor)
	Discussed: on the lists, at BSDCan, at the DevSummit
	Sponsored by: Snow B.V., the Netherlands
	dcons(4) fixed by: kan
*	Remove unneeded D_NEEDGIANT from /dev/fd/{0,1,2}.	ed	2008-08-09	1	-1/+0
	There is no reason the fdopen() routine needs Giant. It only sets curthread->td_dupfd, based on the device unit number of the cdev.
	I guess we won't get massive performance improvements here, but still, I assume we eventually want to get rid of Giant.
*	Rework the lifetime management of the kernel implementation of POSIX	jhb	2008-06-27	1	-0/+6
	semaphores. Specifically, semaphores are now represented as a new file descriptor type that is set to close on exec. This removes the need for all of the manual process reference counting (and fork, exec, and exit event handlers) as the normal file descriptor operations handle all of that for us nicely. It is also suggested as one possible implementation in the spec and at least one other OS (OS X) uses this approach.
	Some bugs that were fixed as a result include:
	- References to a named semaphore whose name is removed still work after the sem_unlink() operation. Prior to this patch, if a semaphore's name was removed, valid handles from sem_open() would get EINVAL errors from sem_getvalue(), sem_post(), etc. This fixes that.
	- Unnamed semaphores created with sem_init() were not cleaned up when a process exited or exec'd. They were only cleaned up if the process did an explicit sem_destroy(). This could result in a leak of semaphore objects that could never be cleaned up.
	- On the other hand, if another process guessed the id (kernel pointer to 'struct ksem') of an unnamed semaphore (created via sem_init()) and had write access to the semaphore based on UID/GID checks, then that other process could manipulate the semaphore via sem_destroy(), sem_post(), sem_wait(), etc.
	- As part of the permission check (UID/GID), the umask of the process creating the semaphore was not honored. Thus if your umask denied group read/write access but the explicit mode in the sem_init() call allowed it, the semaphore would be readable/writable by other users in the same group, for example. This includes access via the previous bug.
	- If the module refused to unload because there were active semaphores, then it might have deregistered one or more of the semaphore system calls before it noticed that there was a problem. I'm not sure if this actually happened as the order that modules are discovered by the kernel linker depends on how the actual .ko file is linked. One can make the order deterministic by using a single module with a mod_event handler that explicitly registers syscalls (and deregisters during unload after any checks). This also fixes a race where even if the sem_module unloaded first it would have destroyed locks that the syscalls might be trying to access if they are still executing when they are unloaded. XXX: By the way, deregistering system calls doesn't do any blocking to drain any threads from the calls.
	- Some minor fixes to errno values on error. For example, sem_init() isn't documented to return ENFILE or EMFILE if we run out of semaphores the way that sem_open() can. Instead, it should return ENOSPC in that case.
	Other changes:
	- Kernel semaphores now use a hash table to manage the namespace of named semaphores in a fashion similar to the POSIX shared memory object file descriptors. Kernel semaphores can now also have names longer than 14 chars (up to MAXPATHLEN) and can include subdirectories in their pathname.
	- The UID/GID permission checks for access to a named semaphore are now done via vaccess() rather than a home-rolled set of checks.
	- Now that kernel semaphores have an associated file object, the various MAC checks for POSIX semaphores accept both a file credential and an active credential. There is also a new posixsem_check_stat() since it is possible to fstat() a semaphore file descriptor.
	- A small set of regression tests (using the ksem API directly) is present in src/tools/regression/posixsem.
	Reported by: kris (1)
	Tested by: kris
	Reviewed by: rwatson (lightly)
	MFC after: 1 month
*	Remove redundant checks from fcntl()'s F_DUPFD.	ed	2008-05-28	1	-31/+16
	Right now we perform some of the checks inside the fcntl()'s F_DUPFD operation twice. We first validate the `fd' argument. When finished, we validate the `arg' argument. These checks are also performed inside do_dup().
	The reason we need to do this is that fcntl() should return different errnos when the `arg' argument is out of bounds (EINVAL instead of EBADF). To prevent the redundant locking of the PROC_LOCK and FILEDESC_SLOCK, patch do_dup() to support the error semantics required by fcntl().
	Approved by: philip (mentor)
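The two error cases the message distinguishes can be observed from userland. The following is a small probe, not part of the commit; the helper names are hypothetical, and it assumes the platform follows the POSIX rule that an out-of-range `arg' yields EINVAL while a bad `fd' yields EBADF.

```python
import errno
import fcntl
import os
import resource


def dupfd_error(fd, minfd):
    """Return the errno name F_DUPFD fails with, or None on success."""
    try:
        newfd = fcntl.fcntl(fd, fcntl.F_DUPFD, minfd)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(newfd)
    return None


def out_of_range_minfd():
    """Pick an `arg' at or beyond the descriptor limit that fits a C int."""
    soft = resource.getrlimit(resource.RLIMIT_NOFILE)[0]
    return soft if 0 <= soft < 2**31 else 2**31 - 1
```

Calling `dupfd_error(fd, out_of_range_minfd())` on an open descriptor exercises the `arg' check, while calling `dupfd_error()` on a descriptor that has already been closed exercises the `fd' check.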
*	Replace direct atomic operation for the file refcount with the	attilio	2008-05-25	1	-2/+2
	refcount interface. It also introduces the correct usage of memory barriers, as sometimes fdrop() and fhold() are used with shared locks, which don't use any release barrier.
*	Implement the per-open file data for the cdev.	kib	2008-05-21	1	-0/+6
	The patch does not change the cdevsw KBI. Management of the data is provided by the functions
	int devfs_set_cdevpriv(void *priv, cdevpriv_dtr_t dtr);
	int devfs_get_cdevpriv(void **datap);
	void devfs_clear_cdevpriv(void);
	All of the functions are supposed to be called from the cdevsw method contexts.
	- devfs_set_cdevpriv assigns the priv as private data for the file descriptor which is used to initiate the currently performed driver operation. dtr is the function that will be called when either the last reference to the file goes away, the device is destroyed or devfs_clear_cdevpriv is called.
	- devfs_get_cdevpriv is the obvious accessor.
	- devfs_clear_cdevpriv allows clearing the private data for the still open file.
	Implementation keeps the driver-supplied pointers in the struct cdev_privdata, that is referenced both from the struct file and struct cdev, and cannot outlive any of the referee.
	Man pages will be provided after the KPI stabilizes.
	Reviewed by: jhb
	Useful suggestions from: jeff, antoine
	Debugging help and tested by: pho
	MFC after: 1 month
*	* Correct a mis-merge that leaked the PROC_LOCK [1]	kris	2008-04-26	1	-2/+2
	* Return ENOENT on error instead of 0 [2]
	Submitted by: rdivacky [1], kib [2]
*	fdhold can return NULL, so add the one remaining missing check for this	kris	2008-04-24	1	-0/+2
	condition.
	Reviewed by: attilio
	MFC after: 1 week
*	Add the new kernel-mode NFS Lock Manager. To use it instead of the	dfr	2008-03-26	1	-7/+66
	user-mode lock manager, build a kernel with the NFSLOCKD option and add '-k' to 'rpc_lockd_flags' in rc.conf.
	Highlights include:
	* Thread-safe kernel RPC client - many threads can use the same RPC client handle safely with replies being de-multiplexed at the socket upcall (typically driven directly by the NIC interrupt) and handed off to whichever thread matches the reply. For UDP sockets, many RPC clients can share the same socket. This allows the use of a single privileged UDP port number to talk to an arbitrary number of remote hosts.
	* Single-threaded kernel RPC server. Adding support for multi-threaded server would be relatively straightforward and would follow approximately the Solaris KPI. A single thread should be sufficient for the NLM since it should rarely block in normal operation.
	* Kernel mode NLM server supporting cancel requests and granted callbacks. I've tested the NLM server reasonably extensively - it passes both my own tests and the NFS Connectathon locking tests running on Solaris, Mac OS X and Ubuntu Linux.
	* Userland NLM client supported. While the NLM server doesn't have support for the local NFS client's locking needs, it does have to field async replies and granted callbacks from remote NLMs that the local client has contacted. We relay these replies to the userland rpc.lockd over a local domain RPC socket.
	* Robust deadlock detection for the local lock manager. In particular it will detect deadlocks caused by a lock request that covers more than one blocking request. As required by the NLM protocol, all deadlock detection happens synchronously - a user is guaranteed that if a lock request isn't rejected immediately, the lock will eventually be granted. The old system allowed for a 'deferred deadlock' condition where a blocked lock request could wake up and find that some other deadlock-causing lock owner had beaten them to the lock.
	* Since both local and remote locks are managed by the same kernel locking code, local and remote processes can safely use file locks for mutual exclusion. Local processes have no fairness advantage compared to remote processes when contending to lock a region that has just been unlocked - the local lock manager enforces a strict first-come first-served model for both local and remote lockers.
	Sponsored by: Isilon Systems
	PR: 95247 107555 115524 116679
	MFC after: 2 weeks
*	Revert previous change - it appears that the limit I was hitting was a	sobomax	2008-03-19	1	-36/+3
	maxsockets limit, not maxfiles limit. The question remains why those limits are handled differently (with an error code for maxfiles but with a sleep for maxsockets), but that would be addressed in a separate commit if necessary.
	Requested by: rwatson, jeff
*	In keeping with style(9)'s recommendations on macros, use a ';'	rwatson	2008-03-16	1	-2/+2
	after each SYSINIT() macro invocation. This makes a number of lightweight C parsers much happier with the FreeBSD kernel source, including cflow's prcc and lxr.
	MFC after: 1 month
	Discussed with: imp, rink
*	Properly set size of the file_zone to match kern.maxfiles parameter.	sobomax	2008-03-16	1	-3/+36
	Otherwise the parameter is a no-op, since the zone by default limits the number of descriptors to some 12K entries. An attempt to allocate more ends up sleeping on zonelimit.
	MFC after: 2 weeks
*	Introduce a new F_DUP2FD command to fcntl(2), for compatibility with	antoine	2008-03-08	1	-1/+6
	Solaris and AIX. fcntl(fd, F_DUP2FD, arg) and dup2(fd, arg) are functionally equivalent.
	Document it. Add some regression tests (identical to the dup2(2) regression tests).
	PR: 120233
	Submitted by: Jukka Ukkonen
	Approved by: rwatson (mentor)
	MFC after: 1 month
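The stated equivalence can be illustrated from userland. This is a minimal sketch with a hypothetical helper name: it uses the F_DUP2FD constant when the running Python's fcntl module exposes it (recent versions do on FreeBSD) and otherwise falls back to dup2(), which the message describes as functionally equivalent.

```python
import fcntl
import os


def dup_onto(fd, target):
    """Duplicate fd onto target, preferring F_DUP2FD where it exists."""
    cmd = getattr(fcntl, "F_DUP2FD", None)
    if cmd is not None:
        # fcntl(fd, F_DUP2FD, target) returns the new descriptor number.
        return fcntl.fcntl(fd, cmd, target)
    # Fallback for platforms whose fcntl module lacks the constant.
    return os.dup2(fd, target)
```

Either path leaves `target` referring to the same open file as `fd`, closing whatever `target` previously referred to.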
*	This patch adds a new ktrace(2) record type, KTR_STRUCT, whose payload	des	2008-02-23	1	-0/+8
	consists of the null-terminated name and the contents of any structure you wish to record. A new ktrstruct() function constructs and emits a KTR_STRUCT record. It is accompanied by convenience macros for struct stat and struct sockaddr.
	In kdump(1), KTR_STRUCT records are handled by a dispatcher function that runs stringent sanity checks on its contents before handing it over to individual decoding functions for each type of structure. Currently supported structures are struct stat and struct sockaddr for the AF_INET, AF_INET6 and AF_UNIX families; support for AF_APPLETALK and AF_IPX is present but disabled, as I am unable to test it properly.
	Since 's' was already taken, the letter 't' is used by ktrace(1) to enable KTR_STRUCT trace points, and in kdump(1) to enable their decoding.
	Derived from patches by Andrew Li <andrew2.li@citi.com>.
	PR: kern/117836
	MFC after: 3 weeks
*	Fix sendfile(2) write-only file permission bypass.	simon	2008-02-14	1	-1/+1
	Security: FreeBSD-SA-08:03.sendfile
	Submitted by: kib
*	Add support for displaying a process' current working directory, root	marcus	2008-02-09	1	-0/+50
	directory, and jail directory within procstat. While this functionality is available already in fstat, encapsulating it in the kern.proc.filedesc sysctl makes it accessible without using kvm and thus without needing elevated permissions.
	The new procstat output looks like:
	PID   COMM FD   T V FLAGS    REF OFFSET PRO NAME
	76792 tcsh cwd  v d -------- -   -      -   /usr/src
	76792 tcsh root v d -------- -   -      -   /
	76792 tcsh 15   v c rw------ 16  9130   -   -
	76792 tcsh 16   v c rw------ 16  9130   -   -
	76792 tcsh 17   v c rw------ 16  9130   -   -
	76792 tcsh 18   v c rw------ 16  9130   -   -
	76792 tcsh 19   v c rw------ 16  9130   -   -
	I am also bumping __FreeBSD_version for this as this new feature will be used in at least one port.
	Reviewed by: rwatson
	Approved by: rwatson
*	Export a type for POSIX SHM file descriptors via kern.proc.filedesc as	rwatson	2008-01-20	1	-0/+4
	used by procstat, or SHM descriptors will show up as type unknown in userspace.
*	VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in	attilio	2008-01-13	1	-1/+1
	conjunction with a 'thread' argument that is always curthread. Remove the useless extra argument and pass curthread explicitly to lower-layer functions when necessary.
	The KPI is broken by this change, which should affect several ports, so a version bump and manpage update will be committed separately.
	Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
*	vn_lock() is currently only used with the 'curthread' passed as argument.	attilio	2008-01-10	1	-2/+2
	Remove this argument and pass curthread directly to the underlying VOP_LOCK1() VFS method. This modification makes the code cleaner and in particular removes an annoying dependency, helping the next lockmgr() cleanup.
	The KPI, obviously, changes. The manpage and FreeBSD_version will be updated in further commits.
	As a side note, it is worth saying that upcoming commits will address a similar cleanup of VFS methods, in particular vop_lock1 and vop_unlock.
	Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>
*	Add a new file descriptor type for IPC shared memory objects and use it to	jhb	2008-01-08	1	-0/+2
	implement shm_open(2) and shm_unlink(2) in the kernel:
	- Each shared memory file descriptor is associated with a swap-backed vm object which provides the backing store. Each descriptor starts off with a size of zero, but the size can be altered via ftruncate(2). The shared memory file descriptors also support fstat(2). read(2), write(2), ioctl(2), select(2), poll(2), and kevent(2) are not supported on shared memory file descriptors.
	- shm_open(2) and shm_unlink(2) are now implemented as system calls that manage shared memory file descriptors. The virtual namespace that maps pathnames to shared memory file descriptors is implemented as a hash table where the hash key is generated via the 32-bit Fowler/Noll/Vo hash of the pathname.
	- As an extension, the constant 'SHM_ANON' may be specified in place of the path argument to shm_open(2). In this case, an unnamed shared memory file descriptor will be created similar to the IPC_PRIVATE key for shmget(2). Note that the shared memory object can still be shared among processes by sharing the file descriptor via fork(2) or sendmsg(2), but it is unnamed. This effectively serves to implement the getmemfd() idea bandied about the lists several times over the years.
	- The backing store for shared memory file descriptors is garbage collected when it is not referenced by any open file descriptors or the shm_open(2) virtual namespace.
	Submitted by: dillon, peter (previous versions)
	Submitted by: rwatson (I based this on his version)
	Reviewed by: alc (suggested converting getmemfd() to shm_open())
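The pathname hashing mentioned above can be sketched as follows. The message only says "32-bit Fowler/Noll/Vo hash"; using the FNV-1 variant (multiply, then XOR) with the standard offset basis and prime is an assumption here, and the bucket count and the `bucket_for` helper are purely hypothetical.

```python
FNV1_32_INIT = 0x811C9DC5   # standard 32-bit FNV offset basis
FNV_32_PRIME = 0x01000193   # standard 32-bit FNV prime


def fnv1_32(data):
    """32-bit FNV-1 hash of a bytes object."""
    h = FNV1_32_INIT
    for byte in data:
        h = (h * FNV_32_PRIME) & 0xFFFFFFFF  # multiply, keep 32 bits
        h ^= byte                            # then fold in the next byte
    return h


def bucket_for(path, nbuckets=64):
    """Hypothetical: map a pathname to a hash-table bucket index."""
    return fnv1_32(path.encode()) % nbuckets
```

For example, `fnv1_32(b"a")` evaluates to 0x050c5d7e, and an empty input returns the offset basis unchanged.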
*	Make ftruncate a 'struct file' operation rather than a vnode operation.	jhb	2008-01-07	1	-0/+8
	This makes it possible to support ftruncate() on non-vnode file types in the future.
	- 'struct fileops' grows a 'fo_truncate' method to handle an ftruncate() on a given file descriptor.
	- ftruncate() moves to kern/sys_generic.c and now just fetches a file object and invokes fo_truncate().
	- The vnode-specific portions of ftruncate() move to vn_truncate() in vfs_vnops.c which implements fo_truncate() for vnode file types.
	- Non-vnode file types return EINVAL in their fo_truncate() method.
	Submitted by: rwatson
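The dispatch structure described above can be modelled in a few lines. This is a toy model rather than kernel code: files are plain dictionaries, and every name except fo_truncate and EINVAL is hypothetical.

```python
import errno


def vnode_truncate(fp, length):
    """Stand-in for the vnode-specific truncate path."""
    fp["size"] = length
    return 0


def unsupported_truncate(fp, length):
    """Non-vnode file types report EINVAL, as the message states."""
    return errno.EINVAL


# Hypothetical per-type operation vectors, keyed by file type.
FILEOPS = {
    "vnode": {"fo_truncate": vnode_truncate},
    "socket": {"fo_truncate": unsupported_truncate},
    "pipe": {"fo_truncate": unsupported_truncate},
}


def ftruncate(fp, length):
    """Generic entry point: look up the file's ops and dispatch."""
    return FILEOPS[fp["type"]]["fo_truncate"](fp, length)
```

The generic `ftruncate()` knows nothing about file types; supporting truncation on a new type only requires supplying a different `fo_truncate` entry.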
*	- In sysctl_kern_file skip fdps with negative lastfiles. This can	jeff	2008-01-03	1	-1/+2
	happen if there are no files open. Accounting for these can eventually return a negative value for olenp causing sysctl to crash with a bad malloc.
	Reported by: Pawel Worach <pawel.worach@gmail.com>
*	Remove explicit locking of struct file.	jeff	2007-12-30	1	-105/+67
	- Introduce a finit() which is used to initialize the fields of struct file in such a way that the ops vector is only valid after the data, type, and flags are valid.
	- Protect f_flag and f_count with atomic operations.
	- Remove the global list of all files and associated accounting.
	- Rewrite the unp garbage collection such that it no longer requires the global list of all files and instead uses a list of all unp sockets.
	- Mark sockets in the accept queue so we don't incorrectly gc them.
	Tested by: kris, pho