FreeBSD-src - Raptor Engineering's fork of pfsense FreeBSD src with pfSense changes

	Commit message	Author	Age	Files	Lines
*	Don't use almost perfectly pessimal cluster allocation. Allocation	bde	2007-07-10	2	-6/+5
	of the first cluster in a file (and, if the allocation cannot be continued contiguously, for subsequent clusters in a file) was randomized in an attempt to leave space for contiguous allocation of subsequent clusters in each file when there are multiple writers. This reduced internal fragmentation by a few percent, but it increased external fragmentation by up to a few thousand percent.
	Use simple sequential allocation instead. Actually maintain the fsinfo sequence index for this. The read and write of this index from/to disk still have many non-critical bugs, but we now write an index that has something to do with our allocations instead of being modified garbage. If there is no fsinfo on the disk, then we maintain the index internally and don't go near the bugs for writing it.
	Allocating the first free cluster gives a layout that is almost as good (better in some cases), but takes too much CPU if the FAT is large and the first free cluster is not near the beginning.
	The effect of this change for untar and tar of a slightly reduced copy of /usr/src on a new file system was:
	Before (msdosfs 4K-clusters):
		untar: 459.57 real	untar from cached file (actually a pipe)
		tar:   342.50 real	tar from uncached tree to /dev/zero
	Before (ffs2 soft updates 4K-blocks 4K-frags):
		untar: 39.18 real
		tar:   29.94 real
	Before (ffs2 soft updates 16K-blocks 2K-frags):
		untar: 31.35 real
		tar:   18.30 real
	After (msdosfs 4K-clusters):
		untar: 54.83 real
		tar:   16.18 real
	All of these times can be improved further. With multiple concurrent writers or readers (especially readers), the improvement is smaller, but I couldn't find any case where it is negative. 342 seconds for tarring up about 342 MB on a ~47MB/S partition is just hard to unimprove on. (This operation would take about 7.3 seconds with reasonably localized allocation and perfect read-ahead.) However, for active file systems, 342 seconds is closer to normal than the 16+ seconds above or the 11 seconds with other changes (best I've measured -- won easily by msdosfs!). E.g., my active /usr/src on ffs1 is quite old and fragmented, so reading to prepare for the above benchmark takes about 6 times longer than reading back the fresh copies of it.
	Approved by: re (kensmith)
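	A rough userland sketch of the sequential policy this commit message describes, not the msdosfs code itself: the allocator keeps a "next free" hint and scans forward from it, wrapping once. The names fat_sim, alloc_cluster and nxtfree, and the table size, are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NCLUSTERS 16	/* illustrative table size, not a real FAT */

/* Hypothetical stand-in for the in-memory allocation state. */
struct fat_sim {
	bool	inuse[NCLUSTERS];	/* true if the cluster is allocated */
	size_t	nxtfree;		/* hint: where the next search starts */
};

/*
 * Allocate one cluster by scanning forward from the hint and wrapping
 * around once.  Returns the cluster index, or -1 if every cluster is used.
 */
long
alloc_cluster(struct fat_sim *f)
{
	assert(f != NULL);
	for (size_t i = 0; i < NCLUSTERS; i++) {
		size_t cn = (f->nxtfree + i) % NCLUSTERS;

		if (!f->inuse[cn]) {
			f->inuse[cn] = true;
			/* The next allocation continues right after this one. */
			f->nxtfree = (cn + 1) % NCLUSTERS;
			return ((long)cn);
		}
	}
	return (-1);
}
```

	Because the search resumes at the hint rather than at cluster 0, consecutive allocations tend to be contiguous without rescanning the start of a large table each time.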
*	MFp4:	delphij	2007-07-08	4	-67/+83
	- Plug memory leak.
	- Respect underlying vnode's properties rather than assuming that the user wants root:wheel + 0755. Useful for using tmpfs(5) for /tmp.
	- Use roundup2 and howmany macros instead of rolling our own version.
	- Try to fix fsx -W -R foo case.
	- Instead of blindly zeroing a page, determine whether we need a pagein in order to prevent data corruption.
	- Fix several bugs reported by Coverity.
	Submitted by: Mingyan Guo <guomingyan gmail com>, Howard Su, delphij
	Coverity ID: CID 2550, 2551, 2552, 2557
	Approved by: re (tmpfs blanket)
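	For readers unfamiliar with the two macros named in the list above, a small sketch of what they compute. The macro bodies match the usual <sys/param.h> definitions and are repeated only so the example builds in userland; pages_needed, round_to_page and the 4096-byte page size are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Guarded copies of the usual <sys/param.h> definitions. */
#ifndef howmany
#define	howmany(x, y)	(((x) + ((y) - 1)) / (y))
#endif
#ifndef roundup2
#define	roundup2(x, y)	(((x) + ((y) - 1)) & (~((y) - 1)))	/* y: power of 2 */
#endif

#define	SK_PAGE_SIZE	((size_t)4096)	/* assumed page size */

/* Number of whole pages needed to hold nbytes. */
size_t
pages_needed(size_t nbytes)
{
	return (howmany(nbytes, SK_PAGE_SIZE));
}

/* nbytes rounded up to the next page boundary. */
size_t
round_to_page(size_t nbytes)
{
	/* roundup2() is only valid for power-of-two alignments. */
	assert((SK_PAGE_SIZE & (SK_PAGE_SIZE - 1)) == 0);
	return (roundup2(nbytes, SK_PAGE_SIZE));
}
```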
*	Since rev. 1.199 of sys/kern/kern_conf.c, the thread that calls	kib	2007-07-03	2	-0/+19
	destroy_dev() from d_close() cdev method would self-deadlock. devfs_close() bumps the device thread reference counter, and destroy_dev() sleeps, waiting for si_threadcount to reach zero for cdev without d_purge method.
	destroy_dev_sched() could be used instead from d_close(), to schedule execution of destroy_dev() in another context. The destroy_dev_sched_drain() function can be used to drain the scheduled calls to destroy_dev_sched(). Similarly, drain_dev_clone_events() drains the clone events to make sure no lingering devices are left after the dev_clone event handler is deregistered.
	make_dev_credf(MAKEDEV_REF) function should be used from dev_clone event handlers instead of make_dev()/make_dev_cred() to ensure that the created device has its reference counter bumped before the cdev mutex is dropped inside make_dev().
	Reviewed by: tegge (early versions), njl (programming interface)
	Debugging help and testing by: Peter Holm
	Approved by: re (kensmith)
*	MFp4:	delphij	2007-06-29	6	-187/+20
	- Remove unnecessary NULL checks after M_WAITOK allocations.
	- Use VOP_ACCESS instead of hand-rolled suser_cred() calls. [1]
	- Use malloc(9) KPI to allocate memory for string. The optimization taken from NetBSD is not valid for FreeBSD because our malloc(9) already acts that way. [2]
	Requested by: rwatson [1]
	Submitted by: Howard Su [2]
	Approved by: re (tmpfs blanket)
*	Space/style cleanups after last set of commits.	delphij	2007-06-28	7	-72/+71
	Approved by: re (tmpfs blanket)
*	Staticify most of fifo/vn operations, they should not	delphij	2007-06-28	4	-99/+76
	be directly exposed outside. Approved by: re (tmpfs blanket)
*	Use vfs_timestamp instead of nanotime when obtaining	delphij	2007-06-28	1	-4/+3
	a timestamp for use with timekeeping. Approved by: re (tmpfs blanket)
*	Reorder tf_gen and tf_id in struct tmpfs_fid. This	delphij	2007-06-28	1	-2/+2
	saves 8 bytes on amd64 architecture. Obtained from: NetBSD Approved by: re (tmpfs blanket)
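	A sketch of why swapping two members can shrink a structure. The member widths below are assumptions picked to reproduce an 8-byte saving on an LP64 ABI; they are not quoted from the real struct tmpfs_fid, and the names fid_before, fid_after and fid_bytes_saved are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: the 64-bit member sits between narrower ones. */
struct fid_before {
	uint16_t	len;
	uint16_t	pad;
	uint64_t	gen;	/* 8-byte alignment forces padding before it */
	uint32_t	id;	/* followed by tail padding */
};

/* Same members, with the 32-bit one moved into the alignment gap. */
struct fid_after {
	uint16_t	len;
	uint16_t	pad;
	uint32_t	id;
	uint64_t	gen;
};

/* Bytes saved by the reordering on the ABI this is compiled for. */
size_t
fid_bytes_saved(void)
{
	assert(sizeof(struct fid_after) <= sizeof(struct fid_before));
	return (sizeof(struct fid_before) - sizeof(struct fid_after));
}
```

	Where 64-bit integers are 8-byte aligned, the first layout typically occupies 24 bytes and the second 16; on ABIs with 4-byte alignment for 64-bit integers the two may be the same size.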
*	Remove two function prototypes that are no longer used.	delphij	2007-06-26	1	-4/+0
	Approved by: re (tmpfs blanket)
*	- Sync with NetBSD's RCSID (HEAD preferred).	delphij	2007-06-26	2	-2/+2
	- Correct a typo. Approved by: re (tmpfs blanket)
*	MFp4: Several clean-ups and improvements over tmpfs:	delphij	2007-06-25	6	-122/+106
	- Remove tmpfs_zone_xxx KPI, the uma(9) wrapper, since they do not bring any value now.
	- Use |= instead of = when applying VV_ROOT flag.
	- Remove tm_avariable_nodes list. Use uma to hold the released nodes.
	- init/destroy interlock mutex of node when init/fini instead of ctor/dtor.
	- Change memory computing using u_int to fix negative value in 2G mem machine.
	- Remove unnecessary bzero's.
	- Rely on uma logic to make file id allocation harder to guess.
	- Fix some unsigned/signed related things. Make sure we respect -o size=xxxx.
	- Wire a page instead of holding it.
	- Pass allocate_zero to obtain zeroed pages upon first use.
	Submitted by: Howard Su
	Approved by: re (tmpfs blanket, kensmith)
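	The "|= instead of =" item in the list above is easy to illustrate in isolation: plain assignment discards any flag bits already set, while OR-assignment preserves them. The bit values and function names below are illustrative stand-ins, not the kernel's definitions.

```c
#include <assert.h>

#define	SK_VV_ROOT	0x0001u	/* illustrative bit values only */
#define	SK_VV_OTHER	0x0100u

/* Assignment: whatever was set in vflag before is lost. */
unsigned
set_root_assign(unsigned vflag)
{
	(void)vflag;
	return (SK_VV_ROOT);
}

/* OR-assignment: existing bits survive and the root bit is added. */
unsigned
set_root_or(unsigned vflag)
{
	vflag |= SK_VV_ROOT;
	return (vflag);
}
```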
*	- Remove UMAP filesystem. It was disconnected from build three years ago,	rafan	2007-06-25	4	-1441/+0
	and it is seriously broken. Discussed on: freebsd-arch@ Approved by: re (mux)
*	Use vfs_timestamp() instead of nanotime() - make it up to	delphij	2007-06-18	1	-1/+1
	the user to decide how detailed they want timestamps to be.
*	MFp4: fix two locking problems:	delphij	2007-06-18	2	-0/+7
	- Hold TMPFS_LOCK while updating tm_pages_used.
	- Hold vm page while doing uiomove.
	This will hopefully fix all known panics. Submitted by: Howard Su
*	MFp4: Add tmpfs, an efficient memory file system.	delphij	2007-06-16	9	-0/+4070
	Please note that this is currently considered an experimental feature, so there could be some rough edges. Consult http://wiki.freebsd.org/TMPFS for more information.
	For now, connect tmpfs to build on i386 and amd64 architectures only. Please let us know if you have success with other platforms.
	This work was developed by Julio M. Merino Vidal for NetBSD as a SoC project; Rohit Jalan ported it from NetBSD to FreeBSD. Howard Su and Glen Leeder have worked on it to continue this effort.
	Obtained from: NetBSD via p4
	Submitted by: Howard Su (with some minor changes)
	Approved by: re (kensmith)
*	Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in	rwatson	2007-06-12	3	-14/+7
	some cases, move to priv_check() if it was an operation on a thread and no other flags were present. Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c. We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h.
	Reviewed by: csjp
	Obtained from: TrustedBSD Project
*	Correct corrupt read when the read starts at a non-aligned offset.	remko	2007-06-11	1	-4/+6
	PR: kern/77234 MFC After: 1 week Approved by: imp (mentor) Requested by: many many people Submitted by: Andriy Gapon <avg at icyb dot net dot ua>
*	rufetch and calcru sometimes should be called atomically together.	attilio	2007-06-09	1	-1/+2
	This patch fixes places where they should be called atomically, changing their locking requirements (both assume the per-proc spinlock is held) and introducing rufetchcalc, which wraps both calls so they are performed atomically. Reviewed by: jeff Approved by: jeff (mentor)
*	Fix off-by-one error (introduced in r1.60) that had the effect of	bmah	2007-06-07	1	-1/+1
	disallowing a read of exactly MAXPHYS bytes. Reviewed by: des, rdivacky MFC after: 1 week Sponsored by: nCircle Network Security
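	The class of bug described here is a boundary comparison that excludes the limit itself. The sketch below is an assumption about the general shape of such a check, not the code from the commit; SK_MAXPHYS and both function names are hypothetical, and the limit value is chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	SK_MAXPHYS	((size_t)(128 * 1024))	/* assumed limit */

/* Off by one: a request of exactly the limit is rejected. */
bool
read_allowed_buggy(size_t len)
{
	return (len < SK_MAXPHYS);
}

/* Inclusive bound: a request of exactly the limit is accepted. */
bool
read_allowed_fixed(size_t len)
{
	return (len <= SK_MAXPHYS);
}
```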
*	Commit 14/14 of sched_lock decomposition.	jeff	2007-06-05	3	-8/+8
	- Use thread_lock() rather than sched_lock for per-thread scheduling synchronization.
	- Use the per-process spinlock rather than the sched_lock for per-process scheduling synchronization.
	Tested by: kris, current@
	Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
	Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
*	Do proper "locking" for missing vmmeters part.	attilio	2007-06-04	1	-4/+4
	Now, we assume no more sched_lock protection for some of them and use the distributed loads method for vmmeter (distributed through CPUs). Reviewed by: alc, bde Approved by: jeff (mentor)
*	Revert previous, part of NFS that I didn't know about.	trhodes	2007-06-01	1	-0/+20
*	Garbage collect msdosfs_fhtovp; it appears unused and I have been using	trhodes	2007-06-01	1	-20/+0
	MSDOSFS without this function, and without problems, for the last month.
*	Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation	kib	2007-06-01	2	-4/+4
	argument from being file descriptor index into the pointer to struct file: part 2. Convert calls missed in the first big commit. Noted by: rwatson Pointy hat to: kib
*	Revert VMCNT_* operations introduction.	attilio	2007-05-31	1	-4/+4
	Probably, a general approach is not the best solution here, so we should solve the sched_lock protection problems separately. Requested by: alc Approved by: jeff (mentor)
*	Revert UF_OPENING workaround for CURRENT.	kib	2007-05-31	7	-29/+20
	Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation argument from being file descriptor index into the pointer to struct file. Proposed and reviewed by: jhb Reviewed by: daichi (unionfs) Approved by: re (kensmith)
*	Where I previously removed calls to kdb_enter(), now remove include of	rwatson	2007-05-29	3	-3/+0
	kdb.h. Pointed out by: bde
*	Rather than entering the debugger via kdb_enter() when detecting memory	rwatson	2007-05-27	1	-12/+6
	corruption under SMBUFS_NAME_DEBUG, panic() with the same error message.
*	Rather than entering the debugger via kdb_enter() in the event the	rwatson	2007-05-27	1	-5/+2
	root vnode is unexpectedly locked under NULLFS_DEBUG in nullfs and then returning EDEADLK, panic.
*	Since renaming of vop_lock to _vop_lock, pre- and post-condition	kib	2007-05-18	5	-10/+10
	function calls are no longer generated for vop_lock. Rename _vop_lock to vop_lock1 to satisfy tools/vnode_if.awk assumption about vop naming conventions. This restores pre/post-condition calls.
*	- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating	jeff	2007-05-18	1	-4/+4
	vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines. Contributed by: Attilio Rao <attilio@FreeBSD.org>
*	The process lock is held when procfs_ioctl() is called. Assert that this	des	2007-05-01	1	-2/+8
	is so, and PHOLD the process while sleeping since msleep() will release the lock.
*	Fix old locking bugs which were revealed when pseudofs was made MPSAFE.	des	2007-04-23	1	-1/+9
	Submitted by: tegge
*	Rename macdevfsdirent() to macdevfs() to synchronize with SEDarwin,	rwatson	2007-04-23	2	-3/+3
	where similar data structures exist to support devfs and the MAC Framework, but are named differently. Obtained from: TrustedBSD Project Sponsored by: SPARTA, Inc.
*	Add synchronization. Eliminate the acquisition and release of Giant.	alc	2007-04-23	1	-23/+47
	Reviewed by: tegge
*	In some cases, like whenever devfs file times are zero, the fix(aa) will not	trhodes	2007-04-20	1	-1/+1
	be applied to dev entries. This leaves us with file times like "Jan 1 1970." Work around this problem by replacing the tv_sec == 0 check with a <= 3600 check. It's doubtful anyone will be booting within an hour of the Epoch, let alone care about a few seconds worth of nonzero timestamps. It's a hackish workaround, but it does work and I have not experienced any negatives in my testing.
	Discussed with: bde
	"Ok with me": phk
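	The workaround above amounts to widening a threshold test. A minimal userland sketch, assuming the substituted value is the boot time; struct sk_time, fix_devfs_time and SK_EPOCH_SLOP are hypothetical names, not the devfs code.

```c
#include <assert.h>

#define	SK_EPOCH_SLOP	3600	/* seconds after the Epoch still treated as unset */

/* Hypothetical stand-in for a kernel timestamp. */
struct sk_time {
	long long	sec;
	long		nsec;
};

/*
 * Treat any timestamp within the first hour of the Epoch as "never set"
 * and substitute the (assumed) boot time for it.
 */
struct sk_time
fix_devfs_time(struct sk_time ts, struct sk_time boottime)
{
	assert(ts.nsec >= 0);
	if (ts.sec <= SK_EPOCH_SLOP)
		return (boottime);
	return (ts);
}
```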
*	Avoid "unused variable" warning when building without PSEUDOFS_TRACE.	des	2007-04-15	1	-0/+1
*	Make pseudofs (and consequently procfs, linprocfs and linsysfs) MPSAFE.	des	2007-04-15	6	-340/+553
*	Instead of stating GIANT_REQUIRED, just acquire and release Giant where	des	2007-04-15	1	-2/+5
	needed. This does not make a difference now, but will when procfs is marked MPSAFE.
*	Fix the same bug as in procfs_doproc{,db}regs(): check that uio_offset is	des	2007-04-15	1	-1/+3
	0 upon entry, and don't reset it before returning. MFC after: 3 weeks
*	Don't reset uio_offset to 0 before returning. Instead, refuse to service	des	2007-04-15	2	-3/+7
	requests where uio_offset is not 0 to begin with. This fixes a long-standing bug where e.g. 'cat /proc/$$/regs' would loop forever. MFC after: 3 weeks
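	A userland sketch of why resetting the offset makes a reader loop forever, and how refusing non-zero offsets ends the loop. It assumes the refusal is reported as end-of-file (zero bytes), which the commit message does not state; struct sk_uio and regs_read are hypothetical stand-ins for the kernel types and handler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the offset carried in struct uio. */
struct sk_uio {
	size_t	offset;
};

/*
 * Copy the register block to buf.  Returns the number of bytes copied;
 * 0 means end-of-file.  The offset is advanced and never reset, so a
 * second read by the same reader sees EOF instead of the same data again.
 */
size_t
regs_read(struct sk_uio *uio, const void *regs, size_t regsize,
    void *buf, size_t buflen)
{
	size_t n;

	assert(uio != NULL && regs != NULL && buf != NULL);
	if (uio->offset != 0)
		return (0);		/* refuse reads that do not start at 0 */
	n = buflen < regsize ? buflen : regsize;
	memcpy(buf, regs, n);
	uio->offset += n;		/* deliberately not reset to 0 */
	return (n);
}
```

	A tool like cat keeps calling read until it gets zero bytes back, so a handler that rewinds the offset on every call never lets it terminate.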
*	Further pseudofs improvements:	des	2007-04-14	6	-62/+59
	The pfs_info mutex is only needed to lock pi_unrhdr. Everything else in struct pfs_info is modified only while Giant is held (during vfs_init() / vfs_uninit()); add assertions to that effect.
	Simplify pfs_destroy somewhat.
	Remove superfluous arguments from pfs_fileno_{alloc,free}(), and the assertions which were added in the previous commit to ensure they were consistent.
	Assert that Giant is held while the vnode cache is initialized and destroyed. Also assert that the cache is empty when it is destroyed.
	Rename the vnode cache mutex for consistency.
	Fix a long-standing bug in pfs_getattr(): it would uncritically return the node's pn_fileno as st_ino. This would result in st_ino being 0 if the node had not previously been visited by readdir(), and also in an incorrect st_ino for process directories and any files contained therein. Correct this by abstracting the fileno manipulations previously done in pfs_readdir() into a new function, pfs_fileno(), which is used by both pfs_getattr() and pfs_readdir().
*	Add a flag to struct pfs_vdata to mark the vnode as dead (e.g. process-	des	2007-04-11	5	-51/+74
	specific nodes when the process exits).
	Move the vnode-cache-walking loop which was duplicated in pfs_exit() and pfs_disable() into its own function, pfs_purge(), which looks for vnodes marked as dead and / or belonging to the specified pfs_node and reclaims them. Note that this loop is still extremely inefficient.
	Add a comment in pfs_vncache_alloc() explaining why we have to purge the vnode from the vnode cache before returning, in case anyone should be tempted to remove the call to cache_purge().
	Move the special handling for pfstype_root nodes into pfs_fileno_alloc() and pfs_fileno_free() (the root node's fileno must always be 2). This also fixes a bug where pfs_fileno_free() would reclaim the root node's fileno, triggering a panic in the unr code, as that fileno was never allocated from unr to begin with.
	When destroying a pfs_node, release its fileno and purge it from the vnode cache. I wish we could put off the call to pfs_purge() until after the entire tree had been destroyed, but then we'd have vnodes referencing freed pfs nodes. This probably doesn't matter while we're still under Giant, but might become an issue later.
	When destroying a pseudofs instance, destroy the tree before tearing down the fileno allocator.
	In pfs_mount(), acquire the mountpoint interlock when required.
	MFC after: 3 weeks
*	Whitespace nits.	des	2007-04-05	2	-4/+4
*	Replace custom file descriptor array sleep lock constructed using a mutex	rwatson	2007-04-04	5	-11/+15
	and flags with an sxlock. This leads to a significant and measurable performance improvement as a result of access to shared locking for frequent lookup operations, reduced general overhead, and reduced overhead in the event of contention. All of these are important for threaded applications where simultaneous access to a shared file descriptor array occurs frequently. Kris has reported 2x-4x transaction rate improvements on 8-core MySQL benchmarks; smaller improvements can be expected for many workloads as a result of reduced overhead.
	- Generally eliminate the distinction between "fast" and regular acquisition of the filedesc lock; the plan is that they will now all be fast. Change all locking instances to either shared or exclusive locks.
	- Correct a bug (pointed out by kib) in fdfree() where previously msleep() was called without the mutex held; sx_sleep() is now always called with the sxlock held exclusively.
	- Universally hold the struct file lock over changes to struct file, rather than the filedesc lock or no lock. Always update the f_ops field last. A further memory barrier is required here in the future (discussed with jhb).
	- Improve locking and reference management in linux_at(), which fails to properly acquire vnode references before using vnode pointers. Annotate improper use of vn_fullpath(), which will be replaced at a future date.
	In fcntl(), we conservatively acquire an exclusive lock, even though in some cases a shared lock may be sufficient, which should be revisited. The dropping of the filedesc lock in fdgrowtable() is no longer required as the sxlock can be held over the sleep operation; we should consider removing that (pointed out by attilio).
	Tested by: kris
	Discussed with: jhb, kris, attilio, jeff
*	Annotate that this giant acquisition is dependent on tty locking.	kris	2007-03-26	1	-2/+2
*	o cd9660 code repo-copied, update a comment.	maxim	2007-03-24	1	-1/+1
*	Make insmntque() externally visible and allow it to fail (e.g. during	tegge	2007-03-13	16	-11/+132
	late stages of unmount). On failure, the vnode is recycled.
	Add insmntque1(), to allow for file system specific cleanup when recycling vnode on failure.
	Change getnewvnode() to no longer call insmntque(). Previously, embryonic vnodes were put onto the list of vnodes belonging to a file system, which is unsafe for a file system marked MPSAFE.
	Change vfs_hash_insert() to no longer lock the vnode. The caller now has that responsibility.
	Change most file systems to lock the vnode and call insmntque() or insmntque1() after a new vnode has been sufficiently set up. Handle failed insmntque*() calls by propagating errors to callers, possibly after some file system specific cleanup.
	Approved by: re (kensmith)
	Reviewed by: kib
	In collaboration with: kib
*	Add a pn_destroy field to pfs_node. This field points to a destructor	des	2007-03-12	3	-22/+44
	function which is called from pfs_destroy() before the node is reclaimed. Modify pfs_create_{dir,file,link}() to accept a pointer to a destructor function in addition to the usual attr / fill / vis pointers.
	This breaks both the programming and binary interfaces between pseudofs and its consumers. It is believed that there are no pseudofs consumers outside the source tree, so that the impact of this change is minimal.
	Submitted by: Aniruddha Bohra <bohra@cs.rutgers.edu>
*	Change fifo_printinfo to check if the vnode v_fifoinfo pointer	mpp	2007-03-02	1	-0/+4
	is NULL and print a message to that effect to prevent a panic.