path: root/sys/ia64
* Make MSGBUF_SIZE kernel option a loader tunable, kern.msgbufsize. (pluknet, 2011-01-21, 1 file, -3/+2)
  Submitted by: perryh pluto.rain.com (previous version)
  Reviewed by: jhb
  Approved by: kib (mentor)
  Tested by: universe
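  The usual pattern for such a tunable is a compile-time default that
  loader.conf may override at boot. A minimal sketch (the init function
  and the fallback value are illustrative, not the committed code):

      #include <sys/param.h>
      #include <sys/systm.h>
      #include <sys/kernel.h>

      #ifndef MSGBUF_SIZE
      #define MSGBUF_SIZE     (32768 * 3)     /* illustrative fallback */
      #endif

      static int msgbufsize = MSGBUF_SIZE;    /* kernel option default */

      static void
      msgbufsize_init(void *arg __unused)
      {
              /* Let a loader.conf setting override the kernel option. */
              TUNABLE_INT_FETCH("kern.msgbufsize", &msgbufsize);
      }
      SYSINIT(msgbufsize, SI_SUB_TUNABLES, SI_ORDER_MIDDLE,
          msgbufsize_init, NULL);

  With this in place, setting kern.msgbufsize="131072" in /boot/loader.conf
  resizes the message buffer without rebuilding the kernel.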
* Remove empty dev_mem_md_init() stubs. (jkim, 2011-01-17, 1 file, -5/+0)
* Add reader/writer lock around mem_range_attr_get() and mem_range_attr_set(). (jkim, 2011-01-17, 1 file, -1/+4)
  Compile sys/dev/mem/memutil.c for all supported platforms and remove now
  unnecessary dev_mem_md_init(). Consistently define mem_range_softc from
  mem.c for all platforms. Add missing #include guards for machine/memdev.h
  and sys/memrange.h. Clean up some nearby style(9) nits.
  MFC after: 1 month
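  The locking pattern, sketched with an sx(9) lock (the lock name and the
  backend hooks are illustrative stand-ins for the MD mem-range operations):

      #include <sys/param.h>
      #include <sys/systm.h>
      #include <sys/lock.h>
      #include <sys/sx.h>

      struct mem_range_desc;                  /* from <sys/memrange.h> */

      /* Hypothetical backend hooks standing in for the MD operations. */
      int mem_range_backend_get(struct mem_range_desc *, int *);
      int mem_range_backend_set(struct mem_range_desc *, int *);

      static struct sx mr_lock;
      SX_SYSINIT(mr_lock, &mr_lock, "mem_range");

      int
      mem_range_attr_get(struct mem_range_desc *mrd, int *arg)
      {
              int error;

              sx_slock(&mr_lock);     /* shared: readers may overlap */
              error = mem_range_backend_get(mrd, arg);
              sx_sunlock(&mr_lock);
              return (error);
      }

      int
      mem_range_attr_set(struct mem_range_desc *mrd, int *arg)
      {
              int error;

              sx_xlock(&mr_lock);     /* exclusive: one writer at a time */
              error = mem_range_backend_set(mrd, arg);
              sx_xunlock(&mr_lock);
              return (error);
      }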
* Remove unneeded includes of <sys/linker_set.h>. (jhb, 2011-01-11, 1 file, -1/+0)
  Other headers that use it internally contain nested includes.
  Reviewed by: bde
* Move repeated MAXSLP definition from machine/vmparam.h to sys/vmmeter.h. (kib, 2011-01-09, 1 file, -11/+0)
  Update the outdated comments describing MAXSLP and the process selection
  algorithm for swap out.
  Comments wording and reviewed by: alc
* The highest-precision floating point type on ia64 has 64 bits of precision, so DECIMAL_DIG should be 21, as on i386/amd64. (das, 2011-01-09, 1 file, -1/+1)
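  The value follows from the C99 rule DECIMAL_DIG = ceil(1 + p * log10(2))
  for a p-bit significand; with p = 64 that is ceil(20.27) = 21. A quick
  standalone check (compile with -lm):

      #include <math.h>
      #include <stdio.h>

      int
      main(void)
      {
              int p = 64;     /* ia64 long double: 64-bit significand */

              /* C99: enough decimal digits to round-trip any value. */
              printf("%d\n", (int)ceil(1 + p * log10(2.0)));  /* 21 */
              return (0);
      }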
* On mixed 32/64 bit architectures (mips, powerpc) use __LP64__ rather than architecture macros (__mips_n64, __powerpc64__) when 64 bit types (and corresponding macros) are different from 32 bit. [1] (tijl, 2011-01-08, 1 file, -2/+2)
  Correct the type of INT64_MIN, INT64_MAX and UINT64_MAX.
  Define (U)INTMAX_C as an alias for (U)INT64_C matching the type
  definition for (u)intmax_t. Do this on all architectures for consistency.
  Suggested by: bde [1]
  Approved by: kib (mentor)
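  The resulting header pattern looks roughly like this (a reconstruction
  for illustration, not the verbatim committed header):

      /* 64-bit limits and constant macros, <machine/_stdint.h> style. */
      #ifdef __LP64__
      #define INT64_MAX       0x7fffffffffffffffL        /* type long */
      #define INT64_MIN       (-INT64_MAX - 1)
      #define UINT64_MAX      0xffffffffffffffffUL
      #define INT64_C(c)      (c ## L)
      #define UINT64_C(c)     (c ## UL)
      #else
      #define INT64_MAX       0x7fffffffffffffffLL       /* type long long */
      #define INT64_MIN       (-INT64_MAX - 1)
      #define UINT64_MAX      0xffffffffffffffffULL
      #define INT64_C(c)      (c ## LL)
      #define UINT64_C(c)     (c ## ULL)
      #endif

      /* (U)INTMAX_C aliases (U)INT64_C, matching (u)intmax_t. */
      #define INTMAX_C(c)     INT64_C(c)
      #define UINTMAX_C(c)    UINT64_C(c)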
* Fix types of some values in machine/_limits.h. (tijl, 2011-01-08, 1 file, -9/+7)
  On some architectures UCHAR_MAX and USHRT_MAX had type unsigned int.
  However, lacking integer suffixes for types smaller than int, their type
  should correspond to that of an object of type unsigned char (or short)
  when used in an expression with objects of type int. In that case
  unsigned char (short) are promoted to int (i.e. signed), so the type of
  UCHAR_MAX and USHRT_MAX should also be int.
  Where MIN/MAX constants implicitly have the correct type the suffix has
  been removed.
  While here, correct some comments.
  Reviewed by: bde
  Approved by: kib (mentor)
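  The promotion argument in miniature (a standalone illustration):

      #include <limits.h>

      /* With no suffix, 0xff is an int constant. An unsigned char is
       * promoted to int in expressions, so giving UCHAR_MAX type int
       * keeps the comparison below int == int rather than mixing
       * signed and unsigned operands. */
      int
      is_uchar_max(unsigned char c)
      {
              return (c == UCHAR_MAX);
      }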
* Add AT_STACKPROT elf aux vector. (kib, 2011-01-07, 1 file, -1/+2)
  Will be used to inform rtld about the initial stack protection set by
  the kernel image activator.
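  Aux vector entries are emitted by the ELF image activator while building
  the process environment; a sketch of the pattern (the stack_prot field
  and the surrounding entries are assumptions for illustration):

      /* In the ELF image activator, filling the aux vector: */
      AUXARGS_ENTRY(pos, AT_PAGESZ, args->pagesz);
      AUXARGS_ENTRY(pos, AT_STACKPROT, imgp->stack_prot); /* new entry */
      AUXARGS_ENTRY(pos, AT_NULL, 0);                     /* terminator */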
* Revert r216134. (brucec, 2010-12-03, 1 file, -31/+24)
  This checkin broke platforms where the bus_space functions are macros:
  they need to be a single statement, and do { } while (0) doesn't work in
  this situation, so revert until a solution can be devised.
* Disallow passing in a count of zero bytes to the bus_space(9) functions. (brucec, 2010-12-02, 1 file, -24/+31)
  Passing a count of zero on i386 and amd64 for [I386|AMD64]_BUS_SPACE_MEM
  causes a crash/hang since the 'loop' instruction decrements the counter
  before checking if it's zero.
  PR: kern/80980
  Discussed with: jhb
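  For the function-based flavor of the interface the guard is simple (a
  sketch with an invented function name; making the same check fit the
  single-statement macro flavors is exactly what the revert above hit):

      #include <sys/param.h>
      #include <sys/systm.h>
      #include <machine/bus.h>

      void
      sketch_read_multi_1(bus_space_tag_t t __unused, bus_space_handle_t h,
          bus_size_t off, uint8_t *buf, size_t count)
      {
              /* On i386/amd64 the 'loop' instruction decrements before
               * testing, so count == 0 would wrap and spin for 2^32 (or
               * 2^64) iterations. Reject it up front. */
              KASSERT(count != 0, ("%s: count == 0", __func__));

              while (count-- > 0)
                      *buf++ = *(volatile uint8_t *)(h + off);
      }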
* phys_avail[] is correctly defined as an array of vm_paddr_t's in machdep.c. Use that same type, and not vm_offset_t, in this include file. (alc, 2010-12-01, 1 file, -1/+1)
* Remove <machine/mutex.h>. (jhb, 2010-11-09, 1 file, -70/+0)
  - Most of the headers were empty, and the contents of the ones that
    were not empty were stale and unused.
  - Now that <machine/mutex.h> no longer exists, there is no need to
    allow it to override various helper macros in <sys/mutex.h>.
  - Rename various helper macros for low-level operations on mutexes to
    live in the _mtx_* or __mtx_* namespaces. While here, change the
    names to more closely match the real API functions they are backing.
  - Drop support for including <sys/mutex.h> in assembly source files.
  Suggested by: bde (1, 2)
* Remove unused includes of <sys/mutex.h> and <machine/mutex.h>. (jhb, 2010-11-09, 3 files, -3/+0)
* Reduce diff between platforms and fix style(9) bugs. (jkim, 2010-11-09, 1 file, -11/+17)
* Adjust the order of operations in spinlock_enter() and spinlock_exit() to work properly with single-stepping in a kernel debugger. (jhb, 2010-11-05, 1 file, -4/+10)
  Specifically, these routines have always disabled interrupts before
  increasing the nesting count and restored the prior state of interrupts
  after decreasing the nesting count, to avoid problems with a nested
  interrupt not disabling interrupts when acquiring a spin lock. However,
  trap interrupts for single-stepping can still occur even when interrupts
  are disabled. Now the saved state of interrupts is not saved in the
  thread until after interrupts have been disabled and the nesting count
  has been increased. Similarly, the saved state from the thread cannot be
  read once the nesting count has been decreased to zero. To fix this, use
  temporary variables to store interrupt state and shuffle it between the
  thread's MD area and the appropriate registers.
  In cooperation with: bde
  MFC after: 1 month
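  The resulting ordering, sketched in C (simplified; the md_spinlock_count
  and md_saved_intr field names follow the common MD pattern and are
  illustrative for ia64):

      void
      spinlock_enter(void)
      {
              struct thread *td;
              register_t intr;

              td = curthread;
              if (td->td_md.md_spinlock_count == 0) {
                      /* Disable interrupts into a local first; publish
                       * the saved state only after the nesting count is
                       * up, so a single-step trap never observes a
                       * half-updated pair. */
                      intr = intr_disable();
                      td->td_md.md_spinlock_count = 1;
                      td->td_md.md_saved_intr = intr;
              } else
                      td->td_md.md_spinlock_count++;
              critical_enter();
      }

      void
      spinlock_exit(void)
      {
              struct thread *td;
              register_t intr;

              td = curthread;
              critical_exit();
              /* Read the saved state while the count still protects it. */
              intr = td->td_md.md_saved_intr;
              td->td_md.md_spinlock_count--;
              if (td->td_md.md_spinlock_count == 0)
                      intr_restore(intr);
      }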
* Fix bogus error message from bus_dmamem_alloc() about incorrect alignment. (neel, 2010-09-29, 1 file, -1/+1)
  The check for alignment should be made against the physical address and
  not the virtual address that maps it.
  Sponsored by: NetApp
  Submitted by: Will McGovern (will at netapp dot com)
  Reviewed by: mjacob, jhb
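  The gist of the fix, sketched (pmap_kextract() stands in for the MD
  virtual-to-physical translation; the helper itself is invented):

      #include <sys/param.h>
      #include <sys/systm.h>
      #include <machine/bus.h>
      #include <vm/vm.h>
      #include <vm/pmap.h>

      /* Return true if a DMA buffer satisfies a power-of-2 alignment. */
      static int
      dma_alignment_ok(void *vaddr, bus_size_t alignment)
      {
              vm_paddr_t paddr;

              /* The device sees physical addresses, so test the physical
               * address; the virtual mapping may have different low bits. */
              paddr = pmap_kextract((vm_offset_t)vaddr);
              return ((paddr & (alignment - 1)) == 0);
      }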
* Remove clauses 3 and 4, per changes to NetBSD versions of these files. (imp, 2010-09-25, 2 files, -14/+0)
* bus_add_child: change type of order parameter to u_int. (avg, 2010-09-10, 1 file, -2/+2)
  This reflects the actual type used to store and compare child device
  orders. The change is mostly done via a Coccinelle (soon to be
  devel/coccinelle) semantic patch. Verified by LINT+modules kernel builds.
  Followup to: r212213
  MFC after: 10 days
* Remove unused KTRACE includes. (jhb, 2010-08-19, 1 file, -6/+0)
* Supply some useful information to the started image using ELF aux vectors. (kib, 2010-08-17, 1 file, -2/+8)
  In particular, provide pagesize and pagesizes array, the canary value
  for SSP use, number of host CPUs and osreldate.
  Tested by: marius (sparc64)
  MFC after: 1 month
* Prefer struct sysentvec sv_psstrings to hardcoding FREEBSD32_PS_STRINGS in the compat32 code. (kib, 2010-08-07, 1 file, -7/+9)
  Use sv_usrstack instead of FREEBSD32_USRSTACK as well.
  MFC after: 1 week
* Add a new ipi_cpu() function to the MI IPI API that can be used to send an IPI to a specific CPU by its cpuid. (jhb, 2010-08-06, 2 files, -0/+11)
  Replace calls to ipi_selected() that constructed a mask for a single CPU
  with calls to ipi_cpu() instead. This will matter more in the future
  when we transition from cpumask_t to cpuset_t for CPU masks, in which
  case building a CPU mask is more expensive.
  Submitted by: peter, sbruno
  Reviewed by: rookie
  Obtained from: Yahoo! (x86)
  MFC after: 1 month
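  The call-site change this enables, sketched (IPI_AST is just an example
  vector; the helper is invented):

      #include <sys/param.h>
      #include <sys/smp.h>
      #include <machine/smp.h>

      static void
      kick_cpu(int cpuid)
      {
              /* Before, callers built a one-CPU mask just to send one IPI:
               *     ipi_selected(1 << cpuid, IPI_AST);
               * which gets expensive once masks become cpuset_t. */
              ipi_cpu(cpuid, IPI_AST);    /* address the CPU directly */
      }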
* Mark the __curthread() functions as __pure2 and remove the volatile keyword from the inline assembly. (jhb, 2010-07-29, 1 file, -2/+2)
  This allows the compiler to cache invocations of curthread since its
  value does not change within a thread context.
  Submitted by: zec (i386)
  MFC after: 1 week
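  On i386 (the variant the submission targeted) the result looks roughly
  like this; shown as a sketch, since ia64 reaches the pcpu area through a
  different register:

      static __inline __pure2 struct thread *
      __curthread(void)
      {
              struct thread *td;

              /* No 'volatile': the value is constant for the lifetime of
               * a thread, so the compiler may merge repeated loads. */
              __asm("movl %%fs:0,%0" : "=r" (td));
              return (td);
      }
      #define curthread (__curthread())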
* Add MALLOC_DEBUG_MAXZONES debug malloc(9) option to use multiple uma zones for each malloc bucket size. (mdf, 2010-07-28, 1 file, -0/+1)
  The purpose is to isolate different malloc types into hash classes, so
  that any buffer overruns or use-after-free will usually only affect
  memory from malloc types in that hash class. This is purely a debugging
  tool; by varying the hash function and tracking which hash class was
  corrupted, the intersection of the hash classes from each instance will
  point to a single malloc type that is being misused. At this point
  inspection or memguard(9) can be used to catch the offending code.
  Add MALLOC_DEBUG_MAXZONES=8 to -current GENERIC configuration files.
  The suggestion to have this on by default came from Kostik Belousov on
  -arch. This code is based on work by Ron Steinke at Isilon Systems.
  Reviewed by: -arch (mostly silence)
  Reviewed by: zml
  Approved by: zml (mentor)
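  Enabling it is a one-line kernel configuration change, as the commit did
  for GENERIC:

      options         MALLOC_DEBUG_MAXZONES=8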
* Very rough first cut at NUMA support for the physical page allocator. (jhb, 2010-07-27, 1 file, -0/+7)
  For now it uses a very dumb first-touch allocation policy. This will
  change in the future.
  - Each architecture indicates the maximum number of supported memory
    domains via a new VM_NDOMAIN parameter in <machine/vmparam.h>.
  - Each cpu now has a PCPU_GET(domain) member to indicate the memory
    domain a CPU belongs to. Domain values are dense and numbered from 0.
  - When a platform supports multiple domains, the default freelist
    (VM_FREELIST_DEFAULT) is split up into N freelists, one for each
    domain. The MD code is required to populate an array of mem_affinity
    structures. Each entry in the array defines a range of memory (start
    and end) and a domain for the range. Multiple entries may be present
    for a single domain. The list is terminated by an entry where all
    fields are zero. This array of structures is used to split up
    phys_avail[] regions that fall in VM_FREELIST_DEFAULT into per-domain
    freelists.
  - Each memory domain has a separate lookup-array of freelists that is
    used when fulfilling a physical memory allocation. Right now the
    per-domain freelists are listed in a round-robin order for each
    domain. In the future a table such as the ACPI SLIT table may be used
    to order the per-domain lookup lists based on the penalty for each
    memory domain relative to a specific domain. The lookup lists may be
    examined via a new vm.phys.lookup_lists sysctl.
  - The first-touch policy is implemented by using PCPU_GET(domain) to
    pick a lookup list when allocating memory. (See the sketch below.)
  Reviewed by: alc
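  The MD-facing contract for the mem_affinity array, sketched (the struct
  layout mirrors the description above; the two-domain table is made up):

      #include <sys/types.h>
      #include <vm/vm.h>

      /* The real definition lives in <vm/vm_phys.h>: a range of physical
       * memory and the domain it belongs to. */
      struct mem_affinity {
              vm_paddr_t start;
              vm_paddr_t end;
              int domain;
      };

      /* Example: two domains of 2 GB each; the all-zero entry terminates
       * the list. MD startup code points the allocator at this table. */
      static struct mem_affinity affinity_table[] = {
              { 0x000000000UL, 0x080000000UL, 0 },
              { 0x080000000UL, 0x100000000UL, 1 },
              { 0, 0, 0 }
      };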
* When compat32 binary asks for the value of hw.machine_arch, report the name of the 32bit sibling architecture instead of the host one. (kib, 2010-07-22, 1 file, -0/+3)
  Do the same for hw.machine on amd64. Add a safety-belt
  debug.adaptive_machine_arch sysctl to turn the substitution off.
  Reviewed by: jhb, nwhitehorn
  MFC after: 2 weeks
* Add acpi_find_table() -- a convenience function for looking up an ACPI table given the signature. (marcel, 2010-07-07, 2 files, -2/+38)
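  Usage would look roughly like this (a sketch: the prototype below is an
  assumption, since the log does not show it; "APIC" is the standard MADT
  table signature):

      #include <sys/param.h>
      #include <sys/systm.h>
      #include <vm/vm.h>

      vm_paddr_t acpi_find_table(const char *sig);    /* assumed prototype */

      static void
      report_madt(void)
      {
              vm_paddr_t madt;

              /* Look up the MADT by its 4-character table signature. */
              madt = acpi_find_table("APIC");
              if (madt == 0)
                      printf("MADT not present\n");
      }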
* Remove pointless BOOTP conditional. (marcel, 2010-07-07, 1 file, -5/+0)
* Provide more examples for error injection. (marcel, 2010-07-06, 1 file, -1/+6)
* Allocate and set up an interrupt vector for corrected machine checks. (marcel, 2010-07-03, 3 files, -0/+37)
  For now, just print when we get the interrupt, but eventually we need to
  collect the details and provide a more useful report.
* When compiling with profiling, we define PROF for userspace and GPROF for the kernel. (marcel, 2010-07-01, 1 file, -1/+1)
* While functions are ideally aligned to a 32-byte boundary, don't assume this to be the case. (marcel, 2010-06-30, 1 file, -1/+1)
* Move prototypes for kern_sigtimedwait() and kern_sigprocmask() to <sys/syscallsubr.h> where all other kern_<syscall> prototypes live. (jhb, 2010-06-30, 1 file, -0/+1)
* The ptc.g operation for the McKinley and Madison processors has the side-effect of purging more than the requested translation. (marcel, 2010-06-12, 2 files, -6/+67)
  While this is not a problem in general, it invalidates the assumption
  made when constructing the trapframe on entry into the kernel in SMP
  configurations. The assumption is that only the first store to the stack
  will possibly cause a TLB miss. Since the ptc.g purges the translation
  caches of all CPUs in the coherency domain, a ptc.g executed on one CPU
  can cause a purge on another CPU that is currently running the critical
  code that saves the state to the trapframe. This can cause an unexpected
  TLB miss, and with interrupt collection disabled this means an
  unexpected data nested TLB fault. A data nested TLB fault will not save
  any context, nor provide a way for software to determine what caused the
  TLB miss or where it occurred. Careful construction of the kernel entry
  and exit code allows us to handle a TLB miss at precisely orchestrated
  points and thereby avoid the need to wire the kernel stack, but the
  unexpected TLB miss caused by the ptc.g instruction resulted in an
  unrecoverable condition and machine checks.
  The solution to this problem is to synchronize the kernel entry on all
  CPUs with the use of the ptc.g instruction on a single CPU by
  implementing a bare-bones readers-writer lock that allows N readers
  (= N CPUs entering the kernel) and 1 writer (= execution of the ptc.g
  instruction on some CPU). This solution wins over a rendez-vous approach
  by not interrupting CPUs with an IPI.
  This problem has not been observed on the Montecito.
  PR: ia64/147772
  MFC after: 6 days
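  A C rendering of such a bare-bones readers-writer gate (the real
  synchronization lives in the assembly entry path; this sketch, with
  invented names, only shows the counting scheme):

      #include <sys/types.h>
      #include <machine/atomic.h>
      #include <machine/cpu.h>

      #define PTC_G_WRITER    0x80000000u

      static volatile uint32_t ptc_gate;

      /* Reader side: a CPU entering the kernel; many may hold this. */
      static void
      kentry_reader_enter(void)
      {
              uint32_t v;

              for (;;) {
                      v = ptc_gate;
                      if ((v & PTC_G_WRITER) == 0 &&
                          atomic_cmpset_acq_32(&ptc_gate, v, v + 1))
                              return;
                      cpu_spinwait();
              }
      }

      static void
      kentry_reader_exit(void)
      {
              atomic_subtract_rel_32(&ptc_gate, 1);
      }

      /* Writer side: the one CPU about to issue ptc.g. Waits until no
       * CPU is inside the kernel-entry critical region. */
      static void
      ptc_g_writer_enter(void)
      {
              while (!atomic_cmpset_acq_32(&ptc_gate, 0, PTC_G_WRITER))
                      cpu_spinwait();
      }

      static void
      ptc_g_writer_exit(void)
      {
              atomic_store_rel_32(&ptc_gate, 0);
      }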
* Relax one of the new assertions in pmap_enter() a little. (alc, 2010-06-11, 1 file, -1/+2)
  Specifically, allow pmap_enter() to be performed on an unmanaged page
  that doesn't have VPO_BUSY set. Having VPO_BUSY set really only matters
  for managed pages. (See, for example, pmap_remove_write().)
* Bump MAX_BPAGES from 256 to 1024. (marcel, 2010-06-11, 3 files, -5/+5)
  It seems that a few drivers, bge(4) in particular, do not handle
  deferred DMA map load operations at all. Any error, and especially
  EINPROGRESS, is treated as a hard error and typically aborts the current
  operation. The fact that the busdma code queues the load operation for
  when resources (i.e. bounce buffers in this particular case) are
  available makes this especially problematic. Bounce buffering, unlike
  what the PR synopsis would suggest, works fine.
  While on the subject, properly implement swi_vm().
  PR: 147502
  MFC after: 1 week
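  How a driver is expected to handle a deferred load, sketched (the
  callback and function names are illustrative):

      #include <sys/param.h>
      #include <sys/systm.h>
      #include <sys/errno.h>
      #include <machine/bus.h>

      static void
      load_cb(void *arg, bus_dma_segment_t *segs, int nseg, int error)
      {
              if (error != 0)
                      return;
              /* Program the device with segs[0 .. nseg-1] here. */
      }

      static int
      start_dma(bus_dma_tag_t tag, bus_dmamap_t map, void *buf,
          bus_size_t len)
      {
              int error;

              error = bus_dmamap_load(tag, map, buf, len, load_cb,
                  NULL, 0);
              if (error == EINPROGRESS) {
                      /* Not a hard error: bounce pages are exhausted and
                       * load_cb will run once they free up. Wait for the
                       * callback instead of aborting the operation. */
                      return (0);
              }
              return (error);
      }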
* Reduce the scope of the page queues lock and the number of PG_REFERENCED changes in vm_pageout_object_deactivate_pages(). (alc, 2010-06-10, 1 file, -14/+14)
  Simplify this function's inner loop using TAILQ_FOREACH(), and shorten
  some of its overly long lines. Update a stale comment.
  Assert that PG_REFERENCED may be cleared only if the object containing
  the page is locked. Add a comment documenting this.
  Assert that a caller to vm_page_requeue() holds the page queues lock,
  and assert that the page is on a page queue.
  Push down the page queues lock into pmap_ts_referenced() and
  pmap_page_exists_quick(). (As of now, there are no longer any pmap
  functions that expect to be called with the page queues lock held.)
  Neither pmap_ts_referenced() nor pmap_page_exists_quick() should ever
  be passed an unmanaged page. Assert this rather than returning "0" and
  "FALSE" respectively.
  ARM: Simplify pmap_page_exists_quick() by switching to TAILQ_FOREACH().
  Push down the page queues lock inside of pmap_clearbit(), simplifying
  pmap_clear_modify(), pmap_clear_reference(), and pmap_remove_write().
  Additionally, this allows for avoiding the acquisition of the page
  queues lock in some cases.
  PowerPC/AIM: moea*_page_exists_quick() and moea*_page_wired_mappings()
  will never be called before pmap initialization is complete. Therefore,
  the check for moea_initialized can be eliminated. Push down the page
  queues lock inside of moea*_clear_bit(), simplifying
  moea*_clear_modify() and moea*_clear_reference(). The last parameter to
  moea*_clear_bit() is never used. Eliminate it.
  PowerPC/BookE: Simplify mmu_booke_page_exists_quick()'s control flow.
  Reviewed by: kib@
* Simplify the inner loop of get_pv_entry(). (alc, 2010-05-30, 1 file, -2/+2)
  While iterating over the page's pv list, there is no point in checking
  whether or not the pv list is empty; instead, wait until the loop
  completes.
* Don't set PG_WRITEABLE in pmap_enter() unless the page is managed. (alc, 2010-05-29, 1 file, -1/+1)
* Push down page queues lock acquisition in pmap_enter_object() and pmap_is_referenced(). (alc, 2010-05-26, 1 file, -2/+6)
  Eliminate the corresponding page queues lock acquisitions from
  vm_map_pmap_enter() and mincore(), respectively. In mincore(), this
  allows some additional cases to complete without ever acquiring the
  page queues lock. Assert that the page is managed in
  pmap_is_referenced().
  On powerpc/aim, push down the page queues lock acquisition from
  moea*_is_modified() and moea*_is_referenced() into moea*_query_bit().
  Again, this will allow some additional cases to complete without ever
  acquiring the page queues lock.
  Reorder a few statements in vm_page_dontneed() so that a race can't
  lead to an old reference persisting. This scenario is described in
  detail by a comment. Correct a spelling error in vm_page_dontneed().
  Assert that the object is locked in vm_page_clear_dirty(), and restrict
  the page queues lock assertion to just those cases in which the page is
  currently writeable.
  Add object locking to vnode_pager_generic_putpages(). This was the one
  and only place where vm_page_clear_dirty() was being called without the
  object being locked. Eliminate an unnecessary vm_page_lock() around
  vnode_pager_setsize()'s call to vm_page_clear_dirty().
  Change vnode_pager_generic_putpages() to the modern style of function
  definition. Also, change the name of one of the parameters to follow
  virtual memory system naming conventions.
  Reviewed by: kib
* Change ia64's struct syscall_args definition so that args is a pointer to the arguments array instead of the array itself. (kib, 2010-05-24, 3 files, -7/+7)
  ia64 syscall arguments are readily available in the frame; point args to
  it and do not do an unnecessary bcopy. Still reserve the array in
  syscall_args for ia32 emulation.
  Suggested and reviewed by: marcel
  MFC after: 1 month
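  The shape of the change, sketched (field layout reconstructed from the
  description; the args32 staging-array name is an assumption):

      #include <sys/types.h>

      struct sysent;

      /* Before: arguments were bcopy()ed out of the trapframe. */
      struct syscall_args_old {
              u_int code;
              struct sysent *callp;
              register_t args[8];
      };

      /* After: point directly into the trapframe; keep a small backing
       * array only for the ia32 emulation path, which still copies. */
      struct syscall_args {
              u_int code;
              struct sysent *callp;
              register_t *args;       /* points into the trapframe */
              register_t args32[8];   /* reserved for ia32 emulation */
      };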
* Roughly half of a typical pmap_mincore() implementation is machine-independent code. Move this code into mincore(), and eliminate the page queues lock from pmap_mincore(). (alc, 2010-05-24, 1 file, -56/+55)
  Push down the page queues lock into pmap_clear_modify(),
  pmap_clear_reference(), and pmap_is_modified(). Assert that these
  functions are never passed an unmanaged page.
  Eliminate an inaccurate comment from powerpc/powerpc/mmu_if.m: Contrary
  to what the comment says, pmap_mincore() is not simply an optimization.
  Without a complete pmap_mincore() implementation, mincore() cannot
  return either MINCORE_MODIFIED or MINCORE_REFERENCED because only the
  pmap can provide this information.
  Eliminate the page queues lock from vfs_setdirty_locked_object(),
  vm_pageout_clean(), vm_object_page_collect_flush(), and
  vm_object_page_clean(). Generally speaking, these are all accesses to
  the page's dirty field, which are synchronized by the containing vm
  object's lock.
  Reduce the scope of the page queues lock in vm_object_madvise() and
  vm_page_dontneed().
  Reviewed by: kib (an earlier version)
* Reorganize syscall entry and leave handling. (kib, 2010-05-23, 4 files, -201/+131)
  Extend struct sysvec with three new elements:
  - sv_fetch_syscall_args: the method to fetch syscall arguments from
    usermode into struct syscall_args. The structure is machine-dependent
    (this might be reconsidered after all architectures are converted).
  - sv_set_syscall_retval: the method to set a return value for usermode
    from the syscall. It is a generalization of cpu_set_syscall_retval(9)
    to allow ABIs to override the way to set a return value.
  - sv_syscallnames: the table of syscall names.
  Use sv_set_syscall_retval in kern_sigsuspend() instead of hardcoding
  the call to cpu_set_syscall_retval().
  The new functions syscallenter(9) and syscallret(9) are provided that
  use the sv_*syscall* pointers and contain the common repeated code from
  the syscall() implementations for the architecture-specific syscall
  trap handlers. Syscallenter() fetches arguments, calls the syscall
  implementation from the ABI sysent table, and sets up the return frame.
  The end-of-syscall bookkeeping is done by syscallret().
  Take advantage of the single place for MI syscall handling code and
  implement ptrace_lwpinfo pl_flags PL_FLAG_SCE, PL_FLAG_SCX and
  PL_FLAG_EXEC. The SCE and SCX flags notify the debugger that the thread
  is stopped at the syscall entry or return point respectively. The EXEC
  flag augments SCX and notifies the debugger that the process address
  space was changed by one of the exec(2)-family syscalls.
  The i386, amd64, sparc64, sun4v, powerpc and ia64 syscall()s are
  changed to use syscallenter()/syscallret(). MIPS and arm are not
  converted and use the mostly unchanged syscall() implementation.
  Reviewed by: jhb, marcel, marius, nwhitehorn, stas
  Tested by: marcel (ia64), marius (sparc64), nwhitehorn (powerpc),
  stas (mips)
  MFC after: 1 month
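  The new hooks, sketched as declarations (signatures inferred from the
  description above; treat them as approximate):

      struct syscall_args;            /* MD layout */
      struct thread;

      struct sysentvec {
              /* ... existing fields ... */
              int     (*sv_fetch_syscall_args)(struct thread *td,
                          struct syscall_args *sa);
              void    (*sv_set_syscall_retval)(struct thread *td,
                          int error);
              const char **sv_syscallnames;
      };

      /* MI helpers built on the hooks, called from the MD trap code: */
      int     syscallenter(struct thread *td, struct syscall_args *sa);
      void    syscallret(struct thread *td, int error,
                  struct syscall_args *sa);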
* Adjust the whitespace for the lines that output fields in 'show pcpu' in DDB so that all the fields line up. (jhb, 2010-05-21, 1 file, -5/+5)
  Also print the tid of the per-CPU idlethread instead of the pid, since
  the idle process is now shared across all idle threads.
  MFC after: 1 month
* Switch to C99 exact-width types. (marcel, 2010-05-19, 5 files, -78/+77)
* On entry to pmap_enter(), assert that the page is busy. (alc, 2010-05-16, 1 file, -5/+13)
  While I'm here, make the style of assertion used by pmap_enter()
  consistent across all architectures.
  On entry to pmap_remove_write(), assert that the page is neither
  unmanaged nor fictitious, since we cannot remove write access to either
  kind of page.
  With the push down of the page queues lock, pmap_remove_write() cannot
  condition its behavior on the state of the PG_WRITEABLE flag if the
  page is busy. Assert that the object containing the page is locked.
  This allows us to know that the page will neither become busy nor will
  PG_WRITEABLE be set on it while pmap_remove_write() is running.
  Correct a long-standing bug in vm_page_cowsetup(). We cannot possibly
  do copy-on-write-based zero-copy transmit on unmanaged or fictitious
  pages, so don't even try. Previously, the call to pmap_remove_write()
  would have failed silently.
* Push down the page queues lock into vm_page_cache(), vm_page_try_to_cache(), and vm_page_try_to_free(). (alc, 2010-05-08, 1 file, -11/+10)
  Consequently, push down the page queues lock into pmap_enter_quick(),
  pmap_page_wired_mappings(), pmap_remove_all(), and pmap_remove_write().
  Push down the page queues lock into Xen's pmap_page_is_mapped(). (I
  overlooked the Xen pmap in r207702.)
  Switch to a per-processor counter for the total number of pages cached.
* On Alan's advice, rather than do a wholesale conversion on a single architecture from page queue lock to a hashed array of page locks (based on a patch by Jeff Roberson), I've implemented page lock support in the MI code and have only moved vm_page's hold_count out from under page queue mutex to page lock. (kmacy, 2010-04-30, 2 files, -2/+8)
  This changes pmap_extract_and_hold on all pmaps.
  Supported by: Bitgravity Inc.
  Discussed with: alc, jeffr, and kib
* MFamd64/i386 r207205 (alc, 2010-04-29, 1 file, -11/+5)
  Clearing a page table entry's accessed bit and setting the page's
  PG_REFERENCED flag in pmap_protect() can't really be justified, so
  don't do it. Moreover, on ia64, don't set the page's dirty field unless
  pmap_protect() is removing write access.