FreeBSD-src - Raptor Engineering's fork of pfsense FreeBSD src with pfSense changes

	Commit message (Collapse)	Author	Age	Files	Lines
*	MFC r291716, r291724, r291741, r291742	ken	2015-12-16	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In addition to those revisions, add this change to a file that is not in head: sys/ia64/include/bus.h: Guard kernel-only parts of the ia64 machine/bus.h header with #ifdef _KERNEL. This allows userland programs to include <machine/bus.h> to get the definition of bus_addr_t and bus_size_t. ------------------------------------------------------------------------ r291716 \| ken \| 2015-12-03 15:54:55 -0500 (Thu, 03 Dec 2015) \| 257 lines Add asynchronous command support to the pass(4) driver, and the new camdd(8) utility. CCBs may be queued to the driver via the new CAMIOQUEUE ioctl, and completed CCBs may be retrieved via the CAMIOGET ioctl. User processes can use poll(2) or kevent(2) to get notification when I/O has completed. While the existing CAMIOCOMMAND blocking ioctl interface only supports user virtual data pointers in a CCB (generally only one per CCB), the new CAMIOQUEUE ioctl supports user virtual and physical address pointers, as well as user virtual and physical scatter/gather lists. This allows user applications to have more flexibility in their data handling operations. Kernel memory for data transferred via the queued interface is allocated from the zone allocator in MAXPHYS sized chunks, and user data is copied in and out. This is likely faster than the vmapbuf()/vunmapbuf() method used by the CAMIOCOMMAND ioctl in configurations with many processors (there are more TLB shootdowns caused by the mapping/unmapping operation) but may not be as fast as running with unmapped I/O. The new memory handling model for user requests also allows applications to send CCBs with request sizes that are larger than MAXPHYS. The pass(4) driver now limits queued requests to the I/O size listed by the SIM driver in the maxio field in the Path Inquiry (XPT_PATH_INQ) CCB. There are some things things would be good to add: 1. Come up with a way to do unmapped I/O on multiple buffers. Currently the unmapped I/O interface operates on a struct bio, which includes only one address and length. It would be nice to be able to send an unmapped scatter/gather list down to busdma. This would allow eliminating the copy we currently do for data. 2. Add an ioctl to list currently outstanding CCBs in the various queues. 3. Add an ioctl to cancel a request, or use the XPT_ABORT CCB to do that. 4. Test physical address support. Virtual pointers and scatter gather lists have been tested, but I have not yet tested physical addresses or scatter/gather lists. 5. Investigate multiple queue support. At the moment there is one queue of commands per pass(4) device. If multiple processes open the device, they will submit I/O into the same queue and get events for the same completions. This is probably the right model for most applications, but it is something that could be changed later on. Also, add a new utility, camdd(8) that uses the asynchronous pass(4) driver interface. This utility is intended to be a basic data transfer/copy utility, a simple benchmark utility, and an example of how to use the asynchronous pass(4) interface. It can copy data to and from pass(4) devices using any target queue depth, starting offset and blocksize for the input and ouptut devices. It currently only supports SCSI devices, but could be easily extended to support ATA devices. It can also copy data to and from regular files, block devices, tape devices, pipes, stdin, and stdout. It does not support queueing multiple commands to any of those targets, since it uses the standard read(2)/write(2)/writev(2)/readv(2) system calls. The I/O is done by two threads, one for the reader and one for the writer. The reader thread sends completed read requests to the writer thread in strictly sequential order, even if they complete out of order. That could be modified later on for random I/O patterns or slightly out of order I/O. camdd(8) uses kqueue(2)/kevent(2) to get I/O completion events from the pass(4) driver and also to send request notifications internally. For pass(4) devcies, camdd(8) uses a single buffer (CAM_DATA_VADDR) per CAM CCB on the reading side, and a scatter/gather list (CAM_DATA_SG) on the writing side. In addition to testing both interfaces, this makes any potential reblocking of I/O easier. No data is copied between the reader and the writer, but rather the reader's buffers are split into multiple I/O requests or combined into a single I/O request depending on the input and output blocksize. For the file I/O path, camdd(8) also uses a single buffer (read(2), write(2), pread(2) or pwrite(2)) on reads, and a scatter/gather list (readv(2), writev(2), preadv(2), pwritev(2)) on writes. Things that would be nice to do for camdd(8) eventually: 1. Add support for I/O pattern generation. Patterns like all zeros, all ones, LBA-based patterns, random patterns, etc. Right Now you can always use /dev/zero, /dev/random, etc. 2. Add support for a "sink" mode, so we do only reads with no writes. Right now, you can use /dev/null. 3. Add support for automatic queue depth probing, so that we can figure out the right queue depth on the input and output side for maximum throughput. At the moment it defaults to 6. 4. Add support for SATA device passthrough I/O. 5. Add support for random LBAs and/or lengths on the input and output sides. 6. Track average per-I/O latency and busy time. The busy time and latency could also feed in to the automatic queue depth determination. sys/cam/scsi/scsi_pass.h: Define two new ioctls, CAMIOQUEUE and CAMIOGET, that queue and fetch asynchronous CAM CCBs respectively. Although these ioctls do not have a declared argument, they both take a union ccb pointer. If we declare a size here, the ioctl code in sys/kern/sys_generic.c will malloc and free a buffer for either the CCB or the CCB pointer (depending on how it is declared). Since we have to keep a copy of the CCB (which is fairly large) anyway, having the ioctl malloc and free a CCB for each call is wasteful. sys/cam/scsi/scsi_pass.c: Add asynchronous CCB support. Add two new ioctls, CAMIOQUEUE and CAMIOGET. CAMIOQUEUE adds a CCB to the incoming queue. The CCB is executed immediately (and moved to the active queue) if it is an immediate CCB, but otherwise it will be executed in passstart() when a CCB is available from the transport layer. When CCBs are completed (because they are immediate or passdone() if they are queued), they are put on the done queue. If we get the final close on the device before all pending I/O is complete, all active I/O is moved to the abandoned queue and we increment the peripheral reference count so that the peripheral driver instance doesn't go away before all pending I/O is done. The new passcreatezone() function is called on the first call to the CAMIOQUEUE ioctl on a given device to allocate the UMA zones for I/O requests and S/G list buffers. This may be good to move off to a taskqueue at some point. The new passmemsetup() function allocates memory and scatter/gather lists to hold the user's data, and copies in any data that needs to be written. For virtual pointers (CAM_DATA_VADDR), the kernel buffer is malloced from the new pass(4) driver malloc bucket. For virtual scatter/gather lists (CAM_DATA_SG), buffers are allocated from a new per-pass(9) UMA zone in MAXPHYS-sized chunks. Physical pointers are passed in unchanged. We have support for up to 16 scatter/gather segments (for the user and kernel S/G lists) in the default struct pass_io_req, so requests with longer S/G lists require an extra kernel malloc. The new passcopysglist() function copies a user scatter/gather list to a kernel scatter/gather list. The number of elements in each list may be different, but (obviously) the amount of data stored has to be identical. The new passmemdone() function copies data out for the CAM_DATA_VADDR and CAM_DATA_SG cases. The new passiocleanup() function restores data pointers in user CCBs and frees memory. Add new functions to support kqueue(2)/kevent(2): passreadfilt() tells kevent whether or not the done queue is empty. passkqfilter() adds a knote to our list. passreadfiltdetach() removes a knote from our list. Add a new function, passpoll(), for poll(2)/select(2) to use. Add devstat(9) support for the queued CCB path. sys/cam/ata/ata_da.c: Add support for the BIO_VLIST bio type. sys/cam/cam_ccb.h: Add a new enumeration for the xflags field in the CCB header. (This doesn't change the CCB header, just adds an enumeration to use.) sys/cam/cam_xpt.c: Add a new function, xpt_setup_ccb_flags(), that allows specifying CCB flags. sys/cam/cam_xpt.h: Add a prototype for xpt_setup_ccb_flags(). sys/cam/scsi/scsi_da.c: Add support for BIO_VLIST. sys/dev/md/md.c: Add BIO_VLIST support to md(4). sys/geom/geom_disk.c: Add BIO_VLIST support to the GEOM disk class. Re-factor the I/O size limiting code in g_disk_start() a bit. sys/kern/subr_bus_dma.c: Change _bus_dmamap_load_vlist() to take a starting offset and length. Add a new function, _bus_dmamap_load_pages(), that will load a list of physical pages starting at an offset. Update _bus_dmamap_load_bio() to allow loading BIO_VLIST bios. Allow unmapped I/O to start at an offset. sys/kern/subr_uio.c: Add two new functions, physcopyin_vlist() and physcopyout_vlist(). sys/pc98/include/bus.h: Guard kernel-only parts of the pc98 machine/bus.h header with #ifdef _KERNEL. This allows userland programs to include <machine/bus.h> to get the definition of bus_addr_t and bus_size_t. sys/sys/bio.h: Add a new bio flag, BIO_VLIST. sys/sys/uio.h: Add prototypes for physcopyin_vlist() and physcopyout_vlist(). share/man/man4/pass.4: Document the CAMIOQUEUE and CAMIOGET ioctls. usr.sbin/Makefile: Add camdd. usr.sbin/camdd/Makefile: Add a makefile for camdd(8). usr.sbin/camdd/camdd.8: Man page for camdd(8). usr.sbin/camdd/camdd.c: The new camdd(8) utility. Sponsored by: Spectra Logic ------------------------------------------------------------------------ r291724 \| ken \| 2015-12-03 17:07:01 -0500 (Thu, 03 Dec 2015) \| 6 lines Fix typos in the camdd(8) usage() function output caused by an error in my diff filter script. Sponsored by: Spectra Logic ------------------------------------------------------------------------ r291741 \| ken \| 2015-12-03 22:38:35 -0500 (Thu, 03 Dec 2015) \| 10 lines Fix g_disk_vlist_limit() to work properly with deletes. Add a new bp argument to g_disk_maxsegs(), and add a new function, g_disk_maxsize() tha will properly determine the maximum I/O size for a delete or non-delete bio. Submitted by: will Sponsored by: Spectra Logic ------------------------------------------------------------------------ ------------------------------------------------------------------------ r291742 \| ken \| 2015-12-03 22:44:12 -0500 (Thu, 03 Dec 2015) \| 5 lines Fix a style issue in g_disk_limit(). Noticed by: bdrewery ------------------------------------------------------------------------ Sponsored by: Spectra Logic
*	MFC 281266:	jhb	2015-06-02	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Move the 32-bit compatible procfs types from freebsd32.h to <sys/procfs.h> and export them to userland. - Define __HAVE_REG32 on platforms that define a reg32 structure and check for this in <sys/procfs.h> to control when to export prstatus32, etc. - Add prstatus32_t and prpsinfo32_t typedefs for the 32-bit structures. libbfd looks for these types, and having them fixes 'gcore' in gdb of a 32-bit process on a 64-bit platform. - Use the structure definitions from <sys/procfs.h> in gcore's elf32 core dump code instead of duplicating the definitions.
*	Merge the fueword(9) and casueword(9). In particular,	kib	2014-11-18	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	MFC r273783: Add fueword(9) and casueword(9) functions. MFC note: ia64 is handled like arm, with NO_FUEWORD define. MFC r273784: Replace some calls to fuword() by fueword() with proper error checking. MFC r273785: Convert kern_umtx.c to use fueword() and casueword(). MFC note: the sys__umtx_lock and sys__umtx_unlock syscalls are not converted, they are removed from HEAD, and not used. The do_sem2() family is not yet merged to stable/10, corresponding chunk will be merged after do_sem2 are committed. MFC r273788 (by jkim): Actually install casuword(9) to fix build. MFC r273911: Add type qualifier volatile to the base (userspace) address argument of fuword(9) and suword(9).
*	Fix previous commit: unbreak build of libkvm by including sys/systm.h	marcel	2014-09-07	1	-1/+2
\| \| \| \| \| \|	only when _KERNEL is defined. Approved by: re@ (implicit)
*	Fix the PCPU access macros. It was found that the PCPU pointer, when	marcel	2014-09-06	1	-8/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	held in register r13, is used outside the bounds of critical_enter() and critical_exit() by virtue of optimizations performed by the compiler. The net effect being that address computations of fields in the PCPU structure could be relative to the PCPU structure of the CPU on which the address computation was performed and not related to the CPU that executes the actual load or store operation. The typical failure mode being that the per-CPU cache of UMA got corrupted due to accesses from other CPUs. Adding more volatile decorating to the register expression does not help. The thinking being that volatile is assumed to work on memory references and not register references. Thus, the fix is to perform the address computation using a volatile inline assembly statement. Additionally, since the reference is fundamentally non-atomic on ia64 by virtue of have a distinct address computation followed by the actual load or store operation, it is required to wrap the entire PCPU access in a critical section. With PCPU_GET and friends requiring curthread now that they're in a critical section, low-level use of these macros in functions like cpu_switch() is not possible anymore. Consequently, a second order set of changes is needed to avoid using PCPU_GET and friends where curthread is either not set yet, or in the process of being changed. In those cases, explicit dereferencing of pcpup is needed. In those cases it is also possible to do that. This is a direct commit to stable/10. Approved by: re@ (marius)
*	MFC r263815, r263872:	emaste	2014-08-21	1	-177/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move ia64 efi.h to sys in preparation for amd64 UEFI support Prototypes specific to ia64 have been left in this file for now, under __ia64__, rather than moving them to a new header under sys/ia64. I anticipate that (some of) the corresponding functions will be shared by the amd64, arm64, i386, and ia64 architectures, and we can adjust this as EFI support on other than ia64 continues to develop. Fix missed efi.h header change in r263815 Sponsored by: The FreeBSD Foundation
*	MFC r263380 & r268185: Add KTR events for the PMAP interface functions.	marcel	2014-07-02	1	-9/+14
\|
*	MFC r263323: Fix and improve exception tracing.	marcel	2014-07-02	3	-1/+9
\|
*	MFC r263254: Move the implementation of kdb_cpu_trap() from <machine/kdb.h>	marcel	2014-07-02	1	-10/+2
\| \| \| \|	to machdep.c.
*	MFC r257477: Purge the translation cache of APs before we unleash them.	marcel	2014-07-02	1	-0/+1
\|
*	MFC r257854 (discussed with alc@)	ian	2014-05-16	1	-10/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	As of r257209, all architectures have defined VM_KMEM_SIZE_SCALE. In other words, every architecture is now auto-sizing the kmem arena. This revision changes kmeminit() so that the definition of VM_KMEM_SIZE_SCALE becomes mandatory and the definition of VM_KMEM_SIZE becomes optional. Replace or eliminate all existing definitions of VM_KMEM_SIZE. With auto-sizing enabled, VM_KMEM_SIZE effectively became an alternate spelling for VM_KMEM_SIZE_MIN on most architectures. Use VM_KMEM_SIZE_MIN for clarity.
*	MFC r263998:	tijl	2014-04-15	1	-1/+1
\| \| \| \| \|	Rename __wchar_t so it no longer conflicts with __wchar_t from clang 3.4 -fms-extensions.
*	MFC r260175:	marcel	2014-02-16	1	-0/+28
\| \| \| \|	Implement atomic_swap_<type>.
*	MFC r257487:	marcel	2014-02-16	1	-0/+5
\| \| \| \|	Use LOG2_ID_PAGE_SIZE again for the identity mapping in regions 6 & 7.
*	On those machines, where sf_bufs do not represent any real object, make	glebius	2013-09-06	1	-0/+12
\| \| \| \| \| \| \| \| \|	sf_buf_alloc()/sf_buf_free() inlines, to save two calls to an absolutely empty functions. Reviewed by: alc, kib, scottl Sponsored by: Nginx, Inc. Sponsored by: Netflix
*	Tidy up global locks for ACPICA. There is no functional change.	jkim	2013-08-13	1	-3/+3
\|
*	Revert r253748,253749	avg	2013-07-28	1	-2/+2
\| \| \| \| \| \|	This WIP should not have been committed yet. Pointyhat to: avg
*	put contents of cpu.h under _KERNEL	avg	2013-07-28	1	-2/+2
\| \| \| \| \| \|	no userland-serviceable parts inside MFC after: 20 days
*	Fix issues with zeroing and fetching the counters, on x86 and ppc64.	kib	2013-07-01	1	-0/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Issues were noted by Bruce Evans and are present on all architectures. On i386, a counter fetch should use atomic read of 64bit value, otherwise carry from the increment on other CPU could be lost for the given fetch, making error of 2^32. If 64bit read (cmpxchg8b) is not available on the machine, it cannot be SMP and it is enough to disable preemption around read to avoid the split read. On x86 the counter increment is not atomic on purpose, which makes it possible for the store of the incremented result to override just zeroed per-cpu slot. The effect would be a counter going off by arbitrary value after zeroing. Perform the counter zeroing on the same processor which does the increments, making the operations mutually exclusive. On i386, same as for the fetching, if the cmpxchg8b is not available, machine is not SMP and we disable preemption for zeroing. PowerPC64 is treated the same as amd64. For other architectures, the changes made to allow the compilation to succeed, without fixing the issues with zeroing or fetching. It should be possible to handle them by using the 64bit loads and stores atomic WRT preemption (assuming the architectures also converted from using critical sections to proper asm). If architecture does not provide the facility, using global (spin) mutex would be non-optimal but working solution. Noted by: bde Sponsored by: The FreeBSD Foundation
*	Move definitions required by userland applications out of acpica_machdep.h.	jkim	2013-06-27	1	-7/+2
\|
*	Rename VM_NDOMAIN into MAXMEMDOM and move it into machine/param.h in	attilio	2013-05-07	2	-7/+4
\| \| \| \| \| \| \| \| \|	order to match the MAXCPU concept. The change should also be useful for consolidation and consistency. Sponsored by: EMC / Isilon storage division Obtained from: jeff Reviewed by: alc
*	Merge from projects/counters: counter(9).	glebius	2013-04-08	1	-0/+54
\| \| \| \| \| \| \| \| \| \| \| \| \|	Introduce counter(9) API, that implements fast and raceless counters, provided (but not limited to) for gathering of statistical data. See http://lists.freebsd.org/pipermail/freebsd-arch/2013-April/014204.html for more details. In collaboration with: kib Reviewed by: luigi Tested by: ae, ray Sponsored by: Nginx, Inc.
*	Merge from projects/counters:	glebius	2013-04-08	1	-1/+2
\| \| \| \| \| \| \|	Pad struct pcpu so that its size is denominator of PAGE_SIZE. This is done to reduce memory waste in UMA_PCPU_ZONE zones. Sponsored by: Nginx, Inc.
*	kernacc() expects all KVAs to be covered in the kernel map. With the	marcel	2013-02-25	1	-4/+5
\| \| \| \| \| \| \| \| \| \| \| \|	introduction of the PBVM, this stopped being the case. Redefine the VM parameters so that the PBVM is included in the kernel map. In particular this introduces VM_INIT_KERNEL_ADDRESS to point to the base of region 5 now that VM_MIN_KERNEL_ADDRESS points to the base of region 4 to include the PBVM. While here define KERNBASE to the actual link address of the kernel as is intended. PR: 169926
*	Eliminate padding by moving 'narg' next to 'code'. Both are 32-bit	marcel	2013-02-12	1	-1/+1
\| \| \| \| \|	entities in the syscall_args structure that otherwise has 64-bit only fields.
*	Fix compilation on ia64 when page size is configured for 16KB.	kib	2012-10-28	1	-15/+0
\| \| \| \|	Reviewed by: alc, marcel
*	Port the new PV entry allocator from amd64/i386. This allocator has two	alc	2012-10-26	1	-4/+17
\| \| \| \| \| \| \| \| \| \| \|	advantages. First, PV entries are roughly half the size. Second, this allocator doesn't access the paging queues, and thus it allows for the removal of the page queues lock from this pmap. Replace all uses of the page queues lock by a R/W lock that is private to this pmap. Tested by: marcel
*	Move PCPU initialization to a new function called cpu_pcpu_setup().	marcel	2012-07-08	1	-0/+2
\| \| \| \| \|	This makes it easier to add additional CPU or platform information to the per-CPU structure without duplicated code.
*	Implement ia64_physmem_alloc() and use it consistently to get memory	marcel	2012-07-07	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	before VM has been initialized. This includes: 1. Replacing pmap_steal_memory(), 2. Replace the handcrafted logic to allocate a naturally aligned VHPT, 3. Properly allocate the DPCPU for the BSP. Ad 3: Appending the DPCPU to kernend worked as long as we wouldn't cross into the next PBVM page. If we were to cross into the next page, then there wouldn't be a PTE entry on the page table for it and we would end up with a MCA following a page fault. As such, this commit fixes MCAs occasionally seen.
*	Hide the creation of phys_avail behind an API to make it easier to do it	marcel	2012-07-07	2	-5/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	correctly. We now iterate the EFI memory descriptors once and collect all the information in a single pass. This includes: 1. The I/O port base address, 2. The PAL memory region. Have the physmem API track this. 3. Memory descriptors of memory we can't use, like bad memory, runtime services code & data, etc. Have the physmem API track these. 4. memory descriptors of memory we can use or re-use, such as free memory, boot time services code & data, loader code & data, etc. These are added by the physmem API. Since the PBVM page table and pages are in memory described as loader data, inform the physmem API of chunks that need to be delated from the available physical memory. While here, remove Maxmem and replace it with the better named paddr_max. Maxmem was defined as physmem, which is generally wrong. Now, paddr_max is properly defined as the largesty physical address. The upshot of all this is that: 1. We properly determine realmem. 2. We maximize physmem by re-using memory where possible. 3. We remove complexity from ia64_init() in machdep.c. 4. Remove confusion about realmem, physmem & Maxmem. The new ia64_physmem_alloc() is to replace pmap_steal_memory() in pmap.c, as well as replace the handcrafted allocation of the VHPT for the BSP in pmap_bootstrap() in pmap.c. This is step 2 and addresses the manipulation of phys_avail after it is being created.
*	Make the wchar_t type machine dependent.	andrew	2012-06-24	2	-6/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is required for ARM EABI. Section 7.1.1 of the Procedure Call for the ARM Architecture (AAPCS) defines wchar_t as either an unsigned int or an unsigned short with the former preferred. Because of this requirement we need to move the definition of __wchar_t to a machine dependent header. It also cleans up the macros defining the limits of wchar_t by defining __WCHAR_MIN and __WCHAR_MAX in the same machine dependent header then using them to define WCHAR_MIN and WCHAR_MAX respectively. Discussed with: bde
*	Implement mechanism to export some kernel timekeeping data to	kib	2012-06-22	1	-0/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	usermode, using shared page. The structures and functions have vdso prefix, to indicate the intended location of the code in some future. The versioned per-algorithm data is exported in the format of struct vdso_timehands, which mostly repeats the content of in-kernel struct timehands. Usermode reading of the structure can be lockless. Compatibility export for 32bit processes on 64bit host is also provided. Kernel also provides usermode with indication about currently used timecounter, so that libc can fall back to syscall if configured timecounter is unknown to usermode code. The shared data updates are initiated both from the tc_windup(), where a fast task is queued to do the update, and from sysctl handlers which change timecounter. A manual override switch kern.timecounter.fast_gettime allows to turn off the mechanism. Only x86 architectures export the real algorithm data, and there, only for tsc timecounter. HPET counters page could be exported as well, but I prefer to not further glue the kernel and libc ABI there until proper vdso-based solution is developed. Minimal stubs neccessary for non-x86 architectures to still compile are provided. Discussed with: bde Reviewed by: jhb Tested by: flo MFC after: 1 month
*	Reserve AT_TIMEKEEP auxv entry for providing usermode the pointer to	kib	2012-06-22	1	-0/+1
\| \| \| \| \| \|	timekeeping information. MFC after: 1 week
*	The page flag PGA_WRITEABLE is set and cleared exclusively by the pmap	alc	2012-06-16	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	layer, but it is read directly by the MI VM layer. This change introduces pmap_page_is_write_mapped() in order to completely encapsulate all direct access to PGA_WRITEABLE in the pmap layer. Aesthetics aside, I am making this change because amd64 will likely begin using an alternative method to track write mappings, and having pmap_page_is_write_mapped() in place allows me to make such a change without further modification to the MI VM layer. As an added bonus, tidy up some nearby comments concerning page flags. Reviewed by: kib MFC after: 6 weeks
*	MFp4 bz_ipv6_fast:	bz	2012-05-24	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	in_cksum.h required ip.h to be included for struct ip. To be able to use some general checksum functions like in_addword() in a non-IPv4 context, limit the (also exported to user space) IPv4 specific functions to the times, when the ip.h header is present and IPVERSION is defined (to 4). We should consider more general checksum (updating) functions to also allow easier incremental checksum updates in the L3/4 stack and firewalls, as well as ponder further requirements by certain NIC drivers needing slightly different pseudo values in offloading cases. Thinking in terms of a better "library". Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems Reviewed by: gnn (as part of the whole) MFC After: 3 days
*	Add a convenience macro for the returns_twice attribute, and apply it to	dim	2012-04-29	1	-2/+2
\| \| \| \| \| \| \|	the prototypes of the appropriate functions (getcontext, savectx, setjmp, sigsetjmp and vfork). MFC after: 2 weeks
*	Eliminate ia32_reg.h by moving its contents to x86 and ia64 reg.h.	tijl	2012-03-18	1	-8/+40
\| \| \| \|	Reviewed by: kib
*	Add C11 macros describing subnormal numbers to float.h.	das	2012-01-23	1	-0/+15
\| \| \| \|	Reviewed by: bde
*	Add parentheses where required. Without them, `sizeof LDBL_MAX'	das	2012-01-20	1	-4/+4
\| \| \| \| \|	is a syntax error and shouldn't be, while `1 FLT_ROUNDS' isn't a syntax error and should be. Thanks to bde for the examples.
*	Replace __signed by signed.	ed	2011-12-13	1	-1/+1
\| \| \| \| \|	The signed keyword is an integral part of the C syntax. There's no need to use __signed.
*	People porting FreeBSD to new architectures ought not have to	das	2011-10-21	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \|	implement a deprecated FPU control interface in addition to the standard one. To make this clearer, further deprecate ieeefp.h by not declaring the function prototypes except on architectures that implement them already. Currently i386 and amd64 implement the ieeefp.h interface for compatibility, and for fp[gs]etprec(), which doesn't exist on most other hardware. Powerpc, sparc64, and ia64 partially implement it and probably shouldn't, and other architectures don't implement it at all.
*	Remove unused define.	kib	2011-10-07	1	-1/+0
\| \| \| \|	MFC after: 1 month
*	Bump MAXCPU for amd64, ia64 and XLP mips appropriately.	attilio	2011-07-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	From now on, default values for FreeBSD will be 64 maxiumum supported CPUs on amd64 and ia64 and 128 for XLP. All the other architectures seem already capped appropriately (with the exception of sparc64 which needs further support on jalapeno flavour). Bump __FreeBSD_version in order to reflect KBI/KPI brekage introduced during the infrastructure cleanup for supporting MAXCPU > 32. This covers cpumask_t retiral too. The switch is considered completed at the present time, so for whatever bug you may experience that is reconducible to that area, please report immediately. Requested by: marcel, jchandra Tested by: pluknet, sbruno Approved by: re (kib)
*	Add the possibility to specify from kernel configs MAXCPU value.	attilio	2011-07-19	1	-0/+2
\| \| \| \| \| \| \| \| \| \|	This patch is going to help in cases like mips flavours where you want a more granular support on MAXCPU. No MFC is previewed for this patch. Tested by: pluknet Approved by: re (kib)
*	Add a few more helper functions for working with memory descriptors:	marcel	2011-07-16	1	-0/+3
\| \| \| \| \| \|	o efi_md_find() - returns the md that covers the given address o efi_md_last() - returns the last md in the list o efi_md_prev() - returns the md that preceeds the given md.
*	Implement basic support for memory attributes. At this time we only	marcel	2011-07-08	2	-11/+15
\| \| \| \| \| \| \| \| \| \| \| \| \|	distinguish between UC and WB memory so that we can map the page to either a region 6 address (for UC) or a region 7 address (for WB). This change is only now possible, because previously we would map regions 6 and 7 with 256MB translations and on top of that had the kernel mapped in region 7 using a wired translation. The introduction of the PBVM moved the kernel into its own region and freed up region 7 and allowed us to revert to standard page-sized translations. This commit inroduces pmap_page_to_va() that respects the attribute.
*	Switch to the event timers infrastructure. This includes:	marcel	2011-06-25	2	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	o Setting td_intr_frame to the XIVs trap frame because it's referenced by the ET event handler. o Signal EOI to the CPU before calling the registered XIV handlers. This prevents lost ITC interrupts, which cause starvation in one-shot mode. o Adding support for IPI_HARDCLOCK with corresponding per-CPU counters. o Have the APs call cpu_initclocks() so as to limited the scattering of clock related initialization. cpu_initclocks() calls the <self>_bsp() or <self>_ap() version accordingly. o Uncomment the ET clock handling in cpu_idle(). o Update the DDB 'show pcpu' output for the new MD fields. o Entirely rewritten ia64_ih_clock(). Note that we don't create as many clock XIVs as we have CPUs, as is done on PowerPC. It doesn't scale. We can only have 240 XIVs and we can have more CPUs than that. There's a single intrcnt index for the cumulative clock ticks and we keep per CPU counts in the PCPU stats structure. o Register the ITC by hooking SI_SUB_CONFIGURE (2nd order). Open issues: o Clock interrupts can still be lost. Some tweaking is still necessary. Thanks to: mav@ for his support, feedback and explanations. ET stats while committing: eris% sysctl machdep.cpu \| grep nclks machdep.cpu.0.nclks: 24007 machdep.cpu.1.nclks: 22895 machdep.cpu.2.nclks: 13523 machdep.cpu.3.nclks: 9342 machdep.cpu.4.nclks: 9103 machdep.cpu.5.nclks: 9298 machdep.cpu.6.nclks: 10039 machdep.cpu.7.nclks: 9479 eris% vmstat -i \| grep clock clock 108599 50
*	Properly serialize the global shootdown with the instruction	marcel	2011-06-17	1	-2/+11
\| \| \| \| \| \| \|	stream of the local processor. Also explicitly invalidate the ALAT. This is done on the other CPUs in the coherence domain by virtue of the ptc.ga instruction, but does not apply to the local CPU.
*	MFC	attilio	2011-05-14	2	-12/+9
\|\
\| *	Be pedantic: mark the pcpu pointer (= register r13) itself as volatile.	marcel	2011-05-14	1	-1/+1
\| \|