summaryrefslogtreecommitdiffstats
path: root/sys/vm/vm_init.c
Commit message (Collapse)AuthorAgeFilesLines
* Add a mmap flag (MAP_32BIT) on 64-bit platforms to request that a mapping usejhb2013-09-091-1/+1
| | | | | | | | | | | | | an address in the first 2GB of the process's address space. This flag should have the same semantics as the same flag on Linux. To facilitate this, add a new parameter to vm_map_find() that specifies an optional maximum virtual address. While here, fix several callers of vm_map_find() to use a VMFS_* constant for the findspace argument instead of TRUE and FALSE. Reviewed by: alc Approved by: re (kib)
* - Use an arbitrary but reasonably large import size for kva on architecturesjeff2013-08-191-1/+2
| | | | | | | | | | | that don't support superpages. This keeps the number of spans and internal fragmentation lower. - When the user asks for alignment from vmem_xalloc adjust the imported size by 2*align to be certain we can satisfy the allocation. This comes at the expense of potential failures when the backend can't supply enough memory but could supply the requested size and alignment. Sponsored by: EMC / Isilon Storage Division
* Add new mmap(2) flags to permit applications to request specific virtualjhb2013-08-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | address alignment of mappings. - MAP_ALIGNED(n) requests a mapping aligned on a boundary of (1 << n). Requests for n >= number of bits in a pointer or less than the size of a page fail with EINVAL. This matches the API provided by NetBSD. - MAP_ALIGNED_SUPER is a special case of MAP_ALIGNED. It can be used to optimize the chances of using large pages. By default it will align the mapping on a large page boundary (the system is free to choose any large page size to align to that seems best for the mapping request). However, if the object being mapped is already using large pages, then it will align the virtual mapping to match the existing large pages in the object instead. - Internally, VMFS_ALIGNED_SPACE is now renamed to VMFS_SUPER_SPACE, and VMFS_ALIGNED_SPACE(n) is repurposed for specifying a specific alignment. MAP_ALIGNED(n) maps to using VMFS_ALIGNED_SPACE(n), while MAP_ALIGNED_SUPER maps to VMFS_SUPER_SPACE. - mmap() of a device object now uses VMFS_OPTIMAL_SPACE rather than explicitly using VMFS_SUPER_SPACE. All device objects are forced to use a specific color on creation, so VMFS_OPTIMAL_SPACE is effectively equivalent. Reviewed by: alc MFC after: 1 month
* Replace kernel virtual address space allocation with vmem. This providesjeff2013-08-071-13/+64
| | | | | | | | | | | | | transparent layering and better fragmentation. - Normalize functions that allocate memory to use kmem_* - Those that allocate address space are named kva_* - Those that operate on maps are named kmap_* - Implement recursive allocation handling for kmem_arena in vmem. Reviewed by: alc Tested by: pho Sponsored by: EMC / Isilon Storage Division
* - Add a general purpose resource allocator, vmem, from NetBSD. It wasjeff2013-06-281-17/+21
| | | | | | | | | | | | | | originally inspired by the Solaris vmem detailed in the proceedings of usenix 2001. The NetBSD version was heavily refactored for bugs and simplicity. - Use this resource allocator to allocate the buffer and transient maps. Buffer cache defrags are reduced by 25% when used by filesystems with mixed block sizes. Ultimately this may permit dynamic buffer cache sizing on low KVA machines. Discussed with: alc, kib, attilio Tested by: pho Sponsored by: EMC / Isilon Storage Division
* Only size and create the bio_transient_map when unmapped buffers arekib2013-03-211-4/+6
| | | | | | | enabled. Now, disabling the unmapped buffers should result in the kernel memory map identical to pre-r248550. Sponsored by: The FreeBSD Foundation
* Implement the concept of the unmapped VMIO buffers, i.e. buffers whichkib2013-03-191-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | do not map the b_pages pages into buffer_map KVA. The use of the unmapped buffers eliminate the need to perform TLB shootdown for mapping on the buffer creation and reuse, greatly reducing the amount of IPIs for shootdown on big-SMP machines and eliminating up to 25-30% of the system time on i/o intensive workloads. The unmapped buffer should be explicitely requested by the GB_UNMAPPED flag by the consumer. For unmapped buffer, no KVA reservation is performed at all. The consumer might request unmapped buffer which does have a KVA reserve, to manually map it without recursing into buffer cache and blocking, with the GB_KVAALLOC flag. When the mapped buffer is requested and unmapped buffer already exists, the cache performs an upgrade, possibly reusing the KVA reservation. Unmapped buffer is translated into unmapped bio in g_vfs_strategy(). Unmapped bio carry a pointer to the vm_page_t array, offset and length instead of the data pointer. The provider which processes the bio should explicitely specify a readiness to accept unmapped bio, otherwise g_down geom thread performs the transient upgrade of the bio request by mapping the pages into the new bio_transient_map KVA submap. The bio_transient_map submap claims up to 10% of the buffer map, and the total buffer_map + bio_transient_map KVA usage stays the same. Still, it could be manually tuned by kern.bio_transient_maxcnt tunable, in the units of the transient mappings. Eventually, the bio_transient_map could be removed after all geom classes and drivers can accept unmapped i/o requests. Unmapped support can be turned off by the vfs.unmapped_buf_allowed tunable, disabling which makes the buffer (or cluster) creation requests to ignore GB_UNMAPPED and GB_KVAALLOC flags. Unmapped buffers are only enabled by default on the architectures where pmap_copy_page() was implemented and tested. In the rework, filesystem metadata is not the subject to maxbufspace limit anymore. Since the metadata buffers are always mapped, the buffers still have to fit into the buffer map, which provides a reasonable (but practically unreachable) upper bound on it. The non-metadata buffer allocations, both mapped and unmapped, is accounted against maxbufspace, as before. Effectively, this means that the maxbufspace is forced on mapped and unmapped buffers separately. The pre-patch bufspace limiting code did not worked, because buffer_map fragmentation does not allow the limit to be reached. By Jeff Roberson request, the getnewbuf() function was split into smaller single-purpose functions. Sponsored by: The FreeBSD Foundation Discussed with: jeff (previous version) Tested by: pho, scottl (previous version), jhb, bf MFC after: 2 weeks
* MFCattilio2013-03-091-7/+0
|
* Switch vm_object lock to be a rwlock.attilio2013-02-201-1/+1
| | | | | | | | * VM_OBJECT_LOCK and VM_OBJECT_UNLOCK are mapped to write operations * VM_OBJECT_SLEEP() is introduced as a general purpose primitve to get a sleep operation using a VM_OBJECT_LOCK() as protection * The approach must bear with vm_pager.h namespace pollution so many files require including directly rwlock.h
* Introduce exec_alloc_args(). The objective being to encapsulate thealc2010-07-271-3/+1
| | | | | | | | | | details of the string buffer allocation in one place. Eliminate the portion of the string buffer that was dedicated to storing the interpreter name. The pointer to the interpreter name can simply be made to point to the appropriate argument string. Reviewed by: kib
* Change the order in which the file name, arguments, environment, andalc2010-07-251-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | shell command are stored in exec*()'s demand-paged string buffer. For a "buildworld" on an 8GB amd64 multiprocessor, the new order reduces the number of global TLB shootdowns by 31%. It also eliminates about 330k page faults on the kernel address space. Change exec_shell_imgact() to use "args->begin_argv" consistently as the start of the argument and environment strings. Previously, it would sometimes use "args->buf", which is the start of the overall buffer, but no longer the start of the argument and environment strings. While I'm here, eliminate unnecessary passing of "&length" to copystr(), where we don't actually care about the length of the copied string. Clean up the initialization of the exec map. In particular, use the correct size for an entry, and express that size in the same way that is used when an entry is allocated. The old size was one page too large. (This discrepancy originated in 2004 when I rewrote exec_map_first_page() to use sf_buf_alloc() instead of the exec map for mapping the first page of the executable.) Reviewed by: kib
* Align the start of the clean submap to a superpage boundary. Althoughalc2010-02-211-1/+1
| | | | | | no superpage mappings are created within the clean submap, aligning the start of the clean submap helps to prevent interference with kmem_alloc()'s use of superpages.
* Adjust some variables (mostly related to the buffer cache) that holdjhb2009-03-091-3/+3
| | | | | | | | | | | | | | | | | | | address space sizes to be longs instead of ints. Specifically, the follow values are now longs: runningbufspace, bufspace, maxbufspace, bufmallocspace, maxbufmallocspace, lobufspace, hibufspace, lorunningspace, hirunningspace, maxswzone, maxbcache, and maxpipekva. Previously, a relatively small number (~ 44000) of buffers set in kern.nbuf would result in integer overflows resulting either in hangs or bogus values of hidirtybuffers and lodirtybuffers. Now one has to overflow a long to see such problems. There was a check for a nbuf setting that would cause overflows in the auto-tuning of nbuf. I've changed it to always check and cap nbuf but warn if a user-supplied tunable would cause overflow. Note that this changes the ABI of several sysctls that are used by things like top(1), etc., so any MFC would probably require a some gross shims to allow for that. MFC after: 1 month
* Introduce a new parameter "superpage_align" to kmem_suballoc() that isalc2008-05-101-5/+6
| | | | | | | | | | | used to request superpage alignment for the submap. Request superpage alignment for the kmem_map. Pass VMFS_ANY_SPACE instead of TRUE to vm_map_find(). (They are currently equivalent but VMFS_ANY_SPACE is the new preferred spelling.) Remove a stale comment from kmem_malloc().
* In keeping with style(9)'s recommendations on macros, use a ';'rwatson2008-03-161-1/+1
| | | | | | | | | after each SYSINIT() macro invocation. This makes a number of lightweight C parsers much happier with the FreeBSD kernel source, including cflow's prcc and lxr. MFC after: 1 month Discussed with: imp, rink
* Add the vm.exec_map_entries tunable and read-only sysctl, which controlskris2005-04-251-1/+7
| | | | | | | | | | | | the number of entries in exec_map (maximum number of simultaneous execs that can be handled by the kernel). The default value of 16 is insufficient on heavily loaded machines (particularly SMP machines), and if it is exceeded then executing further processes will generate a SIGABRT. This is a workaround until a better solution can be implemented. Reviewed by: alc MFC after: 3 days
* /* -> /*- for license, minor formatting changesimp2005-01-071-1/+1
|
* Remove dead code. A vm_map's first_free is never NULL (even if the map isalc2004-08-071-7/+2
| | | | | | | | full). (This is preparation for an O(log n) implementation of vm_map_findspace().) Submitted by: Mark W. Krentel
* The demise of vm_pager_map_page() in revision 1.93 of vm/vm_pager.c permitsalc2004-04-081-2/+2
| | | | | | the reduction of the pager map's size by 8M bytes. In other words, eight megabytes of largely wasted KVA are returned to the kernel map for use elsewhere.
* Remove advertising clause from University of California Regent's license,imp2004-04-061-4/+0
| | | | | | per letter dated July 22, 1999. Approved by: core
* Remove unused arguments from pmap_init().alc2004-04-051-1/+1
|
* Eliminate unused arguments from vm_page_startup().alc2004-04-041-1/+1
|
* Change clean_map from a global to an auto variableeivind2003-09-011-0/+1
|
* More pipe changes:silby2003-08-111-0/+3
| | | | | | | | | | | | | | From alc: Move pageable pipe memory to a seperate kernel submap to avoid awkward vm map interlocking issues. (Bad explanation provided by me.) From me: Rework pipespace accounting code to handle this new layout, and adjust our default values to account for the fact that we now have a solid limit on allocations. Also, remove the "maxpipes" limit, as it no longer has a purpose. (The limit on kva usage solves the problem of having two many pipes.)
* Avoid an unnecessary calculation: there is no need to subtractrobert2003-07-131-1/+1
| | | | `firstaddr' from `v' if we know that the former equals zero.
* Use __FBSDID().obrien2003-06-111-2/+3
|
* Move the definitions of the hw.physmem, hw.usermem and hw.availpagestmm2002-11-071-0/+2
| | | | | | | | | | | sysctls to MI code; this reduces code duplication and makes all of them available on sparc64, and the latter two on powerpc. The semantics by the i386 and pc98 hw.availpages is slightly changed: previously, holes between ranges of available pages would be included, while they are excluded now. The new behaviour should be more correct and brings i386 in line with the other architectures. Move physmem to vm/vm_init.c, where this variable is used in MI code.
* Change hw.physmem and hw.usermem to unsigned long like they used to bepeter2002-08-301-2/+2
| | | | | | | | | | | | | in the original hardwired sysctl implementation. The buf size calculator still overflows an integer on machines with large KVA (eg: ia64) where the number of pages does not fit into an int. Use 'long' there. Change Maxmem and physmem and related variables to 'long', mostly for completeness. Machines are not likely to overflow 'int' pages in the near term, but then again, 640K ought to be enough for anybody. This comes for free on 32 bit machines, so why not?
* Remove references to vm_zone.h and switch over to the new uma API.jeff2002-03-201-1/+0
|
* Remove __P.alfred2002-03-191-1/+1
|
* This is the first part of the new kernel memory allocator. This replacesjeff2002-03-191-1/+0
| | | | | | malloc(9) and vm_zone with a slab like allocator. Reviewed by: arch@
* - Remove a number of extra newlines that do not belong here according toeivind2002-03-101-2/+0
| | | | | | | | | style(9) - Minor space adjustment in cases where we have "( ", " )", if(), return(), while(), for(), etc. - Add /* SYMBOL */ after a few #endifs. Reviewed by: alc
* Move most of the kernel submap initialization code, including thedillon2001-08-221-0/+89
| | | | | | | | timeout callwheel and buffer cache, out of the platform specific areas and into the machine independant area. i386 and alpha adjusted here. Other cpus can be fixed piecemeal. Reviewed by: freebsd-smp, jake
* With Alfred's permission, remove vm_mtx in favor of a fine-grained approachdillon2001-07-041-6/+0
| | | | | | | | | (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
* Sort includes from previous commit.jhb2001-05-221-1/+1
|
* Introduce a global lock for the vm subsystem (vm_mtx).alfred2001-05-191-1/+7
| | | | | | | | | | | | | | | | | | | vm_mtx does not recurse and is required for most low level vm operations. faults can not be taken without holding Giant. Memory subsystems can now call the base page allocators safely. Almost all atomic ops were removed as they are covered under the vm mutex. Alpha and ia64 now need to catch up to i386's trap handlers. FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties). Reviewed (partially) by: jake, jhb
* Undo part of the tangle of having sys/lock.h and sys/mutex.h included inmarkm2001-05-011-1/+2
| | | | | | | | | | | other "system" header files. Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h form kernel .c files. Sort sys/*.h includes where possible in affected files. OK'ed by: bde (with reservations)
* Add missing include.jhb2001-01-241-0/+1
|
* Call vm_zone_init() at the appropriate time.des2001-01-221-0/+2
| | | | Reviewed by: jasone, jhb
* Revert spelling mistake I made in the previous commitcharnier2000-03-271-1/+1
| | | | Requested by: Alan and Bruce
* Spellingcharnier2000-03-261-1/+1
|
* useracc() the prequel:phk1999-10-291-1/+0
| | | | | | | | | | | Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs. This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE} as argument.
* $Id$ -> $FreeBSD$peter1999-08-281-1/+1
|
* Back out DIAGNOSTIC changes.eivind1998-02-061-3/+1
|
* Turn DIAGNOSTIC into a new-style option.eivind1998-02-041-1/+3
|
* Removed unused #includes.bde1997-08-021-4/+1
|
* Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are notpeter1997-02-221-1/+1
| | | | ready for it yet.
* This is the kernel Lite/2 commit. There are some requisite userlanddyson1997-02-101-1/+1
| | | | | | | | | | | | | | | changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes. The system boots and can mount UFS filesystems. Untested: ext2fs, msdosfs, NFS Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed. Reviewed by: various people Submitted by: Jeffery Hsu <hsu@freebsd.org>
* Make the long-awaited change from $Id$ to $FreeBSD$jkh1997-01-141-1/+1
| | | | | | | | This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long. Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
* Changes to support 1Tb filesizes. Pages are now named by andyson1995-12-111-2/+2
| | | | (object,index) pair instead of (object,offset) pair.
OpenPOWER on IntegriCloud