path: root/sys/vm/vm_map.c
* Fix locking violations during page wiring (tegge, 2001-10-14; 1 file, -3/+32):
  - vm map entries are not valid after the map has been unlocked.
  - An exclusive lock on the map is needed before calling vm_map_simplify_entry().
  Fix cleanup after page wiring failure to unwire all pages that had been
  successfully wired before the failure was detected.
  Reviewed by: dillon
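  From userland, the wiring path touched here is driven by mlock(2)/munlock(2).
  A minimal sketch (not part of the commit) that wires and unwires an anonymous
  buffer; note mlock() needs sufficient privilege or RLIMIT_MEMLOCK headroom:

  #include <sys/types.h>
  #include <sys/mman.h>
  #include <err.h>

  int
  main(void)
  {
      size_t len = 1 << 20;   /* 1 MB of anonymous memory */
      void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
          MAP_ANON | MAP_PRIVATE, -1, 0);

      if (p == MAP_FAILED)
          err(1, "mmap");
      /* mlock() exercises vm_map wiring; a failure part-way through now
       * unwires whatever had already been wired. */
      if (mlock(p, len) == -1)
          err(1, "mlock");
      if (munlock(p, len) == -1)
          err(1, "munlock");
      munmap(p, len);
      return (0);
  }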
* Add missing includes of sys/ktr.h (jhb, 2001-10-11; 1 file, -0/+1).
* Make MAXTSIZ, DFLDSIZ, MAXDSIZ, DFLSSIZ, MAXSSIZ, SGROWSIZ loader tunable
  (ps, 2001-10-10; 1 file, -3/+3).
  Reviewed by: peter
  MFC after: 2 weeks
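  These limits can also be inspected at run time with sysctl(3). A minimal
  sketch, not part of the commit; the sysctl name kern.maxdsiz and its
  unsigned long type are assumptions made for illustration:

  #include <sys/types.h>
  #include <sys/sysctl.h>
  #include <err.h>
  #include <stdio.h>

  int
  main(void)
  {
      unsigned long maxdsiz;
      size_t len = sizeof(maxdsiz);

      /* kern.maxdsiz mirrors the MAXDSIZ limit made loader tunable here. */
      if (sysctlbyname("kern.maxdsiz", &maxdsiz, &len, NULL, 0) == -1)
          err(1, "sysctlbyname");
      printf("maximum data segment size: %lu bytes\n", maxdsiz);
      return (0);
  }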
* KSE Milestone 2 (julian, 2001-09-12; 1 file, -14/+14).
  Note: ALL MODULES MUST BE RECOMPILED. Make the kernel aware that there are
  smaller units of scheduling than the process (but only allow one thread per
  process at this time). This is functionally equivalent to the previous
  -current except that there is a thread associated with each process.
  Sorry John! (your next MFC will be a doozy!)
  Reviewed by: peter@freebsd.org, dillon@freebsd.org
  X-MFC after: ha ha ha ha
* Change inlines back into mainline code in preparation for mutexing
  (dillon, 2001-07-04; 1 file, -106/+183).
  Also, most of these inlines had been bloated in -current far beyond their
  original intent. Normalize prototypes and function declarations to be ANSI
  only (half already were), and do some general cleanup. (Kernel size is also
  reduced by 50-100K, but that isn't the prime intent.)
* With Alfred's permission, remove vm_mtx in favor of a fine-grained approach;
  this commit is just the first stage (dillon, 2001-07-04; 1 file, -60/+53).
  Also add various GIANT_ macros to formalize the removal of Giant, making it
  easy to test in a more piecemeal fashion. These macros will allow us to test
  fine-grained locks to a degree before removing Giant, and also after, and to
  remove Giant in a piecemeal fashion via sysctls on those subsystems which the
  authors believe can operate without Giant.
* Introduce numerous SMP-friendly changes to the mbuf allocator
  (bmilekic, 2001-06-22; 1 file, -5/+5).
  Namely, introduce a modified allocation mechanism for mbufs and mbuf
  clusters, one which can scale under SMP and which offers the possibility of
  resource reclamation to be implemented in the future.
  Notable advantages:
  o Reduce contention for SMP by offering per-CPU pools and locks.
  o Better use of data cache due to per-CPU pools.
  o Much less code cache pollution due to excessively large allocation macros.
  o Framework for `grouping' objects from the same page together so as to be
    able to possibly free wired-down pages back to the system if they are no
    longer needed by the network stacks.
  Additional things changed with this addition:
  - Moved some mbuf-specific declarations and initializations from
    sys/conf/param.c into mbuf-specific code where they belong.
  - m_getclr() has been renamed to m_get_clrd() because the old name is really
    confusing. m_getclr() HAS been preserved though and is defined to the new
    name. No tree sweep has been done "to change the interface," as the old
    name will continue to be supported and is not deprecated. The change was
    merely done because m_getclr() sounds too much like "m_get a cluster."
  - TEMPORARILY disabled mbtypes statistics displaying in netstat(1) and
    systat(1) (see TODO below).
  - Fixed systat(1) to display the number of "free mbufs" based on the new
    per-CPU stat structures.
  - Fixed netstat(1) to display new per-CPU stats based on sysctl-exported
    per-CPU stat structures. All info is fetched via sysctl.
  TODO (in order of priority):
  - Re-enable mbtypes statistics in both netstat(1) and systat(1) after
    introducing an SMP-friendly way to collect the mbtypes stats under the
    already introduced per-CPU locks (i.e. hopefully don't use atomic() - it
    seems too costly for a mere stat update, especially when other locks are
    already present).
  - Optionally have systat(1) display not only "total free mbufs" but also
    "total free mbufs per CPU pool."
  - Fix minor length-fetching issues in netstat(1) related to the recently
    re-enabled option to read mbuf stats from a core file.
  - Move reference counters, at least for mbuf clusters, into an unused
    portion of the cluster itself, to save space and the need to allocate a
    counter.
  - Look into introducing resource freeing possibly from a kproc.
  Reviewed by (in parts): jlemon, jake, silby, terry
  Tested by: jlemon (Intel & Alpha), mjacob (Intel & Alpha)
  Preliminary performance measurements: jlemon (and me, obviously)
  URL: http://people.freebsd.org/~bmilekic/mb_alloc/
* Clean up the tabbing (dillon, 2001-06-11; 1 file, -17/+16).
* Two fixes to the out-of-swap process termination code (dillon, 2001-06-09;
  1 file, -0/+35).
  First, start killing processes a little earlier to avoid a deadlock. Second,
  when calculating the 'largest process', do not just count RSS; instead count
  the RSS + swap used by the process. Without this the code tended to kill
  small, inconsequential processes like, oh, sshd, rather than one of the many
  'eatmem 200MB' I run on a whim :-). This fix has been extensively tested on
  -stable and somewhat tested on -current and will be MFCd in a few days.
  Shamed into fixing this by: ps
* Add lots of vm_mtx assertions (jhb, 2001-05-23; 1 file, -2/+39).
  - Add a few KTR tracepoints to track the addition and removal of
    vm_map_entry's and the creation and freeing of vmspace's.
  - Adjust a few portions of code so that we update the process' vmspace
    pointer to its new vmspace before freeing the old vmspace.
* Introduce a global lock for the vm subsystem (vm_mtx) (alfred, 2001-05-19;
  1 file, -8/+33).
  vm_mtx does not recurse and is required for most low-level vm operations.
  Faults cannot be taken without holding Giant. Memory subsystems can now call
  the base page allocators safely. Almost all atomic ops were removed as they
  are covered under the vm mutex. Alpha and ia64 now need to catch up to
  i386's trap handlers. FFS and NFS have been tested; other filesystems will
  need minor changes (grabbing the vm lock when twiddling page properties).
  Reviewed (partially) by: jake, jhb
* Undo part of the tangle of having sys/lock.h and sys/mutex.h included in
  other "system" header files (markm, 2001-05-01; 1 file, -1/+2).
  Also help the deprecation of lockmgr.h by making it a sub-include of
  sys/lock.h and removing sys/lockmgr.h from kernel .c files.
  Sort sys/*.h includes where possible in affected files.
  OK'ed by: bde (with reservations)
* Remove truncated part from comment (alfred, 2001-04-12; 1 file, -1/+1).
* Fix a lock reversal problem in the VM subsystem related to threaded
  programs (dillon, 2001-03-14; 1 file, -0/+3).
  There is a case during a fork() which can cause a deadlock. From Tor: the
  workaround consists of setting a flag in the vm map that indicates that a
  fork is in progress and using that mark in the page fault handling to force
  a revalidation failure. That change will only affect (pessimize) page fault
  handling during fork for threaded (linuxthreads-style) applications and
  applications using aio_*().
  Submitted by: tegge
* Temporarily remove the vm_map_simplify() call from vm_map_insert()
  (dillon, 2001-03-14; 1 file, -0/+7).
  The call is correct, but it interferes with the massive hack called
  vm_map_growstack(). The call will be restored after our stack handling code
  is fixed.
  Reported by: tegge
* When creating a shadow vm_object in vmspace_fork(), only one reference
  count was transferred to the new object, but both the new and the old map
  entries had pointers to the new object (iedowse, 2001-03-09; 1 file, -0/+4).
  Correct this by transferring the second reference. This fixes a panic that
  can occur when mmap(2) is used with the MAP_INHERIT flag.
  PR: i386/25603
  Reviewed by: dillon, alc
* This commit represents work mainly submitted by Tor and slightly modified
  by myself (dillon, 2001-02-04; 1 file, -7/+23).
  It solves a serious vm_map corruption problem that can occur with the
  buffer cache when block sizes > 64K are used. This code has been heavily
  tested in -stable but only tested somewhat on -current. An MFC will occur
  in a few days. My additions include the vm_map_simplify_entry() and minor
  buffer cache boundary case fix.
  Make the buffer cache use a system map for buffer cache KVM rather than a
  normal map.
  Ensure that VM objects are not allocated for system maps. There were cases
  where a buffer map could wind up with a backing VM object -- normally
  harmless, but this could also result in the buffer cache blocking in places
  where it assumes no blocking will occur, possibly resulting in corrupted
  maps.
  Fix a minor boundary case when the buffer cache size limit is reached that
  could result in non-optimal code.
  Add vm_map_simplify_entry() calls to prevent 'creeping proliferation' of
  vm_map_entry's in the buffer cache's vm_map. Previously only a simple
  linear optimization was made. (The buffer vm_map typically has only a
  handful of vm_map_entry's. This stabilizes it at that level permanently.)
  PR: 20609
  Submitted by: tegge (Tor Egge)
* If swap metadata does not fit into the KVM, reduce the number of struct
  swblock entries by dividing the number of entries by 2 until the swap
  metadata fits (tanimura, 2000-12-13; 1 file, -1/+1).
  Also reject swapon(2) upon failure of swap_zone allocation.
  This is just a temporary fix. Better solutions include (suggested by:
  dillon):
  o reserving swap in SWAP_META_PAGES chunks, and
  o swapping the swblock structures themselves.
  Reviewed by: alfred, dillon
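  For reference, swapon(2) -- whose failure path is tightened here -- is
  called from userland roughly as follows. A minimal sketch, not part of the
  commit; the device path is a placeholder and the prototype is assumed to be
  declared in <unistd.h>:

  #include <err.h>
  #include <unistd.h>

  int
  main(void)
  {
      /* Requires root; now fails cleanly if the kernel cannot allocate
       * its swap metadata zone. */
      if (swapon("/dev/ada0p3") == -1)    /* placeholder swap device */
          err(1, "swapon");
      return (0);
  }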
* Clear the MAP_ENTRY_USER_WIRED flag from cloned vm_map entries
  (tegge, 2000-11-02; 1 file, -0/+2).
  PR: 2840
* Convert lockmgr locks from using simple locks to using mutexes
  (jasone, 2000-10-04; 1 file, -0/+8).
  Add lockdestroy() and appropriate invocations, which corresponds to
  lockinit() and must be called to clean up after a lockmgr lock is no longer
  needed.
* Fixed bug in madvise() / MADV_WILLNEED (dillon, 2000-05-14; 1 file, -1/+5).
  When the request is offset from the base of the first map_entry, the call
  to pmap_object_init_pt() uses the wrong start VA. MFC to follow.
  PR: i386/18095
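  The failing case is a MADV_WILLNEED hint that does not begin at the base of
  the mapping. A minimal userland sketch of such a call (not part of the
  commit; the mapped file is a placeholder):

  #include <sys/types.h>
  #include <sys/mman.h>
  #include <err.h>
  #include <fcntl.h>
  #include <unistd.h>

  int
  main(void)
  {
      long pgsz = sysconf(_SC_PAGESIZE);
      size_t len = 8 * (size_t)pgsz;
      int fd = open("/usr/lib/libc.so.7", O_RDONLY);  /* placeholder file */
      char *p;

      if (fd == -1)
          err(1, "open");
      p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
      if (p == MAP_FAILED)
          err(1, "mmap");
      /* The hint starts two pages past the base of the map entry, i.e.
       * the offset case whose start VA was computed incorrectly. */
      if (madvise(p + 2 * pgsz, 4 * (size_t)pgsz, MADV_WILLNEED) == -1)
          err(1, "madvise");
      munmap(p, len);
      close(fd);
      return (0);
  }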
* Revert spelling mistake I made in the previous commit
  (charnier, 2000-03-27; 1 file, -1/+1).
  Requested by: Alan and Bruce
* Spelling (charnier, 2000-03-26; 1 file, -2/+2).
* Add MAP_NOCORE to mmap(2), and MADV_NOCORE and MADV_CORE to madvise(2)
  (ps, 2000-02-28; 1 file, -1/+11).
  This feature allows you to specify whether mmap'd data is included in an
  application's corefile.
  Change the type of eflags in struct vm_map_entry from u_char to vm_eflags_t
  (an unsigned int).
  Reviewed by: dillon, jdp, alfred
  Approved by: jkh
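  A minimal userland sketch (not part of the commit) showing how a region can
  be kept out of core dumps, either at creation time or after the fact:

  #include <sys/types.h>
  #include <sys/mman.h>
  #include <err.h>

  int
  main(void)
  {
      size_t len = 64 * 1024;
      /* MAP_NOCORE excludes the region from core files at mmap() time. */
      void *a = mmap(NULL, len, PROT_READ | PROT_WRITE,
          MAP_ANON | MAP_PRIVATE | MAP_NOCORE, -1, 0);
      /* A plain mapping can be toggled later with MADV_NOCORE/MADV_CORE. */
      void *b = mmap(NULL, len, PROT_READ | PROT_WRITE,
          MAP_ANON | MAP_PRIVATE, -1, 0);

      if (a == MAP_FAILED || b == MAP_FAILED)
          err(1, "mmap");
      if (madvise(b, len, MADV_NOCORE) == -1)
          err(1, "madvise(MADV_NOCORE)");
      if (madvise(b, len, MADV_CORE) == -1)
          err(1, "madvise(MADV_CORE)");
      munmap(a, len);
      munmap(b, len);
      return (0);
  }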
* Fix null-pointer dereference crash when the system is intentionally run out
  of KVM through an mmap()/fork() bomb that allocates hundreds of thousands of
  vm_map_entry structures (dillon, 2000-02-16; 1 file, -1/+7).
  Add a panic to make the null-pointer dereference crash a little more
  verbose.
  Add a new sysctl, vm.max_proc_mmap, which specifies the maximum number of
  mmap()'d spaces (discrete vm_map_entry's in the process). The value defaults
  to around 9000 for a 128MB machine. The test is scaled for the number of
  processes sharing a vmspace (aka linux threads). Setting the value to 0
  disables the feature.
  PR: kern/16573
  Approved by: jkh
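  The new knob can be read (or tuned) with sysctl(3). A minimal sketch, not
  part of the commit; the integer type is assumed from the description:

  #include <sys/types.h>
  #include <sys/sysctl.h>
  #include <err.h>
  #include <stdio.h>

  int
  main(void)
  {
      int maxmmap;
      size_t len = sizeof(maxmmap);

      /* vm.max_proc_mmap caps discrete mmap()'d vm_map_entry's per process;
       * 0 disables the limit. */
      if (sysctlbyname("vm.max_proc_mmap", &maxmmap, &len, NULL, 0) == -1)
          err(1, "sysctlbyname");
      printf("vm.max_proc_mmap = %d\n", maxmmap);
      return (0);
  }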
* Fix a deadlock between msync(..., MS_INVALIDATE) and vm_fault
  (dillon, 2000-01-21; 1 file, -25/+29).
  The invalidation code cannot wait for paging to complete while holding a
  vnode lock, so we don't wait. Instead we allow the lower level code to
  simply block on any busy pages it encounters. I think Yahoo may be the only
  entity in the entire world that actually uses this msync feature :-).
  Bug reported by: Paul Saab <paul@mu.org>
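  For context, the msync(2) usage in question looks roughly like this from
  userland. A minimal sketch, not part of the commit; the file path is a
  placeholder:

  #include <sys/types.h>
  #include <sys/mman.h>
  #include <err.h>
  #include <fcntl.h>
  #include <unistd.h>

  int
  main(void)
  {
      size_t len = 16 * 4096;
      int fd = open("/tmp/datafile", O_RDWR | O_CREAT, 0600); /* placeholder */
      char *p;

      if (fd == -1 || ftruncate(fd, (off_t)len) == -1)
          err(1, "open/ftruncate");
      p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
      if (p == MAP_FAILED)
          err(1, "mmap");
      p[0] = 1;
      /* MS_INVALIDATE discards cached pages so later faults reread the
       * file -- the path that could deadlock against vm_fault. */
      if (msync(p, len, MS_SYNC | MS_INVALIDATE) == -1)
          err(1, "msync");
      munmap(p, len);
      close(fd);
      return (0);
  }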
* Add MAP_NOSYNC feature to mmap(), and MADV_NOSYNC and MADV_AUTOSYNC to
  madvise() (dillon, 1999-12-12; 1 file, -2/+15).
  This feature prevents the update daemon from gratuitously flushing dirty
  pages associated with a mapped file-backed region of memory. The system
  pager will still page the memory as necessary and the VM system will still
  be fully coherent with the filesystem. Modifications made by other means to
  the same area of memory, for example by write(), are unaffected. The
  feature works on a page-granularity basis.
  MAP_NOSYNC allows one to use mmap() to share memory between processes
  without incurring any significant filesystem overhead, putting it in the
  same performance category as SysV shared memory and anonymous memory.
  Reviewed by: julian, alc, dg
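  A minimal sketch of a shared, file-backed mapping created with MAP_NOSYNC
  (not part of the commit; the path is a placeholder):

  #include <sys/types.h>
  #include <sys/mman.h>
  #include <err.h>
  #include <fcntl.h>
  #include <unistd.h>

  int
  main(void)
  {
      size_t len = 1 << 20;
      int fd = open("/tmp/shared.dat", O_RDWR | O_CREAT, 0600); /* placeholder */
      char *p;

      if (fd == -1 || ftruncate(fd, (off_t)len) == -1)
          err(1, "open/ftruncate");
      /* MAP_NOSYNC: dirty pages are not gratuitously flushed by the update
       * daemon; they are written back at pageout/unmap time instead. */
      p = mmap(NULL, len, PROT_READ | PROT_WRITE,
          MAP_SHARED | MAP_NOSYNC, fd, 0);
      if (p == MAP_FAILED)
          err(1, "mmap");
      p[0] = 42;
      /* MADV_AUTOSYNC restores the default flushing behavior for the range;
       * MADV_NOSYNC would set the no-sync property on an existing mapping. */
      if (madvise(p, len, MADV_AUTOSYNC) == -1)
          err(1, "madvise");
      munmap(p, len);
      close(fd);
      return (0);
  }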
* Remove nonsensical vm_map_{clear,set}_recursive() calls from
  vm_map_pageable() (alc, 1999-11-25; 1 file, -3/+0).
  At the point they are called, vm_map_pageable() holds a read (or shared)
  lock on the map. The purpose of vm_map_{clear,set}_recursive() is to
  disable/enable repeated write (or exclusive) lock requests by the same
  process.
* Correct the following error: vm_map_pageable() on a COW'ed (post-fork)
  vm_map always failed because vm_map_lookup() looked at
  "vm_map_entry->wired_count" instead of
  "(vm_map_entry->eflags & MAP_ENTRY_USER_WIRED)" (alc, 1999-11-23; 1 file,
  -4/+5).
  The effect was that many page wiring operations by sysctl were (silently)
  failing.
* Remove unused #include's (alc, 1999-11-07; 1 file, -1/+0).
  Submitted by: phk
* The functions declared by this header file no longer exist
  (alc, 1999-11-07; 1 file, -1/+0).
  Submitted by: phk (in part)
* useracc() the prequel (phk, 1999-10-29; 1 file, -2/+0):
  Merge the contents (less some trivial, bordering-on-the-silly comments) of
  <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines
  for the vm_inherit_t and vm_prot_t types next to their typedefs.
  This paves the road for the commit to follow shortly: change useracc() to
  use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE} as argument.
* Clean up the madvise code, add a few more sanity checks
  (dillon, 1999-09-21; 1 file, -47/+71).
  Reviewed by: Alan Cox <alc@cs.rice.edu>, dg@root.com
* $Id$ -> $FreeBSD$ (peter, 1999-08-28; 1 file, -1/+1).
* vm_map_madvise: a complete rewrite by dillon and myself to separate the
  implementation of behaviors that affect the vm_map_entry from those that
  affect the vm_object (alc, 1999-08-13; 1 file, -60/+85).
  A result of this change is that madvise(..., MADV_FREE) is much cheaper.
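  For reference, the now-cheaper call looks like this from userland (a minimal
  sketch on anonymous memory, not part of the commit):

  #include <sys/types.h>
  #include <sys/mman.h>
  #include <err.h>
  #include <string.h>

  int
  main(void)
  {
      size_t len = 1 << 20;
      char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
          MAP_ANON | MAP_PRIVATE, -1, 0);

      if (p == MAP_FAILED)
          err(1, "mmap");
      memset(p, 0xff, len);       /* dirty the pages */
      /* MADV_FREE marks the contents disposable: the pages may be reclaimed
       * without ever being paged out. */
      if (madvise(p, len, MADV_FREE) == -1)
          err(1, "madvise");
      munmap(p, len);
      return (0);
  }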
* vm_map_madvise: now that behaviors are stored in the vm_map_entry rather
  than the vm_object, it's no longer necessary to instantiate a vm_object
  just to hold the behavior (alc, 1999-08-10; 1 file, -17/+3).
  Reviewed by: dillon
* Move the memory access behavior information provided by madvise from the
  vm_object to the vm_map (alc, 1999-08-01; 1 file, -7/+7).
  Submitted by: dillon
* Fix the following problem (alc, 1999-07-21; 1 file, -1/+3):
  When creating new processes (or performing exec), the new page directory is
  initialized too early. The kernel might grow before p_vmspace is
  initialized for the new process. Since pmap_growkernel doesn't yet know
  about the new page directory, it isn't updated, and subsequent use causes a
  failure.
  The fix is (1) to clear p_vmspace early, to stop pmap_growkernel from
  stomping on memory, and (2) to defer part of the initialization of new page
  directories until p_vmspace is initialized.
  PR: kern/12378
  Submitted by: tegge
  Reviewed by: dfr
* Clean up OBJ_ONEMAPPING management (alc, 1999-07-11; 1 file, -13/+1).
  vm_map.c: Don't set OBJ_ONEMAPPING on arbitrary vm objects. Only default
  and swap type vm objects should have it set. vm_object_deallocate already
  handles these cases.
  vm_object.c: If OBJ_ONEMAPPING isn't already clear in vm_object_shadow, we
  are in trouble. Instead of clearing it, make it an assertion that it is
  already clear.
* Fix some int/long printf problems for the Alpha (peter, 1999-07-01;
  1 file, -2/+2).
* vm_map_growstack uses vmspace::vm_ssize as though it contained the stack
  size in bytes when in fact it is the stack size in pages
  (alc, 1999-06-17; 1 file, -6/+6).
* vm_map_insert sometimes extends an existing vm_map entry, rather than
  creating a new entry (alc, 1999-06-17; 1 file, -3/+7).
  vm_map_stack and vm_map_growstack can panic when a new entry isn't created.
  Fixed vm_map_stack and vm_map_growstack. Also, when extending the stack,
  always set the protection to VM_PROT_ALL.
* Move vm_map_stack and vm_map_growstack after the definition of the
  vm_map_clip_end macro (alc, 1999-06-17; 1 file, -204/+204).
  (The next commit will modify vm_map_stack and vm_map_growstack to use
  vm_map_clip_end.)
* Remove some unused declarations and duplicate initialization
  (alc, 1999-06-17; 1 file, -6/+2).
* vm_map_protect: the wrong vm_map_entry is used to determine if writes must
  not be allowed due to COW (alc, 1999-06-12; 1 file, -2/+2).
* Avoid the creation of unnecessary shadow objects (alc, 1999-05-28;
  1 file, -3/+9).
* vm_map_insert: general cleanup; eliminate coalescing checks that are
  duplicated by vm_object_coalesce (alc, 1999-05-18; 1 file, -47/+38).
* Add the options MAP_PREFAULT and MAP_PREFAULT_PARTIAL to vm_map_find/insert,
  eliminating the need for the pmap_object_init_pt calls in imgact_* and mmap
  (alc, 1999-05-17; 1 file, -1/+6).
  Reviewed by: David Greenman <dg@root.com>
* Remove prototypes for functions that don't exist anymore (vm_map.h)
  (alc, 1999-05-16; 1 file, -7/+4).
  Remove a useless argument from vm_map_madvise's interface (vm_map.c,
  vm_map.h, and vm_mmap.c).
  Remove a redundant test in vm_uiomove (vm_map.c).
  Make two changes to vm_object_coalesce:
  1. Determine whether the new range of pages actually overlaps the existing
     object's range of pages before calling vm_object_page_remove. (Prior to
     this change almost 90% of the calls to vm_object_page_remove were to
     remove pages that were beyond the end of the object.)
  2. Free any swap space allocated to removed pages.
* Simplify vm_map_find/insert's interface: remove the MAP_COPY_NEEDED option
  (alc, 1999-05-14; 1 file, -5/+3).
  It never makes sense to specify MAP_COPY_NEEDED without also specifying
  MAP_COPY_ON_WRITE, and vice versa. Thus, MAP_COPY_ON_WRITE suffices.
  Reviewed by: David Greenman <dg@root.com>