path: root/sys/vm/vm_map.c
* The MAP_ENTRY_NEEDS_COPY flag belongs to protoeflags; the cow variable uses a different namespace. (kib, 2010-01-29; 1 file, -1/+1)
  Reported by: Jonathan Anderson <jonathan.anderson cl cam ac uk>
  MFC after: 3 days
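  To illustrate the two namespaces: callers of vm_map_insert() pass MAP_*
  bits in the "cow" argument, and the entry stores MAP_ENTRY_* bits in its
  eflags. A minimal sketch of that translation, with invented flag values
  (these constants and the helper are illustrative, not the kernel's
  definitions):

#include <stdio.h>

/* Invented values; the real definitions live in sys/vm/vm_map.h. */
#define MAP_COPY_ON_WRITE       0x0001  /* caller-side (cow) namespace */
#define MAP_NOFAULT             0x0002

#define MAP_ENTRY_COW           0x0100  /* entry-side (eflags) namespace */
#define MAP_ENTRY_NEEDS_COPY    0x0200
#define MAP_ENTRY_NOFAULT       0x0400

static int
cow_to_protoeflags(int cow)
{
    int protoeflags = 0;

    if (cow & MAP_COPY_ON_WRITE)
        protoeflags |= MAP_ENTRY_COW | MAP_ENTRY_NEEDS_COPY;
    if (cow & MAP_NOFAULT)
        protoeflags |= MAP_ENTRY_NOFAULT;
    return (protoeflags);
}

int
main(void)
{
    printf("%#x\n", cow_to_protoeflags(MAP_COPY_ON_WRITE)); /* 0x300 */
    return (0);
}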
* Replace VM_PROT_OVERRIDE_WRITE by VM_PROT_COPY. (alc, 2009-11-26; 1 file, -21/+8)
  VM_PROT_OVERRIDE_WRITE has represented a write access that is allowed to
  override write protection. Until now, VM_PROT_OVERRIDE_WRITE has been
  used to write breakpoints into text pages. Text pages are not just write
  protected; they are also copy-on-write. VM_PROT_OVERRIDE_WRITE overrides
  the write protection on the text page and triggers the replication of
  the page so that the breakpoint will be written to a private copy.
  However, here is where things become confused. It is the debugger, not
  the process being debugged, that requires write access to the copied
  page. Nonetheless, the copied page is being mapped into the process with
  write access enabled. In other words, once the debugger sets a
  breakpoint within a text page, the program can write to its private copy
  of that text page, whereas prior to setting the breakpoint, a write
  access would have raised SIGSEGV.
  VM_PROT_COPY addresses this problem. The combination of VM_PROT_READ and
  VM_PROT_COPY forces the replication of a copy-on-write page even though
  the access is only for read. Moreover, the replicated page is only
  mapped into the process with read access, not write access.
  Reviewed by: kib
  MFC after: 4 weeks
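  A toy sketch of the semantics described above: a fault type carrying
  VM_PROT_COPY forces replication, but the page is entered only with the
  access actually requested. The flag values and resolve_fault() are
  invented for illustration; this is not the vm_fault() code.

#include <stdio.h>

#define VM_PROT_READ    0x01
#define VM_PROT_WRITE   0x02
#define VM_PROT_COPY    0x08    /* invented value: force COW replication */

/*
 * Decide whether a fault must replicate a copy-on-write page and with
 * which protection the (possibly replicated) page is mapped.
 */
static void
resolve_fault(int fault_type, int entry_prot, int *do_copy, int *map_prot)
{
    /* Replicate on write, or when the caller asked for a copy. */
    *do_copy = (fault_type & (VM_PROT_WRITE | VM_PROT_COPY)) != 0;
    /*
     * Map only the access actually requested, masked by the entry's
     * protection: a debugger's READ|COPY fault yields a read-only
     * private copy, so the debuggee still cannot write to it.
     */
    *map_prot = fault_type & ~VM_PROT_COPY & entry_prot;
}

int
main(void)
{
    int copy, prot;

    resolve_fault(VM_PROT_READ | VM_PROT_COPY, VM_PROT_READ, &copy, &prot);
    printf("copy=%d prot=%#x\n", copy, prot);   /* copy=1 prot=0x1 */
    return (0);
}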
* Simplify both the invocation and the implementation of vm_fault() for wiring pages. (alc, 2009-11-18; 1 file, -2/+2)
  (Note: Claims made in the comments about the handling of breakpoints in
  wired pages have been false for roughly a decade. This and another bug
  involving breakpoints will be fixed in coming changes.)
  Reviewed by: kib
* Avoid pointless calls to pmap_protect(). (alc, 2009-11-02; 1 file, -3/+3)
  Reviewed by: kib
* When the protection of a wired read-only mapping is changed to read-write, install a new shadow object behind the map entry and copy the pages from the underlying objects to it. (kib, 2009-10-27; 1 file, -4/+10)
  This makes the mprotect(2) call actually perform the requested operation
  instead of silently doing nothing and returning success, which caused
  SIGSEGV on a later write access to the mapping.
  Reuse vm_fault_copy_entry() to do the copying, modifying it to behave
  correctly when src_entry == dst_entry.
  Reviewed by: alc
  MFC after: 3 weeks
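  A minimal toy model of the eager-copy idea: a wired page can never take
  the lazy copy-on-write fault, so the private copy must be produced at
  mprotect() time. Every type and helper here is invented; this is not the
  kernel's vm_map API.

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PROT_WRITE  0x2

struct toy_object {
    char pages[64];             /* stand-in for backing pages */
};

struct toy_entry {
    struct toy_object *object;
    int protection;
    int wired_count;
    bool needs_copy;            /* private COW copy not made yet */
};

/* mprotect()-style upgrade to write access (error handling elided). */
static void
upgrade_to_write(struct toy_entry *e)
{
    if (e->wired_count > 0 && e->needs_copy) {
        struct toy_object *shadow = malloc(sizeof(*shadow));

        /* Copy the wired pages up front... */
        memcpy(shadow->pages, e->object->pages, sizeof(shadow->pages));
        /* ...and install the shadow object behind the entry. */
        e->object = shadow;
        e->needs_copy = false;
    }
    e->protection |= TOY_PROT_WRITE;
}

int
main(void)
{
    struct toy_object orig = { "text page" };
    struct toy_entry e = { &orig, 0x1, 1, true };

    upgrade_to_write(&e);
    printf("private copy: %s\n", e.object != &orig ? "yes" : "no");
    if (e.object != &orig)
        free(e.object);
    return (0);
}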
* Move the annotation for vm_map_startup() immediately before the function. (kib, 2009-10-01; 1 file, -16/+16)
  MFC after: 3 days
* Add a new type of VM object: OBJT_SG. (jhb, 2009-07-24; 1 file, -5/+9)
  An OBJT_SG object is very similar to a device pager (OBJT_DEVICE) object
  in that it uses fictitious pages to provide aliases to other memory
  addresses. The primary difference is that it uses an sglist(9) to
  determine the physical addresses for a given offset into the object
  instead of invoking the d_mmap() method in a device driver.
  Reviewed by: alc
  Approved by: re (kensmith)
  MFC after: 2 weeks
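  The sglist-based translation amounts to walking a list of (physical
  address, length) segments. A toy sketch; the types and toy_sg_lookup()
  are invented stand-ins, not the sglist(9) API.

#include <stdint.h>
#include <stdio.h>

struct toy_sg_seg {
    uint64_t paddr;             /* physical start of the segment */
    uint64_t len;               /* segment length in bytes */
};

/* Return the physical address backing "offset", or 0 past the end. */
static uint64_t
toy_sg_lookup(const struct toy_sg_seg *segs, int nsegs, uint64_t offset)
{
    for (int i = 0; i < nsegs; i++) {
        if (offset < segs[i].len)
            return (segs[i].paddr + offset);
        offset -= segs[i].len;  /* skip past this segment */
    }
    return (0);
}

int
main(void)
{
    struct toy_sg_seg segs[] = {
        { 0x10000, 0x1000 },    /* 4 KiB at 0x10000 */
        { 0x40000, 0x2000 },    /* 8 KiB at 0x40000 */
    };

    /* Offset 0x1800 lands 0x800 bytes into the second segment. */
    printf("%#llx\n",
        (unsigned long long)toy_sg_lookup(segs, 2, 0x1800)); /* 0x40800 */
    return (0);
}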
* When VM_MAP_WIRE_HOLESOK is not specified and vm_map_wire(9) encounters a non-readable and non-executable map entry, the entry is skipped from wiring and the loop is aborted. (kib, 2009-07-12; 1 file, -1/+1)
  But since MAP_ENTRY_WIRE_SKIPPED was not set for the map entry, its
  wired_count was later erroneously decremented, and vm_map_delete(9) for
  such a map entry got stuck in "vmmaps". Properly set
  MAP_ENTRY_WIRE_SKIPPED when aborting the loop.
  Reported by: John Marshall <john.marshall riverwillow com au>
  Approved by: re (kensmith)
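  A toy model of the bookkeeping: the wiring pass marks the entries it
  skips, and the unwind pass honors the mark so that skipped entries never
  have their wired_count decremented. All names and the two-pass structure
  are invented for illustration.

#include <stdio.h>

#define TOY_PROT_READ       0x1
#define TOY_PROT_EXEC       0x4
#define TOY_WIRE_SKIPPED    0x1     /* entry was skipped, not wired */

struct toy_entry {
    int prot;
    int eflags;
    int wired_count;
};

int
main(void)
{
    struct toy_entry map[] = {
        { TOY_PROT_READ, 0, 0 },
        { 0 /* PROT_NONE */, 0, 0 },
        { TOY_PROT_READ | TOY_PROT_EXEC, 0, 0 },
    };
    int n = 3;

    /*
     * Wiring pass: entries without read or execute rights cannot be
     * wired; mark them so later passes know no wiring happened.
     */
    for (int i = 0; i < n; i++) {
        if ((map[i].prot & (TOY_PROT_READ | TOY_PROT_EXEC)) == 0) {
            map[i].eflags |= TOY_WIRE_SKIPPED;
            continue;
        }
        map[i].wired_count++;
    }

    /*
     * Unwind pass: only touch entries that were actually wired.
     * Without the flag, the PROT_NONE entry's count would go to -1.
     */
    for (int i = 0; i < n; i++) {
        if ((map[i].eflags & TOY_WIRE_SKIPPED) == 0)
            map[i].wired_count--;
    }

    for (int i = 0; i < n; i++)
        printf("entry %d: wired_count=%d\n", i, map[i].wired_count);
    return (0);
}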
* When forking a vm space that has wired map entries, do not forget to charge the objects created by vm_fault_copy_entry(). (kib, 2009-07-03; 1 file, -1/+3)
  The object charge was set, but the reserve was not incremented.
  Reported by: Greg Rivers <gcr+freebsd-current tharned org>
  Reviewed by: alc (previous version)
  Approved by: re (kensmith)
* Implement global and per-uid accounting of anonymous memory. (kib, 2009-06-23; 1 file, -28/+284)
  Add rlimit RLIMIT_SWAP that limits the amount of swap that may be
  reserved for the uid.
  The accounting information (charge) is associated with either the map
  entry, or the vm object backing the entry, assuming the object is the
  first one in the shadow chain and the entry does not require COW. The
  charge is moved from entry to object on allocation of the object, e.g.
  during mmap, assuming the object is allocated, or on the first page
  fault on the entry. It moves back to the entry on forks due to COW
  setup.
  The per-entry granularity of accounting makes the charge process fair
  for processes that change uid during their lifetime, and decrements the
  charge for the proper uid when a region is unmapped.
  The interface of vm_pager_allocate(9) is extended by adding struct ucred
  *, which is used to charge the appropriate uid when the allocation is
  performed by the kernel, e.g. md(4).
  Several syscalls, among them fork(2), may now return ENOMEM when global
  or per-uid limits are enforced.
  In collaboration with: pho
  Reviewed by: alc
  Approved by: re (kensmith)
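  The reservation discipline can be sketched with two counters and two
  limits; the names, the page units, and the lock-free structure below are
  invented for illustration, and the real accounting is attached to struct
  ucred with considerably more state.

#include <errno.h>
#include <stdio.h>

#define TOY_NUIDS           4
#define TOY_UID_LIMIT       1000    /* per-uid cap, in pages */
#define TOY_GLOBAL_LIMIT    2500    /* global reservable swap, in pages */

static long uid_charged[TOY_NUIDS];
static long global_charged;

/*
 * Reserve "npages" of swap on behalf of "uid"; returns 0 on success or
 * ENOMEM when either the per-uid or the global limit would be exceeded.
 */
static int
toy_swap_reserve(int uid, long npages)
{
    if (uid_charged[uid] + npages > TOY_UID_LIMIT ||
        global_charged + npages > TOY_GLOBAL_LIMIT)
        return (ENOMEM);
    uid_charged[uid] += npages;
    global_charged += npages;
    return (0);
}

static void
toy_swap_release(int uid, long npages)
{
    uid_charged[uid] -= npages;
    global_charged -= npages;
}

int
main(void)
{
    printf("uid 1: %d\n", toy_swap_reserve(1, 800));  /* 0 */
    printf("uid 1: %d\n", toy_swap_reserve(1, 300));  /* ENOMEM: uid cap */
    printf("uid 2: %d\n", toy_swap_reserve(2, 900));  /* 0 */
    toy_swap_release(1, 800);
    toy_swap_release(2, 900);
    return (0);
}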
* Eliminate an unnecessary restriction on the vm object type from vm_map_pmap_enter(). (alc, 2009-06-09; 1 file, -4/+2)
  The immediate effect of this change is that automatic prefaulting by
  mmap() for small mappings is performed on POSIX shared memory objects
  just the same as it is on ordinary files.
* Eliminate unnecessary obfuscation when testing a page's valid bits. (alc, 2009-06-07; 1 file, -1/+1)
* Allow valid pages to be mapped for read access when they have a non-zero busy count. (alc, 2009-04-19; 1 file, -2/+1)
  Only mappings that allow write access should be prevented by a non-zero
  busy count. (The prohibition on mapping pages for read access when they
  have a non-zero busy count originated in revision 1.202 of
  i386/i386/pmap.c when this code was a part of the pmap.)
  Reviewed by: tegge
* When vm_map_wire(9) is allowed to skip holes in the wired region, skip the mappings without any read or execute rights, in particular the PROT_NONE entries. (kib, 2009-04-10; 1 file, -1/+15)
  This makes mlockall(2) work for a process address space that has such
  mappings.
  Since the protection mode of an entry may change between setting
  MAP_ENTRY_IN_TRANSITION and the final pass over the region that records
  the wire status of the entries, allocate a new map entry flag
  MAP_ENTRY_WIRE_SKIPPED to mark the skipped PROT_NONE entries.
  Reported and tested by: Hans Ottevanger <fbsdhackers beasties demon nl>
  Reviewed by: alc
  MFC after: 3 weeks
* Revert the addition of the freelist argument for the vm_map_delete() function, done in r188334. (kib, 2009-02-24; 1 file, -30/+24)
  Instead, collect the entries that shall be freed in the
  deferred_freelist member of the map. Automatically purge the deferred
  freelist when the map is unlocked.
  Tested by: pho
  Reviewed by: alc
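  A toy sketch of the defer-until-unlock pattern: deletions performed
  under the map lock only queue the entries, and the unlock path purges
  the queue, where it is safe to take other locks. The types and helpers
  are invented for illustration.

#include <stdio.h>
#include <stdlib.h>

struct toy_entry {
    int id;
    struct toy_entry *next;
};

struct toy_map {
    int locked;                 /* stand-in for the map lock */
    struct toy_entry *deferred_freelist;
};

/*
 * Called with the map "locked": defer the free instead of doing it now,
 * because freeing may need locks (e.g. a vnode lock for an OBJT_VNODE
 * backing object) that must not be acquired under the map lock.
 */
static void
toy_entry_delete(struct toy_map *map, struct toy_entry *e)
{
    e->next = map->deferred_freelist;
    map->deferred_freelist = e;
}

static void
toy_map_unlock(struct toy_map *map)
{
    map->locked = 0;
    /* Safe point: the map lock is no longer held; purge the list. */
    while (map->deferred_freelist != NULL) {
        struct toy_entry *e = map->deferred_freelist;

        map->deferred_freelist = e->next;
        printf("freeing entry %d\n", e->id);
        free(e);
    }
}

int
main(void)
{
    struct toy_map map = { 1, NULL };
    struct toy_entry *e = malloc(sizeof(*e));

    e->id = 42;
    toy_entry_delete(&map, e);  /* deferred while locked */
    toy_map_unlock(&map);       /* actually freed here */
    return (0);
}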
* Add the assertion macros for the map locks and use them in several map manipulation functions. (kib, 2009-02-24; 1 file, -0/+44)
  Tested by: pho
  Reviewed by: alc
* Update the comment after r188334. (kib, 2009-02-24; 1 file, -4/+4)
  Reviewed by: alc
* Improve comments, correct English. (kib, 2009-02-08; 1 file, -8/+8)
  Submitted by: alc
* Do not call vm_object_deallocate() from vm_map_delete(), because we hold the map lock there and might need the vnode lock for OBJT_VNODE objects. (kib, 2009-02-08; 1 file, -8/+32)
  Postpone object deallocation until the caller of vm_map_delete() drops
  the map lock. Link the map entries to be freed into the freelist, which
  is released by the new helper function vm_map_entry_free_freelist().
  Reviewed by: tegge, alc
  Tested by: pho
* In vm_map_sync(), do not call vm_object_sync() while holding the map lock. (kib, 2009-02-08; 1 file, -2/+10)
  Reference the object, drop the map lock, and then call vm_object_sync().
  The object sync might require the vnode lock for OBJT_VNODE type
  objects.
  Reviewed by: tegge
  Tested by: pho
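  The reference/unlock/sync/relock pattern, sketched with a pthread mutex
  standing in for the map lock and a plain refcount standing in for the
  object reference; all names here are invented.

#include <pthread.h>
#include <stdio.h>

struct toy_object {
    int refcount;       /* keeps the object alive while unlocked */
};

static pthread_mutex_t toy_map_lock = PTHREAD_MUTEX_INITIALIZER;

static void
toy_object_sync(struct toy_object *obj)
{
    /*
     * May sleep acquiring other locks (e.g. a vnode lock), so the
     * map lock must not be held across this call.
     */
    printf("syncing object, refcount=%d\n", obj->refcount);
}

static void
toy_map_sync(struct toy_object *obj)
{
    pthread_mutex_lock(&toy_map_lock);
    obj->refcount++;                     /* pin the object... */
    pthread_mutex_unlock(&toy_map_lock); /* ...then drop the map lock */

    toy_object_sync(obj);                /* safe: no lock-order violation */

    pthread_mutex_lock(&toy_map_lock);
    obj->refcount--;
    pthread_mutex_unlock(&toy_map_lock);
}

int
main(void)
{
    struct toy_object obj = { 1 };

    toy_map_sync(&obj);
    return (0);
}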
* Add comments to vm_map_simplify_entry() and vmspace_fork() describing why several calls to vm_object_deallocate() with a locked map do not result in the acquisition of the vnode lock after the map lock. (kib, 2009-02-08; 1 file, -0/+20)
  Suggested and reviewed by: tegge
* Lock the new map in vmspace_fork(). (kib, 2009-02-08; 1 file, -0/+5)
  The newly allocated map should not be accessible outside vmspace_fork()
  yet, but locking it satisfies the protocol of vm_map_entry_link() and
  other functions called from vmspace_fork(). Use a trylock, which
  supposedly cannot fail, to silence the WITNESS warning about nested
  acquisition of an sx lock with the same name.
  Suggested and reviewed by: tegge
* Do not leak the MAP_ENTRY_IN_TRANSITION flag when copying a map entry on fork. (kib, 2009-02-08; 1 file, -2/+4)
  Otherwise, the copied entry cannot be removed in the child map.
  Reviewed by: tegge
  MFC after: 2 weeks
* Resurrect shared map locks, allowing greater concurrency during some map operations, such as page faults. (alc, 2009-01-01; 1 file, -11/+83)
  An earlier version of this change was ...
  Reviewed by: kib
  Tested by: pho
  MFC after: 6 weeks
* Update or eliminate some stale comments. (alc, 2008-12-31; 1 file, -3/+4)
* Avoid an unnecessary memory dereference in vm_map_entry_splay(). (alc, 2008-12-30; 1 file, -3/+4)
* Style change to vm_map_lookup(): Eliminate a macro of dubious value. (alc, 2008-12-30; 1 file, -11/+8)
* Move the implementation of the vm map's fast path on address lookup from vm_map_lookup{,_locked}() to vm_map_lookup_entry(). (alc, 2008-12-30; 1 file, -34/+23)
  Having the fast path in vm_map_lookup{,_locked}() limits its benefits to
  page faults. Moving it to vm_map_lookup_entry() extends its benefits to
  other operations on the vm map.
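  The fast path amounts to consulting the most recently used entry before
  a full search; the real code keeps that entry at the root of a splay
  tree, while the structure below is an invented linear stand-in for
  illustration.

#include <stdbool.h>
#include <stdio.h>

struct toy_entry {
    unsigned long start, end;   /* [start, end) address range */
};

struct toy_map {
    struct toy_entry *root;     /* most recently accessed entry */
    struct toy_entry *entries;
    int nentries;
};

static bool
toy_lookup_entry(struct toy_map *map, unsigned long addr,
    struct toy_entry **out)
{
    struct toy_entry *cur = map->root;

    /* Fast path: repeated lookups tend to hit the same entry. */
    if (cur != NULL && addr >= cur->start && addr < cur->end) {
        *out = cur;
        return (true);
    }
    /* Slow path: full search (a splay in the real code). */
    for (int i = 0; i < map->nentries; i++) {
        cur = &map->entries[i];
        if (addr >= cur->start && addr < cur->end) {
            map->root = cur;    /* cache for the next lookup */
            *out = cur;
            return (true);
        }
    }
    return (false);
}

int
main(void)
{
    struct toy_entry ents[] = { { 0x1000, 0x2000 }, { 0x4000, 0x8000 } };
    struct toy_map map = { NULL, ents, 2 };
    struct toy_entry *e;

    if (toy_lookup_entry(&map, 0x4800, &e))     /* slow path */
        printf("found [%#lx, %#lx)\n", e->start, e->end);
    if (toy_lookup_entry(&map, 0x5000, &e))     /* fast path hit */
        printf("found [%#lx, %#lx)\n", e->start, e->end);
    return (0);
}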
* KERNBASE is not necessarily an address within the kernel map, e.g., on PowerPC/AIM. (alc, 2008-06-21; 1 file, -1/+1)
  Consequently, it should not be used to determine the maximum number of
  kernel map entries. Instead, use VM_MIN_KERNEL_ADDRESS, which marks the
  start of the kernel map on all architectures.
  Tested by: marcel@ (PowerPC/AIM)
* Generalize vm_map_find(9)'s parameter "find_space". (alc, 2008-05-10; 1 file, -9/+14)
  Specifically, add support for VMFS_ALIGNED_SPACE, which requests the
  allocation of an address range best suited to superpages. The old
  options TRUE and FALSE are mapped to VMFS_ANY_SPACE and VMFS_NO_SPACE,
  so that there is no immediate need to update all of vm_map_find(9)'s
  callers. While I'm here, correct a misstatement about vm_map_find(9)'s
  return values in the man page.
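  A toy rendering of the generalized parameter: the VMFS-style enum values
  echo the ones named above, but the placement logic is invented for the
  sketch.

#include <stdio.h>

enum toy_vmfs {
    TOY_VMFS_NO_SPACE,      /* use the given address (old FALSE) */
    TOY_VMFS_ANY_SPACE,     /* find any free range (old TRUE) */
    TOY_VMFS_ALIGNED_SPACE, /* find a superpage-aligned free range */
};

#define TOY_SUPERPAGE_SIZE  0x200000UL  /* 2 MiB, for the demo */

static unsigned long
toy_map_find(unsigned long hint, enum toy_vmfs find_space)
{
    switch (find_space) {
    case TOY_VMFS_NO_SPACE:
        return (hint);      /* caller demands this exact address */
    case TOY_VMFS_ANY_SPACE:
        return (hint);      /* pretend the hinted range is free */
    case TOY_VMFS_ALIGNED_SPACE:
        /* Round up so superpage promotion is possible later. */
        return ((hint + TOY_SUPERPAGE_SIZE - 1) &
            ~(TOY_SUPERPAGE_SIZE - 1));
    }
    return (0);
}

int
main(void)
{
    printf("%#lx\n", toy_map_find(0x123456, TOY_VMFS_ALIGNED_SPACE));
    return (0);
}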
* vm_map_fixed(), unlike vm_map_find(), does not update "addr", so it can be passed by value. (alc, 2008-04-28; 1 file, -3/+2)
* Update a comment to vm_map_pmap_enter(). (alc, 2008-04-04; 1 file, -2/+2)
* Remove kernel support for M:N threading. (jeff, 2008-03-12; 1 file, -2/+2)
  While the KSE project was quite successful in bringing threading to
  FreeBSD, the M:N approach taken by the kse library was never developed
  to its full potential. Backwards compatibility will be provided via
  libmap.conf for dynamically linked binaries; static binaries will be
  broken.
* In vm_map_stack(), check for wraparound of the specified stack region. (kib, 2008-01-04; 1 file, -1/+3)
  Reported and tested by: Peter Holm
  Reviewed by: alc
  MFC after: 3 days
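  The check itself reduces to detecting unsigned overflow when adding the
  stack size to its bottom address; the function and parameter names below
  are invented stand-ins.

#include <stdio.h>

/*
 * Return nonzero if [addrbos, addrbos + max_ssize) wraps around the
 * address space, i.e. the requested stack region is bogus.
 */
static int
toy_stack_region_wraps(unsigned long addrbos, unsigned long max_ssize)
{
    return (addrbos + max_ssize < addrbos);  /* unsigned wraparound */
}

int
main(void)
{
    printf("%d\n", toy_stack_region_wraps(0x1000, 0x10000));      /* 0 */
    printf("%d\n", toy_stack_region_wraps(~0UL - 0x100, 0x1000)); /* 1 */
    return (0);
}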
* Change the unused 'user_wait' argument to a 'timo' argument, which will be used to specify the timeout for msleep(9). (pjd, 2007-11-07; 1 file, -5/+5)
  Discussed with: alc
  Reviewed by: alc
* Fix the panic("vm_thread_new: kstack allocation failed") and a silent NULL pointer dereference in the i386 and sparc64 pmap_pinit() when kmem_alloc_nofault() failed to allocate address space. (kib, 2007-11-05; 1 file, -6/+23)
  Both functions now return an error instead of panicking or dereferencing
  NULL.
  As a consequence, vmspace_exec() and vmspace_unshare() return an errno
  int. A struct vmspace arg was added to vm_forkproc() to avoid dealing
  with a failed allocation when most of the fork1() job is already done.
  The kernel stack for the thread is now set up in thread_alloc(), which
  itself may return NULL. Also, allocation of the first process thread is
  performed in fork1() to properly deal with stack allocation failure.
  proc_linkup() is separated into proc_linkup(), called from fork1(), and
  proc_linkup0(), which is used to set up the kernel process (formerly
  known as the swapper).
  In collaboration with: Peter Holm
  Reviewed by: jhb
* Correct an error in vm_map_sync(), nee vm_map_clean(), that has existed since revision 1.1. (alc, 2007-10-22; 1 file, -2/+4)
  Specifically, neither traversal of the vm map checks whether the end of
  the vm map has been reached. Consequently, the first traversal can wrap
  around and bogusly return an error. This error has gone unnoticed for so
  long because no one had ever before tried msync(2)ing a region above the
  stack.
  Reported by: peter
  MFC after: 1 week
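  The shape of the bug can be sketched with a circular entry list in which
  the header must terminate the walk; the structure and the check below
  are invented for illustration.

#include <stdio.h>

struct toy_entry {
    unsigned long start, end;
    struct toy_entry *next;
};

/* Verify that [start, end) is fully covered by contiguous entries. */
static int
toy_region_complete(struct toy_entry *first, struct toy_entry *header,
    unsigned long start, unsigned long end)
{
    unsigned long cursor = start;

    for (struct toy_entry *e = first;
        e != header && cursor < end;    /* the end-of-map check */
        e = e->next) {
        if (e->start > cursor)
            return (0);                 /* hole in the region */
        cursor = e->end;
    }
    return (cursor >= end);
}

int
main(void)
{
    struct toy_entry header = { 0, 0, NULL };
    struct toy_entry e1 = { 0x1000, 0x2000, &header };

    header.next = &e1;
    /*
     * Without the "e != header" test, a request extending past the
     * last entry would wrap through the header and misbehave.
     */
    printf("%d\n", toy_region_complete(&e1, &header, 0x1000, 0x2000));
    return (0);
}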
* Change the management of cached pages (PQ_CACHE) in two fundamental ways. (alc, 2007-09-25; 1 file, -11/+7)
  (1) Cached pages are no longer kept in the object's resident page splay
  tree and memq. Instead, they are kept in a separate per-object splay
  tree of cached pages. However, access to this new per-object splay tree
  is synchronized by the _free_ page queues lock, not to be confused with
  the heavily contended page queues lock. Consequently, a cached page can
  be reclaimed by vm_page_alloc(9) without acquiring the object's lock or
  the page queues lock.
  This solves a problem independently reported by tegge@ and Isilon.
  Specifically, they observed the page daemon consuming a great deal of
  CPU time because of pages bouncing back and forth between the cache
  queue (PQ_CACHE) and the inactive queue (PQ_INACTIVE). The source of
  this problem turned out to be a deadlock avoidance strategy employed
  when selecting a cached page to reclaim in vm_page_select_cache().
  However, the root cause was really that reclaiming a cached page
  required the acquisition of an object lock while the page queues lock
  was already held. Thus, this change addresses the problem at its root by
  eliminating the need to acquire the object's lock.
  Moreover, keeping cached pages in the object's primary splay tree and
  memq was, in effect, optimizing for the uncommon case. Cached pages are
  reclaimed far, far more often than they are reactivated. Instead, this
  change makes reclamation cheaper, especially in terms of synchronization
  overhead, and reactivation more expensive, because reactivated pages
  will have to be reentered into the object's primary splay tree and memq.
  (2) Cached pages are now stored alongside free pages in the physical
  memory allocator's buddy queues, increasing the likelihood that large
  allocations of contiguous physical memory (i.e., superpages) will
  succeed.
  Finally, as a result of this change, long-standing restrictions on when
  and where a cached page can be reclaimed and returned by vm_page_alloc(9)
  are eliminated. Specifically, calls to vm_page_alloc(9) specifying
  VM_ALLOC_INTERRUPT can now reclaim and return a formerly cached page.
  Consequently, a call to malloc(9) specifying M_NOWAIT is less likely to
  fail.
  Discussed with: many over the course of the summer, including jeff@,
  Justin Husted @ Isilon, peter@, tegge@
  Tested by: an earlier version by kris@
  Approved by: re (kensmith)
* Do not drop the vm_map lock between doing vm_map_remove() and vm_map_insert(). (kib, 2007-08-20; 1 file, -16/+35)
  For this, introduce vm_map_fixed() that does that for the MAP_FIXED
  case. Dropping the lock allowed a parallel thread to occupy the freed
  space.
  Reported by: Tijl Coosemans <tijl ulyssis org>
  Reviewed by: alc
  Approved by: re (kensmith)
  MFC after: 2 weeks
* Revert the introduction of the VMCNT_* operations. (attilio, 2007-05-31; 1 file, -2/+2)
  Probably, a general approach is not the best solution here, so we should
  solve the sched_lock protection problems separately.
  Requested by: alc
  Approved by: jeff (mentor)
* Add functions sx_xlock_sig() and sx_slock_sig(). (attilio, 2007-05-31; 1 file, -2/+2)
  These functions are intended to do the same actions as sx_xlock() and
  sx_slock(), but with the difference that they perform an interruptible
  sleep, so that the sleep can be interrupted by external events. In order
  to support these new features, some code restructuring is needed, but
  the external API won't be affected at all.
  Note: use a "void" cast for "int"-returning functions in order to keep
  tools like Coverity from whining.
  Requested by: rwatson
  Tested by: rwatson
  Reviewed by: jhb
  Approved by: jeff (mentor)
* Eliminate the reactivation of cached pages in vm_fault_prefault() and vm_map_pmap_enter() unless the caller is madvise(MADV_WILLNEED). (alc, 2007-05-22; 1 file, -5/+13)
  With the exception of calls to vm_map_pmap_enter() from
  madvise(MADV_WILLNEED), vm_fault_prefault() and vm_map_pmap_enter() are
  both used to create speculative mappings. Thus, always reactivating
  cached pages is a mistake. In principle, cached pages should only be
  reactivated by an actual access. Otherwise, the following misbehavior
  can occur. On a hard fault for a text page, the clustering algorithm
  fetches not only the required page but also several of the adjacent
  pages. Now, suppose that one or more of the adjacent pages are never
  accessed. Ultimately, these unused pages become cached pages through the
  efforts of the page daemon. However, the next activation of the
  executable reactivates and maps these unused pages. Consequently, they
  are never replaced. In effect, they become pinned in memory.
* Define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts. (jeff, 2007-05-18; 1 file, -2/+2)
  These macros can be used to abstract away pcpu details, but this change
  also switches all counters to atomics. This means the sched lock is no
  longer responsible for protecting counts in the switch routines.
  Contributed by: Attilio Rao <attilio@FreeBSD.org>
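  A toy version of counter-accessor macros backed by C11 atomics; the
  VMCNT-style names echo the commit, but these definitions are invented
  and the kernel's versions differ.

#include <stdatomic.h>
#include <stdio.h>

static atomic_long toy_v_free_count;

#define TOY_VMCNT_GET(var)      atomic_load(&toy_##var)
#define TOY_VMCNT_ADD(var, n)   atomic_fetch_add(&toy_##var, (n))
#define TOY_VMCNT_SUB(var, n)   atomic_fetch_sub(&toy_##var, (n))

int
main(void)
{
    /* Atomic updates need no external lock (e.g. no sched lock). */
    TOY_VMCNT_ADD(v_free_count, 128);
    TOY_VMCNT_SUB(v_free_count, 28);
    printf("%ld\n", TOY_VMCNT_GET(v_free_count));   /* 100 */
    return (0);
}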
* Remove some code from vmspace_fork() that became redundant after revision 1.334 modified _vm_map_init() to initialize the new vm map's flags to zero. (alc, 2007-04-26; 1 file, -4/+0)
* Two small changes to vm_map_pmap_enter(): (alc, 2007-03-25; 1 file, -4/+3)
  1) Eliminate an unnecessary check for fictitious pages. Specifically,
  only device-backed objects contain fictitious pages, and the object is
  not device-backed.
  2) Change the types of "psize" and "tmpidx" to vm_pindex_t in order to
  prevent possible wraparound with extremely large maps and objects,
  respectively.
  Observed by: tegge (last summer)
* Change the way that unmanaged pages are created. (alc, 2007-02-25; 1 file, -2/+1)
  Specifically, immediately flag any page that is allocated to an
  OBJT_PHYS object as unmanaged in vm_page_alloc() rather than waiting for
  a later call to vm_page_unmanage(). This allows for the elimination of
  some uses of the page queues lock.
  Change the type of the kernel and kmem objects from OBJT_DEFAULT to
  OBJT_PHYS. This allows us to take advantage of the above change to
  simplify the allocation of unmanaged pages in kmem_alloc() and
  kmem_malloc().
  Remove vm_page_unmanage(). It is no longer used.
* Eliminate unnecessary PG_BUSY tests. (alc, 2006-10-21; 1 file, -1/+1)
  They originally served a purpose that is now handled by vm object
  locking.
* Retire debug.mpsafevm. (alc, 2006-07-21; 1 file, -8/+2)
  None of the architectures supported in CVS require it any longer.
* Use ptoa(psize) instead of size to compute the end of the mapping in vm_map_pmap_enter(). (alc, 2006-06-17; 1 file, -3/+3)
* Correct an error in the previous revision that could lead to the panic "Found mapped cache page". (alc, 2006-06-14; 1 file, -0/+1)
  Specifically, if cnt.v_free_count dips below cnt.v_free_reserved after
  p_start has been set to a non-NULL value, then vm_map_pmap_enter() would
  break out of the loop and incorrectly call pmap_enter_object() for the
  remaining address range. To correct this error, this revision truncates
  the address range so that pmap_enter_object() will not map any cache
  pages.
  In collaboration with: tegge@
  Reported by: kris@