path: root/sys/vm
Commit message (author, date, files changed, lines removed/added)
* Backout previous change, I think Julian has a better solution which does not require type-stable refcnts here.
  (bmilekic, 2004-06-09, 1 file, -1/+1)
* Make the slabrefzone, the zone from which we allocated slabs with internal reference counters, UMA_ZONE_NOFREE. This way, those slabs (with their ref counts) will be effectively type-stable, so using a trick like this on the refcount is no longer dangerous:

      MEXT_REM_REF(m);
      if (atomic_cmpset_int(m->m_ext.ref_cnt, 0, 1)) {
              if (m->m_ext.ext_type == EXT_PACKET) {
                      uma_zfree(zone_pack, m);
                      return;
              } else if (m->m_ext.ext_type == EXT_CLUSTER) {
                      uma_zfree(zone_clust, m->m_ext.ext_buf);
                      m->m_ext.ext_buf = NULL;
              } else {
                      (*(m->m_ext.ext_free))(m->m_ext.ext_buf,
                          m->m_ext.ext_args);
                      if (m->m_ext.ext_type != EXT_EXTREF)
                              free(m->m_ext.ref_cnt, M_MBUF);
              }
      }
      uma_zfree(zone_mbuf, m);

  Previously, a second thread hitting the above cmpset might actually read the refcnt AFTER it has already been freed. A very rare occurrence. Now we'll know that it won't be freed, though.

  Spotted by: julian, pjd
  (bmilekic, 2004-06-09, 1 file, -1/+2)
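The drop-then-claim pattern in the entry above can be illustrated in userland with C11 atomics. This is only a minimal sketch, not the kernel code: `sketch_release()` is a hypothetical name, and `atomic_fetch_sub()` and `atomic_compare_exchange_strong()` stand in for MEXT_REM_REF() and atomic_cmpset_int(). It assumes, as the commit message requires, that the counter's memory stays valid (type-stable) after the last reference is dropped.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical userland analogue of the MEXT_REM_REF() + cmpset trick.
 * Drops one reference, then tries to move the counter from 0 to 1.
 * Returns true only for the single caller that wins the 0 -> 1 swap;
 * that caller is the one responsible for freeing the object.
 * Assumption: *refcnt remains valid memory even after it reaches 0.
 */
static bool
sketch_release(atomic_uint *refcnt)
{
    unsigned int expected = 0;

    atomic_fetch_sub(refcnt, 1);
    return (atomic_compare_exchange_strong(refcnt, &expected, 1));
}
```

With a counter that starts at 2, the first call returns false and the second returns true, leaving the counter at 1 and ready for reuse. If the counter's storage could be freed between the decrement and the compare-and-swap, the second step would read freed memory, which is the hazard the commit message describes.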
* Remove references to L1 in the comments, according to Alan they are historical leftovers.
  Approved by: alc
  (netchild, 2004-06-07, 1 file, -2/+2)
* Update stale comments regarding page coloring.
  (alc, 2004-06-05, 1 file, -10/+10)
* Move the definitions of SWAPBLK_NONE and SWAPBLK_MASK from vm_page.h to blist.h, enabling the removal of numerous #includes from subr_blist.c. (subr_blist.c and swap_pager.c are the only users of these definitions.)
  (alc, 2004-06-04, 1 file, -8/+0)
* Fix a comment above uma_zsecond_create(), describing its arguments. It doesn't take 'align' and 'flags' but 'master' instead, which is a reference to the Master Zone, containing the backing Keg.
  Pointed out by: Tim Robbins (tjr)
  (bmilekic, 2004-06-01, 1 file, -3/+3)
* Bring in mbuma to replace mballoc.

  mbuma is an Mbuf & Cluster allocator built on top of a number of extensions to the UMA framework, all included herein.

  Extensions to UMA worth noting:
  - Better layering between slab <-> zone caches; introduce Keg structure which splits off slab cache away from the zone structure and allows multiple zones to be stacked on top of a single Keg (single type of slab cache); perhaps we should look into defining a subset API on top of the Keg for special use by malloc(9), for example.
  - UMA_ZONE_REFCNT zones can now be added, and reference counters automagically allocated for them within the end of the associated slab structures. uma_find_refcnt() does a kextract to fetch the slab struct reference from the underlying page, and lookup the corresponding refcnt.

  mbuma things worth noting:
  - integrates mbuf & cluster allocations with extended UMA and provides caches for commonly-allocated items; defines several zones (two primary, one secondary) and two kegs.
  - change up certain code paths that always used to do: m_get() + m_clget() to instead just use m_getcl() and try to take advantage of the newly defined secondary Packet zone.
  - netstat(1) and systat(1) quickly hacked up to do basic stat reporting but additional stats work needs to be done once some other details within UMA have been taken care of and it becomes clearer to how stats will work within the modified framework.

  From the user perspective, one implication is that the NMBCLUSTERS compile-time option is no longer used. The maximum number of clusters is still capped off according to maxusers, but it can be made unlimited by setting the kern.ipc.nmbclusters boot-time tunable to zero. Work should be done to write an appropriate sysctl handler allowing dynamic tuning of kern.ipc.nmbclusters at runtime.

  Additional things worth noting/known issues (READ):
  - One report of 'ips' (ServeRAID) driver acting really slow in conjunction with mbuma. Need more data. Latest report is that ips is equally sucking with and without mbuma.
  - Giant leak in NFS code sometimes occurs, can't reproduce but currently analyzing; brueffer is able to reproduce but THIS IS NOT an mbuma-specific problem and currently occurs even WITHOUT mbuma.
  - Issues in network locking: there is at least one code path in the rip code where one or more locks are acquired and we end up in m_prepend() with M_WAITOK, which causes WITNESS to whine from within UMA. Current temporary solution: force all UMA allocations to be M_NOWAIT from within UMA for now to avoid deadlocks unless WITNESS is defined and we can determine with certainty that we're not holding any locks when we're M_WAITOK.
  - I've seen at least one weird socketbuffer empty-but-mbuf-still-attached panic. I don't believe this to be related to mbuma but please keep your eyes open, turn on debugging, and capture crash dumps.

  This change removes more code than it adds.

  A paper is available detailing the change and considering various performance issues, it was presented at BSDCan2004: http://www.unixdaemons.com/~bmilekic/netbuf_bmilekic.pdf
  Please read the paper for Future Work and implementation details, as well as credits.

  Testing and Debugging: rwatson, brueffer, Ketrien I. Saihr-Kesenchedra, ...
  Reviewed by: Lots of people (for different parts)
  (bmilekic, 2004-05-31, 5 files, -355/+832)
* Remove a stale comment: PG_DIRTY and PG_FILLED were removed in revisions 1.17 and 1.12 respectively.
  (alc, 2004-05-30, 1 file, -2/+0)
* Correct typo, vm_page_list_find() has been called vm_pageq_find() for quite a long time, i.e., since the cleanup of the VM Page-queues code done two years ago.
  Reviewed by: Alan Cox <alc at freebsd.org>, Matthew Dillon <dillon at backplane.com>
  (hmp, 2004-05-30, 1 file, -2/+2)
* MFS: vm_map.c rev 1.187.2.27 through 1.187.2.29, fix MS_INVALIDATE semantics but provide a sysctl knob for reverting to old ones.
  (des, 2004-05-25, 1 file, -1/+5)
* Back out previous commit; it went to the wrong file.
  (des, 2004-05-25, 1 file, -8/+1)
* MFS: rev 1.187.2.27 through 1.187.2.29, fix MS_INVALIDATE semantics but provide a sysctl knob for reverting to old ones.
  (des, 2004-05-25, 1 file, -1/+8)
* Correct two error cases in vm_map_unwire():

  1. Contrary to the Single Unix Specification our implementation of munlock(2) when performed on an unwired virtual address range has returned an error. Correct this. Note, however, that the behavior of "system" unwiring is unchanged, only "user" unwiring is changed. If "system" unwiring is performed on an unwired virtual address range, an error is still returned.

  2. Performing an errant "system" unwiring on a virtual address range that was "user" (i.e., mlock(2)) but not "system" wired would incorrectly undo the "user" wiring instead of returning an error. Correct this.

  Discussed with: green@
  Reviewed by: tegge@
  (alc, 2004-05-25, 1 file, -4/+5)
* To date, unwiring a fictitious page has produced a panic. The reason being that PHYS_TO_VM_PAGE() returns the wrong vm_page for fictitious pages but unwiring uses PHYS_TO_VM_PAGE(). The resulting panic reported an unexpected wired count. Rather than attempting to fix PHYS_TO_VM_PAGE(), this fix takes advantage of the properties of fictitious pages. Specifically, fictitious pages will never be completely unwired. Therefore, we can keep a fictitious page's wired count forever set to one and thereby avoid the use of PHYS_TO_VM_PAGE() when we know that we're working with a fictitious page, just not which one.
  In collaboration with: green@, tegge@
  PR: kern/29915
  (alc, 2004-05-22, 4 files, -18/+29)
* Restructure vm_page_select_cache() so that adding assertions is easy.
  Some of the conditions that caused vm_page_select_cache() to deactivate a page were wrong. For example, deactivating an unmanaged or wired page is a nop. Thus, if vm_page_select_cache() had ever encountered an unmanaged or wired page, it would have looped forever. Now, we assert that the page is neither unmanaged nor wired.
  (alc, 2004-05-12, 1 file, -10/+15)
* Cache queue pages are not mapped. Thus, the pmap_remove_all() by vm_pageout_scan()'s loop for freeing cache queue pages is unnecessary.
  (alc, 2004-05-12, 1 file, -1/+0)
* To handle orphaned character device vnodes properly in mmap(), check that v_mount is non-null before dereferencing it. If it's null, behave as if MNT_NOEXEC was not set on the mount that originally contained it.
  (tjr, 2004-05-11, 1 file, -1/+1)
* Cache queue pages are not mapped. Thus, the pmap_remove_all() by vm_page_alloc() is unnecessary.
  (alc, 2004-05-09, 1 file, -1/+0)
* In r1.190, vslock() and vsunlock() were bogusly made to do a "user wire" and a "system unwire." Make this a "system wire" and "system unwire."
  Reviewed by: alc
  (green, 2004-05-07, 1 file, -1/+1)
* Properly remove MAP_FUTUREWIRE when a vm_map_entry gets torn down. Previously, mlockall(2) usage would leak MAP_FUTUREWIRE of the process's vmspace::vm_map and subsequent processes would wire all of their memory. Coupled with a wired-page leak in vm_fault_unwire(), this would run the system out of free pages and cause programs to randomly SIGBUS when faulting in new pages. (Note that this is not the fix for the latter part; pages are still leaked when a wired area is unmapped in some cases.)
  Reviewed by: alc
  PR: kern/62930
  (green, 2004-05-07, 1 file, -0/+1)
* Make vm_page's PG_ZERO flag immutable between the time of the page's allocation and deallocation. This flag's principal use is shortly after allocation. For such cases, clearing the flag is pointless. The only unusual use of PG_ZERO is in vfs_bio_clrbuf(). However, allocbuf() never requests a prezeroed page. So, vfs_bio_clrbuf() never sees a prezeroed page.
  Reviewed by: tegge@
  (alc, 2004-05-06, 3 files, -8/+0)
* Zero the physical page only if it is invalid and not prezeroed.
  (alc, 2004-04-25, 1 file, -7/+9)
* Add a VM_OBJECT_LOCK_ASSERT() call. Remove splvm() and splx() calls. Move a comment.
  (alc, 2004-04-24, 1 file, -7/+5)
* Update the comment describing vm_page_grab() to reflect the previous revision and correct some of its style errors.
  (alc, 2004-04-24, 1 file, -6/+5)
* Push down the responsibility for zeroing a physical page from the caller to vm_page_grab(). Although this gives VM_ALLOC_ZERO a different meaning for vm_page_grab() than for vm_page_alloc(), I feel such change is necessary to accomplish other goals. Specifically, I want to make the PG_ZERO flag immutable between the time it is allocated by vm_page_alloc() and freed by vm_page_free() or vm_page_free_zero() to avoid locking overheads. Once we gave up on the ability to automatically recognize a zeroed page upon entry to vm_page_free(), the ability to mutate the PG_ZERO flag became useless. Instead, I would like to say that "Once a page becomes valid, its PG_ZERO flag must be ignored."
  (alc, 2004-04-24, 2 files, -2/+2)
* In cases where a file was resident in memory mmap(..., PROT_NONE, ...) would actually map the file with read access enabled. According to http://www.opengroup.org/onlinepubs/007904975/functions/mmap.html this is an error. Similarly, an madvise(..., MADV_WILLNEED) would enable read access on a virtual address range that was PROT_NONE.
  The solution implemented herein is (1) to pass a vm_prot_t to vm_map_pmap_enter() describing the allowed access and (2) to make vm_map_pmap_enter() responsible for understanding the limitations of pmap_enter_quick().
  Submitted by: "Mark W. Krentel" <krentel@dreamscape.com>
  PR: kern/64573
  (alc, 2004-04-24, 2 files, -5/+6)
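The symptom described in the entry above can be probed from userland. The following is a rough sketch under stated assumptions: a POSIX system with fork(2) and a writable temporary directory, and a hypothetical helper name `prot_none_read_faults()`. It maps a freshly written (and therefore probably resident) file page with PROT_NONE and lets a child process attempt a read, so that the expected fault does not kill the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical probe. Returns 1 if a read through a PROT_NONE file
 * mapping fails (the expected outcome), 0 if the read succeeds,
 * and -1 if the setup itself fails.
 */
static int
prot_none_read_faults(void)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    FILE *fp;
    char *buf;
    void *p;
    pid_t pid;
    int status, result = -1;

    if (pagesize <= 0 || (fp = tmpfile()) == NULL)
        return (-1);

    /* Write one zeroed page so the file has data behind the mapping. */
    buf = calloc(1, (size_t)pagesize);
    if (buf == NULL ||
        fwrite(buf, 1, (size_t)pagesize, fp) != (size_t)pagesize ||
        fflush(fp) != 0) {
        free(buf);
        fclose(fp);
        return (-1);
    }
    free(buf);

    p = mmap(NULL, (size_t)pagesize, PROT_NONE, MAP_SHARED, fileno(fp), 0);
    if (p == MAP_FAILED) {
        fclose(fp);
        return (-1);
    }

    pid = fork();
    if (pid == 0) {
        /* Child: this load should fault and never reach _exit(0). */
        volatile char c = *(volatile char *)p;
        (void)c;
        _exit(0);
    }
    if (pid > 0 && waitpid(pid, &status, 0) == pid)
        result = !(WIFEXITED(status) && WEXITSTATUS(status) == 0);

    (void)munmap(p, (size_t)pagesize);
    fclose(fp);
    return (result);
}
```

A return value of 0 would correspond to the bug described in the commit message: the page was readable even though the mapping was requested with PROT_NONE.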
* Push down Giant into vm_pager_get_pages(). The only get pages methods that require Giant are in the device and vnode pagers.
  (alc, 2004-04-23, 2 files, -0/+5)
* - pmap_kenter_temporary() is unused by machine-independent code. Therefore, move its declaration to the machine-dependent header file on those machines that use it. In principle, only i386 should have it. Alpha and AMD64 should use their direct virtual-to-physical mapping.
  - Remove pmap_kenter_temporary() from ia64. It is unused.
  Approved by: marcel@
  (alc, 2004-04-10, 1 file, -1/+0)
* The demise of vm_pager_map_page() in revision 1.93 of vm/vm_pager.c permits the reduction of the pager map's size by 8M bytes. In other words, eight megabytes of largely wasted KVA are returned to the kernel map for use elsewhere.
  (alc, 2004-04-08, 3 files, -6/+2)
* Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999.
  Approved by: core
  (imp, 2004-04-06, 24 files, -96/+0)
* Eliminate vm_pager_map_page() and vm_pager_unmap_page() and their uses. Use sf_buf_alloc() and sf_buf_free() instead.
  (alc, 2004-04-06, 3 files, -32/+12)
* Delay permission checks for VCHR vnodes until after vnode is locked in vm_mmap_vnode function, where we can safely check for a special /dev/zero case. Rev. 1.180 has reordered checks and introduced a regression.
  Submitted by: alc
  Was broken by: kan
  (kan, 2004-04-05, 1 file, -2/+7)
* Remove unused arguments from pmap_init().
  (alc, 2004-04-05, 2 files, -2/+2)
* Eliminate unused arguments from vm_page_startup().
  (alc, 2004-04-04, 3 files, -3/+3)
* Do not copy vm_exitingcnt to the new vmspace in vmspace_exec(). Copying it led to impossibly high values in the new vmspace, causing it to never drop to 0 and be freed.
  (tjr, 2004-03-23, 1 file, -1/+2)
* When mmap-ing a file from a noexec mount, be sure not to grant the right to mmap it PROT_EXEC. This also depends on the architecture, as some architectures (e.g. i386) do not distinguish between read and exec pages.
  Inspired by: http://linux.bkbits.net:8080/linux-2.4/cset@1.1267.1.85
  Reviewed by: alc
  (guido, 2004-03-18, 1 file, -1/+5)
* Make overflow/wraparound checking more robust and unbreak len=0 in vslock(), mlock(), and munlock().
  Reviewed by: bde
  (truckman, 2004-03-15, 2 files, -16/+22)
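The kind of check the entry above refers to can be sketched as follows. This is an illustration only, not the code from vslock(), mlock(), or munlock(): the helper name `sketch_wire_range()` and the fixed 4096-byte page size are assumptions made for the example. It rounds a byte range outward to page boundaries, rejects ranges whose end wraps past the top of the address space, and accepts a zero-length range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for this sketch; real kernels use PAGE_SIZE. */
#define SKETCH_PAGE_SIZE ((uintptr_t)4096)
#define SKETCH_PAGE_MASK (SKETCH_PAGE_SIZE - 1)

/*
 * Hypothetical helper: expand [addr, addr + len) to page boundaries.
 * Returns false if either addr + len or the round-up wraps around.
 * A zero-length request is valid and yields start <= end.
 */
static bool
sketch_wire_range(uintptr_t addr, size_t len, uintptr_t *startp,
    uintptr_t *endp)
{
    uintptr_t last, end;

    last = addr + (uintptr_t)len;   /* unsigned arithmetic wraps */
    if (last < addr)
        return (false);             /* addr + len wrapped */
    end = (last + SKETCH_PAGE_MASK) & ~SKETCH_PAGE_MASK;
    if (end < last)
        return (false);             /* rounding up wrapped */
    *startp = addr & ~SKETCH_PAGE_MASK;
    *endp = end;
    return (true);
}
```

Checking the sum before rounding, and the rounded value afterwards, covers both places where an unsigned overflow can silently turn a huge request into a small one.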
* Style(9) changes.
  Pointed out by: bde
  (truckman, 2004-03-15, 2 files, -40/+11)
* Revert to the original vslock() and vsunlock() API with the following exceptions:
  - Retain the recently added vslock() error return.
  - The type of the len argument should be size_t, not u_int.
  Suggested by: bde
  (truckman, 2004-03-15, 2 files, -33/+25)
* Remove redundant suser() check.
  (truckman, 2004-03-15, 1 file, -4/+0)
* Remove GIANT_REQUIRED from contigfree().
  (alc, 2004-03-13, 1 file, -1/+1)
* Part 2 of rev 1.68. Update comment to match reality now that vm_endcopy exists and we no longer copy to the end of the struct.
  Forgotten by: alfred and green
  (peter, 2004-03-12, 1 file, -1/+1)
* - Make the acquisition of Giant in vm_fault_unwire() conditional on the pmap. For the kernel pmap, Giant is not required. In general, for other pmaps, Giant is required by i386's pmap_pte() implementation. Specifically, the use of PMAP2/PADDR2 is synchronized by Giant. Note: In principle, updates to the kernel pmap's wired count could be lost without Giant. However, in practice, we never use the kernel pmap's wired count. This will be resolved when pmap locking appears.
  - With the above change, cpu_thread_clean() and uma_large_free() need not acquire Giant. (The first case is simply the revival of i386/i386/vm_machdep.c's revision 1.226 by peter.)
  (alc, 2004-03-10, 2 files, -13/+5)
* Implement a work around for the deadlock avoidance case in vm_object_deallocate() so that it doesn't spin forever either.
  Submitted by: bde
  (alc, 2004-03-08, 1 file, -0/+7)
* Retire pmap_pinit2(). Alpha was the last platform that used it. However, ever since alpha/alpha/pmap.c revision 1.81 introduced the list allpmaps, there has been no reason for having this function on Alpha. Briefly, when pmap_growkernel() relied upon the list of all processes to find and update the various pmaps to reflect a growth in the kernel's valid address space, pmap_pinit2() served to avoid a race between pmap initialization and pmap_growkernel(). Specifically, pmap_pinit2() was responsible for initializing the kernel portions of the pmap and pmap_pinit2() was called after the process structure contained a pointer to the new pmap for use by pmap_growkernel(). Thus, an update to the kernel's address space might be applied to the new pmap unnecessarily, but an update would never be lost.
  (alc, 2004-03-07, 3 files, -6/+0)
* Mark uma_callout as CALLOUT_MPSAFE, as uma_timeout can run MPSAFE.
  Reviewed by: jeff
  (rwatson, 2004-03-07, 1 file, -1/+1)
* Undo the merger of mlock()/vslock and munlock()/vsunlock() and the introduction of kern_mlock() and kern_munlock() in

      src/sys/kern/kern_sysctl.c 1.150
      src/sys/vm/vm_extern.h 1.69
      src/sys/vm/vm_glue.c 1.190
      src/sys/vm/vm_mmap.c 1.179

  because different resource limits are appropriate for transient and "permanent" page wiring requests.

  Retain the kern_mlock() and kern_munlock() API in the revived vslock() and vsunlock() functions.

  Combine the best parts of each of the original sets of implementations with further code cleanup. Make the mlock() and vslock() implementations as similar as possible.

  Retain the RLIMIT_MEMLOCK check in mlock(). Move the most stringent test, which can return EAGAIN, last so that requests that have no hope of ever being satisfied will not be retried unnecessarily.

  Disable the test that can return EAGAIN in the vslock() implementation because it will cause the sysctl code to wedge.

  Tested by: Cy Schubert <Cy.Schubert AT komquats.com>
  (truckman, 2004-03-05, 3 files, -50/+113)
* In the last revision, I introduced a physical contiguity check that is both unnecessary and wrong. While it is necessary to verify that the page is still free after dropping and reacquiring the free page queue lock, the physical contiguity of the page can not change, making this check unnecessary. This check was wrong in that it could cause an out-of-bounds array access.
  Tested by: rwatson
  (alc, 2004-03-05, 1 file, -3/+1)
* Record exactly where this file was copied from. It wasn't repo-copied so this is not very obvious.
  Fixed some style bugs (mainly missing parentheses around return values).
  (bde, 2004-03-04, 1 file, -12/+12)
* Minor style fixes. In vm_daemon(), don't fetch the rss limit long before it is needed.
  (bde, 2004-03-04, 1 file, -9/+8)