path: root/sys/vm
Commit log (most recent first). Each entry: commit message  [author, date, files changed, lines -/+]
* Correct contigmalloc2()'s implementation of M_ZERO.  [alc, 2007-04-19, 1 file, -1/+1]
  Specifically, contigmalloc2() was always testing the first physical page
  for PG_ZERO, not the current page of interest.
  Submitted by: Michael Plass
  PR: 81301
  MFC after: 1 week
* Correct two comments.  [alc, 2007-04-19, 1 file, -2/+2]
  Submitted by: Michael Plass
* Minor typo fix, noticed while I was going through *_pager.c files.  [keramida, 2007-04-10, 1 file, -1/+1]
* When KVA is exhausted, try the vm_lowmem event one last time before
  panicking.  [pjd, 2007-04-05, 1 file, -4/+14]
  This helps a lot with ZFS stability.
* Fix a problem for file systems that don't implement the VOP_BMAP()
  operation.  [pjd, 2007-04-05, 1 file, -0/+2]
  The problem is this: vm_fault_additional_pages() calls
  vm_pager_has_page(), which calls vnode_pager_haspage(). When VOP_BMAP()
  returns an error (e.g. EOPNOTSUPP), vnode_pager_haspage() returns TRUE
  without initializing the 'before' and 'after' arguments, so they hold
  accidental values. This basically was causing the following condition to
  be met (with random values in the rahead and rbehind variables):

      if ((rahead + rbehind) >
          ((cnt.v_free_count + cnt.v_cache_count) - cnt.v_free_reserved)) {
              pagedaemon_wakeup();
              [...]
      }

  I'm not entirely sure this is the right fix; maybe we should just return
  FALSE in vnode_pager_haspage() when VOP_BMAP() fails? alc@ knows about
  this problem; maybe he will be able to come up with a better fix if this
  is not the right one.
* Prevent a race between vm_object_collapse() and vm_object_split() from
  causing a crash.  [alc, 2007-03-27, 1 file, -0/+8]
  Suppose that we have two objects, obj and backing_obj, where backing_obj
  is obj's backing object. Further, suppose that backing_obj has a
  reference count of two: one reference held by obj and the other by a map
  entry. Now, suppose that the map entry is deallocated and its reference
  removed by vm_object_deallocate(). vm_object_deallocate() recognizes
  that the only remaining reference is from a shadow object, obj, and
  calls vm_object_collapse() on obj. vm_object_collapse() executes

      if (backing_object->ref_count == 1) {
              /*
               * If there is exactly one reference to the backing
               * object, we can collapse it into the parent.
               */
              vm_object_backing_scan(object, OBSC_COLLAPSE_WAIT);

  and vm_object_backing_scan(OBSC_COLLAPSE_WAIT) executes

      if (op & OBSC_COLLAPSE_WAIT) {
              vm_object_set_flag(backing_object, OBJ_DEAD);
      }

  Finally, suppose that either vm_object_backing_scan() or
  vm_object_collapse() sleeps, releasing its locks. At this instant,
  another thread executes vm_object_split(). It crashes in
  vm_object_reference_locked() on the assertion that the object is not
  dead. If, however, assertions are not enabled, it crashes much later,
  after the object has been recycled, in vm_object_deallocate(), because
  the shadow count and shadow list are inconsistent.
  Reviewed by: tegge
  Reported by: jhb
  MFC after: 1 week
* Two small changes to vm_map_pmap_enter():  [alc, 2007-03-25, 1 file, -4/+3]
  1) Eliminate an unnecessary check for fictitious pages. Specifically,
     only device-backed objects contain fictitious pages, and this object
     is not device-backed.
  2) Change the types of "psize" and "tmpidx" to vm_pindex_t in order to
     prevent possible wraparound with extremely large maps and objects,
     respectively.
  Observed by: tegge (last summer)
* vm_page_busy() no longer requires the page queues lock to be
  held.  [alc, 2007-03-23, 1 file, -2/+2]
  Reduce the scope of the page queues lock in vm_fault() accordingly.
* Change the order of lock reacquisition in vm_object_split() in order to
  simplify the code slightly.  [alc, 2007-03-22, 1 file, -2/+5]
  Add a comment concerning lock ordering.
* Use PCPU_LAZY_INC() to update page fault statistics.  [alc, 2007-03-05, 1 file, -6/+6]
* Use pause() in vm_object_deallocate() to yield the CPU to the lock
  holder rather than a tsleep() on &proc0.  [jhb, 2007-02-27, 1 file, -1/+1]
  The only wakeup on &proc0 is intended to awaken the swapper, not random
  threads blocked in vm_object_deallocate().
* Use pause() rather than tsleep() on stack variables and function
  pointers.  [jhb, 2007-02-27, 1 file, -2/+1]
* Change the way that unmanaged pages are created.  [alc, 2007-02-25, 6 files, -48/+11]
  Specifically, immediately flag any page that is allocated to an
  OBJT_PHYS object as unmanaged in vm_page_alloc() rather than waiting for
  a later call to vm_page_unmanage(). This allows for the elimination of
  some uses of the page queues lock.
  Change the type of the kernel and kmem objects from OBJT_DEFAULT to
  OBJT_PHYS. This allows us to take advantage of the above change to
  simplify the allocation of unmanaged pages in kmem_alloc() and
  kmem_malloc().
  Remove vm_page_unmanage(). It is no longer used.
* Change the page's CLEANCHK flag from being a page queue mutex
  synchronized flag to a vm object mutex synchronized
  flag.  [alc, 2007-02-22, 2 files, -16/+16]
* Enable vm_page_free() and vm_page_free_zero() to be called on some pages
  without the page queues lock being held, specifically, pages that are
  not contained in a vm object and not a member of a page
  queue.  [alc, 2007-02-18, 1 file, -2/+4]
* Remove a stale comment. Add punctuation to a nearby
  comment.  [alc, 2007-02-17, 1 file, -6/+1]
* Relax the page queue lock assertions in vm_page_remove() and
  vm_page_free_toq() to account for recent changes that allow
  vm_page_free_toq() to be called on some pages without the page queues
  lock being held, specifically, pages that are not contained in a vm
  object and not a member of a page queue.  [alc, 2007-02-15, 1 file, -2/+3]
  (Examples of such pages include page table pages, pv entry pages, and
  uma small alloc pages.)
* Avoid the unnecessary acquisition of the free page queues lock when a
  page is actually being added to the hold queue, not the free
  queue.  [alc, 2007-02-14, 1 file, -4/+5]
  At the same time, avoid unnecessary tests to wake up threads waiting for
  free memory and the idle thread that zeroes free pages. (These tests
  will be performed later, when the page finally moves from the hold queue
  to the free queue.)
* Add a uma_set_align() interface, which will be called at most once
  during boot by MD code to indicate the detected alignment
  preference.  [rwatson, 2007-02-11, 2 files, -2/+26]
  Rather than cache alignment being encoded in UMA consumers by defining a
  global alignment value of (16 - 1) in UMA_ALIGN_CACHE, UMA_ALIGN_CACHE
  is now a special value (-1) that causes UMA to look at the registered
  alignment. If no preferred alignment has been selected by MD code, a
  default alignment of (16 - 1) will be used.
  Currently, no hardware platforms specify alignment; architecture
  maintainers will need to modify MD startup code to specify an alignment
  if desired. This must occur before initialization of UMA so that all UMA
  zones pick up the requested alignment.
  Reviewed by: jeff, alc
  Submitted by: attilio
* Use the free page queue mutex instead of the page queue mutex to
  synchronize sleeping and waking of the zero idle
  thread.  [alc, 2007-02-11, 2 files, -7/+6]
* - Move 'struct swdevt' back into swap_pager.h and expose it to
    userland.  [jhb, 2007-02-07, 2 files, -31/+32]
  - Restore support for fetching swap information from crash dumps via
    kvm_get_swapinfo(3) to fix pstat -T/-s on crash dumps.
  Reviewed by: arch@, phk
  MFC after: 1 week
* Change the pagedaemon, vm_wait(), and vm_waitpfault() to sleep on the
  vm page queue free mutex instead of the vm page queue
  mutex.  [alc, 2007-02-07, 2 files, -15/+21]
* Change the free page queue lock from a spin mutex to a default
  (blocking) mutex.  [alc, 2007-02-05, 4 files, -22/+22]
  With the demise of Alpha support, there is no longer a reason for it to
  be a spin mutex.
* Fix problems that occur when all mbuf clusters migrate to the mbuf
  packet zone.  [mohans, 2007-01-25, 2 files, -2/+10]
  Cluster allocations fail when this happens, and processes that may have
  blocked on cluster allocations will never be woken up. Thanks to rwatson
  for an overview of the issue and pointers to the mbuma paper and his
  tool to dump out UMA zones.
  Reviewed by: andre@
* Fix a bug where only one process (of multiple) blocked on maxpages on a
  zone is woken up, with the rest never being woken up as a result of the
  ZFLAG_FULL flag being cleared.  [mohans, 2007-01-24, 1 file, -2/+7]
  Wake up all such blocked processes instead. This change introduces a
  thundering herd, but since this should be relatively infrequent,
  optimizing it (by introducing a count of blocked processes, for example)
  may be premature.
  Reviewed by: ups@
* - Remove setrunqueue and replace it with direct calls to
    sched_add().  [jeff, 2007-01-23, 2 files, -3/+3]
    setrunqueue() was mostly empty. The few asserts and thread state
    settings were moved to the individual schedulers. sched_add() was
    chosen to displace it for naming consistency reasons.
  - Remove adjustrunqueue; it was 4 lines of code that was ifdef'd to be
    different on all three schedulers, where it was only called in one
    place each.
  - Remove the long ifdef'd out remrunqueue code.
  - Remove the now redundant ts_state. Inspect the thread state directly.
  - Don't set TSF_* flags from kern_switch.c; we were only doing this to
    support a feature in one scheduler.
  - Change sched_choose() to return a thread rather than a td_sched.
    Also, rely on the schedulers to return the idlethread. This simplifies
    the logic in choosethread(). Aside from the run queue links,
    kern_switch.c mostly does not care about the contents of td_sched.
    Discussed with: julian
  - Move the idle thread loop into the per-scheduler area. ULE wants to do
    something different from the other schedulers.
  Suggested by: jhb
  Tested on: x86/amd64 sched_{4BSD, ULE, CORE}.
* Use FOREACH_PROC_IN_SYSTEM instead of its unrolled
  form.  [delphij, 2007-01-17, 2 files, -2/+2]
* Remove the uma_zalloc_arg() hack, which coerced M_WAITOK to M_NOWAIT
  when allocations were made using improper flags in interrupt
  context.  [rwatson, 2007-01-10, 1 file, -22/+3]
  Replace it with a simple WITNESS warning call. This restores the
  invariant that M_WAITOK allocations will always succeed or die horribly
  trying, which is relied on by many UMA consumers.
  MFC after: 3 weeks
  Discussed with: jhb
* Declare the map entry created by kmem_init() for the range from
  VM_MIN_KERNEL_ADDRESS to the end of the kernel's bootstrap data as
  MAP_NOFAULT.  [alc, 2007-01-07, 1 file, -1/+2]
* - Add a new function, uma_zone_exhausted(), to see if a zone is
    full.  [jhb, 2007-01-05, 3 files, -0/+25]
  - Add a printf in swp_pager_meta_build() to warn if the swap zone
    becomes exhausted, so that there's at least a warning before a box
    that runs out of swap zone space before running out of swap space
    deadlocks.
  MFC after: 1 week
  Reviewed by: alc
* Optimize vm_object_split().  [alc, 2006-12-17, 1 file, -9/+14]
  Specifically, make the number of iterations equal to the number of
  physical pages that are renamed to the new object rather than the new
  object's virtual size.
* Simplify the computation of the new object's size in
  vm_object_split().  [alc, 2006-12-16, 1 file, -3/+2]
* Remove the requirement that phys_avail be sorted in ascending order by
  explicitly finding the lowest and highest addresses when calculating the
  size of the vm_pages array.  [kmacy, 2006-12-08, 1 file, -2/+10]
  Reviewed by: alc
* Threading cleanup, part 2 of several.  [julian, 2006-12-06, 2 files, -43/+3]
  Make part of John Birrell's KSE patch permanent. Specifically, remove:
  - Any reference to the ksegrp structure. This feature was never fully
    utilised and made things overly complicated.
  - All code in the scheduler that tried to make threaded programs fair to
    unthreaded programs. Libpthread processes will already do this to some
    extent, and libthr processes already disable it.
  Also, since this makes such a big change to the scheduler(s), take the
  opportunity to rename some structures and elements that had to be moved
  anyhow. This makes the code a lot more readable.
  The ULE scheduler compiles again, but I have no idea if it works. The
  4bsd scheduler still requires a little cleaning, and some functions that
  now do ALMOST nothing will go away, but I thought I'd do that as a
  separate commit.
  Tested by: David Xu and Dan Eischen, using libthr and libpthread
* The clean_map has been made local to vm_init.c long
  ago.  [ru, 2006-11-20, 1 file, -1/+0]
* Remove a redundant pointer-type variable.  [ru, 2006-11-20, 1 file, -19/+18]
* When counting vm totals, skip unreferenced objects, including vnodes
  representing mounted file systems.  [ru, 2006-11-20, 1 file, -0/+7]
  Reviewed by: alc
  MFC after: 3 days
* There is no point in setting PG_REFERENCED on kmem_object pages because
  they are "unmanaged", i.e., non-pageable,
  pages.  [alc, 2006-11-13, 1 file, -6/+1]
  Remove a stale comment.
* Make pmap_enter() responsible for setting PG_WRITEABLE instead of its
  caller.  [alc, 2006-11-12, 2 files, -8/+3]
  (As a beneficial side effect, a high-contention acquisition of the page
  queues lock in vm_fault() is eliminated.)
* I misplaced the assertion that was added to vm_page_startup() in the
  previous change. Correct its placement.  [alc, 2006-11-08, 1 file, -6/+6]
* Simplify the construction of the free queues in
  vm_page_startup().  [alc, 2006-11-08, 1 file, -2/+12]
  Add an assertion to test a hypothesis concerning other redundant
  computation in vm_page_startup().
* Ensure that the page's oflags field is initialized by
  contigmalloc().  [alc, 2006-11-08, 1 file, -0/+1]
* Sweep the kernel, replacing suser(9) calls with priv(9) calls and
  assigning specific privilege names to a broad range of
  privileges.  [rwatson, 2006-11-06, 2 files, -10/+11]
  These may require some future tweaking.
  Sponsored by: nCircle Network Security, Inc.
  Obtained from: TrustedBSD Project
  Discussed on: arch@
  Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
    Alex Lyashkov <umka at sevcity dot net>,
    Skip Ford <skip dot ford at verizon dot net>,
    Antoine Brodin <antoine dot brodin at laposte dot net>
* Make KSE a kernel option, turned on by default in all GENERIC kernel
  configs except sun4v (which doesn't process signals properly with
  KSE).  [jb, 2006-10-26, 2 files, -1/+36]
  Reviewed by: davidxu@
* Better align the output of "show uma" by moving from displaying the
  basic counters of allocs/frees/use for each zone to the same statistics
  shown by userspace "vmstat -z".  [rwatson, 2006-10-26, 1 file, -5/+7]
  MFC after: 3 days
* The page queues lock is no longer required by
  vm_page_wakeup().  [alc, 2006-10-23, 4 files, -8/+8]
* The page queues lock is no longer required by vm_page_busy() or
  vm_page_wakeup().  [alc, 2006-10-22, 2 files, -5/+4]
  Reduce or eliminate its use accordingly.
* Complete the break-out of sys/sys/mac.h into
  sys/security/mac/mac_framework.h, begun with a repo-copy of mac.h to
  mac_framework.h.  [rwatson, 2006-10-22, 2 files, -2/+4]
  sys/mac.h now contains the userspace and user<->kernel API and
  definitions, with all in-kernel interfaces moved to mac_framework.h,
  which is now included across most of the kernel instead.
  This change is the first step in a larger cleanup and sweep of MAC
  Framework interfaces in the kernel, and will not be MFC'd.
  Obtained from: TrustedBSD Project
  Sponsored by: SPARTA
* Replace PG_BUSY with VPO_BUSY.  [alc, 2006-10-22, 8 files, -51/+59]
  In other words, changes to the page's busy flag, i.e., VPO_BUSY, are now
  synchronized by the per-vm-object lock instead of the global page queues
  lock.
* Eliminate unnecessary PG_BUSY tests.  [alc, 2006-10-21, 2 files, -2/+2]
  They originally served a purpose that is now handled by vm object
  locking.