path: root/sys/vm
Commit message | Author | Age | Files | Lines
* With the removal of kern/uipc_jumbo.c and sys/jumbo.h, vm_object_allocate_wait() is not used. Remove it.
  (alc, 2004-12-08; 2 files, -22/+5)
* Almost nine years ago, when support for 1TB files was introduced in revision 1.55, the address parameter to vnode_pager_addr() was changed from an unsigned 32-bit quantity to a signed 64-bit quantity. However, an out-of-range check on the address was not updated. Consequently, memory-mapped I/O on files greater than 2GB could cause a kernel panic. Since the address is now a signed 64-bit quantity, the problem resolution is simply to remove a cast.
  Reviewed by: bde@ and tegge@
  PR: 73010
  MFC after: 1 week
  (alc, 2004-12-07; 1 file, -1/+1)
* Correct a sanity check in vnode_pager_generic_putpages(). The cast used to implement the sanity check should have been changed when we converted the implementation of vm_pindex_t from 32 to 64 bits. (Thus, RELENG_4 is not affected.) The consequence of this error would be a legitimate write to an extremely large file being treated as an errant attempt to write metadata.
  Discussed with: tegge@
  (alc, 2004-12-05; 1 file, -1/+1)
* Don't include sys/user.h merely for its side-effect of recursively including other headers.
  (das, 2004-11-27; 1 file, -2/+0)
* Remove useless casts.
  (cognet, 2004-11-26; 1 file, -2/+2)
* Try to close a potential, but serious race in our VM subsystem. Historically, our contigmalloc1() and contigmalloc2() assumed that a page in PQ_CACHE can be unconditionally reused by busying and freeing it. Unfortunately, when the object happens to be non-NULL, the code will set m->object to NULL and disregard the fact that the page is actually in the VM page bucket, resulting in page bucket hash table corruption and finally, a filesystem corruption, or a 'page not in hash' panic. This commit borrows the idea from DragonFlyBSD's fix to the VM by Matthew Dillon[1]. This version of the patch does the following checks:
  - When scanning pages in PQ_CACHE, check hold_count and skip over pages that are held temporarily.
  - For pages in PQ_CACHE and selected as candidates for being freed, check if the page is busy at that time.
  Note: it seems that this might be unrelated to kern/72539.
  Obtained from: DragonFlyBSD, sys/vm/vm_contig.c,v 1.11 and 1.12 [1]
  Reminded by: Matt Dillon
  Reworked by: alc
  MFC after: 1 week
  (delphij, 2004-11-24; 1 file, -2/+15)
* Disable U area swapping and remove the routines that create, destroy, copy, and swap U areas.
  Reviewed by: arch@
  (das, 2004-11-20; 4 files, -206/+0)
* Make VOP_BMAP return a struct bufobj for the underlying storage device instead of a vnode for it. The vnode_pager does not and should not have any interest in what the filesystem uses for backend. (vfs_cluster doesn't use the backing store argument.)
  (phk, 2004-11-15; 1 file, -10/+13)
* Add pbgetbo()/pbrelbo(), lighter-weight versions of pbgetvp()/pbrelvp().
  (phk, 2004-11-15; 1 file, -0/+42)
* More kasserts.
  (phk, 2004-11-15; 1 file, -1/+6)
* Style polishing.
  (phk, 2004-11-15; 1 file, -7/+3)
* Move pbgetvp() and pbrelvp() to vm_pager.c with the rest of the pbuf stuff.
  (phk, 2004-11-15; 1 file, -0/+44)
* Expect the caller to have called pbrelvp() if necessary.
  (phk, 2004-11-15; 1 file, -3/+0)
* Explicitly call pbrelvp().
  (phk, 2004-11-15; 1 file, -0/+2)
* Improve readability with a bunch of typedefs for the pager ops. These can also be used for prototypes in the pagers.
  (phk, 2004-11-09; 1 file, -7/+15)
* #include <vm/vm_param.h> instead of <machine/vmparam.h> (the former includes the latter, but also declares variables which are defined in kern/subr_param.c). Change some VM parameters from quad_t to unsigned long. They refer to quantities (size limits for text, heap and stack segments) which must necessarily be smaller than the size of the address space, so long is adequate on all platforms.
  MFC after: 1 week
  (des, 2004-11-08; 1 file, -6/+6)
* Eliminate an unnecessary atomic operation. Articulate the rationale in a comment.
  (alc, 2004-11-06; 1 file, -4/+11)
* Abstract the logic to look up the uma_bucket_zone given a desired number of entries into bucket_zone_lookup(), which makes the logic of consumers of bucket zones clearer. Annotate the behavior of bucket_init() with a comment indicating how the various data structures, including the bucket lookup tables, are initialized.
  (rwatson, 2004-11-06; 1 file, -7/+23)
* Remove dangling variable.
  (phk, 2004-11-06; 1 file, -1/+0)
* Annotate what the bucket_size[] array does; staticize it since it's used only in uma_core.c.
  (rwatson, 2004-11-06; 1 file, -1/+5)
* Fix the last known race in swapoff(), which could lead to a spurious panic:
      swapoff: failed to locate %d swap blocks
  The race occurred because putpages() can block between the time it allocates swap space and the time it updates the swap metadata to associate that space with a vm_object, so swapoff() would complain about the temporary inconsistency. I hoped to fix this by making swp_pager_getswapspace() and swp_pager_meta_build() a single atomic operation, but that proved to be inconvenient. With this change, swapoff() simply doesn't attempt to be so clever about detecting when all the pageout activity to the target device should have drained.
  (das, 2004-11-06; 1 file, -21/+14)
* Move a call to wakeup() from vm_object_terminate() to vnode_pager_dealloc() because this call is only needed to wake threads that slept when they discovered a dead object connected to a vnode. To eliminate unnecessary calls to wakeup() by vnode_pager_dealloc(), introduce a new flag, OBJ_DISCONNECTWNT.
  Reviewed by: tegge@
  (alc, 2004-11-06; 3 files, -2/+6)
* - Set the priority of the page zeroing thread using sched_prio() when the thread is created rather than adjusting the priority in the main function. (kthread_create() should probably take the initial priority as an argument.)
  - Only yield the CPU in the !PREEMPTION case if there are any other runnable threads. Yielding when there isn't anything else better to do just wastes time in pointless context switches (albeit while the system is idle).
  (jhb, 2004-11-05; 1 file, -14/+5)
* During traversal of the inactive queue, try locking the page's containing object before accessing the page's flags or the object's reference count.
  (alc, 2004-11-05; 1 file, -4/+9)
* Eliminate another unnecessary call to vm_page_busy() that immediately precedes a call to vm_page_rename(). (See the previous revision for a detailed explanation.)
  (alc, 2004-11-05; 1 file, -1/+0)
* Close a race in swapoff(). Here are the gory details: In order to avoid livelock, swapoff() skips over objects with a nonzero pip count and makes another pass if necessary. Since it is impossible to know which objects we care about, it would choose an arbitrary object with a nonzero pip count and wait for it before making another pass, the theory being that this object would finish paging about as quickly as the ones we care about. Unfortunately, we may have slept since we acquired a reference to this object. Hack around this problem by tsleep()ing on the pointer anyway, but timeout after a fixed interval. More elegant solutions are possible, but the ones I considered unnecessarily complicate this rare case.
  Also, kill some nits that seem to have crept into the swapoff() code in the last 75 revisions or so:
  - Don't pass both sp and sp->sw_used to swap_pager_swapoff(), since the latter can be derived from the former.
  - Replace swp_pager_find_dev() with something simpler. There's no need to iterate over the entire list of swap devices just to determine if a given block is assigned to the one we're interested in.
  - Expand the scope of the swhash_mtx in a couple of places so that it isn't released and reacquired once for every hash bucket.
  - Don't drop the swhash_mtx while holding a reference to an object. We need to lock the object first. Unfortunately, doing so would violate the established lock order, so use VM_OBJECT_TRYLOCK() and try again on a subsequent pass if the object is already locked.
  - Refactor swp_pager_force_pagein() and swap_pager_swapoff() a bit.
  (das, 2004-11-05; 1 file, -70/+53)
* Retire b_magic now; we have the bufobj containing the same hint.
  (phk, 2004-11-04; 1 file, -1/+0)
* De-couple our I/O bio request from the embedded bio in buf by explicitly copying the fields.
  (phk, 2004-11-04; 1 file, -1/+6)
* Remove buf->b_dev field.
  (phk, 2004-11-04; 1 file, -4/+2)
* The synchronization provided by vm object locking has eliminated the need for most calls to vm_page_busy(). Specifically, most calls to vm_page_busy() occur immediately prior to a call to vm_page_remove(). In such cases, the containing vm object is locked across both calls. Consequently, the setting of the vm page's PG_BUSY flag is not even visible to other threads that are following the synchronization protocol. This change (1) eliminates the calls to vm_page_busy() that immediately precede a call to vm_page_remove() or functions, such as vm_page_free() and vm_page_rename(), that call it and (2) relaxes the requirement in vm_page_remove() that the vm page's PG_BUSY flag is set. Now, the vm page's PG_BUSY flag is set only when the vm object lock is released while the vm page is still in transition. Typically, this is when it is undergoing I/O.
  (alc, 2004-11-03; 5 files, -23/+5)
* Introduce a Boolean variable wakeup_needed to avoid repeated, unnecessary calls to wakeup() by vm_page_zero_idle_wakeup().
  (alc, 2004-10-31; 1 file, -2/+9)
* During traversal of the active queue by vm_pageout_page_stats(), try locking the page's containing object before accessing the page's flags.
  (alc, 2004-10-30; 1 file, -1/+10)
* Eliminate an unused but initialized variable.
  (alc, 2004-10-30; 1 file, -2/+0)
* Add an assignment statement that I omitted from the previous revision.
  (alc, 2004-10-30; 1 file, -0/+1)
* Assert that the containing vm object is locked in vm_page_cache() and vm_page_try_to_cache().
  (alc, 2004-10-28; 1 file, -0/+2)
* Fix an INVARIANTS-only bug introduced in revision 1.104: if INVARIANTS is defined, and in the rare case that we have allocated some objects from the slab and at least one initializer on at least one of those objects failed, and we need to fail the allocation and push the uninitialized items back into the slab caches -- in that scenario, we would fail to [re]set the bucket cache's ub_bucket item references to NULL, which would eventually trigger a KASSERT.
  (bmilekic, 2004-10-27; 1 file, -1/+5)
* During traversal of the active queue, try locking the page's containing object before accessing the page's flags or the object's reference count. If the trylock fails, handle the page as though it is busy.
  (alc, 2004-10-27; 1 file, -4/+12)
* Also check that the sectormask is bigger than zero. Wrap this overly long KASSERT and remove newline.
  (phk, 2004-10-26; 1 file, -1/+3)
* Put the I/O block size in bufobj->bo_bsize. We keep si_bsize_phys around for now as that is the simplest way to pull the number out of disk device drivers in devfs_open(). The correct solution would be to do an ioctl(DIOCGSECTORSIZE), but the point is probably moot when filesystems sit on GEOM, so don't bother for now.
  (phk, 2004-10-26; 1 file, -1/+1)
* Don't clear flags we just checked were not set.
  (phk, 2004-10-26; 1 file, -1/+0)
* Assert that the containing vm object is locked in vm_page_flash().
  (alc, 2004-10-25; 1 file, -0/+2)
* Assert that the containing vm object is locked in vm_page_busy() and vm_page_wakeup().
  (alc, 2004-10-24; 1 file, -0/+4)
* Move the buffer method vector (buf->b_op) to the bufobj. Extend it with a strategy method. Add bufstrategy(), which does the usual VOP_SPECSTRATEGY/VOP_STRATEGY song and dance. Rename ibwrite to bufwrite(). Move the two NFS buf_ops to more sensible places and add bufstrategy to them. Add inlines for bwrite() and bstrategy() which call through buf->b_bufobj->b_ops->b_{write,strategy}(). Replace almost all VOP_STRATEGY()/VOP_SPECSTRATEGY() calls with bstrategy().
  (phk, 2004-10-24; 3 files, -10/+3)
* Acquire the vm object lock before rather than after calling vm_page_sleep_if_busy(). (The motivation being to transition synchronization of the vm_page's PG_BUSY flag from the global page queues lock to the per-object lock.)
  (alc, 2004-10-24; 1 file, -4/+5)
* Use VM_ALLOC_NOBUSY instead of calling vm_page_wakeup().
  (alc, 2004-10-24; 2 files, -11/+3)
* Introduce VM_ALLOC_NOBUSY, an option to vm_page_alloc() and vm_page_grab() that indicates that the caller does not want a page with its busy flag set. In many places, the global page queues lock is acquired and released just to clear the busy flag on a just-allocated page. Both the allocation of the page and the clearing of the busy flag occur while the containing vm object is locked. So, the busy flag might as well never be set.
  (alc, 2004-10-24; 2 files, -2/+4)
* Add b_bufobj to struct buf, which eventually will eliminate the need for b_vp. Initialize b_bufobj for all buffers. Make incore() and gbincore() take a bufobj instead of a vnode. Make inmem() local to vfs_bio.c. Change a lot of VI_[UN]LOCK(bp->b_vp) to BO_[UN]LOCK(bp->b_bufobj), and VI_MTX() to BO_MTX(). Make buf_vlist_add() take a bufobj instead of a vnode. Eliminate other uses of bp->b_vp where bp->b_bufobj will do. Various minor polishing: remove "register", turn panic into KASSERT, use new function declarations, TAILQ_FOREACH_SAFE(), etc.
  (phk, 2004-10-22; 1 file, -4/+3)
* Move the VI_BWAIT flag into the bo_flag element of bufobj and call it BO_WWAIT. Add bufobj_wref(), bufobj_wdrop() and bufobj_wwait() to handle the write count on a bufobj. bufobj_wdrop() replaces vwakeup(). Use these functions in all relevant places except in ffs_softdep.c, where the use of interlocked_sleep() makes this impossible. Rename b_vnbufs to b_bobufs now that we touch all the relevant files anyway.
  (phk, 2004-10-21; 1 file, -12/+3)
* Correct two errors in PG_BUSY management by vm_page_cowfault(). Both errors are in rarely executed paths.
  1. Each time the retry_alloc path is taken, the PG_BUSY flag must be set again. Otherwise vm_page_remove() panics.
  2. There is no need to set PG_BUSY on the newly allocated page before freeing it. The page already has PG_BUSY set by vm_page_alloc(). Setting it again could cause an assertion failure.
  MFC after: 2 weeks
  (alc, 2004-10-18; 1 file, -2/+1)
* Assert that the containing object is locked in vm_page_io_start() and vm_page_io_finish(). The motivation being to transition synchronization of the vm_page's busy field from the global page queues lock to the per-object lock.
  (alc, 2004-10-17; 1 file, -0/+2)