path: root/sys/vm
Commit message | Author | Age | Files | Lines
* Make VOP_BMAP return a struct bufobj for the underlying storage device | phk | 2004-11-15 | 1 | -10/+13
    instead of a vnode for it.
    The vnode_pager does not and should not have any interest in what the filesystem uses for backend. (vfs_cluster doesn't use the backing store argument.)
* Add pbgetbo()/pbrelbo() lighter weight versions of pbgetvp()/pbrelvp(). | phk | 2004-11-15 | 1 | -0/+42
* More kasserts. | phk | 2004-11-15 | 1 | -1/+6
* style polishing. | phk | 2004-11-15 | 1 | -7/+3
* Move pbgetvp() and pbrelvp() to vm_pager.c with the rest of the pbuf stuff. | phk | 2004-11-15 | 1 | -0/+44
* expect the caller to have called pbrelvp() if necessary. | phk | 2004-11-15 | 1 | -3/+0
* Explicitly call pbrelvp() | phk | 2004-11-15 | 1 | -0/+2
* Improve readability with a bunch of typedefs for the pager ops. | phk | 2004-11-09 | 1 | -7/+15
    These can also be used for prototypes in the pagers.
* #include <vm/vm_param.h> instead of <machine/vmparam.h> (the former | des | 2004-11-08 | 1 | -6/+6
    includes the latter, but also declares variables which are defined in kern/subr_param.c).
    Change some VM parameters from quad_t to unsigned long. They refer to quantities (size limits for text, heap and stack segments) which must necessarily be smaller than the size of the address space, so long is adequate on all platforms.
    MFC after: 1 week
* Eliminate an unnecessary atomic operation. Articulate the rationale in | alc | 2004-11-06 | 1 | -4/+11
    a comment.
* Abstract the logic to look up the uma_bucket_zone given a desired | rwatson | 2004-11-06 | 1 | -7/+23
    number of entries into bucket_zone_lookup(), which helps make more clear the logic of consumers of bucket zones.
    Annotate the behavior of bucket_init() with a comment indicating how the various data structures, including the bucket lookup tables, are initialized.
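Illustration only, not the code from uma_core.c: a minimal C sketch of what a lookup from a desired entry count to a bucket zone can look like. The struct layout, the zone sizes, and the linear scan are all assumptions; the commit itself mentions lookup tables, which this sketch does not reproduce.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bucket zone descriptor; the real UMA structure differs. */
struct bucket_zone {
	const char *name;
	int         entries;	/* capacity of buckets drawn from this zone */
};

/* Illustrative capacities, kept in ascending order. */
static struct bucket_zone bucket_zones[] = {
	{ "16 Bucket",  16 },
	{ "32 Bucket",  32 },
	{ "64 Bucket",  64 },
	{ "128 Bucket", 128 },
};

#define	NZONES	(sizeof(bucket_zones) / sizeof(bucket_zones[0]))

/*
 * Return the smallest zone whose buckets hold at least `entries` items,
 * or the largest zone when the request exceeds every capacity.
 */
static struct bucket_zone *
bucket_zone_lookup(int entries)
{
	size_t i;

	assert(entries > 0);
	for (i = 0; i < NZONES - 1; i++)
		if (bucket_zones[i].entries >= entries)
			break;
	return (&bucket_zones[i]);
}
```

Keeping the lookup in one function means a consumer only states how many entries it wants and never indexes the zone array itself.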
* Remove dangling variable | phk | 2004-11-06 | 1 | -1/+0
* Annotate what bucket_size[] array does; staticize since it's used only | rwatson | 2004-11-06 | 1 | -1/+5
    in uma_core.c.
* Fix the last known race in swapoff(), which could lead to a spurious panic: | das | 2004-11-06 | 1 | -21/+14
        swapoff: failed to locate %d swap blocks
    The race occurred because putpages() can block between the time it allocates swap space and the time it updates the swap metadata to associate that space with a vm_object, so swapoff() would complain about the temporary inconsistency. I hoped to fix this by making swp_pager_getswapspace() and swp_pager_meta_build() a single atomic operation, but that proved to be inconvenient. With this change, swapoff() simply doesn't attempt to be so clever about detecting when all the pageout activity to the target device should have drained.
* Move a call to wakeup() from vm_object_terminate() to vnode_pager_dealloc() | alc | 2004-11-06 | 3 | -2/+6
    because this call is only needed to wake threads that slept when they discovered a dead object connected to a vnode. To eliminate unnecessary calls to wakeup() by vnode_pager_dealloc(), introduce a new flag, OBJ_DISCONNECTWNT.
    Reviewed by: tegge@
* - Set the priority of the page zeroing thread using sched_prio() when the | jhb | 2004-11-05 | 1 | -14/+5
      thread is created rather than adjusting the priority in the main function. (kthread_create() should probably take the initial priority as an argument.)
    - Only yield the CPU in the !PREEMPTION case if there are any other runnable threads. Yielding when there isn't anything else better to do just wastes time in pointless context switches (albeit while the system is idle.)
* During traversal of the inactive queue, try locking the page's containing | alc | 2004-11-05 | 1 | -4/+9
    object before accessing the page's flags or the object's reference count.
* Eliminate another unnecessary call to vm_page_busy() that immediately | alc | 2004-11-05 | 1 | -1/+0
    precedes a call to vm_page_rename(). (See the previous revision for a detailed explanation.)
* Close a race in swapoff(). Here are the gory details: | das | 2004-11-05 | 1 | -70/+53
    In order to avoid livelock, swapoff() skips over objects with a nonzero pip count and makes another pass if necessary. Since it is impossible to know which objects we care about, it would choose an arbitrary object with a nonzero pip count and wait for it before making another pass, the theory being that this object would finish paging about as quickly as the ones we care about. Unfortunately, we may have slept since we acquired a reference to this object. Hack around this problem by tsleep()ing on the pointer anyway, but timeout after a fixed interval. More elegant solutions are possible, but the ones I considered unnecessarily complicate this rare case.
    Also, kill some nits that seem to have crept into the swapoff() code in the last 75 revisions or so:
    - Don't pass both sp and sp->sw_used to swap_pager_swapoff(), since the latter can be derived from the former.
    - Replace swp_pager_find_dev() with something simpler. There's no need to iterate over the entire list of swap devices just to determine if a given block is assigned to the one we're interested in.
    - Expand the scope of the swhash_mtx in a couple of places so that it isn't released and reacquired once for every hash bucket.
    - Don't drop the swhash_mtx while holding a reference to an object. We need to lock the object first. Unfortunately, doing so would violate the established lock order, so use VM_OBJECT_TRYLOCK() and try again on a subsequent pass if the object is already locked.
    - Refactor swp_pager_force_pagein() and swap_pager_swapoff() a bit.
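Illustration only: the second nit above (no need to walk the whole device list to test one block) can be sketched as a range check against the device of interest. The struct name and the sw_first/sw_end fields are assumptions made for this sketch; the log does not show what replaced swp_pager_find_dev().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical swap device record covering blocks [sw_first, sw_end). */
struct swdev_sketch {
	int64_t sw_first;	/* first block number on this device */
	int64_t sw_end;		/* one past the last block number */
};

/* True if `blk` falls inside the block range owned by `sp`. */
static bool
blk_is_on_dev(int64_t blk, const struct swdev_sketch *sp)
{
	assert(sp->sw_first <= sp->sw_end);
	return (blk >= sp->sw_first && blk < sp->sw_end);
}
```

The check is constant time per block, whereas a search over every configured device costs one comparison per device for each block examined.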
* Retire b_magic now, we have the bufobj containing the same hint. | phk | 2004-11-04 | 1 | -1/+0
* De-couple our I/O bio request from the embedded bio in buf by explicitly | phk | 2004-11-04 | 1 | -1/+6
    copying the fields.
* Remove buf->b_dev field. | phk | 2004-11-04 | 1 | -4/+2
* The synchronization provided by vm object locking has eliminated the | alc | 2004-11-03 | 5 | -23/+5
    need for most calls to vm_page_busy(). Specifically, most calls to vm_page_busy() occur immediately prior to a call to vm_page_remove(). In such cases, the containing vm object is locked across both calls. Consequently, the setting of the vm page's PG_BUSY flag is not even visible to other threads that are following the synchronization protocol.
    This change (1) eliminates the calls to vm_page_busy() that immediately precede a call to vm_page_remove() or functions, such as vm_page_free() and vm_page_rename(), that call it and (2) relaxes the requirement in vm_page_remove() that the vm page's PG_BUSY flag is set. Now, the vm page's PG_BUSY flag is set only when the vm object lock is released while the vm page is still in transition. Typically, this is when it is undergoing I/O.
* Introduce a Boolean variable wakeup_needed to avoid repeated, unnecessary | alc | 2004-10-31 | 1 | -2/+9
    calls to wakeup() by vm_page_zero_idle_wakeup().
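Illustration only: a single-threaded C sketch of the wakeup_needed idea. Only the variable name comes from the commit message; the function names are hypothetical, a counter stands in for real wakeup() calls, and the locking the kernel would need around the flag is omitted.

```c
#include <assert.h>
#include <stdbool.h>

static bool wakeup_needed = false;	/* set only while the sleeper waits */
static int  wakeups_issued = 0;		/* stands in for calls to wakeup() */

/* Called by the (hypothetical) idle-zeroing thread just before it sleeps. */
static void
zero_idle_prepare_sleep(void)
{
	wakeup_needed = true;
}

/* Called by producers; issues at most one wakeup per sleep. */
static void
zero_idle_wakeup(void)
{
	if (wakeup_needed) {
		wakeup_needed = false;
		wakeups_issued++;
	}
}
```

Calling zero_idle_wakeup() several times after one zero_idle_prepare_sleep() increments the counter once; the remaining calls return without doing any work.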
* During traversal of the active queue by vm_pageout_page_stats(), try | alc | 2004-10-30 | 1 | -1/+10
    locking the page's containing object before accessing the page's flags.
* Eliminate an unused but initialized variable. | alc | 2004-10-30 | 1 | -2/+0
* Add an assignment statement that I omitted from the previous revision. | alc | 2004-10-30 | 1 | -0/+1
* Assert that the containing vm object is locked in vm_page_cache() and | alc | 2004-10-28 | 1 | -0/+2
    vm_page_try_to_cache().
* Fix an INVARIANTS-only bug introduced in Revision 1.104: | bmilekic | 2004-10-27 | 1 | -1/+5
    If INVARIANTS is defined, and in the rare case that we have allocated some objects from the slab and at least one initializer on at least one of those objects failed, and we need to fail the allocation and push the uninitialized items back into the slab caches -- in that scenario, we would fail to [re]set the bucket cache's ub_bucket item references to NULL, which would eventually trigger a KASSERT.
* During traversal of the active queue, try locking the page's containing | alc | 2004-10-27 | 1 | -4/+12
    object before accessing the page's flags or the object's reference count. If the trylock fails, handle the page as though it is busy.
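Illustration only: a user-space C sketch of the trylock-or-treat-as-busy pattern described above. A C11 atomic_flag stands in for the vm object lock, and every name here is hypothetical; the actual pageout code operates on different structures.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct obj_sketch {
	atomic_flag lock;	/* set while some thread holds the object */
	int         ref_count;
};

struct page_sketch {
	struct obj_sketch *object;
	bool               busy;
};

/* Non-blocking acquire: succeeds only if the flag was previously clear. */
static bool
obj_trylock(struct obj_sketch *o)
{
	return (!atomic_flag_test_and_set(&o->lock));
}

static void
obj_unlock(struct obj_sketch *o)
{
	atomic_flag_clear(&o->lock);
}

/*
 * Walk the queue. A page whose object cannot be locked immediately is
 * handled exactly like a busy page: it is skipped on this pass.
 * Returns the number of pages examined; *skipped counts the rest.
 */
static size_t
scan_active_queue(struct page_sketch *pages, size_t n, size_t *skipped)
{
	size_t examined = 0;

	assert(pages != NULL && skipped != NULL);
	*skipped = 0;
	for (size_t i = 0; i < n; i++) {
		struct page_sketch *m = &pages[i];

		if (!obj_trylock(m->object)) {
			(*skipped)++;		/* lock contended: treat as busy */
			continue;
		}
		if (m->busy)
			(*skipped)++;		/* flags read under the object lock */
		else
			examined++;
		obj_unlock(m->object);
	}
	return (examined);
}
```

Because the scan never blocks on an object lock, a thread that already holds another lock can run it without waiting; skipped pages are simply left for a later pass.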
* Also check that the sectormask is bigger than zero. | phk | 2004-10-26 | 1 | -1/+3
    Wrap this overly long KASSERT and remove newline.
* Put the I/O block size in bufobj->bo_bsize. | phk | 2004-10-26 | 1 | -1/+1
    We keep si_bsize_phys around for now as that is the simplest way to pull the number out of disk device drivers in devfs_open(). The correct solution would be to do an ioctl(DIOCGSECTORSIZE), but the point is probably moot when filesystems sit on GEOM, so don't bother for now.
* Don't clear flags we just checked were not set. | phk | 2004-10-26 | 1 | -1/+0
* Assert that the containing vm object is locked in vm_page_flash(). | alc | 2004-10-25 | 1 | -0/+2
* Assert that the containing vm object is locked in vm_page_busy() and | alc | 2004-10-24 | 1 | -0/+4
    vm_page_wakeup().
* Move the buffer method vector (buf->b_op) to the bufobj. | phk | 2004-10-24 | 3 | -10/+3
    Extend it with a strategy method.
    Add bufstrategy() which does the usual VOP_SPECSTRATEGY/VOP_STRATEGY song and dance.
    Rename ibwrite to bufwrite().
    Move the two NFS buf_ops to more sensible places, add bufstrategy to them.
    Add inlines for bwrite() and bstrategy() which call through buf->b_bufobj->b_ops->b_{write,strategy}().
    Replace almost all VOP_STRATEGY()/VOP_SPECSTRATEGY() calls with bstrategy().
* Acquire the vm object lock before rather than after calling | alc | 2004-10-24 | 1 | -4/+5
    vm_page_sleep_if_busy(). (The motivation being to transition synchronization of the vm_page's PG_BUSY flag from the global page queues lock to the per-object lock.)
* Use VM_ALLOC_NOBUSY instead of calling vm_page_wakeup(). | alc | 2004-10-24 | 2 | -11/+3
* Introduce VM_ALLOC_NOBUSY, an option to vm_page_alloc() and vm_page_grab() | alc | 2004-10-24 | 2 | -2/+4
    that indicates that the caller does not want a page with its busy flag set. In many places, the global page queues lock is acquired and released just to clear the busy flag on a just allocated page. Both the allocation of the page and the clearing of the busy flag occur while the containing vm object is locked. So, the busy flag might as well never be set.
* Add b_bufobj to struct buf which eventually will eliminate the need for b_vp. | phk | 2004-10-22 | 1 | -4/+3
    Initialize b_bufobj for all buffers.
    Make incore() and gbincore() take a bufobj instead of a vnode.
    Make inmem() local to vfs_bio.c.
    Change a lot of VI_[UN]LOCK(bp->b_vp) to BO_[UN]LOCK(bp->b_bufobj), also VI_MTX() to BO_MTX().
    Make buf_vlist_add() take a bufobj instead of a vnode.
    Eliminate other uses of bp->b_vp where bp->b_bufobj will do.
    Various minor polishing: remove "register", turn panic into KASSERT, use new function declarations, TAILQ_FOREACH_SAFE() etc.
* Move the VI_BWAIT flag into a bo_flag element of bufobj and call it BO_WWAIT | phk | 2004-10-21 | 1 | -12/+3
    Add bufobj_wref(), bufobj_wdrop() and bufobj_wwait() to handle the write count on a bufobj. Bufobj_wdrop() replaces vwakeup().
    Use these functions in all relevant places except in ffs_softdep.c where the use of interlocked_sleep() makes this impossible.
    Rename b_vnbufs to b_bobufs now that we touch all the relevant files anyway.
* Correct two errors in PG_BUSY management by vm_page_cowfault(). Both | alc | 2004-10-18 | 1 | -2/+1
    errors are in rarely executed paths.
    1. Each time the retry_alloc path is taken, the PG_BUSY must be set again. Otherwise vm_page_remove() panics.
    2. There is no need to set PG_BUSY on the newly allocated page before freeing it. The page already has PG_BUSY set by vm_page_alloc(). Setting it again could cause an assertion failure.
    MFC after: 2 weeks
* Assert that the containing object is locked in vm_page_io_start() and | alc | 2004-10-17 | 1 | -0/+2
    vm_page_io_finish(). The motivation being to transition synchronization of the vm_page's busy field from the global page queues lock to the per-object lock.
* Remove unnecessary check for curthread == NULL. | alc | 2004-10-17 | 1 | -1/+1
* Put on my peril sensitive sunglasses and add a flags field to the internal | peter | 2004-10-11 | 1 | -2/+18
    sysctl routines and state. Add some code to use it for signalling the need to downconvert a data structure to 32 bits on a 64 bit OS when requested by a 32 bit app.
    I tried to do this in a generic abi wrapper that intercepted the sysctl oid's, or looked up the format string etc, but it was a real can of worms that turned into a fragile mess before I even got it partially working.
    With this, we can now run 'sysctl -a' on a 32 bit sysctl binary and have it not abort. Things like netstat, ps, etc have a long way to go.
    This also fixes a bug in the kern.ps_strings and kern.usrstack hacks. These do matter very much because they are used by libc_r and other things.
* In the previous revision, I did not intend to change the default value | green | 2004-10-09 | 1 | -1/+1
    of "nosleepwithlocks."
    Submitted by: ru
* Fix critical stability problems that can cause UMA mbuf cluster | green | 2004-10-08 | 2 | -24/+64
    state management corruption, mbuf leaks, general mbuf corruption, and at least on i386 a first level splash damage radius that encompasses up to about half a megabyte of the memory after an mbuf cluster's allocation slab. In short, this has caused instability nightmares anywhere the right kind of network traffic is present.
    When the polymorphic refcount slabs were added to UMA, the new types were not used pervasively. In particular, the slab management structure was turned into one for refcounts, and one for non-refcounts (supposed to be mostly like the old slab management structure), but the latter was almost always used throughout. In general, every access to zones with UMA_ZONE_REFCNT turned on corrupted the "next free" slab offset and the refcount with each other and with other allocations (on i386, 2 mbuf clusters per 4096 byte slab).
    Fix things so that the right type is used to access refcounted zones where it was not before. There are additional errors in gross overestimation of padding, it seems, that would cause large kegs (nee zones) to be allocated when small ones would do. Unless I have analyzed this incorrectly, it is not directly harmful.
* Don't look for swap blocks in objects that aren't swap-backed. | das | 2004-09-24 | 1 | -0/+3
    I expect that this will fix the following panic, reported by Jun:
        swap_pager_isswapped: failed to locate all swap meta blocks
    MT5 candidate
* XXX mark two places where we do not hold a threadcount on the dev when | phk | 2004-09-24 | 1 | -0/+1
    frobbing the cdevsw. In both cases we examine only the cdevsw and it is a good question if we weren't better off copying those properties into the cdev in the first place. This question will be revisited.
* Use dev_re[fl]thread() to maintain a ref on the device driver while | phk | 2004-09-24 | 1 | -14/+13
    we call the ->d_mmap function.