FreeBSD-src - Raptor Engineering's fork of pfsense FreeBSD src with pfSense changes

	Commit message	Author	Age	Files	Lines
*	MFC r271586:	kib	2014-09-21	1	-10/+6
	Fix the misspelling of bit and type names in vnode_pager_putpages(). Approved by: re (delphij)
*	The soft and hard busy mechanism rely on the vm object lock to work.	attilio	2013-08-09	1	-2/+1
	Unify the two concepts into a real, minimal sxlock where the shared acquisition represents the soft busy and the exclusive acquisition represents the hard busy. The old VPO_WANTED mechanism becomes the hard path for this new lock and it becomes per-page rather than per-object. The vm_object lock becomes an interlock for this functionality: it can be held in either read or write mode. However, if the vm_object lock is held in read mode while acquiring or releasing the busy state, the thread owner cannot make any assumption about the busy state unless it is also busying it. Also:
	- Add a new flag to directly shared-busy pages while vm_page_alloc and vm_page_grab are being executed. This will be very helpful once these functions happen under a read object lock.
	- Move the swapping sleep into its own per-object flag.
	The KPI is heavily changed, which is why the version is bumped. It is very likely that some VM ports users will need to change their own code. Sponsored by: EMC / Isilon storage division Discussed with: alc Reviewed by: jeff, kib Tested by: gavin, bapt (older version) Tested by: pho, scottl
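As a rough user-space illustration of the shared/exclusive busy idea described in this commit message, the sketch below models a per-page busy word. The encoding (bit 0 for the exclusive holder, the remaining bits as a sharer count) and every name in it (`page_model`, `page_try_sbusy`, and so on) are assumptions made for illustration; this is not the kernel's vm_page implementation and it omits sleeping and the object-lock interlock entirely.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding: bit 0 = exclusive (hard busy) holder present,
 * remaining bits = number of shared (soft busy) holders. */
#define BUSY_EXCL   ((uint32_t)1)
#define BUSY_SHARER ((uint32_t)2)   /* increment for one sharer */

struct page_model {
    _Atomic uint32_t busy;          /* 0 means the page is not busied */
};

/* Soft busy: succeeds unless an exclusive holder is present. */
bool page_try_sbusy(struct page_model *p)
{
    uint32_t old = atomic_load(&p->busy);

    while ((old & BUSY_EXCL) == 0) {
        /* On failure the CAS reloads 'old', and the loop re-checks it. */
        if (atomic_compare_exchange_weak(&p->busy, &old, old + BUSY_SHARER))
            return true;
    }
    return false;
}

/* Drop one shared reference. */
void page_sunbusy(struct page_model *p)
{
    atomic_fetch_sub(&p->busy, BUSY_SHARER);
}

/* Hard busy: succeeds only when there are no holders of either kind. */
bool page_try_xbusy(struct page_model *p)
{
    uint32_t expected = 0;

    return atomic_compare_exchange_strong(&p->busy, &expected, BUSY_EXCL);
}

/* Release the exclusive hold. */
void page_xunbusy(struct page_model *p)
{
    atomic_store(&p->busy, 0);
}
```

In this model several threads can soft-busy the same page at once, while a hard-busy attempt fails until every sharer has dropped its reference. The real code additionally has to park waiters, which is the role the commit assigns to the old VPO_WANTED path.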
*	- Correct a stale comment. We don't have vclean() anymore. The work is	jeff	2013-07-23	1	-5/+0
	done by vgonel() and destroy_vobject() should only be called once from VOP_INACTIVE(). Sponsored by: EMC / Isilon Storage Division
*	Assert that the object type for the vnode's non-NULL v_object, passed	kib	2013-04-28	1	-0/+6
	to vnode_pager_setsize(), is either OBJT_VNODE, or, if vnode was already reclaimed, OBJT_DEAD. Note that the latter is only possible because some filesystems, in particular nfsiods from NFS clients, call vnode_pager_setsize() with an unlocked vnode. Moreover, if the object is terminated, do not perform the resizing operation. Reviewed by: alc Tested by: pho, bf MFC after: 1 week
*	Convert panic() into KASSERT().	kib	2013-04-28	1	-2/+1
	Reviewed by: alc MFC after: 1 week
*	Fix the logic inversion in the r248512.	kib	2013-03-20	1	-1/+1
	Noted by: mckay
*	Pass unmapped buffers for page in requests if the filesystem indicated support	kib	2013-03-19	1	-6/+30
	for the unmapped i/o. Sponsored by: The FreeBSD Foundation Tested by: pho
*	Some style fixes.	kib	2013-03-14	1	-1/+1
	Sponsored by: The FreeBSD Foundation
*	MFC	attilio	2013-02-26	1	-2/+2
*	Hide the details for the assertion for VM_OBJECT_LOCK operations.	attilio	2013-02-21	1	-3/+3
	Rename current VM_OBJECT_LOCK_ASSERT(foo, RA_WLOCKED) into VM_OBJECT_ASSERT_WLOCKED(foo). Sponsored by: EMC / Isilon storage division Requested by: alc
*	Rename VM_OBJECT_LOCK(), VM_OBJECT_UNLOCK() and VM_OBJECT_TRYLOCK() to	attilio	2013-02-20	1	-55/+55
	their "write" versions. Sponsored by: EMC / Isilon storage division
*	Switch vm_object lock to be a rwlock.	attilio	2013-02-20	1	-5/+6
	* VM_OBJECT_LOCK and VM_OBJECT_UNLOCK are mapped to write operations.
	* VM_OBJECT_SLEEP() is introduced as a general-purpose primitive to get a sleep operation using a VM_OBJECT_LOCK() as protection.
	* The approach must bear with vm_pager.h namespace pollution, so many files require including rwlock.h directly.
*	The r241025 fixed the case when a binary, executed from nullfs mount,	kib	2012-11-02	1	-3/+3
	was still possible to open for write from the lower filesystem. There is a symmetric situation where the binary could already have file descriptors opened for write, but it can be executed from the nullfs overlay. Handle the issue by passing one v_writecount reference to the lower vnode if the nullfs vnode has a non-zero v_writecount. Note that only one write reference can be donated, since nullfs only keeps one use reference on the lower vnode. Always use the lower vnode v_writecount for the checks. Introduce VOP_GET_WRITECOUNT to read v_writecount, which is currently always bypassed to the lower vnode, and VOP_ADD_WRITECOUNT to manipulate the v_writecount value, which manages a single bypass reference to the lower vnode. Calling the VOPs instead of directly accessing v_writecount provides the fix described in the previous paragraph. Tested by: pho MFC after: 3 weeks
*	Remove the support for using non-mpsafe filesystem modules.	kib	2012-10-22	1	-9/+0
	In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems. The VFS_VERSION is bumped to indicate the interface change, which does not result in interface signature changes. Conducted and reviewed by: attilio Tested by: pho
*	Fix the mis-handling of the VV_TEXT on the nullfs vnodes.	kib	2012-09-28	1	-1/+1
	If you have a binary on a filesystem which is also mounted over by nullfs, you could execute the binary from the lower filesystem, or from the nullfs mount. When executed from lower filesystem, the lower vnode gets VV_TEXT flag set, and the file cannot be modified while the binary is active. But, if executed as the nullfs alias, only the nullfs vnode gets VV_TEXT set, and you still can open the lower vnode for write. Add a set of VOPs for the VV_TEXT query, set and clear operations, which are correctly bypassed to lower vnode. Tested by: pho (previous version) MFC after: 2 weeks
*	Do not leave invalid pages in the object after the short read for a	kib	2012-08-14	1	-1/+1
	network file system (not only NFS proper). Short reads cause pages other than the requested one, which were not filled by the read response, to stay invalid. Change the vm_page_readahead_finish() interface to not take the error code, but instead to make a decision to free or to (de)activate the page only by its validity. As a result, invalid pages that were not requested are freed even if the read RPC indicated success. Noted and reviewed by: alc MFC after: 1 week
*	After the PHYS_TO_VM_PAGE() function was de-inlined, the main reason	kib	2012-08-05	1	-0/+1
	to pull vm_param.h was removed. The other big dependency of vm_page.h on vm_param.h is the PA_LOCK* definitions, which are only needed for in-kernel code, because modules use KBI-safe functions to lock the pages. Stop including vm_param.h into vm_page.h. Include vm_param.h explicitly for the kernel code which needs it. Suggested and reviewed by: alc MFC after: 2 weeks
*	Reduce code duplication and exposure of direct access to struct	kib	2012-08-04	1	-31/+2
	vm_page oflags by providing helper function vm_page_readahead_finish(), which handles completed reads for pages with indexes other than the requested one, for VOP_GETPAGES(). Reviewed by: alc MFC after: 1 week
*	Do a more targeted check on the page cache and avoid checking the cache	attilio	2012-06-16	1	-1/+1
	pointer directly in vnode_pager_setsize() by using newly introduced vm_page_is_cached() function. Reviewed by: alc MFC after: 2 weeks X-MFC: r234039,234064
*	The page flag PGA_WRITEABLE is set and cleared exclusively by the pmap	alc	2012-06-16	1	-1/+1
	layer, but it is read directly by the MI VM layer. This change introduces pmap_page_is_write_mapped() in order to completely encapsulate all direct access to PGA_WRITEABLE in the pmap layer. Aesthetics aside, I am making this change because amd64 will likely begin using an alternative method to track write mappings, and having pmap_page_is_write_mapped() in place allows me to make such a change without further modification to the MI VM layer. As an added bonus, tidy up some nearby comments concerning page flags. Reviewed by: kib MFC after: 6 weeks
*	Keep track of the mount point associated with a special device	mckusick	2012-03-28	1	-0/+4
	to enable the collection of counts of synchronous and asynchronous reads and writes for its associated filesystem. The counts are displayed using `mount -v'. Ensure that buffers used for paging indicate the vnode from which they are operating so that counts of paging I/O operations from the filesystem are collected. This checkin only adds the setting of the mount point for the UFS/FFS filesystem, but it would be trivial to add the setting and clearing of the mount point at filesystem mount/unmount time for other filesystems too. Reviewed by: kib
*	Add KTR_VFS traces to track modifications to a vnode's writecount.	jhb	2012-03-08	1	-0/+6
*	Account the writeable shared mappings backed by file in the vnode	kib	2012-02-23	1	-0/+85
	v_writecount. Keep the amount of the virtual address space used by the mappings in the new vm_object un_pager.vnp.writemappings counter. The vnode v_writecount is incremented when writemappings gets a non-zero value, and decremented when writemappings is returned to zero. Writeable shared vnode-backed mappings are accounted for in vm_mmap(), and vm_map_insert() is instructed to set the MAP_ENTRY_VN_WRITECNT flag on the created map entry. During deferred map entry deallocation, vm_map_process_deferred() checks for MAP_ENTRY_VN_WRITECNT and decrements writemappings for the vm object. Now, the writeable mount cannot be demoted to read-only while writeable shared mappings of the vnodes from the mount point exist. Also, execve(2) fails for such files with ETXTBSY, as it should. Noted by: tegge Reviewed by: tegge (long time ago, early version), alc Tested by: pho MFC after: 3 weeks
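The accounting rule in this commit message (bump v_writecount only when writemappings leaves zero, and drop it only when writemappings returns to zero) can be sketched in plain C as follows. The struct layouts and function names are hypothetical stand-ins chosen for the example, not the kernel's definitions, and locking is omitted.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the vnode and the vnode-pager part of a vm object. */
struct vnode_model {
    int v_writecount;               /* writers, including mapped-for-write state */
};

struct vnp_model {
    struct vnode_model *vp;         /* backing vnode */
    size_t writemappings;           /* bytes of writeable shared mappings */
};

/* Account a new writeable shared mapping of 'bytes' bytes. */
void writemappings_add(struct vnp_model *obj, size_t bytes)
{
    if (bytes == 0)
        return;
    if (obj->writemappings == 0)
        obj->vp->v_writecount++;    /* first writeable mapping appears */
    obj->writemappings += bytes;
}

/* Account the removal of a writeable shared mapping of 'bytes' bytes. */
void writemappings_release(struct vnp_model *obj, size_t bytes)
{
    if (bytes == 0 || bytes > obj->writemappings)
        return;                     /* ignore inconsistent requests in this model */
    obj->writemappings -= bytes;
    if (obj->writemappings == 0)
        obj->vp->v_writecount--;    /* last writeable mapping is gone */
}
```

With this shape, any number of writeable shared mappings contributes exactly one reference to v_writecount, which is enough for checks such as the read-only remount and execve(2) cases mentioned above to see the file as open for writing.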
*	Rename vm_page_set_valid() to vm_page_set_valid_range().	kib	2011-11-30	1	-2/+2
	The vm_page_set_valid() is the most reasonable name for the m->valid accessor. Reviewed by: attilio, alc
*	Provide typedefs for the type of bit mask for the page bits.	kib	2011-11-05	1	-2/+3
	Use the defined types instead of int when manipulating masks. Supposedly, it could fix support for 32KB page size in the machine-independent VM layer. Reviewed by: alc MFC after: 2 weeks
*	Fix a typo in a comment.	jhb	2011-10-14	1	-1/+1
*	Split the vm_page flags PG_WRITEABLE and PG_REFERENCED into atomic	kib	2011-09-06	1	-1/+1
	flags field. Updates to the atomic flags are performed using the atomic ops on the containing word, do not require any vm lock to be held, and are non-blocking. The vm_page_aflag_set(9) and vm_page_aflag_clear(9) functions are provided to modify aflags. Document the changes to flags field to only require the page lock. Introduce vm_page_reference(9) function to provide a stable KPI and KBI for filesystems like tmpfs and zfs which need to mark a page as referenced. Reviewed by: alc, attilio Tested by: marius, flo (sparc64); andreast (powerpc, powerpc64) Approved by: re (bz)
*	Add a new option, OBJPR_NOTMAPPED, to vm_object_page_remove(). Passing this	alc	2011-06-29	1	-1/+1
	option to vm_object_page_remove() asserts that the specified range of pages is not mapped, or more precisely that none of these pages have any managed mappings. Thus, vm_object_page_remove() need not call pmap_remove_all() on the pages. This change not only saves time by eliminating pointless calls to pmap_remove_all(), but it also eliminates an inconsistency in the use of pmap_remove_all() versus related functions, like pmap_remove_write(). It eliminates harmless but pointless calls to pmap_remove_all() that were being performed on PG_UNMANAGED pages. Update all of the existing assertions on pmap_remove_all() to reflect this change. Reviewed by: kib
*	Fix a bug in r222586. Lock the page owner object around the modification	kib	2011-06-11	1	-0/+6
	of the m->dirty. Reported and tested by: nwhitehorn Reviewed by: alc
*	In the VOP_PUTPAGES() implementations, change the default error from	kib	2011-06-01	1	-1/+18
	VM_PAGER_AGAIN to VM_PAGER_ERROR for the unwritten pages. Return VM_PAGER_AGAIN for the partially written page. Always forward at least one page in the loop of vm_object_page_clean(). VM_PAGER_ERROR causes the page reactivation and does not clear the page dirty state, so the write is not lost. The change fixes an infinite loop in vm_object_page_clean() when the filesystem returns permanent errors for some page writes. Reported and tested by: gavin Reviewed by: alc, rmacklem MFC after: 1 week
*	Minimize the use of the page queues lock for synchronizing access to the	alc	2010-06-02	1	-2/+0
	page's dirty field. With the exception of one case, access to this field is now synchronized by the object lock.
*	Push down page queues lock acquisition in pmap_enter_object() and	alc	2010-05-26	1	-18/+22
	pmap_is_referenced(). Eliminate the corresponding page queues lock acquisitions from vm_map_pmap_enter() and mincore(), respectively. In mincore(), this allows some additional cases to complete without ever acquiring the page queues lock. Assert that the page is managed in pmap_is_referenced(). On powerpc/aim, push down the page queues lock acquisition from moea_is_modified() and moea_is_referenced() into moea*_query_bit(). Again, this will allow some additional cases to complete without ever acquiring the page queues lock. Reorder a few statements in vm_page_dontneed() so that a race can't lead to an old reference persisting. This scenario is described in detail by a comment. Correct a spelling error in vm_page_dontneed(). Assert that the object is locked in vm_page_clear_dirty(), and restrict the page queues lock assertion to just those cases in which the page is currently writeable. Add object locking to vnode_pager_generic_putpages(). This was the one and only place where vm_page_clear_dirty() was being called without the object being locked. Eliminate an unnecessary vm_page_lock() around vnode_pager_setsize()'s call to vm_page_clear_dirty(). Change vnode_pager_generic_putpages() to the modern-style of function definition. Also, change the name of one of the parameters to follow virtual memory system naming conventions. Reviewed by: kib
*	Push down the page queues lock into vm_page_activate().	alc	2010-05-07	1	-6/+9
*	Eliminate page queues locking around most calls to vm_page_free().	alc	2010-05-06	1	-18/+0
*	On Alan's advice, rather than do a wholesale conversion on a single	kmacy	2010-04-30	1	-26/+59
	architecture from page queue lock to a hashed array of page locks (based on a patch by Jeff Roberson), I've implemented page lock support in the MI code and have only moved vm_page's hold_count out from under page queue mutex to page lock. This changes pmap_extract_and_hold on all pmaps. Supported by: Bitgravity Inc. Discussed with: alc, jeffr, and kib
*	Remove write-only variable.	kib	2010-02-22	1	-3/+0
	MFC after: 3 days
*	When a vnode-backed vm object is referenced, it increments the vnode	kib	2010-01-17	1	-1/+6
	reference count, and decrements it on dereference. If a referenced object is deallocated, the object type is reset to OBJT_DEAD. Consequently, all vnode references that are owned by object references are never released. vunref() the vnode in the vm object deallocation code for OBJT_VNODE the appropriate number of times to prevent the leak. Add an assertion to vm_pageout() to make sure that we never get a reference on the vnode but then do not execute code to release it. In collaboration with: pho Reviewed by: alc MFC after: 3 weeks
*	Change the type of uio_resid member of struct uio from int to ssize_t.	kib	2009-06-25	1	-1/+1
	Note that this does not actually enable full-range i/o requests for 64-bit architectures, and is done now to update KBI only. Tested by: pho Reviewed by: jhb, bde (as part of the review of the bigger patch)
*	Implement global and per-uid accounting of the anonymous memory. Add	kib	2009-06-23	1	-3/+4
	rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved for the uid. The accounting information (charge) is associated with either map entry, or vm object backing the entry, assuming the object is the first one in the shadow chain and entry does not require COW. Charge is moved from entry to object on allocation of the object, e.g. during the mmap, assuming the object is allocated, or on the first page fault on the entry. It moves back to the entry on forks due to COW setup. The per-entry granularity of accounting makes the charge process fair for processes that change uid during lifetime, and decrements charge for proper uid when region is unmapped. The interface of vm_pager_allocate(9) is extended by adding struct ucred *, that is used to charge appropriate uid when allocation is performed by the kernel, e.g. md(4). Several syscalls, among them fork(2), may now return ENOMEM when global or per-uid limits are enforced. In collaboration with: pho Reviewed by: alc Approved by: re (kensmith)
*	Correct a boundary case error in the management of a page's dirty bits by	alc	2009-06-02	1	-10/+16
	shm_dotruncate() and vnode_pager_setsize(). Specifically, if the length of a shared memory object or a file is truncated such that the length modulo the page size is between 1 and 511, then all of the page's dirty bits were cleared. Now, a dirty bit is cleared only if the corresponding block is truncated in its entirety.
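The boundary case can be made concrete with a small model. The sketch below assumes a 4096-byte page tracked by eight dirty bits, one per 512-byte block; those sizes and the function name `dirty_after_truncate` are assumptions for illustration, and the code models only the rule stated in the message (clear a bit only when its block lies wholly past the new end), not the kernel routines themselves.

```c
#include <stdint.h>

#define MODEL_PAGE_SIZE  4096u  /* assumed page size */
#define MODEL_BLOCK_SIZE 512u   /* assumed size covered by one dirty bit */

/*
 * Return the dirty mask of the last page after truncating the file so
 * that 'end' bytes of that page remain (0 <= end < MODEL_PAGE_SIZE).
 * Bit i covers bytes [i * 512, (i + 1) * 512).
 */
uint8_t dirty_after_truncate(uint8_t dirty, unsigned end)
{
    /* Index of the first block that lies entirely beyond the new end. */
    unsigned first_gone = (end + MODEL_BLOCK_SIZE - 1) / MODEL_BLOCK_SIZE;

    /* Blocks 0 .. first_gone - 1 still hold live data: keep their bits. */
    uint8_t keep = (uint8_t)((1u << first_gone) - 1u);

    return dirty & keep;
}
```

Under this model a remainder between 1 and 511 leaves block 0 partly in use, so its dirty bit survives and only the bits for blocks 1 through 7 are cleared.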
*	Eliminate unnecessary clearing of the page's dirty mask from various	alc	2009-05-15	1	-5/+6
	getpages functions. Eliminate a stale comment.
*	Eliminate gratuitous clearing of the page's dirty mask.	alc	2009-05-12	1	-1/+2
*	Fix a race involving vnode_pager_input_smlfs(). Specifically, in the case	alc	2009-05-09	1	-23/+10
	that vnode_pager_input_smlfs() zeroes the page, it should not mark the page as valid until after the page is zeroed. Otherwise, the page could be mapped for read access (e.g., by vm_map_pmap_enter()) before the page is zeroed. Reviewed by: tegge Eliminate gratuitous clearing of the page's dirty mask by vnode_pager_input_smlfs(). Instead, assert that the page is clean. Reviewed by: tegge Eliminate some blank lines. Eliminate pointless calls to pmap_clear_modify() and vm_page_undirty() from vnode_pager_input_old(). The page is not mapped. Therefore, it cannot have any page table entries that are modified. Eliminate an incorrect comment from vnode_pager_generic_getpages().
*	Eliminate vnode_pager_input_smlfs()'s pointless call to pmap_clear_modify().	alc	2009-05-04	1	-3/+0
	The page can't possibly have any modified page table entries because it isn't even mapped.
*	Eliminate unnecessary calls to pmap_clear_modify(). Specifically, calling	alc	2009-04-25	1	-2/+6
	pmap_clear_modify() on a page is pointless if that page is not mapped or it is only mapped for read access. Instead, assert that the page is not mapped or not mapped for write access as appropriate. Eliminate unnecessary clearing of a page's dirty mask. Instead, assert that the page's dirty mask is clear.
*	Adjust some variables (mostly related to the buffer cache) that hold	jhb	2009-03-09	1	-2/+2
	address space sizes to be longs instead of ints. Specifically, the following values are now longs: runningbufspace, bufspace, maxbufspace, bufmallocspace, maxbufmallocspace, lobufspace, hibufspace, lorunningspace, hirunningspace, maxswzone, maxbcache, and maxpipekva. Previously, a relatively small number (~ 44000) of buffers set in kern.nbuf would result in integer overflows resulting either in hangs or bogus values of hidirtybuffers and lodirtybuffers. Now one has to overflow a long to see such problems. There was a check for a nbuf setting that would cause overflows in the auto-tuning of nbuf. I've changed it to always check and cap nbuf but warn if a user-supplied tunable would cause overflow. Note that this changes the ABI of several sysctls that are used by things like top(1), etc., so any MFC would probably require some gross shims to allow for that. MFC after: 1 month
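To see why a buffer count in the tens of thousands can overflow an int, consider the kind of product involved: a buffer count multiplied by a per-buffer size in bytes. The sketch below is only an arithmetic illustration. The 64 KiB per-buffer size used in the usage note is an assumed figure for the example, not a value taken from the commit, and the helper names are hypothetical.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Compute the product in 64 bits so the check itself cannot overflow
 * for the small inputs considered here. */

/* Would nbuf * bytes_per_buf fit in an int? */
bool fits_in_int(int64_t nbuf, int64_t bytes_per_buf)
{
    return nbuf * bytes_per_buf <= (int64_t)INT_MAX;
}

/* Would nbuf * bytes_per_buf fit in a long on this platform? */
bool fits_in_long(int64_t nbuf, int64_t bytes_per_buf)
{
    return nbuf * bytes_per_buf <= (int64_t)LONG_MAX;
}
```

For example, 44000 buffers at an assumed 65536 bytes each come to 2,883,584,000 bytes, which exceeds INT_MAX (2,147,483,647) but fits easily in a 64-bit long. On platforms where long is 32 bits wide the second check gives the same answer as the first.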
*	Comment out the assertion from r188321. It is not valid for nfs.	kib	2009-02-09	1	-1/+1
	Reported by: alc
*	Eliminate OBJ_NEEDGIANT. After r188331, OBJ_NEEDGIANT's only use is by a	alc	2009-02-08	1	-2/+0
	redundant assertion in vm_fault(). Reviewed by: kib
*	Do not sleep for vnode lock while holding map lock in vm_fault. Try to	kib	2009-02-08	1	-53/+0
	acquire vnode lock for OBJT_VNODE object after map lock is dropped. Because we have the busy page(s) in the object, sleeping there would result in deadlock with vnode resize. Try to get the lock without sleeping, and, if the attempt fails, drop the state, lock the vnode, and restart the fault handler from the start with the vnode already locked. Because the vnode_pager_lock() function is inlined in vm_fault(), axe it. Based on suggestion by: alc Reviewed by: tegge, alc Tested by: pho
*	Assert that vnode is exclusively locked when its vm object is resized.	kib	2009-02-08	1	-0/+1
	Reviewed by: tegge