path: root/sys/fs/tmpfs
Commit message [Author, Age, Files, Lines]
...
* Assert that OBJ_TMPFS flag on the vm object for the tmpfs node is cleared when the tmpfs node is going away.
  Tested by: bdrewery, pho
  [kib, 2013-05-30, 1 file, -0/+2]
* Avoid deactivating the page if it is already on a queue; only requeue the page. This both reduces the amount of queue locking and avoids moving the active page to the inactive list just because the page was read or written.
  Based on the suggestion by: alc
  Reviewed by: alc
  Tested by: pho
  [kib, 2013-05-06, 1 file, -6/+10]
* Fix the v_object leak for non-regular tmpfs vnodes.
  Reported and tested by: pho
  Sponsored by: The FreeBSD Foundation
  [kib, 2013-05-02, 1 file, -0/+3]
* For the new regular tmpfs vnode, v_object is initialized before insmntque() is called. The standard insmntque() destructor resets the vop vector to the deadfs one and calls vgone() on the vnode. As a result, v_object is kept unchanged, which triggers an assertion in the reclaim code on insmntque() failure. Also, in this case, the OBJ_TMPFS flag on the backing vm object is not cleared.
  Provide a tmpfs insmntque() destructor which properly clears the OBJ_TMPFS flag and resets v_object.
  Reported and tested by: pho
  Sponsored by: The FreeBSD Foundation
  [kib, 2013-05-02, 3 files, -14/+34]
* The page read or written could be wired. Do not requeue if the page is not on a queue.
  Reported and tested by: pho
  Sponsored by: The FreeBSD Foundation
  [kib, 2013-05-02, 1 file, -2/+4]
* Rework the handling of the tmpfs node backing swap object and tmpfs vnode v_object to avoid double-buffering. Use the same object both as the backing store for the tmpfs node and as the v_object. Besides reducing memory use by up to 2x when files are mapped from tmpfs, it also makes tmpfs read and write operations copy half as many bytes.
  The VM subsystem was already slightly adapted to tolerate an OBJT_SWAP object as v_object. Now vm_object_deallocate() is modified to not reinstantiate the OBJ_ONEMAPPING flag and to help the VFS correctly handle the VV_TEXT flag on the last dereference of the tmpfs backing object.
  Reviewed by: alc
  Tested by: pho, bf
  MFC after: 1 month
  [kib, 2013-04-28, 2 files, -164/+103]
* - Constify local path variable for chflagsat().
  - Use correct format characters (%lx) for u_long.
  This fixes the build broken in r248599.
  [pjd, 2013-03-22, 1 file, -1/+1]
* - Make 'flags' argument to chflags(2), fchflags(2) and lchflags(2) of type u_long. Before this change it was of type int for syscalls, but prototypes in sys/stat.h and documentation for chflags(2) and fchflags(2) (but not for lchflags(2)) stated that it was u_long. Now some related functions use u_long type for flags (strtofflags(3), fflagstostr(3)).
  - Make path argument of type 'const char *' for consistency.
  Discussed on: arch
  Sponsored by: The FreeBSD Foundation
  [pjd, 2013-03-21, 2 files, -3/+4]
* Remove negative name cache entry pointing to the target name, which could be instantiated while tdvp was unlocked.
  Reported by: Rick Miller <vmiller at hostileadmin com>
  Tested by: pho
  MFC after: 1 week
  [kib, 2013-03-17, 1 file, -0/+1]
* Rename VM_OBJECT_LOCK(), VM_OBJECT_UNLOCK() and VM_OBJECT_TRYLOCK() to their "write" versions.
  Sponsored by: EMC / Isilon storage division
  [attilio, 2013-02-20, 2 files, -27/+27]
* Switch vm_object lock to be a rwlock.
  * VM_OBJECT_LOCK and VM_OBJECT_UNLOCK are mapped to write operations
  * VM_OBJECT_SLEEP() is introduced as a general purpose primitive to get a sleep operation using a VM_OBJECT_LOCK() as protection
  * The approach must bear with vm_pager.h namespace pollution, so many files need to include rwlock.h directly
  [attilio, 2013-02-20, 2 files, -0/+4]
* Remove racy checks on resident and cached pages for tmpfs_mapped{read, write}() functions:
  - tmpfs_mapped{read, write}() are only called within VOP_{READ, WRITE}(), which check beforehand to work only on valid VREG vnodes. Also the vnode is locked for the duration of the work, making vnode reclaiming impossible during the operation. Hence, vobj can never be NULL.
  - Currently the check on resident pages and cached pages without the vm object lock held is racy and can do even more harm than good, as a page could be transitioning between these 2 pools and then be skipped entirely. Skip the checks as lookups on empty splay trees are very cheap.
  Discussed with: alc
  Tested by: flo
  MFC after: 2 weeks
  [attilio, 2013-02-10, 1 file, -12/+0]
* tmpfs: Replace directory entry linked list with RB-Tree.
  Use file name hash as a tree key, handle duplicate keys. Both VOP_LOOKUP and VOP_READDIR operations utilize the same tree for search. Directory entry offset (cookie) is either the file name hash or an incremental id in case of hash collisions (duplicate-cookies). Keep a sorted per-directory list of duplicate-cookie entries to facilitate cookie number allocation.
  Don't fail if the previous VOP_READDIR() offset is no longer valid, start with the next dirent instead. Other file systems handle it similarly.
  Work around race-prone tn_readdir_last[pn] fields update.
  Add tmpfs_dir_destroy() to free all dirents.
  Set NFS cookies in tmpfs_dir_getdents().
  Return EJUSTRETURN from tmpfs_dir_getdents() instead of hard coded -1.
  Mark directory traversal routines static as they are no longer used outside of tmpfs_subr.c
  [gleb, 2013-01-06, 4 files, -312/+538]
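The cookie rule described in the entry above (name hash, or an incremental id when the hash is already taken) can be sketched in user-space C. This is only an illustration under stated assumptions: the `sk_` names, the `SK_DUP_FLAG` bit and the fixed-size array are hypothetical, and a linear scan stands in for the RB-tree lookup that the commit actually introduces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_DUP_FLAG	0x80000000u	/* hypothetical: marks incremental-id cookies */
#define SK_MAX_ENTRIES	64		/* hypothetical capacity of the sketch */

struct sk_dir {
	uint32_t cookies[SK_MAX_ENTRIES];	/* cookies handed out so far */
	size_t	 count;
	uint32_t next_dup_id;			/* next incremental id to try */
};

/* Linear scan; the real code searches an RB-tree keyed by the hash. */
static int
sk_in_use(const struct sk_dir *d, uint32_t cookie)
{
	size_t i;

	for (i = 0; i < d->count; i++)
		if (d->cookies[i] == cookie)
			return (1);
	return (0);
}

/*
 * Pick a cookie for a new entry: the name hash if it is free,
 * otherwise an incremental id from a separate (flagged) range.
 */
uint32_t
sk_cookie_alloc(struct sk_dir *d, uint32_t hash)
{
	uint32_t cookie;

	assert(d != NULL && d->count < SK_MAX_ENTRIES);
	cookie = hash & ~SK_DUP_FLAG;
	if (sk_in_use(d, cookie))
		cookie = SK_DUP_FLAG | d->next_dup_id++;
	d->cookies[d->count++] = cookie;
	return (cookie);
}
```

Keeping collision cookies in their own flagged range means a hash-derived cookie can never clash with an incremental one, which is one plausible reason for tracking duplicates separately.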
* Complete MPSAFE VFS interface and remove MNTK_MPSAFE flag.
  Porters should refer to __FreeBSD_version 1000021 for this change as it may have happened in the same timeframe.
  [attilio, 2012-11-09, 1 file, -1/+0]
* Fix up kernel sources to be ready for a 64-bit ino_t.
  Original code by: Gleb Kurtsou
  [mdf, 2012-09-27, 1 file, -1/+2]
* After the PHYS_TO_VM_PAGE() function was de-inlined, the main reason to pull vm_param.h was removed. The other big dependency of vm_page.h on vm_param.h is the PA_LOCK* definitions, which are only needed for in-kernel code, because modules use KBI-safe functions to lock the pages.
  Stop including vm_param.h into vm_page.h. Include vm_param.h explicitly for the kernel code which needs it.
  Suggested and reviewed by: alc
  MFC after: 2 weeks
  [kib, 2012-08-05, 2 files, -0/+2]
* Remove unused thread argument to vrecycle().
  Reviewed by: kib
  [trasz, 2012-04-23, 1 file, -2/+1]
* Return EOPNOTSUPP rather than EPERM for the SF_SNAPSHOT flag because tmpfs doesn't support snapshots.
  Suggested by: bde
  [jh, 2012-04-18, 1 file, -4/+1]
* Sync tmpfs_chflags() with the recent changes to UFS:
  - Add a check for unsupported file flags.
  - Return EPERM when a user without PRIV_VFS_SYSFLAGS privilege attempts to toggle SF_SETTABLE flags.
  [jh, 2012-04-16, 1 file, -13/+13]
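The two checks listed above, together with the EOPNOTSUPP choice for SF_SNAPSHOT from the following entry, can be sketched as a small user-space function. The `SK_` constants and the `has_sysflags_priv` parameter are stand-ins invented for this sketch; they are not the kernel's definitions, and the real tmpfs_chflags() performs further checks that are omitted here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Illustrative flag values for this sketch only. */
#define SK_UF_SETTABLE	0x0000ffffUL	/* flags an owner may change */
#define SK_SF_SETTABLE	0xffff0000UL	/* flags needing extra privilege */
#define SK_UF_NODUMP	0x00000001UL
#define SK_SF_IMMUTABLE	0x00020000UL
#define SK_SF_APPEND	0x00040000UL
#define SK_SF_SNAPSHOT	0x00200000UL	/* deliberately not supported */
#define SK_SUPPORTED	(SK_UF_NODUMP | SK_SF_IMMUTABLE | SK_SF_APPEND)

static_assert((SK_SUPPORTED & ~(SK_UF_SETTABLE | SK_SF_SETTABLE)) == 0,
    "supported flags must lie inside the settable masks");

/*
 * Returns 0 if the change is allowed, EOPNOTSUPP if an unsupported
 * flag is requested, or EPERM if an unprivileged caller tries to
 * toggle a flag in the SF_SETTABLE range.
 */
int
sk_check_chflags(unsigned long old_flags, unsigned long new_flags,
    bool has_sysflags_priv)
{
	if ((new_flags & ~SK_SUPPORTED) != 0)
		return (EOPNOTSUPP);
	if (!has_sysflags_priv &&
	    ((old_flags ^ new_flags) & SK_SF_SETTABLE) != 0)
		return (EPERM);
	return (0);
}
```

Checking for unsupported flags first means a request for a snapshot flag reports "not supported" regardless of the caller's privilege.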
* tmpfs: Allow update mounts only for certain options.
  Since r230208 update mounts were allowed if the list of mount options contained the "export" option. This is not correct as tmpfs doesn't really support updating all options.
  Reviewed by: kevlo, trociny
  [jh, 2012-04-16, 2 files, -6/+15]
* Provide better description for vfs.tmpfs.memory_reserved sysctl.
  Suggested by: Anton Yuzhaninov <citrin@citrin.ru>
  [gleb, 2012-04-15, 1 file, -1/+2]
* - Introduce a cache-miss optimization for consistency with other accesses of the cache member of vm_object objects.
  - Use novel vm_page_is_cached() for checks outside of the vm subsystem.
  Reviewed by: alc
  MFC after: 2 weeks
  X-MFC: r234039
  [attilio, 2012-04-09, 1 file, -1/+1]
* tmpfs supports only INT_MAX nodes due to limitations of the unit number allocator. Replace UINT32_MAX checks with INT_MAX. Keeping more than 2^31 nodes in memory is not likely to become possible in the foreseeable future and would require a new unit number allocator.
  Discussed with: delphij
  MFC after: 2 weeks
  [gleb, 2012-04-07, 1 file, -3/+7]
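A minimal sketch of the limit described above, assuming the node count is requested as a 64-bit value and that 0 means "no explicit limit". Both of those conventions, and the name `sk_clamp_nodes_max`, are assumptions made for illustration rather than details taken from the commit.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

static_assert(INT_MAX >= 0x7fffffff, "sketch assumes int is at least 32 bits");

/*
 * Clamp a requested node limit to what an int-based unit number
 * allocator could hand out. A request of 0 is treated as "use the
 * maximum" in this sketch.
 */
uint64_t
sk_clamp_nodes_max(uint64_t requested)
{
	if (requested == 0 || requested > (uint64_t)INT_MAX)
		return ((uint64_t)INT_MAX);
	return (requested);
}
```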
* Add vfs_getopt_size. Support human readable file system options in tmpfs.
  Increase maximum tmpfs file system size to 4GB*PAGE_SIZE on 32 bit archs.
  Discussed with: delphij
  MFC after: 2 weeks
  [gleb, 2012-04-07, 1 file, -16/+13]
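To illustrate what "human readable" size options might involve, here is a user-space sketch that parses values such as `512`, `64k`, `512m` or `4g` into a byte count. It is not the kernel's vfs_getopt_size(); the name `sk_parse_size`, the accepted suffix set and the error convention are all assumptions of this sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a decimal number with an optional k/m/g/t suffix (powers of
 * 1024, case-insensitive). Returns 0 and stores the byte count in
 * *out on success, or -1 on malformed input or overflow.
 */
int
sk_parse_size(const char *s, uint64_t *out)
{
	char *end;
	unsigned long long v;
	int shift = 0;

	assert(s != NULL && out != NULL);
	if (!isdigit((unsigned char)s[0]))
		return (-1);	/* rejects "", signs and leading spaces */
	errno = 0;
	v = strtoull(s, &end, 10);
	if (errno != 0)
		return (-1);
	switch (tolower((unsigned char)*end)) {
	case 't': shift = 40; end++; break;
	case 'g': shift = 30; end++; break;
	case 'm': shift = 20; end++; break;
	case 'k': shift = 10; end++; break;
	case '\0': break;
	default: return (-1);
	}
	if (*end != '\0')
		return (-1);	/* trailing garbage after the suffix */
	if (v > (UINT64_MAX >> shift))
		return (-1);	/* shifting would overflow 64 bits */
	*out = (uint64_t)v << shift;
	return (0);
}
```

A caller would typically convert the resulting byte count to pages before comparing it with any file system size limit.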
* Add reserved memory limit sysctl to tmpfs.
  Clean up available and used memory functions. Check if free pages are available before allocating a new node.
  Discussed with: delphij
  [gleb, 2012-04-07, 3 files, -61/+91]
* Prevent tmpfs_rename() deadlock in a way similar to UFS.
  Unlock vnodes and try to lock them one by one. Relookup fvp and tvp.
  Approved by: mdf (mentor)
  [gleb, 2012-03-14, 2 files, -7/+179]
* Don't enforce LK_RETRY to get existing vnode in tmpfs_alloc_vp().
  Doomed vnode is hardly of any use here, besides all callers handle error case. vfs_hash_get() does the same.
  Don't mess with vnode holdcount, vget() takes care of it already.
  Approved by: mdf (mentor)
  [gleb, 2012-03-14, 1 file, -7/+11]
* Remove fifo.h. The only used function declaration from the header is migrated to sys/vnode.h.
  Submitted by: gianni
  [kib, 2012-03-11, 1 file, -1/+0]
* Similar to the fixes in 226967 and 226987, purge any name cache entries associated with the previous vnode (if any) associated with the target of a rename(). Otherwise, a lookup of the target pathname concurrent with a rename() could re-add a name cache entry after the namei(RENAME) lookup in kern_renameat() had purged the target pathname.
  MFC after: 2 weeks
  [jhb, 2012-03-02, 1 file, -0/+2]
* Replace PRIdMAX with "jd" in a printf call. Cast the corresponding value to intmax_t instead of uintmax_t, because the original type is off_t.
  [tijl, 2012-02-14, 1 file, -5/+2]
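The point about off_t being signed can be shown with a short standard C sketch: cast to intmax_t and print with `%jd`, so negative offsets are not reinterpreted as huge unsigned values. The helper name `sk_format_offset` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Format a file offset into buf. off_t is a signed type, so the
 * matching cast is intmax_t (not uintmax_t) with the "%jd" format.
 * Returns the snprintf() result.
 */
int
sk_format_offset(char *buf, size_t len, off_t off)
{
	assert(buf != NULL && len > 0);
	return (snprintf(buf, len, "%jd", (intmax_t)off));
}
```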
* Return EOPNOTSUPP since we only support update mounts for NFS export.
  Spotted by: trociny
  [kevlo, 2012-01-17, 1 file, -0/+4]
* Add nfs export support to tmpfs(5)
  Reviewed by: kib
  [kevlo, 2012-01-16, 1 file, -4/+2]
* When tmpfs_write() resets an extended file to its original size after an error, we want tmpfs_reg_resize() to ignore I/O errors and unconditionally update the file's size.
  Reviewed by: kib
  MFC after: 3 weeks
  [alc, 2012-01-16, 3 files, -7/+12]
* Neither tmpfs_nocacheread() nor tmpfs_mappedwrite() needs to call vm_object_pip_{add,subtract}() on the swap object because the swap object can't be destroyed while the vnode is exclusively locked. Moreover, even if the swap object could have been destroyed during tmpfs_nocacheread() and tmpfs_mappedwrite() this code is broken because vm_object_pip_subtract() does not wake up the sleeping thread that is trying to destroy the swap object.
  Free invalid pages after an I/O error. There is no virtue in keeping them around in the swap object creating more work for the page daemon. (I believe that any non-busy page in the swap object will now always be valid.)
  vm_pager_get_pages() does not return a standard errno, so its return value should not be returned by tmpfs without translation to an errno value.
  There is no reason for the wakeup on vpg in tmpfs_mappedwrite() to occur with the swap object locked.
  Eliminate printf()s from tmpfs_nocacheread() and tmpfs_mappedwrite(). (The swap pager already spams your console if data corruption is imminent.)
  Reviewed by: kib
  MFC after: 3 weeks
  [alc, 2012-01-14, 2 files, -21/+21]
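The remark about translating the pager's return value can be sketched as follows. The enum below is a stand-in defined for this example, not the kernel's VM_PAGER_* constants, and mapping every failure to EIO is an assumption about one reasonable policy rather than a statement of what the commit does.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical pager status codes, local to this sketch. */
enum sk_pager_rv {
	SK_PAGER_OK = 0,
	SK_PAGER_BAD,
	SK_PAGER_FAIL,
	SK_PAGER_ERROR
};

static_assert(SK_PAGER_OK == 0, "success must map onto errno 0");

/* Translate a pager status into an errno value a VOP can return. */
int
sk_pager_rv_to_errno(enum sk_pager_rv rv)
{
	return (rv == SK_PAGER_OK ? 0 : EIO);
}
```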
* Correct an error of omission in the implementation of the truncation operation on POSIX shared memory objects and tmpfs. Previously, neither of these modules correctly handled the case in which the new size of the object or file was not a multiple of the page size. Specifically, they did not handle partial page truncation of data stored on swap. As a result, stale data might later be returned to an application.
  Interestingly, a data inconsistency was less likely to occur under tmpfs than POSIX shared memory objects. The reason being that a different mistake by the tmpfs truncation operation helped avoid a data inconsistency. If the data was still resident in memory in a PG_CACHED page, then the tmpfs truncation operation would reactivate that page, zero the truncated portion, and leave the page pinned in memory. More precisely, the benevolent error was that the truncation operation didn't add the reactivated page to any of the paging queues, effectively pinning the page. This page would remain pinned until the file was destroyed or the page was read or written. With this change, the page is now added to the inactive queue.
  Discussed with: jhb
  Reviewed by: kib (an earlier version)
  MFC after: 3 weeks
  [alc, 2012-01-08, 1 file, -19/+59]
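The partial-page case described above comes down to zeroing the bytes between the new end of file and the end of the last retained page. A user-space sketch, assuming a 4096-byte page and a flat buffer standing in for the backing pages (both hypothetical simplifications; the kernel code has to deal with resident, cached and swapped-out pages):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SK_PAGE_SIZE	4096u	/* assumed page size for this sketch */

/*
 * Zero the tail of the last retained page after truncating to
 * new_size. 'pages' must cover at least new_size rounded up to a
 * page boundary. Returns the number of bytes zeroed.
 */
size_t
sk_zero_partial_page(unsigned char *pages, size_t new_size)
{
	size_t base, tail;

	assert(pages != NULL);
	base = new_size % SK_PAGE_SIZE;
	if (base == 0)
		return (0);	/* page-aligned: no partially kept page */
	tail = SK_PAGE_SIZE - base;
	memset(pages + new_size, 0, tail);
	return (tail);
}
```

Without this step, a later extension of the file could expose the old bytes that were left behind in the page.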
* Don't pass VM_ALLOC_ZERO to vm_page_grab() in tmpfs_mappedwrite() and tmpfs_nocacheread(). It is both unnecessary and a pessimization. It results in either the page being zeroed twice or zeroed first and then overwritten by an I/O operation.
  MFC after: 3 weeks
  [alc, 2012-01-03, 1 file, -2/+2]
* Avoid panics from recursive rename operations. Not a perfect patch but good enough for now.
  PR: kern/159418
  Submitted by: Gleb Kurtsou
  Reviewed by: kib
  MFC after: 1 month
  [ivoras, 2011-11-22, 1 file, -6/+3]
* Improve the way to calculate available pages in tmpfs:
  - Don't deduct wired pages from total usable counts because it does not make any sense. To make things worse, on systems where swap size is smaller than physical memory and which use a lot of wired pages (e.g. ZFS), tmpfs can suddenly have free space of 0 because of this;
  - Count cached pages as available; [1]
  - Don't count inactive pages as available, technically we could but that might be too aggressive; [1]
  [1] Suggested by kib@
  MFC after: 1 week
  [delphij, 2011-11-21, 1 file, -4/+1]
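The accounting rules listed above can be written down as a small function. The struct and field names are hypothetical, and including available swap in the total is an assumption of this sketch (the commit text mentions swap size but does not spell out the formula).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical snapshot of system page counters. */
struct sk_mem_counts {
	size_t swap_avail;	/* pages of free swap (assumed term) */
	size_t free_pages;
	size_t cached_pages;
	size_t inactive_pages;	/* deliberately not counted */
	size_t wired_pages;	/* deliberately not deducted */
};

/*
 * Pages considered available to the file system: free and cached
 * pages count, inactive pages do not, and wired pages are not
 * subtracted from the total.
 */
size_t
sk_pages_available(const struct sk_mem_counts *c)
{
	assert(c != NULL);
	return (c->swap_avail + c->free_pages + c->cached_pages);
}
```

With this shape, a large wired count (for example from ZFS) no longer drives the reported free space toward zero.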
* Don asbestos garment and remove the warning about TMPFS being experimental -- highly experimental even. So far the closest to a bug in TMPFS that people have gotten to relates to how ZFS can take away from the memory that TMPFS needs. One can argue that such is not a bug in TMPFS. Irrespective, even if there is a bug here and there in TMPFS, it's not to our own advantage to scare people away from using TMPFS. I for one have been using it, even with ZFS, very successfully.
  [marcel, 2011-11-07, 1 file, -3/+0]
* Added missing cache purge of from argument for rename().
  Reported by: Anton Yuzhaninov <citrin citrin ru>
  In collaboration with: kib
  MFC after: 1 week
  [pho, 2011-11-01, 1 file, -0/+1]
* Split the vm_page flags PG_WRITEABLE and PG_REFERENCED into atomic flags field. Updates to the atomic flags are performed using the atomic ops on the containing word, do not require any vm lock to be held, and are non-blocking. The vm_page_aflag_set(9) and vm_page_aflag_clear(9) functions are provided to modify aflags.
  Document the changes to flags field to only require the page lock.
  Introduce vm_page_reference(9) function to provide a stable KPI and KBI for filesystems like tmpfs and zfs which need to mark a page as referenced.
  Reviewed by: alc, attilio
  Tested by: marius, flo (sparc64); andreast (powerpc, powerpc64)
  Approved by: re (bz)
  [kib, 2011-09-06, 1 file, -6/+3]
* Add a new option, OBJPR_NOTMAPPED, to vm_object_page_remove(). Passing this option to vm_object_page_remove() asserts that the specified range of pages is not mapped, or more precisely that none of these pages have any managed mappings. Thus, vm_object_page_remove() need not call pmap_remove_all() on the pages.
  This change not only saves time by eliminating pointless calls to pmap_remove_all(), but it also eliminates an inconsistency in the use of pmap_remove_all() versus related functions, like pmap_remove_write(). It eliminates harmless but pointless calls to pmap_remove_all() that were being performed on PG_UNMANAGED pages.
  Update all of the existing assertions on pmap_remove_all() to reflect this change.
  Reviewed by: kib
  [alc, 2011-06-29, 1 file, -1/+1]
* Add a lock flags argument to the VFS_FHTOVP() file system method, so that callers can indicate the minimum vnode locking requirement. This will allow some file systems to choose to return a LK_SHARED locked vnode when LK_SHARED is specified for the flags argument. This patch only adds the flag. It does not change any file system to use it and all callers specify LK_EXCLUSIVE, so file system semantics are not changed.
  Reviewed by: kib
  [rmacklem, 2011-05-22, 1 file, -2/+4]
* Eliminate two dubious attempts at optimizing the implementation of a file's last accessed, modified, and changed times:
  TMPFS_NODE_ACCESSED and TMPFS_NODE_CHANGED should be set unconditionally in tmpfs_remove() without regard to the number of hard links to the file. Otherwise, after the last directory entry for a file has been removed, a process that still has the file open could read stale values for the last accessed and changed times with fstat(2).
  Similarly, tmpfs_close() should update the time-related fields even if all directory entries for a file have been removed. In this case, the effect is that the time-related fields will have values that are later than expected. They will correspond to the time at which fstat(2) is called.
  In collaboration with: kib
  MFC after: 1 week
  [alc, 2011-02-22, 1 file, -12/+4]
* tmpfs_remove() isn't modifying the file's data, so it shouldn't set TMPFS_NODE_MODIFIED on the node.
  PR: 152488
  Submitted by: Anton Yuzhaninov
  Reviewed by: kib
  MFC after: 1 week
  [alc, 2011-02-19, 1 file, -2/+1]
* Further simplify tmpfs_reg_resize(). Also, update its comments, including style fixes.
  [alc, 2011-02-14, 1 file, -17/+12]
* Eliminate tn_reg.tn_aobj_pages. Instead, correctly maintain the vm object's size field. Previously, that field was always zero, even when the object tn_reg.tn_aobj contained numerous pages.
  Apply style fixes to tmpfs_reg_resize().
  In collaboration with: kib
  [alc, 2011-02-13, 2 files, -33/+26]
* In tmpfs_readdir(), normalize handling of the directory entries that either overflow the supplied buffer or cause uiomove() to fail. Do not advance the cached de when the directory entry was not copied out. Do not return EOF when no entries could be copied because the first entry is too large for the supplied buffer; signal EINVAL instead.
  Reported by: Beat Gätzi <beat chruetertee ch>
  MFC after: 1 week
  [kib, 2011-01-20, 2 files, -4/+5]
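The readdir behaviour described above (advance only past entries that were actually copied, and report EINVAL rather than EOF when not even the first pending entry fits) can be sketched in user space. The function and parameter names are hypothetical, and plain NUL-terminated names stand in for struct dirent records.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy names[*cursor .. n) with their NUL terminators into buf.
 * Stops before the first name that does not fit and leaves *cursor
 * pointing at it, so the next call resumes there. Returns 0 on
 * success (including end of directory), or EINVAL when entries
 * remain but the first of them is too large for the buffer.
 */
int
sk_copy_entries(const char *const *names, size_t n, size_t *cursor,
    char *buf, size_t buflen, size_t *used)
{
	size_t start, len;

	assert(names != NULL && cursor != NULL && buf != NULL && used != NULL);
	start = *cursor;
	*used = 0;
	while (*cursor < n) {
		len = strlen(names[*cursor]) + 1;
		if (len > buflen - *used)
			break;		/* does not fit: leave it for later */
		memcpy(buf + *used, names[*cursor], len);
		*used += len;
		(*cursor)++;		/* advance only after a successful copy */
	}
	if (*cursor == start && start < n)
		return (EINVAL);
	return (0);
}
```

Returning 0 with nothing copied is reserved for a true end of directory, so a caller can tell "buffer too small" apart from "no more entries".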
* tmpfs + sendfile: do not produce partially valid pages for vnode's tail
  See r213730 for details of analogous change in ZFS.
  MFC after: 3 days
  [avg, 2010-10-12, 1 file, -3/+6]
* tmpfs, zfs + sendfile: mark page bits as valid after populating it with data
  Otherwise, adding insult to injury, in addition to double-caching of data we would always copy the data into a vnode's vm object page from backend. This is specific to sendfile case only (VOP_READ with UIO_NOCOPY).
  PR: kern/141305
  Reported by: Wiktor Niesiobedzki <bsd@vink.pl>
  Reviewed by: alc
  Tested by: tools/regression/sockets/sendfile
  MFC after: 2 weeks
  [avg, 2010-09-15, 1 file, -0/+2]