path: root/sys/fs
Commit message | Author | Age | Files | Lines
* In some cases, like whenever devfs file times are zero, the fix(aa) will not be applied to dev entries. This leaves us with file times like "Jan 1 1970." Work around this problem by replacing the tv_sec == 0 check with a <= 3600 check. It's doubtful anyone will be booting within an hour of the Epoch, let alone care about a few seconds worth of nonzero timestamps. It's a hackish workaround, but it does work and I have not experienced any negatives in my testing.
  Discussed with: bde
  "Ok with me": phk
  [trhodes, 2007-04-20, 1 file, -1/+1]
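The threshold check described in this commit can be sketched in user-space C. This is a minimal illustration, not the devfs code: the helper names, the `EPOCH_SLACK_SECONDS` constant, and the idea of passing in a fallback time are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical constant: one hour after the Epoch, as in the commit message. */
#define EPOCH_SLACK_SECONDS 3600

/*
 * Hypothetical helper: treat any timestamp within the first hour after
 * the Epoch as "never set", instead of testing for exactly zero.
 */
static bool
timestamp_looks_unset(time_t tv_sec)
{
    return (tv_sec <= EPOCH_SLACK_SECONDS);
}

/* Replace an unset-looking timestamp with a caller-supplied fallback. */
static time_t
fix_timestamp(time_t tv_sec, time_t fallback)
{
    /* The fallback itself must not look unset, or the fix is pointless. */
    assert(fallback > EPOCH_SLACK_SECONDS);
    return (timestamp_looks_unset(tv_sec) ? fallback : tv_sec);
}
```

With a `== 0` test, a timestamp of a few seconds past the Epoch would slip through unfixed; the `<=` comparison catches it at the cost of misclassifying genuine timestamps from the first hour of 1970.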
* Avoid "unused variable" warning when building without PSEUDOFS_TRACE.
  [des, 2007-04-15, 1 file, -0/+1]
* Make pseudofs (and consequently procfs, linprocfs and linsysfs) MPSAFE.
  [des, 2007-04-15, 6 files, -340/+553]
* Instead of stating GIANT_REQUIRED, just acquire and release Giant where needed. This does not make a difference now, but will when procfs is marked MPSAFE.
  [des, 2007-04-15, 1 file, -2/+5]
* Fix the same bug as in procfs_doproc{,db}regs(): check that uio_offset is 0 upon entry, and don't reset it before returning.
  MFC after: 3 weeks
  [des, 2007-04-15, 1 file, -1/+3]
* Don't reset uio_offset to 0 before returning. Instead, refuse to service requests where uio_offset is not 0 to begin with. This fixes a long-standing bug where e.g. 'cat /proc/$$/regs' would loop forever.
  MFC after: 3 weeks
  [des, 2007-04-15, 2 files, -3/+7]
* Further pseudofs improvements:

  The pfs_info mutex is only needed to lock pi_unrhdr. Everything else in struct pfs_info is modified only while Giant is held (during vfs_init() / vfs_uninit()); add assertions to that effect.

  Simplify pfs_destroy somewhat.

  Remove superfluous arguments from pfs_fileno_{alloc,free}(), and the assertions which were added in the previous commit to ensure they were consistent.

  Assert that Giant is held while the vnode cache is initialized and destroyed. Also assert that the cache is empty when it is destroyed.

  Rename the vnode cache mutex for consistency.

  Fix a long-standing bug in pfs_getattr(): it would uncritically return the node's pn_fileno as st_ino. This would result in st_ino being 0 if the node had not previously been visited by readdir(), and also in an incorrect st_ino for process directories and any files contained therein. Correct this by abstracting the fileno manipulations previously done in pfs_readdir() into a new function, pfs_fileno(), which is used by both pfs_getattr() and pfs_readdir().
  [des, 2007-04-14, 6 files, -62/+59]
* Add a flag to struct pfs_vdata to mark the vnode as dead (e.g. process-specific nodes when the process exits).

  Move the vnode-cache-walking loop which was duplicated in pfs_exit() and pfs_disable() into its own function, pfs_purge(), which looks for vnodes marked as dead and / or belonging to the specified pfs_node and reclaims them. Note that this loop is still extremely inefficient.

  Add a comment in pfs_vncache_alloc() explaining why we have to purge the vnode from the vnode cache before returning, in case anyone should be tempted to remove the call to cache_purge().

  Move the special handling for pfstype_root nodes into pfs_fileno_alloc() and pfs_fileno_free() (the root node's fileno must always be 2). This also fixes a bug where pfs_fileno_free() would reclaim the root node's fileno, triggering a panic in the unr code, as that fileno was never allocated from unr to begin with.

  When destroying a pfs_node, release its fileno and purge it from the vnode cache. I wish we could put off the call to pfs_purge() until after the entire tree had been destroyed, but then we'd have vnodes referencing freed pfs nodes. This probably doesn't matter while we're still under Giant, but might become an issue later.

  When destroying a pseudofs instance, destroy the tree before tearing down the fileno allocator.

  In pfs_mount(), acquire the mountpoint interlock when required.

  MFC after: 3 weeks
  [des, 2007-04-11, 5 files, -51/+74]
* Whitespace nits.
  [des, 2007-04-05, 2 files, -4/+4]
* Replace custom file descriptor array sleep lock constructed using a mutex and flags with an sxlock. This leads to a significant and measurable performance improvement as a result of access to shared locking for frequent lookup operations, reduced general overhead, and reduced overhead in the event of contention. All of these are important for threaded applications where simultaneous access to a shared file descriptor array occurs frequently. Kris has reported 2x-4x transaction rate improvements on 8-core MySQL benchmarks; smaller improvements can be expected for many workloads as a result of reduced overhead.

  - Generally eliminate the distinction between "fast" and regular acquisition of the filedesc lock; the plan is that they will now all be fast. Change all locking instances to either shared or exclusive locks.
  - Correct a bug (pointed out by kib) in fdfree() where previously msleep() was called without the mutex held; sx_sleep() is now always called with the sxlock held exclusively.
  - Universally hold the struct file lock over changes to struct file, rather than the filedesc lock or no lock. Always update the f_ops field last. A further memory barrier is required here in the future (discussed with jhb).
  - Improve locking and reference management in linux_at(), which fails to properly acquire vnode references before using vnode pointers. Annotate improper use of vn_fullpath(), which will be replaced at a future date.

  In fcntl(), we conservatively acquire an exclusive lock, even though in some cases a shared lock may be sufficient, which should be revisited. The dropping of the filedesc lock in fdgrowtable() is no longer required as the sxlock can be held over the sleep operation; we should consider removing that (pointed out by attilio).

  Tested by: kris
  Discussed with: jhb, kris, attilio, jeff
  [rwatson, 2007-04-04, 5 files, -11/+15]
* Annotate that this Giant acquisition is dependent on tty locking.
  [kris, 2007-03-26, 1 file, -2/+2]
* o cd9660 code repo-copied, update a comment.
  [maxim, 2007-03-24, 1 file, -1/+1]
* Make insmntque() externally visible and allow it to fail (e.g. during late stages of unmount). On failure, the vnode is recycled.

  Add insmntque1(), to allow for file system specific cleanup when recycling vnode on failure.

  Change getnewvnode() to no longer call insmntque(). Previously, embryonic vnodes were put onto the list of vnodes belonging to a file system, which is unsafe for a file system marked MPSAFE.

  Change vfs_hash_insert() to no longer lock the vnode. The caller now has that responsibility.

  Change most file systems to lock the vnode and call insmntque() or insmntque1() after a new vnode has been sufficiently set up. Handle failed insmntque*() calls by propagating errors to callers, possibly after some file system specific cleanup.

  Approved by: re (kensmith)
  Reviewed by: kib
  In collaboration with: kib
  [tegge, 2007-03-13, 16 files, -11/+132]
* Add a pn_destroy field to pfs_node. This field points to a destructor function which is called from pfs_destroy() before the node is reclaimed.

  Modify pfs_create_{dir,file,link}() to accept a pointer to a destructor function in addition to the usual attr / fill / vis pointers.

  This breaks both the programming and binary interfaces between pseudofs and its consumers. It is believed that there are no pseudofs consumers outside the source tree, so that the impact of this change is minimal.

  Submitted by: Aniruddha Bohra <bohra@cs.rutgers.edu>
  [des, 2007-03-12, 3 files, -22/+44]
* Change fifo_printinfo to check if the vnode v_fifoinfo pointer is NULL and print a message to that effect to prevent a panic.
  [mpp, 2007-03-02, 1 file, -0/+4]
* Use pause() rather than tsleep() on stack variables and function pointers.
  [jhb, 2007-02-27, 1 file, -1/+1]
* Check that the error returned by vfs_getopts() is not ENOENT before assuming there's actually an error. This is just in order to unbreak ntfs on current, before a proper solution is committed.
  [cognet, 2007-02-21, 1 file, -2/+2]
* Do allow PIOCSFL in jail for setguid processes; this is more consistent with other debugging checks elsewhere. XXX comment on the fact that p_candebug() is not being used here remains.
  [rwatson, 2007-02-19, 1 file, -4/+2]
* Move vnode-to-file-handle translation from vfs_vptofh to vop_vptofh method. This way we may support multiple structures in v_data vnode field within one file system without using black magic.

  Vnode-to-file-handle should be VOP in the first place, but was made VFS operation to keep interface as compatible as possible with SUN's VFS. BTW. Now Solaris also implements vnode-to-file-handle as VOP operation.

  VFS_VPTOFH() was left for API backward compatibility, but is marked for removal before 8.0-RELEASE.

  Approved by: mckusick
  Discussed with: many (on IRC)
  Tested with: ufs, msdosfs, cd9660, nullfs and zfs
  [pjd, 2007-02-15, 18 files, -146/+154]
* Forced commit and #include changes for repo copy from sys/isofs/cd9660 to sys/fs/cd9660.
  Discussed on freebsd-current.
  [rodrigc, 2007-02-11, 7 files, -21/+21]
* Add noatime to the list of mount options that msdosfs accepts.
  PR: 108896
  Submitted by: Eugene Grosbein <eugen grosbein pp ru>
  [rodrigc, 2007-02-08, 1 file, -1/+1]
* Style fixes: use ANSI C function declarations.
  [rodrigc, 2007-02-08, 1 file, -31/+8]
* Fix the race of dereferencing /proc/<pid>/file with execve(2) by caching the value of p_textvp. This way, we always unlock the locked vnode. While there, vhold() the vnode around the vn_lock().
  Reported and tested by: Guy Helmer (ghelmer palisadesys com)
  Approved by: des (procfs maintainer)
  MFC after: 1 week
  [kib, 2007-02-07, 1 file, -4/+12]
* Eliminate some dead code which was introduced in 1.23, yet was always commented out.
  [rodrigc, 2007-02-06, 1 file, -11/+0]
* coda_vptofh is never defined nor used.
  [pjd, 2007-02-02, 1 file, -1/+0]
* Fixing compilation bustage by removing references to opt_msdosfs.h. This auto-generated header file no longer exists since the removal of MSDOSFS_LARGE in sys/conf/options:1.574.
  [avatar, 2007-01-30, 2 files, -4/+0]
* Fix spacing from my previous commit to this file.
  Noticed by: fjoe
  [trhodes, 2007-01-30, 1 file, -1/+1]
* Add a "-o large" mount option for msdosfs. Convert compile-time checks for #ifdef MSDOSFS_LARGE to run-time checks to see if "-o large" was specified.

  Test case provided by Oliver Fromme:

      truncate -s 200G test.img
      mdconfig -a -t vnode -f test.img -u 9
      newfs_msdos -s 419430400 -n 1 /dev/md9 zip250
      mount -t msdosfs /dev/md9 /mnt           # should fail
      mount -t msdosfs -o large /dev/md9 /mnt  # should succeed

  PR: 105964
  Requested by: Oliver Fromme <olli lurza secnetix de>
  Tested by: trhodes
  MFC after: 2 weeks
  [rodrigc, 2007-01-30, 2 files, -36/+54]
* Below is slightly edited description of the LOR by Tor Egge:

  --------------------------
  [Deadlock] is caused by a lock order reversal in vfs_lookup(), where [some] process is trying to lock a directory vnode (that is the parent directory of covered vnode) while holding an exclusive vnode lock on covering vnode.

  A simplified scenario:

      root fs         var fs
      /         A     / (/var)          D
      /var      B     /log (/var/log)   E
      vfs lock  C     vfs lock          F

  Within each file system, the lock order is clear: C->A->B and F->D->E

  When traversing across mounts, the system can choose between two lock orders, but everything must then follow that lock order:

      L1: C->A->B
                |
                +->F->D->E

      L2: F->D->E
                |
                +->C->A->B

  The lookup() process for namei("/var") mixes those two lock orders:

      VOP_LOOKUP() obtains B while A is held
      vfs_busy() obtains a shared lock on F while A and B are held (follows L1, violates L2)
      vput() releases lock on B
      VOP_UNLOCK() releases lock on A
      VFS_ROOT() obtains lock on D while shared lock on F is held
      vfs_unbusy() releases shared lock on F
      vn_lock() obtains lock on A while D is held (violates L1, follows L2)

  dounmount() follows L1 (B is locked while F is drained).

  Without unmount activity, vfs_busy() will always succeed without blocking and the deadlock isn't triggered (the system behaves as if L2 is followed).

  With unmount, you can get 4 processes in a deadlock:

      p1: holds D, want A (in lookup())
      p2: holds shared lock on F, want D (in VFS_ROOT())
      p3: holds B, want drain lock on F (in dounmount())
      p4: holds A, want B (in VOP_LOOKUP())

  You can have more than one instance of p2.

  The reversal was introduced in revision 1.81 of src/sys/kern/vfs_lookup.c and MFCed to revision 1.80.2.1, probably to avoid a cascade of vnode locks when nfs servers are dead (VFS_ROOT() just hangs) spreading to the root fs root vnode.

  - Tor Egge

  To fix the LOR, ups@ noted that when crossing the mount point, ni_dvp is actually not used by the callers of namei. Thus, placeholder deadfs vnode vp_crossmp is introduced that is filled into ni_dvp.

  Idea by: ups
  Reviewed by: tegge, ups, jeff, rwatson (mac interaction)
  Tested by: Peter Holm
  MFC after: 2 weeks
  [kib, 2007-01-22, 1 file, -1/+24]
* Add a 3rd entry in the cache, which keeps the end position from just before extending a file. This has the desired effect of keeping the write speed constant. And yes, that helps a lot: copying large files now always runs at full speed, and I have seen improvements using benchmarks/bonnie.
  Stolen from: NetBSD
  Reviewed by: bde
  [trhodes, 2007-01-16, 2 files, -3/+19]
* Rewrite the udf_read() routine to use a file vnode instead of the devvp vnode. The code is modelled after cd9660, including support for simple read-ahead courtesy of clustered read.

  Fix udf_strategy to DTRT.

  This change fixes sendfile(2) not to send out garbage.

  Reviewed by: scottl
  MFC after: 1 month
  [pav, 2007-01-15, 1 file, -24/+52]
* Tell backing v_object the filesize right on its creation.
  MFC after: 1 week
  [pav, 2007-01-07, 1 file, -1/+6]
* When performing a mount update to change a mount from read-only to read-write, do not call markvoldirty() until the mount has been flagged as read-write. Due to the nature of the msdosfs code, this bug only seemed to appear for FAT-16 and FAT-32.

  This fixes the testcase:

      #!/bin/sh
      dd if=/dev/zero bs=1m count=1 oseek=119 of=image.msdos
      mdconfig -a -t vnode -f image.msdos
      newfs_msdos -F 16 /dev/md0 fd120m
      mount_msdosfs -o ro /dev/md0 /mnt
      mount | grep md0
      mount -u -o rw /dev/md0; echo $?
      mount | grep md0
      umount /mnt
      mdconfig -d -u 0

  PR: 105412
  Tested by: Eugene Grosbein <eugen grosbein pp ru>
  [rodrigc, 2007-01-06, 1 file, -4/+10]
* Simplify code in union_hashins() and union_hashget() functions. These functions now more closely resemble similar functions in nullfs. This also eliminates some errors.
  Submitted by: daichi, Masanori OZAWA <ozawa ongs co jp>
  [rodrigc, 2007-01-05, 1 file, -24/+12]
* Eliminate obsolete comment, now that getushort() is implemented in terms of functions in <sys/endian.h>.
  [rodrigc, 2007-01-05, 1 file, -4/+0]
* Eliminate ASSERT_VOP_ELOCKED panics when doing mkdir or symlink when sysctl vfs.lookup_shared=1.
  Submitted by: daichi, Masanori OZAWA <ozawa ongs co jp>
  [rodrigc, 2007-01-05, 1 file, -8/+20]
* Use the vnode interlock to close a race where pfs_vncache_alloc() could attempt to vn_lock() a destroyed vnode resulting in a hang.
  MFC after: 1 week
  Submitted by: ups
  Reviewed by: des
  [jhb, 2007-01-02, 1 file, -8/+9]
* Call vnode_create_vobject() in VOP_OPEN. Makes mmap work on UDF filesystem.
  PR: kern/92040
  Approved by: scottl
  MFC after: 1 week
  [pav, 2006-12-23, 1 file, -0/+12]
* Unbreak 64-bit little-endian systems that do require alignment. The fix involves using le16dec(), le32dec(), le16enc() and le32enc(). This eliminates invalid casts and duplicated logic.
  [marcel, 2006-12-21, 1 file, -18/+5]
* For big-endian version of getulong() macro, cast result to u_int32_t. This macro was written expecting a 32-bit unsigned long, and doesn't work properly on 64-bit systems. This bug caused vn_stat() to return incorrect values for files larger than 2gb on msdosfs filesystems on 64-bit systems.
  PR: 106703
  Submitted by: Axel Gonzalez <loox e-shell net>
  MFC after: 3 days
  [rodrigc, 2006-12-19, 1 file, -1/+1]
* Fix get_ulong() macro on AMD64 (or any little-endian 64-bit platform). This bug caused vn_stat() to fail on files larger than 2gb on msdosfs filesystems on AMD64.
  PR: 106703
  Tested by: Axel Gonzalez <loox e-shell net>
  MFC after: 3 days
  [rodrigc, 2006-12-19, 1 file, -5/+1]
* Remove unused variable in unionfs_root().
  Submitted by: daichi, Masanori OZAWA
  [rodrigc, 2006-12-09, 1 file, -2/+0]
* Use vfs_mount_error() in a few places to give more descriptive mount error messages.
  [rodrigc, 2006-12-09, 1 file, -2/+6]
* Add locking around calls to unionfs_get_node_status() in unionfs_ioctl() and unionfs_poll().
  Submitted by: daichi, Masanori OZAWA <ozawa@ongs.co.jp>
  Prompted by: kris
  [rodrigc, 2006-12-09, 1 file, -0/+4]
* In unionfs_readdir(), prevent a possible NULL dereference.
  CID: 1667
  Found by: Coverity Prevent (tm)
  [rodrigc, 2006-12-09, 1 file, -0/+4]
* In unionfs_hashrem(), use LIST_FOREACH_SAFE when iterating over the list of nodes to free them.
  CID: 1668
  Found by: Coverity Prevent (tm)
  [rodrigc, 2006-12-09, 1 file, -2/+3]
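The reason a "safe" iteration is needed when freeing list nodes can be shown with a hand-rolled list. This sketch deliberately avoids `<sys/queue.h>` so that it builds anywhere; `struct node`, `build_list()` and `free_all()` are hypothetical names, and the loop only mirrors the idea behind LIST_FOREACH_SAFE rather than the macro itself.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical singly linked node, standing in for a LIST_ENTRY-based one. */
struct node {
    int          value;
    struct node *next;
};

/* Build a list of `count` nodes; returns the head (NULL when count is 0). */
static struct node *
build_list(int count)
{
    struct node *head = NULL;

    assert(count >= 0);
    for (int i = 0; i < count; i++) {
        struct node *n = malloc(sizeof(*n));

        if (n == NULL)
            abort();
        n->value = i;
        n->next = head;
        head = n;
    }
    return (head);
}

/*
 * Free every node and return how many were freed. The successor is saved
 * before free(), because reading n->next after free(n) is a
 * use-after-free; this is the step a plain foreach loop gets wrong.
 */
static int
free_all(struct node *head)
{
    struct node *n, *next;
    int freed = 0;

    for (n = head; n != NULL; n = next) {
        next = n->next;     /* read the link while the node is still valid */
        free(n);
        freed++;
    }
    return (freed);
}
```

A loop written as `for (n = head; n != NULL; n = n->next) free(n);` evaluates `n->next` on memory that has already been released, which is the defect class the Coverity report above points at.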
* Minor cleanup. If we are doing a mount update, and we pass in an "export" flag indicating that we are trying to NFS export the filesystem, and the MSDOSFS_LARGEFS flag is set on the filesystem, then deny the mount update and export request. Otherwise, let the full mount update proceed normally. MSDOSFS_LARGEFS and NFS don't mix because of the way inodes are calculated for MSDOSFS_LARGEFS.
  MFC after: 3 days
  [rodrigc, 2006-12-09, 1 file, -4/+8]
* The ISO9660 spec does allow files up to 4G. Change the i_size field to "unsigned long" so that it actually works. Thanks to Robert Sciuk for sending me a DVD that demonstrated ISO9660-formatted media with a file >2G. I've now fixed this both in libarchive and in the cd9660 filesystem.
  MFC after: 14 days
  [kientzle, 2006-12-08, 1 file, -1/+1]
* Threading cleanup.. part 2 of several.

  Make part of John Birrell's KSE patch permanent. Specifically, remove:

  - Any reference of the ksegrp structure. This feature was never fully utilised and made things overly complicated.
  - All code in the scheduler that tried to make threaded programs fair to unthreaded programs. Libpthread processes will already do this to some extent and libthr processes already disable it.

  Also: Since this makes such a big change to the scheduler(s), take the opportunity to rename some structures and elements that had to be moved anyhow. This makes the code a lot more readable.

  The ULE scheduler compiles again but I have no idea if it works.

  The 4bsd scheduler still requires a little cleaning and some functions that now do ALMOST nothing will go away, but I thought I'd do that as a separate commit.

  Tested by David Xu, and Dan Eischen using libthr and libpthread.
  [julian, 2006-12-06, 1 file, -10/+3]
* o Do not leave uninitialized birthtime: in MSDOSFSMNT_LONGNAME set birthtime to FAT CTime (creation time) and in the other cases set birthtime to -1.
  o Set ctime to mtime instead of FAT CTime which has completely different meaning.
  PR: kern/106018
  Submitted by: Oliver Fromme
  MFC after: 1 month
  [maxim, 2006-12-03, 1 file, -2/+4]