path: root/sys/ufs
Commit message [Author, Age, Files, Lines]
* Revert the previous commit. The race is not applicable to the lockmgr [jhb, 2010-07-16, 1 file, -2/+0]
| implementation in 8.0 and later as its flags field does not hold dynamic state such as waiters flags, but is only modified in lockinit() aside from VN_LOCK_*().
| Discussed with: attilio
* When the MNTK_EXTENDED_SHARED mount option was added, some filesystems were [jhb, 2010-07-16, 1 file, -0/+2]
| changed to defer the setting of VN_LOCK_ASHARE() (which clears LK_NOSHARE in the vnode lock's flags) until after they had determined if the vnode was a FIFO. This occurs after the vnode has been inserted into a VFS hash or some similar table, so it is possible for another thread to find this vnode via vget() on an i-node number and block on the vnode lock. If the lockmgr interlock (vnode interlock for vnode locks) is not held when clearing the LK_NOSHARE flag, then the lk_flags field can be clobbered. As a result, the thread blocked on the vnode lock may never get woken up. Fix this by holding the vnode interlock while modifying the lock flags in this case.
| MFC after: 3 days
* - Handle the truncation of an inode with an effective link count of 0 in [jeff, 2010-07-06, 9 files, -128/+34]
|   the context of the process that reduced the effective count. Previously all truncation as a result of unlink happened in the softdep flush thread. This made it impossible to rate limit properly with the journal code. Now the process issuing unlinks is suspended when the journal fills. This has a side-effect of improving rm performance by allowing more concurrent work.
| - Handle two cases in inactive, one for effnlink == 0 and another when nlink finally reaches 0.
| - Eliminate the SPACECOUNTED related code since the truncation is no longer delayed.
| Discussed with: mckusick
* Ensure that VOP_ACCESSX is called with exclusively locked vnode for [kib, 2010-06-20, 1 file, -0/+33]
| the kernel compiled with QUOTA option. ufs_accessx() upgrades the vdp vnode lock from shared to exclusive to assign the dquot structure to the vnode, and ufs_delete_denied() is called when tvp is locked. Since upgrade drops shared lock when non-blocked upgrade failed, LOR is there.
| Reported and tested by: Dmitry Pryanishnikov <lynx.ripe gmail com>
| Tested by: pho
| PR: kern/147890
| MFC after: 1 week
* ffs_softdep: change K&R in function definitions to ANSI prototypes [avg, 2010-06-11, 1 file, -19/+6]
| Apparently it's bad when we first have an ANSI prototype in function declaration, but then use K&R in its definition.
| Complaint from: clang
| MFC after: 2 weeks
* Extend the scope of the lock on the quota file vnode in quotaon() to [kib, 2010-06-03, 1 file, -4/+6]
| cover the initial read by dqopen(). Assert that vnode is locked in dqopen(). Remove VFS_LOCK_GIANT() from dqopen(), since quotaon() keeps Giant locked if needed around the call.
* ffs_mount: accept and drop userland-only options that can be passed from [avg, 2010-05-19, 1 file, -3/+12]
| loader(8)
| In r193192 loader(8) has grown an ability to pass root mount options from fstab via vfs.root.mountfrom.options. Unfortunately, some options that can be present in fstab are for userland only and lead to root mounting failure when seen by kernel. Rather than teaching loader about FFS-specific options that should be filtered out, ffs_mount recognizes those options as valid, but ignores and deletes[1] them.
| [1] is suggested by jh.
| PR: kern/141050
| Reported by: many
| Reviewed by: jh, bde
| MFC after: 4 days
* - Don't immediately re-run softdepflush if we didn't make any progress [jeff, 2010-05-19, 2 files, -51/+72]
|   on the last iteration. This can lead to a deadlock when we have worklist items that cannot be immediately satisfied.
|   Reported by: uqs, Dimitry Andric <dimitry@andric.com>
| - Remove some unnecessary debugging code and place some other under SUJ_DEBUG.
| - Examine the journal state in softdep_slowdown().
| - Re-format some comments so I may more easily add flag descriptions.
* - Call softdep_prealloc() before any of the balloc routines in the [jeff, 2010-05-07, 2 files, -1/+10]
|   snapshot code.
| - Don't fsync() vnodes in prealloc if copy on write is in progress. It is not safe to recurse back into the write path here.
| Reported by: Vladimir Grebenschikov <vova@fbsd.ru>
* - Use the correct flag mask when determining whether an inode has [jeff, 2010-05-07, 1 file, -1/+1]
|   successfully made it to the free list yet or not. This fixes a deadlock that can occur with unlinked but referenced files. Journal space and inodedeps were not correctly reclaimed because the inode block was not left dirty.
| Tested/Reported by: lwindschuh@googlemail.com
* Merger of the quota64 project into head. [mckusick, 2010-05-07, 4 files, -38/+417]
|\
| | This joint work of Dag-Erling Smørgrav and myself updates the FFS quota system to support both traditional 32-bit and new 64-bit quotas (for those of you who want to put 2+Tb quotas on your users).
| | By default quotas are not compiled into the kernel. To include them in your kernel configuration you need to specify:
| |     options QUOTA # Enable FFS quotas
| | If you are already running with the current 32-bit quotas, they should continue to work just as they have in the past. If you wish to convert to using 64-bit quotas, use `quotacheck -c 64'; if you wish to revert from 64-bit quotas back to 32-bit quotas, use `quotacheck -c 32'.
| | There is a new library of functions to simplify the use of the quota system, do `man quotafile' for details. If your application is currently using quotactl(2), it is highly recommended that you convert your application to use the quotafile interface. Note that existing binaries will continue to work.
| | Special thanks to John Kozubik of rsync.net for getting me interested in pursuing 64-bit quota support and for funding part of my development time on this project.
| * Final update to current version of head in preparation for reintegration. [mckusick, 2010-05-06, 3 files, -20/+181]
| |\
| * \ Update to current version of head. [mckusick, 2010-04-28, 18 files, -1803/+7576]
| |\ \
| * | | Debugging nits found while testing the new 64-bit quota code. [mckusick, 2010-03-16, 3 files, -3/+42]
| | | |
| * | | IFH@204581 [des, 2010-03-04, 10 files, -332/+919]
| |\ \ \
| * \ \ \ Sync with head [des, 2009-09-25, 1 file, -4/+0]
| |\ \ \ \
| * | | | | Further improve comments. [des, 2009-09-25, 1 file, -12/+6]
| | | | | |
| * | | | | Improve comments, and remove a bogus 0 id check. [des, 2009-09-25, 1 file, -16/+35]
| | | | | |
| * | | | | Merge from head [des, 2009-09-17, 16 files, -369/+660]
| |\ \ \ \ \
| * \ \ \ \ \ Merge from head up to r188941 (last revision before the USB stack switch) [des, 2009-09-17, 12 files, -105/+174]
| |\ \ \ \ \ \
| * | | | | | | WIP [des, 2009-01-30, 4 files, -37/+364]
| | | | | | | |
* | | | | | | | Eliminate page queues locking around most calls to vm_page_free(). [alc, 2010-05-06, 1 file, -2/+0]
| |_|_|_|_|_|/
|/| | | | | |
* | | | | | | Acquire the page lock around all remaining calls to vm_page_free() on [alc, 2010-05-05, 1 file, -2/+4]
| | | | | | | managed pages that didn't already have that lock held. (Freeing an unmanaged page, such as the various pmaps use, doesn't require the page lock.)
| | | | | | | This allows a change in vm_page_remove()'s locking requirements. It now expects the page lock to be held instead of the page queues lock. Consequently, the page queues lock is no longer required at all by callers to vm_page_rename().
| | | | | | | Discussed with: kib
* | | | | | | Move checking against RLIMIT_FSIZE into one place, vn_rlimit_fsize(). [trasz, 2010-05-05, 1 file, -15/+2]
| | | | | | | Reviewed by: kib
* | | | | | | ffs_vfsops: restore alphabetic order of options in ffs_opts [avg, 2010-04-29, 1 file, -2/+2]
| | | | | | | The order was not correct only for nfsv4acls. ("no" prefix is ignored)
| | | | | | | MFC after: 1 week
* | | | | | | - When canceling jaddrefs they may not yet be in the journal if this is via [jeff, 2010-04-28, 1 file, -1/+2]
| | | | | | |   a revert call. In this case don't attempt to remove something that has not yet been added. Otherwise this jaddref must hang around to prevent the bitmap write as normal.
* | | | | | | - Fix builds without SOFTUPDATES defined in the kernel config. [jeff, 2010-04-28, 1 file, -0/+171]
| |_|_|_|_|/
|/| | | | |
* | | | | | Fix build for UFS without SOFTUPDATES. [pjd, 2010-04-24, 1 file, -1/+2]
| | | | | |
* | | | | | - Merge soft-updates journaling from projects/suj/head into head. This [jeff, 2010-04-24, 18 files, -1801/+7566]
| | | | | |   brings in support for an optional intent log which eliminates the need for background fsck on unclean shutdown.
| | | | | | Sponsored by: iXsystems, Yahoo!, and Juniper.
| | | | | | With help from: McKusick and Peter Holm
* | | | | | The cache_enter(9) function shall not be called for doomed dvp. [kib, 2010-04-20, 1 file, -0/+8]
| | | | | | Assert this.
| | | | | | In the reported panic, vdestroy() fired the assertion "vp has namecache for ..", because pseudofs may end up doing cache_enter() with reclaimed dvp, after dotdot lookup temporarily unlocked dvp.
| | | | | | Similar problem exists in ufs_lookup() for "." lookup, when vnode lock needs to be upgraded.
| | | | | | Verify that dvp is not reclaimed before calling cache_enter().
| | | | | | Reported and tested by: pho
| | | | | | Reviewed by: kan
| | | | | | MFC after: 2 weeks
* | | | | | ffs_mount: remove redundant assignment of geom consumer to devvp.v_bufobj [avg, 2010-04-03, 1 file, -1/+0]
| |_|_|_|/
|/| | | |
| | | | | The assignment is already done in g_vfs_open. Redundant assignment is harmless, but can become a problem if g_vfs_open logic is changed.
| | | | | MFC after: 1 week
* | | | | When ffs_realloccg() failed to allocate bigger fragment and, because [kib, 2010-02-13, 1 file, -1/+3]
| | | | | pending blocks are scheduled for removal, goes to retry the (re)allocation, clear the bp pointer. It might happen that in the meantime free space is really exhausted and we are entering nospace: label without bread()ing buffer, causing stale bp value to be brelse()d again.
| | | | | Tested by: pho
| | | | | (Producing a scenario to reliably reproduce the race appeared to be much harder than fixing the bug)
| | | | | MFC after: 1 week
* | | | | One last pass to get all the unsigned comparisons correct. [mckusick, 2010-02-11, 1 file, -1/+1]
| | | | |
* | | | | This fix corrects a problem in the file system that treats large [mckusick, 2010-02-10, 2 files, -58/+64]
| | | | | inode numbers as negative rather than unsigned. For a default (16K block) file system, this bug began to show up at a file system size above about 16Tb.
| | | | | To fully handle this problem, newfs must be updated to ensure that it will never create a filesystem with more than 2^32 inodes. That patch will be forthcoming soon.
| | | | | Reported by: Scott Burns, John Kilburg, Bruce Evans
| | | | | Followup by: Jeff Roberson
| | | | | PR: 133980
| | | | | MFC after: 2 weeks
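The signedness problem described in the entry above can be illustrated with a small user-space sketch. The helper names (`ino_valid_signed`, `ino_valid_unsigned`, `ino_selfcheck`) and the range check itself are hypothetical and are not taken from the FFS sources; they only show how an inode number at or above 2^31 misbehaves once it passes through a signed 32-bit variable.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical range checks, not taken from the FFS sources.
 * ninodes stands for the number of inodes in the file system.
 */

/* Buggy variant: the inode number passes through a signed 32-bit int. */
int
ino_valid_signed(uint32_t ino, uint32_t ninodes)
{
	/*
	 * The out-of-range conversion is implementation-defined; on the
	 * usual two's-complement ABIs values >= 2^31 become negative.
	 */
	int32_t sino = (int32_t)ino;

	return (sino >= 0 && (int64_t)sino < (int64_t)ninodes);
}

/* Fixed variant: the comparison stays unsigned throughout. */
int
ino_valid_unsigned(uint32_t ino, uint32_t ninodes)
{
	return (ino < ninodes);
}

/* Contrast the two variants on an inode number above 2^31. */
void
ino_selfcheck(void)
{
	uint32_t big = UINT32_C(0x80000010);
	uint32_t ninodes = UINT32_C(0xF0000000);

	assert(ino_valid_unsigned(big, ninodes));
	assert(!ino_valid_signed(big, ninodes));
	assert(ino_valid_signed(42, ninodes));
}
```

Small inode numbers pass both checks, which is consistent with the entry's remark that the bug only showed up on very large file systems.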
* | | | | Remove unused variable. [trasz, 2010-02-10, 1 file, -3/+2]
| | | | |
* | | | | Return proper error code. [trasz, 2010-01-25, 1 file, -1/+1]
| | | | | Found with: clang
* | | | | Move out code that does POSIX.1e ACL inheritance into separate routines. [trasz, 2010-01-24, 1 file, -186/+171]
| | | | | Reviewed by: rwatson
* | | | | Cast 64-bit quantity to intptr_t rather than int so as to work properly [mckusick, 2010-01-11, 1 file, -2/+2]
| | | | | with 64-bit architectures (such as amd64).
| | | | | Reported by: bz
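Why the choice of cast matters in the entry above can be shown with a minimal sketch. The functions `via_int`, `via_intptr`, and `cast_selfcheck` are hypothetical illustrations and do not reproduce the actual change; the sketch assumes a typical 64-bit platform where `int` is 32 bits wide and `intptr_t` is 64 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers, not taken from the commit itself.
 * Each one pushes a 64-bit value through a narrower or equal-width cast.
 */

/* Lossy where int is 32 bits: only the low 32 bits survive. */
int64_t
via_int(int64_t v)
{
	/*
	 * The out-of-range conversion is implementation-defined;
	 * GCC and Clang reduce the value modulo 2^32.
	 */
	return ((int64_t)(int)v);
}

/* Lossless where intptr_t is 64 bits wide, as on amd64. */
int64_t
via_intptr(int64_t v)
{
	return ((int64_t)(intptr_t)v);
}

/* Values that fit in 32 bits survive both casts. */
void
cast_selfcheck(void)
{
	assert(via_int(5) == 5);
	assert(via_intptr(5) == 5);
}
```

On a 32-bit platform both casts would truncate, which is why the entry speaks specifically of 64-bit architectures.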
* | | | | Background: [mckusick, 2010-01-11, 3 files, -9/+133]
| | | | | When a directory is renamed, it passes through several intermediate states. First its new name will be created, causing it to have two names (from possibly different parents). Next, if it has different parents, its value of ".." will be changed from pointing to the old parent to pointing to the new parent. Concurrently, its old name will be removed, bringing it back into a consistent state. When fsck encounters an extra name for a directory, it offers to remove the "extraneous hard link"; when it finds that the names have been changed but the update to ".." has not happened, it offers to rewrite ".." to point at the correct parent. Both of these changes were considered unexpected, so they would cause fsck in preen mode or fsck in background mode to fail with the need to run fsck manually to fix these problems. Fsck running in preen mode or background mode now corrects these expected inconsistencies that arise during directory rename. The functionality added with this update is used by fsck running in background mode to make these fixes.
| | | | | Solution:
| | | | | This update adds three new fsck sysctl commands to support background fsck in correcting expected inconsistencies that arise from incomplete directory rename operations. They are:
| | | | |   setcwd(dirinode) - set the current directory to dirinode in the filesystem associated with the snapshot.
| | | | |   setdotdot(oldvalue, newvalue) - Verify that the inode number for ".." in the current directory is oldvalue then change it to newvalue.
| | | | |   unlink(nameptr, oldvalue) - Verify that the inode number associated with nameptr in the current directory is oldvalue then unlink it.
| | | | | As with all other fsck sysctls, these new ones may only be used by processes with appropriate privilege.
| | | | | Reported by: jeff
| | | | | Security issues: rwatson
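The verify-then-change behaviour of setdotdot(oldvalue, newvalue) described in the entry above can be sketched in user space. Everything here is a hypothetical stand-in: `struct toy_dir`, `toy_setdotdot`, and the choice of `EINVAL` as the failure code are assumptions made for illustration and say nothing about the actual kernel interface.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical in-memory stand-in for a directory's ".." entry. */
struct toy_dir {
	uint32_t dotdot_ino;	/* inode number that ".." points at */
};

/*
 * Change ".." only if it still holds oldvalue, so that a stale
 * request cannot overwrite a directory that has since changed.
 * Returns 0 on success, or EINVAL (an arbitrary choice for this
 * sketch) when the current value does not match oldvalue.
 */
int
toy_setdotdot(struct toy_dir *dp, uint32_t oldvalue, uint32_t newvalue)
{
	if (dp->dotdot_ino != oldvalue)
		return (EINVAL);
	dp->dotdot_ino = newvalue;
	return (0);
}

/* A matching request succeeds; a stale one leaves ".." untouched. */
void
toy_selfcheck(void)
{
	struct toy_dir d = { 10 };

	assert(toy_setdotdot(&d, 10, 20) == 0);
	assert(toy_setdotdot(&d, 10, 30) == EINVAL);
	assert(d.dotdot_ino == 20);
}
```

The unlink(nameptr, oldvalue) command follows the same pattern: the expected inode number is checked before the destructive step is taken.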
* | | | | Remove extraneous semicolons, no functional changes. [mbr, 2010-01-07, 1 file, -1/+1]
| | | | | Submitted by: Marc Balmer <marc@msys.ch>
| | | | | MFC after: 1 week
* | | | | KASSERT that condition raised by Coverity cannot happen. [mckusick, 2010-01-07, 1 file, -0/+1]
| | | | | Found by: Coverity Prevent (tm)
| | | | | KASSERT by: sam
* | | | | Implement NFSv4 ACL support for UFS. [trasz, 2009-12-21, 6 files, -72/+541]
| | | | | Reviewed by: rwatson
* | | | | VI_OBJDIRTY vnode flag mirrors the state of OBJ_MIGHTBEDIRTY vm object [kib, 2009-12-21, 1 file, -7/+8]
| |_|_|/
|/| | |
| | | | flag. Besides providing redundant information, the need to update both vnode and object flags causes more acquisitions of the vnode interlock. OBJ_MIGHTBEDIRTY is only checked for vnode-backed vm objects.
| | | | Remove VI_OBJDIRTY and make sure that OBJ_MIGHTBEDIRTY is set only for vnode-backed vm objects.
| | | | Suggested and reviewed by: alc
| | | | Tested by: pho
| | | | MFC after: 3 weeks
* | | | Don't build ufs_gjournal.c at all if UFS_GJOURNAL option is not given [rdivacky, 2009-09-22, 1 file, -4/+0]
| |_|/
|/| |
| | | instead of building an almost empty C file.
| | | Approved by: pjd
| | | Approved by: ed (mentor, implicit)
* | | Allocate space for the group array in a static credential used in [brooks, 2009-09-17, 1 file, -0/+2]
| | | the quota code. One case was correctly handled in r194498, but this one was missed.
| | | PR: kern/138657
| | | Tested by: PR submitter
| | | MFC after: 3 days
* | | Remove useless variable assignment. [trasz, 2009-09-08, 1 file, -3/+0]
| | |
* | | insmntque_stddtr() clears vp->v_data and resets vp->v_op to [kib, 2009-09-07, 1 file, -0/+1]
| | | dead_vnodeops before calling vgone(). Revert r189706 and corresponding part of the r186560.
| | | Noted and reviewed by: tegge
| | | Approved by: des (pseudofs part)
| | | MFC after: 3 days
* | | The clear_remove() and clear_inodedeps() call vn_start_write(NULL, &mp, [kib, 2009-09-06, 1 file, -5/+21]
| | | V_NOWAIT) on the non-busied mount point. Unmount might free ufs-specific mp data, causing ffs_vgetf() to access freed memory. Busy mountpoint before dropping softdep lk.
| | | Noted and reviewed by: tegge
| | | Tested by: pho
| | | MFC after: 1 week
* | | When a UFS node is truncated to the zero length, e.g. by explicit [kib, 2009-08-14, 1 file, -1/+8]
| | | truncate(2) call, or by being removed or truncated on open, either new softupdate freeblks structure is allocated to track the freed blocks of the node, or truncation is done synchronously when too many SU dependencies are accumulated. The decision does not take into account the allocated freeblks dependencies, allowing workloads that do huge amount of truncations to exhaust the kernel memory.
| | | Take the number of allocated freeblks into consideration for softdep_slowdown().
| | | Reported by: pluknet gmail com
| | | Diagnosed and tested by: pho
| | | Approved by: re (rwatson)
| | | MFC after: 1 month
* | | Fix fpathconf(3) on fifos, in effect making ls(1) properly [trasz, 2009-07-02, 1 file, -0/+25]
| | | display '+' on them. Taken from kern/125613, with cosmetic changes.
| | | PR: kern/125613
| | | Submitted by: Jaakko Heinonen <jh at saunalahti dot fi>
| | | Approved by: re (kib)