path: root/sys/ufs
Commit message [Author, Age, Files, Lines]
* - Call softdep_prealloc() before any of the balloc routines in the [jeff, 2010-05-07, 2, -1/+10]
|   snapshot code.
| - Don't fsync() vnodes in prealloc if copy on write is in progress. It is not safe to recurse back into the write path here.
|
| Reported by: Vladimir Grebenschikov <vova@fbsd.ru>
* - Use the correct flag mask when determining whether an inode has [jeff, 2010-05-07, 1, -1/+1]
|   successfully made it to the free list yet or not. This fixes a deadlock that can occur with unlinked but referenced files. Journal space and inodedeps were not correctly reclaimed because the inode block was not left dirty.
|
| Tested/Reported by: lwindschuh@googlemail.com
* Merger of the quota64 project into head. [mckusick, 2010-05-07, 4, -38/+417]
|\
| | This joint work of Dag-Erling Smørgrav and myself updates the FFS quota system to support both traditional 32-bit and new 64-bit quotas (for those of you who want to put 2+TB quotas on your users). By default quotas are not compiled into the kernel. To include them in your kernel configuration you need to specify:
| |
| |     options QUOTA    # Enable FFS quotas
| |
| | If you are already running with the current 32-bit quotas, they should continue to work just as they have in the past. If you wish to convert to using 64-bit quotas, use `quotacheck -c 64'; if you wish to revert from 64-bit quotas back to 32-bit quotas, use `quotacheck -c 32'.
| |
| | There is a new library of functions to simplify the use of the quota system; do `man quotafile' for details. If your application is currently using quotactl(2), it is highly recommended that you convert your application to use the quotafile interface. Note that existing binaries will continue to work.
| |
| | Special thanks to John Kozubik of rsync.net for getting me interested in pursuing 64-bit quota support and for funding part of my development time on this project.
| * Final update to current version of head in preparation for reintegration. [mckusick, 2010-05-06, 3, -20/+181]
| |\
| * \ Update to current version of head. [mckusick, 2010-04-28, 18, -1803/+7576]
| |\ \
| * | | Debugging nits found while testing the new 64-bit quota code. [mckusick, 2010-03-16, 3, -3/+42]
| | | |
| * | | IFH@204581 [des, 2010-03-04, 10, -332/+919]
| |\ \ \
| * \ \ \ Sync with head [des, 2009-09-25, 1, -4/+0]
| |\ \ \ \
| * | | | | Further improve comments. [des, 2009-09-25, 1, -12/+6]
| | | | | |
| * | | | | Improve comments, and remove a bogus 0 id check. [des, 2009-09-25, 1, -16/+35]
| | | | | |
| * | | | | Merge from head [des, 2009-09-17, 16, -369/+660]
| |\ \ \ \ \
| * \ \ \ \ \ Merge from head up to r188941 (last revision before the USB stack switch) [des, 2009-09-17, 12, -105/+174]
| |\ \ \ \ \ \
| * | | | | | | WIP [des, 2009-01-30, 4, -37/+364]
| | | | | | | |
* | | | | | | | Eliminate page queues locking around most calls to vm_page_free(). [alc, 2010-05-06, 1, -2/+0]
| |_|_|_|_|_|/
|/| | | | | |
* | | | | | | Acquire the page lock around all remaining calls to vm_page_free() on [alc, 2010-05-05, 1, -2/+4]
| | | | | | | managed pages that didn't already have that lock held. (Freeing an unmanaged page, such as the various pmaps use, doesn't require the page lock.)
| | | | | | |
| | | | | | | This allows a change in vm_page_remove()'s locking requirements. It now expects the page lock to be held instead of the page queues lock. Consequently, the page queues lock is no longer required at all by callers to vm_page_rename().
| | | | | | |
| | | | | | | Discussed with: kib
* | | | | | | Move checking against RLIMIT_FSIZE into one place, vn_rlimit_fsize(). [trasz, 2010-05-05, 1, -15/+2]
| | | | | | |
| | | | | | | Reviewed by: kib
* | | | | | | ffs_vfsops: restore alphabetic order of options in ffs_opts [avg, 2010-04-29, 1, -2/+2]
| | | | | | |
| | | | | | | Only nfsv4acls was out of order. (The "no" prefix is ignored when ordering.)
| | | | | | |
| | | | | | | MFC after: 1 week
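The ordering rule this commit relies on (alphabetic, with a leading "no" ignored) can be sketched in C. This is an illustration only: opt_key() and opts_sorted() are hypothetical helper names, not functions from ffs_vfsops.c, and any option list passed in is the caller's own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: an option's sort key is its name without a leading "no". */
static const char *
opt_key(const char *name)
{
    return (strncmp(name, "no", 2) == 0 ? name + 2 : name);
}

/* Return 1 if opts[0..n-1] is in alphabetic order once "no" prefixes are ignored. */
int
opts_sorted(const char *const *opts, size_t n)
{
    assert(opts != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        if (strcmp(opt_key(opts[i - 1]), opt_key(opts[i])) > 0)
            return (0);
    }
    return (1);
}
```

Under this rule "noatime" sorts as "atime", so it belongs before "force", while "nfsv4acls" keeps its full name because it does not start with "no".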
* | | | | | | - When canceling jaddrefs they may not yet be in the journal if this is via [jeff, 2010-04-28, 1, -1/+2]
| | | | | | |   a revert call. In this case don't attempt to remove something that has not yet been added. Otherwise this jaddref must hang around to prevent the bitmap write as normal.
* | | | | | | - Fix builds without SOFTUPDATES defined in the kernel config. [jeff, 2010-04-28, 1, -0/+171]
| |_|_|_|_|/
|/| | | | |
* | | | | | Fix build for UFS without SOFTUPDATES. [pjd, 2010-04-24, 1, -1/+2]
| | | | | |
* | | | | | - Merge soft-updates journaling from projects/suj/head into head. This [jeff, 2010-04-24, 18, -1801/+7566]
| | | | | |   brings in support for an optional intent log which eliminates the need for background fsck on unclean shutdown.
| | | | | |
| | | | | | Sponsored by: iXsystems, Yahoo!, and Juniper.
| | | | | | With help from: McKusick and Peter Holm
* | | | | | The cache_enter(9) function shall not be called for doomed dvp. [kib, 2010-04-20, 1, -0/+8]
| | | | | | Assert this.
| | | | | |
| | | | | | In the reported panic, vdestroy() fired the assertion "vp has namecache for ..", because pseudofs may end up doing cache_enter() with a reclaimed dvp, after a dotdot lookup temporarily unlocked dvp.
| | | | | |
| | | | | | A similar problem exists in ufs_lookup() for "." lookup, when the vnode lock needs to be upgraded. Verify that dvp is not reclaimed before calling cache_enter().
| | | | | |
| | | | | | Reported and tested by: pho
| | | | | | Reviewed by: kan
| | | | | | MFC after: 2 weeks
* | | | | | ffs_mount: remove redundant assignment of geom consumer to devvp.v_bufobj [avg, 2010-04-03, 1, -1/+0]
| |_|_|_|/
|/| | | |
| | | | | The assignment is already done in g_vfs_open. Redundant assignment is harmless, but can become a problem if g_vfs_open logic is changed.
| | | | |
| | | | | MFC after: 1 week
* | | | | When ffs_realloccg() failed to allocate bigger fragment and, because [kib, 2010-02-13, 1, -1/+3]
| | | | | pending blocks are scheduled for removal, goes to retry the (re)allocation, clear the bp pointer. It might happen that in the meantime free space is really exhausted and we enter the nospace: label without bread()ing a buffer, causing the stale bp value to be brelse()d again.
| | | | |
| | | | | Tested by: pho (Producing a scenario to reliably reproduce the race appeared to be much harder than fixing the bug)
| | | | | MFC after: 1 week
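The stale-pointer hazard described in this commit can be modelled in a few lines of C. This is a simplified sketch, not the ffs_realloccg() code: struct buf_model, buf_release() and retry_path() are hypothetical stand-ins for the kernel's struct buf, brelse() and the retry logic, and a counter is used so a double release can be observed instead of crashing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel buffer; counts releases of an unheld buffer. */
struct buf_model {
    int held;
    int double_releases;
};

/* Model of brelse(): releasing a buffer that is not held is the bug. */
static void
buf_release(struct buf_model *bp)
{
    if (!bp->held)
        bp->double_releases++;
    bp->held = 0;
}

/*
 * First pass: a buffer is read, the allocation fails, the buffer is released
 * and the code goes back to retry. Second pass: space is now exhausted, so
 * the error path is reached without reading a buffer. Returns the number of
 * double releases observed on b.
 */
int
retry_path(struct buf_model *b, int clear_bp_before_retry)
{
    struct buf_model *bp;

    assert(b != NULL);
    b->held = 1;            /* models bread() on the first pass */
    bp = b;
    buf_release(bp);        /* allocation failed, drop the buffer */
    if (clear_bp_before_retry)
        bp = NULL;          /* the fix: forget the released buffer */

    /* Retry lands on the error path with no new buffer read. */
    if (bp != NULL)
        buf_release(bp);    /* stale pointer released a second time */
    return (b->double_releases);
}
```

With clear_bp_before_retry set, the error path sees NULL and skips the release; without it, the same buffer is released twice.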
* | | | | One last pass to get all the unsigned comparisons correct. [mckusick, 2010-02-11, 1, -1/+1]
| | | | |
* | | | | This fix corrects a problem in the file system that treats large [mckusick, 2010-02-10, 2, -58/+64]
| | | | | inode numbers as negative rather than unsigned. For a default (16K block) file system, this bug began to show up at a file system size above about 16TB.
| | | | |
| | | | | To fully handle this problem, newfs must be updated to ensure that it will never create a filesystem with more than 2^32 inodes. That patch will be forthcoming soon.
| | | | |
| | | | | Reported by: Scott Burns, John Kilburg, Bruce Evans
| | | | | Followup by: Jeff Roberson
| | | | | PR: 133980
| | | | | MFC after: 2 weeks
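The class of bug named here, an inode number treated as negative rather than unsigned, can be illustrated with a small C sketch. The two range checks below are hypothetical examples written for this note, not code from the commit, and the conversion of an out-of-range value to int32_t is implementation-defined (it wraps on the usual compilers).

```c
#include <assert.h>
#include <stdint.h>

/* Buggy pattern: the inode number is carried in a signed 32-bit type. */
int
ino_in_range_signed(int32_t ino, uint32_t ninodes)
{
    assert(ninodes > 0);
    /* Inode numbers at or above 2^31 show up as negative and are rejected. */
    return (ino >= 0 && (uint32_t)ino < ninodes);
}

/* Corrected pattern: the comparison is done entirely in unsigned arithmetic. */
int
ino_in_range_unsigned(uint32_t ino, uint32_t ninodes)
{
    assert(ninodes > 0);
    return (ino < ninodes);
}
```

An inode number such as 0x80000001 is valid on a sufficiently large file system, yet the signed check rejects it while the unsigned check accepts it.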
* | | | | Remove unused variable. [trasz, 2010-02-10, 1, -3/+2]
| | | | |
* | | | | Return proper error code. [trasz, 2010-01-25, 1, -1/+1]
| | | | |
| | | | | Found with: clang
* | | | | Move out code that does POSIX.1e ACL inheritance into separate routines. [trasz, 2010-01-24, 1, -186/+171]
| | | | |
| | | | | Reviewed by: rwatson
* | | | | Cast 64-bit quantity to intptr_t rather than int so as to work properly [mckusick, 2010-01-11, 1, -2/+2]
| | | | | with 64-bit architectures (such as amd64).
| | | | |
| | | | | Reported by: bz
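Why the cast target matters can be shown with a short C sketch. The helper names via_int() and via_intptr() are hypothetical, and the example assumes the common ABI where int is 32 bits wide and intptr_t is 64 bits wide on 64-bit targets; narrowing an out-of-range value to int is implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Passing a 64-bit value through int drops the high bits when int is 32 bits wide. */
int64_t
via_int(int64_t v)
{
    return ((int64_t)(int)v);
}

/* intptr_t is pointer-sized, so on LP64 targets such as amd64 the value survives. */
int64_t
via_intptr(int64_t v)
{
    return ((int64_t)(intptr_t)v);
}
```

For a value like 2^40 + 7, via_intptr() returns it unchanged on a 64-bit target, while via_int() typically returns 7.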
* | | | | Background: [mckusick, 2010-01-11, 3, -9/+133]
| | | | |
| | | | | When a directory is renamed, it passes through several intermediate states. First its new name will be created, causing it to have two names (from possibly different parents). Next, if it has different parents, its value of ".." will be changed from pointing to the old parent to pointing to the new parent. Concurrently, its old name will be removed, bringing it back into a consistent state. When fsck encounters an extra name for a directory, it offers to remove the "extraneous hard link"; when it finds that the names have been changed but the update to ".." has not happened, it offers to rewrite ".." to point at the correct parent. Both of these changes were considered unexpected so would cause fsck in preen mode or fsck in background mode to fail with the need to run fsck manually to fix these problems. Fsck running in preen mode or background mode now corrects these expected inconsistencies that arise during directory rename. The functionality added with this update is used by fsck running in background mode to make these fixes.
| | | | |
| | | | | Solution:
| | | | |
| | | | | This update adds three new fsck sysctl commands to support background fsck in correcting expected inconsistencies that arise from incomplete directory rename operations. They are:
| | | | |
| | | | | setcwd(dirinode) - set the current directory to dirinode in the filesystem associated with the snapshot.
| | | | |
| | | | | setdotdot(oldvalue, newvalue) - Verify that the inode number for ".." in the current directory is oldvalue then change it to newvalue.
| | | | |
| | | | | unlink(nameptr, oldvalue) - Verify that the inode number associated with nameptr in the current directory is oldvalue then unlink it.
| | | | |
| | | | | As with all other fsck sysctls, these new ones may only be used by processes with appropriate privilege.
| | | | |
| | | | | Reported by: jeff
| | | | | Security issues: rwatson
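The "verify the old value, then change it" behaviour described for setdotdot() can be sketched as a small C model. This is not the kernel implementation: struct dir_model and setdotdot_model() are hypothetical names, and the return convention (0 on success, -1 on mismatch) is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory model of a directory: only its ".." inode number. */
struct dir_model {
    uint32_t dotdot;
};

/*
 * Change ".." to newvalue only if it currently equals oldvalue.
 * Returns 0 on success and -1 if the directory no longer matches what the
 * caller saw, in which case nothing is modified.
 */
int
setdotdot_model(struct dir_model *d, uint32_t oldvalue, uint32_t newvalue)
{
    assert(d != NULL);
    if (d->dotdot != oldvalue)
        return (-1);
    d->dotdot = newvalue;
    return (0);
}
```

A plausible reason for the check, offered here as an interpretation rather than something the commit states, is that background fsck examines a snapshot while the live file system may keep changing, so a request based on stale information is refused instead of applied.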
* | | | | Remove extraneous semicolons, no functional changes. [mbr, 2010-01-07, 1, -1/+1]
| | | | |
| | | | | Submitted by: Marc Balmer <marc@msys.ch>
| | | | | MFC after: 1 week
* | | | | KASSERT that condition raised by Coverity cannot happen. [mckusick, 2010-01-07, 1, -0/+1]
| | | | |
| | | | | Found by: Coverity Prevent (tm)
| | | | | KASSERT by: sam
* | | | | Implement NFSv4 ACL support for UFS. [trasz, 2009-12-21, 6, -72/+541]
| | | | |
| | | | | Reviewed by: rwatson
* | | | | VI_OBJDIRTY vnode flag mirrors the state of OBJ_MIGHTBEDIRTY vm object [kib, 2009-12-21, 1, -7/+8]
| |_|_|/
|/| | |
| | | | flag. Besides providing redundant information, the need to update both vnode and object flags causes more acquisitions of the vnode interlock. OBJ_MIGHTBEDIRTY is only checked for vnode-backed vm objects.
| | | |
| | | | Remove VI_OBJDIRTY and make sure that OBJ_MIGHTBEDIRTY is set only for vnode-backed vm objects.
| | | |
| | | | Suggested and reviewed by: alc
| | | | Tested by: pho
| | | | MFC after: 3 weeks
* | | | Don't build ufs_gjournal.c at all if UFS_GJOURNAL option is not given [rdivacky, 2009-09-22, 1, -4/+0]
| |_|/
|/| |
| | | instead of building an almost empty C file.
| | |
| | | Approved by: pjd
| | | Approved by: ed (mentor, implicit)
* | | Allocate space for the group array in a static credential used in [brooks, 2009-09-17, 1, -0/+2]
| | | the quota code. One case was correctly handled in r194498, but this one was missed.
| | |
| | | PR: kern/138657
| | | Tested by: PR submitter
| | | MFC after: 3 days
* | | Remove useless variable assignment. [trasz, 2009-09-08, 1, -3/+0]
| | |
* | | insmntque_stddtr() clears vp->v_data and resets vp->v_op to [kib, 2009-09-07, 1, -0/+1]
| | | dead_vnodeops before calling vgone(). Revert r189706 and corresponding part of the r186560.
| | |
| | | Noted and reviewed by: tegge
| | | Approved by: des (pseudofs part)
| | | MFC after: 3 days
* | | The clear_remove() and clear_inodedeps() call vn_start_write(NULL, &mp, [kib, 2009-09-06, 1, -5/+21]
| | | V_NOWAIT) on the non-busied mount point. Unmount might free ufs-specific mp data, causing ffs_vgetf() to access freed memory. Busy mountpoint before dropping softdep lk.
| | |
| | | Noted and reviewed by: tegge
| | | Tested by: pho
| | | MFC after: 1 week
* | | When a UFS node is truncated to the zero length, e.g. by explicit [kib, 2009-08-14, 1, -1/+8]
| | | truncate(2) call, or by being removed or truncated on open, either a new softupdate freeblks structure is allocated to track the freed blocks of the node, or truncation is done synchronously when too many SU dependencies are accumulated. The decision does not take into account the allocated freeblks dependencies, allowing workloads that do a huge number of truncations to exhaust the kernel memory.
| | |
| | | Take the number of allocated freeblks into consideration for softdep_slowdown().
| | |
| | | Reported by: pluknet gmail com
| | | Diagnosed and tested by: pho
| | | Approved by: re (rwatson)
| | | MFC after: 1 month
* | | Fix fpathconf(3) on fifos, in effect making ls(1) properly [trasz, 2009-07-02, 1, -0/+25]
| | | display '+' on them. Taken from kern/125613, with cosmetic changes.
| | |
| | | PR: kern/125613
| | | Submitted by: Jaakko Heinonen <jh at saunalahti dot fi>
| | | Approved by: re (kib)
* | | In vn_vget_ino() and their inline equivalents, mnt_ref() the mount point [kib, 2009-07-02, 1, -0/+2]
| | | around the sequence that drops the vnode lock and then busies the mount point. Not having a locked vnode or a direct reference to the mp allows a forced unmount to proceed, leaving mp unmounted or reused.
| | |
| | | Tested by: pho
| | | Reviewed by: jeff
| | | Approved by: re (kensmith)
| | | MFC after: 2 weeks
* | | Don't panic on attempt to set ACL on a block device file. [trasz, 2009-07-01, 1, -6/+6]
| | | This is just a part of kern/125613.
| | |
| | | PR: kern/125613
| | | Submitted by: Jaakko Heinonen <jh at saunalahti dot fi>
| | | Reviewed by: rwatson
| | | Approved by: re (kib)
* | | For SU mounts, softdep_fsync() might drop vnode lock, allowing other [kib, 2009-06-30, 1, -4/+25]
| | | threads to put dirty buffers on the vnode bufobj list. For regular files and synchronous fsync requests, check for the condition and restart the fsync vop if a new dirty buffer arrived.
| | |
| | | Tested by: pho
| | | Approved by: re (kensmith)
| | | MFC after: 1 month
* | | Softdep_fsync() may need to lock parent directory of the synced vnode. [kib, 2009-06-30, 1, -0/+18]
| | | Use inlined (due to FFSV_FORCEINSMQ) version of vn_vget_ino() to prevent mountpoint from being unmounted and freed while no vnodes are locked.
| | |
| | | Tested by: pho
| | | Approved by: re (kensmith)
| | | MFC after: 1 month
* | | Fix a bug reported by pho@ where one can induce a panic by decreasing [snb, 2009-06-25, 1, -1/+4]
| | | vfs.ufs.dirhash_maxmem below the current amount of memory used by dirhash. When ufsdirhash_build() is called with the memory in use greater than dirhash_maxmem, it attempts to free up memory by calling ufsdirhash_recycle(). If successful in freeing enough memory, ufsdirhash_recycle() leaves the dirhash list locked. But at this point in ufsdirhash_build(), the list is not explicitly unlocked after the call(s) to ufsdirhash_recycle(). When we next attempt to lock the dirhash list, we will get a "panic: _mtx_lock_sleep: recursed on non-recursive mutex dirhash list".
| | |
| | | Tested by: pho
| | | Approved by: dwmalone (mentor)
| | | MFC after: 3 weeks
* | | Rework the credential code to support larger values of NGROUPS and [brooks, 2009-06-19, 1, -0/+2]
| | | NGROUPS_MAX, eliminate ABI dependencies on them, and raise them to 1024 and 1023 respectively. (Previously they were equal, but under a close reading of POSIX, NGROUPS_MAX was defined to be too large by 1 since it is the number of supplemental groups, not total number of groups.)
| | |
| | | The bulk of the change consists of converting the struct ucred member cr_groups from a static array to a pointer. Do the equivalent in kinfo_proc.
| | |
| | | Introduce new interfaces crcopysafe() and crsetgroups() for duplicating a process credential before modifying it and for setting group lists respectively. Both interfaces take care of the details of allocating the groups array. crsetgroups() takes care of truncating the group list to the current maximum (NGROUPS) if necessary. In the future, crsetgroups() may be responsible for ensuring invariants such as sorting the supplemental groups to allow groupmember() to be implemented as a binary search.
| | |
| | | Because we can not change struct xucred without breaking application ABIs, we leave it alone and introduce a new XU_NGROUPS value which is always 16 and is to be used, or NGRPS as appropriate, for things such as NFS which need to use no more than 16 groups. When feasible, truncate the group list rather than generating an error.
| | |
| | | Minor changes:
| | | - Reduce the number of hand rolled versions of groupmember().
| | | - Do not assign to both cr_gid and cr_groups[0].
| | | - Modify ipfw to cache ucreds instead of part of their contents since they are immutable once referenced by more than one entity.
| | |
| | | Submitted by: Isilon Systems (initial implementation)
| | | X-MFC after: never
| | | PR: bin/113398 kern/133867
* | | Keep dirhash tailq locked throughout the entirety of ufsdirhash_destroy() to fix [snb, 2009-06-17, 1, -11/+11]
| | | a potential race pointed out by pjd. Also use TAILQ_FOREACH_SAFE to iterate over dirhashes in ufsdirhash_lowmem(), so that we can continue iterating even after a dirhash is destroyed.
| | |
| | | Suggested by: pjd
| | | Tested by: pho
| | | Approved by: dwmalone (mentor)
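The reason TAILQ_FOREACH_SAFE is needed when elements are destroyed during iteration can be shown with a userland C sketch. This is not the ufsdirhash code: struct dh_model, dh_add() and lowmem_model() are hypothetical names, and the fallback macro definition is included only because some C libraries ship <sys/queue.h> without the _SAFE variants.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

#ifndef TAILQ_FOREACH_SAFE
/* Fallback for C libraries whose <sys/queue.h> lacks the _SAFE iterators. */
#define TAILQ_FOREACH_SAFE(var, head, field, tvar)                  \
    for ((var) = TAILQ_FIRST((head));                               \
        (var) != NULL && ((tvar) = TAILQ_NEXT((var), field), 1);    \
        (var) = (tvar))
#endif

/* Hypothetical stand-in for a dirhash: just its memory footprint. */
struct dh_model {
    int mem;
    TAILQ_ENTRY(dh_model) link;
};
TAILQ_HEAD(dh_list, dh_model);

/* Append an entry; returns 0 on success, -1 if allocation fails. */
int
dh_add(struct dh_list *list, int mem)
{
    struct dh_model *dh = malloc(sizeof(*dh));

    if (dh == NULL)
        return (-1);
    dh->mem = mem;
    TAILQ_INSERT_TAIL(list, dh, link);
    return (0);
}

/* Destroy entries from the head until at least `want` units are reclaimed. */
int
lowmem_model(struct dh_list *list, int want)
{
    struct dh_model *dh, *next;
    int freed = 0;

    assert(list != NULL);
    /* `next` is saved before the body runs, so freeing `dh` is safe. */
    TAILQ_FOREACH_SAFE(dh, list, link, next) {
        if (freed >= want)
            break;
        freed += dh->mem;
        TAILQ_REMOVE(list, dh, link);
        free(dh);
    }
    return (freed);
}
```

With plain TAILQ_FOREACH the loop step would read the link field of an element that has already been freed; the _SAFE form avoids that by fetching the successor first.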
* | | Do not use casts (int *)0 and (struct thread *)0 for the arguments of [kib, 2009-06-16, 2, -2/+2]
| | | vn_rdwr, use NULL.
| | |
| | | Reviewed by: jhb
| | | MFC after: 1 week