path: root/sys/ufs/ffs
Commit message | Author | Age | Files | Lines
* Make sure the cdev doesn't go away while the filesystem is still mounted. | trasz | 2009-01-29 | 1 | -0/+3
  Otherwise dev2udev() could return garbage.
  Reviewed by: kib
  Approved by: rwatson (mentor)
  Sponsored by: FreeBSD Foundation
* Following a fair amount of real world experience with ACLs and | rwatson | 2009-01-27 | 5 | -39/+44
  extended attributes since FreeBSD 5, make the following semantic changes:
  - Don't update the inode modification time (mtime) when extended attributes (and hence also ACLs) are added, modified, or removed.
  - Don't update the inode access time (atime) when extended attributes (and hence also ACLs) are queried.
  This means that rsync (and related tools) won't improperly think that the data in the file has changed when only the ACL has changed.
  Note that ffs_reallocblks() has not been changed to not update on an IO_EXT transaction, but currently EAs don't use the cluster write routines so this shouldn't be a problem. If EAs grow support for clustering, then VOP_REALLOCBLKS() will need to grow a flag argument to carry down IO_EXT to UFS.
  MFC after: 1 week
  PR: ports/125739
  Reported by: Alexander Zagrebin <alexz@visp.ru>
  Tested by: pluknet <pluknet@gmail.com>, Greg Byshenk <freebsd@byshenk.net>
  Discussed with: kib, kientzle, timur, Alexander Bokovoy <ab@samba.org>
* The r187467 should remove all pages for V_NORMAL case too, because | kib | 2009-01-20 | 1 | -8/+17
  indirect block pages are not removed by the mentioned invocation of vnode_pager_setsize(). Put the common code into the helper function ffs_pages_remove().
  Reported and tested by: dchagin
  Reviewed by: ups
  MFC after: 3 weeks
* When extending inode size, we call vnode_pager_setsize(), to have a | kib | 2009-01-20 | 2 | -2/+6
  address space where to put vnode pages, and then call UFS_BALLOC(), to actually allocate a new block and map it. When UFS_BALLOC() returns an error, sometimes we forget to revert the vm object size increase, allowing for pages that are not backed by logical disk blocks.
  Revert vnode_pager_setsize() when UFS_BALLOC() fails, for ffs_truncate() and ffs_write().
  PR: 129956
  Reviewed by: ups
  MFC after: 3 weeks
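The revert described in the entry above can be illustrated with a small userland model. This is only a sketch under stated assumptions: `struct toy_file`, `toy_balloc()` and `toy_extend()` are hypothetical names that stand in for the vm object size, UFS_BALLOC() and the callers in ffs_write()/ffs_truncate(); none of this is the kernel code.

```c
#include <stddef.h>

/* Hypothetical stand-in for a file: advertised VM size vs. backed size. */
struct toy_file {
    size_t vm_size;      /* size the pager believes the object has */
    size_t backed_size;  /* bytes actually backed by allocated blocks */
    size_t space_free;   /* bytes still available on the toy "disk" */
};

/* Toy allocator: returns nonzero when there is not enough space. */
static int toy_balloc(struct toy_file *f, size_t newsize)
{
    size_t need;

    if (newsize <= f->backed_size)
        return 0;                    /* nothing to allocate */
    need = newsize - f->backed_size;
    if (need > f->space_free)
        return -1;                   /* allocation failed */
    f->space_free -= need;
    f->backed_size = newsize;
    return 0;
}

/* Grow the file; if allocation fails, restore the previous VM size. */
static int toy_extend(struct toy_file *f, size_t newsize)
{
    size_t oldsize = f->vm_size;
    int error;

    f->vm_size = newsize;            /* models vnode_pager_setsize(vp, newsize) */
    error = toy_balloc(f, newsize);  /* models UFS_BALLOC() */
    if (error != 0)
        f->vm_size = oldsize;        /* the revert: no unbacked pages remain */
    return error;
}
```

Without the revert branch, a failed extension would leave `vm_size` larger than `backed_size`, which is the inconsistency the commit message describes.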
* FFS puts the extended attributes blocks at the negative blocks for the | kib | 2009-01-20 | 1 | -0/+9
  vnode, from -1 down. When vinvalbuf(vp, V_ALT) is done for the vnode, it incorrectly does vm_object_page_remove(0, 0), removing all pages from the underlying vm object, not only the pages that back the extended attributes data.
  Change vinvalbuf() to not remove any pages from the object when V_NORMAL or V_ALT are specified. Instead, the only in-tree caller in ffs_inode.c:ffs_truncate() that specifies V_ALT explicitly removes the corresponding page range. The V_NORMAL caller does vnode_pager_setsize(vp, 0) immediately after the call to vinvalbuf(V_NORMAL) already.
  Reported by: csjp
  Reviewed by: ups
  MFC after: 3 weeks
* If unmount of the ffs mp failed, reinitialize the extended attributes | kib | 2009-01-08 | 1 | -0/+14
  for the mp, and restart them if autostart is enabled.
  Reported and tested by: pho
  Reviewed by: rwatson
  MFC after: 3 weeks
* For now, flush the buffer cache every 10 cylinder groups to free | ambrisko | 2008-11-13 | 1 | -0/+4
  up space. If the buffer cache fills up then the disk systems can grind to a halt. Better tuning can be figured out later.
  Tested by: Tim, others and work
  Reviewed by: Kostik Belousov
  PR: 128832
* Improve VFS locking: | attilio | 2008-11-02 | 1 | -1/+1
  - Implement real draining for vfs consumers by not relying on the mnt_lock and using instead a refcount in order to keep track of lock requesters.
  - Due to the change above, remove the mnt_lock lockmgr because it is now useless.
  - Due to the change above, vfs_busy() is no longer linked to a lockmgr. Change its KPI accordingly by removing the interlock argument and defining 2 new flags for it: MBF_NOWAIT, which basically replaces the LK_NOWAIT of the old version (which was unlinked from the lockmgr already), and MBF_MNTLSTLOCK, which provides the ability to drop the mountlist_mtx once the mnt interlock is held (ability still desired by most consumers).
  - The stub used in vfs_mount_destroy(), which allows overriding the mnt_ref if running for more than 3 seconds, makes it totally useless. Remove it, as it was meant to work in older versions. If a problem of "refcount held never going away" should appear, we will need to fix it properly rather than rely on such a hackish solution.
  - Fix a bug where returning (with an error) from dounmount() was still leaving the MNTK_MWAIT flag on even if the waiters were actually woken up. Only one place in vfs_mount_destroy() is left because it is going to recycle the structure in any case, so it doesn't matter.
  - Remove the markercnt refcount as it is useless.
  This patch modifies the VFS ABI and breaks the KPI for vfs_busy(), so manpages and __FreeBSD_version will be modified accordingly.
  Discussed with: kib
  Tested by: pho
* Introduce accmode_t. This is required for NFSv4 ACLs - it will be necessary | trasz | 2008-10-28 | 1 | -4/+4
  to add more V* constants, and the variables changed by this patch were often being assigned to mode_t variables, which are 16 bits wide.
  Approved by: rwatson (mentor)
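The width problem mentioned in the entry above can be seen in a small standalone sketch. The type names and flag values below (`toy_mode_t`, `toy_accmode_t`, `TOY_VREAD`, `TOY_VNEWBIT`) are hypothetical and chosen only for illustration; the one fact taken from the log is that the narrow type is 16 bits wide.

```c
#include <stdint.h>

typedef uint16_t toy_mode_t;     /* 16 bits wide, as the log says of mode_t */
typedef int      toy_accmode_t;  /* hypothetical wider access-mode type */

#define TOY_VREAD   0x0100       /* hypothetical flag that fits in 16 bits */
#define TOY_VNEWBIT 0x10000      /* hypothetical new flag above bit 15 */

/* Returns nonzero if `bit` is still set after storing flags in the 16-bit type. */
static int kept_in_mode_t(int flags, int bit)
{
    toy_mode_t m = (toy_mode_t)flags;  /* high bits are discarded here */
    return (m & bit) != 0;
}

/* Returns nonzero if `bit` is still set after storing flags in the wider type. */
static int kept_in_accmode_t(int flags, int bit)
{
    toy_accmode_t a = flags;
    return (a & bit) != 0;
}
```

Any flag defined above bit 15 is silently lost in the 16-bit variable, which is why a separate, wider type is needed before more constants can be added.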
* Fix a number of style issues in the MALLOC / FREE commit. I've tried to | des | 2008-10-23 | 1 | -2/+4
  be careful not to fix anything that was already broken; the NFSv4 code is particularly bad in this respect.
* Retire the MALLOC and FREE macros. They are an abomination unto style(9). | des | 2008-10-23 | 4 | -49/+47
  MFC after: 3 months
* Assert that v_holdcnt is non-zero before entering lockmgr in vn_lock | kib | 2008-10-20 | 1 | -0/+4
  and ffs_lock. This cannot catch situations where holdcnt is incremented not by curthread, but I think it is useful.
  Reviewed by: tegge, attilio
  Tested by: pho
  MFC after: 2 weeks
* Sync up summary information for cylinder groups while data is already | kib | 2008-10-13 | 1 | -0/+7
  in memory during snapshot creation. This improves the results of the background fsck.
  Submitted by: tegge
  MFC after: 1 week
* Remove the useless struct thread argument from bufobj interface. | attilio | 2008-10-10 | 2 | -6/+6
  In particular, the KPI of the following functions is modified:
  - bufobj_invalbuf()
  - bufsync()
  and the BO_SYNC() "virtual method" of the buffer objects set.
  Main consumers of bufobj functions are affected by this change too and, in particular, the functions which changed their KPI are:
  - vinvalbuf()
  - g_vfs_close()
  Due to the KPI breakage, __FreeBSD_version will be bumped in a later commit.
  As a side note, please consider the 'curthread' argument passed to VOP_SYNC() (in bufsync()) as temporary, as it will be axed out ASAP.
  Reviewed by: kib
  Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
* Enable shared lookups on UFS. There are some remaining issues with forced | jhb | 2008-09-24 | 1 | -1/+1
  unmounts, but those are in the VFS lookup code and are not UFS specific.
  Tested by: pho, kris
* Suspend the write operations on the UFS filesystem being unmounted or | kib | 2008-09-16 | 1 | -14/+73
  remounted from rw to ro.
  Proposed and reviewed by: tegge
  In collaboration with: pho
  MFC after: 1 month
* When an attempt is made to suspend a filesystem that is already suspended, | kib | 2008-09-16 | 3 | -2/+7
  wait until the current suspension is lifted instead of silently returning success immediately. The consequences of calling vfs_write_resume() when not owning the suspension are not well-defined at best.
  Add the vfs_susp_clean() mount method to be called from vfs_write_resume(). Set it to process_deferred_inactive() for ffs, and stop calling it manually.
  Add the thread flag TDP_IGNSUSP that allows bypassing the suspension point in vn_start_write(). It is intended for use by VFS in situations where the suspender wants to do some i/o requiring calls to vn_start_write(), and this i/o cannot be done later.
  Reviewed by: tegge
  In collaboration with: pho
  MFC after: 1 month
* Add the ffs structures introspection functions for ddb. | kib | 2008-09-16 | 2 | -1/+65
  Show the b_dep value for the buffer in the show buffer command. Add a command to dump the dirty/clean buffer list for vnode.
  Reviewed by: tegge
  Tested and used by: pho
  MFC after: 1 month
* When downgrading the read-write mount to read-only, do_unmount() sets | kib | 2008-09-16 | 3 | -0/+11
  MNT_RDONLY flag before the VFS_MOUNT() is called. In ufs_inactive() and ufs_itimes_locked(), UFS verifies whether the fs is read-only by checking MNT_RDONLY, but this may cause loss of the IN_MODIFIED flag for inode on the fs being remounted rw->ro.
  Introduce the UFS_RDONLY() struct ufsmount method that reports the value of fs_ronly. The latter is set to 1 only after the remount is finished.
  Reviewed by: tegge
  In collaboration with: pho
  MFC after: 1 month
* The struct inode *ip supplied to softdep_freefile is not necessarily the | kib | 2008-09-16 | 1 | -1/+2
  inode having number ino. In r170991, the ip was marked IN_MODIFIED, which is not quite correct. Mark only the right inode modified by checking the inode number.
  Reviewed by: tegge
  In collaboration with: pho
  MFC after: 1 month
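The check described in the entry above amounts to a guard on the inode number. The sketch below is a toy model under assumptions: `struct toy_inode`, `TOY_IN_MODIFIED` and `toy_freefile()` are hypothetical names, and the real softdep_freefile() does considerably more work that is elided here.

```c
#include <stdint.h>

#define TOY_IN_MODIFIED 0x0002   /* hypothetical flag value */

/* Hypothetical, heavily reduced inode. */
struct toy_inode {
    uint32_t i_number;  /* inode number of this in-memory inode */
    int      i_flag;    /* state flags */
};

/*
 * Models the fixed behaviour: `ip` is only marked modified when it really
 * is the inode whose number `ino` is being freed.
 */
static void toy_freefile(struct toy_inode *ip, uint32_t ino)
{
    /* ... dependency bookkeeping elided ... */
    if (ip->i_number == ino)
        ip->i_flag |= TOY_IN_MODIFIED;
}
```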
* When calling extattr_check_cred, use V{READ,WRITE}, not I{READ,WRITE}. | trasz | 2008-09-03 | 1 | -4/+4
  Approved by: rwatson (mentor)
* Decontextualize vfs_busy(), vfs_unbusy() and vfs_mount_alloc() functions. | attilio | 2008-08-31 | 1 | -2/+2
  Manpages are updated accordingly.
  Tested by: Diego Sardina <siarodx at gmail dot com>
* Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread | attilio | 2008-08-28 | 1 | -1/+1
  was always curthread and totally useless.
  Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
* In ffs_valloc(), ffs_vget() may fail because insmntque() refused to | kib | 2008-08-28 | 1 | -1/+11
  insert the new vnode into the mount vnode list. Then, for the SU-enabled mount, ffs_vfree could create a freefile dependency. This dependency can hang around forever since the inode is not marked as IN_MODIFIED and correspondingly the inodeblock may not be marked as dirty.
  After ffs_vget() fails, retry with FFSV_FORCEINSMQ, mark the inode as modified, and vput() it immediately. Take care of the dup alloc.
  Tested by: pho
  Reviewed by: tegge
  MFC after: 1 month
* Softdep code may need to instantiate vnode when processing | kib | 2008-08-28 | 3 | -15/+59
  dependencies. In particular, it may need this while syncing filesystem being unmounted. Since during unmount MNTK_NOINSMNTQUE flag is set, that could sometimes disallow insertion of the vnode into the vnode mount list, softdep code needs to overwrite the MNTK_NOINSMNTQUE flag.
  Create the ffs_vgetf() function that sets the VV_FORCEINSMQ flag for new vnode and use it consistently from the softdep code instead of ffs_vget().
  Add the retry logic to the softdep_flushfiles() to flush the vnodes that could be instantiated while flushing softdep dependencies.
  Tested by: pho, kris
  Reviewed by: tegge
  MFC after: 1 month
* Revert r181345. | kib | 2008-08-10 | 1 | -2/+1
  Move the NULL pointer check to the vfs_deleteopt() function.
  Discussed with: rodrigc
  MFC after: 3 days
* User may do "mount -o snapshot ...", that causes new FFS mount to be | kib | 2008-08-06 | 1 | -1/+2
  performed with snapshot option, while the mp->mnt_opt is NULL. Protect against NULL pointer dereference.
  Noted by: Mateusz Guzik <mjguzik gmail com>
  MFC after: 3 days
* The ffs_balloc_ufs{1,2} functions call bdwrite() while having several | kib | 2008-07-23 | 1 | -2/+22
  vnode buffers locked at once. In particular, there are indirect buffers among the locked ones. The bdwrite() may start flushing to keep the dirty buffer list within bounds. If any buffer on the dirty list requires translation from logical to physical block number, the code may end up trying to lock an indirect buffer already locked in ffs_balloc_ufsX.
  Prevent the bdflush() activity when several buffers are locked at once by setting TDP_INBDFLUSH for the problematic code blocks.
  Reported and tested by: pho, Josef Buchsteiner at Juniper
  In collaboration with: kan
  MFC after: 1 month
* Say hi to svn, by simplifying ffs_vget() function a bit - there is no need for | pjd | 2008-07-19 | 1 | -3/+1
  a variable that is used only once.
* Fix comments to replace SBSIZE with SBLOCKSIZE, since SBSIZE | rodrigc | 2008-05-24 | 1 | -2/+2
  was renamed to SBLOCKSIZE in version 1.33
  Reviewed by: mckusick
* After converting the "snapshot" mount option to the MNT_SNAPSHOT flag, | rodrigc | 2008-05-24 | 1 | -1/+8
  delete "snapshot" from the persistent mount options list. This should fix problems with doing a mount -o snapshot of a file system, followed by an NFS export of the same file system.
  PR: 122833
  Reported by: Leon Kos <leon.kos lecad fs uni-lj si>, Jaakko Heinonen <jh saunalahti fi>
  MFC after: 1 month
* For the following mount options, do not perform the string to flag conversions | rodrigc | 2008-05-24 | 1 | -21/+0
  here, because we already do them further up in vfs_donmount() in vfs_mount.c
    async      -> MNT_ASYNC
    force      -> MNT_FORCE
    multilabel -> MNT_MULTILABEL
    noatime    -> MNT_NOATIME
    noclusterr -> MNT_NOCLUSTERR
    noclusterw -> MNT_NOCLUSTERW
  MFC after: 1 month
* Optimize lockmgr in order to get rid of the pool mutex interlock, of the | attilio | 2008-04-06 | 1 | -2/+2
  state transitioning flags and of msleep(9) calls. Use, instead, an algorithm very similar to what sx(9) and rwlock(9) already do and direct accesses to the sleepqueue(9) primitive.
  In order to avoid writer starvation, a mechanism very similar to what rwlock(9) uses now is implemented, with the corresponding per-thread shared lockmgrs counter.
  This patch also adds 2 new functions to the lockmgr KPI: lockmgr_rw() and lockmgr_args_rw(). These two are like the 2 "normal" versions, but they both accept a rwlock as interlock. In order to realize this, the general lockmgr manager function "__lockmgr_args()" has been implemented through the generic lock layer. It supports all the blocking primitives, but currently only these 2 mappers live.
  The patch drops the support for WITNESS at the moment, but it will probably be added soon. Also, there is a little race in the draining code which is also present in the current CVS stock implementation: if some sharers, once they wake up, are in the runqueue they can contend the lock with the exclusive drainer. This is hard to fix, but the now committed code mitigates this issue a lot better than the (past) CVS version. In addition, the KA_HELD and KA_UNHELD assertions have been muted because they are dangerous and they will no longer be supported soon.
  In order to avoid namespace pollution, stack.h is split into two parts: one which includes only the "struct stack" definition (_stack.h) and one defining the KPI. In this way, the newly added _lockmgr.h can just include _stack.h.
  The kernel ABI is heavily changed by this commit (the now committed version of "struct lock" is a lot smaller than the previous one) and the KPI is broken by the lockmgr_rw() / lockmgr_args_rw() introduction, so manpages and __FreeBSD_version will be updated accordingly.
  Tested by: kris, pho, jeff, danger
  Reviewed by: jeff
  Sponsored by: Google, Summer of Code program 2007
* Add the support for the AT_FDCWD and fd-relative name lookups to the | kib | 2008-03-31 | 1 | -0/+1
  namei(9).
  Based on the submission by rdivacky, sponsored by Google Summer of Code 2007
  Reviewed by: rwatson, rdivacky
  Tested by: pho
* - Since rev 1.142 of ffs_snapshot.c the interlock has not been required | jeff | 2008-03-31 | 1 | -11/+4
  to protect the v_lock pointer. Removing the interlock acquisition here allows vn_lock() to proceed without requiring the interlock at all.
  - If the lock mutated while we were sleeping on it the interlock has been dropped. It is conceivable that the upper layer code was relying on the interlock and LK_NOWAIT to protect the identity or state of the vnode while acquiring the lock. In this case return EBUSY rather than trying the new lock to prevent potential races.
  Reviewed by: tegge
* - Don't free snapdata structures when they are no longer in use. | jeff | 2008-03-31 | 1 | -67/+109
  Keeping the lockmgr lock valid allows us to switch the v_lock pointer in snapshot vnodes between the embedded lockmgr lock and snapdata lock without needing the vnode interlock to protect against races
  - Keep unused snapdata structures in a list.
  - Add a function to lock the devvp and allocate a snapdata to it or acquire a new one without races. The old function was safe from creation races because we set the mount flag when creating snapshots and thus serializing them. However, it might have been subject to destroying races.
  Reviewed by: tegge
* Fix a nit with the 'nofoo' options where 'foo' is mapped to 'nonofoo' | jhb | 2008-03-26 | 1 | -3/+3
  (such as 'atime' vs 'noatime'). The filesystems will always see either 'nofoo' or 'nonofoo', never plain 'foo'. As such, their list of valid mount options should include 'nofoo' instead of 'foo'. With this fix, you can do 'mount -u -o atime' on a FFS filesystem that isn't marked as noatime without getting an error. You can also update a noatime FFS filesystem mounted via mount(2) (e.g. 6.x /sbin/mount binary) to 'atime' using nmount(2) (e.g. 7.x /sbin/mount binary).
  MFC after: 1 week
  Reviewed by: crodig
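Why the valid-options list has to contain 'nofoo' rather than 'foo' can be shown with a toy validator. The matching rule used here (compare the name as given, and also with one leading "no" stripped) is an assumption made for this sketch, not a statement about the exact kernel routine; `toy_opt_ok()` and the two lists are hypothetical.

```c
#include <string.h>

/*
 * Returns 1 if `name` is accepted by the NULL-terminated list `legal`.
 * Assumed rule: a name matches either as given or with one leading "no"
 * removed.
 */
static int toy_opt_ok(const char *name, const char *const *legal)
{
    const char *stripped = NULL;

    if (strncmp(name, "no", 2) == 0)
        stripped = name + 2;
    for (; *legal != NULL; legal++) {
        if (strcmp(*legal, name) == 0)
            return 1;
        if (stripped != NULL && strcmp(*legal, stripped) == 0)
            return 1;
    }
    return 0;
}

/* Hypothetical option lists before and after the fix. */
static const char *const toy_old_opts[] = { "atime", NULL };
static const char *const toy_new_opts[] = { "noatime", NULL };
```

Since the filesystem only ever sees 'noatime' or 'nonoatime', a list holding 'atime' rejects 'nonoatime' under this rule, while a list holding 'noatime' accepts both spellings.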
* Yield the cpu in the kernel while iterating the list of the | kib | 2008-03-23 | 1 | -0/+1
  vnodes belonging to the mountpoint. Also, yield when in the softdep_process_worklist() even when we are not going to sleep due to buffer drain.
  It is believed that the ULE fixed the problem [1], but the yielding seems to be needed at least for the 4BSD case.
  Discussed: on stable@, with bde
  Reviewed by: tegge, jeff [1]
  MFC after: 2 weeks
* - Complete part of the unfinished bufobj work by consistently using | jeff | 2008-03-22 | 5 | -104/+99
  BO_LOCK/UNLOCK/MTX when manipulating the bufobj.
  - Create a new lock in the bufobj to lock bufobj fields independently. This leaves the vnode interlock as an 'identity' lock while the bufobj is an io lock. The bufobj lock is ordered before the vnode interlock and also before the mnt ilock.
  - Exploit this new lock order to simplify softdep_check_suspend().
  - A few sync related functions are marked with a new XXX to note that we may not properly interlock against a non-zero bv_cnt when attempting to sync all vnodes on a mountlist. I do not believe this race is important. If I'm wrong this will make these locations easier to find.
  Reviewed by: kib (earlier diff)
  Tested by: kris, pho (earlier diff)
* Reduce the acquisition of the vnode interlock in the ffs_read() and | kib | 2008-03-21 | 1 | -2/+4
  ffs_extread() when setting the IN_ACCESS flag by checking whether the IN_ACCESS is already set. The possible race there is admissible.
  Tested by: pho
  Submitted by: jeff
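The check-before-lock pattern in the entry above can be modelled in a few lines. This is a single-threaded sketch under assumptions: `struct toy_inode`, `TOY_IN_ACCESS` and `toy_mark_access()` are hypothetical, and a counter stands in for taking and releasing the vnode interlock.

```c
#define TOY_IN_ACCESS 0x0001     /* hypothetical flag value */

/* Hypothetical inode with a counter in place of a real interlock. */
struct toy_inode {
    int i_flag;             /* state flags */
    int lock_acquisitions;  /* how many times the "interlock" was taken */
};

/* Take the lock only when the flag is not already set. */
static void toy_mark_access(struct toy_inode *ip)
{
    if ((ip->i_flag & TOY_IN_ACCESS) == 0) {  /* unlocked check */
        ip->lock_acquisitions++;              /* models acquiring the interlock */
        ip->i_flag |= TOY_IN_ACCESS;
        /* the interlock would be released here */
    }
}
```

Repeated reads of the same inode then take the lock once rather than on every call; the unlocked test can race with another thread, which the commit message accepts as admissible.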
* - Relax requirements for p_numthreads, p_threads, p_swtick, and p_nice from | jeff | 2008-03-19 | 1 | -4/+0
  requiring the per-process spinlock to only requiring the process lock.
  - Reflect these changes in the proc.h documentation and consumers throughout the kernel. This is a substantial reduction in locking cost for these fields and was made possible by recent changes to threading support.
* In keeping with style(9)'s recommendations on macros, use a ';' | rwatson | 2008-03-16 | 1 | -1/+2
  after each SYSINIT() macro invocation. This makes a number of lightweight C parsers much happier with the FreeBSD kernel source, including cflow's prcc and lxr.
  MFC after: 1 month
  Discussed with: imp, rink
* Replace the non-MPSAFE timeout(9) API in ffs_softdep.c with the MPSAFE | cokane | 2008-03-13 | 1 | -8/+15
  callout_* API (e.g. callout_init_mtx(9)). This was one of the numerous items on the http://wiki.freebsd.org/SMPTODO list.
  Reviewed by: imp, obrien, jhb
  MFC after: 1 week
* Remove include of opt_quota.h; as of revision 1.205 there is no longer | emaste | 2008-03-10 | 1 | -1/+0
  any #ifdef QUOTA conditional code.
* Initialize mnt_stat.f_iosize before autostarting UFS1 extattrs. | kib | 2008-03-05 | 1 | -0/+1
  It is normally initialized by ffs_statfs() after ffs_mount finished. The extattr autostart code calls ufs_lookup(), which uses the value above to iterate over the directory blocks; see the bmask initialization in ufs_lookup() and ufsdirhash. Having a filesystem with a root directory spanning more than one block would result in reading random kernel memory.
  PR: kern/120781
  Test case provided by: rwatson
  MFC after: 1 week
* Move setting of MNTK_MPSAFE flag before UFS1 extended attribute | rwatson | 2008-03-04 | 1 | -3/+3
  auto-start so that the flag is set before we start performing I/O in the auto-start routine.
  MFC after: 2 weeks
  Suggested by: kib
* Minor typo nit. | keramida | 2008-02-25 | 1 | -1/+1
* Axe the 'thread' argument from VOP_ISLOCKED() and lockstatus() as it is | attilio | 2008-02-25 | 1 | -7/+5
  always curthread.
  As KPI gets broken by this patch, manpages and __FreeBSD_version will be updated by further commits.
  Tested by: Andrea Barberio <insomniac at slackware dot it>
* Introduce some functions in the vnode locks namespace and in the ffs | attilio | 2008-02-24 | 2 | -5/+8
  namespace in order to handle lockmgr fields in a controlled way instead of spreading bogus stubs all around:
  - VN_LOCK_AREC() allows lock recursion for a specified vnode
  - VN_LOCK_ASHARE() allows lock sharing for a specified vnode
  In FFS land:
  - BUF_AREC() allows lock recursion for a specified buffer lock
  - BUF_NOREC() disallows recursion for a specified buffer lock
  Side note: union_subr.c::unionfs_node_update() is the only other function directly handling lockmgr fields. As this is not simple to fix, it has been left behind as "sole" exception.
* - Introduce lockmgr_args() in the lockmgr space. This function performs | attilio | 2008-02-15 | 1 | -3/+5
  the same operation as lockmgr() but accepts a custom wmesg, prio and timo for the particular lock instance, overriding the default values lkp->lk_wmesg, lkp->lk_prio and lkp->lk_timo.
  - Use lockmgr_args() in order to implement BUF_TIMELOCK()
  - Cleanup BUF_LOCK()
  - Remove LK_INTERNAL as it is no longer used in the lockmgr namespace
  Tested by: Andrea Barberio <insomniac at slackware dot it>