author mckusick <mckusick@FreeBSD.org> 2012-02-07 20:43:28 +0000
committer mckusick <mckusick@FreeBSD.org> 2012-02-07 20:43:28 +0000
commit 3619d603a7c9a5ce6abe764366b472e421a5da40
tree bc91bb302c502bbccb492f8d484b961f67b0b921 /sys/ufs
parent e04eff2488f2e61bf16e804dbb4cdcf635ef86dc
In the original days of BSD, a sync was issued on every filesystem
every 30 seconds. This spike in I/O caused the system to pause every
30 seconds, which was quite annoying. So, the way that sync worked
was changed so that when a vnode was first dirtied, it was put on
a 30-second cleaning queue (see the syncer_workitem_pending queues
in kern/vfs_subr.c). If the file has not been written or deleted
after 30 seconds, the syncer pushes it out. As the syncer runs once
per second, dirty files are trickled out slowly over the 30-second
period instead of all at once by a call to sync(2).

The one drawback to this is that it does not cover the filesystem
metadata. To handle the metadata, vfs_allocate_syncvnode() is called
to create a "filesystem syncer vnode" at mount time, which cycles
around the cleaning queue and is sync'ed every 30 seconds. In the
original design, the only things it would sync for UFS were the
filesystem metadata: inode blocks, cylinder group bitmaps, and the
superblock (e.g., by VOP_FSYNC'ing devvp, the device vnode from
which the filesystem is mounted).

Somewhere in its path to integration with FreeBSD, the flushing of
the filesystem syncer vnode got changed to sync every vnode associated
with the filesystem. The result of that change was a return to the
old behavior of a filesystem-wide flush every 30 seconds, which makes
the whole 30-second delay per vnode useless.

This change goes back to the originally intended trickle-out sync
behavior. Key to ensuring that all the intended semantics are
preserved (e.g., that all inode updates get flushed within a bounded
period of time) is that all inode modifications get pushed to their
corresponding inode blocks so that the metadata flush by the
filesystem syncer vnode gets them to the disk in a timely way.
Thanks to Konstantin Belousov (kib@) for doing the audit and commit
-r231122, which ensures that all of these updates are being made.

Reviewed by: kib
Tested by: scottl
MFC after: 2 weeks
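The trickle-out behavior described in the commit message can be pictured as a timing wheel. What follows is a minimal user-space sketch, not the kernel code in kern/vfs_subr.c: the names sync_wheel, wheel_init, wheel_add and wheel_tick, the constants WHEEL_SLOTS, SYNC_DELAY and MAX_ITEMS, and the use of plain integer ids in place of vnodes are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names and sizes below are hypothetical, chosen for this sketch. */
#define WHEEL_SLOTS 32	/* slots in the wheel; must exceed SYNC_DELAY */
#define SYNC_DELAY  30	/* ticks (seconds) between dirtying and flushing */
#define MAX_ITEMS   64	/* capacity of one slot */

struct sync_wheel {
	int	 items[WHEEL_SLOTS][MAX_ITEMS];	/* ids queued in each slot */
	size_t	 count[WHEEL_SLOTS];		/* number of ids in each slot */
	unsigned now;				/* slot for the current tick */
};

static void
wheel_init(struct sync_wheel *w)
{
	memset(w, 0, sizeof(*w));
}

/*
 * Queue a newly dirtied item so that it comes due `delay` ticks from now.
 * Returns 0 on success, or -1 if the target slot is full.
 */
static int
wheel_add(struct sync_wheel *w, int id, unsigned delay)
{
	unsigned slot;

	assert(delay > 0 && delay < WHEEL_SLOTS);
	slot = (w->now + delay) % WHEEL_SLOTS;
	if (w->count[slot] == MAX_ITEMS)
		return (-1);
	w->items[slot][w->count[slot]++] = id;
	return (0);
}

/*
 * One syncer tick: advance to the next slot and hand back every id that
 * has come due there.  Only that one slot is drained, so the work is
 * spread across ticks.  Returns the number of ids copied into `out`.
 */
static size_t
wheel_tick(struct sync_wheel *w, int out[MAX_ITEMS])
{
	size_t i, n;

	w->now = (w->now + 1) % WHEEL_SLOTS;
	n = w->count[w->now];
	for (i = 0; i < n; i++)
		out[i] = w->items[w->now][i];
	w->count[w->now] = 0;
	return (n);
}
```

Calling wheel_add() once when an item is first dirtied and wheel_tick() once per second means two items dirtied a second apart are also flushed a second apart, rather than together in one filesystem-wide burst. In this picture, the filesystem syncer vnode would be one more item that is re-queued each time it comes due.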
Diffstat (limited to 'sys/ufs')
-rw-r--r-- sys/ufs/ffs/ffs_vfsops.c 20
1 files changed, 15 insertions, 5 deletions
diff --git a/sys/ufs/ffs/ffs_vfsops.c b/sys/ufs/ffs/ffs_vfsops.c
index a97d23a..a12fd83 100644
--- a/sys/ufs/ffs/ffs_vfsops.c
+++ b/sys/ufs/ffs/ffs_vfsops.c
@@ -1436,17 +1436,26 @@ ffs_sync(mp, waitfor)
int softdep_accdeps;
struct bufobj *bo;
+ wait = 0;
+ suspend = 0;
+ suspended = 0;
td = curthread;
fs = ump->um_fs;
if (fs->fs_fmod != 0 && fs->fs_ronly != 0 && ump->um_fsckpid == 0)
panic("%s: ffs_sync: modification on read-only filesystem",
fs->fs_fsmnt);
/*
+ * For a lazy sync, we just care about the filesystem metadata.
+ */
+ if (waitfor == MNT_LAZY) {
+ secondary_accwrites = 0;
+ secondary_writes = 0;
+ lockreq = 0;
+ goto metasync;
+ }
+ /*
* Write back each (modified) inode.
*/
- wait = 0;
- suspend = 0;
- suspended = 0;
lockreq = LK_EXCLUSIVE | LK_NOWAIT;
if (waitfor == MNT_SUSPEND) {
suspend = 1;
@@ -1517,11 +1526,12 @@ loop:
#ifdef QUOTA
qsync(mp);
#endif
+
+metasync:
devvp = ump->um_devvp;
bo = &devvp->v_bufobj;
BO_LOCK(bo);
- if (waitfor != MNT_LAZY &&
- (bo->bo_numoutput > 0 || bo->bo_dirty.bv_cnt > 0)) {
+ if (bo->bo_numoutput > 0 || bo->bo_dirty.bv_cnt > 0) {
BO_UNLOCK(bo);
vn_lock(devvp, LK_EXCLUSIVE | LK_RETRY);
if ((error = VOP_FSYNC(devvp, waitfor, td)) != 0)