author mckusick <mckusick@FreeBSD.org> 2013-08-28 17:38:05 +0000
committer mckusick <mckusick@FreeBSD.org> 2013-08-28 17:38:05 +0000
commit 2ecfc284152b0f42d7d72d81c1aee8786a7a81c9
tree 637d93d1aa9bf993830238324660217589e581f5 /sys
parent 0c40bd77afdf5a783614e79c12a317267d9aa6e1
A performance problem was reported in PR kern/181226:

    I have 25TB Dell PERC 6 RAID5 array. When it becomes almost full
    (10-20GB free), processes which write data to it start eating
    100% CPU and write speed drops below 1MB/sec (normally it gives
    400MB/sec). The revision at which it first became apparent was
    http://svnweb.freebsd.org/changeset/base/249782.

The offending change reserved an area in each cylinder group to store metadata. The new algorithm attempts to save this area for metadata and allows its use for non-metadata only after all the data areas have been exhausted. The size of the reserved area defaults to half of minfree, so the filesystem reports full before the data area can completely fill. However, in this report, the filesystem has had minfree reduced to 1%, thus forcing the metadata area to be used for data. As the filesystem approached full, it had only metadata areas left to allocate. The result was that every block allocation had to scan summary data for 30,000 cylinder groups before falling back to searching up to 30,000 metadata areas.

The fix is to give up on saving the metadata areas once the free space reserve drops below 2%. The effect of this change is to use the old algorithm of just accepting the first available block that we find. Since most filesystems use the default 5% minfree, this will have no effect on their operation. For those that want to push to the limit, they will get their crappy block placements quickly.

Submitted by: Dmitry Sivachenko
Fix Tested by: Dmitry Sivachenko
PR: kern/181226
MFC after: 2 weeks

Diffstat (limited to 'sys')
-rw-r--r-- sys/ufs/ffs/ffs_alloc.c 16
1 file changed, 14 insertions, 2 deletions
diff --git a/sys/ufs/ffs/ffs_alloc.c b/sys/ufs/ffs/ffs_alloc.c
index cf1d953..cb5d45c 100644
--- a/sys/ufs/ffs/ffs_alloc.c
+++ b/sys/ufs/ffs/ffs_alloc.c
@@ -516,7 +516,13 @@ ffs_reallocblks_ufs1(ap)
 	ip = VTOI(vp);
 	fs = ip->i_fs;
 	ump = ip->i_ump;
-	if (fs->fs_contigsumsize <= 0)
+	/*
+	 * If we are not tracking block clusters or if we have less than 2%
+	 * free blocks left, then do not attempt to cluster. Running with
+	 * less than 5% free block reserve is not recommended and those that
+	 * choose to do so do not expect to have good file layout.
+	 */
+	if (fs->fs_contigsumsize <= 0 || freespace(fs, 2) < 0)
 		return (ENOSPC);
 	buflist = ap->a_buflist;
 	len = buflist->bs_nchildren;
@@ -737,7 +743,13 @@ ffs_reallocblks_ufs2(ap)
 	ip = VTOI(vp);
 	fs = ip->i_fs;
 	ump = ip->i_ump;
-	if (fs->fs_contigsumsize <= 0)
+	/*
+	 * If we are not tracking block clusters or if we have less than 2%
+	 * free blocks left, then do not attempt to cluster. Running with
+	 * less than 5% free block reserve is not recommended and those that
+	 * choose to do so do not expect to have good file layout.
+	 */
+	if (fs->fs_contigsumsize <= 0 || freespace(fs, 2) < 0)
 		return (ENOSPC);
 	buflist = ap->a_buflist;
 	len = buflist->bs_nchildren;