path: root/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/trim_map.c
Commit message (Author, Date, Files changed, Lines -/+)
* MFC r277185: (mav, 2015-01-28, 1 file changed, -1/+1)

    Fix overflow bug from r248577, turning 30s TRIM timeout into ~4s.
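    The entry gives only the symptom, but "30 s becomes ~4 s" is the
    signature of a 32-bit overflow when converting seconds to
    nanoseconds (30e9 mod 2^32 is roughly 4.23e9). A minimal, runnable
    illustration of that class of bug; the names and the exact
    expression are assumptions, not the committed diff:

        #include <stdint.h>
        #include <stdio.h>

        #define NANOSEC 1000000000U     /* nanoseconds per second */

        int
        main(void)
        {
            uint32_t trim_timeout = 30;     /* seconds */

            /* Buggy: the product wraps in 32-bit arithmetic. */
            uint64_t bad = trim_timeout * NANOSEC;

            /* Fixed: widen to 64 bits before multiplying. */
            uint64_t good = (uint64_t)trim_timeout * NANOSEC;

            printf("buggy:   %.2f s\n", bad / 1e9);   /* ~4.23 s */
            printf("correct: %.2f s\n", good / 1e9);  /* 30.00 s */
            return (0);
        }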
* MFC r277169: Reimplement TRIM throttling added in r248577. (mav, 2015-01-28, 1 file changed, -51/+42)

    The previous throttling implementation approached the problem from
    the wrong side: it significantly limited the useful delaying and
    aggregation of TRIM requests, while doing little to control TRIM
    burstiness under heavy load. With this change, random 4K write
    benchmarks (probably the worst case for TRIM) show a 20% IOPS
    increase, a 30% reduction in average latency, a 3x reduction in
    peak TRIM bursts, and the same peak TRIM map size (memory usage).
    The new logic also no longer forces the map size down so heavily,
    allowing deleted data to be kept for 32 TXGs or 30 seconds under
    moderate load. That was practically impossible with the old
    throttling logic, which pushed the map down to only 64 segments.
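    The entry describes the direction of the change rather than its
    mechanism. A rough sketch of throttling on the issue side (a
    per-pass cap) instead of forcing the map size down; every name
    here is hypothetical, not the committed code:

        /* Issue at most max_bytes of TRIM per pass; anything beyond
         * the cap stays queued and keeps aging, so the map is never
         * forcibly shrunk just to bound burstiness. */
        static void
        trim_thread_pass(trim_map_t *tm, uint64_t max_bytes)
        {
            uint64_t issued = 0;
            trim_seg_t *ts;

            while ((ts = trim_map_oldest_ready(tm)) != NULL) {
                if (issued + ts->ts_size > max_bytes)
                    break;      /* cap reached; resume next pass */
                issued += ts->ts_size;
                trim_map_segment_issue(tm, ts);
            }
        }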
* MFC r276983: When aggregating TRIM segments, move the new one to the end. (mav, 2015-01-25, 1 file changed, -0/+6)

    A new segment at the list head may block all TRIM requests until
    the txg of that segment can be processed. On my random I/O tests
    this change reduces the peak TRIM list length from 650 to 450
    segments. Hopefully it should also reduce TRIM burstiness when
    list processing is unblocked.
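    A sketch of that re-queue using sys/queue.h, with hypothetical
    types standing in for the real trim map structures:

        #include <sys/queue.h>
        #include <stdint.h>

        struct trim_seg {
            uint64_t ts_txg;                /* txg that freed this range */
            TAILQ_ENTRY(trim_seg) ts_next;
        };
        TAILQ_HEAD(trim_list, trim_seg);

        /* After merging a new range into an existing segment, move the
         * segment to the tail: its effective txg is now the newest one,
         * so leaving it at the head would block every older segment
         * queued behind it. */
        static void
        trim_seg_requeue(struct trim_list *head, struct trim_seg *ts,
            uint64_t txg)
        {
            TAILQ_REMOVE(head, ts, ts_next);
            ts->ts_txg = txg;
            TAILQ_INSERT_TAIL(head, ts, ts_next);
        }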
* MFC r265218 (smh): (delphij, 2015-01-10, 1 file changed, -1/+0)

    Removed pointless / duplicated call to trim_map_first.
* MFC r274619: (smh, 2014-11-21, 1 file changed, -4/+2)

    Disable TRIM on file-backed ZFS vdevs and fix TRIM on init.

    Sponsored by: Multiplay
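    A sketch of the gating this implies; vdev_file_ops and the
    vdev_op_leaf flag are real ZFS symbols, but the exact check the
    commit uses is an assumption:

        /* Only attach a trim map to leaf vdevs that are real devices;
         * a file-backed vdev has no medium to TRIM. */
        if (vd->vdev_ops->vdev_op_leaf && vd->vdev_ops != &vdev_file_ops)
            trim_map_create(vd);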
* MFC r265152 - Reintroduce priority for the TRIM ZIOs instead of using the "NOW" priority (smh, 2014-08-21, 1 file changed, -2/+13)

    MFC r265321 - Fix double fault panic when returning EOPNOTSUPP
    MFC r269407 - Don't return ZIO_PIPELINE_CONTINUE from
    vdev_op_io_start methods

    Sponsored by: Multiplay
* Changed ZFS TRIM sysctl from vfs.zfs.trim_disable -> vfs.zfs.trim.enabled (smh, 2013-04-26, 1 file changed, -9/+9)

    Enabled ZFS TRIM by default.

    Reviewed by: pjd (mentor)
    Approved by: pjd (mentor)
    MFC after: 2 weeks
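    A sketch of what the rename plus polarity flip looks like in the
    FreeBSD sysctl idiom; the macros are real, but the exact flags and
    declarations in the commit are assumptions:

        /* Was: vfs.zfs.trim_disable (default 1, i.e. TRIM off). */
        static int zfs_trim_enabled = 1;    /* now on by default */

        SYSCTL_NODE(_vfs_zfs, OID_AUTO, trim, CTLFLAG_RD, 0,
            "ZFS TRIM parameters");
        TUNABLE_INT("vfs.zfs.trim.enabled", &zfs_trim_enabled);
        SYSCTL_INT(_vfs_zfs_trim, OID_AUTO, enabled, CTLFLAG_RW,
            &zfs_trim_enabled, 0, "Enable ZFS TRIM");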
* Fix for building libzpool under i386. (smh, 2013-03-21, 1 file changed, -1/+1)

    Reviewed by: pjd (mentor)
    Approved by: pjd (mentor)
    MFC after: 2 weeks
* Optimisation of TRIM processing. (smh, 2013-03-21, 1 file changed, -34/+97)

    Previously TRIM processing was very bursty. This was made worse by
    the fact that TRIM requests on SSDs are typically much slower than
    reads or writes, which often resulted in stalls while large
    numbers of TRIMs were processed. In addition, because the TRIM
    thread was only woken by writes, deletes could stall in the queue
    for extensive periods of time.

    This patch adds a number of controls to how often the TRIM thread
    for each SPA processes its outstanding delete requests (see the
    sketch after this entry):

    vfs.zfs.trim.timeout: Delay TRIMs by up to this many seconds
    vfs.zfs.trim.txg_delay: Delay TRIMs by up to this many TXGs
        (reduced to 32)
    vfs.zfs.vdev.trim_max_bytes: Maximum pending TRIM bytes for a vdev
    vfs.zfs.vdev.trim_max_pending: Maximum pending TRIM segments for a
        vdev
    vfs.zfs.trim.max_interval: Maximum interval between TRIM queue
        processing (seconds)

    Given that the most common TRIM implementation is ATA TRIM, the
    current defaults are targeted at that.

    Reviewed by: pjd (mentor)
    Approved by: pjd (mentor)
    MFC after: 2 weeks
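    A runnable userland model of the eligibility test these tunables
    imply; the predicate (TXG delay or timeout, whichever passes
    first) is an assumption based on the "up to" wording, not the
    committed expression:

        #include <stdbool.h>
        #include <stdint.h>
        #include <stdio.h>

        static uint64_t trim_txg_delay = 32;  /* vfs.zfs.trim.txg_delay */
        static uint64_t trim_timeout = 30;    /* vfs.zfs.trim.timeout, s */

        /* A freed segment becomes eligible once it has aged past the
         * TXG delay or the time-based timeout. */
        static bool
        trim_seg_ready(uint64_t seg_txg, uint64_t seg_sec,
            uint64_t cur_txg, uint64_t cur_sec)
        {
            return (seg_txg + trim_txg_delay <= cur_txg ||
                seg_sec + trim_timeout <= cur_sec);
        }

        int
        main(void)
        {
            /* Freed at txg 100, t=0s: not ready at txg 110, t=10s... */
            printf("%d\n", trim_seg_ready(100, 0, 110, 10));  /* 0 */
            /* ...ready once 32 TXGs have passed... */
            printf("%d\n", trim_seg_ready(100, 0, 132, 15));  /* 1 */
            /* ...or once 30 seconds have passed. */
            printf("%d\n", trim_seg_ready(100, 0, 110, 30));  /* 1 */
            return (0);
        }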
* Names the ZFS TRIM thread. (smh, 2013-03-21, 1 file changed, -0/+5)

    Reviewed by: pjd (mentor)
    Approved by: pjd (mentor)
    MFC after: 2 weeks
* TRIM cache devices based on time instead of TXGs. (smh, 2013-03-21, 1 file changed, -7/+27)

    Currently, the trim module uses the same algorithm for data and
    cache devices when deciding to issue TRIM requests, based on how
    far in the past the TXG is. Unfortunately, this is not ideal for
    cache devices, because the L2ARC doesn't use the concept of TXGs
    at all. In fact, when using a pool for reading only, the L2ARC is
    written but the TXG counter doesn't increase, and so no new TRIM
    requests are issued to the cache device.

    This patch fixes the issue by using time instead of the TXG number
    as the criterion for trimming on cache devices. The basic delay
    principle stays the same, but the parameters are expressed in
    seconds instead of TXGs. The new parameters are named
    trim_l2arc_limit and trim_l2arc_batch, and both default to 30
    seconds.

    Reviewed by: pjd (mentor)
    Approved by: pjd (mentor)
    Obtained from: https://github.com/dechamps/zfs/commit/17122c31ac7f82875e837019205c21651c05f8cd
    MFC after: 2 weeks
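    A sketch of the split this describes; the field and parameter
    names are illustrative, not the committed code:

        /* Cache vdevs age by wall clock (L2ARC has no TXGs); data
         * vdevs keep the TXG-based rule. */
        static boolean_t
        trim_seg_eligible(const trim_seg_t *ts, boolean_t is_l2arc,
            uint64_t cur_txg, uint64_t now_sec)
        {
            if (is_l2arc)
                return (ts->ts_time + trim_l2arc_limit <= now_sec);
            return (ts->ts_txg + trim_txg_delay <= cur_txg);
        }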
* Improve TXG handling in the TRIM module. (smh, 2013-03-21, 1 file changed, -5/+4)

    This patch adds some improvements to the way the trim module
    considers TXGs:

    - Free ZIOs are registered with the TXG from the ZIO itself, not
      the current SPA syncing TXG (which may be out of date);
    - L2ARC ZIOs are registered with a zero TXG number, as the L2ARC
      has no concept of TXGs;
    - The TXG limit for issuing TRIMs is now computed from the last
      synced TXG, not the currently syncing TXG. Indeed, under
      extremely unlikely race conditions, there is a risk we could
      trim blocks which have been freed in a TXG that has not finished
      syncing, resulting in potential data corruption in case of a
      crash.

    Reviewed by: pjd (mentor)
    Approved by: pjd (mentor)
    Obtained from: https://github.com/dechamps/zfs/commit/5b46ad40d9081d75505d6f3bf04ac652445df366
    MFC after: 2 weeks
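    An illustrative fragment of the cutoff this implies;
    spa_last_synced_txg() is a real SPA accessor, the rest is an
    assumption:

        /* Compute the TRIM-safe cutoff from the last fully synced
         * TXG, never the one still syncing, so a crash mid-sync
         * cannot expose ranges trimmed under it. Segments with
         * ts_txg == 0 are L2ARC and are aged by time instead. */
        uint64_t txgsafe = spa_last_synced_txg(spa);
        boolean_t ready = (ts->ts_txg != 0 &&
            ts->ts_txg + trim_txg_delay <= txgsafe);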
* Add TRIM support for L2ARC. (smh, 2013-03-21, 1 file changed, -6/+5)

    This adds TRIM support to cache vdevs. When ARC buffers are
    removed from the L2ARC in arc_hdr_destroy(), arc_release() or
    l2arc_evict(), the size previously occupied by the buffer gets
    scheduled for TRIMming. As always, actual TRIMs are only issued to
    the L2ARC after txg_trim_limit.

    Reviewed by: pjd (mentor)
    Approved by: pjd (mentor)
    Obtained from: https://github.com/dechamps/zfs/commit/31aae373994fd112256607edba7de2359da3e9dc
    MFC after: 2 weeks
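    A sketch of the eviction hook; trim_map_free() is the real entry
    point in this file, but the call shown here and its argument names
    are assumptions:

        /* On L2ARC eviction, hand the buffer's device range back to
         * the trim map. The TXG argument is 0: L2ARC has no TXGs, so
         * these segments age by time (see the commits above). */
        trim_map_free(vd, l2hdr_daddr, buf_size, 0);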
* Upgrades trim free request sizes before inserting them into the free map, making range consolidation much more effective, particularly for small deletes. (smh, 2012-12-13, 1 file changed, -2/+13)

    This reduces the memory used by the free map as well as reducing
    the number of bio requests sent down to geom to process all
    deletes. In tests this achieved a factor of 10 reduction in trim
    ranges / geom call downs.

    While I'm here, correct the description of zio_vdev_io_start.

    PR: kern/173254
    Submitted by: Steven Hartland
    Approved by: pjd (mentor)
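    A sketch of that size upgrade using the Solaris power-of-two
    macros from sys/sysmacros.h; the macros are real, but rounding to
    the top-level vdev's ashift and the surrounding names are
    assumptions:

        /* Widen each free to aligned boundaries so adjacent small
         * frees coalesce into a single TRIM range. */
        uint64_t align = 1ULL << vd->vdev_top->vdev_ashift;
        uint64_t start = P2ALIGN(offset, align);          /* round down */
        uint64_t end = P2ROUNDUP(offset + size, align);   /* round up */

        trim_map_free_locked(tm, start, end, txg);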
* Add TRIM support. (pjd, 2012-09-23, 1 file changed, -0/+541)

    The code builds a map of regions that were freed. On every write
    the code consults the map and eventually removes ranges that were
    freed before but are now overwritten.

    Freed blocks are not TRIMed immediately. There is a tunable that
    defines how many TXGs we should wait before TRIMming freed blocks
    (64 by default). There is a low-priority thread that TRIMs ranges
    when the time comes.

    During TRIM we keep in-flight ranges on a list to detect colliding
    writes: we have to delay writes that collide with in-flight TRIMs
    in case requests are reordered and the write reaches the disk
    before the TRIM (see the model after this entry). We don't have to
    do the same for in-flight writes, as colliding writes just remove
    ranges to TRIM.

    This work includes some important fixes and some improvements
    obtained from the zfsonlinux project, including TRIMming entire
    vdevs on pool create/add/attach and on pool import for spare and
    cache vdevs.

    Sponsored by: multiplay.co.uk
    Obtained from: zfsonlinux
    Submitted by: Etienne Dechamps <etienne.dechamps@ovh.net>
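    A runnable userland model of that collision rule; all names are
    illustrative:

        #include <stdbool.h>
        #include <stdint.h>
        #include <stdio.h>

        struct range { uint64_t start, end; };  /* [start, end) */

        static bool
        overlap(const struct range *a, const struct range *b)
        {
            return (a->start < b->end && b->start < a->end);
        }

        /* A write overlapping any in-flight TRIM must wait until that
         * TRIM completes; otherwise reordering could let the TRIM hit
         * the disk after the write and destroy the new data. */
        static bool
        write_may_proceed(const struct range *wr,
            const struct range *inflight, int n)
        {
            for (int i = 0; i < n; i++)
                if (overlap(wr, &inflight[i]))
                    return (false);
            return (true);
        }

        int
        main(void)
        {
            struct range trims[] = { { 4096, 8192 } };
            struct range w1 = { 0, 4096 };      /* disjoint: proceed */
            struct range w2 = { 6144, 10240 };  /* overlaps: wait */

            printf("%d %d\n", write_may_proceed(&w1, trims, 1),
                write_may_proceed(&w2, trims, 1));  /* prints "1 0" */
            return (0);
        }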