path: root/sys/cddl
Commit message | Author | Age | Files | Lines
* MFC r284593: MFV r284412: 5911 ZFS "hangs" while deleting file | avg | 2015-07-06 | 5 | -33/+89
  illumos/illumos-gate@46e1baa6cf6d5432f5fd231bb588df8f9570c858
  https://www.illumos.org/issues/5911

  Sometimes ZFS appears to hang while deleting a file. It is actually
  making slow progress at the file deletion, but other operations
  (administrative and writes via the data path) "hang" until the file
  removal completes, which can take a long time if the file has many
  blocks. The deletion (or most of it) happens in a single txg, and the
  sync thread spends most of its time reading indirect blocks...

  Reviewed by: Bayard Bell <buffer.g.overflow@gmail.com>
  Reviewed by: Alek Pinchuk <alek@nexenta.com>
  Reviewed by: Simon Klinkert <simon.klinkert@gmail.com>
  Reviewed by: Dan McDonald <danmcd@omniti.com>
  Approved by: Richard Lowe <richlowe@richlowe.net>
  Author: Matthew Ahrens <mahrens@delphix.com>

  PR: 199775
  Approved by: re(kib)
* MFC r284304: MFV r284030: 5818 zfs {ref}compressratio is incorrect with 4k sector size | avg | 2015-07-01 | 6 | -17/+40
  Note: no MFC to stable/9 because r268075 (vendor r267565) has not been
  MFC-ed.
* MFC r284306: MFV r284036: 5961 Fix stack overflow in zfs_create_fs | avg | 2015-06-24 | 1 | -16/+17
* MFC r284303: MFV r283534: 5515 dataset user hold doesn't reject empty tags | avg | 2015-06-24 | 1 | -2/+15
* MFC r284301: MFV r284040: check that datasets are snapshots | avg | 2015-06-24 | 2 | -0/+12
* MFC r283525: zfs: fixes for a full stream received into an existing dataset | avg | 2015-06-12 | 1 | -4/+6
* MFC r283602: | kib | 2015-06-10 | 3 | -1/+5
  Prevent dounmount() from acting on the freed (although type-stable)
  memory by changing the interface to require the mount point to be
  referenced.

  MFC r283629:
  Add missed {}.
* MFC r278167, MFV r266995: | pfg | 2015-06-07 | 1 | -2/+3
  4767 dtrace_probe() always has the timestamp

  Reference: https://illumos.org/issues/4767

  Obtained from: Illumos
* MFC r278166, MFV r266993: | pfg | 2015-06-07 | 3 | -47/+54
  4469 DTrace helper tracing should be dynamic

  Reference: https://illumos.org/issues/4469

  Obtained from: Illumos
  Phabric: D1551
  Reviewed by: markj
* MFC r278136, r278137, r278370: | markj | 2015-06-07 | 1 | -13/+19
  Diff reduction with illumos, in preparation for merging r266993 from
  the vendor branch. No functional change.
* MFC r283524: dsl_dataset_promote_check: ensure that shared snaps do not become too long | avg | 2015-06-05 | 1 | -0/+6
* MFC r282766: zfs ioctls: use fget_write / fget_read instead of getf wrapper for fget | avg | 2015-06-05 | 1 | -5/+23
* MFC r283515: | kib | 2015-06-01 | 1 | -2/+0
  Remove excess Giant acquisition around the dounmount() call.
* MFC r277915: | markj | 2015-05-29 | 1 | -3/+8
  Don't attempt to disable enabled fasttrap probes in an exiting process.

  MFC r277914:
  fasttrap_sigtrap(): use tdsendsignal() to send SIGTRAP.
* MFC r281915: | markj | 2015-05-29 | 6 | -459/+4
  Make vpanic() externally visible.

  MFC r281916:
  Fix DTrace's panic() action.
* MFC r282475: zfs: do not hold an extra reference on a root vnode | avg | 2015-05-25 | 2 | -9/+2
* MFC r282473: dmu_recv_end_check: don't leak hold if dsl_destroy_snapshot_check_impl fails | avg | 2015-05-25 | 1 | -2/+6
* MFC r282632: MFV r282630: 5809 Blowaway full receive in v1 pool causes kernel panic | avg | 2015-05-25 | 1 | -1/+2
* MFC r282880: | smh | 2015-05-16 | 1 | -0/+1
  Add copyright info missing from r282205

  Sponsored by: Multiplay
* MFC r282205: | smh | 2015-05-12 | 1 | -15/+17
  Fix misuse of input argument in traverse_visitbp

  Obtained from: zfsonlinux (a585f2f844ed3d4270221fed88f5e494eb55d932)
  Sponsored by: Multiplay
* MFC r282131: replace a comment about zfs recv -F corner case with a longer one | avg | 2015-05-11 | 1 | -7/+17
* MFC r282130: zfs_onexit_fd_hold: return EBADF even if devfs_get_cdevpriv gave ENOENT | avg | 2015-05-11 | 1 | -1/+1
* MFC r282127: dsl_dir_rename_check: return EXDEV on cross-pool rename attempt | avg | 2015-05-11 | 1 | -1/+1
* MFC r282126: MFV r282123: 5610 zfs clone from different source and target pools | avg | 2015-05-11 | 2 | -12/+4
* MFC r282125: MFV r282124: 5393 spurious failures from dsl_dataset_hold_obj() | avg | 2015-05-11 | 1 | -1/+2
* MFC r282122: nvpair_type_is_array: DATA_TYPE_INT8_ARRAY was not recognized | avg | 2015-05-11 | 1 | -0/+1
* MFC r275576: remove opensolaris cyclic code, replace with high-precision callouts | avg | 2015-05-11 | 10 | -2260/+171
* MFC r281026, r281108, r281109: | mav | 2015-05-03 | 1 | -2/+12
  Make ZFS ARC track both KVA usage and fragmentation.

  Even on Illumos, with its much larger KVA, ZFS ARC steps back if KVA
  usage reaches a certain threshold (3/4 on i386 or 16/17 otherwise).
  FreeBSD has even less KVA, but had no such limit on archs with a direct
  map, such as amd64. As a result, on machines with a lot of RAM, during
  load with very small user-space memory pressure, such as `zfs send`, it
  was possible to reach a state where there is enough of both physical
  RAM and KVA (I've seen up to 25-30%), but no contiguous KVA range to
  allocate even a single 128KB I/O request.

  Address this situation from two sides:
   - restore KVA usage limitations in a way as close as possible to
     Illumos;
   - introduce a new requirement for KVA fragmentation, specifying that
     we should have at least one sequential KVA range of
     zfs_max_recordsize bytes.

  Experiments show that the first limitation alone is not sufficient. On
  a machine with 64GB of RAM it is sometimes necessary to drop up to half
  of the ARC size to get at least one 1MB KVA chunk. Statically limiting
  ARC to half of KVA/RAM is too strict, so the second limitation makes it
  work in cycles: accumulate trash up to a certain critical mass, do a
  massive spring-cleaning, and then start littering again.
* MFC r281667: | delphij | 2015-04-25 | 1 | -6/+0
  Remove vfs.zfs.snapshot_list_prefetch, the corresponding code was gone
  in r248571 already.
* MFC r280834: | markj | 2015-04-13 | 2 | -2/+44
  Bound the number of frames traversed when executing the ustackdepth
  action.
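The bounding idea can be sketched in isolation. This is a minimal illustration, not the DTrace code: `struct frame`, `MAX_FRAMES`, and `stack_depth()` are hypothetical names, and the cap of 64 is an arbitrary value chosen for the example.

```c
#include <stddef.h>

/* Hypothetical stand-in for one saved user stack frame. */
struct frame {
	struct frame *next;	/* caller's frame, or NULL at the stack base */
};

#define	MAX_FRAMES	64	/* arbitrary cap chosen for this example */

/*
 * Count the frames reachable from fp, giving up after MAX_FRAMES so that
 * a corrupt or cyclic frame chain cannot make the walk run forever.
 */
static int
stack_depth(const struct frame *fp)
{
	int depth = 0;

	while (fp != NULL && depth < MAX_FRAMES) {
		depth++;
		fp = fp->next;
	}
	return (depth);
}
```

With the cap in place, a frame chain that loops back on itself yields `MAX_FRAMES` rather than an unbounded walk.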
* MFC r280822: Some cosmetic polishing. No functional change. | mav | 2015-04-05 | 1 | -5/+5
* MFC r279927: Make DIOCGATTR in device mode handle "GEOM::candelete". | mav | 2015-03-27 | 1 | -1/+3
* Merge r263233 from HEAD to stable/10: | rwatson | 2015-03-19 | 1 | -1/+1
  Update kernel inclusions of capability.h to use capsicum.h instead;
  some further refinement is required as some device drivers intended to
  be portable over FreeBSD versions rely on __FreeBSD_version to decide
  whether to include capability.h.

  Sponsored by: Google, Inc.
* MFC r277419: | mav | 2015-02-03 | 2 | -6/+7
  Allow skipping dmu_buf_will_dirty() call in dsl_dir_transfer_space().

  dsl_dir_transfer_space() is mostly called after dsl_dir_diduse_space(),
  which already calls dmu_buf_will_dirty() for the same dbuf and tx, so
  its duplicate call in those cases will change nothing, only spend time.

  Skipping this call reduces roughly fourfold the time spent in
  dbuf_write_done() and descendants, updating dataset statistics with
  several congested lock acquisitions. When rewriting 8K zvol blocks at
  1GB/s rate, this reduces CPU time spent inside dbuf_write_done(),
  according to profiling, from 45% of 683K samples to 18% of 422K.
* MFC r276123: | smh | 2015-02-01 | 1 | -3/+27
  Always sync the global ZFS config cache to reflect the new mosconfig

  MFC r277351:
  Clean ZFS spa config before syncing

  Sponsored by: Multiplay
* MFC r277185: | mav | 2015-01-28 | 1 | -1/+1
  Fix overflow bug from r248577, turning 30s TRIM timeout into ~4s.
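The commit message does not say where the overflow occurred. One plausible reading, and it is an assumption here, is a seconds-to-nanoseconds conversion carried out in 32-bit arithmetic: 30 × 10^9 wrapped modulo 2^32 is 4,230,196,224 ns, about 4.2 s, which agrees with the "~4s" quoted above. The function names below are hypothetical and the sketch is not taken from the ZFS source.

```c
#include <stdint.h>

#define	NANOSEC	1000000000LL	/* nanoseconds per second */

/* Product formed in 32 bits: the result wraps modulo 2^32. */
static int64_t
timeout_ns_narrow(uint32_t seconds)
{
	uint32_t ns = seconds * (uint32_t)NANOSEC;

	return ((int64_t)ns);
}

/* Widen to 64 bits before multiplying so the full product fits. */
static int64_t
timeout_ns_wide(uint32_t seconds)
{
	return ((int64_t)seconds * NANOSEC);
}
```

Under this assumption the fix amounts to widening one operand before the multiplication, which would be consistent with the one-line size of the change.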
* MFC r277169: Reimplement TRIM throttling added in r248577. | mav | 2015-01-28 | 1 | -51/+42
  The previous throttling implementation approached the problem from the
  wrong side. It significantly limited useful delaying of TRIM requests
  and aggregation potential, while doing little to control TRIM
  burstiness under heavy load.

  With this change random 4K write benchmarks (probably the worst case
  for TRIM) show me an IOPS increase of 20%, an average latency reduction
  of 30%, a threefold reduction in peak TRIM bursts, and the same peak
  TRIM map size (memory usage). Also, the new logic does not force the
  map size down so heavily, really allowing deleted data to be kept for
  32 TXG or 30 seconds under moderate load. That was practically
  impossible with the old throttling logic, which pushed the map down to
  only 64 segments.
* MFC r277096: Skip extra bcopy() when scrubbing vdev without redundancy. | mav | 2015-01-26 | 1 | -1/+2
  According to profiler, this bcopy() can use about 10% of CPU time.
* MFC r276983: When aggregating TRIM segments, move the new one to the end. | mav | 2015-01-25 | 1 | -0/+6
  A new segment at the list head may block all TRIM requests until the
  txg of that segment can be processed. On my random I/O tests this
  change reduces peak TRIM list length from 650 to 450 segments.
  Hopefully it should reduce TRIM burstiness when list processing is
  unblocked.
* MFC r276952: Add LBA as secondary sort key for synchronous I/O requests. | mav | 2015-01-25 | 1 | -0/+5
  On FreeBSD gethrtime() is implemented via getnanouptime(), which has
  1ms (1/hz) precision. That makes a collision of the primary sort key
  (timestamp) quite likely. In such situations sorting by a secondary key
  of LBA is much more reasonable than sorting by the totally meaningless
  zio pointer value.

  With this change on multi-threaded synchronous ZVOL read I've measured
  a 10% throughput increase and an average latency reduction.
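The ordering described above can be sketched as a three-level comparator. This is a simplified illustration, not the ZFS queue code: `struct io_req` and `io_req_compare()` are hypothetical names standing in for the real zio structure and its comparator.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for a queued I/O request. */
struct io_req {
	int64_t		timestamp;	/* coarse enqueue time; may collide */
	uint64_t	offset;		/* starting offset, the LBA-like key */
};

/*
 * Order by timestamp first, then by offset, and only as a last resort
 * by address, so that distinct requests never compare equal.
 */
static int
io_req_compare(const struct io_req *a, const struct io_req *b)
{
	if (a->timestamp < b->timestamp)
		return (-1);
	if (a->timestamp > b->timestamp)
		return (1);

	if (a->offset < b->offset)
		return (-1);
	if (a->offset > b->offset)
		return (1);

	if ((uintptr_t)a < (uintptr_t)b)
		return (-1);
	if ((uintptr_t)a > (uintptr_t)b)
		return (1);
	return (0);
}
```

When two requests share a timestamp, the one with the lower offset now sorts first regardless of where the structures happen to live in memory.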
* MFC r276913: Use new optimized dmu_read_uio_dbuf() for ZVOLs in device mode. | mav | 2015-01-25 | 1 | -1/+1
  This slightly reduces overhead by avoiding dnode_hold()/dnode_rele()
  calls.
* MFC r275923: | delphij | 2015-01-23 | 1 | -1/+3
  Add missing continue: we can't proceed further if the kernel does not
  panic with zfs_panic_recover.

  Illumos issue:
      5438 zfs_blkptr_verify should continue after zfs_panic_recover

  Reported by: Coverity
  CID: 1232014
* MFC r275922: MFV r275914: | delphij | 2015-01-23 | 1 | -6/+4
  As of r270383, the dbuf_compare comparator compares the dbuf attributes
  in the following order:

      db_level (indirect level)
      db_blkid (block number)
      db_state (current state)
      the address of the element

  Because db_state is considered before the element's address, a change
  of db_state would affect the balance of the AVL tree, even when the
  addresses of the elements compare differently. For instance, in
  dbuf_create, db_state may be altered after the node is inserted into
  the AVL tree, which may break the AVL tree's balance.

  Instead of using db_state as a comparison criterion (introduced in
  r270383), consider it only when we are doing a lookup, that is, when
  one of the two dbuf pointers contains DB_SEARCH.

  Illumos issue:
      5422 preserve AVL invariants in dn_dbufs
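A comparator with that property can be sketched as follows. The types and names are hypothetical simplifications (`ex_dbuf`, `EX_SEARCH`), and the choice that a search key sorts before real nodes with the same level and block id is an assumption of this sketch, not something the commit message states.

```c
#include <stdint.h>

/* EX_SEARCH marks a temporary lookup key, not a node stored in the tree. */
enum ex_state { EX_CACHED, EX_FILL, EX_SEARCH };

/* Hypothetical, simplified stand-in for a dbuf. */
struct ex_dbuf {
	uint8_t		level;
	uint64_t	blkid;
	enum ex_state	state;
};

static int
ex_dbuf_compare(const struct ex_dbuf *a, const struct ex_dbuf *b)
{
	if (a->level != b->level)
		return (a->level < b->level ? -1 : 1);
	if (a->blkid != b->blkid)
		return (a->blkid < b->blkid ? -1 : 1);

	/* State is consulted only to place a search key, never a real node. */
	if (a->state == EX_SEARCH && b->state != EX_SEARCH)
		return (-1);
	if (b->state == EX_SEARCH && a->state != EX_SEARCH)
		return (1);

	/* Mutable state of stored nodes plays no part; fall back to address. */
	if ((uintptr_t)a < (uintptr_t)b)
		return (-1);
	if ((uintptr_t)a > (uintptr_t)b)
		return (1);
	return (0);
}
```

Because the relative order of two stored nodes depends only on fields that do not change while they are in the tree, altering a node's state after insertion cannot invalidate its position.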
* MFC r275811: MFV r275783: | delphij | 2015-01-23 | 10 | -388/+394
  Convert ARC flags to use enum. Previously, public flags are defined in
  arc.h and private flags are defined in arc.c which can lead to
  confusion and programming errors.

  Consistently use 'hdr' (when referencing arc_buf_hdr_t) instead of
  'buf' or 'ab' because arc_buf_t are often named 'buf' as well.

  Illumos issue:
      5369 arc flags should be an enum
      5370 consistent arc_buf_hdr_t naming scheme
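As a rough illustration of the pattern, a single enum can hold every flag bit in one place, with header variables named `hdr` throughout. The flag names and values below are invented for the example and are not the actual ARC flags.

```c
/* Hypothetical flag set; every bit is declared in one enum. */
typedef enum ex_flags {
	EX_FLAG_WAIT		= 1 << 0,
	EX_FLAG_NOWAIT		= 1 << 1,
	EX_FLAG_PREFETCH	= 1 << 2,
	EX_FLAG_IN_HASH		= 1 << 3
} ex_flags_t;

/* Hypothetical stand-in for a buffer header. */
typedef struct ex_hdr {
	ex_flags_t	flags;
} ex_hdr_t;

/* Return nonzero if any bit of f is set in hdr. */
static int
ex_hdr_has(const ex_hdr_t *hdr, ex_flags_t f)
{
	return ((hdr->flags & f) != 0);
}

/* Set the bits of f in hdr. */
static void
ex_hdr_set(ex_hdr_t *hdr, ex_flags_t f)
{
	hdr->flags = (ex_flags_t)(hdr->flags | f);
}
```

Keeping all bits in one typed enum makes accidental reuse of a bit value easy to spot, which is the kind of confusion the commit message attributes to the split between arc.h and arc.c.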
* MFC r275782: MFV r275551: | delphij | 2015-01-23 | 31 | -806/+912
  Remove "dbuf phys" db->db_data pointer aliases.

  Use function accessors that cast db->db_data to the appropriate "phys"
  type, removing the need for clients of the dmu buf user API to keep
  properly typed pointer aliases to db->db_data in order to conveniently
  access their data.

  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_leaf.c:
      In zap_leaf() and zap_leaf_byteswap, now that the pointer alias
      field l_phys has been removed, use the db_data field in an on
      stack dmu_buf_t to point to the leaf's phys data.

  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:
      Remove the db_user_data_ptr_ptr field from dbuf and all logic to
      maintain it.

  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dbuf.h:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:
      Modify the DMU buf user API to remove the ability to specify a
      db_data aliasing pointer (db_user_data_ptr_ptr).

  cddl/contrib/opensolaris/cmd/zdb/zdb.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_bookmark.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deadlist.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deleg.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_destroy.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_prop.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_synctask.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_userhold.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_history.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_leaf.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dataset.h:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dir.h:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_impl.h:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_leaf.h:
      Create and use the new "phys data" accessor functions
      dsl_dir_phys(), dsl_dataset_phys(), zap_m_phys(), zap_f_phys(),
      and zap_leaf_phys().

  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dataset.h:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dir.h:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_impl.h:
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_leaf.h:
      Remove now unused "phys pointer" aliases to db->db_data from
      clients of the DMU buf user API.

  Illumos issue:
      5314 Remove "dbuf phys" db->db_data pointer aliases in ZFS
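The accessor pattern described in this commit can be sketched with stand-in types. Everything prefixed `ex_` is hypothetical and greatly simplified; only the idea of deriving the typed "phys" pointer from `db_data` on demand comes from the commit message.

```c
#include <stdint.h>

/* Simplified stand-in for a DMU buffer: just the untyped data pointer. */
typedef struct ex_buf {
	void	*db_data;
} ex_buf_t;

/* Hypothetical on-disk ("phys") layout of a directory object. */
typedef struct ex_dir_phys {
	uint64_t	used_bytes;
} ex_dir_phys_t;

/* In-core object: keeps the buffer, not a second typed alias to its data. */
typedef struct ex_dir {
	ex_buf_t	*dbuf;
} ex_dir_t;

/* Accessor: produce the typed view of db_data whenever it is needed. */
static inline ex_dir_phys_t *
ex_dir_phys(ex_dir_t *dd)
{
	return (dd->dbuf->db_data);
}
```

Since the typed pointer is computed at each use, there is no cached alias that has to be kept in step with `db_data`.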
* MFC r275781: MFV r275550: | delphij | 2015-01-23 | 4 | -25/+51
  In addition to r273158, make the code in spa_sync() that checks if the
  current TXG is a no-op TXG less fragile.

  Illumos issue:
      5347 idle pool may run itself out of space
* MFC r275748: MFV r247174: | delphij | 2015-01-23 | 1 | -10/+42
  Expose arc_meta_limit, et al via kstats.

  Note that as a result, vfs.zfs.arc_meta_used is removed. The existing
  vfs.zfs.arc_meta_limit sysctl/tunable is retained with a SYSCTL_PROC
  wrapper.

  Illumos ZFS issues:
      3561 arc_meta_limit should be exposed via kstats

  Relnotes: yes
* MFC r275740: MFV r275548: | delphij | 2015-01-23 | 2 | -4/+87
  Verify that the block pointer is structurally valid, before attempting
  to read it in. It can only be invalid in the case of a ZFS bug, but
  this change will help identify such bugs in a more transparent way, by
  panic'ing with a relevant message, rather than indexing off the end of
  an array or something.

  Illumos issue:
      5349 verify that block pointer is plausible before reading
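The general technique, checking a field before using it as an array index, can be sketched as below. The table, the field, and the function names are hypothetical, and this sketch returns NULL where the commit describes panicking with a relevant message.

```c
#include <stddef.h>
#include <stdint.h>

#define	EX_NUM_CHECKSUMS	4

/* Hypothetical lookup table indexed by a field read from disk. */
static const char *const ex_checksum_names[EX_NUM_CHECKSUMS] = {
	"off", "fletcher2", "fletcher4", "sha256"
};

/* Hypothetical, minimal stand-in for a block pointer. */
struct ex_blkptr {
	uint32_t	checksum;	/* index into ex_checksum_names */
};

/* Structural check: is every index field within its table? */
static int
ex_blkptr_valid(const struct ex_blkptr *bp)
{
	return (bp->checksum < EX_NUM_CHECKSUMS);
}

/* Look up the name only after the pointer has been validated. */
static const char *
ex_checksum_name(const struct ex_blkptr *bp)
{
	if (!ex_blkptr_valid(bp))
		return (NULL);	/* caller reports the bad field */
	return (ex_checksum_names[bp->checksum]);
}
```

An out-of-range value is caught at the point of validation, where the failure can be reported clearly, instead of surfacing later as a read past the end of the table.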
* MFC r275738: MFV r275546: | delphij | 2015-01-23 | 1 | -7/+22
  Reduce scrub activity when there is enough dirty data, namely when
  dirty data is more than
  zfs_vdev_async_write_active_min_dirty_percent (once we start to
  increase the number of concurrent async writes).

  While there, also correct a rounding error which would make scrub end
  up pausing for (zfs_txg_timeout + 1) seconds instead of the desired
  zfs_txg_timeout seconds.

  Illumos issue:
      5351 scrub goes for an extra second each txg
      5352 scrub should pause when there is some dirty data
* MFC r275737: MFV r275545: | delphij | 2015-01-23 | 1 | -1/+2
  If zio_checksum_error() returns other than ECKSUM (e.g. EINVAL), it
  does not fill in the "zio_bad_cksum_t *info" parameter. Caller should
  not attempt to use it in this case.

  Illumos issue:
      5348 zio_checksum_error() only fills in info if ECKSUM
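The calling convention described here can be sketched as follows. All names are hypothetical: `EX_ECKSUM` stands in for ECKSUM with an arbitrary value, and the byte-sum "checksum" exists only to keep the example short.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define	EX_ECKSUM	1000	/* hypothetical stand-in for ECKSUM */

typedef struct ex_bad_cksum {
	uint32_t	expected;
	uint32_t	actual;
} ex_bad_cksum_t;

/*
 * Returns 0 on a match, EINVAL on bad arguments, EX_ECKSUM on a
 * mismatch. *info is written only in the EX_ECKSUM case.
 */
static int
ex_checksum_error(const uint8_t *buf, size_t len, uint32_t expected,
    ex_bad_cksum_t *info)
{
	uint32_t sum = 0;
	size_t i;

	if (buf == NULL || len == 0)
		return (EINVAL);	/* info is left untouched */
	for (i = 0; i < len; i++)
		sum += buf[i];
	if (sum != expected) {
		info->expected = expected;
		info->actual = sum;
		return (EX_ECKSUM);
	}
	return (0);
}

/* Caller: read info only when the error says it was filled in. */
static uint32_t
ex_report(const uint8_t *buf, size_t len, uint32_t expected)
{
	ex_bad_cksum_t info;	/* deliberately not initialized */
	int error = ex_checksum_error(buf, len, expected, &info);

	if (error == EX_ECKSUM)
		return (info.actual);
	return (0);
}
```

The caller tests for the one error code that guarantees `info` was written, so the EINVAL path never reads the uninitialized structure.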