path: root/sys/cddl
Commit message  Author  Age  Files  Lines
* Fix the following -Werror warning from clang 3.5.0, while building ↵dim2014-11-191-0/+2
| cddl/lib/libctf:
|
| In file included from cddl/contrib/opensolaris/common/ctf/ctf_create.c:31:
| In file included from sys/cddl/contrib/opensolaris/uts/common/sys/sysmacros.h:34:
| sys/cddl/contrib/opensolaris/uts/common/sys/isa_defs.h:334:9: warning: '_ILP32' macro redefined [-Wmacro-redefined]
| #define _ILP32
|         ^
| <built-in>:26:9: note: previous definition is here
| #define _ILP32 1
|         ^
| 1 warning generated.
|
| This is because clang 3.5.0 started predefining _ILP32 and __ILP32__ for the i386 arch. (Earlier versions already predefined _LP64 and __LP64__ for the x86_64 arch.)
|
| Reviewed by: emaste, avg, smh, delphij, markj
| Differential Revision: https://reviews.freebsd.org/D1187
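A warning of this kind is commonly avoided by guarding the definition. The following is a minimal sketch of that idea, not the actual isa_defs.h change; the architecture test and the helper have_ilp32() are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for the i386 branch of a header like isa_defs.h. */
#if defined(__i386__) || defined(__i386)
#ifndef _ILP32		/* clang 3.5.0 and later predefine this on i386 */
#define _ILP32
#endif
#endif

/* Illustrative helper: returns 1 if the ILP32 macro is visible, else 0. */
static int
have_ilp32(void)
{
#ifdef _ILP32
	/* ILP32 means int, long and pointers are all 32 bits wide. */
	assert(sizeof(int) == 4 && sizeof(long) == 4 && sizeof(void *) == 4);
	return (1);
#else
	return (0);
#endif
}
```

With the guard in place the macro is defined exactly once, whether or not the compiler predefines it.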
* Make vfs.zfs.max_recordsize read-write at runtime.delphij2014-11-181-1/+1
| | | | MFC after: 2 weeks
* Add a tunable for spa_slop_shift which controls how much space wedelphij2014-11-181-0/+3
| | | | | | would reserve by default. Tuning is not recommended. MFC after: 2 weeks
* Allow tuning zfs_max_recordsize via loader tunable. Tuning is NOTdelphij2014-11-181-0/+5
| | | | | | | recommended. Requested by: Slawa Olhovchenkov <slw zxy spb ru> MFC after: 2 weeks
* l2arc: restore correct rounding up of asize of compressed dataavg2014-11-171-7/+9
| | | | | | | | | | | | | | | | | | | | | | | This rounding up was lost in a mismerge of illumos code. See r268075 MFV r267565. After that commit zio_compress_data() no longer performs any compressed size adjustment, so it needs to be done externally. On FreeBSD we round up the size using vdev_ashift rather than SPA_MINBLOCKSIZE so that 4KB devices are properly supported. Additionally, zero out the buffer tail only if compression succeeds. The compression is considered successful if the size of compressed data after rounding up to account for the vdev ashift is less than the original data size. It does not make sense to have the data compressed if all the savings are lost to rounding up. With the new zio_compress_data() it could have been possible that the rounded compressed size would be greater than the original size and thus we could zero beyond the allocated buffer if the zeroing code was kept at the original place. Discussed with: delphij, gibbs MFC after: 2 weeks X-MFC with: r274627
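The rounding and the success test described above can be sketched as follows. This is an illustration under stated assumptions: roundup_to_ashift() and compression_pays_off() are hypothetical names, not functions from arc.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round size up to a multiple of the vdev sector size (1 << ashift). */
static uint64_t
roundup_to_ashift(uint64_t size, unsigned ashift)
{
	uint64_t align;

	assert(ashift < 64);
	align = UINT64_C(1) << ashift;
	return ((size + align - 1) & ~(align - 1));
}

/*
 * Compression counts as successful only if the rounded-up compressed
 * size is still smaller than the original size.
 */
static bool
compression_pays_off(uint64_t orig_size, uint64_t csize, unsigned ashift)
{
	return (roundup_to_ashift(csize, ashift) < orig_size);
}
```

For example, 5000 compressed bytes from an 8192-byte buffer round up to 8192 on a device with ashift 12, so nothing is saved; with ashift 9 they round up to 5120 and the compression is worth keeping.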
* Revert r269093 which introduced physical zio alignment transformavg2014-11-171-2/+1
| | | | | | | | | Size of physical ZIOs must never be implicitly adjusted; it is the caller's responsibility to make sure that such a ZIO has proper offset and size. Discussed with: delphij, gibbs MFC after: 2 weeks
* Disable TRIM on file backed ZFS vdevs and fix TRIM on initsmh2014-11-176-9/+18
| | | | | | | | | | | | | | | | After r265152 TRIM requests are ZIO_TYPE_FREE instead of ZIO_TYPE_IOCTL. This meant file backed vdevs attempted to process the ZIO as a write, causing a panic. We now disable TRIM on file backed vdevs and ASSERT the ZIO types supported by each vdev type to ensure we explicitly support the ZIO type being processed. Also ensure that TRIM on init is not processed for devices which declare that they don't support TRIM via vdev_notrim. PR: 195061, 194976, 191573 Sponsored by: Multiplay
* Remove the no-at variants of the kern_xx() syscall helpers. E.g., wekib2014-11-131-2/+2
| | | | | | | | | | | | have both kern_open() and kern_openat(); change the callers to use kern_openat(). This removes one (sometimes two) levels of indirection and consolidates arguments checks. Reviewed by: mckusick Sponsored by: The FreeBSD Foundation MFC after: 1 week
* MFV r274273:delphij2014-11-1044-130/+420
| | | | | | | | | | | | | | | | ZFS large block support. Please note that booting from datasets that have recordsize greater than 128KB is not supported (but it's okay to enable the feature on the pool). This *may* remain unchanged because of memory constraints. A limited safety belt is provided for the mounted root filesystem, but caution is advised. Illumos issue: 5027 zfs large block support MFC after: 1 month
* MFV r274272 and diff reduction with upstream.delphij2014-11-099-50/+59
| | | | | | | | Illumos issue: 5244 zio pipeline callers should explicitly invoke next stage Tested with: ztest plus ZFS over GELI configuration MFC after: 1 month
* MFV r274271:delphij2014-11-081-19/+22
| | | | | | | | | | | | | | Improve zdb -b performance: - Reduce gethrtime() call to 1/100th of blkptr's; - Skip manipulating the size-ordered tree; - Issue more (10, previously 3) async reads; - Use lighter weight testing in traverse_visitbp(); Illumos issue: 5243 zdb -b could be much faster MFC after: 2 weeks
* fix l2arc compression buffers leakavg2014-11-061-10/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | We have observed that arc_release() can be called concurrently with a l2arc in-flight write. Also, we have observed that arc_hdr_destroy() can be called from arc_write_done() for a zio with ZIO_FLAG_IO_REWRITE flag in similar circumstances. Previously the l2arc headers would be freed while leaking their associated compression buffers. Now the buffers are placed on l2arc_free_on_write list for delayed freeing. This is similar to what was already done to arc buffers that were supposed to be freed concurrently with in-flight writes of those buffers. In addition to fixing the discovered leaks this change also adds some protective code to assert that a compression buffer associated with a l2arc header is never leaked. A new kstat l2_cdata_free_on_write is added. It keeps a count of delayed compression buffer frees which previously would have been leaks. Tested by: Vitalij Satanivskij <satan@ukr.net> et al Requested by: many MFC after: 2 weeks Sponsored by: HybridCluster / ClusterHQ
* Add to CTL support for logical block provisioning threshold notifications.mav2014-11-061-1/+53
| | | | | | | | For ZVOL-backed LUNs this allows informing initiators when the storage's used or available space rises above or falls below the configured thresholds. MFC after: 2 weeks Sponsored by: iXsystems, Inc.
* This change addresses 4 bugs in ZFS exposed by Richard Kojedzinszky'sjpaetzel2014-10-254-29/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | crash.sh script attached to FreeNAS bug 4109: https://bugs.freenas.org/issues/4109 Three are in the snapshot layer: a) AVG explains in his notes: https://wiki.freebsd.org/AvgVfsSolarisVsFreeBSD "VOP_INACTIVE must not do any destructive actions to a vnode and its filesystem node, nor invalidate them in any way." gfs_vop_inactive and zfsctl_snapshot_inactive did just that. In OpenSolaris VOP_INACTIVE is much closer to FreeBSD's VOP_RECLAIM. Rename & move them to gfs_vop_reclaim and zfsctl_snapshot_reclaim and merge in the requisite vnode_destroy from zfsctl_common_reclaim. b) gfs_lookup_dot and various zfsctl functions do not honor the FreeBSD VFS convention of only locking from the root downward. When looking up ".." the convention is to drop the current leaf vnode lock before acquiring the directory vnode and then subsequently re-acquiring the lock on the leaf vnode. This fixes that in all the places that are exercised by crash.sh. c) The snapshot may already be unmounted when the directory vnode is reclaimed. Check for this case and return. One in the common layer: d) Callers of traverse expect the reference to the vnode passed in to be maintained. Don't release it. This last one may be an unclear contract. There may in fact be some callers that do expect the reference to be dropped on success in addition to callers that expect it to be released. In this case a further audit of the callers is needed and a consensus on the correct behavior. PR: 184677 Submitted by: kmacy Reviewed by: delphij, will, avg MFC after: 2 weeks Sponsored by: iXsystems
* Whitespacejhibbits2014-10-241-1/+1
| | | | | X-MFC-with: r273570 MFC after: 1 week
* Three updates to PowerPC FBT:jhibbits2014-10-241-3/+15
| | | | | | | | | * Use a constant to define the number of stack frames in a probe exception. * Only allow function symbols in powerpc64 ('.' prefixed) * Set the fbtp_roffset for return probes, so the correct dtrace_probe call is made. MFC after: 1 week
* Fix multiple incorrect SYSCTL arguments in the kernel:hselasky2014-10-212-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | - Wrong integer type was specified. - Wrong or missing "access" specifier. The "access" specifier sometimes included the SYSCTL type, which it should not, except for procedural SYSCTL nodes. - Logical OR where binary OR was expected. - Properly assert the "access" argument passed to all SYSCTL macros, using the CTASSERT macro. This applies to both static- and dynamically created SYSCTLs. - Properly assert the data type for both static and dynamic SYSCTLs. In the case of static SYSCTLs we only assert that the data pointed to by the SYSCTL data pointer has the correct size, since there is no easy way to assert types in the C language outside a C-function. - Rewrote some code which doesn't pass a constant "access" specifier when creating dynamic SYSCTL nodes, which is now a requirement. - Updated "EXAMPLES" section in SYSCTL manual page. MFC after: 3 days Sponsored by: Mellanox Technologies
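The "logical OR where binary OR was expected" item and the size assertion can be illustrated with a small sketch. The flag values, the EX_ prefix, example_knob and both helper functions are hypothetical; the real flags live in <sys/sysctl.h> and the kernel uses CTASSERT rather than _Static_assert.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define EX_CTLFLAG_RD		0x80000000u
#define EX_CTLFLAG_MPSAFE	0x00040000u

/* Intended: bitwise OR keeps every flag bit. */
static uint32_t
combine_bitwise(uint32_t a, uint32_t b)
{
	return (a | b);
}

/* Bug pattern: '||' yields a truth value, so the flag bits are lost. */
static uint32_t
combine_logical(uint32_t a, uint32_t b)
{
	return (a || b);
}

/* Compile-time size check in the spirit of CTASSERT. */
int example_knob;
_Static_assert(sizeof(example_knob) == sizeof(int),
    "an int-typed sysctl needs an int-sized object");
```

The logical form compiles without complaint, which is why an assertion on the "access" argument is useful for catching it.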
* Add tunable vfs.zfs.space_map_blksz for space map's maximum block size.delphij2014-10-181-0/+4
| | | | MFC after: 2 weeks
* Follow up to r225617. In order to maximize the re-usability of kernel codedavide2014-10-161-1/+1
| | | | | | | | in userland, rename in-kernel getenv()/setenv() to kern_getenv()/kern_setenv(). This fixes a namespace collision with libc symbols. Submitted by: kmacy Tested by: make universe
* Prevent ZFS leaking pool free spacesmh2014-10-161-7/+8
| | | | | | | | | | | | | | | | | | | | | When processing async destroys ZFS would leak space every txg timeout (5 seconds by default), if no writes occurred, until the pool was totally full. At this point it would be unfixable without a pool recreation. In addition, if the machine was rebooted with the pool in this situation, the pool would fail to import on boot, hanging indefinitely, as the import process requires the ability to write data to the pool. Any attempts to query the pool status during the hung import would not return as the import holds the pool lock. The only way to import such a pool would be to specify -o readonly=on to the zpool import. zdb -bb <pool> can be used to check for "deferred free" size which is where this lost space will be counted. MFC after: 3 days Sponsored by: Multiplay
* Use write_psize instead of write_asize when doing vdev_space_update.delphij2014-10-131-1/+1
| | | | | | | | Without this change the accounting of L2ARC usage would be wrong and report 16EB of free space because the number became negative and wrapped around. Obtained from: FreeNAS (issue #6239) MFC after: 2 weeks
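The 16EB figure follows from unsigned 64-bit arithmetic. A minimal sketch, assuming the counter is a uint64_t and that more space is subtracted than was added; account() is a hypothetical helper, not the vdev_space_update() interface.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Apply one add and one remove to a usage counter.  If the amounts use
 * different size measures, 'removed' can exceed 'used + added' and the
 * unsigned result wraps around to just under 2^64 bytes (16 EiB).
 */
static uint64_t
account(uint64_t used, uint64_t added, uint64_t removed)
{
	return (used + added - removed);
}
```

Adding 4096 bytes and later removing 8192 from an empty counter leaves 2^64 - 4096, which a tool would print as roughly 16EB. Using the same size measure on both sides keeps the counter balanced.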
* Add a tunable for arc_shrink_shift (vfs.zfs.arc_shrink_shift) thatdelphij2014-10-131-0/+5
| | | | | | | | controls what fraction, 1/2^arc_shrink_shift, should be reclaimed when there is memory pressure. Submitted by: Richard Kojedzinszky <krichy at tvnetwork.hu> MFC after: 2 weeks
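The fraction 1/2^arc_shrink_shift amounts to a right shift. A minimal sketch with a hypothetical helper name; the shift of 5 used in the note below is only an example value, not a statement about the default.

```c
#include <assert.h>
#include <stdint.h>

/* Amount to reclaim: 1/2^shift of the current size. */
static uint64_t
shrink_amount(uint64_t size, unsigned shift)
{
	assert(shift < 64);
	return (size >> shift);
}
```

With an 8 GiB cache and a shift of 5, one reclaim step would target 8 GiB / 32 = 256 MiB. A larger shift reclaims less per step.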
* MFV r272804:delphij2014-10-094-63/+61
| | | | | | | | | Refactor the code and stop restore_object from creating two transactions. Illumos issue: 3693 restore_object uses at least two transactions to restore an object MFC after: 2 weeks
* MFV r272803:delphij2014-10-093-11/+78
| | | | | | | Illumos issue: 5175 implement dmu_read_uio_dbuf() to improve cached read performance MFC after: 2 weeks
* l2arc_write_buffers: reduce headroom valueavg2014-10-071-1/+1
| | | | | | | | | | | | | | FreeBSD has ARC_BUFC_NUMMETADATALISTS metadata lists and ARC_BUFC_NUMDATALISTS data lists (currently both are 16) while illumos has just a single list of each kind. headroom determines how much data is scanned on a single list during each run of the l2arc feed thread. Because FreeBSD has more lists we proportionally decrease the limit. Reviewed by: Brendan Gregg (earlier version) MFC after: 2 weeks Sponsored by: HybridCluster
* revert r272702: wrong (earlier) change was committedavg2014-10-071-3/+1
|
* reduce L2ARC_WRITE_SIZE on FreeBSDavg2014-10-071-1/+3
| | | | | | | | | | | | | | | | FreeBSD has ARC_BUFC_NUMMETADATALISTS metadata lists and ARC_BUFC_NUMDATALISTS data lists (currently both are 16) while illumos has just a single list of each kind. L2ARC_WRITE_SIZE determines the default value of l2arc_write_max which defines limits on how much data is scanned and written to a cache device during each run of the l2arc feed thread. The limits are applied on the per buffer list basis. Because FreeBSD has more lists we proportionally reduce the limits. Reviewed by: Brendan Gregg (earlier version) MFC after: 2 weeks Sponsored by: HybridCluster
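Because the limit applies per buffer list, the total handled in one feed run grows with the number of lists. The sketch below shows the proportional scaling under assumed numbers: the 8 MiB limit mentioned in the note and both helper names are hypothetical, and only the list count of 16 comes from the commit text.

```c
#include <assert.h>
#include <stdint.h>

#define EX_NUM_LISTS	16u	/* per the commit text: both counts are currently 16 */

/* Worst-case bytes handled in one feed run when the limit is per list. */
static uint64_t
feed_run_total(uint64_t per_list_limit, unsigned nlists)
{
	return (per_list_limit * nlists);
}

/* Scale a single-list limit down so the total stays roughly the same. */
static uint64_t
scaled_limit(uint64_t single_list_limit, unsigned nlists)
{
	assert(nlists > 0);
	return (single_list_limit / nlists);
}
```

An unscaled 8 MiB per-list limit across 16 lists could touch up to 128 MiB per run; dividing the limit by the list count brings the total back to 8 MiB.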
* make userland __assfail from opensolaris compat honor 'aok' variableavg2014-10-071-4/+8
| | | | | | This should allow the zdb -A option to actually make a difference. MFC after: 2 weeks
* MFV r272591:delphij2014-10-062-18/+37
| | | | | | | | | Use loaned ARC buffer for zfs receive to avoid copy. Illumos issue: 5162 zfs recv should use loaned arc buffer to avoid copy MFC after: 2 weeks
* MFV r272585:delphij2014-10-063-7/+20
| | | | | | | | | | Split the godfather zio into per-CPU zios to reduce lock contention. Illumos issue: 5176 lock contention on godfather zio MFC after: 2 weeks
* MFV r272501:delphij2014-10-061-44/+36
| | | | | | | Illumos issue: 5177 remove dead code from dsl_scan.c MFC after: 2 weeks
* MFV r272500:delphij2014-10-061-2/+8
| | | | | | | | | | | | Don't inherit flags other than DS_FLAG_CI_DATASET and DS_FLAG_INCONSISTENT when cloning. This prevents DS_FLAG_DEFER_DESTROY being inherited from a clone that is marked for deferred destroy, which causes snapshots of the clone to be destroyed when getting a hold or clone. Illumos issue: 5150 zfs clone of a defer_destroy snapshot causes strangeness MFC after: 1 week
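Restricting inheritance to a fixed set of flags is a bit-mask operation. A minimal sketch, assuming illustrative bit positions; the EX_ names and clone_inherited_flags() are hypothetical and the real definitions are in the ZFS headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only. */
#define EX_DS_FLAG_INCONSISTENT		(UINT64_C(1) << 0)
#define EX_DS_FLAG_DEFER_DESTROY	(UINT64_C(1) << 3)
#define EX_DS_FLAG_CI_DATASET		(UINT64_C(1) << 16)

/* Only these flags may pass from the origin to a new clone. */
#define EX_DS_INHERITED_MASK \
	(EX_DS_FLAG_INCONSISTENT | EX_DS_FLAG_CI_DATASET)

static uint64_t
clone_inherited_flags(uint64_t origin_flags)
{
	return (origin_flags & EX_DS_INHERITED_MASK);
}
```

An allow-list mask of this kind also keeps any flag added later from being inherited by accident.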
* Don't make nested definition for range_seg_cache.delphij2014-10-041-1/+1
| | | | | | Reported by: ian MFC after: 1 week X-MFC-With: r272506
* MFV r272499:delphij2014-10-041-0/+2
| | | | | | | Illumos issue: 5174 add sdt probe for blocked read in dbuf_read() MFC after: 2 weeks
* Add a new sysctl, vfs.zfs.vol.unmap_enabled, which allows the systemdelphij2014-10-041-0/+14
| | | | | | | | | administrator to toggle whether ZFS should ignore UNMAP requests. Illumos issue: 5149 zvols need a way to ignore DKIOCFREE MFC after: 2 weeks
* Diff reduction with upstream. The code change is not really applicabledelphij2014-10-041-3/+3
| | | | | | | | | to FreeBSD. Illumos issue: 5148 zvol's DKIOCFREE holds zfsdev_state_lock too long MFC after: 1 month
* MFV r272496:delphij2014-10-041-2/+11
| | | | | | | | | | | Add tunable for number of metaslabs per vdev (vfs.zfs.vdev.metaslabs_per_vdev). The default remains at 200. Illumos issue: 5161 add tunable for number of metaslabs per vdev MFC after: 2 weeks
* MFV r272495:delphij2014-10-042-1/+3
| | | | | | | | | | In arc_kmem_reap_now(), reap range_seg_cache too to reclaim memory in response to memory pressure. Illumos issue: 5163 arc should reap range_seg_cache MFC after: 1 week
* MFV r272494:delphij2014-10-043-113/+50
| | | | | | | | | | | | | Make space_map_truncate() always do space_map_reallocate(). Without this, setting space_map_max_blksz would cause a panic for an existing pool, as dmu_objset_set_blocksize would fail if the object has multiple blocks. Illumos issues: 5164 space_map_max_blksz causes panic, does not work 5165 zdb fails assertion when run on pool with recently-enabled spacemap_histogram feature MFC after: 2 weeks
* Refactor ZFS ARC reclaim checks and limitssmh2014-10-033-107/+74
| | | | | | | | | | | | | | | | | | | | | | | | | | | Remove previously added kmem methods in favour of defines which allow diff minimisation between upstream code base. Rebalance ARC free target to be vm_pageout_wakeup_thresh by default which eliminates issue where ARC gets minimised instead of balancing with VM pageout. This restores the target point prior to r270759. Bring in missing upstream only changes which move unused code to further eliminate code differences. Add additional DTRACE probe to aid monitoring of ARC behaviour. Enable upstream i386 code paths on platforms which don't define UMA_MD_SMALL_ALLOC. Fix mixture of byte and page values in arc_memory_throttle i386 code path value assignment of available_memory. PR: 187594 Review: D702 Reviewed by: avg MFC after: 1 week X-MFC-With: r270759 & r270861 Sponsored by: Multiplay
* Fix various issues with zvolssmh2014-10-033-7/+54
| | | | | | | | | | | | | | | | | | | | | | When performing snapshot renames we could deadlock due to the locking in zvol_rename_minors. In order to avoid this use the same workaround as zvol_open in zvol_rename_minors. Add missing zvol_rename_minors to dsl_dataset_promote_sync. Protect against invalid index into zv_name in zvol_remove_minors. Replace zvol_remove_minor calls with zvol_remove_minors to ensure any potential children are also renamed. Don't fail zvol_create_minors if zvol_create_minor returns EEXIST. Restore the valid pool check in zfs_ioc_destroy_snaps to ensure we don't call zvol_remove_minors when zfs_unmount_snap fails. PR: 193803 MFC after: 1 week Sponsored by: Multiplay
* Fix failures and warnings reported by newpynfs20090424 test tool.araujo2014-10-031-0/+1
| | | | | | | | | | | This fix addresses only issues with the pynfs reports; none of these issues are known to create problems for extant real clients. Submitted by: Bart Hsiao <bart.hsiao@gmail.com> Reworked by: myself Reviewed by: rmacklem Approved by: rmacklem Sponsored by: QNAP Systems Inc.
* Diff reduction with kernel code: instruct the compiler that the data ofdelphij2014-10-021-0/+19
| | | | | | | | these types may be unaligned to their "normal" alignment and exercise caution when accessing them. PR: 194071 MFC after: 3 days
* zfsvfs_create(): Refuse to mount datasets whose names are too long.will2014-10-011-0/+11
| | | | | | | | | | | | | | | | | | This is checked for in the zfs_snapshot_004_neg STF/ATF test (currently still in projects/zfsd rather than head). sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c: - zfsvfs_create(): Check whether the objset name fits into statfs.f_mntfromname, and return ENAMETOOLONG if not. Although the filesystem can be unmounted via the umount(8) command, any interface that relies on iterating on statfs (e.g. libzfs) will fail to find the filesystem by its objset name, and thus assume it's not mounted. This causes "zfs unmount", "zfs destroy", etc. to fail on these filesystems, whether or not -f is passed. MFC after: 1 month Sponsored by: Spectra Logic MFSpectraBSD: 974872 on 2013/08/09
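The check described for zfsvfs_create() can be sketched as a length test against the size of the mount-from-name field. EX_MNAMELEN and check_objset_name() are hypothetical, and 88 is an assumed buffer size used only for the example.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define EX_MNAMELEN	88	/* assumed size of statfs.f_mntfromname */

/* Return ENAMETOOLONG if the name cannot fit, terminating NUL included. */
static int
check_objset_name(const char *name)
{
	assert(name != NULL);
	if (strlen(name) >= EX_MNAMELEN)
		return (ENAMETOOLONG);
	return (0);
}
```

Refusing the mount up front avoids a truncated name that later lookups by objset name would fail to match.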
* Fix a mismerge in r260183 which prevents snapshot zvol devices beingdelphij2014-09-301-2/+6
| | | | | | | removed and re-instate the fix in r242862. Reported by: Leon Dang <ldang nahannisys com>, smh MFC after: 3 days
* Remove sys/types.h include as per style(9)smh2014-09-181-1/+0
| | | | | SDT requires sys/param.h due to use of NULL Reported by: Garrett Sponsored by: Multiplay
* Add dtrace probe support for zfs SET_ERROR(..)smh2014-09-182-1/+47
| | | | | MFC after: 1 week Sponsored by: Multiplay
* Remove debug.zfs_flags in favor of the new vfs.zfs.debug_flags.will2014-09-181-5/+1
| | | | | | Replace TUNABLE_INT with CTLFLAG_RWTUN. Submitted by: avg (debug.zfs_flags removal), smh (TUNABLE_INT replacement)
* Enable ZFS debug flags to be modified via vfs.zfs.debug_flags.will2014-09-181-0/+27
| | | | | | | | | | This is primarily only of interest to ZFS developers, but it makes it easier to get additional debugging. Submitted by: gibbs MFC after: 1 month Sponsored by: Spectra Logic MFSpectraBSD: 517074 on 2011/12/15 (by will), 662343 on 2013/03/20 (by gibbs)
* Reorder sysctls for spa.c global tunables; add sysctl for ccw_retry_interval.will2014-09-181-4/+8
| | | | | MFC after: 1 month Sponsored by: Spectra Logic