path: root/sys/cddl/contrib
Commit message [Author, Age, Files, Lines]
* MFC r269189: [kib, 2014-08-04, 2 files, -4/+3]
  Initialize zfs vnode v_hash when the vnode is allocated.
* MFC r268865: MFV r268852: [delphij, 2014-08-02, 7 files, -17/+130]
  Reduce lock contention on the z_teardown_lock under heavily cached read workloads by splitting the single teardown rrw lock into RRM_NUM_LOCKS (17) of them. Read acquisitions are randomly distributed among these locks based on the curthread pointer. Write acquisitions take all of the locks, which should be rare for the way this type of lock is used.

  Illumos issue: 5008 lock contention (rrw_exit) while running a read only load
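The lock-splitting idea in the commit above can be sketched as follows. This is a single-threaded bookkeeping model, not a working lock: plain counters stand in for the real rrw locks, and the names `rrm_model_t`, `rrm_pick`, `rrm_model_enter_read`, `rrm_model_exit_read` and `rrm_model_enter_write` are hypothetical. Only `RRM_NUM_LOCKS` and its value of 17 come from the commit message; reducing the thread pointer with a modulo is an assumption about how "based on curthread pointer" might be done.

```c
#include <assert.h>
#include <stdint.h>

#define RRM_NUM_LOCKS 17	/* value quoted in the commit message */

/* Hypothetical model: counters stand in for the real rrw sub-locks. */
typedef struct rrm_model {
	int readers[RRM_NUM_LOCKS];	/* read holds per sub-lock */
	int writer;			/* 1 while a writer holds every sub-lock */
} rrm_model_t;

/* Map a thread identity (here: any pointer) to one sub-lock (assumed scheme). */
static unsigned
rrm_pick(const void *thread_tag)
{
	return ((unsigned)((uintptr_t)thread_tag % RRM_NUM_LOCKS));
}

/* A reader touches exactly one sub-lock; returns the index it used. */
static unsigned
rrm_model_enter_read(rrm_model_t *m, const void *thread_tag)
{
	unsigned i = rrm_pick(thread_tag);

	assert(!m->writer);	/* a real lock would block here instead */
	m->readers[i]++;
	return (i);
}

static void
rrm_model_exit_read(rrm_model_t *m, unsigned i)
{
	assert(i < RRM_NUM_LOCKS && m->readers[i] > 0);
	m->readers[i]--;
}

/* A writer must visit every sub-lock; returns how many it visited. */
static int
rrm_model_enter_write(rrm_model_t *m)
{
	int i;

	for (i = 0; i < RRM_NUM_LOCKS; i++)
		assert(m->readers[i] == 0);	/* a real lock would block here */
	m->writer = 1;
	return (RRM_NUM_LOCKS);
}
```

Readers on different threads usually land on different sub-locks and so rarely contend with each other, while a writer pays for visiting all 17, which is acceptable only because writes are expected to be rare.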
* MFC r268859: MFV r268851: [delphij, 2014-08-02, 5 files, -6/+44]
  When a sync task is waiting for a txg to complete, we should hurry it along by increasing the number of outstanding async writes (i.e. make vdev_queue_max_async_writes() return a larger number).

  Illumos issue: 4753 increase number of outstanding async writes when sync task is waiting
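A minimal sketch of the behavior described in the commit above, under stated assumptions: the struct `async_write_limits_t`, its fields, and the function `max_async_writes` are hypothetical names, and the linear scaling between a minimum and a maximum based on the amount of dirty data is an assumption made for illustration. The only part taken from the commit message is that a waiting sync task makes the function return a larger number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tunables; the names are illustrative only. */
typedef struct async_write_limits {
	uint32_t min_active;	/* outstanding writes when little data is dirty */
	uint32_t max_active;	/* outstanding writes when much data is dirty */
	uint64_t dirty_min;	/* at or below this many dirty bytes: min_active */
	uint64_t dirty_max;	/* at or above this many dirty bytes: max_active */
} async_write_limits_t;

static uint32_t
max_async_writes(const async_write_limits_t *l, uint64_t dirty,
    bool sync_task_waiting)
{
	assert(l->min_active <= l->max_active && l->dirty_min < l->dirty_max);

	/* A waiting sync task gets the largest limit right away. */
	if (sync_task_waiting)
		return (l->max_active);
	if (dirty <= l->dirty_min)
		return (l->min_active);
	if (dirty >= l->dirty_max)
		return (l->max_active);

	/* Assumed policy: scale linearly between the two limits. */
	return (l->min_active + (uint32_t)((dirty - l->dirty_min) *
	    (l->max_active - l->min_active) /
	    (l->dirty_max - l->dirty_min)));
}
```

With limits of 1 and 10 writes between 100 and 200 dirty bytes, a request at 150 dirty bytes yields 5 normally and 10 when a sync task is waiting.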
* MFC r268858: MFV r268850: [delphij, 2014-08-02, 3 files, -81/+64]
  Change the interaction between the DMU and ARC so that when the DMU is shutting down an objset, we do not evict the data from the ARC. Instead we simply coordinate the destruction of the DMU's data with the ARC.

  The only case where we actually need to explicitly evict from the ARC is when dbuf_rele_and_unlock() determines that the administrator has requested that it not be kept in memory, via the primarycache/secondarycache properties. In this case, we evict the data from the ARC by its blkptr_t, the same way we explicitly evict a block from the ARC when it is freed.

  Illumos issue: 4631 zvol_get_stats triggering too many reads
* MFC r268855: MFV r268848: [delphij, 2014-08-02, 7 files, -39/+93]
  Instead of asserting that all zios are properly aligned, only assert on the logical ones. Cap uberblocks at 8k; otherwise, with ashift=17, there would be only one uberblock.

  This fixes a problem where zdb would trip an assert on pools with ashift >= 0xe (8k).

  While there, also change the code so that it only attempts to condense the space map when the uncondensed size consumes more than zfs_metaslab_condense_block_threshold blocks.

  Illumos issue: 4958 zdb trips assert on pools with ashift >= 0xe
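The arithmetic behind the uberblock cap in the commit above can be sketched as follows. The 8k cap and the ashift=17 case come from the commit message; the 128 KiB ring size and the 1 KiB minimum slot size are assumptions consistent with "only one uberblock" at ashift=17, and the names `ub_slot_shift` and `ub_slot_count` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define UB_RING_BYTES	(128u << 10)	/* assumed ring size: 128 KiB */
#define UB_MIN_SHIFT	10u		/* assumed minimum slot size: 1 KiB */
#define UB_MAX_SHIFT	13u		/* the 8k cap from the commit message */

/* log2 of the size of one uberblock slot for a given ashift. */
static uint32_t
ub_slot_shift(uint32_t ashift, int capped)
{
	uint32_t shift = ashift < UB_MIN_SHIFT ? UB_MIN_SHIFT : ashift;

	if (capped && shift > UB_MAX_SHIFT)
		shift = UB_MAX_SHIFT;
	return (shift);
}

/* Number of uberblock slots that fit in the ring. */
static uint32_t
ub_slot_count(uint32_t ashift, int capped)
{
	uint32_t shift = ub_slot_shift(ashift, capped);

	assert(shift <= 17);	/* a slot may not exceed the ring itself */
	return (UB_RING_BYTES >> shift);
}
```

Under these assumptions, an ashift of 17 leaves a single slot without the cap and 16 slots with it, while small ashift values are unaffected.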
* MFC r264434: [markj, 2014-07-31, 3 files, -11/+248]
  DTrace's pid provider works by inserting breakpoint instructions at probe sites and installing a hook at the kernel's trap handler. The fasttrap code will emulate the overwritten instruction in some common cases, but otherwise copies it out into some scratch space in the traced process' address space and ensures that it's executed after returning from the trap.

  In Solaris and illumos, this (per-thread) scratch space comes from some reserved space in TLS, accessible via the fs segment register. This approach is somewhat unappealing on FreeBSD since it would require some modifications to rtld and jemalloc (for static TLS) to ensure that TLS is executable, and would thus introduce dependencies on their implementation details. I think it would also be impossible to safely trace static binaries compiled without these modifications.

  This change implements the functionality in a different way, by having fasttrap map pages into the target process' address space on demand. Each page is divided into 64-byte chunks for use by individual threads, and fasttrap's process descriptor struct has been extended to keep track of any scratch space allocated for the corresponding process.

  With this change it's possible to trace all libc functions in a program, e.g. with

      pid$target:libc.so.*::entry {@[probefunc] = count();}

  Previously this would generally cause the victim process to crash, as tracing memcpy on amd64 requires the functionality described above.
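The per-page chunking described in the commit above might be modeled like this. It is a sketch under stated assumptions, not the fasttrap implementation: the 64-byte chunk size is from the commit message, the 4096-byte page size is assumed, and `scratch_page_t`, `scratch_alloc` and `scratch_free` are hypothetical names. A 64-bit bitmap works here only because 4096 / 64 happens to be 64.

```c
#include <assert.h>
#include <stdint.h>

#define SCR_PAGE_SIZE	4096u	/* assumed page size */
#define SCR_CHUNK_SIZE	64u	/* chunk size from the commit message */
#define SCR_CHUNKS	(SCR_PAGE_SIZE / SCR_CHUNK_SIZE)	/* 64 chunks */

/* Hypothetical bookkeeping for one page mapped into the traced process. */
typedef struct scratch_page {
	uintptr_t base;	/* address of the page in the traced process */
	uint64_t used;	/* bit i set: chunk i has been handed out */
} scratch_page_t;

/* Return the address of a free chunk, or 0 if the page is full. */
static uintptr_t
scratch_alloc(scratch_page_t *p)
{
	unsigned i;

	for (i = 0; i < SCR_CHUNKS; i++) {
		if ((p->used & ((uint64_t)1 << i)) == 0) {
			p->used |= (uint64_t)1 << i;
			return (p->base + (uintptr_t)i * SCR_CHUNK_SIZE);
		}
	}
	return (0);
}

/* Give a chunk back, e.g. when its thread exits. */
static void
scratch_free(scratch_page_t *p, uintptr_t addr)
{
	unsigned i = (unsigned)((addr - p->base) / SCR_CHUNK_SIZE);

	assert(addr >= p->base && i < SCR_CHUNKS);
	assert((addr - p->base) % SCR_CHUNK_SIZE == 0);
	p->used &= ~((uint64_t)1 << i);
}
```

When `scratch_alloc` returns 0 the caller would map another page on demand, which is the part of the design this sketch leaves out.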
* MFC r268720: MFV r268714: [delphij, 2014-07-29, 1 file, -13/+61]
  Improve extreme rewind import.

  When doing an "extreme rewind" import ("zpool import -XF"), we attempt to verify all data in the pool, essentially scrubbing the entire pool. The problem is that spa_load_verify_cb() issues an unbounded number of concurrent scrub i/os. This can lead to all of memory being used for these zios, wedging the system.

  Like normal scrub, we need to put a cap on the number of outstanding i/os, and have the traverse thread block when we reach this cap. For this purpose the cap can be very large (10,000) to optimize the elevator algorithm.

  Three kernel tunables have been added:
    vfs.zfs.spa_load_verify_maxinflight
    vfs.zfs.spa_load_verify_metadata
    vfs.zfs.spa_load_verify_data

  The latter two tunables control whether metadata and/or user data are verified when doing extreme rewind.

  Make 'zpool import -T' imply scrub.
  Make 'zpool import -T <txg>' accept hexadecimal values for the txg when prefixed with 0x.
  Skip txgs for which there is no uberblock when doing extreme rewind.
  Skip reading all user data twice by skipping prefetches when doing extreme rewinds, as we do not access the data via the ARC.

  Illumos issues:
    4970 need controls on i/o issued by zpool import -XF
    4971 zpool import -T should accept hex values
    4972 zpool import -T implies extreme rewind, and thus a scrub
    4973 spa_load_retry retries the same txg
    4974 spa_load_verify() reads all data twice
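One way to accept a txg in either decimal or 0x-prefixed hexadecimal, as the commit above describes for 'zpool import -T', is to let the standard `strtoull` pick the base. This is a sketch of the parsing idea only; `parse_txg` is a hypothetical helper, and whether the actual change is written this way is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a txg given as decimal or as 0x-prefixed hex.
 * Returns 0 and stores the value in *txg, or -1 on malformed input.
 */
static int
parse_txg(const char *arg, uint64_t *txg)
{
	char *end;
	unsigned long long v;

	assert(arg != NULL && txg != NULL);
	errno = 0;
	v = strtoull(arg, &end, 0);	/* base 0: a "0x" prefix selects hex */
	if (errno != 0 || end == arg || *end != '\0')
		return (-1);
	*txg = (uint64_t)v;
	return (0);
}
```

A side effect of base 0 is that a plain leading zero selects octal, so "010" parses as 8; a stricter parser would have to check the prefix itself.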
* MFC r268713: MFV r268702: [delphij, 2014-07-29, 5 files, -1/+19]
  Add missing *_destroy() calls in various places with ZFS.

  Illumos issue: 4975 missing mutex_destroy() calls in zfs
* MFC r268420: [mav, 2014-07-24, 1 file, -1/+1]
  Remove IO_SYNC flag when writing extended file attributes on ZFS.

  While it is possible to create and write a file, modify its permissions, etc. without ever doing a sync, it looks odd that one is required for setting extended file attributes on ZFS. UFS does not do a sync there either.

  Samba uses those extended attributes to store some of its data, and writing them synchronously reduces file creation performance many times over on systems without a SLOG device.
* MFC r268473: MFV r268455: [delphij, 2014-07-23, 19 files, -70/+160]
  Use reserved space for ZFS administrative commands.
* MFC r268464: MFV r268452: [delphij, 2014-07-23, 7 files, -21/+66]
  Explicitly mark file removal transactions as "presumed to result in a net free of space" so they will not fail with ENOSPC.

  Illumos issue: 4950 files sometimes can't be removed from a full filesystem
* MFC r268116: [delphij, 2014-07-17, 2 files, -73/+104]
  - Fix handling of "new" style of ioctl in compatibility mode [1];
  - Reorganize code and reduce diff from upstream;
  - Improve forward compatibility shims for previous kernel;

  Reported by: sbruno [1]
* MFC r268097: [pfg, 2014-07-16, 1 file, -15/+13]
  MFV r260708
  4427 pid provider rejects probes with valid UTF-8 names

  This makes use of Solaris' u8_validate(), which we happen to have used since r185029 for ZFS. Use of u8_textprep.c required -Wno-cast-qual for powerpc.

  Illumos Revision: 1444d846b126463eb1059a572ff114d51f7562e5
  Reference: https://www.illumos.org/issues/4427
  Obtained from: Illumos
* MFC r268128: MFV r268122: [delphij, 2014-07-15, 3 files, -1/+10]
  4929 want prevsnap property
* MFC r268126: MFV r268121: [delphij, 2014-07-15, 4 files, -35/+32]
  4924 LZ4 Compression for metadata
* MFC r268123: MFV r268119: [delphij, 2014-07-15, 24 files, -109/+109]
  4914 zfs on-disk bookmark structure should be named *_phys_t
* MFC r268086: MFV r267570: [delphij, 2014-07-15, 1 file, -3/+22]
  4756 metaslab_group_preload() could deadlock
* MFC r268085: MFV r267569: [delphij, 2014-07-15, 1 file, -5/+13]
  4897 Space accounting mismatch in L2ARC/zpool
* MFC r268082: MFV r267567: [delphij, 2014-07-15, 1 file, -15/+20]
  4881 zfs send performance degradation when embedded block pointers are encountered
* MFC r268079: MFV r267566: [delphij, 2014-07-15, 16 files, -157/+294]
  4390 i/o errors when deleting filesystem/zvol can lead to space map corruption
* MFC r268075: MFV r267565: [delphij, 2014-07-15, 35 files, -223/+1109]
  4757 ZFS embedded-data block pointers ("zero block compression")
  4913 zfs release should not be subject to space checks
* MFC r266771: MFV r266766: [delphij, 2014-07-15, 6 files, -22/+76]
  Add a new zfs property, "redundant_metadata", which can have values "all" or "most". The default will be "all", which is the current behavior.

  When set to "all", ZFS stores an extra copy of all metadata. If a single on-disk block is corrupt, at worst a single block of user data (which is recordsize bytes long) can be lost.

  Setting it to "most" will cause us to only store 1 copy of level-1 indirect blocks of user data files. This can improve performance of random writes, because less metadata has to be written. In practice, at worst about 100 blocks (of recordsize bytes each) of user data can be lost if a single on-disk block is corrupt. The exact behavior of which metadata blocks are stored redundantly may change in future releases.

  Illumos issue: 3835 zfs need not store 2 copies of all metadata
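The policy described in the commit above can be sketched as a small decision function. This is an illustration under stated assumptions rather than the ZFS write policy itself: `redundant_md_t` and `block_copies` are hypothetical names, the upper bound of 3 copies per block is assumed, and the commit message itself notes that the exact set of redundantly stored blocks may change.

```c
#include <assert.h>

#define MAX_COPIES 3	/* assumed upper bound on copies of one block */

/* Hypothetical stand-in for the two property values. */
typedef enum redundant_md {
	REDUNDANT_MD_ALL,
	REDUNDANT_MD_MOST
} redundant_md_t;

/*
 * Number of copies to write for one block.
 * copies:       the base number of copies configured for user data
 * is_user_file: nonzero if the block belongs to a user data file
 * level:        0 for data blocks, 1 and up for indirect blocks
 */
static int
block_copies(int copies, int is_user_file, int level, redundant_md_t policy)
{
	int extra = 1;	/* metadata normally gets one extra copy */

	assert(copies >= 1 && copies <= MAX_COPIES && level >= 0);
	if (is_user_file && level == 0)
		extra = 0;	/* user data itself, not metadata */
	else if (policy == REDUNDANT_MD_MOST && is_user_file && level == 1)
		extra = 0;	/* "most": level-1 indirect blocks, no extra */
	return (copies + extra > MAX_COPIES ? MAX_COPIES : copies + extra);
}
```

With a base of one copy, a level-1 indirect block of a user file is written twice under "all" and once under "most", which is where the saving on random writes comes from.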
* MFC r268290: [pfg, 2014-07-13, 1 file, -14/+64]
  Merge from OpenSolaris (24-Jul-2010):
    6679140 asymmetric alloc/dealloc activity can induce dynamic variable drops
    6679193 dtrace_dynvar walker produces flood of dtrace_dynhash_sink

  This finishes a set of merges from the older OpenSolaris releases. The FreeBSD port still has many differences that are difficult to account for, but that seems normal given that the kernels are different.

  Obtained from: OpenSolaris (through Illumos)
* MFC 267929, 267937, 267939, 267940, 267941, 267942, 267987, 268006: [rpaulo, 2014-07-12, 5 files, -80/+1070]
  2915 DTrace in a zone should see "cpu", "curpsinfo", et al
  2916 DTrace in a zone should be able to access fds[]
  2917 DTrace in a zone should have limited provider access
  4477 DTrace should speak JSON
  Add stubs for CTF functions which are not yet implemented.
  4474 DTrace Userland CTF Support
  4475 DTrace userland Keyword
  4476 DTrace tests should be better citizens
  4479 pid provider types
  4480 dof emulation is missing checks
  4471 DTrace count() with histogram
  4472 DTrace full width distribution histograms
  4473 DTrace frequency trails
* MFC r268130, r268224, r268230, r268231: [pfg, 2014-07-12, 3 files, -9/+48]
  Various DTrace Merges from OpenSolaris/Illumos:
    15-Sep-2008: 6735480 race between probe enabling and provider registration
    20-Apr-2008: 6822482 DOF validation needs to handle loadable sections flagged as unloadable
    22-Apr-2009: 6823388 DTrace ioctl handlers must validate all structure members
    30-Jun-2009: 6851093 system drops to kmdb with anonymous dtrace probes + kmdb

  Obtained from: OpenSolaris
* MFC r268125: [pfg, 2014-07-06, 1 file, -2/+4]
  Small merges from OpenSolaris:

  These have no effect on FreeBSD (in fact they are ifdef'ed out), but they make future merges easier:
    6699767 panic in spec_open()
    6718877 crgetzoneid() use can cause problems when forking processes with USDT providers in a non global zone
* MFC r268178: [mav, 2014-07-05, 1 file, -0/+4]
  Fix bug in sync control in new "dev" mode of ZVOL (r265678).

  Don't check the ZVOL_WCE flag, used in Solaris to control the device "write cache". It is not applicable on FreeBSD and is set to "disable" by default.
* MFC r268014: [pfg, 2014-07-02, 1 file, -7/+6]
  Reduce some warnings in the Solaris unicode support.

  Clean up some warnings about parentheses and minor style issues.
* MFC r267029, r267038: [mav, 2014-06-17, 1 file, -0/+4]
  Replace gethrtime() with cpu_ticks() as the source of randomness for the taskqueue selection.

  gethrtime() in our port is updated at the HZ rate, so it is unusable for this specific purpose and completely negates the benefit of multiple taskqueues.
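The problem described in the commit above can be illustrated with a small model. It assumes the queue is chosen by reducing a time sample modulo the number of queues; `coarse_clock_ns` and `pick_taskq` are hypothetical helpers, not the functions named in the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical coarse clock: nanoseconds rounded down to the last tick. */
static uint64_t
coarse_clock_ns(uint64_t now_ns, uint64_t hz)
{
	uint64_t tick_ns;

	assert(hz > 0 && hz <= 1000000000ull);
	tick_ns = 1000000000ull / hz;
	return ((now_ns / tick_ns) * tick_ns);
}

/* Assumed selection scheme: reduce a sample modulo the number of queues. */
static unsigned
pick_taskq(uint64_t sample, unsigned ntaskq)
{
	assert(ntaskq > 0);
	return ((unsigned)(sample % ntaskq));
}
```

With a 1000 Hz clock, every request issued within the same millisecond sees the same sample and lands on the same queue, whereas a finely grained counter spreads those requests across the queues.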
* MFC r266915: MFV 266913+266914: [delphij, 2014-06-06, 1 file, -5/+12]
  3897 zfs filesystem and snapshot limits (fix leak)
  4901 zfs filesystem/snapshot limit leaks
* MFC r264885 [smh, 2014-05-26, 1 file, -26/+30]
  Eliminate duplicate checks in vdev_geom_io_intr error handling

  Sponsored by: Multiplay
* MFC r262329: [markj, 2014-05-25, 1 file, -15/+33]
  Define the KM_NORMALPRI flag for kmem_alloc(), as it is used in some upstream DTrace code.

  MFC r262330:
  1452 DTrace buffer autoscaling should be less violent

  illumos/illumos-gate@6fb4854bed54ce82bd8610896b64ddebcd4af706
* MFC r264850 [smh, 2014-05-15, 2 files, -18/+73]
  Add the ability to set a minimum ashift size for ZFS pool creation or root level vdev addition.

  Change the max_auto_ashift sysctl to return an error when an invalid value is requested instead of silently limiting it.

  Sponsored by: Multiplay
* MFC r262665: [markj, 2014-05-15, 1 file, -1/+2]
  Expose a few DTrace parameters as sysctls under kern.dtrace and add descriptions for several existing sysctls.

  PR: 187027
* MFC r265458: [delphij, 2014-05-09, 1 file, -1/+2]
  Import George Wilson's change for Illumos #4730:

  4730 metaslab group taskq should be destroyed in metaslab_group_destroy()

  Reviewed by: Alex Reece <alex.reece@delphix.com>
  Reviewed by: Matthew Ahrens <mahrens@delphix.com>
  Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
  Original author: George Wilson
* MFC r264836 (MFV r264830): [delphij, 2014-05-09, 2 files, -7/+8]
  4745 fix AVL code misspellings
* MFC r264835 (MFV r264829): [delphij, 2014-05-09, 13 files, -25/+871]
  3897 zfs filesystem and snapshot limits
* MFC r264671 (MFV r264668): [delphij, 2014-05-09, 3 files, -57/+7]
  4754 io issued to near-full luns even after setting noalloc threshold
  4755 mg_alloc_failures is no longer needed

  illumos/illumos@b6240e830b871f59c22a3918aebb3b36c872edba
* MFC r264669: MFV r264666: [delphij, 2014-05-09, 17 files, -168/+156]
  4374 dn_free_ranges should use range_tree_t

  illumos/illumos-gate@bf16b11e8deb633dd6c4296d46e92399d1582df4
* MFC r264145: [mav, 2014-05-08, 3 files, -117/+370]
  Add property and sysctl to control how ZVOLs are exposed to OS.

  New ZFS property volmode and sysctl vfs.zfs.vol.mode allow switching ZVOL between three modes:
    geom -- existing fully functional behavior (default);
    dev -- exposing volumes only as raw disk device file in devfs;
    none -- not exposing volumes outside ZFS.

  The "dev" mode is less functional (volumes can't be partitioned, mounted, etc.), but it is faster, and in some scenarios with untrusted consumers safer. It can be useful for NAS, VM block storage, etc. The "none" mode may be convenient for backup servers, etc. that don't need direct data access.

  Due to the way ZVOL is integrated with the main ZFS code, the property and sysctl are checked only during pool import and volume creation.
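The three modes listed in the commit above map naturally onto an enumeration. The sketch below is illustrative only: `volmode_t`, `volmode_parse` and the two predicates are hypothetical names, and the enumerator values are not claimed to match the numeric values used by the property or the sysctl.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical representation of the three modes from the commit message. */
typedef enum volmode {
	VOLMODE_GEOM,		/* fully functional GEOM provider (default) */
	VOLMODE_DEV,		/* raw disk device file in devfs only */
	VOLMODE_NONE,		/* not exposed outside ZFS */
	VOLMODE_INVALID		/* unrecognized input */
} volmode_t;

/* Translate the property string into a mode. */
static volmode_t
volmode_parse(const char *s)
{
	assert(s != NULL);
	if (strcmp(s, "geom") == 0)
		return (VOLMODE_GEOM);
	if (strcmp(s, "dev") == 0)
		return (VOLMODE_DEV);
	if (strcmp(s, "none") == 0)
		return (VOLMODE_NONE);
	return (VOLMODE_INVALID);
}

/* Does the mode make the volume visible outside ZFS at all? */
static int
volmode_is_exposed(volmode_t m)
{
	return (m == VOLMODE_GEOM || m == VOLMODE_DEV);
}

/* Per the commit message, only "geom" volumes can be partitioned. */
static int
volmode_can_partition(volmode_t m)
{
	return (m == VOLMODE_GEOM);
}
```

Because the commit states that the setting is checked only at pool import and volume creation, a caller of such a parser would evaluate it at those two points rather than on every access.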
* MFC r264086: [mav, 2014-05-08, 1 file, -31/+52]
  3580 Want zvols to return volblocksize when queried for physical block size

  illumos/illumos-gate@a0b60564dfc644f4bfaef1ce26d343b44cf68bc5

  It is irrelevant for FreeBSD, just reducing diff.
* MFC r262661: [markj, 2014-05-05, 1 file, -11/+4]
  Fix emulation of call and jmp instructions on i386 and for 32-bit processes on amd64.
* MFC r262542: [markj, 2014-05-03, 1 file, -1/+1]
  Move some files that are identical on i386 and amd64 to an x86 subdirectory rather than keeping duplicate copies.
* MFC r264040: [pfg, 2014-05-02, 1 file, -5/+5]
  4248 dtrace(1M) should never create DOF with empty probes section
  4249 Only probes from the first DTrace object file will be included

  Illumos Revision: 4a20ab41aadcb81c53e72fc65886e964e9add59
  Reference:
    https://www.illumos.org/issues/4248
    https://www.illumos.org/issues/4249
  Obtained from: Illumos
* MFC r265046 [smh, 2014-04-30, 2 files, -23/+47]
  Fix ZIO reordering issue which could cause data loss / corruption.

  Sponsored by: Multiplay
* MFC r262596: [markj, 2014-04-23, 1 file, -1/+1]
  4478 dtrace_dof_maxsize is far too small

  illumos/illumos-gate@d339a29bb4765c4b6883a935cf69b669cd05bca0
* MFC r264193: [mav, 2014-04-21, 1 file, -0/+3]
  In addition to r264077, tell GEOM that we do support BIO_DELETE now.
* MFC r264077: [mav, 2014-04-21, 1 file, -4/+70]
  Add BIO_DELETE support to ZVOL.

  It is an adapted merge from the vendor branch of:
    701 UNMAP support for COMSTAR (in part related to ZFS)
    2130 zvol DKIOCFREE uses nested DMU transactions
* MFC r264341: [mav, 2014-04-21, 1 file, -0/+4]
  Create zvol devices on zfs clone.

  While the big and shiny patch is not ready, it is better to have something.

  PR: kern/178999
* MFC r263118: [mav, 2014-04-01, 1 file, -5/+9]
  Report ZVOL block size as GEOM stripesize.