| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
| |
Move implementations of uread() and uwrite() to the illumos compat layer.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit partially reverts r273641 which introduced the leak.
It did so to accomodate for some consumers of traverse() that expected
the starting vnode to stay as-is. But that introduced the leak in the
case when a mounted filesystem was found and its root vnode was
returned.
r299914 removed the troublesome consumers and now there is no reason to
keep the starting vnode. So, now the new rules are:
- if there is no mounted filesystem, then nothing is changed
- otherwise the starting vnode is always released
- the root vnode of the mounted filesystem is returned locked and
referenced in the case of success
MFC after: 5 weeks
X-MFC after: r299914
|
|
|
|
|
|
|
| |
This makes sure that the original vnode is always unlocked and released
if any error happens.
MFC after: 4 weeks
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Remove excessive references on a snapshot mountpoint vnode.
zfsctl_snapdir_lookup() called VN_HOLD() on a vnode returned from
zfsctl_snapshot_mknode() and the latter also had a call to VN_HOLD()
on the same vnode.
On top of that gfs_dir_create() already returns the vnode with the
use count of 1 (set in getnewvnode).
So there was 3 references on the vnode.
* mount_snapshot() should keep a reference to a covered vnode.
That reference is owned by the mountpoint (mounted snapshot filesystem).
* Remove cryptic manipulations of a covered vnode in zfs_umount().
FreeBSD dounmount() already does the right thing and releases the covered
vnode.
PR: 207464
Reported by: dustinwenz@ebureau.com
Tested by: Howard Powell <hpowell@lighthouseinstruments.com>
MFC after: 3 weeks
|
|
|
|
| |
MFC after: 3 days
|
|
|
|
|
|
|
| |
Submitted by: sef
Reviewed by: trasz
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D3540
|
|
|
|
|
|
|
|
| |
Previously several places were doing it on its own, partially
incorrectly (e.g. without the filedesc locked) or even actively harmful
by populating jdir or assigning rootvnode without vrefing it.
Reviewed by: kib
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
crash.sh script attached to FreeNAS bug 4109:
https://bugs.freenas.org/issues/4109
Three are in the snapshot layer:
a) AVG explains in his notes: https://wiki.freebsd.org/AvgVfsSolarisVsFreeBSD
"VOP_INACTIVE must not do any destructive actions to a vnode
and its filesystem node, nor invalidate them in any way."
gfs_vop_inactive and zfsctl_snapshot_inactive did just that. In
OpenSolaris VOP_INACTIVE is much closer to FreeBSD's VOP_RECLAIM.
Rename & move them to gfs_vop_reclaim and zfsctl_snapshot_reclaim
and merge in the requisite vnode_destroy from zfsctl_common_reclaim.
b) gfs_lookup_dot and various zfsctl functions do not honor the
FreeBSD VFS convention of only locking from the root downward. When
looking up ".." the convention is to drop the current leaf vnode lock before
acquiring the directory vnode and then subsequently re-acquiring the lock on the
leaf vnode. This fixes that in all the places that our exercised by crash.sh.
c) The snapshot may already be unmounted when the directory vnode is reclaimed.
Check for this case and return.
One in the common layer:
d) Callers of traverse expect the reference to the vnode passed in to be
maintained. Don't release it.
This last one may be an unclear contract. There may in fact be some callers that
do expect the reference to be dropped on success in addition to callers that
expect it to be released. In this case a further audit of the callers is needed
and a consensus on the correct behavior.
PR: 184677
Submitted by: kmacy
Reviewed by: delphij, will, avg
MFC after: 2 weeks
Sponsored by: iXsystems
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove previously added kmem methods in favour of defines which
allow diff minimisation between upstream code base.
Rebalance ARC free target to be vm_pageout_wakeup_thresh by default
which eliminates issue where ARC gets minimised instead of balancing
with VM pageout. The restores the target point prior to r270759.
Bring in missing upstream only changes which move unused code to
further eliminate code differences.
Add additional DTRACE probe to aid monitoring of ARC behaviour.
Enable upstream i386 code paths on platforms which don't define
UMA_MD_SMALL_ALLOC.
Fix mixture of byte an page values in arc_memory_throttle i386 code
path value assignment of available_memory.
PR: 187594
Review: D702
Reviewed by: avg
MFC after: 1 week
X-MFC-With: r270759 & r270861
Sponsored by: Multiplay
|
|
|
|
|
|
|
| |
SDT requries sys/param.h due to use of NULL
Reported by: Garrett
Sponsored by: Multiplay
|
|
|
|
|
| |
MFC after: 1 week
Sponsored by: Multiplay
|
|
|
|
|
|
|
|
|
|
|
| |
Also restore kmem_used() check for i386 as it has KVA limits that the raw
page counts above don't consider
PR: 187594
Reviewed by: peter
X-MFC-With: r270759
Review: D700
Sponsored by: Multiplay
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prior to this change we triggered ARC reclaim when kmem usage passed 3/4
of the total available, as indicated by vmem_size(kmem_arena, VMEM_ALLOC).
This could lead large amounts of unused RAM e.g. on a 192GB machine with
ARC the only major RAM consumer, 40GB of RAM would remain unused.
The old method has also been seen to result in extreme RAM usage under
certain loads, causing poor performance and stalls.
We now trigger ARC reclaim when the number of free pages drops below the
value defined by the new sysctl vfs.zfs.arc_free_target, which defaults
to the value of vm.v_free_target.
Credit to Karl Denninger for the original patch on which this update was
based.
PR: 191510 and 187594
Tested by: dteske
MFC after: 1 week
Relnotes: yes
Sponsored by: Multiplay
|
|
|
|
|
| |
that symbol (which will be correct in both kernel and userland contexts)
rather than just __arm__ to decide whether to use a local implementation.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
These changes prevent sysctl(8) from returning proper output,
such as:
1) no output from sysctl(8)
2) erroneously returning ENOMEM with tools like truss(1)
or uname(1)
truss: can not get etype: Cannot allocate memory
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating childrens of a static/extern SYSCTL
node. This operation should probably be made into a factored out
common macro, hence some device drivers use this. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading cludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.
Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, hence malloc()/free() is
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.
MFC after: 2 weeks
Sponsored by: Mellanox Technologies
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To reduce the diff struct pcu.cnt field was not renamed, so
PCPU_OP(cnt.field) is still used. pc_cnt and pcpu are also used in
kvm(3) and vmstat(8). The goal was to not affect externally used KPI.
Bump __FreeBSD_version_ in case some out-of-tree module/code relies on the
the global cnt variable.
Exp-run revealed no ports using it directly.
No objection from: arch@
Sponsored by: EMC / Isilon Storage Division
|
|
|
|
|
|
|
| |
this should be more optimal than writing pages one-by-one via zfs_write ->
update_pages in the case of multi-page putpages call
MFC after: 16 days
|
|
|
|
| |
MFC after: 10 days
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
943 zio_interrupt ends up calling taskq_dispatch with TQ_SLEEP
illumos/illumos-gate@5aeb94743e3be0c51e86f73096334611ae3a058e
Essentially FreeBSD taskqueues already operate in a mode that
was added to Illumos with taskq_dispatch_ent change.
We even exposed the superior FreeBSD interface as taskq_dispatch_safe.
Now we just rename taskq_dispatch_safe to taskq_dispatch_ent and
struct struct ostask to taskq_ent_t, so that code differences will be
minimal.
After this change sys/cddl/compat/opensolaris/sys/taskq.h header is no
longer needed.
Note that this commit is not an MFV because the upstream change was not
individually committed to the vendor area.
MFC after: 8 days
|
|
|
|
|
|
|
|
| |
- drop trailing whitespace
- remove redundant "extern" from function declarations
- remove unused macro
MFC after: 1 week
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
transparent layering and better fragmentation.
- Normalize functions that allocate memory to use kmem_*
- Those that allocate address space are named kva_*
- Those that operate on maps are named kmap_*
- Implement recursive allocation handling for kmem_arena in vmem.
Reviewed by: alc
Tested by: pho
Sponsored by: EMC / Isilon Storage Division
|
|
|
|
| |
MFC after: 2 days
|
|
|
|
| |
Sponsored by: EMC / Isilon storage division
|
|
|
|
|
|
|
| |
const variables.
Sponsored by: EMC / Isilon storage division
Reported by: pjd
|
|
|
|
|
|
|
|
| |
Rename current VM_OBJECT_LOCK_ASSERT(foo, RA_WLOCKED) into
VM_OBJECT_ASSERT_WLOCKED(foo)
Sponsored by: EMC / Isilon storage division
Requested by: alc
|
|
|
|
| |
Sponsored by: EMC / Isilon storage division
|
|
|
|
|
|
|
|
|
|
|
| |
Added description option to kstats.
Added descriptions for zio_trim kstats
PR: kern/173113
Submitted by: Steven Hartland
Reviewed by: pjd
Approved by: pjd
MFC after: 2 weeks
|
|
|
|
| |
MFC after: 6 days
|
|
|
|
|
|
|
| |
To do: investigate if it would be possible to use normal vfs_domount here.
Reviewed by: kib
MFC after: 19 days
|
|
|
|
|
|
|
| |
... to ensure that we have a valid mountpoint during the call.
Reviewed by: kib
MFC after: 19 days
|
|
|
|
|
|
|
|
|
|
|
|
| |
In particular, do not lock Giant conditionally when calling into the
filesystem module, remove the VFS_LOCK_GIANT() and related
macros. Stop handling buffers belonging to non-mpsafe filesystems.
The VFS_VERSION is bumped to indicate the interface change which does
not result in the interface signatures changes.
Conducted and reviewed by: attilio
Tested by: pho
|
|
|
|
|
|
| |
Remove obsoleted intermediate cddl/compat/opensolaris/sys/debug.h.
MFC after: 2 weeks
|
|
|
|
|
|
|
| |
... instead of just a verbatim format string.
Reviewed by: pjd
MFC after: 1 week
|
|
|
|
|
|
|
|
| |
Reported by: David P. Discher <dpd@bitgravity.com>
Tested by: will
Reviewed by: art
Discussed with: dwhite
MFC after: 2 weeks
|
|
|
|
|
|
|
|
|
|
| |
User upgrades his system to fix the problem, but if he has any ZFS snapshots
for the file system which contains problematic binary, any user can mount the
snapshot and execute vulnerable binary.
Prevent this from happening by always mounting snapshots with setuid turned off.
MFC after: 2 weeks
|
|
|
|
| |
MFC after: 2 weeks
|
|
|
|
|
|
|
|
| |
The task structure might be no longer available.
This also allows to eliminates the need for two tasks in the zio structure.
Submitted by: anonymous
MFC after: 2 weeks
|
|
|
|
| |
MFC after: 1 month
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Few new things available from now on:
- Data deduplication.
- Triple parity RAIDZ (RAIDZ3).
- zfs diff.
- zpool split.
- Snapshot holds.
- zpool import -F. Allows to rewind corrupted pool to earlier
transaction group.
- Possibility to import pool in read-only mode.
MFC after: 1 month
|
|
|
|
|
|
|
| |
functions to obtain the address and size of the preloaded pool
configuration file/repository.
Sponsored by: Juniper Networks.
|
|
|
|
| |
Provide 64 bit atomic ops, and use 32 bit pointer.
|
|
|
|
| |
existing uses. Rename sysctl_handle_quad() to sysctl_handle_64().
|
|
|
|
|
|
|
|
|
| |
with filesystems created under MacOS X ZFS port. This is kind of filesystem
corruption (we don't allow for setting empty ACLs), so make acl_get_file(3)
and related syscalls fail with EINVAL in that case. In theory, we could
return empty ACL to userland, but I'm afraid this would break some code.
MFC after: 3 days
|
|
|
|
| |
Found with: clang
|
|
|
|
|
|
|
|
|
| |
physical memory
This is needed to correctly autotune ZFS ARC size when vm_kmem_size is
set to value larger than available physical memory.
MFC after: 2 weeks
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- better ACL caching and speedup of ACL permission checks
- faster handling of stat()
- lowered mutex contention in the read/writer lock (rrwlock)
- several related bugfixes
Detailed information (OpenSolaris onnv changesets and Bug IDs):
9749:105f407a2680
6802734 Support for Access Based Enumeration (not used on FreeBSD)
6844861 inconsistent xattr readdir behavior with too-small buffer
9866:ddc5f1d8eb4e
6848431 zfs with rstchown=0 or file_chown_self privilege allows user to "take" ownership
9981:b4907297e740
6775100 stat() performance on files on zfs should be improved
6827779 rrwlock is overly protective of its counters
10143:d2d432dfe597
6857433 memory leaks found at: zfs_acl_alloc/zfs_acl_node_alloc
6860318 truncate() on zfsroot succeeds when file has a component of its path set without access permission
10232:f37b85f7e03e
6865875 zfs sometimes incorrectly giving search access to a dir
10250:b179ceb34b62
6867395 zpool_upgrade_007_pos testcase panic'd with BAD TRAP: type=e (#pf Page fault)
10269:2788675568fd
6868276 zfs_rezget() can be hazardous when znode has a cached ACL
10295:f7a18a1e9610
6870564 panic in zfs_getsecattr
Approved by: delphij (mentor)
Obtained from: OpenSolaris (multiple Bug IDs)
MFC after: 2 weeks
|
|
|
|
|
|
| |
test.
Sponsored by: The FreeBSD Foundation
|