| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
| |
it. GEOM MULTIPATH itself never touches the data and so is transparent.
| |
- Replace the single done mutex with per-disk ones. On a system with several
disks on several HBAs, that removes small but measurable lock congestion.
- Modify the disk destruction process to not destroy the mutex prematurely.
- Remove some extra pointer dereferences.
| |
Use destroy_dev_sched_cb() to not wait for device destruction while holding
the GEOM topology lock (that actually caused a deadlock). Use request counting
protected by a mutex to properly wait for completion of outstanding requests
when a device is closed or a geom is destroyed. Unlike r227009, this code
does not block the taskqueue thread indefinitely while waiting for completion.
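
The request counting described above can be sketched in userland C. This is only an illustration under stated assumptions: it uses POSIX threads rather than the kernel's own mutex and sleep primitives, and the names dev_state, request_start(), request_done() and wait_for_drain() are hypothetical, not taken from the driver.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-device state; names are illustrative only. */
struct dev_state {
	pthread_mutex_t lock;    /* protects 'active' */
	pthread_cond_t  drained; /* signalled when 'active' drops to zero */
	unsigned        active;  /* requests issued but not yet completed */
};

static void
dev_state_init(struct dev_state *ds)
{
	pthread_mutex_init(&ds->lock, NULL);
	pthread_cond_init(&ds->drained, NULL);
	ds->active = 0;
}

/* Called when a request is handed to the lower layer. */
static void
request_start(struct dev_state *ds)
{
	pthread_mutex_lock(&ds->lock);
	ds->active++;
	pthread_mutex_unlock(&ds->lock);
}

/* Called from the completion path. */
static void
request_done(struct dev_state *ds)
{
	pthread_mutex_lock(&ds->lock);
	assert(ds->active > 0);
	if (--ds->active == 0)
		pthread_cond_broadcast(&ds->drained);
	pthread_mutex_unlock(&ds->lock);
}

/* Called on close or destruction: sleep until nothing is in flight. */
static void
wait_for_drain(struct dev_state *ds)
{
	pthread_mutex_lock(&ds->lock);
	while (ds->active != 0)
		pthread_cond_wait(&ds->drained, &ds->lock);
	pthread_mutex_unlock(&ds->lock);
}
```

The counter is only touched under the mutex, so the waiter cannot miss the final completion between checking the count and going to sleep.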
| |
more topology change done that may require its attention. Add a few missing
g_do_wither() calls in the respective places to signal it.
This fixes a potential infinite loop here when some provider is withered but
still opened or connected for some reason and so cannot be destroyed. For
example, see r227009 and r227510.
| |
Reported and tested by: Ivan Klymenko <fidaj@ukr.net>
| |
buffers are allowed.
Sponsored by: The FreeBSD Foundation
| |
provider does so, since geom_part never inspects the bio_data.
Sponsored by: The FreeBSD Foundation
Tested by: pho
| |
unmapped i/o requests.
Sponsored by: The FreeBSD Foundation
Tested by: pho
| |
do not map the b_pages pages into buffer_map KVA. The use of
unmapped buffers eliminates the need to perform TLB shootdown for
mapping on buffer creation and reuse, greatly reducing the number
of IPIs for shootdown on big-SMP machines and eliminating up to 25-30%
of the system time on i/o intensive workloads.
An unmapped buffer should be explicitly requested by the consumer with the
GB_UNMAPPED flag. For an unmapped buffer, no KVA reservation is
performed at all. The consumer might request an unmapped buffer which
does have a KVA reserve, to manually map it without recursing into
the buffer cache and blocking, with the GB_KVAALLOC flag.
When a mapped buffer is requested and an unmapped buffer already
exists, the cache performs an upgrade, possibly reusing the KVA
reservation.
An unmapped buffer is translated into an unmapped bio in g_vfs_strategy().
An unmapped bio carries a pointer to the vm_page_t array, offset and length
instead of the data pointer. The provider which processes the bio
should explicitly declare its readiness to accept unmapped bios;
otherwise the g_down geom thread performs a transient upgrade of the bio
request by mapping the pages into the new bio_transient_map KVA
submap.
The bio_transient_map submap claims up to 10% of the buffer map, and
the total buffer_map + bio_transient_map KVA usage stays the
same. Still, it can be manually tuned with the kern.bio_transient_maxcnt
tunable, in units of transient mappings. Eventually, the
bio_transient_map could be removed after all geom classes and drivers
can accept unmapped i/o requests.
Unmapped support can be turned off with the vfs.unmapped_buf_allowed
tunable; disabling it makes buffer (or cluster) creation
requests ignore the GB_UNMAPPED and GB_KVAALLOC flags. Unmapped
buffers are only enabled by default on the architectures where
pmap_copy_page() was implemented and tested.
In the rework, filesystem metadata is no longer subject to the maxbufspace
limit. Since the metadata buffers are always mapped, the
buffers still have to fit into the buffer map, which provides a
reasonable (but practically unreachable) upper bound on it. The
non-metadata buffer allocations, both mapped and unmapped, are
accounted against maxbufspace, as before. Effectively, this means that
maxbufspace is enforced on mapped and unmapped buffers separately.
The pre-patch bufspace limiting code did not work, because
buffer_map fragmentation does not allow the limit to be reached.
At Jeff Roberson's request, the getnewbuf() function was split into
smaller single-purpose functions.
Sponsored by: The FreeBSD Foundation
Discussed with: jeff (previous version)
Tested by: pho, scottl (previous version), jhb, bf
MFC after: 2 weeks
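
The rule for when g_down has to perform the transient upgrade can be modeled in a few lines of C. This is a toy model, not the kernel structures: toy_bio, toy_provider and every field name below are hypothetical stand-ins for the real bio and provider state.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; the type and field names are hypothetical. */
struct toy_page;                       /* opaque page descriptor */

struct toy_bio {
	bool              unmapped;    /* true: pages[] is valid, data is not */
	void             *data;        /* mapped case: kernel virtual address */
	struct toy_page **pages;       /* unmapped case: array of pages */
	size_t            page_offset; /* offset into the first page */
	size_t            length;      /* transfer length in bytes */
};

struct toy_provider {
	bool accepts_unmapped;         /* provider declared its readiness */
};

/*
 * A transient mapping is needed only when an unmapped request reaches
 * a provider that has not declared that it can handle one.
 */
static bool
needs_transient_mapping(const struct toy_bio *bp,
    const struct toy_provider *pp)
{
	return (bp->unmapped && !pp->accepts_unmapped);
}
```

Mapped requests never need the upgrade, which is why providers that have not been converted keep working unchanged.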
| |
This fixes handling BIO_DELETE larger than MAXPHYS.
| |
of upgrading older machines using ataraid(4) to newer releases.
This optional parameter is controlled via kern.geom.raid.legacy_aliases
and will create a /dev/ar0 device that will point at /dev/raid/r0 for
example.
Tested on Dell SC 1425 DDF-1 format software raid controllers installing from
stable/7 and upgrading to stable/9 without having to adjust /etc/fstab
Reviewed by: mav
Obtained from: Yahoo!
MFC after: 2 Weeks
| |
This will avoid a 0-byte read (in g_read_data()) leading to a panic, if
previously read data are erroneous.
Suggested by: John-Mark Gurney <jmg@funkthat.com>
| |
PR: kern/174714
Submitted by: 4721 at hushmail dot com
MFC after: 1 week
| |
is not set (255).
Reported by: sbruno
MFC after: 1 week
| |
Without this, read data is mis-interpreted. This could trigger a panic,
as was the case on one computer where computed "recsize" was zero,
leading to a call to g_read_page() asking for 0 bytes.
| |
used structs and values.
This patch is not targeted for MFC.
| |
write is a disk write request that tells the disk that the buffer
being written must be committed to the media along with any writes
that preceded it before any future blocks may be written to the drive.
Barrier writes are provided by adding the functions bbarrierwrite
(bwrite with barrier) and babarrierwrite (bawrite with barrier).
Following a bbarrierwrite the client knows that the requested buffer
is on the media. It does not ensure that buffers written before that
buffer are on the media. It only ensures that buffers written before
that buffer will get to the media before any buffers written after
that buffer. A flush command must be sent to the disk to ensure that
all earlier written buffers are on the media.
Reviewed by: kib
Tested by: Peter Holm
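
One way to read the ordering guarantee is as a sequence of epochs separated by barrier writes. The sketch below is an interpretation of the description above, not code from the commit. The names toy_write, write_epoch() and may_reorder() are hypothetical, and it assumes a barrier write belongs to the same epoch as the writes issued before it.

```c
#include <stdbool.h>
#include <stddef.h>

/* One queued write; 'barrier' marks a barrier write. Hypothetical type. */
struct toy_write {
	bool barrier;
};

/*
 * Epoch of the write at index idx: the number of barrier writes issued
 * strictly before it. A barrier write closes its own epoch.
 */
static size_t
write_epoch(const struct toy_write *q, size_t idx)
{
	size_t epoch = 0;

	for (size_t i = 0; i < idx; i++)
		if (q[i].barrier)
			epoch++;
	return (epoch);
}

/*
 * Two writes may reach the media in either order only when no barrier
 * separates them, that is, when they fall into the same epoch.
 */
static bool
may_reorder(const struct toy_write *q, size_t a, size_t b)
{
	return (write_epoch(q, a) == write_epoch(q, b));
}
```

For a queue of plain, plain, barrier, plain, the first three writes share an epoch and may be reordered among themselves, while the fourth must wait for all of them.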
| |
This allows using gmirror, e.g., on top of ZVOLs.
PR: kern/175323
Submitted by: Alexei.Volkov@softlynx.ru, mav
Reported by: Alexei.Volkov@softlynx.ru
Tested by: Alexei.Volkov@softlynx.ru
Reviewed by: ae, mav, pjd
MFC after: 1 week
| |
- Identify one more metadata field.
| |
and for volumes with sector size above 512 bytes.
| |
size given by metadata, as it should be correct and in some cases can be
smaller than the subdisk size.
| |
PR: kern/160562
MFC after: 2 weeks
| |
as clean on shutdown and move that action from shutdown_pre_sync stage to
shutdown_post_sync to avoid extra flapping.
ZFS tends to not close devices on shutdown, which prevents GEOM RAID
from shutting down gracefully. To handle that, mark the volume as clean just
when shutdown time comes and there are no active writes.
MFC after: 2 weeks
| |
as clean on shutdown and move that action from shutdown_pre_sync stage to
shutdown_post_sync to avoid extra flapping.
ZFS tends to not close devices on shutdown, which prevents GEOM RAID
from shutting down gracefully. To handle that, mark the volume as clean just
when shutdown time comes and there are no active writes.
PR: kern/113957
MFC after: 2 weeks
| |
previous value of config_id when it is changed in some cases. I guess it
may be used to avoid some split-brain conditions.
| |
as a hint for raid/rX device number to make it persistent across reboots.
| |
unsupported metadata types like Intel Smart Response to not corrupt them.
- Improve setting of these things during metadata writing to protect from
incapable BIOSes and other implementations.
| |
reconnected back, leave it as disconnected. If a new disk is inserted in place
of the disabled one, rebuild it and leave it enabled.
| |
disks should be rebuilt. Our rebuild code is at the same time disk-centric. To
handle this situation properly, check all disks for RBLD flags, and if no
disk is specified, try to rebuild/resync all of them except the newly inserted
one.
| |
Windows driver uses such migration when it creates new arrays. While GEOM
RAID has no mechanism to implement migration in the general case, this specific
case can still be handled easily via degraded RAID1 creation followed by
a regular rebuild.
| |
It is similar to RAID1, but with dedicated master and recovery disks and
manual control over synchronization. It allows using the recovery
disk as a snapshot of the master disk from the time of the last sync.
This implementation is not functionally complete compared to Windows,
but it is better than silent conversion to RAID1 on first boot.
| |
vfs_write_resume_flags().
Sponsored by: The FreeBSD Foundation
| |
conditions. This fixes assertion which checks those fields when kernel is
compiled with DIAGNOSTIC.
Reported by: kib, pho
MFC after: 1 week
| |
'"'. Mangling is only done for label names read from file system
metadata. Encoding resembles URL encoding. For example, the space
character becomes %20.
Help by: kib
Discussed with: imp, kib, pjd
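
A URL-style mangling of label names might look like the sketch below. The exact set of characters the commit escapes is cut off in this log, so the set used here (space, double quote, percent sign and non-printable bytes) is an assumption, and mangle_label() and needs_escape() are hypothetical names rather than the kernel's functions.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed escape set: space, '"', '%' and anything non-printable. */
static int
needs_escape(unsigned char c)
{
	return (!isprint(c) || c == ' ' || c == '"' || c == '%');
}

/*
 * Copy 'in' to 'out', replacing escaped bytes with %XX. Output is
 * truncated at a whole character if it does not fit. Returns the
 * number of bytes written, not counting the terminating NUL.
 */
static size_t
mangle_label(const char *in, char *out, size_t outsz)
{
	size_t o = 0;

	for (; *in != '\0'; in++) {
		unsigned char c = (unsigned char)*in;
		size_t need = needs_escape(c) ? 3 : 1;

		if (o + need >= outsz)	/* keep room for the NUL */
			break;
		if (need == 3)
			snprintf(out + o, 4, "%%%02X", c);
		else
			out[o] = (char)c;
		o += need;
	}
	if (outsz > 0)
		out[o] = '\0';
	return (o);
}
```

With this escape set, the label "my disk" comes out as "my%20disk", matching the space example in the commit message.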
| |
- Add __printflike() attributes.
- Remove an extra argument for the g_new_geomf() call in swapongeom_ev().
Reviewed by: pjd
| |
state of crashdump target devices.
This will be used to add a "-l" (ell) flag to dumpon(8) to list the
currently configured dumpdev.
Reviewed by: phk
| |
extended using growfs(8). The problem here is that geom_label checks if
the filesystem size recorded in UFS superblock is equal to the provider
(i.e. device) size. This check cannot be removed due to backward
compatibility. On the other hand, in most cases growfs(8) cannot set
fs_size in the superblock to match the provider size, because, differently
from newfs(8), it cannot recompute cylinder group sizes.
To fix this problem, add another superblock field, fs_providersize, used
only for this purpose. The geom_label(4) will attach if either fs_size
(filesystem created with newfs(8)) or fs_providersize (filesystem expanded
using growfs(8)) matches the device size.
PR: kern/165962
Reviewed by: mckusick
Sponsored by: FreeBSD Foundation
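
The attach rule described above reduces to a two-way comparison. In the sketch below, toy_superblock and label_should_attach() are hypothetical names, and all three sizes are assumed to be expressed in the same unit, which glosses over whatever unit conversion the real check performs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Only the two superblock fields that matter here; hypothetical type. */
struct toy_superblock {
	int64_t fs_size;         /* set by newfs(8) */
	int64_t fs_providersize; /* set by growfs(8) */
};

/*
 * Attach if either recorded size matches the provider. All sizes are
 * assumed to be in the same unit.
 */
static bool
label_should_attach(const struct toy_superblock *sb, int64_t provider_size)
{
	return (sb->fs_size == provider_size ||
	    sb->fs_providersize == provider_size);
}
```

Keeping the original fs_size comparison preserves backward compatibility, while the second comparison covers filesystems that growfs(8) could not grow to exactly the provider size.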
| |
As with BIO_WRITE, report success if at least one subdisk succeeded with
BIO_DELETE. But unlike BIO_WRITE, don't fail the disk on a BIO_DELETE error.
Sponsored by: iXsystems, Inc.
MFC after: 1 month
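
The completion policy can be illustrated with a small model. This is a sketch of the rule as stated in the message, not the GEOM RAID code: toy_cmd, toy_subdisk and finish_request() are hypothetical, and per-subdisk results are passed in as plain errno values.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_cmd { TOY_WRITE, TOY_DELETE };

struct toy_subdisk {
	bool failed;	/* subdisk has been kicked out of the volume */
};

/*
 * Combine per-subdisk results: the request succeeds if any subdisk
 * succeeded. A write error fails the subdisk; a delete error does not.
 */
static int
finish_request(enum toy_cmd cmd, struct toy_subdisk *sd, const int *errors,
    size_t n)
{
	int first_error = 0;
	bool any_ok = false;

	for (size_t i = 0; i < n; i++) {
		if (errors[i] == 0) {
			any_ok = true;
			continue;
		}
		if (first_error == 0)
			first_error = errors[i];
		if (cmd == TOY_WRITE)
			sd[i].failed = true;
	}
	return (any_ok ? 0 : first_error);
}
```

A failed delete leaves the data intact, which is presumably why it is not treated as a reason to drop the disk, whereas a failed write leaves that subdisk stale.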
| |
If at least one subdisk in the volume supports it, BIO_DELETE requests
will be propagated down. Unfortunately, for RAID levels with redundancy,
unmapped blocks will be mapped back during the first rebuild/resync process.
Sponsored by: iXsystems, Inc.
MFC after: 1 month
| |
topology lock, resulting in assertion when running with DIAGNOSTIC.
Reviewed by: mav (earlier version)
| |
and move that action from shutdown_pre_sync to shutdown_post_sync stage
to avoid extra flapping.
ZFS tends to not close devices on shutdown, which prevents GEOM RAID
from shutting down gracefully. To handle that, mark the volume as clean just
when shutdown time comes and there are no active writes.
MFC after: 2 weeks
| |
In particular, do not lock Giant conditionally when calling into the
filesystem module, remove the VFS_LOCK_GIANT() and related
macros. Stop handling buffers belonging to non-mpsafe filesystems.
The VFS_VERSION is bumped to indicate the interface change, which does
not result in interface signature changes.
Conducted and reviewed by: attilio
Tested by: pho
| |
filesystems that we don't support natively.
Revert part of r241636 to do so.
This patch is not targeted for MFC.
Requested by: gleb, jhb
| |
GIANT from VFS. This code is particularly broken and fragile and other
in-kernel implementations around, found in other operating systems,
don't really seem clean and solid enough to be imported at all.
If someone wants to reconsider an in-kernel NTFS implementation for
inclusion again, a fair effort for completely fixing and cleaning it
up is expected.
In the meantime, regular NTFS users can use the FUSE interface and the ntfs-3g
port to work with their NTFS partitions.
This is not targeted for MFC.
| |
This should be only a cosmetic change.
Found by: Clang Static Analyzer
| |
provider name to be specified instead of geom name (first argument in all
subcommands except label). In most cases there is only one array used
anyway, so it is not really useful to make the user type ugly geom names like
Intel-f0bdf223 or SiI-732c2b9448cf. Though they can be used in some cases.
Sponsored by: iXsystems, Inc.
MFC after: 1 month
| |
Besides, withered but still alive consumers may interfere with
re-tasting.
MFC after: 16 days
| |
mutexes held and the topology lock is an sx lock.
The topology lock was there to protect traversing through the list of providers
of disk's geom, but it seems that disk's geom has always exactly one provider.
Change the code to call g_wither_provider() for this one provider, which is
safe to do without holding the topology lock and assert that there is indeed
only one provider.
Discussed with: ken
MFC after: 1 week