nvme: do not revert to single I/O queue when per-CPU queues not available
Previously nvme(4) would revert to a single I/O queue if it could not
allocate enough interrupt vectors or NVMe submission/completion queues
to have one I/O queue per core. This patch determines how to utilize a
smaller number of available interrupt vectors, and assigns (as closely
as possible) an equal number of cores to each associated I/O queue.
nvme: do not pre-allocate MSI-X IRQ resources
The issue referenced here was resolved by other changes
in recent commits, so this code is no longer needed.
nvme: remove per_cpu_io_queues from struct nvme_controller
Instead just use num_io_queues to make this determination.
This prepares for future changes enabling the use of multiple I/O
queues even when there are not enough queues or MSI-X vectors to
provide one queue per CPU.
nvme: remove CHATHAM related code
Chatham was an internal NVMe prototype board used for
early driver development.
Sponsored by: Intel
nvme: create separate DMA tag for non-payload DMA buffers
Submission and completion queue memory needs a DMA tag separate
from the one used for payload buffers, to ensure its mappings
remain contiguous even with DMAR enabled.
Sponsored by: Intel
nvme: Allocate all MSI resources up front so that we can fall back to
INTx if necessary.
nvme: Close hole where nvd(4) would not be notified of all nvme(4)
instances if the modules are loaded during boot.
they occur.
This prevents repeated notifications of the same event.
The status of these events may be checked at any time by viewing the
SMART/Health Info Page with nvmecontrol, whether or not asynchronous
event notifications for those events are enabled. This log page can
be viewed using:
nvmecontrol logpage -p 2 <ctrlr id>
Future enhancements may re-enable these notifications on a periodic basis
so that if the notified condition persists, it will continue to be logged.
Sponsored by: Intel
Reviewed by: carl
Approved by: re (hrs)
MFC after: 1 week
benefit from it.
Sponsored by: Intel
Reviewed by: kib (earlier version), carl
Approved by: re (hrs)
MFC after: 1 week
notification gets sent in cases where the system shuts down with the
driver unloaded.
Sponsored by: Intel
Reviewed by: carl
MFC after: 3 days
if not already defined elsewhere.
Requested by: attilio
MFC after: 3 days
MFC after: 3 days
The nvme_physio() function was removed quite a while ago, which was the
only user of this uio-related code.
Sponsored by: Intel
MFC after: 3 days
Also allow admin commands to transfer up to this maximum I/O size, rather
than the artificial limit previously imposed. The larger I/O size is very
beneficial for upcoming firmware download support. This has the added
benefit of simplifying the code since both admin and I/O commands now use
the same maximum I/O size.
Sponsored by: Intel
MFC after: 3 days
This removes nvme_uio.c completely.
Sponsored by: Intel
Sponsored by: Intel
locking operations on the controller.
Sponsored by: Intel
Sponsored by: Intel
NULL. This simplifies decisions around whether and how requests are
routed through busdma. It also paves the way for supporting unmapped bios.
Sponsored by: Intel
1) Consistently use device_printf.
2) Make dump_completion and dump_command into something more
human-readable.
Sponsored by: Intel
Reviewed by: carl
separate function.
Sponsored by: Intel
Suggested by: carl
Reviewed by: carl
mechanism.
Now that all requests are timed, we are guaranteed to get a completion
notification, even if it is an abort status due to a timed out admin
command.
This has the effect of simplifying the controller and namespace setup
code, so that it reads straight through rather than broken up into
a bunch of different callback functions.
Sponsored by: Intel
Reviewed by: carl
start or reset. Also add a notifier for NVMe consumers for controller fail
conditions and plumb this notifier for nvd(4) to destroy the associated
GEOM disks when a failure occurs.
This requires a bit of work to cover the races when a consumer is sending
I/O requests to a controller that is transitioning to the failed state. To
help cover this condition, add a task to defer completion of I/Os submitted
to a failed controller, so that the consumer will still always receive its
completions in a different context than the submission.
Sponsored by: Intel
Reviewed by: carl
This flag was originally added to communicate to the sysctl code
which oids should be built, but there are easier ways to do this. This
needs to be cleaned up prior to adding new controller states - for example,
controller failure.
Sponsored by: Intel
Reviewed by: carl
The controller's IDENTIFY data contains MDTS (Max Data Transfer Size) to
allow the controller to specify the maximum I/O data transfer size. nvme(4)
already provides a default maximum, but make sure it does not exceed what
MDTS reports.
Sponsored by: Intel
Reviewed by: carl
that if a specific I/O repeatedly times out, we don't retry it indefinitely.
The default number of retries is 4, but it can be adjusted via the
hw.nvme.retry_count tunable.
Sponsored by: Intel
Reviewed by: carl
Sponsored by: Intel
Reviewed by: carl
specified log page.
This satisfies the spec condition that future async events of the same type
will not be sent until the associated log page is fetched.
Sponsored by: Intel
Reviewed by: carl
log pages.
Sponsored by: Intel
Reviewed by: carl
error log pages.
Sponsored by: Intel
Reviewed by: carl
This protects against cases where a controller crashes with multiple
I/O outstanding, each timing out and requesting controller resets
simultaneously.
While here, remove a debugging printf from a previous commit, and add
more logging around I/Os that need to be resubmitted after a controller
reset.
Sponsored by: Intel
Reviewed by: carl
While aborts are typically cleaner than a full controller reset, many times
an I/O timeout indicates other controller-level issues where aborts may not
work. NVMe drivers for other operating systems are also defaulting to
controller reset rather than aborts for timed out I/O.
Sponsored by: Intel
Reviewed by: carl
but can be adjusted between a min/max of 5 and 120 seconds.
Sponsored by: Intel
Reviewed by: carl
On any I/O timeout, check for csts.cfs==1. If set, the controller
is reporting fatal status and we reset the controller immediately,
rather than trying to abort the timed-out command.
This changeset also includes deferring the controller start portion
of the reset to a separate task. This ensures we are always performing
a controller start operation from a consistent context.
Sponsored by: Intel
Reviewed by: carl
invoke it from nvmecontrol(8).
Controller reset will be performed in cases where I/Os are repeatedly
timing out, the controller reports an unrecoverable condition, or
when explicitly requested via IOCTL or an nvme consumer. Since the
controller may be in such a state where it cannot even process queue
deletion requests, we will perform a controller reset without trying
to clean up anything on the controller first.
Sponsored by: Intel
Reviewed by: carl
This enables in-order re-submission of I/O after a controller reset.
Sponsored by: Intel
Sponsored by: Intel
notifications when new nvme controllers are added to the system.
Sponsored by: Intel
Also add logic to clean up all outstanding asynchronous event requests
when resetting or shutting down the controller, since these requests
will not be explicitly completed by the controller itself.
Sponsored by: Intel
function.
Sponsored by: Intel
This is primarily driven by the need to disable timeouts for asynchronous
event requests, which by nature should not be timed out.
Sponsored by: Intel
an I/O times out.
Also ensure that we retry commands that are aborted due to a timeout.
Sponsored by: Intel
behind BAR 4/5, rather than in BAR 0/1 with the control/doorbell registers.
Sponsored by: Intel
matches MSI-X behavior.
Sponsored by: Intel
previously defined IDT PCI device ID was for a 32-channel controller.
Submitted by: Joe Golio <joseph.golio@isilon.com>
This eliminates the need to manage queue depth at the nvd(4) level for
Chatham prototype board workarounds, and also adds the ability to
accept a number of requests on a single qpair that is much larger
than the number of trackers allocated.
Sponsored by: Intel
than dynamically creating them at runtime.
Sponsored by: Intel
duplication between the admin and io controller-level submit
functions.
Sponsored by: Intel
nvme_ctrlr_submit_io_request().
While here, also fix case where a uio may have more than 1 iovec.
NVMe's definition of SGEs (called PRPs) only allows for the first SGE to
start on a non-page boundary. The simplest way to handle this is to
construct a temporary uio for each iovec, and submit an NVMe request
for each.
Sponsored by: Intel
code for allocating nvme_tracker objects and making calls into
bus_dmamap_load for commands which have payloads.
Sponsored by: Intel