summaryrefslogtreecommitdiffstats
path: root/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/Changes10
-rw-r--r--Documentation/DocBook/libata.tmpl1072
-rw-r--r--Documentation/SubmittingPatches86
-rw-r--r--Documentation/block/biodoc.txt113
-rw-r--r--Documentation/connector/connector.txt44
-rw-r--r--Documentation/dell_rbu.txt38
-rw-r--r--Documentation/device-mapper/snapshot.txt73
-rw-r--r--Documentation/kernel-parameters.txt496
-rw-r--r--Documentation/keys-request-key.txt161
-rw-r--r--Documentation/keys.txt92
-rw-r--r--Documentation/networking/bonding.txt5
-rw-r--r--Documentation/networking/ip-sysctl.txt10
-rw-r--r--Documentation/sparse.txt4
-rw-r--r--Documentation/usb/URB.txt74
14 files changed, 1894 insertions, 384 deletions
diff --git a/Documentation/Changes b/Documentation/Changes
index 5eaab04..27232be 100644
--- a/Documentation/Changes
+++ b/Documentation/Changes
@@ -237,6 +237,12 @@ udev
udev is a userspace application for populating /dev dynamically with
only entries for devices actually present. udev replaces devfs.
+FUSE
+----
+
+Needs libfuse 2.4.0 or later. Absolute minimum is 2.3.0 but mount
+options 'direct_io' and 'kernel_cache' won't work.
+
Networking
==========
@@ -390,6 +396,10 @@ udev
----
o <http://www.kernel.org/pub/linux/utils/kernel/hotplug/udev.html>
+FUSE
+----
+o <http://sourceforge.net/projects/fuse>
+
Networking
**********
diff --git a/Documentation/DocBook/libata.tmpl b/Documentation/DocBook/libata.tmpl
index 375ae76..d260d92 100644
--- a/Documentation/DocBook/libata.tmpl
+++ b/Documentation/DocBook/libata.tmpl
@@ -415,6 +415,362 @@ and other resources, etc.
</sect1>
</chapter>
+ <chapter id="libataEH">
+ <title>Error handling</title>
+
+ <para>
+ This chapter describes how errors are handled under libata.
+ Readers are advised to read SCSI EH
+ (Documentation/scsi/scsi_eh.txt) and ATA exceptions doc first.
+ </para>
+
+ <sect1><title>Origins of commands</title>
+ <para>
+ In libata, a command is represented with struct ata_queued_cmd
+ or qc. qc's are preallocated during port initialization and
+ repetitively used for command executions. Currently only one
+ qc is allocated per port but yet-to-be-merged NCQ branch
+ allocates one for each tag and maps each qc to NCQ tag 1-to-1.
+ </para>
+ <para>
+ libata commands can originate from two sources - libata itself
+ and SCSI midlayer. libata internal commands are used for
+ initialization and error handling. All normal blk requests
+ and commands for SCSI emulation are passed as SCSI commands
+ through queuecommand callback of SCSI host template.
+ </para>
+ </sect1>
+
+ <sect1><title>How commands are issued</title>
+
+ <variablelist>
+
+ <varlistentry><term>Internal commands</term>
+ <listitem>
+ <para>
+ First, qc is allocated and initialized using
+ ata_qc_new_init(). Although ata_qc_new_init() doesn't
+ implement any wait or retry mechanism when qc is not
+ available, internal commands are currently issued only during
+ initialization and error recovery, so no other command is
+ active and allocation is guaranteed to succeed.
+ </para>
+ <para>
+ Once allocated qc's taskfile is initialized for the command to
+ be executed. qc currently has two mechanisms to notify
+ completion. One is via qc->complete_fn() callback and the
+ other is completion qc->waiting. qc->complete_fn() callback
+ is the asynchronous path used by normal SCSI translated
+ commands and qc->waiting is the synchronous (issuer sleeps in
+ process context) path used by internal commands.
+ </para>
+ <para>
+ Once initialization is complete, host_set lock is acquired
+ and the qc is issued.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term>SCSI commands</term>
+ <listitem>
+ <para>
+ All libata drivers use ata_scsi_queuecmd() as
+ hostt->queuecommand callback. scmds can either be simulated
+ or translated. No qc is involved in processing a simulated
+ scmd. The result is computed right away and the scmd is
+ completed.
+ </para>
+ <para>
+ For a translated scmd, ata_qc_new_init() is invoked to
+ allocate a qc and the scmd is translated into the qc. SCSI
+ midlayer's completion notification function pointer is stored
+ into qc->scsidone.
+ </para>
+ <para>
+ qc->complete_fn() callback is used for completion
+ notification. ATA commands use ata_scsi_qc_complete() while
+ ATAPI commands use atapi_qc_complete(). Both functions end up
+ calling qc->scsidone to notify upper layer when the qc is
+ finished. After translation is completed, the qc is issued
+ with ata_qc_issue().
+ </para>
+ <para>
+ Note that SCSI midlayer invokes hostt->queuecommand while
+ holding host_set lock, so all above occur while holding
+ host_set lock.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+ </sect1>
+
+ <sect1><title>How commands are processed</title>
+ <para>
+ Depending on which protocol and which controller are used,
+ commands are processed differently. For the purpose of
+ discussion, a controller which uses taskfile interface and all
+ standard callbacks is assumed.
+ </para>
+ <para>
+ Currently 6 ATA command protocols are used. They can be
+ sorted into the following four categories according to how
+ they are processed.
+ </para>
+
+ <variablelist>
+ <varlistentry><term>ATA NO DATA or DMA</term>
+ <listitem>
+ <para>
+ ATA_PROT_NODATA and ATA_PROT_DMA fall into this category.
+ These types of commands don't require any software
+ intervention once issued. Device will raise interrupt on
+ completion.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term>ATA PIO</term>
+ <listitem>
+ <para>
+ ATA_PROT_PIO is in this category. libata currently
+ implements PIO with polling. ATA_NIEN bit is set to turn
+ off interrupt and pio_task on ata_wq performs polling and
+ IO.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term>ATAPI NODATA or DMA</term>
+ <listitem>
+ <para>
+ ATA_PROT_ATAPI_NODATA and ATA_PROT_ATAPI_DMA are in this
+ category. packet_task is used to poll BSY bit after
+ issuing PACKET command. Once BSY is turned off by the
+ device, packet_task transfers CDB and hands off processing
+ to interrupt handler.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term>ATAPI PIO</term>
+ <listitem>
+ <para>
+ ATA_PROT_ATAPI is in this category. ATA_NIEN bit is set
+ and, as in ATAPI NODATA or DMA, packet_task submits cdb.
+ However, after submitting cdb, further processing (data
+ transfer) is handed off to pio_task.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </sect1>
+
+ <sect1><title>How commands are completed</title>
+ <para>
+ Once issued, all qc's are either completed with
+ ata_qc_complete() or time out. For commands which are handled
+ by interrupts, ata_host_intr() invokes ata_qc_complete(), and,
+ for PIO tasks, pio_task invokes ata_qc_complete(). In error
+ cases, packet_task may also complete commands.
+ </para>
+ <para>
+ ata_qc_complete() does the following.
+ </para>
+
+ <orderedlist>
+
+ <listitem>
+ <para>
+ DMA memory is unmapped.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ ATA_QCFLAG_ACTIVE is clared from qc->flags.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ qc->complete_fn() callback is invoked. If the return value of
+ the callback is not zero. Completion is short circuited and
+ ata_qc_complete() returns.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ __ata_qc_complete() is called, which does
+ <orderedlist>
+
+ <listitem>
+ <para>
+ qc->flags is cleared to zero.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ ap->active_tag and qc->tag are poisoned.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ qc->waiting is claread &amp; completed (in that order).
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ qc is deallocated by clearing appropriate bit in ap->qactive.
+ </para>
+ </listitem>
+
+ </orderedlist>
+ </para>
+ </listitem>
+
+ </orderedlist>
+
+ <para>
+ So, it basically notifies upper layer and deallocates qc. One
+ exception is short-circuit path in #3 which is used by
+ atapi_qc_complete().
+ </para>
+ <para>
+ For all non-ATAPI commands, whether it fails or not, almost
+ the same code path is taken and very little error handling
+ takes place. A qc is completed with success status if it
+ succeeded, with failed status otherwise.
+ </para>
+ <para>
+ However, failed ATAPI commands require more handling as
+ REQUEST SENSE is needed to acquire sense data. If an ATAPI
+ command fails, ata_qc_complete() is invoked with error status,
+ which in turn invokes atapi_qc_complete() via
+ qc->complete_fn() callback.
+ </para>
+ <para>
+ This makes atapi_qc_complete() set scmd->result to
+ SAM_STAT_CHECK_CONDITION, complete the scmd and return 1. As
+ the sense data is empty but scmd->result is CHECK CONDITION,
+ SCSI midlayer will invoke EH for the scmd, and returning 1
+ makes ata_qc_complete() to return without deallocating the qc.
+ This leads us to ata_scsi_error() with partially completed qc.
+ </para>
+
+ </sect1>
+
+ <sect1><title>ata_scsi_error()</title>
+ <para>
+ ata_scsi_error() is the current hostt->eh_strategy_handler()
+ for libata. As discussed above, this will be entered in two
+ cases - timeout and ATAPI error completion. This function
+ calls low level libata driver's eng_timeout() callback, the
+ standard callback for which is ata_eng_timeout(). It checks
+ if a qc is active and calls ata_qc_timeout() on the qc if so.
+ Actual error handling occurs in ata_qc_timeout().
+ </para>
+ <para>
+ If EH is invoked for timeout, ata_qc_timeout() stops BMDMA and
+ completes the qc. Note that as we're currently in EH, we
+ cannot call scsi_done. As described in SCSI EH doc, a
+ recovered scmd should be either retried with
+ scsi_queue_insert() or finished with scsi_finish_command().
+ Here, we override qc->scsidone with scsi_finish_command() and
+ calls ata_qc_complete().
+ </para>
+ <para>
+ If EH is invoked due to a failed ATAPI qc, the qc here is
+ completed but not deallocated. The purpose of this
+ half-completion is to use the qc as place holder to make EH
+ code reach this place. This is a bit hackish, but it works.
+ </para>
+ <para>
+ Once control reaches here, the qc is deallocated by invoking
+ __ata_qc_complete() explicitly. Then, internal qc for REQUEST
+ SENSE is issued. Once sense data is acquired, scmd is
+ finished by directly invoking scsi_finish_command() on the
+ scmd. Note that as we already have completed and deallocated
+ the qc which was associated with the scmd, we don't need
+ to/cannot call ata_qc_complete() again.
+ </para>
+
+ </sect1>
+
+ <sect1><title>Problems with the current EH</title>
+
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ Error representation is too crude. Currently any and all
+ error conditions are represented with ATA STATUS and ERROR
+ registers. Errors which aren't ATA device errors are treated
+ as ATA device errors by setting ATA_ERR bit. Better error
+ descriptor which can properly represent ATA and other
+ errors/exceptions is needed.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ When handling timeouts, no action is taken to make device
+ forget about the timed out command and ready for new commands.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ EH handling via ata_scsi_error() is not properly protected
+ from usual command processing. On EH entrance, the device is
+ not in quiescent state. Timed out commands may succeed or
+ fail any time. pio_task and atapi_task may still be running.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Too weak error recovery. Devices / controllers causing HSM
+ mismatch errors and other errors quite often require reset to
+ return to known state. Also, advanced error handling is
+ necessary to support features like NCQ and hotplug.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ ATA errors are directly handled in the interrupt handler and
+ PIO errors in pio_task. This is problematic for advanced
+ error handling for the following reasons.
+ </para>
+ <para>
+ First, advanced error handling often requires context and
+ internal qc execution.
+ </para>
+ <para>
+ Second, even a simple failure (say, CRC error) needs
+ information gathering and could trigger complex error handling
+ (say, resetting &amp; reconfiguring). Having multiple code
+ paths to gather information, enter EH and trigger actions
+ makes life painful.
+ </para>
+ <para>
+ Third, scattered EH code makes implementing low level drivers
+ difficult. Low level drivers override libata callbacks. If
+ EH is scattered over several places, each affected callbacks
+ should perform its part of error handling. This can be error
+ prone and painful.
+ </para>
+ </listitem>
+
+ </itemizedlist>
+ </sect1>
+ </chapter>
+
<chapter id="libataExt">
<title>libata Library</title>
!Edrivers/scsi/libata-core.c
@@ -431,6 +787,722 @@ and other resources, etc.
!Idrivers/scsi/libata-scsi.c
</chapter>
+ <chapter id="ataExceptions">
+ <title>ATA errors &amp; exceptions</title>
+
+ <para>
+ This chapter tries to identify what error/exception conditions exist
+ for ATA/ATAPI devices and describe how they should be handled in
+ implementation-neutral way.
+ </para>
+
+ <para>
+ The term 'error' is used to describe conditions where either an
+ explicit error condition is reported from device or a command has
+ timed out.
+ </para>
+
+ <para>
+ The term 'exception' is either used to describe exceptional
+ conditions which are not errors (say, power or hotplug events), or
+ to describe both errors and non-error exceptional conditions. Where
+ explicit distinction between error and exception is necessary, the
+ term 'non-error exception' is used.
+ </para>
+
+ <sect1 id="excat">
+ <title>Exception categories</title>
+ <para>
+ Exceptions are described primarily with respect to legacy
+ taskfile + bus master IDE interface. If a controller provides
+ other better mechanism for error reporting, mapping those into
+ categories described below shouldn't be difficult.
+ </para>
+
+ <para>
+ In the following sections, two recovery actions - reset and
+ reconfiguring transport - are mentioned. These are described
+ further in <xref linkend="exrec"/>.
+ </para>
+
+ <sect2 id="excatHSMviolation">
+ <title>HSM violation</title>
+ <para>
+ This error is indicated when STATUS value doesn't match HSM
+ requirement during issuing or excution any ATA/ATAPI command.
+ </para>
+
+ <itemizedlist>
+ <title>Examples</title>
+
+ <listitem>
+ <para>
+ ATA_STATUS doesn't contain !BSY &amp;&amp; DRDY &amp;&amp; !DRQ while trying
+ to issue a command.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ !BSY &amp;&amp; !DRQ during PIO data transfer.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ DRQ on command completion.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ !BSY &amp;&amp; ERR after CDB tranfer starts but before the
+ last byte of CDB is transferred. ATA/ATAPI standard states
+ that &quot;The device shall not terminate the PACKET command
+ with an error before the last byte of the command packet has
+ been written&quot; in the error outputs description of PACKET
+ command and the state diagram doesn't include such
+ transitions.
+ </para>
+ </listitem>
+
+ </itemizedlist>
+
+ <para>
+ In these cases, HSM is violated and not much information
+ regarding the error can be acquired from STATUS or ERROR
+ register. IOW, this error can be anything - driver bug,
+ faulty device, controller and/or cable.
+ </para>
+
+ <para>
+ As HSM is violated, reset is necessary to restore known state.
+ Reconfiguring transport for lower speed might be helpful too
+ as transmission errors sometimes cause this kind of errors.
+ </para>
+ </sect2>
+
+ <sect2 id="excatDevErr">
+ <title>ATA/ATAPI device error (non-NCQ / non-CHECK CONDITION)</title>
+
+ <para>
+ These are errors detected and reported by ATA/ATAPI devices
+ indicating device problems. For this type of errors, STATUS
+ and ERROR register values are valid and describe error
+ condition. Note that some of ATA bus errors are detected by
+ ATA/ATAPI devices and reported using the same mechanism as
+ device errors. Those cases are described later in this
+ section.
+ </para>
+
+ <para>
+ For ATA commands, this type of errors are indicated by !BSY
+ &amp;&amp; ERR during command execution and on completion.
+ </para>
+
+ <para>For ATAPI commands,</para>
+
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ !BSY &amp;&amp; ERR &amp;&amp; ABRT right after issuing PACKET
+ indicates that PACKET command is not supported and falls in
+ this category.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ !BSY &amp;&amp; ERR(==CHK) &amp;&amp; !ABRT after the last
+ byte of CDB is transferred indicates CHECK CONDITION and
+ doesn't fall in this category.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ !BSY &amp;&amp; ERR(==CHK) &amp;&amp; ABRT after the last byte
+ of CDB is transferred *probably* indicates CHECK CONDITION and
+ doesn't fall in this category.
+ </para>
+ </listitem>
+
+ </itemizedlist>
+
+ <para>
+ Of errors detected as above, the followings are not ATA/ATAPI
+ device errors but ATA bus errors and should be handled
+ according to <xref linkend="excatATAbusErr"/>.
+ </para>
+
+ <variablelist>
+
+ <varlistentry>
+ <term>CRC error during data transfer</term>
+ <listitem>
+ <para>
+ This is indicated by ICRC bit in the ERROR register and
+ means that corruption occurred during data transfer. Upto
+ ATA/ATAPI-7, the standard specifies that this bit is only
+ applicable to UDMA transfers but ATA/ATAPI-8 draft revision
+ 1f says that the bit may be applicable to multiword DMA and
+ PIO.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>ABRT error during data transfer or on completion</term>
+ <listitem>
+ <para>
+ Upto ATA/ATAPI-7, the standard specifies that ABRT could be
+ set on ICRC errors and on cases where a device is not able
+ to complete a command. Combined with the fact that MWDMA
+ and PIO transfer errors aren't allowed to use ICRC bit upto
+ ATA/ATAPI-7, it seems to imply that ABRT bit alone could
+ indicate tranfer errors.
+ </para>
+ <para>
+ However, ATA/ATAPI-8 draft revision 1f removes the part
+ that ICRC errors can turn on ABRT. So, this is kind of
+ gray area. Some heuristics are needed here.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ <para>
+ ATA/ATAPI device errors can be further categorized as follows.
+ </para>
+
+ <variablelist>
+
+ <varlistentry>
+ <term>Media errors</term>
+ <listitem>
+ <para>
+ This is indicated by UNC bit in the ERROR register. ATA
+ devices reports UNC error only after certain number of
+ retries cannot recover the data, so there's nothing much
+ else to do other than notifying upper layer.
+ </para>
+ <para>
+ READ and WRITE commands report CHS or LBA of the first
+ failed sector but ATA/ATAPI standard specifies that the
+ amount of transferred data on error completion is
+ indeterminate, so we cannot assume that sectors preceding
+ the failed sector have been transferred and thus cannot
+ complete those sectors successfully as SCSI does.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Media changed / media change requested error</term>
+ <listitem>
+ <para>
+ &lt;&lt;TODO: fill here&gt;&gt;
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term>Address error</term>
+ <listitem>
+ <para>
+ This is indicated by IDNF bit in the ERROR register.
+ Report to upper layer.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term>Other errors</term>
+ <listitem>
+ <para>
+ This can be invalid command or parameter indicated by ABRT
+ ERROR bit or some other error condition. Note that ABRT
+ bit can indicate a lot of things including ICRC and Address
+ errors. Heuristics needed.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ <para>
+ Depending on commands, not all STATUS/ERROR bits are
+ applicable. These non-applicable bits are marked with
+ &quot;na&quot; in the output descriptions but upto ATA/ATAPI-7
+ no definition of &quot;na&quot; can be found. However,
+ ATA/ATAPI-8 draft revision 1f describes &quot;N/A&quot; as
+ follows.
+ </para>
+
+ <blockquote>
+ <variablelist>
+ <varlistentry><term>3.2.3.3a N/A</term>
+ <listitem>
+ <para>
+ A keyword the indicates a field has no defined value in
+ this standard and should not be checked by the host or
+ device. N/A fields should be cleared to zero.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </blockquote>
+
+ <para>
+ So, it seems reasonable to assume that &quot;na&quot; bits are
+ cleared to zero by devices and thus need no explicit masking.
+ </para>
+
+ </sect2>
+
+ <sect2 id="excatATAPIcc">
+ <title>ATAPI device CHECK CONDITION</title>
+
+ <para>
+ ATAPI device CHECK CONDITION error is indicated by set CHK bit
+ (ERR bit) in the STATUS register after the last byte of CDB is
+ transferred for a PACKET command. For this kind of errors,
+ sense data should be acquired to gather information regarding
+ the errors. REQUEST SENSE packet command should be used to
+ acquire sense data.
+ </para>
+
+ <para>
+ Once sense data is acquired, this type of errors can be
+ handled similary to other SCSI errors. Note that sense data
+ may indicate ATA bus error (e.g. Sense Key 04h HARDWARE ERROR
+ &amp;&amp; ASC/ASCQ 47h/00h SCSI PARITY ERROR). In such
+ cases, the error should be considered as an ATA bus error and
+ handled according to <xref linkend="excatATAbusErr"/>.
+ </para>
+
+ </sect2>
+
+ <sect2 id="excatNCQerr">
+ <title>ATA device error (NCQ)</title>
+
+ <para>
+ NCQ command error is indicated by cleared BSY and set ERR bit
+ during NCQ command phase (one or more NCQ commands
+ outstanding). Although STATUS and ERROR registers will
+ contain valid values describing the error, READ LOG EXT is
+ required to clear the error condition, determine which command
+ has failed and acquire more information.
+ </para>
+
+ <para>
+ READ LOG EXT Log Page 10h reports which tag has failed and
+ taskfile register values describing the error. With this
+ information the failed command can be handled as a normal ATA
+ command error as in <xref linkend="excatDevErr"/> and all
+ other in-flight commands must be retried. Note that this
+ retry should not be counted - it's likely that commands
+ retried this way would have completed normally if it were not
+ for the failed command.
+ </para>
+
+ <para>
+ Note that ATA bus errors can be reported as ATA device NCQ
+ errors. This should be handled as described in <xref
+ linkend="excatATAbusErr"/>.
+ </para>
+
+ <para>
+ If READ LOG EXT Log Page 10h fails or reports NQ, we're
+ thoroughly screwed. This condition should be treated
+ according to <xref linkend="excatHSMviolation"/>.
+ </para>
+
+ </sect2>
+
+ <sect2 id="excatATAbusErr">
+ <title>ATA bus error</title>
+
+ <para>
+ ATA bus error means that data corruption occurred during
+ transmission over ATA bus (SATA or PATA). This type of errors
+ can be indicated by
+ </para>
+
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ ICRC or ABRT error as described in <xref linkend="excatDevErr"/>.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Controller-specific error completion with error information
+ indicating transmission error.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ On some controllers, command timeout. In this case, there may
+ be a mechanism to determine that the timeout is due to
+ transmission error.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Unknown/random errors, timeouts and all sorts of weirdities.
+ </para>
+ </listitem>
+
+ </itemizedlist>
+
+ <para>
+ As described above, transmission errors can cause wide variety
+ of symptoms ranging from device ICRC error to random device
+ lockup, and, for many cases, there is no way to tell if an
+ error condition is due to transmission error or not;
+ therefore, it's necessary to employ some kind of heuristic
+ when dealing with errors and timeouts. For example,
+ encountering repetitive ABRT errors for known supported
+ command is likely to indicate ATA bus error.
+ </para>
+
+ <para>
+ Once it's determined that ATA bus errors have possibly
+ occurred, lowering ATA bus transmission speed is one of
+ actions which may alleviate the problem. See <xref
+ linkend="exrecReconf"/> for more information.
+ </para>
+
+ </sect2>
+
+ <sect2 id="excatPCIbusErr">
+ <title>PCI bus error</title>
+
+ <para>
+ Data corruption or other failures during transmission over PCI
+ (or other system bus). For standard BMDMA, this is indicated
+ by Error bit in the BMDMA Status register. This type of
+ errors must be logged as it indicates something is very wrong
+ with the system. Resetting host controller is recommended.
+ </para>
+
+ </sect2>
+
+ <sect2 id="excatLateCompletion">
+ <title>Late completion</title>
+
+ <para>
+ This occurs when timeout occurs and the timeout handler finds
+ out that the timed out command has completed successfully or
+ with error. This is usually caused by lost interrupts. This
+ type of errors must be logged. Resetting host controller is
+ recommended.
+ </para>
+
+ </sect2>
+
+ <sect2 id="excatUnknown">
+ <title>Unknown error (timeout)</title>
+
+ <para>
+ This is when timeout occurs and the command is still
+ processing or the host and device are in unknown state. When
+ this occurs, HSM could be in any valid or invalid state. To
+ bring the device to known state and make it forget about the
+ timed out command, resetting is necessary. The timed out
+ command may be retried.
+ </para>
+
+ <para>
+ Timeouts can also be caused by transmission errors. Refer to
+ <xref linkend="excatATAbusErr"/> for more details.
+ </para>
+
+ </sect2>
+
+ <sect2 id="excatHoplugPM">
+ <title>Hotplug and power management exceptions</title>
+
+ <para>
+ &lt;&lt;TODO: fill here&gt;&gt;
+ </para>
+
+ </sect2>
+
+ </sect1>
+
+ <sect1 id="exrec">
+ <title>EH recovery actions</title>
+
+ <para>
+ This section discusses several important recovery actions.
+ </para>
+
+ <sect2 id="exrecClr">
+ <title>Clearing error condition</title>
+
+ <para>
+ Many controllers require its error registers to be cleared by
+ error handler. Different controllers may have different
+ requirements.
+ </para>
+
+ <para>
+ For SATA, it's strongly recommended to clear at least SError
+ register during error handling.
+ </para>
+ </sect2>
+
+ <sect2 id="exrecRst">
+ <title>Reset</title>
+
+ <para>
+ During EH, resetting is necessary in the following cases.
+ </para>
+
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ HSM is in unknown or invalid state
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ HBA is in unknown or invalid state
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ EH needs to make HBA/device forget about in-flight commands
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ HBA/device behaves weirdly
+ </para>
+ </listitem>
+
+ </itemizedlist>
+
+ <para>
+ Resetting during EH might be a good idea regardless of error
+ condition to improve EH robustness. Whether to reset both or
+ either one of HBA and device depends on situation but the
+ following scheme is recommended.
+ </para>
+
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ When it's known that HBA is in ready state but ATA/ATAPI
+ device in in unknown state, reset only device.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ If HBA is in unknown state, reset both HBA and device.
+ </para>
+ </listitem>
+
+ </itemizedlist>
+
+ <para>
+ HBA resetting is implementation specific. For a controller
+ complying to taskfile/BMDMA PCI IDE, stopping active DMA
+ transaction may be sufficient iff BMDMA state is the only HBA
+ context. But even mostly taskfile/BMDMA PCI IDE complying
+ controllers may have implementation specific requirements and
+ mechanism to reset themselves. This must be addressed by
+ specific drivers.
+ </para>
+
+ <para>
+ OTOH, ATA/ATAPI standard describes in detail ways to reset
+ ATA/ATAPI devices.
+ </para>
+
+ <variablelist>
+
+ <varlistentry><term>PATA hardware reset</term>
+ <listitem>
+ <para>
+ This is hardware initiated device reset signalled with
+ asserted PATA RESET- signal. There is no standard way to
+ initiate hardware reset from software although some
+ hardware provides registers that allow driver to directly
+ tweak the RESET- signal.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term>Software reset</term>
+ <listitem>
+ <para>
+ This is achieved by turning CONTROL SRST bit on for at
+ least 5us. Both PATA and SATA support it but, in case of
+ SATA, this may require controller-specific support as the
+ second Register FIS to clear SRST should be transmitted
+ while BSY bit is still set. Note that on PATA, this resets
+ both master and slave devices on a channel.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term>EXECUTE DEVICE DIAGNOSTIC command</term>
+ <listitem>
+ <para>
+ Although ATA/ATAPI standard doesn't describe exactly, EDD
+ implies some level of resetting, possibly similar level
+ with software reset. Host-side EDD protocol can be handled
+ with normal command processing and most SATA controllers
+ should be able to handle EDD's just like other commands.
+ As in software reset, EDD affects both devices on a PATA
+ bus.
+ </para>
+ <para>
+ Although EDD does reset devices, this doesn't suit error
+ handling as EDD cannot be issued while BSY is set and it's
+ unclear how it will act when device is in unknown/weird
+ state.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term>ATAPI DEVICE RESET command</term>
+ <listitem>
+ <para>
+ This is very similar to software reset except that reset
+ can be restricted to the selected device without affecting
+ the other device sharing the cable.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term>SATA phy reset</term>
+ <listitem>
+ <para>
+ This is the preferred way of resetting a SATA device. In
+ effect, it's identical to PATA hardware reset. Note that
+ this can be done with the standard SCR Control register.
+ As such, it's usually easier to implement than software
+ reset.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ <para>
+ One more thing to consider when resetting devices is that
+ resetting clears certain configuration parameters and they
+ need to be set to their previous or newly adjusted values
+ after reset.
+ </para>
+
+ <para>
+ Parameters affected are.
+ </para>
+
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ CHS set up with INITIALIZE DEVICE PARAMETERS (seldomly used)
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Parameters set with SET FEATURES including transfer mode setting
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Block count set with SET MULTIPLE MODE
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Other parameters (SET MAX, MEDIA LOCK...)
+ </para>
+ </listitem>
+
+ </itemizedlist>
+
+ <para>
+ ATA/ATAPI standard specifies that some parameters must be
+ maintained across hardware or software reset, but doesn't
+ strictly specify all of them. Always reconfiguring needed
+ parameters after reset is required for robustness. Note that
+ this also applies when resuming from deep sleep (power-off).
+ </para>
+
+ <para>
+ Also, ATA/ATAPI standard requires that IDENTIFY DEVICE /
+ IDENTIFY PACKET DEVICE is issued after any configuration
+ parameter is updated or a hardware reset and the result used
+ for further operation. OS driver is required to implement
+ revalidation mechanism to support this.
+ </para>
+
+ </sect2>
+
+ <sect2 id="exrecReconf">
+ <title>Reconfigure transport</title>
+
+ <para>
+ For both PATA and SATA, a lot of corners are cut for cheap
+ connectors, cables or controllers and it's quite common to see
+ high transmission error rate. This can be mitigated by
+ lowering transmission speed.
+ </para>
+
+ <para>
+ The following is a possible scheme Jeff Garzik suggested.
+ </para>
+
+ <blockquote>
+ <para>
+ If more than $N (3?) transmission errors happen in 15 minutes,
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ if SATA, decrease SATA PHY speed. if speed cannot be decreased,
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ decrease UDMA xfer speed. if at UDMA0, switch to PIO4,
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ decrease PIO xfer speed. if at PIO3, complain, but continue
+ </para>
+ </listitem>
+ </itemizedlist>
+ </blockquote>
+
+ </sect2>
+
+ </sect1>
+
+ </chapter>
+
<chapter id="PiixInt">
<title>ata_piix Internals</title>
!Idrivers/scsi/ata_piix.c
diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index 7f43b04..237d54c 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -301,8 +301,84 @@ now, but you can do this to mark internal company procedures or just
point out some special detail about the sign-off.
+12) The canonical patch format
-12) More references for submitting patches
+The canonical patch subject line is:
+
+ Subject: [PATCH 001/123] subsystem: summary phrase
+
+The canonical patch message body contains the following:
+
+ - A "from" line specifying the patch author.
+
+ - An empty line.
+
+ - The body of the explanation, which will be copied to the
+ permanent changelog to describe this patch.
+
+ - The "Signed-off-by:" lines, described above, which will
+ also go in the changelog.
+
+ - A marker line containing simply "---".
+
+ - Any additional comments not suitable for the changelog.
+
+ - The actual patch (diff output).
+
+The Subject line format makes it very easy to sort the emails
+alphabetically by subject line - pretty much any email reader will
+support that - since because the sequence number is zero-padded,
+the numerical and alphabetic sort is the same.
+
+The "subsystem" in the email's Subject should identify which
+area or subsystem of the kernel is being patched.
+
+The "summary phrase" in the email's Subject should concisely
+describe the patch which that email contains. The "summary
+phrase" should not be a filename. Do not use the same "summary
+phrase" for every patch in a whole patch series.
+
+Bear in mind that the "summary phrase" of your email becomes
+a globally-unique identifier for that patch. It propagates
+all the way into the git changelog. The "summary phrase" may
+later be used in developer discussions which refer to the patch.
+People will want to google for the "summary phrase" to read
+discussion regarding that patch.
+
+A couple of example Subjects:
+
+ Subject: [patch 2/5] ext2: improve scalability of bitmap searching
+ Subject: [PATCHv2 001/207] x86: fix eflags tracking
+
+The "from" line must be the very first line in the message body,
+and has the form:
+
+ From: Original Author <author@example.com>
+
+The "from" line specifies who will be credited as the author of the
+patch in the permanent changelog. If the "from" line is missing,
+then the "From:" line from the email header will be used to determine
+the patch author in the changelog.
+
+The explanation body will be committed to the permanent source
+changelog, so should make sense to a competent reader who has long
+since forgotten the immediate details of the discussion that might
+have led to this patch.
+
+The "---" marker line serves the essential purpose of marking for patch
+handling tools where the changelog message ends.
+
+One good use for the additional comments after the "---" marker is for
+a diffstat, to show what files have changed, and the number of inserted
+and deleted lines per file. A diffstat is especially useful on bigger
+patches. Other comments relevant only to the moment or the maintainer,
+not suitable for the permanent changelog, should also go here.
+
+See more details on the proper patch format in the following
+references.
+
+
+13) More references for submitting patches
Andrew Morton, "The perfect patch" (tpp).
<http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt>
@@ -310,6 +386,14 @@ Andrew Morton, "The perfect patch" (tpp).
Jeff Garzik, "Linux kernel patch submission format."
<http://linux.yyz.us/patch-format.html>
+Greg KH, "How to piss off a kernel subsystem maintainer"
+ <http://www.kroah.com/log/2005/03/31/>
+
+Kernel Documentation/CodingStyle
+ <http://sosdg.org/~coywolf/lxr/source/Documentation/CodingStyle>
+
+Linus Torvald's mail on the canonical patch format:
+ <http://lkml.org/lkml/2005/4/7/183>
-----------------------------------
diff --git a/Documentation/block/biodoc.txt b/Documentation/block/biodoc.txt
index 6dd274d..2d65c21 100644
--- a/Documentation/block/biodoc.txt
+++ b/Documentation/block/biodoc.txt
@@ -906,9 +906,20 @@ Aside:
4. The I/O scheduler
-I/O schedulers are now per queue. They should be runtime switchable and modular
-but aren't yet. Jens has most bits to do this, but the sysfs implementation is
-missing.
+I/O scheduler, a.k.a. elevator, is implemented in two layers. Generic dispatch
+queue and specific I/O schedulers. Unless stated otherwise, elevator is used
+to refer to both parts and I/O scheduler to specific I/O schedulers.
+
+Block layer implements generic dispatch queue in ll_rw_blk.c and elevator.c.
+The generic dispatch queue is responsible for properly ordering barrier
+requests, requeueing, handling non-fs requests and all other subtleties.
+
+Specific I/O schedulers are responsible for ordering normal filesystem
+requests. They can also choose to delay certain requests to improve
+throughput or whatever purpose. As the plural form indicates, there are
+multiple I/O schedulers. They can be built as modules but at least one should
+be built inside the kernel. Each queue can choose different one and can also
+change to another one dynamically.
A block layer call to the i/o scheduler follows the convention elv_xxx(). This
calls elevator_xxx_fn in the elevator switch (drivers/block/elevator.c). Oh,
@@ -921,44 +932,36 @@ keeping work.
The functions an elevator may implement are: (* are mandatory)
elevator_merge_fn called to query requests for merge with a bio
-elevator_merge_req_fn " " " with another request
+elevator_merge_req_fn called when two requests get merged. the one
+ which gets merged into the other one will be
+ never seen by I/O scheduler again. IOW, after
+ being merged, the request is gone.
elevator_merged_fn called when a request in the scheduler has been
involved in a merge. It is used in the deadline
scheduler for example, to reposition the request
if its sorting order has changed.
-*elevator_next_req_fn returns the next scheduled request, or NULL
- if there are none (or none are ready).
+elevator_dispatch_fn fills the dispatch queue with ready requests.
+ I/O schedulers are free to postpone requests by
+ not filling the dispatch queue unless @force
+ is non-zero. Once dispatched, I/O schedulers
+ are not allowed to manipulate the requests -
+ they belong to generic dispatch queue.
-*elevator_add_req_fn called to add a new request into the scheduler
+elevator_add_req_fn called to add a new request into the scheduler
elevator_queue_empty_fn returns true if the merge queue is empty.
Drivers shouldn't use this, but rather check
if elv_next_request is NULL (without losing the
request if one exists!)
-elevator_remove_req_fn This is called when a driver claims ownership of
- the target request - it now belongs to the
- driver. It must not be modified or merged.
- Drivers must not lose the request! A subsequent
- call of elevator_next_req_fn must return the
- _next_ request.
-
-elevator_requeue_req_fn called to add a request to the scheduler. This
- is used when the request has alrnadebeen
- returned by elv_next_request, but hasn't
- completed. If this is not implemented then
- elevator_add_req_fn is called instead.
-
elevator_former_req_fn
elevator_latter_req_fn These return the request before or after the
one specified in disk sort order. Used by the
block layer to find merge possibilities.
-elevator_completed_req_fn called when a request is completed. This might
- come about due to being merged with another or
- when the device completes the request.
+elevator_completed_req_fn called when a request is completed.
elevator_may_queue_fn returns true if the scheduler wants to allow the
current context to queue a new request even if
@@ -967,13 +970,33 @@ elevator_may_queue_fn returns true if the scheduler wants to allow the
elevator_set_req_fn
elevator_put_req_fn Must be used to allocate and free any elevator
- specific storate for a request.
+ specific storage for a request.
+
+elevator_activate_req_fn Called when device driver first sees a request.
+ I/O schedulers can use this callback to
+ determine when actual execution of a request
+ starts.
+elevator_deactivate_req_fn Called when device driver decides to delay
+ a request by requeueing it.
elevator_init_fn
elevator_exit_fn Allocate and free any elevator specific storage
for a queue.
-4.2 I/O scheduler implementation
+4.2 Request flows seen by I/O schedulers
+All requests seens by I/O schedulers strictly follow one of the following three
+flows.
+
+ set_req_fn ->
+
+ i. add_req_fn -> (merged_fn ->)* -> dispatch_fn -> activate_req_fn ->
+ (deactivate_req_fn -> activate_req_fn ->)* -> completed_req_fn
+ ii. add_req_fn -> (merged_fn ->)* -> merge_req_fn
+ iii. [none]
+
+ -> put_req_fn
+
+4.3 I/O scheduler implementation
The generic i/o scheduler algorithm attempts to sort/merge/batch requests for
optimal disk scan and request servicing performance (based on generic
principles and device capabilities), optimized for:
@@ -993,18 +1016,7 @@ request in sort order to prevent binary tree lookups.
This arrangement is not a generic block layer characteristic however, so
elevators may implement queues as they please.
-ii. Last merge hint
-The last merge hint is part of the generic queue layer. I/O schedulers must do
-some management on it. For the most part, the most important thing is to make
-sure q->last_merge is cleared (set to NULL) when the request on it is no longer
-a candidate for merging (for example if it has been sent to the driver).
-
-The last merge performed is cached as a hint for the subsequent request. If
-sequential data is being submitted, the hint is used to perform merges without
-any scanning. This is not sufficient when there are multiple processes doing
-I/O though, so a "merge hash" is used by some schedulers.
-
-iii. Merge hash
+ii. Merge hash
AS and deadline use a hash table indexed by the last sector of a request. This
enables merging code to quickly look up "back merge" candidates, even when
multiple I/O streams are being performed at once on one disk.
@@ -1013,29 +1025,8 @@ multiple I/O streams are being performed at once on one disk.
are far less common than "back merges" due to the nature of most I/O patterns.
Front merges are handled by the binary trees in AS and deadline schedulers.
-iv. Handling barrier cases
-A request with flags REQ_HARDBARRIER or REQ_SOFTBARRIER must not be ordered
-around. That is, they must be processed after all older requests, and before
-any newer ones. This includes merges!
-
-In AS and deadline schedulers, barriers have the effect of flushing the reorder
-queue. The performance cost of this will vary from nothing to a lot depending
-on i/o patterns and device characteristics. Obviously they won't improve
-performance, so their use should be kept to a minimum.
-
-v. Handling insertion position directives
-A request may be inserted with a position directive. The directives are one of
-ELEVATOR_INSERT_BACK, ELEVATOR_INSERT_FRONT, ELEVATOR_INSERT_SORT.
-
-ELEVATOR_INSERT_SORT is a general directive for non-barrier requests.
-ELEVATOR_INSERT_BACK is used to insert a barrier to the back of the queue.
-ELEVATOR_INSERT_FRONT is used to insert a barrier to the front of the queue, and
-overrides the ordering requested by any previous barriers. In practice this is
-harmless and required, because it is used for SCSI requeueing. This does not
-require flushing the reorder queue, so does not impose a performance penalty.
-
-vi. Plugging the queue to batch requests in anticipation of opportunities for
- merge/sort optimizations
+iii. Plugging the queue to batch requests in anticipation of opportunities for
+ merge/sort optimizations
This is just the same as in 2.4 so far, though per-device unplugging
support is anticipated for 2.5. Also with a priority-based i/o scheduler,
@@ -1069,7 +1060,7 @@ Aside:
blk_kick_queue() to unplug a specific queue (right away ?)
or optionally, all queues, is in the plan.
-4.3 I/O contexts
+4.4 I/O contexts
I/O contexts provide a dynamically allocated per process data area. They may
be used in I/O schedulers, and in the block layer (could be used for IO statis,
priorities for example). See *io_context in drivers/block/ll_rw_blk.c, and
diff --git a/Documentation/connector/connector.txt b/Documentation/connector/connector.txt
index 54a0a14b..57a314b 100644
--- a/Documentation/connector/connector.txt
+++ b/Documentation/connector/connector.txt
@@ -131,3 +131,47 @@ Netlink itself is not reliable protocol, that means that messages can
be lost due to memory pressure or process' receiving queue overflowed,
so caller is warned must be prepared. That is why struct cn_msg [main
connector's message header] contains u32 seq and u32 ack fields.
+
+/*****************************************/
+Userspace usage.
+/*****************************************/
+2.6.14 has a new netlink socket implementation, which by default does not
+allow to send data to netlink groups other than 1.
+So, if to use netlink socket (for example using connector)
+with different group number userspace application must subscribe to
+that group. It can be achieved by following pseudocode:
+
+s = socket(PF_NETLINK, SOCK_DGRAM, NETLINK_CONNECTOR);
+
+l_local.nl_family = AF_NETLINK;
+l_local.nl_groups = 12345;
+l_local.nl_pid = 0;
+
+if (bind(s, (struct sockaddr *)&l_local, sizeof(struct sockaddr_nl)) == -1) {
+ perror("bind");
+ close(s);
+ return -1;
+}
+
+{
+ int on = l_local.nl_groups;
+ setsockopt(s, 270, 1, &on, sizeof(on));
+}
+
+Where 270 above is SOL_NETLINK, and 1 is a NETLINK_ADD_MEMBERSHIP socket
+option. To drop multicast subscription one should call above socket option
+with NETLINK_DROP_MEMBERSHIP parameter which is defined as 0.
+
+2.6.14 netlink code only allows to select a group which is less or equal to
+the maximum group number, which is used at netlink_kernel_create() time.
+In case of connector it is CN_NETLINK_USERS + 0xf, so if you want to use
+group number 12345, you must increment CN_NETLINK_USERS to that number.
+Additional 0xf numbers are allocated to be used by non-in-kernel users.
+
+Due to this limitation, group 0xffffffff does not work now, so one can
+not use add/remove connector's group notifications, but as far as I know,
+only cn_test.c test module used it.
+
+Some work in netlink area is still being done, so things can be changed in
+2.6.15 timeframe, if it will happen, documentation will be updated for that
+kernel.
diff --git a/Documentation/dell_rbu.txt b/Documentation/dell_rbu.txt
index 95d7f62..941343a 100644
--- a/Documentation/dell_rbu.txt
+++ b/Documentation/dell_rbu.txt
@@ -35,6 +35,7 @@ The driver load creates the following directories under the /sys file system.
/sys/class/firmware/dell_rbu/data
/sys/devices/platform/dell_rbu/image_type
/sys/devices/platform/dell_rbu/data
+/sys/devices/platform/dell_rbu/packet_size
The driver supports two types of update mechanism; monolithic and packetized.
These update mechanism depends upon the BIOS currently running on the system.
@@ -47,8 +48,26 @@ By default the driver uses monolithic memory for the update type. This can be
changed to packets during the driver load time by specifying the load
parameter image_type=packet. This can also be changed later as below
echo packet > /sys/devices/platform/dell_rbu/image_type
-Also echoing either mono ,packet or init in to image_type will free up the
-memory allocated by the driver.
+
+In packet update mode the packet size has to be given before any packets can
+be downloaded. It is done as below
+echo XXXX > /sys/devices/platform/dell_rbu/packet_size
+In the packet update mechanism, the user neesd to create a new file having
+packets of data arranged back to back. It can be done as follows
+The user creates packets header, gets the chunk of the BIOS image and
+placs it next to the packetheader; now, the packetheader + BIOS image chunk
+added to geather should match the specified packet_size. This makes one
+packet, the user needs to create more such packets out of the entire BIOS
+image file and then arrange all these packets back to back in to one single
+file.
+This file is then copied to /sys/class/firmware/dell_rbu/data.
+Once this file gets to the driver, the driver extracts packet_size data from
+the file and spreads it accross the physical memory in contiguous packet_sized
+space.
+This method makes sure that all the packets get to the driver in a single operation.
+
+In monolithic update the user simply get the BIOS image (.hdr file) and copies
+to the data file as is without any change to the BIOS image itself.
Do the steps below to download the BIOS image.
1) echo 1 > /sys/class/firmware/dell_rbu/loading
@@ -58,7 +77,10 @@ Do the steps below to download the BIOS image.
The /sys/class/firmware/dell_rbu/ entries will remain till the following is
done.
echo -1 > /sys/class/firmware/dell_rbu/loading.
-Until this step is completed the drivr cannot be unloaded.
+Until this step is completed the driver cannot be unloaded.
+Also echoing either mono ,packet or init in to image_type will free up the
+memory allocated by the driver.
+
If an user by accident executes steps 1 and 3 above without executing step 2;
it will make the /sys/class/firmware/dell_rbu/ entries to disappear.
The entries can be recreated by doing the following
@@ -66,15 +88,11 @@ echo init > /sys/devices/platform/dell_rbu/image_type
NOTE: echoing init in image_type does not change it original value.
Also the driver provides /sys/devices/platform/dell_rbu/data readonly file to
-read back the image downloaded. This is useful in case of packet update
-mechanism where the above steps 1,2,3 will repeated for every packet.
-By reading the /sys/devices/platform/dell_rbu/data file all packet data
-downloaded can be verified in a single file.
-The packets are arranged in this file one after the other in a FIFO order.
+read back the image downloaded.
NOTE:
-This driver requires a patch for firmware_class.c which has the addition
-of request_firmware_nowait_nohotplug function to wortk
+This driver requires a patch for firmware_class.c which has the modified
+request_firmware_nowait function.
Also after updating the BIOS image an user mdoe application neeeds to execute
code which message the BIOS update request to the BIOS. So on the next reboot
the BIOS knows about the new image downloaded and it updates it self.
diff --git a/Documentation/device-mapper/snapshot.txt b/Documentation/device-mapper/snapshot.txt
new file mode 100644
index 0000000..dca274f
--- /dev/null
+++ b/Documentation/device-mapper/snapshot.txt
@@ -0,0 +1,73 @@
+Device-mapper snapshot support
+==============================
+
+Device-mapper allows you, without massive data copying:
+
+*) To create snapshots of any block device i.e. mountable, saved states of
+the block device which are also writable without interfering with the
+original content;
+*) To create device "forks", i.e. multiple different versions of the
+same data stream.
+
+
+In both cases, dm copies only the chunks of data that get changed and
+uses a separate copy-on-write (COW) block device for storage.
+
+
+There are two dm targets available: snapshot and snapshot-origin.
+
+*) snapshot-origin <origin>
+
+which will normally have one or more snapshots based on it.
+You must create the snapshot-origin device before you can create snapshots.
+Reads will be mapped directly to the backing device. For each write, the
+original data will be saved in the <COW device> of each snapshot to keep
+its visible content unchanged, at least until the <COW device> fills up.
+
+
+*) snapshot <origin> <COW device> <persistent?> <chunksize>
+
+A snapshot is created of the <origin> block device. Changed chunks of
+<chunksize> sectors will be stored on the <COW device>. Writes will
+only go to the <COW device>. Reads will come from the <COW device> or
+from <origin> for unchanged data. <COW device> will often be
+smaller than the origin and if it fills up the snapshot will become
+useless and be disabled, returning errors. So it is important to monitor
+the amount of free space and expand the <COW device> before it fills up.
+
+<persistent?> is P (Persistent) or N (Not persistent - will not survive
+after reboot).
+
+
+How this is used by LVM2
+========================
+When you create the first LVM2 snapshot of a volume, four dm devices are used:
+
+1) a device containing the original mapping table of the source volume;
+2) a device used as the <COW device>;
+3) a "snapshot" device, combining #1 and #2, which is the visible snapshot
+ volume;
+4) the "original" volume (which uses the device number used by the original
+ source volume), whose table is replaced by a "snapshot-origin" mapping
+ from device #1.
+
+A fixed naming scheme is used, so with the following commands:
+
+lvcreate -L 1G -n base volumeGroup
+lvcreate -L 100M --snapshot -n snap volumeGroup/base
+
+we'll have this situation (with volumes in above order):
+
+# dmsetup table|grep volumeGroup
+
+volumeGroup-base-real: 0 2097152 linear 8:19 384
+volumeGroup-snap-cow: 0 204800 linear 8:19 2097536
+volumeGroup-snap: 0 2097152 snapshot 254:11 254:12 P 16
+volumeGroup-base: 0 2097152 snapshot-origin 254:11
+
+# ls -lL /dev/mapper/volumeGroup-*
+brw------- 1 root root 254, 11 29 ago 18:15 /dev/mapper/volumeGroup-base-real
+brw------- 1 root root 254, 12 29 ago 18:15 /dev/mapper/volumeGroup-snap-cow
+brw------- 1 root root 254, 13 29 ago 18:15 /dev/mapper/volumeGroup-snap
+brw------- 1 root root 254, 10 29 ago 18:14 /dev/mapper/volumeGroup-base
+
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 7086f0a..971589a 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -17,7 +17,7 @@ are specified on the kernel command line with the module name plus
usbcore.blinkenlights=1
-The text in square brackets at the beginning of the description state the
+The text in square brackets at the beginning of the description states the
restrictions on the kernel for the said kernel parameter to be valid. The
restrictions referred to are that the relevant option is valid if:
@@ -27,8 +27,8 @@ restrictions referred to are that the relevant option is valid if:
APM Advanced Power Management support is enabled.
AX25 Appropriate AX.25 support is enabled.
CD Appropriate CD support is enabled.
- DEVFS devfs support is enabled.
- DRM Direct Rendering Management support is enabled.
+ DEVFS devfs support is enabled.
+ DRM Direct Rendering Management support is enabled.
EDD BIOS Enhanced Disk Drive Services (EDD) is enabled
EFI EFI Partitioning (GPT) is enabled
EIDE EIDE/ATAPI support is enabled.
@@ -71,7 +71,7 @@ restrictions referred to are that the relevant option is valid if:
SERIAL Serial support is enabled.
SMP The kernel is an SMP kernel.
SPARC Sparc architecture is enabled.
- SWSUSP Software suspension is enabled.
+ SWSUSP Software suspend is enabled.
TS Appropriate touchscreen support is enabled.
USB USB support is enabled.
USBHID USB Human Interface Device support is enabled.
@@ -105,13 +105,13 @@ running once the system is up.
See header of drivers/scsi/53c7xx.c.
See also Documentation/scsi/ncr53c7xx.txt.
- acpi= [HW,ACPI] Advanced Configuration and Power Interface
- Format: { force | off | ht | strict }
+ acpi= [HW,ACPI] Advanced Configuration and Power Interface
+ Format: { force | off | ht | strict | noirq }
force -- enable ACPI if default was off
off -- disable ACPI if default was on
noirq -- do not use ACPI for IRQ routing
ht -- run only enough ACPI to enable Hyper Threading
- strict -- Be less tolerant of platforms that are not
+ strict -- Be less tolerant of platforms that are not
strictly ACPI specification compliant.
See also Documentation/pm.txt, pci=noacpi
@@ -119,20 +119,23 @@ running once the system is up.
acpi_sleep= [HW,ACPI] Sleep options
Format: { s3_bios, s3_mode }
See Documentation/power/video.txt
-
+
acpi_sci= [HW,ACPI] ACPI System Control Interrupt trigger mode
- Format: { level | edge | high | low }
+ Format: { level | edge | high | low }
- acpi_irq_balance [HW,ACPI] ACPI will balance active IRQs
- default in APIC mode
+ acpi_irq_balance [HW,ACPI]
+ ACPI will balance active IRQs
+ default in APIC mode
- acpi_irq_nobalance [HW,ACPI] ACPI will not move active IRQs (default)
- default in PIC mode
+ acpi_irq_nobalance [HW,ACPI]
+ ACPI will not move active IRQs (default)
+ default in PIC mode
- acpi_irq_pci= [HW,ACPI] If irq_balance, Clear listed IRQs for use by PCI
+ acpi_irq_pci= [HW,ACPI] If irq_balance, clear listed IRQs for
+ use by PCI
Format: <irq>,<irq>...
- acpi_irq_isa= [HW,ACPI] If irq_balance, Mark listed IRQs used by ISA
+ acpi_irq_isa= [HW,ACPI] If irq_balance, mark listed IRQs used by ISA
Format: <irq>,<irq>...
acpi_osi= [HW,ACPI] empty param disables _OSI
@@ -145,14 +148,14 @@ running once the system is up.
acpi_dbg_layer= [HW,ACPI]
Format: <int>
- Each bit of the <int> indicates an acpi debug layer,
+ Each bit of the <int> indicates an ACPI debug layer,
1: enable, 0: disable. It is useful for boot time
debugging. After system has booted up, it can be set
via /proc/acpi/debug_layer.
acpi_dbg_level= [HW,ACPI]
Format: <int>
- Each bit of the <int> indicates an acpi debug level,
+ Each bit of the <int> indicates an ACPI debug level,
1: enable, 0: disable. It is useful for boot time
debugging. After system has booted up, it can be set
via /proc/acpi/debug_level.
@@ -161,12 +164,13 @@ running once the system is up.
acpi_generic_hotkey [HW,ACPI]
Allow consolidated generic hotkey driver to
- over-ride platform specific driver.
+ override platform specific driver.
See also Documentation/acpi-hotkey.txt.
enable_timer_pin_1 [i386,x86-64]
Enable PIN 1 of APIC timer
- Can be useful to work around chipset bugs (in particular on some ATI chipsets)
+ Can be useful to work around chipset bugs
+ (in particular on some ATI chipsets).
The kernel tries to set a reasonable default.
disable_timer_pin_1 [i386,x86-64]
@@ -182,7 +186,7 @@ running once the system is up.
adlib= [HW,OSS]
Format: <io>
-
+
advansys= [HW,SCSI]
See header of drivers/scsi/advansys.c.
@@ -192,7 +196,7 @@ running once the system is up.
aedsp16= [HW,OSS] Audio Excel DSP 16
Format: <io>,<irq>,<dma>,<mss_io>,<mpu_io>,<mpu_irq>
See also header of sound/oss/aedsp16.c.
-
+
aha152x= [HW,SCSI]
See Documentation/scsi/aha152x.txt.
@@ -205,10 +209,6 @@ running once the system is up.
aic79xx= [HW,SCSI]
See Documentation/scsi/aic79xx.txt.
- AM53C974= [HW,SCSI]
- Format: <host-scsi-id>,<target-scsi-id>,<max-rate>,<max-offset>
- See also header of drivers/scsi/AM53C974.c.
-
amijoy.map= [HW,JOY] Amiga joystick support
Map of devices attached to JOY0DAT and JOY1DAT
Format: <a>,<b>
@@ -219,23 +219,24 @@ running once the system is up.
connected to one of 16 gameports
Format: <type1>,<type2>,..<type16>
- apc= [HW,SPARC] Power management functions (SPARCstation-4/5 + deriv.)
+ apc= [HW,SPARC]
+ Power management functions (SPARCstation-4/5 + deriv.)
Format: noidle
Disable APC CPU standby support. SPARCstation-Fox does
not play well with APC CPU idle - disable it if you have
APC and your system crashes randomly.
- apic= [APIC,i386] Change the output verbosity whilst booting
+ apic= [APIC,i386] Change the output verbosity whilst booting
Format: { quiet (default) | verbose | debug }
Change the amount of debugging information output
when initialising the APIC and IO-APIC components.
-
+
apm= [APM] Advanced Power Management
See header of arch/i386/kernel/apm.c.
applicom= [HW]
Format: <mem>,<irq>
-
+
arcrimi= [HW,NET] ARCnet - "RIM I" (entirely mem-mapped) cards
Format: <io>,<irq>,<nodeID>
@@ -250,38 +251,40 @@ running once the system is up.
atkbd.reset= [HW] Reset keyboard during initialization
- atkbd.set= [HW] Select keyboard code set
- Format: <int> (2 = AT (default) 3 = PS/2)
+ atkbd.set= [HW] Select keyboard code set
+ Format: <int> (2 = AT (default), 3 = PS/2)
atkbd.scroll= [HW] Enable scroll wheel on MS Office and similar
keyboards
atkbd.softraw= [HW] Choose between synthetic and real raw mode
Format: <bool> (0 = real, 1 = synthetic (default))
-
- atkbd.softrepeat=
- [HW] Use software keyboard repeat
+
+ atkbd.softrepeat= [HW]
+ Use software keyboard repeat
autotest [IA64]
awe= [HW,OSS] AWE32/SB32/AWE64 wave table synth
Format: <io>,<memsize>,<isapnp>
-
+
aztcd= [HW,CD] Aztech CD268 CDROM driver
Format: <io>,0x79 (?)
baycom_epp= [HW,AX25]
Format: <io>,<mode>
-
+
baycom_par= [HW,AX25] BayCom Parallel Port AX.25 Modem
Format: <io>,<mode>
See header of drivers/net/hamradio/baycom_par.c.
- baycom_ser_fdx= [HW,AX25] BayCom Serial Port AX.25 Modem (Full Duplex Mode)
+ baycom_ser_fdx= [HW,AX25]
+ BayCom Serial Port AX.25 Modem (Full Duplex Mode)
Format: <io>,<irq>,<mode>[,<baud>]
See header of drivers/net/hamradio/baycom_ser_fdx.c.
- baycom_ser_hdx= [HW,AX25] BayCom Serial Port AX.25 Modem (Half Duplex Mode)
+ baycom_ser_hdx= [HW,AX25]
+ BayCom Serial Port AX.25 Modem (Half Duplex Mode)
Format: <io>,<irq>,<mode>
See header of drivers/net/hamradio/baycom_ser_hdx.c.
@@ -292,7 +295,8 @@ running once the system is up.
blkmtd_count=
bttv.card= [HW,V4L] bttv (bt848 + bt878 based grabber cards)
- bttv.radio= Most important insmod options are available as kernel args too.
+ bttv.radio= Most important insmod options are available as
+ kernel args too.
bttv.pll= See Documentation/video4linux/bttv/Insmod-options
bttv.tuner= and Documentation/video4linux/bttv/CARDLIST
@@ -318,15 +322,17 @@ running once the system is up.
checkreqprot [SELINUX] Set initial checkreqprot flag value.
Format: { "0" | "1" }
See security/selinux/Kconfig help text.
- 0 -- check protection applied by kernel (includes any implied execute protection).
+ 0 -- check protection applied by kernel (includes
+ any implied execute protection).
1 -- check protection requested by application.
Default value is set via a kernel config option.
- Value can be changed at runtime via /selinux/checkreqprot.
-
- clock= [BUGS=IA-32, HW] gettimeofday timesource override.
+ Value can be changed at runtime via
+ /selinux/checkreqprot.
+
+ clock= [BUGS=IA-32,HW] gettimeofday timesource override.
Forces specified timesource (if avaliable) to be used
- when calculating gettimeofday(). If specicified timesource
- is not avalible, it defaults to PIT.
+ when calculating gettimeofday(). If specicified
+ timesource is not avalible, it defaults to PIT.
Format: { pit | tsc | cyclone | pmtmr }
hpet= [IA-32,HPET] option to disable HPET and use PIT.
@@ -336,17 +342,19 @@ running once the system is up.
Format: { auto | [<io>,][<irq>] }
com20020= [HW,NET] ARCnet - COM20020 chipset
- Format: <io>[,<irq>[,<nodeID>[,<backplane>[,<ckp>[,<timeout>]]]]]
+ Format:
+ <io>[,<irq>[,<nodeID>[,<backplane>[,<ckp>[,<timeout>]]]]]
com90io= [HW,NET] ARCnet - COM90xx chipset (IO-mapped buffers)
Format: <io>[,<irq>]
- com90xx= [HW,NET] ARCnet - COM90xx chipset (memory-mapped buffers)
+ com90xx= [HW,NET]
+ ARCnet - COM90xx chipset (memory-mapped buffers)
Format: <io>[,<irq>[,<memstart>]]
condev= [HW,S390] console device
conmode=
-
+
console= [KNL] Output console device and options.
tty<n> Use the virtual console device <n>.
@@ -367,7 +375,8 @@ running once the system is up.
options are the same as for ttyS, above.
cpcihp_generic= [HW,PCI] Generic port I/O CompactPCI driver
- Format: <first_slot>,<last_slot>,<port>,<enum_bit>[,<debug>]
+ Format:
+ <first_slot>,<last_slot>,<port>,<enum_bit>[,<debug>]
cpia_pp= [HW,PPT]
Format: { parport<nr> | auto | none }
@@ -384,10 +393,10 @@ running once the system is up.
cs89x0_media= [HW,NET]
Format: { rj45 | aui | bnc }
-
+
cyclades= [HW,SERIAL] Cyclades multi-serial port adapter.
-
- dasd= [HW,NET]
+
+ dasd= [HW,NET]
See header of drivers/s390/block/dasd_devmap.c.
db9.dev[2|3]= [HW,JOY] Multisystem joystick support via parallel port
@@ -406,7 +415,7 @@ running once the system is up.
dhash_entries= [KNL]
Set number of hash buckets for dentry cache.
-
+
digi= [HW,SERIAL]
IO parameters + enable/disable command.
@@ -424,11 +433,11 @@ running once the system is up.
dtc3181e= [HW,SCSI]
- earlyprintk= [IA-32, X86-64]
+ earlyprintk= [IA-32,X86-64]
earlyprintk=vga
earlyprintk=serial[,ttySn[,baudrate]]
- Append ,keep to not disable it when the real console
+ Append ",keep" to not disable it when the real console
takes over.
Only vga or serial at a time, not both.
@@ -451,7 +460,7 @@ running once the system is up.
Format: {"of[f]" | "sk[ipmbr]"}
See comment in arch/i386/boot/edd.S
- eicon= [HW,ISDN]
+ eicon= [HW,ISDN]
Format: <id>,<membase>,<irq>
eisa_irq_edge= [PARISC,HW]
@@ -462,12 +471,13 @@ running once the system is up.
arch/i386/kernel/cpu/cpufreq/elanfreq.c.
elevator= [IOSCHED]
- Format: {"as"|"cfq"|"deadline"|"noop"}
- See Documentation/block/as-iosched.txt
- and Documentation/block/deadline-iosched.txt for details.
+ Format: {"as" | "cfq" | "deadline" | "noop"}
+ See Documentation/block/as-iosched.txt and
+ Documentation/block/deadline-iosched.txt for details.
+
elfcorehdr= [IA-32]
- Specifies physical address of start of kernel core image
- elf header.
+ Specifies physical address of start of kernel core
+ image elf header.
See Documentation/kdump.txt for details.
enforcing [SELINUX] Set initial enforcing status.
@@ -485,7 +495,7 @@ running once the system is up.
es1371= [HW,OSS]
Format: <spdif>,[<nomix>,[<amplifier>]]
See also header of sound/oss/es1371.c.
-
+
ether= [HW,NET] Ethernet cards parameters
This option is obsoleted by the "netdev=" option, which
has equivalent usage. See its documentation for details.
@@ -526,12 +536,13 @@ running once the system is up.
gus= [HW,OSS]
Format: <io>,<irq>,<dma>,<dma16>
-
+
gvp11= [HW,SCSI]
hashdist= [KNL,NUMA] Large hashes allocated during boot
are distributed across NUMA nodes. Defaults on
for IA-64, off otherwise.
+ Format: 0 | 1 (for off | on)
hcl= [IA-64] SGI's Hardware Graph compatibility layer
@@ -595,13 +606,13 @@ running once the system is up.
ide?= [HW] (E)IDE subsystem
Format: ide?=noprobe or chipset specific parameters.
See Documentation/ide.txt.
-
+
idebus= [HW] (E)IDE subsystem - VLB/PCI bus speed
See Documentation/ide.txt.
idle= [HW]
Format: idle=poll or idle=halt
-
+
ihash_entries= [KNL]
Set number of hash buckets for inode cache.
@@ -649,7 +660,7 @@ running once the system is up.
firmware running.
isapnp= [ISAPNP]
- Format: <RDP>, <reset>, <pci_scan>, <verbosity>
+ Format: <RDP>,<reset>,<pci_scan>,<verbosity>
isolcpus= [KNL,SMP] Isolate CPUs from the general scheduler.
Format: <cpu number>,...,<cpu number>
@@ -661,32 +672,33 @@ running once the system is up.
"number of CPUs in system - 1".
This option is the preferred way to isolate CPUs. The
- alternative - manually setting the CPU mask of all tasks
- in the system can cause problems and suboptimal load
- balancer performance.
+ alternative -- manually setting the CPU mask of all
+ tasks in the system -- can cause problems and
+ suboptimal load balancer performance.
isp16= [HW,CD]
Format: <io>,<irq>,<dma>,<setup>
- iucv= [HW,NET]
+ iucv= [HW,NET]
js= [HW,JOY] Analog joystick
See Documentation/input/joystick.txt.
keepinitrd [HW,ARM]
- kstack=N [IA-32, X86-64] Print N words from the kernel stack
+ kstack=N [IA-32,X86-64] Print N words from the kernel stack
in oops dumps.
l2cr= [PPC]
- lapic [IA-32,APIC] Enable the local APIC even if BIOS disabled it.
+ lapic [IA-32,APIC] Enable the local APIC even if BIOS
+ disabled it.
lasi= [HW,SCSI] PARISC LASI driver for the 53c700 chip
Format: addr:<io>,irq:<irq>
- llsc*= [IA64]
- See function print_params() in arch/ia64/sn/kernel/llsc4.c.
+ llsc*= [IA64] See function print_params() in
+ arch/ia64/sn/kernel/llsc4.c.
load_ramdisk= [RAM] List of ramdisks to load from floppy
See Documentation/ramdisk.txt.
@@ -713,8 +725,9 @@ running once the system is up.
7 (KERN_DEBUG) debug-level messages
log_buf_len=n Sets the size of the printk ring buffer, in bytes.
- Format is n, nk, nM. n must be a power of two. The
- default is set in kernel config.
+ Format: { n | nk | nM }
+ n must be a power of two. The default size
+ is set in the kernel config file.
lp=0 [LP] Specify parallel ports to use, e.g,
lp=port[,port...] lp=none,parport0 (lp0 not configured, lp1 uses
@@ -750,23 +763,23 @@ running once the system is up.
ltpc= [NET]
Format: <io>,<irq>,<dma>
- mac5380= [HW,SCSI]
- Format: <can_queue>,<cmd_per_lun>,<sg_tablesize>,<hostid>,<use_tags>
+ mac5380= [HW,SCSI] Format:
+ <can_queue>,<cmd_per_lun>,<sg_tablesize>,<hostid>,<use_tags>
- mac53c9x= [HW,SCSI]
- Format: <num_esps>,<disconnect>,<nosync>,<can_queue>,<cmd_per_lun>,<sg_tablesize>,<hostid>,<use_tags>
+ mac53c9x= [HW,SCSI] Format:
+ <num_esps>,<disconnect>,<nosync>,<can_queue>,<cmd_per_lun>,<sg_tablesize>,<hostid>,<use_tags>
- machvec= [IA64]
- Force the use of a particular machine-vector (machvec) in a generic
- kernel. Example: machvec=hpzx1_swiotlb
+ machvec= [IA64] Force the use of a particular machine-vector
+ (machvec) in a generic kernel.
+ Example: machvec=hpzx1_swiotlb
- mad16= [HW,OSS]
- Format: <io>,<irq>,<dma>,<dma16>,<mpu_io>,<mpu_irq>,<joystick>
+ mad16= [HW,OSS] Format:
+ <io>,<irq>,<dma>,<dma16>,<mpu_io>,<mpu_irq>,<joystick>
maui= [HW,OSS]
Format: <io>,<irq>
-
- max_loop= [LOOP] Maximum number of loopback devices that can
+
+ max_loop= [LOOP] Maximum number of loopback devices that can
be mounted
Format: <1-256>
@@ -776,11 +789,11 @@ running once the system is up.
max_addr=[KMG] [KNL,BOOT,ia64] All physical memory greater than or
equal to this physical address is ignored.
- max_luns= [SCSI] Maximum number of LUNs to probe
+ max_luns= [SCSI] Maximum number of LUNs to probe.
Should be between 1 and 2^32-1.
max_report_luns=
- [SCSI] Maximum number of LUNs received
+ [SCSI] Maximum number of LUNs received.
Should be between 1 and 16384.
mca-pentium [BUGS=IA-32]
@@ -796,11 +809,11 @@ running once the system is up.
md= [HW] RAID subsystems devices and level
See Documentation/md.txt.
-
+
mdacon= [MDA]
Format: <first>,<last>
Specifies range of consoles to be captured by the MDA.
-
+
mem=nn[KMG] [KNL,BOOT] Force usage of a specific amount of memory
Amount of memory to be used when the kernel is not able
to see the whole system memory or for test.
@@ -851,15 +864,15 @@ running once the system is up.
MTD_Partition= [MTD]
Format: <name>,<region-number>,<size>,<offset>
- MTD_Region= [MTD]
- Format: <name>,<region-number>[,<base>,<size>,<buswidth>,<altbuswidth>]
+ MTD_Region= [MTD] Format:
+ <name>,<region-number>[,<base>,<size>,<buswidth>,<altbuswidth>]
mtdparts= [MTD]
See drivers/mtd/cmdline.c.
mtouchusb.raw_coordinates=
- [HW] Make the MicroTouch USB driver use raw coordinates ('y', default)
- or cooked coordinates ('n')
+ [HW] Make the MicroTouch USB driver use raw coordinates
+ ('y', default) or cooked coordinates ('n')
n2= [NET] SDL Inc. RISCom/N2 synchronous serial card
@@ -880,7 +893,9 @@ running once the system is up.
Format: <irq>,<io>,<mem_start>,<mem_end>,<name>
Note that mem_start is often overloaded to mean
something different and driver-specific.
-
+ This usage is only documented in each driver source
+ file if at all.
+
nfsaddrs= [NFS]
See Documentation/nfsroot.txt.
@@ -893,8 +908,8 @@ running once the system is up.
emulation library even if a 387 maths coprocessor
is present.
- noalign [KNL,ARM]
-
+ noalign [KNL,ARM]
+
noapic [SMP,APIC] Tells the kernel to not make use of any
IOAPICs that may be present in the system.
@@ -905,19 +920,19 @@ running once the system is up.
on "Classic" PPC cores.
nocache [ARM]
-
+
nodisconnect [HW,SCSI,M68K] Disables SCSI disconnects.
noexec [IA-64]
- noexec [IA-32, X86-64]
+ noexec [IA-32,X86-64]
noexec=on: enable non-executable mappings (default)
noexec=off: disable nn-executable mappings
nofxsr [BUGS=IA-32]
nohlt [BUGS=ARM]
-
+
no-hlt [BUGS=IA-32] Tells the kernel that the hlt
instruction doesn't work correctly and not to
use it.
@@ -948,8 +963,9 @@ running once the system is up.
noresidual [PPC] Don't use residual data on PReP machines.
- noresume [SWSUSP] Disables resume and restore original swap space.
-
+ noresume [SWSUSP] Disables resume and restores original swap
+ space.
+
no-scroll [VGA] Disables scrollback.
This is required for the Braillex ib80-piezo Braille
reader made by F.H. Papenmeier (Germany).
@@ -965,16 +981,16 @@ running once the system is up.
nousb [USB] Disable the USB subsystem
nowb [ARM]
-
+
opl3= [HW,OSS]
Format: <io>
opl3sa= [HW,OSS]
Format: <io>,<irq>,<dma>,<dma2>,<mpu_io>,<mpu_irq>
- opl3sa2= [HW,OSS]
- Format: <io>,<irq>,<dma>,<dma2>,<mss_io>,<mpu_io>,<ymode>,<loopback>[,<isapnp>,<multiple]
-
+ opl3sa2= [HW,OSS] Format:
+ <io>,<irq>,<dma>,<dma2>,<mss_io>,<mpu_io>,<ymode>,<loopback>[,<isapnp>,<multiple]
+
oprofile.timer= [HW]
Use timer interrupt instead of performance counters
@@ -993,36 +1009,33 @@ running once the system is up.
Format: <parport#>
parkbd.mode= [HW] Parallel port keyboard adapter mode of operation,
0 for XT, 1 for AT (default is AT).
- Format: <mode>
-
- parport=0 [HW,PPT] Specify parallel ports. 0 disables.
- parport=auto Use 'auto' to force the driver to use
- parport=0xBBB[,IRQ[,DMA]] any IRQ/DMA settings detected (the
- default is to ignore detected IRQ/DMA
- settings because of possible
- conflicts). You can specify the base
- address, IRQ, and DMA settings; IRQ and
- DMA should be numbers, or 'auto' (for
- using detected settings on that
- particular port), or 'nofifo' (to avoid
- using a FIFO even if it is detected).
- Parallel ports are assigned in the
- order they are specified on the command
- line, starting with parport0.
-
- parport_init_mode=
- [HW,PPT] Configure VIA parallel port to
- operate in specific mode. This is
- necessary on Pegasos computer where
- firmware has no options for setting up
- parallel port mode and sets it to
- spp. Currently this function knows
- 686a and 8231 chips.
+ Format: <mode>
+
+ parport= [HW,PPT] Specify parallel ports. 0 disables.
+ Format: { 0 | auto | 0xBBB[,IRQ[,DMA]] }
+ Use 'auto' to force the driver to use any
+ IRQ/DMA settings detected (the default is to
+ ignore detected IRQ/DMA settings because of
+ possible conflicts). You can specify the base
+ address, IRQ, and DMA settings; IRQ and DMA
+ should be numbers, or 'auto' (for using detected
+ settings on that particular port), or 'nofifo'
+ (to avoid using a FIFO even if it is detected).
+ Parallel ports are assigned in the order they
+ are specified on the command line, starting
+ with parport0.
+
+ parport_init_mode= [HW,PPT]
+ Configure VIA parallel port to operate in
+ a specific mode. This is necessary on Pegasos
+ computer where firmware has no options for setting
+ up parallel port mode and sets it to spp.
+ Currently this function knows 686a and 8231 chips.
Format: [spp|ps2|epp|ecp|ecpepp]
- pas2= [HW,OSS]
- Format: <io>,<irq>,<dma>,<dma16>,<sb_io>,<sb_irq>,<sb_dma>,<sb_dma16>
-
+ pas2= [HW,OSS] Format:
+ <io>,<irq>,<dma>,<dma16>,<sb_io>,<sb_irq>,<sb_dma>,<sb_dma16>
+
pas16= [HW,SCSI]
See header of drivers/scsi/pas16.c.
@@ -1032,64 +1045,67 @@ running once the system is up.
See header of drivers/block/paride/pcd.c.
See also Documentation/paride.txt.
- pci=option[,option...] [PCI] various PCI subsystem options:
- off [IA-32] don't probe for the PCI bus
- bios [IA-32] force use of PCI BIOS, don't access
- the hardware directly. Use this if your machine
- has a non-standard PCI host bridge.
- nobios [IA-32] disallow use of PCI BIOS, only direct
- hardware access methods are allowed. Use this
- if you experience crashes upon bootup and you
- suspect they are caused by the BIOS.
- conf1 [IA-32] Force use of PCI Configuration Mechanism 1.
- conf2 [IA-32] Force use of PCI Configuration Mechanism 2.
- nosort [IA-32] Don't sort PCI devices according to
- order given by the PCI BIOS. This sorting is done
- to get a device order compatible with older kernels.
- biosirq [IA-32] Use PCI BIOS calls to get the interrupt
- routing table. These calls are known to be buggy
- on several machines and they hang the machine when used,
- but on other computers it's the only way to get the
- interrupt routing table. Try this option if the kernel
- is unable to allocate IRQs or discover secondary PCI
- buses on your motherboard.
- rom [IA-32] Assign address space to expansion ROMs.
- Use with caution as certain devices share address
- decoders between ROMs and other resources.
- irqmask=0xMMMM [IA-32] Set a bit mask of IRQs allowed to be assigned
- automatically to PCI devices. You can make the kernel
- exclude IRQs of your ISA cards this way.
+ pci=option[,option...] [PCI] various PCI subsystem options:
+ off [IA-32] don't probe for the PCI bus
+ bios [IA-32] force use of PCI BIOS, don't access
+ the hardware directly. Use this if your machine
+ has a non-standard PCI host bridge.
+ nobios [IA-32] disallow use of PCI BIOS, only direct
+ hardware access methods are allowed. Use this
+ if you experience crashes upon bootup and you
+ suspect they are caused by the BIOS.
+ conf1 [IA-32] Force use of PCI Configuration
+ Mechanism 1.
+ conf2 [IA-32] Force use of PCI Configuration
+ Mechanism 2.
+ nosort [IA-32] Don't sort PCI devices according to
+ order given by the PCI BIOS. This sorting is
+ done to get a device order compatible with
+ older kernels.
+ biosirq [IA-32] Use PCI BIOS calls to get the interrupt
+ routing table. These calls are known to be buggy
+ on several machines and they hang the machine
+ when used, but on other computers it's the only
+ way to get the interrupt routing table. Try
+ this option if the kernel is unable to allocate
+ IRQs or discover secondary PCI buses on your
+ motherboard.
+ rom [IA-32] Assign address space to expansion ROMs.
+ Use with caution as certain devices share
+ address decoders between ROMs and other
+ resources.
+ irqmask=0xMMMM [IA-32] Set a bit mask of IRQs allowed to be
+ assigned automatically to PCI devices. You can
+ make the kernel exclude IRQs of your ISA cards
+ this way.
pirqaddr=0xAAAAA [IA-32] Specify the physical address
- of the PIRQ table (normally generated
- by the BIOS) if it is outside the
- F0000h-100000h range.
- lastbus=N [IA-32] Scan all buses till bus #N. Can be useful
- if the kernel is unable to find your secondary buses
- and you want to tell it explicitly which ones they are.
- assign-busses [IA-32] Always assign all PCI bus
- numbers ourselves, overriding
- whatever the firmware may have
- done.
- usepirqmask [IA-32] Honor the possible IRQ mask
- stored in the BIOS $PIR table. This is
- needed on some systems with broken
- BIOSes, notably some HP Pavilion N5400
- and Omnibook XE3 notebooks. This will
- have no effect if ACPI IRQ routing is
- enabled.
- noacpi [IA-32] Do not use ACPI for IRQ routing
- or for PCI scanning.
- routeirq Do IRQ routing for all PCI devices.
- This is normally done in pci_enable_device(),
- so this option is a temporary workaround
- for broken drivers that don't call it.
-
- firmware [ARM] Do not re-enumerate the bus but
- instead just use the configuration
- from the bootloader. This is currently
- used on IXP2000 systems where the
- bus has to be configured a certain way
- for adjunct CPUs.
+ of the PIRQ table (normally generated
+ by the BIOS) if it is outside the
+ F0000h-100000h range.
+ lastbus=N [IA-32] Scan all buses thru bus #N. Can be
+ useful if the kernel is unable to find your
+ secondary buses and you want to tell it
+ explicitly which ones they are.
+ assign-busses [IA-32] Always assign all PCI bus
+ numbers ourselves, overriding
+ whatever the firmware may have done.
+ usepirqmask [IA-32] Honor the possible IRQ mask stored
+ in the BIOS $PIR table. This is needed on
+ some systems with broken BIOSes, notably
+ some HP Pavilion N5400 and Omnibook XE3
+ notebooks. This will have no effect if ACPI
+ IRQ routing is enabled.
+ noacpi [IA-32] Do not use ACPI for IRQ routing
+ or for PCI scanning.
+ routeirq Do IRQ routing for all PCI devices.
+ This is normally done in pci_enable_device(),
+ so this option is a temporary workaround
+ for broken drivers that don't call it.
+ firmware [ARM] Do not re-enumerate the bus but instead
+ just use the configuration from the
+ bootloader. This is currently used on
+ IXP2000 systems where the bus has to be
+ configured a certain way for adjunct CPUs.
pcmv= [HW,PCMCIA] BadgePAD 4
@@ -1127,19 +1143,20 @@ running once the system is up.
[ISAPNP] Exclude DMAs for the autoconfiguration
pnp_reserve_io= [ISAPNP] Exclude I/O ports for the autoconfiguration
- Ranges are in pairs (I/O port base and size).
+ Ranges are in pairs (I/O port base and size).
pnp_reserve_mem=
- [ISAPNP] Exclude memory regions for the autoconfiguration
+ [ISAPNP] Exclude memory regions for the
+ autoconfiguration.
Ranges are in pairs (memory base and size).
profile= [KNL] Enable kernel profiling via /proc/profile
- { schedule | <number> }
- (param: schedule - profile schedule points}
- (param: profile step/bucket size as a power of 2 for
- statistical time based profiling)
+ Format: [schedule,]<number>
+ Param: "schedule" - profile schedule points.
+ Param: <number> - step/bucket size as a power of 2 for
+ statistical time based profiling.
- processor.max_cstate= [HW, ACPI]
+ processor.max_cstate= [HW,ACPI]
Limit processor to maximum C-state
max_cstate=9 overrides any DMI blacklist limit.
@@ -1147,27 +1164,28 @@ running once the system is up.
before loading.
See Documentation/ramdisk.txt.
- psmouse.proto= [HW,MOUSE] Highest PS2 mouse protocol extension to
- probe for (bare|imps|exps|lifebook|any).
+ psmouse.proto= [HW,MOUSE] Highest PS2 mouse protocol extension to
+ probe for; one of (bare|imps|exps|lifebook|any).
psmouse.rate= [HW,MOUSE] Set desired mouse report rate, in reports
per second.
- psmouse.resetafter=
- [HW,MOUSE] Try to reset the device after so many bad packets
+ psmouse.resetafter= [HW,MOUSE]
+ Try to reset the device after so many bad packets
(0 = never).
psmouse.resolution=
[HW,MOUSE] Set desired mouse resolution, in dpi.
psmouse.smartscroll=
- [HW,MOUSE] Controls Logitech smartscroll autorepeat,
+ [HW,MOUSE] Controls Logitech smartscroll autorepeat.
0 = disabled, 1 = enabled (default).
pss= [HW,OSS] Personal Sound System (ECHO ESC614)
- Format: <io>,<mss_io>,<mss_irq>,<mss_dma>,<mpu_io>,<mpu_irq>
+ Format:
+ <io>,<mss_io>,<mss_irq>,<mss_dma>,<mpu_io>,<mpu_irq>
pt. [PARIDE]
See Documentation/paride.txt.
quiet= [KNL] Disable log messages
-
+
r128= [HW,DRM]
raid= [HW,RAID]
@@ -1176,10 +1194,9 @@ running once the system is up.
ramdisk= [RAM] Sizes of RAM disks in kilobytes [deprecated]
See Documentation/ramdisk.txt.
- ramdisk_blocksize=
- [RAM]
+ ramdisk_blocksize= [RAM]
See Documentation/ramdisk.txt.
-
+
ramdisk_size= [RAM] Sizes of RAM disks in kilobytes
New name for the ramdisk parameter.
See Documentation/ramdisk.txt.
@@ -1195,7 +1212,8 @@ running once the system is up.
reserve= [KNL,BUGS] Force the kernel to ignore some iomem area
- resume= [SWSUSP] Specify the partition device for software suspension
+ resume= [SWSUSP]
+ Specify the partition device for software suspend
rhash_entries= [KNL,NET]
Set number of hash buckets for route cache
@@ -1225,7 +1243,7 @@ running once the system is up.
Format: <io>,<irq>,<dma>,<dma2>
sbni= [NET] Granch SBNI12 leased line adapter
-
+
sbpcd= [HW,CD] Soundblaster CD adapter
Format: <io>,<type>
See a comment before function sbpcd_setup() in
@@ -1258,21 +1276,20 @@ running once the system is up.
serialnumber [BUGS=IA-32]
- sg_def_reserved_size=
- [SCSI]
-
+ sg_def_reserved_size= [SCSI]
+
sgalaxy= [HW,OSS]
Format: <io>,<irq>,<dma>,<dma2>,<sgbase>
shapers= [NET]
Maximal number of shapers.
-
+
sim710= [SCSI,HW]
See header of drivers/scsi/sim710.c.
simeth= [IA-64]
simscsi=
-
+
sjcd= [HW,CD]
Format: <io>,<irq>,<dma>
See header of drivers/cdrom/sjcd.c.
@@ -1403,10 +1420,10 @@ running once the system is up.
snd-wavefront= [HW,ALSA]
snd-ymfpci= [HW,ALSA]
-
+
sonicvibes= [HW,OSS]
Format: <reverb>
-
+
sonycd535= [HW,CD]
Format: <io>[,<irq>]
@@ -1423,7 +1440,7 @@ running once the system is up.
sscape= [HW,OSS]
Format: <io>,<irq>,<dma>,<mpu_io>,<mpu_irq>
-
+
st= [HW,SCSI] SCSI tape parameters (buffers, etc.)
See Documentation/scsi/st.txt.
@@ -1446,7 +1463,7 @@ running once the system is up.
stram_swap= [HW,M68k]
swiotlb= [IA-64] Number of I/O TLB slabs
-
+
switches= [HW,M68k]
sym53c416= [HW,SCSI]
@@ -1479,14 +1496,16 @@ running once the system is up.
tp720= [HW,PS2]
trix= [HW,OSS] MediaTrix AudioTrix Pro
- Format: <io>,<irq>,<dma>,<dma2>,<sb_io>,<sb_irq>,<sb_dma>,<mpu_io>,<mpu_irq>
-
+ Format:
+ <io>,<irq>,<dma>,<dma2>,<sb_io>,<sb_irq>,<sb_dma>,<mpu_io>,<mpu_irq>
+
tsdev.xres= [TS] Horizontal screen resolution.
tsdev.yres= [TS] Vertical screen resolution.
- turbografx.map[2|3]=
- [HW,JOY] TurboGraFX parallel port interface
- Format: <port#>,<js1>,<js2>,<js3>,<js4>,<js5>,<js6>,<js7>
+ turbografx.map[2|3]= [HW,JOY]
+ TurboGraFX parallel port interface
+ Format:
+ <port#>,<js1>,<js2>,<js3>,<js4>,<js5>,<js6>,<js7>
See also Documentation/input/joystick-parport.txt
u14-34f= [HW,SCSI] UltraStor 14F/34F SCSI host adapter
@@ -1502,17 +1521,18 @@ running once the system is up.
usbhid.mousepoll=
[USBHID] The interval which mice are to be polled at.
-
+
video= [FB] Frame buffer configuration
See Documentation/fb/modedb.txt.
vga= [BOOT,IA-32] Select a particular video mode
- See Documentation/i386/boot.txt and Documentation/svga.txt.
+ See Documentation/i386/boot.txt and
+ Documentation/svga.txt.
Use vga=ask for menu.
This is actually a boot loader parameter; the value is
passed to the kernel using a special protocol.
- vmalloc=nn[KMG] [KNL,BOOT] forces the vmalloc area to have an exact
+ vmalloc=nn[KMG] [KNL,BOOT] Forces the vmalloc area to have an exact
size of <nn>. This can be used to increase the
minimum size (128MB on x86). It can also be used to
decrease the size and leave more room for directly
@@ -1520,11 +1540,11 @@ running once the system is up.
vmhalt= [KNL,S390]
- vmpoff= [KNL,S390]
-
+ vmpoff= [KNL,S390]
+
waveartist= [HW,OSS]
Format: <io>,<irq>,<dma>,<dma2>
-
+
wd33c93= [HW,SCSI]
See header of drivers/scsi/wd33c93.c.
@@ -1538,21 +1558,25 @@ running once the system is up.
xd_geo= See header of drivers/block/xd.c.
xirc2ps_cs= [NET,PCMCIA]
- Format: <irq>,<irq_mask>,<io>,<full_duplex>,<do_sound>,<lockup_hack>[,<irq2>[,<irq3>[,<irq4>]]]
-
+ Format:
+ <irq>,<irq_mask>,<io>,<full_duplex>,<do_sound>,<lockup_hack>[,<irq2>[,<irq3>[,<irq4>]]]
+______________________________________________________________________
Changelog:
+2000-06-?? Mr. Unknown
The last known update (for 2.4.0) - the changelog was not kept before.
- 2000-06-?? Mr. Unknown
+2002-11-24 Petr Baudis <pasky@ucw.cz>
+ Randy Dunlap <randy.dunlap@verizon.net>
Update for 2.5.49, description for most of the options introduced,
references to other documentation (C files, READMEs, ..), added S390,
PPC, SPARC, MTD, ALSA and OSS category. Minor corrections and
reformatting.
- 2002-11-24 Petr Baudis <pasky@ucw.cz>
- Randy Dunlap <randy.dunlap@verizon.net>
+
+2005-10-19 Randy Dunlap <rdunlap@xenotime.net>
+ Lots of typos, whitespace, some reformatting.
TODO:
diff --git a/Documentation/keys-request-key.txt b/Documentation/keys-request-key.txt
new file mode 100644
index 0000000..5f2b9c5
--- /dev/null
+++ b/Documentation/keys-request-key.txt
@@ -0,0 +1,161 @@
+ ===================
+ KEY REQUEST SERVICE
+ ===================
+
+The key request service is part of the key retention service (refer to
+Documentation/keys.txt). This document explains more fully how that the
+requesting algorithm works.
+
+The process starts by either the kernel requesting a service by calling
+request_key():
+
+ struct key *request_key(const struct key_type *type,
+ const char *description,
+ const char *callout_string);
+
+Or by userspace invoking the request_key system call:
+
+ key_serial_t request_key(const char *type,
+ const char *description,
+ const char *callout_info,
+ key_serial_t dest_keyring);
+
+The main difference between the two access points is that the in-kernel
+interface does not need to link the key to a keyring to prevent it from being
+immediately destroyed. The kernel interface returns a pointer directly to the
+key, and it's up to the caller to destroy the key.
+
+The userspace interface links the key to a keyring associated with the process
+to prevent the key from going away, and returns the serial number of the key to
+the caller.
+
+
+===========
+THE PROCESS
+===========
+
+A request proceeds in the following manner:
+
+ (1) Process A calls request_key() [the userspace syscall calls the kernel
+ interface].
+
+ (2) request_key() searches the process's subscribed keyrings to see if there's
+ a suitable key there. If there is, it returns the key. If there isn't, and
+ callout_info is not set, an error is returned. Otherwise the process
+ proceeds to the next step.
+
+ (3) request_key() sees that A doesn't have the desired key yet, so it creates
+ two things:
+
+ (a) An uninstantiated key U of requested type and description.
+
+ (b) An authorisation key V that refers to key U and notes that process A
+ is the context in which key U should be instantiated and secured, and
+ from which associated key requests may be satisfied.
+
+ (4) request_key() then forks and executes /sbin/request-key with a new session
+ keyring that contains a link to auth key V.
+
+ (5) /sbin/request-key execs an appropriate program to perform the actual
+ instantiation.
+
+ (6) The program may want to access another key from A's context (say a
+ Kerberos TGT key). It just requests the appropriate key, and the keyring
+ search notes that the session keyring has auth key V in its bottom level.
+
+ This will permit it to then search the keyrings of process A with the
+ UID, GID, groups and security info of process A as if it was process A,
+ and come up with key W.
+
+ (7) The program then does what it must to get the data with which to
+ instantiate key U, using key W as a reference (perhaps it contacts a
+ Kerberos server using the TGT) and then instantiates key U.
+
+ (8) Upon instantiating key U, auth key V is automatically revoked so that it
+ may not be used again.
+
+ (9) The program then exits 0 and request_key() deletes key V and returns key
+ U to the caller.
+
+This also extends further. If key W (step 5 above) didn't exist, key W would be
+created uninstantiated, another auth key (X) would be created [as per step 3]
+and another copy of /sbin/request-key spawned [as per step 4]; but the context
+specified by auth key X will still be process A, as it was in auth key V.
+
+This is because process A's keyrings can't simply be attached to
+/sbin/request-key at the appropriate places because (a) execve will discard two
+of them, and (b) it requires the same UID/GID/Groups all the way through.
+
+
+======================
+NEGATIVE INSTANTIATION
+======================
+
+Rather than instantiating a key, it is possible for the possessor of an
+authorisation key to negatively instantiate a key that's under construction.
+This is a short duration placeholder that causes any attempt at re-requesting
+the key whilst it exists to fail with error ENOKEY.
+
+This is provided to prevent excessive repeated spawning of /sbin/request-key
+processes for a key that will never be obtainable.
+
+Should the /sbin/request-key process exit anything other than 0 or die on a
+signal, the key under construction will be automatically negatively
+instantiated for a short amount of time.
+
+
+====================
+THE SEARCH ALGORITHM
+====================
+
+A search of any particular keyring proceeds in the following fashion:
+
+ (1) When the key management code searches for a key (keyring_search_aux) it
+ firstly calls key_permission(SEARCH) on the keyring it's starting with,
+ if this denies permission, it doesn't search further.
+
+ (2) It considers all the non-keyring keys within that keyring and, if any key
+ matches the criteria specified, calls key_permission(SEARCH) on it to see
+ if the key is allowed to be found. If it is, that key is returned; if
+ not, the search continues, and the error code is retained if of higher
+ priority than the one currently set.
+
+ (3) It then considers all the keyring-type keys in the keyring it's currently
+ searching. It calls key_permission(SEARCH) on each keyring, and if this
+ grants permission, it recurses, executing steps (2) and (3) on that
+ keyring.
+
+The process stops immediately a valid key is found with permission granted to
+use it. Any error from a previous match attempt is discarded and the key is
+returned.
+
+When search_process_keyrings() is invoked, it performs the following searches
+until one succeeds:
+
+ (1) If extant, the process's thread keyring is searched.
+
+ (2) If extant, the process's process keyring is searched.
+
+ (3) The process's session keyring is searched.
+
+ (4) If the process has a request_key() authorisation key in its session
+ keyring then:
+
+ (a) If extant, the calling process's thread keyring is searched.
+
+ (b) If extant, the calling process's process keyring is searched.
+
+ (c) The calling process's session keyring is searched.
+
+The moment one succeeds, all pending errors are discarded and the found key is
+returned.
+
+Only if all these fail does the whole thing fail with the highest priority
+error. Note that several errors may have come from LSM.
+
+The error priority is:
+
+ EKEYREVOKED > EKEYEXPIRED > ENOKEY
+
+EACCES/EPERM are only returned on a direct search of a specific keyring where
+the basal keyring does not grant Search permission.
diff --git a/Documentation/keys.txt b/Documentation/keys.txt
index 0321ded..4afe03a 100644
--- a/Documentation/keys.txt
+++ b/Documentation/keys.txt
@@ -195,8 +195,8 @@ KEY ACCESS PERMISSIONS
======================
Keys have an owner user ID, a group access ID, and a permissions mask. The mask
-has up to eight bits each for user, group and other access. Only five of each
-set of eight bits are defined. These permissions granted are:
+has up to eight bits each for possessor, user, group and other access. Only
+five of each set of eight bits are defined. These permissions granted are:
(*) View
@@ -241,16 +241,16 @@ about the status of the key service:
type, description and permissions. The payload of the key is not available
this way:
- SERIAL FLAGS USAGE EXPY PERM UID GID TYPE DESCRIPTION: SUMMARY
- 00000001 I----- 39 perm 1f0000 0 0 keyring _uid_ses.0: 1/4
- 00000002 I----- 2 perm 1f0000 0 0 keyring _uid.0: empty
- 00000007 I----- 1 perm 1f0000 0 0 keyring _pid.1: empty
- 0000018d I----- 1 perm 1f0000 0 0 keyring _pid.412: empty
- 000004d2 I--Q-- 1 perm 1f0000 32 -1 keyring _uid.32: 1/4
- 000004d3 I--Q-- 3 perm 1f0000 32 -1 keyring _uid_ses.32: empty
- 00000892 I--QU- 1 perm 1f0000 0 0 user metal:copper: 0
- 00000893 I--Q-N 1 35s 1f0000 0 0 user metal:silver: 0
- 00000894 I--Q-- 1 10h 1f0000 0 0 user metal:gold: 0
+ SERIAL FLAGS USAGE EXPY PERM UID GID TYPE DESCRIPTION: SUMMARY
+ 00000001 I----- 39 perm 1f1f0000 0 0 keyring _uid_ses.0: 1/4
+ 00000002 I----- 2 perm 1f1f0000 0 0 keyring _uid.0: empty
+ 00000007 I----- 1 perm 1f1f0000 0 0 keyring _pid.1: empty
+ 0000018d I----- 1 perm 1f1f0000 0 0 keyring _pid.412: empty
+ 000004d2 I--Q-- 1 perm 1f1f0000 32 -1 keyring _uid.32: 1/4
+ 000004d3 I--Q-- 3 perm 1f1f0000 32 -1 keyring _uid_ses.32: empty
+ 00000892 I--QU- 1 perm 1f000000 0 0 user metal:copper: 0
+ 00000893 I--Q-N 1 35s 1f1f0000 0 0 user metal:silver: 0
+ 00000894 I--Q-- 1 10h 001f0000 0 0 user metal:gold: 0
The flags are:
@@ -361,6 +361,8 @@ The main syscalls are:
/sbin/request-key will be invoked in an attempt to obtain a key. The
callout_info string will be passed as an argument to the program.
+ See also Documentation/keys-request-key.txt.
+
The keyctl syscall functions are:
@@ -533,8 +535,8 @@ The keyctl syscall functions are:
(*) Read the payload data from a key:
- key_serial_t keyctl(KEYCTL_READ, key_serial_t keyring, char *buffer,
- size_t buflen);
+ long keyctl(KEYCTL_READ, key_serial_t keyring, char *buffer,
+ size_t buflen);
This function attempts to read the payload data from the specified key
into the buffer. The process must have read permission on the key to
@@ -555,9 +557,9 @@ The keyctl syscall functions are:
(*) Instantiate a partially constructed key.
- key_serial_t keyctl(KEYCTL_INSTANTIATE, key_serial_t key,
- const void *payload, size_t plen,
- key_serial_t keyring);
+ long keyctl(KEYCTL_INSTANTIATE, key_serial_t key,
+ const void *payload, size_t plen,
+ key_serial_t keyring);
If the kernel calls back to userspace to complete the instantiation of a
key, userspace should use this call to supply data for the key before the
@@ -576,8 +578,8 @@ The keyctl syscall functions are:
(*) Negatively instantiate a partially constructed key.
- key_serial_t keyctl(KEYCTL_NEGATE, key_serial_t key,
- unsigned timeout, key_serial_t keyring);
+ long keyctl(KEYCTL_NEGATE, key_serial_t key,
+ unsigned timeout, key_serial_t keyring);
If the kernel calls back to userspace to complete the instantiation of a
key, userspace should use this call mark the key as negative before the
@@ -637,6 +639,34 @@ call, and the key released upon close. How to deal with conflicting keys due to
two different users opening the same file is left to the filesystem author to
solve.
+Note that there are two different types of pointers to keys that may be
+encountered:
+
+ (*) struct key *
+
+ This simply points to the key structure itself. Key structures will be at
+ least four-byte aligned.
+
+ (*) key_ref_t
+
+ This is equivalent to a struct key *, but the least significant bit is set
+ if the caller "possesses" the key. By "possession" it is meant that the
+ calling processes has a searchable link to the key from one of its
+ keyrings. There are three functions for dealing with these:
+
+ key_ref_t make_key_ref(const struct key *key,
+ unsigned long possession);
+
+ struct key *key_ref_to_ptr(const key_ref_t key_ref);
+
+ unsigned long is_key_possessed(const key_ref_t key_ref);
+
+ The first function constructs a key reference from a key pointer and
+ possession information (which must be 0 or 1 and not any other value).
+
+ The second function retrieves the key pointer from a reference and the
+ third retrieves the possession flag.
+
When accessing a key's payload contents, certain precautions must be taken to
prevent access vs modification races. See the section "Notes on accessing
payload contents" for more information.
@@ -660,12 +690,18 @@ payload contents" for more information.
If successful, the key will have been attached to the default keyring for
implicitly obtained request-key keys, as set by KEYCTL_SET_REQKEY_KEYRING.
+ See also Documentation/keys-request-key.txt.
+
(*) When it is no longer required, the key should be released using:
void key_put(struct key *key);
- This can be called from interrupt context. If CONFIG_KEYS is not set then
+ Or:
+
+ void key_ref_put(key_ref_t key_ref);
+
+ These can be called from interrupt context. If CONFIG_KEYS is not set then
the argument will not be parsed.
@@ -689,13 +725,17 @@ payload contents" for more information.
(*) If a keyring was found in the search, this can be further searched by:
- struct key *keyring_search(struct key *keyring,
- const struct key_type *type,
- const char *description)
+ key_ref_t keyring_search(key_ref_t keyring_ref,
+ const struct key_type *type,
+ const char *description)
This searches the keyring tree specified for a matching key. Error ENOKEY
- is returned upon failure. If successful, the returned key will need to be
- released.
+ is returned upon failure (use IS_ERR/PTR_ERR to determine). If successful,
+ the returned key will need to be released.
+
+ The possession attribute from the keyring reference is used to control
+ access through the permissions mask and is propagated to the returned key
+ reference pointer if successful.
(*) To check the validity of a key, this function can be called:
@@ -732,7 +772,7 @@ More complex payload contents must be allocated and a pointer to them set in
key->payload.data. One of the following ways must be selected to access the
data:
- (1) Unmodifyable key type.
+ (1) Unmodifiable key type.
If the key type does not have a modify method, then the key's payload can
be accessed without any form of locking, provided that it's known to be
diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index a55f0f9..b0fe41d 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -777,7 +777,7 @@ doing so is the same as described in the "Configuring Multiple Bonds
Manually" section, below.
NOTE: It has been observed that some Red Hat supplied kernels
-are apparently unable to rename modules at load time (the "-obonding1"
+are apparently unable to rename modules at load time (the "-o bond1"
part). Attempts to pass that option to modprobe will produce an
"Operation not permitted" error. This has been reported on some
Fedora Core kernels, and has been seen on RHEL 4 as well. On kernels
@@ -883,7 +883,8 @@ the above does not work, and the second bonding instance never sees
its options. In that case, the second options line can be substituted
as follows:
-install bonding1 /sbin/modprobe bonding -obond1 mode=balance-alb miimon=50
+install bond1 /sbin/modprobe --ignore-install bonding -o bond1 \
+ mode=balance-alb miimon=50
This may be repeated any number of times, specifying a new and
unique name in place of bond1 for each subsequent instance.
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index ab65714..b433c8a 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -355,10 +355,14 @@ ip_dynaddr - BOOLEAN
Default: 0
icmp_echo_ignore_all - BOOLEAN
+ If set non-zero, then the kernel will ignore all ICMP ECHO
+ requests sent to it.
+ Default: 0
+
icmp_echo_ignore_broadcasts - BOOLEAN
- If either is set to true, then the kernel will ignore either all
- ICMP ECHO requests sent to it or just those to broadcast/multicast
- addresses, respectively.
+ If set non-zero, then the kernel will ignore all ICMP ECHO and
+ TIMESTAMP requests sent to it via broadcast/multicast.
+ Default: 1
icmp_ratelimit - INTEGER
Limit the maximal rates for sending ICMP packets whose type matches
diff --git a/Documentation/sparse.txt b/Documentation/sparse.txt
index 5df44dc..1829009 100644
--- a/Documentation/sparse.txt
+++ b/Documentation/sparse.txt
@@ -51,9 +51,9 @@ or you don't get any checking at all.
Where to get sparse
~~~~~~~~~~~~~~~~~~~
-With BK, you can just get it from
+With git, you can just get it from
- bk://sparse.bkbits.net/sparse
+ rsync://rsync.kernel.org/pub/scm/devel/sparse/sparse.git
and DaveJ has tar-balls at
diff --git a/Documentation/usb/URB.txt b/Documentation/usb/URB.txt
index d59b95c..a49e5f2 100644
--- a/Documentation/usb/URB.txt
+++ b/Documentation/usb/URB.txt
@@ -1,5 +1,6 @@
Revised: 2000-Dec-05.
Again: 2002-Jul-06
+Again: 2005-Sep-19
NOTE:
@@ -18,8 +19,8 @@ called USB Request Block, or URB for short.
and deliver the data and status back.
- Execution of an URB is inherently an asynchronous operation, i.e. the
- usb_submit_urb(urb) call returns immediately after it has successfully queued
- the requested action.
+ usb_submit_urb(urb) call returns immediately after it has successfully
+ queued the requested action.
- Transfers for one URB can be canceled with usb_unlink_urb(urb) at any time.
@@ -94,8 +95,9 @@ To free an URB, use
void usb_free_urb(struct urb *urb)
-You may not free an urb that you've submitted, but which hasn't yet been
-returned to you in a completion callback.
+You may free an urb that you've submitted, but which hasn't yet been
+returned to you in a completion callback. It will automatically be
+deallocated when it is no longer in use.
1.4. What has to be filled in?
@@ -145,30 +147,36 @@ to get seamless ISO streaming.
1.6. How to cancel an already running URB?
-For an URB which you've submitted, but which hasn't been returned to
-your driver by the host controller, call
+There are two ways to cancel an URB you've submitted but which hasn't
+been returned to your driver yet. For an asynchronous cancel, call
int usb_unlink_urb(struct urb *urb)
It removes the urb from the internal list and frees all allocated
-HW descriptors. The status is changed to reflect unlinking. After
-usb_unlink_urb() returns with that status code, you can free the URB
-with usb_free_urb().
+HW descriptors. The status is changed to reflect unlinking. Note
+that the URB will not normally have finished when usb_unlink_urb()
+returns; you must still wait for the completion handler to be called.
-There is also an asynchronous unlink mode. To use this, set the
-the URB_ASYNC_UNLINK flag in urb->transfer flags before calling
-usb_unlink_urb(). When using async unlinking, the URB will not
-normally be unlinked when usb_unlink_urb() returns. Instead, wait
-for the completion handler to be called.
+To cancel an URB synchronously, call
+
+ void usb_kill_urb(struct urb *urb)
+
+It does everything usb_unlink_urb does, and in addition it waits
+until after the URB has been returned and the completion handler
+has finished. It also marks the URB as temporarily unusable, so
+that if the completion handler or anyone else tries to resubmit it
+they will get a -EPERM error. Thus you can be sure that when
+usb_kill_urb() returns, the URB is totally idle.
1.7. What about the completion handler?
The handler is of the following type:
- typedef void (*usb_complete_t)(struct urb *);
+ typedef void (*usb_complete_t)(struct urb *, struct pt_regs *)
-i.e. it gets just the URB that caused the completion call.
+I.e., it gets the URB that caused the completion call, plus the
+register values at the time of the corresponding interrupt (if any).
In the completion handler, you should have a look at urb->status to
detect any USB errors. Since the context parameter is included in the URB,
you can pass information to the completion handler.
@@ -176,17 +184,11 @@ you can pass information to the completion handler.
Note that even when an error (or unlink) is reported, data may have been
transferred. That's because USB transfers are packetized; it might take
sixteen packets to transfer your 1KByte buffer, and ten of them might
-have transferred succesfully before the completion is called.
+have transferred succesfully before the completion was called.
NOTE: ***** WARNING *****
-Don't use urb->dev field in your completion handler; it's cleared
-as part of giving urbs back to drivers. (Addressing an issue with
-ownership of periodic URBs, which was otherwise ambiguous.) Instead,
-use urb->context to hold all the data your driver needs.
-
-NOTE: ***** WARNING *****
-Also, NEVER SLEEP IN A COMPLETION HANDLER. These are normally called
+NEVER SLEEP IN A COMPLETION HANDLER. These are normally called
during hardware interrupt processing. If you can, defer substantial
work to a tasklet (bottom half) to keep system latencies low. You'll
probably need to use spinlocks to protect data structures you manipulate
@@ -229,24 +231,10 @@ ISO data with some other event stream.
Interrupt transfers, like isochronous transfers, are periodic, and happen
in intervals that are powers of two (1, 2, 4 etc) units. Units are frames
for full and low speed devices, and microframes for high speed ones.
-
-Currently, after you submit one interrupt URB, that urb is owned by the
-host controller driver until you cancel it with usb_unlink_urb(). You
-may unlink interrupt urbs in their completion handlers, if you need to.
-
-After a transfer completion is called, the URB is automagically resubmitted.
-THIS BEHAVIOR IS EXPECTED TO BE REMOVED!!
-
-Interrupt transfers may only send (or receive) the "maxpacket" value for
-the given interrupt endpoint; if you need more data, you will need to
-copy that data out of (or into) another buffer. Similarly, you can't
-queue interrupt transfers.
-THESE RESTRICTIONS ARE EXPECTED TO BE REMOVED!!
-
-Note that this automagic resubmission model does make it awkward to use
-interrupt OUT transfers. The portable solution involves unlinking those
-OUT urbs after the data is transferred, and perhaps submitting a final
-URB for a short packet.
-
The usb_submit_urb() call modifies urb->interval to the implemented interval
value that is less than or equal to the requested interval value.
+
+In Linux 2.6, unlike earlier versions, interrupt URBs are not automagically
+restarted when they complete. They end when the completion handler is
+called, just like other URBs. If you want an interrupt URB to be restarted,
+your completion handler must resubmit it.
OpenPOWER on IntegriCloud