diff options
Diffstat (limited to 'Documentation')
55 files changed, 1876 insertions, 184 deletions
diff --git a/Documentation/ABI/obsolete/sysfs-block-zram b/Documentation/ABI/obsolete/sysfs-block-zram new file mode 100644 index 0000000..720ea92 --- /dev/null +++ b/Documentation/ABI/obsolete/sysfs-block-zram @@ -0,0 +1,119 @@ +What: /sys/block/zram<id>/num_reads +Date: August 2015 +Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> +Description: + The num_reads file is read-only and specifies the number of + reads (failed or successful) done on this device. + Now accessible via zram<id>/stat node. + +What: /sys/block/zram<id>/num_writes +Date: August 2015 +Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> +Description: + The num_writes file is read-only and specifies the number of + writes (failed or successful) done on this device. + Now accessible via zram<id>/stat node. + +What: /sys/block/zram<id>/invalid_io +Date: August 2015 +Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> +Description: + The invalid_io file is read-only and specifies the number of + non-page-size-aligned I/O requests issued to this device. + Now accessible via zram<id>/io_stat node. + +What: /sys/block/zram<id>/failed_reads +Date: August 2015 +Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> +Description: + The failed_reads file is read-only and specifies the number of + failed reads happened on this device. + Now accessible via zram<id>/io_stat node. + +What: /sys/block/zram<id>/failed_writes +Date: August 2015 +Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> +Description: + The failed_writes file is read-only and specifies the number of + failed writes happened on this device. + Now accessible via zram<id>/io_stat node. + +What: /sys/block/zram<id>/notify_free +Date: August 2015 +Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> +Description: + The notify_free file is read-only. Depending on device usage + scenario it may account a) the number of pages freed because + of swap slot free notifications or b) the number of pages freed + because of REQ_DISCARD requests sent by bio. The former ones + are sent to a swap block device when a swap slot is freed, which + implies that this disk is being used as a swap disk. The latter + ones are sent by filesystem mounted with discard option, + whenever some data blocks are getting discarded. + Now accessible via zram<id>/io_stat node. + +What: /sys/block/zram<id>/zero_pages +Date: August 2015 +Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> +Description: + The zero_pages file is read-only and specifies number of zero + filled pages written to this disk. No memory is allocated for + such pages. + Now accessible via zram<id>/mm_stat node. + +What: /sys/block/zram<id>/orig_data_size +Date: August 2015 +Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> +Description: + The orig_data_size file is read-only and specifies uncompressed + size of data stored in this disk. This excludes zero-filled + pages (zero_pages) since no memory is allocated for them. + Unit: bytes + Now accessible via zram<id>/mm_stat node. + +What: /sys/block/zram<id>/compr_data_size +Date: August 2015 +Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> +Description: + The compr_data_size file is read-only and specifies compressed + size of data stored in this disk. So, compression ratio can be + calculated using orig_data_size and this statistic. + Unit: bytes + Now accessible via zram<id>/mm_stat node. + +What: /sys/block/zram<id>/mem_used_total +Date: August 2015 +Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> +Description: + The mem_used_total file is read-only and specifies the amount + of memory, including allocator fragmentation and metadata + overhead, allocated for this disk. So, allocator space + efficiency can be calculated using compr_data_size and this + statistic. + Unit: bytes + Now accessible via zram<id>/mm_stat node. + +What: /sys/block/zram<id>/mem_used_max +Date: August 2015 +Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> +Description: + The mem_used_max file is read/write and specifies the amount + of maximum memory zram have consumed to store compressed data. + For resetting the value, you should write "0". Otherwise, + you could see -EINVAL. + Unit: bytes + Downgraded to write-only node: so it's possible to set new + value only; its current value is stored in zram<id>/mm_stat + node. + +What: /sys/block/zram<id>/mem_limit +Date: August 2015 +Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> +Description: + The mem_limit file is read/write and specifies the maximum + amount of memory ZRAM can use to store the compressed data. + The limit could be changed in run time and "0" means disable + the limit. No limit is the initial state. Unit: bytes + Downgraded to write-only node: so it's possible to set new + value only; its current value is stored in zram<id>/mm_stat + node. diff --git a/Documentation/ABI/testing/sysfs-block-zram b/Documentation/ABI/testing/sysfs-block-zram index a6148ea..2e69e83 100644 --- a/Documentation/ABI/testing/sysfs-block-zram +++ b/Documentation/ABI/testing/sysfs-block-zram @@ -141,3 +141,28 @@ Description: amount of memory ZRAM can use to store the compressed data. The limit could be changed in run time and "0" means disable the limit. No limit is the initial state. Unit: bytes + +What: /sys/block/zram<id>/compact +Date: August 2015 +Contact: Minchan Kim <minchan@kernel.org> +Description: + The compact file is write-only and trigger compaction for + allocator zrm uses. The allocator moves some objects so that + it could free fragment space. + +What: /sys/block/zram<id>/io_stat +Date: August 2015 +Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> +Description: + The io_stat file is read-only and accumulates device's I/O + statistics not accounted by block layer. For example, + failed_reads, failed_writes, etc. File format is similar to + block layer statistics file format. + +What: /sys/block/zram<id>/mm_stat +Date: August 2015 +Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> +Description: + The mm_stat file is read-only and represents device's mm + statistics (orig_data_size, compr_data_size, etc.) in a format + similar to block layer statistics file format. diff --git a/Documentation/ABI/testing/sysfs-class-cxl b/Documentation/ABI/testing/sysfs-class-cxl index 3680364..d46bba8 100644 --- a/Documentation/ABI/testing/sysfs-class-cxl +++ b/Documentation/ABI/testing/sysfs-class-cxl @@ -100,7 +100,7 @@ Description: read only Hexadecimal value of the device ID found in this AFU configuration record. -What: /sys/class/cxl/<afu>/cr<config num>/vendor +What: /sys/class/cxl/<afu>/cr<config num>/class Date: February 2015 Contact: linuxppc-dev@lists.ozlabs.org Description: read only diff --git a/Documentation/ABI/testing/sysfs-class-led-flash b/Documentation/ABI/testing/sysfs-class-led-flash new file mode 100644 index 0000000..220a0270 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-class-led-flash @@ -0,0 +1,80 @@ +What: /sys/class/leds/<led>/flash_brightness +Date: March 2015 +KernelVersion: 4.0 +Contact: Jacek Anaszewski <j.anaszewski@samsung.com> +Description: read/write + Set the brightness of this LED in the flash strobe mode, in + microamperes. The file is created only for the flash LED devices + that support setting flash brightness. + + The value is between 0 and + /sys/class/leds/<led>/max_flash_brightness. + +What: /sys/class/leds/<led>/max_flash_brightness +Date: March 2015 +KernelVersion: 4.0 +Contact: Jacek Anaszewski <j.anaszewski@samsung.com> +Description: read only + Maximum brightness level for this LED in the flash strobe mode, + in microamperes. + +What: /sys/class/leds/<led>/flash_timeout +Date: March 2015 +KernelVersion: 4.0 +Contact: Jacek Anaszewski <j.anaszewski@samsung.com> +Description: read/write + Hardware timeout for flash, in microseconds. The flash strobe + is stopped after this period of time has passed from the start + of the strobe. The file is created only for the flash LED + devices that support setting flash timeout. + +What: /sys/class/leds/<led>/max_flash_timeout +Date: March 2015 +KernelVersion: 4.0 +Contact: Jacek Anaszewski <j.anaszewski@samsung.com> +Description: read only + Maximum flash timeout for this LED, in microseconds. + +What: /sys/class/leds/<led>/flash_strobe +Date: March 2015 +KernelVersion: 4.0 +Contact: Jacek Anaszewski <j.anaszewski@samsung.com> +Description: read/write + Flash strobe state. When written with 1 it triggers flash strobe + and when written with 0 it turns the flash off. + + On read 1 means that flash is currently strobing and 0 means + that flash is off. + +What: /sys/class/leds/<led>/flash_fault +Date: March 2015 +KernelVersion: 4.0 +Contact: Jacek Anaszewski <j.anaszewski@samsung.com> +Description: read only + Space separated list of flash faults that may have occurred. + Flash faults are re-read after strobing the flash. Possible + flash faults: + + * led-over-voltage - flash controller voltage to the flash LED + has exceeded the limit specific to the flash controller + * flash-timeout-exceeded - the flash strobe was still on when + the timeout set by the user has expired; not all flash + controllers may set this in all such conditions + * controller-over-temperature - the flash controller has + overheated + * controller-short-circuit - the short circuit protection + of the flash controller has been triggered + * led-power-supply-over-current - current in the LED power + supply has exceeded the limit specific to the flash + controller + * indicator-led-fault - the flash controller has detected + a short or open circuit condition on the indicator LED + * led-under-voltage - flash controller voltage to the flash + LED has been below the minimum limit specific to + the flash + * controller-under-voltage - the input voltage of the flash + controller is below the limit under which strobing the + flash at full current will not be possible; + the condition persists until this flag is no longer set + * led-over-temperature - the temperature of the LED has exceeded + its allowed upper limit diff --git a/Documentation/CodingStyle b/Documentation/CodingStyle index 449a8a1..4d4f06d 100644 --- a/Documentation/CodingStyle +++ b/Documentation/CodingStyle @@ -659,6 +659,19 @@ macros using parameters. #define CONSTANT 0x4000 #define CONSTEXP (CONSTANT | 3) +5) namespace collisions when defining local variables in macros resembling +functions: + +#define FOO(x) \ +({ \ + typeof(x) ret; \ + ret = calc_ret(x); \ + (ret); \ +)} + +ret is a common name for a local variable - __foo_ret is less likely +to collide with an existing variable. + The cpp manual deals with macros exhaustively. The gcc internals manual also covers RTL which is used frequently with assembly language in the kernel. diff --git a/Documentation/IRQ-domain.txt b/Documentation/IRQ-domain.txt index 39cfa72..3a8e15c 100644 --- a/Documentation/IRQ-domain.txt +++ b/Documentation/IRQ-domain.txt @@ -95,8 +95,7 @@ since it doesn't need to allocate a table as large as the largest hwirq number. The disadvantage is that hwirq to IRQ number lookup is dependent on how many entries are in the table. -Very few drivers should need this mapping. At the moment, powerpc -iseries is the only user. +Very few drivers should need this mapping. ==== No Map ===- irq_domain_add_nomap() diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches index 447671b..b03a832 100644 --- a/Documentation/SubmittingPatches +++ b/Documentation/SubmittingPatches @@ -614,8 +614,8 @@ The canonical patch message body contains the following: - An empty line. - - The body of the explanation, which will be copied to the - permanent changelog to describe this patch. + - The body of the explanation, line wrapped at 75 columns, which will + be copied to the permanent changelog to describe this patch. - The "Signed-off-by:" lines, described above, which will also go in the changelog. diff --git a/Documentation/blockdev/nbd.txt b/Documentation/blockdev/nbd.txt index 271e607..db242ea 100644 --- a/Documentation/blockdev/nbd.txt +++ b/Documentation/blockdev/nbd.txt @@ -1,17 +1,31 @@ - Network Block Device (TCP version) - - What is it: With this compiled in the kernel (or as a module), Linux - can use a remote server as one of its block devices. So every time - the client computer wants to read, e.g., /dev/nb0, it sends a - request over TCP to the server, which will reply with the data read. - This can be used for stations with low disk space (or even diskless) - to borrow disk space from another computer. - Unlike NFS, it is possible to put any filesystem on it, etc. - - For more information, or to download the nbd-client and nbd-server - tools, go to http://nbd.sf.net/. - - The nbd kernel module need only be installed on the client - system, as the nbd-server is completely in userspace. In fact, - the nbd-server has been successfully ported to other operating - systems, including Windows. +Network Block Device (TCP version) +================================== + +1) Overview +----------- + +What is it: With this compiled in the kernel (or as a module), Linux +can use a remote server as one of its block devices. So every time +the client computer wants to read, e.g., /dev/nb0, it sends a +request over TCP to the server, which will reply with the data read. +This can be used for stations with low disk space (or even diskless) +to borrow disk space from another computer. +Unlike NFS, it is possible to put any filesystem on it, etc. + +For more information, or to download the nbd-client and nbd-server +tools, go to http://nbd.sf.net/. + +The nbd kernel module need only be installed on the client +system, as the nbd-server is completely in userspace. In fact, +the nbd-server has been successfully ported to other operating +systems, including Windows. + +A) NBD parameters +----------------- + +max_part + Number of partitions per device (default: 0). + +nbds_max + Number of block devices that should be initialized (default: 16). + diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt index 7fcf9c6..48a183e 100644 --- a/Documentation/blockdev/zram.txt +++ b/Documentation/blockdev/zram.txt @@ -98,20 +98,79 @@ size of the disk when not in use so a huge zram is wasteful. mount /dev/zram1 /tmp 7) Stats: - Per-device statistics are exported as various nodes under - /sys/block/zram<id>/ - disksize - num_reads - num_writes - failed_reads - failed_writes - invalid_io - notify_free - zero_pages - orig_data_size - compr_data_size - mem_used_total - mem_used_max +Per-device statistics are exported as various nodes under /sys/block/zram<id>/ + +A brief description of exported device attritbutes. For more details please +read Documentation/ABI/testing/sysfs-block-zram. + +Name access description +---- ------ ----------- +disksize RW show and set the device's disk size +initstate RO shows the initialization state of the device +reset WO trigger device reset +num_reads RO the number of reads +failed_reads RO the number of failed reads +num_write RO the number of writes +failed_writes RO the number of failed writes +invalid_io RO the number of non-page-size-aligned I/O requests +max_comp_streams RW the number of possible concurrent compress operations +comp_algorithm RW show and change the compression algorithm +notify_free RO the number of notifications to free pages (either + slot free notifications or REQ_DISCARD requests) +zero_pages RO the number of zero filled pages written to this disk +orig_data_size RO uncompressed size of data stored in this disk +compr_data_size RO compressed size of data stored in this disk +mem_used_total RO the amount of memory allocated for this disk +mem_used_max RW the maximum amount memory zram have consumed to + store compressed data +mem_limit RW the maximum amount of memory ZRAM can use to store + the compressed data +num_migrated RO the number of objects migrated migrated by compaction + + +WARNING +======= +per-stat sysfs attributes are considered to be deprecated. +The basic strategy is: +-- the existing RW nodes will be downgraded to WO nodes (in linux 4.11) +-- deprecated RO sysfs nodes will eventually be removed (in linux 4.11) + +The list of deprecated attributes can be found here: +Documentation/ABI/obsolete/sysfs-block-zram + +Basically, every attribute that has its own read accessible sysfs node +(e.g. num_reads) *AND* is accessible via one of the stat files (zram<id>/stat +or zram<id>/io_stat or zram<id>/mm_stat) is considered to be deprecated. + +User space is advised to use the following files to read the device statistics. + +File /sys/block/zram<id>/stat + +Represents block layer statistics. Read Documentation/block/stat.txt for +details. + +File /sys/block/zram<id>/io_stat + +The stat file represents device's I/O statistics not accounted by block +layer and, thus, not available in zram<id>/stat file. It consists of a +single line of text and contains the following stats separated by +whitespace: + failed_reads + failed_writes + invalid_io + notify_free + +File /sys/block/zram<id>/mm_stat + +The stat file represents device's mm statistics. It consists of a single +line of text and contains the following stats separated by whitespace: + orig_data_size + compr_data_size + mem_used_total + mem_limit + mem_used_max + zero_pages + num_migrated 8) Deactivate: swapoff /dev/zram0 diff --git a/Documentation/devicetree/bindings/arm/pmu.txt b/Documentation/devicetree/bindings/arm/pmu.txt index 6e54a9d..3b5f5d1 100644 --- a/Documentation/devicetree/bindings/arm/pmu.txt +++ b/Documentation/devicetree/bindings/arm/pmu.txt @@ -26,6 +26,13 @@ Required properties: Optional properties: +- interrupt-affinity : Valid only when using SPIs, specifies a list of phandles + to CPU nodes corresponding directly to the affinity of + the SPIs listed in the interrupts property. + + This property should be present when there is more than + a single SPI. + - qcom,no-pc-write : Indicates that this PMU doesn't support the 0xc and 0xd events. diff --git a/Documentation/devicetree/bindings/clock/pistachio-clock.txt b/Documentation/devicetree/bindings/clock/pistachio-clock.txt new file mode 100644 index 0000000..868db49 --- /dev/null +++ b/Documentation/devicetree/bindings/clock/pistachio-clock.txt @@ -0,0 +1,123 @@ +Imagination Technologies Pistachio SoC clock controllers +======================================================== + +Pistachio has four clock controllers (core clock, peripheral clock, peripheral +general control, and top general control) which are instantiated individually +from the device-tree. + +External clocks: +---------------- + +There are three external inputs to the clock controllers which should be +defined with the following clock-output-names: +- "xtal": External 52Mhz oscillator (required) +- "audio_clk_in": Alternate audio reference clock (optional) +- "enet_clk_in": Alternate ethernet PHY clock (optional) + +Core clock controller: +---------------------- + +The core clock controller generates clocks for the CPU, RPU (WiFi + BT +co-processor), audio, and several peripherals. + +Required properties: +- compatible: Must be "img,pistachio-clk". +- reg: Must contain the base address and length of the core clock controller. +- #clock-cells: Must be 1. The single cell is the clock identifier. + See dt-bindings/clock/pistachio-clk.h for the list of valid identifiers. +- clocks: Must contain an entry for each clock in clock-names. +- clock-names: Must include "xtal" (see "External clocks") and + "audio_clk_in_gate", "enet_clk_in_gate" which are generated by the + top-level general control. + +Example: + clk_core: clock-controller@18144000 { + compatible = "img,pistachio-clk"; + reg = <0x18144000 0x800>; + clocks = <&xtal>, <&cr_top EXT_CLK_AUDIO_IN>, + <&cr_top EXT_CLK_ENET_IN>; + clock-names = "xtal", "audio_clk_in_gate", "enet_clk_in_gate"; + + #clock-cells = <1>; + }; + +Peripheral clock controller: +---------------------------- + +The peripheral clock controller generates clocks for the DDR, ROM, and other +peripherals. The peripheral system clock ("periph_sys") generated by the core +clock controller is the input clock to the peripheral clock controller. + +Required properties: +- compatible: Must be "img,pistachio-periph-clk". +- reg: Must contain the base address and length of the peripheral clock + controller. +- #clock-cells: Must be 1. The single cell is the clock identifier. + See dt-bindings/clock/pistachio-clk.h for the list of valid identifiers. +- clocks: Must contain an entry for each clock in clock-names. +- clock-names: Must include "periph_sys", the peripheral system clock generated + by the core clock controller. + +Example: + clk_periph: clock-controller@18144800 { + compatible = "img,pistachio-clk-periph"; + reg = <0x18144800 0x800>; + clocks = <&clk_core CLK_PERIPH_SYS>; + clock-names = "periph_sys"; + + #clock-cells = <1>; + }; + +Peripheral general control: +--------------------------- + +The peripheral general control block generates system interface clocks and +resets for various peripherals. It also contains miscellaneous peripheral +control registers. The system clock ("sys") generated by the peripheral clock +controller is the input clock to the system clock controller. + +Required properties: +- compatible: Must include "img,pistachio-periph-cr" and "syscon". +- reg: Must contain the base address and length of the peripheral general + control registers. +- #clock-cells: Must be 1. The single cell is the clock identifier. + See dt-bindings/clock/pistachio-clk.h for the list of valid identifiers. +- clocks: Must contain an entry for each clock in clock-names. +- clock-names: Must include "sys", the system clock generated by the peripheral + clock controller. + +Example: + cr_periph: syscon@18144800 { + compatible = "img,pistachio-cr-periph", "syscon"; + reg = <0x18148000 0x1000>; + clocks = <&clock_periph PERIPH_CLK_PERIPH_SYS>; + clock-names = "sys"; + + #clock-cells = <1>; + }; + +Top-level general control: +-------------------------- + +The top-level general control block contains miscellaneous control registers and +gates for the external clocks "audio_clk_in" and "enet_clk_in". + +Required properties: +- compatible: Must include "img,pistachio-cr-top" and "syscon". +- reg: Must contain the base address and length of the top-level + control registers. +- clocks: Must contain an entry for each clock in clock-names. +- clock-names: Two optional clocks, "audio_clk_in" and "enet_clk_in" (see + "External clocks"). +- #clock-cells: Must be 1. The single cell is the clock identifier. + See dt-bindings/clock/pistachio-clk.h for the list of valid identifiers. + +Example: + cr_top: syscon@18144800 { + compatible = "img,pistachio-cr-top", "syscon"; + reg = <0x18149000 0x200>; + clocks = <&audio_refclk>, <&ext_enet_in>; + clock-names = "audio_clk_in", "enet_clk_in"; + + #clock-cells = <1>; + }; diff --git a/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm3380-l2-intc.txt b/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm3380-l2-intc.txt new file mode 100644 index 0000000..8f48aad --- /dev/null +++ b/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm3380-l2-intc.txt @@ -0,0 +1,41 @@ +Broadcom BCM3380-style Level 1 / Level 2 interrupt controller + +This interrupt controller shows up in various forms on many BCM338x/BCM63xx +chipsets. It has the following properties: + +- outputs a single interrupt signal to its interrupt controller parent + +- contains one or more enable/status word pairs, which often appear at + different offsets in different blocks + +- no atomic set/clear operations + +Required properties: + +- compatible: should be "brcm,bcm3380-l2-intc" +- reg: specifies one or more enable/status pairs, in the following format: + <enable_reg 0x4 status_reg 0x4>... +- interrupt-controller: identifies the node as an interrupt controller +- #interrupt-cells: specifies the number of cells needed to encode an interrupt + source, should be 1. +- interrupt-parent: specifies the phandle to the parent interrupt controller + this one is cascaded from +- interrupts: specifies the interrupt line in the interrupt-parent controller + node, valid values depend on the type of parent interrupt controller + +Optional properties: + +- brcm,irq-can-wake: if present, this means the L2 controller can be used as a + wakeup source for system suspend/resume. + +Example: + +irq0_intc: interrupt-controller@10000020 { + compatible = "brcm,bcm3380-l2-intc"; + reg = <0x10000024 0x4 0x1000002c 0x4>, + <0x10000020 0x4 0x10000028 0x4>; + interrupt-controller; + #interrupt-cells = <1>; + interrupt-parent = <&cpu_intc>; + interrupts = <2>; +}; diff --git a/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm7038-l1-intc.txt b/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm7038-l1-intc.txt new file mode 100644 index 0000000..cc217b2 --- /dev/null +++ b/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm7038-l1-intc.txt @@ -0,0 +1,52 @@ +Broadcom BCM7038-style Level 1 interrupt controller + +This block is a first level interrupt controller that is typically connected +directly to one of the HW INT lines on each CPU. Every BCM7xxx set-top chip +since BCM7038 has contained this hardware. + +Key elements of the hardware design include: + +- 64, 96, 128, or 160 incoming level IRQ lines + +- Most onchip peripherals are wired directly to an L1 input + +- A separate instance of the register set for each CPU, allowing individual + peripheral IRQs to be routed to any CPU + +- Atomic mask/unmask operations + +- No polarity/level/edge settings + +- No FIFO or priority encoder logic; software is expected to read all + 2-5 status words to determine which IRQs are pending + +Required properties: + +- compatible: should be "brcm,bcm7038-l1-intc" +- reg: specifies the base physical address and size of the registers; + the number of supported IRQs is inferred from the size argument +- interrupt-controller: identifies the node as an interrupt controller +- #interrupt-cells: specifies the number of cells needed to encode an interrupt + source, should be 1. +- interrupt-parent: specifies the phandle to the parent interrupt controller(s) + this one is cascaded from +- interrupts: specifies the interrupt line(s) in the interrupt-parent controller + node; valid values depend on the type of parent interrupt controller + +If multiple reg ranges and interrupt-parent entries are present on an SMP +system, the driver will allow IRQ SMP affinity to be set up through the +/proc/irq/ interface. In the simplest possible configuration, only one +reg range and one interrupt-parent is needed. + +Example: + +periph_intc: periph_intc@1041a400 { + compatible = "brcm,bcm7038-l1-intc"; + reg = <0x1041a400 0x30 0x1041a600 0x30>; + + interrupt-controller; + #interrupt-cells = <1>; + + interrupt-parent = <&cpu_intc>; + interrupts = <2>, <3>; +}; diff --git a/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm7120-l2-intc.txt b/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm7120-l2-intc.txt index bae1f21..44a9bb1 100644 --- a/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm7120-l2-intc.txt +++ b/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm7120-l2-intc.txt @@ -13,8 +13,7 @@ Such an interrupt controller has the following hardware design: or if they will output an interrupt signal at this 2nd level interrupt controller, in particular for UARTs -- typically has one 32-bit enable word and one 32-bit status word, but on - some hardware may have more than one enable/status pair +- has one 32-bit enable word and one 32-bit status word - no atomic set/clear operations @@ -53,9 +52,7 @@ The typical hardware layout for this controller is represented below: Required properties: - compatible: should be "brcm,bcm7120-l2-intc" -- reg: specifies the base physical address and size of the registers; - multiple pairs may be specified, with the first pair handling IRQ offsets - 0..31 and the second pair handling 32..63 +- reg: specifies the base physical address and size of the registers - interrupt-controller: identifies the node as an interrupt controller - #interrupt-cells: specifies the number of cells needed to encode an interrupt source, should be 1. @@ -66,10 +63,7 @@ Required properties: - brcm,int-map-mask: 32-bits bit mask describing how many and which interrupts are wired to this 2nd level interrupt controller, and how they match their respective interrupt parents. Should match exactly the number of interrupts - specified in the 'interrupts' property, multiplied by the number of - enable/status register pairs implemented by this controller. For - multiple parent IRQs with multiple enable/status words, this looks like: - <irq0_w0 irq0_w1 irq1_w0 irq1_w1 ...> + specified in the 'interrupts' property. Optional properties: diff --git a/Documentation/devicetree/bindings/interrupt-controller/cdns,xtensa-mx.txt b/Documentation/devicetree/bindings/interrupt-controller/cdns,xtensa-mx.txt new file mode 100644 index 0000000..d4de980 --- /dev/null +++ b/Documentation/devicetree/bindings/interrupt-controller/cdns,xtensa-mx.txt @@ -0,0 +1,18 @@ +* Xtensa Interrupt Distributor and Programmable Interrupt Controller (MX) + +Required properties: +- compatible: Should be "cdns,xtensa-mx". + +Remaining properties have exact same meaning as in Xtensa PIC +(see cdns,xtensa-pic.txt). + +Examples: + pic: pic { + compatible = "cdns,xtensa-mx"; + /* one cell: internal irq number, + * two cells: second cell == 0: internal irq number + * second cell == 1: external irq number + */ + #interrupt-cells = <2>; + interrupt-controller; + }; diff --git a/Documentation/devicetree/bindings/interrupt-controller/cdns,xtensa-pic.txt b/Documentation/devicetree/bindings/interrupt-controller/cdns,xtensa-pic.txt new file mode 100644 index 0000000..026ef4c --- /dev/null +++ b/Documentation/devicetree/bindings/interrupt-controller/cdns,xtensa-pic.txt @@ -0,0 +1,25 @@ +* Xtensa built-in Programmable Interrupt Controller (PIC) + +Required properties: +- compatible: Should be "cdns,xtensa-pic". +- interrupt-controller: Identifies the node as an interrupt controller. +- #interrupt-cells: The number of cells to define the interrupts. + It may be either 1 or 2. + When it's 1, the first cell is the internal IRQ number. + When it's 2, the first cell is the IRQ number, and the second cell + specifies whether it's internal (0) or external (1). + Periferals are usually connected to a fixed external IRQ, but for different + core variants it may be mapped to different internal IRQ. + IRQ sensitivity and priority are fixed for each core variant and may not be + changed at runtime. + +Examples: + pic: pic { + compatible = "cdns,xtensa-pic"; + /* one cell: internal irq number, + * two cells: second cell == 0: internal irq number + * second cell == 1: external irq number + */ + #interrupt-cells = <2>; + interrupt-controller; + }; diff --git a/Documentation/devicetree/bindings/interrupt-controller/mips-gic.txt b/Documentation/devicetree/bindings/interrupt-controller/mips-gic.txt index 5a65478..aae4c38 100644 --- a/Documentation/devicetree/bindings/interrupt-controller/mips-gic.txt +++ b/Documentation/devicetree/bindings/interrupt-controller/mips-gic.txt @@ -27,8 +27,13 @@ Optional properties: Required properties for timer sub-node: - compatible : Should be "mti,gic-timer". - interrupts : Interrupt for the GIC local timer. + +Optional properties for timer sub-node: +- clocks : GIC timer operating clock. - clock-frequency : Clock frequency at which the GIC timers operate. +Note that one of clocks or clock-frequency must be specified. + Example: gic: interrupt-controller@1bdc0000 { diff --git a/Documentation/devicetree/bindings/leds/common.txt b/Documentation/devicetree/bindings/leds/common.txt index 34811c5..747c538 100644 --- a/Documentation/devicetree/bindings/leds/common.txt +++ b/Documentation/devicetree/bindings/leds/common.txt @@ -14,8 +14,10 @@ Optional properties for child nodes: - led-sources : List of device current outputs the LED is connected to. The outputs are identified by the numbers that must be defined in the LED device binding documentation. -- label : The label for this LED. If omitted, the label is - taken from the node name (excluding the unit address). +- label : The label for this LED. If omitted, the label is taken from the node + name (excluding the unit address). It has to uniquely identify + a device, i.e. no other LED class device can be assigned the same + label. - linux,default-trigger : This parameter, if present, is a string defining the trigger assigned to the LED. Current triggers are: diff --git a/Documentation/devicetree/bindings/leds/leds-gpio.txt b/Documentation/devicetree/bindings/leds/leds-gpio.txt index f77148f..fea1ebf 100644 --- a/Documentation/devicetree/bindings/leds/leds-gpio.txt +++ b/Documentation/devicetree/bindings/leds/leds-gpio.txt @@ -26,16 +26,18 @@ LED sub-node properties: Examples: +#include <dt-bindings/gpio/gpio.h> + leds { compatible = "gpio-leds"; hdd { label = "IDE Activity"; - gpios = <&mcu_pio 0 1>; /* Active low */ + gpios = <&mcu_pio 0 GPIO_ACTIVE_LOW>; linux,default-trigger = "ide-disk"; }; fault { - gpios = <&mcu_pio 1 0>; + gpios = <&mcu_pio 1 GPIO_ACTIVE_HIGH>; /* Keep LED on if BIOS detected hardware fault */ default-state = "keep"; }; @@ -44,11 +46,11 @@ leds { run-control { compatible = "gpio-leds"; red { - gpios = <&mpc8572 6 0>; + gpios = <&mpc8572 6 GPIO_ACTIVE_HIGH>; default-state = "off"; }; green { - gpios = <&mpc8572 7 0>; + gpios = <&mpc8572 7 GPIO_ACTIVE_HIGH>; default-state = "on"; }; }; @@ -57,7 +59,7 @@ leds { compatible = "gpio-leds"; charger-led { - gpios = <&gpio1 2 0>; + gpios = <&gpio1 2 GPIO_ACTIVE_HIGH>; linux,default-trigger = "max8903-charger-charging"; retain-state-suspended; }; diff --git a/Documentation/devicetree/bindings/leds/leds-pm8941-wled.txt b/Documentation/devicetree/bindings/leds/leds-pm8941-wled.txt new file mode 100644 index 0000000..a85a964 --- /dev/null +++ b/Documentation/devicetree/bindings/leds/leds-pm8941-wled.txt @@ -0,0 +1,43 @@ +Binding for Qualcomm PM8941 WLED driver + +Required properties: +- compatible: should be "qcom,pm8941-wled" +- reg: slave address + +Optional properties: +- label: The label for this led + See Documentation/devicetree/bindings/leds/common.txt +- linux,default-trigger: Default trigger assigned to the LED + See Documentation/devicetree/bindings/leds/common.txt +- qcom,cs-out: bool; enable current sink output +- qcom,cabc: bool; enable content adaptive backlight control +- qcom,ext-gen: bool; use externally generated modulator signal to dim +- qcom,current-limit: mA; per-string current limit; value from 0 to 25 + default: 20mA +- qcom,current-boost-limit: mA; boost current limit; one of: + 105, 385, 525, 805, 980, 1260, 1400, 1680 + default: 805mA +- qcom,switching-freq: kHz; switching frequency; one of: + 600, 640, 685, 738, 800, 872, 960, 1066, 1200, 1371, + 1600, 1920, 2400, 3200, 4800, 9600, + default: 1600kHz +- qcom,ovp: V; Over-voltage protection limit; one of: + 27, 29, 32, 35 + default: 29V +- qcom,num-strings: #; number of led strings attached; value from 1 to 3 + default: 2 + +Example: + +pm8941-wled@d800 { + compatible = "qcom,pm8941-wled"; + reg = <0xd800>; + label = "backlight"; + + qcom,cs-out; + qcom,current-limit = <20>; + qcom,current-boost-limit = <805>; + qcom,switching-freq = <1600>; + qcom,ovp = <29>; + qcom,num-strings = <2>; +}; diff --git a/Documentation/devicetree/bindings/mailbox/arm-mhu.txt b/Documentation/devicetree/bindings/mailbox/arm-mhu.txt new file mode 100644 index 0000000..4971f03 --- /dev/null +++ b/Documentation/devicetree/bindings/mailbox/arm-mhu.txt @@ -0,0 +1,43 @@ +ARM MHU Mailbox Driver +====================== + +The ARM's Message-Handling-Unit (MHU) is a mailbox controller that has +3 independent channels/links to communicate with remote processor(s). + MHU links are hardwired on a platform. A link raises interrupt for any +received data. However, there is no specified way of knowing if the sent +data has been read by the remote. This driver assumes the sender polls +STAT register and the remote clears it after having read the data. +The last channel is specified to be a 'Secure' resource, hence can't be +used by Linux running NS. + +Mailbox Device Node: +==================== + +Required properties: +-------------------- +- compatible: Shall be "arm,mhu" & "arm,primecell" +- reg: Contains the mailbox register address range (base + address and length) +- #mbox-cells Shall be 1 - the index of the channel needed. +- interrupts: Contains the interrupt information corresponding to + each of the 3 links of MHU. + +Example: +-------- + + mhu: mailbox@2b1f0000 { + #mbox-cells = <1>; + compatible = "arm,mhu", "arm,primecell"; + reg = <0 0x2b1f0000 0x1000>; + interrupts = <0 36 4>, /* LP-NonSecure */ + <0 35 4>, /* HP-NonSecure */ + <0 37 4>; /* Secure */ + clocks = <&clock 0 2 1>; + clock-names = "apb_pclk"; + }; + + mhu_client: scb@2e000000 { + compatible = "fujitsu,mb86s70-scb-1.0"; + reg = <0 0x2e000000 0x4000>; + mboxes = <&mhu 1>; /* HP-NonSecure */ + }; diff --git a/Documentation/devicetree/bindings/mips/brcm/bcm3384-intc.txt b/Documentation/devicetree/bindings/mips/brcm/bcm3384-intc.txt deleted file mode 100644 index d4e0141d..0000000 --- a/Documentation/devicetree/bindings/mips/brcm/bcm3384-intc.txt +++ /dev/null @@ -1,37 +0,0 @@ -* Interrupt Controller - -Properties: -- compatible: "brcm,bcm3384-intc" - - Compatibility with BCM3384 and possibly other BCM33xx/BCM63xx SoCs. - -- reg: Address/length pairs for each mask/status register set. Length must - be 8. If multiple register sets are specified, the first set will - handle IRQ offsets 0..31, the second set 32..63, and so on. - -- interrupt-controller: This is an interrupt controller. - -- #interrupt-cells: Must be <1>. Just a simple IRQ offset; no level/edge - or polarity configuration is possible with this controller. - -- interrupt-parent: This controller is cascaded from a MIPS CPU HW IRQ, or - from another INTC. - -- interrupts: The IRQ on the parent controller. - -Example: - periph_intc: periph_intc@14e00038 { - compatible = "brcm,bcm3384-intc"; - - /* - * IRQs 0..31: mask reg 0x14e00038, status reg 0x14e0003c - * IRQs 32..63: mask reg 0x14e00340, status reg 0x14e00344 - */ - reg = <0x14e00038 0x8 0x14e00340 0x8>; - - interrupt-controller; - #interrupt-cells = <1>; - - interrupt-parent = <&cpu_intc>; - interrupts = <4>; - }; diff --git a/Documentation/devicetree/bindings/mips/brcm/cm-dsl.txt b/Documentation/devicetree/bindings/mips/brcm/cm-dsl.txt deleted file mode 100644 index 8a139cb3..0000000 --- a/Documentation/devicetree/bindings/mips/brcm/cm-dsl.txt +++ /dev/null @@ -1,11 +0,0 @@ -* Broadcom cable/DSL platforms - -SoCs: - -Required properties: -- compatible: "brcm,bcm3384", "brcm,bcm33843" - -Boards: - -Required properties: -- compatible: "brcm,bcm93384wvg" diff --git a/Documentation/devicetree/bindings/mips/brcm/soc.txt b/Documentation/devicetree/bindings/mips/brcm/soc.txt new file mode 100644 index 0000000..7bab90c --- /dev/null +++ b/Documentation/devicetree/bindings/mips/brcm/soc.txt @@ -0,0 +1,12 @@ +* Broadcom cable/DSL/settop platforms + +Required properties: + +- compatible: "brcm,bcm3384", "brcm,bcm33843" + "brcm,bcm3384-viper", "brcm,bcm33843-viper" + "brcm,bcm6328", "brcm,bcm6368", + "brcm,bcm7125", "brcm,bcm7346", "brcm,bcm7358", "brcm,bcm7360", + "brcm,bcm7362", "brcm,bcm7420", "brcm,bcm7425" + +The experimental -viper variants are for running Linux on the 3384's +BMIPS4355 cable modem CPU instead of the BMIPS5000 application processor. diff --git a/Documentation/devicetree/bindings/mips/img/pistachio.txt b/Documentation/devicetree/bindings/mips/img/pistachio.txt new file mode 100644 index 0000000..a736d88 --- /dev/null +++ b/Documentation/devicetree/bindings/mips/img/pistachio.txt @@ -0,0 +1,42 @@ +Imagination Pistachio SoC +========================= + +Required properties: +-------------------- + - compatible: Must include "img,pistachio". + +CPU nodes: +---------- +A "cpus" node is required. Required properties: + - #address-cells: Must be 1. + - #size-cells: Must be 0. +A CPU sub-node is also required for at least CPU 0. Since the topology may +be probed via CPS, it is not necessary to specify secondary CPUs. Required +propertis: + - device_type: Must be "cpu". + - compatible: Must be "mti,interaptiv". + - reg: CPU number. + - clocks: Must include the CPU clock. See ../../clock/clock-bindings.txt for + details on clock bindings. +Example: + cpus { + #address-cells = <1>; + #size-cells = <0>; + + cpu0: cpu@0 { + device_type = "cpu"; + compatible = "mti,interaptiv"; + reg = <0>; + clocks = <&clk_core CLK_MIPS>; + }; + }; + + +Boot protocol: +-------------- +In accordance with the MIPS UHI specification[1], the bootloader must pass the +following arguments to the kernel: + - $a0: -2. + - $a1: KSEG0 address of the flattened device-tree blob. + +[1] http://prplfoundation.org/wiki/MIPS_documentation diff --git a/Documentation/devicetree/bindings/rtc/digicolor-rtc.txt b/Documentation/devicetree/bindings/rtc/digicolor-rtc.txt new file mode 100644 index 0000000..d4649860 --- /dev/null +++ b/Documentation/devicetree/bindings/rtc/digicolor-rtc.txt @@ -0,0 +1,17 @@ +Conexant Digicolor Real Time Clock controller + +This binding currently supports the CX92755 SoC. + +Required properties: +- compatible: should be "cnxt,cx92755-rtc" +- reg: physical base address of the controller and length of memory mapped + region. +- interrupts: rtc alarm interrupt + +Example: + + rtc@f0000c30 { + compatible = "cnxt,cx92755-rtc"; + reg = <0xf0000c30 0x18>; + interrupts = <25>; + }; diff --git a/Documentation/devicetree/bindings/rtc/stmp3xxx-rtc.txt b/Documentation/devicetree/bindings/rtc/stmp3xxx-rtc.txt index b800070..fa6a942 100644 --- a/Documentation/devicetree/bindings/rtc/stmp3xxx-rtc.txt +++ b/Documentation/devicetree/bindings/rtc/stmp3xxx-rtc.txt @@ -7,6 +7,11 @@ Required properties: region. - interrupts: rtc alarm interrupt +Optional properties: +- stmp,crystal-freq: override crystal frequency as determined from fuse bits. + Only <32000> and <32768> are possible for the hardware. Use <0> for + "no crystal". + Example: rtc@80056000 { diff --git a/Documentation/devicetree/bindings/sound/ingenic,jz4740-i2s.txt b/Documentation/devicetree/bindings/sound/ingenic,jz4740-i2s.txt index b414333..b623d50 100644 --- a/Documentation/devicetree/bindings/sound/ingenic,jz4740-i2s.txt +++ b/Documentation/devicetree/bindings/sound/ingenic,jz4740-i2s.txt @@ -1,7 +1,7 @@ Ingenic JZ4740 I2S controller Required properties: -- compatible : "ingenic,jz4740-i2s" +- compatible : "ingenic,jz4740-i2s" or "ingenic,jz4780-i2s" - reg : I2S registers location and length - clocks : AIC and I2S PLL clock specifiers. - clock-names: "aic" and "i2s" diff --git a/Documentation/devicetree/bindings/sound/max98925.txt b/Documentation/devicetree/bindings/sound/max98925.txt new file mode 100644 index 0000000..27be63e --- /dev/null +++ b/Documentation/devicetree/bindings/sound/max98925.txt @@ -0,0 +1,22 @@ +max98925 audio CODEC + +This device supports I2C. + +Required properties: + + - compatible : "maxim,max98925" + + - vmon-slot-no : slot number used to send voltage information + + - imon-slot-no : slot number used to send current information + + - reg : the I2C address of the device for I2C + +Example: + +codec: max98925@1a { + compatible = "maxim,max98925"; + vmon-slot-no = <0>; + imon-slot-no = <2>; + reg = <0x1a>; +}; diff --git a/Documentation/devicetree/bindings/sound/nvidia,tegra-audio-max98090.txt b/Documentation/devicetree/bindings/sound/nvidia,tegra-audio-max98090.txt index c949abc..c3495be 100644 --- a/Documentation/devicetree/bindings/sound/nvidia,tegra-audio-max98090.txt +++ b/Documentation/devicetree/bindings/sound/nvidia,tegra-audio-max98090.txt @@ -18,6 +18,7 @@ Required properties: * Headphones * Speakers * Mic Jack + * Int Mic - nvidia,i2s-controller : The phandle of the Tegra I2S controller that's connected to the CODEC. diff --git a/Documentation/devicetree/bindings/sound/qcom,lpass-cpu.txt b/Documentation/devicetree/bindings/sound/qcom,lpass-cpu.txt new file mode 100644 index 0000000..e00732d --- /dev/null +++ b/Documentation/devicetree/bindings/sound/qcom,lpass-cpu.txt @@ -0,0 +1,43 @@ +* Qualcomm Technologies LPASS CPU DAI + +This node models the Qualcomm Technologies Low-Power Audio SubSystem (LPASS). + +Required properties: + +- compatible : "qcom,lpass-cpu" +- clocks : Must contain an entry for each entry in clock-names. +- clock-names : A list which must include the following entries: + * "ahbix-clk" + * "mi2s-osr-clk" + * "mi2s-bit-clk" +- interrupts : Must contain an entry for each entry in + interrupt-names. +- interrupt-names : A list which must include the following entries: + * "lpass-irq-lpaif" +- pinctrl-N : One property must exist for each entry in + pinctrl-names. See ../pinctrl/pinctrl-bindings.txt + for details of the property values. +- pinctrl-names : Must contain a "default" entry. +- reg : Must contain an address for each entry in reg-names. +- reg-names : A list which must include the following entries: + * "lpass-lpaif" + +Optional properties: + +- qcom,adsp : Phandle for the audio DSP node + +Example: + +lpass@28100000 { + compatible = "qcom,lpass-cpu"; + clocks = <&lcc AHBIX_CLK>, <&lcc MI2S_OSR_CLK>, <&lcc MI2S_BIT_CLK>; + clock-names = "ahbix-clk", "mi2s-osr-clk", "mi2s-bit-clk"; + interrupts = <0 85 1>; + interrupt-names = "lpass-irq-lpaif"; + pinctrl-names = "default", "idle"; + pinctrl-0 = <&mi2s_default>; + pinctrl-1 = <&mi2s_idle>; + reg = <0x28100000 0x10000>; + reg-names = "lpass-lpaif"; + qcom,adsp = <&adsp>; +}; diff --git a/Documentation/devicetree/bindings/sound/renesas,rsnd.txt b/Documentation/devicetree/bindings/sound/renesas,rsnd.txt index 2dd690b..f316ce1 100644 --- a/Documentation/devicetree/bindings/sound/renesas,rsnd.txt +++ b/Documentation/devicetree/bindings/sound/renesas,rsnd.txt @@ -29,9 +29,17 @@ SSI subnode properties: - shared-pin : if shared clock pin - pio-transfer : use PIO transfer mode - no-busif : BUSIF is not ussed when [mem -> SSI] via DMA case +- dma : Should contain Audio DMAC entry +- dma-names : SSI case "rx" (=playback), "tx" (=capture) + SSIU case "rxu" (=playback), "txu" (=capture) SRC subnode properties: -no properties at this point +- dma : Should contain Audio DMAC entry +- dma-names : "rx" (=playback), "tx" (=capture) + +DVC subnode properties: +- dma : Should contain Audio DMAC entry +- dma-names : "tx" (=playback/capture) DAI subnode properties: - playback : list of playback modules @@ -45,56 +53,145 @@ rcar_sound: rcar_sound@ec500000 { reg = <0 0xec500000 0 0x1000>, /* SCU */ <0 0xec5a0000 0 0x100>, /* ADG */ <0 0xec540000 0 0x1000>, /* SSIU */ - <0 0xec541000 0 0x1280>; /* SSI */ + <0 0xec541000 0 0x1280>, /* SSI */ + <0 0xec740000 0 0x200>; /* Audio DMAC peri peri*/ + reg-names = "scu", "adg", "ssiu", "ssi", "audmapp"; + + clocks = <&mstp10_clks R8A7790_CLK_SSI_ALL>, + <&mstp10_clks R8A7790_CLK_SSI9>, <&mstp10_clks R8A7790_CLK_SSI8>, + <&mstp10_clks R8A7790_CLK_SSI7>, <&mstp10_clks R8A7790_CLK_SSI6>, + <&mstp10_clks R8A7790_CLK_SSI5>, <&mstp10_clks R8A7790_CLK_SSI4>, + <&mstp10_clks R8A7790_CLK_SSI3>, <&mstp10_clks R8A7790_CLK_SSI2>, + <&mstp10_clks R8A7790_CLK_SSI1>, <&mstp10_clks R8A7790_CLK_SSI0>, + <&mstp10_clks R8A7790_CLK_SCU_SRC9>, <&mstp10_clks R8A7790_CLK_SCU_SRC8>, + <&mstp10_clks R8A7790_CLK_SCU_SRC7>, <&mstp10_clks R8A7790_CLK_SCU_SRC6>, + <&mstp10_clks R8A7790_CLK_SCU_SRC5>, <&mstp10_clks R8A7790_CLK_SCU_SRC4>, + <&mstp10_clks R8A7790_CLK_SCU_SRC3>, <&mstp10_clks R8A7790_CLK_SCU_SRC2>, + <&mstp10_clks R8A7790_CLK_SCU_SRC1>, <&mstp10_clks R8A7790_CLK_SCU_SRC0>, + <&mstp10_clks R8A7790_CLK_SCU_DVC0>, <&mstp10_clks R8A7790_CLK_SCU_DVC1>, + <&audio_clk_a>, <&audio_clk_b>, <&audio_clk_c>, <&m2_clk>; + clock-names = "ssi-all", + "ssi.9", "ssi.8", "ssi.7", "ssi.6", "ssi.5", + "ssi.4", "ssi.3", "ssi.2", "ssi.1", "ssi.0", + "src.9", "src.8", "src.7", "src.6", "src.5", + "src.4", "src.3", "src.2", "src.1", "src.0", + "dvc.0", "dvc.1", + "clk_a", "clk_b", "clk_c", "clk_i"; rcar_sound,dvc { - dvc0: dvc@0 { }; - dvc1: dvc@1 { }; + dvc0: dvc@0 { + dmas = <&audma0 0xbc>; + dma-names = "tx"; + }; + dvc1: dvc@1 { + dmas = <&audma0 0xbe>; + dma-names = "tx"; + }; }; rcar_sound,src { - src0: src@0 { }; - src1: src@1 { }; - src2: src@2 { }; - src3: src@3 { }; - src4: src@4 { }; - src5: src@5 { }; - src6: src@6 { }; - src7: src@7 { }; - src8: src@8 { }; - src9: src@9 { }; + src0: src@0 { + interrupts = <0 352 IRQ_TYPE_LEVEL_HIGH>; + dmas = <&audma0 0x85>, <&audma1 0x9a>; + dma-names = "rx", "tx"; + }; + src1: src@1 { + interrupts = <0 353 IRQ_TYPE_LEVEL_HIGH>; + dmas = <&audma0 0x87>, <&audma1 0x9c>; + dma-names = "rx", "tx"; + }; + src2: src@2 { + interrupts = <0 354 IRQ_TYPE_LEVEL_HIGH>; + dmas = <&audma0 0x89>, <&audma1 0x9e>; + dma-names = "rx", "tx"; + }; + src3: src@3 { + interrupts = <0 355 IRQ_TYPE_LEVEL_HIGH>; + dmas = <&audma0 0x8b>, <&audma1 0xa0>; + dma-names = "rx", "tx"; + }; + src4: src@4 { + interrupts = <0 356 IRQ_TYPE_LEVEL_HIGH>; + dmas = <&audma0 0x8d>, <&audma1 0xb0>; + dma-names = "rx", "tx"; + }; + src5: src@5 { + interrupts = <0 357 IRQ_TYPE_LEVEL_HIGH>; + dmas = <&audma0 0x8f>, <&audma1 0xb2>; + dma-names = "rx", "tx"; + }; + src6: src@6 { + interrupts = <0 358 IRQ_TYPE_LEVEL_HIGH>; + dmas = <&audma0 0x91>, <&audma1 0xb4>; + dma-names = "rx", "tx"; + }; + src7: src@7 { + interrupts = <0 359 IRQ_TYPE_LEVEL_HIGH>; + dmas = <&audma0 0x93>, <&audma1 0xb6>; + dma-names = "rx", "tx"; + }; + src8: src@8 { + interrupts = <0 360 IRQ_TYPE_LEVEL_HIGH>; + dmas = <&audma0 0x95>, <&audma1 0xb8>; + dma-names = "rx", "tx"; + }; + src9: src@9 { + interrupts = <0 361 IRQ_TYPE_LEVEL_HIGH>; + dmas = <&audma0 0x97>, <&audma1 0xba>; + dma-names = "rx", "tx"; + }; }; rcar_sound,ssi { ssi0: ssi@0 { interrupts = <0 370 IRQ_TYPE_LEVEL_HIGH>; + dmas = <&audma0 0x01>, <&audma1 0x02>, <&audma0 0x15>, <&audma1 0x16>; + dma-names = "rx", "tx", "rxu", "txu"; }; ssi1: ssi@1 { interrupts = <0 371 IRQ_TYPE_LEVEL_HIGH>; + dmas = <&audma0 0x03>, <&audma1 0x04>, <&audma0 0x49>, <&audma1 0x4a>; + dma-names = "rx", "tx", "rxu", "txu"; }; ssi2: ssi@2 { interrupts = <0 372 IRQ_TYPE_LEVEL_HIGH>; + dmas = <&audma0 0x05>, <&audma1 0x06>, <&audma0 0x63>, <&audma1 0x64>; + dma-names = "rx", "tx", "rxu", "txu"; }; ssi3: ssi@3 { interrupts = <0 373 IRQ_TYPE_LEVEL_HIGH>; + dmas = <&audma0 0x07>, <&audma1 0x08>, <&audma0 0x6f>, <&audma1 0x70>; + dma-names = "rx", "tx", "rxu", "txu"; }; ssi4: ssi@4 { interrupts = <0 374 IRQ_TYPE_LEVEL_HIGH>; + dmas = <&audma0 0x09>, <&audma1 0x0a>, <&audma0 0x71>, <&audma1 0x72>; + dma-names = "rx", "tx", "rxu", "txu"; }; ssi5: ssi@5 { interrupts = <0 375 IRQ_TYPE_LEVEL_HIGH>; + dmas = <&audma0 0x0b>, <&audma1 0x0c>, <&audma0 0x73>, <&audma1 0x74>; + dma-names = "rx", "tx", "rxu", "txu"; }; ssi6: ssi@6 { interrupts = <0 376 IRQ_TYPE_LEVEL_HIGH>; + dmas = <&audma0 0x0d>, <&audma1 0x0e>, <&audma0 0x75>, <&audma1 0x76>; + dma-names = "rx", "tx", "rxu", "txu"; }; ssi7: ssi@7 { interrupts = <0 377 IRQ_TYPE_LEVEL_HIGH>; + dmas = <&audma0 0x0f>, <&audma1 0x10>, <&audma0 0x79>, <&audma1 0x7a>; + dma-names = "rx", "tx", "rxu", "txu"; }; ssi8: ssi@8 { interrupts = <0 378 IRQ_TYPE_LEVEL_HIGH>; + dmas = <&audma0 0x11>, <&audma1 0x12>, <&audma0 0x7b>, <&audma1 0x7c>; + dma-names = "rx", "tx", "rxu", "txu"; }; ssi9: ssi@9 { interrupts = <0 379 IRQ_TYPE_LEVEL_HIGH>; + dmas = <&audma0 0x13>, <&audma1 0x14>, <&audma0 0x7d>, <&audma1 0x7e>; + dma-names = "rx", "tx", "rxu", "txu"; }; }; diff --git a/Documentation/devicetree/bindings/sound/renesas,rsrc-card.txt b/Documentation/devicetree/bindings/sound/renesas,rsrc-card.txt new file mode 100644 index 0000000..c641550 --- /dev/null +++ b/Documentation/devicetree/bindings/sound/renesas,rsrc-card.txt @@ -0,0 +1,67 @@ +Renesas Sampling Rate Convert Sound Card: + +Renesas Sampling Rate Convert Sound Card specifies audio DAI connections of SoC <-> codec. + +Required properties: + +- compatible : "renesas,rsrc-card,<board>" + Examples with soctypes are: + - "renesas,rsrc-card,lager" + - "renesas,rsrc-card,koelsch" +Optional properties: + +- card_name : User specified audio sound card name, one string + property. +- cpu : CPU sub-node +- codec : CODEC sub-node + +Optional subnode properties: + +- format : CPU/CODEC common audio format. + "i2s", "right_j", "left_j" , "dsp_a" + "dsp_b", "ac97", "pdm", "msb", "lsb" +- frame-master : Indicates dai-link frame master. + phandle to a cpu or codec subnode. +- bitclock-master : Indicates dai-link bit clock master. + phandle to a cpu or codec subnode. +- bitclock-inversion : bool property. Add this if the + dai-link uses bit clock inversion. +- frame-inversion : bool property. Add this if the + dai-link uses frame clock inversion. +- convert-rate : platform specified sampling rate convert + +Required CPU/CODEC subnodes properties: + +- sound-dai : phandle and port of CPU/CODEC + +Optional CPU/CODEC subnodes properties: + +- clocks / system-clock-frequency : specify subnode's clock if needed. + it can be specified via "clocks" if system has + clock node (= common clock), or "system-clock-frequency" + (if system doens't support common clock) + If a clock is specified, it is + enabled with clk_prepare_enable() + in dai startup() and disabled with + clk_disable_unprepare() in dai + shutdown(). + +Example + +sound { + compatible = "renesas,rsrc-card,lager"; + + card-name = "rsnd-ak4643"; + format = "left_j"; + bitclock-master = <&sndcodec>; + frame-master = <&sndcodec>; + + sndcpu: cpu { + sound-dai = <&rcar_sound>; + }; + + sndcodec: codec { + sound-dai = <&ak4643>; + system-clock-frequency = <11289600>; + }; +}; diff --git a/Documentation/devicetree/bindings/sound/storm.txt b/Documentation/devicetree/bindings/sound/storm.txt new file mode 100644 index 0000000..062a4c1 --- /dev/null +++ b/Documentation/devicetree/bindings/sound/storm.txt @@ -0,0 +1,23 @@ +* Sound complex for Storm boards + +Models a soundcard for Storm boards with the Qualcomm Technologies IPQ806x SOC +connected to a MAX98357A DAC via I2S. + +Required properties: + +- compatible : "google,storm-audio" +- cpu : Phandle of the CPU DAI +- codec : Phandle of the codec DAI + +Optional properties: + +- qcom,model : The user-visible name of this sound card. + +Example: + +sound { + compatible = "google,storm-audio"; + qcom,model = "ipq806x-storm"; + cpu = <&lpass_cpu>; + codec = <&max98357a>; +}; diff --git a/Documentation/devicetree/bindings/sound/wm8804.txt b/Documentation/devicetree/bindings/sound/wm8804.txt index 4d3a56f..6fd124b 100644 --- a/Documentation/devicetree/bindings/sound/wm8804.txt +++ b/Documentation/devicetree/bindings/sound/wm8804.txt @@ -10,6 +10,13 @@ Required properties: - reg : the I2C address of the device for I2C, the chip select number for SPI. + - PVDD-supply, DVDD-supply : Power supplies for the device, as covered + in Documentation/devicetree/bindings/regulator/regulator.txt + +Optional properties: + + - wlf,reset-gpio: A GPIO specifier for the GPIO controlling the reset pin + Example: codec: wm8804@1a { diff --git a/Documentation/devicetree/booting-without-of.txt b/Documentation/devicetree/booting-without-of.txt index 7768518..e49e423 100644 --- a/Documentation/devicetree/booting-without-of.txt +++ b/Documentation/devicetree/booting-without-of.txt @@ -15,6 +15,7 @@ Table of Contents 1) Entry point for arch/arm 2) Entry point for arch/powerpc 3) Entry point for arch/x86 + 4) Entry point for arch/mips/bmips II - The DT block format 1) Header @@ -288,6 +289,33 @@ it with special cases. or initrd address. It simply holds information which can not be retrieved otherwise like interrupt routing or a list of devices behind an I2C bus. +4) Entry point for arch/mips/bmips +---------------------------------- + + Some bootloaders only support a single entry point, at the start of the + kernel image. Other bootloaders will jump to the ELF start address. + Both schemes are supported; CONFIG_BOOT_RAW=y and CONFIG_NO_EXCEPT_FILL=y, + so the first instruction immediately jumps to kernel_entry(). + + Similar to the arch/arm case (b), a DT-aware bootloader is expected to + set up the following registers: + + a0 : 0 + + a1 : 0xffffffff + + a2 : Physical pointer to the device tree block (defined in chapter + II) in RAM. The device tree can be located anywhere in the first + 512MB of the physical address space (0x00000000 - 0x1fffffff), + aligned on a 64 bit boundary. + + Legacy bootloaders do not use this convention, and they do not pass in a + DT block. In this case, Linux will look for a builtin DTB, selected via + CONFIG_DT_*. + + This convention is defined for 32-bit systems only, as there are not + currently any 64-bit BMIPS implementations. + II - The DT block format ======================== diff --git a/Documentation/driver-model/devres.txt b/Documentation/driver-model/devres.txt index 6d1e8ee..e1e2bbd 100644 --- a/Documentation/driver-model/devres.txt +++ b/Documentation/driver-model/devres.txt @@ -289,6 +289,10 @@ IRQ devm_request_irq() devm_request_threaded_irq() +LED + devm_led_classdev_register() + devm_led_classdev_unregister() + MDIO devm_mdiobus_alloc() devm_mdiobus_alloc_size() diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking index f91926f..0a926e2 100644 --- a/Documentation/filesystems/Locking +++ b/Documentation/filesystems/Locking @@ -196,7 +196,7 @@ prototypes: void (*invalidatepage) (struct page *, unsigned int, unsigned int); int (*releasepage) (struct page *, int); void (*freepage)(struct page *); - int (*direct_IO)(int, struct kiocb *, struct iov_iter *iter, loff_t offset); + int (*direct_IO)(struct kiocb *, struct iov_iter *iter, loff_t offset); int (*migratepage)(struct address_space *, struct page *, struct page *); int (*launder_page)(struct page *); int (*is_partially_uptodate)(struct page *, unsigned long, unsigned long); @@ -429,8 +429,6 @@ prototypes: loff_t (*llseek) (struct file *, loff_t, int); ssize_t (*read) (struct file *, char __user *, size_t, loff_t *); ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *); - ssize_t (*aio_read) (struct kiocb *, const struct iovec *, unsigned long, loff_t); - ssize_t (*aio_write) (struct kiocb *, const struct iovec *, unsigned long, loff_t); ssize_t (*read_iter) (struct kiocb *, struct iov_iter *); ssize_t (*write_iter) (struct kiocb *, struct iov_iter *); int (*iterate) (struct file *, struct dir_context *); @@ -525,6 +523,7 @@ prototypes: void (*close)(struct vm_area_struct*); int (*fault)(struct vm_area_struct*, struct vm_fault *); int (*page_mkwrite)(struct vm_area_struct *, struct vm_fault *); + int (*pfn_mkwrite)(struct vm_area_struct *, struct vm_fault *); int (*access)(struct vm_area_struct *, unsigned long, void*, int, int); locking rules: @@ -534,6 +533,7 @@ close: yes fault: yes can return with page locked map_pages: yes page_mkwrite: yes can return with page locked +pfn_mkwrite: yes access: yes ->fault() is called when a previously not present pte is about @@ -560,6 +560,12 @@ the page has been truncated, the filesystem should not look up a new page like the ->fault() handler, but simply return with VM_FAULT_NOPAGE, which will cause the VM to retry the fault. + ->pfn_mkwrite() is the same as page_mkwrite but when the pte is +VM_PFNMAP or VM_MIXEDMAP with a page-less entry. Expected return is +VM_FAULT_NOPAGE. Or one of the VM_FAULT_ERROR types. The default behavior +after this call is to make the pte read-write, unless pfn_mkwrite returns +an error. + ->access() is called when get_user_pages() fails in access_process_vm(), typically used to debug a process through /proc/pid/mem or ptrace. This function is needed only for diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting index fa2db08..e69274d 100644 --- a/Documentation/filesystems/porting +++ b/Documentation/filesystems/porting @@ -471,3 +471,15 @@ in your dentry operations instead. [mandatory] f_dentry is gone; use f_path.dentry, or, better yet, see if you can avoid it entirely. +-- +[mandatory] + never call ->read() and ->write() directly; use __vfs_{read,write} or + wrappers; instead of checking for ->write or ->read being NULL, look for + FMODE_CAN_{WRITE,READ} in file->f_mode. +-- +[mandatory] + do _not_ use new_sync_{read,write} for ->read/->write; leave it NULL + instead. +-- +[mandatory] + ->aio_read/->aio_write are gone. Use ->read_iter/->write_iter. diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt index a07ba61..8e36c7e 100644 --- a/Documentation/filesystems/proc.txt +++ b/Documentation/filesystems/proc.txt @@ -200,12 +200,12 @@ contains details information about the process itself. Its fields are explained in Table 1-4. (for SMP CONFIG users) -For making accounting scalable, RSS related information are handled in -asynchronous manner and the vaule may not be very precise. To see a precise +For making accounting scalable, RSS related information are handled in an +asynchronous manner and the value may not be very precise. To see a precise snapshot of a moment, you can see /proc/<pid>/smaps file and scan page table. It's slow but very precise. -Table 1-2: Contents of the status files (as of 2.6.30-rc7) +Table 1-2: Contents of the status files (as of 3.20.0) .............................................................................. Field Content Name filename of the executable @@ -213,6 +213,7 @@ Table 1-2: Contents of the status files (as of 2.6.30-rc7) in an uninterruptible wait, Z is zombie, T is traced or stopped) Tgid thread group ID + Ngid NUMA group ID (0 if none) Pid process id PPid process id of the parent process TracerPid PID of process tracing this process (0 if not) @@ -220,6 +221,10 @@ Table 1-2: Contents of the status files (as of 2.6.30-rc7) Gid Real, effective, saved set, and file system GIDs FDSize number of file descriptor slots currently allocated Groups supplementary group list + NStgid descendant namespace thread group ID hierarchy + NSpid descendant namespace process ID hierarchy + NSpgid descendant namespace process group ID hierarchy + NSsid descendant namespace session ID hierarchy VmPeak peak virtual memory size VmSize total program size VmLck locked memory size @@ -1704,6 +1709,10 @@ A typical output is flags: 0100002 mnt_id: 19 +All locks associated with a file descriptor are shown in its fdinfo too. + +lock: 1: FLOCK ADVISORY WRITE 359 00:13:11691 0 EOF + The files such as eventfd, fsnotify, signalfd, epoll among the regular pos/flags pair provide additional information particular to the objects they represent. diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt index 966b228..5d833b3 100644 --- a/Documentation/filesystems/vfs.txt +++ b/Documentation/filesystems/vfs.txt @@ -590,7 +590,7 @@ struct address_space_operations { void (*invalidatepage) (struct page *, unsigned int, unsigned int); int (*releasepage) (struct page *, int); void (*freepage)(struct page *); - ssize_t (*direct_IO)(int, struct kiocb *, struct iov_iter *iter, loff_t offset); + ssize_t (*direct_IO)(struct kiocb *, struct iov_iter *iter, loff_t offset); /* migrate the contents of a page to the specified target */ int (*migratepage) (struct page *, struct page *); int (*launder_page) (struct page *); @@ -804,8 +804,6 @@ struct file_operations { loff_t (*llseek) (struct file *, loff_t, int); ssize_t (*read) (struct file *, char __user *, size_t, loff_t *); ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *); - ssize_t (*aio_read) (struct kiocb *, const struct iovec *, unsigned long, loff_t); - ssize_t (*aio_write) (struct kiocb *, const struct iovec *, unsigned long, loff_t); ssize_t (*read_iter) (struct kiocb *, struct iov_iter *); ssize_t (*write_iter) (struct kiocb *, struct iov_iter *); int (*iterate) (struct file *, struct dir_context *); @@ -838,14 +836,10 @@ otherwise noted. read: called by read(2) and related system calls - aio_read: vectored, possibly asynchronous read - read_iter: possibly asynchronous read with iov_iter as destination write: called by write(2) and related system calls - aio_write: vectored, possibly asynchronous write - write_iter: possibly asynchronous write with iov_iter as source iterate: called when the VFS needs to read the directory contents diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 491bbd1..11a76df 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -2317,7 +2317,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted. noexec32=off: disable non-executable mappings read implies executable mappings - nofpu [SH] Disable hardware FPU at boot time. + nofpu [MIPS,SH] Disable hardware FPU at boot time. nofxsr [BUGS=X86-32] Disables x86 floating point extended register save and restore. The kernel will only save diff --git a/Documentation/leds/leds-class-flash.txt b/Documentation/leds/leds-class-flash.txt new file mode 100644 index 0000000..19bb673 --- /dev/null +++ b/Documentation/leds/leds-class-flash.txt @@ -0,0 +1,22 @@ + +Flash LED handling under Linux +============================== + +Some LED devices provide two modes - torch and flash. In the LED subsystem +those modes are supported by LED class (see Documentation/leds/leds-class.txt) +and LED Flash class respectively. The torch mode related features are enabled +by default and the flash ones only if a driver declares it by setting +LED_DEV_CAP_FLASH flag. + +In order to enable the support for flash LEDs CONFIG_LEDS_CLASS_FLASH symbol +must be defined in the kernel config. A LED Flash class driver must be +registered in the LED subsystem with led_classdev_flash_register function. + +Following sysfs attributes are exposed for controlling flash LED devices: +(see Documentation/ABI/testing/sysfs-class-led-flash) + - flash_brightness + - max_flash_brightness + - flash_timeout + - max_flash_timeout + - flash_strobe + - flash_fault diff --git a/Documentation/powerpc/pci_iov_resource_on_powernv.txt b/Documentation/powerpc/pci_iov_resource_on_powernv.txt new file mode 100644 index 0000000..b55c5cd --- /dev/null +++ b/Documentation/powerpc/pci_iov_resource_on_powernv.txt @@ -0,0 +1,301 @@ +Wei Yang <weiyang@linux.vnet.ibm.com> +Benjamin Herrenschmidt <benh@au1.ibm.com> +Bjorn Helgaas <bhelgaas@google.com> +26 Aug 2014 + +This document describes the requirement from hardware for PCI MMIO resource +sizing and assignment on PowerKVM and how generic PCI code handles this +requirement. The first two sections describe the concepts of Partitionable +Endpoints and the implementation on P8 (IODA2). The next two sections talks +about considerations on enabling SRIOV on IODA2. + +1. Introduction to Partitionable Endpoints + +A Partitionable Endpoint (PE) is a way to group the various resources +associated with a device or a set of devices to provide isolation between +partitions (i.e., filtering of DMA, MSIs etc.) and to provide a mechanism +to freeze a device that is causing errors in order to limit the possibility +of propagation of bad data. + +There is thus, in HW, a table of PE states that contains a pair of "frozen" +state bits (one for MMIO and one for DMA, they get set together but can be +cleared independently) for each PE. + +When a PE is frozen, all stores in any direction are dropped and all loads +return all 1's value. MSIs are also blocked. There's a bit more state that +captures things like the details of the error that caused the freeze etc., but +that's not critical. + +The interesting part is how the various PCIe transactions (MMIO, DMA, ...) +are matched to their corresponding PEs. + +The following section provides a rough description of what we have on P8 +(IODA2). Keep in mind that this is all per PHB (PCI host bridge). Each PHB +is a completely separate HW entity that replicates the entire logic, so has +its own set of PEs, etc. + +2. Implementation of Partitionable Endpoints on P8 (IODA2) + +P8 supports up to 256 Partitionable Endpoints per PHB. + + * Inbound + + For DMA, MSIs and inbound PCIe error messages, we have a table (in + memory but accessed in HW by the chip) that provides a direct + correspondence between a PCIe RID (bus/dev/fn) with a PE number. + We call this the RTT. + + - For DMA we then provide an entire address space for each PE that can + contain two "windows", depending on the value of PCI address bit 59. + Each window can be configured to be remapped via a "TCE table" (IOMMU + translation table), which has various configurable characteristics + not described here. + + - For MSIs, we have two windows in the address space (one at the top of + the 32-bit space and one much higher) which, via a combination of the + address and MSI value, will result in one of the 2048 interrupts per + bridge being triggered. There's a PE# in the interrupt controller + descriptor table as well which is compared with the PE# obtained from + the RTT to "authorize" the device to emit that specific interrupt. + + - Error messages just use the RTT. + + * Outbound. That's where the tricky part is. + + Like other PCI host bridges, the Power8 IODA2 PHB supports "windows" + from the CPU address space to the PCI address space. There is one M32 + window and sixteen M64 windows. They have different characteristics. + First what they have in common: they forward a configurable portion of + the CPU address space to the PCIe bus and must be naturally aligned + power of two in size. The rest is different: + + - The M32 window: + + * Is limited to 4GB in size. + + * Drops the top bits of the address (above the size) and replaces + them with a configurable value. This is typically used to generate + 32-bit PCIe accesses. We configure that window at boot from FW and + don't touch it from Linux; it's usually set to forward a 2GB + portion of address space from the CPU to PCIe + 0x8000_0000..0xffff_ffff. (Note: The top 64KB are actually + reserved for MSIs but this is not a problem at this point; we just + need to ensure Linux doesn't assign anything there, the M32 logic + ignores that however and will forward in that space if we try). + + * It is divided into 256 segments of equal size. A table in the chip + maps each segment to a PE#. That allows portions of the MMIO space + to be assigned to PEs on a segment granularity. For a 2GB window, + the segment granularity is 2GB/256 = 8MB. + + Now, this is the "main" window we use in Linux today (excluding + SR-IOV). We basically use the trick of forcing the bridge MMIO windows + onto a segment alignment/granularity so that the space behind a bridge + can be assigned to a PE. + + Ideally we would like to be able to have individual functions in PEs + but that would mean using a completely different address allocation + scheme where individual function BARs can be "grouped" to fit in one or + more segments. + + - The M64 windows: + + * Must be at least 256MB in size. + + * Do not translate addresses (the address on PCIe is the same as the + address on the PowerBus). There is a way to also set the top 14 + bits which are not conveyed by PowerBus but we don't use this. + + * Can be configured to be segmented. When not segmented, we can + specify the PE# for the entire window. When segmented, a window + has 256 segments; however, there is no table for mapping a segment + to a PE#. The segment number *is* the PE#. + + * Support overlaps. If an address is covered by multiple windows, + there's a defined ordering for which window applies. + + We have code (fairly new compared to the M32 stuff) that exploits that + for large BARs in 64-bit space: + + We configure an M64 window to cover the entire region of address space + that has been assigned by FW for the PHB (about 64GB, ignore the space + for the M32, it comes out of a different "reserve"). We configure it + as segmented. + + Then we do the same thing as with M32, using the bridge alignment + trick, to match to those giant segments. + + Since we cannot remap, we have two additional constraints: + + - We do the PE# allocation *after* the 64-bit space has been assigned + because the addresses we use directly determine the PE#. We then + update the M32 PE# for the devices that use both 32-bit and 64-bit + spaces or assign the remaining PE# to 32-bit only devices. + + - We cannot "group" segments in HW, so if a device ends up using more + than one segment, we end up with more than one PE#. There is a HW + mechanism to make the freeze state cascade to "companion" PEs but + that only works for PCIe error messages (typically used so that if + you freeze a switch, it freezes all its children). So we do it in + SW. We lose a bit of effectiveness of EEH in that case, but that's + the best we found. So when any of the PEs freezes, we freeze the + other ones for that "domain". We thus introduce the concept of + "master PE" which is the one used for DMA, MSIs, etc., and "secondary + PEs" that are used for the remaining M64 segments. + + We would like to investigate using additional M64 windows in "single + PE" mode to overlay over specific BARs to work around some of that, for + example for devices with very large BARs, e.g., GPUs. It would make + sense, but we haven't done it yet. + +3. Considerations for SR-IOV on PowerKVM + + * SR-IOV Background + + The PCIe SR-IOV feature allows a single Physical Function (PF) to + support several Virtual Functions (VFs). Registers in the PF's SR-IOV + Capability control the number of VFs and whether they are enabled. + + When VFs are enabled, they appear in Configuration Space like normal + PCI devices, but the BARs in VF config space headers are unusual. For + a non-VF device, software uses BARs in the config space header to + discover the BAR sizes and assign addresses for them. For VF devices, + software uses VF BAR registers in the *PF* SR-IOV Capability to + discover sizes and assign addresses. The BARs in the VF's config space + header are read-only zeros. + + When a VF BAR in the PF SR-IOV Capability is programmed, it sets the + base address for all the corresponding VF(n) BARs. For example, if the + PF SR-IOV Capability is programmed to enable eight VFs, and it has a + 1MB VF BAR0, the address in that VF BAR sets the base of an 8MB region. + This region is divided into eight contiguous 1MB regions, each of which + is a BAR0 for one of the VFs. Note that even though the VF BAR + describes an 8MB region, the alignment requirement is for a single VF, + i.e., 1MB in this example. + + There are several strategies for isolating VFs in PEs: + + - M32 window: There's one M32 window, and it is split into 256 + equally-sized segments. The finest granularity possible is a 256MB + window with 1MB segments. VF BARs that are 1MB or larger could be + mapped to separate PEs in this window. Each segment can be + individually mapped to a PE via the lookup table, so this is quite + flexible, but it works best when all the VF BARs are the same size. If + they are different sizes, the entire window has to be small enough that + the segment size matches the smallest VF BAR, which means larger VF + BARs span several segments. + + - Non-segmented M64 window: A non-segmented M64 window is mapped entirely + to a single PE, so it could only isolate one VF. + + - Single segmented M64 windows: A segmented M64 window could be used just + like the M32 window, but the segments can't be individually mapped to + PEs (the segment number is the PE#), so there isn't as much + flexibility. A VF with multiple BARs would have to be in a "domain" of + multiple PEs, which is not as well isolated as a single PE. + + - Multiple segmented M64 windows: As usual, each window is split into 256 + equally-sized segments, and the segment number is the PE#. But if we + use several M64 windows, they can be set to different base addresses + and different segment sizes. If we have VFs that each have a 1MB BAR + and a 32MB BAR, we could use one M64 window to assign 1MB segments and + another M64 window to assign 32MB segments. + + Finally, the plan to use M64 windows for SR-IOV, which will be described + more in the next two sections. For a given VF BAR, we need to + effectively reserve the entire 256 segments (256 * VF BAR size) and + position the VF BAR to start at the beginning of a free range of + segments/PEs inside that M64 window. + + The goal is of course to be able to give a separate PE for each VF. + + The IODA2 platform has 16 M64 windows, which are used to map MMIO + range to PE#. Each M64 window defines one MMIO range and this range is + divided into 256 segments, with each segment corresponding to one PE. + + We decide to leverage this M64 window to map VFs to individual PEs, since + SR-IOV VF BARs are all the same size. + + But doing so introduces another problem: total_VFs is usually smaller + than the number of M64 window segments, so if we map one VF BAR directly + to one M64 window, some part of the M64 window will map to another + device's MMIO range. + + IODA supports 256 PEs, so segmented windows contain 256 segments, so if + total_VFs is less than 256, we have the situation in Figure 1.0, where + segments [total_VFs, 255] of the M64 window may map to some MMIO range on + other devices: + + 0 1 total_VFs - 1 + +------+------+- -+------+------+ + | | | ... | | | + +------+------+- -+------+------+ + + VF(n) BAR space + + 0 1 total_VFs - 1 255 + +------+------+- -+------+------+- -+------+------+ + | | | ... | | | ... | | | + +------+------+- -+------+------+- -+------+------+ + + M64 window + + Figure 1.0 Direct map VF(n) BAR space + + Our current solution is to allocate 256 segments even if the VF(n) BAR + space doesn't need that much, as shown in Figure 1.1: + + 0 1 total_VFs - 1 255 + +------+------+- -+------+------+- -+------+------+ + | | | ... | | | ... | | | + +------+------+- -+------+------+- -+------+------+ + + VF(n) BAR space + extra + + 0 1 total_VFs - 1 255 + +------+------+- -+------+------+- -+------+------+ + | | | ... | | | ... | | | + +------+------+- -+------+------+- -+------+------+ + + M64 window + + Figure 1.1 Map VF(n) BAR space + extra + + Allocating the extra space ensures that the entire M64 window will be + assigned to this one SR-IOV device and none of the space will be + available for other devices. Note that this only expands the space + reserved in software; there are still only total_VFs VFs, and they only + respond to segments [0, total_VFs - 1]. There's nothing in hardware that + responds to segments [total_VFs, 255]. + +4. Implications for the Generic PCI Code + +The PCIe SR-IOV spec requires that the base of the VF(n) BAR space be +aligned to the size of an individual VF BAR. + +In IODA2, the MMIO address determines the PE#. If the address is in an M32 +window, we can set the PE# by updating the table that translates segments +to PE#s. Similarly, if the address is in an unsegmented M64 window, we can +set the PE# for the window. But if it's in a segmented M64 window, the +segment number is the PE#. + +Therefore, the only way to control the PE# for a VF is to change the base +of the VF(n) BAR space in the VF BAR. If the PCI core allocates the exact +amount of space required for the VF(n) BAR space, the VF BAR value is fixed +and cannot be changed. + +On the other hand, if the PCI core allocates additional space, the VF BAR +value can be changed as long as the entire VF(n) BAR space remains inside +the space allocated by the core. + +Ideally the segment size will be the same as an individual VF BAR size. +Then each VF will be in its own PE. The VF BARs (and therefore the PE#s) +are contiguous. If VF0 is in PE(x), then VF(n) is in PE(x+n). If we +allocate 256 segments, there are (256 - numVFs) choices for the PE# of VF0. + +If the segment size is smaller than the VF BAR size, it will take several +segments to cover a VF BAR, and a VF will be in several PEs. This is +possible, but the isolation isn't as good, and it reduces the number of PE# +choices because instead of consuming only numVFs segments, the VF(n) BAR +space will consume (numVFs * n) segments. That means there aren't as many +available segments for adjusting base of the VF(n) BAR space. diff --git a/Documentation/powerpc/transactional_memory.txt b/Documentation/powerpc/transactional_memory.txt index 9791e98a..ba0a2a4 100644 --- a/Documentation/powerpc/transactional_memory.txt +++ b/Documentation/powerpc/transactional_memory.txt @@ -74,22 +74,23 @@ Causes of transaction aborts Syscalls ======== -Performing syscalls from within transaction is not recommended, and can lead -to unpredictable results. +Syscalls made from within an active transaction will not be performed and the +transaction will be doomed by the kernel with the failure code TM_CAUSE_SYSCALL +| TM_CAUSE_PERSISTENT. -Syscalls do not by design abort transactions, but beware: The kernel code will -not be running in transactional state. The effect of syscalls will always -remain visible, but depending on the call they may abort your transaction as a -side-effect, read soon-to-be-aborted transactional data that should not remain -invisible, etc. If you constantly retry a transaction that constantly aborts -itself by calling a syscall, you'll have a livelock & make no progress. +Syscalls made from within a suspended transaction are performed as normal and +the transaction is not explicitly doomed by the kernel. However, what the +kernel does to perform the syscall may result in the transaction being doomed +by the hardware. The syscall is performed in suspended mode so any side +effects will be persistent, independent of transaction success or failure. No +guarantees are provided by the kernel about which syscalls will affect +transaction success. -Simple syscalls (e.g. sigprocmask()) "could" be OK. Even things like write() -from, say, printf() should be OK as long as the kernel does not access any -memory that was accessed transactionally. - -Consider any syscalls that happen to work as debug-only -- not recommended for -production use. Best to queue them up till after the transaction is over. +Care must be taken when relying on syscalls to abort during active transactions +if the calls are made via a library. Libraries may cache values (which may +give the appearance of success) or perform operations that cause transaction +failure before entering the kernel (which may produce different failure codes). +Examples are glibc's getpid() and lazy symbol resolution. Signals @@ -174,10 +175,9 @@ These are defined in <asm/reg.h>, and distinguish different reasons why the kernel aborted a transaction: TM_CAUSE_RESCHED Thread was rescheduled. - TM_CAUSE_TLBI Software TLB invalide. + TM_CAUSE_TLBI Software TLB invalid. TM_CAUSE_FAC_UNAV FP/VEC/VSX unavailable trap. - TM_CAUSE_SYSCALL Currently unused; future syscalls that must abort - transactions for consistency will use this. + TM_CAUSE_SYSCALL Syscall from active transaction. TM_CAUSE_SIGNAL Signal delivered. TM_CAUSE_MISC Currently unused. TM_CAUSE_ALIGNMENT Alignment fault. @@ -185,7 +185,7 @@ kernel aborted a transaction: These can be checked by the user program's abort handler as TEXASR[0:7]. If bit 7 is set, it indicates that the error is consider persistent. For example -a TM_CAUSE_ALIGNMENT will be persistent while a TM_CAUSE_RESCHED will not.q +a TM_CAUSE_ALIGNMENT will be persistent while a TM_CAUSE_RESCHED will not. GDB === diff --git a/Documentation/printk-formats.txt b/Documentation/printk-formats.txt index 5a615c1..cb6a596 100644 --- a/Documentation/printk-formats.txt +++ b/Documentation/printk-formats.txt @@ -8,6 +8,21 @@ If variable is of Type, use printk format specifier: unsigned long long %llu or %llx size_t %zu or %zx ssize_t %zd or %zx + s32 %d or %x + u32 %u or %x + s64 %lld or %llx + u64 %llu or %llx + +If <type> is dependent on a config option for its size (e.g., sector_t, +blkcnt_t) or is architecture-dependent for its size (e.g., tcflag_t), use a +format specifier of its largest possible type and explicitly cast to it. +Example: + + printk("test: sector number/total blocks: %llu/%llu\n", + (unsigned long long)sector, (unsigned long long)blockcount); + +Reminder: sizeof() result is of type size_t. + Raw pointer value SHOULD be printed with %p. The kernel supports the following extended format specifiers for pointer types: @@ -54,6 +69,7 @@ Struct Resources: For printing struct resources. The 'R' and 'r' specifiers result in a printed resource with ('R') or without ('r') a decoded flags member. + Passed by reference. Physical addresses types phys_addr_t: @@ -132,6 +148,8 @@ MAC/FDDI addresses: specifier to use reversed byte order suitable for visual interpretation of Bluetooth addresses which are in the little endian order. + Passed by reference. + IPv4 addresses: %pI4 1.2.3.4 @@ -146,6 +164,8 @@ IPv4 addresses: host, network, big or little endian order addresses respectively. Where no specifier is provided the default network/big endian order is used. + Passed by reference. + IPv6 addresses: %pI6 0001:0002:0003:0004:0005:0006:0007:0008 @@ -160,6 +180,8 @@ IPv6 addresses: print a compressed IPv6 address as described by http://tools.ietf.org/html/rfc5952 + Passed by reference. + IPv4/IPv6 addresses (generic, with port, flowinfo, scope): %pIS 1.2.3.4 or 0001:0002:0003:0004:0005:0006:0007:0008 @@ -186,6 +208,8 @@ IPv4/IPv6 addresses (generic, with port, flowinfo, scope): specifiers can be used as well and are ignored in case of an IPv6 address. + Passed by reference. + Further examples: %pISfc 1.2.3.4 or [1:2:3:4:5:6:7:8]/123456789 @@ -207,6 +231,8 @@ UUID/GUID addresses: Where no additional specifiers are used the default little endian order with lower case hex characters will be printed. + Passed by reference. + dentry names: %pd{,2,3,4} %pD{,2,3,4} @@ -216,6 +242,8 @@ dentry names: equivalent of %s dentry->d_name.name we used to use, %pd<n> prints n last components. %pD does the same thing for struct file. + Passed by reference. + struct va_format: %pV @@ -231,23 +259,20 @@ struct va_format: Do not use this feature without some mechanism to verify the correctness of the format string and va_list arguments. -u64 SHOULD be printed with %llu/%llx: - - printk("%llu", u64_var); + Passed by reference. -s64 SHOULD be printed with %lld/%llx: +struct clk: - printk("%lld", s64_var); + %pC pll1 + %pCn pll1 + %pCr 1560000000 -If <type> is dependent on a config option for its size (e.g., sector_t, -blkcnt_t) or is architecture-dependent for its size (e.g., tcflag_t), use a -format specifier of its largest possible type and explicitly cast to it. -Example: + For printing struct clk structures. '%pC' and '%pCn' print the name + (Common Clock Framework) or address (legacy clock framework) of the + structure; '%pCr' prints the current clock rate. - printk("test: sector number/total blocks: %llu/%llu\n", - (unsigned long long)sector, (unsigned long long)blockcount); + Passed by reference. -Reminder: sizeof() result is of type size_t. Thank you for your cooperation and attention. diff --git a/Documentation/sound/alsa/ControlNames.txt b/Documentation/sound/alsa/ControlNames.txt index 79a6127..3fc1cf50 100644 --- a/Documentation/sound/alsa/ControlNames.txt +++ b/Documentation/sound/alsa/ControlNames.txt @@ -71,11 +71,11 @@ SOURCE: HDMI/DP (either HDMI or DisplayPort) Exceptions (deprecated): - [Digital] Capture Source - [Digital] Capture Switch (aka input gain switch) - [Digital] Capture Volume (aka input gain volume) - [Digital] Playback Switch (aka output gain switch) - [Digital] Playback Volume (aka output gain volume) + [Analogue|Digital] Capture Source + [Analogue|Digital] Capture Switch (aka input gain switch) + [Analogue|Digital] Capture Volume (aka input gain volume) + [Analogue|Digital] Playback Switch (aka output gain switch) + [Analogue|Digital] Playback Volume (aka output gain volume) Tone Control - Switch Tone Control - Bass Tone Control - Treble diff --git a/Documentation/sound/alsa/HD-Audio.txt b/Documentation/sound/alsa/HD-Audio.txt index 42a0a39..e7193aa 100644 --- a/Documentation/sound/alsa/HD-Audio.txt +++ b/Documentation/sound/alsa/HD-Audio.txt @@ -466,7 +466,11 @@ The generic parser supports the following hints: - add_jack_modes (bool): add "xxx Jack Mode" enum controls to each I/O jack for allowing to change the headphone amp and mic bias VREF capabilities -- power_down_unused (bool): power down the unused widgets +- power_save_node (bool): advanced power management for each widget, + controlling the power sate (D0/D3) of each widget node depending on + the actual pin and stream states +- power_down_unused (bool): power down the unused widgets, a subset of + power_save_node, and will be dropped in future - add_hp_mic (bool): add the headphone to capture source if possible - hp_mic_detect (bool): enable/disable the hp/mic shared input for a single built-in mic case; default true diff --git a/Documentation/sound/alsa/timestamping.txt b/Documentation/sound/alsa/timestamping.txt new file mode 100644 index 0000000..0b191a2 --- /dev/null +++ b/Documentation/sound/alsa/timestamping.txt @@ -0,0 +1,200 @@ +The ALSA API can provide two different system timestamps: + +- Trigger_tstamp is the system time snapshot taken when the .trigger +callback is invoked. This snapshot is taken by the ALSA core in the +general case, but specific hardware may have synchronization +capabilities or conversely may only be able to provide a correct +estimate with a delay. In the latter two cases, the low-level driver +is responsible for updating the trigger_tstamp at the most appropriate +and precise moment. Applications should not rely solely on the first +trigger_tstamp but update their internal calculations if the driver +provides a refined estimate with a delay. + +- tstamp is the current system timestamp updated during the last +event or application query. +The difference (tstamp - trigger_tstamp) defines the elapsed time. + +The ALSA API provides reports two basic pieces of information, avail +and delay, which combined with the trigger and current system +timestamps allow for applications to keep track of the 'fullness' of +the ring buffer and the amount of queued samples. + +The use of these different pointers and time information depends on +the application needs: + +- 'avail' reports how much can be written in the ring buffer +- 'delay' reports the time it will take to hear a new sample after all +queued samples have been played out. + +When timestamps are enabled, the avail/delay information is reported +along with a snapshot of system time. Applications can select from +CLOCK_REALTIME (NTP corrections including going backwards), +CLOCK_MONOTONIC (NTP corrections but never going backwards), +CLOCK_MONOTIC_RAW (without NTP corrections) and change the mode +dynamically with sw_params + + +The ALSA API also provide an audio_tstamp which reflects the passage +of time as measured by different components of audio hardware. In +ascii-art, this could be represented as follows (for the playback +case): + + +--------------------------------------------------------------> time + ^ ^ ^ ^ ^ + | | | | | + analog link dma app FullBuffer + time time time time time + | | | | | + |< codec delay >|<--hw delay-->|<queued samples>|<---avail->| + |<----------------- delay---------------------->| | + |<----ring buffer length---->| + +The analog time is taken at the last stage of the playback, as close +as possible to the actual transducer + +The link time is taken at the output of the SOC/chipset as the samples +are pushed on a link. The link time can be directly measured if +supported in hardware by sample counters or wallclocks (e.g. with +HDAudio 24MHz or PTP clock for networked solutions) or indirectly +estimated (e.g. with the frame counter in USB). + +The DMA time is measured using counters - typically the least reliable +of all measurements due to the bursty natured of DMA transfers. + +The app time corresponds to the time tracked by an application after +writing in the ring buffer. + +The application can query what the hardware supports, define which +audio time it wants reported by selecting the relevant settings in +audio_tstamp_config fields, get an estimate of the timestamp +accuracy. It can also request the delay-to-analog be included in the +measurement. Direct access to the link time is very interesting on +platforms that provide an embedded DSP; measuring directly the link +time with dedicated hardware, possibly synchronized with system time, +removes the need to keep track of internal DSP processing times and +latency. + +In case the application requests an audio tstamp that is not supported +in hardware/low-level driver, the type is overridden as DEFAULT and the +timestamp will report the DMA time based on the hw_pointer value. + +For backwards compatibility with previous implementations that did not +provide timestamp selection, with a zero-valued COMPAT timestamp type +the results will default to the HDAudio wall clock for playback +streams and to the DMA time (hw_ptr) in all other cases. + +The audio timestamp accuracy can be returned to user-space, so that +appropriate decisions are made: + +- for dma time (default), the granularity of the transfers can be + inferred from the steps between updates and in turn provide + information on how much the application pointer can be rewound + safely. + +- the link time can be used to track long-term drifts between audio + and system time using the (tstamp-trigger_tstamp)/audio_tstamp + ratio, the precision helps define how much smoothing/low-pass + filtering is required. The link time can be either reset on startup + or reported as is (the latter being useful to compare progress of + different streams - but may require the wallclock to be always + running and not wrap-around during idle periods). If supported in + hardware, the absolute link time could also be used to define a + precise start time (patches WIP) + +- including the delay in the audio timestamp may + counter-intuitively not increase the precision of timestamps, e.g. if a + codec includes variable-latency DSP processing or a chain of + hardware components the delay is typically not known with precision. + +The accuracy is reported in nanosecond units (using an unsigned 32-bit +word), which gives a max precision of 4.29s, more than enough for +audio applications... + +Due to the varied nature of timestamping needs, even for a single +application, the audio_tstamp_config can be changed dynamically. In +the STATUS ioctl, the parameters are read-only and do not allow for +any application selection. To work around this limitation without +impacting legacy applications, a new STATUS_EXT ioctl is introduced +with read/write parameters. ALSA-lib will be modified to make use of +STATUS_EXT and effectively deprecate STATUS. + +The ALSA API only allows for a single audio timestamp to be reported +at a time. This is a conscious design decision, reading the audio +timestamps from hardware registers or from IPC takes time, the more +timestamps are read the more imprecise the combined measurements +are. To avoid any interpretation issues, a single (system, audio) +timestamp is reported. Applications that need different timestamps +will be required to issue multiple queries and perform an +interpolation of the results + +In some hardware-specific configuration, the system timestamp is +latched by a low-level audio subsytem, and the information provided +back to the driver. Due to potential delays in the communication with +the hardware, there is a risk of misalignment with the avail and delay +information. To make sure applications are not confused, a +driver_timestamp field is added in the snd_pcm_status structure; this +timestamp shows when the information is put together by the driver +before returning from the STATUS and STATUS_EXT ioctl. in most cases +this driver_timestamp will be identical to the regular system tstamp. + +Examples of typestamping with HDaudio: + +1. DMA timestamp, no compensation for DMA+analog delay +$ ./audio_time -p --ts_type=1 +playback: systime: 341121338 nsec, audio time 342000000 nsec, systime delta -878662 +playback: systime: 426236663 nsec, audio time 427187500 nsec, systime delta -950837 +playback: systime: 597080580 nsec, audio time 598000000 nsec, systime delta -919420 +playback: systime: 682059782 nsec, audio time 683020833 nsec, systime delta -961051 +playback: systime: 852896415 nsec, audio time 853854166 nsec, systime delta -957751 +playback: systime: 937903344 nsec, audio time 938854166 nsec, systime delta -950822 + +2. DMA timestamp, compensation for DMA+analog delay +$ ./audio_time -p --ts_type=1 -d +playback: systime: 341053347 nsec, audio time 341062500 nsec, systime delta -9153 +playback: systime: 426072447 nsec, audio time 426062500 nsec, systime delta 9947 +playback: systime: 596899518 nsec, audio time 596895833 nsec, systime delta 3685 +playback: systime: 681915317 nsec, audio time 681916666 nsec, systime delta -1349 +playback: systime: 852741306 nsec, audio time 852750000 nsec, systime delta -8694 + +3. link timestamp, compensation for DMA+analog delay +$ ./audio_time -p --ts_type=2 -d +playback: systime: 341060004 nsec, audio time 341062791 nsec, systime delta -2787 +playback: systime: 426242074 nsec, audio time 426244875 nsec, systime delta -2801 +playback: systime: 597080992 nsec, audio time 597084583 nsec, systime delta -3591 +playback: systime: 682084512 nsec, audio time 682088291 nsec, systime delta -3779 +playback: systime: 852936229 nsec, audio time 852940916 nsec, systime delta -4687 +playback: systime: 938107562 nsec, audio time 938112708 nsec, systime delta -5146 + +Example 1 shows that the timestamp at the DMA level is close to 1ms +ahead of the actual playback time (as a side time this sort of +measurement can help define rewind safeguards). Compensating for the +DMA-link delay in example 2 helps remove the hardware buffering abut +the information is still very jittery, with up to one sample of +error. In example 3 where the timestamps are measured with the link +wallclock, the timestamps show a monotonic behavior and a lower +dispersion. + +Example 3 and 4 are with USB audio class. Example 3 shows a high +offset between audio time and system time due to buffering. Example 4 +shows how compensating for the delay exposes a 1ms accuracy (due to +the use of the frame counter by the driver) + +Example 3: DMA timestamp, no compensation for delay, delta of ~5ms +$ ./audio_time -p -Dhw:1 -t1 +playback: systime: 120174019 nsec, audio time 125000000 nsec, systime delta -4825981 +playback: systime: 245041136 nsec, audio time 250000000 nsec, systime delta -4958864 +playback: systime: 370106088 nsec, audio time 375000000 nsec, systime delta -4893912 +playback: systime: 495040065 nsec, audio time 500000000 nsec, systime delta -4959935 +playback: systime: 620038179 nsec, audio time 625000000 nsec, systime delta -4961821 +playback: systime: 745087741 nsec, audio time 750000000 nsec, systime delta -4912259 +playback: systime: 870037336 nsec, audio time 875000000 nsec, systime delta -4962664 + +Example 4: DMA timestamp, compensation for delay, delay of ~1ms +$ ./audio_time -p -Dhw:1 -t1 -d +playback: systime: 120190520 nsec, audio time 120000000 nsec, systime delta 190520 +playback: systime: 245036740 nsec, audio time 244000000 nsec, systime delta 1036740 +playback: systime: 370034081 nsec, audio time 369000000 nsec, systime delta 1034081 +playback: systime: 495159907 nsec, audio time 494000000 nsec, systime delta 1159907 +playback: systime: 620098824 nsec, audio time 619000000 nsec, systime delta 1098824 +playback: systime: 745031847 nsec, audio time 744000000 nsec, systime delta 1031847 diff --git a/Documentation/spi/spidev_test.c b/Documentation/spi/spidev_test.c index 94f574b..135b3f5 100644 --- a/Documentation/spi/spidev_test.c +++ b/Documentation/spi/spidev_test.c @@ -80,7 +80,7 @@ static void hex_dump(const void *src, size_t length, size_t line_size, char *pre * Unescape - process hexadecimal escape character * converts shell input "\x23" -> 0x23 */ -int unespcape(char *_dst, char *_src, size_t len) +static int unescape(char *_dst, char *_src, size_t len) { int ret = 0; char *src = _src; @@ -304,7 +304,7 @@ int main(int argc, char *argv[]) size = strlen(input_tx+1); tx = malloc(size); rx = malloc(size); - size = unespcape((char *)tx, input_tx, size); + size = unescape((char *)tx, input_tx, size); transfer(fd, tx, rx, size); free(rx); free(tx); diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt index 99d7eb3..c831001 100644 --- a/Documentation/sysctl/kernel.txt +++ b/Documentation/sysctl/kernel.txt @@ -872,6 +872,27 @@ can be ORed together: ============================================================== +threads-max + +This value controls the maximum number of threads that can be created +using fork(). + +During initialization the kernel sets this value such that even if the +maximum number of threads is created, the thread structures occupy only +a part (1/8th) of the available RAM pages. + +The minimum value that can be written to threads-max is 20. +The maximum value that can be written to threads-max is given by the +constant FUTEX_TID_MASK (0x3fffffff). +If a value outside of this range is written to threads-max an error +EINVAL occurs. + +The value written is checked against the available RAM pages. If the +thread structures would occupy too much (more than 1/8th) of the +available RAM pages threads-max is reduced accordingly. + +============================================================== + unknown_nmi_panic: The value in this file affects behavior of handling NMI. When the diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt index 902b457..9832ec5 100644 --- a/Documentation/sysctl/vm.txt +++ b/Documentation/sysctl/vm.txt @@ -21,6 +21,7 @@ Currently, these files are in /proc/sys/vm: - admin_reserve_kbytes - block_dump - compact_memory +- compact_unevictable_allowed - dirty_background_bytes - dirty_background_ratio - dirty_bytes @@ -106,6 +107,16 @@ huge pages although processes will also directly compact memory as required. ============================================================== +compact_unevictable_allowed + +Available only when CONFIG_COMPACTION is set. When set to 1, compaction is +allowed to examine the unevictable lru (mlocked pages) for pages to compact. +This should be used on systems where stalls for minor page faults are an +acceptable trade for large contiguous free memory. Set to 0 to prevent +compaction from moving pages that are unevictable. Default value is 1. + +============================================================== + dirty_background_bytes Contains the amount of dirty memory at which the background kernel diff --git a/Documentation/vm/hugetlbpage.txt b/Documentation/vm/hugetlbpage.txt index f2d3a10..030977f 100644 --- a/Documentation/vm/hugetlbpage.txt +++ b/Documentation/vm/hugetlbpage.txt @@ -267,21 +267,34 @@ call, then it is required that system administrator mount a file system of type hugetlbfs: mount -t hugetlbfs \ - -o uid=<value>,gid=<value>,mode=<value>,size=<value>,nr_inodes=<value> \ - none /mnt/huge + -o uid=<value>,gid=<value>,mode=<value>,pagesize=<value>,size=<value>,\ + min_size=<value>,nr_inodes=<value> none /mnt/huge This command mounts a (pseudo) filesystem of type hugetlbfs on the directory /mnt/huge. Any files created on /mnt/huge uses huge pages. The uid and gid options sets the owner and group of the root of the file system. By default the uid and gid of the current process are taken. The mode option sets the mode of root of file system to value & 01777. This value is given in octal. -By default the value 0755 is picked. The size option sets the maximum value of -memory (huge pages) allowed for that filesystem (/mnt/huge). The size is -rounded down to HPAGE_SIZE. The option nr_inodes sets the maximum number of -inodes that /mnt/huge can use. If the size or nr_inodes option is not -provided on command line then no limits are set. For size and nr_inodes -options, you can use [G|g]/[M|m]/[K|k] to represent giga/mega/kilo. For -example, size=2K has the same meaning as size=2048. +By default the value 0755 is picked. If the paltform supports multiple huge +page sizes, the pagesize option can be used to specify the huge page size and +associated pool. pagesize is specified in bytes. If pagesize is not specified +the paltform's default huge page size and associated pool will be used. The +size option sets the maximum value of memory (huge pages) allowed for that +filesystem (/mnt/huge). The size option can be specified in bytes, or as a +percentage of the specified huge page pool (nr_hugepages). The size is +rounded down to HPAGE_SIZE boundary. The min_size option sets the minimum +value of memory (huge pages) allowed for the filesystem. min_size can be +specified in the same way as size, either bytes or a percentage of the +huge page pool. At mount time, the number of huge pages specified by +min_size are reserved for use by the filesystem. If there are not enough +free huge pages available, the mount will fail. As huge pages are allocated +to the filesystem and freed, the reserve count is adjusted so that the sum +of allocated and reserved huge pages is always at least min_size. The option +nr_inodes sets the maximum number of inodes that /mnt/huge can use. If the +size, min_size or nr_inodes option is not provided on command line then +no limits are set. For pagesize, size, min_size and nr_inodes options, you +can use [G|g]/[M|m]/[K|k] to represent giga/mega/kilo. For example, size=2K +has the same meaning as size=2048. While read system calls are supported on files that reside on hugetlb file systems, write system calls are not. @@ -289,15 +302,23 @@ file systems, write system calls are not. Regular chown, chgrp, and chmod commands (with right permissions) could be used to change the file attributes on hugetlbfs. -Also, it is important to note that no such mount command is required if the +Also, it is important to note that no such mount command is required if applications are going to use only shmat/shmget system calls or mmap with -MAP_HUGETLB. Users who wish to use hugetlb page via shared memory segment -should be a member of a supplementary group and system admin needs to -configure that gid into /proc/sys/vm/hugetlb_shm_group. It is possible for -same or different applications to use any combination of mmaps and shm* -calls, though the mount of filesystem will be required for using mmap calls -without MAP_HUGETLB. For an example of how to use mmap with MAP_HUGETLB see -map_hugetlb.c. +MAP_HUGETLB. For an example of how to use mmap with MAP_HUGETLB see map_hugetlb +below. + +Users who wish to use hugetlb memory via shared memory segment should be a +member of a supplementary group and system admin needs to configure that gid +into /proc/sys/vm/hugetlb_shm_group. It is possible for same or different +applications to use any combination of mmaps and shm* calls, though the mount of +filesystem will be required for using mmap calls without MAP_HUGETLB. + +Syscalls that operate on memory backed by hugetlb pages only have their lengths +aligned to the native page size of the processor; they will normally fail with +errno set to EINVAL or exclude hugetlb pages that extend beyond the length if +not hugepage aligned. For example, munmap(2) will fail if memory is backed by +a hugetlb page and the length is smaller than the hugepage size. + Examples ======== diff --git a/Documentation/vm/unevictable-lru.txt b/Documentation/vm/unevictable-lru.txt index 86cb462..3be0bfc 100644 --- a/Documentation/vm/unevictable-lru.txt +++ b/Documentation/vm/unevictable-lru.txt @@ -22,6 +22,7 @@ CONTENTS - Filtering special vmas. - munlock()/munlockall() system call handling. - Migrating mlocked pages. + - Compacting mlocked pages. - mmap(MAP_LOCKED) system call handling. - munmap()/exit()/exec() system call handling. - try_to_unmap(). @@ -450,6 +451,17 @@ list because of a race between munlock and migration, page migration uses the putback_lru_page() function to add migrated pages back to the LRU. +COMPACTING MLOCKED PAGES +------------------------ + +The unevictable LRU can be scanned for compactable regions and the default +behavior is to do so. /proc/sys/vm/compact_unevictable_allowed controls +this behavior (see Documentation/sysctl/vm.txt). Once scanning of the +unevictable LRU is enabled, the work of compaction is mostly handled by +the page migration code and the same work flow as described in MIGRATING +MLOCKED PAGES will apply. + + mmap(MAP_LOCKED) SYSTEM CALL HANDLING ------------------------------------- diff --git a/Documentation/vm/zsmalloc.txt b/Documentation/vm/zsmalloc.txt new file mode 100644 index 0000000..64ed63c --- /dev/null +++ b/Documentation/vm/zsmalloc.txt @@ -0,0 +1,70 @@ +zsmalloc +-------- + +This allocator is designed for use with zram. Thus, the allocator is +supposed to work well under low memory conditions. In particular, it +never attempts higher order page allocation which is very likely to +fail under memory pressure. On the other hand, if we just use single +(0-order) pages, it would suffer from very high fragmentation -- +any object of size PAGE_SIZE/2 or larger would occupy an entire page. +This was one of the major issues with its predecessor (xvmalloc). + +To overcome these issues, zsmalloc allocates a bunch of 0-order pages +and links them together using various 'struct page' fields. These linked +pages act as a single higher-order page i.e. an object can span 0-order +page boundaries. The code refers to these linked pages as a single entity +called zspage. + +For simplicity, zsmalloc can only allocate objects of size up to PAGE_SIZE +since this satisfies the requirements of all its current users (in the +worst case, page is incompressible and is thus stored "as-is" i.e. in +uncompressed form). For allocation requests larger than this size, failure +is returned (see zs_malloc). + +Additionally, zs_malloc() does not return a dereferenceable pointer. +Instead, it returns an opaque handle (unsigned long) which encodes actual +location of the allocated object. The reason for this indirection is that +zsmalloc does not keep zspages permanently mapped since that would cause +issues on 32-bit systems where the VA region for kernel space mappings +is very small. So, before using the allocating memory, the object has to +be mapped using zs_map_object() to get a usable pointer and subsequently +unmapped using zs_unmap_object(). + +stat +---- + +With CONFIG_ZSMALLOC_STAT, we could see zsmalloc internal information via +/sys/kernel/debug/zsmalloc/<user name>. Here is a sample of stat output: + +# cat /sys/kernel/debug/zsmalloc/zram0/classes + + class size almost_full almost_empty obj_allocated obj_used pages_used pages_per_zspage + .. + .. + 9 176 0 1 186 129 8 4 + 10 192 1 0 2880 2872 135 3 + 11 208 0 1 819 795 42 2 + 12 224 0 1 219 159 12 4 + .. + .. + + +class: index +size: object size zspage stores +almost_empty: the number of ZS_ALMOST_EMPTY zspages(see below) +almost_full: the number of ZS_ALMOST_FULL zspages(see below) +obj_allocated: the number of objects allocated +obj_used: the number of objects allocated to the user +pages_used: the number of pages allocated for the class +pages_per_zspage: the number of 0-order pages to make a zspage + +We assign a zspage to ZS_ALMOST_EMPTY fullness group when: + n <= N / f, where +n = number of allocated objects +N = total number of objects zspage can store +f = fullness_threshold_frac(ie, 4 at the moment) + +Similarly, we assign zspage to: + ZS_ALMOST_FULL when n > N / f + ZS_EMPTY when n == 0 + ZS_FULL when n == N |