diff options
Diffstat (limited to 'Documentation')
21 files changed, 190 insertions, 152 deletions
diff --git a/Documentation/ABI/testing/sysfs-block-rssd b/Documentation/ABI/testing/sysfs-block-rssd index 679ce35..beef30c 100644 --- a/Documentation/ABI/testing/sysfs-block-rssd +++ b/Documentation/ABI/testing/sysfs-block-rssd @@ -1,26 +1,5 @@ -What: /sys/block/rssd*/registers -Date: March 2012 -KernelVersion: 3.3 -Contact: Asai Thambi S P <asamymuthupa@micron.com> -Description: This is a read-only file. Dumps below driver information and - hardware registers. - - S ACTive - - Command Issue - - Completed - - PORT IRQ STAT - - HOST IRQ STAT - - Allocated - - Commands in Q - What: /sys/block/rssd*/status Date: April 2012 KernelVersion: 3.4 Contact: Asai Thambi S P <asamymuthupa@micron.com> Description: This is a read-only file. Indicates the status of the device. - -What: /sys/block/rssd*/flags -Date: May 2012 -KernelVersion: 3.5 -Contact: Asai Thambi S P <asamymuthupa@micron.com> -Description: This is a read-only file. Dumps the flags in port and driver - data structure diff --git a/Documentation/ABI/testing/sysfs-class-mtd b/Documentation/ABI/testing/sysfs-class-mtd index db1ad7e..938ef71 100644 --- a/Documentation/ABI/testing/sysfs-class-mtd +++ b/Documentation/ABI/testing/sysfs-class-mtd @@ -142,13 +142,14 @@ KernelVersion: 3.4 Contact: linux-mtd@lists.infradead.org Description: This allows the user to examine and adjust the criteria by which - mtd returns -EUCLEAN from mtd_read(). If the maximum number of - bit errors that were corrected on any single region comprising - an ecc step (as reported by the driver) equals or exceeds this - value, -EUCLEAN is returned. Otherwise, absent an error, 0 is - returned. Higher layers (e.g., UBI) use this return code as an - indication that an erase block may be degrading and should be - scrutinized as a candidate for being marked as bad. + mtd returns -EUCLEAN from mtd_read() and mtd_read_oob(). If the + maximum number of bit errors that were corrected on any single + region comprising an ecc step (as reported by the driver) equals + or exceeds this value, -EUCLEAN is returned. Otherwise, absent + an error, 0 is returned. Higher layers (e.g., UBI) use this + return code as an indication that an erase block may be + degrading and should be scrutinized as a candidate for being + marked as bad. The initial value may be specified by the flash device driver. If not, then the default value is ecc_strength. @@ -167,7 +168,7 @@ Description: block degradation, but high enough to avoid the consequences of a persistent return value of -EUCLEAN on devices where sticky bitflips occur. Note that if bitflip_threshold exceeds - ecc_strength, -EUCLEAN is never returned by mtd_read(). + ecc_strength, -EUCLEAN is never returned by the read operations. Conversely, if bitflip_threshold is zero, -EUCLEAN is always returned, absent a hard error. diff --git a/Documentation/DocBook/media/v4l/controls.xml b/Documentation/DocBook/media/v4l/controls.xml index 676bc46..cda0dfb 100644 --- a/Documentation/DocBook/media/v4l/controls.xml +++ b/Documentation/DocBook/media/v4l/controls.xml @@ -3988,7 +3988,7 @@ interface and may change in the future.</para> from RGB to Y'CbCr color space. </entry> </row> - <row id = "v4l2-jpeg-chroma-subsampling"> + <row> <entrytbl spanname="descr" cols="2"> <tbody valign="top"> <row> diff --git a/Documentation/DocBook/media/v4l/pixfmt.xml b/Documentation/DocBook/media/v4l/pixfmt.xml index f5ac15e..e58934c 100644 --- a/Documentation/DocBook/media/v4l/pixfmt.xml +++ b/Documentation/DocBook/media/v4l/pixfmt.xml @@ -986,13 +986,13 @@ http://www.thedirks.org/winnov/</ulink></para></entry> <row id="V4L2-PIX-FMT-Y4"> <entry><constant>V4L2_PIX_FMT_Y4</constant></entry> <entry>'Y04 '</entry> - <entry>Old 4-bit greyscale format. Only the least significant 4 bits of each byte are used, + <entry>Old 4-bit greyscale format. Only the most significant 4 bits of each byte are used, the other bits are set to 0.</entry> </row> <row id="V4L2-PIX-FMT-Y6"> <entry><constant>V4L2_PIX_FMT_Y6</constant></entry> <entry>'Y06 '</entry> - <entry>Old 6-bit greyscale format. Only the least significant 6 bits of each byte are used, + <entry>Old 6-bit greyscale format. Only the most significant 6 bits of each byte are used, the other bits are set to 0.</entry> </row> </tbody> diff --git a/Documentation/DocBook/media/v4l/v4l2.xml b/Documentation/DocBook/media/v4l/v4l2.xml index 015c561..008c2d73 100644 --- a/Documentation/DocBook/media/v4l/v4l2.xml +++ b/Documentation/DocBook/media/v4l/v4l2.xml @@ -560,6 +560,7 @@ and discussions on the V4L mailing list.</revremark> &sub-g-tuner; &sub-log-status; &sub-overlay; + &sub-prepare-buf; &sub-qbuf; &sub-querybuf; &sub-querycap; @@ -567,7 +568,6 @@ and discussions on the V4L mailing list.</revremark> &sub-query-dv-preset; &sub-query-dv-timings; &sub-querystd; - &sub-prepare-buf; &sub-reqbufs; &sub-s-hw-freq-seek; &sub-streamon; diff --git a/Documentation/DocBook/media/v4l/vidioc-create-bufs.xml b/Documentation/DocBook/media/v4l/vidioc-create-bufs.xml index 765549f..a2474ec 100644 --- a/Documentation/DocBook/media/v4l/vidioc-create-bufs.xml +++ b/Documentation/DocBook/media/v4l/vidioc-create-bufs.xml @@ -108,10 +108,9 @@ information.</para> /></entry> </row> <row> - <entry>__u32</entry> + <entry>struct v4l2_format</entry> <entry><structfield>format</structfield></entry> - <entry>Filled in by the application, preserved by the driver. - See <xref linkend="v4l2-format" />.</entry> + <entry>Filled in by the application, preserved by the driver.</entry> </row> <row> <entry>__u32</entry> diff --git a/Documentation/DocBook/media/v4l/vidioc-dqevent.xml b/Documentation/DocBook/media/v4l/vidioc-dqevent.xml index e8714aa..98a856f 100644 --- a/Documentation/DocBook/media/v4l/vidioc-dqevent.xml +++ b/Documentation/DocBook/media/v4l/vidioc-dqevent.xml @@ -89,7 +89,7 @@ <row> <entry></entry> <entry>&v4l2-event-frame-sync;</entry> - <entry><structfield>frame</structfield></entry> + <entry><structfield>frame_sync</structfield></entry> <entry>Event data for event V4L2_EVENT_FRAME_SYNC.</entry> </row> <row> diff --git a/Documentation/DocBook/media/v4l/vidioc-g-ext-ctrls.xml b/Documentation/DocBook/media/v4l/vidioc-g-ext-ctrls.xml index e3d5afcd..0a4b90f 100644 --- a/Documentation/DocBook/media/v4l/vidioc-g-ext-ctrls.xml +++ b/Documentation/DocBook/media/v4l/vidioc-g-ext-ctrls.xml @@ -284,13 +284,6 @@ These controls are described in <xref processing controls. These controls are described in <xref linkend="image-process-controls" />.</entry> </row> - <row> - <entry><constant>V4L2_CTRL_CLASS_JPEG</constant></entry> - <entry>0x9d0000</entry> - <entry>The class containing JPEG compression controls. -These controls are described in <xref - linkend="jpeg-controls" />.</entry> - </row> </tbody> </tgroup> </table> diff --git a/Documentation/device-mapper/verity.txt b/Documentation/device-mapper/verity.txt index 32e4879..9884681 100644 --- a/Documentation/device-mapper/verity.txt +++ b/Documentation/device-mapper/verity.txt @@ -7,39 +7,39 @@ This target is read-only. Construction Parameters ======================= - <version> <dev> <hash_dev> <hash_start> + <version> <dev> <hash_dev> <data_block_size> <hash_block_size> <num_data_blocks> <hash_start_block> <algorithm> <digest> <salt> <version> - This is the version number of the on-disk format. + This is the type of the on-disk hash format. 0 is the original format used in the Chromium OS. - The salt is appended when hashing, digests are stored continuously and - the rest of the block is padded with zeros. + The salt is appended when hashing, digests are stored continuously and + the rest of the block is padded with zeros. 1 is the current format that should be used for new devices. - The salt is prepended when hashing and each digest is - padded with zeros to the power of two. + The salt is prepended when hashing and each digest is + padded with zeros to the power of two. <dev> - This is the device containing the data the integrity of which needs to be + This is the device containing data, the integrity of which needs to be checked. It may be specified as a path, like /dev/sdaX, or a device number, <major>:<minor>. <hash_dev> - This is the device that that supplies the hash tree data. It may be + This is the device that supplies the hash tree data. It may be specified similarly to the device path and may be the same device. If the - same device is used, the hash_start should be outside of the dm-verity - configured device size. + same device is used, the hash_start should be outside the configured + dm-verity device. <data_block_size> - The block size on a data device. Each block corresponds to one digest on - the hash device. + The block size on a data device in bytes. + Each block corresponds to one digest on the hash device. <hash_block_size> - The size of a hash block. + The size of a hash block in bytes. <num_data_blocks> The number of data blocks on the data device. Additional blocks are @@ -65,7 +65,7 @@ Construction Parameters Theory of operation =================== -dm-verity is meant to be setup as part of a verified boot path. This +dm-verity is meant to be set up as part of a verified boot path. This may be anything ranging from a boot using tboot or trustedgrub to just booting from a known-good device (like a USB drive or CD). @@ -73,20 +73,20 @@ When a dm-verity device is configured, it is expected that the caller has been authenticated in some way (cryptographic signatures, etc). After instantiation, all hashes will be verified on-demand during disk access. If they cannot be verified up to the root node of the -tree, the root hash, then the I/O will fail. This should identify +tree, the root hash, then the I/O will fail. This should detect tampering with any data on the device and the hash data. Cryptographic hashes are used to assert the integrity of the device on a -per-block basis. This allows for a lightweight hash computation on first read -into the page cache. Block hashes are stored linearly-aligned to the nearest -block the size of a page. +per-block basis. This allows for a lightweight hash computation on first read +into the page cache. Block hashes are stored linearly, aligned to the nearest +block size. Hash Tree --------- Each node in the tree is a cryptographic hash. If it is a leaf node, the hash -is of some block data on disk. If it is an intermediary node, then the hash is -of a number of child nodes. +of some data block on disk is calculated. If it is an intermediary node, +the hash of a number of child nodes is calculated. Each entry in the tree is a collection of neighboring nodes that fit in one block. The number is determined based on block_size and the size of the @@ -110,63 +110,23 @@ alg = sha256, num_blocks = 32768, block_size = 4096 On-disk format ============== -Below is the recommended on-disk format. The verity kernel code does not -read the on-disk header. It only reads the hash blocks which directly -follow the header. It is expected that a user-space tool will verify the -integrity of the verity_header and then call dmsetup with the correct -parameters. Alternatively, the header can be omitted and the dmsetup -parameters can be passed via the kernel command-line in a rooted chain -of trust where the command-line is verified. +The verity kernel code does not read the verity metadata on-disk header. +It only reads the hash blocks which directly follow the header. +It is expected that a user-space tool will verify the integrity of the +verity header. -The on-disk format is especially useful in cases where the hash blocks -are on a separate partition. The magic number allows easy identification -of the partition contents. Alternatively, the hash blocks can be stored -in the same partition as the data to be verified. In such a configuration -the filesystem on the partition would be sized a little smaller than -the full-partition, leaving room for the hash blocks. - -struct superblock { - uint8_t signature[8] - "verity\0\0"; - - uint8_t version; - 1 - current format - - uint8_t data_block_bits; - log2(data block size) - - uint8_t hash_block_bits; - log2(hash block size) - - uint8_t pad1[1]; - zero padding - - uint16_t salt_size; - big-endian salt size - - uint8_t pad2[2]; - zero padding - - uint32_t data_blocks_hi; - big-endian high 32 bits of the 64-bit number of data blocks - - uint32_t data_blocks_lo; - big-endian low 32 bits of the 64-bit number of data blocks - - uint8_t algorithm[16]; - cryptographic algorithm - - uint8_t salt[384]; - salt (the salt size is specified above) - - uint8_t pad3[88]; - zero padding to 512-byte boundary -} +Alternatively, the header can be omitted and the dmsetup parameters can +be passed via the kernel command-line in a rooted chain of trust where +the command-line is verified. Directly following the header (and with sector number padded to the next hash block boundary) are the hash blocks which are stored a depth at a time (starting from the root), sorted in order of increasing index. +The full specification of kernel parameters and on-disk metadata format +is available at the cryptsetup project's wiki page + http://code.google.com/p/cryptsetup/wiki/DMVerity + Status ====== V (for Valid) is returned if every check performed so far was valid. @@ -174,21 +134,22 @@ If any check failed, C (for Corruption) is returned. Example ======= - -Setup a device: - dmsetup create vroot --table \ - "0 2097152 "\ - "verity 1 /dev/sda1 /dev/sda2 4096 4096 2097152 1 "\ +Set up a device: + # dmsetup create vroot --readonly --table \ + "0 2097152 verity 1 /dev/sda1 /dev/sda2 4096 4096 262144 1 sha256 "\ "4392712ba01368efdf14b05c76f9e4df0d53664630b5d48632ed17a137f39076 "\ "1234000000000000000000000000000000000000000000000000000000000000" A command line tool veritysetup is available to compute or verify -the hash tree or activate the kernel driver. This is available from -the LVM2 upstream repository and may be supplied as a package called -device-mapper-verity-tools: - git://sources.redhat.com/git/lvm2 - http://sourceware.org/git/?p=lvm2.git - http://sourceware.org/cgi-bin/cvsweb.cgi/LVM2/verity?cvsroot=lvm2 - -veritysetup -a vroot /dev/sda1 /dev/sda2 \ - 4392712ba01368efdf14b05c76f9e4df0d53664630b5d48632ed17a137f39076 +the hash tree or activate the kernel device. This is available from +the cryptsetup upstream repository http://code.google.com/p/cryptsetup/ +(as a libcryptsetup extension). + +Create hash on the device: + # veritysetup format /dev/sda1 /dev/sda2 + ... + Root hash: 4392712ba01368efdf14b05c76f9e4df0d53664630b5d48632ed17a137f39076 + +Activate the device: + # veritysetup create vroot /dev/sda1 /dev/sda2 \ + 4392712ba01368efdf14b05c76f9e4df0d53664630b5d48632ed17a137f39076 diff --git a/Documentation/devicetree/bindings/input/fsl-mma8450.txt b/Documentation/devicetree/bindings/input/fsl-mma8450.txt index a00c94c..0b96e57 100644 --- a/Documentation/devicetree/bindings/input/fsl-mma8450.txt +++ b/Documentation/devicetree/bindings/input/fsl-mma8450.txt @@ -2,6 +2,7 @@ Required properties: - compatible : "fsl,mma8450". +- reg: the I2C address of MMA8450 Example: diff --git a/Documentation/devicetree/bindings/mfd/mc13xxx.txt b/Documentation/devicetree/bindings/mfd/mc13xxx.txt index 19f6af4..baf0798 100644 --- a/Documentation/devicetree/bindings/mfd/mc13xxx.txt +++ b/Documentation/devicetree/bindings/mfd/mc13xxx.txt @@ -46,8 +46,8 @@ Examples: ecspi@70010000 { /* ECSPI1 */ fsl,spi-num-chipselects = <2>; - cs-gpios = <&gpio3 24 0>, /* GPIO4_24 */ - <&gpio3 25 0>; /* GPIO4_25 */ + cs-gpios = <&gpio4 24 0>, /* GPIO4_24 */ + <&gpio4 25 0>; /* GPIO4_25 */ status = "okay"; pmic: mc13892@0 { diff --git a/Documentation/devicetree/bindings/mmc/fsl-imx-esdhc.txt b/Documentation/devicetree/bindings/mmc/fsl-imx-esdhc.txt index c7e404b..fea541e 100644 --- a/Documentation/devicetree/bindings/mmc/fsl-imx-esdhc.txt +++ b/Documentation/devicetree/bindings/mmc/fsl-imx-esdhc.txt @@ -29,6 +29,6 @@ esdhc@70008000 { compatible = "fsl,imx51-esdhc"; reg = <0x70008000 0x4000>; interrupts = <2>; - cd-gpios = <&gpio0 6 0>; /* GPIO1_6 */ - wp-gpios = <&gpio0 5 0>; /* GPIO1_5 */ + cd-gpios = <&gpio1 6 0>; /* GPIO1_6 */ + wp-gpios = <&gpio1 5 0>; /* GPIO1_5 */ }; diff --git a/Documentation/devicetree/bindings/net/fsl-fec.txt b/Documentation/devicetree/bindings/net/fsl-fec.txt index 7ab9e1a..4616fc2 100644 --- a/Documentation/devicetree/bindings/net/fsl-fec.txt +++ b/Documentation/devicetree/bindings/net/fsl-fec.txt @@ -19,6 +19,6 @@ ethernet@83fec000 { reg = <0x83fec000 0x4000>; interrupts = <87>; phy-mode = "mii"; - phy-reset-gpios = <&gpio1 14 0>; /* GPIO2_14 */ + phy-reset-gpios = <&gpio2 14 0>; /* GPIO2_14 */ local-mac-address = [00 04 9F 01 1B B9]; }; diff --git a/Documentation/devicetree/bindings/spi/fsl-imx-cspi.txt b/Documentation/devicetree/bindings/spi/fsl-imx-cspi.txt index 9841057..4256a6d 100644 --- a/Documentation/devicetree/bindings/spi/fsl-imx-cspi.txt +++ b/Documentation/devicetree/bindings/spi/fsl-imx-cspi.txt @@ -17,6 +17,6 @@ ecspi@70010000 { reg = <0x70010000 0x4000>; interrupts = <36>; fsl,spi-num-chipselects = <2>; - cs-gpios = <&gpio3 24 0>, /* GPIO4_24 */ - <&gpio3 25 0>; /* GPIO4_25 */ + cs-gpios = <&gpio3 24 0>, /* GPIO3_24 */ + <&gpio3 25 0>; /* GPIO3_25 */ }; diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt b/Documentation/devicetree/bindings/vendor-prefixes.txt index 6eab917..db4d3af 100644 --- a/Documentation/devicetree/bindings/vendor-prefixes.txt +++ b/Documentation/devicetree/bindings/vendor-prefixes.txt @@ -3,6 +3,7 @@ Device tree binding vendor prefix registry. Keep list in alphabetical order. This isn't an exhaustive list, but you should add new prefixes to it before using them to avoid name-space collisions. +ad Avionic Design GmbH adi Analog Devices, Inc. amcc Applied Micro Circuits Corporation (APM, formally AMCC) apm Applied Micro Circuits Corporation (APM) diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking index 8e2da1e..e0cce2a 100644 --- a/Documentation/filesystems/Locking +++ b/Documentation/filesystems/Locking @@ -9,7 +9,7 @@ be able to use diff(1). --------------------------- dentry_operations -------------------------- prototypes: - int (*d_revalidate)(struct dentry *, struct nameidata *); + int (*d_revalidate)(struct dentry *, unsigned int); int (*d_hash)(const struct dentry *, const struct inode *, struct qstr *); int (*d_compare)(const struct dentry *, const struct inode *, @@ -37,9 +37,8 @@ d_manage: no no yes (ref-walk) maybe --------------------------- inode_operations --------------------------- prototypes: - int (*create) (struct inode *,struct dentry *,umode_t, struct nameidata *); - struct dentry * (*lookup) (struct inode *,struct dentry *, struct nameid -ata *); + int (*create) (struct inode *,struct dentry *,umode_t, bool); + struct dentry * (*lookup) (struct inode *,struct dentry *, unsigned int); int (*link) (struct dentry *,struct inode *,struct dentry *); int (*unlink) (struct inode *,struct dentry *); int (*symlink) (struct inode *,struct dentry *,const char *); @@ -62,6 +61,9 @@ ata *); int (*removexattr) (struct dentry *, const char *); int (*fiemap)(struct inode *, struct fiemap_extent_info *, u64 start, u64 len); void (*update_time)(struct inode *, struct timespec *, int); + int (*atomic_open)(struct inode *, struct dentry *, + struct file *, unsigned open_flag, + umode_t create_mode, int *opened); locking rules: all may block @@ -89,6 +91,7 @@ listxattr: no removexattr: yes fiemap: no update_time: no +atomic_open: yes Additionally, ->rmdir(), ->unlink() and ->rename() have ->i_mutex on victim. diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting index 8c91d10..2bef2b3 100644 --- a/Documentation/filesystems/porting +++ b/Documentation/filesystems/porting @@ -355,12 +355,10 @@ protects *all* the dcache state of a given dentry. via rcu-walk path walk (basically, if the file can have had a path name in the vfs namespace). - i_dentry and i_rcu share storage in a union, and the vfs expects -i_dentry to be reinitialized before it is freed, so an: - - INIT_LIST_HEAD(&inode->i_dentry); - -must be done in the RCU callback. + Even though i_dentry and i_rcu share storage in a union, we will +initialize the former in inode_init_always(), so just leave it alone in +the callback. It used to be necessary to clean it there, but not anymore +(starting at 3.2). -- [recommended] @@ -433,3 +431,14 @@ release it yourself. d_alloc_root() is gone, along with a lot of bugs caused by code misusing it. Replacement: d_make_root(inode). The difference is, d_make_root() drops the reference to inode if dentry allocation fails. + +-- +[mandatory] + The witch is dead! Well, 2/3 of it, anyway. ->d_revalidate() and +->lookup() do *not* take struct nameidata anymore; just the flags. +-- +[mandatory] + ->create() doesn't take struct nameidata *; unlike the previous +two, it gets "is it an O_EXCL or equivalent?" boolean argument. Note that +local filesystems can ignore tha argument - they are guaranteed that the +object doesn't exist. It's remote/distributed ones that might care... diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt index efd23f4..aa754e0 100644 --- a/Documentation/filesystems/vfs.txt +++ b/Documentation/filesystems/vfs.txt @@ -341,8 +341,8 @@ This describes how the VFS can manipulate an inode in your filesystem. As of kernel 2.6.22, the following members are defined: struct inode_operations { - int (*create) (struct inode *,struct dentry *, umode_t, struct nameidata *); - struct dentry * (*lookup) (struct inode *,struct dentry *, struct nameidata *); + int (*create) (struct inode *,struct dentry *, umode_t, bool); + struct dentry * (*lookup) (struct inode *,struct dentry *, unsigned int); int (*link) (struct dentry *,struct inode *,struct dentry *); int (*unlink) (struct inode *,struct dentry *); int (*symlink) (struct inode *,struct dentry *,const char *); @@ -364,6 +364,9 @@ struct inode_operations { ssize_t (*listxattr) (struct dentry *, char *, size_t); int (*removexattr) (struct dentry *, const char *); void (*update_time)(struct inode *, struct timespec *, int); + int (*atomic_open)(struct inode *, struct dentry *, + struct file *, unsigned open_flag, + umode_t create_mode, int *opened); }; Again, all methods are called without any locks being held, unless @@ -476,6 +479,14 @@ otherwise noted. an inode. If this is not defined the VFS will update the inode itself and call mark_inode_dirty_sync. + atomic_open: called on the last component of an open. Using this optional + method the filesystem can look up, possibly create and open the file in + one atomic operation. If it cannot perform this (e.g. the file type + turned out to be wrong) it may signal this by returning 1 instead of + usual 0 or -ve . This method is only called if the last + component is negative or needs lookup. Cached positive dentries are + still handled by f_op->open(). + The Address Space Object ======================== @@ -891,7 +902,7 @@ the VFS uses a default. As of kernel 2.6.22, the following members are defined: struct dentry_operations { - int (*d_revalidate)(struct dentry *, struct nameidata *); + int (*d_revalidate)(struct dentry *, unsigned int); int (*d_hash)(const struct dentry *, const struct inode *, struct qstr *); int (*d_compare)(const struct dentry *, const struct inode *, @@ -910,11 +921,11 @@ struct dentry_operations { dcache. Most filesystems leave this as NULL, because all their dentries in the dcache are valid - d_revalidate may be called in rcu-walk mode (nd->flags & LOOKUP_RCU). + d_revalidate may be called in rcu-walk mode (flags & LOOKUP_RCU). If in rcu-walk mode, the filesystem must revalidate the dentry without blocking or storing to the dentry, d_parent and d_inode should not be - used without care (because they can go NULL), instead nd->inode should - be used. + used without care (because they can change and, in d_inode case, even + become NULL under us). If a situation is encountered that rcu-walk cannot handle, return -ECHILD and it will be called again in ref-walk mode. diff --git a/Documentation/prctl/no_new_privs.txt b/Documentation/prctl/no_new_privs.txt new file mode 100644 index 0000000..f7be84f --- /dev/null +++ b/Documentation/prctl/no_new_privs.txt @@ -0,0 +1,57 @@ +The execve system call can grant a newly-started program privileges that +its parent did not have. The most obvious examples are setuid/setgid +programs and file capabilities. To prevent the parent program from +gaining these privileges as well, the kernel and user code must be +careful to prevent the parent from doing anything that could subvert the +child. For example: + + - The dynamic loader handles LD_* environment variables differently if + a program is setuid. + + - chroot is disallowed to unprivileged processes, since it would allow + /etc/passwd to be replaced from the point of view of a process that + inherited chroot. + + - The exec code has special handling for ptrace. + +These are all ad-hoc fixes. The no_new_privs bit (since Linux 3.5) is a +new, generic mechanism to make it safe for a process to modify its +execution environment in a manner that persists across execve. Any task +can set no_new_privs. Once the bit is set, it is inherited across fork, +clone, and execve and cannot be unset. With no_new_privs set, execve +promises not to grant the privilege to do anything that could not have +been done without the execve call. For example, the setuid and setgid +bits will no longer change the uid or gid; file capabilities will not +add to the permitted set, and LSMs will not relax constraints after +execve. + +To set no_new_privs, use prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0). + +Be careful, though: LSMs might also not tighten constraints on exec +in no_new_privs mode. (This means that setting up a general-purpose +service launcher to set no_new_privs before execing daemons may +interfere with LSM-based sandboxing.) + +Note that no_new_privs does not prevent privilege changes that do not +involve execve. An appropriately privileged task can still call +setuid(2) and receive SCM_RIGHTS datagrams. + +There are two main use cases for no_new_privs so far: + + - Filters installed for the seccomp mode 2 sandbox persist across + execve and can change the behavior of newly-executed programs. + Unprivileged users are therefore only allowed to install such filters + if no_new_privs is set. + + - By itself, no_new_privs can be used to reduce the attack surface + available to an unprivileged user. If everything running with a + given uid has no_new_privs set, then that uid will be unable to + escalate its privileges by directly attacking setuid, setgid, and + fcap-using binaries; it will need to compromise something without the + no_new_privs bit set first. + +In the future, other potentially dangerous kernel features could become +available to unprivileged tasks if no_new_privs is set. In principle, +several options to unshare(2) and clone(2) would be safe when +no_new_privs is set, and no_new_privs + chroot is considerable less +dangerous than chroot by itself. diff --git a/Documentation/stable_kernel_rules.txt b/Documentation/stable_kernel_rules.txt index f0ab5cf..4a7b54b 100644 --- a/Documentation/stable_kernel_rules.txt +++ b/Documentation/stable_kernel_rules.txt @@ -12,6 +12,12 @@ Rules on what kind of patches are accepted, and which ones are not, into the marked CONFIG_BROKEN), an oops, a hang, data corruption, a real security issue, or some "oh, that's not good" issue. In short, something critical. + - Serious issues as reported by a user of a distribution kernel may also + be considered if they fix a notable performance or interactivity issue. + As these fixes are not as obvious and have a higher risk of a subtle + regression they should only be submitted by a distribution kernel + maintainer and include an addendum linking to a bugzilla entry if it + exists and additional information on the user-visible impact. - New device IDs and quirks are also accepted. - No "theoretical race condition" issues, unless an explanation of how the race can be exploited is also provided. diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 9301266..2c99483 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -1930,6 +1930,23 @@ The "pte_enc" field provides a value that can OR'ed into the hash PTE's RPN field (ie, it needs to be shifted left by 12 to OR it into the hash PTE second double word). +4.75 KVM_IRQFD + +Capability: KVM_CAP_IRQFD +Architectures: x86 +Type: vm ioctl +Parameters: struct kvm_irqfd (in) +Returns: 0 on success, -1 on error + +Allows setting an eventfd to directly trigger a guest interrupt. +kvm_irqfd.fd specifies the file descriptor to use as the eventfd and +kvm_irqfd.gsi specifies the irqchip pin toggled by this event. When +an event is tiggered on the eventfd, an interrupt is injected into +the guest using the specified gsi pin. The irqfd is removed using +the KVM_IRQFD_FLAG_DEASSIGN flag, specifying both kvm_irqfd.fd +and kvm_irqfd.gsi. + + 5. The kvm_run structure ------------------------ |