summaryrefslogtreecommitdiffstats
path: root/Documentation/filesystems
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/filesystems')
-rw-r--r--Documentation/filesystems/9p.txt29
-rw-r--r--Documentation/filesystems/Locking12
-rw-r--r--Documentation/filesystems/adfs.txt18
-rw-r--r--Documentation/filesystems/autofs4-mount-control.txt2
-rw-r--r--Documentation/filesystems/caching/netfs-api.txt18
-rw-r--r--Documentation/filesystems/configfs/configfs.txt2
-rw-r--r--Documentation/filesystems/configfs/configfs_example_explicit.c6
-rw-r--r--Documentation/filesystems/configfs/configfs_example_macros.c6
-rw-r--r--Documentation/filesystems/exofs.txt10
-rw-r--r--Documentation/filesystems/ext4.txt215
-rw-r--r--Documentation/filesystems/gfs2-uevents.txt2
-rw-r--r--Documentation/filesystems/gfs2.txt2
-rw-r--r--Documentation/filesystems/nfs/idmapper.txt4
-rw-r--r--Documentation/filesystems/nfs/pnfs.txt7
-rw-r--r--Documentation/filesystems/ntfs.txt2
-rw-r--r--Documentation/filesystems/ocfs2.txt10
-rw-r--r--Documentation/filesystems/path-lookup.txt4
-rw-r--r--Documentation/filesystems/pohmelfs/network_protocol.txt2
-rw-r--r--Documentation/filesystems/porting23
-rw-r--r--Documentation/filesystems/proc.txt18
-rw-r--r--Documentation/filesystems/romfs.txt3
-rw-r--r--Documentation/filesystems/squashfs.txt30
-rw-r--r--Documentation/filesystems/sysfs.txt18
-rw-r--r--Documentation/filesystems/ubifs.txt30
-rw-r--r--Documentation/filesystems/vfs.txt66
-rw-r--r--Documentation/filesystems/xfs-delayed-logging-design.txt15
-rw-r--r--Documentation/filesystems/xfs.txt6
27 files changed, 398 insertions, 162 deletions
diff --git a/Documentation/filesystems/9p.txt b/Documentation/filesystems/9p.txt
index b22abba..13de64c 100644
--- a/Documentation/filesystems/9p.txt
+++ b/Documentation/filesystems/9p.txt
@@ -25,6 +25,8 @@ Other applications are described in the following papers:
http://xcpu.org/papers/cellfs-talk.pdf
* PROSE I/O: Using 9p to enable Application Partitions
http://plan9.escet.urjc.es/iwp9/cready/PROSE_iwp9_2006.pdf
+ * VirtFS: A Virtualization Aware File System pass-through
+ http://goo.gl/3WPDg
USAGE
=====
@@ -130,31 +132,20 @@ OPTIONS
RESOURCES
=========
-Our current recommendation is to use Inferno (http://www.vitanuova.com/nferno/index.html)
-as the 9p server. You can start a 9p server under Inferno by issuing the
-following command:
- ; styxlisten -A tcp!*!564 export '#U*'
+Protocol specifications are maintained on github:
+http://ericvh.github.com/9p-rfc/
-The -A specifies an unauthenticated export. The 564 is the port # (you may
-have to choose a higher port number if running as a normal user). The '#U*'
-specifies exporting the root of the Linux name space. You may specify a
-subset of the namespace by extending the path: '#U*'/tmp would just export
-/tmp. For more information, see the Inferno manual pages covering styxlisten
-and export.
+9p client and server implementations are listed on
+http://9p.cat-v.org/implementations
-A Linux version of the 9p server is now maintained under the npfs project
-on sourceforge (http://sourceforge.net/projects/npfs). The currently
-maintained version is the single-threaded version of the server (named spfs)
-available from the same SVN repository.
+A 9p2000.L server is being developed by LLNL and can be found
+at http://code.google.com/p/diod/
There are user and developer mailing lists available through the v9fs project
on sourceforge (http://sourceforge.net/projects/v9fs).
-A stand-alone version of the module (which should build for any 2.6 kernel)
-is available via (http://github.com/ericvh/9p-sac/tree/master)
-
-News and other information is maintained on SWiK (http://swik.net/v9fs)
-and the Wiki (http://sf.net/apps/mediawiki/v9fs/index.php).
+News and other information is maintained on a Wiki.
+(http://sf.net/apps/mediawiki/v9fs/index.php).
Bug reports may be issued through the kernel.org bugzilla
(http://bugzilla.kernel.org)
diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index 4471a41..57d827d6 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -104,7 +104,7 @@ of the locking scheme for directory operations.
prototypes:
struct inode *(*alloc_inode)(struct super_block *sb);
void (*destroy_inode)(struct inode *);
- void (*dirty_inode) (struct inode *);
+ void (*dirty_inode) (struct inode *, int flags);
int (*write_inode) (struct inode *, struct writeback_control *wbc);
int (*drop_inode) (struct inode *);
void (*evict_inode) (struct inode *);
@@ -126,9 +126,9 @@ locking rules:
s_umount
alloc_inode:
destroy_inode:
-dirty_inode: (must not sleep)
+dirty_inode:
write_inode:
-drop_inode: !!!inode_lock!!!
+drop_inode: !!!inode->i_lock!!!
evict_inode:
put_super: write
write_super: read
@@ -166,13 +166,11 @@ prototypes:
void (*kill_sb) (struct super_block *);
locking rules:
may block
-get_sb yes
mount yes
kill_sb yes
-->get_sb() returns error or 0 with locked superblock attached to the vfsmount
-(exclusive on ->s_umount).
-->mount() returns ERR_PTR or the root dentry.
+->mount() returns ERR_PTR or the root dentry; its superblock should be locked
+on return.
->kill_sb() takes a write-locked superblock, does all shutdown work on it,
unlocks and drops the reference.
diff --git a/Documentation/filesystems/adfs.txt b/Documentation/filesystems/adfs.txt
index 9e8811f..5949766 100644
--- a/Documentation/filesystems/adfs.txt
+++ b/Documentation/filesystems/adfs.txt
@@ -9,6 +9,9 @@ Mount options for ADFS
will be nnn. Default 0700.
othmask=nnn The permission mask for ADFS 'other' permissions
will be nnn. Default 0077.
+ ftsuffix=n When ftsuffix=0, no file type suffix will be applied.
+ When ftsuffix=1, a hexadecimal suffix corresponding to
+ the RISC OS file type will be added. Default 0.
Mapping of ADFS permissions to Linux permissions
------------------------------------------------
@@ -55,3 +58,18 @@ Mapping of ADFS permissions to Linux permissions
You can therefore tailor the permission translation to whatever you
desire the permissions should be under Linux.
+
+RISC OS file type suffix
+------------------------
+
+ RISC OS file types are stored in bits 19..8 of the file load address.
+
+ To enable non-RISC OS systems to be used to store files without losing
+ file type information, a file naming convention was devised (initially
+ for use with NFS) such that a hexadecimal suffix of the form ,xyz
+ denoted the file type: e.g. BasicFile,ffb is a BASIC (0xffb) file. This
+ naming convention is now also used by RISC OS emulators such as RPCEmu.
+
+ Mounting an ADFS disc with option ftsuffix=1 will cause appropriate file
+ type suffixes to be appended to file names read from a directory. If the
+ ftsuffix option is zero or omitted, no file type suffixes will be added.
diff --git a/Documentation/filesystems/autofs4-mount-control.txt b/Documentation/filesystems/autofs4-mount-control.txt
index 51986bf..4c95935 100644
--- a/Documentation/filesystems/autofs4-mount-control.txt
+++ b/Documentation/filesystems/autofs4-mount-control.txt
@@ -309,7 +309,7 @@ ioctlfd field set to the descriptor obtained from the open call.
AUTOFS_DEV_IOCTL_TIMEOUT_CMD
----------------------------
-Set the expire timeout for mounts withing an autofs mount point.
+Set the expire timeout for mounts within an autofs mount point.
The call requires an initialized struct autofs_dev_ioctl with the
ioctlfd field set to the descriptor obtained from the open call.
diff --git a/Documentation/filesystems/caching/netfs-api.txt b/Documentation/filesystems/caching/netfs-api.txt
index 1902c57..a167ab8 100644
--- a/Documentation/filesystems/caching/netfs-api.txt
+++ b/Documentation/filesystems/caching/netfs-api.txt
@@ -95,7 +95,7 @@ restraints as possible on how an index is structured and where it is placed in
the tree. The netfs can even mix indices and data files at the same level, but
it's not recommended.
-Each index entry consists of a key of indeterminate length plus some auxilliary
+Each index entry consists of a key of indeterminate length plus some auxiliary
data, also of indeterminate length.
There are some limits on indices:
@@ -203,23 +203,23 @@ This has the following fields:
If the function is absent, a file size of 0 is assumed.
- (6) A function to retrieve auxilliary data from the netfs [optional].
+ (6) A function to retrieve auxiliary data from the netfs [optional].
This function will be called with the netfs data that was passed to the
- cookie acquisition function and the maximum length of auxilliary data that
- it may provide. It should write the auxilliary data into the given buffer
+ cookie acquisition function and the maximum length of auxiliary data that
+ it may provide. It should write the auxiliary data into the given buffer
and return the quantity it wrote.
- If this function is absent, the auxilliary data length will be set to 0.
+ If this function is absent, the auxiliary data length will be set to 0.
- The length of the auxilliary data buffer may be dependent on the key
+ The length of the auxiliary data buffer may be dependent on the key
length. A netfs mustn't rely on being able to provide more than 400 bytes
for both.
- (7) A function to check the auxilliary data [optional].
+ (7) A function to check the auxiliary data [optional].
This function will be called to check that a match found in the cache for
- this object is valid. For instance with AFS it could check the auxilliary
+ this object is valid. For instance with AFS it could check the auxiliary
data against the data version number returned by the server to determine
whether the index entry in a cache is still valid.
@@ -232,7 +232,7 @@ This has the following fields:
(*) FSCACHE_CHECKAUX_NEEDS_UPDATE - the entry requires update
(*) FSCACHE_CHECKAUX_OBSOLETE - the entry should be deleted
- This function can also be used to extract data from the auxilliary data in
+ This function can also be used to extract data from the auxiliary data in
the cache and copy it into the netfs's structures.
(8) A pair of functions to manage contexts for the completion callback
diff --git a/Documentation/filesystems/configfs/configfs.txt b/Documentation/filesystems/configfs/configfs.txt
index fabcb0e..dd57bb6 100644
--- a/Documentation/filesystems/configfs/configfs.txt
+++ b/Documentation/filesystems/configfs/configfs.txt
@@ -409,7 +409,7 @@ As a consequence of this, default_groups cannot be removed directly via
rmdir(2). They also are not considered when rmdir(2) on the parent
group is checking for children.
-[Dependant Subsystems]
+[Dependent Subsystems]
Sometimes other drivers depend on particular configfs items. For
example, ocfs2 mounts depend on a heartbeat region item. If that
diff --git a/Documentation/filesystems/configfs/configfs_example_explicit.c b/Documentation/filesystems/configfs/configfs_example_explicit.c
index fd53869f..1420233 100644
--- a/Documentation/filesystems/configfs/configfs_example_explicit.c
+++ b/Documentation/filesystems/configfs/configfs_example_explicit.c
@@ -464,9 +464,8 @@ static int __init configfs_example_init(void)
return 0;
out_unregister:
- for (; i >= 0; i--) {
+ for (i--; i >= 0; i--)
configfs_unregister_subsystem(example_subsys[i]);
- }
return ret;
}
@@ -475,9 +474,8 @@ static void __exit configfs_example_exit(void)
{
int i;
- for (i = 0; example_subsys[i]; i++) {
+ for (i = 0; example_subsys[i]; i++)
configfs_unregister_subsystem(example_subsys[i]);
- }
}
module_init(configfs_example_init);
diff --git a/Documentation/filesystems/configfs/configfs_example_macros.c b/Documentation/filesystems/configfs/configfs_example_macros.c
index d8e30a0..327dfbc 100644
--- a/Documentation/filesystems/configfs/configfs_example_macros.c
+++ b/Documentation/filesystems/configfs/configfs_example_macros.c
@@ -427,9 +427,8 @@ static int __init configfs_example_init(void)
return 0;
out_unregister:
- for (; i >= 0; i--) {
+ for (i--; i >= 0; i--)
configfs_unregister_subsystem(example_subsys[i]);
- }
return ret;
}
@@ -438,9 +437,8 @@ static void __exit configfs_example_exit(void)
{
int i;
- for (i = 0; example_subsys[i]; i++) {
+ for (i = 0; example_subsys[i]; i++)
configfs_unregister_subsystem(example_subsys[i]);
- }
}
module_init(configfs_example_init);
diff --git a/Documentation/filesystems/exofs.txt b/Documentation/filesystems/exofs.txt
index abd2a9b..23583a1 100644
--- a/Documentation/filesystems/exofs.txt
+++ b/Documentation/filesystems/exofs.txt
@@ -104,7 +104,15 @@ Where:
exofs specific options: Options are separated by commas (,)
pid=<integer> - The partition number to mount/create as
container of the filesystem.
- This option is mandatory.
+ This option is mandatory. integer can be
+ Hex by pre-pending an 0x to the number.
+ osdname=<id> - Mount by a device's osdname.
+ osdname is usually a 36 character uuid of the
+ form "d2683732-c906-4ee1-9dbd-c10c27bb40df".
+ It is one of the device's uuid specified in the
+ mkfs.exofs format command.
+ If this option is specified then the /dev/osdX
+ above can be empty and is ignored.
to=<integer> - Timeout in ticks for a single command.
default is (60 * HZ) [for debugging only]
diff --git a/Documentation/filesystems/ext4.txt b/Documentation/filesystems/ext4.txt
index 6ab9442..3ae9bc9 100644
--- a/Documentation/filesystems/ext4.txt
+++ b/Documentation/filesystems/ext4.txt
@@ -97,7 +97,7 @@ Note: More extensive information for getting started with ext4 can be
* Inode allocation using large virtual block groups via flex_bg
* delayed allocation
* large block (up to pagesize) support
-* efficent new ordered mode in JBD2 and ext4(avoid using buffer head to force
+* efficient new ordered mode in JBD2 and ext4(avoid using buffer head to force
the ordering)
[1] Filesystems with a block size of 1k may see a limit imposed by the
@@ -106,7 +106,7 @@ directory hash tree having a maximum depth of two.
2.2 Candidate features for future inclusion
* Online defrag (patches available but not well tested)
-* reduced mke2fs time via lazy itable initialization in conjuction with
+* reduced mke2fs time via lazy itable initialization in conjunction with
the uninit_bg feature (capability to do this is available in e2fsprogs
but a kernel thread to do lazy zeroing of unused inode table blocks
after filesystem is first mounted is required for safety)
@@ -226,10 +226,6 @@ acl Enables POSIX Access Control Lists support.
noacl This option disables POSIX Access Control List
support.
-reservation
-
-noreservation
-
bsddf (*) Make 'df' act like BSD.
minixdf Make 'df' act like Minix.
@@ -367,12 +363,47 @@ init_itable=n The lazy itable init code will wait n times the
minimizes the impact on the systme performance
while file system's inode table is being initialized.
-discard Controls whether ext4 should issue discard/TRIM
+discard Controls whether ext4 should issue discard/TRIM
nodiscard(*) commands to the underlying block device when
blocks are freed. This is useful for SSD devices
and sparse/thinly-provisioned LUNs, but it is off
by default until sufficient testing has been done.
+nouid32 Disables 32-bit UIDs and GIDs. This is for
+ interoperability with older kernels which only
+ store and expect 16-bit values.
+
+resize Allows to resize filesystem to the end of the last
+ existing block group, further resize has to be done
+ with resize2fs either online, or offline. It can be
+ used only with conjunction with remount.
+
+block_validity This options allows to enables/disables the in-kernel
+noblock_validity facility for tracking filesystem metadata blocks
+ within internal data structures. This allows multi-
+ block allocator and other routines to quickly locate
+ extents which might overlap with filesystem metadata
+ blocks. This option is intended for debugging
+ purposes and since it negatively affects the
+ performance, it is off by default.
+
+dioread_lock Controls whether or not ext4 should use the DIO read
+dioread_nolock locking. If the dioread_nolock option is specified
+ ext4 will allocate uninitialized extent before buffer
+ write and convert the extent to initialized after IO
+ completes. This approach allows ext4 code to avoid
+ using inode mutex, which improves scalability on high
+ speed storages. However this does not work with nobh
+ option and the mount will fail. Nor does it work with
+ data journaling and dioread_nolock option will be
+ ignored with kernel warning. Note that dioread_nolock
+ code path is only used for extent-based files.
+ Because of the restrictions this options comprises
+ it is off by default (e.g. dioread_lock).
+
+i_version Enable 64-bit inode version support. This option is
+ off by default.
+
Data Mode
=========
There are 3 different data modes:
@@ -400,6 +431,176 @@ needs to be read from and written to disk at the same time where it
outperforms all others modes. Currently ext4 does not have delayed
allocation support if this data journalling mode is selected.
+/proc entries
+=============
+
+Information about mounted ext4 file systems can be found in
+/proc/fs/ext4. Each mounted filesystem will have a directory in
+/proc/fs/ext4 based on its device name (i.e., /proc/fs/ext4/hdc or
+/proc/fs/ext4/dm-0). The files in each per-device directory are shown
+in table below.
+
+Files in /proc/fs/ext4/<devname>
+..............................................................................
+ File Content
+ mb_groups details of multiblock allocator buddy cache of free blocks
+..............................................................................
+
+/sys entries
+============
+
+Information about mounted ext4 file systems can be found in
+/sys/fs/ext4. Each mounted filesystem will have a directory in
+/sys/fs/ext4 based on its device name (i.e., /sys/fs/ext4/hdc or
+/sys/fs/ext4/dm-0). The files in each per-device directory are shown
+in table below.
+
+Files in /sys/fs/ext4/<devname>
+(see also Documentation/ABI/testing/sysfs-fs-ext4)
+..............................................................................
+ File Content
+
+ delayed_allocation_blocks This file is read-only and shows the number of
+ blocks that are dirty in the page cache, but
+ which do not have their location in the
+ filesystem allocated yet.
+
+ inode_goal Tuning parameter which (if non-zero) controls
+ the goal inode used by the inode allocator in
+ preference to all other allocation heuristics.
+ This is intended for debugging use only, and
+ should be 0 on production systems.
+
+ inode_readahead_blks Tuning parameter which controls the maximum
+ number of inode table blocks that ext4's inode
+ table readahead algorithm will pre-read into
+ the buffer cache
+
+ lifetime_write_kbytes This file is read-only and shows the number of
+ kilobytes of data that have been written to this
+ filesystem since it was created.
+
+ max_writeback_mb_bump The maximum number of megabytes the writeback
+ code will try to write out before move on to
+ another inode.
+
+ mb_group_prealloc The multiblock allocator will round up allocation
+ requests to a multiple of this tuning parameter if
+ the stripe size is not set in the ext4 superblock
+
+ mb_max_to_scan The maximum number of extents the multiblock
+ allocator will search to find the best extent
+
+ mb_min_to_scan The minimum number of extents the multiblock
+ allocator will search to find the best extent
+
+ mb_order2_req Tuning parameter which controls the minimum size
+ for requests (as a power of 2) where the buddy
+ cache is used
+
+ mb_stats Controls whether the multiblock allocator should
+ collect statistics, which are shown during the
+ unmount. 1 means to collect statistics, 0 means
+ not to collect statistics
+
+ mb_stream_req Files which have fewer blocks than this tunable
+ parameter will have their blocks allocated out
+ of a block group specific preallocation pool, so
+ that small files are packed closely together.
+ Each large file will have its blocks allocated
+ out of its own unique preallocation pool.
+
+ session_write_kbytes This file is read-only and shows the number of
+ kilobytes of data that have been written to this
+ filesystem since it was mounted.
+..............................................................................
+
+Ioctls
+======
+
+There is some Ext4 specific functionality which can be accessed by applications
+through the system call interfaces. The list of all Ext4 specific ioctls are
+shown in the table below.
+
+Table of Ext4 specific ioctls
+..............................................................................
+ Ioctl Description
+ EXT4_IOC_GETFLAGS Get additional attributes associated with inode.
+ The ioctl argument is an integer bitfield, with
+ bit values described in ext4.h. This ioctl is an
+ alias for FS_IOC_GETFLAGS.
+
+ EXT4_IOC_SETFLAGS Set additional attributes associated with inode.
+ The ioctl argument is an integer bitfield, with
+ bit values described in ext4.h. This ioctl is an
+ alias for FS_IOC_SETFLAGS.
+
+ EXT4_IOC_GETVERSION
+ EXT4_IOC_GETVERSION_OLD
+ Get the inode i_generation number stored for
+ each inode. The i_generation number is normally
+ changed only when new inode is created and it is
+ particularly useful for network filesystems. The
+ '_OLD' version of this ioctl is an alias for
+ FS_IOC_GETVERSION.
+
+ EXT4_IOC_SETVERSION
+ EXT4_IOC_SETVERSION_OLD
+ Set the inode i_generation number stored for
+ each inode. The '_OLD' version of this ioctl
+ is an alias for FS_IOC_SETVERSION.
+
+ EXT4_IOC_GROUP_EXTEND This ioctl has the same purpose as the resize
+ mount option. It allows to resize filesystem
+ to the end of the last existing block group,
+ further resize has to be done with resize2fs,
+ either online, or offline. The argument points
+ to the unsigned logn number representing the
+ filesystem new block count.
+
+ EXT4_IOC_MOVE_EXT Move the block extents from orig_fd (the one
+ this ioctl is pointing to) to the donor_fd (the
+ one specified in move_extent structure passed
+ as an argument to this ioctl). Then, exchange
+ inode metadata between orig_fd and donor_fd.
+ This is especially useful for online
+ defragmentation, because the allocator has the
+ opportunity to allocate moved blocks better,
+ ideally into one contiguous extent.
+
+ EXT4_IOC_GROUP_ADD Add a new group descriptor to an existing or
+ new group descriptor block. The new group
+ descriptor is described by ext4_new_group_input
+ structure, which is passed as an argument to
+ this ioctl. This is especially useful in
+ conjunction with EXT4_IOC_GROUP_EXTEND,
+ which allows online resize of the filesystem
+ to the end of the last existing block group.
+ Those two ioctls combined is used in userspace
+ online resize tool (e.g. resize2fs).
+
+ EXT4_IOC_MIGRATE This ioctl operates on the filesystem itself.
+ It converts (migrates) ext3 indirect block mapped
+ inode to ext4 extent mapped inode by walking
+ through indirect block mapping of the original
+ inode and converting contiguous block ranges
+ into ext4 extents of the temporary inode. Then,
+ inodes are swapped. This ioctl might help, when
+ migrating from ext3 to ext4 filesystem, however
+ suggestion is to create fresh ext4 filesystem
+ and copy data from the backup. Note, that
+ filesystem has to support extents for this ioctl
+ to work.
+
+ EXT4_IOC_ALLOC_DA_BLKS Force all of the delay allocated blocks to be
+ allocated to preserve application-expected ext3
+ behaviour. Note that this will also start
+ triggering a write of the data blocks, but this
+ behaviour may change in the future as it is
+ not necessary and has been done this way only
+ for sake of simplicity.
+..............................................................................
+
References
==========
diff --git a/Documentation/filesystems/gfs2-uevents.txt b/Documentation/filesystems/gfs2-uevents.txt
index fd966dc..d818896 100644
--- a/Documentation/filesystems/gfs2-uevents.txt
+++ b/Documentation/filesystems/gfs2-uevents.txt
@@ -62,7 +62,7 @@ be fixed.
The REMOVE uevent is generated at the end of an unsuccessful mount
or at the end of a umount of the filesystem. All REMOVE uevents will
-have been preceeded by at least an ADD uevent for the same fileystem,
+have been preceded by at least an ADD uevent for the same fileystem,
and unlike the other uevents is generated automatically by the kernel's
kobject subsystem.
diff --git a/Documentation/filesystems/gfs2.txt b/Documentation/filesystems/gfs2.txt
index 0b59c02..4cda926 100644
--- a/Documentation/filesystems/gfs2.txt
+++ b/Documentation/filesystems/gfs2.txt
@@ -11,7 +11,7 @@ their I/O so file system consistency is maintained. One of the nifty
features of GFS is perfect consistency -- changes made to the file system
on one machine show up immediately on all other machines in the cluster.
-GFS uses interchangable inter-node locking mechanisms, the currently
+GFS uses interchangeable inter-node locking mechanisms, the currently
supported mechanisms are:
lock_nolock -- allows gfs to be used as a local file system
diff --git a/Documentation/filesystems/nfs/idmapper.txt b/Documentation/filesystems/nfs/idmapper.txt
index b9b4192..9c8fd61 100644
--- a/Documentation/filesystems/nfs/idmapper.txt
+++ b/Documentation/filesystems/nfs/idmapper.txt
@@ -47,8 +47,8 @@ request-key will find the first matching line and corresponding program. In
this case, /some/other/program will handle all uid lookups and
/usr/sbin/nfs.idmap will handle gid, user, and group lookups.
-See <file:Documentation/keys-request-keys.txt> for more information about the
-request-key function.
+See <file:Documentation/security/keys-request-keys.txt> for more information
+about the request-key function.
=========
diff --git a/Documentation/filesystems/nfs/pnfs.txt b/Documentation/filesystems/nfs/pnfs.txt
index bc0b9cf..983e14a 100644
--- a/Documentation/filesystems/nfs/pnfs.txt
+++ b/Documentation/filesystems/nfs/pnfs.txt
@@ -46,3 +46,10 @@ data server cache
file driver devices refer to data servers, which are kept in a module
level cache. Its reference is held over the lifetime of the deviceid
pointing to it.
+
+lseg
+----
+lseg maintains an extra reference corresponding to the NFS_LSEG_VALID
+bit which holds it in the pnfs_layout_hdr's list. When the final lseg
+is removed from the pnfs_layout_hdr's list, the NFS_LAYOUT_DESTROYED
+bit is set, preventing any new lsegs from being added.
diff --git a/Documentation/filesystems/ntfs.txt b/Documentation/filesystems/ntfs.txt
index 933bc66..791af8d 100644
--- a/Documentation/filesystems/ntfs.txt
+++ b/Documentation/filesystems/ntfs.txt
@@ -350,7 +350,7 @@ Note the "Should sync?" parameter "nosync" means that the two mirrors are
already in sync which will be the case on a clean shutdown of Windows. If the
mirrors are not clean, you can specify the "sync" option instead of "nosync"
and the Device-Mapper driver will then copy the entirety of the "Source Device"
-to the "Target Device" or if you specified multipled target devices to all of
+to the "Target Device" or if you specified multiple target devices to all of
them.
Once you have your table, save it in a file somewhere (e.g. /etc/ntfsvolume1),
diff --git a/Documentation/filesystems/ocfs2.txt b/Documentation/filesystems/ocfs2.txt
index 5393e66..7618a28 100644
--- a/Documentation/filesystems/ocfs2.txt
+++ b/Documentation/filesystems/ocfs2.txt
@@ -46,9 +46,15 @@ errors=panic Panic and halt the machine if an error occurs.
intr (*) Allow signals to interrupt cluster operations.
nointr Do not allow signals to interrupt cluster
operations.
+noatime Do not update access time.
+relatime(*) Update atime if the previous atime is older than
+ mtime or ctime
+strictatime Always update atime, but the minimum update interval
+ is specified by atime_quantum.
atime_quantum=60(*) OCFS2 will not update atime unless this number
of seconds has passed since the last update.
- Set to zero to always update atime.
+ Set to zero to always update atime. This option need
+ work with strictatime.
data=ordered (*) All data are forced directly out to the main file
system prior to its metadata being committed to the
journal.
@@ -80,7 +86,7 @@ user_xattr (*) Enables Extended User Attributes.
nouser_xattr Disables Extended User Attributes.
acl Enables POSIX Access Control Lists support.
noacl (*) Disables POSIX Access Control Lists support.
-resv_level=2 (*) Set how agressive allocation reservations will be.
+resv_level=2 (*) Set how aggressive allocation reservations will be.
Valid values are between 0 (reservations off) to 8
(maximum space for reservations).
dir_resv_level= (*) By default, directory reservations will scale with file
diff --git a/Documentation/filesystems/path-lookup.txt b/Documentation/filesystems/path-lookup.txt
index eb59c8b..3571667 100644
--- a/Documentation/filesystems/path-lookup.txt
+++ b/Documentation/filesystems/path-lookup.txt
@@ -42,7 +42,7 @@ Path walking overview
A name string specifies a start (root directory, cwd, fd-relative) and a
sequence of elements (directory entry names), which together refer to a path in
the namespace. A path is represented as a (dentry, vfsmount) tuple. The name
-elements are sub-strings, seperated by '/'.
+elements are sub-strings, separated by '/'.
Name lookups will want to find a particular path that a name string refers to
(usually the final element, or parent of final element). This is done by taking
@@ -354,7 +354,7 @@ vfstest 24185492 4945 708725(2.9%) 1076136(4.4%) 0 2651
What this shows is that failed rcu-walk lookups, ie. ones that are restarted
entirely with ref-walk, are quite rare. Even the "vfstest" case which
-specifically has concurrent renames/mkdir/rmdir/ creat/unlink/etc to excercise
+specifically has concurrent renames/mkdir/rmdir/ creat/unlink/etc to exercise
such races is not showing a huge amount of restarts.
Dropping from rcu-walk to ref-walk mean that we have encountered a dentry where
diff --git a/Documentation/filesystems/pohmelfs/network_protocol.txt b/Documentation/filesystems/pohmelfs/network_protocol.txt
index 40ea6c2..65e03dd 100644
--- a/Documentation/filesystems/pohmelfs/network_protocol.txt
+++ b/Documentation/filesystems/pohmelfs/network_protocol.txt
@@ -20,7 +20,7 @@ Commands can be embedded into transaction command (which in turn has own command
so one can extend protocol as needed without breaking backward compatibility as long
as old commands are supported. All string lengths include tail 0 byte.
-All commans are transfered over the network in big-endian. CPU endianess is used at the end peers.
+All commands are transferred over the network in big-endian. CPU endianess is used at the end peers.
@cmd - command number, which specifies command to be processed. Following
commands are used currently:
diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting
index dfbcd1b..6e29954 100644
--- a/Documentation/filesystems/porting
+++ b/Documentation/filesystems/porting
@@ -298,11 +298,14 @@ be used instead. It gets called whenever the inode is evicted, whether it has
remaining links or not. Caller does *not* evict the pagecache or inode-associated
metadata buffers; getting rid of those is responsibility of method, as it had
been for ->delete_inode().
- ->drop_inode() returns int now; it's called on final iput() with inode_lock
-held and it returns true if filesystems wants the inode to be dropped. As before,
-generic_drop_inode() is still the default and it's been updated appropriately.
-generic_delete_inode() is also alive and it consists simply of return 1. Note that
-all actual eviction work is done by caller after ->drop_inode() returns.
+
+ ->drop_inode() returns int now; it's called on final iput() with
+inode->i_lock held and it returns true if filesystems wants the inode to be
+dropped. As before, generic_drop_inode() is still the default and it's been
+updated appropriately. generic_delete_inode() is also alive and it consists
+simply of return 1. Note that all actual eviction work is done by caller after
+->drop_inode() returns.
+
clear_inode() is gone; use end_writeback() instead. As before, it must
be called exactly once on each call of ->evict_inode() (as it used to be for
each call of ->delete_inode()). Unlike before, if you are using inode-associated
@@ -394,3 +397,13 @@ file) you must return -EOPNOTSUPP if FALLOC_FL_PUNCH_HOLE is set in mode.
Currently you can only have FALLOC_FL_PUNCH_HOLE with FALLOC_FL_KEEP_SIZE set,
so the i_size should not change when hole punching, even when puching the end of
a file off.
+
+--
+[mandatory]
+
+--
+[mandatory]
+ ->get_sb() is gone. Switch to use of ->mount(). Typically it's just
+a matter of switching from calling get_sb_... to mount_... and changing the
+function type. If you were doing it manually, just switch from setting ->mnt_root
+to some pointer to returning that pointer. On errors return ERR_PTR(...).
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 23cae65..f481780 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -543,7 +543,7 @@ just those considered 'most important'. The new vectors are:
their statistics are used by kernel developers and interested users to
determine the occurrence of interrupts of the given type.
-The above IRQ vectors are displayed only when relevent. For example,
+The above IRQ vectors are displayed only when relevant. For example,
the threshold vector does not exist on x86_64 platforms. Others are
suppressed when the system is a uniprocessor. As of this writing, only
i386 and x86_64 platforms support the new IRQ vector displays.
@@ -574,6 +574,12 @@ The contents of each smp_affinity file is the same by default:
> cat /proc/irq/0/smp_affinity
ffffffff
+There is an alternate interface, smp_affinity_list which allows specifying
+a cpu range instead of a bitmask:
+
+ > cat /proc/irq/0/smp_affinity_list
+ 1024-1031
+
The default_smp_affinity mask applies to all non-active IRQs, which are the
IRQs which have not yet been allocated/activated, and hence which lack a
/proc/irq/[0-9]* directory.
@@ -583,12 +589,13 @@ reports itself as being attached. This hardware locality information does not
include information about any possible driver locality preference.
prof_cpu_mask specifies which CPUs are to be profiled by the system wide
-profiler. Default value is ffffffff (all cpus).
+profiler. Default value is ffffffff (all cpus if there are only 32 of them).
The way IRQs are routed is handled by the IO-APIC, and it's Round Robin
between all the CPUs which are allowed to handle it. As usual the kernel has
more info than you and does a better job than you, so the defaults are the
-best choice for almost everyone.
+best choice for almost everyone. [Note this applies only to those IO-APIC's
+that support "Round Robin" interrupt distribution.]
There are three more important subdirectories in /proc: net, scsi, and sys.
The general rule is that the contents, or even the existence of these
@@ -836,7 +843,6 @@ Provides counts of softirq handlers serviced since boot time, for each cpu.
TASKLET: 0 0 0 290
SCHED: 27035 26983 26971 26746
HRTIMER: 0 0 0 0
- RCU: 1678 1769 2178 2250
1.3 IDE devices in /proc/ide
@@ -1202,7 +1208,7 @@ The columns are:
W = can do write operations
U = can do unblank
flags E = it is enabled
- C = it is prefered console
+ C = it is preferred console
B = it is primary boot console
p = it is used for printk buffer
b = it is not a TTY but a Braille device
@@ -1331,7 +1337,7 @@ NOTICE: /proc/<pid>/oom_adj is deprecated and will be removed, please see
Documentation/feature-removal-schedule.txt.
Caveat: when a parent task is selected, the oom killer will sacrifice any first
-generation children with seperate address spaces instead, if possible. This
+generation children with separate address spaces instead, if possible. This
avoids servers and important system daemons from being killed and loses the
minimal amount of work.
diff --git a/Documentation/filesystems/romfs.txt b/Documentation/filesystems/romfs.txt
index 2d2a7b2..e2b07cc 100644
--- a/Documentation/filesystems/romfs.txt
+++ b/Documentation/filesystems/romfs.txt
@@ -17,8 +17,7 @@ comparison, an actual rescue disk used up 3202 blocks with ext2, while
with romfs, it needed 3079 blocks.
To create such a file system, you'll need a user program named
-genromfs. It is available via anonymous ftp on sunsite.unc.edu and
-its mirrors, in the /pub/Linux/system/recovery/ directory.
+genromfs. It is available on http://romfs.sourceforge.net/
As the name suggests, romfs could be also used (space-efficiently) on
various read-only media, like (E)EPROM disks if someone will have the
diff --git a/Documentation/filesystems/squashfs.txt b/Documentation/filesystems/squashfs.txt
index 66699af..d4d41465 100644
--- a/Documentation/filesystems/squashfs.txt
+++ b/Documentation/filesystems/squashfs.txt
@@ -59,12 +59,15 @@ obtained from this site also.
3. SQUASHFS FILESYSTEM DESIGN
-----------------------------
-A squashfs filesystem consists of a maximum of eight parts, packed together on a byte
-alignment:
+A squashfs filesystem consists of a maximum of nine parts, packed together on a
+byte alignment:
---------------
| superblock |
|---------------|
+ | compression |
+ | options |
+ |---------------|
| datablocks |
| & fragments |
|---------------|
@@ -91,7 +94,14 @@ the source directory, and checked for duplicates. Once all file data has been
written the completed inode, directory, fragment, export and uid/gid lookup
tables are written.
-3.1 Inodes
+3.1 Compression options
+-----------------------
+
+Compressors can optionally support compression specific options (e.g.
+dictionary size). If non-default compression options have been used, then
+these are stored here.
+
+3.2 Inodes
----------
Metadata (inodes and directories) are compressed in 8Kbyte blocks. Each
@@ -114,7 +124,7 @@ directory inode are defined: inodes optimised for frequently occurring
regular files and directories, and extended types where extra
information has to be stored.
-3.2 Directories
+3.3 Directories
---------------
Like inodes, directories are packed into compressed metadata blocks, stored
@@ -144,7 +154,7 @@ decompressed to do a lookup irrespective of the length of the directory.
This scheme has the advantage that it doesn't require extra memory overhead
and doesn't require much extra storage on disk.
-3.3 File data
+3.4 File data
-------------
Regular files consist of a sequence of contiguous compressed blocks, and/or a
@@ -163,7 +173,7 @@ Larger files use multiple slots, with 1.75 TiB files using all 8 slots.
The index cache is designed to be memory efficient, and by default uses
16 KiB.
-3.4 Fragment lookup table
+3.5 Fragment lookup table
-------------------------
Regular files can contain a fragment index which is mapped to a fragment
@@ -173,7 +183,7 @@ A second index table is used to locate these. This second index table for
speed of access (and because it is small) is read at mount time and cached
in memory.
-3.5 Uid/gid lookup table
+3.6 Uid/gid lookup table
------------------------
For space efficiency regular files store uid and gid indexes, which are
@@ -182,7 +192,7 @@ stored compressed into metadata blocks. A second index table is used to
locate these. This second index table for speed of access (and because it
is small) is read at mount time and cached in memory.
-3.6 Export table
+3.7 Export table
----------------
To enable Squashfs filesystems to be exportable (via NFS etc.) filesystems
@@ -196,7 +206,7 @@ This table is stored compressed into metadata blocks. A second index table is
used to locate these. This second index table for speed of access (and because
it is small) is read at mount time and cached in memory.
-3.7 Xattr table
+3.8 Xattr table
---------------
The xattr table contains extended attributes for each inode. The xattrs
@@ -209,7 +219,7 @@ or if it is stored out of line (in which case the value field stores a
reference to where the actual value is stored). This allows large values
to be stored out of line improving scanning and lookup performance and it
also allows values to be de-duplicated, the value being stored once, and
-all other occurences holding an out of line reference to that value.
+all other occurrences holding an out of line reference to that value.
The xattr lists are packed into compressed 8K metadata blocks.
To reduce overhead in inodes, rather than storing the on-disk
diff --git a/Documentation/filesystems/sysfs.txt b/Documentation/filesystems/sysfs.txt
index 5d1335fae..597f728 100644
--- a/Documentation/filesystems/sysfs.txt
+++ b/Documentation/filesystems/sysfs.txt
@@ -39,10 +39,12 @@ userspace. Top-level directories in sysfs represent the common
ancestors of object hierarchies; i.e. the subsystems the objects
belong to.
-Sysfs internally stores the kobject that owns the directory in the
-->d_fsdata pointer of the directory's dentry. This allows sysfs to do
-reference counting directly on the kobject when the file is opened and
-closed.
+Sysfs internally stores a pointer to the kobject that implements a
+directory in the sysfs_dirent object associated with the directory. In
+the past this kobject pointer has been used by sysfs to do reference
+counting directly on the kobject whenever the file is opened or closed.
+With the current sysfs implementation the kobject reference count is
+only modified directly by the function sysfs_schedule_callback().
Attributes
@@ -60,7 +62,7 @@ values of the same type.
Mixing types, expressing multiple lines of data, and doing fancy
formatting of data is heavily frowned upon. Doing these things may get
-you publically humiliated and your code rewritten without notice.
+you publicly humiliated and your code rewritten without notice.
An attribute definition is simply:
@@ -208,9 +210,9 @@ Other notes:
is 4096.
- show() methods should return the number of bytes printed into the
- buffer. This is the return value of snprintf().
+ buffer. This is the return value of scnprintf().
-- show() should always use snprintf().
+- show() should always use scnprintf().
- store() should return the number of bytes used from the buffer. If the
entire buffer has been used, just return the count argument.
@@ -229,7 +231,7 @@ A very simple (and naive) implementation of a device attribute is:
static ssize_t show_name(struct device *dev, struct device_attribute *attr,
char *buf)
{
- return snprintf(buf, PAGE_SIZE, "%s\n", dev->name);
+ return scnprintf(buf, PAGE_SIZE, "%s\n", dev->name);
}
static ssize_t store_name(struct device *dev, struct device_attribute *attr,
diff --git a/Documentation/filesystems/ubifs.txt b/Documentation/filesystems/ubifs.txt
index 12fedb7..8e4fab6 100644
--- a/Documentation/filesystems/ubifs.txt
+++ b/Documentation/filesystems/ubifs.txt
@@ -82,12 +82,12 @@ Mount options
bulk_read read more in one go to take advantage of flash
media that read faster sequentially
no_bulk_read (*) do not bulk-read
-no_chk_data_crc skip checking of CRCs on data nodes in order to
+no_chk_data_crc (*) skip checking of CRCs on data nodes in order to
improve read performance. Use this option only
if the flash media is highly reliable. The effect
of this option is that corruption of the contents
of a file can go unnoticed.
-chk_data_crc (*) do not skip checking CRCs on data nodes
+chk_data_crc do not skip checking CRCs on data nodes
compr=none override default compressor and set it to "none"
compr=lzo override default compressor and set it to "lzo"
compr=zlib override default compressor and set it to "zlib"
@@ -115,28 +115,8 @@ ubi.mtd=0 root=ubi0:rootfs rootfstype=ubifs
Module Parameters for Debugging
===============================
-When UBIFS has been compiled with debugging enabled, there are 3 module
+When UBIFS has been compiled with debugging enabled, there are 2 module
parameters that are available to control aspects of testing and debugging.
-The parameters are unsigned integers where each bit controls an option.
-The parameters are:
-
-debug_msgs Selects which debug messages to display, as follows:
-
- Message Type Flag value
-
- General messages 1
- Journal messages 2
- Mount messages 4
- Commit messages 8
- LEB search messages 16
- Budgeting messages 32
- Garbage collection messages 64
- Tree Node Cache (TNC) messages 128
- LEB properties (lprops) messages 256
- Input/output messages 512
- Log messages 1024
- Scan messages 2048
- Recovery messages 4096
debug_chks Selects extra checks that UBIFS can do while running:
@@ -154,11 +134,9 @@ debug_tsts Selects a mode of testing, as follows:
Test mode Flag value
- Force in-the-gaps method 2
Failure mode for recovery testing 4
-For example, set debug_msgs to 5 to display General messages and Mount
-messages.
+For example, set debug_chks to 3 to enable general and TNC checks.
References
diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 94cf97b..88b9f55 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -95,10 +95,11 @@ functions:
extern int unregister_filesystem(struct file_system_type *);
The passed struct file_system_type describes your filesystem. When a
-request is made to mount a device onto a directory in your filespace,
-the VFS will call the appropriate get_sb() method for the specific
-filesystem. The dentry for the mount point will then be updated to
-point to the root inode for the new filesystem.
+request is made to mount a filesystem onto a directory in your namespace,
+the VFS will call the appropriate mount() method for the specific
+filesystem. New vfsmount referring to the tree returned by ->mount()
+will be attached to the mountpoint, so that when pathname resolution
+reaches the mountpoint it will jump into the root of that vfsmount.
You can see all filesystems that are registered to the kernel in the
file /proc/filesystems.
@@ -107,14 +108,14 @@ file /proc/filesystems.
struct file_system_type
-----------------------
-This describes the filesystem. As of kernel 2.6.22, the following
+This describes the filesystem. As of kernel 2.6.39, the following
members are defined:
struct file_system_type {
const char *name;
int fs_flags;
- int (*get_sb) (struct file_system_type *, int,
- const char *, void *, struct vfsmount *);
+ struct dentry (*mount) (struct file_system_type *, int,
+ const char *, void *);
void (*kill_sb) (struct super_block *);
struct module *owner;
struct file_system_type * next;
@@ -128,11 +129,11 @@ struct file_system_type {
fs_flags: various flags (i.e. FS_REQUIRES_DEV, FS_NO_DCACHE, etc.)
- get_sb: the method to call when a new instance of this
+ mount: the method to call when a new instance of this
filesystem should be mounted
kill_sb: the method to call when an instance of this filesystem
- should be unmounted
+ should be shut down
owner: for internal VFS use: you should initialize this to THIS_MODULE in
most cases.
@@ -141,7 +142,7 @@ struct file_system_type {
s_lock_key, s_umount_key: lockdep-specific
-The get_sb() method has the following arguments:
+The mount() method has the following arguments:
struct file_system_type *fs_type: describes the filesystem, partly initialized
by the specific filesystem code
@@ -153,32 +154,39 @@ The get_sb() method has the following arguments:
void *data: arbitrary mount options, usually comes as an ASCII
string (see "Mount Options" section)
- struct vfsmount *mnt: a vfs-internal representation of a mount point
+The mount() method must return the root dentry of the tree requested by
+caller. An active reference to its superblock must be grabbed and the
+superblock must be locked. On failure it should return ERR_PTR(error).
-The get_sb() method must determine if the block device specified
-in the dev_name and fs_type contains a filesystem of the type the method
-supports. If it succeeds in opening the named block device, it initializes a
-struct super_block descriptor for the filesystem contained by the block device.
-On failure it returns an error.
+The arguments match those of mount(2) and their interpretation
+depends on filesystem type. E.g. for block filesystems, dev_name is
+interpreted as block device name, that device is opened and if it
+contains a suitable filesystem image the method creates and initializes
+struct super_block accordingly, returning its root dentry to caller.
+
+->mount() may choose to return a subtree of existing filesystem - it
+doesn't have to create a new one. The main result from the caller's
+point of view is a reference to dentry at the root of (sub)tree to
+be attached; creation of new superblock is a common side effect.
The most interesting member of the superblock structure that the
-get_sb() method fills in is the "s_op" field. This is a pointer to
+mount() method fills in is the "s_op" field. This is a pointer to
a "struct super_operations" which describes the next level of the
filesystem implementation.
-Usually, a filesystem uses one of the generic get_sb() implementations
-and provides a fill_super() method instead. The generic methods are:
+Usually, a filesystem uses one of the generic mount() implementations
+and provides a fill_super() callback instead. The generic variants are:
- get_sb_bdev: mount a filesystem residing on a block device
+ mount_bdev: mount a filesystem residing on a block device
- get_sb_nodev: mount a filesystem that is not backed by a device
+ mount_nodev: mount a filesystem that is not backed by a device
- get_sb_single: mount a filesystem which shares the instance between
+ mount_single: mount a filesystem which shares the instance between
all mounts
-A fill_super() method implementation has the following arguments:
+A fill_super() callback implementation has the following arguments:
- struct super_block *sb: the superblock structure. The method fill_super()
+ struct super_block *sb: the superblock structure. The callback
must initialize this properly.
void *data: arbitrary mount options, usually comes as an ASCII
@@ -203,7 +211,7 @@ struct super_operations {
struct inode *(*alloc_inode)(struct super_block *sb);
void (*destroy_inode)(struct inode *);
- void (*dirty_inode) (struct inode *);
+ void (*dirty_inode) (struct inode *, int flags);
int (*write_inode) (struct inode *, int);
void (*drop_inode) (struct inode *);
void (*delete_inode) (struct inode *);
@@ -246,7 +254,7 @@ or bottom half).
should be synchronous or not, not all filesystems check this flag.
drop_inode: called when the last access to the inode is dropped,
- with the inode_lock spinlock held.
+ with the inode->i_lock spinlock held.
This method should be either NULL (normal UNIX filesystem
semantics) or "generic_delete_inode" (for filesystems that do not
@@ -865,7 +873,7 @@ struct dentry_operations {
void (*d_iput)(struct dentry *, struct inode *);
char *(*d_dname)(struct dentry *, char *, int);
struct vfsmount *(*d_automount)(struct path *);
- int (*d_manage)(struct dentry *, bool, bool);
+ int (*d_manage)(struct dentry *, bool);
};
d_revalidate: called when the VFS needs to revalidate a dentry. This
@@ -961,10 +969,6 @@ struct dentry_operations {
mounted on it and not to check the automount flag. Any other error
code will abort pathwalk completely.
- If the 'mounting_here' parameter is true, then namespace_sem is being
- held by the caller and the function should not initiate any mounts or
- unmounts that it will then wait for.
-
If the 'rcu_walk' parameter is true, then the caller is doing a
pathwalk in RCU-walk mode. Sleeping is not permitted in this mode,
and the caller can be asked to leave it and call again by returing
diff --git a/Documentation/filesystems/xfs-delayed-logging-design.txt b/Documentation/filesystems/xfs-delayed-logging-design.txt
index 7445bf3..2ce3643 100644
--- a/Documentation/filesystems/xfs-delayed-logging-design.txt
+++ b/Documentation/filesystems/xfs-delayed-logging-design.txt
@@ -42,7 +42,7 @@ the aggregation of all the previous changes currently held only in the log.
This relogging technique also allows objects to be moved forward in the log so
that an object being relogged does not prevent the tail of the log from ever
moving forward. This can be seen in the table above by the changing
-(increasing) LSN of each subsquent transaction - the LSN is effectively a
+(increasing) LSN of each subsequent transaction - the LSN is effectively a
direct encoding of the location in the log of the transaction.
This relogging is also used to implement long-running, multiple-commit
@@ -338,7 +338,7 @@ the same time another transaction modifies the item and inserts the log item
into the new CIL, then checkpoint transaction commit code cannot use log items
to store the list of log vectors that need to be written into the transaction.
Hence log vectors need to be able to be chained together to allow them to be
-detatched from the log items. That is, when the CIL is flushed the memory
+detached from the log items. That is, when the CIL is flushed the memory
buffer and log vector attached to each log item needs to be attached to the
checkpoint context so that the log item can be released. In diagrammatic form,
the CIL would look like this before the flush:
@@ -577,7 +577,7 @@ only becomes unpinned when all the transactions complete and there are no
pending transactions. Thus the pinning and unpinning of a log item is symmetric
as there is a 1:1 relationship with transaction commit and log item completion.
-For delayed logging, however, we have an assymetric transaction commit to
+For delayed logging, however, we have an asymmetric transaction commit to
completion relationship. Every time an object is relogged in the CIL it goes
through the commit process without a corresponding completion being registered.
That is, we now have a many-to-one relationship between transaction commit and
@@ -780,7 +780,7 @@ With delayed logging, there are new steps inserted into the life cycle:
From this, it can be seen that the only life cycle differences between the two
logging methods are in the middle of the life cycle - they still have the same
beginning and end and execution constraints. The only differences are in the
-commiting of the log items to the log itself and the completion processing.
+committing of the log items to the log itself and the completion processing.
Hence delayed logging should not introduce any constraints on log item
behaviour, allocation or freeing that don't already exist.
@@ -791,10 +791,3 @@ mount option. Fundamentally, there is no reason why the log manager would not
be able to swap methods automatically and transparently depending on load
characteristics, but this should not be necessary if delayed logging works as
designed.
-
-Roadmap:
-
-2.6.39 Switch default mount option to use delayed logging
- => should be roughly 12 months after initial merge
- => enough time to shake out remaining problems before next round of
- enterprise distro kernel rebases
diff --git a/Documentation/filesystems/xfs.txt b/Documentation/filesystems/xfs.txt
index 7bff3e4..3fc0c31 100644
--- a/Documentation/filesystems/xfs.txt
+++ b/Documentation/filesystems/xfs.txt
@@ -39,6 +39,12 @@ When mounting an XFS filesystem, the following options are accepted.
drive level write caching to be enabled, for devices that
support write barriers.
+ discard
+ Issue command to let the block device reclaim space freed by the
+ filesystem. This is useful for SSD devices, thinly provisioned
+ LUNs and virtual machine images, but may have a performance
+ impact. This option is incompatible with the nodelaylog option.
+
dmapi
Enable the DMAPI (Data Management API) event callouts.
Use with the "mtpt" option.
OpenPOWER on IntegriCloud