summaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAgeFilesLines
...
* | | | | | | Merge tag 'ext4_for_linus' of ↵Linus Torvalds2012-10-239-45/+74
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Various bug fixes for ext4. The most serious of them fixes a security bug (CVE-2012-4508) which leads to stale data exposure when we have fallocate racing against writes to files undergoing delayed allocation. We also have two fixes for the metadata checksum feature, the most serious of which can cause the superblock to have a invalid checksum after a power failure." * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: Avoid underflow in ext4_trim_fs() ext4: Checksum the block bitmap properly with bigalloc enabled ext4: fix undefined bit shift result in ext4_fill_flex_info ext4: fix metadata checksum calculation for the superblock ext4: race-condition protection for ext4_convert_unwritten_extents_endio ext4: serialize fallocate with ext4_convert_unwritten_extents
| * | | | | | | ext4: Avoid underflow in ext4_trim_fs()Lukas Czerner2012-10-221-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently if len argument in ext4_trim_fs() is smaller than one block, the 'end' variable underflow. Avoid that by returning EINVAL if len is smaller than file system block. Also remove useless unlikely(). Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
| * | | | | | | ext4: Checksum the block bitmap properly with bigalloc enabledTao Ma2012-10-226-20/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In mke2fs, we only checksum the whole bitmap block and it is right. While in the kernel, we use EXT4_BLOCKS_PER_GROUP to indicate the size of the checksumed bitmap which is wrong when we enable bigalloc. The right size should be EXT4_CLUSTERS_PER_GROUP and this patch fixes it. Also as every caller of ext4_block_bitmap_csum_set and ext4_block_bitmap_csum_verify pass in EXT4_BLOCKS_PER_GROUP(sb)/8, we'd better removes this parameter and sets it in the function itself. Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Lukas Czerner <lczerner@redhat.com> Cc: stable@vger.kernel.org
| * | | | | | | ext4: fix undefined bit shift result in ext4_fill_flex_infoLukas Czerner2012-10-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The result of the bit shift expression in '1 << sbi->s_log_groups_per_flex' can be undefined in the case that s_log_groups_per_flex is 31 because the result of the shift is bigger than INT_MAX. In reality this probably should not cause much problems since we'll end up with INT_MIN which will then be converted into 'unsigned int' type, but nevertheless according to the ISO C99 the result is actually undefined. Fix this by changing the left operand to 'unsigned int' type. Note that the commit d50f2ab6f050311dbf7b8f5501b25f0bf64a439b already tried to fix the undefined behaviour, but this was missed. Thanks to Laszlo Ersek for pointing this out and suggesting the fix. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reported-by: Laszlo Ersek <lersek@redhat.com>
| * | | | | | | ext4: fix metadata checksum calculation for the superblockTheodore Ts'o2012-10-103-11/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function ext4_handle_dirty_super() was calculating the superblock on the wrong block data. As a result, when the superblock is modified while it is mounted (most commonly, when inodes are added or removed from the orphan list), the superblock checksum would be wrong. We didn't notice because the superblock *was* being correctly calculated in ext4_commit_super(), and this would get called when the file system was unmounted. So the problem only became obvious if the system crashed while the file system was mounted. Fix this by removing the poorly designed function signature for ext4_superblock_csum_set(); if it only took a single argument, the pointer to a struct superblock, the ambiguity which caused this mistake would have been impossible. Reported-by: George Spelvin <linux@horizon.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
| * | | | | | | ext4: race-condition protection for ext4_convert_unwritten_extents_endioDmitry Monakhov2012-10-101-11/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We assumed that at the time we call ext4_convert_unwritten_extents_endio() extent in question is fully inside [map.m_lblk, map->m_len] because it was already split during submission. But this may not be true due to a race between writeback vs fallocate. If extent in question is larger than requested we will split it again. Special precautions should being done if zeroout required because [map.m_lblk, map->m_len] already contains valid data. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
| * | | | | | | ext4: serialize fallocate with ext4_convert_unwritten_extentsDmitry Monakhov2012-10-051-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fallocate should wait for pended ext4_convert_unwritten_extents() otherwise following race may happen: ftruncate( ,12288); fallocate( ,0, 4096) io_sibmit( ,0, 4096); /* Write to fallocated area, split extent if needed */ fallocate( ,0, 8192); /* Grow extent and broke assumption about extent */ Later kwork completion will do: ->ext4_convert_unwritten_extents (0, 4096) ->ext4_map_blocks(handle, inode, &map, EXT4_GET_BLOCKS_IO_CONVERT_EXT); ->ext4_ext_map_blocks() /* Will find new extent: ex = [0,2] !!!!!! */ ->ext4_ext_handle_uninitialized_extents() ->ext4_convert_unwritten_extents_endio() /* convert [0,2] extent to initialized, but only[0,1] was written */ Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* | | | | | | | Merge tag 'nfs-for-3.7-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds2012-10-237-32/+22
|\ \ \ \ \ \ \ \ | |_|_|_|_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client bugfixes from Trond Myklebust: - Do not call pnfs_return_layout() from an rpciod context - nfs4_ds_disconnect can cause Oopses. Kill it... - Fix the return value for nfs_callback_start_svc - Fix a number of compile warnings * tag 'nfs-for-3.7-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4: Fix the return value for nfs_callback_start_svc NFSv4.1: Declare osd_pri_2_pnfs_err(), objio_init_read/write to be static NFSv4: fs/nfs/nfs4getroot.c needs to include "internal.h" NFSv4.1: Use kcalloc() to allocate zeroed arrays instead of kzalloc() NFSv4.1: Do not call pnfs_return_layout() from an rpciod context NFSv4.1: Kill nfs4_ds_disconnect()
| * | | | | | | NFSv4: Fix the return value for nfs_callback_start_svcTrond Myklebust2012-10-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | returning PTR_ERR(cb_info->task) just after we have set it to NULL looks like a typo... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Stanislav Kinsbursky <skinsbursky@parallels.com>
| * | | | | | | NFSv4.1: Declare osd_pri_2_pnfs_err(), objio_init_read/write to be staticTrond Myklebust2012-10-161-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | | | | | NFSv4: fs/nfs/nfs4getroot.c needs to include "internal.h"Trond Myklebust2012-10-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a warning about "no previous prototype for ‘nfs4_get_rootfh’" Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | | | | | NFSv4.1: Use kcalloc() to allocate zeroed arrays instead of kzalloc()Trond Myklebust2012-10-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Don't circumvent the array size checks. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | | | | | NFSv4.1: Do not call pnfs_return_layout() from an rpciod contextTrond Myklebust2012-10-152-3/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the call to pnfs_return_layout() to the read and write rpc_release() callbacks, so that it gets called from nfsiod, which is a more appropriate context. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | | | | | NFSv4.1: Kill nfs4_ds_disconnect()Trond Myklebust2012-10-153-24/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is nothing to prevent another thread from dereferencing ds->ds_clp during or after the call to nfs4_ds_disconnect(), and Oopsing due to the resulting NULL pointer. Instead, we should just rely on filelayout_mark_devid_invalid() to keep us out of trouble by avoiding that deviceid. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | | | | | | | char_dev: pin parent kobjectDmitry Torokhov2012-10-221-1/+17
| |_|_|_|_|/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In certain cases (for example when a cdev structure is embedded into another object whose lifetime is controlled by a separate kobject) it is beneficial to tie lifetime of another object to the lifetime of character device so that related object is not freed until after char_dev object is freed. To achieve this let's pin kobject's parent when doing cdev_add() and unpin when last reference to cdev structure is being released. Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | hold task->mempolicy while numa_maps scans.KAMEZAWA Hiroyuki2012-10-192-3/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | /proc/<pid>/numa_maps scans vma and show mempolicy under mmap_sem. It sometimes accesses task->mempolicy which can be freed without mmap_sem and numa_maps can show some garbage while scanning. This patch tries to take reference count of task->mempolicy at reading numa_maps before calling get_vma_policy(). By this, task->mempolicy will not be freed until numa_maps reaches its end. V2->v3 - updated comments to be more verbose. - removed task_lock() in numa_maps code. V1->V2 - access task->mempolicy only once and remember it. Becase kernel/exit.c can overwrite it. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: David Rientjes <rientjes@google.com> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | Merge branch 'for-3.7' of git://linux-nfs.org/~bfields/linuxLinus Torvalds2012-10-192-2/+3
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull nfsd bugfixes from J Bruce Fields. * 'for-3.7' of git://linux-nfs.org/~bfields/linux: SUNRPC: Prevent kernel stack corruption on long values of flush NLM: nlm_lookup_file() may return NLMv4-specific error codes
| * | | | | | | NLM: nlm_lookup_file() may return NLMv4-specific error codesTrond Myklebust2012-10-172-2/+3
| |/ / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the filehandle is stale, or open access is denied for some reason, nlm_fopen() may return one of the NLMv4-specific error codes nlm4_stale_fh or nlm4_failed. These get passed right through nlm_lookup_file(), and so when nlmsvc_retrieve_args() calls the latter, it needs to filter the result through the cast_status() machinery. Failure to do so, will trigger the BUG_ON() in encode_nlm_stat... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Reported-by: Larry McVoy <lm@bitmover.com> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* | | | | | | fs, xattr: fix bug when removing a name not in xattr listDavid Rientjes2012-10-181-1/+1
| |_|/ / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 38f38657444d ("xattr: extract simple_xattr code from tmpfs") moved some code from tmpfs but introduced a subtle bug along the way. If the name passed to simple_xattr_remove() does not exist in the list of xattrs, then it is possible to call kfree(new_xattr) when new_xattr is actually initialized to itself on the stack via uninitialized_var(). This causes a BUG() since the memory was not allocated via the slab allocator and was not bypassed through to the page allocator because it was too large. Initialize the local variable to NULL so the kfree() never takes place. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Hugh Dickins <hughd@google.com> Acked-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | Merge branch 'for_linus' of ↵Linus Torvalds2012-10-165-23/+46
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull ext2, ext3, quota fixes from Jan Kara: "Fix three regressions caused by user namespace conversions (ext2, ext3, quota) and minor ext3 fix and cleanup." * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: quota: Silence warning about PRJQUOTA not being handled in need_print_warning() ext3: fix return values on parse_options() failure ext2: fix return values on parse_options() failure ext3: ext3_bread usage audit ext3: fix possible non-initialized variable on htree_dirblock_to_tree()
| * | | | | | quota: Silence warning about PRJQUOTA not being handled in need_print_warning()Jan Kara2012-10-091-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PRJQUOTA value of quota type should never reach need_print_warning() since XFS (which is the only fs which uses that type) doesn't use generic functions calling this function. Anyway, add PRJQUOTA case to the switch to make gcc happy. Signed-off-by: Jan Kara <jack@suse.cz>
| * | | | | | ext3: fix return values on parse_options() failureZhao Hongjiang2012-10-091-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | parse_options() in ext3 should return 0 when parse the mount options fails. Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com> Reviewed-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz>
| * | | | | | ext2: fix return values on parse_options() failureZhao Hongjiang2012-10-091-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | parse_options() in ext2 should return 0 when parse the mount options fails. Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com> Reviewed-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz>
| * | | | | | ext3: ext3_bread usage auditCarlos Maiolino2012-10-092-18/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the ext3 version of the same patch applied to Ext4, where such goal is to audit the usage of ext3_bread() due a possible misinterpretion of its return value. Focused on directory blocks, a NULL value returned from ext3_bread() means a hole, which cannot exist into a directory inode. It can pass undetected after a fix in an uninitialized error variable. The (now) initialized variable into ext3_getblk() may lead to a zero'ed return value of ext3_bread() to its callers, which can make the caller do not detect the hole in the directory inode. This patch creates a new wrapper function ext3_dir_bread() which checks for holes properly, reports error, and returns EIO in that case. Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz>
| * | | | | | ext3: fix possible non-initialized variable on htree_dirblock_to_tree()Carlos Maiolino2012-10-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of ext4 commit 90b0a9732 which fixes a possible non-initialized variable on htree_dirblock_to_tree(). Ext3 has the same non initialized variable, but, in any case it will be initialized by ext3_get_blocks_handle(), which will avoid the bug to be triggered, but, the non-initialized variable by htree_dirblock_to_tree() is still a bug. Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz>
* | | | | | | Merge branch 'for-linus' of ↵Linus Torvalds2012-10-161-2/+3
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "Fix for my braino in replace_fd(), dhowell's fix for the fallout from over-enthusiastic bo^Wdeclaration movements plus crapectomy that should've happened a long time ago (SEL_... definitions)." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: bury SEL_{IN,OUT,EX} Unexport some bits of linux/fs.h fix a leak in replace_fd() users
| * | | | | | | fix a leak in replace_fd() usersAl Viro2012-10-161-2/+3
| | |/ / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | replace_fd() began with "eats a reference, tries to insert into descriptor table" semantics; at some point I'd switched it to much saner current behaviour ("try to insert into descriptor table, grabbing a new reference if inserted; caller should do fput() in any case"), but forgot to update the callers. Mea culpa... [Spotted by Pavel Roskin, who has really weird system with pipe-fed coredumps as part of what he considers a normal boot ;-)] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | | | | | | mm, mempolicy: fix printing stack contents in numa_mapsDavid Rientjes2012-10-161-2/+5
|/ / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When reading /proc/pid/numa_maps, it's possible to return the contents of the stack where the mempolicy string should be printed if the policy gets freed from beneath us. This happens because mpol_to_str() may return an error the stack-allocated buffer is then printed without ever being stored. There are two possible error conditions in mpol_to_str(): - if the buffer allocated is insufficient for the string to be stored, and - if the mempolicy has an invalid mode. The first error condition is not triggered in any of the callers to mpol_to_str(): at least 50 bytes is always allocated on the stack and this is sufficient for the string to be written. A future patch should convert this into BUILD_BUG_ON() since we know the maximum strlen possible, but that's not -rc material. The second error condition is possible if a race occurs in dropping a reference to a task's mempolicy causing it to be freed during the read(). The slab poison value is then used for the mode and mpol_to_str() returns -EINVAL. This race is only possible because get_vma_policy() believes that mm->mmap_sem protects task->mempolicy, which isn't true. The exit path does not hold mm->mmap_sem when dropping the reference or setting task->mempolicy to NULL: it uses task_lock(task) instead. Thus, it's required for the caller of a task mempolicy to hold task_lock(task) while grabbing the mempolicy and reading it. Callers with a vma policy store their mempolicy earlier and can simply increment the reference count so it's guaranteed not to be freed. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | Merge branch 'modules-next' of ↵Linus Torvalds2012-10-142-7/+7
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module signing support from Rusty Russell: "module signing is the highlight, but it's an all-over David Howells frenzy..." Hmm "Magrathea: Glacier signing key". Somebody has been reading too much HHGTTG. * 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (37 commits) X.509: Fix indefinite length element skip error handling X.509: Convert some printk calls to pr_devel asymmetric keys: fix printk format warning MODSIGN: Fix 32-bit overflow in X.509 certificate validity date checking MODSIGN: Make mrproper should remove generated files. MODSIGN: Use utf8 strings in signer's name in autogenerated X.509 certs MODSIGN: Use the same digest for the autogen key sig as for the module sig MODSIGN: Sign modules during the build process MODSIGN: Provide a script for generating a key ID from an X.509 cert MODSIGN: Implement module signature checking MODSIGN: Provide module signing public keys to the kernel MODSIGN: Automatically generate module signing keys if missing MODSIGN: Provide Kconfig options MODSIGN: Provide gitignore and make clean rules for extra files MODSIGN: Add FIPS policy module: signature checking hook X.509: Add a crypto key parser for binary (DER) X.509 certificates MPILIB: Provide a function to read raw data into an MPI X.509: Add an ASN.1 decoder X.509: Add simple ASN.1 grammar compiler ...
| * | | | | | KEYS: Add payload preparsing opportunity prior to key instantiate or updateDavid Howells2012-10-082-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Give the key type the opportunity to preparse the payload prior to the instantiation and update routines being called. This is done with the provision of two new key type operations: int (*preparse)(struct key_preparsed_payload *prep); void (*free_preparse)(struct key_preparsed_payload *prep); If the first operation is present, then it is called before key creation (in the add/update case) or before the key semaphore is taken (in the update and instantiate cases). The second operation is called to clean up if the first was called. preparse() is given the opportunity to fill in the following structure: struct key_preparsed_payload { char *description; void *type_data[2]; void *payload; const void *data; size_t datalen; size_t quotalen; }; Before the preparser is called, the first three fields will have been cleared, the payload pointer and size will be stored in data and datalen and the default quota size from the key_type struct will be stored into quotalen. The preparser may parse the payload in any way it likes and may store data in the type_data[] and payload fields for use by the instantiate() and update() ops. The preparser may also propose a description for the key by attaching it as a string to the description field. This can be used by passing a NULL or "" description to the add_key() system call or the key_create_or_update() function. This cannot work with request_key() as that required the description to tell the upcall about the key to be created. This, for example permits keys that store PGP public keys to generate their own name from the user ID and public key fingerprint in the key. The instantiate() and update() operations are then modified to look like this: int (*instantiate)(struct key *key, struct key_preparsed_payload *prep); int (*update)(struct key *key, struct key_preparsed_payload *prep); and the new payload data is passed in *prep, whether or not it was preparsed. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* | | | | | | Merge branch 'for-linus' of ↵Linus Torvalds2012-10-132-3/+3
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace compile fixes from Eric W Biederman: "This tree contains three trivial fixes. One compiler warning, one thinko fix, and one build fix" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: btrfs: Fix compilation with user namespace support enabled userns: Fix posix_acl_file_xattr_userns gid conversion userns: Properly print bluetooth socket uids
| * | | | | | | btrfs: Fix compilation with user namespace support enabledEric W. Biederman2012-10-121-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When compiling with user namespace support btrfs fails like: fs/btrfs/tree-log.c: In function ‘fill_inode_item’: fs/btrfs/tree-log.c:2955:2: error: incompatible type for argument 3 of ‘btrfs_set_inode_uid’ fs/btrfs/ctree.h:2026:1: note: expected ‘u32’ but argument is of type ‘kuid_t’ fs/btrfs/tree-log.c:2956:2: error: incompatible type for argument 3 of ‘btrfs_set_inode_gid’ fs/btrfs/ctree.h:2027:1: note: expected ‘u32’ but argument is of type ‘kgid_t’ Fix this by using i_uid_read and i_gid_read in Cc: Chris Mason <chris.mason@fusionio.com> Cc: Josef Bacik <jbacik@fusionio.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * | | | | | | userns: Fix posix_acl_file_xattr_userns gid conversionEric W. Biederman2012-10-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code needs to be from_kgid(make_kgid(...)...) not from_kuid(make_kgid(...)...). Doh! Reported-by: Jan Kara <jack@suse.cz> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* | | | | | | | Merge branch 'for-3.7' of git://linux-nfs.org/~bfields/linuxLinus Torvalds2012-10-1315-305/+227
|\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull nfsd update from J Bruce Fields: "Another relatively quiet cycle. There was some progress on my remaining 4.1 todo's, but a couple of them were just of the form "check that we do X correctly", so didn't have much affect on the code. Other than that, a bunch of cleanup and some bugfixes (including an annoying NFSv4.0 state leak and a busy-loop in the server that could cause it to peg the CPU without making progress)." * 'for-3.7' of git://linux-nfs.org/~bfields/linux: (46 commits) UAPI: (Scripted) Disintegrate include/linux/sunrpc UAPI: (Scripted) Disintegrate include/linux/nfsd nfsd4: don't allow reclaims of expired clients nfsd4: remove redundant callback probe nfsd4: expire old client earlier nfsd4: separate session allocation and initialization nfsd4: clean up session allocation nfsd4: minor free_session cleanup nfsd4: new_conn_from_crses should only allocate nfsd4: separate connection allocation and initialization nfsd4: reject bad forechannel attrs earlier nfsd4: enforce per-client sessions/no-sessions distinction nfsd4: set cl_minorversion at create time nfsd4: don't pin clientids to pseudoflavors nfsd4: fix bind_conn_to_session xdr comment nfsd4: cast readlink() bug argument NFSD: pass null terminated buf to kstrtouint() nfsd: remove duplicate init in nfsd4_cb_recall nfsd4: eliminate redundant nfs4_free_stateid fs/nfsd/nfs4idmap.c: adjust inconsistent IS_ERR and PTR_ERR ...
| * \ \ \ \ \ \ \ Merge Trond's bugfixesJ. Bruce Fields2012-10-112-3/+3
| |\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge branch 'bugfixes' of git://linux-nfs.org/~trondmy/nfs-2.6 into for-3.7-incoming. Mainly needed for Bryan's "SUNRPC: Set alloc_slot for backchannel tcp ops", without which the 4.1 server oopses.
| * \ \ \ \ \ \ \ \ nfs: disintegrate UAPI for nfsJ. Bruce Fields2012-10-09399-7962/+14232
| |\ \ \ \ \ \ \ \ \ | | | |_|_|/ / / / / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is to complete part of the Userspace API (UAPI) disintegration for which the preparatory patches were pulled recently. After these patches, userspace headers will be segregated into: include/uapi/linux/.../foo.h for the userspace interface stuff, and: include/linux/.../foo.h for the strictly kernel internal stuff. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | nfsd4: don't allow reclaims of expired clientsJ. Bruce Fields2012-10-011-8/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a confirmed client expires, we normally also need to expire any stable storage record which would allow that client to reclaim state on the next boot. We forgot to do this in some cases. (For example, in destroy_clientid, and in the cases in exchange_id and create_session that destroy and existing confirmed client.) But in most other cases, there's really no harm to calling nfsd4_client_record_remove(), because it is a no-op in the case the client doesn't have an existing The single exception is destroying a client on shutdown, when we want to keep the stable storage records so we can recognize which clients will be allowed to reclaim when we come back up. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | nfsd4: remove redundant callback probeJ. Bruce Fields2012-10-011-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Both nfsd4_init_conn and alloc_init_session are probing the callback channel, harmless but pointless. Also, nfsd4_init_conn should probably be probing in the "unknown" case as well. In fact I don't see any harm to just doing it unconditionally when we get a new backchannel connection. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | nfsd4: expire old client earlierJ. Bruce Fields2012-10-011-10/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before we had to delay expiring a client till we'd found out whether the session and connection allocations would succeed. That's no longer necessary. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | nfsd4: separate session allocation and initializationJ. Bruce Fields2012-10-011-23/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This will allow some further simplification. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | nfsd4: clean up session allocationJ. Bruce Fields2012-10-011-12/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | nfsd4: minor free_session cleanupJ. Bruce Fields2012-10-011-10/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | nfsd4: new_conn_from_crses should only allocateJ. Bruce Fields2012-10-011-16/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Do the initialization in the caller, and clarify that the only failure ever possible here was due to allocation. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | nfsd4: separate connection allocation and initializationJ. Bruce Fields2012-10-011-10/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It'll be useful to have connection allocation and initialization as separate functions. Also, note we'd been ignoring the alloc_conn error return in bind_conn_to_session. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | nfsd4: reject bad forechannel attrs earlierJ. Bruce Fields2012-10-011-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This could simplify the logic a little later. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | nfsd4: enforce per-client sessions/no-sessions distinctionJ. Bruce Fields2012-10-013-22/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Something like creating a client with setclientid and then trying to confirm it with create_session may not crash the server, but I'm not completely positive of that, and in any case it's obviously bad client behavior. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | nfsd4: set cl_minorversion at create timeJ. Bruce Fields2012-10-011-10/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | And remove some mostly obsolete comments. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | nfsd4: don't pin clientids to pseudoflavorsJ. Bruce Fields2012-10-011-1/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I added cr_flavor to the data compared in same_creds without any justification, in d5497fc693a446ce9100fcf4117c3f795ddfd0d2 "nfsd4: move rq_flavor into svc_cred". Recent client changes then started making mount -osec=krb5 server:/export /mnt/ echo "hello" >/mnt/TMP umount /mnt/ mount -osec=krb5i server:/export /mnt/ echo "hello" >/mnt/TMP to fail due to a clid_inuse on the second open. Mounting sequentially like this with different flavors probably isn't that common outside artificial tests. Also, the real bug here may be that the server isn't just destroying the former clientid in this case (because it isn't good enough at recognizing when the old state is gone). But it prompted some discussion and a look back at the spec, and I think the check was probably wrong. Fix and document. Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | nfsd4: fix bind_conn_to_session xdr commentJ. Bruce Fields2012-09-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | nfsd4: cast readlink() bug argumentJ. Bruce Fields2012-09-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As we already do in readv, writev. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
OpenPOWER on IntegriCloud