summaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAgeFilesLines
...
| | * | | Orangefs: xattr.c cleanupMike Marshall2016-04-081-16/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. It is nonsense to test for negative size_t, suggested by David Binderman <dcb314@hotmail.com> 2. By the time Orangefs gets called, the vfs has ensured that name != NULL, and that buffer and size are sane. Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | | | Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds2016-04-0717-86/+244
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 bugfixes from Ted Ts'o: "These changes contains a fix for overlayfs interacting with some (badly behaved) dentry code in various file systems. These have been reviewed by Al and the respective file system mtinainers and are going through the ext4 tree for convenience. This also has a few ext4 encryption bug fixes that were discovered in Android testing (yes, we will need to get these sync'ed up with the fs/crypto code; I'll take care of that). It also has some bug fixes and a change to ignore the legacy quota options to allow for xfstests regression testing of ext4's internal quota feature and to be more consistent with how xfs handles this case" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: ignore quota mount options if the quota feature is enabled ext4 crypto: fix some error handling ext4: avoid calling dquot_get_next_id() if quota is not enabled ext4: retry block allocation for failed DIO and DAX writes ext4: add lockdep annotations for i_data_sem ext4: allow readdir()'s of large empty directories to be interrupted btrfs: fix crash/invalid memory access on fsync when using overlayfs ext4 crypto: use dget_parent() in ext4_d_revalidate() ext4: use file_dentry() ext4: use dget_parent() in ext4_file_open() nfs: use file_dentry() fs: add file_dentry() ext4 crypto: don't let data integrity writebacks fail with ENOMEM ext4: check if in-inode xattr is corrupted in ext4_expand_extra_isize_ea()
| | * | | | ext4: ignore quota mount options if the quota feature is enabledTheodore Ts'o2016-04-031-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, ext4 would fail the mount if the file system had the quota feature enabled and quota mount options (used for the older quota setups) were present. This broke xfstests, since xfs silently ignores the usrquote and grpquota mount options if they are specified. This commit changes things so that we are consistent with xfs; having the mount options specified is harmless, so no sense break users by forbidding them. Cc: stable@vger.kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>
| | * | | | ext4 crypto: fix some error handlingDan Carpenter2016-04-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We should be testing for -ENOMEM but the minus sign is missing. Fixes: c9af28fdd449 ('ext4 crypto: don't let data integrity writebacks fail with ENOMEM') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
| | * | | | ext4: avoid calling dquot_get_next_id() if quota is not enabledTheodore Ts'o2016-04-011-1/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This should be fixed in the quota layer so we can test with the quota mutex held, but for now, we need this to avoid tests from crashing the kernel aborting the regression test suite. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
| | * | | | ext4: retry block allocation for failed DIO and DAX writesJan Kara2016-04-011-30/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently if block allocation for DIO or DAX write fails due to ENOSPC, we just returned it to userspace. However these ENOSPC errors can be transient because the transaction freeing blocks has not yet committed. This demonstrates as failures of generic/102 test when the filesystem is mounted with 'dax' mount option. Fix the problem by properly retrying the allocation in case of ENOSPC error in get blocks functions used for direct IO. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com>
| | * | | | ext4: add lockdep annotations for i_data_semTheodore Ts'o2016-04-013-4/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the internal Quota feature, mke2fs creates empty quota inodes and quota usage tracking is enabled as soon as the file system is mounted. Since quotacheck is no longer preallocating all of the blocks in the quota inode that are likely needed to be written to, we are now seeing a lockdep false positive caused by needing to allocate a quota block from inside ext4_map_blocks(), while holding i_data_sem for a data inode. This results in this complaint: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&ei->i_data_sem); lock(&s->s_dquot.dqio_mutex); lock(&ei->i_data_sem); lock(&s->s_dquot.dqio_mutex); Google-Bug-Id: 27907753 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
| | * | | | ext4: allow readdir()'s of large empty directories to be interruptedTheodore Ts'o2016-03-302-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a directory has a large number of empty blocks, iterating over all of them can take a long time, leading to scheduler warnings and users getting irritated when they can't kill a process in the middle of one of these long-running readdir operations. Fix this by adding checks to ext4_readdir() and ext4_htree_fill_tree(). Reported-by: Benjamin LaHaise <bcrl@kvack.org> Google-Bug-Id: 27880676 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
| | * | | | btrfs: fix crash/invalid memory access on fsync when using overlayfsFilipe Manana2016-03-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the lower or upper directory of an overlayfs mount belong to a btrfs file system and we fsync the file through the overlayfs' merged directory we ended up accessing an inode that didn't belong to btrfs as if it were a btrfs inode at btrfs_sync_file() resulting in a crash like the following: [ 7782.588845] BUG: unable to handle kernel NULL pointer dereference at 0000000000000544 [ 7782.590624] IP: [<ffffffffa030b7ab>] btrfs_sync_file+0x11b/0x3e9 [btrfs] [ 7782.591931] PGD 4d954067 PUD 1e878067 PMD 0 [ 7782.592016] Oops: 0002 [#6] PREEMPT SMP DEBUG_PAGEALLOC [ 7782.592016] Modules linked in: btrfs overlay ppdev crc32c_generic evdev xor raid6_pq psmouse pcspkr sg serio_raw acpi_cpufreq parport_pc parport tpm_tis i2c_piix4 tpm i2c_core processor button loop autofs4 ext4 crc16 mbcache jbd2 sr_mod cdrom sd_mod ata_generic virtio_scsi ata_piix virtio_pci libata virtio_ring virtio scsi_mod e1000 floppy [last unloaded: btrfs] [ 7782.592016] CPU: 10 PID: 16437 Comm: xfs_io Tainted: G D 4.5.0-rc6-btrfs-next-26+ #1 [ 7782.592016] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014 [ 7782.592016] task: ffff88001b8d40c0 ti: ffff880137488000 task.ti: ffff880137488000 [ 7782.592016] RIP: 0010:[<ffffffffa030b7ab>] [<ffffffffa030b7ab>] btrfs_sync_file+0x11b/0x3e9 [btrfs] [ 7782.592016] RSP: 0018:ffff88013748be40 EFLAGS: 00010286 [ 7782.592016] RAX: 0000000080000000 RBX: ffff880133b30c88 RCX: 0000000000000001 [ 7782.592016] RDX: 0000000000000001 RSI: ffffffff8148fec0 RDI: 00000000ffffffff [ 7782.592016] RBP: ffff88013748bec0 R08: 0000000000000001 R09: 0000000000000000 [ 7782.624248] R10: ffff88013748be40 R11: 0000000000000246 R12: 0000000000000000 [ 7782.624248] R13: 0000000000000000 R14: 00000000009305a0 R15: ffff880015e3be40 [ 7782.624248] FS: 00007fa83b9cb700(0000) GS:ffff88023ed40000(0000) knlGS:0000000000000000 [ 7782.624248] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 7782.624248] CR2: 0000000000000544 CR3: 00000001fa652000 CR4: 00000000000006e0 [ 7782.624248] Stack: [ 7782.624248] ffffffff8108b5cc ffff88013748bec0 0000000000000246 ffff8800b005ded0 [ 7782.624248] ffff880133b30d60 8000000000000000 7fffffffffffffff 0000000000000246 [ 7782.624248] 0000000000000246 ffffffff81074f9b ffffffff8104357c ffff880015e3be40 [ 7782.624248] Call Trace: [ 7782.624248] [<ffffffff8108b5cc>] ? arch_local_irq_save+0x9/0xc [ 7782.624248] [<ffffffff81074f9b>] ? ___might_sleep+0xce/0x217 [ 7782.624248] [<ffffffff8104357c>] ? __do_page_fault+0x3c0/0x43a [ 7782.624248] [<ffffffff811a2351>] vfs_fsync_range+0x8c/0x9e [ 7782.624248] [<ffffffff811a237f>] vfs_fsync+0x1c/0x1e [ 7782.624248] [<ffffffff811a24d6>] do_fsync+0x31/0x4a [ 7782.624248] [<ffffffff811a2700>] SyS_fsync+0x10/0x14 [ 7782.624248] [<ffffffff81493617>] entry_SYSCALL_64_fastpath+0x12/0x6b [ 7782.624248] Code: 85 c0 0f 85 e2 02 00 00 48 8b 45 b0 31 f6 4c 29 e8 48 ff c0 48 89 45 a8 48 8d 83 d8 00 00 00 48 89 c7 48 89 45 a0 e8 fc 43 18 e1 <f0> 41 ff 84 24 44 05 00 00 48 8b 83 58 ff ff ff 48 c1 e8 07 83 [ 7782.624248] RIP [<ffffffffa030b7ab>] btrfs_sync_file+0x11b/0x3e9 [btrfs] [ 7782.624248] RSP <ffff88013748be40> [ 7782.624248] CR2: 0000000000000544 [ 7782.661994] ---[ end trace 721e14960eb939bc ]--- This started happening since commit 4bacc9c9234 (overlayfs: Make f_path always point to the overlay and f_inode to the underlay) and even though after this change we could still access the btrfs inode through struct file->f_mapping->host or struct file->f_inode, we would end up resulting in more similar issues later on at check_parent_dirs_for_sync() because the dentry we got (from struct file->f_path.dentry) was from overlayfs and not from btrfs, that is, we had no way of getting the dentry that belonged to btrfs (we always got the dentry that belonged to overlayfs). The new patch from Miklos Szeredi, titled "vfs: add file_dentry()" and recently submitted to linux-fsdevel, adds a file_dentry() API that allows us to get the btrfs dentry from the input file and therefore being able to fsync when the upper and lower directories belong to btrfs filesystems. This issue has been reported several times by users in the mailing list and bugzilla. A test case for xfstests is being submitted as well. Fixes: 4bacc9c9234c ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay") Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=101951 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=109791 Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com> Cc: stable@vger.kernel.org
| | * | | | ext4 crypto: use dget_parent() in ext4_d_revalidate()Theodore Ts'o2016-03-261-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This avoids potential problems caused by a race where the inode gets renamed out from its parent directory and the parent directory is deleted while ext4_d_revalidate() is running. Fixes: 28b4c263961c Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
| | * | | | ext4: use file_dentry()Miklos Szeredi2016-03-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | EXT4 may be used as lower layer of overlayfs and accessing f_path.dentry can lead to a crash. Fix by replacing direct access of file->f_path.dentry with the file_dentry() accessor, which will always return a native object. Reported-by: Daniel Axtens <dja@axtens.net> Fixes: 4bacc9c9234c ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay") Fixes: ff978b09f973 ("ext4 crypto: move context consistency check to ext4_file_open()") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> # v4.5
| | * | | | ext4: use dget_parent() in ext4_file_open()Miklos Szeredi2016-03-261-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In f_op->open() lock on parent is not held, so there's no guarantee that parent dentry won't go away at any time. Even after this patch there's no guarantee that 'dir' will stay the parent of 'inode', but at least it won't be freed while being used. Fixes: ff978b09f973 ("ext4 crypto: move context consistency check to ext4_file_open()") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: <stable@vger.kernel.org> # v4.5
| | * | | | nfs: use file_dentry()Miklos Szeredi2016-03-263-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NFS may be used as lower layer of overlayfs and accessing f_path.dentry can lead to a crash. Fix by replacing direct access of file->f_path.dentry with the file_dentry() accessor, which will always return a native object. Fixes: 4bacc9c9234c ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Tested-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Acked-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: <stable@vger.kernel.org> # v4.2 Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk>
| | * | | | fs: add file_dentry()Miklos Szeredi2016-03-262-1/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This series fixes bugs in nfs and ext4 due to 4bacc9c9234c ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay"). Regular files opened on overlayfs will result in the file being opened on the underlying filesystem, while f_path points to the overlayfs mount/dentry. This confuses filesystems which get the dentry from struct file and assume it's theirs. Add a new helper, file_dentry() [*], to get the filesystem's own dentry from the file. This checks file->f_path.dentry->d_flags against DCACHE_OP_REAL, and returns file->f_path.dentry if DCACHE_OP_REAL is not set (this is the common, non-overlayfs case). In the uncommon case it will call into overlayfs's ->d_real() to get the underlying dentry, matching file_inode(file). The reason we need to check against the inode is that if the file is copied up while being open, d_real() would return the upper dentry, while the open file comes from the lower dentry. [*] If possible, it's better simply to use file_inode() instead. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Tested-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Reviewed-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: <stable@vger.kernel.org> # v4.2 Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Daniel Axtens <dja@axtens.net>
| | * | | | ext4 crypto: don't let data integrity writebacks fail with ENOMEMTheodore Ts'o2016-03-264-20/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We don't want the writeback triggered from the journal commit (in data=writeback mode) to cause the journal to abort due to generic_writepages() returning an ENOMEM error. In addition, if fsync() fails with ENOMEM, most applications will probably not do the right thing. So if we are doing a data integrity sync, and ext4_encrypt() returns ENOMEM, we will submit any queued I/O to date, and then retry the allocation using GFP_NOFAIL. Google-Bug-Id: 27641567 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
| | * | | | ext4: check if in-inode xattr is corrupted in ext4_expand_extra_isize_ea()Theodore Ts'o2016-03-221-4/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We aren't checking to see if the in-inode extended attribute is corrupted before we try to expand the inode's extra isize fields. This can lead to potential crashes caused by the BUG_ON() check in ext4_xattr_shift_entries(). Signed-off-by: Theodore Ts'o <tytso@mit.edu>
| * | | | | Merge branch 'for_linus' of ↵Linus Torvalds2016-04-042-4/+20
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull quota fixes from Jan Kara: "Fixes for oopses when the new quotactl gets used with quotas disabled" * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: ocfs2: Fix Q_GETNEXTQUOTA for filesystem without quotas quota: Handle Q_GETNEXTQUOTA when quota is disabled
| | * | | | | ocfs2: Fix Q_GETNEXTQUOTA for filesystem without quotasJan Kara2016-03-291-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When Q_GETNEXTQUOTA was called for filesystem with quotas disabled ocfs2_get_next_id() oopses. Fix the problem by checking early whether the filesystem has quotas enabled. Signed-off-by: Jan Kara <jack@suse.cz>
| | * | | | | quota: Handle Q_GETNEXTQUOTA when quota is disabledJan Kara2016-03-291-2/+11
| | | |_|_|/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we oopsed when Q_GETNEXTQUOTA got called when quota was disabled. Properly check whether quota is enabled for the filesystem before calling into quota format handler. Reported-by: Ted Tso <tytso@mit.edu> Signed-off-by: Jan Kara <jack@suse.cz>
| * | | | | Merge tag 'f2fs-for-linus' of ↵Linus Torvalds2016-04-042-44/+72
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs fixes from Jaegeuk Kim. * tag 'f2fs-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: f2fs: retrieve IO write stat from the right place f2fs crypto: fix corrupted symlink in encrypted case f2fs: cover large section in sanity check of super
| | * | | | | f2fs: retrieve IO write stat from the right placeShuoran Liu2016-03-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the following patch, f2fs: split journal cache from curseg cache journal cache is split from curseg cache. So IO write statistics should be retrived from journal cache but not curseg->sum_blk. Otherwise, it will get 0, and the stat is lost. Signed-off-by: Shuoran Liu <liushuoran@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| | * | | | | f2fs crypto: fix corrupted symlink in encrypted caseJaegeuk Kim2016-03-301-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the encrypted symlink case, we should check its corrupted symname after decrypting it. Otherwise, we can report -ENOENT incorrectly, if encrypted symname starts with '\0'. Cc: stable 4.5+ <stable@vger.kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| | * | | | | f2fs: cover large section in sanity check of superJaegeuk Kim2016-03-281-37/+65
| | |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes the bug which does not cover a large section case when checking the sanity of superblock. If f2fs detects misalignment, it will fix the superblock during the mount time, so it doesn't need to trigger fsck.f2fs further. Reported-by: Matthias Prager <linux@matthiasprager.de> Reported-by: David Gnedt <david.gnedt@davizone.at> Cc: stable 4.5+ <stable@vger.kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | | | | Merge branch 'PAGE_CACHE_SIZE-removal'Linus Torvalds2016-04-04247-2183/+2173
| |\ \ \ \ \ | | |_|_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge PAGE_CACHE_SIZE removal patches from Kirill Shutemov: "PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. Let's stop pretending that pages in page cache are special. They are not. The first patch with most changes has been done with coccinelle. The second is manual fixups on top. The third patch removes macros definition" [ I was planning to apply this just before rc2, but then I spaced out, so here it is right _after_ rc2 instead. As Kirill suggested as a possibility, I could have decided to only merge the first two patches, and leave the old interfaces for compatibility, but I'd rather get it all done and any out-of-tree modules and patches can trivially do the converstion while still also working with older kernels, so there is little reason to try to maintain the redundant legacy model. - Linus ] * PAGE_CACHE_SIZE-removal: mm: drop PAGE_CACHE_* and page_cache_{get,release} definition mm, fs: remove remaining PAGE_CACHE_* and page_cache_{get,release} usage mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros
| | * | | | mm, fs: remove remaining PAGE_CACHE_* and page_cache_{get,release} usageKirill A. Shutemov2016-04-0434-89/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mostly direct substitution with occasional adjustment or removing outdated comments. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| | * | | | mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macrosKirill A. Shutemov2016-04-04243-2096/+2097
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | rhashtable: accept GFP flags in rhashtable_walk_initBob Copeland2016-04-051-2/+2
|/ / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In certain cases, the 802.11 mesh pathtable code wants to iterate over all of the entries in the forwarding table from the receive path, which is inside an RCU read-side critical section. Enable walks inside atomic sections by allowing GFP_ATOMIC allocations for the walker state. Change all existing callsites to pass in GFP_KERNEL. Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Bob Copeland <me@bobcopeland.com> [also adjust gfs2/glock.c and rhashtable tests] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
* | | | | Merge branch 'for-linus-4.6' of ↵Linus Torvalds2016-04-011-20/+25
|\ \ \ \ \ | | |_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "This has a few fixes Dave Sterba had queued up. These are all pretty small, but since they were tested I decided against waiting for more" * 'for-linus-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: btrfs: transaction_kthread() is not freezable btrfs: cleaner_kthread() doesn't need explicit freeze btrfs: do not write corrupted metadata blocks to disk btrfs: csum_tree_block: return proper errno value
| * | | | Merge branch 'misc-4.6' of ↵Chris Mason2016-03-241-20/+25
| |\ \ \ \ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.6
| | * | | | btrfs: transaction_kthread() is not freezableJiri Kosina2016-03-221-9/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | transaction_kthread() is calling try_to_freeze(), but that's just an expeinsive no-op given the fact that the thread is not marked freezable. After removing this, disk-io.c is now independent on freezer API. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | | | btrfs: cleaner_kthread() doesn't need explicit freezeJiri Kosina2016-03-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cleaner_kthread() is not marked freezable, and therefore calling try_to_freeze() in its context is a pointless no-op. In addition to that, as has been clearly demonstrated by 80ad623edd2d ("Revert "btrfs: clear PF_NOFREEZE in cleaner_kthread()"), it's perfectly valid / legal for cleaner_kthread() to stay scheduled out in an arbitrary place during suspend (in that particular example that was waiting for reading of extent pages), so there is no need to leave any traces of freezer in this kthread. Fixes: 80ad623edd2d ("Revert "btrfs: clear PF_NOFREEZE in cleaner_kthread()") Fixes: 696249132158 ("btrfs: clear PF_NOFREEZE in cleaner_kthread()") Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | | | btrfs: do not write corrupted metadata blocks to diskAlex Lyakas2016-03-221-2/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | csum_dirty_buffer was issuing a warning in case the extent buffer did not look alright, but was still returning success. Let's return error in this case, and also add an additional sanity check on the extent buffer header. The caller up the chain may BUG_ON on this, for example flush_epd_write_bio will, but it is better than to have a silent metadata corruption on disk. Signed-off-by: Alex Lyakas <alex@zadarastorage.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | | | btrfs: csum_tree_block: return proper errno valueAlex Lyakas2016-03-221-8/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Alex Lyakas <alex@zadarastorage.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
* | | | | | Merge tag 'for-linus' of git://github.com/martinbrandenburg/linuxLinus Torvalds2016-04-012-6/+4
|\ \ \ \ \ \ | |_|_|/ / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull OrangeFS fixes from Martin Brandenburg: "Two bugfixes for OrangeFS. One is a reference counting bug and the other is a typo in client minimum version" * tag 'for-linus' of git://github.com/martinbrandenburg/linux: orangefs: minimum userspace version is 2.9.3 orangefs: don't put readdir slot twice
| * | | | | orangefs: minimum userspace version is 2.9.3Martin Brandenburg2016-03-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Version 2.9.4 isn't even released yet. Signed-off-by: Martin Brandenburg <martin@omnibond.com>
| * | | | | orangefs: don't put readdir slot twiceMartin Brandenburg2016-03-311-5/+3
| | |_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was quite an oversight. After a readdir, the module could not be unloaded, the number of slots is wrong, and memory near the slot bitmap is possibly corrupt. Oops. Signed-off-by: Martin Brandenburg <martin@omnibond.com>
* | | | | Merge branch 'for-linus' of ↵Linus Torvalds2016-03-311-4/+6
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fix from Al Viro. Automount handling was broken by commit e3c13928086f ("namei: massage lookup_slow() to be usable by lookup_one_len_unlocked()") moving the test for negative dentry too early. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fix the braino in "namei: massage lookup_slow() to be usable by lookup_one_len_unlocked()"
| * | | | | fix the braino in "namei: massage lookup_slow() to be usable by ↵Al Viro2016-03-311-4/+6
| |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | lookup_one_len_unlocked()" We should try to trigger automount *before* bailing out on negative dentry. Reported-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Reported-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Reported-by: Arend van Spriel <arend@broadcom.com> Tested-by: Arend van Spriel <arend@broadcom.com> Tested-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | | | | dlm: config: Fix ENOMEM failures in make_cluster()Andrew Price2016-03-291-2/+1
|/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 1ae1602de0 "configfs: switch ->default groups to a linked list" left the NULL gps pointer behind after removing the kcalloc() call which made it non-NULL. It also left the !gps check in place so make_cluster() now fails with ENOMEM. Remove the remaining uses of the gps variable to fix that. Reviewed-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Andrew Price <anprice@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
* | | | Merge branch 'for-linus' of ↵Linus Torvalds2016-03-2611-218/+419
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph updates from Sage Weil: "There is quite a bit here, including some overdue refactoring and cleanup on the mon_client and osd_client code from Ilya, scattered writeback support for CephFS and a pile of bug fixes from Zheng, and a few random cleanups and fixes from others" [ I already decided not to pull this because of it having been rebased recently, but ended up changing my mind after all. Next time I'll really hold people to it. Oh well. - Linus ] * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (34 commits) libceph: use KMEM_CACHE macro ceph: use kmem_cache_zalloc rbd: use KMEM_CACHE macro ceph: use lookup request to revalidate dentry ceph: kill ceph_get_dentry_parent_inode() ceph: fix security xattr deadlock ceph: don't request vxattrs from MDS ceph: fix mounting same fs multiple times ceph: remove unnecessary NULL check ceph: avoid updating directory inode's i_size accidentally ceph: fix race during filling readdir cache libceph: use sizeof_footer() more ceph: kill ceph_empty_snapc ceph: fix a wrong comparison ceph: replace CURRENT_TIME by current_fs_time() ceph: scattered page writeback libceph: add helper that duplicates last extent operation libceph: enable large, variable-sized OSD requests libceph: osdc->req_mempool should be backed by a slab pool libceph: make r_request msg_size calculation clearer ...
| * | | | ceph: use kmem_cache_zallocGeliang Tang2016-03-252-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use kmem_cache_zalloc() instead of kmem_cache_alloc() with flag GFP_ZERO. Signed-off-by: Geliang Tang <geliangtang@163.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
| * | | | ceph: use lookup request to revalidate dentryYan, Zheng2016-03-252-0/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If dentry has no lease, ceph_d_revalidate() previously return 0. This causes VFS to invalidate the dentry and create a new dentry for later lookup. Invalidating a dentry also detach any underneath mount points. So mount point inside cephfs can disapear mystically (even the mount point is not modified by other hosts). The fix is using lookup request to revalidate dentry without lease. This can partly solve the mount points disapear issue (as long as the mount point is not modified by other hosts) Signed-off-by: Yan, Zheng <zyan@redhat.com>
| * | | | ceph: kill ceph_get_dentry_parent_inode()Yan, Zheng2016-03-252-20/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | use vfs helper dget_parent() instead Signed-off-by: Yan, Zheng <zyan@redhat.com>
| * | | | ceph: fix security xattr deadlockYan, Zheng2016-03-257-10/+123
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When security is enabled, security module can call filesystem's getxattr/setxattr callbacks during d_instantiate(). For cephfs, d_instantiate() is usually called by MDS' dispatch thread, while handling MDS reply. If the MDS reply does not include xattrs and corresponding caps, getxattr/setxattr need to send a new request to MDS and waits for the reply. This makes MDS' dispatch sleep, nobody handles later MDS replies. The fix is make sure lookup/atomic_open reply include xattrs and corresponding caps. So getxattr can be handled by cached xattrs. This requires some modification to both MDS and request message. (Client tells MDS what caps it wants; MDS encodes proper caps in the reply) Smack security module may call setxattr during d_instantiate(). Unlike getxattr, we can't force MDS to issue CEPH_CAP_XATTR_EXCL to us. So just make setxattr return error when called by MDS' dispatch thread. Signed-off-by: Yan, Zheng <zyan@redhat.com>
| * | | | ceph: don't request vxattrs from MDSYan, Zheng2016-03-251-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's uselese because MDS reply does not carry any vxattr. Signed-off-by: Yan, Zheng <zyan@redhat.com>
| * | | | ceph: fix mounting same fs multiple timesYan, Zheng2016-03-251-18/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now __ceph_open_session() only accepts closed client. An opened client will tigger BUG_ON(). Signed-off-by: Yan, Zheng <zyan@redhat.com>
| * | | | ceph: remove unnecessary NULL checkYan, Zheng2016-03-251-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If page->mapping is NULL, releasepage() callback does not get called. Remove the unnecessary NULL check to make static code analysis tool happy Signed-off-by: Yan, Zheng <zyan@redhat.com>
| * | | | ceph: avoid updating directory inode's i_size accidentallyYan, Zheng2016-03-251-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Directory inode's i_size is used by readdir cache. Signed-off-by: Yan, Zheng <zyan@redhat.com>
| * | | | ceph: fix race during filling readdir cacheYan, Zheng2016-03-251-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Readdir cache uses page cache to save dentry pointers. When adding dentry pointers to middle of a page, we need to make sure the page already exists. Otherwise the beginning part of the page will be invalid pointers. Signed-off-by: Yan, Zheng <zyan@redhat.com>
| * | | | ceph: kill ceph_empty_snapcIlya Dryomov2016-03-254-34/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ceph_empty_snapc->num_snaps == 0 at all times. Passing such a snapc to ceph_osdc_alloc_request() (possibly through ceph_osdc_new_request()) is equivalent to passing NULL, as ceph_osdc_alloc_request() uses it only for sizing the request message. Further, in all four cases the subsequent ceph_osdc_build_request() is passed NULL for snapc, meaning that 0 is encoded for seq and num_snaps and making ceph_empty_snapc entirely useless. The two cases where it actually mattered were removed in commits 860560904962 ("ceph: avoid sending unnessesary FLUSHSNAP message") and 23078637e054 ("ceph: fix queuing inode to mdsdir's snaprealm"). Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Yan, Zheng <zyan@redhat.com>
OpenPOWER on IntegriCloud