summaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAgeFilesLines
...
| * | | | | | NFSv4.1/pnfs: Don't ask for a read layout for an empty file.Trond Myklebust2015-08-311-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFSv4.1: Fix a protocol issue with CLOSE stateidsTrond Myklebust2015-08-301-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to RFC5661 Section 18.2.4, CLOSE is supposed to return the zero stateid. This means that nfs_clear_open_stateid_locked() cannot assume that the result stateid will always match the 'other' field of the existing open stateid when trying to determine a race with a parallel OPEN. Instead, we look at the argument, and check for matches. Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFSv4.1/flexfiles: Don't mark the entire deviceid as bad for file errorsTrond Myklebust2015-08-301-8/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the file was fenced and/or has been deleted on the DS, then we want to retry pNFS after a layoutreturn with error report. If the server cannot fix the problem, then we rely on it to tell us so in the response to the LAYOUTGET. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFSv4.1/pnfs: Ensure layoutreturn reserves space for the opaque payloadTrond Myklebust2015-08-271-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The "FIXME" is outdated. Flexfiles does add a payload. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFSv4.1/flexfiles: Fix a protocol error in layoutreturnTrond Myklebust2015-08-271-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to the flexfiles protocol, the layoutreturn should specify an array of errors in the following format: struct ff_ioerr4 { offset4 ffie_offset; length4 ffie_length; stateid4 ffie_stateid; device_error4 ffie_errors<>; }; This patch fixes up the code to ensure that our ffie_errors is indeed encoded as an array (albeit with only a single entry). Reported-by: Tom Haynes <thomas.haynes@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFS: Send attributes in OPEN request for NFS4_CREATE_EXCLUSIVE4_1Kinglong Mee2015-08-272-12/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Client sends a SETATTR request after OPEN for updating attributes. For create file with S_ISGID is set, the S_ISGID in SETATTR will be ignored at nfs server as chmod of no PERMISSION. v3, same as v2. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFS: Get suppattr_exclcreat when getting server capabilitiesKinglong Mee2015-08-272-6/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Create file with attributs as NFS4_CREATE_EXCLUSIVE4_1 mode depends on suppattr_exclcreat attribut. v3, same as v2. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFS: Make opened as optional argument in _nfs4_do_openKinglong Mee2015-08-272-5/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Check opened, only update it when non-NULL. It's not needs define an unused value for the opened when calling _nfs4_do_open. v3, same as v2. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFS: Check size by inode_newsize_ok in nfs_setattrKinglong Mee2015-08-271-8/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Set rlimit for NFS's files is useless right now. For local process's rlimit, it should be checked by nfs client. The same, CIFS also call inode_change_ok checking rlimit at its client in cifs_setattr_nounix() and cifs_setattr_unix(). v3, fix bad using of error Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFSv4.1/pNFS: pnfs_mark_matching_lsegs_return must notify of layout returnTrond Myklebust2015-08-271-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's not sufficient to just mark the layout segment for layout return. We also need to set the NFS_LAYOUT_RETURN_BEFORE_CLOSE flag in the layout header. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | nfs42: remove unused declarationPeng Tao2015-08-251-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | nfs42: decode_layoutstats does not need res parameterPeng Tao2015-08-251-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFSv4.1/flexfiles: Allow coalescing of new layout segments and existing onesTrond Myklebust2015-08-252-0/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to ensure atomicity of updates, we merge the old layout segments into the new ones, and then invalidate the old ones. Also ensure that we order the list of layout segments so that RO segments are preferred over RW. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFSv4.1/pnfs: Allow pNFS device drivers to customise layout segment insertionTrond Myklebust2015-08-252-9/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is needed in order to allow merging of contiguous layout segments, and also to correct the ordering of layouts for those device drivers that don't necessarily want to place the read-write layouts first. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFSv4.1/pnfs: Add sanity check for the layout range returned by the serverTrond Myklebust2015-08-251-1/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFSv4.1/pnfs Improve the packing of struct pnfs_layout_hdrTrond Myklebust2015-08-251-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Eliminate a couple of holes in the structure, and move the 2 atomics into the same cacheline. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFSv4.1/flexfile: ff_layout_remove_mirror can be statickbuild test robot2015-08-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFSv4.2/pnfs: Make the layoutstats timer configurableTrond Myklebust2015-08-253-1/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow advanced users to set the layoutstats timer in order to lengthen or shorten the period between layoutstat transmissions to the server. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFSv4.1/flexfile: Ensure uniqueness of mirrors across layout segmentsTrond Myklebust2015-08-252-29/+99
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Keep the full list of mirrors in the struct nfs4_ff_layout_mirror so that they can be shared among the layout segments that use them. Also ensure that we send out only one copy of the layoutstats per mirror. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFSv4.1/flexfiles: Remove mirror backpointer to lseg.Trond Myklebust2015-08-252-14/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we start sharing mirrors between several lsegs, we won't be able to keep it. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFSv4.1/flexfiles: Add refcounting to struct nfs4_ff_layout_mirrorTrond Myklebust2015-08-252-9/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We do want to share mirrors between layout segments, so add a refcount to enable that. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFS41/flexfiles: zero out DS write wccPeng Tao2015-08-251-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We do not want to update inode attributes with DS values. Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFS41: remove NFS_LAYOUT_ROC flagPeng Tao2015-08-252-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we return delegation before closing, we fail to do roc check during close because NFS_LAYOUT_ROC is cleared by delegreturn and it causes layouts to be still hanging around after delegreturn + close, which is a voilation against protocol. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFSv4: Add a tracepoint for CB_LAYOUTRECALLTrond Myklebust2015-08-252-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Only support for single file layoutrecall for now. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFSv4: Add a tracepoint for CB_GETATTRTrond Myklebust2015-08-252-1/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFSv4.1/pnfs: Add a tracepoint for return-on-close eventsTrond Myklebust2015-08-252-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow tracing of return-on-close. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFSv4: Force a post-op attribute update when holding a delegationTrond Myklebust2015-08-251-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the ctime or mtime or change attribute have changed because of an operation we initiated, we should make sure that we force an attribute update. However we do not want to mark the page cache for revalidation. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org # v4.0+
| * | | | | | NFSv4.1/pnfs Ensure flexfiles reports all connection related errorsTrond Myklebust2015-08-201-13/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make sure that we also handle RPC level connection and protocol negotiation errors. Reported-by: Tom Haynes <loghyr@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFSv4.1/pnfs: Ensure the flexfiles layoutstats timers are consistentTrond Myklebust2015-08-201-27/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We want to ensure that the stopwatches for the busy timer and the aggregate timer are consistent. This means that they need to use the same start/stop times. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFS41: fix list splice typePeng Tao2015-08-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We want to move commiting pages to pages list instead. Otherwise it causes pnfs small writes crash like: [34560.037692] BUG: unable to handle kernel NULL pointer dereference at 0000000000000068 [34560.038557] IP: [<ffffffffa05423d6>] nfs_init_commit+0x26/0x130 [nfs] [34560.039400] PGD 69f5a067 PUD 69f59067 PMD 0 [34560.040207] Oops: 0000 [#1] SMP [34560.041014] Modules linked in: nfsv3(OE) nfs_layout_flexfiles(OE) nfsv4(OE) nfs(OE) fscache(E) rpcsec_gss_krb5(E) xt_addrtype(E) xt_conntrack(E) ipt_MASQUERADE(E) nf_nat_masquerade_ipv4(E) iptable_nat(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) nf_nat_ipv4(E) iptable_filter(E) ip_tables(E) x_tables(E) nf_nat(E) nf_conntrack(E) bridge(E) stp(E) llc(E) dm_thin_pool(E) dm_persistent_data(E) dm_bio_prison(E) dm_bufio(E) ppdev(E) vmw_balloon(E) coretemp(E) crc32_pclmul(E) ghash_clmulni_intel(E) aesni_intel(E) aes_x86_64(E) glue_helper(E) lrw(E) gf128mul(E) ablk_helper(E) cryptd(E) psmouse(E) serio_raw(E) vmw_vmci(E) i2c_piix4(E) shpchp(E) parport_pc(E) parport(E) mac_hid(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) xfs(E) libcrc32c(E) hid_generic(E) usbhid(E) hid(E) e1000(E) mptspi(E) [34560.045106] mptscsih(E) mptbase(E) vmwgfx(E) drm_kms_helper(E) ttm(E) drm(E) autofs4(E) [last unloaded: fscache] [34560.045897] CPU: 0 PID: 130543 Comm: bash Tainted: G OE 4.2.0-rc5-dp-00057-gf993a93 #11 [34560.046699] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/20/2014 [34560.047525] task: ffff880031b0a980 ti: ffff880045fec000 task.ti: ffff880045fec000 [34560.048264] RIP: 0010:[<ffffffffa05423d6>] [<ffffffffa05423d6>] nfs_init_commit+0x26/0x130 [nfs] [34560.049000] RSP: 0018:ffff880045fefc18 EFLAGS: 00010246 [34560.049717] RAX: 0000000000000000 RBX: ffff8800208fbc80 RCX: ffff880045fefd50 [34560.050396] RDX: ffff880031c19ec0 RSI: ffff880045fefc88 RDI: ffff8800208fbc80 [34560.051041] RBP: ffff880045fefc28 R08: ffff8800208fbe68 R09: ffff880045fefc88 [34560.051666] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880045fefc78 [34560.052247] R13: ffff880045fefc88 R14: ffff880045fefa90 R15: ffff880045fefd50 [34560.052825] FS: 00007fa02d58c740(0000) GS:ffff88006d600000(0000) knlGS:0000000000000000 [34560.053410] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [34560.053992] CR2: 0000000000000068 CR3: 000000003b37a000 CR4: 00000000001406f0 [34560.054615] Stack: [34560.055200] ffff8800208fbc80 ffff8800208fbc80 ffff880045fefcc8 ffffffffa05c1a5b [34560.055800] ffff880045fefcc8 ffff880045fefd50 0000000045fefcb8 ffff880045fefd40 [34560.056418] ffff8800420608e0 ffffffffa04f3910 0000000100000001 ffff880045fefd50 [34560.057013] Call Trace: [34560.057672] [<ffffffffa05c1a5b>] pnfs_generic_commit_pagelist+0x1cb/0x300 [nfsv4] [34560.058277] [<ffffffffa04f3910>] ? ff_layout_commit_pagelist+0x20/0x20 [nfs_layout_flexfiles] [34560.058907] [<ffffffffa04f3905>] ff_layout_commit_pagelist+0x15/0x20 [nfs_layout_flexfiles] [34560.059557] [<ffffffffa0543fc1>] nfs_generic_commit_list+0xb1/0xf0 [nfs] [34560.060214] [<ffffffffa0543e47>] ? nfs_scan_commit+0x37/0xa0 [nfs] [34560.060825] [<ffffffffa0544081>] nfs_commit_inode+0x81/0x150 [nfs] [34560.061432] [<ffffffffa05443ae>] nfs_wb_all+0x1ae/0x400 [nfs] [34560.062035] [<ffffffffa05380ad>] nfs_getattr+0x33d/0x510 [nfs] [34560.062630] [<ffffffff8122499c>] vfs_getattr_nosec+0x2c/0x40 [34560.063223] [<ffffffff81224a66>] vfs_getattr+0x26/0x30 [34560.063818] [<ffffffff81224b35>] vfs_fstatat+0x65/0xa0 [34560.064413] [<ffffffff81224f3f>] SYSC_newstat+0x1f/0x40 [34560.065016] [<ffffffff8102b176>] ? do_audit_syscall_entry+0x66/0x70 [34560.065626] [<ffffffff8102c773>] ? syscall_trace_enter_phase1+0x113/0x170 [34560.066245] [<ffffffff81003017>] ? trace_hardirqs_on_thunk+0x17/0x19 [34560.066868] [<ffffffff812251ae>] SyS_newstat+0xe/0x10 [34560.067533] [<ffffffff817a5df2>] entry_SYSCALL_64_fastpath+0x16/0x7a [34560.068173] Code: 0f 1f 44 00 00 0f 1f 44 00 00 55 4c 8d 87 e8 01 00 00 48 89 e5 53 48 89 fb 48 83 ec 08 4c 8b 0e 49 8b 41 18 4c 39 ce 48 8b 40 40 <4c> 8b 50 68 74 24 48 8b 87 e8 01 00 00 48 8b 7e 08 4d 89 41 08 [34560.069609] RIP [<ffffffffa05423d6>] nfs_init_commit+0x26/0x130 [nfs] [34560.070295] RSP <ffff880045fefc18> [34560.071008] CR2: 0000000000000068 [34560.073207] ---[ end trace f85f873260977406 ]--- [fixes 27571297a7e(pNFS: Tighten up locking around DS commit buckets)] Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFSv4: Enable delegated opens even when reboot recovery is pendingTrond Myklebust2015-08-191-8/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Unlike the previous attempt, this takes into account the fact that we may be calling it from the recovery thread itself. Detect this by looking at what kind of open we're doing, and checking the state of the NFS_DELEGATION_NEED_RECLAIM if it turns out we're doing a reboot reclaim-type open. Cc: Olga Kornievskaia <aglo@umich.edu> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | pNFS: Fix an unused variable warning in pnfs_roc_get_barrierTrond Myklebust2015-08-191-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFS41/flexfiles: update inode after write finishesPeng Tao2015-08-191-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Otherwise we break fstest case tests/read_write/mctime.t Does files layout need the same fix as well? Cc: stable@vger.kernel.org # v4.0+ Cc: Anna Schumaker <anna.schumaker@netapp.com> Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFS41: make sure sending LAYOUTRETURN before close if marked soPeng Tao2015-08-191-23/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If layout is marked by NFS_LAYOUT_RETURN_BEFORE_CLOSE, we should always send LAYOUTRETURN before close, and we don't need to do ROC drain if we do send LAYOUTRETURN. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | Revert "NFSv4: Remove incorrect check in can_open_delegated()"Trond Myklebust2015-08-191-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 4e379d36c050b0117b5d10048be63a44f5036115. This commit opens up a race between the recovery code and the open code. Reported-by: Olga Kornievskaia <aglo@umich.edu> Cc: stable@vger.kernel # v4.0+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFSv4.1/pnfs: Play safe w.r.t. close() races when return-on-close is setTrond Myklebust2015-08-181-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we have an OPEN_DOWNGRADE and CLOSE race with one another, we want to ensure that the layout is forgotten by the client, so that we start afresh with a new layoutget. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFSv4.1/pnfs: Fix a close/delegreturn hang when return-on-close is setTrond Myklebust2015-08-183-35/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The helper pnfs_roc() has already verified that we have no delegations, and no further open files, hence no outstanding I/O and it has marked all the return-on-close lsegs as being invalid. Furthermore, it sets the NFS_LAYOUT_RETURN bit, thus serialising the close/delegreturn with all future layoutget calls on this inode. The checks in pnfs_roc_drain() for valid layout segments are therefore redundant: those cannot exist until another layoutget completes. The other check for whether or not NFS_LAYOUT_RETURN is set, actually causes a hang, since we already know that we hold that flag. To fix, we therefore strip out all the functionality in pnfs_roc_drain() except the retrieval of the barrier state, and then rename the function accordingly. Reported-by: Christoph Hellwig <hch@infradead.org> Fixes: 5c4a79fb2b1c ("Don't prevent layoutgets when doing return-on-close") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFS: Don't fsync twice for O_SYNC/IS_SYNC filesTrond Myklebust2015-08-171-5/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | generic_file_write_iter() will already do an fsync on our behalf if the file descriptor is O_SYNC or the file is marked as IS_SYNC. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | NFS: Don't let the ctime override attribute barriers.Trond Myklebust2015-08-171-8/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Chuck reports seeing cases where a GETATTR that happens to race with an asynchronous WRITE is overriding the file size, despite the attribute barrier being set by the writeback code. The culprit turns out to be the check in nfs_ctime_need_update(), which sees that the ctime is newer than the cached ctime, and assumes that it is safe to override the attribute barrier. This patch removes that override, and ensures that attribute barriers are always respected. Reported-by: Chuck Lever <chuck.lever@oracle.com> Fixes: a08a8cd375db9 ("NFS: Add attribute update barriers to NFS writebacks") Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | Merge branch 'layoutfixes'Trond Myklebust2015-08-172-48/+44
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * layoutfixes: NFSv4.1/pnfs: Remove redundant wakeup in pnfs_send_layoutreturn() NFSv4.1/pnfs: Remove redundant check in pnfs_layoutgets_blocked() NFSv4.1/pnfs: Remove redundant lo->plh_block_lgets in layoutreturn NFSv4.1/pnfs: Don't prevent layoutgets when doing return-on-close NFSv4.1/pnfs: Fix serialisation of layout return and layoutget NFSv4.1/pnfs: Remove redundant checks in pnfs_layoutgets_blocked() pNFS: Tighten up locking around DS commit buckets
| | * | | | | | NFSv4.1/pnfs: Remove redundant wakeup in pnfs_send_layoutreturn()Trond Myklebust2015-08-121-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pnfs_clear_layoutreturn_waitbit() should already be calling rpc_wake_up(&NFS_SERVER(ino)->roc_rpcwaitq) for us. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | | | | NFSv4.1/pnfs: Remove redundant check in pnfs_layoutgets_blocked()Trond Myklebust2015-08-121-21/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | layoutget now should already be serialised w.r.t. layout returns Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | | | | NFSv4.1/pnfs: Remove redundant lo->plh_block_lgets in layoutreturnTrond Myklebust2015-08-121-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The NFS_LAYOUT_RETURN bit already suffices to ensure that layoutget is blocked. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | | | | NFSv4.1/pnfs: Don't prevent layoutgets when doing return-on-closeTrond Myklebust2015-08-121-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If there is an outstanding return-on-close, then we just want new layoutget requests to wait rather than fail. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | | | | NFSv4.1/pnfs: Fix serialisation of layout return and layoutgetTrond Myklebust2015-08-121-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We should always test for outstanding layout returns, whether or not pnfs_should_retry_layoutget() is true. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | | | | NFSv4.1/pnfs: Remove redundant checks in pnfs_layoutgets_blocked()Trond Myklebust2015-08-121-6/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If there are no valid layout segments, then we should already have checked in pnfs_update_layout() whether or not this is the first layoutget. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | | | | pNFS: Tighten up locking around DS commit bucketsTrond Myklebust2015-08-121-18/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I'm not aware of any bugreports around this issue, but the locking around the pnfs_commit_bucket is inconsistent at best. This patch tightens it up by ensuring that the 'bucket->committing' list is always changed atomically w.r.t. the 'bucket->clseg' layout segment tracking. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | | | | | Merge branch 'bugfixes'Trond Myklebust2015-08-172-16/+21
| |\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * bugfixes: SUNRPC: Fix a thinko in xs_connect() NFSv4.1/pNFS: Fix borken function _same_data_server_addrs_locked() NFS: nfs_set_pgio_error sometimes misses errors
| | * | | | | | | NFSv4.1/pNFS: Fix borken function _same_data_server_addrs_locked()Trond Myklebust2015-08-171-14/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Switch back to using list_for_each_entry(). Fixes an incorrect test for list NULL termination. - Do not assume that lists are sorted. - Finally, consider an existing entry to match if it consists of a subset of the addresses in the new entry. Cc: stable@vger.kernel.org # 4.0+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * | | | | | | NFS: nfs_set_pgio_error sometimes misses errorsTrond Myklebust2015-08-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We should ensure that we always set the pgio_header's error field if a READ or WRITE RPC call returns an error. The current code depends on 'hdr->good_bytes' always being initialised to a large value, which is not always done correctly by callers. When this happens, applications may end up missing important errors. Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
OpenPOWER on IntegriCloud