path: root/fs/ceph
Commit message (Author, Age, Files, Lines)
* Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client (Linus Torvalds, 2010-05-24, 30 files, -759/+876)
|\
| | * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (59 commits)
| |   ceph: reuse mon subscribe message instead of allocated anew
| |   ceph: avoid resending queued message to monitor
| |   ceph: Storage class should be before const qualifier
| |   ceph: all allocation functions should get gfp_mask
| |   ceph: specify max_bytes on readdir replies
| |   ceph: cleanup pool op strings
| |   ceph: Use kzalloc
| |   ceph: use common helper for aborted dir request invalidation
| |   ceph: cope with out of order (unsafe after safe) mds reply
| |   ceph: save peer feature bits in connection structure
| |   ceph: resync headers with userland
| |   ceph: use ceph. prefix for virtual xattrs
| |   ceph: throw out dirty caps metadata, data on session teardown
| |   ceph: attempt mds reconnect if mds closes our session
| |   ceph: clean up send_mds_reconnect interface
| |   ceph: wait for mds OPEN reply to indicate reconnect success
| |   ceph: only send cap releases when mds is OPEN|HUNG
| |   ceph: dicard cap releases on mds restart
| |   ceph: make mon client statfs handling more generic
| |   ceph: drop src address(es) from message header [new protocol feature]
| |   ...
| * ceph: reuse mon subscribe message instead of allocated anew (Sage Weil, 2010-05-21, 2 files, -10/+14)
| |   Use the same message, allocated during startup. No need to reallocate a new one each time around (and potentially ENOMEM).
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: avoid resending queued message to monitor (Sage Weil, 2010-05-21, 1 file, -0/+2)
| |   The auth_reply handler will (re)send any pending requests. For the initial mon authenticate phase, that's correct, but when an auth ticket renewal races with an in-flight request, we may resend a request message that is already in flight. Avoid this by revoking the message before sending it.
| |
| |   We should also avoid resending requests at all during ticket renewal; that will come soon.
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: Storage class should be before const qualifier (Tobias Klauser, 2010-05-21, 3 files, -6/+6)
| |   The C99 specification states in section 6.11.5:
| |
| |   The placement of a storage-class specifier other than at the beginning of the declaration specifiers in a declaration is an obsolescent feature.
| |
| |   Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
| |   Signed-off-by: Sage Weil <sage@newdream.net>
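The two orderings that C99 6.11.5 distinguishes can be illustrated with a minimal sketch. The variable and function names below are hypothetical and do not come from the patch; the point is only that both spellings declare the same kind of object, while the second is the obsolescent form (GCC can flag it with -Wold-style-declaration).

```c
#include <assert.h>

/* Preferred order: storage-class specifier first, then the qualifier. */
static const int preferred_order = 1;

/* Obsolescent order (C99 6.11.5): still valid C, but GCC can warn about
 * it when -Wold-style-declaration is enabled. */
const static int obsolescent_order = 2;

/* Hypothetical helper: both objects behave identically once declared;
 * only the spelling of the declaration differs. */
static int sum_orders(void)
{
	return preferred_order + obsolescent_order;
}
```

The change is purely syntactic, which is why the patch can touch three files without altering behavior.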
| * ceph: all allocation functions should get gfp_mask (Yehuda Sadeh, 2010-05-17, 8 files, -30/+32)
| |   This is essential, as for the rados block device we'll need to run in different contexts that need flags other than GFP_NOFS.
| |
| |   Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: specify max_bytes on readdir replies (Sage Weil, 2010-05-17, 4 files, -1/+14)
| |   Specify max bytes in request to bound size of reply. Add associated mount option with default value of 512 KB.
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: cleanup pool op strings (Sage Weil, 2010-05-17, 1 file, -19/+12)
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: Use kzalloc (Julia Lawall, 2010-05-17, 1 file, -2/+1)
| |   Use kzalloc rather than the combination of kmalloc and memset.
| |
| |   The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/)
| |
| |   // <smpl>
| |   @@
| |   expression x,size,flags;
| |   statement S;
| |   @@
| |
| |   -x = kmalloc(size,flags);
| |   +x = kzalloc(size,flags);
| |    if (x == NULL) S
| |   -memset(x, 0, size);
| |   // </smpl>
| |
| |   Signed-off-by: Julia Lawall <julia@diku.dk>
| |   Signed-off-by: Sage Weil <sage@newdream.net>
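The before/after shape of that transformation can be sketched in user space. This is only an analogy: malloc/memset stand in for kmalloc/memset and calloc stands in for kzalloc, there is no gfp flags argument, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Before: allocate, check, then zero by hand (the kmalloc + memset shape). */
static void *alloc_then_zero(size_t size)
{
	void *x = malloc(size);

	if (x == NULL)
		return NULL;
	memset(x, 0, size);
	return x;
}

/* After: a single call that returns zeroed memory (the kzalloc shape). */
static void *alloc_zeroed(size_t size)
{
	return calloc(1, size);
}
```

Both helpers hand back zero-filled memory or NULL; the second simply removes the separate memset step, which is the simplification the semantic patch automates.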
| * ceph: use common helper for aborted dir request invalidation (Sage Weil, 2010-05-17, 3 files, -31/+27)
| |   We invalidate I_COMPLETE and dentry leases in two places: on aborted mds request and on request replay. Use common helper to avoid duplicate code.
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: cope with out of order (unsafe after safe) mds reply (Sage Weil, 2010-05-17, 1 file, -0/+6)
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: save peer feature bits in connection structure (Sage Weil, 2010-05-17, 2 files, -0/+2)
| |   These are used for adjusting behavior, such as conditionally encoding a newer message format.
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: resync headers with userland (Sage Weil, 2010-05-17, 6 files, -22/+91)
| |   Notable changes include pool op defines and types, FLOCK feature bit, and new CMPXATTR osd ops.
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: use ceph. prefix for virtual xattrs (Sage Weil, 2010-05-17, 1 file, -10/+11)
| |   Drop the 'user.' prefix and use just 'ceph.' for fs virtual xattrs.
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: throw out dirty caps metadata, data on session teardown (Sage Weil, 2010-05-17, 1 file, -3/+41)
| |   The remove_session_caps() helper is called when an MDS closes out our session (either normally, or as a result of a failed reconnect), and when we tear down state for umount. If we remove the last cap, and there are no cap migrations in progress, then there is little hope of us flushing out that data to the mds (without heroic efforts to reconnect and flush).
| |
| |   So, to avoid leaving inodes pinned (due to dirty state) and crashing after umount, throw out dirty caps state and unpin the inodes. Print a warning to the console so we know something was lost.
| |
| |   NOTE: Although we drop wrbuffer refs, we don't actually mark pages clean; maybe a truncate should be queued?
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: attempt mds reconnect if mds closes our session (Sage Weil, 2010-05-17, 1 file, -2/+3)
| |   Currently, if our session is closed (due to a timeout, or explicit close, or whatever), we just sit there doing nothing unless/until the MDS restarts, at which point we try to reconnect.
| |
| |   Change client to attempt an immediate reconnect if our session is closed.
| |
| |   Note that currently the MDS doesn't support this, and our attempt will fail. We'll get a session CLOSE, our caps and dirty cap state will be dropped, and the client will be free to attempt to reconnect. That's clearly not as nice as a successful reconnect, but it at least allows us to try to carry on, and in the future the MDS will support a reconnect and we will fare better.
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: clean up send_mds_reconnect interface (Sage Weil, 2010-05-17, 1 file, -31/+16)
| |   Pass a ceph_mds_session, since the caller has it.
| |
| |   Remove the dead code for sending empty reconnects. It used to be used when the MDS contacted _us_ to solicit a reconnect, and we could reply saying "go away, I have no session." Now we only send reconnects based on the mds map, and only when we do in fact have an open session.
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: wait for mds OPEN reply to indicate reconnect success (Sage Weil, 2010-05-17, 1 file, -15/+13)
| |   We used to infer reconnect success by watching the MDS state, essentially assuming that hearing nothing meant things were ok. That wasn't particularly reliable. Instead, the MDS replies with an explicit OPEN message to indicate success.
| |
| |   Strictly speaking, this is a protocol change, but it is a backwards compatible one that does not break new clients + old servers or old clients + new servers. At least not yet.
| |
| |   Drop unused @all argument from kick_requests while we're at it.
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: only send cap releases when mds is OPEN|HUNG (Sage Weil, 2010-05-17, 1 file, -1/+3)
| |   On OPENING we shouldn't have any caps (or releases). On CLOSING, we should wait until we succeed (and throw it all out), or don't (and are OPEN again). On RECONNECTING we can wait until we are OPEN.
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
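The state rule described above can be written as a small predicate. This is a sketch under assumptions: the enum and its member names are hypothetical stand-ins for the session states named in the message, not the identifiers used in fs/ceph.

```c
#include <assert.h>

/* Hypothetical session states, mirroring the ones named in the message. */
enum session_state_sketch {
	S_OPENING,
	S_OPEN,
	S_HUNG,
	S_CLOSING,
	S_RECONNECTING,
};

/* Cap releases go out only while the session is OPEN or HUNG. In every
 * other state the caller holds them back: OPENING has no caps yet,
 * CLOSING either discards everything or returns to OPEN, and
 * RECONNECTING can wait until the session is OPEN. */
static int may_send_cap_releases(enum session_state_sketch state)
{
	return state == S_OPEN || state == S_HUNG;
}
```

Expressing the rule as an allow-list of two states, rather than a deny-list, means any state added later defaults to not sending.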
| * ceph: dicard cap releases on mds restart (Sage Weil, 2010-05-17, 1 file, -0/+41)
| |   If the MDS restarts, the expire caps state is no longer shared, and can be thrown out. Caps state will be rebuilt on the MDS during the reconnect process that follows. Zero out any release messages and adjust the release counter accordingly.
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: make mon client statfs handling more generic (Yehuda Sadeh, 2010-05-17, 3 files, -52/+58)
| |   This is being done so that we can reuse the statfs infrastructure with other requests that return values.
| |
| |   Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: drop src address(es) from message header [new protocol feature] (Sage Weil, 2010-05-17, 3 files, -11/+36)
| |   The CEPH_FEATURE_NOSRCADDR protocol feature avoids putting the full source address in each message header (twice). This patch switches the client to the new scheme, and _requires_ this feature on the server. The server will support both the old and new schemes. That means an old client will work with a new server, but a new client will not work with an old server.
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
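The compatibility asymmetry in that message follows from a required-feature mask check. The sketch below assumes a bitmask negotiation; the constant's value and the helper names are illustrative and are not taken from the Ceph headers.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit only; the real CEPH_FEATURE_NOSRCADDR value lives in
 * the Ceph headers and is not reproduced here. */
#define SKETCH_FEATURE_NOSRCADDR (1u << 1)

/* Bits this side requires that the peer did not advertise. */
static uint32_t missing_features(uint32_t required, uint32_t peer_supported)
{
	return required & ~peer_supported;
}

/* A connection is usable only if nothing required is missing. */
static int peer_is_compatible(uint32_t required, uint32_t peer_supported)
{
	return missing_features(required, peer_supported) == 0;
}
```

A new client requires the bit, so an old server (which does not advertise it) is rejected; an old client requires nothing, so a new server that supports both schemes accepts it.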
| * ceph: cleanup: remove unused assignement (Dan Carpenter, 2010-05-17, 1 file, -2/+1)
| |   We don't ever use "dirty" so we can remove it.
| |
| |   Signed-off-by: Dan Carpenter <error27@gmail.com>
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: clean up cap release loop vs spinlock (Sage Weil, 2010-05-17, 1 file, -4/+3)
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: name bdi ceph-%d instead of major:minor (Sage Weil, 2010-05-17, 1 file, -1/+4)
| |   The bdi_setup_and_register() helper doesn't help us since we bdi_init() in create_client() and bdi_register() only when sget() succeeds.
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: skip mds sync on forced unmount (Sage Weil, 2010-05-17, 1 file, -0/+3)
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: adjust masked struct_v variable names (Sage Weil, 2010-05-17, 1 file, -9/+9)
| |   Reported-by: Bill Pemberton <wfp5p@virginia.edu>
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: clean up mount options, ->show_options() (Sage Weil, 2010-05-17, 2 files, -40/+69)
| |   Ensure all options are included in /proc/mounts. Some cleanup.
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: set dn offset when spliced (Sage Weil, 2010-05-17, 2 files, -40/+44)
| |   We want to assign an offset when the dentry goes from null to linked, which is always done by splice_dentry(). Notably, we should NOT assign an offset when a dentry is first created and is still null.
| |
| |   BUG if we try to splice a non-null dentry (we shouldn't).
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: don't clobber i_max_offset on already complete dir (Sage Weil, 2010-05-17, 1 file, -1/+2)
| |   This can screw up offsets assigned to new dentries and break dcache readdir results.
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: skip set_dentry_offset work if directory not I_COMPLETE (Sage Weil, 2010-05-17, 1 file, -0/+4)
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: set next_offset on readdir finish (Sage Weil, 2010-05-17, 1 file, -1/+1)
| |   Set next_offset to 2 (always 2!), not 0, on readdir finish.
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: listxattr should compare version by >= (Henry C Chang, 2010-05-17, 1 file, -1/+1)
| |   If the version hasn't changed, don't rebuild the index.
| |
| |   Signed-off-by: Henry C Chang <henry_c_chang@tcloudcomputing.com>
| |   Signed-off-by: Sage Weil <sage@newdream.net>
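Why the comparison operator matters can be shown with a small sketch. The struct, its fields, and the rebuild counter are hypothetical and only assume that an index records the xattr version it was built from; with a strict > test, an index built at the current version would be rebuilt needlessly.

```c
#include <assert.h>

/* Hypothetical state: the current xattr version and the version the
 * lookup index was last built from. */
struct xattr_state_sketch {
	unsigned long long version;
	unsigned long long index_version;
	int rebuilds; /* counts how often the index was rebuilt */
};

/* Rebuild the index only if it is older than the current xattrs.
 * Using >= treats "same version" as already built. */
static void build_index_sketch(struct xattr_state_sketch *s)
{
	if (s->index_version >= s->version)
		return; /* index is up to date */
	s->rebuilds++;
	s->index_version = s->version;
}
```

Calling the helper twice without a version bump performs one rebuild, which is the behavior the one-character fix is after.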
| * ceph: fix xattr dangling pointer / double free (Sage Weil, 2010-05-17, 1 file, -0/+1)
| |   If we use the xattr_blob, clear the pointer so we don't release the memory at the bottom of the function.
| |
| |   Reported-by: Henry C Chang <henry_c_chang@tcloudcomputing.com>
| |   Signed-off-by: Sage Weil <sage@newdream.net>
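The ownership hand-off behind this fix can be sketched in user space. Everything here is illustrative: the struct, the function, and the use of malloc/free in place of the kernel's buffer handling are assumptions, and only the "clear the local pointer after handing it over" step mirrors the commit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical owner of a long-lived xattr buffer. */
struct inode_sketch {
	char *xattr_blob;
};

/* Returns 0 on success, -1 if the allocation fails. */
static int fill_xattrs_sketch(struct inode_sketch *in, const char *data,
			      int use_blob)
{
	char *xattr_blob = malloc(strlen(data) + 1);

	if (xattr_blob == NULL)
		return -1;
	strcpy(xattr_blob, data);

	if (use_blob) {
		free(in->xattr_blob);        /* drop the previous buffer */
		in->xattr_blob = xattr_blob; /* the inode now owns it */
		xattr_blob = NULL;           /* the fix: forget our copy */
	}

	/* Common cleanup at the bottom of the function. Without the NULL
	 * assignment above, this would free memory the inode still points
	 * to, leaving a dangling pointer and a later double free. */
	free(xattr_blob); /* free(NULL) is a no-op */
	return 0;
}
```

Clearing the local pointer lets the function keep a single cleanup path for both the "used" and "not used" cases.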
| * ceph: close messenger race (Sage Weil, 2010-05-17, 1 file, -7/+7)
| |   Simplify messenger locking, and close race between ceph_con_close() setting the CLOSED bit and con_work() checking the bit, then taking the mutex.
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: name msgpools; useful error messages (Sage Weil, 2010-05-17, 3 files, -7/+16)
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: fix memory leak due to possible dentry init race (Sage Weil, 2010-05-17, 1 file, -1/+4)
| |   Free dentry_info in error path.
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: include auth method in error messages (Sage Weil, 2010-05-17, 4 files, -4/+9)
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: osdtimeout=0 for now timeout (Sage Weil, 2010-05-17, 1 file, -1/+1)
| |   Allow the osd reset timeout to be disabled.
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: d_obtain_alias() returns ERR_PTR() (Dan Carpenter, 2010-05-17, 1 file, -6/+6)
| |   d_obtain_alias() doesn't return NULL, it returns an ERR_PTR().
| |
| |   Signed-off-by: Dan Carpenter <error27@gmail.com>
| |   Signed-off-by: Sage Weil <sage@newdream.net>
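Why a NULL check misses this kind of failure can be shown with a user-space imitation of the kernel's error-pointer convention. The helpers below are simplified re-implementations written for this sketch (lower-case names, hypothetical lookup function), not the kernel's own ERR_PTR/IS_ERR/PTR_ERR definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Error codes are encoded in the top 4095 addresses of the pointer range. */
#define SKETCH_MAX_ERRNO 4095

static void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

static int dummy_object;

/* Hypothetical lookup that, like d_obtain_alias(), reports failure
 * through an encoded error pointer and never through NULL. */
static void *obtain_alias_sketch(int fail)
{
	if (fail)
		return err_ptr(-ENOMEM);
	return &dummy_object;
}
```

A caller that tests only for NULL would treat the failing result as a valid object, so the check has to be is_err() followed by ptr_err() to recover the error code.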
| * ceph: wake up mount thread when getting osdmap (Yehuda Sadeh, 2010-05-17, 1 file, -0/+1)
| |   Now that the mount thread waits for the osdmap, it needs to be awakened.
| |
| |   Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
| * ceph: remove unused #includes (Huang Weiyi, 2010-05-17, 1 file, -3/+0)
| |   Remove unused #includes in fs/ceph/super.c
| |
| |   Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: wait for both monmap and osdmap when opening session (Sage Weil, 2010-05-17, 1 file, -5/+6)
| |   Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
| * ceph: clean up connection reset (Sage Weil, 2010-05-17, 2 files, -1/+2)
| |   Reset out_keepalive_pending and peer_global_seq, and drop unused var.
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: simplify ceph_msg_new (Sage Weil, 2010-05-17, 7 files, -36/+29)
| |   We only need to pass in front_len. Callers can attach any other payload pieces (middle, data) as they see fit.
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: make ceph_msg_new return NULL on failure; clean up, fix callers (Sage Weil, 2010-05-17, 7 files, -80/+48)
| |   Returning ERR_PTR(-ENOMEM) is useless extra work. Return NULL on failure instead, and fix up the callers (about half of which were wrong anyway).
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: rewrite msgpool using mempool_t (Sage Weil, 2010-05-17, 2 files, -151/+29)
| |   Since we don't need to maintain large pools of messages, we can just use the standard mempool_t. We maintain a msgpool 'wrapper' because we need the mempool_t* in the alloc function, and mempool gives us only pool_data.
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: use ceph_sb_to_client instead of ceph_client (Cheng Renquan, 2010-05-17, 11 files, -33/+30)
| |   ceph_sb_to_client and ceph_client are identical, so one of them should go. The function name ceph_client is easily confused with "struct ceph_client", while ceph_sb_to_client is clearer, so switch all callers to ceph_sb_to_client.
| |
| |   -static inline struct ceph_client *ceph_client(struct super_block *sb)
| |   -{
| |   -	return sb->s_fs_info;
| |   -}
| |
| |   Signed-off-by: Cheng Renquan <crquan@gmail.com>
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: handle kzalloc() failure (Cheng Renquan, 2010-05-17, 1 file, -0/+4)
| |   Signed-off-by: Cheng Renquan <crquan@gmail.com>
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: drop unnecessary msgpool for mon_client subscribe_ack (Sage Weil, 2010-05-17, 2 files, -13/+13)
| |   Preallocate a single message to reuse instead.
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: drop unnecessary msgpool for mon_client auth_reply (Sage Weil, 2010-05-17, 2 files, -10/+14)
| |   Preallocate a single reply message that we can reuse instead.
| |
| |   Signed-off-by: Sage Weil <sage@newdream.net>