op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message (Collapse)	Author	Age	Files	Lines
*	ceph: fix iterate_caps removal race	Sage Weil	2010-02-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We need to be able to iterate over all caps on a session with a possibly slow callback on each cap. To allow this, we used to prevent cap reordering while we were iterating. However, we were not safe from races with removal: removing the 'next' cap would make the next pointer from list_for_each_entry_safe be invalid, and cause a lock up or similar badness. Instead, we keep an iterator pointer in the session pointing to the current cap. As before, we avoid reordering. For removal, if the cap isn't the current cap we are iterating over, we are fine. If it is, we clear cap->ci (to mark the cap as pending removal) but leave it in the session list. In iterate_caps, we can safely finish removal and get the next cap pointer. While we're at it, clean up put_cap to not take a cap reservation context, as it was never used. Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: use rbtree for snap_realms	Sage Weil	2010-02-16	1	-2/+1
\| \| \| \| \| \| \|	Switch from radix tree to rbtree for snap realms. This is much more appropriate given that realm keys are few and far between. Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: use rbtree for mds requests	Sage Weil	2010-02-16	1	-1/+3
\| \| \| \| \| \| \| \| \| \|	The rbtree is a more appropriate data structure than a radix_tree. It avoids extra memory usage and simplifies the code. It also fixes a bug where the debugfs 'mdsc' file wasn't including the most recent mds request. Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: properly handle aborted mds requests	Sage Weil	2010-01-25	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, if the MDS request was interrupted, we would unregister the request and ignore any reply. This could cause the caps or other cache state to become out of sync. (For instance, aborting dbench and doing rm -r on clients would complain about a non-empty directory because the client didn't realize it's aborted file create request completed.) Even we don't unregister, we still can't process the reply normally because we are no longer holding the caller's locks (like the dir i_mutex). So, mark aborted operations with r_aborted, and in the reply handler, be sure to process all the caps. Do not process the namespace changes, though, since we no longer will hold the dir i_mutex. The dentry lease state can also be ignored as it's more forgiving. Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: do not touch_caps while iterating over caps list	Sage Weil	2009-12-23	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	Avoid confusing iterate_session_caps(), flag the session while we are iterating so that __touch_cap does not rearrange items on the list. All other modifiers of session->s_caps do so under the protection of s_mutex. Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: use kref for struct ceph_mds_request	Sage Weil	2009-12-07	1	-3/+8
\| \| \| \|	Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: negotiate authentication protocol; implement AUTH_NONE protocol	Sage Weil	2009-11-18	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When we open a monitor session, we send an initial AUTH message listing the auth protocols we support, our entity name, and (possibly) a previously assigned global_id. The monitor chooses a protocol and responds with an initial message. Initially implement AUTH_NONE, a dummy protocol that provides no security, but works within the new framework. It generates 'authorizers' that are used when connecting to (mds, osd) services that simply state our entity name and global_id. This is a wire protocol change. Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: handle errors during osd client init	Sage Weil	2009-11-18	1	-1/+1
\| \| \| \| \| \|	Unwind initializing if we get ENOMEM during client initialization. Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: build cleanly without CONFIG_DEBUG_FS	Sage Weil	2009-11-12	1	-0/+2
\| \| \| \|	Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: remove recon_gen logic	Sage Weil	2009-11-10	1	-2/+0
\| \| \| \| \| \| \| \|	We don't get an explicit affirmative confirmation that our caps reconnect, nor do we necessarily want to pay that cost. So, take all this code out for now. Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: do not confuse stale and dead (unreconnected) caps	Sage Weil	2009-11-09	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \|	We were using the cap_gen to track both stale caps (caps that timed out due to temporarily losing touch with the mds) and dead caps that did not reconnect after an MDS failure. Introduce a recon_gen counter to track reconnections to restarted MDSs and kill dead caps based on that instead. Rename gen to cap_gen while we're at it to make it more clear which is which. Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: MDS client	Sage Weil	2009-10-06	1	-0/+321
	The MDS (metadata server) client is responsible for submitting requests to the MDS cluster and parsing the response. We decide which MDS to submit each request to based on cached information about the current partition of the directory hierarchy across the cluster. A stateful session is opened with each MDS before we submit requests to it, and a mutex is used to control the ordering of messages within each session. An MDS request may generate two responses. The first indicates the operation was a success and returns any result. A second reply is sent when the operation commits to disk. Note that locking on the MDS ensures that the results of updates are visible only to the updating client before the operation commits. Requests are linked to the containing directory so that an fsync will wait for them to commit. If an MDS fails and/or recovers, we resubmit requests as needed. We also reconnect existing capabilities to a recovering MDS to reestablish that shared session state. Old dentry leases are invalidated. Signed-off-by: Sage Weil <sage@newdream.net>