summaryrefslogtreecommitdiffstats
path: root/fs/ocfs2/dlm/dlmrecovery.c
Commit message (Collapse)AuthorAgeFilesLines
* ocfs2/dlm: Print message showing the recovery masterSunil Mushran2008-03-101-3/+3
| | | | | | | | | Knowing the dlm recovery master helps in debugging recovery issues. This patch prints a message on the recovery master node. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2/dlm: Add missing dlm_lockres_put()s in migration pathSunil Mushran2008-03-101-6/+34
| | | | | | | | | | | | | | | | During migration, the recovery master node may be asked to master a lockres it may not know about. In that case, it would not only have to create a lockres and add it to the hash, but also remember to to do the _put_ corresponding to the kref_init in dlm_init_lockres(), as soon as the migration is completed. Yes, we don't wait for the dlm_purge_lockres() to do that matching put. Note the ref added for it being in the hash protects the lockres from being freed prematurely. This patch adds that missing put, as described above, to plug a memleak. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2/dlm: Add missing dlm_lock_put()sSunil Mushran2008-03-101-0/+9
| | | | | | | | | | | | | | | | | | | Normally locks for remote nodes are freed when that node sends an UNLOCK message to the master. The master node tags an DLM_UNLOCK_FREE_LOCK action to do an extra put on the lock at the end. However, there are times when the master node has to free the locks for the remote nodes forcibly. Two cases when this happens are: 1. When the master has migrated the lockres plus all locks to another node. 2. When the master is clearing all the locks of a dead node. It was in the above two conditions that the dlm was missing the extra put. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Use dlm_print_one_lock_resource for lock resource printTao Ma2008-03-101-1/+1
| | | | | | | | | | __dlm_print_one_lock_resource must be called with spin_lock the res->spinlock. While in some cases, we use it without this precondition and lead to the failure of assert_spin_locked. So call dlm_print_one_lock_resource instead. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2/dlm: Clear joining_node on hearbeat node downTao Ma2008-01-251-6/+6
| | | | | | | | | | | | | | | | Currently the process of dlm join contains 2 steps: query join and assert join. After query join, the joined node will set its joining_node. So if the joining node happens to panic before the 2nd step, the joined node will fail to clear its joining_node flag because that node isn't in the domain map. It at least cause 2 problems. 1. All the new join request will fail. So no new node can mount the volume. 2. The joined node can't umount the volume since during the umount process it has to wait for the joining_node to be unknown. So the umount will be hanged. The solution is to clear the joining_node before we check the domain map. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2_dlm: Call node eviction callbacks from heartbeat handlerMark Fasheh2008-01-251-0/+7
| | | | | | | | With this, a dlm client can take advantage of the group protocol in the dlm to get full notification whenever a node within the dlm domain leaves unexpectedly. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* Use helpers to obtain task pid in printksPavel Emelyanov2007-10-191-5/+5
| | | | | | | | | | | | | | | | The task_struct->pid member is going to be deprecated, so start using the helpers (task_pid_nr/task_pid_vnr/task_pid_nr_ns) in the kernel. The first thing to start with is the pid, printed to dmesg - in this case we may safely use task_pid_nr(). Besides, printks produce more (much more) than a half of all the explicit pid usage. [akpm@linux-foundation.org: git-drm went and changed lots of stuff] Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Dave Airlie <airlied@linux.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [KJ PATCH] Replacing memset(<addr>,0,PAGE_SIZE) with clear_page() in ↵Shani Moideen2007-07-101-1/+1
| | | | | | | | | | fs/ocfs2/dlm/dlmrecovery.c Replacing memset(<addr>,0,PAGE_SIZE) with clear_page() in fs/ocfs2/dlm/dlmrecovery.c Signed-off-by: Shani Moideen <shani.moideen@wipro.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] ocfs2: use list_for_each_entry where beneficalChristoph Hellwig2007-07-101-52/+25
| | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: fix sparse warnings in fs/ocfs2/dlmMark Fasheh2007-05-021-2/+2
| | | | Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2_dlm: fix race in dlm_remaster_locksSrinivas Eeda2007-04-261-0/+2
| | | | | | | | | | | There is a possibility that dlm_remaster_locks could overwride node->state with DLM_RECO_NODE_DATA_REQUESTED after dlm_reco_data_done_handler sets the node->state to DLM_RECO_NODE_DATA_DONE. This could lead to recovery getting stuck and requires a cluster reboot. Synchronize with dlm_reco_state_lock spinlock. Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Added post handler callable function in o2net message handlerKurt Hackel2007-02-071-6/+12
| | | | | | | | | | | | | Currently o2net allows one handler function per message type. This patch adds the ability to call another function to be called after the handler has returned the message to the other node. Handlers are now given the option of returning a context (in the form of a void **) which will be passed back into the post message handler function. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2_dlm: Cookies in locks not being printed correctly in error messagesKurt Hackel2007-02-071-6/+6
| | | | | | | | | | | | The dlm encodes the node number and a sequence number in the lock cookie. It also stores the cookie in the lockres in the big endian format to avoid swapping 8 bytes on each lock request. The bug here was that it was assuming the cookie to be in the cpu format when decoding it for printing the error message. This patch swaps the bytes before the print. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2_dlm: wake up sleepers on the lockres waitqueueKurt Hackel2007-02-071-0/+1
| | | | | | | | | | The dlm was not waking up threads waiting on the lockres wait queue, waiting for the lockres to be no longer be in the DLM_LOCK_RES_IN_PROGRESS and the DLM_LOCK_RES_MIGRATING states. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2_dlm: Dlm dispatch was stopping too earlyKurt Hackel2007-02-071-3/+0
| | | | | | | | | | dlm_dispatch_work was not processing the queued up tasks at the first sign of the node leaving the domain leading to not only incompleted tasks but also a mismatch in the dlm refcnt. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2_dlm: Drop inflight refmap even if no locks found on the lockresKurt Hackel2007-02-071-5/+3
| | | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2_dlm: Fix migrate lockres handler queue scanningKurt Hackel2007-02-071-6/+20
| | | | | | | | | | | | The migrate lockres handler was only searching for its lock on migrated lockres on the expected queue. This could be problematic as the new master could have also issued a convert request during the migration and thus moved the lock to the convert queue. We now search for the lock on all three queues. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <Sunil.Mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2_dlm: Make dlmunlock() wait for migration to completeKurt Hackel2007-02-071-0/+1
| | | | | | | | | dlmunlock() was not waiting for migration to complete before releasing locks on locally mastered locks. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <Sunil.Mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2_dlm: fix cluster-wide refcounting of lock resourcesKurt Hackel2007-02-071-7/+116
| | | | | | | | | | | This was previously broken and migration of some locks had to be temporarily disabled. We use a new (and backward-incompatible) set of network messages to account for all references to a lock resources held across the cluster. once these are all freed, the master node may then free the lock resource memory once its local references are dropped. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] Fix numerous kcalloc() calls, convert to kzalloc()Robert P. J. Day2006-12-131-3/+3
| | | | | | | | | | | | | | | | | | | All kcalloc() calls of the form "kcalloc(1,...)" are converted to the equivalent kzalloc() calls, and a few kcalloc() calls with the incorrect ordering of the first two arguments are fixed. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Cc: Jeff Garzik <jeff@garzik.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Adam Belay <ambx1@neo.rr.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Greg KH <greg@kroah.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* WorkStruct: make allyesconfigDavid Howells2006-11-221-2/+3
| | | | | | Fix up for make allyesconfig. Signed-Off-By: David Howells <dhowells@redhat.com>
* ocfs2: Allow binary names in the DLMMark Fasheh2006-09-241-1/+2
| | | | | | | | The OCFS2 DLM uses strlen() to determine lock name length, which excludes the possibility of putting binary values in the name string. Fix this by requiring that string length be passed in as a parameter. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] fs/ocfs2/dlm/dlmrecovery.c: make dlm_lockres_master_requery() staticAdrian Bunk2006-06-291-2/+6
| | | | | | | dlm_lockres_master_requery() became global without any external usage. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] spin/rwlock init cleanupsIngo Molnar2006-06-271-2/+2
| | | | | | | | | | | | | | | | | | | | | locking init cleanups: - convert " = SPIN_LOCK_UNLOCKED" to spin_lock_init() or DEFINE_SPINLOCK() - convert rwlocks in a similar manner this patch was generated automatically. Motivation: - cleanliness - lockdep needs control of lock initialization, which the open-coded variants do not give - it's also useful for -rt and for lock debugging in general Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] fs/ocfs2/dlm/: cleanupsAdrian Bunk2006-06-261-1/+1
| | | | | | | | | | | | This patch #if 0's the no longer used dlm_dump_lock_resources(). Since this makes dlmdebug.h empty, this patch also removes this header. Additionally, the needlessly global dlm_is_node_recovered() is made static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: move dlm work to a private work queueKurt Hackel2006-06-261-2/+11
| | | | | | | | The work that is done can block for long periods of time and so is not appropriate for keventd. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: tune down some noisy messages during dlm recoveryKurt Hackel2006-06-261-1/+1
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: display message before waiting for recovery to completeKurt Hackel2006-06-261-0/+7
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: use GFP_NOFS in some dlm operationsKurt Hackel2006-06-261-5/+5
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: wait for recovery when starting lock masteryKurt Hackel2006-06-261-0/+30
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: continue recovery when a dead node is encounteredKurt Hackel2006-06-261-0/+1
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: remove unneccesary spin_unlock() in dlm_remaster_locks()Kurt Hackel2006-06-261-1/+0
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: dlm_remaster_locks() should never exit without completingKurt Hackel2006-06-261-54/+62
| | | | | | | | | We cannot restart recovery. Once we begin to recover a node, keep the state of the recovery intact and follow through, regardless of any other node deaths that may occur. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: special case recovery lock in dlmlock_remote()Kurt Hackel2006-06-261-0/+4
| | | | | | | | | | If the previous master of the recovery lock dies, let calc_usage take it down completely and let the caller completely redo the dlmlock() call. Otherwise, there will never be an opportunity to re-master the lockres and recovery wont be able to progress. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: update lvb immediately during recoveryKurt Hackel2006-06-261-18/+26
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: dump mismatching migrated lvbs before BUG()Kurt Hackel2006-06-261-2/+13
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: make dlm recovery finalization 2 stageKurt Hackel2006-06-261-17/+95
| | | | | | | Makes it easier for the recovery process to deal with node death. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: dlm recovery / lockres reference count fixKurt Hackel2006-06-261-3/+13
| | | | | | | Take a reference on lockres structures while they are on the recovery list. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: clean up recovery related messagesKurt Hackel2006-06-261-12/+90
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: handle network errors during recoveryKurt Hackel2006-06-261-17/+36
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: only recover one dead node at a timeKurt Hackel2006-06-261-3/+35
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Better tracking for recovery state changesKurt Hackel2006-06-261-9/+28
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Fix empty lvb checkKurt Hackel2006-06-261-5/+7
| | | | | | | | The check for an empty lvb should check the entire buffer not just the first byte. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: fix inverted logic in dlm_is_node_deadKurt Hackel2006-06-261-1/+1
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: allocate lockres hash pages in an arrayDaniel Phillips2006-06-261-2/+2
| | | | | | | | This allows us to have a hash table greater than a single page which greatly improves dlm performance on some tests. Signed-off-by: Daniel Phillips <phillips@google.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: calculate lockid hash values outside of the spinlockMark Fasheh2006-06-261-1/+4
| | | | | | Fixes a performance bug - pointed out by Andrew. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] fs: use list_move()Akinobu Mita2006-06-261-6/+3
| | | | | | | | | | | | | | | | This patch converts the combination of list_del(A) and list_add(A, B) to list_move(A, B) under fs/. Cc: Ian Kent <raven@themaw.net> Acked-by: Joel Becker <joel.becker@oracle.com> Cc: Neil Brown <neilb@cse.unsw.edu.au> Cc: Hans Reiser <reiserfs-dev@namesys.com> Cc: Urban Widmark <urban@teststation.com> Acked-by: David Howells <dhowells@redhat.com> Acked-by: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Akinobu Mita <mita@miraclelinux.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* ocfs2: don't use MLF* in dlm/ filesKurt Hackel2006-03-241-4/+8
| | | | | Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] ocfs2: dlm recovery fixesKurt Hackel2006-03-241-17/+21
| | | | | | | | | | when starting lock mastery (excepting the recovery lock) wait on any nodes needing recovery. fix one instance where lock resources were left attached to the recovery list after recovery completed. ensure that the node_down code is run uniformly regardless of which node found the dead node first. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] ocfs2: use hlists for lockres hashMark Fasheh2006-03-011-11/+12
| | | | | | | Switch from list_head to hlist_head. Make the size of the hash dependent upon the allocated area, rather than a constant. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
OpenPOWER on IntegriCloud