path: root/sys/kern/kern_lock.c
Commit message | Author | Age | Files | Lines
* [jhb, 2010-08-20; 1 file, -0/+28] Add dedicated routines to toggle lockmgr flags such as LK_NOSHARE and LK_CANRECURSE after a lock is created. Use them to implement macros that otherwise manipulated the flags directly. Assert that the associated lockmgr lock is exclusively locked by the current thread when manipulating these flags to ensure the flag updates are safe. This last change required some minor shuffling in a few filesystems to exclusively lock a brand new vnode slightly earlier.
  Reviewed by: kib
  MFC after: 3 days
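The assertion described in the entry above can be sketched in userland C. This is only an illustration under stated assumptions: `struct toy_lock`, the `TOY_LK_*` bit values, the integer thread ids, and all `toy_*` function names are hypothetical and are not the kernel's definitions.

```c
#include <assert.h>

/* Hypothetical flag bits and lock layout; not the kernel's definitions. */
#define TOY_LK_CANRECURSE 0x1u
#define TOY_LK_NOSHARE    0x2u

struct toy_lock {
    unsigned int flags;
    int xowner;             /* id of the exclusive owner, 0 if none */
};

/* Flag updates are only safe while the caller holds the lock exclusively. */
static void
toy_assert_xlocked(const struct toy_lock *lk, int curthread)
{
    assert(curthread != 0 && lk->xowner == curthread);
}

void
toy_allow_recurse(struct toy_lock *lk, int curthread)
{
    toy_assert_xlocked(lk, curthread);
    lk->flags |= TOY_LK_CANRECURSE;
}

void
toy_disable_recurse(struct toy_lock *lk, int curthread)
{
    toy_assert_xlocked(lk, curthread);
    lk->flags &= ~TOY_LK_CANRECURSE;
}

void
toy_disable_share(struct toy_lock *lk, int curthread)
{
    toy_assert_xlocked(lk, curthread);
    lk->flags |= TOY_LK_NOSHARE;
}
```

Routing every flag change through a helper keeps the ownership check in one place, which is presumably why the entry replaces direct flag manipulation in macros.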
* [attilio, 2010-01-07; 1 file, -6/+6] Fix typos.
* [attilio, 2010-01-07; 1 file, -0/+18] Tweak comments.
* [attilio, 2010-01-07; 1 file, -3/+25] Exclusive waiters sleeping with LK_SLEEPFAIL on and using interruptible sleeps/timeouts may have left spurious lk_exslpfail counts behind, so clean the count up even when handling a shared queue acquisition, treating lk_exslpfail as an 'upper limit'. In the worst case (mixed interruptible sleep / LK_SLEEPFAIL waiters), both queues are awakened even when that is not necessary, which is harmless.
  Reported by: Lucius Windschuh <lwindschuh at googlemail dot com>
  Reviewed by: kib
  Tested by: pho, Lucius Windschuh <lwindschuh at googlemail dot com>
* [attilio, 2009-12-12; 1 file, -13/+92] In current code, threads performing an interruptible sleep (on both sxlock, via the sx_{s, x}lock_sig() interface, or plain lockmgr) will leave the waiters flag on, forcing the owner to do a wakeup even if the waiter queue is empty. That operation may lead to a deadlock when a fake wakeup is done on the "preferred" queue (based on the wakeup algorithm) while the other queue has real waiters on it, because nobody is going to wake up the second queue's waiters and they will sleep indefinitely.
  A similar bug is present for lockmgr when the waiters are sleeping with LK_SLEEPFAIL on. In this case, even if the waiters queue is not empty, the waiters won't progress after being awakened but will just fail, still not taking care of the second queue's waiters (as the lock owner doing the wakeup would instead expect).
  In order to fix this bug cheaply (without adding too much locking or complicating the semantics too much), add a sleepqueue interface that reports the actual number of waiters on a specified queue of a wait channel (sleepq_sleepcnt()) and use it to determine whether exclusive waiters (or shared waiters) are actually present on the lockmgr (or sx) before giving them precedence in the wakeup algorithm. This fix alone, however, doesn't solve the LK_SLEEPFAIL bug. To cope with it, track how many exclusive LK_SLEEPFAIL waiters a lockmgr has and, if all the waiters on the exclusive waiters queue are LK_SLEEPFAIL, just wake both queues.
  The sleepq_sleepcnt() introduction and ABI breakage require __FreeBSD_version bumping.
  Reported by: avg, kib, pho
  Reviewed by: kib
  Tested by: pho
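The queue selection described in the entry above can be sketched as a pure decision function. This is a simplified model, not the kernel's wakeup code: `struct toy_waiters`, `enum toy_wake`, and `toy_pick_queue()` are hypothetical names, and the sleeper counts stand in for what sleepq_sleepcnt() would report.

```c
#include <stdbool.h>

enum toy_wake { TOY_WAKE_NONE, TOY_WAKE_EXCL, TOY_WAKE_SHARED, TOY_WAKE_BOTH };

/* Hypothetical snapshot of a lock's waiter state. */
struct toy_waiters {
    bool excl_flag;         /* "exclusive waiters" flag set in the lock */
    bool shared_flag;       /* "shared waiters" flag set in the lock */
    int excl_sleeping;      /* real sleepers on the exclusive queue */
    int shared_sleeping;    /* real sleepers on the shared queue */
    int excl_sleepfail;     /* upper limit on exclusive sleepers that will fail */
};

enum toy_wake
toy_pick_queue(const struct toy_waiters *w)
{
    /* A flag only counts if somebody is really asleep on that queue. */
    bool excl = w->excl_flag && w->excl_sleeping > 0;
    bool shared = w->shared_flag && w->shared_sleeping > 0;

    if (excl) {
        /* Every exclusive sleeper may just fail: wake the other queue too. */
        if (w->excl_sleepfail >= w->excl_sleeping)
            return (shared ? TOY_WAKE_BOTH : TOY_WAKE_EXCL);
        return (TOY_WAKE_EXCL);
    }
    return (shared ? TOY_WAKE_SHARED : TOY_WAKE_NONE);
}
```

With a stale exclusive flag and two real shared sleepers, the function picks the shared queue instead of issuing a fake wakeup on the empty exclusive queue, which is the deadlock the entry describes.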
* [attilio, 2009-11-06; 1 file, -0/+1] Save the stack when doing a lockmgr_disown() call.
  Requested by: kib
  MFC: 3 days
* [attilio, 2009-10-03; 1 file, -3/+3] When releasing a lockmgr held in shared mode we need to use a write memory barrier in order to avoid CPU instruction reordering on architectures that do not have strongly ordered writes.
  Diagnosed by: fabio
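As a rough analogue of the barrier choice in the entry above, the following sketch uses C11 atomics rather than the kernel's own atomic primitives. `struct toy_shlock` and the `toy_*` functions are hypothetical, and the lock is reduced to a bare count of shared holders, so it only illustrates where release ordering goes.

```c
#include <stdatomic.h>

/* Hypothetical toy lock: only a count of shared holders, nothing else. */
struct toy_shlock {
    atomic_uint sharers;
};

void
toy_slock(struct toy_shlock *lk)
{
    /* Acquire: later accesses to protected data cannot move above this. */
    atomic_fetch_add_explicit(&lk->sharers, 1, memory_order_acquire);
}

/* Drops one shared hold and returns the number of sharers that remain. */
unsigned int
toy_sunlock(struct toy_shlock *lk)
{
    /* Release: earlier accesses to protected data are ordered before the drop. */
    return (atomic_fetch_sub_explicit(&lk->sharers, 1,
        memory_order_release) - 1);
}
```

On strongly ordered machines the release ordering costs little or nothing; on weakly ordered ones it prevents accesses made under the lock from becoming visible after the lock appears free.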
* [phk, 2009-09-08; 1 file, -1/+0] Revert previous commit and add myself to the list of people who should know better than to commit with a cat in the area.
* [phk, 2009-09-08; 1 file, -0/+1] Add necessary include.
* [attilio, 2009-09-02; 1 file, -2/+11] Fix some bugs related to adaptive spinning:
  In the lockmgr support:
  - GIANT_RESTORE() is only called when the sleep finishes, so the current code can end up with a Giant unlock problem. Fix it by calling GIANT_RESTORE() when needed. Note that this is not exactly ideal because for every iteration of the adaptive spinning we drop and restore Giant, but the overhead should not be a factor.
  - In the lock held in exclusive mode case, after the adaptive spinning is brought to completion, we should just retry to acquire the lock instead of falling through. Fix that.
  - Fix a style nit.
  In the sx support:
  - Call GIANT_SAVE() before looping. This saves some overhead because in the current code GIANT_SAVE() is called several times.
  Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
* [attilio, 2009-08-17; 1 file, -0/+3]
  * Change the scope of ASSERT_ATOMIC_LOAD() from a generic check to a pointer-fetching specific operation check. Consequently, rename the operation ASSERT_ATOMIC_LOAD_PTR().
  * Fix the implementation of ASSERT_ATOMIC_LOAD_PTR() by directly checking alignment on the word boundary, for all the given specific architectures. That's a bit too strict for some common cases, but it ensures safety.
  * Add a comment explaining the scope of the macro
  * Add a new stub in the lockmgr specific implementation
  Tested by: marcel (initial version), marius
  Reviewed by: rwatson, jhb (comment specific review)
  Approved by: re (kib)
* [attilio, 2009-06-17; 1 file, -18/+204] Introduce support for adaptive spinning in lockmgr. Currently, as it has received little tuning, the support is disabled by default, but it can be enabled with the option ADAPTIVE_LOCKMGRS. Due to the nature of lockmgrs, adaptive spinning needs to be selectively enabled for any interested lockmgr. The support is bi-directional: it works whether the lock is held in read or write mode. In particular, the read path is open to further tuning using the sysctls debug.lockmgr.retries and debug.lockmgr.loops. Ideally, such sysctls should be axed or compiled out before release.
  Additionally, note that adaptive spinning doesn't cope well with LK_SLEEPFAIL. The reason is that many (and probably all) consumers of LK_SLEEPFAIL are mainly interested in knowing if the interlock was dropped or not in order to reacquire it and re-test initial conditions. This directly interacts with adaptive spinning because lockmgr needs to drop the interlock while spinning in order to avoid a deadlock (further details in the comments inside the patch).
  Final note: finding someone willing to help on tuning this with relevant workloads would be both very important and appreciated.
  Tested by: jeff, pho
  Requested by: many
* [attilio, 2009-06-02; 1 file, -5/+6] Handle lock recursion differently by always checking against LO_RECURSABLE instead of the lock's own flag itself.
  Tested by: pho
* [sson, 2009-05-26; 1 file, -1/+17] Add the OpenSolaris dtrace lockstat provider. The lockstat provider adds probes for mutexes, reader/writer and shared/exclusive locks to gather contention statistics and other locking information for dtrace scripts, the lockstat(1M) command and other potential consumers.
  Reviewed by: attilio jhb jb
  Approved by: gnn (mentor)
* [trasz, 2009-05-12; 1 file, -0/+1] Add missing 'break' statement.
  Found with: Coverity Prevent(tm)
  CID: 3919
* [jeff, 2009-03-15; 1 file, -4/+5]
  - Wrap lock profiling state variables in #ifdef LOCK_PROFILING blocks.
* [jeff, 2009-03-14; 1 file, -1/+3]
  - Call lock_profile_release when we're transitioning a lock to be owned by LK_KERNPROC.
  Discussed with: attilio
* [jhb, 2009-02-06; 1 file, -3/+3] Tweak the output of VOP_PRINT/vn_printf() some.
  - Align the fifo output in fifo_print() with other vn_printf() output.
  - Remove the leading space from lockmgr_printinfo() so its output lines up in vn_printf().
  - lockmgr_printinfo() now ends with a newline, so remove an extra newline from vn_printf().
* [jhb, 2008-09-10; 1 file, -3/+3] Teach WITNESS about the interlocks used with lockmgr. This removes a bunch of spurious witness warnings since lockmgr grew witness support. Before this, every time you passed an interlock to a lockmgr lock WITNESS treated it as a LOR.
  Reviewed by: attilio
* [jhb, 2008-08-22; 1 file, -2/+2] Use |= rather than += when aggregating requests to wakeup the swapper. What we really want is an inclusive or of all the requests, and += can in theory roll over to 0.
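The difference between the two operators can be shown with a small sketch. It assumes unsigned arithmetic so that the wraparound is well defined in C (the kernel variable's actual type is not stated in this log), and `aggregate_add()` and `aggregate_or()` are hypothetical helper names.

```c
#include <limits.h>

/*
 * Hypothetical stand-ins: each element of results[] plays the role of one
 * "swapper needs a wakeup" return value (zero or non-zero).
 */

/* Aggregate with +=: unsigned arithmetic wraps, so the sum can land on 0. */
unsigned int
aggregate_add(const unsigned int *results, int n)
{
    unsigned int total = 0;

    for (int i = 0; i < n; i++)
        total += results[i];
    return (total);
}

/* Aggregate with |=: any non-zero input leaves at least one bit set. */
unsigned int
aggregate_or(const unsigned int *results, int n)
{
    unsigned int total = 0;

    for (int i = 0; i < n; i++)
        total |= results[i];
    return (total);
}
```

Feeding both helpers the pair `{ 1, UINT_MAX }` makes the sum wrap to zero, silently dropping the request, while the inclusive or stays non-zero.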
* [jhb, 2008-08-05; 1 file, -9/+16] If a thread that is swapped out is made runnable, then the setrunnable() routine wakes up proc0 so that proc0 can swap the thread back in. Historically, this has been done by waking up proc0 directly from setrunnable() itself via a wakeup(). When waking up a sleeping thread that was swapped out (the usual case when waking proc0 since only sleeping threads are eligible to be swapped out), this resulted in a bit of recursion (e.g. wakeup() -> setrunnable() -> wakeup()).
  With sleep queues having separate locks in 6.x and later, this caused a spin lock LOR (sleepq lock -> sched_lock/thread lock -> sleepq lock). An attempt was made to fix this in 7.0 by making the proc0 wakeup use the ithread mechanism for doing the wakeup. However, this required grabbing proc0's thread lock to perform the wakeup. If proc0 was asleep elsewhere in the kernel (e.g. waiting for disk I/O), then this degenerated into the same LOR since the thread lock would be some other sleepq lock.
  Fix this by deferring the wakeup of the swapper until after the sleepq lock held by the upper layer has been dropped. The setrunnable() routine now returns a boolean value to indicate whether or not proc0 needs to be woken up. The end result is that consumers of the sleepq API such as *sleep/wakeup, condition variables, sx locks, and lockmgr have to wake up proc0 if they get a non-zero return value from sleepq_abort(), sleepq_broadcast(), or sleepq_signal().
  Discussed with: jeff
  Glanced at by: sam
  Tested by: Jurgen Weber jurgen - ish com au
  MFC after: 2 weeks
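The consumer-side pattern from the entry above (record the request, leave the sleep queue lock, then wake the swapper) might look like the following userland simulation. Every `toy_*` name is a hypothetical stand-in; none of these are the kernel's sleepqueue functions, and the "lock" is a plain boolean used only to check ordering.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state; every name here is a hypothetical stand-in, not kernel API. */
static bool toy_sleepq_locked;
static int toy_swapper_kicks;

static void
toy_sleepq_lock(void)
{
    toy_sleepq_locked = true;
}

static void
toy_sleepq_release(void)
{
    toy_sleepq_locked = false;
}

/* Wakes one waiter; returns non-zero if that waiter was swapped out. */
static int
toy_sleepq_signal(bool waiter_swapped_out)
{
    assert(toy_sleepq_locked);
    return (waiter_swapped_out ? 1 : 0);
}

/* Waking the swapper must never happen under the sleep queue lock. */
static void
toy_kick_swapper(void)
{
    assert(!toy_sleepq_locked);
    toy_swapper_kicks++;
}

/* Consumer pattern: record the request, drop the lock, then act on it. */
int
toy_wakeup_one(bool waiter_swapped_out)
{
    int wakeup_swapper;

    toy_sleepq_lock();
    wakeup_swapper = toy_sleepq_signal(waiter_swapped_out);
    toy_sleepq_release();
    if (wakeup_swapper)
        toy_kick_swapper();
    return (toy_swapper_kicks);     /* total swapper wakeups so far */
}
```

Because the swapper wakeup runs only after the release, it can take whatever sleep queue lock it needs without recreating the lock order reversal described above.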
* [kib, 2008-07-25; 1 file, -5/+5] s/alredy/already/ in the comments and the log message.
* [attilio, 2008-05-25; 1 file, -1/+1] The "if" semantic is not needed, just fix this.
* [attilio, 2008-04-13; 1 file, -1/+1] Use a "rel" memory barrier for disowning the lock as it comes from an exclusive locking operation.
* [attilio, 2008-04-12; 1 file, -17/+78]
  - Re-introduce WITNESS support for lockmgr. The only difference from the old implementation is that lockmgr*() functions now accept the LK_NOWITNESS flag, which skips order checking for that particular call.
  - Remove a useless stub in witness_checkorder() (the check above it means it can never be reached) and allow witness_upgrade() to accept non-try operations too.
* [attilio, 2008-04-12; 1 file, -4/+2]
  - Remove a stale comment.
  - Add an extra assertion in order to catch malformed requested operations.
* [attilio, 2008-04-07; 1 file, -1/+1]
  - Use a different encoding for lockmgr options: encode each option as a separate bit in order to allow per-bit checks on the options flag, in particular in the consumers' code [1]
  - Re-enable the check against TDP_DEADLKTREAT as the waiter anti-starvation patch allows exclusive waiters to override new shared requests.
  [1] Requested by: pjd, jeff
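A per-bit encoding of the kind the entry above describes can be illustrated as follows. The `TOY_LK_*` values are made up for this sketch and are not the real LK_* constants; `toy_has_options()` is likewise a hypothetical helper.

```c
#include <stdbool.h>

/* Hypothetical option values, one bit each; not the real LK_* constants. */
#define TOY_LK_NOWAIT    0x01u
#define TOY_LK_SLEEPFAIL 0x02u
#define TOY_LK_INTERLOCK 0x04u

/* Per-bit test: true if every bit in 'want' is set in 'flags'. */
bool
toy_has_options(unsigned int flags, unsigned int want)
{
    return ((flags & want) == want);
}
```

With sequential values such as 1, 2, 3 the third option would be indistinguishable from the first two combined; giving each option its own bit keeps a test like `flags & TOY_LK_INTERLOCK` unambiguous.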
* [attilio, 2008-04-06; 1 file, -623/+805] Optimize lockmgr in order to get rid of the pool mutex interlock, of the state transitioning flags and of msleep(9) calls. Use, instead, an algorithm very similar to what sx(9) and rwlock(9) already do and direct accesses to the sleepqueue(9) primitive.
  In order to avoid writer starvation, a mechanism very similar to what rwlock(9) uses now is implemented, with the corresponding per-thread shared lockmgrs counter.
  This patch also adds 2 new functions to the lockmgr KPI: lockmgr_rw() and lockmgr_args_rw(). These two are like the 2 "normal" versions, but they both accept a rwlock as interlock. To make this possible, the general lockmgr manager function "__lockmgr_args()" has been implemented through the generic lock layer. It supports all the blocking primitives, but currently only these 2 mappers live.
  The patch drops the support for WITNESS atm, but it will probably be added soon. Also, there is a little race in the draining code which is also present in the current CVS stock implementation: if some sharers, once they wake up, are in the runqueue they can contend the lock with the exclusive drainer. This is hard to fix, but the now committed code mitigates this issue a lot better than the (past) CVS version. In addition, the KA_HELD and KA_UNHELD assertions have been made mute because they are dangerous and will soon no longer be supported.
  In order to avoid namespace pollution, stack.h is split into two parts: one which includes only the "struct stack" definition (_stack.h) and one defining the KPI. In this way, the newly added _lockmgr.h can just include _stack.h.
  The kernel ABI is heavily changed by this commit (the now committed version of "struct lock" is a lot smaller than the previous one) and the KPI is broken by the lockmgr_rw() / lockmgr_args_rw() introduction, so manpages and __FreeBSD_version will be updated accordingly.
  Tested by: kris, pho, jeff, danger
  Reviewed by: jeff
  Sponsored by: Google, Summer of Code program 2007
* [attilio, 2008-03-01; 1 file, -18/+8]
  - Handle the buffer lock waiters count directly in the buffer cache instead of relying on the lockmgr support [1]:
    * bump the waiters only if the interlock is held
    * let brelvp() return the waiters count
    * rely on brelvp() instead of BUF_LOCKWAITERS() in order to check for the waiters number
  - Remove a namespace pollution introduced recently with lockmgr.h including lock.h by including lock.h directly in the consumers and making it mandatory for using lockmgr.
  - Modify flags accepted by lockinit():
    * introduce LK_NOPROFILE which disables lock profiling for the specified lockmgr
    * introduce LK_QUIET which disables ktr tracing for the specified lockmgr [2]
    * disallow LK_SLEEPFAIL and LK_NOWAIT to be passed there so that they can only be used on a per-instance basis
  - Remove BUF_LOCKWAITERS() and lockwaiters() as they are no longer used
  This patch breaks KPI so __FreeBSD_version will be bumped and manpages updated by further commits. Additionally, the 'struct buf' changes result in a disturbed ABI as well.
  [2] Really, currently there is no ktr tracing in the lockmgr, but it will be added soon.
  [1] Submitted by: kib
  Tested by: pho, Andrea Barberio <insomniac at slackware dot it>
* [attilio, 2008-02-25; 1 file, -5/+2] Axe the 'thread' argument from VOP_ISLOCKED() and lockstatus() as it is always curthread.
  As KPI gets broken by this patch, manpages and __FreeBSD_version will be updated by further commits.
  Tested by: Andrea Barberio <insomniac at slackware dot it>
* [attilio, 2008-02-15; 1 file, -24/+44]
  - Introduce lockmgr_args() in the lockmgr space. This function performs the same operation as lockmgr() but accepts a custom wmesg, prio and timo for the particular lock instance, overriding the default values lkp->lk_wmesg, lkp->lk_prio and lkp->lk_timo.
  - Use lockmgr_args() in order to implement BUF_TIMELOCK()
  - Cleanup BUF_LOCK()
  - Remove LK_INTERNAL as it is no longer used in the lockmgr namespace
  Tested by: Andrea Barberio <insomniac at slackware dot it>
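One way such per-call overrides can work is with sentinel "use the default" values, sketched below. This is an assumption made for illustration: the log does not say how lockmgr_args() signals "no override", and the `TOY_*_DEFAULT` sentinels, `struct toy_sleep_params`, and `toy_resolve_args()` are hypothetical. Only the field names lk_wmesg, lk_prio and lk_timo come from the entry above.

```c
#include <stddef.h>

/* Hypothetical sentinels meaning "use the value stored in the lock". */
#define TOY_WMESG_DEFAULT NULL
#define TOY_PRIO_DEFAULT  (-1)
#define TOY_TIMO_DEFAULT  (-1)

/* Per-lock defaults, using the field names from the log entry. */
struct toy_lock {
    const char *lk_wmesg;
    int lk_prio;
    int lk_timo;
};

/* The sleep parameters actually used for one call. */
struct toy_sleep_params {
    const char *wmesg;
    int prio;
    int timo;
};

/* Resolve each argument: a sentinel falls back to the lock's default. */
struct toy_sleep_params
toy_resolve_args(const struct toy_lock *lkp, const char *wmesg, int prio,
    int timo)
{
    struct toy_sleep_params p;

    p.wmesg = (wmesg == TOY_WMESG_DEFAULT) ? lkp->lk_wmesg : wmesg;
    p.prio = (prio == TOY_PRIO_DEFAULT) ? lkp->lk_prio : prio;
    p.timo = (timo == TOY_TIMO_DEFAULT) ? lkp->lk_timo : timo;
    return (p);
}
```

Under this scheme a plain lockmgr()-style call would pass all three sentinels, while a timed buffer lock would override only the wait message and timeout.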
* [attilio, 2008-02-13; 1 file, -28/+117]
  - Add real assertions to lockmgr locking primitives. A couple of notes for this:
    * WITNESS support, when enabled, is only used for shared locks in order to avoid problems with the "disowned" locks
    * KA_HELD and KA_UNHELD only exist in the lockmgr namespace in order to assert for a generic thread (not curthread) owning or not owning the lock. Really, this kind of check is bogus but it seems very widespread in the consumers code. So, for the moment, we cater to this untrusted behaviour until the consumers are fixed and the options can be removed (hopefully during the 8.0-CURRENT lifecycle)
    * Implementing KA_HELD and KA_UNHELD (not supported natively by WITNESS) made necessary the introduction of LA_MASKASSERT, which specifies the range for default lock assertion flags
    * In other respects, lockmgr_assert() follows exactly what other locking primitives offer for this operation.
  - Build real assertions for buffer cache locks on top of lockmgr_assert(). They can be used with the BUF_ASSERT_*(bp) paradigm.
  - Add checks at lock destruction time and use a cookie for verifying lock integrity at any operation.
  - Redefine BUF_LOCKFREE() in order to not use a direct assert but let it rely on the aforementioned destruction time check.
  The KPI is evidently broken, so a __FreeBSD_version bump and manpage update are necessary and will be committed soon.
  Side note: lockmgr_assert() will be used soon in order to implement real assertions in the vnode namespace, replacing the legacy and still bogus "VOP_ISLOCKED()" way.
  Tested by: kris (earlier version)
  Reviewed by: jhb
* [attilio, 2008-02-08; 1 file, -2/+2] Convert all explicit instances of VOP_ISLOCKED(arg, NULL) into VOP_ISLOCKED(arg, curthread). Now, VOP_ISLOCKED() and lockstatus() should only accept curthread as argument; this will lead to axing the additional argument from both functions, making the code cleaner.
  Reviewed by: jeff, kib
* [attilio, 2008-02-06; 1 file, -1/+1] td cannot be NULL in that place, so just axe out the check.
* [attilio, 2008-02-06; 1 file, -11/+46] Add WITNESS support to lockmgr locking primitive. This support tries to be as parallel as possible with other locking primitives, but there are differences; more specifically:
  - The base witness support is already equipped for allowing lock duplication acquisition as lockmgr relies on this.
  - In the case of lockmgr_disown() the lock is considered unlocked by witness even if it is still held by the "kernel context"
  - In the case of upgrading we can have 3 different situations:
    * Total unlocking of the shared lock and nothing else
    * Real witness upgrade if the owner is the first upgrader
    * Shared unlocking and exclusive locking if the owner is not the first upgrader but it is still allowed to upgrade
  - LK_DRAIN is basically handled like an exclusive acquisition
  Additionally, new options LK_NODUP and LK_NOWITNESS can now be used with lockinit(): LK_NOWITNESS disables WITNESS for the specified lock while LK_NODUP enables duplicated locks tracking. This will require a manpages update and a __FreeBSD_version bump (addressed by further commits).
  This patch also fixes a problem occurring if a lockmgr is held in exclusive mode and the same owner tries to acquire it in shared mode: currently there is a spurious shared locking acquisition while what we really want is a lock downgrade. Probably, this situation can be better served by failing with an EDEADLK errno return.
  Side note: first testing on this patch already revealed several LORs, so please expect LOR cascades until resolved. NTFS is also reported broken by the WITNESS introduction. BTW, NTFS is exposing a lock leak which needs to be fixed, and this patch can help it out if rightly tweaked.
  Tested by: kris, yar, Scot Hetzel <swhetzel at gmail dot com>
* [attilio, 2008-01-24; 1 file, -24/+7] Cleanup lockmgr interface and exported KPI:
  - Remove the "thread" argument from the lockmgr() function as it is always curthread now
  - Axe lockcount() function as it is no longer used
  - Axe LOCKMGR_ASSERT() as it is really bogus and not currently used. Hopefully this will soon be replaced by something suitable.
  - Remove the prototype for dumplockinfo() as the function is no longer present
  Additionally:
  - Introduce a KASSERT() in lockstatus() in order to let it accept only curthread or NULL, as only those should be passed
  - Do a little bit of style(9) cleanup on lockmgr.h
  The KPI is heavily broken by this change, so manpages and FreeBSD_version will be modified accordingly by further commits.
  Tested by: matteo
* [attilio, 2008-01-11; 1 file, -3/+6] The lockmgr() function returns successfully when called during a panic, but it won't actually lock anything. This can lead some paths to reach lockmgr_disown() with an inconsistent lock, which triggers the related assertions. Fix those assertions so that they recognize the panic situation and do not trigger.
  Reported by: pho
  Submitted by: kib
* [attilio, 2008-01-09; 1 file, -2/+2] Fix a last second typo about recent lockmgr_disown() introduction.
* [attilio, 2008-01-08; 1 file, -23/+42] Remove explicit calling of lockmgr() with the NULL argument. Now, the lockmgr() function can only be called passing curthread and the KASSERT() is upgraded accordingly. In order to support on-the-fly owner switching, the new function lockmgr_disown() has been introduced and is used in BUF_KERNPROC(). The KPI has therefore changed and the FreeBSD version will be bumped soon. Unlike the previous code, we assume the idle thread cannot try to acquire the lockmgr as it cannot sleep, so drop the related check [1] in BUF_KERNPROC().
  Tested by: kris
  [1] kib asked for a KASSERT in lockmgr_disown() about this condition, but after thinking about it, as this is a well known general rule, I found it not really necessary.
* [attilio, 2007-12-28; 1 file, -13/+0] Trim out the now unused option LK_EXCLUPGRADE from the lockmgr namespace. This option just adds complexity and the new implementation will no longer support it, so axing it now that it is unused is probably the better idea.
  FreeBSD version is bumped in order to reflect the KPI breakage introduced by this patch.
  In the ports tree, kris found that only old OSKit code uses it, but as it is thought to work only on the 2.x kernel series, version bumping will solve any problem.
* [attilio, 2007-12-27; 1 file, -1/+9] In order to avoid a huge class of deadlocks (in particular in interactions with the interlock), the owner of the lock should only be curthread or at least, for its limited usage, NULL, which identifies LK_KERNPROC. The thread "extra argument" for the lockmgr interface is going to be removed in the near future, but for the moment, just let the kernel run for some days with this check on in order to find potential deadlocking places around the kernel and fix them.
* [rwatson, 2007-12-01; 1 file, -1/+1] Modify stack(9) stack_print() and stack_sbuf_print() routines to use new linker interfaces for looking up function names and offsets from instruction pointers. Create two variants of each call: one that is "DDB-safe" and avoids locking in the linker, and one that is safe for use in live kernels, by virtue of observing locking, and in particular safe when kernel modules are being loaded and unloaded simultaneous to their use. This will allow them to be used outside of debugging contexts.
  Modify two of three current stack(9) consumers to use the DDB-safe interfaces, as they run in low-level debugging contexts, such as inside lockmgr(9) and the kernel memory allocator.
  Update man page.
* [attilio, 2007-11-24; 1 file, -28/+0] transferlockers() is a very dangerous and hackish function, as waiters should never be moved from one lock to another. As, luckily, nothing in our tree is using it, axe the function.
  This breaks the lockmgr KPI, so interested third-party modules should update their source code with an appropriate replacement.
  Ok'ed by: ups, rwatson
  MFC after: 3 days
* [attilio, 2007-11-18; 1 file, -0/+9] Expand lock class with the "virtual" function lc_assert, which will offer a unified way for all the lock primitives to express lock assertions. Currently, lockmgrs and rmlocks don't have assertions, so just panic in that case. This will be a base for more callout improvements.
  Ok'ed by: jhb, jeff
* [julian, 2007-11-14; 1 file, -1/+1] generally we are interested in what thread did something as opposed to what process. Since threads by default have the name of the process unless over-written with more useful information, just print the thread name instead.
* [jhb, 2007-05-18; 1 file, -2/+1] Move lock_profile_object_{init,destroy}() into lock_{init,destroy}().
* [jhb, 2007-03-30; 1 file, -7/+11]
  - Use lock_init/lock_destroy() to setup the lock_object inside of lockmgr. We can now use LOCK_CLASS() as a stronger check in lockmgr_chain() as a result. This required putting back lk_flags as lockmgr's use of flags conflicted with other flags in lo_flags otherwise.
  - Tweak 'show lock' output for lockmgr to match sx, rw, and mtx.
* [jhb, 2007-03-21; 1 file, -1/+1] Rename the 'mtx_object', 'rw_object', and 'sx_object' members of mutexes, rwlocks, and sx locks to 'lock_object'.
* [jhb, 2007-03-21; 1 file, -3/+16] Handle the case when a thread is blocked on a lockmgr lock with LK_DRAIN in DDB's 'show sleepchain'.
  MFC after: 3 days
* [jhb, 2007-03-09; 1 file, -3/+19] Add two new function pointers 'lc_lock' and 'lc_unlock' to lock classes. These functions are intended to be used to drop a lock and then reacquire it when doing a sleep such as msleep(9). Both functions accept a 'struct lock_object *' as their first parameter. The 'lc_unlock' function returns an integer that is then passed as the second parameter to the subsequent 'lc_lock' function. This can be used to communicate state. For example, sx locks and rwlocks use this to indicate if the lock was share/read locked vs exclusive/write locked.
  Currently, spin mutexes and lockmgr locks do not provide working lc_lock and lc_unlock functions.
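The hand-off between 'lc_unlock' and 'lc_lock' described above can be sketched with a toy lock class. The structure layouts, the state encoding, and every `toy_*` name are hypothetical; only the idea that the integer returned by the unlock hook is passed back to the lock hook comes from the entry.

```c
struct toy_lock_object;

/* Hypothetical lock class holding the two hooks. */
struct toy_lock_class {
    void (*lc_lock)(struct toy_lock_object *lock, int how);
    int (*lc_unlock)(struct toy_lock_object *lock);
};

struct toy_lock_object {
    const struct toy_lock_class *lo_class;
    int lo_state;   /* 0 = free, -1 = exclusive, >0 = number of sharers */
};

/* Reacquire in the mode recorded by the matching unlock. */
static void
toy_sx_lock(struct toy_lock_object *lock, int how)
{
    if (how)
        lock->lo_state = -1;    /* was exclusive */
    else
        lock->lo_state++;       /* was shared */
}

/* Drop the lock; return 1 if it was held exclusively, 0 if shared. */
static int
toy_sx_unlock(struct toy_lock_object *lock)
{
    if (lock->lo_state == -1) {
        lock->lo_state = 0;
        return (1);
    }
    lock->lo_state--;
    return (0);
}

const struct toy_lock_class toy_sx_class = { toy_sx_lock, toy_sx_unlock };

/* Sleep-like helper: drop the lock, "sleep", take it back in the same mode. */
void
toy_sleep_with_lock(struct toy_lock_object *lock)
{
    int how = lock->lo_class->lc_unlock(lock);

    /* ... the caller would sleep here ... */
    lock->lo_class->lc_lock(lock, how);
}
```

The helper never needs to know which kind of lock it was given; the class hooks carry the mode across the sleep, which is the point of returning state from the unlock side.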