| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
Microoptimize locking primitives by avoiding unnecessary atomic ops.
Inline version of primitives do an atomic op and if it fails they fallback to
actual primitives, which immediately retry the atomic op.
The obvious optimisation is to check if the lock is free and only then proceed
to do an atomic op.
|
|
|
|
| |
sys/kern: spelling fixes in comments.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
for real this time
r272315
Explicitly return None for negative event indices. Prior to this,
eventat(-1) would return the next-to-last event causing the back button
to cycle back to the end of an event source instead of stopping at the
start.
r272757
Add schedgraph traces for callout handlers. Specifically, a callwheel logs
a running event each time it executes a callout function. The event
includes the function pointer, argument, and whether or not it was run from
hardware interrupt context. The callwheel is marked idle when each handler
completes. This effectively logs the duration of each callout routine in
the graph.
r274091
Bind Ctrl-Q as a global hotkey to exit. Bind Ctrl-W as a hotkey to close
dialogs.
r274902
Add a new thread state "spinning" to schedgraph and add tracepoints at the
start and stop of spinning waits in lock primitives.
Reviewed by: jhb
|
|
|
|
| |
Submitted by: dhw and Thomas Mueller <tmueller@sysgo.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
r272315
Explicitly return None for negative event indices. Prior to this,
eventat(-1) would return the next-to-last event causing the back button
to cycle back to the end of an event source instead of stopping at the
start.
r272757
Add schedgraph traces for callout handlers. Specifically, a callwheel logs
a running event each time it executes a callout function. The event
includes the function pointer, argument, and whether or not it was run from
hardware interrupt context. The callwheel is marked idle when each handler
completes. This effectively logs the duration of each callout routine in
the graph.
r274091
Bind Ctrl-Q as a global hotkey to exit. Bind Ctrl-W as a hotkey to close
dialogs.
r274902
Add a new thread state "spinning" to schedgraph and add tracepoints at the
start and stop of spinning waits in lock primitives.
Reviewed by: jhb
|
|
|
|
| |
Do not try to dereference thread pointer when the value is not a pointer.
|
|
|
|
|
|
|
|
| |
Fix two issues with lockmgr(9) LK_CAN_SHARE() test, related
to the exclusive locker starvation.
MFC r273986:
Fix the build with ADAPTIVE_LOCKMGRS kernel option.
|
|
|
|
|
|
| |
auto-promotion of shared to exclusive.
Approved by: re (gjb)
|
|
|
|
|
|
|
|
|
|
|
| |
atomically upgrade shared lock to exclusive. On failure, error is
returned and lock is not dropped in the process.
Tested by: pho (previous version)
No objections from: attilio
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Approved by: re (glebius)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
current lock classes KPI it was really difficult because there was no
way to pass an rmtracker object to the lock/unlock routines. In order
to accomplish the task, modify the aforementioned functions so that
they can return (or pass as argument) an uinptr_t, which is in the rm
case used to hold a pointer to struct rm_priotracker for current
thread. As an added bonus, this fixes rm_sleep() in the rm shared
case, which right now can communicate priotracker structure between
lc_unlock()/lc_lock().
Suggested by: jhb
Reviewed by: jhb
Approved by: re (delphij)
|
|
|
|
|
|
|
|
| |
- Call lock_init() first before setting any lock_object fields in
lock init routines. This way if the machine panics due to a duplicate
init the lock's original state is preserved.
- Somewhat similarly, don't decrement td_locks and td_slocks until after
an unlock operation has completed successfully.
|
|
|
|
|
|
| |
of a lock within a single thread.
- Fix handling of interlocks in WITNESS by properly requiring the interlock
to be held exactly once if it is specified.
|
|
|
|
|
|
|
|
|
| |
locks. To support this, VNODE locks are created with the LK_IS_VNODE
flag. This flag is propagated down using the LO_IS_VNODE flag.
Note that WITNESS still records the LOR. Only the printing and the
optional entering into the kernel debugger is bypassed with the
WITNESS_NO_VNODE option.
|
|
|
|
|
|
|
|
| |
requests for LK_NOSHARE locks, just like for shared locks.
PR: kern/174969
Reviewed by: attilio
MFC after: 1 week
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
interrupt context can still be idlethread. At that point, without the
panic condition, it can still happen that idlethread then will try to
acquire some locks to carry on some operations.
Skip the idlethread check on block/sleep lock operations when KDB is
active.
Reported by: jh
Tested by: jh
MFC after: 1 week
|
|
|
|
|
|
|
| |
also in !debugging kernel rather than having "undefined" behaviour.
Tested by: avg
MFC after: 1 week
|
|
|
|
|
|
|
|
|
|
|
|
| |
Idle threads are not allowed to acquire any lock but spinlocks.
Deny any attempt to do so by panicing at the locking operation
when INVARIANTS is on. Then, remove the check on blocking on a
turnstile.
The check in sleepqueues is left because they are not allowed to use
tsleep() either which could happen still.
Reviewed by: bde, jhb, kib
MFC after: 1 week
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
New kernel events can be added at various location for sampling or counting.
This will for example allow easy system profiling whatever the processor is
with known tools like pmcstat(8).
Simultaneous usage of software PMC and hardware PMC is possible, for example
looking at the lock acquire failure, page fault while sampling on
instructions.
Sponsored by: NETASQ
MFC after: 1 month
|
|
|
|
|
|
|
| |
This is useful because the message can end up in system logs in
non-debugging operation.
Reviewed by: attilio (earlier version)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Historical behavior of letting other CPUs merily go on is a default for
time being. The new behavior can be switched on via
kern.stop_scheduler_on_panic tunable and sysctl.
Stopping of the CPUs has (at least) the following benefits:
- more of the system state at panic time is preserved intact
- threads and interrupts do not interfere with dumping of the system
state
Only one thread runs uninterrupted after panic if stop_scheduler_on_panic
is set. That thread might call code that is also used in normal context
and that code might use locks to prevent concurrent execution of certain
parts. Those locks might be held by the stopped threads and would never
be released. To work around this issue, it was decided that instead of
explicit checks for panic context, we would rather put those checks
inside the locking primitives.
This change has substantial portions written and re-written by attilio
and kib at various times. Other changes are heavily based on the ideas
and patches submitted by jhb and mdf. bde has provided many insights
into the details and history of the current code.
The new behavior may cause problems for systems that use a USB keyboard
for interfacing with system console. This is because of some unusual
locking patterns in the ukbd code which have to be used because on one
hand ukbd is below syscons, but on the other hand it has to interface
with other usb code that uses regular mutexes/Giant for its concurrency
protection. Dumping to USB-connected disks may also be affected.
PR: amd64/139614 (at least)
In cooperation with: attilio, jhb, kib, mdf
Discussed with: arch@, bde
Tested by: Eugene Grosbein <eugen@grosbein.net>,
gnn,
Steven Hartland <killing@multiplay.co.uk>,
glebius,
Andrew Boyer <aboyer@averesystems.com>
(various versions of the patch)
MFC after: 3 months (or never)
|
|
|
|
|
|
|
| |
This enables locking consumers to pass their own structures around as const and
be able to assert locks embedded into those structures.
Reviewed by: ed, kib, jhb
|
|
|
|
|
|
| |
The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.
|
|
|
|
|
|
|
|
|
|
|
| |
LK_DOWNGRADE lock ops. Namely, the ops should be NOP since LK_NOSHARE
locks are always exclusive.
Reported by: rmacklem
Reviewed by: attilio
Tested by: pho
Approved by: re (kensmith)
MFC after: 1 week
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PMC/SYSV/...).
No FreeBSD version bump, the userland application to query the features will
be committed last and can serve as an indication of the availablility if
needed.
Sponsored by: Google Summer of Code 2010
Submitted by: kibab
Reviewed by: arch@ (parts by rwatson, trasz, jhb)
X-MFC after: to be determined in last commit with code from this project
|
|
|
|
|
|
| |
it internally contain nested includes.
Reviewed by: bde
|
|
|
|
|
|
|
|
|
|
|
|
| |
LK_CANRECURSE after a lock is created. Use them to implement macros that
otherwise manipulated the flags directly. Assert that the associated
lockmgr lock is exclusively locked by the current thread when manipulating
these flags to ensure the flag updates are safe. This last change required
some minor shuffling in a few filesystems to exclusively lock a brand new
vnode slightly earlier.
Reviewed by: kib
MFC after: 3 days
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
sleeps/timeout may have left spourious lk_exslpfail counts on, so clean
it up even when accessing a shared queue acquisition, giving to
lk_exslpfail the value of 'upper limit'.
In the worst case scenario, infact (mixed
interruptible sleep / LK_SLEEPFAIL waiters) what may happen is that both
queues are awaken even if that's not necessary, but still no harm.
Reported by: Lucius Windschuh <lwindschuh at googlemail dot com>
Reviewed by: kib
Tested by: pho, Lucius Windschuh <lwindschuh at googlemail dot com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
sxlock, via the sx_{s, x}lock_sig() interface, or plain lockmgr), will
leave the waiters flag on forcing the owner to do a wakeup even when if
the waiter queue is empty.
That operation may lead to a deadlock in the case of doing a fake wakeup
on the "preferred" (based on the wakeup algorithm) queue while the other
queue has real waiters on it, because nobody is going to wakeup the 2nd
queue waiters and they will sleep indefinitively.
A similar bug, is present, for lockmgr in the case the waiters are
sleeping with LK_SLEEPFAIL on. In this case, even if the waiters queue
is not empty, the waiters won't progress after being awake but they will
just fail, still not taking care of the 2nd queue waiters (as instead the
lock owned doing the wakeup would expect).
In order to fix this bug in a cheap way (without adding too much locking
and complicating too much the semantic) add a sleepqueue interface which
does report the actual number of waiters on a specified queue of a
waitchannel (sleepq_sleepcnt()) and use it in order to determine if the
exclusive waiters (or shared waiters) are actually present on the lockmgr
(or sx) before to give them precedence in the wakeup algorithm.
This fix alone, however doesn't solve the LK_SLEEPFAIL bug. In order to
cope with it, add the tracking of how many exclusive LK_SLEEPFAIL waiters
a lockmgr has and if all the waiters on the exclusive waiters queue are
LK_SLEEPFAIL just wake both queues.
The sleepq_sleepcnt() introduction and ABI breakage require
__FreeBSD_version bumping.
Reported by: avg, kib, pho
Reviewed by: kib
Tested by: pho
|
|
|
|
|
| |
Requested by: kib
MFC: 3 days
|
|
|
|
|
|
|
| |
barrier in order to avoid, on architectures which doesn't have strong
ordered writes, CPU instructions reordering.
Diagnosed by: fabio
|
|
|
|
| |
know better than to commit with a cat in the area.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the lockmgr support:
- GIANT_RESTORE() is just called when the sleep finishes, so the current
code can ends up into a giant unlock problem. Fix it by appropriately
call GIANT_RESTORE() when needed. Note that this is not exactly ideal
because for any interation of the adaptive spinning we drop and restore
Giant, but the overhead should be not a factor.
- In the lock held in exclusive mode case, after the adaptive spinning is
brought to completition, we should just retry to acquire the lock
instead to fallthrough. Fix that.
- Fix a style nit
In the sx support:
- Call GIANT_SAVE() before than looping. This saves some overhead because
in the current code GIANT_SAVE() is called several times.
Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
a pointer-fetching specific operation check. Consequently, rename the
operation ASSERT_ATOMIC_LOAD_PTR().
* Fix the implementation of ASSERT_ATOMIC_LOAD_PTR() by checking
directly alignment on the word boundry, for all the given specific
architectures. That's a bit too strict for some common case, but it
assures safety.
* Add a comment explaining the scope of the macro
* Add a new stub in the lockmgr specific implementation
Tested by: marcel (initial version), marius
Reviewed by: rwatson, jhb (comment specific review)
Approved by: re (kib)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Actually, as it did receive few tuning, the support is disabled by
default, but it can opt-in with the option ADAPTIVE_LOCKMGRS.
Due to the nature of lockmgrs, adaptive spinning needs to be
selectively enabled for any interested lockmgr.
The support is bi-directional, or, in other ways, it will work in both
cases if the lock is held in read or write way. In particular, the
read path is passible of further tunning using the sysctls
debug.lockmgr.retries and debug.lockmgr.loops . Ideally, such sysctls
should be axed or compiled out before release.
Addictionally note that adaptive spinning doesn't cope well with
LK_SLEEPFAIL. The reason is that many (and probabilly all) consumers
of LK_SLEEPFAIL are mainly interested in knowing if the interlock was
dropped or not in order to reacquire it and re-test initial conditions.
This directly interacts with adaptive spinning because lockmgr needs
to drop the interlock while spinning in order to avoid a deadlock
(further details in the comments inside the patch).
Final note: finding someone willing to help on tuning this with
relevant workloads would be either very important and appreciated.
Tested by: jeff, pho
Requested by: many
|
|
|
|
|
|
| |
instead the lock own flag itself.
Tested by: pho
|
|
|
|
|
|
|
|
|
|
| |
adds probes for mutexes, reader/writer and shared/exclusive locks to
gather contention statistics and other locking information for
dtrace scripts, the lockstat(1M) command and other potential
consumers.
Reviewed by: attilio jhb jb
Approved by: gnn (mentor)
|
|
|
|
|
| |
Found with: Coverity Prevent(tm)
CID: 3919
|
| |
|
|
|
|
|
|
| |
LK_KERNPROC.
Discussed with: attilio
|
|
|
|
|
|
|
|
| |
- Align the fifo output in fifo_print() with other vn_printf() output.
- Remove the leading space from lockmgr_printinfo() so its output lines up
in vn_printf().
- lockmgr_printinfo() now ends with a newline, so remove an extra newline
from vn_printf().
|
|
|
|
|
|
|
|
| |
of spurious witness warnings since lockmgr grew witness support. Before
this, every time you passed an interlock to a lockmgr lock WITNESS treated
it as a LOR.
Reviewed by: attilio
|
|
|
|
|
| |
What we really want is an inclusive or of all the requests, and += can
in theory roll over to 0.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
routine wakes up proc0 so that proc0 can swap the thread back in.
Historically, this has been done by waking up proc0 directly from
setrunnable() itself via a wakeup(). When waking up a sleeping thread
that was swapped out (the usual case when waking proc0 since only sleeping
threads are eligible to be swapped out), this resulted in a bit of
recursion (e.g. wakeup() -> setrunnable() -> wakeup()).
With sleep queues having separate locks in 6.x and later, this caused a
spin lock LOR (sleepq lock -> sched_lock/thread lock -> sleepq lock).
An attempt was made to fix this in 7.0 by making the proc0 wakeup use
the ithread mechanism for doing the wakeup. However, this required
grabbing proc0's thread lock to perform the wakeup. If proc0 was asleep
elsewhere in the kernel (e.g. waiting for disk I/O), then this degenerated
into the same LOR since the thread lock would be some other sleepq lock.
Fix this by deferring the wakeup of the swapper until after the sleepq
lock held by the upper layer has been locked. The setrunnable() routine
now returns a boolean value to indicate whether or not proc0 needs to be
woken up. The end result is that consumers of the sleepq API such as
*sleep/wakeup, condition variables, sx locks, and lockmgr, have to wakeup
proc0 if they get a non-zero return value from sleepq_abort(),
sleepq_broadcast(), or sleepq_signal().
Discussed with: jeff
Glanced at by: sam
Tested by: Jurgen Weber jurgen - ish com au
MFC after: 2 weeks
|
| |
|
| |
|
|
|
|
| |
exclusive locking operation.
|
|
|
|
|
|
|
|
| |
the only one difference is that lockmgr*() functions now accept
LK_NOWITNESS flag which skips ordering for the instanced calling.
- Remove an unuseful stub in witness_checkorder() (because the above check
doesn't allow ever happening) and allow witness_upgrade() to accept
non-try operation too.
|