path: root/sys/kern/subr_witness.c
Commit log (each entry shows author, date, files changed, and lines -deleted/+added):
* Add an order between UDP inpcb locks and the IPv4 multicast address
  list lock, as there has been a report that an alternative lock order is
  getting introduced.  This should help ferret it out.
  Reported by: Ed Maste <emaste at phaedrus dot sandvine dot ca>
  (rwatson, 2005-08-09; 1 file, -1/+2)
* Introduce in_multi_mtx, which will protect IPv4-layer multicast address
  lists, as well as accessor macros.  For now, this is a recursive mutex
  due to code sequences where IPv4 multicast calls into IGMP, which calls
  into ip_output(), which then tests for a multicast forwarding case.
  For support macros in in_var.h to check multicast address lists, assert
  that in_multi_mtx is held.
  Acquire in_multi_mtx around iteration over the IPv4 multicast address
  lists, such as in ip_input() and ip_output().
  Acquire in_multi_mtx when manipulating the IPv4-layer multicast
  addresses, as well as over the manipulation of ifnet multicast address
  lists, in order to keep the two layers in sync.
  Lock down accesses to IPv4 multicast addresses in IGMP, or assert the
  lock when performing IGMP join/leave events.
  Eliminate spl's associated with IPv4 multicast addresses in portions of
  IGMP that weren't previously expunged by IGMP locking.
  Add in_multi_mtx, igmp_mtx, and if_addr_mtx to the hard-coded lock
  order in WITNESS, in that order (a sketch of how such a group is
  encoded follows this entry).
  Problem reported by: Ed Maste <emaste at phaedrus dot sandvine dot ca>
  MFC after: 10 days
  (rwatson, 2005-08-03; 1 file, -0/+7)
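  For context, WITNESS learns these hard-coded orderings from a static
  table in subr_witness.c that is walked pairwise, group by group.  The
  following user-space sketch is a simplified model of how such a group
  encodes its before/after edges; the struct layout and walk are
  assumptions for illustration, not the kernel source:

      #include <stdio.h>

      /* Simplified stand-in for subr_witness.c's order-list entries. */
      struct order_entry {
              const char *name;       /* lock name; NULL ends the group */
      };

      /* The group added by this commit: each lock must be acquired
       * before the one that follows it. */
      static struct order_entry ipv4_mcast_order[] = {
              { "in_multi_mtx" },
              { "igmp_mtx" },
              { "if_addr_mtx" },
              { NULL },
      };

      int
      main(void)
      {
              struct order_entry *e;

              /* Each adjacent pair is one enforced ordering edge. */
              for (e = ipv4_mcast_order; (e + 1)->name != NULL; e++)
                      printf("%s before %s\n", e->name, (e + 1)->name);
              return (0);
      }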
* After some input from bde@ and rereading the datasheet, use a MTX_SPIN
  mutex instead of a MTX_DEF one in order to defer preemption while
  reading the date and time registers.  If we don't manage to read them
  within the time slot where we are guaranteed that no updates occur, we
  might actually read them during an update, in which case the output is
  undefined.
  (marius, 2005-06-04; 1 file, -0/+1)
* - Define the real lock order with cdev and a few vm/vfs related locks.
    This can be removed once cdev no longer calls free() with the cdev
    lock held.
  (jeff, 2005-04-22; 1 file, -1/+3)
* - Check LO_DUPOK as well as LOP_DUPOK when determining whether we
    should warn about duplicate acquires.
  Sponsored by: Isilon Systems, Inc.
  (jeff, 2005-04-22; 1 file, -1/+2)
* The latest release of the FreeBSD driver (twa) for 3ware's 9xxx series
  controllers.  This corresponds to the 9.2 release (for FreeBSD 5.2.1)
  on the 3ware website.
  Highlights of this release are:
  1. The driver has been re-architected to use a "Common Layer" (all
     tw_cl* files), which is a consolidation of all OS-independent parts
     of the driver.  The FreeBSD OS specific portions of the driver go
     into an "OS Layer" (all tw_osl* files).  This re-architecture is to
     achieve better maintainability, consistency of behavior across
     OS's, and better portability to new OS's (drivers for new OS's can
     be written by just adding an OS Layer that's specific to the OS,
     complying with a "Common Layer Programming Interface" API).
  2. The driver takes advantage of multiple processors.
  3. The driver has a new firmware image bundled, the new features of
     which include Online Capacity Expansion and multi-lun support,
     among others.
  More details about 3ware's 9.2 release can be found here:
  http://www.3ware.com/download/Escalade9000Series/9.2/9.2_Release_Notes_Web.pdf
  Since the Common Layer is used across OS's, the FreeBSD specific
  include path for header files (/sys/dev/twa) is not part of the
  #include pre-processor directive in any of the source files.  For
  being able to integrate twa into the kernel despite this,
  Makefile.<arch> has been changed to add the include path to CFLAGS.
  Reviewed by: scottl
  (vkashyap, 2005-04-12; 1 file, -0/+5)
* CDEV lock should be before 'system map' lock.  Hardcode this order to
  help track down reported LOR.
  LOR reported by: Thierry Herbelot <thierry@herbelot.com>
  LOR info: http://sources.zabbadoz.net/freebsd/lor.html#080
  (pjd, 2005-04-09; 1 file, -0/+6)
* Add a missing terminator.
  Confirmed by: rwatson
  (pjd, 2005-04-09; 1 file, -0/+1)
* Document, via WITNESS, that the NFS server mutex falls ahead of the
  socket buffer mutexes.
  (rwatson, 2005-03-09; 1 file, -0/+5)
* When you call MiniportInitialize() for an 802.11 driver, it will at
  some point result in a status event being triggered (it should be a
  link down event: the Microsoft driver design guide says you should
  generate one when the NIC is initialized).  Some drivers generate the
  event during MiniportInitialize(), such that by the time
  MiniportInitialize() completes, the NIC is ready to go.  But some
  drivers, in particular the ones for Atheros wireless NICs, don't
  generate the event until after a device interrupt occurs at some point
  after MiniportInitialize() has completed.
  The gotcha is that you have to wait until the link status event occurs
  one way or the other before you try to fiddle with any settings (ssid,
  channel, etc...).  For the drivers that set the event synchronously
  this isn't a problem, but for the others we have to pause after calling
  ndis_init_nic() and wait for the event to arrive before continuing.
  Failing to wait can cause big trouble: on my SMP system, calling
  ndis_setstate_80211() after ndis_init_nic() completes, but _before_ the
  link event arrives, will lock up or reset the system.  What we do now
  is check to see if a link event arrived while ndis_init_nic() was
  running, and if it didn't we msleep() until it does (a sketch of this
  wait follows the entry).
  Along the way, I discovered a few other problems:
  - Deferred procedure calls run at PASSIVE_LEVEL, not DISPATCH_LEVEL.
    ntoskrnl_run_dpc() has been fixed accordingly.  (I read the
    documentation wrong.)
  - Similarly, the NDIS interrupt handler, which is essentially a DPC,
    also doesn't need to run at DISPATCH_LEVEL.  ndis_intrtask() has been
    fixed accordingly.
  - MiniportQueryInformation() and MiniportSetInformation() run at
    DISPATCH_LEVEL, and each request must complete before another can be
    submitted.  ndis_get_info() and ndis_set_info() have been fixed
    accordingly.
  - Turned the sleep lock that guards the NDIS thread job list into a
    spin lock.  We never do anything with this lock held except manage
    the job list (no other locks are held), so it's safe to do this, and
    it's possible that ndis_sched() and ndis_unsched() can be called from
    DISPATCH_LEVEL, so using a sleep lock here is semantically incorrect.
    Also updated subr_witness.c to add the lock to the order list.
  (wpaul, 2005-03-07; 1 file, -0/+1)
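  A user-space analogue of that wait (names are hypothetical, and the
  driver itself uses msleep()/wakeup() rather than pthreads): proceed
  only once the link-status event has been observed, sleeping if it has
  not yet arrived.

      #include <pthread.h>
      #include <stdio.h>

      static pthread_mutex_t evt_mtx = PTHREAD_MUTEX_INITIALIZER;
      static pthread_cond_t evt_cv = PTHREAD_COND_INITIALIZER;
      static int link_event_seen;     /* set by the status handler */

      /* Runs when the (emulated) NDIS status event fires. */
      static void
      status_event(void)
      {
              pthread_mutex_lock(&evt_mtx);
              link_event_seen = 1;
              pthread_cond_signal(&evt_cv);
              pthread_mutex_unlock(&evt_mtx);
      }

      /* Called after initialization: wait for the event if needed. */
      static void
      wait_for_link_event(void)
      {
              pthread_mutex_lock(&evt_mtx);
              while (!link_event_seen)  /* it may already have arrived */
                      pthread_cond_wait(&evt_cv, &evt_mtx);
              pthread_mutex_unlock(&evt_mtx);
      }

      static void *
      fake_interrupt(void *arg)
      {
              (void)arg;
              status_event();         /* "device interrupt" posts event */
              return (NULL);
      }

      int
      main(void)
      {
              pthread_t t;

              pthread_create(&t, NULL, fake_interrupt, NULL);
              wait_for_link_event();
              printf("link event seen; safe to set ssid/channel\n");
              pthread_join(t, NULL);
              return (0);
      }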
* When DDB is not defined, don't implement witness_thread_has_locks()
  and witness_proc_has_locks(), as they are unused, which results in a
  compiler error.  This problem was introduced with the implementation
  of "show alllocks".
  Spotted by: Artem Kuchin <matrix at itlegion dot ru>
  (rwatson, 2005-01-22; 1 file, -0/+2)
* - Up the WITNESS_COUNT macro from 200 to 1024 to support the growing
    number of lock types in the kernel.  This results in an increase of
    witness data usage from ~145k to ~280k on i386 for kernels with
    'options WITNESS'.
  - Remove the unused witness malloc bucket.
  Submitted by: Michal Mertl mime at traveller dot cz (1)
  (jhb, 2004-12-28; 1 file, -2/+1)
* Attempt to slightly refine the print out from "show alllocks" -- list
  the process and thread numbers/names on the same line rather than on
  separate lines, and print the thread pointer, not just the tid.
  (rwatson, 2004-12-27; 1 file, -2/+2)
* Add "show alllocks" command to DDB, which dumps a list of processesrwatson2004-12-261-0/+42
| | | | | | | | | | | | and threads currently holding sleep mutexes (and spin mutexes for curthread). This can be quite useful in looking for a lock condition summary for a system, as it avoids manually iterating through threads and processes to find all the interesting locks. NB: "alllocks" is up there with "lockedvnods" for a bad argument for show. MFC after: 2 weeks
* Clean up some tunables that should have been removed a while ago...
  (jmg, 2004-11-09; 1 file, -4/+0)
* Add entropy harvest mutex to hard-coded spin lock witness lock order,
  and remove previous entropy harvesting mutex names as they are no
  longer present.  Commit to this file was omitted when
  randomdev_soft.c:1.5 was made.
  Feet shot: Robert Huff <roberthuff at rcn dot com>
  (rwatson, 2004-10-11; 1 file, -2/+1)
* Don't "implicitly order all sleep locks before spin locks" in witnessgreen2004-10-091-1/+1
| | | | | | when the spin lock in question isn't -- it's the critical_enter() that KDB set. No more panic in DDB for console -> syscons -> tty -> knote operations.
* Hard code witness lock order for BPF locks.
  (rwatson, 2004-09-09; 1 file, -0/+7)
* Make witness its own sysctl branch instead of using _ to do this.  I
  have left the old tunables in to give people a few days to transition
  their loader.conf and sysctl.conf over to the new names.
  MFC after: 5 days
  (jmg, 2004-09-06; 1 file, -5/+10)
* Remove a potential deadlock on i386 SMP by changing the lazypmap ipi
  and spin-wait code to use the same spin mutex (smp_tlb_mtx) as the TLB
  ipi and spin-wait code snippets, so that you can't get into the
  situation of one CPU doing a TLB shootdown to another CPU that is
  doing a lazy pmap shootdown, each waiting on the other.  With this
  change, only one of the CPUs would do an IPI and spin-wait at a time.
  (jhb, 2004-08-04; 1 file, -1/+0)
* Add netatalk mutexes to hard-coded WITNESS lock order.
  (rwatson, 2004-07-25; 1 file, -0/+6)
* Update for the KDB framework:
  o Make debugging code conditional upon KDB instead of DDB.
  o s/WITNESS_DDB/WITNESS_KDB/g
  o s/witness_ddb/witness_kdb/g
  o Rename the debug.witness_ddb sysctl to debug.witness_kdb.
  o Call kdb_backtrace() instead of backtrace().
  o Call kdb_enter() instead of Debugger().
  o Assert kdb_active instead of db_active.
  (marcel, 2004-07-10; 1 file, -21/+22)
* Check the lock lists to see if they are empty directly rather than
  assigning a pointer to the list and then dereferencing the pointer as
  a second step.  When the first spin lock is acquired, curthread is not
  in a critical section, so it may be preempted and would end up using
  another CPU's lock list instead of its own (see the sketch after this
  entry).  When this code was in witness_lock() this sequence was safe,
  as curthread was in a critical section already since witness_lock() is
  called after the lock is acquired.
  Tested by: Daniel Lang dl at leo.org
  (jhb, 2004-07-09; 1 file, -9/+21)
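  The hazard, as a minimal model (the per-CPU data and the migration are
  simulated by hand; this is not the kernel code): caching a pointer to
  the current CPU's list and dereferencing it later can inspect another
  CPU's list if the thread migrates in between.

      #include <stdio.h>

      #define MAXCPU  4

      struct lock_list { int count; };    /* stand-in per-CPU lock list */

      static struct lock_list pcpu_spinlocks[MAXCPU];
      static int curcpu;                  /* CPU we currently run on */

      int
      main(void)
      {
              struct lock_list *ll;

              pcpu_spinlocks[1].count = 2;    /* CPU 1 holds spin locks */

              /* Unsafe two-step: take the pointer first... */
              ll = &pcpu_spinlocks[curcpu];
              curcpu = 1;     /* ...preemption migrates us to CPU 1... */
              /* ...and the dereference reads the old CPU's list. */
              printf("stale view: %d locks\n", ll->count);

              /* The fix: one direct access to the current CPU's data. */
              printf("fresh view: %d locks\n",
                  pcpu_spinlocks[curcpu].count);
              return (0);
      }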
* Introduce socket and UNIX domain socket locks into hard-coded lock
  order definition for witness.  Send lock before receive lock, and
  socket locks after accept but before select:
    filedesc -> accept -> so_snd -> so_rcv -> sellck
  All routing locks after send lock:
    so_rcv -> radix node head
  All protocol locks before socket locks:
    unp -> so_snd
    udp -> udpinp -> so_snd
    tcp -> tcpinp -> so_snd
  (rwatson, 2004-06-13; 1 file, -1/+8)
* - Comment out NULL, NULL barrier for Unix domain sockets section, as
    the double NULL entries signal Witness to stop processing the array
    of order entries, meaning none of the spin locks are added,
    resulting in panics on boot (see the model after this entry).
  - Add a missing NULL, NULL terminator to the Slip locks list to keep
    them separate from the spin locks.
  (jhb, 2004-06-03; 1 file, -1/+2)
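  Why the double NULL matters: witness walks the order table one group
  at a time, treating a single NULL entry as "end of group" and two
  consecutive NULLs as "end of table".  The model below (simplified, not
  the kernel loop) shows how an accidental empty group, like the
  commented-out placeholder described above, silently truncates
  everything after it, so the spin locks were never added:

      #include <stdio.h>

      struct order_entry { const char *name; };

      static struct order_entry table[] = {
              { "unp" }, { "so_snd" }, { NULL },  /* UNIX domain sockets */
              { NULL },   /* empty placeholder group: stray double NULL */
              { "slip_mtx" }, { "slip sc_mtx" }, { NULL },  /* SLIP */
              { "sched lock" }, { "clk" }, { NULL },        /* spin locks */
              { NULL },                           /* real end of table */
      };

      int
      main(void)
      {
              struct order_entry *e = table;

              /* Two NULLs in a row end the walk, so only the first
               * group above is ever registered. */
              while (e->name != NULL) {
                      printf("group:");
                      for (; e->name != NULL; e++)
                              printf(" %s", e->name);
                      printf("\n");
                      e++;            /* step past the group terminator */
              }
              return (0);
      }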
* Expand the hard-coded WITNESS lock order to include the following
  relationships:
    Sockets: filedesc -> accept -> sellck
    Routing: radix node head -> rtentry -> ifaddr
    UDP: udp -> udpinp
    TCP: tcp -> tcpinp
    SLIP: slip_mtx -> slip sc_mtx
  Drop in a place holder section for UNIX domain sockets.  Various
  sections to be expanded over the next few days.
  (rwatson, 2004-06-02; 1 file, -0/+35)
* Emit a traceback when witness_trace is set and witness_warn() is
  called and triggers (typically caused by sleeping with a non-sleepable
  lock held).
  Reviewed by: jhb
  (alfred, 2004-03-23; 1 file, -0/+2)
* Add an implementation of a generic sleep queue abstraction that is
  used to queue threads sleeping on a wait channel, similar to how
  turnstiles are used to queue threads waiting for a lock.  This
  subsystem will be used as the backend for sleep/wakeup and condition
  variables initially.  Eventually it will also be used to replace the
  ithread-specific iwait thread inhibitor.
  Sleep queues are also not locked by sched_lock, so this splits
  sched_lock up a bit further, increasing concurrency within the
  scheduler.  Sleep queues also natively support timeouts on sleeps and
  interruptible sleeps, allowing for the reduction of a lot of
  duplicated code between the sleep/wakeup and condition variable
  implementations.  For more details on the sleep queue implementation,
  check the comments in sys/sleepqueue.h and kern/subr_sleepqueue.c.
  (jhb, 2004-02-27; 1 file, -0/+1)
* Remove a bogus assertion.
  Noticed by: bde
  Pointy hat to: jhb
  (jhb, 2004-02-03; 1 file, -1/+0)
* - Assert that witness_cold is not true in enroll().
  - Only check witness_watch once in enroll().
  Reported by: ru (2)
  (jhb, 2004-02-02; 1 file, -1/+2)
* Rework witness_lock() to make it slightly more useful and flexible.
  - witness_lock() is split into two pieces: witness_checkorder() and
    witness_lock().  Witness_checkorder() determines if acquiring a
    specified lock at the time it is called would result in a lock order
    violation.  It optionally adds a new lock order relationship as
    well.  witness_lock() updates witness's data structures to assume
    that a lock has been acquired by sticking a new lock instance in the
    appropriate lock instance list.
  - The mutex and sx lock functions now call checkorder() prior to
    trying to acquire a lock and continue to call witness_lock() after
    the acquire is completed.  This will let witness catch a deadlock
    before it happens rather than trying to do so after the threads have
    deadlocked (i.e. never actually report it).  (A sketch of the new
    acquire path follows this entry.)
  - A new function witness_defineorder() has been added that adds a lock
    order between two locks at runtime without having to acquire the
    locks.  If the lock order cannot be added it will return an error.
    This function is available to programmers via the
    WITNESS_DEFINEORDER() macro, which accepts either two mutexes or two
    sx locks as its arguments.
  - A few simple wrapper macros were added to allow developers to call
    witness_checkorder() anywhere as a way of enforcing locking
    assertions in code that might acquire a certain lock in some
    situations.  The macros are:
    witness_check_{mutex,shared_sx,exclusive_sx} and take an appropriate
    lock as the sole argument.
  - The code to remove a lock instance from a lock list in
    witness_unlock() was unnested by using a goto to vastly improve the
    readability of this function.
  (jhb, 2004-01-28; 1 file, -108/+180)
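  The shape of the new acquire path, as a self-contained model: stub
  types and printfs stand in for the real kernel structures, and the
  real functions additionally take flags and file/line information.

      #include <stdio.h>

      struct lock_object { const char *lo_name; };
      struct mtx { struct lock_object lock_object; };

      static void
      witness_checkorder_model(struct lock_object *lo)
      {
              /* Real code: walk the order graph and complain if
               * acquiring lo here would create a violation. */
              printf("checkorder: %s\n", lo->lo_name);
      }

      static void
      witness_lock_model(struct lock_object *lo)
      {
              /* Real code: record a new lock instance on the
               * thread's lock list. */
              printf("recorded:   %s\n", lo->lo_name);
      }

      static void
      mtx_lock_model(struct mtx *m)
      {
              /* Order is checked BEFORE the acquire, so a would-be
               * deadlock can still be reported... */
              witness_checkorder_model(&m->lock_object);
              /* ...the actual atomic acquire would happen here... */
              witness_lock_model(&m->lock_object);
              /* ...and bookkeeping happens AFTER it succeeds. */
      }

      int
      main(void)
      {
              struct mtx m = { { "demo mutex" } };

              mtx_lock_model(&m);
              return (0);
      }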
* Register the uart(4)'s spin lock with witness(4).
  (ru, 2004-01-25; 1 file, -0/+1)
* Fix a major faux pas of mine.  I was causing 2 very bad things to
  happen in interrupt context:
  1) sleep locks, and
  2) malloc/free calls.
  1) is fixed by using spin locks instead.  2) is fixed by preallocating
  a FIFO (implemented with a STAILQ) and using elements from this FIFO
  instead.  This turns out to be rather fast.
  OK'ed by: re (scottl)
  Thanks to: peter, jhb, rwatson, jake
  Apologies to: *
  (markm, 2003-11-20; 1 file, -0/+2)
* Initial landing of SMP support for FreeBSD/amd64.
  - This is heavily derived from John Baldwin's apic/pci cleanup on
    i386.
  - I have completely rewritten or drastically cleaned up some other
    parts (in particular, bootstrap).
  - This is still a WIP.  It seems that there are some highly bogus
    bioses on nVidia nForce3-150 boards.  I can't stress how broken
    these boards are.  I have a workaround in mind, but right now the
    Asus SK8N is broken.  The Gigabyte K8NPro (nVidia based) is also
    mind-numbingly hosed.
  - Most of my testing has been with SCHED_ULE.  SCHED_4BSD works.
  - The apic and acpi components are 'standard'.
  - If you have an nVidia nForce3-150 board, you are stuck with 'device
    atpic' in addition, because they somehow managed to forget to
    connect the 8254 timer to the apic, even though it's in the same
    silicon!  ARGH!  This directly violates the ACPI spec.
  (peter, 2003-11-17; 1 file, -1/+1)
* Localized the cy driver's locking.
  (bde, 2003-11-16; 1 file, -3/+0)
* Add an implementation of turnstiles and change the sleep mutex code to
  use turnstiles to implement blocking instead of implementing a thread
  queue directly.  These turnstiles are somewhat similar to those used
  in Solaris 7 as described in Solaris Internals but are also different.
  Turnstiles do not come out of a fixed-sized pool.  Rather, each thread
  is assigned a turnstile when it is created, which it frees when it is
  destroyed.  When a thread blocks on a lock, it donates its turnstile
  to that lock to serve as a queue of blocked threads.  The queue
  associated with a given lock is found by a lookup in a simple hash
  table (sketched after this entry).  The turnstile itself is protected
  by a lock associated with its entry in the hash table.  This means
  that sched_lock is no longer needed to contest on a mutex.  Instead,
  sched_lock is only used when manipulating run queues or thread
  priorities.  Turnstiles also implement priority propagation
  inherently.
  Currently turnstiles only support mutexes.  Eventually, however,
  turnstiles may grow two queues to support a non-sleepable
  reader/writer lock implementation.  For more details, see the comments
  in sys/turnstile.h and kern/subr_turnstile.c.
  The two primary advantages from the turnstile code include: 1) the
  size of struct mutex shrinks by four pointers as it no longer stores
  the thread queue linkages directly, and 2) less contention on
  sched_lock in SMP systems, including the ability for multiple CPUs to
  contend on different locks simultaneously (not that this last detail
  is necessarily that much of a big win).
  Note that 1) means that this commit is a kernel ABI breaker, so don't
  mix old modules with a new kernel and vice versa.
  Tested on: i386 SMP, sparc64 SMP, alpha SMP
  (jhb, 2003-11-11; 1 file, -3/+3)
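  How a contested lock finds its queue, per the description above: hash
  the lock's address into a table of chains.  A minimal sketch, where
  the table size and shift are illustrative guesses rather than the
  kernel's exact constants:

      #include <stdint.h>
      #include <stdio.h>

      #define TC_TABLESIZE    128     /* must be a power of two */
      #define TC_MASK         (TC_TABLESIZE - 1)
      #define TC_SHIFT        8       /* skip low bits, mostly zero
                                         due to alignment */
      #define TC_HASH(lock)   ((((uintptr_t)(lock)) >> TC_SHIFT) & TC_MASK)

      int
      main(void)
      {
              int some_lock;          /* any object's address will do */

              printf("lock %p -> chain %ju\n", (void *)&some_lock,
                  (uintmax_t)TC_HASH(&some_lock));
              return (0);
      }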
* Update spin lock order list for new i386 interrupt and SMP code.
  (jhb, 2003-11-03; 1 file, -3/+2)
* Change all SYSCTLs which are read-only and have a related TUNABLE from
  CTLFLAG_RD to CTLFLAG_RDTUN so that sysctl(8) can provide more useful
  error messages.
  (silby, 2003-10-21; 1 file, -1/+1)
* Add fast swi taskqueue spinlock to the order_list so witness doesn't
  complain.
  Submitted by: Tor Egge <Tor.Egge@cvsup.no.freebsd.org>
  (sam, 2003-09-06; 1 file, -0/+1)
* Insert cosmetic spaces.
  Reported by: kris
  (jhb, 2003-08-04; 1 file, -2/+2)
* Add a new function to look for a spinlock's instance when it is held
  by another thread.  We use the td_oncpu member of the other thread to
  locate its associated CPU and then search that CPU's list of spin
  locks contained in its per-CPU data.  This is not always safe and may
  in fact panic or just not work, but it is useful in at least one case.
  (jhb, 2003-07-31; 1 file, -0/+21)
* unifdef -DLAZY_SWITCH and start to tidy up the associated glue.
  (peter, 2003-07-10; 1 file, -4/+1)
* Use __FBSDID().
  (obrien, 2003-06-11; 1 file, -1/+3)
* Remove return after panic.
  Found by: FlexeLint
  (phk, 2003-05-31; 1 file, -1/+0)
* Add __amd64__ to the ifdefs that introduce the "pcicfg" spinlock to
  witness.
  Approved by: re (safe amd64 support)
  (peter, 2003-05-31; 1 file, -1/+1)
* Move the _oncpu entry from the KSE to the thread.  The entry in the
  KSE still exists, but its purpose will change a bit when we add the
  ability to lock a KSE to a cpu.
  (julian, 2003-04-10; 1 file, -1/+1)
* o In struct prison, add an allprison linked list of prisons (protected
    by allprison_mtx), a unique prison/jail identifier field, two path
    fields (pr_path for reporting and pr_root vnode instance) to store
    the chroot() point of each jail.
  o Add jail_attach(2) to allow a process to bind to an existing jail.
  o Add change_root() to perform the chroot operation on a specified
    vnode.
  o Generalize change_dir() to accept a vnode, and move namei() calls to
    callers of change_dir().
  o Add a new sysctl (security.jail.list) which is a group of struct
    xprison instances that represent a snapshot of active jails.
  Reviewed by: rwatson, tjr
  (mike, 2003-04-09; 1 file, -0/+1)
* Commit a partial lazy thread switch mechanism for i386.  It isn't as
  lazy as it could be and can do with some more cleanup.  Currently it's
  under options LAZY_SWITCH.  What this does is avoid %cr3 reloads for
  short context switches that do not involve another user process.  ie:
  we can take an interrupt, switch to a kthread and return to the user
  without explicitly flushing the tlb.  However, this isn't as exciting
  as it could be, the interrupt overhead is still high and too much
  blocks on Giant still.  There are some debug sysctls, for stats and
  for an on/off switch.
  The main problem with doing this has been "what if the process that
  you're running on exits while we're borrowing its address space?" --
  in this case we use an IPI to give it a kick when we're about to
  reclaim the pmap.
  It's not compiled in unless you add the LAZY_SWITCH option.  I want to
  fix a few more things and get some more feedback before turning it on
  by default.
  This is NOT a replacement for Bosko's lazy interrupt stuff.  This was
  more meant for the kthread case, while his was for interrupts.  Mine
  helps a little for interrupts, but his helps a lot more.
  The stats are enabled with options SWTCH_OPTIM_STATS -- this has been
  a pseudo-option for years, I just added a bunch of stuff to it.
  One non-trivial change was to select a new thread before calling
  cpu_switch() in the first place.  This allows us to catch the silly
  case of doing a cpu_switch() to the current process.  This happens
  uncomfortably often.  This simplifies a bit of the asm code in
  cpu_switch (no longer have to call choosethread() in the middle).
  This has been implemented on i386 and (thanks to jake) sparc64.  The
  others will come soon.  This is actually separate from the lazy switch
  stuff.
  Glanced at by: jake, jhb
  (peter, 2003-04-02; 1 file, -0/+6)
* - Remove witness_dead and just use witness_watch instead.  If
    witness_watch is set to 0, it now has the same effect as setting
    witness_dead used to have.
  - Added a sysctl handler that allows root to change witness_watch from
    a non-zero value to zero to disable witness at runtime.  Note that
    you can't turn witness back on once it is off.  You can only turn it
    off, as a one-way switch.  (A model of this handler follows the
    entry.)
  - Added a comment describing the possible values of witness_watch.
  (jhb, 2003-03-24; 1 file, -21/+51)
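  The one-way semantics, modeled in a few lines (the handler name and
  return convention here are hypothetical; the real code is a sysctl
  handler returning an errno):

      #include <stdio.h>

      static int witness_watch = 1;

      /* Accept on -> off; reject any attempt to re-enable. */
      static int
      witness_watch_set_model(int newval)
      {
              if (newval == witness_watch)
                      return (0);     /* no change requested */
              if (newval != 0)
                      return (-1);    /* can't turn witness back on */
              witness_watch = 0;      /* one-way: disable for good */
              return (0);
      }

      int
      main(void)
      {
              printf("disable:   %s\n",
                  witness_watch_set_model(0) == 0 ? "ok" : "rejected");
              printf("re-enable: %s\n",
                  witness_watch_set_model(1) == 0 ? "ok" : "rejected");
              return (0);
      }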
* Trim an extra blank line that snuck into the last commit.
  (jhb, 2003-03-11; 1 file, -1/+0)