path: root/sys/kern/subr_witness.c
Commit message | Author | Age | Files | Lines
* Add a drain function for struct sysctl_req, and use it for a variety of | mdf | 2010-09-09 | 1 | -14/+3
  handlers, some of which had to do awkward things to get a large enough FIXEDLEN buffer.
  Note that some sysctl handlers were explicitly outputting a trailing NUL byte. This behaviour was preserved, though it should not be necessary.
  Reviewed by: phk
* Bump the witness pendlist to 768 to accommodate the increased number of | rpaulo | 2010-07-29 | 1 | -1/+1
  spinlocks.
* "time lock" is no longer a spin-lock since r209371. | mav | 2010-06-21 | 1 | -1/+1
  Reported by: kib@
* Right now, WITNESS just blindly pipes all the output to the | attilio | 2010-05-11 | 1 | -14/+18
  (TOCONS | TOLOG) mask even when called from DDB points. That breaks several outputs, the most notable being textdump output. Fix this by having configurable callbacks passed to witness_list_locks() and witness_display_spinlock() for printing out data.
  Reported by: several broken textdump outputs
  Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
  MFC after: 7 days
  X-MFC: r207922
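The callback approach described in this commit can be sketched in user space. This is only an illustration: print_fn, capture_printf(), capture_buf and list_lock_names() are hypothetical names, not the kernel's witness_list_locks() interface, and the lock names in the usage note are placeholders.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Same shape as printf(): format string in, character count out. */
typedef int (*print_fn)(const char *fmt, ...);

/* Hypothetical sink that appends to a buffer instead of the console. */
static char capture_buf[256];

int
capture_printf(const char *fmt, ...)
{
    va_list ap;
    size_t used = strlen(capture_buf);
    int n;

    va_start(ap, fmt);
    n = vsnprintf(capture_buf + used, sizeof(capture_buf) - used, fmt, ap);
    va_end(ap);
    return (n);
}

/* Hypothetical lister: prints each name through the supplied callback. */
int
list_lock_names(const char *const *names, int count, print_fn prnt)
{
    int i;

    for (i = 0; i < count; i++)
        prnt("%s\n", names[i]);
    return (count);
}
```

A caller that wants console output could pass printf itself, while a debugger-style caller passes its own sink such as capture_printf; the lister never needs to know which one it was given.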
* There is not a good reason to have a different prototype for db_printf() | attilio | 2010-05-11 | 1 | -6/+6
  when compared to printf(). Unify it by returning the number of characters displayed for db_printf() as well.
  MFC after: 7 days
* On Alan's advice, rather than do a wholesale conversion on a single | kmacy | 2010-04-30 | 1 | -0/+9
  architecture from page queue lock to a hashed array of page locks (based on a patch by Jeff Roberson), I've implemented page lock support in the MI code and have only moved vm_page's hold_count out from under page queue mutex to page lock. This changes pmap_extract_and_hold on all pmaps.
  Supported by: Bitgravity Inc.
  Discussed with: alc, jeffr, and kib
* SLIP is gone; remove its mutex from witness. | trasz | 2009-12-29 | 1 | -6/+0
* Change w_notrunning and w_stillcold from pointer to array so that sizeof | antoine | 2009-09-06 | 1 | -2/+2
  returns what is expected.
  PR: kern/138557
  Discussed with: brucec@
  MFC after: 1 month
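The sizeof pitfall behind this change can be shown in plain C. The message text and the names msg_ptr, msg_arr, ptr_size() and arr_size() are placeholders for illustration, not the actual contents of w_notrunning or w_stillcold.

```c
#include <stddef.h>

/* Hypothetical message text; the real strings in subr_witness.c differ. */
static const char *msg_ptr = "Witness not running\n";   /* pointer */
static const char msg_arr[] = "Witness not running\n";  /* array */

/* sizeof on a pointer yields the pointer size, not the string length. */
size_t
ptr_size(void)
{
    return (sizeof(msg_ptr));
}

/* sizeof on an array yields its full storage, including the trailing NUL. */
size_t
arr_size(void)
{
    return (sizeof(msg_arr));
}
```

With the pointer form, any code that uses sizeof to decide how many bytes to copy out silently copies only a pointer's worth of the message.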
* Add minimal ZFS lock hierarchy | kmacy | 2009-05-20 | 1 | -0/+7
* Bite the bullet, and make the IPv6 SSM and MLDv2 mega-commit: | bms | 2009-04-29 | 1 | -1/+11
  import from p4 bms_netdev.
  Summary of changes:
   * Connect netinet6/in6_mcast.c to build. The legacy KAME KPIs are mostly preserved.
   * Eliminate now dead code from ip6_output.c. Don't do mbuf bingo, we are not going to do RFC 2292 style CMSG tricks for multicast options as they are not required by any current IPv6 normative reference.
   * Refactor transports (UDP, raw_ip6) to do own mcast filtering. SCTP, TCP unaffected by this change.
   * Add ip6_msource, in6_msource structs to in6_var.h.
   * Hookup mld_ifinfo state to in6_ifextra, allocate from domifattach path.
   * Eliminate IN6_LOOKUP_MULTI(), it is no longer referenced. Kernel consumers which need this should use in6m_lookup().
   * Refactor IPv6 socket group memberships to use a vector (like IPv4).
   * Update ifmcstat(8) for IPv6 SSM.
   * Add witness lock order for IN6_MULTI_LOCK.
   * Move IN6_MULTI_LOCK out of lower ip6_output()/ip6_input() paths.
   * Introduce IP6STAT_ADD/SUB/INC/DEC as per rwatson's IPv4 cleanup.
   * Update carp(4) for new IPv6 SSM KPIs.
   * Virtualize ip6_mrouter socket. Changes mostly localized to IPv6 MROUTING.
   * Don't do a local group lookup in MROUTING.
   * Kill unused KAME prototypes in6_purgemkludge(), in6_restoremkludge().
   * Preserve KAME DAD timer jitter behaviour in MLDv1 compatibility mode.
   * Bump __FreeBSD_version to 800084.
   * Update UPDATING.
  NOTE WELL:
   * This code hasn't been tested against real MLDv2 queriers (yet), although the on-wire protocol has been verified in Wireshark.
   * There are a few unresolved issues in the socket layer APIs to do with scope ID propagation.
   * There is a LOR present in ip6_output()'s use of in6_setscope() which needs to be resolved. See comments in mld6.c. This is believed to be benign and can't be avoided for the moment without re-introducing an indirect netisr.
  This work was mostly derived from the IGMPv3 implementation, and has been sponsored by a third party.
* Decompose the global UNIX domain sockets rwlock into two different | rwatson | 2009-03-08 | 1 | -0/+2
  locks: a global list/counter/generation counter protected by a new mutex unp_list_lock, and a global linkage rwlock, unp_global_rwlock, which protects the connections between UNIX domain sockets.
  This eliminates conditional lock acquisition that was previously a property of the global lock being held over sonewconn() leading to a call to uipc_attach(), which also required the global lock, but couldn't rely on it as other paths existed to uipc_attach() that didn't hold it: now uipc_attach() uses only the list lock, which follows the linkage lock in the lock order. It may also reduce contention on the global lock for some workloads.
  Add global UNIX domain socket locks to hard-coded witness lock order.
  MFC after: 1 week
  Discussed with: kris
* Move the NORELEASE check to after the recurse count decrement and bailout, this | thompsa | 2009-02-28 | 1 | -6/+6
  is not counted as actually releasing the lock.
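The ordering described here, where dropping one level of recursion is not treated as a release, can be modelled in user space. This is a sketch under stated assumptions: struct lock_model and model_unlock() are hypothetical, and the -1 return only stands in for whatever the kernel does when a NORELEASE lock is really released.

```c
#include <stdbool.h>

/* Hypothetical model of one held lock instance. */
struct lock_model {
    int  recurse;    /* extra acquisitions beyond the first */
    bool held;
    bool norelease;  /* set between the NORELEASE and RELEASEOK marks */
};

/* Returns 0 if the unlock is acceptable, -1 if it violates NORELEASE. */
int
model_unlock(struct lock_model *l)
{
    if (!l->held)
        return (-1);
    /* Recursion is unwound first: the lock stays held, so bail out. */
    if (l->recurse > 0) {
        l->recurse--;
        return (0);
    }
    /* Only a final release is checked against the NORELEASE mark. */
    if (l->norelease)
        return (-1);
    l->held = false;
    return (0);
}
```

Checking the flag before the recursion decrement would flag a recursive unlock even though the lock is still held afterwards, which is the false report this ordering avoids.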
* o Use unsigned for bit fields. | imp | 2009-02-03 | 1 | -3/+3
  o Use NULL for pointers in preference to 0.
* Add functions to WITNESS so it can be asserted that the lock is not released for a | thompsa | 2009-01-21 | 1 | -0/+49
  section of code, this uses WITNESS_NORELEASE() and WITNESS_RELEASEOK() to mark the boundaries. Both functions require the lock to be held when calling.
  This is intended for scenarios like a bus asserting that the bus lock is not dropped during a driver call.
  There doesn't appear to be a man page to document this in.
  Reviewed by: jhb
* - convert radix node head lock from mutex to rwlock | kmacy | 2008-12-07 | 1 | -1/+1
  - make radix node head lock not recursive
  - fix LOR in rtexpunge
  - fix LOR in rtredirect
  Reviewed by: sam
* Fix a number of style issues in the MALLOC / FREE commit. I've tried to | des | 2008-10-23 | 1 | -1/+1
  be careful not to fix anything that was already broken; the NFSv4 code is particularly bad in this respect.
* Retire the MALLOC and FREE macros. They are an abomination unto style(9). | des | 2008-10-23 | 1 | -2/+1
  MFC after: 3 months
* In the current code for witness_warn: | attilio | 2008-10-20 | 1 | -19/+12
  - If there aren't spinlocks held, but there are problems with old sleeplocks, they are not reported.
  - If the spinlock found is not the only one, problems are not reported.
  Fix these 2 problems.
  Reported by: tegge
* - Fix a race in witness_checkorder() where, between the PCPU_GET() and | attilio | 2008-10-16 | 1 | -30/+52
    PCPU_PTR() curthread can migrate to another CPU and get incorrect results.
  - Fix a similar race in witness_warn().
  - Fix the bypassing of the interlock's checks by correctly using the appropriate children even when the lock_list chunk to be explored is not the first one.
  - Allow witness_warn() to work with spinlocks too.
  Bugs found by: tegge
  Submitted by: jhb, tegge
  Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
* Oops, missed updating a place with 's/lock1/plock/' when adding | jhb | 2008-10-03 | 1 | -3/+4
  interlock support to WITNESS. Specifically, the printf listing the first location when duplicate locks of the same type are acquired.
  Reported by: pho
* Update description of witness_watch. | jhb | 2008-09-24 | 1 | -3/+5
* Make ddb command registration dynamic so modules can extend | sam | 2008-09-15 | 1 | -1/+2
  the command set (only so long as the module is present):
   o add db_command_register and db_command_unregister to add and remove commands, respectively
   o replace linker sets with SYSINIT's (and SYSUINIT's) that register commands
   o expose 3 list heads: db_cmd_table, db_show_table, and db_show_all_table for registering top-level commands, show operands, and show all operands, respectively
  While here also:
   o sort command lists
   o add DB_ALIAS, DB_SHOW_ALIAS, and DB_SHOW_ALL_ALIAS to add aliases for existing commands
   o add "show all trace" as an alias for "show alltrace"
   o add "show all locks" as an alias for "show alllocks"
  Submitted by: Guillaume Ballet <gballet@gmail.com> (original version)
  Reviewed by: jhb
  MFC after: 1 month
* - For any lock list we hold the head in order to reduce allocation from | attilio | 2008-09-12 | 1 | -5/+18
    the free list and in this way avoid contention on the w_mtx. In order to make the code simple, we rely on the rule that when the head has no child it also doesn't have other subsequent entries. Actually this assertion is broken because we can free all the head children and quit witness_unlock() with the head still allocated, with no children and subsequent entries present. Fix this by shifting the head if other entries are present and still freeing the object, but always leaving a head.
  - Fix witness_thread_has_locks() in order to report, correctly, if the lock list linked to a specific thread has children or not based on the rule explained above.
  - Fix a printout in DDB's "show alllocks" command in order to show, correctly, the process name that is really what we want.
  - Fix style(9) for a comment.
  Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
  Reported by: Marko Kiiskila <marko dot kiiskila at nokia dot com>
  Sponsored by: Nokia
* Teach WITNESS about the interlocks used with lockmgr. This removes a bunch | jhb | 2008-09-10 | 1 | -9/+30
  of spurious witness warnings since lockmgr grew witness support. Before this, every time you passed an interlock to a lockmgr lock WITNESS treated it as a LOR.
  Reviewed by: attilio
* - Improve some witness_watch operability in code which does perform both | attilio | 2008-08-30 | 1 | -35/+45
    lock tracking and checks, doing just the former ones.
  - Fix a bug where sysctl utility was printing crazy values when setting a new value for debug.witness.watch [0]
  [0] Reported by: yongari
* - Make witness_watch a 3 state value. | attilio | 2008-08-29 | 1 | -35/+42
     1 means that witness is up and running.
     0 means that witness is disabled but that it can be enabled again later.
    -1 means that witness is disabled permanently.
  - Fix a bug causing the kernel to panic on witness disabling through witness_watch: lock list queues were still full of entries and this was causing trouble with debugging stubs (like witness_thread_exit()).
  Reported by: kris, yongari
  Sponsored by: Nokia
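The three states listed in this commit can be modelled as a small state holder. This is a user-space sketch only: watch_state, set_watch() and get_watch() are hypothetical names, and the rule for which transitions are refused is an assumption drawn from the commit text, not from the real sysctl handler.

```c
/* Hypothetical model: 1 = running, 0 = disabled, -1 = disabled for good. */
static int watch_state = 1;

/* Returns 0 if the new value is accepted, -1 if it is refused. */
int
set_watch(int newval)
{
    if (newval < -1 || newval > 1)
        return (-1);              /* outside the three defined states */
    if (watch_state == -1)        /* permanent: no way back */
        return (newval == -1 ? 0 : -1);
    watch_state = newval;
    return (0);
}

int
get_watch(void)
{
    return (watch_state);
}
```

Moving between 1 and 0 is reversible in this model, while a move to -1 is a one-way step.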
* Introduce some WITNESS improvements: | attilio | 2008-08-13 | 1 | -492/+1062
  - Speed up the lock orderings lookup by changing the witness graph from a linked tree to a matrix. A table lookup caches the lock orderings in order to give O(1) access to them. Every witness object has a unique index within this lookup cache table.
  - Reduce the lock contention on w_mtx by acquiring it only when the LOR actually happens and not in a sane case. In order to do this, don't totally flush lock lists (per-CPU spinlocks list and per-thread sleeplocks list) but check ll_count any time we need to verify allocation sanity.
  - Introduce the function witness_thread_exit() in the witness namespace, which should verify that a thread doesn't hold any witness occurrence while exiting.
  - Rename the sysctl debug.witness.graphs to debug.witness.fullgraph and add debug.witness.badstacks, which prints out stacks for the LORs revealed. This is implemented using the stack(9) support, which makes WITNESS dependent on the STACK option or on the DDB (including STACK) option.
  - Fix style(9) for src/sys/kern/subr_witness.c
  The hash table approach has been developed by Ilya Maykov on behalf of Isilon Systems, which kindly released the patch. Jeff Roberson ported the patch to -CURRENT and fixed w_mtx contention, on behalf of Nokia.
  Submitted by: Ilya Maykov <ivmaykov at gmail dot com> (Isilon Systems), jeff
  Sponsored by: Nokia
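The idea of answering an ordering query with a single table lookup can be sketched as follows. This is a simplified illustration, not the data structure in subr_witness.c: NLOCKS, the before matrix, add_order() and is_reversal() are hypothetical, each lock class is assumed to already have a small integer index, and recorded orders are assumed to be acyclic.

```c
#include <stdbool.h>

#define NLOCKS 8    /* arbitrary table size for the sketch */

/* before[i][j] is true when class i must be acquired before class j. */
static bool before[NLOCKS][NLOCKS];

/*
 * Record the order a -> b and extend it transitively: every class that
 * precedes a (or is a) now also precedes b and everything after b.
 */
void
add_order(int a, int b)
{
    int i, j;

    for (i = 0; i < NLOCKS; i++) {
        if (i != a && !before[i][a])
            continue;
        for (j = 0; j < NLOCKS; j++) {
            if (j == b || before[b][j])
                before[i][j] = true;
        }
    }
}

/* One array access: is taking 'wanted' while holding 'held' a reversal? */
bool
is_reversal(int held, int wanted)
{
    return (before[wanted][held]);
}
```

The cost moves to add_order(), which runs only when a new ordering is learned; the check on every acquisition stays constant time.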
* witness_addgraph() is required even if DDB isn't compiled into the kernel, | rwatson | 2008-07-19 | 1 | -0/+2
  so exclude it from #ifdef DDB.
  Submitted by: attilio
* - Embed the recursion counter for any locking primitive directly in the | attilio | 2008-05-15 | 1 | -64/+41
    lock_object, using a unified field called lo_data.
  - Replace lo_type usage with the w_name usage and at init time pass the lock "type" directly to witness_init() from the parent lock init function. Handle delayed initialization before witness_initialize() is called through the witness_pendhelp structure.
  - Axe out LO_ENROLLPEND as it is not really needed. The case where a mutex with delayed init wants to be destroyed can't happen because witness_destroy() checks for witness_cold and panics in that case.
  - In enroll(), if we cannot allocate a new object from the freelist, notify userspace of that through a printf().
  - Modify the depart function to return nothing, as in the current CVS version it always returns true, and adjust callers accordingly.
  - Fix the witness_addgraph() argument name prototype.
  - Remove useless code from itismychild().
  This commit leads to a smaller struct lock_object and so smaller locks, in particular on amd64 where 2 uintptr_t (16 bytes per-primitive) are gained.
  Reviewed by: jhb
* Add a new witness sysctl which returns the relations between any lock | attilio | 2008-05-07 | 1 | -0/+58
  and its children in the form: "parent","child" so that head and bottom of an oriented graph can be easily detected and various forms of diagrams can be built.
  The sysctl is called debug.witness.graphs and it is read-only; in order to get the list of relations, a simple:
    #sysctl debug.witness.graphs
  will do the trick.
  This approach has been chosen in order to easily support things like the DOT format and such.
  Soon, a self-explanatory awk script, which filters the simple information returned by the sysctl and converts it into a real DOT script, will be committed to the repository among the examples.
  Discussed with: rwatson
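The conversion from "parent","child" pairs to DOT edges can be illustrated in C. This is a stand-in sketch, not the awk script mentioned in the commit: pair_to_dot() is a hypothetical helper, the 63-character name limit is arbitrary, and the input is assumed to be exactly one quoted pair per line with no embedded quotes.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Convert one line of the form "parent","child" into a DOT edge
 * statement. Returns 0 on success, -1 on malformed input or when
 * the output buffer is too small.
 */
int
pair_to_dot(const char *line, char *out, size_t outlen)
{
    char parent[64], child[64];
    int n;

    /* %63[^"] reads up to 63 characters that are not a double quote. */
    if (sscanf(line, " \"%63[^\"]\",\"%63[^\"]\"", parent, child) != 2)
        return (-1);
    n = snprintf(out, outlen, "\"%s\" -> \"%s\";", parent, child);
    if (n < 0 || (size_t)n >= outlen)
        return (-1);
    return (0);
}
```

Wrapping the emitted edges in a `digraph { ... }` block would give a file that graph layout tools accepting DOT can read.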
* Convert pcbinfo and inpcb mutexes to rwlocks, and modify macros to | rwatson | 2008-04-17 | 1 | -5/+5
  explicitly select write locking for all use of the inpcb mutex. Update some pcbinfo lock assertions to assert locked rather than write-locked, although in practice almost all uses of the pcbinfo rwlock remain exclusive, and all instances of inpcb lock acquisition are exclusive.
  This change should introduce (ideally) little functional change. However, it lays the groundwork for significantly increased parallelism in the TCP/IP code.
  MFC after: 3 months
  Tested by: kris (superset of committed patch)
* struct lock_instance and struct lock_list_entry don't need to be in the | attilio | 2008-04-13 | 1 | -0/+34
  public namespace for WITNESS as they are only used internally so just move them in the private namespace for the subsystem (with all related supporting definitions).
* - Re-introduce WITNESS support for lockmgr. About the old implementation | attilio | 2008-04-12 | 1 | -11/+0
    the only difference is that lockmgr*() functions now accept the LK_NOWITNESS flag, which skips ordering for the calling instance.
  - Remove a useless stub in witness_checkorder() (because the above check never allows it to happen) and allow witness_upgrade() to accept non-try operations too.
* Add missing stubs for spinlocks cpuset and intrcnt. | attilio | 2008-04-12 | 1 | -0/+2
  Submitted by: kris
* - There is no more "uidinfo struct" mutex. | pjd | 2008-03-17 | 1 | -2/+1
  - The "uidinfo hash" lock is now a rwlock.
  Reminded by: kib
* In keeping with style(9)'s recommendations on macros, use a ';' | rwatson | 2008-03-16 | 1 | -1/+2
  after each SYSINIT() macro invocation. This makes a number of lightweight C parsers much happier with the FreeBSD kernel source, including cflow's prcc and lxr.
  MFC after: 1 month
  Discussed with: imp, rink
* Remove kernel support for M:N threading. | jeff | 2008-03-12 | 1 | -1/+0
  While the KSE project was quite successful in bringing threading to FreeBSD, the M:N approach taken by the kse library was never developed to its full potential. Backwards compatibility will be provided via libmap.conf for dynamically linked binaries and static binaries will be broken.
* Initial support for Freescale PowerQUICC III MPC85xx system-on-chip family. | raj | 2008-03-03 | 1 | -0/+3
  The PQ3 is a high performance integrated communications processing system based on the e500 core, which is an embedded RISC processor that implements the 32-bit Book E definition of the PowerPC architecture. For details refer to: http://www.freescale.com/webapp/sps/site/prod_summary.jsp?code=MPC8555E
  This port was tested and successfully run on the following members of the PQ3 family: MPC8533, MPC8541, MPC8548, MPC8555.
  The following major integrated peripherals are supported:
   * On-chip peripherals bus
   * OpenPIC interrupt controller
   * UART
   * Ethernet (TSEC)
   * Host/PCI bridge
   * QUICC engine (SCC functionality)
  This commit brings the main functionality and will be followed by individual drivers that are logically separate from this base.
  Approved by: cognet (mentor)
  Obtained from: Juniper, Semihalf
  MFp4: e500
* Add a new 'why' argument to kdb_enter(), and a set of constants to use | rwatson | 2007-12-25 | 1 | -2/+2
  for that argument. This will allow DDB to detect the broad category of reason why the debugger has been entered, which it can use for the purposes of deciding which DDB script to run.
  Assign approximate why values to all current consumers of the kdb_enter() interface.
* Fix the spinlock static table adding missing spinlocks. | attilio | 2007-11-24 | 1 | -0/+2
  - rm_spinlock has turnstile chain as child
  - srclock has callout and clk as child, found by witness "emulation". Just move it very high in our ranking
* A bunch more files that should probably print out a thread name | julian | 2007-11-14 | 1 | -1/+1
  instead of a process name.
* Fix some entries in the locks static table of witness. | attilio | 2007-09-20 | 1 | -9/+8
  In particular:
  - smp_tlb_mtx is no longer used, so it is axed.
  - smp rendezvous lock isn't really a leaf spin-mutex. Its bad placement in the table, however, has been the source of a false positive LOR reporting with the dt_lock. However, smp rendezvous lock would have had sched_lock there for older lock, so it wasn't still a leaf lock.
  - allpmaps is only used in ia32 architecture, so it is inserted in the appropriate stub.
  Additionally:
  - kse_zombie_lock is no longer present, so its definition is axed out.
  - zombie_lock doesn't need to have an exported symbol, so just declare it as static.
  Tested by: kris
  Approved by: jeff (mentor)
  Approved by: re
* - Remove zstty spin lock for no longer existing zs(4). | marius | 2007-06-16 | 1 | -2/+4
  - Move the rtc_mtx spin lock out from under #ifdef SMP as it's just not SMP-specific.
  - Add a new spin lock pcib_mtx for locking "fast" interrupt handlers of host-to-PCI bridge drivers on sparc64.
* Update 802.11 wireless support: | sam | 2007-06-11 | 1 | -0/+12
   o major overhaul of the way channels are handled: channels are now fully enumerated and uniquely identify the operating characteristics; these changes are visible to user applications which require changes
   o make scanning support independent of the state machine to enable background scanning and roaming
   o move scanning support into loadable modules based on the operating mode to enable different policies and reduce the memory footprint on systems w/ constrained resources
   o add background scanning in station mode (no support for adhoc/ibss mode yet)
   o significantly speedup sta mode scanning with a variety of techniques
   o add roaming support when background scanning is supported; for now we use a simple algorithm to trigger a roam: we threshold the rssi and tx rate, if either drops too low we try to roam to a new ap
   o add tx fragmentation support
   o add first cut at 802.11n support: this code works with forthcoming drivers but is incomplete; it's included now to establish a baseline for other drivers to be developed and for user applications
   o adjust max_linkhdr et. al. to reflect 802.11 requirements; this eliminates prepending mbufs for traffic generated locally
   o add support for Atheros protocol extensions; mainly the fast frames encapsulation (note this can be used with any card that can tx+rx large frames correctly)
   o add sta support for ap's that beacon both WPA1+2 support
   o change all data types from bsd-style to posix-style
   o propagate noise floor data from drivers to net80211 and on to user apps
   o correct various issues in the sta mode state machine related to handling authentication and association failures
   o enable the addition of sta mode power save support for drivers that need net80211 support (not in this commit)
   o remove old WI compatibility ioctls (wicontrol is officially dead)
   o change the data structures returned for get sta info and get scan results so future additions will not break user apps
   o fixed tx rate is now maintained internally as an ieee rate and not an index into the rate set; this needs to be extended to deal with multi-mode operation
   o add extended channel specifications to radiotap to enable 11n sniffing
  Drivers:
   o ath: add support for bg scanning, tx fragmentation, fast frames, dynamic turbo (lightly tested), 11n (sniffing only and needs new hal)
   o awi: compile tested only
   o ndis: lightly tested
   o ipw: lightly tested
   o iwi: add support for bg scanning (well tested but may have some rough edges)
   o ral, ural, rum: add support for bg scanning, calibrate rssi data
   o wi: lightly tested
  This work is based on contributions by Atheros, kmacy, sephe, thompsa, mlaier, kevlo, and others. Much of the scanning work was supported by Atheros. The 11n work was supported by Marvell.
* Commit 10/14 of sched_lock decomposition. | jeff | 2007-06-04 | 1 | -6/+11
  - Add new spinlocks to support thread_lock() and adjust ordering.
  Tested by: kris, current@
  Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
  Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
* Fix some problems introduced with the last descriptors tables locking | attilio | 2007-05-29 | 1 | -0/+1
  patch:
  - Do the correct test for ldt allocation
  - Drop dt_lock just before calling kmem_free (since it acquires blocking locks inside)
  - Solve a deadlock with smp_rendezvous() where other CPUs will wait indefinitely for dt_lock acquisition.
  - Add dt_lock in the WITNESS list of spinlocks
  While applying these modifications, change the requirement for user_ldt_free() so that it returns without dt_lock held.
  Tested by: marcus, tegge
  Reviewed by: tegge
  Approved by: jeff (mentor)
* - Move clock synchronization into a separate clock lock so the global | jeff | 2007-05-20 | 1 | -0/+1
    scheduler lock is not involved. sched_lock still protects the sched_clock call. Another patch will remedy this.
  Contributed by: Attilio Rao <attilio@FreeBSD.org>
  Tested by: kris, jeff
* Fix witness(4) warnings about mutex use. | jkoshy | 2007-04-19 | 1 | -0/+10
  Group mutexes used in hwpmc(4) into 3 "types" in the sense of witness(4):
  - leaf spin mutexes---only one of these should be held at a time, so these mutexes are specified as belonging to a single witness type "pmc-leaf".
  - `struct pmc_owner' descriptors are protected by a spin mutex of witness type "pmc-owner-proc". Since we call wakeup_one() while holding these mutexes, the witness type of these mutexes needs to dominate that of "sleepq chain" mutexes.
  - logger threads use a sleep mutex, of type "pmc-sleep".
  Submitted by: wkoszek (earlier patch)
* allprison mutex was converted to sx(9) lock. | pjd | 2007-04-05 | 1 | -1/+1
* Replace custom file descriptor array sleep lock constructed using a mutex | rwatson | 2007-04-04 | 1 | -2/+0
  and flags with an sxlock. This leads to a significant and measurable performance improvement as a result of access to shared locking for frequent lookup operations, reduced general overhead, and reduced overhead in the event of contention. All of these are important for threaded applications where simultaneous access to a shared file descriptor array occurs frequently. Kris has reported 2x-4x transaction rate improvements on 8-core MySQL benchmarks; smaller improvements can be expected for many workloads as a result of reduced overhead.
  - Generally eliminate the distinction between "fast" and regular acquisition of the filedesc lock; the plan is that they will now all be fast. Change all locking instances to either shared or exclusive locks.
  - Correct a bug (pointed out by kib) in fdfree() where previously msleep() was called without the mutex held; sx_sleep() is now always called with the sxlock held exclusively.
  - Universally hold the struct file lock over changes to struct file, rather than the filedesc lock or no lock. Always update the f_ops field last. A further memory barrier is required here in the future (discussed with jhb).
  - Improve locking and reference management in linux_at(), which fails to properly acquire vnode references before using vnode pointers. Annotate improper use of vn_fullpath(), which will be replaced at a future date.
  In fcntl(), we conservatively acquire an exclusive lock, even though in some cases a shared lock may be sufficient, which should be revisited. The dropping of the filedesc lock in fdgrowtable() is no longer required as the sxlock can be held over the sleep operation; we should consider removing that (pointed out by attilio).
  Tested by: kris
  Discussed with: jhb, kris, attilio, jeff