summaryrefslogtreecommitdiffstats
path: root/sys/kern/subr_witness.c
Commit message (Collapse)AuthorAgeFilesLines
* Extend the meaning of the CTLFLAG_TUN flag to automatically check ifhselasky2014-06-271-9/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function allowing the sysctl to still be marked as a tunable. The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating childrens of a static/extern SYSCTL node. This operation should probably be made into a factored out common macro, hence some device drivers use this. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path. The motivation behind this patch is to avoid parameter loading cludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel. Other changes: - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask". - Removed redundant TUNABLE statements throughout the kernel. - Some minor code rewrites in connection to removing not needed TUNABLE statements. - Added a missing SYSCTL_DECL(). - Wrapped two very long lines. - Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, hence malloc()/free() is not ready when sysctls from the sysctl dataset are registered. - Bumped FreeBSD version to indicate SYSCTL API change. MFC after: 2 weeks Sponsored by: Mellanox Technologies
* Bump WITNESS_PENDLIST by MAXCPU to account for thegrehan2014-04-291-1/+1
| | | | | | | | | | | | | | pmap pvlist locks which are scaled by MAXCPU. This allows an amd64 system to boot with MAXCPU set to 256, which is currently FreeBSD's hard limit without x2apic support. Compile-tested for other arch's. PR: 185831 Discussed with: jhb MFC after: 3 weeks
* Remove AppleTalk support.glebius2014-03-141-6/+0
| | | | | | | | | | AppleTalk was a network transport protocol for Apple Macintosh devices in 80s and then 90s. Starting with Mac OS X in 2000 the AppleTalk was a legacy protocol and primary networking protocol is TCP/IP. The last Mac OS X release to support AppleTalk happened in 2009. The same year routing equipment vendors (namely Cisco) end their support. Thus, AppleTalk won't be supported in FreeBSD 11.0-RELEASE.
* Bump up WITNESS_COUNT from 1024 to 1536 so there are sufficient entries forneel2014-01-201-1/+1
| | | | | | WITNESS to actually work. Reviewed by: jhb@
* In sys/kern/subr_witness.c, remove static functiondim2013-12-251-7/+0
| | | | | | witness_lock_order_key_empty(), which is unused since r181695. MFC after: 3 days
* Trim a couple of panic messages.jhb2013-09-041-8/+2
|
* The r254167 moved initialization of the sleepqueues before the witnesskib2013-08-101-1/+1
| | | | | | | | | | | | is operational. init_sleepqueues() initializes 256 mutexes, which, due to witness still being cold, started to overflow the pending_locks array. As stated in the reported panic message, increase WITNESS_PENDLIST from 768 to 1024, which provides space for additional 256 locks. Reported by: many Tested by: rakuco, bdrewery
* Make kassert_printf use __printflike.alfred2013-07-071-2/+6
| | | | | | Fix associated errors/warnings while I'm here. Requested by: avg
* - Fix a couple of inverted panic messages for shared/exclusive mismatchesjhb2013-06-031-7/+21
| | | | | | of a lock within a single thread. - Fix handling of interlocks in WITNESS by properly requiring the interlock to be held exactly once if it is specified.
* Add option WITNESS_NO_VNODE to suppress printing LORs between VNODEmarcel2013-05-091-1/+13
| | | | | | | | | locks. To support this, VNODE locks are created with the LK_IS_VNODE flag. This flag is propagated down using the LO_IS_VNODE flag. Note that WITNESS still records the LOR. Only the printing and the optional entering into the kernel debugger is bypassed with the WITNESS_NO_VNODE option.
* Correct the lock class for the vm object lock.kib2013-03-091-1/+1
| | | | Reported and tested by: joel
* Cleanup more of the kassert_panic.alfred2012-12-111-9/+30
| | | | | fix compile warnings on !amd64 and NULL derefs that would happen if kassert_panic() would return.
* Fix WITNESS when INVARIANT_SUPPORT is defined.alfred2012-12-111-0/+1
| | | | | | This fixes tinderbox breakage from r244105. Pointed out by: adrian
* Switch the hardwired WITNESS panics to kassert_panic.alfred2012-12-111-40/+54
| | | | | | | | | | | This is an ongoing effort to provide runtime debug information useful in the field that does not panic existing installations. This gives us the flexibility needed when shipping images to a potentially large audience with WITNESS enabled without worrying about formerly non-fatal LORs hurting a release. Sponsored by: iXsystems
* - Unlike cache invalidation and TLB demapping IPIs, reading registers frommarius2012-08-291-3/+0
| | | | | | | | | | | | | other CPUs doesn't require locking so get rid of it. As the latter is used for the timecounter on certain machine models, using a spin lock in this case can lead to a deadlock with the upcoming callout(9) rework. - Merge r134227/r167250 from x86: Avoid cross-IPI SMP deadlock by using the smp_ipi_mtx spin lock not only for smp_rendezvous_cpus() but also for the MD cache invalidation and TLB demapping IPIs. - Mark some unused function arguments as such. MFC after: 1 week
* Fix the 'show witness' DDB command to honor db_pager_quit.jhb2012-08-221-0/+12
|
* Add new pmap layer locks to the predefined lock order. Change the namesalc2012-06-271-5/+8
| | | | of a few existing VM locks to follow a consistent naming scheme.
* Fix old panic when BPF consumer attaches to destroying interface.melifaro2012-05-211-1/+1
| | | | | | | | | | | | | | | | | 'flags' field is added to the end of bpf_if structure. Currently the only flag is BPFIF_FLAG_DYING which is set on bpf detach and checked by bpf_attachd() Problem can be easily triggered on SMP stable/[89] by the following command (sort of): 'while true; do ifconfig vlan222 create vlan 222 vlandev em0 up ; tcpdump -pi vlan222 & ; ifconfig vlan222 destroy ; done' Fix possible use-after-free when BPF detaches itself from interface, freeing bpf_bif memory, while interface is still UP and there can be routes via this interface. Freeing is now delayed till ifnet_departure_event is received via eventhandler(9) api. Convert bpfd rwlock back to mutex due lack of performance gain (currently checking if packet matches filter is done without holding bpfd lock and we have to acquire write lock if packet matches) Approved by: kib(mentor) MFC in: 4 weeks
* - Improve BPF locking model.melifaro2012-04-061-2/+2
| | | | | | | | | | | | | | | | | | | | | | | Interface locks and descriptor locks are converted from mutex(9) to rwlock(9). This greately improves performance: in most common case we need to acquire 1 reader lock instead of 2 mutexes. - Remove filter(descriptor) (reader) lock in bpf_mtap[2] This was suggested by glebius@. We protect filter by requesting interface writer lock on filter change. - Cover struct bpf_if under BPF_INTERNAL define. This permits including bpf.h without including rwlock stuff. However, this is is temporary solution, struct bpf_if should be made opaque for any external caller. Found by: Dmitrij Tejblum <tejblum@yandex-team.ru> Sponsored by: Yandex LLC Reviewed by: glebius (previous version) Reviewed by: silence on -net@ Approved by: (mentor) MFC after: 3 weeks
* Convert the per-interface address list lock from a mutex to a reader/writerjhb2012-01-091-2/+2
| | | | | | lock. Reviewed by: bz
* panic: add a switch and infrastructure for stopping other CPUs in SMP caseavg2011-12-111-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Historical behavior of letting other CPUs merily go on is a default for time being. The new behavior can be switched on via kern.stop_scheduler_on_panic tunable and sysctl. Stopping of the CPUs has (at least) the following benefits: - more of the system state at panic time is preserved intact - threads and interrupts do not interfere with dumping of the system state Only one thread runs uninterrupted after panic if stop_scheduler_on_panic is set. That thread might call code that is also used in normal context and that code might use locks to prevent concurrent execution of certain parts. Those locks might be held by the stopped threads and would never be released. To work around this issue, it was decided that instead of explicit checks for panic context, we would rather put those checks inside the locking primitives. This change has substantial portions written and re-written by attilio and kib at various times. Other changes are heavily based on the ideas and patches submitted by jhb and mdf. bde has provided many insights into the details and history of the current code. The new behavior may cause problems for systems that use a USB keyboard for interfacing with system console. This is because of some unusual locking patterns in the ukbd code which have to be used because on one hand ukbd is below syscons, but on the other hand it has to interface with other usb code that uses regular mutexes/Giant for its concurrency protection. Dumping to USB-connected disks may also be affected. PR: amd64/139614 (at least) In cooperation with: attilio, jhb, kib, mdf Discussed with: arch@, bde Tested by: Eugene Grosbein <eugen@grosbein.net>, gnn, Steven Hartland <killing@multiplay.co.uk>, glebius, Andrew Boyer <aboyer@averesystems.com> (various versions of the patch) MFC after: 3 months (or never)
* Constify arguments for locking KPIs where possible.pjd2011-11-161-3/+4
| | | | | | | This enables locking consumers to pass their own structures around as const and be able to assert locks embedded into those structures. Reviewed by: ed, kib, jhb
* Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.ed2011-11-071-1/+2
| | | | | | The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.
* Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs.ed2011-11-071-1/+1
| | | | This means that their use is restricted to a single C file.
* - Fixup filenames in a few more places where they are used.jhb2011-10-261-21/+20
| | | | - Some whitespace fixes.
* Don't call fixup_filename() on each witness lock call.adrian2011-10-121-41/+63
| | | | | | | | | | | This has been irking me for a while. This causes significant CPU use on bottlenecked CPUs (eg my older EEEPC w/ an earlier Celeron CPU and my MIPS24k boards) when they're passing a lot of traffic. Since the file/line values are only used for printing, this should only affect display. It should have no operational change on the code, besides reducing CPU use.
* Fix typos - remove duplicate "the".brucec2011-02-211-1/+1
| | | | | | PR: bin/154928 Submitted by: Eitan Adler <lists at eitanadler.com> MFC after: 3 days
* Explicitly wire the user buffer rather than doing it implicitly inmdf2011-01-271-0/+4
| | | | | | | | sbuf_new_for_sysctl(9). This allows using an sbuf with a SYSCTL_OUT drain for extremely large amounts of data where the caller knows that appropriate references are held, and sleeping is not an issue. Inspired by: rwatson
* Re-add r212370 now that the LOR in powerpc64 has been resolved:mdf2010-09-161-14/+3
| | | | | | | | | | | | Add a drain function for struct sysctl_req, and use it for a variety of handlers, some of which had to do awkward things to get a large enough SBUF_FIXEDLEN buffer. Note that some sysctl handlers were explicitly outputting a trailing NUL byte. This behaviour was preserved, though it should not be necessary. Reviewed by: phk (original patch)
* Revert r212370, as it causes a LOR on powerpc. powerpc does a fewmdf2010-09-131-3/+14
| | | | | | | unexpected things in copyout(9) and so wiring the user buffer is not sufficient to perform a copyout(9) while holding a random mutex. Requested by: nwhitehorn
* Add a drain function for struct sysctl_req, and use it for a variety ofmdf2010-09-091-14/+3
| | | | | | | | | | handlers, some of which had to do awkward things to get a large enough FIXEDLEN buffer. Note that some sysctl handlers were explicitly outputting a trailing NUL byte. This behaviour was preserved, though it should not be necessary. Reviewed by: phk
* Bump the witness pendlist to 768 to accomodate the increased number ofrpaulo2010-07-291-1/+1
| | | | spinlocks.
* "time lock" is no longer a spin-lock since r209371.mav2010-06-211-1/+1
| | | | Reported by: kib@
* Right now, WITNESS just blindly pipes all the output to theattilio2010-05-111-14/+18
| | | | | | | | | | | | | (TOCONS | TOLOG) mask even when called from DDB points. That breaks several output, where the most notable is textdump output. Fix this by having configurable callbacks passed to witness_list_locks() and witness_display_spinlock() for printing out datas. Reported by: several broken textdump outputs Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com> MFC after: 7 days X-MFC: r207922
* There is not a good reason to have a different prototype for db_printf()attilio2010-05-111-6/+6
| | | | | | | | when compared to printf(). Unify it by returning the number of characters displayed for db_printf() as well. MFC after: 7 days
* On Alan's advice, rather than do a wholesale conversion on a singlekmacy2010-04-301-0/+9
| | | | | | | | | | | | architecture from page queue lock to a hashed array of page locks (based on a patch by Jeff Roberson), I've implemented page lock support in the MI code and have only moved vm_page's hold_count out from under page queue mutex to page lock. This changes pmap_extract_and_hold on all pmaps. Supported by: Bitgravity Inc. Discussed with: alc, jeffr, and kib
* SLIP is gone; remove its mutex from witness.trasz2009-12-291-6/+0
|
* Change w_notrunning and w_stillcold from pointer to array so that sizeofantoine2009-09-061-2/+2
| | | | | | | | returns what is expected. PR: kern/138557 Discussed with: brucec@ MFC after: 1 month
* Add minimal ZFS lock hierarchykmacy2009-05-201-0/+7
|
* Bite the bullet, and make the IPv6 SSM and MLDv2 mega-commit:bms2009-04-291-1/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | import from p4 bms_netdev. Summary of changes: * Connect netinet6/in6_mcast.c to build. The legacy KAME KPIs are mostly preserved. * Eliminate now dead code from ip6_output.c. Don't do mbuf bingo, we are not going to do RFC 2292 style CMSG tricks for multicast options as they are not required by any current IPv6 normative reference. * Refactor transports (UDP, raw_ip6) to do own mcast filtering. SCTP, TCP unaffected by this change. * Add ip6_msource, in6_msource structs to in6_var.h. * Hookup mld_ifinfo state to in6_ifextra, allocate from domifattach path. * Eliminate IN6_LOOKUP_MULTI(), it is no longer referenced. Kernel consumers which need this should use in6m_lookup(). * Refactor IPv6 socket group memberships to use a vector (like IPv4). * Update ifmcstat(8) for IPv6 SSM. * Add witness lock order for IN6_MULTI_LOCK. * Move IN6_MULTI_LOCK out of lower ip6_output()/ip6_input() paths. * Introduce IP6STAT_ADD/SUB/INC/DEC as per rwatson's IPv4 cleanup. * Update carp(4) for new IPv6 SSM KPIs. * Virtualize ip6_mrouter socket. Changes mostly localized to IPv6 MROUTING. * Don't do a local group lookup in MROUTING. * Kill unused KAME prototypes in6_purgemkludge(), in6_restoremkludge(). * Preserve KAME DAD timer jitter behaviour in MLDv1 compatibility mode. * Bump __FreeBSD_version to 800084. * Update UPDATING. NOTE WELL: * This code hasn't been tested against real MLDv2 queriers (yet), although the on-wire protocol has been verified in Wireshark. * There are a few unresolved issues in the socket layer APIs to do with scope ID propagation. * There is a LOR present in ip6_output()'s use of in6_setscope() which needs to be resolved. See comments in mld6.c. This is believed to be benign and can't be avoided for the moment without re-introducing an indirect netisr. This work was mostly derived from the IGMPv3 implementation, and has been sponsored by a third party.
* Decompose the global UNIX domain sockets rwlock into two differentrwatson2009-03-081-0/+2
| | | | | | | | | | | | | | | | | | | | locks: a global list/counter/generation counter protected by a new mutex unp_list_lock, and a global linkage rwlock, unp_global_rwlock, which protects the connections between UNIX domain sockets. This eliminates conditional lock acquisition that was previously a property of the global lock being held over sonewconn() leading to a call to uipc_attach(), which also required the global lock, but couldn't rely on it as other paths existed to uipc_attach() that didn't hold it: now uipc_attach() uses only the list lock, which follows the linkage lock in the lock order. It may also reduce contention on the global lock for some workloads. Add global UNIX domain socket locks to hard-coded witness lock order. MFC after: 1 week Discussed with: kris
* Move the NORELEASE check to after the recurse count decrement and bailout, thisthompsa2009-02-281-6/+6
| | | | is not counted as actually releasing the lock.
* o Use unsigned for bit fields.imp2009-02-031-3/+3
| | | | o Use NULL for pointers in preference to 0.
* Add functions WITNESS so it can be asserted that the lock is not released for athompsa2009-01-211-0/+49
| | | | | | | | | | | section of code, this uses WITNESS_NORELEASE() and WITNESS_RELEASEOK() to mark the boundaries. Both functions require the lock to be held when calling. This is intended for scenarios like a bus asserting that the bus lock is not dropped during a driver call. There doesn't appear to be a man page to document this in. Reviewed by: jhb
* - convert radix node head lock from mutex to rwlockkmacy2008-12-071-1/+1
| | | | | | | | - make radix node head lock not recursive - fix LOR in rtexpunge - fix LOR in rtredirect Reviewed by: sam
* Fix a number of style issues in the MALLOC / FREE commit. I've tried todes2008-10-231-1/+1
| | | | | be careful not to fix anything that was already broken; the NFSv4 code is particularly bad in this respect.
* Retire the MALLOC and FREE macros. They are an abomination unto style(9).des2008-10-231-2/+1
| | | | MFC after: 3 months
* In the actual code for witness_warn:attilio2008-10-201-19/+12
| | | | | | | | | | - If there aren't spinlocks held, but there are problems with old sleeplocks, they are not reported. - If the spinlock found is not the only one, problems are not reported. Fix these 2 problems. Reported by: tegge
* - Fix a race in witness_checkorder() where, between the PCPU_GET() andattilio2008-10-161-30/+52
| | | | | | | | | | | | | | PCPU_PTR() curthread can migrate on another CPU and get incorrect results. - Fix a similar race into witness_warn(). - Fix the interlock's checks bypassing by correctly using the appropriate children even when the lock_list chunk to be explored is not the first one. - Allow witness_warn() to work with spinlocks too. Bugs found by: tegge Submitted by: jhb, tegge Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
* Oops, missed updating a place with with 's/lock1/plock/' when addingjhb2008-10-031-3/+4
| | | | | | | interlock support to WITNESS. Specifically, the printf listing the first location when duplicate locks of the same type are acquired. Reported by: pho
OpenPOWER on IntegriCloud