path: root/sys/kern/kern_shutdown.c
Commit message [author, date, files changed, lines -removed/+added]
* Switch vm_object lock to be a rwlock. [attilio, 2013-02-20, 1 file, -0/+1]
|   * VM_OBJECT_LOCK and VM_OBJECT_UNLOCK are mapped to write operations
|   * VM_OBJECT_SLEEP() is introduced as a general purpose primitive to
|     get a sleep operation using a VM_OBJECT_LOCK() as protection
|   * The approach must bear with vm_pager.h namespace pollution, so many
|     files require including rwlock.h directly
* Switch the hardwired WITNESS panics to kassert_panic. [alfred, 2012-12-11, 1 file, -1/+1]
| This is an ongoing effort to provide runtime debug information
| useful in the field that does not panic existing installations.
|
| This gives us the flexibility needed when shipping images to
| a potentially large audience with WITNESS enabled without worrying
| about formerly non-fatal LORs hurting a release.
|
| Sponsored by: iXsystems
* allow KASSERT to enter KDB. [alfred, 2012-12-10, 1 file, -0/+14]
|
* Allow KASSERT to log instead of panic. [alfred, 2012-12-07, 1 file, -3/+125]
| This is to allow debug images to be used without taking down the
| system when non-fatal asserts are hit.
|
| The following sysctls are added:
|   debug.kassert.warn_only: 1 = log, 0 = panic
|   debug.kassert.do_ktr: set to a ktr mask for logging via KTR
|   debug.kassert.do_log: 1 = log, 0 = quiet
|   debug.kassert.warnings: stats, number of kasserts hit
|   debug.kassert.log_panic_at: number of kasserts before we actually
|     panic, 0 = never
|   debug.kassert.log_pps_limit: pps limit for log messages
|   debug.kassert.log_mute_at: stop warning after N kasserts, 0 = never stop
|   debug.kassert.kassert: set this sysctl to trigger a kassert
|
| Discussed with: scottl, gnn, marcel
| Sponsored by: iXsystems
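How the warn_only, log_panic_at and log_mute_at knobs described above might interact can be sketched in user space. This is a simplified model, not the kernel's kassert_panic(): the struct, the enum, the function name kassert_hit() and the exact threshold comparisons are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical user-space stand-ins for the sysctl-backed knobs. */
struct kassert_knobs {
	int warn_only;		/* 1 = log, 0 = panic */
	int log_panic_at;	/* panic after this many hits, 0 = never */
	int log_mute_at;	/* stop logging after this many hits, 0 = never */
	int warnings;		/* statistics: number of asserts hit */
};

/* What the model decides to do for one failed assertion. */
enum kassert_action { KA_PANIC, KA_LOG, KA_MUTED };

/*
 * Decide the action for a failed assertion.  The thresholds are an
 * assumption of this sketch, not a copy of the kernel's logic.
 */
static enum kassert_action
kassert_hit(struct kassert_knobs *k, const char *msg)
{
	if (!k->warn_only)
		return (KA_PANIC);	/* classic behaviour: always panic */
	k->warnings++;
	if (k->log_panic_at > 0 && k->warnings >= k->log_panic_at)
		return (KA_PANIC);	/* too many warnings: give up */
	if (k->log_mute_at > 0 && k->warnings > k->log_mute_at)
		return (KA_MUTED);	/* counted, but no longer logged */
	fprintf(stderr, "KASSERT failed: %s\n", msg);
	return (KA_LOG);
}
```

With warn_only = 1, log_mute_at = 1 and log_panic_at = 3, the first hit is logged, the second is counted silently, and the third escalates to a panic in this model.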
* remove stop_scheduler_on_panic knob [avg, 2012-11-25, 1 file, -36/+16]
| There have not been any complaints about the default behavior, so
| there is no need to keep a knob that enables the worse alternative.
|
| Now that the hard-stopping of other CPUs is the only behavior, the
| panic_cpu spinlock-like logic can be dropped, because only a single
| CPU is supposed to win the stop_cpus_hard(other_cpus) race and
| proceed past that call.
|
| MFC after: 1 month
* Merge 242488, better use of strlcpy. [alfred, 2012-11-02, 1 file, -2/+3]
| Submitted by: Eric van Gyzen <eric@vangyzen.net>
* Provide a device name in the sysctl tree for programs to query the [alfred, 2012-11-01, 1 file, -1/+11]
| state of crashdump target devices.
|
| This will be used to add a "-l" (ell) flag to dumpon(8) to list the
| currently configured dumpdev.
|
| Reviewed by: phk
* free wdog_kern_pat calls in post-panic paths from under SW_WATCHDOG [avg, 2012-06-03, 1 file, -6/+1]
| Those calls are useful with hardware watchdog drivers too.
|
| MFC after: 3 weeks
* Make dumptid non-static. It is used by libkvm to detect whether [harti, 2012-05-22, 1 file, -1/+1]
| this is a VNET-kernel or not. gcc used to put the static symbol into
| the symbol table, clang does not. This fixes the 'netstat: no namelist'
| error seen on clang+VNET systems.
* Avoid checking the same cache line/variable from all the locking [attilio, 2012-01-28, 1 file, -2/+1]
| primitives by breaking stop_scheduler into a per-thread variable.
|
| Also, store the new td_stopsched very close to the td_*locks members,
| as they will be accessed mostly in the same codepaths as td_stopsched,
| which possibly avoids further cache-line pollution.
|
| Giving STOP_SCHEDULER() a new 'thread' argument was considered, in
| order to take advantage of an already cached curthread, but in the end
| there should not really be a performance benefit, and it would
| introduce a KPI breakage.
|
| In collaboration with: flo
| Reviewed by: avg
| MFC after: 3 months (or never)
| X-MFC: r228424
* enable stop_scheduler_on_panic by default [avg, 2012-01-09, 1 file, -1/+1]
| My plan is to make this behavior unconditional before 10.0 release.
|
| X-MFC after: r228424 (if ever)
* introduce cngrab/cnungrab stub calls in some places where they make sense [avg, 2011-12-17, 1 file, -0/+3]
| MFC after: 2 months
* Match other formatting. [obrien, 2011-12-14, 1 file, -4/+4]
|
* Disallow various debug.kdb sysctl's when securelevel is raised. [obrien, 2011-12-13, 1 file, -4/+6]
| PR: 161350
* Document a large number of currently undocumented sysctls. While here [eadler, 2011-12-13, 1 file, -2/+2]
| fix some style(9) issues and reduce redundancy.
|
| PR: kern/155491
| PR: kern/155490
| PR: kern/155489
| Submitted by: Galimov Albert <wtfcrap@mail.ru>
| Approved by: bde
| Reviewed by: jhb
| MFC after: 1 week
* panic: add a switch and infrastructure for stopping other CPUs in SMP case [avg, 2011-12-11, 1 file, -6/+36]
| The historical behavior of letting other CPUs merrily go on remains the
| default for the time being. The new behavior can be switched on via the
| kern.stop_scheduler_on_panic tunable and sysctl.
|
| Stopping of the CPUs has (at least) the following benefits:
| - more of the system state at panic time is preserved intact
| - threads and interrupts do not interfere with dumping of the system
|   state
|
| Only one thread runs uninterrupted after panic if
| stop_scheduler_on_panic is set. That thread might call code that is
| also used in normal context, and that code might use locks to prevent
| concurrent execution of certain parts. Those locks might be held by
| the stopped threads and would never be released. To work around this
| issue, it was decided that instead of explicit checks for panic
| context, we would rather put those checks inside the locking
| primitives.
|
| This change has substantial portions written and re-written by attilio
| and kib at various times. Other changes are heavily based on the ideas
| and patches submitted by jhb and mdf. bde has provided many insights
| into the details and history of the current code.
|
| The new behavior may cause problems for systems that use a USB keyboard
| for interfacing with the system console. This is because of some
| unusual locking patterns in the ukbd code which have to be used
| because on one hand ukbd is below syscons, but on the other hand it has
| to interface with other usb code that uses regular mutexes/Giant for
| its concurrency protection. Dumping to USB-connected disks may also be
| affected.
|
| PR: amd64/139614 (at least)
| In cooperation with: attilio, jhb, kib, mdf
| Discussed with: arch@, bde
| Tested by: Eugene Grosbein <eugen@grosbein.net>, gnn,
|   Steven Hartland <killing@multiplay.co.uk>, glebius,
|   Andrew Boyer <aboyer@averesystems.com>
|   (various versions of the patch)
| MFC after: 3 months (or never)
* Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs. [ed, 2011-11-07, 1 file, -1/+2]
| The SYSCTL_NODE macro defines a list that stores all child-elements of
| that node. If there's no SYSCTL_DECL macro anywhere else, there's no
| reason why it shouldn't be static.
* In order to maximize the re-usability of kernel code in user space this [kmacy, 2011-09-16, 1 file, -4/+4]
| patch modifies makesyscalls.sh to prefix all of the non-compatibility
| calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel
| entry points and all places in the code that use them. It also fixes an
| additional name space collision between the kernel function psignal and
| the libc function of the same name by renaming the kernel psignal to
| kern_psignal(). By introducing this change now we will ease future MFCs
| that change syscalls.
|
| Reviewed by: rwatson
| Approved by: re (bz)
* dump_write() returns ENXIO if the dump is trying to be written outside [attilio, 2011-09-12, 1 file, -2/+5]
| of the device boundary.
| While this is generally ok, the problem is that all the consumers
| handle similar cases by expecting (and catching) ENOSPC (for
| reference, look at the minidumpsys() and dumpsys() constructions).
| As a result, consumers do not recognize the issue, and amd64 fails to
| retry if the number of pages grows during a minidump.
| Fix this by returning ENOSPC in dump_write() and, while here, add some
| more diagnostics on the values involved.
|
| Sponsored by: Sandvine Incorporated
| In collaboration with: emaste
| Approved by: re (kib)
| MFC after: 10 days
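The boundary check this commit refers to can be illustrated with a small user-space sketch. The struct dump_target and the function dump_write_check() are hypothetical names chosen for this example; only the idea (reject a write that falls outside the dump device and report ENOSPC rather than ENXIO) comes from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the dump device's usable region. */
struct dump_target {
	int64_t mediaoffset;	/* first byte available for the dump */
	int64_t mediasize;	/* number of bytes available */
};

/*
 * Return 0 if [offset, offset + length) lies inside the device region,
 * ENOSPC otherwise, so that callers which retry on ENOSPC see the
 * error they expect.  Zero-length writes are always accepted.
 */
static int
dump_write_check(const struct dump_target *dt, int64_t offset, size_t length)
{
	if (length != 0 && (offset < dt->mediaoffset ||
	    offset - dt->mediaoffset + (int64_t)length > dt->mediasize))
		return (ENOSPC);
	return (0);
}
```

A caller in this model would run the check before handing the buffer to the device's write routine and treat ENOSPC as "the dump does not fit", which is the condition the minidump retry logic looks for.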
* Improve the information reported for busy buffers during shutdown: [attilio, 2011-09-08, 1 file, -8/+20]
| - Axe the SHOW_BUSYBUFS option and use a tunable to selectively
|   enable or disable the output; it defaults to printing nothing
|   (value 0) but can be changed to print (value 1) or to be verbose
|   (value 2)
| - Improve the information printed: right now there is no trace of the
|   actual struct buf object or vnode referenced by the shutdown
|   process; only the related struct bufobj object is printed, which is
|   not really helpful
| - Add more verbosity about the state of the struct buf lock and the
|   vnode information, with the latter to be activated separately by the
|   sysctl
|
| Sponsored by: Sandvine Incorporated
| Reviewed by: emaste, kib
| Approved by: re (ksmith)
| MFC after: 10 days
* remove RESTARTABLE_PANICS option [avg, 2011-07-25, 1 file, -9/+0]
| This is done per request/suggestion from John Baldwin, who introduced
| the option. Trying to resume normal system operation after a panic is
| very unpredictable and dangerous. It will become even more dangerous
| when we allow a thread in panic(9) to penetrate all lock contexts.
| I understand that the only purpose of this option was for testing
| scenarios potentially resulting in panic.
|
| Suggested by: jhb
| Reviewed by: attilio, jhb
| X-MFC-After: never
| Approved by: re (kib)
* In the current code, a double panic condition may lead to dumps [attilio, 2011-06-08, 1 file, -1/+2]
| interleaving.
| Signal dumping to happen only for the first panic, which should be the
| most important.
|
| Sponsored by: Sandvine Incorporated
| Submitted by: Nima Misaghian (nmisaghian AT sandvine DOT com)
| MFC after: 2 weeks
* Fix making kernel dumps from the debugger by creating a command [marcel, 2011-06-07, 1 file, -14/+16]
| for it. Do not expect a developer to call doadump(). Calling doadump
| does not necessarily work when it's declared static. Nor does it
| necessarily do what was intended in the context of text dumps. The
| dump command always creates a core dump.
|
| Move printing of error messages from doadump to the dump command,
| now that we don't have to worry about being called from DDB.
* Add watchdog patting during the (shutdown time) disk syncing and [attilio, 2011-04-28, 1 file, -0/+10]
| disk dumping.
| With the option SW_WATCHDOG on, these operations are doomed to let the
| watchdog fire if they take too long.
|
| I implemented the stubs this way because I really want the wdog_kern_*
| KPI not to depend on SW_WATCHDOG being on (and really, the option only
| enables watchdog activation in hardclock) and also to avoid calling
| them when not necessary (avoiding involuntary watchdog activations).
|
| Sponsored by: Sandvine Incorporated
| Discussed with: emaste, des
| MFC after: 2 weeks
* Mostly revert r203420, and add similar functionality into ada(4) since the [brucec, 2010-10-24, 1 file, -1/+1]
| existing code caused problems with some SCSI controllers.
|
| A new sysctl kern.cam.ada.spindown_shutdown has been added that controls
| whether or not to spin down disks when shutting down.
| Spinning down the disks unloads/parks the heads. This is much better
| than removing power when the disk is still spinning, because otherwise
| an Emergency Unload occurs, which may cause damage to the actuator.
|
| PR: kern/140752
| Submitted by: olli
| Reviewed by: arundel
| Discussed with: mav
| MFC after: 2 weeks
* Rename boot() to kern_reboot() and make it visible outside of [marcel, 2010-10-18, 1 file, -7/+6]
| kern_shutdown.c. This makes it easier for emulators and other parts of
| the kernel to initiate a reboot.
* panic_cpu variable should be volatile [avg, 2010-10-09, 1 file, -4/+3]
| This is to prevent caching of its value in a register when it is
| checked and modified by multiple CPUs in parallel.
| Also, move the variable into the scope of the only function that uses
| it.
|
| Reviewed by: jhb
| Hint from: mdf
| MFC after: 1 week
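The concern in this commit is a variable that several CPUs check and modify in parallel to decide which one owns the panic. A minimal user-space sketch of that ownership claim follows; it deliberately swaps in C11 atomics instead of the kernel's volatile variable and atomic operations, and the names NOCPU, panic_cpu and panic_claim() are used here purely for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define	NOCPU	(-1)		/* sentinel: no CPU owns the panic yet */

/* Shared owner slot; _Atomic forces every access to go to memory. */
static _Atomic int panic_cpu = NOCPU;

/*
 * Try to become the CPU that handles the panic.  Returns true for the
 * first caller and for a nested call from the same CPU, false for any
 * other CPU that arrives later.
 */
static bool
panic_claim(int cpuid)
{
	int expected = NOCPU;

	if (atomic_compare_exchange_strong(&panic_cpu, &expected, cpuid))
		return (true);		/* we won the race */
	return (expected == cpuid);	/* already ours: nested panic */
}
```

In this model a losing CPU would simply stop making progress; the compare-and-exchange guarantees that exactly one distinct CPU id is ever stored in the slot.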
* sysctls in kern_shutdown: add twin tunables [avg, 2010-10-01, 1 file, -6/+9]
| also make a couple of sysctl-controlled variables static
|
| Reviewed by: rwatson
| MFC after: 1 week
* Fix compilation in the !SMP case. [attilio, 2010-04-20, 1 file, -0/+6]
| Keep the interrupts disabled in order to avoid preemption problems.
|
| Reported by: tinderbox, b.f. <bf1783 at googlemail dot com>
| MFC: 2 weeks
| X-MFC: r206878
* Fix a deadlock in the shutdown code: [attilio, 2010-04-19, 1 file, -7/+12]
| When performing a smp_rendezvous() or, more likely on amd64 and i386,
| a smp_tlb_shootdown(), the caller will end up with the smp_ipi_mtx
| spinlock held, busy-waiting for other CPUs to acknowledge the
| operation.
| As long as CPUs are suspended (via cpu_reset()) between the active mask
| read and IPI sending, there can be a deadlock where the caller will
| wait forever for a dead CPU to acknowledge the operation.
| Please note that on CPU0 that is going to be somewhat heavier because
| of the spinlocks being disabled earlier than quitting the machine.
|
| Fix this bug by calling cpu_reset() with the smp_ipi_mtx held.
| Note that it is very likely that a saner offline/online CPUs mechanism
| will help heavily in fixing similar cases, as it is likely more bugs of
| this type may arise in the future.
|
| Reported by: rwatson
| Discussed with: jhb
| Tested by: rnoland,
|   Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
| MFC: 2 weeks
|
| Special dedication to: anyone who made it possible to have 16-way
|   machines in Netperf
* MFp4: [mav, 2010-02-03, 1 file, -1/+1]
| Make CAM stop all attached devices on system shutdown. This allows
| devices to park heads, reducing stress on power loss. Add the
| `kern.cam.power_down` tunable and sysctl to control it.
* Don't bother copying the name of a kproc or kthread out into a temporary [jhb, 2009-10-23, 1 file, -6/+2]
| array just to pass that array to printf(). kproc and kthread names are
| NUL-terminated and can be printed using printf() directly.
|
| Reviewed by: bde
* Add a comment on the consequences of reducing the poweroff delay [n_hibma, 2009-09-10, 1 file, -0/+4]
|
* * Completely remove the option STOP_NMI from the kernel. This option [attilio, 2009-08-13, 1 file, -3/+7]
|   has proven to have a good effect when entering KDB by using an NMI,
|   but it completely violates all the good rules about interrupts
|   disabled while holding a spinlock in other occasions. This can be the
|   cause of deadlocks on events where a normal IPI_STOP is expected.
| * Add a new IPI called IPI_STOP_HARD on all the supported
|   architectures. This IPI is responsible for sending a stop message
|   among CPUs using a privileged channel when available. In other cases
|   it just matches a normal IPI_STOP.
|   Right now the IPI_STOP_HARD functionality uses an NMI on ia32 and
|   amd64 architectures, while on the others it has a normal IPI_STOP
|   effect. It is the responsibility of maintainers to eventually
|   implement a hard stop when necessary and possible.
| * Use the new IPI facility in order to implement a new userend SMP
|   kernel function called stop_cpus_hard(). It is analogous to
|   stop_cpus() but it uses the privileged channel for the stopping
|   facility.
| * Let KDB use the newly introduced function stop_cpus_hard() and leave
|   stop_cpus() for all the other cases
| * Disable interrupts on CPU0 when starting the process of APs
|   suspension.
| * Style cleanup and added comments
|
| This patch should fix the reboot/shutdown deadlocks many users are
| constantly reporting on mailing lists.
|
| Please don't forget to update your config file with the STOP_NMI
| option removal
|
| Reviewed by: jhb
| Tested by: pho, bz, rink
| Approved by: re (kib)
* Rename the host-related prison fields to be the same as the host.* [jamie, 2009-06-13, 1 file, -1/+1]
| parameters they represent, and the variables they replaced, instead of
| abbreviated versions of them.
|
| Approved by: bz (mentor)
* Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC [rwatson, 2009-06-05, 1 file, -1/+0]
| and used in a large number of files, but also because an increasing
| number of incorrect uses of MAC calls were sneaking in due to
| copy-and-paste of MAC-aware code without the associated opt_mac.h
| include.
|
| Discussed with: pjd
* Place hostnames and similar information fully under the prison system. [jamie, 2009-05-29, 1 file, -2/+2]
| The system hostname is now stored in prison0, and the global variable
| "hostname" has been removed, as has the hostname_mtx mutex. Jails may
| have their own host information, or they may inherit it from the
| parent/system. The proper way to read the hostname is via
| getcredhostname(), which will copy either the hostname associated with
| the passed cred, or the system hostname if you pass NULL. The system
| hostname can still be accessed directly (and without locking) at
| prison0.pr_host, but that should be avoided where possible.
|
| The "similar information" referred to is domainname, hostid, and
| hostuuid, which have also become prison parameters and had their
| associated global variables removed.
|
| Approved by: bz (mentor)
* PowerPC, meet kernel core dumps. The support is based [marcel, 2009-04-04, 1 file, -9/+0]
| on a generic dumper that creates an ELF core file and uses PMAP
| functions to scan and iterate over memory chunks, as well as handle
| memory mappings used during dumping. The PMAP layer can choose to
| return physical memory chunks or virtual memory chunks. For minidumps,
| the chunks should be virtual.
|
| The default MMU I/F implementation for the scan_md() method returns
| NULL. Thus, when a PMAP implementation does not implement the required
| methods, an empty core file is created. Here, empty means having an ELF
| header only.
|
| Obtained from: Juniper Networks
* It's possible that the dump device has gone away after it was [dwmalone, 2008-11-23, 1 file, -1/+1]
| configured, change the message to let people know this is a
| possibility. I've slightly changed the message from the one submitted
| by Pekka to keep the printf on one line.
|
| Submitted by: Pekka Savola <pekkas@netcore.fi>
* Collect N identical (or near identical) mkdumpheader() implementations into [peter, 2008-10-01, 1 file, -0/+22]
| one, as threatened in the comment. Textdump magic can be passed in.
* If the panic thread is preempted after setting panicstr but before [kib, 2008-09-27, 1 file, -0/+2]
| setting TDF_INPANIC then it will never be rescheduled again. Wrap
| the setting of the panic condition in a critical section.
|
| Noted and reviewed by: tegge
| MFC after: 1 week
* In keeping with style(9)'s recommendations on macros, use a ';' [rwatson, 2008-03-16, 1 file, -1/+1]
| after each SYSINIT() macro invocation. This makes a number of
| lightweight C parsers much happier with the FreeBSD kernel source,
| including cflow's prcc and lxr.
|
| MFC after: 1 month
| Discussed with: imp, rink
* Make it possible to continue working after calling doadump() [ru, 2008-03-04, 1 file, -0/+1]
| manually from debugger. (This got broken in rev. 1.122.)
* Add a wrapper function that bound checks writes to the dump device. [ru, 2008-01-28, 1 file, -0/+14]
|
* - Introduce the function lockmgr_recursed() which returns true if the [attilio, 2008-01-19, 1 file, -1/+1]
|   lockmgr lkp, when held in exclusive mode, is recursed
| - Introduce the function BUF_RECURSED() which does the same for bufobj
|   locks, built on top of lockmgr_recursed()
| - Introduce the function BUF_ISLOCKED() which works like its
|   counterpart VOP_ISLOCKED(9), showing the state of the lockmgr linked
|   with the bufobj
|
| BUF_RECURSED() and BUF_ISLOCKED() entirely replace the usage of the
| bogus BUF_REFCNT() in a more explicit and SMP-compliant way.
| This allows us to axe BUF_REFCNT() and leaves the function lockcount()
| totally unused in our stock kernel. Further commits will axe
| lockcount() as well, as part of the lockmgr() cleanup.
|
| The KPI is, obviously, broken as a result, so further commits will
| update manpages and the freebsd version.
|
| Tested by: kris (on UFS and NFS)
* Add textdump(4) facility, which provides an alternative form of kernel [rwatson, 2007-12-26, 1 file, -1/+9]
| dump using mechanically generated/extracted debugging output rather
| than a simple memory dump. Current sources of debugging output are:
|
| - DDB output capture buffer, if there is captured output to save
| - Kernel message buffer
| - Kernel configuration, if included in kernel
| - Kernel version string
| - Panic message
|
| Textdumps are stored in swap/dump partitions as with regular dumps,
| but are laid out as ustar files in order to allow multiple parts to be
| stored as a stream of sequentially written blocks. Blocks are written
| out in reverse order, as the size of a textdump isn't known a priori.
| As with regular dumps, they will be extracted using savecore(8).
|
| One new DDB(4) command is added, "textdump", which accepts "set",
| "unset", and "status" arguments. By default, normal kernel dumps are
| generated unless "textdump set" is run in order to schedule a
| textdump. It can be canceled using "textdump unset" to restore
| generation of a normal kernel dump.
|
| Several sysctls exist to configure aspects of textdumps;
| debug.ddb.textdump.pending can be read to check whether a textdump is
| pending, or set/unset from userspace in order to control whether the
| next kernel dump will be a textdump.
|
| While textdumps don't have to be generated as a result of a DDB script
| run automatically as part of a kernel panic, this is a particularly
| useful way to use them: instead of generating a complete memory dump,
| a simple transcript of an automated DDB session can be captured using
| the DDB output capture and textdump facilities. This can be used to
| generate quite brief kernel bug reports rich in debugging information
| but not dependent on kernel symbol tables or precisely synchronized
| source code. Most textdumps I generate are less than 100k including
| the full message buffer.
|
| Using textdumps with an interactive debugging session is also useful,
| with capture being enabled/disabled in order to record some but not
| all of the DDB session.
|
| MFC after: 3 months
* Add a new 'why' argument to kdb_enter(), and a set of constants to use [rwatson, 2007-12-25, 1 file, -1/+1]
| for that argument. This will allow DDB to detect the broad category of
| reason why the debugger has been entered, which it can use for the
| purposes of deciding which DDB script to run.
|
| Assign approximate why values to all current consumers of the
| kdb_enter() interface.
* Introduce a way to make pure kernel threads. [julian, 2007-10-26, 1 file, -0/+22]
| kthread_add() takes the same parameters as the old kthread_create()
| plus a pointer to a process structure, and adds a kernel thread
| to that process.
|
| kproc_kthread_add() takes the parameters for kthread_add, plus
| a process name and a pointer to a pointer to a process instead of just
| a pointer, and if the proc * is NULL, it creates the process to the
| specifications required, before adding the thread to it.
|
| All other old kthread_xxx() calls return, but act on (struct thread *)
| instead of (struct proc *). One reason to change the name is so that
| any old kernel modules that are lying around and expect
| kthread_create() to make a process will not just accidentally link.
|
| fix top to show kernel threads by their thread name in -SH mode
| add a tdnam formatting option to ps to show thread names.
|
| make all idle threads actual kthreads and put them into their own
| idled process.
|
| make all interrupt threads kthreads and put them in an interd process
| (mainly for aesthetic and accounting reasons)
| rename proc 0 to be 'kernel' and its swapper thread is now 'swapper'
|
| man page fixes to follow.
* Merge first in a series of TrustedBSD MAC Framework KPI changes [rwatson, 2007-10-24, 1 file, -1/+1]
| from Mac OS X Leopard--rationalize naming for entry points to
| the following general forms:
|
|   mac_<object>_<method/action>
|   mac_<object>_check_<method/action>
|
| The previous naming scheme was inconsistent and mostly
| reversed from the new scheme. Also, make object types more
| consistent and remove spaces from object types that contain
| multiple parts ("posix_sem" -> "posixsem") to make mechanical
| parsing easier. Introduce a new "netinet" object type for
| certain IPv4/IPv6-related methods. Also simplify, slightly,
| some entry point names.
|
| All MAC policy modules will need to be recompiled, and modules
| not updated as part of this commit will need to be modified to
| conform to the new KPI.
|
| Sponsored by: SPARTA (original patches against Mac OS X)
| Obtained from: TrustedBSD Project, Apple Computer
* Rename the kthread_xxx (e.g. kthread_create()) calls [julian, 2007-10-20, 1 file, -1/+1]
| to kproc_xxx as they actually make whole processes.
| This makes way for us to add REAL kthread_create() and friends
| that actually make threads. It turns out that most of these
| calls actually end up being moved back to the thread version
| when it's added, but we need to make this cosmetic change first.
|
| I'd LOVE to do this rename in 7.0 so that we can eventually MFC the
| new kthread_xxx() calls.