summaryrefslogtreecommitdiffstats
path: root/sys/kern/kern_linker.c
Commit message (Collapse)AuthorAgeFilesLines
* MFC r298819:bdrewery2016-06-271-2/+2
| | | | sys/kern: spelling fixes in comments.
* MFC r297391:bdrewery2016-06-271-2/+0
| | | | Remove some NULL checks for M_WAITOK allocations.
* MFC 290728:jhb2016-01-181-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | Export various helper variables describing the layout and size of certain kernel structures for use by debuggers. This mostly aids in examining cores from a kernel without debug symbols as a debugger can infer these values if debug symbols are available. One set of variables describes the layout of 'struct linker_file' to walk the list of loaded kernel modules. A second set of variables describes the layout of 'struct proc' and 'struct thread' to walk the list of processes in the kernel and the threads in each process. The 'pcb_size' variable is used to index into the stoppcbs[] array. The 'vm_maxuser_address' is used to distinguish kernel virtual addresses from user addresses. This doesn't have to be perfect, and 'vm_maxuser_address' is a cheap and simple way to differentiate kernel pointers from simple values like TIDs and PIDs. While here, annotate the fields in struct pcb used by kgdb on amd64 and i386 to note that their ABI should be preserved. Annotations for other platforms will be added in the future.
* MFC r264173:kib2014-04-121-8/+2
| | | | Use realloc(9) instead of doing the reallocation inline.
* MFC r263080:kib2014-03-191-6/+3
| | | | Use correct types for sizeof() in the calculations for the malloc(9) sizes.
* MFC r259587: Invoke the kld_* event handlers from linker_load_file() ...avg2014-02-171-24/+14
|
* Rename the kld_unload event handler to kld_unload_try, and add a newmarkj2013-08-241-26/+13
| | | | | | | | | | | | | | kld_unload event handler which gets invoked after a linker file has been successfully unloaded. The kld_unload and kld_load event handlers are now invoked with the shared linker lock held, while kld_unload_try is invoked with the lock exclusively held. Convert hwpmc(4) to use these event handlers instead of having kern_kldload() and kern_kldunload() invoke hwpmc(4) hooks whenever files are loaded or unloaded. This has no functional effect, but simplifes the linker code somewhat. Reviewed by: jhb
* Set things up so that linker_file_lookup_set() is always called with themarkj2013-08-241-12/+20
| | | | | | | linker lock held. This makes it possible to call it from a kld event handler with the shared lock held. Reviewed by: jhb
* Remove the kld lock macros and just use the sx(9) API. Add locking inmarkj2013-08-241-67/+62
| | | | | | | linker_init_kernel_modules() and linker_preload() in order to remove most of the checks for !cold before asserting that the kld lock is held. These routines are invoked by SYSINIT(9), so there's no harm in them taking the kld lock.
* Use strdup(9) instead of reimplementing it.markj2013-08-161-14/+4
|
* Use kld_{load,unload} instead of mod_{load,unload} for the linker file loadmarkj2013-08-141-2/+2
| | | | | | | and unload event handlers added in r254266. Reported by: jhb X-MFC with: r254266
* FreeBSD's DTrace implementation has a few problems with respect to handlingmarkj2013-08-131-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | probes declared in a kernel module when that module is unloaded. In particular, * Unloading a module with active SDT probes will cause a panic. [1] * A module's (FBT/SDT) probes aren't destroyed when the module is unloaded; trying to use them after the fact will generally cause a panic. This change fixes both problems by porting the DTrace module load/unload handlers from illumos and registering them with the corresponding EVENTHANDLER(9) handlers. This allows the DTrace framework to destroy all probes defined in a module when that module is unloaded, and to prevent a module unload from proceeding if some of its probes are active. The latter problem has already been fixed for FBT probes by checking lf->nenabled in kern_kldunload(), but moving the check into the DTrace framework generalizes it to all kernel providers and also fixes a race in the current implementation (since a probe may be activated between the check and the call to linker_file_unload()). Additionally, the SDT implementation has been reworked to define SDT providers/probes/argtypes in linker sets rather than using SYSINIT/SYSUNINIT to create and destroy SDT probes when a module is loaded or unloaded. This simplifies things quite a bit since it means that pretty much all of the SDT code can live in sdt.ko, and since it becomes easier to integrate SDT with the DTrace framework. Furthermore, this allows FreeBSD to be quite flexible in that SDT providers spanning multiple modules can be created on the fly when a module is loaded; at the moment it looks like illumos' SDT implementation requires all SDT probes to be statically defined in a single kernel table. PR: 166927, 166926, 166928 Reported by: davide [1] Reviewed by: avg, trociny (earlier version) MFC after: 1 month
* Remove some unused fields from struct linker_file. They were added inmarkj2013-08-131-2/+0
| | | | | | | r172862 for use by the DTrace SDT framework but don't seem to have ever been used. MFC after: 2 weeks
* Add event handlers for module load and unload events. The load handlers aremarkj2013-08-131-2/+8
| | | | | | | | | called after the module has been loaded, and the unload handlers are called before the module is unloaded. Moreover, the module unload handlers may return an error to prevent the unload from proceeding. Reviewed by: avg MFC after: 2 weeks
* Remove the support for using non-mpsafe filesystem modules.kib2012-10-221-9/+3
| | | | | | | | | | | | In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems. The VFS_VERSION is bumped to indicate the interface change which does not result in the interface signatures changes. Conducted and reviewed by: attilio Tested by: pho
* If a linker file contains at least one module, but all of the modulesjhb2012-04-121-2/+13
| | | | | | | | fail to load (the MOD_LOAD event fails) during a kldload(2), unload the linker file and fail the kldload(2) with ENOEXEC. Reported by: gcooper MFC after: 1 week
* Correct debug message.ae2012-03-221-1/+1
|
* Acquire modules lock before call module_getname() in the KLD_DEBUG case.ae2012-03-211-0/+4
| | | | MFC after: 1 week
* Add CTLFLAG_TUN to the sysctl definition and fix style.ae2012-03-151-2/+2
| | | | | Pointed by: Garrett Cooper MFC after: 2 weeks
* Add debug.kld_debug loader tunable.ae2012-03-151-0/+1
| | | | MFC after: 2 weeks
* Fix found places where uio_resid is truncated to int.kib2012-02-211-2/+3
| | | | | | | | | Add the sysctl debug.iosize_max_clamp, enabled by default. Setting the sysctl to zero allows to perform the SSIZE_MAX-sized i/o requests from the usermode. Discussed with: bde, das (previous versions) MFC after: 1 month
* Revert r231923 for now. Further work is needed to make sure that thedelphij2012-02-201-1/+1
| | | | behavior is consistent.
* Use uprintf instead of printf for the reason why a kernel module can notdelphij2012-02-201-1/+1
| | | | | | | be loaded. This way, the administrator can get response immediately from the shell session rather than relying on dmesg. MFC after: 1 month
* Use strchr() and strrchr().ed2012-01-021-3/+3
| | | | | | | | It seems strchr() and strrchr() are used more often than index() and rindex(). Therefore, simply migrate all kernel code to use it. For the XFS code, remove an empty line to make the code identical to the code in the Linux kernel.
* Add KLD_DEBUG option.fjoe2011-11-061-0/+1
|
* In order to maximize the re-usability of kernel code in user space thiskmacy2011-09-161-8/+8
| | | | | | | | | | | | | patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel entry points and all places in the code that use them. It also fixes an additional name space collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls. Reviewed by: rwatson Approved by: re (bz)
* Don't leak kld_sx lock in kldunloadf().glebius2011-07-311-1/+2
| | | | Approved by: re (kib)
* Fix a LOR between hwpmc and the kernel linker. When a system-widerstone2011-07-171-11/+24
| | | | | | | | | | | | | | | | | | | | | | sampling mode PMC is allocated, hwpmc calls linker_hwpmc_list_objects() while already holding an exclusive lock on pmc-sx lock. list_objects() tries to acquire an exclusive lock on the kld_sx lock. When a KLD module is loaded or unloaded successfully, kern_kld(un)load calls into the pmc hook while already holding an exclusive lock on the kld_sx lock. Calling the pmc hook requires acquiring a shared lock on the pmc-sx lock. Fix this by only acquiring a shared lock on the kld_sx lock in linker_hwpmc_list_objects(), and also downgrading to a shared lock on the kld_sx lock in kern_kld(un)load before calling into the pmc hook. In kern_kldload this required moving some modifications of the linker_file_t to happen before calling into the pmc hook. This fixes the deadlock by ensuring that the hwpmc -> list_objects() case is always able to proceed. Without this patch, I was able to deadlock a multicore system within minutes by constantly loading and unloading an KLD module while I simultaneously started a sampling mode PMC in a loop. MFC after: 1 month
* Provide compat32 shims for kldstat(2).kib2011-03-301-23/+30
| | | | | Requested and tested by: jpaetzel MFC after: 1 week
* Specify a CTLTYPE_FOO so that a future sysctl(8) change does not needmdf2011-01-181-1/+1
| | | | to rely on the format string.
* Fix page fault that occurred when trying to initialize preloaded kernel module,trasz2011-01-051-3/+11
| | | | | | | | | | | | | | the dependency of which was preloaded, but failed to initialize. Previously, kernel dereferenced NULL pointer returned by modlist_lookup2(); now, when this happens, we unload the dependent module. Since the depended_files list is sorted in dependency order, this properly propagates, unloading modules that depend on failed ones. From the user point of view, this prevents the kernel from panicing when trying to boot kernel compiled without KDTRACE_HOOKS with dtraceall_load="YES" in /boot/loader.conf. Reviewed by: kib
* kdb_backtrace: use stack_print_ddb instead of stack_printavg2010-09-221-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | This is a followup to r212964. stack_print call chain obtains linker sx lock and thus potentially may lead to a deadlock depending on a kind of a panic. stack_print_ddb doesn't acquire any locks and it doesn't use any facilities of ddb backend. Using stack_print_ddb outside of DDB ifdef required taking a number of helper functions from under it as well. It is a good idea to rename linker_ddb_* and stack_*_ddb functions to have 'unlocked' component in their name instead of 'ddb', because those functions do not use any DDB services, but instead they provide unlocked access to linker symbol information. The latter was previously needed only for DDB, hence the 'ddb' name component. Alternative is to ditch unlocked versions altogether after implementing proper panic handling: 1. stop other cpus upon a panic 2. make all non-spinlock lock operations (mutex, sx, rwlock) be a no-op when panicstr != NULL Suggested by: mdf Discussed with: attilio MFC after: 2 weeks
* - Unbreak build with KLD_DEBUG definedgonzo2009-11-171-1/+6
| | | | | - Add debug.kld_debug sysctl to control KLD debugging level - Print information about KLD dependencies with debug enabled
* Revert previous commit and add myself to the list of people who shouldphk2009-09-081-1/+0
| | | | know better than to commit with a cat in the area.
* Add necessary include.phk2009-09-081-0/+1
|
* Merge the remainder of kern_vimage.c and vimage.h into vnet.c andrwatson2009-08-011-1/+2
| | | | | | | | | | vnet.h, we now use jails (rather than vimages) as the abstraction for virtualization management, and what remained was specific to virtual network stacks. Minor cleanups are done in the process, and comments updated to reflect these changes. Reviewed by: bz Approved by: re (vimage blanket)
* Improve the printf message when a module failed to load. This gives therpaulo2009-07-211-2/+2
| | | | | | | user some clue about the possibility of a __FreeBSD_version mismatch. Discussed with: rwatson, jhb Approved by: re (kib)
* Remove the interim vimage containers, struct vimage and struct procg,jamie2009-07-171-12/+0
| | | | | | and the ioctl-based interface that supported them. Approved by: re (kib), bz (mentor)
* Build on Jeff Roberson's linker-set based dynamic per-CPU allocatorrwatson2009-07-141-15/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables. Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing the in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of a the kernel linker. Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided. This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS. Bump __FreeBSD_version and update UPDATING. Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)
* Don't assume a default (currently 15) value for preloaded klds whenattilio2009-06-291-39/+19
| | | | | | | | | | | loading hwpmc, but calculate at runtime and allocate the necessary space. Also the current logic is wrong as it can lead to an endless loop. Sponsored by: Sandvine Incorporated Reported by: Ryan Stone <rstone at sandvine dot com> Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com> Approved by: re (kib)
* Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERICrwatson2009-06-051-1/+0
| | | | | | | | and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include. Discussed with: pjd
* Add hierarchical jails. A jail may further virtualize its environmentjamie2009-05-271-2/+3
| | | | | | | | | | | | | | | | | | | | | | by creating a child jail, which is visible to that jail and to any parent jails. Child jails may be restricted more than their parents, but never less. Jail names reflect this hierarchy, being MIB-style dot-separated strings. Every thread now points to a jail, the default being prison0, which contains information about the physical system. Prison0's root directory is the same as rootvnode; its hostname is the same as the global hostname, and its securelevel replaces the global securelevel. Note that the variable "securelevel" has actually gone away, which should not cause any problems for code that properly uses securelevel_gt() and securelevel_ge(). Some jail-related permissions that were kept in global variables and set via sysctls are now per-jail settings. The sysctls still exist for backward compatibility, used only by the now-deprecated jail(2) system call. Approved by: bz (mentor)
* A NOP change: style / whitespace cleanup of the noise that slippedzec2009-05-081-1/+1
| | | | | | | into r191816. Spotted by: bz Approved by: julian (mentor) (an earlier version of the diff)
* Introduce a new virtualization container, provisionally named vprocg, to holdzec2009-05-081-0/+12
| | | | | | | | | | | | | | | | | | | | | | virtualized instances of hostname and domainname, as well as a new top-level virtualization struct vimage, which holds pointers to struct vnet and struct vprocg. Struct vprocg is likely to become replaced in the near future with a new jail management API import. As a consequence of this change, change struct ucred to point to a struct vimage, instead of directly pointing to a vnet. Merge vnet / vimage / ucred refcounting infrastructure from p4 / vimage branch. Permit kldload / kldunload operations to be executed only from the default vimage context. This change should have no functional impact on nooptions VIMAGE kernel builds. Reviewed by: bz Approved by: julian (mentor)
* Change the curvnet variable from a global const struct vnet *,zec2009-05-051-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | previously always pointing to the default vnet context, to a dynamically changing thread-local one. The currvnet context should be set on entry to networking code via CURVNET_SET() macros, and reverted to previous state via CURVNET_RESTORE(). Recursions on curvnet are permitted, though strongly discuouraged. This change should have no functional impact on nooptions VIMAGE kernel builds, where CURVNET_* macros expand to whitespace. The curthread->td_vnet (aka curvnet) variable's purpose is to be an indicator of the vnet context in which the current network-related operation takes place, in case we cannot deduce the current vnet context from any other source, such as by looking at mbuf's m->m_pkthdr.rcvif->if_vnet, sockets's so->so_vnet etc. Moreover, so far curvnet has turned out to be an invaluable consistency checking aid: it helps to catch cases when sockets, ifnets or any other vnet-aware structures may have leaked from one vnet to another. The exact placement of the CURVNET_SET() / CURVNET_RESTORE() macros was a result of an empirical iterative process, whith an aim to reduce recursions on CURVNET_SET() to a minimum, while still reducing the scope of CURVNET_SET() to networking only operations - the alternative would be calling CURVNET_SET() on each system call entry. In general, curvnet has to be set in three typicall cases: when processing socket-related requests from userspace or from within the kernel; when processing inbound traffic flowing from device drivers to upper layers of the networking stack, and when executing timer-driven networking functions. This change also introduces a DDB subcommand to show the list of all vnet instances. Approved by: julian (mentor)
* Scanning all the formats for binary translation of modules loading canattilio2009-02-101-0/+8
| | | | | | | | | | | | | | | | | | | result in errors for a format loading but subsequent correct recognizing for another format. File format loading functions should avoid printing any additional informations but just returning appropriate (and different between each other) error condition, characterizing different informations. Additively, the linker should handle appropriately different format loading errors. While a general mechanism is desired, fix a simple and common case on amd64: file type is not recognized for link elf and confuses the linker. Printout an error if all the registered linker classes can't recognize and load the module. Reviewed by: jhb Sponsored by: Sandvine Incorporated
* Expand the scope of the sysctllock sx lock to protect the sysctl tree itself.jhb2009-02-061-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Back in 1.1 of kern_sysctl.c the sysctl() routine wired the "old" userland buffer for most sysctls (everything except kern.vnode.*). I think to prevent issues with wiring too much memory it used a 'memlock' to serialize all sysctl(2) invocations, meaning that only one user buffer could be wired at a time. In 5.0 the 'memlock' was converted to an sx lock and renamed to 'sysctl lock'. However, it still only served the purpose of serializing sysctls to avoid wiring too much memory and didn't actually protect the sysctl tree as its name suggested. These changes expand the lock to actually protect the tree. Later on in 5.0, sysctl was changed to not wire buffers for requests by default (sysctl_handle_opaque() will still wire buffers larger than a single page, however). As a result, user buffers are no longer wired as often. However, many sysctl handlers still wire user buffers, so it is still desirable to serialize userland sysctl requests. Kernel sysctl requests are allowed to run in parallel, however. - Expose sysctl_lock()/sysctl_unlock() routines to exclusively lock the sysctl tree for a few places outside of kern_sysctl.c that manipulate the sysctl tree directly including the kernel linker and vfs_register(). - sysctl_register() and sysctl_unregister() require the caller to lock the sysctl lock using sysctl_lock() and sysctl_unlock(). The rest of the public sysctl API manage the locking internally. - Add a locked variant of sysctl_remove_oid() for internal use so that external uses of the API do not need to be aware of locking requirements. - The kernel linker no longer needs Giant when manipulating the sysctl tree. - Add a missing break to the loop in vfs_register() so that we stop looking at the sysctl MIB once we have changed it. MFC after: 1 month
* Drop the kernel linker lock while running SYSUNINIT routines and removingjhb2009-02-051-0/+3
| | | | | | | | | sysctls during a linker file unload. We drop the lock when doing similar operations during a linker file load. To close races, clear the LINKED flag before dropping the lock so that the linker file is no longer visible to userland. MFC after: 1 week
* Conditionally compile out V_ globals while instantiating the appropriatezec2008-12-101-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | container structures, depending on VIMAGE_GLOBALS compile time option. Make VIMAGE_GLOBALS a new compile-time option, which by default will not be defined, resulting in instatiations of global variables selected for V_irtualization (enclosed in #ifdef VIMAGE_GLOBALS blocks) to be effectively compiled out. Instantiate new global container structures to hold V_irtualized variables: vnet_net_0, vnet_inet_0, vnet_inet6_0, vnet_ipsec_0, vnet_netgraph_0, and vnet_gif_0. Update the VSYM() macro so that depending on VIMAGE_GLOBALS the V_ macros resolve either to the original globals, or to fields inside container structures, i.e. effectively #ifdef VIMAGE_GLOBALS #define V_rt_tables rt_tables #else #define V_rt_tables vnet_net_0._rt_tables #endif Update SYSCTL_V_*() macros to operate either on globals or on fields inside container structs. Extend the internal kldsym() lookups with the ability to resolve selected fields inside the virtualization container structs. This applies only to the fields which are explicitly registered for kldsym() visibility via VNET_MOD_DECLARE() and vnet_mod_register(), currently this is done only in sys/net/if.c. Fix a few broken instances of MODULE_GLOBAL() macro use in SCTP code, and modify the MODULE_GLOBAL() macro to resolve to V_ macros, which in turn result in proper code being generated depending on VIMAGE_GLOBALS. De-virtualize local static variables in sys/contrib/pf/net/pf_subr.c which were prematurely V_irtualized by automated V_ prepending scripts during earlier merging steps. PF virtualization will be done separately, most probably after next PF import. Convert a few variable initializations at instantiation to initialization in init functions, most notably in ipfw. Also convert TUNABLE_INT() initializers for V_ variables to TUNABLE_FETCH_INT() in initializer functions. Discussed at: devsummit Strassburg Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
* - Invoke MOD_QUIESCE on all modules in a linker file (kld) beforejhb2008-12-051-4/+27
| | | | | | | | | | | | | | unloading any modules. As a result, if any module veto's an unload request via MOD_QUIESCE, the entire set of modules for that linker file will remain loaded and active now rather than leaving the kld in a weird state where some modules are loaded and some are unloaded. - This also moves the logic for handling the "forced" unload flag out of kern_module.c and into kern_linker.c which is a bit cleaner. - Add a module_name() routine that returns the name of a module and use that instead of printing pointer values in debug messages when a module fails MOD_QUIESCE or MOD_UNLOAD. MFC after: 1 month
OpenPOWER on IntegriCloud