summaryrefslogtreecommitdiffstats
path: root/kernel/module.c
Commit message (Collapse)AuthorAgeFilesLines
* module: don't unlink the module until we've removed all exposure.Rusty Russell2013-04-171-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Otherwise we get a race between unload and reload of the same module: the new module doesn't see the old one in the list, but then fails because it can't register over the still-extant entries in sysfs: [ 103.981925] ------------[ cut here ]------------ [ 103.986902] WARNING: at fs/sysfs/dir.c:536 sysfs_add_one+0xab/0xd0() [ 103.993606] Hardware name: CrownBay Platform [ 103.998075] sysfs: cannot create duplicate filename '/module/pch_gbe' [ 104.004784] Modules linked in: pch_gbe(+) [last unloaded: pch_gbe] [ 104.011362] Pid: 3021, comm: modprobe Tainted: G W 3.9.0-rc5+ #5 [ 104.018662] Call Trace: [ 104.021286] [<c103599d>] warn_slowpath_common+0x6d/0xa0 [ 104.026933] [<c1168c8b>] ? sysfs_add_one+0xab/0xd0 [ 104.031986] [<c1168c8b>] ? sysfs_add_one+0xab/0xd0 [ 104.037000] [<c1035a4e>] warn_slowpath_fmt+0x2e/0x30 [ 104.042188] [<c1168c8b>] sysfs_add_one+0xab/0xd0 [ 104.046982] [<c1168dbe>] create_dir+0x5e/0xa0 [ 104.051633] [<c1168e78>] sysfs_create_dir+0x78/0xd0 [ 104.056774] [<c1262bc3>] kobject_add_internal+0x83/0x1f0 [ 104.062351] [<c126daf6>] ? kvasprintf+0x46/0x60 [ 104.067231] [<c1262ebd>] kobject_add_varg+0x2d/0x50 [ 104.072450] [<c1262f07>] kobject_init_and_add+0x27/0x30 [ 104.078075] [<c1089240>] mod_sysfs_setup+0x80/0x540 [ 104.083207] [<c1260851>] ? module_bug_finalize+0x51/0xc0 [ 104.088720] [<c108ab29>] load_module+0x1429/0x18b0 We can teardown sysfs first, then to be sure, put the state in MODULE_STATE_UNFORMED so it's ignored while we deconstruct it. Reported-by: Veaceslav Falico <vfalico@redhat.com> Tested-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* module: fix symbol versioning with symbol prefixesJames Hogan2013-03-201-1/+2
| | | | | | | | | | | | | | | | | | | | | Fix symbol versioning on architectures with symbol prefixes. Although the build was free from warnings the actual modules still wouldn't load as the ____versions table contained unprefixed symbol names, which were being compared against the prefixed symbol names when checking the symbol versions. This is fixed by modifying modpost to add the symbol prefix to the ____versions table it outputs (Modules.symvers still contains unprefixed symbol names). The check_modstruct_version() function is also fixed as it checks the version of the unprefixed "module_layout" symbol which would no longer work. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Michal Marek <mmarek@suse.cz> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jonathan Kliegman <kliegs@chromium.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (use VMLINUX_SYMBOL_STR)
* CONFIG_SYMBOL_PREFIX: cleanup.Rusty Russell2013-03-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have CONFIG_SYMBOL_PREFIX, which three archs define to the string "_". But Al Viro broke this in "consolidate cond_syscall and SYSCALL_ALIAS declarations" (in linux-next), and he's not the first to do so. Using CONFIG_SYMBOL_PREFIX is awkward, since we usually just want to prefix it so something. So various places define helpers which are defined to nothing if CONFIG_SYMBOL_PREFIX isn't set: 1) include/asm-generic/unistd.h defines __SYMBOL_PREFIX. 2) include/asm-generic/vmlinux.lds.h defines VMLINUX_SYMBOL(sym) 3) include/linux/export.h defines MODULE_SYMBOL_PREFIX. 4) include/linux/kernel.h defines SYMBOL_PREFIX (which differs from #7) 5) kernel/modsign_certificate.S defines ASM_SYMBOL(sym) 6) scripts/modpost.c defines MODULE_SYMBOL_PREFIX 7) scripts/Makefile.lib defines SYMBOL_PREFIX on the commandline if CONFIG_SYMBOL_PREFIX is set, so that we have a non-string version for pasting. (arch/h8300/include/asm/linkage.h defines SYMBOL_NAME(), too). Let's solve this properly: 1) No more generic prefix, just CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX. 2) Make linux/export.h usable from asm. 3) Define VMLINUX_SYMBOL() and VMLINUX_SYMBOL_STR(). 4) Make everyone use them. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Reviewed-by: James Hogan <james.hogan@imgtec.com> Tested-by: James Hogan <james.hogan@imgtec.com> (metag)
* Merge branch 'for-linus' of ↵Linus Torvalds2013-02-261-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile (part one) from Al Viro: "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent locking violations, etc. The most visible changes here are death of FS_REVAL_DOT (replaced with "has ->d_weak_revalidate()") and a new helper getting from struct file to inode. Some bits of preparation to xattr method interface changes. Misc patches by various people sent this cycle *and* ocfs2 fixes from several cycles ago that should've been upstream right then. PS: the next vfs pile will be xattr stuff." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits) saner proc_get_inode() calling conventions proc: avoid extra pde_put() in proc_fill_super() fs: change return values from -EACCES to -EPERM fs/exec.c: make bprm_mm_init() static ocfs2/dlm: use GFP_ATOMIC inside a spin_lock ocfs2: fix possible use-after-free with AIO ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero target: writev() on single-element vector is pointless export kernel_write(), convert open-coded instances fs: encode_fh: return FILEID_INVALID if invalid fid_type kill f_vfsmnt vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op nfsd: handle vfs_getattr errors in acl protocol switch vfs_getattr() to struct path default SET_PERSONALITY() in linux/elf.h ceph: prepopulate inodes only when request is aborted d_hash_and_lookup(): export, switch open-coded instances 9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate() 9p: split dropping the acls from v9fs_set_create_acl() ...
| * switch vfs_getattr() to struct pathAl Viro2013-02-261-1/+1
| | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | module: clean up load_module a little more.Rusty Russell2013-01-211-39/+68
| | | | | | | | | | | | | | | | | | | | | | 1fb9341ac34825aa40354e74d9a2c69df7d2c304 made our locking in load_module more complicated: we grab the mutex once to insert the module in the list, then again to upgrade it once it's formed. Since the locking is self-contained, it's neater to do this in separate functions. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* | taint: add explicit flag to show whether lock dep is still OK.Rusty Russell2013-01-211-11/+15
| | | | | | | | | | | | | | Fix up all callers as they were before, with make one change: an unsigned module taints the kernel, but doesn't turn off lockdep. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* | module: printk message when module signature fail taints kernel.Rusty Russell2013-01-211-1/+6
| | | | | | | | | | Reported-by: Chris Samuel <chris@csamuel.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* | module: fix missing module_mutex unlockLinus Torvalds2013-01-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 1fb9341ac348 ("module: put modules in list much earlier") moved some of the module initialization code around, and in the process changed the exit paths too. But for the duplicate export symbol error case the change made the ddebug_cleanup path jump to after the module mutex unlock, even though it happens with the mutex held. Rusty has some patches to split this function up into some helper functions, hopefully the mess of complex goto targets will go away eventually. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge tag 'fixes-for-linus' of ↵Linus Torvalds2013-01-201-46/+108
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module fixes and a virtio block fix from Rusty Russell: "Various minor fixes, but a slightly more complex one to fix the per-cpu overload problem introduced recently by kvm id changes." * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: module: put modules in list much earlier. module: add new state MODULE_STATE_UNFORMED. module: prevent warning when finit_module a 0 sized file virtio-blk: Don't free ida when disk is in use
| * module: put modules in list much earlier.Rusty Russell2013-01-121-41/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prarit's excellent bug report: > In recent Fedora releases (F17 & F18) some users have reported seeing > messages similar to > > [ 15.478160] kvm: Could not allocate 304 bytes percpu data > [ 15.478174] PERCPU: allocation failed, size=304 align=32, alloc from > reserved chunk failed > > during system boot. In some cases, users have also reported seeing this > message along with a failed load of other modules. > > What is happening is systemd is loading an instance of the kvm module for > each cpu found (see commit e9bda3b). When the module load occurs the kernel > currently allocates the modules percpu data area prior to checking to see > if the module is already loaded or is in the process of being loaded. If > the module is already loaded, or finishes load, the module loading code > releases the current instance's module's percpu data. Now we have a new state MODULE_STATE_UNFORMED, we can insert the module into the list (and thus guarantee its uniqueness) before we allocate the per-cpu region. Reported-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Tested-by: Prarit Bhargava <prarit@redhat.com>
| * module: add new state MODULE_STATE_UNFORMED.Rusty Russell2013-01-121-5/+52
| | | | | | | | | | | | | | | | | | You should never look at such a module, so it's excised from all paths which traverse the modules list. We add the state at the end, to avoid gratuitous ABI break (ksplice). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * module: prevent warning when finit_module a 0 sized fileSasha Levin2013-01-031-0/+7
| | | | | | | | | | | | | | | | | | | | | | If we try to finit_module on a file sized 0 bytes vmalloc will scream and spit out a warning. Since modules have to be bigger than 0 bytes anyways we can just check that beforehand and avoid the warning. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* | module, async: async_synchronize_full() on module init iff async is usedTejun Heo2013-01-161-2/+25
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the default iosched is built as module, the kernel may deadlock while trying to load the iosched module on device probe if the probing was running off async. This is because async_synchronize_full() at the end of module init ends up waiting for the async job which initiated the module loading. async A modprobe 1. finds a device 2. registers the block device 3. request_module(default iosched) 4. modprobe in userland 5. load and init module 6. async_synchronize_full() Async A waits for modprobe to finish in request_module() and modprobe waits for async A to finish in async_synchronize_full(). Because there's no easy to track dependency once control goes out to userland, implementing properly nested flushing is difficult. For now, make module init perform async_synchronize_full() iff module init has queued async jobs as suggested by Linus. This avoids the described deadlock because iosched module doesn't use async and thus wouldn't invoke async_synchronize_full(). This is hacky and incomplete. It will deadlock if async module loading nests; however, this works around the known problem case and seems to be the best of bad options. For more details, please refer to the following thread. http://thread.gmane.org/gmane.linux.kernel/1420814 Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Alex Riesen <raa.lkml@gmail.com> Tested-by: Ming Lei <ming.lei@canonical.com> Tested-by: Alex Riesen <raa.lkml@gmail.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge tag 'modules-next-for-linus' of ↵Linus Torvalds2012-12-191-174/+267
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module update from Rusty Russell: "Nothing all that exciting; a new module-from-fd syscall for those who want to verify the source of the module (ChromeOS) and/or use standard IMA on it or other security hooks." * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: MODSIGN: Fix kbuild output when using default extra_certificates MODSIGN: Avoid using .incbin in C source modules: don't hand 0 to vmalloc. module: Remove a extra null character at the top of module->strtab. ASN.1: Use the ASN1_LONG_TAG and ASN1_INDEFINITE_LENGTH constants ASN.1: Define indefinite length marker constant moduleparam: use __UNIQUE_ID() __UNIQUE_ID() MODSIGN: Add modules_sign make target powerpc: add finit_module syscall. ima: support new kernel module syscall add finit_module syscall to asm-generic ARM: add finit_module syscall to ARM security: introduce kernel_module_from_file hook module: add flags arg to sys_finit_module() module: add syscall to load module from fd
| * modules: don't hand 0 to vmalloc.Rusty Russell2012-12-141-15/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In commit d0a21265dfb5fa8a David Rientjes unified various archs' module_alloc implementation (including x86) and removed the graduitous shortcut for size == 0. Then, in commit de7d2b567d040e3b, Joe Perches added a warning for zero-length vmallocs, which can happen without kallsyms on modules with no init sections (eg. zlib_deflate). Fix this once and for all; the module code has to handle zero length anyway, so get it right at the caller and remove the now-gratuitous checks within the arch-specific module_alloc implementations. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=42608 Reported-by: Conrad Kostecki <ConiKost@gmx.de> Cc: David Rientjes <rientjes@google.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * module: Remove a extra null character at the top of module->strtab.Satoru Takeuchi2012-12-141-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a extra null character('\0') at the top of module->strtab for each module. Commit 59ef28b introduced this bug and this patch fixes it. Live dump log of the current linus git kernel(HEAD is 2844a4870): ============================================================================ crash> mod | grep loop ffffffffa01db0a0 loop 16689 (not loaded) [CONFIG_KALLSYMS] crash> module.core_symtab ffffffffa01db0a0 core_symtab = 0xffffffffa01db320crash> rd 0xffffffffa01db320 12 ffffffffa01db320: 0000005500000001 0000000000000000 ....U........... ffffffffa01db330: 0000000000000000 0002007400000002 ............t... ffffffffa01db340: ffffffffa01d8000 0000000000000038 ........8....... ffffffffa01db350: 001a00640000000e ffffffffa01daeb0 ....d........... ffffffffa01db360: 00000000000000a0 0002007400000019 ............t... ffffffffa01db370: ffffffffa01d8068 000000000000001b h............... crash> module.core_strtab ffffffffa01db0a0 core_strtab = 0xffffffffa01dbb30 "" crash> rd 0xffffffffa01dbb30 4 ffffffffa01dbb30: 615f70616d6b0000 66780063696d6f74 ..kmap_atomic.xf ffffffffa01dbb40: 73636e75665f7265 72665f646e696600 er_funcs.find_fr ============================================================================ We expect Just first one byte of '\0', but actually first two bytes are '\0'. Here is The relationship between symtab and strtab. symtab_idx strtab_idx symbol ----------------------------------------------- 0 0x1 "\0" # startab_idx should be 0 1 0x2 "kmap_atomic" 2 0xe "xfer_funcs" 3 0x19 "find_fr..." By applying this patch, it becomes as follows. symtab_idx strtab_idx symbol ----------------------------------------------- 0 0x0 "\0" # extra byte is removed 1 0x1 "kmap_atomic" 2 0xd "xfer_funcs" 3 0x18 "find_fr..." Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com> Cc: Masaki Kimura <masaki.kimura.kz@hitachi.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * security: introduce kernel_module_from_file hookKees Cook2012-12-141-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that kernel module origins can be reasoned about, provide a hook to the LSMs to make policy decisions about the module file. This will let Chrome OS enforce that loadable kernel modules can only come from its read-only hash-verified root filesystem. Other LSMs can, for example, read extended attributes for signatures, etc. Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Serge E. Hallyn <serge.hallyn@canonical.com> Acked-by: Eric Paris <eparis@redhat.com> Acked-by: Mimi Zohar <zohar@us.ibm.com> Acked-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * module: add flags arg to sys_finit_module()Rusty Russell2012-12-141-14/+26
| | | | | | | | | | | | | | | | | | | | Thanks to Michael Kerrisk for keeping us honest. These flags are actually useful for eliminating the only case where kmod has to mangle a module's internals: for overriding module versioning. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Lucas De Marchi <lucas.demarchi@profusion.mobi> Acked-by: Kees Cook <keescook@chromium.org>
| * module: add syscall to load module from fdKees Cook2012-12-141-148/+219
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As part of the effort to create a stronger boundary between root and kernel, Chrome OS wants to be able to enforce that kernel modules are being loaded only from our read-only crypto-hash verified (dm_verity) root filesystem. Since the init_module syscall hands the kernel a module as a memory blob, no reasoning about the origin of the blob can be made. Earlier proposals for appending signatures to kernel modules would not be useful in Chrome OS, since it would involve adding an additional set of keys to our kernel and builds for no good reason: we already trust the contents of our root filesystem. We don't need to verify those kernel modules a second time. Having to do signature checking on module loading would slow us down and be redundant. All we need to know is where a module is coming from so we can say yes/no to loading it. If a file descriptor is used as the source of a kernel module, many more things can be reasoned about. In Chrome OS's case, we could enforce that the module lives on the filesystem we expect it to live on. In the case of IMA (or other LSMs), it would be possible, for example, to examine extended attributes that may contain signatures over the contents of the module. This introduces a new syscall (on x86), similar to init_module, that has only two arguments. The first argument is used as a file descriptor to the module and the second argument is a pointer to the NULL terminated string of module arguments. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (merge fixes)
* | kernel: remove reference to feature-removal-schedule.txtTao Ma2012-12-171-3/+0
|/ | | | | | | | | | In commit 9c0ece069b32 ("Get rid of Documentation/feature-removal.txt"), Linus removed feature-removal-schedule.txt from Documentation, but there is still some reference to this file. So remove them. Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* module: fix out-by-one error in kallsymsRusty Russell2012-10-311-11/+16
| | | | | | | | | | | | | | | | | | | | Masaki found and patched a kallsyms issue: the last symbol in a module's symtab wasn't transferred. This is because we manually copy the zero'th entry (which is always empty) then copy the rest in a loop starting at 1, though from src[0]. His fix was minimal, I prefer to rewrite the loops in more standard form. There are two loops: one to get the size, and one to copy. Make these identical: always count entry 0 and any defined symbol in an allocated non-init section. This bug exists since the following commit was introduced. module: reduce symbol table for loaded modules (v2) commit: 4a4962263f07d14660849ec134ee42b63e95ea9a LKML: http://lkml.org/lkml/2012/10/24/27 Reported-by: Masaki Kimura <masaki.kimura.kz@hitachi.com> Cc: stable@kernel.org
* MODSIGN: Move the magic string to the end of a module and eliminate the searchDavid Howells2012-10-191-17/+9
| | | | | | | | | | | | | | | | | | Emit the magic string that indicates a module has a signature after the signature data instead of before it. This allows module_sig_check() to be made simpler and faster by the elimination of the search for the magic string. Instead we just need to do a single memcmp(). This works because at the end of the signature data there is the fixed-length signature information block. This block then falls immediately prior to the magic number. From the contents of the information block, it is trivial to calculate the size of the signature data and thus the size of the actual module data. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* MODSIGN: Add FIPS policyDavid Howells2012-10-101-0/+4
| | | | | | | | | If we're in FIPS mode, we should panic if we fail to verify the signature on a module or we're asked to load an unsigned module in signature enforcing mode. Possibly FIPS mode should automatically enable enforcing mode. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* module: signature checking hookRusty Russell2012-10-101-1/+92
| | | | | | | | | | | | | | | | | We do a very simple search for a particular string appended to the module (which is cache-hot and about to be SHA'd anyway). There's both a config option and a boot parameter which control whether we accept or fail with unsigned modules and modules that are signed with an unknown key. If module signing is enabled, the kernel will be tainted if a module is loaded that is unsigned or has a signature for which we don't have the key. (Useful feedback and tweaks by David Howells <dhowells@redhat.com>) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* module: wait when loading a module which is currently initializing.Rusty Russell2012-09-281-2/+26
| | | | | | | | | | | | | | | | The original module-init-tools module loader used a fnctl lock on the .ko file to avoid attempts to simultaneously load a module. Unfortunately, you can't get an exclusive fcntl lock on a read-only fd, making this not work for read-only mounted filesystems. module-init-tools has a hacky sleep-and-loop for this now. It's not that hard to wait in the kernel, and only return -EEXIST once the first module has finished loading (or continue loading the module if the first one failed to initialize for some reason). It's also consistent with what we do for dependent modules which are still loading. Suggested-by: Lucas De Marchi <lucas.demarchi@profusion.mobi> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* module: fix symbol waiting when module fails before initRusty Russell2012-09-281-4/+4
| | | | | | | | | | | | | | | | | | We use resolve_symbol_wait(), which blocks if the module containing the symbol is still loading. However: 1) The module_wq we use is only woken after calling the modules' init function, but there are other failure paths after the module is placed in the linked list where we need to do the same thing. 2) wake_up() only wakes one waiter, and our waitqueue is shared by all modules, so we need to wake them all. 3) wake_up_all() doesn't imply a memory barrier: I feel happier calling it after we've grabbed and dropped the module_mutex, not just after the state assignment. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* Make most arch asm/module.h files use asm-generic/module.hDavid Howells2012-09-281-20/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the mapping of Elf_[SPE]hdr, Elf_Addr, Elf_Sym, Elf_Dyn, Elf_Rel/Rela, ELF_R_TYPE() and ELF_R_SYM() to either the 32-bit version or the 64-bit version into asm-generic/module.h for all arches bar MIPS. Also, use the generic definition mod_arch_specific where possible. To this end, I've defined three new config bools: (*) HAVE_MOD_ARCH_SPECIFIC Arches define this if they don't want to use the empty generic mod_arch_specific struct. (*) MODULES_USE_ELF_RELA Arches define this if their modules can contain RELA records. This causes the Elf_Rela mapping to be emitted and allows apply_relocate_add() to be defined by the arch rather than have the core emit an error message. (*) MODULES_USE_ELF_REL Arches define this if their modules can contain REL records. This causes the Elf_Rel mapping to be emitted and allows apply_relocate() to be defined by the arch rather than have the core emit an error message. Note that it is possible to allow both REL and RELA records: m68k and mips are two arches that do this. With this, some arch asm/module.h files can be deleted entirely and replaced with a generic-y marker in the arch Kbuild file. Additionally, I have removed the bits from m32r and score that handle the unsupported type of relocation record as that's now handled centrally. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* module: taint kernel when lve module is loadedMatthew Garrett2012-09-281-0/+4
| | | | | | | | | | | | Cloudlinux have a product called lve that includes a kernel module. This was previously GPLed but is now under a proprietary license, but the module continues to declare MODULE_LICENSE("GPL") and makes use of some EXPORT_SYMBOL_GPL symbols. Forcibly taint it in order to avoid this. Signed-off-by: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Alex Lyashkov <umka@cloudlinux.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: stable@kernel.org
* Guard check in module loader against integer overflowDavid Howells2012-05-231-1/+2
| | | | | | | | | | | The check: if (len < hdr->e_shoff + hdr->e_shnum * sizeof(Elf_Shdr)) may not work if there's an overflow in the right-hand side of the condition. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* dynamic_debug: make dynamic-debug work for module initializationJim Cromie2012-04-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This introduces a fake module param $module.dyndbg. Its based upon Thomas Renninger's $module.ddebug boot-time debugging patch from https://lkml.org/lkml/2010/9/15/397 The 'fake' module parameter is provided for all modules, whether or not they need it. It is not explicitly added to each module, but is implemented in callbacks invoked from parse_args. For builtin modules, dynamic_debug_init() now directly calls parse_args(..., &ddebug_dyndbg_boot_params_cb), to process the params undeclared in the modules, just after the ddebug tables are processed. While its slightly weird to reprocess the boot params, parse_args() is already called repeatedly by do_initcall_levels(). More importantly, the dyndbg queries (given in ddebug_query or dyndbg params) cannot be activated until after the ddebug tables are ready, and reusing parse_args is cleaner than doing an ad-hoc parse. This reparse would break options like inc_verbosity, but they probably should be params, like verbosity=3. ddebug_dyndbg_boot_params_cb() handles both bare dyndbg (aka: ddebug_query) and module-prefixed dyndbg params, and ignores all other parameters. For example, the following will enable pr_debug()s in 4 builtin modules, in the order given: dyndbg="module params +p; module aio +p" module.dyndbg=+p pci.dyndbg For loadable modules, parse_args() in load_module() calls ddebug_dyndbg_module_params_cb(). This handles bare dyndbg params as passed from modprobe, and errors on other unknown params. Note that modprobe reads /proc/cmdline, so "modprobe foo" grabs all foo.params, strips the "foo.", and passes these to the kernel. ddebug_dyndbg_module_params_cb() is again called for the unknown params; it handles dyndbg, and errors on others. The "doing" arg added previously contains the module name. For non CONFIG_DYNAMIC_DEBUG builds, the stub function accepts and ignores $module.dyndbg params, other unknowns get -ENOENT. If no param value is given (as in pci.dyndbg example above), "+p" is assumed, which enables all pr_debug callsites in the module. The dyndbg fake parameter is not shown in /sys/module/*/parameters, thus it does not use any resources. Changes to it are made via the control file. Also change pr_info in ddebug_exec_queries to vpr_info, no need to see it all the time. Signed-off-by: Jim Cromie <jim.cromie@gmail.com> CC: Thomas Renninger <trenn@suse.de> CC: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* module: Remove module size limitSasha Levin2012-03-261-2/+1
| | | | | | | | | | | | | Module size was limited to 64MB, this was legacy limitation due to vmalloc() which was removed a while ago. Limiting module size to 64MB is both pointless and affects real world use cases. Cc: Tim Abbott <tim.abbott@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* module: move __module_get and try_module_get() out of line.Steven Rostedt2012-03-261-0/+30
| | | | | | | | | | | | | | | | | | | | | | | With the preempt, tracepoint and everything, it's getting a bit chubby. For an Ubuntu-based config: Before: $ size -t `find * -name '*.ko'` | grep TOTAL 56199906 3870760 1606616 61677282 3ad1ee2 (TOTALS) $ size vmlinux text data bss dec hex filename 8509342 850368 3358720 12718430 c2115e vmlinux After: $ size -t `find * -name '*.ko'` | grep TOTAL 56183760 3867892 1606616 61658268 3acd49c (TOTALS) $ size vmlinux text data bss dec hex filename 8501842 849088 3358720 12709650 c1ef12 vmlinux Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (made all out-of-line)
* params: <level>_initcall-like kernel parametersPawel Moll2012-03-261-1/+2
| | | | | | | | | | | | | | | This patch adds a set of macros that can be used to declare kernel parameters to be parsed _before_ initcalls at a chosen level are executed. We rename the now-unused "flags" field of struct kernel_param as the level. It's signed, for when we use this for early params as well, in future. Linker macro collating init calls had to be modified in order to add additional symbols between levels that are later used by the init code to split the calls into blocks. Signed-off-by: Pawel Moll <pawel.moll@arm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* module: add kernel param to force disable module loadDave Young2012-03-261-0/+1
| | | | | | | | | | | | | | Sometimes we need to test a kernel of same version with code or config option changes. We already have sysctl to disable module load, but add a kernel parameter will be more convenient. Since modules_disabled is int, so here use bint type in core_param. TODO: make sysctl accept bool and change modules_disabled to bool Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* error: implicit declaration of function 'module_flags_taint'Kevin Winchester2012-01-151-20/+20
| | | | | | | | | | | | | | | Recent changes to kernel/module.c caused the following compile error: kernel/module.c: In function ‘show_taint’: kernel/module.c:1024:2: error: implicit declaration of function ‘module_flags_taint’ [-Werror=implicit-function-declaration] cc1: some warnings being treated as errors Correct this error by moving the definition of module_flags_taint outside of the #ifdef CONFIG_MODULE_UNLOAD section. Signed-off-by: Kevin Winchester <kjwinchester@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* modules: sysfs - export: taint, coresize, initsizeKay Sievers2012-01-131-29/+64
| | | | | | | | | | | | | | | | | | | Recent tools do not want to use /proc to retrieve module information. A few values are currently missing from sysfs to replace the information available in /proc/modules. This adds /sys/module/*/{coresize,initsize,taint} attributes. TAINT_PROPRIETARY_MODULE (P) and TAINT_OOT_MODULE (O) flags are both always shown now, and do no longer exclude each other, also in /proc/modules. Replace the open-coded sysfs attribute initializers with the __ATTR() macro. Add the new attributes to Documentation/ABI. Cc: Lucas De Marchi <lucas.demarchi@profusion.mobi> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* module: replace DEBUGP with pr_debugJim Cromie2012-01-131-26/+20
| | | | | | | | | | | Use more flexible pr_debug. This allows: echo "module module +p" > /dbg/dynamic_debug/control to turn on debug messages when needed. Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* module: struct module_ref should contains long fieldsEric Dumazet2012-01-131-4/+4
| | | | | | | | | | | | | | | | | | | module_ref contains two "unsigned int" fields. Thats now too small, since some machines can open more than 2^32 files. Check commit 518de9b39e8 (fs: allow for more than 2^31 files) for reference. We can add an aligned(2 * sizeof(unsigned long)) attribute to force alloc_percpu() allocating module_ref areas in single cache lines. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Rusty Russell <rusty@rustcorp.com.au> CC: Tejun Heo <tj@kernel.org> CC: Robin Holt <holt@sgi.com> CC: David Miller <davem@davemloft.net> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* module: Fix performance regression on modules with large symbol tablesKevin Cernekee2012-01-131-44/+21
| | | | | | | | | | | | | | | | | | | | | Looking at /proc/kallsyms, one starts to ponder whether all of the extra strtab-related complexity in module.c is worth the memory savings. Instead of making the add_kallsyms() loop even more complex, I tried the other route of deleting the strmap logic and naively copying each string into core_strtab with no consideration for consolidating duplicates. Performance on an "already exists" insmod of nvidia.ko (runs add_kallsyms() but does not actually initialize the module): Original scheme: 1.230s With naive copying: 0.058s Extra space used: 35k (of a 408k module). Signed-off-by: Kevin Cernekee <cernekee@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> LKML-Reference: <73defb5e4bca04a6431392cc341112b1@localhost>
* module: Add comments describing how the "strmap" logic worksKevin Cernekee2012-01-131-0/+9
| | | | | Signed-off-by: Kevin Cernekee <cernekee@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* Merge branch 'modsplit-Oct31_2011' of ↵Linus Torvalds2011-11-061-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits) Revert "tracing: Include module.h in define_trace.h" irq: don't put module.h into irq.h for tracking irqgen modules. bluetooth: macroize two small inlines to avoid module.h ip_vs.h: fix implicit use of module_get/module_put from module.h nf_conntrack.h: fix up fallout from implicit moduleparam.h presence include: replace linux/module.h with "struct module" wherever possible include: convert various register fcns to macros to avoid include chaining crypto.h: remove unused crypto_tfm_alg_modname() inline uwb.h: fix implicit use of asm/page.h for PAGE_SIZE pm_runtime.h: explicitly requires notifier.h linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h miscdevice.h: fix up implicit use of lists and types stop_machine.h: fix implicit use of smp.h for smp_processor_id of: fix implicit use of errno.h in include/linux/of.h of_platform.h: delete needless include <linux/module.h> acpi: remove module.h include from platform/aclinux.h miscdevice.h: delete unnecessary inclusion of module.h device_cgroup.h: delete needless include <linux/module.h> net: sch_generic remove redundant use of <linux/module.h> net: inet_timewait_sock doesnt need <linux/module.h> ... Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in - drivers/media/dvb/frontends/dibx000_common.c - drivers/media/video/{mt9m111.c,ov6650.c} - drivers/mfd/ab3550-core.c - include/linux/dmaengine.h
| * kernel: Map most files to use export.h instead of module.hPaul Gortmaker2011-10-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The changed files were only including linux/module.h for the EXPORT_SYMBOL infrastructure, and nothing else. Revector them onto the isolated export header for faster compile times. Nothing to see here but a whole lot of instances of: -#include <linux/module.h> +#include <linux/export.h> This commit is only changing the kernel dir; next targets will probably be mm, fs, the arch dirs, etc. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
* | module,bug: Add TAINT_OOT_MODULE flag for modules not built in-treeBen Hutchings2011-11-071-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | Use of the GPL or a compatible licence doesn't necessarily make the code any good. We already consider staging modules to be suspect, and this should also be true for out-of-tree modules which may receive very little review. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Reviewed-by: Dave Jones <davej@redhat.com> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (patched oops-tracing.txt)
* | module: Enable dynamic debugging regardless of taintBen Hutchings2011-11-071-4/+2
|/ | | | | | | | | | | | | | | | | | | | Dynamic debugging is currently disabled for tainted modules, except for TAINT_CRAP. This prevents use of dynamic debugging for out-of-tree modules once the next patch is applied. This condition was apparently intended to avoid a crash if a force- loaded module has an incompatible definition of dynamic debug structures. However, a administrator that forces us to load a module is claiming that it *is* compatible even though it fails our version checks. If they are mistaken, there are any number of ways the module could crash the system. As a side-effect, proprietary and other tainted modules can now use dynamic_debug. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* Tracepoint: Dissociate from module mutexMathieu Desnoyers2011-08-101-47/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Copy the information needed from struct module into a local module list held within tracepoint.c from within the module coming/going notifier. This vastly simplifies locking of tracepoint registration / unregistration, because we don't have to take the module mutex to register and unregister tracepoints anymore. Steven Rostedt ran into dependency problems related to modules mutex vs kprobes mutex vs ftrace mutex vs tracepoint mutex that seems to be hard to fix without removing this dependency between tracepoint and module mutex. (note: it should be investigated whether kprobes could benefit of being dissociated from the modules mutex too.) This also fixes module handling of tracepoint list iterators, because it was expecting the list to be sorted by pointer address. Given we have control on our own list now, it's OK to sort this list which has tracepoints as its only purpose. The reason why this sorting is required is to handle the fact that seq files (and any read() operation from user-space) cannot hold the tracepoint mutex across multiple calls, so list entries may vanish between calls. With sorting, the tracepoint iterator becomes usable even if the list don't contain the exact item pointed to by the iterator anymore. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Jason Baron <jbaron@redhat.com> CC: Ingo Molnar <mingo@elte.hu> CC: Lai Jiangshan <laijs@cn.fujitsu.com> CC: Peter Zijlstra <a.p.zijlstra@chello.nl> CC: Thomas Gleixner <tglx@linutronix.de> CC: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Link: http://lkml.kernel.org/r/20110810191839.GC8525@Krystal Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* module: add /sys/module/<name>/uevent filesKay Sievers2011-07-241-0/+17
| | | | | | | | | | | | | | | | | Userspace wants to manage module parameters with udev rules. This currently only works for loaded modules, but not for built-in ones. To allow access to the built-in modules we need to re-trigger all module load events that happened before any userspace was running. We already do the same thing for all devices, subsystems(buses) and drivers. This adds the currently missing /sys/module/<name>/uevent files to all module entries. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (split & trivial fix)
* module: change attr callbacks to take struct module_kobjectKay Sievers2011-07-241-7/+7
| | | | | | | | This simplifies the next patch, where we have an attribute on a builtin module (ie. module == NULL). Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (split into 2)
* modules: add default loader hook implementationsJonas Bonn2011-07-241-0/+49
| | | | | | | | | | The module loader code allows architectures to hook into the code by providing a small number of entry points that each arch must implement. This patch provides __weakly linked generic implementations of these entry points for architectures that don't need to do anything special. Signed-off-by: Jonas Bonn <jonas@southpole.se> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* Merge branch 'staging-next' of ↵Linus Torvalds2011-05-231-2/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6 * 'staging-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6: (970 commits) staging: usbip: replace usbip_u{dbg,err,info} and printk with dev_ and pr_ staging:iio: Trivial kconfig reorganization and uniformity improvements. staging:iio:documenation partial update. staging:iio: use pollfunc allocation helpers in remaining drivers. staging:iio:max1363 misc cleanups and use of for_each_bit_set to simplify event code spitting out. staging:iio: implement an iio_info structure to take some of the constant elements out of iio_dev. staging:iio:meter:ade7758: Use private data space from iio_allocate_device staging:iio:accel:lis3l02dq make write_reg_8 take value not a pointer to value. staging:iio: ring core cleanups + check if read_last available in lis3l02dq staging:iio:core cleanup: squash tiny wrappers and use dev_set_name to handle creation of event interface name. staging:iio: poll func allocation clean up. staging:iio:ad7780 trivial unused header cleanup. staging:iio:adc: AD7780: Use private data space from iio_allocate_device + trivial fixes staging:iio:adc:AD7780: Convert to new channel registration method staging:iio:adc: AD7606: Drop dev_data in favour of iio_priv() staging:iio:adc: AD7606: Consitently use indio_dev staging:iio: Rip out helper for software rings. staging:iio:adc:AD7298: Use private data space from iio_allocate_device staging:iio: rationalization of different buffer implementation hooks. staging:iio:imu:adis16400 avoid allocating rx, tx, and state separately from iio_dev. ... Fix up trivial conflicts in - drivers/staging/intel_sst/intelmid.c: patches applied in both branches - drivers/staging/rt2860/common/cmm_data_{pci,usb}.c: removed vs spelling - drivers/staging/usbip/vhci_sysfs.c: trivial header file inclusion
OpenPOWER on IntegriCloud