path: root/sys/kern
Commit message | Author | Age | Files | Lines
* Add a timestamp to the msgbuf output in order to determine when | eadler | 2012-02-16 | 1 | -8/+45
  messages were printed. This can be enabled with the kern.msgbuf_show_timestamp sysctl.
  PR: kern/161553
  Reviewed by: avg
  Submitted by: Arnaud Lacombe <lacombar@gmail.com>
  Approved by: cperciva
  MFC after: 1 month
* The PTRACESTOP() macro is used only once. Inline the only use and remove | kib | 2012-02-11 | 1 | -1/+5
  the macro.
  MFC after: 1 week
* Remove unneeded newline. It fits in 80 columns now. | ed | 2012-02-10 | 1 | -2/+1
  Pointed out by: jh
* Merge si_name and __si_namebuf. | ed | 2012-02-10 | 1 | -7/+7
  The si_name pointer always points to the __si_namebuf member inside the same object. Remove it and rename __si_namebuf to si_name.
* Add a missing break. This bug was introduced in r228856. | kevlo | 2012-02-10 | 1 | -0/+1
* Mark the automatically attached child with PL_FLAG_CHILD in struct | kib | 2012-02-10 | 2 | -0/+4
  lwpinfo flags, for PT_FOLLOW_FORK auto-attachment.
  In collaboration with: Dmitry Mikulin <dmitrym juniper net>
  MFC after: 1 week
* Add support for mounting devfs inside jails. | mm | 2012-02-09 | 1 | -2/+55
  A new jail(8) option "devfs_ruleset" defines the ruleset enforcement for mounting devfs inside jails. A value of -1 disables mounting devfs in jails, a value of zero means no restrictions. Nested jails can only have mounting devfs disabled or inherit parent's enforcement as jails are not allowed to view or manipulate devfs(8) rules.
  Utilizes new functions introduced in r231265.
  Reviewed by: jamie
  MFC after: 1 month
* Unbreak detection of the async mode for clustered writes after r231075. | kib | 2012-02-08 | 1 | -1/+1
  Submitted by: bde
  MFC after: 12 days
* Allow kern.ipc.shmmax to be set from /boot/loader.conf. | pjd | 2012-02-08 | 1 | -7/+7
  MFC after: 1 week
* Fix whitespace inconsistencies in TTY code. | ed | 2012-02-06 | 3 | -4/+3
* Rename cache_lookup_times() to cache_lookup() and retire the old API and | jhb | 2012-02-06 | 1 | -12/+2
  ABI stub for cache_lookup().
* Current implementations of sync(2) and syncer vnode fsync() VOP use | kib | 2012-02-06 | 2 | -21/+6
  mnt_noasync counter to temporarily remove MNTK_ASYNC mount option, which is needed to guarantee a synchronous completion of the initiated i/o before syscall or VOP return.

  Global removal of MNTK_ASYNC option is harmful because not only i/o started from corresponding thread becomes synchronous, but all i/o is synchronous on the filesystem which is initiated during sync(2) or syncer activity.

  Instead of removing MNTK_ASYNC from mnt_kern_flag, provide a local thread flag to disable async i/o for current thread only. Use the opportunity to move DOINGASYNC() macro into sys/vnode.h and consistently use it in the places which tested for MNTK_ASYNC.

  Some testing demonstrated 60-70% improvements in run time for the metadata-intensive operations on async-mounted UFS volumes, but still with great deviation due to other reasons.
  Reviewed by: mckusick
  Tested by: scottl
  MFC after: 2 weeks
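The per-thread approach described in this commit can be sketched in user-space C roughly as follows. This is only an illustration under stated assumptions: `MNTK_ASYNC_SKETCH`, `TDP_SYNCIO_SKETCH`, `struct mount_sketch`, `td_pflags_sketch`, `doing_async()` and `sync_like_operation()` are hypothetical stand-ins, not the kernel's actual definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's mount and thread flags. */
#define MNTK_ASYNC_SKETCH 0x1u	/* filesystem is mounted async */
#define TDP_SYNCIO_SKETCH 0x1u	/* this thread wants synchronous i/o */

struct mount_sketch {
	unsigned kern_flag;
};

/* Per-thread private flags; a thread-local models curthread's field. */
static _Thread_local unsigned td_pflags_sketch;

/* Async i/o only if the mount allows it AND this thread did not opt out. */
static bool
doing_async(const struct mount_sketch *mp)
{
	return ((mp->kern_flag & MNTK_ASYNC_SKETCH) != 0 &&
	    (td_pflags_sketch & TDP_SYNCIO_SKETCH) == 0);
}

/*
 * Models a sync(2)-like path: set the thread flag, do the work, then
 * restore the previous state. The mount flags are never modified, so
 * other threads keep seeing an async mount. Returns what doing_async()
 * reported while the flag was set.
 */
static bool
sync_like_operation(const struct mount_sketch *mp)
{
	unsigned saved = td_pflags_sketch & TDP_SYNCIO_SKETCH;
	bool async_during;

	td_pflags_sketch |= TDP_SYNCIO_SKETCH;
	async_during = doing_async(mp);
	td_pflags_sketch = (td_pflags_sketch & ~TDP_SYNCIO_SKETCH) | saved;
	return (async_during);
}
```

The point of the design is that the opt-out lives in thread state, so a long sync on one thread does not turn every other writer on the same filesystem synchronous.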
* - Use uint8_t for the variable x and spell the size of the variable | kevlo | 2012-02-06 | 1 | -42/+42
    as sizeof(x)
  - Capitalized comment
  - Parentheses around return value
  Requested by: bde
* Analogous to r230407 a separate path buffer in vfs_mount.c is required | mm | 2012-02-05 | 1 | -2/+6
  for r230129. Fixes an out-of-bounds write to fspath.
  MFC after: 10 days
* Add 32-bit compat code for AIO kevent flags introduced in revision 230857. | davidxu | 2012-02-05 | 1 | -0/+1
* Whenever a new kernel thread is spawned, explicitly clear any CPU affinity | rstone | 2012-02-04 | 1 | -0/+7
  set on the new thread. This prevents the thread from inadvertently inheriting affinity from a random sibling.
  Submitted by: attilio
  Tested by: pho
  MFC after: 1 week
* Fix input validation in SO_SETFIB. | hrs | 2012-02-04 | 1 | -1/+1
  Reviewed by: bz
  MFC after: 1 day
* Add kqueue support to /dev/klog. | kib | 2012-02-01 | 1 | -0/+48
  Submitted by: Mateusz Guzik <mjguzik gmail com>
  PR: kern/156423
  MFC after: 1 week
* If multiple threads call kevent() to get AIO events on the same kqueue fd, | davidxu | 2012-02-01 | 1 | -1/+7
  it is possible that a single AIO event will be reported to multiple threads. This is not thread-friendly, and the existing API cannot control this behavior.
  Allocate a kevent flags field, sigev_notify_kevent_flags, for AIO event notification in sigevent, and allow the user to pass EV_CLEAR, EV_DISPATCH or EV_ONESHOT to the AIO kernel code, so that the user can control whether the event should be cleared once it is retrieved by a thread.
  This change should be compatible with existing applications, because the field should have already been zero-filled, and no additional action will be taken by the kernel.
  PR: kern/156567
* A debugger which requested PT_FOLLOW_FORK should get the notification | kib | 2012-01-30 | 1 | -1/+2
  about the new child not only when doing PT_TO_SCX, but also for PT_CONTINUE. If the TDB_FORK flag is set, always issue a stop, the same as is done for TDB_EXEC.
  Reported by: Dmitry Mikulin <dmitrym juniper net>
  MFC after: 1 week
* Refine the implementation of POSIX_FADV_NOREUSE for the read(2) case such | jhb | 2012-01-30 | 1 | -7/+7
  that instead of using direct I/O it allows read-ahead similar to POSIX_FADV_NORMAL, but invokes VOP_ADVISE(POSIX_FADV_DONTNEED) after the read(2) has completed to purge just-read data. The write(2) path continues to use direct I/O for POSIX_FADV_NOREUSE for now. Note that NOREUSE works optimally if an application reads and writes full fs blocks.
* When detaching an AIO or LIO request, grab the lock and tell knlist_remove | ambrisko | 2012-01-30 | 1 | -6/+12
  that we have the lock now. This cleans up a locking panic ASSERT when knlist_empty is called without a lock when INVARIANTS etc. are turned on.
  Reviewed by: kib jhb
  MFC after: 1 week
* Finally, try to enable the nxstacks on amd64 and powerpc64 for both 64bit | kib | 2012-01-30 | 1 | -1/+6
  and 32bit ABIs. Also try to enable nxstacks for PAE/i386 when supported, and some variants of powerpc32.
  MFC after: 2 months (if ever)
* Avoid checking the same cache line/variable from all the locking | attilio | 2012-01-28 | 1 | -2/+1
  primitives by breaking stop_scheduler into a per-thread variable. Also, store the new td_stopsched very close to td_*locks members as they will be accessed mostly in the same codepaths as td_stopsched and this results in avoiding a further cache-line pollution, possibly.
  STOP_SCHEDULER() was pondered to use a new 'thread' argument, in order to take advantage of already cached curthread, but in the end there should not really be a performance benefit, while introducing a KPI breakage.
  In collaboration with: flo
  Reviewed by: avg
  MFC after: 3 months (or never)
  X-MFC: r228424
* Fix the size check that prevents the value from going negative after casting | glebius | 2012-01-27 | 1 | -1/+1
  to a signed type.
  Reviewed by: bde
* Xen netback driver rewrite. | ken | 2012-01-26 | 2 | -3/+23
  share/man/man4/Makefile, share/man/man4/xnb.4, sys/dev/xen/netback/netback.c, sys/dev/xen/netback/netback_unit_tests.c:
    Rewrote the netback driver for xen to attach properly via newbus and work properly in both HVM and PVM mode (only HVM is tested). Works with the in-tree FreeBSD netfront driver or the Windows netfront driver from SuSE. Has not been extensively tested with a Linux netfront driver. Does not implement LRO, TSO, or polling. Includes unit tests that may be run through sysctl after compiling with XNB_DEBUG defined.
  sys/dev/xen/blkback/blkback.c, sys/xen/interface/io/netif.h:
    Comment elaboration.
  sys/kern/uipc_mbuf.c:
    Fix page fault in kernel mode when calling m_print() on a null mbuf. Since m_print() is only used for debugging, there are no performance concerns for extra error checking code.
  sys/kern/subr_scanf.c:
    Add the "hh" and "ll" width specifiers from C99 to scanf(). A few callers were already using "ll" even though scanf() was handling it as "l".
  Submitted by: Alan Somers <alans@spectralogic.com>
  Submitted by: John Suykerbuyk <johns@spectralogic.com>
  Sponsored by: Spectra Logic
  MFC after: 1 week
  Reviewed by: ken
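The "hh" and "ll" width specifiers mentioned for subr_scanf.c behave as in the standard C99 library. A minimal user-space sketch, using the libc sscanf() rather than the kernel routine, with `parse_widths()` as a hypothetical helper name:

```c
#include <stdio.h>

/*
 * Parse a small and a large integer from one string.
 * "%hhd" stores into a signed char, "%lld" into a long long; if "ll"
 * were treated as plain "l", the second store would be too narrow on
 * platforms where long is 32 bits wide.
 * Returns the number of fields converted, as sscanf() does.
 */
static int
parse_widths(const char *input, signed char *small, long long *big)
{
	return (sscanf(input, "%hhd %lld", small, big));
}
```

Calling `parse_widths("-5 9000000000", &s, &b)` converts both fields; the second value does not fit in 32 bits, which is the case the "ll" handling matters for.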
* Although aio_nbytes is size_t, it is later cast to signed | glebius | 2012-01-26 | 1 | -0/+6
  types: to ssize_t in filesystem code and to int in buf code, so supplying a negative argument leads to a kernel panic later. To fix that, check the user-supplied argument at the beginning of the syscall.
  Submitted by: Maxim Dounin <mdounin mdounin.ru>, maxim@
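The kind of early check this commit describes might look like the sketch below. It assumes the narrowest later cast is to int, as the message states; `check_nbytes()` is a hypothetical helper and the choice of EINVAL as the error is an assumption, not taken from the commit.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/*
 * Reject byte counts that would turn negative once cast to int.
 * A caller passing (size_t)-1 has effectively supplied a "negative"
 * length; catching it here avoids a failure deep in the i/o path.
 * Returns 0 if the count is safe, EINVAL otherwise.
 */
static int
check_nbytes(size_t nbytes)
{
	if (nbytes > (size_t)INT_MAX)
		return (EINVAL);
	return (0);
}
```

Validating at the syscall boundary keeps the unsigned-to-signed conversions further down well defined for every value that gets through.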
* When doing vflush(WRITECLOSE), clean vnode pages. | kib | 2012-01-25 | 1 | -0/+12
  Unmounts do vfs_msync() before calling VFS_UNMOUNT(), but there is still a race allowing a process to dirty pages after msync finished. Remounts rw->ro just left dirty pages in system.
  Reviewed by: alc, tegge (long time ago)
  Tested by: pho
  MFC after: 2 weeks
* Fix remaining calls to cache_enter() in both NFS clients to provide | kib | 2012-01-25 | 1 | -9/+5
  appropriate timestamps. Restore the assertions which verify that NCF_TS is set when timestamp is asked for.
  Reviewed by: jhb (previous version)
  MFC after: 2 weeks
* Fix CTL flags in the declarations of KERN_PROC_ENV, AUXV and | trociny | 2012-01-25 | 1 | -8/+6
  PS_STRINGS sysctls: they are read only.
  MFC after: 1 week
* Apparently, both nfs clients do not use cache_enter_time() | kib | 2012-01-23 | 1 | -29/+23
  consistently, creating some namecache entries without NCF_TS flag. This causes panic due to failed assertion.
  As a temporary relief, remove the assert. Return epoch timestamp for the entries without timestamp if asked.
  While there, consolidate the code which returns timestamps, into a helper cache_out_ts().
  Discussed with: jhb
  MFC after: 2 weeks
* Convert panic()s to KASSERT()s. This is an optimisation for | glebius | 2012-01-23 | 1 | -7/+3
  hashdestroy() since in the absence of INVARIANTS a compiler will drop the entire for() loop.
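Why the loop can disappear is easier to see in a small user-space sketch. Everything here is hypothetical: `INVARIANTS_SKETCH`, `KASSERT_SKETCH()` and `table_destroy()` only imitate the pattern and are not the kernel's definitions.

```c
#include <stdio.h>
#include <stdlib.h>

#ifdef INVARIANTS_SKETCH
/* Debug build: a failed check aborts, standing in for panic(). */
#define KASSERT_SKETCH(cond, msg) do {					\
	if (!(cond)) {							\
		fprintf(stderr, "panic: %s\n", (msg));			\
		abort();						\
	}								\
} while (0)
#else
/* Production build: the check expands to nothing. */
#define KASSERT_SKETCH(cond, msg) do { } while (0)
#endif

/*
 * Free a table whose buckets must all be empty (zero).
 * Without INVARIANTS_SKETCH the loop body is empty, so an optimizing
 * compiler is free to remove the whole loop.
 * Returns 0 on success.
 */
static int
table_destroy(int *buckets, size_t nbuckets)
{
	size_t i;

	for (i = 0; i < nbuckets; i++)
		KASSERT_SKETCH(buckets[i] == 0, "table not empty");
	free(buckets);
	return (0);
}
```

With an unconditional panic() inside the loop, the comparison has to run on every bucket in every build; moving it into an assertion macro confines that cost to debug kernels.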
* Change kern.proc.rlimit sysctl to: | trociny | 2012-01-22 | 2 | -22/+49
  - retrieve only one, specified limit for a process, not the whole array, as it was previously (the sysctl has been added recently and has not been backported to stable yet, so this change is ok);
  - allow to set a resource limit for another process.
  Submitted by: Andrey Zonov <andrey at zonov.org>
  Discussed with: kib
  Reviewed by: kib
  MFC after: 2 weeks
* TDF_* flags should be used with td_flags field and TDP_* flags should be used | pjd | 2012-01-22 | 1 | -1/+2
  with td_pflags field. Correct two places where it was not the case.
  Discussed with: kib
  MFC after: 1 week
* Remove the nc_time and nc_ticks elements from struct namecache, and | kib | 2012-01-22 | 1 | -56/+131
  provide struct namecache_ts which is the old struct namecache. Only allocate struct namecache_ts if non-null struct timespec *tsp was passed to cache_enter_time, otherwise use struct namecache.
  Change struct namecache allocation and deallocation macros into static functions, since logic becomes somewhat twisty. Provide accessor for the nc_name member of struct namecache to hide difference between struct namecache and namecache_ts.
  The aim of the change is to not waste 20 bytes per small namecache entry.
  Reviewed by: jhb
  MFC after: 2 weeks
  X-MFC-note: after r230394
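The two-structure layout with a name accessor can be illustrated with a simplified user-space sketch. The types and functions below (`struct nc_hdr`, `struct nc_plain`, `struct nc_ts`, `NCF_TS_SKETCH`, `nc_alloc()`, `nc_get_name()`) are hypothetical and much smaller than the real namecache structures; only the idea of choosing the allocation by whether a timestamp was supplied comes from the commit message.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define NCF_TS_SKETCH 0x1u	/* entry carries a timestamp */

/* Fields shared by both variants; always the first member. */
struct nc_hdr {
	unsigned flags;
	size_t nlen;
};

/* Small variant: no timestamp, name stored right after the header. */
struct nc_plain {
	struct nc_hdr h;
	char name[];
};

/* Large variant: timestamp sits between the header and the name. */
struct nc_ts {
	struct nc_hdr h;
	struct timespec time;
	char name[];
};

/* Allocate the large variant only when a timestamp is supplied. */
static struct nc_hdr *
nc_alloc(const char *name, const struct timespec *tsp)
{
	size_t len = strlen(name);

	if (tsp != NULL) {
		struct nc_ts *n = malloc(sizeof(*n) + len + 1);

		if (n == NULL)
			return (NULL);
		n->h.flags = NCF_TS_SKETCH;
		n->h.nlen = len;
		n->time = *tsp;
		memcpy(n->name, name, len + 1);
		return (&n->h);
	} else {
		struct nc_plain *n = malloc(sizeof(*n) + len + 1);

		if (n == NULL)
			return (NULL);
		n->h.flags = 0;
		n->h.nlen = len;
		memcpy(n->name, name, len + 1);
		return (&n->h);
	}
}

/* Accessor hiding where the name lives in each variant. */
static const char *
nc_get_name(const struct nc_hdr *h)
{
	if ((h->flags & NCF_TS_SKETCH) != 0)
		return (((const struct nc_ts *)h)->name);
	return (((const struct nc_plain *)h)->name);
}
```

Because the header is the first member of both variants, the pointer returned by `nc_alloc()` can be handed to `free()` directly, and callers never need to know which variant they hold.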
* Use separate buffer for global path to avoid overflow of path buffer. | mm | 2012-01-21 | 1 | -3/+11
  Reviewed by: jamie@
  MFC after: 3 weeks
* Close a race in NFS lookup processing that could result in stale name cache | jhb | 2012-01-20 | 1 | -8/+58
  entries on one client when a directory was renamed on another client. The root cause for the stale entry being trusted is that each per-vnode nfsnode structure has a single 'n_ctime' timestamp used to validate positive name cache entries. However, if there are multiple entries for a single vnode, they all share a single timestamp.

  To fix this, extend the name cache to allow filesystems to optionally store a timestamp value in each name cache entry. The NFS clients now fetch the timestamp associated with each name cache entry and use that to validate cache hits instead of the timestamps previously stored in the nfsnode.

  Another part of the fix is that the NFS clients now use timestamps from the post-op attributes of RPCs when adding name cache entries rather than pulling the timestamps out of the file's attribute cache. The latter is subject to races with other lookups updating the attribute cache concurrently.

  Some more details:
  - Add a variant of nfsm_postop_attr() to the old NFS client that can return a vattr structure with a copy of the post-op attributes.
  - Handle lookups of "." as a special case in the NFS clients since the name cache does not store name cache entries for ".", so we cannot get a useful timestamp. It didn't really make much sense to recheck the attributes on the directory to validate the namecache hit for "." anyway.
  - ABI compat shims for the name cache routines are present in this commit so that it is safe to MFC.
  MFC after: 2 weeks
* Use shared lock for the executable vnode in the exec path after the | kib | 2012-01-19 | 1 | -5/+8
  VV_TEXT changes are handled. Assert that vnode is exclusively locked at the places that modify VV_TEXT.
  Discussed with: alc
  MFC after: 3 weeks
* Explain why it is safe to unlock the vnode. | alc | 2012-01-17 | 1 | -0/+3
  Requested by: kib
* Make sure all intermediate variables holding mount flags (mnt_flag) | mckusick | 2012-01-17 | 2 | -24/+42
  and all internal kernel calls passing mount flags are declared as uint64_t so that flags in the top 32 bits are not lost.
  MFC after: 2 weeks
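The truncation this commit guards against is easy to reproduce in plain C. In this sketch `MNT_HIGH_SKETCH` is a made-up flag in the upper 32 bits, and the two helper functions are hypothetical; they only show what happens when a 64-bit flag word passes through a 32-bit intermediate.

```c
#include <stdint.h>

/* Hypothetical mount flag that lives above bit 31. */
#define MNT_HIGH_SKETCH (UINT64_C(1) << 40)

/* Buggy pattern: a 32-bit intermediate silently drops the high flags. */
static uint64_t
pass_through_narrow(uint64_t flags)
{
	uint32_t tmp = (uint32_t)flags;

	return (tmp);
}

/* Fixed pattern: the intermediate is as wide as the flag word. */
static uint64_t
pass_through_wide(uint64_t flags)
{
	uint64_t tmp = flags;

	return (tmp);
}
```

Passing `MNT_HIGH_SKETCH | 1` through the narrow version yields just `1`, while the wide version returns the value unchanged, which is why every intermediate and every function parameter on the path has to be 64 bits wide.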
* Improve abstraction. Eliminate direct access by elf*_load_section() | alc | 2012-01-17 | 1 | -26/+25
  to an OBJT_VNODE-specific field of the vm object. The same information can be just as easily obtained from the struct vattr that is in struct image_params if the latter is passed to elf*_load_section(). Moreover, by replacing the vmspace and vm object parameters to elf*_load_section() with a struct image_params parameter, we actually reduce the size of the object code.
  In collaboration with: kib
* Be pedantic and change // comment to C-style one. | pluknet | 2012-01-16 | 1 | -1/+1
  Noticed by: Bruce Evans
* Fix a style bug | kevlo | 2012-01-16 | 1 | -1/+1
  Spotted by: avg
* Eliminate branch and insert an explicit reader memory barrier to ensure | davidxu | 2012-01-16 | 1 | -3/+2
  that waiter bit is set before reading semaphore count.
* Abrogate nchr argument in proc_getargv() and proc_getenvv(): we always want | trociny | 2012-01-15 | 1 | -13/+12
  to read strings completely to know the actual size.
  As a side effect it fixes the issue with kern.proc.args and kern.proc.env sysctls, which didn't return the size of available data when calling sysctl(3) with the NULL argument for oldp.
  Note, in get_ps_strings(), which does actual work for proc_getargv() and proc_getenvv(), we still have a safety limit on the size of data read in case of a corrupted process stack.
  Suggested by: kib
  MFC after: 3 days
* Fix omissions in r230129: | mm | 2012-01-15 | 2 | -1/+2
  kern_jail.c: initialize fullpath_disabled to zero
  vfs_cache.c: add missing dot in comment
  Reported by: kib
  MFC after: 1 month
* Convert files to UTF-8 | uqs | 2012-01-15 | 1 | -1/+1
* Introduce vn_path_to_global_path() | mm | 2012-01-15 | 3 | -24/+129
  This function updates path string to vnode's full global path and checks the size of the new path string against the pathlen argument.
  In vfs_domount(), sys_unmount() and kern_jail_set() this new function is used to update the supplied path argument to the respective global path.
  Unbreaks jailed zfs(8) with enforce_statfs set to 1.
  Reviewed by: kib
  MFC after: 1 month
* - Fix undefined behavior when device_get_name is null | eadler | 2012-01-15 | 1 | -2/+8
  - Make error message more informative
  PR: kern/149800
  Submitted by: olgeni
  Approved by: cperciva
  MFC after: 1 week
* Fix kernel module loading for MIPS64 kernel: | gonzo | 2012-01-14 | 1 | -0/+4
  On amd64, link_elf_obj.c must specify KERNBASE rather than VM_MIN_KERNEL_ADDRESS to vm_map_find() because kernel loadable modules must be mapped for execution in the same upper region of the kernel map as the kernel code and data segments.
  For MIPS32, KERNBASE lies below the KVA area (it's less than VM_MIN_KERNEL_ADDRESS), so vm_map_find() effectively gets the whole KVA to look through. On MIPS64 that is not the case because KERNBASE is set to the very end of XKSEG, well out of KVA bounds, so vm_map_find() always fails. We should use VM_MIN_KERNEL_ADDRESS as a base for vm_map_find().
  Details obtained from: alc@