path: root/sys/kern/kern_resource.c
Commit message (Collapse) | Author | Age | Files | Lines
* Implement process-shared locks support for libthr.so.3, without | kib | 2016-02-28 | 1 | -0/+7
  breaking the ABI. Special value is stored in the lock pointer to indicate
  shared lock, and offline page in the shared memory is allocated to store
  the actual lock.
  Reviewed by: vangyzen (previous version)
  Discussed with: deischen, emaste, jhb, rwatson, Martin Simmons <martin@lispworks.com>
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
* Fold lim_shared into lim_copy to mute a -Wunused compiler warning from | ngie | 2015-12-22 | 1 | -10/+1
  clang when the kernel is compiled without INVARIANTS
  Differential Revision: https://reviews.freebsd.org/D4683
  Reviewed by: kib, jhb
  MFC after: 1 week
  Sponsored by: EMC / Isilon Storage Division
* Speed up rctl operation with large rulesets, by holding the lock | trasz | 2015-11-15 | 1 | -1/+6
  during iteration instead of relocking it for each traversed rule.
  Reviewed by: mjg@
  MFC after: 1 month
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D4110
* Get rid of lim_update_thread and cred_update_thread. | mjg | 2015-07-16 | 1 | -14/+0
  Their primary use was in thread_cow_update to free up old resources.
  Freeing had to be done with proc lock held and _cow_ funcs already knew
  how to free old structs.
* rlimit: deduplicate code in chg* functions | mjg | 2015-06-25 | 1 | -46/+27
* Implement lockless resource limits. | mjg | 2015-06-10 | 1 | -32/+60
  Use the same scheme implemented to manage credentials.
  Code needing to look at process's credentials (as opposed to thread's) is
  provided with *_proc variants of relevant functions.
  Places which possibly had to take the proc lock anyway still use the proc
  pointer to access limits.
* Implement support for a binary requesting a specific stack size for the | kib | 2015-04-15 | 1 | -1/+6
  initial thread. It is read by the ELF image activator as the virtual size
  of the PT_GNU_STACK program header entry, and can be specified by the
  linker option -z stack-size in newer binutils.
  The soft RLIMIT_STACK is auto-increased if possible, to satisfy the
  binary's request.
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
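The soft-limit adjustment described in this commit can be sketched in user-space C. This is an illustration only, not the kernel code: the name stack_soft_limit is hypothetical, and clamping a request that exceeds the hard limit down to that limit is an assumption about what "if possible" means here.

```c
#include <assert.h>

/*
 * Hypothetical helper: compute the new soft stack limit from the current
 * soft limit (cur), the hard limit (max) and the size the binary requests.
 * Assumption: a request above the hard limit is clamped to the hard limit.
 */
static unsigned long
stack_soft_limit(unsigned long cur, unsigned long max, unsigned long requested)
{
	assert(cur <= max);		/* a soft limit never exceeds the hard limit */
	if (requested > max)
		requested = max;	/* cannot go past the hard limit */
	if (requested > cur)
		return (requested);	/* raise the soft limit to fit the request */
	return (cur);			/* current limit is already large enough */
}
```

The helper never lowers the soft limit, so a binary that requests less than the current limit leaves it unchanged.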
* The process spin lock currently has the following distinct uses: | kib | 2014-11-26 | 1 | -10/+10
  - Threads lifetime cycle, in particular, counting of the threads in the
    process, and interlocking with process mutex and thread lock. The main
    reason of this is that turnstile locks are after thread locks, so you
    e.g. cannot unlock blockable mutex (think process mutex) while owning
    thread lock.
  - Virtual and profiling itimers, since the timers activation is done from
    the clock interrupt context. Replace the p_slock by p_itimmtx and
    PROC_ITIMLOCK().
  - Profiling code (profil(2)), for similar reason. Replace the p_slock by
    p_profmtx and PROC_PROFLOCK().
  - Resource usage accounting. Need for the spinlock there is subtle, my
    understanding is that spinlock blocks context switching for the current
    thread, which prevents td_runtime and similar fields from changing
    (updates are done at the mi_switch()). Replace the p_slock by p_statmtx
    and PROC_STATLOCK().
  The split is done mostly for code clarity, and should not affect
  scalability.
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
* ifdef RACCT ui_racct_foreach and struct uidinfo's ui_racct | mjg | 2014-11-23 | 1 | -0/+2
  Change racct_ create and destroy to macros evaluating to nothing without
  RACCT so that their callers passing ui_racct don't have to be ifdefed.
* Tidy up functions related to uidinfo management. | mjg | 2014-10-27 | 1 | -45/+47
  - reference found uidinfo in uilookup
  - reduce nesting by handling shorter cases first
* De-k&r-ify function definitions in kern/kern_resource.c | mjg | 2014-10-27 | 1 | -57/+20
  No functional changes.
* rlimit: plug duplicate assertion | mjg | 2014-10-25 | 1 | -1/+0
  counter sanity is already checked by refcount_release.
* rlimit: avoid unnecessary copying of rlimits | mjg | 2013-12-13 | 1 | -6/+16
  If refcount is 1 just modify rlimits in place.
  MFC after: 2 weeks
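The "modify in place when the refcount is 1" idea can be sketched as a small copy-on-write helper. This is a single-threaded user-space illustration under stated assumptions, not the kernel implementation: struct limits, NLIMITS and limits_make_writable are hypothetical names, and the real code would need atomic reference counting.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NLIMITS 4	/* hypothetical; stands in for the real number of limits */

/* Hypothetical stand-in for a shared, reference-counted limits structure. */
struct limits {
	unsigned refcnt;
	long	 soft[NLIMITS];
};

/*
 * Return a limits structure the caller may modify. A sole owner gets the
 * same structure back; otherwise a private copy is made and the caller's
 * reference on the shared structure is dropped. Returns NULL if the copy
 * cannot be allocated.
 */
static struct limits *
limits_make_writable(struct limits *lim)
{
	struct limits *copy;

	assert(lim->refcnt >= 1);
	if (lim->refcnt == 1)
		return (lim);		/* sole owner: no copy needed */

	copy = malloc(sizeof(*copy));
	if (copy == NULL)
		return (NULL);
	memcpy(copy, lim, sizeof(*copy));
	copy->refcnt = 1;
	lim->refcnt--;			/* drop our reference on the shared one */
	return (copy);
}
```

The saving comes from the first branch: a process that does not share its limits with anyone skips both the allocation and the copy.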
* rlimit: add and utilize lim_shared | mjg | 2013-12-13 | 1 | -1/+11
  MFC after: 2 weeks
* Add a resource limit for the total number of kqueues available to the | kib | 2013-10-21 | 1 | -0/+18
  user. Kqueue now saves the ucred of the allocating thread, to correctly
  decrement the counter on close.
  Under some specific and not real-world use scenario for kqueue, it is
  possible for the kqueues to consume memory proportional to the square of
  the number of the filedescriptors available to the process. Limit allows
  administrator to prevent the abuse.
  This is kernel-mode side of the change, with the user-mode enabling
  commit following.
  Reported and tested by: pho
  Discussed with: jmg
  Sponsored by: The FreeBSD Foundation
  MFC after: 2 weeks
* Call sched_prio() to immediately change the priority of the thread in | ian | 2013-03-07 | 1 | -4/+4
  response to an rtprio_thread() call, when the priority is different than
  the old priority, and either the old or the new priority class is not
  RTP_PRIO_NORMAL (timeshare).
  The reasoning for the second half of the test is that if it's a change in
  timeshare priority, then the scheduler is going to adjust that priority
  in a way that completely wipes out the requested change anyway, so what's
  the point? (If that's not true, then allowing a thread to change its own
  timeshare priority would subvert the scheduler's adjustments and let a
  cpu-bound thread monopolize the cpu; if allowed at all, that should
  require privileges.)
  On the other hand, if either the old or new priority class is not
  timeshare, then the scheduler doesn't make automatic adjustments, so we
  should honor the request and make the priority change right away. The
  reason the old class gets caught up in this is the very reason for this
  change: when thread A changes the priority of its child thread B from
  idle back to timeshare, thread B never actually gets moved to a
  timeshare-range run queue unless there are some idle cycles available to
  allow it to first get scheduled again as an idle thread.
  Reviewed by: jhb@
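The two-part test described in this commit message can be written as a small predicate. This is a hedged sketch rather than the committed code: the enum values, PRIO_LEVEL_MAX and the function name are hypothetical stand-ins (the real class constants live in <sys/rtprio.h>), and the range assertion is an added assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical class constants for illustration only. */
enum prio_class { CLASS_REALTIME, CLASS_TIMESHARE, CLASS_IDLE };

#define PRIO_LEVEL_MAX 31	/* assumed upper bound on a priority level */

/*
 * Decide whether a requested priority change should be applied right away:
 * the priority must actually change, and at least one of the old and new
 * classes must be something other than timeshare.
 */
static bool
should_apply_immediately(enum prio_class old_class, int old_prio,
    enum prio_class new_class, int new_prio)
{
	assert(old_prio >= 0 && old_prio <= PRIO_LEVEL_MAX);
	assert(new_prio >= 0 && new_prio <= PRIO_LEVEL_MAX);

	if (old_prio == new_prio)
		return (false);		/* nothing to change */
	return (old_class != CLASS_TIMESHARE || new_class != CLASS_TIMESHARE);
}
```

The idle-to-timeshare case from the message is covered by the check on the old class: the new class is timeshare, but the old one is not, so the change is applied immediately.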
* MFcalloutng (r244251 with minor changes): | davide | 2013-03-04 | 1 | -3/+6
  Specify that precision of 0.5s is enough for resource limitation.
  Sponsored by: Google Summer of Code 2012, iXsystems inc.
  Tested by: flo, marius, ian, markj, Fabian Keil
* Change kern.proc.rlimit sysctl to: | trociny | 2012-01-22 | 1 | -6/+9
  - retrieve only one, specified limit for a process, not the whole array,
    as it was previously (the sysctl has been added recently and has not
    been backported to stable yet, so this change is ok);
  - allow to set a resource limit for another process.
  Submitted by: Andrey Zonov <andrey at zonov.org>
  Discussed with: kib
  Reviewed by: kib
  MFC after: 2 weeks
* Fix a logic bug in change 228207 in the check for a thread's new user | jhb | 2012-01-05 | 1 | -1/+1
  priority being a realtime priority.
  MFC after: 3 days
* - Add a sysctl to allow non-root users the ability to set idle | eadler | 2011-12-13 | 1 | -25/+33
    priorities.
  - While here fix up some style nits.
  Discussed with: cperciva (briefly)
  Reviewed by: pjd (earlier version)
  Reviewed by: bde
  Approved by: jhb
  MFC after: 1 month
* When changing the user priority of a thread, change the real priority | jhb | 2011-12-02 | 1 | -2/+3
  in addition to the user priority for threads whose current real priority
  is equal to the previous user priority or if the new priority is a
  real-time priority. This allows priority changes of other threads to
  have an immediate effect.
  MFC after: 2 weeks
* In lim_fork() assert that processes locks are held. | trociny | 2011-11-07 | 1 | -0/+4
  Suggested by: kib
* In order to maximize the re-usability of kernel code in user space this | kmacy | 2011-09-16 | 1 | -8/+8
  patch modifies makesyscalls.sh to prefix all of the non-compatibility
  calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel
  entry points and all places in the code that use them. It also fixes an
  additional name space collision between the kernel function psignal and
  the libc function of the same name by renaming the kernel psignal to
  kern_psignal(). By introducing this change now we will ease future MFCs
  that change syscalls.
  Reviewed by: rwatson
  Approved by: re (bz)
* - Export each thread's individual resource usage in struct kinfo_proc's | jhb | 2011-07-18 | 1 | -6/+34
    ki_rusage member when KERN_PROC_INC_THREAD is passed to one of the
    process sysctls.
  - Correctly account for the current thread's cputime in the thread when
    doing the runtime fixup in calcru().
  - Use TIDs as the key to lookup the previous thread to compute IO stat
    deltas in IO mode in top when thread display is enabled.
  Reviewed by: kib
  Approved by: re (kib)
* Fix several places to ignore processes that are not yet fully constructed. | jhb | 2011-04-06 | 1 | -3/+6
  MFC after: 1 week
* Add racct. It's an API to keep per-process, per-jail | trasz | 2011-03-29 | 1 | -0/+20
  and per-loginclass resource accounting information, to be used by the new
  resource limits code. It's connected to the build, but the code that
  actually calls the new functions will come later.
  Sponsored by: The FreeBSD Foundation
  Reviewed by: kib (earlier version)
* Fix some locking nits with the p_state field of struct proc: | jhb | 2011-03-24 | 1 | -4/+2
  - Hold the proc lock while changing the state from PRS_NEW to PRS_NORMAL
    in fork to honor the locking requirements. While here, expand the scope
    of the PROC_LOCK() on the new process (p2) to avoid some LORs.
    Previously the code was locking the new child process (p2) after it had
    locked the parent process (p1). However, when locking two processes,
    the safe order is to lock the child first, then the parent.
  - Fix various places that were checking p_state against PRS_NEW without
    having the process locked to use PROC_LOCK(). Every place was already
    locking the process, just after the PRS_NEW check.
  - Remove or reduce the use of PROC_SLOCK() for places that were checking
    p_state against PRS_NEW. The PROC_LOCK() alone is sufficient for
    reading the current state.
  - Reorder fill_kinfo_proc() slightly so it only acquires PROC_SLOCK()
    once.
  MFC after: 1 week
* - Follow r216313, the sched_unlend_user_prio is no longer needed, always | davidxu | 2010-12-29 | 1 | -0/+2
    use sched_lend_user_prio to set lent priority.
  - Improve pthread priority-inherit mutex, when a contender's priority is
    lowered, repropagate priorities, this may cause mutex owner's priority
    to be lowered, in old code, mutex owner's priority is rise-only.
* Add back a bounds check on valid idle priorities that was lost in an | jhb | 2010-12-17 | 1 | -8/+6
  earlier commit. While here, move the thread lock down in rtp_to_pri().
  It is not needed for all of the priority value checks and the
  computation of newpri.
  Reported by: swell.k @ gmail
  MFC after: 3 days
* We've already set p = td->td_proc, so use it. | emaste | 2010-10-18 | 1 | -4/+4
* Create a global thread hash table to speed up thread lookup, use | davidxu | 2010-10-09 | 1 | -23/+13
  rwlock to protect the table. In old code, thread lookup is done with
  process lock held, to find a thread, kernel has to iterate through
  process and thread list, this is quite inefficient. With this change,
  test shows in extreme case performance is dramatically improved.
  Earlier patch was reviewed by: jhb, julian
* Revert r210225 - turns out I was wrong; the "/*-" is not license-only | trasz | 2010-07-18 | 1 | -1/+1
  thing; it's also used to indicate that the comment should not be
  automatically rewrapped.
  Explained by: cperciva@
* The "/*-" comment marker is supposed to denote copyrights. Remove non-copyright | trasz | 2010-07-18 | 1 | -1/+1
  occurrences from sys/sys/ and sys/kern/.
* Remove outdated comment and move part of it into more applicable place. | trasz | 2010-07-18 | 1 | -5/+0
* Use ISO C99 integer types in sys/kern where possible. | ed | 2010-06-21 | 1 | -1/+1
  There are only about 100 occurrences of the BSD-specific u_int*_t
  datatypes in sys/kern. The ISO C99 integer types are used here more
  often.
* Fix the double counting of the last process thread td_incruntime | kib | 2010-05-24 | 1 | -3/+3
  on exit, that is done once in thread_exit() and the second time in
  proc_reap(), by clearing td_incruntime.
  Use the opportunity to revert to the pre-RUSAGE_THREAD exporting of
  ruxagg() instead of ruxagg_locked() and use it from thread_exit().
  Diagnosed and tested by: neel
  MFC after: 3 days
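Clearing the per-thread increment once it has been folded into the process total is what makes a second aggregation harmless. A minimal sketch of that pattern follows; the struct and function names are hypothetical stand-ins, not the kernel's types, and no locking is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the per-thread and per-process counters. */
struct thread_acct { uint64_t incruntime; };
struct proc_acct   { uint64_t runtime; };

/*
 * Fold the thread's not-yet-accounted runtime into the process total and
 * clear it, so that calling this again adds nothing.
 */
static void
acct_aggregate(struct proc_acct *p, struct thread_acct *td)
{
	assert(p != NULL && td != NULL);
	p->runtime += td->incruntime;
	td->incruntime = 0;	/* consumed: a repeated call now adds zero */
}
```

With the reset in place, two code paths can both aggregate the same thread on exit without counting its runtime twice.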
* Implement RUSAGE_THREAD. Add td_rux to keep extended runtime and ticks | kib | 2010-05-04 | 1 | -11/+22
  information for thread to allow calcru1() (re)use.
  Rename ruxagg()->ruxagg_locked(), ruxagg_tlock()->ruxagg() [1].
  The ruxagg_locked() function no longer clears thread ticks nor
  td_incruntime.
  Requested by: attilio [1]
  Discussed with: attilio, bde
  Reviewed by: bde
  Based on submission by: Alexander Krizhanovsky <ak natsys-lab com>
  MFC after: 1 week
  X-MFC-Note: td_rux shall be moved to the end of struct thread
* Extract thread_lock()/ruxagg()/thread_unlock() fragment into utility | kib | 2010-05-01 | 1 | -13/+14
  function ruxagg_tlock().
  Convert the definition of kern_getrusage() to ANSI C.
  Submitted by: Alexander Krizhanovsky <ak natsys-lab com>
  MFC after: 1 week
* sched_getparam was just plain broke for time-share | rrs | 2010-03-03 | 1 | -2/+8
  processes. It did not return an error but instead just let garbage be
  passed back. This I fix so it actually properly translates the priority
  the process is at to a posix's high means more priority. I also fix it so
  that if the ULE scheduler has bumped it up to a realtime process you get
  back a sane value i.e. the highest priority (63 for time-share).
  sched_setscheduler() had the setting of the timeshare class priority
  disabled. With some notes about rejecting the posix high numbers is
  greater priority and use nice instead. This fix also adjusts that to
  work, with the caveat that a t-s process may well get bumped up or down
  i.e. the setscheduler() will NOT change the nice value only the current
  priority. I think this is reasonable considering if the user wants to
  play with nice then he can. At least all the posix'ish interfaces now
  respond sanely.
  MFC after: 3 weeks
* Implement global and per-uid accounting of the anonymous memory. Add | kib | 2009-06-23 | 1 | -0/+6
  rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved
  for the uid.
  The accounting information (charge) is associated with either map entry,
  or vm object backing the entry, assuming the object is the first one in
  the shadow chain and entry does not require COW. Charge is moved from
  entry to object on allocation of the object, e.g. during the mmap,
  assuming the object is allocated, or on the first page fault on the
  entry. It moves back to the entry on forks due to COW setup.
  The per-entry granularity of accounting makes the charge process fair for
  processes that change uid during lifetime, and decrements charge for
  proper uid when region is unmapped.
  The interface of vm_pager_allocate(9) is extended by adding struct
  ucred *, that is used to charge appropriate uid when allocation is
  performed by kernel, e.g. md(4).
  Several syscalls, among them is fork(2), may now return ENOMEM when
  global or per-uid limits are enforced.
  In collaboration with: pho
  Reviewed by: alc
  Approved by: re (kensmith)
* Don't rearm callout if the process is exiting, it may leak a callout | davidxu | 2008-10-24 | 1 | -1/+2
  because callout_drain() only waits for running callout, but not disable
  it if it is rearmed.
* Retire the MALLOC and FREE macros. They are an abomination unto style(9). | des | 2008-10-23 | 1 | -1/+1
  MFC after: 3 months
* Fix a small typo in a comment in calcru1(). | ed | 2008-09-05 | 1 | -1/+1
  The word "happene" should read "happened".
  Submitted by: Jille Timmermans <jille quis cx>
* Integrate the new MPSAFE TTY layer to the FreeBSD operating system. | ed | 2008-08-20 | 1 | -0/+25
  The last half year I've been working on a replacement TTY layer for the
  FreeBSD kernel. The new TTY layer was designed to improve the following:
  - Improved driver model: The old TTY layer has a driver model that is not
    abstract enough to make it friendly to use. A good example is the
    output path, where the device drivers directly access the output
    buffers. This means that an in-kernel PPP implementation must always
    convert network buffers into TTY buffers. If a PPP implementation would
    be built on top of the new TTY layer (still needs a hooks layer,
    though), it would allow the PPP implementation to directly hand the
    data to the TTY driver.
  - Improved hotplugging: With the old TTY layer, it isn't entirely safe to
    destroy TTY's from the system. This implementation has a two-step
    destructing design, where the driver first abandons the TTY. After all
    threads have left the TTY, the TTY layer calls a routine in the driver,
    which can be used to free resources (unit numbers, etc). The pts(4)
    driver also implements this feature, which means posix_openpt() will
    now return PTY's that are created on the fly.
  - Improved performance: One of the major improvements is the per-TTY
    mutex, which is expected to improve scalability when compared to the
    old Giant locking. Another change is the unbuffered copying to
    userspace, which is both used on TTY device nodes and PTY masters.
  Upgrading should be quite straightforward. Unlike previous versions,
  existing kernel configuration files do not need to be changed, except
  when they reference device drivers that are listed in UPDATING.
  Obtained from: //depot/projects/mpsafetty/...
  Approved by: philip (ex-mentor)
  Discussed: on the lists, at BSDCan, at the DevSummit
  Sponsored by: Snow B.V., the Netherlands
  dcons(4) fixed by: kan
* Remove extra uihold() call that accidentally sneaked in during perforce | pjd | 2008-03-19 | 1 | -1/+0
  change @125544.
* - Relax requirements for p_numthreads, p_threads, p_swtick, and p_nice from | jeff | 2008-03-19 | 1 | -16/+4
    requiring the per-process spinlock to only requiring the process lock.
  - Reflect these changes in the proc.h documentation and consumers
    throughout the kernel. This is a substantial reduction in locking cost
    for these fields and was made possible by recent changes to threading
    support.
* Whitespace cleanups. | pjd | 2008-03-16 | 1 | -7/+7
* - Use wait-free method to manage ui_sbsize and ui_proccnt fields in the | pjd | 2008-03-16 | 1 | -58/+48
    uidinfo structure. This entirely removes contention observed on the
    ui_mtxp mutex (as it is now gone).
  - Convert the uihashtbl_mtx mutex to a rwlock, as most of the time we
    just need to read-lock it.
  Reviewed by: jhb, jeff, kris & others
  Tested by: kris
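One common way to manage a limited counter without a mutex is an atomic add followed by a rollback when the limit is exceeded. The sketch below uses C11 atomics and is only an illustration of that general technique under stated assumptions; counter_charge is a hypothetical name, and the commit message does not say that the kernel code looks like this.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Add diff to *counter without taking a lock. When diff is positive and
 * max is non-zero, the increment is rolled back and false is returned if
 * it would push the counter above max. Assumption: max == 0 means "no
 * limit".
 */
static bool
counter_charge(atomic_long *counter, long diff, long max)
{
	long old;

	if (diff > 0 && max != 0) {
		old = atomic_fetch_add(counter, diff);
		if (old + diff > max) {
			atomic_fetch_sub(counter, diff);	/* undo our own add */
			return (false);
		}
		return (true);
	}
	old = atomic_fetch_add(counter, diff);
	assert(old + diff >= 0);	/* releasing more than was charged is a bug */
	return (true);
}
```

Each call completes in a bounded number of atomic operations regardless of what other threads do. The trade-off is that a concurrent reader can briefly observe a value above max between the add and the rollback.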
* Style fixes. | pjd | 2008-03-16 | 1 | -11/+7
* Fix information leak. We can find PIDs of running processes from within | pjd | 2008-03-16 | 1 | -1/+2
  a jail, etc. by simply calling setpriority(PRIO_PROCESS, <PID>, 0) and
  checking the return value: 0 means that the process exists and -1 that
  it doesn't exist.
  Reviewed by: rwatson
  MFC after: 1 week