path: root/sys/kern/kern_proc.c
Commit message [Author, Age, Files, Lines]
* Commit work from the libprocstat project. These patches add support for
  retrieving runtime file and process information from the running kernel
  via sysctl, in the form of a new library, libprocstat. The library also
  supports a KVM backend for analyzing memory crash dumps. Both the
  procstat(1) and fstat(1) utilities have been modified to take advantage
  of the library (as a bonus, fstat(1) no longer needs superuser
  privileges to operate), and procstat(1) is now able to display
  information from memory dumps as well. The newly introduced fuser(1)
  utility also uses this library and is able to operate via the sysctl and
  KVM backends.

  The library is by no means complete (e.g. the KVM backend is missing
  vnode name resolution routines, and there are no manpages for the
  library itself), so I plan to improve it further. I'm committing it so
  it will get wider exposure and review.

  We won't be able to MFC this work as it relies on changes in HEAD,
  introduced some time ago, that break the kernel ABI. OTOH we may be able
  to merge the library with the KVM backend if we really need it there.

  Discussed with: rwatson
  [stas, 2011-05-12, 1 file, -4/+8]
* Fix some locking nits with the p_state field of struct proc:
  - Hold the proc lock while changing the state from PRS_NEW to PRS_NORMAL
    in fork to honor the locking requirements. While here, expand the
    scope of the PROC_LOCK() on the new process (p2) to avoid some LORs.
    Previously the code was locking the new child process (p2) after it
    had locked the parent process (p1). However, when locking two
    processes, the safe order is to lock the child first, then the parent.
  - Fix various places that were checking p_state against PRS_NEW without
    having the process locked to use PROC_LOCK(). Every place was already
    locking the process, just after the PRS_NEW check.
  - Remove or reduce the use of PROC_SLOCK() for places that were checking
    p_state against PRS_NEW. The PROC_LOCK() alone is sufficient for
    reading the current state.
  - Reorder fill_kinfo_proc() slightly so it only acquires PROC_SLOCK()
    once.

  MFC after: 1 week
  [jhb, 2011-03-24, 1 file, -10/+6]
* Export login class information via kinfo and make it possible to view
  it using "ps -o class".
  [trasz, 2011-03-05, 1 file, -0/+4]
* Add initial support for Capsicum's Capability Mode to the FreeBSD kernel,
  compiled conditionally on options CAPABILITIES:

  Add a new credential flag, CRED_FLAG_CAPMODE, which indicates that a
  subject (typically a process) is in capability mode.

  Add two new system calls, cap_enter(2) and cap_getmode(2), which allow
  setting and querying (but never clearing) the flag.

  Export the capability mode flag via process information sysctls.

  Sponsored by: Google, Inc.
  Reviewed by: anderson
  Discussed with: benl, kris, pjd
  Obtained from: Capsicum Project
  MFC after: 3 months
  [rwatson, 2011-03-01, 1 file, -1/+3]
* Allow the debugger to specify that children of the traced process should
  be automatically traced. Extend the ptrace(PL_LWPINFO) to report that
  the child just forked.

  Reviewed by: davidxu, jhb
  MFC after: 2 weeks
  [kib, 2011-01-25, 1 file, -0/+1]
* Fix some more style(9) issues.
  [brucec, 2010-11-14, 1 file, -1/+1]
* Fix style(9) issues from r215281 and r215282.

  MFC after: 1 week
  [brucec, 2010-11-14, 1 file, -1/+2]
* Add some descriptions to sys/kern sysctls.

  PR: kern/148710
  Tested by: Chip Camden <sterling at camdensoftware.com>
  MFC after: 1 week
  [brucec, 2010-11-14, 1 file, -1/+1]
* Fix style.

  Submitted by: bde
  [trasz, 2010-11-11, 1 file, -2/+2]
* Remove unneeded conditional.

  Discussed with: kib
  [trasz, 2010-11-11, 1 file, -12/+10]
* Make a thread's address available via the kern proc sysctl, just like the
  process address. Add "tdaddr" keyword to ps(1) to display this thread
  address.

  Distilled from Sandvine's patch set by Mark Johnston.
  [emaste, 2010-10-08, 1 file, -0/+1]
* Add an extra comment to the SDT probes definition. This allows us to
  use '-' in probe names, matching the probe names in Solaris.[1]

  Add userland SDT probes definitions to sys/sdt.h.

  Sponsored by: The FreeBSD Foundation
  Discussed with: rwaston [1]
  [rpaulo, 2010-08-22, 1 file, -6/+6]
* There isn't really a need to hold the ktrace mutex just to read the value
  of p_traceflag that is stored in the kinfo_proc structure. It is still
  racy even with the lock, and the code will read a consistent snapshot of
  the flag without the lock.
  [jhb, 2010-08-19, 1 file, -6/+0]
* Add the support for reporting the NOCOREDUMP flag from
  sysctl_kern_proc_vmmap().

  Sponsored by: Sandvine Incorporated
  Reviewed by: kib, emaste
  MFC after: 1 week
  [attilio, 2010-05-27, 1 file, -0/+4]
* Fix a mistake in r207603. td_rux.rux_runtime still needs conversion.

  Reported and tested by: nwhitehorn
  Pointy hat to: kib
  MFC after: 6 days
  [kib, 2010-05-05, 1 file, -1/+1]
* Use td_rux.rux_runtime for ki_runtime instead of redoing calculation.

  Submitted by: bde
  MFC after: 1 week
  [kib, 2010-05-04, 1 file, -1/+1]
* Remove caddr_t casts.

  Requested by: bde
  MFC after: 10 days
  [kib, 2010-04-29, 1 file, -5/+3]
* Move the constants specifying the size of struct kinfo_proc into
  machine-specific header files. Add KINFO_PROC32_SIZE for struct
  kinfo_proc32 for architectures providing COMPAT_FREEBSD32. Add CTASSERT
  for the size of struct kinfo_proc32.

  Submitted by: pluknet
  Reviewed by: imp, jhb, nwhitehorn
  MFC after: 2 weeks
  [kib, 2010-04-24, 1 file, -0/+3]
* Fix typo.

  Submitted by: emaste
  Pointy hat to: kib (who needs much bigger wardrobe)
  MFC after: 1 week
  [kib, 2010-04-21, 1 file, -1/+1]
* Provide compat32 shims for kinfo_proc sysctl. This allows 32bit ps(1) to
  mostly work on 64bit host.

  The work is based on an original patch submitted by emaste, obtained
  from Sandvine's source tree.

  Reviewed by: jhb
  MFC after: 1 week
  [kib, 2010-04-21, 1 file, -4/+130]
* For kinfo_proc in kp->ki_siglist, return the set of signals pending in
  the process queue when gathering information for the process, and the
  set of signals pending for the thread when gathering information for
  the thread. Previously, the sysctl returned a union of the process
  pending set and some arbitrary thread's pending set for the process,
  and a union of the process and the thread pending sets for the thread.

  MFC after: 1 week
  [kib, 2010-02-27, 1 file, -4/+6]
* Include terminated threads in ps's process cpu time field.

  MFC after: 2 weeks
  [jilles, 2010-02-27, 1 file, -2/+0]
* Remove an unused global.

  MFC after: 3 days
  [bz, 2009-12-25, 1 file, -1/+0]
* Let access overriding to TTYs depend on the cdev_priv, not the vnode.

  Basically this commit changes two things, which improve access to TTYs
  in exceptional conditions. Basically the problem was that when you ran
  jexec(8) to attach to a jail, you couldn't use /dev/tty (well, also the
  node of the actual TTY, e.g. /dev/pts/X). This is very inconvenient if
  you want to attach to screens quickly, use ssh(1), etc.

  The fixes:

  - Cache the cdev_priv of the controlling TTY in struct session. Change
    devfs_access() to compare against the cdev_priv instead of the vnode.
    This allows you to bypass UNIX permissions, even across different
    mounts of devfs.
  - Extend devfs_prison_check() to unconditionally expose the device node
    of the controlling TTY, even if normal prison nesting rules normally
    don't allow this. This actually allows you to interact with this
    device node.

  To be honest, I'm not really happy with this solution. We now have to
  store three pointers to a controlling TTY (s_ttyp, s_ttyvp, s_ttydp).
  In an ideal world, we should just get rid of the latter two and only
  use s_ttyp, but this makes certain pieces of code very impractical
  (e.g. devfs, kern_exit.c).

  Reported by: Many people
  [ed, 2009-12-19, 1 file, -0/+1]
* In fill_kinfo_thread, copy the thread's name into struct kinfo_proc even
  if it is empty. Otherwise the previous thread's name would remain in the
  struct and then be reported for this thread.

  Submitted by: Ryan Stone
  MFC after: 1 week
  [emaste, 2009-10-01, 1 file, -2/+1]
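The stale-name problem described in the fill_kinfo_thread commit above can be illustrated with a small user-space sketch. This is not the kernel code: `struct thread_record`, `NAME_LEN`, and both functions are hypothetical names invented for the illustration, and `snprintf()` stands in for whatever copy routine the kernel uses.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAME_LEN 20 /* hypothetical buffer size for this sketch */

/* Hypothetical stand-in for a record that is reused for each thread. */
struct thread_record {
    char name[NAME_LEN];
};

/* Buggy variant: the copy is skipped when the thread has no name, so the
 * name left behind by the previous thread survives in the record. */
void fill_name_buggy(struct thread_record *rec, const char *td_name)
{
    if (td_name[0] != '\0')
        snprintf(rec->name, sizeof(rec->name), "%s", td_name);
}

/* Fixed variant: always copy, so an empty name overwrites the old one. */
void fill_name_fixed(struct thread_record *rec, const char *td_name)
{
    snprintf(rec->name, sizeof(rec->name), "%s", td_name);
}
```

Calling the buggy variant with "worker" and then with an empty name leaves "worker" in the record; the fixed variant leaves an empty string.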
* Reintroduce the r196640, after fixing the problem with my testing.

  Remove the altkstacks, instead instantiate threads with kernel stack
  allocated with the right size from the start. For the thread that has
  kernel stack cached, verify that requested stack size is equal to the
  actual, and reallocate the stack if sizes differ [1].

  This fixes the bug introduced by r173361 that was committed several days
  after r173004 and consisted of kthread_add(9) ignoring the non-default
  kernel stack size.

  Also, r173361 removed the caching of the kernel stacks for a non-first
  thread in the process. Introduce separate kernel stack cache that keeps
  some limited amount of preallocated kernel stacks to lower the latency
  of thread allocation. Add vm_lowmem handler to prune the cache on low
  memory condition. This way, system with reasonable amount of the threads
  get lower latency of thread creation, while still not exhausting
  significant portion of KVA for unused kstacks.

  Submitted by: peter [1]
  Discussed with: jhb, julian, peter
  Reviewed by: jhb
  Tested by: pho (and retested according to new test scenarios)
  MFC after: 1 week
  [kib, 2009-09-01, 1 file, -10/+0]
* Reverse r196640 and r196644 for now.
  [kib, 2009-08-29, 1 file, -0/+10]
* Remove the altkstacks, instead instantiate threads with kernel stack
  allocated with the right size from the start. For the thread that has
  kernel stack cached, verify that requested stack size is equal to the
  actual, and reallocate the stack if sizes differ [1].

  This fixes the bug introduced by r173361 that was committed several days
  after r173004 and consisted of kthread_add(9) ignoring the non-default
  kernel stack size.

  Also, r173361 removed the caching of the kernel stacks for a non-first
  thread in the process. Introduce separate kernel stack cache that keeps
  some limited amount of preallocated kernel stacks to lower the latency
  of thread allocation. Add vm_lowmem handler to prune the cache on low
  memory condition. This way, system with reasonable amount of the threads
  get lower latency of thread creation, while still not exhausting
  significant portion of KVA for unused kstacks.

  Submitted by: peter [1]
  Discussed with: jhb, julian, peter
  Reviewed by: jhb
  Tested by: pho
  MFC after: 1 week
  [kib, 2009-08-29, 1 file, -10/+0]
* Introduce a new sysctl process mib, kern.proc.groups which adds the
  ability to retrieve the group list of each process.

  Modify procstat's -s option to query this mib when the kinfo_proc
  reports that the field has been truncated. If the mib does not exist,
  fall back to the truncated list.

  Reviewed by: rwatson
  Approved by: re (kib)
  MFC after: 2 weeks
  [brooks, 2009-07-24, 1 file, -0/+40]
* Revert the changes to struct kinfo_proc in r194498. Instead, fill
  in up to 16 (KI_NGROUPS) values and steal a bit from ki_cr_flags
  (all bits currently unused) to indicate overflow with the new flag
  KI_CRF_GRP_OVERFLOW.

  This fixes procstat -s.

  Approved by: re (kib)
  [brooks, 2009-07-24, 1 file, -3/+9]
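The truncate-and-flag scheme from the commit above can be sketched in portable C. This is an illustration only: `struct kinfo_groups` and `export_groups()` are hypothetical names, and the bit value chosen for `KI_CRF_GRP_OVERFLOW` is an assumption made for the sketch. Only the limit of 16 (`KI_NGROUPS`) and the flag name come from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KI_NGROUPS 16                    /* limit named in the commit message */
#define KI_CRF_GRP_OVERFLOW 0x80000000u  /* assumed bit value for this sketch */

/* Hypothetical subset of the exported per-process record. */
struct kinfo_groups {
    uint32_t cr_flags;
    size_t   ngroups;
    uint32_t groups[KI_NGROUPS];
};

/* Copy at most KI_NGROUPS group IDs and record whether any were dropped. */
void export_groups(struct kinfo_groups *ki, const uint32_t *groups, size_t n)
{
    size_t count = n;

    ki->cr_flags &= ~KI_CRF_GRP_OVERFLOW;
    if (count > KI_NGROUPS) {
        count = KI_NGROUPS;
        ki->cr_flags |= KI_CRF_GRP_OVERFLOW; /* tell the consumer the list is truncated */
    }
    for (size_t i = 0; i < count; i++)
        ki->groups[i] = groups[i];
    ki->ngroups = count;
}
```

A consumer that sees the overflow bit knows the fixed-size list is incomplete and can fall back to a separate query for the full list, which is the role the kern.proc.groups commit gives to procstat -s.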
* Add a new type of VM object: OBJT_SG. An OBJT_SG object is very similar to
  a device pager (OBJT_DEVICE) object in that it uses fictitious pages to
  provide aliases to other memory addresses. The primary difference is
  that it uses an sglist(9) to determine the physical addresses for a
  given offset into the object instead of invoking the d_mmap() method in
  a device driver.

  Reviewed by: alc
  Approved by: re (kensmith)
  MFC after: 2 weeks
  [jhb, 2009-07-24, 1 file, -0/+6]
* Rework the credential code to support larger values of NGROUPS and
  NGROUPS_MAX, eliminate ABI dependencies on them, and raise them to 1024
  and 1023 respectively. (Previously they were equal, but under a close
  reading of POSIX, NGROUPS_MAX was defined to be too large by 1 since it
  is the number of supplemental groups, not total number of groups.)

  The bulk of the change consists of converting the struct ucred member
  cr_groups from a static array to a pointer. Do the equivalent in
  kinfo_proc.

  Introduce new interfaces crcopysafe() and crsetgroups() for duplicating
  a process credential before modifying it and for setting group lists
  respectively. Both interfaces take care of the details of allocating
  the groups array. crsetgroups() takes care of truncating the group list
  to the current maximum (NGROUPS) if necessary. In the future,
  crsetgroups() may be responsible for ensuring invariants such as sorting
  the supplemental groups to allow groupmember() to be implemented as a
  binary search.

  Because we can not change struct xucred without breaking application
  ABIs, we leave it alone and introduce a new XU_NGROUPS value which is
  always 16 and is to be used or NGRPS as appropriate for things such as
  NFS which need to use no more than 16 groups. When feasible, truncate
  the group list rather than generating an error.

  Minor changes:
  - Reduce the number of hand rolled versions of groupmember().
  - Do not assign to both cr_gid and cr_groups[0].
  - Modify ipfw to cache ucreds instead of part of their contents since
    they are immutable once referenced by more than one entity.

  Submitted by: Isilon Systems (initial implementation)
  X-MFC after: never
  PR: bin/113398 kern/133867
  [brooks, 2009-06-19, 1 file, -4/+2]
* Add a flags field to struct ucred, and export that via kinfo_proc,
  consuming one of its spare fields. The cr_flags field is currently
  unused, but will be used for features, including capability mode and
  pay-as-you-go audit.

  Discussed with: jhb, sson
  [rwatson, 2009-06-01, 1 file, -0/+1]
* Add hierarchical jails. A jail may further virtualize its environment
  by creating a child jail, which is visible to that jail and to any
  parent jails. Child jails may be restricted more than their parents,
  but never less. Jail names reflect this hierarchy, being MIB-style
  dot-separated strings.

  Every thread now points to a jail, the default being prison0, which
  contains information about the physical system. Prison0's root
  directory is the same as rootvnode; its hostname is the same as the
  global hostname, and its securelevel replaces the global securelevel.
  Note that the variable "securelevel" has actually gone away, which
  should not cause any problems for code that properly uses
  securelevel_gt() and securelevel_ge().

  Some jail-related permissions that were kept in global variables and
  set via sysctls are now per-jail settings. The sysctls still exist for
  backward compatibility, used only by the now-deprecated jail(2) system
  call.

  Approved by: bz (mentor)
  [jamie, 2009-05-27, 1 file, -2/+2]
* - Add a function (fill_kinfo_aggregate()) which aggregates relevant
    members for a kinfo entry on a process-wide basis.
  - Use the newly introduced function in order to fix cases like
    KERN_PROC_PROC where aggregating stats are broken because they just
    consider the first thread in the pool for each process.
    (Note, additionally, that KERN_PROC_PROC is rather inaccurate on
    thread-wide information like the 'state' of the process. Such
    information should maybe be invalidated and forcibly discarded by
    the consumers?)
  - Simplify the logic of sysctl_out_proc() and adjust
    fill_kinfo_thread() accordingly.
  - Remove checks on FIRST_THREAD_IN_PROC() being NULL but add
    assertions.

  This patch should fix aggregate statistics for KERN_PROC_PROC. This is
  one of the reasons why top doesn't use this option, and now it can use
  it safely. ps, when launched in order to display just processes, now
  should report correct cpu utilization percentages and times (as opposed
  to the old code).

  Reviewed by: jhb, emaste
  Sponsored by: Sandvine Incorporated
  [attilio, 2009-02-18, 1 file, -22/+44]
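The difference between reporting only the first thread and aggregating over all threads, as described in the commit above, can be sketched as follows. The types and function names here are hypothetical and chosen for illustration; the real fill_kinfo_aggregate() operates on kernel structures that are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread counters for this sketch. */
struct thread_stats {
    uint64_t runtime_us; /* accumulated run time */
    uint32_t pctcpu;     /* scaled CPU usage */
};

/* Hypothetical per-process summary built from the threads. */
struct proc_summary {
    uint64_t runtime_us;
    uint32_t pctcpu;
};

/* Old behaviour as described: only the first thread is considered. */
struct proc_summary summarize_first_thread(const struct thread_stats *td, size_t n)
{
    struct proc_summary s;

    assert(n > 0); /* a process is assumed to have at least one thread */
    s.runtime_us = td[0].runtime_us;
    s.pctcpu = td[0].pctcpu;
    return s;
}

/* New behaviour as described: sum the counters over every thread. */
struct proc_summary summarize_all_threads(const struct thread_stats *td, size_t n)
{
    struct proc_summary s = { 0, 0 };

    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        s.runtime_us += td[i].runtime_us;
        s.pctcpu += td[i].pctcpu;
    }
    return s;
}
```

For a process whose second thread does most of the work, the first variant under-reports both fields, which matches the incorrect per-process figures the commit message attributes to the old code.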
* - Add conditional Giant locking around the vrele() in
    sysctl_kern_proc_pathname().
  - Mark all the kern.proc.* sysctls as MPSAFE.

  Submitted by: csjp (2)
  [jhb, 2009-01-23, 1 file, -33/+38]
* vm_map_lock_read() does not increment map->timestamp, so we should
  compare map->timestamp with saved timestamp after map read lock is
  reacquired, not with saved timestamp + 1. The only consequence of the
  +1 was unconditional lookup of the next map entry, though.

  Tested by: pho
  Approved by: des
  MFC after: 2 weeks
  [kib, 2008-12-29, 1 file, -2/+2]
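The timestamp comparison in the commit above can be modelled with a toy map. Everything here is hypothetical: `struct toy_map` and its functions are not the kernel's vm_map API, and the sketch assumes, for illustration, that a write lock bumps the timestamp while a read lock leaves it alone, as the commit message states for vm_map_lock_read().

```c
#include <assert.h>

/* Toy model of a map with a modification timestamp. */
struct toy_map {
    unsigned int timestamp;
};

/* Assumption for this sketch: taking the write lock bumps the timestamp. */
void toy_map_lock_write(struct toy_map *m)
{
    m->timestamp++;
}

/* Per the commit message, taking the read lock does not change it. */
void toy_map_lock_read(struct toy_map *m)
{
    (void)m;
}

/* Old check: compares against saved + 1, so an untouched map still
 * looks changed and the entry is always looked up again. */
int needs_relookup_old(unsigned int saved, const struct toy_map *m)
{
    return saved + 1 != m->timestamp;
}

/* New check: an untouched map keeps the saved timestamp. */
int needs_relookup_new(unsigned int saved, const struct toy_map *m)
{
    return saved != m->timestamp;
}
```

With the read lock dropped and reacquired and no writer in between, the old check asks for a fresh lookup and the new check does not.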
* Reference the vmspace of the process being inspected by procfs, linprocfs
  and sysctl kern_proc_vmmap handlers.

  Reported and tested by: pho
  Reviewed by: rwatson, des
  MFC after: 1 week
  [kib, 2008-12-12, 1 file, -3/+15]
* Drop the vm map lock earlier in sysctl_kern_proc_vmmap(), to avoid
  locking a vnode while having the vm map locked.

  Reported and tested by: pho
  MFC after: 1 week
  [kib, 2008-12-08, 1 file, -38/+40]
* Several threads in a process may do vfork() simultaneously. Then, all
  parent threads sleep on the parent's struct proc until the corresponding
  child releases the vmspace. Each sleep is interlocked with the proc
  mutex of the child, which triggers an assertion in sleepq_add(). The
  assertion requires that at any time, all simultaneous sleepers for the
  channel use the same interlock.

  Silence the assertion by using a condition variable allocated in the
  child. Broadcast the variable event on exec() and exit().

  Since the struct proc * sleep wait channel is overloaded for several
  unrelated events, I was unable to remove wakeups from the places where
  cv_broadcast() is added, except exec().

  Reported and tested by: ganbold
  Suggested and reviewed by: jhb
  MFC after: 2 weeks
  [kib, 2008-12-05, 1 file, -0/+1]
* Merge user/peter/kinfo branch as of r185547 into head.

  This changes struct kinfo_filedesc and kinfo_vmentry such that they are
  the same on both 32 and 64 bit platforms like i386/amd64 and won't
  require sysctl wrapping.

  Two new OIDs are assigned. The old ones are available under
  COMPAT_FREEBSD7 - but it isn't that simple. The superseded interface was
  never actually released on 7.x.

  The other main change is to pack the data passed to userland via the
  sysctl. kf_structsize and kve_structsize are reduced for the copyout.
  If you have a process with 100,000+ sockets open, the unpacked records
  require a 132MB+ copyout. With packing, it is "only" ~35MB. (Still
  seriously unpleasant, but not quite as devastating). A similar problem
  exists for the vmentry structure - have lots and lots of shared
  libraries and small mmaps and its copyout gets expensive too.

  My immediate problem is valgrind. It traditionally achieves this
  functionality by parsing procfs output, in a packed format. Secondly,
  when tracing 32 bit binaries on amd64 under valgrind, it uses a cross
  compiled 32 bit binary which ran directly into the differing data
  structures in 32 vs 64 bit mode. (valgrind uses this to track file
  descriptor operations and this therefore affected every single 32 bit
  binary)

  I've added two utility functions to libutil to unpack the structures
  into a fixed record length and to make it a little more convenient to
  use.
  [peter, 2008-12-02, 1 file, -2/+183]
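The record packing described in the merge commit above can be sketched with a self-describing record whose size field tells the reader where the next record starts. This is a simplified illustration, not the kernel's layout: `struct rec`, `PATH_SKETCH_LEN`, `pack_record()` and `count_records()` are hypothetical, and the 8-byte rounding is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PATH_SKETCH_LEN 64 /* hypothetical fixed path buffer size */

/* Hypothetical record with a leading size field and a trailing path. */
struct rec {
    int32_t structsize; /* bytes this record occupies in the packed stream */
    int32_t fd;
    char    path[PATH_SKETCH_LEN];
};

/* Append one record to buf, keeping only the used part of the path
 * (including its NUL), rounded up to 8 bytes. Returns the number of
 * bytes written, or 0 if the record does not fit. */
size_t pack_record(unsigned char *buf, size_t space, int32_t fd, const char *path)
{
    struct rec r;
    size_t len = strlen(path);
    size_t used;

    if (len >= sizeof(r.path))
        return 0;
    memset(&r, 0, sizeof(r));
    r.fd = fd;
    memcpy(r.path, path, len + 1);
    used = offsetof(struct rec, path) + len + 1;
    used = (used + 7) & ~(size_t)7; /* assumed alignment for this sketch */
    if (used > sizeof(r))
        used = sizeof(r);
    if (used > space)
        return 0;
    r.structsize = (int32_t)used;
    memcpy(buf, &r, used);
    return used;
}

/* Walk the packed stream by following each record's structsize. */
size_t count_records(const unsigned char *buf, size_t len)
{
    size_t n = 0, off = 0;

    while (off + sizeof(int32_t) <= len) {
        int32_t sz;

        memcpy(&sz, buf + off, sizeof(sz));
        if (sz <= 0 || (size_t)sz > len - off)
            break; /* malformed or truncated record */
        off += (size_t)sz;
        n++;
    }
    return n;
}
```

Two short paths packed this way occupy 40 bytes instead of two full 72-byte records, which is the kind of saving the commit message reports at much larger scale. The libutil helpers mentioned in the commit do the reverse and expand packed records back to a fixed length.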
| * Duplicate another few hundred lines of code in order to be compatible
|   with unreleased binaries.
|   [peter, 2008-12-01, 1 file, -2/+179]
| * WIP kinfo_file/kinfo_vmmentry tweaks. The idea:
|   1) Get the 32 and 64 bit versions in sync so that no shims are needed.
|      Valgrind in particular exercises this.
|   2) Reduce the size of the copyout. On large processes this turns out
|      to be a huge problem. Valgrind also suffers from this since it
|      needs to do this in a context that can't malloc. I want to pack the
|      records.
|   3) Add new types: 'tell me about fd N' and 'tell me about addr N'.
|   [peter, 2008-11-29, 1 file, -3/+7]
* Update ZFS from version 6 to 13 and bring some FreeBSD-specific changes.

  This brings a huge number of changes; I'll enumerate only the
  user-visible ones:

  - Delegated Administration
    Allows regular users to perform ZFS operations, like file system
    creation, snapshot creation, etc.
  - L2ARC
    Level 2 cache for ZFS - allows the use of additional disks for cache.
    Huge performance improvements, mostly for random reads of mostly
    static content.
  - slog
    Allows the use of additional disks for the ZFS Intent Log to speed up
    operations like fsync(2).
  - vfs.zfs.super_owner
    Allows regular users to perform privileged operations on files stored
    on ZFS file systems owned by them. Be very careful with this one.
  - chflags(2)
    Not all the flags are supported. This still needs work.
  - ZFSBoot
    Support for booting off a ZFS pool. Not finished, AFAIK.
    Submitted by: dfr
  - Snapshot properties
  - New failure modes
    Before, if a write request failed, the system panicked. Now one can
    select one of three failure modes:
    - panic - panic on write error
    - wait - wait for disk to reappear
    - continue - serve read requests if possible, block write requests
  - Refquota, refreservation properties
    Just like the quota and reservation properties, but they don't count
    space consumed by children file systems, clones and snapshots.
  - Sparse volumes
    ZVOLs that don't reserve space in the pool.
  - External attributes
    Compatible with extattr(2).
  - NFSv4-ACLs
    Not sure about the status, might not be complete yet.
    Submitted by: trasz
  - Creation-time properties
  - Regression tests for the zpool(8) command.

  Obtained from: OpenSolaris
  [pjd, 2008-11-17, 1 file, -0/+3]
* Remove unnecessary locking around vn_fullpath(). The vnode lock for the
  vnode in question does not need to be held. All the data structures
  used during the name lookup are protected by the global name cache
  lock. Instead, the caller merely needs to ensure a reference is held on
  the vnode (such as vhold()) to keep it from being freed.

  In the case of procfs' <pid>/file entry, grab the process lock while we
  gain a new reference (via vhold()) on p_textvp to fully close races with
  execve(2).

  For the kern.proc.vmmap sysctl handler, use a shared vnode lock around
  the call to VOP_GETATTR() rather than an exclusive lock.

  MFC after: 1 month
  [jhb, 2008-11-04, 1 file, -2/+2]
* Add three extra fields to the kinfo_proc_vmmap data. kve_offset is the
  offset within an object that a mapping refers to. fileid and fsid are
  inode/dev for vnodes. (Linux procfs has these and valgrind is really
  unhappy without them.) I believe I didn't change the size of the struct.
  [peter, 2008-10-31, 1 file, -0/+10]
* Retire the MALLOC and FREE macros. They are an abomination unto style(9).

  MFC after: 3 months
  [des, 2008-10-23, 1 file, -4/+4]
* Fix minor TTY API inconsistency.

  Unlike tty_rel_gone() and tty_rel_sess(), the tty_rel_pgrp() routine
  does not unlock the TTY. I once had the idea to make the code call
  tty_rel_pgrp() and tty_rel_sess(), picking up the TTY lock once. This
  turned out a little harder than I expected, so this is how it works now.

  It's a lot easier if we just let tty_rel_pgrp() unlock the TTY, because
  the other routines do this anyway.
  [ed, 2008-09-16, 1 file, -1/+0]
* If the process id specified is invalid, the system call returns ESRCH
  [kevlo, 2008-09-04, 1 file, -2/+2]
* Integrate the new MPSAFE TTY layer to the FreeBSD operating system.

  The last half year I've been working on a replacement TTY layer for the
  FreeBSD kernel. The new TTY layer was designed to improve the following:

  - Improved driver model:
    The old TTY layer has a driver model that is not abstract enough to
    make it friendly to use. A good example is the output path, where the
    device drivers directly access the output buffers. This means that an
    in-kernel PPP implementation must always convert network buffers into
    TTY buffers.
    If a PPP implementation would be built on top of the new TTY layer
    (still needs a hooks layer, though), it would allow the PPP
    implementation to directly hand the data to the TTY driver.
  - Improved hotplugging:
    With the old TTY layer, it isn't entirely safe to destroy TTY's from
    the system. This implementation has a two-step destructing design,
    where the driver first abandons the TTY. After all threads have left
    the TTY, the TTY layer calls a routine in the driver, which can be
    used to free resources (unit numbers, etc).
    The pts(4) driver also implements this feature, which means
    posix_openpt() will now return PTY's that are created on the fly.
  - Improved performance:
    One of the major improvements is the per-TTY mutex, which is expected
    to improve scalability when compared to the old Giant locking.
    Another change is the unbuffered copying to userspace, which is both
    used on TTY device nodes and PTY masters.

  Upgrading should be quite straightforward. Unlike previous versions,
  existing kernel configuration files do not need to be changed, except
  when they reference device drivers that are listed in UPDATING.

  Obtained from: //depot/projects/mpsafetty/...
  Approved by: philip (ex-mentor)
  Discussed: on the lists, at BSDCan, at the DevSummit
  Sponsored by: Snow B.V., the Netherlands
  dcons(4) fixed by: kan
  [ed, 2008-08-20, 1 file, -28/+31]