summaryrefslogtreecommitdiffstats
path: root/fs/proc/base.c
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'for-linus' of ↵Linus Torvalds2014-08-091-5/+13
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull namespace updates from Eric Biederman: "This is a bunch of small changes built against 3.16-rc6. The most significant change for users is the first patch which makes setns drmatically faster by removing unneded rcu handling. The next chunk of changes are so that "mount -o remount,.." will not allow the user namespace root to drop flags on a mount set by the system wide root. Aks this forces read-only mounts to stay read-only, no-dev mounts to stay no-dev, no-suid mounts to stay no-suid, no-exec mounts to stay no exec and it prevents unprivileged users from messing with a mounts atime settings. I have included my test case as the last patch in this series so people performing backports can verify this change works correctly. The next change fixes a bug in NFS that was discovered while auditing nsproxy users for the first optimization. Today you can oops the kernel by reading /proc/fs/nfsfs/{servers,volumes} if you are clever with pid namespaces. I rebased and fixed the build of the !CONFIG_NFS_FS case yesterday when a build bot caught my typo. Given that no one to my knowledge bases anything on my tree fixing the typo in place seems more responsible that requiring a typo-fix to be backported as well. The last change is a small semantic cleanup introducing /proc/thread-self and pointing /proc/mounts and /proc/net at it. This prevents several kinds of problemantic corner cases. It is a user-visible change so it has a minute chance of causing regressions so the change to /proc/mounts and /proc/net are individual one line commits that can be trivially reverted. Unfortunately I lost and could not find the email of the original reporter so he is not credited. From at least one perspective this change to /proc/net is a refgression fix to allow pthread /proc/net uses that were broken by the introduction of the network namespace" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: proc: Point /proc/mounts at /proc/thread-self/mounts instead of /proc/self/mounts proc: Point /proc/net at /proc/thread-self/net instead of /proc/self/net proc: Implement /proc/thread-self to point at the directory of the current thread proc: Have net show up under /proc/<tgid>/task/<tid> NFS: Fix /proc/fs/nfsfs/servers and /proc/fs/nfsfs/volumes mnt: Add tests for unprivileged remount cases that have found to be faulty mnt: Change the default remount atime from relatime to the existing value mnt: Correct permission checks in do_remount mnt: Move the test for MNT_LOCK_READONLY from change_mount_flags into do_remount mnt: Only change user settable mount flags in remount namespaces: Use task_lock and not rcu to protect nsproxy
| * proc: Implement /proc/thread-self to point at the directory of the current ↵Eric W. Biederman2014-08-041-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | thread /proc/thread-self is derived from /proc/self. /proc/thread-self points to the directory in proc containing information about the current thread. This funtionality has been missing for a long time, and is tricky to implement in userspace as gettid() is not exported by glibc. More importantly this allows fixing defects in /proc/mounts and /proc/net where in a threaded application today they wind up being empty files when only the initial pthread has exited, causing problems for other threads. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * proc: Have net show up under /proc/<tgid>/task/<tid>Eric W. Biederman2014-08-041-0/+3
| | | | | | | | | | | | | | Network namespaces are per task so it make sense for them to show up in the task directory. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* | proc: remove INF macroAlexey Dobriyan2014-08-081-41/+0
| | | | | | | | | | | | | | | | | | If you're applying this patch, all /proc/$PID/* files were converted to seq_file interface and this code became unused. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: convert /proc/$PID/hardwall to seq_file interfaceAlexey Dobriyan2014-08-081-2/+2
| | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: convert /proc/$PID/io to seq_file interfaceAlexey Dobriyan2014-08-081-8/+10
| | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: convert /proc/$PID/oom_score to seq_file interfaceAlexey Dobriyan2014-08-081-4/+5
| | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: convert /proc/$PID/schedstat to seq_file interfaceAlexey Dobriyan2014-08-081-4/+5
| | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: convert /proc/$PID/wchan to seq_file interfaceAlexey Dobriyan2014-08-081-5/+6
| | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: convert /proc/$PID/cmdline to seq_file interfaceAlexey Dobriyan2014-08-081-4/+11
| | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: convert /proc/$PID/syscall to seq_file interfaceAlexey Dobriyan2014-08-081-6/+7
| | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: convert /proc/$PID/limits to seq_file interfaceAlexey Dobriyan2014-08-081-15/+12
| | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: convert /proc/$PID/auxv to seq_file interfaceAlexey Dobriyan2014-08-081-10/+8
| | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: more "const char *" pointersAlexey Dobriyan2014-08-081-4/+4
| | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: faster /proc/$PID lookupAlexey Dobriyan2014-08-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently lookup for /proc/$PID first goes through spinlock and whole list of misc /proc entries only to confirm that, yes, /proc/42 can not possibly match random proc entry. List is is several dozens entries long (52 entries on my setup). None of this is necessary. Try to convert dentry name to integer first. If it works, it must be /proc/$PID. If it doesn't, it must be random proc entry. Based on patch from Al Viro. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: add and remove /proc entry create checksAlexey Dobriyan2014-08-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * remove proc_create(NULL, ...) check, let it oops * warn about proc_create("", ...) and proc_create("very very long name", ...) proc code keeps length as u8, no 256+ name length possible * warn about proc_create("123", ...) /proc/$PID and /proc/misc namespaces are separate things, but dumb module might create funky a-la $PID entry. * remove post mortem strchr('/') check Triggering it implies either strchr() is buggy or memory corruption. It should be VFS check anyway. In reality, none of these checks will ever trigger, it is preparation for the next patch. Based on patch from Al Viro. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: constify seq_operationsFabian Frederick2014-08-081-1/+1
|/ | | | | | | | | | | | | | | | proc_uid_seq_operations, proc_gid_seq_operations and proc_projid_seq_operations are only called in proc_id_map_open with seq_open as const struct seq_operations so we can constify the 3 structures and update proc_id_map_open prototype. text data bss dec hex filename 6817 404 1984 9205 23f5 kernel/user_namespace.o-before 6913 308 1984 9205 23f5 kernel/user_namespace.o-after Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge git://git.infradead.org/users/eparis/auditLinus Torvalds2014-04-121-34/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull audit updates from Eric Paris. * git://git.infradead.org/users/eparis/audit: (28 commits) AUDIT: make audit_is_compat depend on CONFIG_AUDIT_COMPAT_GENERIC audit: renumber AUDIT_FEATURE_CHANGE into the 1300 range audit: do not cast audit_rule_data pointers pointlesly AUDIT: Allow login in non-init namespaces audit: define audit_is_compat in kernel internal header kernel: Use RCU_INIT_POINTER(x, NULL) in audit.c sched: declare pid_alive as inline audit: use uapi/linux/audit.h for AUDIT_ARCH declarations syscall_get_arch: remove useless function arguments audit: remove stray newline from audit_log_execve_info() audit_panic() call audit: remove stray newlines from audit_log_lost messages audit: include subject in login records audit: remove superfluous new- prefix in AUDIT_LOGIN messages audit: allow user processes to log from another PID namespace audit: anchor all pid references in the initial pid namespace audit: convert PPIDs to the inital PID namespace. pid: get pid_t ppid of task in init_pid_ns audit: rename the misleading audit_get_context() to audit_take_context() audit: Add generic compat syscall support audit: Add CONFIG_HAVE_ARCH_AUDITSYSCALL ...
| * proc: Update get proc_pid_cmdline() to use mm.h helpersWilliam Roberts2014-03-201-34/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Re-factor proc_pid_cmdline() to use get_cmdline() helper from mm.h. Acked-by: David Rientjes <rientjes@google.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: William Roberts <wroberts@tresys.com> Acked-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
* | fault-injection: set bounds on what /proc/self/make-it-fail accepts.Dave Jones2014-04-071-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | /proc/self/make-it-fail is a boolean, but accepts any number, including negative ones. Change variable to unsigned, and cap upper bound at 1. [akpm@linux-foundation.org: don't make make_it_fail unsigned] Signed-off-by: Dave Jones <davej@fedoraproject.org> Reviewed-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | procfs: make /proc/*/pagemap 0400Djalal Harouni2014-04-071-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The /proc/*/pagemap contain sensitive information and currently its mode is 0444. Change this to 0400, so the VFS will prevent unprivileged processes from getting file descriptors on arbitrary privileged /proc/*/pagemap files. This reduces the scope of address space leaking and bypasses by protecting already running processes. Signed-off-by: Djalal Harouni <tixxdz@opendz.org> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Andy Lutomirski <luto@amacapital.net> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | procfs: make /proc/*/{stack,syscall,personality} 0400Djalal Harouni2014-04-071-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These procfs files contain sensitive information and currently their mode is 0444. Change this to 0400, so the VFS will be able to block unprivileged processes from getting file descriptors on arbitrary privileged /proc/*/{stack,syscall,personality} files. This reduces the scope of ASLR leaking and bypasses by protecting already running processes. Signed-off-by: Djalal Harouni <tixxdz@opendz.org> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Andy Lutomirski <luto@amacapital.net> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | fs/proc/base.c: fix GPF in /proc/$PID/map_filesArtem Fetishev2014-03-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The expected logic of proc_map_files_get_link() is either to return 0 and initialize 'path' or return an error and leave 'path' uninitialized. By the time dname_to_vma_addr() returns 0 the corresponding vma may have already be gone. In this case the path is not initialized but the return value is still 0. This results in 'general protection fault' inside d_path(). Steps to reproduce: CONFIG_CHECKPOINT_RESTORE=y fd = open(...); while (1) { mmap(fd, ...); munmap(fd, ...); } ls -la /proc/$PID/map_files Addresses https://bugzilla.kernel.org/show_bug.cgi?id=68991 Signed-off-by: Artem Fetishev <artem_fetishev@epam.com> Signed-off-by: Aleksandr Terekhov <aleksandr_terekhov@epam.com> Reported-by: <wiebittewas@gmail.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: fix ->f_pos overflows in first_tid()Oleg Nesterov2014-01-231-5/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. proc_task_readdir()->first_tid() path truncates f_pos to int, this is wrong even on 64bit. We could check that f_pos < PID_MAX or even INT_MAX in proc_task_readdir(), but this patch simply checks the potential overflow in first_tid(), this check is nop on 64bit. We do not care if it was negative and the new unsigned value is huge, all we need to ensure is that we never wrongly return !NULL. 2. Remove the 2nd "nr != 0" check before get_nr_threads(), nr_threads == 0 is not distinguishable from !pid_task() above. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Sameer Nanda <snanda@chromium.org> Cc: Sergey Dyasly <dserrg@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: don't (ab)use ->group_leader in proc_task_readdir() pathsOleg Nesterov2014-01-231-28/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | proc_task_readdir() does not really need "leader", first_tid() has to revalidate it anyway. Just pass proc_pid(inode) to first_tid() instead, it can do pid_task(PIDTYPE_PID) itself and read ->group_leader only if necessary. The patch also extracts the "inode is dead" code from pid_delete_dentry(dentry) into the new trivial helper, proc_inode_is_dead(inode), proc_task_readdir() uses it to return -ENOENT if this dir was removed. This is a bit racy, but the race is very inlikely and the getdents() after openndir() can see the empty "." + ".." dir only once. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Sameer Nanda <snanda@chromium.org> Cc: Sergey Dyasly <dserrg@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: change first_tid() to use while_each_thread() rather than next_thread()Oleg Nesterov2014-01-231-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Rerwrite the main loop to use while_each_thread() instead of next_thread(). We are going to fix or replace while_each_thread(), next_thread() should be avoided whenever possible. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Sameer Nanda <snanda@chromium.org> Cc: Sergey Dyasly <dserrg@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: fix the potential use-after-free in first_tid()Oleg Nesterov2014-01-231-0/+3
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | proc_task_readdir() verifies that the result of get_proc_task() is pid_alive() and thus its ->group_leader is fine too. However this is not necessarily true after rcu_read_unlock(), we need to recheck this again after first_tid() does rcu_read_lock(). Otherwise leader->thread_group.next (used by next_thread()) can be invalid if the rcu grace period expires in between. The race is subtle and unlikely, but still it is possible afaics. To simplify lets ignore the "likely" case when tid != 0, f_version can be cleared by proc_task_operations->llseek(). Suppose we have a main thread M and its subthread T. Suppose that f_pos == 3, iow first_tid() should return T. Now suppose that the following happens between rcu_read_unlock() and rcu_read_lock(): 1. T execs and becomes the new leader. This removes M from ->thread_group but next_thread(M) is still T. 2. T creates another thread X which does exec as well, T goes away. 3. X creates another subthread, this increments nr_threads. 4. first_tid() does next_thread(M) and returns the already dead T. Note also that we need 2. and 3. only because of get_nr_threads() check, and this check was supposed to be optimization only. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Sameer Nanda <snanda@chromium.org> Cc: Sergey Dyasly <dserrg@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* audit: allow unsetting the loginuid (with priv)Eric Paris2013-11-051-4/+10
| | | | | | | | | | | | | If a task has CAP_AUDIT_CONTROL allow that task to unset their loginuid. This would allow a child of that task to set their loginuid without CAP_AUDIT_CONTROL. Thus when launching a new login daemon, a priviledged helper would be able to unset the loginuid and then the daemon, which may be malicious user facing, do not need priv to function correctly. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
* proc_fill_cache(): clean up, get rid of pointless find_inode_number() useAl Viro2013-06-291-23/+13
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* proc_fill_cache(): just make instantiate_t return intAl Viro2013-06-291-31/+28
| | | | | | all instances always return ERR_PTR(-E...) or NULL, anyway Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* proc_pid_readdir(): stop wanking with proc_fill_cache() for /proc/selfAl Viro2013-06-291-3/+3
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* proc_fill_cache(): kill pointless checkAl Viro2013-06-291-4/+2
| | | | | | we'd just checked that child->d_inode is non-NULL, for fuck sake! Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* [readdir] convert procfsAl Viro2013-06-291-230/+133
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* posix-timers: Show clock ID in proc filePavel Tikhomirov2013-05-281-0/+1
| | | | | | | | | | | | | Expand information about posix-timers in /proc/<pid>/timers by adding info about clock, with which the timer was created. I.e. in the forth line of timer info after "notify:" line go "ClockID: <clock_id>". Signed-off-by: Pavel Tikhomirov <snorcht@gmail.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Matthew Helsley <matt.helsley@gmail.com> Cc: Pavel Emelyanov <xemul@parallels.com> Link: http://lkml.kernel.org/r/1368742323-46949-2-git-send-email-snorcht@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* Merge branch 'for-linus' of ↵Linus Torvalds2013-05-011-4/+52
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS updates from Al Viro, Misc cleanups all over the place, mainly wrt /proc interfaces (switch create_proc_entry to proc_create(), get rid of the deprecated create_proc_read_entry() in favor of using proc_create_data() and seq_file etc). 7kloc removed. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits) don't bother with deferred freeing of fdtables proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h proc: Make the PROC_I() and PDE() macros internal to procfs proc: Supply a function to remove a proc entry by PDE take cgroup_open() and cpuset_open() to fs/proc/base.c ppc: Clean up scanlog ppc: Clean up rtas_flash driver somewhat hostap: proc: Use remove_proc_subtree() drm: proc: Use remove_proc_subtree() drm: proc: Use minor->index to label things, not PDE->name drm: Constify drm_proc_list[] zoran: Don't print proc_dir_entry data in debug reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show() proc: Supply an accessor for getting the data from a PDE's parent airo: Use remove_proc_subtree() rtl8192u: Don't need to save device proc dir PDE rtl8187se: Use a dir under /proc/net/r8180/ proc: Add proc_mkdir_data() proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h} proc: Move PDE_NET() to fs/proc/proc_net.c ...
| * take cgroup_open() and cpuset_open() to fs/proc/base.cAl Viro2013-05-011-0/+31
| | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * proc: Uninline pid_delete_dentry()David Howells2013-05-011-0/+9
| | | | | | | | | | | | | | Uninline pid_delete_dentry() as it's only used by three function pointers. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * procfs: switch /proc/self away from proc_dir_entryAl Viro2013-04-091-4/+12
| | | | | | | | | | | | | | Just have it pinned in dcache all along and let procfs ->kill_sb() drop it before kill_anon_super(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | fs, proc: truncate /proc/pid/comm writes to first TASK_COMM_LEN bytesDavid Rientjes2013-04-301-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, a write to a procfs file will return the number of bytes successfully written. If the actual string is longer than this, the remainder of the string will not be be written and userspace will complete the operation by issuing additional write()s. Hence $ echo -n "abcdefghijklmnopqrs" > /proc/self/comm results in $ cat /proc/$$/comm pqrs since the final four bytes were written with a second write() since TASK_COMM_LEN == 16. This is obviously an undesired result and not equivalent to prctl(PR_SET_NAME). The implementation should not need to know the definition of TASK_COMM_LEN. This patch truncates the string to the first TASK_COMM_LEN bytes and returns the bytes written as the length of the string written so the second write() is suppressed. $ cat /proc/$$/comm abcdefghijklmno Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | posix-timers: Show sigevent info in proc filePavel Emelyanov2013-04-171-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previous patch added proc file to list posix timers created by task. Expand the information provided in this file by adding info about notification method, with which timers were created. I.e. after the "ID:" line there go 1. "signal:" line, that shows signal number and sigval bits; 2. "notify:" line, that shows the timer notification method. Thus the timer entry would looke like this: ID: 123 signal: 14/0000000000b005d0 notify: signal/pid.732 This information is enough to understand how timer_create() was called for each particular timer. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Matthew Helsley <matt.helsley@gmail.com> Link: http://lkml.kernel.org/r/513DA024.80404@parallels.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | posix-timers: Introduce /proc/PID/timers filePavel Emelyanov2013-04-171-0/+83
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently kernel doesn't provide any API for getting info about what posix timers are configured by processes. It's implied, that a process which configured some timers, knows what it did. However, for external tools it's impossible to get this information. In particular, this is critical for checkpoint-restore project to have this info. Introduce a per-pid proc file with information about posix timers. Since these timers are shared between threads, this file is present on tgid level only, no such thing in tid subdirs. The file format is expected to be the "/proc/<pid>/smaps"-like, i.e. each timer will occupy seveal lines to allow for future extending. Each new timer entry starts with the ID: <number> line which is added by this patch. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Matthew Helsley <matt.helsley@gmail.com> Link: http://lkml.kernel.org/r/513DA00D.6070009@parallels.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* fs/proc: clean up printksAndrew Morton2013-02-271-1/+2
| | | | | | | | | | | | - use pr_foo() throughout - remove a couple of duplicated KERN_WARNINGs, via WARN(KERN_WARNING "...") - nuke a few warnings which I've never seen happen, ever. Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fs: change return values from -EACCES to -EPERMZhao Hongjiang2013-02-261-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | According to SUSv3: [EACCES] Permission denied. An attempt was made to access a file in a way forbidden by its file access permissions. [EPERM] Operation not permitted. An attempt was made to perform an operation limited to processes with appropriate privileges or to the owner of a file or other resource. So -EPERM should be returned if capability checks fails. Strictly speaking this is an API change since the error code user sees is altered. Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com> Acked-by: Jan Kara <jack@suse.cz> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Acked-by: Ian Kent <raven@themaw.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* d_hash_and_lookup(): export, switch open-coded instancesAl Viro2013-02-261-0/+1
| | | | | | | | * calling conventions change - ERR_PTR() is returned on ->d_hash() errors; NULL is just for dcache miss now. * exported, open-coded instances in ncpfs and cifs converted. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* new helper: file_inode(file)Al Viro2013-02-221-19/+19
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* procfs: drop vmtruncateMarco Stornelli2012-12-201-7/+0
| | | | | | | Removed vmtruncate Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Merge branch 'for-linus' of ↵Linus Torvalds2012-12-171-166/+3
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace changes from Eric Biederman: "While small this set of changes is very significant with respect to containers in general and user namespaces in particular. The user space interface is now complete. This set of changes adds support for unprivileged users to create user namespaces and as a user namespace root to create other namespaces. The tyranny of supporting suid root preventing unprivileged users from using cool new kernel features is broken. This set of changes completes the work on setns, adding support for the pid, user, mount namespaces. This set of changes includes a bunch of basic pid namespace cleanups/simplifications. Of particular significance is the rework of the pid namespace cleanup so it no longer requires sending out tendrils into all kinds of unexpected cleanup paths for operation. At least one case of broken error handling is fixed by this cleanup. The files under /proc/<pid>/ns/ have been converted from regular files to magic symlinks which prevents incorrect caching by the VFS, ensuring the files always refer to the namespace the process is currently using and ensuring that the ptrace_mayaccess permission checks are always applied. The files under /proc/<pid>/ns/ have been given stable inode numbers so it is now possible to see if different processes share the same namespaces. Through the David Miller's net tree are changes to relax many of the permission checks in the networking stack to allowing the user namespace root to usefully use the networking stack. Similar changes for the mount namespace and the pid namespace are coming through my tree. Two small changes to add user namespace support were commited here adn in David Miller's -net tree so that I could complete the work on the /proc/<pid>/ns/ files in this tree. Work remains to make it safe to build user namespaces and 9p, afs, ceph, cifs, coda, gfs2, ncpfs, nfs, nfsd, ocfs2, and xfs so the Kconfig guard remains in place preventing that user namespaces from being built when any of those filesystems are enabled. Future design work remains to allow root users outside of the initial user namespace to mount more than just /proc and /sys." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (38 commits) proc: Usable inode numbers for the namespace file descriptors. proc: Fix the namespace inode permission checks. proc: Generalize proc inode allocation userns: Allow unprivilged mounts of proc and sysfs userns: For /proc/self/{uid,gid}_map derive the lower userns from the struct file procfs: Print task uids and gids in the userns that opened the proc file userns: Implement unshare of the user namespace userns: Implent proc namespace operations userns: Kill task_user_ns userns: Make create_new_namespaces take a user_ns parameter userns: Allow unprivileged use of setns. userns: Allow unprivileged users to create new namespaces userns: Allow setting a userns mapping to your current uid. userns: Allow chown and setgid preservation userns: Allow unprivileged users to create user namespaces. userns: Ignore suid and sgid on binaries if the uid or gid can not be mapped userns: fix return value on mntns_install() failure vfs: Allow unprivileged manipulation of the mount namespace. vfs: Only support slave subtrees across different user namespaces vfs: Add a user namespace reference from struct mnt_namespace ...
| * pidns: Make the pidns proc mount/umount logic obvious.Eric W. Biederman2012-11-191-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Track the number of pids in the proc hash table. When the number of pids goes to 0 schedule work to unmount the kernel mount of proc. Move the mount of proc into alloc_pid when we allocate the pid for init. Remove the surprising calls of pid_ns_release proc in fork and proc_flush_task. Those code paths really shouldn't know about proc namespace implementation details and people have demonstrated several times that finding and understanding those code paths is difficult and non-obvious. Because of the call path detach pid is alwasy called with the rtnl_lock held free_pid is not allowed to sleep, so the work to unmounting proc is moved to a work queue. This has the side benefit of not blocking the entire world waiting for the unnecessary rcu_barrier in deactivate_locked_super. In the process of making the code clear and obvious this fixes a bug reported by Gao feng <gaofeng@cn.fujitsu.com> where we would leak a mount of proc during clone(CLONE_NEWPID|CLONE_NEWNET) if copy_pid_ns succeeded and copy_net_ns failed. Acked-by: "Serge E. Hallyn" <serge@hallyn.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * procfs: Don't cache a pid in the root inode.Eric W. Biederman2012-11-191-10/+1
| | | | | | | | | | | | | | | | | | | | | | Now that we have s_fs_info pointing to our pid namespace the original reason for the proc root inode having a struct pid is gone. Caching a pid in the root inode has led to some complicated code. Now that we don't need the struct pid, just remove it. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
| * procfs: Use the proc generic infrastructure for proc/self.Eric W. Biederman2012-11-191-152/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | I had visions at one point of splitting proc into two filesystems. If that had happened proc/self being the the part of proc that actually deals with pids would have been a nice cleanup. As it is proc/self requires a lot of unnecessary infrastructure for a single file. The only user visible change is that a mounted /proc for a pid namespace that is dead now shows a broken proc symlink, instead of being completely invisible. I don't think anyone will notice or care. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
OpenPOWER on IntegriCloud