summaryrefslogtreecommitdiffstats
path: root/kernel/audit_tree.c
Commit message (Collapse)AuthorAgeFilesLines
* fsnotify: remove pointless NULL initializersJan Kara2014-01-211-2/+0
| | | | | | | | | | | | | We usually rely on the fact that struct members not specified in the initializer are set to NULL. So do that with fsnotify function pointers as well. Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Eric Paris <eparis@parisplace.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fsnotify: remove .should_send_event callbackJan Kara2014-01-211-11/+1
| | | | | | | | | | | | | After removing event structure creation from the generic layer there is no reason for separate .should_send_event and .handle_event callbacks. So just remove the first one. Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Eric Paris <eparis@parisplace.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fsnotify: do not share events between notification groupsJan Kara2014-01-211-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently fsnotify framework creates one event structure for each notification event and links this event into all interested notification groups. This is done so that we save memory when several notification groups are interested in the event. However the need for event structure shared between inotify & fanotify bloats the event structure so the result is often higher memory consumption. Another problem is that fsnotify framework keeps path references with outstanding events so that fanotify can return open file descriptors with its events. This has the undesirable effect that filesystem cannot be unmounted while there are outstanding events - a regression for inotify compared to a situation before it was converted to fsnotify framework. For fanotify this problem is hard to avoid and users of fanotify should kind of expect this behavior when they ask for file descriptors from notified files. This patch changes fsnotify and its users to create separate event structure for each group. This allows for much simpler code (~400 lines removed by this patch) and also smaller event structures. For example on 64-bit system original struct fsnotify_event consumes 120 bytes, plus additional space for file name, additional 24 bytes for second and each subsequent group linking the event, and additional 32 bytes for each inotify group for private data. After the conversion inotify event consumes 48 bytes plus space for file name which is considerably less memory unless file names are long and there are several groups interested in the events (both of which are uncommon). Fanotify event fits in 56 bytes after the conversion (fanotify doesn't care about file names so its events don't have to have it allocated). A win unless there are four or more fanotify groups interested in the event. The conversion also solves the problem with unmount when only inotify is used as we don't have to grab path references for inotify events. [hughd@google.com: fanotify: fix corruption preventing startup] Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Eric Paris <eparis@parisplace.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* kernel/audit_tree.c:audit_add_tree_rule(): protect `rule' from kill_rules()Chen Gang2013-06-121-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | audit_add_tree_rule() must set 'rule->tree = NULL;' firstly, to protect the rule itself freed in kill_rules(). The reason is when it is killed, the 'rule' itself may have already released, we should not access it. one example: we add a rule to an inode, just at the same time the other task is deleting this inode. The work flow for adding a rule: audit_receive() -> (need audit_cmd_mutex lock) audit_receive_skb() -> audit_receive_msg() -> audit_receive_filter() -> audit_add_rule() -> audit_add_tree_rule() -> (need audit_filter_mutex lock) ... unlock audit_filter_mutex get_tree() ... iterate_mounts() -> (iterate all related inodes) tag_mount() -> tag_trunk() -> create_trunk() -> (assume it is 1st rule) fsnotify_add_mark() -> fsnotify_add_inode_mark() -> (add mark to inode->i_fsnotify_marks) ... get_tree(); (each inode will get one) ... lock audit_filter_mutex The work flow for deleting an inode: __destroy_inode() -> fsnotify_inode_delete() -> __fsnotify_inode_delete() -> fsnotify_clear_marks_by_inode() -> (get mark from inode->i_fsnotify_marks) fsnotify_destroy_mark() -> fsnotify_destroy_mark_locked() -> audit_tree_freeing_mark() -> evict_chunk() -> ... tree->goner = 1 ... kill_rules() -> (assume current->audit_context == NULL) call_rcu() -> (rule->tree != NULL) audit_free_rule_rcu() -> audit_free_rule() ... audit_schedule_prune() -> (assume current->audit_context == NULL) kthread_run() -> (need audit_cmd_mutex and audit_filter_mutex lock) prune_one() -> (delete it from prue_list) put_tree(); (match the original get_tree above) Signed-off-by: Chen Gang <gang.chen@asianux.com> Cc: Eric Paris <eparis@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* kernel/audit_tree.c: tree will leak memory when failure occurs in ↵Chen Gang2013-04-291-1/+1
| | | | | | | | | | | | | | audit_trim_trees() audit_trim_trees() calls get_tree(). If a failure occurs we must call put_tree(). [akpm@linux-foundation.org: run put_tree() before mutex_lock() for small scalability improvement] Signed-off-by: Chen Gang <gang.chen@asianux.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* audit: catch possible NULL audit buffersKees Cook2013-01-111-9/+17
| | | | | | | | | | | | | | | | | It's possible for audit_log_start() to return NULL. Handle it in the various callers. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric Paris <eparis@redhat.com> Cc: Jeff Layton <jlayton@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Julien Tinnes <jln@google.com> Cc: Will Drewry <wad@google.com> Cc: Steve Grubb <sgrubb@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fsnotify: pass group to fsnotify_destroy_mark()Lino Sanfilippo2012-12-111-5/+5
| | | | | | | | In fsnotify_destroy_mark() dont get the group from the passed mark anymore, but pass the group itself as an additional parameter to the function. Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Signed-off-by: Eric Paris <eparis@redhat.com>
* audit: clean up refcounting in audit-treeMiklos Szeredi2012-08-151-3/+9
| | | | | | | | | | Drop the initial reference by fsnotify_init_mark early instead of audit_tree_freeing_mark() at destroy time. In the cases we destroy the mark before we drop the initial reference we need to get rid of the get_mark that balances the put_mark in audit_tree_freeing_mark(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
* audit: fix refcounting in audit-treeMiklos Szeredi2012-08-151-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Refcounting of fsnotify_mark in audit tree is broken. E.g: refcount create_chunk alloc_chunk 1 fsnotify_add_mark 2 untag_chunk fsnotify_get_mark 3 fsnotify_destroy_mark audit_tree_freeing_mark 2 fsnotify_put_mark 1 fsnotify_put_mark 0 via destroy_list fsnotify_mark_destroy -1 This was reported by various people as triggering Oops when stopping auditd. We could just remove the put_mark from audit_tree_freeing_mark() but that would break freeing via inode destruction. So this patch simply omits a put_mark after calling destroy_mark or adds a get_mark before. The additional get_mark is necessary where there's no other put_mark after fsnotify_destroy_mark() since it assumes that the caller is holding a reference (or the inode is keeping the mark pinned, not the case here AFAICS). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Reported-by: Valentin Avram <aval13@gmail.com> Reported-by: Peter Moody <pmoody@google.com> Acked-by: Eric Paris <eparis@redhat.com> CC: stable@vger.kernel.org
* audit: don't free_chunk() after fsnotify_add_mark()Miklos Szeredi2012-08-151-3/+3
| | | | | | | | | Don't do free_chunk() after fsnotify_add_mark(). That one does a delayed unref via the destroy list and this results in use-after-free. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Acked-by: Eric Paris <eparis@redhat.com> CC: stable@vger.kernel.org
* VFS: Make clone_mnt()/copy_tree()/collect_mounts() return errorsDavid Howells2012-07-141-5/+5
| | | | | | | | | | | | | | | | | | copy_tree() can theoretically fail in a case other than ENOMEM, but always returns NULL which is interpreted by callers as -ENOMEM. Change it to return an explicit error. Also change clone_mnt() for consistency and because union mounts will add new error cases. Thanks to Andreas Gruenbacher <agruen@suse.de> for a bug fix. [AV: folded braino fix by Dan Carpenter] Original-author: Valerie Aurora <vaurora@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Cc: Valerie Aurora <valerie.aurora@gmail.com> Cc: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* audit_tree,rcu: Convert call_rcu(__put_tree) to kfree_rcu()Lai Jiangshan2011-07-201-7/+1
| | | | | | | | | | | The rcu callback __put_tree() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(__put_tree). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric Paris <eparis@redhat.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
* Fix common misspellingsLucas De Marchi2011-03-311-1/+1
| | | | | | Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
* in untag_chunk() we need to do alloc_chunk() a bit earlierAl Viro2010-10-301-2/+7
| | | | | | ... while we are not holding spinlocks. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* fanotify: use both marks when possibleEric Paris2010-07-281-1/+1
| | | | | | | | fanotify currently, when given a vfsmount_mark will look up (if it exists) the corresponding inode mark. This patch drops that lookup and uses the mark provided. Signed-off-by: Eric Paris <eparis@redhat.com>
* fsnotify: pass both the vfsmount mark and inode markEric Paris2010-07-281-2/+4
| | | | | | | | should_send_event() and handle_event() will both need to look up the inode event if they get a vfsmount event. Lets just pass both at the same time since we have them both after walking the lists in lockstep. Signed-off-by: Eric Paris <eparis@redhat.com>
* fsnotify: cleanup should_send_eventEric Paris2010-07-281-1/+1
| | | | | | | | The change to use srcu and walk the object list rather than the global fsnotify_group list means that should_send_event is no longer needed for a number of groups and can be simplified for others. Do that. Signed-off-by: Eric Paris <eparis@redhat.com>
* fsnotify: send fsnotify_mark to groups in event handling functionsEric Paris2010-07-281-3/+5
| | | | | | | | | With the change of fsnotify to use srcu walking the marks list instead of walking the global groups list we now know the mark in question. The code can send the mark to the group's handling functions and the groups won't have to find those marks themselves. Signed-off-by: Eric Paris <eparis@redhat.com>
* fsnotify: split generic and inode specific mark codeEric Paris2010-07-281-4/+4
| | | | | | | | currently all marking is done by functions in inode-mark.c. Some of this is pretty generic and should be instead done in a generic function and we should only put the inode specific code in inode-mark.c Signed-off-by: Eric Paris <eparis@redhat.com>
* fsnotify: take inode->i_lock inside fsnotify_find_mark_entry()Andreas Gruenbacher2010-07-281-2/+0
| | | | | | | | | All callers to fsnotify_find_mark_entry() except one take and release inode->i_lock around the call. Take the lock inside fsnotify_find_mark_entry() instead. Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: Eric Paris <eparis@redhat.com>
* fsnotify: rename fsnotify_find_mark_entry to fsnotify_find_markEric Paris2010-07-281-6/+6
| | | | | | the _entry portion of fsnotify functions is useless. Drop it. Signed-off-by: Eric Paris <eparis@redhat.com>
* fsnotify: rename fsnotify_mark_entry to just fsnotify_markEric Paris2010-07-281-7/+7
| | | | | | | The name is long and it serves no real purpose. So rename fsnotify_mark_entry to just fsnotify_mark. Signed-off-by: Eric Paris <eparis@redhat.com>
* fsnotify: put inode specific fields in an fsnotify_mark in a unionEric Paris2010-07-281-8/+8
| | | | | | | | | The addition of marks on vfs mounts will be simplified if the inode specific parts of a mark and the vfsmnt specific parts of a mark are actually in a union so naming can be easy. This patch just implements the inode struct and the union. Signed-off-by: Eric Paris <eparis@redhat.com>
* fsnotify: include vfsmount in should_send_event when appropriateEric Paris2010-07-281-1/+2
| | | | | | | | | To ensure that a group will not duplicate events when it receives it based on the vfsmount and the inode should_send_event test we should distinguish those two cases. We pass a vfsmount to this function so groups can make their own determinations. Signed-off-by: Eric Paris <eparis@redhat.com>
* fsnotify: drop mask argument from fsnotify_alloc_groupEric Paris2010-07-281-1/+1
| | | | | | | Nothing uses the mask argument to fsnotify_alloc_group. This patch drops that argument. Signed-off-by: Eric Paris <eparis@redhat.com>
* fsnotify: fsnotify_obtain_group should be fsnotify_alloc_groupEric Paris2010-07-281-1/+1
| | | | | | | | fsnotify_obtain_group was intended to be able to find an already existing group. Nothing uses that functionality. This just renames it to fsnotify_alloc_group so it is clear what it is doing. Signed-off-by: Eric Paris <eparis@redhat.com>
* fsnotify: remove group_num altogetherEric Paris2010-07-281-2/+1
| | | | | | | | The original fsnotify interface has a group-num which was intended to be able to find a group after it was added. I no longer think this is a necessary thing to do and so we remove the group_num. Signed-off-by: Eric Paris <eparis@redhat.com>
* fsnotify: include data in should_send callsEric Paris2010-07-281-1/+1
| | | | | | | | | fanotify is going to need to look at file->private_data to know if an event should be sent or not. This passes the data (which might be a file, dentry, inode, or none) to the should_send function calls so fanotify can get that information when available Signed-off-by: Eric Paris <eparis@redhat.com>
* fsnotify: provide the data type to should_send_eventEric Paris2010-07-281-1/+2
| | | | | | | | | | fanotify is only interested in event types which contain enough information to open the original file in the context of the fanotify listener. Since fanotify may not want to send events if that data isn't present we pass the data type to the should_send_event function call so fanotify can express its lack of interest. Signed-off-by: Eric Paris <eparis@redhat.com>
* audit: reimplement audit_trees using fsnotify rather than inotifyEric Paris2010-07-281-104/+130
| | | | | | | Simply switch audit_trees from using inotify to using fsnotify for it's inode pinning and disappearing act information. Signed-off-by: Eric Paris <eparis@redhat.com>
* include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo2010-03-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
* new helper: iterate_mounts()Al Viro2010-03-031-33/+16
| | | | | | | apply function to vfsmounts in set returned by collect_mounts(), stop if it returns non-zero. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* New helper: path_is_under(path1, path2)Al Viro2010-03-031-39/+12
| | | | | | Analog of is_subdir for vfsmount,dentry pairs, moved from audit_tree.c Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* fix more leaks in audit_tree.c tag_chunk()Al Viro2009-12-191-3/+6
| | | | | | | | | Several leaks in audit_tree didn't get caught by commit 318b6d3d7ddbcad3d6867e630711b8a705d873d7, including the leak on normal exit in case of multiple rules refering to the same chunk. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fix braindamage in audit_tree.c untag_chunk()Al Viro2009-12-191-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | ... aka "Al had badly fscked up when writing that thing and nobody noticed until Eric had fixed leaks that used to mask the breakage". The function essentially creates a copy of old array sans one element and replaces the references to elements of original (they are on cyclic lists) with those to corresponding elements of new one. After that the old one is fair game for freeing. First of all, there's a dumb braino: when we get to list_replace_init we use indices for wrong arrays - position in new one with the old array and vice versa. Another bug is more subtle - termination condition is wrong if the element to be excluded happens to be the last one. We shouldn't go until we fill the new array, we should go until we'd finished the old one. Otherwise the element we are trying to kill will remain on the cyclic lists... That crap used to be masked by several leaks, so it was not quite trivial to hit. Eric had fixed some of those leaks a while ago and the shit had hit the fan... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Fix rule eviction order for AUDIT_DIRAl Viro2009-06-241-5/+51
| | | | | | | | | If syscall removes the root of subtree being watched, we definitely do not want the rules refering that subtree to be destroyed without the syscall in question having a chance to match them. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Audit: clean up all op= output to include string quotingEric Paris2009-06-241-6/+4
| | | | | | | | | | | | | | | | | | A number of places in the audit system we send an op= followed by a string that includes spaces. Somehow this works but it's just wrong. This patch moves all of those that I could find to be quoted. Example: Change From: type=CONFIG_CHANGE msg=audit(1244666690.117:31): auid=0 ses=1 subj=unconfined_u:unconfined_r:auditctl_t:s0-s0:c0.c1023 op=remove rule key="number2" list=4 res=0 Change To: type=CONFIG_CHANGE msg=audit(1244666690.117:31): auid=0 ses=1 subj=unconfined_u:unconfined_r:auditctl_t:s0-s0:c0.c1023 op="remove rule" key="number2" list=4 res=0 Signed-off-by: Eric Paris <eparis@redhat.com>
* Switch collect_mounts() to struct pathAl Viro2009-06-111-3/+3
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* No need for crossing to mountpoint in audit_tag_tree()Al Viro2009-04-201-3/+0
| | | | | | | is_under() will DTRT anyway. And yes, is_subdir() behaviour is intentional. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* audit: incorrect ref counting in audit tree tag_chunkEric Paris2009-04-051-0/+2
| | | | | | | | | | tag_chunk has bad exit paths in which the inotify ref counting is wrong. At the top of the function we found &old_watch using inotify_find_watch(). inotify_find_watch takes a reference to the watch. This is never dropped on an error path. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* audit: validate comparison operations, store them in sane formAl Viro2009-01-041-1/+1
| | | | | | | | | | | Don't store the field->op in the messy (and very inconvenient for e.g. audit_comparator()) form; translate to dense set of values and do full validation of userland-submitted value while we are at it. ->audit_init_rule() and ->audit_match_rule() get new values now; in-tree instances updated. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* audit rules ordering, part 2Al Viro2009-01-041-0/+1
| | | | | | | | Fix the actual rule listing; add per-type lists _not_ used for matching, with all exit,... sitting on one such list. Simplifies "do something for all rules" logics, while we are at it... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Fix inotify watch removal/umount racesAl Viro2008-11-151-37/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Inotify watch removals suck violently. To kick the watch out we need (in this order) inode->inotify_mutex and ih->mutex. That's fine if we have a hold on inode; however, for all other cases we need to make damn sure we don't race with umount. We can *NOT* just grab a reference to a watch - inotify_unmount_inodes() will happily sail past it and we'll end with reference to inode potentially outliving its superblock. Ideally we just want to grab an active reference to superblock if we can; that will make sure we won't go into inotify_umount_inodes() until we are done. Cleanup is just deactivate_super(). However, that leaves a messy case - what if we *are* racing with umount() and active references to superblock can't be acquired anymore? We can bump ->s_count, grab ->s_umount, which will almost certainly wait until the superblock is shut down and the watch in question is pining for fjords. That's fine, but there is a problem - we might have hit the window between ->s_active getting to 0 / ->s_count - below S_BIAS (i.e. the moment when superblock is past the point of no return and is heading for shutdown) and the moment when deactivate_super() acquires ->s_umount. We could just do drop_super() yield() and retry, but that's rather antisocial and this stuff is luser-triggerable. OTOH, having grabbed ->s_umount and having found that we'd got there first (i.e. that ->s_root is non-NULL) we know that we won't race with inotify_umount_inodes(). So we could grab a reference to watch and do the rest as above, just with drop_super() instead of deactivate_super(), right? Wrong. We had to drop ih->mutex before we could grab ->s_umount. So the watch could've been gone already. That still can be dealt with - we need to save watch->wd, do idr_find() and compare its result with our pointer. If they match, we either have the damn thing still alive or we'd lost not one but two races at once, the watch had been killed and a new one got created with the same ->wd at the same address. That couldn't have happened in inotify_destroy(), but inotify_rm_wd() could run into that. Still, "new one got created" is not a problem - we have every right to kill it or leave it alone, whatever's more convenient. So we can use idr_find(...) == watch && watch->inode->i_sb == sb as "grab it and kill it" check. If it's been our original watch, we are fine, if it's a newcomer - nevermind, just pretend that we'd won the race and kill the fscker anyway; we are safe since we know that its superblock won't be going away. And yes, this is far beyond mere "not very pretty"; so's the entire concept of inotify to start with. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Greg KH <greg@kroah.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] get rid of nameidata in audit_treeAl Viro2008-10-231-24/+24
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* [PATCH] list_for_each_rcu must die: auditPaul E. McKenney2008-05-171-3/+2
| | | | | | | | | | All uses of list_for_each_rcu() can be profitably replaced by the easier-to-use list_for_each_entry_rcu(). This patch makes this change for the Audit system, in preparation for removing the list_for_each_rcu() API entirely. This time with well-formed SOB. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Introduce path_put()Jan Blunck2008-02-141-6/+6
| | | | | | | | | | | | | | | | | | | * Add path_put() functions for releasing a reference to the dentry and vfsmount of a struct path in the right order * Switch from path_release(nd) to path_put(&nd->path) * Rename dput_path() to path_put_conditional() [akpm@linux-foundation.org: fix cifs] Signed-off-by: Jan Blunck <jblunck@suse.de> Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Acked-by: Christoph Hellwig <hch@lst.de> Cc: <linux-fsdevel@vger.kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Steven French <sfrench@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Embed a struct path into struct nameidata instead of nd->{dentry,mnt}Jan Blunck2008-02-141-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the central patch of a cleanup series. In most cases there is no good reason why someone would want to use a dentry for itself. This series reflects that fact and embeds a struct path into nameidata. Together with the other patches of this series - it enforced the correct order of getting/releasing the reference count on <dentry,vfsmount> pairs - it prepares the VFS for stacking support since it is essential to have a struct path in every place where the stack can be traversed - it reduces the overall code size: without patch series: text data bss dec hex filename 5321639 858418 715768 6895825 6938d1 vmlinux with patch series: text data bss dec hex filename 5320026 858418 715768 6894212 693284 vmlinux This patch: Switch from nd->{dentry,mnt} to nd->path.{dentry,mnt} everywhere. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix cifs] [akpm@linux-foundation.org: fix smack] Signed-off-by: Jan Blunck <jblunck@suse.de> Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Acked-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] audit: watching subtreesAl Viro2007-10-211-0/+903
New kind of audit rule predicates: "object is visible in given subtree". The part that can be sanely implemented, that is. Limitations: * if you have hardlink from outside of tree, you'd better watch it too (or just watch the object itself, obviously) * if you mount something under a watched tree, tell audit that new chunk should be added to watched subtrees * if you umount something in a watched tree and it's still mounted elsewhere, you will get matches on events happening there. New command tells audit to recalculate the trees, trimming such sources of false positives. Note that it's _not_ about path - if something mounted in several places (multiple mount, bindings, different namespaces, etc.), the match does _not_ depend on which one we are using for access. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
OpenPOWER on IntegriCloud