op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message	Author	Age	Files	Lines
*	atomic: use <linux/atomic.h>	Arun Sharma	2011-07-26	1	-1/+1
	This allows us to move duplicated code in <asm/atomic.h> (atomic_inc_not_zero() for now) to <linux/atomic.h>. Signed-off-by: Arun Sharma <asharma@fb.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	fs: dcache remove dcache_lock	Nick Piggin	2011-01-07	1	-4/+7
	dcache_lock no longer protects anything. Remove it. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
*	fanotify: on group destroy allow all waiters to bypass permission check	Lino Sanfilippo	2010-12-07	1	-1/+1
	When fanotify_release() is called, there may still be processes waiting for access permission. Currently only processes for which an event has already been queued into the group's access list will be woken up. Processes for which no event has been queued will continue to sleep and thus cause a deadlock when fsnotify_put_group() is called. Furthermore there is a race allowing further processes to be waiting on the access wait queue after wake_up (if they arrive before clear_marks_by_group() is called). This patch corrects this by setting a flag to inform processes that the group is about to be destroyed and thus not to wait for access permission.
	[additional changelog from eparis]
	Let's think about the 4 relevant code paths from the PoV of the 'operator', 'listener', 'responder' and 'closer'. The operator is the process doing an action (like open/read) which could require permission. The listener is the task (or in this case thread) tasked with reading from the fanotify file descriptor. The responder is the thread responsible for responding to access requests. The closer is the thread attempting to close the fanotify file descriptor.
	The 'operator' is going to end up in:
	    fanotify_handle_event()
	      get_response_from_access()
	        (THIS BLOCKS WAITING ON USERSPACE)
	The 'listener' interesting code path:
	    fanotify_read()
	      copy_event_to_user()
	        prepare_for_access_response()
	          (THIS CREATES AN fanotify_response_event)
	The 'responder' code path:
	    fanotify_write()
	      process_access_response()
	        (REMOVE A fanotify_response_event, SET RESPONSE, WAKE UP 'operator')
	The 'closer':
	    fanotify_release()
	      (SUPPOSED TO CLEAN UP THE REST OF THIS MESS)
	What we have today is that in the closer we remove all of the fanotify_response_events and set a bit so no more response events are ever created in prepare_for_access_response(). The bug is that we never wake all of the operators up and tell them to move along. You fix that in fanotify_get_response_from_access(). You also fix other operators which haven't gotten there yet. So I agree that's a good fix.
	[/additional changelog from eparis]
	[remove additional changes to minimize patch size]
	[move initialization so it was inside CONFIG_FANOTIFY_PERMISSION]
	Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: remove alignment padding from fsnotify_mark on 64 bit builds	Richard Kennedy	2010-10-28	1	-1/+1
	Reorder struct fsnotify_mark to remove 8 bytes of alignment padding on 64 bit builds. This shrinks fsnotify_mark to 128 bytes, which allows more objects per slab in its kmem_cache and reduces the number of cachelines needed for each structure. Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Eric Paris <eparis@redhat.com>
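	The padding effect this commit describes can be reproduced with a small sketch. The two structs below are hypothetical and are not the real fsnotify_mark layout (the log does not show it); they only illustrate how placing two 4-byte members next to each other removes alignment padding on a typical 64 bit build.

```c
#include <stddef.h>

/* Hypothetical layout: two 4-byte members separated by a pointer. */
struct ex_mark_padded {
	unsigned int mask;	/* 4 bytes, followed by 4 bytes of padding on LP64 */
	void *group;		/* pointer, 8-byte aligned on LP64 */
	unsigned int flags;	/* 4 bytes, followed by 4 bytes of tail padding */
};

/* Same members, reordered so the two 4-byte fields share one 8-byte slot. */
struct ex_mark_packed {
	void *group;
	unsigned int mask;
	unsigned int flags;
};

/* Bytes saved by the reordering: 8 on typical 64 bit ABIs, 0 on typical 32 bit ones. */
size_t ex_padding_saved(void)
{
	return sizeof(struct ex_mark_padded) - sizeof(struct ex_mark_packed);
}
```

	On a real kernel tree, a tool such as pahole reports the same kind of holes for the actual structure.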
*	fsnotify: rename FS_IN_ISDIR to FS_ISDIR	Eric Paris	2010-10-28	1	-2/+2
	The _IN_ in the naming is reserved for flags only used by inotify. Since I am about to use this flag for fanotify, rename it to be generic like the rest. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fanotify: limit number of listeners per user	Eric Paris	2010-10-28	1	-0/+1
	fanotify currently has no limit on the number of listeners a given user can have open. This patch limits the total number of listeners per user to 128. This is the same as the inotify default limit. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fanotify: limit the number of marks in a single fanotify group	Eric Paris	2010-10-28	1	-0/+1
	There is currently no limit on the number of marks a given fanotify group can have. Since fanotify is gated on CAP_SYS_ADMIN this was not seen as a serious DoS threat. This patch implements a default of 8192, the same as inotify, to work towards removing the CAP_SYS_ADMIN gating and eliminating the default DoS'able status. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: call fsnotify_parent in perm events	Eric Paris	2010-10-28	1	-3/+5
	fsnotify perm events do not call fsnotify parent. That means you cannot register a perm event on a directory and enforce permissions on all inodes in that directory. This patch fixes that situation. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: correctly handle return codes from listeners	Eric Paris	2010-10-28	1	-0/+2
	When fsnotify groups return errors they are ignored. For permissions events these should be passed back up the stack, but for most events these should continue to be ignored. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: implement ordering between notifiers	Eric Paris	2010-10-28	1	-0/+8
	fanotify needs to be able to specify that some groups get events before others. They use this idea to make sure that a hierarchical storage manager gets access to files before programs which actually use them. This is purely infrastructure. Everything will have a priority of 0, but the infrastructure will exist for it to be non-zero. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fanotify: flush outstanding perm requests on group destroy	Eric Paris	2010-08-22	1	-0/+1
	When an fanotify listener is closing it may cause a deadlock between the listener and the original task doing an fs operation. If the original task is waiting for a permissions response it will be holding the srcu lock. The listener cannot clean up and exit until after that srcu lock is synchronized. Thus deadlock. The fix introduced here is to stop accepting new permissions events when a listener is shutting down and to grant permission for all outstanding events. Thus the original task will eventually release the srcu lock and the listener can complete shutdown. Reported-by: Andreas Gruenbacher <agruen@suse.de> Cc: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: Eric Paris <eparis@redhat.com>
*	Revert "fsnotify: store struct file not struct path"	Linus Torvalds	2010-08-12	1	-8/+8
	This reverts commit 3bcf3860a4ff9bbc522820b4b765e65e4deceb3e (and the accompanying commit c1e5c954020e "vfs/fsnotify: fsnotify_close can delay the final work in fput" that was a horribly ugly hack to make it work at all). The 'struct file' approach not only causes that disgusting hack, it somehow breaks pulseaudio, probably due to some other subtlety with f_count handling. Fix up various conflicts due to later fsnotify work. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	fanotify: use both marks when possible	Eric Paris	2010-07-28	1	-1/+1
	Currently, when given a vfsmount_mark, fanotify will look up the corresponding inode mark (if it exists). This patch drops that lookup and uses the mark provided. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: pass both the vfsmount mark and inode mark	Eric Paris	2010-07-28	1	-2/+5
	should_send_event() and handle_event() will both need to look up the inode event if they get a vfsmount event. Let's just pass both at the same time since we have them both after walking the lists in lockstep. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: remove global fsnotify groups lists	Eric Paris	2010-07-28	1	-15/+0
	The global fsnotify groups lists were invented as a way to increase the performance of fsnotify by shortcutting events which were not interesting. With the changes to walk the object lists rather than global groups lists these shortcuts are not useful. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: remove group->mask	Eric Paris	2010-07-28	1	-11/+0
	group->mask is now useless. It was originally a shortcut for fsnotify to save on performance. These checks are now redundant, so we remove them. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: remove the global masks	Eric Paris	2010-07-28	1	-2/+0
	Because we walk the object->fsnotify_marks list instead of the global fsnotify groups list we don't need the fsnotify_inode_mask and fsnotify_vfsmount_mask, as these were simply shortcuts in fsnotify() for performance. They are now extra checks; rip them out. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: send fsnotify_mark to groups in event handling functions	Eric Paris	2010-07-28	1	-3/+4
	With the change of fsnotify to use srcu walking the marks list instead of walking the global groups list we now know the mark in question. The code can send the mark to the group's handling functions and the groups won't have to find those marks themselves. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: srcu to protect read side of inode and vfsmount locks	Eric Paris	2010-07-28	1	-0/+1
	Currently, the inode->i_fsnotify_marks and vfsmount->mnt_fsnotify_marks lists are protected by a spinlock on both the read and the write side. This patch protects the read side of those lists with a new single srcu. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: use an explicit flag to indicate fsnotify_destroy_mark has been called	Eric Paris	2010-07-28	1	-0/+1
	Currently fsnotify checks whether mark->group is NULL to decide if fsnotify_destroy_mark() has already been called or not. With the upcoming rcu work it is a heck of a lot easier to use an explicit flag than worry about group being set to NULL. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: store struct file not struct path	Eric Paris	2010-07-28	1	-8/+8
	Al explains that calling dentry_open() with a mnt/dentry pair is only guaranteed to be safe if they are already used in an open struct file. To make sure this is the case don't store and use a struct path in fsnotify, always use a struct file. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: fsnotify_add_notify_event should return an event	Eric Paris	2010-07-28	1	-7/+5
	Rather than the horrific void ** argument and such just to pass the fanotify_merge event back to the caller of fsnotify_add_notify_event(), have those things return an event if it was different than the event suggested to be added. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fanotify: groups can specify their f_flags for new fd	Eric Paris	2010-07-28	1	-2/+5
	Currently fanotify fds opened for their listeners are done with f_flags equal to O_RDONLY | O_LARGEFILE. This patch instead takes f_flags from the fanotify_init syscall and uses those when opening files in the context of the listener. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: check to make sure all fsnotify bits are unique	Eric Paris	2010-07-28	1	-0/+9
	This patch adds a check to make sure that all fsnotify bits are unique and we cannot accidentally use the same bit for 2 different fsnotify event types. Signed-off-by: Eric Paris <eparis@redhat.com>
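	A uniqueness check of this kind can be sketched as follows. This is not the check the patch adds (the log does not show it); it is a minimal runtime version, and the EX_* bit values and function name are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event bits for illustration; not the kernel's FS_* values. */
#define EX_ACCESS 0x00000001u
#define EX_MODIFY 0x00000002u
#define EX_OPEN   0x00000004u

/* Return 1 if every entry is exactly one bit and no bit is used twice, else 0. */
int ex_event_bits_are_unique(const uint32_t *bits, size_t n)
{
	uint32_t seen = 0;

	for (size_t i = 0; i < n; i++) {
		uint32_t b = bits[i];

		if (b == 0 || (b & (b - 1)) != 0)
			return 0;	/* not exactly one bit set */
		if (seen & b)
			return 0;	/* bit already claimed by an earlier entry */
		seen |= b;
	}
	return 1;
}

/* Sample tables: the second one reuses EX_MODIFY by mistake. */
const uint32_t ex_good_bits[] = { EX_ACCESS, EX_MODIFY, EX_OPEN };
const uint32_t ex_bad_bits[] = { EX_ACCESS, EX_MODIFY, EX_MODIFY };
```

	A build-time variant could compare the population count of the OR of all flags against the number of flags, so that a clash fails compilation instead of a runtime test.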
*	inotify: allow users to request not to recieve events on unlinked children	Eric Paris	2010-07-28	1	-0/+1
	An inotify watch on a directory will send events for children even if those children have been unlinked. This patch adds a new inotify flag IN_EXCL_UNLINK which allows a watch to specify they don't care about unlinked children. This should fix performance problems seen by tasks which add a watch to /tmp and then are overrun with events when other processes are reading and writing to unlinked files they created in /tmp. https://bugzilla.kernel.org/show_bug.cgi?id=16296 Requested-by: Matthias Clasen <mclasen@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
*	fanotify: drop the useless priority argument	Eric Paris	2010-07-28	1	-1/+0
	The priority argument in fanotify is useless. Kill it. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fanotify: permissions and blocking	Eric Paris	2010-07-28	1	-0/+12
	This is the backend work needed for fanotify to support the new FS_OPEN_PERM and FS_ACCESS_PERM fsnotify events. This is done using the new fsnotify secondary queue. No userspace interface is provided to actually respond to or request these events. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: new fsnotify hooks and events types for access decisions	Eric Paris	2010-07-28	1	-5/+10
	Introduce a new fsnotify hook, fsnotify_perm(), which is called from the security code. This hook is used to allow fsnotify groups to make access control decisions about events on the system. We also must change the generic fsnotify function to return an error code if we intend these hooks to be in any way useful. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: use unsigned char * for dentry->d_name.name	Eric Paris	2010-07-28	1	-4/+5
	fsnotify was using char * when it passed around the d_name.name string internally but it is actually an unsigned char *. This patch switches fsnotify to use unsigned and should silence some pointer signedness warnings which have popped out of xfs. I do not add -Wpointer-sign to the fsnotify code as there are still issues with kstrdup and strlen which would pop out needless warnings. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: intoduce a notification merge argument	Eric Paris	2010-07-28	1	-1/+4
	Each group can define their own notification (and secondary_q) merge function. Inotify does tail drop, fanotify does matching and drop which can actually allocate a completely new event. But for fanotify to properly deal with permissions events it needs to know the new event which was ultimately added to the notification queue. This patch just implements a void ** argument which is passed to the merge function. fanotify can use this field to pass the new event back to higher layers. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: add group priorities	Eric Paris	2010-07-28	1	-0/+1
	This introduces an ordering to fsnotify groups. With purely asynchronous notification based "things" implementing fsnotify (inotify, dnotify) ordering isn't particularly important. But if people want to use fsnotify for the basis of synchronous notification or blocking notification, ordering becomes important. E.g., a Hierarchical Storage Management listener would need to get its event before an AV scanner could get its event (since the HSM would need to bring the data in for the AV scanner to scan). Typically asynchronous notification would want to run after the AV scanner made any relevant access decisions so as to not send notification about an event that was denied. Signed-off-by: Eric Paris <eparis@redhat.com>
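	The ordering idea in this commit can be sketched in userspace C. The struct, the function names, and the rule that a larger priority value is notified first are all assumptions made for illustration; the log does not show how the kernel stores or compares priorities.

```c
#include <stdlib.h>

/* Hypothetical group record; not the kernel's struct fsnotify_group. */
struct ex_group {
	const char *name;
	unsigned int priority;	/* assumption: larger value is notified first */
};

/* qsort comparator: descending priority, without overflow-prone subtraction. */
static int ex_cmp_priority(const void *a, const void *b)
{
	const struct ex_group *ga = a;
	const struct ex_group *gb = b;

	return (ga->priority < gb->priority) - (ga->priority > gb->priority);
}

/* Arrange groups in the order in which they would be offered an event. */
void ex_order_groups(struct ex_group *groups, size_t n)
{
	qsort(groups, n, sizeof(groups[0]), ex_cmp_priority);
}
```

	With an HSM listener at priority 2, an AV scanner at 1, and an asynchronous listener at 0, the HSM sorts first and the asynchronous listener last, matching the order the commit message asks for.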
*	fanotify: clear all fanotify marks	Eric Paris	2010-07-28	1	-0/+6
	fanotify listeners may want to clear all marks. They may want to do this to destroy all of their inode marks which have nothing but ignores. Realistically this is useful for AV vendors who update policy and want to clear all of their cached allows. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: allow ignored_mask to survive modification	Eric Paris	2010-07-28	1	-0/+1
	For some inodes a group may want to never hear about a set of events, even if the inode is modified. We add a new mark flag which indicates that these marks should not have their ignored_mask cleared on modification. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: ignored_mask - excluding notification	Eric Paris	2010-07-28	1	-0/+3
	The ignored_mask is a new mask which is part of fsnotify marks. A group's should_send_event() function can use the ignored mask to determine that certain events are not of interest. In particular if a group registers a mask including FS_OPEN on a vfsmount they could add FS_OPEN to the ignored_mask for individual inodes and not send open events for those inodes. Signed-off-by: Eric Paris <eparis@redhat.com>
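	The FS_OPEN scenario in this commit can be modelled with a few lines of C. This is a sketch of the masking logic only, not the kernel's should_send_event() implementation; the function name, its parameters, and the EX_FS_* bit values are assumptions.

```c
#include <stdint.h>

/* Illustrative event bits; assumptions for this sketch, not kernel values. */
#define EX_FS_MODIFY 0x1u
#define EX_FS_OPEN   0x2u

/*
 * Decide whether to deliver an event: some mark must ask for it
 * (mount_mask or inode_mask), and the inode's ignored mask must not cover it.
 */
int ex_should_send_event(uint32_t event, uint32_t mount_mask,
			 uint32_t inode_mask, uint32_t inode_ignored)
{
	uint32_t wanted = (mount_mask | inode_mask) & ~inode_ignored;

	return (event & wanted) != 0;
}
```

	A group that watches EX_FS_OPEN on a mount and sets EX_FS_OPEN in one inode's ignored mask gets open events for every other inode on that mount but not for that one.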
*	fsnotify: allow marks to not pin inodes in core	Eric Paris	2010-07-28	1	-2/+5
	inotify marks must pin inodes in core. dnotify doesn't technically need to since they are closed when the directory is closed. fanotify also needs to pin inodes in core as it works today. But the next step is to introduce the concept of 'ignored masks' which is actually a mask of events for an inode of no interest. I claim that these should be liberally sent to the kernel and should not pin the inode in core. If the inode is brought back in, the listener will get an event it may have thought excluded, but this is not a serious situation and one any listener should deal with. This patch lays the groundwork for non-pinning inode marks by using lazy inode pinning. We do not pin a mark until it has a non-zero mask entry. If a listener never sets a mask we never pin the inode. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fanotify: should_send_event needs to handle vfsmounts	Eric Paris	2010-07-28	1	-0/+2
	Currently should_send_event in fanotify only cares about marks on inodes. This patch extends that interface to indicate that it cares about events that happened on vfsmounts. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: Infrastructure for per-mount watches	Andreas Gruenbacher	2010-07-28	1	-0/+4
	Per-mount watches allow groups to listen to fsnotify events on an entire mount. This patch simply adds and initializes the fields needed in the vfsmount struct to make this happen. Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: vfsmount marks generic functions	Eric Paris	2010-07-28	1	-0/+2
	Much like inode-mark.c has all of the code dealing with marks on inodes, this patch adds a vfsmount-mark.c which has similar code but is intended for marks on vfsmounts. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: split generic and inode specific mark code	Eric Paris	2010-07-28	1	-2/+3
	Currently all marking is done by functions in inode-mark.c. Some of this is pretty generic and should instead be done in a generic function, and we should only put the inode specific code in inode-mark.c. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fanotify: Add pids to events	Andreas Gruenbacher	2010-07-28	1	-0/+1
	Pass the process identifiers of the triggering processes to fanotify listeners: this information is useful for event filtering and logging. Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: rename mark_entry to just mark	Eric Paris	2010-07-28	1	-13/+13
	Previously I used mark_entry when talking about marks on inodes. The _entry is pretty useless. Just use "mark" instead. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: rename fsnotify_find_mark_entry to fsnotify_find_mark	Eric Paris	2010-07-28	1	-2/+2
	The _entry portion of fsnotify functions is useless. Drop it. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: rename fsnotify_mark_entry to just fsnotify_mark	Eric Paris	2010-07-28	1	-19/+19
	The name is long and it serves no real purpose. So rename fsnotify_mark_entry to just fsnotify_mark. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: kill FSNOTIFY_EVENT_FILE	Andreas Gruenbacher	2010-07-28	1	-3/+2
	Some fsnotify operations send a struct file. This is more information than we technically need. We now send a struct path in all cases instead of sometimes a path and sometimes a file. Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: add flags to fsnotify_mark_entries	Eric Paris	2010-07-28	1	-0/+3
	To differentiate between inode and vfsmount (or other future) types of marks we add a flags field and set the inode bit on inode marks (the only currently supported type of mark). Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: add vfsmount specific fields to the fsnotify_mark_entry union	Eric Paris	2010-07-28	1	-0/+10
	vfsmount marks need mostly the same data as inode specific fields, but for consistency and understandability we put that data in a vfsmount specific struct inside a union with inode specific data. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: put inode specific fields in an fsnotify_mark in a union	Eric Paris	2010-07-28	1	-4/+13
	The addition of marks on vfs mounts will be simplified if the inode specific parts of a mark and the vfsmnt specific parts of a mark are actually in a union so naming can be easy. This patch just implements the inode struct and the union. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: include vfsmount in should_send_event when appropriate	Eric Paris	2010-07-28	1	-1/+2
	To ensure that a group does not duplicate events when it receives them based on both the vfsmount and the inode should_send_event tests, we should distinguish those two cases. We pass a vfsmount to this function so groups can make their own determinations. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: mount point listeners list and global mask	Eric Paris	2010-07-28	1	-0/+5
	Currently all of the notification systems implemented select which inodes they care about and receive messages only about those inodes (or the children of those inodes). This patch begins to flesh out fsnotify support for the concept of listeners that want to hear notification for an inode accessed below a given mount point. This patch implements a second list of fsnotify groups to hold these types of groups and a second global mask to hold the events of interest for this type of group. The reason we want a second group list and mask is the inode based notification should_send_event support, which makes each group look for a mark on the given inode. With one vfsmount listener that means that every group would have to take the inode->i_lock, look for their mark, not find one, and return for every operation. By separating vfsmount from inode listeners, only when there is an inode listener will the inode groups have to look for their mark and take the inode lock. vfsmount listeners will have to grab the lock and look for a mark but there should be fewer of them, and one vfsmount listener won't cause the i_lock to be grabbed and released for every fsnotify group on every io operation. Signed-off-by: Eric Paris <eparis@redhat.com>
*	fsnotify: rename fsnotify_groups to fsnotify_inode_groups	Eric Paris	2010-07-28	1	-3/+3
	Simple renaming patch. fsnotify is about to support mount point listeners so I am renaming fsnotify_groups and fsnotify_mask to indicate these are lists used only for groups which have watches on inodes. Signed-off-by: Eric Paris <eparis@redhat.com>