op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message (Collapse)	Author	Age	Files	Lines
*	mm: mempolicy: turn vma_set_policy() into vma_dup_policy()	Oleg Nesterov	2013-09-11	4	-20/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Simple cleanup. Every user of vma_set_policy() does the same work, this looks a bit annoying imho. And the new trivial helper which does mpol_dup() + vma_set_policy() to simplify the callers. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	drivers/block/swim.c: remove unnecessary platform_set_drvdata()	Jingoo Han	2013-09-11	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \|	The driver core clears the driver data to NULL after device_release or on probe failure. Thus, it is not needed to manually clear the device driver data to NULL. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Jean Delvare <khali@linux-fr.org> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	cciss: set max scatter gather entries to 32 on P600	Mike Miller	2013-09-11	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	At one time we used to set the maximum number of scatter gather elements on all Smart Array controllers to 32. At some point in time the firmware began to write the "appropriate" value for each controller into the config table. The cciss driver would then read that and set h->maxsgentries. h->maxsgentries = readl(&(h->cfgtable->MaxSGElements); On the P600 that value is 544. Under some workloads a significant performance reduction may result. This patch forces the P600 to use only 32 scatter gather elements. Other controllers are not affected. Signed-off-by: Mike Miller <mike.miller@hp.com> Signed-off-by: Dwight (Bud) Brown <bubrown@redhat.com> Signed-off-by: Tomas Henzl <thenzl@redhat.com> Acked-by: Stephen M. Cameron <steve.cameron@hp.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	drivers/block/mg_disk.c: make mg_times_out() static	Jingoo Han	2013-09-11	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	mg_times_out() is used only in this file. Fix the following sparse warning: drivers/block/mg_disk.c:639:6: warning: symbol 'mg_times_out' was not declared. Should it be static? Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	block: support embedded device command line partition	Cai Zhiyong	2013-09-11	10	-0/+452
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Read block device partition table from command line. The partition used for fixed block device (eMMC) embedded device. It is no MBR, save storage space. Bootloader can be easily accessed by absolute address of data on the block device. Users can easily change the partition. This code reference MTD partition, source "drivers/mtd/cmdlinepart.c" About the partition verbose reference "Documentation/block/cmdline-partition.txt" [akpm@linux-foundation.org: fix printk text] [yongjun_wei@trendmicro.com.cn: fix error return code in parse_parts()] Signed-off-by: Cai Zhiyong <caizhiyong@huawei.com> Cc: Karel Zak <kzak@redhat.com> Cc: "Wanglin (Albert)" <albert.wanglin@huawei.com> Cc: Marius Groeger <mag@sysgo.de> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Brian Norris <computersforpeace@gmail.com> Cc: Artem Bityutskiy <dedekind@infradead.org> Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	block/blk-sysfs.c: replace strict_strtoul() with kstrtoul()	Jingoo Han	2013-09-11	1	-1/+1
\| \| \| \| \| \| \| \| \|	The usage of strict_strtoul() is not preferred, because strict_strtoul() is obsolete. Thus, kstrtoul() should be used. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	block: replace strict_strtoul() with kstrtoul()	Jingoo Han	2013-09-11	3	-3/+3
\| \| \| \| \| \| \| \| \|	The use of strict_strtoul() is not preferred, because strict_strtoul() is obsolete. Thus, kstrtoul() should be used. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	include/linux/sched.h: don't use task->pid/tgid in ↵	Oleg Nesterov	2013-09-11	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	same_thread_group/has_group_leader_pid task_struct->pid/tgid should go away. 1. Change same_thread_group() to use task->signal for comparison. 2. Change has_group_leader_pid(task) to compare task_pid(task) with signal->leader_pid. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Sergey Dyasly <dserrg@gmail.com> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	ocfs2: fix the end cluster offset of FIEMAP	Jie Liu	2013-09-11	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Call fiemap ioctl(2) with given start offset as well as an desired mapping range should show extents if possible. However, we somehow figure out the end offset of mapping via 'mapping_end -= cpos' before iterating the extent records which would cause problems if the given fiemap length is too small to a cluster size, e.g, Cluster size 4096: debugfs.ocfs2 1.6.3 Block Size Bits: 12 Cluster Size Bits: 12 The extended fiemap test utility From David: https://gist.github.com/anonymous/6172331 # dd if=/dev/urandom of=/ocfs2/test_file bs=1M count=1000 # ./fiemap /ocfs2/test_file 4096 10 start: 4096, length: 10 File /ocfs2/test_file has 0 extents: # Logical Physical Length Flags ^^^^^ <-- No extent is shown In this case, at ocfs2_fiemap(): cpos == mapping_end == 1. Hence the loop of searching extent records was not executed at all. This patch remove the in question 'mapping_end -= cpos', and loops until the cpos is larger than the mapping_end as usual. # ./fiemap /ocfs2/test_file 4096 10 start: 4096, length: 10 File /ocfs2/test_file has 1 extents: # Logical Physical Length Flags 0: 0000000000000000 0000000056a01000 0000000006a00000 0000 Signed-off-by: Jie Liu <jeff.liu@oracle.com> Reported-by: David Weber <wb@munzinger.de> Tested-by: David Weber <wb@munzinger.de> Cc: Sunil Mushran <sunil.mushran@gmail.com> Cc: Mark Fashen <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	ocfs2: remove unused variable ip in dlmfs_get_root_inode()	Joseph Qi	2013-09-11	1	-3/+0
\| \| \| \| \| \| \| \| \| \|	Variable ip in dlmfs_get_root_inode() is defined but not used. So clean it up. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Reviewed-by: Jie Liu <jeff.liu@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	ocfs2: fix a tiny race case when firing callbacks	Joyce	2013-09-11	1	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In o2hb_shutdown_slot() and o2hb_check_slot(), since event is defined as local, it is only valid during the call stack. So the following tiny race case may happen in a multi-volumes mounted environment: o2hb-vol1 o2hb-vol2 1) o2hb_shutdown_slot allocate local event1 2) queue_node_event add event1 to global o2hb_node_events 3) o2hb_shutdown_slot allocate local event2 4) queue_node_event add event2 to global o2hb_node_events 5) o2hb_run_event_list delete event1 from o2hb_node_events 6) o2hb_run_event_list event1 empty, return 7) o2hb_shutdown_slot event1 lifecycle ends 8) o2hb_fire_callbacks event1 is already invalid This patch lets it wait on o2hb_callback_sem when another thread is firing callbacks. And for performance consideration, we only call o2hb_run_event_list when there is an event queued. Signed-off-by: Joyce <xuejiufei@huawei.com> Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	ocfs2: avoid possible NULL pointer dereference in o2net_accept_one()	Joseph Qi	2013-09-11	1	-6/+10
\| \| \| \| \| \| \| \| \| \| \|	Since o2nm_get_node_by_num() may return NULL, we add this check in o2net_accept_one() to avoid possible NULL pointer dereference. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	ocfs2: adjust code style for o2net_handler_tree_lookup()	Joseph Qi	2013-09-11	1	-17/+17
\| \| \| \| \| \| \| \| \| \| \|	Code in o2net_handler_tree_lookup() may be corrupted by mistake. So adjust it to promote readability. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	ocfs2: free path in ocfs2_remove_inode_range()	Younger Liu	2013-09-11	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	In ocfs2_remove_inode_range(), there is a memory leak. The variable path has allocated memory with ocfs2_new_path_from_et(), but it is not free. Signed-off-by: Younger Liu <younger.liu@huawei.com> Reviewed-by: Jie Liu <jeff.liu@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	ocfs2: fix possible double free in ocfs2_reflink_xattr_rec	Joseph Qi	2013-09-11	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In ocfs2_reflink_xattr_rec(), meta_ac and data_ac are allocated by calling ocfs2_lock_reflink_xattr_rec_allocators(). Once an error occurs when allocating data_ac, it frees meta_ac which is allocated before. Here it mistakenly sets meta_ac to NULL but *meta_ac. Then ocfs2_reflink_xattr_rec() will try to free meta_ac again which is already invalid. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Reviewed-by: Jie Liu <jeff.liu@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	ocfs2/dlm: force clean refmap when doing local cleanup	Xue jiufei	2013-09-11	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	dlm_do_local_recovery_cleanup() should force clean refmap if the owner of lockres is UNKNOWN. Otherwise node may hang when umounting filesystems. Here's the situation: Node1 Node2 dlmlock() -> dlm_get_lock_resource() send DLM_MASTER_REQUEST_MSG to other nodes. trying to master this lockres, return MAYBE. selected as the master of lockresA, set mle->master to Node1, and do assert_master, send DLM_ASSERT_MASTER_MSG to Node2. Node 2 has interest on lockresA and return DLM_ASSERT_RESPONSE_MASTERY_REF then something happened and Node2 crashed. Receiving DLM_ASSERT_RESPONSE_MASTERY_REF, set Node2 into refmap, and keep sending DLM_ASSERT_MASTER_MSG to other nodes o2hb found node2 down, calling dlm_hb_node_down() --> dlm_do_local_recovery_cleanup() the master of lockresA is still UNKNOWN, no need to call dlm_free_dead_locks(). Set the master of lockresA to Node1, but Node2 stills remains in refmap. When Node1 umount, it found that the refmap of lockresA is not empty and attempted to migrate it to Node2, But Node2 is already down, so umount hang, trying to migrate lockresA again and again. Signed-off-by: joyce <xuejiufei@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jie Liu <jeff.liu@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	ocfs2: free meta_ac and data_ac when ocfs2_start_trans fails in ↵	Younger Liu	2013-09-11	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ocfs2_xattr_set() In ocfs2_xattr_set(), if ocfs2_start_trans failed, meta_ac and data_ac should be free. Otherwise, It would lead to a memory leak. Signed-off-by: Younger Liu <younger.liu@huawei.com> Cc: Joseph Qi <joseph.qi@huawei.com> Reviewed-by: Jie Liu <jeff.liu@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	ocfs2: add the missing return value check of ocfs2_xattr_get_clusters	Joseph Qi	2013-09-11	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	In ocfs2_xattr_value_attach_refcount(), if error occurs when calling ocfs2_xattr_get_clusters(), it will go with unexpected behavior since local variables p_cluster, num_clusters and ext_flags are declared without initialization. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Acked-by: Jie Liu <jeff.liu@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	ocfs2: fix a memory leak in __ocfs2_move_extents()	Jie Liu	2013-09-11	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The ocfs2 path is not properly freed which leads to a memory leak at __ocfs2_move_extents(). This patch stops the leaks of the ocfs2_path structure. Signed-off-by: Jie Liu <jeff.liu@oracle.com> Reviewed-by: Younger Liu <younger.liu@huawei.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	ocfs2: add missing return value check of ocfs2_get_clusters()	Joseph Qi	2013-09-11	1	-2/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	In ocfs2_attach_refcount_tree() and ocfs2_duplicate_extent_list(), if error occurs when calling ocfs2_get_clusters(), it will go with unexpected behavior as local variables p_cluster, num_clusters and ext_flags are declared without initialization. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Reviewed-by: Jie Liu <jeff.liu@oracle.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	ocfs2: clean up dead code in ocfs2_acl_from_xattr()	Joseph Qi	2013-09-11	1	-4/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	In ocfs2_acl_from_xattr(), if size is less than sizeof(struct posix_acl_entry), it returns ERR_PTR(-EINVAL) directly. Then assign (size / sizeof(struct posix_acl_entry)) to count which will be at least 1, that means the following branch (count < 0) and (count == 0) will never be true. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Acked-by: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	ocfs2: use list_for_each_entry() instead of list_for_each()	Dong Fang	2013-09-11	10	-99/+45
\| \| \| \| \| \| \| \| \| \| \|	[dan.carpenter@oracle.com: fix up some NULL dereference bugs] Signed-off-by: Dong Fang <yp.fangdong@gmail.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jeff Liu <jeff.liu@oracle.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	fs/ocfs2/cluster/tcp.c: fix possible null pointer dereferences	Sunil Mushran	2013-09-11	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix some possible null pointer dereferences that were detected by the static code analyser, smatch. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Reported-by: Dan Carpenter <error27@gmail.com> Reported-by: Guozhonghua <guozhonghua@h3c.com> Cc: Sunil Mushran <sunil.mushran@gmail.com> Cc: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	ocfs2: ac_bits_wanted should be local_alloc_bits when returns -ENOSPC	Younger Liu	2013-09-11	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is an issue in reserving and claiming space for localalloc, When localalloc space is not enough, it would claim space from global_bitmap. And if there is not enough free space in global_bitmap, the size of claiming space would set to half of orignal size and retry. The issue is as follows: osb->local_alloc_bits is set to half of orignal size in ocfs2_recalc_la_window(), but ac->ac_bits_wanted is set to osb->local_alloc_default_bits which is not changed. localalloc always reserves and claims local_alloc_default_bits space and returns ENOSPC. So, ac->ac_bits_wanted should be osb->local_alloc_bits which would be changed. Signed-off-by: Younger Liu <younger.liu@huawei.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Jeff Liu <jeff.liu@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	ocfs2: dlm_request_all_locks() should deal with the status sent from target node	Xue jiufei	2013-09-11	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	dlm_request_all_locks() should deal with the status sent from target node if DLM_LOCK_REQUEST_MSG is sent successfully, or recovery master will fall into endless loop, waiting for other nodes to send locks and DLM_RECO_DATA_DONE_MSG to me. NodeA NodeB selected as recovery master dlm_remaster_locks() ->dlm_request_all_locks() send DLM_LOCK_REQUEST_MSG to nodeA It happened that NodeA cannot alloc memory when it processes this message. dlm_request_all_locks_handler() do not queue dlm_request_all_locks_worker and returns -ENOMEM. It will never send locks and DLM_RECO_DATA_DONE_MSG to NodeB. NodeB do not deal with the status sent from nodeA, and will fall in endless loop waiting for the recovery state of NodeA to be changed. Signed-off-by: joyce <xuejiufei@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Jeff Liu <jeff.liu@oracle.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	ocfs2: use i_size_read() to access i_size	Junxiao Bi	2013-09-11	7	-21/+21
\| \| \| \| \| \| \| \| \| \| \| \| \|	Though ocfs2 uses inode->i_mutex to protect i_size, there are both i_size_read/write() and direct accesses. Clean up all direct access to eliminate confusion. Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Cc: Jie Liu <jeff.liu@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	ocfs2: lighten up allocate transaction	Younger Liu	2013-09-11	4	-5/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The issue scenario is as following: When fallocating a very large disk space for a small file, __ocfs2_extend_allocation attempts to get a very large transaction. For some journal sizes, there may be not enough room for this transaction, and the fallocate will fail. The patch below extends & restarts the transaction as necessary while allocating space, and should work with even the smallest journal. This patch refers ext4 resize. Test: # mkfs.ocfs2 -b 4K -C 32K -T datafiles /dev/sdc ...(jounral size is 32M) # mount.ocfs2 /dev/sdc /mnt/ocfs2/ # touch /mnt/ocfs2/1.log # fallocate -o 0 -l 400G /mnt/ocfs2/1.log fallocate: /mnt/ocfs2/1.log: fallocate failed: Cannot allocate memory # tail -f /var/log/messages [ 7372.278591] JBD: fallocate wants too many credits (2051 > 2048) [ 7372.278597] (fallocate,6438,0):__ocfs2_extend_allocation:709 ERROR: status = -12 [ 7372.278603] (fallocate,6438,0):ocfs2_allocate_unwritten_extents:1504 ERROR: status = -12 [ 7372.278607] (fallocate,6438,0):__ocfs2_change_file_space:1955 ERROR: status = -12 ^C With this patch, the test works well. Signed-off-by: Younger Liu <younger.liu@huawei.com> Cc: Jie Liu <jeff.liu@oracle.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	drivers/iommu: remove unnecessary platform_set_drvdata()	Jingoo Han	2013-09-11	2	-4/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The driver core clears the driver data to NULL after device_release or on probe failure. Thus, it is not needed to manually clear the device driver data to NULL. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: David Brown <davidb@codeaurora.org> Cc: Stephen Boyd <sboyd@codeaurora.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Suman Anna <s-anna@ti.com> Acked-by: Libo Chen <libo.chen@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	drivers/video/acornfb.c: remove dead code	Paul Bolle	2013-09-11	2	-294/+1
\| \| \| \| \| \| \| \| \| \| \| \|	acornfb checks for HAS_VIDC while support for that macro was removed in v2.6.23 (when the arm26 port was removed). So we can remove a bit of dead code. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	fork: unify and tighten up CLONE_NEWUSER/CLONE_NEWPID checks	Oleg Nesterov	2013-09-11	1	-15/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	do_fork() denies CLONE_THREAD \| CLONE_PARENT if NEWUSER \| NEWPID. Then later copy_process() denies CLONE_SIGHAND if the new process will be in a different pid namespace (task_active_pid_ns() doesn't match current->nsproxy->pid_ns). This looks confusing and inconsistent. CLONE_NEWPID is very similar to the case when ->pid_ns was already unshared, we want the same restrictions so copy_process() should also nack CLONE_PARENT. And it would be better to deny CLONE_NEWUSER && CLONE_SIGHAND as well just for consistency. Kill the "CLONE_NEWUSER \| CLONE_NEWPID" check in do_fork() and change copy_process() to do the same check along with ->pid_ns check we already have. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Andy Lutomirski <luto@amacapital.net> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Colin Walters <walters@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	pidns: kill the unnecessary CLONE_NEWPID in copy_process()	Oleg Nesterov	2013-09-11	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 8382fcac1b81 ("pidns: Outlaw thread creation after unshare(CLONE_NEWPID)") nacks CLONE_NEWPID if the forking process unshared pid_ns. This is correct but unnecessary, copy_pid_ns() does the same check. Remove the CLONE_NEWPID check to cleanup the code and prepare for the next change. Test-case: static int child(void arg) { return 0; } static char stack[16 1024]; int main(void) { pid_t pid; assert(unshare(CLONE_NEWUSER \| CLONE_NEWPID) == 0); pid = clone(child, stack + sizeof(stack) / 2, CLONE_NEWPID \| SIGCHLD, NULL); assert(pid < 0 && errno == EINVAL); return 0; } clone(CLONE_NEWPID) correctly fails with or without this change. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Andy Lutomirski <luto@amacapital.net> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Colin Walters <walters@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	pidns: fix vfork() after unshare(CLONE_NEWPID)	Oleg Nesterov	2013-09-11	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 8382fcac1b81 ("pidns: Outlaw thread creation after unshare(CLONE_NEWPID)") nacks CLONE_VM if the forking process unshared pid_ns, this obviously breaks vfork: int main(void) { assert(unshare(CLONE_NEWUSER \| CLONE_NEWPID) == 0); assert(vfork() >= 0); _exit(0); return 0; } fails without this patch. Change this check to use CLONE_SIGHAND instead. This also forbids CLONE_THREAD automatically, and this is what the comment implies. We could probably even drop CLONE_SIGHAND and use CLONE_THREAD, but it would be safer to not do this. The current check denies CLONE_SIGHAND implicitely and there is no reason to change this. Eric said "CLONE_SIGHAND is fine. CLONE_THREAD would be even better. Having shared signal handling between two different pid namespaces is the case that we are fundamentally guarding against." Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Colin Walters <walters@redhat.com> Acked-by: Andy Lutomirski <luto@amacapital.net> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	include/linux/smp.h:on_each_cpu(): switch back to a C function	Andrew Morton	2013-09-11	1	-8/+12
\| \| \| \| \| \| \| \| \| \| \|	Revert commit c846ef7deba2 ("include/linux/smp.h:on_each_cpu(): switch back to a macro"). It turns out that the problematic linux/irqflags.h include was fixed within ia64 and mn10300. Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: David Daney <david.daney@cavium.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	Merge branch 'for-linus' of ↵	Linus Torvalds	2013-09-11	26	-135/+130
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Heiko Carstens: "This includes one bpf/jit bug fix where the jit compiler could sometimes write generated code out of bounds of the allocated memory area. The rest of the patches are only cleanups and minor improvements" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/irq: reduce size of external interrupt handler hash array s390/compat,uid16: use current_cred() s390/ap_bus: use and-mask instead of a cast s390/ftrace: avoid pointer arithmetics with function pointers s390: make various functions static, add declarations to header files s390/compat signal: add couple of __force annotations s390/mm: add __releases()/__acquires() annotations to gmap_alloc_table() s390: keep Kconfig sorted s390/irq: rework irq subclass handling s390/irq: use hlists for external interrupt handler array s390/dumpstack: convert print_symbol to %pSR s390/perf: Remove print_hex_dump_bytes() debug output s390: update defconfig s390/bpf,jit: fix address randomization
\| *	s390/irq: reduce size of external interrupt handler hash array	Heiko Carstens	2013-09-09	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Change the hash algorithm a bit so it produces only values in the range of 0..31. This allows to reduce the size of the external interrupt handler hash array even further while making sure that each of the known interrupt sources keeps its unique hash with the slightly modified algorithm: 0x1004 --> 12 0x1201 --> 10 0x1202 --> 11 0x1406 --> 16 0x1407 --> 17 0x2401 --> 19 0x2603 --> 22 0x4000 --> 0 This also means that the entire array now fits into exactly one cache line; so add a proper align statement as well. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
\| *	s390/compat,uid16: use current_cred()	Heiko Carstens	2013-09-07	1	-4/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	86a264ab "CRED: Wrap current->cred and a few other accessors" converted all uses of current->cred into current_cred() but left s390 alone. So let's convert s390 finally as well, only five years later. This way we also get rid of a sparse warning which complains about a possible invalid rcu dereference which however is a false positive. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
\| *	s390/ap_bus: use and-mask instead of a cast	Heiko Carstens	2013-09-07	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Let's get rid of another sparse false positive: drivers/s390/crypto/ap_bus.c:416:64: warning: cast truncates bits from constant value (102030405060708 becomes 5060708) So instead of using a cast let's use an and-mask. That way sparse remains silent and one doesn't always have to check if this is a valid warning/bug or just a false positive. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
\| *	s390/ftrace: avoid pointer arithmetics with function pointers	Heiko Carstens	2013-09-07	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pointer arithmetics with function pointers is not really defined, but seems to do the right thing. Let's cast to a void pointer to have a defined behaviour, at least when using gcc. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
\| *	s390: make various functions static, add declarations to header files	Heiko Carstens	2013-09-07	10	-22/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make various functions static, add declarations to header files to fix a couple of sparse findings. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
\| *	s390/compat signal: add couple of __force annotations	Heiko Carstens	2013-09-07	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add __force annotations to get rid of a couple of sparse warnings: arch/s390/kernel/compat_signal.c:335:35: warning: cast removes address space of expression Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
\| *	s390/mm: add __releases()/__acquires() annotations to gmap_alloc_table()	Heiko Carstens	2013-09-07	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Let sparse not incorrectly complain about unbalanced locking. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
\| *	s390: keep Kconfig sorted	Heiko Carstens	2013-09-07	1	-3/+3
\| \| \| \| \| \| \| \|	Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
\| *	s390/irq: rework irq subclass handling	Heiko Carstens	2013-09-04	9	-57/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Let's not add a function for every external interrupt subclass for which we need reference counting. Just have two register/unregister functions which have a subclass parameter: void irq_subclass_register(enum irq_subclass subclass); void irq_subclass_unregister(enum irq_subclass subclass); Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
\| *	s390/irq: use hlists for external interrupt handler array	Heiko Carstens	2013-09-04	1	-12/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use hlists for the hashed array of external interrupt handlers. Reduces the size of the array by 50% (2KB). Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
\| *	s390/dumpstack: convert print_symbol to %pSR	Heiko Carstens	2013-09-04	1	-10/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is the same as what other architectures did. The change has also the advantage that there won't be any interleaving messages between printk() and print_symbol(). Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
\| *	s390/perf: Remove print_hex_dump_bytes() debug output	Hendrik Brueckner	2013-09-04	1	-4/+1
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
\| *	s390: update defconfig	Heiko Carstens	2013-09-04	1	-12/+27
\| \| \| \| \| \| \| \|	Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
\| *	s390/bpf,jit: fix address randomization	Heiko Carstens	2013-09-04	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add misssing braces to hole calculation. This resulted in an addition instead of an substraction. Which in turn means that the jit compiler could try to write out of bounds of the allocated piece of memory. This bug was introduced with aa2d2c73 "s390/bpf,jit: address randomize and write protect jit code". Fixes this one: [ 37.320956] Unable to handle kernel pointer dereference at virtual kernel address 000003ff80231000 [ 37.320984] Oops: 0011 [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 37.320993] Modules linked in: dm_multipath scsi_dh eadm_sch dm_mod ctcm fsm autofs4 [ 37.321007] CPU: 28 PID: 6443 Comm: multipathd Not tainted 3.10.9-61.x.20130829-s390xdefault #1 [ 37.321011] task: 0000004ada778000 ti: 0000004ae3304000 task.ti: 0000004ae3304000 [ 37.321014] Krnl PSW : 0704c00180000000 000000000012d1de (bpf_jit_compile+0x198e/0x23d0) [ 37.321022] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 EA:3 Krnl GPRS: 000000004350207d 0000004a00000001 0000000000000007 000003ff80231002 [ 37.321029] 0000000000000007 000003ff80230ffe 00000000a7740000 000003ff80230f76 [ 37.321032] 000003ffffffffff 000003ff00000000 000003ff0000007d 000000000071e820 [ 37.321035] 0000004adbe99950 000000000071ea18 0000004af3d9e7c0 0000004ae3307b80 [ 37.321046] Krnl Code: 000000000012d1d0: 41305004 la %r3,4(%r5) 000000000012d1d4: e330f0f80021 clg %r3,248(%r15) #000000000012d1da: a7240009 brc 2,12d1ec >000000000012d1de: 50805000 st %r8,0(%r5) 000000000012d1e2: e330f0f00004 lg %r3,240(%r15) 000000000012d1e8: 41303004 la %r3,4(%r3) 000000000012d1ec: e380f0e00004 lg %r8,224(%r15) 000000000012d1f2: e330f0f00024 stg %r3,240(%r15) [ 37.321074] Call Trace: [ 37.321077] ([<000000000012da78>] bpf_jit_compile+0x2228/0x23d0) [ 37.321083] [<00000000006007c2>] sk_attach_filter+0xfe/0x214 [ 37.321090] [<00000000005d2d92>] sock_setsockopt+0x926/0xbdc [ 37.321097] [<00000000005cbfb6>] SyS_setsockopt+0x8a/0xe8 [ 37.321101] [<00000000005ccaa8>] SyS_socketcall+0x264/0x364 [ 37.321106] [<0000000000713f1c>] sysc_nr_ok+0x22/0x28 [ 37.321113] [<000003fffce10ea8>] 0x3fffce10ea8 [ 37.321118] INFO: lockdep is turned off. [ 37.321121] Last Breaking-Event-Address: [ 37.321124] [<000000000012d192>] bpf_jit_compile+0x1942/0x23d0 [ 37.321132] [ 37.321135] Kernel panic - not syncing: Fatal exception: panic_on_oops Cc: stable@vger.kernel.org # v3.11 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
* \|	Merge branch 'kconfig' of ↵	Linus Torvalds	2013-09-11	12	-345/+413
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull kconfig updates from Michal Marek: "This is the kconfig part of kbuild for v3.12-rc1: - post-3.11 search code fixes and micro-optimizations - CONFIG_MODULES is no longer a special case; this is needed to eventually fix the bug that using KCONFIG_ALLCONFIG breaks allmodconfig - long long is used to store hex and int values - make silentoldconfig no longer warns when a symbol changes from tristate to bool (it's a job for make oldconfig) - scripts/diffconfig updated to work with newer Pythons - scripts/config does not rely on GNU sed extensions" * 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: kconfig: do not allow more than one symbol to have 'option modules' kconfig: regenerate bison parser kconfig: do not special-case 'MODULES' symbol diffconfig: Update script to support python versions 2.5 through 3.3 diffconfig: Gracefully exit if the default config files are not present modules: do not depend on kconfig to set 'modules' option to symbol MODULES kconfig: silence warning when parsing auto.conf when a symbol has changed type scripts/config: use sed's POSIX interface kconfig: switch to "long long" for sanity kconfig: simplify symbol-search code kconfig: don't allocate n+1 elements in temporary array kconfig: minor style fixes in symbol-search code kconfig/[mn]conf: shorten title in search-box kconfig: avoid multiple calls to strlen Documentation/kconfig: more concise and straightforward search explanation
\| * \|	kconfig: do not allow more than one symbol to have 'option modules'	Yann E. MORIN	2013-09-05	2	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, it was possible to have more than one symbol with the 'option modules' attached to them, although only the last one would in fact control tristates. Since this does not make much sense, only allow at most one symbol to control tristates. Note: it is still possible to have more than one symbol that control tristates, but indirectly: config MOD1 bool "mod1" select MODULES config MOD2 bool "mod2" select MODULES config MODULES bool option modules Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Michal Marek <mmarek@suse.cz>