summaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'linus' into cpus4096Ingo Molnar2008-12-183-4/+11
|\
| * Merge branch 'upstream-linus' of ↵Linus Torvalds2008-12-172-3/+9
| |\ | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: ocfs2: Add JBD2 compat feature bit. ocfs2: Always update xattr search when creating bucket.
| | * ocfs2: Add JBD2 compat feature bit.Joel Becker2008-12-161-1/+7
| | | | | | | | | | | | | | | | | | | | | Define the OCFS2_FEATURE_COMPAT_JBD2 bit in the filesystem header. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
| | * ocfs2: Always update xattr search when creating bucket.Tao Ma2008-12-161-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we create xattr bucket during the process of xattr set, we always need to update the ocfs2_xattr_search since even if the bucket size is the same as block size, the offset will change because of the removal of the ocfs2_xattr_block header. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
| * | cifs: fix buffer overrun in parse_DFS_referralsJeff Layton2008-12-171-1/+2
| |/ | | | | | | | | | | | | | | | | | | | | While testing a kernel with memory poisoning enabled, I saw some warnings about the redzone getting clobbered when chasing DFS referrals. The buffer allocation for the unicode converted version of the searchName is too small and needs to take null termination into account. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge branch 'irq/sparseirq' into cpus4096Ingo Molnar2008-12-171-5/+2
|\ \ | | | | | | | | | | | | | | | | | | Conflicts: arch/x86/kernel/io_apic.c Merge irq/sparseirq here, to resolve conflicts.
| * | proc: enclose desc variable of show_stat() in CONFIG_SPARSE_IRQKOSAKI Motohiro2008-12-161-5/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: restructure code to fix compiler warning commit 240d367b4e6c6e3c5075e034db14dba60a6f5fa7 moved desc usage point into #ifdef CONFIG_SPARSE_IRQ. Eliminate the desc variable, otherwise following warning happens: fs/proc/stat.c: In function 'show_stat': fs/proc/stat.c:31: warning: unused variable 'desc' [ akpm: cleaned up the patch to remove #ifdef ] Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | |
| \ \
| \ \
| \ \
*---. \ \ Merge branches 'irq/sparseirq', 'x86/quirks' and 'x86/reboot' into cpus4096Ingo Molnar2008-12-121-4/+15
|\ \ \ \ \ | | |_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | We merge the irq/sparseirq, x86/quirks and x86/reboot trees into the cpus4096 tree because the io-apic changes in the sparseirq change conflict with the cpumask changes in the cpumask tree, and we want to resolve those.
| * | | | sparseirq: fix Alpha build failureYinghai Lu2008-12-091-7/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: build fix on Alpha -tip testing found this build failure on the Alpha defconfig: /home/mingo/tip/fs/proc/stat.c: In function 'show_stat': /home/mingo/tip/fs/proc/stat.c:48: error: implicit declaration of function 'for_each_irq_desc' /home/mingo/tip/fs/proc/stat.c:48: error: expected ';' before '{' token can not use irq_desc() in stat.c on older architectures. Signed-off-by: Yinghai Lu <yinghai@kernel.orgg> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | sparse irq_desc[] array: core kernel and x86 changesYinghai Lu2008-12-081-6/+11
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: new feature Problem on distro kernels: irq_desc[NR_IRQS] takes megabytes of RAM with NR_CPUS set to large values. The goal is to be able to scale up to much larger NR_IRQS value without impacting the (important) common case. To solve this, we generalize irq_desc[NR_IRQS] to an (optional) array of irq_desc pointers. When CONFIG_SPARSE_IRQ=y is used, we use kzalloc_node to get irq_desc, this also makes the IRQ descriptors NUMA-local (to the site that calls request_irq()). This gets rid of the irq_cfg[] static array on x86 as well: irq_cfg now uses desc->chip_data for x86 to store irq_cfg. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | Merge branch 'sched/core' into cpus4096Ingo Molnar2008-12-1210-16/+52
|\ \ \ \ | | |_|/ | |/| | | | | | | | | | | | | | Conflicts: include/linux/ftrace.h kernel/sched.c
| * | | Merge branch 'to-linus' of ↵Linus Torvalds2008-12-101-1/+9
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland * 'to-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland: tracehook: exec double-reporting fix
| | * | | tracehook: exec double-reporting fixRoland McGrath2008-12-091-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch 6341c39 "tracehook: exec" introduced a small regression in 2.6.27 regarding binfmt_misc exec event reporting. Since the reporting is now done in the common search_binary_handler() function, an exec of a misc binary will result in two (or possibly multiple) exec events being reported, instead of just a single one, because the misc handler contains a recursive call to search_binary_handler. To add to the confusion, if PTRACE_O_TRACEEXEC is not active, the multiple SIGTRAP signals will in fact cause only a single ptrace intercept, as the signals are not queued. However, if PTRACE_O_TRACEEXEC is on, the debugger will actually see multiple ptrace intercepts (PTRACE_EVENT_EXEC). The test program included below demonstrates the problem. This change fixes the bug by calling tracehook_report_exec() only in the outermost search_binary_handler() call (bprm->recursion_depth == 0). The additional change to restore bprm->recursion_depth after each binfmt load_binary call is actually superfluous for this bug, since we test the value saved on entry to search_binary_handler(). But it keeps the use of of the depth count to its most obvious expected meaning. Depending on what binfmt handlers do in certain cases, there could have been false-positive tests for recursion limits before this change. /* Test program using PTRACE_O_TRACEEXEC. This forks and exec's the first argument with the rest of the arguments, while ptrace'ing. It expects to see one PTRACE_EVENT_EXEC stop and then a successful exit, with no other signals or events in between. Test for kernel doing two PTRACE_EVENT_EXEC stops for a binfmt_misc exec: $ gcc -g traceexec.c -o traceexec $ sudo sh -c 'echo :test:M::foobar::/bin/cat: > /proc/sys/fs/binfmt_misc/register' $ echo 'foobar test' > ./foobar $ chmod +x ./foobar $ ./traceexec ./foobar; echo $? ==> good <== foobar test 0 $ ==> bad <== foobar test unexpected status 0x4057f != 0 3 $ */ #include <stdio.h> #include <sys/types.h> #include <sys/wait.h> #include <sys/ptrace.h> #include <unistd.h> #include <signal.h> #include <stdlib.h> static void wait_for (pid_t child, int expect) { int status; pid_t p = wait (&status); if (p != child) { perror ("wait"); exit (2); } if (status != expect) { fprintf (stderr, "unexpected status %#x != %#x\n", status, expect); exit (3); } } int main (int argc, char **argv) { pid_t child = fork (); if (child < 0) { perror ("fork"); return 127; } else if (child == 0) { ptrace (PTRACE_TRACEME); raise (SIGUSR1); execv (argv[1], &argv[1]); perror ("execve"); _exit (127); } wait_for (child, W_STOPCODE (SIGUSR1)); if (ptrace (PTRACE_SETOPTIONS, child, 0L, (void *) (long) PTRACE_O_TRACEEXEC) != 0) { perror ("PTRACE_SETOPTIONS"); return 4; } if (ptrace (PTRACE_CONT, child, 0L, 0L) != 0) { perror ("PTRACE_CONT"); return 5; } wait_for (child, W_STOPCODE (SIGTRAP | (PTRACE_EVENT_EXEC << 8))); if (ptrace (PTRACE_CONT, child, 0L, 0L) != 0) { perror ("PTRACE_CONT"); return 6; } wait_for (child, W_EXITCODE (0, 0)); return 0; } Reported-by: Arnd Bergmann <arnd@arndb.de> CC: Ulrich Weigand <ulrich.weigand@de.ibm.com> Signed-off-by: Roland McGrath <roland@redhat.com>
| * | | | KSYM_SYMBOL_LEN fixesHugh Dickins2008-12-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Miles Lane tailing /sys files hit a BUG which Pekka Enberg has tracked to my 966c8c12dc9e77f931e2281ba25d2f0244b06949 sprint_symbol(): use less stack exposing a bug in slub's list_locations() - kallsyms_lookup() writes a 0 to namebuf[KSYM_NAME_LEN-1], but that was beyond the end of page provided. The 100 slop which list_locations() allows at end of page looks roughly enough for all the other stuff it might print after the symbol before it checks again: break out KSYM_SYMBOL_LEN earlier than before. Latencytop and ftrace and are using KSYM_NAME_LEN buffers where they need KSYM_SYMBOL_LEN buffers, and vmallocinfo a 2*KSYM_NAME_LEN buffer where it wants a KSYM_SYMBOL_LEN buffer: fix those before anyone copies them. [akpm@linux-foundation.org: ftrace.h needs module.h] Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc Miles Lane <miles.lane@gmail.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Steven Rostedt <srostedt@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | inotify: fix IN_ONESHOT unmount event watcherDmitri Monakhov2008-12-101-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On umount two event will be dispatched to watcher: 1: inotify_dev_queue_event(.., IN_UNMOUNT,..) 2: remove_watch(watch, dev) ->inotify_dev_queue_event(.., IN_IGNORED, ..) But if watcher has IN_ONESHOT bit set then the watcher will be released inside first event. Which result in accessing invalid object later. IMHO it is not pure regression. This bug wasn't triggered while initial inotify interface testing phase because of another bug in IN_ONESHOT handling logic :) commit ac74c00e499ed276a965e5b5600667d5dc04a84a Author: Ulisses Furquim <ulissesf@gmail.com> Date: Fri Feb 8 04:18:16 2008 -0800 inotify: fix check for one-shot watches before destroying them As the IN_ONESHOT bit is never set when an event is sent we must check it in the watch's mask and not in the event's mask. TESTCASE: mkdir mnt mount -ttmpfs none mnt mkdir mnt/d ./inotify mnt/d& umount mnt ## << lockup or crash here TESTSOURCE: /* gcc -oinotify inotify.c */ #include <stdio.h> #include <stdlib.h> #include <sys/inotify.h> int main(int argc, char **argv) { char buf[1024]; struct inotify_event *ie; char *p; int i; ssize_t l; p = argv[1]; i = inotify_init(); inotify_add_watch(i, p, ~0); l = read(i, buf, sizeof(buf)); printf("read %d bytes\n", l); ie = (struct inotify_event *) buf; printf("event mask: %d\n", ie->mask); return 0; } Signed-off-by: Dmitri Monakhov <dmonakhov@openvz.org> Cc: John McCutchan <ttb@tentacle.dhs.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Robert Love <rlove@google.com> Cc: Ulisses Furquim <ulissesf@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | pagemap: fix 32-bit pagemap regressionMatt Mackall2008-12-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The large pages fix from bcf8039ed45 broke 32-bit pagemap by pulling the pagemap entry code out into a function with the wrong return type. Pagemap entries are 64 bits on all systems and unsigned long is only 32 bits on 32-bit systems. Signed-off-by: Matt Mackall <mpm@selenic.com> Reported-by: Doug Graham <dgraham@nortel.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: <stable@kernel.org> [2.6.26.x, 2.6.27.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | revert "percpu_counter: new function percpu_counter_sum_and_set"Andrew Morton2008-12-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Revert commit e8ced39d5e8911c662d4d69a342b9d053eaaac4e Author: Mingming Cao <cmm@us.ibm.com> Date: Fri Jul 11 19:27:31 2008 -0400 percpu_counter: new function percpu_counter_sum_and_set As described in revert "percpu counter: clean up percpu_counter_sum_and_set()" the new percpu_counter_sum_and_set() is racy against updates to the cpu-local accumulators on other CPUs. Revert that change. This means that ext4 will be slow again. But correct. Reported-by: Eric Dumazet <dada1@cosmosbay.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mingming Cao <cmm@us.ibm.com> Cc: <linux-ext4@vger.kernel.org> Cc: <stable@kernel.org> [2.6.27.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | revert "percpu counter: clean up percpu_counter_sum_and_set()"Andrew Morton2008-12-101-2/+2
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Revert commit 1f7c14c62ce63805f9574664a6c6de3633d4a354 Author: Mingming Cao <cmm@us.ibm.com> Date: Thu Oct 9 12:50:59 2008 -0400 percpu counter: clean up percpu_counter_sum_and_set() Before this patch we had the following: percpu_counter_sum(): return the percpu_counter's value percpu_counter_sum_and_set(): return the percpu_counter's value, copying that value into the central value and zeroing the per-cpu counters before returning. After this patch, percpu_counter_sum_and_set() has gone, and percpu_counter_sum() gets the old percpu_counter_sum_and_set() functionality. Problem is, as Eric points out, the old percpu_counter_sum_and_set() functionality was racy and wrong. It zeroes out counters on "other" cpus, without holding any locks which will prevent races agaist updates from those other CPUS. This patch reverts 1f7c14c62ce63805f9574664a6c6de3633d4a354. This means that percpu_counter_sum_and_set() still has the race, but percpu_counter_sum() does not. Note that this is not a simple revert - ext4 has since started using percpu_counter_sum() for its dirty_blocks counter as well. Note that this revert patch changes percpu_counter_sum() semantics. Before the patch, a call to percpu_counter_sum() will bring the counter's central counter mostly up-to-date, so a following percpu_counter_read() will return a close value. After this patch, a call to percpu_counter_sum() will leave the counter's central accumulator unaltered, so a subsequent call to percpu_counter_read() can now return a significantly inaccurate result. If there is any code in the tree which was introduced after e8ced39d5e8911c662d4d69a342b9d053eaaac4e was merged, and which depends upon the new percpu_counter_sum() semantics, that code will break. Reported-by: Eric Dumazet <dada1@cosmosbay.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mingming Cao <cmm@us.ibm.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | EXPORTFS: handle NULL returns from fh_to_dentry()/fh_to_parent()J. Bruce Fields2008-12-081-0/+4
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While 440037287c5 "[PATCH] switch all filesystems over to d_obtain_alias" removed some cases where fh_to_dentry() and fh_to_parent() could return NULL, there are still a few NULL returns left in individual filesystems. Thus it was a mistake for that commit to remove the handling of NULL returns in the callers. Revert those parts of 440037287c5 which removed the NULL handling. (We could, alternatively, modify all implementations to return -ESTALE instead of NULL, but that proves to require fixing a number of filesystems, and in some cases it's arguably more natural to return NULL.) Thanks to David for original patch and Linus, Christoph, and Hugh for review. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: David Howells <dhowells@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | Fix a race condition in FASYNC handlingJonathan Corbet2008-12-052-4/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Changeset a238b790d5f99c7832f9b73ac8847025815b85f7 (Call fasync() functions without the BKL) introduced a race which could leave file->f_flags in a state inconsistent with what the underlying driver/filesystem believes. Revert that change, and also fix the same races in ioctl_fioasync() and ioctl_fionbio(). This is a minimal, short-term fix; the real fix will not involve the BKL. Reported-by: Oleg Nesterov <oleg@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | Merge branch 'for-linus' of ↵Linus Torvalds2008-12-041-5/+16
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/bdev * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/bdev: [PATCH] fix bogus argument of blkdev_put() in pktcdvd [PATCH 2/2] documnt FMODE_ constants [PATCH 1/2] kill FMODE_NDELAY_NOW [PATCH] clean up blkdev_get a little bit [PATCH] Fix block dev compat ioctl handling [PATCH] kill obsolete temporary comment in swsusp_close()
| | * | [PATCH 1/2] kill FMODE_NDELAY_NOWChristoph Hellwig2008-12-041-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update FMODE_NDELAY before each ioctl call so that we can kill the magic FMODE_NDELAY_NOW. It would be even better to do this directly in setfl(), but for that we'd need to have FMODE_NDELAY for all files, not just block special files. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * | [PATCH] clean up blkdev_get a little bitChristoph Hellwig2008-12-041-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The way the bd_claim for the FMODE_EXCL case is implemented is rather confusing. Clean it up to the most logical style. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | [XFS] Fix hang after disallowed rename across directory quota domainsDave Chinner2008-12-051-1/+1
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When project quota is active and is being used for directory tree quota control, we disallow rename outside the current directory tree. This requires a check to be made after all the inodes involved in the rename are locked. We fail to unlock the inodes correctly if we disallow the rename when the target is outside the current directory tree. This results in a hang on the next access to the inodes involved in failed rename. Reported-by: Arkadiusz Miskiewicz <arekm@maven.pl> Signed-off-by: Dave Chinner <david@fromorbit.com> Tested-by: Arkadiusz Miskiewicz <arekm@maven.pl> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
* | | Merge branches 'tracing/ftrace', 'tracing/function-graph-tracer' and ↵Ingo Molnar2008-12-0518-111/+226
|\ \ \ | |/ / | | | | | | 'tracing/urgent' into tracing/core
| * | Merge branch 'for-2.6.28' of git://linux-nfs.org/~bfields/linuxLinus Torvalds2008-12-034-2/+5
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-2.6.28' of git://linux-nfs.org/~bfields/linux: NLM: client-side nlm_lookup_host() should avoid matching on srcaddr nfsd: use of unitialized list head on error exit in nfs4recover.c Add a reference to sunrpc in svc_addsock nfsd: clean up grace period on early exit
| | * | NLM: client-side nlm_lookup_host() should avoid matching on srcaddrChuck Lever2008-11-241-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit c98451bd, the loop in nlm_lookup_host() unconditionally compares the host's h_srcaddr field to the incoming source address. For client-side nlm_host entries, both are always AF_UNSPEC, so this check is unnecessary. Since commit 781b61a6, which added support for AF_INET6 addresses to nlm_cmp_addr(), nlm_cmp_addr() now returns FALSE for AF_UNSPEC addresses, which causes nlm_lookup_host() to create a fresh nlm_host entry every time it is called on the client. These extra entries will eventually expire once the server is unmounted, so the impact of this regression, introduced with lockd IPv6 support in 2.6.28, should be minor. We could fix this by adding an arm in nlm_cmp_addr() for AF_UNSPEC addresses, but really, nlm_lookup_host() shouldn't be matching on the srcaddr field for client-side nlm_host lookups. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| | * | nfsd: use of unitialized list head on error exit in nfs4recover.cJ. Bruce Fields2008-11-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Thanks to Matthew Dodd for this bug report: A file label issue while running SELinux in MLS mode provoked the following bug, which is a result of use before init on a 'struct list_head'. In nfsd4_list_rec_dir() if the call to dentry_open() fails the 'goto out' skips INIT_LIST_HEAD() which results in the normally improbable case where list_entry() returns NULL. Trace follows. NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory SELinux: Context unconfined_t:object_r:var_lib_nfs_t:s0 is not valid (left unmapped). type=1400 audit(1227298063.609:282): avc: denied { read } for pid=1890 comm="rpc.nfsd" name="v4recovery" dev=dm-0 ino=148726 scontext=system_u:system_r:nfsd_t:s0-s15:c0.c1023 tcontext=system_u:object_r:unlabeled_t:s15:c0.c1023 tclass=dir BUG: unable to handle kernel NULL pointer dereference at 00000004 IP: [<c050894e>] list_del+0x6/0x60 *pde = 0d9ce067 *pte = 00000000 Oops: 0000 [#1] SMP Modules linked in: nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc ipv6 dm_multipath scsi_dh ppdev parport_pc sg parport floppy ata_piix pata_acpi ata_generic libata pcnet32 i2c_piix4 mii pcspkr i2c_core dm_snapshot dm_zero dm_mirror dm_log dm_mod BusLogic sd_mod scsi_mod crc_t10dif ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode] Pid: 1890, comm: rpc.nfsd Not tainted (2.6.27.5-37.fc9.i686 #1) EIP: 0060:[<c050894e>] EFLAGS: 00010217 CPU: 0 EIP is at list_del+0x6/0x60 EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: cd99e480 ESI: cf9caed8 EDI: 00000000 EBP: cf9caebc ESP: cf9caeb8 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Process rpc.nfsd (pid: 1890, ti=cf9ca000 task=cf4de580 task.ti=cf9ca000) Stack: 00000000 cf9caef0 d0a9f139 c0496d04 d0a9f217 fffffff3 00000000 00000000 00000000 00000000 cf32b220 00000000 00000008 00000801 cf9caefc d0a9f193 00000000 cf9caf08 d0a9b6ea 00000000 cf9caf1c d0a874f2 cf9c3004 00000008 Call Trace: [<d0a9f139>] ? nfsd4_list_rec_dir+0xf3/0x13a [nfsd] [<c0496d04>] ? do_path_lookup+0x12d/0x175 [<d0a9f217>] ? load_recdir+0x0/0x26 [nfsd] [<d0a9f193>] ? nfsd4_recdir_load+0x13/0x34 [nfsd] [<d0a9b6ea>] ? nfs4_state_start+0x2a/0xc5 [nfsd] [<d0a874f2>] ? nfsd_svc+0x51/0xff [nfsd] [<d0a87f2d>] ? write_svc+0x0/0x1e [nfsd] [<d0a87f48>] ? write_svc+0x1b/0x1e [nfsd] [<d0a87854>] ? nfsctl_transaction_write+0x3a/0x61 [nfsd] [<c04b6a4e>] ? sys_nfsservctl+0x116/0x154 [<c04975c1>] ? putname+0x24/0x2f [<c04975c1>] ? putname+0x24/0x2f [<c048d49f>] ? do_sys_open+0xad/0xb7 [<c048d337>] ? filp_close+0x50/0x5a [<c048d4eb>] ? sys_open+0x1e/0x26 [<c0403cca>] ? syscall_call+0x7/0xb [<c064007b>] ? init_cyrix+0x185/0x490 ======================= Code: 75 e1 8b 53 08 8d 4b 04 8d 46 04 e8 75 00 00 00 8b 53 10 8d 4b 0c 8d 46 0c e8 67 00 00 00 5b 5e 5f 5d c3 90 90 55 89 e5 53 89 c3 <8b> 40 04 8b 00 39 d8 74 16 50 53 68 3e d6 6f c0 6a 30 68 78 d6 EIP: [<c050894e>] list_del+0x6/0x60 SS:ESP 0068:cf9caeb8 ---[ end trace a89c4ad091c4ad53 ]--- Cc: Matthew N. Dodd <Matthew.Dodd@spart.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| | * | nfsd: clean up grace period on early exitJ. Bruce Fields2008-11-242-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If nfsd was shut down before the grace period ended, we could end up with a freed object still on grace_list. Thanks to Jeff Moyer for reporting the resulting list corruption warnings. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Tested-by: Jeff Moyer <jmoyer@redhat.com>
| * | | Merge branch 'linux-next' of git://git.infradead.org/ubifs-2.6Linus Torvalds2008-12-0214-109/+221
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'linux-next' of git://git.infradead.org/ubifs-2.6: UBIFS: pre-allocate bulk-read buffer UBIFS: do not allocate too much UBIFS: do not print scary memory allocation warnings UBIFS: allow for gaps when dirtying the LPT UBIFS: fix compilation warnings MAINTAINERS: change UBI/UBIFS git tree URLs UBIFS: endian handling fixes and annotations UBIFS: remove printk
| | * | | UBIFS: pre-allocate bulk-read bufferArtem Bityutskiy2008-11-213-18/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To avoid memory allocation failure during bulk-read, pre-allocate a bulk-read buffer, so that if there is only one bulk-reader at a time, it would just use the pre-allocated buffer and would not do any memory allocation. However, if there are more than 1 bulk- reader, then only one reader would use the pre-allocated buffer, while the other reader would allocate the buffer for itself. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| | * | | UBIFS: do not allocate too muchArtem Bityutskiy2008-11-214-33/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Bulk-read allocates 128KiB or more using kmalloc. The allocation starts failing often when the memory gets fragmented. UBIFS still works fine in this case, because it falls-back to standard (non-optimized) read method, though. This patch teaches bulk-read to allocate exactly the amount of memory it needs, instead of allocating 128KiB every time. This patch is also a preparation to the further fix where we'll have a pre-allocated bulk-read buffer as well. For example, now the @bu object is prepared in 'ubifs_bulk_read()', so we could path either pre-allocated or allocated information to 'ubifs_do_bulk_read()' later. Or teaching 'ubifs_do_bulk_read()' not to allocate 'bu->buf' if it is already there. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| | * | | UBIFS: do not print scary memory allocation warningsArtem Bityutskiy2008-11-213-8/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Bulk-read allocates a lot of memory with 'kmalloc()', and when it is/gets fragmented 'kmalloc()' fails with a scarry warning. But because bulk-read is just an optimization, UBIFS keeps working fine. Supress the warning by passing __GFP_NOWARN option to 'kmalloc()'. This patch also introduces a macro for the magic 128KiB constant. This is just neater. Note, this is not really fixes the problem we had, but just hides the warnings. The further patches fix the problem. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| | * | | UBIFS: allow for gaps when dirtying the LPTAdrian Hunter2008-11-071-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The LPT may have gaps in it because initially empty LEBs are not added by mkfs.ubifs - because it does not know how many there are. Then UBIFS allocates empty LEBs in the reverse order that they are discovered i.e. they are added to, and removed from, the front of a list. That creates a gap in the middle of the LPT. The function dirtying the LPT tree (for the purpose of small model garbage collection) assumed that a gap could only occur at the very end of the LPT and stopped dirtying prematurely, which in turn resulted in the LPT running out of space - something that is designed to be impossible. Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
| | * | | UBIFS: fix compilation warningsArtem Bityutskiy2008-11-067-50/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We print 'ino_t' type using '%lu' printk() placeholder, but this results in many warnings when compiling for Alpha platform. Fix this by adding (unsingned long) casts. Fixes these warnings: fs/ubifs/journal.c:693: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/journal.c:1131: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/dir.c:163: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/tnc.c:2680: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/tnc.c:2700: warning: format '%lu' expects type 'long unsigned int', but argument 5 has type 'ino_t' fs/ubifs/replay.c:1066: warning: format '%lu' expects type 'long unsigned int', but argument 7 has type 'ino_t' fs/ubifs/orphan.c:108: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/orphan.c:135: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/orphan.c:142: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/orphan.c:154: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/orphan.c:159: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/orphan.c:451: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/orphan.c:539: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/orphan.c:612: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/orphan.c:843: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/orphan.c:856: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/recovery.c:1438: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/recovery.c:1443: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/recovery.c:1475: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/recovery.c:1495: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/debug.c:105: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'ino_t' fs/ubifs/debug.c:105: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'ino_t' fs/ubifs/debug.c:110: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'ino_t' fs/ubifs/debug.c:110: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'ino_t' fs/ubifs/debug.c:114: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'ino_t' fs/ubifs/debug.c:114: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'ino_t' fs/ubifs/debug.c:118: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'ino_t' fs/ubifs/debug.c:118: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'ino_t' fs/ubifs/debug.c:1591: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/debug.c:1671: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/debug.c:1674: warning: format '%lu' expects type 'long unsigned int', but argument 5 has type 'ino_t' fs/ubifs/debug.c:1680: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/debug.c:1699: warning: format '%lu' expects type 'long unsigned int', but argument 5 has type 'ino_t' fs/ubifs/debug.c:1788: warning: format '%lu' expects type 'long unsigned int', but argument 5 has type 'ino_t' fs/ubifs/debug.c:1821: warning: format '%lu' expects type 'long unsigned int', but argument 5 has type 'ino_t' fs/ubifs/debug.c:1833: warning: format '%lu' expects type 'long unsigned int', but argument 5 has type 'ino_t' fs/ubifs/debug.c:1924: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/debug.c:1932: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/debug.c:1938: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/debug.c:1945: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/debug.c:1953: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/debug.c:1960: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/debug.c:1967: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/debug.c:1973: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/debug.c:1988: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t' fs/ubifs/debug.c:1991: warning: format '%lu' expects type 'long unsigned int', but argument 5 has type 'ino_t' fs/ubifs/debug.c:2009: warning: format '%lu' expects type 'long unsigned int', but argument 2 has type 'ino_t' Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| | * | | UBIFS: endian handling fixes and annotationsHarvey Harrison2008-11-066-13/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Noticed by sparse: fs/ubifs/file.c:75:2: warning: restricted __le64 degrades to integer fs/ubifs/file.c:629:4: warning: restricted __le64 degrades to integer fs/ubifs/dir.c:431:3: warning: restricted __le64 degrades to integer This should be checked to ensure the ubifs_assert is working as intended, I've done the suggested annotation in this patch. fs/ubifs/sb.c:298:6: warning: incorrect type in assignment (different base types) fs/ubifs/sb.c:298:6: expected int [signed] [assigned] tmp fs/ubifs/sb.c:298:6: got restricted __le64 [usertype] <noident> fs/ubifs/sb.c:299:19: warning: incorrect type in assignment (different base types) fs/ubifs/sb.c:299:19: expected restricted __le64 [usertype] atime_sec fs/ubifs/sb.c:299:19: got int [signed] [assigned] tmp fs/ubifs/sb.c:300:19: warning: incorrect type in assignment (different base types) fs/ubifs/sb.c:300:19: expected restricted __le64 [usertype] ctime_sec fs/ubifs/sb.c:300:19: got int [signed] [assigned] tmp fs/ubifs/sb.c:301:19: warning: incorrect type in assignment (different base types) fs/ubifs/sb.c:301:19: expected restricted __le64 [usertype] mtime_sec fs/ubifs/sb.c:301:19: got int [signed] [assigned] tmp This looks like a bugfix as your tmp was a u32 so there was truncation in the atime, mtime, ctime value, probably not intentional, add a tmp_le64 and use it here. fs/ubifs/key.h:348:9: warning: cast to restricted __le32 fs/ubifs/key.h:348:9: warning: cast to restricted __le32 fs/ubifs/key.h:419:9: warning: cast to restricted __le32 Read from the annotated union member instead. fs/ubifs/recovery.c:175:13: warning: incorrect type in assignment (different base types) fs/ubifs/recovery.c:175:13: expected unsigned int [unsigned] [usertype] save_flags fs/ubifs/recovery.c:175:13: got restricted __le32 [usertype] flags fs/ubifs/recovery.c:186:13: warning: incorrect type in assignment (different base types) fs/ubifs/recovery.c:186:13: expected restricted __le32 [usertype] flags fs/ubifs/recovery.c:186:13: got unsigned int [unsigned] [usertype] save_flags Do byteshifting at compile time of the flag value. Annotate the saved_flags as le32. fs/ubifs/debug.c:368:10: warning: cast to restricted __le32 fs/ubifs/debug.c:368:10: warning: cast from restricted __le64 Should be checked if the truncation was intentional, I've changed the printk to print the full width. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| | * | | UBIFS: remove printkArtem Bityutskiy2008-11-061-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove the "UBIFS background thread ubifs_bgd0_0 started" message. We kill the background thread when we switch to R/O mode, and start it again whan we switch to R/W mode. OLPC is doing this many times during boot, and we see this message many times as well, which is irritating. So just kill the message. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
* | | | | Merge commit 'v2.6.28-rc7' into tracing/coreIngo Molnar2008-12-0411-51/+150
|\ \ \ \ \ | |/ / / /
| * | | | ntfs: don't fool kernel-docRandy Dunlap2008-12-011-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kernel-doc handles macros now (it has for quite some time), so change the ntfs_debug() macro's kernel-doc to be just before the macro instead of before a phony function prototype. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Anton Altaparmakov <aia21@cantab.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | epoll: introduce resource usage limitsDavide Libenzi2008-12-011-8/+77
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It has been thought that the per-user file descriptors limit would also limit the resources that a normal user can request via the epoll interface. Vegard Nossum reported a very simple program (a modified version attached) that can make a normal user to request a pretty large amount of kernel memory, well within the its maximum number of fds. To solve such problem, default limits are now imposed, and /proc based configuration has been introduced. A new directory has been created, named /proc/sys/fs/epoll/ and inside there, there are two configuration points: max_user_instances = Maximum number of devices - per user max_user_watches = Maximum number of "watched" fds - per user The current default for "max_user_watches" limits the memory used by epoll to store "watches", to 1/32 of the amount of the low RAM. As example, a 256MB 32bit machine, will have "max_user_watches" set to roughly 90000. That should be enough to not break existing heavy epoll users. The default value for "max_user_instances" is set to 128, that should be enough too. This also changes the userspace, because a new error code can now come out from EPOLL_CTL_ADD (-ENOSPC). The EMFILE from epoll_create() was already listed, so that should be ok. [akpm@linux-foundation.org: use get_current_user()] Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: <stable@kernel.org> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Reported-by: Vegard Nossum <vegardno@ifi.uio.no> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | ocfs2: fix regression in ocfs2_read_blocks_sync()Mark Fasheh2008-12-011-11/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We're panicing in ocfs2_read_blocks_sync() if a jbd-managed buffer is seen. At first glance, this seems ok but in reality it can happen. My test case was to just run 'exorcist'. A struct inode is being pushed out of memory but is then re-read at a later time, before the buffer has been checkpointed by jbd. This causes a BUG to be hit in ocfs2_read_blocks_sync(). Reviewed-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
| * | | | ocfs2: fix return value set in init_dlmfs_fs()Coly Li2008-12-011-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In init_dlmfs_fs(), if calling kmem_cache_create() failed, the code will use return value from calling bdi_init(). The correct behavior should be set status as -ENOMEM before going to "bail:". Signed-off-by: Coly Li <coyli@suse.de> Acked-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
| * | | | ocfs2: fix wake_up in unlock_astDavid Teigland2008-12-011-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In ocfs2_unlock_ast(), call wake_up() on lockres before releasing the spin lock on it. As soon as the spin lock is released, the lockres can be freed. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
| * | | | ocfs2: initialize stack_user lvbptrDavid Teigland2008-12-011-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The locking_state dump, ocfs2_dlm_seq_show, reads the lvb on locks where it has not yet been initialized by a lock call. Signed-off-by: David Teigland <teigland@redhat.com> Acked-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
| * | | | ocfs2: comments typo fixColy Li2008-12-012-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes two typos in comments of ocfs2. Signed-off-by: Coly Li <coyli@suse.de> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
| * | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6Linus Torvalds2008-11-301-21/+56
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: [CIFS] fix regression in cifs_write_begin/cifs_write_end
| | * | | | [CIFS] fix regression in cifs_write_begin/cifs_write_endJeff Layton2008-11-261-21/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The conversion to write_begin/write_end interfaces had a bug where we were passing a bad parameter to cifs_readpage_worker. Rather than passing the page offset of the start of the write, we needed to pass the offset of the beginning of the page. This was reliably showing up as data corruption in the fsx-linux test from LTP. It also became evident that this code was occasionally doing unnecessary read calls. Optimize those away by using the PG_checked flag to indicate that the unwritten part of the page has been initialized. CC: Nick Piggin <npiggin@suse.de> Acked-by: Dave Kleikamp <shaggy@us.ibm.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | | | | udf: Fix BUG_ON() in destroy_inode()Jan Kara2008-11-272-0/+2
| |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | udf_clear_inode() can leave behind buffers on mapping's i_private list (when we truncated preallocation). Call invalidate_inode_buffers() so that the list is properly cleaned-up before we return from udf_clear_inode(). This is ugly and suggest that we should cleanup preallocation earlier than in clear_inode() but currently there's no such call available since drop_inode() is called under inode lock and thus is unusable for disk operations. Signed-off-by: Jan Kara <jack@suse.cz>
* | | | | vfs, seqfile: export mangle_path() generallyIngo Molnar2008-11-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mangle_path() is trivial enough to make export restrictions on it pointless - so change the export from EXPORT_SYMBOL_GPL to EXPORT_SYMBOL. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
* | | | | blktrace: port to tracepoints, updateIngo Molnar2008-11-261-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Port to the new tracepoints API: split DEFINE_TRACE() and DECLARE_TRACE() sites. Spread them out to the usage sites, as suggested by Mathieu Desnoyers. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
OpenPOWER on IntegriCloud