| Commit message (Collapse) | Author | Age | Files | Lines |
|\
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
After converting the cpu physical address to shub2 physical
addressing, the address was run through TO_PHYS() which
clobbered a high node offset bit causing the BTE to fail
on shub2 nodes with large memory. This fix corrects
that problem.
Signed-off-by: Russ Anderson (rja@sgi.com)
Signed-off-by: Tony Luck <tony.luck@intel.com>
|
| |\ |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
arch/ia64/sn/Makefile sets CPPFLAGS, expecting that setting to
propogate to all the subdirectories. For a normal build with its
recursive descent it does work, but doing a selective build like
'make arch/ia64/sn/kernel/io_init.i' does not do a recursive descent,
it goes directly to arch/ia64/sn/kernel/Makefile so the flags do not
get set.
To support selective builds, set the flags in all the subordinate Makefiles.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
During some testing, we got a warning about trying to allocate
memory while holding a lock. This fixes that problem.
Signed-off-by: Robin Holt <holt@sgi.com>
Acked-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
House keeping - eliminate unneeded parenthesis in macro defines.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Maintenance patch:
- Add missing __init calls
- Do not zero initialize global variables
- No need to typecast function call returns to void
- Some formatting
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
|
| |\ \ |
|
| | |/
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Actually I think this is more appropriate so we don't end up with 17
cases that add drivers/sn to the build lib.
Include drivers/sn when CONFIG_IA64_SGI_SN2 or CONFIG_IA64_GENERIC
is enabled.
Acked-by: Dave Jones <davej@redhat.com>
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
If SAL_CACHE_FLUSH drops interrupts, complain about it and fall back to
using PAL_CACHE_FLUSH instead.
This is to work around a defect in HP rx5670 firmware: when an interrupt
occurs during SAL_CACHE_FLUSH, SAL drops the interrupt but leaves it marked
"in-service", which leaves the interrupt (and others of equal or lower
priority) masked.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Somehow I doubt this comment is meant to be here anymore... It's
been floating after the L1_CACHE_SHIFT entry since before Linux
moved to bitkeeper.
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Temporary patch to make pci_enable_msi() fail gracefully on altix. Will be
removed after 2.6.16 releases and the msi abstraction patches start flowing.
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Redirecting interrupts using smp_affinity on altix does not work on kernels
built with CONFIG_PCI_MSI. The problem is that move_irq() turns into a noop
if MSI is built in. This patch calls move_native_irq() instead of move_irq()
to get around that.
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
With the recent optimization made to wrap_mmu_context function,
we don't hold tasklist_lock anymore when wrapping context id.
The comments in asm/system.h must fall through the crack earlier.
Remove staled comments.
I believe it is still beneficial to unlock the runqueue lock
across context switch. So leave __ARCH_WANT_UNLOCKED_CTXSW on.
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
|
| |\ \ |
|
| | |/
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This patch finishes support for SHUB2 (the new chipset). Most of the
changes are performance related. A few changes are workarounds for
"interesting" chipset features.
Some temporary debugging code has also been deleted.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
When we pull the PPP protocol off the skb, we forgot to update the
hardware RX checksum. This may lead to messages such as
dsl0: hw csum failure.
Similarly, we need to clear the hardware checksum flag when we use
the existing packet to store the decompressed result.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
I thought we had fixed up all non-gpl USB drivers, and was wrong to do
this.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The non-NUMA case would do an unmatched "free_alien_cache()" on an alien
pointer that had never been allocated.
It might not matter from a code generation standpoint (since in the
non-NUMA case, the code doesn't actually _do_ anything), but it not only
results in a compiler warning, it's really really ugly too.
Fix the compiler warning by just having a matching dummy allocation.
That also avoids an unnecessary #ifdef in the code.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|\ \ \ |
|
| | | |
| | | |
| | | |
| | | | |
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | | |
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|\ \ \ \ |
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
After DNAT the original dst_entry needs to be released if present
so the packet doesn't skip input routing with its new address. The
current check for DNAT in ip_nat_in is reversed and checks for SNAT.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The IPv4 and IPv6 version of the policy match are identical besides address
comparison and the data structure used for userspace communication. Unify
the data structures to break compatiblity now (before it is released), so
we can port it to x_tables in 2.6.17.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Fix two bugs in ip6t_policy address matching:
- misorder arguments to ip6_masked_addrcmp, mask must be the second argument
- inversion incorrectly applied to the entire expression instead of just
the address comparison
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
netfilter's do_replace() can overflow on addition within SMP_ALIGN()
and/or on multiplication by NR_CPUS, resulting in a buffer overflow on
the copy_from_user(). In practice, the overflow on addition is
triggerable on all systems, whereas the multiplication one might require
much physical memory to be present due to the check above. Either is
sufficient to overwrite arbitrary amounts of kernel memory.
I really hate adding the same check to all 4 versions of do_replace(),
but the code is duplicate...
Found by Solar Designer during security audit of OpenVZ.org
Signed-Off-By: Kirill Korotaev <dev@openvz.org>
Signed-Off-By: Solar Designer <solar@openwall.com>
Signed-off-by: Patrck McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
This memset() is executing with a bad size. According to Yasuyuki Kozakai,
this memset() can be deleted, as 'ftp' is declared in global area.
Signed-off-by: Samir Bellabes <sbellabes@mandriva.com>
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Fix some typos that make iptables userspace compilation fail.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Reported by David Ahern <dahern@avaya.com>, netfilter bugzilla #426.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The packet marked is the netlink skb, not the queued skb.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The skb allocated is always of size nlbufsize, even if that is smaller than
the size needed for the current packet.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Performance tests showed that ULOG may fail on heavy loaded systems
because of failed order-N allocations (N >= 1).
The default value of 4096 is not optimal in the sense that it actually
allocates _two_ contigous physical pages. Reasoning: ULOG uses
alloc_skb(), which adds another ~300 bytes for skb_shared_info.
This patch sets the default value to NLMSG_GOODSIZE and adds some
documentation at the top.
Signed-off-by: Holger Eitzenberger <heitzenberger@astaro.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
__nf_conntrack_{l3}proto_find() doesn't check the passed protocol family,
then it's possible to touch out of the array which has only AF_MAX items.
Spotted by Pablo Neira Ayuso.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Add load-on-demand support for expectation request. eg. conntrack -L expect
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The ctnetlink expectation events should use the NFNL_SUBSYS_CTNETLINK_EXP
subsystem, not NFNL_SUBSYS_CTNETLINK.
Signed-off-by: Marcus Sundberg <marcus@ingate.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| |/ / /
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
When two ip_route_output_key lookups in icmp_send were combined I
forgot to change the error path for ip_options_echo to not drop the
dst reference since it now sits before the dst lookup. To fix it we
simply jump past the ip_rt_put call.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
On a system where libintl.h is present, but the NLS functionality is
supplied by a separate library instead of the system C library, an attempt
to "make config" or "make menuconfig" will fail with link errors, ex:
scripts/kconfig/mconf.o:mconf.c:(.text+0xf63): undefined reference to
`_libintl_gettext'
This patch attempts to correct the problem by detecting whether or not NLS
support requires linking with libintl.
Signed-off-by: Samuel J Robb <sam.robb@timesys.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Due to the usage of set_64bit in include/asm-i386/pgtable-3level.h,
HIGHMEM64G must depend on X86_CMPXCHG64.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Fix gcc4.1 compile warnings "value computed is not used" with
set_current_state() and set_task_state() on i386/SMP and x86-64.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Show first field of kernel version in register dumps like x86_64 does.
Changes output from e.g.:
(2.6.16-rc1)
to:
(2.6.16-rc1 #12)
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
i386 CPU init code accesses freed init memory when booting a newly-started
processor after CPU hotplug. The cpu_devs array is searched to find the
vendor and it contains pointers to freed data.
Fix that by:
1. Zeroing entries for freed vendor data after bootup.
2. Changing Transmeta, NSC and UMC to all __init[data].
3. Printing a warning (once only) and setting this_cpu
to a safe default when the vendor is not found.
This does not change behavior for AMD systems. They were broken already
but no error was reported.
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
When walking a path, the LOOKUP_CONTINUE flag is used by some filesystems
(for instance NFS) in order to determine whether or not it is looking up
the last component of the path. It this is the case, it may have to look
at the intent information in order to perform various tasks such as atomic
open.
A problem currently occurs when link_path_walk() hits a symlink. In this
case LOOKUP_CONTINUE may be cleared prematurely when we hit the end of the
path passed by __vfs_follow_link() (i.e. the end of the symlink path)
rather than when we hit the end of the path passed by the user.
The solution is to have link_path_walk() clear LOOKUP_CONTINUE if and only
if that flag was unset when we entered the function.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This fixes locking and bugs in cpu_down and cpu_up paths of the NUMA slab
allocator. Sonny Rao <sonny@burdell.org> reported problems sometime back on
POWER5 boxes, when the last cpu on the nodes were being offlined. We could
not reproduce the same on x86_64 because the cpumask (node_to_cpumask) was not
being updated on cpu down. Since that issue is now fixed, we can reproduce
Sonny's problems on x86_64 NUMA, and here is the fix.
The problem earlier was on CPU_DOWN, if it was the last cpu on the node to go
down, the array_caches (shared, alien) and the kmem_list3 of the node were
being freed (kfree) with the kmem_list3 lock held. If the l3 or the
array_caches were to come from the same cache being cleared, we hit on
badness.
This patch cleans up the locking in cpu_up and cpu_down path. We cannot
really free l3 on cpu down because, there is no node offlining yet and even
though a cpu is not yet up, node local memory can be allocated for it. So l3s
are usually allocated at keme_cache_create and destroyed at
kmem_cache_destroy. Hence, we don't need cachep->spinlock protection to get
to the cachep->nodelist[nodeid] either.
Patch survived onlining and offlining on a 4 core 2 node Tyan box with a 4
dbench process running all the time.
Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Cc: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Earlier, we had to disable on chip interrupts while taking the
cachep->spinlock because, at cache_grow, on every addition of a slab to a slab
cache, we incremented colour_next which was protected by the cachep->spinlock,
and cache_grow could occur at interrupt context. Since, now we protect the
per-node colour_next with the node's list_lock, we do not need to disable on
chip interrupts while taking the per-cache spinlock, but we just need to
disable interrupts when taking the per-node kmem_list3 list_lock.
Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Cc: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
colour_next is used as an index to add a colouring offset to a new slab in the
cache (colour_off * colour_next). Now with the NUMA aware slab allocator, it
makes sense to colour slabs added on the same node sequentially with
colour_next.
This patch moves the colouring index "colour_next" per-node by placing it on
kmem_list3 rather than kmem_cache.
This also helps simplify locking for CPU up and down paths.
Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Cc: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
I just spent some time researching a Bus Error. Turns out that the huge
page fault handler can return VM_FAULT_SIGBUS for various conditions where
no huge page is available.
Add a note explaining the reasoning in the source.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Acked-by: William Lee Irwin III <wli@holomorphy.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Ben points out that:
When writing files out using O_SYNC, jbd's 1 jiffy delay results in a
significant drop in throughput as the disk sits idle. The patch below
results in a 4-5x performance improvement (from 6.5MB/s to ~24-30MB/s on my
IDE test box) when writing out files using O_SYNC.
So optimise the batching code by omitting it entirely if the process which is
doing a sync write is the same as the one which did the most recent sync
write. If that's true, we're unlikely to get any other processes joining the
transaction.
(Has been in -mm for ages - it took me a long time to get on to performance
testing it)
Numbers, on write-cache-disabled IDE:
/usr/bin/time -p synctest -n 10 -uf -t 1 -p 1 dir-name
Unpatched:
40 seconds
Patched:
35 seconds
Batching disabled:
35 seconds
This is the problematic single-process-doing-fsync case. With multiple
fsyncing processes the numbers are AFACIT unaltered by the patch.
Aside: performance testing and instrumentation shows that the transaction
batching almost doesn't help (testing with synctest -n 1 -uf -t 100 -p 10
dir-name on non-writeback-caching IDE). This is because by the time one
process is running a synchronous commit, a bunch of other processes already
have a transaction handle open, so they're all going to batch into the same
transaction anyway.
The batching seems to offer maybe 5-10% speedup with this workload, but I'm
pretty sure it was more important than that when it was first developed 4-odd
years ago...
Cc: "Stephen C. Tweedie" <sct@redhat.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|