| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
The 32bit kernel does not add padding bytes in the fc_bsg_request structure
whereas the 64bit kernel adds padding bytes in the fc_bsg_request structure.
Due to this, structure elements gets mismatched with 32bit application and
64bit kernel.To resolve this, used packed modifier to avoid adding padding bytes.
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
|
|
|
|
|
|
| |
address alignment.
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
|
|
|
|
|
|
|
|
|
| |
The driver did not account for non-tape devices needing to employ
proper FCP2 recovery. Driver now checks the FCP2-capable flag
only, rather than using a midlayer-determined flag (TYPE_TAPE).
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Because of the terrible structuring of scsi-bidi-commands
it breaks some of the life time rules of a scsi-command.
It is now not allowed to free up the block-request before
cleanup and partial deallocation of the scsi-command. (Which
is not so for none bidi commands)
The right fix to this problem would be to make bidi command
a first citizen by allocating a scsi_sdb pointer at scsi command
just like cmd->prot_sdb. The bidi sdb should be allocated/deallocated
as part of the get/put_command (Again like the prot_sdb) and the
current decoupling of scsi_cmnd and blk-request should be kept.
For now make sure scsi_release_buffers() is called before the
call to blk_end_request_all() which might cause the suicide of
the block requests. At best the leak of bidi buffers, at worse
a crash, as there is a race between the existence of the bidi_request
and the free of the associated bidi_sdb.
The reason this was never hit before is because only OSD has the potential
of doing asynchronous bidi commands. (So does bsg but it is never used)
And OSD clients just happen to do all their bidi commands synchronously, up
until recently.
CC: Stable Tree <stable@kernel.org>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since commit 9d2e9d66a3f032667934144cd61c396ba49f090d
mptsas driver fails to allocate memory for the MPT chain buffers
for second LSI adapter on PPC440SPe Katmai platform:
...
ioc1: LSISAS1068E B3: Capabilities={Initiator}
mptbase: ioc1: ERROR - Unable to allocate Reply, Request, Chain Buffers!
mptbase: ioc1: ERROR - didn't initialize properly! (-3)
mptsas: probe of 0002:31:00.0 failed with error -3
This commit increased MPT_FC_CAN_QUEUE value but initChainBuffers()
doesn't differentiate between SAS and FC causing increased allocation
for SAS case, too. Later pci_alloc_consistent() fails to allocate
increased chain buffer pool size for SAS case.
Provide a fix by looking at the bus type and using appropriate
MPT_SAS_CAN_QUEUE value while calculation of the number of chain
buffers.
Signed-off-by: Anatolij Gustschin <agust@denx.de>
Acked-by: Kashyap Desai <kashyap.desai@lsi.com>
Cc: Stable Tree <stable@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These particular problems were reported by Cisco and SAP and customers
as well. Cisco reported on RHEL4 U6 and SAP reported on SLES9 SP4 and
SLES10 SP2. We added these fixes on RHEL4 U6 and gave a private build
to IBM and Cisco. Cisco and IBM tested it for more than 15 days and
they reported that they did not see the issue so far. Before the fix,
Cisco used to see the issue within 5 days. We generated a patch for
SLES9 SP4 and SLES10 SP2 and submitted to Novell. Novell applied the
patch and gave a test build to SAP. SAP tested and reported that the
build is working properly.
We also tested in our lab using the tools "dishogsync", which is IO
stress tool and the tool was provided by Cisco.
Issue1: File System going into read-only mode
Root cause: The driver tends to not free the memory (FIB) when the
management request exits prematurely. The accumulation of such
un-freed memory causes the driver to fail to allocate anymore memory
(FIB) and hence return 0x70000 value to the upper layer, which puts
the file system into read only mode.
Fix details: The fix makes sure to free the memory (FIB) even if the
request exits prematurely hence ensuring the driver wouldn't run out
of memory (FIBs).
Issue2: False Raid Alert occurs
When the Physical Drives and Logical drives are reported as deleted or
added, even though there is no change done on the system
Root cause: Driver IOCTLs is signaled with EINTR while waiting on
response from the lower layers. Returning "EINTR" will never initiate
internal retry.
Fix details: The issue was fixed by replacing "EINTR" with
"ERESTARTSYS" for mid-layer retries.
Signed-off-by: Penchala Narasimha Reddy <ServeRAIDDriver@hcl.in>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
|
|
|
|
|
|
|
| |
lpfc_hbadisc.c and lpfc_hw4.h accidentally got set executable.
Reported-by: Thomas Backlund <tmb@mandriva.org>
Cc: James Smart <James.Smart@Emulex.Com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit f2260e6b (page allocator: update NR_FREE_PAGES only as necessary)
made one minor regression. if __rmqueue() was failed, NR_FREE_PAGES stat
go wrong. this patch fixes it.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Reported-by: Huang Shijie <shijie8@gmail.com>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
* 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
i2c: Do not use device name after device_unregister
i2c/pca: Don't use *_interruptible
i2c-ali1563: Remove sparse warnings
i2c: Test off by one in {piix4,vt596}_transaction()
i2c-core: Storage class should be before const qualifier
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
dev_dbg outputs dev_name, which is released with device_unregister. This bug
resulted in output like this:
i2c Xy2�0: adapter [SMBus I801 adapter at 1880] unregistered
The right output would be:
i2c i2c-0: adapter [SMBus I801 adapter at 1880] unregistered
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Unexpected signals can disturb the bus-handling and lock it up. Don't use
interruptible in 'wait_event_*' and 'wake_*' as in commits
dc1972d02747d2170fb1d78d114801f5ecb27506 (for cpm),
1ab082d7cbd0f34e39a5396cc6340c00bc5d66ef (for mpc),
b7af349b175af45f9d87b3bf3f0a221e1831ed39 (for omap).
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Remove the following sparse warnings (see "make C=1"):
* drivers/i2c/busses/i2c-ali1563.c:91:3: warning: do-while statement
is not a compound statement
* drivers/i2c/busses/i2c-ali1563.c:161:3: warning: do-while statement
is not a compound statement
Signed-off-by: Márton Németh <nm127@freemail.hu>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
|
| |
| |
| |
| |
| |
| |
| |
| | |
With `while (timeout++ < MAX_TIMEOUT)' timeout reaches MAX_TIMEOUT + 1
after the loop. This is probably unlikely to produce a problem.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The C99 specification states in section 6.11.5:
The placement of a storage-class specifier other than at the beginning
of the declaration specifiers in a declaration is an obsolescent
feature.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, uv: Ensure hub revision set for all ACPI modes.
x86, uv: Add function retrieving node controller revision number
x86: xen: 64-bit kernel RPL should be 0
x86: kernel_thread() -- initialize SS to a known state
x86/agp: Fix agp_amd64_init and agp_amd64_cleanup
x86: SGI UV: Fix mapping of MMIO registers
x86: mce.h: Fix warning in header checks
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Ensure that UV hub revision is set for all ACPI modes.
Signed-off-by: Russ Anderson <rja@sgi.com>
LKML-Reference: <20100115180908.GB7757@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Add function for determining the revision id of the SGI UV
node controller chip (HUB). This function is needed in a
subsequent patch.
Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20100112210904.GA24546@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Under Xen 64 bit guests actually run their kernel in ring 3,
however the hypervisor takes care of squashing descriptor the
RPLs transparently (in order to allow them to continue to
differentiate between user and kernel space CS using the RPL).
Therefore the Xen paravirt backend should use RPL==0 instead of
1 (or 3). Using RPL==1 causes generic arch code to take
incorrect code paths because it uses "testl $3, <CS>, je foo"
type tests for a userspace CS and this considers 1==userspace.
This issue was previously masked because get_kernel_rpl() was
omitted when setting CS in kernel_thread(). This was fixed when
kernel_thread() was unified with 32 bit in
f443ff4201dd25cd4dec183f9919ecba90c8edc2.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1263377768-19600-2-git-send-email-ian.campbell@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Before the kernel_thread was converted into "C" we had
pt_regs::ss set to __KERNEL_DS (by SAVE_ALL asm macro).
Though I must admit I didn't find any *explicit* load of
%ss from this structure the better to be on a safe side
and set it to a known value.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com>
Cc: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1263377768-19600-1-git-send-email-ian.campbell@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This fixes the regression introduced by the commit
f405d2c02395a74d3883bd03ded36457aa3697ad.
The above commit fixes the following issue:
http://marc.info/?l=linux-kernel&m=126192729110083&w=2
However, it doesn't work properly when you remove and insert the
agp_amd64 module again.
agp_amd64_init() and agp_amd64_cleanup should be called only
when gart_iommu is not called earlier (that is, the GART IOMMU
is not enabled). We need to use 'gart_iommu_aperture' to see if
GART IOMMU is enabled or not.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: mitov@issp.bas.bg
Cc: davej@redhat.com
LKML-Reference: <20100104161603L.fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This fixes the problem of the initialization code not correctly
mapping the entire MMIO space on a UV system. A side effect is
the map_high() interface needed to be changed to accommodate
different address and size shifts.
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Mike Habeck <habeck@sgi.com>
Cc: <stable@kernel.org>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <4B479202.7080705@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Someone isn't reading their build output: Move the definition
out of the exported header.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Cc: linux-kernel@vger.kernelorg
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
futexes: Remove rw parameter from get_futex_key()
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Currently, futexes have two problem:
A) The current futex code doesn't handle private file mappings properly.
get_futex_key() uses PageAnon() to distinguish file and
anon, which can cause the following bad scenario:
1) thread-A call futex(private-mapping, FUTEX_WAIT), it
sleeps on file mapping object.
2) thread-B writes a variable and it makes it cow.
3) thread-B calls futex(private-mapping, FUTEX_WAKE), it
wakes up blocked thread on the anonymous page. (but it's nothing)
B) Current futex code doesn't handle zero page properly.
Read mode get_user_pages() can return zero page, but current
futex code doesn't handle it at all. Then, zero page makes
infinite loop internally.
The solution is to use write mode get_user_page() always for
page lookup. It prevents the lookup of both file page of private
mappings and zero page.
Performance concerns:
Probaly very little, because glibc always initialize variables
for futex before to call futex(). It means glibc users never see
the overhead of this patch.
Compatibility concerns:
This patch has few compatibility issues. After this patch,
FUTEX_WAIT require writable access to futex variables (read-only
mappings makes EFAULT). But practically it's not a problem,
glibc always initalizes variables for futexes explicitly - nobody
uses read-only mappings.
Reported-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Darren Hart <dvhltc@us.ibm.com>
Cc: <stable@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Ulrich Drepper <drepper@gmail.com>
LKML-Reference: <20100105162633.45A2.A69D9226@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|\ \ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf tools: Check if /dev/null can be used as the -o gcc argument
perf tools: Move QUIET_STDERR def to before first use
perf: Stop stack frame walking off kernel addresses boundaries
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
At least on Debian PARISC64, using:
acme@parisc:~/git/linux-2.6-tip$ gcc -v
Using built-in specs.
Target: hppa-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian
4.3.4-6' --with-bugurl=file:///usr/share/doc/gcc-4.3/README.Bugs
--enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr
--enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --with-gxx-include-dir=/usr/include/c++/4.3 --program-suffix=-4.3 --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --enable-mpfr --disable-libssp --enable-checking=release --build=hppa-linux-gnu --host=hppa-linux-gnu --target=hppa-linux-gnu Thread model: posix gcc version 4.3.4 (Debian 4.3.4-6)
there are issues about using 'gcc -o /dev/null':
/usr/bin/ld: final link failed: File truncated
collect2: ld returned 1 exit status
So we test that and use /dev/null in environments where it
works, while using an .INTERMEDIATE file on those where it can't
be used, so that the .perf.dev.null file can be used instead and
then deleted when make exits.
Researched-with: Kyle McMartin <kyle@mcmartin.ca>
Researched-with: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1263293910-8484-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
QUIET_STDERR is used when detecting if -fstack-protector-all can
be used.
Noticed while building the perf tools on a Debian PARISC64
machine.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1263293910-8484-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| |/ / /
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
While processing kernel perf callchains, an bad entry can be
considered as a valid stack pointer but not as a kernel address.
In this case, we hang in an endless loop. This can happen in an
x86-32 kernel after processing the last entry in a kernel
stacktrace.
Just stop the stack frame walking after we encounter an invalid
kernel address.
This fixes a hard lockup in x86-32.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1262227945-27014-1-git-send-regression-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|\ \ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tracing/filters: Add comment for match callbacks
tracing/filters: Fix MATCH_FULL filter matching for PTR_STRING
tracing/filters: Fix MATCH_MIDDLE_ONLY filter matching
lib: Introduce strnstr()
tracing/filters: Fix MATCH_END_ONLY filter matching
tracing/filters: Fix MATCH_FRONT_ONLY filter matching
ftrace: Fix MATCH_END_ONLY function filter
tracing/x86: Derive arch from bits argument in recordmcount.pl
ring-buffer: Add rb_list_head() wrapper around new reader page next field
ring-buffer: Wrap a list.next reference with rb_list_head()
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
We should be clear on 2 things:
- the length parameter of a match callback includes
tailing '\0'.
- the string to be searched might not be NULL-terminated.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4B4E8770.7000608@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
MATCH_FULL matching for PTR_STRING is not working correctly:
# echo 'func == vt' > events/bkl/lock_kernel/filter
# echo 1 > events/bkl/lock_kernel/enable
...
# cat trace
Xorg-1484 [000] 1973.392586: lock_kernel: ... func=vt_ioctl()
gpm-1402 [001] 1974.027740: lock_kernel: ... func=vt_ioctl()
We should pass to regex.match(..., len) the length (including '\0')
of the source string instead of the length of the pattern string.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4B4E8763.5070707@cn.fujitsu.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The @str might not be NULL-terminated if it's of type
DYN_STRING or STATIC_STRING, so we should use strnstr()
instead of strstr().
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4B4E8753.2000102@cn.fujitsu.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
It differs strstr() in that it limits the length to be searched
in the first string.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4B4E8743.6030805@cn.fujitsu.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
For '*foo' pattern, we should allow any string ending with
'foo', but event filtering incorrectly disallows strings
like bar_foo_foo:
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4B4E8735.6070604@cn.fujitsu.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
MATCH_FRONT_ONLY actually is a full matching:
# ./perf record -R -f -a -e lock:lock_acquire \
--filter 'name ~rcu_*' sleep 1
# ./perf trace
(no output)
We should pass the length of the pattern string to strncmp().
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4B4E8721.5090301@cn.fujitsu.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
For '*foo' pattern, we should allow any string ending with
'foo', but ftrace filter incorrectly disallows strings
like bar_foo_foo:
# echo '*io' > set_ftrace_filter
# cat set_ftrace_filter | grep 'req_bio_endio'
# cat available_filter_functions | grep 'req_bio_endio'
req_bio_endio
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4B4E870E.6060607@cn.fujitsu.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Let the arch argument be overruled by bits. Otherwise, building of
external modules against a i386 target on a x86-64 host (and likely vice
versa as well) fails unless ARCH=i386 is explicitly passed to make.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
LKML-Reference: <4B4AFE10.8050109@siemens.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| |\ \ \ \
| | | | | |
| | | | | |
| | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/urgent
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
If the very unlikely case happens where the writer moves the head by one
between where the head page is read and where the new reader page
is assigned _and_ the writer then writes and wraps the entire ring buffer
so that the head page is back to what was originally read as the head page,
the page to be swapped will have a corrupted next pointer.
Simple solution is to wrap the assignment of the next pointer with a
rb_list_head().
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
This reference at the end of rb_get_reader_page() was causing off-by-one
writes to the prev pointer of the page after the reader page when that
page is the head page, and therefore the reader page has the RB_PAGE_HEAD
flag in its list.next pointer. This eventually results in a GPF in a
subsequent call to rb_set_head_page() (usually from rb_get_reader_page())
when that prev pointer is dereferenced. The dereferenced register would
characteristically have an address that appears shifted left by one byte
(eg, ffxxxxxxxxxxxxyy instead of ffffxxxxxxxxxxxx) due to being written at
an address one byte too high.
Signed-off-by: David Sharp <dhsharp@google.com>
LKML-Reference: <1262826727-9090-1-git-send-email-dhsharp@google.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Fix divide by zero and broken output. Commit 600ce1a0fa ("fix clock
setting for Samsung SoC Framebuffer") introduced a mandatory refresh
parameter to the platform data for the S3C framebuffer but did not
introduce any validation code, causing existing platforms (none of which
have refresh set) to divide by zero whenever the framebuffer is
configured, generating warnings and unusable output.
Ben Dooks noted several problems with the patch:
- The platform data supplies the pixclk directly and should already
have taken care of the refresh rate.
- The addition of a window ID parameter doesn't help since only the
root framebuffer can control the pixclk.
- pixclk is specified in picoseconds (rather than Hz) as the patch
assumed.
and suggests reverting the commit so do that. Without fixing this no
mainline user of the driver will produce output.
[akpm@linux-foundation.org: don't revert the correct bit]
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: InKi Dae <inki.dae@samsung.com>
Cc: Kyungmin Park <kmpark@infradead.org>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Ben Dooks <ben-linux@fluff.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Fix a problem in NOMMU mmap with ramfs whereby a shared mmap can happen
over the end of a truncation. The problem is that
ramfs_nommu_check_mappings() checks that the reduced file size against the
VMA tree, but not the vm_region tree.
The following sequence of events can cause the problem:
fd = open("/tmp/x", O_RDWR|O_TRUNC|O_CREAT, 0600);
ftruncate(fd, 32 * 1024);
a = mmap(NULL, 32 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
b = mmap(NULL, 16 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
munmap(a, 32 * 1024);
ftruncate(fd, 16 * 1024);
c = mmap(NULL, 32 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
Mapping 'a' creates a vm_region covering 32KB of the file. Mapping 'b'
sees that the vm_region from 'a' is covering the region it wants and so
shares it, pinning it in memory.
Mapping 'a' then goes away and the file is truncated to the end of VMA
'b'. However, the region allocated by 'a' is still in effect, and has
_not_ been reduced.
Mapping 'c' is then created, and because there's a vm_region covering the
desired region, get_unmapped_area() is _not_ called to repeat the check,
and the mapping is granted, even though the pages from the latter half of
the mapping have been discarded.
However:
d = mmap(NULL, 16 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
Mapping 'd' should work, and should end up sharing the region allocated by
'a'.
To deal with this, we shrink the vm_region struct during the truncation,
lest do_mmap_pgoff() take it as licence to share the full region
automatically without calling the get_unmapped_area() file op again.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Fix the race between the truncation of a ramfs file and an attempt to make
a shared mmap of region of that file.
The problem is that do_mmap_pgoff() calls f_op->get_unmapped_area() to
verify that the file region is made of contiguous pages and to find its
base address - but there isn't any locking to guarantee this region until
vma_prio_tree_insert() is called by add_vma_to_mm().
Note that moving the functionality into f_op->mmap() doesn't help as that
is also called before vma_prio_tree_insert().
Instead make ramfs_nommu_check_mappings() grab nommu_region_sem whilst it
does its checks. This means that this function will wait whilst mmaps
take place.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
get_unmapped_area() is unnecessary for NOMMU as no-one calls it.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
In split_vma(), there's no need to check if the VMA being split has a
region that's in use by more than one VMA because:
(1) The preceding test prohibits splitting of non-anonymous VMAs and regions
(eg: file or chardev backed VMAs).
(2) Anonymous regions can't be mapped multiple times because there's no handle
by which to refer to the already existing region.
(3) If a VMA has previously been split, then the region backing it has also
been split into two regions, each of usage 1.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
The vm_usage count field in struct vm_region does not need to be atomic as
it's only even modified whilst nommu_region_sem is write locked.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Commit c4caa778157dbbf04116f0ac2111e389b5cd7a29 ("file
->get_unmapped_area() shouldn't duplicate work of get_unmapped_area()")
broke SYSV SHM for NOMMU by taking away the pointer to
shm_get_unmapped_area() from shm_file_operations.
Put it back conditionally on CONFIG_MMU=n.
file->f_ops->get_unmapped_area() is used to find out the base address for a
mapping of a mappable chardev device or mappable memory-based file (such as a
ramfs file). It needs to be called prior to file->f_ops->mmap() being called.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
The function prototype mismatches in call stack:
[<ffffffff81494268>] print_block_size+0x58/0x60
[<ffffffff81487e3f>] sysdev_class_show+0x1f/0x30
[<ffffffff811d629b>] sysfs_read_file+0xcb/0x1f0
[<ffffffff81176328>] vfs_read+0xc8/0x180
Due to prototype mismatch, print_block_size() will sprintf() into
*attribute instead of *buf, hence user space will read the initial
zeros from *buf:
$ hexdump /sys/devices/system/memory/block_size_bytes
0000000 0000 0000 0000 0000
0000008
After patch:
cat /sys/devices/system/memory/block_size_bytes
0x8000000
This complements commits c29af9636 and 4a0b2b4dbe.
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: "Zheng, Shaohui" <shaohui.zheng@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Current mem_cgroup_force_empty() only ensures mem->res.usage == 0 on
success. But this doesn't guarantee memcg's LRU is really empty, because
there are some cases in which !PageCgrupUsed pages exist on memcg's LRU.
For example:
- Pages can be uncharged by its owner process while they are on LRU.
- race between mem_cgroup_add_lru_list() and __mem_cgroup_uncharge_common().
So there can be a case in which the usage is zero but some of the LRUs are not empty.
OTOH, mem_cgroup_del_lru_list(), which can be called asynchronously with
rmdir, accesses the mem_cgroup, so this access can cause a problem if it
races with rmdir because the mem_cgroup might have been freed by rmdir.
Actually, I saw a bug which seems to be caused by this race.
[1530745.949906] BUG: unable to handle kernel NULL pointer dereference at 0000000000000230
[1530745.950651] IP: [<ffffffff810fbc11>] mem_cgroup_del_lru_list+0x30/0x80
[1530745.950651] PGD 3863de067 PUD 3862c7067 PMD 0
[1530745.950651] Oops: 0002 [#1] SMP
[1530745.950651] last sysfs file: /sys/devices/system/cpu/cpu7/cache/index1/shared_cpu_map
[1530745.950651] CPU 3
[1530745.950651] Modules linked in: configs ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp nfsd nfs_acl auth_rpcgss exportfs autofs4 hidp rfcomm l2cap crc16 bluetooth lockd sunrpc ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i cxgb3 mdio libiscsi_tcp libiscsi scsi_transport_iscsi dm_mirror dm_multipath scsi_dh video output sbs sbshc battery ac lp kvm_intel kvm sg ide_cd_mod cdrom serio_raw tpm_tis tpm tpm_bios acpi_memhotplug button parport_pc parport rtc_cmos rtc_core rtc_lib e1000 i2c_i801 i2c_core pcspkr dm_region_hash dm_log dm_mod ata_piix libata shpchp megaraid_mbox sd_mod scsi_mod megaraid_mm ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: freq_table]
[1530745.950651] Pid: 19653, comm: shmem_test_02 Tainted: G M 2.6.32-mm1-00701-g2b04386 #3 Express5800/140Rd-4 [N8100-1065]
[1530745.950651] RIP: 0010:[<ffffffff810fbc11>] [<ffffffff810fbc11>] mem_cgroup_del_lru_list+0x30/0x80
[1530745.950651] RSP: 0018:ffff8803863ddcb8 EFLAGS: 00010002
[1530745.950651] RAX: 00000000000001e0 RBX: ffff8803abc02238 RCX: 00000000000001e0
[1530745.950651] RDX: 0000000000000000 RSI: ffff88038611a000 RDI: ffff8803abc02238
[1530745.950651] RBP: ffff8803863ddcc8 R08: 0000000000000002 R09: ffff8803a04c8643
[1530745.950651] R10: 0000000000000000 R11: ffffffff810c7333 R12: 0000000000000000
[1530745.950651] R13: ffff880000017f00 R14: 0000000000000092 R15: ffff8800179d0310
[1530745.950651] FS: 0000000000000000(0000) GS:ffff880017800000(0000) knlGS:0000000000000000
[1530745.950651] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[1530745.950651] CR2: 0000000000000230 CR3: 0000000379d87000 CR4: 00000000000006e0
[1530745.950651] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[1530745.950651] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[1530745.950651] Process shmem_test_02 (pid: 19653, threadinfo ffff8803863dc000, task ffff88038612a8a0)
[1530745.950651] Stack:
[1530745.950651] ffffea00040c2fe8 0000000000000000 ffff8803863ddd98 ffffffff810c739a
[1530745.950651] <0> 00000000863ddd18 000000000000000c 0000000000000000 0000000000000000
[1530745.950651] <0> 0000000000000002 0000000000000000 ffff8803863ddd68 0000000000000046
[1530745.950651] Call Trace:
[1530745.950651] [<ffffffff810c739a>] release_pages+0x142/0x1e7
[1530745.950651] [<ffffffff810c778f>] ? pagevec_move_tail+0x6e/0x112
[1530745.950651] [<ffffffff810c781e>] pagevec_move_tail+0xfd/0x112
[1530745.950651] [<ffffffff810c78a9>] lru_add_drain+0x76/0x94
[1530745.950651] [<ffffffff810dba0c>] exit_mmap+0x6e/0x145
[1530745.950651] [<ffffffff8103f52d>] mmput+0x5e/0xcf
[1530745.950651] [<ffffffff81043ea8>] exit_mm+0x11c/0x129
[1530745.950651] [<ffffffff8108fb29>] ? audit_free+0x196/0x1c9
[1530745.950651] [<ffffffff81045353>] do_exit+0x1f5/0x6b7
[1530745.950651] [<ffffffff8106133f>] ? up_read+0x2b/0x2f
[1530745.950651] [<ffffffff8137d187>] ? lockdep_sys_exit_thunk+0x35/0x67
[1530745.950651] [<ffffffff81045898>] do_group_exit+0x83/0xb0
[1530745.950651] [<ffffffff810458dc>] sys_exit_group+0x17/0x1b
[1530745.950651] [<ffffffff81002c1b>] system_call_fastpath+0x16/0x1b
[1530745.950651] Code: 54 53 0f 1f 44 00 00 83 3d cc 29 7c 00 00 41 89 f4 75 63 eb 4e 48 83 7b 08 00 75 04 0f 0b eb fe 48 89 df e8 18 f3 ff ff 44 89 e2 <48> ff 4c d0 50 48 8b 05 2b 2d 7c 00 48 39 43 08 74 39 48 8b 4b
[1530745.950651] RIP [<ffffffff810fbc11>] mem_cgroup_del_lru_list+0x30/0x80
[1530745.950651] RSP <ffff8803863ddcb8>
[1530745.950651] CR2: 0000000000000230
[1530745.950651] ---[ end trace c3419c1bb8acc34f ]---
[1530745.950651] Fixing recursive fault but reboot is needed!
The problem here is pages on LRU may contain pointer to stale memcg. To
make res->usage to be 0, all pages on memcg must be uncharged or moved to
another(parent) memcg. Moved page_cgroup have already removed from
original LRU, but uncharged page_cgroup contains pointer to memcg withou
PCG_USED bit. (This asynchronous LRU work is for improving performance.)
If PCG_USED bit is not set, page_cgroup will never be added to memcg's
LRU. So, about pages not on LRU, they never access stale pointer. Then,
what we have to take care of is page_cgroup _on_ LRU list. This patch
fixes this problem by making mem_cgroup_force_empty() visit all LRUs
before exiting its loop and guarantee there are no pages on its LRU.
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|