op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message	Author	Age	Files	Lines
*	[ARM] add Marvell Kirkwood (88F6000) SoC support	Saeed Bishara	2008-06-22	14	-2/+1006
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The Marvell Kirkwood (88F6000) is a family of ARM SoCs based on a Shiva CPU core, and features a DDR2 controller, a x1 PCIe interface, a USB 2.0 interface, a SPI controller, a crypto accelerator, a TS interface, and IDMA/XOR engines, and depending on the model, also features one or two Gigabit Ethernet interfaces, two SATA II interfaces, one or two TWSI interfaces, one or two UARTs, a TDM/SLIC interface, a NAND controller, an I2S/SPDIF interface, and an SDIO interface. This patch adds support for the Marvell DB-88F6281-BP Development Board and the RD-88F6192-NAS and the RD-88F6281 Reference Designs, enabling support for the PCIe interface, the USB interface, the ethernet interfaces, the SATA interfaces, the TWSI interfaces, the UARTs, and the NAND controller. Signed-off-by: Saeed Bishara <saeed@marvell.com> Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
*	[ARM] Feroceon: 88fr131 support	Lennert Buytenhek	2008-06-22	1	-0/+30
\| \| \| \| \| \| \|	Add support for the Shiva 88fr131 CPU core as found in e.g. the Marvell Kirkwood family of ARM SoCs. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
*	[ARM] Feroceon: L2 cache support	Lennert Buytenhek	2008-06-22	4	-1/+354
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds support for the unified Feroceon L2 cache controller as found in e.g. the Marvell Kirkwood and Marvell Discovery Duo families of ARM SoCs. Note that: - Page table walks are outer uncacheable on Kirkwood and Discovery Duo, since the ARMv5 spec provides no way to indicate outer cacheability of page table walks (specifying it in TTBR[4:3] is an ARMv6+ feature). This requires adding L2 cache clean instructions to proc-feroceon.S (dcache_clean_area(), set_pte()) as well as to tlbflush.h ({flush,clean}_pmd_entry()). The latter case is handled by defining a new TLB type (TLB_FEROCEON) which is almost identical to the v4wbi one but provides a TLB_L2CLEAN_FR flag. - The Feroceon L2 cache controller supports L2 range (i.e. 'clean L2 range by MVA' and 'invalidate L2 range by MVA') operations, and this patch uses those range operations for all Linux outer cache operations, as they are faster than the regular per-line operations. L2 range operations are not interruptible on this hardware, which avoids potential livelock issues, but can be bad for interrupt latency, so there is a compile-time tunable (MAX_RANGE_SIZE) which allows you to select the maximum range size to operate on at once. (Valid range is between one cache line and one 4KiB page, and must be a multiple of the line size.) Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
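The MAX_RANGE_SIZE tunable described above can be pictured as a chunking loop. What follows is a minimal sketch in plain C, not the code from the patch: `l2_range_op_t`, `for_each_chunk()`, and the constant values are hypothetical, and on real hardware each chunk would be one uninterruptible range operation rather than a callback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration only. The commit message says the limit
 * must be a multiple of the line size, between one line and one 4KiB page. */
#define CACHE_LINE_SIZE 32u
#define MAX_RANGE_SIZE  1024u

#if (MAX_RANGE_SIZE % CACHE_LINE_SIZE) != 0
#error "MAX_RANGE_SIZE must be a multiple of the cache line size"
#endif

/* Hypothetical callback standing in for one range operation on [start, end). */
typedef void (*l2_range_op_t)(uintptr_t start, uintptr_t end);

/* Split [start, end) into chunks of at most MAX_RANGE_SIZE bytes, call op on
 * each chunk (if op is non-NULL), and return the number of chunks. */
static size_t for_each_chunk(uintptr_t start, uintptr_t end, l2_range_op_t op)
{
	size_t chunks = 0;

	assert(start <= end);
	while (start < end) {
		uintptr_t chunk_end = end;

		if (end - start > MAX_RANGE_SIZE)
			chunk_end = start + MAX_RANGE_SIZE;
		if (op)
			op(start, chunk_end);
		/* Pending interrupts can be serviced here, between chunks. */
		start = chunk_end;
		chunks++;
	}
	return chunks;
}
```

A smaller MAX_RANGE_SIZE means more chunks and more loop overhead, but a shorter worst-case window during which interrupts cannot be taken.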
*	[ARM] Feroceon: L1 cache range operation support	Stanislav Samsonov	2008-06-22	1	-1/+68
\| \| \| \| \| \| \| \| \| \| \|	This patch adds support for the L1 D cache range operations that are supported by the Marvell Discovery Duo and Marvell Kirkwood ARM SoCs. Signed-off-by: Stanislav Samsonov <samsonov@marvell.com> Acked-by: Saeed Bishara <saeed@marvell.com> Reviewed-by: Nicolas Pitre <nico@marvell.com> Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
*	[ARM] Loki: add defconfig	Lennert Buytenhek	2008-06-22	1	-0/+1147
\| \| \| \|	Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
*	[ARM] add Marvell Loki (88RC8480) SoC support	Lennert Buytenhek	2008-06-22	11	-1/+614
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The Marvell Loki (88RC8480) is an ARM SoC based on a Feroceon CPU core running at between 400 MHz and 1.0 GHz, and features a 64 bit DDR controller, 512K of internal SRAM, two x4 PCI-Express ports, two Gigabit Ethernet ports, two 4x SAS/SATA controllers, two UARTs, two TWSI controllers, and IDMA/XOR engines. This patch adds support for the Marvell LB88RC8480 Development Board, enabling the use of the PCIe interfaces, the ethernet interfaces, the TWSI interfaces and the UARTs. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
*	[ARM] Orion: add a separate BRIDGE_INT_TIMER1_CLR define	Ke Wei	2008-06-22	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Some Feroceon-based SoCs have an MBUS bridge interrupt controller that requires writing a one instead of a zero to clear edge interrupt sources such as timer expiry. This patch adds a new BRIDGE_INT_TIMER1_CLR define, which platform code can set to either ~BRIDGE_INT_TIMER1 (write-zero-to-clear) or BRIDGE_INT_TIMER1 (write-one-to-clear) depending on the platform. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
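The difference between the two clearing conventions can be shown with a small simulation. This is only a sketch: the bit value chosen for `BRIDGE_INT_TIMER1`, the `TIMER1_CLR_*` names, and the `write_w0c()`/`write_w1c()` helpers are assumptions made for illustration and model the register behaviour in software.

```c
#include <assert.h>
#include <stdint.h>

#define BRIDGE_INT_TIMER1 0x0004u  /* assumed bit value, for illustration */

/* A platform would define BRIDGE_INT_TIMER1_CLR as exactly one of these. */
#define TIMER1_CLR_WRITE_ZERO (~BRIDGE_INT_TIMER1)  /* write-zero-to-clear */
#define TIMER1_CLR_WRITE_ONE  (BRIDGE_INT_TIMER1)   /* write-one-to-clear */

/* Simulated cause register where writing 0 to a bit clears that bit. */
static uint32_t write_w0c(uint32_t cause, uint32_t value)
{
	return cause & value;
}

/* Simulated cause register where writing 1 to a bit clears that bit. */
static uint32_t write_w1c(uint32_t cause, uint32_t value)
{
	return cause & ~value;
}

/* With the matching value, both conventions clear only the timer bit. */
static void timer1_clear_example(void)
{
	uint32_t pending = 0x0006u;  /* timer1 plus one unrelated source */

	assert(write_w0c(pending, TIMER1_CLR_WRITE_ZERO) == 0x0002u);
	assert(write_w1c(pending, TIMER1_CLR_WRITE_ONE) == 0x0002u);
}
```

Using the wrong value for a given controller is harmful in both directions: on a write-zero-to-clear register, writing `BRIDGE_INT_TIMER1` leaves the timer pending and clears every other source instead.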
*	[ARM] Feroceon: allow more old Feroceon IDs	Ke Wei	2008-06-22	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	There are a couple more Feroceon-based SoCs out in the field that use different Variant and Architecture fields in their Main ID registers -- this patch tweaks the processor match/mask in proc-feroceon.S to catch those SoCs as well. Signed-off-by: Ke Wei <kewei@marvell.com> Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
*	[ARM] Feroceon: catch other Feroceon CPU IDs in head.S	Nicolas Pitre	2008-06-22	1	-2/+2
\| \| \| \| \| \| \| \| \|	Tweak the Feroceon match/mask in arch/arm/boot/compressed/head.S to match a couple of newer Feroceon cores (such as the 88fr571vd with CPU ID 0x56155710, and the 88fr131 with CPU ID 0x56251310) as well. Signed-off-by: Nicolas Pitre <nico@marvell.com> Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
*	[ARM] Feroceon: speed up flushing of the entire cache	Nicolas Pitre	2008-06-22	1	-11/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Flushing the L1 D cache with a test/clean/invalidate loop is very easy in software, but it is not the quickest way of doing it, as there is a lot of overhead involved in re-scanning the cache from the beginning every time we hit a dirty line. This patch makes proc-feroceon.S use "clean+invalidate by set/way" loops according to possible cache configuration of Feroceon CPUs (either direct-mapped or 4-way set associative). Signed-off-by: Nicolas Pitre <nico@marvell.com> Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
*	[ARM] Orion: nuke orion5x_{read,write}	Lennert Buytenhek	2008-06-22	6	-57/+57
\| \| \| \| \| \|	Nuke the Orion-specific orion5x_{read,write} wrappers. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
*	[ARM] Orion: add Maxtor Shared Storage II support	Sylver Bruneau	2008-06-22	3	-0/+277
\| \| \| \| \| \| \|	This patch adds support for the Maxtor Shared Storage II hardware. Signed-off-by: Sylver Bruneau <sylver.bruneau@googlemail.com> Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
*	[ARM] Orion: add Technologic Systems TS-78xx support	Alexander Clouter	2008-06-22	3	-0/+284
\| \| \| \| \|	Signed-off-by: Alexander Clouter <alex@digriz.org.uk> Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
*	[ARM] Orion: remove code duplication in TS209 and TS409 setup files	Sylver Bruneau	2008-06-22	5	-239/+168
\| \| \| \| \|	Signed-off-by: Sylver Bruneau <sylver.bruneau@googlemail.com> Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
*	[ARM] Orion: add HP Media Vault mv2120 support	Martin Michlmayr	2008-06-22	3	-0/+201
\| \| \| \|	Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
*	[ARM] Orion: add Linksys WRT350N v2 support	Lennert Buytenhek	2008-06-22	3	-0/+180
\| \| \| \| \| \|	Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Tested-by: Dirk Teurlings <dirk@upexia.nl> Tested-by: Peter van Valderen <p.v.valderen@gmail.com>
*	[ARM] Orion: add 88F5181L (Orion-VoIP) support	Lennert Buytenhek	2008-06-22	2	-2/+5
\| \| \| \| \|	Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Acked-by: Russell King <linux@arm.linux.org.uk>
*	[ARM] Orion: add QNAP TS-409 support	Sylver Bruneau	2008-06-22	3	-0/+392
\| \| \| \|	Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
*	[ARM] Orion: implement power-off method for Kurobox Pro	Sylver Bruneau	2008-06-22	1	-4/+143
\| \| \| \| \| \| \| \| \| \|	This patch implements the communication with the microcontroller on the Kurobox Pro and Linkstation Pro/Live boards. This allows sending the commands needed to power off the board correctly. Signed-off-by: Sylver Bruneau <sylver.bruneau@googlemail.com> Acked-by: Russell King <linux@arm.linux.org.uk> Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
*	[ARM] Orion: avoid setting ->force_phy_addr	Lennert Buytenhek	2008-06-22	5	-5/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The mv643xx_eth platform data field ->force_phy_addr only needs to be set if the passed-in ->phy_addr field is zero (to distinguish the case of not having specified a phy address (force_phy_addr = 0) from the case where a phy address of zero needs to be used (force_phy_addr = 1.)) Also, the ->force_phy_addr field will hopefully disappear in a future mv643xx_eth reorganisation. Therefore, this patch deletes the ->force_phy_addr field initialiser from all Orion board code. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
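The ambiguity that ->force_phy_addr resolves can be sketched as follows. `struct eth_pdata_sketch` and `effective_phy_addr()` are hypothetical, reduced stand-ins and not the actual mv643xx_eth definitions; the point is only that a zero ->phy_addr needs the extra flag, while a non-zero one does not.

```c
#include <assert.h>

/* Hypothetical, reduced stand-in for the mv643xx_eth platform data. */
struct eth_pdata_sketch {
	int phy_addr;
	int force_phy_addr;
};

/* Return the PHY address that would be used, or default_addr when the board
 * code did not specify one (phy_addr == 0 and force_phy_addr == 0). */
static int effective_phy_addr(const struct eth_pdata_sketch *pd, int default_addr)
{
	assert(pd != 0);
	if (pd->phy_addr != 0 || pd->force_phy_addr)
		return pd->phy_addr;
	return default_addr;
}
```

Under this reading, board code that passes a non-zero PHY address gets the same result with or without the flag, which is why the initialiser can be dropped.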
*	[ARM] Orion: remove error printks in ->map_irq() implementations	Lennert Buytenhek	2008-06-22	2	-4/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If all PCI devices are working as expected, the error printks in the various implementations of ->map_irq() don't really provide any useful info. And if something is not working as expected, turning on pci=debug gives you more useful information than the printk calls in ->map_irq(), since the former also tells you which devices _did_ get IRQs successfully assigned. Therefore, delete these printks entirely. Spotted by Russell King. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Acked-by: Russell King <linux@arm.linux.org.uk>
*	[ARM] Orion: rework MPP handling	Lennert Buytenhek	2008-06-22	10	-114/+376
\| \| \| \| \| \| \| \| \| \| \| \|	Instead of having board code poke directly into the MPP configuration registers, and separately calling orion5x_gpio_set_valid_pins() to indicate which MPP pins can be used as GPIO pins, introduce a helper function for configuring the roles of each of the MPP pins, and have that helper function handle gpio validity internally. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Acked-by: Sylver Bruneau <sylver.bruneau@googlemail.com> Acked-by: Russell King <linux@arm.linux.org.uk>
*	[ARM] Orion: move setting up PCIe WA window into PCIe setup path	Lennert Buytenhek	2008-06-22	5	-25/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	It makes no sense to do PCIe WA window setup in the individual board support files while the decision whether or not to use the PCIe WA access method is made in a different place, in the PCIe support code. This patch moves the configuration of a PCIe WA window from the individual Orion board support files to the central Orion PCIe support code. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Acked-by: Russell King <linux@arm.linux.org.uk>
*	[ARM] Orion: move EHCI/I2C/UART peripheral init into board code	Lennert Buytenhek	2008-06-22	7	-195/+240
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch moves initialisation of EHCI/I2C/UART platform devices from the common orion5x_init() into the board support code. The rationale behind this is that only the board support code knows whether certain peripherals have been brought out on the board, and not initialising peripherals that haven't been brought out is desirable for example: - to reduce user confusion (e.g. seeing both 'eth0' and 'eth1' appear while there is only one ethernet port on the board); and - to allow for future power savings (peripherals that have not been brought out can be clock gated off entirely). Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Acked-by: Russell King <linux@arm.linux.org.uk>
*	[ARM] Orion: top-level IRQs are level-triggered	Lennert Buytenhek	2008-06-22	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make it clear that Orion top-level IRQs are level-triggered. This means that we don't need an ->ack() handler, or at least, we don't need the ->ack() handler (or the acking part of the ->mask_ack() handler) to actually do anything. Given that, we might as well point our ->mask_ack() handler at the ->mask() handler instead of providing a dummy ->ack() handler, since providing a ->mask_ack() handler on level IRQ sources will prevent ->ack() from ever being called. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Acked-by: Russell King <linux@arm.linux.org.uk>
*	[ARM] Feroceon: annotate 88fr531-vd CPU entries	Lennert Buytenhek	2008-06-22	1	-4/+9
\| \| \| \| \| \| \| \| \|	Annotate the entries for the 88fr531-vd CPU core in arch/arm/boot/compressed/head.S and arch/arm/mm/proc-feroceon.S with the full name of the core. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Acked-by: Russell King <linux@arm.linux.org.uk>
*	[ARM] Orion: DRAM mapping granularity is 64KiB, not 16MiB	Lennert Buytenhek	2008-06-22	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	The DRAM base address and size fields in the CPU's MBUS bridge have 64KiB granularity, instead of the currently used 16MiB. Since all of the currently supported MBUS peripherals support 64KiB granularity as well, this patch changes the Orion address map code to stop rounding base addresses down and sizes up to multiples of 16MiB. Found by Ke Wei <kewei@marvell.com>. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Acked-by: Russell King <linux@arm.linux.org.uk>
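The effect of the coarser granularity is easy to see with plain rounding arithmetic. The helpers below are a sketch with hypothetical names (`round_down_to()`, `round_up_to()`), not the Orion address map code; they only illustrate how much a base or size moves when rounded to 16MiB instead of 64KiB.

```c
#include <assert.h>
#include <stdint.h>

#define GRANULE_64K (64u * 1024u)
#define GRANULE_16M (16u * 1024u * 1024u)

/* Round value down to a multiple of granule (granule must be a power of two). */
static uint32_t round_down_to(uint32_t value, uint32_t granule)
{
	assert(granule != 0 && (granule & (granule - 1)) == 0);
	return value & ~(granule - 1);
}

/* Round value up to a multiple of granule (granule must be a power of two).
 * Overflow near 4GiB is not handled in this sketch. */
static uint32_t round_up_to(uint32_t value, uint32_t granule)
{
	assert(granule != 0 && (granule & (granule - 1)) == 0);
	return (value + granule - 1) & ~(granule - 1);
}
```

For example, a 24MiB size (0x01800000) is already a multiple of 64KiB and stays unchanged, whereas rounding it up to 16MiB granularity yields 32MiB (0x02000000).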
*	[ARM] Orion: make window setup a little more safe	Lennert Buytenhek	2008-06-22	1	-5/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, Orion window setup uses hardcoded window indexes for each of the boot/cs0/cs1/cs2/PCIe WA windows. The static window allocation used can clash if board support code will ever attempt to configure both a dev2 and a PCIe WA window, as both of those use CPU mbus window #7 at present. This patch keeps track of the last used window, and opens subsequently requested windows sequentially, starting from 4. (Windows 0-3 are used as MEM/IO windows for the PCI/PCIe buses.) Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
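The sequential allocation scheme can be sketched as a simple counter. The names `struct win_allocator`, `win_allocator_init()`, and `win_alloc()` are hypothetical, and the total of eight windows is an assumption inferred from the mention of window #7; the actual patch may track the counter differently.

```c
#include <assert.h>

#define FIRST_DYNAMIC_WIN 4  /* windows 0-3 are the PCI/PCIe MEM/IO windows */
#define NUM_CPU_WINS      8  /* assumed total, given that window #7 exists */

/* Hypothetical allocator state: index of the next window to hand out. */
struct win_allocator {
	int next_win;
};

static void win_allocator_init(struct win_allocator *a)
{
	a->next_win = FIRST_DYNAMIC_WIN;
}

/* Return the next free window index, or -1 when all windows are in use. */
static int win_alloc(struct win_allocator *a)
{
	assert(a->next_win >= FIRST_DYNAMIC_WIN);
	if (a->next_win >= NUM_CPU_WINS)
		return -1;
	return a->next_win++;
}
```

Because each request receives the next unused index, a dev2 window and a PCIe WA window can no longer end up sharing the same window number.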
*	[ARM] Orion: fix various whitespace and coding style issues	Lennert Buytenhek	2008-06-22	8	-141/+132
\| \| \| \| \| \| \| \| \| \|	More cosmetic cleanup: - Replace 8-space indents by proper tab indents. - In structure initialisers, use a trailing comma for every member. - Collapse "},\n{" in structure initialisers to "}, {". Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Acked-by: Russell King <linux@arm.linux.org.uk>
*	[ARM] cache align memset and memzero	Nicolas Pitre	2008-06-22	2	-0/+90
\| \| \| \| \| \| \| \|	This is a natural extension following the previous patch. Non Feroceon based targets are unchanged. Signed-off-by: Nicolas Pitre <nico@marvell.com> Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
*	[ARM] cache align destination pointer when copying memory for some processors	Nicolas Pitre	2008-06-22	2	-20/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The implementation for memory copy functions on ARM had a (disabled) provision for aligning the source pointer before loading registers with data. Turns out that aligning the _destination_ pointer is much more useful, as the read side is already sufficiently helped with the use of preload. So this changes the definition of the CALGN() macro to target the destination pointer instead, and turns it on for Feroceon processors where the gain is very noticeable. Signed-off-by: Nicolas Pitre <nico@marvell.com> Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
*	[ARM] fix cache alignment code in memset.S	Nicolas Pitre	2008-06-22	1	-1/+1
\| \| \| \| \| \| \|	This code is currently disabled, which explains why no one was affected. Signed-off-by: Nicolas Pitre <nico@marvell.com> Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
*	[ARM] latencytop support	Nicolas Pitre	2008-06-22	2	-4/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Available for !SMP only at the moment. From Russell: \|Basically, if a thread is running on a CPU, thread_saved_fp() is invalid. \|So, the question is: what guarantees do we have here that 'tsk' is not \|running on another CPU? Signed-off-by: Nicolas Pitre <nico@marvell.com> Tested-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
*	[ARM] Orion: update defconfig to 2.6.26-rc4	Nicolas Pitre	2008-06-22	1	-122/+187
\| \| \| \| \|	Signed-off-by: Nicolas Pitre <nico@marvell.com> Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
*	Merge branch 'x86-fixes-for-linus' of ↵	Linus Torvalds	2008-06-20	5	-13/+22
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, geode: add a VSA2 ID for General Software x86: use BOOTMEM_EXCLUSIVE on 32-bit x86, 32-bit: fix boot failure on TSC-less processors x86: fix NULL pointer deref in __switch_to x86: set PAE PHYSICAL_MASK_SHIFT to 44 bits.
\| *	x86, geode: add a VSA2 ID for General Software	Jordan Crouse	2008-06-19	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	General Software writes their own VSA2 module for their version of the Geode BIOS, which returns a different ID than the standard VSA2. This was causing the framebuffer driver to break for most GSW boards. Signed-off-by: Jordan Crouse <jordan.crouse@amd.com> Cc: tglx@linutronix.de Cc: linux-geode@lists.infradead.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
\| *	x86: use BOOTMEM_EXCLUSIVE on 32-bit	Bernhard Walle	2008-06-19	1	-2/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch uses the BOOTMEM_EXCLUSIVE for crashkernel reservation also for i386 and prints an error message on failure. The patch is still for 2.6.26 since it is only bug fixing. The unification of reserve_crashkernel() between i386 and x86_64 should be done for 2.6.27. Signed-off-by: Bernhard Walle <bwalle@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: <stable@kernel.org>
\| *	x86, 32-bit: fix boot failure on TSC-less processors	Mikael Pettersson	2008-06-19	1	-10/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Booting 2.6.26-rc6 on my 486 DX/4 fails with a "BUG: Int 6" (invalid opcode) and a kernel halt immediately after the kernel has been uncompressed. The BUG shows EIP pointing to an rdtsc instruction in native_read_tsc(), invoked from native_sched_clock(). (This error occurs so early that not even the serial console can capture it.) A bisection showed that this bug first occurs in 2.6.26-rc3-git7, via commit 9ccc906c97e34fd91dc6aaf5b69b52d824386910: >x86: distangle user disabled TSC from unstable > >tsc_enabled is set to 0 from the command line switch "notsc" and from >the mark_tsc_unstable code. Seperate those functionalities and replace >tsc_enable with tsc_disable. This makes also the native_sched_clock() >decision when to use TSC understandable. > >Preparatory patch to solve the sched_clock() issue on 32 bit. > >Signed-off-by: Thomas Gleixner <tglx@linutronix.de> The core reason for this bug is that native_sched_clock() gets called before tsc_init(). Before the commit above, tsc_32.c used a "tsc_enabled" variable which defaulted to 0 == disabled, and which only got enabled late in tsc_init(). Thus early calls to native_sched_clock() would skip the TSC and use jiffies instead. After the commit above, tsc_32.c uses a "tsc_disabled" variable which defaults to 0, meaning that the TSC is Ok to use. Early calls to native_sched_clock() now erroneously try to use the TSC on !cpu_has_tsc processors, leading to invalid opcode exceptions. My proposed fix is to initialise tsc_disabled to a "soft disabled" state distinct from the hard disabled state set up by the "notsc" kernel option. This fixes the native_sched_clock() problem. 
It also allows tsc_init() to be simplified: instead of setting tsc_disabled = 1 on every error return, we just set tsc_disabled = 0 once when all checks have succeeded. I've verified that this lets my 486 boot again. I've also verified that a Core2 machine still uses the TSC as clocksource after the patch. Signed-off-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: Ingo Molnar <mingo@elte.hu>
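The three-state idea proposed above can be sketched in a few lines. This is an illustration under stated assumptions, not the patch itself: the numeric encoding (-1 for soft disabled, 1 for "notsc"), `struct tsc_state`, and all function names here are hypothetical, and `calibration_ok` stands in for whatever checks tsc_init() performs.

```c
#include <assert.h>

/* Assumed encoding of the three states described in the commit message. */
enum {
	TSC_SOFT_DISABLED = -1,  /* boot default: not yet verified as usable */
	TSC_USABLE        = 0,   /* all checks in the init path succeeded */
	TSC_HARD_DISABLED = 1    /* user passed "notsc" */
};

struct tsc_state {
	int tsc_disabled;
};

static void tsc_state_init(struct tsc_state *s)
{
	s->tsc_disabled = TSC_SOFT_DISABLED;
}

/* Models the "notsc" command line option. */
static void notsc_option(struct tsc_state *s)
{
	s->tsc_disabled = TSC_HARD_DISABLED;
}

/* Models the check an early sched_clock() caller would make. */
static int may_use_tsc(const struct tsc_state *s)
{
	return s->tsc_disabled == TSC_USABLE;
}

/* Error paths simply return; the state is set to usable exactly once. */
static void tsc_init_sketch(struct tsc_state *s, int cpu_has_tsc, int calibration_ok)
{
	assert(s != 0);
	if (!cpu_has_tsc || s->tsc_disabled == TSC_HARD_DISABLED)
		return;
	if (!calibration_ok)
		return;
	s->tsc_disabled = TSC_USABLE;
}
```

With this default, a call made before initialisation sees a disabled TSC and falls back to the jiffies-based clock, which is the behaviour the 486 needs.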
\| *	x86: fix NULL pointer deref in __switch_to	Suresh Siddha	2008-06-19	2	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Patrick McHardy reported a crash: > > I get this oops once a day, its apparently triggered by something > > run by cron, but the process is a different one each time. > > > > Kernel is -git from yesterday shortly before the -rc6 release > > (last commit is the usb-2.6 merge, the x86 patches are missing), > > .config is attached. > > > > I'll retry with current -git, but the patches that have gone in > > since I last updated don't look related. > > > > [62060.043009] BUG: unable to handle kernel NULL pointer dereference at > > 000001ff > > [62060.043009] IP: [<c0102a9b>] __switch_to+0x2f/0x118 > > [62060.043009] pde = 00000000 > > [62060.043009] Oops: 0002 [#1] PREEMPT Vegard Nossum analyzed it: > This decodes to > > 0: 0f ae 00 fxsave (%eax) > > so it's related to the floating-point context. This is the exact > location of the crash: > > $ addr2line -e arch/x86/kernel/process_32.o -i ab0 > include/asm/i387.h:232 > include/asm/i387.h:262 > arch/x86/kernel/process_32.c:595 > > ...so it looks like prev_task->thread.xstate->fxsave has become NULL. > Or maybe it never had any other value. Somehow (as described below) TS_USEDFPU is set but the fpu is not allocated or freed. Another possible FPU pre-emption issue with the sleazy FPU optimization which was benign before but not so anymore, with the dynamic FPU allocation patch. New task is getting exec'd and it is prempted at the below point. flush_thread() { ... / * Forget coprocessor state.. */ clear_fpu(tsk); <----- Preemption point clear_used_math(); ... 
} Now when it context switches in again, as the used_math() is still set and fpu_counter can be > 5, we will do a math_state_restore() which sets the task's TS_USEDFPU. After it continues from the above preemption point it does clear_used_math() and much later free_thread_xstate(). Now, at the next context switch, it is quite possible that xstate is null, used_math() is not set and TS_USEDFPU is still set. This will trigger unlazy_fpu() causing kernel oops. Fix this by clearing tsk's fpu_counter before clearing task's fpu. Reported-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* \|	Reinstate ZERO_PAGE optimization in 'get_user_pages()' and fix XIP	Linus Torvalds	2008-06-20	1	-1/+1
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	KAMEZAWA Hiroyuki and Oleg Nesterov point out that since the commit 557ed1fa2620dc119adb86b34c614e152a629a80 ("remove ZERO_PAGE") removed the ZERO_PAGE from the VM mappings, any users of get_user_pages() will generally now populate the VM with real empty pages needlessly. We used to get the ZERO_PAGE when we did the "handle_mm_fault()", but since fault handling no longer uses ZERO_PAGE for new anonymous pages, we now need to handle that special case in follow_page() instead. In particular, the removal of ZERO_PAGE effectively removed the core file writing optimization where we would skip writing pages that had not been populated at all, and increased memory pressure a lot by allocating all those useless newly zeroed pages. This reinstates the optimization by making the unmapped PTE case the same as for a non-existent page table, which already did this correctly. While at it, this also fixes the XIP case for follow_page(), where the caller could not differentiate between the case of a page that simply could not be used (because it had no "struct page" associated with it) and a page that just wasn't mapped. We do that by simply returning an error pointer for pages that could not be turned into a "struct page *". The error is arbitrarily picked to be EFAULT, since that was what get_user_pages() already used for the equivalent IO-mapped page case. [ Also removed an impossible test for pte_offset_map_lock() failing: that's not how that function works ] Acked-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Nick Piggin <npiggin@suse.de> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[POWERPC] Clear sub-page HPTE present bits when demoting page size	Paul Mackerras	2008-06-18	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When we demote a slice from 64k to 4k, and we are about to insert an HPTE for a 4k subpage and we notice that there is an existing 64k HPTE, we first invalidate that HPTE before inserting the new 4k subpage HPTE. Since the bits that encode which hash bucket the old HPTE was in overlap with the bits that encode which of the 16 subpages have HPTEs, we need to clear out the subpage HPTE-present bits before starting to insert HPTEs for the 4k subpages. If we don't do that, we can erroneously think that a subpage already has an HPTE when it doesn't. That in itself wouldn't be such a problem except that when we go to update the HPTE that we think is present on machines with a hypervisor, the hypervisor can tell us that the HPTE we think is there is actually there even though it isn't, which can lead to a process getting stuck in a loop, continually faulting. The reason for the confusion is that the AVPN (abbreviated virtual page number) we are looking for in the HPTE for a 4k subpage can actually match the AVPN in a stale HPTE for another 64k page. For example, the HPTE for the 4k subpage at 0x84000f000 will be in the same hash bucket and have the same AVPN as the HPTE for the 64k page at 0x8400f0000. This fixes the code to clear out the subpage HPTE-present bits. Signed-off-by: Paul Mackerras <paulus@samba.org>
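The address example in the message above can be checked with a little arithmetic. The sketch below is a simplification and not the kernel's hashing code: it assumes 256MB segments, uses the upper address bits directly in place of a real VSID, and takes the AVPN to be the address shifted right by 23 bits; `page_index()`, `hash_sketch()`, and `avpn_sketch()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define SEGMENT_SHIFT 28  /* assumed 256MB segments */
#define AVPN_SHIFT    23  /* assumed: AVPN drops the low 23 address bits */

/* Index of the page within its segment for a given page size. */
static uint64_t page_index(uint64_t addr, unsigned page_shift)
{
	assert(page_shift < SEGMENT_SHIFT);
	return (addr & ((1ULL << SEGMENT_SHIFT) - 1)) >> page_shift;
}

/* Simplified hash: segment number XOR page index within the segment. */
static uint64_t hash_sketch(uint64_t addr, unsigned page_shift)
{
	return (addr >> SEGMENT_SHIFT) ^ page_index(addr, page_shift);
}

/* Simplified abbreviated virtual page number. */
static uint64_t avpn_sketch(uint64_t addr)
{
	return addr >> AVPN_SHIFT;
}
```

Under these assumptions both addresses fall in segment 0x84 with page index 0xf (0xf000 >> 12 for the 4k subpage, 0xf0000 >> 16 for the 64k page), so the hash and the AVPN coincide, which matches the collision the commit message describes.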
*	[POWERPC] 4xx: Clear new TLB cache attribute bits in Data Storage vector	Josh Boyer	2008-06-18	1	-1/+6
\| \| \| \| \| \| \| \| \| \|	A recent commit added support for the new 440x6 and 464 cores that have the added WL1, IL1I, IL1D, IL2I, and IL2D bits for the caching attributes in the TLBs. The new bits were cleared in the finish_tlb_load function, however a similar bit of code was missed in the DataStorage interrupt vector. Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
*	x86-64: Fix "bytes left to copy" return value for copy_from_user()	Linus Torvalds	2008-06-17	2	-28/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Most users by far do not care about the exact return value (they only really care about whether the copy succeeded in its entirety or not), but a few special core routines actually care deeply about exactly how many bytes were copied from user space. And the unrolled versions of the x86-64 user copy routines would sometimes report that they had copied more bytes than they actually had. Very few uses actually have partial copies to begin with, but to make this bug even harder to trigger, most x86 CPUs use the "rep string" instructions for normal user copies, and that version didn't have this issue. To make it even harder to hit, the one user of this that really cared about the return value (and used the uncached version of the copy that doesn't use the "rep string" instructions) was the generic write routine, which pre-populated its source, once more hiding the problem by avoiding the exception case that triggers the bug. In other words, very special thanks to Bron Gondwana who not only triggered this, but created a test-program to show it, and bisected the behavior down to commit 08291429cfa6258c4cd95d8833beb40f828b194e ("mm: fix pagecache write deadlocks") which changed the access pattern just enough that you can now trigger it with 'writev()' with multiple iovec's. That commit itself was not the cause of the bug, it just allowed all the stars to align just right that you could trigger the problem. [ Side note: this is just the minimal fix to make the copy routines (with __copy_from_user_inatomic_nocache as the particular version that was involved in showing this) have the right return values. We really should improve on the exceptional case further - to make the copy do a byte-accurate copy up to the exact page limit that causes it to fail. 
As it is, the callers have to do extra work to handle the limit case gracefully. ] Reported-by: Bron Gondwana <brong@fastmail.fm> Cc: Nick Piggin <npiggin@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> (which didn't have this problem), and since most users that do the carethis was very hard to trigger, but
*	Merge branch 'release' of ↵	Linus Torvalds	2008-06-16	4	-9/+21
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] Fix CONFIG_IA64_SGI_UV build error [IA64] Update check_sal_cache_flush to use platform_send_ipi() [IA64] perfmon: fix async exit bug
\| *	[IA64] Fix CONFIG_IA64_SGI_UV build error	Jack Steiner	2008-06-16	2	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix build error in CONFIG_IA64_SGI_UV config. (GENERIC builds are ok). Signed-off-by: Jack Steiner <steiner@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
\| *	[IA64] Update check_sal_cache_flush to use platform_send_ipi()	Alex Chiang	2008-06-11	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	check_sal_cache_flush is used to detect broken firmware that drops pending interrupts. The old implementation schedules a timer interrupt for itself in the future by getting the current value of the Interval Timer Counter + 1000 cycles, waits for the interrupt to be pended, calls SAL_CACHE_FLUSH, and finally checks to see if the interrupt is still pending. This implementation can cause problems for virtual machine code if the process of scheduling the timer interrupt takes more than 1000 cycles; the virtual machine can end up sleeping for several hundred years while waiting for the ITC to wrap around. The fix is to use platform_send_ipi. The processor will still send an interrupt to itself, using the IA64_IPI_DM_INT delivery mode, which causes the IPI to look like an external interrupt. The rest of the SAL_CACHE_FLUSH + checking to see if the interrupt is still pending remains unchanged. This fix has been boot tested successfully on: - intel tiger2 - hp rx6600 - hp rx5670 The rx5670 has known buggy firmware, where SAL_CACHE_FLUSH drops pending interrupts. A boot test on this machine showed this message on the console: SAL: SAL_CACHE_FLUSH drops interrupts; PAL_CACHE_FLUSH will be used instead Which proves that the self-inflicted IPI approach is viable. And as expected, the other tested platforms correctly did not display the warning. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
\| *	[IA64] perfmon: fix async exit bug	stephane eranian	2008-06-11	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move the cleanup of the async queue to the close callback from the flush callback. This avoids losing asynchronous overflow notifications when the file descriptor is shared by multiple processes and one terminates. Signed-off-by: Stephane Eranian <eranian@gmail.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
* \|	Merge branch 'merge' of ↵	Linus Torvalds	2008-06-16	62	-2208/+4651
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (21 commits) [POWERPC] Turn on ATA_SFF so we get SATA_SVW back in defconfigs [POWERPC] Remove ppc32's export of console_drivers [POWERPC] Fix -Os kernel builds with newer gcc versions [POWERPC] Fix bootwrapper builds with newer gcc versions [POWERPC] Build fix for drivers/macintosh/mediabay.c [POWERPC] Fix warning in pseries/eeh_driver.c [POWERPC] Add missing of_node_put in drivers/macintosh/therm_adt746x.c [POWERPC] Add missing of_node_put in drivers/macintosh/smu.c [POWERPC] Add missing of_node_put in pseries/nvram.c [POWERPC] Fix return value check logic in debugfs virq_mapping setup [POWERPC] Fix rmb to order cacheable vs. noncacheable powerpc/spufs: fix missed stop-and-signal event powerpc/spufs: synchronize interaction between spu exception handling and time slicing powerpc/spufs: remove class_0_dsisr from spu exception handling powerpc/spufs: wait for stable spu status in spu_stopped() [POWERPC] bootwrapper: add simpleImage* to list of boot targets [POWERPC] 83xx: MPC837xRDB's VSC7385 ethernet switch isn't on the MDIO bus [POWERPC] Updated Freescale PPC defconfigs [POWERPC] 8610: Update defconfig for MPC8610 HPCD [POWERPC] 85xx: MPC8548CDS - Fix size of PCIe IO space ...
\| * \|	[POWERPC] Turn on ATA_SFF so we get SATA_SVW back in defconfigs	Paul Mackerras	2008-06-16	2	-6/+122
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This enables CONFIG_ATA_SFF in the defconfigs that are intended to work on a G5 powermac, i.e. g5_defconfig and ppc64_defconfig. Since the support for the SATA cell in the K2 chipset is provided by the sata_svw.c driver, and that depends on CONFIG_ATA_SFF, we need to turn that and CONFIG_SATA_SVW back on so we can get to the hard disk on G5s. Signed-off-by: Paul Mackerras <paulus@samba.org>
\| * \|	[POWERPC] Remove ppc32's export of console_drivers	Stephen Rothwell	2008-06-16	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are no in-tree uses of the export any more and in linux-next there is a change that exports it globally which causes warnings: WARNING: vmlinux: 'console_drivers' exported twice. Previous export was in vmlinux and in one case (mpc85xx_defconfig) a build error: kernel/built-in.o: In function `__crc_console_drivers': (ABS+0x1eb0e6f5): multiple definition of `__crc_console_drivers' So remove the export now. Also, there is no longer any need to include linux/console.h. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>