summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* ARM: Kirkwood: fix ns2 gpios by converting to pinctrlSimon Guinot2013-01-102-38/+16
| | | | | | | | | | | | Note that the pinctrl conversion also fixes GPIO support for ns2 boards. Since commit f9e75922: "ARM: Kirkwood: Make use of mvebu pincltl and gpio", the mvbu_gpio driver is used for DT boards. As mvbu_gpio relies on the pinctrl driver, then a pinctrl definition must be given to allow the GPIO configuration. Signed-off-by: Simon Guinot <simon.guinot@sequanux.org> Acked-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
* arm: mvebu: use global interrupts for GPIOs on Armada XPThomas Petazzoni2013-01-083-32/+24
| | | | | | | | | | | | | | | | | | | | | | | | The Armada XP GPIO controller has two ways of notifying interrupts: using global interrupts or using per-CPU interrupts. In an attempt to use the best available features, the 'marvell,armadaxp-gpio' compatible string selects a variant of the gpio-mvebu driver that makes use of the per-CPU interrupts. Unfortunately, this doesn't work properly in a SMP context, because we fall into cases where the GPIO interrupt is enabled on CPU X at the GPIO controller level, but on CPU Y at the interrupt controller level. It is not yet clear how to fix that easily. So for 3.8, our approach is to switch to global interrupts for GPIOs, so that we do not fall into this per-CPU interrupts problem. This patch therefore fixes GPIO interrupts on Armada XP platforms. Without this patch, GPIO interrupts simply do not work reliably, because their proper operation depends on which CPU the code requesting the interrupt is running. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
* pinctrl: mvebu: make pdma clock on dove mandatorySebastian Hesselbarth2013-01-071-2/+5
| | | | | | | | | With the ability to pass clocks through DT, now make the pdma clock of dove pinctrl mandatory. Otherwise, pinctrl will hang the system when accessing some registers. Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
* ARM: Dove: Add pinctrl clock to DTSebastian Hesselbarth2013-01-071-0/+1
| | | | | | | | During merge of the mvebu patches a clock gate for pinctrl was lost. This patch just readds the clock gate. Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
* dma: mv_xor: fix error handling for clocksThomas Petazzoni2013-01-061-2/+5
| | | | | | | | | | | | | When a channel fails to initialize, we release all ressources, including clocks. However, a XOR unit is not necessarily associated to a clock (some variants of Marvell SoCs have a clock for XOR units, some don't), so we shouldn't unconditionally be releasing the clock. Instead, just like we do in the mv_xor_remove() function, we should check if one clock was found before releasing it. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
* dma: mv_xor: fix error handling of mv_xor_channel_add()Thomas Petazzoni2013-01-061-1/+1
| | | | | | | | | | | | | | | | | | | When mv_xor_channel_add() fails for one XOR channel, we jump to the err_channel_add label to clean up all previous channels that had been initialized correctly. Unfortunately, while handling this error condition, we were disposing the IRQ mapping before calling mv_xor_channel_remove() (which does the free_irq()), which is incorrect. Instead, do things properly in the reverse order of the initialization: first remove the XOR channel (so that free_irq() is done), and then dispose the IRQ mapping. This avoids ugly warnings when for some reason one of the XOR channel fails to initialize. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
* arm: mvebu: Add missing ; for cpu node.Andrew Lunn2013-01-061-1/+1
| | | | | | | | | The Armada XP MV78230 DT include file is missing a ; at the end of the cpu node. Reported-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
* arm: mvebu: Armada XP MV78230 has only three Ethernet interfacesThomas Petazzoni2013-01-063-8/+16
| | | | | | | | | | | | | | | | | | | | | | We originally thought that the MV78230 variant of the Armada XP had four Ethernet interfaces, like the other variants MV78260 and MV78460. In fact, this is not true, and the MV78230 has only three Ethernet interfaces. So, the definitions of the Ethernet interfaces is now done as follows: * armada-370-xp.dtsi: definitions of the first two interfaces, that are common to Armada 370 and Armada XP * armada-xp.dtsi: definition of the third interface, common to all Armada XP variants. * armada-xp-mv78260.dtsi and armada-xp-mv78460.dtsi: definition of the fourth interface. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
* arm: mvebu: Armada XP MV78230 has two cores, not oneThomas Petazzoni2013-01-061-0/+7
| | | | | | | | | | | Contrary to our understanding at the time armada-xp-mv78230.dtsi was written, the MV78230 variant of the Armada XP SoC has two cores and not one. This patch updates the .dtsi file to take into account this reality. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
* clk: mvebu: Remove inappropriate __init taggingJoshua Coombs2013-01-061-1/+1
| | | | | | | | | | | If the Orion WDT driver is built as a module, an opps occurs during clk lookup when calling mvebu_clk_gating_get_src(). Remove the inappropriate __init tag so the function is available for modules after kernel init. Signed-off-by: Joshua Coombs <josh.coombs@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
* ARM: Kirkwood: Use fixed-regulator instead of board gpio callAndrew Lunn2013-01-062-4/+17
| | | | | | | | | | | | With the change to a DT based pinctrl/gpio driver, using gpio API calls in board-*.c files no longer works, a dereferenced NULL pointer exception occurs instead. By converting the GPIO code into a fixed-regulator which gets probed later once pinctrl/gpio is available, we avoid the exception. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Stefan Peter <s.peter@mplch> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
* ARM: Kirkwood: Fix missing sdio clockAndrew Lunn2013-01-061-0/+4
| | | | | | | | | | We moved to declaring clk gates in DT. However, device which do not yet have a DT binding need to have a clkdev alias. This was missing for SDIO. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Stefan Peter <s.peter@mplch> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
* ARM: Kirkwood: Switch TWSI1 of 88f6282 to DT clock providersNobuhiro Iwamatsu2013-01-061-0/+1
| | | | | | | | | | Clock Management of kirkwood has moved to DT clock providers. However, TWSI1 has not yet been done. This switches TWSI1 of 88f6282 to DT clock providers. Signed-off-by: Nobuhiro Iwamatsu <iwamatsu@nigauri.org> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
* Power: gpio-poweroff: Fix documentation and gpio_is_validAndrew Lunn2013-01-062-21/+32
| | | | | | | | | | Improve the documentation to clarify level vs edge triggered power off. Improve the comments for level vs edge triggered power off. Make use of gpio_is_valid(). Reported-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
* ARM: Kirkwood: Fix missing clk for USB device.Andrew Lunn2013-01-061-0/+1
| | | | | | | | | Without the clock being held by a driver, it gets turned off at a bad time causing the SoC to lockup. This is often during reboot. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Stefan Peter <s.peter@mpl.ch> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
* arm: mvebu: Use dw-apb-uart instead of ns16650 as UART driverGregory CLEMENT2013-01-063-7/+9
| | | | | | | | | | | | | The UART controller used in the Armada 370 and Armada XP SoCs is the Synopsys DesignWare 8250 (aka Synopsys DesignWare ABP UART). The improper use of the ns16550 can lead to a kernel oops during boot if a character is sent to the UART before the initialization of the driver. The DW APB has an extra interrupt that gets raised when writing to the LCR when busy. This explains why we need to use dw-apb-uart driver to handle this. Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
* Linux 3.8-rc2v3.8-rc2Linus Torvalds2013-01-021-1/+1
|
* Merge branch 'fixes-for-3.8' of ↵Linus Torvalds2013-01-021-2/+3
|\ | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds Pull LED fix from Bryan Wu. * 'fixes-for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds: leds: leds-gpio: set devm_gpio_request_one() flags param correctly
| * leds: leds-gpio: set devm_gpio_request_one() flags param correctlyJavier Martinez Canillas2013-01-021-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit a99d76f leds: leds-gpio: use gpio_request_one changed the leds-gpio driver to use gpio_request_one() instead of gpio_request() + gpio_direction_output() Unfortunately, it also made a semantic change that breaks the leds-gpio driver. The gpio_request_one() flags parameter was set to: GPIOF_DIR_OUT | (led_dat->active_low ^ state) Since GPIOF_DIR_OUT is 0, the final flags value will just be the XOR'ed value of led_dat->active_low and state. This value were used to distinguish between HIGH/LOW output initial level and call gpio_direction_output() accordingly. With this new semantic gpio_request_one() will take the flags value of 1 as a configuration of input direction (GPIOF_DIR_IN) and will call gpio_direction_input() instead of gpio_direction_output(). int gpio_request_one(unsigned gpio, unsigned long flags, const char *label) { .. if (flags & GPIOF_DIR_IN) err = gpio_direction_input(gpio); else err = gpio_direction_output(gpio, (flags & GPIOF_INIT_HIGH) ? 1 : 0); .. } The right semantic is to evaluate led_dat->active_low ^ state and set the output initial level explicitly. Signed-off-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk> Reported-by: Arnaud Patard <arnaud.patard@rtp-net.org> Tested-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com> Signed-off-by: Bryan Wu <cooloney@gmail.com>
* | Merge git://www.linux-watchdog.org/linux-watchdogLinus Torvalds2013-01-025-14/+29
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull watchdog fixes from Wim Van Sebroeck: "This fixes some small errors in the new da9055 driver, eliminates a compiler warning and adds DT support for the twl4030_wdt driver (so that we can have multiple watchdogs with DT on the omap platforms)." * git://www.linux-watchdog.org/linux-watchdog: watchdog: twl4030_wdt: add DT support watchdog: omap_wdt: eliminate unused variable and a compiler warning watchdog: da9055: Don't update wdt_dev->timeout in da9055_wdt_set_timeout error path watchdog: da9055: Fix invalid free of devm_ allocated data
| * | watchdog: twl4030_wdt: add DT supportAaro Koskinen2013-01-023-2/+23
| | | | | | | | | | | | | | | | | | | | | | | | Add DT support for twl4030_wdt. This is needed to get twl4030_wdt to probe when booting with DT. Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * | watchdog: omap_wdt: eliminate unused variable and a compiler warningAaro Koskinen2013-01-021-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We forgot to delete this in the commit 4f4753d9 (watchdog: omap_wdt: convert to devm_ functions), and as a result the following compilation warning was introduced: drivers/watchdog/omap_wdt.c: In function 'omap_wdt_remove': drivers/watchdog/omap_wdt.c:299:19: warning: unused variable 'res' [-Wunused-variable] Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Reviewed-by: Paul Walmsley <paul@pwsan.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * | watchdog: da9055: Don't update wdt_dev->timeout in da9055_wdt_set_timeout ↵Axel Lin2013-01-021-7/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | error path Otherwise, WDIOC_GETTIMEOUT returns wrong value if set_timeout fails. This patch also removes unnecessary ret variable in da9055_wdt_ping function. Signed-off-by: Axel Lin <axel.lin@ingics.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * | watchdog: da9055: Fix invalid free of devm_ allocated dataAxel Lin2013-01-021-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | It is not required to free devm_ allocated data. Since kref_put needs a valid release function, da9055_wdt_release_resources() is not deleted. Signed-off-by: Axel Lin <axel.lin@ingics.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* | | Merge tag '3.8-pci-fixes' of ↵Linus Torvalds2013-01-026-55/+63
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "Some fixes for v3.8. They include a fix for the new SR-IOV sysfs management support, an expanded quirk for Ricoh SD card readers, a Stratus DMI quirk fix, and a PME polling fix." * tag '3.8-pci-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: Reduce Ricoh 0xe822 SD card reader base clock frequency to 50MHz PCI/PM: Do not suspend port if any subordinate device needs PME polling PCI: Add PCIe Link Capability link speed and width names PCI: Work around Stratus ftServer broken PCIe hierarchy (fix DMI check) PCI: Remove spurious error for sriov_numvfs store and simplify flow
| * | | PCI: Reduce Ricoh 0xe822 SD card reader base clock frequency to 50MHzAndy Lutomirski2012-12-262-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Otherwise it fails like this on cards like the Transcend 16GB SDHC card: mmc0: new SDHC card at address b368 mmcblk0: mmc0:b368 SDC 15.0 GiB mmcblk0: error -110 sending status command, retrying mmcblk0: error -84 transferring data, sector 0, nr 8, cmd response 0x900, card status 0xb0 Tested on my Lenovo x200 laptop. [bhelgaas: changelog] Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Chris Ball <cjb@laptop.org> CC: Manoj Iyer <manoj.iyer@canonical.com> CC: stable@vger.kernel.org
| * | | PCI/PM: Do not suspend port if any subordinate device needs PME pollingHuang Ying2012-12-261-1/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ulrich reported that his USB3 cardreader does not work reliably when connected to the USB3 port. It turns out that USB3 controller failed to awaken when plugging in the USB3 cardreader. Further experiments found that the USB3 host controller can only be awakened via polling, not via PME interrupt. But if the PCIe port to which the USB3 host controller is connected is suspended, we cannot poll the controller because its config space is not accessible when the PCIe port is in a low power state. To solve the issue, the PCIe port will not be suspended if any subordinate device needs PME polling. [bhelgaas: use bool consistently rather than mixing int/bool] Reference: http://lkml.kernel.org/r/50841CCC.9030809@uli-eckhardt.de Reported-by: Ulrich Eckhardt <usb@uli-eckhardt.de> Tested-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> CC: stable@vger.kernel.org # v3.6+
| * | | PCI: Add PCIe Link Capability link speed and width namesBjorn Helgaas2012-12-261-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add standard #defines for the Supported Link Speeds field in the PCIe Link Capabilities register. Note that prior to PCIe spec r3.0, these encodings were defined: 0001b 2.5GT/s Link speed supported 0010b 5.0GT/s and 2.5GT/s Link speed supported Starting with spec r3.0, these encodings refer to bits 0 and 1 in the Supported Link Speeds Vector in the Link Capabilities 2 register, and bits 0 and 1 there mean 2.5 GT/s and 5.0 GT/s, respectively. Therefore, code that followed r2.0 and interpreted 0x1 as 2.5GT/s and 0x2 as 5.0GT/s will continue to work, and we can identify a device using the new encodings because it will have a non-zero Link Capabilities 2 register. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
| * | | PCI: Work around Stratus ftServer broken PCIe hierarchy (fix DMI check)Myron Stowe2012-12-261-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 284f5f9 was intended to disable the "only_one_child()" optimization on Stratus ftServer systems, but its DMI check is wrong. It looks for DMI_SYS_VENDOR that contains "ftServer", when it should look for DMI_SYS_VENDOR containing "Stratus" and DMI_PRODUCT_NAME containing "ftServer". Tested on Stratus ftServer 6400. Reported-by: Fadeeva Marina <astarta@rat.ru> Reference: https://bugzilla.kernel.org/show_bug.cgi?id=51331 Signed-off-by: Myron Stowe <myron.stowe@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: stable@vger.kernel.org # v3.5+
| * | | PCI: Remove spurious error for sriov_numvfs store and simplify flowBjorn Helgaas2012-12-261-51/+34
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we request "num_vfs" and the driver's sriov_configure() method enables exactly that number ("num_vfs_enabled"), we complain "Invalid value for number of VFs to enable" and return an error. We should silently return success instead. Also, use kstrtou16() since numVFs is defined to be a 16-bit field and rework to simplify control flow. Reported-by: Greg Rose <gregory.v.rose@intel.com> Reference: http://lkml.kernel.org/r/20121214101911.00002f59@unknown Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Donald Dutile <ddutile@redhat.com>
* | | UAPI: Strip _UAPI prefix on header install no matter the whitespaceDavid Howells2013-01-021-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 56c176c9cac9 ("UAPI: strip the _UAPI prefix from header guards during header installation") strips the _UAPI prefix from header guards, but only if there's a single space between the cpp directive and the label. Make it more flexible and able to handle tabs and multiple white space characters. Signed-off-by: David Howells <dhowell@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | UAPI: Remove empty Kbuild filesDavid Howells2013-01-028-8/+0
| | | | | | | | | | | | | | | | | | | | | | | | Empty files can get deleted by the patch program, so remove empty Kbuild files and their links from the parent Kbuilds. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge tag 'ecryptfs-3.8-rc2-fixes' of ↵Linus Torvalds2013-01-023-6/+14
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs Pull ecryptfs fixes from Tyler Hicks: "Two self-explanatory fixes and a third patch which improves performance: when overwriting a full page in the eCryptfs page cache, skip reading in and decrypting the corresponding lower page." * tag 'ecryptfs-3.8-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs: fs/ecryptfs/crypto.c: make ecryptfs_encode_for_filename() static eCryptfs: fix to use list_for_each_entry_safe() when delete items eCryptfs: Avoid unnecessary disk read and data decryption during writing
| * | | fs/ecryptfs/crypto.c: make ecryptfs_encode_for_filename() staticCong Ding2012-12-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | the function ecryptfs_encode_for_filename() is only used in this file Signed-off-by: Cong Ding <dinggnu@gmail.com> Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
| * | | eCryptfs: fix to use list_for_each_entry_safe() when delete itemsWei Yongjun2012-12-181-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since we will be removing items off the list using list_del() we need to use a safer version of the list_for_each_entry() macro aptly named list_for_each_entry_safe(). We should use the safe macro if the loop involves deletions of items. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> [tyhicks: Fixed compiler err - missing list_for_each_entry_safe() param] Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
| * | | eCryptfs: Avoid unnecessary disk read and data decryption during writingLi Wang2012-11-071-2/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ecryptfs_write_begin grabs a page from page cache for writing. If the page contains invalid data, or data older than the counterpart on the disk, eCryptfs will read out the corresponing data from the disk into the page, decrypt them, then perform writing. However, for this page, if the length of the data to be written into is equal to page size, that means the whole page of data will be overwritten, in which case, it does not matter whatever the data were before, it is beneficial to perform writing directly rather than bothering to read and decrypt first. With this optimization, according to our test on a machine with Intel Core 2 Duo processor, iozone 'write' operation on an existing file with write size being multiple of page size will enjoy a steady 3x speedup. Signed-off-by: Li Wang <wangli@kylinos.com.cn> Signed-off-by: Yunchuan Wen <wenyunchuan@kylinos.com.cn> Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
* | | | Merge branch 'for-linus' of ↵Linus Torvalds2013-01-022-28/+29
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph fixes from Sage Weil: "Two of Alex's patches deal with a race when reseting server connections for open RBD images, one demotes some non-fatal BUGs to WARNs, and my patch fixes a protocol feature bit failure path." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: libceph: fix protocol feature mismatch failure path libceph: WARN, don't BUG on unexpected connection states libceph: always reset osds when kicking libceph: move linger requests sooner in kick_requests()
| * | | | libceph: fix protocol feature mismatch failure pathSage Weil2012-12-271-10/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We should not set con->state to CLOSED here; that happens in ceph_fault() in the caller, where it first asserts that the state is not yet CLOSED. Avoids a BUG when the features don't match. Since the fail_protocol() has become a trivial wrapper, replace calls to it with direct calls to reset_connection(). Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>
| * | | | libceph: WARN, don't BUG on unexpected connection statesAlex Elder2012-12-271-6/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A number of assertions in the ceph messenger are implemented with BUG_ON(), killing the system if connection's state doesn't match what's expected. At this point our state model is (evidently) not well understood enough for these assertions to trigger a BUG(). Convert all BUG_ON(con->state...) calls to be WARN_ON(con->state...) so we learn about these issues without killing the machine. We now recognize that a connection fault can occur due to a socket closure at any time, regardless of the state of the connection. So there is really nothing we can assert about the state of the connection at that point so eliminate that assertion. Reported-by: Ugis <ugis22@gmail.com> Tested-by: Ugis <ugis22@gmail.com> Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
| * | | | libceph: always reset osds when kickingAlex Elder2012-12-271-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When ceph_osdc_handle_map() is called to process a new osd map, kick_requests() is called to ensure all affected requests are updated if necessary to reflect changes in the osd map. This happens in two cases: whenever an incremental map update is processed; and when a full map update (or the last one if there is more than one) gets processed. In the former case, the kick_requests() call is followed immediately by a call to reset_changed_osds() to ensure any connections to osds affected by the map change are reset. But for full map updates this isn't done. Both cases should be doing this osd reset. Rather than duplicating the reset_changed_osds() call, move it into the end of kick_requests(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
| * | | | libceph: move linger requests sooner in kick_requests()Alex Elder2012-12-271-11/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The kick_requests() function is called by ceph_osdc_handle_map() when an osd map change has been indicated. Its purpose is to re-queue any request whose target osd is different from what it was when it was originally sent. It is structured as two loops, one for incomplete but registered requests, and a second for handling completed linger requests. As a special case, in the first loop if a request marked to linger has not yet completed, it is moved from the request list to the linger list. This is as a quick and dirty way to have the second loop handle sending the request along with all the other linger requests. Because of the way it's done now, however, this quick and dirty solution can result in these incomplete linger requests never getting re-sent as desired. The problem lies in the fact that the second loop only arranges for a linger request to be sent if it appears its target osd has changed. This is the proper handling for *completed* linger requests (it avoids issuing the same linger request twice to the same osd). But although the linger requests added to the list in the first loop may have been sent, they have not yet completed, so they need to be re-sent regardless of whether their target osd has changed. The first required fix is we need to avoid calling __map_request() on any incomplete linger request. Otherwise the subsequent __map_request() call in the second loop will find the target osd has not changed and will therefore not re-send the request. Second, we need to be sure that a sent but incomplete linger request gets re-sent. If the target osd is the same with the new osd map as it was when the request was originally sent, this won't happen. This can be fixed through careful handling when we move these requests from the request list to the linger list, by unregistering the request *before* it is registered as a linger request. This works because a side-effect of unregistering the request is to make the request's r_osd pointer be NULL, and *that* will ensure the second loop actually re-sends the linger request. Processing of such a request is done at that point, so continue with the next one once it's been moved. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
* | | | | mm: mempolicy: Convert shared_policy mutex to spinlockMel Gorman2013-01-022-21/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sasha was fuzzing with trinity and reported the following problem: BUG: sleeping function called from invalid context at kernel/mutex.c:269 in_atomic(): 1, irqs_disabled(): 0, pid: 6361, name: trinity-main 2 locks held by trinity-main/6361: #0: (&mm->mmap_sem){++++++}, at: [<ffffffff810aa314>] __do_page_fault+0x1e4/0x4f0 #1: (&(&mm->page_table_lock)->rlock){+.+...}, at: [<ffffffff8122f017>] handle_pte_fault+0x3f7/0x6a0 Pid: 6361, comm: trinity-main Tainted: G W 3.7.0-rc2-next-20121024-sasha-00001-gd95ef01-dirty #74 Call Trace: __might_sleep+0x1c3/0x1e0 mutex_lock_nested+0x29/0x50 mpol_shared_policy_lookup+0x2e/0x90 shmem_get_policy+0x2e/0x30 get_vma_policy+0x5a/0xa0 mpol_misplaced+0x41/0x1d0 handle_pte_fault+0x465/0x6a0 This was triggered by a different version of automatic NUMA balancing but in theory the current version is vunerable to the same problem. do_numa_page -> numa_migrate_prep -> mpol_misplaced -> get_vma_policy -> shmem_get_policy It's very unlikely this will happen as shared pages are not marked pte_numa -- see the page_mapcount() check in change_pte_range() -- but it is possible. To address this, this patch restores sp->lock as originally implemented by Kosaki Motohiro. In the path where get_vma_policy() is called, it should not be calling sp_alloc() so it is not necessary to treat the PTL specially. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | Merge tag 'ext4_for_linus' of ↵Linus Torvalds2013-01-029-58/+152
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 bug fixes from Ted Ts'o: "Various bug fixes for ext4. Perhaps the most serious bug fixed is one which could cause file system corruptions when performing file punch operations." * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: avoid hang when mounting non-journal filesystems with orphan list ext4: lock i_mutex when truncating orphan inodes ext4: do not try to write superblock on ro remount w/o journal ext4: include journal blocks in df overhead calcs ext4: remove unaligned AIO warning printk ext4: fix an incorrect comment about i_mutex ext4: fix deadlock in journal_unmap_buffer() ext4: split off ext4_journalled_invalidatepage() jbd2: fix assertion failure in jbd2_journal_flush() ext4: check dioread_nolock on remount ext4: fix extent tree corruption caused by hole punch
| * | | | | ext4: avoid hang when mounting non-journal filesystems with orphan listTheodore Ts'o2012-12-271-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When trying to mount a file system which does not contain a journal, but which does have a orphan list containing an inode which needs to be truncated, the mount call with hang forever in ext4_orphan_cleanup() because ext4_orphan_del() will return immediately without removing the inode from the orphan list, leading to an uninterruptible loop in kernel code which will busy out one of the CPU's on the system. This can be trivially reproduced by trying to mount the file system found in tests/f_orphan_extents_inode/image.gz from the e2fsprogs source tree. If a malicious user were to put this on a USB stick, and mount it on a Linux desktop which has automatic mounts enabled, this could be considered a potential denial of service attack. (Not a big deal in practice, but professional paranoids worry about such things, and have even been known to allocate CVE numbers for such problems.) Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Zheng Liu <wenqing.lz@taobao.com> Cc: stable@vger.kernel.org
| * | | | | ext4: lock i_mutex when truncating orphan inodesTheodore Ts'o2012-12-271-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit c278531d39 added a warning when ext4_flush_unwritten_io() is called without i_mutex being taken. It had previously not been taken during orphan cleanup since races weren't possible at that point in the mount process, but as a result of this c278531d39, we will now see a kernel WARN_ON in this case. Take the i_mutex in ext4_orphan_cleanup() to suppress this warning. Reported-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Zheng Liu <wenqing.lz@taobao.com> Cc: stable@vger.kernel.org
| * | | | | ext4: do not try to write superblock on ro remount w/o journalMichael Tokarev2012-12-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a journal-less ext4 filesystem is mounted on a read-only block device (blockdev --setro will do), each remount (for other, unrelated, flags, like suid=>nosuid etc) results in a series of scary messages from kernel telling about I/O errors on the device. This is becauese of the following code ext4_remount(): if (sbi->s_journal == NULL) ext4_commit_super(sb, 1); at the end of remount procedure, which forces writing (flushing) of a superblock regardless whenever it is dirty or not, if the filesystem is readonly or not, and whenever the device itself is readonly or not. We only need call ext4_commit_super when the file system had been previously mounted read/write. Thanks to Eric Sandeen for help in diagnosing this issue. Signed-off-By: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
| * | | | | ext4: include journal blocks in df overhead calcsEric Sandeen2012-12-251-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To more accurately calculate overhead for "bsd" style df reporting, we should count the journal blocks as overhead as well. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Tested-by: Eric Whitney <enwlinux@gmail.com>
| * | | | | ext4: remove unaligned AIO warning printkEric Sandeen2012-12-251-8/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Although I put this in, I now think it was a bad decision. For most users, there is very little to be done in this case. They get the message, once per day, with no real context or proposed action. TBH, it generates support calls when it probably does not need to; the message sounds more dire than the situation really is. Just nuke it. Normal investigation via blktrace or whatnot can reveal poor IO patterns if bad performance is encountered. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | | | ext4: fix an incorrect comment about i_mutexAndy Lutomirski2012-12-251-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | i_mutex is not held when ->sync_file is called. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | | | ext4: fix deadlock in journal_unmap_buffer()Jan Kara2012-12-253-25/+86
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We cannot wait for transaction commit in journal_unmap_buffer() because we hold page lock which ranks below transaction start. We solve the issue by bailing out of journal_unmap_buffer() and jbd2_journal_invalidatepage() with -EBUSY. Caller is then responsible for waiting for transaction commit to finish and try invalidation again. Since the issue can happen only for page stradding i_size, it is simple enough to manually call jbd2_journal_invalidatepage() for such page from ext4_setattr(), check the return value and wait if necessary. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
OpenPOWER on IntegriCloud