summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* proc: Allow proc_free_inum to be called from any contextEric W. Biederman2012-12-251-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While testing the pid namespace code I hit this nasty warning. [ 176.262617] ------------[ cut here ]------------ [ 176.263388] WARNING: at /home/eric/projects/linux/linux-userns-devel/kernel/softirq.c:160 local_bh_enable_ip+0x7a/0xa0() [ 176.265145] Hardware name: Bochs [ 176.265677] Modules linked in: [ 176.266341] Pid: 742, comm: bash Not tainted 3.7.0userns+ #18 [ 176.266564] Call Trace: [ 176.266564] [<ffffffff810a539f>] warn_slowpath_common+0x7f/0xc0 [ 176.266564] [<ffffffff810a53fa>] warn_slowpath_null+0x1a/0x20 [ 176.266564] [<ffffffff810ad9ea>] local_bh_enable_ip+0x7a/0xa0 [ 176.266564] [<ffffffff819308c9>] _raw_spin_unlock_bh+0x19/0x20 [ 176.266564] [<ffffffff8123dbda>] proc_free_inum+0x3a/0x50 [ 176.266564] [<ffffffff8111d0dc>] free_pid_ns+0x1c/0x80 [ 176.266564] [<ffffffff8111d195>] put_pid_ns+0x35/0x50 [ 176.266564] [<ffffffff810c608a>] put_pid+0x4a/0x60 [ 176.266564] [<ffffffff8146b177>] tty_ioctl+0x717/0xc10 [ 176.266564] [<ffffffff810aa4d5>] ? wait_consider_task+0x855/0xb90 [ 176.266564] [<ffffffff81086bf9>] ? default_spin_lock_flags+0x9/0x10 [ 176.266564] [<ffffffff810cab0a>] ? remove_wait_queue+0x5a/0x70 [ 176.266564] [<ffffffff811e37e8>] do_vfs_ioctl+0x98/0x550 [ 176.266564] [<ffffffff810b8a0f>] ? recalc_sigpending+0x1f/0x60 [ 176.266564] [<ffffffff810b9127>] ? __set_task_blocked+0x37/0x80 [ 176.266564] [<ffffffff810ab95b>] ? sys_wait4+0xab/0xf0 [ 176.266564] [<ffffffff811e3d31>] sys_ioctl+0x91/0xb0 [ 176.266564] [<ffffffff810a95f0>] ? task_stopped_code+0x50/0x50 [ 176.266564] [<ffffffff81939199>] system_call_fastpath+0x16/0x1b [ 176.266564] ---[ end trace 387af88219ad6143 ]--- It turns out that spin_unlock_bh(proc_inum_lock) is not safe when put_pid is called with another spinlock held and irqs disabled. For now take the easy path and use spin_lock_irqsave(proc_inum_lock) in proc_free_inum and spin_loc_irq in proc_alloc_inum(proc_inum_lock). Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* pidns: Stop pid allocation when init diesEric W. Biederman2012-12-254-4/+20
| | | | | | | | | | | | | | | | | | | | | | Oleg pointed out that in a pid namespace the sequence. - pid 1 becomes a zombie - setns(thepidns), fork,... - reaping pid 1. - The injected processes exiting. Can lead to processes attempting access their child reaper and instead following a stale pointer. That waitpid for init can return before all of the processes in the pid namespace have exited is also unfortunate. Avoid these problems by disabling the allocation of new pids in a pid namespace when init dies, instead of when the last process in a pid namespace is reaped. Pointed-out-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* pidns: Outlaw thread creation after unshare(CLONE_NEWPID)Eric W. Biederman2012-12-241-0/+8
| | | | | | | | | | | | | | | | | | The sequence: unshare(CLONE_NEWPID) clone(CLONE_THREAD|CLONE_SIGHAND|CLONE_VM) Creates a new process in the new pid namespace without setting pid_ns->child_reaper. After forking this results in a NULL pointer dereference. Avoid this and other nonsense scenarios that can show up after creating a new pid namespace with unshare by adding a new check in copy_prodcess. Pointed-out-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* Linux 3.8-rc1v3.8-rc1Linus Torvalds2012-12-211-2/+2
|
* Merge git://www.linux-watchdog.org/linux-watchdogLinus Torvalds2012-12-2119-445/+738
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull watchdog updates from Wim Van Sebroeck: "This includes some fixes and code improvements (like clk_prepare_enable and clk_disable_unprepare), conversion from the omap_wdt and twl4030_wdt drivers to the watchdog framework, addition of the SB8x0 chipset support and the DA9055 Watchdog driver and some OF support for the davinci_wdt driver." * git://www.linux-watchdog.org/linux-watchdog: (22 commits) watchdog: mei: avoid oops in watchdog unregister code path watchdog: Orion: Fix possible null-deference in orion_wdt_probe watchdog: sp5100_tco: Add SB8x0 chipset support watchdog: davinci_wdt: add OF support watchdog: da9052: Fix invalid free of devm_ allocated data watchdog: twl4030_wdt: Change TWL4030_MODULE_PM_RECEIVER to TWL_MODULE_PM_RECEIVER watchdog: remove depends on CONFIG_EXPERIMENTAL watchdog: Convert dev_printk(KERN_<LEVEL> to dev_<level>( watchdog: DA9055 Watchdog driver watchdog: omap_wdt: eliminate goto watchdog: omap_wdt: delete redundant platform_set_drvdata() calls watchdog: omap_wdt: convert to devm_ functions watchdog: omap_wdt: convert to new watchdog core watchdog: WatchDog Timer Driver Core: fix comment watchdog: s3c2410_wdt: use clk_prepare_enable and clk_disable_unprepare watchdog: imx2_wdt: Select the driver via ARCH_MXC watchdog: cpu5wdt.c: add missing del_timer call watchdog: hpwdt.c: Increase version string watchdog: Convert twl4030_wdt to watchdog core davinci_wdt: preparation for switch to common clock framework ...
| * watchdog: mei: avoid oops in watchdog unregister code pathTomas Winkler2012-12-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With commit c7d3df3 "mei: use internal watchdog device registration tracking" will crash the kernel on shutdown path on systems where ME watchdog is not present. Since the watchdog was never initialized in such case the WDOG_UNREGISTERED bit is never set and the system crashes on access to uninitialized variables down the path. To solve the issue we query for NULL on watchdog driver driver_data to check whether the device is registered. This is handled in the driver and doesn't depend on watchdog core internals. Cc: Borislav Petkov <bp@alien8.de> Cc: Wanlong Gao <gaowanlong@cn.fujitsu.com> Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * watchdog: Orion: Fix possible null-deference in orion_wdt_probeJason Gunthorpe2012-12-191-0/+2
| | | | | | | | | | | | | | | | If the DT does not include a regs parameter then the null res would be dereferenced. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * watchdog: sp5100_tco: Add SB8x0 chipset supportTakahisa Tanaka2012-12-192-61/+306
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current sp5100_tco driver only supports SP5100/SB7x0 chipset, doesn't support SB8x0 chipset, because current sp5100_tco driver doesn't know that the offset address for watchdog timer was changed from SB8x0 chipset. The offset address of SP5100 and SB7x0 chipsets are as follows, quote from the AMD SB700/710/750 Register Reference Guide (Page 164) and the AMD SP5100 Register Reference Guide (Page 166). WatchDogTimerControl 69h WatchDogTimerBase0 6Ch WatchDogTimerBase1 6Dh WatchDogTimerBase2 6Eh WatchDogTimerBase3 6Fh In contrast, the offset address of SB8x0 chipset is as follows, quote from AMD SB800-Series Southbridges Register Reference Guide (Page 147). WatchDogTimerEn 48h WatchDogTimerConfig 4Ch So, In the case of SB8x0 chipset, sp5100_tco reads meaningless MMIO address (for example, 0xbafe00) from wrong offset address, and the following message is logged. SP5100 TCO timer: mmio address 0xbafe00 already in use With this patch, sp5100_tco driver supports SB8x0 chipset, and can avoid iomem resource conflict. The processing of this patch is as follows. Step 1) Attempt to get the watchdog base address from indirect I/O (0xCD6 and 0xCD7). - Go to the step 7 if obtained address hasn't conflicted with other resource. But, currently, the address (0xfec000f0) conflicts with the IOAPIC MMIO address, and the following message is logged. SP5100 TCO timer: mmio address 0xfec000f0 already in use 0xfec000f0 is recommended by AMD BIOS Developer's Guide. So, go to the next step. Step 2) Attempt to get the SBResource_MMIO base address from AcpiMmioEN (for SB8x0, PM_Reg:24h) or SBResource_MMIO (SP5100/SB7x0, PCI_Reg:9Ch) register. - Go to the step 7 if these register has enabled by BIOS, and obtained address hasn't conflicted with other resource. - If above condition isn't true, go to the next step. Step 3) Attempt to get the free MMIO address from allocate_resource(). - Go to the step 7 if these register has enabled by BIOS, and obtained address hasn't conflicted with other resource. - Driver initialization has failed if obtained address has conflicted with other resource, and no 'force_addr' parameter is specified. Step 4) Use the specified address If 'force_addr' parameter is specified. - allocate_resource() function may fail, when the PCI bridge device occupies iomem resource from 0xf0000000 to 0xffffffff. To handle such a case, I added 'force_addr' parameter to sp5100_tco driver. With 'force_addr' parameter, sp5100_tco driver directly can assign MMIO address for watchdog timer from free iomem region. Note that It's dangerous to specify wrong address in the 'force_addr' parameter. Example of force_addr parameter use # cat /proc/iomem ...snip... fec00000-fec003ff : IOAPIC 0 <--- free MMIO region fec10000-fec1001f : pnp 00:0b fec20000-fec203ff : IOAPIC 1 ...snip... # cat /etc/modprobe.d/sp5100_tco.conf options sp5100_tco force_addr=0xfec00800 # modprobe sp5100_tco # cat /proc/iomem ...snip... fec00000-fec003ff : IOAPIC 0 fec00800-fec00807 : SP5100 TCO <--- watchdog timer MMIO address fec10000-fec1001f : pnp 00:0b fec20000-fec203ff : IOAPIC 1 ...snip... # - Driver initialization has failed if specified address has conflicted with other resource. Step 5) Disable the watchdog timer - To rewrite the watchdog timer register of the chipset, absolutely guarantee that the watchdog timer is disabled. Step 6) Re-program the watchdog timer MMIO address to chipset. - Re-program the obtained MMIO address in Step 3 or Step 4 to chipset via indirect I/O (0xCD6 and 0xCD7). Step 7) Enable and setup the watchdog timer This patch has worked fine on my test environment (ASUS M4A89GTD-PRO/USB3 and DL165G7). therefore I believe that it's no problem to re-program the MMIO address for watchdog timer to chipset during disabled watchdog. However, I'm not sure about it, because I don't know much about chipset programming. So, any comments will be welcome. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=43176 Tested-by: Arkadiusz Miskiewicz <arekm@maven.pl> Tested-by: Paul Menzel <paulepanter@users.sourceforge.net> Signed-off-by: Takahisa Tanaka <mc74hc00@gmail.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * watchdog: davinci_wdt: add OF supportMurali Karicheri2012-12-192-0/+19
| | | | | | | | | | | | | | | | This adds OF support for davinci_wdt driver. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Acked-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * watchdog: da9052: Fix invalid free of devm_ allocated dataTushar Behera2012-12-191-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | It is not required to free devm_ allocated data. Since kref_put needs a valid release function, da9052_wdt_release_resources() is not deleted. Fixes following warning. drivers/watchdog/da9052_wdt.c:59:1-6: WARNING: invalid free of devm_ allocated data Signed-off-by: Tushar Behera <tushar.behera@linaro.org> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * watchdog: twl4030_wdt: Change TWL4030_MODULE_PM_RECEIVER to ↵Peter Ujfalusi2012-12-191-1/+1
| | | | | | | | | | | | | | | | | | | | TWL_MODULE_PM_RECEIVER To facilitate upcoming cleanup in twl stack. No functional changes. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * watchdog: remove depends on CONFIG_EXPERIMENTALKees Cook2012-12-191-2/+2
| | | | | | | | | | | | | | | | | | The CONFIG_EXPERIMENTAL config item has not carried much meaning for a while now and is almost always enabled by default. As agreed during the Linux kernel summit, remove it from any "depends on" lines in Kconfigs. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * watchdog: Convert dev_printk(KERN_<LEVEL> to dev_<level>(Joe Perches2012-12-191-10/+9
| | | | | | | | | | | | | | | | dev_<level> calls take less code than dev_printk(KERN_<LEVEL> and reducing object size is good. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * watchdog: DA9055 Watchdog driverAshish Jangam2012-12-193-0/+227
| | | | | | | | | | | | | | | | | | | | | | This is the Watchdog patch for the DA9055 PMIC. This patch has got dependency on the DA9055 MFD core. This patch is functionally tested on SMDK6410 Signed-off-by: David Dajun Chen <dchen@diasemi.com> Signed-off-by: Ashish Jangam <ashish.jangam@kpitcummins.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * watchdog: omap_wdt: eliminate gotoAaro Koskinen2012-12-191-7/+4
| | | | | | | | | | | | | | Eliminate a goto to simplify the code. Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * watchdog: omap_wdt: delete redundant platform_set_drvdata() callsAaro Koskinen2012-12-191-2/+0
| | | | | | | | | | | | | | It's not needed to manually reset the driver data. Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * watchdog: omap_wdt: convert to devm_ functionsAaro Koskinen2012-12-191-37/+13
| | | | | | | | | | | | | | | | Use devm_kzalloc(), devm_request_mem_region() ande devm_ioremap() to simplify the code. Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * watchdog: omap_wdt: convert to new watchdog coreAaro Koskinen2012-12-192-149/+112
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert omap_wdt to new watchdog core. On OMAP boards, there are usually multiple watchdogs. Since the new watchdog core supports multiple watchdogs, all watchdog drivers used on OMAP should be converted. The legacy watchdog device node is still created, so this should not break existing users. Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Tested-by: Jarkko Nikula <jarkko.nikula@jollamobile.com> Tested-by: Lokesh Vutla <lokeshvutla@ti.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * watchdog: WatchDog Timer Driver Core: fix commentFabio Porcedda2012-12-191-1/+1
| | | | | | | | | | Signed-off-by: Fabio Porcedda <fabio.porcedda@gmail.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * watchdog: s3c2410_wdt: use clk_prepare_enable and clk_disable_unprepareThomas Abraham2012-12-191-3/+3
| | | | | | | | | | | | | | | | Convert clk_enable/clk_disable to clk_prepare_enable/clk_disable_unprepare calls as required by common clock framework. Signed-off-by: Thomas Abraham <thomas.abraham@linaro.org> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * watchdog: imx2_wdt: Select the driver via ARCH_MXCFabio Estevam2012-12-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | With device tree support in place, we should not use IMX_HAVE_PLATFORM_IMX2_WDT as a dependency for selecting the imx2_wdt driver. Use ARCH_MXC symbol instead, so that the driver can be even selected by a device-tree only SoC, such as i.MX6. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Acked-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * watchdog: cpu5wdt.c: add missing del_timer calldevendra.aaru2012-12-191-0/+1
| | | | | | | | | | | | | | | | We do a setup_timer at init stage of the module, but we didn't de-activate the time using del_timer. Signed-off-by: devendra.aaru <devendra.aaru@gmail.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * watchdog: hpwdt.c: Increase version stringTom Mingarelli2012-12-191-1/+1
| | | | | | | | | | | | | | | | Changing the version of the driver for all the latest patches being applied for kdump fixes. Signed-off-by: Thomas Mingarelli <thomas.mingarelli@hp.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * watchdog: Convert twl4030_wdt to watchdog coreJarkko Nikula2012-12-192-149/+35
| | | | | | | | | | | | | | | | | | | | | | Convert the twl4030_wdt watchdog driver to watchdog core. While at there use devm_kzalloc and set the default timeout in order to be able test this driver with a simple shell script. Signed-off-by: Jarkko Nikula <jarkko.nikula@jollamobile.com> Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * davinci_wdt: preparation for switch to common clock frameworkKaricheri, Muralidharan2012-12-191-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | As a first step towards migrating davinci platforms to use common clock framework, replace all instances of clk_enable() with clk_prepare_enable() and clk_disable() with clk_disable_unprepare(). Until the platform is switched to use the CONFIG_HAVE_CLK_PREPARE Kconfig variable, this just adds a might_sleep() call and would work without any issues. This will make it easy later to switch to common clk based implementation of clk driver from DaVinci specific driver. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * watchdog: sp805_wdt.c: use clk_prepare_enable and clk_disable_unprepareJulia Lawall2012-12-191-9/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Clk_prepare_enable and clk_disable_unprepare combine clk_prepare and clk_enable, and clk_disable and clk_unprepare. They make the code more concise, and ensure that clk_unprepare is called when clk_enable fails. A simplified version of the semantic patch that introduces calls to these functions is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression e; @@ - clk_prepare(e); - clk_enable(e); + clk_prepare_enable(e); @@ expression e; @@ - clk_disable(e); - clk_unprepare(e); + clk_disable_unprepare(e); // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * watchdog: ath79_wdt: convert to use module_platform_driverGabor Juhos2012-12-191-11/+2
| | | | | | | | | | | | | | | | This makes the code a bit smaller by getting rid of some boilerplate code. Signed-off-by: Gabor Juhos <juhosg@openwrt.org> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* | Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds2012-12-214-17/+18
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull CIFS fixes from Steve French: "Misc small cifs fixes" * 'for-next' of git://git.samba.org/sfrench/cifs-2.6: cifs: eliminate cifsERROR variable cifs: don't compare uniqueids in cifs_prime_dcache unless server inode numbers are in use cifs: fix double-free of "string" in cifs_parse_mount_options
| * | cifs: eliminate cifsERROR variableJeff Layton2012-12-202-6/+1
| | | | | | | | | | | | | | | | | | | | | It's always set to "1" and there's no way to change it to anything else. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | cifs: don't compare uniqueids in cifs_prime_dcache unless server inode ↵Jeff Layton2012-12-201-4/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | numbers are in use Oliver reported that commit cd60042c caused his cifs mounts to continually thrash through new inodes on readdir. His servers are not sending inode numbers (or he's not using them), and the new test in that function doesn't account for that sort of setup correctly. If we're not using server inode numbers, then assume that the inode attached to the dentry hasn't changed. Go ahead and update the attributes in place, but keep the same inode number. Cc: <stable@vger.kernel.org> # v3.5+ Reported-and-Tested-by: Oliver Mössinger <Oliver.Moessinger@ichaus.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | cifs: fix double-free of "string" in cifs_parse_mount_optionsJeff Layton2012-12-201-7/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Dan reported the following regression in commit d387a5c5: + fs/cifs/connect.c:1903 cifs_parse_mount_options() error: double free of 'string' That patch has some of the new option parsing code free "string" without setting the variable to NULL afterward. Since "string" is automatically freed in an error condition, fix the code to just rely on that instead of freeing it explicitly. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
* | | Merge tag 'dm-3.8-fixes' of ↵Linus Torvalds2012-12-2130-443/+522
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm Pull dm update from Alasdair G Kergon: "Miscellaneous device-mapper fixes, cleanups and performance improvements. Of particular note: - Disable broken WRITE SAME support in all targets except linear and striped. Use it when kcopyd is zeroing blocks. - Remove several mempools from targets by moving the data into the bio's new front_pad area(which dm calls 'per_bio_data'). - Fix a race in thin provisioning if discards are misused. - Prevent userspace from interfering with the ioctl parameters and use kmalloc for the data buffer if it's small instead of vmalloc. - Throttle some annoying error messages when I/O fails." * tag 'dm-3.8-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm: (36 commits) dm stripe: add WRITE SAME support dm: remove map_info dm snapshot: do not use map_context dm thin: dont use map_context dm raid1: dont use map_context dm flakey: dont use map_context dm raid1: rename read_record to bio_record dm: move target request nr to dm_target_io dm snapshot: use per_bio_data dm verity: use per_bio_data dm raid1: use per_bio_data dm: introduce per_bio_data dm kcopyd: add WRITE SAME support to dm_kcopyd_zero dm linear: add WRITE SAME support dm: add WRITE SAME support dm: prepare to support WRITE SAME dm ioctl: use kmalloc if possible dm ioctl: remove PF_MEMALLOC dm persistent data: improve improve space map block alloc failure message dm thin: use DMERR_LIMIT for errors ...
| * | | dm stripe: add WRITE SAME supportMike Snitzer2012-12-211-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename stripe_map_discard to stripe_map_range and reuse it for WRITE SAME bio processing. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * | | dm: remove map_infoMikulas Patocka2012-12-2114-54/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes map_info from bio-based device mapper targets. map_info is still used for request-based targets. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * | | dm snapshot: do not use map_contextMikulas Patocka2012-12-211-13/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Eliminate struct map_info from dm-snap. map_info->ptr was used in dm-snap to indicate if the bio was tracked. If map_info->ptr was non-NULL, the bio was linked in tracked_chunk_hash. This patch removes the use of map_info->ptr. We determine if the bio was tracked based on hlist_unhashed(&c->node). If hlist_unhashed is true, the bio is not tracked, if it is false, the bio is tracked. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * | | dm thin: dont use map_contextMikulas Patocka2012-12-211-36/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes endio_hook_pool from dm-thin and uses per-bio data instead. This patch removes any use of map_info in preparation for the next patch that removes map_info from bio-based device mapper. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * | | dm raid1: dont use map_contextMikulas Patocka2012-12-211-9/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Don't use map_info any more in dm-raid1. map_info was used for writes to hold the region number. For this purpose we add a new field dm_bio_details to dm_raid1_bio_record. map_info was used for reads to hold a pointer to dm_raid1_bio_record (if the pointer was non-NULL, bio details were saved; if the pointer was NULL, bio details were not saved). We use dm_raid1_bio_record.details->bi_bdev for this purpose. If bi_bdev is NULL, details were not saved, if bi_bdev is non-NULL, details were saved. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * | | dm flakey: dont use map_contextMikulas Patocka2012-12-211-4/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace map_info with a per-bio structure "struct per_bio_data" in dm-flakey. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * | | dm raid1: rename read_record to bio_recordMikulas Patocka2012-12-211-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename struct read_record to bio_record in dm-raid1. In the following patch, the structure will be used for both read and write bios, so rename it. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * | | dm: move target request nr to dm_target_ioMikulas Patocka2012-12-214-8/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch moves target_request_nr from map_info to dm_target_io and makes it accessible with dm_bio_get_target_request_nr. This patch is a preparation for the next patch that removes map_info. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * | | dm snapshot: use per_bio_dataMikulas Patocka2012-12-211-35/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace tracked_chunk_pool with per_bio_data in dm-snap. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * | | dm verity: use per_bio_dataMikulas Patocka2012-12-211-18/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace io_mempool with per_bio_data in dm-verity. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * | | dm raid1: use per_bio_dataMikulas Patocka2012-12-211-34/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace read_record_pool with per_bio_data in dm-raid1. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * | | dm: introduce per_bio_dataMikulas Patocka2012-12-214-20/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a field per_bio_data_size in struct dm_target. Targets can set this field in the constructor. If a target sets this field to a non-zero value, "per_bio_data_size" bytes of auxiliary data are allocated for each bio submitted to the target. These data can be used for any purpose by the target and help us improve performance by removing some per-target mempools. Per-bio data is accessed with dm_per_bio_data. The argument data_size must be the same as the value per_bio_data_size in dm_target. If the target has a pointer to per_bio_data, it can get a pointer to the bio with dm_bio_from_per_bio_data() function (data_size must be the same as the value passed to dm_per_bio_data). Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * | | dm kcopyd: add WRITE SAME support to dm_kcopyd_zeroMike Snitzer2012-12-213-10/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add WRITE SAME support to dm-io and make it accessible to dm_kcopyd_zero(). dm_kcopyd_zero() provides an asynchronous interface whereas the blkdev_issue_write_same() interface is synchronous. WRITE SAME is a SCSI command that can be leveraged for more efficient zeroing of a specified logical extent of a device which supports it. Only a single zeroed logical block is transfered to the target for each WRITE SAME and the target then writes that same block across the specified extent. The dm thin target uses this. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * | | dm linear: add WRITE SAME supportMike Snitzer2012-12-211-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The linear target can already support WRITE SAME requests so signal this by setting num_write_same_requests to 1. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * | | dm: add WRITE SAME supportMike Snitzer2012-12-211-5/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | WRITE SAME bios have a payload that contain a single page. When cloning WRITE SAME bios DM has no need to modify the bi_io_vec attributes (and doing so would be detrimental). DM need only alter the start and end of the WRITE SAME bio accordingly. Rather than duplicate __clone_and_map_discard, factor out a common function that is also used by __clone_and_map_write_same. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * | | dm: prepare to support WRITE SAMEMike Snitzer2012-12-212-1/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow targets to opt in to WRITE SAME support by setting 'num_write_same_requests' in the dm_target structure. A dm device will only advertise WRITE SAME support if all its targets and all its underlying devices support it. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * | | dm ioctl: use kmalloc if possibleMikulas Patocka2012-12-211-13/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the parameter buffer is small enough, try to allocate it with kmalloc() rather than vmalloc(). vmalloc is noticeably slower than kmalloc because it has to manipulate page tables. In my tests, on PA-RISC this patch speeds up activation 13 times. On Opteron this patch speeds up activation by 5%. This patch introduces a new function free_params() to free the parameters and this uses new flags that record whether or not vmalloc() was used and whether or not the input buffer must be wiped after use. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * | | dm ioctl: remove PF_MEMALLOCMikulas Patocka2012-12-212-11/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When allocating memory for the userspace ioctl data, set some appropriate GPF flags directly instead of using PF_MEMALLOC. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
OpenPOWER on IntegriCloud