op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message (Collapse)	Author	Age	Files	Lines
*	this_cpu: Eliminate get/put_cpu	Christoph Lameter	2009-10-03	1	-23/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are cases where we can use this_cpu_ptr and as the result of using this_cpu_ptr() we no longer need to determine the currently executing cpu. In those places no get/put_cpu combination is needed anymore. The local cpu variable can be eliminated. Preemption still needs to be disabled and enabled since the modifications of the per cpu variables is not atomic. There may be multiple per cpu variables modified and those must all be from the same processor. Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Tejun Heo <tj@kernel.org> cc: Eric Biederman <ebiederm@aristanetworks.com> cc: Stephen Hemminger <shemminger@vyatta.com> cc: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Tejun Heo <tj@kernel.org>
*	Merge branch 'next' of ↵	NeilBrown	2009-09-23	32	-2403/+6071
\|\ \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx into for-linus
\| *	ioat3: fix uninitialized var warnings	Dan Williams	2009-09-21	1	-6/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	drivers/dma/ioat/dma_v3.c: In function 'ioat3_prep_memset_lock': drivers/dma/ioat/dma_v3.c:439: warning: 'fill' may be used uninitialized in this function drivers/dma/ioat/dma_v3.c:437: warning: 'desc' may be used uninitialized in this function drivers/dma/ioat/dma_v3.c: In function '__ioat3_prep_xor_lock': drivers/dma/ioat/dma_v3.c:489: warning: 'xor' may be used uninitialized in this function drivers/dma/ioat/dma_v3.c:486: warning: 'desc' may be used uninitialized in this function drivers/dma/ioat/dma_v3.c: In function '__ioat3_prep_pq_lock': drivers/dma/ioat/dma_v3.c:631: warning: 'pq' may be used uninitialized in this function drivers/dma/ioat/dma_v3.c:628: warning: 'desc' may be used uninitialized in this function gcc-4.0, unlike gcc-4.3, does not see that these variables are initialized before use. Convert the descriptor loops to do-while make this initialization apparent. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| *	drivers/dma/ioat/dma_v2.c: fix warnings	Andrew Morton	2009-09-21	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	drivers/dma/ioat/dma_v2.c: In function 'ioat2_dma_prep_memcpy_lock': drivers/dma/ioat/dma_v2.c:680: warning: 'hw' may be used uninitialized in this function drivers/dma/ioat/dma_v2.c:681: warning: 'desc' may be used uninitialized in this function Cc: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| *	ioat2: clarify ring size limits	Dan Williams	2009-09-16	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With the addition of ioat_max_alloc_order it is not clear what the maximum allocation order is, so document that in the modinfo. Also take an opportunity to kill a stray semicolon. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| *	ioat: driver version 4.0	Dan Williams	2009-09-10	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A new ring implementation and the addition of raid functionality constitutes a bump in the driver major version number. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| *	dca: registering requesters in multiple dca domains	Maciej Sosnowski	2009-09-10	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch enables DCA support on multiple-IOH/multiple-IIO architectures. It modifies dca module by replacing single dca_providers list with dca_domains list, each domain containing separate list of providers. This approach lets dca driver manage multiple domains, i.e. sets of providers and requesters mapped back to the same PCI root complex device. The driver takes care to register each requester to a provider from the same domain. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
\| *	async_tx: remove HIGHMEM64G restriction	Dan Williams	2009-09-08	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This restriction prevented ASYNC_TX_DMA from being enabled on platform configurations where DMA address conversion could not be performed in place on the stack. Since commit 04ce9ab3 ("async_xor: permit callers to pass in a 'dma/page scribble' region") the async_tx api now either uses a caller provided 'scribble' buffer, or performs the conversion in place when sizeof(dma_addr_t) <= sizeof(struct page *). Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| *	dmaengine: sh: Add Support SuperH DMA Engine driver	Nobuhiro Iwamatsu	2009-09-08	4	-0/+859
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This supported all DMA channels, and it was tested in SH7722, SH7780, SH7785 and SH7763. This can not use with SH DMA API. Signed-off-by: Nobuhiro Iwamatsu <iwamatsu.nobuhiro@renesas.com> Reviewed-by: Matt Fleming <matt@console-pimps.org> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| *	Merge branch 'dmaengine' into async-tx-next	Dan Williams	2009-09-08	20	-74/+3618
\| \|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Conflicts: crypto/async_tx/async_xor.c drivers/dma/ioat/dma_v2.h drivers/dma/ioat/pci.c drivers/md/raid5.c
\| \| *	dmaengine: Move all map_sg/unmap_sg for slave channel to its client	Atsushi Nemoto	2009-09-08	2	-33/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Dan Williams wrote: ... DMA-slave clients request specific channels and know the hardware details at a low level, so it should not be too high an expectation to push dma mapping responsibility to the client. Also this patch includes DMA_COMPL_{SRC,DEST}_UNMAP_SINGLE support for dw_dmac driver. Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| \| *	fsldma: Add DMA_SLAVE support	Ira Snyder	2009-09-08	1	-0/+227
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use the DMA_SLAVE capability of the DMAEngine API to copy/from a scatterlist into an arbitrary list of hardware address/length pairs. This allows a single DMA transaction to copy data from several different devices into a scatterlist at the same time. This also adds support to enable some controller-specific features such as external start and external pause for a DMA transaction. [dan.j.williams@intel.com: rebased on tx_list movement] Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu> Acked-by: Li Yang <leoli@freescale.com> Acked-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| \| *	fsldma: split apart external pause and request count features	Ira Snyder	2009-09-08	2	-17/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When using the Freescale DMA controller in external control mode, both the request count and external pause bits need to be setup correctly. This was being done with the same function. The 83xx controller lacks the external pause feature, but has a similar feature called external start. This feature requires that the request count bits be setup correctly. Split the function into two parts, to make it possible to use the external start feature on the 83xx controller. Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| \| *	ioat2,3: cacheline align software descriptor allocations	Dan Williams	2009-09-08	3	-4/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	All the necessary fields for handling an ioat2,3 ring entry can fit into one cacheline. Move ->len prior to ->txd in struct ioat_ring_ent, and move allocation of these entries to a hw-cache-aligned kmem cache to reduce the number of cachelines dirtied for descriptor management. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| \| *	dmaengine: kill tx_list	Dan Williams	2009-09-08	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The tx_list attribute of struct dma_async_tx_descriptor is common to most, but not all dma driver implementations. None of the upper level code (dmaengine/async_tx) uses it, so allow drivers to implement it locally if they need it. This saves sizeof(struct list_head) bytes for drivers that do not manage descriptors with a linked list (e.g.: ioatdma v2,3). Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| \| *	txx9dmac: implement a private tx_list	Dan Williams	2009-09-08	2	-13/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Drop txx9dmac's use of tx_list from struct dma_async_tx_descriptor in preparation for removal of this field. Cc: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| \| *	at_hdmac: implement a private tx_list	Dan Williams	2009-09-08	2	-8/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Drop at_hdmac's use of tx_list from struct dma_async_tx_descriptor in preparation for removal of this field. Cc: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| \| *	mv_xor: implement a private tx_list	Dan Williams	2009-09-08	2	-5/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Drop mv_xor's use of tx_list from struct dma_async_tx_descriptor in preparation for removal of this field. Cc: Saeed Bishara <saeed@marvell.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| \| *	ioat: implement a private tx_list	Dan Williams	2009-09-08	2	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Drop ioatdma's use of tx_list from struct dma_async_tx_descriptor in preparation for removal of this field. Cc: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| \| *	iop-adma: implement a private tx_list	Dan Williams	2009-09-08	1	-4/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Drop iop-adma's use of tx_list from struct dma_async_tx_descriptor in preparation for removal of this field. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| \| *	fsldma: implement a private tx_list	Dan Williams	2009-09-08	2	-7/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Drop fsldma's use of tx_list from struct dma_async_tx_descriptor in preparation for removal of this field. Cc: Li Yang <leoli@freescale.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| \| *	dw_dmac: implement a private tx_list	Dan Williams	2009-09-08	2	-9/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Drop dw_dmac's use of tx_list from struct dma_async_tx_descriptor in preparation for removal of this field. Cc: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| \| *	Merge branch 'ioat' into dmaengine	Dan Williams	2009-09-08	13	-2034/+2598
\| \| \|\
\| \| * \	Merge commit 'v2.6.31-rc1' into dmaengine	Dan Williams	2009-09-08	4	-0/+1674
\| \| \|\ \
\| * \| \ \	Merge branch 'iop-raid6' into async-tx-next	Dan Williams	2009-09-08	1	-44/+393
\| \|\ \ \ \
\| \| * \| \| \|	iop-adma: P+Q self test	Dan Williams	2009-08-29	1	-1/+181
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Even though the intent is to extend dmatest with P+Q tests there is still value in having an always-on sanity check to prevent an unintentionally broken driver from registering. This depends on raid6_pq.ko for verification, the side effect being that PQ capable channels will fail to register when raid6 is disabled. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| \| * \| \| \|	iop-adma: P+Q support for iop13xx adma engines	Dan Williams	2009-08-29	1	-34/+197
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	iop33x support is not included because that engine is a bit more awkward to handle in that it can either be in xor mode or pq mode. The dmaengine/async_tx layers currently only comprehend static capabilities. Note iop13xx does not support hardware PQ continuation so the driver must handle the DMA_PREP_CONTINUE flag for operations across > 16 sources. From the comment for dma_maxpq: /* When an engine does not support native continuation we need 3 extra * source slots to reuse P and Q with the following coefficients: * 1/ {00} * P : remove P from Q', but use it as a source for P' * 2/ {01} * Q : use Q to continue Q' calculation * 3/ {00} * Q : subtract Q from P' to cancel (2) */ Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| \| * \| \| \|	iop-adma: fix lockdep false positive	Dan Williams	2009-08-29	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	lockdep correctly identifies a potential recursive locking case for iop_chan->lock, but in the dependency submission case we expect that the same class will be acquired for both the parent dependency and the child channel. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| \| * \| \| \|	iop-adma: cleanup iop_adma_run_tx_complete_actions	Dan Williams	2009-08-29	1	-9/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Replace 'desc->async_tx.' with 'tx->' [ Impact: pure cleanup ] Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| * \| \| \| \|	I/OAT: Convert to PCI_VDEVICE()	Roland Dreier	2009-09-08	1	-23/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Trivial cleanup to make the PCI ID table easier to read. [dan.j.williams@intel.com: extended to v3.2 devices] Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| * \| \| \| \|	Add MODULE_DEVICE_TABLE() so ioatdma module is autoloaded	Roland Dreier	2009-09-08	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The ioatdma module is missing aliases for the PCI devices it supports, so it is not autoloaded on boot. Add a MODULE_DEVICE_TABLE() to get these aliases. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| * \| \| \| \|	ioat3: segregate raid engines	Dan Williams	2009-09-08	3	-9/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The cleanup routine for the raid cases imposes extra checks for handling raid descriptors and extended descriptors. If the channel does not support raid it can avoid this extra overhead by using the ioat2 cleanup path. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| * \| \| \| \|	ioat3: ioat3.2 pci ids for Jasper Forest	Tom Picard	2009-09-08	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Jasper Forest introduces raid offload support via ioat3.2 support. When raid offload is enabled two (out of 8 channels) will report raid5/raid6 offload capabilities. The remaining channels will only report ioat3.0 capabilities (memcpy). Signed-off-by: Tom Picard <tom.s.picard@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| * \| \| \| \|	ioat3: interrupt descriptor support	Dan Williams	2009-09-08	1	-1/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The async_tx api uses the DMA_INTERRUPT operation type to terminate a chain of issued operations with a callback routine. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| * \| \| \| \|	ioat3: support xor via pq descriptors	Dan Williams	2009-09-08	1	-0/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a platform advertises pq capabilities, but not xor, then use ioat3_prep_pqxor and ioat3_prep_pqxor_val to simulate xor support. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| * \| \| \| \|	ioat3: pq support	Dan Williams	2009-09-08	1	-1/+264
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ioat3.2 adds support for raid6 syndrome generation (xor sum of galois field multiplication products) using up to 8 sources. It can also perform an pq-zero-sum operation to validate whether the syndrome for a given set of sources matches a previously computed syndrome. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| * \| \| \| \|	ioat3: xor self test	Dan Williams	2009-09-08	4	-3/+282
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds a hardware specific self test to be called from ioat_probe. In the ioat3 case we will have tests for all the different raid operations, while ioat1 and ioat2 will continue to just test memcpy. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| * \| \| \| \|	ioat3: xor support	Dan Williams	2009-09-08	4	-3/+222
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ioat3.2 adds xor offload support for up to 8 sources. It can also perform an xor-zero-sum operation to validate whether all given sources sum to zero, without writing to a destination. Xor descriptors differ from memcpy in that one operation may require multiple descriptors depending on the number of sources. When the number of sources exceeds 5 an extended descriptor is needed. These descriptors need to be accounted for when updating the DMA_COUNT register. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| * \| \| \| \|	ioat3: enable dca for completion writes	Dan Williams	2009-09-08	2	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Tag completion writes for direct cache access to reduce the latency of checking for descriptor completions. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| * \| \| \| \|	ioat: add 'ioat' sysfs attributes	Dan Williams	2009-09-08	6	-6/+166
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Export driver attributes for diagnostic purposes: 'ring_size': total number of descriptors available to the engine 'ring_active': number of descriptors in-flight 'capabilities': supported operation types for this channel 'version': Intel(R) QuickData specfication revision This also allows some chattiness to be removed from the driver startup as this information is now available via sysfs. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| * \| \| \| \|	ioat3: split ioat3 support to its own file, add memset	Dan Williams	2009-09-08	7	-84/+421
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Up until this point the driver for Intel(R) QuickData Technology engines, specification versions 2 and 3, were mostly identical save for a few quirks. Version 3.2 hardware adds many new capabilities (like raid offload support) requiring some infrastructure that is not relevant for v2. For better code organization of the new funcionality move v3 and v3.2 support to its own file dma_v3.c, and export some routines from the base files (dma.c and dma_v2.c) that can be reused directly. The first new capability included in this code reorganization is support for v3.2 memset operations. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| * \| \| \| \|	ioat3: hardware version 3.2 register / descriptor definitions	Dan Williams	2009-09-08	4	-2/+185
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ioat3.2 adds raid5 and raid6 offload capabilities. Signed-off-by: Tom Picard <tom.s.picard@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| * \| \| \| \|	ioat2+: add fence support	Dan Williams	2009-09-08	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In preparation for adding more operation types to the ioat3 path the driver needs to honor the DMA_PREP_FENCE flag. For example the async_tx api will hand xor->memcpy->xor chains to the driver with the 'fence' flag set on the first xor and the memcpy operation. This flag in turn sets the 'fence' flag in the descriptor control field telling the hardware that future descriptors in the chain depend on the result of the current descriptor, so wait for all writes to complete before starting the next operation. Note that ioat1 does not prefetch the descriptor chain, so does not require/support fenced operations. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| * \| \| \| \|	dmaengine, async_tx: support alignment checks	Dan Williams	2009-09-08	1	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some engines have transfer size and address alignment restrictions. Add a per-operation alignment property to struct dma_device that the async routines and dmatest can use to check alignment capabilities. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| * \| \| \| \|	dmaengine: cleanup unused transaction types	Dan Williams	2009-09-08	1	-4/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	No drivers currently implement these operation types, so they can be deleted. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| * \| \| \| \|	dmaengine, async_tx: add a "no channel switch" allocator	Dan Williams	2009-09-08	2	-0/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Channel switching is problematic for some dmaengine drivers as the architecture precludes separating the ->prep from ->submit. In these cases the driver can select ASYNC_TX_DISABLE_CHANNEL_SWITCH to modify the async_tx allocator to only return channels that support all of the required asynchronous operations. For example MD_RAID456=y selects support for asynchronous xor, xor validate, pq, pq validate, and memcpy. When ASYNC_TX_DISABLE_CHANNEL_SWITCH=y any channel with all these capabilities is marked DMA_ASYNC_TX allowing async_tx_find_channel() to quickly locate compatible channels with the guarantee that dependency chains will remain on one channel. When ASYNC_TX_DISABLE_CHANNEL_SWITCH=n async_tx_find_channel() may select channels that lead to operation chains that need to cross channel boundaries using the async_tx channel switch capability. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| * \| \| \| \|	Merge branch 'md-raid6-accel' into ioat3.2	Dan Williams	2009-09-08	4	-58/+63
\| \|\ \ \ \ \ \| \| \|/ / / / \| \| \| \| \| / \| \| \|_\|_\|/ \| \|/\| \| \|	Conflicts: include/linux/dmaengine.h
\| \| * \| \|	dmatest: add pq support	Dan Williams	2009-08-29	1	-0/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Test raid6 p+q operations with a simple "always multiply by 1" q calculation to fit into dmatest's current destination verification scheme. Reviewed-by: Andre Noll <maan@systemlinux.org> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| \| * \| \|	async_tx: add support for asynchronous GF multiplication	Dan Williams	2009-08-29	2	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[ Based on an original patch by Yuri Tikhonov ] This adds support for doing asynchronous GF multiplication by adding two additional functions to the async_tx API: async_gen_syndrome() does simultaneous XOR and Galois field multiplication of sources. async_syndrome_val() validates the given source buffers against known P and Q values. When a request is made to run async_pq against more than the hardware maximum number of supported sources we need to reuse the previous generated P and Q values as sources into the next operation. Care must be taken to remove Q from P' and P from Q'. For example to perform a 5 source pq op with hardware that only supports 4 sources at a time the following approach is taken: p, q = PQ(src0, src1, src2, src3, COEF({01}, {02}, {04}, {08})) p', q' = PQ(p, q, q, src4, COEF({00}, {01}, {00}, {10})) p' = p + q + q + src4 = p + src4 q' = {00}p + {01}q + {00}q + {10}src4 = q + {10}*src4 Note: 4 is the minimum acceptable maxpq otherwise we punt to synchronous-software path. The DMA_PREP_CONTINUE flag indicates to the driver to reuse p and q as sources (in the above manner) and fill the remaining slots up to maxpq with the new sources/coefficients. Note1: Some devices have native support for P+Q continuation and can skip this extra work. Devices with this capability can advertise it with dma_set_maxpq. It is up to each driver how to handle the DMA_PREP_CONTINUE flag. Note2: The api supports disabling the generation of P when generating Q, this is ignored by the synchronous path but is implemented by some dma devices to save unnecessary writes. In this case the continuation algorithm is simplified to only reuse Q as a source. Cc: H. Peter Anvin <hpa@zytor.com> Cc: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Yuri Tikhonov <yur@emcraft.com> Signed-off-by: Ilya Yanok <yanok@emcraft.com> Reviewed-by: Andre Noll <maan@systemlinux.org> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| \| * \| \|	async_tx: remove walk of tx->parent chain in dma_wait_for_async_tx	Dan Williams	2009-08-29	1	-35/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We currently walk the parent chain when waiting for a given tx to complete however this walk may race with the driver cleanup routine. The routines in async_raid6_recov.c may fall back to the synchronous path at any point so we need to be prepared to call async_tx_quiesce() (which calls dma_wait_for_async_tx). To remove the ->parent walk we guarantee that every time a dependency is attached ->issue_pending() is invoked, then we can simply poll the initial descriptor until completion. This also allows for a lighter weight 'issue pending' implementation as there is no longer a requirement to iterate through all the channels' ->issue_pending() routines as long as operations have been submitted in an ordered chain. async_tx_issue_pending() is added for this case. Signed-off-by: Dan Williams <dan.j.williams@intel.com>