summaryrefslogtreecommitdiffstats
path: root/arch/x86/crypto/aesni-intel_glue.c
Commit message (Collapse)AuthorAgeFilesLines
* crypto: aead - Remove CRYPTO_ALG_AEAD_NEW flagHerbert Xu2015-08-171-2/+1
| | | | | | | This patch removes the CRYPTO_ALG_AEAD_NEW flag now that everyone has been converted. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: aesni - Use new IV conventionHerbert Xu2015-07-141-36/+20
| | | | | | | This patch converts rfc4106 to the new calling convention where the IV is now in the AD and needs to be skipped. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: aesni - fix failing setkey for rfc4106-gcm-aesniTadeusz Struk2015-06-291-1/+1
| | | | | | | | | | | rfc4106(gcm(aes)) uses ctr(aes) to generate hash key. ctr(aes) needs chainiv, but the chainiv gets initialized after aesni_intel when both are statically linked so the setkey fails. This patch forces aesni_intel to be initialized after chainiv. Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Tested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds2015-06-221-256/+167
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull crypto update from Herbert Xu: "Here is the crypto update for 4.2: API: - Convert RNG interface to new style. - New AEAD interface with one SG list for AD and plain/cipher text. All external AEAD users have been converted. - New asymmetric key interface (akcipher). Algorithms: - Chacha20, Poly1305 and RFC7539 support. - New RSA implementation. - Jitter RNG. - DRBG is now seeded with both /dev/random and Jitter RNG. If kernel pool isn't ready then DRBG will be reseeded when it is. - DRBG is now the default crypto API RNG, replacing krng. - 842 compression (previously part of powerpc nx driver). Drivers: - Accelerated SHA-512 for arm64. - New Marvell CESA driver that supports DMA and more algorithms. - Updated powerpc nx 842 support. - Added support for SEC1 hardware to talitos" * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (292 commits) crypto: marvell/cesa - remove COMPILE_TEST dependency crypto: algif_aead - Temporarily disable all AEAD algorithms crypto: af_alg - Forbid the use internal algorithms crypto: echainiv - Only hold RNG during initialisation crypto: seqiv - Add compatibility support without RNG crypto: eseqiv - Offer normal cipher functionality without RNG crypto: chainiv - Offer normal cipher functionality without RNG crypto: user - Add CRYPTO_MSG_DELRNG crypto: user - Move cryptouser.h to uapi crypto: rng - Do not free default RNG when it becomes unused crypto: skcipher - Allow givencrypt to be NULL crypto: sahara - propagate the error on clk_disable_unprepare() failure crypto: rsa - fix invalid select for AKCIPHER crypto: picoxcell - Update to the current clk API crypto: nx - Check for bogus firmware properties crypto: marvell/cesa - add DT bindings documentation crypto: marvell/cesa - add support for Kirkwood and Dove SoCs crypto: marvell/cesa - add support for Orion SoCs crypto: marvell/cesa - add allhwsupport module parameter crypto: marvell/cesa - add support for all armada SoCs ...
| * crypto: aesni - Convert rfc4106 to new AEAD interfaceHerbert Xu2015-06-031-167/+83
| | | | | | | | | | | | | | This patch converts the low-level __gcm-aes-aesni algorithm to the new AEAD interface. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
| * crypto: aesni - Convert top-level rfc4106 algorithm to new interfaceHerbert Xu2015-06-031-89/+83
| | | | | | | | | | | | | | | | | | | | | | | | This patch converts rfc4106-gcm-aesni to the new AEAD interface. The low-level interface remains as is for now because we can't touch it until cryptd itself is upgraded. In the conversion I've also removed the duplicate copy of the context in the top-level algorithm. Now all processing is carried out in the low-level __driver-gcm-aes-aesni algorithm. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
| * crypto: aesni - Use crypto_aead_set_reqsize helperHerbert Xu2015-05-131-2/+3
| | | | | | | | | | | | | | This patch uses the crypto_aead_set_reqsize helper to avoid directly touching the internals of aead. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* | x86/fpu: Rename i387.h to fpu/api.hIngo Molnar2015-05-191-1/+1
|/ | | | | | | | | | | | | | | | | | We already have fpu/types.h, move i387.h to fpu/api.h. The file name has become a misnomer anyway: it offers generic FPU APIs, but is not limited to i387 functionality. Reviewed-by: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds2015-04-151-62/+125
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull crypto update from Herbert Xu: "Here is the crypto update for 4.1: New interfaces: - user-space interface for AEAD - user-space interface for RNG (i.e., pseudo RNG) New hashes: - ARMv8 SHA1/256 - ARMv8 AES - ARMv8 GHASH - ARM assembler and NEON SHA256 - MIPS OCTEON SHA1/256/512 - MIPS img-hash SHA1/256 and MD5 - Power 8 VMX AES/CBC/CTR/GHASH - PPC assembler AES, SHA1/256 and MD5 - Broadcom IPROC RNG driver Cleanups/fixes: - prevent internal helper algos from being exposed to user-space - merge common code from assembly/C SHA implementations - misc fixes" * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (169 commits) crypto: arm - workaround for building with old binutils crypto: arm/sha256 - avoid sha256 code on ARMv7-M crypto: x86/sha512_ssse3 - move SHA-384/512 SSSE3 implementation to base layer crypto: x86/sha256_ssse3 - move SHA-224/256 SSSE3 implementation to base layer crypto: x86/sha1_ssse3 - move SHA-1 SSSE3 implementation to base layer crypto: arm64/sha2-ce - move SHA-224/256 ARMv8 implementation to base layer crypto: arm64/sha1-ce - move SHA-1 ARMv8 implementation to base layer crypto: arm/sha2-ce - move SHA-224/256 ARMv8 implementation to base layer crypto: arm/sha256 - move SHA-224/256 ASM/NEON implementation to base layer crypto: arm/sha1-ce - move SHA-1 ARMv8 implementation to base layer crypto: arm/sha1_neon - move SHA-1 NEON implementation to base layer crypto: arm/sha1 - move SHA-1 ARM asm implementation to base layer crypto: sha512-generic - move to generic glue implementation crypto: sha256-generic - move to generic glue implementation crypto: sha1-generic - move to generic glue implementation crypto: sha512 - implement base layer for SHA-512 crypto: sha256 - implement base layer for SHA-256 crypto: sha1 - implement base layer for SHA-1 crypto: api - remove instance when test failed crypto: api - Move alg ref count init to crypto_check_alg ...
| * crypto: aesni - mark AES-NI helper ciphersStephan Mueller2015-03-311-8/+15
| | | | | | | | | | | | | | | | Flag all AES-NI helper ciphers as internal ciphers to prevent them from being called by normal users. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
| * crypto: aesni - make driver-gcm-aes-aesni helper a proper aead algTadeusz Struk2015-02-281-54/+110
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Changed the __driver-gcm-aes-aesni to be a proper aead algorithm. This required a valid setkey and setauthsize functions to be added and also some changes to make sure that math context is not corrupted when the alg is used directly. Note that the __driver-gcm-aes-aesni should not be used directly by modules that can use it in interrupt context as we don't have a good fallback mechanism in this case. Signed-off-by: Adrian Hoban <adrian.hoban@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* | crypto: aesni - fix memory usage in GCM decryptionStephan Mueller2015-03-131-2/+2
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The kernel crypto API logic requires the caller to provide the length of (ciphertext || authentication tag) as cryptlen for the AEAD decryption operation. Thus, the cipher implementation must calculate the size of the plaintext output itself and cannot simply use cryptlen. The RFC4106 GCM decryption operation tries to overwrite cryptlen memory in req->dst. As the destination buffer for decryption only needs to hold the plaintext memory but cryptlen references the input buffer holding (ciphertext || authentication tag), the assumption of the destination buffer length in RFC4106 GCM operation leads to a too large size. This patch simply uses the already calculated plaintext size. In addition, this patch fixes the offset calculation of the AAD buffer pointer: as mentioned before, cryptlen already includes the size of the tag. Thus, the tag does not need to be added. With the addition, the AAD will be written beyond the already allocated buffer. Note, this fixes a kernel crash that can be triggered from user space via AF_ALG(aead) -- simply use the libkcapi test application from [1] and update it to use rfc4106-gcm-aes. Using [1], the changes were tested using CAVS vectors to demonstrate that the crypto operation still delivers the right results. [1] http://www.chronox.de/libkcapi.html CC: Tadeusz Struk <tadeusz.struk@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: aesni - Add support for 192 & 256 bit keys to AESNI RFC4106Timothy McCaffrey2015-01-141-6/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These patches fix the RFC4106 implementation in the aesni-intel module so it supports 192 & 256 bit keys. Since the AVX support that was added to this module also only supports 128 bit keys, and this patch only affects the SSE implementation, changes were also made to use the SSE version if key sizes other than 128 are specified. RFC4106 specifies that 192 & 256 bit keys must be supported (section 8.4). Also, this should fix Strongswan issue 341 where the aesni module needs to be unloaded if 256 bit keys are used: http://wiki.strongswan.org/issues/341 This patch has been tested with Sandy Bridge and Haswell processors. With 128 bit keys and input buffers > 512 bytes a slight performance degradation was noticed (~1%). For input buffers of less than 512 bytes there was no performance impact. Compared to 128 bit keys, 256 bit key size performance is approx. .5 cycles per byte slower on Sandy Bridge, and .37 cycles per byte slower on Haswell (vs. SSE code). This patch has also been tested with StrongSwan IPSec connections where it worked correctly. I created this diff from a git clone of crypto-2.6.git. Any questions, please feel free to contact me. Signed-off-by: Timothy McCaffrey <timothy.mccaffrey@unisys.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: prefix module autoloading with "crypto-"Kees Cook2014-11-241-1/+1
| | | | | | | | | | | This prefixes all crypto module loading with "crypto-" so we never run the risk of exposing module auto-loading to userspace via a crypto API, as demonstrated by Mathias Krause: https://lkml.org/lkml/2013/3/4/70 Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: aesni - remove unnecessary #defineValentin Rothberg2014-11-061-6/+2
| | | | | | | | | | | | | | | | | The CPP identifier 'HAS_PCBC' is defined when the Kconfig option CRYPTO_PCBC is set as 'y' or 'm', and is further used in two ifdef blocks to conditionally compile source code. This indirection hides the actual Kconfig dependency and complicates readability. Moreover, it's inconsistent with the rest of the ifdef blocks in the file, which directly reference Kconfig options. This patch removes 'HAS_PCBC' and replaces its occurrences with the actual dependency on 'CRYPTO_PCBC' being set as 'y' or 'm'. Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* Revert "crypto: aesni - disable "by8" AVX CTR optimization"Mathias Krause2014-10-021-2/+2
| | | | | | | | | | This reverts commit 7da4b29d496b1389d3a29b55d3668efecaa08ebd. Now, that the issue is fixed, we can re-enable the code. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Chandramouli Narayanan <mouli@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: aesni - disable "by8" AVX CTR optimizationMathias Krause2014-09-241-2/+2
| | | | | | | | | | | | | | | | | The "by8" implementation introduced in commit 22cddcc7df8f ("crypto: aes - AES CTR x86_64 "by8" AVX optimization") is failing crypto tests as it handles counter block overflows differently. It only accounts the right most 32 bit as a counter -- not the whole block as all other implementations do. This makes it fail the cryptomgr test #4 that specifically tests this corner case. As we're quite late in the release cycle, just disable the "by8" variant for now. Reported-by: Romain Francoise <romain@orebokech.com> Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Chandramouli Narayanan <mouli@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: aes - AES CTR x86_64 "by8" AVX optimizationchandramouli narayanan2014-06-201-2/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces "by8" AES CTR mode AVX optimization inspired by Intel Optimized IPSEC Cryptograhpic library. For additional information, please see: http://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&DwnldID=22972 The functions aes_ctr_enc_128_avx_by8(), aes_ctr_enc_192_avx_by8() and aes_ctr_enc_256_avx_by8() are adapted from Intel Optimized IPSEC Cryptographic library. When both AES and AVX features are enabled in a platform, the glue code in AESNI module overrieds the existing "by4" CTR mode en/decryption with the "by8" AES CTR mode en/decryption. On a Haswell desktop, with turbo disabled and all cpus running at maximum frequency, the "by8" CTR mode optimization shows better performance results across data & key sizes as measured by tcrypt. The average performance improvement of the "by8" version over the "by4" version is as follows: For 128 bit key and data sizes >= 256 bytes, there is a 10-16% improvement. For 192 bit key and data sizes >= 256 bytes, there is a 20-22% improvement. For 256 bit key and data sizes >= 256 bytes, there is a 20-25% improvement. A typical run of tcrypt with AES CTR mode encryption of the "by4" and "by8" optimization shows the following results: tcrypt with "by4" AES CTR mode encryption optimization on a Haswell Desktop: --------------------------------------------------------------------------- testing speed of __ctr-aes-aesni encryption test 0 (128 bit key, 16 byte blocks): 1 operation in 343 cycles (16 bytes) test 1 (128 bit key, 64 byte blocks): 1 operation in 336 cycles (64 bytes) test 2 (128 bit key, 256 byte blocks): 1 operation in 491 cycles (256 bytes) test 3 (128 bit key, 1024 byte blocks): 1 operation in 1130 cycles (1024 bytes) test 4 (128 bit key, 8192 byte blocks): 1 operation in 7309 cycles (8192 bytes) test 5 (192 bit key, 16 byte blocks): 1 operation in 346 cycles (16 bytes) test 6 (192 bit key, 64 byte blocks): 1 operation in 361 cycles (64 bytes) test 7 (192 bit key, 256 byte blocks): 1 operation in 543 cycles (256 bytes) test 8 (192 bit key, 1024 byte blocks): 1 operation in 1321 cycles (1024 bytes) test 9 (192 bit key, 8192 byte blocks): 1 operation in 9649 cycles (8192 bytes) test 10 (256 bit key, 16 byte blocks): 1 operation in 369 cycles (16 bytes) test 11 (256 bit key, 64 byte blocks): 1 operation in 366 cycles (64 bytes) test 12 (256 bit key, 256 byte blocks): 1 operation in 595 cycles (256 bytes) test 13 (256 bit key, 1024 byte blocks): 1 operation in 1531 cycles (1024 bytes) test 14 (256 bit key, 8192 byte blocks): 1 operation in 10522 cycles (8192 bytes) testing speed of __ctr-aes-aesni decryption test 0 (128 bit key, 16 byte blocks): 1 operation in 336 cycles (16 bytes) test 1 (128 bit key, 64 byte blocks): 1 operation in 350 cycles (64 bytes) test 2 (128 bit key, 256 byte blocks): 1 operation in 487 cycles (256 bytes) test 3 (128 bit key, 1024 byte blocks): 1 operation in 1129 cycles (1024 bytes) test 4 (128 bit key, 8192 byte blocks): 1 operation in 7287 cycles (8192 bytes) test 5 (192 bit key, 16 byte blocks): 1 operation in 350 cycles (16 bytes) test 6 (192 bit key, 64 byte blocks): 1 operation in 359 cycles (64 bytes) test 7 (192 bit key, 256 byte blocks): 1 operation in 635 cycles (256 bytes) test 8 (192 bit key, 1024 byte blocks): 1 operation in 1324 cycles (1024 bytes) test 9 (192 bit key, 8192 byte blocks): 1 operation in 9595 cycles (8192 bytes) test 10 (256 bit key, 16 byte blocks): 1 operation in 364 cycles (16 bytes) test 11 (256 bit key, 64 byte blocks): 1 operation in 377 cycles (64 bytes) test 12 (256 bit key, 256 byte blocks): 1 operation in 604 cycles (256 bytes) test 13 (256 bit key, 1024 byte blocks): 1 operation in 1527 cycles (1024 bytes) test 14 (256 bit key, 8192 byte blocks): 1 operation in 10549 cycles (8192 bytes) tcrypt with "by8" AES CTR mode encryption optimization on a Haswell Desktop: --------------------------------------------------------------------------- testing speed of __ctr-aes-aesni encryption test 0 (128 bit key, 16 byte blocks): 1 operation in 340 cycles (16 bytes) test 1 (128 bit key, 64 byte blocks): 1 operation in 330 cycles (64 bytes) test 2 (128 bit key, 256 byte blocks): 1 operation in 450 cycles (256 bytes) test 3 (128 bit key, 1024 byte blocks): 1 operation in 1043 cycles (1024 bytes) test 4 (128 bit key, 8192 byte blocks): 1 operation in 6597 cycles (8192 bytes) test 5 (192 bit key, 16 byte blocks): 1 operation in 339 cycles (16 bytes) test 6 (192 bit key, 64 byte blocks): 1 operation in 352 cycles (64 bytes) test 7 (192 bit key, 256 byte blocks): 1 operation in 539 cycles (256 bytes) test 8 (192 bit key, 1024 byte blocks): 1 operation in 1153 cycles (1024 bytes) test 9 (192 bit key, 8192 byte blocks): 1 operation in 8458 cycles (8192 bytes) test 10 (256 bit key, 16 byte blocks): 1 operation in 353 cycles (16 bytes) test 11 (256 bit key, 64 byte blocks): 1 operation in 360 cycles (64 bytes) test 12 (256 bit key, 256 byte blocks): 1 operation in 512 cycles (256 bytes) test 13 (256 bit key, 1024 byte blocks): 1 operation in 1277 cycles (1024 bytes) test 14 (256 bit key, 8192 byte blocks): 1 operation in 8745 cycles (8192 bytes) testing speed of __ctr-aes-aesni decryption test 0 (128 bit key, 16 byte blocks): 1 operation in 348 cycles (16 bytes) test 1 (128 bit key, 64 byte blocks): 1 operation in 335 cycles (64 bytes) test 2 (128 bit key, 256 byte blocks): 1 operation in 451 cycles (256 bytes) test 3 (128 bit key, 1024 byte blocks): 1 operation in 1030 cycles (1024 bytes) test 4 (128 bit key, 8192 byte blocks): 1 operation in 6611 cycles (8192 bytes) test 5 (192 bit key, 16 byte blocks): 1 operation in 354 cycles (16 bytes) test 6 (192 bit key, 64 byte blocks): 1 operation in 346 cycles (64 bytes) test 7 (192 bit key, 256 byte blocks): 1 operation in 488 cycles (256 bytes) test 8 (192 bit key, 1024 byte blocks): 1 operation in 1154 cycles (1024 bytes) test 9 (192 bit key, 8192 byte blocks): 1 operation in 8390 cycles (8192 bytes) test 10 (256 bit key, 16 byte blocks): 1 operation in 357 cycles (16 bytes) test 11 (256 bit key, 64 byte blocks): 1 operation in 362 cycles (64 bytes) test 12 (256 bit key, 256 byte blocks): 1 operation in 515 cycles (256 bytes) test 13 (256 bit key, 1024 byte blocks): 1 operation in 1284 cycles (1024 bytes) test 14 (256 bit key, 8192 byte blocks): 1 operation in 8681 cycles (8192 bytes) crypto: Incorporate feed back to AES CTR mode optimization patch Specifically, the following: a) alignment around main loop in aes_ctrby8_avx_x86_64.S b) .rodata around data constants used in the assembely code. c) the use of CONFIG_AVX in the glue code. d) fix up white space. e) informational message for "by8" AES CTR mode optimization f) "by8" AES CTR mode optimization can be simply enabled if the platform supports both AES and AVX features. The optimization works superbly on Sandybridge as well. Testing on Haswell shows no performance change since the last. Testing on Sandybridge shows that the "by8" AES CTR mode optimization greatly improves performance. tcrypt log with "by4" AES CTR mode optimization on Sandybridge -------------------------------------------------------------- testing speed of __ctr-aes-aesni encryption test 0 (128 bit key, 16 byte blocks): 1 operation in 383 cycles (16 bytes) test 1 (128 bit key, 64 byte blocks): 1 operation in 408 cycles (64 bytes) test 2 (128 bit key, 256 byte blocks): 1 operation in 707 cycles (256 bytes) test 3 (128 bit key, 1024 byte blocks): 1 operation in 1864 cycles (1024 bytes) test 4 (128 bit key, 8192 byte blocks): 1 operation in 12813 cycles (8192 bytes) test 5 (192 bit key, 16 byte blocks): 1 operation in 395 cycles (16 bytes) test 6 (192 bit key, 64 byte blocks): 1 operation in 432 cycles (64 bytes) test 7 (192 bit key, 256 byte blocks): 1 operation in 780 cycles (256 bytes) test 8 (192 bit key, 1024 byte blocks): 1 operation in 2132 cycles (1024 bytes) test 9 (192 bit key, 8192 byte blocks): 1 operation in 15765 cycles (8192 bytes) test 10 (256 bit key, 16 byte blocks): 1 operation in 416 cycles (16 bytes) test 11 (256 bit key, 64 byte blocks): 1 operation in 438 cycles (64 bytes) test 12 (256 bit key, 256 byte blocks): 1 operation in 842 cycles (256 bytes) test 13 (256 bit key, 1024 byte blocks): 1 operation in 2383 cycles (1024 bytes) test 14 (256 bit key, 8192 byte blocks): 1 operation in 16945 cycles (8192 bytes) testing speed of __ctr-aes-aesni decryption test 0 (128 bit key, 16 byte blocks): 1 operation in 389 cycles (16 bytes) test 1 (128 bit key, 64 byte blocks): 1 operation in 409 cycles (64 bytes) test 2 (128 bit key, 256 byte blocks): 1 operation in 704 cycles (256 bytes) test 3 (128 bit key, 1024 byte blocks): 1 operation in 1865 cycles (1024 bytes) test 4 (128 bit key, 8192 byte blocks): 1 operation in 12783 cycles (8192 bytes) test 5 (192 bit key, 16 byte blocks): 1 operation in 409 cycles (16 bytes) test 6 (192 bit key, 64 byte blocks): 1 operation in 434 cycles (64 bytes) test 7 (192 bit key, 256 byte blocks): 1 operation in 792 cycles (256 bytes) test 8 (192 bit key, 1024 byte blocks): 1 operation in 2151 cycles (1024 bytes) test 9 (192 bit key, 8192 byte blocks): 1 operation in 15804 cycles (8192 bytes) test 10 (256 bit key, 16 byte blocks): 1 operation in 421 cycles (16 bytes) test 11 (256 bit key, 64 byte blocks): 1 operation in 444 cycles (64 bytes) test 12 (256 bit key, 256 byte blocks): 1 operation in 840 cycles (256 bytes) test 13 (256 bit key, 1024 byte blocks): 1 operation in 2394 cycles (1024 bytes) test 14 (256 bit key, 8192 byte blocks): 1 operation in 16928 cycles (8192 bytes) tcrypt log with "by8" AES CTR mode optimization on Sandybridge -------------------------------------------------------------- testing speed of __ctr-aes-aesni encryption test 0 (128 bit key, 16 byte blocks): 1 operation in 383 cycles (16 bytes) test 1 (128 bit key, 64 byte blocks): 1 operation in 401 cycles (64 bytes) test 2 (128 bit key, 256 byte blocks): 1 operation in 522 cycles (256 bytes) test 3 (128 bit key, 1024 byte blocks): 1 operation in 1136 cycles (1024 bytes) test 4 (128 bit key, 8192 byte blocks): 1 operation in 7046 cycles (8192 bytes) test 5 (192 bit key, 16 byte blocks): 1 operation in 394 cycles (16 bytes) test 6 (192 bit key, 64 byte blocks): 1 operation in 418 cycles (64 bytes) test 7 (192 bit key, 256 byte blocks): 1 operation in 559 cycles (256 bytes) test 8 (192 bit key, 1024 byte blocks): 1 operation in 1263 cycles (1024 bytes) test 9 (192 bit key, 8192 byte blocks): 1 operation in 9072 cycles (8192 bytes) test 10 (256 bit key, 16 byte blocks): 1 operation in 408 cycles (16 bytes) test 11 (256 bit key, 64 byte blocks): 1 operation in 428 cycles (64 bytes) test 12 (256 bit key, 256 byte blocks): 1 operation in 595 cycles (256 bytes) test 13 (256 bit key, 1024 byte blocks): 1 operation in 1385 cycles (1024 bytes) test 14 (256 bit key, 8192 byte blocks): 1 operation in 9224 cycles (8192 bytes) testing speed of __ctr-aes-aesni decryption test 0 (128 bit key, 16 byte blocks): 1 operation in 390 cycles (16 bytes) test 1 (128 bit key, 64 byte blocks): 1 operation in 402 cycles (64 bytes) test 2 (128 bit key, 256 byte blocks): 1 operation in 530 cycles (256 bytes) test 3 (128 bit key, 1024 byte blocks): 1 operation in 1135 cycles (1024 bytes) test 4 (128 bit key, 8192 byte blocks): 1 operation in 7079 cycles (8192 bytes) test 5 (192 bit key, 16 byte blocks): 1 operation in 414 cycles (16 bytes) test 6 (192 bit key, 64 byte blocks): 1 operation in 417 cycles (64 bytes) test 7 (192 bit key, 256 byte blocks): 1 operation in 572 cycles (256 bytes) test 8 (192 bit key, 1024 byte blocks): 1 operation in 1312 cycles (1024 bytes) test 9 (192 bit key, 8192 byte blocks): 1 operation in 9073 cycles (8192 bytes) test 10 (256 bit key, 16 byte blocks): 1 operation in 415 cycles (16 bytes) test 11 (256 bit key, 64 byte blocks): 1 operation in 454 cycles (64 bytes) test 12 (256 bit key, 256 byte blocks): 1 operation in 598 cycles (256 bytes) test 13 (256 bit key, 1024 byte blocks): 1 operation in 1407 cycles (1024 bytes) test 14 (256 bit key, 8192 byte blocks): 1 operation in 9288 cycles (8192 bytes) crypto: Fix redundant checks a) Fix the redundant check for cpu_has_aes b) Fix the key length check when invoking the CTR mode "by8" encryptor/decryptor. crypto: fix typo in AES ctr mode transform Signed-off-by: Chandramouli Narayanan <mouli@linux.intel.com> Reviewed-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: aesni - fix build on x86 (32bit)Andy Shevchenko2013-12-311-0/+2
| | | | | | | | | | It seems commit d764593a "crypto: aesni - AVX and AVX2 version of AESNI-GCM encode and decode" breaks a build on x86_32 since it's designed only for x86_64. This patch makes a compilation unit conditional to CONFIG_64BIT and functions usage to CONFIG_X86_64. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: aesni - AVX and AVX2 version of AESNI-GCM encode and decodeTim Chen2013-12-201-2/+141
| | | | | | | | | | | We have added AVX and AVX2 routines that optimize AESNI-GCM encode/decode. These routines are optimized for encrypt and decrypt of large buffers. In tests we have seen up to 6% speedup for 1K, 11% speedup for 2K and 18% speedup for 8K buffer over the existing SSE version. These routines should provide even better speedup for future Intel x86_64 cpus. Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: arch - use crypto_memneq instead of memcmpDaniel Borkmann2013-12-201-1/+1
| | | | | | | | | | | Replace remaining occurences (just as we did in crypto/) under arch/*/crypto/ that make use of memcmp() for comparing keys or authentication tags for usage with crypto_memneq(). It can simply be used as a drop-in replacement for the normal memcmp(). Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: James Yonan <james@openvpn.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: move x86 to the generic version of ablk_helperArd Biesheuvel2013-09-241-1/+1
| | | | | | | | | Move all users of ablk_helper under x86/ to the generic version and delete the x86 specific version. Acked-by: Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: aesni_intel - add more optimized XTS mode for x86-64Jussi Kivilinna2013-04-251-0/+80
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add more optimized XTS code for aesni_intel in 64-bit mode, for smaller stack usage and boost for speed. tcrypt results, with Intel i5-2450M: 256-bit key enc dec 16B 0.98x 0.99x 64B 0.64x 0.63x 256B 1.29x 1.32x 1024B 1.54x 1.58x 8192B 1.57x 1.60x 512-bit key enc dec 16B 0.98x 0.99x 64B 0.60x 0.59x 256B 1.24x 1.25x 1024B 1.39x 1.42x 8192B 1.38x 1.42x I chose not to optimize smaller than block size of 256 bytes, since XTS is practically always used with data blocks of size 512 bytes. This is why performance is reduced in tcrypt for 64 byte long blocks. Cc: Huang Ying <ying.huang@intel.com> Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: aesni-intel - remove rfc3686(ctr(aes)), utilize rfc3686 from ↵Jussi Kivilinna2013-01-081-37/+0
| | | | | | | | | | | ctr-module instead rfc3686 in CTR module is now able of using asynchronous ctr(aes) from aesni-intel, so rfc3686(ctr(aes)) in aesni-intel is no longer needed. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* crypto: aesni - fix XTS mode on x86-32, add wrapper function for asmlinkage ↵Jussi Kivilinna2012-10-181-2/+7
| | | | | | | | | | | | | | | aesni_enc() Calling convention for internal functions and 'asmlinkage' functions is different on x86-32. Therefore do not directly cast aesni_enc as XTS tweak function, but use wrapper function in between. Fixes crash with "XTS + aesni_intel + x86-32" combination. Cc: stable@vger.kernel.org Reported-by: Krzysztof Kolasa <kkolasa@winsoft.pl> Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* crypto: aesni_intel - improve lrw and xts performance by utilizing parallel ↵Jussi Kivilinna2012-08-201-35/+218
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | AES-NI hardware pipelines Use parallel LRW and XTS encryption facilities to better utilize AES-NI hardware pipelines and gain extra performance. Tcrypt benchmark results (async), old vs new ratios: Intel Core i5-2450M CPU (fam: 6, model: 42, step: 7) aes:128bit lrw:256bit xts:256bit size lrw-enc lrw-dec xts-dec xts-dec 16B 0.99x 1.00x 1.22x 1.19x 64B 1.38x 1.50x 1.58x 1.61x 256B 2.04x 2.02x 2.27x 2.29x 1024B 2.56x 2.54x 2.89x 2.92x 8192B 2.85x 2.99x 3.40x 3.23x aes:192bit lrw:320bit xts:384bit size lrw-enc lrw-dec xts-dec xts-dec 16B 1.08x 1.08x 1.16x 1.17x 64B 1.48x 1.54x 1.59x 1.65x 256B 2.18x 2.17x 2.29x 2.28x 1024B 2.67x 2.67x 2.87x 3.05x 8192B 2.93x 2.84x 3.28x 3.33x aes:256bit lrw:348bit xts:512bit size lrw-enc lrw-dec xts-dec xts-dec 16B 1.07x 1.07x 1.18x 1.19x 64B 1.56x 1.56x 1.70x 1.71x 256B 2.22x 2.24x 2.46x 2.46x 1024B 2.76x 2.77x 3.13x 3.05x 8192B 2.99x 3.05x 3.40x 3.30x Cc: Huang Ying <ying.huang@intel.com> Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Reviewed-by: Kim Phillips <kim.phillips@freescale.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: arch/x86 - cleanup - remove unneeded crypto_alg.cra_list initializationsJussi Kivilinna2012-08-011-4/+1
| | | | | | | | | | | Initialization of cra_list is currently mixed, most ciphers initialize this field and most shashes do not. Initialization however is not needed at all since cra_list is initialized/overwritten in __crypto_register_alg() with list_add(). Therefore perform cleanup to remove all unneeded initializations of this field in 'arch/x86/crypto/'. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: aesni-intel - fix wrong kfree pointerMilan Broz2012-07-111-4/+4
| | | | | | | | | | kfree(new_key_mem) in rfc4106_set_key() should be called on malloced pointer, not on aligned one, otherwise it can cause invalid pointer on free. (Seen at least once when running tcrypt tests with debug kernel.) Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: move arch/x86/include/asm/aes.h to arch/x86/include/asm/crypto/Jussi Kivilinna2012-06-271-1/+1
| | | | | | | Move AES header to the new asm/crypto directory. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: aes_ni - change to use shared ablk_* functionsJussi Kivilinna2012-06-271-92/+8
| | | | | | | Remove duplicate ablk_* functions and make use of ablk_helper module instead. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: aesni-intel - move more common code to ablk_init_commonJussi Kivilinna2012-05-151-55/+15
| | | | | | | | | ablk_*_init functions share more common code than what is currently in ablk_init_common. Move all of the common code to ablk_init_common. Cc: Huang Ying <ying.huang@intel.com> Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: aesni-intel - use crypto_[un]register_algsJussi Kivilinna2012-05-151-420/+303
| | | | | | | | | Combine all crypto_alg to be registered and use new crypto_[un]register_algs functions. Simplifies init/exit code and reduce object size. Cc: Huang Ying <ying.huang@intel.com> Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* Merge branch 'kmap_atomic' of git://github.com/congwang/linuxLinus Torvalds2012-03-211-12/+12
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull kmap_atomic cleanup from Cong Wang. It's been in -next for a long time, and it gets rid of the (no longer used) second argument to k[un]map_atomic(). Fix up a few trivial conflicts in various drivers, and do an "evil merge" to catch some new uses that have come in since Cong's tree. * 'kmap_atomic' of git://github.com/congwang/linux: (59 commits) feature-removal-schedule.txt: schedule the deprecated form of kmap_atomic() for removal highmem: kill all __kmap_atomic() [swarren@nvidia.com: highmem: Fix ARM build break due to __kmap_atomic rename] drbd: remove the second argument of k[un]map_atomic() zcache: remove the second argument of k[un]map_atomic() gma500: remove the second argument of k[un]map_atomic() dm: remove the second argument of k[un]map_atomic() tomoyo: remove the second argument of k[un]map_atomic() sunrpc: remove the second argument of k[un]map_atomic() rds: remove the second argument of k[un]map_atomic() net: remove the second argument of k[un]map_atomic() mm: remove the second argument of k[un]map_atomic() lib: remove the second argument of k[un]map_atomic() power: remove the second argument of k[un]map_atomic() kdb: remove the second argument of k[un]map_atomic() udf: remove the second argument of k[un]map_atomic() ubifs: remove the second argument of k[un]map_atomic() squashfs: remove the second argument of k[un]map_atomic() reiserfs: remove the second argument of k[un]map_atomic() ocfs2: remove the second argument of k[un]map_atomic() ntfs: remove the second argument of k[un]map_atomic() ...
| * x86: remove the second argument of k[un]map_atomic()Cong Wang2012-03-201-12/+12
| | | | | | | | | | | | Acked-by: Avi Kivity <avi@redhat.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Cong Wang <amwang@redhat.com>
* | crypto: Add support for x86 cpuid auto loading for x86 crypto driversAndi Kleen2012-01-261-3/+9
|/ | | | | | | | | | | | | | | | | | | | | | | | | | Add support for auto-loading of crypto drivers based on cpuid features. This enables auto-loading of the VIA and Intel specific drivers for AES, hashing and CRCs. Requires the earlier infrastructure patch to add x86 modinfo. I kept it all in a single patch for now. I dropped the printks when the driver cpuid doesn't match (imho drivers never should print anything in such a case) One drawback is that udev doesn't know if the drivers are used or not, so they will be unconditionally loaded at boot up. That's better than not loading them at all, like it often happens. Cc: Dave Jones <davej@redhat.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Jen Axboe <axboe@kernel.dk> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Huang Ying <ying.huang@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Renninger <trenn@suse.de> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* x86: fix up files really needing to include module.hPaul Gortmaker2011-10-311-0/+1
| | | | | | | These files aren't just exporting symbols -- they are also defining a MODULE_LICENSE etc. so give them the full module.h file. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
* crypto: aesni-intel - fix aesni build on i386Randy Dunlap2011-05-181-3/+4
| | | | | | | | | | | | Fix build error on i386 by moving function prototypes: arch/x86/crypto/aesni-intel_glue.c: In function 'aesni_init': arch/x86/crypto/aesni-intel_glue.c:1263: error: implicit declaration of function 'crypto_fpu_init' arch/x86/crypto/aesni-intel_glue.c: In function 'aesni_exit': arch/x86/crypto/aesni-intel_glue.c:1373: error: implicit declaration of function 'crypto_fpu_exit' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: aesni-intel - Merge with fpu.koAndy Lutomirski2011-05-161-0/+8
| | | | | | | | | | | | | | | Loading fpu without aesni-intel does nothing. Loading aesni-intel without fpu causes modes like xts to fail. (Unloading aesni-intel will restore those modes.) One solution would be to make aesni-intel depend on fpu, but it seems cleaner to just combine the modules. This is probably responsible for bugs like: https://bugzilla.redhat.com/show_bug.cgi?id=589390 Signed-off-by: Andy Lutomirski <luto@mit.edu> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: aesni-intel - fixed problem with packets that are not multiple of ↵Tadeusz Struk2011-03-271-2/+12
| | | | | | | | | | | | 64bytes This patch fixes problem with packets that are not multiple of 64bytes. Signed-off-by: Adrian Hoban <adrian.hoban@intel.com> Signed-off-by: Aidan O'Mahony <aidan.o.mahony@intel.com> Signed-off-by: Gabriele Paoloni <gabriele.paoloni@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: aesni-intel - Fix remaining leak in rfc4106_set_hash_keyJesper Juhl2011-02-161-9/+6
| | | | | | | | | | Fix up previous patch that failed to properly fix mem leak in rfc4106_set_hash_subkey(). This add-on patch; fixes the leak. moves kfree() out of the error path, returns -ENOMEM rather than -EINVAL when ablkcipher_request_alloc() fails. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: aesni-intel - Don't leak memory in rfc4106_set_hash_subkeyJesper Juhl2011-01-231-8/+9
| | | | | | | | | | | | | | | | | There's a small memory leak in arch/x86/crypto/aesni-intel_glue.c::rfc4106_set_hash_subkey(). If the call to kmalloc() fails and returns NULL then the memory allocated previously by ablkcipher_request_alloc() is not freed when we leave the function. I could have just added a call to ablkcipher_request_free() before we return -ENOMEM, but that started to look too much like the code we already had at the end of the function, so I chose instead to rework the code a bit so that there are now a few labels at the end that we goto when various allocations fail, so we don't have to repeat the same blocks of code (this also reduces the object code size slightly). Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: aesni-intel - Fixed build error on x86-32Mathias Krause2010-11-291-13/+13
| | | | | | | | | | Exclude AES-GCM code for x86-32 due to heavy usage of 64-bit registers not available on x86-32. While at it, fixed unregister order in aesni_exit(). Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: aesni-intel - Ported implementation to x86-32Mathias Krause2010-11-271-5/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The AES-NI instructions are also available in legacy mode so the 32-bit architecture may profit from those, too. To illustrate the performance gain here's a short summary of a dm-crypt speed test on a Core i7 M620 running at 2.67GHz comparing both assembler implementations: x86: i568 aes-ni delta ECB, 256 bit: 93.8 MB/s 123.3 MB/s +31.4% CBC, 256 bit: 84.8 MB/s 262.3 MB/s +209.3% LRW, 256 bit: 108.6 MB/s 222.1 MB/s +104.5% XTS, 256 bit: 105.0 MB/s 205.5 MB/s +95.7% Additionally, due to some minor optimizations, the 64-bit version also got a minor performance gain as seen below: x86-64: old impl. new impl. delta ECB, 256 bit: 121.1 MB/s 123.0 MB/s +1.5% CBC, 256 bit: 285.3 MB/s 290.8 MB/s +1.9% LRW, 256 bit: 263.7 MB/s 265.3 MB/s +0.6% XTS, 256 bit: 251.1 MB/s 255.3 MB/s +1.7% Signed-off-by: Mathias Krause <minipli@googlemail.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: aesni-intel - RFC4106 AES-GCM Driver Using Intel New InstructionsTadeusz Struk2010-11-131-2/+516
| | | | | | | | | | | | | | | | | This patch adds an optimized RFC4106 AES-GCM implementation for 64-bit kernels. It supports 128-bit AES key size. This leverages the crypto AEAD interface type to facilitate a combined AES & GCM operation to be implemented in assembly code. The assembly code leverages Intel(R) AES New Instructions and the PCLMULQDQ instruction. Signed-off-by: Adrian Hoban <adrian.hoban@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Gabriele Paoloni <gabriele.paoloni@intel.com> Signed-off-by: Aidan O'Mahony <aidan.o.mahony@intel.com> Signed-off-by: Erdinc Ozturk <erdinc.ozturk@intel.com> Signed-off-by: James Guilford <james.guilford@intel.com> Signed-off-by: Wajdi Feghali <wajdi.k.feghali@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: aesni-intel - Add AES-NI accelerated CTR modeHuang Ying2010-03-101-7/+123
| | | | | | | | | | | | | To take advantage of the hardware pipeline implementation of AES-NI instructions. CTR mode cryption is implemented in ASM to schedule multiple AES-NI instructions one after another. This way, some latency of AES-NI instruction can be eliminated. Performance testing based on dm-crypt should 50% reduction of ecryption/decryption time. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: aesni-intel - Fix irq_fpu_usable usageHuang Ying2009-10-201-5/+5
| | | | | | | | | | | | | When renaming kernel_fpu_using to irq_fpu_usable, the semantics of the function is changed too, from mesuring whether kernel is using FPU, that is, the FPU is NOT available, to measuring whether FPU is usable, that is, the FPU is available. But the usage of irq_fpu_usable in aesni-intel_glue.c is not changed accordingly. This patch fixes this. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* Merge branch 'x86-cpu-for-linus' of ↵Linus Torvalds2009-09-141-12/+5
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (22 commits) x86: Fix code patching for paravirt-alternatives on 486 x86, msr: change msr-reg.o to obj-y, and export its symbols x86: Use hard_smp_processor_id() to get apic id for AMD K8 cpus x86, sched: Workaround broken sched domain creation for AMD Magny-Cours x86, mcheck: Use correct cpumask for shared bank4 x86, cacheinfo: Fixup L3 cache information for AMD multi-node processors x86: Fix CPU llc_shared_map information for AMD Magny-Cours x86, msr: Fix msr-reg.S compilation with gas 2.16.1, on 32-bit too x86: Move kernel_fpu_using to irq_fpu_usable in asm/i387.h x86, msr: fix msr-reg.S compilation with gas 2.16.1 x86, msr: Export the register-setting MSR functions via /dev/*/msr x86, msr: Create _on_cpu helpers for {rw,wr}msr_safe_regs() x86, msr: Have the _safe MSR functions return -EIO, not -EFAULT x86, msr: CFI annotations, cleanups for msr-reg.S x86, asm: Make _ASM_EXTABLE() usable from assembly code x86, asm: Add 32-bit versions of the combined CFI macros x86, AMD: Disable wrongly set X86_FEATURE_LAHF_LM CPUID bit x86, msr: Rewrite AMD rd/wrmsr variants x86, msr: Add rd/wrmsr interfaces with preset registers x86: add specific support for Intel Atom architecture ...
| * x86: Move kernel_fpu_using to irq_fpu_usable in asm/i387.hHuang Ying2009-09-011-12/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This function measures whether the FPU/SSE state can be touched in interrupt context. If the interrupted code is in user space or has no valid FPU/SSE context (CR0.TS == 1), FPU/SSE state can be used in IRQ or soft_irq context too. This is used by AES-NI accelerated AES implementation and PCLMULQDQ accelerated GHASH implementation. v3: - Renamed to irq_fpu_usable to reflect the purpose of the function. v2: - Renamed to irq_is_fpu_using to reflect the real situation. Signed-off-by: Huang Ying <ying.huang@intel.com> CC: H. Peter Anvin <hpa@zytor.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* | crypto: aes-ni - Don't print message with KERN_ERR on old systemRoland Dreier2009-06-241-1/+1
|/ | | | | | | | | | | | | | | When the aes-intel module is loaded on a system that does not have the AES instructions, it prints Intel AES-NI instructions are not detected. at level KERN_ERR. Since aes-intel is aliased to "aes" it will be tried whenever anything uses AES and spam the console. This doesn't match existing practice for how to handle "no hardware" when initializing a module, so downgrade the message to KERN_INFO. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: aes-ni - Do not sleep when using the FPUHuang Ying2009-06-181-0/+4
| | | | | | | | | | Because AES-NI instructions will touch XMM state, corresponding code must be enclosed within kernel_fpu_begin/end, which used preempt_disable/enable. So sleep should be prevented between kernel_fpu_begin/end. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
OpenPOWER on IntegriCloud