summaryrefslogtreecommitdiffstats
path: root/arch/arm64/crypto/aes-modes.S
Commit message (Collapse)AuthorAgeFilesLines
* crypto: arm64/aes-blk - yield NEON after every block of inputArd Biesheuvel2018-05-121-124/+207
| | | | | | | | Avoid excessive scheduling delays under a preemptible kernel by yielding the NEON after every block of input. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: arm64/aes-blk - add 4 way interleave to CBC-MAC encrypt pathArd Biesheuvel2018-03-161-2/+21
| | | | | | | | | | | | | | | CBC MAC is strictly sequential, and so the current AES code simply processes the input one block at a time. However, we are about to add yield support, which adds a bit of overhead, and which we prefer to align with other modes in terms of granularity (i.e., it is better to have all routines yield every 64 bytes and not have an exception for CBC MAC which yields every 16 bytes) So unroll the loop by 4. We still cannot perform the AES algorithm in parallel, but we can at least merge the loads and stores. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: arm64/aes-blk - add 4 way interleave to CBC encrypt pathArd Biesheuvel2018-03-161-6/+25
| | | | | | | | | | | | | | | CBC encryption is strictly sequential, and so the current AES code simply processes the input one block at a time. However, we are about to add yield support, which adds a bit of overhead, and which we prefer to align with other modes in terms of granularity (i.e., it is better to have all routines yield every 64 bytes and not have an exception for CBC encrypt which yields every 16 bytes) So unroll the loop by 4. We still cannot perform the AES algorithm in parallel, but we can at least merge the loads and stores. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: arm64/aes-blk - remove configurable interleaveArd Biesheuvel2018-03-161-197/+40
| | | | | | | | | | | | | | | The AES block mode implementation using Crypto Extensions or plain NEON was written before real hardware existed, and so its interleave factor was made build time configurable (as well as an option to instantiate all interleaved sequences inline rather than as subroutines) We ended up using INTERLEAVE=4 with inlining disabled for both flavors of the core AES routines, so let's stick with that, and remove the option to configure this at build time. This makes the code easier to modify, which is nice now that we're adding yield support. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: arm64/aes-blk - move kernel mode neon en/disable into loopArd Biesheuvel2018-03-161-45/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | When kernel mode NEON was first introduced on arm64, the preserve and restore of the userland NEON state was completely unoptimized, and involved saving all registers on each call to kernel_neon_begin(), and restoring them on each call to kernel_neon_end(). For this reason, the NEON crypto code that was introduced at the time keeps the NEON enabled throughout the execution of the crypto API methods, which may include calls back into the crypto API that could result in memory allocation or other actions that we should avoid when running with preemption disabled. Since then, we have optimized the kernel mode NEON handling, which now restores lazily (upon return to userland), and so the preserve action is only costly the first time it is called after entering the kernel. So let's put the kernel_neon_begin() and kernel_neon_end() calls around the actual invocations of the NEON crypto code, and run the remainder of the code with kernel mode NEON disabled (and preemption enabled) Note that this requires some reshuffling of the registers in the asm code, because the XTS routines can no longer rely on the registers to retain their contents between invocations. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: arm64/aes - add NEON/Crypto Extensions CBCMAC/CMAC/XCBC driverArd Biesheuvel2017-02-111-1/+28
| | | | | | | | | | | | | | | | | | On ARMv8 implementations that do not support the Crypto Extensions, such as the Raspberry Pi 3, the CCM driver falls back to the generic table based AES implementation to perform the MAC part of the algorithm, which is slow and not time invariant. So add a CBCMAC implementation to the shared glue code between NEON AES and Crypto Extensions AES, so that it can be used instead now that the CCM driver has been updated to look for CBCMAC implementations other than the one it supplies itself. Also, given how these algorithms mostly only differ in the way the key handling and the final encryption are implemented, expose CMAC and XCBC algorithms as well based on the same core update code. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: arm64/aes-blk - remove cra_alignmaskArd Biesheuvel2017-02-031-5/+3
| | | | | | | | | Remove the unnecessary alignmask: it is much more efficient to deal with the misalignment in the core algorithm than relying on the crypto API to copy the data to a suitably aligned buffer. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: arm64/aes-blk - honour iv_out requirement in CBC and CTR modesArd Biesheuvel2017-01-231-46/+42
| | | | | | | | | | | | | Update the ARMv8 Crypto Extensions and the plain NEON AES implementations in CBC and CTR modes to return the next IV back to the skcipher API client. This is necessary for chaining to work correctly. Note that for CTR, this is only done if the request is a round multiple of the block size, since otherwise, chaining is impossible anyway. Cc: <stable@vger.kernel.org> # v3.16+ Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: arm64/aes-xts-ce: fix for big endianArd Biesheuvel2016-10-211-1/+2
| | | | | | | | | Emit the XTS tweak literal constants in the appropriate order for a single 128-bit scalar literal load. Fixes: 49788fe2a128 ("arm64/crypto: AES-ECB/CBC/CTR/XTS using ARMv8 NEON and Crypto Extensions") Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* arm64/crypto: AES-ECB/CBC/CTR/XTS using ARMv8 NEON and Crypto ExtensionsArd Biesheuvel2014-05-141-0/+532
This adds ARMv8 implementations of AES in ECB, CBC, CTR and XTS modes, both for ARMv8 with Crypto Extensions and for plain ARMv8 NEON. The Crypto Extensions version can only run on ARMv8 implementations that have support for these optional extensions. The plain NEON version is a table based yet time invariant implementation. All S-box substitutions are performed in parallel, leveraging the wide range of ARMv8's tbl/tbx instructions, and the huge NEON register file, which can comfortably hold the entire S-box and still have room to spare for doing the actual computations. The key expansion routines were borrowed from aes_generic. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
OpenPOWER on IntegriCloud