diff --git a/docs/architecture/psa-migration/md-cipher-dispatch.md b/docs/architecture/psa-migration/md-cipher-dispatch.md index 355f5618d..430b0caec 100644 --- a/docs/architecture/psa-migration/md-cipher-dispatch.md +++ b/docs/architecture/psa-migration/md-cipher-dispatch.md @@ -99,8 +99,8 @@ We can classify code that implements or uses cryptographic mechanisms into sever * Software implementations of primitive cryptographic mechanisms. These are not expected to change. * Software implementations of constructed cryptographic mechanisms (e.g. HMAC, CTR_DRBG, RSA (calling a hash for PSS/OAEP, and needing to know the hash length in PKCS1v1.5 sign/verify), …). These need to keep working whenever a legacy implementation of the auxiliary mechanism is available, regardless of whether a PSA implementation is also available. * Code implementing the PSA crypto interface. This is not expected to change, except perhaps to expose some internal functionality to overhauled glue code. -* Code that's subject to `MBEDTLS_USE_PSA_CRYPTO`: `pk.h`, X.509, TLS (excluding TLS 1.3). -* Code that always uses PSA for crypto: TLS 1.3, LMS. +* Code that's subject to `MBEDTLS_USE_PSA_CRYPTO`: `pk.h`, X.509, TLS (excluding parts specific TLS 1.3). +* Code that always uses PSA for crypto: TLS 1.3 (except things common with 1.2), LMS. For the purposes of this work, three domains emerge: @@ -110,23 +110,79 @@ For the purposes of this work, three domains emerge: #### Non-use-PSA modules -The following modules in Mbed TLS call another module to perform cryptographic operations which, in the long term, will be provided through a PSA interface, but cannot make any PSA-related assumption: +The following modules in Mbed TLS call another module to perform cryptographic operations which, in the long term, will be provided through a PSA interface, but cannot make any PSA-related assumption. -* CCM (block cipher in ECB mode; interdependent with cipher) -* cipher (cipher and AEAD algorithms) -* CMAC (AES-ECB and DES-ECB, but could be extended to the other block ciphers; interdependent with cipher) -* CTR\_DRBG (AES-ECB, but could be extended to the other block ciphers) -* entropy (hashes via low-level) +Hashes and HMAC (after the work on driver-only hashes): + +* entropy (hashes via MD-light) * ECDSA (HMAC\_DRBG; `md.h` exposed through API) -* ECJPAKE (hashes via md; `md.h` exposed through API) -* GCM (block cipher in ECB mode; interdependent with cipher) -* md (hashes and HMAC) -* NIST\_KW (AES-ECB; interdependent with cipher) +* ECJPAKE (hashes via MD-light; `md.h` exposed through API) +* MD (hashes and HMAC) +* HKDF (HMAC via `md.h`; `md.h` exposed through API) * HMAC\_DRBG (hashes and HMAC via `md.h`; `md.h` exposed through API) -* PEM (AES and DES in CBC mode without padding; MD5 hash via low-level) -* PKCS12 (cipher, generically, selected from ASN.1 or function parameters; hashes via md; `cipher.h` exposed through API) -* PKCS5 (cipher, generically, selected from ASN.1; HMAC via `md.h`; `md.h` exposed through API) -* RSA (hash via md for PSS and OAEP; `md.h` exposed through API) +* PKCS12 (hashes via MD-light) +* PKCS5 (HMAC via `md.h`; `md.h` exposed through API) +* PKCS7 (hashes via MD) +* RSA (hash via MD-light for PSS and OAEP; `md.h` exposed through API) +* PEM (MD5 hash via MD-light) + +Symmetric ciphers and AEADs (before work on driver-only cipher): + +* PEM: + * AES, DES or 3DES in CBC mode without padding, decrypt only (!). + * Currently using low-level non-generic APIs. + * No hard dependency, features guarded by `AES_C` resp. `DES_C`. + * Functions called: `setkey_dec()` + `crypt_cbc()`. +* PKCS12: + * In practice: 2DES or 3DES in CBC mode with PKCS7 padding, decrypt only + (when called from pkparse). + * In principle: any cipher-mode (default padding), passed an + `mbedtls_cipher_type_t` as an argument, no documented restriction. + * Cipher, generically, selected from ASN.1 or function parameters; + no documented restriction but in practice TODO (inc. padding and + en/decrypt, look at standards and tests) + * Unconditional dependency on `CIPHER_C` in `check_config.h`. + * Note: `cipher.h` exposed through API. + * Functions called: `setup`, `setkey`, `set_iv`, `reset`, `update`, `finish` (in sequence, once). +* PKCS5 (PBES2, `mbedtls_pkcs5_pbes2()`): + * 3DES or DES in CBC mode with PKCS7 padding, both encrypt and decrypt. + * Note: could also be AES in the future, see #7038. + * Unconditional dependency on `CIPHER_C` in `check_config.h`. + * Functions called: `setup`, `setkey`, `crypt`. +* CTR\_DRBG: + * AES in ECB mode, encrypt only. + * Currently using low-level non-generic API (`aes.h`). + * Unconditional dependency on `AES_C` in `check_config.h`. + * Functions called: `setkey_enc`, `crypt_ecb`. +* CCM: + * AES, Camellia or Aria in ECB mode, encrypt only. + * Unconditional dependency on `AES_C || CAMELLIA_C || ARIA_C` in `check_config.h`. + * Unconditional dependency on `CIPHER_C` in `check_config.h`. + * Note: also called by `cipher.c` if enabled. + * Functions called: `info`, `setup`, `setkey`, `update` (several times) - (never finish) +* CMAC: + * AES or DES in ECB mode, encrypt only. + * Unconditional dependency on `AES_C || DES_C` in `check_config.h`. + * Unconditional dependency on `CIPHER_C` in `check_config.h`. + * Note: also called by `cipher.c` if enabled. + * Functions called: `info`, `setup`, `setkey`, `update` (several times) - (never finish) +* GCM: + * AES, Camellia or Aria in ECB mode, encrypt only. + * Unconditional dependency on `AES_C || CAMELLIA_C || ARIA_C` in `check_config.h`. + * Unconditional dependency on `CIPHER_C` in `check_config.h`. + * Note: also called by `cipher.c` if enabled. + * Functions called: `info`, `setup`, `setkey`, `update` (several times) - (never finish) +* NIST\_KW: + * AES in ECB mode, both encryt and decrypt. + * Unconditional dependency on `AES_C || DES_C` in `check_config.h`. + * Unconditional dependency on `CIPHER_C` in `check_config.h`. + * Note: also called by `cipher.c` if enabled. + * Note: `cipher.h` exposed through API. + * Functions called: `info`, `setup`, `setkey`, `update` (several times) - (never finish) +* Cipher: + * potentially any cipher/AEAD in any mode and any direction + +Note: PSA cipher is built on Cipher, but PSA AEAD directly calls the underlying AEAD modules (GCM, CCM, ChachaPoly). ### Difficulties @@ -263,12 +319,72 @@ These problems are easily solvable. * We can make names and HMAC optional. The mixed-domain hash interface won't be the full `MBEDTLS_MD_C` but a subset. * We can optimize `md.c` without making API changes to `md.h`. +### Scope reductions and priorities for 3.x + +This section documents things that we chose to temporarily exclude from the scope in the 3.x branch (which will eventually be in scope again after 4.0) as well as things we chose to prioritize if we don't have time to support everything. + +#### Don't support PK, X.509 and TLS without `MBEDTLS_USE_PSA_CRYPTO` + +We do not need to support driver-only hashes and ciphers in PK. X.509 and TLS without `MBEDTLS_USE_PSA_CRYPTO`. Users who want to take full advantage of drivers will need to enabled this macro. + +Note that this applies to TLS 1.3 as well, as some uses of hashes and all uses of ciphers there are common with TLS 1.2, hence governed by `MBEDTLS_USE_PSA_CRYPTO`, see [this macro's extended documentation](../../docs/use-psa-crypto.html). + +This will go away naturally in 4.0 when this macros is not longer an option (because it's always on). + +#### Don't support for `MBEDTLS_PSA_CRYPTO_CLIENT` without `MBEDTLS_PSA_CRYPTO_C` + +We generally don't really support builds with `MBEDTLS_PSA_CRYPTO_CLIENT` without `MBEDTLS_PSA_CRYPTO_C`. For example, both `MBEDTLS_USE_PSA_CRYPTO` and `MBEDTLS_SSL_PROTO_TLS1_3` require `MBEDTLS_PSA_CRYPTO_C`, while in principle they should only require `MBEDTLS_PSA_CRYPTO_CLIENT`. + +Considering this existing restriction which we do not plan to lift before 4.0, it is acceptable driver-only hashes and cipher support to have the same restriction in 3.x. + +It is however desirable for the design to keep support for `MBEDTLS_PSA_CRYPTO_CLIENT` in mind, in order to avoid making it more difficult to add in the future. + +#### For cipher: prioritize constrained devices and modern TLS + +The primary target is a configuration like TF-M's medium profile, plus TLS with only AEAD ciphersuites. + +This excludes things like: +- Support for encrypted PEM, PKCS5 and PKCS12 encryption, and PKCS8 encrypted keys in PK parse. (Not widely used on highly constrained devices.) +- Support for NIST-KW. (Same justification.) +- Support for CMAC. (Same justification, plus can be directly accelerated.) +- Support for CBC ciphersuites in TLS. (They've been recommended against for a while now.) + +### Dual-dispatch for block cipher primitives + +Considering the priorities stated above, initially we want to support GCM, CCM and CTR-DRBG. All three of them use the block cipher primitive only in the encrypt direction. Currently, GCM and CCM use the Cipher layer in order to work with AES, Aria and Camellia (DES is excluded by the standards due to its smaller block size) and CTR-DRBG directly uses the low-level API from `aes.h`. In all cases, access to the "block cipher primitive" is done by using "ECB mode" (which for both Cipher and `aes.h` only allows a single block, contrary to PSA which implements actual ECB mode). + +The two AEAD modes, GCM and CCM, have very similar needs and positions in the stack, strongly suggesting using the same design for both. On the other hand, there are a number of differences between CTR-DRBG and them. +- CTR-DRBG only uses AES (and there is no plan to extend it to other block ciphers at the moment), while GCM and CCM need to work with 3 block ciphers already. +- CTR-DRBG holds a special position in the stack: most users don't care about it per se, they only care about getting random numbers - in fact PSA users don't even need to know what DRBG is used. In particular, no part of the stack is asking questions like "is CTR-DRBG-AES available?" - an RNG needs to be available and that's it - contrary to similar questions about AES-GCM etc. which are asked for example by TLS. + +So, it makes sense to use different designs for CTR-DRBG on one hand, and GCM/CCM on the other hand: +- CTR-DRBG can just check if `AES_C` is present and "fall back" to PSA if not. +- GCM and CCM need an common abstraction layer that allows: + - Using AES, Aria or Camellia in a uniform way. + - Dispatching to built-in or driver. + +The abstraction layer used by GCM and CCM may either be a new internal module, or a subset of the existing Cipher API, extended with the ability to dispatch to a PSA driver. + +Reasons for making this layer's API a subset of the existing Cipher API: +- No need to design, implement and test a new module. (Will need to test the new subset though, as well as the extended behaviour.) +- No code change in GCM and CCM - only need to update dependencies. +- No risk for code duplication between a potential new module and Cipher: source-level, and in in particular in builds that still have `CIPHER_C` enabled. (Compiled-code duplication could be avoided by excluding the new module in such builds, though.) +- If want to support other users of Cipher later (such as NIST-KW, CMAC, PKCS5 and PKCS12), we can just extend dual-dispatch support to other modes/operations in Cipher and keep those extra modules unchanged as well. + +Possible costs of re-using (a subset of) the existing Cipher API instead of defining a new one: +- We carry over costs associated with `cipher_info_t` structures. (Currently the info structure is used for 3 things: (1) to check if the cipher is supported, (2) to check its block size, (3) because `setup()` requires it). +- We carry over questionable implementation decisions, like dynamic allocation of context. + +Those costs could be avoided by refactoring (parts of) Cipher, but that would probably mean either: +- significant differences in how the `cipher.h` API is implemented between builds with the full Cipher or only a subset; +- or more work to apply the simplifications to all of Cipher. + +Prototyping both approaches showed better code size savings and cleaner code with a new internal module (see section "Internal "block cipher" abstraction (Cipher light)" below). + ## Specification ### MD light -https://github.com/Mbed-TLS/mbedtls/pull/6474 implements part of this specification, but it's based on Mbed TLS 3.2, so it needs to be rewritten for 3.3. - #### Definition of MD light MD light is a subset of `md.h` that implements the hash calculation interface described in ”[Designing an interface for hashes](#designing-an-interface-for-hashes)”. It is activated by `MBEDTLS_MD_LIGHT` in `mbedtls_config.h`. @@ -398,31 +514,7 @@ Note that this assumes that an operation that has been started via PSA can be co #### Error code conversion -After calling a PSA function, call `mbedtls_md_error_from_psa` to convert its status code. This function is currently defined in `hash_info.c`. - -### Migration to MD light - -#### Migration of modules that used to call MD and now do the legacy-or-PSA dance - -Get rid of the case where `MBEDTLS_MD_C` is undefined. Enable `MBEDTLS_MD_LIGHT` in `build_info.h`. - -#### Migration of modules that used to call a low-level hash module and now do the legacy-or-PSA dance - -Switch to calling MD (light) unconditionally. Enable `MBEDTLS_MD_LIGHT` in `build_info.h`. - -#### Migration of modules that call a low-level hash module - -Switch to calling MD (light). Enable `MBEDTLS_MD_LIGHT` in `build_info.h`. - -#### Migration of use-PSA mixed code - -Instead of calling `hash_info.h` functions to obtain metadata, get it from `md.h`. - -Optionally, code that currently tests on `MBEDTLS_USE_PSA_CRYPTO` just to determine whether to call MD or PSA to calculate hashes can switch to just having the MD variant. - -#### Remove `legacy_or_psa.h` - -It's no longer used. +After calling a PSA function, call `mbedtls_md_error_from_psa` to convert its status code. ### Support all legacy algorithms in PSA @@ -461,10 +553,6 @@ static inline psa_algorithm_t psa_alg_of_md_info( Work in progress on this conversion is at https://github.com/gilles-peskine-arm/mbedtls/tree/hash-unify-ids-wip-1 -#### Get rid of the hash_info module - -The hash_info module is redundant with MD light. Move `mbedtls_md_error_from_psa` to `md.c`, defined only when `MBEDTLS_MD_SOME_PSA` is defined. The rest is no longer used. - #### Unify HMAC with PSA PSA has its own HMAC implementation. In builds with both `MBEDTLS_MD_C` and `PSA_WANT_ALG_HMAC` not fully provided by drivers, we should have a single implementation. Replace the one in `md.h` by calls to the PSA driver interface. This will also give mixed-domain modules access to HMAC accelerated directly by a PSA driver (eliminating the need to a HMAC interface in software if all supported hashes have an accelerator that includes HMAC support). @@ -477,3 +565,34 @@ The architecture can be extended to support `MBEDTLS_PSA_CRYPTO_CLIENT` with a l * Compile-time dependencies: instead of checking `defined(MBEDTLS_PSA_CRYPTO_C)`, check `defined(MBEDTLS_PSA_CRYPTO_C) || defined(MBEDTLS_PSA_CRYPTO_CLIENT)`. * Implementers of `MBEDTLS_PSA_CRYPTO_CLIENT` will need to provide `psa_can_do_hash()` (or a more general function `psa_can_do`) alongside `psa_crypto_init()`. Note that at this point, it will become a public interface, hence we won't be able to change it at a whim. + +### Internal "block cipher" abstraction (Cipher light) + +#### Definition + +The new module is automatically enabled in `build_info.h` by modules that need +it, namely: CCM, GCM, only when `CIPHER_C` is not available. Note: CCM and GCM +currently depend on the full `CIPHER_C` (enforced by `check_config.h`); this +hard dependency would be replaced by the above auto-enablement. + +The following API functions are offered: +``` +void mbedtls_block_cipher_init(mbedtls_block_cipher_context_t *ctx); +void mbedtls_block_cipher_free(mbedtls_block_cipher_context_t *ctx); +int mbedtls_block_cipher_setup(mbedtls_block_cipher_context_t *ctx, + mbedtls_cipher_id_t cipher_id); +int mbedtls_block_cipher_setkey(mbedtls_block_cipher_context_t *ctx, + const unsigned char *key, + unsigned key_bitlen); +int mbedtls_block_cipher_encrypt(mbedtls_block_cipher_context_t *ctx, + const unsigned char input[16], + unsigned char output[16]); +``` + +The only supported ciphers are AES, ARIA and Camellia. They are identified by +an `mbedtls_cipher_id_t` in the `setup()` function, because that's how they're +identifed by callers (GCM/CCM). + +#### Cipher light dual dispatch + +This is likely to come in the future, but has not been defined yet.