mbedtls/docs/architecture/psa-migration/strategy.md

This document explains the strategy that was used so far in starting the
migration to PSA Crypto and mentions future perspectives and open questions.

Goals
=====

Several benefits are expected from migrating to PSA Crypto:

G1. Use PSA Crypto drivers when available.
G2. Allow isolation of long-term secrets (for example, private keys).
G3. Allow isolation of short-term secrets (for example, TLS sesssion keys).
G4. Have a clean, unified API for Crypto (retire the legacy API).
G5. Code size: compile out our implementation when a driver is available.

Currently, some parts of (G1) and (G2) are implemented when
`MBEDTLS_USE_PSA_CRYPTO` is enabled. For (G2) to take effect, the application
needs to be changed to use new APIs.

Generally speaking, the numbering above doesn't mean that each goal requires
the preceding ones to be completed, for example G2-G5 could be done in any
order; however they all either depend on G1 or are just much more convenient
if G1 is done before (note that this is not a dependency on G1 being complete,
it's more like each bit of G2-G5 is helped by some speficic bit in G1).

So, a solid intermediate goal would be to complete (G1) when
`MBEDTLS_USA_PSA_CRYPTO` is enabled - that is, all crypto operations in X.509
and TLS would be done via the PSA Crypto API.

Compile-time options
====================

We currently have two compile-time options that are relevant to the migration:

- `MBEDTLS_PSA_CRYPTO_C` - enabled by default, controls the presence of the PSA
  Crypto APIs.
- `MBEDTLS_USE_PSA_CRYPTO` - disabled by default (enabled in "full" config),
  controls usage of PSA Crypto APIs to perform operations in X.509 and TLS
(G1 above), as well as the availability of some new APIs (G2 above).

The reason why `MBEDTLS_USE_PSA_CRYPTO` is optional, and disabled by default,
is mostly to avoid introducing a hard (or even default) dependency of X509 and
TLS and `MBEDTLS_PSA_CRYPTO_C`. This is mostly reasons of code size, and
historically concerns about the maturity of the PSA code (which we might want
to re-evaluate).

The downside of this approach is that until we feel ready to make
`MBDEDTLS_USE_PSA_CRYPTO` non-optional (always enabled), we have to maintain
two versions of some parts of the code: one using PSA, the other using the
legacy APIs. However, see next section for strategies that can lower that
cost.

Taking advantage of the existing abstractions layers - or not
=============================================================

The Crypto library in Mbed TLS currently has 3 abstraction layers that offer
algorithm-agnostic APIs for a class of algorithms:

- MD for messages digests aka hashes (including HMAC)
- Cipher for symmetric ciphers (included AEAD)
- PK for asymmetric (aka public-key) cryptography (excluding key exchange)

Note: key exchange (FFDH, ECDH) is not covered by an abstraction layer.

These abstraction layers typically provide, in addition to the API for crypto
operations, types and numerical identifiers for algorithms (for
example `mbedtls_cipher_mode_t` and its values). The
current strategy is to keep using those identifiers in most of the code, in
particular in existing structures and public APIs, even when
`MBEDTLS_USE_PSA_CRYPTO` is enabled. (This is not an issue for G1, G2, G3
above, and is only potentially relevant for G4.)

The are multiple strategies that can be used regarding the place of those
layers in the migration to PSA.

Silently call to PSA from the abstraction layer
-----------------------------------------------

- Provide a new definition (conditionally on `USE_PSA_CRYPTO`) of wrapper
  functions in the abstraction layer, that calls PSA instead of the legacy
crypto API.
- Upside: changes contained to a single place, no need to change TLS or X.509
  code anywhere.
- Downside: tricky to implement if the PSA implementation is currently done on
  top of that layer (dependency loop).

This strategy is currently used for ECDSA signature verification in the PK
layer, and could be extended to all operations in the PK layer.

This strategy is not very well suited to the Cipher and MD layers, as the PSA
implementation is currently done on top of those layers.

Replace calls for each operation
--------------------------------

- For every operation that's done through this layer in TLS or X.509, just
  replace function call with calls to PSA (conditionally on `USE_PSA_CRYPTO`)
- Upside: conceptually simple, and if the PSA implementation is currently done
  on top of that layer, avoids concerns about dependency loops.
- Downside: TLS/X.509 code has to be done for each operation.

This strategy is currently used for the MD layer. (Currently only a subset of
calling places, but could be extended to all of them.)

Opt-in use of PSA from the abstraction layer
--------------------------------------------

- Provide a new way to set up a context that causes operations on that context
  to be done via PSA.
- Upside: changes mostly contained in one place, TLS/X.509 code only needs to
  be changed when setting up the context, but not when using it. In
  particular, no changes to/duplication of existing public APIs that expect a
  key to be passed as a context of this layer (eg, `mbedtls_pk_context`).
- Upside: avoids dependency loop when PSA implemented on top of that layer.
- Downside: when the context is typically set up by the application, requires
  changes in application code.

There are two variants of this strategy: one where using the new setup
function also allows for key isolation (the key is only held by PSA,
supporting both G1 and G2 in that area), and one without isolation (the key is
still stored outsde of PSA most of the time, supporting only G1).

This strategy, with support for key isolation, is currently used for ECDSA
signature generation in the PK layer - see `mbedtls_pk_setup_opaque()`. This
allows use of PSA-held private ECDSA keys in TLS and X.509 with no change to
the TLS/X.509 code, but a contained change in the application. If could be
extended to other private key operations in the PK layer.

This strategy, without key isolation, is also currently used in the Cipher
layer - see `mbedtls_cipher_setup_psa()`. This allows use of PSA for cipher
operations in TLS with no change to the application code, and a
contained change in TLS code. (It currently only supports a subset of ciphers,
but could easily be extended to all of them.)

Note: for private key operations in the PK layer, both the "silent" and the
"opt-in" strategy can apply, and can complement each other, as one provides
support for key isolation, but at the (unavoidable) code of change in
application code, while the other requires no application change to get
support for drivers, but fails to provide isolation support.

Migrating away from the legacy API
==================================

This section briefly introduces questions and possible plans towards G4,
mainly as they relate to choices in previous stages.

The role of the PK/Cipher/MD APIs in user migration
---------------------------------------------------

We're currently taking advantage of the existing PK and Cipher layers in order
to reduce the number of places where library code needs to be changed. It's
only natural to consider using the same strategy (with the PK, MD and Cipher
layers) for facilitating migration of application code.

Note: a necessary first step for that would be to make sure PSA is no longer
implemented of top of the concerned layers

### Zero-cost compatibility layer?

The most favourable case is if we can have a zero-cost abstraction (no
runtime, RAM usage or code size penalty), for example just a bunch of
`#define`s, essentialy mapping `mbedtls_` APIs to their `psa_` equivalent.

Unfortunately that's unlikely fully work. For example, the MD layer uses the
same context type for hashes and HMACs, while the PSA API (rightfully) has
distinct operation types. Similarly, the Cipher layer uses the same context
type for unauthenticated and AEAD ciphers, which again the PSA API
distinguishes.

It is unclear how much value, if any, a zero-cost compatibility layer that's
incomplete (for example, for MD covering only hashes, or for Cipher covering
only AEAD) or differs significantly from the existing API (for example,
introducing new context types) would provide to users.

### Low-cost compatibility layers?

Another possibility is to keep most or all of the existing API for the PK, MD
and Cipher layers, implemented on top of PSA, aiming for the lowest possible
cost. For example, `mbedtls_md_context_t` would be defined as a (tagged) union
of `psa_hash_operation_t` and `psa_mac_operation_t`, then `mbedtls_md_setup()`
would initialize the correct part, and the rest of the functions be simple
wrappers around PSA functions. This would vastly reduce the complexity of the
layers compared to the existing (no need to dispatch through function
pointers, just call the corresponding PSA API).

Since this would still represent a non-zero cost, not only in terms of code
size, but also in terms of maintainance (testing, etc.) this would probably
be a temporary solution: for example keep the compatibility layers in 4.0 (and
make them optional), but remove them in 5.0.

Again, this provides the most value to users if we can manage to keep the
existing API unchanged. Their might be conflcits between this goal and that of
reducing the cost, and judgment calls may need to be made.

Note: when it comes to holding public keys in the PK layer, depending on how
the rest of the code is structured, it may be worth holding the key data in
memory controlled by the PK layer as opposed to a PSA key slot, moving it to a
slot only when needed (see current `ecdsa_verify_wrap` when
`MBEDTLS_USE_PSA_CRYPTO` is defined)  For example, when parsing a large
number, N, of X.509 certificates (for example the list of trusted roots), it
might be undesirable to use N PSA key slots for their public keys as long as
the certs are loaded. OTOH, this could also be addressed by merging the "X.509
parsing on-demand" (#2478), and then the public key data would be held as
bytes in the X.509 CRT structure, and only moved to a PK context / PSA slot
when it's actually used.

Note: the PK layer actually consists of two relatively distinct parts: crypto
operations, which will be covered by PSA, and parsing/writing (exporting)
from/to various formats, which is currently not fully covered by the PSA
Crypto API.

### Algorithm identifiers and other identifiers

It should be easy to provide the user with a bunch of `#define`s for algorithm
identifiers, for example `#define MBEDTLS_MD_SHA256 PSA_ALG_SHA_256`; most of
those would be in the MD, Cipher and PK compatibility layers mentioned above,
but there might be some in other modules that may be worth considering, for
example identifiers for elliptic curves.

### Lower layers

Generally speaking, we would retire all of the low-level, non-generic modules,
such as AES, SHA-256, RSA, DHM, ECDH, ECP, bignum, etc, without providing
compatibility APIs for them. People would be encouraged to switch to the PSA
API. (The compatiblity implementation of the existing PK, MD, Cipher APIs
would mostly benefit people who already used those generic APis rather than
the low-level, alg-specific ones.)

### APIs in TLS and X.509

Public APIs in TLS and X.509 may be affected by the migration in at least two
ways:

1. APIs that rely on a legacy `mbedtls_` crypto type: for example
   `mbedtls_ssl_conf_own_cert()` to configure a (certificate and the
associated) private key. Currently the private key is passed as a
`mbedtls_pk_context` object, which would probably change to a `psa_key_id_t`.
Since some users would probably still be using the compatibility PK layer, it
would need a way to easily extract the PSA key ID from the PK context.

2. APIs the accept list of identifiers: for example
   `mbedtls_ssl_conf_curves()` taking a list of `mbedtls_ecp_group_id`s. This
could be changed to accept a list of pairs (`psa_ecc_familiy_t`, size) but we
should probably take this opportunity to move to a identifier independant from
the underlying crypto implementation and use TLS-specific identifiers instead
(based on IANA values or custom enums), as is currently done in the new
`mbedtls_ssl_conf_groups()` API, see #4859).

Testing
-------

An question that needs careful consideration when we come around to removing
the low-level crypto APIs and making PK, MD and Cipher optional compatibility
layers is to be sure to preserve testing quality. A lot of the existing test
cases use the low level crypto APIs; we would need to either keep using that
API for tests, or manually migrated test to the PSA Crypto API. Perhaps a
combination of both, perhaps evolving gradually over time.
Document the general strategy for PSA migration Signed-off-by: Manuel Pégourié-Gonnard <manuel.pegourie-gonnard@arm.com> 2021-09-30 11:52:04 +02:00			`This document explains the strategy that was used so far in starting the`
			`migration to PSA Crypto and mentions future perspectives and open questions.`

			`Goals`
			`=====`

			`Several benefits are expected from migrating to PSA Crypto:`

Expand discussion of goals Signed-off-by: Manuel Pégourié-Gonnard <manuel.pegourie-gonnard@arm.com> 2021-10-27 14:00:08 +02:00			`G1. Use PSA Crypto drivers when available.`
Document the general strategy for PSA migration Signed-off-by: Manuel Pégourié-Gonnard <manuel.pegourie-gonnard@arm.com> 2021-09-30 11:52:04 +02:00			`G2. Allow isolation of long-term secrets (for example, private keys).`
			`G3. Allow isolation of short-term secrets (for example, TLS sesssion keys).`
			`G4. Have a clean, unified API for Crypto (retire the legacy API).`
Expand discussion of goals Signed-off-by: Manuel Pégourié-Gonnard <manuel.pegourie-gonnard@arm.com> 2021-10-27 14:00:08 +02:00			`G5. Code size: compile out our implementation when a driver is available.`
Document the general strategy for PSA migration Signed-off-by: Manuel Pégourié-Gonnard <manuel.pegourie-gonnard@arm.com> 2021-09-30 11:52:04 +02:00
			`Currently, some parts of (G1) and (G2) are implemented when`
			`MBEDTLS_USE_PSA_CRYPTO` is enabled. For (G2) to take effect, the application
			`needs to be changed to use new APIs.`

			`Generally speaking, the numbering above doesn't mean that each goal requires`
Expand discussion of goals Signed-off-by: Manuel Pégourié-Gonnard <manuel.pegourie-gonnard@arm.com> 2021-10-27 14:00:08 +02:00			`the preceding ones to be completed, for example G2-G5 could be done in any`
			`order; however they all either depend on G1 or are just much more convenient`
			`if G1 is done before (note that this is not a dependency on G1 being complete,`
			`it's more like each bit of G2-G5 is helped by some speficic bit in G1).`
Document the general strategy for PSA migration Signed-off-by: Manuel Pégourié-Gonnard <manuel.pegourie-gonnard@arm.com> 2021-09-30 11:52:04 +02:00
			`So, a solid intermediate goal would be to complete (G1) when`
			`MBEDTLS_USA_PSA_CRYPTO` is enabled - that is, all crypto operations in X.509
			`and TLS would be done via the PSA Crypto API.`

			`Compile-time options`
			`====================`

			`We currently have two compile-time options that are relevant to the migration:`

			- `MBEDTLS_PSA_CRYPTO_C` - enabled by default, controls the presence of the PSA
			`Crypto APIs.`
			- `MBEDTLS_USE_PSA_CRYPTO` - disabled by default (enabled in "full" config),
			`controls usage of PSA Crypto APIs to perform operations in X.509 and TLS`
			`(G1 above), as well as the availability of some new APIs (G2 above).`

			The reason why `MBEDTLS_USE_PSA_CRYPTO` is optional, and disabled by default,
			`is mostly to avoid introducing a hard (or even default) dependency of X509 and`
			TLS and `MBEDTLS_PSA_CRYPTO_C`. This is mostly reasons of code size, and
			`historically concerns about the maturity of the PSA code (which we might want`
			`to re-evaluate).`

			`The downside of this approach is that until we feel ready to make`
			`MBDEDTLS_USE_PSA_CRYPTO` non-optional (always enabled), we have to maintain
			`two versions of some parts of the code: one using PSA, the other using the`
			`legacy APIs. However, see next section for strategies that can lower that`
			`cost.`

			`Taking advantage of the existing abstractions layers - or not`
			`=============================================================`

			`The Crypto library in Mbed TLS currently has 3 abstraction layers that offer`
			`algorithm-agnostic APIs for a class of algorithms:`

			`- MD for messages digests aka hashes (including HMAC)`
			`- Cipher for symmetric ciphers (included AEAD)`
			`- PK for asymmetric (aka public-key) cryptography (excluding key exchange)`

			`Note: key exchange (FFDH, ECDH) is not covered by an abstraction layer.`

			`These abstraction layers typically provide, in addition to the API for crypto`
			`operations, types and numerical identifiers for algorithms (for`
			example `mbedtls_cipher_mode_t` and its values). The
			`current strategy is to keep using those identifiers in most of the code, in`
			`particular in existing structures and public APIs, even when`
			`MBEDTLS_USE_PSA_CRYPTO` is enabled. (This is not an issue for G1, G2, G3
			`above, and is only potentially relevant for G4.)`

			`The are multiple strategies that can be used regarding the place of those`
			`layers in the migration to PSA.`

			`Silently call to PSA from the abstraction layer`
			`-----------------------------------------------`

			- Provide a new definition (conditionally on `USE_PSA_CRYPTO`) of wrapper
			`functions in the abstraction layer, that calls PSA instead of the legacy`
			`crypto API.`
			`- Upside: changes contained to a single place, no need to change TLS or X.509`
			`code anywhere.`
			`- Downside: tricky to implement if the PSA implementation is currently done on`
			`top of that layer (dependency loop).`

			`This strategy is currently used for ECDSA signature verification in the PK`
			`layer, and could be extended to all operations in the PK layer.`

			`This strategy is not very well suited to the Cipher and MD layers, as the PSA`
			`implementation is currently done on top of those layers.`

			`Replace calls for each operation`
			`--------------------------------`

			`- For every operation that's done through this layer in TLS or X.509, just`
			replace function call with calls to PSA (conditionally on `USE_PSA_CRYPTO`)
			`- Upside: conceptually simple, and if the PSA implementation is currently done`
			`on top of that layer, avoids concerns about dependency loops.`
			`- Downside: TLS/X.509 code has to be done for each operation.`

			`This strategy is currently used for the MD layer. (Currently only a subset of`
			`calling places, but could be extended to all of them.)`

			`Opt-in use of PSA from the abstraction layer`
			`--------------------------------------------`

			`- Provide a new way to set up a context that causes operations on that context`
			`to be done via PSA.`
			`- Upside: changes mostly contained in one place, TLS/X.509 code only needs to`
			`be changed when setting up the context, but not when using it. In`
			`particular, no changes to/duplication of existing public APIs that expect a`
			key to be passed as a context of this layer (eg, `mbedtls_pk_context`).
			`- Upside: avoids dependency loop when PSA implemented on top of that layer.`
			`- Downside: when the context is typically set up by the application, requires`
			`changes in application code.`

			`There are two variants of this strategy: one where using the new setup`
			`function also allows for key isolation (the key is only held by PSA,`
			`supporting both G1 and G2 in that area), and one without isolation (the key is`
			`still stored outsde of PSA most of the time, supporting only G1).`

			`This strategy, with support for key isolation, is currently used for ECDSA`
			signature generation in the PK layer - see `mbedtls_pk_setup_opaque()`. This
			`allows use of PSA-held private ECDSA keys in TLS and X.509 with no change to`
			`the TLS/X.509 code, but a contained change in the application. If could be`
			`extended to other private key operations in the PK layer.`

			`This strategy, without key isolation, is also currently used in the Cipher`
			layer - see `mbedtls_cipher_setup_psa()`. This allows use of PSA for cipher
			`operations in TLS with no change to the application code, and a`
			`contained change in TLS code. (It currently only supports a subset of ciphers,`
			`but could easily be extended to all of them.)`

			`Note: for private key operations in the PK layer, both the "silent" and the`
			`"opt-in" strategy can apply, and can complement each other, as one provides`
			`support for key isolation, but at the (unavoidable) code of change in`
			`application code, while the other requires no application change to get`
			`support for drivers, but fails to provide isolation support.`

			`Migrating away from the legacy API`
			`==================================`

			`This section briefly introduces questions and possible plans towards G4,`
			`mainly as they relate to choices in previous stages.`

			`The role of the PK/Cipher/MD APIs in user migration`
			`---------------------------------------------------`

			`We're currently taking advantage of the existing PK and Cipher layers in order`
			`to reduce the number of places where library code needs to be changed. It's`
			`only natural to consider using the same strategy (with the PK, MD and Cipher`
			`layers) for facilitating migration of application code.`

			`Note: a necessary first step for that would be to make sure PSA is no longer`
			`implemented of top of the concerned layers`

			`### Zero-cost compatibility layer?`

			`The most favourable case is if we can have a zero-cost abstraction (no`
			`runtime, RAM usage or code size penalty), for example just a bunch of`
			`#define`s, essentialy mapping `mbedtls_` APIs to their `psa_` equivalent.

			`Unfortunately that's unlikely fully work. For example, the MD layer uses the`
			`same context type for hashes and HMACs, while the PSA API (rightfully) has`
			`distinct operation types. Similarly, the Cipher layer uses the same context`
			`type for unauthenticated and AEAD ciphers, which again the PSA API`
			`distinguishes.`

			`It is unclear how much value, if any, a zero-cost compatibility layer that's`
			`incomplete (for example, for MD covering only hashes, or for Cipher covering`
			`only AEAD) or differs significantly from the existing API (for example,`
			`introducing new context types) would provide to users.`

			`### Low-cost compatibility layers?`

			`Another possibility is to keep most or all of the existing API for the PK, MD`
			`and Cipher layers, implemented on top of PSA, aiming for the lowest possible`
			cost. For example, `mbedtls_md_context_t` would be defined as a (tagged) union
			of `psa_hash_operation_t` and `psa_mac_operation_t`, then `mbedtls_md_setup()`
			`would initialize the correct part, and the rest of the functions be simple`
			`wrappers around PSA functions. This would vastly reduce the complexity of the`
			`layers compared to the existing (no need to dispatch through function`
			`pointers, just call the corresponding PSA API).`

			`Since this would still represent a non-zero cost, not only in terms of code`
			`size, but also in terms of maintainance (testing, etc.) this would probably`
			`be a temporary solution: for example keep the compatibility layers in 4.0 (and`
			`make them optional), but remove them in 5.0.`

			`Again, this provides the most value to users if we can manage to keep the`
			`existing API unchanged. Their might be conflcits between this goal and that of`
			`reducing the cost, and judgment calls may need to be made.`

			`Note: when it comes to holding public keys in the PK layer, depending on how`
			`the rest of the code is structured, it may be worth holding the key data in`
			`memory controlled by the PK layer as opposed to a PSA key slot, moving it to a`
			slot only when needed (see current `ecdsa_verify_wrap` when
			`MBEDTLS_USE_PSA_CRYPTO` is defined) For example, when parsing a large
			`number, N, of X.509 certificates (for example the list of trusted roots), it`
			`might be undesirable to use N PSA key slots for their public keys as long as`
			`the certs are loaded. OTOH, this could also be addressed by merging the "X.509`
			`parsing on-demand" (#2478), and then the public key data would be held as`
			`bytes in the X.509 CRT structure, and only moved to a PK context / PSA slot`
			`when it's actually used.`

			`Note: the PK layer actually consists of two relatively distinct parts: crypto`
			`operations, which will be covered by PSA, and parsing/writing (exporting)`
			`from/to various formats, which is currently not fully covered by the PSA`
			`Crypto API.`

			`### Algorithm identifiers and other identifiers`

			It should be easy to provide the user with a bunch of `#define`s for algorithm
			identifiers, for example `#define MBEDTLS_MD_SHA256 PSA_ALG_SHA_256`; most of
			`those would be in the MD, Cipher and PK compatibility layers mentioned above,`
			`but there might be some in other modules that may be worth considering, for`
			`example identifiers for elliptic curves.`

			`### Lower layers`

			`Generally speaking, we would retire all of the low-level, non-generic modules,`
			`such as AES, SHA-256, RSA, DHM, ECDH, ECP, bignum, etc, without providing`
			`compatibility APIs for them. People would be encouraged to switch to the PSA`
			`API. (The compatiblity implementation of the existing PK, MD, Cipher APIs`
			`would mostly benefit people who already used those generic APis rather than`
			`the low-level, alg-specific ones.)`

			`### APIs in TLS and X.509`

			`Public APIs in TLS and X.509 may be affected by the migration in at least two`
			`ways:`

			1. APIs that rely on a legacy `mbedtls_` crypto type: for example
			`mbedtls_ssl_conf_own_cert()` to configure a (certificate and the
			`associated) private key. Currently the private key is passed as a`
			`mbedtls_pk_context` object, which would probably change to a `psa_key_id_t`.
			`Since some users would probably still be using the compatibility PK layer, it`
			`would need a way to easily extract the PSA key ID from the PK context.`

			`2. APIs the accept list of identifiers: for example`
			`mbedtls_ssl_conf_curves()` taking a list of `mbedtls_ecp_group_id`s. This
			could be changed to accept a list of pairs (`psa_ecc_familiy_t`, size) but we
			`should probably take this opportunity to move to a identifier independant from`
			`the underlying crypto implementation and use TLS-specific identifiers instead`
			`(based on IANA values or custom enums), as is currently done in the new`
			`mbedtls_ssl_conf_groups()` API, see #4859).

			`Testing`
			`-------`

			`An question that needs careful consideration when we come around to removing`
			`the low-level crypto APIs and making PK, MD and Cipher optional compatibility`
			`layers is to be sure to preserve testing quality. A lot of the existing test`
			`cases use the low level crypto APIs; we would need to either keep using that`
			`API for tests, or manually migrated test to the PSA Crypto API. Perhaps a`
			`combination of both, perhaps evolving gradually over time.`