This issue has been reported by Tuba Yavuz, Farhaan Fowze, Ken (Yihang) Bai,
Grant Hernandez, and Kevin Butler (University of Florida) and
Dave Tian (Purdue University).
In AES encrypt and decrypt some variables were left on the stack. The value
of these variables can be used to recover the last round key. To follow best
practice and to limit the impact of buffer overread vulnerabilities (like
Heartbleed) we need to zeroize them before exiting the function.
In the case of *ret we might need to preserve a 0 value throughout the
loop and therefore we need an extra condition to protect it from being
overwritten.
The value of done is always 1 after *ret has been set and does not need
to be protected from overwriting. Therefore in this case the extra
condition can be removed.
The code relied on the assumptions that CHAR_BIT is 8 and that unsigned
does not have padding bits.
In the Bignum module we already assume that the sign of an MPI is either
-1 or 1. Using this, we eliminate the above mentioned dependency.
The signature of mbedtls_mpi_cmp_mpi_ct() meant to support using it in
place of mbedtls_mpi_cmp_mpi(). This meant full comparison functionality
and a signed result.
To make the function more universal and friendly to constant time
coding, we change the result type to unsigned. Theoretically, we could
encode the comparison result in an unsigned value, but it would be less
intuitive.
Therefore we won't be able to represent the result as unsigned anymore
and the functionality will be constrained to checking if the first
operand is less than the second. This is sufficient to support the
current use case and to check any relationship between MPIs.
The only drawback is that we need to call the function twice when
checking for equality, but this can be optimised later if an when it is
needed.
Multiplication is known to have measurable timing variations based on
the operands. For example it typically is much faster if one of the
operands is zero. Remove them from constant time code.
The blinding applied to the scalar before modular inversion is
inadequate. Bignum is not constant time/constant trace, side channel
attacks can retrieve the blinded value, factor it (it is smaller than
RSA keys and not guaranteed to have only large prime factors). Then the
key can be recovered by brute force.
Reducing the blinded value makes factoring useless because the adversary
can only recover pk*t+z*N instead of pk*t.
mbedtls_ctr_drbg_seed() always set the entropy length to the default,
so a call to mbedtls_ctr_drbg_set_entropy_len() before seed() had no
effect. Change this to the more intuitive behavior that
set_entropy_len() sets the entropy length and seed() respects that and
only uses the default entropy length if there was no call to
set_entropy_len().
The former test-only function mbedtls_ctr_drbg_seed_entropy_len() is
no longer used, but keep it for strict ABI compatibility.
Move the definitions of mbedtls_ctr_drbg_seed_entropy_len() and
mbedtls_ctr_drbg_seed() to after they are used. This makes the code
easier to read and to maintain.
mbedtls_hmac_drbg_seed() always set the entropy length to the default,
so a call to mbedtls_hmac_drbg_set_entropy_len() before seed() had no
effect. Change this to the more intuitive behavior that
set_entropy_len() sets the entropy length and seed() respects that and
only uses the default entropy length if there was no call to
set_entropy_len().
ssl_decompress_buf() was operating on data from the ssl context, but called at
a point where this data is actually in the rec structure. Call it later so
that the data is back to the ssl structure.
Signed-off-by: Simon Butcher <simon.butcher@arm.com>
There is a 50% performance drop in the SCA_CM enabled encrypt and
decrypt functions. Therefore use the older version of encrypt/decypt
functions when SCA_CM is disabled.
-Do not reuse any part of randomized number, use separate byte for
each purpose.
-Combine some separate loops together to get rid of gap between them
-Extend usage of flow_control
* upstream/pr/2945:
Rename macro MBEDTLS_MAX_RAND_DELAY
Update signature of mbedtls_platform_random_delay
Replace mbedtls_platform_enforce_volatile_reads 2
Replace mbedtls_platform_enforce_volatile_reads
Add more variation to random delay countermeasure
Add random delay to enforce_volatile_reads
Update comments of mbedtls_platform_random_delay
Follow Mbed TLS coding style
Add random delay function to platform_utils
When reading the input, buffer will be initialised with random data
and the reading will start from a random offset. When writing the data,
the output will be initialised with random data and the writing will start
from a random offset.
When reading the input, the buffer will be initialised with random data
and the reading will start from a random offset. When writing the data,
the output will be initialised with random data and the writing will
start from a random offset.
Add more variation to the random delay function by xor:ing two
variables. It is not enough to increment just a counter to create a
delay as it will be visible as uniform delay that can be easily
removed from the trace by analysis.
The flag is used for tracking if the premaster has
been succesfully generated. Note that when resuming
a session, the flag should not be used when trying to
notice if all the key generation/derivation has been done.
Default flow assumes failure causes multiple issues with
compatibility tests when the return value is initialised
with error value in ssl_in_server_key_exchange_parse.
The function would need a significant change in structure for this.
The verification could be skipped in server, changed the default flow
so that the handshake status is ever updated if the verify
succeeds, and that is checked twice.
Check that the encryption has been done for the outbut buffer.
This is to ensure that glitching out the encryption doesn't
result as a unecrypted buffer to be sent.
This is to enable hardening the security when changing
states in state machine so that the state cannot be changed by bit flipping.
The later commit changes the enumerations so that the states have large
hamming distance in between them to prevent this kind of attack.
-Replace usage of rand() with mbedtls_platform_random_in_range()
-Prevent for-ever loop by hardcoding SCA countermeasure position in
case of used random function is always returning constant number.
-Use separate control bytes for start and final round to get them
randomized separately.
-Remove struct name.
-Fix comments and follow Mbed TLS coding style.
SCA CM implementation caused AES performance drop. For example
AES-CCM-128 calculation speed was dropped from 240 KB/s to 111 KB/s.
(-54%), Similarily AES-CBC-128 calculation speed was dropped from
536 KB/s to 237 KB/s (-56%).
Use functions instead of macros to reduce code indirections and
therefore increase performance. Now the performance is 163 KB/s for
AES-CCM-128 (-32%) and 348 KB/s for AES-CBC-128 (-35%).
When SCA countermeasures are activated the performance is as follows:
122 KB/s for AES-CCM-128 (-49%) and 258 KB/s for AES-CBC-128 (-52%)
compared to the original AES implementation.
Use control bytes to instruct AES calculation rounds. Each
calculation round has a control byte that indicates what data
(real/fake) is used and if any offset is required for AES data
positions.
First and last AES calculation round are calculated with SCA CM data
included. The calculation order is randomized by the control bytes.
Calculations between the first and last rounds contains 3 SCA CMs
in randomized positions.
- Add configuration for AES_SCA_COUNTERMEASURES to config.h. By
default the feature is disabled.
- Add AES_SCA_COUNTERMEASURES configuration check to check_config.h
- Add AES_SCA_COUNTERMEASURES test to all.sh
- 3 additional dummy AES rounds calculated with random data for each
AES encryption/decryption
- additional rounds can be occur in any point in sequence of rounds
- MSVC doesn't like -1u
- We need to include platform.h for MBEDTLS_ERR_PLATFORM_FAULT_DETECTED - in
some configurations it was already included indirectly, but not in all
configurations, so better include it directly.
This commit first changes the return convention of EccPoint_mult_safer() so
that it properly reports when faults are detected. Then all functions that
call it need to be changed to (1) follow the same return convention and (2)
properly propagate UECC_FAULT_DETECTED when it occurs.
Here's the reverse call graph from EccPoint_mult_safer() to the rest of the
library (where return values are translated to the MBEDTLS_ERR_ space) and test
functions (where expected return values are asserted explicitly).
EccPoint_mult_safer()
EccPoint_compute_public_key()
uECC_compute_public_key()
pkparse.c
tests/suites/test_suite_pkparse.function
uECC_make_key_with_d()
uECC_make_key()
ssl_cli.c
ssl_srv.c
tests/suites/test_suite_pk.function
tests/suites/test_suite_tinycrypt.function
uECC_shared_secret()
ssl_tls.c
tests/suites/test_suite_tinycrypt.function
uECC_sign_with_k()
uECC_sign()
pk.c
tests/suites/test_suite_tinycrypt.function
Note: in uECC_sign_with_k() a test for uECC_vli_isZero(p) is suppressed
because it is redundant with a more thorough test (point validity) done at the
end of EccPoint_mult_safer(). This redundancy was introduced in a previous
commit but not noticed earlier.