While these tests and the issue with it are pre-existing:
- we previously didn't understand that the issue was an openssl bug
- failures seem to have become more frequent since the recent changes
So let's disable these fragile tests in order to get a clean CI. We still have
the tests against gnutls-serv for interop testing.
While making the initial commit, I thought $OPENSSL_LEGACY was not affect by
this bug, but it turns out I was wrong. All versions of OpenSSL installed on
the CI are. Therefore, the corresponding tests are disabled for the same
reason as the gnutls-cli tests above it.
This commit is only about the tests that were added in the recent
fragmentation work. One of those two tests had a particularly
annoying mode of failure: it failed consistently with seed=1 (use in the
release version of all.sh), once #1951 was applied. This has nothing
particular to do with #1951, except that by changing retransmission behaviour
1951 made the proxy run into a path that triggered the OpenSSL bug with this
seed, while it previously did that only with other seeds.
Other 3d interop test are also susceptible to triggering this OpenSSL bug or
others (or bugs in GnuTLS), but they are left untouched by this commit as:
- they were pre-existing to the recent DTLS branches;
- they don't seem to have the particularly annoying seed=1 mode of failure.
However it's probably desirable to do something about them at some point in
the future.
This commit adds a test to ssl-opt.sh which exercises the behavior
of the library in the situation where a single proper fragment
of a future handshake message is received prior to the next
expected handshake message (concretely, the client receives
the first fragment of the server's Certificate message prior
to the server's ServerHello).
This commit adds tests to ssl-opt.sh which trigger code-paths
responsible for freeing future buffered messages when the buffering
limitations set by MBEDTLS_SSL_DTLS_MAX_BUFFERING don't allow the
next expected message to be reassembled.
These tests only work for very specific ranges of
MBEDTLS_SSL_DTLS_MAX_BUFFERING and will therefore be skipped
on a run of ssl-opt.sh in ordinary configurations.
This commit adds functions requires_config_value_at_most()
and requires_config_value_at_least() which can be used to
only run tests when a numerical value from config.h
(e.g. MBEDTLS_SSL_IN_CONTENT_LEN) is within a certain range.
The negotiated MFL is always the one suggested by the client, even
if the server has a smaller MFL configured locally. Hence, in the test
where the client asks for an MFL of 4096 bytes while the server locally
has an MFL of 512 bytes configured, the client will still send datagrams
of up to ~4K size.
Depending on the settings of the local machine, gnutls-cli will either try
IPv4 or IPv6 when trying to connect to localhost. With TLS, whatever it tries
first, it will notice if any failure happens and try the other protocol if
necessary. With DTLS it can't do that. Unfortunately for now there isn't
really any good way to specify an address and hostname independently, though
that might come soon: https://gitlab.com/gnutls/gnutls/issues/344
A work around is to specify an address directly and then use --insecure to
ignore certificate hostname mismatch; that is OK for tests that are completely
unrelated to certificate verification (such as the recent fragmenting tests)
but unacceptable for others.
For that reason, don't specify a default hostname for gnutls-cli, but instead
let each test choose between `--insecure 127.0.0.1` and `localhost` (or
`--insecure '::1'` if desired).
Alternatives include:
- having test certificates with 127.0.0.1 as the hostname, but having an IP as
the CN is unusual, and we would need to change our test certs;
- have our server open two sockets under the hood and listen on both IPv4 and
IPv6 (that's what gnutls-serv does, and IMO it's a good thing) but that
obviously requires development and testing (esp. for windows compatibility)
- wait for a newer version of GnuTLS to be released, install it on the CI and
developer machines, and use that in all tests - quite satisfying but can't
be done now (and puts stronger requirements on test environment).
From Hanno:
When a server replies to a cookieless ClientHello with a HelloVerifyRequest,
it is supposed to reset the connection and wait for a subsequent ClientHello
which includes the cookie from the HelloVerifyRequest.
In testing environments, it might happen that the reset of the server
takes longer than for the client to replying to the HelloVerifyRequest
with the ClientHello+Cookie. In this case, the ClientHello gets lost
and the client will need retransmit. This may happen even if the underlying
datagram transport is reliable.
This commit continues commit 47db877 by removing resend guards in the
ssl-opt.sh tests 'DTLS fragmenting: proxy MTU, XXX' which sometimes made
the tests fail in case the log showed a resend from the client.
See 47db877 for more information.
When a server replies to a cookieless ClientHello with a HelloVerifyRequest,
it is supposed to reset the connection and wait for a subsequent ClientHello
which includes the cookie from the HelloVerifyRequest.
In testing environments, it might happen that the reset of the server
takes longer than for the client to replying to the HelloVerifyRequest
with the ClientHello+Cookie. In this case, the ClientHello gets lost
and the client will need retransmit. This may happen even if the underlying
datagram transport is reliable.
This commit removes a guard in the ssl-opt.sh test
'DTLS fragmenting: proxy MTU, resumed handshake' which made
the test fail in case the log showed a resend from the client.
We previously observed random-looking failures from this test. I think they
were caused by a race condition where the client tries to reconnect while the
server is still closing the connection and has not yet returned to an
accepting state. In that case, the server would fail to see and reply to the
ClientHello, and the client would have to resend it.
I believe logs of failing runs are compatible with this interpretation:
- the proxy logs show the new ClientHello and the server's closing Alert are
sent the same millisecond.
- the client logs show the server's closing Alert is received after the new
handshake has been started (discarding message from wrong epoch).
The attempted fix is for the client to wait a bit before reconnecting, which
should vastly enhance the probability of the server reaching its accepting
state before the client tries to reconnect. The value of 1 second is arbitrary
but should be more than enough even on loaded machines.
The test was run locally 100 times in a row on a slightly loaded machine (an
instance of all.sh running in parallel) without any failure after this fix.
Use the same values as other 3d tests: this makes the test hopefully a bit
faster than the default values, while not increasing the failure rate.
While at it:
- adjust "needs_more_time" setting for 3d interop tests (we can't set the
timeout values for other implementations, so the test might be slow)
- fix some supposedly DTLS 1.0 test that were using dtls1_2 on the command
line
Now that the UDP proxy has the ability to delay specific
handshake message on the client and server side, use
this to rewrite the reordering tests and thereby make
them independent on the choice of PRNG used by the proxy
(which is not stable across platforms).
This commit adds four tests to ssl-opt.sh running default
DTLS client and server with and without datagram packing
enabled, and checking that datagram packing is / is not
used by inspecting the debug output.
The UDP proxy does currently not dissect datagrams into records,
an hence the coverage of the reordering, package loss and duplication
tests is much smaller if datagram packing is in use.
This commit disables datagram packing for most UDP proxy tests,
in particular all 3D (drop, duplicate, delay) tests.
Now that datagram packing can be dynamically configured,
the test exercising the behavior of Mbed TLS when facing
an out-of-order CCS message can be re-introduced, disabling
datagram packing for the sender of the delayed CCS.
The tests "DTLS fragmenting: none (for reference)" and
"DTLS fragmenting: none (for reference) (MTU)" used a
maximum fragment length resp. MTU value of 2048 which
was meant to be large enough so that fragmentation
of the certificate message would not be necessary.
However, it is not large enough to hold the entire flight
to which the certificate belongs, and hence there will
be fragmentation as soon as datagram packing is used.
This commit increases the maximum fragment length resp.
MTU values to 4096 bytes to ensure that even with datagram
packing in place, no fragmentation is necessary.
A similar change was made in "DTLS fragmenting: client (MTU)".
The test exercising a delayed CCS message is not
expected to work when datagram packing is used,
as the current UDP proxy is not able to recognize
records which are not at the beginning of a
datagram.
Adds a requirement for GNUTLS_NEXT (3.5.3 or above, in practice we should
install 3.6.3) on the CI.
See internal ref IOTSSL-2401 for analysis of the bugs and their impact on the
tests.
For now, just check that it causes us to fragment. More tests are coming in
follow-up commits to ensure we respect the exact value set, including when
renegotiating.
Note: no interop tests in ssl-opt.sh for now, as some of them make us run into
bugs in (the CI's default versions of) OpenSSL and GnuTLS, so interop tests
will be added later once the situation is clarified. <- TODO
1. Update the test script to un the ECC tests only if the relevant
configurations are defined in `config.h` file
2. Change the HASH of the ciphersuite from SHA1 based to SHA256
for better example
* development: (180 commits)
Change the library version to 2.11.0
Fix version in ChangeLog for fix for #552
Add ChangeLog entry for clang version fix. Issue #1072
Compilation warning fixes on 32b platfrom with IAR
Revert "Turn on MBEDTLS_SSL_ASYNC_PRIVATE by default"
Fix for missing len var when XTS config'd and CTR not
ssl_server2: handle mbedtls_x509_dn_gets failure
Fix harmless use of uninitialized memory in ssl_parse_encrypted_pms
SSL async tests: add a few test cases for error in decrypt
Fix memory leak in ssl_server2 with SNI + async callback
SNI + SSL async callback: make all keys async
ssl_async_resume: free the operation context on error
ssl_server2: get op_name from context in ssl_async_resume as well
Clarify "as directed here" in SSL async callback documentation
SSL async callbacks documentation: clarify resource cleanup
Async callback: use mbedtls_pk_check_pair to compare keys
Rename mbedtls_ssl_async_{get,set}_data for clarity
Fix copypasta in the async callback documentation
SSL async callback: cert is not always from mbedtls_ssl_conf_own_cert
ssl_async_set_key: detect if ctx->slots overflows
...
The code paths in the library are different for decryption and for
signature. Improve the test coverage by doing some error path tests
for decryption in addition to signature.
Summary of merge conflicts:
include/mbedtls/ecdh.h -> documentation style
include/mbedtls/ecdsa.h -> documentation style
include/mbedtls/ecp.h -> alt style, new error codes, documentation style
include/mbedtls/error.h -> new error codes
library/error.c -> new error codes (generated anyway)
library/ecp.c:
- code of an extracted function was changed
library/ssl_cli.c:
- code addition on one side near code change on the other side
(ciphersuite validation)
library/x509_crt.c -> various things
- top fo file: helper structure added near old zeroize removed
- documentation of find_parent_in()'s signature: improved on one side,
added arguments on the other side
- documentation of find_parent()'s signature: same as above
- verify_chain(): variables initialised later to give compiler an
opportunity to warn us if not initialised on a code path
- find_parent(): funcion structure completely changed, for some reason git
tried to insert a paragraph of the old structure...
- merge_flags_with_cb(): data structure changed, one line was fixed with a
cast to keep MSVC happy, this cast is already in the new version
- in verify_restratable(): adjacent independent changes (function
signature on one line, variable type on the next)
programs/ssl/ssl_client2.c:
- testing for IN_PROGRESS return code near idle() (event-driven):
don't wait for data in the the socket if ECP_IN_PROGRESS
tests/data_files/Makefile: adjacent independent additions
tests/suites/test_suite_ecdsa.data: adjacent independent additions
tests/suites/test_suite_x509parse.data: adjacent independent additions
* development: (1059 commits)
Change symlink to hardlink to avoid permission issues
Fix out-of-tree testing symlinks on Windows
Updated version number to 2.10.0 for release
Add a disabled CMAC define in the no-entropy configuration
Adapt the ARIA test cases for new ECB function
Fix file permissions for ssl.h
Add ChangeLog entry for PR#1651
Fix MicroBlaze register typo.
Fix typo in doc and copy missing warning
Fix edit mistake in cipher_wrap.c
Update CTR doc for the 64-bit block cipher
Update CTR doc for other 128-bit block ciphers
Slightly tune ARIA CTR documentation
Remove double declaration of mbedtls_ssl_list_ciphersuites
Update CTR documentation
Use zeroize function from new platform_util
Move to new header style for ALT implementations
Add ifdef for selftest in header file
Fix typo in comments
Use more appropriate type for local variable
...
The certificate passed to async callbacks may not be the one set by
mbedtls_ssl_conf_own_cert. For example, when using an SNI callback,
it's whatever the callback is using. Document this, and add a test
case (and code sample) with SNI.
Add a test case for SSL asynchronous signature where f_async_resume is
called twice. Verify that f_async_sign_start is only called once.
This serves as a non-regression test for a bug where f_async_sign_start
was only called once, which turned out to be due to a stale build
artifacts with mismatched numerical values of
MBEDTLS_ERR_SSL_ASYNC_IN_PROGRESS.
Testing the case where the resume callback returns an error at the
beginning and the case where it returns an error at the end is
redundant. Keep the test after the output has been produced, to
validate that the product does not use even a valid output if the
return value is an error code.
Document how the SSL async sign callback must treat its md_alg and
hash parameters when doing an RSA signature: sign-the-hash if md_alg
is nonzero (TLS 1.2), and sign-the-digestinfo if md_alg is zero
(TLS <= 1.1).
In ssl_server2, don't use md_alg=MBEDTLS_MD_NONE to indicate that
ssl_async_resume must perform an encryption, because md_alg is also
MBEDTLS_MD_NONE in TLS <= 1.1. Add a test case to exercise this
case (signature with MBEDTLS_MD_NONE).
Conflict resolution:
* ChangeLog: put the new entry from my branch in the proper place.
* include/mbedtls/error.h: counted high-level module error codes again.
* include/mbedtls/ssl.h: picked different numeric codes for the
concurrently added errors; made the new error a full sentence per
current standards.
* library/error.c: ran scripts/generate_errors.pl.
* library/ssl_srv.c:
* ssl_prepare_server_key_exchange "DHE key exchanges": the conflict
was due to style corrections in development
(4cb1f4d49c) which I merged with
my refactoring.
* ssl_prepare_server_key_exchange "For key exchanges involving the
server signing", first case, variable declarations: merged line
by line:
* dig_signed_len: added in async
* signature_len: removed in async
* hashlen: type changed to size_t in development
* hash: size changed to MBEDTLS_MD_MAX_SIZE in async
* ret: added in async
* ssl_prepare_server_key_exchange "For key exchanges involving the
server signing", first cae comment: the conflict was due to style
corrections in development (4cb1f4d49c)
which I merged with my comment changes made as part of refactoring
the function.
* ssl_prepare_server_key_exchange "Compute the hash to be signed" if
`md_alg != MBEDTLS_MD_NONE`: conflict between
ebd652fe2d
"ssl_write_server_key_exchange: calculate hashlen explicitly" and
46f5a3e9b4 "Check return codes from
MD in ssl code". I took the code from commit
ca1d742904 made on top of development
which makes mbedtls_ssl_get_key_exchange_md_ssl_tls return the
hash length.
* programs/ssl/ssl_server2.c: multiple conflicts between the introduction
of MBEDTLS_ERR_SSL_ASYNC_IN_PROGRESS and new auxiliary functions and
definitions for async support, and the introduction of idle().
* definitions before main: concurrent additions, kept both.
* main, just after `handshake:`: in the loop around
mbedtls_ssl_handshake(), merge the addition of support for
MBEDTLS_ERR_SSL_ASYNC_IN_PROGRESS and SSL_ASYNC_INJECT_ERROR_CANCEL
with the addition of the idle() call.
* main, if `opt.transport == MBEDTLS_SSL_TRANSPORT_STREAM`: take the
code from development and add a check for
MBEDTLS_ERR_SSL_ASYNC_IN_PROGRESS.
* main, loop around mbedtls_ssl_read() in the datagram case:
take the code from development and add a check for
MBEDTLS_ERR_SSL_ASYNC_IN_PROGRESS; revert to a do...while loop.
* main, loop around mbedtls_ssl_write() in the datagram case:
take the code from development and add a check for
MBEDTLS_ERR_SSL_ASYNC_IN_PROGRESS; revert to a do...while loop.
Add test cases for SSL asynchronous signature to ssl-opt.sh:
* Delay=0,1 to test the sequences of calls to f_async_resume
* Test fallback when the async callbacks don't support that key
* Test error injection at each stage
* Test renegotiation
Previously, the idling loop in ssl_server2 didn't check whether
the underlying call to mbedtls_net_poll signalled that the socket
became invalid. This had the consequence that during idling, the
server couldn't be terminated through a SIGTERM, as the corresponding
handler would only close the sockets and expect the remainder of
the program to shutdown gracefully as a consequence of this.
This was subsequently attempted to be fixed through a change
in ssl-opt.sh by terminating the server through a KILL signal,
which however lead to other problems when the latter was run
under valgrind.
This commit changes the idling loop in ssl_server2 and ssl_client2
to obey the return code of mbedtls_net_poll and gracefully shutdown
if an error occurs, e.g. because the socket was closed.
As a consequence, the server termination via a KILL signal in
ssl-opt.sh is no longer necessary, with the previous `kill; wait`
pattern being sufficient. The commit reverts the corresponding
change.
The UDP tests involving the merging of multiple records into single
datagrams accumulate records for 10ms, which can be less than the
total flight preparation time if e.g. the tests are being run with
valgrind.
This commit increases the packing time for the relevant tests
from 10ms to 50ms.
If lsof is not available, wait_server_start uses a fixed timeout,
which can trigger a race condition if the timeout turns out to be too
short. Emit a warning so that we know this is going on from the test
logs.
- Some of the CI machines don't have lsof installed yet, so rely on an sleeping
an arbitrary number of seconds while the server starts. We're seeing
occasional failures with the current delay because the CI machines are highly
loaded, which seems to indicate the current delay is not quite enough, but
hopefully not to far either, so double it.
- While at it, also double the watchdog delay: while I don't remember seeing
much failures due to client timeout, this change doesn't impact normal
running time of the script, so better err on the safe side.
These changes don't affect the test and should only affect the false positive
rate coming from the test framework in those scripts.
In wait_server_start, fork less. When lsof is present, call it on the
expected process. This saves a few percent of execution time on a
lightly loaded machine. Also, sleep for a short duration rather than
using a tight loop.
Add a DTLS small packet test for each of the following combinations:
- DTLS version: 1.0 or 1.2
- Encrypt then MAC extension enabled
- Truncated HMAC extension enabled
Large packets tests for DTLS are currently not possible due to parameter
constraints in ssl_server2.
This commit ensures that there is a small packet test for at least any
combination of
- SSL/TLS version: SSLv3, TLS 1.0, TLS 1.1 or TLS 1.2
- Stream cipher (RC4) or Block cipher (AES)
- Usage of Encrypt then MAC extension [TLS only]
- Usage of truncated HMAC extension [TLS only]
Noticed that the test cases in ssl-opt.sh exercising the truncated HMAC
extension do not depend on MBEDTLS_SSL_TRUNCATED_HMAC being enabled in
config.h. This commit fixes this.
Previously, MAC validation for an incoming record proceeded as follows:
1) Make a copy of the MAC contained in the record;
2) Compute the expected MAC in place, overwriting the presented one;
3) Compare both.
This resulted in a record buffer overflow if truncated MAC was used, as in this
case the record buffer only reserved 10 bytes for the MAC, but the MAC
computation routine in 2) always wrote a full digest.
For specially crafted records, this could be used to perform a controlled write of
up to 6 bytes past the boundary of the heap buffer holding the record, thereby
corrupting the heap structures and potentially leading to a crash or remote code
execution.
This commit fixes this by making the following change:
1) Compute the expected MAC in a temporary buffer that has the size of the
underlying message digest.
2) Compare to this to the MAC contained in the record, potentially
restricting to the first 10 bytes if truncated HMAC is used.
A similar fix is applied to the encryption routine `ssl_encrypt_buf`.
This commit adds regression tests for the bug when we didn't parse the
Signature Algorithm extension when renegotiating. (By nature, this bug
affected only the server)
The tests check for the fallback hash (SHA1) in the server log to detect
that the Signature Algorithm extension hasn't been parsed at least in
one of the handshakes.
A more direct way of testing is not possible with the current test
framework, since the Signature Algorithm extension is parsed in the
first handshake and any corresponding debug message is present in the
logs.
This commit adds regression tests for the bug when we didn't parse the
Signature Algorithm extension when renegotiating. (By nature, this bug
affected only the server)
The tests check for the fallback hash (SHA1) in the server log to detect
that the Signature Algorithm extension hasn't been parsed at least in
one of the handshakes.
A more direct way of testing is not possible with the current test
framework, since the Signature Algorithm extension is parsed in the
first handshake and any corresponding debug message is present in the
logs.
DTLS records from previous epochs were incorrectly checked against the
current epoch transform's minimal content length, leading to the
rejection of entire datagrams. This commit fixed that and adapts two
test cases accordingly.
Internal reference: IOTSSL-1417
It seems that tests from ssl-opt.sh are sometimes failing because
the server is killed before its output has been finalized. This commit
adds a small delay in ssl-opt.sh before killing the server to prevent
that.
ssl-opt.sh checks whether the client, server and proxy commands are
names of executable files, forbidding the use of default arguments by
by e.g. setting P_SRV="ssl_server2 debug_level=3". This commit relaxes
this check, only considering the part of the command string prior to
the first whitespace.
Add a test to ssl-opt.sh that parses the client and server debug
output and then checks that the Unix timestamp in the ServerHello
message is within acceptable bounds.
Extend the run_test function in ssl-opt.sh so that it accepts the -f
and -F options. These parameters take an argument which is the name of
a shell function that will be called by run_test and will be given the
client input and output debug log. The idea is that these functions are
defined by each test and they can be used to do some custom check
beyon those allowed by the pattern matching capabilities of the
run_test function.
Add a test to ssl-opt.sh that parses the client and server debug
output and then checks that the Unix timestamp in the ServerHello
message is within acceptable bounds.
Extend the run_test function in ssl-opt.sh so that it accepts the -f
and -F options. These parameters take an argument which is the name of
a shell function that will be called by run_test and will be given the
client input and output debug log. The idea is that these functions are
defined by each test and they can be used to do some custom check
beyon those allowed by the pattern matching capabilities of the
run_test function.
Some tests in ssl-opt.sh require MBEDTLS_SSL_MAX_CONTENT_LEN to be set to its
default value of 16384 to succeed. While ideally such a dependency should not
exist, as a short-term remedy this commit adds a small check that will at least
lead to graceful exit if that assumption is violated.
This commit adds four tests to ssl-opt.sh testing the library's behavior when
`mbedtls_ssl_write` is called with messages beyond 16384 bytes. The combinations
tested are TLS vs. DTLS and MBEDTLS_SSL_MAX_FRAGMENT_LENGTH enabled vs. disabled.