From Hanno:
When a server replies to a cookieless ClientHello with a HelloVerifyRequest,
it is supposed to reset the connection and wait for a subsequent ClientHello
which includes the cookie from the HelloVerifyRequest.
In testing environments, it might happen that the reset of the server
takes longer than for the client to replying to the HelloVerifyRequest
with the ClientHello+Cookie. In this case, the ClientHello gets lost
and the client will need retransmit. This may happen even if the underlying
datagram transport is reliable.
This commit continues commit 47db877 by removing resend guards in the
ssl-opt.sh tests 'DTLS fragmenting: proxy MTU, XXX' which sometimes made
the tests fail in case the log showed a resend from the client.
See 47db877 for more information.
When a server replies to a cookieless ClientHello with a HelloVerifyRequest,
it is supposed to reset the connection and wait for a subsequent ClientHello
which includes the cookie from the HelloVerifyRequest.
In testing environments, it might happen that the reset of the server
takes longer than for the client to replying to the HelloVerifyRequest
with the ClientHello+Cookie. In this case, the ClientHello gets lost
and the client will need retransmit. This may happen even if the underlying
datagram transport is reliable.
This commit removes a guard in the ssl-opt.sh test
'DTLS fragmenting: proxy MTU, resumed handshake' which made
the test fail in case the log showed a resend from the client.
We previously observed random-looking failures from this test. I think they
were caused by a race condition where the client tries to reconnect while the
server is still closing the connection and has not yet returned to an
accepting state. In that case, the server would fail to see and reply to the
ClientHello, and the client would have to resend it.
I believe logs of failing runs are compatible with this interpretation:
- the proxy logs show the new ClientHello and the server's closing Alert are
sent the same millisecond.
- the client logs show the server's closing Alert is received after the new
handshake has been started (discarding message from wrong epoch).
The attempted fix is for the client to wait a bit before reconnecting, which
should vastly enhance the probability of the server reaching its accepting
state before the client tries to reconnect. The value of 1 second is arbitrary
but should be more than enough even on loaded machines.
The test was run locally 100 times in a row on a slightly loaded machine (an
instance of all.sh running in parallel) without any failure after this fix.
Use the same values as other 3d tests: this makes the test hopefully a bit
faster than the default values, while not increasing the failure rate.
While at it:
- adjust "needs_more_time" setting for 3d interop tests (we can't set the
timeout values for other implementations, so the test might be slow)
- fix some supposedly DTLS 1.0 test that were using dtls1_2 on the command
line
Now that the UDP proxy has the ability to delay specific
handshake message on the client and server side, use
this to rewrite the reordering tests and thereby make
them independent on the choice of PRNG used by the proxy
(which is not stable across platforms).
This commit adds four tests to ssl-opt.sh running default
DTLS client and server with and without datagram packing
enabled, and checking that datagram packing is / is not
used by inspecting the debug output.
The UDP proxy does currently not dissect datagrams into records,
an hence the coverage of the reordering, package loss and duplication
tests is much smaller if datagram packing is in use.
This commit disables datagram packing for most UDP proxy tests,
in particular all 3D (drop, duplicate, delay) tests.
Now that datagram packing can be dynamically configured,
the test exercising the behavior of Mbed TLS when facing
an out-of-order CCS message can be re-introduced, disabling
datagram packing for the sender of the delayed CCS.
The tests "DTLS fragmenting: none (for reference)" and
"DTLS fragmenting: none (for reference) (MTU)" used a
maximum fragment length resp. MTU value of 2048 which
was meant to be large enough so that fragmentation
of the certificate message would not be necessary.
However, it is not large enough to hold the entire flight
to which the certificate belongs, and hence there will
be fragmentation as soon as datagram packing is used.
This commit increases the maximum fragment length resp.
MTU values to 4096 bytes to ensure that even with datagram
packing in place, no fragmentation is necessary.
A similar change was made in "DTLS fragmenting: client (MTU)".
The test exercising a delayed CCS message is not
expected to work when datagram packing is used,
as the current UDP proxy is not able to recognize
records which are not at the beginning of a
datagram.
Adds a requirement for GNUTLS_NEXT (3.5.3 or above, in practice we should
install 3.6.3) on the CI.
See internal ref IOTSSL-2401 for analysis of the bugs and their impact on the
tests.
For now, just check that it causes us to fragment. More tests are coming in
follow-up commits to ensure we respect the exact value set, including when
renegotiating.
Note: no interop tests in ssl-opt.sh for now, as some of them make us run into
bugs in (the CI's default versions of) OpenSSL and GnuTLS, so interop tests
will be added later once the situation is clarified. <- TODO