| Age | Commit message (Collapse) | Author | Files | Lines |
|
As per RFC 9000, Section 8.2.1:
When an endpoint is unable to expand the datagram size to 1200 bytes due
to the anti-amplification limit, the path MTU will not be validated.
To ensure that the path MTU is large enough, the endpoint MUST perform a
second path validation by sending a PATH_CHALLENGE frame in a datagram of
at least 1200 bytes.
|
|
The field "first" is removed. It's unused since 909b989ec088.
The field "last" is renamed to "send_time". It holds frame send time.
|
|
Previously ngx_quic_frame_sendto() ignored congestion control and did not
contribute to in_flight counter.
Now congestion control window is checked unless ignore_congestion flag is set.
Also, in_flight counter is incremented and the frame is stored in ctx->sent
queue if it's ack-eliciting. This behavior is now similar to
ngx_quic_output_packet().
|
|
According to RFC 9000, an endpoint SHOULD NOT send multiple PATH_CHALLENGE
frames in a single packet. The change adds a check to enforce this claim to
optimize server behavior. Previously each PATH_CHALLENGE always resulted in a
single response datagram being sent to client. The effect of this was however
limited by QUIC flood protection.
Also, PATH_CHALLENGE is explicitly disabled in Initial and Handshake levels,
see RFC 9000, Table 3. However, technically it may be sent by client in 0-RTT
over a new path without actual migration, even though the migration itself is
prohibited during handshake. This allows client to coalesce multiple 0-RTT
packets each carrying a PATH_CHALLENGE and end up with multiple PATH_CHALLENGEs
per datagram. This again leads to suboptimal behavior, see above. Since the
purpose of sending PATH_CHALLENGE frames in 0-RTT is unclear, these frames are
now only allowed in 1-RTT. For 0-RTT they are silently ignored.
|
|
Previously, when using ngx_quic_frame_sendto() to explicitly send a packet with
a single frame, anti-amplification limit was not properly enforced. Even when
there was no quota left for the packet, it was sent anyway, but with no padding.
Now the packet is not sent at all.
This function is called to send PATH_CHALLENGE/PATH_RESPONSE, PMTUD and probe
packets. For all these cases packet send is retried later in case the send was
not successful.
|
|
By default packets with these frames are expanded to 1200 bytes. Previously,
if anti-amplification limit did not allow this expansion, it was limited to
whatever size was allowed. However RFC 9000 clearly states no partial
expansion should happen in both cases.
Section 8.2.1. Initiating Path Validation:
An endpoint MUST expand datagrams that contain a PATH_CHALLENGE frame
to at least the smallest allowed maximum datagram size of 1200 bytes,
unless the anti-amplification limit for the path does not permit
sending a datagram of this size.
Section 8.2.2. Path Validation Responses:
An endpoint MUST expand datagrams that contain a PATH_RESPONSE frame
to at least the smallest allowed maximum datagram size of 1200 bytes.
...
However, an endpoint MUST NOT expand the datagram containing the
PATH_RESPONSE if the resulting data exceeds the anti-amplification limit.
|
|
Currently, packets generated by ngx_quic_frame_sendto() and
ngx_quic_send_early_cc() are not logged, thus making it hard
to read logs due to gaps appearing in packet numbers sequence.
At frames level, it is handy to see immediately packet number
in which they arrived or being sent.
|
|
|
|
It is made local as it is only needed now when creating crypto context.
BoringSSL lacks EVP interface for ChaCha20, providing instead
a function for one-shot encryption, thus hp is still preserved.
Based on a patch by Roman Arutyunyan.
|
|
After conversion to reusable crypto ctx, now there's enough caller
context to remove the "level" argument from ngx_quic_ciphers().
|
|
|
|
|
|
|
|
|
|
Now these functions have names ngx_quic_crypto_XXX():
- ngx_quic_tls_open() -> ngx_quic_crypto_open()
- ngx_quic_tls_seal() -> ngx_quic_crypto_seal()
- ngx_quic_tls_hp() -> ngx_quic_crypto_hp()
|
|
Previously it was possible to generate ACK frames using formally discarded
protection keys, in particular, when acknowledging a client Handshake packet
used to complete the TLS handshake and to discard handshake protection keys.
As it happens late in packet processing, it could be possible to generate ACK
frames after the keys were already discarded.
ACK frames are generated from ngx_quic_ack_packet(), either using a posted
push event, which envolves ngx_quic_generate_ack() as a part of the final
packet assembling, or directly in ngx_quic_ack_packet(), such as when there
is no room to add a new ACK range or when the received packet is out of order.
The added keys availability check is used to avoid generating late ACK frames
in both cases.
|
|
In addition to triggering alert, it ensures that such packets won't be sent.
With the previous change that marks server keys as discarded by zeroing the
key lengh, it is now an error to send packets with discarded keys. OpenSSL
based stacks tolerate such behaviour because key length isn't used in packet
protection, but BoringSSL will raise the UNSUPPORTED_KEY_SIZE cipher error.
It won't be possible to use discarded keys with reused crypto contexts as it
happens in subsequent changes.
|
|
Keys may be released by TLS stack in different times, so it makes sense
to check this independently as well. This allows to fine-tune what key
direction is used when checking keys availability.
When discarding, server keys are now marked in addition to client keys.
|
|
The error may be triggered in add_handhshake_data() by incorrect transport
parameter sent by client. The expected behaviour in this case is to close
connection complaining about incorrect parameter. Currently the connection
just times out.
|
|
Previously, the timer was never reset due to an explicit check. The check was
added in 36b59521a41c as part of connection close simplification. The reason
was to retain the earliest timeout. However, the timeouts are all the same
while QUIC handshake is in progress and resetting the timer for the same value
has no performance implications. After handshake completion there's only
application level. The change removes the check.
|
|
Instead, when worker is shutting down and handshake is not yet completed,
connection is terminated immediately.
Previously the callback could be called while QUIC handshake was in progress
and, what's more important, before the init() callback. Now it's postponed
after init().
This change is a preparation to postponing HTTP/3 session creation to init().
|
|
Previously QUIC did not have such parameter and handshake duration was
controlled by HTTP/3. However that required creating and storing HTTP/3
session on first client datagram. Apparently there's no convenient way to
store the session object until QUIC handshake is complete. In the followup
patches session creation will be postponed to init() callback.
|
|
As explained in BoringSSL change[1], levels were introduced in the original
QUIC API to draw a line between when keys are released and when are active.
In the new QUIC API they are released in separate calls when it's needed.
BoringSSL has then a consideration to remove levels API, hence the change.
If not available e.g. from a QUIC packet header, levels can be taken based on
keys availability. The only real use of levels is to prevent using app keys
before they are active in QuicTLS that provides the old BoringSSL QUIC API,
it is replaced with an equivalent check of c->ssl->handshaked.
This change also removes OpenSSL compat shims since they are no longer used.
The only exception left is caching write level from the keylog callback in
the internal field which is a handy equivalent of checking keys availability.
[1] https://boringssl.googlesource.com/boringssl/+/1e859054
|
|
As per RFC 9000, section 10.2.3, to ensure that peer successfully removed
packet protection, CONNECTION_CLOSE can be sent in multiple packets using
different packet protection levels.
Now it is sent in all protection levels available.
This roughly corresponds to the following paragraph:
* Prior to confirming the handshake, a peer might be unable to process 1-RTT
packets, so an endpoint SHOULD send a CONNECTION_CLOSE frame in both Handshake
and 1-RTT packets. A server SHOULD also send a CONNECTION_CLOSE frame in an
Initial packet.
In practice, this change allows to avoid sending an Initial packet when we know
the client has handshake keys, by checking if we have discarded initial keys.
Also, this fixes sending CONNECTION_CLOSE when using QuicTLS with old QUIC API,
where TLS stack releases application read keys before handshake confirmation;
it is fixed by sending CONNECTION_CLOSE additionally in a Handshake packet.
|
|
Previously, a socket error on a path being validated resulted in validation
error and subsequent QUIC connection closure. Now the error is ignored and
path validation proceeds as usual, with several retries and a timeout.
When validating the old path after an apparent migration, that path may already
be unavailable and sendmsg() may return an error, which should not result in
QUIC connection close.
When validating the new path, it's possible that the new client address is
spoofed (See RFC 9000, 9.3.2. On-Path Address Spoofing). This address may
as well be unavailable and should not trigger QUIC connection closure.
|
|
Previously, original dcid was used to receive initial client packets in case
server initial response was lost. However, last dcid should be used instead.
These two are the same unless retry is used. In case of retry, client resends
initial packet with a new dcid, that is different from the original dcid. If
server response is lost, the client resends this packet again with the same
dcid. This is shown in RFC 9000, 7.3. Authenticating Connection IDs, Figure 8.
The issue manifested itself with creating multiple server sessions in response
to each post-retry client initial packet, if server response is lost.
|
|
Since at least f9fbeb4ee0de and certainly after 924882f42dea, which
TLS Key Update support predates, queued data output is deferred to a
posted push handler. To address timing signals after these changes,
generating next keys is now posted to run after the push handler.
|
|
MTU selection starts by doubling the initial MTU until the first failure.
Then binary search is used to find the path MTU.
|
|
Previously, NGX_AGAIN returned by ngx_quic_send() was treated by
ngx_quic_frame_sendto() as error, which triggered errors in its callers.
However, a blocked socket is not an error. Now NGX_AGAIN is passed as is to
the ngx_quic_frame_sendto() callers, which can safely ignore it.
|
|
The frames for which the padding is removed are PATH_CHALLENGE and
PATH_RESPONSE, which are sent separately by ngx_quic_frame_sendto().
|
|
Its value is the opposite of path->validated.
|
|
When probe timeout expired while congestion window was exhausted, probe PINGs
could not be sent. As a result, lost packets could not be declared lost and
congestion window could not be freed for new packets. This deadlock
continued until connection idle timeout expiration.
Now PINGs are sent separately from the frame queue without congestion control,
as specified by RFC 9002, Section 7:
An endpoint MUST NOT send a packet if it would cause bytes_in_flight
(see Appendix B.2) to be larger than the congestion window, unless the
packet is sent on a PTO timer expiration (see Section 6.2) or when entering
recovery (see Section 7.3.2).
|
|
Previously, PTO handler analyzed the first packet in the sent queue for the
timeout expiration. However, the last sent packet should be analyzed instead.
An example is timeout calculation in ngx_quic_set_lost_timer().
|
|
Previously the field pnum of a potentially freed frame was accessed. Now the
value is copied to a local variable. The old behavior did not cause any
problems since the frame memory is not freed, but is moved to a free queue
instead.
|
|
In non-GSO mode, a datagram is sent if congestion window is not exceeded by the
time of send. The window could be exceeded by a small amount after the send.
In GSO mode, congestion window was checked in a similar way, but for all
concatenated datagrams as a whole. This could result in exceeding congestion
window by a lot. Now congestion window is checked for every datagram in GSO
mode as well.
|
|
Previously it was added to the tail as all other frames. However, if the
amount of queued data is large, it could delay the delivery of ACK, which
could trigger frames retransmissions and slow down the connection.
|
|
Previously ACK was not generated if max_ack_delay was not yet expired and the
number of unacknowledged ack-eliciting packets was less than two, as allowed by
RFC 9000 13.2.1-13.2.2. However this only makes sense to avoid sending ACK-only
packets, as explained by the RFC:
On the other hand, reducing the frequency of packets that carry only
acknowledgments reduces packet transmission and processing cost at both
endpoints.
Now ACK is delayed only if output frame queue is empty. Otherwise ACK is sent
immediately, which significantly improves QUIC performance with certain tests.
|
|
Previously used AES256-CBC is now substituted with AES256-GCM. Although there
seem to be no tangible consequences of token integrity loss.
|
|
They were preserved in 172705615d04 to ease transition from older BoringSSL.
|
|
|
|
|
|
|
|
The constants are used for both GCM and CHACHAPOLY.
|
|
Previously used constant EVP_GCM_TLS_TAG_LEN had misleading name since it was
used not only with GCM, but also with CHACHAPOLY. Now a new constant
NGX_QUIC_TAG_LEN introduced. Luckily all AEAD algorithms used by QUIC have
the same tag length of 16.
|
|
Previously, computing rttvar used an updated smoothed_rtt value as per
RFC 9002, section 5.3, which appears to be specified in a wrong order.
A technical errata ID 7539 is reported.
|
|
Previously, rec.level field was not uninitialized in SSL_provide_quic_data().
As a result, its value was always ssl_encryption_initial. Later in
ngx_quic_ciphers() such level resulted in resetting the cipher to
TLS1_3_CK_AES_128_GCM_SHA256 and using AES128 to encrypt the packet.
Now the level is initialized and the right cipher is used.
|
|
The layer is enabled as a fallback if the QUIC support is configured and the
BoringSSL API wasn't detected, or when using the --with-openssl option, also
compatible with QuicTLS and LibreSSL. For the latter, the layer is assumed
to be present if QUIC was requested, so it needs to be undefined to prevent
QUIC API redefinition as appropriate.
A previously used approach to test the TLSEXT_TYPE_quic_transport_parameters
macro doesn't work with OpenSSL 3.2 master branch where this macro appeared
with incompatible QUIC API. To fix the build there, the test is revised to
pass only for QuicTLS and LibreSSL.
|
|
Previously, ngx_quic_close_connection() could be called in a way that QUIC
connection was accessed after the call. In most cases the connection is not
closed right away, but close timeout is scheduled. However, it's not always
the case. Also, if the close process started earlier for a different reason,
calling ngx_quic_close_connection() may actually close the connection. The
connection object should not be accessed after that.
Now, when possible, return statement is added to eliminate post-close connection
object access. In other places ngx_quic_close_connection() is substituted with
posting close event.
Also, the new way of closing connection in ngx_quic_stream_cleanup_handler()
fixes another problem in this function. Previously it passed stream connection
instead of QUIC connection to ngx_quic_close_connection(). This could result
in incomplete connection shutdown. One consequence of that could be that QUIC
streams were freed without shutting down their application contexts. This could
result in another use-after-free.
Found by Coverity (CID 1530402).
|
|
The qsock->sockaddr field is a ngx_sockaddr_t union, and therefore can hold
any sockaddr (and union members, such qsock->sockaddr.sockaddr, can be used
to access appropriate variant of the sockaddr). It is better to set it via
qsock->sockaddr itself though, and not qsock->sockaddr.sockaddr, so static
analyzers won't complain about out-of-bounds access.
Prodded by Coverity (CID 1530403).
|
|
Previously, ngx_udp_rbtree_insert_value() was used for plain UDP and
ngx_quic_rbtree_insert_value() was used for QUIC. Because of this it was
impossible to initialize connection tree in ngx_create_listening() since
this function is not aware what kind of listening it creates.
Now ngx_udp_rbtree_insert_value() is used for both QUIC and UDP. To make
is possible, a generic key field is added to ngx_udp_connection_t. It keeps
client address for UDP and connection ID for QUIC.
|