| Age | Commit message (Collapse) | Author | Files | Lines |
|
Encryption level values are decoupled from ssl_encryption_level_t,
which is now limited to BoringSSL QUIC callbacks, with mappings
provided. Although the values match, this provides a technically
safe approach, in particular, to access protection level sized arrays.
In preparation for using OpenSSL 3.5 TLS callbacks.
|
|
Previously, it was not possible to send acknowledgments if the
congestion window was limited or temporarily exceeded, such as
after sending a large response or MTU probe. If ACKs were not
received from the peer for some reason to update the in-flight
bytes counter below the congestion window, this might result in
a stalled connection.
The fix is to send ACKs regardless of congestion control. This
meets RFC 9002, Section 7:
: Similar to TCP, packets containing only ACK frames do not count
: toward bytes in flight and are not congestion controlled.
This is a simplified implementation to send ACK frames from the
head of the queue. This was made possible after 6f5f17358.
Reported in trac ticket #2621 and subsequently by Vladimir Homutov:
https://mailman.nginx.org/pipermail/nginx-devel/2025-April/ZKBAWRJVQXSZ2ISG3YJAF3EWMDRDHCMO.html
|
|
As per RFC 9002, Section 7.8, congestion window should not be increased
when it's underutilized.
|
|
Previously, these functions operated on a per-level basis. This however
resulted in excessive logging of in_flight and will also led to extra
work detecting underutilized congestion window in the followup patches.
|
|
This is consistent with the rest of the code and fixes build on systems
with non-standard definition of struct iovec (Solaris, Illumos).
|
|
The field "first" is removed. It's unused since 909b989ec088.
The field "last" is renamed to "send_time". It holds frame send time.
|
|
Previously ngx_quic_frame_sendto() ignored congestion control and did not
contribute to in_flight counter.
Now congestion control window is checked unless ignore_congestion flag is set.
Also, in_flight counter is incremented and the frame is stored in ctx->sent
queue if it's ack-eliciting. This behavior is now similar to
ngx_quic_output_packet().
|
|
Previously, when using ngx_quic_frame_sendto() to explicitly send a packet with
a single frame, anti-amplification limit was not properly enforced. Even when
there was no quota left for the packet, it was sent anyway, but with no padding.
Now the packet is not sent at all.
This function is called to send PATH_CHALLENGE/PATH_RESPONSE, PMTUD and probe
packets. For all these cases packet send is retried later in case the send was
not successful.
|
|
By default packets with these frames are expanded to 1200 bytes. Previously,
if anti-amplification limit did not allow this expansion, it was limited to
whatever size was allowed. However RFC 9000 clearly states no partial
expansion should happen in both cases.
Section 8.2.1. Initiating Path Validation:
An endpoint MUST expand datagrams that contain a PATH_CHALLENGE frame
to at least the smallest allowed maximum datagram size of 1200 bytes,
unless the anti-amplification limit for the path does not permit
sending a datagram of this size.
Section 8.2.2. Path Validation Responses:
An endpoint MUST expand datagrams that contain a PATH_RESPONSE frame
to at least the smallest allowed maximum datagram size of 1200 bytes.
...
However, an endpoint MUST NOT expand the datagram containing the
PATH_RESPONSE if the resulting data exceeds the anti-amplification limit.
|
|
Currently, packets generated by ngx_quic_frame_sendto() and
ngx_quic_send_early_cc() are not logged, thus making it hard
to read logs due to gaps appearing in packet numbers sequence.
At frames level, it is handy to see immediately packet number
in which they arrived or being sent.
|
|
|
|
In addition to triggering alert, it ensures that such packets won't be sent.
With the previous change that marks server keys as discarded by zeroing the
key lengh, it is now an error to send packets with discarded keys. OpenSSL
based stacks tolerate such behaviour because key length isn't used in packet
protection, but BoringSSL will raise the UNSUPPORTED_KEY_SIZE cipher error.
It won't be possible to use discarded keys with reused crypto contexts as it
happens in subsequent changes.
|
|
MTU selection starts by doubling the initial MTU until the first failure.
Then binary search is used to find the path MTU.
|
|
Previously, NGX_AGAIN returned by ngx_quic_send() was treated by
ngx_quic_frame_sendto() as error, which triggered errors in its callers.
However, a blocked socket is not an error. Now NGX_AGAIN is passed as is to
the ngx_quic_frame_sendto() callers, which can safely ignore it.
|
|
The frames for which the padding is removed are PATH_CHALLENGE and
PATH_RESPONSE, which are sent separately by ngx_quic_frame_sendto().
|
|
Its value is the opposite of path->validated.
|
|
When probe timeout expired while congestion window was exhausted, probe PINGs
could not be sent. As a result, lost packets could not be declared lost and
congestion window could not be freed for new packets. This deadlock
continued until connection idle timeout expiration.
Now PINGs are sent separately from the frame queue without congestion control,
as specified by RFC 9002, Section 7:
An endpoint MUST NOT send a packet if it would cause bytes_in_flight
(see Appendix B.2) to be larger than the congestion window, unless the
packet is sent on a PTO timer expiration (see Section 6.2) or when entering
recovery (see Section 7.3.2).
|
|
In non-GSO mode, a datagram is sent if congestion window is not exceeded by the
time of send. The window could be exceeded by a small amount after the send.
In GSO mode, congestion window was checked in a similar way, but for all
concatenated datagrams as a whole. This could result in exceeding congestion
window by a lot. Now congestion window is checked for every datagram in GSO
mode as well.
|
|
Previously it was added to the tail as all other frames. However, if the
amount of queued data is large, it could delay the delivery of ACK, which
could trigger frames retransmissions and slow down the connection.
|
|
Previously, ssl_encryption_application was hardcoded. Before 9553eea74f2a,
ngx_quic_frame_sendto() was used only for PATH_CHALLENGE/PATH_RESPONSE sent
at the application level only. Since 9553eea74f2a, ngx_quic_frame_sendto()
is also used for CONNECTION_CLOSE, which can be sent at initial level after
SSL handshake error or rejection. This resulted in packet encryption error.
Now level is copied from frame, which fixes the error.
|
|
Previously, before sending CONNECTION_CLOSE to client, all pending frames
were sent. This is redundant and could prevent CONNECTION_CLOSE from being
sent due to congestion control. Now pending frames are freed and
CONNECTION_CLOSE is sent without congestion control, as advised by RFC 9002:
Packets containing frames besides ACK or CONNECTION_CLOSE frames
count toward congestion control limits and are considered to be in flight.
|
|
|
|
Previously, since 3550b00d9dc8, the token was allocated on stack, to get
rid of pool usage. Now the token is allocated by ngx_quic_copy_buffer()
in QUIC buffers, also used for STREAM, CRYPTO and ACK frames.
|
|
|
|
The ngx_quic_keys_t structure is now exposed.
|
|
|
|
As shown in RFC 8446, section 2.2, Figure 3, and further specified in
section 4.6.1, BoringSSL releases session tickets in Application Data
(along with Finished) early, based on a precalculated client Finished
transcript, once client signalled early data in extensions.
|
|
The cd8018bc81a5 fixed unintended send of non-padded initial packets,
but failed to restore context properly: only processed contexts need
to be restored. As a consequence, a packet number could be restored
from uninitialized value.
|
|
Notably, this became quite practicable after the recent fix in cd8018bc81a5.
Additionally, do not arm loss detection timer on connection termination.
|
|
Previously, non-padded initial packet could be sent as a result of the
following situation:
- initial queue is not empty (so padding to 1200 is required)
- handshake queue is not empty (so padding is to be added after h/s packet)
- path is limited
If serializing handshake packet would violate path limit, such packet was
omitted, and the non-padded initial packet was sent.
The fix is to avoid sending the packet at all in such case. This follows the
original intention introduced in c5155a0cb12f.
|
|
The NGX_DECLINED is replaced with NGX_DONE to match closer to return code
of ngx_quic_handle_packet() and ngx_quic_close_connection() rc argument.
The ngx_quic_close_connection() rc code is used only when quic connection
exists, thus anything goes if qc == NULL.
The ngx_quic_handle_datagram() does not return NG_OK in cases when quic
connection is not yet created.
|
|
This was broken in 1e2f4e9c8195.
While there, adjusted formatting of debug message with socket seqnum.
|
|
|
|
|
|
Previously, "early error" message was logged in this case.
|
|
The quic connection now holds active, backup and probe paths instead
of sockets. The number of migration paths is now limited and cannot
be inflated by a bad client or an attacker.
The client id is now associated with path rather than socket. This allows
to simplify processing of output and connection ids handling.
New migration abandons any previously started migrations. This allows to
free consumed client ids and request new for use in future migrations and
make progress in case when connection id limit is hit during migration.
A path now can be revalidated without losing its state.
The patch also fixes various issues with NAT rebinding case handling:
- paths are now validated (previously, there was no validation
and paths were left in limited state)
- attempt to reuse id on different path is now again verified
(this was broken in 40445fc7c403)
- former path is now validated in case of apparent migration
|
|
ngx_quic_alloc_buf() -> ngx_quic_alloc_chain(),
ngx_quic_free_bufs() -> ngx_quic_free_chain(),
ngx_quic_trim_bufs() -> ngx_quic_trim_chain()
|
|
The output is always sent to the active path, which is stored in the
quic connection. There is no need to pass it in arguments.
When output has to be send to to a specific path (in rare cases, such as
path probing), a separate method exists (ngx_quic_frame_sendto()).
|
|
|
|
The path validation status and anti-amplification limit status is actually
two different variables. It is possible that validating path should not
be limited (for example, when re-validating former path).
|
|
The function now takes path as an argument to deal with associated
restrictions and update sent counter.
|
|
|
|
Removed sending CLOSE_CONNECTION directly to avoid duplicate frames,
since it is sent later again in SSL_do_handshake() error handling.
As such, removed redundant settings of error fields set elsewhere.
While here, improved debug message.
|
|
Thanks to Andrey Kolyshkin <a.kolyshkin@corp.vk.com>
|
|
The "min" and "max" arguments refer to UDP datagram size. Generating payload
requires to account properly for header size, which is variable and depends on
payload size and packet number.
|
|
|
|
If packet needs to be expanded (for example Initial to 1200 bytes),
but path limit is less, such packet should not be created/sent.
|
|
Previously, in-flight byte counter and congestion window were properly
maintained, but the limit was not properly implemented.
Now a new datagram is sent only if in-flight byte counter is less than window.
The limit is datagram-based, which means that a single datagram may lead to
exceeding the limit, but the next one will not be sent.
|
|
Previously, the error was ignored leading to unnecessary retransmits.
Now, unsent frames are returned into output queue, state is reset, and
timer is started for the next send attempt.
|
|
The directive enables usage of UDP segmentation offloading by quic.
By default, gso is disabled since it is not always operational when
detected (depends on interface configuration).
|