| Age | Commit message (Collapse) | Author | Files | Lines |
|
Encryption level values are decoupled from ssl_encryption_level_t,
which is now limited to BoringSSL QUIC callbacks, with mappings
provided. Although the values match, this provides a technically
safe approach, in particular, to access protection level sized arrays.
In preparation for using OpenSSL 3.5 TLS callbacks.
|
|
|
|
RFC 9002, Section 6.1.1 defines packet reordering threshold as 3. Testing
shows that such low value leads to spurious packet losses followed by
congestion window collapse. The change implements dynamic packet threshold
detection based on in-flight packet range. Packet threshold is defined
as half the number of in-flight packets, with mininum value of 3.
Also, renamed ngx_quic_lost_threshold() to ngx_quic_time_threshold()
for better compliance with RFC 9002 terms.
|
|
|
|
As per RFC 9000, Section 14.4:
Loss of a QUIC packet that is carried in a PMTU probe is therefore
not a reliable indication of congestion and SHOULD NOT trigger a
congestion control reaction.
|
|
As per RFC 9002, Section 7.8, congestion window should not be increased
when it's underutilized.
|
|
On some systems the value of ngx_current_msec is derived from monotonic
clock, for which the following is defined by POSIX:
For this clock, the value returned by clock_gettime() represents
the amount of time (in seconds and nanoseconds) since an unspecified
point in the past.
As as result, overflow protection is needed when comparing two ngx_msec_t.
The change adds such protection to the ngx_quic_detect_lost() function.
|
|
Since recovery_start field was initialized with ngx_current_msec, all
congestion events that happened within the same millisecond or cycle
iteration, were treated as in recovery mode.
Also, when handling persistent congestion, initializing recovery_start
with ngx_current_msec resulted in treating all sent packets as in recovery
mode, which violates RFC 9002, see example in Appendix B.8.
While here, also fixed recovery_start wrap protection. Previously it used
2 * max_idle_timeout time frame for all sent frames, which is not a
reliable protection since max_idle_timeout is unrelated to congestion
control. Now recovery_start <= now condition is enforced. Note that
recovery_start wrap is highly unlikely and can only occur on a
32-bit system if there are no congestion events for 24 days.
|
|
As per RFC 9002, Section B.2, max_datagram_size used in congestion window
computations should be based on path MTU.
|
|
Improved logging for simpler data extraction for plotting congestion
window graphs. In particular, added current milliseconds number from
ngx_current_msec.
While here, simplified logging text and removed irrelevant data.
|
|
Since a2a513b93cae, stream frames no longer need to be retransmitted after it
was deleted. The frames which were retransmitted before, could be stream data
frames sent prior to a RESET_STREAM. Such retransmissions are explicitly
prohibited by RFC 9000, Section 19.4.
|
|
On-packet acknowledgement is made path aware, as per RFC 9000, Section 9.4:
Packets sent on the old path MUST NOT contribute to congestion control
or RTT estimation for the new path.
To make this possible in a single congestion control context, the first packet
to be sent after the new path has been validated, which includes resetting the
congestion controller and RTT estimator, is now remembered in the connection.
Packets sent previously, such as on the old path, are not taken into account.
Note that although the packet number is saved per-connection, the added checks
affect application level packets only. For non-application level packets,
which are only processed prior to the handshake is complete, the remembered
packet number remains set to zero.
|
|
The field "first" is removed. It's unused since 909b989ec088.
The field "last" is renamed to "send_time". It holds frame send time.
|
|
Previously ngx_quic_frame_sendto() ignored congestion control and did not
contribute to in_flight counter.
Now congestion control window is checked unless ignore_congestion flag is set.
Also, in_flight counter is incremented and the frame is stored in ctx->sent
queue if it's ack-eliciting. This behavior is now similar to
ngx_quic_output_packet().
|
|
Previously it was possible to generate ACK frames using formally discarded
protection keys, in particular, when acknowledging a client Handshake packet
used to complete the TLS handshake and to discard handshake protection keys.
As it happens late in packet processing, it could be possible to generate ACK
frames after the keys were already discarded.
ACK frames are generated from ngx_quic_ack_packet(), either using a posted
push event, which envolves ngx_quic_generate_ack() as a part of the final
packet assembling, or directly in ngx_quic_ack_packet(), such as when there
is no room to add a new ACK range or when the received packet is out of order.
The added keys availability check is used to avoid generating late ACK frames
in both cases.
|
|
MTU selection starts by doubling the initial MTU until the first failure.
Then binary search is used to find the path MTU.
|
|
When probe timeout expired while congestion window was exhausted, probe PINGs
could not be sent. As a result, lost packets could not be declared lost and
congestion window could not be freed for new packets. This deadlock
continued until connection idle timeout expiration.
Now PINGs are sent separately from the frame queue without congestion control,
as specified by RFC 9002, Section 7:
An endpoint MUST NOT send a packet if it would cause bytes_in_flight
(see Appendix B.2) to be larger than the congestion window, unless the
packet is sent on a PTO timer expiration (see Section 6.2) or when entering
recovery (see Section 7.3.2).
|
|
Previously, PTO handler analyzed the first packet in the sent queue for the
timeout expiration. However, the last sent packet should be analyzed instead.
An example is timeout calculation in ngx_quic_set_lost_timer().
|
|
Previously the field pnum of a potentially freed frame was accessed. Now the
value is copied to a local variable. The old behavior did not cause any
problems since the frame memory is not freed, but is moved to a free queue
instead.
|
|
Previously ACK was not generated if max_ack_delay was not yet expired and the
number of unacknowledged ack-eliciting packets was less than two, as allowed by
RFC 9000 13.2.1-13.2.2. However this only makes sense to avoid sending ACK-only
packets, as explained by the RFC:
On the other hand, reducing the frequency of packets that carry only
acknowledgments reduces packet transmission and processing cost at both
endpoints.
Now ACK is delayed only if output frame queue is empty. Otherwise ACK is sent
immediately, which significantly improves QUIC performance with certain tests.
|
|
Previously, computing rttvar used an updated smoothed_rtt value as per
RFC 9002, section 5.3, which appears to be specified in a wrong order.
A technical errata ID 7539 is reported.
|
|
Previously, ngx_quic_close_connection() could be called in a way that QUIC
connection was accessed after the call. In most cases the connection is not
closed right away, but close timeout is scheduled. However, it's not always
the case. Also, if the close process started earlier for a different reason,
calling ngx_quic_close_connection() may actually close the connection. The
connection object should not be accessed after that.
Now, when possible, return statement is added to eliminate post-close connection
object access. In other places ngx_quic_close_connection() is substituted with
posting close event.
Also, the new way of closing connection in ngx_quic_stream_cleanup_handler()
fixes another problem in this function. Previously it passed stream connection
instead of QUIC connection to ngx_quic_close_connection(). This could result
in incomplete connection shutdown. One consequence of that could be that QUIC
streams were freed without shutting down their application contexts. This could
result in another use-after-free.
Found by Coverity (CID 1530402).
|
|
Path validation packets containing PATH_CHALLENGE frames are sent separately
from regular frame queue, because of the need to use a decicated path and pad
the packets. The packets are sent periodically, separately from the regular
probe/lost detection mechanism. A path validation packet is resent up to 3
times, each time after PTO expiration, with increasing per-path PTO backoff.
|
|
The check is needed for clients in order to unblock a server due to
anti-amplification limits, and it seems to make no sense for servers.
See RFC 9002, A.6 and A.8 for a further explanation.
This makes max_ack_delay to now always account, notably including
PATH_CHALLENGE timers as noted in the last paragraph of 9000, 9.4,
unlike when it was only used when there are packets in flight.
While here, fixed nearby style.
|
|
Previously, stream was kept alive until all its data is sent. This resulted
in disabling retransmission of final part of stream when QUIC connection
was closed right after closing stream connection.
|
|
|
|
|
|
This allows to eliminate the usage of stream connection event flags for tracking
stream state.
|
|
RFC 9002 uses constants implying effective implementation,
i.e. using bit shift operations instead of floating point.
|
|
The sent queue is sorted by packet number. It is possible to avoid
traversing full queue while handling ack ranges. It makes sense to
start traversing from the queue head (i.e. check oldest packets first).
|
|
Previously, in-flight byte counter and congestion window were properly
maintained, but the limit was not properly implemented.
Now a new datagram is sent only if in-flight byte counter is less than window.
The limit is datagram-based, which means that a single datagram may lead to
exceeding the limit, but the next one will not be sent.
|
|
The information about the type is contained in off/len/fin bits.
Also, where possible, only the first stream type (0x08) is used for simplicity.
|
|
This includes updating citations and further clarification.
|
|
According to RFC 9002 (quic-recovery) 7.6.
|
|
Previously each stream had an input buffer. Now memory is allocated as
bytes arrive. Generic buffering mechanism is used for this.
|
|
The patch adds proper transitions between multiple networking addresses that
can be used by a single quic connection. New networking paths are validated
using PATH_CHALLENGE/PATH_RESPONSE frames.
|
|
Currently both names are used which is confusing. Historically these were
different objects, but now it's the same one. The name qs (quic stream) makes
more sense than sn (stream node).
|
|
|
|
|
|
|