-
Notifications
You must be signed in to change notification settings - Fork 432
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Restart retransmit timer on new data ACK #1018
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1018 +/- ##
==========================================
+ Coverage 80.81% 80.83% +0.02%
==========================================
Files 81 81
Lines 28385 28419 +34
==========================================
+ Hits 22939 22973 +34
Misses 5446 5446 ☔ View full report in Codecov by Sentry. |
If you want to test it you can use mininet. It can simulate networks with various host and switches, connected with links with custom delay and bandwidth. I can share my mininet config if you are interested. |
That would be really helpful, thank you :) |
As @Dirbaio mentioned in the matrix channel, the removal of if !self.seq_to_transmit(cx) && repr.segment_len() > 0 && !self.timer.is_retransmit() {
// RFC 6298: (5.1) If we've transmitted all data we could (and there was
// something at all, data or flag, to transmit, not just an ACK), start the
// retransmit timer if it is not already running.
self.timer
.set_for_retransmit(cx.now(), self.rtte.retransmission_timeout());
} While this part (line 1838) should not, as it specifically calls for restarting the timer on retransmission: // RFC 6298: (5.2) ACK of all outstanding data turn off the retransmit timer.
// (5.3) ACK of new data in ESTABLISHED state restart the retransmit timer.
(State::Established, TcpControl::None) => {
if ack_all {
self.timer.set_for_idle(cx.now(), self.keep_alive);
} else if ack_len > 0 {
self.timer
.set_for_retransmit(cx.now(), self.rtte.retransmission_timeout());
}
} If I am correct, it should be an easy fix (maybe an |
I was a bit unsure about the I see that there is now a new PR #1020 for fixing exponential backoff that modifies the same parts as this PR, so perhaps I should wait for that to land and then rebase this PR. |
Here is my mininet config: import subprocess
from mininet.cli import CLI
from mininet.link import TCLink
from mininet.net import Mininet
from mininet.node import OVSController
from mininet.topo import Topo
LEFT_HOST_NAME: str = 'left_h'
RIGHT_HOST_NAME: str = 'right_h'
class HostInTheMiddleTopo(Topo):
def build(self):
left_host = self.addHost(LEFT_HOST_NAME)
right_host = self.addHost(RIGHT_HOST_NAME)
self.addLink(left_host, right_host, bw=100, delay="1ms")
def get_interface(host: str, index: int):
return f"{host}-eth{index}"
def disable_offload(host, offload: str, index: int):
host.cmd(f"ethtool -K {get_interface(host.name, index)} {offload} off")
def run():
net = Mininet(topo=HostInTheMiddleTopo(),
controller=OVSController, link=TCLink)
left_host = net[LEFT_HOST_NAME]
right_host = net[RIGHT_HOST_NAME]
for host in [left_host, right_host, middle_host]:
for name in ["tx", "rx"]:
disable_offload(host, name, 0)
left_host.cmd("tcpdump -w left.pcap &")
right_host.cmd("tcpdump -w right.pcap &")
# left_host.cmd(f"nc -l 0.0.0.0 8000 > /dev/null &")
# left_host.cmd(f"yes | nc -l 0.0.0.0 8000 &")
# right_host.cmd("ifstat -i right_h-eth0 > speedd.txt &")
# left_host.cmd(f"sudo RUST_LOG=trace ./target/release/smol-speed > out.txt 2> err.txt &")
net.start()
CLI(net)
net.stop()
if __name__ == "__main__":
run() You can run it by |
546aba9
to
13774ff
Compare
This is an attempt to address the issue described in #949, inspired by the patch and discussion in #953.
I don't want to step on anybody's toes - if you want to continue working on the original PR I don't mind 🙂.
Basically, what I've tried to do is to implement rfc6298 5.3 based on my understanding of that and how the code works. I added a test for the retransmission timer restart but I'm not sure if it is enough.
For testing I simulated a high latency link with 1000ms RTT and wrote a simple smoltcp client that sends a single byte every 100ms with Nagle disabled so that there is always outstanding data. Without the patch retransmissions keep happening, while the patched version only has a few initial retransmissions before they stop occurring. I don't have access to a real setup like the one in the original issue so I can't say for sure how much it helps in such situations.