Fixed : retransmission of discarded segments starts at beginning of new block #546
Codecov Report
All modified and coverable lines are covered by tests ✅

@@            Coverage Diff             @@
##           master     #546      +/-   ##
==========================================
+ Coverage   71.36%   71.84%   +0.47%
==========================================
  Files          26       26
  Lines        3129     3129
  Branches      480      480
==========================================
+ Hits         2233     2248      +15
+ Misses        765      752      -13
+ Partials      131      129       -2
Hello, what are your thoughts?
Sorry, I haven't found the time to look at it yet. I will do so as soon as possible.
I've tried to wrap my head around this part of the standard, but I really cannot judge from just reading it whether this is more correct. Sorry, I have very limited hands-on experience with SDO block transfers, so not easy to see what's going on. So I'm hesitant about merging the change without further understanding what it actually fixes. Could you maybe try to record a bus log which triggers this condition, from a correctly behaving client? Then we could add that as an expected message exchange in the |
Hi,

I completely agree that a test should be added. The problem is that we currently don't have an SDO server supporting block transfer. Also, I don't think we could see this from the CAN frames alone, because the protocol exchange itself is correct: what happens is that some frames are ignored on the client side, which results in a wrong CRC at the end of the transfer.

Let me try to rephrase the current problem and add an example. The current implementation of retransmit looks like this:

    def _retransmit(self):
        logger.info("Only %d sequences were received. Requesting retransmission",
                    self._ackseq)
        end_time = time.time() + self.sdo_client.RESPONSE_TIMEOUT
        self._ack_block()
        while time.time() < end_time:
            response = self.sdo_client.read_response()
            res_command, = struct.unpack_from("B", response)
            seqno = res_command & 0x7F
            if seqno == self._ackseq + 1:
                # We should be back in sync
                self._ackseq = seqno
                return response
        self._error = True
        self.sdo_client.abort(0x05040000)
        raise SdoCommunicationError("Some data were lost and could not be retransmitted")

We wait for the sequence number to be one past the last known good sequence number (self._ackseq + 1) before we start considering the messages again. However, this is wrong, because the SDO server starts resending the discarded segments at the start of a new block.

    SERVER
      [TX] 1...
      [TX] 4...   ==> Last good segment is 4
    SERVER
      [TX] 1...   ==> This corresponds to data of seqno "5" of the previous block
      [TX] 127    ==> Complete block received successfully

I hope this makes things clearer.
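The effect of the two resync conditions can be reproduced without any CAN hardware. The following is a minimal simulation sketch, not code from the library: `split_segments`, `server_frames` and `client_receive` are hypothetical helpers, and the server model (renumbering from 1 with the first unacknowledged segment) is an assumption based on the behaviour described in this thread.

```python
import binascii

SEGMENT_SIZE = 7  # data bytes carried by one block-transfer segment


def split_segments(payload):
    """Cut the payload into 7-byte segments (the last one may be shorter)."""
    return [payload[i:i + SEGMENT_SIZE] for i in range(0, len(payload), SEGMENT_SIZE)]


def server_frames(segments, blksize, lost_seqno):
    """Return the (seqno, data) frames the client sees in two sub-blocks.

    Assumed server model: the first sub-block loses one segment on the bus;
    after the client acknowledges `lost_seqno - 1`, the server starts a new
    sub-block at seqno 1 with the first unacknowledged segment.
    """
    first = [(n, segments[n - 1]) for n in range(1, blksize + 1) if n != lost_seqno]
    ackseq = lost_seqno - 1
    second = list(enumerate(segments[ackseq:ackseq + blksize], start=1))
    return first, second


def client_receive(first, second, resync_seqno):
    """Collect data, resynchronising on the seqno chosen by `resync_seqno`."""
    received = bytearray()
    ackseq = 0
    for seqno, data in first:
        if seqno != ackseq + 1:
            break  # gap detected, the rest of this sub-block is ignored
        received += data
        ackseq = seqno
    expected = resync_seqno(ackseq)
    in_sync = False
    for seqno, data in second:
        if not in_sync and seqno != expected:
            continue  # still waiting for the resync condition
        in_sync = True
        received += data
    return bytes(received)


payload = bytes(range(70))  # 10 segments
first, second = server_frames(split_segments(payload), blksize=8, lost_seqno=5)
old = client_receive(first, second, lambda ackseq: ackseq + 1)  # current check
new = client_receive(first, second, lambda ackseq: 1)           # proposed check
print("old check complete:", old == payload)
print("new check complete:", new == payload)
print("CRC old/new/expected:", binascii.crc_hqx(old, 0),
      binascii.crc_hqx(new, 0), binascii.crc_hqx(payload, 0))
```

In this model the old condition skips the first four frames of the new sub-block, so the assembled data is shorter than the payload and the final CRC comparison would be expected to fail.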
4725972 to 5341142
…entation but will fail with an invalid CRC without fix for discarded segments.
5341142 to 33aa620
Hello, I've added a test for SDO block transfer retransmit. This took me a bit of time.
Hello, it would be great to have some feedback.
Hi Samuel, I will try to have a look at your commit this week if @acolomb has no time for it.
Sorry for the long wait. I haven't had as much spare time as I had hoped for this project, and there is a bit of backlog.
I've re-read the protocol description and I think I now understand better what this fix is doing. It looks correct from my side, but again, I haven't been able to test it.
As for the unit test, one small issue was unclear even with the added comment. And could the test data be shortened? Can we somehow force a lower blksize parameter in the client for this test execution, so fewer frames are required?
I'm also wondering whether there is a chance that the client might send the response (acknowledging the last good sequence number before failure) earlier, without waiting for the rest of the block being uploaded from the server. This will happen with our client implementation, right? So what does the unit test do with the extra RX frames?
(RX, b"\x34\x79\x20\x66\x6f\x78\x20\x6a"),
(RX, b"\x34\x75\x6d\x70\x73\x20\x6f\x76"), # --> Wrong seqno (x34 instead of x33)
Are these lines intentionally using the same sequence number? From the comment I'd assume the previous segment should have been missing, not misunderstood.
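For reference, the two quoted frames can be decoded with the same masking the client uses (`res_command & 0x7F`). This is a small sketch; `parse_segment` is a hypothetical helper, and treating bit 7 as the "no more segments" flag is an assumption taken from the usual block upload segment layout.

```python
import struct


def parse_segment(frame):
    """Split an 8-byte block upload segment into its fields."""
    command, = struct.unpack_from("B", frame)
    return {
        "seqno": command & 0x7F,       # sequence number within the sub-block
        "last": bool(command & 0x80),  # assumed: set on the final segment
        "data": frame[1:8],            # up to 7 payload bytes
    }


first = parse_segment(b"\x34\x79\x20\x66\x6f\x78\x20\x6a")
second = parse_segment(b"\x34\x75\x6d\x70\x73\x20\x6f\x76")
print(first["seqno"], first["data"])
print(second["seqno"], second["data"])
```

Both frames decode to sequence number 52 (0x34), while the client would expect 51 (0x33) followed by 52.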
- if seqno == self._ackseq + 1:
+ if seqno == 1:
I'm wondering why the self._ack_block() call doesn't simply reset the self._ackseq attribute to zero in all cases. Then this check would be fine as is?
Just thinking out loud, let's hear your thoughts on why this is the better place to fix it.
Hi, no problem. The test data looks the way it does purely because of my setup and the client that I have (the data is taken from a real transaction). It is a bit "long" in lines, but it's not long to execute, and I think it's pretty representative of the behavior of CANopen nodes (blocks of 127 segments are standard). We could probably create the frames programmatically, but it would ruin readability. As for the "extra" frames, the test makes sure they are indeed "properly" ignored. The server behavior can differ here, as it might wait for the whole block to be transmitted before dealing with the acknowledge block from the client.
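If shorter test data were ever wanted, the segment frames could be generated along these lines. This is only a sketch under assumptions: `build_block_frames` is a hypothetical helper that does not exist in the library, the sample payload is made up, and the frame layout (seqno in bits 0..6, bit 7 set on the last segment, zero padding) is assumed rather than taken from the test in this PR.

```python
def build_block_frames(payload, blksize):
    """Build 8-byte block upload segment frames for `payload`.

    Sequence numbers restart at 1 after every `blksize` segments, and the
    final segment is flagged with bit 7 and padded with zero bytes.
    """
    segments = [payload[i:i + 7] for i in range(0, len(payload), 7)]
    frames = []
    for index, segment in enumerate(segments):
        seqno = index % blksize + 1
        is_last = index == len(segments) - 1
        command = seqno | (0x80 if is_last else 0)
        frames.append(bytes([command]) + segment.ljust(7, b"\x00"))
    return frames


for frame in build_block_frames(b"The quick brown fox jumps", blksize=2):
    print(frame.hex(" "))
```

The trade-off mentioned above still applies: generated frames are compact, but a literal list taken from a real transaction is easier to compare against a bus log.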
Hi,
I noticed this issue whilst working on another canopen package.
The block upload retransmit does not work correctly.
In the event that a client does not properly receive a sub-block, it sends an end sub-block message with the last acknowledged segment number.
All the frames between ackseq and blksize sent by the server should be ignored (this is currently the case).
However, the server will start resending the missed frames (between ackseq and blksize) at the beginning of the new block, so at seqno==1.
This is difficult to test within the library as there is no sdo server supporting block transfer, but I have tested it against another implementation and it works OK.
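As an illustration of the end sub-block message mentioned above, the acknowledgement frame could be built as follows. This is a sketch, not the library's `_ack_block()`: `block_upload_ack` is a hypothetical helper, and the byte layout (command 0xA2, then ackseq, then blksize) is my reading of the block upload response and should be checked against the standard.

```python
def block_upload_ack(ackseq, blksize):
    """Build the client response that ends an upload sub-block.

    Assumed layout: byte 0 = 0xA2 (ccs=5 in bits 7..5, cs=2 in bits 1..0),
    byte 1 = last correctly received seqno, byte 2 = next sub-block size.
    """
    frame = bytearray(8)
    frame[0] = (5 << 5) | 2
    frame[1] = ackseq
    frame[2] = blksize
    return bytes(frame)


# Segment 5 was lost, so only segments 1..4 are acknowledged; the server is
# then expected to resend from the old segment 5, numbered as seqno 1.
print(block_upload_ack(ackseq=4, blksize=127).hex(" "))
```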