Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: stop services after transports #1138

Closed
wants to merge 1 commit into from
Closed

Conversation

diegomrsantos
Copy link
Contributor

#843 Services stop duplication, but services should be stopped after Transports, as they are higher level.

@diegomrsantos diegomrsantos requested a review from lchenut June 25, 2024 12:07
@kaiserd
Copy link
Collaborator

kaiserd commented Jun 25, 2024

Looking a the errors Unhandled defect: An attempt was made to complete a Future more than once., it seems the services close resources the transport tries to close again.

I agree it would make sense to close services after transports though. Maybe the services close resources they should not?

@diegomrsantos
Copy link
Contributor Author

Looking a the errors Unhandled defect: An attempt was made to complete a Future more than once., it seems the services close resources the transport tries to close again.

I agree it would make sense to close services after transports though. Maybe the services close resources they should not?

I'm surprised that the tests failed.

@diegomrsantos
Copy link
Contributor Author

diegomrsantos commented Jun 25, 2024

@lchenut onReservation is being called when the switch is stopped with an empty list of addresses

self.onReservation(concat(toSeq(self.relayAddresses.values)))

@diegomrsantos
Copy link
Contributor Author

diegomrsantos commented Jun 25, 2024

@lchenut onReservation is being called when the switch is stopped with an empty list of addresses

self.onReservation(concat(toSeq(self.relayAddresses.values)))

this fixes the tests, but not sure if it's the right way of doing it:

if not self.onReservation.isNil() and self.relayAddresses.len > 0 and switch.started:
  self.onReservation(concat(toSeq(self.relayAddresses.values)))

@lchenut
Copy link
Contributor

lchenut commented Jun 26, 2024

It looks good to me. It should probably be necessary to write a similar condition here aswell

if not self.onReservation.isNil():

@diegomrsantos
Copy link
Contributor Author

Why is onReservation being called? No reservations happened. Switch's started field is private and thus not used anywhere else in the system.

@diegomrsantos
Copy link
Contributor Author

self.relayAddresses.len > 0 fixes two tests, but without switch.started, "Three relays connections" still fails.

@lchenut
Copy link
Contributor

lchenut commented Jun 26, 2024

if not self.onReservation.isNil():
self.onReservation(concat(toSeq(self.relayAddresses.values)))

Here onReservation is called because we just connect someone and made a reservation, so we send an up to date list of address.
if not self.onReservation.isNil():
self.onReservation(concat(toSeq(self.relayAddresses.values)))

Here onReservation is called because some remote peer just disconnect or close connection, so we send an up to date list of address.

@lchenut
Copy link
Contributor

lchenut commented Jun 26, 2024

And Three relay connections still fails because you first stop the Transport, thus the service will receive the disconnection and try to update and send the newly list of addresses without those who disconnects.

One thing you could do to prevent this behavior is to add a fifth state in Three relay connections, something like TestFinished and it'll work just fine...

But to be perfectly honest, I'm not 100% certain this PR is useful. My gut feeling says that we should stop Services (which indirectly uses the Transport) before closing the Transports.

@diegomrsantos
Copy link
Contributor Author

But to be perfectly honest, I'm not 100% certain this PR is useful. My gut feeling says that we should stop Services (which indirectly uses the Transport) before closing the Transports.

I think you're right and I'll probably close it.

Here onReservation is called because some remote peer just disconnect or close connection, so we send an up to date list of address.

So onReservation name seems misleading and a renaming might be good. Additionally, shouldn't the Autorelay Service and its tests handle this situation correctly and not fail if unrelated things are moving around?

@lchenut
Copy link
Contributor

lchenut commented Jun 26, 2024

So onReservation name seems misleading and a renaming might be good. Additionally, shouldn't the Autorelay Service and its tests handle this situation correctly and not fail if unrelated things are moving around?

I agree that onReservation name could be misleading.

Autorelay Service's test handles this specific case, it's the Relay2UnreservedAndRelay1Reserved state in Three relays connections. What I haven't anticipated is that doing await switchClient.stop() will cause onReservation to be called even though it's setup to be a service of switchClient. Which is, in my opinion, a bit weird.

@diegomrsantos
Copy link
Contributor Author

What I haven't anticipated is that doing await switchClient.stop() will cause onReservation to be called even though it's setup to be a service of switchClient. Which is, in my opinion, a bit weird.

Could you elaborate on what is weird in your opinion?

@lchenut
Copy link
Contributor

lchenut commented Jun 26, 2024

Let's take an example:

  • SwitchA creates an Autorelay service and starts
  • SwitchB mounts the relay protocols and starts
  • SwitchA connects to switchB
  • The badly named onReservation is called with the relay address to passing through SwitchB
  • (...)
  • SwitchA stops.
  • The badly named onReservation is called to indicate that you are no longer connected to SwitchB.

I find it weird because we are stopping SwitchA. So we know we are disconnecting. I'm not sure it's a relevant point to indicate, after you decide to stop that, you indeed stopped. It's redundant.
I might be wrong, though. But if we accept to stop services after stopping transports, I will prefer a re-write of the autorelay test instead of adding and self.relayAddresses.len > 0 and switch.started (even though I said it looked good two hours ago, I changed my mind about that)

@diegomrsantos
Copy link
Contributor Author

I'm going to close this PR, we should stop the services first. One way to think about it is we start "things" and add them to an imaginary stack. When we want to stop them, we pop and stop, i.e. we stop in the reserve order that they were added. See #1095 (comment).

Even after this PR is closed and the tests pass, the service remains brittle if it behaves unexpectedly when unrelated changes occur.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: done
Development

Successfully merging this pull request may close these issues.

3 participants