Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a maximum wait time between iterations in the opcua server #295

Open
wants to merge 2 commits into
base: develop
Choose a base branch
from

Conversation

cochicde
Copy link
Contributor

No description provided.

@cochicde cochicde marked this pull request as draft December 16, 2024 20:52
@cochicde cochicde requested a review from azoitl December 16, 2024 20:57
@cochicde
Copy link
Contributor Author

this moves the wait internally to the run_iterate which goes into fdselect. I quickly checked with the opcua_dev and the deployment was intantaneous.

The drawback now is that because there's a 200ms timeout in fdselect, the iternal part of the server takes way longer to start. Why? Because the server is protected with a mutex, so every node that needs to be created by forte, will need to wait for this 200ms before accessing the server.

The best case scenario is that forte can interrupt the fdselect when it needs to access the server. My first thought was to send some kind of dummy packet, what do you think? Is there a way to force a wake up from fdselect?

@azoitl
Copy link
Contributor

azoitl commented Dec 16, 2024

This will not help. I think it is a very bad idea to hold a mutex while the server is in fdselect. it should hold the mutex when it leaves fdselect to process the packets. Do we have any chance to do that?

@cochicde
Copy link
Contributor Author

mmm I don't think we can. I couldn't find anything in the server interface nor in the code handling the incoming data in open62541.

One possible solution would be to have the actions being executed asyncrhonously. What we do now, whenever we init/exexute/deinit an action (which are the places we touch the server) we hold the mutex. We could pass this actions into a queue (protected by a mutex) and process these between loops of the server, similarly to what it's done in the ecet.


Init/Execute/Deinit: 
1. Hold Queue Mutex
2. Copy/Move action to Queue 

run: 
1. If queue.size() != 0 -> Hold Queue Mutex -> Move Queue into Temp queue
3. Process Temp queue
4. open62541.run_iterate()
5. Go to 1

@azoitl
Copy link
Contributor

azoitl commented Dec 17, 2024

After sleeping it over I found an even bigger problem/question: The 20ms lock would also block all publishers for 200ms or?

@cochicde cochicde force-pushed the remove-wait-opcua-server branch from 6579728 to 400ea89 Compare December 17, 2024 18:20
@cochicde
Copy link
Contributor Author

cochicde commented Dec 17, 2024

yes, you're right.

THe problem is how we are dealing the return value from the UA_Server_run_iterate. This value represents the amount of time until the next callback. I'm not sure about which exact callbacks are registered by default, but it is returning 1 second regularly. If you connect to the server, for example with uaexpert, it creates a subscription with a period of 100ms, and UA_Server_run_iterate starts returning 100 ms. It means kind of "the wait time until the server will need to update some stuff", but there's no reason for forte to wait that time.

So, as it is this PR now, we are calling UA_Server_run_iterate with waitInternal = false meaning fdselect returns immediately if nothing is found. Forte then needs to wait the cycle time it wants. I set it now to maximum of 10 ms, but this could be a CMAKE setting

Actually, maybe we can completely ignore the return value and set a CMAKE waitingTime between iterations

@cochicde cochicde marked this pull request as ready for review December 18, 2024 07:02
@cochicde
Copy link
Contributor Author

It should be fine now. I still kept the usage of the return time in case there's a callback sooner than the maximum time (e.g. 5ms) and the minimum to avoid cases where for some reason the server always return 0 creating an issue.

I chose 10 ms for no reason

@cochicde cochicde changed the title remove wait on opcua server and use internal wait Add a maximum wait time between iterations in the opcua server Dec 18, 2024
@cochicde cochicde force-pushed the remove-wait-opcua-server branch from 9d3044f to 2db52b6 Compare December 18, 2024 17:42
@azoitl
Copy link
Contributor

azoitl commented Dec 20, 2024

While this is better then before it is still far from ideal. The best would still be that we could invoke a select loop, and only have the lock when the OPC UA server is really processing. But I guess this is not possible with the current API of Open62541 am I right.

@cochicde
Copy link
Contributor Author

I think that's what we have now. The UA_Server_run_iterate is the function where the OPC UA server is processing. I quickly checked the code and the function is doing mainly two things:

  1. Process delayed callback from the server
  2. Calling Select

Regarding the timeout in fdselect, the API now does not offer the possibility to set the timeout of fdselect as mentioned and it has either 0 or the hardcoded 200ms.
Assuming at some point we have the possibility to set this timeout, adding a timeout there will block for that time calls to the server coming from inside forte, as you mentioned for example, publishing data.

As we have now we have basically a zero delay for internal calls (publishing data wakes up the waiting semaphore) and FORTE_COM_OPC_UA_SERVER_MAX_ITERATION_INTERVAL for external calls (clients calling methods on forte).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants