Fsm dialog #241
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff            @@
##    state_machine    #241     +/-   ##
=================================================
+ Coverage     84.18%   86.49%   +2.30%
=================================================
  Files            37       39       +2
  Lines           487      607     +120
=================================================
+ Hits            410      525     +115
- Misses           77       82       +5
Great stuff @dalonsoa. I've suggested a few minor tweaks.
A little realism, but I'm not sure how much we really need so long as we can interact with the processes as we need to.
If that's all, let's make that the default session and move on, so we don't get stuck on this.
I say merge this as-is and make a separate issue about changing the default to lr-session.
I think it would be helpful to see the logs of
@plasorak here you have them, with entries just before and after the
[05:41:31] INFO rest_api_child.py:303 ru-01-commander: Received reply from ru-01 to start
INFO broadcast_sender.py:65 Broadcast: Propagated execute_fsm_command to children (ru-01) successfully
INFO rest_api_child.py:303 ru-02-commander: Received reply from ru-02 to start
INFO broadcast_sender.py:65 Broadcast: Propagated execute_fsm_command to children (ru-02) successfully
INFO broadcast_sender.py:65 Broadcast: Changing operational_state from propagating-start to start-propagated
INFO broadcast_sender.py:65 Broadcast: Changing operational_state from start-propagated to executing-start
INFO broadcast_sender.py:65 Broadcast: Changing operational_state from executing-start to start-terminated
INFO broadcast_sender.py:65 Broadcast: Changing operational_state from configured to ready
INFO broadcast_sender.py:65 Broadcast: Changing operational_state from start-terminated to finalising-start
INFO broadcast_sender.py:65 Broadcast: Changing operational_state from finalising-start to ready
INFO broadcast_sender.py:65 Broadcast: User 'nobody' successfully executed 'execute_fsm_command'
INFO broadcast_sender.py:65 Broadcast: User 'nobody' successfully executed 'status'
Exception in thread connectivity_service_updating_thread:
Traceback (most recent call last):
File "/basedir/NFD_DEV_241114_A9/.venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 445, in _make_request
six.raise_from(e, None)
File "", line 3, in raise_from
File "/basedir/NFD_DEV_241114_A9/.venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 440, in _make_request
httplib_response = conn.getresponse()
File "/cvmfs/dunedaq.opensciencegrid.org/spack/externals/ext-v2.1/spack-0.22.0/opt/spack/linux-almalinux9-x86_64/gcc-12.1.0/python-3.10.10-gcsatsf5lmzrhmprzux7uv67w2omc7e3/lib/python3.10/http/client.py", line 1374, in getresponse
response.begin()
File "/cvmfs/dunedaq.opensciencegrid.org/spack/externals/ext-v2.1/spack-0.22.0/opt/spack/linux-almalinux9-x86_64/gcc-12.1.0/python-3.10.10-gcsatsf5lmzrhmprzux7uv67w2omc7e3/lib/python3.10/http/client.py", line 318, in begin
version, status, reason = self._read_status()
File "/cvmfs/dunedaq.opensciencegrid.org/spack/externals/ext-v2.1/spack-0.22.0/opt/spack/linux-almalinux9-x86_64/gcc-12.1.0/python-3.10.10-gcsatsf5lmzrhmprzux7uv67w2omc7e3/lib/python3.10/http/client.py", line 279, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/cvmfs/dunedaq.opensciencegrid.org/spack/externals/ext-v2.1/spack-0.22.0/opt/spack/linux-almalinux9-x86_64/gcc-12.1.0/python-3.10.10-gcsatsf5lmzrhmprzux7uv67w2omc7e3/lib/python3.10/socket.py", line 705, in readinto
return self._sock.recv_into(b)
TimeoutError: timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/basedir/NFD_DEV_241114_A9/.venv/lib/python3.10/site-packages/requests/adapters.py", line 439, in send
resp = conn.urlopen(
File "/basedir/NFD_DEV_241114_A9/.venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 755, in urlopen
retries = retries.increment(
File "/basedir/NFD_DEV_241114_A9/.venv/lib/python3.10/site-packages/urllib3/util/retry.py", line 532, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/basedir/NFD_DEV_241114_A9/.venv/lib/python3.10/site-packages/urllib3/packages/six.py", line 770, in reraise
raise value
File "/basedir/NFD_DEV_241114_A9/.venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 699, in urlopen
httplib_response = self._make_request(
File "/basedir/NFD_DEV_241114_A9/.venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 447, in _make_request
self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
File "/basedir/NFD_DEV_241114_A9/.venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 336, in _raise_timeout
raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPConnectionPool(host='localhost', port=5000): Read timed out. (read timeout=0.5)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/cvmfs/dunedaq.opensciencegrid.org/spack/externals/ext-v2.1/spack-0.22.0/opt/spack/linux-almalinux9-x86_64/gcc-12.1.0/python-3.10.10-gcsatsf5lmzrhmprzux7uv67w2omc7e3/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
self.run()
File "/cvmfs/dunedaq.opensciencegrid.org/spack/externals/ext-v2.1/spack-0.22.0/opt/spack/linux-almalinux9-x86_64/gcc-12.1.0/python-3.10.10-gcsatsf5lmzrhmprzux7uv67w2omc7e3/lib/python3.10/threading.py", line 953, in run
self._target(*self._args, **self._kwargs)
File "/basedir/NFD_DEV_241114_A9/.venv/lib/python3.10/site-packages/drunc/controller/controller.py", line 281, in update_connectivity_service
ctrler.connectivity_service.publish(
File "/basedir/NFD_DEV_241114_A9/.venv/lib/python3.10/site-packages/drunc/connectivity_service/client.py", line 110, in publish
http_post(
File "/basedir/NFD_DEV_241114_A9/.venv/lib/python3.10/site-packages/drunc/utils/utils.py", line 268, in http_post
r = post(address, json=data, **post_kwargs)
File "/basedir/NFD_DEV_241114_A9/.venv/lib/python3.10/site-packages/requests/api.py", line 119, in post
return request('post', url, data=data, json=json, **kwargs)
File "/basedir/NFD_DEV_241114_A9/.venv/lib/python3.10/site-packages/requests/api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "/basedir/NFD_DEV_241114_A9/.venv/lib/python3.10/site-packages/requests/sessions.py", line 542, in request
resp = self.send(prep, **send_kwargs)
File "/basedir/NFD_DEV_241114_A9/.venv/lib/python3.10/site-packages/requests/sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "/basedir/NFD_DEV_241114_A9/.venv/lib/python3.10/site-packages/requests/adapters.py", line 529, in send
raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPConnectionPool(host='localhost', port=5000): Read timed out. (read timeout=0.5)
[05:41:46] INFO broadcast_sender.py:65 Broadcast: User 'nobody' successfully executed 'status'
[05:41:48] INFO broadcast_sender.py:65 Broadcast: User 'nobody' successfully executed 'status'
[05:41:53] INFO broadcast_sender.py:65 Broadcast: Propagating take_control to children
INFO broadcast_sender.py:65 Broadcast: Propagating take_control to children (ru-01)
[05:41:54] INFO rest_api_child.py:517 ru-01-rest-api-child: Ignoring command 'take_control' sent to 'ru-01'
INFO broadcast_sender.py:65 Broadcast: Propagating take_control to children (ru-02)
INFO rest_api_child.py:517 ru-02-rest-api-child: Ignoring command 'take_control' sent to 'ru-02'
INFO broadcast_sender.py:65 Broadcast: User 'nobody' successfully executed 'take_control'
INFO broadcast_sender.py:65 Broadcast: Changing operational_state from ready to preparing-drain_dataflow
INFO broadcast_sender.py:65 Broadcast: Changing operational_state from preparing-drain_dataflow to drain_dataflow-ready
INFO broadcast_sender.py:65 Broadcast: Changing operational_state from drain_dataflow-ready to propagating-drain_dataflow
INFO broadcast_sender.py:65 Broadcast: Propagating execute_fsm_command to children
INFO broadcast_sender.py:65 Broadcast: Propagating execute_fsm_command to children (ru-01)
INFO rest_api_child.py:532 ru-01-rest-api-child: Sending 'drain_dataflow' to 'ru-01'
INFO broadcast_sender.py:65 Broadcast: Propagating execute_fsm_command to children (ru-02)
INFO rest_api_child.py:532 ru-02-rest-api-child: Sending 'drain_dataflow' to 'ru-02'
[05:41:55] INFO rest_api_child.py:303 ru-02-commander: Received reply from ru-02 to drain_dataflow
INFO rest_api_child.py:303 ru-01-commander: Received reply from ru-01 to drain_dataflow
INFO broadcast_sender.py:65 Broadcast: Propagated execute_fsm_command to children (ru-02) successfully
INFO broadcast_sender.py:65 Broadcast: Propagated execute_fsm_command to children (ru-01) successfully
INFO broadcast_sender.py:65 Broadcast: Changing operational_state from propagating-drain_dataflow to drain_dataflow-propagated
[05:41:56] INFO broadcast_sender.py:65 Broadcast: Changing operational_state from drain_dataflow-propagated to executing-drain_dataflow
INFO broadcast_sender.py:65 Broadcast: Changing operational_state from executing-drain_dataflow to drain_dataflow-terminated
INFO broadcast_sender.py:65 Broadcast: Changing operational_state from ready to dataflow_drained
INFO broadcast_sender.py:65 Broadcast: Changing operational_state from drain_dataflow-terminated to finalising-drain_dataflow
INFO broadcast_sender.py:65 Broadcast: Changing operational_state from finalising-drain_dataflow to dataflow_drained
INFO broadcast_sender.py:65 Broadcast: User 'nobody' successfully executed 'execute_fsm_command'
INFO broadcast_sender.py:65 Broadcast: User 'nobody' successfully executed 'status'
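Reading the `drain_dataflow` entries in the log above, the controller's `operational_state` appears to step through a fixed chain of intermediate states named after the command. The sketch below only reproduces that naming pattern as it shows up in the log; the function names are hypothetical and this is not taken from the drunc source. It also ignores the separate `ready to dataflow_drained` line that is interleaved in the log.

```python
def intermediate_states(command: str, source: str, destination: str) -> list[str]:
    """Return the operational states visited while running `command`.

    The naming pattern is inferred from the log output only (an assumption),
    not from the drunc implementation.
    """
    steps = [
        f"preparing-{command}",
        f"{command}-ready",
        f"propagating-{command}",
        f"{command}-propagated",
        f"executing-{command}",
        f"{command}-terminated",
        f"finalising-{command}",
    ]
    return [source, *steps, destination]


def transitions(states: list[str]) -> list[tuple[str, str]]:
    """Pair each state with the one that follows it."""
    return list(zip(states, states[1:]))


if __name__ == "__main__":
    chain = intermediate_states("drain_dataflow", "ready", "dataflow_drained")
    for before, after in transitions(chain):
        print(f"Changing operational_state from {before} to {after}")
```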
Thanks @dalonsoa, new issue here: DUNE-DAQ/drunc#321
Fantastic! Any idea why things might be failing on Linux and macOS but not on Windows?
Not really... This could be due to the performance of the connectivity service. Is it possible your Windows machine is more powerful?
Maybe... It is from last year, so pretty new, but not particularly high specs, I think.
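For context on the traceback above: the `connectivity_service_updating_thread` died because the HTTP POST hit a 0.5 s read timeout and the exception was not caught. One way a caller could tolerate a slow connectivity service is to retry on timeout instead of letting the thread exit. This is a minimal sketch, not the fix adopted in DUNE-DAQ/drunc#321; `publish_with_retry` and its parameters are hypothetical, and `post` stands in for any callable with the same `(address, json=..., timeout=...)` shape as `requests.post`.

```python
import time
from typing import Any, Callable


def publish_with_retry(
    post: Callable[..., Any],
    address: str,
    data: dict,
    timeout: float = 0.5,
    retries: int = 3,
    backoff: float = 0.0,
    timeout_errors: tuple = (TimeoutError,),
) -> bool:
    """Call `post` up to `retries` times, returning True on the first success.

    Hypothetical helper: timeouts are swallowed so that a background
    thread calling this function keeps running instead of crashing.
    """
    for attempt in range(1, retries + 1):
        try:
            post(address, json=data, timeout=timeout)
            return True
        except timeout_errors:
            if attempt < retries:
                # Wait a little longer before each new attempt.
                time.sleep(backoff * attempt)
    return False
```

With the `requests` library, one would pass `requests.post` as `post` and `timeout_errors=(requests.exceptions.Timeout,)`, since `ReadTimeout` is a subclass of `Timeout`.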
Use 1x1 config as default session
Cool. 👍 The dynamic class creation thing is weird. 😕
Adds the tree table page
Adds the app tree to the controller overview
Description
Adds the dialog to gather the arguments needed to execute transitions on the FSM. A couple of bugs with the hardcoded FSM have been fixed as well.
Recording.2024-11-22.135358.mp4
Fixes #216
Fixes #217
Fixes #161 (the umbrella issue for the FSM)
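As an illustration of the "dynamic class creation" mentioned in the review comment above, here is one way a per-transition container for dialog arguments could be built at runtime. This is a sketch under assumptions, not the code in this PR: it uses the standard library's `dataclasses.make_dataclass` rather than whatever form machinery the PR relies on, and `make_args_class`, `run_number` and `trigger_rate` are hypothetical names.

```python
from dataclasses import field, fields, make_dataclass


def make_args_class(transition: str, arguments: dict[str, tuple[type, object]]):
    """Build a dataclass whose fields are the arguments of one FSM transition.

    `arguments` maps each argument name to a (type, default) pair.
    Defaults must be immutable values such as numbers or strings.
    """
    specs = [
        (name, typ, field(default=default))
        for name, (typ, default) in arguments.items()
    ]
    # e.g. "drain_dataflow" -> "DrainDataflowArgs"
    class_name = transition.title().replace("_", "") + "Args"
    return make_dataclass(class_name, specs)


if __name__ == "__main__":
    # Hypothetical arguments for a "start" transition.
    StartArgs = make_args_class(
        "start", {"run_number": (int, 1), "trigger_rate": (float, 0.0)}
    )
    args = StartArgs(run_number=7)
    print(StartArgs.__name__, [f.name for f in fields(args)], args)
```

Generating the class from a description of the arguments avoids hand-writing one class per transition, at the cost of the indirection the reviewer found odd.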
Type of change
Key checklist
python -m pytest
python -m sphinx -b html docs docs/build
pre-commit run --all-files
Further checks