[dashboard] Remove ReportHead usage of DataSource.agents. #49878
Signed-off-by: Ruiyang Wang <[email protected]>
If either of them are not found, return None.
"""
node_info, agent_port_json = await asyncio.gather(
    self.gcs_aio_client.get_all_node_info(node_id=node_id),
as long as you have a gcs client, you have all nodes cached, we can just use that?
That cache only has a value if you subscribe to node changes:
ray/src/ray/gcs/gcs_client/accessor.h
Lines 398 to 399 in 0022380
/// Note, the local cache is only available if `AsyncSubscribeToNodeChange`
/// is called before.
A Python-side GCS client does not have that value by default.
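For illustration, the concurrent lookup in the snippet under review can be sketched with stand-in coroutines. `fetch_node_info`, `fetch_agent_ports`, and the in-memory tables are hypothetical placeholders, not Ray APIs; the sketch only shows the `asyncio.gather` pattern and the "return None if either is missing" rule from the docstring.

```python
import asyncio
import json
from typing import Optional, Tuple

# Hypothetical in-memory stand-ins for the GCS node table and the internal KV.
_NODES = {"node-a": {"node_id": "node-a", "state": "ALIVE"}}
_AGENT_KV = {"node-a": json.dumps([52365, 52366])}


async def fetch_node_info(node_id: str) -> Optional[dict]:
    """Placeholder for an on-demand node info lookup (no local cache)."""
    await asyncio.sleep(0)  # yield to the event loop, as a real RPC would
    return _NODES.get(node_id)


async def fetch_agent_ports(node_id: str) -> Optional[str]:
    """Placeholder for reading the agent's ports entry from a KV store."""
    await asyncio.sleep(0)
    return _AGENT_KV.get(node_id)


async def get_node_and_ports(node_id: str) -> Optional[Tuple[dict, list]]:
    # Issue both lookups concurrently instead of one after the other.
    node_info, ports_json = await asyncio.gather(
        fetch_node_info(node_id),
        fetch_agent_ports(node_id),
    )
    # If either of them is not found, return None.
    if node_info is None or ports_json is None:
        return None
    return node_info, json.loads(ports_json)
```

Because nothing is cached, every call pays for two lookups, which is the trade-off raised in this thread.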
src/ray/gcs/gcs_client/accessor.h
virtual Status AsyncGetAll(std::optional<NodeID> node_id,
                           const MultiItemCallback<rpc::GcsNodeInfo> &callback,
                           int64_t timeout_ms);
nit: given filters are optional we can do
virtual Status AsyncGetAll(
const MultiItemCallback<rpc::GcsNodeInfo> &callback,
int64_t timeout_ms,
std::optional<NodeID> node_id = std::nullopt);
so you don't need to update all callers.
since we already changed, and there's only a few of them, let's just keep them
you are right we need to do the defaults; cpp tests have a lot of callsites
@alanwguo this PR changes some HTTP endpoints, and I have changed callsites in frontend accordingly. PTAL!
LGTM
node_id: The ID of the node.
"""
if "pid" not in req.query:
    raise ValueError("pid is required")
Seems like in the code it's not required?
I think pid is required, otherwise we don't know what pid to traceback
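A minimal sketch of the check being discussed, using a plain mapping in place of the request's query object. `parse_pid` is a hypothetical helper for illustration and is not part of this PR.

```python
from typing import Mapping


def parse_pid(query: Mapping[str, str]) -> int:
    """Return the target pid from query parameters, or fail loudly."""
    # Without a pid there is no process to take a traceback from,
    # so a missing parameter is treated as an error, not a default.
    if "pid" not in query:
        raise ValueError("pid is required")
    return int(query["pid"])
```

Raising early keeps the handler from forwarding an incomplete request to the agent.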
Co-authored-by: Jiajun Yao <[email protected]>
Signed-off-by: Ruiyang Wang <[email protected]>
ReportHead used to subscribe to DataSource.agents changes in order to maintain a connection to every agent, O(#nodes) connections in total, just to make gRPC calls when needed. This PR removes that subscription; ReportHead now only makes on-demand connections when a profiling request comes in.
Changes:
DASHBOARD_AGENT_PORT_PREFIX -> DASHBOARD_AGENT_ADDR_PREFIX
json.dumps([http_port, grpc_port]) -> json.dumps([ip, http_port, grpc_port])
ip to give param node_id.
DataSource.node_physical_stats to NodeHead.
After this PR, ReportHead no longer needs DataSource.
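As a rough illustration of the new value format, the sketch below encodes and decodes a `[ip, http_port, grpc_port]` JSON list and derives a gRPC target from it. `AgentAddr`, `encode_agent_addr`, `decode_agent_addr`, and `grpc_target` are hypothetical names chosen for this example; only the `json.dumps([ip, http_port, grpc_port])` shape comes from the PR description.

```python
import json
from typing import NamedTuple, Optional, Union


class AgentAddr(NamedTuple):
    ip: str
    http_port: int
    grpc_port: int


def encode_agent_addr(addr: AgentAddr) -> str:
    # New format: the IP travels with the ports in a single KV value.
    return json.dumps([addr.ip, addr.http_port, addr.grpc_port])


def decode_agent_addr(value: Optional[Union[str, bytes]]) -> Optional[AgentAddr]:
    # A missing KV entry means the agent address is not known (yet).
    if value is None:
        return None
    ip, http_port, grpc_port = json.loads(value)
    return AgentAddr(ip, int(http_port), int(grpc_port))


def grpc_target(addr: AgentAddr) -> str:
    # Address a caller could dial on demand for a single request.
    return f"{addr.ip}:{addr.grpc_port}"
```

With the IP stored next to the ports, one KV read per request is enough to reach an agent, so no long-lived per-node connection has to be kept.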
Smoke tested UI locally.