Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reintroduce timeout and keep-alive for watch requests to match client-go #2131

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Alabate
Copy link

@Alabate Alabate commented Dec 28, 2024

This will send sends keep-alive probes to the server every 30 seconds. These features were present prior to the 1.0 but were inadvertently removed.

Fixes #2127

Previous relevant issues:

Implementation details

Setting requestInit.timeout = 30000 is equivalent to socket.setTimeout(30000) as you can see in the sources.

One of the solution suggested in #2127 was to use the keepAlive of the http.Agent. But as documented here, that's just a boolean that instruct the agent that we want sockets to be re-used. We want to send packets at a fixed rate to know when a connection is broken. This can be done with the socket.setKeepAlive() method, but to use it, we need to access the socket object.

I managed to access the socket and call setKeepAlive(), but only after the await fetch(), when the response is already arriving. That's the only way I found to access the socket. The Agent create the socket, but there is no event to know when a socket created, you can just access them by list like I did. There is a socket event on the request object, but node-fetch abstract it away, and I couldn't find a clean way to access it (See discussion that request this feature).

The only way I've found to get the socket before the request was by monkey-patching the agent.createConnection(), but I think this solution is worse.

const _createConnection = agent.createConnection;
agent.createConnection = function (...args) {
  const socket = _createConnection.apply(agent, args);
  socket.setTimeout(3000);
  socket.setKeepAlive(true, 3000);
  return socket;
}

@k8s-ci-robot k8s-ci-robot added the do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. label Dec 28, 2024
Copy link

linux-foundation-easycla bot commented Dec 28, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: Alabate / name: Aurélien Labate (8561542)

@k8s-ci-robot k8s-ci-robot added the cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. label Dec 28, 2024
@k8s-ci-robot
Copy link
Contributor

Welcome @Alabate!

It looks like this is your first PR to kubernetes-client/javascript 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-client/javascript has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Dec 28, 2024
@Alabate Alabate force-pushed the fix-watch-keepalive branch from a82c51a to 4113110 Compare December 28, 2024 03:04
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Dec 28, 2024
@Alabate Alabate force-pushed the fix-watch-keepalive branch from 4113110 to ded25aa Compare December 28, 2024 03:07
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. label Dec 28, 2024
This will send sends keep-alive probes to the server every 30 seconds.
These features were present prior to the 1.0 refactor but were inadvertently removed.
@Alabate Alabate force-pushed the fix-watch-keepalive branch from ded25aa to 8561542 Compare December 28, 2024 03:14
@cjihrig
Copy link
Contributor

cjihrig commented Dec 28, 2024

One of the solution suggested in #2127 was to use the keepAlive of the http.Agent. But as documented here, that's just a boolean that instruct the agent that we want sockets to be re-used. We want to send packets at a fixed rate to know when a connection is broken. This can be done with the socket.setKeepAlive() method, but to use it, we need to access the socket object.

There is also a keepAliveMsecs option in the same docs you linked to. Would using that in conjunction with the keepAlive boolean do what you need and avoid looping over all of the agent's sockets? I believe that is how socket.setKeepAlive() is implemented.

@Alabate
Copy link
Author

Alabate commented Dec 28, 2024

I agree that it should, but when I tried it didn't work, so I though I did not understand the documentation. But in the end, I think it's more because of this issue: nodejs/node#47137

To confirm it, I ran the following snippet that simply create an agent and a request and try to read the setKeepAlive status:

import * as protocol from 'node:http';

const agent = new protocol.Agent({
    keepAlive: true,
    keepAliveMsecs: 3000,
});
console.log('========= Protocol ', agent.protocol, '==========');

const req = protocol.request(`${agent.protocol}//google.com`, { agent: agent }, (res) => {
    logKeepaliveSocketStatus(res.socket);

    req.socket.setKeepAlive(true, 4000);
    logKeepaliveSocketStatus(res.socket);
});
req.end();

function logKeepaliveSocketStatus(socket) {
    const kSetKeepAlive = Object.getOwnPropertySymbols(socket).find(
        (sym) => sym.description === 'kSetKeepAlive',
    );
    const kSetKeepAliveInitialDelay = Object.getOwnPropertySymbols(socket).find(
        (sym) => sym.description === 'kSetKeepAliveInitialDelay',
    );
    console.log('----- Socket status -----');
    console.log('kSetKeepAlive:', socket[kSetKeepAlive]);
    console.log('kSetKeepAliveInitialDelay:', socket[kSetKeepAliveInitialDelay]);
}

When I execute this script with the node:http module, I get the following output:

========= Protocol  http: ==========
----- Socket status -----
kSetKeepAlive: true
kSetKeepAliveInitialDelay: 3
----- Socket status -----
kSetKeepAlive: true
kSetKeepAliveInitialDelay: 4

However, if I replace node:http with node:https, I get:

========= Protocol  https: ==========
----- Socket status -----
kSetKeepAlive: false
kSetKeepAliveInitialDelay: 0
----- Socket status -----
kSetKeepAlive: true
kSetKeepAliveInitialDelay: 4

And in our case we use the https agent

@brendandburns
Copy link
Contributor

Please add unit tests also.

@cjihrig
Copy link
Contributor

cjihrig commented Dec 31, 2024

As much as I don't like the workaround here, I haven't been able to find a better way with the current project setup. It's also surprising that no one has tried to fix nodejs/node#47137. LGTM once there are tests.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Alabate, anandfresh
Once this PR has been reviewed and has the lgtm label, please assign cjihrig for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Watch keepalive not working anymore with 1.0.0
5 participants