[Bug]: Suddenly unable to work #3183
Comments
After running normally for some time, some nodes stop responding to ping. Restarting the netclient service restores connectivity, but the problem comes back after a while. How can we investigate the root cause? @afeiszli |
Can you provide more information about your environment?
|
They are not behind NAT. OS:
|
When the issue happens, there are usually several places to check:
|
Using the wg command, I found that the peer's endpoint IP is incorrect. It was automatically set to an IP from my k8s cluster network.
|
Automatic endpoint detection is enabled by default, so that hosts in the same subnet can communicate with each other over their internal IPs. In your setup, the hosts cannot reach each other over the k8s cluster network IP. |
After syncing the configuration with "netclient pull", the node still cannot be pinged. Checking with "wg show" gives the following:
The last two nodes cannot be pinged. In the "wg show" output, the problematic nodes have no "latest handshake". |
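To make the missing "latest handshake" check repeatable, here is a minimal Python sketch. It assumes the tab-separated output of "wg show <interface> dump" for a single interface, where each peer line carries eight fields and the fifth is the handshake time as a Unix timestamp (0 if there has never been one). The function name stale_peers and the 180-second threshold are illustrative choices, not anything from netclient.

```python
import time


def stale_peers(dump_text, max_age=180, now=None):
    """Return (public_key, endpoint, age_seconds) for peers without a recent handshake.

    dump_text is assumed to be the output of `wg show <interface> dump`.
    age_seconds is None when the peer has never completed a handshake.
    """
    now = int(time.time()) if now is None else now
    stale = []
    for line in dump_text.strip().splitlines():
        fields = line.split("\t")
        if len(fields) != 8:
            # The first line describes the interface itself (4 fields); skip it.
            continue
        public_key, _psk, endpoint, _allowed_ips, handshake = fields[:5]
        timestamp = int(handshake)
        age = None if timestamp == 0 else now - timestamp
        if age is None or age > max_age:
            stale.append((public_key, endpoint, age))
    return stale
```

One possible use is to feed it the captured output of "wg show <interface> dump" on each node and compare the flagged endpoints with the addresses you expect the peers to have.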
|
This is the information for the "wg show" on 10.104.0.5:
|
A tcpdump capture shows packets on the netmaker interface but none on the external interface. The commands are as follows (all run on peer 10.104.0.1):
|
Can you share your network diagram? |
Is this what you want? |
@wuwo1952368901, let's take one case as an example: 10.104.0.1 and 10.104.0.5 cannot ping each other.
|
|
@wuwo1952368901, from the traceroute output in your reply, the ping packets do not reach any of the other nodes. It looks like something is wrong locally. |
@wuwo1952368901, have you had a chance to check the routes and the peer info? In one of your comments,
you mentioned that the peer IP is incorrect and is the k8s cluster IP. By default, ENDPOINT_DETECTION is enabled, so endpoint detection runs automatically. If peers are in the same subnet, the subnet IP replaces the public IP, on the assumption that the internal IP performs better. |
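As a quick way to spot peers whose endpoint was replaced by an internal address, the sketch below flags endpoints that fall in the RFC 1918 private ranges. It assumes endpoints in the "host:port" form printed by "wg show". The helper names and the choice of ranges are illustrative assumptions, and a k8s cluster network using other ranges would need to be added to the list.

```python
import ipaddress

# RFC 1918 private IPv4 ranges; extend this list if your cluster uses other ranges.
PRIVATE_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def endpoint_host(endpoint):
    """Return the host part of 'host:port' or '[ipv6]:port'."""
    host, _, _port = endpoint.rpartition(":")
    return host.strip("[]")


def private_endpoints(peers):
    """Given (name, endpoint) pairs, return (name, ip) for endpoints in a private range."""
    flagged = []
    for name, endpoint in peers:
        if endpoint == "(none)":
            # wg prints "(none)" when no endpoint is known for the peer.
            continue
        ip = ipaddress.ip_address(endpoint_host(endpoint))
        if any(ip in network for network in PRIVATE_NETWORKS):
            flagged.append((name, str(ip)))
    return flagged
```

A peer flagged here is not necessarily broken. It only means the endpoint is an internal address, which works only if the two hosts can actually reach each other on that network.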
Contact Details
No response
What happened?
Suddenly unable to ping between nodes.
Version
v0.24.2
What OS are you using?
No response
Relevant log output
No response
Contributing guidelines