
Revert #3588 "IPv6 internal node IPs are usable externally" #4574

Open

wants to merge 1 commit into master

Conversation

jonasbadstuebner
Contributor

Description

This reverts commit 683663e from #3588.
As stated in #4566, the behavior of IPv4 and IPv6 addresses should not differ the way it currently does.
This PR makes external-dns respect the difference between a Kubernetes internal and external Node-IP (again).

The tests are updated so that they check both InternalIP and ExternalIP where it makes sense to do so.

Reasoning

The fact that IPv6 addresses may be reachable from global networks is not a good enough reason to remove the ability to keep IPv6 DNS records from being created.
If you want external-dns to create AAAA records for your "private" IPv6 addresses, you should mark them as ExternalIP; external-dns should not create them unconditionally.
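To illustrate the intended behavior after the revert, here is a minimal Go sketch against the k8s.io/api types (illustrative only, not the actual external-dns selection code; the addresses are the example values from the discussion below):

package main

import (
	"fmt"
	"net"

	corev1 "k8s.io/api/core/v1"
)

// externalIPs keeps only the addresses Kubernetes reports as ExternalIP,
// regardless of IP family, so IPv4 and IPv6 are treated alike.
func externalIPs(node *corev1.Node) []string {
	var out []string
	for _, addr := range node.Status.Addresses {
		if addr.Type != corev1.NodeExternalIP {
			continue
		}
		if ip := net.ParseIP(addr.Address); ip != nil {
			out = append(out, ip.String())
		}
	}
	return out
}

func main() {
	node := &corev1.Node{Status: corev1.NodeStatus{Addresses: []corev1.NodeAddress{
		{Type: corev1.NodeExternalIP, Address: "12.3.123.12"},
		{Type: corev1.NodeExternalIP, Address: "2a42:abcd:0::1"},
		{Type: corev1.NodeInternalIP, Address: "fd42:ffff:ffff::1"}, // ULA, stays out of DNS
	}}}
	fmt.Println(externalIPs(node)) // [12.3.123.12 2a42:abcd::1]
}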

Fixes #4566

Checklist

  • Unit tests updated
  • End user documentation updated

Additional Notes

The next release after this PR should include an "Important Changes" section mentioning this change, even though v0.13.5 did not have one for #3588. IMO it should have had one too.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 25, 2024
@k8s-ci-robot
Contributor

Hi @jonasbadstuebner. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jun 25, 2024
@johngmyers
Contributor

Kubernetes does not publish IPv6 addresses as external node IPs; it only publishes them as internal. Reverting this commit would result in IPv6 addresses never being published as external.

If you don't want the addresses published externally, don't request the nodes to be published externally.

I object to this PR.
/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 3, 2024
@jonasbadstuebner
Contributor Author

I am unsure what you mean by "Kubernetes publishes IPs". Could you explain what you mean by that?
Because if the cloud-controller-manager you use does not set the v6 node IPs as "ExternalIP", that is not a problem external-dns should try to fix; rather, it should become a new feature in the CCM. Technically it is possible to have multiple ExternalIPs, so this is not a limitation we have to overcome.

The official documentation on Node addresses states that an InternalIP is "typically" only reachable from within the cluster.
We have our own CCM, which sets both internal and external IPv6 addresses, but only the ExternalIPs should get a DNS entry. Currently it is impossible to disable DNS entries for InternalIPs because of your PR.

The reasoning behind your change is not transparent to me; I don't understand why external-dns should behave this way.
Please explain it a bit more.

@mloiseleur
Contributor

@jonasbadstuebner Wdyt of PR #4593? Would that solve your issue?
@johngmyers Nice to see you here. Do you see an issue with PR #4593? Feel free to review it :)

@sebastiangaiser

sebastiangaiser commented Jul 5, 2024

@mloiseleur I'm not sure if your PR fixes this, but as stated in #3588 (comment), it is currently not possible to differentiate between internal and external IPv6 addresses for e.g. NodePorts or the K8s Node itself.
This leads to DNS entries for a NodePort containing both a "private" and a "public" IPv6 address, as stated here in the docs. But is this really good behaviour? I think it could make sense to give users the possibility to use either the "public" or the "private" IP for a NodePort.
Additionally, this behavior differs between IPv4 and IPv6, which might be confusing for users.

@jonasbadstuebner
Contributor Author

Thank you for the reply, @mloiseleur. As @sebastiangaiser said, this is not the issue we are having.

We want AAAA records, but only for IPv6 addresses that are ExternalIP Node Addresses per Kubernetes definition.

But external-dns does not give us the ability to disable AAAA records for InternalIPs, so we always get (pseudo output, of course):

$> host node01
node01 has address 12.3.123.12
node01 has IPv6 address 2a42:abcd:0::1
node01 has IPv6 address fd42:ffff:ffff::1

Where the node has:

Address            Type        Note
12.3.123.12        ExternalIP
2a42:abcd:0::1     ExternalIP
fd42:ffff:ffff::1  InternalIP  << This is where our problem is; this IP should not become an AAAA record, but external-dns does not allow us to keep only the other two DNS records

If someone wants to have public DNS records created for Nodes, they should set the respective IP as ExternalIP on the Node status.

For a DNS record to be created, it should not be enough that the IP is simply an IPv6 address. If you have a different opinion, please elaborate.

And as I said, a CCM not adding the IPv6 address as ExternalIP to the Nodes is not a problem that should be fixed here (in external-dns) but in the manager's code.
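A hedged sketch of that CCM-side fix (illustrative Go against the k8s.io/api types; the function name and values are made up for this example): a CCM that knows an IPv6 address is globally reachable reports it as NodeExternalIP itself, rather than relying on external-dns to publish InternalIPs.

package ccm

import corev1 "k8s.io/api/core/v1"

// nodeAddresses reports a globally reachable IPv6 address as NodeExternalIP
// instead of leaving it as NodeInternalIP, so consumers like external-dns can
// tell the two apart. Multiple ExternalIPs on one node are allowed.
func nodeAddresses(publicV4, publicV6, privateV6 string) []corev1.NodeAddress {
	return []corev1.NodeAddress{
		{Type: corev1.NodeExternalIP, Address: publicV4},
		{Type: corev1.NodeExternalIP, Address: publicV6},
		{Type: corev1.NodeInternalIP, Address: privateV6},
	}
}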

@mloiseleur
Contributor

mloiseleur commented Aug 9, 2024

@jonasbadstuebner I understand your reasoning.

Since some users will want to publish it, as described in #1875, and others (like you) do not, wdyt of introducing a flag for this?

@johngmyers would that solve your objection ?

@jonasbadstuebner
Contributor Author

@mloiseleur Thank you for the reply,
I think you misunderstood my problem.

The linked issue does not say anything about what kind of IPv6 should be exposed.

@mloiseleur
Contributor

mloiseleur commented Aug 9, 2024

Sorry, my message was not clear.

You're right: the linked issue does not say anything about what kind of IPv6 should be exposed.
What I observe in this issue is that none of the participants objected to the implementation made in #3588.

Maybe I'm wrong, but I'm guessing from this fact that some users are fine with publishing the internal node IPv6. As @johngmyers said in #3588, it can happen that IPv6 node addresses are reported as type NodeInternalIP despite being usable both internally and externally.

@jonasbadstuebner
Contributor Author

jonasbadstuebner commented Aug 9, 2024

My reasoning is: if this is the case for your cluster, where the IPv6 marked as InternalIP is usable externally and you want to use it like that, you should make it an ExternalIP. Users whose nodes have IPv6 addresses that are not usable externally currently have no way of preventing a DNS record from being created for them.

I am repeating myself; please read my comments in this PR and also the arguments in the linked issue #4566.

@mloiseleur
Contributor

Thanks for taking the time.

I double-checked, and it seems there is nothing that forbids a user from marking the IP correctly.

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Aug 9, 2024
@jonasbadstuebner
Contributor Author

Thank you very much.
Is there something I can or have to do to get this PR marked as an "Important Change" in the release notes, if it gets merged?

Because this changes the behavior of external-dns and might cause outages for users if they don't double-check that their setup has the correct type set for their node IPs.

@Raffo
Contributor

Raffo commented Aug 9, 2024

@johngmyers can you please provide a more detailed answer to the comments that followed your last one? I'd love to better understand both your point of view and that of the author of this PR.

@szuecs
Contributor

szuecs commented Aug 12, 2024

It could be that users of the feature (maybe the wrong feature!?) do not control the environment, so they have no choice.
What if you provide a PR that is backwards compatible, adding an annotation to only enable external IPs?

From my side, as already stated a bunch of times, I personally think we should just drop DNS for pods/nodes. I won't actively do this, but IMO external-dns should not be the tool that provides all kinds of DNS workarounds for every application that exists. It just adds complexity on our side, time is limited, and external-dns was clearly built around Ingress (and Services of type LoadBalancer), so everything that is similar to Ingress will work reliably and the rest is likely buggy (we can see this in all kinds of PRs, issues, and questions in chat, ...).

@jonasbadstuebner
Contributor Author

@szuecs Please give me and everyone else involved or interested a good explanation of why the feature was approved as it is right now.
You must have had a reason to approve and lgtm it.

I think "users might not be able to change this" is not a good reason for an annotation as a feature flag, because if you can annotate the Node, you should also be able to change the IP type. If you are not, you should not be the one running external-dns for Node DNS entries in the first place.
That being said, I think it should be a breaking change, since the current behavior is lacking a good reason to exist.

Dropping support completely would force users of this feature to migrate to a completely different tool or have multiple different tools for handling DNS entries. The first option is probably not in the interest of external-dns and the second one would be way harder to operate.
I think my use-case is not a workaround. But please provide a better way of implementing it, if you think so.

@szuecs
Contributor

szuecs commented Aug 12, 2024

@szuecs Please give me and everyone else involved or interested a good explanation of why the feature was approved as it is right now.

You must have had a reason to approve and lgtm it.

Please be friendly. Right now it seems that you are not.

I approved the feature because the code looked fine and John was doing major work on the IPv6/AAAA part.

Maybe he was testing clusters of a type (likely AWS, because he is a kops maintainer) that had the behavior he described.
I trusted that he observed what he explicitly stated. Back then IPv6 support was new here and, as far as I remember, in Kubernetes too. So maybe it was a bug in the version he tested.

I think "users might not be able to change this" is not a good reason for an annotation as a feature flag, because if you can annotate the Node, you should also be able to change the IP type. If you are not, you should not be the one running external-dns for Node DNS entries in the first place.

Again please try to be friendly even if you are disappointed.

I meant that externalIP vs. internalIP is not under user control if you think of the user as a cluster admin. You seem to be a provider user; therefore you control these fields. Cluster admins don't.
I hope you understand what I meant and that we can clarify whatever we (me and you) are not understanding from each other.

That being said, I think it should be a breaking change, since the current behavior is lacking a good reason to exist.

I agree that a breaking change is bad, and I didn't notice it back then.
"Lacking good reason to exist" seems to be more of the unfriendly response that I would prefer not to have to read. I think the author did the work to the best of his knowledge at the time. You come today, but Kubernetes one year ago was different, and maybe it was fixed in the CCM; I don't know, but the original PR seemed to have a good intention.

Dropping support completely would force users of this feature to migrate to a completely different tool or have multiple different tools for handling DNS entries. The first option is probably not in the interest of external-dns and the second one would be way harder to operate.

As I said earlier I will not do it.

I think my use-case is not a workaround. But please provide a better way of implementing it, if you think so.

That's a fair question, and we do support the node source.
I guess we will revert as you suggested.
Right now you can also change your CCM in case it's urgent, or use an old external-dns version as a short-term mitigation.

@jonasbadstuebner
Contributor Author

jonasbadstuebner commented Aug 12, 2024

I am sorry that my pragmatic approach comes across as unfriendly and that I seem to not hide my frustration very well.
I am looking for an answer and pushing towards it, not saying "you should have done better". Also, I wrote the reply in a hurry; I probably shouldn't have done that.

I meant that externalIP vs. internalIP is not under user control if you think of the user as a cluster admin. You seem to be a provider user; therefore you control these fields. Cluster admins don't.

You are right that I am the provider part. And probably I don't know every use case of external-dns.

If there is a use case for InternalIPs being published, it should be covered. IMO there is only the one use case where a Node has only InternalIPs set, and these are already published in that case.
As soon as a provider sets an ExternalIP on a node, that one should be used.

Whatever the solution to this specific problem is or might be - it should be the same behavior for IPv4 and IPv6.

If the AWS CCM sets only the IPv4 as ExternalIP and the publicly reachable IPv6 as InternalIP, that would match the problem John was seeing. But I don't use AWS and never did, so I wouldn't know.

Maybe it would be an idea to run the same fallback logic for IPv4 and IPv6 separately, meaning if you have:

IP         Type
1.2.3.4    ExternalIP
10.0.0.1   InternalIP
1:2:3::4   InternalIP

It should lead to the external IPv4 being published - because ExternalIP is chosen over InternalIP - and the internal IPv6, since there is no public v6 one.
If you want, I can change my code to use that (see the sketch below). This would fix my issue, since I have an external IPv6 set that would be used instead of the internal IPv6. And it would not break for users that have the use case I assume @johngmyers had when implementing the other PR.

But then it would make sense to have a flag that chooses the running mode of external-dns, so that I can choose to run it in IPv4-only mode even though I have an IPv6 set, and vice versa, also with an option for having both.
I don't know if this currently exists, but I could implement it in this PR if this sounds like a good solution to you.
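A minimal Go sketch of that per-family fallback (an illustration of the proposal against the k8s.io/api types, not merged external-dns code):

package main

import (
	"fmt"
	"net"

	corev1 "k8s.io/api/core/v1"
)

// pickPerFamily applies the ExternalIP-over-InternalIP preference separately
// for IPv4 and IPv6: each family falls back to an InternalIP only when it has
// no ExternalIP at all.
func pickPerFamily(addrs []corev1.NodeAddress, wantV6 bool) string {
	var internal string
	for _, a := range addrs {
		ip := net.ParseIP(a.Address)
		if ip == nil || (ip.To4() == nil) != wantV6 {
			continue // skip unparsable addresses and the other family
		}
		switch a.Type {
		case corev1.NodeExternalIP:
			return a.Address // ExternalIP always wins within its family
		case corev1.NodeInternalIP:
			if internal == "" {
				internal = a.Address
			}
		}
	}
	return internal
}

func main() {
	addrs := []corev1.NodeAddress{
		{Type: corev1.NodeExternalIP, Address: "1.2.3.4"},
		{Type: corev1.NodeInternalIP, Address: "10.0.0.1"},
		{Type: corev1.NodeInternalIP, Address: "1:2:3::4"},
	}
	fmt.Println(pickPerFamily(addrs, false)) // 1.2.3.4 (external v4 wins)
	fmt.Println(pickPerFamily(addrs, true))  // 1:2:3::4 (no external v6, fall back)
}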

Lacking good reason to exist

I am sorry about that.

I think the author did the work to the best of his knowledge at the time. You come today, but Kubernetes one year ago was different, and maybe it was fixed in the CCM; I don't know, but the original PR seemed to have a good intention.

I am sure of it. And as I said already, I never wanted to discredit any work done.
Also, I too am guilty of changing code because I saw it as the logical thing to do for me, without knowing about other use cases or that the world might look different for everyone.
In this specific case, I know that you know more about external-dns than I do. So I was asking for an answer, and then got an answer that I didn't quite understand, one most likely also based on the assumption made when merging the other PR. And I was misunderstood, even though I think my comments were pretty clear about what my problem is. This is where my frustration comes from: simply communication.

To me it seems like I was just "left on read" whilst having a valid point.

I guess we will revert as you suggested.
Right now you can also change your CCM in case it's urgent, or use an old external-dns version as a short-term mitigation.

I deployed the version of this PR to fix my problem.
And I don't want this PR to be the next one someone wants to revert because they don't understand the change or it doesn't make sense to them, so let's not merge it too quickly.
(Sorry about the "doesn't make sense"; I understand that this could again come across as unfriendly. It is not meant that way.)

@szuecs
Contributor

szuecs commented Aug 13, 2024

We are discussing external-dns change in https://kubernetes.slack.com/archives/C771MKDKQ/p1723531094274339
and the InternalIP/ExternalIP topic was discussed in https://kubernetes.slack.com/archives/C09QYUH5W/p1723197933110369 .
sig-network tech-leads said that node InternalIP/ExternalIP has no real meaning and different cloud providers might use it differently.

John explicitly stated that he needs InternalIP to support AWS, which is likely the main case he was developing the PR for. The PR is more than a year old, and I don't believe we should just revert it if there was no issue until now.

My current suggestion is that we could have a provider-specific override for the handling of these node IPs. This would enable providers to do the right thing based on provider specifics, which could also include provider-specific options. It will be interesting to see how we fit this into the webhook provider, but there are already ways to pass provider specifics there, so I think we can do this. A rough sketch of what such a hook could look like follows.
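A rough Go sketch of that idea (purely hypothetical; no such interface exists in external-dns today, and the names are invented for illustration):

package provider

import corev1 "k8s.io/api/core/v1"

// NodeAddressPolicy is a hypothetical per-provider hook: a provider that knows
// its cloud reports externally usable IPv6 addresses as InternalIP could
// override the selection, while all other providers keep the default.
type NodeAddressPolicy interface {
	SelectNodeAddresses(addrs []corev1.NodeAddress) []string
}

// defaultPolicy keeps the standard behavior: publish only ExternalIPs.
type defaultPolicy struct{}

func (defaultPolicy) SelectNodeAddresses(addrs []corev1.NodeAddress) []string {
	var out []string
	for _, a := range addrs {
		if a.Type == corev1.NodeExternalIP {
			out = append(out, a.Address)
		}
	}
	return out
}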

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 6, 2024
@k8s-ci-robot
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign mloiseleur for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@iakat

iakat commented Sep 20, 2024

Please give us the ability to not expose IPv6 InternalIPs, and/or revert this change.

As it is, using ExternalIP for routable addresses and InternalIP for the cluster's ULA addresses results in the ULA IPv6 being added to the DNS record.

@jonasbadstuebner
Contributor Author

I totally agree with #4808 (comment).
This should be rediscussed. I don't like the current approach and don't think it should be implemented with a custom flag.
It should be a breaking change and should be done the other way around: opt in to this behavior for AWS, if you need it.

@sebastiangaiser

@szuecs @mloiseleur @Raffo please consider the breaking change, as this behavior is unexpected from a user perspective. I totally respect decisions from the past, but products/tools need to evolve in order not to be replaced at some point.
Personally, I really like using "upstream" Kubernetes (SIG) products like external-dns and others, as they generally have the most generic support.
