[BUG] AKS 1.32 has coredns 1.12 which has a breaking change! #4843

Open
Michael-Sinz opened this issue Mar 6, 2025 · 2 comments

Michael-Sinz commented Mar 6, 2025

Describe the bug
With AKS 1.32, the bundled version of coredns is now 1.12, which includes a breaking change: coredns/coredns#6898

The effect of that change is that coredns no longer generates the pod-specific DNS entry for pods of a stateless service.

Many services may not notice, since they just use the standard kubernetes loadbalancer. Our service, however, has a much more complex load balancing requirement because individual requests are vastly non-uniform: the time/cost of a request varies over roughly 4 orders of magnitude and is not knowable a priori.

Anyway, I think the coredns team accepts that this broke the prior contract. We reported it here: coredns/coredns#7177

To Reproduce
Create a stateless service with a deployment/replicaset. Try a reverse DNS lookup by each pod's IP address. In AKS 1.32 this fails; in all prior AKS versions it worked. Check that each pod of the replicaset returns a unique value for its own IP address.
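The check described above can be sketched roughly as follows. This is only an illustration, not the script we actually run: the helper names (`reverse_lookup`, `check_pods`, `all_unique_and_resolved`) are made up for this example, and the pod IPs are assumed to have been collected separately (for instance from `kubectl get pods -o wide`). It uses Python's standard `socket.gethostbyaddr`, so it has to run from somewhere that uses the cluster DNS, such as inside a pod.

```python
import socket
from typing import Callable, Dict, Iterable, Optional


def reverse_lookup(ip: str, resolver: Callable = socket.gethostbyaddr) -> Optional[str]:
    """Return the name from a reverse (PTR) lookup of ip, or None if it fails."""
    try:
        hostname, _aliases, _addresses = resolver(ip)
        return hostname
    except (socket.herror, socket.gaierror):
        # No PTR record (or resolution error): this is the failure seen on AKS 1.32.
        return None


def check_pods(pod_ips: Iterable[str],
               resolver: Callable = socket.gethostbyaddr) -> Dict[str, Optional[str]]:
    """Map every pod IP to the name its reverse lookup returned (None on failure)."""
    return {ip: reverse_lookup(ip, resolver) for ip in pod_ips}


def all_unique_and_resolved(results: Dict[str, Optional[str]]) -> bool:
    """True only if every IP resolved and no two IPs share the same name."""
    names = list(results.values())
    return None not in names and len(set(names)) == len(names)


# Example (hypothetical IPs; run from inside the cluster):
#   results = check_pods(["10.244.1.12", "10.244.2.7"])
#   print(results, all_unique_and_resolved(results))
```

On a cluster where the old behavior holds, `all_unique_and_resolved` would be expected to return `True` for the pods of one replicaset; on AKS 1.32 the lookups fail, so the values come back as `None`.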

Expected behavior
Each service pod, once ready/healthy, has a DNS entry specific to that pod.

Additional context
The workaround we use so that we can continue to test in AKS 1.32 is definitely not seamless - it requires changes to the helm charts (deployment/pod specs) and a new custom admission controller that sets a unique hostname field for each pod instance of the services that need this. None of this is a documented change in kubernetes or AKS; it is a side effect of the change that broke the behavior of coredns.

You can see that in the comment on the bug: coredns/coredns#7177 (comment)
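To give a rough idea of the admission-controller half of that workaround, here is a minimal sketch of a mutating webhook handler that adds a `spec.hostname` to pods that do not already have one. It is not our actual controller: the function names and the naming scheme (the pod's `generateName` prefix plus a random suffix) are assumptions made for illustration, and it only sets the field mentioned above. It says nothing about which services should be matched or what else a given cluster may need. The `AdmissionReview` response shape (`uid`, `allowed`, `patchType: JSONPatch`, base64-encoded `patch`) follows the standard `admission.k8s.io/v1` API.

```python
import base64
import json
import secrets


def unique_hostname(pod: dict) -> str:
    """Build a DNS-label-safe hostname (hypothetical scheme: prefix + random suffix)."""
    meta = pod.get("metadata", {})
    # Pods created by a replicaset usually carry generateName; name may still be empty.
    prefix = (meta.get("generateName") or meta.get("name") or "pod").rstrip("-")
    suffix = secrets.token_hex(4)  # 8 random hex characters
    # A DNS label is at most 63 characters, so trim the prefix to leave room.
    prefix = prefix[: 63 - len(suffix) - 1].rstrip("-") or "pod"
    return f"{prefix}-{suffix}".lower()


def build_review_response(review: dict) -> dict:
    """Return an AdmissionReview response that sets spec.hostname when it is missing."""
    request = review["request"]
    pod = request["object"]
    patch = []
    if not pod.get("spec", {}).get("hostname"):
        patch.append({"op": "add", "path": "/spec/hostname",
                      "value": unique_hostname(pod)})
    response = {"uid": request["uid"], "allowed": True}
    if patch:
        response["patchType"] = "JSONPatch"
        response["patch"] = base64.b64encode(json.dumps(patch).encode()).decode()
    return {"apiVersion": "admission.k8s.io/v1",
            "kind": "AdmissionReview",
            "response": response}
```

A real webhook would also need an HTTPS server, a `MutatingWebhookConfiguration`, and a selector limiting it to the services that need per-pod names, which is part of why the workaround is not seamless.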

Michael-Sinz (Author) commented

Note that switching to statefulsets is not a viable option - our services scale to thousands of pods per service/replicaset, and we have hundreds of unique microservices (replicasets) that scale independently. Not all of them scale to the same size at the same time, but some number of them hit very large scale during peak usage.

sjwaight (Contributor) commented Mar 6, 2025

Duplicate of #4823.

robbiezhang self-assigned this Mar 7, 2025