Skip to content

Enable DNS communication between redis#40

Open
prashanna-frsh wants to merge 15 commits into
freshworks-oss:masterfrom
prashanna-frsh:task/use-dns-name
Open

Enable DNS communication between redis#40
prashanna-frsh wants to merge 15 commits into
freshworks-oss:masterfrom
prashanna-frsh:task/use-dns-name

Conversation

@prashanna-frsh
Copy link
Copy Markdown

@prashanna-frsh prashanna-frsh commented Jan 8, 2026

Overview

This PR contains changes made to add headless service support to the Redis operator, allowing Redis replicas to be configured using DNS names instead of PodIPs.

Reason for Change

  1. Pod IP Instability: When Redis replicas are configured using PodIPs, pod restarts result in new IP addresses. This requires reconfiguration of all replicas pointing to the restarted pod, leading to:

    • Temporary replication failures during pod restarts
    • Increased operator reconciliation overhead
    • Potential data inconsistency windows during failover scenarios
  2. StatefulSet DNS Naming: Kubernetes StatefulSets provide stable DNS names for each pod in the format <pod-name>.<service-name>.<namespace>.svc.cluster.local. These DNS names remain constant across pod restarts, providing a stable network identity that aligns with Kubernetes best practices for stateful workloads.

  3. Network Identity Persistence: Using DNS names instead of IPs ensures that:

    • Redis replicas can maintain stable connections to their masters
    • Replication relationships survive pod restarts without requiring operator intervention
    • The cluster topology remains consistent even when individual pods are recreated

Feature Description

When headless: true is enabled in the RedisFailover spec, the operator:

  • Creates a headless service (ClusterIP: None) for Redis pods
  • Uses DNS names in the format <statefulset-name>-<ordinal-number>.<service-name>.<namespace>.svc.cluster.local to configure Redis replicas
  • Uses DNS names for Redis-to-Redis connections (slaveof commands)
  • Uses IP addresses for Sentinel monitoring (as required by Redis Sentinel)

@prashanna-frsh
Copy link
Copy Markdown
Author

@pratheep-kumar @Sasidharan-Gopal Can you review this?

Comment thread api/redisfailover/v1/types.go Outdated
for _, rp := range rps.Items {
if rp.Status.PodIP == master {
podAddress := GetPodAddress(&rp, rf)
if podAddress == master {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we have this is a enum or static variable rather than a dynamic string?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

podAddress will be different for different clusters, don't think we'll be able to have it as ENUM or static variable

<statefulset-name>-<ordinal-number>.<service-name>.<namespace>.svc.cluster.local

}
// Compare master with what Redis returns
// Redis may return either DNS name or IP depending on how it was configured
if slave != "" && slave != master {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: same as above comment

Comment thread operator/redisfailover/service/check.go Outdated
start := redisNode.Status.StartTime.Round(time.Second)
alive := time.Since(start)
r.logger.Debugf("Pod %s has been alive for %.f seconds", redisNode.Status.PodIP, alive.Seconds())
podAddress := GetPodAddress(&redisNode, rf)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we are doing getPodAddress() call at many places and doing computation each and every time. Is it possible to do it in a base funtion and pass it around?

Comment thread operator/redisfailover/service/check.go
}
for _, rp := range rps.Items {
if rp.Status.PodIP == ip {
podAddress := GetPodAddress(&rp, rf)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we dont need getPodAddress() here right? rp.Status.PodIP this itself is enough

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when the disableIPMode is enabled, we'll have the hostname instead of IP address. in that case we'll need to GetPodAddress()

r.logger.WithField("redisfailover", rf.ObjectMeta.Name).WithField("namespace", rf.ObjectMeta.Namespace).Infof("Making pod %s slave of %s", pod.Name, masterIP)
if err := r.redisClient.MakeSlaveOfWithPort(pod.Status.PodIP, masterIP, port, password); err != nil {
r.logger.WithField("redisfailover", rf.ObjectMeta.Name).WithField("namespace", rf.ObjectMeta.Namespace).Errorf("Make slave failed, slave ip: %s, master ip: %s, error: %v", pod.Status.PodIP, masterIP, err)
if err := r.redisClient.MakeSlaveOfWithPort(podAddress, masterIP, port, password); err != nil {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can use the getPodAddress function only where the dns is required otherwise it will be too much of unnecessary changes. @samof76 any thoughts?

Comment thread operator/redisfailover/service/names.go Outdated
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants