Skip to content

Conversation

qiujian16
Copy link
Member

@qiujian16 qiujian16 commented Sep 25, 2025

Summary

Related issue(s)

Fixes #

Summary by CodeRabbit

  • Refactor

    • Switched server-side apply logging to structured logs with consistent fields for clearer diagnostics.
    • Adopted typed rate limiter in manifest controllers for safer, more predictable queuing behavior.
    • Introduced jittered resync scheduling to smooth requeue bursts and improve workload distribution.
    • Improved controller sync and finalization flows for more consistent reconciliation timing and observability.
  • Tests

    • Updated tests to use the typed rate limiter to match controller changes.

Copy link

coderabbitai bot commented Sep 25, 2025

Walkthrough

Updated logging in server-side apply to use structured logger; migrated workqueue rate limiters to typed TypedRateLimiter[string] and NewTypedItemExponentialFailureRateLimiter[string]; adjusted manifest controller to use bare informers, a sync context, and jittered requeue timing.

Changes

Cohort / File(s) Summary of changes
Structured logging in server-side apply
pkg/work/spoke/apply/server_side_apply.go
Replaced Eventf-based patch logging with a structured logger.Info call emitting fields: kind, key, fieldManager, and error.
Typed rate limiter in applied manifest reconciler
pkg/work/spoke/controllers/manifestcontroller/appliedmanifestwork_reconciler.go, pkg/work/spoke/controllers/manifestcontroller/appliedmanifestwork_reconciler_test.go
Changed rateLimiter field type to workqueue.TypedRateLimiter[string] and replaced NewItemExponentialFailureRateLimiter(...) with NewTypedItemExponentialFailureRateLimiter[string](...).
Manifest work controller: informers, sync context, jittered requeue
pkg/work/spoke/controllers/manifestcontroller/manifestwork_controller.go
Switched to bare informers (WithBareInformers), added a sync context (WithSyncContext), adjusted informer handlers, reduced reconcile log verbosity, replaced fixed ResyncInterval with wait.Jitter(ResyncInterval, 0.3), and minor PatchStatus return handling.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Title Check ⚠️ Warning The title claims that the PR implements a skip-apply behavior when the computed patch is empty, but the actual changes described focus on structured logging substitutions, rate limiter typing, and controller logic adjustments without any mention of skipping apply based on patch content. Please update the title to accurately reflect the implemented changes or adjust the code to match the stated behavior of skipping apply when the computed patch is empty.
Description Check ⚠️ Warning The pull request description includes only the template headings without any substantive content in the Summary or Related issue(s) sections, leaving required information entirely blank. Please fill in the Summary with a concise description of the changes and their purpose, and add any related issue numbers in the Related issue(s) section.
✅ Passed checks (1 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🧪 Early access (Sonnet 4.5): enabled

We are currently testing the Sonnet 4.5 model, which is expected to improve code review quality. However, this model may lead to increased noise levels in the review comments. Please disable the early access features if the noise level causes any inconvenience.

Note:

  • Public repositories are always opted into early access features.
  • You can enable or disable early access features from the CodeRabbit UI or by updating the CodeRabbit configuration file.

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
pkg/work/spoke/controllers/manifestcontroller/manifestwork_reconciler_test.go (1)

579-581: Fix: expect GET before PATCH in SSA conflict test

With the new pre‑GET in SSA, this test should also expect a GET before the PATCH.

Apply this diff:

-        .withExpectedDynamicAction("patch").
+        .withExpectedDynamicAction("get", "patch").
pkg/work/spoke/apply/server_side_apply.go (1)

64-81: Bug: ignore-field hash/annotation not set on create; breaks future skip-on-spoke-change

Currently mutateRequiredIgnoreFields runs only when existing is found, so newly created resources won’t get the spec-hash annotation. Subsequent spoke-side changes to ignored fields won’t be skippable by hash, causing unintended applies. Compute/annotate the hash on create as well, while still only removing ignored fields on update.

Apply this diff:

@@
-    // get the existing object at first.
-    existing, err := c.client.Resource(gvr).Namespace(required.GetNamespace()).Get(
-        ctx, required.GetName(), metav1.GetOptions{})
-    if err != nil && !errors.IsNotFound(err) {
-        return nil, err
-    } else if err == nil {
-        required, requiredHash = mutateRequiredIgnoreFields(required, applyOption, logger)
-        // compare hash if the hash is computed and the existing object is not nil.
-        if len(requiredHash) > 0 && len(existing.GetAnnotations()) > 0 {
-            // skip the apply operation when the hash of the existing resource matches the required hash
-            existingHash := existing.GetAnnotations()[workapiv1.ManifestConfigSpecHashAnnotationKey]
-            if requiredHash == existingHash {
-                // still needs to apply ownerref since it might be changed due to deleteoption update.
-                err := helper.ApplyOwnerReferences(ctx, c.client, gvr, existing, owner)
-                return existing, err
-            }
-        }
-    }
+    // get the existing object at first.
+    existing, err := c.client.Resource(gvr).Namespace(required.GetNamespace()).
+        Get(ctx, required.GetName(), metav1.GetOptions{})
+    if err != nil && !errors.IsNotFound(err) {
+        return nil, err
+    }
+    existingFound := err == nil
+
+    // Always compute/annotate hash; only remove ignored fields when updating an existing object.
+    required, requiredHash = mutateRequiredIgnoreFields(required, applyOption, logger, existingFound)
+    if existingFound && len(requiredHash) > 0 && len(existing.GetAnnotations()) > 0 {
+        // skip the apply operation when the hash of the existing resource matches the required hash
+        existingHash := existing.GetAnnotations()[workapiv1.ManifestConfigSpecHashAnnotationKey]
+        if requiredHash == existingHash {
+            // still needs to apply ownerref since it might be changed due to deleteoption update.
+            err := helper.ApplyOwnerReferences(ctx, c.client, gvr, existing, owner)
+            return existing, err
+        }
+    }
@@
-func mutateRequiredIgnoreFields(required *unstructured.Unstructured, applyOption *workapiv1.ManifestConfigOption, logger klog.Logger) (*unstructured.Unstructured, string) {
+func mutateRequiredIgnoreFields(required *unstructured.Unstructured, applyOption *workapiv1.ManifestConfigOption, logger klog.Logger, existingFound bool) (*unstructured.Unstructured, string) {
     var requiredHash string
 
-    if applyOption.UpdateStrategy.ServerSideApply != nil {
+    if applyOption != nil && applyOption.UpdateStrategy.ServerSideApply != nil {
         ignoreFields := applyOption.UpdateStrategy.ServerSideApply.IgnoreFields
         if len(ignoreFields) > 0 {
-            for _, field := range ignoreFields {
-                // for IgnoreFieldsConditionOnSpokeChange, it will still be included when computing the hash. So when
-                // hash dismatch, these fields will still be patched on the cluster.
-                if field.Condition == workapiv1.IgnoreFieldsConditionOnSpokeChange {
-                    continue
-                }
-                for _, path := range field.JSONPaths {
-                    removeFieldByJSONPath(required.UnstructuredContent(), path, logger)
-                }
-            }
+            if existingFound {
+                for _, field := range ignoreFields {
+                    // IgnoreFieldsConditionOnSpokeChange stays included; others are removed for update comparisons.
+                    if field.Condition == workapiv1.IgnoreFieldsConditionOnSpokeChange {
+                        continue
+                    }
+                    for _, path := range field.JSONPaths {
+                        removeFieldByJSONPath(required.UnstructuredContent(), path, logger)
+                    }
+                }
+            }
             requiredHash = hashOfResourceStruct(required)
             annotation := required.GetAnnotations()
             if annotation == nil {
                 annotation = map[string]string{}
             }
             annotation[workapiv1.ManifestConfigSpecHashAnnotationKey] = requiredHash
             required.SetAnnotations(annotation)
         }
     }
 
     return required, requiredHash
 }

Also applies to: 123-150

🧹 Nitpick comments (4)
pkg/work/spoke/apply/server_side_apply_test.go (1)

600-676: Good coverage for shouldPatch; consider a list-merge case

Add a case where only list ordering changes to document current behavior (merge patch often treats list changes as replace).

pkg/work/spoke/apply/server_side_apply.go (2)

90-101: Pre‑patch skip logic: good; add GVR to log for triage

Optional: include GVR and namespace in the skip log for easier debugging.

Proposed tweak:

  • logger.V(2).Info("Skip patching object", "gvr", gvr, "namespace", required.GetNamespace(), "name", required.GetName())

209-218: Stabilize hashing to avoid map‑order nondeterminism

fmt.Sprintf over maps is not stable; across restarts, hash mismatches can spuriously force applies. Consider a canonical, key‑sorted serialization (e.g., spew with SpewKeys=true, or a canonical JSON implementation) before hashing.

test/integration/work/updatestrategy_test.go (1)

900-909: Strengthen no-op apply assertion by also checking ResourceVersion remains unchanged.

Generation alone may not change on a no-op server-side apply. Comparing ResourceVersion makes the test more robust against unnoticed updates.

Apply this diff within the test to also assert ResourceVersion is stable:

-        // Get the deployment's current generation
-        var originalGeneration int64
+        // Get the deployment's current generation and resourceVersion
+        var originalGeneration int64
+        var originalResourceVersion string
         gomega.Eventually(func() error {
             deploy, err := spokeKubeClient.AppsV1().Deployments(clusterName).Get(context.Background(), "deploy1", metav1.GetOptions{})
             if err != nil {
                 return err
             }
             originalGeneration = deploy.Generation
+            originalResourceVersion = deploy.ResourceVersion
             return nil
         }, eventuallyTimeout, eventuallyInterval).ShouldNot(gomega.HaveOccurred())
@@
-        // Verify deployment generation hasn't changed (no unnecessary apply was performed)
+        // Verify deployment generation and resourceVersion haven't changed (no unnecessary apply was performed)
         gomega.Consistently(func() error {
             deploy, err := spokeKubeClient.AppsV1().Deployments(clusterName).Get(context.Background(), "deploy1", metav1.GetOptions{})
             if err != nil {
                 return err
             }
 
             if deploy.Generation != originalGeneration {
                 return fmt.Errorf("deployment generation changed from %d to %d, indicating unnecessary apply was performed", originalGeneration, deploy.Generation)
             }
 
+            if deploy.ResourceVersion != originalResourceVersion {
+                return fmt.Errorf("deployment resourceVersion changed from %s to %s, indicating an update occurred unexpectedly", originalResourceVersion, deploy.ResourceVersion)
+            }
+
             return nil
         }, eventuallyTimeout, eventuallyInterval).ShouldNot(gomega.HaveOccurred())

Also applies to: 936-949

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 35bab44 and 4ce103e.

⛔ Files ignored due to path filters (2)
  • vendor/k8s.io/apimachinery/pkg/util/jsonmergepatch/patch.go is excluded by !vendor/**
  • vendor/modules.txt is excluded by !vendor/**
📒 Files selected for processing (7)
  • pkg/work/spoke/apply/server_side_apply.go (6 hunks)
  • pkg/work/spoke/apply/server_side_apply_test.go (5 hunks)
  • pkg/work/spoke/controllers/manifestcontroller/appliedmanifestwork_reconciler.go (1 hunks)
  • pkg/work/spoke/controllers/manifestcontroller/appliedmanifestwork_reconciler_test.go (1 hunks)
  • pkg/work/spoke/controllers/manifestcontroller/manifestwork_controller.go (1 hunks)
  • pkg/work/spoke/controllers/manifestcontroller/manifestwork_reconciler_test.go (2 hunks)
  • test/integration/work/updatestrategy_test.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
pkg/work/spoke/apply/server_side_apply.go (1)
pkg/work/helper/helpers.go (1)
  • ApplyOwnerReferences (410-438)
test/integration/work/updatestrategy_test.go (4)
test/integration/util/unstructured.go (1)
  • NewDeployment (271-286)
test/integration/util/structured.go (1)
  • ToManifest (84-88)
test/integration/util/assertion.go (1)
  • AssertWorkCondition (62-90)
test/integration/util/work.go (1)
  • NewWorkPatch (18-35)
pkg/work/spoke/apply/server_side_apply_test.go (2)
pkg/common/testing/assertion.go (1)
  • AssertActions (51-61)
pkg/common/testing/unstructured.go (1)
  • NewUnstructured (9-24)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: build
  • GitHub Check: verify
  • GitHub Check: unit
  • GitHub Check: integration
  • GitHub Check: e2e-singleton
  • GitHub Check: e2e-hosted
  • GitHub Check: e2e
  • GitHub Check: cloudevents-integration
🔇 Additional comments (6)
pkg/work/spoke/controllers/manifestcontroller/manifestwork_reconciler_test.go (1)

484-486: LGTM: action expectations updated for pre‑GET SSA flow

The added leading "get" before "patch" aligns with the new server‑side apply behavior.

Also applies to: 499-501

pkg/work/spoke/apply/server_side_apply_test.go (1)

35-35: LGTM: tests updated for pre‑GET and added skip scenario

  • Switching existing to corev1.Secret and adding to scheme is correct.
  • Action sequences including pre‑GET and the “skip” case look good.

Also applies to: 47-48, 62-64, 66-79, 90-91, 474-476

pkg/work/spoke/apply/server_side_apply.go (1)

180-207: LGTM: shouldPatch two‑way merge check

Empty JSON merge patch detection is a pragmatic fast‑path.

Note: list merges may still produce non‑empty patches even if SSA is a no‑op; tests already acknowledge this.

pkg/work/spoke/controllers/manifestcontroller/manifestwork_controller.go (1)

94-95: LGTM: migrate to TypedRateLimiter[string]

Typed rate limiter usage matches downstream Forget/When calls.

pkg/work/spoke/controllers/manifestcontroller/appliedmanifestwork_reconciler.go (1)

27-28: LGTM: TypedRateLimiter[string] adoption

Type change is coherent with callers and usage (When/Forget with string keys).

pkg/work/spoke/controllers/manifestcontroller/appliedmanifestwork_reconciler_test.go (1)

242-246: LGTM on typed rate limiter migration.

Typed exponential rate limiter aligns with generics-based workqueue usage. No issues spotted.

Comment on lines 508 to 509
fmt.Println(actual)
if err != nil {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Remove debug print in test

Drop the stray fmt.Println to keep tests clean.

Apply this diff:

-                fmt.Println(actual)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
fmt.Println(actual)
if err != nil {
if err != nil {
🤖 Prompt for AI Agents
In pkg/work/spoke/apply/server_side_apply_test.go around lines 508-509 there is
a stray debug print (fmt.Println(actual)) in the test; remove that fmt.Println
call so the test output remains clean, leaving the subsequent error handling (if
err != nil { ... }) intact.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/work/spoke/apply/server_side_apply.go (1)

152-178: Bug: JSONPath removal fails for bracket-notation (e.g., .metadata.annotations['key']).

Current logic treats the entire "annotations['key']" as a single field name and deletes the wrong map key. Handle bracket-notation by splitting parent path and the actual map key.

Apply this diff:

 func removeFieldByJSONPath(obj interface{}, path string, logger klog.Logger) {
-	listKeys := strings.Split(path, ".")
-	if len(listKeys) == 0 {
-		return
-	}
-	lastKey := listKeys[len(listKeys)-1]
-	pathWithoutLastKey := strings.TrimSuffix(path, "."+lastKey)
+	if len(path) == 0 {
+		return
+	}
+	// Determine parent path and the final key to delete, supporting bracket-notation:
+	// e.g. ".metadata.annotations['key']" -> parent ".metadata.annotations", key "key"
+	//      ".spec.replicas"               -> parent ".spec",                 key "replicas"
+	var parentPath, deleteKey string
+	if idx := strings.LastIndex(path, "."); idx != -1 {
+		tail := path[idx+1:] // e.g. "replicas" or "annotations['key']"
+		if lb := strings.Index(tail, "["); lb != -1 && strings.HasSuffix(tail, "]") {
+			// bracket notation on a map key
+			parentPath = path[:idx+1] + tail[:lb] // include the base field name
+			// extract quoted key inside brackets
+			bracket := tail[lb:]                  // e.g. "['key']" or "[\"key\"]"
+			if strings.HasPrefix(bracket, "['") && strings.HasSuffix(bracket, "']") && len(bracket) >= 4 {
+				deleteKey = bracket[2 : len(bracket)-2]
+			} else if strings.HasPrefix(bracket, "[\"") && strings.HasSuffix(bracket, "\"]") && len(bracket) >= 4 {
+				deleteKey = bracket[2 : len(bracket)-2]
+			} else {
+				// unsupported bracket syntax
+				return
+			}
+		} else {
+			parentPath = path[:idx]
+			deleteKey = tail
+		}
+	} else {
+		// no dot segment to split
+		return
+	}
 	finder := jsonpath.New("ignoreFields").AllowMissingKeys(true)
-	if err := finder.Parse(fmt.Sprintf("{%s}", pathWithoutLastKey)); err != nil {
-		logger.Error(err, "parse jsonpath", "path", pathWithoutLastKey)
+	if err := finder.Parse(fmt.Sprintf("{%s}", parentPath)); err != nil {
+		logger.Error(err, "parse jsonpath", "path", parentPath)
 	}
 	results, err := finder.FindResults(obj)
 	if err != nil {
-		logger.Error(err, "find jsonpath", "path", pathWithoutLastKey)
+		logger.Error(err, "find jsonpath", "path", parentPath)
 	}
 	for _, result := range results {
 		for _, r := range result {
 			mapResult, ok := r.Interface().(map[string]interface{})
 			if !ok {
 				continue
 			}
-			delete(mapResult, lastKey)
+			delete(mapResult, deleteKey)
 		}
 	}
 }
🧹 Nitpick comments (4)
pkg/work/spoke/apply/server_side_apply.go (4)

55-66: Avoid mutating the caller’s object; sanitize the DeepCopy instead.

Calling removeCreationTimeFromMetadata on requiredOriginal mutates the input. Move it to operate on the copy.

-	// TODO The above issue is fixed in k8s 1.34+, this will be removed when supported spoke cluster version is higher than 1.34+
-	removeCreationTimeFromMetadata(requiredOriginal.Object, logger)
+	// TODO The above issue is fixed in k8s 1.34+, this will be removed when supported spoke cluster version is higher than 1.34+
+	// sanitize a copy to avoid mutating the caller
+	required := requiredOriginal.DeepCopy()
+	removeCreationTimeFromMetadata(required.Object, logger)
-
-	force := false
-	fieldManager := workapiv1.DefaultFieldManager
-	var requiredHash string
-
-	required := requiredOriginal.DeepCopy()
+	force := false
+	fieldManager := workapiv1.DefaultFieldManager
+	var requiredHash string

Note: The change introduces the DeepCopy earlier; subsequent code continues to use required.


98-101: Strengthen log context when skipping patching.

Include GVR and namespace to make triaging easier.

-		logger.V(2).Info("Skip patching object", "name", required.GetName())
+		logger.V(2).Info("Skip patching object",
+			"gvr", gvr,
+			"namespace", required.GetNamespace(),
+			"name", required.GetName())

180-206: Patch detection is approximate; consider SSA dry‑run or spec‑only equality.

CreateThreeWayJSONMergePatch with original=nil can over-report changes (status, server-populated metadata, list semantics). This may still cause redundant Apply calls.

Recommendation:

  • Option A (more accurate): Perform a server-side Apply with DryRun=All (using the same FieldManager/Force) and compare the returned object’s spec with existing; skip if equal.
  • Option B (cheaper): Sanitize existing by dropping status and server-populated metadata (RV, UID, managedFields, timestamps), apply ignore-fields, then deep-equal the sanitized spec with required spec.

These will reduce false positives vs merge-patch heuristics.


208-218: Make the spec hash deterministic.

hashOfResourceStruct uses fmt.Sprintf on a struct with maps, which can produce non-deterministic ordering. Hash the JSON bytes instead.

Apply this diff:

 func hashOfResourceStruct(o interface{}) string {
-	oString := fmt.Sprintf("%v", o)
-	h := md5.New() //nolint:gosec
-	if _, err := io.WriteString(h, oString); err != nil {
-		return ""
-	}
+	data, err := json.Marshal(o)
+	if err != nil {
+		return ""
+	}
+	h := md5.New() //nolint:gosec
+	if _, err := h.Write(data); err != nil {
+		return ""
+	}
 	rval := fmt.Sprintf("%x", h.Sum(nil))
 	return rval
 }

Add import:

import "encoding/json"
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4ce103e and f9bc431.

⛔ Files ignored due to path filters (2)
  • vendor/k8s.io/apimachinery/pkg/util/jsonmergepatch/patch.go is excluded by !vendor/**
  • vendor/modules.txt is excluded by !vendor/**
📒 Files selected for processing (7)
  • pkg/work/spoke/apply/server_side_apply.go (6 hunks)
  • pkg/work/spoke/apply/server_side_apply_test.go (4 hunks)
  • pkg/work/spoke/controllers/manifestcontroller/appliedmanifestwork_reconciler.go (1 hunks)
  • pkg/work/spoke/controllers/manifestcontroller/appliedmanifestwork_reconciler_test.go (1 hunks)
  • pkg/work/spoke/controllers/manifestcontroller/manifestwork_controller.go (1 hunks)
  • pkg/work/spoke/controllers/manifestcontroller/manifestwork_reconciler_test.go (2 hunks)
  • test/integration/work/updatestrategy_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • pkg/work/spoke/controllers/manifestcontroller/manifestwork_reconciler_test.go
  • pkg/work/spoke/controllers/manifestcontroller/appliedmanifestwork_reconciler_test.go
  • pkg/work/spoke/apply/server_side_apply_test.go
  • pkg/work/spoke/controllers/manifestcontroller/manifestwork_controller.go
🧰 Additional context used
🧬 Code graph analysis (2)
test/integration/work/updatestrategy_test.go (4)
test/integration/util/unstructured.go (1)
  • NewDeployment (271-286)
test/integration/util/structured.go (1)
  • ToManifest (84-88)
test/integration/util/assertion.go (1)
  • AssertWorkCondition (62-90)
test/integration/util/work.go (1)
  • NewWorkPatch (18-35)
pkg/work/spoke/apply/server_side_apply.go (1)
pkg/work/helper/helpers.go (1)
  • ApplyOwnerReferences (410-438)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: e2e-singleton
  • GitHub Check: e2e
  • GitHub Check: e2e-hosted
  • GitHub Check: unit
  • GitHub Check: verify
  • GitHub Check: cloudevents-integration
  • GitHub Check: integration
  • GitHub Check: build
🔇 Additional comments (2)
pkg/work/spoke/controllers/manifestcontroller/appliedmanifestwork_reconciler.go (1)

27-28: LGTM: migrate to TypedRateLimiter[string].

Type safety improvements look good, and uses (When/Forget) are correct with string keys.

Please confirm the constructor wiring uses workqueue.NewTypedItemExponentialFailureRateLimiterstring (or equivalent) where this reconciler is instantiated.

test/integration/work/updatestrategy_test.go (1)

875-986: LGTM: solid no‑op apply coverage.

Good guard against redundant Apply by asserting generation stability, plus verification that real changes still apply.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
test/integration/work/updatestrategy_test.go (1)

879-896: Minor: avoid shadowing outer work variable for clarity.

Redeclaring work := ... shadows the suite-scoped work. It’s harmless but can confuse future edits.

Apply this diff locally:

-		work := util.NewManifestWork(clusterName, workName, []workapiv1.Manifest{util.ToManifest(object)})
+		w := util.NewManifestWork(clusterName, workName, []workapiv1.Manifest{util.ToManifest(object)})
-		work.Spec.ManifestConfigs = []workapiv1.ManifestConfigOption{
+		w.Spec.ManifestConfigs = []workapiv1.ManifestConfigOption{
...
-		work, err = hubWorkClient.WorkV1().ManifestWorks(clusterName).Create(context.Background(), work, metav1.CreateOptions{})
+		work, err = hubWorkClient.WorkV1().ManifestWorks(clusterName).Create(context.Background(), w, metav1.CreateOptions{})
pkg/work/spoke/apply/server_side_apply.go (3)

89-100: Tighten no-op detection by ignoring server‑managed fields and ownerRefs.

shouldPatch can still report non-empty patches due to differences in server-managed metadata (managedFields, resourceVersion, generation, uid, creationTimestamp) or ownerReferences. Since ownerRefs are handled separately via ApplyOwnerReferences, exclude them (and status, server metadata) from the local diff to further reduce unnecessary apply calls.

Apply this diff to shouldPatch:

 func shouldPatch(existing, required *unstructured.Unstructured) (bool, error) {
 	if existing == nil {
 		return true, nil
 	}
 
-	existingBytes, err := existing.MarshalJSON()
+	// Work on sanitized deep-copies to avoid mutating inputs
+	ex := existing.DeepCopy()
+	req := required.DeepCopy()
+	// Drop fields that shouldn't influence no-op decisions
+	for _, u := range []*unstructured.Unstructured{ex, req} {
+		unstructured.RemoveNestedField(u.Object, "status")
+		unstructured.RemoveNestedField(u.Object, "metadata", "managedFields")
+		unstructured.RemoveNestedField(u.Object, "metadata", "resourceVersion")
+		unstructured.RemoveNestedField(u.Object, "metadata", "generation")
+		unstructured.RemoveNestedField(u.Object, "metadata", "uid")
+		unstructured.RemoveNestedField(u.Object, "metadata", "creationTimestamp")
+		// OwnerRefs are patched separately; exclude them from the decision
+		unstructured.RemoveNestedField(u.Object, "metadata", "ownerReferences")
+	}
+
+	existingBytes, err := ex.MarshalJSON()
 	if err != nil {
 		return false, err
 	}
 
-	requiredBytes, err := required.MarshalJSON()
+	requiredBytes, err := req.MarshalJSON()
 	if err != nil {
 		return false, err
 	}
 
 	patchBytes, err := jsonmergepatch.CreateThreeWayJSONMergePatch(nil, requiredBytes, existingBytes)
 	if err != nil {
 		return false, err
 	}
 
 	if string(patchBytes) == "{}" {
 		return false, nil
 	}
 
 	return true, nil
 }

122-149: Stabilize the spec-hash to avoid false diffs from Go map ordering.

hashOfResourceStruct(fmt.Sprintf("%v", o)) is non-deterministic for maps and can produce spurious hash changes. Prefer hashing a canonical JSON serialization.

Apply this diff and import crypto/sha256 plus sigs.k8s.io/json:

-	var requiredHash string
+	var requiredHash string
@@
-			requiredHash = hashOfResourceStruct(required)
+			requiredHash = hashOfResourceStruct(required)

And replace hashOfResourceStruct with a canonical JSON hash:

-// detect changes in a resource by caching a hash of the string representation of the resource
-// note: some changes in a resource e.g. nil vs empty, will not be detected this way
-func hashOfResourceStruct(o interface{}) string {
-	oString := fmt.Sprintf("%v", o)
-	h := md5.New() //nolint:gosec
-	if _, err := io.WriteString(h, oString); err != nil {
-		return ""
-	}
-	rval := fmt.Sprintf("%x", h.Sum(nil))
-	return rval
-}
+// detect changes via a stable JSON encoding to avoid Go map iteration randomness
+func hashOfResourceStruct(o interface{}) string {
+	b, err := json.Marshal(o) // sigs.k8s.io/json
+	if err != nil {
+		return ""
+	}
+	sum := sha256.Sum256(b)
+	return fmt.Sprintf("%x", sum[:])
+}

Required imports to add at top:

  • "crypto/sha256"
  • sigjson "sigs.k8s.io/json" (alias as json to minimize changes)

151-177: Guard against list paths in ignore-fields or fail fast.

removeFieldByJSONPath assumes JSONPaths don’t select lists; today it silently no-ops. Consider validating JSONPaths up-front and returning an error (or at least a warning) when a path selects a list element (e.g., .spec.template.spec.containers[0].image) to prevent misleading IgnoreFields configs.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f9bc431 and ea194cd.

⛔ Files ignored due to path filters (2)
  • vendor/k8s.io/apimachinery/pkg/util/jsonmergepatch/patch.go is excluded by !vendor/**
  • vendor/modules.txt is excluded by !vendor/**
📒 Files selected for processing (7)
  • pkg/work/spoke/apply/server_side_apply.go (5 hunks)
  • pkg/work/spoke/apply/server_side_apply_test.go (4 hunks)
  • pkg/work/spoke/controllers/manifestcontroller/appliedmanifestwork_reconciler.go (1 hunks)
  • pkg/work/spoke/controllers/manifestcontroller/appliedmanifestwork_reconciler_test.go (1 hunks)
  • pkg/work/spoke/controllers/manifestcontroller/manifestwork_controller.go (1 hunks)
  • pkg/work/spoke/controllers/manifestcontroller/manifestwork_reconciler_test.go (2 hunks)
  • test/integration/work/updatestrategy_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • pkg/work/spoke/controllers/manifestcontroller/appliedmanifestwork_reconciler_test.go
  • pkg/work/spoke/controllers/manifestcontroller/appliedmanifestwork_reconciler.go
  • pkg/work/spoke/controllers/manifestcontroller/manifestwork_controller.go
  • pkg/work/spoke/controllers/manifestcontroller/manifestwork_reconciler_test.go
  • pkg/work/spoke/apply/server_side_apply_test.go
🧰 Additional context used
🧬 Code graph analysis (2)
pkg/work/spoke/apply/server_side_apply.go (1)
pkg/work/helper/helpers.go (1)
  • ApplyOwnerReferences (410-438)
test/integration/work/updatestrategy_test.go (4)
test/integration/util/unstructured.go (1)
  • NewDeployment (271-286)
test/integration/util/structured.go (1)
  • ToManifest (84-88)
test/integration/util/assertion.go (1)
  • AssertWorkCondition (62-90)
test/integration/util/work.go (1)
  • NewWorkPatch (18-35)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: e2e-singleton
  • GitHub Check: e2e-hosted
  • GitHub Check: e2e
  • GitHub Check: cloudevents-integration
  • GitHub Check: unit
  • GitHub Check: verify
  • GitHub Check: build
  • GitHub Check: integration
🔇 Additional comments (2)
test/integration/work/updatestrategy_test.go (1)

875-986: Great addition: validates no-op SSA path doesn’t bump generation.

The test cleanly proves the skip-apply behavior and still covers the “real change” path. Nice use of Consistently to guard against unintended reconciles.

pkg/work/spoke/apply/server_side_apply.go (1)

61-70: Resolved: hash/annotation now computed before server calls.

Moving mutateRequiredIgnoreFields ahead of the initial GET fixes the create-path gap noted earlier.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2ae67bb and 0fb1339.

📒 Files selected for processing (1)
  • pkg/work/spoke/apply/server_side_apply.go (5 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
pkg/work/spoke/apply/server_side_apply.go (1)
pkg/work/helper/helpers.go (1)
  • ApplyOwnerReferences (410-438)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: e2e-singleton
  • GitHub Check: e2e-hosted
  • GitHub Check: e2e
  • GitHub Check: unit
  • GitHub Check: integration
  • GitHub Check: build
  • GitHub Check: cloudevents-integration
  • GitHub Check: verify

Comment on lines 94 to 100
return existing, err
}
if !should {
logger.V(2).Info("Skip patching object", "name", required.GetName())
err := helper.ApplyOwnerReferences(ctx, c.client, gvr, existing, owner)
return existing, err
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

shouldPatch never skips due to metadata noise

With live objects, existing always carries metadata.resourceVersion, metadata.uid, metadata.managedFields, etc. Because required (SSA payload) doesn’t include those fields, jsonmergepatch.CreateThreeWayJSONMergePatch(nil, required, existing) will always emit deletions for them, so the patch string is never "{}". In practice this means we’ll still hit the apiserver every sync, defeating the intended optimization. Strip the well-known server-populated metadata keys from a copy of existing before diffing (or compare only the managed/spec portions) so that an unchanged manifest actually yields an empty patch and we truly skip the apply.

Apply this diff to sanitize the live object before computing the patch:

-	should, err := shouldPatch(existing, required, logger)
+	should, err := shouldPatch(existing, required, logger)

Then update shouldPatch to make a copy of existing and remove the following keys before creating the patch:

-func shouldPatch(existing, required *unstructured.Unstructured, logger klog.Logger) (bool, error) {
+func shouldPatch(existing, required *unstructured.Unstructured, logger klog.Logger) (bool, error) {
 	if existing == nil {
 		return true, nil
 	}
 
-	existingBytes, err := existing.MarshalJSON()
+	// Drop server-managed metadata fields so they don’t force a diff.
+	existingCopy := existing.DeepCopy()
+	unstructured.RemoveNestedField(existingCopy.Object, "metadata", "resourceVersion")
+	unstructured.RemoveNestedField(existingCopy.Object, "metadata", "uid")
+	unstructured.RemoveNestedField(existingCopy.Object, "metadata", "generation")
+	unstructured.RemoveNestedField(existingCopy.Object, "metadata", "creationTimestamp")
+	unstructured.RemoveNestedField(existingCopy.Object, "metadata", "managedFields")
+
+	existingBytes, err := existingCopy.MarshalJSON()

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In pkg/work/spoke/apply/server_side_apply.go around lines 94-100, the three-way
patch never becomes "{}" because server-populated metadata on the live object
(resourceVersion, uid, managedFields, etc.) causes spurious diffs; to fix it,
make a deep copy of existing before diffing and remove well-known
server-populated metadata keys (at minimum: metadata.resourceVersion,
metadata.uid, metadata.managedFields, metadata.creationTimestamp,
metadata.generation, metadata.selfLink and metadata.finalizers) from that copy,
then call jsonmergepatch.CreateThreeWayJSONMergePatch(nil, required,
sanitizedExisting) and use that result in shouldPatch; ensure you do not mutate
the original existing object and keep ownerReferences/intentional fields intact
so ApplyOwnerReferences behavior remains unchanged.

Copy link
Contributor

openshift-ci bot commented Sep 26, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: qiujian16

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link

codecov bot commented Sep 26, 2025

Codecov Report

❌ Patch coverage is 27.27273% with 24 lines in your changes missing coverage. Please review.
✅ Project coverage is 61.17%. Comparing base (35bab44) to head (d031d7d).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
...lers/manifestcontroller/manifestwork_controller.go 20.00% 24 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1196      +/-   ##
==========================================
- Coverage   61.33%   61.17%   -0.17%     
==========================================
  Files         207      209       +2     
  Lines       20545    20883     +338     
==========================================
+ Hits        12602    12775     +173     
- Misses       6846     6992     +146     
- Partials     1097     1116      +19     
Flag Coverage Δ
unit 61.17% <27.27%> (-0.17%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
pkg/work/spoke/controllers/manifestcontroller/manifestwork_controller.go (1)

16-16: Jittered requeue may exceed the documented “maximum” interval; clarify docs and extract jitter factor.

wait.Jitter(duration, 0.3) returns [duration, 1.3×duration), so requeues can exceed ResyncInterval despite the comment calling it a “maximum.” Either update the comment to “base/default interval,” or change the logic if you truly need an upper bound. Also consider a named constant for the jitter factor.

Apply within this file:

- var requeueTime = wait.Jitter(ResyncInterval, 0.3)
+ var requeueTime = wait.Jitter(ResyncInterval, RequeueJitterFactor)

Add near ResyncInterval (outside the selected range):

// ResyncInterval defines the base interval for resyncing a ManifestWork.
// It is used to:
//   1) Set the ResyncEvery for the ManifestWorkAgent controller
//   2) Requeue a ManifestWork after it has been successfully reconciled (with jitter)
var ResyncInterval = 5 * time.Minute

// RequeueJitterFactor widens the requeue window to reduce thundering herd.
// Effective requeue window is [ResyncInterval, (1+factor)*ResyncInterval).
const RequeueJitterFactor = 0.3

Also applies to: 149-149, 185-185

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0fb1339 and f7ca5d2.

📒 Files selected for processing (4)
  • pkg/work/spoke/apply/server_side_apply.go (1 hunks)
  • pkg/work/spoke/controllers/manifestcontroller/appliedmanifestwork_reconciler.go (1 hunks)
  • pkg/work/spoke/controllers/manifestcontroller/appliedmanifestwork_reconciler_test.go (1 hunks)
  • pkg/work/spoke/controllers/manifestcontroller/manifestwork_controller.go (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • pkg/work/spoke/apply/server_side_apply.go
  • pkg/work/spoke/controllers/manifestcontroller/appliedmanifestwork_reconciler_test.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: integration
  • GitHub Check: e2e-singleton
  • GitHub Check: e2e
  • GitHub Check: e2e-hosted
🔇 Additional comments (2)
pkg/work/spoke/controllers/manifestcontroller/manifestwork_controller.go (1)

95-96: Typed rate limiter migration — LGTM. Please check for leftover untyped usages.

Good upgrade to generics; aligns with string keys on the queue.

Run to confirm there are no remaining untyped rate limiter usages:

pkg/work/spoke/controllers/manifestcontroller/appliedmanifestwork_reconciler.go (1)

27-28: Typed rate limiter field — LGTM.

Usage with When/Forget on string keys matches the queue keys.

If any tests or helpers construct appliedManifestWorkReconciler directly, ensure they set rateLimiter. To spot missing initializations:

@qiujian16
Copy link
Member Author

/hold

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
pkg/work/spoke/controllers/manifestcontroller/manifestwork_controller.go (2)

203-204: Consider log level for requeue message.

The requeue behavior is now logged at info level. If the cluster has many ManifestWorks, this could generate significant log volume. Consider using klog.V(2) or klog.V(3) instead of klog.Infof to make it configurable.

Optional refactor:

-	klog.Infof("Requeue manifestwork %s after %v", manifestWork.Name, requeueTime)
+	klog.V(2).Infof("Requeue manifestwork %s after %v", manifestWork.Name, requeueTime)
 	controllerContext.Queue().AddAfter(manifestWorkName, requeueTime)

101-119: Use newAccessor.GetName() when enqueuing after finalizer addition

Consistent naming: replace oldAccessor.GetName() with newAccessor.GetName() in the UpdateFunc’s finalizer check:

-               syncCtx.Queue().Add(oldAccessor.GetName())
+               syncCtx.Queue().Add(newAccessor.GetName())

Double enqueue (finalizer addition + generation change) is harmless due to queue deduplication.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f7ca5d2 and d031d7d.

📒 Files selected for processing (1)
  • pkg/work/spoke/controllers/manifestcontroller/manifestwork_controller.go (7 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
pkg/work/spoke/controllers/manifestcontroller/manifestwork_controller.go (1)
pkg/common/helpers/clusters.go (1)
  • HasFinalizer (34-45)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: unit
  • GitHub Check: integration
  • GitHub Check: verify
  • GitHub Check: build
  • GitHub Check: e2e
  • GitHub Check: e2e-hosted
  • GitHub Check: e2e-singleton
🔇 Additional comments (7)
pkg/work/spoke/controllers/manifestcontroller/manifestwork_controller.go (7)

16-16: LGTM! Imports are used appropriately.

The new imports k8s.io/apimachinery/pkg/util/wait and k8s.io/client-go/tools/cache are correctly utilized for jittered requeue timing and explicit event handlers respectively.

Also applies to: 19-19


74-74: LGTM! Sync context properly initialized.

The explicit sync context creation enables direct queue management in the event handlers and is correctly wired into the controller via WithSyncContext.


96-96: LGTM! Typed rate limiter correctly configured.

The migration to NewTypedItemExponentialFailureRateLimiter[string] with 5ms initial and 1000s max delay is appropriate for the appliedManifestWork reconciler.


122-127: LGTM! Bare informers correctly configured.

The migration to WithBareInformers for both informers is correct. The explicit event handlers (lines 101-119) control when items are enqueued, and the sync context is properly wired via WithSyncContext.


136-136: LGTM! Log verbosity adjustment.

Lowering verbosity from V(5) to V(4) makes reconciliation start messages slightly more visible, which aids in debugging.


171-171: LGTM! Jitter reduces thundering herd risk.

Replacing the constant requeue interval with wait.Jitter(ResyncInterval, 0.3) introduces a 30% jitter (3.5-5 minutes), which helps prevent synchronized requeue storms across multiple ManifestWorks.


187-187: LGTM! Patch status return values appropriately handled.

Discarding the updated object return values from PatchStatus calls is acceptable since the controller doesn't require the updated objects after status patching, and error handling is properly implemented.

Also applies to: 192-196

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant