-
Notifications
You must be signed in to change notification settings - Fork 270
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: handle stale BeforeHookCreation resources as progressing #461
base: master
Are you sure you want to change the base?
Conversation
5c153c9
to
ed6a1e1
Compare
ed6a1e1
to
407dc83
Compare
Codecov ReportBase: 55.38% // Head: 55.59% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## master #461 +/- ##
==========================================
+ Coverage 55.38% 55.59% +0.21%
==========================================
Files 41 41
Lines 4478 4493 +15
==========================================
+ Hits 2480 2498 +18
+ Misses 1807 1804 -3
Partials 191 191
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
407dc83
to
7631b8e
Compare
- fixes argoproj#446 - closes argoproj/argo-cd#10579 - original issue argoproj/argo-cd#10077 Signed-off-by: Krzysztof Nazarewski <[email protected]>
7631b8e
to
327f8d9
Compare
This comment was marked as outdated.
This comment was marked as outdated.
Signed-off-by: Krzysztof Nazarewski <[email protected]>
4382fe9
to
e4d98aa
Compare
Kudos, SonarCloud Quality Gate passed! 0 Bugs No Coverage information |
managed to fully test expected flow, PR is ready for review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot @nazarewk for this work!
Please make sure to create a PR in Argo CD repo pointing to this change so we can validate that all integration and e2e tests are passing. It would be great to make sure that there is an existing integration or e2e test that validates this scenario. If such test is not yet implemented it would be awesome to add it in Argo CD test suite.
TLDR;
DELETE
event during handling ofBeforeHookCreation
deletion policy can be delayed whileetcd
is running garbage collection:1. at the same time "finished" hook is present in the cache,
2.
gitops-engine
does not verify creation timestamp ofBeforeHookCreation
objects against start time of sync operation,3.
gitops-engine
is unaware (old) object should be purged from cache and happily proceeds to next Sync WaveThe PR fixes this specific failure mode with minimal changes to the code.
The change is impossible to reproduce as
etcd
's GC usually does not take long enough to trigger the issue. At the same time it is very difficult to debug.the EKS cluster I observed issues at had tens of thousands objects with >1k objects churned every few minutes and still GC triggered the issue only few times a week
fixes #446
closes argoproj/argo-cd#10579
original issue argoproj/argo-cd#10077