Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions hips/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -36,3 +36,4 @@ restricted markdown format and can be found in the
- [hip-0024: Improve the Helm Documentation](hip-0024.md)
- [hip-0025: Better Support for Resource Creation Sequencing](hip-0025.md)
- [hip-0026: H4HIP: Wasm plugin system](hip-0026.md)
- [hip-9999: Handling unknown hook-delete-policy values](hip-9999.md)
179 changes: 179 additions & 0 deletions hips/hip-9999.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,179 @@
---
hip: 9999
title: "Handling unknown hook-delete-policy values"
authors: [ "Marcin Owsiany <[email protected]>" ]
created: "2025-07-23"
type: "feature"
status: "draft"
---

## Abstract

Currently, unknown _values_ of `helm.sh/hook-delete-policy` annotations are silently permitted.
They result in the given hook _never_ being deleted.
However this behaviour seems to be by coincidence, rather than intentional.
This HIP proposes how to handle them going forward.

## Motivation

There are [a number of _documented_ hook deletion policies][Docs].
All of them result in hook deletion _at some point_.
A default hook delete policy (namely `before-hook-creation`)
is [applied when the list of hook policies for a resource is empty][Code].

However, when an _unrecognized_ value (e.g. `badger`) is specified:
* Helm does not complain in any way, and
* a hook resource annotated this way is **never deleted**.

It seems that the current behaviour is a coincidence or mistake, rather than intended.

## Rationale

In line with [Hyrum's Law][Hyrum's Law], at least [one project][StackRox chart] depends on the current behaviour.
Namely, it specifies a hook deletion policy of `never` on most of its storage-related resources
(such as `PersistentVolumeClaim`).
These resources are applied as hooks, in order to guard against data loss,
see [appendix A](#appendix-a-why-create-storage-resources-as-hooks) if you would like to know more.

I was able to find [one more unrelated chart][Anance personal chart] that also specifies the same value.

Some charts also allow the users to specify the policy using a Helm var,
and it's not clear whether they validate the value before using it in a template.

The current Helm behaviour in this case is unspecified.
There is a concern that a change in the current implementation
(for example applying the default policy more aggressively) could cause a catastrophic data loss.

## Specification

This proposal:
- suggests keeping status quo for Helm charts API v2, for the sake of backwards compatibility,
- outlines a few options for charts API v3.

### For Helm charts API v2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
### For Helm charts API v2
### Charts API v2

nitpick: touch more clear I think

(and same for v3 below)


- No semantic changes to production code.
- There already is a [regression test](https://github.com/helm/helm/pull/31385) to make sure that the current behaviour is not changed by mistake.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- There already is a [regression test](https://github.com/helm/helm/pull/31385) to make sure that the current behaviour is not changed by mistake.
- There already is a [regression test](https://github.com/helm/helm/pull/31385) (merged) to make sure that the current behaviour is not changed by mistake.


### For Helm charts API v3

There are a few dimensions in which we can make a change, that are somewhat interconnected:
- whether to accept a deletion disabling policy at all, or break compatibility and reject it,
- if accepted, whether support it officially (documented, etc), or deprecate it,
- what to do about (other) undocumented hook deletion policy values.

#### 1. Official support for a policy which disables hook deletion

Add a new, documented hook deletion policy value: `never`.
When specified, such hook resource is not deleted.
Effectively, the same as suggestion for Helm charts API v2, but explicitly supported.

Pros:
- Keeps compatibility with Helm charts API v2.
- Allows a notion of resources which are installed by Helm, but afterwards not managed by it in any way.
- StackRox chart continues to work as before without changing.
- Other charts (if any) which happen depend on current undocumented behaviour could easily be made to work by changing
whatever value they use to `never`.

Cons:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does Helm apply the standard labels to hook objects?

https://helm.sh/docs/chart_best_practices/labels/#standard-labels

If so, another con is that if the chart is uninstalled. The object will remain, appearing to be managed by Helm, when that is not true.

Copy link
Member

@gjenkins8 gjenkins8 Dec 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs say (for Charts API v2 of course):

Garbage collection of hook resources when the corresponding release is deleted may be added to Helm 3 in the future, so any hook resources that must never be deleted should be annotated with helm.sh/resource-policy: keep.

Another con is that never duplicates the functionality of the (currently) documented approach

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This option should consider deprecating the helm.sh/resource-policy: keep policy for hooks?

- It seems that hooks were generally intended to be resources whose previous instances are cleaned up
before (subsequent) chart applications. Supporting this policy breaks this assumption and
introduces some complexity when reasoning about possible scenarios. For example such `pre-install`
hook resources need to be guarded by a `if $.Release.IsInstall` condition in order not to break upgrades.
Copy link
Member

@gjenkins8 gjenkins8 Dec 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pre-install hook objects are only created on install, right?

Why would a if $.Release.IsInstall condition be required to stop the object from being created on upgrade?


#### 2. Keep the status quo

The same as described above for Helm charts API v2.

Pros:
- Keeps compatibility with Helm charts API v2.
- StackRox chart continues to work as before without changing.

#### 3. Reject all undocumented hook deletion policies (including `never`)

Make it an error to use undocumented hook deletion policies, including `never`.

Pros:
- Mental model for hook deletion stays simple: they are always deleted (at _some_ point).

Cons:
- Breaks compatibility with Helm charts API v2. In particular:
- StackRox Helm chart stops working as is. This would most likely force StackRox to stop supporting direct usage of `helm`
for installation; StackRox operator would still use Helm library internally, and manage the storage resources differently,
as it does now.

#### 4. Reject undocumented policies (_other than_ `never`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really a sub-option of option 1? (which can choose to reject unknown policy values or not)


Make it an error to use undocumented hook deletion policies, except `never` (which would be treated
as described in point either 1 or 2).

Pros:
- _Mostly_ keeps compatibility with Helm charts API v2. Technically breaks it, but unlike proposal (3),
there is a migration path for charts which use an undocumented value other than `never` -
they need to change it to `never`).
- A little easier to reason about the behaviour, since there are fewer options.

Cons:
- If an older chart exists out there that uses an undocumented value other than `never`,
then its legacy versions would be incompatible with Helm charts API v3.

#### 5. Warn about undocumented policies
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should do this regardless I think

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(also for Charts API v2)


Add a linter check that complains if an undocumented value for hook deletion policy is used.

Pros:
- Raises awareness about this issue.

Cons:
- None?
Comment on lines +122 to +128

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also be true.


## Backwards compatibility

Described above, this HIP is all about compatibility.

## Security implications

None.

## How to teach this

Documentation and linter concerns mentioned above.

## Reference implementation

N/A ATM.

## Rejected ideas

N/A

## Open issues

See alternative points for Helm charts API v3 above.

## Appendix A: Why create storage resources as hooks?

Disclaimer: some of the reasoning below are conjectures since the motivation is lost in the mists of time.

A Helm chart was introduced for the StackRox project around 2020.
At that time, there was a strong concern that a user mistake might lead to data loss,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: iirc, Helm has special logic (that I disagree with) that retains PVs upon delete

Charts API v3 should remove this IMHO

in case the `PersistentVolume` or `PersistentVolumeClaim` were deleted in case the Helm release was deleted accidentally.

Because of that it looks like all possible safeguards were applied, including `helm.sh/resource-policy: keep`
**and** `helm.sh/hook: pre-install,pre-upgrade` together with `helm.sh/hook-delete-policy: never`.
I do not know why this undocumented value was chosen.
I assume it might have been a mistake that was not caught until recently, since the result worked as desired.

According to [the documentation][Resource policy docs], using `helm.sh/resource-policy: keep` alone
would be sufficient to prevent deletion on uninstallation.
However, it is not completely clear (to me), given how Helm behaves during `--force` rollbacks.
Deleting and recreating a `PersistentVolumeClaim` in that case would lead to data loss.

## References

- [Docs]: https://helm.sh/docs/topics/charts_hooks/#hook-deletion-policies
- [Code]: https://github.com/helm/helm/blob/bd897c96fbaf7546d6a5c57be009f16f9d38d6de/pkg/action/hooks.go#L47
- [Resource policy docs]: https://helm.sh/docs/howto/charts_tips_and_tricks/#tell-helm-not-to-uninstall-a-resource
- [StackRox chart]: https://github.com/stackrox/stackrox/tree/master/image/templates/helm/stackrox-central
- [Anance personal chart]: https://github.com/ananace/personal-charts/blob/0e60e96373c8d827c0723ec0bfaa336bd09ffb35/charts/matrix-synapse/templates/signing-key-job.yaml#L176
- [Hyrum's Law]: https://www.hyrumslaw.com/