Background
The BatchSandbox CRD defines ProcessTask, whose command and args fields are string arrays, and defines ShardTaskPatches as []runtime.RawExtension.
type ProcessTask struct {
    Command []string        `json:"command"`
    Args    []string        `json:"args,omitempty"`
    Env     []corev1.EnvVar `json:"env,omitempty"`
}
// ...
// ...
ShardTaskPatches []runtime.RawExtension `json:"shardTaskPatches,omitempty"`
The batch sandbox example CR contains:
taskTemplate:
  spec:
    process:
      command:
        - sleep
      args:
        - infinite
# ...
shardTaskPatches:
  - spec:
      process:
        args:
          - 3600
# ...
Problem
shardTaskPatches bypasses CRD type validation
Because shardTaskPatches is typed as []runtime.RawExtension, it accepts arbitrary JSON/YAML payloads, and Kubernetes cannot enforce the schema validation rules that apply to TaskSpec.
As a result, the API server accepts the following parameter without any verification:
shardTaskPatches:
  - spec:
      process:
        args:
          - 3600
This happens even though the field the patch ultimately targets, process.args, is declared as []string, while 3600 is an integer. The type mismatch is only discovered later, when the controller decodes or merges the patch into a TaskTemplateSpec. This creates inconsistent validation behavior:
taskTemplate.spec.process.args
-> validated as []string
shardTaskPatches[].spec.process.args
-> accepts arbitrary JSON values
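The inconsistency can be reproduced outside the cluster. The following is a minimal sketch, not the controller's code: it uses json.RawMessage as a stand-in for runtime.RawExtension (an assumption made so the example needs only the standard library), and the helper names acceptsAsRaw and acceptsAsTyped are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ProcessTask mirrors the typed fields from the CRD (Env is omitted for brevity).
type ProcessTask struct {
	Command []string `json:"command"`
	Args    []string `json:"args,omitempty"`
}

// rawPatch stands in for runtime.RawExtension in this sketch:
// it stores the payload bytes without interpreting them.
type rawPatch = json.RawMessage

// acceptsAsRaw reports whether the payload decodes into the untyped form.
func acceptsAsRaw(payload []byte) bool {
	var p rawPatch
	return json.Unmarshal(payload, &p) == nil
}

// acceptsAsTyped reports whether the payload decodes into the typed ProcessTask.
func acceptsAsTyped(payload []byte) bool {
	var p ProcessTask
	return json.Unmarshal(payload, &p) == nil
}

func main() {
	payload := []byte(`{"args":[3600]}`) // integer where a string is expected
	fmt.Println("raw accepted:", acceptsAsRaw(payload))
	fmt.Println("typed accepted:", acceptsAsTyped(payload))
}
```

Any syntactically valid JSON passes the untyped path, so the integer is only caught once something decodes it into the typed struct.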
Expected Behavior
Make shardTaskPatches schema-aware and validate it against TaskSpec. This would ensure that invalid payloads, such as the integer 3600 in process.args shown above, are rejected during CR admission instead of failing later during reconciliation.
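One possible shape for such a check, sketched under assumptions: validateShardTaskPatches is a hypothetical helper (for example, something a validating webhook could call), it receives the raw bytes of each patch, and the TaskSpec and taskPatch types here are simplified stand-ins for the real API types. A typed field with a generated OpenAPI schema would be another way to reach the same result.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Simplified stand-ins for the real API types (assumption for this sketch).
type ProcessTask struct {
	Command []string `json:"command"`
	Args    []string `json:"args,omitempty"`
}

type TaskSpec struct {
	Process *ProcessTask `json:"process,omitempty"`
}

type taskPatch struct {
	Spec TaskSpec `json:"spec"`
}

// validateShardTaskPatches (hypothetical) decodes every raw patch into the
// typed form and returns the first error, prefixed with the patch index.
func validateShardTaskPatches(patches [][]byte) error {
	for i, raw := range patches {
		var p taskPatch
		if err := json.Unmarshal(raw, &p); err != nil {
			return fmt.Errorf("shardTaskPatches[%d]: %w", i, err)
		}
	}
	return nil
}

func main() {
	patches := [][]byte{
		[]byte(`{"spec":{"process":{"args":["3600"]}}}`), // valid: string
		[]byte(`{"spec":{"process":{"args":[3600]}}}`),   // invalid: integer
	}
	if err := validateShardTaskPatches(patches); err != nil {
		fmt.Println("rejected:", err)
	}
}
```

Running the check at admission time means the user sees the error when applying the CR, and the controller never receives a patch it cannot decode.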
Follow-up Problems
CR Deletion Fails Because of the Finalizer Mechanism
Every batch sandbox CR gets a finalizer added to protect cascading resource cleanup. In the situation above, this creates a serious risk: the CR can no longer be deleted.
How the batch sandbox interacts with the finalizer:
Add the finalizer to BatchSandbox1 // succeeds, because this step does not run json.Marshal() on shardTaskPatches
⬇
Merge shardTaskPatches into the task spec // fails, due to json.Marshal(shardTaskPatches)
⬇
Delete BatchSandbox1 // fails, because the finalizer is never cleared
⬇
BatchSandbox1 is stuck in Terminating
We have to remove the finalizer manually; only then is the resource, which already has deletionTimestamp set, actually deleted.
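The stuck deletion can be modeled with a small simulation. This is a sketch of an assumed control flow, not the actual controller: the sandbox type, the reconcile function, and the finalizer name are all hypothetical, and the failing merge is modeled as a typed decode with json.Unmarshal.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical finalizer name, used only in this sketch.
const finalizerName = "example.com/batch-sandbox"

// sandbox is a stripped-down stand-in for a BatchSandbox object.
type sandbox struct {
	Finalizers []string
	Deleting   bool     // true once deletionTimestamp is set
	Patches    [][]byte // raw shardTaskPatches payloads
}

type processPatch struct {
	Args []string `json:"args,omitempty"`
}

// reconcile models the assumed ordering: add the finalizer first, then merge
// the patches, and only clear the finalizer if the merge succeeded.
func reconcile(s *sandbox) error {
	if !s.Deleting && len(s.Finalizers) == 0 {
		// This step never decodes the patches, so it always succeeds.
		s.Finalizers = append(s.Finalizers, finalizerName)
		return nil
	}
	for i, raw := range s.Patches {
		var p processPatch
		if err := json.Unmarshal(raw, &p); err != nil {
			// Early return: the finalizer cleanup below is never reached.
			return fmt.Errorf("merge shardTaskPatches[%d]: %w", i, err)
		}
	}
	if s.Deleting {
		s.Finalizers = nil // cleanup done, allow the object to be removed
	}
	return nil
}

func main() {
	s := &sandbox{Patches: [][]byte{[]byte(`{"args":[3600]}`)}}
	fmt.Println("create:", reconcile(s), s.Finalizers)
	s.Deleting = true // the user deletes the CR
	fmt.Println("delete:", reconcile(s), s.Finalizers)
}
```

In this model the finalizer is still present after the delete pass, which matches the object staying in Terminating until the finalizer is removed by hand.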