
Kubernetes quota rejection happens before the worker pod exists, so Airflow can only retry or throttle pod creation; it cannot make Kubernetes accept pods beyond the namespace quota.

In the current Kubernetes Executor source, a 403 error whose message contains "exceeded quota" is retried only while task_publish_max_retries allows it. If that value is 0, or if the retry budget runs out while the quota is still full, the task is marked failed.
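As a rough sketch of that rule, here is a simplified model in Python. It is not the executor's actual code: the function name and arguments are made up for illustration, and the handling of -1 as "retry without limit" is my understanding of the setting, so check it against the configuration reference for your provider version.

```python
def should_retry_publish(status: int, message: str,
                         retries_so_far: int, max_retries: int) -> bool:
    """Hypothetical helper: decide whether a rejected pod creation is requeued."""
    # Only quota rejections are considered here: HTTP 403 plus the
    # "exceeded quota" marker in the API server's message.
    is_quota_rejection = status == 403 and "exceeded quota" in message
    if not is_quota_rejection:
        return False
    # Assumption: -1 means an unlimited retry budget.
    if max_retries == -1:
        return True
    # Otherwise retry only while the budget has not been used up.
    return retries_so_far < max_retries
```

With max_retries set to 0 the very first quota rejection already returns False, which matches the behaviour described above: the task fails instead of waiting for quota to free up.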

I would check these first:

  • task_publish_max_retries (environment variable AIRFLOW__KUBERNETES_EXECUTOR__TASK_PUBLISH_MAX_RETRIES)
  • worker_pods_creation_batch_size
  • global parallelism
  • DAG/task concurrency
  • pools for the high-fanout work
  • worker pod CPU/memory requests versus the namespace quota
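For the last item, a quick back-of-the-envelope check helps. The sketch below assumes the namespace quota is expressed on requests.cpu and requests.memory, that nothing else is running in the namespace, and that every worker pod has the same requests. The function name and the numbers are hypothetical.

```python
def max_pods_within_quota(quota_cpu_millicores: int, quota_memory_mib: int,
                          pod_cpu_millicores: int, pod_memory_mib: int) -> int:
    """Hypothetical helper: how many identical worker pods fit under the quota."""
    # Each resource caps the pod count on its own; the tighter one wins.
    by_cpu = quota_cpu_millicores // pod_cpu_millicores
    by_memory = quota_memory_mib // pod_memory_mib
    return min(by_cpu, by_memory)


# Example numbers (made up): 16 CPUs and 64 GiB of quota,
# worker pods requesting 500m CPU and 1 GiB of memory each.
capacity = max_pods_within_quota(16_000, 65_536, 500, 1_024)
```

If the result is lower than the number of tasks Airflow may start at once (parallelism, DAG/task concurrency, or pool slots), the executor will keep asking for pods that the namespace cannot admit, so the Airflow-side limit should sit at or below this number.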

The durable…

Answer selected by datasc24